I have a number of records holding the HTML sections of a series of Google Earth placemarks
that I managed to extract from the raw KML file. The purpose of the exercise was to then use
the simplehtmldom.php API to extract the data from the raw HTML code. Some of the process is
going well... and some is not.
I have found that if I modify the raw HTML by inserting id attributes, the simplehtmldom API
has an easy time identifying the desired data, and the data comes out far 'cleaner' when the id
attribute sits as close to the data as possible. But doing a PHP text search and replace often
requires finding a unique, identifiable portion of the HTML and THEN placing the id attribute in
a nearby tag, because the desired data is nested inside a non-unique tag. For example, I can
identify the SPECIFIC <td> section where the data I want is located, but the data itself is
nested inside a <font> tag inside that <td>.
Hence my problem...
If I do a search in the following code...
<td><b><font size="+2" color="#FF0000">Neighborhood:</font> <font size="+2" color="#0000FF">City of Sidney</font></b></td>
I can locate the 'Neighborhood:' string because it is unique in the whole HTML code. Then, by some
character counting, I want to put my id attribute in the NEXT font tag, because it surrounds
the desired data, 'City of Sidney', as in...
<td><b><font size="+2" color="#FF0000">Neighborhood:</font> <font id="neighborhood" size="+2" color="#0000FF">City of Sidney</font></b></td>
With this modification the desired data is easily found and cleanly produced.
But the HTML, while it all renders correctly in a web page, is not all identical from a
whitespace point of view, and that is my problem. If I search the following code...
<td>
<b><font size="+2" color="#FF0000">Neighboorhood:</font>
<font size="+2" color="#0000FF">Greenacre</font></b>
</td>
This is identical as far as HTML is concerned, and if I search it for the 'Neighboorhood:'
identifier I find it... but then placing the id attribute into the NEXT font tag is
problematic, because the character count no longer lines up. What I seem to need is a function
that, once the position of the 'Neighboorhood:' string has been identified in the whole of the
HTML code, will FIND and modify the NEXT occurrence of a font tag no matter what whitespace
(or special characters) may come in between.
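To make the goal concrete, here is a minimal sketch of the kind of function I mean. It is only an illustration: the name addIdToNextFontTag and its parameters are hypothetical, it uses nothing but PHP built-ins (strpos, preg_match, substr_replace) rather than simplehtmldom, and it assumes the marker string appears once and that the next opening <font tag after it is the one that wraps the data.

```php
<?php
// Hypothetical helper: insert an id attribute into the first opening <font>
// tag that follows $marker, whatever whitespace or closing tags lie between.
function addIdToNextFontTag(string $html, string $marker, string $id): string
{
    // Locate the unique marker text (e.g. 'Neighborhood:').
    $markerPos = strpos($html, $marker);
    if ($markerPos === false) {
        return $html; // marker not found: leave the HTML untouched
    }

    // Search for the next opening <font tag, starting just after the marker.
    // '</font>' is skipped automatically because it does not begin with '<font'.
    $searchFrom = $markerPos + strlen($marker);
    if (!preg_match('/<font\b/i', $html, $m, PREG_OFFSET_CAPTURE, $searchFrom)) {
        return $html; // no following font tag: leave the HTML untouched
    }

    // $m[0][1] is the match offset measured from the start of $html.
    $insertAt = $m[0][1] + strlen($m[0][0]);

    // Splice the id attribute in directly after '<font'.
    return substr_replace($html, ' id="' . htmlspecialchars($id) . '"', $insertAt, 0);
}
```

Called as addIdToNextFontTag($html, 'Neighboorhood:', 'neighborhood'), this should handle both the single-line and the multi-line layout, since no character counting past the marker is involved.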
Any suggestions??
eatc7402