PHP search and replace problem


eatc7402

I have a number of records containing the HTML sections of a series of Google Earth placemarks
that I managed to extract from the raw KML file. The purpose of the exercise was to then use
the simplehtmldom.php API to extract the DATA from the raw HTML. Some of the process is
going well... and some is NOT.

 

I have found that if I add id attributes to the raw HTML, the simplehtmldom API has an easy
time identifying the desired data, and the result is far cleaner when the id attribute sits as
close to the data as possible. But doing a PHP text search and replace often means finding a
unique, identifiable portion of the HTML and THEN placing the id attribute in a nearby tag,
because the desired data is nested inside a non-unique tag. For example, I can identify the
SPECIFIC <td> section where the data I want is located, but the data itself is nested inside
a <font> tag within that <td>.

 

Hence my problem...

 

If I do a search in the following code...

<td><b><font size="+2" color="#FF0000">Neighborhood:</font> <font size="+2" color="#0000FF">City of Sidney</font></b></td>

 

I can locate the 'Neighborhood:' string because it is unique in the whole HTML. Then, by some
character counting, I want to put my id attribute in the NEXT font tag, because it surrounds
the desired data, 'City of Sidney', as in...

 

<td><b><font size="+2" color="#FF0000">Neighborhood:</font> <font id="neighborhood" size="+2" color="#0000FF">City of Sidney</font></b></td>

 

With this modification the desired data is easily found and cleanly produced.

 

But the HTML, while it all renders correctly in a web page, is not identical from a
whitespace point of view, and that is my problem. If I search the following code...

<td>
    <b><font size="+2" color="#FF0000">Neighboorhood:</font> 
       <font size="+2" color="#0000FF">Greenacre</font></b>
  </td>

 

While this is identical as far as HTML is concerned, if I search this code for the
'Neighboorhood:' identifier I find it, but then placing the id attribute into the NEXT font
tag is problematic. What I seem to need is a function that, once the position of the
'Neighboorhood:' string has been identified in the whole of the HTML, will FIND and modify
the NEXT occurrence of a font tag no matter what whitespace (or special characters) may lie
in between.

 

Any suggestions??

 

eatc7402

try

<?php
$test = '<td>
    <b><font size="+2" color="#FF0000">Neighboorhood:</font> 
       <font size="+2" color="#0000FF">Greenacre</font></b>
  </td> ';
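// The s modifier lets . match newlines, so whitespace between the tags does not matter.
// Group 1 runs from the label up to just before the closing > of the next <font ...> tag.
$out = preg_replace('/(Neighboorhood:.*?<font.*?)>/s', '\1 what you want to insert>', $test);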
echo $out;
?>
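
A variation on the snippet above, as a minimal sketch only: it places the attribute directly after `<font` (matching the placement shown in the original post) rather than before the closing `>`. The function name `addIdToNextFont` is hypothetical, and the sketch assumes the id value is a plain string with no `$` or backslash characters, since those have special meaning in a `preg_replace` replacement.

```php
<?php
// Hypothetical helper: add an id to the first <font> tag that follows $label.
function addIdToNextFont(string $html, string $label, string $id): string
{
    // preg_quote() escapes any regex metacharacters in the label.
    // The "s" modifier lets "." match newlines, so any whitespace between
    // the label and the next <font tag is skipped over.
    $pattern = '/(' . preg_quote($label, '/') . '.*?<font\b)/s';

    // $1 is everything up to and including "<font"; the id follows it.
    $replacement = '$1 id="' . $id . '"';

    // A limit of 1 changes only the first match.
    return preg_replace($pattern, $replacement, $html, 1);
}

$test = '<td>
    <b><font size="+2" color="#FF0000">Neighboorhood:</font>
       <font size="+2" color="#0000FF">Greenacre</font></b>
  </td>';

echo addIdToNextFont($test, 'Neighboorhood:', 'neighborhood');
```

Because `</font>` does not contain the text `<font`, the lazy `.*?` runs past the closing tag of the label and stops at the next opening font tag.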

Well, I found a PHP function that strips out whitespace and special characters, and then
using strlen and str_replace from the now-known positions seems to be getting me closer
to my desired outcome. The regular expression function given in the reply did NOT do what I
wanted, which was a TEXT SUBSTITUTION.

 

eatc7402
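
For the position-based approach described in this post, a rough sketch without regular expressions might look like the following. It is not the poster's actual code: the function name `tagNextFont` is hypothetical, and instead of stripping whitespace first it searches forward from the label with `stripos` and inserts the attribute with `substr_replace`, so the original whitespace is left untouched.

```php
<?php
// Hypothetical helper: insert $attr into the first <font> tag after $label.
function tagNextFont(string $html, string $label, string $attr): string
{
    // Locate the unique label text.
    $labelPos = strpos($html, $label);
    if ($labelPos === false) {
        return $html; // label not present: leave the HTML unchanged
    }

    // Look for the next opening font tag, starting just past the label.
    // "</font>" is not matched because it does not contain "<font".
    $fontPos = stripos($html, '<font', $labelPos + strlen($label));
    if ($fontPos === false) {
        return $html; // no font tag after the label
    }

    // A length of 0 makes substr_replace() insert rather than overwrite.
    return substr_replace($html, ' ' . $attr, $fontPos + strlen('<font'), 0);
}

$html = '<td>
    <b><font size="+2" color="#FF0000">Neighboorhood:</font>
       <font size="+2" color="#0000FF">Greenacre</font></b>
  </td>';

echo tagNextFont($html, 'Neighboorhood:', 'id="neighborhood"');
```

The search offset passed to `stripos` is what makes the whitespace between the two tags irrelevant: whatever sits between the label and the next `<font` is simply skipped.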
