preg_match() help needed

lokie538 · January 17, 2009

Hi,

I'm trying to extract data from a website, but for some reason it's not working.

This is the code on the website:

 			  			<tr><td width="90">Suburb(s):</td><td>
		  			
		  			



                            
                            COOROY, 
                            


                            
                            COOROY MOUNTAIN, 
                            


                            
                            LAKE MACDONALD, 
                            


                            
                            TINBEERWAH
                            


                        </td></tr>

And this is the code I'm trying to use to get it:

preg_match('~<tr><td width="90">Suburb(s):</td><td>(.*?[^<])</td></tr>~i', $file, $yourpost3);

print $yourpost3[1]; 

/// $file just uses a saved html file

The bit I'm unsure of is (.*?[^<]). I don't know what it means.

It returns this error: Notice: Undefined offset: 1 in C:\wamp\www\get.php on line 15

.josh · January 17, 2009

It means the regex failed to match, so $yourpost3[1] was never defined.
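
A minimal sketch of guarding against that notice, reusing the variable names from the original post. The $file string here is a made-up stand-in for the saved HTML file, not the real page. preg_match() returns 1 on a match, 0 on no match and false on error, so checking the return value before reading index 1 avoids the undefined offset.

```php
<?php
// Hypothetical stand-in for the saved HTML file; note the line breaks inside the cell.
$file = "<tr><td width=\"90\">Suburb(s):</td><td>\n  COOROY\n</td></tr>";

// The pattern from the original post, unchanged.
$pattern = '~<tr><td width="90">Suburb(s):</td><td>(.*?[^<])</td></tr>~i';

// preg_match() returns 1 on a match, 0 on no match, false on error.
$result = preg_match($pattern, $file, $yourpost3);

if ($result === 1) {
    print $yourpost3[1];
} else {
    // No match: $yourpost3 is an empty array, so index 1 does not exist.
    print "No match found\n";
}
```

With this sample the pattern does not match, so the else branch runs instead of triggering the notice.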

lokie538 · January 17, 2009

So have you got any ideas why it wouldn't work and why it failed?

nrg_alpha · January 17, 2009

To elaborate a bit more on the actual meaning of that line:

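.*? // match any character . (except a newline, unless the s modifier is set) zero or more times *, but lazily ?: it takes as few characters as possible and only grows when the rest of the pattern fails to match.
[^<] // match exactly one character that is not a <.

So (.*?[^<]) captures the shortest stretch that ends in a non-< character and is directly followed by </td></tr>.

The one problem I do see in your pattern is Suburb(s): inside a pattern, parentheses form a group, so (s) matches a plain s instead of the literal text (s), and you need to escape them as \(s\). Also note that your data spans several lines, and without the s modifier the dot never matches a newline, so .*? cannot get across the line breaks.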
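
A small sketch of how these pieces behave, using a made-up three-line $sample string rather than the real page. It compares the original pattern, the same pattern with the parentheses escaped, and the escaped pattern with the s modifier added (with s, the dot also matches newlines). The variable names are only for illustration.

```php
<?php
// Made-up sample: the cell content spans more than one line, like the real page.
$sample = "<tr><td width=\"90\">Suburb(s):</td><td>\n  COOROY\n</td></tr>";

// 1. Unescaped parentheses: (s) is a group matching a plain "s", not "(s)".
$unescaped = preg_match('~Suburb(s):</td><td>(.*?[^<])</td></tr>~i', $sample, $m1);

// 2. Escaped parentheses, but . still cannot cross the line breaks.
$noDotAll = preg_match('~Suburb\(s\):</td><td>(.*?[^<])</td></tr>~i', $sample, $m2);

// 3. Escaped parentheses plus the s modifier, so . also matches newlines.
$dotAll = preg_match('~Suburb\(s\):</td><td>(.*?[^<])</td></tr>~is', $sample, $m3);

var_dump($unescaped, $noDotAll, $dotAll); // int(0), int(0), int(1)
echo trim($m3[1]), "\n";                  // COOROY
```

Only the third variant matches, which suggests both the escaping and the newline handling matter for this data.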

I suspect this is what you are looking for?

$str = <<<DATA
	  			<tr><td width="90">Suburb(s):</td><td>
		  			
		  			



                            
                            COOROY, 
                            


                            
                            COOROY MOUNTAIN, 
                            


                            
                            LAKE MACDONALD, 
                            


                            
                            TINBEERWAH
                            


                        </td></tr>
DATA;
preg_match('#<tr><td width="90">Suburb\(s\):</td><td>([^<]+)#s', $str, $match);
echo '<pre>'.print_r($match[1], true);

output:

  			
		  			



                            
                            COOROY, 
                            


                            
                            COOROY MOUNTAIN, 
                            


                            
                            LAKE MACDONALD, 
                            


                            
                            TINBEERWAH

You can have a look at the regex resources page to learn more about regex.

lokie538 · January 17, 2009

Yep, that works a treat, thanks mate!!!

Now I just have to find out how to remove all the whitespace and line breaks so it is just a string. For instance, the output should be "cooroy, cooroy mountain, lake macdonald, tinbeerwah".

I've been looking at this http://www.gskinner.com/RegExr/ trying to understand more, hehe.

Thanks for your help!!

Edit: this kinda works to remove the white spaces!

 $apples = str_replace(" ", "", $match[1]);
echo '<pre>'.print_r($match[1], true);

echo $apples;
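
Note that str_replace(" ", "", ...) also removes the space inside names such as COOROY MOUNTAIN and leaves the line breaks in place. One possible alternative, sketched on a shortened, made-up $captured string standing in for $match[1]: collapse every run of whitespace into a single space with preg_replace(), then trim the ends.

```php
<?php
// Shortened, made-up stand-in for $match[1] from the earlier example.
$captured = "\n\n      COOROY, \n\n\n      COOROY MOUNTAIN, \n\n      TINBEERWAH\n   ";

// Collapse each run of spaces, tabs and line breaks into a single space,
// then trim the leftover space at both ends.
$clean = trim(preg_replace('/\s+/', ' ', $captured));

echo strtolower($clean), "\n"; // cooroy, cooroy mountain, tinbeerwah
```

strtolower() is only there because the desired output above was written in lowercase; drop it to keep the original casing.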

nrg_alpha · January 17, 2009

Or if you wanted to break the remaining display into their own separate entries, you could also do this:

$arr = preg_split('#(?:\s{2,}|, )#', $match[1], -1, PREG_SPLIT_NO_EMPTY);
echo '<pre>'.print_r($arr, true);

Output:

Array
(
    [0] => COOROY
    [1] => COOROY MOUNTAIN
    [2] => LAKE MACDONALD
    [3] => TINBEERWAH
)
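
If a single comma-separated string is wanted after all, as asked earlier in the thread, the entries could be joined back together with implode(). A short sketch, with $arr hard-coded to the output above so it runs on its own; $line is just an illustrative name.

```php
<?php
// $arr as printed above, hard-coded so the snippet runs on its own.
$arr = ['COOROY', 'COOROY MOUNTAIN', 'LAKE MACDONALD', 'TINBEERWAH'];

// Join the entries with ", " and lowercase the result.
$line = strtolower(implode(', ', $arr));

echo $line, "\n"; // cooroy, cooroy mountain, lake macdonald, tinbeerwah
```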

nrg_alpha · January 17, 2009

Don't forget to flag this as TOPIC SOLVED.

lokie538 · January 17, 2009

Thanks for the help, you're a legend!!

A legend of the internet!!

nrg_alpha · January 17, 2009

lokie538 wrote: "Thanks for the help, you're a legend!! A legend of the internet!!"

hehe.. not quite... I'm still a peon.
