
Not getting all of Needle out of Haystack


coder007


Hey!

 

This is my RegEx (PCRE - preg):

 

preg_match("/<td><center>([0-9,.]+)<\/td><\/center><td><center>([0-9,.]+)<\/td><\/center><td><center>([0-9,.]+)<\/td><\/center><td><center>([0-9])<\/td><\/center>/", $line, $matches);

 

This is the haystack:

 

<td><center>5/30/2009 3:07:08 AM</td></center><td><center>66<br>Days</td></center><td><center><img border="0" src="images/teams/team_Orange.gif" width="14" height="13" title="Team: Orange"> </td></center><td><center>125,633.273</td></center><td><center>15,604.47</td></center><td><center>9,627.72</td></center><td><center><img src='images/war.gif' border=0 title='War is an option'> </td></center></tr><tr bgcolor='#ffffff'> 

 

The values I am trying to get are the date (5/30/2009 3:07:08 AM), the first number (125,633.273), the second number (15,604.47), and the third number (9,627.72).

 

Currently I am getting 0's for all but the date.

 

Thanks in advance!

 

Note to OP... that chunk of code has improper nesting (which could make things a little trickier to navigate):

 

<td><center>125,633.273</td></center>

should be

<td><center>125,633.273</center></td>

 

So perhaps a quick and dirty way could be:

$str = <<<HTML
<td><center>5/30/2009 3:07:08 AM</td></center><td><center>66<br>Days</td></center><td><center><img border="0" src="images/teams/team_Orange.gif" width="14" height="13" title="Team: Orange"> </td></center><td><center>125,633.273</td></center><td><center>15,604.47</td></center><td><center>9,627.72</td></center><td><center><img src='images/war.gif' border=0 title='War is an option'> </td></center></tr><tr bgcolor='#ffffff'>
HTML;

preg_match('#((?:\d{1,2}/){2}\d{4}[^<]+).+?([\d,]+\.\d+).+?([\d,]+\.\d+)#', $str, $match);
echo $match[1] . "<br />\n" . $match[2] . "<br />\n" . $match[3];

 

EDIT - The way I grab those two sets of numbers relies on a decimal point being present in each one. Again, this is a quick, no-fuss approach, since those tags are not ordered correctly in some spots, which makes me question the consistency of the code you are checking...
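To make the decimal caveat concrete, here is a minimal sketch using the same pattern on two shortened rows. Both rows are hypothetical, trimmed from the posted haystack, and the variable names are only for illustration. It also checks the return value of preg_match() before touching the match array, which is worth doing in any case.

```php
<?php
// Same pattern as above: each numeric group requires a decimal point.
$pattern = '#((?:\d{1,2}/){2}\d{4}[^<]+).+?([\d,]+\.\d+).+?([\d,]+\.\d+)#';

// Hypothetical rows, shortened from the posted haystack.
$withDecimals    = '<td><center>5/30/2009 3:07:08 AM</td></center><td><center>125,633.273</td></center><td><center>15,604.47</td></center>';
$withoutDecimals = '<td><center>5/30/2009 3:07:08 AM</td></center><td><center>125,633</td></center><td><center>15,604</td></center>';

// preg_match() returns 1 on a match and 0 when nothing matched.
$okWith    = preg_match($pattern, $withDecimals, $m1);
$okWithout = preg_match($pattern, $withoutDecimals, $m2);

if ($okWith === 1) {
    echo $m1[1] . "<br />\n" . $m1[2] . "<br />\n" . $m1[3] . "<br />\n";
}
if ($okWithout === 0) {
    echo "No match: the numbers in this row have no decimal part.<br />\n";
}
```

If whole numbers can show up in those cells, the decimal part would have to become optional, for example (?:\.\d+)?, but then the "66" in the Days cell would also qualify, so the pattern would need more of the surrounding tags to stay on target.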

Well, what a shocker! Didn't know that...

 

Perhaps this will work better:

 

preg_match('#<td><center>([^<]+)</td></center><td><center>66<br>Days</td></center><td><center><img border="0" src="images/teams/team_Orange.gif" width="14" height="13" title="Team: Orange"> </td></center><td><center>([^<]+)</td></center><td><center>([^<]+)</td></center><td><center>([^<]+)</td>#', $line, $matches);

 

EDIT: Probably best to go with nrg_alpha's!

I suppose I could have included, say, <td><center> at the start of my pattern to help ensure it matches the right date location (in the event there are other dates on the page). It would have been nice if those td tags had some ids or classes to help differentiate them, though.
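A rough sketch of that anchoring idea follows. The stray date in front of the row is made up purely to show the difference, the row itself is a shortened version of the posted haystack, and the extra fourth group for the third number is my own addition, assuming all three numbers always carry a decimal part.

```php
<?php
// Hypothetical input: an unrelated date appears before the table row.
$str = 'Updated 1/02/2009 by admin <td><center>5/30/2009 3:07:08 AM</td></center>'
     . '<td><center>66<br>Days</td></center>'
     . '<td><center>125,633.273</td></center>'
     . '<td><center>15,604.47</td></center>'
     . '<td><center>9,627.72</td></center>';

// Unanchored: the date group latches onto the first date-like text it finds.
$loose = '#((?:\d{1,2}/){2}\d{4}[^<]+).+?([\d,]+\.\d+).+?([\d,]+\.\d+)#';

// Anchored: the date must sit directly after <td><center>.
// A fourth group picks up the third number as well.
$anchored = '#<td><center>((?:\d{1,2}/){2}\d{4}[^<]+).+?([\d,]+\.\d+).+?([\d,]+\.\d+).+?([\d,]+\.\d+)#';

$looseOk    = preg_match($loose, $str, $looseMatch);
$anchoredOk = preg_match($anchored, $str, $match);

if ($anchoredOk === 1) {
    echo $match[1] . "<br />\n" . $match[2] . "<br />\n" . $match[3] . "<br />\n" . $match[4];
}
```

Here the unanchored pattern grabs the stray "1/02/2009" text as the date, while the anchored one starts at the table cell. If a row is split across several lines in the real page, the s modifier would be needed so that the dot also matches newlines.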

Perhaps cut and paste a 'small' portion of the code (containing one or two examples) if it differs from what you initially posted. I used the line of code you posted as a test, and it worked, so I'm thinking there might be some variations in the code that the pattern isn't taking into account?
