grab text between 2 tags that appear any number of times


poe


hi,

 

i have an html file that has:

 

<tr>
<td>1</td><td>57</td><td><A HREF="player.cgi?3184">Francisco Rodriguez</A></td><td>69</td><td>15</td><td>2</td><td>3</td><td>47</td><td>4</td>
</tr>

<tr>
<td colspan=9><img src='spacer.gif'></td>
</tr>

<tr>
<td>2</td><td>36</td><td><A HREF="player.cgi?4954">Jered Weaver</A></td><td>19</td><td>19</td><td>11</td><td>2</td><td>6</td><td>2</td>
</tr>

<tr>
<td colspan=9><img src='spacer.gif'></td>
</tr>

<tr>
<td>3</td><td>62</td><td><A HREF="player.cgi?2544">Scot Shields</A></td><td>74</td><td>8</td><td>7</td><td>7</td><td>2</td><td>22</td>
</tr>

<tr>
<td colspan=9><img src='spacer.gif'></td>
</tr>

<tr>
<td>4</td><td>48</td><td><A HREF="player.cgi?187">Hector Carrasco</A></td><td>56</td><td>3</td><td>7</td><td>3</td><td>1</td><td>12</td>
</tr>

<tr>
<td colspan=9><img src='spacer.gif'></td>
</tr>

etc.

 

 

i want to extract the data that starts at "player.cgi?" and stops at "</tr>",

then move on to the next occurrence and repeat.

i don't care about the tags in between, as i want to take the data and upload it to a database.

 

so the results i want to make are:

 

3184|Francisco Rodriguez|69|15|2|3|47|4|

 

4954|Jered Weaver|19|19|11|2|6|2|

 

2544|Scot Shields|74|8|7|7|2|22|

 

187|Hector Carrasco|56|3|7|3|1|12|

 

I suppose there's a better way, but here's what I came up with.

I think you can figure it out on your own from this point.

 

<?php

$text = <<<HTML
<tr>
<td>1</td><td>57</td><td><A HREF="player.cgi?3184">Francisco Rodriguez</a></td><td>69</td><td>15</td><td>2</td><td>3</td><td>47</td><td>4</td>
</tr>

<tr>
<td colspan=9><img src='spacer.gif'></td>
</tr>

<tr>
<td>2</td><td>36</td><td><A HREF="player.cgi?4954">Jered Weaver</a></td><td>19</td><td>19</td><td>11</td><td>2</td><td>6</td><td>2</td>
</tr>

<tr>
<td colspan=9><img src='spacer.gif'></td>
</tr>

<tr>
<td>3</td><td>62</td><td><A HREF="player.cgi?2544">Scot Shields</a></td><td>74</td><td>8</td><td>7</td><td>7</td><td>2</td><td>22</td>
</tr>

<tr>
<td colspan=9><img src='spacer.gif'></td>
</tr>

<tr>
<td>4</td><td>48</td><td><A HREF="player.cgi?187">Hector Carrasco</a></td><td>56</td><td>3</td><td>7</td><td>3</td><td>1</td><td>12</td>
</tr>

<tr>
<td colspan=9><img src='spacer.gif'></td>
</tr>
HTML;



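// Capture the player id, the name, and the six stats cells that follow the link.
// The "i" modifier lets HREF/href and </A>/</a> match alike.
$regex = "/href=\"player\.cgi\?([0-9]+)\">([a-z ]+)<\/a><\/td><td>([0-9]+)<\/td><td>([0-9]+)<\/td><td>([0-9]+)<\/td><td>([0-9]+)<\/td><td>([0-9]+)<\/td><td>([0-9]+)<\/td>/is";

preg_match_all($regex, $text, $matches);

// $matches[0] holds the full matched strings; only the captured groups are needed.
unset($matches[0]);

// With the default ordering, $matches[1] is every id, $matches[2] every name, and so on.
echo "<pre>";
foreach ($matches as $match)
{
    print_r($match);
    echo "\n\n\n";
}
echo "</pre>";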

?>

 

 

Orio.
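
For anyone landing here later: the snippet above prints the captures grouped by column, while the original question asked for one pipe-delimited line per player. Below is a minimal sketch of one way to get there, assuming the rows always look like the sample (link cell followed by exactly six numeric cells). The function name `player_rows_to_lines` is made up for illustration, and `[^<]+` is swapped in for the name so it is not limited to letters and spaces.

```php
<?php
// Hypothetical helper (not from the thread): turn player rows into
// "id|name|n|n|n|n|n|n|" lines.
function player_rows_to_lines(string $html): array
{
    // '#' is the delimiter, so '/' in the closing tags needs no escaping.
    // Group 1: player id, group 2: name, group 3: the six trailing cells.
    $regex = '#href="player\.cgi\?(\d+)">([^<]+)</a></td>((?:\s*<td>\d+</td>){6})#i';

    // PREG_SET_ORDER groups the captures per row instead of per capture group.
    preg_match_all($regex, $html, $rows, PREG_SET_ORDER);

    $lines = [];
    foreach ($rows as $row) {
        // Pull the individual numbers out of the six <td>n</td> cells.
        preg_match_all('#<td>(\d+)</td>#i', $row[3], $cells);
        $lines[] = $row[1] . '|' . trim($row[2]) . '|' . implode('|', $cells[1]) . '|';
    }
    return $lines;
}

// Two player rows and one spacer row, copied from the sample in the question.
$html = <<<HTML
<tr>
<td>1</td><td>57</td><td><A HREF="player.cgi?3184">Francisco Rodriguez</A></td><td>69</td><td>15</td><td>2</td><td>3</td><td>47</td><td>4</td>
</tr>
<tr>
<td colspan=9><img src='spacer.gif'></td>
</tr>
<tr>
<td>2</td><td>36</td><td><A HREF="player.cgi?4954">Jered Weaver</A></td><td>19</td><td>19</td><td>11</td><td>2</td><td>6</td><td>2</td>
</tr>
HTML;

echo implode("\n", player_rows_to_lines($html)), "\n";
// 3184|Francisco Rodriguez|69|15|2|3|47|4|
// 4954|Jered Weaver|19|19|11|2|6|2|
```

The spacer rows are skipped simply because they contain no `player.cgi?` link. If the real page has a different number of stats cells, the `{6}` would need adjusting.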

cool thanks

 

1 more q.

 

what do i do if the name contains more than just a-z

 

such as:

 

<A HREF="player.cgi?2296">J.John-Ford Griffin (OF/3B)</a>

 

this name has:

a-z

0-9

.

-

(

)

/

 

how do i account for these characters?

 

i tried ([a-z0-9.-()/ ]+)

 

but i get:  Unknown modifier ']'

 

thanks

chris

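
On the "Unknown modifier ']'" error: with `/` as the pattern delimiter, the unescaped `/` inside the character class ends the pattern early, so PHP tries to read what follows (the `]`) as a modifier. The `-` between `.` and `(` is also read as a range operator. A rough sketch of two ways around that, assuming `#` as the delimiter instead of `/`; the variable names are just for illustration, and the second sample row puts the position after the link rather than inside it.

```php
<?php
// Two sample cells: position inside the link, and position after the link.
$html = '<td><A HREF="player.cgi?2296">J.John-Ford Griffin (OF/3B)</a></td>'
      . '<td><A HREF="player.cgi?2296">J.John-Ford Griffin</a>(OF/3B)</td>';

// Option 1: explicit character class. With '#' as the delimiter, '/' is an
// ordinary character; '-' goes last so it is not read as a range operator.
$explicit = '#href="player\.cgi\?(\d+)">([a-z0-9.()/ -]+)</a>#i';

// Option 2: accept anything up to the next tag, plus any text that sits
// between </a> and </td>.
$anything = '#href="player\.cgi\?(\d+)">([^<]+)</a>([^<]*)</td>#i';

preg_match_all($explicit, $html, $a, PREG_SET_ORDER);
preg_match_all($anything, $html, $b, PREG_SET_ORDER);

// Build "id|name" lines from option 2, joining the name with any trailing text.
$lines = [];
foreach ($b as $row) {
    $lines[] = $row[1] . '|' . trim($row[2] . ' ' . $row[3]);
}
echo implode("\n", $lines), "\n";
// 2296|J.John-Ford Griffin (OF/3B)
// 2296|J.John-Ford Griffin (OF/3B)
```

Keeping `/` as the delimiter also works if the slash in the class is escaped (`\/`). Option 2 is usually less fragile, since it does not need updating every time a name contains a new character.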
