poe (posted March 26, 2007):

Hi, I have an HTML file that has:

<tr>
<td>1</td><td>57</td><td><A HREF="player.cgi?3184">Francisco Rodriguez</A></td><td>69</td><td>15</td><td>2</td><td>3</td><td>47</td><td>4</td>
</tr>
<tr> <td colspan=9><img src='spacer.gif'></td> </tr>
<tr>
<td>2</td><td>36</td><td><A HREF="player.cgi?4954">Jered Weaver</A></td><td>19</td><td>19</td><td>11</td><td>2</td><td>6</td><td>2</td>
</tr>
<tr> <td colspan=9><img src='spacer.gif'></td> </tr>
<tr>
<td>3</td><td>62</td><td><A HREF="player.cgi?2544">Scot Shields</A></td><td>74</td><td>8</td><td>7</td><td>7</td><td>2</td><td>22</td>
</tr>
<tr> <td colspan=9><img src='spacer.gif'></td> </tr>
<tr>
<td>4</td><td>48</td><td><A HREF="player.cgi?187">Hector Carrasco</A></td><td>56</td><td>3</td><td>7</td><td>3</td><td>1</td><td>12</td>
</tr>
<tr> <td colspan=9><img src='spacer.gif'></td> </tr>
etc.

I want to extract the data that starts at "player.cgi?" and stops at "</tr>", then move on to the next occurrence and repeat. I don't care about the tags in between, as I want to take the data and upload it to a database.

So the results I want are:

3184|Francisco Rodriguez|69|15|2|3|47|4|
4954|Jered Weaver|19|19|11|2|6|2|
2544|Scot Shields|74|8|7|7|2|22|
187|Hector Carrasco|56|3|7|3|1|12|
Orio (posted March 26, 2007):

I suppose there's a better way, but here's what I came up with. I think you can figure it out on your own from this point.

<?php
// Sample input: the table rows from the question.
$text = <<<HTML
<tr>
<td>1</td><td>57</td><td><A HREF="player.cgi?3184">Francisco Rodriguez</a></td><td>69</td><td>15</td><td>2</td><td>3</td><td>47</td><td>4</td>
</tr>
<tr> <td colspan=9><img src='spacer.gif'></td> </tr>
<tr>
<td>2</td><td>36</td><td><A HREF="player.cgi?4954">Jered Weaver</a></td><td>19</td><td>19</td><td>11</td><td>2</td><td>6</td><td>2</td>
</tr>
<tr> <td colspan=9><img src='spacer.gif'></td> </tr>
<tr>
<td>3</td><td>62</td><td><A HREF="player.cgi?2544">Scot Shields</a></td><td>74</td><td>8</td><td>7</td><td>7</td><td>2</td><td>22</td>
</tr>
<tr> <td colspan=9><img src='spacer.gif'></td> </tr>
<tr>
<td>4</td><td>48</td><td><A HREF="player.cgi?187">Hector Carrasco</a></td><td>56</td><td>3</td><td>7</td><td>3</td><td>1</td><td>12</td>
</tr>
<tr> <td colspan=9><img src='spacer.gif'></td> </tr>
HTML;

// Capture the player id, the name, and the six numeric cells that follow the link.
// The i modifier makes HREF/href and </A>/</a> match alike; the dot in "player.cgi" is escaped so it matches literally.
$regex = "/href=\"player\.cgi\?([0-9]+)\">([a-z ]+)<\/a><\/td><td>([0-9]+)<\/td><td>([0-9]+)<\/td><td>([0-9]+)<\/td><td>([0-9]+)<\/td><td>([0-9]+)<\/td><td>([0-9]+)<\/td>/is";

preg_match_all($regex, $text, $matches);
unset($matches[0]); // drop the full-match entries, keep only the capture groups

// Each element of $matches holds one capture group across all rows.
echo "<pre>";
foreach ($matches as $match) {
    print_r($match);
    echo "\n\n\n";
}
echo "</pre>";
?>

Orio.
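To get from the per-group arrays above to the pipe-delimited lines asked for in the question, one option is the PREG_SET_ORDER flag of preg_match_all, which groups the captures per matched row. This is only a sketch under the assumption that every row follows the markup shown in the thread; the function name extract_rows and the choice of # as the regex delimiter are illustrative and not from the original posts.

```php
<?php
// Hypothetical helper (not from the thread): returns one
// "id|name|n|n|n|n|n|n|" line per player row found in $html.
function extract_rows(string $html): array
{
    // '#' as delimiter avoids escaping every '/' in the closing tags.
    $regex = '#href="player\.cgi\?([0-9]+)">([a-z ]+)</a></td>'
           . '<td>([0-9]+)</td><td>([0-9]+)</td><td>([0-9]+)</td>'
           . '<td>([0-9]+)</td><td>([0-9]+)</td><td>([0-9]+)</td>#i';

    // PREG_SET_ORDER yields one array per matched row
    // instead of one array per capture group.
    preg_match_all($regex, $html, $rows, PREG_SET_ORDER);

    $lines = [];
    foreach ($rows as $row) {
        array_shift($row);                    // drop the full match
        $lines[] = implode('|', $row) . '|';  // trailing pipe, as in the question
    }
    return $lines;
}

$html = '<tr> <td>1</td><td>57</td><td><A HREF="player.cgi?3184">Francisco Rodriguez</A></td>'
      . '<td>69</td><td>15</td><td>2</td><td>3</td><td>47</td><td>4</td> </tr>';

echo implode("\n", extract_rows($html)), "\n";
```

Spacer rows such as the colspan=9 ones contain no player.cgi link, so they simply produce no match.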
poe (posted March 27, 2007):

Cool, thanks. One more question: what do I do if the name contains more than just a-z, such as:

<A HREF="player.cgi?2296">J.John-Ford Griffin (OF/3B)</a>

This name has: a-z 0-9 . - ( ) /

How do I account for these characters? I tried

([a-z0-9.-()/ ]+)

but I get: Unknown modifier ']'

Thanks,
chris
poe (posted March 27, 2007):

Cool, thanks. One more question: what do I do if the name contains more than just a-z, such as:

<td><A HREF="player.cgi?2296">J.John-Ford Griffin</a>(OF/3B)</td>

This name has: a-z 0-9 . - ( ) /

How do I account for these characters? I tried

([a-z0-9.-()/ ]+)

but I get: Unknown modifier ']'

Thanks,
chris
Orio (posted March 27, 2007):

You need to escape a few characters:

([a-z0-9\.\-\(\)\/ ]+)

Orio.
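Some background on the error: the pattern is delimited by /, so the unescaped / inside the character class ended the pattern early, and PHP then tried to read the following ] as a modifier. The unescaped - between . and ( would also be read as a range. As an alternative to escaping, the sketch below uses # as the delimiter and places - last in the class, where it is literal. It also allows for the second markup variant poe posted, where "(OF/3B)" sits after the closing </a>. The function name extract_player and the returned array keys are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical helper (not from the thread): pulls the id, the name, and an
// optional position such as "(OF/3B)" that follows the closing </a>.
function extract_player(string $cell): ?array
{
    // '#' delimiter: '/' needs no escaping.
    // '-' is the last character in the class, so it is taken literally.
    $regex = '#href="player\.cgi\?([0-9]+)">([a-z0-9.()/ -]+)</a>\s*(\([a-z0-9/]+\))?#i';

    if (!preg_match($regex, $cell, $m)) {
        return null; // no player link in this cell
    }

    return [
        'id'       => $m[1],
        'name'     => trim($m[2]),
        'position' => $m[3] ?? '', // empty when nothing follows </a>
    ];
}

$outside = extract_player('<td><A HREF="player.cgi?2296">J.John-Ford Griffin</a>(OF/3B)</td>');
$inside  = extract_player('<A HREF="player.cgi?2296">J.John-Ford Griffin (OF/3B)</a>');

print_r($outside);
print_r($inside);
```

When the position is inside the link, it stays part of the name; when it follows </a>, it lands in the separate position field.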