bschultz Posted October 13, 2021

I'm scraping a page whose table contents look like this:

    <table border="1" style="width:100%;" cellspacing="0" cellpadding="3">
      <tr class="stats-section">
        <td colspan="99">Scoring</td>
      </tr>
      <tr class="stats-section">
        <td colspan="99">2nd Period</td>
      </tr>
      <tr class="hscore">
        <td>UMD</td>
        <td>4×4</td>
        <td> Kobe Roth (1)</td>
        <td> Noah Cates, Casey Gilling </td>
        <td align="right">12:35</td>
      </tr>
      <tr class="vscore">
        <td>BSU</td>
        <td>4×4</td>
        <td> Alex Ierullo (1)</td>
        <td> Kyle Looft </td>
        <td align="right">13:06</td>
      </tr>
      <tr class="stats-section">
        <td colspan="99">3rd Period</td>
      </tr>
      <tr class="hscore">
        <td>UMD</td>
        <td></td>
        <td> Blake Biondi (1)</td>
        <td> Quinn Olson </td>
        <td align="right">10:10</td>
      </tr>
    </table>

I want to match every <tr class="hscore"> and every <tr class="vscore"> row and change them into the following:

    <tr class="hscore">
      <td>#1 UMD</td> <!-- added #1 -->
    ...
    <tr class="vscore">
      <td>#2 BSU</td> <!-- added #2 -->
    ...
    </tr></table>

I don't know what order the rows will come in, or how many hscore or vscore entries there will be. I need to auto-increment a variable ($i++;) on each match to echo the #1 and #2.

Is regex my best bet? Would str_replace work, or is Simple Dom with a foreach over each <tr> better? I can't think of a way to increment $i on each match.

Thanks!
Barand Posted October 13, 2021

Use xpath.

    $xml = simplexml_load_string($html);

    // Number the home-team scoring rows
    $trs = $xml->xpath("//tr[@class='hscore']");
    $i = 1;
    foreach ($trs as $tr) {
        $tr->td[0] = "#$i ".$tr->td[0];
        ++$i;
    }

    // Number the visiting-team scoring rows, restarting the counter
    $trs = $xml->xpath("//tr[@class='vscore']");
    $i = 1;
    foreach ($trs as $tr) {
        $tr->td[0] = "#$i ".$tr->td[0];
        ++$i;
    }

    echo '<pre>' . htmlentities($xml->asXML()) . '</pre>';

gives

    <?xml version="1.0"?>
    <table border="1" style="width:100%;" cellspacing="0" cellpadding="3">
      <tr class="stats-section">
        <td colspan="99">Scoring</td>
      </tr>
      <tr class="stats-section">
        <td colspan="99">2nd Period</td>
      </tr>
      <tr class="hscore">
        <td>#1 UMD</td>
        <td>4×4</td>
        <td> Kobe Roth (1)</td>
        <td> Noah Cates, Casey Gilling </td>
        <td align="right">12:35</td>
      </tr>
      <tr class="vscore">
        <td>#1 BSU</td>
        <td>4×4</td>
        <td> Alex Ierullo (1)</td>
        <td> Kyle Looft </td>
        <td align="right">13:06</td>
      </tr>
      <tr class="stats-section">
        <td colspan="99">3rd Period</td>
      </tr>
      <tr class="hscore">
        <td>#2 UMD</td>
        <td/>
        <td> Blake Biondi (1)</td>
        <td> Quinn Olson </td>
        <td align="right">10:10</td>
      </tr>
    </table>
bschultz (Author) Posted October 13, 2021

But I need the first entry, regardless of whether it's hscore or vscore, to be #1. The second entry, again regardless of hscore or vscore, should be #2, the third #3, and so on.
Barand Posted October 13, 2021

Then change the code. I've given you a start.

[EDIT] What the hell!

    $xml = simplexml_load_string($html);

    // Select both row types in one query so they share a single counter
    $trs = $xml->xpath("//tr[@class='hscore' or @class='vscore']");
    $i = 1;
    foreach ($trs as $tr) {
        $tr->td[0] = "#$i ".$tr->td[0];
        ++$i;
    }

    echo '<pre>' . htmlentities($xml->asXML()) . '</pre>';

Edited October 13, 2021 by Barand
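The SimpleXML snippet above depends on the scraped markup being well-formed XML, which a scraped page may not be. As a hedged alternative, the same XPath expression can be run through PHP's DOMDocument and DOMXPath, which accept ordinary HTML. This is only a sketch: the function name numberScoringRows is made up for illustration, and it assumes the input is ASCII or otherwise has its character encoding handled before loading.

```php
<?php
// Hypothetical helper (not from the thread): prefixes "#1 ", "#2 ", ...
// to the first cell of every hscore/vscore row, in document order.
// Assumes ASCII input; non-ASCII markup needs its encoding declared first.
function numberScoringRows(string $html): string
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $rows  = $xpath->query("//tr[@class='hscore' or @class='vscore']");

    $i = 1;
    foreach ($rows as $row) {
        // First <td> child of this row, if there is one.
        $cell = $xpath->query('td[1]', $row)->item(0);
        if ($cell === null) {
            continue;
        }
        // Insert the label as a new text node ahead of the existing content.
        $cell->insertBefore($doc->createTextNode("#$i "), $cell->firstChild);
        ++$i;
    }

    // loadHTML wraps the fragment in <html><body>, so the result is a full document.
    return $doc->saveHTML();
}
```

Calling numberScoringRows($html) returns the whole document as a string; rows with any other class, such as stats-section, are left alone.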
bschultz (Author) Posted October 13, 2021

Didn't know xpath had 'or'... I tried || and it didn't work well! Thank you for teaching me something new today!
Barand Posted October 14, 2021

10 hours ago, bschultz said: "Didn't know xpath had 'or'"

Neither did I until you said you needed them grouped together.
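On the regex route raised in the opening question, one way to bump a counter on every match is to pass it by reference into a preg_replace_callback closure. The sketch below is illustrative only: the function name numberRowsWithRegex is hypothetical, and the pattern assumes the rows are written exactly as in the sample (class in double quotes as the only attribute, first cell a plain <td>). It is more fragile than the XPath answer if the markup varies.

```php
<?php
// Hypothetical helper (not from the thread): numbers hscore/vscore rows
// with a regex. Assumes <tr class="hscore"> / <tr class="vscore"> followed
// by a bare <td>, as in the sample markup.
function numberRowsWithRegex(string $html): string
{
    $i = 0;

    $result = preg_replace_callback(
        '/(<tr\s+class="(?:hscore|vscore)"\s*>\s*<td>)/i',
        // $i is captured by reference so every match sees the updated count.
        function (array $m) use (&$i): string {
            ++$i;
            return $m[1] . "#$i ";
        },
        $html
    );

    // preg_replace_callback returns null on error; fall back to the input.
    return $result ?? $html;
}
```

Because the callback runs once per match from left to right, the numbers follow the order of the rows in the source string.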