Orio Posted October 21, 2006

Hello :)

I am coding a little project for myself, but I ran into a problem when it came to regex. Basically, what I want to do is filter the results I get when I search using cURL. So I have HTML stored in the variable $html, and it holds something that looks like this:

[code]
// A lot of HTML above
<table>
 <tr>
  <td class="box_content" align="center">186,996</td>
  <td class="box_content"><a href="/viewprofile.php?session=&id=1384922">User1</a></td>
  <td class="box_content" align="center">268,655,655</td>
  <td class="box_content" align="center">660</td>
  <td class="box_content" align="center">12</td>
  <td class="box_content" align="center"><img src="/images/images/2.gif"></td>
 </tr>
 <tr>
  <td class="box_content" align="center">186,997</td>
  <td class="box_content"><a href="/viewprofile.php?session=&id=1183963">User2</a></td>
  <td class="box_content" align="center">778,138</td>
  <td class="box_content" align="center">163</td>
  <td class="box_content" align="center">12</td>
  <td class="box_content" align="center"><img src="/images/images/3.gif"></td>
 </tr>
 //////////More and more of these table rows....
 <tr>
  <td class="box_content" align="center">187,000</td>
  <td class="box_content"><a href="/viewprofile.php?session=&id=1172426">User50</a></td>
  <td class="box_content" align="center">364,387,830</td>
  <td class="box_content" align="center">6,200</td>
  <td class="box_content" align="center">12</td>
  <td class="box_content" align="center"><img src="/images/images/4.gif"></td>
 </tr>
</table>
//More HTML below
[/code]

As you can see, every page holds 50 results. Moving to the next page is no problem; the problem is the filtering itself. I want to echo only the users who have a value of 200 million or greater in their third column. In this HTML example, that is User50 and User1.

So what I basically thought of doing is to break the code up using explode("<tr>", $html) and then (using a regular expression and preg_match_all, I suppose) get all the info I need (between the <td> tags; I don't mind echoing the link to the username too). Then I'll "clean" the numbers of commas etc. and check whether the number I want is 200 million or greater. If it is, print the user.

So, I'd really appreciate it if someone could help me get the information between the <td> tags. It has really been a struggle for me, and I think Google is going to ban me soon for searching too much :D

Thanks a lot :)
Orio.
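For anyone reading along, the first half of that plan (split on <tr>, then pull out whatever sits between the <td> tags) might be sketched roughly like this. The function name extractRows is made up for illustration, and the sample markup is just the first two rows from the post; treat it as a sketch under those assumptions, not something tested against the real page.

```php
<?php
// Hypothetical helper (the name is invented for this sketch): splits the
// markup on <tr> and returns, for each row, the plain text of its <td> cells.
function extractRows(string $html): array
{
    $rows = [];
    foreach (explode('<tr>', $html) as $chunk) {
        // Capture whatever sits between each <td ...> and </td> in this chunk.
        if (!preg_match_all('/<td[^>]*>(.*?)<\/td>/is', $chunk, $cells)) {
            continue; // the chunk before the first <tr>, or a chunk with no cells
        }
        $row = [];
        foreach ($cells[1] as $cell) {
            // strip_tags() drops the <a> and <img> markup, leaving the text only.
            $row[] = trim(strip_tags($cell));
        }
        $rows[] = $row;
    }
    return $rows;
}

// Sample input: the first two rows from the post.
$html = <<<'HTML'
<table>
 <tr>
  <td class="box_content" align="center">186,996</td>
  <td class="box_content"><a href="/viewprofile.php?session=&id=1384922">User1</a></td>
  <td class="box_content" align="center">268,655,655</td>
  <td class="box_content" align="center">660</td>
  <td class="box_content" align="center">12</td>
  <td class="box_content" align="center"><img src="/images/images/2.gif"></td>
 </tr>
 <tr>
  <td class="box_content" align="center">186,997</td>
  <td class="box_content"><a href="/viewprofile.php?session=&id=1183963">User2</a></td>
  <td class="box_content" align="center">778,138</td>
  <td class="box_content" align="center">163</td>
  <td class="box_content" align="center">12</td>
  <td class="box_content" align="center"><img src="/images/images/3.gif"></td>
 </tr>
</table>
HTML;

$rows = extractRows($html);
print_r($rows);
```

Each entry of $rows is one table row, with index 1 holding the user name and index 2 the third-column number (still with its commas). The image cell comes back as an empty string because it contains no text.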
c4onastick Posted October 21, 2006

I'm certainly not a master at regex, but I have a similar project that I created. I think you might be shooting yourself in the foot with the explode command. Sure, you can do it that way, but it seems like it makes a bigger mess. I'd do one preg_match_all call with subexpressions. That'll dump all the information into an array that you can step through to filter out the users that don't hit 200 million.

[code]
preg_match_all( '/<tr>\s*<td class=\"box_content\" align=\"center\">[0-9,]+<\/td>\s*<td class=\"box_content\"><a href=\"\/viewprofile\.php\?session=&id=[0-9]+\">([a-zA-Z0-9]+)<\/a><\/td>\s*<td class=\"box_content\" align=\"center\">([0-9,]{11,})<\/td>/is', $html, $result);
[/code]

I tested it at [url=http://regexlib.com/RETester.aspx]http://regexlib.com/RETester.aspx[/url], and it works for the example you posted. And (I'm a little proud of this part) it'll only grab people with at least 100 million, so you get half of your filtering done right there with one function!
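A side note on the gaps between cells in a pattern like this: with the s modifier, a lazy .*? is free to run past </tr> into later rows, so a user name from one row can end up paired with a long number from another. Allowing only whitespace (\s*) between cells keeps each match inside a single row. The sketch below compares the two on a cut-down version of the sample table; the simplified markup (bare <td> tags, three cells per row) and the variable names are assumptions made here for brevity.

```php
<?php
// Cut-down sample: three rows, only the first and last have a nine-digit value.
$html = <<<'HTML'
<table>
 <tr>
  <td>186,996</td>
  <td><a href="/viewprofile.php?session=&id=1384922">User1</a></td>
  <td>268,655,655</td>
 </tr>
 <tr>
  <td>186,997</td>
  <td><a href="/viewprofile.php?session=&id=1183963">User2</a></td>
  <td>778,138</td>
 </tr>
 <tr>
  <td>187,000</td>
  <td><a href="/viewprofile.php?session=&id=1172426">User50</a></td>
  <td>364,387,830</td>
 </tr>
</table>
HTML;

// Lazy ".*?" plus the "s" modifier: the gap may stretch across </tr>
// until some later cell satisfies the 11-character check.
$loose = '/<tr>.*?<td>[0-9,]+<\/td>.*?<a [^>]*>([a-zA-Z0-9]+)<\/a><\/td>.*?<td>([0-9,]{11,})<\/td>/is';

// "\s*" only allows whitespace between cells, so a match cannot leave its row.
$strict = '/<tr>\s*<td>[0-9,]+<\/td>\s*<td><a [^>]*>([a-zA-Z0-9]+)<\/a><\/td>\s*<td>([0-9,]{11,})<\/td>/is';

preg_match_all($loose, $html, $looseResult);
preg_match_all($strict, $html, $strictResult);

echo "loose:  ", implode(', ', $looseResult[1]), "\n";
echo "strict: ", implode(', ', $strictResult[1]), "\n";
```

On this sample the loose pattern reports User1 and User2, with User2 wrongly paired with User50's 364,387,830, while the strict pattern reports User1 and User50. The {11,} check works on length alone: nine digits plus two commas is 11 characters, which is why it acts as an "at least 100 million" filter.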
Orio Posted October 21, 2006

Looks good :D Thanks!

I haven't tested it yet, because I haven't finished the whole script, but can you tell me what $result will look like so I can use it properly?

Orio.
c4onastick Posted October 21, 2006

$result will be a multidimensional array looking something like this:

[code]
Array (
 Array (Each occurrence of the full pattern matched; for you, <tr>blah blah User1 blah blah 200 million</td>),
 Array (Each occurrence of the first subpattern, here User1),
 Array (Each occurrence of the second subpattern, here 200 million)
)
[/code]

I'd step through them with something like:

[code]
for($i=0; $i<count($result[0]); $i++){
 $user = $result[1][$i];
 $number = $result[2][$i];
 //More code
}
[/code]

I'm not sure where the 200 million number is going to go, but that preg_match_all will pull it out with the commas in it, so you'll have to remove those if you need to do math on it.
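To round off that loop, the comma removal and the 200 million check could look something like the sketch below. The $result literal is hand-written sample data standing in for real preg_match_all output in the shape described above; the UserX row is invented purely to show a value that is nine digits long but still under the threshold, and $threshold and $bigUsers are names made up here.

```php
<?php
// Hand-written stand-in for preg_match_all() output (default pattern order):
// [0] full matches (abbreviated), [1] user names, [2] numbers with commas.
$result = [
    ['<tr>...268,655,655</td>', '<tr>...150,000,000</td>', '<tr>...364,387,830</td>'],
    ['User1', 'UserX', 'User50'],             // UserX is an invented example row
    ['268,655,655', '150,000,000', '364,387,830'],
];

$threshold = 200000000; // 200 million
$bigUsers  = [];

for ($i = 0; $i < count($result[0]); $i++) {
    $user = $result[1][$i];
    // Strip the thousands separators before treating the value as a number.
    $number = (int) str_replace(',', '', $result[2][$i]);
    if ($number >= $threshold) {
        $bigUsers[$user] = $number;
    }
}

foreach ($bigUsers as $user => $number) {
    // number_format() puts the commas back for display.
    echo $user, ': ', number_format($number), "\n";
}
```

Comparing the cleaned integer, instead of relying on the string length alone, is what separates 150,000,000 from 268,655,655 here.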
Orio Posted October 21, 2006

I got everything working perfectly :D

Thanks a ton!!!

Orio.
c4onastick Posted October 21, 2006

Glad to help!