Helminthophobe Posted March 26, 2008

Is it possible to create a loop with RegEx when looking for information? My terminology may be a bit off since I am brand new to RegEx, so I'll give an example. I've built a script that digs through the source code of another site looking for data (see the bottom of the post for a preview of the code). I'm having trouble pulling the data from the following bit of source code (some source code is missing in the example):

<img id="ctl00_mainContent_rptWeapons_ctl00_imgWeapon" class="weapon" src="/images/halo3stats/weapons/e2b3837c-c27f-4497-a07d-8e59f153cff6.gif" style="border-width:0px;" />
<div class="num">99 (33.00%)</div></div>
<img id="ctl00_mainContent_rptWeapons_ctl01_imgWeapon" class="weapon" src="/images/halo3stats/weapons/5f8fbbf9-6267-4257-9a2d-24f8c2e5441d.gif" style="border-width:0px;" />
<div class="num">71 (23.67%)</div></div>
<img id="ctl00_mainContent_rptWeapons_ctl02_imgWeapon" class="weapon" src="/images/halo3stats/weapons/fdb4005f-45a4-472a-8646-9763ebc75aad.gif" style="border-width:0px;" />
<div class="num">45 (15.00%)</div></div>

Is it possible to build a loop that finds the following pattern and saves each result in a different variable every time the pattern is found? The number of matches is not fixed: it may show up 20 times for one user and only 5 for another.

<img id=\"(.+?)\" class=\"weapon\" src=\"(.+?)\" style=\"border-width:0px;\" \/>\s+<div class=\"num\">(.+?)<\/div><\/div>

This is the script I am using now to find the other data, which doesn't require a loop. The URL contains the data for $tag.

// Fetch the career stats page for the player in $tag.
$ch = curl_init();
$timeout = 5;
curl_setopt ($ch, CURLOPT_URL, 'http://www.bungie.net/stats/halo3/CareerStats.aspx?player=' . $tag . '&social=true&map=0');
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$in1 = curl_exec($ch);
curl_close($ch);

// Each value appears once, so a single preg_match per stat is enough.
preg_match("/Kills :<\/td>\s+<td class=\"values\">(.+?)<\/td>/",$in1, $social_stats_kills);
preg_match("/Deaths :<\/td>\s+<td class=\"values\">(.+?)<\/td>/",$in1, $social_stats_deaths);
preg_match("/K\/D Ratio :<\/td>\s+<td class=\"values\">(.+?)<\/td>/",$in1, $social_stats_kdr);

$h3gamertag = str_replace("%20"," ", $tag);
// Index 1 holds the text captured by the first parenthesized group.
$social_stats_kills = $social_stats_kills[1];
$social_stats_deaths = $social_stats_deaths[1];
$social_stats_kdr = $social_stats_kdr[1];

I hope I made sense. Thank you in advance for any help.

Link to comment: https://forums.phpfreaks.com/topic/97960-regex-loop/

effigy Posted March 26, 2008

Use preg_match_all.

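A minimal sketch of how preg_match_all could answer the original question, assuming the markup looks like the sample in the first post. The `PREG_SET_ORDER` flag and the array keys `id`, `src` and `num` are illustrative choices, not something the thread prescribes. Rather than one variable per hit, every match becomes one entry in `$weapons`, so the number of matches does not need to be known in advance.

```php
<?php
// Sample markup copied from the first post (trimmed to two weapons).
$html = <<<'HTML'
<img id="ctl00_mainContent_rptWeapons_ctl00_imgWeapon" class="weapon" src="/images/halo3stats/weapons/e2b3837c-c27f-4497-a07d-8e59f153cff6.gif" style="border-width:0px;" />
<div class="num">99 (33.00%)</div></div>
<img id="ctl00_mainContent_rptWeapons_ctl01_imgWeapon" class="weapon" src="/images/halo3stats/weapons/5f8fbbf9-6267-4257-9a2d-24f8c2e5441d.gif" style="border-width:0px;" />
<div class="num">71 (23.67%)</div></div>
HTML;

// Using '#' as the delimiter avoids escaping every '/' in the pattern.
$pattern = '#<img id="(.+?)" class="weapon" src="(.+?)" style="border-width:0px;" />\s+<div class="num">(.+?)</div></div>#';

// PREG_SET_ORDER groups the captures per match, so one foreach handles any number of hits.
$count = preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

$weapons = array();
foreach ($matches as $m) {
    // $m[0] is the full match; $m[1]..$m[3] are the three captured groups.
    $weapons[] = array('id' => $m[1], 'src' => $m[2], 'num' => $m[3]);
}

foreach ($weapons as $w) {
    echo $w['src'], ' => ', $w['num'], "\n";
}
```

With the real page, `$html` would be replaced by `$in1` from the curl call in the first post.
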
Helminthophobe Posted March 26, 2008

I still have trouble understanding how to work with arrays, and from what I understand preg_match_all saves the data in an array. How would I output the data using the code I posted in the original post? Thank you for your help so far. It's much appreciated.

effigy Posted March 26, 2008

Per the docs:

    If no order flag is given, PREG_PATTERN_ORDER is assumed.

    PREG_PATTERN_ORDER
    Orders results so that $matches[0] is an array of full pattern matches, $matches[1] is an array of strings matched by the first parenthesized subpattern, and so on.

The easiest way to get used to arrays is to use pre and print_r to see what you're working with, e.g.:

<pre>
<?php print_r($array); ?>
</pre>

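To make the quoted layout concrete, here is a small sketch of what the default PREG_PATTERN_ORDER result looks like and how its parallel arrays might be walked. The input string and the variable names `$lines`, `$number` and `$percent` are made up for illustration; only the shape of `$matches` is the point.

```php
<?php
// Hypothetical, shortened input; only the shape of the result matters here.
$html = '<div class="num">99 (33.00%)</div> <div class="num">71 (23.67%)</div>';

// No order flag is given, so PREG_PATTERN_ORDER applies.
preg_match_all('#<div class="num">(\d+) \(([\d.]+)%\)</div>#', $html, $matches);

// $matches[0] = full matches, $matches[1] = first group, $matches[2] = second group.
print_r($matches);

// The group arrays share indexes, so index $i in one lines up with index $i in the other.
$lines = array();
foreach ($matches[1] as $i => $number) {
    $percent = $matches[2][$i];
    $lines[] = $number . ' (' . $percent . ' percent)';
}
echo implode("\n", $lines), "\n";
```

Because the loop runs over whatever `$matches[1]` contains, it behaves the same for 5 matches or 20.
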
Helminthophobe Posted March 27, 2008

I had to wait until I got home to fiddle with this. I was able to figure out how to display the content after playing with it for a while. Thank you for the link and assistance, effigy.

Helminthophobe Posted March 27, 2008

I'm still having a little trouble, it seems. The following is the source code I am working with (unimportant parts omitted):

class="weapon" src="/images/halo3stats/weapons/0be8dc88-acc4-405d-9b82-1e0d8a4ca2f0.gif" style="border-width:0px;" />
<div class="num">9,318 (26.71%)</div></div>
class="weapon" src="/images/halo3stats/weapons/0be8dc88-acc4-405d-9b82-1e0d8a4ca2f0.gif" style="border-width:0px;" />
<div class="num">4,720 (13.53%)</div></div>
class="weapon" src="/images/halo3stats/weapons/0be8dc88-acc4-405d-9b82-1e0d8a4ca2f0.gif" style="border-width:0px;" />
<div class="num">3,896 (11.17%)</div></div>
class="weapon" src="/images/halo3stats/weapons/0be8dc88-acc4-405d-9b82-1e0d8a4ca2f0.gif" style="border-width:0px;" />
<div class="num">3,460 (9.92%)</div></div>

The following is my new code:

<?php
// Encode spaces in the gamertag for use in the URL.
$tag = str_replace(" ","%20",$tag);
$ch = curl_init();
$timeout = 5;
curl_setopt ($ch, CURLOPT_URL, 'http://www.bungie.net/stats/halo3/CareerStats.aspx?player=' . $tag . '&social=true&map=0');
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$in1 = curl_exec($ch);
curl_close($ch);

// $weapon_data[1] holds every image path, $weapon_data[2] every "num" value.
preg_match_all("#class=\"weapon\" src=\"(.+?)\" style=\"border-width:0px;\" \/>\s+<div class=\"num\">(.+?)<\/div><\/div>#",$in1, $weapon_data);

echo "<img src=\"http://www.bungie.net" . $weapon_data[1][0] . "\"><br>" . $weapon_data[2][0] . "<br><br>\n";
echo "<img src=\"http://www.bungie.net" . $weapon_data[1][1] . "\"><br>" . $weapon_data[2][1] . "<br><br>\n";
echo "<img src=\"http://www.bungie.net" . $weapon_data[1][2] . "\"><br>" . $weapon_data[2][2] . "<br><br>\n";
echo "<img src=\"http://www.bungie.net" . $weapon_data[1][3] . "\"><br>" . $weapon_data[2][3] . "<br><br>\n";
?>

It works perfectly except for the output of $weapon_data[2][0], which is:

9,318Â Â (26.71%)

So I decided to separate the "9,318" and the "26.71%". I used the following:

preg_match_all("#class=\"weapon\" src=\"(.+?)\" style=\"border-width:0px;\" \/>\s+<div class=\"num\">([\,\d]+)\s\s\(([\.\d]+)\%\)<\/div><\/div>#",$in1, $weapon_data);

It doesn't find anything. I tested ([\,\d]+)\s\s\(([\.\d]+)\%\) with the PHP Live Regex Tester and it worked when just looking for 9,318 (26.71%). Any suggestions on a solution? I'm stumped.

effigy Posted March 27, 2008

What character set is the page using? (Check the META tag.)

Helminthophobe Posted March 27, 2008

Is this what you mean?

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

charset=utf-8?

effigy Posted March 27, 2008

Yes. You have two options: (1) use UTF-8 also; or (2) convert the UTF-8 into whatever character set you're using.

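A sketch of what staying in UTF-8 (option 1) might look like on the regex side. It rests on an assumption the thread never confirms: that the stray "Â" characters come from UTF-8 non-breaking spaces (bytes 0xC2 0xA0) being displayed in a single-byte character set. If so, the `\s\s` in the earlier pattern fails because no run of two whitespace characters sits in front of the opening parenthesis, and matching in UTF-8 mode with the non-breaking space named explicitly should work. The variable names are illustrative.

```php
<?php
// Assumption: the gap between the count and the percentage is two UTF-8
// non-breaking spaces (bytes 0xC2 0xA0), which is what "Â" usually indicates.
$html = "<div class=\"num\">9,318\xC2\xA0\xC2\xA0(26.71%)</div></div>";

// Byte mode: \s\s cannot match the 0xC2 0xA0 byte pairs, so nothing is found.
$byteHits = preg_match_all('#<div class="num">([,\d]+)\s\s\(([.\d]+)%\)</div>#', $html, $plain);

// UTF-8 mode (the /u modifier) with the non-breaking space written as \x{00A0}.
$utf8Hits = preg_match_all('#<div class="num">([,\d]+)[\s\x{00A0}]+\(([.\d]+)%\)</div>#u', $html, $fixed);

echo $byteHits, ' ', $utf8Hits, "\n";
echo $fixed[1][0], ' / ', $fixed[2][0], "\n";
```

Displaying the result without the stray characters would additionally require the output page itself to be served as UTF-8, matching the charset in the META tag quoted above.
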
Helminthophobe Posted March 27, 2008

I'm thinking option 1 will be the easiest, but how would I go about option 2? I really, really appreciate the help you've given me. I've been really excited about the results I've been getting from this little project. You've been a huge help!

effigy Posted March 27, 2008

iconv

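A minimal sketch of option 2 with iconv, assuming the scraped page is UTF-8 and the displaying page uses ISO-8859-1. The thread never states the target character set, so `'ISO-8859-1'` is a placeholder for whatever is actually in use, and the sample string again assumes the gaps are non-breaking spaces.

```php
<?php
// Assumption: source data is UTF-8; the target charset is ISO-8859-1 (a placeholder).
$utf8 = "9,318\xC2\xA0\xC2\xA0(26.71%)";

// iconv(from, to, string) returns the converted string, or false on failure.
$latin1 = iconv('UTF-8', 'ISO-8859-1', $utf8);

if ($latin1 === false) {
    // A character with no equivalent in the target charset would end up here.
    $latin1 = '';
}

// Each non-breaking space is now the single byte 0xA0, so the string is two bytes shorter.
echo strlen($utf8), ' -> ', strlen($latin1), "\n";
```

In the script from the thread, the same call could be applied to `$in1` right after `curl_exec`, before any of the regular expressions run.
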
This topic is now archived and is closed to further replies.