RegEx Loop


Helminthophobe

Is it possible to loop over RegEx matches when extracting data? I'm new to RegEx, so my terminology may be off; here is an example. I've built a script that searches another site's HTML source for data (the script is at the bottom of this post). I'm having trouble pulling the data out of the following markup (some of the source is omitted):

 

<img id="ctl00_mainContent_rptWeapons_ctl00_imgWeapon" class="weapon" src="/images/halo3stats/weapons/e2b3837c-c27f-4497-a07d-8e59f153cff6.gif" style="border-width:0px;" />
     <div class="num">99  (33.00%)</div></div>
<img id="ctl00_mainContent_rptWeapons_ctl01_imgWeapon" class="weapon" src="/images/halo3stats/weapons/5f8fbbf9-6267-4257-9a2d-24f8c2e5441d.gif" style="border-width:0px;" />
     <div class="num">71  (23.67%)</div></div>
<img id="ctl00_mainContent_rptWeapons_ctl02_imgWeapon" class="weapon" src="/images/halo3stats/weapons/fdb4005f-45a4-472a-8646-9763ebc75aad.gif" style="border-width:0px;" />
     <div class="num">45  (15.00%)</div></div>

 

Is it possible to build a loop that finds the following pattern and saves each match in its own variable? The number of matches is not fixed: the pattern may show up 20 times for one user and only 5 for another.

<img id=\"(.+?)" class=\"weapon\" src=\"(.+?)" style=\"border-width:0px;\" \/>\s+<div class=\"num\">(.+?)<\/div><\/div>

 

This is the script I am using now to find the other data, which doesn't need a loop. The page at this URL contains the data for the player in $tag.

$ch = curl_init();
$timeout = 5;
curl_setopt ($ch, CURLOPT_URL, 'http://www.bungie.net/stats/halo3/CareerStats.aspx?player=' . $tag . '&social=true&map=0');
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$in1 = curl_exec($ch);
curl_close($ch);

preg_match("/Kills :<\/td>\s+<td class=\"values\">(.+?)<\/td>/",$in1, $social_stats_kills); 
preg_match("/Deaths :<\/td>\s+<td class=\"values\">(.+?)<\/td>/",$in1, $social_stats_deaths); 
preg_match("/K\/D Ratio :<\/td>\s+<td class=\"values\">(.+?)<\/td>/",$in1, $social_stats_kdr); 

$h3gamertag = str_replace("%20"," ", $tag);
$social_stats_kills = $social_stats_kills[1];
$social_stats_deaths = $social_stats_deaths[1];
$social_stats_kdr = $social_stats_kdr[1];

 

I hope I made sense. Thank you in advance for any help that is provided.


Per the docs for preg_match_all:

 

If no order flag is given, PREG_PATTERN_ORDER is assumed.

 

PREG_PATTERN_ORDER

Orders results so that $matches[0] is an array of full pattern matches, $matches[1] is an array of strings matched by the first parenthesized subpattern, and so on.
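
To make the quoted layout concrete, here is a minimal sketch run against a trimmed copy of the sample markup from the first post. The $html string, the simplified pattern, and the $weapons array are assumptions for illustration, not the live page or a confirmed solution. It collects however many matches there are into one array instead of separate variables.

```php
<?php
// Assumed input: two rows copied from the sample in the first post.
$html = <<<HTML
<img id="ctl00_mainContent_rptWeapons_ctl00_imgWeapon" class="weapon" src="/images/halo3stats/weapons/e2b3837c-c27f-4497-a07d-8e59f153cff6.gif" style="border-width:0px;" />
     <div class="num">99  (33.00%)</div></div>
<img id="ctl00_mainContent_rptWeapons_ctl01_imgWeapon" class="weapon" src="/images/halo3stats/weapons/5f8fbbf9-6267-4257-9a2d-24f8c2e5441d.gif" style="border-width:0px;" />
     <div class="num">71  (23.67%)</div></div>
HTML;

// Single quotes and a # delimiter avoid escaping the double quotes and slashes.
$pattern = '#class="weapon" src="(.+?)" style="border-width:0px;" />\s+<div class="num">(.+?)</div></div>#';

// No flag given, so PREG_PATTERN_ORDER applies:
// $matches[1] holds every src, $matches[2] holds every "num" cell.
$count = preg_match_all($pattern, $html, $matches);

// Pair the two columns up, one entry per weapon, however many were found.
$weapons = [];
for ($i = 0; $i < $count; $i++) {
    $weapons[] = ['src' => $matches[1][$i], 'num' => $matches[2][$i]];
}

foreach ($weapons as $weapon) {
    echo $weapon['src'], ' => ', $weapon['num'], "\n";
}
```

Because preg_match_all returns the number of full matches, the loop bound adapts to 5 rows or 20 without any change to the code.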

 

The easiest way to get used to arrays is to wrap print_r in pre tags so you can see what you're working with, e.g.:

 

<pre>
<?php
print_r($array);
?>
</pre>
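
As a sketch of that inspection step, the snippet below dumps the result of preg_match_all with the PREG_SET_ORDER flag, which groups the captures per match rather than per subpattern. The short $html string and the digit-only pattern are assumptions chosen to keep the dump readable.

```php
<?php
// Assumed, shortened input: just two "num" cells.
$html = '<div class="num">99</div><div class="num">71</div>';

// PREG_SET_ORDER: $sets[0] is the first match (full text plus its captures),
// $sets[1] is the second match, and so on.
preg_match_all('#<div class="num">(\d+)</div>#', $html, $sets, PREG_SET_ORDER);

// Passing true makes print_r return the dump instead of printing it,
// so it can be escaped before being placed inside <pre>.
$dump = print_r($sets, true);
echo "<pre>\n", htmlspecialchars($dump), "</pre>\n";

// With set order, each iteration sees one complete match.
foreach ($sets as $set) {
    echo $set[1], "\n";
}
```

Which flag to use is mostly a matter of taste: pattern order suits column-wise work, set order suits a foreach over rows.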

 


I'm still having a little trouble.

 

The following is the source code I am working with (some parts missing that aren't important):

class="weapon" src="/images/halo3stats/weapons/0be8dc88-acc4-405d-9b82-1e0d8a4ca2f0.gif" style="border-width:0px;" />
     <div class="num">9,318  (26.71%)</div></div>
class="weapon" src="/images/halo3stats/weapons/0be8dc88-acc4-405d-9b82-1e0d8a4ca2f0.gif" style="border-width:0px;" />
     <div class="num">4,720  (13.53%)</div></div>
class="weapon" src="/images/halo3stats/weapons/0be8dc88-acc4-405d-9b82-1e0d8a4ca2f0.gif" style="border-width:0px;" />
     <div class="num">3,896  (11.17%)</div></div>
class="weapon" src="/images/halo3stats/weapons/0be8dc88-acc4-405d-9b82-1e0d8a4ca2f0.gif" style="border-width:0px;" />
     <div class="num">3,460  (9.92%)</div></div>

 

The following is my new code:

 

<?php
$tag = str_replace(" ","%20",$tag);

$ch = curl_init();
$timeout = 5;
curl_setopt ($ch, CURLOPT_URL, 'http://www.bungie.net/stats/halo3/CareerStats.aspx?player=' . $tag . '&social=true&map=0');
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$in1 = curl_exec($ch);
curl_close($ch);

preg_match_all("#class=\"weapon\" src=\"(.+?)\" style=\"border-width:0px;\" \/>\s+<div class=\"num\">(.+?)<\/div><\/div>#",$in1, $weapon_data);

echo "<img src=\"http://www.bungie.net" . $weapon_data[1][0] . "\"><br>" . $weapon_data[2][0] . "<br><br>\n";
echo "<img src=\"http://www.bungie.net" . $weapon_data[1][1] . "\"><br>" . $weapon_data[2][1] . "<br><br>\n";
echo "<img src=\"http://www.bungie.net" . $weapon_data[1][2] . "\"><br>" . $weapon_data[2][2] . "<br><br>\n";
echo "<img src=\"http://www.bungie.net" . $weapon_data[1][3] . "\"><br>" . $weapon_data[2][3] . "<br><br>\n";

?>

 

It works perfectly except for the output of $weapon_data[2][0], which is:

9,318Â Â (26.71%)

 

So I decided to separate the "9,318" and the "26.71%". I used the following:

preg_match_all("#class=\"weapon\" src=\"(.+?)\" style=\"border-width:0px;\" \/>\s+<div class=\"num\">([\,\d]+)\s\s\(([\.\d]+)\%\)<\/div><\/div>#",$in1, $weapon_data);

 

It doesn't find anything. I tested ([\,\d]+)\s\s\(([\.\d]+)\%\) with the PHP Live Regex Tester and it worked when just looking for 9,318  (26.71%). Any suggestions on a solution? I'm stumped.
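
Nobody in the thread confirmed the cause, but the "Â Â" in the output above is what UTF-8 non-breaking spaces (bytes C2 A0) typically look like when displayed as Latin-1. If that is what the page contains, \s\s would not match them, while a pasted test string with ordinary spaces would. The sketch below assumes exactly that; the $cell string is a hypothetical stand-in for the fetched markup.

```php
<?php
// Assumed input: the gap is two UTF-8 non-breaking spaces (bytes C2 A0 each).
$cell = "<div class=\"num\">9,318\xC2\xA0\xC2\xA0(26.71%)</div></div>";

// Sub-pattern from the post: \s\s covers only two bytes, and neither
// C2 nor the four-byte gap as a whole can satisfy it.
$strict = '#<div class="num">([,\d]+)\s\s\(([.\d]+)%\)</div></div>#';

// Tolerant variant: ordinary whitespace or a UTF-8 non-breaking space, one or more.
$tolerant = '#<div class="num">([,\d]+)(?:\s|\xC2\xA0)+\(([.\d]+)%\)</div></div>#';

$strictHits = preg_match($strict, $cell, $m1);
$tolerantHits = preg_match($tolerant, $cell, $m2);

// Strip the thousands separator before treating the count as a number.
$kills = (int) str_replace(',', '', $m2[1]);
$percent = (float) $m2[2];

echo "strict: $strictHits, tolerant: $tolerantHits\n";
echo "kills: $kills, percent: $percent\n";
```

If the page turns out to use a different character in the gap, dumping bin2hex() of the captured cell would show the actual bytes to match.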

