drisate, posted April 10, 2009

Hey guys, I need to retrieve values from a foreign website, but I am very bad with regular expressions. From a board page, I need to retrieve the username and posted date of every post on the page. The HTML looks like this; the parts I need were marked in red in the original post (the username, here Da LaZ, and the posted date, here 03-24-2009 02:37):

-------------------------------------
<table cellpadding="4" cellspacing="1" border="0" style="width:100%" class="tableinborder">
 <tr align="left">
  <td class="tablea" valign="top"><a name="post7580219" id="post7580219"></a>
   <table style="width:100%" cellpadding="4" cellspacing="0" border="0" class="tablea_fc">
    <tr>
     <td style="width:100%" class="smallfont"><span class="normalfont"><b>
      <a href="profile.php?userid=31486"> Da LaZ </a></b></span>
      <br /> Heavy Fighter <br />
      <img src="en_images_ogame/star2.gif" border="0" alt="" title="" /><img src="en_images_ogame/star2.gif" border="0" alt="" title="" /><img src="en_images_ogame/star2.gif" border="0" alt="" title="" />
      <br /> <br />
      <img src="images/avatars/avatar-49031.jpg" border="0" alt="images/avatars/avatar-49031.jpg" title="" /><br /> <br />
      Registration Date: 03-10-2006<br />
      Posts: 1,102<br />
      Universe: uni1<br />
      Alliance: pirates<br /> <br />
      <img src="en_images_ogame/spacer.gif" width="159" height="1" border="0" alt="" title="" /></td>
    </tr>
   </table>
  </td>
  <td class="tablea" valign="top" style="width:100%">
   <table style="width:100%" cellpadding="4" cellspacing="0" border="0" class="tablea_fc">
    <tr>
     <td style="width:100%" class="normalfont" align="left">
      <table style="width:100%" cellpadding="4" cellspacing="0" border="0" class="tablea_fc">
       <tr>
        <td><span class="smallfont"><b>evil vs POW</b></span></td>
        <td align="right" nowrap="nowrap">
         <a href="addreply.php?postid=7580219"> <img src="en_images_ogame/replypost.gif" border="0" alt="Reply to this Post" title="Reply to this Post" /></a>
         <a href="addreply.php?action=quote&postid=7580219"> <img src="en_images_ogame/quote.gif" border="0" alt="Post Reply with Quote" title="Post Reply with Quote" /></a>
         <a href="editpost.php?postid=7580219"> <img src="en_images_ogame/editpost.gif" border="0" alt="Edit/Delete Posts" title="Edit/Delete Posts" /></a>
         <a href="report.php?postid=7580219"> <img src="en_images_ogame/report.gif" border="0" alt="Report Post to a Moderator" title="Report Post to a Moderator" /></a>
         <a href="javascript:self.scrollTo(0,0);"> <img src="en_images_ogame/goup.gif" border="0" alt="Go to the top of this page" title="Go to the top of this page" /></a></td>
       </tr>
      </table>
      <hr size="1" class="threadline" />
      <div align="center"> <br /> message</div>
     </td>
    </tr>
   </table>
  </td>
 </tr>
 <tr>
  <td class="tablea" align="center" nowrap="nowrap">
   <span class="smallfont"> <a href="thread.php?postid=7580219#post7580219"> <img src="en_images_ogame/posticon.gif" border="0" alt="" title="" /></a> 03-24-2009 <span class="time">02:37</span></span>
  </td>
  <td class="tablea" align="left" style="width:100%" valign="middle">
   <span class="smallfont">
    <img src="en_images_ogame/user_offline.gif" border="0" alt="Da LaZ is offline" title="Da LaZ is offline" />
    <a href="search.php?action=user&userid=31486"> <img src="en_images_ogame/search.gif" border="0" alt="Search for Posts by Da LaZ" title="Search for Posts by Da LaZ" /></a>
    <a href="usercp.php?action=buddy&add=31486"> <img src="en_images_ogame/homie.gif" border="0" alt="Add Da LaZ to your Buddy List" title="Add Da LaZ to your Buddy List" /></a>
    <a href="pms.php?action=newpm&userid=31486"> <img src="en_images_ogame/pm.gif" border="0" alt="Send a Private Message to Da LaZ" title="Send a Private Message to Da LaZ" /></a>
   </span></td>
 </tr>
</table> </td> </tr> </table>
-------------------------------------

So:
Objective 1: loop over every post on the page.
Objective 2: on each iteration, extract the username and the posted date.

If you need a full-page example, here is one: http://board.ogame.org/thread.php?threadid=537635

nrg_alpha, posted April 10, 2009

I'll supply the meat and potatoes, you supply the gravy, if you get my drift.

    $userName = array();
    $postDate = array();
    date_default_timezone_set('America/Montreal'); // *set this value to the correct timezone of the server in question

    $dom = new DOMDocument;
    @$dom->loadHTMLFile('http://board.ogame.org/thread.php?threadid=537635'); // @ suppresses warnings about malformed HTML
    $xpath = new DOMXPath($dom);
    $aTag = $xpath->query('//a[substring(@href,1,19) ="profile.php?userid="]'); // extract user
    $spanTag = $xpath->query('//td[@class="tablea" or @class="tableb"]/span'); // extract post date

    foreach ($aTag as $aVal) {
        $userName[] = $aVal->nodeValue; // store user name into array $userName
    }

    foreach ($spanTag as $spanVal) {
        if (preg_match('#(?:(?:\d{2}-){2}\d{4}|Today,) \d{2}:\d{2}#', $spanVal->nodeValue, $match)) {
            $match[0] = str_replace('Today,', date('m-d-Y'), $match[0]); // if found, replace 'Today,' with today's date in xx-xx-2009 format
            $postDate[] = $match[0]; // store post date into array $postDate
        }
    }

    echo '<pre>'.print_r($userName, true).'</pre>'; // outputs all user names
    echo '<pre>'.print_r($postDate, true).'</pre>'; // outputs all post dates

    // *We set the default time zone in case the sequence 'Today,' is found within the time entry,
    // which we convert to today's date using date(). Otherwise, there will be a Strict Standards notice.

Output:

    Array
    (
        [0] => Da LaZ
        [1] => Zombie
        [2] => .GameOver
        [3] => .GameOver
        [4] => .GameOver
        [5] => kepone factory
        [6] => Necessary Evil
        [7] => greenie
    )
    Array
    (
        [0] => 03-24-2009 02:37
        [1] => 03-26-2009 19:13
        [2] => 03-27-2009 23:45
        [3] => 03-29-2009 22:45
        [4] => 04-01-2009 20:24
        [5] => 04-01-2009 20:39
        [6] => 04-01-2009 20:49
        [7] => 04-10-2009 03:40
    )
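
For anyone who wants to try the two XPath queries above without hitting the live board, here is a minimal sketch that runs them against an inline fragment. The `$html` sample is a hypothetical, trimmed-down version of the markup from the question, and the variable names `$userNames` and `$postDates` are illustrative only. It swaps `substring(@href,1,19)` for the XPath 1.0 `starts-with()` function, which should select the same links here, and uses `libxml_use_internal_errors()` instead of `@` to keep parser warnings quiet.

```php
<?php
// Hypothetical sample markup, trimmed from the post structure quoted in the question.
$html = <<<'HTML'
<table>
  <tr>
    <td class="tablea"><span class="normalfont"><b>
      <a href="profile.php?userid=31486"> Da LaZ </a></b></span></td>
  </tr>
  <tr>
    <td class="tablea"><span class="smallfont">
      <a href="thread.php?postid=7580219#post7580219">link</a>
      03-24-2009 <span class="time">02:37</span></span></td>
  </tr>
</table>
HTML;

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument;
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// User names: every link whose href begins with the profile URL prefix.
$userNames = array();
foreach ($xpath->query('//a[starts-with(@href, "profile.php?userid=")]') as $a) {
    $userNames[] = trim($a->nodeValue);
}

// Post dates: spans directly inside tablea/tableb cells that contain a date and time.
$postDates = array();
foreach ($xpath->query('//td[@class="tablea" or @class="tableb"]/span') as $span) {
    if (preg_match('#(?:(?:\d{2}-){2}\d{4}|Today,) \d{2}:\d{2}#', $span->nodeValue, $m)) {
        $postDates[] = $m[0];
    }
}

print_r($userNames);
print_r($postDates);
```

The nested `<span class="time">` is not matched on its own because the query only selects spans that are direct children of the table cell; its text is still part of the outer span's `nodeValue`, which is why the date and the time come back as one string.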

nrg_alpha, posted April 10, 2009

I forgot about 'Yesterday,' as a possibility, so after this line:

    $match[0] = str_replace('Today,', date('m-d-Y'), $match[0]); // if found, replace 'Today,' with today's date in xx-xx-2009 format

You can add:

    $match[0] = str_replace('Yesterday,', date('m-d-Y', strtotime("-1 day")), $match[0]); // if found, replace 'Yesterday,' with today's date -1 day in xx-xx-2009 format
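
The 'Today,' and 'Yesterday,' replacements could also be folded into one small helper. This is only a sketch: the function name `normalizePostDate()` and its optional `$now` parameter are made up for illustration, and the parameter exists only so the result can be checked against a fixed reference time.

```php
<?php
date_default_timezone_set('America/Montreal'); // assumption: same zone as in the snippet above

/**
 * Replace a leading 'Today,' or 'Yesterday,' with a date in m-d-Y format.
 * $now is a Unix timestamp used as "today"; it defaults to the current time.
 */
function normalizePostDate($text, $now = null)
{
    $now = ($now === null) ? time() : $now;
    $replacements = array(
        'Today,'     => date('m-d-Y', $now),
        'Yesterday,' => date('m-d-Y', strtotime('-1 day', $now)),
    );
    // strtr() applies both replacements in a single pass over the trimmed text.
    return strtr(trim($text), $replacements);
}

$now = mktime(12, 0, 0, 4, 10, 2009); // fixed reference time: April 10, 2009, noon

echo normalizePostDate('Today, 03:40', $now), "\n";     // 04-10-2009 03:40
echo normalizePostDate('Yesterday, 22:15', $now), "\n"; // 04-09-2009 22:15
echo normalizePostDate('03-24-2009 02:37', $now), "\n"; // 03-24-2009 02:37
```

Entries that already carry a full date pass through unchanged, so the helper can be called on every matched value.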

nrg_alpha, posted April 11, 2009

As I went out for a walk, I pondered this thread and what I provided, and thought it could be faster (how much, I don't know; I didn't time it). Therefore, I tweaked the snippet here and there. This version has tighter entry extraction and doesn't use regex (and now I'm done):

    $userName = array();
    $postDate = array();
    date_default_timezone_set('America/Montreal'); // *set this value to the correct timezone of the server in question

    $dom = new DOMDocument;
    @$dom->loadHTMLFile('http://board.ogame.org/thread.php?threadid=537635'); // @ suppresses warnings about malformed HTML
    $xpath = new DOMXPath($dom);
    $aTag = $xpath->query('//a[substring(@href,1,19) ="profile.php?userid="]'); // extract user
    $spanTag = $xpath->query('//td[@class="tablea" or @class="tableb"]/span[contains(.,":")]'); // extract post date

    foreach ($aTag as $aVal) {
        $userName[] = $aVal->nodeValue; // store user name into array $userName
    }

    foreach ($spanTag as $spanVal) {
        if (strlen($spanVal->nodeValue) < 27) { // the length check skips longer spans that also contain ':'
            $text = trim($spanVal->nodeValue);
            if (substr($text, 0, 6) == 'Today,') {
                $text = str_replace('Today,', date('m-d-Y'), $text);
            } elseif (substr($text, 0, 10) == 'Yesterday,') {
                $text = str_replace('Yesterday,', date('m-d-Y', strtotime("-1 day")), $text);
            }
            $postDate[] = $text; // store post date into array $postDate
        }
    }

    echo '<pre>'.print_r($userName, true).'</pre>'; // outputs all user names
    echo '<pre>'.print_r($postDate, true).'</pre>'; // outputs all post dates

Either version should accomplish the same thing.
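
Both versions leave the names and dates in two parallel arrays, while the original question asked for a username and date per post. One possible way to pair them up is sketched below, assuming both queries return exactly one entry per post. The `$posts` array and its `user`, `date`, and `iso` keys are hypothetical names; the sample values are copied from the output posted earlier in the thread. `DateTime::createFromFormat()` requires PHP 5.3 or later.

```php
<?php
date_default_timezone_set('America/Montreal'); // assumption: same zone as the snippets above

// Sample values copied from the output posted earlier in the thread.
$userName = array('Da LaZ', 'Zombie', '.GameOver');
$postDate = array('03-24-2009 02:37', '03-26-2009 19:13', '03-27-2009 23:45');

$posts = array();

// Pairing by index is only safe when both queries returned the same number of entries.
if (count($userName) === count($postDate)) {
    foreach ($userName as $i => $name) {
        // Parse the board's m-d-Y H:i format; createFromFormat() returns false on failure.
        $parsed = DateTime::createFromFormat('m-d-Y H:i', $postDate[$i]);
        $posts[] = array(
            'user' => trim($name),
            'date' => $postDate[$i],
            'iso'  => $parsed ? $parsed->format('Y-m-d H:i') : null,
        );
    }
}

foreach ($posts as $post) {
    echo $post['user'], ' | ', $post['date'], ' | ', $post['iso'], "\n";
}
```

If the counts ever differ (for example, because an extra profile link appears elsewhere on the page), `$posts` stays empty rather than silently pairing the wrong name with a date.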