Expert needed!


drisate

Hey guys, I need to retrieve values from a foreign website, but I am very bad with regular expressions. From a board page, I need to retrieve the username and posted date for every thread on the page. The HTML looks like this, and I need to retrieve the red parts:

 

-------------------------------------

<table cellpadding="4" cellspacing="1" border="0" style="width:100%" class="tableinborder">

  <tr align="left">

    <td class="tablea" valign="top"><a name="post7580219" id="post7580219"></a>

    <table style="width:100%" cellpadding="4" cellspacing="0" border="0" class="tablea_fc">

      <tr>

        <td style="width:100%" class="smallfont"><span class="normalfont"><b>

        <a href="profile.php?userid=31486"> Da LaZ </a></b></span> <br />

        Heavy Fighter <br />

        <img src="en_images_ogame/star2.gif" border="0" alt title /><img src="en_images_ogame/star2.gif" border="0" alt title /><img src="en_images_ogame/star2.gif" border="0" alt title />

        <br />

        <br />

        <img src="images/avatars/avatar-49031.jpg" border="0" alt="images/avatars/avatar-49031.jpg" title /><br />

        <br />

        Registration Date: 03-10-2006<br />

        Posts: 1,102<br />

        Universe: uni1<br />

        Alliance: pirates<br />

        <br />

        <img src="en_images_ogame/spacer.gif" width="159" height="1" border="0" alt title /></td>

        </tr>

        </table>

        </td>

        <td class="tablea" valign="top" style="width:100%">

        <table style="width:100%" cellpadding="4" cellspacing="0" border="0" class="tablea_fc">

          <tr>

            <td style="width:100%" class="normalfont" align="left">

            <table style="width:100%" cellpadding="4" cellspacing="0" border="0" class="tablea_fc">

              <tr>

                <td><span class="smallfont"><b>evil vs POW</b></span></td>

                <td align="right" nowrap="nowrap">

                <a href="addreply.php?postid=7580219">

                <img src="en_images_ogame/replypost.gif" border="0" alt="Reply to this Post" title="Reply to this Post" /></a>

                <a href="addreply.php?action=quote&postid=7580219">

                <img src="en_images_ogame/quote.gif" border="0" alt="Post Reply with Quote" title="Post Reply with Quote" /></a>

                <a href="editpost.php?postid=7580219">

                <img src="en_images_ogame/editpost.gif" border="0" alt="Edit/Delete Posts" title="Edit/Delete Posts" /></a>

                <a href="report.php?postid=7580219">

                <img src="en_images_ogame/report.gif" border="0" alt="Report Post to a Moderator" title="Report Post to a Moderator" /></a>       

                <a href="javascript:self.scrollTo(0,0);">

                <img src="en_images_ogame/goup.gif" border="0" alt="Go to the top of this page" title="Go to the top of this page" /></a></td>

              </tr>

            </table>

            <hr size="1" class="threadline" />

            <div align="center">

              <br />

              message</div>

            </td>

            </tr>

          </table>

          </td>

        </tr>

        <tr>

          <td class="tablea" align="center" nowrap="nowrap">

          <span class="smallfont">

          <a href="thread.php?postid=7580219#post7580219">

          <img src="en_images_ogame/posticon.gif" border="0" alt title /></a> 03-24-2009

          <span class="time">02:37</span></span>  </td>

          <td class="tablea" align="left" style="width:100%" valign="middle">

          <span class="smallfont">

          <img src="en_images_ogame/user_offline.gif" border="0" alt="Da LaZ is offline" title="Da LaZ is offline" />

          <a href="search.php?action=user&userid=31486">

          <img src="en_images_ogame/search.gif" border="0" alt="Search for Posts by Da LaZ" title="Search for Posts by Da LaZ" /></a>

          <a href="usercp.php?action=buddy&add=31486">

          <img src="en_images_ogame/homie.gif" border="0" alt="Add Da LaZ to your Buddy List" title="Add Da LaZ to your Buddy List" /></a>

          <a href="pms.php?action=newpm&userid=31486">

          <img src="en_images_ogame/pm.gif" border="0" alt="Send a Private Message to Da LaZ" title="Send a Private Message to Da LaZ" /></a>

          </span></td>

        </tr>

        </table>

        </td>

        </tr>

        </table>

-------------------------------------

 

So, objective 1: loop over every thread on the page.

Objective 2: on each iteration, extract the username and posted date.

 

If you need a full page example, here is one: http://board.ogame.org/thread.php?threadid=537635


I'll supply the meat and potatoes, you supply the gravy if you get my drift ;)

 

$userName = array();
$postDate = array();
date_default_timezone_set('America/Montreal'); // *set this value to the correct timezone of the server in question

$dom = new DOMDocument;
@$dom->loadHTMLFile('http://board.ogame.org/thread.php?threadid=537635'); // @ hides the warnings raised by malformed HTML
$xpath = new DOMXPath($dom);
$aTag = $xpath->query('//a[substring(@href,1,19) ="profile.php?userid="]'); // extract user
$spanTag = $xpath->query('//td[@class="tablea" or @class="tableb"]/span'); // extract post date

foreach ($aTag as $aVal) {
	$userName[] = $aVal->nodeValue; // store user name into array $userName
}

foreach ($spanTag as $spanVal) {
	if (preg_match('#(?:(?:\d{2}-){2}\d{4}|Today,) \d{2}:\d{2}#', $spanVal->nodeValue, $match)) {
		$match[0] = str_replace('Today,', date('m-d-Y'), $match[0]); // if found, replace 'Today,' with today's date in xx-xx-2009 format
		$postDate[] = $match[0]; // store post date into array $postDate
	}
}

echo '<pre>'.print_r($userName, true).'</pre>'; // outputs all user names
echo '<pre>'.print_r($postDate, true).'</pre>'; // outputs all post dates
// *We set the default time zone in case the sequence 'Today,' is found within the time entry, which we convert to today's date using date(). Without it, date() raises a Strict Standards notice.

 

Output:

Array
(
    [0] => Da LaZ
    [1] => Zombie
    [2] => .GameOver
    [3] => .GameOver
    [4] => .GameOver
    [5] => kepone factory
    [6] => Necessary Evil
    [7] => greenie
)
Array
(
    [0] => 03-24-2009 02:37
    [1] => 03-26-2009 19:13
    [2] => 03-27-2009 23:45
    [3] => 03-29-2009 22:45
    [4] => 04-01-2009 20:24
    [5] => 04-01-2009 20:39
    [6] => 04-01-2009 20:49
    [7] => 04-10-2009 03:40
)
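
For anyone who wants to try the queries without hitting the live board, here is a minimal offline sketch. The two-post HTML fragment is a hypothetical, trimmed-down stand-in for the markup quoted in the question (the 'Today,' entry and the second user id are made up for illustration), and I have swapped in XPath's starts-with() and libxml_use_internal_errors() in place of substring() and the @ operator; treat it as one possible arrangement rather than the only way to do it.

```php
<?php
// Hypothetical two-post fragment, trimmed from the markup quoted in the question.
$html = <<<'HTML'
<table>
  <tr>
    <td class="tablea"><a href="profile.php?userid=31486"> Da LaZ </a></td>
    <td class="tablea"><span class="smallfont">03-24-2009 <span class="time">02:37</span></span></td>
  </tr>
  <tr>
    <td class="tableb"><a href="profile.php?userid=2"> Zombie </a></td>
    <td class="tableb"><span class="smallfont">Today, <span class="time">19:13</span></span></td>
  </tr>
</table>
HTML;

$dom = new DOMDocument;
libxml_use_internal_errors(true); // collect parser warnings instead of silencing them with @
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// User names: every link whose href starts with the profile URL prefix.
$userName = array();
foreach ($xpath->query('//a[starts-with(@href, "profile.php?userid=")]') as $a) {
    $userName[] = trim($a->nodeValue);
}

// Post dates: spans directly inside the post cells that contain a date or
// a relative day word followed by a time.
$postDate = array();
foreach ($xpath->query('//td[@class="tablea" or @class="tableb"]/span') as $span) {
    if (preg_match('#(?:(?:\d{2}-){2}\d{4}|Today,|Yesterday,)\s+\d{2}:\d{2}#', $span->nodeValue, $match)) {
        $postDate[] = $match[0];
    }
}

print_r($userName);
print_r($postDate);
```

The pattern uses \s+ between the date and the time so that it tolerates a line break in the markup, which a single literal space would not.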


I forgot about 'Yesterday,' as a possibility. The pattern only matches 'Today,', so first add |Yesterday, right after Today, inside the pattern. Then, after this line:

$match[0] = str_replace('Today,', date('m-d-Y'), $match[0]); // if found, replace 'Today,' with today's date in xx-xx-2009 format

You can add:

$match[0] = str_replace('Yesterday,', date('m-d-Y', strtotime("-1 day")), $match[0]); // if found, replace 'Yesterday,' with yesterday's date (today -1 day) in xx-xx-2009 format
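
If you would rather keep both replacements in one place, something along these lines might do. normalizePostDate() is a hypothetical helper name (not part of the snippet above), the reference timestamp is passed in so the result can be checked against a fixed day, and 'UTC' is only a stand-in for whatever timezone the board's server uses.

```php
<?php
// Hypothetical helper: replaces 'Today,' or 'Yesterday,' in a post date with
// an m-d-Y date computed relative to $now (a Unix timestamp).
function normalizePostDate($text, $now)
{
    $replacements = array(
        'Today,'     => date('m-d-Y', $now),
        'Yesterday,' => date('m-d-Y', strtotime('-1 day', $now)),
    );
    // strtr() leaves the string untouched when neither word is present.
    return strtr(trim($text), $replacements);
}

date_default_timezone_set('UTC'); // assumption: use the board server's timezone here
$now = mktime(12, 0, 0, 4, 10, 2009); // fixed reference time: 04-10-2009 12:00

echo normalizePostDate('Today, 03:40', $now), "\n";
echo normalizePostDate('Yesterday, 20:49', $now), "\n";
echo normalizePostDate('03-24-2009 02:37', $now), "\n";
```

In the real loop you would pass time() as the second argument.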


While I was out for a walk, I thought about this thread and what I had provided, and figured it could be faster (by how much, I don't know; I didn't time it).

So I tweaked the snippet here and there. This version has tighter entry extraction and doesn't use regex (and now I'm done):

 

$userName = array();
$postDate = array();
date_default_timezone_set('America/Montreal'); // *set this value to the correct timezone of the server in question

$dom = new DOMDocument;
@$dom->loadHTMLFile('http://board.ogame.org/thread.php?threadid=537635'); // @ hides the warnings raised by malformed HTML
$xpath = new DOMXPath($dom);
$aTag = $xpath->query('//a[substring(@href,1,19) ="profile.php?userid="]'); // extract user
$spanTag = $xpath->query('//td[@class="tablea" or @class="tableb"]/span[contains(.,":")]'); // extract post date

foreach ($aTag as $aVal) {
	$userName[] = $aVal->nodeValue; // store user name into array $userName
}

foreach ($spanTag as $spanVal) {
	if (strlen($spanVal->nodeValue) < 27) { // longer spans that happen to contain ':' are not date entries
		$text = trim($spanVal->nodeValue); // work on a copy instead of rewriting the DOM node
		if (substr($text, 0, 6) == 'Today,') {
			$text = str_replace('Today,', date('m-d-Y'), $text);
		} elseif (substr($text, 0, 10) == 'Yesterday,') {
			$text = str_replace('Yesterday,', date('m-d-Y', strtotime("-1 day")), $text);
		}
		$postDate[] = $text; // store post date into array $postDate
	}
}

echo '<pre>'.print_r($userName, true).'</pre>'; // outputs all user names
echo '<pre>'.print_r($postDate, true).'</pre>'; // outputs all post dates

 

Either version should accomplish the same thing.
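
Both versions leave you with two parallel arrays. If you want one record per post, a rough sketch of pairing them up could look like this; the sample values are copied from the output shown earlier in the thread, and the 'user' and 'date' keys are just names I picked for illustration.

```php
<?php
// Sample values copied from the output shown earlier in the thread.
$userName = array('Da LaZ', 'Zombie', '.GameOver');
$postDate = array('03-24-2009 02:37', '03-26-2009 19:13', '03-27-2009 23:45');

$posts = array();
// Only pair the arrays when they line up; a count mismatch suggests one of
// the queries matched something unexpected on the page.
if (count($userName) === count($postDate)) {
    foreach ($userName as $i => $name) {
        $posts[] = array('user' => $name, 'date' => $postDate[$i]);
    }
}

foreach ($posts as $post) {
    echo $post['user'], ' posted on ', $post['date'], "\n";
}
```

array_combine() would not be a good fit here, because the same user can post more than once and duplicate keys would overwrite each other.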


Archived

This topic is now archived and is closed to further replies.
