
Data Screen Scrape


ainoy31


I am trying to get the value of the hidden __VIEWSTATE field from an ASPX form, but the returned array is empty.

 

Here is my code:

<?php
$url = "http://simpleinternetsite.com";

$channel = curl_init();
curl_setopt($channel, CURLOPT_URL, $url);
curl_setopt($channel, CURLOPT_FAILONERROR, 1);
curl_setopt($channel, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($channel, CURLOPT_RETURNTRANSFER, 1);

$data = curl_exec($channel);

if ($data)
{
    preg_match('/<input id="__VIEWSTATE" name="__VIEWSTATE" type="hidden" value="([^"]*?)">/', $data, $matches);

    print_r($matches);
}

?>

 

Any help is much appreciated. AM

Link to comment
https://forums.phpfreaks.com/topic/163263-data-screen-scrape/
Share on other sites

Assuming the input tag has those attributes in that particular order...

 

Example:

$html = <<<END
<div>
<input type="hidden" name="ctl00_ScriptManager1_HiddenField" id="ctl00_ScriptManager1_HiddenField" value="" />
<input type="hidden" name="__LASTFOCUS" id="__LASTFOCUS" value="" />
<input type="hidden" name="__EVENTTARGET" id="__EVENTTARGET" value="" />
<input type="hidden" name="__EVENTARGUMENT" id="__EVENTARGUMENT" value="" />
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEPDwUKMTM4NjcyMDM5N2QYAgUyY3RsMDAkTWFpbkNvbnRlbnQkY3RsMDAkY3RsMDAkRmVhdHVyZWRQcm9kdWN0c1ZpZXcPD2RmZAUkY3RsMDAkRGlzY291bnRTaG9wcGVyQmFubmVyJG12QmFubmVyDw9kAgFkYW+MRR7yYW2BPd+5NsA+6H9x/D8=" />
</div>
END;

preg_match('#<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="([^"]+)" />#i', $html, $matches);
echo $matches[1];

 

Output:

/wEPDwUKMTM4NjcyMDM5N2QYAgUyY3RsMDAkTWFpbkNvbnRlbnQkY3RsMDAkY3RsMDAkRmVhdHVyZWRQcm9kdWN0c1ZpZXcPD2RmZAUkY3RsMDAkRGlzY291bnRTaG9wcGVyQmFubmVyJG12QmFubmVyDw9kAgFkYW+MRR7yYW2BPd+5NsA+6H9x/D8=

 

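In your code, you didn't have to make the star into a lazy quantifier, since [^"] already stops at the closing quote. You also forgot the [space]/ part after the closing quote at the end of the tag (I'm going off the code in the URL you mentioned). Note too that your pattern expects the attributes in the order id, name, type, whereas the markup above has type, name, id.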
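To make those points concrete, here is a minimal sketch of the original pattern with the attribute order matched to the sample markup and the trailing " /" made optional. The names `$tag` and `$pattern` are only for illustration, and the value is a shortened placeholder rather than real view state.

```php
<?php
// Sample tag in the same attribute order as the example markup above.
// The value is a shortened placeholder, not real view state.
$tag = '<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEPDwUKMTM4" />';

// [^"]* already stops at the closing quote, so no lazy quantifier is needed.
// \s*/? accepts both the XHTML-style " />" ending and a plain ">".
$pattern = '#<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="([^"]*)"\s*/?>#i';

// preg_match() returns 1 on a match; fall back to null otherwise.
$viewstate = preg_match($pattern, $tag, $matches) ? $matches[1] : null;
var_dump($viewstate);
```

This still depends on the attributes appearing in exactly this order, so it only helps when the page markup is known and stable.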


 

RAWR!

 

preg_match('~<input[^>]*((?:id|name)\s?=\s?["\']__VIEWSTATE["\'])?[^>]*value\s?=\s?["\']([^"\']*)["\'](?(1)|[^>]*(?:id|name)\s?=\s?["\']__VIEWSTATE["\'])[^>]*>~i', $html, $matches);

 

Your value will be found in $matches[2]
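Since preg_match() returns 1 on a match, 0 on no match and false on error, it may be worth checking the result before reading the capture group. A small sketch, assuming the page markup is already in `$html` (here a one-tag placeholder with a made-up value instead of the cURL response):

```php
<?php
// Placeholder markup; in practice $html would be the cURL response body.
$html = "<input type='hidden' value='abc123' id='__VIEWSTATE'>";

// Same pattern as in the post above.
$pattern = '~<input[^>]*((?:id|name)\s?=\s?["\']__VIEWSTATE["\'])?[^>]*value\s?=\s?["\']([^"\']*)["\'](?(1)|[^>]*(?:id|name)\s?=\s?["\']__VIEWSTATE["\'])[^>]*>~i';

if (preg_match($pattern, $html, $matches) === 1) {
    // Group 1 is the optional id/name attribute before the value;
    // group 2 holds the value itself.
    $viewstate = $matches[2];
    echo $viewstate, "\n";
} else {
    $viewstate = null;
    echo "No __VIEWSTATE field found\n";
}
```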

 

Yessiree, unlike your last prom date, this chica won't say no to you!

 

So basically this model offers top-of-the-line fault tolerance for your content-stealing scraping needs. If your input tag has an id or name attribute set to __VIEWSTATE anywhere in the tag, it will get that value.

 

- Only id? MATCHED.

- Only name? MATCHED.

- Both? MATCHED. 

- Before the value? MATCHED.

- After the value? MATCHED. 

- A space on either side of the equals sign? MATCHED. 

- Single quotes used? MATCHED. 

- Double quotes used? MATCHED. 

- Case-insensitive? MATCHED. 
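The checklist above can be tried out with a short script. This is only a sketch: the tag variants and the value "abc123" are made up for illustration, one per case in the list.

```php
<?php
// Same pattern as in the post above.
$pattern = '~<input[^>]*((?:id|name)\s?=\s?["\']__VIEWSTATE["\'])?[^>]*value\s?=\s?["\']([^"\']*)["\'](?(1)|[^>]*(?:id|name)\s?=\s?["\']__VIEWSTATE["\'])[^>]*>~i';

// Hypothetical tag variants; "abc123" is a made-up value.
$variants = [
    'only id'         => '<input type="hidden" id="__VIEWSTATE" value="abc123" />',
    'only name'       => '<input type="hidden" name="__VIEWSTATE" value="abc123" />',
    'both'            => '<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="abc123" />',
    'after the value' => '<input type="hidden" value="abc123" id="__VIEWSTATE">',
    'spaces around =' => '<input type="hidden" id = "__VIEWSTATE" value = "abc123">',
    'single quotes'   => "<input type='hidden' id='__VIEWSTATE' value='abc123'>",
    'upper case'      => '<INPUT TYPE="hidden" ID="__VIEWSTATE" VALUE="abc123">',
];

// Collect the captured value (or null) for each variant and print a summary.
$results = [];
foreach ($variants as $label => $tag) {
    $results[$label] = preg_match($pattern, $tag, $m) === 1 ? $m[2] : null;
    printf("%-16s %s\n", $label, $results[$label] ?? 'NO MATCH');
}
```

Note that \s? allows at most one whitespace character on each side of the equals sign, so markup with wider spacing would need \s* instead.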

 

 



