Reading a page, but first logging in


Markieham

Hello :)

I am starting work on a new project. For it, I am parsing several calendars and then putting them all together.

Now I've run into my first problem: a password-protected website. An account has been set up for the script, but I need to find a way to log in, after which I can navigate to the right page and start parsing.

Browsing and googling, I found something about cURL, and I'm looking into it now... but being quite new to PHP at this level, I'm finding it hard to find what I need.

Could anyone point me in the right direction for making a script log in (there is a logged-in confirmation page, too), then navigate to the right page and parse it? Maybe even a simple example? :)

Any help is greatly appreciated!


Here is a start. Change the variable values to suit your site. You should convert this into a function so you can make multiple calls: the first call requests the login page and, if the login succeeds, the second call requests the calendar page, and so on. Give the function a parameter for switching between HTTP methods such as POST and GET. Submitting a form usually uses POST; requesting a page usually uses GET.

<?php
$ch = curl_init();
# Form fields to submit, where the key is the field name
$postData = array('username' => 'foo', 'password' => 'bar');
# Convert data array into a query string (ie animal=dog&sport=baseball)
$tempString = array();
foreach($postData as $key => $value) {
	if(strlen(trim($value)) > 0) {
		# URL-encode key and value so characters like & and = survive
		$tempString[] = urlencode($key) . "=" . urlencode($value);
	}
	else {
		# Empty value: send the field name on its own
		$tempString[] = urlencode($key);
	}
}
$queryString = join('&', $tempString);
curl_setopt($ch, CURLOPT_POSTFIELDS, $queryString);
curl_setopt($ch, CURLOPT_POST, TRUE);
curl_setopt($ch, CURLOPT_NOBODY, FALSE);

$cookiePath = '/tmp/cookies.txt';
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookiePath); 
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookiePath);
curl_setopt($ch, CURLOPT_TIMEOUT, 60);
# add browser user agent
$userAgent = 'CURL BOT';
curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
# target url & referer
$target = 'http://www.xyz.com/login';
$referer = 'http://www.xyz.com';
curl_setopt($ch, CURLOPT_URL, $target);
curl_setopt($ch, CURLOPT_REFERER, $referer);

# Disables SSL certificate verification; only suitable for testing
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_MAXREDIRS, 4);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);     


# Create return array
$returnArray['FILE']   = curl_exec($ch); 
$returnArray['STATUS'] = curl_getinfo($ch);
$returnArray['ERROR']  = curl_error($ch);
curl_close($ch);

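if(!strlen($returnArray['ERROR'])) {
	// No cURL error: check $returnArray['FILE'] for the login confirmation,
	// then move on to the next page using the same cookie file
}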
?>
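
The suggestion above to wrap the request in a function with a switchable HTTP method could be sketched roughly like this. It is only a sketch under assumptions: the names prepareRequest and fetchPage are made up for illustration, http_build_query is swapped in for the hand-written query string loop, and the cookie path and user agent are the placeholder values from the post above. The SSL_VERIFYPEER override is left out here, so certificate verification stays at cURL's default.

```php
<?php
/**
 * Hypothetical helper: work out the final URL and request body for a call.
 * POST data goes in the body; GET data is appended to the URL.
 */
function prepareRequest($url, $method = 'GET', array $data = array())
{
    // Builds e.g. animal=dog&sport=baseball with URL-encoded keys and values
    $query  = http_build_query($data, '', '&');
    $isPost = (strtoupper($method) === 'POST');

    if (!$isPost && $query !== '') {
        // Append to the URL, respecting an existing query string
        $url .= (strpos($url, '?') === false ? '?' : '&') . $query;
    }

    return array(
        'url'  => $url,
        'post' => $isPost,
        'body' => $isPost ? $query : '',
    );
}

/**
 * Hypothetical helper: perform one request and return the same
 * FILE / STATUS / ERROR array used in the post above.
 */
function fetchPage($url, $method = 'GET', array $data = array(), $cookiePath = '/tmp/cookies.txt')
{
    $request = prepareRequest($url, $method, $data);

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $request['url']);
    if ($request['post']) {
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, $request['body']);
    }
    // Same cookie file for reading and writing, so the login session
    // from one call is sent along with the next call
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookiePath);
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookiePath);
    curl_setopt($ch, CURLOPT_TIMEOUT, 60);
    curl_setopt($ch, CURLOPT_USERAGENT, 'CURL BOT');
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 4);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    $result = array(
        'FILE'   => curl_exec($ch),    // page body, or false on failure
        'STATUS' => curl_getinfo($ch), // array of transfer details
        'ERROR'  => curl_error($ch),   // empty string when no error occurred
    );
    curl_close($ch);

    return $result;
}
```

With something like this, the first call might be fetchPage('http://www.xyz.com/login', 'POST', array('username' => 'foo', 'password' => 'bar')), and if its ERROR entry is empty and its FILE entry contains the confirmation page, a second fetchPage call with the calendar page's URL would reuse the cookies saved by the first.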
