Markieham Posted January 18, 2010

Hello,

I am starting work on a new project in which I parse several calendars and then merge them. I have run into my first problem: a password-protected website. An account has been set up for the script, but I need a way to log in, after which the script should navigate to the right page and start parsing.

Browsing and googling, I found something about cURL and I'm looking into it now, but being quite new to PHP at this level, I'm finding it hard to find what I need. Could anyone point me in the right direction for making a script log in (there is a logged-in confirmation page, too), then navigate to the right page and parse it? Maybe even a simple example?

Any help is greatly appreciated!
JonnoTheDev Posted January 18, 2010

Here is a start. Change the values in the variables to suit your site. What you should do is convert this into a function so you can make multiple calls: the first call is the login page and, if the login succeeds, the second call is the calendar page, and so on. Give the function a parameter so you can switch between HTTP methods such as POST and GET. If you are submitting a form, the usual method is POST. To request a page, the usual method is GET.

<?php
$ch = curl_init();

# Form fields to submit, where the key is the field name
$postData = array('username' => 'foo', 'password' => 'bar');

# Convert the data array into a query string (i.e. animal=dog&sport=baseball).
# Values are assumed to be scalar strings.
$tempString = array();
foreach ($postData as $key => $value) {
    if (strlen(trim($value)) > 0) {
        $tempString[] = $key . '=' . urlencode($value);
    } else {
        # Empty values are sent as a bare field name
        $tempString[] = $key;
    }
}
$queryString = join('&', $tempString);

curl_setopt($ch, CURLOPT_POSTFIELDS, $queryString);
curl_setopt($ch, CURLOPT_POST, TRUE);
curl_setopt($ch, CURLOPT_NOBODY, FALSE);

# Save and resend cookies so the login session carries over to later requests
$cookiePath = '/tmp/cookies.txt';
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookiePath);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookiePath);
curl_setopt($ch, CURLOPT_TIMEOUT, 60);

# Add browser user agent
$userAgent = 'CURL BOT';
curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);

# Target URL & referer
$target = 'http://www.xyz.com/login';
$referer = 'http://www.xyz.com';
curl_setopt($ch, CURLOPT_URL, $target);
curl_setopt($ch, CURLOPT_REFERER, $referer);

# Note: this turns off SSL certificate verification
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_MAXREDIRS, 4);
# Return the response as a string instead of printing it
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);

# Create return array
$returnArray = array();
$returnArray['FILE'] = curl_exec($ch);
$returnArray['STATUS'] = curl_getinfo($ch);
$returnArray['ERROR'] = curl_error($ch);
curl_close($ch);

if (!strlen($returnArray['ERROR'])) {
    // move on to the next page
}
?>
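The suggestion above to wrap the request in a function with a switchable HTTP method could look roughly like the sketch below. It is only one possible arrangement, not the poster's own code: the names buildRequestOptions, fetchPage and fetchCalendarAfterLogin, the 'Logged in' marker text and the /calendar URL are all hypothetical placeholders. It also swaps the hand-written query-string loop for PHP's built-in http_build_query.

```php
<?php
// Sketch only: function names, the '/calendar' URL and the 'Logged in'
// marker are assumptions made for illustration.

// Build the cURL option array for a single GET or POST request.
function buildRequestOptions($url, $method = 'GET', array $postData = array(), $cookiePath = '/tmp/cookies.txt')
{
    $options = array(
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,   // return the body as a string
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_MAXREDIRS      => 4,
        CURLOPT_TIMEOUT        => 60,
        CURLOPT_USERAGENT      => 'CURL BOT',
        CURLOPT_COOKIEJAR      => $cookiePath,  // cookies are written here
        CURLOPT_COOKIEFILE     => $cookiePath,  // and read back from here
    );

    if (strtoupper($method) === 'POST') {
        $options[CURLOPT_POST] = true;
        // http_build_query URL-encodes keys and values for us
        $options[CURLOPT_POSTFIELDS] = http_build_query($postData);
    } else {
        $options[CURLOPT_HTTPGET] = true;
    }

    return $options;
}

// Perform one request and return the same FILE/STATUS/ERROR array as above.
function fetchPage($url, $method = 'GET', array $postData = array(), $cookiePath = '/tmp/cookies.txt')
{
    $ch = curl_init();
    curl_setopt_array($ch, buildRequestOptions($url, $method, $postData, $cookiePath));

    $result = array(
        'FILE'   => curl_exec($ch),
        'STATUS' => curl_getinfo($ch),
        'ERROR'  => curl_error($ch),
    );
    curl_close($ch);

    return $result;
}

// First call logs in; second call fetches the calendar only if the
// confirmation page contains the (assumed) marker text.
function fetchCalendarAfterLogin()
{
    $login = fetchPage('http://www.xyz.com/login', 'POST', array('username' => 'foo', 'password' => 'bar'));

    if ($login['ERROR'] !== '' || !is_string($login['FILE'])) {
        return false;
    }
    if (strpos($login['FILE'], 'Logged in') === false) {
        return false;
    }

    // The cookie file now holds the session, so a plain GET should work
    return fetchPage('http://www.xyz.com/calendar');
}
```

Calling fetchCalendarAfterLogin() would return false if the login fails, or the FILE/STATUS/ERROR array for the calendar page, whose FILE entry is the HTML to parse. Both calls share the same cookie file, which is what lets the second request reuse the logged-in session.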
Markieham Posted January 18, 2010

Thanks Neil, that seems like a great start! I'm going to see if I can get it working and will report back here. Thanks for the help so far!