Can a PHP script log into a public site with username/pass?


doncoglioni

Hi all! :)

I have a script which analyses HTML documents and summarises some information from them, related to my work. The script itself works fine if I have access to the HTML pages concerned. However, most of the ones I need are only accessible when I log in to a (public) website through a username and password form.

So my question is this: can a PHP script automatically "enter" the username and password, log into the website, and then (remaining authenticated for that session) fetch the protected pages that I need, just as if a human were sitting there doing the same thing? No captchas are involved. Obviously I would need some information about the target page's username/password form and its method of submission... but how would I go about doing this, generally?

My sincere thanks for any help.


I wrote a script that logged into my Chase bank website and grabbed information on my IRA, then stored the value in a database so that I could track it more easily. Here is all the code, with the usernames and such blocked out...

<?php

    ######### Set up field values #########
$fields = "authmethod=userpassword&";
$fields .= "locale=en_us&";
$fields .= "usr_name=bobby&";
$fields .= "usr_password=3456&";
$fields .= "hiddenuri=/online/logon/on_successful_logon.jsp?LOB=COLLogon&";
$fields .= "LOB=COLLogon";

$agent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)";
$ref = "https://chaseonline.chase.com/colappmgr/colportal/prospect?_nfpb=true&_pageLabel=page_logonform";


    ######### Prepare curl settings and variables #########
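$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://chaseonline.chase.com/siteminderagent/forms/formpost.fcc");
// NOTE: 0 turns off SSL certificate checking; set it to 1 for anything sensitive
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
// cookies are saved to, and sent back from, the same file so the session persists
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie.txt');
curl_setopt($ch, CURLOPT_COOKIEFILE, 'cookie.txt');
curl_setopt($ch, CURLOPT_MAXREDIRS, 4);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $fields);
curl_setopt($ch, CURLOPT_TIMEOUT, 120);
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_REFERER, $ref);
// must be 1 so that curl_exec() returns the page as a string instead of printing it
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$buffer = curl_exec($ch);
if ($buffer === false) {
	die("Request failed: " . curl_error($ch));
}
curl_close($ch);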

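    ######### Search html code for the needed string #########
//find the location of the market value
$temp = "Market Value</td><td class=";
$response_start = strpos($buffer, $temp);
if ($response_start === false) {
	die("Market value not found in the response");
}

//the amount sits between the next "$" sign and the end of the table cell
$response_mid = strpos($buffer, '$', $response_start);
$response_end = strpos($buffer, "  </td><td", $response_mid);

$temp_code = substr($buffer, $response_mid + 1, ($response_end - $response_mid - 1));
//strip the thousands separators (ereg_replace() was removed in PHP 7)
$temp_code = str_replace(",", "", $temp_code);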


    ######### minor error checking and database insert #########
//check to make sure that some long error or other bad data was not returned
if (strlen($temp_code) < 15)
{

	########  Insert Values into database ########

	//the mysql_* functions were removed in PHP 7, so this uses mysqli;
	//this connects to the server and selects the database in one step
	$db = mysqli_connect("localhost", "user", "password", "ira_daderbase")
	or die("Could not connect!");

	$now = time();
	$today = mktime(0,0,0);

	$sql = "SELECT * FROM `retirement` WHERE `time` > '".$today."';";
	$result = mysqli_query($db, $sql);

	//this is done to make sure that there is only 1 insert per day
	if ($result && mysqli_num_rows($result) == 0)
	{
		//escape the scraped value before it goes into the query
		$amount = mysqli_real_escape_string($db, $temp_code);
		//NULL lets the database assign the id (assumes an auto-increment column)
		$sql = "INSERT INTO `retirement` ( `id` , `time` , `amount` ) ";
		$sql .= "VALUES (NULL, '".$now."', '".$amount."');";
		mysqli_query($db, $sql);
	}
}
?>

 

Hope that example helps you out
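
A side note on the $fields string in the script above: it is joined by hand, which works only while no value contains characters such as & or =. A minimal sketch of an alternative, assuming the same field names as in the post, is to let http_build_query() do the encoding. The helper name build_login_fields and the sample password are made up for illustration.

```php
<?php
// Hypothetical helper: builds a URL-encoded POST body from the login fields.
// Field names are copied from the post above; the values are placeholders.
function build_login_fields(string $user, string $password): string
{
    $fields = [
        'authmethod'   => 'userpassword',
        'locale'       => 'en_us',
        'usr_name'     => $user,
        'usr_password' => $password,
        'hiddenuri'    => '/online/logon/on_successful_logon.jsp?LOB=COLLogon',
        'LOB'          => 'COLLogon',
    ];
    // http_build_query() percent-encodes each value, so "&", "=" or "?"
    // inside a value cannot be mistaken for a field separator.
    return http_build_query($fields);
}

// The result can be passed to CURLOPT_POSTFIELDS in place of $fields.
echo build_login_fields('bobby', 'p&ss=word'), "\n";
```

Whether a given site accepts the encoded form of every value is something to check against what the browser actually sends.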


cunoodle2, you're an absolute genius. I have no idea what I was doing wrong, but now my script is logging me in to the website. HOWEVER, all that happens is that the "successful login" page is displayed, and if I try to access any further pages within the members' area, it acts as if I'm not logged in at all. It's as if it's only logging me in for a second, then immediately logging me back out again. Can you think of a reason for this?

Thank you, either way! Thank you very much.


I thought I'd also post my code so you can see what's going on, and what's going wrong (in comments!)

 

	
<?php
#### //Logging into the secure site, just passing username/password, referer and useragent. ####
$ch=curl_init(); 
curl_setopt($ch, CURLOPT_URL, "https://secure.site.com/page1_userlogin.php");
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0); 
curl_setopt($ch, CURLOPT_COOKIEJAR, 'C:\cookies.txt');
curl_setopt($ch, CURLOPT_COOKIEFILE, 'C:\cookies.txt');
curl_setopt($ch, CURLOPT_MAXREDIRS, 4);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION,1); 
curl_setopt($ch, CURLOPT_POST, 1); 
curl_setopt($ch, CURLOPT_POSTFIELDS, 'username=abcdefg&password=1234567'); 
curl_setopt($ch, CURLOPT_TIMEOUT, 120); 
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.9) Gecko/20071025 Firefox/2.0.0.9'); 
curl_setopt($ch, CURLOPT_REFERER, 'http://secure.site.com/welcome_page.php');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 0); 
$buffer = curl_exec($ch);
curl_close($ch);

####  // Now I'm logged in, I should be able to access other secure pages (to parse them with my parser!) #### 
####  // BUT I CAN'T - I parse the standard "not logged in" page instead :'( #### 
include("phpHTMLParser.php");
$parser = new phpHTMLParser(file_get_contents("https://secure.site.com/page2_this_is_the_target.php"));
$HTMLObject = $parser->parse();
$HTMLObject = $parser->parse_tags(array("div"));
$HTMLObject->output();

?>
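
One possible reason for the "not logged in" page, offered as a guess from the code above: file_get_contents() makes a separate request of its own and knows nothing about the cURL session, so whatever the login set up is never sent along. A minimal sketch, assuming the site keeps the login in cookies (which may not be true of every site), is to fetch the protected page with cURL as well, using the same cookie file. The function name fetch_with_session and its parameters are hypothetical.

```php
<?php
// Hypothetical helper: performs one request that reads and writes a shared
// cookie file, so a later call sends back whatever an earlier response set.
function fetch_with_session(string $url, string $cookieFile, ?string $postFields = null, ?string $referer = null)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);   // cookies are written here
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile);  // cookies are read from here
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 4);
    curl_setopt($ch, CURLOPT_TIMEOUT, 120);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);     // return the body, do not print it
    if ($referer !== null) {
        curl_setopt($ch, CURLOPT_REFERER, $referer);
    }
    if ($postFields !== null) {
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, $postFields);
    }
    $body = curl_exec($ch);   // string on success, false on failure
    unset($ch);               // releasing the handle lets cURL write the cookie jar
    return $body;
}

// Usage, with the placeholder URLs from the post above:
// $cookies = tempnam(sys_get_temp_dir(), 'cookies');
// fetch_with_session('https://secure.site.com/page1_userlogin.php', $cookies, 'username=abcdefg&password=1234567');
// $html = fetch_with_session('https://secure.site.com/page2_this_is_the_target.php', $cookies);
// $parser = new phpHTMLParser($html);
```

The second call returns the page as a string, which can be handed to the parser in place of the file_get_contents() result.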


Or you might pass the cookie with your login details as a parameter. You need to find out sample cookie content before doing so, though, for example in Opera by setting 'Ask before accepting cookies'. It will show you the content of the cookie when you try to log in.
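
With cURL, this suggestion could be sketched using the CURLOPT_COOKIE option, which takes a string of the form "name=value; name2=value2". The helper name build_cookie_header and the cookie names and values below are invented; the real ones would have to be read from the browser as described.

```php
<?php
// Hypothetical helper: turns name => value pairs into the "a=1; b=2" string
// that CURLOPT_COOKIE expects. Values are used exactly as given.
function build_cookie_header(array $cookies): string
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . $value;
    }
    return implode('; ', $pairs);
}

// Invented example values; copy the real names and values from the browser.
$cookie = build_cookie_header(['PHPSESSID' => 'abc123', 'remember' => '1']);
echo $cookie, "\n";

// The string would then be attached to a request like this:
// curl_setopt($ch, CURLOPT_COOKIE, $cookie);
```

A cookie copied this way stops working once the site expires the session, so it suits one-off runs better than a scheduled job.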


Thanks, guys.

 

I found out the problem was that I wasn't following the "correct" login procedure for my website.  On further examination, I noticed it doesn't use cookies to keep track of its visitors, and that if you go back to the home page, you're logged out automatically!  So I just got the HTTPHeaders extension for Firefox, and copied all the headers as I moved through every page (from home page -> login -> welcome page -> account page -> my target page) and now I have 4 sets of cURL commands with the various headers going between those pages.
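
Replaying each page in order, as described above, might look roughly like the following. This is only a sketch: the URLs are placeholders rather than the poster's, the helper name build_request_chain is made up, and the only header modeled is Referer, on the assumption that each request names the page visited just before it.

```php
<?php
// Hypothetical helper: pairs every page with the page visited just before it,
// which becomes the Referer header for that request.
function build_request_chain(array $urls): array
{
    $steps = [];
    $previous = null;
    foreach ($urls as $url) {
        $steps[] = ['url' => $url, 'referer' => $previous];
        $previous = $url;
    }
    return $steps;
}

// Placeholder URLs standing in for home -> login -> welcome -> account -> target.
$steps = build_request_chain([
    'https://secure.site.com/',
    'https://secure.site.com/page1_userlogin.php',
    'https://secure.site.com/welcome_page.php',
    'https://secure.site.com/account.php',
    'https://secure.site.com/page2_this_is_the_target.php',
]);

foreach ($steps as $step) {
    echo $step['url'], ' <- ', $step['referer'] ?? '(none)', "\n";
}

// Each step would then be sent in order on one cURL handle, for example:
// curl_setopt($ch, CURLOPT_URL, $step['url']);
// if ($step['referer'] !== null) { curl_setopt($ch, CURLOPT_REFERER, $step['referer']); }
// $html = curl_exec($ch);
```

Any other headers captured from the browser could be added per step with CURLOPT_HTTPHEADER.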


If you don't mind, please post your code in the hope that someone in the future will be able to learn from it. Make sure you block out all usernames/passwords and other important information. It took me forever to figure out that curl stuff to log into my bank, as I wrote it all "from scratch." Hopefully my code helped you and your code will help the next person. =)


I'd be absolutely delighted to, as soon as I get one tiny thing fixed with it - please do give my other (short) post a read; I'd love your help.

It's right here ---> http://www.phpfreaks.com/forums/index.php/topic,188902.0.html

 

Ooh, and for the benefit of us all:  how do you get the fancy colored code?  Mine's all greyscale :(

 

Thanks again :)


Make sure you add the PHP tags, <?php and ?>.

With <?php

<?php
#### //Logging into the secure site, just passing username/password, referer and useragent. #### 
$ch=curl_init(); 
curl_setopt($ch, CURLOPT_URL, "https://secure.site.com/page1_userlogin.php");
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0); 
curl_setopt($ch, CURLOPT_COOKIEJAR, 'C:\cookies.txt');
curl_setopt($ch, CURLOPT_COOKIEFILE, 'C:\cookies.txt');
curl_setopt($ch, CURLOPT_MAXREDIRS, 4);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION,1); 
curl_setopt($ch, CURLOPT_POST, 1); 
curl_setopt($ch, CURLOPT_POSTFIELDS, 'username=abcdefg&password=1234567'); 
curl_setopt($ch, CURLOPT_TIMEOUT, 120); 
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.9) Gecko/20071025 Firefox/2.0.0.9'); 
curl_setopt($ch, CURLOPT_REFERER, 'http://secure.site.com/welcome_page.php');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 0); 
$buffer = curl_exec($ch);
curl_close($ch);

####  // Now I'm logged in, I should be able to access other secure pages (to parse them with my parser!) #### 
####  // BUT I CAN'T - I parse the standard "not logged in" page instead :'( #### 
include("phpHTMLParser.php");
$parser = new phpHTMLParser(file_get_contents("https://secure.site.com/page2_this_is_the_target.php"));
$HTMLObject = $parser->parse();
$HTMLObject = $parser->parse_tags(array("div"));
$HTMLObject->output();

?>

 

Without <?php

#### //Logging into the secure site, just passing username/password, referer and useragent. ####
$ch=curl_init();
curl_setopt($ch, CURLOPT_URL, "https://secure.site.com/page1_userlogin.php");
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_COOKIEJAR, 'C:\cookies.txt');
curl_setopt($ch, CURLOPT_COOKIEFILE, 'C:\cookies.txt');
curl_setopt($ch, CURLOPT_MAXREDIRS, 4);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION,1);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, 'username=abcdefg&password=1234567');
curl_setopt($ch, CURLOPT_TIMEOUT, 120);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.9) Gecko/20071025 Firefox/2.0.0.9');
curl_setopt($ch, CURLOPT_REFERER, 'http://secure.site.com/welcome_page.php');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 0);
$buffer = curl_exec($ch);
curl_close($ch);

####  // Now I'm logged in, I should be able to access other secure pages (to parse them with my parser!) ####
####  // BUT I CAN'T - I parse the standard "not logged in" page instead :'( ####
include("phpHTMLParser.php");
$parser = new phpHTMLParser(file_get_contents("https://secure.site.com/page2_this_is_the_target.php"));
$HTMLObject = $parser->parse();
$HTMLObject = $parser->parse_tags(array("div"));
$HTMLObject->output();

?>
