How to extract/download content from HTTPS page?


shoiab

Hello to all the members of this forum. I'm Shoiab, a novice PHP programmer. For my first job I have been assigned a project in which I have to extract/download the contents of a web page (on my client's website) that is served over HTTPS, using cURL. In other words, I want to get an exact copy of that page on my localhost.

 

Let me tell you what I have done so far. I am able to download web content from "www.virginholidays.co.uk". Here is the link to book a resort:

"http://www.virginholidays.co.uk/brochures/florida/holidays/orlando/kissimmee/champions_world_resort". When I click the BOOK THE HOLIDAY button, it takes me to an HTTPS page (https://www.virginholidays.co.uk/book/start), and that is the page I am not able to download.

 

I'm using Windows XP, IE 5, PHP 5.2 and Fiddler.

 

Here is my code:

 

// Raw request as captured with Fiddler, kept for reference; it is not passed to cURL below.
$req1  = "GET /book/start HTTP/1.0\r\n";
$req1 .= "Accept: */*\r\n";
$req1 .= "Accept-Encoding: gzip, deflate\r\n";
$req1 .= "Cookie: _#lc=#; 90225614_clogin=l=1259059733&v=1&e=1259062485781; "
       . "__utmc=262657675; CoreID6=60127103647212586967853; "
       . "__utma=262657675.233062282.1258696796.1259047752.1259059734.14; "
       . "__utmz=262657675.1258696796.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none); "
       . "_#uid=1258696798931.315033071.3223127.1883.436744734.051; "
       . "_#srchist=11611%3A1%3A20091221055958; "
       . "_#sess=1%7C20091120062958%7C1; _#vdf=11611%7C1%7C20091221055958; "
       . "__utmb=262657675; "
       . "ASP.NET_SessionId=zpn5ftje1xxodv55f1h3yg45; cmTPSet=Y; "
       . "cookie_complete=Region%3DFlorida%26Resort%3D2018.OR; "
       . "_csoot=1259036845125; "
       . "RememberedSearch=GeographyArea=Florida&GeographyResort=329.OR&DepartureAirport=MAN&DepartureDate=Fri 11 Dec 2009&Duration=7&AdultPax=2&ChildPax=0&InfantPax=0&ChildAge1=&ChildAge2=&ChildAge3=&ChildAge4=&ChildAge5=&ChildAge6=&ChildAge7=&ChildAge8=&SearchType=complete; "
       . "_csuid=X47174a9c82f607; "
       . "cmRS=t3=1259060790328&pi=Hotel%20Options%20-%20Atop\r\n";
$req1 .= "User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.2; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)\r\n";
$req1 .= "Host: www.virginholidays.co.uk\r\n"; // host name only, without the scheme
$req1 .= "Connection: Keep-Alive\r\n";
$req1 .= "Accept-Language: en-us\r\n\r\n";

 

$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,application/json,";
$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: public";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
$header[] = "Pragma: "; // browsers keep this blank.

$cookie = "#lc=#; 90225614_clogin=l=1259059733&v=1&e=1259062485781; "
        . "__utmc=262657675; "
        . "CoreID6=60127103647212586967853; "
        . "__utma=262657675.233062282.1258696796.1259047752.1259059734.14; "
        . "__utmz=262657675.1258696796.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none); "
        . "_#uid=1258696798931.315033071.3223127.1883.436744734.051; "
        . "_#srchist=11611%3A1%3A20091221055958; "
        . "_#sess=1%7C20091120062958%7C1; _#vdf=11611%7C1%7C20091221055958; "
        . "__utmb=262657675; "
        . "ASP.NET_SessionId=zpn5ftje1xxodv55f1h3yg45; cmTPSet=Y; "
        . "cookie_complete=Region%3DFlorida%26Resort%3D2018.OR; "
        . "_csoot=1259036845125; "
        . "RememberedSearch=GeographyArea=Florida&GeographyResort=329.OR&DepartureAirport=MAN&DepartureDate=Fri 11 Dec 2009&Duration=7&AdultPax=2&ChildPax=0&InfantPax=0&ChildAge1=&ChildAge2=&ChildAge3=&ChildAge4=&ChildAge5=&ChildAge6=&ChildAge7=&ChildAge8=&SearchType=complete; "
        . "_csuid=X47174a9c82f607; "
        . "cmRS=t3=1259060790328&pi=Hotel%20Options%20-%20Atop";

 

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://www.virginholidays.co.uk/book/start");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0); // certificate verification is switched off
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_COOKIESESSION, TRUE);
curl_setopt($ch, CURLOPT_POST, 0);
curl_setopt($ch, CURLOPT_HEADER, 1); // response headers are included in the returned string
curl_setopt($ch, CURLOPT_ENCODING, 'gzip,deflate');
curl_setopt($ch, CURLOPT_COOKIE, $cookie);
$response1 = curl_exec($ch);
if ($response1 === false) {
    echo 'cURL error: ' . curl_error($ch);
}
curl_close($ch);

// Point site-relative paths at the original host so assets and links still resolve.
$response = str_replace("/_assets/", "http://www.virginholidays.co.uk/_assets/", $response1);
$response = str_replace("/brochures/", "http://www.virginholidays.co.uk/brochures/", $response);
$response = str_replace("/dynamichtag.aspx", "http://www.virginholidays.co.uk/dynamichtag.aspx", $response);
echo $response;
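
A side note on the str_replace calls above: a plain string replace also hits URLs that are already absolute, so "http://www.virginholidays.co.uk/brochures/" would get the host prepended a second time. Below is a minimal sketch of one alternative that only rewrites quoted href/src/action values starting with a single "/". The helper name absolutize_urls() is made up for illustration, the sample HTML is invented, and the closure needs a newer PHP than 5.2.

```php
<?php
// Sketch only: make site-relative URLs absolute without touching URLs that
// already carry a host. absolutize_urls() is a hypothetical helper name.
function absolutize_urls($html, $base)
{
    $base = rtrim($base, '/');

    // Match href="/...", src='/...' or action="/..." but not "//host/..."
    return preg_replace_callback(
        '#\b(href|src|action)=(["\'])/(?!/)#i',
        function ($m) use ($base) {
            // $m[1] is the attribute name, $m[2] the opening quote
            return $m[1] . '=' . $m[2] . $base . '/';
        },
        $html
    );
}

// Invented sample markup standing in for the downloaded page.
$sample = '<img src="/_assets/logo.png"><a href="http://www.virginholidays.co.uk/brochures/">x</a>';
echo absolutize_urls($sample, 'http://www.virginholidays.co.uk'), "\n";
```

This only covers quoted attribute values; URLs inside CSS or JavaScript would still need separate handling.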

 

Could you please help me download the content of the HTTPS page? I'm not sure what the issue is. Has the cookie or session expired, or do I need to write different code?

 

Please help,

Thanks in advance.
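
One way to narrow down the cookie/session question, as a rough sketch: because CURLOPT_HEADER is set to 1, the string returned by curl_exec() begins with the response headers. A 3xx status with a Location header, or a fresh Set-Cookie for the session cookie, would hint that the server did not accept the copied session. The function name summarize_response() and the sample response are assumptions made up for illustration, not output from the real site.

```php
<?php
// Sketch only: pull the status code, redirect target and cookie names out of
// a raw response that still contains its headers (CURLOPT_HEADER = 1).
// Assumes a single header block terminated by an empty line.
function summarize_response($raw)
{
    $parts = explode("\r\n\r\n", $raw, 2);
    $lines = explode("\r\n", $parts[0]);

    $summary = array('status' => null, 'location' => null, 'set_cookies' => array());

    // Status line, e.g. "HTTP/1.1 302 Found"
    if (preg_match('#^HTTP/\S+\s+(\d{3})#', $lines[0], $m)) {
        $summary['status'] = (int) $m[1];
    }

    foreach (array_slice($lines, 1) as $line) {
        if (strpos($line, ':') === false) {
            continue;
        }
        list($name, $value) = explode(':', $line, 2);
        $name  = strtolower(trim($name));
        $value = trim($value);

        if ($name === 'location') {
            $summary['location'] = $value;
        } elseif ($name === 'set-cookie') {
            // Keep only the cookie name, i.e. the text before the first "="
            $pair = explode(';', $value, 2);
            $nameValue = explode('=', $pair[0], 2);
            $summary['set_cookies'][] = trim($nameValue[0]);
        }
    }

    return $summary;
}

// Hypothetical response, only to show the shape of the result.
$sample = "HTTP/1.1 302 Found\r\n"
        . "Location: /brochures/\r\n"
        . "Set-Cookie: ASP.NET_SessionId=abc123; path=/; HttpOnly\r\n"
        . "\r\n"
        . "<html>moved</html>";
var_export(summarize_response($sample));
```

In the script above, this would be called as summarize_response($response1) right after curl_exec().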


Hi Shoiab,

I don't think you can pass those values over HTTPS, because they are encrypted while being forwarded.

[pre]<form name="aspnetForm" method="post" action="default.aspx" onsubmit="javascript:return WebForm_OnSubmit();" id="aspnetForm">
<div>
<input type="hidden" name="__EVENTTARGET" id="__EVENTTARGET" value="" />
<input type="hidden" name="__EVENTARGUMENT" id="__EVENTARGUMENT" value="" />
<input type="hidden" name="__LASTFOCUS" id="__LASTFOCUS" value="" />
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEPDwUKMTEyMjI5MDMwOQ9kFgJ....................[/pre]

 

If you check the page's source code, you will see this.
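
If the page expects those hidden fields to be posted back, a rough sketch of reading them out of downloaded HTML could look like the following. The function name extract_hidden_fields() is hypothetical, and the sample form uses a placeholder __VIEWSTATE value rather than the real one. Whether the site accepts a replayed form like this is not something this thread confirms.

```php
<?php
// Sketch only: collect name => value pairs of all hidden <input> fields.
// extract_hidden_fields() is a made-up helper name.
function extract_hidden_fields($html)
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; keep libxml warnings out of the output.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $fields = array();
    foreach ($doc->getElementsByTagName('input') as $input) {
        $isHidden = strtolower($input->getAttribute('type')) === 'hidden';
        $name = $input->getAttribute('name');
        if ($isHidden && $name !== '') {
            $fields[$name] = $input->getAttribute('value');
        }
    }

    return $fields;
}

// Placeholder form; the values are invented for the example.
$sample = '<html><body><form method="post" action="default.aspx">'
        . '<input type="hidden" name="__EVENTTARGET" value="" />'
        . '<input type="hidden" name="__VIEWSTATE" value="dummy-viewstate" />'
        . '<input type="text" name="visible" value="ignored" />'
        . '</form></body></html>';

$fields = extract_hidden_fields($sample);

// URL-encoded body that could be handed to CURLOPT_POSTFIELDS.
echo http_build_query($fields), "\n";
```

The resulting string is in the form cURL expects for CURLOPT_POSTFIELDS when sending a regular form post.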
