shoiab Posted December 17, 2009

Hello to all the members of this forum,

I'm Shoiab, a novice PHP programmer. For my first job I have been assigned a project in which I have to extract/download the contents of a web page (on my client's website) from an HTTPS page using cURL. In other words, I want to copy the exact same web page to my localhost.

Here is what I have done so far. I am able to download web content from "www.virginholidays.co.uk". Here is the link to book a resort: "http://www.virginholidays.co.uk/brochures/florida/holidays/orlando/kissimmee/champions_world_resort". When I click the BOOK THE HOLIDAY button, it takes me to an HTTPS page that I am not able to download (https://www.virginholidays.co.uk/book/start).

I'm using Windows XP, IE 5, PHP 5.2 and Fiddler.

Here is my code:

$req1="GET /book/start HTTP/1.0\r\n";
$req1.='Accept: */*';
$req1.="\r\nAccept-Encoding: gzip, deflate
Cookie: _#lc=#; 90225614_clogin=l=1259059733&v=1&e=1259062485781; __utmc=262657675; CoreID6=60127103647212586967853; __utma=262657675.233062282.1258696796.1259047752.1259059734.14; __utmz=262657675.1258696796.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none); _#uid=1258696798931.315033071.3223127.1883.436744734.051; _#srchist=11611%3A1%3A20091221055958; _#sess=1%7C20091120062958%7C1; _#vdf=11611%7C1%7C20091221055958; __utmb=262657675; ASP.NET_SessionId=zpn5ftje1xxodv55f1h3yg45; cmTPSet=Y; cookie_complete=Region%3DFlorida%26Resort%3D2018.OR; _csoot=1259036845125; RememberedSearch=GeographyArea=Florida&GeographyResort=329.OR&DepartureAirport=MAN&DepartureDate=Fri 11 Dec 2009&Duration=7&AdultPax=2&ChildPax=0&InfantPax=0&ChildAge1=&ChildAge2=&ChildAge3=&ChildAge4=&ChildAge5=&ChildAge6=&ChildAge7=&ChildAge8=&SearchType=complete; _csuid=X47174a9c82f607; cmRS=t3=1259060790328&pi=Hotel%20Options%20-%20Atop
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.2; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)
Host: www.virginholidays.co.uk
Connection: Keep-Alive
Accept-Language: en-us";

$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,application/json,";
$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: public";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
$header[] = "Pragma: "; // browsers keep this blank.

$cookie="#lc=#; 90225614_clogin=l=1259059733&v=1&e=1259062485781; __utmc=262657675; CoreID6=60127103647212586967853; __utma=262657675.233062282.1258696796.1259047752.1259059734.14; __utmz=262657675.1258696796.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none); _#uid=1258696798931.315033071.3223127.1883.436744734.051; _#srchist=11611%3A1%3A20091221055958; _#sess=1%7C20091120062958%7C1; _#vdf=11611%7C1%7C20091221055958; __utmb=262657675; ASP.NET_SessionId=zpn5ftje1xxodv55f1h3yg45; cmTPSet=Y; cookie_complete=Region%3DFlorida%26Resort%3D2018.OR; _csoot=1259036845125; RememberedSearch=GeographyArea=Florida&GeographyResort=329.OR&DepartureAirport=MAN&DepartureDate=Fri 11 Dec 2009&Duration=7&AdultPax=2&ChildPax=0&InfantPax=0&ChildAge1=&ChildAge2=&ChildAge3=&ChildAge4=&ChildAge5=&ChildAge6=&ChildAge7=&ChildAge8=&SearchType=complete; _csuid=X47174a9c82f607; cmRS=t3=1259060790328&pi=Hotel%20Options%20-%20Atop";

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://www.virginholidays.co.uk/book/start");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_COOKIESESSION, TRUE);
curl_setopt($ch, CURLOPT_POST, 0);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_ENCODING, 'gzip,deflate');
curl_setopt($ch, CURLOPT_COOKIE, $cookie);
$response1 = curl_exec($ch);
curl_close($ch);
echo $response1;

$response = str_replace("/_assets/", "http://www.virginholidays.co.uk/_assets/", $response1);
$response = str_replace("/brochures/", "http://www.virginholidays.co.uk/brochures/", $response);
$response = str_replace("/dynamichtag.aspx", "http://www.virginholidays.co.uk/dynamichtag.aspx", $response);
echo $response;

Could you please help me download the content of the HTTPS page? I'm not sure what the issue is. Has the cookie or session expired, or do I need to write different code?

Please help. Thanks in advance.
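For readers following along: one possible cause, not confirmed anywhere in this thread, is that the hard-coded cookies (including ASP.NET_SessionId) were captured from an earlier browser session and have since expired. Below is a minimal sketch of an alternative, assuming a cookie jar file can replace the pasted Cookie header. The function names build_curl_options(), fetch_page() and absolutize_paths() are illustrative and not part of the original code; only the URLs come from the post.

```php
<?php
// Sketch only: all function names here are hypothetical helpers.

// Build cURL options for an HTTPS GET that stores and resends cookies
// through a jar file rather than a hard-coded Cookie header.
function build_curl_options($url, $cookieJar)
{
    return array(
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,       // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,       // follow redirects
        CURLOPT_SSL_VERIFYPEER => true,       // keep certificate verification on
        CURLOPT_SSL_VERIFYHOST => 2,          // check the certificate matches the host
        CURLOPT_COOKIEJAR      => $cookieJar, // cookies received are written here
        CURLOPT_COOKIEFILE     => $cookieJar, // cookies sent are read from here
        CURLOPT_ENCODING       => '',         // accept every encoding cURL supports
    );
}

// Fetch one URL; returns the body as a string, or false on failure.
function fetch_page($url, $cookieJar)
{
    $ch = curl_init();
    curl_setopt_array($ch, build_curl_options($url, $cookieJar));
    $body = curl_exec($ch);
    if ($body === false) {
        error_log('cURL error: ' . curl_error($ch));
    }
    return $body;
}

// Turn root-relative paths into absolute URLs. Only paths that directly
// follow a double quote are rewritten, so URLs that are already absolute
// are left untouched.
function absolutize_paths($html, $base, array $prefixes)
{
    foreach ($prefixes as $prefix) {
        $html = str_replace('"' . $prefix, '"' . $base . $prefix, $html);
    }
    return $html;
}

// Usage (makes real network requests, so it is left commented out):
// $jar  = tempnam(sys_get_temp_dir(), 'jar');
// fetch_page('http://www.virginholidays.co.uk/brochures/florida/holidays/orlando/kissimmee/champions_world_resort', $jar);
// $html = fetch_page('https://www.virginholidays.co.uk/book/start', $jar);
// echo absolutize_paths($html, 'http://www.virginholidays.co.uk',
//     array('/_assets/', '/brochures/', '/dynamichtag.aspx'));
```

Requesting the brochure page first lets the server issue a fresh session cookie, which the same jar then sends with the /book/start request. Whether that is enough for this particular site is an assumption; the booking page may also expect form data from the previous step.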
Deoctor Posted December 17, 2009

Hi shoiab,

I don't think you can pass these values over HTTPS, because they will be encrypted while being forwarded.

[pre]<form name="aspnetForm" method="post" action="default.aspx" onsubmit="javascript:return WebForm_OnSubmit();" id="aspnetForm">
<div>
<input type="hidden" name="__EVENTTARGET" id="__EVENTTARGET" value="" />
<input type="hidden" name="__EVENTARGUMENT" id="__EVENTARGUMENT" value="" />
<input type="hidden" name="__LASTFOCUS" id="__LASTFOCUS" value="" />
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEPDwUKMTEyMjI5MDMwOQ9kFgJ....................[/pre]

If you check the page's source code, you will see this.
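The hidden inputs quoted above (__EVENTTARGET, __VIEWSTATE and so on) are ordinary HTML once a page has been fetched, so they can be read out of the response body. A minimal sketch, assuming PHP's DOM extension is available; extract_hidden_fields() is a hypothetical helper name, and the thread does not confirm that posting these values back is what the site requires.

```php
<?php
// Sketch only: extract_hidden_fields() is an illustrative helper.

// Collect name => value pairs for every <input type="hidden"> in the HTML.
function extract_hidden_fields($html)
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely well formed; collect parser warnings
    // internally instead of printing them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $fields = array();
    foreach ($doc->getElementsByTagName('input') as $input) {
        if (strtolower($input->getAttribute('type')) === 'hidden') {
            $fields[$input->getAttribute('name')] = $input->getAttribute('value');
        }
    }
    return $fields;
}
```

If a later request has to submit the form, the resulting array could be URL-encoded with http_build_query() and passed to cURL through CURLOPT_POSTFIELDS.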
Archived
This topic is now archived and is closed to further replies.