
Getting Data From Protected Website (Curl Issues)


Raypulsif


Hey guys,

I'm new here. English is not my native language, but I'll try my best.

First, what I would like to do: the purpose of the code is to get data (an HTML file) from a website, but you need an account to access the page.

After many failed attempts and several hours spent discovering the cURL library and testing, nothing I tried has worked.

I tried to solve this step by step, and I'm afraid something goes wrong at step 1, but I can't tell what it is or how to fix it.

This is my code:

 

<?php 
/* 
 Here is a script that is useful to: 
 - login to a POST form, 
 - store a session cookie, 
 - download a file once logged in. 
*/ 


// INIT CURL 
$ch = curl_init(); 


// SET URL FOR THE POST FORM LOGIN 
curl_setopt($ch, CURLOPT_URL, 'https://mywebsite.com/user/login'); 


// ENABLE HTTP POST 
curl_setopt ($ch, CURLOPT_POST, 1); 


// SET POST PARAMETERS : FORM VALUES FOR EACH FIELD 
curl_setopt ($ch, CURLOPT_POSTFIELDS, 'name=myname&pass=mypass&form_id=user_login'); 


// IMITATE CLASSIC BROWSER'S BEHAVIOUR : HANDLE COOKIES 
curl_setopt ($ch, CURLOPT_COOKIEJAR, "/tmp/cookieFileName.txt"); 
//curl_setopt($ch, CURLOPT_REFERER, 'http://mywebsite.com');


//curl_setopt($ch, CURLOPT_HEADER, TRUE);
//curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);


# Setting CURLOPT_RETURNTRANSFER variable to 1 will force cURL 
# not to print out the results of its query. 
# Instead, it will return the results as a string return value 
# from curl_exec() instead of the usual true/false. 
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1); 


// EXECUTE 1st REQUEST (FORM LOGIN) 
$store = curl_exec ($ch);


$info = curl_getinfo($ch);
/* I might already have a problem here, since $info contains:

Array
(
   [url] => https://mywebsite.com/user/login
   [content_type] => 
   [http_code] => 0
   [header_size] => 0
   [request_size] => 0
   [filetime] => -1
   [ssl_verify_result] => 0
   [redirect_count] => 0
   [total_time] => 0
   [namelookup_time] => 0
   [connect_time] => 0.171
   [pretransfer_time] => 0
   [size_upload] => 0
   [size_download] => 0
   [speed_download] => 0
   [speed_upload] => 0
   [download_content_length] => -1
   [upload_content_length] => -1
   [starttransfer_time] => 0
   [redirect_time] => 0
)
*/
// SET FILE TO DOWNLOAD 
curl_setopt($ch, CURLOPT_URL, 'http://mywebsite.com/users/en/myfile/1/'); 
curl_setopt($ch, CURLOPT_COOKIEFILE, "/tmp/cookieFileName.txt");

// EXECUTE 2nd REQUEST (FILE DOWNLOAD) 
$content = curl_exec ($ch); 

// CLOSE CURL 
curl_close ($ch); 


?>

 

$content contains a "you must be logged in" page instead of the "this is your data" page.
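One way to check step 1, as a minimal sketch: an http_code of 0 together with a header_size of 0 usually means that no HTTP response was received at all. In that case curl_exec() returns false and curl_error() holds the reason (a TLS certificate problem on the https:// login URL would be one candidate, but that is only a guess here). The helper name curl_exec_or_throw is hypothetical and not part of the cURL extension.

```php
<?php
/**
 * Hypothetical helper: execute a cURL handle and throw instead of
 * silently returning false, so that transport-level failures
 * (DNS, TLS, refused connection, bad scheme) become visible.
 */
function curl_exec_or_throw($ch): string
{
    // Return the body as a string rather than printing it.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    $body = curl_exec($ch);
    if ($body === false) {
        // curl_errno()/curl_error() describe why the transfer failed.
        throw new RuntimeException(
            sprintf('cURL error %d: %s', curl_errno($ch), curl_error($ch))
        );
    }

    return $body;
}
```

Replacing the first `curl_exec ($ch)` call with `curl_exec_or_throw($ch)` should make the script stop with a readable message if the login request never reaches the server, rather than carrying on to the second request without a session.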

 

Second possible problem: the cookie file (/tmp/cookieFileName.txt) contains:

 

 

# Netscape HTTP Cookie File
# http://curl.haxx.se/rfc/cookie_spec.html
# This file was generated by libcurl! Edit at your own risk.

mywebsite.com FALSE / FALSE 0 LOL_TRIB p4epeqgp9tfijl0evi91rsl225

 

and not all the cookies that are stored in my browser when I log in manually.
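To compare the jar with the browser's cookies more easily, here is a small debugging sketch that parses the Netscape cookie file format libcurl writes (seven tab-separated fields per cookie; lines starting with `#HttpOnly_` are HttpOnly cookies, other `#` lines are comments). The function name parse_cookie_jar and the array keys are made up for illustration.

```php
<?php
/**
 * Hypothetical debugging helper: parse the contents of a Netscape-format
 * cookie file into a list of associative arrays, one per cookie.
 */
function parse_cookie_jar(string $contents): array
{
    $cookies = [];

    foreach (preg_split('/\R/', $contents) as $line) {
        $httpOnly = false;

        // libcurl marks HttpOnly cookies with this prefix.
        if (strpos($line, '#HttpOnly_') === 0) {
            $line = substr($line, strlen('#HttpOnly_'));
            $httpOnly = true;
        }

        // Skip blank lines and ordinary comment lines.
        if (trim($line) === '' || $line[0] === '#') {
            continue;
        }

        $fields = explode("\t", $line);
        if (count($fields) !== 7) {
            continue; // not a well-formed cookie line
        }

        [$domain, $subdomains, $path, $secure, $expires, $name, $value] = $fields;

        $cookies[] = [
            'domain'     => $domain,
            'subdomains' => $subdomains === 'TRUE',
            'path'       => $path,
            'secure'     => $secure === 'TRUE',
            'expires'    => (int) $expires, // 0 means a session cookie
            'name'       => $name,
            'value'      => $value,
            'httponly'   => $httpOnly,
        ];
    }

    return $cookies;
}
```

Something like `print_r(parse_cookie_jar(file_get_contents('/tmp/cookieFileName.txt')));` would list what cURL actually stored. Keep in mind that libcurl writes the jar only when the handle is closed, so the file can look empty or stale if it is read before that.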

 

Could someone explain where my errors are, or give me a hint, please?

 

Thanks.
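For reference, a hedged sketch of the two-step flow described above (log in with POST, keep the session cookie, then download the page). One thing it handles that may matter here: CURLOPT_POST stays set on a reused handle, so the second request would be sent as a POST too unless CURLOPT_HTTPGET switches it back to GET. The function names login_options, download_options and fetch_protected_page are hypothetical, and whether the site also expects extra hidden form fields is an open assumption.

```php
<?php
/** Options for the login POST (illustrative helper, not a cURL API). */
function login_options(string $loginUrl, array $fields, string $cookieJar): array
{
    return [
        CURLOPT_URL            => $loginUrl,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query($fields),
        CURLOPT_COOKIEJAR      => $cookieJar, // written when the handle is closed
        CURLOPT_COOKIEFILE     => $cookieJar, // enables the in-memory cookie engine
        CURLOPT_FOLLOWLOCATION => true,       // many login forms redirect on success
        CURLOPT_RETURNTRANSFER => true,
    ];
}

/** Options for the second request: same handle, but a plain GET. */
function download_options(string $fileUrl): array
{
    return [
        CURLOPT_URL     => $fileUrl,
        CURLOPT_HTTPGET => true, // undo CURLOPT_POST from the login step
    ];
}

/** Log in, then fetch the protected page with the same cURL handle. */
function fetch_protected_page(
    string $loginUrl,
    array $fields,
    string $fileUrl,
    string $cookieJar
): string {
    $ch = curl_init();

    curl_setopt_array($ch, login_options($loginUrl, $fields, $cookieJar));
    if (curl_exec($ch) === false) {
        throw new RuntimeException('Login request failed: ' . curl_error($ch));
    }

    curl_setopt_array($ch, download_options($fileUrl));
    $content = curl_exec($ch);
    if ($content === false) {
        throw new RuntimeException('Download request failed: ' . curl_error($ch));
    }

    // The handle is released when $ch goes out of scope on return;
    // that is when libcurl writes the cookie jar to disk.
    return $content;
}
```

With the values from the script, the call would look like `fetch_protected_page('https://mywebsite.com/user/login', ['name' => 'myname', 'pass' => 'mypass', 'form_id' => 'user_login'], 'http://mywebsite.com/users/en/myfile/1/', '/tmp/cookieFileName.txt')`. If the result is still the "you must be logged in" page, comparing the POST fields with what the browser actually submits would be the next thing to check.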
