Showing results for tags 'curl scraping'.

Found 1 result

  1. Hey guys, I'm new here. English is not my native language, but I'll try my best.

     First, what I would like to do. The purpose of the code is to get data (an HTML file) from a website, but you need an account to access the page. After many failed attempts and several hours spent discovering the cURL library and testing, everything I tried has failed.

     I tried to solve this step by step, and I'm afraid something goes wrong at step 1, but I can't tell what it is or how to fix it.

     This is my code:

     <?php
     /*
     This script is meant to:
     - log in through a POST form,
     - store a session cookie,
     - download a file once logged in.
     */

     // INIT CURL
     $ch = curl_init();

     // SET URL FOR THE POST FORM LOGIN
     curl_setopt($ch, CURLOPT_URL, 'https://mywebsite.com/user/login');

     // ENABLE HTTP POST
     curl_setopt($ch, CURLOPT_POST, 1);

     // SET POST PARAMETERS: FORM VALUES FOR EACH FIELD
     curl_setopt($ch, CURLOPT_POSTFIELDS, 'name=myname&pass=mypass&form_id=user_login');

     // IMITATE CLASSIC BROWSER'S BEHAVIOUR: HANDLE COOKIES
     curl_setopt($ch, CURLOPT_COOKIEJAR, "/tmp/cookieFileName.txt");

     //curl_setopt($ch, CURLOPT_REFERER, 'http://mywebsite.com');
     //curl_setopt($ch, CURLOPT_HEADER, TRUE);
     //curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);

     # Setting CURLOPT_RETURNTRANSFER to 1 will force cURL
     # not to print out the results of its query.
     # Instead, it will return the results as a string return value
     # from curl_exec() instead of the usual true/false.
     curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

     // EXECUTE 1st REQUEST (FORM LOGIN)
     $store = curl_exec($ch);
     $info = curl_getinfo($ch);

     /*
     I might already have a problem here, since $info contains:

     Array
     (
         [url] => https://mywebsite.com/user/login
         [content_type] =>
         [http_code] => 0
         [header_size] => 0
         [request_size] => 0
         [filetime] => -1
         [ssl_verify_result] => 0
         [redirect_count] => 0
         [total_time] => 0
         [namelookup_time] => 0
         [connect_time] => 0.171
         [pretransfer_time] => 0
         [size_upload] => 0
         [size_download] => 0
         [speed_download] => 0
         [speed_upload] => 0
         [download_content_length] => -1
         [upload_content_length] => -1
         [starttransfer_time] => 0
         [redirect_time] => 0
     )
     */

     // SET FILE TO DOWNLOAD
     curl_setopt($ch, CURLOPT_URL, 'http://mywebsite.com/users/en/myfile/1/');
     curl_setopt($ch, CURLOPT_COOKIEFILE, "/tmp/cookieFileName.txt");

     // EXECUTE 2nd REQUEST (FILE DOWNLOAD)
     $content = curl_exec($ch);

     // CLOSE CURL
     curl_close($ch);
     ?>

     $content contains a "you must be logged" page instead of the "this is your data" page.

     Second possible problem: the cookie file contains:

     # Netscape HTTP Cookie File
     # http://curl.haxx.se/rfc/cookie_spec.html
     # This file was generated by libcurl! Edit at your own risk.

     mywebsite.com FALSE / FALSE 0 LOL_TRIB p4epeqgp9tfijl0evi91rsl225

     and not all the cookies that are stored in my browser when I log in manually.

     Could someone explain where my errors are, or give me a hint, please? Thanks.
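     One way to narrow down step 1: an http_code of 0 in the curl_getinfo() output means cURL never received an HTTP response, so the login request probably failed before the server answered. The script never checks for that. Below is a minimal sketch of a diagnostic helper, not a fix; the function name describe_transfer is hypothetical, and it relies only on curl_getinfo(), curl_errno() and curl_error().

```php
<?php
// Hypothetical helper (not part of the original script): summarises the
// outcome of a transfer from values cURL already exposes.
//   $info  : array returned by curl_getinfo($ch)
//   $errno : value returned by curl_errno($ch)
//   $error : value returned by curl_error($ch)
function describe_transfer(array $info, int $errno, string $error): string
{
    // A non-zero errno means cURL itself failed (DNS, TLS, timeout, ...).
    if ($errno !== 0) {
        return "cURL error $errno: $error";
    }

    // http_code stays at 0 when no HTTP response was received.
    $code = (int) ($info['http_code'] ?? 0);
    if ($code === 0) {
        return 'no HTTP response received';
    }

    return "HTTP $code";
}
```

     Calling it right after each curl_exec($ch), for example echo describe_transfer(curl_getinfo($ch), curl_errno($ch), curl_error($ch));, should show whether the login request reached the server at all. It may also be worth noting that the second request reuses the same handle, so CURLOPT_POST is still in effect for it; curl_setopt($ch, CURLOPT_HTTPGET, true) switches the handle back to GET.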