phoenixx Posted September 13, 2010

I need to extract data from one of our suppliers, but their system is password protected. If I could work the login information into the @file_get_contents call, I could get the data with no problem. Any help would be great. Here is the source of the login page that I need to pass the login information to.

===============================================

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
<HEAD>
<title>Welcome to xxxxxxxxxxxxxx.com. Please Login...</title>
<meta content="Microsoft Visual Studio .NET 7.1" name="GENERATOR">
<meta content="Visual Basic .NET 7.1" name="CODE_LANGUAGE">
<meta content="JavaScript" name="vs_defaultClientScript">
<meta content="http://schemas.microsoft.com/intellisense/ie5" name="vs_targetSchema">
<script language="JavaScript">
function setFocus() {
    if (Login.txtUserID.value == '') {
        Login.txtUserID.focus();
    } else {
        Login.txtPassword.focus();
    }
}
</script>
</HEAD>
<body link="#e00ee0" bgColor="#ffffff" onload="setFocus();" MS_POSITIONING="GridLayout">
<form name="Login" method="post" action="https://www.xxxxxxxxxx.com/Catalog/series_detail.asp?varGroup=20&varSeries=14953?hcu=1&hs=1&hm=1" id="Login">
<input type="hidden" name="__VIEWSTATE" value="FdJ0ceC8UWuw3Tw+c2uk+WrgGSSaMeEhJL36a7EtIQdRABdjOpSAApMnrAp6uKmaT9bE5Uq5FqUeMhKPC5N4SvCYRu6arNDFw9Iu3/8aUH/0OlLUkJj3Ub0bEmCoEJ5axcNr1gY/j6wcc+vYXNcCvaP4U4nIVwAEEn48Kwt1EGBLrv6iVPvkkKMt1ADAvtR3RGQmncS3tcmxXs9EiN1V+Niqq1lt4s2v" />
cags Posted September 14, 2010

I doubt you can successfully scrape a password-protected page with file_get_contents; you will probably have to use cURL. Also, it looks like the target page is written in ASP, which will likely make your life a lot more difficult. You will probably have to make a cURL request to the login page, capturing the cookies as you do so, then post the login details along with the cookies, and then request the page(s) that you actually want to scrape.
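One reason an ASP.NET login form is harder to script is the hidden __VIEWSTATE field visible in the first post: the server typically expects it to be sent back along with the login details. Below is a minimal sketch of pulling the hidden fields out of the login page's HTML so they can be merged into the POST data. The function name extractHiddenFields and the sample markup are made up for illustration, and it assumes PHP's DOM extension is available.

```php
<?php
// Hypothetical helper: collect every <input type="hidden"> from a page
// as a name => value array, so the values can be posted back unchanged.
function extractHiddenFields(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely well formed; keep libxml warnings quiet.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $fields = [];
    foreach ($doc->getElementsByTagName('input') as $input) {
        if (strtolower($input->getAttribute('type')) === 'hidden') {
            $fields[$input->getAttribute('name')] = $input->getAttribute('value');
        }
    }
    return $fields;
}

// Sample markup (an assumption, shortened from the kind of form shown above).
$sample = '<form name="Login" method="post">'
        . '<input type="hidden" name="__VIEWSTATE" value="abc123" />'
        . '<input type="text" name="txtUserID" />'
        . '</form>';

print_r(extractHiddenFields($sample));
```

The returned array can be combined with the username and password fields before the login POST is sent.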
Psycho Posted September 14, 2010

Just to add: it is *possible* that the site is coded to accept login variables on the URL. Some sites do that with the username, and some with the password as well. However, you would have to know whether they are allowed and what the parameter names are. Here is what you could try: view the source of the login page and see what the field names are for the username and the password. Then add those names, along with their respective values, to the URL and see if that gives you access to the page without being logged in already.
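A quick way to try this from PHP is sketched below. The field names txtUserID and txtPassword are the ones the setFocus() script in the first post refers to; the host, path, and credentials are placeholders, the helper name buildLoginUrl is made up, and whether the server accepts credentials on the URL at all is exactly what is being tested.

```php
<?php
// Hypothetical helper: append form fields to a URL as query parameters.
function buildLoginUrl(string $baseUrl, array $fields): string
{
    // Use '&' if the URL already carries a query string, '?' otherwise.
    $separator = (strpos($baseUrl, '?') === false) ? '?' : '&';
    // http_build_query() URL-encodes names and values.
    return $baseUrl . $separator . http_build_query($fields);
}

// Placeholder URL and credentials (assumptions, not the real supplier site).
$url = buildLoginUrl('https://www.example.com/Catalog/series_detail.asp', [
    'txtUserID'   => 'myuser',
    'txtPassword' => 'my pass',
]);

echo $url, "\n";

// To test it against the real site:
// $html = @file_get_contents($url);   // returns false if the request fails
```

If the response is still the login page, the site most likely only accepts the credentials in a POST.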
phoenixx (Author) Posted November 12, 2010

Tried that, and also tried the old way of doing it with username:password@ in the URL. Neither worked.
.josh Posted November 12, 2010

I doubt you can successfully scrape a password-protected page with file_get_contents; you will probably have to use cURL. Also, it looks like the target page is written in ASP, which will likely make your life a lot more difficult. You will probably have to make a cURL request to the login page, capturing the cookies as you do so, then post the login details along with the cookies, and then request the page(s) that you actually want to scrape.
daydreamer Posted November 13, 2010

This may be useful: http://www.askapache.com/htaccess/sending-post-form-data-with-php-curl.html

Try this:

1. Install the Live HTTP Headers Firefox plugin.
2. Log in with your browser.
3. View the first POST request and look at its contents. It will be in this format: username=yourusername&password=yourpassword&otherstuff......
4. That POST is what you need to replicate with your PHP script using cURL.
5. Log in with cURL from PHP (with cURL cookies enabled), then try to download the file you are after. It should work, as you have now authenticated yourself to the server you are downloading from.
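Steps 4 and 5 could look roughly like the sketch below. Everything specific in it is an assumption: the function names, the URLs, and the field names all have to be replaced with what the captured POST actually shows. ASP.NET forms like the one in the first post usually also expect their hidden fields (for example __VIEWSTATE) in the POST, so those would go into $fields as well.

```php
<?php
// Encode form fields the way a browser posts them
// (application/x-www-form-urlencoded).
function encodePostFields(array $fields): string
{
    return http_build_query($fields);
}

// Run one request on a shared cURL handle; POST when a body is given.
function curlRequest($ch, string $url, ?string $postBody = null): string
{
    curl_setopt($ch, CURLOPT_URL, $url);
    if ($postBody !== null) {
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, $postBody);
    } else {
        curl_setopt($ch, CURLOPT_HTTPGET, true); // switch back to GET
    }
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }
    return $body;
}

// Log in, then fetch the protected page with the same handle so the
// session cookie set by the login response is sent automatically.
function loginAndFetch(string $loginUrl, array $fields, string $targetUrl): string
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow the redirect after login
        CURLOPT_COOKIEFILE     => '',   // empty string: keep cookies in memory
    ]);
    curlRequest($ch, $loginUrl, encodePostFields($fields));
    return curlRequest($ch, $targetUrl);
}

// Usage (placeholder values):
// $html = loginAndFetch(
//     'https://www.example.com/login.asp',
//     ['username' => 'yourusername', 'password' => 'yourpassword'],
//     'https://www.example.com/Catalog/series_detail.asp'
// );
```

Reusing one handle keeps the cookies for the second request; to keep the session across separate script runs, CURLOPT_COOKIEJAR and CURLOPT_COOKIEFILE can point at a writable file instead.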
salathe Posted November 13, 2010

phoenixx, your HTML sample is missing a few details (like the username/password fields!), and without more details we can't help you with targeted answers. Can we see the full HTML? Even better than that would be a trace of the HTTP headers/content from the requests and responses when logging in and accessing the password-protected page.

For what it's worth, you almost certainly (barring anything really crazy) could do what you want with file_get_contents(), even if some folks in this thread say otherwise.
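For anyone wondering how file_get_contents() could do it: the HTTP stream wrapper accepts a context that can send a POST body and custom headers, and the response headers are exposed in $http_response_header. The following is only a sketch under assumptions; the function names, URLs, and field names are made up, and a real login may also need hidden form fields and more careful cookie handling.

```php
<?php
// Build stream-context options for a form POST, optionally sending cookies.
function postContextOptions(array $fields, string $cookies = ''): array
{
    $headers = ['Content-Type: application/x-www-form-urlencoded'];
    if ($cookies !== '') {
        $headers[] = 'Cookie: ' . $cookies;
    }
    return ['http' => [
        'method'  => 'POST',
        'header'  => implode("\r\n", $headers),
        'content' => http_build_query($fields),
    ]];
}

// Turn the Set-Cookie lines of a response into a Cookie header value.
function cookiesFromHeaders(array $responseHeaders): string
{
    $pairs = [];
    foreach ($responseHeaders as $line) {
        if (stripos($line, 'Set-Cookie:') === 0) {
            // Keep "name=value" only; drop path, expires and other attributes.
            $value = trim(substr($line, strlen('Set-Cookie:')));
            $pairs[] = explode(';', $value, 2)[0];
        }
    }
    return implode('; ', $pairs);
}

// POST the login form, then request the target page with the cookies
// the login response set. Returns the page body, or false on failure.
function fetchAfterLogin(string $loginUrl, array $fields, string $targetUrl)
{
    $login = stream_context_create(postContextOptions($fields));
    if (@file_get_contents($loginUrl, false, $login) === false) {
        return false;
    }
    // The HTTP wrapper fills $http_response_header in the calling scope.
    $cookies = cookiesFromHeaders($http_response_header ?? []);
    $get = stream_context_create(['http' => ['header' => 'Cookie: ' . $cookies]]);
    return @file_get_contents($targetUrl, false, $get);
}
```

The 'http' context options apply to https:// URLs too. Compared with cURL, the cookies have to be carried over by hand, which is the main extra work in this approach.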