Akkari


  1. Hey there everyone, I'm using curl to retrieve webpages and look up a certain string in their source code. Apparently a lot of websites use 3xx redirects (301, 302, etc.) for various reasons, like redirecting domain.com to www.domain.com or domain.com to domain.com/portal. The problem is that when curl hits any kind of redirect, it doesn't "follow" it; it simply retrieves a page that has nothing in it except the redirect header, so the lookup fails. Is there any way to make curl "follow" redirects and retrieve the page from where it's supposed to be? Thanks!
  2. Thanks for the input, everyone. I solved the issue by switching the script to a different test website. It turned out that the original test website, http://site.com, had a 3xx redirect to http://www.site.com, which made curl fail to retrieve the page. I will open a new topic for that new issue; hopefully someone will be able to point me in the right direction. Thanks!
  3. Thanks a lot for the responses, guys. @sumpygump The string is "/images/" (without the quotes, of course), which occurs within URLs referencing the images folder of the website, so it will usually appear as part of an image URL displayed on the page. @ManiacDan I think your suggestion might be useful in other situations, so I'm looking it up now. However, in this particular situation I echoed out $content and it displayed the target site perfectly. Appreciate your responses, guys!
  4. Hello there everyone, it's been a while since I last posted. I've successfully retrieved external HTML pages using curl, through the following code:
         $ch = curl_init("http://www.site.com/");
         curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
         curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
         $content = curl_exec($ch);
         curl_close($ch);
     I then tried adding a string check. I suspected there was more to it than this, but tried it anyway:
         if(strpos($content,"string_to_search_for") == false) echo "Not found."; else echo "Found.";
     This returned "Not found" every time for me, even when the string was present in that page's source code. On a side note, I will be using this to evaluate hundreds, potentially thousands, of websites to see if they have that string in their source code. How long do you think execution would take for, say, 1,000 URLs? And would there be a better approach to speed things up? Thanks!
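
For anyone landing on these posts with the same problem, here is a minimal sketch of how the two pieces discussed above could fit together: asking curl to follow redirects, and checking for the string with a strict comparison. The function names `fetchPage` and `pageContains`, the redirect limit of 5, and the 10-second timeout are assumptions made for illustration; they do not come from the original posts. `CURLOPT_FOLLOWLOCATION` and `CURLOPT_MAXREDIRS` are standard PHP cURL options.

```php
<?php
// Hypothetical helper: fetch a page body, following 3xx redirects.
// Returns the body as a string, or false if the request fails.
function fetchPage(string $url, int $maxRedirects = 5)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);      // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);      // follow Location headers on redirects
    curl_setopt($ch, CURLOPT_MAXREDIRS, $maxRedirects);  // assumed limit, guards against redirect loops
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);               // assumed timeout in seconds
    return curl_exec($ch);                               // handle is released when $ch goes out of scope
}

// Hypothetical helper: true only if $needle occurs somewhere in $content.
// strpos() returns 0 for a match at the very start, so the result is
// compared with !== false rather than == false.
function pageContains($content, string $needle): bool
{
    return is_string($content) && strpos($content, $needle) !== false;
}
```

Usage would look something like `pageContains(fetchPage("http://site.com/"), "/images/")`. For hundreds or thousands of URLs, total time with one request after another is roughly the sum of the individual response times; PHP's `curl_multi_*` functions can run several transfers concurrently, which may help, though how much depends on the target sites and the network.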