
Web scraping blocked


SammyP


I have been using some PHP code to get some football scores from a website for a while, but it has stopped working.

 

I can see the site myself, and the source code is as before.

 

When my PHP code tries to read it though, the page is just one line. I assume this is intentional by the site. (I'm not sure why, as the results aren't theirs, and there are plenty of other sites with them.)

 

Anyway, I am simply getting them from another place now, but I am wondering how they do this, and if it is avoidable.

 

 

https://forums.phpfreaks.com/topic/60758-web-scraping-blocked/

A lot of websites are available in different formats. For instance, you can view a lot of websites from a mobile phone, but they will look completely different from how they look in your ordinary browser. The content of the website is therefore dependent on the information it can gather from the user.

 

If you were using something like file_get_contents, I think I would be right in saying that very little information about the client is passed to the site; by default, it doesn't even send a User-Agent header. This often results in a vastly cut-down version of the site being retrieved. You are usually better off using cURL, which lets you set things like the user agent in the request. That usually helps you get the content you require.
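To illustrate the cURL suggestion, here is a minimal sketch of a request that sends a browser-like User-Agent. The helper names (build_curl_options, fetch_page), the timeout, and the user agent string are all made up for the example, and it assumes the cURL extension is enabled. It is not a guaranteed fix, just the general shape of it.

```php
<?php
// Hypothetical helper: the cURL options for a request that identifies
// itself with the given User-Agent string.
function build_curl_options(string $userAgent): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,       // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,       // follow redirects
        CURLOPT_USERAGENT      => $userAgent, // sent as the User-Agent request header
        CURLOPT_TIMEOUT        => 15,         // seconds; arbitrary value for the sketch
    ];
}

// Hypothetical helper: fetch a page, returning null if the request fails.
function fetch_page(string $url, string $userAgent): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, build_curl_options($userAgent));
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}
```

You would call it along the lines of `fetch_page('https://example.com/scores', 'Mozilla/5.0 (compatible; ExampleBot/1.0)')`, where both the URL and the agent string are placeholders.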

 

It could be something completely different that caused your problems, but it's a possibility.

https://forums.phpfreaks.com/topic/60758-web-scraping-blocked/#findComment-302264

No, I don't have their permission, and I am happily using another site now. Don't worry that I'm doing anything I shouldn't; I'm not. I just want the results of football matches. I could type them in from the paper or anywhere, but that requires me to be at my computer all weekend, which I'm not.

 

I am just curious now about their methods. I do a lot of web scraping, and I like to know how these things work.

 

They might simply have blocked the IP address, I suppose. That will be no fun, as the cURL functions won't work either. I am going to test them anyway. Thanks for that advice, GingerRobot; I hadn't heard of those functions before. Will let you know how it goes.
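One rough way to run that test is to fetch the page twice, once with no User-Agent and once with a browser-like one, and compare what comes back. The sketch below assumes the cURL extension is enabled; probe and guess_block_type are made-up names, and the "twice as long" threshold is an arbitrary guess. It is only a heuristic, since a site could be filtering on something else entirely.

```php
<?php
// Hypothetical helper: fetch a URL and report the HTTP status and body length.
// Pass null for $userAgent to leave the User-Agent option unset.
function probe(string $url, ?string $userAgent): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 15);
    if ($userAgent !== null) {
        curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    }
    $body = curl_exec($ch);
    return [
        'status' => (int) curl_getinfo($ch, CURLINFO_HTTP_CODE),
        'length' => $body === false ? 0 : strlen($body),
    ];
}

// Rough heuristic, not a definitive test: compare the two probe results.
function guess_block_type(array $bare, array $browserLike): string
{
    $bareOk    = $bare['status'] === 200 && $bare['length'] > 0;
    $browserOk = $browserLike['status'] === 200 && $browserLike['length'] > 0;

    // The browser-like request worked and returned much more than the bare one.
    if ($browserOk && (!$bareOk || $browserLike['length'] > 2 * $bare['length'])) {
        return 'header-based';
    }
    // Neither request got a usable page, so headers alone do not explain it.
    if (!$browserOk && !$bareOk) {
        return 'possibly IP-based';
    }
    return 'no obvious difference';
}
```

For example, `guess_block_type(probe($url, null), probe($url, 'Mozilla/5.0 (compatible; ExampleBot/1.0)'))`, with `$url` set to the page in question. If both requests fail from the server but the page loads fine from a browser on another connection, an IP block becomes the more likely explanation.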

 

 

 

https://forums.phpfreaks.com/topic/60758-web-scraping-blocked/#findComment-302275

Archived

This topic is now archived and is closed to further replies.
