muppet77, Posted March 2, 2018
I'd like to get the points estimates and football team names from a site as variables in PHP. I've tried file_get_contents and also htmlentities, but I can't spot the team names or prices in the result. What's happening? Why don't they appear in my PHP output? One site is https://mobile.sportingindex.com/markets/4855dfa3c28b43ceb332768100728224/c0c1dc8fc78347cca90c48e40328a64b/group_a.7e4187d6-9e42-4bb0-9bde-779fd4019f61/00000000000000000000000000000000/ Another that gives the same data, so would be fine to use instead of the first, is https://www.spreadex.com/sports/mobile/page/spr/596301/1/10047 Thanks for any help.
Barand, Posted March 2, 2018
Perhaps it's because there is no code to access the files and output the data?
muppet77 (Author), Posted March 2, 2018
Yes, I'm more convinced the page blocks scraping and you need a certificate. (Is that what you just said?)
gizmola, Posted March 2, 2018
No, he pointed out that you provided no code or debugging information. It could certainly be that the site is checking the user-agent information, and you are not getting the same output you normally would because your scraping code doesn't look like a normal browser. I can also tell you in advance that, to parse HTML pages with PHP, I would highly recommend using either SimpleXML or DOM to load the page and then locate and parse individual elements from it. If you want further help, we need to see, at the very least, relevant code snippets.
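As a rough illustration of the two suggestions in this post (a browser-like user agent and DOM-based parsing), here is a minimal sketch. The User-Agent string, the function names `fetchWithUserAgent` and `extractByClass`, the class names `team` and `price`, and the sample markup are all assumptions made for the example; none of them come from either site.

```php
<?php
// Sketch only: the User-Agent string, class names and sample markup are
// assumptions for illustration, not taken from either site.

/**
 * Fetch a URL while sending a browser-like User-Agent header.
 * Returns the response body, or null if the request fails.
 */
function fetchWithUserAgent(string $url): ?string
{
    $context = stream_context_create([
        'http' => [
            'method'     => 'GET',
            'user_agent' => 'Mozilla/5.0 (X11; Linux x86_64) ExampleScraper/1.0',
            'timeout'    => 10,
        ],
    ]);
    $body = @file_get_contents($url, false, $context);
    return $body === false ? null : $body;
}

/**
 * Return the trimmed text of every element carrying the given class.
 */
function extractByClass(string $html, string $class): array
{
    $previous = libxml_use_internal_errors(true); // tolerate untidy HTML
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $query = sprintf(
        '//*[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );

    $values = [];
    foreach ($xpath->query($query) as $node) {
        $values[] = trim($node->textContent);
    }
    return $values;
}

// Hypothetical markup, used here instead of a live request.
$sample = '<div><span class="team">Manchester City</span>'
        . '<span class="price">98-99.5</span></div>';

print_r(extractByClass($sample, 'team'));
print_r(extractByClass($sample, 'price'));
```

In real use, the string returned by `fetchWithUserAgent()` would be passed to `extractByClass()`. This only helps if the data is actually present in the HTML the server returns.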
muppet77 (Author), Posted March 2, 2018
OK, thanks. echo htmlentities(file_get_contents("url")); is what I'm using. The output doesn't have the team names in it, unlike when you visit the actual page (as above). Thanks
ginerjm, Posted March 2, 2018
You think one line of code tells us what you are doing? Why are you using htmlentities? Please explain. If this were me, I would first examine the actual unmodified code that you are reading and search for recognizable items that identify the parts that you want to extract. Perhaps a div tag with a certain class or id value. Or even another tag that has a name attribute. That's why I wonder why you are using htmlentities.
muppet77 (Author), Posted March 2, 2018
Hi ginerjm. Actually, that is all my code! I've added htmlentities because without it, the actual webpage appears in my browser. With it, I can see the source code, check whether the teams are there, and work out what to target to extract the team names one by one.
kicken, Posted March 2, 2018
Those sites load all their details via some ajax process, so the information you want isn't going to be in the source. If you want to swipe their info, you'll have to spend time reverse engineering their site and figuring out what requests are needed to get the information you want. That, or hire someone to do it for you.
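To make the point above concrete: if the browser's network tab shows the page fetching its data as JSON, the PHP side becomes a matter of requesting that URL and decoding the response, rather than parsing HTML. The sketch below covers only the decoding step. The payload shape (a top-level `selections` array with `name` and `spread` keys) and the function name `parseSelections` are invented for illustration; the real structure has to be read from the actual requests, and the thread does not confirm that either site returns JSON.

```php
<?php
// Sketch only: the JSON shape below is invented for illustration. The real
// payload has to be read from the browser's network tab.

/**
 * Turn a JSON payload into [team name => quoted spread] pairs.
 * Assumes a top-level "selections" array whose rows have "name" and "spread".
 */
function parseSelections(string $json): array
{
    $data = json_decode($json, true);
    if (!is_array($data) || !isset($data['selections']) || !is_array($data['selections'])) {
        return []; // not the shape this sketch expects
    }

    $result = [];
    foreach ($data['selections'] as $row) {
        if (isset($row['name'], $row['spread'])) {
            $result[$row['name']] = $row['spread'];
        }
    }
    return $result;
}

// Hypothetical response body standing in for whatever the XHR call returns.
$payload = '{"selections":[{"name":"Manchester City","spread":"98-99.5"}]}';

foreach (parseSelections($payload) as $team => $spread) {
    echo $team, ': ', $spread, PHP_EOL;
}
```

Returning an empty array for unexpected input keeps the caller simple, at the cost of hiding why a request failed; logging the raw body would be a reasonable addition while reverse engineering.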
ginerjm, Posted March 2, 2018
I always use 'view source' in my browser so didn't see it your way. But 'seeing' the data as a web page makes it easier to visually search for something and THEN look in the 'source' to find the surrounding html items that identify it.
muppet77 (Author), Posted March 2, 2018
Thanks. When I view it in the web page, the team names and prices aren't there...
ginerjm, Posted March 2, 2018 (edited March 2, 2018 by ginerjm)
Show us a small sample of what you are seeing in the native source, not the 'view'.
muppet77 (Author), Posted March 2, 2018
Hi, I'm on my phone at the moment, but it's the view source of https://www.spreadex.com/sports/mobile/page/spr/596301/1/10047 if that's what you mean? I'm after the team names and their spread point estimates, e.g. Manchester City 98-99.5
dalecosp, Posted March 2, 2018
Heh. Loading that page entails *55* separate requests. There are almost 10 XHR requests and at least one to a websocket ... I'd probably start there. Good luck.