loren646 Posted April 30, 2013
I'm sure it's possible. I want to go to a website, input values, then copy and paste the data returned by that search. I'm not even sure where to start.
Thread: https://forums.phpfreaks.com/topic/277460-can-i-data-scrape-datamine-data-using-php/
requinix Posted April 30, 2013
Generally, yes. What site is it, and what are you grabbing?
loren646 Posted April 30, 2013
http://a810-bisweb.nyc.gov/bisweb/bispi00.jsp
Example (on row 1 of the property search):
1. Select "manhattan".
2. Input "22" and "west 11 st".
3. Click "GO".
4. Click "complaints", then click each complaint and copy the disposition, e.g. "01/26/2004 - A9 - ECB & BUILDINGS VIOLATIONS SERVED".
requinix Posted April 30, 2013
Okay, I can't find anything that prohibits you from doing this. What data are you trying to get, and in what form?
akphidelt2007 Posted April 30, 2013
It's actually much easier than it seems. Just look up file_get_contents. The only things you will have to know are regex and how to manipulate the URL to get the correct contents. For example, I did a project for some guys where I scraped all of ESPN's baseball data for the past decade, and that was simply a matter of changing the date in the URL and parsing ESPN's structure.
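As a rough illustration of the regex half of that advice, here is a minimal sketch that splits one disposition line, in the layout quoted earlier in the thread, into its parts. The function name parseDisposition and the assumption that every disposition follows the "MM/DD/YYYY - CODE - DESCRIPTION" layout are mine, not something the site documents.

```php
<?php
// Hypothetical helper: split one disposition line into date, code and description.
// Assumes the "MM/DD/YYYY - CODE - DESCRIPTION" layout seen in the thread's example.
function parseDisposition(string $line): ?array
{
    $pattern = '/(\d{2}\/\d{2}\/\d{4})\s*-\s*([A-Z0-9]+)\s*-\s*(.+)/';
    if (preg_match($pattern, $line, $m) !== 1) {
        return null; // the line does not follow the assumed layout
    }
    return [
        'date' => $m[1],
        'code' => $m[2],
        'description' => trim($m[3]),
    ];
}

// Sample text copied from the thread; a real page would come from file_get_contents($url).
$sample = '01/26/2004 - A9 - ECB & BUILDINGS VIOLATIONS SERVED';
print_r(parseDisposition($sample));
```

On real HTML the text would probably need its tags stripped and entities such as &amp; decoded (for example with strip_tags and html_entity_decode) before a pattern like this is applied.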
cob05 Posted May 1, 2013
akphidelt2007 wrote: "It's actually much easier than it seems. Just look up file_get_contents. The only things you will have to know are regex and how to manipulate the URL to get the correct contents. For example, I did a project for some guys where I scraped all of ESPN's baseball data for the past decade, and that was simply a matter of changing the date in the URL and parsing ESPN's structure."
I'm looking to do something like that for football (NFL). You don't happen to have some sample code you could share, do you?
loren646 Posted May 1, 2013
requinix wrote: "Okay, I can't find anything that prohibits you from doing this. What data are you trying to get, and in what form?"
Just text data. I can put it in either a MySQL database or Excel; it doesn't matter. I just want to automate it rather than do it manually.
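For the Excel option, one low-effort route is to write the scraped text out as CSV, which Excel opens directly. The sketch below assumes the rows have already been extracted into arrays; rowsToCsv and the column names are hypothetical, and the empty escape argument needs PHP 7.4 or later.

```php
<?php
// Hypothetical helper: render rows of scraped text as CSV, which Excel can open.
function rowsToCsv(array $rows): string
{
    $fh = fopen('php://memory', 'w+');
    foreach ($rows as $row) {
        // Separator, enclosure and escape are passed explicitly (empty escape needs PHP 7.4+).
        fputcsv($fh, $row, ',', '"', '');
    }
    rewind($fh);
    $csv = stream_get_contents($fh);
    fclose($fh);
    return $csv;
}

// Illustrative data: a header row plus the disposition quoted in the thread.
$rows = [
    ['date', 'code', 'description'],
    ['01/26/2004', 'A9', 'ECB & BUILDINGS VIOLATIONS SERVED'],
];
echo rowsToCsv($rows);
```

To produce a file rather than a string, the result could be passed to file_put_contents with a path of your choosing.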
loren646 Posted May 1, 2013
akphidelt2007 wrote: "It's actually much easier than it seems. Just look up file_get_contents. The only things you will have to know are regex and how to manipulate the URL to get the correct contents. For example, I did a project for some guys where I scraped all of ESPN's baseball data for the past decade, and that was simply a matter of changing the date in the URL and parsing ESPN's structure."
Thanks. I'm going to do some reading up on this right now.
akphidelt2007 Posted May 1, 2013
cob05 wrote: "I'm looking to do something like that for football (NFL). You don't happen to have some sample code you could share, do you?"
Getting the contents is the easy part; it's the parsing that takes some time. This was for MLB. It worked perfectly for me, but this was two years ago, and I know there are probably a lot of efficiencies you could add to it. For time purposes I'll just post the simple code. This was to get individual game data for each game.

    // Plug in the date you want info for, or the date to start your loop from
    $date = '2013-05-01';
    $unix = strtotime($date);
    $espnDate = date('Ymd', $unix);
    $url = 'http://scores.espn.go.com/mlb/scoreboard?date=' . $espnDate;

    // Here's how easy it is to get the file
    $handle = file_get_contents($url);
    $str = htmlentities($handle);

    // Extract the game IDs for that date
    $pattern = '/(\d+)-gameDetails/';
    preg_match_all($pattern, $str, $gameIDs);

    // Now you have the ID of each game; loop through them and repeat the same process
    foreach ($gameIDs[1] as $id) {
        $url = 'http://scores.espn.go.com/mlb/boxscore?gameId=' . $id;
        $handle = file_get_contents($url);
        $str = htmlentities($handle);
        // Now you have a mess of regex to parse the actual HTML,
        // break up the data, and store it in a database
    }
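The "loop for tons of dates" mentioned in the code comment could look something like the sketch below, which builds one scoreboard URL per day in a date range. The function name scoreboardUrls is hypothetical, the URL pattern is taken from the post and may no longer be live, and the fetch itself is left commented out.

```php
<?php
// Hypothetical helper: build one scoreboard URL per day in an inclusive date range.
function scoreboardUrls(string $start, string $end): array
{
    // URL pattern copied from the post above; it may no longer be live.
    $base = 'http://scores.espn.go.com/mlb/scoreboard?date=';
    $period = new DatePeriod(
        new DateTime($start),
        new DateInterval('P1D'),
        (new DateTime($end))->modify('+1 day') // DatePeriod excludes its end date
    );
    $urls = [];
    foreach ($period as $day) {
        $urls[] = $base . $day->format('Ymd');
    }
    return $urls;
}

foreach (scoreboardUrls('2013-05-01', '2013-05-03') as $url) {
    echo $url, PHP_EOL;
    // $html = file_get_contents($url); // fetch the page for that day
    // sleep(1);                        // pause between requests to be polite
}
```

Keeping the URL building separate from the fetching makes it easy to check the generated dates before any requests are sent.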
This topic is now archived and is closed to further replies.