dadamssg Posted July 18, 2012

I'm trying to scrape images from the markup of certain webpages. These webpages all have a slideshow, and the image sources are contained in JavaScript objects on the page. I'm thinking I need to call file_get_contents("http://www.example.com/page/1"); and then have a preg_match_all() function that I can give a phrase (e.g. '"LargeUrl": ' or '"Description":') and get whatever is in the quotes directly after each instance.

var photos = {};
photos['photo-391094'] = {"LargeUrl": "http://www.example.org/images/1.png","Description":"blah blah balh"};
photos['photo-391095'] = {"LargeUrl": "http://www.example.org/images/2.png","Description":"blah blah balh"};
photos['photo-391096'] = {"LargeUrl": "http://www.example.org/images/3.png","Description":"blah blah balh"};

I have this function, but it returns the entire line after the input phrase. How can I modify it to look for whatever is in quotes directly after the input keyword? Or am I doing it all wrong and there's an easier way?

$page = file_get_contents("http://www.example.org/page/1");
$word = "\"LargeUrl\":";
if (preg_match_all("/(?<=$word)\S+/i", $page, $matches)) {
    echo "<pre>";
    print_r($matches);
    echo "</pre>";
}
requinix Posted July 18, 2012

Obligatory "whose website and are they allowing you to do this?"
dadamssg (Author) Posted July 18, 2012

They're user-uploaded photos, and I will be contacting each user to ask for permission to use them.
xyph Posted July 18, 2012

"LargeUrl": "([^"]+)

Match the characters <"LargeUrl": "> literally
Match the regular expression below and capture its match into backreference number 1 <([^"]+)>
    Match any character that is NOT a <"> <[^"]+>
        Between one and unlimited times, as many times as possible, giving back as needed (greedy) <+>
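
For anyone wiring that pattern into preg_match_all(), here is a minimal sketch of how it might look. The helper name extractQuotedValues() and the inline $page sample are made up for illustration (the sample just copies two lines from the first post instead of fetching a live page). It assumes the values never contain an escaped double quote, since [^"]+ stops at the first quote it sees.

```php
<?php
// Hypothetical helper (not from the thread): returns every quoted value
// that directly follows "<key>": in $source.
function extractQuotedValues(string $source, string $key): array
{
    // preg_quote() escapes regex metacharacters in the key; '/' is the delimiter.
    // \s* allows for optional whitespace between the colon and the opening quote.
    $pattern = '/"' . preg_quote($key, '/') . '":\s*"([^"]+)"/';

    // With the default PREG_PATTERN_ORDER, $matches[1] holds the contents
    // of capture group 1 for every match (an empty array if nothing matched).
    preg_match_all($pattern, $source, $matches);

    return $matches[1];
}

// Sample input copied from the first post; a real script would use
// file_get_contents() on the page URL instead.
$page = <<<'JS'
var photos = {};
photos['photo-391094'] = {"LargeUrl": "http://www.example.org/images/1.png","Description":"blah blah balh"};
photos['photo-391095'] = {"LargeUrl": "http://www.example.org/images/2.png","Description":"blah blah balh"};
JS;

$urls = extractQuotedValues($page, 'LargeUrl');
$descriptions = extractQuotedValues($page, 'Description');

print_r($urls);
print_r($descriptions);
```

The point is that the capture group, not the whole match, is what you read back: index 0 of $matches still includes the "LargeUrl": " prefix, while index 1 contains only the text between the quotes.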