Nuv Posted September 3, 2012 Share Posted September 3, 2012 Hey, I'm trying to scrape google.de/shopping . However I'm having following problem. Please check - view-source:http://www.google.de/search?hl=de&tbm=shop&q=4242002690209&oq=4242002690209 The url for the product is something like <a href="/products/catalog?hl=de&q=4242002690209&um=1&ie=UTF-8&tbm=shop&cid=2594634728159287170&sa=X&ei=5etEUJOlNI3zrQfuh4HADQ&ved=0CFYQ8wIwAA">Bosch MCM 42024 Red Diamond/Silber, Styline K?chenmaschine</a> And the url of the product i'm getting from my code is /products/catalog?hl=de&q=4242002690209&um=1&ie=UTF-8&tbm=shop&cid=2594634728159287170 Why is it that "&sa=X&ei=5etEUJOlNI3zrQfuh4HADQ&ved=0CFYQ8wIwAA" part is missing when im trying to scrape the url and where is it coming from ? P.S - In "&sa=X&ei=5etEUJOlNI3zrQfuh4HADQ&ved=0CFYQ8wIwAA" ei value changes everytime so try searching "Bosch MCM 42024 Red Diamond/Silber, Styline K?chenmaschine" while checking the source code. Code im using - <?php $get_EAN = '4242002690209'; $url = "http://www.google.de/search?hl=de&tbm=shop&q=".$get_EAN."&oq=".$get_EAN; $ch = curl_init(); curl_setopt ($ch, CURLOPT_URL, $url); curl_setopt ($ch, CURLOPT_USERAGENT, "msn"); curl_setopt ($ch, CURLOPT_HEADER, 0); curl_setopt($ch, CURLOPT_AUTOREFERER, 1); curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1); curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1); $get_product = curl_exec ($ch); curl_close ($ch); preg_match_all('~<div class="pslimain"><h3 class="r"><a href="(.*?)"~s', $get_product, $get_price); $url = "http://www.google.de".$get_price[1][0]; $ch = curl_init(); curl_setopt ($ch, CURLOPT_URL, $url); curl_setopt ($ch, CURLOPT_USERAGENT, "msn"); curl_setopt ($ch, CURLOPT_HEADER, 0); curl_setopt($ch, CURLOPT_AUTOREFERER, 1); curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1); curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1); $list_price = curl_exec ($ch); curl_close ($ch); print_r($get_price[1][0]); //print_r($get_product); //print_r($list_price); ?> Quote Link to comment https://forums.phpfreaks.com/topic/267952-googlede-generating-additional-info-in-the-link-scraping/ Share on other sites More sharing options...
Nuv Posted September 3, 2012 Author Share Posted September 3, 2012 Got a workaround. It runs without using that. Quote Link to comment https://forums.phpfreaks.com/topic/267952-googlede-generating-additional-info-in-the-link-scraping/#findComment-1374901 Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.