hamidjoukar Posted September 3, 2015

When I read the HTML source of the link below

http://www.dresslink.com/women-candy-color-handbag-leather-cross-body-shoulder-bag-bucket-bag-p-10908.html

I can find the following data about the product:

    <script type="text/javascript">
    item.stock['ss42356']=[];
    DL.item.stock['ss42356']['qty']=56;
    DL.item.stock['ss42356']['sku']='SV000837_B';
    DL.item.stock['ss42356']['inexistence']=0;
    DL.item.stock['ss42356']['down_shelf']=0;
    DL.item.stock['ss42356']['procurement_cycle']='8';
    DL.item.stock['ss42356']['paid_set']=[];
    DL.item.stock['ss42356']['paid_set'].push(35630);
    DL.item.color_image['35630']='of7ea7';
    DL.item.stock['ss42357']=[];
    DL.item.stock['ss42357']['qty']=29;
    DL.item.stock['ss42357']['sku']='SV000837_G';
    DL.item.stock['ss42357']['inexistence']=0;
    DL.item.stock['ss42357']['down_shelf']=0;
    DL.item.stock['ss42357']['procurement_cycle']='6';
    DL.item.stock['ss42357']['paid_set']=[];
    DL.item.stock['ss42357']['paid_set'].push(35631);
    DL.item.color_image['35631']='of710e';
    DL.item.stock['ss42358']=[];
    DL.item.stock['ss42358']['qty']=14;
    DL.item.stock['ss42358']['sku']='SV000837_BR';
    DL.item.stock['ss42358']['inexistence']=0;
    DL.item.stock['ss42358']['down_shelf']=0;
    DL.item.stock['ss42358']['procurement_cycle']='17';
    DL.item.stock['ss42358']['paid_set']=[];
    DL.item.stock['ss42358']['paid_set'].push(35632);
    DL.item.color_image['35632']='of77c1';
    DL.item.stock['ss42359']=[];
    DL.item.stock['ss42359']['qty']=36;
    DL.item.stock['ss42359']['sku']='SV000837_O';
    DL.item.stock['ss42359']['inexistence']=0;
    DL.item.stock['ss42359']['down_shelf']=0;
    DL.item.stock['ss42359']['procurement_cycle']='7';
    DL.item.stock['ss42359']['paid_set']=[];
    DL.item.stock['ss42359']['paid_set'].push(35633);
    DL.item.color_image['35633']='of7136';
    </script>

I need to know the quantity for each SKU, so I need to produce a simple array containing each SKU name and its quantity, like below:

    $a = array(
        'SV000837_B' => '56',
        'SV000837_G' => '29',
        'SV000837_BR' => '14',
        'SV000837_O' => '36',
    );

Please help me write PHP code, using regex or any other method, to produce the above array.
Ch0cu3r Posted September 3, 2015

Try

    <?php
    // webpage you are scraping the javascript code from
    $page_url = 'http://www.dresslink.com/women-candy-color-handbag-leather-cross-body-shoulder-bag-bucket-bag-p-10908.html';

    // load the webpage into DOMDocument
    // (libxml errors are suppressed because real-world HTML is rarely valid)
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTMLFile($page_url);

    // use XPath to return the second <script> element inside the <div class="dd1"> element
    // this is where the javascript code containing the stock array is in the webpage
    $xpath = new DOMXPath($doc);
    $result = $xpath->query('//div[@class="dd1"]/script[2]');

    // stop early if the page layout changed and the <script> element was not found
    if ($result === false || $result->length === 0) {
        die('Stock script not found');
    }

    // retrieve the node element value
    $JS_stock_array_code = $result->item(0)->nodeValue;

    // use regex to find the qty and sku values
    // the lazy .+? with the s modifier also matches when the statements are on separate lines
    preg_match_all("~\[('\w+')\]\['qty'\]=(\d+);.+?\[\\1\]\['sku'\]='(\w+)'~s", $JS_stock_array_code, $matches);

    // loop through the results and define the sku array
    // the sku is used as the array key
    // the quantity is then assigned to the sku
    $skus = array();
    foreach ($matches[3] as $key => $sku) {
        $qty = $matches[2][$key];
        $skus[$sku] = $qty;
    }

    // output the $skus array
    printf('<pre>%s</pre>', print_r($skus, 1));

Output for me is

    Array
    (
        [SV000837_B] => 49
        [SV000837_G] => 26
        [SV000837_BR] => 11
        [SV000837_O] => 35
    )
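To check the extraction without fetching the live page, the same idea can be tried against the snippet quoted in the question. The following is only a minimal sketch under stated assumptions: `$js` is an abridged copy of that snippet, and `extractSkuQuantities()` is a hypothetical helper name, not something from the thread. It also swaps the single backreference pattern for two simpler passes (one for `qty`, one for `sku`) joined on the stock id, which should not depend on the order or spacing of the statements.

```php
<?php
// Abridged copy of the JavaScript from the question (assumption: the live
// page keeps the same DL.item.stock[...] statement layout).
$js = <<<'JS'
DL.item.stock['ss42356']['qty']=56;
DL.item.stock['ss42356']['sku']='SV000837_B';
DL.item.stock['ss42356']['inexistence']=0;
DL.item.stock['ss42357']['qty']=29;
DL.item.stock['ss42357']['sku']='SV000837_G';
DL.item.stock['ss42358']['qty']=14;
DL.item.stock['ss42358']['sku']='SV000837_BR';
DL.item.stock['ss42359']['qty']=36;
DL.item.stock['ss42359']['sku']='SV000837_O';
JS;

// Hypothetical helper: returns an array of sku => quantity.
function extractSkuQuantities(string $js): array
{
    // Pass 1: collect quantities keyed by stock id (e.g. ss42356).
    preg_match_all("~stock\['(\w+)'\]\['qty'\]=(\d+);~", $js, $qtyMatches, PREG_SET_ORDER);
    // Pass 2: collect sku names keyed by the same stock id.
    preg_match_all("~stock\['(\w+)'\]\['sku'\]='([^']+)';~", $js, $skuMatches, PREG_SET_ORDER);

    $qtyById = [];
    foreach ($qtyMatches as $m) {
        $qtyById[$m[1]] = $m[2];
    }

    // Join the two passes: keep only skus that have a matching quantity.
    $skus = [];
    foreach ($skuMatches as $m) {
        if (isset($qtyById[$m[1]])) {
            $skus[$m[2]] = $qtyById[$m[1]];
        }
    }
    return $skus;
}

print_r(extractSkuQuantities($js));
```

The quantities come back as strings because they are regex captures, which matches the array format asked for in the question; cast with `(int)` if numbers are needed.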