Posts posted by max_maggot

  1. I am trying to retrieve a really tricky array of values. The values I am trying to retrieve are:

    -the price values and the text within each price option tag

    -the color values, which can have more than one option (black and white in this case)

    <dd>
    <div class="input-box">
      <select name="options[1216]" id="select_1216" 
       class="product-custom-option" title=""  onchange="opConfig.reloadPrice()">
       <option value="" >-- Please Select --</option>
       <option value="4185"  price="4.99" >4GB +$4.99</option>
       <option value="4186"  price="5.99" >8GB +$5.99</option>
       <option value="4187"  price="9.99" >16GB +$9.99</option>
       <option value="4188"  price="16.99" >32GB +$16.99</option>
       <option value="4189"  price="29.99" >64GB +$29.99</option>
      </select>            
    </div>
    </dd>
                
    <dt>
    <label class="required"><em>*</em>Color</label></dt>
    <dd class="last">
        <div class="input-box">
            <ul id="options-1215-list" class="options-list">
              <li><input type="radio" class="radio  
                   validate-one-required-by-name product-custom-option" 
                   onclick="opConfig.reloadPrice()" name="options[1215]" 
                   id="options_1215_2" value="4183"  price="0" />
    
                   <span class="label"><label for="options_1215_2">Black </label></span>
                   <script type="text/javascript">
              </li>
              <li>
                  <input type="radio" class="radio  
                   validate-one-required-by-name product-custom-option" 
                   onclick="opConfig.reloadPrice()" name="options[1215]" 
                   id="options_1215_3" value="4184"  price="0" />
                  <span class="label"><label for="options_1215_3">White </label></span>              
              </li>
            </ul>                                    
    </div>
    

    I have written the following code to try to retrieve the price values, but to no avail. I'm using a wildcard in the XPath query because other pages will have different option names. Here is what I have come up with so far. There won't be more than 20 options on a page. Any help would be much appreciated. Thanks in advance.

    $option_array = array();
          for($k = 1; $k <= 20; $k++) {
               $option_array[$k] = $xpath->query("//div[@class='input-box']//select/*/option[$k]/@price")->item(0)->textContent;
               //If we have captured all the options then we need to exit the for loop
               if($option_array[$k] == null){
                     break;
               }
          }
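
    For comparison, here is a minimal sketch of one way the prices and colours might be collected without a counter. It assumes the options are direct children of the select (so no `/*/` step) and that every colour sits in a label inside a `ul` with the `options-list` class. The sample markup is a trimmed copy of the snippet above, and the variable names `$priceOptions` and `$colors` are only illustrative.

```php
<?php
// Trimmed copy of the markup from the post; in practice this would be the fetched page.
$sampleHtml = <<<'HTML'
<div class="input-box">
  <select name="options[1216]" class="product-custom-option">
    <option value="">-- Please Select --</option>
    <option value="4185" price="4.99">4GB +$4.99</option>
    <option value="4186" price="5.99">8GB +$5.99</option>
  </select>
</div>
<div class="input-box">
  <ul id="options-1215-list" class="options-list">
    <li><input type="radio" name="options[1215]" id="options_1215_2" value="4183" price="0" />
        <span class="label"><label for="options_1215_2">Black </label></span></li>
    <li><input type="radio" name="options[1215]" id="options_1215_3" value="4184" price="0" />
        <span class="label"><label for="options_1215_3">White </label></span></li>
  </ul>
</div>
HTML;

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc->loadHTML($sampleHtml);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

// Every <option> that carries a price attribute, whatever the select is named.
// The "Please Select" entry has no price attribute, so it is skipped.
$priceOptions = array();
foreach ($xpath->query("//div[@class='input-box']//select/option[@price]") as $option) {
    $priceOptions[] = array(
        'price' => $option->getAttribute('price'),
        'text'  => trim($option->textContent),
    );
}

// Assumption: each colour is the text of a <label> inside an options list.
$colors = array();
foreach ($xpath->query("//ul[contains(@class, 'options-list')]//label") as $label) {
    $colors[] = trim($label->textContent);
}

print_r($priceOptions);
print_r($colors);
```

    Looping over the node list with foreach avoids calling ->item(0) on an empty result, which is what happens in the for loop once `$k` runs past the last option. If a page has several radio groups, the label query would need narrowing to the colour group.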
    
  2. Hi,

    I have two XPath queries that I can't seem to get working.

     

    I am trying to read the weight value from this site: http://www.vopmart.com/cs2013135.html

     

    I have the XPath query

    $items['weight'] = $xpath->query("//div/table/tr/td[@class='et7']")->item(0)->textContent;
    

    I am also having a problem trying to retrieve the meta_keyword using this statement

    $items['meta_keyword'] = $xpath->query("//meta[@name='keywords']/@content")->item(0)->textContent;
    

    If anybody can spot where I am going wrong I would really appreciate it.
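
    A small sketch of how the two lookups could be made easier to debug. The helper name `first_text` is hypothetical, and the sample markup and values are invented for illustration (the real page was not checked). It assumes the weight cell sits under a `tbody`, which would stop the strict `table/tr/td` path from matching.

```php
<?php
/**
 * Hypothetical helper: returns the text of the first node matching $expression,
 * or $default when nothing matches, instead of failing on ->item(0)->textContent.
 */
function first_text(DOMXPath $xpath, $expression, $default = '')
{
    $nodes = $xpath->query($expression);
    if ($nodes === false || $nodes->length === 0) {
        return $default;
    }
    return trim($nodes->item(0)->textContent);
}

// Invented sample page; the keyword and weight values are placeholders.
$sampleHtml = '<html><head><meta name="keywords" content="usb, flash drive"></head>'
    . '<body><div><table><tbody><tr><td class="et7">0.05</td></tr></tbody></table></div></body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc->loadHTML($sampleHtml);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

$items = array();
$items['meta_keyword'] = first_text($xpath, "//meta[@name='keywords']/@content");

// The strict path matches nothing in this sample because <tbody> sits between <table> and <tr>.
$items['weight_strict'] = first_text($xpath, "//div/table/tr/td[@class='et7']", 'not found');

// Searching with // before td no longer depends on the exact nesting.
$items['weight'] = first_text($xpath, "//td[@class='et7']");

print_r($items);
```

    With a default value in place, a query that matches nothing shows up as 'not found' in the output rather than stopping the script, which makes it easier to see which expression is at fault.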

     

     

     

     

  3. OK,

    Fixed initial problem. See code snippet below. The code is still crashing out after it completes its task. Any help is much appreciated.

    //This function takes all the product pages for a category and opens each of them individually
    //It then scrapes all the information about that product and stores the details in a CSV file
    
    function get_product_details($product_link_list)
    {
        //Open CSV file
        //a+ opens the file for writing and placing the pointer at the end of the file to append new data
        //If the file does not exist a+ will try to create the file products.csv
        //Appending the data to this file happens later in this function.
    
        //looping variable
        $i = 0;
        global $file_handle;
    
        //Load DOM of product page
        //@$html->loadHtmlFile($category_sub_page_list[$i]);
    
        while ($i <= count($product_link_list)) { //loop through each of the product details pages and scrape data {
            $html = new DOMDocument();
            $html->loadHTMLFile($product_link_list[$i]);
            $xpath = new DOMXPath($html);
    
    
    
            $csv_details = "";
            $html = file_get_html($product_link_list[$i]);
            $items = array();
    
            //TODO: title is wrong, finish scraped values, Fix up headings at top of code.
            foreach ($html->find('div.main') as $article) {
    
                //capture content from website
                $item['title'] = $article->find('li.product', 0)->plaintext;
                $item['sku'] = $article->find('div.sku-no', 0)->plaintext;
                $item['price'] = $article->find('div.price-box', 0)->plaintext;
                //Capture HTML code and content for description
                $item['description'] = $article->find('div.std', 0)->outertext;
                $items['categories'] = $xpath->query("//a[@class='in-category']")->item(0)->textContent;
                $items['image'] = $xpath->query("//img/@src")->item(0)->textContent; //Get and Set Product Image
                $items['thumbnail_label'] = $xpath->query("//div[@class='highslide-caption']")->item(0)->textContent;
                $items['media_image'] = $xpath->query("//div[@class='highslide-gallery']//a/@href")->item(0)->textContent;
                $items['meta_description'] = $xpath->query("//meta[@name='description']/@content")->item(0)->textContent;
                $items['meta_keyword'] = $xpath->query("//meta[@name='keywords']/@content")->item(0)->textContent; //Get and Set meta keywords
                $item['name'] = $article->find('li.product', 0)->plaintext;
                $items['short_description'] = $xpath->query("//div[@class='short-description']")->item(0)->textContent; //Get and Set short description
                $items['small_image'] = $xpath->query("//div[@class='highslide-gallery']//a/@href")->item(0)->textContent; //Get and Set small image
                $items['thumbnail_label'] = $xpath->query("//div[@class='highslide-caption']")->item(0)->textContent; //Get and Set thumbnail label
                $items['weight'] = $xpath->query("//td[@class='et7']/text()")->item(0)->textContent;
    
                //Trim and fix values
                $item['title'] = str_replace(' ', '', $item['title']);
                $item['sku'] = str_replace('SKU:', '', $item['sku']);
                $item['price'] = str_replace(' ', '', $item['price']);
                $item['name'] = str_replace(' ', '', $item['name']);
                //Remove HTML code from the scraped data.
                $items['short_description'] = trim(preg_replace('/\s\s+/', ' ', $items['short_description']));
                $items['thumbnail_label'] = trim(preg_replace('/\s\s+/', ' ', $items['thumbnail_label']));
    
                //Assign values to temporary variables for writing to CSV
                $sku = $item['sku'];                                    //Scrape
                $title = $item['title'];                                //Scrape
                $_store = " ";                                          //Default value
                $_attribute_set = "Default";                            //Default value
                $_type = "Simple";                                      //Default value
                $_category = $items['categories'];                       //Scrape
                $_root_category = "Default Category";                   //Default value
                $_product_websites = "base";                            //Default value
                $color = " ";                                           //Default value
                $cost = $item['price'];                                 //Scrape
                $country_of_manufacture = " ";                          //Set Value - No tag to get this
                $created_at = " ";                                      //Default value
                $custom_design = "99";                                  //Default value
                $custom_design_from = "1";                              //Default value
                $custom_design_to = " ";                                //Default value
                $custom_layout_update = " ";                            //Default Value
                $description = $item['description'];                    //Scrape
                $gallery = " ";                                         //Default Value
                $gift_message_available = " ";                          //Default Value
                $has_options = "1";                                     //Default value
                $image = $items['image'];                               //Scrape
                $image_label = $items['thumbnail_label'];                 //Scrape
                $manufacturer = " ";                                    //Default - cannot scrape
                $media_gallery = $items['media_image'];                   //Scrape
                $meta_description =  $items['meta_description'];          //Scrape
                $meta_keyword = $items['meta_keyword'];                   //Scrape
                $meta_title = $item['title'];                           //Scrape
                $minimal_price = " ";                                   //Default Value
                $msrp = "0";                                            //Default value
                $msrp_display_actual_price_type = "100";                //Default value
                $msrp_enabled = "0";                                    //Default value
                $name = $item['name'];                                   //Scrape
                $news_from_date = " ";                                  //Default value
                $news_to_date = " ";                                    //Default value
                $options_container = "1";                               //Default value
                $page_layout = " ";                                     //Default value
                $price = $item['price'];                                //Scrape
                $required_options = "0";                                //Default value
                $short_description = $items['short_description'];         //Scrape
                $small_image = $items['small_image'];                     //Scrape
                $small_image_label = $items['thumbnail_label'];           //Scrape
                $special_from_date = " ";                               //Default value
                $special_price = " ";                                   //Default value
                $special_to_date = " ";                                 //Default value
                $status = "1";                                          //Default value
                $tax_class_id = "1";                                    //Default value
                $thumbnail = "0";                                       //Default value
                $thumbnail_label = "1";                                 //Default value
                $updated_at = "0";                                      //Default value
                $url_key = "0";                                         //Default value
                $url_path = " ";                                        //Default value
                $visibility = " ";                                      //Default value
                $weight = $items['weight'];                               //Scrape and Ramon has Work to do
                $qty = "100";                                           //Default value
                $min_qty = " ";                                         //Default value
                $use_config_min_qty = " ";                              //Default value
                $is_qty_decimal = " ";                                  //Default value
                $backorders = " ";                                      //Default value
                $use_config_backorders = " ";                           //Default value
                $min_sale_qty = " ";                                    //Default value
                $use_config_min_sale_qty = " ";                         //Default value
                $max_sale_qty = " ";                                    //Default value
                $use_config_max_sale_qty = " ";                         //Default value
                $is_in_stock = "1";                                     //Default value
                $notify_stock_qty = " ";                                //Default value
                $use_config_notify_stock_qty = " ";                     //Default value
                $manage_stock = "88";                                   //Default value
                $use_config_manage_stock = "0";                         //Default value
                $stock_status_changed_auto = " ";                       //Default value
                $use_config_qty_increments = "1";                       //Default value
                $qty_increments = " ";                                  //Default value
                $use_config_enable_qty_inc = " ";                       //Default value
                $is_decimal_divided = " ";                              //Default value
                $_links_related_sku = " ";                              //Default value
                $_links_related_position = " ";                         //Default value
                $_links_crosssell_sku = " ";                            //Default value
                $_links_crosssell_position = " ";                       //Default value
                $_links_upsell_sku = " ";                               //Default value
                $_links_upsell_position = " ";                          //Default value
                $_associated_sku = "0";                                 //Default value
                $_associated_default_qty = " ";                         //Default value
                $_associated_position = "0";                            //Default value
                $_tier_price_website = " ";                             //Default value
                $_tier_price_customer_group = " ";                      //Default value
                $_tier_price_qty = " ";                                 //Default value
                $_tier_price_price = " ";                               //Default value
                $_group_price_website = " ";                            //Default value
                $_group_price_customer_group = " ";                     //Default value
                $_group_price_price = " ";                              //Default value
                $_media_attribute_id = " ";                             //Default value
                $_media_image = " ";                                    //Default value
                $_media_label = " ";                                    //Default value
                $_media_position = " ";                                 //Default value
                $_media_is_disabled = " ";                              //Default value
                $_custom_option_store = " ";                            //Default value
                $_custom_option_type = " ";                             //Default value
                $_custom_option_title = " ";                            //Default value
                $_custom_option_is_required = " ";                      //Default value
                $_custom_option_price = " ";                            //Default value
                $_custom_option_sku = " ";                              //Default value
                $_custom_option_max_characters = " ";                   //Default value
                $_custom_option_sort_order = " ";                       //Default value
                $_custom_option_row_title = " ";                        //Default value
                $_custom_option_row_price = " ";                        //Default value
                $_custom_option_row_sku = " ";                          //Default value
                $_custom_option_row_sort = " ";                         //Default value
                $enable_config_enable_qty_inc = " ";                    //Default value
                $enable_qty_inc = " ";                                  //Default value
    
                //Append data to CSV file
                $csv_details .= $sku . "," . $title . "," . $_store . "," . $_attribute_set . "," . $_type . "," . $_category . "," . $_root_category . "," . $_product_websites . ","
                    . $color . "," . $cost . "," . $country_of_manufacture . "," . $created_at . "," . $custom_design . "," . $custom_design_from . "," . $custom_design_to . ","
                    . "," . $custom_layout_update . "," . $description . "," . $gallery . "," . $gift_message_available . "," . $has_options . "," . $image . "," . $image_label . ","
                    . $manufacturer . "," . $media_gallery . "," . $meta_description . "," . $meta_keyword . "," . $meta_title . "," . $minimal_price . "," . $msrp . "," .
                    $msrp_display_actual_price_type . "," . $msrp_enabled . "," . $name . "," . $news_from_date . "," . $news_to_date . "," . $options_container . "," .
                    $page_layout . "," . $price . "," . $required_options . "," . $short_description . "," . $small_image . "," . $small_image_label . "," . $special_from_date
                    . "," . $special_price . "," . $special_to_date . "," . $status . "," . $tax_class_id . "," . $thumbnail . "," . $thumbnail_label . "," . $updated_at . "," .
                    $url_key . "," . $url_path . "," . $visibility . "," . $weight . "," . $qty . "," . $min_qty . "," . $use_config_min_qty . "," . $is_qty_decimal . "," .
                    $backorders . "," . $use_config_backorders . "," . $min_sale_qty . "," . $use_config_min_sale_qty . "," . $max_sale_qty . "," . $use_config_max_sale_qty
                    . "," . $is_in_stock . "," . $notify_stock_qty . "," . $use_config_notify_stock_qty . "," . $manage_stock . "," . $use_config_manage_stock . "," .
                    $stock_status_changed_auto . "," . $use_config_qty_increments . "," . $qty_increments . "," . $use_config_enable_qty_inc . "," . $qty_increments . "," .
                    $use_config_qty_increments . "," . "," . $is_decimal_divided . "," . $_links_related_sku . "," . $_links_related_position . "," .
                    $_links_crosssell_position . "," . $_links_crosssell_sku . "," . $_links_crosssell_position . "," . $_links_upsell_sku . "," . $_links_upsell_position
                    . "," . $_associated_sku . "," . $_associated_default_qty . "," . $_associated_position . "," . $_tier_price_website . "," . $_tier_price_customer_group
                    . "," . $_tier_price_qty . "," . $_tier_price_price . "," . $_group_price_website . "," . $_group_price_customer_group . "," . $_group_price_price . "," .
                    $_media_attribute_id . "," . $_media_image . "," . $_media_label . "," . $_media_position . "," . $_media_is_disabled . "," . $_custom_option_store . "," .
                    $_custom_option_type . "," . $_custom_option_title . "," . $_custom_option_is_required . "," . $_custom_option_price . "," . $_custom_option_sku . "," .
                    $_custom_option_max_characters . "," . $_custom_option_sort_order . "," . $_custom_option_row_sort . "," . $_custom_option_row_title . "," .
                    $_custom_option_row_price . "," . $_custom_option_row_sku . "," . $_custom_option_row_sort . "," . $enable_config_enable_qty_inc . "," . $enable_qty_inc . "\r\n";
    
    
                fwrite($file_handle, $csv_details);
    
                //move to next product category page
                $i++;
            }
    
        }
    }
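
    One thing that might explain a crash right after the last product is the loop condition: `$i <= count($product_link_list)` runs one pass more than there are links, so the final pass asks for an index that does not exist. The sketch below shows the difference on a hypothetical three-element list; the file names are placeholders, not real pages.

```php
<?php
// Hypothetical list of product URLs, standing in for $product_link_list.
$product_link_list = array('page-a.html', 'page-b.html', 'page-c.html');

// With <=, the last pass uses index count(), which is one past the final element.
$visited_with_lte = array();
for ($i = 0; $i <= count($product_link_list); $i++) {
    // isset() guards the lookup here so the sketch runs cleanly;
    // the missing element is recorded as null.
    $visited_with_lte[] = isset($product_link_list[$i]) ? $product_link_list[$i] : null;
}

// foreach visits each link exactly once and needs no counter.
$visited = array();
foreach ($product_link_list as $link) {
    // A fresh DOMDocument and DOMXPath for $link would be created here.
    $visited[] = $link;
}

var_dump(count($visited_with_lte), count($visited));
```

    It may also be worth noting that `$i++` sits inside the inner foreach in the function above, so the counter advances once per `div.main` match rather than once per page; a foreach over the link list sidesteps both issues.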
    
  4. Hi all,

    I am writing a script to scrape a website. The client wants extra details captured from the website, so I've had to use XPath queries as well as the HTML DOM parser to retrieve these values. I'm getting an error when the while loop moves from 0 to 1 (code displayed below). I know it looks like a lot of code, but I believe the error is in the creation/destruction of the DOMXPath object. The rest of the code is for the scrape: it moves through a list of product pages for a category, scrapes the appropriate data, then moves on to the next category and the product pages associated with it.

     

     

    The program dies on the line $xpath = new DOMXPath($html_x); when the $i variable reaches 1 (the second time through the loop). If you need the whole source code, I can PM it to you. This has been driving me crazy all day.

     

    Thanks for any help you can provide. It is very much appreciated.

    function get_product_details($product_link_list)
    {
        //Open CSV file
        //a+ opens the file for writing and placing the pointer at the end of the file to append new data
        //If the file does not exist a+ will try to create the file products.csv
        //Appending the data to this file happens later in this function.
    
        //looping variable
        $i = 0;
        global $file_handle;
        $html = new DOMDocument();
        $html_x= new DOMXPath();
        //Load DOM of product page
        //@$html->loadHtmlFile($category_sub_page_list[$i]);
    
        while ($i <= count($product_link_list)) { //loop through each of the product details pages and scrape data {
    
            $html->loadHTMLFile($product_link_list[$i]);
            $xpath = new DOMXPath($html_x);
    
    
            $csv_details = "";
            $html = file_get_html($product_link_list[$i]);
            $items = array();
    
            //TODO: title is wrong, finish scraped values, Fix up headings at top of code.
            foreach ($html->find('div.main') as $article) {
    
                //capture content from website
                $item['title'] = $article->find('li.product', 0)->plaintext;
                $item['sku'] = $article->find('div.sku-no', 0)->plaintext;
                $item['price'] = $article->find('div.price-box', 0)->plaintext;
                //Capture HTML code and content for description
                $item['description'] = $article->find('div.std', 0)->outertext;
                $items['categories'] = $xpath->query("//a[@class='in-category']")->item(0)->textContent;
                $items['image'] = $xpath->query("//img/@src")->item(0)->textContent; //Get and Set Product Image
                $items['thumbnail_label'] = $xpath->query("//div[@class='highslide-caption']")->item(0)->textContent;
                $items['media_image'] = $xpath->query("//div[@class='highslide-gallery']//a/@href")->item(0)->textContent;
                $items['meta_description'] = $xpath->query("//meta[@name='description']/@content")->item(0)->textContent;
                $items['meta_keyword'] = $xpath->query("//meta[@name='keywords']/@content")->item(0)->textContent; //Get and Set meta keywords
                $item['name'] = $article->find('li.product', 0)->plaintext;
                $items['short_description'] = $xpath->query("//div[@class='short-description']")->item(0)->textContent; //Get and Set short description
                $items['small_image'] = $xpath->query("//div[@class='highslide-gallery']//a/@href")->item(0)->textContent; //Get and Set small image
                $items['thumbnail_label'] = $xpath->query("//div[@class='highslide-caption']")->item(0)->textContent; //Get and Set thumbnail label
                $items['weight'] = $xpath->query("//td[@class='et7']/text()")->item(0)->textContent;
    
                //Trim and fix values
                $item['title'] = str_replace(' ', '', $item['title']);
                $item['sku'] = str_replace('SKU:', '', $item['sku']);
                $item['price'] = str_replace(' ', '', $item['price']);
                $item['name'] = str_replace(' ', '', $item['name']);
                //Remove HTML code from the scraped data.
                $items['short_description'] = trim(preg_replace('/\s\s+/', ' ', $items['short_description']));
                $items['thumbnail_label'] = trim(preg_replace('/\s\s+/', ' ', $items['thumbnail_label']));
    
                //Assign values to temporary variables for writing to CSV
                $sku = $item['sku'];                                    //Scrape
                $title = $item['title'];                                //Scrape
                $_store = " ";                                          //Default value
                $_attribute_set = "Default";                            //Default value
                $_type = "Simple";                                      //Default value
                $_category = $items['categories'];                       //Scrape
                $_root_category = "Default Category";                   //Default value
                $_product_websites = "base";                            //Default value
                $color = " ";                                           //Default value
                $cost = $item['price'];                                 //Scrape
                $country_of_manufacture = " ";                          //Set Value - No tag to get this
                $created_at = " ";                                      //Default value
                $custom_design = "99";                                  //Default value
                $custom_design_from = "1";                              //Default value
                $custom_design_to = " ";                                //Default value
                $custom_layout_update = " ";                            //Default Value
                $description = $item['description'];                    //Scrape
                $gallery = " ";                                         //Default Value
                $gift_message_available = " ";                          //Default Value
                $has_options = "1";                                     //Default value
                $image = $items['image'];                               //Scrape
                $image_label = $items['thumbnail_label'];                 //Scrape
                $manufacturer = " ";                                    //Default - cannot scrape
                $media_gallery = $items['media_image'];                   //Scrape
                $meta_description =  $items['meta_description'];          //Scrape
                $meta_keyword = $items['meta_keyword'];                   //Scrape
                $meta_title = $item['title'];                           //Scrape
                $minimal_price = " ";                                   //Default Value
                $msrp = "0";                                            //Default value
                $msrp_display_actual_price_type = "100";                //Default value
                $msrp_enabled = "0";                                    //Default value
                $name = $item['name'];                                   //Scrape
                $news_from_date = " ";                                  //Default value
                $news_to_date = " ";                                    //Default value
                $options_container = "1";                               //Default value
                $page_layout = " ";                                     //Default value
                $price = $item['price'];                                //Scrape
                $required_options = "0";                                //Default value
                $short_description = $items['short_description'];         //Scrape
                $small_image = $items['small_image'];                     //Scrape
                $small_image_label = $items['thumbnail_label'];           //Scrape
                $special_from_date = " ";                               //Default value
                $special_price = " ";                                   //Default value
                $special_to_date = " ";                                 //Default value
                $status = "1";                                          //Default value
                $tax_class_id = "1";                                    //Default value
                $thumbnail = "0";                                       //Default value
                $thumbnail_label = "1";                                 //Default value
                $updated_at = "0";                                      //Default value
                $url_key = "0";                                         //Default value
                $url_path = " ";                                        //Default value
                $visibility = " ";                                      //Default value
                $weight = $items['weight'];                               //Scrape and Ramon has Work to do
                $qty = "100";                                           //Default value
                $min_qty = " ";                                         //Default value
                $use_config_min_qty = " ";                              //Default value
                $is_qty_decimal = " ";                                  //Default value
                $backorders = " ";                                      //Default value
                $use_config_backorders = " ";                           //Default value
                $min_sale_qty = " ";                                    //Default value
                $use_config_min_sale_qty = " ";                         //Default value
                $max_sale_qty = " ";                                    //Default value
                $use_config_max_sale_qty = " ";                         //Default value
                $is_in_stock = "1";                                     //Default value
                $notify_stock_qty = " ";                                //Default value
                $use_config_notify_stock_qty = " ";                     //Default value
                $manage_stock = "88";                                   //Default value
                $use_config_manage_stock = "0";                         //Default value
                $stock_status_changed_auto = " ";                       //Default value
                $use_config_qty_increments = "1";                       //Default value
                $qty_increments = " ";                                  //Default value
                $use_config_enable_qty_inc = " ";                       //Default value
                $is_decimal_divided = " ";                              //Default value
                $_links_related_sku = " ";                              //Default value
                $_links_related_position = " ";                         //Default value
                $_links_crosssell_sku = " ";                            //Default value
                $_links_crosssell_position = " ";                       //Default value
                $_links_upsell_sku = " ";                               //Default value
                $_links_upsell_position = " ";                          //Default value
                $_associated_sku = "0";                                 //Default value
                $_associated_default_qty = " ";                         //Default value
                $_associated_position = "0";                            //Default value
                $_tier_price_website = " ";                             //Default value
                $_tier_price_customer_group = " ";                      //Default value
                $_tier_price_qty = " ";                                 //Default value
                $_tier_price_price = " ";                               //Default value
                $_group_price_website = " ";                            //Default value
                $_group_price_customer_group = " ";                     //Default value
                $_group_price_price = " ";                              //Default value
                $_media_attribute_id = " ";                             //Default value
                $_media_image = " ";                                    //Default value
                $_media_label = " ";                                    //Default value
                $_media_position = " ";                                 //Default value
                $_media_is_disabled = " ";                              //Default value
                $_custom_option_store = " ";                            //Default value
                $_custom_option_type = " ";                             //Default value
                $_custom_option_title = " ";                            //Default value
                $_custom_option_is_required = " ";                      //Default value
                $_custom_option_price = " ";                            //Default value
                $_custom_option_sku = " ";                              //Default value
                $_custom_option_max_characters = " ";                   //Default value
                $_custom_option_sort_order = " ";                       //Default value
                $_custom_option_row_title = " ";                        //Default value
                $_custom_option_row_price = " ";                        //Default value
                $_custom_option_row_sku = " ";                          //Default value
                $_custom_option_row_sort = " ";                         //Default value
                $enable_config_enable_qty_inc = " ";                    //Default value
                $enable_qty_inc = " ";                                  //Default value
    
                //Append data to CSV file
                $csv_details .= $sku . "," . $title . "," . $_store . "," . $_attribute_set . "," . $_type . "," . $_category . "," . $_root_category . "," . $_product_websites . ","
                    . $color . "," . $cost . "," . $country_of_manufacture . "," . $created_at . "," . $custom_design . "," . $custom_design_from . "," . $custom_design_to . ","
                    . "," . $custom_layout_update . "," . $description . "," . $gallery . "," . $gift_message_available . "," . $has_options . "," . $image . "," . $image_label . ","
                    . $manufacturer . "," . $media_gallery . "," . $meta_description . "," . $meta_keyword . "," . $meta_title . "," . $minimal_price . "," . $msrp . "," .
                    $msrp_display_actual_price_type . "," . $msrp_enabled . "," . $name . "," . $news_from_date . "," . $news_to_date . "," . $options_container . "," .
                    $page_layout . "," . $price . "," . $required_options . "," . $short_description . "," . $small_image . "," . $small_image_label . "," . $special_from_date
                    . "," . $special_price . "," . $special_to_date . "," . $status . "," . $tax_class_id . "," . $thumbnail . "," . $thumbnail_label . "," . $updated_at . "," .
                    $url_key . "," . $url_path . "," . $visibility . "," . $weight . "," . $qty . "," . $min_qty . "," . $use_config_min_qty . "," . $is_qty_decimal . "," .
                    $backorders . "," . $use_config_backorders . "," . $min_sale_qty . "," . $use_config_min_sale_qty . "," . $max_sale_qty . "," . $use_config_max_sale_qty
                    . "," . $is_in_stock . "," . $notify_stock_qty . "," . $use_config_notify_stock_qty . "," . $manage_stock . "," . $use_config_manage_stock . "," .
                    $stock_status_changed_auto . "," . $use_config_qty_increments . "," . $qty_increments . "," . $use_config_enable_qty_inc . "," . $qty_increments . "," .
                    $use_config_qty_increments . "," . "," . $is_decimal_divided . "," . $_links_related_sku . "," . $_links_related_position . "," .
                    $_links_crosssell_position . "," . $_links_crosssell_sku . "," . $_links_crosssell_position . "," . $_links_upsell_sku . "," . $_links_upsell_position
                    . "," . $_associated_sku . "," . $_associated_default_qty . "," . $_associated_position . "," . $_tier_price_website . "," . $_tier_price_customer_group
                    . "," . $_tier_price_qty . "," . $_tier_price_price . "," . $_group_price_website . "," . $_group_price_customer_group . "," . $_group_price_price . "," .
                    $_media_attribute_id . "," . $_media_image . "," . $_media_label . "," . $_media_position . "," . $_media_is_disabled . "," . $_custom_option_store . "," .
                    $_custom_option_type . "," . $_custom_option_title . "," . $_custom_option_is_required . "," . $_custom_option_price . "," . $_custom_option_sku . "," .
                    $_custom_option_max_characters . "," . $_custom_option_sort_order . "," . $_custom_option_row_sort . "," . $_custom_option_row_title . "," .
                    $_custom_option_row_price . "," . $_custom_option_row_sku . "," . $_custom_option_row_sort . "," . $enable_config_enable_qty_inc . "," . $enable_qty_inc . "\r\n";
    
    
                fwrite($file_handle, $csv_details);
    
                //move to next product category page
                $i++;
            }
    
        }
    }
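
    A side note on the CSV step above: joining fields by hand with "," breaks as soon as a title or description contains a comma or a quote, and a stray separator or a repeated variable in the long concatenation shifts every later column without any error. A minimal sketch using PHP's built-in fputcsv(), assuming the fields are first collected in an array keyed by column name. The four columns, the sample values and the in-memory stream are made up for illustration and are not the actual import header.

    ```php
    <?php
    // Hypothetical subset of the columns; the real import file has many more.
    $header = ['sku', 'name', 'price', 'description'];

    // Made-up sample row. Note the comma and the quotes in the description.
    $row = [
        'sku'         => 'PA1013129',
        'name'        => 'USB flash drive',
        'price'       => '9.80',
        'description' => 'Small, fast and "cheap"',
    ];

    // In-memory stream for the demo; a real script would fopen() the CSV file.
    $file_handle = fopen('php://memory', 'w+');

    fputcsv($file_handle, $header, ',', '"', '\\');

    // Emit the fields in header order so a missing field cannot shift columns.
    $ordered = [];
    foreach ($header as $column) {
        $ordered[] = $row[$column] ?? '';
    }
    fputcsv($file_handle, $ordered, ',', '"', '\\');

    rewind($file_handle);
    $csv = stream_get_contents($file_handle);
    fclose($file_handle);

    echo $csv;
    ```

    Writing one row per product this way also avoids appending to a growing string and writing the whole string again on every pass of the loop.
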
  5. <div class="price-box">
       <span class="regular-price" id="product-price-13129" itemtype="http://schema.org/Offer" itemscope="" itemprop="offers">
           <span class="price" itemprop="price">$9.80</span> 
       </span>
    </div>
    

    Hi all,

    I am trying to retrieve the price value ($9.80) from the HTML above with an XPath query, but I can't get the query right. Here is what I have so far:

    $items['price'] = $xpath->query("//div[@class='price-box']//span/[@class='regular-price']//span/[@class='price']")->item(0)->textContent; //Get and Set price
    

    Any help or direction would be greatly appreciated.
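
    For reference, in XPath a predicate attaches directly to the element name (span[@class='price']); a slash between the two, as in span/[@class='price'], is a syntax error. A minimal sketch of the corrected query, assuming the markup is loaded into a DOMDocument. The HTML is inlined and slightly trimmed from the snippet above so the example runs on its own.

    ```php
    <?php
    // Markup from the post (microdata attributes trimmed), inlined for the demo.
    $markup = '<div class="price-box">
        <span class="regular-price" id="product-price-13129">
            <span class="price" itemprop="price">$9.80</span>
        </span>
    </div>';

    // Collect parser warnings instead of printing them.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($markup);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);

    // The predicate follows the element name directly, with no slash in between.
    $nodes = $xpath->query(
        "//div[@class='price-box']//span[@class='regular-price']//span[@class='price']"
    );

    $items = [];
    // item(0) is null when nothing matches, so check before reading textContent.
    $items['price'] = $nodes->length > 0 ? trim($nodes->item(0)->textContent) : null;

    echo $items['price'], "\n";
    ```

    Note that @class='price' is an exact string comparison, so it only matches when the class attribute contains nothing else.
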

     

     

  6. Hi folks,

     

    I'm trying to access the elements in the following page:

     

    http://www.vopmart.com/pa1013129.html

     

    I'm looking to access the div class="main" element and then retrieve each of the elements within it. I've completed similar tasks in the past, but I just can't seem to retrieve anything from this page. Any help would be greatly appreciated.

     

    My code is

     //looping variable
        $i = 0;
    
        while ($i <= count($product_link_list)) { //loop through each of the product details pages and scrape data {
    
            $html = new DOMDocument();
    
            //Load DOM of individual product page
            $html->loadHTMLFile($product_link_list[$i]);
    
            foreach($html->find('div.main')as $node) {
    
                // Find all images
                foreach ($node->find('img') as $element) {
                    echo $element->src . '<br>';
                }
                // Find all links
                foreach ($node->find('a') as $element) {
                    echo $element->href . '<br>';
                }
            //looping variable
            $j = 0;
    }
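
    A note on the code above: find() belongs to the third-party Simple HTML DOM library, not to PHP's built-in DOMDocument, so $html->find('div.main') fails with a call to an undefined method. The condition $i <= count(...) also runs one index past the end of the array. A minimal sketch of the same idea using only DOMDocument and DOMXPath, assuming the target is a div whose class list includes "main". The inline HTML is a made-up stand-in for the product page.

    ```php
    <?php
    // Made-up stand-in for a product page; a real script would load each URL.
    $pages = [
        '<html><body><div class="main">
            <a href="/pa1013129.html"><img src="/img/a.jpg"></a>
            <a href="/cart">Cart</a>
         </div><div class="footer"><img src="/img/logo.png"></div></body></html>',
    ];

    $images = [];
    $links  = [];

    // Collect parser warnings instead of printing them.
    libxml_use_internal_errors(true);

    // foreach avoids the off-by-one of "while ($i <= count(...))".
    foreach ($pages as $page) {
        $html = new DOMDocument();
        $html->loadHTML($page); // use loadHTMLFile($url) for a remote page
        $xpath = new DOMXPath($html);

        // Match "main" as one of possibly several class names.
        $mainQuery = "//div[contains(concat(' ', normalize-space(@class), ' '), ' main ')]";

        foreach ($xpath->query($mainQuery) as $node) {
            // The leading "." keeps each search inside the current div.
            foreach ($xpath->query('.//img', $node) as $img) {
                $images[] = $img->getAttribute('src');
            }
            foreach ($xpath->query('.//a', $node) as $a) {
                $links[] = $a->getAttribute('href');
            }
        }
    }
    libxml_clear_errors();

    print_r($images);
    print_r($links);
    ```

    The image inside the footer div is skipped because both inner queries are evaluated relative to the matched div.
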
    