Combining XML files


Mario_Party

Hi everyone, I am downloading product catalogues as XML files and I want to combine some of them into a single file.  I've put together something that almost works, but not quite.

The files do get combined, but at the very end of the combined file the final product is missing a few elements, which are quite crucial.

The variables passed in are not important, apart from $a, which is an array containing the names of the files being combined (the name only, not the full path).

 

function save_catalog($mid, $a, $message)
{
  global $xml_feeds_directory;
  global $xml_output_directory;
  global $xml_prefix;
  global $xml_file_ext;
  
  $message_extension = "";
  $output_file_name = $xml_prefix . $mid . "." . $xml_file_ext;
  $full_output_file_name = $xml_output_directory . "/" . $output_file_name;
  
  // Create the XML header
  $xml2 = xml_header($full_output_file_name);

  foreach($a as $file)
  {        
    $xml_full_file_name = $xml_feeds_directory . "/" . $file;
    
    /* Loop through the products */
    // Load the XML file
    $xml = xml_load($xml_full_file_name);
    
    while($xml->read())
    {
      switch($xml->nodeType)
      {
        case XMLReader::ELEMENT:
        // If we encounter a new product node
        if($xml->localName == "product")
        {
          // Start a new product in the XML
          $xml2->startElement("product");
          $xml2->setIndent(false); 

            // Loop through all nodes within the node and extract data as an array (and fill the $product_array)
            $xml_array = xml_node_to_array($xml);
            // Output the array to the file
            $xml2 = xml_array_to_file($xml_array, $xml2);
                      
          // End the product element "product"
          $xml2->setIndent(true);
          $xml2->endElement();
        }        
        break;
      }
    }

    // Debug: dump the last product array read from this file
    var_dump($xml_array);
    echo "<br />\r\n";
    // Close the XML, ready for reload
    $xml->close();
  }
  
  // Write the end 
  $xml2 = xml_footer($xml2);
  
  $message_extension = format_combined_array($a);
  
  $message .= $message_extension; 
  
  // Save the XML
  xml_save($output_file_name, $full_output_file_name, $message);
}

 

The two functions below write the header and footer:

 

function xml_footer($xml2)
{
  // Close the catalog element
  $xml2->endElement();
  // End the document
  $xml2->endDocument();
  
  return $xml2;
}

function xml_header($full_output_file_name)
{
  // Open the XML file for writing to
  $xml2 = new XMLWriter();
  
  $xml2->openURI($full_output_file_name . ".tmp");
  /* Write the header */ 
  $xml2->startDocument("1.0", "UTF-8");
  $xml2->writeDTD("product_catalog", NULL, "http://www.jdoqocy.com/content/dtd/product_catalog_1_1.dtd");
  $xml2->setIndent(true);
  $xml2->startElement("catalog");
  
  return $xml2;
}

 

The function below reads from the XMLReader and returns an array describing the children of the current node:

 

function xml_node_to_array($xml)
{
  $assoc = NULL;
  $n = 0; 
  while($xml->read())
  { 
    if($xml->nodeType == XMLReader::END_ELEMENT) break; 
    if($xml->nodeType == XMLReader::ELEMENT and !$xml->isEmptyElement)
    { 
      $assoc[$n]['name'] = $xml->name; 
      if($xml->hasAttributes) while($xml->moveToNextAttribute()) $assoc[$n]['atr'][$xml->name] = $xml->value;
      $assoc[$n]['val'] = xml_node_to_array($xml); 
      $n++; 
    } 
    else if($xml->isEmptyElement)
    { 
      $assoc[$n]['name'] = $xml->name; 
      if($xml->hasAttributes) while($xml->moveToNextAttribute()) $assoc[$n]['atr'][$xml->name] = $xml->value;
      $assoc[$n]['val'] = ""; 
      $n++;                
    } 
    else if($xml->nodeType == XMLReader::TEXT) $assoc = $xml->value; 
  }
   
  return $assoc;
}
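
To make the array shape concrete, here is a small sketch that runs the function on an inline sample. The sample XML is made up for illustration, and the function body is copied from above only so that the snippet runs on its own.

```php
<?php
// Copy of xml_node_to_array() from the post, so this snippet is self-contained.
function xml_node_to_array($xml)
{
  $assoc = NULL;
  $n = 0;
  while($xml->read())
  {
    if($xml->nodeType == XMLReader::END_ELEMENT) break;
    if($xml->nodeType == XMLReader::ELEMENT and !$xml->isEmptyElement)
    {
      $assoc[$n]['name'] = $xml->name;
      if($xml->hasAttributes) while($xml->moveToNextAttribute()) $assoc[$n]['atr'][$xml->name] = $xml->value;
      $assoc[$n]['val'] = xml_node_to_array($xml);
      $n++;
    }
    else if($xml->isEmptyElement)
    {
      $assoc[$n]['name'] = $xml->name;
      if($xml->hasAttributes) while($xml->moveToNextAttribute()) $assoc[$n]['atr'][$xml->name] = $xml->value;
      $assoc[$n]['val'] = "";
      $n++;
    }
    else if($xml->nodeType == XMLReader::TEXT) $assoc = $xml->value;
  }
  return $assoc;
}

// Made-up sample, not taken from a real feed.
$sample = '<product><name>Widget</name><category/><ID><mid>1</mid></ID></product>';

$xml = new XMLReader();
$xml->XML($sample);
$xml->read();                        // position the reader on <product>
$product = xml_node_to_array($xml);  // consume everything up to </product>
$xml->close();

print_r($product);
```

Each child element becomes one entry with a 'name' and a 'val'. For a leaf such as name, 'val' is the text; for an empty element such as category it is an empty string; for an element with children such as ID it is a nested list of the same kind of entries.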

 

And this function takes the array and writes it to the output file:

 

function xml_array_to_file($xml_array, $xml2)
{
  global $log_file;
  global $product_array;
  global $price_array;

  if(is_array($xml_array))
  {
    foreach($xml_array as $array1 => $array2)
    {
      // Start the XML element "name" 
      $xml2->startElement((string)$array2["name"]);
      // If the "name" element has the value of an array 
      if(is_array($array2["val"]))
      {
        // Extract each "name" and "val" pair from the array
        // var_dump($array2["val"]) . "\r\n<br />";
        $xml2 = xml_array_to_file($array2["val"], $xml2);
      }
      else
      {
        // Otherwise output the "val" as is
        $xml2->text((string)($array2["val"]));
      }
      // End the XML element
      $xml2->endElement();
    } // end foreach
  }
  else
  {
    flog($log_file, "Failed to load the array " . $xml_array);
    exit("Failed to load the array " . $xml_array . ".");        
  }
  
  return $xml2;
}
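
One side note: xml_node_to_array stores attributes under 'atr', but the function above never writes them back out. The sample products here have no attributes, so this may not matter for these feeds. If it ever does, a rough sketch of a variant that also writes attributes could look like this. The name xml_array_to_writer and the in-memory writer are only for illustration and are not part of my actual code.

```php
<?php
// Hypothetical variant of xml_array_to_file() that also emits attributes.
function xml_array_to_writer($xml_array, XMLWriter $xml2)
{
  foreach($xml_array as $node)
  {
    $xml2->startElement((string)$node['name']);

    // Attributes have to be written before any text or child elements.
    if(isset($node['atr']))
    {
      foreach($node['atr'] as $attr_name => $attr_value)
      {
        $xml2->writeAttribute($attr_name, $attr_value);
      }
    }

    if(is_array($node['val']))
    {
      // Nested children: recurse with the same writer
      xml_array_to_writer($node['val'], $xml2);
    }
    else
    {
      // Leaf: write the value as text
      $xml2->text((string)$node['val']);
    }

    $xml2->endElement();
  }
}

// Made-up input in the shape produced by xml_node_to_array().
$product = array(
  array('name' => 'price', 'atr' => array('currency' => 'USD'), 'val' => '20.00'),
  array('name' => 'ID', 'val' => array(
    array('name' => 'mid', 'val' => '1359820'),
  )),
);

// Write to memory instead of a file so the result can be inspected directly.
$xml2 = new XMLWriter();
$xml2->openMemory();
$xml2->startElement('product');
xml_array_to_writer($product, $xml2);
$xml2->endElement();
$out = $xml2->outputMemory();

echo $out, "\n";
```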

 

The XML files being combined are opened with the function below:

 

// Load the XML file 
function xml_load($xml_full_file_name)
{
  // global $xml_full_file_name;
  global $log_file;

  // Load the XML file
  $xml = new XMLReader();
  if(!$xml->open($xml_full_file_name))
  {
    flog($log_file, "Failed to open " . $xml_full_file_name . "\r\n\r\n");
    exit("Failed to open <b>" . $xml_full_file_name . "</b><br /><br />");    
  }
  return $xml;
}

 

The xml_save function is not important; it just renames the temporary XML file that was written to its final name.

 

Here is the end of an XML file being produced:

 

<product><programname>Adobe</programname><programurl>http://www.adobe.com</programurl><catalogname>NA Volume Store Product Catalog</catalogname><lastupdated>23/01/2011</lastupdated><name>FrameMaker Server 10 - Back-up CD/DVD</name><keywords>FrameMaker Server 10 - License</keywords><description>A physical CD or DVD for installation or back-up. ** Requires applicable license serial number in order to activate **</description><sku>FrameMaker Server 10 - Back-up CD/DVD</sku><currency>USD</currency><price>20.00</price><buyurl>(edited out)</buyurl><impressionurl>http://www.ftjcfx.com/image-4376401-10674112</impressionurl><imageurl>http://drh.img.digitalriver.com/DRHM/Storefront/Company/adbevlus/images/product/detail/Framemaker_server_10_150x150.jpg</imageurl><instock>YES</instock><ID><mid>1359820</mid><pid>http%3A%2F%2Fstore.digitalriver.com%2Fstore%3FAction%3DDisplayProductDetailsPage%26Locale%3Den_US%26SiteID%3Dadbevlus%26productID%3D223503300</pid><category></category><minprice>20.00</minprice><partner>cj</partner></ID> </product>
<product><programname>Adobe</programname><programurl>http://www.adobe.com</programurl><catalogname>NA Volume Store Product Catalog</catalogname><lastupdated>24/04/2011</lastupdated><name>Total Training™ Online: Adobe® Acrobat Pro Library</name><keywords>Total Training, training, online, library, acrobat, train, education, learn</keywords><description>Learn everything you need to know for Acrobat 7,8, 9 and X Pro!</description><sku>Total Training™ Online: Adobe® Acrobat Pro Library</sku><currency>USD</currency><price>99.00</price><buyurl></buyurl> </product>

 

As you can see, the final product is missing quite a few elements, especially the ID element, which is crucial.  Strangely, the buyurl is blank for the last product even though it is not blank in the file it is read from.
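
One thing that might be worth ruling out is a parse error part-way through an input file. XMLReader::read() returns false on an error as well as at the end of the document, so both read loops above would simply stop and the product being read would be written with whatever had been collected so far. I have not confirmed that this is the cause; the sketch below is only a way to check. The helper name xml_collect_read_errors and the sample string are made up for illustration.

```php
<?php
// Hypothetical helper: reads an XML string to the end with XMLReader and
// returns any libxml parse errors as readable messages.
function xml_collect_read_errors($xml_string)
{
  // Collect libxml errors internally instead of emitting PHP warnings.
  $previous = libxml_use_internal_errors(true);
  libxml_clear_errors();

  $xml = new XMLReader();
  $xml->XML($xml_string);

  // read() returns false both at the end of input and on a parse error,
  // so the loop alone cannot tell the two cases apart.
  // The @ only hides PHP's generic read warning; details come from libxml.
  while(@$xml->read())
  {
    // nothing to do, we only want to reach the end or the first error
  }
  $xml->close();

  $messages = array();
  foreach(libxml_get_errors() as $error)
  {
    $messages[] = "line " . $error->line . ": " . trim($error->message);
  }

  // Restore the previous error handling state.
  libxml_clear_errors();
  libxml_use_internal_errors($previous);

  return $messages;
}

// Made-up sample: the unescaped "&" in the URL is not well-formed XML.
$bad = '<catalog><product><buyurl>http://example.com/?a=1&b=2</buyurl></product></catalog>';
foreach(xml_collect_read_errors($bad) as $message)
{
  echo $message, "\n";
}
```

The same check could be applied to a real file by calling open() with the file path instead of XML(). An empty result would suggest the reader got through the whole file and the problem is elsewhere.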

 

If you need any more information, let me know, thanks.

 
