
Analysing XML files using PHP


surion


Hi,

At the moment I'm trying to write a script that generates an XSL file to transform one XML file into another XML file.

To generate the XSL file I first need to analyse the XML file I receive. For this analysis I use the following script, which I found on php.net:

 

// Shared state filled in by the element handlers below.
$depth = 0;
$elements = array();

function readxmlfile($file) {
  $xml_parser = xml_parser_create();
  xml_set_element_handler($xml_parser, "startElement", "endElement");
  if (!($fp = fopen($file, "r"))) {
    die("could not open XML input");
  }

  // Feed the file to the parser in 4 KB chunks.
  while ($data = fread($fp, 4096)) {
    if (!xml_parse($xml_parser, $data, feof($fp))) {
      die(sprintf("XML error: %s at line %d",
        xml_error_string(xml_get_error_code($xml_parser)),
        xml_get_current_line_number($xml_parser)));
    }
  }

  fclose($fp);
  xml_parser_free($xml_parser);
}

function startElement($parser, $name, $attrs) {
  global $depth;
  global $elements;

  echo "$name--$depth<br />\n";
  // Record each element name with its nesting depth.
  $elements[] = array('name' => $name, 'depth' => $depth);
  $depth++;
}

function endElement($parser, $name) {
  global $depth;
  $depth--;
}

 

It generates an array containing the names of the elements and their depths in the XML file, which is the data I need for an analysis of the XML file. It works perfectly, THOUGH I have one little problem: the element names I get are ALWAYS uppercase, which is not good since XSL is case sensitive.

For example, if an element in my input XML is called "CaTaLoG", I still get "CATALOG". Any idea how to solve this problem?

 

Of course I could scan through the XML file before analysing it and perform some string replaces to make sure all tags are uppercase, but that would hurt performance on large XML files AND my output format shouldn't be uppercase only, so that's a bad solution :P Any other suggestions?
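
One thing that may be worth trying, as a minimal sketch only: PHP's expat-based parser has case folding enabled by default, which is what upper-cases the names, and xml_parser_set_option() with XML_OPTION_CASE_FOLDING set to 0 switches it off. The helper name collectElements() and the inline sample XML are made up for illustration; the handlers are written as closures instead of globals, which is a choice of this sketch and not the php.net script.

```php
<?php
// Sketch: collect element names and depths with case folding disabled.
// collectElements() is a hypothetical helper, not part of the original script.
function collectElements(string $xml): array
{
    $elements = [];
    $depth = 0;

    $parser = xml_parser_create();
    // Case folding is on by default and upper-cases every element name.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    xml_set_element_handler(
        $parser,
        // Start handler: record the name and current depth, then descend.
        function ($parser, $name, $attrs) use (&$elements, &$depth) {
            $elements[] = ['name' => $name, 'depth' => $depth];
            $depth++;
        },
        // End handler: step back up one level.
        function ($parser, $name) use (&$depth) {
            $depth--;
        }
    );

    if (!xml_parse($parser, $xml, true)) {
        throw new RuntimeException(sprintf(
            'XML error: %s at line %d',
            xml_error_string(xml_get_error_code($parser)),
            xml_get_current_line_number($parser)
        ));
    }

    return $elements;
}

// Mixed-case sample input (an assumption for this example).
$elements = collectElements('<CaTaLoG><Item/></CaTaLoG>');
foreach ($elements as $element) {
    echo $element['name'], '--', $element['depth'], "\n";
}
```

With folding off, the names should come back exactly as they are written in the input, so no string replacing on the file is needed.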

https://forums.phpfreaks.com/topic/82204-analysing-xml-files-using-php/

Aha, very interesting suggestion mate, I've tested it a little:

 

data.xml:

<?xml version="1.0" encoding="utf-8"?>
<Properties>
<Property>
	<Name>some data</Name>
	<Beds>some data</Beds>
</Property>
<Property>
	<Name>some more data</Name>
	<Beds>some more data</Beds>
</Property>
<Property>
	<Name>again some more data</Name>
	<Beds>again some more data</Beds>
</Property>
</Properties>

 

my phpscript:

$dom = new domDocument;
$dom->load('data.xml');
$sx = simplexml_import_dom($dom);
showdata($sx);

function showdata($sx) {
foreach((array) $sx as $tagname => $val) {
	if (is_string($val)) {
		echo $tagname."<br />"; 
	}
	elseif (is_array($val)) {
		echo $tagname."<br />";
		showdata($val);
	} 
	elseif (is_object($val)) {
		echo $tagname."<br />";
		showdata($val);
	}
}
}

 

the output:

Property
0
Name
Beds
1
Name
Beds
2
Name
Beds

 

Looks sweet, upper & lower cases are perfect now, BUT

-what happened to my Root element Properties?

-why does Property get only listed once?

-where do those numbers come from?

 

Anyway, I love the fact that I can use recursion now. One step closer again.

-what happened to my Root element Properties?

 

Good question, I never wanted to retrieve the root element before.
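
For what it's worth, the root is not lost, it is the SimpleXMLElement object itself; casting it to an array only exposes its children. A rough sketch of one way to get every name including the root, using SimpleXMLElement::getName() and children(): listElements() is a hypothetical helper and the inline XML is a shortened stand-in for data.xml.

```php
<?php
// listElements() is a hypothetical helper, not part of the original script.
// It returns one row per element, root included, in document order.
function listElements(SimpleXMLElement $node, int $depth = 0): array
{
    // getName() returns the tag name with its original case.
    $rows = [['name' => $node->getName(), 'depth' => $depth]];

    // children() yields every child element, including repeated siblings.
    foreach ($node->children() as $child) {
        $rows = array_merge($rows, listElements($child, $depth + 1));
    }

    return $rows;
}

// Shortened stand-in for data.xml (an assumption for this example).
$sx = simplexml_load_string(
    '<Properties>'
    . '<Property><Name>a</Name><Beds>2</Beds></Property>'
    . '<Property><Name>b</Name><Beds>3</Beds></Property>'
    . '</Properties>'
);

$rows = listElements($sx);
foreach ($rows as $row) {
    // Indent by depth so the nesting is visible.
    echo str_repeat('  ', $row['depth']), $row['name'], "\n";
}
```

Walking the objects directly instead of casting to an array also means each Property shows up as its own row, and no numeric keys appear in the listing.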

 

-why does Property get only listed once?

 

Because Property is now an array key.

 

-where do those numbers come from?

They're array keys too. Imagine this:  $Properties = array($property => array([0] => array('name' => 'somedata'...
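
A small sketch that makes the array shape visible, assuming a cut-down version of data.xml built inline: the (array) cast keeps only the root's children and groups same-named siblings under one key, so the tag name appears once and the siblings sit under numeric indexes.

```php
<?php
// Cut-down stand-in for data.xml (an assumption for this example).
$sx = simplexml_load_string(
    '<Properties>'
    . '<Property><Name>a</Name></Property>'
    . '<Property><Name>b</Name></Property>'
    . '</Properties>'
);

// The cast exposes the children of the root, not the root element itself.
$asArray = (array) $sx;

// Distinct child tag names: repeated siblings share a single key.
$keys = array_keys($asArray);

// Positions of the repeated siblings under that key.
$indexes = array_keys($asArray['Property']);

print_r($keys);
print_r($indexes);
```

Those numeric indexes are the numbers that show up in the output when the recursive function echoes every key it meets.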
