



XHTML + PHP -- Pulling In XHTML Data


Good evening.


I am trying to work with a page that is written in XHTML -- http://us.battle.net/wow/en/character/bleeding-hollow/chewables/advanced


What I am trying to do is pull the entire page and put it into something that I can pull data out of.


I thought of pulling the webpage via cURL and then converting the XHTML to an array with something like this:


function xmlToArray($xmlData, $includeTopTag = false, $lowerCaseTags = true) {

	$xmlArray = array();

	// Parse the XML into a flat list of open/complete/close events.
	$parser = xml_parser_create();
	xml_parse_into_struct($parser, $xmlData, $vals, $index);

	$temp = $depth = array();

	foreach ($vals as $value) {

		switch ($value['type']) {

		case 'open':
		case 'complete':
			// Build a '::'-joined path for the current element.
			array_push($depth, $value['tag']);
			$p = join('::', $depth);
			if ($lowerCaseTags) {
				$p = strtolower($p);
				if (isset($value['attributes']) && is_array($value['attributes']))
					$value['attributes'] = array_change_key_case($value['attributes']);
			}
			$data = ( isset($value['attributes']) ? array($value['attributes']) : array());
			$data = ( isset($value['value']) && trim($value['value']) !== '' ? array_merge($data, array($value['value'])) : $data);
			if (isset($temp[$p])) $temp[$p] = array_merge($temp[$p], $data);
			else $temp[$p] = $data;
			if ($value['type']=='complete') array_pop($depth);
			break;

		case 'close':
			// Leaving an element: step back up one level.
			array_pop($depth);
			break;
		}
	}

	// Only removes an entry keyed "page", i.e. this assumes the root element is <page>.
	if (!$includeTopTag) unset($temp["page"]);

	// Expand each '::' path into nested array levels.
	foreach ($temp as $key => $value) {

		if (count($value)==1) { $value = reset($value); }

		$levels = explode('::', $key);
		$num_levels = count($levels);

		if ($num_levels==1) {
			$xmlArray[$levels[0]] = $value;
		} else {
			$pointer = &$xmlArray;
			for ($i=0; $i<$num_levels; $i++) {
				if ( !isset( $pointer[$levels[$i]] ) ) {
					$pointer[$levels[$i]] = array();
				}
				$pointer = &$pointer[$levels[$i]];
			}
			$pointer = $value;
			unset($pointer); // break the reference before the next iteration
		}
	}

	return ($includeTopTag ? $xmlArray : reset($xmlArray));
}


This, however, is geared toward parsing XML, not XHTML.  Any advice on how to work through the XHTML and put the data into something structured, like an array, so that I can pull data out of it?
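To make the goal concrete, here is a rough sketch of the kind of structure I have in mind. It assumes PHP's DOM extension is available and uses DOMDocument::loadHTML(), which parses HTML and XHTML leniently. The helper name domToArray and the sample markup are made up for illustration; they are not taken from the real page.

```php
<?php
// Hypothetical helper: convert a DOM node into a nested array holding
// the tag name, its attributes, its direct text, and its child elements.
function domToArray(DOMNode $node)
{
    $out = array(
        'tag'        => $node->nodeName,
        'attributes' => array(),
        'text'       => '',
        'children'   => array(),
    );

    if ($node->hasAttributes()) {
        foreach ($node->attributes as $attr) {
            $out['attributes'][$attr->nodeName] = $attr->nodeValue;
        }
    }

    foreach ($node->childNodes as $child) {
        if ($child->nodeType === XML_ELEMENT_NODE) {
            $out['children'][] = domToArray($child);      // recurse into elements
        } elseif ($child->nodeType === XML_TEXT_NODE) {
            $out['text'] .= trim($child->nodeValue);      // keep only direct text
        }
    }

    return $out;
}

// Made-up sample markup standing in for the fetched page.
$sample = '<html><body><div id="stats"><span class="str">120</span></div></body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // collect markup warnings instead of printing them
$doc->loadHTML($sample);
libxml_clear_errors();

$tree = domToArray($doc->documentElement);   // root is <html>
$div  = $tree['children'][0]['children'][0]; // <html> -> <body> -> <div>
```

With something like this, each element would be reachable by walking the 'children' arrays, at the cost of a fairly deep structure for a full page.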


I know about XPath, but the code needs to be XML in order for that to work (at least that is my understanding).  If there is a way to use XPath with XHTML, please let me know.
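From what I can tell, XPath may not strictly require well-formed XML in PHP: DOMDocument::loadHTML() accepts HTML/XHTML, and DOMXPath can then query the resulting document. A minimal sketch, assuming an inline sample in place of the page fetched over cURL; the class names below are hypothetical and are not the real battle.net markup.

```php
<?php
// Made-up XHTML sample; in practice this string would be the cURL response body.
$html = <<<HTML
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Character</title></head>
<body>
<div class="profile"><span class="name">Chewables</span><span class="level">85</span></div>
</body>
</html>
HTML;

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // collect markup warnings instead of printing them
$doc->loadHTML($html);              // lenient HTML parser, tolerates sloppy markup
libxml_clear_errors();

// Query the parsed document with ordinary XPath expressions.
$xpath = new DOMXPath($doc);
$name  = $xpath->query('//span[@class="name"]')->item(0)->textContent;
$level = $xpath->query('//span[@class="level"]')->item(0)->textContent;
```

Because loadHTML() goes through the HTML parser, the XHTML namespace does not seem to be applied to element names, so plain paths such as //span work without registering a namespace prefix.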


