Good evening.

 

I am trying to work with a page that is written in XHTML -- http://us.battle.net/wow/en/character/bleeding-hollow/chewables/advanced

 

What I am trying to do is fetch the entire page and load it into a structure that I can pull data out of.

 

I thought of pulling the webpage via cURL and then converting the XHTML to an array with something like this:

 

// Returns by value: returning the result of reset() by reference raises a notice.
function xmlToArray($xmlData, $includeTopTag = false, $lowerCaseTags = true){

	$xmlArray = array();

	$parser = xml_parser_create();
	xml_parse_into_struct($parser, $xmlData, $vals, $index);
	xml_parser_free($parser);

	$temp = $depth = array();

	foreach ($vals as $value) {

		switch ($value['type']) {

		case 'open':
		case 'complete':
			array_push($depth, $value['tag']);
			$p = join('::', $depth);
			if ($lowerCaseTags) {
				$p = strtolower($p);
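				// 'attributes' is only set for tags that actually carry attributes
				if (isset($value['attributes']) && is_array($value['attributes']))
					$value['attributes'] = array_change_key_case($value['attributes']);
			}
			$data = ( isset($value['attributes']) ? array($value['attributes']) : array());
			// 'value' may be missing; compare against '' so that a literal "0" is kept
			$data = ( isset($value['value']) && trim($value['value']) !== '' ? array_merge($data, array($value['value'])) : $data);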
			if (isset($temp[$p])) $temp[$p] = array_merge($temp[$p], $data);
			else $temp[$p] = $data;
			if ($value['type']=='complete') array_pop($depth);
			break;

		case 'close':
			array_pop($depth);
			break;

		}

	}

	if (!$includeTopTag) unset($temp["page"]);

	foreach ($temp as $key => $value) {

		if (count($value)==1) { $value = reset($value); }

		$levels = explode('::', $key);
		$num_levels = count($levels);

		if ($num_levels==1) {
			$xmlArray[$levels[0]] = $value;
		} else {
			$pointer = &$xmlArray;
			for ($i=0; $i<$num_levels; $i++) {
				if ( !isset( $pointer[$levels[$i]] ) ) {
					$pointer[$levels[$i]] = array();
				}
				$pointer = &$pointer[$levels[$i]];
			}
			$pointer = $value;
		}

	}

	return ($includeTopTag ? $xmlArray : reset($xmlArray));

}

 

This, however, is geared toward parsing XML, not XHTML.  Any advice on how to work through the XHTML and put the data into something like an array, so that I can pull data out of something that has some structure?
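
One possible direction, as a minimal sketch rather than a definitive answer: PHP's DOM extension can load real-world (X)HTML with DOMDocument::loadHTML(), which tolerates markup that a strict XML parser would reject, and the resulting tree can then be walked into a nested array. The helper name domToArray(), the array layout ('tag', 'attributes', 'children', 'text') and the sample markup are all made up for illustration; none of it comes from the battle.net page.

```php
<?php
// Hypothetical helper: turn a DOM node into a nested array of
// tag name, attributes, child elements and non-empty text.
function domToArray(DOMNode $node): array
{
    $out = ['tag' => $node->nodeName];

    if ($node->hasAttributes()) {
        foreach ($node->attributes as $attr) {
            $out['attributes'][$attr->name] = $attr->value;
        }
    }

    foreach ($node->childNodes as $child) {
        if ($child instanceof DOMElement) {
            $out['children'][] = domToArray($child);
        } elseif ($child instanceof DOMText && trim($child->nodeValue) !== '') {
            $out['text'][] = trim($child->nodeValue);
        }
    }

    return $out;
}

// Made-up sample markup standing in for the fetched page.
$html = '<html><body><div id="stats">'
      . '<span class="name">Strength</span><span class="value">120</span>'
      . '</div></body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // collect parse warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();

$tree = domToArray($doc->documentElement);
print_r($tree);
```

In practice the $html string would be whatever the cURL request returned. Converting the whole page to an array is often more than is needed; querying the DOM directly tends to be simpler.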

 

I know about XPath, but the markup needs to be XML in order for that to work (at least that is my understanding).  If there is a way to use XPath with XHTML, please let me know.
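
As far as I can tell, XPath does work on (X)HTML once the page is loaded into a DOMDocument, since DOMXPath queries the parsed tree and not the raw text. The sketch below assumes the markup is loaded with loadHTML(); the element names, class names and values are invented for illustration and would need to be replaced with whatever the real page uses. The second list item is deliberately left unclosed to show that the HTML parser copes with it.

```php
<?php
// Made-up sample markup; the second <li> is intentionally not closed.
$html = <<<HTML
<html><body>
<ul id="summary">
  <li class="stat"><span class="name">Strength</span> <span class="value">120</span></li>
  <li class="stat"><span class="name">Agility</span> <span class="value">95</span>
</ul>
</body></html>
HTML;

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // keep parser warnings out of the output
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

$stats = [];
foreach ($xpath->query('//li[@class="stat"]') as $li) {
    // Relative expressions are evaluated against the current <li>.
    $name  = $xpath->evaluate('string(span[@class="name"])', $li);
    $value = $xpath->evaluate('string(span[@class="value"])', $li);
    $stats[$name] = (int) $value;
}

print_r($stats);
```

If the page were loaded with loadXML() instead, the elements would sit in the XHTML namespace, and the queries would need a prefix registered through DOMXPath::registerNamespace(). With loadHTML() the plain element names shown here are enough.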

 

Thanks.

