
Problems with using DomDocument to copy forms... how to fix?


physaux


Hello everyone, I am using DOMDocument to find <form>...</form> elements, and my goal is to extract everything between the form tags. Here is my code so far:

function gettag($pagedata, $tag){
$wasinloop = false;
$foundgoodvalue = false;
$dom = new DOMDocument;
libxml_use_internal_errors(true);
@$dom->loadHTML($pagedata);
libxml_use_internal_errors(false);
$xpath = new DOMXPath($dom);
$aTag = $xpath->query('//'.$tag.'');
$finaloutput = array();
foreach($aTag as $url){
	$wasinloop = true;
	$nodevalue = $url->nodeValue;
	if($nodevalue == "" or $nodevalue == NULL){
		$nodevalue = "";
	}else{
		$foundgoodvalue = true;	
	}
	echo "[found a '".$tag."']";
	$finaloutput[] = array('content' => $nodevalue);
}
if($wasinloop == false or $foundgoodvalue == false){
	$finaloutput[0]['content'] = "";	
}
return $finaloutput;	
}

 

So as you can see, $nodevalue should hold everything between the form tags, which it does... kind of. What I get back is not the HTML that was between the tags, only the text content. $nodevalue does not keep the HTML structure, so the <input> tags and the other child elements disappear. I want to keep all of it; I just want to extract what is inside the form tags, but this function strips out the HTML.

 

Could anyone advise me how to fix this? Should I just use preg_match? That would get confusing, because the form's markup changes sometimes. I thought DOMDocument should be able to do exactly this?
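
For reference, here is a rough sketch of one way the markup could be kept, assuming PHP 5.3.6 or later, where DOMDocument::saveHTML() accepts a node argument. Instead of reading nodeValue (which returns only the text content), it serializes each child node of the matched element. The function name getTagInnerHtml and the sample $html string are made up for illustration, and the sketch assumes $pagedata is a non-empty string.

```php
<?php
// Hypothetical helper (not part of the original code): returns the inner HTML
// of every element matching $tag, keeping child tags such as <input>.
function getTagInnerHtml(string $pagedata, string $tag): array
{
    $dom = new DOMDocument();

    // Collect libxml warnings from malformed HTML instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($pagedata); // assumes $pagedata is not empty
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $results = [];

    foreach ($xpath->query('//' . $tag) as $node) {
        $inner = '';
        foreach ($node->childNodes as $child) {
            // saveHTML($child) serializes the child together with its markup,
            // unlike nodeValue, which flattens everything to text.
            $inner .= $dom->saveHTML($child);
        }
        $results[] = ['content' => $inner];
    }

    return $results;
}

// Made-up sample input for illustration.
$html = '<html><body><form action="/login">'
      . '<input type="text" name="user">'
      . '<input type="submit" value="Go">'
      . '</form></body></html>';

$forms = getTagInnerHtml($html, 'form');
echo $forms[0]['content'], "\n";
```

If the <form> tag itself should be included as well, calling $dom->saveHTML($node) on the matched node directly would probably be enough. Note that the parser may normalize the markup slightly (attribute quoting, closing tags), so the output is not guaranteed to be byte-for-byte identical to the source page.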
