DOM


The Little Guy

I am trying to read this file into a DOM, but I don't think what I am doing is working.

 

Here is the file I am reading:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"> 
<html xmlns="http://www.w3.org/1999/xhtml"> 
<head> 
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /> 
<title><clp name="title" type="text"></clp></title> 
<link rel="stylesheet" href="style.css" />
</head>
<body>
<a href="main.php" class="logo"><img src="images/logo.png" alt="logo" /></a>
<div id="left">
	<clp construct="multiple">
		<clp name="navigation" type="link" construct="multiple"></clp>
		<clp name="sub navigation" type="link" construct="multiple"></clp>
	</clp>
</div>
<div id="right"><clp name="content" type="textarea"></clp></div>
</body>
</html>

 

And here is the code that is supposed to read the file into a DOM, but I am not sure whether it is working.

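$filename = "./templates/{$_GET['template']}/{$_GET['style']}";
$doc = new DOMDocument();
// load() returns false when the file is missing or is not well-formed XML.
if (!$doc->load($filename)) {
    die("Could not parse {$filename}");
}
$clp = $doc->getElementsByTagName("clp");
// print_r() shows almost nothing for a DOMNodeList, so report the count instead.
echo $clp->length, " clp elements found\n";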

 

Thanks!

So what was wrong with the regex solution you said worked?

 

Danial0 recommended DOM, so I wanted to give it a try, and I added a construct to my language.

 

I tried using SimpleXML. It seems to read the file, but I am not sure whether it is returning all the correct values.
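
One thing that can make SimpleXML look like it is missing values on a template like this is the default XHTML namespace on the html element. The following is only a sketch, assuming a trimmed copy of the template held in a string (the $xhtml variable and the "x" prefix are made up for the example): an unprefixed XPath query finds no clp elements, while a query with a registered prefix finds all of them.

```php
<?php
// Trimmed copy of the template from the post, kept in a string for the example.
$xhtml = <<<XML
<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<div id="left">
    <clp construct="multiple">
        <clp name="navigation" type="link" construct="multiple"></clp>
        <clp name="sub navigation" type="link" construct="multiple"></clp>
    </clp>
</div>
<div id="right"><clp name="content" type="textarea"></clp></div>
</body>
</html>
XML;

$xml = simplexml_load_string($xhtml);

// An unprefixed name in XPath means "no namespace", so this matches nothing.
$plain = $xml->xpath('//clp');

// Bind a prefix (the name "x" is arbitrary) to the XHTML namespace instead.
$xml->registerXPathNamespace('x', 'http://www.w3.org/1999/xhtml');
$found = $xml->xpath('//x:clp');

$names = [];
foreach ($found as $clp) {
    // The outer wrapper <clp> has no name attribute, so skip it.
    if (isset($clp['name'])) {
        $names[] = (string) $clp['name'];
    }
}

echo count($plain), " unprefixed matches\n";
print_r($names);
```

DOMDocument's getElementsByTagName("clp") matches on the tag name as written, so it is not affected by this in the same way.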


You have to be very careful when parsing XHTML.

 

SimpleXML and DOMDocument are the best tools to use if they'll work.  I've never used DOMDocument, but I can say that the last time I used SimpleXML it barfed all over itself when the file was not well-formed XML to start with.  If you can trust the validity of the document, then either should work fine.
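
If you are not sure whether a document can be trusted, libxml can tell you before you rely on the parse. This is a minimal sketch, not the only way to do it: xmlErrors() is a made-up helper name, and the two sample strings are invented for the example. It switches libxml to collecting errors internally, attempts the parse, and returns whatever was reported.

```php
<?php
/**
 * Try to parse $xml and return the libxml error messages.
 * An empty array means the string is well-formed XML.
 * (Hypothetical helper written for this example.)
 */
function xmlErrors(string $xml): array
{
    // Collect parse errors instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadXML($xml);

    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }

    // Leave libxml in the state we found it.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $messages;
}

// Invented samples: the second one never closes its <clp> element.
$good = '<div id="right"><clp name="content" type="textarea"></clp></div>';
$bad  = '<div id="right"><clp name="content" type="textarea"></div>';

var_dump(xmlErrors($good));
var_dump(xmlErrors($bad));
```

The same libxml_use_internal_errors() switch also applies to simplexml_load_string(), since both extensions sit on top of libxml.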

 

I've never used regexps to parse markup, but I have a coworker who tried (after SimpleXML failed because the document was invalid markup).  He ran into the problem of being unable to write a regexp that matches an opening tag with its corresponding closing tag.  The problem is with nested elements, such as tables or divs.  Say you try to write a regexp to match the opening tag of a div with id="foobar"; that is trivial.  Now try to match its corresponding closing div tag when:

1) You don't know how deep in the page div::id="foobar" is (i.e. it has div parents)

2) You don't know how many div children div::id="foobar" has
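
To make the problem concrete, here is a small sketch using made-up markup in which div id="foobar" has both a div parent and a div child. Neither the lazy nor the greedy form of the obvious pattern captures the right content.

```php
<?php
// Invented sample: the target div has a parent div and a child div.
$html = '<div id="outer"><div id="foobar">a<div>b</div>c</div><div>d</div></div>';

// What we actually want from div#foobar is: a<div>b</div>c

// Lazy match: stops at the first </div>, which belongs to the child.
preg_match('#<div id="foobar">(.*?)</div>#s', $html, $m);
$lazy = $m[1];

// Greedy match: runs on to the last </div>, which belongs to the parent.
preg_match('#<div id="foobar">(.*)</div>#s', $html, $m);
$greedy = $m[1];

echo "lazy:   $lazy\n";
echo "greedy: $greedy\n";
```

The lazy version cuts the content short and the greedy version swallows the sibling div as well, because the pattern has no way of counting how many divs are open.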

 

The last resort (and most reliable) is to program a parser of your own.  I mean a real parser: a finite-state machine (FSM) that grabs either a character or a token at a time and switches between states to determine how the current token (or character) should be handled.
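
As a rough illustration of that idea, here is a deliberately small sketch of a character-at-a-time scanner with two states ("text" and "tag") plus a depth counter. The function name innerDivById() is made up, and the scanner assumes simple markup: it ignores comments, CDATA sections, and any ">" inside attribute values, and it only recognises the id when it is written as id="..." with double quotes.

```php
<?php
/**
 * Return the inner markup of the first <div> whose id equals $id,
 * or null if it is not found or is never closed.
 * (Hypothetical helper; simplified as described above.)
 */
function innerDivById(string $html, string $id): ?string
{
    $state = 'text';   // current state: 'text' or 'tag'
    $tag = '';         // characters collected between '<' and '>'
    $tagStart = 0;     // offset of the '<' of the tag being read
    $depth = 0;        // 0 until the target div has been opened
    $innerStart = 0;   // offset just after the target's opening tag
    $len = strlen($html);

    for ($i = 0; $i < $len; $i++) {
        $ch = $html[$i];

        if ($state === 'text') {
            if ($ch === '<') {          // a tag begins: switch state
                $state = 'tag';
                $tag = '';
                $tagStart = $i;
            }
            continue;
        }

        if ($ch !== '>') {              // still inside the tag
            $tag .= $ch;
            continue;
        }

        // A complete tag has been read: go back to 'text' and classify it.
        $state = 'text';
        $isClose = ($tag !== '' && $tag[0] === '/');
        $selfClosing = (substr($tag, -1) === '/');
        $body = ltrim($tag, '/');
        $name = strtolower(substr($body, 0, strcspn($body, " \t\r\n/")));

        if ($name !== 'div' || $selfClosing) {
            continue;                   // only divs affect the depth
        }

        if ($depth === 0) {
            // Not inside the target yet: is this its opening tag?
            if (!$isClose && strpos($tag, ' id="' . $id . '"') !== false) {
                $depth = 1;
                $innerStart = $i + 1;
            }
        } elseif ($isClose) {
            if (--$depth === 0) {       // this </div> closes the target
                return substr($html, $innerStart, $tagStart - $innerStart);
            }
        } else {
            $depth++;                   // a nested child div opened
        }
    }

    return null;
}

$html = '<div id="outer"><div id="foobar">a<div>b</div>c</div><div>d</div></div>';
var_dump(innerDivById($html, 'foobar'));
```

The depth counter is what a single regexp lacks: it does not matter how deep the target sits or how many div children it has, because every opening div is paired off with its own closing tag.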

