rv20 Posted May 18, 2009

I was trying out the PHP tidy extension/classes. I ran it on a couple of HTML sources with some missing tags and it worked well, filling in the missing tags. But one example I tried had two missing closing tags, a missing </title> and a missing </b>, and tidy totally messed up the missing </title>: it placed it after the closing </body> tag instead of inside <head>.

Here is the source HTML I feed in:

    <?xml version="1.0" encoding="utf-8" ?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
    <title>
    <body>
    <b>hello
    </body>
    </html>

Here is the result from PHP tidy:

    <?xml version="1.0" encoding="utf-8"?>
    <!DOCTYPE html PUBLIC "-//w3c//dtd xhtml 1.0 strict//en" "http://www.w3.org/tr/xhtml1/dtd/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
    <title>
    <body>
    <b>hello</b>
    </body>
    </title>
    </head>
    </html>

Here is the PHP code I used for this:

    <?php
    include "vars.php";

    // open_html_file_for_reading() comes from vars.php
    $html = open_html_file_for_reading("val.html");

    $config = array(
        'indent' => true,
        'output-xml' => true,
        'input-xml' => true,
        'wrap' => '1000');

    // Tidy
    $tidy = new tidy();
    $tidy->parseString($html, $config, 'utf8');
    $tidy->cleanRepair();
    echo tidy_get_output($tidy);
    ?>

Any help would be great. It might have something to do with mixing HTML and XHTML standards (strict, transitional and frameset), or maybe <b> is deprecated and tidy is getting confused. But as I said, so far it only seems to mess things up when the </title> is missing. Any ideas?

Maybe my config options just need to be tweaked?

    $config = array(
        'indent' => true,
        'output-xml' => true,
        'input-xml' => true,
        'wrap' => '1000');
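One tweak that may be worth trying, sketched here as an assumption and not a confirmed fix: the 'input-xml' option tells Tidy to use its XML parser instead of the error-correcting HTML parser, and a generic XML parser has no rule saying that <title> belongs inside <head>. The variant below drops 'input-xml' and asks for XHTML output with 'output-xhtml'. The inline $html string stands in for val.html so the snippet runs on its own; it requires the tidy extension.

```php
<?php
// Sketch only: the sample document from the post, inlined so that no helper
// from vars.php is needed.
$html = <<<'HTML'
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>
<body>
<b>hello
</body>
</html>
HTML;

$config = array(
    'indent'       => true,
    // Assumption: XHTML output is what is wanted, so 'output-xhtml'
    // replaces 'output-xml' from the original config.
    'output-xhtml' => true,
    // 'input-xml' is left out on purpose so that Tidy falls back to its
    // error-correcting HTML parser.
    'wrap'         => 1000,
);

$tidy = new tidy();
$tidy->parseString($html, $config, 'utf8');
$tidy->cleanRepair();
echo tidy_get_output($tidy);
```

If the closing </title> still lands in the wrong place with this config, the parser choice is probably not the cause and the other guesses in the post would need a closer look.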