rv20
Posted May 18, 2009

I was trying out the PHP tidy extension and its classes. I ran it on a few HTML sources with missing tags and it worked well, filling in the missing tags each time. But one example, with two missing closing tags (a missing </title> and a missing </b>), came out wrong: the missing </title> was inserted after the closing </body> tag, so the whole body ended up nested inside the title.

Here is the source HTML I fed in:

    <?xml version="1.0" encoding="utf-8" ?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
    <title>
    <body>
    <b>hello
    </body>
    </html>

Here is the result from PHP tidy:

    <?xml version="1.0" encoding="utf-8"?>
    <!DOCTYPE html PUBLIC "-//w3c//dtd xhtml 1.0 strict//en" "http://www.w3.org/tr/xhtml1/dtd/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
    <title>
    <body>
    <b>hello</b>
    </body>
    </title>
    </head>
    </html>

Here is the PHP code I used:

    <?php
    include "vars.php";

    $html = open_html_file_for_reading("val.html");

    $config = array(
        'indent'     => true,
        'output-xml' => true,
        'input-xml'  => true,
        'wrap'       => '1000'
    );

    // Tidy
    $tidy = new tidy();
    $tidy->parseString($html, $config, 'utf8');
    $tidy->cleanRepair();
    echo tidy_get_output($tidy);
    ?>

Any help would be great. It might have something to do with mixing HTML and XHTML standards (strict, transitional and frameset), or maybe <b> is deprecated and tidy is getting confused. But as I said, so far it only seems to mess things up when the </title> is missing. Any ideas? Maybe my config options

    $config = array(
        'indent'     => true,
        'output-xml' => true,
        'input-xml'  => true,
        'wrap'       => '1000'
    );

just need to be tweaked?
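One tweak that may be worth trying, as a minimal sketch rather than a confirmed fix: this assumes the cause is the 'input-xml' option, which asks Tidy to parse the input as generic XML, so it would only close open tags in nesting order and would not apply HTML rules such as "a title cannot contain a body". The sketch drops 'input-xml' and swaps 'output-xml' for 'output-xhtml'. The inline $html string is a stand-in for val.html, and the $output variable is introduced only for illustration.

```php
<?php
// Stand-in for the contents of val.html (same markup as in the post).
$html = <<<'HTML'
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>
<body>
<b>hello
</body>
</html>
HTML;

// Assumption: without 'input-xml', Tidy uses its HTML parser and can
// apply HTML nesting rules; 'output-xhtml' still requests XHTML output.
$config = array(
    'indent'       => true,
    'output-xhtml' => true,
    'wrap'         => 1000,
);

$tidy = new tidy();
$tidy->parseString($html, $config, 'utf8');
$tidy->cleanRepair();

// Keep the repaired markup in a variable so it can be inspected.
$output = tidy_get_output($tidy);
echo $output;
```

If the assumption holds, the closing </title> should show up inside the head, before the body starts, instead of after </body>.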