Andy-H Posted June 14, 2012

I am writing a simple templating system for a website I'm making. Basically, I want to retrieve content from files and insert it into a document using jQuery-style selectors. I'm unsure whether to use DOMDocument or regular expressions; which one would be faster for this? I'm leaning towards regex at the moment, but I'm not too clued up on it:

<?php
$tag = 'span';
$pat = <<<PAT
~(.*?)<{$tag}.*id\s*=\s*["?|'?]testing["?|'?][^>]*>(.*?)</\s*{$tag}\s*>|\s*>|\s*/>(.*?)~i
PAT;
$htm = <<<HTM
test
<span id="test">Test</span>
<span class="testing" id="testing">Testing</span>
tredst
HTM;
preg_match_all($pat, $htm, $matches);
echo '<pre>'. htmlentities(print_r($matches, 1), ENT_QUOTES, 'UTF-8');
?>

OUTPUT:

Array
(
    [0] => Array
        (
            [0] => >
            [1] => >
            [2] => <span class="testing" id="testing">Testing</span>
        )
    [1] => Array
        (
            [0] =>
            [1] =>
            [2] =>
        )
    [2] => Array
        (
            [0] =>
            [1] =>
            [2] => Testing
        )
    [3] => Array
        (
            [0] =>
            [1] =>
            [2] =>
        )
)

Desired:

Array
(
    [0] => Array
        (
            [0] => test <span id="test">Test</span>
            [1] => <span class="testing" id="testing">
            [2] => Testing
            [3] => </span> tredst
        )
)

Any help appreciated.

P.S. Sorry if this should be in regex help. I was unsure, as I also wanted advice on whether it was the right decision not to go with DOMDocument.
kicken Posted June 15, 2012

If you want to parse HTML, you should use DOMDocument. Your HTML will have to be mostly valid for it to work properly, though.

Parsing HTML with regex is generally considered a bad idea. It works sometimes, and I'll do it for one-off re-format scripts or small personal-use scrapers, but for something like a template engine you'd be better off going with a real HTML parsing solution like DOMDocument.
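As a rough illustration of the DOMDocument route suggested above, the sketch below looks up the `id="testing"` span from the sample markup in the first post. It is only a minimal example under a few assumptions: libxml warnings about the loose markup are suppressed, and the variable names `$span`, `$text` and `$class` are made up for this example.

```php
<?php
// Sample markup from the first post in this thread.
$htm = <<<HTM
test
<span id="test">Test</span>
<span class="testing" id="testing">Testing</span>
tredst
HTM;

$doc = new DOMDocument();

// The sample is not a full document, so collect parser warnings
// internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($htm);
libxml_clear_errors();

// Look the element up by id instead of matching it with a pattern.
$span  = $doc->getElementById('testing');
$text  = $span !== null ? $span->textContent : null;
$class = $span !== null ? $span->getAttribute('class') : null;

echo $text, ' (class: ', $class, ")\n";
```

The lookup does not depend on attribute order or quoting style, which is where the regex attempts in this thread run into trouble.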
Andy-H (author) Posted June 15, 2012

Now got:

$tag = 'span';
$pat = <<<PAT
~(.*)(<{$tag}.*class\s*=\s*["?|'?]testing["?|'?][^>]*>)(.*?)(</\s*{$tag}\s*>)(.*)~is
PAT;
$htm = <<<HTM
<span>
<span class="testing" id="test">Test</span>
<span class="testing" id="testing">Testing</span>
</span>
HTM;
preg_match_all($pat, $htm, $matches, PREG_SET_ORDER);
echo '<pre>'. htmlentities(print_r($matches, 1));

Outputs:

Array
(
    [0] => Array
        (
            [0] => <span> <span class="testing" id="test">Test</span> <span class="testing" id="testing">Testing</span> </span>
            [1] => <span> <span class="testing" id="test">Test</span>
            [2] => <span class="testing" id="testing">
            [3] => Testing
            [4] => </span>
            [5] => </span>
        )
)

Desired output:

Array
(
    [0] => Array
        (
            [0] => <span> <span class="testing" id="test">Test</span> <span class="testing" id="testing">Testing</span> </span>
            [1] => <span>
            [2] => <span class="testing" id="test">
            [3] => Test
            [4] => </span>
            [5] => <span class="testing" id="testing">Testing</span> </span>
        )
    [1] => Array
        (
            [0] => <span> <span class="testing" id="test">Test</span> <span class="testing" id="testing">Testing</span> </span>
            [1] => <span> <span class="testing" id="test">Test</span>
            [2] => <span class="testing" id="testing">
            [3] => Testing
            [4] => </span>
            [5] => </span>
        )
)
Andy-H (author) Posted June 15, 2012

Quoting kicken: "If you want to parse HTML, you should use DOMDocument. Your HTML will have to be mostly valid for it to work properly, though. Parsing HTML with regex is generally considered a bad idea. It works sometimes, and I'll do it for one-off re-format scripts or small personal-use scrapers, but for something like a template engine you'd be better off going with a real HTML parsing solution like DOMDocument."

OK, so scrap the regex idea, thanks.

I also have another problem. I want to be able to call templates like so:

$Page = (new Template('default', ['site' => 'b2c']))->getContent('slider')->insertAfter('#header');

However, I want getContent to return another object that provides insertAfter, rather than update a class member to hold the content. This way the insertBefore/insertAfter and appendTo/prependTo methods are only exposed once content has been loaded. Is this the right way to go? Here's what I have so far.

Template.class.php

namespace phantom\classes\templating;

class Template
{
    protected $_pageContent;

    public function __construct($template, array $data = array())
    {
        $this->_pageContent = $this->_getContent($template, $data);
    }

    // Returns a Content object wrapping the rendered file and this template.
    public function getContent($file_name, array $data = array())
    {
        return new Content($this->_getContent($file_name, $data), $this);
    }

    // Unfinished stub: splits a "tag#id" selector into its two parts.
    public function querySelector($selector)
    {
        $selector = explode('#', $selector);
        $tag = $selector[0];
        $match = $selector[1];
    }

    // Renders a template file with $data extracted into local scope.
    protected function _getContent($file_name, array $data = array())
    {
        ob_start();
        extract($data, EXTR_SKIP);
        include 'templates'. DIRECTORY_SEPARATOR . $file_name .'.tmpl.php';
        return ob_get_clean();
    }
}

Content.class.php

namespace phantom\classes\templating;

class Content
{
    protected $_content;
    protected $_template;

    public function __construct($content, Template $tmpl)
    {
        $this->_content = $content;
        $this->_template = $tmpl;
    }

    public function insertBefore($tag) { }
    public function insertAfter($tag) { }
    public function appendTo($tag) { }
    public function prependTo($tag) { }
}

But now I am unsure how to update the template contents without exposing public methods to set the content.
Andy-H (author) Posted June 15, 2012

OK, I now have:

Template.class.php

<?php
namespace phantom\classes\templating;

// Import the global class; inside a namespace a bare DOMDocument
// would otherwise resolve to phantom\classes\templating\DOMDocument.
use DOMDocument;

class Template
{
    protected $_document;

    public function __construct($tmpl_file, array $data = array(), $ver = '4.01', $enc = 'UTF-8')
    {
        $this->_document = new DOMDocument($ver, $enc);
        $this->_document->loadHTML($this->_getContent($tmpl_file, $data));
    }

    // Returns a Content object that shares this template's document.
    public function getContent($file_name, array $data = array())
    {
        return new Content($this->_getContent($file_name, $data), $this->_document);
    }

    // Renders a template file with $data extracted into local scope.
    protected function _getContent($file_name, array $data = array())
    {
        ob_start();
        extract($data, EXTR_SKIP);
        include 'templates'. DIRECTORY_SEPARATOR . $file_name .'.tmpl.php';
        return ob_get_clean();
    }
}

Content.class.php

<?php
namespace phantom\classes\templating;

use DOMDocument;

class Content
{
    protected $_content;
    protected $_document;

    public function __construct($content, DOMDocument $document)
    {
        $this->_content = $content;
        $this->_document = $document;
    }

    public function insertBefore($tag)
    {
        $this->_getElement($tag);
    }

    public function insertAfter($tag) { }
    public function appendTo($tag) { }
    public function prependTo($tag) { }

    // Only "#id" selectors are handled so far.
    protected function _getElement($tag)
    {
        if ( substr($tag, 0, 1) == '#' )
            return $this->_document->getElementById(substr($tag, 1));
    }
}

I am now stuck on how to convert an HTML string into a document fragment. I know you can do this with well-formed XHTML, but I am using HTML 4.01. Has anyone got any ideas how I could do something along the lines of:

$DOMDocument->loadFragment('<div class="slider"><h1>Tracking vehicles</h1><p>Blah blah blah</p></div>')->insertAfter(DOMNode);

Thanks for any help.
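One commonly used way to get a fragment from an HTML (rather than XHTML) string, using only the bundled DOM extension, is to parse the string in a throwaway DOMDocument and import the resulting nodes into the real document. The sketch below assumes PHP 7 or later (for the type declarations) and ASCII or otherwise encoding-safe input; the helper name `htmlToFragment` and the `__wrap` id are hypothetical, not part of PHP or of the classes above.

```php
<?php
/**
 * Hypothetical helper: parse an HTML snippet and return its nodes as a
 * DOMDocumentFragment owned by $target.
 */
function htmlToFragment(DOMDocument $target, string $html): DOMDocumentFragment
{
    $tmp = new DOMDocument();

    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    // Wrap the snippet in a known container so its nodes are easy to find.
    $tmp->loadHTML('<div id="__wrap">' . $html . '</div>');
    libxml_clear_errors();

    $fragment = $target->createDocumentFragment();
    $wrapper  = $tmp->getElementById('__wrap');

    foreach ($wrapper->childNodes as $child) {
        // importNode() makes a deep copy that belongs to $target.
        $fragment->appendChild($target->importNode($child, true));
    }

    return $fragment;
}

// Usage: insert a navigation list before the slider.
$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML('<div id="container"><div id="slider"></div></div>');
libxml_clear_errors();

$frag = htmlToFragment($doc, '<ul id="main_nav"><li><a href="#">Home</a><br></li></ul>');
$doc->getElementById('container')->insertBefore($frag, $doc->getElementById('slider'));

$html = $doc->saveHTML();
echo $html;
```

Because the snippet goes through the HTML parser, void tags such as `<br>` do not need to be written as `<br />`, which is the main difference from `appendXML()`.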
trq Posted June 15, 2012

You might want to take a look at phpQuery. http://code.google.com/p/phpquery/
Andy-H (author) Posted June 15, 2012

I was looking at that yesterday. I'd rather do it using SPL if possible, as I might re-use this code in several environments.
trq Posted June 15, 2012

You've lost me; you're not using anything from SPL.
Andy-H (author) Posted June 15, 2012

Oh, sorry, I just mean I only want to use things that come pre-packaged with PHP, i.e. things available on servers where PHP was built with the default configuration.
Andy-H (author) Posted June 15, 2012

$doc = new DOMDocument('4.01', 'UTF-8');
$doc->loadHTML('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
<title>Phantom - Tracking Systems and Accessories</title>
<!-- META //-->
<meta name="description" content="" >
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" >
<!-- LINKS AND SCRIPTS //-->
<link rel="stylesheet" href="css/layout.css" type="text/css" >
</head>
<body>
<div id="clouds"></div>
<div id="container">
<!-- HEADER //-->
<div id="header">
<h1>
<a href="/">
<img src="images/layout/b2c/phantom.png" alt="Phantom vehicle tracking and accessories" >
</a>
</h1>
<div id="header_right">
<ul id="header_nav" class="navigation">
<li class="first"><a href="#">About</a></li>
<li><a href="#">News</a></li>
</ul>
<img class="telephone" src="images/layout/b2c/tel.png" alt="Telephone number" >
</div>
</div>
<!-- SLIDER //-->
<div id="slider">
</div>
<!-- PRODUCT NAVIGATION IMAGES //-->
<ul id="product_navigation">
<li class="first"><a href="#" id="remap">Engine ECU remapping</a></li>
<li><a href="#" id="tyre-pro">Tyre protector</a></li>
<li><a href="#" id="sat-dish">Caravan and motorhome satellite dish</a></li>
<li><a href="#" id="reverse-sensor">Reverse sensor</a></li>
<li><a href="#" id="in-car-cam">In car camera</a></li>
<li><a href="#" id="alarms">Caravan and motorhome alarms</a></li>
<li><a href="#" id="tracking">Caravan and motorhome tracking</a></li>
<li><a href="#" id="subs">Renew tracking subscription</a></li>
</ul>
<!-- CONTENT //-->
<div id="content">
<div class="clr"></div>
</div>
</div>
<!-- FOOTER //-->
<div id="footer">
<div id="logo"></div>
<div class="green_banner">
<div id="motto">Protect, Secure, Enjoy</div>
</div>
<div class="blue_banner"></div>
</div>
</body>
</html>');

$frag = $doc->createDocumentFragment();
$frag->appendXML('
<!-- MAIN NAVIGATION //-->
<ul id="main_nav" class="navigation">
<li class="first"><a href="#">Home</a></li>
<li><a href="#">Tracking</a></li>
<li><a href="#">Remapping</a></li>
<li><a href="#">Tyre protector</a></li>
<li><a href="#">Alarms</a></li>
<li><a href="#">Cameras and sensors</a></li>
<li><a href="#">Insurance</a></li>
<li><a href="#">Contact us</a></li>
</ul>');

$doc->getElementById('container')->insertBefore($frag, $doc->getElementById('slider'));
echo $doc->saveHTML();

This seems to work quite well, as long as I write non-closing tags with /> in the fragment passed to appendXML(); saveHTML() still outputs them correctly.

Cheers
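The DOM extension offers `insertBefore()` but has no built-in counterpart for inserting after a node, so an `insertAfter()` method like the one planned for the Content class is usually built on the reference node's next sibling. The following is only a sketch of that idea, assuming PHP 7.1 or later for the `void` return type; the standalone function name `insertAfter` is hypothetical.

```php
<?php
/**
 * Hypothetical helper: insert $new immediately after $ref.
 */
function insertAfter(DOMNode $new, DOMNode $ref): void
{
    // When $ref is the last child, nextSibling is null and
    // insertBefore() with a null reference appends at the end.
    $ref->parentNode->insertBefore($new, $ref->nextSibling);
}

// Usage: place a navigation list between the header and the slider.
$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML('<div id="container"><div id="header"></div><div id="slider"></div></div>');
libxml_clear_errors();

$nav = $doc->createElement('ul');
$nav->setAttribute('id', 'main_nav');
insertAfter($nav, $doc->getElementById('header'));

// Collect the ids of the container's children to show the new order.
$order = [];
foreach ($doc->getElementById('container')->childNodes as $child) {
    $order[] = $child->getAttribute('id');
}
echo implode(', ', $order), "\n";
```

The same call works when `$new` is a DOMDocumentFragment, in which case the fragment's children are moved into place.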