boby Posted August 5, 2006

Hello,

I am writing a script that extracts the meta tags out of a webpage. Because the "get_meta_tags" function is very slow and does not handle line breaks well, I'm using the following code. Can someone please review this regex and, if possible, improve it or suggest something better?

[code=php]<?php
preg_match_all ('/<[\s]*meta[\s]*name[\s]*=[\s]*["\']?([^>"\']*)["\']?[\s]*content[\s]*=[\s]*["\']?([^>"\']*)["\']?[\s]*[\/]?[\s]*>/si', $content, $matches);
?>[/code]

Thank you very much!
Boby
effigy Posted August 5, 2006

That's not a good approach because the attributes can be in any order. First match the whole meta tag, then get the attributes. Something like [url=http://www.phpfreaks.com/forums/index.php/topic,96844.msg387986.html#msg387986]this[/url].
boby Posted August 5, 2006 (Author)

Do you mean to first get the content between <head> and </head>, or just the meta tags? And how do I get just the meta tags? Is this OK?

[code=php:0]preg_match_all ('/<[\s]*meta(.*)[\/]?[\s]*>/si', $content, $matches);[/code]

Thank you
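A quick way to check the pattern in the post above is to run it on a page with more than one meta tag. The sketch below does that; the sample `$content` string and the bounded comparison pattern are assumptions made for illustration and are not from the thread.

```php
<?php
// Sample input invented for illustration (not from the thread).
$content = "<head>\n<meta name=\"a\" content=\"1\">\n<meta name=\"b\" content=\"2\">\n</head>";

// Pattern from the post: with the /s flag the greedy (.*) also crosses
// newlines, so it runs on to the last ">" in the input.
$greedy_count = preg_match_all('/<[\s]*meta(.*)[\/]?[\s]*>/si', $content, $greedy);

// Bounded comparison: [^>]* stops at the first ">" and stays inside one tag.
$bounded_count = preg_match_all('/<\s*meta\b([^>]*)>/i', $content, $bounded);

echo $greedy_count, " match(es) with (.*)\n";
echo $bounded_count, " match(es) with [^>]*\n";
```

On this input the greedy pattern reports one match that swallows both tags, while the bounded one reports two. The bounded form is still only an approximation: a literal ">" inside a quoted attribute value would cut the match short.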
wildteen88 Posted August 5, 2006

This is what I have to get the meta tag contents:

[code=php:0]<?php
$text = '<meta name="keywords" content="PHP, MySQL, bulletin, board, free, open, source, smf, simple, machines, forum" />';

if(preg_match("#<meta([^>]*)>#si", $text, $matches))
{
    //echo '<pre>' . htmlentities(print_r($matches, true)) . '</pre><br /><br />';

    // $matches[1] is what stores the contents of the meta tag.
    // Strip only the trailing slash of a self-closing tag, so that
    // slashes inside attribute values (e.g. URLs) are left alone.
    $matches[1] = rtrim($matches[1], " /");

    // put each attribute and its value into an array
    $attrs = preg_split("#(\"\s)#i", trim($matches[1]));

    // we now create a meta array. The format of the array will be:
    // Array([attribute_name] => [attribute_value])
    $meta = array();

    foreach($attrs as $attr)
    {
        // we now get the attribute name and attribute value in separate variables
        // (limit of 2, so an "=" inside the value does not split it again)
        list($attr_name, $attr_value) = explode("=", $attr, 2);

        // we create our meta array, trimming off any double quotes remaining in the attribute value string.
        $meta[$attr_name] = trim($attr_value, '"');
    }

    echo '<pre>' . print_r($meta, true) . '</pre>';
}
?>[/code]

I use the meta tag from the forum for reference. The code outputs the following:

[code]Array
(
    [name] => keywords
    [content] => PHP, MySQL, bulletin, board, free, open, source, smf, simple, machines, forum
)[/code]

[b]EDIT[/b] Updated code to get each attribute into an array.
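The same two-step idea (match the whole tag first, then read its attributes) can be stretched to cover every meta tag on a page, with the attributes in any order and with single-quoted or unquoted values. What follows is only a rough sketch under those assumptions: `extract_meta_tags()` and the sample `$html` are hypothetical names made up for this example, and a regex like this is not a full HTML parser.

```php
<?php
// Hypothetical helper (not from the thread): collect name => content pairs
// from every <meta> tag in an HTML string, whatever the attribute order.
function extract_meta_tags($html)
{
    $meta = array();

    // Step 1: match each whole <meta ...> tag. [^>]* cannot run past the
    // closing ">" and, unlike ".", it matches line breaks without /s.
    if (!preg_match_all('#<meta\b([^>]*)>#i', $html, $tags)) {
        return $meta;
    }

    foreach ($tags[1] as $attr_string) {
        // Step 2: read attr="value", attr='value' or attr=value pairs.
        // The (?| ... ) branch-reset group puts the value in group 2
        // regardless of which quoting style matched.
        preg_match_all(
            '#([a-z][a-z0-9_:-]*)\s*=\s*(?|"([^"]*)"|\'([^\']*)\'|([^\s"\'>]+))#i',
            $attr_string,
            $pairs,
            PREG_SET_ORDER
        );

        $attrs = array();
        foreach ($pairs as $pair) {
            // Attribute names are case-insensitive in HTML, so normalise them.
            $attrs[strtolower($pair[1])] = $pair[2];
        }

        // Keep only tags that carry both a name and a content attribute.
        if (isset($attrs['name'], $attrs['content'])) {
            $meta[$attrs['name']] = $attrs['content'];
        }
    }

    return $meta;
}

// Sample input invented for illustration: reversed attribute order, mixed
// quoting, upper-case names and a line break inside the second tag.
$html = "<head>\n<meta content='PHP, MySQL' name=\"keywords\">\n"
      . "<META NAME=description\n CONTENT=\"Line\nbreak\" />\n</head>";

$meta = extract_meta_tags($html);
print_r($meta);
```

Tags without a `name` attribute (for example `http-equiv` ones) are skipped here; whether to keep them is a design choice left open.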
boby Posted August 5, 2006 (Author)

Wow, thank you :D I will test it ASAP.