godsent Posted May 1, 2009

This code extracts whatever is between <title> and </title> on a web page, so yes, it returns the exact title.

    function getMetaTitle($content){
        $pattern = "|<[\s]*title[\s]*>([^<]+)<[\s]*/[\s]*title[\s]*>|Ui";
        if(preg_match($pattern, $content, $match))
            return $match[1];
        else
            return false;
    }

    $url = 'http://127.0.0.1/';
    $content = file_get_contents($url);
    $title = getMetaTitle($content);

What I don't understand is how this line is constructed:

    $pattern = "|<[\s]*title[\s]*>([^<]+)<[\s]*/[\s]*title[\s]*>|Ui";

It almost looks like a "code" to me. Please explain.
Mchl Posted May 1, 2009

http://www.regular-expressions.info/tutorial.html
Daniel0 Posted May 1, 2009

The pattern

    <[\s]*title[\s]*>([^<]+)<[\s]*/[\s]*title[\s]*>

means this:

    <          Match <
    [\s]*      Match a whitespace character 0 or more times
    title      Match title
    [\s]*      Match a whitespace character 0 or more times
    >          Match >
    ([^<]+)    Match any character that is not <, 1 or more times, and store the result in backreference 1
    <          Match <
    [\s]*      Match a whitespace character 0 or more times
    /          Match /
    [\s]*      Match a whitespace character 0 or more times
    title      Match title
    [\s]*      Match a whitespace character 0 or more times
    >          Match >
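To see the breakdown above in action, here is a minimal sketch that runs the same pattern against a made-up HTML string. The sample markup and the variable names are assumptions for illustration and are not from the thread. Two pieces the breakdown does not cover: the | characters at both ends are the pattern delimiters, and the trailing Ui are modifiers (i makes the match case-insensitive, U makes the quantifiers ungreedy by default).

```php
<?php
// Same pattern as in the original post, written in single quotes so that
// PHP passes the backslashes through to the regex engine unchanged.
$pattern = '|<[\s]*title[\s]*>([^<]+)<[\s]*/[\s]*title[\s]*>|Ui';

// Hypothetical sample input: odd spacing and upper case on purpose.
$html = '<html><head>< TITLE >My Page</ title ></head><body></body></html>';

if (preg_match($pattern, $html, $match)) {
    // $match[0] is the whole matched text, $match[1] is backreference 1.
    $whole = $match[0];
    $title = $match[1];
} else {
    $whole = null;
    $title = false;
}

// ([^<]+) needs at least one character, so an empty title does not match.
$emptyMatches = preg_match($pattern, '<title></title>');

var_dump($whole, $title, $emptyMatches);
```

Because ([^<]+) stops at the first <, a title that contains markup or a literal < will not match either, which is one limit of this approach.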
nrg_alpha Posted May 1, 2009

Granted, there's no need to stuff \s within a character class (as in [\s]) if it is all by itself, as \s is already a shorthand character class for all whitespace characters.

Additional regex resources:

    weblog tools collection
    PHPFreaks Resources (tutorial links under 'Other Sources')
    PHPFreaks regex tutorial
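As a quick check of the point above, the following sketch compares the original pattern with a version that uses plain \s in place of [\s]. The helper name extractTitle and the sample strings are hypothetical, chosen only for this illustration; both patterns should give the same result on every input.

```php
<?php
// Original pattern from the thread and a simplified variant without the
// redundant character class around \s.
$original   = '|<[\s]*title[\s]*>([^<]+)<[\s]*/[\s]*title[\s]*>|Ui';
$simplified = '|<\s*title\s*>([^<]+)<\s*/\s*title\s*>|Ui';

// Hypothetical helper: returns the captured title, or false when nothing matches.
function extractTitle($pattern, $content)
{
    return preg_match($pattern, $content, $match) ? $match[1] : false;
}

// Made-up inputs covering normal, oddly spaced, and missing titles.
$samples = array(
    '<title>Home</title>',
    "<\tTitle >Spaced Out< / TITLE\n>",
    '<p>no title here</p>',
);

$results = array();
foreach ($samples as $sample) {
    // Each entry pairs the result of the original and the simplified pattern.
    $results[] = array(
        extractTitle($original, $sample),
        extractTitle($simplified, $sample),
    );
}

print_r($results);
```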