sachavdk Posted April 18, 2007

I'm trying to get the CSS out of the style tags of an HTML document like this:

<style type="text/css"> <!-- .style1 { font-size: 11px; } --> </style>

I'm trying to get:

<!-- .style1 { font-size: 11px; } -->

This is what I have now:

preg_match("/(<style type=\"text\/css\">)+([.])+(<\/style>)/", $fread, $style);
Lumio Posted April 18, 2007

preg_match('/'.preg_quote('<style type="text/css">').'(.+?)'.preg_quote('</style>').'/', $fread, $style);
Glyde Posted April 18, 2007

preg_match("@<style[^>]+(type=['\"]?[^\"]+['\"]?)?[^>]?>(.+?)</style>)@is", $fread, $style);
print_r($style);

This should be more compatible for matching all the different types of style tags.
sachavdk (Author) Posted April 18, 2007

Both methods give errors. The first one gives:

Warning: preg_match() [function.preg-match]: Unknown modifier 'c' in ...

The second:

Warning: preg_match() [function.preg-match]: Compilation failed: unmatched parentheses at offset 53 in ...
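A note on the first warning: preg_quote() does not escape the pattern delimiter unless it is passed as the second argument, so the "/" in text/css ends the pattern early and the "c" that follows is read as a modifier. The sketch below is one way to repair that call. The sample string standing in for $fread and the $css variable are assumptions made for illustration, and the s modifier is added on the assumption that the CSS may span several lines.

```php
<?php
// Sample input standing in for the file contents (an assumption for illustration).
$fread = "<html><head><style type=\"text/css\">\n<!--\n.style1 { font-size: 11px; }\n-->\n</style></head></html>";

// Passing '/' as the second argument makes preg_quote() escape the delimiter as well.
$open  = preg_quote('<style type="text/css">', '/');
$close = preg_quote('</style>', '/');

// The s modifier lets the dot match newlines, so multi-line CSS is captured.
$matched = preg_match('/' . $open . '(.+?)' . $close . '/s', $fread, $style);

// $style[1] holds the text between the tags when the match succeeded.
$css = $matched === 1 ? trim($style[1]) : null;
```

This version only matches the exact opening tag <style type="text/css">, so it is stricter than the pattern with character classes posted in the thread.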
Glyde Posted April 18, 2007

preg_match("@<style[^>]+(type=['\"]?[^\"]+['\"]?)?[^>]?>(.+?)</style>@is", $fread, $style);
print_r($style);

My bad, had an extra parenthesis.
sachavdk (Author) Posted April 18, 2007

Indeed, it seems to work now, thanks.
sachavdk (Author) Posted April 18, 2007

Although it works, I'd like to know how it works, if you have time to tell me.
Glyde Posted April 18, 2007

preg_match("@<style[^>]+(type=['\"]?[^'\"]+['\"]?)?[^>]+?>(.+?)</style>@is", $fread, $style);
print_r($style);

Well, I'll go through all of the "regex" parts, because I don't think the plain text parts need explanation.

[^>]+ - Matches one or more characters other than ">", the character that closes the opening tag. It sits between style and type in case the tag has another attribute first, such as an id, or multiple spaces.

(type=['\"]?[^'\"]+['\"]?)? - The parentheses group the whole type attribute. The trailing question mark makes the group optional, so the style block is still found when type="" is omitted.

['\"]? - Some users use single quotes, some use double quotes, some don't use any. That's what this handles.

[^'\"]+ - Matches the value of the type attribute: any run of characters that are not quotes.

[^>]+? - Strapped on in case there is anything after the type attribute.

(.+?) - Matches everything between the style tags, taking as few characters as possible, so it stops at the first </style>.
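The same breakdown can be written into the pattern itself. The sketch below is not from the thread: it adds the x (extended) modifier so whitespace and "#" comments inside the pattern are ignored, and it uses a nowdoc string, so the quotes no longer need the backslashes that the double-quoted PHP string required. The $subject sample is an assumption for illustration.

```php
<?php
// Same pattern as in the post, spread over several lines with the x modifier.
$pattern = <<<'REGEX'
@
<style                          # literal tag name
[^>]+                           # one or more characters that are not ">"
(type=['"]?[^'"]+['"]?)?        # optional type attribute, quoted or not (group 1)
[^>]+?                          # at least one more non-">" character, as few as possible
>                               # end of the opening tag
(.+?)                           # the tag body, as short as possible (group 2)
</style>                        # literal closing tag
@isx
REGEX;

// Sample input (an assumption for illustration).
$subject = "<style type=\"text/css\">\n.style1 { font-size: 11px; }\n</style>";

$found = preg_match($pattern, $subject, $style);
print_r($style);
```

With this sample, $style[2] holds the CSS but $style[1] comes back empty. The greedy [^>]+ in front of the optional group consumes the type attribute first, and because the group is optional the engine is free to skip it.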
sachavdk (Author) Posted April 18, 2007

Maybe some last questions: what is the "is" doing after the @ delimiter? And if I just replace style with body, it doesn't work. I'm doing:

$fread = fread($cont, filesize($file));
preg_match("@<body[^>]+(type=['\"]?[^'\"]+['\"]?)?[^>]+?>(.+?)</body>@is", $fread, $fbody);
preg_match("@<style[^>]+(type=['\"]?[^'\"]+['\"]?)?[^>]+?>(.+?)</style>@is", $fread, $fstyle);

$fstyle[2] contains the content between the style tags, but $fbody[2] is empty.

And last, shouldn't $fread[1] contain "text/css"? Because it is empty...
Glyde Posted April 18, 2007

$fread[1] will not contain text/css; $fstyle[1] should contain that.

i and s are pattern modifiers.

i (PCRE_CASELESS)
If this modifier is set, letters in the pattern match both upper and lower case letters.

s (PCRE_DOTALL)
If this modifier is set, a dot metacharacter in the pattern matches all characters, including newlines. Without it, newlines are excluded. This modifier is equivalent to Perl's /s modifier. A negative class such as [^a] always matches a newline character, independent of the setting of this modifier.
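The two modifiers can be seen in action in a small sketch. It also swaps the [^>]+ ... [^>]+? pair for a single [^>]*, because the thread's pattern needs at least two characters between the tag name and ">". That is a likely reason the body variant found nothing if the document has a bare <body> tag. The tag_content() helper and the sample HTML are hypothetical, not from the thread.

```php
<?php
// Hypothetical helper, not from the thread: returns the inner text of the
// first <$tag> element in $html, or null when there is no match.
function tag_content(string $html, string $tag): ?string
{
    $name = preg_quote($tag, '@');
    // \b keeps "body" from matching a longer tag name; [^>]* allows zero attributes.
    $pattern = '@<' . $name . '\b[^>]*>(.*?)</' . $name . '>@is';
    return preg_match($pattern, $html, $m) === 1 ? $m[1] : null;
}

// Sample input with upper-case tags and content spread over several lines.
$html = "<HTML><head><style type='text/css'>\n.style1 { font-size: 11px; }\n</style></head>\n<BODY>\nHello\n</BODY></HTML>";

$css  = tag_content($html, 'style');   // crosses newlines because of s
$body = tag_content($html, 'body');    // matches <BODY> because of i

// Without s the dot stops at the first newline, so the multi-line body is not found.
$withoutS = preg_match('@<body\b[^>]*>(.*?)</body>@i', $html);
```

A regex like this is fine for a quick extraction from a known file. For arbitrary HTML, a real parser is the safer choice.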