
Why isn't this Regex capturing everything between the <body> tags?


MySQL_Narb


1. The content you are matching against doesn't have a closing body tag

 

2. You need the 's' modifier, which makes the dot match newline characters as well - otherwise "." stops at the end of each line, so the match can never span multiple lines

 

The page you are using doesn't allow you to add that flag. Try a different one (e.g. http://regex101.com/) or test it yourself.

 

  • Abusing regexes to write a wacky HTML parser is crap.
  • Where's the closing tag?
  • That site has no flag for making "." match newlines. At least it's not documented.

 

 

 


 

Sorry, somehow the HTML I copied in must have gotten cut off. Even when there is a closing tag, it does not get a match.

 

And I'm actually doing this in JavaScript, and as far as I know it doesn't have an HTML parser like PHP does. This was the only regex-specific section I saw, so I just posted it here.

 

And I'll look for the "s" parameter, but I thought that was basically "m"
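
For reference, "m" and "s" are not the same thing in most regex flavors: "m" only changes where ^ and $ can match, while "s" changes what the dot matches. A minimal sketch of the difference in JavaScript, using a made-up two-line string (the sample text and variable names are assumptions for illustration, not from the original post):

```javascript
// Hypothetical sample input, two lines separated by a newline.
const text = 'first line\nsecond line';

// With "m", ^ matches at the start of every line, not just the start of the string.
const lineStarts = text.match(/^\w+/gm);

// Without "m", ^ only matches at the very start of the string.
const stringStart = text.match(/^\w+/g);

// "m" does nothing for the dot: ".*" still stops at the newline.
const dotWithM = /first.*second/m.test(text);

// A character class such as [\s\S] matches any character, newlines included.
const anyChar = /first[\s\S]*second/.test(text);

console.log(lineStarts, stringStart, dotWithM, anyChar);
```

So adding "m" to the body pattern would not help here; the dot itself is the part that has to match across lines.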

In JavaScript there is no s flag (it was only added later, in ES2018), so the dot never matches newline characters. Instead, use [^] (or the more portable [\s\S]) to match any character, newlines included.

 

Edit: Working example http://regexr.com/38n74
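
In case the link above stops working, here is a rough sketch of the same idea. The sample HTML, the pattern, and the extractBody helper are all assumptions for illustration (the original pattern and page from the question are not shown in this thread), and it uses [\s\S] instead of [^] because that form also works outside JavaScript:

```javascript
// Hypothetical sample input; the HTML from the original question is not available.
const html = [
  '<html>',
  '<body class="home">',
  '  <p>Hello</p>',
  '</body>',
  '</html>'
].join('\n');

// "." does not match newlines, so this pattern fails on multi-line input.
const dotPattern = /<body[^>]*>(.*)<\/body>/i;

// [\s\S] matches any character, including newlines; *? keeps the match lazy
// so it stops at the first closing tag.
const anyCharPattern = /<body[^>]*>([\s\S]*?)<\/body>/i;

// Illustrative helper: returns the text between the body tags, or null.
function extractBody(source) {
  const match = anyCharPattern.exec(source);
  return match ? match[1] : null;
}

const dotResult = dotPattern.exec(html);
const bodyContent = extractBody(html);

console.log(dotResult);
console.log(JSON.stringify(bodyContent));
```

This is only good enough for quick extraction from markup you control. It will trip over things like a literal "</body>" inside a script or comment, which is why a real parser is the safer choice.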

 

 

 

And I'm actually doing this in JavaScript, and I don't think it has an HTML parser like PHP that I know of.

Umm, the DOM?

// getElementsByTagName (plural) returns a collection, so take the first entry
var body = document.getElementsByTagName('body');
alert(body[0].innerHTML);

