Match all characters but a word


freshwebs

How do I match all characters except for a certain word? I need to match text that may contain line breaks, but stop when it reaches a </div> tag. I've just been using "<div>([[:print:][:space:]]*?)</div>", but I feel like there's a better way to do it.

Thanks for any help!

Aaron


That's it right there; this is a great use of lazy quantifiers (the *? in your example). You could get fancy (read: crazy) and do something like:
[code]preg_match_all('/<div>([^<]*(?:<(?!\/div>)[^<]*)*)<\/div>/is', $html, $matches);[/code]
But at some point you're really just beating a dead horse. Which way is easier? To me, hands-down your way (I just cooked this one up as a "devil's advocate" counter example, plus regex is FUN!). When I scrape things, I do it the way you do:
[code]preg_match_all('/<start tag>.*?<\/end tag>/s', $html, $matches);[/code]
As long as it matches the right stuff in polynomial time, I'd say that's a pretty good way to do it.
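
For anyone who wants to compare the two approaches side by side, here is a minimal sketch. The sample $html string and the $lazy / $unrolled variable names are made up for illustration, and neither pattern tries to handle nested <div> tags.

```php
<?php
// Hypothetical sample input: two <div> blocks, the first spanning two lines.
$html = "<div>first\nblock</div><p>skip</p><div>second <b>bold</b> block</div>";

// Lazy quantifier: .*? grows one character at a time until </div> can match.
// The s modifier lets . match newlines; i makes the tag names case-insensitive.
preg_match_all('/<div>(.*?)<\/div>/is', $html, $lazy);

// "Fancy" alternative: consume runs of non-"<" characters, and accept a "<"
// only when it does not begin a closing </div> tag (negative lookahead).
preg_match_all('/<div>([^<]*(?:<(?!\/div>)[^<]*)*)<\/div>/i', $html, $unrolled);

// Index 1 holds the captured contents of each <div>.
print_r($lazy[1]);
print_r($unrolled[1]);
```

On this input both patterns should capture the same two strings, which is the point made above: the lazy version is easier to read and does the same job.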