I have to extract info from two different types of paragraph tags on a page.

 

The first is

<p><a href="link.html">text</a></p>

The second is

<p class="row">
<span class="hh" id="rag"> </span>
<a href="link.html">text -</a></p>

 

Using

<[p^>]+>

gets the first kind, but I'm having problems matching the tag with an attribute.

I need to get everything in between the paragraph tags from both styles.

Any help would be appreciated.

Garethp, <p[^>]+> will not match <p>; you would have to use an asterisk instead of the plus. You also have to make your second quantifier lazy, or else it'll match all the way to the last closing tag in the source. Lastly, you misplaced the pattern modifier ;) This will do:

 

~<p\b[^>]*>(.*?)</p>~is

 

I added the word boundary \b so that it doesn't wrongly match tags like pre and param. That matters if those tags appear in the source.
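To see what the word boundary buys you, here is a small sketch. It uses Python's re module rather than PHP, purely so it can be run standalone; the sample markup and the variable names are made up for illustration, and re.I stands in for the i modifier.

```python
import re

# Made-up sample: one real paragraph plus two tags whose names
# merely start with "p".
html = '<pre>code</pre><param name="x"><p class="row">text</p>'

# Without \b, "<p" followed by [^>]* also matches <pre> and <param ...>.
loose = re.findall(r'<p[^>]*>', html, re.I)

# With \b, the tag name has to end right after the "p".
strict = re.findall(r'<p\b[^>]*>', html, re.I)

print(loose)   # ['<pre>', '<param name="x">', '<p class="row">']
print(strict)  # ['<p class="row">']
```

The boundary sits between the p and whatever follows it, so a space or a closing > passes while another letter does not.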

A question mark following a quantifier makes it lazy, meaning that it will match as little as possible. There's a longer explanation and some examples in this thread: http://www.phpfreaks.com/forums/index.php/topic,236933.msg1104789.html#msg1104789
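Here is a rough comparison of the lazy and greedy versions on the two paragraph styles from the question. Again this is a Python sketch, not PHP: the ~ delimiters are dropped, and re.I and re.S are assumed to play the role of the i and s modifiers.

```python
import re

# The two paragraph styles from the question, back to back.
html = '''<p><a href="link.html">text</a></p>
<p class="row">
<span class="hh" id="rag"> </span>
<a href="link.html">text -</a></p>'''

flags = re.I | re.S  # case-insensitive, and "." also matches newlines

# Lazy: each match stops at the first </p> it can reach.
lazy = re.findall(r'<p\b[^>]*>(.*?)</p>', html, flags)

# Greedy: the single match runs on to the last </p> in the source.
greedy = re.findall(r'<p\b[^>]*>(.*)</p>', html, flags)

print(len(lazy))    # 2
print(lazy[0])      # <a href="link.html">text</a>
print(len(greedy))  # 1
```

The lazy pattern returns the contents of each paragraph separately, while the greedy one swallows both paragraphs, including the closing tag of the first, into one capture.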
