I have some HTML code and I'm trying to extract some text between the HTML tags with preg_match.

I have the following regular expression string:

$regex = "/class=\"BVRRReviewText description\">[[:alnum:][:punct:][:word:]\s]+<\/span>/";

This is one of the matches that comes back. (I'm trying to retrieve the text in between the tags)

 

class="BVRRReviewText description">Bought this hat as a gift for two men. The fit was very snug for one of them, and too small for the other. Might work for a child.</span><span class="BVRRReviewTextSuffix">"</span>

 

Why does it stop at the second </span> and not the first? I've tried multiple methods and haven't gotten it to work.

Also, is it possible to return just the text without the HTML tags?

 

Thanks.


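dmaksimov, your character class includes [:punct:], which also matches <, > and /, and + is greedy, so the match runs on to the last </span> instead of stopping at the first. To fix that, make the quantifier non-greedy by adding a question mark after it. To get just the text without the tags, put a capturing group, written as (), around the part of the pattern you want; preg_match then returns that part at index 1 of the matches array.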

 

So if you wanted to use your pattern you would change it to:

$regex = "/class=\"BVRRReviewText description\">([[:alnum:][:punct:][:word:]\s]+?)<\/span>/";
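To see the difference, here is a minimal sketch that runs a greedy and a lazy version of the pattern against the sample markup from the question. The variable names ($html, $greedy, $lazy and the two match arrays) are only for illustration.

```php
<?php
// Sample markup taken from the question.
$html = 'class="BVRRReviewText description">Bought this hat as a gift for two men. '
      . 'The fit was very snug for one of them, and too small for the other. '
      . 'Might work for a child.</span><span class="BVRRReviewTextSuffix">"</span>';

// Greedy: [:punct:] also matches <, > and /, so + runs on to the last </span>.
$greedy = "/class=\"BVRRReviewText description\">([[:alnum:][:punct:][:word:]\s]+)<\/span>/";
// Lazy: +? stops at the first </span> that lets the pattern match.
$lazy   = "/class=\"BVRRReviewText description\">([[:alnum:][:punct:][:word:]\s]+?)<\/span>/";

preg_match($greedy, $html, $greedyMatches);
preg_match($lazy, $html, $lazyMatches);

echo $greedyMatches[1], PHP_EOL; // review text plus the trailing suffix markup
echo $lazyMatches[1], PHP_EOL;   // review text only
```

Index 0 of each array holds the whole match including the class attribute and the closing tag; index 1 holds only what the parentheses captured.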

 

When I parse data with the preg functions I usually write a quick and dirty pattern; in this case I'd do something along the lines of:

 

$regex = '~class="BVRRReviewText description"\>(.*?)\</span\>~';
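If the page holds several reviews, the same quick pattern can be passed to preg_match_all() to collect every captured text at once. This is only a sketch: the sample markup in $html is made up, and the s modifier is an addition here so that . also matches newlines inside a review.

```php
<?php
// Hypothetical markup with two reviews; the second one spans two lines.
$html = '<span class="BVRRReviewText description">Nice hat.</span>' . "\n"
      . '<span class="BVRRReviewText description">Too small,' . "\n"
      . 'but warm.</span>';

// Lazy (.*?) stops at the first </span>; the s modifier lets . match newlines.
$regex = '~class="BVRRReviewText description"\>(.*?)\</span\>~s';

preg_match_all($regex, $html, $matches);

// $matches[1] lists the text captured by the group, one entry per review.
print_r($matches[1]);
```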

 

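Also a note:

\ + * ? [ ^ ] $ ( ) { } = ! < > | : -

 

are characters that can have special meaning in PCRE patterns (as can the dot), so escape them when you mean them literally. Some of them, such as = ! < > : -, are only special in certain contexts (inside (?...) constructs or character classes), so escaping them elsewhere is harmless but optional. A fully escaped version of your pattern would be something along the lines of: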

 

$regex = '~class\="BVRRReviewText description"\>([[:alnum:][:punct:][:word:]\s]+?)\</span\>~';
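Rather than escaping by hand, PHP's preg_quote() can escape the literal parts of a pattern for you. The sketch below assumes the same ~ delimiter as above and builds the pattern from two literal pieces; the variable names ($open, $close, $html) are illustrative only.

```php
<?php
// Literal text that should be matched as-is on either side of the capture.
$open  = preg_quote('class="BVRRReviewText description">', '~');
$close = preg_quote('</span>', '~');

// Only the capturing group between the quoted pieces is real regex syntax.
$regex = '~' . $open . '([[:alnum:][:punct:][:word:]\s]+?)' . $close . '~';

$html = 'class="BVRRReviewText description">Might work for a child.</span>'
      . '<span class="BVRRReviewTextSuffix">"</span>';

if (preg_match($regex, $html, $matches)) {
    echo $matches[1], PHP_EOL; // the text between the tags
}
```

Passing the delimiter as the second argument makes preg_quote() escape it too, should it ever appear in the literal text.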
