preg_match() - wrong match to the regular expression given in pattern


terungwa

I am using preg_match() to match a URL with a regular expression for a video embed function.

I pass the third parameter of preg_match() ($matches), so $matches[0] should contain the text that matched the full pattern. However, in the code example below, the space after the query component of the URL and the first word after that space are captured and stored in $matches[0] as well.

$UserInput = "This is my video embed,  http://www.youtube.com/watch?v=GXHijTS9_2g check it out .";
preg_match('@http:\/\/(?:www.)?(\w*).com\/watch\?v=(\w*\W?\w*)@', $UserInput, $matches);

var_dump($matches) provides the output below:

array (size=4)
  0 => string 'http://www.youtube.com/watch?v=GXHijTS9_2g check' (length=48)
  1 => string '' (length=0)
  2 => string 'youtube' (length=7)
  3 => string 'GXHijTS9_2g check' (length=17)

What am I missing in the regex?

 

Thanks.

The problem is with the part of the regex that matches the v= parameter of the YouTube URL:

v=(\w*\W?\w*)
  • \w* matches zero or more word characters (letters, digits, and underscores).

This matches GXHijTS9_2g

  • \W? matches one non-word character (e.g. space, comma, period, dash) if one is present.

This matches the space after the YouTube URL

  • \w* again matches zero or more word characters.

This matches the first word after the URL ("check")

To solve the problem, remove \W?\w* from the regex.
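As a quick check of that suggestion, here is a minimal sketch that reuses the input string and pattern from the question and only drops the trailing \W?\w* from the last group. The variable name $userInput is just a renamed copy of the original $UserInput; nothing else is changed, so the unescaped dots from the original pattern are still there.

```php
<?php
// Input string taken from the question.
$userInput = "This is my video embed,  http://www.youtube.com/watch?v=GXHijTS9_2g check it out .";

// Original pattern with \W?\w* removed from the last capture group.
// (\w*) now stops at the first non-word character, i.e. the space after the ID.
preg_match('@http:\/\/(?:www.)?(\w*).com\/watch\?v=(\w*)@', $userInput, $matches);

var_dump($matches);
// $matches[0] is 'http://www.youtube.com/watch?v=GXHijTS9_2g'
// $matches[1] is 'youtube'
// $matches[2] is 'GXHijTS9_2g'
```

Note that (\w*) only accepts letters, digits, and underscores, so an ID containing a dash would be cut off at the dash.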

Removing the part below, as suggested, also prevents dashes from being captured, and dashes can be part of the YouTube query string. All I need is to stop spaces from being captured.

\W?\w*

Thanks.
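One possible way to allow dashes while still stopping at a space is a character class such as [\w-]+, which accepts word characters and the dash but nothing else. This is only a sketch and not something confirmed in the thread: the second test string and its video ID (abc-DEF_123) are made up for illustration, and the pattern also escapes the dots and uses + instead of *, which goes beyond what was asked.

```php
<?php
// Assumption: the video ID consists only of letters, digits, underscores, and dashes.
// [\w-]+ matches one or more of those characters and stops at a space.
$pattern = '@http://(?:www\.)?(\w+)\.com/watch\?v=([\w-]+)@';

// Input string from the question.
$userInput = "This is my video embed,  http://www.youtube.com/watch?v=GXHijTS9_2g check it out .";
preg_match($pattern, $userInput, $matches);
// $matches[2] is 'GXHijTS9_2g'

// Hypothetical input with a dash in the ID (the ID is invented for this example).
$dashInput = "Watch http://www.youtube.com/watch?v=abc-DEF_123 now";
preg_match($pattern, $dashInput, $dashMatches);
// $dashMatches[2] is 'abc-DEF_123'

var_dump($matches, $dashMatches);
```

Because @ is the delimiter, the forward slashes do not need escaping. If other characters can appear in the ID, they would have to be added to the character class.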
