
preg_match() - wrong match to the regular expression given in pattern




I am using preg_match() to match a URL against a regular expression for a video embed function.

I have provided the third parameter of preg_match() ($matches), so $matches[0] should contain the text that matched the full pattern. However, in the code example below, the space after the query component of the URL and the first word after that space are also captured and stored in $matches[0].

$UserInput = "This is my video embed,  http://www.youtube.com/watch?v=GXHijTS9_2g check it out .";
preg_match('@http:\/\/(?:www.)?(\w*).com\/watch\?v=(\w*\W?\w*)@', $UserInput, $matches);

var_dump($matches) provides the output below:

array (size=4)
  0 => string 'http://www.youtube.com/watch?v=GXHijTS9_2g check' (length=48)
  1 => string '' (length=0)
  2 => string 'youtube' (length=7)
  3 => string 'GXHijTS9_2g check' (length=17)

What am I missing in the regex?

 

Thanks.

Edited by terungwa

The problem is with the part of the regex that matches the v= part of the YouTube URL:

v=(\w*\W?\w*)
  • \w* matches zero or more word characters (letters, digits, and underscores).

This matches GXHijTS9_2g

  • \W? optionally matches a single non-word character (e.g. a space, comma, or period).

This matches the space after the YouTube URL

  • \w* again matches zero or more word characters.

This matches the first word after the URL
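To see the three steps above in isolation, here is a minimal sketch that splits the original group into three separate capture groups and runs it against only the text that follows v=. The variable names $tail and $parts are made up for this illustration and are not from the original post.

```php
<?php
// Text that follows "v=" in the original input.
$tail = "GXHijTS9_2g check it out .";

// The original group (\w*\W?\w*) split into three groups,
// so each part's contribution shows up separately.
preg_match('@^(\w*)(\W?)(\w*)@', $tail, $parts);

var_dump($parts[1]); // what the first \w* consumed (the video ID)
var_dump($parts[2]); // what \W? consumed (the single space)
var_dump($parts[3]); // what the second \w* consumed (the next word)
```

The second and third groups are what pull the space and the following word into the match.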

 

To solve the problem, remove \W?\w* from the regex.
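As a rough sketch of that change, the snippet below reuses the input string and pattern from the question and only drops \W?\w* from the last group. The dots in www. and .com are left unescaped as in the question; writing them as \. would restrict them to a literal period.

```php
<?php
$UserInput = "This is my video embed,  http://www.youtube.com/watch?v=GXHijTS9_2g check it out .";

// Same pattern as in the question, with \W?\w* removed from the last group.
preg_match('@http:\/\/(?:www.)?(\w*).com\/watch\?v=(\w*)@', $UserInput, $matches);

var_dump($matches[0]); // full match: stops at the end of the video ID
var_dump($matches[1]); // site name
var_dump($matches[2]); // video ID only
```

With this pattern the match ends at the first non-word character after the ID, so the trailing space and "check" are no longer included.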

Edited by Ch0cu3r

Removing \W?\w* as suggested will also prevent dashes from being captured. Dashes are part of the YouTube query string. All I need is to prevent spaces from being captured.

Thanks.
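One possible way to allow dashes while still stopping at a space is a character class such as [\w-]+ for the ID. This is only a sketch and not necessarily the fix the thread settled on. The input string and the video ID abc-DEF_123 below are invented for illustration, and the dots are escaped here as an extra assumption.

```php
<?php
// Hypothetical input: the video ID is made up so that it contains a dash.
$UserInput = "Watch http://www.youtube.com/watch?v=abc-DEF_123 right now";

// [\w-]+ accepts word characters and dashes, but not spaces,
// so the capture ends at the first space after the ID.
preg_match('@http:\/\/(?:www\.)?(\w+)\.com\/watch\?v=([\w-]+)@', $UserInput, $matches);

var_dump($matches[1]); // site name
var_dump($matches[2]); // video ID, including the dash
```

Placing the dash last inside the brackets keeps it a literal dash rather than the start of a range.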