how to force the substring to be considered the full match


dsaba

Here is a general pattern for matching emails:

~\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b~i

This will match the entire email.

Here's the same pattern, but with the '@' symbol put into the first substring match:

~\b[A-Z0-9._%+-]+(@)[A-Z0-9.-]+\.[A-Z]{2,4}\b~i

 

 

Now, is there a way to make this first substring match the full pattern match?

So instead of matching the whole email, I would only want to match the '@' in the email. Take note: I want to match this as the FULL pattern match, not as a substring.

I am asking to see this done with this sample email pattern, but in essence I am asking whether there is a method to do this universally, on any regex pattern where you want to force a substring to be the full pattern match, or whether it depends on the particular pattern.

-thank you


I don't see your point. Per the docs, the full pattern match is always at index 0, followed by the captured groups. If you want the first capture to be at index 0, you can modify the array, or write your code to treat it that way. If you're using preg_match_all, PREG_SET_ORDER might be what you want.
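To illustrate the indexing described above, here is a minimal sketch using the pattern from the first post. The sample input and the helper names `capturesOnly` and `firstCaptures` are made up for illustration; it is one possible way to "modify the array", not the only one.

```php
<?php
// Pattern from the first post, with '@' in capture group 1.
const EMAIL_WITH_AT_GROUP = '~\b[A-Z0-9._%+-]+(@)[A-Z0-9.-]+\.[A-Z]{2,4}\b~i';

// Hypothetical helper: return only the capture groups, re-indexed from 0,
// so the first capture sits where the full match normally would.
function capturesOnly(string $pattern, string $subject): array
{
    if (preg_match($pattern, $subject, $m) !== 1) {
        return []; // no match (or a pattern error)
    }
    array_shift($m); // drop index 0, the full match
    return $m;
}

// Hypothetical helper: with PREG_SET_ORDER each element of $sets is one
// complete match, so column 1 is the first capture of every match.
function firstCaptures(string $pattern, string $subject): array
{
    preg_match_all($pattern, $subject, $sets, PREG_SET_ORDER);
    return array_column($sets, 1);
}

$subject = 'Contact: someone@example.com'; // made-up sample input

preg_match(EMAIL_WITH_AT_GROUP, $subject, $m);
echo $m[0], "\n"; // full match: the whole address
echo $m[1], "\n"; // first capture: the '@'

echo capturesOnly(EMAIL_WITH_AT_GROUP, $subject)[0], "\n"; // now the '@'
```

Note that this only rearranges the result array after the fact; the regex engine still reports the whole address as the full match.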

"I don't see your point."

The point was to find out if this can be done.

 

I saw a thread earlier where you did this:

~(?<=<price>\$\x20)\d+|(?<=\.)\d+(?=</price>)~e

 

while it could have been written like this:

~<price>\$ ([0-9]{2})\.([0-9]{2})</price>~

 

You see how what my version matches in subgroups has been manipulated in yours to be the full match; it looks like you used lookaheads/lookbehinds to do this.

This is where my question came from.


"I don't see your point."

"The point was to find out if this can be done."

 

As is, the way the function is defined and written--no.

 

"You see how what my version matches in subgroups has been manipulated in yours to be the full match; it looks like you used lookaheads/lookbehinds to do this."

 

My pattern uses lookarounds which only match positions, not content; therefore, the captured content happened to be the full match. It was captured in order to utilize $1 in the replacement; however, it seems that not capturing anything and using $0 works.
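As a rough sketch of the difference, the snippet below runs both price patterns from this thread against a made-up input. The helper name `fullMatches` and the sample string are assumptions for illustration, and the `e` modifier is left off because PHP 7 and later no longer support it.

```php
<?php
// Lookaround version quoted earlier in the thread, minus the `e` modifier.
const PRICE_LOOKAROUND = '~(?<=<price>\$\x20)\d+|(?<=\.)\d+(?=</price>)~';
// Capture-group version from the same post.
const PRICE_GROUPS = '~<price>\$ ([0-9]{2})\.([0-9]{2})</price>~';

// Hypothetical helper: list every full match (index 0) of $pattern.
function fullMatches(string $pattern, string $subject): array
{
    preg_match_all($pattern, $subject, $m);
    return $m[0];
}

$xml = '<price>$ 12.34</price>'; // made-up sample input

print_r(fullMatches(PRICE_LOOKAROUND, $xml)); // just the two numbers
print_r(fullMatches(PRICE_GROUPS, $xml));     // the whole element

// No capture group needed: $0 in the replacement is the full match.
echo preg_replace(PRICE_LOOKAROUND, '[$0]', $xml), "\n";
```

With the lookaround pattern the surrounding tags are only asserted, never consumed, so each full match is one of the numbers and `$0` is enough in the replacement.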


If what you're looking for is for the matched string to be only the '@', you might be thinking of something like this:

~(?<=\b[A-Z0-9._%+-]+)@(?=[A-Z0-9.-]+\.[A-Z]{2,4}\b)~i

The problem is that lookbehind only works with fixed-length patterns (lookahead has no such restriction). So + and * are not possible inside the lookbehind, and {m,n} is generally rejected too, making the above regex give an error when run.
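One possible workaround, assuming the PCRE build behind your PHP supports the `\K` escape (as far as I know reasonably recent ones do): `\K` keeps everything matched so far out of the reported match, so the variable-length part can sit in front of it instead of inside a lookbehind. The sketch below is only an illustration; the helper names `compiles` and `atSigns` and the sample input are made up.

```php
<?php
// Assumption: \K is available in the PCRE build behind the preg_* functions.
// Everything before \K must match, but is left out of the reported match.
const AT_ONLY = '~\b[A-Z0-9._%+-]+\K@(?=[A-Z0-9.-]+\.[A-Z]{2,4}\b)~i';

// The lookbehind approach from the post above, for comparison
// (with the trailing word boundary inside the lookahead).
const AT_LOOKBEHIND = '~(?<=\b[A-Z0-9._%+-]+)@(?=[A-Z0-9.-]+\.[A-Z]{2,4}\b)~i';

// Hypothetical helper: true if PCRE accepts the pattern at all.
function compiles(string $pattern): bool
{
    // preg_match() returns false on a pattern error; @ hides the warning.
    return @preg_match($pattern, '') !== false;
}

// Hypothetical helper: every full match together with its byte offset.
function atSigns(string $subject): array
{
    preg_match_all(AT_ONLY, $subject, $m, PREG_OFFSET_CAPTURE);
    return $m[0];
}

var_dump(compiles(AT_LOOKBEHIND)); // expected false: unbounded lookbehind
var_dump(compiles(AT_ONLY));
print_r(atSigns('a@b.com, c@d.org')); // each entry is ['@', offset]
```

This is not a fully universal trick: `\K` only trims what comes before the part you want, and a lookahead handles what comes after, so a pattern with several separate substrings of interest would still need capture groups.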
