
how to force the substring to be considered the full match


dsaba


Here is a general pattern for matching emails:

~\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b~i

This will match the entire email address.

Here's the same pattern, but with the '@' symbol placed in the first substring (capture group) match:

~\b[A-Z0-9._%+-]+(@)[A-Z0-9.-]+\.[A-Z]{2,4}\b~i

 

 

Now, is there a way to make this first substring match be the full pattern match?

So instead of matching the whole email, I would only want to match the '@' in the email. Take note: I want to match this as the FULL pattern match, and not in the substring.

I am asking to see this done with this sample email pattern, but in essence I am asking whether there is a universal method for forcing a substring to be the full pattern match in any regex, or whether it depends on the particular pattern.

Thank you

I don't see your point. Per the docs, the full pattern match is always at index 0, followed by the captured groups. If you want the first capture to be at index 0, you can modify the array, or write your code to treat it that way. If you're using preg_match_all, PREG_SET_ORDER might be what you want.
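
To make the indexing concrete, here is a minimal sketch. It is written with Python's re module purely as an assumption for illustration (the thread itself is about PHP's preg functions, where $matches[0] and $matches[1] play the same roles), and the sample address is made up.

```python
import re

# Pattern from the first post, with the '@' in capture group 1.
EMAIL = re.compile(r"\b[A-Z0-9._%+-]+(@)[A-Z0-9.-]+\.[A-Z]{2,4}\b", re.IGNORECASE)

# Hypothetical sample input.
m = EMAIL.search("contact: someone@example.com")

full_match = m.group(0)     # index 0 is always the whole match
first_capture = m.group(1)  # index 1 is the first captured group

print(full_match)
print(first_capture)
```

Reading index 1 instead of index 0 is the "treat it so in your code" option: the pattern stays unchanged and only the lookup differs.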

"I don't see your point."

The point was to find out if this can be done.

 

I saw a thread earlier where you did this:

~(?<=<price>\$\x20)\d+|(?<=\.)\d+(?=</price>)~e

 

While it could have been written like this:

~<price>\$ ([0-9]{2})\.([0-9]{2})</price>~

 

You see how what is matched in subgroups in my version has, in yours, been manipulated to match only as the full match? It looked like you used lookaheads/lookbehinds to do this.

This is where my question came from.

"I don't see your point."

"The point was to find out if this can be done."

 

As is, the way the function is defined and written--no.

 

"You see how what is matched in subgroups in my version has, in yours, been manipulated to match only as the full match? It looked like you used lookaheads/lookbehinds to do this."

 

My pattern uses lookarounds which only match positions, not content; therefore, the captured content happened to be the full match. It was captured in order to utilize $1 in the replacement; however, it seems that not capturing anything and using $0 works.
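
As a rough illustration of lookarounds matching positions rather than content, here is a small sketch using Python's re module (an assumption for demonstration; the original pattern was written for PHP, and the sample string is hypothetical). Both lookbehinds here are fixed-length, so the pattern compiles.

```python
import re

# The lookaround pattern quoted earlier in the thread. The PHP-only "e"
# modifier is dropped because it only affects replacement, not matching.
PRICE_PARTS = re.compile(r"(?<=<price>\$\x20)\d+|(?<=\.)\d+(?=</price>)")

# Hypothetical sample input.
text = "<price>$ 12.34</price>"

# The pattern has no capture groups, so findall returns the full matches.
# The tags, '$', space, and '.' are only asserted, never consumed.
parts = PRICE_PARTS.findall(text)
print(parts)
```

Each full match is just the digits, because everything around them sits inside zero-width assertions.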

If what you're looking for is for the captured string to be only the '@' you might be thinking of something like this:

 

~(?<=\b[A-Z0-9._%+-]+)@(?=[A-Z0-9.-]+\.[A-Z]{2,4})\b~i

 

The problem is that lookbehind only works with fixed-length patterns (lookahead has no such restriction). So no +, *, or {m,n} is possible inside the lookbehind, making the above regex give an error when run.
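
The sketch below illustrates the restriction and one possible workaround. It uses Python's re module as a stand-in (an assumption; the thread concerns PHP's PCRE functions, which also reject a variable-length lookbehind). The workaround shortens the lookbehind to a single character and keeps the lookahead variable-length, which Python's re accepts. It is only an approximation: it checks just the character immediately before the '@', so it is looser than the original email pattern. The sample input is hypothetical.

```python
import re

# Variable-length lookbehind: rejected at compile time by Python's re module.
try:
    re.compile(r"(?<=\b[A-Z0-9._%+-]+)@(?=[A-Z0-9.-]+\.[A-Z]{2,4})\b", re.IGNORECASE)
    variable_lookbehind_ok = True
except re.error:
    variable_lookbehind_ok = False

# Approximation: a one-character (fixed-length) lookbehind plus a
# variable-length lookahead. Only the character right before '@' is checked.
AT_ONLY = re.compile(r"(?<=[A-Z0-9._%+-])@(?=[A-Z0-9.-]+\.[A-Z]{2,4}\b)", re.IGNORECASE)

m = AT_ONLY.search("contact: someone@example.com")  # hypothetical input
print(variable_lookbehind_ok, m.group(0), m.span())
```

With this version the full match (index 0) is the '@' alone, and no capture group is needed.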

