how to force the substring to be considered the full match

dsaba · December 22, 2007

Here is a general pattern for matching emails:

~\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b~i

This will match the entire email.

Here's the same pattern, but with the '@' symbol captured as the first substring match:

~\b[A-Z0-9._%+-]+(@)[A-Z0-9.-]+\.[A-Z]{2,4}\b~i

Now, is there a way to make this first substring match be the full pattern match?

So instead of matching the whole email, I would only want to match the '@' in the email. Take note: I want this to be the FULL pattern match, not a substring match.

I am asking to see this done with this sample email pattern, but in essence I am asking whether there is a universal method for forcing a substring to be the full pattern match in any regex, or whether it depends on the particular pattern.

-thank you

effigy · December 25, 2007

I don't see your point. Per the docs, the full pattern match is always at index 0, followed by the captured groups. If you want the first capture to be at index 0, you can modify the array or write your code to treat it that way. If you're using preg_match_all, PREG_SET_ORDER might be what you want.
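To make the index layout concrete, here is a minimal sketch. It uses Python's `re` module as a stand-in for PHP's preg functions (an assumption for illustration; the thread itself is about PHP), and the helper name `match_parts` and the sample address are hypothetical. Group 0 is the full match and group 1 is the first capture, which mirrors the array layout described above.

```python
import re

# The email pattern from the first post, with '@' in capture group 1.
EMAIL = re.compile(r'\b[A-Z0-9._%+-]+(@)[A-Z0-9.-]+\.[A-Z]{2,4}\b', re.IGNORECASE)

def match_parts(text):
    """Return (full_match, first_capture), or None if nothing matches."""
    m = EMAIL.search(text)
    if m is None:
        return None
    # group(0) is always the whole match; group(1) is the first capture.
    return m.group(0), m.group(1)

parts = match_parts("write to someone@example.com today")
print(parts)
```

The engine does not swap these slots for you; code that wants "just the '@'" reads index 1, or reorders the result itself.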

dsaba · December 25, 2007

"I don't see your point."

The point was to find out if this can be done.

I saw a thread earlier where you did this:

~(?<=<price>\$\x20)\d+|(?<=\.)\d+(?=</price>)~e

while it could have been written like this:

~<price>\$ ([0-9]{2})\.([0-9]{2})</price>~

You see how what my version matched in subgroups has been manipulated to come out as the full match instead; it looked like you used lookaheads/lookbehinds to do this.

This is where my question came from.

effigy · December 26, 2007

"I don't see your point."

"The point was to find out if this can be done."

As is, the way the function is defined and written--no.

"You see how what my version matched in subgroups has been manipulated to come out as the full match instead; it looked like you used lookaheads/lookbehinds to do this."

My pattern uses lookarounds which only match positions, not content; therefore, the captured content happened to be the full match. It was captured in order to utilize $1 in the replacement; however, it seems that not capturing anything and using $0 works.
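A small sketch of that difference, again using Python's `re` as a stand-in for PHP's PCRE functions (an assumption; the sample string and variable names are made up for illustration). The lookaround version consumes only the digits, so each full match is already the wanted content, while the capturing version consumes the whole element and exposes the digits as groups.

```python
import re

PRICE_XML = "<price>$ 12.34</price>"  # hypothetical sample input

# Lookarounds assert what surrounds the digits without consuming it.
LOOKAROUND = re.compile(r'(?<=<price>\$\x20)\d+|(?<=\.)\d+(?=</price>)')

# The capturing version consumes the tags and exposes the digits as groups.
CAPTURING = re.compile(r'<price>\$ ([0-9]{2})\.([0-9]{2})</price>')

# Each full match (group 0) of the lookaround pattern is just the digits.
full_matches = [m.group(0) for m in LOOKAROUND.finditer(PRICE_XML)]

# The capturing pattern's full match is the whole element.
captured_match = CAPTURING.search(PRICE_XML)
whole_element = captured_match.group(0)
captured = list(captured_match.groups())

print(full_matches, whole_element, captured)
```

Both approaches reach the same digits; they differ only in whether the digits arrive as the full match or as numbered groups.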

manixrock · December 29, 2007

If what you're looking for is for the captured string to be only the '@' you might be thinking of something like this:

~(?<=\b[A-Z0-9._%+-]+)@(?=[A-Z0-9.-]+\.[A-Z]{2,4}\b)~i

The problem is that lookbehind only works with fixed-length patterns (lookahead has no such restriction). So no +, *, or {m,n} is possible inside the lookbehind, making the above regex give an error when run.
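The restriction can be checked directly. This sketch uses Python's `re`, whose lookbehind is also fixed-width only; treating it as representative of PCRE's behaviour here is an assumption, and the helper names `compiles` and `at_sign_span` are hypothetical. It also shows the fallback of keeping the capture group and reading its position.

```python
import re

def compiles(pattern):
    """Return True if the pattern compiles, False if the engine rejects it."""
    try:
        re.compile(pattern, re.IGNORECASE)
        return True
    except re.error:
        return False

# A '+' inside a lookbehind makes its width variable, so it is rejected.
VARIABLE_LOOKBEHIND = r'(?<=\b[A-Z0-9._%+-]+)@'
# The same kind of quantifier inside a lookahead is accepted.
VARIABLE_LOOKAHEAD = r'@(?=[A-Z0-9.-]+\.[A-Z]{2,4}\b)'

print(compiles(VARIABLE_LOOKBEHIND), compiles(VARIABLE_LOOKAHEAD))

# Fallback: keep the capture group and ask for its span instead.
EMAIL = re.compile(r'\b[A-Z0-9._%+-]+(@)[A-Z0-9.-]+\.[A-Z]{2,4}\b', re.IGNORECASE)

def at_sign_span(text):
    """Return the (start, end) offsets of the captured '@', or None."""
    m = EMAIL.search(text)
    return m.span(1) if m else None
```

With the capture kept, the '@' is still reachable on its own through group 1 and its offsets, even though the full match remains the whole address.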
