preg - match words that start and end with same letter


scanreg

I have the following pattern that is supposed to match any words that begin and end with the same letter:

 

(^.).*/1$

 

Results should be stuff like:

 

blab

wow

 

However, I don't understand how the pattern above is limited to the same letter at each end of the word. I get that (^.) is the first character and that .*/1$ is the last character.

 

But how does the pattern above know that the first and last characters must match?

 

Thanks


I'm not sure why it's captured like that; capturing the caret doesn't seem to be required. Essentially, the full stop in the brackets captures the first character of the input (the ^ ensures you are matching at the start of the string). The .* then matches the rest of the string (assuming there are no \n characters in it), giving back the final character so the back reference can match it. The /1 is, I suspect, supposed to be a back reference, but I'm not sure what syntax it is using: with the PCRE engine (i.e. the functions beginning preg_) I believe a back reference should use a backslash, i.e. \1. A back reference refers to a value previously captured in the pattern (in this case the first character of the string), so the string has to end with that same character. The dollar sign signifies the end of the string.
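
To make that concrete, here is a minimal sketch, assuming the /1 in the original pattern was meant to be \1 and that the caret sits outside the brackets. The sample words are just picked for illustration.

```php
<?php
// Corrected pattern: \1 is the back reference to the first capture group.
$pattern = '/^(.).*\1$/';

// Sample words are assumptions chosen for illustration.
$words = ['blab', 'wow', 'blah', 'regex'];

$results = [];
foreach ($words as $word) {
    // preg_match returns 1 when the pattern matches and 0 when it does not.
    $results[$word] = preg_match($pattern, $word) === 1;
}

var_export($results);
```

Only blab and wow should come back true, because they are the only words whose last character equals the captured first character.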


Ah, so the back reference contains a numerical reference to the first character, with the value 1:

 

\1

 

That's how it knows to match the first char, am I on target? I think that's what you mean

 

Thanks so much


No, the \1 refers to the first capture group, which means the first set of brackets. So whatever characters are captured in the first set of brackets will be placed back into the pattern wherever you place \1 in the pattern. Whatever is captured by the second set of brackets is placed back using \2 and so on.
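
A quick sketch of the numbering, using a made-up pattern with two capture groups (the pattern and test strings are my own, not from the original question):

```php
<?php
// Hypothetical pattern: two capture groups, referenced in reverse order at the end.
$pattern = '/^(.)(.).*\2\1$/';

// 'abxba': group 1 captures 'a', group 2 captures 'b', and the string ends in 'ba'.
$first = preg_match($pattern, 'abxba', $groups);

// 'abxab' ends in 'ab', so \2\1 (which must be 'ba') cannot match.
$second = preg_match($pattern, 'abxab');

var_export([$first, $second, $groups[1], $groups[2]]);
```

Here \1 stands for whatever the first set of brackets captured and \2 for whatever the second set captured, regardless of where they appear later in the pattern.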


Super thanks :-)


Something like this should do it

'/\b([a-z])[]a-z]*\1\b/i'

 

Two issues though:

a) I think you accidentally inserted a ] in your a-z character class, and

b) Your character class can match zero times (because of the * quantifier).

 

So your pattern would be problematic in something like:

 

$str = 'I took my dad to the bb range!';
preg_match_all('/\b([a-z])[a-z]*\1\b/i', $str, $matches);
echo '<pre>'.print_r($matches[0], true);

 

Both dad and bb will register in the $matches array. I suspect you would need to make the quantifier a +, like so: [a-z]+. This would force the word to be at least 3 characters long (unless of course the OP doesn't mind matching "words" like bb, qq, ii, etc.).
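
For comparison, a small sketch running both quantifiers against the same sample string (the variable names are just for illustration):

```php
<?php
$str = 'I took my dad to the bb range!';

// * lets the middle character class match zero times, so "bb" qualifies.
preg_match_all('/\b([a-z])[a-z]*\1\b/i', $str, $withStar);

// + requires at least one letter between the two ends, so "bb" is skipped.
preg_match_all('/\b([a-z])[a-z]+\1\b/i', $str, $withPlus);

print_r($withStar[0]); // dad and bb
print_r($withPlus[0]); // dad only
```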


Yes, the ] shouldn't be there. It should be:

/\b([a-z])[a-z]*\1\b/i

As for the *, the OP requested that a word begins and ends with the same letter, so it will work with any word two or more letters long. I suppose it should also be made to work for one-letter words like "a" and "I", so it should be:

/\b([a-z])([a-z]*\1)?\b/i
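
A rough check of that last pattern, reusing the sample string from earlier in the thread:

```php
<?php
$str = 'I took my dad to the bb range!';

// The optional group ([a-z]*\1)? lets a lone letter count as a whole word.
preg_match_all('/\b([a-z])([a-z]*\1)?\b/i', $str, $matches);

print_r($matches[0]); // I, dad and bb
```

The single-letter word "I" is now picked up alongside dad and bb, since the part that requires a matching last letter is optional.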
