preg_replace search patterns.

guitarist809 · February 5, 2007

Hello,

I was just looking at php.net's manual on preg_replace (http://www.php.net/preg_replace) and I am totally lost on how it works. For example:

<?php
$string = 'April 15, 2003';
$pattern = '/(\w+) (\d+), (\d+)/i';
$replacement = '${1}1,$3';
echo preg_replace($pattern, $replacement, $string);
?>

I don't even understand what /(\w+) (\d+), (\d+)/i means. As I continued through the page, I saw these patterns: '/(19|20)(\d{2})-(\d{1,2})-(\d{1,2})/' and '/^\s*{(\w+)}\s*=/'... whoa...

If somebody could explain this or point me to a link on how it all works, that would be great.

thanks,

matt n

Balmung-San · February 5, 2007

I think http://www.regexp.info is a good source for Regular Expressions.

Also, check out the Regex forums on this site.

The way preg_replace works is that it uses a Perl-compatible regular expression ($pattern) to search the string ($string), and replaces its matches with a replacement ($replacement).
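To make that concrete, here is a rough walk-through of the manual example quoted in the first post. It is only a sketch: the comments are one reading of the pattern, and `$m` and `$result` are names picked for this illustration.

```php
<?php
// The pattern from the manual example, split into its parts:
//   /       opening delimiter
//   (\w+)   group 1: one or more word characters  -> "April"
//   " "     a literal space
//   (\d+)   group 2: one or more digits           -> "15"
//   ", "    a literal comma and a space
//   (\d+)   group 3: one or more digits           -> "2003"
//   /i      closing delimiter plus the case-insensitive modifier
$string  = 'April 15, 2003';
$pattern = '/(\w+) (\d+), (\d+)/i';

// preg_match fills $m with the whole match followed by each captured group.
preg_match($pattern, $string, $m);
print_r($m);

// In the replacement, ${1}1 means "group 1 followed by a literal 1",
// and $3 is group 3.
$result = preg_replace($pattern, '${1}1,$3', $string);
echo $result, "\n"; // April1,2003
```

The braces in `${1}1` are there so the engine does not read it as group 11.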

guitarist809 · February 5, 2007

That kind of helps. I already knew how preg_replace works; I'm just confused about how the patterns work.

For example, *.txt as a regex is .*\.txt

There are slashes and + signs in these patterns, and I'm lost on how they all work.

Balmung-San · February 5, 2007

The + matches the token right before it (a single character, a character class, or a group) 1 or more times, so if there isn't at least one occurrence, the whole pattern fails. The * matches it 0 or more times, so the pattern can still succeed when there are none.
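A small sketch of the difference between + and *, using made-up test strings (the variable names are just for this example):

```php
<?php
// + needs at least one of the preceding token; * also accepts zero.
$plus = preg_match('/ab+c/', 'ac');    // 0: there is no "b", so + fails
$star = preg_match('/ab*c/', 'ac');    // 1: zero "b"s is fine for *
$many = preg_match('/ab+c/', 'abbbc'); // 1: three "b"s satisfy +

var_dump($plus, $star, $many);
```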

Those \ characters are escape characters. If you want to search for one of the regex reserved characters (such as + or *), you have to escape it first (to match a literal + sign, use \+).
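Here is a quick sketch of why the escape matters, using a made-up subject string. preg_quote is PHP's built-in helper for escaping text that should be matched literally.

```php
<?php
$subject = '1+1=2';

// Unescaped, + is a quantifier: /1+1/ means "one or more 1s, then a 1".
$unescaped = preg_match('/1+1/', $subject);  // 0: the string has no "11"

// Escaped, \+ is a literal plus sign.
$escaped = preg_match('/1\+1/', $subject);   // 1

// preg_quote adds the backslashes for you; the second argument is the delimiter.
$quoted = preg_quote('1+1', '/');            // 1\+1
$viaQuote = preg_match('/' . $quoted . '/', $subject); // 1

var_dump($unescaped, $escaped, $quoted, $viaQuote);
```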

The \w and \d are shorthand character classes. \d matches a digit, and \w matches a "word" character: in PCRE (the engine behind preg_replace) that means A-Z, a-z, 0-9, and _.
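To see what the two shorthands pick up, here is a minimal sketch with an invented sample string:

```php
<?php
$text = 'room_42 opens at 9am';

// \d+ collects runs of digits.
preg_match_all('/\d+/', $text, $digits);
print_r($digits[0]); // 42, 9

// \w+ collects runs of letters, digits, and underscores.
preg_match_all('/\w+/', $text, $words);
print_r($words[0]); // room_42, opens, at, 9am
```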

For your *.txt example, .*\.txt looks for any run of characters followed by .txt. To break it down:

. matches any single character (by default, anything except a newline).

* tells it to repeat the previous token any number of times, including zero.

\. escapes the ., meaning look for a literal dot.

txt matches the literal characters txt, in that exact order.

I'm not sure what the / and /i do, though I know they're a pair.
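On the / and /i: in PHP's preg functions the two slashes are delimiters that mark where the pattern starts and ends, and any letters after the closing delimiter are modifiers. The i modifier makes the match case-insensitive. A short sketch (the # delimiter is just one of the alternatives PHP accepts):

```php
<?php
$string = 'April 15, 2003';

$strict = preg_match('/april/', $string);  // 0: case matters by default
$loose  = preg_match('/april/i', $string); // 1: i ignores case
$hash   = preg_match('#april#i', $string); // 1: same pattern, # as delimiter

var_dump($strict, $loose, $hash);
```

A different delimiter is handy when the pattern itself contains slashes, since they then do not need escaping.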

guitarist809 · February 5, 2007

Hmm, it's starting to make some sense now.

With the anything matcher (the .), does it match single characters such as 'a' or 'b', or does it match multiple characters, like 'abcd' or 'efgh'?

So, if there was a string like "one.txt two.txt. three.txt", would .*\.txt match the following:

one.txt

two.txt

three.txt

or would it match like o.txt n.txt e.txt t.txt w.txt, etc...?

Balmung-San · February 5, 2007

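Well, . by itself will only match one character. However, .* tells it to match any character repeatedly, and it is greedy: it grabs as much as it can and only gives characters back until the rest of the pattern fits. So on your string, .*\.txt matches the whole thing up to the last .txt in one go, not three separate file names. To get each name on its own, use something that can't cross a space, like \S+\.txt.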
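Running the string from the question through preg_match_all shows how .* behaves. This is a sketch; the \S+ variant is one possible way to split on spaces and is not something from the thread.

```php
<?php
$string = 'one.txt two.txt. three.txt';

// .* takes as much as it can, so the whole string comes back as one match.
preg_match_all('/.*\.txt/', $string, $greedy);
print_r($greedy[0]); // one.txt two.txt. three.txt

// \S+ (one or more non-space characters) cannot run past a space,
// so each file name is matched separately.
preg_match_all('/\S+\.txt/', $string, $files);
print_r($files[0]); // one.txt, two.txt, three.txt
```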

snakebit · February 5, 2007

To find this format of date in a text, you can try this:

April 15, 2003

'[A-Za-z]{3,9} [0-9]{1,2}, [0-9]{4}' - this is one solution. I don't say it's perfect, but you can try it.

This means:

[A-Za-z]{3,9} - find a string of letters between 3 and 9 characters long, with a space after it

[0-9]{1,2}, - find a number with 1 or 2 digits, with a comma "," and a space after it

[0-9]{4} - find a number with exactly 4 digits

If you have questions, just ask.
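A sketch of that date pattern in use, with the comma placed after the day so it lines up with "April 15, 2003". The / delimiters are added because the preg functions require them, and the sample sentence is invented.

```php
<?php
$text = 'The meeting moved from April 15, 2003 to May 2, 2003.';

// Month name (3 to 9 letters), space, 1 or 2 digit day, comma, space, 4 digit year.
$pattern = '/[A-Za-z]{3,9} [0-9]{1,2}, [0-9]{4}/';

preg_match_all($pattern, $text, $dates);
print_r($dates[0]); // April 15, 2003 and May 2, 2003
```

Note that this only checks the shape of the text: a word like "Foo" followed by "99, 0000" would pass too.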

guitarist809 · February 5, 2007

So, this is how it works...

[characters and numbers]{lengthmin,lengthmax}any other characters

So, [a-zA-Z0-9]+@[a-zA-Z0-9]+\.[a-zA-Z0-9]{3} would match something similar to an email (I hope lol)

effigy · February 5, 2007

Basically, yes, but the part after the dot can be 2 characters (like .uk) or longer than 3, so {3} is too strict; {2,} would be safer.
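A rough sketch of the email pattern with the last part loosened to two or more letters. The ^ and $ anchors are an addition here so the whole string has to match, and the addresses are hypothetical ones made up for testing.

```php
<?php
// Letters/digits, @, letters/digits, a literal dot, then 2 or more letters.
$pattern = '/^[a-zA-Z0-9]+@[a-zA-Z0-9]+\.[a-zA-Z]{2,}$/';

$com     = preg_match($pattern, 'matt@example.com'); // 1
$uk      = preg_match($pattern, 'matt@example.uk');  // 1: 2-letter ending is allowed
$missing = preg_match($pattern, 'matt@example');     // 0: no dot and ending

var_dump($com, $uk, $missing);
```

This is still far from a full email check (it rejects dots and dashes before the @, for instance). For real input, PHP's filter_var with FILTER_VALIDATE_EMAIL is likely a sturdier choice.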

guitarist809 · February 5, 2007

Oh yeah.

The example I made would allow this email to be validated, right? [email protected] (I hope lol)

One last question (unless the above doesn't work, then I'll have more lol):

What does the ^ mean? Like [^x]?

effigy · February 5, 2007

Test it out

As the first character in a character class, it negates the class. Outside of a character class, it anchors the match to the start of the subject string (or to the start of each line when the m modifier is set).
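Both uses of ^ side by side, as a small sketch with invented strings:

```php
<?php
// Inside a class, a leading ^ negates it: [^x] is "any one character except x".
$masked = preg_replace('/[^x]/', '-', 'axbxc');
echo $masked, "\n"; // -x-x-

// Outside a class, ^ only matches at the start of the subject.
$atStart = preg_match('/^abc/', 'abcdef'); // 1
$later   = preg_match('/^abc/', 'xxabc');  // 0: "abc" is there, but not at the start

var_dump($atStart, $later);
```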

guitarist809 · February 6, 2007

uh...

=\ im lost.

effigy · February 6, 2007

The documentation has been available to you since the very first reply; I suggest going through it:

- See "Negated Character Classes" here.

- See "Using ^..." here.
