Hello,

I was just looking at php.net's manual on preg_replace (http://www.php.net/preg_replace) and I am totally lost on how it works. For example:

<?php
$string = 'April 15, 2003';
$pattern = '/(\w+) (\d+), (\d+)/i';
$replacement = '${1}1,$3';
echo preg_replace($pattern, $replacement, $string);
?> 

 

I don't even understand what /(\w+) (\d+), (\d+)/i means. As I continued through the page, I saw these patterns: '/(19|20)(\d{2})-(\d{1,2})-(\d{1,2})/' and '/^\s*{(\w+)}\s*=/'... whoa...

 

If somebody could explain this or point me to a link on how all this works, that would be great.

 

thanks,

matt n


I think http://www.regexp.info is a good source for Regular Expressions.

 

Also, check out the Regex forums on this site.

 

The way preg_replace works is that it uses a Perl-compatible regular expression ($pattern) to search the string ($string), and replaces its matches with a replacement ($replacement).
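To make that concrete, here is a small sketch using the pattern and string from the question. The group-reordering line at the end is my own illustration, not something from the manual. It uses preg_match() to show what each parenthesised group captures, and then shows how the replacement string refers back to those groups.

```php
<?php
// The pattern and string from the original question.
$string  = 'April 15, 2003';
$pattern = '/(\w+) (\d+), (\d+)/i';

// preg_match() fills $groups: index 0 is the whole match,
// indexes 1 to 3 are the three parenthesised groups.
preg_match($pattern, $string, $groups);
print_r($groups); // 'April 15, 2003', 'April', '15', '2003'

// In the replacement, $1, $2, $3 (or ${1}, ${2}, ${3}) refer to those groups.
// ${1}1 means "group 1 followed by a literal 1"; writing $11 instead
// would be ambiguous with a reference to group 11.
$result = preg_replace($pattern, '${1}1,$3', $string);
echo $result, "\n"; // April1,2003

// Illustration only: reorder the groups into day, month, year.
$swapped = preg_replace($pattern, '$2 $1 $3', $string);
echo $swapped, "\n"; // 15 April 2003
```

So the parentheses decide what gets remembered, and the replacement string decides what gets written back.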

The + matches whatever comes right before it (a single character, a character class, or a group) 1 or more times, so if there isn't at least one occurrence, the whole pattern fails. The * matches it 0 or more times, so the pattern can still succeed when it's absent.
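A quick way to see the difference between + and * is to run both against a string with and without digits. The sample strings here are made up for illustration.

```php
<?php
// \d+ needs at least one digit; \d* is satisfied by zero digits.
$plusWithDigits = preg_match('/ab\d+c/', 'ab123c'); // digits present
$plusNoDigits   = preg_match('/ab\d+c/', 'abc');    // no digits between b and c
$starNoDigits   = preg_match('/ab\d*c/', 'abc');    // no digits, still matches

// preg_match() returns 1 on a match and 0 on no match.
var_dump($plusWithDigits, $plusNoDigits, $starNoDigits); // int(1) int(0) int(1)
```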

 

Those \ characters are escape characters. If you want to search for one of the regex reserved characters (such as + or *), you have to escape it first (for example, to match a literal + sign you'd use \+).
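Here is a minimal sketch of escaping, with sample strings of my own choosing. It also shows preg_quote(), which is PHP's built-in helper for escaping reserved characters when the text to search for comes from somewhere else.

```php
<?php
// Escaped: \+ matches a literal plus sign.
$literalPlus = preg_match('/1\+1/', '1+1=2');

// Unescaped: + is a quantifier, so 1+1 means "one or more 1s, then a 1".
// The string '1+1=2' has no two 1s in a row, so this does not match.
$asQuantifier = preg_match('/1+1/', '1+1=2');

// preg_quote() escapes reserved characters for you; the second argument
// is the delimiter in use, so that it gets escaped as well.
$quoted = preg_quote('1+1', '/');

var_dump($literalPlus, $asQuantifier, $quoted); // int(1) int(0) string '1\+1'
```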

 

The \w and \d are shorthands for character classes. A \d will match a digit, and \w will match a "word" character: A-Z, a-z, any digit, and the underscore (_) in PCRE, which is the engine preg_replace uses.
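If you want to check what the shorthands pick up, something like this works. The input string is just an example I made up; the code collects every character each shorthand matches.

```php
<?php
$input = 'Room_42 (B-wing)';

// Collect every single character matched by \w and by \d.
preg_match_all('/\w/', $input, $wordChars);
preg_match_all('/\d/', $input, $digits);

// Glue the matched characters back together for easy reading.
$wordString  = implode('', $wordChars[0]);
$digitString = implode('', $digits[0]);

echo $wordString, "\n";  // Room_42Bwing
echo $digitString, "\n"; // 42
```

The space, the parentheses, and the hyphen are dropped because none of them is a word character.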

 

For your .*\.txt example, it'll match any run of characters followed by a literal .txt. To break it down:

 

. is the anything matcher.

* will tell it to repeat any number of times

\. will escape the ., meaning look for a literal dot

txt will look for txt in that exact order.
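To see why the \. matters, compare the pattern with and without the backslash. The file names below are invented for the test.

```php
<?php
// With the escaped dot, a literal "." must appear before "txt".
$escapedOnDot        = preg_match('/.*\.txt/', 'notes.txt');
$escapedOnUnderscore = preg_match('/.*\.txt/', 'notes_txt');

// With an unescaped dot, any character is accepted in that position.
$unescapedOnUnderscore = preg_match('/.*.txt/', 'notes_txt');

var_dump($escapedOnDot, $escapedOnUnderscore, $unescapedOnUnderscore);
// int(1) int(0) int(1)
```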

 

I'm not sure what the / and /i do, though I know they're a pair.
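For the record, the two / characters are delimiters that mark where the pattern starts and ends, and the i after the closing delimiter is a modifier that makes the match case-insensitive. A small sketch (the # delimiter and the test strings are my own choices for illustration):

```php
<?php
// Same pattern, without and with the i (case-insensitive) modifier.
$caseSensitive   = preg_match('/april/', 'April 15, 2003');
$caseInsensitive = preg_match('/april/i', 'April 15, 2003');

// The delimiter does not have to be /. Using # is handy when the
// pattern itself contains slashes, such as a URL.
$urlMatch = preg_match('#^http://www\.php\.net/#', 'http://www.php.net/preg_replace');

var_dump($caseSensitive, $caseInsensitive, $urlMatch); // int(0) int(1) int(1)
```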

Hmm, it's starting to make some sense now :)

 

With the anything matcher (the .), does it match single characters such as 'a' or 'b', or does it match multiple characters, like 'abcd' or 'efgh'?

 

So, given a string like "one.txt two.txt. three.txt", would .*\.txt match the following:

one.txt

two.txt

three.txt

or would it match like o.txt n.txt e.txt t.txt w.txt, etc.?
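One way to settle this is to run it. The sketch below uses the string from the question; the \S+ variant is my own suggestion for getting the three file names separately, not something from the manual. In short: a . on its own matches exactly one character, and .* is greedy, so it grabs as much as it can.

```php
<?php
$text = 'one.txt two.txt. three.txt';

// .* is greedy: it runs to the end and backs up to the LAST ".txt",
// so the whole string comes back as a single match.
preg_match_all('/.*\.txt/', $text, $greedy);

// A single . matches exactly one character before ".txt".
preg_match_all('/.\.txt/', $text, $single);

// \S+ (one or more non-whitespace characters) cannot cross the spaces,
// so each file name is matched on its own.
preg_match_all('/\S+\.txt/', $text, $perFile);

print_r($greedy[0]);  // ['one.txt two.txt. three.txt']
print_r($single[0]);  // ['e.txt', 'o.txt', 'e.txt']
print_r($perFile[0]); // ['one.txt', 'two.txt', 'three.txt']
```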

To find a date in this format in a text, you can try this:

 

April 15, 2003

'[A-Za-z]{3,9} [0-9]{1,2}, [0-9]{4}' - this is one solution. I don't say it's perfect, but you can try it ;) With the preg functions, wrap it in delimiters, e.g. /.../.

This means:

[A-Za-z]{3,9} - find a run of 3 to 9 letters (the month name), followed by a space

[0-9]{1,2}, - find a number with 1 or 2 digits (the day), followed by a comma "," and a space

[0-9]{4} - find a number with exactly 4 digits (the year)
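As a rough sketch of how this could be used with preg_match_all(): the pattern below has / delimiters added and the comma placed after the day, which is what the "April 15, 2003" layout needs. The sample sentence is made up for the test.

```php
<?php
// Month name (3 to 9 letters), space, 1 or 2 digit day, comma, space, 4 digit year.
$pattern = '/[A-Za-z]{3,9} [0-9]{1,2}, [0-9]{4}/';
$text    = 'Filed on April 15, 2003 and amended on May 3, 2004.';

// Collect every date-shaped substring in the text.
preg_match_all($pattern, $text, $dates);

print_r($dates[0]); // ['April 15, 2003', 'May 3, 2004']
```

Note that it only checks the shape, so something like "Hello 99, 0000" would also match.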

If you have questions, feel free to ask :)
