guitarist809 Posted February 5, 2007

Hello, I was just looking at php.net's manual on preg_replace (http://www.php.net/preg_replace) and I am totally lost on how it works. For example:

<?php
$string = 'April 15, 2003';
$pattern = '/(\w+) (\d+), (\d+)/i';
$replacement = '${1}1,$3';
echo preg_replace($pattern, $replacement, $string);
?>

I don't even understand what /(\w+) (\d+), (\d+)/i means. As I continued through the page, I saw these patterns: '/(19|20)(\d{2})-(\d{1,2})-(\d{1,2})/' and '/^\s*{(\w+)}\s*=/'... whoa...

If somebody could tell me or direct me to a link on how all this works, that would be great.

thanks, matt n
Balmung-San Posted February 5, 2007

I think http://www.regexp.info is a good source for Regular Expressions. Also, check out the Regex forums on this site. The way preg_replace works is that it uses a Perl-compatible regular expression ($pattern) to search the string ($string), and replaces its matches with a replacement ($replacement).
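To make the pattern/string/replacement roles concrete, here is a commented sketch of the manual example quoted in the first post. The variable names $result and $groups are only illustrative; the preg_match call is added purely to show what each group captures.

```php
<?php
// Example from the PHP manual, as quoted in the first post.
$string = 'April 15, 2003';

// The two / characters are delimiters; the trailing i is a modifier
// that makes matching case-insensitive.
//   (\w+)  group 1: one or more word characters  -> "April"
//   (\d+)  group 2: one or more digits           -> "15"
//   (\d+)  group 3: one or more digits           -> "2003"
$pattern = '/(\w+) (\d+), (\d+)/i';

// ${1}1 means "group 1, then a literal 1". Writing $11 instead would
// be read as a reference to group 11.
$replacement = '${1}1,$3';

$result = preg_replace($pattern, $replacement, $string);
echo $result, "\n"; // April1,2003

// preg_match fills $groups with the full match followed by each group.
preg_match($pattern, $string, $groups);
print_r($groups);
```

Index 0 of $groups holds the whole match and indexes 1 to 3 hold the three groups, which is what $1, $2 and $3 refer to in the replacement.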
guitarist809 Posted February 5, 2007

That kind of helps. I already knew how preg_replace worked; I'm just confused about how the patterns work. Like, *.txt in regex is .*\.txt. There are slashes and + signs in these patterns and I'm lost on how they all work.
Balmung-San Posted February 5, 2007

The + matches the item just before it (a character, character class, or group) 1 or more times, so if it can't find at least one, the whole pattern fails. The * matches it 0 or more times, so the pattern can still succeed when the item is missing.

Those \ characters are escape characters. If you wanted to search for one of the regex reserved characters (such as + or *) you'd have to escape it first (to match a + sign you'd have to use \+). The \w and \d are shorthands for character sets, if I remember correctly. A \d will match a digit, and \w will match A-Z, a-z, any digit, and the underscore.

For your *.txt example, it'll look for any character, any number of times, before a .txt. To break it down:
. is the anything matcher.
* tells it to repeat any number of times.
\. escapes the ., meaning look for a literal dot.
txt looks for txt in that exact order.

I'm not sure what the / and /i do, though I know they're a pair.
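A small sketch of the pieces described above, assuming PHP's preg_match (which returns 1 on a match and 0 otherwise). On the open question about / and /i: in PHP's preg functions the slashes are delimiters that mark where the pattern starts and ends, and the i after the closing delimiter is a modifier for case-insensitive matching. The variable names are only illustrative.

```php
<?php
// + needs at least one occurrence of the preceding item; * accepts zero.
$plusNoB   = preg_match('/ab+c/', 'ac');     // 0: no b, so + fails
$starNoB   = preg_match('/ab*c/', 'ac');     // 1: zero b's is fine for *
$plusManyB = preg_match('/ab+c/', 'abbbc');  // 1: three b's satisfy +

// Escaping: \+ is a literal plus sign, not a quantifier.
$literalPlus = preg_match('/1\+1/', '1+1=2'); // 1

// \d is a digit; \w is a letter, a digit, or an underscore.
$underscoreIsWord = preg_match('/\w/', '_');  // 1
$letterIsDigit    = preg_match('/\d/', 'x');  // 0

// . matches any single character, except a newline by default.
$dotMatchesNewline = preg_match('/a.c/', "a\nc"); // 0

// The i modifier after the closing delimiter ignores case.
$caseSensitive   = preg_match('/april/', 'April');  // 0
$caseInsensitive = preg_match('/april/i', 'April'); // 1

var_dump($plusNoB, $starNoB, $plusManyB, $literalPlus,
         $underscoreIsWord, $letterIsDigit, $dotMatchesNewline,
         $caseSensitive, $caseInsensitive);
```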
guitarist809 Posted February 5, 2007

Hmm, it's starting to make some sense now. With the anything matcher (the .), does it match single characters such as 'a' or 'b', or does it match multiple characters, like 'abcd' or 'efgh'?

So, if there was a string like "one.txt two.txt. three.txt", would .*\.txt match one.txt two.txt three.txt, or would it match like o.txt n.txt e.txt t.txt w.txt, etc.?
Balmung-San Posted February 5, 2007

Well, . by itself will only match one character. However, .* tells it to match any character repeatedly. I believe the regex engine is also smart enough to take a look at what you have next, and if it finds that character it moves onward.
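The question about "one.txt two.txt. three.txt" can be checked directly. The sketch below assumes preg_match_all and compares three patterns; the \S+ variant (one or more non-whitespace characters) is an alternative added here for illustration, not something proposed in the thread. By default .* is greedy: it takes as much as it can, then gives characters back until the rest of the pattern fits.

```php
<?php
$subject = 'one.txt two.txt. three.txt';

// Greedy: .* runs to the end of the string, then backs off just far
// enough for \.txt to match, so the whole string is a single match.
preg_match_all('/.*\.txt/', $subject, $greedy);

// Lazy: .*? takes as little as possible, so each match stops at the
// first .txt it can reach (leading spaces and dots come along).
preg_match_all('/.*?\.txt/', $subject, $lazy);

// \S+ cannot cross whitespace, which yields one match per file name.
preg_match_all('/\S+\.txt/', $subject, $perFile);

print_r($greedy[0]);  // one match: the entire string
print_r($lazy[0]);    // 'one.txt', ' two.txt', '. three.txt'
print_r($perFile[0]); // 'one.txt', 'two.txt', 'three.txt'
```

So neither guess in the question is quite what happens with .*\.txt: it returns one long match, and getting the three file names separately takes a more restrictive pattern.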
snakebit Posted February 5, 2007

To find this format of date in a text you can try this:

April 15, 2003
'[A-Za-z]{3,9} [0-9]{1,2}, [0-9]{4}'

This is one solution; I don't say it's perfect, but you can try it. This means:
[A-Za-z]{3,9} - find a string of letters between 3 and 9 characters long, with a space after it
[0-9]{1,2}, - find a number with 1 or 2 digits, with a comma "," and a space after it
[0-9]{4} - find a number with exactly 4 digits

If you have questions, you can ask.
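A quick way to try the date idea, as a sketch: the pattern below puts the comma after the day, which is how the sample date "April 15, 2003" is written, and adds the / delimiters that preg functions require. The sample sentence in $text is made up for the test.

```php
<?php
// Hypothetical sample text containing two dates in "Month D, YYYY" form.
$text = 'The meeting moved from April 15, 2003 to May 2, 2003.';

// [A-Za-z]{3,9}  3 to 9 letters (the month name)
// [0-9]{1,2},    1 or 2 digits followed by a comma (the day)
// [0-9]{4}       exactly 4 digits (the year)
$pattern = '/[A-Za-z]{3,9} [0-9]{1,2}, [0-9]{4}/';

preg_match_all($pattern, $text, $matches);
print_r($matches[0]); // 'April 15, 2003' and 'May 2, 2003'
```

Note that {3,9} only checks the length of the word, so any 3 to 9 letter word followed by a number, a comma and four digits would also match.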
guitarist809 Posted February 5, 2007

So, this is how it works...

[characters and numbers]{lengthmin,lengthmax}any other characters

So, [a-zA-Z0-9]+@[a-zA-Z0-9]+\.[a-zA-Z0-9]{3} would be something similar to an email (I hope, lol).
effigy Posted February 5, 2007

Basically, yes, but there can be a 2 character domain.
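To illustrate the point about 2-character domains, here is a sketch that compares the pattern from the thread with a variant using {2,3} for the last part. The addresses and variable names are made up for the example, and neither pattern is a complete email validator.

```php
<?php
// Pattern from the thread: exactly 3 characters after the dot.
$threeOnly  = '/[a-zA-Z0-9]+@[a-zA-Z0-9]+\.[a-zA-Z0-9]{3}/';
// Variant: 2 or 3 characters after the dot.
$twoOrThree = '/[a-zA-Z0-9]+@[a-zA-Z0-9]+\.[a-zA-Z0-9]{2,3}/';

// Hypothetical test addresses.
$addresses = ['user@example.com', 'user@example.de'];

$results = [];
foreach ($addresses as $address) {
    $results[$address] = [
        'threeOnly'  => preg_match($threeOnly, $address),
        'twoOrThree' => preg_match($twoOrThree, $address),
    ];
}
print_r($results);
// user@example.com matches both patterns;
// user@example.de matches only the {2,3} variant.
```

For real validation, PHP's built-in filter_var($email, FILTER_VALIDATE_EMAIL) is usually a safer starting point than a hand-written pattern.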
guitarist809 Posted February 5, 2007

Oh yeah. The example I made would allow this email to be validated, right?

username0573@domain032.com

(I hope, lol)

One last question (unless the above doesn't work, then I'll have more, lol): what does the ^ mean, like in [^x]?
effigy Posted February 5, 2007

Test it out.

As the first character in a character class, ^ negates the class. Outside of a character class, it asserts the beginning of a line.
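Both meanings of ^ can be tried out with a few preg_match calls. This is a sketch with made-up sample strings; the last part also tests the address from the thread, with ^ and $ added as an assumption so the whole string has to fit the pattern. In PHP, ^ outside a class matches at the start of the subject by default, and at the start of every line only when the m modifier is used.

```php
<?php
// Inside [...] as the first character, ^ negates the class.
$onlyX    = preg_match('/[^x]/', 'xxx'); // 0: every character is an x
$hasNonX  = preg_match('/[^x]/', 'xax'); // 1: the a is "not x"

// Outside a class, ^ anchors the match to the start of the subject.
$catAtStart = preg_match('/^cat/', 'catalog'); // 1
$catLater   = preg_match('/^cat/', 'concat');  // 0: cat is not at the start

// With the m modifier, ^ also matches right after each newline.
$lineCount = preg_match_all('/^\w+/m', "one\ntwo\nthree", $lineStarts); // 3

// The thread's email pattern, anchored with ^ and $ so that the whole
// string must match, tested against the address from the thread.
$email = '/^[a-zA-Z0-9]+@[a-zA-Z0-9]+\.[a-zA-Z0-9]{3}$/';
$valid = preg_match($email, 'username0573@domain032.com'); // 1

var_dump($onlyX, $hasNonX, $catAtStart, $catLater, $lineCount, $valid);
```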
guitarist809 Posted February 6, 2007

uh... =\ I'm lost.
effigy Posted February 6, 2007

The documentation has been available to you since the very first reply; I suggest going through it:
- See "Negated Character Classes" here.
- See "Using ^..." here.