[SOLVED] How does this one work?

kratsg · October 14, 2007

<?php
$string = <<<EOF
<img src="http://www.example.com/path/image.gif" height="500" width="500">
<img src="http://www.example.com/path/image2.gif" width="500" 
height="500">
<img src="http://www.example.com/path/image3.gif" >
<img     height="500"     width="500" 
src="http://www.example.com/path/image4.gif"     >
EOF; 
$pattern = '/<img\s.*?src="(.*?)".*?>/s';
$repl = '<img src="createthumb.php?src=$1&w=100">';
echo preg_replace($pattern, $repl, $string );
?>

Saw this example, but it didn't come with an explanation o_o It makes literally no sense to me. (I know the basics of regex.) Can someone break it down? :-D

MadTechie · October 14, 2007

OK i hope this helps...

<?php
//Set String
$string = '
<img src="http://www.example.com/path/image.gif" height="500" width="500">
<img src="http://www.example.com/path/image2.gif" width="500" 
height="500">
<img src="http://www.example.com/path/image3.gif" >
<img     height="500"     width="500" 
src="http://www.example.com/path/image4.gif"     >
';

//The Find
$pattern = '/<img\s.*?src="(.*?)".*?>/s';
//The Replace
$repl = '<img src="createthumb.php?src=$1&w=100">';
//Do the Find and Replace and print results
echo preg_replace($pattern, $repl, $string );
?>

Okay..

The Find

My sample text is this:

hello <img src="http://www.example.com/path/image.gif" height="500" width="500"> world

1. Match the characters

<img

Now We have

<img

2. Match a single "whitespace character" (spaces, tabs, line breaks, etc.); that's the \s

Now We have

<img

(note the space at the end)

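3. Match any character, as few times as possible; that's the .*? (the ? after the * makes it lazy). Normally the . doesn't match a line break, but the /s at the end of the pattern lets it.

Now we have

<img

Still the same. To explain a little more: a lazy match only grabs another character when the rest of the pattern can't match at the current position. Here the next part of the pattern is 'src="', and it matches straight away, so .*? adds nothing and the result is no different.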

4. Match the characters src=" literally

Now we have

<img src="

5. Matches everything until " (6.) is found, and stores it as backreference number 1 (.*?)

Now we have

<img src="http://www.example.com/path/image.gif

and backreference1 = http://www.example.com/path/image.gif

The reason it gets stored is that it's in brackets (); I'm English, we call them brackets (parentheses elsewhere).

6. Match the character "

Now we have

<img src="http://www.example.com/path/image.gif"

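7. Lazily match any characters until the first > is found, then match the >; that's the .*?>

The /s at the very end isn't part of what gets matched: the / closes the pattern and the s is a modifier that lets the . (dot) match line breaks too. That's what lets the pattern handle tags that are split over two lines.

Okay, that probably confused you, but I expect other members to help, or if you're confused, point out which part.. but I'll continue..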

Now remember the sample text was

hello <img src="http://www.example.com/path/image.gif" height="500" width="500"> world

the find found

<img src="http://www.example.com/path/image.gif" height="500" width="500">

and stored

http://www.example.com/path/image.gif

in backreference number 1
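
If you want to check the find step on its own, here is a minimal sketch using preg_match() on the sample text from above. The variable names ($text, $m, $found) are just ones picked for this illustration.

```php
<?php
// Sample text from the walkthrough above.
$text = 'hello <img src="http://www.example.com/path/image.gif" height="500" width="500"> world';
$pattern = '/<img\s.*?src="(.*?)".*?>/s';

// preg_match() fills $m: index 0 is the whole match,
// index 1 is whatever the first (...) group captured.
$found = preg_match($pattern, $text, $m);

echo $m[0], "\n"; // the whole <img ...> tag
echo $m[1], "\n"; // just the URL, i.e. backreference number 1
```

Index 1 of $m holds the same text that $1 refers to in the replacement string.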

so we move on..

the Replace...

<img src="createthumb.php?src=$1&w=100">

What this does is replace what we found

<img src="http://www.example.com/path/image.gif" height="500" width="500">

with

<img src="createthumb.php?src=$1&w=100">

BUT..

Note the src=$1

the $1 means backreference number 1

so in fact we replace

<img src="http://www.example.com/path/image.gif" height="500" width="500">

with

<img src="createthumb.php?src=http://www.example.com/path/image.gif&w=100">
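
The whole find and replace on the sample text can be sketched like this. It reuses the pattern and replacement from the original example; $text and $result are names chosen for the illustration.

```php
<?php
$text = 'hello <img src="http://www.example.com/path/image.gif" height="500" width="500"> world';

$pattern = '/<img\s.*?src="(.*?)".*?>/s';
$repl    = '<img src="createthumb.php?src=$1&w=100">';

// Only the matched tag is swapped out; "hello " and " world" are left alone.
$result = preg_replace($pattern, $repl, $text);

echo $result, "\n";
// hello <img src="createthumb.php?src=http://www.example.com/path/image.gif&w=100"> world
```

Note the height and width attributes are gone, because they were part of the match but not part of backreference 1.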

i'll await your questions

kratsg · October 14, 2007

I'm a bit lost on two parts, the lazyness, and the backreferences (I've never heard of those.. ever)

Lazyness:

.*?

I know the "." means to match any non-whitespace character, and the "*" means 0 or more times, but what the heck is going on with the question mark? Or why does it even make it "lazy" for that matter?

Backreference:

( [stuff] )

So how do you know if it's backreference2 or backreference# (unless it's based on first=first, second=second, and so on?) And you're able to extract any part of text you want using a backreference? Does BBcode use backreferences to make links and all that?

IE:

[url=someplace.com]Go to Some Place[/url]

and then the pattern would be:

$pattern = '/[url=(.*?)](.*?)[/url]/';

(I just tried to copy-paste from what was below and change it around o_o)

Just noticed the "/s" at the end as well, that's supposed to mean ignore cases?

MadTechie · October 14, 2007

Your example didn't escape the [ and ] characters.

this would work

$data = "[url=test.com]test[/url]";
$result = preg_replace('%\[url=(.*?)\](.*?)\[/url\]%i', "URL=\$1\r\nText=\$2", $data);
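
To see which group ends up with which number, here is a small sketch using preg_match() on the same data. The second pattern, with a group nested inside another, is a made-up illustration and not something from this thread.

```php
<?php
$data = "[url=test.com]test[/url]";

// Two groups: the first ( opens group 1, the second ( opens group 2.
preg_match('%\[url=(.*?)\](.*?)\[/url\]%i', $data, $m);
echo $m[1], "\n"; // test.com
echo $m[2], "\n"; // test

// Nested groups are still counted by their opening bracket, left to right.
// (Hypothetical pattern, only here to show the numbering.)
preg_match('%\[url=((.*?)\.(.*?))\]%i', $data, $n);
echo $n[1], "\n"; // test.com  (outer group opens first)
echo $n[2], "\n"; // test
echo $n[3], "\n"; // com
```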

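1. Backreferences are numbered in order: the first ( ) is $1, the second is $2, and so on, counting the opening brackets from left to right.

2. Laziness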

from the example above

(.*?)\]

Now if the Expression was

(.*)\]

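the .* would be greedy: it grabs as much as it can and only gives characters back until a ] fits, so it ends at the last ] in the text instead of the first one. That's no good..

so we use laziness, .*?, which grabs as little as it can: it adds one character at a time and stops as soon as the ] that follows can match, so it ends at the first ]..

I hope this makes sense..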
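
A quick way to compare the two is to run both versions against the same text. This is only a sketch; $greedy and $lazy are names chosen for the illustration.

```php
<?php
$data = "[url=test.com]test[/url]";

// Greedy: .* runs to the end of the text, then backs up to the LAST ].
preg_match('%\[url=(.*)\]%', $data, $greedy);

// Lazy: .*? grows one character at a time and stops at the FIRST ].
preg_match('%\[url=(.*?)\]%', $data, $lazy);

echo $greedy[1], "\n"; // test.com]test[/url
echo $lazy[1], "\n";   // test.com
```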

as a side note

i would use this code

$result = preg_replace('%\[url=([^\]]*)\]([^[]*)\[/url\]%i', "URL=\$1\r\nText=\$2", $data);
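
The idea behind the character class version is that [^\]]* simply cannot run past a ], so there is nothing to be greedy or lazy about. Here is a sketch with two links on one line; the test data is made up, and the second class is written in its escaped form [^\[].

```php
<?php
// Made-up test data with two links on the same line.
$data = "[url=a.com]A[/url] and [url=b.com]B[/url]";

// [^\]]* = any run of characters that are not ]
// [^\[]* = any run of characters that are not [
$pattern = '%\[url=([^\]]*)\]([^\[]*)\[/url\]%i';

// PREG_SET_ORDER gives one array per match: [whole match, group 1, group 2].
$count = preg_match_all($pattern, $data, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    echo "URL={$m[1]} Text={$m[2]}\n";
}
```

One trade-off, as far as I can tell: because the second class refuses to cross a [, link text that contains another tag would not match with this version.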

This thread could go on for weeks; a great book is Mastering Regular Expressions (O'Reilly).

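EDIT: the /s doesn't mean ignore case (that's i); the s modifier means that the . (dot) also matches new lines. Also, the . matches any character except a new line by default, whitespace included, not just non-whitespace.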
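
Here is a small sketch of what the two modifiers change. The $tag string is a shortened, made-up version of the image4 tag from the first post, split over two lines.

```php
<?php
// An <img> tag with a line break before src (shortened from the image4 example).
$tag = "<img height=\"500\"\nsrc=\"http://www.example.com/path/image4.gif\" >";

// With s, the dot can cross the line break, so the pattern matches.
$withS = preg_match('/<img\s.*?src="(.*?)".*?>/s', $tag, $m);

// Without s, .*? gets stuck at the line break and the match fails.
$withoutS = preg_match('/<img\s.*?src="(.*?)".*?>/', $tag);

// The i modifier is the one that ignores case.
$withI    = preg_match('/<IMG/i', $tag);
$withoutI = preg_match('/<IMG/', $tag);

var_dump($withS, $withoutS, $withI, $withoutI); // int(1) int(0) int(1) int(0)
```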

kratsg · October 14, 2007

What you put here:

$data = "[url=test.com]test[/url]";
$result = preg_replace('%\[url=(.*?)\](.*?)\[/url\]%i', "URL=\$1\r\nText=\$2", $data);

I can see that you escaped all the brackets, but what the heck is with your modulus operators (% signs)? Also, it doesn't really make sense to escape the $ in "$1" part of the url, seeing how it's not really treating it as a special character of the regular expression, but rather as the literal variable. (If that made sense? xD)

$result = preg_replace('%\[url=([^\]]*)\]([^[]*)\[/url\]%i', "URL=\$1\r\nText=\$2", $data);

Let me see if I can break it up:

%\[url=

No clue w/ %, but it escapes the bracket, treats it as a literal so it looks for "[url=" in the string.

([^\]]*)\]

The backreference1 is going to contain the following:

As many of anything except the "]" (hence the ^ part of it). The "]" right after the brackets gets matched too, but it isn't stored.

([^[]*)

The backreference2 is going to contain the following:

What the heck is going on here? Everything except everything? Never saw a "[]" in RegExp before.

\[/url\]%i

It's going to look for "[/url]", but I don't know what the %i at the end means o_o

MadTechie · October 14, 2007

What the heck is going on here? Everything except everything? Never saw a "[]" in RegExp before.

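LOL, my bad.. that was confusing. An unescaped [ inside a character class does actually work in PHP's regex engine, but it's much clearer written as ([^\[]*), which means "as many characters as possible that aren't a [".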

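The % just tells preg_replace where the expression starts and ends; the %i means end of the expression, then use the i modifier (ignore case). The fact I used % doesn't make any difference: you could use @ or | or #, pretty much any character that isn't a letter, number, backslash or whitespace. Best to pick one you don't use in the expression, otherwise you have to escape it..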
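
As a sketch, here is the same expression with three different delimiters. The replacement string is simplified to one line for the illustration.

```php
<?php
$data = "[url=test.com]test[/url]";
$repl = 'URL=$1 Text=$2';

// Same expression, different delimiters, same modifier (i) after the closing one.
$a = preg_replace('%\[url=(.*?)\](.*?)\[/url\]%i', $repl, $data);
$b = preg_replace('#\[url=(.*?)\](.*?)\[/url\]#i', $repl, $data);

// With / as the delimiter, the / inside [/url] has to be escaped as \/
$c = preg_replace('/\[url=(.*?)\](.*?)\[\/url\]/i', $repl, $data);

echo $a, "\n"; // URL=test.com Text=test
```

All three calls give the same result, which is why % or # is handy whenever the expression itself contains a /.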

\[ = escaped [
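
On the earlier question about escaping the $ in "\$1": the backslash is for PHP's double-quoted string, not for the regex. As far as I know PHP never treats $1 as a variable, since variable names can't start with a digit, so the backslash is harmless but not strictly needed. A small sketch to check that (the variable names are made up for the illustration):

```php
<?php
$data = "[url=test.com]test[/url]";

$escaped   = "URL=\$1\r\nText=\$2";              // as written in the thread
$unescaped = "URL=$1\r\nText=$2";                // no backslashes
$single    = 'URL=$1' . "\r\n" . 'Text=$2';      // single quotes never interpolate

// All three are the same string by the time preg_replace sees them.
var_dump($escaped === $unescaped, $escaped === $single); // bool(true) bool(true)

$result = preg_replace('%\[url=(.*?)\](.*?)\[/url\]%i', $unescaped, $data);
echo $result, "\n";
```

Escaping it anyway is a reasonable habit, because it keeps working if the placeholder is ever written next to something PHP would interpolate.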

kratsg · October 14, 2007

Alright, now that actually makes sense, now I feel like an expert in this :-D
