Simple REGEX Question

trevrep · April 26, 2009

Hi, sorry for making a new topic. I am very unfamiliar with regex, and the more I try to fix my problem the more confused I become, which is annoying because I know that what I am trying to do is simple.

Anyway... I am trying to replace all instances of

[img file:"image.jpg" alt:"alttext" /]

with

<img src="image.jpg" alt="alttext" />

in a string, where image.jpg and alttext are likely to be different each time.

I am currently trying to match the pattern before I think about the replacement. My problem appears that if I have two instances I seem to match the beginning of the first instance, all the way to the end of the second instance, rather than two separate matches.

$test = '[img file:"image1.jpg" alt:"alttext1" /] some other content [img file:"image2.jpg" alt:"alttext2" /] some more content [img file:"image3.jpg" alt:"alttext3" /] closing content';
$matches = array();
preg_match_all("/\[img file\:\".*\..{3,5}\" alt\:\".*\" \/]/", $test, $matches);
print_r($matches);

The output being:

Array ( [0] => Array ( [0] => [img file:"image1.jpg" alt:"alttext1" /] some other content [img file:"image2.jpg" alt:"alttext2" /] some more content [img file:"image3.jpg" alt:"alttext3" /] ) )

Thanks for any help you can give.

Mchl · April 26, 2009

It seems that your issue is described here: http://www.regular-expressions.info/dot.html

.josh · April 26, 2009

$string = '[img file:"image.jpg" alt:"alttext" /]';
$string = preg_replace('~\[img file:"([^"]*)" alt:"([^"]*)" /\]~','<img src="$1" alt="$2" />',$string);
echo $string;

.josh · April 26, 2009

p.s.-

"My problem appears to be that if I have two instances, the match runs from the beginning of the first instance all the way to the end of the second instance, rather than giving two separate matches."

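Your original problem is that your .* is greedy: it matches as much as it can, so it runs on to the last closing quote in the string instead of stopping at the first one.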
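To see the greedy behaviour in isolation, here is a minimal sketch that runs three variants of the pattern against the same input. The shortened test string and the variable names are made up for illustration; only preg_match_all itself comes from the thread.

```php
<?php
// Shortened sample input with two tags (hypothetical, modelled on the thread's test string).
$test = '[img file:"a.jpg" alt:"one" /] text [img file:"b.jpg" alt:"two" /] end';

// Greedy: each .* grabs as much as it can, so a single match spans both tags.
preg_match_all('~\[img file:".*" alt:".*" /\]~', $test, $greedy);

// Lazy: .*? gives up characters as early as the rest of the pattern allows.
preg_match_all('~\[img file:".*?" alt:".*?" /\]~', $test, $lazy);

// Negated class: [^"]* can never run past a closing quote.
preg_match_all('~\[img file:"[^"]*" alt:"[^"]*" /\]~', $test, $negated);

echo count($greedy[0]), "\n";  // 1
echo count($lazy[0]), "\n";    // 2
echo count($negated[0]), "\n"; // 2
```

The lazy and negated-class versions agree on well-formed input like this. The negated class is usually the stricter choice, since a lazy .*? is still allowed to cross a quote if that is the only way to get a match.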

trevrep · April 26, 2009

$string = '[img file:"image.jpg" alt:"alttext" /]';
$string = preg_replace('~\[img file:"([^"]*)" alt:"([^"]*)" /\]~','<img src="$1" alt="$2" />',$string);
echo $string;

Thanks for that, and thanks to Mchl. I did come up with

preg_match_all("/\[img file\:\"[^\"]*\.[^\"]{3,5}\" alt\:\"[^\"]*\" \/]/", $test, $matches);

between your posts, which places three matches in the array from the same test string I mentioned above.

My question now is: how are you populating $1 and $2? I have a list of images that a user can select from, and JavaScript inserts the relevant code for the chosen image, but they are not limited to one image. On submit I strip tags and replace the code with the appropriate HTML tags, to stop users entering their own.

.josh · April 26, 2009

$1 and $2 are populated from the groups captured by the pattern. Putting (...) around something in a pattern captures whatever it matches, and you can refer to it as $1, $2, etc. in the replacement. (Note: these are not real variables. They are internal to preg_replace, so you can't turn around on the next line and do, for instance, echo $1;.)
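As a rough illustration of how the captures flow into the replacement, the sketch below runs the same pattern over a string with two tags. The second half uses preg_replace_callback, where the captures arrive as ordinary array elements. The input string, the variable names and the htmlspecialchars step are assumptions added for the example, not part of the original answer.

```php
<?php
// Hypothetical input containing two tags.
$input = 'Intro [img file:"image1.jpg" alt:"alttext1" /] middle [img file:"image2.jpg" alt:"alttext2" /] end';
$pattern = '~\[img file:"([^"]*)" alt:"([^"]*)" /\]~';

// $1 and $2 in the replacement refer to the first and second (...) group of each match.
$html = preg_replace($pattern, '<img src="$1" alt="$2" />', $input);
echo $html, "\n";

// With a callback, the same captures are available as $m[1] and $m[2],
// so they can be processed (here: HTML-escaped) before being inserted.
$escaped = preg_replace_callback($pattern, function (array $m): string {
    $src = htmlspecialchars($m[1], ENT_QUOTES);
    $alt = htmlspecialchars($m[2], ENT_QUOTES);
    return '<img src="' . $src . '" alt="' . $alt . '" />';
}, $input);
echo $escaped, "\n";
```

Both calls replace every tag in the string, not just the first, because preg_replace and preg_replace_callback process all matches by default.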

.josh · April 26, 2009

Also, about the pattern you came up with:

preg_match_all("/\[img file\:\"[^\"]*\.[^\"]{3,5}\" alt\:\"[^\"]*\" \/]/", $test, $matches);

Technically it will probably work, but first off, you're using a negative char class matching anything but a " up until a literal dot. So the negative character class should be a \. instead of \" (and if you put the pattern in single quotes, you don't need to escape that quote either). Also, for the 2nd negative char class you are specifying a length range, so it would be okay to just use . with the {3,5} instead of the negative char class.

Overall, I'm not really sure why you want to look specifically for a . and 3-5 chars. If you're shooting for actual file validation, it would be better to check the mime type or, even better, check that it is indeed an image by trying to make an image resource out of it with a gd function. There's really no reason not to just stick with file:"[^"]*".

There are also several things in there that you are escaping that you don't really need to escape. No harm, I guess, but it makes the pattern less readable.
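For comparison, this sketch runs the heavily escaped pattern and a stripped-down one against the test string from the first post. The variable names are made up for the example, and the simplified pattern deliberately drops the extension check, so the two only agree when every file name has a dot followed by 3-5 characters.

```php
<?php
// Test string from the first post.
$test = '[img file:"image1.jpg" alt:"alttext1" /] some other content [img file:"image2.jpg" alt:"alttext2" /] some more content [img file:"image3.jpg" alt:"alttext3" /] closing content';

// Pattern from the thread, with every optional escape left in.
$escaped = "/\[img file\:\"[^\"]*\.[^\"]{3,5}\" alt\:\"[^\"]*\" \/]/";

// Simplified: single-quoted string, ~ as delimiter so / needs no escaping,
// and no extension check.
$simple = '~\[img file:"[^"]*" alt:"[^"]*" /\]~';

preg_match_all($escaped, $test, $a);
preg_match_all($simple, $test, $b);

var_dump($a[0] === $b[0]); // bool(true)
echo count($b[0]), "\n";   // 3
```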

trevrep · April 26, 2009

I'll stick with what you've given me in that case. I was looking for a file extension, but if they're selecting from a list then it's going to be there anyway. That's brilliant, thank you so much. I have to find a similar method in JavaScript to update a preview pane, but this one is far more important. Thanks!

nrg_alpha · April 26, 2009

So the negative character class should be a \. instead of \"...

If by this you mean [^\.], you don't need to escape a dot within a character class: most metacharacters (dot included) lose their special meaning inside a character class and are treated as literals by default, so [^.] is enough.

Granted, it comes to the same thing as having it escaped. As you put it earlier, it simply boils down to being more readable.
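A quick way to convince yourself of this is the sketch below, which checks that an unescaped dot inside a class only matches a literal dot, and that [^.] and [^\.] produce the same captures. The file name and the splitting pattern are made up for the example.

```php
<?php
// Inside [...] an unescaped dot is a literal dot, not "any character".
var_dump(preg_match('~^[.]$~', '.')); // int(1)
var_dump(preg_match('~^[.]$~', 'a')); // int(0)

// [^.] and [^\.] therefore behave identically.
$name = 'image1.jpg'; // hypothetical file name
preg_match('~^([^.]*)\.(.{3,5})$~', $name, $plain);
preg_match('~^([^\.]*)\.(.{3,5})$~', $name, $escaped);

var_dump($plain === $escaped); // bool(true)
print_r($plain);               // full match, then "image1", then "jpg"
```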

.josh · April 26, 2009

Ah, I wasn't 100% sure whether dot loses its superpower inside a class or not, and I didn't feel like looking it up :/
