
Regex (or preg_replace?) to close img tags


johnny_up


Hi,

I have a bunch of flat HTML files that I need to import into our current site using an HTML import tool. However, the files have img tags that aren't properly closed, so the utility keeps choking.

Anyone know how to turn:

<img blah blah blah>

into <img blah blah blah />

using regex or preg_replace?

I'm sure there is a solution out there, but I don't know what it is...

Any help at all would be so appreciated.

Thanks.

https://forums.phpfreaks.com/topic/163409-regex-or-preg_replace-to-close-img-tags/

AlexWD, your pattern won't work. Your * is greedy, so it keeps gobbling up everything until it reaches the last > it can find before a newline. At the very least you need to make it non-greedy by adding a ? after the *, but the better approach is to use a negated character class. I also added the 'i' modifier for a case-insensitive match on 'img', just in case.

$text = preg_replace('/<img([^>]*)>/i', '<img$1 />', $text);
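To make the greediness point concrete, here is a minimal sketch comparing the three quantifier styles on one line that contains two img tags. The greedy pattern is a hypothetical stand-in (AlexWD's original pattern is not shown in this thread), and the sample markup is made up for illustration.

```php
<?php
// Sample line with two unclosed img tags (made up for illustration).
$text = '<p><img src="a.png"> and <img src="b.png"></p>';

// Hypothetical greedy pattern: .* runs on to the last > on the line.
$greedy = preg_replace('/<img(.*)>/i', '<img$1 />', $text);

// Lazy quantifier: .*? stops at the first > it can.
$lazy = preg_replace('/<img(.*?)>/i', '<img$1 />', $text);

// Negated character class: [^>]* can never cross a >.
$negated = preg_replace('/<img([^>]*)>/i', '<img$1 />', $text);

echo $greedy, "\n";   // <p><img src="a.png"> and <img src="b.png"></p />
echo $lazy, "\n";     // <p><img src="a.png" /> and <img src="b.png" /></p>
echo $negated, "\n";  // <p><img src="a.png" /> and <img src="b.png" /></p>
```

The greedy version treats everything up to the > of </p> as one tag, so neither img gets closed and the closing paragraph tag is mangled instead.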

I tested my example, and it worked for the limited examples I fed into it. But I'm a Regex amateur.

AlexWD, if you want a more elaborate explanation of what CV is talking about (with regard to greediness), you can have a look at this thread (posts #11 and #14 shed more light on these issues). So yeah, in general, .* and .+ are not the best things to use (they have their place, but should only be used in the right circumstances).
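One case where even the lazy .*? falls short, sketched below under the assumption that some tags in the flat files wrap their attributes onto a second line (the sample tag is made up): by default . does not match a newline, while [^>] does.

```php
<?php
// Made-up tag whose attributes wrap onto a second line.
$text = "<img src=\"a.png\"\n     alt=\"A\">";

// Without the s modifier, . stops at the newline, so this never reaches the >.
$dot = preg_replace('/<img(.*?)>/i', '<img$1 />', $text);

// With the s (dot-all) modifier, . matches newlines too.
$dotall = preg_replace('/<img(.*?)>/is', '<img$1 />', $text);

// [^>] matches any character except >, newlines included.
$negated = preg_replace('/<img([^>]*)>/i', '<img$1 />', $text);

var_dump($dot === $text);        // bool(true): no match, text unchanged
var_dump($dotall === $negated);  // bool(true): both close the tag
```

So the negated class gives the same result as .*? with the s modifier, without having to remember the modifier.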


:P Thanks. A little later I realized I'd been foolish not to test it with more than one '>' in the text. I figured out why I was wrong, but was just too lazy to edit my post. Thanks for the help; I'm trying to get better at regex.
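For testing against more than one kind of input, here is a rough sketch of a small helper plus a few sample strings. The function name close_img_tags and the samples are hypothetical, not from this thread. It also covers one case the pattern /<img([^>]*)>/i does not: a tag that is already closed would otherwise pick up a second slash.

```php
<?php
// Hypothetical helper; the name and the sample inputs are illustrative.
function close_img_tags(string $html): string
{
    // \b keeps the match to <img ...> rather than e.g. <imgfoo>.
    // The lazy group leaves trailing whitespace and an optional /
    // for \s*\/? to consume, so closed tags are not closed twice.
    return preg_replace('/<img\b([^>]*?)\s*\/?>/i', '<img$1 />', $html);
}

$samples = [
    '<img src="a.png">',
    '<IMG src="a.png" />',
    '<img src="dir/a.png"/>',
    '<p>1 > 0 <img src="a.png"> <img src="b.png"></p>',
];

foreach ($samples as $sample) {
    echo close_img_tags($sample), "\n";
}
// <img src="a.png" />
// <img src="a.png" />
// <img src="dir/a.png" />
// <p>1 > 0 <img src="a.png" /> <img src="b.png" /></p>
```

Running the helper on its own output leaves it unchanged, which is handy if the import gets run twice. It still assumes no attribute value contains a literal >, and it rewrites the tag name to lowercase.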

