
Function to remove middle characters?


jeffkas


Hi all. 

 

Is there a function, or some other easy way, to completely remove a line from my description? For example:

I want to remove any instances of

<img src= * />

  anywhere it is found in $description. 

 

That is, I want everything between the opening tag and the closing tag removed, including the tags themselves.

 

Thank you!


preg_replace looks to be just what I need.  However, as I'm stripping this from an rss feed, the syntax is slightly different and I'm having trouble making it work.  I've looked up preg_replace on a couple sites but it's a bit confusing!  Here's an example of a line I'm trying to remove:

 

&lt;img src=&quot;http://someurl.com/img1.gif&quot; /&gt;

 

So I'm trying this:

$text = preg_replace('&lt;img src=&quot [^>]*/&gt;', '', $text);

 

No luck yet.  Can you spot what I'm doing wrong? 

 

Error:  Unknown modifier 'q' in ... 

Thanks again. 
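For anyone hitting the same "Unknown modifier" error: preg_replace expects the pattern to be wrapped in delimiters. Without them, the leading & is taken as the delimiter, the pattern ends at the next & (just before quot), and the q that follows is read as a modifier. Below is a minimal sketch with explicit delimiters. It assumes the feed text contains the literal entities &lt; &quot; and &gt; and the sample string and the $clean variable are only illustrative.

```php
<?php
// Assumption: the feed text contains the literal entities &lt; &quot; &gt;
$text = 'bla &lt;img src=&quot;http://someurl.com/img1.gif&quot; /&gt; bla';

// '#' is used as the delimiter so that '/' needs no escaping inside the pattern.
// .*? is non-greedy: it stops at the first "/&gt;" it can reach.
$clean = preg_replace('#&lt;img src=&quot;.*?/&gt;#', '', $text);

echo $clean, "\n";
```

Any non-alphanumeric, non-backslash, non-whitespace character can serve as the delimiter; # is just convenient here because the pattern contains a slash.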

 

 

I'm much closer. 

 

$text = preg_replace('/&lt;img [^>]*\/&gt;/', '', $text);

 

works, but... 

 

It will parse all the page code and take everything out in $text until it reaches the very last closing tag (removing a couple paragraphs I want to keep).  I want it to only remove what it finds up to the img closing tag. 

 

Thoughts? 

The problem is the [^>]* bit, which tells the regex to keep reading characters that are NOT >. Since you don't have > in your text, it just keeps reading characters.

 

To solve this, I believe you will need to use a lookahead assertion. http://www.php.net/manual/en/regexp.reference.assertions.php

 

$text = preg_replace('/&lt;img .*(?!\/&gt;)\/&gt;/', '', $text);

 

Lookaheads will make the regex much slower. Maybe there is a better way to do this.
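One alternative worth trying is a non-greedy (lazy) quantifier instead of a lookahead. A greedy .* runs on to the last "/&gt;" in the string, while .*? stops at the first one after each "&lt;img ". The sketch below compares the two on a hypothetical sample string, assuming the text is still entity-encoded; $greedy and $lazy are illustrative names.

```php
<?php
// Hypothetical sample: two encoded img tags with text to keep between them.
$text = 'A &lt;img src=&quot;a.gif&quot; /&gt; keep me &lt;img src=&quot;b.gif&quot; /&gt; Z';

// Greedy: .* runs to the LAST "/&gt;", so "keep me" is removed as well.
$greedy = preg_replace('/&lt;img .*\/&gt;/', '', $text);

// Lazy: .*? stops at the FIRST "/&gt;" after each "&lt;img ".
$lazy = preg_replace('/&lt;img .*?\/&gt;/', '', $text);

echo $greedy, "\n";
echo $lazy, "\n";
```

If a tag can span several lines, the s modifier would also be needed so that the dot matches newlines.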


Is there a reason it can't be decoded first and then worked with? For example:

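<?php
// Sample feed text with the markup stored as HTML entities.
$text = 'bla &lt;img src=&quot;http://someurl.com/img1.gif&quot; /&gt; bla ';
// Turn &lt; &quot; &gt; back into < " > so the tag can be matched directly.
$text = htmlspecialchars_decode($text);
// Remove every <img ...> tag, self-closing or not.
$text = preg_replace('/<img [^>]*>/', '', $text);
echo $text;
?>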

 

Holy smokes!  I spent the next 3 hours last night building an array to strip the above tag and the many others I was finding...  with plans to continue today and then find this post and realize that...

 

$text = htmlspecialchars_decode($text);

 

was all that was needed!  It took care of formatting everything!  It's all good though... learned some new things. 
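To make it concrete what the decode step does on its own, here is a small sketch. The $description fragment is hypothetical; it only assumes the feed stores its markup as entities, as in the example earlier in the thread.

```php
<?php
// Hypothetical feed fragment with the markup stored as entities.
$description = 'Intro &lt;img src=&quot;http://someurl.com/img1.gif&quot; /&gt; outro';

// htmlspecialchars_decode() converts &lt; &gt; &quot; and &amp; back to characters.
$decoded = htmlspecialchars_decode($description);

echo $decoded, "\n";
```

After decoding, the string holds a real img tag, so it either renders as an image or can be matched with a plain pattern such as '/<img [^>]*>/'.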

 

A huge thanks! 
