
Weeding out part of an RSS feed


2 replies to this topic

#1 czheng


Posted 15 July 2006 - 04:28 AM

Hey folks. Newcomer here.

I've spent the last few days modifying an RSS reader script I downloaded somewhere, and I've got it running almost exactly how I'd like it. The last thing I'd like to do is, for the following feed (http://ma.gnolia.com...l/people/czheng), remove from each ITEM everything from (and including)
<b>Tags:
up to (but not including)
<description>
.

I'm guessing it would be something like:
$rss_feed = preg_replace('**what goes here?**', '', $rss_feed);
but of course I need the pattern.

Thanks in advance if anyone can help.

P.S. How might I also strip out all the following tags?
<b></b><p></p>

FYI, I looked at this thread (http://www.phpfreaks...ic,99383.0.html) but it didn't get me the result I wanted...

#2 Wildbug


Posted 16 July 2006 - 05:16 PM


... remove from each ITEM everything from (and including)

<b>Tags:
up to (but not including)
<description>
.


Try this expression:

/(<b>Tags:.*?)(?=<description>)/s

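The first set of parentheses captures the enclosed expression (the capture isn't strictly needed when you're replacing the whole match with an empty string).  The ".*?" construct means "any character, zero or more times, ungreedy," and the trailing "s" modifier lets "." match newlines too, so the match can span several lines.  The second parenthesized expression is a lookahead assertion: the regex engine matches the preceding part only where it is followed by "<description>", without including "<description>" in the match.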
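To make the lazy match and the lookahead concrete, here is a minimal sketch run against a made-up item. The sample markup in $item is an assumption for illustration only; the real feed's layout may well differ.

```php
<?php
// Hypothetical feed fragment (not copied from the actual feed).
$item = "<item>\n<title>Example</title>\n<b>Tags:</b> php, regex\n<description>Some text</description>\n</item>";

// .*? stops at the first place the lookahead succeeds,
// /s lets "." cross the newlines, and (?=<description>)
// checks for the tag without consuming it.
$cleaned = preg_replace('/<b>Tags:.*?(?=<description>)/s', '', $item);

echo $cleaned, PHP_EOL;
```

Because the lookahead consumes nothing, <description> itself stays in the output; only the text from <b>Tags: up to that point is removed.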

P.S. How might I also strip out all the following tags?

<b></b><p></p>


And this:

$text = preg_replace('/<\/?[bp]>/', '', $text);
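A quick sketch of the tag-stripping idea on a made-up string. The content of $text is hypothetical. Note that for closing tags such as </b> to match, the optional slash has to come before the tag letter in the pattern.

```php
<?php
// Hypothetical snippet of item markup, for illustration only.
$text = '<p><b>Tags:</b> php, regex</p>';

// <\/?[bp]> matches "<", an optional "/", then "b" or "p", then ">",
// i.e. <b>, </b>, <p> and </p>. Tags with attributes are not covered.
$stripped = preg_replace('/<\/?[bp]>/', '', $text);

echo $stripped, PHP_EOL;
```

preg_replace returns the modified string rather than changing its argument, so the result needs to be assigned somewhere.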


#3 czheng


Posted 17 July 2006 - 08:23 PM

Like a dummy, I didn't pay close enough attention and didn't realize I was going for the CLOSING description tag, not the opening one. The following code works now:

$rss_feed = preg_replace("#Tags:(.*?)(?=<\/description>)#", "", $rss_feed);

And I wasn't able to remove the (p) and (b) tags without decoding the HTML. Once I did that, everything was set.
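For anyone who hits the same wall, a rough sketch of the decode-then-strip step. The post above doesn't say which function was used for decoding, so html_entity_decode() is an assumption here, and $encoded is a made-up sample rather than real feed content.

```php
<?php
// Hypothetical description content, entity-encoded as it might sit inside the feed.
$encoded = '&lt;p&gt;&lt;b&gt;Tags:&lt;/b&gt; php, regex&lt;/p&gt;';

// Against the encoded text there are no literal "<" or ">" characters,
// so the tag pattern finds nothing to remove.
$unchanged = preg_replace('/<\/?[bp]>/', '', $encoded);

// Decode the entities first, then strip the now-literal tags.
$decoded  = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');
$stripped = preg_replace('/<\/?[bp]>/', '', $decoded);

echo $stripped, PHP_EOL;
```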

Thanks again.



