php rss parser


13 replies to this topic

#1 jcombs_31

jcombs_31
  • Staff Alumni
  • Advanced Member
  • 2,066 posts
  • LocationFL

Posted 15 December 2004 - 08:30 PM

I've been looking at a couple scripts to parse rss feeds into html. Any in particular that work nicely with a link, short description and aren't too many pages of code?

#2 jcombs_31

jcombs_31
  • Staff Alumni
  • Advanced Member
  • 2,066 posts
  • LocationFL

Posted 16 December 2004 - 08:34 PM

nobody uses a nice script for RSS content?

#3 mlin

mlin
  • Members
  • Advanced Member
  • 91 posts

Posted 16 December 2004 - 09:11 PM

Magpie is excellent; I just started playing with it yesterday myself:
http://magpierss.sourceforge.net/

So you don't have to read too much, I'll give you a quick rundown.

Place rss_fetch.inc, rss_parse.inc, rss_cache.inc, rss_utils.inc, and the extlib folder into your include dir.

Include rss_fetch.inc in your script:
require_once("include/rss_fetch.inc");

Set your URL:
$url = "http://somesite.com/news.rss";

Create the rss object:
$rss = fetch_rss($url);

Dump it to see the structure:
echo "<pre>";
print_r($rss);
echo "</pre>";

That'll show you how the items get stored in the rss object. You can loop through them and display them any way you want.
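For anyone following along, here is a minimal sketch of that loop. It assumes each item is an associative array with 'title' and 'link' keys; the function name render_items and the sample data are made up for illustration, and the array stands in for $rss->items so the snippet runs without Magpie installed.

```php
<?php
// Hypothetical helper: turns a list of feed items into an HTML list.
// $items stands in for $rss->items as returned by fetch_rss().
function render_items(array $items): string
{
    $html = "<ul>\n";
    foreach ($items as $item) {
        // Escape values so markup inside the feed cannot break the page.
        $href  = htmlspecialchars($item['link'], ENT_QUOTES);
        $title = htmlspecialchars($item['title'], ENT_QUOTES);
        $html .= "<li><a href=\"$href\">$title</a></li>\n";
    }
    return $html . "</ul>\n";
}

// Sample data (made up) in place of a real feed.
$items = [
    ['title' => 'First story', 'link' => 'http://example.com/1'],
    ['title' => 'Second & last', 'link' => 'http://example.com/2'],
];
echo render_items($items);
```

Building the markup in a function rather than echoing inline makes it easy to reuse the same loop for several feeds.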

hope this helps



#4 jcombs_31

jcombs_31
  • Staff Alumni
  • Advanced Member
  • 2,066 posts
  • LocationFL

Posted 16 December 2004 - 09:38 PM

Thanks, does this have the ability to display some data in the feed rather than just a link?

I used the code in the readme file:

<?php
	require_once('includes/rss_fetch.inc');
	$url = "http://www.lockergnome.com/rss/web.php";
	$rss = fetch_rss( $url );

	echo "Channel Title: " . $rss->channel['title'] . "<p>";
	echo "<ul>";
	foreach ($rss->items as $item) {
		$href = $item['link'];
		$title = $item['title'];
		// Quote the href so URLs with special characters stay intact
		echo "<li><a href=\"$href\">$title</a></li>";
	}
	echo "</ul>";

?>

and I get the links from the feed, but then I get a cache error:

"Cache couldn't make dir './cache'. Cache unable to open file for writing: ./cache\039d3bbd5586f0089b4edc1d921dbe72"

#5 gizmola

gizmola
  • Administrators
  • Advanced Member
  • 4,664 posts
  • LocationLos Angeles, CA USA

Posted 16 December 2004 - 09:51 PM

I looked at Magpie and decided it was a bit heavier than what I wanted. I opted instead to use this class: LastRSS.

It probably works in a similar manner to magpie. Here's some sample code that I used, to give you an idea.

$rss = new lastRSS();
if ($rs = $rss->Get('http://... put url of xml feed here')) {
	$dbh = mysql_connect($DBHOST, $DBUSER, $DBPWD) or die();
	mysql_select_db($DBDB);
	if ($rs['items_count'] > 0) {
		// Walk the items from last to first
		for ($i = $rs['items_count'] - 1; $i >= 0; $i--) {
			$item = $rs['items'][$i];
			// Escape every field before it is placed in a query
			foreach ($item as $key => $value)
				$item[$key] = addslashes($value);
			// YYYY-MM-DD HH:MM:SS (MySQL format); 'H' gives 24-hour time
			// Source looks like: Tue, 14 Sep 2004 22:21:39
			$pubdate = date('Y-m-d H:i:s', strtotime($item['pubDate']));
			// Look for an existing row by guid, or by link if there is no guid
			if (!empty($item['guid'])) {
				$sql = "SELECT COUNT(*) AS countof FROM mg_news WHERE guid = '$item[guid]'";
			} else {
				$sql = "SELECT COUNT(*) AS countof FROM mg_news WHERE url = '$item[link]'";
			}
			echo "$sql<br />";
			$rslt = mysql_query($sql, $dbh) or die(mysql_error());
			$row = mysql_fetch_assoc($rslt);
			if ($row['countof'] == 0) {
				$sql = 'INSERT INTO mg_news (mg_newssource_id, int_state, guid, date, subject, news, url) VALUES (';
				$sql .= "$nsid, $defstate, '$item[guid]', '$pubdate', '$item[title]', '$item[description]', '$item[link]')";
				$rslt = mysql_query($sql, $dbh);
				echo "$sql<br />";
			} else {
				echo 'Guid or Link exists, skipping<br />';
			}
		}
	}
} else {
	echo 'Uh oh, didn\'t work';
}

As you can see from my example, the entire purpose of my use here was to get the information into a database table. There are a lot of alternative ways of doing that, including the cache-to-file method.

I don't worry about that because I control, via cron, when this script goes out to the site and pulls information. They have simpler examples in their documentation.

The problem with writing your cache file is probably a permissions issue. Remember that the webserver process is the one that writes the file to disk, so you need to make sure the permissions on the cache directory give the webserver user read/write access to that directory.
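A quick way to confirm the permissions theory is to check the directory from PHP itself, since that runs as the same user the web server does. This is only a sketch: the helper name ensure_cache_dir and the demo path are made up, and you would pass in whatever directory your cache actually uses.

```php
<?php
// Hypothetical helper: make sure a cache directory exists and is writable
// by the user PHP runs as (normally the web server user).
function ensure_cache_dir(string $dir): bool
{
    if (!is_dir($dir) && !@mkdir($dir, 0755, true)) {
        // Could not create it, most often because the parent is not writable.
        return false;
    }
    return is_writable($dir);
}

$dir = sys_get_temp_dir() . '/rss_cache_demo'; // made-up path for the demo
echo ensure_cache_dir($dir) ? "cache dir OK\n" : "cache dir not writable\n";
```

If this reports a problem for './cache', either fix the ownership or mode of that directory or move the cache somewhere the web server can write. As far as I recall, Magpie reads the location from a MAGPIE_CACHE_DIR constant defined before the first fetch_rss() call, but check the docs for your version.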


#6 jcombs_31

jcombs_31
  • Staff Alumni
  • Advanced Member
  • 2,066 posts
  • LocationFL

Posted 16 December 2004 - 09:58 PM

Thanks, I fixed the permissions issue, but I still want a script that gives a glimpse of the content of the feed, not just the link. I have a much smaller script than Magpie that does the same thing.


#7 gizmola

gizmola
  • Administrators
  • Advanced Member
  • 4,664 posts
  • LocationLos Angeles, CA USA

Posted 16 December 2004 - 10:01 PM

A couple other things so you don't get confused by my code.

- I don't want to insert duplicate news stories in my news table, so I avoid that using either the guid or the link. When I wrote this, the assumption was that I would have multiple news sources, so I wasn't sure whether they would all support a guid (many don't), which is the equivalent of a unique url.
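The guid-or-link rule can be sketched without a database. This is a simplification of the approach described here, not the code from the post: it keeps a single in-memory key per item instead of querying separate guid and url columns, and the function names and sample items are made up.

```php
<?php
// Hypothetical helper: pick the value used to detect duplicates.
// Prefer the guid; fall back to the link when the feed has no guid.
function item_key(array $item): string
{
    return !empty($item['guid']) ? $item['guid'] : $item['link'];
}

// Keep only the first item seen for each key.
function dedupe_items(array $items): array
{
    $seen = [];
    $unique = [];
    foreach ($items as $item) {
        $key = item_key($item);
        if (!isset($seen[$key])) {
            $seen[$key] = true;
            $unique[] = $item;
        }
    }
    return $unique;
}

// Made-up sample: two items share a guid, two share a link.
$items = [
    ['guid' => 'a1', 'link' => 'http://example.com/1'],
    ['guid' => 'a1', 'link' => 'http://example.com/1?ref=rss'],
    ['link' => 'http://example.com/2'],
    ['link' => 'http://example.com/2'],
];
echo count(dedupe_items($items)), "\n";
```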


So it should be pretty obvious that the class reads all the items and creates an array of them. I loop through them and inside the loop assign a temp variable item that is just one of the rss entries.
$item = $rs['items'][$i];

The important line to note is this one:

$sql .= "$nsid, $defstate, '$item[guid]', '$pubdate', '$item[title]', '$item[description]', '$item[link]')";

Here you can see the class item names:

$item['guid'] -> the unique url
$item['pubDate'] -> The publish date.

I have some date manipulation code in there you might find of interest, which converts it into a date MySQL is happy with.

$item['title'] -> The Title
$item['link'] -> The link
$item['description'] -> The text abstract you are looking for (usually the first n lines)
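The date conversion can be pulled out into a small function. This is a sketch under a couple of assumptions: the name pubdate_to_mysql is made up, and it formats in UTC with gmdate() so the result does not depend on the server timezone, whereas the code posted above uses date() and therefore local time.

```php
<?php
// Convert an RSS pubDate (RFC 822 style) to MySQL DATETIME format, in UTC.
// Returns null when the date cannot be parsed.
function pubdate_to_mysql(string $pubDate): ?string
{
    $ts = strtotime($pubDate);
    if ($ts === false) {
        return null;
    }
    // 'H' is the 24-hour clock; 'h' would give 12-hour values.
    return gmdate('Y-m-d H:i:s', $ts);
}

echo pubdate_to_mysql('Tue, 14 Sep 2004 22:21:39 GMT'), "\n";
```

Checking for a failed parse matters because a feed with a missing or odd pubDate would otherwise end up stored as a 1970 date.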


#8 gizmola

gizmola
  • Administrators
  • Advanced Member
  • 4,664 posts
  • LocationLos Angeles, CA USA

Posted 16 December 2004 - 10:03 PM

Oh yeah, in case you were wondering, you can see this in action at http://www.movie-gurus.com/ where the News section is the data pulled from the feeds. You have to click on a news item to get to the abstract view for a story, but that small paragraph is what gets pulled in from the description.

#9 jcombs_31

jcombs_31
  • Staff Alumni
  • Advanced Member
  • 2,066 posts
  • LocationFL

Posted 16 December 2004 - 10:23 PM

I didn't realize I could just add an $item['description']. Now I'm gettin where I wanna be.

thanks for your feedback.

#10 mlin

mlin
  • Members
  • Advanced Member
  • 91 posts

Posted 16 December 2004 - 11:51 PM

Cool, glad you're getting there...

Some feeds don't offer descriptions, though most do. In case the feed doesn't have a description, you could test for it with a line like:

$desc = isset($item['description']) ? $item['description'] : "";

So if there is a description available, it'll be used; otherwise $desc is an empty string and nothing gets printed.
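Here is that check wrapped into a small function, as a sketch. The name item_summary, the 200-character default, and the tag stripping are my own additions for illustration; only the isset() fallback comes from the line above.

```php
<?php
// Hypothetical helper: return a short plain-text summary, or '' when the
// item has no description.
function item_summary(array $item, int $maxLen = 200): string
{
    $desc = isset($item['description']) ? $item['description'] : '';
    // Many feeds put HTML in the description; strip it for a plain teaser.
    $text = trim(strip_tags($desc));
    if (strlen($text) > $maxLen) {
        // Byte-based cut; swap in mb_substr() for multibyte feeds.
        $text = substr($text, 0, $maxLen) . '...';
    }
    return $text;
}

echo item_summary(['title' => 'No body']), "\n";
echo item_summary(['description' => '<p>Hello <b>world</b></p>']), "\n";
```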

Like I said, I just started playing with syndication yesterday, and I found Magpie tutorials all over the internet, even in a book I had around the house. I like it so far, but the comment about using a script to grab all the headlines and store them in a database via cron sounds very interesting. It solves bandwidth issues altogether; however, it kills the ability to get the news the minute it's posted. Anyway, just my thoughts.

#11 nfr

nfr
  • Members
  • Advanced Member
  • 34 posts

Posted 18 December 2004 - 02:52 PM

Hi -

I'm trying to get a feed from the BBC Sports website (just as a test). Code is below:

<?php
require_once("../inc/rss_fetch.inc");
$url = "http://news.bbc.co.uk/rss/sportonline_uk_edition/front_page/rss091.xml";
$rss = fetch_rss( $url );

echo "Channel Title: " . $rss->channel['title'] . "<p>";
echo "<ul>";
foreach ($rss->items as $item) {
	$href = $item['link'];
	$title = $item['title'];
	echo "<li><a href=\"$href\">$title</a></li>";
}
echo "</ul>";

?>

However, I'm getting the following error:

Warning: MagpieRSS: Failed to fetch http://news.bbc.co.u...page/rss091.xml. (HTTP Error: connection failed (11) in /inc/rss_fetch.inc on line 237
Channel Title:


Warning: Invalid argument supplied for foreach() in /news/news.php on line 20

I've tried with several RSS feeds but get the same error every time. Anyone know why?
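The second warning is a knock-on effect of the first: the fetch failed, so there is nothing to loop over. A guard like the sketch below avoids the foreach warning, though it does not fix the connection problem itself. It assumes a failed fetch gives back something that is not an object with an items array; render_feed and the sample object are made up, and the stand-in values let it run without Magpie.

```php
<?php
// Hypothetical helper: render a feed object, or a notice if the fetch failed.
// $rss stands in for the return value of fetch_rss().
function render_feed($rss): string
{
    if (!is_object($rss) || !isset($rss->items) || !is_array($rss->items)) {
        return "<p>Feed unavailable.</p>\n";
    }
    $html = "<ul>\n";
    foreach ($rss->items as $item) {
        $html .= '<li>' . htmlspecialchars($item['title']) . "</li>\n";
    }
    return $html . "</ul>\n";
}

echo render_feed(false); // stand-in for a failed fetch

$ok = new stdClass();    // stand-in for a successful fetch
$ok->items = [['title' => 'Headline']];
echo render_feed($ok);
```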

Rgds,

Neil.

#12 zfade3

zfade3
  • New Members
  • Newbie
  • 2 posts

Posted 21 December 2005 - 08:58 PM

Hi Neil,

I'm getting the same error on my site. I just found this in the Magpie FAQ:

QUOTE
4. Error: MagpieRSS: Failed to fetch http://example.com/rss.xml. (HTTP Error: connection failed (1)

A connection error of type 1 means "permission denied". This usually means that your
ISP has configured PHP so that it can't open outgoing sockets (usually for security reasons).

The only solution to this is to ask your ISP for help.

Sometimes you'll also get the related `connection failed (11)` (e.g. on sourceforge.net)
which also means PHP is configured in such a way that Magpie can't work.

While it helps to know that this is a PHP configuration problem, it gives no clue about what needs to be changed to make it work.

This is frustrating :(

#13 zfade3

zfade3
  • New Members
  • Newbie
  • 2 posts

Posted 21 December 2005 - 11:26 PM

Okay, I was able to dig into the problem a little deeper. If you go to the Snoopy class and search for fsockopen, that is where the problem is occurring. Just echo $errstr there and you should see what kind of error you are getting. Mine is a "No route to host" problem.
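If you would rather not edit Snoopy, the same check can be run in a standalone script. This is a sketch: probe_host is a made-up name, and the demo probes 127.0.0.1 only so that it runs anywhere; swap in the feed's host to test the real connection.

```php
<?php
// Hypothetical helper: try to open a TCP connection and report the error
// number and message that fsockopen() fills in on failure.
function probe_host(string $host, int $port = 80, float $timeout = 5.0): array
{
    $errno = 0;
    $errstr = '';
    $fp = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($fp === false) {
        return ['ok' => false, 'errno' => $errno, 'errstr' => $errstr];
    }
    fclose($fp);
    return ['ok' => true, 'errno' => 0, 'errstr' => ''];
}

// Replace 127.0.0.1 with the host of the feed you are trying to fetch.
$result = probe_host('127.0.0.1', 80, 1.0);
echo $result['ok']
    ? "connected\n"
    : "failed ({$result['errno']}): {$result['errstr']}\n";
```

Run it from the same server and the same PHP setup as the site, since outgoing connections may be blocked for the web server but allowed from your own machine.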

#14 vivek_bharadhwaj

vivek_bharadhwaj
  • New Members
  • Newbie
  • 1 posts

Posted 25 April 2006 - 11:04 AM

Hi All,
I am very new to the world of PHP and programming. I have been doing a few small things here and there. I tried the Magpie RSS parser and I constantly run into the problem below. My code is the same as the one posted here.

Warning: MagpieRSS: Failed to fetch http://newsrss.bbc.co.uk/rss/newsonline_world_edition/front_page/rss.xml (HTTP Error: connection failed (3) in \...\magpie\magpierss-0.72\rss_fetch.inc on line 238.

I am unable to find out what the error "connection failed (3)" means.
Any help with this would be much appreciated.

Regards,
Vivek C.A




