Hi all,

 

I am retrieving an RSS feed and, if it isn't already in the database, writing it there. The problem is that this is grabbing all of the feeds, when I only want to check and/or write the first one. This is the original code with the foreach in. I have tried the following:

 

$limit = 1;
for($x=0;$x<$limit;$x++) {

 

but am having no joy. I think I am close, but not close enough! Here is the main code (the database connection is already made and the SQL is reading and writing fine):

 

<?php
libxml_use_internal_errors(true);
//$RSS_DOC = simpleXML_load_file($feed_url);
$RSS_DOC = simpleXML_load_file('FEED IN HERE/');
if (!$RSS_DOC) {
    echo "Failed loading XML\n";
    foreach (libxml_get_errors() as $error) {
        echo "\t", $error->message;
    }
}
foreach ($RSS_DOC->channel->item as $RSSitem)
{
    $item_id = md5($RSSitem->title);
    $fetch_date = date("Y-m-j G:i:s"); //NOTE: we don't use a DB SQL function so it's database independent
    $item_title = $RSSitem->title;
    $item_date = date("Y-m-j G:i:s", strtotime($RSSitem->pubDate));
    $item_url = $RSSitem->link;
    echo "Processing item '" , $item_id , "' on " , $fetch_date , "<br/>";
    echo $item_title, " - ";
    echo $item_date, "<br/>";
    echo $item_url, "<br/>";
    // Does record already exist? Only insert if new item...
    $item_exists_sql = "SELECT item_id FROM rssingest where item_id = '" . $item_id . "'";
    $item_exists = mysql_query($item_exists_sql);
    if (mysql_num_rows($item_exists) < 1)
    {
        echo "<p>Inserting new item..</p>";
        $item_insert_sql = "INSERT INTO rssingest(item_id, feed_url, item_title, item_date, item_url, fetch_date) VALUES ('" . $item_id . "', '" . $feed_url . "', '" . $item_title . "', '" . $item_date . "', '" . $item_url . "', '" . $fetch_date . "')";
        $insert_item = mysql_query($item_insert_sql);
        echo "Query: $item_insert_sql";
    }
    else
    {
        echo "<font color=blue>Not inserting existing item..</font><br/>";
    }
    echo "<br/>";
}

 

Thanks,

 

G


I only see one feed in there... Do you mean the items in the feed? Only look at the first one?

foreach($RSS_DOC->channel->item as $RSSitem)

That's the loop. Remove it, the {, the associated }, and save yourself from changing variable names by substituting in

$RSSitem = $RSS_DOC->channel->item[0];

(which assumes that the feed always has at least one item in it).
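
A minimal sketch of that substitution with the empty-feed case guarded, in case it helps. The inline sample feed and the helper name firstItem() are assumptions made up for illustration; they are not part of the original code.

```php
<?php
// Hypothetical helper: return the first <item> of an RSS 2.0 document,
// or null when the channel has no items at all.
function firstItem(SimpleXMLElement $rss): ?SimpleXMLElement
{
    // isset() on a SimpleXML index is false when that element does not exist
    if (!isset($rss->channel->item[0])) {
        return null;
    }
    return $rss->channel->item[0];
}

// Made-up feed so the sketch runs without a network request
$sampleFeed = <<<XML
<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <item><title>First story</title><link>http://example.com/1</link></item>
    <item><title>Second story</title><link>http://example.com/2</link></item>
  </channel>
</rss>
XML;

$RSS_DOC = simplexml_load_string($sampleFeed);
$RSSitem = firstItem($RSS_DOC);

if ($RSSitem !== null) {
    // Cast to string: SimpleXML properties are objects, not plain strings
    echo (string) $RSSitem->title, "\n";
    echo (string) $RSSitem->link, "\n";
} else {
    echo "Feed has no items\n";
}
```

With $RSSitem assigned this way, the rest of the loop body can stay as it is, minus the foreach line and its braces.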

Thanks, I have implemented your changes, and it has cut it down to just the first item in the feed (which was what I was after, sorry if I wasn't clear).

 

Now however, I seem to have lost the items from the feed, particularly the title which I write twice, once as a title and the other with md5 to create an item_id. Here is the full code and I have printed out the results of the echos and the query itself below:

 

libxml_use_internal_errors(true);
//$RSS_DOC = simpleXML_load_file($feed_url);
$RSS_DOC = simpleXML_load_file('http://rss.cnn.com/rss/edition.rss');
if (!$RSS_DOC) {
    echo "Failed loading XML\n";
    foreach (libxml_get_errors() as $error) {
        echo "\t", $error->message;
    }
}

$rss_title = $RSS_DOC->channel->title;
$rss_link = $RSS_DOC->channel->link;
$rss_editor = $RSS_DOC->channel->managingEditor;
$rss_copyright = $RSS_DOC->channel->copyright;
$rss_date = $RSS_DOC->channel->pubDate;
$feed_url = 'http://rss.cnn.com/rss/edition.rss';
//Loop through each item in the RSS document

//foreach($RSS_DOC->channel->item as $RSSitem)
//{

$item_id = md5($RSSitem->title[0]);
$fetch_date = date("Y-m-j G:i:s"); //NOTE: we don't use a DB SQL function so it's database independent
$item_title = $RSSitem->title[0];
$item_date = date("Y-m-j G:i:s", strtotime($RSSitem->pubDate));
$item_url = $RSSitem->link;

echo "Processing item '" , $item_id , "' on " , $fetch_date , "<br/>";
echo $item_title, " - ";
echo $item_date, "<br/>";
echo $item_url, "<br/>";

// Does record already exist? Only insert if new item...

$item_exists_sql = "SELECT item_id FROM rssingest where item_id = '" . $item_id . "'";
$item_exists = mysql_query($item_exists_sql);
if (mysql_num_rows($item_exists) < 1)
{
    echo "<p>Inserting new item..</p>";
    $item_insert_sql = "INSERT INTO rssingest(item_id, feed_url, item_title, item_date, item_url, fetch_date) VALUES ('" . $item_id . "', '" . $feed_url . "', '" . $item_title . "', '" . $item_date . "', '" . $item_url . "', '" . $fetch_date . "')";
    $insert_item = mysql_query($item_insert_sql);
    echo "Query: $item_insert_sql";
}
else
{
    echo "<font color=blue>Not inserting existing item..</font><br/>";
}

echo "<br/>";

 

The printed screen looks like this:

 

Processing item 'd41d8cd98f00b204e9800998ecf8427e' on 2012-11-30 22:45:00

- 1970-01-1 1:00:00

 

Inserting new item..

Query: INSERT INTO rssingest(item_id, feed_url, item_title, item_date, item_url, fetch_date) VALUES ('d41d8cd98f00b204e9800998ecf8427e', 'http://rss.cnn.com/rss/edition.rss' '', '1970-01-1 1:00:00', '', '2012-11-30 22:45:00')
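
The hash in this output is the MD5 of an empty string, and the 1970 date is what date() gives for timestamp 0. Both are consistent with $RSSitem never being assigned in the code above, since the `$RSSitem = $RSS_DOC->channel->item[0];` line does not appear in it. A small check of that reading follows; the UTC time zone is an assumption chosen only so the result is reproducible (the 1:00:00 in the post would fit a server one hour ahead of UTC).

```php
<?php
// Assumption: pin the time zone so the formatted date is the same everywhere
date_default_timezone_set('UTC');

// An unset $RSSitem->title ends up as an empty string when used as text
$emptyTitle = '';
$item_id = md5($emptyTitle);

// strtotime('') returns false; cast to int, that is timestamp 0 (the Unix epoch)
$item_date = date("Y-m-j G:i:s", (int) strtotime(''));

echo $item_id, "\n";   // d41d8cd98f00b204e9800998ecf8427e
echo $item_date, "\n"; // 1970-01-1 0:00:00
```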

I'm not sure what the problem is. What output did you expect to see?

 

[edit] By the way, if you want something unique for each item, please try to use its <guid> instead of its title. That's exactly what it's for.
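
A rough sketch of deriving item_id from the item's <guid> rather than its title. It assumes the feed actually supplies <guid> (the element is optional in RSS 2.0), so it falls back to <link> when the guid is missing. The helper name itemId() and the sample items are made up for illustration.

```php
<?php
// Hypothetical helper: build a stable ID from <guid>, falling back to <link>.
// The value is still passed through md5() so it stays a fixed 32-character string.
function itemId(SimpleXMLElement $item): string
{
    $guid = trim((string) $item->guid);
    $source = $guid !== '' ? $guid : trim((string) $item->link);
    return md5($source);
}

// Made-up items so the sketch runs on its own
$withGuid = simplexml_load_string(
    '<item><title>First story</title><link>http://example.com/1</link>'
    . '<guid>http://example.com/1#a</guid></item>'
);
$withoutGuid = simplexml_load_string(
    '<item><title>Second story</title><link>http://example.com/2</link></item>'
);

echo itemId($withGuid), "\n";
echo itemId($withoutGuid), "\n";
```

Two items that happen to share a title then still get different IDs, as long as their guid or link differs.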

Edited by requinix