whatnow
Posted November 10, 2007

I have the following code, which grabs an RSS feed. It shows the results in my browser as:

Title: Genzyme stumps up £7m for NCG partnership
Description: Genzyme UK has entered into a partnership with the NHS National Commissioning Group, under which it will provide funding of £7 million over three years to help support the care of patients with lysosomal storage disorders.
Link: http://www.pharmatimes.com/WorldNews/article.aspx?id=12177&src=WorldNewsRSS

Now, how would I go about making it read the RSS, take the URL for the full article, and then additionally grab the article from the web page, so that the result is:

Title: Genzyme stumps up £7m for NCG partnership
Description: Genzyme UK has entered into a partnership with the NHS National Commissioning Group, under which it will provide funding of £7 million over three years to help support the care of patients with lysosomal storage disorders.
Main Article: Genzyme UK has entered into a partnership with the NHS National Commissioning Group, under which it will provide funding of £7 million over three years to help support the care of patients with lysosomal storage disorders. LSDs are rare and often severe metabolic disorders – such as Gaucher, Fabry and Pompe diseases - that need specialist and multi....(etc)
Link: http://www.pharmatimes.com/WorldNews/article.aspx?id=12177&src=WorldNewsRSS

The page itself has the content in a span with the class 'newsContent'. Do I just need to write code that lifts this span out of the page? That seems like an inefficient way to achieve what I want, when ideally I could just call the content in another way. Are there other ways, or is a crude method the only way to take content from other sites like this? (I will happily admit I am fresh to RSS.)

I've been searching the internet for this for three hours now and, to be honest, I'm not getting good results. I've read about bloggers stealing content, so it seems possible, but I've not found any practical code for doing just that. I assure you this isn't for illicit gains: I've been asked to do it for a job interview I'm preparing for, so naturally any help would be more than appreciated.

Code:

<?php
$rssFeeds = array('http://www.pharmatimes.com/p.aspx?n=ZGFpbHl2aWRlb25ld3M=&s=VmlkZW9OZXdz');

// Parser state shared by the handler functions below
$item = false;
$currentElement = '';
$title = '';
$description = '';
$link = '';

// Loop through the array, reading the feeds one by one
foreach ($rssFeeds as $feed) {
    if (!readFeeds($feed)) {
        echo 'Error in the feed<br>';
    }
}

function startElement($xp, $name, $attributes) {
    global $item, $currentElement;
    $currentElement = $name; // the other functions will always know which element we're parsing
    if ($currentElement == 'ITEM') { // by default PHP converts element names to uppercase
        $item = true; // We're only interested in the contents of the item element. This flag keeps track of where we are
    }
}

function endElement($xp, $name) {
    global $item, $currentElement, $title, $description, $link;
    $currentElement = ''; // stop collecting text once the element has closed
    if ($name == 'ITEM') {
        // If we're at the end of the item element, display
        // the data, and reset the globals
        echo "<b>Title:</b> $title<br>";
        echo "<b>Description:</b> $description<br>";
        echo "<b>Link:</b> $link<br><br>";
        $title = '';
        $description = '';
        $link = '';
        $item = false;
    }
}

function characterDataHandler($xp, $data) {
    global $item, $currentElement, $title, $description, $link;
    if ($item) { // Only add to the globals if we're inside an item element.
        switch ($currentElement) {
            case "TITLE":
                $title .= $data; // We use .= because this function may be called multiple times for one element.
                break;
            case "DESCRIPTION":
                $description .= $data;
                break;
            case "LINK":
                $link .= $data;
                break;
        }
    }
}

// Returns true if the feed was read and parsed, false otherwise
function readFeeds($feed) {
    $fh = fopen($feed, 'r'); // open file for reading
    if (!$fh) {
        return false;
    }
    $xp = xml_parser_create(); // Create an XML parser resource
    xml_set_element_handler($xp, "startElement", "endElement"); // defines which functions to call when element started/ended
    xml_set_character_data_handler($xp, "characterDataHandler");
    $ok = true;
    while ($ok && !feof($fh)) {
        $data = fread($fh, 4096);
        // The third argument tells the parser when the last chunk has arrived
        $ok = ($data !== false) && xml_parse($xp, $data, feof($fh));
    }
    fclose($fh);
    return $ok;
}
?>
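A possible way to get at each item's link so that the article page can be fetched afterwards is sketched below with PHP's SimpleXML extension instead of the event-based parser used in the post. This is only a sketch under assumptions: `readFeedItems()` and `$sampleFeed` are hypothetical names, the inline feed is a made-up stand-in for the real one, and it assumes a plain RSS 2.0 layout with `channel` and `item` elements.

```php
<?php
// Hypothetical helper (not from the original post): returns one array per <item>.
function readFeedItems(string $xml): array
{
    $previous = libxml_use_internal_errors(true); // collect parse errors instead of printing warnings
    $feed = simplexml_load_string($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($feed === false) {
        return []; // the XML could not be parsed
    }

    $items = [];
    foreach ($feed->channel->item as $item) {
        $items[] = [
            'title'       => trim((string) $item->title),
            'description' => trim((string) $item->description),
            'link'        => trim((string) $item->link),
        ];
    }
    return $items;
}

// Made-up sample standing in for the real feed (assumption: RSS 2.0 layout).
$sampleFeed = <<<XML
<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <item>
      <title>Genzyme stumps up 7m for NCG partnership</title>
      <description>Genzyme UK has entered into a partnership.</description>
      <link>http://www.pharmatimes.com/WorldNews/article.aspx?id=12177&amp;src=WorldNewsRSS</link>
    </item>
  </channel>
</rss>
XML;

$items = readFeedItems($sampleFeed);
foreach ($items as $entry) {
    echo "Title: {$entry['title']}\n";
    echo "Description: {$entry['description']}\n";
    echo "Link: {$entry['link']}\n\n";
}
```

With a live feed, the string passed to `readFeedItems()` would come from something like `file_get_contents($feedUrl)`, and each `link` value is the URL to request for the full article.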
whatnow (Author)
Posted November 10, 2007

Never mind, I've found a dirty hack which will do the job. Something like this:

<?php
$config['url'] = "http://www.pharmatimes.com/WorldNews/article.aspx?id=12190"; // url of html to grab
$config['start_tag'] = "<body>"; // where you want to start grabbing
$config['end_tag'] = "</body>"; // where you want to stop grabbing
$config['show_tags'] = 1; // do you want the tags to be shown when you show the html? 1 = yes, 0 = no

class grabber {
    public $error = '';
    public $html = array();

    // Fetch the page and keep everything between $start and $end
    function grabhtml($url, $start, $end) {
        $file = file_get_contents($url);
        if ($file) {
            // preg_quote() keeps special characters in the tags from being read as regex syntax
            $pattern = '#' . preg_quote($start, '#') . '(.*?)' . preg_quote($end, '#') . '#s';
            if (preg_match_all($pattern, $file, $match)) {
                $this->html = $match;
            } else {
                $this->error = "Tags cannot be found.";
            }
        } else {
            $this->error = "Site cannot be found!";
        }
    }

    // Remove the start and end tags unless $show is set
    function strip($html, $show, $start, $end) {
        if (!$show) {
            $html = str_replace($start, "", $html);
            $html = str_replace($end, "", $html);
        }
        return $html;
    }
}

$grab = new grabber;
$grab->grabhtml($config['url'], $config['start_tag'], $config['end_tag']);
echo $grab->error;

$string1 = '';
if (isset($grab->html[0])) { // only set when the tags were found
    foreach ($grab->html[0] as $html) {
        // Keep everything from the opening span tag onwards
        $string1 = stristr($grab->strip($html, $config['show_tags'], $config['start_tag'], $config['end_tag']), '<span class="body">') . "<br>";
    }
}

// Split on the word "span"; piece [1] is the text between its first two occurrences
$string2 = explode('span', $string1);
if (isset($string2[1])) {
    echo $string2[1];
}
?>

It's not very clean, but it works, so hooray for me.
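For comparison with the string-splitting approach above, the same "lift one span out of the page" step can be sketched with PHP's DOM extension, which parses the HTML rather than cutting it at the word "span". This is a sketch under assumptions, not the poster's method: `extractSpanByClass()` and `$sampleHtml` are hypothetical names, the inline HTML is a made-up stand-in for the article page, and the class name is passed in because the thread mentions both 'newsContent' and 'body'.

```php
<?php
// Hypothetical helper (not from the original posts): returns the text of the
// first <span> carrying the given class, or null if there is none.
// Assumption: $class is a simple class name with no quote characters.
function extractSpanByClass(string $html, string $class): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }

    $previous = libxml_use_internal_errors(true); // real-world HTML is rarely valid markup
    $doc = new DOMDocument();
    $loaded = $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return null;
    }

    // Match the class as a whole word inside the class attribute.
    $xpath = new DOMXPath($doc);
    $query = sprintf(
        "//span[contains(concat(' ', normalize-space(@class), ' '), ' %s ')]",
        $class
    );
    $nodes = $xpath->query($query);

    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    // textContent drops nested tags and keeps only the text.
    return trim($nodes->item(0)->textContent);
}

// Made-up page standing in for the real article (plain ASCII on purpose:
// loadHTML() falls back to ISO-8859-1 when the page declares no charset).
$sampleHtml = '<html><body><h1>News</h1>'
    . '<span class="newsContent">Genzyme UK has entered into a <b>partnership</b>.</span>'
    . '</body></html>';

$article = extractSpanByClass($sampleHtml, 'newsContent');
echo "Main Article: " . ($article ?? 'not found') . "\n";
```

Against the live site, `$sampleHtml` would be replaced by the result of `file_get_contents()` on the article URL, as in the post above, and the class name would be whichever one the page actually uses.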