
Extracting the Anchor Text from the RSS...


natasha_thomas


Folks,

 

I tried all my PHP skills to extract the domain name strings from an RSS feed and put each domain name into an array element, but all in vain:

 

Here is the RSS:

 

http://bulliesatwork.co.uk/master/dev/domp/expdom/domains.php

 

What I want to extract:

You will see a list of domain names there, each wrapped in an anchor. All I need is to extract these domain names, like "abc.co uk" (notice the space between .co and uk, which can be replaced with a dot using str_replace()).
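
To make the goal concrete, here is a minimal sketch of the extraction described above, using PHP's built-in DOMDocument and DOMXPath. The sample markup and the function name extractDomains() are hypothetical. The div with class "entry" is only assumed from the selector in the first attempt; the real feed's structure is not shown in this post.

```php
<?php
// Hypothetical sample markup; assumed for illustration only.
$sample = '<div class="entry"><a href="#">abc.co uk</a> <a href="#">example.co uk</a></div>';

// Collect the text of every anchor inside div.entry into an array,
// replacing the stray space so "abc.co uk" becomes "abc.co.uk".
function extractDomains(string $html): array
{
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $domains = [];
    foreach ($xpath->query('//div[@class="entry"]//a') as $anchor) {
        $domains[] = str_replace(' ', '.', trim($anchor->textContent));
    }
    return $domains;
}

print_r(extractDomains($sample));
```

If the anchors are not actually inside a div with class "entry", the XPath expression would need to change accordingly.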

 

Here is my first try (using SimpleHTMLDomParser):

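require_once('simple_html_dom.php');

$html = file_get_html('http://bulliesatwork.co.uk/master/dev/domp/expdom/domains.php');

// No index argument here: find() then returns an array of all matching anchors.
// With an index such as 0 it returns a single element, not a list of anchors.
$domains = $html->find('div[class="entry"] a');

foreach ($domains as $dom)
{
    // Replace the space with a dot, e.g. "abc.co uk" becomes "abc.co.uk"
    echo str_replace(' ', '.', $dom->plaintext);
}

$html->clear();
unset($html);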

 

 

 

Here is my other try, with DOM Document:

$scrapeurl = 'http://bulliesatwork.co.uk/master/dev/domp/expdom/domains.php';

$keywords = file_get_contents($scrapeurl);

$keywords = json_decode($keywords);

foreach ($keywords->responseData->results as $keyword)
{
    echo str_replace("...", ".", $keyword->title).'<br/>';
}

 

 

 

In both cases a DOM document is created, but it seems to contain all the information except the domain names I want to extract.
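
One way to check what each attempt actually receives is to classify the fetched body before parsing it. The sketch below is only an illustration: classifyPayload() is a hypothetical helper, and it assumes the SimpleXML extension is available. Note that json_decode() returns null for input that is not valid JSON, so the second attempt would have nothing to loop over if the URL returns XML or HTML.

```php
<?php
// Hypothetical helper: report whether a response body parses as JSON, XML, or neither.
function classifyPayload(string $body): string
{
    json_decode($body);
    if (json_last_error() === JSON_ERROR_NONE) {
        return 'json';
    }

    // Collect XML parse errors internally instead of printing warnings.
    libxml_use_internal_errors(true);
    $xml = simplexml_load_string($body);
    libxml_clear_errors();

    return $xml !== false ? 'xml' : 'other';
}

// Example with a small inline RSS-like string (not the real feed).
echo classifyPayload('<rss><channel><title>t</title></channel></rss>'), "\n";
```

Passing the result of file_get_contents() on the feed URL to this helper would show which kind of parser is worth trying.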

 

Please help me extract the domain names.

 

Cheers

