
RSS poster script?


Arel3


I have found a script that posts RSS feeds for me on a site that I'm building. However, I would also like to post other articles that don't have a feed. Is there a legal/respectable way to harvest and post these articles on my site? Does such a script or application exist? What keywords should I search for?

https://forums.phpfreaks.com/topic/179466-rss-poster-script/

However I would like to also post other articles that don't have a feed

If the data is not freely available via an API or feed, then it's almost certain that the website owner either doesn't want you to have the data or hasn't the skills to create a data source. However, most article sites contain user-submitted articles. These do not belong to the website owner, so you will find the same article all over the web.

If you want these articles, then you need to write a bot that can fetch the page content, filter out all the junk (HTML, etc.) and leave just the article. There will not be a ready-made script to do this, as every website is laid out differently; however, the tools to make it work are available. Look at cURL. Be careful when scraping data.
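To make the "fetch the page, strip the markup, keep the article" idea concrete, here is a minimal sketch in Python using only the standard library. It is an illustration, not a drop-in scraper: it assumes the article sits inside one element with a known `id`, that the markup is reasonably well-formed (every non-void tag is closed), and the names `ArticleExtractor`, `extract_article`, `fetch` and the user-agent string are made up for this example. The same approach should carry over to PHP with cURL plus a DOM parser.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen

# Tags whose contents are never article text.
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside"}
# Void tags have no closing tag, so they must not affect depth counting.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta", "area", "base",
             "col", "embed", "source", "track", "wbr"}


class ArticleExtractor(HTMLParser):
    """Collect the text found inside the element with a given id."""

    def __init__(self, container_id):
        super().__init__(convert_charrefs=True)
        self.container_id = container_id
        self.depth = 0       # > 0 while inside the container element
        self.skip_depth = 0  # > 0 while inside a skipped tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
            if self.skip_depth or tag in SKIP_TAGS:
                self.skip_depth += 1
        elif dict(attrs).get("id") == self.container_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.depth and not self.skip_depth:
            text = " ".join(data.split())  # collapse runs of whitespace
            if text:
                self.chunks.append(text)


def extract_article(html, container_id):
    """Return the visible text of the container, one text run per line."""
    parser = ArticleExtractor(container_id)
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def fetch(url, user_agent="example-article-bot/0.1"):
    """Download a page as text; the user-agent value is a placeholder."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Because every site is laid out differently, the only per-site part here is the container id: you would look at each site's page source, find the element that wraps the article, and call something like `extract_article(fetch(url), "story")`, where `"story"` is whatever id that particular site happens to use.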


Thank you both.

jonsjava, the sites I'm referring to that have the articles I'd like to post don't have a feed, so they don't have an XML file. That is very good info though, the kind of info I was looking for...thank you!

Yes, neil.johnson, I am being very careful with it. That is why I've asked the expert freaks ;):P



Do you have suggestions of where I can look to create a bot specific to each site?


Archived

This topic is now archived and is closed to further replies.
