
scraping websites


phpsycho


Okay, so I am scraping websites for their descriptions, keywords, and titles.

I noticed that a lot of websites use the same keywords and description on every page.

So my idea is to scrape the index page, find all the links on it, and scrape those too. Once they have all been scraped, I would compare the descriptions, and if they match, pull some text unique to each page and use that instead.

I can't seem to wrap my head around it. How would I accomplish this?

I scrape with curl, then find the keywords, description, and title, then find all the links on the site and scrape those.
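For illustration, the parsing half of that step could look roughly like the sketch below. It is written in Python with only the standard library, and it assumes the HTML has already been fetched (with curl or anything else). The names `PageInfo` and `parse_page` are made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageInfo(HTMLParser):
    """Collects the title, description/keywords meta tags, and links of one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.title = ""
        self.meta = {}
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            if name in ("description", "keywords"):
                self.meta[name] = attrs.get("content") or ""
        elif tag == "a" and attrs.get("href"):
            # Resolve relative links against the page URL.
            self.links.append(urljoin(self.base_url, attrs["href"]))

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def parse_page(html, base_url):
    """Hypothetical helper: parse already-fetched HTML and return the collected info."""
    parser = PageInfo(base_url)
    parser.feed(html)
    parser.close()
    return parser
```

Each URL in `links` could then be fetched and passed through `parse_page` in turn. Restricting the crawl to the same domain and skipping URLs that were already visited are left out of this sketch.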

 

So I was thinking of making an array of the descriptions, checking it, and then inserting into the db, but it doesn't seem like that would work.
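A minimal sketch of the array-of-descriptions idea, in Python for illustration. It assumes the scraped descriptions are held in a dict keyed by URL. The function names are hypothetical, and the database insert is left out.

```python
from collections import Counter


def normalize(text):
    # Collapse whitespace and ignore case so near-identical copies still match.
    return " ".join(text.split()).lower()


def find_shared_descriptions(pages):
    """pages maps URL -> meta description (or None/"" if the page has none).

    Returns the set of normalized descriptions used by more than one page.
    """
    counts = Counter(normalize(desc) for desc in pages.values() if desc)
    return {desc for desc, n in counts.items() if n > 1}


def pages_needing_unique_text(pages):
    """URLs whose description is missing or shared with another page."""
    shared = find_shared_descriptions(pages)
    return [
        url
        for url, desc in pages.items()
        if not desc or normalize(desc) in shared
    ]
```

Pages not returned by `pages_needing_unique_text` can keep their own description. Only the returned ones would need replacement text pulled from the page body.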

Any ideas?

 

Oh, also: how would I grab just the text from each page that is different from every other page?
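One possible approach, sketched in Python under some assumptions: split each page into visible text chunks, count how many pages each chunk appears on, and keep only the chunks that appear on exactly one page. Menus, footers, and other shared template text then drop out. The names `TextBlocks`, `text_blocks`, and `unique_text_per_page` are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser


class TextBlocks(HTMLParser):
    """Collects visible text chunks, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        text = " ".join(data.split())
        if text and not self._skip:
            self.blocks.append(text)


def text_blocks(html):
    parser = TextBlocks()
    parser.feed(html)
    parser.close()
    return parser.blocks


def unique_text_per_page(html_by_url):
    """html_by_url maps URL -> HTML. Returns URL -> chunks found on that page only."""
    blocks_by_url = {url: text_blocks(html) for url, html in html_by_url.items()}
    # set() so a chunk repeated within one page still counts as one page.
    page_counts = Counter(
        block for blocks in blocks_by_url.values() for block in set(blocks)
    )
    return {
        url: [block for block in blocks if page_counts[block] == 1]
        for url, blocks in blocks_by_url.items()
    }
```

This only works once several pages of the same site have been scraped, since with a single page every chunk counts as unique. The first few unique chunks of a page could then stand in for its description.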

Lol, very confusing.

