jaja13's Achievements



  1. Thank you for the beautiful code! You made my life much easier.
  2. To clarify, a match is a domain that is linked from 2 or more pages. A domain found more than once on the same page does not count as a match.
  3. I'll have to read up on this more. Thanks for the suggestions.
  4. No spam. I want to run this script with cron and get email alerts when it finds sites that are generating buzz (linked from multiple sites in my list).
  5. I'm just learning PHP and I have a web scraper I'm working on using Simple HTML DOM. It's almost complete but still lacks a bit of logic.

     What I want the script to do is scrape multiple pages and compare the links, and IF a matching domain is found linked from more than 1 page, send an email.

     What I've come up with works for matching a domain that's hard-coded into the script, but I want to match domains from other pages. Also, the script sends an email for every match it finds, but I just want 1 email with all the matching domains.

     I believe array_intersect() is the function I need to be working with, but I can't figure this out. I will be so happy if I can get this completed. Thanks for your time and consideration.

     Here is my code:

         // Pull in PHP Simple HTML DOM Parser
         include("simple_html_dom.php");

         $sitesToCheck = array(
             array("url" => "http://www.google.com"),
             array("url" => "http://www.yahoo.com"),
             array("url" => "http://www.facebook.com")
         );

         // For every page to check...
         foreach ($sitesToCheck as $site) {
             $url = $site["url"];

             // Get the URL's current page content
             $html = file_get_html($url);

             // Find all links
             foreach ($html->find('a') as $element) {
                 $href = $element->href;
                 $link = $href;

                 // Reduce the link to its domain
                 $pattern = '/\w+\..{2,3}(?:\..{2,3})?(?:$|(?=\/))/i';
                 $domain = $link;
                 if (preg_match($pattern, $domain, $matches) === 1) {
                     $domain = $matches[0];
                 }

                 // This works for matching google.com
                 // but I want to match with $domain from other sites
                 if (preg_match("/google\.com/", $domain)) {
                     mail("someone@example.com", "Match found", $domain);
                 } else {
                     echo "A match was not found." . "<br />";
                 }
             }
         }
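
One way to get the behaviour described in the posts above (a domain counts only if it is linked from 2 or more pages, repeats on one page are ignored, and a single email lists every match) is to count pages per domain instead of using array_intersect(), which only keeps values present in every array. The following is a minimal sketch, not a definitive implementation. It assumes PHP 7.1 or later, the helper names `hostOf` and `domainsOnMultiplePages` are hypothetical, and it swaps the regex for the built-in `parse_url()`, so subdomains such as `mail.example.com` are treated as separate from `example.com`.

```php
<?php
// Hypothetical helper: reduce an href to its lowercase host name,
// or return null when the link has no host (relative links, mailto:, etc.).
function hostOf(string $href): ?string
{
    $host = parse_url($href, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null;
    }
    // Drop a leading "www." so www.example.com and example.com compare equal.
    return preg_replace('/^www\./', '', strtolower($host));
}

// Hypothetical helper: takes an array of page URL => list of hrefs and
// returns the sorted list of domains linked from at least $minPages pages.
function domainsOnMultiplePages(array $linksByPage, int $minPages = 2): array
{
    $pageCount = [];
    foreach ($linksByPage as $hrefs) {
        // Collect each domain once per page, so repeats on one page do not count.
        $onThisPage = [];
        foreach ($hrefs as $href) {
            $host = hostOf($href);
            if ($host !== null) {
                $onThisPage[$host] = true;
            }
        }
        foreach (array_keys($onThisPage) as $host) {
            $pageCount[$host] = ($pageCount[$host] ?? 0) + 1;
        }
    }
    $matches = array_keys(array_filter($pageCount, function ($n) use ($minPages) {
        return $n >= $minPages;
    }));
    sort($matches);
    return $matches;
}

// Wiring sketch for the scraper from the post (left as comments because it needs network access):
// $linksByPage = [];
// foreach ($sitesToCheck as $site) {
//     $html = file_get_html($site["url"]);
//     if ($html === false) { continue; }          // skip pages that failed to load
//     foreach ($html->find('a') as $element) {
//         $linksByPage[$site["url"]][] = $element->href;
//     }
// }
// $matches = domainsOnMultiplePages($linksByPage);
// if ($matches) {
//     mail("someone@example.com", "Matches found", implode("\n", $matches));
// }
```

Because `mail()` is called once, after all pages have been processed, the cron job would send at most one message per run. Each scraped page also links to its own domain, so it may be worth filtering the pages' own hosts out of the result before sending.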