
wannabe21

New Members
  • Posts

    6

  1. I can't edit my previous post, so below is a newer version of the function with some mistakes removed:

         function getFavicon($url)
         {
             // If the URL has no scheme (http or whatever), assume http and add it.
             $check = parse_url($url);
             if (empty($check['scheme'])) {
                 $url = 'http://' . ltrim($url, '/');
                 $check = parse_url($url);
             }

             // Scheme and host are all we need; get rid of the path.
             if (empty($check['host'])) {
                 return false;
             }
             $url = $check['scheme'] . '://' . $check['host']; // put back original scheme

             $ch = curl_init($url);
             curl_setopt($ch, CURLOPT_HEADER, 0);
             curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
             curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 2);
             curl_setopt($ch, CURLOPT_TIMEOUT, 2); // timeout in seconds
             $content = curl_exec($ch);

             $href = false;
             if (!empty($content)) {
                 $dom = new DOMDocument();
                 @$dom->loadHTML($content); // suppress warnings from malformed HTML
                 foreach ($dom->getElementsByTagName('link') as $item) {
                     // rel values are case-insensitive, so normalise before comparing
                     $rel = strtolower($item->getAttribute('rel'));
                     if ($rel == 'icon' or $rel == 'shortcut icon') {
                         $href = $item->getAttribute('href');
                         break;
                     }
                 }
             }

             if ($href !== false) {
                 $href = parse_url($href);
                 if (!empty($href['path'])) {
                     // make sure exactly one slash separates host and path
                     return $url . '/' . ltrim($href['path'], '/');
                 }
             }
             return false;
         }
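One case the function above does not cover is a favicon served from a different host (for example a CDN), because only the path of the `href` is kept. A minimal sketch of a more careful resolution step follows. The helper name `resolveFaviconHref` is hypothetical, and it assumes the page was fetched from the site root, as `getFavicon()` does, so it does not normalise `../` segments.

```php
<?php
// Hypothetical helper (not part of the original post): turn the href of a
// <link rel="icon"> tag into an absolute URL, given the origin of the page.
function resolveFaviconHref($origin, $href)
{
    $href = trim($href);
    if ($href === '') {
        return false;
    }
    // Already absolute (e.g. served from a CDN) or an inline data: URI:
    // return it unchanged.
    if (preg_match('#^[a-z][a-z0-9+.\-]*://#i', $href) || stripos($href, 'data:') === 0) {
        return $href;
    }
    $base = parse_url($origin);
    if (empty($base['scheme']) || empty($base['host'])) {
        return false;
    }
    // Protocol-relative ("//cdn.example.net/icon.png"): reuse the page's scheme.
    if (substr($href, 0, 2) === '//') {
        return $base['scheme'] . ':' . $href;
    }
    // Root-relative or plain relative path: resolve against the site root.
    $port = empty($base['port']) ? '' : ':' . $base['port'];
    return $base['scheme'] . '://' . $base['host'] . $port . '/' . ltrim($href, '/');
}
```

Inside `getFavicon()` this would replace the final `parse_url($href)` step, e.g. `return resolveFaviconHref($url, $href);`.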
  2. I put something together and would appreciate feedback; perhaps it can be made faster. You can 'feed' the function below any URL. It checks whether there is a scheme (i.e. http, ftp, feed, etc.); if there isn't one, it assumes http and adds it. It then fetches the document and looks for a favicon. If there is one, it returns the full path to the favicon. If no favicon can be found, it returns FALSE.

         function getFavicon($url)
         {
             // Assume http when the URL has no scheme.
             $check = parse_url($url);
             if (empty($check['scheme'])) {
                 $url = 'http://' . ltrim($url, '/');
             }

             // Get the host and check that something is there.
             $parts = parse_url($url);
             if (empty($parts['host'])) {
                 return false;
             }
             $url = $parts['scheme'] . '://' . $parts['host']; // put back original scheme

             $href = false;
             $ch = curl_init($url);
             curl_setopt($ch, CURLOPT_HEADER, 0);
             curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
             curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 2);
             curl_setopt($ch, CURLOPT_TIMEOUT, 2); // timeout in seconds
             $content = curl_exec($ch);

             if (!empty($content)) {
                 $dom = new DOMDocument();
                 @$dom->loadHTML($content); // suppress warnings from malformed HTML
                 $items = $dom->getElementsByTagName('link');
                 foreach ($items as $item) {
                     $rel = $item->getAttribute('rel');
                     if ($rel == 'icon' or $rel == 'shortcut icon') {
                         $href = $item->getAttribute('href');
                         break;
                     }
                 }
             }

             if ($href !== false) {
                 $href = parse_url($href);
                 if (!empty($href['path'])) {
                     return $url . '/' . ltrim($href['path'], '/');
                 }
             }
             return false; // no favicon found
         }
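Returning FALSE when no `<link>` tag is found may be too strict, since browsers conventionally request `/favicon.ico` from the site root when a page declares no icon. A minimal sketch of the lookup step with that fallback follows. The helper name `faviconHrefFromHtml` and its `$fallback` parameter are hypothetical, and the fallback is only a guess: a further request would be needed to confirm that the file exists.

```php
<?php
// Hypothetical helper: find the favicon href in an HTML string, or fall back
// to the conventional /favicon.ico location when no <link> tag declares one.
function faviconHrefFromHtml($html, $fallback = '/favicon.ico')
{
    // DOMDocument::loadHTML() rejects empty input, so check first.
    if (trim($html) === '') {
        return $fallback;
    }
    $dom = new DOMDocument();
    @$dom->loadHTML($html); // suppress warnings from malformed HTML
    foreach ($dom->getElementsByTagName('link') as $link) {
        // rel is a space-separated, case-insensitive token list, so
        // "icon", "shortcut icon" and "Shortcut Icon" all contain "icon".
        $tokens = preg_split('/\s+/', strtolower(trim($link->getAttribute('rel'))));
        $href = $link->getAttribute('href');
        if (in_array('icon', $tokens, true) && $href !== '') {
            return $href;
        }
    }
    return $fallback;
}
```

Matching on the `icon` token instead of comparing whole strings also avoids missing tags that use different capitalisation or extra whitespace.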
  3. I don't want to use it because it creates a dependency. I 'just' want to fetch the URL of the favicon for any given URL.
  4. Hi, what is the fastest way to get a favicon from any given URL? Please don't suggest using http://www.google.com/s2/favicons?domain_url=
  5. You mentioned that partially downloading images is horrible; is there any other way to get their size? I am aware that websites are not built uniformly and that there is no standard for how content is structured. Also, excluding third-party links is not an option, because some websites pull their images from a completely different domain.
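Since the dimensions are stored inside the image file itself, some bytes have to be fetched either way, but for some formats very few are needed. A minimal sketch for PNG and GIF, which keep width and height at fixed offsets near the start of the file, follows. The helper name `imageSizeFromHeader` is hypothetical. JPEG is not covered, because its dimensions sit in a frame marker whose position varies from file to file.

```php
<?php
// Hypothetical helper: read width and height from the first bytes of a PNG
// or GIF file. Returns array(width, height), or false when the format is
// not recognised or too few bytes were supplied.
function imageSizeFromHeader($bytes)
{
    // PNG: 8-byte signature, then the IHDR chunk. Width and height are
    // big-endian 32-bit integers at byte offsets 16 and 20.
    if (strlen($bytes) >= 24 && substr($bytes, 0, 8) === "\x89PNG\r\n\x1a\n") {
        $dim = unpack('Nwidth/Nheight', substr($bytes, 16, 8));
        return array($dim['width'], $dim['height']);
    }
    // GIF: "GIF87a" or "GIF89a", then little-endian 16-bit width and height
    // at byte offsets 6 and 8.
    if (strlen($bytes) >= 10 && preg_match('/^GIF8[79]a/', $bytes)) {
        $dim = unpack('vwidth/vheight', substr($bytes, 6, 4));
        return array($dim['width'], $dim['height']);
    }
    return false;
}
```

With cURL, the leading bytes could be requested with `curl_setopt($ch, CURLOPT_RANGE, '0-23')`, although a server is free to ignore the range and send the whole file.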
  6. Hi,

     This is my first post, and I would like to kick it off with a question. I am trying to write a function that grabs the most relevant image from a website, but I have run into a few problems that I need a bit of help with.

     To get an idea of what I am trying to accomplish, I suggest reading the article on: shareaholic. The article describes in some depth what you should do to grab the most relevant image from a website. Most of the functionality described in it already works in my own program:

       • Just 32 KB of each image is fetched to get the headers, so that I can calculate the width, height and aspect ratio.
       • The array is sorted by size, biggest on top and smallest at the bottom.
       • All OG meta tags and Twitter tags are found and stored in an array.
       • To do: compile a list of the most used DIV IDs for content; WordPress uses 'content', other CMSs use 'main', etc.

     With this information I should be able to grab the most relevant image from most websites, BUT... the biggest image is not always the most relevant one, and some ad images on certain websites are quite big and have the perfect aspect ratio, so I get wrong results.

     Did anyone here ever try to do the same thing? If so, how did you work around 'my' problem? Perhaps my approach is not correct at all.

     Thanks in advance,
     W//
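One possible way around the ad problem is to stop ranking by size alone and combine the signals listed above into a single score. The sketch below is only an illustration: the function names, the `fromMeta` and `inContent` flags, and every weight and threshold are assumptions, not values from the post or the article. The dimensions in `$adSizes` are common standard banner formats.

```php
<?php
// Hypothetical scoring sketch. Each image is an array with 'width' and
// 'height', plus optional flags 'fromMeta' (declared in an OG or Twitter
// tag) and 'inContent' (found inside a known content DIV).
function scoreImage(array $img)
{
    $w = isset($img['width']) ? (int) $img['width'] : 0;
    $h = isset($img['height']) ? (int) $img['height'] : 0;
    if ($w < 1 || $h < 1) {
        return 0.0;
    }
    $score = sqrt($w * $h); // base score grows with image area

    // Penalise exact matches with common ad banner dimensions.
    $adSizes = array('300x250', '336x280', '728x90', '160x600', '300x600', '970x250', '320x50');
    if (in_array($w . 'x' . $h, $adSizes, true)) {
        $score *= 0.1;
    }
    // Penalise extreme aspect ratios (banners, spacers, skyscrapers).
    $ratio = $w / $h;
    if ($ratio > 3 || $ratio < 1 / 3) {
        $score *= 0.5;
    }
    // Reward images the site itself marked as representative.
    if (!empty($img['fromMeta'])) {
        $score *= 3;
    }
    // Reward images located inside the main content container.
    if (!empty($img['inContent'])) {
        $score *= 2;
    }
    return $score;
}

// Return the highest-scoring image, or null when the list is empty.
function pickBestImage(array $images)
{
    $best = null;
    $bestScore = 0.0;
    foreach ($images as $img) {
        $score = scoreImage($img);
        if ($score > $bestScore) {
            $best = $img;
            $bestScore = $score;
        }
    }
    return $best;
}
```

With these weights, a 400x300 image inside the content DIV outranks both a 300x250 ad and a much larger 1200x200 banner, which is the ordering the post is after. The weights would need tuning against real pages.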