Faster solution to check image


eevan79

I use a script for sharing links. Everything works fine, but checking for valid images can take a long time.

First I fetch the page with cURL and parse certain tags (for the description and title), then I use the following script to find the images in the submitted link:

 

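libxml_use_internal_errors(true); // collect HTML parse warnings instead of silencing them with @
$dom = new DOMDocument;
$dom->preserveWhiteSpace = false; // set before loading the document
// The XML prolog is a common trick to make loadHTML() treat the markup as UTF-8.
$dom->loadHTML('<?xml encoding="UTF-8">' . substr($content, 0, 35000));
$images = $dom->getElementsByTagName('img');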

 

 

Sometimes the images do not have a usable path (depending on whether the link is a permalink or not), so I then call getimagesize() on each one. That works well, but it can be very slow: sometimes it takes 15-30 seconds.

Without getimagesize() the whole process takes a few seconds.

Is there a better (faster) method to check whether an image is valid or not?
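
For the "no usable path" case, one option is to turn each src value into an absolute URL before checking it at all. The sketch below is only an illustration: resolveImageUrl() is a hypothetical helper that is not part of the script above, and it handles just the common forms (absolute, protocol-relative, root-relative and document-relative paths), not every rule of URL resolution.

```php
<?php
// Hypothetical helper (not from the original script): turns an <img src> value
// into an absolute URL, using the URL of the page it was found on.
function resolveImageUrl($pageUrl, $src) {
    $src = trim($src);
    if ($src === '') {
        return null;
    }
    // Already absolute: http://... or https://...
    if (preg_match('#^https?://#i', $src)) {
        return $src;
    }
    // Any other scheme (data:, javascript:, ...) is not a fetchable image URL here.
    if (preg_match('#^[a-z][a-z0-9+.\-]*:#i', $src)) {
        return null;
    }
    $base = parse_url($pageUrl);
    if (empty($base['scheme']) || empty($base['host'])) {
        return null; // cannot resolve without a usable base URL
    }
    // Protocol-relative: //cdn.example.com/a.png
    if (substr($src, 0, 2) === '//') {
        return $base['scheme'] . ':' . $src;
    }
    $root = $base['scheme'] . '://' . $base['host']
          . (isset($base['port']) ? ':' . $base['port'] : '');
    // Root-relative: /images/a.png
    if ($src[0] === '/') {
        return $root . $src;
    }
    // Document-relative: images/a.png or ../a.png
    $path = isset($base['path']) ? $base['path'] : '/';
    $dir = substr($path, 0, strrpos($path, '/') + 1);
    $segments = array();
    foreach (explode('/', $dir . $src) as $segment) {
        if ($segment === '..') {
            array_pop($segments);           // step up one directory
        } elseif ($segment !== '.' && $segment !== '') {
            $segments[] = $segment;
        }
    }
    return $root . '/' . implode('/', $segments);
}
```

It could be called as resolveImageUrl($pageUrl, $img->getAttribute('src')) inside a loop over $images, where $pageUrl is assumed to hold the submitted link. URLs that come back as null can be skipped without any network request.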

 


getimagesize() itself is not slow, but because you are requesting the file from a remote location it can become slow, and if you call it over and over in the same script it can take a while. If you must check for valid images, you're stuck between a rock and a hard place because, ultimately, you have to open some sort of socket and request the image. I can assure you that fetching the image normally with either sockets or file_get_contents() will take just as long.


I have limited the parser to checking 5 images (the user can then choose one image that will appear with the submitted link).

Even with that limit, the process can still take a while (5-20 seconds), although sometimes it finishes in a few seconds. It seems that there is no other (faster) solution.
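
One thing that might help with the 5-image case is to run the checks in parallel instead of one after another, so the total wait is roughly that of the slowest image rather than the sum of all of them. The following is a minimal sketch using PHP's curl_multi functions with HEAD requests. The function name checkImagesParallel(), the 5 second default timeout and the "200 plus image/* Content-Type" rule are all assumptions made for illustration, not something taken from the script above.

```php
<?php
// Sketch: check several image URLs in parallel with HEAD requests.
// checkImagesParallel() and its parameters are illustrative names.
function checkImagesParallel(array $urls, $timeout = 5) {
    $results = array();
    if (count($urls) === 0) {
        return $results;
    }
    $mh = curl_multi_init();
    $handles = array();
    foreach ($urls as $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_NOBODY, true);            // HEAD: headers only
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);    // follow redirects
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);       // cap the time spent per image
        curl_multi_add_handle($mh, $ch);
        $handles[$url] = $ch;                              // duplicate URLs collapse into one entry
    }
    // Drive all transfers until none is still running.
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running > 0) {
            curl_multi_select($mh, 1.0); // wait for activity instead of busy-looping
        }
    } while ($running > 0 && $status === CURLM_OK);

    foreach ($handles as $url => $ch) {
        $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        $type = (string) curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
        // Accept only a 200 response whose Content-Type starts with "image/".
        $results[$url] = ($code === 200 && stripos($type, 'image/') === 0);
        curl_multi_remove_handle($mh, $ch);
    }
    return $results;
}
```

The result is an array keyed by URL with true or false values. Some servers answer HEAD requests incorrectly or not at all, so a false here is a hint rather than proof that the image is missing.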


Could this help at all?

 

<form name="input" action="" method="get">
Check Files Remote Url: <input type="text" name="url" />
<input type="submit" value="Go" />
</form> 

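<?php
// A URL is not SQL data, so mysql_real_escape_string() (removed in PHP 7) is dropped here.
$url = isset($_GET['url']) ? trim($_GET['url']) : '';

// Sends a HEAD request and reports whether the server answered without an HTTP error.
function checkRemote($check_url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $check_url);
    curl_setopt($ch, CURLOPT_NOBODY, true);         // HEAD request: headers only, no body
    curl_setopt($ch, CURLOPT_FAILONERROR, true);    // treat HTTP status >= 400 as a failure
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    return curl_exec($ch) !== false;
}

if ($url !== '') {
    $safe_url = htmlspecialchars($url, ENT_QUOTES); // escape user input before echoing it
    if (checkRemote($url)) {
        echo "$safe_url exists.";
    } else {
        echo "$safe_url doesn't exist.";
    }
}
?>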

 

Now that you know whether the URL actually exists, you can check the end of the URL to see if it has an acceptable image extension such as .png, .jpg, .jpeg, .gif or .bmp.


I added the image check. I'm sure you can shift things around or even make it all one function.

 

<form name="input" action="" method="get">
Check Files Remote Url: <input type="text" name="url" />
<input type="submit" value="Go" />
</form> 

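<?php
// A URL is not SQL data, so mysql_real_escape_string() (removed in PHP 7) is dropped here.
$url = isset($_GET['url']) ? trim($_GET['url']) : '';
$safe_url = htmlspecialchars($url, ENT_QUOTES); // escape user input before echoing it

// Sends a HEAD request and reports whether the server answered without an HTTP error.
function checkRemote($check_url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $check_url);
    curl_setopt($ch, CURLOPT_NOBODY, true);
    curl_setopt($ch, CURLOPT_FAILONERROR, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    return curl_exec($ch) !== false;
}

$check_ok = false; // defined up front so the final test never reads an unset variable
if ($url !== '') {
    if (checkRemote($url)) {
        echo "$safe_url exists.<br />";
        $allowedExtensions = array("gif", "png", "bmp", "jpg", "jpeg", "ico");

        // end(explode(...)) passes a temporary by reference and raises a notice; pathinfo() avoids that.
        $extension = strtolower(pathinfo($url, PATHINFO_EXTENSION));
        if (!in_array($extension, $allowedExtensions)) {
            echo "not allowed image type";
        } else {
            echo "$safe_url allowed and good image type.<br />";
            $check_ok = true;
        }
    } else {
        echo "$safe_url doesn't exist.<br />";
    }
}

if ($check_ok) {
    echo "Do something with this good image at $safe_url";
}
?>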


Ha, I couldn't stand it, I was compelled to make this into a single function.

 

<form name="input" action="" method="get">
Check Files Remote Url: <input type="text" name="url" />
<input type="submit" value="Go" />
</form> 

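<?php
// A URL is not SQL data, so mysql_real_escape_string() (removed in PHP 7) is dropped here.
$url = isset($_GET['url']) ? trim($_GET['url']) : '';

// Returns true only when the URL answers a HEAD request and ends in an allowed image extension.
function checkRemote($check_url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $check_url);
    curl_setopt($ch, CURLOPT_NOBODY, true);
    curl_setopt($ch, CURLOPT_FAILONERROR, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    if (curl_exec($ch) === false) {
        return false;
    }

    $allowedExtensions = array("gif", "png", "bmp", "jpg", "jpeg", "ico");
    // end(explode(...)) passes a temporary by reference and raises a notice; pathinfo() avoids that.
    $extension = strtolower(pathinfo($check_url, PATHINFO_EXTENSION));
    return in_array($extension, $allowedExtensions);
}


//usage
if ($url !== '') {
    $safe_url = htmlspecialchars($url, ENT_QUOTES); // escape user input before echoing it
    if (checkRemote($url)) {
        echo "$safe_url good image.";
    } else {
        echo "$safe_url doesn't exist or not an accepted image type.";
    }
}

?>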


That cURL check is quite fast. I think it will speed up the process a lot, though I have yet to test it. I already use a script that checks for a valid link, so now I will use it for images as well.

Thanks ... I wonder why it never occurred to me to use cURL for all this  :)


Just had a thought on how to make it even faster.

I would check whether the image extension is acceptable first, and only then do the cURL check.

There is no sense checking a URL with cURL if it doesn't have an image extension in the first place.


I added a few more things to this.

It checks the file extension first and only continues with the rest if it matches.

It checks that there is no redirection, i.e. that cURL ends up at the same link that was entered.

It lowercases the protocol and the parsed domain part before comparing the two URLs.

 

<form name="input" action="" method="get">
Check if image url exists: <input type="text" name="url" style="width:480px" />
<input type="submit" value="Go" />
</form> 

<?php
// A URL is not SQL data, so mysql_real_escape_string() (removed in PHP 7) is dropped here.
$url = isset($_GET['url']) ? trim($_GET['url']) : '';

// Helpers live at top level: declaring them inside checkRemote() is fatal on a second call.
// Returns the host part of a URL, also when the scheme is missing (e.g. "example.com/a.png").
function parsedHost($new_parse_url) {
    $parsedUrl = parse_url(trim($new_parse_url));
    if (!empty($parsedUrl['host'])) {
        return trim($parsedUrl['host']);
    }
    $parts = explode('/', isset($parsedUrl['path']) ? $parsedUrl['path'] : '', 2);
    return trim($parts[0]);
}

// Lowercases the host and strips the protocol and a leading "www." so two URLs can be compared.
function trimProtocol($the_url) {
    $the_url = trim($the_url);
    $host = parsedHost($the_url);
    if ($host !== '' && ($pos = stripos($the_url, $host)) !== false) {
        // Lowercase only the host; the path may be case-sensitive.
        $the_url = substr_replace($the_url, strtolower($host), $pos, strlen($host));
    }
    // ltrim() takes a character list, not an array of prefixes, so a regex is used instead.
    return preg_replace('#^(https?://)?(www\.)?#i', '', $the_url);
}

function checkRemote($check_url) {
    $check_url = trim($check_url);
    $allowedExtensions = array("gif", "png", "bmp", "jpg", "jpeg", "ico");

    // Cheap test first: no request is made unless the extension is acceptable.
    $extension = strtolower(pathinfo($check_url, PATHINFO_EXTENSION));
    if (!in_array($extension, $allowedExtensions)) {
        echo " not allowed type -";
        return false;
    }

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $check_url);
    curl_setopt($ch, CURLOPT_NOBODY, true);
    curl_setopt($ch, CURLOPT_FAILONERROR, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $exists = curl_exec($ch) !== false;
    // The effective URL is only meaningful after curl_exec() has run.
    $lastUrl = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);

    if ($exists && trimProtocol($check_url) == trimProtocol($lastUrl)) {
        echo "<h3>Image Exists</h3>";
        return true;
    }
    echo "image does not exist -";
    return false;
}


//usage
if ($url !== '') {
    if (checkRemote($url)) {
        if (substr($url, 0, 4) != "http") {
            $url = "http://$url";
        }
        $safe_url = htmlspecialchars($url, ENT_QUOTES); // escape user input before echoing it
        echo "<a href='$safe_url'><img src='$safe_url' border='0'></a>";
    } else {
        echo htmlspecialchars($url, ENT_QUOTES) . " doesn't exist or not an accepted image type.";
    }
} else {
    echo "<h3>Please insert an image location.</h3>";
}
echo '<br />';
?>

 

I do see some flaws with this method; it may be better to actually fetch the image info.

 

One flaw is a query string after the image name, although it is possible to strip the query after the extension:

http://mm04.nasaimages.org/MediaManager/srvr?mediafile=/Size1/nasaNAS-12-NA/66989/sig07-009_mac.jpg&userid=1&username=admin&resolution=1&servertype=JVA&cid=12&iid=nasaNAS&vcid=NA&usergroup=spitzer_-_nasa-12-Admin&profileid=56

 

but images served through a script are impossible to identify without looking at the actual image info:

http://get.blogdns.com/url-thumb.php?size=400&text=Dynaindex.com&textsize=12&textcolor=aqua&url=http://kindergoth.com/
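
For the simpler query-string case, a sketch of one possible approach is to read the extension from the path part of the URL only. hasImageExtension() is a hypothetical name used for illustration; the allowed list is the one from the code earlier in this thread.

```php
<?php
// Illustrative helper: read the extension from the URL path only,
// so a query string after the file name does not get in the way.
function hasImageExtension($url) {
    $allowed = array('gif', 'png', 'bmp', 'jpg', 'jpeg', 'ico');
    $path = parse_url(trim($url), PHP_URL_PATH);
    if (!is_string($path)) {
        return false; // no path, or the URL could not be parsed
    }
    $extension = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    return in_array($extension, $allowed, true);
}
```

This accepts something like http://example.com/photo.JPG?size=large (a made-up URL), but it still rejects both links quoted above: in the first one the .jpg name sits inside the query string rather than the path, and the second one is a .php script. So it only narrows the problem and does not solve it.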

 

I came across this, which gets the image size without using getimagesize():

 

http://mtekk.us/archives/guides/check-image-dimensions-without-getimagesize/

 

But it has to work differently for each image type, and that still wouldn't help if it was a script.

It seems there is no extremely fast, 100% correct way to do this.
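
A cheaper middle ground than decoding the whole file might be to look only at the first few bytes of whatever the URL returns, since the common image formats start with a fixed signature. The sketch below assumes the bytes have already been fetched somehow; sniffImageType() is a hypothetical helper, and the table covers only the types from the allowed list used earlier in this thread.

```php
<?php
// Illustrative helper: guess the image type from the first bytes of a file.
// Returns a short type name, or null when no known signature matches.
function sniffImageType($bytes) {
    $signatures = array(
        'png'  => "\x89PNG\r\n\x1a\n",
        'jpeg' => "\xFF\xD8\xFF",
        'gif'  => 'GIF8',               // covers both GIF87a and GIF89a
        'bmp'  => 'BM',
        'ico'  => "\x00\x00\x01\x00",
    );
    foreach ($signatures as $type => $signature) {
        if (strncmp($bytes, $signature, strlen($signature)) === 0) {
            return $type;
        }
    }
    return null;
}
```

Because it looks at the response body rather than the URL, the same check applies whether the link ends in .jpg or points at a script. The bytes could come from a cURL request limited with CURLOPT_RANGE (for example '0-15'), assuming the remote server honours range requests; if it does not, the whole file is downloaded anyway. Short signatures such as 'BM' can also match non-image data, so this is a quick filter and not a full validation.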

 

