
Regex to get file type from BLOB


onlyican


Hi

 

Background:

We have a large MySQL table which stores data and images as BLOBs (do not ask).

 

My Task:

To extract the images from the SQL and save them as image files

 

Problem:

To work out the image type from the blob itself (whether it's jpg, png, etc.).

 

I have heard that I can use a regex to work out whether the image is jpg / png etc., but I'm not sure what it is. Could anyone advise?

 

Cheers

 


Could you not just use the header?

Or does this not work on blobs?

 

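// Note: header() only sends a response header and returns nothing,
// so none of these conditions pass; it cannot detect the image type.
if (header("Content-type: image/jpeg")) {
    print $row['image'];
} else if (header("Content-type: image/gif")) {
    print $row['image'];
} else if (header("Content-type: image/png")) {
    print $row['image'];
}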

You can try using http://www.php.net/manual/en/function.exif-imagetype.php:

 

file_put_contents($row['id'], $row['bdata']);
switch (exif_imagetype($row['id'])) {
  case IMAGETYPE_GIF:
    rename($row['id'], $row['id'] . '.gif');
    break;
  ..
}

Alternatively, each image type has its own unique 'signature' within the first few bytes of the image. That means you can just run through an array of pre-defined signatures, check which one matches, and then write to a file:

 

function getImageTypeFromBlob($imageData)
{
    $signatures = array(
        'jpg' => "\xFF\xD8\xFF",
        'gif' => "GIF",
        'png' => "\x89PNG",
        'bmp' => "BM",
        'swf' => "CWS"
    );

    $first4Bytes = substr($imageData, 0, 4);

    foreach ($signatures as $imageType => $signature) {
        if (strpos($first4Bytes, $signature) === 0) {
            return $imageType;
        }
    }

    return false;
}
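
Since the original question asked about a regex, the same signature check could also be sketched with preg_match(). This is only an illustration: the function name imageTypeByRegex is made up, and the GIF and PNG patterns are deliberately a little stricter than the prefixes above (the full "GIF87a"/"GIF89a" marker and the full 8-byte PNG signature).

```php
<?php
// Hypothetical helper: returns 'jpg', 'gif', 'png' or 'bmp', or false if nothing matches.
function imageTypeByRegex($imageData)
{
    // \A anchors each pattern at the very start of the data.
    $patterns = array(
        'jpg' => '/\A\xFF\xD8\xFF/',
        'gif' => '/\AGIF8[79]a/',
        'png' => '/\A\x89PNG\r\n\x1A\n/',
        'bmp' => '/\ABM/',
    );

    // Only the first 8 bytes are needed, so avoid scanning the whole blob.
    $header = substr($imageData, 0, 8);

    foreach ($patterns as $imageType => $pattern) {
        if (preg_match($pattern, $header) === 1) {
            return $imageType;
        }
    }

    return false;
}
```

For a fixed prefix a regex is not really needed, so the strpos() version above does the same job with less machinery; the regex form mainly helps when a signature has variants, as with GIF.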

Cheers Adam.

 

I have to do this for thousands of image entries in the db

 

We have approx. 175000 entries per DAY, and this covers at least 3 years of data.

 

So I might use your option, Adam, simply to save one step on the rename (normally I wouldn't bother, but at this scale I think I might).
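
A rough sketch of what writing each file once, with the right extension and no rename, might look like. The function name exportImages and its parameters are hypothetical; the 'id' and 'bdata' keys are taken from the snippets above, and $detect is assumed to be any callable that returns an extension string or false (for example 'getImageTypeFromBlob').

```php
<?php
// Hypothetical helper: writes each blob straight to "<dir>/<id>.<ext>".
// Returns the ids whose type could not be detected so they can be checked by hand.
function exportImages($rows, $dir, $detect)
{
    $skipped = array();

    foreach ($rows as $row) {
        $ext = call_user_func($detect, $row['bdata']);

        if ($ext === false) {
            // Unknown signature: do not guess an extension.
            $skipped[] = $row['id'];
            continue;
        }

        file_put_contents($dir . '/' . $row['id'] . '.' . $ext, $row['bdata']);
    }

    return $skipped;
}
```

At this volume it would probably make sense to feed $rows from an unbuffered or batched query rather than loading everything into memory at once, but that depends on the setup.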

Hmm, I would definitely recommend benchmarking the two methods. Both have positives and negatives, and over 175000 repetitions the difference could be quite substantial. I would still lean towards the exif_imagetype() method, to be honest, just because it's a native function and more robust. I don't think the additional file read and rename would make much of a difference. Benchmark them and find out, though.
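
A minimal way to run that kind of benchmark, assuming a handful of representative blobs are available as strings. The names timeDetector and $bySignature are made up for illustration, and the detector shown is only a stand-in; the two real candidates would be passed in its place.

```php
<?php
// Hypothetical timing helper: calls $fn on every sample, $repeat times over,
// and returns the elapsed wall-clock time in seconds.
function timeDetector($fn, array $samples, $repeat = 1000)
{
    $start = microtime(true);

    for ($i = 0; $i < $repeat; $i++) {
        foreach ($samples as $sample) {
            call_user_func($fn, $sample);
        }
    }

    return microtime(true) - $start;
}

// Stand-in detector for illustration only; swap in the methods being compared.
$bySignature = function ($blob) {
    return strncmp($blob, "\x89PNG", 4) === 0 ? 'png' : false;
};

// Placeholder samples: just signature bytes, not complete images.
$samples = array("\x89PNG\r\n\x1A\n", "GIF89a");

printf("signature check: %.4f s\n", timeDetector($bySignature, $samples));
```

For the file-based method, the timed callable should include the file_put_contents() and rename() calls as well, since that extra disk work is exactly what is being compared.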
