Image uploads and malicious scripts


I've been surfing the web and reading various articles, and probably have more questions than answers, so any guidance or direction to resources will (hopefully) be useful.

I'm trying to connect the dots to better understand the security issues involved in uploading image files.

From its inception, light hits a camera sensor and an image is created. Is it in binary form? ASCII? Other?

Now suppose additional code is added to the image. (For this example, let's say it's a simple script that says Hello - which I suppose would STILL be considered malicious).

If it's simply placed into the image code, how can I open the image (as the recipient) to see the code in its TEXT form?

(I'm assuming that the code would need to be activated either by clicking the script or calling the code in order to actually function)

And if the code is hidden or camouflaged by using an alternate character set, how would it be translated from the unnoticeable character set into something more meaningful in order to perform?


Oh dear.

A raw image from a camera is where it records, with its electronic circuitry, the "colors" of each "pixel" that it can measure. It records that in a file in the most basic way that "pixel X,Y is color Z" can go. Raw image files are large because there's a lot of data.

To deal with the file size, images are compressed. If I write "computer computer computer" that takes 26 characters, but if you and I agree on another representation of words, I could compress the message to something like "3x computer" and 11 characters. Lossless PNG images work the same way. I could compress it even further as "3x cmptr" (8 characters) by stripping out the vowels, thus losing some information but still leaving enough that you know what I'm trying to say; lossy JPEG images do that.
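The "3x computer" analogy can be played out in a few lines of PHP. This is only a toy sketch of the idea (collapsing repeated words, then dropping vowels); it is not how PNG or JPEG actually compress data, and the function names `compress_words` and `drop_vowels` are made up for illustration.

```php
<?php
// Toy "lossless" step: collapse a run of the same word into "Nx word".
function compress_words(string $text): string
{
    $words = explode(' ', $text);
    $out = [];
    $i = 0;
    while ($i < count($words)) {
        $run = 1;
        while ($i + $run < count($words) && $words[$i + $run] === $words[$i]) {
            $run++;
        }
        $out[] = $run > 1 ? "{$run}x {$words[$i]}" : $words[$i];
        $i += $run;
    }
    return implode(' ', $out);
}

// Toy "lossy" step: drop vowels. The exact original can no longer be rebuilt.
function drop_vowels(string $text): string
{
    return preg_replace('/[aeiou]/i', '', $text);
}

$original = 'computer computer computer';
$lossless = compress_words($original); // "3x computer"
$lossy    = drop_vowels($lossless);    // "3x cmptr"

echo strlen($original), ' -> ', strlen($lossless), ' -> ', strlen($lossy), PHP_EOL;
// 26 -> 11 -> 8
```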

That takes care of the image itself. However, images carry more information than that: GPS coordinates where it was taken, camera model information, etc. That has to be represented in a way that doesn't conflict with the image data. The easiest way to do that is to say, inside the file, "The next piece of information is the GPS coordinates: (...). The next piece of information is the camera model's name: (...). The next piece of information is the compressed image data: (...)." PNG and JPEG and such images dictate compression but they also dictate how those blocks of information are arranged, and software capable of reading them will know how to read each block - or perhaps how to skip each block it doesn't care about until it finds the one(s) it wants.

The information in each block can vary: the GPS coordinates block may have two 32-bit floating point values for the latitude and longitude, the camera model name may be a string value, and obviously the image data is image data.
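To make the "blocks" idea concrete, here is a minimal sketch that walks the blocks (PNG calls them chunks) of a PNG held in memory. It relies on the PNG layout of an 8-byte signature followed by chunks of length, type, data, and CRC. The sample file is hand-built for illustration and has no pixel data, and the helper names `png_chunk` and `list_png_chunks` are assumptions. Note that PNG stores free-form text in `tEXt` chunks; camera model and GPS fields normally live in EXIF data instead.

```php
<?php
// Build one PNG chunk: 4-byte big-endian length, 4-byte type, data, CRC-32 of type + data.
function png_chunk(string $type, string $data): string
{
    return pack('N', strlen($data)) . $type . $data . pack('N', crc32($type . $data));
}

// Walk the chunks of a PNG and return [type, data length] pairs without decoding them.
function list_png_chunks(string $bytes): array
{
    if (substr($bytes, 0, 8) !== "\x89PNG\r\n\x1a\n") {
        throw new InvalidArgumentException('Not a PNG');
    }
    $chunks = [];
    $offset = 8;
    $total = strlen($bytes);
    while ($offset + 8 <= $total) {
        $length = unpack('N', substr($bytes, $offset, 4))[1];
        $type   = substr($bytes, $offset + 4, 4);
        if ($offset + 12 + $length > $total) {
            throw new UnexpectedValueException("Chunk $type runs past the end of the file");
        }
        $chunks[] = [$type, $length];
        $offset += 12 + $length; // length field + type + data + CRC
        if ($type === 'IEND') {
            break;
        }
    }
    return $chunks;
}

// Hypothetical sample: image header, a text block, and the end marker.
$ihdr = pack('NNCCCCC', 1, 1, 8, 2, 0, 0, 0);
$png  = "\x89PNG\r\n\x1a\n"
      . png_chunk('IHDR', $ihdr)
      . png_chunk('tEXt', "Comment\0<?php phpinfo(); ?>")
      . png_chunk('IEND', '');

foreach (list_png_chunks($png) as [$type, $length]) {
    echo $type, ' ', $length, " bytes\n";
}
```

A reader that only wants pixels can skip the `tEXt` chunk entirely, which is why arbitrary text can sit inside a file that is still a valid image.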

Consider one of the simplest attack vectors: PHP code inside a string-type data block.
I could take a real image, add a "camera model name" block, and specify as the name the string "<?php phpinfo(); ?>". That's perfectly valid to do. I could then take that image, rename it from bad.png to bad.php, and try to upload it. Unsafe image uploading code will attempt to read data about the image, discover that the file is a very legitimate PNG, and upload it to a location like /uploads/requinix/bad.php. See how it kept the same file name and extension? I could then go to the website and request /uploads/requinix/bad.php, which the site thinks is going to be an image, but because of the .php extension the file will be run as PHP code and I'll get phpinfo() output.

You can protect yourself against those attacks by following best practices about file uploads - most significant being determining the appropriate file extension on your own instead of trusting the uploaded file's name to be correct.
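A minimal sketch of "determine the extension on your own" might look like the following. It inspects the uploaded bytes with `getimagesizefromstring()`, picks the extension from a whitelist, and generates the file name itself, so the client-supplied name (bad.php in the example above) is never used. The whitelist, the function name `safe_upload_name`, and the paths in the usage comment are assumptions for illustration.

```php
<?php
// Image types this hypothetical site accepts, mapped to the extension it will store.
const ALLOWED_TYPES = [
    IMAGETYPE_PNG  => 'png',
    IMAGETYPE_JPEG => 'jpg',
    IMAGETYPE_GIF  => 'gif',
];

// Return a server-generated file name for the uploaded bytes, or null to reject them.
// The name the client sent is deliberately not a parameter.
function safe_upload_name(string $bytes): ?string
{
    $info = @getimagesizefromstring($bytes); // false when the bytes are not a known image type
    if ($info === false || !isset(ALLOWED_TYPES[$info[2]])) {
        return null;
    }
    return bin2hex(random_bytes(16)) . '.' . ALLOWED_TYPES[$info[2]];
}

// Possible use in an upload handler (form field name and target directory are assumptions):
// $tmp  = $_FILES['image']['tmp_name'];
// $name = safe_upload_name(file_get_contents($tmp));
// if ($name !== null) {
//     move_uploaded_file($tmp, '/var/www/uploads/' . $name);
// }
```

With this approach the PNG carrying `<?php phpinfo(); ?>` is still accepted, but it is stored with a .png extension, so the web server has no reason to hand it to PHP.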

More complicated are attacks that target specific image parsing code. Not your website itself, but the software that knew how to read PNG images directly. I'm not going to go too deep into this because it's complicated.
Remember the camera model string? There's one question about how it works: where does the value of the string end? The two typical answers are that the length of the string is included (so "camera model block" + string length + string) or that the string is terminated by a special character (like NUL \0). So what happens if you don't obey that rule? I might take that bad.png I created before, load it into a special editor, and break the string (by altering the string length value or by removing the \0). With appropriate adjustments I might be able to trick an image parser into doing things it isn't supposed to do.
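The "string length + string" rule can be sketched like this. The one-byte length prefix is a made-up layout for illustration, not the format any real image type uses. PHP itself will not read outside a string, so the sketch only shows the check a careful parser has to make; parsers written in lower-level languages that skip it are the ones that can be tricked.

```php
<?php
// Read a length-prefixed string: one length byte, then that many bytes of text.
function read_prefixed_string(string $buffer, int $offset): string
{
    if ($offset < 0 || $offset >= strlen($buffer)) {
        throw new OutOfBoundsException('No length byte at that offset');
    }
    $length    = ord($buffer[$offset]);
    $available = strlen($buffer) - ($offset + 1);
    if ($length > $available) {
        // The block claims more data than exists. A careless parser would keep reading here.
        throw new UnexpectedValueException("Claimed $length bytes, only $available present");
    }
    return substr($buffer, $offset + 1, $length);
}

$honest = chr(5) . 'Canon';   // length byte matches the text
$forged = chr(200) . 'Canon'; // length byte altered to claim far more than is there

echo read_prefixed_string($honest, 0), PHP_EOL; // Canon
```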

You cannot protect yourself against those attacks, practically speaking.


@requinix THANK YOU.

Very enlightening.

I think I have most precautionary measures covered adequately, but the more I read, the higher my stress level climbs.

Granted there are some articles/blogs that are just inaccurate, misleading, or completely wrong; and I thank you for helping me sort them out from the valid resources and information.

I've seen recommendations to encode images to base64. Or decode to hex. Or transform into a string. All suggest that analyzing the file in this way could help detect hidden scripting. Are any of these ideas worth considering? Effective?

And then I stumbled across blobs, but thankfully, I don't see that as my preferred pathway.

1 hour ago, phppup said:

I've seen recommendations to encode images to base64. Or decode to hex. Or transform into a string. All suggest that analyzing the file in this way could help detect hidden scripting. Are any of these ideas worth considering? Effective?

I don't know how any of that would help - at least not in an automated way.

If you want to remove "hidden scripting" then use an image processing library that can deal with the image and strip out everything else that doesn't matter.
However, that doesn't eliminate the possibility of someone going through a whole lot of effort to create a plain image whose compressed data contains a malicious string. If that's even possible to do.
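One way to "strip out everything else" is to decode the upload and write the pixels back out as a fresh file. The sketch below assumes the GD extension is installed and always re-encodes to PNG; the function name `reencode_as_png` is hypothetical. As noted above, it drops metadata blocks but cannot rule out a crafted image whose pixel data happens to contain a chosen string.

```php
<?php
// Decode an uploaded image and re-encode only its pixels as a new PNG.
// Returns null when GD is unavailable or the bytes cannot be decoded as an image.
function reencode_as_png(string $bytes): ?string
{
    if (!function_exists('imagecreatefromstring')) {
        return null; // GD extension not loaded
    }
    $image = @imagecreatefromstring($bytes);
    if ($image === false) {
        return null; // not an image GD understands
    }
    imagesavealpha($image, true); // keep transparency if the source had any

    ob_start();
    imagepng($image);             // with no file argument, GD writes to the output buffer
    $png = ob_get_clean();

    return $png === false ? null : $png;
}

// Possible use (variable names are assumptions):
// $clean = reencode_as_png(file_get_contents($_FILES['image']['tmp_name']));
// if ($clean !== null) { file_put_contents($targetPath, $clean); }
```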

You cannot protect yourself against everything. You can protect yourself against things that actually matter.


requinix said:

I don't know how any of that would help - at least not in an automated way.

Does something like this

$imageFile = file_get_contents($image_path);

$dangerousSyntax = ['<?', '<?php', '?>'];

$error = '';

foreach ($dangerousSyntax as $value) {
    // strpos(haystack, needle) returns an offset or false; offset 0 is a valid match
    if (strpos($imageFile, $value) !== false) {
        unlink($image_path);
        $error = 'Found dangerous code in image';
        break; // the file is gone, no need to keep checking
    }
}

// $error could be used to determine other actions that would follow

seem like a practical and effective effort?

Limitations? Potential problems?

Sensible?


But might it be a start?

Stopping even a single 'bad actor' seems worthwhile.

Still, if I 

echo file_get_contents($image_path);

what am I seeing?

ASCII? Hex? Other?

How do I clean it up to view it properly (in its entirety)?

Is it the same for jpg , png, bmp, etc?

3 hours ago, phppup said:

But might it be a start?

No.

 

3 hours ago, phppup said:

Stopping even a single 'bad actor' seems worthwhile.

You can't see the forest for the trees.

 

3 hours ago, phppup said:

Still, if I 

echo file_get_contents($image_path);

what am I seeing?

ASCII? Hex? Other?

This is a really, really basic and fundamental question about what files are and what file_get_contents does. The kind of thing that I would expect you to know the answer to, if not for the fact that you're in up to your neck in something you don't understand.


There is a way to check the "inside" of an image file:

        $contents = file_get_contents($file['tmp_name']);
        $position = strpos($contents, '<?php');
        return $position !== false;

HOWEVER, it's a memory hog and doesn't work very well for large image files, as you would need something more robust than the strpos() function. I personally don't allow image uploading by anyone other than myself, who I trust. 😄
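The memory concern can be reduced by reading the file in pieces instead of loading it whole. The sketch below is one possible approach, with the hypothetical name `stream_contains`: it keeps the tail of each chunk so that a match split across two reads is still found. As pointed out earlier in the thread, finding or not finding "<?php" this way says little about whether an upload is safe, so treat it as an alert at most.

```php
<?php
// Search an open stream for $needle while holding only one chunk in memory at a time.
function stream_contains($handle, string $needle, int $chunkSize = 8192): bool
{
    if ($needle === '') {
        throw new InvalidArgumentException('Needle must not be empty');
    }
    $keep  = strlen($needle) - 1; // bytes to carry over between reads
    $carry = '';
    while (!feof($handle)) {
        $chunk = fread($handle, $chunkSize);
        if ($chunk === false || $chunk === '') {
            break;
        }
        $window = $carry . $chunk;
        if (strpos($window, $needle) !== false) {
            return true;
        }
        // Keep the end of this window in case the needle straddles the chunk boundary.
        $carry = $keep > 0 ? substr($window, -$keep) : '';
    }
    return false;
}

// Possible use with an uploaded file (form field name is an assumption):
// $handle = fopen($_FILES['image']['tmp_name'], 'rb');
// $suspicious = stream_contains($handle, '<?php');
// fclose($handle);
```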

1 hour ago, requinix said:

This is a really, really basic and fundamental question about what files are and what file_get_contents does. 

Unfortunately not a single search result had offered a sentence that elaborates to say "...displays the contents in language XYZ."

Nor does any site elaborate in that manner regarding "creating an image", "image code", et al.

I assume it all begins with binary, but references to hex, base, etc. seem almost arbitrary without a foundational resource.

Quote

up to your neck......

Ya got that right.

But I think I've been understanding and learning more, thanks to the help I've gotten here.

 

At this point "the forest" will probably take care of "the trees" since I'm hopefully disarming ill-intended code with other measures already.

This "last thought" seemed like a reasonable idea, if for no other purpose, than to alert me of a potential attack (rather than actually prevent it).

5 hours ago, phppup said:

I assume it all begins with binary,

Yes, fundamentally it all begins as a binary stream of data.  Your "What do I see?" question doesn't really make sense because what you see depends on how you decide to interpret that binary data.  You can see whatever you want.

If you take a JPEG image for example and open it in Notepad, you'll see that binary data interpreted as text and you'll just see random-looking characters.   Which characters specifically would depend on which character set you use.

If you open it in a hex editor, you'll see individual bytes represented in hexadecimal format, maybe with ASCII characters along the side for those bytes that match a printable ASCII character.

If you open it in a browser, you'd likely see the image rendered as the browser would interpret that string of bytes as image data.

If you wanted you could opt to render it as a string of 1s and 0s.

Any file just contains "arbitrary binary data".   How that data gets interpreted is what gives you your different files, encodings, etc and there's nothing preventing you from interpreting the same data in multiple different ways, it just may not make any sense if you do.
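The hex editor view described above can be imitated in PHP. This is a small sketch, with the made-up helper name `hex_row`, that shows the same bytes two ways: as hexadecimal pairs and as ASCII where a byte happens to be printable. The sample is the 8-byte signature that starts every PNG file.

```php
<?php
// Format bytes the way one row of a hex editor would: hex pairs, then printable ASCII.
function hex_row(string $bytes): string
{
    $hex = implode(' ', str_split(bin2hex($bytes), 2));
    // Anything outside printable ASCII (0x20 to 0x7E) is shown as a dot.
    $ascii = preg_replace('/[^\x20-\x7E]/', '.', $bytes);
    return $hex . '  ' . $ascii;
}

$firstBytes = "\x89PNG\r\n\x1a\n"; // the PNG signature
echo hex_row($firstBytes), PHP_EOL;
// 89 50 4e 47 0d 0a 1a 0a  .PNG....
```

The bytes never change; only the interpretation does. Echoing them raw to a browser as text is just one more interpretation, and usually the least useful one.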

7 hours ago, kicken said:

If you wanted you could opt to render it as a string of 1s and 0s.

...and there's nothing preventing you from interpreting the same data in multiple different ways...

@kicken Thanks for clarifying that for me.

Thankfully, my understanding wasn't/isn't that far off, although my terminology may have been a little murky.

So how can I display an image as just 1s and 0s in a browser?

And to my REAL question, what is the preferred/default method of interpretation for PHP?

 

(Am I correct that images are essentially meaningless to PHP for display purposes?

ECHO $img;  is worthless [until HTML intervenes to help translate with an <img> tag]

It's like a memo being passed thru the United Nations assembly. It is written in plain English, but the note needs to be interpreted depending on the recipient. And for some, the message will never be clear. [No political innuendo intended. LOL])

 

So what is being displayed from the

file_get_contents($image) 

result?

3 hours ago, phppup said:

So how can I display an image as just 1s and 0s in a browser?

You'd have to write code specifically to interpret it and display it that way, it's not something that the browser would do by itself.
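Such code could be as short as the sketch below, which turns each byte into eight binary digits. The function name `to_bits` is hypothetical, and the usage comment assumes the `$image_path` variable from earlier in the thread.

```php
<?php
// Render each byte of a string as eight binary digits, separated by spaces.
function to_bits(string $bytes): string
{
    if ($bytes === '') {
        return '';
    }
    $groups = [];
    foreach (str_split($bytes) as $byte) {
        $groups[] = str_pad(decbin(ord($byte)), 8, '0', STR_PAD_LEFT);
    }
    return implode(' ', $groups);
}

echo to_bits('PNG'), PHP_EOL; // 01010000 01001110 01000111

// To show a whole image this way in a browser, send it as plain text:
// header('Content-Type: text/plain');
// echo to_bits(file_get_contents($image_path));
```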

3 hours ago, phppup said:

So what is being displayed from the

file_get_contents($image) 

result?

Depends on what you're doing with the result.  If you're sending it to a browser, then it depends on whether or not you sent the correct Content-Type header.   If you did, the browser would render the image.  If you didn't, the browser might guess and render the image, might display it as text, or might just offer it as a download.
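A minimal sketch of sending an image with a matching header could look like this. It asks `getimagesize()` what the file actually is and uses the MIME type it reports; the function name `send_image` and the file name in the usage comment are assumptions. Two details matter: nothing (not even whitespace before the opening PHP tag) may be output before the `header()` calls, and `readfile()` takes the path only, since its optional second argument is a boolean include-path flag and not a file mode.

```php
<?php
// Send an image file with the Content-Type that matches its real contents.
// Returns false, sending nothing, when the file is not a recognizable image.
function send_image(string $path): bool
{
    $info = @getimagesize($path);
    if ($info === false) {
        return false;
    }
    header('Content-Type: ' . $info['mime']);
    header('Content-Length: ' . filesize($path));
    readfile($path); // streams the raw bytes to the client
    return true;
}

// Possible use as the entire body of a script such as image.php:
// if (!send_image('/var/www/uploads/example.png')) {
//     http_response_code(404);
// }
```

Each request to such a script returns one image, so a page showing several images would reference the script once per `<img>` tag.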

 


@kicken  Using file_get_contents($image)

gave me a page with characters and LOTS of black diamond-question marks.

That's what got me into this mess, and led me down a path that had suggestions on the WWW to use everything from base64_decode/base64_encode to bin2hex etc.

I think all my images are now inverted, upside-down, and  mixed between Latin, Arabic, and Japanese characters. LOL

It seemed like a good idea at the start.

Maybe I'll just back away from the path verrrry slowly.

33 minutes ago, phppup said:

gave me a page with characters and LOTS of black diamond-question marks.

That would be the browser choosing this option:

4 hours ago, kicken said:

If you didn't, the browser might guess and render the image, might display it as text, or might just offer it as a download.

It's taking that raw binary data and interpreting it as a text string and displaying the results.  Of course, the result is a bunch of nonsense because that raw binary data isn't supposed to be interpreted that way.

 


I started surfing the web again (someone should unplug that thing) and went down the rabbit hole, again.

//header("Content-Type: image/jpeg");
//header("Content-Transfer-Encoding: binary");
readfile("1a.jpg"); // the optional second argument is a boolean include-path flag, not a file mode like "r"

The code above displays lines of character text.

Uncommenting the two header lines created a black screen with a black outlined square.

How can I get the image to display?

Can I use this method to display multiple images?

Will this provide a layer of security by eliminating a visible url to the image?
