Leaderboard

gizmola (Administrators)
- Points: 2
- Posts: 6,104

kicken (Gurus)
- Points: 1
- Posts: 4,705

benanamen (Members)
- Points: 1
- Posts: 2,134

Popular Content

Showing content with the highest reputation on 03/06/2023 in all areas

seeking tips with reading files

Also, this thread is probably interesting to consider....
- March 6, 2023
1 point
seeking tips with reading files

I'm all for academic exercises for the benefit of learning. I think you will find this page of some help in continuing to explore the JPEG and JFIF standards.

However, if your goal is simply to verify whether an image is valid, that is problematic, because JFIF allows sections of a file to be ignored, so that special data can be placed there when the file is created. You could look at Exif as essentially being this type of extension, so using the exif check functions is valuable in combination with other techniques. Exif data doesn't have to be there, but if you decide that you will only accept images that also have Exif, then that is another valuable and efficient check, as you can use an exif checking function to exclude images that don't have valid Exif data.

In general, the proven method of knocking down malicious images is to use a combination of getimagesize and imagecreatefromstring, or the ImageMagick routines kicken referenced. You use getimagesize to knock down files you have already decided are too large, and then recreate the image from the file data. Either of these failing should cause rejection.

Trying to go through the files and decipher them is most certainly a block operation, where you would want to read the binary values, looking for the segments, and have routines that can decipher those individual segments. A simple loop is not going to be maintainable, in my opinion.

If I were trying to do this, I'd also want to see what the GD and/or ImageMagick source is doing, as those are both open source libraries written in C/C++. For example, ImageMagick has a component used to identify the internals of an image. It's available in their command line tool that allows analysis and modification of an image. The source is here. It seems to be a very large and complicated bit of code.
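As a rough illustration of the getimagesize plus imagecreatefromstring combination described above, here is a minimal sketch. It assumes the upload is already in memory as a string and that the GD extension is loaded. The function name `looksLikeValidImage` and the 4000-pixel limits are made up for this example and are not from the thread.

```php
<?php
// Hypothetical helper: returns true only if both checks pass.
// $maxWidth and $maxHeight are illustrative limits, not recommended values.
function looksLikeValidImage(string $bytes, int $maxWidth = 4000, int $maxHeight = 4000): bool
{
    // Check 1: parse the header only. Returns false for unrecognized data.
    $info = @getimagesizefromstring($bytes);
    if ($info === false) {
        return false;
    }

    // Index 0 is the width and index 1 is the height, both in pixels.
    [$width, $height] = $info;
    if ($width < 1 || $height < 1 || $width > $maxWidth || $height > $maxHeight) {
        return false;
    }

    // Check 2: have GD actually decode the image data (requires ext-gd).
    // A file with a plausible header but broken image data fails here.
    $image = @imagecreatefromstring($bytes);

    return $image !== false;
}
```

Something like `looksLikeValidImage(file_get_contents($_FILES['upload']['tmp_name']))` would be one way to call it, where the `upload` field name is again only an assumption. Passing these checks says the data decodes as an image; it does not by itself remove embedded comments or metadata.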
- March 6, 2023
1 point
seeking tips with reading files

If your goal is to strip potentially harmful comments / metadata from an image, the way to do that would be to use an image library to re-generate the image file without that data. The ImageMagick extension has a function for this. I'm not sure if loading then re-saving an image with GD will accomplish this or not.

Trying to just arbitrarily manipulate an image file is a poor approach. Even if it works with your test images, it may not work with all images. You'd need to have a good understanding of the file format so you can parse it and manipulate it properly, which is a lot of work when you can just use an existing library instead.

You don't have to use generators and yield to save memory; you just need to read the file a bit at a time. A simple loop like this:

    $fp = fopen('file', 'rb');
    // fgets() returns false at end of file, which ends the loop
    while (($line = fgets($fp)) !== false) {
        // do stuff
    }
    fclose($fp);

will also only use enough memory to hold a line's worth of file data at a time, without the complication of a generator.

Often, parsing a binary format file is not something you'd do line-by-line anyway. You'd read various chunks based on the file format, possibly seeking to a particular position in the file first.
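To make the "read chunks, possibly seeking first" idea concrete for a JPEG, here is a minimal sketch that walks the marker segments with fread, unpack and fseek instead of reading lines. The function name `listJpegSegments` is hypothetical. The sketch assumes every segment before start-of-scan carries a 2-byte big-endian length, and it ignores fill bytes and standalone markers, so treat it as a learning aid rather than a validator.

```php
<?php
// Hypothetical helper: lists the marker segments of a JPEG up to start-of-scan.
// Each entry is ['marker' => int, 'offset' => int, 'length' => int].
function listJpegSegments(string $path): array
{
    $fp = fopen($path, 'rb');
    if ($fp === false) {
        throw new RuntimeException("Cannot open $path");
    }

    try {
        // Every JPEG starts with the SOI marker, bytes FF D8.
        if (fread($fp, 2) !== "\xFF\xD8") {
            throw new RuntimeException('Missing SOI marker; not a JPEG');
        }

        $segments = [];
        while (true) {
            $offset = ftell($fp);
            // Segment header: FF, marker byte, then a 2-byte length.
            $header = fread($fp, 4);
            if ($header === false || strlen($header) < 4 || $header[0] !== "\xFF") {
                break; // truncated file or data this sketch does not understand
            }

            $marker = ord($header[1]);
            // 'n' = unsigned 16-bit big-endian; the length counts its own 2 bytes.
            $length = unpack('n', substr($header, 2))[1];
            $segments[] = ['marker' => $marker, 'offset' => $offset, 'length' => $length];

            // 0xDA is SOS: compressed image data follows, so stop here.
            if ($marker === 0xDA || $length < 2) {
                break;
            }

            // Skip the payload without loading it into memory.
            fseek($fp, $length - 2, SEEK_CUR);
        }

        return $segments;
    } finally {
        fclose($fp);
    }
}
```

Only four bytes are held in memory per segment, regardless of file size. In the returned list, marker 0xFE is a comment (COM) segment and 0xE1 is APP1, where Exif data is normally stored, which shows where the data you want to strip lives. Actually removing it is still better left to an image library, as noted above.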
- March 6, 2023
1 point
PHP Version 5.4.45

PHP 7 has reached end of life. Tell them to upgrade you to PHP 8.
- March 2, 2023
1 point

This leaderboard is set to New York/GMT-04:00
