I have come across many projects where checking the content type (MIME type) of uploaded files is poorly implemented or resource-heavy.
Methods I have seen so far:
1. Checking the content type from the file name: this is unreliable. A user can simply rename a file and you are fooled, or the file may be in a different format than its extension suggests, and you will not get the expected result.
2. Using RMagick to check whether the file is an image: this is slow and uses a lot of RAM. You try to initialize an RMagick object from the file, then rescue the exception raised when the file is not an image.
3. Using mini_magick to check whether a file is an image: this is faster than RMagick and is implemented the same way.
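To illustrate why the first method is fragile, here is a minimal sketch that saves PDF bytes under a .jpg name and compares what the extension claims with what the content says. The helper names `type_from_extension` and `pdf_signature?` are hypothetical, and peeking at the first four bytes is only a crude stand-in for real content detection.

```ruby
require "tempfile"

# Naive check: trusts whatever extension the uploader chose.
def type_from_extension(path)
  File.extname(path).downcase.delete_prefix(".")
end

# Content check: PDF files start with the bytes "%PDF".
def pdf_signature?(path)
  File.binread(path, 4) == "%PDF"
end

# Simulate an upload: PDF bytes saved under a .jpg name.
Tempfile.create(["upload", ".jpg"]) do |f|
  f.binmode
  f.write("%PDF-1.4\n")
  f.flush
  puts "extension claims: #{type_from_extension(f.path)}"
  puts "content is a PDF: #{pdf_signature?(f.path)}"
end
```

The extension and the content disagree, which is exactly the case a name-based check cannot catch.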
A better method on OS X and Linux is to use the command-line tool “file”, which is included in most UNIX operating systems.
It is very fast and very accurate.
    file = "/path/to/file.ext"
    if RUBY_PLATFORM.match(/darwin|linux|unix|solaris|bsd/)
      # Ask the `file` tool to describe the content, e.g. "PDF document, version 1.4"
      content_type = `file --raw --brief "#{file}"`.chomp
      case
      when content_type.match(/image|png|jpg|jpeg|gif/i)
        real_type = "image"
      when content_type.match(/pdf/i) # case-insensitive: `file` prints "PDF document"
        real_type = "pdf"
      when content_type.match("Microsoft Word|Microsoft Office Document")
        real_type = "doc"
      else
        # This can go on and on
        real_type = "Unknown"
      end
    end
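One caveat with backticks is that the file name passes through the shell, so a name containing quotes or backticks could be interpreted as a command. A possible variation, sketched here with a hypothetical helper `describe_file`, uses the standard library's `Open3.capture2` with an argument list so that no shell is involved. It assumes the `file` binary is on the PATH and returns nil otherwise.

```ruby
require "open3"

# Returns the description printed by `file`, or nil if the path is not a
# regular file, or if the command is missing or fails.
def describe_file(path)
  return nil unless File.file?(path)

  # Passing the arguments as a list bypasses the shell, so special characters
  # in the file name are never interpreted. "--" ends option parsing.
  output, status = Open3.capture2("file", "--raw", "--brief", "--", path)
  status.success? ? output.chomp : nil
rescue Errno::ENOENT
  nil # the `file` binary is not installed
end
```

Calling `describe_file("/path/to/file.ext")` should then yield the same kind of description string as the backtick version, ready to be matched against patterns.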
Some examples of content types:
.doc = Microsoft Word document data
.doc = Microsoft Office Document
.pdf = PDF document, version 1.4
.pdf = PDF document, version 1.3
.psd = Adobe Photoshop Image
.png = PNG image data, 3508 x 4961, 8-bit/color RGBA, non-interlaced
.gif = GIF image data, version 89a, 195 x 109
.jpg = JPEG image data, EXIF standard
etc…
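Descriptions like the ones listed above could also be mapped to a short type with an ordered lookup table instead of a growing case statement. This is only a sketch: `type_rules` and `classify` are hypothetical names, and the patterns cover just the sample outputs shown here.

```ruby
# Ordered rules: the first pattern that matches the description wins.
def type_rules
  [
    [/\bPDF document\b/, "pdf"],
    [/Microsoft Word|Microsoft Office Document/, "doc"],
    [/\bimage\b/i, "image"]
  ]
end

# Maps a description printed by `file` to a short type name.
def classify(description)
  rule = type_rules.find { |pattern, _label| description =~ pattern }
  rule ? rule.last : "Unknown"
end

puts classify("PNG image data, 3508 x 4961, 8-bit/color RGBA, non-interlaced")
puts classify("PDF document, version 1.4")
puts classify("Microsoft Office Document")
```

Supporting a new format then means adding one row to the table rather than another branch.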
I hope this can be useful to someone.

Mathaba.net
Why yes, yes indeed it’s useful. This has been on my not-important-enough-to-be-a-bug list since forever. I had no idea it would be so simple as file --raw --brief.
thanks for the tip!
This is pretty useful. Is nginx 7 still having errors?
Thanks!
Barce, over here nginx 0.7 has no more stability issues.
Passenger has been updated since then and is well tested on the 0.7 branch.
Beside the whole point, but as a side note the case construct allows you some regex matching convenience. The above can equivalently be written as: