I have come across many projects where checking the content type (MIME type) of uploaded files is poorly implemented or resource-heavy.

Methods I have seen so far:

1. Checking the content type from the file name. This is unreliable: a user can simply rename a file and you are fooled, or the file can be in a different format than its extension claims, and you will not get the expected result.

2. Using RMagick to check whether the file is an image. This is slow and uses a lot of RAM. You try to initialize an RMagick object from the file, then rescue the exception raised when the file is not an image.

3. Using mini_magick to check whether a file is an image. This is faster than RMagick and is implemented the same way.
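
To illustrate why the file name alone (method 1) cannot be trusted, here is a minimal sketch that compares a name-based check with a look at the leading bytes. The helper names are made up for this example. The only outside fact it relies on is the fixed 8-byte signature that starts every PNG file. It covers PNG only and is not a replacement for a real type detector.

```ruby
require "tempfile"

# The 8-byte signature that starts every PNG file.
PNG_SIGNATURE = "\x89PNG\r\n\x1A\n".b

# Hypothetical helper: the naive check, trusting only the extension.
def looks_like_png_by_name?(path)
  File.extname(path).casecmp?(".png")
end

# Hypothetical helper: compare the leading bytes with the PNG signature.
def looks_like_png_by_content?(path)
  File.binread(path, PNG_SIGNATURE.bytesize) == PNG_SIGNATURE
end

# A plain-text file that has merely been named like an image.
Tempfile.create(["holiday", ".png"]) do |fake|
  fake.write("this is not an image")
  fake.flush
  puts "by name:    #{looks_like_png_by_name?(fake.path)}"
  puts "by content: #{looks_like_png_by_content?(fake.path)}"
end
```

The name-based check accepts the fake file while the content-based check rejects it, which is exactly the gap an uploader can exploit by renaming a file.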

A better method on OSX and Linux is to use the command-line tool “file”, included in most UNIX operating systems.

It is very fast and very accurate, because it identifies a file by inspecting its contents rather than its name.

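require "open3"

file = "/path/to/file.ext"
real_type = "Unknown" # default, also used on platforms without `file`
if RUBY_PLATFORM.match(/darwin|linux|unix|solaris|bsd/)
  # Pass the arguments as a list so the file name never goes through a shell
  # (backticks would let a crafted file name run arbitrary commands).
  output, _status = Open3.capture2("file", "--raw", "--brief", file)
  content_type = output.chomp
  # /i matters: `file` prints "PDF document", "PNG image data", and so on.
  real_type = case content_type
              when /image|png|jpg|jpeg|gif/i
                "image"
              when /pdf/i
                "pdf"
              when /Microsoft Word|Microsoft Office Document/
                "doc"
              else # This can go on and on
                "Unknown"
              end
end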
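
If an actual MIME type such as image/png is wanted rather than the human-readable description, `file` can print that too. The following is a minimal sketch, assuming the installed `file` supports the `--mime-type` option (current versions on Linux and OSX do). The helper names `mime_type_of` and `coarse_type` are made up for this example, and the MIME strings in the `case` are the typical ones, not an exhaustive list.

```ruby
require "open3"

# Hypothetical helper: ask `file` for a MIME type such as "image/png".
# Returns nil when the path is not a regular file, when the `file`
# binary is not installed, or when the command exits with an error.
def mime_type_of(path)
  return nil unless File.file?(path)

  # Arguments are passed as a list, so no shell is involved.
  output, status = Open3.capture2("file", "--brief", "--mime-type", path)
  status.success? ? output.chomp : nil
rescue Errno::ENOENT
  nil # `file` itself could not be found
end

# Hypothetical helper: reduce a MIME type to the coarse labels used above.
def coarse_type(mime)
  case mime
  when %r{\Aimage/} then "image"
  when "application/pdf" then "pdf"
  when "application/msword" then "doc"
  else "Unknown"
  end
end

# Usage: prints "Unknown" for this placeholder path, since it does not exist.
puts coarse_type(mime_type_of("/path/to/file.ext"))
```

Matching on MIME types avoids depending on the exact wording of the descriptions, which can differ between versions of `file`.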

Some examples of content types:

.doc = Microsoft Word document data

.doc = Microsoft Office Document

.pdf = PDF document, version 1.4

.pdf = PDF document, version 1.3

.psd = Adobe Photoshop Image

.png = PNG image data, 3508 x 4961, 8-bit/color RGBA, non-interlaced

.gif = GIF image data, version 89a, 195 x 109

.jpg = JPEG image data, EXIF standard

etc…
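
The sample descriptions above double as test data. Here is a small sketch that runs them through a classifier without calling `file` at all, so it can be tried on any machine. The method name `classify_description` and the `SAMPLES` table are made up for this example. The patterns are matched case-insensitively where the samples require it, for example "PDF" and "Image".

```ruby
# Hypothetical helper: map a description printed by `file` to a coarse type.
def classify_description(description)
  case description
  when /image/i                                   # also "Adobe Photoshop Image"
    "image"
  when /pdf document/i
    "pdf"
  when /Microsoft Word|Microsoft Office Document/
    "doc"
  else
    "Unknown"
  end
end

# Descriptions copied from the examples above.
SAMPLES = [
  "Microsoft Word document data",
  "Microsoft Office Document",
  "PDF document, version 1.4",
  "PDF document, version 1.3",
  "Adobe Photoshop Image",
  "PNG image data, 3508 x 4961, 8-bit/color RGBA, non-interlaced",
  "GIF image data, version 89a, 195 x 109",
  "JPEG image data, EXIF standard"
].freeze

SAMPLES.each do |description|
  puts "#{classify_description(description).ljust(8)} #{description}"
end
```

Keeping the classifier separate from the call to `file` makes it easy to add a new sample line whenever an unrecognised description turns up.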

I hope this can be useful to someone.

4 Responses to “Really checking the content-type/mime_type of a file in OSX and Linux”

  1. Why yes, yes indeed it’s useful. This has been on my not-important-enough-to-be-a-bug list since forever. I had no idea it would be so simple as file --raw --brief.

    thanks for the tip!

  2. This is pretty useful. Is nginx 7 still having errors?

    Thanks!

  3. Barce, over here nginx 0.7 has no more stability issues.
    Passenger has been updated since then and is well tested on the 0.7 branch.

  4. Beside the point, but as a side note: the case construct gives you some regex-matching convenience. The above can equivalently be written as:

    
    real_type = case content_type
                when /image|png|jpg|jpeg|gif/
                  'image'
                when /pdf/
                  'pdf'
                # etc.
                end
    

© 2012 Ruby, Rails, OSX and Linux fun