OT: Hypothetical Security Question

ben-v · September 18, 2006, 3:19am

I am handling photo uploads on my site, and I only accept image/jpeg
and image/pjpeg, but I was wondering what identifies the content type.
Am I relying on the user’s browser (which could be modified) to identify
the content type, or does my server (CGI/FastCGI and Apache) identify
it? Also, I have heard of viruses such as Perrun that can be inserted
into JPEGs. Do they exploit this weakness, if it exists, or do they
exploit something else, such as getting the computer to execute code in
the JPEG’s binary data? For example, I know Perrun is Win32 only, and
while my server is Linux and has antivirus, my clientele is largely
Windows based. Is this a concern? Can I count on AV software to clean
threats for other platforms? And even if none of these turns out to be
a concern, what measures do sites that handle photo uploads typically
take? Sorry if this is a little off topic.

ben-v · September 18, 2006, 1:21pm

Depend on the browser for the initial identification only. The content
type on an upload is supplied by the client, so treat it as a hint.

Once the file is loaded to your server, before it is made accessible,
open it as a byte stream and confirm that it is, in fact, a valid JPEG.
If it is not a valid JPEG, throw it out.
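
One way to do that check in Ruby is sketched below. It is a rough
structural test, not a full decoder: it assumes the file starts with the
SOI marker (FF D8), ends with the EOI marker (FF D9), and that every
segment before the start-of-scan marker (FF DA) carries a sane length.
The name plausible_jpeg? is made up for this example, and the check is
deliberately strict (trailing bytes after EOI, or fill bytes between
segments, are rejected).

```ruby
require "stringio"

# Hypothetical helper: walks the JPEG segments from SOI up to the
# start-of-scan marker and checks that the stream ends with EOI.
def plausible_jpeg?(io)
  data = io.read
  return false if data.nil? || data.bytesize < 4

  data = data.b # compare as raw bytes
  return false unless data.byteslice(0, 2) == "\xFF\xD8".b  # SOI
  return false unless data.byteslice(-2, 2) == "\xFF\xD9".b # EOI

  pos = 2
  while pos + 4 <= data.bytesize
    # Every segment starts with 0xFF followed by a marker byte.
    return false unless data.getbyte(pos) == 0xFF

    marker = data.getbyte(pos + 1)
    # Two-byte big-endian length; it counts the length bytes themselves.
    length = data.byteslice(pos + 2, 2).unpack1("n")
    return false if length < 2
    return false if pos + 2 + length > data.bytesize

    # Start of scan: compressed image data follows, stop walking here.
    return true if marker == 0xDA

    pos += 2 + length
  end
  false
end
```

With an upload saved to disk this would be called as
File.open(path, "rb") { |f| plausible_jpeg?(f) }. A file that passes can
still be malicious, so this only replaces the "is it a JPEG at all"
step.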

Reject JPEGs with excessively large uncompressed or non-lossy
compressed sections, or that match a signature for known viruses.

After this, shell out to the AV software to scan the file, capturing
the AV log, and reject any files that trigger warnings.
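
A minimal sketch of that step, assuming ClamAV’s clamscan is the scanner
installed on the server (any command-line scanner that signals a finding
through its exit status would fit the same shape). The method name
scan_upload and the scanner: parameter are invented for this example.
The command is passed as an argument list, so the uploaded filename is
never interpolated into a shell string.

```ruby
require "open3"

# Hypothetical wrapper: runs a scanner command on one file and returns
# [clean, log]. The default command assumes clamscan is on the PATH.
def scan_upload(path, scanner: ["clamscan", "--no-summary"])
  # capture2e collects stdout and stderr together, which serves as the
  # AV log for this file.
  log, status = Open3.capture2e(*scanner, path)

  # clamscan exits 0 when nothing is found, 1 when a virus is found,
  # and 2 on errors. Anything other than 0 is treated as a rejection.
  [status.success?, log]
end
```

Usage would look like clean, log = scan_upload(path), followed by
deleting the file unless clean is true. If the scanner binary is
missing, Open3 raises Errno::ENOENT, which should also be treated as a
rejection.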

After all of this, make the file available for download.

ben-v · September 18, 2006, 11:24pm

Once the file is loaded to your server, before it is made accessible,
open it as a byte stream and confirm that it is, in fact, a valid JPEG.
If it is not a valid JPEG, throw it out.

I know this may be a stupid question, but how would I open a byte stream
and find out whether its true file type is JPEG? I have checked the Ruby
File class, and there is nothing on it, and searching “ruby byte
streams” doesn’t return much on Google.
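
For what it is worth, a "byte stream" in Ruby is just a file opened in
binary mode. The sketch below reads the first three bytes and compares
them with the JPEG start-of-image signature (FF D8 FF). The names
JPEG_MAGIC and jpeg_magic? are made up for this example, and this only
identifies the file type; it does not prove the rest of the file is
well formed.

```ruby
# Bytes every JPEG file starts with: the SOI marker (FF D8) followed by
# the 0xFF that begins the next marker.
JPEG_MAGIC = "\xFF\xD8\xFF".b

# Hypothetical helper: true when the file at +path+ begins with the
# JPEG signature.
def jpeg_magic?(path)
  # "rb" opens the file in binary mode, so the bytes come back
  # untranslated regardless of platform or encoding settings.
  File.open(path, "rb") do |f|
    # read(3) returns nil for an empty file, which compares as false.
    f.read(3) == JPEG_MAGIC
  end
end
```

Individual bytes can be inspected with String#getbyte or
String#unpack("C*") on whatever read returns.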