Introduction

Content sniffing, also known as media type sniffing or MIME sniffing, is the practice of examining the contents of a byte stream in an attempt to infer the file format of the data in it. Content sniffing is often used to compensate for missing or inaccurate metadata, for example when the media type is not specified correctly.

This article covers common scenarios and possible problems with content sniffing.

MIME types

MIME stands for Multipurpose Internet Mail Extensions. A MIME type is a standard way to indicate the nature and format of a document, file, or assortment of bytes. MIME types are defined in the IETF's RFC 6838. The Internet Assigned Numbers Authority (IANA) is responsible for registering all official MIME types.

A MIME type consists of two parts, a type and a subtype, separated by a slash (/):

type/subtype

The type represents the general category to which the data belongs, such as video or text. The subtype identifies the exact kind of data within that category. For example, for the type text, the subtype might be plain (plain text), html (HTML source code), or calendar (iCalendar/.ics files).

Each type has its own set of possible subtypes, and a MIME type must contain a type and a subtype.

You can also add an optional parameter:

type/subtype; parameter=value

For example, for any MIME type whose type is text, the optional charset parameter can be used to specify the character set of the data. If no character set is specified, the default is ASCII (US-ASCII) unless overridden by the user agent's settings. To specify a UTF-8 text file, use the MIME type text/plain; charset=utf-8.

MIME types are case-insensitive, but traditionally lowercase is used, except for parameter values, whose case may or may not have a specific meaning.
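The structure described above can be illustrated with a small parser. This is only a minimal sketch: parseMimeType is a hypothetical helper name, not a standard API, and it assumes simple input (it does not handle quoted parameter values that contain semicolons). It lowercases the type, subtype, and parameter names, but leaves parameter values untouched because their case may be significant.

```javascript
// Parse a MIME type string such as "text/plain; charset=utf-8".
// parseMimeType is a hypothetical helper, not part of any standard library.
function parseMimeType(input) {
  const [essence, ...rest] = input.split(";");
  // Type and subtype are case-insensitive, so normalize them to lowercase.
  const [type, subtype] = essence.trim().toLowerCase().split("/");
  if (!type || !subtype) {
    throw new Error("A MIME type needs both a type and a subtype: " + input);
  }
  const parameters = {};
  for (const part of rest) {
    const eq = part.indexOf("=");
    if (eq === -1) continue; // skip malformed parameters
    const name = part.slice(0, eq).trim().toLowerCase();
    const value = part.slice(eq + 1).trim(); // keep the value's case as written
    if (name) parameters[name] = value;
  }
  return { type, subtype, parameters };
}

const parsed = parseMimeType("Text/HTML; Charset=utf-8");
console.log(parsed.type, parsed.subtype, parsed.parameters.charset);
// prints: text html utf-8
```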

MIME types fall into two classes: discrete and multipart.

A discrete type is a type that represents a single file or medium, such as a single text or music file, or a single video.

A multipart type represents a document composed of multiple components, each of which has its own independent MIME type, or multiple files sent together in one transaction. For example, an email with multiple attachments uses a multipart MIME type.

Let's look at the common discrete types:

  1. application, for example: application/octet-stream, application/pdf, application/pkcs8, and application/zip.
  2. audio, for example: audio/mpeg and audio/vorbis.
  3. font, for example: font/woff, font/ttf, and font/otf.
  4. image, for example: image/jpeg, image/png, and image/svg+xml.
  5. model, for example: model/3mf and model/vrml.
  6. text, for example: text/plain, text/csv, and text/html.
  7. video, for example: video/mp4.

The common multipart types are as follows:

  1. message, for example: message/rfc822 and message/partial.
  2. multipart, for example: multipart/form-data and multipart/byteranges.
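The two lists above amount to a lookup on the top-level type. The following sketch shows one way to express that; mimeClass and both sets are hypothetical names chosen for illustration, and the sets contain only the types listed in this article.

```javascript
// Top-level types taken from the two lists in this article.
const DISCRETE_TYPES = new Set([
  "application", "audio", "font", "image", "model", "text", "video",
]);
const MULTIPART_TYPES = new Set(["message", "multipart"]);

// mimeClass is a hypothetical helper: it returns "discrete", "multipart",
// or "unknown" depending on the top-level type of the given MIME type.
function mimeClass(mimeType) {
  const topLevel = mimeType.split("/")[0].trim().toLowerCase();
  if (DISCRETE_TYPES.has(topLevel)) return "discrete";
  if (MULTIPART_TYPES.has(topLevel)) return "multipart";
  return "unknown";
}

console.log(mimeClass("multipart/form-data")); // multipart
console.log(mimeClass("image/png"));           // discrete
console.log(mimeClass("foo/bar"));             // unknown
```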

Browser sniffing

Because the browser uses the MIME type, not the file extension, to determine how to handle a URL, it is important that the web server sends the correct MIME type in the Content-Type header of the response. If the server is not configured correctly, the browser is likely to misinterpret the contents of the file, the site may fail to function properly, and downloaded files may be handled incorrectly.

To solve this problem, or to improve the user experience, many browsers use MIME content sniffing: they guess the MIME type by examining the contents of the file.
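To make the idea concrete, here is a minimal sketch of sniffing by leading "magic bytes". It is not the algorithm any particular browser uses, which is considerably more detailed; sniffMimeType and the SIGNATURES table are hypothetical names, and only a handful of well-known file signatures are included.

```javascript
// A few well-known file signatures (leading bytes) and the MIME type each suggests.
const SIGNATURES = [
  { mime: "image/png", bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { mime: "image/jpeg", bytes: [0xff, 0xd8, 0xff] },
  { mime: "application/pdf", bytes: [0x25, 0x50, 0x44, 0x46, 0x2d] }, // "%PDF-"
  { mime: "application/zip", bytes: [0x50, 0x4b, 0x03, 0x04] },       // "PK" 0x03 0x04
];

// sniffMimeType is a hypothetical helper. `data` is a Uint8Array (or plain array)
// holding the first bytes of the resource.
function sniffMimeType(data) {
  for (const { mime, bytes } of SIGNATURES) {
    const matches =
      data.length >= bytes.length && bytes.every((b, i) => data[i] === b);
    if (matches) return mime;
  }
  // Nothing matched: fall back to the generic binary type.
  return "application/octet-stream";
}

const pdfHeader = new Uint8Array([0x25, 0x50, 0x44, 0x46, 0x2d, 0x31]);
console.log(sniffMimeType(pdfHeader)); // application/pdf
```

Because the guess depends only on the bytes, a file whose content begins with a recognized signature is classified that way regardless of the declared type, which is exactly what makes sniffing both convenient and risky.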

Different browsers handle MIME sniffing differently. However, sniffing can create serious security vulnerabilities: because some MIME types represent executable content, an attacker can subvert the MIME sniffing algorithm to perform operations that neither the site operator nor the user anticipated, such as cross-site scripting attacks.

If you do not want the browser to sniff, you can set the X-Content-Type-Options header in the server response, like this:

X-Content-Type-Options: nosniff

This header was first supported in IE 8, but almost all browsers now support it.
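On the server side, sending the header is a one-line addition next to the Content-Type. The sketch below assumes a Node.js-style response object that exposes setHeader(name, value) and end(body), as Node's http.ServerResponse does; sendWithoutSniffing is a hypothetical helper name.

```javascript
// sendWithoutSniffing is a hypothetical helper. `res` is assumed to behave like
// Node's http.ServerResponse, i.e. to provide setHeader(name, value) and end(body).
function sendWithoutSniffing(res, contentType, body) {
  // Declare the real media type explicitly...
  res.setHeader("Content-Type", contentType);
  // ...and tell the browser not to second-guess it.
  res.setHeader("X-Content-Type-Options", "nosniff");
  res.end(body);
}

// Possible use with Node's built-in http module (port 8080 is an arbitrary choice):
//   require("http")
//     .createServer((req, res) =>
//       sendWithoutSniffing(res, "text/plain; charset=utf-8", "hello"))
//     .listen(8080);
```

The header only helps if the declared Content-Type is itself correct, so the two are set together here.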

Client sniffing

We often need to determine in JavaScript whether the browser is IE, and then handle it accordingly:

var isIEBrowser = false;
if (window.ActiveXObject) {
  isIEBrowser = true;
}
// Or, shorter:
var isIE = (window.ActiveXObject !== undefined);

The example above is a very simple form of client-side sniffing: it determines whether the browser is Internet Explorer by checking whether the window object has the ActiveXObject property.
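One caveat with reading the property directly: as far as we know, IE 11 makes window.ActiveXObject evaluate to undefined even though the property still exists on the window object, so a truthiness check can miss it. A slightly more tolerant sketch tests for the presence of the property instead. isInternetExplorer is a hypothetical helper, and it takes the window-like object as an argument so that it can also be tried outside a browser.

```javascript
// isInternetExplorer is a hypothetical helper; `win` is a window-like object.
// The `in` operator checks that the property exists at all, so the check still
// succeeds when reading the property yields undefined.
function isInternetExplorer(win) {
  return "ActiveXObject" in win;
}

// In a browser this would be called as isInternetExplorer(window).
console.log(isInternetExplorer({ ActiveXObject: undefined })); // true
console.log(isInternetExplorer({}));                           // false
```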

This article has also been published as "密码学系列之:内容嗅探" (Cryptography Series: Content Sniffing).
