HTML

Living Standard — Last Updated 13 March 2012

16 IANA considerations

16.1 `text/html`

This registration is for community review and will be submitted to the IESG for review, approval, and registration with IANA.

Type name:

text

Subtype name:

html

Required parameters:

No required parameters

Optional parameters:

charset: The charset parameter may be provided to definitively specify the document's character encoding, overriding any character encoding declarations in the document. The parameter's value must be the name of the character encoding used to serialize the file, must be a valid character encoding name, and must be an ASCII case-insensitive match for the preferred MIME name for that encoding. [IANACHARSET]
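
As a non-normative illustration, the following minimal Python sketch shows how a consumer might read the charset parameter from a Content-Type value. The function name `declared_charset` is hypothetical, and Python's codec registry is used here only as a stand-in for the IANA character set registry, which is an assumption: the two do not list exactly the same names.

```python
import codecs
from email.message import Message
from typing import Optional


def declared_charset(content_type: str) -> Optional[str]:
    """Return the lowercased charset parameter, or None if absent or unknown."""
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_param("charset")
    # get_param returns None when the parameter is missing, and a tuple for
    # RFC 2231 encoded values; this sketch only handles plain strings.
    if not isinstance(charset, str):
        return None
    try:
        # Assumption: Python's codec registry approximates "valid encoding name".
        codecs.lookup(charset)
    except LookupError:
        return None
    # The match is ASCII case-insensitive, so normalise to lowercase.
    return charset.lower()
```

For example, `declared_charset("text/html; charset=UTF-8")` yields `"utf-8"`, while a value without a charset parameter yields `None`.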

Encoding considerations:

8bit (see the section on character encoding declarations)

Security considerations:

Entire novels have been written about the security considerations that apply to HTML documents. Many are listed in this document, to which the reader is referred for more details. Some general concerns bear mentioning here, however:

HTML is a scripted language, and has a large number of APIs (some of which are described in this document). Script can expose the user to potential risks of information leakage, credential leakage, cross-site scripting attacks, cross-site request forgeries, and a host of other problems. While the designs in this specification are intended to be safe if implemented correctly, a full implementation is a massive undertaking and, as with any software, user agents are likely to have security bugs.

Even without scripting, there are specific features in HTML which, for historical reasons, are required for broad compatibility with legacy content but that expose the user to unfortunate security problems. In particular, the img element can be used in conjunction with some other features as a way to effect a port scan from the user's location on the Internet. This can expose local network topologies that the attacker would otherwise not be able to determine.

HTML relies on a compartmentalization scheme sometimes known as the same-origin policy. An origin in most cases consists of all the pages served from the same host, on the same port, using the same protocol.

It is critical, therefore, to ensure that any untrusted content that forms part of a site be hosted on a different origin than any sensitive content on that site. Untrusted content can easily spoof any other page on the same origin, read data from that origin, cause scripts in that origin to execute, submit forms to and from that origin even if they are protected from cross-site request forgery attacks by unique tokens, and make use of any third-party resources exposed to or rights granted to that origin.
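
The "same host, same port, same protocol" rule of thumb above can be sketched in Python as follows. This is a simplified, non-normative illustration: the helper names `origin` and `same_origin` and the default-port table are assumptions, and the full definition of an origin covers more cases than this tuple comparison.

```python
from urllib.parse import urlsplit

# Assumption: only the two default ports needed for this illustration.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str):
    """Return a (scheme, host, port) tuple approximating the URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Fill in the scheme's default port when the URL does not give one.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(a: str, b: str) -> bool:
    """True when both URLs share scheme, host, and port."""
    return origin(a) == origin(b)
```

Under this sketch, `http://example.com/a` and `http://example.com:80/b` share an origin, while `http://example.com/` and `http://user.example.com/` do not, which is why hosting untrusted content on a separate host name keeps it apart from sensitive content.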

Interoperability considerations:

Rules for processing both conforming and non-conforming content are defined in this specification.

Published specification:

This document is the relevant specification. Labeling a resource with the text/html type asserts that the resource is an HTML document using the HTML syntax.

Applications that use this media type:

Web browsers, tools for processing Web content, HTML authoring tools, search engines, validators.

Additional information:

Magic number(s):: No sequence of bytes can uniquely identify an HTML document. More information on detecting HTML documents is available in the Media Type Sniffing specification. [MIMESNIFF]
File extension(s):: "html" and "htm" are commonly, but certainly not exclusively, used as the extension for HTML documents.
Macintosh file type code(s):: TEXT

Person & email address to contact for further information:

Ian Hickson <ian@hixie.ch>

Intended usage:

Common

Restrictions on usage:

No restrictions apply.

Author:

Ian Hickson <ian@hixie.ch>

Change controller:

W3C

Fragment identifiers used with text/html resources either refer to the indicated part of the document or provide state information for in-page scripts.

16.2 `multipart/x-mixed-replace`

This registration is for community review and will be submitted to the IESG for review, approval, and registration with IANA.

Type name:

multipart

Subtype name:

x-mixed-replace

Required parameters:

boundary (defined in RFC2046) [RFC2046]

Optional parameters:

No optional parameters.

Encoding considerations:

binary

Security considerations:

Subresources of a multipart/x-mixed-replace resource can be of any type, including types with non-trivial security implications such as text/html.

Interoperability considerations:

None.

Published specification:

This specification describes processing rules for Web browsers. Conformance requirements for generating resources with this type are the same as for multipart/mixed. [RFC2046]
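
Since generation follows the multipart/mixed rules, a server-side sketch might look like the following. This is a minimal, non-normative Python illustration of the RFC 2046 delimiter layout; the function name `mixed_replace_body` and the boundary value used in the usage note are hypothetical, and a real server would normally stream each part as it becomes available rather than build the whole body at once.

```python
def mixed_replace_body(parts, boundary: str) -> bytes:
    """Serialise (content_type, payload) pairs as a multipart body.

    The caller is assumed to pick a boundary that does not occur in any payload.
    """
    delimiter = b"--" + boundary.encode("ascii")
    chunks = []
    for content_type, payload in parts:
        # Each body part starts with the delimiter line, then its headers.
        chunks.append(delimiter + b"\r\n")
        chunks.append(b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n")
        # The CRLF after the payload belongs to the next delimiter.
        chunks.append(payload + b"\r\n")
    # The closing delimiter carries two trailing hyphens.
    chunks.append(delimiter + b"--\r\n")
    return b"".join(chunks)
```

A response built this way would be labeled `multipart/x-mixed-replace; boundary=frame` if `"frame"` were passed as the boundary.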

Applications that use this media type:

This type is intended to be used in resources generated by Web servers, for consumption by Web browsers.

Additional information:

Magic number(s):: No sequence of bytes can uniquely identify a multipart/x-mixed-replace resource.
File extension(s):: No specific file extensions are recommended for this type.
Macintosh file type code(s):: No specific Macintosh file type codes are recommended for this type.

Person & email address to contact for further information:

Ian Hickson <ian@hixie.ch>

Intended usage:

Common

Restrictions on usage:

No restrictions apply.

Author:

Ian Hickson <ian@hixie.ch>

Change controller:

W3C

Fragment identifiers used with multipart/x-mixed-replace resources apply to each body part as defined by the type used by that body part.

16.3 `application/xhtml+xml`

This registration is for community review and will be submitted to the IESG for review, approval, and registration with IANA.

Type name:

application

Subtype name:

xhtml+xml

Required parameters:

Same as for application/xml [RFC3023]

Optional parameters:

Same as for application/xml [RFC3023]

Encoding considerations:

Same as for application/xml [RFC3023]

Security considerations:

Same as for application/xml [RFC3023]

Interoperability considerations:

Same as for application/xml [RFC3023]

Published specification:

Labeling a resource with the application/xhtml+xml type asserts that the resource is an XML document that likely has a root element from the HTML namespace. Thus, the relevant specifications are the XML specification, the Namespaces in XML specification, and this specification. [XML] [XMLNS]

Applications that use this media type:

Same as for application/xml [RFC3023]

Additional information:

Magic number(s):: Same as for application/xml [RFC3023]
File extension(s):: "xhtml" and "xht" are sometimes used as extensions for XML resources that have a root element from the HTML namespace.
Macintosh file type code(s):: TEXT

Person & email address to contact for further information:

Ian Hickson <ian@hixie.ch>

Intended usage:

Common

Restrictions on usage:

No restrictions apply.

Author:

Ian Hickson <ian@hixie.ch>

Change controller:

W3C

Fragment identifiers used with application/xhtml+xml resources have the same semantics as with any XML MIME type. [RFC3023]

16.4 `application/x-www-form-urlencoded`

This registration is for community review and will be submitted to the IESG for review, approval, and registration with IANA.

Type name:

application

Subtype name:

x-www-form-urlencoded

Required parameters:

No parameters

Optional parameters:

No parameters

Encoding considerations:

7bit (US-ASCII encoding of octets that can themselves encode text in any ASCII-compatible character encoding)
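
The two layers described here (text encoded to octets, octets then escaped into US-ASCII) can be seen in a short, non-normative Python sketch. It uses the standard library's `urlencode` and `parse_qsl`, which are assumed to be close to, but not necessarily identical with, the encoding and decoding algorithms defined in this specification; the field names and values are made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode

# Hypothetical form fields, including a non-ASCII character and reserved characters.
fields = [("name", "Zoë"), ("q", "a b&c")]

# Text is first encoded as UTF-8 octets, then the octets are escaped into ASCII.
payload = urlencode(fields, encoding="utf-8")

# Decoding reverses both layers.
decoded = parse_qsl(payload, encoding="utf-8")

print(payload)
```

The resulting payload contains only US-ASCII characters even though one of the values does not.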

Security considerations:

In isolation, an application/x-www-form-urlencoded payload poses no security risks. However, as this type is usually used as part of a form submission, all the risks that apply to HTML forms need to be considered in the context of this type.

Interoperability considerations:

Rules for generating and processing application/x-www-form-urlencoded payloads are defined in this specification.

Published specification:

This document is the relevant specification. Algorithms for encoding and decoding are defined.

Applications that use this media type:

Web browsers and servers.

Additional information:

Magic number(s):: There is no reliable mechanism for recognising application/x-www-form-urlencoded payloads.
File extension(s):: Not applicable.
Macintosh file type code(s):: Not applicable.

Person & email address to contact for further information:

Ian Hickson <ian@hixie.ch>

Intended usage:

Common

Restrictions on usage:

This type is only intended to be used to describe HTML form submission payloads.

Author:

Ian Hickson <ian@hixie.ch>

Change controller:

W3C

Fragment identifiers have no meaning with the application/x-www-form-urlencoded type as this type is only used for uploaded payloads that do not have URL identifiers.

16.5 `text/cache-manifest`

This registration is for community review and will be submitted to the IESG for review, approval, and registration with IANA.

Type name:

text

Subtype name:

cache-manifest

Required parameters:

No parameters

Optional parameters:

No parameters

Encoding considerations:

8bit (always UTF-8)

Security considerations:

Cache manifests themselves pose no immediate risk unless sensitive information is included within the manifest. Implementations, however, are required to follow specific rules when populating a cache based on a cache manifest, to ensure that certain origin-based restrictions are honored. Failure to correctly implement these rules can result in information leakage, cross-site scripting attacks, and the like.

Interoperability considerations:

Rules for processing both conforming and non-conforming content are defined in this specification.

Published specification:

This document is the relevant specification.

Applications that use this media type:

Web browsers.

Additional information:

Magic number(s):: Cache manifests begin with the string "CACHE MANIFEST", followed by either a U+0020 SPACE character, a U+0009 CHARACTER TABULATION (tab) character, a U+000A LINE FEED (LF) character, or a U+000D CARRIAGE RETURN (CR) character.
File extension(s):: "appcache"
Macintosh file type code(s):: No specific Macintosh file type codes are recommended for this type.
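
The magic number described above can be checked with a few lines of Python. This is a non-normative sketch that follows the wording of this registration literally; the function name `looks_like_cache_manifest` is hypothetical.

```python
MAGIC = b"CACHE MANIFEST"
# U+0020 SPACE, U+0009 TAB, U+000A LF, U+000D CR
TERMINATORS = (0x20, 0x09, 0x0A, 0x0D)


def looks_like_cache_manifest(data: bytes) -> bool:
    """True when data starts with the magic string followed by a terminator."""
    return (
        len(data) > len(MAGIC)
        and data.startswith(MAGIC)
        and data[len(MAGIC)] in TERMINATORS
    )
```

Requiring one of the four following characters is what keeps a file beginning with, say, "CACHE MANIFESTO" from matching.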

Person & email address to contact for further information:

Ian Hickson <ian@hixie.ch>

Intended usage:

Common

Restrictions on usage:

No restrictions apply.

Author:

Ian Hickson <ian@hixie.ch>

Change controller:

W3C

Fragment identifiers have no meaning with text/cache-manifest resources.

16.6 `text/ping`

This registration is for community review and will be submitted to the IESG for review, approval, and registration with IANA.

Type name:

text

Subtype name:

ping

Required parameters:

No parameters

Optional parameters:

No parameters

Encoding considerations:

Not applicable.

Security considerations:

If used exclusively in the fashion described in the context of hyperlink auditing, this type introduces no new security concerns.

Interoperability considerations:

Rules applicable to this type are defined in this specification.

Published specification:

This document is the relevant specification.

Applications that use this media type:

Web browsers.

Additional information:

Magic number(s):: text/ping resources always consist of the four bytes 0x50 0x49 0x4E 0x47 (ASCII 'PING').
File extension(s):: No specific file extension is recommended for this type.
Macintosh file type code(s):: No specific Macintosh file type codes are recommended for this type.
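
As a non-normative illustration, a request carrying this type could be constructed as in the following Python sketch. The function name `build_ping_request` and the URL in the usage note are hypothetical; the sketch only builds the request object and does not send it.

```python
import urllib.request

# The entire body of a text/ping resource: 0x50 0x49 0x4E 0x47.
PING_BODY = b"PING"


def build_ping_request(url: str) -> urllib.request.Request:
    """Build (but do not send) an HTTP POST request with a text/ping body."""
    return urllib.request.Request(
        url,
        data=PING_BODY,
        headers={"Content-Type": "text/ping"},
        method="POST",
    )
```

Calling `build_ping_request("http://example.com/track")` gives a POST request whose body is exactly the four bytes listed under the magic number.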

Person & email address to contact for further information:

Ian Hickson <ian@hixie.ch>

Intended usage:

Common

Restrictions on usage:

Only intended for use with HTTP POST requests generated as part of a Web browser's processing of the ping attribute.

Author:

Ian Hickson <ian@hixie.ch>

Change controller:

W3C

Fragment identifiers have no meaning with text/ping resources.

16.7 `application/microdata+json`

This registration is for community review and will be submitted to the IESG for review, approval, and registration with IANA.

Type name:

application

Subtype name:

microdata+json

Required parameters:

Same as for application/json [JSON]

Optional parameters:

Same as for application/json [JSON]

Encoding considerations:

8bit (always UTF-8)

Security considerations:

Same as for application/json [JSON]

Interoperability considerations:

Same as for application/json [JSON]

Published specification:

Labeling a resource with the application/microdata+json type asserts that the resource is a JSON text that consists of an object with a single entry called "items" consisting of an array of entries, each of which consists of an object with an entry called "id" whose value is a string, an entry called "type" whose value is another string, and an entry called "properties" whose value is an object whose entries each have a value consisting of an array of either objects or strings, the objects being of the same form as the objects in the aforementioned "items" entry. Thus, the relevant specifications are the JSON specification and this specification. [JSON]
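
The shape described in the paragraph above can be expressed as a small recursive check. The following Python sketch is non-normative and follows that paragraph literally; the function names `is_item` and `is_microdata_json` are hypothetical, and a real consumer might choose to be more lenient about missing entries.

```python
import json


def is_item(value) -> bool:
    """Check one item object: string id, string type, and a properties object."""
    if not isinstance(value, dict):
        return False
    if not isinstance(value.get("id"), str) or not isinstance(value.get("type"), str):
        return False
    props = value.get("properties")
    if not isinstance(props, dict):
        return False
    for values in props.values():
        # Every property maps to an array of strings or nested items.
        if not isinstance(values, list):
            return False
        if not all(isinstance(v, str) or is_item(v) for v in values):
            return False
    return True


def is_microdata_json(text: str) -> bool:
    """Check that text is a JSON object with a single "items" array of items."""
    try:
        doc = json.loads(text)
    except ValueError:
        return False
    return (
        isinstance(doc, dict)
        and set(doc) == {"items"}
        and isinstance(doc["items"], list)
        and all(is_item(item) for item in doc["items"])
    )
```

Because property values can themselves be items, the check recurses, mirroring the phrase "the objects being of the same form as the objects in the aforementioned items entry".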

Applications that use this media type:

Same as for application/json [JSON]

Additional information:

Magic number(s):: Same as for application/json [JSON]
File extension(s):: Same as for application/json [JSON]
Macintosh file type code(s):: Same as for application/json [JSON]

Person & email address to contact for further information:

Ian Hickson <ian@hixie.ch>

Intended usage:

Common

Restrictions on usage:

No restrictions apply.

Author:

Ian Hickson <ian@hixie.ch>

Change controller:

W3C

Fragment identifiers used with application/microdata+json resources have the same semantics as when used with application/json (namely, at the time of writing, no semantics at all). [JSON]

16.8 `Ping-From`

This section describes a header field for registration in the Permanent Message Header Field Registry. [RFC3864]

Header field name:: Ping-From
Applicable protocol:: http
Status:: standard
Author/Change controller:: W3C
Specification document(s):: This document is the relevant specification.
Related information:: None.

16.9 `Ping-To`

This section describes a header field for registration in the Permanent Message Header Field Registry. [RFC3864]

Header field name:: Ping-To
Applicable protocol:: http
Status:: standard
Author/Change controller:: W3C
Specification document(s):: This document is the relevant specification.
Related information:: None.

16.10 `http+aes` scheme

This section describes a URL scheme registration for the IANA URI scheme registry. [RFC4395]

URI scheme name:

http+aes

Status:

permanent

URI scheme syntax:

Same as http, with the userinfo component instead used for specifying the decryption key. (This key is provided in the form of 16, 24, or 32 bytes encoded as ASCII and escaped as necessary using the URL escape mechanism; it is not in the "username:password" form, and the ":" character is not special in this component when using this scheme.)

URI scheme semantics:

Same as http, except that the message body must be decrypted by applying the AES-CTR algorithm using the key specified in the URL's userinfo component, after unescaping it from the URL syntax to bytes, and using a zero nonce. If there is no such component, or if that component, when unescaped from the URL syntax to bytes, does not consist of exactly 16, 24, or 32 bytes, then the user agent must act as if the resource could not be obtained due to a network error, and may report the problem to the user.
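
The key-handling part of these semantics can be sketched as follows. This non-normative Python illustration covers only extracting and validating the key; the AES-CTR decryption step is omitted because the Python standard library has no AES implementation. The function name `extract_key` is hypothetical, and raising `ValueError` stands in for "act as if the resource could not be obtained due to a network error".

```python
from urllib.parse import unquote_to_bytes, urlsplit

VALID_KEY_LENGTHS = (16, 24, 32)


def extract_key(url: str) -> bytes:
    """Return the decryption key bytes from the URL's userinfo component."""
    parts = urlsplit(url)
    if parts.scheme not in ("http+aes", "https+aes"):
        raise ValueError("not an http+aes or https+aes URL")
    # Split on the last "@" by hand: ":" is not special in this component,
    # so the username/password accessors must not be used.
    userinfo, at, _hostport = parts.netloc.rpartition("@")
    if not at:
        raise ValueError("no userinfo component")
    # Undo the URL escape mechanism to recover the raw key bytes.
    key = unquote_to_bytes(userinfo)
    if len(key) not in VALID_KEY_LENGTHS:
        raise ValueError("key must be exactly 16, 24, or 32 bytes")
    return key
```

The three permitted lengths correspond to AES-128, AES-192, and AES-256 keys.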

Encoding considerations:

Same as http, but the userinfo component represents bytes encoded using ASCII and the URL escape mechanism.

Applications/protocols that use this URI scheme name:

Same as http.

Interoperability considerations:

Same as http, but specifically for private resources that are hosted by untrusted intermediary servers as in a content delivery network.

Security considerations:

URLs using this scheme contain sensitive information (the key used to decrypt the referenced content) and as such should be handled with care, e.g. only sent over TLS-encrypted connections, and only sent to users who are authorized to access the encrypted content.

User agents are encouraged to not show the key in user interface elements where the URL is displayed: first, it's ugly and not useful to the user; and second, it could be used to obscure the domain name.

The http+aes URL scheme only enables the content of a particular resource to be encrypted. Any sensitive information held in HTTP headers is still transmitted in the clear. The length of the resource is still visible. The rate at which the data is transmitted is also unobscured. The name of the resource is not hidden. If this scheme is used to obscure private information, it is important to consider how these side channels might leak information.

For example, the length of a file containing only the user's age in seconds encoded in ASCII would easily let an attacker watching the network traffic or with access to the system hosting the files determine if the user was less than 3 years old, less than 30 years old, or more than 30 years old, just from the length of the file. Padding the file to ten digits (either with trailing spaces or leading zeros) would make all ages from zero to three hundred years indistinguishable.
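
The arithmetic behind this example can be checked with a short, non-normative Python sketch. It assumes a 365-day year, and the helper names `unpadded` and `padded` are hypothetical.

```python
# Assumption: a 365-day year, which is close enough for counting digits.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def unpadded(age_years: int) -> str:
    """Age in seconds as plain ASCII digits; the length leaks the age bracket."""
    return str(age_years * SECONDS_PER_YEAR)


def padded(age_years: int, width: int = 10) -> str:
    """Age in seconds, left-padded with zeros to a fixed width."""
    return str(age_years * SECONDS_PER_YEAR).zfill(width)


for years in (2, 25, 40):
    print(years, len(unpadded(years)), len(padded(years)))
```

Ages of 2, 25, and 40 years give unpadded lengths of 8, 9, and 10 digits respectively, whereas the padded form is always 10 digits long.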

Another example would be the file name. Consider a bank where each user first downloads a "data.json" file, which points to some other files for more data, such that users in debt download a "debt.json" file while users in credit download a "credit.json" file. In such a scenario, users can be categorised by an attacker watching network traffic or with access to the system hosting the files without the attacker ever having to decrypt the "data.json" files.

Each resource encrypted in this fashion must use a fresh key. Otherwise, an attacker can use commonalities in the resources' plaintexts to determine the key and decrypt all the resources sharing a key.

Authors should take care not to embed arbitrary content from the same site using the same scheme, as all content using the http+aes scheme on the same host (and same port) shares the same origin and can therefore leak the keys of any other content also opened at that origin. This problem can be mitigated using the iframe element and the sandbox attribute to embed such content.

The security considerations that apply to http apply as well.

Contact:

Ian Hickson <ian@hixie.ch>

Author/Change controller:

Ian Hickson <ian@hixie.ch>

References:

The http URL scheme is defined in: http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging

16.11 `https+aes` scheme

This section describes a URL scheme registration for the IANA URI scheme registry. [RFC4395]

URI scheme name:: https+aes
Status:: permanent
URI scheme syntax:: Same as http+aes.
URI scheme semantics:: Same as http+aes, but using HTTP over TLS (as in, HTTPS) instead of HTTP, and defaulting to the HTTPS port instead of HTTP's port.
Encoding considerations:: Same as http+aes.
Applications/protocols that use this URI scheme name:: Same as https.
Interoperability considerations:: Same as https, but specifically for private resources that are hosted by untrusted intermediary servers as in a content delivery network.
Security considerations:: The security considerations that apply to http+aes and https apply as well.
Contact:: Ian Hickson <ian@hixie.ch>
Author/Change controller:: Ian Hickson <ian@hixie.ch>
References:: The https URL scheme is defined in: http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging

16.12 `web+` scheme prefix

This section describes a convention for use with the IANA URI scheme registry. It does not itself register a specific scheme. [RFC4395]

URI scheme name:: Schemes starting with the four characters "web+" followed by one or more letters in the range a-z.
Status:: permanent
URI scheme syntax:: Scheme-specific.
URI scheme semantics:: Scheme-specific.
Encoding considerations:: All "web+" schemes should use UTF-8 encodings where relevant.
Applications/protocols that use this URI scheme name:: Scheme-specific.
Interoperability considerations:: The scheme is expected to be used in the context of Web applications.
Security considerations:: Any Web page is able to register a handler for all "web+" schemes. As such, these schemes must not be used for features intended to be core platform features (e.g. network transfer protocols like HTTP or FTP). Similarly, such schemes must not store confidential information in their URLs, such as usernames, passwords, personal information, or confidential project names.
Contact:: Ian Hickson <ian@hixie.ch>
Author/Change controller:: Ian Hickson <ian@hixie.ch>
References:: Custom scheme and content handlers, HTML Living Standard: http://www.whatwg.org/specs/web-apps/current-work/#custom-handlers
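
The naming rule given under "URI scheme name" above can be written as a regular expression. The following Python sketch is non-normative and follows the wording literally (lowercase a-z only); the function name `is_web_plus_scheme` is hypothetical.

```python
import re

# "web+" followed by one or more letters in the range a-z.
WEB_PLUS_SCHEME = re.compile(r"web\+[a-z]+")


def is_web_plus_scheme(scheme: str) -> bool:
    """True when the whole scheme name matches the "web+" convention."""
    return WEB_PLUS_SCHEME.fullmatch(scheme) is not None
```

Under this check a hypothetical `web+chat` scheme qualifies, while a bare `web+` or a name containing digits does not.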
