text/html
This registration is for community review and will be submitted to the IESG for review, approval, and registration with IANA.
charset
The charset parameter may be provided to definitively specify the document's character encoding, overriding any character encoding declarations in the document. The parameter's value must be the name of the character encoding used to serialize the file, must be a valid character encoding name, and must be an ASCII case-insensitive match for the preferred MIME name for that encoding. [IANACHARSET]
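The "ASCII case-insensitive match" requirement can be illustrated with a short Python sketch. The function name is hypothetical, and the sketch assumes the preferred MIME name has already been looked up elsewhere; it only shows that letters A-Z and a-z are folded while every other character must match exactly.

```python
import string

# Map only the 26 ASCII uppercase letters to lowercase; nothing else is folded.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_case_insensitive_match(a: str, b: str) -> bool:
    """Return True if a and b are equal once ASCII letters are case-folded.

    Hypothetical helper for illustration. Unlike str.lower(), this does not
    fold non-ASCII characters such as U+212A KELVIN SIGN.
    """
    return a.translate(_ASCII_LOWER) == b.translate(_ASCII_LOWER)
```

For instance, `ascii_case_insensitive_match("utf-8", "UTF-8")` holds, while a value containing U+212A KELVIN SIGN does not match one containing the ASCII letter "K".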
Entire novels have been written about the security considerations that apply to HTML documents. Many are listed in this document, to which the reader is referred for more details. Some general concerns bear mentioning here, however:
HTML is a scripted language, and has a large number of APIs (some of which are described in this document). Script can expose the user to potential risks of information leakage, credential leakage, cross-site scripting attacks, cross-site request forgeries, and a host of other problems. While the designs in this specification are intended to be safe if implemented correctly, a full implementation is a massive undertaking and, as with any software, user agents are likely to have security bugs.
Even without scripting, there are specific features in HTML
which, for historical reasons, are required for broad
compatibility with legacy content but that expose the user to
unfortunate security problems. In particular, the img
element can be used in conjunction with some other features as a
way to effect a port scan from the user's location on the
Internet. This can expose local network topologies that the
attacker would otherwise not be able to determine.
HTML relies on a compartmentalization scheme sometimes known as the same-origin policy. An origin in most cases consists of all the pages served from the same host, on the same port, using the same protocol.
It is critical, therefore, to ensure that any untrusted content that forms part of a site be hosted on a different origin than any sensitive content on that site. Untrusted content can easily spoof any other page on the same origin, read data from that origin, cause scripts in that origin to execute, submit forms to and from that origin even if they are protected from cross-site request forgery attacks by unique tokens, and make use of any third-party resources exposed to or rights granted to that origin.
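The "same host, same port, same protocol" description of an origin can be sketched as a tuple comparison. This is a simplified illustration in Python, not the full origin algorithm: the function names, the default-port table, and the example hosts are assumptions made for the sketch.

```python
from urllib.parse import urlsplit

# Assumed default ports for the two schemes used in this sketch.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple:
    """Return a simplified (scheme, host, port) tuple for a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Fall back to the scheme's default port when none is given explicitly.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(a: str, b: str) -> bool:
    """Two URLs share an origin when scheme, host, and port all match."""
    return origin(a) == origin(b)
```

Under this simplification, a page on a hypothetical `http://example.com/` and untrusted content on `http://usercontent.example.com/` fall into different origins, which is the kind of separation the paragraph above calls for.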
"html" and "htm" are commonly, but certainly not exclusively, used as the extension for HTML documents.
TEXT
Fragment identifiers used with text/html resources either refer to the indicated part of the document or provide state information for in-page scripts.
multipart/x-mixed-replace
This registration is for community review and will be submitted to the IESG for review, approval, and registration with IANA.
boundary (defined in RFC2046) [RFC2046]
multipart/x-mixed-replace resource can be of any type, including types with non-trivial security implications such as text/html.
multipart/mixed. [RFC2046]
multipart/x-mixed-replace resource.
Fragment identifiers used with multipart/x-mixed-replace resources apply to each body part as defined by the type used by that body part.
application/xhtml+xml
This registration is for community review and will be submitted to the IESG for review, approval, and registration with IANA.
application/xml [RFC3023]
application/xml [RFC3023]
application/xml [RFC3023]
application/xml [RFC3023]
application/xml [RFC3023]
The application/xhtml+xml type asserts that the resource is an XML document that likely has a root element from the HTML namespace. Thus, the relevant specifications are the XML specification, the Namespaces in XML specification, and this specification. [XML] [XMLNS]
application/xml [RFC3023]
application/xml [RFC3023]
"xhtml" and "xht" are sometimes used as extensions for XML resources that have a root element from the HTML namespace.
TEXT
Fragment identifiers used with application/xhtml+xml resources have the same semantics as with any XML MIME type. [RFC3023]
application/x-www-form-urlencoded
This registration is for community review and will be submitted to the IESG for review, approval, and registration with IANA.
In isolation, an application/x-www-form-urlencoded
payload poses no security risks. However, as this type is usually
used as part of a form submission, all the risks that apply to
HTML forms need to be considered in the context of this type.
application/x-www-form-urlencoded payloads are defined in this specification.
application/x-www-form-urlencoded payloads.
Fragment identifiers have no meaning with the application/x-www-form-urlencoded type as this type is only used for uploaded payloads that do not have URL identifiers.
text/cache-manifest
This registration is for community review and will be submitted to the IESG for review, approval, and registration with IANA.
Cache manifests themselves pose no immediate risk unless sensitive information is included within the manifest. Implementations, however, are required to follow specific rules when populating a cache based on a cache manifest, to ensure that certain origin-based restrictions are honored. Failure to correctly implement these rules can result in information leakage, cross-site scripting attacks, and the like.
"CACHE MANIFEST", followed by either a U+0020 SPACE character, a U+0009 CHARACTER TABULATION (tab) character, a U+000A LINE FEED (LF) character, or a U+000D CARRIAGE RETURN (CR) character.
"appcache"
Fragment identifiers have no meaning with text/cache-manifest resources.
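The magic-number rule given for this type (the string "CACHE MANIFEST" followed by a space, tab, LF, or CR) can be checked with a few lines of Python. This is a minimal sketch of only that rule as stated here; the function name is hypothetical, and any further parsing rules for manifests are out of scope.

```python
MAGIC = b"CACHE MANIFEST"
# U+0020 SPACE, U+0009 TAB, U+000A LF, U+000D CR
TERMINATORS = (0x20, 0x09, 0x0A, 0x0D)


def looks_like_cache_manifest(data: bytes) -> bool:
    """Check the leading bytes against the magic-number rule only."""
    if not data.startswith(MAGIC):
        return False
    if len(data) <= len(MAGIC):
        # The magic string must be followed by one more character.
        return False
    return data[len(MAGIC)] in TERMINATORS
```

A sniffer built this way would accept `b"CACHE MANIFEST\n..."` but reject `b"CACHE MANIFESTO"`, since the character after the magic string is not one of the four allowed.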
text/ping
This registration is for community review and will be submitted to the IESG for review, approval, and registration with IANA.
If used exclusively in the fashion described in the context of hyperlink auditing, this type introduces no new security concerns.
text/ping resources always consist of the four bytes 0x50 0x49 0x4E 0x47 (ASCII 'PING').
ping attribute.
Fragment identifiers have no meaning with text/ping resources.
application/microdata+json
This registration is for community review and will be submitted to the IESG for review, approval, and registration with IANA.
application/json [JSON]
application/json [JSON]
application/json [JSON]
application/json [JSON]
The application/microdata+json type asserts that the resource is a JSON text that consists of an object with a single entry called "items" consisting of an array of entries, each of which consists of an object with an entry called "id" whose value is a string, an entry called "type" whose value is another string, and an entry called "properties" whose value is an object whose entries each have a value consisting of an array of either objects or strings, the objects being of the same form as the objects in the aforementioned "items" entry. Thus, the relevant specifications are the JSON specification and this specification. [JSON]
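The nested shape described above is easier to follow as a recursive check. The following Python sketch reads the description literally, treating "id", "type", and "properties" as all required on every item; that strictness and the function names are assumptions of the sketch, not requirements confirmed elsewhere in this registration.

```python
import json


def is_item(obj) -> bool:
    """Check one item: string "id", string "type", object "properties"."""
    if not isinstance(obj, dict):
        return False
    if not isinstance(obj.get("id"), str):
        return False
    if not isinstance(obj.get("type"), str):
        return False
    props = obj.get("properties")
    if not isinstance(props, dict):
        return False
    for values in props.values():
        # Each property value is an array of strings or nested items.
        if not isinstance(values, list):
            return False
        for value in values:
            if not (isinstance(value, str) or is_item(value)):
                return False
    return True


def is_microdata_json(text: str) -> bool:
    """Check that text is a JSON object whose single entry is "items"."""
    try:
        doc = json.loads(text)
    except ValueError:
        return False
    return (
        isinstance(doc, dict)
        and set(doc) == {"items"}
        and isinstance(doc["items"], list)
        and all(is_item(item) for item in doc["items"])
    )
```

The recursion in `is_item` mirrors the phrase "the objects being of the same form as the objects in the aforementioned "items" entry".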
application/json [JSON]
application/json [JSON]
application/json [JSON]
application/json [JSON]
Fragment identifiers used with application/microdata+json resources have the same semantics as when used with application/json (namely, at the time of writing, no semantics at all). [JSON]
Ping-From
This section describes a header field for registration in the Permanent Message Header Field Registry. [RFC3864]
Ping-To
This section describes a header field for registration in the Permanent Message Header Field Registry. [RFC3864]
http+aes scheme
This section describes a URL scheme registration for the IANA URI scheme registry. [RFC4395]
http+aes
Same as http, with the userinfo component instead used for specifying the decryption key. (This key is provided in the form of 16, 24, or 32 bytes encoded as ASCII and escaped as necessary using the URL escape mechanism; it is not in the "username:password" form, and the ":" character is not special in this component when using this scheme.)
Same as http, except that the message body must be decrypted by applying the AES-CTR algorithm using the key specified in the URL's userinfo component, after unescaping it from the URL syntax to bytes, and using a zero nonce. If there is no such component, or if that component, when unescaped from the URL syntax to bytes, does not consist of exactly 16, 24, or 32 bytes, then the user agent must act as if the resource could not be obtained due to a network error, and may report the problem to the user.
Same as http, but the userinfo component represents bytes encoded using ASCII and the URL escape mechanism.
Same as http.
Same as http, but specifically for private resources that are hosted by untrusted intermediary servers as in a content delivery network.
URLs using this scheme contain sensitive information (the key used to decrypt the referenced content) and as such should be handled with care, e.g. only sent over TLS-encrypted connections, and only sent to users who are authorized to access the encrypted content.
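The key-handling rules for this scheme (take the userinfo component, unescape it to bytes, and require exactly 16, 24, or 32 bytes) can be sketched in Python as follows. The function name, the use of `ValueError` to stand in for "act as if a network error occurred", and the example host are assumptions of the sketch. The AES-CTR decryption step itself is not shown, since it needs a cipher library outside the Python standard library.

```python
from urllib.parse import urlsplit, unquote_to_bytes

VALID_KEY_LENGTHS = (16, 24, 32)


def extract_key(url: str) -> bytes:
    """Return the decryption key carried in the userinfo component.

    Raises ValueError where a user agent would treat the fetch as a
    network error (an assumption of this sketch).
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http+aes", "https+aes"):
        raise ValueError("not an http+aes or https+aes URL")
    if "@" not in parts.netloc:
        raise ValueError("no userinfo component")
    # Use the raw userinfo: ":" is not special in this scheme, so it is
    # not split into username and password.
    userinfo = parts.netloc.rpartition("@")[0]
    key = unquote_to_bytes(userinfo)
    if len(key) not in VALID_KEY_LENGTHS:
        raise ValueError("key must be exactly 16, 24, or 32 bytes")
    return key
```

With a hypothetical URL such as `http+aes://01234567%3A9abcdef@cdn.example/f`, the escaped `%3A` unescapes to a single ":" byte and the resulting key is 16 bytes long.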
User agents are encouraged to not show the key in user interface elements where the URL is displayed: first, it's ugly and not useful to the user; and second, it could be used to obscure the domain name.
The http+aes
URL scheme only enables the
content of a particular resource to be encrypted. Any
sensitive information held in HTTP headers is still transmitted in
the clear. The length of the resource is still visible. The rate
at which the data is transmitted is also unobscured. The name of
the resource is not hidden. If this scheme is used to obscure
private information, it is important to consider how these side
channels might leak information.
For example, the length of a file containing only the user's age in seconds encoded in ASCII would easily let an attacker watching the network traffic or with access to the system hosting the files determine if the user was less than 3 years old, less than 30 years old, or more than 30 years old, just from the length of the file. Padding the file to ten digits (either with trailing spaces or leading zeros) would make all ages from zero to three hundred indistinguishable.
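The length leak in this example, and the effect of padding, can be reproduced with a small Python sketch. The function names and the 365-day year are assumptions made for the illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumed 365-day year


def encode_age(age_seconds: int) -> bytes:
    """Unpadded encoding: the file length reveals the order of magnitude."""
    return str(age_seconds).encode("ascii")


def encode_age_padded(age_seconds: int, width: int = 10) -> bytes:
    """Pad with leading zeros so every age yields the same file length."""
    return str(age_seconds).zfill(width).encode("ascii")


# Unpadded lengths differ by age bracket; padded lengths are all equal.
unpadded = [len(encode_age(y * SECONDS_PER_YEAR)) for y in (2, 20, 40)]
padded = [len(encode_age_padded(y * SECONDS_PER_YEAR)) for y in (2, 20, 40)]
```

Here `unpadded` is `[8, 9, 10]` for ages of 2, 20, and 40 years, while `padded` is `[10, 10, 10]`, so an observer who sees only the length learns nothing about which bracket the user is in.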
Another example would be the file name. Consider a bank where each user first downloads a "data.json" file, which points to some other files for more data, such that users in debt download a "debt.json" file while users in credit download a "credit.json" file. In such a scenario, users can be categorised by an attacker watching network traffic or with access to the system hosting the files without the attacker ever having to decrypt the "data.json" files.
Each resource encrypted in this fashion must use a fresh key. Otherwise, an attacker can use commonalities in the resources' plaintexts to determine the key and decrypt all the resources sharing a key.
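Why key reuse is dangerous here can be shown with a toy stream cipher. The sketch below does not implement AES-CTR: the keystream generator built from SHA-256 is a stand-in chosen only so the example runs with the Python standard library, and all names are hypothetical. It relies on one property shared with any scheme that XORs the plaintext with a keystream fixed by the key and a constant nonce: two resources encrypted under the same key use the same keystream.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Stand-in keystream (NOT AES-CTR): depends only on key and position."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt by XORing with the keystream, as a counter-mode cipher does."""
    return xor(plaintext, keystream(key, len(plaintext)))


key = b"k" * 16
p1 = b"balance: 100 USD"
p2 = b"balance: 999 USD"
c1, c2 = encrypt(key, p1), encrypt(key, p2)

# The keystream cancels out: XOR of ciphertexts equals XOR of plaintexts.
leak = xor(c1, c2)
# Anyone who knows or guesses p1 recovers p2 without ever seeing the key.
recovered_p2 = xor(leak, p1)
```

In this sketch the attacker recovers the second plaintext rather than the key itself, which is already enough to defeat the encryption of every resource sharing that key.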
Authors should take care not to embed arbitrary content from
the same site using the same scheme, as all content using the
http+aes
scheme on the same host (and same
port) shares the same origin and can therefore leak
the keys of any other content also opened at that origin. This
problem can be mitigated using the iframe
element and
the sandbox
attribute to embed such content.
The security considerations that apply to http
apply as well.
The http URL scheme is defined in:
http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging
https+aes scheme
This section describes a URL scheme registration for the IANA URI scheme registry. [RFC4395]
https+aes
Same as http+aes.
Same as http+aes, but using HTTP over TLS (as in, HTTPS) instead of HTTP, and defaulting to the HTTPS port instead of HTTP's port.
Same as http+aes.
Same as https.
Same as https, but specifically for private resources that are hosted by untrusted intermediary servers as in a content delivery network.
The security considerations that apply to http+aes and https apply as well.
The https URL scheme is defined in:
http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging
web+ scheme prefix
This section describes a convention for use with the IANA URI scheme registry. It does not itself register a specific scheme. [RFC4395]
"web+" followed by one or more letters in the range a-z.
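The naming convention above maps directly onto a regular expression. This Python sketch uses a hypothetical function name and assumes the scheme name has already been isolated from the URL and lowercased by the caller.

```python
import re

# "web+" followed by one or more letters in the range a-z, and nothing else.
_WEB_SCHEME = re.compile(r"web\+[a-z]+")


def is_web_plus_scheme(scheme: str) -> bool:
    """Return True if the scheme name follows the web+ convention."""
    return _WEB_SCHEME.fullmatch(scheme) is not None
```

Under this check a hypothetical `web+example` qualifies, while `web+` alone, `web+example1`, and `web+ex-ample` do not, because digits and hyphens fall outside the a-z range.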
"web+" schemes should use UTF-8 encodings where relevant.
"web+" schemes. As such, these schemes must not be used for features intended to be core platform features (e.g. network transfer protocols like HTTP or FTP). Similarly, such schemes must not store confidential information in their URLs, such as usernames, passwords, personal information, or confidential project names.