What is the issue with the Infra Standard?
TL;DR: Only one of the terms ‘user agent’ and ‘web browser’ is defined, yet both are used extensively and inconsistently: sometimes they are explicitly treated differently, and sometimes they seem to be interchangeable.
This issue discusses use of the terms ‘user agent’ and ‘web browser’ across all WHATWG specifications.
The term ‘user agent’ is defined in Infra as shown here:
A user agent is any software entity that acts on behalf of a user, for example by retrieving and rendering web content and facilitating end user interaction with it.
This example sounds a lot like a ‘web browser’, although that term does not seem to be defined anywhere.
The term ‘user agent’ is so important that the very same paragraph that defines it includes cross-references:
A person can use many different user agents in their day-to-day life, including by configuring an implementation to act as several user agents at once, for example by using multiple profiles or the implementation’s private browsing mode.
Despite that, it seems like none of the other uses of the term ‘user agent’ (including in Infra) are cross-referenced in this way.
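One way to check the cross-referencing claim mechanically is to count, for each term, how many occurrences sit inside a link. The following is a minimal sketch, not part of any WHATWG tooling: the names `TermAudit`, `audit_terms` and `SAMPLE` are hypothetical, the sample markup (including the `#user-agent` fragment) is invented for illustration, and it assumes that a cross-reference is rendered as an `<a>` element and that a term is never split across nested elements.

```python
import re
from html.parser import HTMLParser

# The two terms discussed in this issue; the trailing "s?" also matches plurals.
TERM_PATTERNS = {
    "user agent": re.compile(r"\buser agents?\b", re.IGNORECASE),
    "web browser": re.compile(r"\bweb browsers?\b", re.IGNORECASE),
}


class TermAudit(HTMLParser):
    """Counts term occurrences and how many of them sit inside an <a> element."""

    def __init__(self):
        super().__init__()
        self.anchor_depth = 0
        self.counts = {term: {"total": 0, "linked": 0} for term in TERM_PATTERNS}

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.anchor_depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self.anchor_depth > 0:
            self.anchor_depth -= 1

    def handle_data(self, data):
        # Collapse whitespace so a term wrapped across source lines still matches.
        text = " ".join(data.split())
        for term, pattern in TERM_PATTERNS.items():
            found = len(pattern.findall(text))
            self.counts[term]["total"] += found
            if self.anchor_depth > 0:
                self.counts[term]["linked"] += found


def audit_terms(html):
    """Return {term: {"total": n, "linked": m}} for a string of HTML."""
    parser = TermAudit()
    parser.feed(html)
    parser.close()
    return parser.counts


# Invented sample markup, loosely modelled on the sentences quoted in this issue.
SAMPLE = (
    "<p>A <dfn>user agent</dfn> is any software entity that acts on behalf of a user.</p>"
    '<p>A person can use many different <a href="#user-agent">user agents</a>.</p>'
    "<p>Web browsers and other user agents must process documents.</p>"
)

if __name__ == "__main__":
    for term, tally in audit_terms(SAMPLE).items():
        print(term, tally)
```

Running `audit_terms` over the rendered source of each standard would give per-term totals and linked counts that can be compared across specifications.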
Despite everything said so far, WHATWG specifications include many normative uses of the term ‘web browser’ (other than as an example). Some cases explicitly state that ‘web browsers’ are subject to different requirements from other user agents:
- HTML:
  - Section 2.1.8 (‘Conformance classes’):
Web browsers that support the XML syntax must process elements and attributes from the HTML namespace found in XML documents as described in this specification, so that users can interact with them ….
Web browsers that support the HTML syntax must process documents labeled with an HTML MIME type as described in this specification, so that users can interact with them.
User agents that process HTML and XML documents purely to render non-interactive versions of them must comply to the same conformance criteria as web browsers, except that they are exempt from requirements regarding user interaction.
This implies that the defining characteristic of ‘web browsers’ is interactivity, but that contradicts the subheading ‘Web browsers and other interactive user agents’, which implies that some interactive user agents are not web browsers.
  - Section 4.6.7 (‘Link types’):
New link types that are to be implemented by web browsers are to be added to this standard. The remainder can be registered as extensions.
  - Section 7 (‘Loading web pages’):
This section describes features that apply most directly to web browsers. Having said that, except where specified otherwise, the requirements defined in this section do apply to all user agents, whether they are web browsers or not.
It seems that nothing in this section actually ‘specifie[s] otherwise’, but the possibility is there.
  - Section 9.2.6 (‘Interpreting an event stream’):
This section defines the algorithm ‘dispatch the event’ differently, depending on whether the user agent is a ‘web browser’, even though the algorithm given for ‘web browsers’ seems entirely generic and applicable to any user agent that uses the EventSource interface.
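To illustrate why the ‘web browser’ steps look generic, here is a rough Python paraphrase of them. It is written from memory of HTML's event stream section and may not match the current text; the standard itself remains the normative source. The `EventStreamState` class, its field names, and the `dispatched` list (standing in for queuing a task that fires a `MessageEvent`) are assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class EventStreamState:
    """Hypothetical container for the buffers the parsing algorithm maintains."""
    data: str = ""                  # data buffer
    event_type: str = ""            # event type buffer
    last_event_id_buffer: str = ""  # last event ID buffer
    last_event_id: str = ""         # last event ID string of the event source
    dispatched: list = field(default_factory=list)  # stand-in for the task queue


def dispatch_the_event(state, origin):
    """Paraphrase of the 'dispatch the event' steps given for web browsers."""
    # Remember the most recent event ID, even if no event ends up being fired.
    state.last_event_id = state.last_event_id_buffer

    # An empty data buffer means there is nothing to dispatch.
    if state.data == "":
        state.event_type = ""
        return

    # Drop the single trailing line feed added while accumulating data lines.
    if state.data.endswith("\n"):
        state.data = state.data[:-1]

    event = {
        "type": state.event_type or "message",  # default event type is "message"
        "data": state.data,
        "origin": origin,
        "lastEventId": state.last_event_id,
    }

    # Reset the buffers, then hand the event over (here: append to a list).
    state.data = ""
    state.event_type = ""
    state.dispatched.append(event)
```

Nothing in these steps depends on rendering or user interaction, which is the point made above.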
Some uses, although non-normative, add to this confusion:
- URL:
  - Section 4.4 (‘URL parsing’):
Note: Non-web-browser implementations only need to implement the basic URL parser.
In other words, for ‘non-web-browser’ implementations, parsing of blob URLs does not need to be implemented, although everything else about blob URLs (including their origins, which are also defined in URL) still applies.
Other uses of the term ‘web browser’ imply that it is interchangeable with ‘user agent’. If so, the latter term, which has a definition, should be used exclusively.
Other normative uses of ‘web browser’ in HTML
- Section 4.5.13 (‘The data element’):
When combined with microformats or the microdata attributes defined in this specification, the element serves to provide both a machine-readable value for the purposes of data processors, and a human-readable value for the purposes of rendering in a web browser.
- Section 4.8.8 (‘The video element’):
Content may be provided inside the video element. User agents should not show this content to the user; it is intended for older web browsers which do not support video ….
- Section 4.8.9 (‘The audio element’):
Content may be provided inside the audio element. User agents should not show this content to the user; it is intended for older web browsers which do not support audio ….
- Section 4.12.1 (‘The script element’):
The defer attribute may be specified even if the async attribute is specified, to cause legacy web browsers that only support defer (and not async) to fall back to the defer behavior ….
- Section 4.12.1.2 (‘Scripting languages’):
User agents are not required to support JavaScript. This standard needs to be updated if a language other than JavaScript comes along and gets similar wide adoption by web browsers.
- Section 9.3 (‘Cross-document messaging’):
Web browsers, for security and privacy reasons, prevent documents in different domains from affecting each other ….
- Section 17.2 (‘multipart/x-mixed-replace’):
This specification describes processing rules for web browsers.
This type is intended to be used in resources generated by web servers, for consumption by web browsers.
Further uses of ‘web browser’ that seem interchangeable with ‘user agent’ appear in:
- Compatibility abstract
- HTML:
  - Section 17.4 (‘text/ping’). But section 4.6.6 (‘Hyperlink auditing’) uses the term ‘user agent’ exclusively.
- MIME Sniffing introduction (this section is probably meant to be non-normative)
- Quirks Mode abstract
- Web IDL abstract
- WebSockets:
  - Section 2.2 (‘Opening handshake’). (This seems analogous to an example or note and therefore non-normative.) This issue seems to apply equally to any user agent.