Network Working Group Request for Comments: 2109 Category: Standards Track | D. Kristol Bell Laboratories, Lucent Technologies L. Montulli Netscape Communications February 1997 |
Status of this Memo
This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited.
This document specifies a way to create a stateful session with HTTP requests and responses. It describes two new headers, Cookie and Set-Cookie, which carry state information between participating origin servers and user agents. The method described here differs from Netscape's Cookie proposal, but it can interoperate with HTTP/1.0 user agents that use Netscape's method. (See the HISTORICAL section.)
The terms user agent, client, server, proxy, and origin server have the same meaning as in the HTTP/1.0 specification.
Fully-qualified host name (FQHN) means either the fully-qualified domain name (FQDN) of a host (i.e., a completely specified domain name ending in a top-level domain such as .com or .uk), or the numeric Internet Protocol (IP) address of a host. The fully qualified domain name is preferred; use of numeric IP addresses is strongly discouraged.
The terms request-host and request-URI refer to the values the client would send to the server as, respectively, the host (but not port) and abs_path portions of the absoluteURI (http_URL) of the HTTP request line. Note that request-host must be a FQHN.
Hosts names can be specified either as an IP address or a FQHN string. Sometimes we compare one host name with another. Host A's name domain-matches host B's if
Because it was used in Netscape's original implementation of state management, we will use the term cookie to refer to the state information that passes between an origin server and user agent, and that gets stored by the user agent.
This document describes a way to create stateful sessions with HTTP requests and responses. Currently, HTTP servers respond to each client request without relating that request to previous or subsequent requests; the technique allows clients and servers that wish to exchange state information to place HTTP requests and responses within a larger context, which we term a "session". This context might be used to create, for example, a "shopping cart", in which user selections can be aggregated before purchase, or a magazine browsing system, in which a user's previous reading affects which offerings are presented.
There are, of course, many different potential contexts and thus many different potential types of session. The designers' paradigm for sessions created by the exchange of cookies has these key attributes:
We outline here a way for an origin server to send state information to the user agent, and for the user agent to return the state information to the origin server. The goal is to have a minimal impact on HTTP and user agents. Only origin servers that need to maintain sessions would suffer any significant impact, and that impact can largely be confined to Common Gateway Interface (CGI) programs, unless the server provides more sophisticated state management support. (See Implementation Considerations, below.)
The two state management headers, Set-Cookie and Cookie, have common syntactic properties involving attribute-value pairs. The following grammar uses the notation, and tokens DIGIT (decimal digits) and token (informally, a sequence of non-special, non-white space characters) from the HTTP/1.1 specification [RFC 2068] to describe their syntax.
av-pairs = av-pair *(";" av-pair) av-pair = attr ["=" value] ; optional value attr = token value = word word = token | quoted-string
Attributes (names) (attr) are case-insensitive. White space is permitted between tokens. Note that while the above syntax description shows value as optional, most attrs require them.
NOTE: The syntax above allows whitespace between the attribute and the = sign.
The origin server initiates a session, if it so desires. (Note that "session" here does not refer to a persistent network connection but to a logical session created from HTTP requests and responses. The presence or absence of a persistent connection should have no effect on the use of cookie-derived sessions). To initiate a session, the origin server returns an extra response header to the client, Set- Cookie. (The details follow later.)
A user agent returns a Cookie request header (see below) to the origin server if it chooses to continue a session. The origin server may ignore it or use it to determine the current state of the
session. It may send back to the client a Set-Cookie response header with the same or different information, or it may send no Set-Cookie header at all. The origin server effectively ends a session by sending the client a Set-Cookie header with Max-Age=0.
Servers may return a Set-Cookie response headers with any response. User agents should send Cookie request headers, subject to other rules detailed below, with every request.
An origin server may include multiple Set-Cookie headers in a response. Note that an intervening gateway could fold multiple such headers into a single header.
The syntax for the Set-Cookie response header is
set-cookie = "Set-Cookie:" cookies cookies = 1#cookie cookie = NAME "=" VALUE *(";" cookie-av) NAME = attr VALUE = value cookie-av = "Comment" "=" value | "Domain" "=" value | "Max-Age" "=" value | "Path" "=" value | "Secure" | "Version" "=" 1*DIGIT
Informally, the Set-Cookie response header comprises the token Set- Cookie:, followed by a comma-separated list of one or more cookies. Each cookie begins with a NAME=VALUE pair, followed by zero or more semi-colon-separated attribute-value pairs. The syntax for attribute-value pairs was shown earlier. The specific attributes and the semantics of their values follows. The NAME=VALUE attribute- value pair must come first in each cookie. The others, if present, can occur in any order. If an attribute appears more than once in a cookie, the behavior is undefined.
An origin server must be cognizant of the effect of possible caching of both the returned resource and the Set-Cookie header. Caching "public" documents is desirable. For example, if the origin server wants to use a public document such as a "front door" page as a sentinel to indicate the beginning of a session for which a Set- Cookie response header must be generated, the page should be stored in caches "pre-expired" so that the origin server will see further requests. "Private documents", for example those that contain information strictly private to a session, should not be cached in shared caches.
If the cookie is intended for use by a single user, the Set-cookie header should not be cached. A Set-cookie header that is intended to be shared by multiple users may be cached.
The origin server should send the following additional HTTP/1.1 response headers, depending on circumstances:
The user agent keeps separate track of state information that arrives via Set-Cookie response headers from each origin server (as distinguished by name or IP address and port). The user agent applies these defaults for optional attributes that are missing:
To prevent possible security or privacy violations, a user agent rejects a cookie (shall not store its information) if any of the following is true:
If a user agent receives a Set-Cookie response header whose NAME is the same as a pre-existing cookie, and whose Domain and Path attribute values exactly (string) match those of a pre-existing cookie, the new cookie supersedes the old. However, if the Set- Cookie has a value for Max-Age of zero, the (old and new) cookie is discarded. Otherwise cookies accumulate until they expire (resources permitting), at which time they are discarded.
Because user agents have finite space in which to store cookies, they may also discard older cookies to make space for newer ones, using, for example, a least-recently-used algorithm, along with constraints on the maximum number of cookies that each origin server may set.
If a Set-Cookie response header includes a Comment attribute, the user agent should store that information in a human-readable form with the cookie and should display the comment text as part of a cookie inspection user interface.
User agents should allow the user to control cookie destruction. An infrequently-used cookie may function as a "preferences file" for network applications, and a user may wish to keep it even if it is the least-recently-used cookie. One possible implementation would be an interface that allows the permanent storage of a cookie through a checkbox (or, conversely, its immediate destruction).
Privacy considerations dictate that the user have considerable control over cookie management. The PRIVACY section contains more information.
When it sends a request to an origin server, the user agent sends a Cookie request header to the origin server if it has cookies that are applicable to the request, based on
cookie = "Cookie:" cookie-version 1*((";" | ",") cookie-value) cookie-value = NAME "=" VALUE [";" path] [";" domain] cookie-version = "$Version" "=" value NAME = attr VALUE = value path = "$Path" "=" value domain = "$Domain" "=" value
The value of the cookie-version attribute must be the value from the Version attribute, if any, of the corresponding Set-Cookie response header. Otherwise the value for cookie-version is 0. The value for the path attribute must be the value from the Path attribute, if any, of the corresponding Set-Cookie response header. Otherwise the attribute should be omitted from the Cookie request header. The value for the domain attribute must be the value from the Domain attribute, if any, of the corresponding Set-Cookie response header. Otherwise the attribute should be omitted from the Cookie request header.
Note that there is no Comment attribute in the Cookie request header corresponding to the one in the Set-Cookie response header. The user agent does not return the comment information to the origin server.
The following rules apply to choosing applicable cookie-values from among all the cookies the user agent has.
Note: For backward compatibility, the separator in the Cookie header is semi-colon (;) everywhere. A server should also accept comma (,) as the separator between cookie-values for future compatibility.
Users must have control over sessions in order to ensure privacy. (See PRIVACY section below.) To simplify implementation and to prevent an additional layer of complexity where adequate safeguards exist, however, this document distinguishes between transactions that are verifiable and those that are unverifiable. A transaction is verifiable if the user has the option to review the request-URI prior to its use in the transaction. A transaction is unverifiable if the user does not have that option. Unverifiable transactions typically arise when a user agent automatically requests inlined or embedded entities or when it resolves redirection (3xx) responses from an origin server. Typically the origin transaction, the transaction that the user initiates, is verifiable, and that transaction may directly or indirectly induce the user agent to make unverifiable transactions.
When it makes an unverifiable transaction, a user agent must enable a session only if a cookie with a domain attribute D was sent or received in its origin transaction, such that the host name in the Request-URI of the unverifiable transaction domain-matches D.
This restriction prevents a malicious service author from using unverifiable transactions to induce a user agent to start or continue a session with a server in a different domain. The starting or continuation of such sessions could be contrary to the privacy expectations of the user, and could also be a security problem.
User agents may offer configurable options that allow the user agent, or any autonomous programs that the user agent executes, to ignore the above rule, so long as these override options default to "off".
Many current user agents already provide a review option that would render many links verifiable. For instance, some user agents display the URL that would be referenced for a particular link when the mouse pointer is placed over that link. The user can therefore determine whether to visit that site before causing the browser to do so. (Though not implemented on current user agents, a similar technique could be used for a button used to submit a form -- the user agent
could display the action to be taken if the user were to select that button.) However, even this would not make all links verifiable; for example, links to automatically loaded images would not normally be subject to "mouse pointer" verification.
Many user agents also provide the option for a user to view the HTML source of a document, or to save the source to an external file where it can be viewed by another application. While such an option does provide a crude review mechanism, some users might not consider it acceptable for this purpose.
A user agent returns much of the information in the Set-Cookie header to the origin server when the Path attribute matches that of a new request. When it receives a Cookie header, the origin server should treat cookies with NAMEs whose prefix is $ specially, as an attribute for the adjacent cookie. The value for such a NAME is to be interpreted as applying to the lexically (left-to-right) most recent cookie whose name does not have the $ prefix. If there is no previous cookie, the value applies to the cookie mechanism as a whole. For example, consider the cookie
Cookie: $Version="1"; Customer="WILE_E_COYOTE"; $Path="/acme"
$Version applies to the cookie mechanism as a whole (and gives the version number for the cookie mechanism). $Path is an attribute whose value (/acme) defines the Path attribute that was used when the Customer cookie was defined in a Set-Cookie response header.
One reason for separating state information from both a URL and document content is to facilitate the scaling that caching permits. To support cookies, a caching proxy must obey these rules already in the HTTP specification:
Most detail of request and response headers has been omitted. Assume the user agent has no stored cookies.
Cookie: $Version="1"; Customer="WILE_E_COYOTE"; $Path="/acme"; Part_Number="Rocket_Launcher_0001"; $Path="/acme"
[form data]
User selects shipping method from form.
Cookie: $Version="1"; Customer="WILE_E_COYOTE"; $Path="/acme"; Part_Number="Rocket_Launcher_0001"; $Path="/acme"; Shipping="FedEx"; $Path="/acme"
[form data]
User chooses to process order.
Transaction is complete.
The user agent makes a series of requests on the origin server, after each of which it receives a new cookie. All the cookies have the same Path attribute and (default) domain. Because the request URLs all have /acme as a prefix, and that matches the Path attribute, each request contains all the cookies received so far.
This example illustrates the effect of the Path attribute. All detail of request and response headers has been omitted. Assume the user agent has no stored cookies.
Imagine the user agent has received, in response to earlier requests, the response headers
Cookie: $Version="1"; Part_Number="Riding_Rocket_0023"; $Path="/acme/ammo"; Part_Number="Rocket_Launcher_0001"; $Path="/acme"
Note that the NAME=VALUE pair for the cookie with the more specific Path attribute, /acme/ammo, comes before the one with the less specific Path attribute, /acme. Further note that the same cookie name appears more than once.
A subsequent request by the user agent to the (same) server for a URL of the form /acme/parts/ would include the following request header:
Cookie: $Version="1"; Part_Number="Rocket_Launcher_0001"; $Path="/acme"
Here, the second cookie's Path attribute /acme/ammo is not a prefix of the request URL, /acme/parts/, so the cookie does not get forwarded to the server.
Here we speculate on likely or desirable details for an origin server that implements state management.
An origin server's content should probably be divided into disjoint application areas, some of which require the use of state information. The application areas can be distinguished by their request URLs. The Set-Cookie header can incorporate information about the application areas by setting the Path attribute for each one.
The session information can obviously be clear or encoded text that describes state. However, if it grows too large, it can become unwieldy. Therefore, an implementor might choose for the session information to be a key to a server-side resource. Of course, using
a database creates some problems that this state management specification was meant to avoid, namely:
Caching benefits the scalability of WWW. Therefore it is important to reduce the number of documents that have state embedded in them inherently. For example, if a shopping-basket-style application always displays a user's current basket contents on each page, those pages cannot be cached, because each user's basket's contents would be different. On the other hand, if each page contains just a link that allows the user to "Look at My Shopping Basket", the page can be cached.
Practical user agent implementations have limits on the number and size of cookies that they can store. In general, user agents' cookie support should have no fixed limits. They should strive to store as many frequently-used cookies as possible. Furthermore, general-use user agents should provide each of the following minimum capabilities individually, although not necessarily simultaneously:
The information in a Set-Cookie response header must be retained in its entirety. If for some reason there is inadequate space to store the cookie, it must be discarded, not truncated.
Applications should use as few and as small cookies as possible, and they should cope gracefully with the loss of a cookie.
User agents may choose to set an upper bound on the number of cookies to be stored from a given host or domain name or on the size of the cookie information. Otherwise a malicious server could attempt to flood a user agent with many cookies, or large cookies, on successive responses, which would force out cookies the user agent had received from other servers. However, the minima specified above should still be supported.
An origin server could create a Set-Cookie header to track the path of a user through the server. Users may object to this behavior as an intrusive accumulation of information, even if their identity is not evident. (Identity might become evident if a user subsequently fills out a form that contains identifying information.) This state management specification therefore requires that a user agent give the user control over such a possible intrusion, although the interface through which the user is given this control is left unspecified. However, the control mechanisms provided shall at least allow the user
an origin server. (The user agent would then behave like one that is unaware of how to handle Set-Cookie response headers.)
When the user agent terminates execution, it should let the user discard all state information. Alternatively, the user agent may ask the user whether state information should be retained; the default should be "no". If the user chooses to retain state information, it would be restored the next time the user agent runs.
NOTE: User agents should probably be cautious about using files to store cookies long-term. If a user runs more than one instance of the user agent, the cookies could be commingled or otherwise messed up.
The restrictions on the value of the Domain attribute, and the rules concerning unverifiable transactions, are meant to reduce the ways that cookies can "leak" to the "wrong" site. The intent is to restrict cookies to one, or a closely related set of hosts. Therefore a request-host is limited as to what values it can set for Domain. We consider it acceptable for hosts host1.foo.com and host2.foo.com to share cookies, but not a.com and b.com.
Similarly, a server can only set a Path for cookies that are related to the request-URI.
The information in the Set-Cookie and Cookie headers is unprotected. Two consequences are:
Proper application design can avoid spoofing attacks from related domains. Consider:
The server at victim.cracker.edu should detect that the second cookie was not one it originated by noticing that the Domain attribute is not for itself and ignore it.
A user agent should make every attempt to prevent the sharing of session information between hosts that are in different domains. Embedded or inlined objects may cause particularly severe privacy problems if they can be used to share cookies between disparate hosts. For example, a malicious server could embed cookie information for host a.com in a URI for a CGI on host b.com. User agent implementors are strongly encouraged to prevent this sort of exchange whenever possible.
Three other proposals have been made to accomplish similar goals. This specification is an amalgam of Kristol's State-Info proposal and Netscape's Cookie proposal.
Brian Behlendorf proposed a Session-ID header that would be user- agent-initiated and could be used by an origin server to track "clicktrails". It would not carry any origin-server-defined state, however. Phillip Hallam-Baker has proposed another client-defined session ID mechanism for similar purposes.
While both session IDs and cookies can provide a way to sustain stateful sessions, their intended purpose is different, and, consequently, the privacy requirements for them are different. A user initiates session IDs to allow servers to track progress through them, or to distinguish multiple users on a shared machine. Cookies are server-initiated, so the cookie mechanism described here gives users control over something that would otherwise take place without the users' awareness. Furthermore, cookies convey rich, server- selected information, whereas session IDs comprise user-selected, simple information.
HTTP/1.0 clients and servers may use Set-Cookie and Cookie headers that reflect Netscape's original cookie proposal. These notes cover inter-operation between "old" and "new" cookies.
This proposal adds attribute-value pairs to the Cookie request header in a compatible way. An "old" client that receives a "new" cookie will ignore attributes it does not understand; it returns what it does understand to the origin server. A "new" client always sends cookies in the new form.
An "old" server that receives a "new" cookie will see what it thinks are many cookies with names that begin with a $, and it will ignore them. (The "old" server expects these cookies to be separated by semi-colon, not comma.) A "new" server can detect cookies that have passed through an "old" client, because they lack a $Version attribute.
Netscape's original proposal defined an Expires header that took a date value in a fixed-length variant format in place of Max-Age:
Wdy, DD-Mon-YY HH:MM:SS GMT
Note that the Expires date format contains embedded spaces, and that "old" cookies did not have quotes around values. Clients that implement to this specification should be aware of "old" cookies and Expires.
In Netscape's original proposal, the values in attribute-value pairs did not accept "-quoted strings. Origin servers should be cautious about sending values that require quotes unless they know the receiving user agent understands them (i.e., "new" cookies). A ("new") user agent should only use quotes around values in Cookie headers when the cookie's version(s) is (are) all compliant with this specification or later.
In Netscape's original proposal, no whitespace was permitted around the = that separates attribute-value pairs. Therefore such whitespace should be used with caution in new implementations.
Some caches, such as those conforming to HTTP/1.0, will inevitably cache the Set-Cookie header, because there was no mechanism to suppress caching of headers prior to HTTP/1.1. This caching can lead to security problems. Documents transmitted by an origin server along with Set-Cookie headers will usually either be uncachable, or will be "pre-expired". As long as caches obey instructions not to cache documents (following Expires: <a date in the past> or Pragma: no-cache (HTTP/1.0), or Cache-control: no-cache (HTTP/1.1)) uncachable documents present no problem. However, pre-expired documents may be stored in caches. They require validation (a conditional GET) on each new request, but some cache operators loosen the rules for their caches, and sometimes serve expired documents without first validating them. This combination of factors can lead to cookies meant for one user later being sent to another user. The Set-Cookie header is stored in the cache, and, although the document is stale (expired), the cache returns the document in response to later requests, including cached headers.
This document really represents the collective efforts of the following people, in addition to the authors: Roy Fielding, Marc Hedlund, Ted Hardie, Koen Holtman, Shel Kaphan, Rohit Khare.
David M. Kristol Bell Laboratories, Lucent Technologies 600 Mountain Ave. Room 2A-227 Murray Hill, NJ 07974 Phone: (908) 582-2250 Fax: (908) 582-5809 EMail: dmk@bell-labs.com Lou Montulli Netscape Communications Corp. 501 E. Middlefield Rd. Mountain View, CA 94043 Phone: (415) 528-2600 EMail: montulli@netscape.com