# HTTP Archive (HAR) v1.3 Proposal Summary of changes from `v1.2`: - new property: [`request.headersCompression`](#request) - new property: [`response.headersCompression`](#response) - new property: [`postData.encoding`](#postdata) - new property: [`postData.params[].encoding`](#params) --- ## Data Structure HAR files are required to be saved in `UTF-8` encoding, other encodings are forbidden. The spec requires that tools support and ignore a BOM and allow them to emit one if they like. Summary of HAR object types: - [log](#log) - [creator](#creator) - [browser](#browser) - [pages](#pages) - [pageTimings](#pagetimings) - [entries](#entries) - [request](#request) - [headers](#headers) - [cookies](#cookies) - [queryString](#querystring) - [postData](#postData) - [params](#params) - [response](#response) - [headers](#headers) - [cookies](#cookies) - [content](#content) - [cache](#cache) - [timings](#timings) --- ### log This object represents the root of exported data. ```json "log": { "version" : "1.2", "creator" : {}, "browser" : {}, "pages": [], "entries": [], "comment": "" } ``` | name | type | required | default | description | | ------------------------- | --------- | --------- | ------- | ----------------------------------------------------------------------------------------------------------------- | | **version** | `string` | ✔️ | `1.3` | Version number of the format | | [**creator**](#creator) | `object` | ✔️ | `N/A` | Name and version info of the log [creator](#creator) application | | [**browser**](#browser) | `object` | ✖️ | `N/A` | Name and version info of used [browser](#browser) | | [**pages**](#pages) | `array` | ✖️ | `N/A` | List of all exported (tracked) [pages](#pages) | | [**entries**](#entries) | `array` | ✔️ | `N/A` | List of all exported (tracked) [requests](#entries) | | **comment** | `string` | ✖️ | `N/A` | A comment provided by the user or the application | > There is one [`page`](#pages) object for every exported web page and one [`entry`](#entries) object for every HTTP request. In case when an HTTP trace tool isn't able to group requests by a page, the `pages` array is empty and individual requests doesn't have a parent page. --- ### creator ```json "creator": { "name": "Firebug", "version": "1.6", "comment": "" } ``` | name | type | required | default | description | | ------------- | --------- | --------- | ------- | ------------------------------------------------------------------ | | **name** | `string` | ✔️ | `N/A` | Name of the application used to export the log | | **version** | `string` | ✔️ | `N/A` | Version of the application used to export the log | | **comment** | `string` | ✖️ | `N/A` | A comment provided by the user or the application | --- ### browser ```json "browser": { "name": "Firefox", "version": "3.6", "comment": "" } ``` | name | type | required | default | description | | ------------- | --------- | --------- | ------- | ------------------------------------------------------------------ | | **name** | `string` | ✔️ | `N/A` | Name of the browser used to export the log | | **version** | `string` | ✔️ | `N/A` | Version of the browser used to export the log | | **comment** | `string` | ✖️ | `N/A` | A comment provided by the user or the application | --- ### pages This object represents list of exported pages. ```json "pages": [ { "startedDateTime": "2009-04-16T12:07:25.123+01:00", "id": "page_0", "title": "Test Page", "pageTimings": {}, "comment": "" } ] ``` | name | type | required | default | description | | --------------------------------- | --------- | --------- | ------- | ----------------------------------------------------------------------------------------------------------- | | **startedDateTime** | `string` | ✔️ | `N/A` | Date and time stamp for the beginning of the page load ([`ISO 8601`][iso-8601]) | | **id** | `string` | ✔️ | `N/A` | Unique identifier of a page within the [`log`](#log). [`Entries`](#entries) use it to refer the parent page | | **title** | `string` | ✔️ | `N/A` | Page title | | [**pageTimings**](#pagetimings) | `object` | ✔️ | `N/A` | Detailed timing info about page load | | **comment** | `string` | ✖️ | `N/A` | A comment provided by the user or the application | --- ### pageTimings This object describes timings for various events (states) fired during the page load. All times are specified in milliseconds. If a time info is not available appropriate field is set to `-1` ```json "pageTimings": { "onContentLoad": 1720, "onLoad": 2500, "comment": "" } ``` | name | type | required | default | description | | ----------------- | --------- | --------- | ------- | --------------------------------------------------------------------------------------------------------------------------- | | **onContentLoad** | `number` | ✖️ | `-1` | Content of the page loaded. Number of milliseconds since page load started ([`page.startedDateTime`](#pages)) | | **onLoad** | `number` | ✖️ | `-1` | Page is loaded (`onLoad` event fired). Number of milliseconds since page load started ([`page.startedDateTime`](#pages)) | | **comment** | `string` | ✖️ | `N/A` | A comment provided by the user or the application | > Depeding on the browser, `onContentLoad` property represents `DOMContentLoad` event or `document.readyState == interactive`. --- ### entries This object represents an array with all exported HTTP requests. Sorting entries by `startedDateTime` (starting from the oldest) is preferred way how to export data since it can make importing faster. However the reader application should always make sure the array is sorted (if required for the import). ```json "entries": [ { "pageref": "page_0", "startedDateTime": "2009-04-16T12:07:23.596Z", "time": 50, "request": {}, "response": {}, "cache": {}, "timings": {}, "serverIPAddress": "10.0.0.1", "connection": "52492", "comment": "" } ] ``` | name | type | required | default | description | | ------------------------- | --------- | --------- | ------- | --------------------------------------------------------------------------------------------------------------------------------------------------- | | **pageref** | `string` | ✖️ | `N/A` | Unique Reference to the parent page | | **startedDateTime** | `string` | ✔️ | `N/A` | Date and time stamp of the request start ([`ISO 8601`][iso-8601]) | | **time** | `number` | ✔️ | `0` | Total elapsed time of the request in milliseconds. This is the sum of all timings available in the timings object (i.e. not including `-1` values) | | [**request**](#request) | `object` | ✔️ | `N/A` | Detailed info about the request | | [**response**](#response) | `object` | ✔️ | `N/A` | Detailed info about the response | | [**cache**](#cache) | `object` | ✔️ | `N/A` | Info about cache usage | | [**timings**](#timings) | `object` | ✔️ | `N/A` | Detailed timing info about request/response round trip | | **serverIPAddress** | `string` | ✖️ | `N/A` | IP address of the server that was connected (result of DNS resolution) | | **connection** | `string` | ✖️ | `N/A` | Unique ID of the parent TCP/IP connection, can be the client or server port number. | | **comment** | `string` | ✖️ | `N/A` | A comment provided by the user or the application | > `connection`: Note that a port number doesn't have to be unique identifier in cases where the port is shared for more connections. If the port isn't available for the application, any other unique connection ID can be used instead (e.g. connection index) --- ### request This object contains detailed info about performed request. ```json "request": { "method": "GET", "url": "http://www.example.com/path/?param=value", "httpVersion": "HTTP/1.1", "cookies": [], "headers": [], "queryString" : [], "postData" : {}, "headersSize" : 150, "bodySize" : 0, "comment" : "" } ``` | name | type | required | default | description | | --------------------------------- | --------- | --------- | ------- | ------------------------------------------------------------------------------------------------------------------------ | | **method** | `string` | ✔️ | `N/A` | [Request method][rfc7231-methods] | | **url** | `string` | ✔️ | `N/A` | Absolute URL of the request (fragments are not included) | | **httpVersion** | `string` | ✔️ | `N/A` | [Request HTTP Version][rfc7230-version] | | [**cookies**](#cookies) | `array` | ✔️ | `N/A` | List of [`cookie`](#cookies) objects | | [**headers**](#headers) | `array` | ✔️ | `N/A` | List of [`header`](#headers) objects | | [**queryString**](#querystring) | `array` | ✔️ | `N/A` | List of query parameter objects | | [**postData**](#postdata) | `object` | ✖️ | `N/A` | Posted data info | | **headersSize** | `number` | ✔️ | `-1` | Total number of bytes from the start of the HTTP request message until (and including) the double `CRLF` before the body | | **headersCompression** | `number` | ✖️ | `N/A` | *(new in 1.3)* Number of bytes saved (for [`HTTP/2`][rfc7540-headers]) | | **bodySize** | `number` | ✔️ | `-1` | Size of the request body in bytes (e.g. `POST` data payload) | | **comment** | `string` | ✖️ | `N/A` | A comment provided by the user or the application | > The total request size sent can be computed as follows *(if both values are available)*: > > ```js > let totalSize = entry.request.headersSize + entry.request.bodySize > ``` --- ### response This object contains detailed info about the response. ```json "response": { "status": 200, "statusText": "OK", "httpVersion": "HTTP/1.1", "cookies": [], "headers": [], "content": {}, "redirectURL": "", "headersSize" : 160, "bodySize" : 850, "comment" : "" } ``` | name | type | required | default | description | | ------------------------- | --------- | --------- | ------- | ------------------------------------------------------------------------------------------------------------------------- | | **status** | `number` | ✔️ | `N/A` | [Response status][rfc7231-status] | | **statusText** | `string` | ✔️ | `N/A` | [Response status description][rfc7231-status] | | **httpVersion** | `string` | ✔️ | `N/A` | [Response HTTP Version][rfc7230-version] | | [**cookies**](#cookies) | `array` | ✔️ | `N/A` | List of [`cookie`](#cookies) objects | | [**headers**](#headers) | `array` | ✔️ | `N/A` | List of [`header`](#headers) objects | | [**content**](#content) | `object` | ✔️ | `N/A` | Details about the response body | | **redirectURL** | `string` | ✔️ | `N/A` | Redirection target URL from the Location response header | | **headersSize** | `number` | ✔️ | `-1` | Total number of bytes from the start of the HTTP response message until (and including) the double `CRLF` before the body | | **headersCompression** | `number` | ✖️ | `N/A` | *(new in 1.3)* Number of bytes saved (for [`HTTP/2`][rfc7540-headers]) | | **bodySize** | `number` | ✔️ | `-1` | Size of the received response body in bytes. Set to `0` in case of responses coming from the cache (`304`) | | **comment** | `string` | ✖️ | `N/A` | A comment provided by the user or the application | > - `headersSize`: The size of received response-headers is computed only from headers that are really received from the server. Additional headers appended by the browser are not included in this number, but they appear in the list of header objects. > - The total response size received can be computed as follows *(if both values are available)*: > ```js > let totalSize = entry.response.headersSize + entry.response.bodySize > ``` --- ### cookies This object contains list of all cookies (used in [`request`](#request) and [`response`](#response) objects). ```json "cookies": [ { "name": "TestCookie", "value": "Cookie Value", "path": "/", "domain": "www.janodvarko.cz", "expires": "2009-07-24T19:20:30.123+02:00", "httpOnly": false, "secure": false, "comment": "" } ] ``` | name | type | required | default | description | | ------------- | --------- | --------- | ------- | ----------------------------------------------------------------------------------- | | **name** | `string` | ✔️ | `N/A` | The name of the cookie | | **value** | `string` | ✔️ | `N/A` | The cookie value | | **path** | `string` | ✖️ | `N/A` | The path pertaining to the cookie | | **domain** | `string` | ✖️ | `N/A` | The host of the cookie | | **expires** | `string` | ✖️ | `N/A` | Cookie expiration time. ([`ISO 8601`][iso-8601]) | | **httpOnly** | `boolean` | ✖️ | `N/A` | Set to true if the cookie is HTTP only, false otherwise | | **secure** | `boolean` | ✖️ | `N/A` | true` if the cookie was transmitted over ssl, `false` otherwise | | **comment** | `string` | ✖️ | `N/A` | A comment provided by the user or the application | --- ### headers This object contains list of all headers (used in [`request`](#request) and [`response`](#response) objects). ```json "headers": [ { "name": "Accept-Encoding", "value": "gzip,deflate", "comment": "" }, { "name": "Accept-Language", "value": "en-us,en;q=0.5", "comment": "" } ] ``` | name | type | required | default | description | | ------------- | --------- | --------- | ------- | ------------------------------------------------------------------ | | **name** | `string` | ✔️ | `N/A` | The name of the header | | **value** | `string` | ✔️ | `N/A` | The header value | | **comment** | `string` | ✖️ | `N/A` | A comment provided by the user or the application | --- ### queryString This object contains list of all parameters & values parsed from a query string, if any (embedded in [`request`](#request) object). ```json "queryString": [ { "name": "param1", "value": "value1", "comment": "" }, { "name": "param1", "value": "value1", "comment": "" } ] ``` | name | type | required | default | description | | ------------- | --------- | --------- | ------- | ------------------------------------------------------------------ | | **name** | `string` | ✔️ | `N/A` | The name of the query | | **value** | `string` | ✔️ | `N/A` | The query value | | **comment** | `string` | ✖️ | `N/A` | A comment provided by the user or the application | --- ### postData This object describes posted data, if any (embedded in [`request`](#request) object). ```json "postData": { "mimeType": "multipart/form-data", "params": [], "text" : "plain posted data", "comment": "" } ``` | name | type | required | default | description | | --------------------- | --------- | --------- | ------- | -------------------------------------------------------------------- | | **mimeType** | `string` | ✔️ | `N/A` | [Mime type][mime-type] of posted data | | [**params**](#params) | `array` | ✔️ * | `N/A` | List of posted parameters (in case of URL encoded parameters) | | **text** | `string` | ✔️ * | `N/A` | Plain text posted data | | **encoding** | `string` | ✖️ | `N/A` | *(new in 1.3)* - Encoding used for request `text` field e.g "base64" | | **comment** | `string` | ✖️ | `N/A` | A comment provided by the user or the application | > - `text` and `params` fields are mutually exclusive. > - `encoding` field is useful for including binary responses (e.g. images) into the HAR file. ###### Example > **original** > ``` > [binary data] > ``` > **encoded** > ```json > "postData": { > "mimeType": "image/jpeg", > "text": "W2JpbmFyeSBkYXRhXQ==", > "encoding": "base64", > "comment": "" > } > ``` --- ### params List of posted parameters, if any (embedded in [`postData`](#postData) object). ```json "params": [ { "name": "paramName", "value": "paramValue", "fileName": "example.pdf", "contentType": "application/pdf", "comment": "" } ] ``` | name | type | required | default | description | | ----------------- | --------- | --------- | ------- | ------------------------------------------------------------------- | | **name** | `string` | ✔️ | `N/A` | name of a posted parameter | | **value** | `string` | ✖️ | `N/A` | value of a posted parameter or content of a posted file | | **fileName** | `string` | ✖️ | `N/A` | name of a posted file | | **contentType** | `string` | ✖️ | `N/A` | content type of a posted file | | **encoding** | `string` | ✖️ | `N/A` | *(new in 1.3)* - Encoding used for `value` field e.g "base64" | | **comment** | `string` | ✖️ | `N/A` | A comment provided by the user or the application | ###### Example > **encoded** > ```json > "postData": { > "mimeType": "multipart/form-data", > "comment": "", > "params": [ > { > "name": "image", > "value": "W2JpbmFyeSBkYXRhXQ==", > "encoding": "base64", > "fileName": "example.jpg", > "contentType": "image/jpeg", > "comment": "" > } > ] > } > ``` --- ### content This object describes details about response content (embedded in [`response`](#response) object). ```json "content": { "size": 33, "compression": 0, "mimeType": "text/html; charset=utf-8", "text": "\n", "comment": "" } ``` | name | type | required | default | description | | ----------------- | --------- | --------- | ------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------- | | **size** | `number` | ✔️ | `N/A` | Length of the returned content in bytes. Should be equal to `response.bodySize` if there is no compression and bigger when the content has been compressed | | **compression** | `number` | ✖️ | `N/A` | Number of bytes saved | | **mimeType** | `string` | ✔️ | `N/A` | MIME type of the response `text` *(value of the `Content-Type` response header)*. The charset attribute of the MIME type is included *(if available)* | | **text** | `string` | ✖️ | `N/A` | Response body sent from the server or loaded from the browser cache. | | **encoding** | `string` | ✖️ | `N/A` | Encoding used for response `text` field e.g "base64" | | **comment** | `string` | ✖️ | `N/A` | A comment provided by the user or the application | > - The `text` field is populated with textual content only > - The `text` field is either HTTP decoded text or a encoded (e.g. "base64") representation of the response body > - Leave out the `encoding` field if the text field is HTTP decoded (decompressed & unchunked), than trans-coded from its original character set into `UTF-8` > - Before setting the text field, the HTTP response is decoded (decompressed & unchunked), than trans-coded from its original character set into `UTF-8`. > - Additionally, it can be encoded using e.g. base64. > - Ideally, the application should be able to unencode a base64 blob and get a byte-for-byte identical resource to what the browser operated on. > - `encoding` field is useful for including binary responses (e.g. images) into the HAR file. ###### Example > **original** > ```html > \n > ``` > **encoded** > ```json > "content": { > "size": 33, > "compression": 0, > "mimeType": "text/html; charset=utf-8", > "text": "PGh0bWw+PGhlYWQ+PC9oZWFkPjxib2R5Lz48L2h0bWw+XG4=", > "encoding": "base64", > "comment": "" > } > ``` --- ### cache This objects contains info about a request coming from browser cache. ```json "cache": { "beforeRequest": {}, "afterRequest": {}, "comment": "" } ``` | name | type | required | default | description | | ------------------------------------------------- | --------- | --------- | ------- | ------------------------------------------------------------------- | | [**beforeRequest**](#beforerequest--afterrequest) | `object` | ✖️ | `N/A` | State of a cache entry before the request | | [**afterRequest**](#beforerequest--afterrequest) | `object` | ✖️ | `N/A` | State of a cache entry after the request | | **comment** | `string` | ✖️ | `N/A` | A comment provided by the user or the application | ###### Example 1 No cache information are available (or you can just leave out the entire field). > ```json > "cache": {} > ``` ###### Example 2 Info about the cache entry before request is not available and there is no cache entry after the request. > ```json > "cache": { > "afterRequest": null > } > ``` ###### Example 3 No cache entry before nor after the request. > ```json > "cache": { > "beforeRequest": null, > "afterRequest": null > } > ``` ###### Example 4 Indicate that the entry was not in the cache but was store after the content was downloaded by the request. > ```json > "cache": { > "beforeRequest": null, > "afterRequest": { > "expires": "2009-04-16T15:50:36", > "lastAccess": "2009-16-02T15:50:34", > "eTag": "", > "hitCount": 0, > "comment": "" > } > } > ``` --- ### beforeRequest / afterRequest ```json "beforeRequest": { "expires": "2009-04-16T15:50:36", "lastAccess": "2009-16-02T15:50:34", "eTag": "", "hitCount": 0, "comment": "" } ``` | name | type | required | default | description | | ----------------- | --------- | --------- | ------- | ------------------------------------------------------------------- | | **expires** | `string` | ✖️ | `N/A` | Expiration time of the cache entry | | **lastAccess** | `string` | ✔️ | `N/A` | The last time the cache entry was opened | | **eTag** | `string` | ✔️ | `N/A` | [Etag][rfc7232-etag] | | **hitCount** | `number` | ✔️ | `N/A` | The number of times the cache entry has been opened | | **comment** | `string` | ✖️ | `N/A` | A comment provided by the user or the application | --- ### timings This object describes various phases within request-response round trip. All times are specified in milliseconds. ```json "timings": { "blocked": 0, "dns": -1, "connect": 15, "send": 20, "wait": 38, "receive": 12, "ssl": -1, "comment": "" } ``` | name | type | required | default | description | | ------------- | --------- | --------- | ------- | ------------------------------------------------------------------- | | **blocked** | `number` | ✖️ | `-1` | Time spent in a queue waiting for a network connection | | **dns** | `number` | ✖️ | `-1` | DNS resolution time. The time required to resolve a host name | | **connect** | `number` | ✖️ | `-1` | Time required to create TCP connection | | **send** | `number` | ✔️ | `N/A` | Time required to send HTTP request to the server | | **wait** | `number` | ✔️ | `N/A` | Waiting for a response from the server | | **receive** | `number` | ✔️ | `N/A` | Time required to read entire response from the server (or cache) | | **ssl** | `number` | ✖️ | `-1` | Time required for SSL/TLS negotiation. | | **comment** | `string` | ✖️ | `N/A` | A comment provided by the user or the application | > - If the `ssl` field is defined then the time is also included in the connect field *(to ensure backward compatibility with HAR `v1.1`)* > - The `send`, `wait` and `receive` timings are not optional and **must** have non-negative values. > - An exporting tool can omit the blocked, dns, connect and ssl, timings on every request if it is unable to provide them. > - Tools that can provide these timings can set their values to `-1` if they don’t apply. For example, connect would be `-1` for requests which re-use an existing connection. > - The `time` value for the request must be equal to the sum of the timings supplied in this section *(excluding any `-1` values)*. > - The Following must be true in case there are no `-1` values (`entry` is an object in [`log.entries`](#entries)) : > ```js > entry.time == entry.timings.blocked + entry.timings.dns + entry.timings.connect > entry.timings.send + entry.timings.wait + entry.timings.receive > ``` --- ## Custom Fields The specification allows adding new custom fields into the output format. Following rules must be applied: - Custom fields and elements MUST start with an underscore (spec fields should never start with an underscore. - Parsers MUST ignore all custom fields and elements if the file was not written by the same tool loading the file. - Parsers MUST ignore all non-custom fields that they don't know how to parse because the minor version number is greater than the maximum minor version for which they were written. - Parsers can reject files that contain non-custom fields that they know were not present in a specific version of the spec. ## Versioning Scheme The spec number has following syntax: ``` . ``` Where the major version indicates overall backwards compatibility and the minor version indicates incremental changes. So, any backwardly compatible changes to the spec will result in an increase of the minor version. If an existing fields had to be broken then major version would increase (e.g. 2.0). ###### Examples: ``` 1.2 -> 1.3 1.111 -> 1.112 (in case of 111 more changes) 1.5 -> 2.0 (2.0 is not compatible with 1.5) ``` So following construct can be used to detect incompatible version if a tool supports HAR since 1.1. ```js if (majorVersion != 1 || minorVersion < 1) { throw "Incompatible version" } ``` In this example a tool throws an exception if the version is e.g.: `0.8`, `0.9`, `1.0`, but works with `1.1`, `1.2`, `1.112` etc. Version `2.x` would be rejected. ## Compression Compression of the HAR file is not part of the core HAR spec. However, in order to store HAR files more efficiently, it is recommended that you compress HAR files on disk (you might want to use `*.zhar` extension for zipped HAR files). An application supporting HAR, is not required to support compressed HAR files. If the application doesn't support compressed HAR then it's the responsibility of the user to decompress before passing the HAR file into it. [HTTP Compression][http-compression] is one of the best practices how to speed up web applications and it's also recommended for HAR files. [http-compression]: http://en.wikipedia.org/wiki/HTTP_compression [iso-8601]: http://en.wikipedia.org/wiki/ISO_8601 [mime-type]: http://httpwg.github.io/specs/rfc7231.html#media.type [rfc7230-version]: http://httpwg.github.io/specs/rfc7230.html#http.version [rfc7231-methods]: http://httpwg.github.io/specs/rfc7231.html#methods [rfc7231-status]: http://httpwg.github.io/specs/rfc7231.html#status.codes [rfc7232-etag]: http://httpwg.org/specs/rfc7232.html#header.etag [rfc7540-headers]: http://httpwg.org/specs/rfc7540.html#HeaderBlock