# Codex CLI

## Where the data comes from

`~/.codex/sessions/**/*.jsonl` and `~/.codex/archived_sessions/**/*.jsonl` —
rollout session logs. Read-only. The Codex desktop app *moves* (not copies) a
session's JSONL from `sessions/` to `archived_sessions/` when it's archived, so
both are scanned; a session that briefly exists in both (a stale copy) is
deduplicated by relative path before parsing. `--codex-dir` overrides the
**sessions** directory; its `archived_sessions` sibling is derived from it.

## What's captured

- **Token usage** from `token_count` events (`last_token_usage`): input, cached
  input, output, reasoning output, total.
- **Tools** from `response_item` payloads (`exec_command`, `apply_patch`,
  `write_stdin`, …). Shell executions (`exec_command` and friends) are
  decomposed to the real command basename (`grep`, `cargo`, …) so the tool list
  reflects what actually ran, not one undifferentiated shell bucket.
- **Durations** from `task_complete` events (`duration_ms`) → the COMPLETION
  section.
- **Sessions / model / project** from `session_meta` and `turn_context`.

## How tokens are counted

OpenAI reports `cached_input_tokens` *inside* `input_tokens`. agent-walker
subtracts it, so `input` means **fresh (uncached) input**, and stores the cached
part as **cache_read** — the same schema as Claude:

```
tokens = input + output + cache_read
```

So the Total view adds up consistently with Claude. Two real differences
(behaviour, not accounting):

- **Lower cache ratio.** OpenAI caches less aggressively than Claude Code's
  full-context-every-turn, so Codex's cache_read share is smaller — Codex totals
  are more "fresh work" per token.
- **No cache writes.** OpenAI does not bill cache creation, so there is no
  cache_creation component.
- `reasoning_output_tokens` is a **subset of** `output` (already counted inside
  it), not added on top.

## Caveats

- **`exec_command` is decomposed to the real command.** Codex runs most
  reads / runs / searches through a shell wrapper (`exec_command`, usually
  `bash -lc "…"`); agent-walker extracts the effective command basename
  (`grep`/`cat`/`cargo`/…) so the tool list sees the real work instead of one
  undifferentiated shell bucket. A bare `exec_command` remains only when the
  wrapped command can't be parsed (the fallback bucket).
- `apply_patch` = file edits.

## Cost

Same cache-aware formula as Claude. GPT models price cache reads at ≈0.1× input
and have no cache-write cost (OpenAI caches automatically and bills no write);
rates come from LiteLLM. Sessions that log only the provider name are priced as
the current Codex default model. The same API-equivalent caveat applies — it is
not your ChatGPT subscription bill.

## Why this won't match Codex's own usage views

Same idea as Claude — different window and unit:

- **`codex /status`** shows tokens **remaining** against rolling 5-hour and
  weekly plan limits (a percentage of your quota), not a cumulative N-day count.
- **OpenAI's Platform usage dashboard** is per monthly billing cycle; cached
  tokens are shown but folded into the totals.

agent-walker reports a raw, cache-inclusive token count over a trailing N-day
window, so it won't equal either. Use it for cross-agent comparison, not to
reconcile your OpenAI bill.