{ "cells": [ { "cell_type": "markdown", "id": "98f4e72a", "metadata": {}, "source": [ "# Getting Started with OpenAI Models on Amazon Bedrock\n", "\n", "OpenAI models on Amazon Bedrock expose an OpenAI-compatible Responses API surface for production workflows that need text generation, structured outputs, application tools, direct file inputs, response state, prompt caching, and background work. This cookbook keeps the examples concrete by building a support-assistant workflow for **BrightCart**, a fictional retailer handling delayed and damaged-order replacement requests.\n", "\n", "You will use the OpenAI Python SDK for normal application calls and a small raw HTTPS helper when it is useful to inspect the exact request body. The flow starts with setup and a minimal preflight, then layers on response lifecycle, model controls, structured JSON, application tools, file input, state management, caching, background processing, context compaction, operations checks, and cleanup.\n", "\n", "You will learn how to:\n", "\n", "1. Configure a Bedrock-hosted OpenAI model with Bedrock-specific environment variables.\n", "2. Verify the Responses endpoint and inspect response schema, usage metadata, and normalized errors.\n", "3. Send text requests with both raw HTTPS and the OpenAI SDK.\n", "4. Generate schema-constrained JSON and lighter JSON-mode handoffs.\n", "5. Call application-managed function tools, parallel tools, and custom text tools.\n", "6. Send a direct PDF input, continue stateful and stateless conversations, and carry encrypted reasoning context.\n", "7. Use prompt caching, background mode, compaction, operational smoke checks, and stored-response cleanup.\n", "\n", "Prerequisites: a bearer token for OpenAI models on Amazon Bedrock, Python 3.9 or newer, and network access to your Bedrock OpenAI-compatible endpoint.\n", "\n", "This guide runs `openai.gpt-5.4` in `us-west-2` by default. 
To use another supported pairing, change `AWS_REGION`, `BEDROCK_MODEL`, and `BEDROCK_BASE_URL` together before running the setup cells.\n", "\n", "| AWS Region | Supported model IDs |\n", "| --- | --- |\n", "| `us-west-2` | `openai.gpt-5.4` |\n", "| `us-east-2` | `openai.gpt-5.5`, `openai.gpt-5.4` |\n" ] }, { "cell_type": "markdown", "id": "8331f876", "metadata": {}, "source": [ "## 1. Configure Amazon Bedrock\n", "\n", "This section prepares the notebook runtime. It installs the small Python stack, reads Bedrock-specific environment variables, creates both a raw HTTPS session and an OpenAI SDK client, discovers model metadata when the endpoint provides it, and defines shared helpers used by later examples.\n", "\n", "Set these environment variables before running the notebook. The default pairing is `us-west-2` with `openai.gpt-5.4`.\n", "\n", "```bash\n", "export AWS_BEARER_TOKEN_BEDROCK=\"YOUR_BEDROCK_BEARER_TOKEN\"\n", "export AWS_REGION=\"us-west-2\"\n", "export BEDROCK_MODEL=\"openai.gpt-5.4\"\n", "export BEDROCK_BASE_URL=\"https://bedrock-mantle.${AWS_REGION}.api.aws/openai/v1\"\n", "```\n", "\n", "The bearer token is read from `AWS_BEARER_TOKEN_BEDROCK`. If it is missing, the setup cell asks for it with a password-style prompt and does not print it.\n" ] }, { "cell_type": "markdown", "id": "4a7f4690", "metadata": {}, "source": [ "### 1.1 Install Dependencies\n", "\n", "Install the packages used by the notebook. The OpenAI SDK is used for the application examples, `requests` is used for raw HTTPS calls to the Responses endpoint, and `pandas` plus IPython display helpers keep request and response summaries readable in the Cookbook renderer. 
Inspect the cell output only to confirm the packages installed or were already present.\n" ] }, { "cell_type": "code", "execution_count": 1, "id": "4236e3c7", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "[notice] A new release of pip is available: 24.0 -> 26.1.1\n", "[notice] To update, run: pip install --upgrade pip\n", "Note: you may need to restart the kernel to use updated packages.\n", "Dependencies installed or already available: openai, requests, pandas, ipython\n" ] } ], "source": [ "%pip install -U \"openai>=2.28.0\" requests pandas ipython --quiet\n", "print(\"Dependencies installed or already available: openai, requests, pandas, ipython\")" ] }, { "cell_type": "markdown", "id": "1cd3f5e2", "metadata": {}, "source": [ "### 1.2 Import Libraries and Defaults\n", "\n", "Import the standard libraries, SDK, HTTP client, and display utilities used throughout the notebook. This cell also sets the default Bedrock region and model used when environment variables are not already set. 
Inspect the printed defaults to confirm the notebook will start from `us-west-2` and `openai.gpt-5.4` unless you override them.\n" ] }, { "cell_type": "code", "execution_count": 2, "id": "d53dd7c9", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Imports loaded.\n", "Default region: us-west-2\n", "Default model: openai.gpt-5.4\n" ] } ], "source": [ "from __future__ import annotations\n", "\n", "import base64\n", "import builtins\n", "import html\n", "import json\n", "import os\n", "import shlex\n", "import textwrap\n", "import time\n", "from datetime import date, timedelta\n", "from getpass import getpass\n", "from typing import Any, Callable, Iterable\n", "\n", "import pandas as pd\n", "import requests\n", "from IPython.display import HTML, Markdown, display\n", "from openai import OpenAI\n", "\n", "DEFAULT_REGION = \"us-west-2\"\n", "DEFAULT_MODEL = \"openai.gpt-5.4\"\n", "PREFERRED_MODELS = [DEFAULT_MODEL]\n", "\n", "\n", "def gpt_version_tuple(model_id: str) -> tuple[int, int] | None:\n", " normalized = model_id.lower().removeprefix(\"openai.\")\n", " if not normalized.startswith(\"gpt-\"):\n", " return None\n", " version = normalized.removeprefix(\"gpt-\").split(\"-\")[0]\n", " parts = version.split(\".\")\n", " try:\n", " major = builtins.int(parts[0])\n", " minor = builtins.int(parts[1]) if len(parts) > 1 else 0\n", " except ValueError:\n", " return None\n", " return major, minor\n", "\n", "\n", "def prompt_cache_retention_for_model(model_id: str) -> str:\n", " version = gpt_version_tuple(model_id)\n", " if version and version >= (5, 5):\n", " return \"24h\"\n", " return \"in_memory\"\n", "\n", "pd.set_option(\"display.max_columns\", None)\n", "pd.set_option(\"display.max_rows\", 200)\n", "pd.set_option(\"display.max_colwidth\", None)\n", "pd.set_option(\"display.width\", 160)\n", "\n", "\n", "def display_wrapped_table(df: pd.DataFrame, *, max_col_width_px: int = 520, index: bool = False) -> None:\n", " if df.empty:\n", " 
display(Markdown(\"_No rows to display._\"))\n", " return\n", " table_html = df.to_html(index=index, escape=True, border=0)\n", " table_html = table_html.replace('', '
')\n", " display(HTML(f\"\"\"\n", " \n", " {table_html}\n", " \"\"\"))\n", "\n", "print(\"Imports loaded.\")\n", "print(\"Default region:\", DEFAULT_REGION)\n", "print(\"Default model:\", DEFAULT_MODEL)\n" ] }, { "cell_type": "markdown", "id": "3f0f8320", "metadata": {}, "source": [ "### 1.3 Configure Bedrock Credentials and Clients\n", "\n", "Read Bedrock configuration from the environment and construct clients. `BEDROCK_BASE_URL` is normalized once, the raw `requests.Session` gets the bearer token in its headers, and the OpenAI SDK client is created explicitly with the same token and base URL. Inspect the rendered table to confirm the selected region, model, endpoint, SDK client configuration, and stored-response cleanup behavior before making live calls.\n" ] }, { "cell_type": "code", "execution_count": 3, "id": "50c559a3", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", "
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
| setting | value |
| --- | --- |
| AWS_REGION | us-west-2 |
| BEDROCK_MODEL | openai.gpt-5.4 |
| BEDROCK_BASE_URL | https://bedrock-mantle.us-west-2.api.aws/openai/v1 |
| SDK client | OpenAI(api_key=AWS_BEARER_TOKEN_BEDROCK, base_url=BEDROCK_BASE_URL) |
| cleanup stored responses | True |
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from __future__ import annotations\n", "\n", "\n", "def env_value(*names: str) -> str | None:\n", " for name in names:\n", " value = os.environ.get(name)\n", " if value:\n", " return value\n", " return None\n", "\n", "\n", "def env_flag(name: str, default: bool = False) -> bool:\n", " value = env_value(name)\n", " if value is None:\n", " return default\n", " return value.strip().lower() in {\"1\", \"true\", \"yes\", \"on\"}\n", "\n", "\n", "def normalize_base_url(url: str) -> str:\n", " url = url.strip().rstrip(\"/\")\n", " if url.endswith(\"/responses\"):\n", " return url[: -len(\"/responses\")]\n", " return url\n", "\n", "\n", "def endpoint(path: str) -> str:\n", " return f\"{BEDROCK_BASE_URL}/{path.lstrip('/')}\"\n", "\n", "\n", "def responses_url(base_url: str) -> str:\n", " return f\"{normalize_base_url(base_url)}/responses\"\n", "\n", "\n", "API_TIMEOUT_SECONDS = float(env_value(\"BEDROCK_REQUEST_TIMEOUT_SECONDS\") or \"60\")\n", "MAX_RETRIES = builtins.int(env_value(\"BEDROCK_MAX_RETRIES\") or \"0\")\n", "CLEAN_UP_STORED_RESPONSES = env_flag(\"BEDROCK_CLEANUP_STORED_RESPONSES\", True)\n", "FAIL_ON_CHECK_FAILURE = env_flag(\"BEDROCK_FAIL_ON_CHECK_FAILURE\", False)\n", "RUN_RESPONSIVENESS_CHECK = env_flag(\"BEDROCK_RESPONSIVENESS_CHECK\", True)\n", "TRANSIENT_STATUS_CODES = {408, 409, 429, 500, 502, 503, 504}\n", "\n", "AWS_REGION = (env_value(\"AWS_REGION\") or DEFAULT_REGION).strip() or DEFAULT_REGION\n", "BEDROCK_MODEL = (env_value(\"BEDROCK_MODEL\") or DEFAULT_MODEL).strip() or DEFAULT_MODEL\n", "BEDROCK_BASE_URL = normalize_base_url(\n", " env_value(\"BEDROCK_BASE_URL\") or f\"https://bedrock-mantle.{AWS_REGION}.api.aws/openai/v1\"\n", ")\n", "RESPONSES_URL = responses_url(BEDROCK_BASE_URL)\n", "AWS_BEARER_TOKEN_BEDROCK = env_value(\"AWS_BEARER_TOKEN_BEDROCK\")\n", "\n", "if not AWS_BEARER_TOKEN_BEDROCK:\n", " AWS_BEARER_TOKEN_BEDROCK = getpass(\"Paste your 
AWS Bedrock bearer token for this kernel session: \").strip()\n", " if AWS_BEARER_TOKEN_BEDROCK:\n", " os.environ[\"AWS_BEARER_TOKEN_BEDROCK\"] = AWS_BEARER_TOKEN_BEDROCK\n", "\n", "if not AWS_BEARER_TOKEN_BEDROCK:\n", " raise RuntimeError(\"AWS_BEARER_TOKEN_BEDROCK is required to run the live examples.\")\n", "\n", "http = requests.Session()\n", "http.headers.update({\n", " \"Authorization\": f\"Bearer {AWS_BEARER_TOKEN_BEDROCK}\",\n", " \"Content-Type\": \"application/json\",\n", "})\n", "\n", "client = OpenAI(api_key=AWS_BEARER_TOKEN_BEDROCK, base_url=BEDROCK_BASE_URL, max_retries=0)\n", "BASE_URL = BEDROCK_BASE_URL\n", "\n", "config_rows = [\n", " {\"setting\": \"AWS_REGION\", \"value\": AWS_REGION},\n", " {\"setting\": \"BEDROCK_MODEL\", \"value\": BEDROCK_MODEL},\n", " {\"setting\": \"BEDROCK_BASE_URL\", \"value\": BEDROCK_BASE_URL},\n", " {\"setting\": \"SDK client\", \"value\": \"OpenAI(api_key=AWS_BEARER_TOKEN_BEDROCK, base_url=BEDROCK_BASE_URL)\"},\n", " {\"setting\": \"cleanup stored responses\", \"value\": CLEAN_UP_STORED_RESPONSES},\n", "]\n", "display_wrapped_table(pd.DataFrame(config_rows), max_col_width_px=680)\n" ] }, { "cell_type": "markdown", "id": "23cf01d3", "metadata": {}, "source": [ "### 1.4 Discover Available Models\n", "\n", "Discover available models when the selected endpoint exposes model-list metadata, then choose the model for the rest of the notebook. If `BEDROCK_MODEL` is set, the notebook uses that value; otherwise it prefers `openai.gpt-5.4`. The model-list call is optional because some compatible endpoints may allow inference even when model metadata is unavailable. Inspect the selected model and any returned catalog rows.\n" ] }, { "cell_type": "code", "execution_count": 4, "id": "ed26644c", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
| selected_model | model_was_explicit | model_catalog_status | discovered_model_count | prompt_cache_retention | prompt_cache_retention_note | note |
| --- | --- | --- | --- | --- | --- | --- |
| openai.gpt-5.4 | False | using configured model | 0 | in_memory | GPT-5.5 and later use 24h extended prompt caching; earlier GPT-5 models can use in_memory. | This endpoint did not expose model-list metadata. The guide will continue with the configured model. |
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Continuing with: openai.gpt-5.4\n" ] } ], "source": [ "from __future__ import annotations\n", "\n", "\n", "def list_openai_models(client: OpenAI) -> list[str]:\n", " return sorted(model.id for model in client.models.list(timeout=API_TIMEOUT_SECONDS).data)\n", "\n", "\n", "def resolve_model_id(client: OpenAI | None) -> tuple[str, list[str], str | None]:\n", " configured_model = env_value(\"BEDROCK_MODEL\")\n", " available_models: list[str] = []\n", " model_discovery_note: str | None = None\n", "\n", " if client is not None:\n", " try:\n", " available_models = list_openai_models(client)\n", " except Exception as exc:\n", " status_code = getattr(exc, \"status_code\", None)\n", " if status_code == 404:\n", " model_discovery_note = \"This endpoint did not expose model-list metadata. The guide will continue with the configured model.\"\n", " else:\n", " model_discovery_note = f\"Model-list metadata could not be listed. The guide will continue with the configured model. 
Details: {builtins.str(exc)[:240]}\"\n", "\n", " if configured_model:\n", " return configured_model, available_models, model_discovery_note\n", "\n", " for candidate in PREFERRED_MODELS:\n", " if candidate in available_models:\n", " return candidate, available_models, model_discovery_note\n", "\n", " for candidate in available_models:\n", " if candidate.startswith(\"openai.\"):\n", " return candidate, available_models, model_discovery_note\n", "\n", " if available_models:\n", " return available_models[0], available_models, model_discovery_note\n", "\n", " return PREFERRED_MODELS[0], available_models, model_discovery_note\n", "\n", "\n", "EXPLICIT_MODEL = env_value(\"BEDROCK_MODEL\")\n", "MODEL_ID, AVAILABLE_MODELS, MODEL_DISCOVERY_NOTE = resolve_model_id(client)\n", "os.environ[\"BEDROCK_MODEL\"] = MODEL_ID\n", "PROMPT_CACHE_RETENTION = prompt_cache_retention_for_model(MODEL_ID)\n", "PROMPT_CACHE_RETENTION_NOTE = (\n", " \"GPT-5.5 and later use 24h extended prompt caching; earlier GPT-5 models can use in_memory.\"\n", ")\n", "\n", "config_rows = [{\n", " \"selected_model\": MODEL_ID,\n", " \"model_was_explicit\": bool(EXPLICIT_MODEL),\n", " \"model_catalog_status\": \"listed\" if AVAILABLE_MODELS else \"using configured model\",\n", " \"discovered_model_count\": len(AVAILABLE_MODELS),\n", " \"prompt_cache_retention\": PROMPT_CACHE_RETENTION,\n", " \"prompt_cache_retention_note\": PROMPT_CACHE_RETENTION_NOTE,\n", " \"note\": MODEL_DISCOVERY_NOTE or \"Model selection is ready.\",\n", "}]\n", "display_wrapped_table(pd.DataFrame(config_rows), max_col_width_px=620)\n", "\n", "if AVAILABLE_MODELS:\n", " display_wrapped_table(pd.DataFrame({\"available_models\": AVAILABLE_MODELS[:25]}), max_col_width_px=520)\n", "else:\n", " print(\"Continuing with:\", MODEL_ID)\n" ] }, { "cell_type": "markdown", "id": "d37d8bdb", "metadata": {}, "source": [ "### 1.5 Helper Functions Setup\n", "\n", "Define shared helpers for the workflow. 
These helpers render request shapes, normalize API errors, send raw HTTPS requests, wrap SDK calls with optional retries, extract `output_text`, summarize token usage, track stored response IDs, and display compact tables. The examples below stay focused on each API concept while the helpers handle repeated mechanics. Inspect this cell if you want to understand how response text, usage, errors, and cleanup are processed.\n" ] }, { "cell_type": "code", "execution_count": 5, "id": "13c12262", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Helpers ready.\n" ] } ], "source": [ "from __future__ import annotations\n", "\n", "RESULTS_SUMMARY: list[dict[str, Any]] = []\n", "EXAMPLE_RESPONSES: list[dict[str, str]] = []\n", "STORED_RESPONSE_IDS: list[str] = []\n", "OUTPUT_WIDTH = 100\n", "MAX_DISPLAY_TEXT_CHARS = builtins.int(env_value(\"BEDROCK_MAX_DISPLAY_CHARS\") or \"1200\")\n", "\n", "\n", "def truncate_display_text(text: Any, *, limit: int = MAX_DISPLAY_TEXT_CHARS) -> str:\n", " rendered = builtins.str(text).strip()\n", " if len(rendered) <= limit:\n", " return rendered\n", " return rendered[:limit].rstrip() + \"\\n[Display truncated for readability. Inspect the Python variable for the full value.]\"\n", "\n", "\n", "def compact_text(text: Any, limit: int = 220) -> str:\n", " rendered = \" \".join(builtins.str(text).split())\n", " if len(rendered) <= limit:\n", " return rendered\n", " return rendered[:limit].rstrip() + \"...\"\n", "\n", "\n", "def require(condition: Any, message: str) -> None:\n", " if not condition:\n", " raise ValueError(message)\n", "\n", "\n", "def warn_or_raise(condition: bool, message: str) -> bool:\n", " if condition:\n", " return True\n", " display(HTML(f\"
Warning: {html.escape(message)}
\"))\n", " if FAIL_ON_CHECK_FAILURE:\n", " raise AssertionError(message)\n", " return False\n", "\n", "\n", "def display_text_block(label: str, text: Any, *, limit: int = MAX_DISPLAY_TEXT_CHARS) -> None:\n", " safe_label = html.escape(label)\n", " safe_text = html.escape(truncate_display_text(text, limit=limit))\n", " display(HTML(f\"\"\"\n", "
\n", "
{safe_label}
\n", "
{safe_text}
\n", "
\n", " \"\"\"))\n", "\n", "\n", "def print_wrapped(text: Any, *, width: int = OUTPUT_WIDTH) -> None:\n", " print(textwrap.fill(builtins.str(text), width=width, break_long_words=True, break_on_hyphens=False))\n", "\n", "\n", "def print_json(value: Any, *, width: int = OUTPUT_WIDTH) -> None:\n", " display_json_block(\"JSON\", value)\n", "\n", "\n", "def print_label(label: str) -> None:\n", " display(HTML(f\"
{html.escape(label)}
\"))\n", "\n", "\n", "def print_labeled_text(label: str, text: Any) -> None:\n", " display_text_block(label, text)\n", "\n", "\n", "def print_labeled_json(label: str, value: Any) -> None:\n", " display_json_block(label, value)\n", "\n", "\n", "def display_json_block(label: str, value: Any, *, limit: int = MAX_DISPLAY_TEXT_CHARS) -> None:\n", " rendered = json.dumps(value, indent=2, default=builtins.str)\n", " display_text_block(label, rendered, limit=limit)\n", "\n", "\n", "def summarize_content(content: Any) -> str:\n", " if isinstance(content, builtins.str):\n", " return compact_text(content)\n", " if isinstance(content, builtins.list):\n", " parts: list[str] = []\n", " for item in content:\n", " if not isinstance(item, builtins.dict):\n", " parts.append(compact_text(item, 80))\n", " continue\n", " item_type = item.get(\"type\", \"item\")\n", " if item_type == \"input_text\":\n", " parts.append(f\"input_text: {compact_text(item.get('text', ''), 120)}\")\n", " elif item_type == \"input_file\":\n", " parts.append(f\"input_file: {item.get('filename', '')}\")\n", " else:\n", " parts.append(item_type)\n", " return \"; \".join(parts)\n", " return compact_text(content)\n", "\n", "\n", "def summarize_input(input_value: Any) -> str:\n", " if isinstance(input_value, builtins.str):\n", " return compact_text(input_value, 260)\n", " if isinstance(input_value, builtins.list):\n", " messages: list[str] = []\n", " for item in input_value[:4]:\n", " if isinstance(item, builtins.dict):\n", " role = item.get(\"role\", item.get(\"type\", \"item\"))\n", " messages.append(f\"{role}: {summarize_content(item.get('content', item))}\")\n", " else:\n", " messages.append(compact_text(item, 120))\n", " suffix = f\"; +{len(input_value) - 4} more\" if len(input_value) > 4 else \"\"\n", " return f\"{len(input_value)} item(s): \" + \"; \".join(messages) + suffix\n", " return compact_text(input_value, 260)\n", "\n", "\n", "def summarize_text_format(text_config: Any) -> str:\n", " if not 
isinstance(text_config, builtins.dict):\n", " return compact_text(text_config)\n", " fmt = text_config.get(\"format\")\n", " if isinstance(fmt, builtins.dict):\n", " fmt_type = fmt.get(\"type\")\n", " if fmt_type == \"json_schema\":\n", " schema = fmt.get(\"schema\") or {}\n", " required = schema.get(\"required\") or []\n", " return f\"json_schema: {fmt.get('name')} strict={fmt.get('strict')} required={len(required)} fields\"\n", " if fmt_type:\n", " return builtins.str(fmt_type)\n", " return compact_text(text_config)\n", "\n", "\n", "def request_summary_rows(payload: dict[str, Any]) -> list[dict[str, str]]:\n", " rows: list[dict[str, str]] = []\n", " ordered_keys = [\n", " \"model\", \"max_output_tokens\", \"store\", \"background\", \"service_tier\", \"previous_response_id\",\n", " \"parallel_tool_calls\", \"prompt_cache_key\", \"prompt_cache_retention\",\n", " ]\n", " for key in ordered_keys:\n", " if key in payload:\n", " rows.append({\"field\": key, \"value\": compact_text(payload[key], 180)})\n", " if \"reasoning\" in payload:\n", " rows.append({\"field\": \"reasoning\", \"value\": compact_text(payload[\"reasoning\"], 180)})\n", " if \"text\" in payload:\n", " rows.append({\"field\": \"text format\", \"value\": summarize_text_format(payload[\"text\"])})\n", " if \"include\" in payload:\n", " rows.append({\"field\": \"include\", \"value\": compact_text(payload[\"include\"], 180)})\n", " if \"tools\" in payload:\n", " tool_names = [tool.get(\"name\", tool.get(\"type\", \"tool\")) for tool in payload.get(\"tools\", [])]\n", " rows.append({\"field\": \"tools\", \"value\": \", \".join(tool_names)})\n", " if \"tool_choice\" in payload:\n", " rows.append({\"field\": \"tool_choice\", \"value\": compact_text(payload[\"tool_choice\"], 180)})\n", " if \"input\" in payload:\n", " rows.append({\"field\": \"input\", \"value\": summarize_input(payload[\"input\"])})\n", " return rows\n", "\n", "\n", "def print_request_shape(payload: dict[str, Any]) -> None:\n", " rows = 
request_summary_rows(redact_payload(payload))\n", " print_label(\"Request shape\")\n", " display_wrapped_table(pd.DataFrame(rows), max_col_width_px=520)\n", "\n", "\n", "def print_response_summary(response_or_summary: Any) -> None:\n", " summary = response_or_summary if isinstance(response_or_summary, builtins.dict) and \"output\" not in response_or_summary else summarize_response(response_or_summary)\n", " preferred = [\n", " \"id\", \"model\", \"status\", \"output_item_types\", \"input_tokens\", \"cached_input_tokens\",\n", " \"output_tokens\", \"total_tokens\", \"reasoning_output_tokens\", \"service_tier\",\n", " ]\n", " rows = [{\"field\": key, \"value\": compact_text(summary.get(key), 220)} for key in preferred if key in summary]\n", " print_label(\"Response summary\")\n", " display_wrapped_table(pd.DataFrame(rows), max_col_width_px=420)\n", "\n", "\n", "def print_key_takeaway(text: str) -> None:\n", " display(HTML(f\"
Key takeaway: {html.escape(text)}
\"))\n", "\n", "\n", "def redact_payload(payload: dict[str, Any]) -> dict[str, Any]:\n", " def redact(value: Any) -> Any:\n", " if isinstance(value, builtins.dict):\n", " return {\n", " key: (\"\" if key == \"file_data\" else redact(item))\n", " for key, item in value.items()\n", " }\n", " if isinstance(value, builtins.list):\n", " return [redact(item) for item in value]\n", " return value\n", "\n", " return json.loads(json.dumps(redact(payload), default=builtins.str))\n", "\n", "\n", "def compact_detail(detail: Any) -> str:\n", " if isinstance(detail, (builtins.dict, builtins.list)):\n", " return compact_text(json.dumps(detail, default=builtins.str), 500)\n", " return compact_text(detail, 500)\n", "\n", "\n", "def record_check(name: str, status: str, detail: Any = \"\") -> None:\n", " RESULTS_SUMMARY.append({\"name\": name, \"status\": status, \"detail\": compact_detail(detail)})\n", "\n", "\n", "def record_response(example: str, response_type: str, content: Any, limit: int = 900) -> None:\n", " if isinstance(content, pd.DataFrame):\n", " rendered = content.to_json(orient=\"records\", indent=2)\n", " elif isinstance(content, (builtins.dict, builtins.list)):\n", " rendered = json.dumps(content, indent=2, default=builtins.str)\n", " else:\n", " rendered = builtins.str(content)\n", " rendered = rendered.strip()\n", " if len(rendered) > limit:\n", " rendered = rendered[:limit].rstrip() + chr(10) + \"...\"\n", " EXAMPLE_RESPONSES.append({\n", " \"example\": example,\n", " \"response_type\": response_type,\n", " \"response\": rendered,\n", " })\n", "\n", "\n", "def print_response_gallery() -> pd.DataFrame:\n", " gallery = pd.DataFrame(EXAMPLE_RESPONSES)\n", " if gallery.empty:\n", " gallery = pd.DataFrame(columns=[\"example\", \"response_type\", \"response\"])\n", " display_wrapped_table(gallery, max_col_width_px=620)\n", " return gallery\n", "\n", "\n", "def normalize_error(response: requests.Response, body: Any) -> dict[str, Any]:\n", " return {\n", " 
\"exception_class\": \"HTTPError\",\n", " \"status_code\": response.status_code,\n", " \"retryable\": response.status_code in TRANSIENT_STATUS_CODES,\n", " \"request_id\": response.headers.get(\"x-request-id\"),\n", " \"body\": body,\n", " }\n", "\n", "\n", "def describe_api_error(exc: Exception) -> dict[str, Any]:\n", " try:\n", " parsed = json.loads(builtins.str(exc))\n", " if isinstance(parsed, builtins.dict) and \"status_code\" in parsed:\n", " return {\n", " \"exception_class\": type(exc).__name__,\n", " \"status_code\": parsed.get(\"status_code\"),\n", " \"retryable\": parsed.get(\"retryable\"),\n", " \"request_id\": parsed.get(\"request_id\"),\n", " \"message\": compact_text(parsed.get(\"body\", parsed), 500),\n", " }\n", " except Exception:\n", " pass\n", "\n", " status_code = getattr(exc, \"status_code\", None)\n", " response = getattr(exc, \"response\", None)\n", " request_id = None\n", " if response is not None:\n", " headers = getattr(response, \"headers\", {})\n", " request_id = headers.get(\"x-request-id\") if hasattr(headers, \"get\") else None\n", " return {\n", " \"exception_class\": type(exc).__name__,\n", " \"status_code\": status_code,\n", " \"retryable\": status_code in TRANSIENT_STATUS_CODES,\n", " \"request_id\": request_id,\n", " \"message\": builtins.str(exc)[:500],\n", " }\n", "\n", "\n", "def request_json(method: str, path: str, *, payload: dict[str, Any] | None = None) -> dict[str, Any]:\n", " response = http.request(\n", " method,\n", " endpoint(path),\n", " json=payload,\n", " timeout=API_TIMEOUT_SECONDS,\n", " )\n", " try:\n", " body = response.json() if response.text else {}\n", " except json.JSONDecodeError:\n", " body = {\"raw_text\": response.text}\n", " if response.status_code >= 400:\n", " raise RuntimeError(json.dumps(normalize_error(response, body), indent=2, default=builtins.str))\n", " return body\n", "\n", "\n", "def to_dict(value: Any) -> Any:\n", " if hasattr(value, \"model_dump\"):\n", " return 
value.model_dump(mode=\"json\")\n", " if isinstance(value, builtins.list):\n", " return [to_dict(item) for item in value]\n", " if isinstance(value, builtins.dict):\n", " return {key: to_dict(item) for key, item in value.items()}\n", " return value\n", "\n", "\n", "def output_text(response: Any) -> str:\n", " direct = getattr(response, \"output_text\", None)\n", " if direct:\n", " return direct\n", " data = to_dict(response)\n", " pieces: list[str] = []\n", " for item in data.get(\"output\", []) or []:\n", " for content in item.get(\"content\", []) or []:\n", " if content.get(\"type\") == \"output_text\":\n", " pieces.append(content.get(\"text\", \"\"))\n", " return \"\".join(pieces)\n", "\n", "\n", "def response_items(response: Any) -> list[dict[str, Any]]:\n", " data = to_dict(response)\n", " return builtins.list(data.get(\"output\", []) or [])\n", "\n", "\n", "def first_output_item(response: Any, item_type: str) -> dict[str, Any] | None:\n", " for item in response_items(response):\n", " if item.get(\"type\") == item_type:\n", " return item\n", " return None\n", "\n", "\n", "def summarize_response(response: Any) -> dict[str, Any]:\n", " data = to_dict(response)\n", " usage = data.get(\"usage\") or {}\n", " input_details = usage.get(\"input_tokens_details\") or {}\n", " output_details = usage.get(\"output_tokens_details\") or {}\n", " return {\n", " \"id\": data.get(\"id\"),\n", " \"model\": data.get(\"model\"),\n", " \"status\": data.get(\"status\"),\n", " \"output_item_types\": [item.get(\"type\") for item in data.get(\"output\", []) or []],\n", " \"input_tokens\": usage.get(\"input_tokens\"),\n", " \"output_tokens\": usage.get(\"output_tokens\"),\n", " \"total_tokens\": usage.get(\"total_tokens\"),\n", " \"cached_input_tokens\": input_details.get(\"cached_tokens\"),\n", " \"reasoning_output_tokens\": output_details.get(\"reasoning_tokens\"),\n", " \"service_tier\": data.get(\"service_tier\"),\n", " }\n", "\n", "\n", "def call_with_retries(label: str, func: 
Callable[..., Any], *args: Any, **kwargs: Any) -> Any:\n", " kwargs.setdefault(\"timeout\", API_TIMEOUT_SECONDS)\n", " last_exc: Exception | None = None\n", " for attempt in range(1, MAX_RETRIES + 2):\n", " try:\n", " return func(*args, **kwargs)\n", " except Exception as exc:\n", " last_exc = exc\n", " error = describe_api_error(exc)\n", " should_retry = bool(error[\"retryable\"] and attempt <= MAX_RETRIES)\n", " if not should_retry:\n", " raise\n", " time.sleep(min(2 ** (attempt - 1), 8))\n", " raise RuntimeError(f\"{label} failed after retries\") from last_exc\n", "\n", "\n", "def create_response(**kwargs: Any) -> Any:\n", " kwargs.setdefault(\"model\", MODEL_ID)\n", " return call_with_retries(\"responses.create\", client.responses.create, **kwargs)\n", "\n", "\n", "def retrieve_response(response_id: str) -> Any:\n", " return call_with_retries(\"responses.retrieve\", client.responses.retrieve, response_id)\n", "\n", "\n", "def delete_response(response_id: str) -> Any:\n", " return call_with_retries(\"responses.delete\", client.responses.delete, response_id)\n", "\n", "\n", "def remember_stored_response(response: Any) -> None:\n", " response_id = getattr(response, \"id\", None) or to_dict(response).get(\"id\")\n", " if response_id:\n", " STORED_RESPONSE_IDS.append(response_id)\n", "\n", "\n", "def handle_example_error(features: str | list[str], exc: Exception) -> None:\n", " feature_list = [features] if isinstance(features, builtins.str) else features\n", " error = describe_api_error(exc)\n", " for feature in feature_list:\n", " record_check(feature, \"warn\", error)\n", " print_labeled_text(\"Result\", \"This live call did not complete in this environment.\")\n", " print_labeled_json(\"Response summary\", error)\n", "\n", "\n", "def build_curl_command(payload: dict[str, Any]) -> str:\n", " body = json.dumps(payload)\n", " return \" \".join([\n", " \"curl\", \"-sS\", shlex.quote(RESPONSES_URL),\n", " \"-H\", shlex.quote(\"Content-Type: application/json\"),\n", " 
\"-H\", shlex.quote(\"Authorization: Bearer $AWS_BEARER_TOKEN_BEDROCK\"),\n", " \"-d\", shlex.quote(body),\n", " ])\n", "\n", "\n", "def run_raw_http_request(payload: dict[str, Any]) -> dict[str, Any]:\n", " return request_json(\"POST\", \"/responses\", payload=payload)\n", "\n", "print(\"Helpers ready.\")\n" ] }, { "cell_type": "markdown", "id": "5361f141", "metadata": {}, "source": [ "### 1.6 Verify the Endpoint\n", "\n", "The first live call is intentionally tiny. It sends a minimal Responses request with `store=false` and a short text instruction so you can catch setup issues before running richer examples. Inspect the request shape, returned text, status, model, output item types, and token usage.\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": 6, "id": "38425696", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
Request shape
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fieldvalue
modelopenai.gpt-5.4
max_output_tokens1024
storeFalse
inputReply with exactly: ok
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result
\n", "
ok
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Response summary
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fieldvalue
idresp_gnt2qiavimim2lvfrtosh472mmsodtphk4glbiiefj7joxb4k4ra
modelopenai.gpt-5.4
statuscompleted
output_item_types['message']
input_tokens162
cached_input_tokens0
output_tokens5
total_tokens167
reasoning_output_tokens0
service_tierdefault
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Key takeaway: A tiny response confirms that the endpoint, key, model, and request shape are working.
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from __future__ import annotations\n", "preflight_payload = {\n", " \"model\": MODEL_ID,\n", " \"input\": \"Reply with exactly: ok\",\n", " \"max_output_tokens\": 1024,\n", " \"store\": False,\n", "}\n", "\n", "print_request_shape(preflight_payload)\n", "try:\n", " preflight_response = create_response(**preflight_payload)\n", " require(output_text(preflight_response).strip(), \"Preflight response did not return output text.\")\n", " record_check(\"Endpoint shape\", \"pass\", RESPONSES_URL)\n", " model_selection_detail = f\"{len(AVAILABLE_MODELS)} models discovered\" if AVAILABLE_MODELS else \"Using configured model; model-list metadata is not required for requests.\"\n", " record_check(\"Model selection\", \"pass\", model_selection_detail)\n", " preflight_text = output_text(preflight_response).strip()\n", " record_response(\"Endpoint verification\", \"text\", preflight_text)\n", " print_labeled_text(\"Result\", preflight_text)\n", " print_response_summary(preflight_response)\n", " print_key_takeaway('A tiny response confirms that the endpoint, key, model, and request shape are working.')\n", "except Exception as exc:\n", " handle_example_error([\"Endpoint shape\", \"Model selection\"], exc)\n" ] }, { "cell_type": "markdown", "id": "f3c079d9", "metadata": {}, "source": [ "### 1.7 Normalize API Errors\n", "\n", "Production integrations need consistent error logging for status codes, retry decisions, request IDs, and response bodies. This cell documents the normalized error shape used by the notebook without intentionally making a failing request. Later cells use the same shape when a live call fails or returns a non-2xx status.\n" ] }, { "cell_type": "code", "execution_count": 7, "id": "4e027820", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
\n", "
JSON
\n", "
{\n", " "normalized_fields": [\n", " "exception_class",\n", " "status_code",\n", " "retryable",\n", " "request_id",\n", " "message"\n", " ],\n", " "retryable_status_codes": [\n", " 408,\n", " 409,\n", " 429,\n", " 500,\n", " 502,\n", " 503,\n", " 504\n", " ],\n", " "notes": "call_with_retries(...) uses this taxonomy for transient retry handling."\n", "}
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from __future__ import annotations\n", "error_taxonomy_example = {\n", " \"normalized_fields\": [\"exception_class\", \"status_code\", \"retryable\", \"request_id\", \"message\"],\n", " \"retryable_status_codes\": sorted(TRANSIENT_STATUS_CODES),\n", " \"notes\": \"call_with_retries(...) uses this taxonomy for transient retry handling.\",\n", "}\n", "record_check(\"Error handling\", \"pass\", error_taxonomy_example)\n", "print_json(error_taxonomy_example)" ] }, { "cell_type": "markdown", "id": "9dc176b7", "metadata": {}, "source": [ "## 2. Make Your First Responses Requests\n", "\n", "This section shows the Responses request surface from two angles. First, you inspect and run a raw HTTPS request so the endpoint, headers, and JSON body are visible. Then you use the OpenAI SDK for the same kind of application workflow, which is the path most production code should prefer once configuration is correct.\n" ] }, { "cell_type": "markdown", "id": "87b26703", "metadata": {}, "source": [ "### 2.1 Inspect the Raw HTTPS Request Shape\n", "\n", "Build a minimal Responses payload for a BrightCart support-assistant reply and render a copy-pasteable `curl` command. The command references `$AWS_BEARER_TOKEN_BEDROCK` instead of embedding a token, and the notebook does not execute shell commands that put bearer tokens in process arguments. Inspect the `model`, `input`, `max_output_tokens`, and `store` fields.\n" ] }, { "cell_type": "code", "execution_count": 9, "id": "7c2aa5e3", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
Request shape
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fieldvalue
modelopenai.gpt-5.4
max_output_tokens1024
storeFalse
inputBrightCart customer Maya asks why replacement order ORDER-8831 is delayed. Write two labeled plain-text lines for the support agent. Do not use leading hyphens or bold text.
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result
\n", "
curl -sS https://bedrock-mantle.us-west-2.api.aws/openai/v1/responses -H 'Content-Type: application/json' -H 'Authorization: Bearer $AWS_BEARER_TOKEN_BEDROCK' -d '{"model": "openai.gpt-5.4", "input": "BrightCart customer Maya asks why replacement order ORDER-8831 is delayed. Write two labeled plain-text lines for the support agent. Do not use leading hyphens or bold text.", "max_output_tokens": 1024, "store": false}'
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Key takeaway: The curl command shows the raw HTTPS shape behind the SDK call.
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from __future__ import annotations\n", "basic_curl_payload = {\n", " \"model\": MODEL_ID,\n", " \"input\": \"BrightCart customer Maya asks why replacement order ORDER-8831 is delayed. Write two labeled plain-text lines for the support agent. Do not use leading hyphens or bold text.\",\n", " \"max_output_tokens\": 1024,\n", " \"store\": False,\n", "}\n", "\n", "print_request_shape(basic_curl_payload)\n", "print_labeled_text(\"Result\", build_curl_command(basic_curl_payload))\n", "print_key_takeaway('The curl command shows the raw HTTPS shape behind the SDK call.')\n" ] }, { "cell_type": "markdown", "id": "3ed94d52", "metadata": {}, "source": [ "### 2.2 Send the Raw HTTPS Request\n", "\n", "Send the same request through the raw HTTPS helper. This cell demonstrates the wire-level `POST /responses` path and extracts text from the response body by walking output items. Inspect the returned response ID, model, status, and text output to understand the schema your application receives.\n" ] }, { "cell_type": "code", "execution_count": 10, "id": "958f17c8", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
Request shape
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fieldvalue
modelopenai.gpt-5.4
max_output_tokens1024
storeFalse
inputBrightCart customer Maya asks why replacement order ORDER-8831 is delayed. Write two labeled plain-text lines for the support agent. Do not use leading hyphens or bold text.
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result
\n", "
Empathy: I’m sorry, Maya — your replacement order ORDER-8831 is delayed because the carrier reported a temporary transit hold at the regional sorting facility.\n", "Action: We’re monitoring the shipment closely and will send you an updated delivery estimate within 24 hours; if there’s no movement by then, we’ll review the next replacement or refund options with you.
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Response summary
\n", "
{\n", " "id": "resp_naythl6fvzhoctlsdogd4vpr673q5ibagqqpiujbast3sy6viroa",\n", " "model": "openai.gpt-5.4",\n", " "status": "completed"\n", "}
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Key takeaway: The response body contains message output that application code can extract as text.
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from __future__ import annotations\n", "\n", "print_request_shape(basic_curl_payload)\n", "try:\n", "    basic_http_response = run_raw_http_request(basic_curl_payload)\n", "    record_check(\"Text generation\", \"pass\", basic_http_response.get(\"id\"))\n", "    response_text_parts = []\n", "    for item in basic_http_response.get(\"output\") or []:\n", "        # Non-message items (for example reasoning items) may have no content list, so fall back to an empty one.\n", "        for content in item.get(\"content\") or []:\n", "            if content.get(\"type\") == \"output_text\":\n", "                response_text_parts.append(content.get(\"text\", \"\"))\n", "    raw_http_output = \"\".join(response_text_parts).strip()\n", "    record_response(\"First raw HTTPS request\", \"text\", raw_http_output)\n", "    print_labeled_text(\"Result\", raw_http_output)\n", "    print_labeled_json(\"Response summary\", {\n", "        \"id\": basic_http_response.get(\"id\"),\n", "        \"model\": basic_http_response.get(\"model\"),\n", "        \"status\": basic_http_response.get(\"status\"),\n", "    })\n", "    print_key_takeaway(\"The response body contains message output that application code can extract as text.\")\n", "except Exception as exc:\n", "    handle_example_error(\"Text generation\", exc)\n" ] }, { "cell_type": "markdown", "id": "95e7940e", "metadata": {}, "source": [ "### 2.3 Use the OpenAI SDK\n", "\n", "The OpenAI SDK can call OpenAI-compatible APIs when you pass the Bedrock bearer token and base URL explicitly. This cell sends a text-generation request through `client.responses.create`, sets `reasoning.effort` to `low`, and prints a compact response summary. Inspect the output text, token counts, output item types, and any reasoning-token metadata returned by the endpoint.\n", "\n", "Official docs: [Reasoning models](https://developers.openai.com/api/docs/guides/reasoning) describes using reasoning effort with the Responses API.\n" ] }, { "cell_type": "code", "execution_count": 11, "id": "817e8717", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
Request shape
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fieldvalue
modelopenai.gpt-5.4
max_output_tokens1024
storeFalse
reasoning{'effort': 'low'}
inputWrite a three-sentence overview for a developer building a BrightCart support assistant with the Responses API.
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result
\n", "
Use the Responses API to build a BrightCart support assistant that can answer customer questions, summarize policies, and guide users through common workflows like order tracking, refunds, and account updates. Ground the assistant in BrightCart documentation and connect it to relevant backend tools or APIs so it can retrieve live order data, check account status, and provide accurate, context-aware support responses. Design the experience around clear system instructions, structured tool calling, and conversation state management so the assistant stays on-brand, reliable, and safe when handling customer issues.
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Response summary
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fieldvalue
idresp_nmvqefzghd5hi67uwy4wfvhwnnzild3lslqsxqor3cat63kmucoq
modelopenai.gpt-5.4
statuscompleted
output_item_types['reasoning', 'message']
input_tokens177
cached_input_tokens0
output_tokens129
total_tokens306
reasoning_output_tokens18
service_tierdefault
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Key takeaway: The SDK returns a response object with text, status, token usage, and output item metadata.
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from __future__ import annotations\n", "sdk_text_payload = {\n", " \"model\": MODEL_ID,\n", " \"input\": \"Write a three-sentence overview for a developer building a BrightCart support assistant with the Responses API.\",\n", " \"reasoning\": {\"effort\": \"low\"},\n", " \"max_output_tokens\": 1024,\n", " \"store\": False,\n", "}\n", "\n", "print_request_shape(sdk_text_payload)\n", "try:\n", " text_response = create_response(**sdk_text_payload)\n", " sdk_text = output_text(text_response).strip()\n", " require(sdk_text, \"SDK text response did not return output text.\")\n", " record_check(\"Text generation\", \"pass\", summarize_response(text_response))\n", " record_check(\"Reasoning effort\", \"pass\", summarize_response(text_response))\n", " record_response(\"SDK text generation\", \"text\", sdk_text)\n", " print_labeled_text(\"Result\", sdk_text)\n", " print_response_summary(text_response)\n", " print_key_takeaway('The SDK returns a response object with text, status, token usage, and output item metadata.')\n", "except Exception as exc:\n", " handle_example_error([\"Text generation\", \"Reasoning effort\"], exc)\n" ] }, { "cell_type": "markdown", "id": "e781a295", "metadata": {}, "source": [ "### 2.4 Create and Retrieve a Response\n", "\n", "The Responses API can store a response and retrieve it later by ID. This pattern is useful for audit trails, debugging, and follow-up turns that reference prior context. This cell creates a stored response, tracks the ID for cleanup, retrieves it, and compares the retrieved text and usage metadata.\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": 12, "id": "dc323795", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
Request shape
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fieldvalue
modelopenai.gpt-5.4
max_output_tokens1024
storeTrue
inputBrightCart is building a support assistant for delayed replacement orders. Return exactly three labeled plain-text lines: goal, data needed, and human-review rule. Do not use leading hyphens or bold text.
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result
\n", "
goal: Help support agents explain delayed replacement orders, set expectations, and suggest next steps.\n", "data needed: Order ID, replacement order status, shipment/tracking events, delay reason, estimated ship/delivery date, customer contact history, inventory/backorder status, and applicable refund or reship policy.\n", "human-review rule: Escalate to a human if the delay exceeds policy thresholds, tracking is inconsistent or missing, the order appears lost, the customer is high-risk or highly upset, or any refund/reship exception is requested.
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Created response summary
\n", "
{\n", " "id": "resp_cvhvh7y5ghwrpa35snvk4bzgcgthgxp4tgwkllmf5mrhs7dikfia",\n", " "model": "openai.gpt-5.4",\n", " "status": "completed",\n", " "output_item_types": [\n", " "message"\n", " ],\n", " "input_tokens": 198,\n", " "output_tokens": 109,\n", " "total_tokens": 307,\n", " "cached_input_tokens": 0,\n", " "reasoning_output_tokens": 0,\n", " "service_tier": "default"\n", "}
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Response summary
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fieldvalue
idresp_cvhvh7y5ghwrpa35snvk4bzgcgthgxp4tgwkllmf5mrhs7dikfia
modelopenai.gpt-5.4
statuscompleted
output_item_types['message']
input_tokens198
cached_input_tokens0
output_tokens109
total_tokens307
reasoning_output_tokens0
service_tierdefault
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Key takeaway: store=True lets an application retrieve the response later by ID with usage metadata intact.
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from __future__ import annotations\n", "lifecycle_payload = {\n", " \"model\": MODEL_ID,\n", " \"input\": (\n", " \"BrightCart is building a support assistant for delayed replacement orders. \"\n", " \"Return exactly three labeled plain-text lines: goal, data needed, and human-review rule. Do not use leading hyphens or bold text.\"\n", " ),\n", " \"max_output_tokens\": 1024,\n", " \"store\": True,\n", "}\n", "\n", "print_request_shape(lifecycle_payload)\n", "try:\n", " lifecycle_response = create_response(**lifecycle_payload)\n", " remember_stored_response(lifecycle_response)\n", " retrieved_response = retrieve_response(lifecycle_response.id)\n", " retrieved_summary = summarize_response(retrieved_response)\n", " retrieved_text = output_text(retrieved_response).strip()\n", " require(retrieved_text, \"Retrieved response did not contain text output.\")\n", "\n", " lifecycle_status = \"pass\" if retrieved_summary.get(\"status\") in {None, \"completed\"} else \"warn\"\n", " record_check(\"Responses lifecycle\", lifecycle_status, retrieved_response.id)\n", " record_check(\"Response schema\", \"pass\", retrieved_summary)\n", " record_check(\"Usage metadata\", \"pass\" if retrieved_summary.get(\"total_tokens\") is not None else \"warn\", retrieved_summary)\n", " record_response(\"Create and retrieve response\", \"text\", retrieved_text)\n", "\n", " print_labeled_text(\"Result\", retrieved_text)\n", " print_labeled_json(\"Created response summary\", summarize_response(lifecycle_response))\n", " print_response_summary(retrieved_summary)\n", " print_key_takeaway('store=True lets an application retrieve the response later by ID with usage metadata intact.')\n", "except Exception as exc:\n", " handle_example_error([\"Responses lifecycle\", \"Response schema\", \"Usage metadata\"], exc)\n" ] }, { "cell_type": "markdown", "id": "609cd0ad", "metadata": {}, "source": [ "### 2.5 Add 
Reasoning Effort, Service Tier, and Prompt Cache Parameters\n", "\n", "Model controls travel alongside the normal input. This request combines `reasoning.effort`, `service_tier`, `prompt_cache_key`, and `prompt_cache_retention` so you can see how operational controls and prompt-cache metadata appear in the same response schema as ordinary text output. Inspect `service_tier`, `cached_input_tokens`, reasoning token metadata, and total token usage. The response reports the tier that served the request, so a request for `auto` can come back as `default`, as it does in the sample output.\n", "\n", "Note: This notebook reads the retention value from `PROMPT_CACHE_RETENTION` instead of hard-coding `prompt_cache_retention`. The value is `in_memory` for `openai.gpt-5.4` and `24h` for `openai.gpt-5.5` and later models, because those models require extended prompt caching.\n" ] }, { "cell_type": "code", "execution_count": 13, "id": "1e6f45d6", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
Request shape
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fieldvalue
modelopenai.gpt-5.4
max_output_tokens1024
storeFalse
service_tierauto
prompt_cache_keybrightcart-support-policy-guide
prompt_cache_retentionin_memory
reasoning{'effort': 'low'}
inputFor the BrightCart support assistant, explain prompt caching in exactly two labeled plain-text lines: one latency benefit and one consistency benefit.
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result
\n", "
Latency benefit: Prompt caching lets the BrightCart support assistant reuse previously processed context, reducing response time for repeated or similar requests.\n", "Consistency benefit: Prompt caching helps the BrightCart support assistant return more uniform answers by reusing the same established prompt context across interactions.
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Response summary
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fieldvalue
idresp_q4akwbeynfwfwnt5i4tdwkpcgsffdu4lqvng7lnwaob53opswvwq
modelopenai.gpt-5.4
statuscompleted
output_item_types['reasoning', 'message']
input_tokens183
cached_input_tokens0
output_tokens91
total_tokens274
reasoning_output_tokens34
service_tierdefault
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Key takeaway: Model controls travel with the same request as normal input, while returned metadata can vary by endpoint.
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from __future__ import annotations\n", "control_payload = {\n", " \"model\": MODEL_ID,\n", " \"input\": (\n", " \"For the BrightCart support assistant, explain prompt caching in exactly two labeled plain-text lines: \"\n", " \"one latency benefit and one consistency benefit.\"\n", " ),\n", " \"reasoning\": {\"effort\": \"low\"},\n", " \"prompt_cache_key\": \"brightcart-support-policy-guide\",\n", " \"prompt_cache_retention\": PROMPT_CACHE_RETENTION,\n", " \"service_tier\": \"auto\",\n", " \"max_output_tokens\": 1024,\n", " \"store\": False,\n", "}\n", "\n", "print_request_shape(control_payload)\n", "try:\n", " control_response = create_response(**control_payload)\n", " control_summary = summarize_response(control_response)\n", " control_text = output_text(control_response).strip()\n", " require(control_text, \"Control response did not return text.\")\n", " status = \"pass\" if control_summary.get(\"status\") in {None, \"completed\"} else \"warn\"\n", " record_check(\"Prompt caching\", \"pass\" if control_summary.get(\"cached_input_tokens\") is not None else \"warn\", control_summary)\n", " record_check(\"Service tier\", \"pass\" if control_summary.get(\"service_tier\") is not None else \"warn\", control_summary)\n", " record_check(\"Reasoning effort\", status, control_summary)\n", " record_response(\"Service tier and prompt cache request\", \"text\", control_text)\n", " print_labeled_text(\"Result\", control_text)\n", " print_response_summary(control_summary)\n", " print_key_takeaway('Model controls travel with the same request as normal input, while returned metadata can vary by endpoint.')\n", "except Exception as exc:\n", " handle_example_error([\"Prompt caching\", \"Service tier\", \"Reasoning effort\"], exc)\n" ] }, { "cell_type": "markdown", "id": "6eb8aaa7", "metadata": {}, "source": [ "## 3. 
Generate Structured JSON\n", "\n", "Structured JSON turns model output into data that application code can parse, validate, and route. This section compares strict schema-constrained output with lighter JSON mode. Use Structured Outputs when your application needs a contract; use JSON mode when valid JSON is enough but the exact schema can remain flexible.\n", "\n", "\n" ] }, { "cell_type": "markdown", "id": "5592a8ac", "metadata": {}, "source": [ "### 3.1 Define the Structured Output Schema\n", "\n", "Define the support-ticket schema used by the next live request. The schema lists the exact fields the application expects, including category, priority, sentiment, summary, required actions, and escalation status. Inspect the request shape to see how `text.format.type=\"json_schema\"`, `strict=true`, and the JSON Schema are attached to a normal Responses request.\n" ] }, { "cell_type": "code", "execution_count": 14, "id": "9daa5861", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
Request shape
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fieldvalue
modelopenai.gpt-5.4
max_output_tokens1024
storeFalse
text formatjson_schema: support_ticket_triage strict=True required=7 fields
inputSupport ticket TICKET-7429: Maya Chen says ORDER-8831 is a replacement for a damaged standing desk. The replacement is two days late, the carrier scan has not moved, and she needs the desk before Monday. She asks for a supervisor callback and refund options. T...
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Key takeaway: The schema is part of the request and defines the fields the next cell validates.
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from __future__ import annotations\n", "support_triage_schema = {\n", " \"type\": \"object\",\n", " \"properties\": {\n", " \"ticket_id\": {\"type\": \"string\"},\n", " \"category\": {\"type\": \"string\", \"enum\": [\"delivery_delay\", \"return_exchange\", \"damaged_item\", \"billing\", \"account\"]},\n", " \"priority\": {\"type\": \"string\", \"enum\": [\"low\", \"medium\", \"high\", \"urgent\"]},\n", " \"customer_sentiment\": {\"type\": \"string\"},\n", " \"summary\": {\"type\": \"string\"},\n", " \"required_actions\": {\"type\": \"array\", \"items\": {\"type\": \"string\"}, \"minItems\": 2},\n", " \"escalation_needed\": {\"type\": \"boolean\"},\n", " },\n", " \"required\": [\"ticket_id\", \"category\", \"priority\", \"customer_sentiment\", \"summary\", \"required_actions\", \"escalation_needed\"],\n", " \"additionalProperties\": False,\n", "}\n", "\n", "structured_payload = {\n", " \"model\": MODEL_ID,\n", " \"input\": (\n", " \"Support ticket TICKET-7429: Maya Chen says ORDER-8831 is a replacement for a damaged standing desk. \"\n", " \"The replacement is two days late, the carrier scan has not moved, and she needs the desk before Monday. \"\n", " \"She asks for a supervisor callback and refund options. 
Triage this ticket for the next support agent.\"\n", " ),\n", " \"text\": {\"format\": {\"type\": \"json_schema\", \"name\": \"support_ticket_triage\", \"strict\": True, \"schema\": support_triage_schema}},\n", " \"max_output_tokens\": 1024,\n", " \"store\": False,\n", "}\n", "\n", "print_request_shape(structured_payload)\n", "print_key_takeaway('The schema is part of the request and defines the fields the next cell validates.')\n" ] }, { "cell_type": "markdown", "id": "6a1dea0f", "metadata": {}, "source": [ "### 3.2 Validate Schema-Constrained Output\n", "\n", "Call the model with the schema from the previous cell, parse the returned text as JSON, and validate important fields in Python. The API request asks for schema adherence, while application-side validation still checks that the returned object is suitable for downstream routing. Inspect the parsed object and the response summary.\n" ] }, { "cell_type": "code", "execution_count": 15, "id": "8b813739", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
Request shape
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fieldvalue
modelopenai.gpt-5.4
max_output_tokens1024
storeFalse
text formatjson_schema: support_ticket_triage strict=True required=7 fields
inputSupport ticket TICKET-7429: Maya Chen says ORDER-8831 is a replacement for a damaged standing desk. The replacement is two days late, the carrier scan has not moved, and she needs the desk before Monday. She asks for a supervisor callback and refund options. T...
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result
\n", "
{\n", " "ticket_id": "TICKET-7429",\n", " "category": "delivery_delay",\n", " "priority": "urgent",\n", " "customer_sentiment": "frustrated and time-sensitive",\n", " "summary": "Customer Maya Chen reports that ORDER-8831 is a replacement shipment for a previously damaged standing desk. The replacement is now 2 days late, carrier tracking has not updated, and she needs the desk delivered before Monday. She is requesting a supervisor callback and wants to know refund options if the replacement cannot arrive in time.",\n", " "required_actions": [\n", " "Review ORDER-8831 shipment status and confirm last carrier scan/update.",\n", " "Contact carrier or open a trace/escalation for stalled tracking.",\n", " "Check expedited reshipment or alternative fulfillment options to meet the before-Monday deadline.",\n", " "Arrange supervisor callback per customer request.",\n", " "Review and communicate refund options, including refund for replacement order and any prior damaged-item resolution details.",\n", " "Verify whether replacement shipment should be intercepted/returned if a refund or reshipment is approved."\n", " ],\n", " "escalation_needed": true\n", "}
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Response summary
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fieldvalue
idresp_rk7y6aobdqx2m2fjpnllihuoyku2n55gj5o2h5jxl7lru77mwwiq
modelopenai.gpt-5.4
statuscompleted
output_item_types['message']
input_tokens328
cached_input_tokens0
output_tokens200
total_tokens528
reasoning_output_tokens0
service_tierdefault
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Key takeaway: Schema-constrained output gives application code a predictable JSON object to parse and validate.
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from __future__ import annotations\n", "\n", "\n", "def validate_support_triage(payload: dict[str, Any]) -> dict[str, Any]:\n", " require(\"ticket_id\" in payload, \"Missing key: ticket_id\")\n", " require(payload.get(\"ticket_id\") == \"TICKET-7429\", \"Ticket ID did not match expected value.\")\n", " require(\"required_actions\" in payload, \"Missing key: required_actions\")\n", " require(isinstance(payload.get(\"required_actions\"), builtins.list), \"required_actions must be a list.\")\n", " require(len(payload[\"required_actions\"]) >= 2, \"required_actions should contain at least two actions.\")\n", " return payload\n", "\n", "print_request_shape(structured_payload)\n", "try:\n", " structured_response = create_response(**structured_payload)\n", " raw_structured_text = output_text(structured_response).strip()\n", " try:\n", " structured_payload_result = validate_support_triage(json.loads(raw_structured_text))\n", " record_check(\"Structured Outputs\", \"pass\", structured_payload_result)\n", " record_response(\"Structured ticket triage\", \"json\", structured_payload_result)\n", " print_labeled_json(\"Result\", structured_payload_result)\n", " except json.JSONDecodeError as e:\n", " raise ValueError(f\"Invalid JSON: {e}\")\n", " except Exception as parse_exc:\n", " record_check(\"Structured Outputs\", \"warn\", {\"message\": \"Response did not match the expected schema shape.\", \"text_sample\": raw_structured_text[:600], \"error\": builtins.str(parse_exc)})\n", " print_labeled_text(\"Result\", \"The request completed, but the returned text did not match the expected schema shape.\")\n", " print_wrapped(raw_structured_text[:1200])\n", " print_response_summary(structured_response)\n", " print_key_takeaway('Schema-constrained output gives application code a predictable JSON object to parse and validate.')\n", "except Exception as exc:\n", " 
handle_example_error(\"Structured Outputs\", exc)" ] }, { "cell_type": "markdown", "id": "271d6dec", "metadata": {}, "source": [ "### 3.3 Use JSON Mode\n", "\n", "JSON mode asks the model to return a valid JSON object without enforcing a strict schema. This is useful for lightweight handoffs where you still want parsable output but can tolerate a looser contract. This cell requests a support handoff object, parses it, and checks for the expected keys.\n" ] }, { "cell_type": "code", "execution_count": 16, "id": "635266ce", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
Request shape
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fieldvalue
modelopenai.gpt-5.4
max_output_tokens1024
storeFalse
text formatjson_object
inputReturn JSON for a support chat handoff with keys customer_name, order_id, issue_summary, next_step, and metrics_to_watch. Context: Maya Chen asks about delayed replacement order ORDER-8831; the carrier scan is stale. metrics_to_watch should be an array.
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result
\n", "
{\n", " "customer_name": "Maya Chen",\n", " "order_id": "ORDER-8831",\n", " "issue_summary": "Customer is asking about a delayed replacement order. The carrier tracking scan is stale and has not updated.",\n", " "next_step": "Handoff to support to investigate the carrier delay, verify shipment status, and provide Maya Chen with an update or resolution.",\n", " "metrics_to_watch": [\n", " "tracking_scan_recency",\n", " "carrier_exception_status",\n", " "replacement_order_delivery_eta",\n", " "customer_follow_up_time"\n", " ]\n", "}
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Response summary
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fieldvalue
idresp_2wloqig6bh6sz2ysyiozx7oon7hogdzavqnv4pn7wcmnfkxnmlra
modelopenai.gpt-5.4
statuscompleted
output_item_types['message']
input_tokens212
cached_input_tokens0
output_tokens122
total_tokens334
reasoning_output_tokens0
service_tierdefault
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Key takeaway: JSON mode is useful when valid JSON is enough and a strict schema is not required.
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from __future__ import annotations\n", "json_mode_payload = {\n", " \"model\": MODEL_ID,\n", " \"input\": (\n", " \"Return JSON for a support chat handoff with keys customer_name, order_id, issue_summary, next_step, \"\n", " \"and metrics_to_watch. Context: Maya Chen asks about delayed replacement order ORDER-8831; the carrier scan is stale. \"\n", " \"metrics_to_watch should be an array.\"\n", " ),\n", " \"text\": {\"format\": {\"type\": \"json_object\"}},\n", " \"max_output_tokens\": 1024,\n", " \"store\": False,\n", "}\n", "\n", "print_request_shape(json_mode_payload)\n", "try:\n", " json_mode_response = create_response(**json_mode_payload)\n", " payload = json.loads(output_text(json_mode_response).strip())\n", " require({\"customer_name\", \"order_id\", \"issue_summary\", \"next_step\", \"metrics_to_watch\"}.issubset(payload), \"JSON mode response missed required keys.\")\n", " record_check(\"JSON mode\", \"pass\", payload)\n", " record_response(\"JSON support handoff\", \"json\", payload)\n", " print_labeled_json(\"Result\", payload)\n", " print_response_summary(json_mode_response)\n", " print_key_takeaway('JSON mode is useful when valid JSON is enough and a strict schema is not required.')\n", "except Exception as exc:\n", " handle_example_error(\"JSON mode\", exc)\n" ] }, { "cell_type": "markdown", "id": "1be28f8c", "metadata": {}, "source": [ "### 3.4 Control Verbosity from Reasoning Effort\n", "\n", "Verbosity controls help tune the shape of generated prose, while reasoning effort controls how much reasoning work the model spends before answering. The notebook demonstrates `reasoning.effort` in the SDK and model-control cells above; this cell focuses on `text.verbosity` by sending compact and detailed versions of the same policy topic. 
Inspect the side-by-side text and token summaries to compare style and usage.\n" ] }, { "cell_type": "code", "execution_count": 17, "id": "5f0850ca", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
\n", "
Request shape
\n", "
{\n", " "compact": {\n", " "model": "openai.gpt-5.4",\n", " "input": "Explain BrightCart's delayed-replacement policy to a new support agent. Reply in one sentence under 35 words.",\n", " "text": {\n", " "verbosity": "low"\n", " },\n", " "max_output_tokens": 1024,\n", " "store": false\n", " },\n", " "detailed": {\n", " "model": "openai.gpt-5.4",\n", " "input": "Explain BrightCart's delayed-replacement policy to a new support agent. Reply in exactly three numbered plain-text lines, each under 18 words. Do not use leading hyphens or bold text.",\n", " "text": {\n", " "verbosity": "high"\n", " },\n", " "max_output_tokens": 1024,\n", " "store": false\n", " }\n", "}
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result: compact guidance
\n", "
BrightCart’s delayed-replacement policy lets customers keep using the original item until the replacement arrives, then return the defective product within the allowed return window.
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result: detailed guidance
\n", "
1. BrightCart sends replacements after customers return the original item and warehouse receipt is confirmed.\n", "2. This delay prevents duplicate shipments, verifies eligibility, and reduces fraud or inventory errors.\n", "3. Agents should explain timelines clearly, offer return instructions, and reassure customers once receipt is logged.
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Response summary
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
requestidmodelstatusoutput_item_typesinput_tokensoutput_tokenstotal_tokenscached_input_tokensreasoning_output_tokensservice_tier
compactresp_m33l24gl3wl55lwqxhnrqkrf34ikc4rex4sazo2pnqpp4malnliqopenai.gpt-5.4completed[message]1803421400default
detailedresp_icqbez74xlmlvw4kl3yf2qtbutfopfefayzftragcwmhu4cohnuaopenai.gpt-5.4completed[message]1976025700default
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Key takeaway: Verbosity controls tune the answer style while the prompt still bounds the output.
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from __future__ import annotations\n", "verbosity_prompt = \"Explain BrightCart's delayed-replacement policy to a new support agent.\"\n", "compact_payload = {\n", " \"model\": MODEL_ID,\n", " \"input\": verbosity_prompt + \" Reply in one sentence under 35 words.\",\n", " \"text\": {\"verbosity\": \"low\"},\n", " \"max_output_tokens\": 1024,\n", " \"store\": False,\n", "}\n", "detailed_payload = {\n", " \"model\": MODEL_ID,\n", " \"input\": verbosity_prompt + \" Reply in exactly three numbered plain-text lines, each under 18 words. Do not use leading hyphens or bold text.\",\n", " \"text\": {\"verbosity\": \"high\"},\n", " \"max_output_tokens\": 1024,\n", " \"store\": False,\n", "}\n", "\n", "print_labeled_json(\"Request shape\", {\n", " \"compact\": redact_payload(compact_payload),\n", " \"detailed\": redact_payload(detailed_payload),\n", "})\n", "try:\n", " compact_response = create_response(**compact_payload)\n", " detailed_response = create_response(**detailed_payload)\n", " compact_guidance_text = output_text(compact_response).strip()\n", " detailed_guidance_text = output_text(detailed_response).strip()\n", " require(compact_guidance_text and detailed_guidance_text, \"Verbosity responses did not return text.\")\n", " compact_summary = summarize_response(compact_response)\n", " detailed_summary = summarize_response(detailed_response)\n", " status = \"pass\" if compact_summary.get(\"status\") in {None, \"completed\"} and detailed_summary.get(\"status\") in {None, \"completed\"} else \"warn\"\n", " record_check(\"Verbosity\", status, {\"compact_chars\": len(compact_guidance_text), \"detailed_chars\": len(detailed_guidance_text)})\n", " record_response(\"Compact policy guidance\", \"text\", compact_guidance_text)\n", " record_response(\"Detailed policy guidance\", \"text\", detailed_guidance_text)\n", " print_labeled_text(\"Result: compact guidance\", 
compact_guidance_text)\n", " print_labeled_text(\"Result: detailed guidance\", detailed_guidance_text)\n", " verbosity_summary = pd.DataFrame([\n", " {\"request\": \"compact\", **compact_summary},\n", " {\"request\": \"detailed\", **detailed_summary},\n", " ])\n", " print_label(\"Response summary\")\n", " display_wrapped_table(verbosity_summary, max_col_width_px=420)\n", " print_key_takeaway('Verbosity controls tune the answer style while the prompt still bounds the output.')\n", "except Exception as exc:\n", " handle_example_error(\"Verbosity\", exc)\n" ] }, { "cell_type": "markdown", "id": "ed8b4b11", "metadata": {}, "source": [ "## 4. Add Application-Managed Tools\n", "\n", "Function calling lets the model ask your application for data or actions, but your code remains responsible for executing tools and returning results. This section defines local BrightCart tools, then walks through a single function call, multiple independent calls, and a custom text tool. The examples keep tool outputs deterministic so the request loop is easy to inspect.\n", "\n", "\n" ] }, { "cell_type": "markdown", "id": "1c543f50", "metadata": {}, "source": [ "### 4.1 Define Local Tool Schemas and Functions\n", "\n", "Define local sample tools for order status and customer profile lookups. The tool schemas describe the names, descriptions, argument shapes, required fields, and strictness that the model can use when deciding what to call. The Python functions stand in for application systems such as order management, CRM, or policy services.\n" ] }, { "cell_type": "code", "execution_count": 18, "id": "dce0bb66", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Sample function tools:\n" ] }, { "data": { "text/html": [ "\n", "
\n", "
JSON
\n", "
[\n", " "get_order_status",\n", " "get_customer_profile"\n", "]
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "Sample order lookup:\n" ] }, { "data": { "text/html": [ "\n", "
\n", "
JSON
\n", "
{\n", " "order_id": "ORDER-8831",\n", " "customer_id": "CUST-1042",\n", " "item": "standing desk replacement",\n", " "status": "delayed",\n", " "carrier_scan": "No movement for 36 hours at Denver sort center",\n", " "promised_delivery": "2026-06-01",\n", " "recommended_policy": "If delay exceeds 48 hours, offer expedited replacement or 15% concession with agent approval."\n", "}
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from __future__ import annotations\n", "function_tools = [\n", " {\n", " \"type\": \"function\",\n", " \"name\": \"get_order_status\",\n", " \"description\": \"Look up a sample BrightCart order status.\",\n", " \"parameters\": {\n", " \"type\": \"object\",\n", " \"properties\": {\"order_id\": {\"type\": \"string\", \"description\": \"An order ID such as ORDER-8831.\"}},\n", " \"required\": [\"order_id\"],\n", " \"additionalProperties\": False,\n", " },\n", " \"strict\": True,\n", " },\n", " {\n", " \"type\": \"function\",\n", " \"name\": \"get_customer_profile\",\n", " \"description\": \"Look up sample customer context for a BrightCart support interaction.\",\n", " \"parameters\": {\n", " \"type\": \"object\",\n", " \"properties\": {\"customer_id\": {\"type\": \"string\", \"description\": \"A customer ID such as CUST-1042.\"}},\n", " \"required\": [\"customer_id\"],\n", " \"additionalProperties\": False,\n", " },\n", " \"strict\": True,\n", " },\n", "]\n", "\n", "\n", "def get_order_status(order_id: str) -> dict[str, Any]:\n", " orders = {\n", " \"ORDER-8831\": {\n", " \"order_id\": \"ORDER-8831\",\n", " \"customer_id\": \"CUST-1042\",\n", " \"item\": \"standing desk replacement\",\n", " \"status\": \"delayed\",\n", " \"carrier_scan\": \"No movement for 36 hours at Denver sort center\",\n", " \"promised_delivery\": (date.today() + timedelta(days=2)).isoformat(),\n", " \"recommended_policy\": \"If delay exceeds 48 hours, offer expedited replacement or 15% concession with agent approval.\",\n", " },\n", " \"ORDER-2044\": {\n", " \"order_id\": \"ORDER-2044\",\n", " \"customer_id\": \"CUST-1042\",\n", " \"item\": \"ergonomic chair\",\n", " \"status\": \"delivered\",\n", " \"carrier_scan\": \"Delivered yesterday at front desk\",\n", " \"promised_delivery\": (date.today() - timedelta(days=1)).isoformat(),\n", " \"recommended_policy\": \"Confirm delivery details before 
opening a replacement request.\",\n", " },\n", " }\n", " return orders.get(order_id, {\"order_id\": order_id, \"status\": \"unknown\", \"customer_id\": None})\n", "\n", "\n", "def get_customer_profile(customer_id: str) -> dict[str, Any]:\n", " profiles = {\n", " \"CUST-1042\": {\n", " \"customer_id\": \"CUST-1042\",\n", " \"name\": \"Maya Chen\",\n", " \"loyalty_tier\": \"Gold\",\n", " \"region\": \"California\",\n", " \"recent_issue\": \"Damaged standing desk replacement\",\n", " \"contact_preference\": \"email with SMS updates for shipping changes\",\n", " }\n", " }\n", " return profiles.get(customer_id, {\"customer_id\": customer_id, \"loyalty_tier\": \"unknown\"})\n", "\n", "\n", "def dispatch_tool_call(call: dict[str, Any]) -> dict[str, Any]:\n", " name = call[\"name\"]\n", " args = json.loads(call[\"arguments\"])\n", " if name == \"get_order_status\":\n", " output = get_order_status(**args)\n", " elif name == \"get_customer_profile\":\n", " output = get_customer_profile(**args)\n", " else:\n", " raise ValueError(f\"Unsupported tool: {name}\")\n", " return {\"type\": \"function_call_output\", \"call_id\": call[\"call_id\"], \"output\": json.dumps(output)}\n", "\n", "print(\"Sample function tools:\")\n", "print_json([tool[\"name\"] for tool in function_tools])\n", "print(\"\\nSample order lookup:\")\n", "print_json(get_order_status(\"ORDER-8831\"))" ] }, { "cell_type": "markdown", "id": "e1c9e1db", "metadata": {}, "source": [ "### 4.2 Call a Function Tool\n", "\n", "This cell runs the basic function-calling loop. The first request gives the model an order-status tool and asks it to choose arguments. The application parses the returned `function_call`, runs the local Python function, sends a `function_call_output` item back, and asks for the final grounded answer. 
Inspect the tool arguments, local tool output, final model text, and response metadata.\n" ] }, { "cell_type": "code", "execution_count": 19, "id": "5919a970", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
Request shape
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fieldvalue
modelopenai.gpt-5.4
max_output_tokens1024
storeFalse
toolsget_order_status
tool_choicerequired
input1 item(s): user: Use get_order_status for ORDER-8831, then explain the next best action for the support agent in two labeled plain-text lines. Do not use leading hyphens or bold text.
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result: tool arguments
\n", "
{\n", " "order_id": "ORDER-8831"\n", "}
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result: tool output
\n", "
{\n", " "order_id": "ORDER-8831",\n", " "customer_id": "CUST-1042",\n", " "item": "standing desk replacement",\n", " "status": "delayed",\n", " "carrier_scan": "No movement for 36 hours at Denver sort center",\n", " "promised_delivery": "2026-06-01",\n", " "recommended_policy": "If delay exceeds 48 hours, offer expedited replacement or 15% concession with agent approval."\n", "}
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result: final model answer
\n", "
Status: ORDER-8831 is delayed; carrier shows no movement for 36 hours at the Denver sort center, with promised delivery on 2026-06-01.\n", "Next best action: Monitor until the 48-hour threshold; if no movement then, contact the customer and offer either an expedited replacement or a 15% concession with agent approval.
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Response summary
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fieldvalue
idresp_nqjijq6jvanvxdglnj4iuezrkdm2eoizutfiezzxbmm42xtynw6a
modelopenai.gpt-5.4
statuscompleted
output_item_types['message']
input_tokens674
cached_input_tokens0
output_tokens75
total_tokens749
reasoning_output_tokens0
service_tierdefault
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Key takeaway: Function calling separates model-selected arguments from application-executed business logic.
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from __future__ import annotations\n", "function_input = [{\"role\": \"user\", \"content\": \"Use get_order_status for ORDER-8831, then explain the next best action for the support agent in two labeled plain-text lines. Do not use leading hyphens or bold text.\"}]\n", "order_status_tool = [tool for tool in function_tools if tool[\"name\"] == \"get_order_status\"]\n", "function_request = {\n", " \"model\": MODEL_ID,\n", " \"input\": function_input,\n", " \"tools\": order_status_tool,\n", " \"tool_choice\": \"required\",\n", " \"max_output_tokens\": 1024,\n", " \"store\": False,\n", "}\n", "\n", "\n", "def create_tool_plan_with_auto_fallback(request: dict[str, Any]) -> tuple[Any, str]:\n", " try:\n", " return create_response(**request), builtins.str(request.get(\"tool_choice\"))\n", " except Exception as first_exc:\n", " fallback_request = {**request, \"tool_choice\": \"auto\"}\n", " try:\n", " return create_response(**fallback_request), \"auto\"\n", " except Exception:\n", " raise first_exc\n", "\n", "\n", "print_request_shape(function_request)\n", "try:\n", " function_plan, tool_choice_used = create_tool_plan_with_auto_fallback(function_request)\n", " function_calls = [item for item in response_items(function_plan) if item.get(\"type\") == \"function_call\"]\n", "\n", " if function_calls:\n", " function_call = function_calls[0]\n", " function_args = json.loads(function_call[\"arguments\"])\n", " require(function_args.get(\"order_id\") == \"ORDER-8831\", f\"Unexpected function arguments: {function_args}\")\n", " tool_output = dispatch_tool_call(function_call)\n", " final_response = create_response(\n", " model=MODEL_ID,\n", " input=function_input + response_items(function_plan) + [tool_output],\n", " tools=order_status_tool,\n", " max_output_tokens=1024,\n", " store=False,\n", " )\n", " final_answer = output_text(final_response).strip()\n", " tool_output_payload = 
json.loads(tool_output[\"output\"])\n", " record_check(\"Function calling\", \"pass\", {\"tool_choice_used\": tool_choice_used, \"arguments\": function_args})\n", " record_response(\"Order-status tool answer\", \"text\", final_answer)\n", " print_labeled_json(\"Result: tool arguments\", function_args)\n", " print_labeled_json(\"Result: tool output\", tool_output_payload)\n", " print_labeled_text(\"Result: final model answer\", final_answer)\n", " print_response_summary(final_response)\n", " print_key_takeaway('Function calling separates model-selected arguments from application-executed business logic.')\n", " else:\n", " fallback_order = get_order_status(\"ORDER-8831\")\n", " fallback_prompt = (\n", " \"The model response did not include a function_call item. Use this application lookup result \"\n", " \"to answer in two labeled plain-text lines without leading hyphens or bold text: \" + json.dumps(fallback_order)\n", " )\n", " final_response = create_response(\n", " model=MODEL_ID,\n", " input=function_input + [{\"role\": \"user\", \"content\": fallback_prompt}],\n", " max_output_tokens=1024,\n", " store=False,\n", " )\n", " final_answer = output_text(final_response).strip()\n", " returned_item_types = [item.get(\"type\") for item in response_items(function_plan)]\n", " record_check(\"Function calling\", \"warn\", {\"tool_choice_used\": tool_choice_used, \"returned_item_types\": returned_item_types})\n", " record_response(\"Order-status local fallback answer\", \"text\", final_answer)\n", " print_labeled_json(\"Result: returned output item types\", returned_item_types)\n", " print_labeled_json(\"Result: local tool output\", fallback_order)\n", " print_labeled_text(\"Result: final model answer\", final_answer)\n", " print_response_summary(final_response)\n", " print_key_takeaway('The local lookup keeps the function-calling pattern understandable even when the model returns text.')\n", "except Exception as exc:\n", " handle_example_error(\"Function calling\", exc)" 
] }, { "cell_type": "markdown", "id": "fde761fd", "metadata": {}, "source": [ "### 4.3 Handle Multiple Tool Calls\n", "\n", "Parallel tool calls let the model request more than one independent lookup in a single turn. This cell allows two order-status lookups, executes each local function call, and sends both outputs back before asking the model to compare the active shipping issues. Inspect the returned order IDs and final answer to confirm that application data, not model memory, grounds the response.\n" ] }, { "cell_type": "code", "execution_count": 20, "id": "53f83130", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
Request shape
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fieldvalue
modelopenai.gpt-5.4
max_output_tokens1024
storeFalse
parallel_tool_callsTrue
toolsget_order_status, get_customer_profile
tool_choiceauto
input1 item(s): user: Use get_order_status for ORDER-8831 and ORDER-2044, then summarize whether Maya has one shipping problem or multiple active shipping problems in two labeled plain-text lines. Do not use leading hyphens or bold text.
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result: returned tool calls
\n", "
{\n", " "tool_call_count": 1,\n", " "order_ids": [\n", " "ORDER-8831"\n", " ]\n", "}
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result: local tool outputs
\n", "
[\n", " {\n", " "order_id": "ORDER-8831",\n", " "customer_id": "CUST-1042",\n", " "item": "standing desk replacement",\n", " "status": "delayed",\n", " "carrier_scan": "No movement for 36 hours at Denver sort center",\n", " "promised_delivery": "2026-06-01",\n", " "recommended_policy": "If delay exceeds 48 hours, offer expedited replacement or 15% concession with agent approval."\n", " },\n", " {\n", " "order_id": "ORDER-2044",\n", " "customer_id": "CUST-1042",\n", " "item": "ergonomic chair",\n", " "status": "delivered",\n", " "carrier_scan": "Delivered yesterday at front desk",\n", " "promised_delivery": "2026-05-29",\n", " "recommended_policy": "Confirm delivery details before opening a replacement request."\n", " }\n", "]
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result: final model answer
\n", "
Order statuses: ORDER-8831 is delayed and ORDER-2044 was delivered yesterday.\n", "Shipping problems: Maya has one active shipping problem, because only ORDER-8831 is currently delayed while ORDER-2044 is already delivered.
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Response summary
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fieldvalue
idresp_ovmlijo2lf2nmc7udlofbbw7n2xmlxs6mxdpftlp6iamfcjoeerq
modelopenai.gpt-5.4
statuscompleted
output_item_types['message']
input_tokens404
cached_input_tokens0
output_tokens50
total_tokens454
reasoning_output_tokens0
service_tierdefault
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Key takeaway: Local lookup outputs keep the parallel-tool pattern understandable if not every call is returned.
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from __future__ import annotations\n", "parallel_input = [{\"role\": \"user\", \"content\": \"Use get_order_status for ORDER-8831 and ORDER-2044, then summarize whether Maya has one shipping problem or multiple active shipping problems in two labeled plain-text lines. Do not use leading hyphens or bold text.\"}]\n", "parallel_request = {\n", " \"model\": MODEL_ID,\n", " \"input\": parallel_input,\n", " \"tools\": function_tools,\n", " \"tool_choice\": \"auto\",\n", " \"parallel_tool_calls\": True,\n", " \"max_output_tokens\": 1024,\n", " \"store\": False,\n", "}\n", "\n", "print_request_shape(parallel_request)\n", "try:\n", " parallel_plan = create_response(**parallel_request)\n", " parallel_calls = [item for item in response_items(parallel_plan) if item.get(\"type\") == \"function_call\"]\n", " parallel_outputs = [dispatch_tool_call(call) for call in parallel_calls]\n", " parallel_order_ids = [json.loads(call[\"arguments\"]).get(\"order_id\") for call in parallel_calls if call.get(\"name\") == \"get_order_status\"]\n", " expected_order_ids = [\"ORDER-8831\", \"ORDER-2044\"]\n", " missing_order_ids = [order_id for order_id in expected_order_ids if order_id not in builtins.set(parallel_order_ids)]\n", "\n", " if not missing_order_ids:\n", " parallel_final = create_response(\n", " model=MODEL_ID,\n", " input=parallel_input + response_items(parallel_plan) + parallel_outputs,\n", " tools=function_tools,\n", " max_output_tokens=1024,\n", " store=False,\n", " )\n", " parallel_answer = output_text(parallel_final).strip()\n", " record_check(\"Parallel tool calls\", \"pass\", {\"tool_call_count\": len(parallel_calls), \"order_ids\": parallel_order_ids})\n", " record_response(\"Parallel order lookup answer\", \"text\", parallel_answer)\n", " print_labeled_json(\"Result: tool calls\", {\"tool_call_count\": len(parallel_calls), \"order_ids\": parallel_order_ids})\n", " 
print_labeled_text(\"Result: final model answer\", parallel_answer)\n", " print_response_summary(parallel_final)\n", " print_key_takeaway('Parallel tool calls let the model request multiple lookups, while the application still controls execution.')\n", " else:\n", " fallback_orders = [get_order_status(order_id) for order_id in expected_order_ids]\n", " fallback_prompt = (\n", " \"The model did not request every expected order lookup. Use these application lookup results \"\n", " \"to answer in two labeled plain-text lines without leading hyphens or bold text: \" + json.dumps(fallback_orders)\n", " )\n", " parallel_final = create_response(\n", " model=MODEL_ID,\n", " input=parallel_input + [{\"role\": \"user\", \"content\": fallback_prompt}],\n", " max_output_tokens=1024,\n", " store=False,\n", " )\n", " parallel_answer = output_text(parallel_final).strip()\n", " record_check(\"Parallel tool calls\", \"warn\", {\"returned_order_ids\": parallel_order_ids, \"missing_order_ids\": missing_order_ids})\n", " record_response(\"Parallel order lookup fallback answer\", \"text\", parallel_answer)\n", " print_labeled_json(\"Result: returned tool calls\", {\"tool_call_count\": len(parallel_calls), \"order_ids\": parallel_order_ids})\n", " print_labeled_json(\"Result: local tool outputs\", fallback_orders)\n", " print_labeled_text(\"Result: final model answer\", parallel_answer)\n", " print_response_summary(parallel_final)\n", " print_key_takeaway('Local lookup outputs keep the parallel-tool pattern understandable if not every call is returned.')\n", "except Exception as exc:\n", " handle_example_error(\"Parallel tool calls\", exc)" ] }, { "cell_type": "markdown", "id": "88218be7", "metadata": {}, "source": [ "### 4.4 Use a Custom Text Tool\n", "\n", "Custom tools pass freeform text to application-owned logic instead of requiring a structured JSON argument object. 
This cell defines a support-note normalizer, requests a custom tool call, and includes a local fallback if the endpoint returns ordinary text instead of a custom call. Inspect the output item types and the normalized note.\n" ] }, { "cell_type": "code", "execution_count": 21, "id": "94436242", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
Request shape
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fieldvalue
modelopenai.gpt-5.4
max_output_tokens1024
storeFalse
toolsnormalize_support_note
tool_choice{'type': 'custom', 'name': 'normalize_support_note'}
input1 item(s): user: Call normalize_support_note with this exact note. Do not answer directly; send the note to the custom tool: order-8831 | cust-1042 | replacement delayed | customer wants supervisor | offer expedited replacement or 15% co...
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result: local fallback normalization
\n", "
ORDER_ID: ORDER-8831\n", "CUSTOMER_ID: CUST-1042\n", "ISSUE: REPLACEMENT DELAYED\n", "CUSTOMER_REQUEST: CUSTOMER WANTS SUPERVISOR\n", "POLICY_OPTION: OFFER EXPEDITED REPLACEMENT OR 15% CONCESSION
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result: returned output item types
\n", "
[\n", " "custom_tool_call"\n", "]
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result: custom tool input
\n", "
order-8831 | cust-1042 | replacement delayed | customer wants supervisor | offer expedited replacement or 15% concession
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result: application-owned normalized output
\n", "
ORDER_ID: ORDER-8831\n", "CUSTOMER_ID: CUST-1042\n", "ISSUE: REPLACEMENT DELAYED\n", "CUSTOMER_REQUEST: CUSTOMER WANTS SUPERVISOR\n", "POLICY_OPTION: OFFER EXPEDITED REPLACEMENT OR 15% CONCESSION
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Response summary
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fieldvalue
idresp_gwyauif44dnxpxrcssrxj4bh57tmgg3zwr67hfcklfswshbbscoa
modelopenai.gpt-5.4
statuscompleted
output_item_types['custom_tool_call']
input_tokens674
cached_input_tokens0
output_tokens37
total_tokens711
reasoning_output_tokens0
service_tierdefault
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Key takeaway: Custom tools are useful when the application owns a freeform parsing or execution step.
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from __future__ import annotations\n", "custom_tools = [\n", " {\n", " \"type\": \"custom\",\n", " \"name\": \"normalize_support_note\",\n", " \"description\": \"Normalize a freeform support note written by an agent. Input is plain text.\",\n", " \"format\": {\"type\": \"text\"},\n", " }\n", "]\n", "\n", "\n", "def normalize_support_note_text(note: str) -> str:\n", " fields = [part.strip().upper() for part in note.split(\"|\")]\n", " labels = [\"ORDER_ID\", \"CUSTOMER_ID\", \"ISSUE\", \"CUSTOMER_REQUEST\", \"POLICY_OPTION\"]\n", " return \"\\n\".join(\n", " f\"{label}: {value}\"\n", " for label, value in zip(labels, fields)\n", " if value\n", " )\n", "\n", "\n", "support_note = \"order-8831 | cust-1042 | replacement delayed | customer wants supervisor | offer expedited replacement or 15% concession\"\n", "custom_input = [{\n", " \"role\": \"user\",\n", " \"content\": (\n", " \"Call normalize_support_note with this exact note. 
Do not answer directly; \"\n", " f\"send the note to the custom tool: {support_note}\"\n", " ),\n", "}]\n", "custom_request = {\n", " \"model\": MODEL_ID,\n", " \"input\": custom_input,\n", " \"tools\": custom_tools,\n", " \"tool_choice\": {\"type\": \"custom\", \"name\": \"normalize_support_note\"},\n", " \"max_output_tokens\": 1024,\n", " \"store\": False,\n", "}\n", "\n", "print_request_shape(custom_request)\n", "print_labeled_text(\"Result: local fallback normalization\", normalize_support_note_text(support_note))\n", "try:\n", " custom_plan = create_response(**custom_request)\n", " returned_item_types = [item.get(\"type\") for item in response_items(custom_plan)]\n", " try:\n", " custom_call = first_output_item(custom_plan, \"custom_tool_call\")\n", " if custom_call is None:\n", " raise LookupError(\"No custom_tool_call item returned.\")\n", " tool_input = custom_call.get(\"input\", \"\").strip()\n", " normalized_note = normalize_support_note_text(tool_input)\n", " record_check(\"Custom tools\", \"pass\", {\"output_item_types\": returned_item_types, \"normalized_note\": normalized_note})\n", " record_response(\"Normalized support note\", \"text\", normalized_note)\n", " print_labeled_json(\"Result: returned output item types\", returned_item_types)\n", " print_labeled_text(\"Result: custom tool input\", tool_input)\n", " print_labeled_text(\"Result: application-owned normalized output\", normalized_note)\n", " except LookupError:\n", " fallback_text = output_text(custom_plan).strip() or \"No text content was returned.\"\n", " normalized_note = normalize_support_note_text(support_note)\n", " record_check(\"Custom tools\", \"warn\", {\n", " \"expected\": \"custom_tool_call item named normalize_support_note\",\n", " \"actual_output_item_types\": returned_item_types,\n", " \"meaning\": \"The model response did not include a custom-tool invocation, so the application fallback normalization is shown for teaching.\",\n", " })\n", " record_response(\"Custom tool text 
fallback\", \"text\", fallback_text)\n", " record_response(\"Application-owned normalization fallback\", \"text\", normalized_note)\n", " print_labeled_json(\"Result: returned output item types\", returned_item_types or [\"no typed output items returned\"])\n", " print_labeled_text(\"Result: model text response\", fallback_text)\n", " print_labeled_text(\"Result: application-owned normalization\", normalized_note)\n", " print_response_summary(custom_plan)\n", " print_key_takeaway('Custom tools are useful when the application owns a freeform parsing or execution step.')\n", "except Exception as exc:\n", " handle_example_error(\"Custom tools\", exc)" ] }, { "cell_type": "markdown", "id": "d064c671", "metadata": {}, "source": [ "## 5. Send Direct File Input\n", "\n", "Direct file input is separate from application-managed tools. A file can be included in the current Responses request as an `input_file` item alongside text instructions, which is useful when the model should read the file for this turn without setting up a retrieval index.\n", "\n", "### 5.1 Attach a PDF as `input_file`\n", "\n", "This cell generates a tiny PDF transcript in memory, attaches it as base64 file data, and asks for exact JSON fields from the document. Inspect the PDF preview, expected fields, parsed response, and usage summary.\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": 22, "id": "96b1aae1", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
\n", "
Result: PDF transcript preview
\n", "
BrightCart support transcript\n", "Ticket: TICKET-7429\n", "Customer: Maya Chen\n", "Order: ORDER-8831\n", "Product: Standing desk replacement\n", "Issue: Replacement for a damaged item is delayed and carrier scan has not moved\n", "Customer request: Supervisor callback and refund options\n", "Policy options: expedited replacement or 15% concession with agent approval after 48-hour delay
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Request shape
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fieldvalue
modelopenai.gpt-5.4
max_output_tokens1024
storeFalse
text formatjson_object
input1 item(s): user: input_file: brightcart-support-transcript.pdf; input_text: Read the attached PDF support transcript and return JSON with keys ticket_id, customer, order_id, product, issue, reques...
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result: expected fields
\n", "
{\n", " "ticket_id": "TICKET-7429",\n", " "customer": "Maya Chen",\n", " "order_id": "ORDER-8831",\n", " "product": "Standing desk replacement"\n", "}
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result
\n", "
{"ticket_id":"TICKET-7429","customer":"Maya Chen","order_id":"ORDER-8831","product":"Standing desk replacement","issue":"Replacement for a damaged item is delayed and carrier scan has not moved","requested_resolution":"Supervisor callback and refund options","policy_options":"expedited replacement or 15% concession with agent approval after 48-hour delay"}
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Response summary
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fieldvalue
idresp_colsvndmpjd6qczpemdjscmsbjefmgl5vh6i7alqt52jflentfna
modelopenai.gpt-5.4
statuscompleted
output_item_types['message']
input_tokens713
cached_input_tokens0
output_tokens82
total_tokens795
reasoning_output_tokens0
service_tierdefault
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Key takeaway: Direct file input is useful when the file should be read in the current request context.
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from __future__ import annotations\n", "def make_simple_pdf(lines: list[str]) -> bytes:\n", " def pdf_escape(text: str) -> str:\n", " return text.replace(\"\\\\\", \"\\\\\\\\\").replace(\"(\", \"\\\\(\").replace(\")\", \"\\\\)\")\n", "\n", " stream_lines = [\"BT\", \"/F1 11 Tf\", \"72 740 Td\", \"15 TL\"]\n", " for idx, line in enumerate(lines):\n", " if idx:\n", " stream_lines.append(\"T*\")\n", " stream_lines.append(f\"({pdf_escape(line)}) Tj\")\n", " stream_lines.append(\"ET\")\n", " stream = \"\\n\".join(stream_lines).encode(\"latin-1\", \"replace\")\n", "\n", " objects = [\n", " b\"<< /Type /Catalog /Pages 2 0 R >>\",\n", " b\"<< /Type /Pages /Kids [3 0 R] /Count 1 >>\",\n", " b\"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] /Resources << /Font << /F1 4 0 R >> >> /Contents 5 0 R >>\",\n", " b\"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>\",\n", " b\"<< /Length \" + builtins.str(len(stream)).encode(\"ascii\") + b\" >>\\nstream\\n\" + stream + b\"\\nendstream\",\n", " ]\n", "\n", " pdf = b\"%PDF-1.4\\n\"\n", " offsets = [0]\n", " for idx, obj in enumerate(objects, start=1):\n", " offsets.append(len(pdf))\n", " pdf += f\"{idx} 0 obj\\n\".encode(\"ascii\") + obj + b\"\\nendobj\\n\"\n", " xref_offset = len(pdf)\n", " pdf += f\"xref\\n0 {len(objects) + 1}\\n0000000000 65535 f \\n\".encode(\"ascii\")\n", " for offset in offsets[1:]:\n", " pdf += f\"{offset:010d} 00000 n \\n\".encode(\"ascii\")\n", " pdf += f\"trailer\\n<< /Size {len(objects) + 1} /Root 1 0 R >>\\nstartxref\\n{xref_offset}\\n%%EOF\\n\".encode(\"ascii\")\n", " return pdf\n", "\n", "\n", "file_lines = [\n", " \"BrightCart support transcript\",\n", " \"Ticket: TICKET-7429\",\n", " \"Customer: Maya Chen\",\n", " \"Order: ORDER-8831\",\n", " \"Product: Standing desk replacement\",\n", " \"Issue: Replacement for a damaged item is delayed and carrier scan has not moved\",\n", " \"Customer request: 
Supervisor callback and refund options\",\n", " \"Policy options: expedited replacement or 15% concession with agent approval after 48-hour delay\",\n", "]\n", "file_text = \"\\n\".join(file_lines)\n", "pdf_data = base64.b64encode(make_simple_pdf(file_lines)).decode(\"utf-8\")\n", "\n", "expected_direct_file_fields = {\n", " \"ticket_id\": \"TICKET-7429\",\n", " \"customer\": \"Maya Chen\",\n", " \"order_id\": \"ORDER-8831\",\n", " \"product\": \"Standing desk replacement\",\n", "}\n", "\n", "direct_file_request = {\n", " \"model\": MODEL_ID,\n", " \"input\": [\n", " {\n", " \"role\": \"user\",\n", " \"content\": [\n", " {\n", " \"type\": \"input_file\",\n", " \"filename\": \"brightcart-support-transcript.pdf\",\n", " \"file_data\": f\"data:application/pdf;base64,{pdf_data}\",\n", " },\n", " {\n", " \"type\": \"input_text\",\n", " \"text\": (\n", " \"Read the attached PDF support transcript and return JSON with keys \"\n", " \"ticket_id, customer, order_id, product, issue, requested_resolution, and policy_options. \"\n", " \"Use exact values from the file. 
Do not return null for fields that are present in the file.\"\n", " ),\n", " },\n", " ],\n", " }\n", " ],\n", " \"text\": {\"format\": {\"type\": \"json_object\"}},\n", " \"max_output_tokens\": 1024,\n", " \"store\": False,\n", "}\n", "\n", "print_labeled_text(\"Result: PDF transcript preview\", file_text)\n", "print_request_shape(direct_file_request)\n", "print_labeled_json(\"Result: expected fields\", expected_direct_file_fields)\n", "try:\n", " direct_file_response = create_response(**direct_file_request)\n", " raw_direct_file_output = output_text(direct_file_response).strip()\n", " try:\n", " direct_file_payload = json.loads(raw_direct_file_output)\n", " missing_or_empty = [\n", " key for key, expected in expected_direct_file_fields.items()\n", " if builtins.str(direct_file_payload.get(key, \"\")).strip().lower() != expected.lower()\n", " ]\n", " null_fields = [key for key, value in direct_file_payload.items() if value in (None, \"\", [])]\n", " if missing_or_empty or null_fields:\n", " record_check(\"Direct file inputs\", \"warn\", {\n", " \"message\": \"The request completed, but the model did not extract the expected values from the attached PDF.\",\n", " \"missing_or_unexpected_fields\": missing_or_empty,\n", " \"empty_fields\": null_fields,\n", " \"payload\": direct_file_payload,\n", " })\n", " record_response(\"Support transcript extraction returned by model\", \"json\", direct_file_payload)\n", " print_labeled_text(\"Result\", \"The request completed, but the model did not extract the expected values from the attached PDF.\")\n", " print_labeled_json(\"Result: returned JSON\", direct_file_payload)\n", " else:\n", " record_check(\"Direct file inputs\", \"pass\", direct_file_payload)\n", " record_response(\"Support transcript extraction\", \"json\", direct_file_payload)\n", " print_labeled_json(\"Result\", direct_file_payload)\n", " except Exception as parse_exc:\n", " record_check(\"Direct file inputs\", \"warn\", {\n", " \"message\": \"The request 
completed, but the response was not valid JSON.\",\n", " \"text_sample\": raw_direct_file_output[:600],\n", " \"error\": builtins.str(parse_exc),\n", " })\n", " record_response(\"Support transcript extraction text\", \"text\", raw_direct_file_output[:1200])\n", " print_labeled_text(\"Result\", raw_direct_file_output[:1200])\n", " print_response_summary(direct_file_response)\n", " print_key_takeaway('Direct file input is useful when the file should be read in the current request context.')\n", "except Exception as exc:\n", " handle_example_error(\"Direct file inputs\", exc)" ] }, { "cell_type": "markdown", "id": "492681d1", "metadata": {}, "source": [ "## 6. Manage Conversation State\n", "\n", "Conversation state determines how follow-up turns receive prior context. The Responses API supports stored continuation with `previous_response_id`, and applications can also manage state themselves by resending relevant input history. This section compares both patterns, then shows encrypted reasoning context where supported.\n", "\n" ] }, { "cell_type": "markdown", "id": "29e58e15", "metadata": {}, "source": [ "### 6.1 Continue with `previous_response_id`\n", "\n", "Use `previous_response_id` to continue from a stored response without resending the full prior prompt. The first request stores the BrightCart case details; the second request passes only the new follow-up instruction plus the previous response ID. Inspect whether the follow-up preserves the order, customer, issue, and next action.\n" ] }, { "cell_type": "code", "execution_count": 23, "id": "3a1e4556", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
Request shape
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fieldvalue
modelopenai.gpt-5.4
max_output_tokens1024
storeFalse
previous_response_id<response-id-from-prior-stored-turn>
inputReturn five labeled lines: ticket ID, order ID, customer name, issue, and next best action.
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result
\n", "
Ticket ID: TICKET-4812\n", "Order ID: ORDER-8831\n", "Customer Name: Maya Chen\n", "Issue: Replacement standing desk shipment for damaged delivery has had no carrier movement for 36 hours; customer is frustrated because this is the second attempt\n", "Next Best Action: Monitor until 48 hours without movement, then offer expedited replacement or 15% concession and escalate to Tier 2 Returns if needed
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Response summary
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fieldvalue
idresp_gkgqadc2gd24747lmhy5waftt5tga7eibtv67k77lndjgbuioo6q
modelopenai.gpt-5.4
statuscompleted
output_item_types['message']
input_tokens715
cached_input_tokens0
output_tokens86
total_tokens801
reasoning_output_tokens0
service_tierdefault
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Key takeaway: previous_response_id lets a follow-up use stored context without resending the full prior turn.
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from __future__ import annotations\n", "promised_delivery = (date.today() + timedelta(days=2)).isoformat()\n", "stateful_seed_input = (\n", " f\"Customer Maya Chen opened ticket TICKET-4812 about order ORDER-8831. \"\n", " \"The item is a standing desk replacement for a damaged delivery. \"\n", " f\"The promised delivery date is {promised_delivery}, but the carrier scan has not moved in 36 hours. \"\n", " \"Customer sentiment is frustrated because this is the second attempt. \"\n", " \"Support policy says to offer expedited replacement or a 15% concession if the delay exceeds 48 hours. \"\n", " \"Escalation owner is Tier 2 Returns.\"\n", ")\n", "stateful_followup_input = \"Return five labeled lines: ticket ID, order ID, customer name, issue, and next best action.\"\n", "stateful_request_shape = {\n", " \"model\": MODEL_ID,\n", " \"input\": stateful_followup_input,\n", " \"previous_response_id\": \"<response-id-from-prior-stored-turn>\",\n", " \"max_output_tokens\": 1024,\n", " \"store\": False,\n", "}\n", "\n", "print_request_shape(stateful_request_shape)\n", "try:\n", " stateful_turn_1 = create_response(model=MODEL_ID, input=stateful_seed_input, max_output_tokens=1024, store=True)\n", " remember_stored_response(stateful_turn_1)\n", " stateful_turn_2 = create_response(model=MODEL_ID, input=stateful_followup_input, previous_response_id=stateful_turn_1.id, max_output_tokens=1024, store=False)\n", " text = output_text(stateful_turn_2).strip()\n", " require(\"order-8831\" in text.lower() or \"maya\" in text.lower(), \"Stateful continuation response missed expected support context.\")\n", " record_check(\"Stateful continuation\", \"pass\", stateful_turn_1.id)\n", " record_response(\"Stateful support handoff\", \"text\", text)\n", " print_labeled_text(\"Result\", text)\n", " print_response_summary(stateful_turn_2)\n", " print_key_takeaway('previous_response_id lets a follow-up use stored context without resending 
the full prior turn.')\n", "except Exception as exc:\n", " handle_example_error(\"Stateful continuation\", exc)\n" ] }, { "cell_type": "markdown", "id": "cab55353", "metadata": {}, "source": [ "### 6.2 Rebuild Stateless Context\n", "\n", "Stateless continuation means the application sends the relevant history on every request. This is a good fit when your product already owns conversation storage, retention policy, or audit requirements. This cell sends a short chat history plus a new handoff instruction and inspects the summary and token usage.\n" ] }, { "cell_type": "code", "execution_count": 24, "id": "647ee6e8", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
Request shape
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fieldvalue
modelopenai.gpt-5.4
max_output_tokens1024
storeFalse
input4 item(s): user: Support chat TICKET-3920: Customer Jordan Lee says ORDER-7718 arrived with a cracked monitor stand.; assistant: Captured damaged-item issue for ORDER-7718 and asked for preferred resolution.; user: Jordan wants a replacement shipped this week and asks whether the damaged item must be returned first.; user: Summarize this support chat for the next agent in five labeled plain-text lines. Do not use leading hyphens or bold text.
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result
\n", "
Customer: Jordan Lee reported ORDER-7718 arrived with a cracked monitor stand.\n", "Issue: Damaged item; monitor stand is cracked on arrival.\n", "Requested Resolution: Customer wants a replacement shipped this week.\n", "Open Question: Jordan asked whether the damaged item must be returned before replacement is sent.\n", "Status: Damage claim captured and awaiting next-agent confirmation on replacement timing and return requirement.
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Response summary
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fieldvalue
idresp_mezt6yqizyswuvujvudnonr34b73ndyyu2qsfncgtjyppzie5vva
modelopenai.gpt-5.4
statuscompleted
output_item_types['message']
input_tokens255
cached_input_tokens0
output_tokens78
total_tokens333
reasoning_output_tokens0
service_tierdefault
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Key takeaway: Stateless continuation sends the relevant history with each request when the application owns conversation storage.
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from __future__ import annotations\n", "stateless_history = [\n", " {\"role\": \"user\", \"content\": \"Support chat TICKET-3920: Customer Jordan Lee says ORDER-7718 arrived with a cracked monitor stand.\"},\n", " {\"role\": \"assistant\", \"content\": \"Captured damaged-item issue for ORDER-7718 and asked for preferred resolution.\"},\n", " {\"role\": \"user\", \"content\": \"Jordan wants a replacement shipped this week and asks whether the damaged item must be returned first.\"},\n", "]\n", "stateless_payload = {\n", " \"model\": MODEL_ID,\n", " \"input\": stateless_history + [{\"role\": \"user\", \"content\": \"Summarize this support chat for the next agent in five labeled plain-text lines. Do not use leading hyphens or bold text.\"}],\n", " \"max_output_tokens\": 1024,\n", " \"store\": False,\n", "}\n", "\n", "print_request_shape(stateless_payload)\n", "try:\n", " stateless_response = create_response(**stateless_payload)\n", " stateless_text = output_text(stateless_response).strip()\n", " require(stateless_text, \"Stateless continuation response did not return text.\")\n", " record_check(\"Stateless continuation\", \"pass\", summarize_response(stateless_response))\n", " record_response(\"Stateless support handoff\", \"text\", stateless_text)\n", " print_labeled_text(\"Result\", stateless_text)\n", " print_response_summary(stateless_response)\n", " print_key_takeaway('Stateless continuation sends the relevant history with each request when the application owns conversation storage.')\n", "except Exception as exc:\n", " handle_example_error(\"Stateless continuation\", exc)\n" ] }, { "cell_type": "markdown", "id": "9da3c529", "metadata": {}, "source": [ "### 6.3 Carry Encrypted Reasoning Context\n", "\n", "Reasoning-capable models may return reasoning items and encrypted reasoning content when requested. 
This cell asks for encrypted reasoning metadata, carries prior response items into a follow-up request, and inspects whether encrypted content was returned. The hidden reasoning text is not exposed; the application only carries opaque context forward where supported.\n", "\n", "Official docs: [Reasoning models](https://developers.openai.com/api/docs/guides/reasoning) describes reasoning models and reasoning effort in Responses workflows.\n" ] }, { "cell_type": "code", "execution_count": 25, "id": "15ffab13", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
Request shape
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fieldvalue
modelopenai.gpt-5.4
max_output_tokens1024
storeFalse
reasoning{'effort': 'medium'}
include['reasoning.encrypted_content']
input1 item(s): user: For a customer-support assistant handling names, order IDs, and refund context, compare stateful and stateless continuation in two sentences.
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result: reasoning metadata
\n", "
{\n", " "returned_item_types": [\n", " "reasoning",\n", " "message"\n", " ],\n", " "encrypted_reasoning_content_returned": true\n", "}
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result: follow-up answer
\n", "
Recommendation: Stateless continuation\n", "Reason: In a regulated support workflow, requiring names, order IDs, and refund context to be explicitly provided each turn improves controllability, auditability, and data-minimization, reducing the risk of unintended retention or cross-session leakage.
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Response summary
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fieldvalue
idresp_qmjgoymsxqf32ht3apisbvscrv4d5t5x2tnxeftkxwpyl4eppjka
modelopenai.gpt-5.4
statuscompleted
output_item_types['message']
input_tokens295
cached_input_tokens0
output_tokens56
total_tokens351
reasoning_output_tokens0
service_tierdefault
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Key takeaway: Encrypted reasoning content can be carried forward where supported without exposing hidden reasoning text.
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from __future__ import annotations\n", "encrypted_history = [\n", " {\"role\": \"user\", \"content\": \"For a customer-support assistant handling names, order IDs, and refund context, compare stateful and stateless continuation in two sentences.\"}\n", "]\n", "encrypted_turn_payload = {\n", " \"model\": MODEL_ID,\n", " \"input\": encrypted_history,\n", " \"reasoning\": {\"effort\": \"medium\"},\n", " \"include\": [\"reasoning.encrypted_content\"],\n", " \"max_output_tokens\": 1024,\n", " \"store\": False,\n", "}\n", "\n", "print_request_shape(encrypted_turn_payload)\n", "try:\n", " encrypted_turn_1 = create_response(**encrypted_turn_payload)\n", " encrypted_turn_2 = create_response(\n", " model=MODEL_ID,\n", " input=encrypted_history + response_items(encrypted_turn_1) + [\n", " {\"role\": \"user\", \"content\": \"Based on the prior reasoning context, recommend one approach for a regulated support workflow in two labeled plain-text lines. 
Do not use leading hyphens or bold text.\"}\n", " ],\n", " max_output_tokens=1024,\n", " store=False,\n", " )\n", " reasoning_items = [item for item in response_items(encrypted_turn_1) if item.get(\"type\") == \"reasoning\"]\n", " has_encrypted_content = any(item.get(\"encrypted_content\") for item in reasoning_items)\n", " record_check(\"Encrypted reasoning\", \"pass\", {\"encrypted_content_returned\": has_encrypted_content, \"reasoning_item_count\": len(reasoning_items)})\n", " encrypted_answer = output_text(encrypted_turn_2).strip()\n", " record_response(\"State strategy recommendation\", \"text\", encrypted_answer)\n", " print_labeled_json(\"Result: reasoning metadata\", {\n", " \"returned_item_types\": [item.get(\"type\") for item in response_items(encrypted_turn_1)],\n", " \"encrypted_reasoning_content_returned\": has_encrypted_content,\n", " })\n", " print_labeled_text(\"Result: follow-up answer\", encrypted_answer)\n", " print_response_summary(encrypted_turn_2)\n", " print_key_takeaway('Encrypted reasoning content can be carried forward where supported without exposing hidden reasoning text.')\n", "except Exception as exc:\n", " handle_example_error(\"Encrypted reasoning\", exc)\n" ] }, { "cell_type": "markdown", "id": "1a4abe4f", "metadata": {}, "source": [ "## 7. Use Prompt Caching\n", "\n", "Prompt caching can reduce latency and cost when requests share an exact static prefix.\n", "\n", "### 7.1 Compare Two Cache-Keyed Requests\n", "\n", "This cell places stable BrightCart policy text at the beginning of the input, sends the same request twice with a `prompt_cache_key`, and compares token metadata. Inspect `cached_input_tokens` on the second response when the endpoint returns cache details; a value of `0` there means the endpoint reported no cached prefix for that request.\n", "\n", "Note: `PROMPT_CACHE_RETENTION` is selected from the active `MODEL_ID`. 
It uses `24h` for `openai.gpt-5.5` and later models, and `in_memory` for `openai.gpt-5.4`.\n" ] }, { "cell_type": "code", "execution_count": 26, "id": "9a2c7825", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
Request shape
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
| field | value |
| --- | --- |
| model | openai.gpt-5.4 |
| max_output_tokens | 1024 |
| store | False |
| prompt_cache_key | brightcart-support-policy-v1 |
| prompt_cache_retention | in_memory |
| input | 2 item(s): system: BrightCart support policy: 1. Be empathetic, concise, and specific about the customer's order. 2. Do not promise refunds, credits, or delivery dates unless the policy context supports it. 3. For damaged-item replacements...; user: Draft a two-sentence agent reply for Maya Chen about delayed replacement order ORDER-8831. |
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Prompt-cache input size
\n", "
{\n", " "estimated_input_words": 3016,\n", " "target_minimum_tokens": 2048\n", "}
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result
\n", "
Hi Maya, I’m sorry your replacement order ORDER-8831 is delayed. I’m checking the latest replacement and carrier status now so I can confirm the best next step for you without making you wait longer than necessary.
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
First request summary
\n", "
{\n", " "id": "resp_3w6r6ipbqa5z2max35awv3i23i5sjmfa33zzw3vhpgqxcvchhkgq",\n", " "model": "openai.gpt-5.4",\n", " "status": "completed",\n", " "output_item_types": [\n", " "message"\n", " ],\n", " "input_tokens": 3970,\n", " "output_tokens": 66,\n", " "total_tokens": 4036,\n", " "cached_input_tokens": 0,\n", " "reasoning_output_tokens": 0,\n", " "service_tier": "default"\n", "}
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Second request summary
\n", "
{\n", " "id": "resp_zzjeqttoswdjdwwl56xpolvly23w4h2n5dtsdoddgxqwfbkn7npq",\n", " "model": "openai.gpt-5.4",\n", " "status": "completed",\n", " "output_item_types": [\n", " "message"\n", " ],\n", " "input_tokens": 3970,\n", " "output_tokens": 48,\n", " "total_tokens": 4018,\n", " "cached_input_tokens": 0,\n", " "reasoning_output_tokens": 0,\n", " "service_tier": "default"\n", "}
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Response summary
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
| request | input_tokens | cached_input_tokens | output_tokens | total_tokens |
| --- | --- | --- | --- | --- |
| first | 3970 | 0 | 66 | 4036 |
| second | 3970 | 0 | 48 | 4018 |
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Key takeaway: cached_input_tokens is the metadata field to inspect for prompt-cache reuse.
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from __future__ import annotations\n", "base_support_policy = [\n", " \"BrightCart support policy:\",\n", " \"1. Be empathetic, concise, and specific about the customer's order.\",\n", " \"2. Do not promise refunds, credits, or delivery dates unless the policy context supports it.\",\n", " \"3. For damaged-item replacements, check replacement status before offering concessions.\",\n", " \"4. If a replacement delay exceeds 48 hours, offer expedited replacement or a 15% concession subject to agent approval.\",\n", "]\n", "policy_reference_paragraph = (\n", " \"Expanded cacheable policy context: BrightCart agents should identify the customer, order ID, replacement status, \"\n", " \"carrier scan age, promised delivery window, item category, prior concessions, and supervisor approval needs before \"\n", " \"drafting a customer-facing answer. The assistant should preserve a calm tone, avoid unsupported promises, separate \"\n", " \"confirmed facts from assumptions, recommend one clear next action, and document why any escalation, expedited \"\n", " \"replacement, or concession is appropriate. 
Repeated policy context like this is intentionally stable across many \"\n", " \"requests so prompt caching can reuse the prefix when the same cache key is supplied.\"\n", ")\n", "expanded_policy_context = \"\\n\".join(\n", " f\"Policy reference paragraph {idx + 1}: {policy_reference_paragraph}\"\n", " for idx in range(32)\n", ")\n", "stable_support_policy = \"\\n\".join(base_support_policy + [expanded_policy_context])\n", "cache_input = [\n", " {\"role\": \"system\", \"content\": stable_support_policy},\n", " {\"role\": \"user\", \"content\": \"Draft a two-sentence agent reply for Maya Chen about delayed replacement order ORDER-8831.\"},\n", "]\n", "estimated_cache_input_words = len(json.dumps(cache_input).split())\n", "require(estimated_cache_input_words > 2048, f\"Prompt-cache input should be over 2048 words; found {estimated_cache_input_words}.\")\n", "cache_payload = {\n", " \"model\": MODEL_ID,\n", " \"input\": cache_input,\n", " \"prompt_cache_key\": \"brightcart-support-policy-v1\",\n", " \"prompt_cache_retention\": PROMPT_CACHE_RETENTION,\n", " \"max_output_tokens\": 1024,\n", " \"store\": False,\n", "}\n", "\n", "print_request_shape(cache_payload)\n", "print_labeled_json(\"Prompt-cache input size\", {\"estimated_input_words\": estimated_cache_input_words, \"target_minimum_tokens\": 2048})\n", "try:\n", " cache_response_1 = create_response(**cache_payload)\n", " cache_response_2 = create_response(**cache_payload)\n", " cache_summary_1 = summarize_response(cache_response_1)\n", " cache_summary_2 = summarize_response(cache_response_2)\n", " cache_comparison = pd.DataFrame([\n", " {\n", " \"request\": \"first\",\n", " \"input_tokens\": cache_summary_1.get(\"input_tokens\"),\n", " \"cached_input_tokens\": cache_summary_1.get(\"cached_input_tokens\"),\n", " \"output_tokens\": cache_summary_1.get(\"output_tokens\"),\n", " \"total_tokens\": cache_summary_1.get(\"total_tokens\"),\n", " },\n", " {\n", " \"request\": \"second\",\n", " \"input_tokens\": 
cache_summary_2.get(\"input_tokens\"),\n", " \"cached_input_tokens\": cache_summary_2.get(\"cached_input_tokens\"),\n", " \"output_tokens\": cache_summary_2.get(\"output_tokens\"),\n", " \"total_tokens\": cache_summary_2.get(\"total_tokens\"),\n", " },\n", " ])\n", " record_check(\"Prompt caching\", \"pass\" if cache_summary_2.get(\"cached_input_tokens\") is not None else \"warn\", {\"first\": cache_summary_1, \"second\": cache_summary_2})\n", " cache_reply = output_text(cache_response_2).strip()\n", " record_response(\"Prompt-cache token comparison\", \"table\", cache_comparison)\n", " record_response(\"Cached support-policy reply\", \"text\", cache_reply)\n", "\n", " print_labeled_text(\"Result\", cache_reply)\n", " print_labeled_json(\"First request summary\", cache_summary_1)\n", " print_labeled_json(\"Second request summary\", cache_summary_2)\n", " print_label(\"Response summary\")\n", " display_wrapped_table(cache_comparison, max_col_width_px=260)\n", " print_key_takeaway('cached_input_tokens is the metadata field to inspect for prompt-cache reuse.')\n", "except Exception as exc:\n", " handle_example_error(\"Prompt caching\", exc)" ] }, { "cell_type": "markdown", "id": "3766504d", "metadata": {}, "source": [ "## 8. Run Background Work\n", "\n", "Background mode starts a response asynchronously and lets the application poll for terminal status.\n", "\n", "### 8.1 Submit and Poll a Background Response\n", "\n", "This cell sends `background=true`, stores the response ID, polls while status is queued or in progress, and then prints the final manager summary. Inspect the status history, final status, response ID, and token summary.\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": 27, "id": "3e5ec164", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
Request shape
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
| field | value |
| --- | --- |
| model | openai.gpt-5.4 |
| max_output_tokens | 1024 |
| store | True |
| background | True |
| input | Return exactly three labeled plain-text lines for a support-manager summary: theme, risk, next action. Keep each line under 12 words. Do not use leading hyphens or bold text. Same-day BrightCart support backlog: 1. 18 delayed-order contacts, mostly from the We... |
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result: status history
\n", "
[\n", " "in_progress",\n", " "completed"\n", "]
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result: manager summary
\n", "
theme: Shipping delays dominate, especially West Coast distribution lane.\n", "risk: Rising dissatisfaction from delays, replacements, and return exceptions.\n", "next action: Escalate West Coast lane issues and review holiday return policy.
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Response summary
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
| field | value |
| --- | --- |
| id | resp_lmmtsvgk3ntolh5ci5vxccmsa6uxcgrsq7v54jpz7oewmociyesa |
| model | openai.gpt-5.4 |
| status | completed |
| output_item_types | ['message'] |
| input_tokens | 246 |
| cached_input_tokens | 0 |
| output_tokens | 45 |
| total_tokens | 291 |
| reasoning_output_tokens | 0 |
| service_tier | default |
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Key takeaway: Background mode starts work asynchronously and lets the application poll by response ID.
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from __future__ import annotations\n", "backlog = \"\"\"\n", "Same-day BrightCart support backlog:\n", "1. 18 delayed-order contacts, mostly from the West Coast distribution lane.\n", "2. 7 damaged-item replacement contacts; 3 mention replacement delays.\n", "3. 5 return-window exception requests after holiday promotions.\n", "\"\"\".strip()\n", "background_payload = {\n", " \"model\": MODEL_ID,\n", " \"input\": f\"Return exactly three labeled plain-text lines for a support-manager summary: theme, risk, next action. Keep each line under 12 words. Do not use leading hyphens or bold text.\\n\\n{backlog}\",\n", " \"background\": True,\n", " \"max_output_tokens\": 1024,\n", " \"store\": True,\n", "}\n", "\n", "print_request_shape(background_payload)\n", "try:\n", " background_response = create_response(**background_payload)\n", " remember_stored_response(background_response)\n", " status_history = [getattr(background_response, \"status\", None)]\n", " for _ in range(15):\n", " if getattr(background_response, \"status\", None) not in {\"queued\", \"in_progress\"}:\n", " break\n", " time.sleep(2)\n", " background_response = retrieve_response(background_response.id)\n", " status_history.append(getattr(background_response, \"status\", None))\n", " background_summary = summarize_response(background_response)\n", " manager_summary = output_text(background_response).strip()\n", " require(manager_summary, \"Background response did not return text.\")\n", " status = \"pass\" if background_summary.get(\"status\") in {None, \"completed\"} else \"warn\"\n", " record_check(\"Background mode\", status, {\"status_history\": status_history, \"id\": getattr(background_response, \"id\", None), \"final_status\": background_summary.get(\"status\")})\n", " record_response(\"Background manager summary\", \"text\", manager_summary)\n", " print_labeled_json(\"Result: status history\", 
status_history)\n", " print_labeled_text(\"Result: manager summary\", manager_summary)\n", " print_response_summary(background_summary)\n", " print_key_takeaway('Background mode starts work asynchronously and lets the application poll by response ID.')\n", "except Exception as exc:\n", " handle_example_error(\"Background mode\", exc)" ] }, { "cell_type": "markdown", "id": "82b83ade", "metadata": {}, "source": [ "## 9. Compact Long-Running Context\n", "\n", "Compaction reduces long conversation state into durable facts, open questions, constraints, and next actions. This cell documents the application-side compaction pattern as a small JSON object so the concept is clear without adding another live feature path. Inspect which facts are kept and which details are omitted before the next turn.\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": 28, "id": "41dd4ab2", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
\n", "
JSON
\n", "
{\n", " "feature": "Compaction",\n", " "how_to_apply": "Summarize older support turns into durable facts, open questions, policy constraints, and next actions before continuing the workflow.",\n", " "brightcart_example": {\n", " "durable_facts": [\n", " "Customer Maya Chen",\n", " "ORDER-8831",\n", " "replacement delayed",\n", " "carrier scan stale"\n", " ],\n", " "policy_constraints": [\n", " "Do not promise refund without eligibility",\n", " "Offer expedited replacement or 15% concession after 48-hour delay with approval"\n", " ],\n", " "next_action": "Check latest carrier scan and supervisor callback status."\n", " }\n", "}
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from __future__ import annotations\n", "compaction_note = {\n", " \"feature\": \"Compaction\",\n", " \"how_to_apply\": \"Summarize older support turns into durable facts, open questions, policy constraints, and next actions before continuing the workflow.\",\n", " \"brightcart_example\": {\n", " \"durable_facts\": [\"Customer Maya Chen\", \"ORDER-8831\", \"replacement delayed\", \"carrier scan stale\"],\n", " \"policy_constraints\": [\"Do not promise refund without eligibility\", \"Offer expedited replacement or 15% concession after 48-hour delay with approval\"],\n", " \"next_action\": \"Check latest carrier scan and supervisor callback status.\",\n", " },\n", "}\n", "record_check(\"Compaction\", \"documented\", compaction_note)\n", "record_response(\"Compacted support context\", \"json\", compaction_note)\n", "print_json(compaction_note)" ] }, { "cell_type": "markdown", "id": "0658a6a3", "metadata": {}, "source": [ "## 10. Run Operational Smoke Checks\n", "\n", "Operational smoke checks are lightweight setup checks, not a load test or service-level measurement. This cell sends three short requests, measures local elapsed time, summarizes success rate and token usage, and infers the region from the configured Bedrock base URL. Inspect latency, completion status, sample outputs, and token totals.\n" ] }, { "cell_type": "code", "execution_count": 29, "id": "20d26f31", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
Request shape
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
| field | value |
| --- | --- |
| model | openai.gpt-5.4 |
| max_output_tokens | 1024 |
| store | False |
| service_tier | auto |
| input | Reply with one short customer-support sentence. |
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
Result
\n", "
{\n", " "region_hint": "us-west-2",\n", " "base_url_host": "bedrock-mantle.us-west-2.api.aws",\n", " "sample_count": 3,\n", " "success_rate": 1.0,\n", " "completed_rate": 1.0,\n", " "avg_latency_seconds": 0.362,\n", " "p50_latency_seconds": 0.377,\n", " "p90_latency_seconds": 0.4,\n", " "total_output_tokens": 34,\n", " "total_tokens": 544\n", "}
\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Response summary
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
| ok | latency_seconds | output_tokens | total_tokens | status | sample_output |
| --- | --- | --- | --- | --- | --- |
| True | 0.400 | 14 | 184 | completed | We apologize for the delay with your replacement order. |
| True | 0.310 | 6 | 174 | completed | Resolution Rate |
| True | 0.377 | 14 | 186 | completed | I’m escalating the return exception to a supervisor. |
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Key takeaway: Responsiveness samples are setup checks, not a load test or service-level measurement.
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from __future__ import annotations\n", "def infer_region_from_base_url(base_url: str) -> str | None:\n", " host = normalize_base_url(base_url).replace(\"https://\", \"\").split(\"/\")[0]\n", " for part in host.split(\".\"):\n", " if part.count(\"-\") >= 2 and any(char.isdigit() for char in part):\n", " return part\n", " return None\n", "\n", "\n", "def percentile(values: list[float], pct: float) -> float | None:\n", " if not values:\n", " return None\n", " ordered = sorted(values)\n", " index = min(len(ordered) - 1, max(0, round((pct / 100) * (len(ordered) - 1))))\n", " return round(ordered[index], 3)\n", "\n", "\n", "operations_features = [\"Latency runtime example\", \"Throughput runtime example\", \"Reliability runtime example\", \"Region check\"]\n", "operations_payload = {\"model\": MODEL_ID, \"input\": \"Reply with one short customer-support sentence.\", \"service_tier\": \"auto\", \"max_output_tokens\": 1024, \"store\": False}\n", "\n", "print_request_shape(operations_payload)\n", "if not RUN_RESPONSIVENESS_CHECK:\n", " record_check(\"Endpoint responsiveness\", \"skipped\", \"BEDROCK_RESPONSIVENESS_CHECK is disabled.\")\n", " print_labeled_text(\"Result\", \"Responsiveness check disabled.\")\n", "else:\n", " prompts = [\n", " \"Reply in one short sentence: apologize for a delayed replacement order.\",\n", " \"Reply with one metric name for support-assistant quality.\",\n", " \"Reply in one short sentence: hand off a return exception to a supervisor.\",\n", " ]\n", " samples = []\n", " for idx, prompt in enumerate(prompts):\n", " started = time.perf_counter()\n", " try:\n", " response = create_response(model=MODEL_ID, input=prompt, service_tier=\"auto\", max_output_tokens=1024, store=False)\n", " elapsed = time.perf_counter() - started\n", " summary = summarize_response(response)\n", " text = output_text(response).strip()\n", " samples.append({\n", " \"ok\": 
bool(text),\n", " \"latency_seconds\": round(elapsed, 3),\n", " \"output_tokens\": summary.get(\"output_tokens\") or 0,\n", " \"total_tokens\": summary.get(\"total_tokens\") or 0,\n", " \"status\": summary.get(\"status\"),\n", " \"sample_output\": text[:140],\n", " })\n", " except Exception as exc:\n", " elapsed = time.perf_counter() - started\n", " samples.append({\"ok\": False, \"latency_seconds\": round(elapsed, 3), \"error\": describe_api_error(exc)})\n", "\n", " successes = [sample for sample in samples if sample[\"ok\"]]\n", " completed = [sample for sample in successes if sample.get(\"status\") in {None, \"completed\"}]\n", " latencies = [sample[\"latency_seconds\"] for sample in successes]\n", " responsiveness_summary = {\n", " \"region_hint\": infer_region_from_base_url(BASE_URL),\n", " \"base_url_host\": normalize_base_url(BASE_URL).replace(\"https://\", \"\").split(\"/\")[0],\n", " \"sample_count\": len(samples),\n", " \"success_rate\": len(successes) / len(samples) if samples else 0,\n", " \"completed_rate\": len(completed) / len(samples) if samples else 0,\n", " \"avg_latency_seconds\": round(sum(latencies) / len(latencies), 3) if latencies else None,\n", " \"p50_latency_seconds\": percentile(latencies, 50),\n", " \"p90_latency_seconds\": percentile(latencies, 90),\n", " \"total_output_tokens\": sum(sample.get(\"output_tokens\", 0) for sample in samples),\n", " \"total_tokens\": sum(sample.get(\"total_tokens\", 0) for sample in samples),\n", " }\n", " status = \"pass\" if len(successes) == len(samples) and len(completed) == len(samples) else \"warn\"\n", " for feature in operations_features:\n", " record_check(feature, status, responsiveness_summary)\n", " record_response(\"Endpoint responsiveness summary\", \"json\", {**responsiveness_summary, \"samples\": samples})\n", " print_labeled_json(\"Result\", responsiveness_summary)\n", " print_label(\"Response summary\")\n", " display_wrapped_table(pd.DataFrame(samples), max_col_width_px=360)\n", " 
print_key_takeaway('Responsiveness samples are setup checks, not a load test or service-level measurement.')\n" ] }, { "cell_type": "markdown", "id": "0724c842", "metadata": {}, "source": [ "## 11. Clean Up and Review Results\n", "\n", "Stored responses created by lifecycle, stateful continuation, and background examples are tracked in `STORED_RESPONSE_IDS`. This final cell attempts to delete stored responses when cleanup is enabled, then prints the run summary and example-response gallery. Inspect warnings first; they usually identify endpoint configuration, model availability, or feature-support differences.\n" ] }, { "cell_type": "code", "execution_count": 30, "id": "4cf13814", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
Stored response cleanup
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
| response_id | status | detail |
| --- | --- | --- |
| resp_cvhvh7y5ghwrpa35snvk4bzgcgthgxp4tgwkllmf5mrhs7dikfia | warn | {'exception_class': 'AuthenticationError', 'status_code': 401, 'retryable': False, 'request_id': 'req_gkni5zyr7lkjkz2vfiwvkev2qgxs76crcwz5whhjrdkma7up3yta', 'message': 'Error code: 401 - {'error': {'code': 'invalid_api_key', 'message': 'The security token included in the request is invalid.', 'param': None, 'type': 'permission_denied_error'}}'} |
| resp_vjrtvnakcgxjhnq5b7cj7rtowtdh7chkkf3aqwbdkynhjqiklp3a | warn | {'exception_class': 'AuthenticationError', 'status_code': 401, 'retryable': False, 'request_id': 'req_vvwhsmp2rkrzwbkajqdpdod2o4j5xo2vxelzxbcp2fjan2ybwc2a', 'message': 'Error code: 401 - {'error': {'code': 'invalid_api_key', 'message': 'The security token included in the request is invalid.', 'param': None, 'type': 'permission_denied_error'}}'} |
| resp_lmmtsvgk3ntolh5ci5vxccmsa6uxcgrsq7v54jpz7oewmociyesa | warn | {'exception_class': 'AuthenticationError', 'status_code': 401, 'retryable': False, 'request_id': 'req_btgbpxspnokm3wzfndybnfuv3kjudxt3r2ihlsmru7ziisn52goa', 'message': 'Error code: 401 - {'error': {'code': 'invalid_api_key', 'message': 'The security token included in the request is invalid.', 'param': None, 'type': 'permission_denied_error'}}'} |
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Run summary
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
| name | status | detail |
| --- | --- | --- |
| Endpoint shape | pass | https://bedrock-mantle.us-west-2.api.aws/openai/v1/responses |
| Model selection | pass | Using configured model; model-list metadata is not required for requests. |
| Error handling | pass | {\"normalized_fields\": [\"exception_class\", \"status_code\", \"retryable\", \"request_id\", \"message\"], \"retryable_status_codes\": [408, 409, 429, 500, 502, 503, 504], \"notes\": \"call_with_retries(...) uses this taxonomy for transient retry handling.\"} |
| Text generation | pass | resp_naythl6fvzhoctlsdogd4vpr673q5ibagqqpiujbast3sy6viroa |
| Text generation | pass | {\"id\": \"resp_nmvqefzghd5hi67uwy4wfvhwnnzild3lslqsxqor3cat63kmucoq\", \"model\": \"openai.gpt-5.4\", \"status\": \"completed\", \"output_item_types\": [\"reasoning\", \"message\"], \"input_tokens\": 177, \"output_tokens\": 129, \"total_tokens\": 306, \"cached_input_tokens\": 0, \"reasoning_output_tokens\": 18, \"service_tier\": \"default\"} |
| Reasoning effort | pass | {\"id\": \"resp_nmvqefzghd5hi67uwy4wfvhwnnzild3lslqsxqor3cat63kmucoq\", \"model\": \"openai.gpt-5.4\", \"status\": \"completed\", \"output_item_types\": [\"reasoning\", \"message\"], \"input_tokens\": 177, \"output_tokens\": 129, \"total_tokens\": 306, \"cached_input_tokens\": 0, \"reasoning_output_tokens\": 18, \"service_tier\": \"default\"} |
| Responses lifecycle | pass | resp_cvhvh7y5ghwrpa35snvk4bzgcgthgxp4tgwkllmf5mrhs7dikfia |
| Response schema | pass | {\"id\": \"resp_cvhvh7y5ghwrpa35snvk4bzgcgthgxp4tgwkllmf5mrhs7dikfia\", \"model\": \"openai.gpt-5.4\", \"status\": \"completed\", \"output_item_types\": [\"message\"], \"input_tokens\": 198, \"output_tokens\": 109, \"total_tokens\": 307, \"cached_input_tokens\": 0, \"reasoning_output_tokens\": 0, \"service_tier\": \"default\"} |
| Usage metadata | pass | {\"id\": \"resp_cvhvh7y5ghwrpa35snvk4bzgcgthgxp4tgwkllmf5mrhs7dikfia\", \"model\": \"openai.gpt-5.4\", \"status\": \"completed\", \"output_item_types\": [\"message\"], \"input_tokens\": 198, \"output_tokens\": 109, \"total_tokens\": 307, \"cached_input_tokens\": 0, \"reasoning_output_tokens\": 0, \"service_tier\": \"default\"} |
| Prompt caching | pass | {\"id\": \"resp_q4akwbeynfwfwnt5i4tdwkpcgsffdu4lqvng7lnwaob53opswvwq\", \"model\": \"openai.gpt-5.4\", \"status\": \"completed\", \"output_item_types\": [\"reasoning\", \"message\"], \"input_tokens\": 183, \"output_tokens\": 91, \"total_tokens\": 274, \"cached_input_tokens\": 0, \"reasoning_output_tokens\": 34, \"service_tier\": \"default\"} |
| Service tier | pass | {\"id\": \"resp_q4akwbeynfwfwnt5i4tdwkpcgsffdu4lqvng7lnwaob53opswvwq\", \"model\": \"openai.gpt-5.4\", \"status\": \"completed\", \"output_item_types\": [\"reasoning\", \"message\"], \"input_tokens\": 183, \"output_tokens\": 91, \"total_tokens\": 274, \"cached_input_tokens\": 0, \"reasoning_output_tokens\": 34, \"service_tier\": \"default\"} |
| Reasoning effort | pass | {\"id\": \"resp_q4akwbeynfwfwnt5i4tdwkpcgsffdu4lqvng7lnwaob53opswvwq\", \"model\": \"openai.gpt-5.4\", \"status\": \"completed\", \"output_item_types\": [\"reasoning\", \"message\"], \"input_tokens\": 183, \"output_tokens\": 91, \"total_tokens\": 274, \"cached_input_tokens\": 0, \"reasoning_output_tokens\": 34, \"service_tier\": \"default\"} |
| Structured Outputs | pass | {\"ticket_id\": \"TICKET-7429\", \"category\": \"delivery_delay\", \"priority\": \"urgent\", \"customer_sentiment\": \"frustrated and time-sensitive\", \"summary\": \"Customer Maya Chen reports that ORDER-8831 is a replacement shipment for a previously damaged standing desk. The replacement is now 2 days late, carrier tracking has not updated, and she needs the desk delivered before Monday. She is requesting a supervisor callback and wants to know refund options if the replacement cannot arrive in time.\", \"require... |
| JSON mode | pass | {\"customer_name\": \"Maya Chen\", \"order_id\": \"ORDER-8831\", \"issue_summary\": \"Customer is asking about a delayed replacement order. The carrier tracking scan is stale and has not updated.\", \"next_step\": \"Handoff to support to investigate the carrier delay, verify shipment status, and provide Maya Chen with an update or resolution.\", \"metrics_to_watch\": [\"tracking_scan_recency\", \"carrier_exception_status\", \"replacement_order_delivery_eta\", \"customer_follow_up_time\"]} |
| Verbosity | pass | {\"compact_chars\": 182, \"detailed_chars\": 332} |
| Function calling | pass | {\"tool_choice_used\": \"required\", \"arguments\": {\"order_id\": \"ORDER-8831\"}} |
| Parallel tool calls | warn | {\"returned_order_ids\": [\"ORDER-8831\"], \"missing_order_ids\": [\"ORDER-2044\"]} |
| Custom tools | pass | {\"output_item_types\": [\"custom_tool_call\"], \"normalized_note\": \"ORDER_ID: ORDER-8831\\nCUSTOMER_ID: CUST-1042\\nISSUE: REPLACEMENT DELAYED\\nCUSTOMER_REQUEST: CUSTOMER WANTS SUPERVISOR\\nPOLICY_OPTION: OFFER EXPEDITED REPLACEMENT OR 15% CONCESSION\"} |
| Direct file inputs | warn | {\"message\": \"The request completed, but the response was not valid JSON.\", \"text_sample\": \"{\\\"ticket_id\\\":\\\"TICKET-7429\\\",\\\"customer\\\":\\\"Maya Chen\\\",\\\"order_id\\\":\\\"ORDER-8831\\\",\\\"product\\\":\\\"Standing desk replacement\\\",\\\"issue\\\":\\\"Replacement for a damaged item is delayed and carrier scan has not moved\\\",\\\"requested_resolution\\\":\\\"Supervisor callback and refund options\\\",\\\"policy_options\\\":\\\"expedited replacement or 15% concession with agent approval after 48-hour delay\\\"}\", \"error\": \"unhashable... |
| Stateful continuation | pass | resp_vjrtvnakcgxjhnq5b7cj7rtowtdh7chkkf3aqwbdkynhjqiklp3a |
| Stateless continuation | pass | {\"id\": \"resp_mezt6yqizyswuvujvudnonr34b73ndyyu2qsfncgtjyppzie5vva\", \"model\": \"openai.gpt-5.4\", \"status\": \"completed\", \"output_item_types\": [\"message\"], \"input_tokens\": 255, \"output_tokens\": 78, \"total_tokens\": 333, \"cached_input_tokens\": 0, \"reasoning_output_tokens\": 0, \"service_tier\": \"default\"} |
| Encrypted reasoning | pass | {\"encrypted_content_returned\": true, \"reasoning_item_count\": 1} |
| Prompt caching | pass | {\"first\": {\"id\": \"resp_3w6r6ipbqa5z2max35awv3i23i5sjmfa33zzw3vhpgqxcvchhkgq\", \"model\": \"openai.gpt-5.4\", \"status\": \"completed\", \"output_item_types\": [\"message\"], \"input_tokens\": 3970, \"output_tokens\": 66, \"total_tokens\": 4036, \"cached_input_tokens\": 0, \"reasoning_output_tokens\": 0, \"service_tier\": \"default\"}, \"second\": {\"id\": \"resp_zzjeqttoswdjdwwl56xpolvly23w4h2n5dtsdoddgxqwfbkn7npq\", \"model\": \"openai.gpt-5.4\", \"status\": \"completed\", \"output_item_types\": [\"message\"], \"input_tokens\": 3970, \"outp... |
| Background mode | pass | {\"status_history\": [\"in_progress\", \"completed\"], \"id\": \"resp_lmmtsvgk3ntolh5ci5vxccmsa6uxcgrsq7v54jpz7oewmociyesa\", \"final_status\": \"completed\"} |
| Compaction | documented | {\"feature\": \"Compaction\", \"how_to_apply\": \"Summarize older support turns into durable facts, open questions, policy constraints, and next actions before continuing the workflow.\", \"brightcart_example\": {\"durable_facts\": [\"Customer Maya Chen\", \"ORDER-8831\", \"replacement delayed\", \"carrier scan stale\"], \"policy_constraints\": [\"Do not promise refund without eligibility\", \"Offer expedited replacement or 15% concession after 48-hour delay with approval\"], \"next_action\": \"Check latest carrier scan and... |
Latency runtime examplepass{\"region_hint\": \"us-west-2\", \"base_url_host\": \"bedrock-mantle.us-west-2.api.aws\", \"sample_count\": 3, \"success_rate\": 1.0, \"completed_rate\": 1.0, \"avg_latency_seconds\": 0.362, \"p50_latency_seconds\": 0.377, \"p90_latency_seconds\": 0.4, \"total_output_tokens\": 34, \"total_tokens\": 544}
Throughput runtime examplepass{\"region_hint\": \"us-west-2\", \"base_url_host\": \"bedrock-mantle.us-west-2.api.aws\", \"sample_count\": 3, \"success_rate\": 1.0, \"completed_rate\": 1.0, \"avg_latency_seconds\": 0.362, \"p50_latency_seconds\": 0.377, \"p90_latency_seconds\": 0.4, \"total_output_tokens\": 34, \"total_tokens\": 544}
Reliability runtime examplepass{\"region_hint\": \"us-west-2\", \"base_url_host\": \"bedrock-mantle.us-west-2.api.aws\", \"sample_count\": 3, \"success_rate\": 1.0, \"completed_rate\": 1.0, \"avg_latency_seconds\": 0.362, \"p50_latency_seconds\": 0.377, \"p90_latency_seconds\": 0.4, \"total_output_tokens\": 34, \"total_tokens\": 544}
Region checkpass{\"region_hint\": \"us-west-2\", \"base_url_host\": \"bedrock-mantle.us-west-2.api.aws\", \"sample_count\": 3, \"success_rate\": 1.0, \"completed_rate\": 1.0, \"avg_latency_seconds\": 0.362, \"p50_latency_seconds\": 0.377, \"p90_latency_seconds\": 0.4, \"total_output_tokens\": 34, \"total_tokens\": 544}
Stored response cleanupwarn[{\"response_id\": \"resp_cvhvh7y5ghwrpa35snvk4bzgcgthgxp4tgwkllmf5mrhs7dikfia\", \"status\": \"warn\", \"detail\": {\"exception_class\": \"AuthenticationError\", \"status_code\": 401, \"retryable\": false, \"request_id\": \"req_gkni5zyr7lkjkz2vfiwvkev2qgxs76crcwz5whhjrdkma7up3yta\", \"message\": \"Error code: 401 - {'error': {'code': 'invalid_api_key', 'message': 'The security token included in the request is invalid.', 'param': None, 'type': 'permission_denied_error'}}\"}}, {\"response_id\": \"resp_vjrtvnakcgxjhnq5b7cj7rt...
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
Example responses
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
| example | response_type | response |
| Endpoint verification | text | ok |
| First raw HTTPS request | text | Empathy: I’m sorry, Maya — your replacement order ORDER-8831 is delayed because the carrier reported a temporary transit hold at the regional sorting facility.\\nAction: We’re monitoring the shipment closely and will send you an updated delivery estimate within 24 hours; if there’s no movement by then, we’ll review the next replacement or refund options with you. |
| SDK text generation | text | Use the Responses API to build a BrightCart support assistant that can answer customer questions, summarize policies, and guide users through common workflows like order tracking, refunds, and account updates. Ground the assistant in BrightCart documentation and connect it to relevant backend tools or APIs so it can retrieve live order data, check account status, and provide accurate, context-aware support responses. Design the experience around clear system instructions, structured tool calling, and conversation state management so the assistant stays on-brand, reliable, and safe when handling customer issues. |
| Create and retrieve response | text | goal: Help support agents explain delayed replacement orders, set expectations, and suggest next steps.\\ndata needed: Order ID, replacement order status, shipment/tracking events, delay reason, estimated ship/delivery date, customer contact history, inventory/backorder status, and applicable refund or reship policy.\\nhuman-review rule: Escalate to a human if the delay exceeds policy thresholds, tracking is inconsistent or missing, the order appears lost, the customer is high-risk or highly upset, or any refund/reship exception is requested. |
| Service tier and prompt cache request | text | Latency benefit: Prompt caching lets the BrightCart support assistant reuse previously processed context, reducing response time for repeated or similar requests.\\nConsistency benefit: Prompt caching helps the BrightCart support assistant return more uniform answers by reusing the same established prompt context across interactions. |
| Structured ticket triage | json | {\\n  \"ticket_id\": \"TICKET-7429\",\\n  \"category\": \"delivery_delay\",\\n  \"priority\": \"urgent\",\\n  \"customer_sentiment\": \"frustrated and time-sensitive\",\\n  \"summary\": \"Customer Maya Chen reports that ORDER-8831 is a replacement shipment for a previously damaged standing desk. The replacement is now 2 days late, carrier tracking has not updated, and she needs the desk delivered before Monday. She is requesting a supervisor callback and wants to know refund options if the replacement cannot arrive in time.\",\\n  \"required_actions\": [\\n    \"Review ORDER-8831 shipment status and confirm last carrier scan/update.\",\\n    \"Contact carrier or open a trace/escalation for stalled tracking.\",\\n    \"Check expedited reshipment or alternative fulfillment options to meet the before-Monday deadline.\",\\n    \"Arrange supervisor callback per customer request.\",\\n    \"Review and communicate refund options, including refund\\n... |
| JSON support handoff | json | {\\n  \"customer_name\": \"Maya Chen\",\\n  \"order_id\": \"ORDER-8831\",\\n  \"issue_summary\": \"Customer is asking about a delayed replacement order. The carrier tracking scan is stale and has not updated.\",\\n  \"next_step\": \"Handoff to support to investigate the carrier delay, verify shipment status, and provide Maya Chen with an update or resolution.\",\\n  \"metrics_to_watch\": [\\n    \"tracking_scan_recency\",\\n    \"carrier_exception_status\",\\n    \"replacement_order_delivery_eta\",\\n    \"customer_follow_up_time\"\\n  ]\\n} |
| Compact policy guidance | text | BrightCart’s delayed-replacement policy lets customers keep using the original item until the replacement arrives, then return the defective product within the allowed return window. |
| Detailed policy guidance | text | 1. BrightCart sends replacements after customers return the original item and warehouse receipt is confirmed.\\n2. This delay prevents duplicate shipments, verifies eligibility, and reduces fraud or inventory errors.\\n3. Agents should explain timelines clearly, offer return instructions, and reassure customers once receipt is logged. |
| Order-status tool answer | text | Status: ORDER-8831 is delayed; carrier shows no movement for 36 hours at the Denver sort center, with promised delivery on 2026-06-01.\\nNext best action: Monitor until the 48-hour threshold; if no movement then, contact the customer and offer either an expedited replacement or a 15% concession with agent approval. |
| Parallel order lookup fallback answer | text | Order statuses: ORDER-8831 is delayed and ORDER-2044 was delivered yesterday.\\nShipping problems: Maya has one active shipping problem, because only ORDER-8831 is currently delayed while ORDER-2044 is already delivered. |
| Normalized support note | text | ORDER_ID: ORDER-8831\\nCUSTOMER_ID: CUST-1042\\nISSUE: REPLACEMENT DELAYED\\nCUSTOMER_REQUEST: CUSTOMER WANTS SUPERVISOR\\nPOLICY_OPTION: OFFER EXPEDITED REPLACEMENT OR 15% CONCESSION |
| Support transcript extraction text | text | {\"ticket_id\":\"TICKET-7429\",\"customer\":\"Maya Chen\",\"order_id\":\"ORDER-8831\",\"product\":\"Standing desk replacement\",\"issue\":\"Replacement for a damaged item is delayed and carrier scan has not moved\",\"requested_resolution\":\"Supervisor callback and refund options\",\"policy_options\":\"expedited replacement or 15% concession with agent approval after 48-hour delay\"} |
| Stateful support handoff | text | Ticket ID: TICKET-4812\\nOrder ID: ORDER-8831\\nCustomer Name: Maya Chen\\nIssue: Replacement standing desk shipment for damaged delivery has had no carrier movement for 36 hours; customer is frustrated because this is the second attempt\\nNext Best Action: Monitor until 48 hours without movement, then offer expedited replacement or 15% concession and escalate to Tier 2 Returns if needed |
| Stateless support handoff | text | Customer: Jordan Lee reported ORDER-7718 arrived with a cracked monitor stand.\\nIssue: Damaged item; monitor stand is cracked on arrival.\\nRequested Resolution: Customer wants a replacement shipped this week.\\nOpen Question: Jordan asked whether the damaged item must be returned before replacement is sent.\\nStatus: Damage claim captured and awaiting next-agent confirmation on replacement timing and return requirement. |
| State strategy recommendation | text | Recommendation: Stateless continuation\\nReason: In a regulated support workflow, requiring names, order IDs, and refund context to be explicitly provided each turn improves controllability, auditability, and data-minimization, reducing the risk of unintended retention or cross-session leakage. |
| Prompt-cache token comparison | table | [\\n  {\\n    \"request\":\"first\",\\n    \"input_tokens\":3970,\\n    \"cached_input_tokens\":0,\\n    \"output_tokens\":66,\\n    \"total_tokens\":4036\\n  },\\n  {\\n    \"request\":\"second\",\\n    \"input_tokens\":3970,\\n    \"cached_input_tokens\":0,\\n    \"output_tokens\":48,\\n    \"total_tokens\":4018\\n  }\\n] |
| Cached support-policy reply | text | Hi Maya, I’m sorry your replacement order ORDER-8831 is delayed. I’m checking the latest replacement and carrier status now so I can confirm the best next step for you without making you wait longer than necessary. |
| Background manager summary | text | theme: Shipping delays dominate, especially West Coast distribution lane.\\nrisk: Rising dissatisfaction from delays, replacements, and return exceptions.\\nnext action: Escalate West Coast lane issues and review holiday return policy. |
| Compacted support context | json | {\\n  \"feature\": \"Compaction\",\\n  \"how_to_apply\": \"Summarize older support turns into durable facts, open questions, policy constraints, and next actions before continuing the workflow.\",\\n  \"brightcart_example\": {\\n    \"durable_facts\": [\\n      \"Customer Maya Chen\",\\n      \"ORDER-8831\",\\n      \"replacement delayed\",\\n      \"carrier scan stale\"\\n    ],\\n    \"policy_constraints\": [\\n      \"Do not promise refund without eligibility\",\\n      \"Offer expedited replacement or 15% concession after 48-hour delay with approval\"\\n    ],\\n    \"next_action\": \"Check latest carrier scan and supervisor callback status.\"\\n  }\\n} |
| Endpoint responsiveness summary | json | {\\n  \"region_hint\": \"us-west-2\",\\n  \"base_url_host\": \"bedrock-mantle.us-west-2.api.aws\",\\n  \"sample_count\": 3,\\n  \"success_rate\": 1.0,\\n  \"completed_rate\": 1.0,\\n  \"avg_latency_seconds\": 0.362,\\n  \"p50_latency_seconds\": 0.377,\\n  \"p90_latency_seconds\": 0.4,\\n  \"total_output_tokens\": 34,\\n  \"total_tokens\": 544,\\n  \"samples\": [\\n    {\\n      \"ok\": true,\\n      \"latency_seconds\": 0.4,\\n      \"output_tokens\": 14,\\n      \"total_tokens\": 184,\\n      \"status\": \"completed\",\\n      \"sample_output\": \"We apologize for the delay with your replacement order.\"\\n    },\\n    {\\n      \"ok\": true,\\n      \"latency_seconds\": 0.31,\\n      \"output_tokens\": 6,\\n      \"total_tokens\": 174,\\n      \"status\": \"completed\",\\n      \"sample_output\": \"Resolution Rate\"\\n    },\\n    {\\n      \"ok\": true,\\n      \"latency_seconds\": 0.377,\\n      \"output_tokens\": 14,\\n      \"total_tokens\": 186,\\n      \"status\": \"completed\",\\n      \"sample_output\": \"I\\u2019m e\\n... |
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
| | example | response_type | response |
| 0 | Endpoint verification | text | ok |
| 1 | First raw HTTPS request | text | Empathy: I’m sorry, Maya — your replacement order ORDER-8831 is delayed because the carrier reported a temporary transit hold at the regional sorting facility.\\nAction: We’re monitoring the shipment closely and will send you an updated delivery estimate within 24 hours; if there’s no movement by then, we’ll review the next replacement or refund options with you. |
| 2 | SDK text generation | text | Use the Responses API to build a BrightCart support assistant that can answer customer questions, summarize policies, and guide users through common workflows like order tracking, refunds, and account updates. Ground the assistant in BrightCart documentation and connect it to relevant backend tools or APIs so it can retrieve live order data, check account status, and provide accurate, context-aware support responses. Design the experience around clear system instructions, structured tool calling, and conversation state management so the assistant stays on-brand, reliable, and safe when handling customer issues. |
| 3 | Create and retrieve response | text | goal: Help support agents explain delayed replacement orders, set expectations, and suggest next steps.\\ndata needed: Order ID, replacement order status, shipment/tracking events, delay reason, estimated ship/delivery date, customer contact history, inventory/backorder status, and applicable refund or reship policy.\\nhuman-review rule: Escalate to a human if the delay exceeds policy thresholds, tracking is inconsistent or missing, the order appears lost, the customer is high-risk or highly upset, or any refund/reship exception is requested. |
| 4 | Service tier and prompt cache request | text | Latency benefit: Prompt caching lets the BrightCart support assistant reuse previously processed context, reducing response time for repeated or similar requests.\\nConsistency benefit: Prompt caching helps the BrightCart support assistant return more uniform answers by reusing the same established prompt context across interactions. |
| 5 | Structured ticket triage | json | {\\n  \"ticket_id\": \"TICKET-7429\",\\n  \"category\": \"delivery_delay\",\\n  \"priority\": \"urgent\",\\n  \"customer_sentiment\": \"frustrated and time-sensitive\",\\n  \"summary\": \"Customer Maya Chen reports that ORDER-8831 is a replacement shipment for a previously damaged standing desk. The replacement is now 2 days late, carrier tracking has not updated, and she needs the desk delivered before Monday. She is requesting a supervisor callback and wants to know refund options if the replacement cannot arrive in time.\",\\n  \"required_actions\": [\\n    \"Review ORDER-8831 shipment status and confirm last carrier scan/update.\",\\n    \"Contact carrier or open a trace/escalation for stalled tracking.\",\\n    \"Check expedited reshipment or alternative fulfillment options to meet the before-Monday deadline.\",\\n    \"Arrange supervisor callback per customer request.\",\\n    \"Review and communicate refund options, including refund\\n... |
| 6 | JSON support handoff | json | {\\n  \"customer_name\": \"Maya Chen\",\\n  \"order_id\": \"ORDER-8831\",\\n  \"issue_summary\": \"Customer is asking about a delayed replacement order. The carrier tracking scan is stale and has not updated.\",\\n  \"next_step\": \"Handoff to support to investigate the carrier delay, verify shipment status, and provide Maya Chen with an update or resolution.\",\\n  \"metrics_to_watch\": [\\n    \"tracking_scan_recency\",\\n    \"carrier_exception_status\",\\n    \"replacement_order_delivery_eta\",\\n    \"customer_follow_up_time\"\\n  ]\\n} |
| 7 | Compact policy guidance | text | BrightCart’s delayed-replacement policy lets customers keep using the original item until the replacement arrives, then return the defective product within the allowed return window. |
| 8 | Detailed policy guidance | text | 1. BrightCart sends replacements after customers return the original item and warehouse receipt is confirmed.\\n2. This delay prevents duplicate shipments, verifies eligibility, and reduces fraud or inventory errors.\\n3. Agents should explain timelines clearly, offer return instructions, and reassure customers once receipt is logged. |
| 9 | Order-status tool answer | text | Status: ORDER-8831 is delayed; carrier shows no movement for 36 hours at the Denver sort center, with promised delivery on 2026-06-01.\\nNext best action: Monitor until the 48-hour threshold; if no movement then, contact the customer and offer either an expedited replacement or a 15% concession with agent approval. |
| 10 | Parallel order lookup fallback answer | text | Order statuses: ORDER-8831 is delayed and ORDER-2044 was delivered yesterday.\\nShipping problems: Maya has one active shipping problem, because only ORDER-8831 is currently delayed while ORDER-2044 is already delivered. |
| 11 | Normalized support note | text | ORDER_ID: ORDER-8831\\nCUSTOMER_ID: CUST-1042\\nISSUE: REPLACEMENT DELAYED\\nCUSTOMER_REQUEST: CUSTOMER WANTS SUPERVISOR\\nPOLICY_OPTION: OFFER EXPEDITED REPLACEMENT OR 15% CONCESSION |
| 12 | Support transcript extraction text | text | {\"ticket_id\":\"TICKET-7429\",\"customer\":\"Maya Chen\",\"order_id\":\"ORDER-8831\",\"product\":\"Standing desk replacement\",\"issue\":\"Replacement for a damaged item is delayed and carrier scan has not moved\",\"requested_resolution\":\"Supervisor callback and refund options\",\"policy_options\":\"expedited replacement or 15% concession with agent approval after 48-hour delay\"} |
| 13 | Stateful support handoff | text | Ticket ID: TICKET-4812\\nOrder ID: ORDER-8831\\nCustomer Name: Maya Chen\\nIssue: Replacement standing desk shipment for damaged delivery has had no carrier movement for 36 hours; customer is frustrated because this is the second attempt\\nNext Best Action: Monitor until 48 hours without movement, then offer expedited replacement or 15% concession and escalate to Tier 2 Returns if needed |
| 14 | Stateless support handoff | text | Customer: Jordan Lee reported ORDER-7718 arrived with a cracked monitor stand.\\nIssue: Damaged item; monitor stand is cracked on arrival.\\nRequested Resolution: Customer wants a replacement shipped this week.\\nOpen Question: Jordan asked whether the damaged item must be returned before replacement is sent.\\nStatus: Damage claim captured and awaiting next-agent confirmation on replacement timing and return requirement. |
| 15 | State strategy recommendation | text | Recommendation: Stateless continuation\\nReason: In a regulated support workflow, requiring names, order IDs, and refund context to be explicitly provided each turn improves controllability, auditability, and data-minimization, reducing the risk of unintended retention or cross-session leakage. |
| 16 | Prompt-cache token comparison | table | [\\n  {\\n    \"request\":\"first\",\\n    \"input_tokens\":3970,\\n    \"cached_input_tokens\":0,\\n    \"output_tokens\":66,\\n    \"total_tokens\":4036\\n  },\\n  {\\n    \"request\":\"second\",\\n    \"input_tokens\":3970,\\n    \"cached_input_tokens\":0,\\n    \"output_tokens\":48,\\n    \"total_tokens\":4018\\n  }\\n] |
| 17 | Cached support-policy reply | text | Hi Maya, I’m sorry your replacement order ORDER-8831 is delayed. I’m checking the latest replacement and carrier status now so I can confirm the best next step for you without making you wait longer than necessary. |
| 18 | Background manager summary | text | theme: Shipping delays dominate, especially West Coast distribution lane.\\nrisk: Rising dissatisfaction from delays, replacements, and return exceptions.\\nnext action: Escalate West Coast lane issues and review holiday return policy. |
| 19 | Compacted support context | json | {\\n  \"feature\": \"Compaction\",\\n  \"how_to_apply\": \"Summarize older support turns into durable facts, open questions, policy constraints, and next actions before continuing the workflow.\",\\n  \"brightcart_example\": {\\n    \"durable_facts\": [\\n      \"Customer Maya Chen\",\\n      \"ORDER-8831\",\\n      \"replacement delayed\",\\n      \"carrier scan stale\"\\n    ],\\n    \"policy_constraints\": [\\n      \"Do not promise refund without eligibility\",\\n      \"Offer expedited replacement or 15% concession after 48-hour delay with approval\"\\n    ],\\n    \"next_action\": \"Check latest carrier scan and supervisor callback status.\"\\n  }\\n} |
| 20 | Endpoint responsiveness summary | json | {\\n  \"region_hint\": \"us-west-2\",\\n  \"base_url_host\": \"bedrock-mantle.us-west-2.api.aws\",\\n  \"sample_count\": 3,\\n  \"success_rate\": 1.0,\\n  \"completed_rate\": 1.0,\\n  \"avg_latency_seconds\": 0.362,\\n  \"p50_latency_seconds\": 0.377,\\n  \"p90_latency_seconds\": 0.4,\\n  \"total_output_tokens\": 34,\\n  \"total_tokens\": 544,\\n  \"samples\": [\\n    {\\n      \"ok\": true,\\n      \"latency_seconds\": 0.4,\\n      \"output_tokens\": 14,\\n      \"total_tokens\": 184,\\n      \"status\": \"completed\",\\n      \"sample_output\": \"We apologize for the delay with your replacement order.\"\\n    },\\n    {\\n      \"ok\": true,\\n      \"latency_seconds\": 0.31,\\n      \"output_tokens\": 6,\\n      \"total_tokens\": 174,\\n      \"status\": \"completed\",\\n      \"sample_output\": \"Resolution Rate\"\\n    },\\n    {\\n      \"ok\": true,\\n      \"latency_seconds\": 0.377,\\n      \"output_tokens\": 14,\\n      \"total_tokens\": 186,\\n      \"status\": \"completed\",\\n      \"sample_output\": \"I\\u2019m e\\n... |
\n", "
" ], "text/plain": [ " example response_type \\\n", "0 Endpoint verification text \n", "1 First raw HTTPS request text \n", "2 SDK text generation text \n", "3 Create and retrieve response text \n", "4 Service tier and prompt cache request text \n", "5 Structured ticket triage json \n", "6 JSON support handoff json \n", "7 Compact policy guidance text \n", "8 Detailed policy guidance text \n", "9 Order-status tool answer text \n", "10 Parallel order lookup fallback answer text \n", "11 Normalized support note text \n", "12 Support transcript extraction text text \n", "13 Stateful support handoff text \n", "14 Stateless support handoff text \n", "15 State strategy recommendation text \n", "16 Prompt-cache token comparison table \n", "17 Cached support-policy reply text \n", "18 Background manager summary text \n", "19 Compacted support context json \n", "20 Endpoint responsiveness summary json \n", "\n", " response \n", "0 ok \n", "1 Empathy: I’m sorry, Maya — your replacement order ORDER-8831 is delayed because the carrier reported a temporary transit hold at the regional sorting facility.\\nAction: We’re monitoring the shipment closely and will send you an updated delivery estimate within 24 hours; if there’s no movement by then, we’ll review the next replacement or refund options with you. \n", "2 Use the Responses API to build a BrightCart support assistant that can answer customer questions, summarize policies, and guide users through common workflows like order tracking, refunds, and account updates. Ground the assistant in BrightCart documentation and connect it to relevant backend tools or APIs so it can retrieve live order data, check account status, and provide accurate, context-aware support responses. Design the experience around clear system instructions, structured tool calling, and conversation state management so the assistant stays on-brand, reliable, and safe when handling customer issues. 
\n", "3 goal: Help support agents explain delayed replacement orders, set expectations, and suggest next steps.\\ndata needed: Order ID, replacement order status, shipment/tracking events, delay reason, estimated ship/delivery date, customer contact history, inventory/backorder status, and applicable refund or reship policy.\\nhuman-review rule: Escalate to a human if the delay exceeds policy thresholds, tracking is inconsistent or missing, the order appears lost, the customer is high-risk or highly upset, or any refund/reship exception is requested. \n", "4 Latency benefit: Prompt caching lets the BrightCart support assistant reuse previously processed context, reducing response time for repeated or similar requests.\\nConsistency benefit: Prompt caching helps the BrightCart support assistant return more uniform answers by reusing the same established prompt context across interactions. \n", "5 {\\n \"ticket_id\": \"TICKET-7429\",\\n \"category\": \"delivery_delay\",\\n \"priority\": \"urgent\",\\n \"customer_sentiment\": \"frustrated and time-sensitive\",\\n \"summary\": \"Customer Maya Chen reports that ORDER-8831 is a replacement shipment for a previously damaged standing desk. The replacement is now 2 days late, carrier tracking has not updated, and she needs the desk delivered before Monday. She is requesting a supervisor callback and wants to know refund options if the replacement cannot arrive in time.\",\\n \"required_actions\": [\\n \"Review ORDER-8831 shipment status and confirm last carrier scan/update.\",\\n \"Contact carrier or open a trace/escalation for stalled tracking.\",\\n \"Check expedited reshipment or alternative fulfillment options to meet the before-Monday deadline.\",\\n \"Arrange supervisor callback per customer request.\",\\n \"Review and communicate refund options, including refund\\n... 
\n", "6 {\\n \"customer_name\": \"Maya Chen\",\\n \"order_id\": \"ORDER-8831\",\\n \"issue_summary\": \"Customer is asking about a delayed replacement order. The carrier tracking scan is stale and has not updated.\",\\n \"next_step\": \"Handoff to support to investigate the carrier delay, verify shipment status, and provide Maya Chen with an update or resolution.\",\\n \"metrics_to_watch\": [\\n \"tracking_scan_recency\",\\n \"carrier_exception_status\",\\n \"replacement_order_delivery_eta\",\\n \"customer_follow_up_time\"\\n ]\\n} \n", "7 BrightCart’s delayed-replacement policy lets customers keep using the original item until the replacement arrives, then return the defective product within the allowed return window. \n", "8 1. BrightCart sends replacements after customers return the original item and warehouse receipt is confirmed.\\n2. This delay prevents duplicate shipments, verifies eligibility, and reduces fraud or inventory errors.\\n3. Agents should explain timelines clearly, offer return instructions, and reassure customers once receipt is logged. \n", "9 Status: ORDER-8831 is delayed; carrier shows no movement for 36 hours at the Denver sort center, with promised delivery on 2026-06-01.\\nNext best action: Monitor until the 48-hour threshold; if no movement then, contact the customer and offer either an expedited replacement or a 15% concession with agent approval. \n", "10 Order statuses: ORDER-8831 is delayed and ORDER-2044 was delivered yesterday.\\nShipping problems: Maya has one active shipping problem, because only ORDER-8831 is currently delayed while ORDER-2044 is already delivered. 
\n", "11 ORDER_ID: ORDER-8831\\nCUSTOMER_ID: CUST-1042\\nISSUE: REPLACEMENT DELAYED\\nCUSTOMER_REQUEST: CUSTOMER WANTS SUPERVISOR\\nPOLICY_OPTION: OFFER EXPEDITED REPLACEMENT OR 15% CONCESSION \n", "12 {\"ticket_id\":\"TICKET-7429\",\"customer\":\"Maya Chen\",\"order_id\":\"ORDER-8831\",\"product\":\"Standing desk replacement\",\"issue\":\"Replacement for a damaged item is delayed and carrier scan has not moved\",\"requested_resolution\":\"Supervisor callback and refund options\",\"policy_options\":\"expedited replacement or 15% concession with agent approval after 48-hour delay\"} \n", "13 Ticket ID: TICKET-4812\\nOrder ID: ORDER-8831\\nCustomer Name: Maya Chen\\nIssue: Replacement standing desk shipment for damaged delivery has had no carrier movement for 36 hours; customer is frustrated because this is the second attempt\\nNext Best Action: Monitor until 48 hours without movement, then offer expedited replacement or 15% concession and escalate to Tier 2 Returns if needed \n", "14 Customer: Jordan Lee reported ORDER-7718 arrived with a cracked monitor stand.\\nIssue: Damaged item; monitor stand is cracked on arrival.\\nRequested Resolution: Customer wants a replacement shipped this week.\\nOpen Question: Jordan asked whether the damaged item must be returned before replacement is sent.\\nStatus: Damage claim captured and awaiting next-agent confirmation on replacement timing and return requirement. \n", "15 Recommendation: Stateless continuation\\nReason: In a regulated support workflow, requiring names, order IDs, and refund context to be explicitly provided each turn improves controllability, auditability, and data-minimization, reducing the risk of unintended retention or cross-session leakage. 
\n", "16 [\\n {\\n \"request\":\"first\",\\n \"input_tokens\":3970,\\n \"cached_input_tokens\":0,\\n \"output_tokens\":66,\\n \"total_tokens\":4036\\n },\\n {\\n \"request\":\"second\",\\n \"input_tokens\":3970,\\n \"cached_input_tokens\":0,\\n \"output_tokens\":48,\\n \"total_tokens\":4018\\n }\\n] \n", "17 Hi Maya, I’m sorry your replacement order ORDER-8831 is delayed. I’m checking the latest replacement and carrier status now so I can confirm the best next step for you without making you wait longer than necessary. \n", "18 theme: Shipping delays dominate, especially West Coast distribution lane.\\nrisk: Rising dissatisfaction from delays, replacements, and return exceptions.\\nnext action: Escalate West Coast lane issues and review holiday return policy. \n", "19 {\\n \"feature\": \"Compaction\",\\n \"how_to_apply\": \"Summarize older support turns into durable facts, open questions, policy constraints, and next actions before continuing the workflow.\",\\n \"brightcart_example\": {\\n \"durable_facts\": [\\n \"Customer Maya Chen\",\\n \"ORDER-8831\",\\n \"replacement delayed\",\\n \"carrier scan stale\"\\n ],\\n \"policy_constraints\": [\\n \"Do not promise refund without eligibility\",\\n \"Offer expedited replacement or 15% concession after 48-hour delay with approval\"\\n ],\\n \"next_action\": \"Check latest carrier scan and supervisor callback status.\"\\n }\\n} \n", "20 {\\n \"region_hint\": \"us-west-2\",\\n \"base_url_host\": \"bedrock-mantle.us-west-2.api.aws\",\\n \"sample_count\": 3,\\n \"success_rate\": 1.0,\\n \"completed_rate\": 1.0,\\n \"avg_latency_seconds\": 0.362,\\n \"p50_latency_seconds\": 0.377,\\n \"p90_latency_seconds\": 0.4,\\n \"total_output_tokens\": 34,\\n \"total_tokens\": 544,\\n \"samples\": [\\n {\\n \"ok\": true,\\n \"latency_seconds\": 0.4,\\n \"output_tokens\": 14,\\n \"total_tokens\": 184,\\n \"status\": \"completed\",\\n \"sample_output\": \"We apologize for the delay with your replacement order.\"\\n },\\n {\\n \"ok\": 
true,\\n \"latency_seconds\": 0.31,\\n \"output_tokens\": 6,\\n \"total_tokens\": 174,\\n \"status\": \"completed\",\\n \"sample_output\": \"Resolution Rate\"\\n },\\n {\\n \"ok\": true,\\n \"latency_seconds\": 0.377,\\n \"output_tokens\": 14,\\n \"total_tokens\": 186,\\n \"status\": \"completed\",\\n \"sample_output\": \"I\\u2019m e\\n... " ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from __future__ import annotations\n", "\n", "cleanup_rows = []\n", "tracked_ids = list(dict.fromkeys(STORED_RESPONSE_IDS))\n", "\n", "if not tracked_ids:\n", " cleanup_rows.append({\"response_id\": \"none\", \"status\": \"no stored responses tracked\", \"detail\": \"\"})\n", "elif not CLEAN_UP_STORED_RESPONSES:\n", " for stored_id in tracked_ids:\n", " cleanup_rows.append({\"response_id\": stored_id, \"status\": \"skipped\", \"detail\": \"BEDROCK_CLEANUP_STORED_RESPONSES is disabled\"})\n", " record_check(\"Stored response cleanup\", \"skipped\", cleanup_rows)\n", "else:\n", " for stored_id in tracked_ids:\n", " try:\n", " delete_result = delete_response(stored_id)\n", " cleanup_rows.append({\"response_id\": stored_id, \"status\": \"deleted\", \"detail\": compact_text(to_dict(delete_result), 240)})\n", " except Exception as exc:\n", " cleanup_rows.append({\"response_id\": stored_id, \"status\": \"warn\", \"detail\": describe_api_error(exc)})\n", " cleanup_status = \"pass\" if all(row[\"status\"] == \"deleted\" for row in cleanup_rows) else \"warn\"\n", " record_check(\"Stored response cleanup\", cleanup_status, cleanup_rows)\n", "\n", "print_label(\"Stored response cleanup\")\n", "display_wrapped_table(pd.DataFrame(cleanup_rows), max_col_width_px=520)\n", "\n", "summary_df = pd.DataFrame(RESULTS_SUMMARY)\n", "print_label(\"Run summary\")\n", "display_wrapped_table(summary_df, max_col_width_px=620)\n", "\n", "print_label(\"Example responses\")\n", "print_response_gallery()\n" ] } ], "metadata": { "kernelspec": { "display_name": ".venv 
(3.11.8)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.8" } }, "nbformat": 4, "nbformat_minor": 5 }