{
"cells": [
{
"cell_type": "markdown",
"id": "98f4e72a",
"metadata": {},
"source": [
"# Getting Started with OpenAI Models on Amazon Bedrock\n",
"\n",
"OpenAI models on Amazon Bedrock expose an OpenAI-compatible Responses API surface for production workflows that need text generation, structured outputs, application tools, direct file inputs, response state, prompt caching, and background work. This cookbook keeps the examples concrete by building a support-assistant workflow for **BrightCart**, a fictional retailer handling delayed and damaged-order replacement requests.\n",
"\n",
"You will use the OpenAI Python SDK for normal application calls and a small raw HTTPS helper when it is useful to inspect the exact request body. The flow starts with setup and a minimal preflight, then layers on response lifecycle, model controls, structured JSON, application tools, file input, state management, caching, background processing, context compaction, operations checks, and cleanup.\n",
"\n",
"You will learn how to:\n",
"\n",
"1. Configure a Bedrock-hosted OpenAI model with Bedrock-specific environment variables.\n",
"2. Verify the Responses endpoint and inspect response schema, usage metadata, and normalized errors.\n",
"3. Send text requests with both raw HTTPS and the OpenAI SDK.\n",
"4. Generate schema-constrained JSON and lighter JSON-mode handoffs.\n",
"5. Call application-managed function tools, parallel tools, and custom text tools.\n",
"6. Send a direct PDF input, continue stateful and stateless conversations, and carry encrypted reasoning context.\n",
"7. Use prompt caching, background mode, compaction, operational smoke checks, and stored-response cleanup.\n",
"\n",
"Prerequisites: a bearer token for OpenAI models on Amazon Bedrock, Python 3.9 or newer, and network access to your Bedrock OpenAI-compatible endpoint.\n",
"\n",
"This guide runs `openai.gpt-5.4` in `us-west-2` by default. To use another supported pairing, change `AWS_REGION`, `BEDROCK_MODEL`, and `BEDROCK_BASE_URL` together before running the setup cells.\n",
"\n",
"| AWS Region | Supported model IDs |\n",
"| --- | --- |\n",
"| `us-west-2` | `openai.gpt-5.4` |\n",
"| `us-east-2` | `openai.gpt-5.5`, `openai.gpt-5.4` |\n"
]
},
{
"cell_type": "markdown",
"id": "8331f876",
"metadata": {},
"source": [
"## 1. Configure Amazon Bedrock\n",
"\n",
"This section prepares the notebook runtime. It installs the small Python stack, reads Bedrock-specific environment variables, creates both a raw HTTPS session and an OpenAI SDK client, discovers model metadata when the endpoint provides it, and defines shared helpers used by later examples.\n",
"\n",
"Set these environment variables before running the notebook. The default pairing is `us-west-2` with `openai.gpt-5.4`.\n",
"\n",
"```bash\n",
"export AWS_BEARER_TOKEN_BEDROCK=\"YOUR_BEDROCK_BEARER_TOKEN\"\n",
"export AWS_REGION=\"us-west-2\"\n",
"export BEDROCK_MODEL=\"openai.gpt-5.4\"\n",
"export BEDROCK_BASE_URL=\"https://bedrock-mantle.${AWS_REGION}.api.aws/openai/v1\"\n",
"```\n",
"\n",
"The bearer token is read from `AWS_BEARER_TOKEN_BEDROCK`. If it is missing, the setup cell asks for it with a password-style prompt and does not print it.\n"
]
},
{
"cell_type": "markdown",
"id": "4a7f4690",
"metadata": {},
"source": [
"### 1.1 Install Dependencies\n",
"\n",
"Install the packages used by the notebook. The OpenAI SDK is used for the application examples, `requests` is used for raw HTTPS calls to the Responses endpoint, and `pandas` plus IPython display helpers keep request and response summaries readable in the Cookbook renderer. Inspect the cell output only to confirm the packages installed or were already present.\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "4236e3c7",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m24.0\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m26.1.1\u001b[0m\n",
"\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n",
"Note: you may need to restart the kernel to use updated packages.\n",
"Dependencies installed or already available: openai, requests, pandas, ipython\n"
]
}
],
"source": [
"%pip install -U \"openai>=2.28.0\" requests pandas ipython --quiet\n",
"print(\"Dependencies installed or already available: openai, requests, pandas, ipython\")"
]
},
{
"cell_type": "markdown",
"id": "1cd3f5e2",
"metadata": {},
"source": [
"### 1.2 Import Libraries and Defaults\n",
"\n",
"Import the standard libraries, SDK, HTTP client, and display utilities used throughout the notebook. This cell also sets the default Bedrock region and model used when environment variables are not already set. Inspect the printed defaults to confirm the notebook will start from `us-west-2` and `openai.gpt-5.4` unless you override them.\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "d53dd7c9",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Imports loaded.\n",
"Default region: us-west-2\n",
"Default model: openai.gpt-5.4\n"
]
}
],
"source": [
"from __future__ import annotations\n",
"\n",
"import base64\n",
"import builtins\n",
"import html\n",
"import json\n",
"import os\n",
"import shlex\n",
"import textwrap\n",
"import time\n",
"from datetime import date, timedelta\n",
"from getpass import getpass\n",
"from typing import Any, Callable, Iterable\n",
"\n",
"import pandas as pd\n",
"import requests\n",
"from IPython.display import HTML, Markdown, display\n",
"from openai import OpenAI\n",
"\n",
"DEFAULT_REGION = \"us-west-2\"\n",
"DEFAULT_MODEL = \"openai.gpt-5.4\"\n",
"PREFERRED_MODELS = [DEFAULT_MODEL]\n",
"\n",
"\n",
"def gpt_version_tuple(model_id: str) -> tuple[int, int] | None:\n",
" normalized = model_id.lower().removeprefix(\"openai.\")\n",
" if not normalized.startswith(\"gpt-\"):\n",
" return None\n",
" version = normalized.removeprefix(\"gpt-\").split(\"-\")[0]\n",
" parts = version.split(\".\")\n",
" try:\n",
" major = builtins.int(parts[0])\n",
" minor = builtins.int(parts[1]) if len(parts) > 1 else 0\n",
" except ValueError:\n",
" return None\n",
" return major, minor\n",
"\n",
"\n",
"def prompt_cache_retention_for_model(model_id: str) -> str:\n",
" version = gpt_version_tuple(model_id)\n",
" if version and version >= (5, 5):\n",
" return \"24h\"\n",
" return \"in_memory\"\n",
"\n",
"pd.set_option(\"display.max_columns\", None)\n",
"pd.set_option(\"display.max_rows\", 200)\n",
"pd.set_option(\"display.max_colwidth\", None)\n",
"pd.set_option(\"display.width\", 160)\n",
"\n",
"\n",
"def display_wrapped_table(df: pd.DataFrame, *, max_col_width_px: int = 520, index: bool = False) -> None:\n",
" if df.empty:\n",
" display(Markdown(\"_No rows to display._\"))\n",
" return\n",
" table_html = df.to_html(index=index, escape=True, border=0)\n",
" table_html = table_html.replace('
', '')\n",
" display(HTML(f\"\"\"\n",
" \n",
" {table_html}\n",
" \"\"\"))\n",
"\n",
"print(\"Imports loaded.\")\n",
"print(\"Default region:\", DEFAULT_REGION)\n",
"print(\"Default model:\", DEFAULT_MODEL)\n"
]
},
{
"cell_type": "markdown",
"id": "3f0f8320",
"metadata": {},
"source": [
"### 1.3 Configure Bedrock Credentials and Clients\n",
"\n",
"Read Bedrock configuration from the environment and construct clients. `BEDROCK_BASE_URL` is normalized once, the raw `requests.Session` gets the bearer token in its headers, and the OpenAI SDK client is created explicitly with the same token and base URL. Inspect the rendered table to confirm the selected region, model, endpoint, SDK client configuration, and stored-response cleanup behavior before making live calls.\n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "50c559a3",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
" \n",
" \n",
" \n",
" | setting | \n",
" value | \n",
"
\n",
" \n",
" \n",
" \n",
" | AWS_REGION | \n",
" us-west-2 | \n",
"
\n",
" \n",
" | BEDROCK_MODEL | \n",
" openai.gpt-5.4 | \n",
"
\n",
" \n",
" | BEDROCK_BASE_URL | \n",
" https://bedrock-mantle.us-west-2.api.aws/openai/v1 | \n",
"
\n",
" \n",
" | SDK client | \n",
" OpenAI(api_key=AWS_BEARER_TOKEN_BEDROCK, base_url=BEDROCK_BASE_URL) | \n",
"
\n",
" \n",
" | cleanup stored responses | \n",
" True | \n",
"
\n",
" \n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from __future__ import annotations\n",
"\n",
"\n",
"def env_value(*names: str) -> str | None:\n",
" for name in names:\n",
" value = os.environ.get(name)\n",
" if value:\n",
" return value\n",
" return None\n",
"\n",
"\n",
"def env_flag(name: str, default: bool = False) -> bool:\n",
" value = env_value(name)\n",
" if value is None:\n",
" return default\n",
" return value.strip().lower() in {\"1\", \"true\", \"yes\", \"on\"}\n",
"\n",
"\n",
"def normalize_base_url(url: str) -> str:\n",
" url = url.strip().rstrip(\"/\")\n",
" if url.endswith(\"/responses\"):\n",
" return url[: -len(\"/responses\")]\n",
" return url\n",
"\n",
"\n",
"def endpoint(path: str) -> str:\n",
" return f\"{BEDROCK_BASE_URL}/{path.lstrip('/')}\"\n",
"\n",
"\n",
"def responses_url(base_url: str) -> str:\n",
" return f\"{normalize_base_url(base_url)}/responses\"\n",
"\n",
"\n",
"API_TIMEOUT_SECONDS = float(env_value(\"BEDROCK_REQUEST_TIMEOUT_SECONDS\") or \"60\")\n",
"MAX_RETRIES = builtins.int(env_value(\"BEDROCK_MAX_RETRIES\") or \"0\")\n",
"CLEAN_UP_STORED_RESPONSES = env_flag(\"BEDROCK_CLEANUP_STORED_RESPONSES\", True)\n",
"FAIL_ON_CHECK_FAILURE = env_flag(\"BEDROCK_FAIL_ON_CHECK_FAILURE\", False)\n",
"RUN_RESPONSIVENESS_CHECK = env_flag(\"BEDROCK_RESPONSIVENESS_CHECK\", True)\n",
"TRANSIENT_STATUS_CODES = {408, 409, 429, 500, 502, 503, 504}\n",
"\n",
"AWS_REGION = (env_value(\"AWS_REGION\") or DEFAULT_REGION).strip() or DEFAULT_REGION\n",
"BEDROCK_MODEL = (env_value(\"BEDROCK_MODEL\") or DEFAULT_MODEL).strip() or DEFAULT_MODEL\n",
"BEDROCK_BASE_URL = normalize_base_url(\n",
" env_value(\"BEDROCK_BASE_URL\") or f\"https://bedrock-mantle.{AWS_REGION}.api.aws/openai/v1\"\n",
")\n",
"RESPONSES_URL = responses_url(BEDROCK_BASE_URL)\n",
"AWS_BEARER_TOKEN_BEDROCK = env_value(\"AWS_BEARER_TOKEN_BEDROCK\")\n",
"\n",
"if not AWS_BEARER_TOKEN_BEDROCK:\n",
" AWS_BEARER_TOKEN_BEDROCK = getpass(\"Paste your AWS Bedrock bearer token for this kernel session: \").strip()\n",
" if AWS_BEARER_TOKEN_BEDROCK:\n",
" os.environ[\"AWS_BEARER_TOKEN_BEDROCK\"] = AWS_BEARER_TOKEN_BEDROCK\n",
"\n",
"if not AWS_BEARER_TOKEN_BEDROCK:\n",
" raise RuntimeError(\"AWS_BEARER_TOKEN_BEDROCK is required to run the live examples.\")\n",
"\n",
"http = requests.Session()\n",
"http.headers.update({\n",
" \"Authorization\": f\"Bearer {AWS_BEARER_TOKEN_BEDROCK}\",\n",
" \"Content-Type\": \"application/json\",\n",
"})\n",
"\n",
"client = OpenAI(api_key=AWS_BEARER_TOKEN_BEDROCK, base_url=BEDROCK_BASE_URL, max_retries=0)\n",
"BASE_URL = BEDROCK_BASE_URL\n",
"\n",
"config_rows = [\n",
" {\"setting\": \"AWS_REGION\", \"value\": AWS_REGION},\n",
" {\"setting\": \"BEDROCK_MODEL\", \"value\": BEDROCK_MODEL},\n",
" {\"setting\": \"BEDROCK_BASE_URL\", \"value\": BEDROCK_BASE_URL},\n",
" {\"setting\": \"SDK client\", \"value\": \"OpenAI(api_key=AWS_BEARER_TOKEN_BEDROCK, base_url=BEDROCK_BASE_URL)\"},\n",
" {\"setting\": \"cleanup stored responses\", \"value\": CLEAN_UP_STORED_RESPONSES},\n",
"]\n",
"display_wrapped_table(pd.DataFrame(config_rows), max_col_width_px=680)\n"
]
},
{
"cell_type": "markdown",
"id": "23cf01d3",
"metadata": {},
"source": [
"### 1.4 Discover Available Models\n",
"\n",
"Discover available models when the selected endpoint exposes model-list metadata, then choose the model for the rest of the notebook. If `BEDROCK_MODEL` is set, the notebook uses that value; otherwise it prefers `openai.gpt-5.4`. The model-list call is optional because some compatible endpoints may allow inference even when model metadata is unavailable. Inspect the selected model and any returned catalog rows.\n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "ed26644c",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
" \n",
" \n",
" \n",
" | selected_model | \n",
" model_was_explicit | \n",
" model_catalog_status | \n",
" discovered_model_count | \n",
" prompt_cache_retention | \n",
" prompt_cache_retention_note | \n",
" note | \n",
"
\n",
" \n",
" \n",
" \n",
" | openai.gpt-5.4 | \n",
" False | \n",
" using configured model | \n",
" 0 | \n",
" in_memory | \n",
" GPT-5.5 and later use 24h extended prompt caching; earlier GPT-5 models can use in_memory. | \n",
" This endpoint did not expose model-list metadata. The guide will continue with the configured model. | \n",
"
\n",
" \n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Continuing with: openai.gpt-5.4\n"
]
}
],
"source": [
"from __future__ import annotations\n",
"\n",
"\n",
"def list_openai_models(client: OpenAI) -> list[str]:\n",
" return sorted(model.id for model in client.models.list(timeout=API_TIMEOUT_SECONDS).data)\n",
"\n",
"\n",
"def resolve_model_id(client: OpenAI | None) -> tuple[str, list[str], str | None]:\n",
" configured_model = env_value(\"BEDROCK_MODEL\")\n",
" available_models: list[str] = []\n",
" model_discovery_note: str | None = None\n",
"\n",
" if client is not None:\n",
" try:\n",
" available_models = list_openai_models(client)\n",
" except Exception as exc:\n",
" status_code = getattr(exc, \"status_code\", None)\n",
" if status_code == 404:\n",
" model_discovery_note = \"This endpoint did not expose model-list metadata. The guide will continue with the configured model.\"\n",
" else:\n",
" model_discovery_note = f\"Model-list metadata could not be listed. The guide will continue with the configured model. Details: {builtins.str(exc)[:240]}\"\n",
"\n",
" if configured_model:\n",
" return configured_model, available_models, model_discovery_note\n",
"\n",
" for candidate in PREFERRED_MODELS:\n",
" if candidate in available_models:\n",
" return candidate, available_models, model_discovery_note\n",
"\n",
" for candidate in available_models:\n",
" if candidate.startswith(\"openai.\"):\n",
" return candidate, available_models, model_discovery_note\n",
"\n",
" if available_models:\n",
" return available_models[0], available_models, model_discovery_note\n",
"\n",
" return PREFERRED_MODELS[0], available_models, model_discovery_note\n",
"\n",
"\n",
"EXPLICIT_MODEL = env_value(\"BEDROCK_MODEL\")\n",
"MODEL_ID, AVAILABLE_MODELS, MODEL_DISCOVERY_NOTE = resolve_model_id(client)\n",
"os.environ[\"BEDROCK_MODEL\"] = MODEL_ID\n",
"PROMPT_CACHE_RETENTION = prompt_cache_retention_for_model(MODEL_ID)\n",
"PROMPT_CACHE_RETENTION_NOTE = (\n",
" \"GPT-5.5 and later use 24h extended prompt caching; earlier GPT-5 models can use in_memory.\"\n",
")\n",
"\n",
"config_rows = [{\n",
" \"selected_model\": MODEL_ID,\n",
" \"model_was_explicit\": bool(EXPLICIT_MODEL),\n",
" \"model_catalog_status\": \"listed\" if AVAILABLE_MODELS else \"using configured model\",\n",
" \"discovered_model_count\": len(AVAILABLE_MODELS),\n",
" \"prompt_cache_retention\": PROMPT_CACHE_RETENTION,\n",
" \"prompt_cache_retention_note\": PROMPT_CACHE_RETENTION_NOTE,\n",
" \"note\": MODEL_DISCOVERY_NOTE or \"Model selection is ready.\",\n",
"}]\n",
"display_wrapped_table(pd.DataFrame(config_rows), max_col_width_px=620)\n",
"\n",
"if AVAILABLE_MODELS:\n",
" display_wrapped_table(pd.DataFrame({\"available_models\": AVAILABLE_MODELS[:25]}), max_col_width_px=520)\n",
"else:\n",
" print(\"Continuing with:\", MODEL_ID)\n"
]
},
{
"cell_type": "markdown",
"id": "d37d8bdb",
"metadata": {},
"source": [
"### 1.5 Helper Functions Setup\n",
"\n",
"Define shared helpers for the workflow. These helpers render request shapes, normalize API errors, send raw HTTPS requests, wrap SDK calls with optional retries, extract `output_text`, summarize token usage, track stored response IDs, and display compact tables. The examples below stay focused on each API concept while the helpers handle repeated mechanics. Inspect this cell if you want to understand how response text, usage, errors, and cleanup are processed.\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "13c12262",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Helpers ready.\n"
]
}
],
"source": [
"from __future__ import annotations\n",
"\n",
"RESULTS_SUMMARY: list[dict[str, Any]] = []\n",
"EXAMPLE_RESPONSES: list[dict[str, str]] = []\n",
"STORED_RESPONSE_IDS: list[str] = []\n",
"OUTPUT_WIDTH = 100\n",
"MAX_DISPLAY_TEXT_CHARS = builtins.int(env_value(\"BEDROCK_MAX_DISPLAY_CHARS\") or \"1200\")\n",
"\n",
"\n",
"def truncate_display_text(text: Any, *, limit: int = MAX_DISPLAY_TEXT_CHARS) -> str:\n",
" rendered = builtins.str(text).strip()\n",
" if len(rendered) <= limit:\n",
" return rendered\n",
" return rendered[:limit].rstrip() + \"\\n[Display truncated for readability. Inspect the Python variable for the full value.]\"\n",
"\n",
"\n",
"def compact_text(text: Any, limit: int = 220) -> str:\n",
" rendered = \" \".join(builtins.str(text).split())\n",
" if len(rendered) <= limit:\n",
" return rendered\n",
" return rendered[:limit].rstrip() + \"...\"\n",
"\n",
"\n",
"def require(condition: Any, message: str) -> None:\n",
" if not condition:\n",
" raise ValueError(message)\n",
"\n",
"\n",
"def warn_or_raise(condition: bool, message: str) -> bool:\n",
" if condition:\n",
" return True\n",
" display(HTML(f\"Warning: {html.escape(message)}
\"))\n",
" if FAIL_ON_CHECK_FAILURE:\n",
" raise AssertionError(message)\n",
" return False\n",
"\n",
"\n",
"def display_text_block(label: str, text: Any, *, limit: int = MAX_DISPLAY_TEXT_CHARS) -> None:\n",
" safe_label = html.escape(label)\n",
" safe_text = html.escape(truncate_display_text(text, limit=limit))\n",
" display(HTML(f\"\"\"\n",
" \n",
"
{safe_label}
\n",
"
{safe_text}
\n",
"
\n",
" \"\"\"))\n",
"\n",
"\n",
"def print_wrapped(text: Any, *, width: int = OUTPUT_WIDTH) -> None:\n",
" print(textwrap.fill(builtins.str(text), width=width, break_long_words=True, break_on_hyphens=False))\n",
"\n",
"\n",
"def print_json(value: Any, *, width: int = OUTPUT_WIDTH) -> None:\n",
" display_json_block(\"JSON\", value)\n",
"\n",
"\n",
"def print_label(label: str) -> None:\n",
" display(HTML(f\"{html.escape(label)}
\"))\n",
"\n",
"\n",
"def print_labeled_text(label: str, text: Any) -> None:\n",
" display_text_block(label, text)\n",
"\n",
"\n",
"def print_labeled_json(label: str, value: Any) -> None:\n",
" display_json_block(label, value)\n",
"\n",
"\n",
"def display_json_block(label: str, value: Any, *, limit: int = MAX_DISPLAY_TEXT_CHARS) -> None:\n",
" rendered = json.dumps(value, indent=2, default=builtins.str)\n",
" display_text_block(label, rendered, limit=limit)\n",
"\n",
"\n",
"def summarize_content(content: Any) -> str:\n",
" if isinstance(content, builtins.str):\n",
" return compact_text(content)\n",
" if isinstance(content, builtins.list):\n",
" parts: list[str] = []\n",
" for item in content:\n",
" if not isinstance(item, builtins.dict):\n",
" parts.append(compact_text(item, 80))\n",
" continue\n",
" item_type = item.get(\"type\", \"item\")\n",
" if item_type == \"input_text\":\n",
" parts.append(f\"input_text: {compact_text(item.get('text', ''), 120)}\")\n",
" elif item_type == \"input_file\":\n",
" parts.append(f\"input_file: {item.get('filename', '')}\")\n",
" else:\n",
" parts.append(item_type)\n",
" return \"; \".join(parts)\n",
" return compact_text(content)\n",
"\n",
"\n",
"def summarize_input(input_value: Any) -> str:\n",
" if isinstance(input_value, builtins.str):\n",
" return compact_text(input_value, 260)\n",
" if isinstance(input_value, builtins.list):\n",
" messages: list[str] = []\n",
" for item in input_value[:4]:\n",
" if isinstance(item, builtins.dict):\n",
" role = item.get(\"role\", item.get(\"type\", \"item\"))\n",
" messages.append(f\"{role}: {summarize_content(item.get('content', item))}\")\n",
" else:\n",
" messages.append(compact_text(item, 120))\n",
" suffix = f\"; +{len(input_value) - 4} more\" if len(input_value) > 4 else \"\"\n",
" return f\"{len(input_value)} item(s): \" + \"; \".join(messages) + suffix\n",
" return compact_text(input_value, 260)\n",
"\n",
"\n",
"def summarize_text_format(text_config: Any) -> str:\n",
" if not isinstance(text_config, builtins.dict):\n",
" return compact_text(text_config)\n",
" fmt = text_config.get(\"format\")\n",
" if isinstance(fmt, builtins.dict):\n",
" fmt_type = fmt.get(\"type\")\n",
" if fmt_type == \"json_schema\":\n",
" schema = fmt.get(\"schema\") or {}\n",
" required = schema.get(\"required\") or []\n",
" return f\"json_schema: {fmt.get('name')} strict={fmt.get('strict')} required={len(required)} fields\"\n",
" if fmt_type:\n",
" return builtins.str(fmt_type)\n",
" return compact_text(text_config)\n",
"\n",
"\n",
"def request_summary_rows(payload: dict[str, Any]) -> list[dict[str, str]]:\n",
" rows: list[dict[str, str]] = []\n",
" ordered_keys = [\n",
" \"model\", \"max_output_tokens\", \"store\", \"background\", \"service_tier\", \"previous_response_id\",\n",
" \"parallel_tool_calls\", \"prompt_cache_key\", \"prompt_cache_retention\",\n",
" ]\n",
" for key in ordered_keys:\n",
" if key in payload:\n",
" rows.append({\"field\": key, \"value\": compact_text(payload[key], 180)})\n",
" if \"reasoning\" in payload:\n",
" rows.append({\"field\": \"reasoning\", \"value\": compact_text(payload[\"reasoning\"], 180)})\n",
" if \"text\" in payload:\n",
" rows.append({\"field\": \"text format\", \"value\": summarize_text_format(payload[\"text\"])})\n",
" if \"include\" in payload:\n",
" rows.append({\"field\": \"include\", \"value\": compact_text(payload[\"include\"], 180)})\n",
" if \"tools\" in payload:\n",
" tool_names = [tool.get(\"name\", tool.get(\"type\", \"tool\")) for tool in payload.get(\"tools\", [])]\n",
" rows.append({\"field\": \"tools\", \"value\": \", \".join(tool_names)})\n",
" if \"tool_choice\" in payload:\n",
" rows.append({\"field\": \"tool_choice\", \"value\": compact_text(payload[\"tool_choice\"], 180)})\n",
" if \"input\" in payload:\n",
" rows.append({\"field\": \"input\", \"value\": summarize_input(payload[\"input\"])})\n",
" return rows\n",
"\n",
"\n",
"def print_request_shape(payload: dict[str, Any]) -> None:\n",
" rows = request_summary_rows(redact_payload(payload))\n",
" print_label(\"Request shape\")\n",
" display_wrapped_table(pd.DataFrame(rows), max_col_width_px=520)\n",
"\n",
"\n",
"def print_response_summary(response_or_summary: Any) -> None:\n",
" summary = response_or_summary if isinstance(response_or_summary, builtins.dict) and \"output\" not in response_or_summary else summarize_response(response_or_summary)\n",
" preferred = [\n",
" \"id\", \"model\", \"status\", \"output_item_types\", \"input_tokens\", \"cached_input_tokens\",\n",
" \"output_tokens\", \"total_tokens\", \"reasoning_output_tokens\", \"service_tier\",\n",
" ]\n",
" rows = [{\"field\": key, \"value\": compact_text(summary.get(key), 220)} for key in preferred if key in summary]\n",
" print_label(\"Response summary\")\n",
" display_wrapped_table(pd.DataFrame(rows), max_col_width_px=420)\n",
"\n",
"\n",
"def print_key_takeaway(text: str) -> None:\n",
" display(HTML(f\"Key takeaway: {html.escape(text)}
\"))\n",
"\n",
"\n",
"def redact_payload(payload: dict[str, Any]) -> dict[str, Any]:\n",
" def redact(value: Any) -> Any:\n",
" if isinstance(value, builtins.dict):\n",
" return {\n",
" key: (\"\" if key == \"file_data\" else redact(item))\n",
" for key, item in value.items()\n",
" }\n",
" if isinstance(value, builtins.list):\n",
" return [redact(item) for item in value]\n",
" return value\n",
"\n",
" return json.loads(json.dumps(redact(payload), default=builtins.str))\n",
"\n",
"\n",
"def compact_detail(detail: Any) -> str:\n",
" if isinstance(detail, (builtins.dict, builtins.list)):\n",
" return compact_text(json.dumps(detail, default=builtins.str), 500)\n",
" return compact_text(detail, 500)\n",
"\n",
"\n",
"def record_check(name: str, status: str, detail: Any = \"\") -> None:\n",
" RESULTS_SUMMARY.append({\"name\": name, \"status\": status, \"detail\": compact_detail(detail)})\n",
"\n",
"\n",
"def record_response(example: str, response_type: str, content: Any, limit: int = 900) -> None:\n",
" if isinstance(content, pd.DataFrame):\n",
" rendered = content.to_json(orient=\"records\", indent=2)\n",
" elif isinstance(content, (builtins.dict, builtins.list)):\n",
" rendered = json.dumps(content, indent=2, default=builtins.str)\n",
" else:\n",
" rendered = builtins.str(content)\n",
" rendered = rendered.strip()\n",
" if len(rendered) > limit:\n",
" rendered = rendered[:limit].rstrip() + chr(10) + \"...\"\n",
" EXAMPLE_RESPONSES.append({\n",
" \"example\": example,\n",
" \"response_type\": response_type,\n",
" \"response\": rendered,\n",
" })\n",
"\n",
"\n",
"def print_response_gallery() -> pd.DataFrame:\n",
" gallery = pd.DataFrame(EXAMPLE_RESPONSES)\n",
" if gallery.empty:\n",
" gallery = pd.DataFrame(columns=[\"example\", \"response_type\", \"response\"])\n",
" display_wrapped_table(gallery, max_col_width_px=620)\n",
" return gallery\n",
"\n",
"\n",
"def normalize_error(response: requests.Response, body: Any) -> dict[str, Any]:\n",
" return {\n",
" \"exception_class\": \"HTTPError\",\n",
" \"status_code\": response.status_code,\n",
" \"retryable\": response.status_code in TRANSIENT_STATUS_CODES,\n",
" \"request_id\": response.headers.get(\"x-request-id\"),\n",
" \"body\": body,\n",
" }\n",
"\n",
"\n",
"def describe_api_error(exc: Exception) -> dict[str, Any]:\n",
" try:\n",
" parsed = json.loads(builtins.str(exc))\n",
" if isinstance(parsed, builtins.dict) and \"status_code\" in parsed:\n",
" return {\n",
" \"exception_class\": type(exc).__name__,\n",
" \"status_code\": parsed.get(\"status_code\"),\n",
" \"retryable\": parsed.get(\"retryable\"),\n",
" \"request_id\": parsed.get(\"request_id\"),\n",
" \"message\": compact_text(parsed.get(\"body\", parsed), 500),\n",
" }\n",
" except Exception:\n",
" pass\n",
"\n",
" status_code = getattr(exc, \"status_code\", None)\n",
" response = getattr(exc, \"response\", None)\n",
" request_id = None\n",
" if response is not None:\n",
" headers = getattr(response, \"headers\", {})\n",
" request_id = headers.get(\"x-request-id\") if hasattr(headers, \"get\") else None\n",
" return {\n",
" \"exception_class\": type(exc).__name__,\n",
" \"status_code\": status_code,\n",
" \"retryable\": status_code in TRANSIENT_STATUS_CODES,\n",
" \"request_id\": request_id,\n",
" \"message\": builtins.str(exc)[:500],\n",
" }\n",
"\n",
"\n",
"def request_json(method: str, path: str, *, payload: dict[str, Any] | None = None) -> dict[str, Any]:\n",
" response = http.request(\n",
" method,\n",
" endpoint(path),\n",
" json=payload,\n",
" timeout=API_TIMEOUT_SECONDS,\n",
" )\n",
" try:\n",
" body = response.json() if response.text else {}\n",
" except json.JSONDecodeError:\n",
" body = {\"raw_text\": response.text}\n",
" if response.status_code >= 400:\n",
" raise RuntimeError(json.dumps(normalize_error(response, body), indent=2, default=builtins.str))\n",
" return body\n",
"\n",
"\n",
"def to_dict(value: Any) -> Any:\n",
" if hasattr(value, \"model_dump\"):\n",
" return value.model_dump(mode=\"json\")\n",
" if isinstance(value, builtins.list):\n",
" return [to_dict(item) for item in value]\n",
" if isinstance(value, builtins.dict):\n",
" return {key: to_dict(item) for key, item in value.items()}\n",
" return value\n",
"\n",
"\n",
"def output_text(response: Any) -> str:\n",
" direct = getattr(response, \"output_text\", None)\n",
" if direct:\n",
" return direct\n",
" data = to_dict(response)\n",
" pieces: list[str] = []\n",
" for item in data.get(\"output\", []) or []:\n",
" for content in item.get(\"content\", []) or []:\n",
" if content.get(\"type\") == \"output_text\":\n",
" pieces.append(content.get(\"text\", \"\"))\n",
" return \"\".join(pieces)\n",
"\n",
"\n",
"def response_items(response: Any) -> list[dict[str, Any]]:\n",
" data = to_dict(response)\n",
" return builtins.list(data.get(\"output\", []) or [])\n",
"\n",
"\n",
"def first_output_item(response: Any, item_type: str) -> dict[str, Any] | None:\n",
" for item in response_items(response):\n",
" if item.get(\"type\") == item_type:\n",
" return item\n",
" return None\n",
"\n",
"\n",
"def summarize_response(response: Any) -> dict[str, Any]:\n",
" data = to_dict(response)\n",
" usage = data.get(\"usage\") or {}\n",
" input_details = usage.get(\"input_tokens_details\") or {}\n",
" output_details = usage.get(\"output_tokens_details\") or {}\n",
" return {\n",
" \"id\": data.get(\"id\"),\n",
" \"model\": data.get(\"model\"),\n",
" \"status\": data.get(\"status\"),\n",
" \"output_item_types\": [item.get(\"type\") for item in data.get(\"output\", []) or []],\n",
" \"input_tokens\": usage.get(\"input_tokens\"),\n",
" \"output_tokens\": usage.get(\"output_tokens\"),\n",
" \"total_tokens\": usage.get(\"total_tokens\"),\n",
" \"cached_input_tokens\": input_details.get(\"cached_tokens\"),\n",
" \"reasoning_output_tokens\": output_details.get(\"reasoning_tokens\"),\n",
" \"service_tier\": data.get(\"service_tier\"),\n",
" }\n",
"\n",
"\n",
"def call_with_retries(label: str, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:\n",
" kwargs.setdefault(\"timeout\", API_TIMEOUT_SECONDS)\n",
" last_exc: Exception | None = None\n",
" for attempt in range(1, MAX_RETRIES + 2):\n",
" try:\n",
" return func(*args, **kwargs)\n",
" except Exception as exc:\n",
" last_exc = exc\n",
" error = describe_api_error(exc)\n",
" should_retry = bool(error[\"retryable\"] and attempt <= MAX_RETRIES)\n",
" if not should_retry:\n",
" raise\n",
" time.sleep(min(2 ** (attempt - 1), 8))\n",
" raise RuntimeError(f\"{label} failed after retries\") from last_exc\n",
"\n",
"\n",
"def create_response(**kwargs: Any) -> Any:\n",
" kwargs.setdefault(\"model\", MODEL_ID)\n",
" return call_with_retries(\"responses.create\", client.responses.create, **kwargs)\n",
"\n",
"\n",
"def retrieve_response(response_id: str) -> Any:\n",
" return call_with_retries(\"responses.retrieve\", client.responses.retrieve, response_id)\n",
"\n",
"\n",
"def delete_response(response_id: str) -> Any:\n",
" return call_with_retries(\"responses.delete\", client.responses.delete, response_id)\n",
"\n",
"\n",
"def remember_stored_response(response: Any) -> None:\n",
" response_id = getattr(response, \"id\", None) or to_dict(response).get(\"id\")\n",
" if response_id:\n",
" STORED_RESPONSE_IDS.append(response_id)\n",
"\n",
"\n",
"def handle_example_error(features: str | list[str], exc: Exception) -> None:\n",
" feature_list = [features] if isinstance(features, builtins.str) else features\n",
" error = describe_api_error(exc)\n",
" for feature in feature_list:\n",
" record_check(feature, \"warn\", error)\n",
" print_labeled_text(\"Result\", \"This live call did not complete in this environment.\")\n",
" print_labeled_json(\"Response summary\", error)\n",
"\n",
"\n",
"def build_curl_command(payload: dict[str, Any]) -> str:\n",
" body = json.dumps(payload)\n",
" return \" \".join([\n",
" \"curl\", \"-sS\", shlex.quote(RESPONSES_URL),\n",
" \"-H\", shlex.quote(\"Content-Type: application/json\"),\n",
" \"-H\", shlex.quote(\"Authorization: Bearer $AWS_BEARER_TOKEN_BEDROCK\"),\n",
" \"-d\", shlex.quote(body),\n",
" ])\n",
"\n",
"\n",
"def run_raw_http_request(payload: dict[str, Any]) -> dict[str, Any]:\n",
" return request_json(\"POST\", \"/responses\", payload=payload)\n",
"\n",
"print(\"Helpers ready.\")\n"
]
},
{
"cell_type": "markdown",
"id": "5361f141",
"metadata": {},
"source": [
"### 1.6 Verify the Endpoint\n",
"\n",
"The first live call is intentionally tiny. It sends a minimal Responses request with `store=false` and a short text instruction so you can catch setup issues before running richer examples. Inspect the request shape, returned text, status, model, output item types, and token usage.\n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "38425696",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<h4>Request shape</h4>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<table>\n",
"<thead><tr><th>field</th><th>value</th></tr></thead>\n",
"<tbody>\n",
"<tr><td>model</td><td>openai.gpt-5.4</td></tr>\n",
"<tr><td>max_output_tokens</td><td>1024</td></tr>\n",
"<tr><td>store</td><td>False</td></tr>\n",
"<tr><td>input</td><td>Reply with exactly: ok</td></tr>\n",
"</tbody>\n",
"</table>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<h4>Result</h4>\n",
"<pre>ok</pre>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<h4>Response summary</h4>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<table>\n",
"<thead><tr><th>field</th><th>value</th></tr></thead>\n",
"<tbody>\n",
"<tr><td>id</td><td>resp_gnt2qiavimim2lvfrtosh472mmsodtphk4glbiiefj7joxb4k4ra</td></tr>\n",
"<tr><td>model</td><td>openai.gpt-5.4</td></tr>\n",
"<tr><td>status</td><td>completed</td></tr>\n",
"<tr><td>output_item_types</td><td>['message']</td></tr>\n",
"<tr><td>input_tokens</td><td>162</td></tr>\n",
"<tr><td>cached_input_tokens</td><td>0</td></tr>\n",
"<tr><td>output_tokens</td><td>5</td></tr>\n",
"<tr><td>total_tokens</td><td>167</td></tr>\n",
"<tr><td>reasoning_output_tokens</td><td>0</td></tr>\n",
"<tr><td>service_tier</td><td>default</td></tr>\n",
"</tbody>\n",
"</table>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<p><strong>Key takeaway:</strong> A tiny response confirms that the endpoint, key, model, and request shape are working.</p>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from __future__ import annotations\n",
"preflight_payload = {\n",
"    \"model\": MODEL_ID,\n",
"    \"input\": \"Reply with exactly: ok\",\n",
"    \"max_output_tokens\": 1024,\n",
"    \"store\": False,\n",
"}\n",
"\n",
"print_request_shape(preflight_payload)\n",
"try:\n",
"    preflight_response = create_response(**preflight_payload)\n",
"    require(output_text(preflight_response).strip(), \"Preflight response did not return output text.\")\n",
"    record_check(\"Endpoint shape\", \"pass\", RESPONSES_URL)\n",
"    model_selection_detail = f\"{len(AVAILABLE_MODELS)} models discovered\" if AVAILABLE_MODELS else \"Using configured model; model-list metadata is not required for requests.\"\n",
"    record_check(\"Model selection\", \"pass\", model_selection_detail)\n",
"    preflight_text = output_text(preflight_response).strip()\n",
"    record_response(\"Endpoint verification\", \"text\", preflight_text)\n",
"    print_labeled_text(\"Result\", preflight_text)\n",
"    print_response_summary(preflight_response)\n",
"    print_key_takeaway('A tiny response confirms that the endpoint, key, model, and request shape are working.')\n",
"except Exception as exc:\n",
"    handle_example_error([\"Endpoint shape\", \"Model selection\"], exc)\n"
]
},
{
"cell_type": "markdown",
"id": "f3c079d9",
"metadata": {},
"source": [
"### 1.7 Normalize API Errors\n",
"\n",
"Production integrations need consistent error logging for status codes, retry decisions, request IDs, and response bodies. This cell documents the normalized error shape used by the notebook without intentionally making a failing request. Later cells use the same shape when a live call fails or returns a non-2xx status.\n"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "4e027820",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<h4>JSON</h4>\n",
"<pre>{\n",
"  \"normalized_fields\": [\n",
"    \"exception_class\",\n",
"    \"status_code\",\n",
"    \"retryable\",\n",
"    \"request_id\",\n",
"    \"message\"\n",
"  ],\n",
"  \"retryable_status_codes\": [\n",
"    408,\n",
"    409,\n",
"    429,\n",
"    500,\n",
"    502,\n",
"    503,\n",
"    504\n",
"  ],\n",
"  \"notes\": \"call_with_retries(...) uses this taxonomy for transient retry handling.\"\n",
"}</pre>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from __future__ import annotations\n",
"error_taxonomy_example = {\n",
"    \"normalized_fields\": [\"exception_class\", \"status_code\", \"retryable\", \"request_id\", \"message\"],\n",
"    \"retryable_status_codes\": sorted(TRANSIENT_STATUS_CODES),\n",
"    \"notes\": \"call_with_retries(...) uses this taxonomy for transient retry handling.\",\n",
"}\n",
"record_check(\"Error handling\", \"pass\", error_taxonomy_example)\n",
"print_json(error_taxonomy_example)"
]
},
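{
"cell_type": "markdown",
"id": "b41d7e02",
"metadata": {},
"source": [
"The normalized fields listed above can be illustrated with a minimal, self-contained sketch. It is not the notebook's `describe_api_error` implementation: `FakeAPIError` and `normalize_error` are hypothetical names introduced here, and the only detail taken from the notebook is the set of retryable status codes shown in the output above.\n",
"\n",
"```python\n",
"from typing import Any, Dict, Optional\n",
"\n",
"# Retryable status codes as reported in this section's output.\n",
"TRANSIENT_STATUS_CODES = {408, 409, 429, 500, 502, 503, 504}\n",
"\n",
"\n",
"class FakeAPIError(Exception):\n",
"    # Hypothetical stand-in for an SDK or HTTP error; not a real SDK class.\n",
"    def __init__(self, message: str, status_code: Optional[int] = None,\n",
"                 request_id: Optional[str] = None) -> None:\n",
"        super().__init__(message)\n",
"        self.status_code = status_code\n",
"        self.request_id = request_id\n",
"\n",
"\n",
"def normalize_error(exc: Exception) -> Dict[str, Any]:\n",
"    # Map any exception to the five normalized fields.\n",
"    status_code = getattr(exc, 'status_code', None)\n",
"    return {\n",
"        'exception_class': type(exc).__name__,\n",
"        'status_code': status_code,\n",
"        'retryable': status_code in TRANSIENT_STATUS_CODES,\n",
"        'request_id': getattr(exc, 'request_id', None),\n",
"        'message': str(exc),\n",
"    }\n",
"\n",
"\n",
"throttled = normalize_error(FakeAPIError('Too many requests', 429, 'req-123'))\n",
"bad_input = normalize_error(ValueError('missing model'))\n",
"print(throttled['retryable'], bad_input['retryable'])\n",
"```\n",
"\n",
"Keeping the retry decision inside the normalized record lets logging and retry code read the same field instead of each re-deriving it from the status code."
]
},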
{
"cell_type": "markdown",
"id": "9dc176b7",
"metadata": {},
"source": [
"## 2. Make Your First Responses Requests\n",
"\n",
"This section shows the Responses request surface from two angles. First, you inspect and run a raw HTTPS request so the endpoint, headers, and JSON body are visible. Then you use the OpenAI SDK for the same kind of application workflow, which is the path most production code should prefer once configuration is correct.\n"
]
},
{
"cell_type": "markdown",
"id": "87b26703",
"metadata": {},
"source": [
"### 2.1 Inspect the Raw HTTPS Request Shape\n",
"\n",
"Build a minimal Responses payload for a BrightCart support-assistant reply and render a copy-pasteable `curl` command. The command references `$AWS_BEARER_TOKEN_BEDROCK` instead of embedding a token, and the notebook does not execute shell commands that put bearer tokens in process arguments. Inspect the `model`, `input`, `max_output_tokens`, and `store` fields.\n"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "7c2aa5e3",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<h4>Request shape</h4>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<table>\n",
"<thead><tr><th>field</th><th>value</th></tr></thead>\n",
"<tbody>\n",
"<tr><td>model</td><td>openai.gpt-5.4</td></tr>\n",
"<tr><td>max_output_tokens</td><td>1024</td></tr>\n",
"<tr><td>store</td><td>False</td></tr>\n",
"<tr><td>input</td><td>BrightCart customer Maya asks why replacement order ORDER-8831 is delayed. Write two labeled plain-text lines for the support agent. Do not use leading hyphens or bold text.</td></tr>\n",
"</tbody>\n",
"</table>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<h4>Result</h4>\n",
"<pre>curl -sS https://bedrock-mantle.us-west-2.api.aws/openai/v1/responses -H 'Content-Type: application/json' -H 'Authorization: Bearer $AWS_BEARER_TOKEN_BEDROCK' -d '{\"model\": \"openai.gpt-5.4\", \"input\": \"BrightCart customer Maya asks why replacement order ORDER-8831 is delayed. Write two labeled plain-text lines for the support agent. Do not use leading hyphens or bold text.\", \"max_output_tokens\": 1024, \"store\": false}'</pre>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<p><strong>Key takeaway:</strong> The curl command shows the raw HTTPS shape behind the SDK call.</p>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from __future__ import annotations\n",
"basic_curl_payload = {\n",
"    \"model\": MODEL_ID,\n",
"    \"input\": \"BrightCart customer Maya asks why replacement order ORDER-8831 is delayed. Write two labeled plain-text lines for the support agent. Do not use leading hyphens or bold text.\",\n",
"    \"max_output_tokens\": 1024,\n",
"    \"store\": False,\n",
"}\n",
"\n",
"print_request_shape(basic_curl_payload)\n",
"print_labeled_text(\"Result\", build_curl_command(basic_curl_payload))\n",
"print_key_takeaway('The curl command shows the raw HTTPS shape behind the SDK call.')\n"
]
},
{
"cell_type": "markdown",
"id": "3ed94d52",
"metadata": {},
"source": [
"### 2.2 Send the Raw HTTPS Request\n",
"\n",
"Send the same request through the raw HTTPS helper. This cell demonstrates the wire-level `POST /responses` path and extracts text from the response body by walking output items. Inspect the returned response ID, model, status, and text output to understand the schema your application receives.\n"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "958f17c8",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<h4>Request shape</h4>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<table>\n",
"<thead><tr><th>field</th><th>value</th></tr></thead>\n",
"<tbody>\n",
"<tr><td>model</td><td>openai.gpt-5.4</td></tr>\n",
"<tr><td>max_output_tokens</td><td>1024</td></tr>\n",
"<tr><td>store</td><td>False</td></tr>\n",
"<tr><td>input</td><td>BrightCart customer Maya asks why replacement order ORDER-8831 is delayed. Write two labeled plain-text lines for the support agent. Do not use leading hyphens or bold text.</td></tr>\n",
"</tbody>\n",
"</table>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<h4>Result</h4>\n",
"<pre>Empathy: I’m sorry, Maya — your replacement order ORDER-8831 is delayed because the carrier reported a temporary transit hold at the regional sorting facility.\n",
"Action: We’re monitoring the shipment closely and will send you an updated delivery estimate within 24 hours; if there’s no movement by then, we’ll review the next replacement or refund options with you.</pre>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<h4>Response summary</h4>\n",
"<pre>{\n",
"  \"id\": \"resp_naythl6fvzhoctlsdogd4vpr673q5ibagqqpiujbast3sy6viroa\",\n",
"  \"model\": \"openai.gpt-5.4\",\n",
"  \"status\": \"completed\"\n",
"}</pre>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<p><strong>Key takeaway:</strong> The response body contains message output that application code can extract as text.</p>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from __future__ import annotations\n",
"\n",
"print_request_shape(basic_curl_payload)\n",
"try:\n",
"    basic_http_response = run_raw_http_request(basic_curl_payload)\n",
"    record_check(\"Text generation\", \"pass\", basic_http_response.get(\"id\"))\n",
"    response_text_parts = []\n",
"    for item in basic_http_response.get(\"output\", []):\n",
"        for content in item.get(\"content\", []):\n",
"            if content.get(\"type\") == \"output_text\":\n",
"                response_text_parts.append(content.get(\"text\", \"\"))\n",
"    raw_http_output = \"\".join(response_text_parts).strip()\n",
"    record_response(\"First raw HTTPS request\", \"text\", raw_http_output)\n",
"    print_labeled_text(\"Result\", raw_http_output)\n",
"    print_labeled_json(\"Response summary\", {\n",
"        \"id\": basic_http_response.get(\"id\"),\n",
"        \"model\": basic_http_response.get(\"model\"),\n",
"        \"status\": basic_http_response.get(\"status\"),\n",
"    })\n",
"    print_key_takeaway(\"The response body contains message output that application code can extract as text.\")\n",
"except Exception as exc:\n",
"    handle_example_error(\"Text generation\", exc)\n"
]
},
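{
"cell_type": "markdown",
"id": "d09c3a6f",
"metadata": {},
"source": [
"The output-item walk used in the cell above can be tried offline against a hand-written body. This is a sketch under stated assumptions: `sample_body` is a hypothetical, trimmed response (real bodies carry more fields), and `extract_output_text` is an illustrative name rather than a notebook helper. It also treats a missing or null `content` as an empty list, which is a defensive choice and not something the endpoint is known to require.\n",
"\n",
"```python\n",
"from typing import Any, Dict, List\n",
"\n",
"# Hypothetical response body, trimmed to the fields the extraction walks.\n",
"sample_body: Dict[str, Any] = {\n",
"    'id': 'resp_example',\n",
"    'status': 'completed',\n",
"    'output': [\n",
"        # Reasoning items carry no output_text parts.\n",
"        {'type': 'reasoning', 'summary': []},\n",
"        {\n",
"            'type': 'message',\n",
"            'role': 'assistant',\n",
"            'content': [\n",
"                {'type': 'output_text', 'text': 'Empathy: sorry for the delay. '},\n",
"                {'type': 'output_text', 'text': 'Action: we will follow up.'},\n",
"            ],\n",
"        },\n",
"    ],\n",
"}\n",
"\n",
"\n",
"def extract_output_text(body: Dict[str, Any]) -> str:\n",
"    # Collect every output_text part, skipping items without content.\n",
"    parts: List[str] = []\n",
"    for item in body.get('output') or []:\n",
"        for content in item.get('content') or []:\n",
"            if content.get('type') == 'output_text':\n",
"                parts.append(content.get('text', ''))\n",
"    return ''.join(parts).strip()\n",
"\n",
"\n",
"print(extract_output_text(sample_body))\n",
"```\n",
"\n",
"Filtering on `type` rather than position keeps the extraction stable when a response includes a `reasoning` item ahead of the `message` item."
]
},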
{
"cell_type": "markdown",
"id": "95e7940e",
"metadata": {},
"source": [
"### 2.3 Use the OpenAI SDK\n",
"\n",
"The OpenAI SDK can call OpenAI-compatible APIs when you pass the Bedrock bearer token and base URL explicitly. This cell sends a text-generation request through `client.responses.create`, sets `reasoning.effort` to `low`, and prints a compact response summary. Inspect the output text, token counts, output item types, and any reasoning-token metadata returned by the endpoint.\n",
"\n",
"Official docs: [Reasoning models](https://developers.openai.com/api/docs/guides/reasoning) describes using reasoning effort with the Responses API.\n"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "817e8717",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<h4>Request shape</h4>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<table>\n",
"<thead><tr><th>field</th><th>value</th></tr></thead>\n",
"<tbody>\n",
"<tr><td>model</td><td>openai.gpt-5.4</td></tr>\n",
"<tr><td>max_output_tokens</td><td>1024</td></tr>\n",
"<tr><td>store</td><td>False</td></tr>\n",
"<tr><td>reasoning</td><td>{'effort': 'low'}</td></tr>\n",
"<tr><td>input</td><td>Write a three-sentence overview for a developer building a BrightCart support assistant with the Responses API.</td></tr>\n",
"</tbody>\n",
"</table>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<h4>Result</h4>\n",
"<pre>Use the Responses API to build a BrightCart support assistant that can answer customer questions, summarize policies, and guide users through common workflows like order tracking, refunds, and account updates. Ground the assistant in BrightCart documentation and connect it to relevant backend tools or APIs so it can retrieve live order data, check account status, and provide accurate, context-aware support responses. Design the experience around clear system instructions, structured tool calling, and conversation state management so the assistant stays on-brand, reliable, and safe when handling customer issues.</pre>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<h4>Response summary</h4>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<table>\n",
"<thead><tr><th>field</th><th>value</th></tr></thead>\n",
"<tbody>\n",
"<tr><td>id</td><td>resp_nmvqefzghd5hi67uwy4wfvhwnnzild3lslqsxqor3cat63kmucoq</td></tr>\n",
"<tr><td>model</td><td>openai.gpt-5.4</td></tr>\n",
"<tr><td>status</td><td>completed</td></tr>\n",
"<tr><td>output_item_types</td><td>['reasoning', 'message']</td></tr>\n",
"<tr><td>input_tokens</td><td>177</td></tr>\n",
"<tr><td>cached_input_tokens</td><td>0</td></tr>\n",
"<tr><td>output_tokens</td><td>129</td></tr>\n",
"<tr><td>total_tokens</td><td>306</td></tr>\n",
"<tr><td>reasoning_output_tokens</td><td>18</td></tr>\n",
"<tr><td>service_tier</td><td>default</td></tr>\n",
"</tbody>\n",
"</table>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<p><strong>Key takeaway:</strong> The SDK returns a response object with text, status, token usage, and output item metadata.</p>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from __future__ import annotations\n",
"sdk_text_payload = {\n",
"    \"model\": MODEL_ID,\n",
"    \"input\": \"Write a three-sentence overview for a developer building a BrightCart support assistant with the Responses API.\",\n",
"    \"reasoning\": {\"effort\": \"low\"},\n",
"    \"max_output_tokens\": 1024,\n",
"    \"store\": False,\n",
"}\n",
"\n",
"print_request_shape(sdk_text_payload)\n",
"try:\n",
"    text_response = create_response(**sdk_text_payload)\n",
"    sdk_text = output_text(text_response).strip()\n",
"    require(sdk_text, \"SDK text response did not return output text.\")\n",
"    record_check(\"Text generation\", \"pass\", summarize_response(text_response))\n",
"    record_check(\"Reasoning effort\", \"pass\", summarize_response(text_response))\n",
"    record_response(\"SDK text generation\", \"text\", sdk_text)\n",
"    print_labeled_text(\"Result\", sdk_text)\n",
"    print_response_summary(text_response)\n",
"    print_key_takeaway('The SDK returns a response object with text, status, token usage, and output item metadata.')\n",
"except Exception as exc:\n",
"    handle_example_error([\"Text generation\", \"Reasoning effort\"], exc)\n"
]
},
{
"cell_type": "markdown",
"id": "e781a295",
"metadata": {},
"source": [
"### 2.4 Create and Retrieve a Response\n",
"\n",
"The Responses API can store a response and retrieve it later by ID. This pattern is useful for audit trails, debugging, and follow-up turns that reference prior context. This cell creates a stored response, tracks the ID for cleanup, retrieves it, and compares the retrieved text and usage metadata.\n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "dc323795",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<h4>Request shape</h4>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<table>\n",
"<thead><tr><th>field</th><th>value</th></tr></thead>\n",
"<tbody>\n",
"<tr><td>model</td><td>openai.gpt-5.4</td></tr>\n",
"<tr><td>max_output_tokens</td><td>1024</td></tr>\n",
"<tr><td>store</td><td>True</td></tr>\n",
"<tr><td>input</td><td>BrightCart is building a support assistant for delayed replacement orders. Return exactly three labeled plain-text lines: goal, data needed, and human-review rule. Do not use leading hyphens or bold text.</td></tr>\n",
"</tbody>\n",
"</table>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<h4>Result</h4>\n",
"<pre>goal: Help support agents explain delayed replacement orders, set expectations, and suggest next steps.\n",
"data needed: Order ID, replacement order status, shipment/tracking events, delay reason, estimated ship/delivery date, customer contact history, inventory/backorder status, and applicable refund or reship policy.\n",
"human-review rule: Escalate to a human if the delay exceeds policy thresholds, tracking is inconsistent or missing, the order appears lost, the customer is high-risk or highly upset, or any refund/reship exception is requested.</pre>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<h4>Created response summary</h4>\n",
"<pre>{\n",
"  \"id\": \"resp_cvhvh7y5ghwrpa35snvk4bzgcgthgxp4tgwkllmf5mrhs7dikfia\",\n",
"  \"model\": \"openai.gpt-5.4\",\n",
"  \"status\": \"completed\",\n",
"  \"output_item_types\": [\n",
"    \"message\"\n",
"  ],\n",
"  \"input_tokens\": 198,\n",
"  \"output_tokens\": 109,\n",
"  \"total_tokens\": 307,\n",
"  \"cached_input_tokens\": 0,\n",
"  \"reasoning_output_tokens\": 0,\n",
"  \"service_tier\": \"default\"\n",
"}</pre>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<h4>Response summary</h4>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<table>\n",
"<thead><tr><th>field</th><th>value</th></tr></thead>\n",
"<tbody>\n",
"<tr><td>id</td><td>resp_cvhvh7y5ghwrpa35snvk4bzgcgthgxp4tgwkllmf5mrhs7dikfia</td></tr>\n",
"<tr><td>model</td><td>openai.gpt-5.4</td></tr>\n",
"<tr><td>status</td><td>completed</td></tr>\n",
"<tr><td>output_item_types</td><td>['message']</td></tr>\n",
"<tr><td>input_tokens</td><td>198</td></tr>\n",
"<tr><td>cached_input_tokens</td><td>0</td></tr>\n",
"<tr><td>output_tokens</td><td>109</td></tr>\n",
"<tr><td>total_tokens</td><td>307</td></tr>\n",
"<tr><td>reasoning_output_tokens</td><td>0</td></tr>\n",
"<tr><td>service_tier</td><td>default</td></tr>\n",
"</tbody>\n",
"</table>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<p><strong>Key takeaway:</strong> store=True lets an application retrieve the response later by ID with usage metadata intact.</p>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from __future__ import annotations\n",
"lifecycle_payload = {\n",
"    \"model\": MODEL_ID,\n",
"    \"input\": (\n",
"        \"BrightCart is building a support assistant for delayed replacement orders. \"\n",
"        \"Return exactly three labeled plain-text lines: goal, data needed, and human-review rule. Do not use leading hyphens or bold text.\"\n",
"    ),\n",
"    \"max_output_tokens\": 1024,\n",
"    \"store\": True,\n",
"}\n",
"\n",
"print_request_shape(lifecycle_payload)\n",
"try:\n",
"    lifecycle_response = create_response(**lifecycle_payload)\n",
"    remember_stored_response(lifecycle_response)\n",
"    retrieved_response = retrieve_response(lifecycle_response.id)\n",
"    retrieved_summary = summarize_response(retrieved_response)\n",
"    retrieved_text = output_text(retrieved_response).strip()\n",
"    require(retrieved_text, \"Retrieved response did not contain text output.\")\n",
"\n",
"    lifecycle_status = \"pass\" if retrieved_summary.get(\"status\") in {None, \"completed\"} else \"warn\"\n",
"    record_check(\"Responses lifecycle\", lifecycle_status, retrieved_response.id)\n",
"    record_check(\"Response schema\", \"pass\", retrieved_summary)\n",
"    record_check(\"Usage metadata\", \"pass\" if retrieved_summary.get(\"total_tokens\") is not None else \"warn\", retrieved_summary)\n",
"    record_response(\"Create and retrieve response\", \"text\", retrieved_text)\n",
"\n",
"    print_labeled_text(\"Result\", retrieved_text)\n",
"    print_labeled_json(\"Created response summary\", summarize_response(lifecycle_response))\n",
"    print_response_summary(retrieved_summary)\n",
"    print_key_takeaway('store=True lets an application retrieve the response later by ID with usage metadata intact.')\n",
"except Exception as exc:\n",
"    handle_example_error([\"Responses lifecycle\", \"Response schema\", \"Usage metadata\"], exc)\n"
]
},
{
"cell_type": "markdown",
"id": "609cd0ad",
"metadata": {},
"source": [
"### 2.5 Add Reasoning Effort, Service Tier, and Prompt Cache Parameters\n",
"\n",
"Model controls travel alongside the normal input. This request combines `reasoning.effort`, `service_tier`, `prompt_cache_key`, and `prompt_cache_retention` so you can see how operational controls and prompt-cache metadata appear in the same response schema as ordinary text output. Inspect `service_tier`, `cached_input_tokens`, reasoning token metadata, and total token usage.\n",
"\n",
"Note: This notebook uses `PROMPT_CACHE_RETENTION` instead of hard-coding `prompt_cache_retention`. The value is `in_memory` for `openai.gpt-5.4`, and `24h` for `openai.gpt-5.5` and later models because those models require extended prompt caching.\n"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "1e6f45d6",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<h4>Request shape</h4>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<table>\n",
"<thead><tr><th>field</th><th>value</th></tr></thead>\n",
"<tbody>\n",
"<tr><td>model</td><td>openai.gpt-5.4</td></tr>\n",
"<tr><td>max_output_tokens</td><td>1024</td></tr>\n",
"<tr><td>store</td><td>False</td></tr>\n",
"<tr><td>service_tier</td><td>auto</td></tr>\n",
"<tr><td>prompt_cache_key</td><td>brightcart-support-policy-guide</td></tr>\n",
"<tr><td>prompt_cache_retention</td><td>in_memory</td></tr>\n",
"<tr><td>reasoning</td><td>{'effort': 'low'}</td></tr>\n",
"<tr><td>input</td><td>For the BrightCart support assistant, explain prompt caching in exactly two labeled plain-text lines: one latency benefit and one consistency benefit.</td></tr>\n",
"</tbody>\n",
"</table>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<h4>Result</h4>\n",
"<pre>Latency benefit: Prompt caching lets the BrightCart support assistant reuse previously processed context, reducing response time for repeated or similar requests.\n",
"Consistency benefit: Prompt caching helps the BrightCart support assistant return more uniform answers by reusing the same established prompt context across interactions.</pre>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<h4>Response summary</h4>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<table>\n",
"<thead><tr><th>field</th><th>value</th></tr></thead>\n",
"<tbody>\n",
"<tr><td>id</td><td>resp_q4akwbeynfwfwnt5i4tdwkpcgsffdu4lqvng7lnwaob53opswvwq</td></tr>\n",
"<tr><td>model</td><td>openai.gpt-5.4</td></tr>\n",
"<tr><td>status</td><td>completed</td></tr>\n",
"<tr><td>output_item_types</td><td>['reasoning', 'message']</td></tr>\n",
"<tr><td>input_tokens</td><td>183</td></tr>\n",
"<tr><td>cached_input_tokens</td><td>0</td></tr>\n",
"<tr><td>output_tokens</td><td>91</td></tr>\n",
"<tr><td>total_tokens</td><td>274</td></tr>\n",
"<tr><td>reasoning_output_tokens</td><td>34</td></tr>\n",
"<tr><td>service_tier</td><td>default</td></tr>\n",
"</tbody>\n",
"</table>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<p><strong>Key takeaway:</strong> Model controls travel with the same request as normal input, while returned metadata can vary by endpoint.</p>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from __future__ import annotations\n",
"control_payload = {\n",
"    \"model\": MODEL_ID,\n",
"    \"input\": (\n",
"        \"For the BrightCart support assistant, explain prompt caching in exactly two labeled plain-text lines: \"\n",
"        \"one latency benefit and one consistency benefit.\"\n",
"    ),\n",
"    \"reasoning\": {\"effort\": \"low\"},\n",
"    \"prompt_cache_key\": \"brightcart-support-policy-guide\",\n",
"    \"prompt_cache_retention\": PROMPT_CACHE_RETENTION,\n",
"    \"service_tier\": \"auto\",\n",
"    \"max_output_tokens\": 1024,\n",
"    \"store\": False,\n",
"}\n",
"\n",
"print_request_shape(control_payload)\n",
"try:\n",
"    control_response = create_response(**control_payload)\n",
"    control_summary = summarize_response(control_response)\n",
"    control_text = output_text(control_response).strip()\n",
"    require(control_text, \"Control response did not return text.\")\n",
"    status = \"pass\" if control_summary.get(\"status\") in {None, \"completed\"} else \"warn\"\n",
"    record_check(\"Prompt caching\", \"pass\" if control_summary.get(\"cached_input_tokens\") is not None else \"warn\", control_summary)\n",
"    record_check(\"Service tier\", \"pass\" if control_summary.get(\"service_tier\") is not None else \"warn\", control_summary)\n",
"    record_check(\"Reasoning effort\", status, control_summary)\n",
"    record_response(\"Service tier and prompt cache request\", \"text\", control_text)\n",
"    print_labeled_text(\"Result\", control_text)\n",
"    print_response_summary(control_summary)\n",
"    print_key_takeaway('Model controls travel with the same request as normal input, while returned metadata can vary by endpoint.')\n",
"except Exception as exc:\n",
"    handle_example_error([\"Prompt caching\", \"Service tier\", \"Reasoning effort\"], exc)\n"
]
},
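{
"cell_type": "markdown",
"id": "e5a81c37",
"metadata": {},
"source": [
"The retention rule in the note for this section (`in_memory` for `openai.gpt-5.4`, `24h` for `openai.gpt-5.5` and later) can be sketched as a small selector. This is not how the notebook derives `PROMPT_CACHE_RETENTION`: the function names are hypothetical, and falling back to `in_memory` for model IDs without a recognizable version or older than 5.5 is an assumption made only for this illustration.\n",
"\n",
"```python\n",
"from itertools import takewhile\n",
"from typing import Optional, Tuple\n",
"\n",
"\n",
"def leading_int(text: str) -> Optional[int]:\n",
"    # Read the digits at the start of text, or return None if there are none.\n",
"    digits = ''.join(takewhile(str.isdigit, text))\n",
"    return int(digits) if digits else None\n",
"\n",
"\n",
"def parse_model_version(model_id: str) -> Optional[Tuple[int, int]]:\n",
"    # 'openai.gpt-5.4' becomes (5, 4); a missing minor version counts as 0.\n",
"    _, found, tail = model_id.rpartition('gpt-')\n",
"    if not found:\n",
"        return None\n",
"    major_text, _, minor_text = tail.partition('.')\n",
"    major = leading_int(major_text)\n",
"    if major is None:\n",
"        return None\n",
"    return major, leading_int(minor_text) or 0\n",
"\n",
"\n",
"def pick_prompt_cache_retention(model_id: str) -> str:\n",
"    version = parse_model_version(model_id)\n",
"    if version is None:\n",
"        return 'in_memory'  # assumption: unknown IDs keep the lighter setting\n",
"    # Tuple comparison orders 5.10 after 5.5, which string comparison would not.\n",
"    return '24h' if version >= (5, 5) else 'in_memory'\n",
"\n",
"\n",
"for candidate in ['openai.gpt-5.4', 'openai.gpt-5.5']:\n",
"    print(candidate, pick_prompt_cache_retention(candidate))\n",
"```\n",
"\n",
"Deriving the value from the model ID keeps `prompt_cache_retention` aligned when `BEDROCK_MODEL` changes, instead of relying on a hard-coded string in each payload."
]
},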
{
"cell_type": "markdown",
"id": "6eb8aaa7",
"metadata": {},
"source": [
"## 3. Generate Structured JSON\n",
"\n",
"Structured JSON turns model output into data that application code can parse, validate, and route. This section compares strict schema-constrained output with lighter JSON mode. Use Structured Outputs when your application needs a contract; use JSON mode when valid JSON is enough but the exact schema can remain flexible.\n",
"\n",
"\n"
]
},
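{
"cell_type": "markdown",
"id": "f2c6904b",
"metadata": {},
"source": [
"The difference between the two modes shows up in the `text.format` field of the request. The sketch below builds both shapes side by side without calling the endpoint. The helper names and the one-field `ticket_schema` are hypothetical, and the `json_object` format type is taken from the OpenAI Responses API; treat its availability on a given Bedrock model as something to confirm in your environment.\n",
"\n",
"```python\n",
"import json\n",
"from typing import Any, Dict\n",
"\n",
"\n",
"def schema_format(name: str, schema: Dict[str, Any]) -> Dict[str, Any]:\n",
"    # Structured Outputs: the schema travels with the request as a strict contract.\n",
"    return {'format': {'type': 'json_schema', 'name': name, 'strict': True, 'schema': schema}}\n",
"\n",
"\n",
"def json_mode_format() -> Dict[str, Any]:\n",
"    # JSON mode: asks for valid JSON, with no schema attached.\n",
"    return {'format': {'type': 'json_object'}}\n",
"\n",
"\n",
"ticket_schema = {\n",
"    'type': 'object',\n",
"    'properties': {'priority': {'type': 'string', 'enum': ['low', 'medium', 'high', 'urgent']}},\n",
"    'required': ['priority'],\n",
"    'additionalProperties': False,\n",
"}\n",
"\n",
"strict_text = schema_format('ticket_priority', ticket_schema)\n",
"loose_text = json_mode_format()\n",
"\n",
"\n",
"def check_priority(raw_text: str) -> Dict[str, Any]:\n",
"    # In JSON mode the application has to enforce field rules itself.\n",
"    data = json.loads(raw_text)\n",
"    allowed = ticket_schema['properties']['priority']['enum']\n",
"    if data.get('priority') not in allowed:\n",
"        raise ValueError('priority missing or outside the allowed values')\n",
"    return data\n",
"\n",
"\n",
"print(json.dumps(loose_text))\n",
"print(check_priority(json.dumps({'priority': 'urgent'})))\n",
"```\n",
"\n",
"Either dictionary would be passed as the `text` field of a Responses payload. With JSON mode, a check like `check_priority` carries the contract that the schema would otherwise express."
]
},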
{
"cell_type": "markdown",
"id": "5592a8ac",
"metadata": {},
"source": [
"### 3.1 Define the Structured Output Schema\n",
"\n",
"Define the support-ticket schema used by the next live request. The schema lists the exact fields the application expects, including category, priority, sentiment, summary, required actions, and escalation status. Inspect the request shape to see how `text.format.type=\"json_schema\"`, `strict=true`, and the JSON Schema are attached to a normal Responses request.\n"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "9daa5861",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<h4>Request shape</h4>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<table>\n",
"<thead><tr><th>field</th><th>value</th></tr></thead>\n",
"<tbody>\n",
"<tr><td>model</td><td>openai.gpt-5.4</td></tr>\n",
"<tr><td>max_output_tokens</td><td>1024</td></tr>\n",
"<tr><td>store</td><td>False</td></tr>\n",
"<tr><td>text format</td><td>json_schema: support_ticket_triage strict=True required=7 fields</td></tr>\n",
"<tr><td>input</td><td>Support ticket TICKET-7429: Maya Chen says ORDER-8831 is a replacement for a damaged standing desk. The replacement is two days late, the carrier scan has not moved, and she needs the desk before Monday. She asks for a supervisor callback and refund options. T...</td></tr>\n",
"</tbody>\n",
"</table>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<p><strong>Key takeaway:</strong> The schema is part of the request and defines the fields the next cell validates.</p>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from __future__ import annotations\n",
"support_triage_schema = {\n",
"    \"type\": \"object\",\n",
"    \"properties\": {\n",
"        \"ticket_id\": {\"type\": \"string\"},\n",
"        \"category\": {\"type\": \"string\", \"enum\": [\"delivery_delay\", \"return_exchange\", \"damaged_item\", \"billing\", \"account\"]},\n",
"        \"priority\": {\"type\": \"string\", \"enum\": [\"low\", \"medium\", \"high\", \"urgent\"]},\n",
"        \"customer_sentiment\": {\"type\": \"string\"},\n",
"        \"summary\": {\"type\": \"string\"},\n",
"        \"required_actions\": {\"type\": \"array\", \"items\": {\"type\": \"string\"}, \"minItems\": 2},\n",
"        \"escalation_needed\": {\"type\": \"boolean\"},\n",
"    },\n",
"    \"required\": [\"ticket_id\", \"category\", \"priority\", \"customer_sentiment\", \"summary\", \"required_actions\", \"escalation_needed\"],\n",
"    \"additionalProperties\": False,\n",
"}\n",
"\n",
"structured_payload = {\n",
"    \"model\": MODEL_ID,\n",
"    \"input\": (\n",
"        \"Support ticket TICKET-7429: Maya Chen says ORDER-8831 is a replacement for a damaged standing desk. \"\n",
"        \"The replacement is two days late, the carrier scan has not moved, and she needs the desk before Monday. \"\n",
"        \"She asks for a supervisor callback and refund options. Triage this ticket for the next support agent.\"\n",
"    ),\n",
"    \"text\": {\"format\": {\"type\": \"json_schema\", \"name\": \"support_ticket_triage\", \"strict\": True, \"schema\": support_triage_schema}},\n",
"    \"max_output_tokens\": 1024,\n",
"    \"store\": False,\n",
"}\n",
"\n",
"print_request_shape(structured_payload)\n",
"print_key_takeaway('The schema is part of the request and defines the fields the next cell validates.')\n"
]
},
{
"cell_type": "markdown",
"id": "6a1dea0f",
"metadata": {},
"source": [
"### 3.2 Validate Schema-Constrained Output\n",
"\n",
"Call the model with the schema from the previous cell, parse the returned text as JSON, and validate important fields in Python. The API request asks for schema adherence, while application-side validation still checks that the returned object is suitable for downstream routing. Inspect the parsed object and the response summary.\n"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "8b813739",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<h4>Request shape</h4>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<table>\n",
"<thead><tr><th>field</th><th>value</th></tr></thead>\n",
"<tbody>\n",
"<tr><td>model</td><td>openai.gpt-5.4</td></tr>\n",
"<tr><td>max_output_tokens</td><td>1024</td></tr>\n",
"<tr><td>store</td><td>False</td></tr>\n",
"<tr><td>text format</td><td>json_schema: support_ticket_triage strict=True required=7 fields</td></tr>\n",
"<tr><td>input</td><td>Support ticket TICKET-7429: Maya Chen says ORDER-8831 is a replacement for a damaged standing desk. The replacement is two days late, the carrier scan has not moved, and she needs the desk before Monday. She asks for a supervisor callback and refund options. T...</td></tr>\n",
"</tbody>\n",
"</table>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<h4>Result</h4>\n",
"<pre>{\n",
"  \"ticket_id\": \"TICKET-7429\",\n",
"  \"category\": \"delivery_delay\",\n",
"  \"priority\": \"urgent\",\n",
"  \"customer_sentiment\": \"frustrated and time-sensitive\",\n",
"  \"summary\": \"Customer Maya Chen reports that ORDER-8831 is a replacement shipment for a previously damaged standing desk. The replacement is now 2 days late, carrier tracking has not updated, and she needs the desk delivered before Monday. She is requesting a supervisor callback and wants to know refund options if the replacement cannot arrive in time.\",\n",
"  \"required_actions\": [\n",
"    \"Review ORDER-8831 shipment status and confirm last carrier scan/update.\",\n",
"    \"Contact carrier or open a trace/escalation for stalled tracking.\",\n",
"    \"Check expedited reshipment or alternative fulfillment options to meet the before-Monday deadline.\",\n",
"    \"Arrange supervisor callback per customer request.\",\n",
"    \"Review and communicate refund options, including refund for replacement order and any prior damaged-item resolution details.\",\n",
"    \"Verify whether replacement shipment should be intercepted/returned if a refund or reshipment is approved.\"\n",
"  ],\n",
"  \"escalation_needed\": true\n",
"}</pre>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<p><strong>Response summary</strong></p>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr><th>field</th><th>value</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><td>id</td><td>resp_rk7y6aobdqx2m2fjpnllihuoyku2n55gj5o2h5jxl7lru77mwwiq</td></tr>\n",
"    <tr><td>model</td><td>openai.gpt-5.4</td></tr>\n",
"    <tr><td>status</td><td>completed</td></tr>\n",
"    <tr><td>output_item_types</td><td>['message']</td></tr>\n",
"    <tr><td>input_tokens</td><td>328</td></tr>\n",
"    <tr><td>cached_input_tokens</td><td>0</td></tr>\n",
"    <tr><td>output_tokens</td><td>200</td></tr>\n",
"    <tr><td>total_tokens</td><td>528</td></tr>\n",
"    <tr><td>reasoning_output_tokens</td><td>0</td></tr>\n",
"    <tr><td>service_tier</td><td>default</td></tr>\n",
"  </tbody>\n",
"</table>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<p><strong>Key takeaway:</strong> Schema-constrained output gives application code a predictable JSON object to parse and validate.</p>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from __future__ import annotations\n",
"\n",
"\n",
"def validate_support_triage(payload: dict[str, Any]) -> dict[str, Any]:\n",
"    # Application-side checks that complement the schema adherence requested from the API.\n",
"    require(\"ticket_id\" in payload, \"Missing key: ticket_id\")\n",
"    require(payload.get(\"ticket_id\") == \"TICKET-7429\", \"Ticket ID did not match expected value.\")\n",
"    require(\"required_actions\" in payload, \"Missing key: required_actions\")\n",
"    require(isinstance(payload.get(\"required_actions\"), builtins.list), \"required_actions must be a list.\")\n",
"    require(len(payload[\"required_actions\"]) >= 2, \"required_actions should contain at least two actions.\")\n",
"    return payload\n",
"\n",
"\n",
"print_request_shape(structured_payload)\n",
"try:\n",
"    structured_response = create_response(**structured_payload)\n",
"    raw_structured_text = output_text(structured_response).strip()\n",
"    try:\n",
"        structured_payload_result = validate_support_triage(json.loads(raw_structured_text))\n",
"        record_check(\"Structured Outputs\", \"pass\", structured_payload_result)\n",
"        record_response(\"Structured ticket triage\", \"json\", structured_payload_result)\n",
"        print_labeled_json(\"Result\", structured_payload_result)\n",
"    except Exception as parse_exc:\n",
"        # Invalid JSON (json.JSONDecodeError) and failed validation both land here, so the\n",
"        # cell records a warning and shows a text sample instead of aborting the example.\n",
"        record_check(\"Structured Outputs\", \"warn\", {\"message\": \"Response did not match the expected schema shape.\", \"text_sample\": raw_structured_text[:600], \"error\": builtins.str(parse_exc)})\n",
"        print_labeled_text(\"Result\", \"The request completed, but the returned text did not match the expected schema shape.\")\n",
"        print_wrapped(raw_structured_text[:1200])\n",
"    print_response_summary(structured_response)\n",
"    print_key_takeaway('Schema-constrained output gives application code a predictable JSON object to parse and validate.')\n",
"except Exception as exc:\n",
"    handle_example_error(\"Structured Outputs\", exc)"
]
},
{
"cell_type": "markdown",
"id": "271d6dec",
"metadata": {},
"source": [
"### 3.3 Use JSON Mode\n",
"\n",
"JSON mode asks the model to return a valid JSON object without enforcing a strict schema. This is useful for lightweight handoffs where you still want parsable output but can tolerate a looser contract. This cell requests a support handoff object, parses it, and checks for the expected keys.\n"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "635266ce",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<p><strong>Request shape</strong></p>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr><th>field</th><th>value</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><td>model</td><td>openai.gpt-5.4</td></tr>\n",
"    <tr><td>max_output_tokens</td><td>1024</td></tr>\n",
"    <tr><td>store</td><td>False</td></tr>\n",
"    <tr><td>text format</td><td>json_object</td></tr>\n",
"    <tr><td>input</td><td>Return JSON for a support chat handoff with keys customer_name, order_id, issue_summary, next_step, and metrics_to_watch. Context: Maya Chen asks about delayed replacement order ORDER-8831; the carrier scan is stale. metrics_to_watch should be an array.</td></tr>\n",
"  </tbody>\n",
"</table>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<div>\n",
"  <p><strong>Result</strong></p>\n",
"  <pre>{\n",
"  \"customer_name\": \"Maya Chen\",\n",
"  \"order_id\": \"ORDER-8831\",\n",
"  \"issue_summary\": \"Customer is asking about a delayed replacement order. The carrier tracking scan is stale and has not updated.\",\n",
"  \"next_step\": \"Handoff to support to investigate the carrier delay, verify shipment status, and provide Maya Chen with an update or resolution.\",\n",
"  \"metrics_to_watch\": [\n",
"    \"tracking_scan_recency\",\n",
"    \"carrier_exception_status\",\n",
"    \"replacement_order_delivery_eta\",\n",
"    \"customer_follow_up_time\"\n",
"  ]\n",
"}</pre>\n",
"</div>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<p><strong>Response summary</strong></p>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr><th>field</th><th>value</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><td>id</td><td>resp_2wloqig6bh6sz2ysyiozx7oon7hogdzavqnv4pn7wcmnfkxnmlra</td></tr>\n",
"    <tr><td>model</td><td>openai.gpt-5.4</td></tr>\n",
"    <tr><td>status</td><td>completed</td></tr>\n",
"    <tr><td>output_item_types</td><td>['message']</td></tr>\n",
"    <tr><td>input_tokens</td><td>212</td></tr>\n",
"    <tr><td>cached_input_tokens</td><td>0</td></tr>\n",
"    <tr><td>output_tokens</td><td>122</td></tr>\n",
"    <tr><td>total_tokens</td><td>334</td></tr>\n",
"    <tr><td>reasoning_output_tokens</td><td>0</td></tr>\n",
"    <tr><td>service_tier</td><td>default</td></tr>\n",
"  </tbody>\n",
"</table>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<p><strong>Key takeaway:</strong> JSON mode is useful when valid JSON is enough and a strict schema is not required.</p>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from __future__ import annotations\n",
"json_mode_payload = {\n",
"    \"model\": MODEL_ID,\n",
"    \"input\": (\n",
"        \"Return JSON for a support chat handoff with keys customer_name, order_id, issue_summary, next_step, \"\n",
"        \"and metrics_to_watch. Context: Maya Chen asks about delayed replacement order ORDER-8831; the carrier scan is stale. \"\n",
"        \"metrics_to_watch should be an array.\"\n",
"    ),\n",
"    \"text\": {\"format\": {\"type\": \"json_object\"}},\n",
"    \"max_output_tokens\": 1024,\n",
"    \"store\": False,\n",
"}\n",
"\n",
"print_request_shape(json_mode_payload)\n",
"try:\n",
"    json_mode_response = create_response(**json_mode_payload)\n",
"    payload = json.loads(output_text(json_mode_response).strip())\n",
"    require({\"customer_name\", \"order_id\", \"issue_summary\", \"next_step\", \"metrics_to_watch\"}.issubset(payload), \"JSON mode response missed required keys.\")\n",
"    record_check(\"JSON mode\", \"pass\", payload)\n",
"    record_response(\"JSON support handoff\", \"json\", payload)\n",
"    print_labeled_json(\"Result\", payload)\n",
"    print_response_summary(json_mode_response)\n",
"    print_key_takeaway('JSON mode is useful when valid JSON is enough and a strict schema is not required.')\n",
"except Exception as exc:\n",
"    handle_example_error(\"JSON mode\", exc)\n"
]
},
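{
"cell_type": "markdown",
"id": "b7d2f0a1",
"metadata": {},
"source": [
"Because JSON mode asks only for a valid JSON object, key presence and value types remain the application's responsibility. The sketch below is one possible stdlib-only check and is not part of this notebook's helper set: `check_handoff` and `EXPECTED_TYPES` are hypothetical names, and the expected types are assumptions drawn from the prompt in the cell above.\n",
"\n",
"```python\n",
"import json\n",
"\n",
"# Assumed contract for the handoff object (hypothetical, based on the prompt above).\n",
"EXPECTED_TYPES = {\n",
"    'customer_name': str,\n",
"    'order_id': str,\n",
"    'issue_summary': str,\n",
"    'next_step': str,\n",
"    'metrics_to_watch': list,\n",
"}\n",
"\n",
"\n",
"def check_handoff(raw_text, expected_types=EXPECTED_TYPES):\n",
"    # Return a list of problems; an empty list means the object looks usable.\n",
"    try:\n",
"        payload = json.loads(raw_text)\n",
"    except json.JSONDecodeError as exc:\n",
"        return [f'invalid JSON: {exc.msg}']\n",
"    if not isinstance(payload, dict):\n",
"        return ['top-level value is not an object']\n",
"    problems = []\n",
"    for key, expected in expected_types.items():\n",
"        if key not in payload:\n",
"            problems.append(f'missing key: {key}')\n",
"        elif not isinstance(payload[key], expected):\n",
"            problems.append(f'{key} should be {expected.__name__}')\n",
"    return problems\n",
"\n",
"\n",
"sample = json.dumps({\n",
"    'customer_name': 'Maya Chen',\n",
"    'order_id': 'ORDER-8831',\n",
"    'issue_summary': 'Replacement order is delayed.',\n",
"    'next_step': 'Check the carrier scan.',\n",
"    'metrics_to_watch': ['tracking_scan_recency'],\n",
"})\n",
"print(check_handoff(sample))  # prints []\n",
"```\n",
"\n",
"Returning a list of problems instead of raising lets the caller decide whether to retry the request, repair the object, or route the handoff to a human.\n"
]
},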
{
"cell_type": "markdown",
"id": "1be28f8c",
"metadata": {},
"source": [
"### 3.4 Control Verbosity Separately from Reasoning Effort\n",
"\n",
"Verbosity controls help tune the shape of generated prose, while reasoning effort controls how much reasoning work the model spends before answering. The notebook demonstrates `reasoning.effort` in the SDK and model-control cells above; this cell focuses on `text.verbosity` by sending compact and detailed versions of the same policy topic. Inspect the side-by-side text and token summaries to compare style and usage.\n"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "5f0850ca",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"  <p><strong>Request shape</strong></p>\n",
"  <pre>{\n",
"  \"compact\": {\n",
"    \"model\": \"openai.gpt-5.4\",\n",
"    \"input\": \"Explain BrightCart's delayed-replacement policy to a new support agent. Reply in one sentence under 35 words.\",\n",
"    \"text\": {\n",
"      \"verbosity\": \"low\"\n",
"    },\n",
"    \"max_output_tokens\": 1024,\n",
"    \"store\": false\n",
"  },\n",
"  \"detailed\": {\n",
"    \"model\": \"openai.gpt-5.4\",\n",
"    \"input\": \"Explain BrightCart's delayed-replacement policy to a new support agent. Reply in exactly three numbered plain-text lines, each under 18 words. Do not use leading hyphens or bold text.\",\n",
"    \"text\": {\n",
"      \"verbosity\": \"high\"\n",
"    },\n",
"    \"max_output_tokens\": 1024,\n",
"    \"store\": false\n",
"  }\n",
"}</pre>\n",
"</div>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<div>\n",
"  <p><strong>Result: compact guidance</strong></p>\n",
"  <pre>BrightCart’s delayed-replacement policy lets customers keep using the original item until the replacement arrives, then return the defective product within the allowed return window.</pre>\n",
"</div>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<div>\n",
"  <p><strong>Result: detailed guidance</strong></p>\n",
"  <pre>1. BrightCart sends replacements after customers return the original item and warehouse receipt is confirmed.\n",
"2. This delay prevents duplicate shipments, verifies eligibility, and reduces fraud or inventory errors.\n",
"3. Agents should explain timelines clearly, offer return instructions, and reassure customers once receipt is logged.</pre>\n",
"</div>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<p><strong>Response summary</strong></p>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr><th>request</th><th>id</th><th>model</th><th>status</th><th>output_item_types</th><th>input_tokens</th><th>output_tokens</th><th>total_tokens</th><th>cached_input_tokens</th><th>reasoning_output_tokens</th><th>service_tier</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><td>compact</td><td>resp_m33l24gl3wl55lwqxhnrqkrf34ikc4rex4sazo2pnqpp4malnliq</td><td>openai.gpt-5.4</td><td>completed</td><td>[message]</td><td>180</td><td>34</td><td>214</td><td>0</td><td>0</td><td>default</td></tr>\n",
"    <tr><td>detailed</td><td>resp_icqbez74xlmlvw4kl3yf2qtbutfopfefayzftragcwmhu4cohnua</td><td>openai.gpt-5.4</td><td>completed</td><td>[message]</td><td>197</td><td>60</td><td>257</td><td>0</td><td>0</td><td>default</td></tr>\n",
"  </tbody>\n",
"</table>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<p><strong>Key takeaway:</strong> Verbosity controls tune the answer style while the prompt still bounds the output.</p>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from __future__ import annotations\n",
"verbosity_prompt = \"Explain BrightCart's delayed-replacement policy to a new support agent.\"\n",
"compact_payload = {\n",
"    \"model\": MODEL_ID,\n",
"    \"input\": verbosity_prompt + \" Reply in one sentence under 35 words.\",\n",
"    \"text\": {\"verbosity\": \"low\"},\n",
"    \"max_output_tokens\": 1024,\n",
"    \"store\": False,\n",
"}\n",
"detailed_payload = {\n",
"    \"model\": MODEL_ID,\n",
"    \"input\": verbosity_prompt + \" Reply in exactly three numbered plain-text lines, each under 18 words. Do not use leading hyphens or bold text.\",\n",
"    \"text\": {\"verbosity\": \"high\"},\n",
"    \"max_output_tokens\": 1024,\n",
"    \"store\": False,\n",
"}\n",
"\n",
"print_labeled_json(\"Request shape\", {\n",
"    \"compact\": redact_payload(compact_payload),\n",
"    \"detailed\": redact_payload(detailed_payload),\n",
"})\n",
"try:\n",
"    compact_response = create_response(**compact_payload)\n",
"    detailed_response = create_response(**detailed_payload)\n",
"    compact_guidance_text = output_text(compact_response).strip()\n",
"    detailed_guidance_text = output_text(detailed_response).strip()\n",
"    require(compact_guidance_text and detailed_guidance_text, \"Verbosity responses did not return text.\")\n",
"    compact_summary = summarize_response(compact_response)\n",
"    detailed_summary = summarize_response(detailed_response)\n",
"    status = \"pass\" if compact_summary.get(\"status\") in {None, \"completed\"} and detailed_summary.get(\"status\") in {None, \"completed\"} else \"warn\"\n",
"    record_check(\"Verbosity\", status, {\"compact_chars\": len(compact_guidance_text), \"detailed_chars\": len(detailed_guidance_text)})\n",
"    record_response(\"Compact policy guidance\", \"text\", compact_guidance_text)\n",
"    record_response(\"Detailed policy guidance\", \"text\", detailed_guidance_text)\n",
"    print_labeled_text(\"Result: compact guidance\", compact_guidance_text)\n",
"    print_labeled_text(\"Result: detailed guidance\", detailed_guidance_text)\n",
"    verbosity_summary = pd.DataFrame([\n",
"        {\"request\": \"compact\", **compact_summary},\n",
"        {\"request\": \"detailed\", **detailed_summary},\n",
"    ])\n",
"    print_label(\"Response summary\")\n",
"    display_wrapped_table(verbosity_summary, max_col_width_px=420)\n",
"    print_key_takeaway('Verbosity controls tune the answer style while the prompt still bounds the output.')\n",
"except Exception as exc:\n",
"    handle_example_error(\"Verbosity\", exc)\n"
]
},
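{
"cell_type": "markdown",
"id": "c4e8a92f",
"metadata": {},
"source": [
"Since the prompt, not `text.verbosity`, sets the hard bounds in the cell above (one sentence under 35 words, three numbered lines under 18 words each), an application may want to check those bounds itself. The following is a rough sketch under stated assumptions: `count_words`, `is_compact`, and `is_numbered_list` are hypothetical helpers, words are split on whitespace, and a single line stands in for a single sentence.\n",
"\n",
"```python\n",
"def count_words(text):\n",
"    # Whitespace-delimited word count; punctuation stays attached to words.\n",
"    return len(text.split())\n",
"\n",
"\n",
"def is_compact(text, max_words=35):\n",
"    # Exactly one non-empty line and strictly under the word budget.\n",
"    lines = [line for line in text.strip().splitlines() if line.strip()]\n",
"    return len(lines) == 1 and count_words(text) < max_words\n",
"\n",
"\n",
"def is_numbered_list(text, expected_lines=3, max_words=18):\n",
"    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]\n",
"    if len(lines) != expected_lines:\n",
"        return False\n",
"    for number, line in enumerate(lines, start=1):\n",
"        prefix = f'{number}.'\n",
"        # Assumption: the numeric prefix does not count toward the word budget.\n",
"        if not line.startswith(prefix):\n",
"            return False\n",
"        if count_words(line[len(prefix):]) >= max_words:\n",
"            return False\n",
"    return True\n",
"\n",
"\n",
"sample = '''1. Check the carrier scan first.\n",
"2. Offer options once the delay passes 48 hours.\n",
"3. Log the outcome on the ticket.'''\n",
"print(is_numbered_list(sample))  # prints True\n",
"```\n",
"\n",
"Checks like these are deliberately shallow. They catch format drift across verbosity settings without judging whether the guidance itself is correct.\n"
]
},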
{
"cell_type": "markdown",
"id": "ed8b4b11",
"metadata": {},
"source": [
"## 4. Add Application-Managed Tools\n",
"\n",
"Function calling lets the model ask your application for data or actions, but your code remains responsible for executing tools and returning results. This section defines local BrightCart tools, then walks through a single function call, multiple independent calls, and a custom text tool. The examples keep tool outputs deterministic so the request loop is easy to inspect.\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "1c543f50",
"metadata": {},
"source": [
"### 4.1 Define Local Tool Schemas and Functions\n",
"\n",
"Define local sample tools for order status and customer profile lookups. The tool schemas describe the names, descriptions, argument shapes, required fields, and strictness that the model can use when deciding what to call. The Python functions stand in for application systems such as order management, CRM, or policy services.\n"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "dce0bb66",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Sample function tools:\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"  <p><strong>JSON</strong></p>\n",
"  <pre>[\n",
"  \"get_order_status\",\n",
"  \"get_customer_profile\"\n",
"]</pre>\n",
"</div>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Sample order lookup:\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"  <p><strong>JSON</strong></p>\n",
"  <pre>{\n",
"  \"order_id\": \"ORDER-8831\",\n",
"  \"customer_id\": \"CUST-1042\",\n",
"  \"item\": \"standing desk replacement\",\n",
"  \"status\": \"delayed\",\n",
"  \"carrier_scan\": \"No movement for 36 hours at Denver sort center\",\n",
"  \"promised_delivery\": \"2026-06-01\",\n",
"  \"recommended_policy\": \"If delay exceeds 48 hours, offer expedited replacement or 15% concession with agent approval.\"\n",
"}</pre>\n",
"</div>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from __future__ import annotations\n",
"function_tools = [\n",
"    {\n",
"        \"type\": \"function\",\n",
"        \"name\": \"get_order_status\",\n",
"        \"description\": \"Look up a sample BrightCart order status.\",\n",
"        \"parameters\": {\n",
"            \"type\": \"object\",\n",
"            \"properties\": {\"order_id\": {\"type\": \"string\", \"description\": \"An order ID such as ORDER-8831.\"}},\n",
"            \"required\": [\"order_id\"],\n",
"            \"additionalProperties\": False,\n",
"        },\n",
"        \"strict\": True,\n",
"    },\n",
"    {\n",
"        \"type\": \"function\",\n",
"        \"name\": \"get_customer_profile\",\n",
"        \"description\": \"Look up sample customer context for a BrightCart support interaction.\",\n",
"        \"parameters\": {\n",
"            \"type\": \"object\",\n",
"            \"properties\": {\"customer_id\": {\"type\": \"string\", \"description\": \"A customer ID such as CUST-1042.\"}},\n",
"            \"required\": [\"customer_id\"],\n",
"            \"additionalProperties\": False,\n",
"        },\n",
"        \"strict\": True,\n",
"    },\n",
"]\n",
"\n",
"\n",
"def get_order_status(order_id: str) -> dict[str, Any]:\n",
"    orders = {\n",
"        \"ORDER-8831\": {\n",
"            \"order_id\": \"ORDER-8831\",\n",
"            \"customer_id\": \"CUST-1042\",\n",
"            \"item\": \"standing desk replacement\",\n",
"            \"status\": \"delayed\",\n",
"            \"carrier_scan\": \"No movement for 36 hours at Denver sort center\",\n",
"            \"promised_delivery\": (date.today() + timedelta(days=2)).isoformat(),\n",
"            \"recommended_policy\": \"If delay exceeds 48 hours, offer expedited replacement or 15% concession with agent approval.\",\n",
"        },\n",
"        \"ORDER-2044\": {\n",
"            \"order_id\": \"ORDER-2044\",\n",
"            \"customer_id\": \"CUST-1042\",\n",
"            \"item\": \"ergonomic chair\",\n",
"            \"status\": \"delivered\",\n",
"            \"carrier_scan\": \"Delivered yesterday at front desk\",\n",
"            \"promised_delivery\": (date.today() - timedelta(days=1)).isoformat(),\n",
"            \"recommended_policy\": \"Confirm delivery details before opening a replacement request.\",\n",
"        },\n",
"    }\n",
"    return orders.get(order_id, {\"order_id\": order_id, \"status\": \"unknown\", \"customer_id\": None})\n",
"\n",
"\n",
"def get_customer_profile(customer_id: str) -> dict[str, Any]:\n",
"    profiles = {\n",
"        \"CUST-1042\": {\n",
"            \"customer_id\": \"CUST-1042\",\n",
"            \"name\": \"Maya Chen\",\n",
"            \"loyalty_tier\": \"Gold\",\n",
"            \"region\": \"California\",\n",
"            \"recent_issue\": \"Damaged standing desk replacement\",\n",
"            \"contact_preference\": \"email with SMS updates for shipping changes\",\n",
"        }\n",
"    }\n",
"    return profiles.get(customer_id, {\"customer_id\": customer_id, \"loyalty_tier\": \"unknown\"})\n",
"\n",
"\n",
"def dispatch_tool_call(call: dict[str, Any]) -> dict[str, Any]:\n",
"    name = call[\"name\"]\n",
"    args = json.loads(call[\"arguments\"])\n",
"    if name == \"get_order_status\":\n",
"        output = get_order_status(**args)\n",
"    elif name == \"get_customer_profile\":\n",
"        output = get_customer_profile(**args)\n",
"    else:\n",
"        raise ValueError(f\"Unsupported tool: {name}\")\n",
"    return {\"type\": \"function_call_output\", \"call_id\": call[\"call_id\"], \"output\": json.dumps(output)}\n",
"\n",
"print(\"Sample function tools:\")\n",
"print_json([tool[\"name\"] for tool in function_tools])\n",
"print(\"\\nSample order lookup:\")\n",
"print_json(get_order_status(\"ORDER-8831\"))"
]
},
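{
"cell_type": "markdown",
"id": "d91f3b6e",
"metadata": {},
"source": [
"As the tool list grows, the `if`/`elif` chain in `dispatch_tool_call` can be swapped for a lookup table. The sketch below is one possible variant, not the notebook's implementation: `TOOL_REGISTRY`, `dispatch_with_registry`, `lookup_order`, and `lookup_customer` are hypothetical names, and returning an `error` object to the model instead of raising is an assumed design choice.\n",
"\n",
"```python\n",
"import json\n",
"\n",
"\n",
"def lookup_order(order_id):\n",
"    # Hypothetical stand-in for an order-management lookup.\n",
"    return {'order_id': order_id, 'status': 'delayed'}\n",
"\n",
"\n",
"def lookup_customer(customer_id):\n",
"    # Hypothetical stand-in for a CRM lookup.\n",
"    return {'customer_id': customer_id, 'loyalty_tier': 'Gold'}\n",
"\n",
"\n",
"# Map tool names from the schema to the callables that implement them.\n",
"TOOL_REGISTRY = {\n",
"    'get_order_status': lookup_order,\n",
"    'get_customer_profile': lookup_customer,\n",
"}\n",
"\n",
"\n",
"def dispatch_with_registry(call, registry=TOOL_REGISTRY):\n",
"    handler = registry.get(call['name'])\n",
"    if handler is None:\n",
"        output = {'error': 'unsupported tool: ' + call['name']}\n",
"    else:\n",
"        try:\n",
"            output = handler(**json.loads(call['arguments']))\n",
"        except (json.JSONDecodeError, TypeError) as exc:\n",
"            # Malformed JSON or unexpected argument names end up here.\n",
"            output = {'error': f'bad arguments: {exc}'}\n",
"    return {\n",
"        'type': 'function_call_output',\n",
"        'call_id': call['call_id'],\n",
"        'output': json.dumps(output),\n",
"    }\n",
"\n",
"\n",
"example_call = {\n",
"    'name': 'get_order_status',\n",
"    'call_id': 'call_1',\n",
"    'arguments': json.dumps({'order_id': 'ORDER-8831'}),\n",
"}\n",
"print(dispatch_with_registry(example_call)['output'])\n",
"```\n",
"\n",
"Reporting failures as tool output gives the model a chance to correct its arguments on the next turn. Raising, as the cell above does, is the simpler choice when an unknown tool should stop the run.\n"
]
},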
{
"cell_type": "markdown",
"id": "e1c9e1db",
"metadata": {},
"source": [
"### 4.2 Call a Function Tool\n",
"\n",
"This cell runs the basic function-calling loop. The first request gives the model an order-status tool and asks it to choose arguments. The application parses the returned `function_call`, runs the local Python function, sends a `function_call_output` item back, and asks for the final grounded answer. Inspect the tool arguments, local tool output, final model text, and response metadata.\n"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "5919a970",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<p><strong>Request shape</strong></p>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr><th>field</th><th>value</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><td>model</td><td>openai.gpt-5.4</td></tr>\n",
"    <tr><td>max_output_tokens</td><td>1024</td></tr>\n",
"    <tr><td>store</td><td>False</td></tr>\n",
"    <tr><td>tools</td><td>get_order_status</td></tr>\n",
"    <tr><td>tool_choice</td><td>required</td></tr>\n",
"    <tr><td>input</td><td>1 item(s): user: Use get_order_status for ORDER-8831, then explain the next best action for the support agent in two labeled plain-text lines. Do not use leading hyphens or bold text.</td></tr>\n",
"  </tbody>\n",
"</table>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<div>\n",
"  <p><strong>Result: tool arguments</strong></p>\n",
"  <pre>{\n",
"  \"order_id\": \"ORDER-8831\"\n",
"}</pre>\n",
"</div>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<div>\n",
"  <p><strong>Result: tool output</strong></p>\n",
"  <pre>{\n",
"  \"order_id\": \"ORDER-8831\",\n",
"  \"customer_id\": \"CUST-1042\",\n",
"  \"item\": \"standing desk replacement\",\n",
"  \"status\": \"delayed\",\n",
"  \"carrier_scan\": \"No movement for 36 hours at Denver sort center\",\n",
"  \"promised_delivery\": \"2026-06-01\",\n",
"  \"recommended_policy\": \"If delay exceeds 48 hours, offer expedited replacement or 15% concession with agent approval.\"\n",
"}</pre>\n",
"</div>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<div>\n",
"  <p><strong>Result: final model answer</strong></p>\n",
"  <pre>Status: ORDER-8831 is delayed; carrier shows no movement for 36 hours at the Denver sort center, with promised delivery on 2026-06-01.\n",
"Next best action: Monitor until the 48-hour threshold; if no movement then, contact the customer and offer either an expedited replacement or a 15% concession with agent approval.</pre>\n",
"</div>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<p><strong>Response summary</strong></p>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr><th>field</th><th>value</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><td>id</td><td>resp_nqjijq6jvanvxdglnj4iuezrkdm2eoizutfiezzxbmm42xtynw6a</td></tr>\n",
"    <tr><td>model</td><td>openai.gpt-5.4</td></tr>\n",
"    <tr><td>status</td><td>completed</td></tr>\n",
"    <tr><td>output_item_types</td><td>['message']</td></tr>\n",
"    <tr><td>input_tokens</td><td>674</td></tr>\n",
"    <tr><td>cached_input_tokens</td><td>0</td></tr>\n",
"    <tr><td>output_tokens</td><td>75</td></tr>\n",
"    <tr><td>total_tokens</td><td>749</td></tr>\n",
"    <tr><td>reasoning_output_tokens</td><td>0</td></tr>\n",
"    <tr><td>service_tier</td><td>default</td></tr>\n",
"  </tbody>\n",
"</table>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<p><strong>Key takeaway:</strong> Function calling separates model-selected arguments from application-executed business logic.</p>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from __future__ import annotations\n",
"function_input = [{\"role\": \"user\", \"content\": \"Use get_order_status for ORDER-8831, then explain the next best action for the support agent in two labeled plain-text lines. Do not use leading hyphens or bold text.\"}]\n",
"order_status_tool = [tool for tool in function_tools if tool[\"name\"] == \"get_order_status\"]\n",
"function_request = {\n",
"    \"model\": MODEL_ID,\n",
"    \"input\": function_input,\n",
"    \"tools\": order_status_tool,\n",
"    \"tool_choice\": \"required\",\n",
"    \"max_output_tokens\": 1024,\n",
"    \"store\": False,\n",
"}\n",
"\n",
"\n",
"def create_tool_plan_with_auto_fallback(request: dict[str, Any]) -> tuple[Any, str]:\n",
"    # Try the requested tool_choice first; retry once with \"auto\" if that request fails.\n",
"    try:\n",
"        return create_response(**request), builtins.str(request.get(\"tool_choice\"))\n",
"    except Exception as first_exc:\n",
"        fallback_request = {**request, \"tool_choice\": \"auto\"}\n",
"        try:\n",
"            return create_response(**fallback_request), \"auto\"\n",
"        except Exception:\n",
"            raise first_exc\n",
"\n",
"\n",
"print_request_shape(function_request)\n",
"try:\n",
"    function_plan, tool_choice_used = create_tool_plan_with_auto_fallback(function_request)\n",
"    function_calls = [item for item in response_items(function_plan) if item.get(\"type\") == \"function_call\"]\n",
"\n",
"    if function_calls:\n",
"        function_call = function_calls[0]\n",
"        function_args = json.loads(function_call[\"arguments\"])\n",
"        require(function_args.get(\"order_id\") == \"ORDER-8831\", f\"Unexpected function arguments: {function_args}\")\n",
"        tool_output = dispatch_tool_call(function_call)\n",
"        final_response = create_response(\n",
"            model=MODEL_ID,\n",
"            input=function_input + response_items(function_plan) + [tool_output],\n",
"            tools=order_status_tool,\n",
"            max_output_tokens=1024,\n",
"            store=False,\n",
"        )\n",
"        final_answer = output_text(final_response).strip()\n",
"        tool_output_payload = json.loads(tool_output[\"output\"])\n",
"        record_check(\"Function calling\", \"pass\", {\"tool_choice_used\": tool_choice_used, \"arguments\": function_args})\n",
"        record_response(\"Order-status tool answer\", \"text\", final_answer)\n",
"        print_labeled_json(\"Result: tool arguments\", function_args)\n",
"        print_labeled_json(\"Result: tool output\", tool_output_payload)\n",
"        print_labeled_text(\"Result: final model answer\", final_answer)\n",
"        print_response_summary(final_response)\n",
"        print_key_takeaway('Function calling separates model-selected arguments from application-executed business logic.')\n",
"    else:\n",
"        fallback_order = get_order_status(\"ORDER-8831\")\n",
"        fallback_prompt = (\n",
"            \"The model response did not include a function_call item. Use this application lookup result \"\n",
"            \"to answer in two labeled plain-text lines without leading hyphens or bold text: \" + json.dumps(fallback_order)\n",
"        )\n",
"        final_response = create_response(\n",
"            model=MODEL_ID,\n",
"            input=function_input + [{\"role\": \"user\", \"content\": fallback_prompt}],\n",
"            max_output_tokens=1024,\n",
"            store=False,\n",
"        )\n",
"        final_answer = output_text(final_response).strip()\n",
"        returned_item_types = [item.get(\"type\") for item in response_items(function_plan)]\n",
"        record_check(\"Function calling\", \"warn\", {\"tool_choice_used\": tool_choice_used, \"returned_item_types\": returned_item_types})\n",
"        record_response(\"Order-status local fallback answer\", \"text\", final_answer)\n",
"        print_labeled_json(\"Result: returned output item types\", returned_item_types)\n",
"        print_labeled_json(\"Result: local tool output\", fallback_order)\n",
"        print_labeled_text(\"Result: final model answer\", final_answer)\n",
"        print_response_summary(final_response)\n",
"        print_key_takeaway('The local lookup keeps the function-calling pattern understandable even when the model returns text.')\n",
"except Exception as exc:\n",
"    handle_example_error(\"Function calling\", exc)"
]
},
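{
"cell_type": "markdown",
"id": "e05a7c13",
"metadata": {},
"source": [
"The second request in the loop above is built from three parts: the original input, the items the model returned, and one `function_call_output` per `function_call`, matched by `call_id`. The offline sketch below shows that assembly step without calling the API. `build_follow_up_input` and `run_tool` are hypothetical names, and the `plan_items` list is a hand-written stand-in for what `response_items(function_plan)` might contain.\n",
"\n",
"```python\n",
"import json\n",
"\n",
"\n",
"def build_follow_up_input(original_input, plan_items, run_tool):\n",
"    # Echo the model's items, then append one output per function_call item.\n",
"    follow_up = [*original_input, *plan_items]\n",
"    for item in plan_items:\n",
"        if item.get('type') == 'function_call':\n",
"            result = run_tool(item['name'], json.loads(item['arguments']))\n",
"            follow_up.append({\n",
"                'type': 'function_call_output',\n",
"                'call_id': item['call_id'],\n",
"                'output': json.dumps(result),\n",
"            })\n",
"    return follow_up\n",
"\n",
"\n",
"def run_tool(name, args):\n",
"    # Hypothetical executor; a real application would call its own systems here.\n",
"    return {'tool': name, 'status': 'delayed', **args}\n",
"\n",
"\n",
"user_input = [{'role': 'user', 'content': 'Check ORDER-8831.'}]\n",
"plan_items = [{\n",
"    'type': 'function_call',\n",
"    'name': 'get_order_status',\n",
"    'call_id': 'call_1',\n",
"    'arguments': json.dumps({'order_id': 'ORDER-8831'}),\n",
"}]\n",
"follow_up = build_follow_up_input(user_input, plan_items, run_tool)\n",
"print([item.get('type', item.get('role')) for item in follow_up])\n",
"```\n",
"\n",
"Keeping the returned `function_call` item in the follow-up input matters because the output item refers back to it through `call_id`.\n"
]
},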
{
"cell_type": "markdown",
"id": "fde761fd",
"metadata": {},
"source": [
"### 4.3 Handle Multiple Tool Calls\n",
"\n",
"Parallel tool calls let the model request more than one independent lookup in a single turn. This cell allows two order-status lookups, executes each local function call, and sends both outputs back before asking the model to compare the active shipping issues. Inspect the returned order IDs and final answer to confirm that application data, not model memory, grounds the response.\n"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "53f83130",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<p><strong>Request shape</strong></p>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr><th>field</th><th>value</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><td>model</td><td>openai.gpt-5.4</td></tr>\n",
"    <tr><td>max_output_tokens</td><td>1024</td></tr>\n",
"    <tr><td>store</td><td>False</td></tr>\n",
"    <tr><td>parallel_tool_calls</td><td>True</td></tr>\n",
"    <tr><td>tools</td><td>get_order_status, get_customer_profile</td></tr>\n",
"    <tr><td>tool_choice</td><td>auto</td></tr>\n",
"    <tr><td>input</td><td>1 item(s): user: Use get_order_status for ORDER-8831 and ORDER-2044, then summarize whether Maya has one shipping problem or multiple active shipping problems in two labeled plain-text lines. Do not use leading hyphens or bold text.</td></tr>\n",
"  </tbody>\n",
"</table>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<div>\n",
"  <p><strong>Result: returned tool calls</strong></p>\n",
"  <pre>{\n",
"  \"tool_call_count\": 1,\n",
"  \"order_ids\": [\n",
"    \"ORDER-8831\"\n",
"  ]\n",
"}</pre>\n",
"</div>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<div>\n",
"  <p><strong>Result: local tool outputs</strong></p>\n",
"  <pre>[\n",
"  {\n",
"    \"order_id\": \"ORDER-8831\",\n",
"    \"customer_id\": \"CUST-1042\",\n",
"    \"item\": \"standing desk replacement\",\n",
"    \"status\": \"delayed\",\n",
"    \"carrier_scan\": \"No movement for 36 hours at Denver sort center\",\n",
"    \"promised_delivery\": \"2026-06-01\",\n",
"    \"recommended_policy\": \"If delay exceeds 48 hours, offer expedited replacement or 15% concession with agent approval.\"\n",
"  },\n",
"  {\n",
"    \"order_id\": \"ORDER-2044\",\n",
"    \"customer_id\": \"CUST-1042\",\n",
"    \"item\": \"ergonomic chair\",\n",
"    \"status\": \"delivered\",\n",
"    \"carrier_scan\": \"Delivered yesterday at front desk\",\n",
"    \"promised_delivery\": \"2026-05-29\",\n",
"    \"recommended_policy\": \"Confirm delivery details before opening a replacement request.\"\n",
"  }\n",
"]</pre>\n",
"</div>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<div>\n",
"  <p><strong>Result: final model answer</strong></p>\n",
"  <pre>Order statuses: ORDER-8831 is delayed and ORDER-2044 was delivered yesterday.\n",
"Shipping problems: Maya has one active shipping problem, because only ORDER-8831 is currently delayed while ORDER-2044 is already delivered.</pre>\n",
"</div>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<p><strong>Response summary</strong></p>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<table>\n",
"  <thead>\n",
"    <tr><th>field</th><th>value</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><td>id</td><td>resp_ovmlijo2lf2nmc7udlofbbw7n2xmlxs6mxdpftlp6iamfcjoeerq</td></tr>\n",
"    <tr><td>model</td><td>openai.gpt-5.4</td></tr>\n",
"    <tr><td>status</td><td>completed</td></tr>\n",
"    <tr><td>output_item_types</td><td>['message']</td></tr>\n",
"    <tr><td>input_tokens</td><td>404</td></tr>\n",
"    <tr><td>cached_input_tokens</td><td>0</td></tr>\n",
"    <tr><td>output_tokens</td><td>50</td></tr>\n",
"    <tr><td>total_tokens</td><td>454</td></tr>\n",
"    <tr><td>reasoning_output_tokens</td><td>0</td></tr>\n",
"    <tr><td>service_tier</td><td>default</td></tr>\n",
"  </tbody>\n",
"</table>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<p><strong>Key takeaway:</strong> Local lookup outputs keep the parallel-tool pattern understandable if not every call is returned.</p>"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from __future__ import annotations\n",
"parallel_input = [{\"role\": \"user\", \"content\": \"Use get_order_status for ORDER-8831 and ORDER-2044, then summarize whether Maya has one shipping problem or multiple active shipping problems in two labeled plain-text lines. Do not use leading hyphens or bold text.\"}]\n",
"parallel_request = {\n",
"    \"model\": MODEL_ID,\n",
"    \"input\": parallel_input,\n",
"    \"tools\": function_tools,\n",
"    \"tool_choice\": \"auto\",\n",
"    \"parallel_tool_calls\": True,\n",
"    \"max_output_tokens\": 1024,\n",
"    \"store\": False,\n",
"}\n",
"\n",
"print_request_shape(parallel_request)\n",
"try:\n",
"    parallel_plan = create_response(**parallel_request)\n",
"    parallel_calls = [item for item in response_items(parallel_plan) if item.get(\"type\") == \"function_call\"]\n",
"    parallel_outputs = [dispatch_tool_call(call) for call in parallel_calls]\n",
"    parallel_order_ids = [json.loads(call[\"arguments\"]).get(\"order_id\") for call in parallel_calls if call.get(\"name\") == \"get_order_status\"]\n",
"    expected_order_ids = [\"ORDER-8831\", \"ORDER-2044\"]\n",
"    missing_order_ids = [order_id for order_id in expected_order_ids if order_id not in builtins.set(parallel_order_ids)]\n",
"\n",
"    if not missing_order_ids:\n",
"        parallel_final = create_response(\n",
"            model=MODEL_ID,\n",
"            input=parallel_input + response_items(parallel_plan) + parallel_outputs,\n",
"            tools=function_tools,\n",
"            max_output_tokens=1024,\n",
"            store=False,\n",
"        )\n",
"        parallel_answer = output_text(parallel_final).strip()\n",
"        record_check(\"Parallel tool calls\", \"pass\", {\"tool_call_count\": len(parallel_calls), \"order_ids\": parallel_order_ids})\n",
"        record_response(\"Parallel order lookup answer\", \"text\", parallel_answer)\n",
"        print_labeled_json(\"Result: tool calls\", {\"tool_call_count\": len(parallel_calls), \"order_ids\": parallel_order_ids})\n",
"        print_labeled_text(\"Result: final model answer\", parallel_answer)\n",
"        print_response_summary(parallel_final)\n",
"        print_key_takeaway('Parallel tool calls let the model request multiple lookups, while the application still controls execution.')\n",
"    else:\n",
"        fallback_orders = [get_order_status(order_id) for order_id in expected_order_ids]\n",
"        fallback_prompt = (\n",
"            \"The model did not request every expected order lookup. Use these application lookup results \"\n",
"            \"to answer in two labeled plain-text lines without leading hyphens or bold text: \" + json.dumps(fallback_orders)\n",
" )\n",
" parallel_final = create_response(\n",
" model=MODEL_ID,\n",
" input=parallel_input + [{\"role\": \"user\", \"content\": fallback_prompt}],\n",
" max_output_tokens=1024,\n",
" store=False,\n",
" )\n",
" parallel_answer = output_text(parallel_final).strip()\n",
" record_check(\"Parallel tool calls\", \"warn\", {\"returned_order_ids\": parallel_order_ids, \"missing_order_ids\": missing_order_ids})\n",
" record_response(\"Parallel order lookup fallback answer\", \"text\", parallel_answer)\n",
" print_labeled_json(\"Result: returned tool calls\", {\"tool_call_count\": len(parallel_calls), \"order_ids\": parallel_order_ids})\n",
" print_labeled_json(\"Result: local tool outputs\", fallback_orders)\n",
" print_labeled_text(\"Result: final model answer\", parallel_answer)\n",
" print_response_summary(parallel_final)\n",
" print_key_takeaway('Local lookup outputs keep the parallel-tool pattern understandable if not every call is returned.')\n",
"except Exception as exc:\n",
" handle_example_error(\"Parallel tool calls\", exc)"
]
},
{
"cell_type": "markdown",
"id": "88218be7",
"metadata": {},
"source": [
"### 4.4 Use a Custom Text Tool\n",
"\n",
"Custom tools pass freeform text to application-owned logic instead of requiring a structured JSON argument object. This cell defines a support-note normalizer, requests a custom tool call, and includes a local fallback if the endpoint returns ordinary text instead of a custom call. Inspect the output item types and the normalized note.\n"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "94436242",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"Request shape
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
" \n",
" \n",
" \n",
" | field | \n",
" value | \n",
"
\n",
" \n",
" \n",
" \n",
" | model | \n",
" openai.gpt-5.4 | \n",
"
\n",
" \n",
" | max_output_tokens | \n",
" 1024 | \n",
"
\n",
" \n",
" | store | \n",
" False | \n",
"
\n",
" \n",
" | tools | \n",
" normalize_support_note | \n",
"
\n",
" \n",
" | tool_choice | \n",
" {'type': 'custom', 'name': 'normalize_support_note'} | \n",
"
\n",
" \n",
" | input | \n",
" 1 item(s): user: Call normalize_support_note with this exact note. Do not answer directly; send the note to the custom tool: order-8831 | cust-1042 | replacement delayed | customer wants supervisor | offer expedited replacement or 15% co... | \n",
"
\n",
" \n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
"
Result: local fallback normalization
\n",
"
ORDER_ID: ORDER-8831\n",
"CUSTOMER_ID: CUST-1042\n",
"ISSUE: REPLACEMENT DELAYED\n",
"CUSTOMER_REQUEST: CUSTOMER WANTS SUPERVISOR\n",
"POLICY_OPTION: OFFER EXPEDITED REPLACEMENT OR 15% CONCESSION
\n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
"
Result: returned output item types
\n",
"
[\n",
" "custom_tool_call"\n",
"]
\n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
"
Result: custom tool input
\n",
"
order-8831 | cust-1042 | replacement delayed | customer wants supervisor | offer expedited replacement or 15% concession
\n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
"
Result: application-owned normalized output
\n",
"
ORDER_ID: ORDER-8831\n",
"CUSTOMER_ID: CUST-1042\n",
"ISSUE: REPLACEMENT DELAYED\n",
"CUSTOMER_REQUEST: CUSTOMER WANTS SUPERVISOR\n",
"POLICY_OPTION: OFFER EXPEDITED REPLACEMENT OR 15% CONCESSION
\n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"Response summary
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
" \n",
" \n",
" \n",
" | field | \n",
" value | \n",
"
\n",
" \n",
" \n",
" \n",
" | id | \n",
" resp_gwyauif44dnxpxrcssrxj4bh57tmgg3zwr67hfcklfswshbbscoa | \n",
"
\n",
" \n",
" | model | \n",
" openai.gpt-5.4 | \n",
"
\n",
" \n",
" | status | \n",
" completed | \n",
"
\n",
" \n",
" | output_item_types | \n",
" ['custom_tool_call'] | \n",
"
\n",
" \n",
" | input_tokens | \n",
" 674 | \n",
"
\n",
" \n",
" | cached_input_tokens | \n",
" 0 | \n",
"
\n",
" \n",
" | output_tokens | \n",
" 37 | \n",
"
\n",
" \n",
" | total_tokens | \n",
" 711 | \n",
"
\n",
" \n",
" | reasoning_output_tokens | \n",
" 0 | \n",
"
\n",
" \n",
" | service_tier | \n",
" default | \n",
"
\n",
" \n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"Key takeaway: Custom tools are useful when the application owns a freeform parsing or execution step.
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from __future__ import annotations\n",
"custom_tools = [\n",
" {\n",
" \"type\": \"custom\",\n",
" \"name\": \"normalize_support_note\",\n",
" \"description\": \"Normalize a freeform support note written by an agent. Input is plain text.\",\n",
" \"format\": {\"type\": \"text\"},\n",
" }\n",
"]\n",
"\n",
"\n",
"def normalize_support_note_text(note: str) -> str:\n",
" fields = [part.strip().upper() for part in note.split(\"|\")]\n",
" labels = [\"ORDER_ID\", \"CUSTOMER_ID\", \"ISSUE\", \"CUSTOMER_REQUEST\", \"POLICY_OPTION\"]\n",
" return \"\\n\".join(\n",
" f\"{label}: {value}\"\n",
" for label, value in zip(labels, fields)\n",
" if value\n",
" )\n",
"\n",
"\n",
"support_note = \"order-8831 | cust-1042 | replacement delayed | customer wants supervisor | offer expedited replacement or 15% concession\"\n",
"custom_input = [{\n",
" \"role\": \"user\",\n",
" \"content\": (\n",
" \"Call normalize_support_note with this exact note. Do not answer directly; \"\n",
" f\"send the note to the custom tool: {support_note}\"\n",
" ),\n",
"}]\n",
"custom_request = {\n",
" \"model\": MODEL_ID,\n",
" \"input\": custom_input,\n",
" \"tools\": custom_tools,\n",
" \"tool_choice\": {\"type\": \"custom\", \"name\": \"normalize_support_note\"},\n",
" \"max_output_tokens\": 1024,\n",
" \"store\": False,\n",
"}\n",
"\n",
"print_request_shape(custom_request)\n",
"print_labeled_text(\"Result: local fallback normalization\", normalize_support_note_text(support_note))\n",
"try:\n",
" custom_plan = create_response(**custom_request)\n",
" returned_item_types = [item.get(\"type\") for item in response_items(custom_plan)]\n",
" try:\n",
" custom_call = first_output_item(custom_plan, \"custom_tool_call\")\n",
" if custom_call is None:\n",
" raise LookupError(\"No custom_tool_call item returned.\")\n",
" tool_input = custom_call.get(\"input\", \"\").strip()\n",
" normalized_note = normalize_support_note_text(tool_input)\n",
" record_check(\"Custom tools\", \"pass\", {\"output_item_types\": returned_item_types, \"normalized_note\": normalized_note})\n",
" record_response(\"Normalized support note\", \"text\", normalized_note)\n",
" print_labeled_json(\"Result: returned output item types\", returned_item_types)\n",
" print_labeled_text(\"Result: custom tool input\", tool_input)\n",
" print_labeled_text(\"Result: application-owned normalized output\", normalized_note)\n",
" except LookupError:\n",
" fallback_text = output_text(custom_plan).strip() or \"No text content was returned.\"\n",
" normalized_note = normalize_support_note_text(support_note)\n",
" record_check(\"Custom tools\", \"warn\", {\n",
" \"expected\": \"custom_tool_call item named normalize_support_note\",\n",
" \"actual_output_item_types\": returned_item_types,\n",
" \"meaning\": \"The model response did not include a custom-tool invocation, so the application fallback normalization is shown for teaching.\",\n",
" })\n",
" record_response(\"Custom tool text fallback\", \"text\", fallback_text)\n",
" record_response(\"Application-owned normalization fallback\", \"text\", normalized_note)\n",
" print_labeled_json(\"Result: returned output item types\", returned_item_types or [\"no typed output items returned\"])\n",
" print_labeled_text(\"Result: model text response\", fallback_text)\n",
" print_labeled_text(\"Result: application-owned normalization\", normalized_note)\n",
" print_response_summary(custom_plan)\n",
" print_key_takeaway('Custom tools are useful when the application owns a freeform parsing or execution step.')\n",
"except Exception as exc:\n",
" handle_example_error(\"Custom tools\", exc)"
]
},
{
"cell_type": "markdown",
"id": "d064c671",
"metadata": {},
"source": [
"## 5. Send Direct File Input\n",
"\n",
"Direct file input is separate from application-managed tools. A file can be included in the current Responses request as an `input_file` item alongside text instructions, which is useful when the model should read the file for this turn without setting up a retrieval index.\n",
"\n",
"### 5.1 Attach a PDF as `input_file`\n",
"\n",
"This cell generates a tiny PDF transcript in memory, attaches it as base64 file data, and asks for exact JSON fields from the document. Inspect the PDF preview, expected fields, parsed response, and usage summary.\n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "96b1aae1",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
"
Result: PDF transcript preview
\n",
"
BrightCart support transcript\n",
"Ticket: TICKET-7429\n",
"Customer: Maya Chen\n",
"Order: ORDER-8831\n",
"Product: Standing desk replacement\n",
"Issue: Replacement for a damaged item is delayed and carrier scan has not moved\n",
"Customer request: Supervisor callback and refund options\n",
"Policy options: expedited replacement or 15% concession with agent approval after 48-hour delay
\n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"Request shape
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
" \n",
" \n",
" \n",
" | field | \n",
" value | \n",
"
\n",
" \n",
" \n",
" \n",
" | model | \n",
" openai.gpt-5.4 | \n",
"
\n",
" \n",
" | max_output_tokens | \n",
" 1024 | \n",
"
\n",
" \n",
" | store | \n",
" False | \n",
"
\n",
" \n",
" | text format | \n",
" json_object | \n",
"
\n",
" \n",
" | input | \n",
" 1 item(s): user: input_file: brightcart-support-transcript.pdf; input_text: Read the attached PDF support transcript and return JSON with keys ticket_id, customer, order_id, product, issue, reques... | \n",
"
\n",
" \n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
"
Result: expected fields
\n",
"
{\n",
" "ticket_id": "TICKET-7429",\n",
" "customer": "Maya Chen",\n",
" "order_id": "ORDER-8831",\n",
" "product": "Standing desk replacement"\n",
"}
\n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
"
Result
\n",
"
{"ticket_id":"TICKET-7429","customer":"Maya Chen","order_id":"ORDER-8831","product":"Standing desk replacement","issue":"Replacement for a damaged item is delayed and carrier scan has not moved","requested_resolution":"Supervisor callback and refund options","policy_options":"expedited replacement or 15% concession with agent approval after 48-hour delay"}
\n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"Response summary
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
" \n",
" \n",
" \n",
" | field | \n",
" value | \n",
"
\n",
" \n",
" \n",
" \n",
" | id | \n",
" resp_colsvndmpjd6qczpemdjscmsbjefmgl5vh6i7alqt52jflentfna | \n",
"
\n",
" \n",
" | model | \n",
" openai.gpt-5.4 | \n",
"
\n",
" \n",
" | status | \n",
" completed | \n",
"
\n",
" \n",
" | output_item_types | \n",
" ['message'] | \n",
"
\n",
" \n",
" | input_tokens | \n",
" 713 | \n",
"
\n",
" \n",
" | cached_input_tokens | \n",
" 0 | \n",
"
\n",
" \n",
" | output_tokens | \n",
" 82 | \n",
"
\n",
" \n",
" | total_tokens | \n",
" 795 | \n",
"
\n",
" \n",
" | reasoning_output_tokens | \n",
" 0 | \n",
"
\n",
" \n",
" | service_tier | \n",
" default | \n",
"
\n",
" \n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"Key takeaway: Direct file input is useful when the file should be read in the current request context.
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from __future__ import annotations\n",
"def make_simple_pdf(lines: list[str]) -> bytes:\n",
" def pdf_escape(text: str) -> str:\n",
" return text.replace(\"\\\\\", \"\\\\\\\\\").replace(\"(\", \"\\\\(\").replace(\")\", \"\\\\)\")\n",
"\n",
" stream_lines = [\"BT\", \"/F1 11 Tf\", \"72 740 Td\", \"15 TL\"]\n",
" for idx, line in enumerate(lines):\n",
" if idx:\n",
" stream_lines.append(\"T*\")\n",
" stream_lines.append(f\"({pdf_escape(line)}) Tj\")\n",
" stream_lines.append(\"ET\")\n",
" stream = \"\\n\".join(stream_lines).encode(\"latin-1\", \"replace\")\n",
"\n",
" objects = [\n",
" b\"<< /Type /Catalog /Pages 2 0 R >>\",\n",
" b\"<< /Type /Pages /Kids [3 0 R] /Count 1 >>\",\n",
" b\"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] /Resources << /Font << /F1 4 0 R >> >> /Contents 5 0 R >>\",\n",
" b\"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>\",\n",
" b\"<< /Length \" + builtins.str(len(stream)).encode(\"ascii\") + b\" >>\\nstream\\n\" + stream + b\"\\nendstream\",\n",
" ]\n",
"\n",
" pdf = b\"%PDF-1.4\\n\"\n",
" offsets = [0]\n",
" for idx, obj in enumerate(objects, start=1):\n",
" offsets.append(len(pdf))\n",
" pdf += f\"{idx} 0 obj\\n\".encode(\"ascii\") + obj + b\"\\nendobj\\n\"\n",
" xref_offset = len(pdf)\n",
" pdf += f\"xref\\n0 {len(objects) + 1}\\n0000000000 65535 f \\n\".encode(\"ascii\")\n",
" for offset in offsets[1:]:\n",
" pdf += f\"{offset:010d} 00000 n \\n\".encode(\"ascii\")\n",
" pdf += f\"trailer\\n<< /Size {len(objects) + 1} /Root 1 0 R >>\\nstartxref\\n{xref_offset}\\n%%EOF\\n\".encode(\"ascii\")\n",
" return pdf\n",
"\n",
"\n",
"file_lines = [\n",
" \"BrightCart support transcript\",\n",
" \"Ticket: TICKET-7429\",\n",
" \"Customer: Maya Chen\",\n",
" \"Order: ORDER-8831\",\n",
" \"Product: Standing desk replacement\",\n",
" \"Issue: Replacement for a damaged item is delayed and carrier scan has not moved\",\n",
" \"Customer request: Supervisor callback and refund options\",\n",
" \"Policy options: expedited replacement or 15% concession with agent approval after 48-hour delay\",\n",
"]\n",
"file_text = \"\\n\".join(file_lines)\n",
"pdf_data = base64.b64encode(make_simple_pdf(file_lines)).decode(\"utf-8\")\n",
"\n",
"expected_direct_file_fields = {\n",
" \"ticket_id\": \"TICKET-7429\",\n",
" \"customer\": \"Maya Chen\",\n",
" \"order_id\": \"ORDER-8831\",\n",
" \"product\": \"Standing desk replacement\",\n",
"}\n",
"\n",
"direct_file_request = {\n",
" \"model\": MODEL_ID,\n",
" \"input\": [\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": [\n",
" {\n",
" \"type\": \"input_file\",\n",
" \"filename\": \"brightcart-support-transcript.pdf\",\n",
" \"file_data\": f\"data:application/pdf;base64,{pdf_data}\",\n",
" },\n",
" {\n",
" \"type\": \"input_text\",\n",
" \"text\": (\n",
" \"Read the attached PDF support transcript and return JSON with keys \"\n",
" \"ticket_id, customer, order_id, product, issue, requested_resolution, and policy_options. \"\n",
" \"Use exact values from the file. Do not return null for fields that are present in the file.\"\n",
" ),\n",
" },\n",
" ],\n",
" }\n",
" ],\n",
" \"text\": {\"format\": {\"type\": \"json_object\"}},\n",
" \"max_output_tokens\": 1024,\n",
" \"store\": False,\n",
"}\n",
"\n",
"print_labeled_text(\"Result: PDF transcript preview\", file_text)\n",
"print_request_shape(direct_file_request)\n",
"print_labeled_json(\"Result: expected fields\", expected_direct_file_fields)\n",
"try:\n",
" direct_file_response = create_response(**direct_file_request)\n",
" raw_direct_file_output = output_text(direct_file_response).strip()\n",
" try:\n",
" direct_file_payload = json.loads(raw_direct_file_output)\n",
" missing_or_empty = [\n",
" key for key, expected in expected_direct_file_fields.items()\n",
" if builtins.str(direct_file_payload.get(key, \"\")).strip().lower() != expected.lower()\n",
" ]\n",
" null_fields = [key for key, value in direct_file_payload.items() if value in {None, \"\", []}]\n",
" if missing_or_empty or null_fields:\n",
" record_check(\"Direct file inputs\", \"warn\", {\n",
" \"message\": \"The request completed, but the model did not extract the expected values from the attached PDF.\",\n",
" \"missing_or_unexpected_fields\": missing_or_empty,\n",
" \"empty_fields\": null_fields,\n",
" \"payload\": direct_file_payload,\n",
" })\n",
" record_response(\"Support transcript extraction returned by model\", \"json\", direct_file_payload)\n",
" print_labeled_text(\"Result\", \"The request completed, but the model did not extract the expected values from the attached PDF.\")\n",
" print_labeled_json(\"Result: returned JSON\", direct_file_payload)\n",
" else:\n",
" record_check(\"Direct file inputs\", \"pass\", direct_file_payload)\n",
" record_response(\"Support transcript extraction\", \"json\", direct_file_payload)\n",
" print_labeled_json(\"Result\", direct_file_payload)\n",
" except Exception as parse_exc:\n",
" record_check(\"Direct file inputs\", \"warn\", {\n",
" \"message\": \"The request completed, but the response was not valid JSON.\",\n",
" \"text_sample\": raw_direct_file_output[:600],\n",
" \"error\": builtins.str(parse_exc),\n",
" })\n",
" record_response(\"Support transcript extraction text\", \"text\", raw_direct_file_output[:1200])\n",
" print_labeled_text(\"Result\", raw_direct_file_output[:1200])\n",
" print_response_summary(direct_file_response)\n",
" print_key_takeaway('Direct file input is useful when the file should be read in the current request context.')\n",
"except Exception as exc:\n",
" handle_example_error(\"Direct file inputs\", exc)"
]
},
{
"cell_type": "markdown",
"id": "492681d1",
"metadata": {},
"source": [
"## 6. Manage Conversation State\n",
"\n",
"Conversation state determines how follow-up turns receive prior context. The Responses API supports stored continuation with `previous_response_id`, and applications can also manage state themselves by resending relevant input history. This section compares both patterns, then shows encrypted reasoning context where supported.\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "29e58e15",
"metadata": {},
"source": [
"### 6.1 Continue with `previous_response_id`\n",
"\n",
"Use `previous_response_id` to continue from a stored response without resending the full prior prompt. The first request stores the BrightCart case details; the second request passes only the new follow-up instruction plus the previous response ID. Inspect whether the follow-up preserves the order, customer, issue, and next action.\n"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "3a1e4556",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"Request shape
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
" \n",
" \n",
" \n",
" | field | \n",
" value | \n",
"
\n",
" \n",
" \n",
" \n",
" | model | \n",
" openai.gpt-5.4 | \n",
"
\n",
" \n",
" | max_output_tokens | \n",
" 1024 | \n",
"
\n",
" \n",
" | store | \n",
" False | \n",
"
\n",
" \n",
" | previous_response_id | \n",
" <response-id-from-prior-stored-turn> | \n",
"
\n",
" \n",
" | input | \n",
" Return five labeled lines: ticket ID, order ID, customer name, issue, and next best action. | \n",
"
\n",
" \n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
"
Result
\n",
"
Ticket ID: TICKET-4812\n",
"Order ID: ORDER-8831\n",
"Customer Name: Maya Chen\n",
"Issue: Replacement standing desk shipment for damaged delivery has had no carrier movement for 36 hours; customer is frustrated because this is the second attempt\n",
"Next Best Action: Monitor until 48 hours without movement, then offer expedited replacement or 15% concession and escalate to Tier 2 Returns if needed
\n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"Response summary
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
" \n",
" \n",
" \n",
" | field | \n",
" value | \n",
"
\n",
" \n",
" \n",
" \n",
" | id | \n",
" resp_gkgqadc2gd24747lmhy5waftt5tga7eibtv67k77lndjgbuioo6q | \n",
"
\n",
" \n",
" | model | \n",
" openai.gpt-5.4 | \n",
"
\n",
" \n",
" | status | \n",
" completed | \n",
"
\n",
" \n",
" | output_item_types | \n",
" ['message'] | \n",
"
\n",
" \n",
" | input_tokens | \n",
" 715 | \n",
"
\n",
" \n",
" | cached_input_tokens | \n",
" 0 | \n",
"
\n",
" \n",
" | output_tokens | \n",
" 86 | \n",
"
\n",
" \n",
" | total_tokens | \n",
" 801 | \n",
"
\n",
" \n",
" | reasoning_output_tokens | \n",
" 0 | \n",
"
\n",
" \n",
" | service_tier | \n",
" default | \n",
"
\n",
" \n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"Key takeaway: previous_response_id lets a follow-up use stored context without resending the full prior turn.
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from __future__ import annotations\n",
"promised_delivery = (date.today() + timedelta(days=2)).isoformat()\n",
"stateful_seed_input = (\n",
" f\"Customer Maya Chen opened ticket TICKET-4812 about order ORDER-8831. \"\n",
" \"The item is a standing desk replacement for a damaged delivery. \"\n",
" f\"The promised delivery date is {promised_delivery}, but the carrier scan has not moved in 36 hours. \"\n",
" \"Customer sentiment is frustrated because this is the second attempt. \"\n",
" \"Support policy says to offer expedited replacement or a 15% concession if the delay exceeds 48 hours. \"\n",
" \"Escalation owner is Tier 2 Returns.\"\n",
")\n",
"stateful_followup_input = \"Return five labeled lines: ticket ID, order ID, customer name, issue, and next best action.\"\n",
"stateful_request_shape = {\n",
" \"model\": MODEL_ID,\n",
" \"input\": stateful_followup_input,\n",
" \"previous_response_id\": \"\",\n",
" \"max_output_tokens\": 1024,\n",
" \"store\": False,\n",
"}\n",
"\n",
"print_request_shape(stateful_request_shape)\n",
"try:\n",
" stateful_turn_1 = create_response(model=MODEL_ID, input=stateful_seed_input, max_output_tokens=1024, store=True)\n",
" remember_stored_response(stateful_turn_1)\n",
" stateful_turn_2 = create_response(model=MODEL_ID, input=stateful_followup_input, previous_response_id=stateful_turn_1.id, max_output_tokens=1024, store=False)\n",
" text = output_text(stateful_turn_2).strip()\n",
" require(\"order-8831\" in text.lower() or \"maya\" in text.lower(), \"Stateful continuation response missed expected support context.\")\n",
" record_check(\"Stateful continuation\", \"pass\", stateful_turn_1.id)\n",
" record_response(\"Stateful support handoff\", \"text\", text)\n",
" print_labeled_text(\"Result\", text)\n",
" print_response_summary(stateful_turn_2)\n",
" print_key_takeaway('previous_response_id lets a follow-up use stored context without resending the full prior turn.')\n",
"except Exception as exc:\n",
" handle_example_error(\"Stateful continuation\", exc)\n"
]
},
{
"cell_type": "markdown",
"id": "cab55353",
"metadata": {},
"source": [
"### 6.2 Rebuild Stateless Context\n",
"\n",
"Stateless continuation means the application sends the relevant history on every request. This is a good fit when your product already owns conversation storage, retention policy, or audit requirements. This cell sends a short chat history plus a new handoff instruction and inspects the summary and token usage.\n"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "647ee6e8",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"Request shape
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
" \n",
" \n",
" \n",
" | field | \n",
" value | \n",
"
\n",
" \n",
" \n",
" \n",
" | model | \n",
" openai.gpt-5.4 | \n",
"
\n",
" \n",
" | max_output_tokens | \n",
" 1024 | \n",
"
\n",
" \n",
" | store | \n",
" False | \n",
"
\n",
" \n",
" | input | \n",
" 4 item(s): user: Support chat TICKET-3920: Customer Jordan Lee says ORDER-7718 arrived with a cracked monitor stand.; assistant: Captured damaged-item issue for ORDER-7718 and asked for preferred resolution.; user: Jordan wants a replacement shipped this week and asks whether the damaged item must be returned first.; user: Summarize this support chat for the next agent in five labeled plain-text lines. Do not use leading hyphens or bold text. | \n",
"
\n",
" \n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
"
Result
\n",
"
Customer: Jordan Lee reported ORDER-7718 arrived with a cracked monitor stand.\n",
"Issue: Damaged item; monitor stand is cracked on arrival.\n",
"Requested Resolution: Customer wants a replacement shipped this week.\n",
"Open Question: Jordan asked whether the damaged item must be returned before replacement is sent.\n",
"Status: Damage claim captured and awaiting next-agent confirmation on replacement timing and return requirement.
\n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"Response summary
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
" \n",
" \n",
" \n",
" | field | \n",
" value | \n",
"
\n",
" \n",
" \n",
" \n",
" | id | \n",
" resp_mezt6yqizyswuvujvudnonr34b73ndyyu2qsfncgtjyppzie5vva | \n",
"
\n",
" \n",
" | model | \n",
" openai.gpt-5.4 | \n",
"
\n",
" \n",
" | status | \n",
" completed | \n",
"
\n",
" \n",
" | output_item_types | \n",
" ['message'] | \n",
"
\n",
" \n",
" | input_tokens | \n",
" 255 | \n",
"
\n",
" \n",
" | cached_input_tokens | \n",
" 0 | \n",
"
\n",
" \n",
" | output_tokens | \n",
" 78 | \n",
"
\n",
" \n",
" | total_tokens | \n",
" 333 | \n",
"
\n",
" \n",
" | reasoning_output_tokens | \n",
" 0 | \n",
"
\n",
" \n",
" | service_tier | \n",
" default | \n",
"
\n",
" \n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"Key takeaway: Stateless continuation sends the relevant history with each request when the application owns conversation storage.
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from __future__ import annotations\n",
"stateless_history = [\n",
" {\"role\": \"user\", \"content\": \"Support chat TICKET-3920: Customer Jordan Lee says ORDER-7718 arrived with a cracked monitor stand.\"},\n",
" {\"role\": \"assistant\", \"content\": \"Captured damaged-item issue for ORDER-7718 and asked for preferred resolution.\"},\n",
" {\"role\": \"user\", \"content\": \"Jordan wants a replacement shipped this week and asks whether the damaged item must be returned first.\"},\n",
"]\n",
"stateless_payload = {\n",
" \"model\": MODEL_ID,\n",
" \"input\": stateless_history + [{\"role\": \"user\", \"content\": \"Summarize this support chat for the next agent in five labeled plain-text lines. Do not use leading hyphens or bold text.\"}],\n",
" \"max_output_tokens\": 1024,\n",
" \"store\": False,\n",
"}\n",
"\n",
"print_request_shape(stateless_payload)\n",
"try:\n",
" stateless_response = create_response(**stateless_payload)\n",
" stateless_text = output_text(stateless_response).strip()\n",
" require(stateless_text, \"Stateless continuation response did not return text.\")\n",
" record_check(\"Stateless continuation\", \"pass\", summarize_response(stateless_response))\n",
" record_response(\"Stateless support handoff\", \"text\", stateless_text)\n",
" print_labeled_text(\"Result\", stateless_text)\n",
" print_response_summary(stateless_response)\n",
" print_key_takeaway('Stateless continuation sends the relevant history with each request when the application owns conversation storage.')\n",
"except Exception as exc:\n",
" handle_example_error(\"Stateless continuation\", exc)\n"
]
},
{
"cell_type": "markdown",
"id": "9da3c529",
"metadata": {},
"source": [
"### 6.3 Carry Encrypted Reasoning Context\n",
"\n",
"Reasoning-capable models may return reasoning items and encrypted reasoning content when requested. This cell asks for encrypted reasoning metadata, carries prior response items into a follow-up request, and inspects whether encrypted content was returned. The hidden reasoning text is not exposed; the application only carries opaque context forward where supported.\n",
"\n",
"Official docs: [Reasoning models](https://developers.openai.com/api/docs/guides/reasoning) describes reasoning models and reasoning effort in Responses workflows.\n"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "15ffab13",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"Request shape
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
" \n",
" \n",
" \n",
" | field | \n",
" value | \n",
"
\n",
" \n",
" \n",
" \n",
" | model | \n",
" openai.gpt-5.4 | \n",
"
\n",
" \n",
" | max_output_tokens | \n",
" 1024 | \n",
"
\n",
" \n",
" | store | \n",
" False | \n",
"
\n",
" \n",
" | reasoning | \n",
" {'effort': 'medium'} | \n",
"
\n",
" \n",
" | include | \n",
" ['reasoning.encrypted_content'] | \n",
"
\n",
" \n",
" | input | \n",
" 1 item(s): user: For a customer-support assistant handling names, order IDs, and refund context, compare stateful and stateless continuation in two sentences. | \n",
"
\n",
" \n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
"
Result: reasoning metadata
\n",
"
{\n",
" "returned_item_types": [\n",
" "reasoning",\n",
" "message"\n",
" ],\n",
" "encrypted_reasoning_content_returned": true\n",
"}
\n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
"
Result: follow-up answer
\n",
"
Recommendation: Stateless continuation\n",
"Reason: In a regulated support workflow, requiring names, order IDs, and refund context to be explicitly provided each turn improves controllability, auditability, and data-minimization, reducing the risk of unintended retention or cross-session leakage.
\n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"Response summary
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
" \n",
" \n",
" \n",
" | field | \n",
" value | \n",
"
\n",
" \n",
" \n",
" \n",
" | id | \n",
" resp_qmjgoymsxqf32ht3apisbvscrv4d5t5x2tnxeftkxwpyl4eppjka | \n",
"
\n",
" \n",
" | model | \n",
" openai.gpt-5.4 | \n",
"
\n",
" \n",
" | status | \n",
" completed | \n",
"
\n",
" \n",
" | output_item_types | \n",
" ['message'] | \n",
"
\n",
" \n",
" | input_tokens | \n",
" 295 | \n",
"
\n",
" \n",
" | cached_input_tokens | \n",
" 0 | \n",
"
\n",
" \n",
" | output_tokens | \n",
" 56 | \n",
"
\n",
" \n",
" | total_tokens | \n",
" 351 | \n",
"
\n",
" \n",
" | reasoning_output_tokens | \n",
" 0 | \n",
"
\n",
" \n",
" | service_tier | \n",
" default | \n",
"
\n",
" \n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"Key takeaway: Encrypted reasoning content can be carried forward where supported without exposing hidden reasoning text.
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from __future__ import annotations\n",
"encrypted_history = [\n",
" {\"role\": \"user\", \"content\": \"For a customer-support assistant handling names, order IDs, and refund context, compare stateful and stateless continuation in two sentences.\"}\n",
"]\n",
"encrypted_turn_payload = {\n",
" \"model\": MODEL_ID,\n",
" \"input\": encrypted_history,\n",
" \"reasoning\": {\"effort\": \"medium\"},\n",
" \"include\": [\"reasoning.encrypted_content\"],\n",
" \"max_output_tokens\": 1024,\n",
" \"store\": False,\n",
"}\n",
"\n",
"print_request_shape(encrypted_turn_payload)\n",
"try:\n",
" encrypted_turn_1 = create_response(**encrypted_turn_payload)\n",
" encrypted_turn_2 = create_response(\n",
" model=MODEL_ID,\n",
" input=encrypted_history + response_items(encrypted_turn_1) + [\n",
" {\"role\": \"user\", \"content\": \"Based on the prior reasoning context, recommend one approach for a regulated support workflow in two labeled plain-text lines. Do not use leading hyphens or bold text.\"}\n",
" ],\n",
" max_output_tokens=1024,\n",
" store=False,\n",
" )\n",
" reasoning_items = [item for item in response_items(encrypted_turn_1) if item.get(\"type\") == \"reasoning\"]\n",
" has_encrypted_content = any(item.get(\"encrypted_content\") for item in reasoning_items)\n",
" record_check(\"Encrypted reasoning\", \"pass\", {\"encrypted_content_returned\": has_encrypted_content, \"reasoning_item_count\": len(reasoning_items)})\n",
" encrypted_answer = output_text(encrypted_turn_2).strip()\n",
" record_response(\"State strategy recommendation\", \"text\", encrypted_answer)\n",
" print_labeled_json(\"Result: reasoning metadata\", {\n",
" \"returned_item_types\": [item.get(\"type\") for item in response_items(encrypted_turn_1)],\n",
" \"encrypted_reasoning_content_returned\": has_encrypted_content,\n",
" })\n",
" print_labeled_text(\"Result: follow-up answer\", encrypted_answer)\n",
" print_response_summary(encrypted_turn_2)\n",
" print_key_takeaway('Encrypted reasoning content can be carried forward where supported without exposing hidden reasoning text.')\n",
"except Exception as exc:\n",
" handle_example_error(\"Encrypted reasoning\", exc)\n"
]
},
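{
"cell_type": "markdown",
"id": "b7e1c3a0",
"metadata": {},
"source": [
"The carry-forward step in the cell above can be sketched offline with plain dictionaries. This is a minimal sketch under stated assumptions, not the notebook's own helper: `build_follow_up_input`, `has_encrypted_reasoning`, and the sample items are hypothetical, and the item shapes assume only the `type` and `encrypted_content` fields that the cell above inspects.\n",
"\n",
"```python\n",
"def build_follow_up_input(history, prior_items, next_user_text):\n",
"    # Order: original turns, then the items returned by the first response\n",
"    # (including any opaque reasoning items), then the new user turn.\n",
"    return list(history) + list(prior_items) + [{'role': 'user', 'content': next_user_text}]\n",
"\n",
"\n",
"def has_encrypted_reasoning(items):\n",
"    # True when at least one reasoning item carries a non-empty encrypted payload.\n",
"    return any(\n",
"        item.get('type') == 'reasoning' and bool(item.get('encrypted_content'))\n",
"        for item in items\n",
"    )\n",
"\n",
"\n",
"# Hypothetical stand-ins for real response items.\n",
"history = [{'role': 'user', 'content': 'Compare stateful and stateless continuation.'}]\n",
"prior_items = [\n",
"    {'type': 'reasoning', 'encrypted_content': 'opaque-placeholder'},\n",
"    {'type': 'message', 'role': 'assistant', 'content': 'Stateful keeps context server-side.'},\n",
"]\n",
"follow_up = build_follow_up_input(history, prior_items, 'Recommend one approach.')\n",
"print(len(follow_up), has_encrypted_reasoning(prior_items))\n",
"```\n",
"\n",
"Placing the returned items between the original history and the new user turn mirrors the order used in the cell above.\n"
]
},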
{
"cell_type": "markdown",
"id": "1a4abe4f",
"metadata": {},
"source": [
"## 7. Use Prompt Caching\n",
"\n",
"Prompt caching improves latency and cost when requests share an exact static prefix.\n",
"\n",
"### 7.1 Compare Two Cache-Keyed Requests\n",
"\n",
"This cell places stable BrightCart policy text at the beginning of the input, sends the same request twice with a `prompt_cache_key`, and compares token metadata. Inspect `cached_input_tokens` on the second response when the endpoint returns cache details.\n",
"\n",
"Note: `PROMPT_CACHE_RETENTION` is selected from the active `MODEL_ID`. It uses `24h` for `openai.gpt-5.5` and later models, and `in_memory` for `openai.gpt-5.4`.\n"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "9a2c7825",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"Request shape
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
" \n",
" \n",
" \n",
" | field | \n",
" value | \n",
"
\n",
" \n",
" \n",
" \n",
" | model | \n",
" openai.gpt-5.4 | \n",
"
\n",
" \n",
" | max_output_tokens | \n",
" 1024 | \n",
"
\n",
" \n",
" | store | \n",
" False | \n",
"
\n",
" \n",
" | prompt_cache_key | \n",
" brightcart-support-policy-v1 | \n",
"
\n",
" \n",
" | prompt_cache_retention | \n",
" in_memory | \n",
"
\n",
" \n",
" | input | \n",
" 2 item(s): system: BrightCart support policy: 1. Be empathetic, concise, and specific about the customer's order. 2. Do not promise refunds, credits, or delivery dates unless the policy context supports it. 3. For damaged-item replacements...; user: Draft a two-sentence agent reply for Maya Chen about delayed replacement order ORDER-8831. | \n",
"
\n",
" \n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
"
Prompt-cache input size
\n",
"
{\n",
" "estimated_input_words": 3016,\n",
" "target_minimum_tokens": 2048\n",
"}
\n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
"
Result
\n",
"
Hi Maya, I’m sorry your replacement order ORDER-8831 is delayed. I’m checking the latest replacement and carrier status now so I can confirm the best next step for you without making you wait longer than necessary.
\n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
"
First request summary
\n",
"
{\n",
" "id": "resp_3w6r6ipbqa5z2max35awv3i23i5sjmfa33zzw3vhpgqxcvchhkgq",\n",
" "model": "openai.gpt-5.4",\n",
" "status": "completed",\n",
" "output_item_types": [\n",
" "message"\n",
" ],\n",
" "input_tokens": 3970,\n",
" "output_tokens": 66,\n",
" "total_tokens": 4036,\n",
" "cached_input_tokens": 0,\n",
" "reasoning_output_tokens": 0,\n",
" "service_tier": "default"\n",
"}
\n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
"
Second request summary
\n",
"
{\n",
" "id": "resp_zzjeqttoswdjdwwl56xpolvly23w4h2n5dtsdoddgxqwfbkn7npq",\n",
" "model": "openai.gpt-5.4",\n",
" "status": "completed",\n",
" "output_item_types": [\n",
" "message"\n",
" ],\n",
" "input_tokens": 3970,\n",
" "output_tokens": 48,\n",
" "total_tokens": 4018,\n",
" "cached_input_tokens": 0,\n",
" "reasoning_output_tokens": 0,\n",
" "service_tier": "default"\n",
"}
\n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"Response summary
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
" \n",
" \n",
" \n",
" | request | \n",
" input_tokens | \n",
" cached_input_tokens | \n",
" output_tokens | \n",
" total_tokens | \n",
"
\n",
" \n",
" \n",
" \n",
" | first | \n",
" 3970 | \n",
" 0 | \n",
" 66 | \n",
" 4036 | \n",
"
\n",
" \n",
" | second | \n",
" 3970 | \n",
" 0 | \n",
" 48 | \n",
" 4018 | \n",
"
\n",
" \n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"Key takeaway: cached_input_tokens is the metadata field to inspect for prompt-cache reuse.
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from __future__ import annotations\n",
"base_support_policy = [\n",
" \"BrightCart support policy:\",\n",
" \"1. Be empathetic, concise, and specific about the customer's order.\",\n",
" \"2. Do not promise refunds, credits, or delivery dates unless the policy context supports it.\",\n",
" \"3. For damaged-item replacements, check replacement status before offering concessions.\",\n",
" \"4. If a replacement delay exceeds 48 hours, offer expedited replacement or a 15% concession subject to agent approval.\",\n",
"]\n",
"policy_reference_paragraph = (\n",
" \"Expanded cacheable policy context: BrightCart agents should identify the customer, order ID, replacement status, \"\n",
" \"carrier scan age, promised delivery window, item category, prior concessions, and supervisor approval needs before \"\n",
" \"drafting a customer-facing answer. The assistant should preserve a calm tone, avoid unsupported promises, separate \"\n",
" \"confirmed facts from assumptions, recommend one clear next action, and document why any escalation, expedited \"\n",
" \"replacement, or concession is appropriate. Repeated policy context like this is intentionally stable across many \"\n",
" \"requests so prompt caching can reuse the prefix when the same cache key is supplied.\"\n",
")\n",
"expanded_policy_context = \"\\n\".join(\n",
" f\"Policy reference paragraph {idx + 1}: {policy_reference_paragraph}\"\n",
" for idx in range(32)\n",
")\n",
"stable_support_policy = \"\\n\".join(base_support_policy + [expanded_policy_context])\n",
"cache_input = [\n",
" {\"role\": \"system\", \"content\": stable_support_policy},\n",
" {\"role\": \"user\", \"content\": \"Draft a two-sentence agent reply for Maya Chen about delayed replacement order ORDER-8831.\"},\n",
"]\n",
"estimated_cache_input_words = len(json.dumps(cache_input).split())\n",
"require(estimated_cache_input_words > 2048, f\"Prompt-cache input should be over 2048 words; found {estimated_cache_input_words}.\")\n",
"cache_payload = {\n",
" \"model\": MODEL_ID,\n",
" \"input\": cache_input,\n",
" \"prompt_cache_key\": \"brightcart-support-policy-v1\",\n",
" \"prompt_cache_retention\": PROMPT_CACHE_RETENTION,\n",
" \"max_output_tokens\": 1024,\n",
" \"store\": False,\n",
"}\n",
"\n",
"print_request_shape(cache_payload)\n",
"print_labeled_json(\"Prompt-cache input size\", {\"estimated_input_words\": estimated_cache_input_words, \"target_minimum_tokens\": 2048})\n",
"try:\n",
" cache_response_1 = create_response(**cache_payload)\n",
" cache_response_2 = create_response(**cache_payload)\n",
" cache_summary_1 = summarize_response(cache_response_1)\n",
" cache_summary_2 = summarize_response(cache_response_2)\n",
" cache_comparison = pd.DataFrame([\n",
" {\n",
" \"request\": \"first\",\n",
" \"input_tokens\": cache_summary_1.get(\"input_tokens\"),\n",
" \"cached_input_tokens\": cache_summary_1.get(\"cached_input_tokens\"),\n",
" \"output_tokens\": cache_summary_1.get(\"output_tokens\"),\n",
" \"total_tokens\": cache_summary_1.get(\"total_tokens\"),\n",
" },\n",
" {\n",
" \"request\": \"second\",\n",
" \"input_tokens\": cache_summary_2.get(\"input_tokens\"),\n",
" \"cached_input_tokens\": cache_summary_2.get(\"cached_input_tokens\"),\n",
" \"output_tokens\": cache_summary_2.get(\"output_tokens\"),\n",
" \"total_tokens\": cache_summary_2.get(\"total_tokens\"),\n",
" },\n",
" ])\n",
" record_check(\"Prompt caching\", \"pass\" if cache_summary_2.get(\"cached_input_tokens\") is not None else \"warn\", {\"first\": cache_summary_1, \"second\": cache_summary_2})\n",
" cache_reply = output_text(cache_response_2).strip()\n",
" record_response(\"Prompt-cache token comparison\", \"table\", cache_comparison)\n",
" record_response(\"Cached support-policy reply\", \"text\", cache_reply)\n",
"\n",
" print_labeled_text(\"Result\", cache_reply)\n",
" print_labeled_json(\"First request summary\", cache_summary_1)\n",
" print_labeled_json(\"Second request summary\", cache_summary_2)\n",
" print_label(\"Response summary\")\n",
" display_wrapped_table(cache_comparison, max_col_width_px=260)\n",
" print_key_takeaway('cached_input_tokens is the metadata field to inspect for prompt-cache reuse.')\n",
"except Exception as exc:\n",
" handle_example_error(\"Prompt caching\", exc)"
]
},
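{
"cell_type": "markdown",
"id": "c4d9e2f1",
"metadata": {},
"source": [
"The check above records a pass whenever `cached_input_tokens` is present, even when its value is 0. A cache-hit ratio makes actual reuse easier to see. This is a minimal sketch, assuming a summary dictionary with the `input_tokens` and `cached_input_tokens` keys shown in the run above; `cache_hit_ratio` is a hypothetical helper, and the second sample is invented purely for illustration.\n",
"\n",
"```python\n",
"def cache_hit_ratio(summary):\n",
"    # Fraction of input tokens the endpoint reported as served from cache.\n",
"    # Returns None when a field is missing or input_tokens is zero.\n",
"    input_tokens = summary.get('input_tokens')\n",
"    cached = summary.get('cached_input_tokens')\n",
"    if not input_tokens or cached is None:\n",
"        return None\n",
"    return round(cached / input_tokens, 3)\n",
"\n",
"\n",
"# Values copied from the second request summary in the run above.\n",
"observed = {'input_tokens': 3970, 'cached_input_tokens': 0}\n",
"# Hypothetical values showing what a partial cache hit could look like.\n",
"hypothetical_hit = {'input_tokens': 3970, 'cached_input_tokens': 3584}\n",
"\n",
"print(cache_hit_ratio(observed))\n",
"print(cache_hit_ratio(hypothetical_hit))\n",
"```\n",
"\n",
"A ratio of 0.0 on the second request, as in the recorded run, indicates that the field was returned but no prefix reuse was reported.\n"
]
},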
{
"cell_type": "markdown",
"id": "3766504d",
"metadata": {},
"source": [
"## 8. Run Background Work\n",
"\n",
"Background mode starts a response asynchronously and lets the application poll for terminal status.\n",
"\n",
"### 8.1 Submit and Poll a Background Response\n",
"\n",
"This cell sends `background=true`, stores the response ID, polls while status is queued or in progress, and then prints the final manager summary. Inspect the status history, final status, response ID, and token summary.\n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "3e5ec164",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"Request shape
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
" \n",
" \n",
" \n",
" | field | \n",
" value | \n",
"
\n",
" \n",
" \n",
" \n",
" | model | \n",
" openai.gpt-5.4 | \n",
"
\n",
" \n",
" | max_output_tokens | \n",
" 1024 | \n",
"
\n",
" \n",
" | store | \n",
" True | \n",
"
\n",
" \n",
" | background | \n",
" True | \n",
"
\n",
" \n",
" | input | \n",
" Return exactly three labeled plain-text lines for a support-manager summary: theme, risk, next action. Keep each line under 12 words. Do not use leading hyphens or bold text. Same-day BrightCart support backlog: 1. 18 delayed-order contacts, mostly from the We... | \n",
"
\n",
" \n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
"
Result: status history
\n",
"
[\n",
" "in_progress",\n",
" "completed"\n",
"]
\n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
"
Result: manager summary
\n",
"
theme: Shipping delays dominate, especially West Coast distribution lane.\n",
"risk: Rising dissatisfaction from delays, replacements, and return exceptions.\n",
"next action: Escalate West Coast lane issues and review holiday return policy.
\n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"Response summary
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
" \n",
" \n",
" \n",
" | field | \n",
" value | \n",
"
\n",
" \n",
" \n",
" \n",
" | id | \n",
" resp_lmmtsvgk3ntolh5ci5vxccmsa6uxcgrsq7v54jpz7oewmociyesa | \n",
"
\n",
" \n",
" | model | \n",
" openai.gpt-5.4 | \n",
"
\n",
" \n",
" | status | \n",
" completed | \n",
"
\n",
" \n",
" | output_item_types | \n",
" ['message'] | \n",
"
\n",
" \n",
" | input_tokens | \n",
" 246 | \n",
"
\n",
" \n",
" | cached_input_tokens | \n",
" 0 | \n",
"
\n",
" \n",
" | output_tokens | \n",
" 45 | \n",
"
\n",
" \n",
" | total_tokens | \n",
" 291 | \n",
"
\n",
" \n",
" | reasoning_output_tokens | \n",
" 0 | \n",
"
\n",
" \n",
" | service_tier | \n",
" default | \n",
"
\n",
" \n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"Key takeaway: Background mode starts work asynchronously and lets the application poll by response ID.
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from __future__ import annotations\n",
"backlog = \"\"\"\n",
"Same-day BrightCart support backlog:\n",
"1. 18 delayed-order contacts, mostly from the West Coast distribution lane.\n",
"2. 7 damaged-item replacement contacts; 3 mention replacement delays.\n",
"3. 5 return-window exception requests after holiday promotions.\n",
"\"\"\".strip()\n",
"background_payload = {\n",
" \"model\": MODEL_ID,\n",
" \"input\": f\"Return exactly three labeled plain-text lines for a support-manager summary: theme, risk, next action. Keep each line under 12 words. Do not use leading hyphens or bold text.\\n\\n{backlog}\",\n",
" \"background\": True,\n",
" \"max_output_tokens\": 1024,\n",
" \"store\": True,\n",
"}\n",
"\n",
"print_request_shape(background_payload)\n",
"try:\n",
" background_response = create_response(**background_payload)\n",
" remember_stored_response(background_response)\n",
" status_history = [getattr(background_response, \"status\", None)]\n",
" for _ in range(15):\n",
" if getattr(background_response, \"status\", None) not in {\"queued\", \"in_progress\"}:\n",
" break\n",
" time.sleep(2)\n",
" background_response = retrieve_response(background_response.id)\n",
" status_history.append(getattr(background_response, \"status\", None))\n",
" background_summary = summarize_response(background_response)\n",
" manager_summary = output_text(background_response).strip()\n",
" require(manager_summary, \"Background response did not return text.\")\n",
" status = \"pass\" if background_summary.get(\"status\") in {None, \"completed\"} else \"warn\"\n",
" record_check(\"Background mode\", status, {\"status_history\": status_history, \"id\": getattr(background_response, \"id\", None), \"final_status\": background_summary.get(\"status\")})\n",
" record_response(\"Background manager summary\", \"text\", manager_summary)\n",
" print_labeled_json(\"Result: status history\", status_history)\n",
" print_labeled_text(\"Result: manager summary\", manager_summary)\n",
" print_response_summary(background_summary)\n",
" print_key_takeaway('Background mode starts work asynchronously and lets the application poll by response ID.')\n",
"except Exception as exc:\n",
" handle_example_error(\"Background mode\", exc)"
]
},
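{
"cell_type": "markdown",
"id": "d2f8a6b3",
"metadata": {},
"source": [
"The polling loop above can be factored into a small helper that also reports whether polling ran out of attempts. This is a minimal sketch, assuming a hypothetical zero-argument `fetch_status` callable; the 15 attempts and 2-second delay mirror the cell above, and the fake status source stands in for retrieving a response by ID.\n",
"\n",
"```python\n",
"import time\n",
"\n",
"WAITING_STATUSES = {'queued', 'in_progress'}\n",
"\n",
"\n",
"def poll_until_terminal(fetch_status, max_attempts=15, delay_seconds=2.0, sleep=time.sleep):\n",
"    # fetch_status is a hypothetical callable that returns the current status string.\n",
"    history = [fetch_status()]\n",
"    for _ in range(max_attempts):\n",
"        if history[-1] not in WAITING_STATUSES:\n",
"            break\n",
"        sleep(delay_seconds)\n",
"        history.append(fetch_status())\n",
"    # timed_out is True when the last observed status is still a waiting status.\n",
"    timed_out = history[-1] in WAITING_STATUSES\n",
"    return history, timed_out\n",
"\n",
"\n",
"# Fake status source so the sketch runs without a network call or real sleeping.\n",
"statuses = iter(['queued', 'in_progress', 'completed'])\n",
"history, timed_out = poll_until_terminal(lambda: next(statuses), sleep=lambda _: None)\n",
"print(history, timed_out)\n",
"```\n",
"\n",
"Injecting `sleep` keeps the helper testable; in application code the default `time.sleep` would apply.\n"
]
},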
{
"cell_type": "markdown",
"id": "82b83ade",
"metadata": {},
"source": [
"## 9. Compact Long-Running Context\n",
"\n",
"Compaction reduces long conversation state into durable facts, open questions, constraints, and next actions. This cell documents the application-side compaction pattern as a small JSON object so the concept is clear without adding another live feature path. Inspect which facts are kept and which details are omitted before the next turn.\n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 28,
"id": "41dd4ab2",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
"
JSON
\n",
"
{\n",
" "feature": "Compaction",\n",
" "how_to_apply": "Summarize older support turns into durable facts, open questions, policy constraints, and next actions before continuing the workflow.",\n",
" "brightcart_example": {\n",
" "durable_facts": [\n",
" "Customer Maya Chen",\n",
" "ORDER-8831",\n",
" "replacement delayed",\n",
" "carrier scan stale"\n",
" ],\n",
" "policy_constraints": [\n",
" "Do not promise refund without eligibility",\n",
" "Offer expedited replacement or 15% concession after 48-hour delay with approval"\n",
" ],\n",
" "next_action": "Check latest carrier scan and supervisor callback status."\n",
" }\n",
"}
\n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from __future__ import annotations\n",
"compaction_note = {\n",
" \"feature\": \"Compaction\",\n",
" \"how_to_apply\": \"Summarize older support turns into durable facts, open questions, policy constraints, and next actions before continuing the workflow.\",\n",
" \"brightcart_example\": {\n",
" \"durable_facts\": [\"Customer Maya Chen\", \"ORDER-8831\", \"replacement delayed\", \"carrier scan stale\"],\n",
" \"policy_constraints\": [\"Do not promise refund without eligibility\", \"Offer expedited replacement or 15% concession after 48-hour delay with approval\"],\n",
" \"next_action\": \"Check latest carrier scan and supervisor callback status.\",\n",
" },\n",
"}\n",
"record_check(\"Compaction\", \"documented\", compaction_note)\n",
"record_response(\"Compacted support context\", \"json\", compaction_note)\n",
"print_json(compaction_note)"
]
},
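{
"cell_type": "markdown",
"id": "e5a7b9c4",
"metadata": {},
"source": [
"One way to apply the compacted note is to render it as a single system message for the next request. This is a minimal sketch, assuming the key names from the note above; `compaction_to_message` is a hypothetical helper, and sending the note as a system message is an application choice rather than an API requirement.\n",
"\n",
"```python\n",
"def compaction_to_message(note):\n",
"    # Render the durable facts, constraints, and next action as one system message.\n",
"    example = note['brightcart_example']\n",
"    sentences = [\n",
"        'Compacted support context.',\n",
"        'Durable facts: ' + '; '.join(example['durable_facts']) + '.',\n",
"        'Policy constraints: ' + '; '.join(example['policy_constraints']) + '.',\n",
"        'Next action: ' + example['next_action'],\n",
"    ]\n",
"    return {'role': 'system', 'content': ' '.join(sentences)}\n",
"\n",
"\n",
"# Values copied from the compaction note above.\n",
"note = {\n",
"    'brightcart_example': {\n",
"        'durable_facts': ['Customer Maya Chen', 'ORDER-8831', 'replacement delayed', 'carrier scan stale'],\n",
"        'policy_constraints': [\n",
"            'Do not promise refund without eligibility',\n",
"            'Offer expedited replacement or 15% concession after 48-hour delay with approval',\n",
"        ],\n",
"        'next_action': 'Check latest carrier scan and supervisor callback status.',\n",
"    },\n",
"}\n",
"message = compaction_to_message(note)\n",
"print(message['content'])\n",
"```\n",
"\n",
"The resulting message can replace the older turns at the start of the next request's input, followed by the most recent user turn.\n"
]
},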
{
"cell_type": "markdown",
"id": "0658a6a3",
"metadata": {},
"source": [
"## 10. Run Operational Smoke Checks\n",
"\n",
"Operational smoke checks are lightweight setup checks, not a load test or service-level measurement. This cell sends three short requests, measures local elapsed time, summarizes success rate and token usage, and infers the region from the configured Bedrock base URL. Inspect latency, completion status, sample outputs, and token totals.\n"
]
},
{
"cell_type": "code",
"execution_count": 29,
"id": "20d26f31",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"Request shape
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
" \n",
" \n",
" \n",
" | field | \n",
" value | \n",
"
\n",
" \n",
" \n",
" \n",
" | model | \n",
" openai.gpt-5.4 | \n",
"
\n",
" \n",
" | max_output_tokens | \n",
" 1024 | \n",
"
\n",
" \n",
" | store | \n",
" False | \n",
"
\n",
" \n",
" | service_tier | \n",
" auto | \n",
"
\n",
" \n",
" | input | \n",
" Reply with one short customer-support sentence. | \n",
"
\n",
" \n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
"
Result
\n",
"
{\n",
" "region_hint": "us-west-2",\n",
" "base_url_host": "bedrock-mantle.us-west-2.api.aws",\n",
" "sample_count": 3,\n",
" "success_rate": 1.0,\n",
" "completed_rate": 1.0,\n",
" "avg_latency_seconds": 0.362,\n",
" "p50_latency_seconds": 0.377,\n",
" "p90_latency_seconds": 0.4,\n",
" "total_output_tokens": 34,\n",
" "total_tokens": 544\n",
"}
\n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"Response summary
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
" \n",
" \n",
" \n",
" | ok | \n",
" latency_seconds | \n",
" output_tokens | \n",
" total_tokens | \n",
" status | \n",
" sample_output | \n",
"
\n",
" \n",
" \n",
" \n",
" | True | \n",
" 0.400 | \n",
" 14 | \n",
" 184 | \n",
" completed | \n",
" We apologize for the delay with your replacement order. | \n",
"
\n",
" \n",
" | True | \n",
" 0.310 | \n",
" 6 | \n",
" 174 | \n",
" completed | \n",
" Resolution Rate | \n",
"
\n",
" \n",
" | True | \n",
" 0.377 | \n",
" 14 | \n",
" 186 | \n",
" completed | \n",
" I’m escalating the return exception to a supervisor. | \n",
"
\n",
" \n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"Key takeaway: Responsiveness samples are setup checks, not a load test or service-level measurement.
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from __future__ import annotations\n",
"def infer_region_from_base_url(base_url: str) -> str | None:\n",
" host = normalize_base_url(base_url).replace(\"https://\", \"\").split(\"/\")[0]\n",
" for part in host.split(\".\"):\n",
" if part.count(\"-\") >= 2 and any(char.isdigit() for char in part):\n",
" return part\n",
" return None\n",
"\n",
"\n",
"def percentile(values: list[float], pct: float) -> float | None:\n",
" if not values:\n",
" return None\n",
" ordered = sorted(values)\n",
" index = min(len(ordered) - 1, max(0, round((pct / 100) * (len(ordered) - 1))))\n",
" return round(ordered[index], 3)\n",
"\n",
"\n",
"operations_features = [\"Latency runtime example\", \"Throughput runtime example\", \"Reliability runtime example\", \"Region check\"]\n",
"operations_payload = {\"model\": MODEL_ID, \"input\": \"Reply with one short customer-support sentence.\", \"service_tier\": \"auto\", \"max_output_tokens\": 1024, \"store\": False}\n",
"\n",
"print_request_shape(operations_payload)\n",
"if not RUN_RESPONSIVENESS_CHECK:\n",
" record_check(\"Endpoint responsiveness\", \"skipped\", \"BEDROCK_RESPONSIVENESS_CHECK is disabled.\")\n",
" print_labeled_text(\"Result\", \"Responsiveness check disabled.\")\n",
"else:\n",
" prompts = [\n",
" \"Reply in one short sentence: apologize for a delayed replacement order.\",\n",
" \"Reply with one metric name for support-assistant quality.\",\n",
" \"Reply in one short sentence: hand off a return exception to a supervisor.\",\n",
" ]\n",
" samples = []\n",
" for idx, prompt in enumerate(prompts):\n",
" started = time.perf_counter()\n",
" try:\n",
" response = create_response(model=MODEL_ID, input=prompt, service_tier=\"auto\", max_output_tokens=1024, store=False)\n",
" elapsed = time.perf_counter() - started\n",
" summary = summarize_response(response)\n",
" text = output_text(response).strip()\n",
" samples.append({\n",
" \"ok\": bool(text),\n",
" \"latency_seconds\": round(elapsed, 3),\n",
" \"output_tokens\": summary.get(\"output_tokens\") or 0,\n",
" \"total_tokens\": summary.get(\"total_tokens\") or 0,\n",
" \"status\": summary.get(\"status\"),\n",
" \"sample_output\": text[:140],\n",
" })\n",
" except Exception as exc:\n",
" elapsed = time.perf_counter() - started\n",
" samples.append({\"ok\": False, \"latency_seconds\": round(elapsed, 3), \"error\": describe_api_error(exc)})\n",
"\n",
" successes = [sample for sample in samples if sample[\"ok\"]]\n",
" completed = [sample for sample in successes if sample.get(\"status\") in {None, \"completed\"}]\n",
" latencies = [sample[\"latency_seconds\"] for sample in successes]\n",
" responsiveness_summary = {\n",
" \"region_hint\": infer_region_from_base_url(BASE_URL),\n",
" \"base_url_host\": normalize_base_url(BASE_URL).replace(\"https://\", \"\").split(\"/\")[0],\n",
" \"sample_count\": len(samples),\n",
" \"success_rate\": len(successes) / len(samples) if samples else 0,\n",
" \"completed_rate\": len(completed) / len(samples) if samples else 0,\n",
" \"avg_latency_seconds\": round(sum(latencies) / len(latencies), 3) if latencies else None,\n",
" \"p50_latency_seconds\": percentile(latencies, 50),\n",
" \"p90_latency_seconds\": percentile(latencies, 90),\n",
" \"total_output_tokens\": sum(sample.get(\"output_tokens\", 0) for sample in samples),\n",
" \"total_tokens\": sum(sample.get(\"total_tokens\", 0) for sample in samples),\n",
" }\n",
" status = \"pass\" if len(successes) == len(samples) and len(completed) == len(samples) else \"warn\"\n",
" for feature in operations_features:\n",
" record_check(feature, status, responsiveness_summary)\n",
" record_response(\"Endpoint responsiveness summary\", \"json\", {**responsiveness_summary, \"samples\": samples})\n",
" print_labeled_json(\"Result\", responsiveness_summary)\n",
" print_label(\"Response summary\")\n",
" display_wrapped_table(pd.DataFrame(samples), max_col_width_px=360)\n",
" print_key_takeaway('Responsiveness samples are setup checks, not a load test or service-level measurement.')\n"
]
},
{
"cell_type": "markdown",
"id": "0724c842",
"metadata": {},
"source": [
"## 11. Clean Up and Review Results\n",
"\n",
"Stored responses created by lifecycle, stateful continuation, and background examples are tracked in `STORED_RESPONSE_IDS`. This final cell attempts to delete stored responses when cleanup is enabled, then prints the run summary and example-response gallery. Inspect warnings first; they usually identify endpoint configuration, model availability, or feature-support differences.\n"
]
},
{
"cell_type": "code",
"execution_count": 30,
"id": "4cf13814",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"Stored response cleanup
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
" \n",
" \n",
" \n",
" | response_id | \n",
" status | \n",
" detail | \n",
"
\n",
" \n",
" \n",
" \n",
" | resp_cvhvh7y5ghwrpa35snvk4bzgcgthgxp4tgwkllmf5mrhs7dikfia | \n",
" warn | \n",
" {'exception_class': 'AuthenticationError', 'status_code': 401, 'retryable': False, 'request_id': 'req_gkni5zyr7lkjkz2vfiwvkev2qgxs76crcwz5whhjrdkma7up3yta', 'message': 'Error code: 401 - {'error': {'code': 'invalid_api_key', 'message': 'The security token included in the request is invalid.', 'param': None, 'type': 'permission_denied_error'}}'} | \n",
"
\n",
" \n",
" | resp_vjrtvnakcgxjhnq5b7cj7rtowtdh7chkkf3aqwbdkynhjqiklp3a | \n",
" warn | \n",
" {'exception_class': 'AuthenticationError', 'status_code': 401, 'retryable': False, 'request_id': 'req_vvwhsmp2rkrzwbkajqdpdod2o4j5xo2vxelzxbcp2fjan2ybwc2a', 'message': 'Error code: 401 - {'error': {'code': 'invalid_api_key', 'message': 'The security token included in the request is invalid.', 'param': None, 'type': 'permission_denied_error'}}'} | \n",
"
\n",
" \n",
" | resp_lmmtsvgk3ntolh5ci5vxccmsa6uxcgrsq7v54jpz7oewmociyesa | \n",
" warn | \n",
" {'exception_class': 'AuthenticationError', 'status_code': 401, 'retryable': False, 'request_id': 'req_btgbpxspnokm3wzfndybnfuv3kjudxt3r2ihlsmru7ziisn52goa', 'message': 'Error code: 401 - {'error': {'code': 'invalid_api_key', 'message': 'The security token included in the request is invalid.', 'param': None, 'type': 'permission_denied_error'}}'} | \n",
"
\n",
" \n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"Run summary
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
" \n",
" \n",
" \n",
" | name | \n",
" status | \n",
" detail | \n",
"
\n",
" \n",
" \n",
" \n",
" | Endpoint shape | \n",
" pass | \n",
" https://bedrock-mantle.us-west-2.api.aws/openai/v1/responses | \n",
"
\n",
" \n",
" | Model selection | \n",
" pass | \n",
" Using configured model; model-list metadata is not required for requests. | \n",
"
\n",
" \n",
" | Error handling | \n",
" pass | \n",
" {\"normalized_fields\": [\"exception_class\", \"status_code\", \"retryable\", \"request_id\", \"message\"], \"retryable_status_codes\": [408, 409, 429, 500, 502, 503, 504], \"notes\": \"call_with_retries(...) uses this taxonomy for transient retry handling.\"} | \n",
"
\n",
" \n",
" | Text generation | \n",
" pass | \n",
" resp_naythl6fvzhoctlsdogd4vpr673q5ibagqqpiujbast3sy6viroa | \n",
"
\n",
" \n",
" | Text generation | \n",
" pass | \n",
" {\"id\": \"resp_nmvqefzghd5hi67uwy4wfvhwnnzild3lslqsxqor3cat63kmucoq\", \"model\": \"openai.gpt-5.4\", \"status\": \"completed\", \"output_item_types\": [\"reasoning\", \"message\"], \"input_tokens\": 177, \"output_tokens\": 129, \"total_tokens\": 306, \"cached_input_tokens\": 0, \"reasoning_output_tokens\": 18, \"service_tier\": \"default\"} | \n",
"
\n",
" \n",
" | Reasoning effort | \n",
" pass | \n",
" {\"id\": \"resp_nmvqefzghd5hi67uwy4wfvhwnnzild3lslqsxqor3cat63kmucoq\", \"model\": \"openai.gpt-5.4\", \"status\": \"completed\", \"output_item_types\": [\"reasoning\", \"message\"], \"input_tokens\": 177, \"output_tokens\": 129, \"total_tokens\": 306, \"cached_input_tokens\": 0, \"reasoning_output_tokens\": 18, \"service_tier\": \"default\"} | \n",
"
\n",
" \n",
" | Responses lifecycle | \n",
" pass | \n",
" resp_cvhvh7y5ghwrpa35snvk4bzgcgthgxp4tgwkllmf5mrhs7dikfia | \n",
"
\n",
" \n",
" | Response schema | \n",
" pass | \n",
" {\"id\": \"resp_cvhvh7y5ghwrpa35snvk4bzgcgthgxp4tgwkllmf5mrhs7dikfia\", \"model\": \"openai.gpt-5.4\", \"status\": \"completed\", \"output_item_types\": [\"message\"], \"input_tokens\": 198, \"output_tokens\": 109, \"total_tokens\": 307, \"cached_input_tokens\": 0, \"reasoning_output_tokens\": 0, \"service_tier\": \"default\"} | \n",
"
\n",
" \n",
" | Usage metadata | \n",
" pass | \n",
" {\"id\": \"resp_cvhvh7y5ghwrpa35snvk4bzgcgthgxp4tgwkllmf5mrhs7dikfia\", \"model\": \"openai.gpt-5.4\", \"status\": \"completed\", \"output_item_types\": [\"message\"], \"input_tokens\": 198, \"output_tokens\": 109, \"total_tokens\": 307, \"cached_input_tokens\": 0, \"reasoning_output_tokens\": 0, \"service_tier\": \"default\"} | \n",
"
\n",
" \n",
" | Prompt caching | \n",
" pass | \n",
" {\"id\": \"resp_q4akwbeynfwfwnt5i4tdwkpcgsffdu4lqvng7lnwaob53opswvwq\", \"model\": \"openai.gpt-5.4\", \"status\": \"completed\", \"output_item_types\": [\"reasoning\", \"message\"], \"input_tokens\": 183, \"output_tokens\": 91, \"total_tokens\": 274, \"cached_input_tokens\": 0, \"reasoning_output_tokens\": 34, \"service_tier\": \"default\"} | \n",
"
\n",
" \n",
" | Service tier | \n",
" pass | \n",
" {\"id\": \"resp_q4akwbeynfwfwnt5i4tdwkpcgsffdu4lqvng7lnwaob53opswvwq\", \"model\": \"openai.gpt-5.4\", \"status\": \"completed\", \"output_item_types\": [\"reasoning\", \"message\"], \"input_tokens\": 183, \"output_tokens\": 91, \"total_tokens\": 274, \"cached_input_tokens\": 0, \"reasoning_output_tokens\": 34, \"service_tier\": \"default\"} | \n",
"
\n",
" \n",
" | Reasoning effort | \n",
" pass | \n",
" {\"id\": \"resp_q4akwbeynfwfwnt5i4tdwkpcgsffdu4lqvng7lnwaob53opswvwq\", \"model\": \"openai.gpt-5.4\", \"status\": \"completed\", \"output_item_types\": [\"reasoning\", \"message\"], \"input_tokens\": 183, \"output_tokens\": 91, \"total_tokens\": 274, \"cached_input_tokens\": 0, \"reasoning_output_tokens\": 34, \"service_tier\": \"default\"} | \n",
"
\n",
" \n",
" | Structured Outputs | \n",
" pass | \n",
" {\"ticket_id\": \"TICKET-7429\", \"category\": \"delivery_delay\", \"priority\": \"urgent\", \"customer_sentiment\": \"frustrated and time-sensitive\", \"summary\": \"Customer Maya Chen reports that ORDER-8831 is a replacement shipment for a previously damaged standing desk. The replacement is now 2 days late, carrier tracking has not updated, and she needs the desk delivered before Monday. She is requesting a supervisor callback and wants to know refund options if the replacement cannot arrive in time.\", \"require... | \n",
"
\n",
" \n",
" | JSON mode | \n",
" pass | \n",
" {\"customer_name\": \"Maya Chen\", \"order_id\": \"ORDER-8831\", \"issue_summary\": \"Customer is asking about a delayed replacement order. The carrier tracking scan is stale and has not updated.\", \"next_step\": \"Handoff to support to investigate the carrier delay, verify shipment status, and provide Maya Chen with an update or resolution.\", \"metrics_to_watch\": [\"tracking_scan_recency\", \"carrier_exception_status\", \"replacement_order_delivery_eta\", \"customer_follow_up_time\"]} | \n",
"
\n",
" \n",
" | Verbosity | \n",
" pass | \n",
" {\"compact_chars\": 182, \"detailed_chars\": 332} | \n",
"
\n",
" \n",
" | Function calling | \n",
" pass | \n",
" {\"tool_choice_used\": \"required\", \"arguments\": {\"order_id\": \"ORDER-8831\"}} | \n",
"
\n",
" \n",
" | Parallel tool calls | \n",
" warn | \n",
" {\"returned_order_ids\": [\"ORDER-8831\"], \"missing_order_ids\": [\"ORDER-2044\"]} | \n",
"
\n",
" \n",
" | Custom tools | \n",
" pass | \n",
" {\"output_item_types\": [\"custom_tool_call\"], \"normalized_note\": \"ORDER_ID: ORDER-8831\\nCUSTOMER_ID: CUST-1042\\nISSUE: REPLACEMENT DELAYED\\nCUSTOMER_REQUEST: CUSTOMER WANTS SUPERVISOR\\nPOLICY_OPTION: OFFER EXPEDITED REPLACEMENT OR 15% CONCESSION\"} | \n",
"
\n",
" \n",
" | Direct file inputs | \n",
" warn | \n",
" {\"message\": \"The request completed, but the response was not valid JSON.\", \"text_sample\": \"{\\\"ticket_id\\\":\\\"TICKET-7429\\\",\\\"customer\\\":\\\"Maya Chen\\\",\\\"order_id\\\":\\\"ORDER-8831\\\",\\\"product\\\":\\\"Standing desk replacement\\\",\\\"issue\\\":\\\"Replacement for a damaged item is delayed and carrier scan has not moved\\\",\\\"requested_resolution\\\":\\\"Supervisor callback and refund options\\\",\\\"policy_options\\\":\\\"expedited replacement or 15% concession with agent approval after 48-hour delay\\\"}\", \"error\": \"unhashable... | \n",
"
\n",
" \n",
" | Stateful continuation | \n",
" pass | \n",
" resp_vjrtvnakcgxjhnq5b7cj7rtowtdh7chkkf3aqwbdkynhjqiklp3a | \n",
"
\n",
" \n",
" | Stateless continuation | \n",
" pass | \n",
" {\"id\": \"resp_mezt6yqizyswuvujvudnonr34b73ndyyu2qsfncgtjyppzie5vva\", \"model\": \"openai.gpt-5.4\", \"status\": \"completed\", \"output_item_types\": [\"message\"], \"input_tokens\": 255, \"output_tokens\": 78, \"total_tokens\": 333, \"cached_input_tokens\": 0, \"reasoning_output_tokens\": 0, \"service_tier\": \"default\"} | \n",
"
\n",
" \n",
" | Encrypted reasoning | \n",
" pass | \n",
" {\"encrypted_content_returned\": true, \"reasoning_item_count\": 1} | \n",
"
\n",
" \n",
" | Prompt caching | \n",
" pass | \n",
" {\"first\": {\"id\": \"resp_3w6r6ipbqa5z2max35awv3i23i5sjmfa33zzw3vhpgqxcvchhkgq\", \"model\": \"openai.gpt-5.4\", \"status\": \"completed\", \"output_item_types\": [\"message\"], \"input_tokens\": 3970, \"output_tokens\": 66, \"total_tokens\": 4036, \"cached_input_tokens\": 0, \"reasoning_output_tokens\": 0, \"service_tier\": \"default\"}, \"second\": {\"id\": \"resp_zzjeqttoswdjdwwl56xpolvly23w4h2n5dtsdoddgxqwfbkn7npq\", \"model\": \"openai.gpt-5.4\", \"status\": \"completed\", \"output_item_types\": [\"message\"], \"input_tokens\": 3970, \"outp... | \n",
"
\n",
" \n",
" | Background mode | \n",
" pass | \n",
" {\"status_history\": [\"in_progress\", \"completed\"], \"id\": \"resp_lmmtsvgk3ntolh5ci5vxccmsa6uxcgrsq7v54jpz7oewmociyesa\", \"final_status\": \"completed\"} | \n",
"
\n",
" \n",
" | Compaction | \n",
" documented | \n",
" {\"feature\": \"Compaction\", \"how_to_apply\": \"Summarize older support turns into durable facts, open questions, policy constraints, and next actions before continuing the workflow.\", \"brightcart_example\": {\"durable_facts\": [\"Customer Maya Chen\", \"ORDER-8831\", \"replacement delayed\", \"carrier scan stale\"], \"policy_constraints\": [\"Do not promise refund without eligibility\", \"Offer expedited replacement or 15% concession after 48-hour delay with approval\"], \"next_action\": \"Check latest carrier scan and... | \n",
"
\n",
" \n",
" | Latency runtime example | \n",
" pass | \n",
" {\"region_hint\": \"us-west-2\", \"base_url_host\": \"bedrock-mantle.us-west-2.api.aws\", \"sample_count\": 3, \"success_rate\": 1.0, \"completed_rate\": 1.0, \"avg_latency_seconds\": 0.362, \"p50_latency_seconds\": 0.377, \"p90_latency_seconds\": 0.4, \"total_output_tokens\": 34, \"total_tokens\": 544} | \n",
"
\n",
" \n",
" | Throughput runtime example | \n",
" pass | \n",
" {\"region_hint\": \"us-west-2\", \"base_url_host\": \"bedrock-mantle.us-west-2.api.aws\", \"sample_count\": 3, \"success_rate\": 1.0, \"completed_rate\": 1.0, \"avg_latency_seconds\": 0.362, \"p50_latency_seconds\": 0.377, \"p90_latency_seconds\": 0.4, \"total_output_tokens\": 34, \"total_tokens\": 544} | \n",
"
\n",
" \n",
" | Reliability runtime example | \n",
" pass | \n",
" {\"region_hint\": \"us-west-2\", \"base_url_host\": \"bedrock-mantle.us-west-2.api.aws\", \"sample_count\": 3, \"success_rate\": 1.0, \"completed_rate\": 1.0, \"avg_latency_seconds\": 0.362, \"p50_latency_seconds\": 0.377, \"p90_latency_seconds\": 0.4, \"total_output_tokens\": 34, \"total_tokens\": 544} | \n",
"
\n",
" \n",
" | Region check | \n",
" pass | \n",
" {\"region_hint\": \"us-west-2\", \"base_url_host\": \"bedrock-mantle.us-west-2.api.aws\", \"sample_count\": 3, \"success_rate\": 1.0, \"completed_rate\": 1.0, \"avg_latency_seconds\": 0.362, \"p50_latency_seconds\": 0.377, \"p90_latency_seconds\": 0.4, \"total_output_tokens\": 34, \"total_tokens\": 544} | \n",
"
\n",
" \n",
" | Stored response cleanup | \n",
" warn | \n",
" [{\"response_id\": \"resp_cvhvh7y5ghwrpa35snvk4bzgcgthgxp4tgwkllmf5mrhs7dikfia\", \"status\": \"warn\", \"detail\": {\"exception_class\": \"AuthenticationError\", \"status_code\": 401, \"retryable\": false, \"request_id\": \"req_gkni5zyr7lkjkz2vfiwvkev2qgxs76crcwz5whhjrdkma7up3yta\", \"message\": \"Error code: 401 - {'error': {'code': 'invalid_api_key', 'message': 'The security token included in the request is invalid.', 'param': None, 'type': 'permission_denied_error'}}\"}}, {\"response_id\": \"resp_vjrtvnakcgxjhnq5b7cj7rt... | \n",
"
\n",
" \n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"Example responses"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
" \n",
" \n",
" \n",
" \n",
" | example | \n",
" response_type | \n",
" response | \n",
"
\n",
" \n",
" \n",
" \n",
" | Endpoint verification | \n",
" text | \n",
" ok | \n",
"
\n",
" \n",
" | First raw HTTPS request | \n",
" text | \n",
" Empathy: I’m sorry, Maya — your replacement order ORDER-8831 is delayed because the carrier reported a temporary transit hold at the regional sorting facility.\\nAction: We’re monitoring the shipment closely and will send you an updated delivery estimate within 24 hours; if there’s no movement by then, we’ll review the next replacement or refund options with you. | \n",
"
\n",
" \n",
" | SDK text generation | \n",
" text | \n",
" Use the Responses API to build a BrightCart support assistant that can answer customer questions, summarize policies, and guide users through common workflows like order tracking, refunds, and account updates. Ground the assistant in BrightCart documentation and connect it to relevant backend tools or APIs so it can retrieve live order data, check account status, and provide accurate, context-aware support responses. Design the experience around clear system instructions, structured tool calling, and conversation state management so the assistant stays on-brand, reliable, and safe when handling customer issues. | \n",
"
\n",
" \n",
" | Create and retrieve response | \n",
" text | \n",
" goal: Help support agents explain delayed replacement orders, set expectations, and suggest next steps.\\ndata needed: Order ID, replacement order status, shipment/tracking events, delay reason, estimated ship/delivery date, customer contact history, inventory/backorder status, and applicable refund or reship policy.\\nhuman-review rule: Escalate to a human if the delay exceeds policy thresholds, tracking is inconsistent or missing, the order appears lost, the customer is high-risk or highly upset, or any refund/reship exception is requested. | \n",
"
\n",
" \n",
" | Service tier and prompt cache request | \n",
" text | \n",
" Latency benefit: Prompt caching lets the BrightCart support assistant reuse previously processed context, reducing response time for repeated or similar requests.\\nConsistency benefit: Prompt caching helps the BrightCart support assistant return more uniform answers by reusing the same established prompt context across interactions. | \n",
"
\n",
" \n",
" | Structured ticket triage | \n",
" json | \n",
" {\\n \"ticket_id\": \"TICKET-7429\",\\n \"category\": \"delivery_delay\",\\n \"priority\": \"urgent\",\\n \"customer_sentiment\": \"frustrated and time-sensitive\",\\n \"summary\": \"Customer Maya Chen reports that ORDER-8831 is a replacement shipment for a previously damaged standing desk. The replacement is now 2 days late, carrier tracking has not updated, and she needs the desk delivered before Monday. She is requesting a supervisor callback and wants to know refund options if the replacement cannot arrive in time.\",\\n \"required_actions\": [\\n \"Review ORDER-8831 shipment status and confirm last carrier scan/update.\",\\n \"Contact carrier or open a trace/escalation for stalled tracking.\",\\n \"Check expedited reshipment or alternative fulfillment options to meet the before-Monday deadline.\",\\n \"Arrange supervisor callback per customer request.\",\\n \"Review and communicate refund options, including refund\\n... | \n",
"
\n",
" \n",
" | JSON support handoff | \n",
" json | \n",
" {\\n \"customer_name\": \"Maya Chen\",\\n \"order_id\": \"ORDER-8831\",\\n \"issue_summary\": \"Customer is asking about a delayed replacement order. The carrier tracking scan is stale and has not updated.\",\\n \"next_step\": \"Handoff to support to investigate the carrier delay, verify shipment status, and provide Maya Chen with an update or resolution.\",\\n \"metrics_to_watch\": [\\n \"tracking_scan_recency\",\\n \"carrier_exception_status\",\\n \"replacement_order_delivery_eta\",\\n \"customer_follow_up_time\"\\n ]\\n} | \n",
"
\n",
" \n",
" | Compact policy guidance | \n",
" text | \n",
" BrightCart’s delayed-replacement policy lets customers keep using the original item until the replacement arrives, then return the defective product within the allowed return window. | \n",
"
\n",
" \n",
" | Detailed policy guidance | \n",
" text | \n",
" 1. BrightCart sends replacements after customers return the original item and warehouse receipt is confirmed.\\n2. This delay prevents duplicate shipments, verifies eligibility, and reduces fraud or inventory errors.\\n3. Agents should explain timelines clearly, offer return instructions, and reassure customers once receipt is logged. | \n",
"
\n",
" \n",
" | Order-status tool answer | \n",
" text | \n",
" Status: ORDER-8831 is delayed; carrier shows no movement for 36 hours at the Denver sort center, with promised delivery on 2026-06-01.\\nNext best action: Monitor until the 48-hour threshold; if no movement then, contact the customer and offer either an expedited replacement or a 15% concession with agent approval. | \n",
"
\n",
" \n",
" | Parallel order lookup fallback answer | \n",
" text | \n",
" Order statuses: ORDER-8831 is delayed and ORDER-2044 was delivered yesterday.\\nShipping problems: Maya has one active shipping problem, because only ORDER-8831 is currently delayed while ORDER-2044 is already delivered. | \n",
"
\n",
" \n",
" | Normalized support note | \n",
" text | \n",
" ORDER_ID: ORDER-8831\\nCUSTOMER_ID: CUST-1042\\nISSUE: REPLACEMENT DELAYED\\nCUSTOMER_REQUEST: CUSTOMER WANTS SUPERVISOR\\nPOLICY_OPTION: OFFER EXPEDITED REPLACEMENT OR 15% CONCESSION | \n",
"
\n",
" \n",
" | Support transcript extraction text | \n",
" text | \n",
" {\"ticket_id\":\"TICKET-7429\",\"customer\":\"Maya Chen\",\"order_id\":\"ORDER-8831\",\"product\":\"Standing desk replacement\",\"issue\":\"Replacement for a damaged item is delayed and carrier scan has not moved\",\"requested_resolution\":\"Supervisor callback and refund options\",\"policy_options\":\"expedited replacement or 15% concession with agent approval after 48-hour delay\"} | \n",
"
\n",
" \n",
" | Stateful support handoff | \n",
" text | \n",
" Ticket ID: TICKET-4812\\nOrder ID: ORDER-8831\\nCustomer Name: Maya Chen\\nIssue: Replacement standing desk shipment for damaged delivery has had no carrier movement for 36 hours; customer is frustrated because this is the second attempt\\nNext Best Action: Monitor until 48 hours without movement, then offer expedited replacement or 15% concession and escalate to Tier 2 Returns if needed | \n",
"
\n",
" \n",
" | Stateless support handoff | \n",
" text | \n",
" Customer: Jordan Lee reported ORDER-7718 arrived with a cracked monitor stand.\\nIssue: Damaged item; monitor stand is cracked on arrival.\\nRequested Resolution: Customer wants a replacement shipped this week.\\nOpen Question: Jordan asked whether the damaged item must be returned before replacement is sent.\\nStatus: Damage claim captured and awaiting next-agent confirmation on replacement timing and return requirement. | \n",
"
\n",
" \n",
" | State strategy recommendation | \n",
" text | \n",
" Recommendation: Stateless continuation\\nReason: In a regulated support workflow, requiring names, order IDs, and refund context to be explicitly provided each turn improves controllability, auditability, and data-minimization, reducing the risk of unintended retention or cross-session leakage. | \n",
"
\n",
" \n",
" | Prompt-cache token comparison | \n",
" table | \n",
" [\\n {\\n \"request\":\"first\",\\n \"input_tokens\":3970,\\n \"cached_input_tokens\":0,\\n \"output_tokens\":66,\\n \"total_tokens\":4036\\n },\\n {\\n \"request\":\"second\",\\n \"input_tokens\":3970,\\n \"cached_input_tokens\":0,\\n \"output_tokens\":48,\\n \"total_tokens\":4018\\n }\\n] | \n",
"
\n",
" \n",
" | Cached support-policy reply | \n",
" text | \n",
" Hi Maya, I’m sorry your replacement order ORDER-8831 is delayed. I’m checking the latest replacement and carrier status now so I can confirm the best next step for you without making you wait longer than necessary. | \n",
"
\n",
" \n",
" | Background manager summary | \n",
" text | \n",
" theme: Shipping delays dominate, especially West Coast distribution lane.\\nrisk: Rising dissatisfaction from delays, replacements, and return exceptions.\\nnext action: Escalate West Coast lane issues and review holiday return policy. | \n",
"
\n",
" \n",
" | Compacted support context | \n",
" json | \n",
" {\\n \"feature\": \"Compaction\",\\n \"how_to_apply\": \"Summarize older support turns into durable facts, open questions, policy constraints, and next actions before continuing the workflow.\",\\n \"brightcart_example\": {\\n \"durable_facts\": [\\n \"Customer Maya Chen\",\\n \"ORDER-8831\",\\n \"replacement delayed\",\\n \"carrier scan stale\"\\n ],\\n \"policy_constraints\": [\\n \"Do not promise refund without eligibility\",\\n \"Offer expedited replacement or 15% concession after 48-hour delay with approval\"\\n ],\\n \"next_action\": \"Check latest carrier scan and supervisor callback status.\"\\n }\\n} | \n",
"
\n",
" \n",
" | Endpoint responsiveness summary | \n",
" json | \n",
" {\\n \"region_hint\": \"us-west-2\",\\n \"base_url_host\": \"bedrock-mantle.us-west-2.api.aws\",\\n \"sample_count\": 3,\\n \"success_rate\": 1.0,\\n \"completed_rate\": 1.0,\\n \"avg_latency_seconds\": 0.362,\\n \"p50_latency_seconds\": 0.377,\\n \"p90_latency_seconds\": 0.4,\\n \"total_output_tokens\": 34,\\n \"total_tokens\": 544,\\n \"samples\": [\\n {\\n \"ok\": true,\\n \"latency_seconds\": 0.4,\\n \"output_tokens\": 14,\\n \"total_tokens\": 184,\\n \"status\": \"completed\",\\n \"sample_output\": \"We apologize for the delay with your replacement order.\"\\n },\\n {\\n \"ok\": true,\\n \"latency_seconds\": 0.31,\\n \"output_tokens\": 6,\\n \"total_tokens\": 174,\\n \"status\": \"completed\",\\n \"sample_output\": \"Resolution Rate\"\\n },\\n {\\n \"ok\": true,\\n \"latency_seconds\": 0.377,\\n \"output_tokens\": 14,\\n \"total_tokens\": 186,\\n \"status\": \"completed\",\\n \"sample_output\": \"I\\u2019m e\\n... | \n",
"
\n",
" \n",
"
\n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" example | \n",
" response_type | \n",
" response | \n",
"
\n",
" \n",
" \n",
" \n",
" | 0 | \n",
" Endpoint verification | \n",
" text | \n",
" ok | \n",
"
\n",
" \n",
" | 1 | \n",
" First raw HTTPS request | \n",
" text | \n",
" Empathy: I’m sorry, Maya — your replacement order ORDER-8831 is delayed because the carrier reported a temporary transit hold at the regional sorting facility.\\nAction: We’re monitoring the shipment closely and will send you an updated delivery estimate within 24 hours; if there’s no movement by then, we’ll review the next replacement or refund options with you. | \n",
"
\n",
" \n",
" | 2 | \n",
" SDK text generation | \n",
" text | \n",
" Use the Responses API to build a BrightCart support assistant that can answer customer questions, summarize policies, and guide users through common workflows like order tracking, refunds, and account updates. Ground the assistant in BrightCart documentation and connect it to relevant backend tools or APIs so it can retrieve live order data, check account status, and provide accurate, context-aware support responses. Design the experience around clear system instructions, structured tool calling, and conversation state management so the assistant stays on-brand, reliable, and safe when handling customer issues. | \n",
"
\n",
" \n",
" | 3 | \n",
" Create and retrieve response | \n",
" text | \n",
" goal: Help support agents explain delayed replacement orders, set expectations, and suggest next steps.\\ndata needed: Order ID, replacement order status, shipment/tracking events, delay reason, estimated ship/delivery date, customer contact history, inventory/backorder status, and applicable refund or reship policy.\\nhuman-review rule: Escalate to a human if the delay exceeds policy thresholds, tracking is inconsistent or missing, the order appears lost, the customer is high-risk or highly upset, or any refund/reship exception is requested. | \n",
"
\n",
" \n",
" | 4 | \n",
" Service tier and prompt cache request | \n",
" text | \n",
" Latency benefit: Prompt caching lets the BrightCart support assistant reuse previously processed context, reducing response time for repeated or similar requests.\\nConsistency benefit: Prompt caching helps the BrightCart support assistant return more uniform answers by reusing the same established prompt context across interactions. | \n",
"
\n",
" \n",
" | 5 | \n",
" Structured ticket triage | \n",
" json | \n",
" {\\n \"ticket_id\": \"TICKET-7429\",\\n \"category\": \"delivery_delay\",\\n \"priority\": \"urgent\",\\n \"customer_sentiment\": \"frustrated and time-sensitive\",\\n \"summary\": \"Customer Maya Chen reports that ORDER-8831 is a replacement shipment for a previously damaged standing desk. The replacement is now 2 days late, carrier tracking has not updated, and she needs the desk delivered before Monday. She is requesting a supervisor callback and wants to know refund options if the replacement cannot arrive in time.\",\\n \"required_actions\": [\\n \"Review ORDER-8831 shipment status and confirm last carrier scan/update.\",\\n \"Contact carrier or open a trace/escalation for stalled tracking.\",\\n \"Check expedited reshipment or alternative fulfillment options to meet the before-Monday deadline.\",\\n \"Arrange supervisor callback per customer request.\",\\n \"Review and communicate refund options, including refund\\n... | \n",
"
\n",
" \n",
" | 6 | \n",
" JSON support handoff | \n",
" json | \n",
" {\\n \"customer_name\": \"Maya Chen\",\\n \"order_id\": \"ORDER-8831\",\\n \"issue_summary\": \"Customer is asking about a delayed replacement order. The carrier tracking scan is stale and has not updated.\",\\n \"next_step\": \"Handoff to support to investigate the carrier delay, verify shipment status, and provide Maya Chen with an update or resolution.\",\\n \"metrics_to_watch\": [\\n \"tracking_scan_recency\",\\n \"carrier_exception_status\",\\n \"replacement_order_delivery_eta\",\\n \"customer_follow_up_time\"\\n ]\\n} | \n",
"
\n",
" \n",
" | 7 | \n",
" Compact policy guidance | \n",
" text | \n",
" BrightCart’s delayed-replacement policy lets customers keep using the original item until the replacement arrives, then return the defective product within the allowed return window. | \n",
"
\n",
" \n",
" | 8 | \n",
" Detailed policy guidance | \n",
" text | \n",
" 1. BrightCart sends replacements after customers return the original item and warehouse receipt is confirmed.\\n2. This delay prevents duplicate shipments, verifies eligibility, and reduces fraud or inventory errors.\\n3. Agents should explain timelines clearly, offer return instructions, and reassure customers once receipt is logged. | \n",
"
\n",
" \n",
" | 9 | \n",
" Order-status tool answer | \n",
" text | \n",
" Status: ORDER-8831 is delayed; carrier shows no movement for 36 hours at the Denver sort center, with promised delivery on 2026-06-01.\\nNext best action: Monitor until the 48-hour threshold; if no movement then, contact the customer and offer either an expedited replacement or a 15% concession with agent approval. | \n",
"
\n",
" \n",
" | 10 | \n",
" Parallel order lookup fallback answer | \n",
" text | \n",
" Order statuses: ORDER-8831 is delayed and ORDER-2044 was delivered yesterday.\\nShipping problems: Maya has one active shipping problem, because only ORDER-8831 is currently delayed while ORDER-2044 is already delivered. | \n",
"
\n",
" \n",
" | 11 | \n",
" Normalized support note | \n",
" text | \n",
" ORDER_ID: ORDER-8831\\nCUSTOMER_ID: CUST-1042\\nISSUE: REPLACEMENT DELAYED\\nCUSTOMER_REQUEST: CUSTOMER WANTS SUPERVISOR\\nPOLICY_OPTION: OFFER EXPEDITED REPLACEMENT OR 15% CONCESSION | \n",
"
\n",
" \n",
" | 12 | \n",
" Support transcript extraction text | \n",
" text | \n",
" {\"ticket_id\":\"TICKET-7429\",\"customer\":\"Maya Chen\",\"order_id\":\"ORDER-8831\",\"product\":\"Standing desk replacement\",\"issue\":\"Replacement for a damaged item is delayed and carrier scan has not moved\",\"requested_resolution\":\"Supervisor callback and refund options\",\"policy_options\":\"expedited replacement or 15% concession with agent approval after 48-hour delay\"} | \n",
"
\n",
" \n",
" | 13 | \n",
" Stateful support handoff | \n",
" text | \n",
" Ticket ID: TICKET-4812\\nOrder ID: ORDER-8831\\nCustomer Name: Maya Chen\\nIssue: Replacement standing desk shipment for damaged delivery has had no carrier movement for 36 hours; customer is frustrated because this is the second attempt\\nNext Best Action: Monitor until 48 hours without movement, then offer expedited replacement or 15% concession and escalate to Tier 2 Returns if needed | \n",
"
\n",
" \n",
" | 14 | \n",
" Stateless support handoff | \n",
" text | \n",
" Customer: Jordan Lee reported ORDER-7718 arrived with a cracked monitor stand.\\nIssue: Damaged item; monitor stand is cracked on arrival.\\nRequested Resolution: Customer wants a replacement shipped this week.\\nOpen Question: Jordan asked whether the damaged item must be returned before replacement is sent.\\nStatus: Damage claim captured and awaiting next-agent confirmation on replacement timing and return requirement. | \n",
"
\n",
" \n",
" | 15 | \n",
" State strategy recommendation | \n",
" text | \n",
" Recommendation: Stateless continuation\\nReason: In a regulated support workflow, requiring names, order IDs, and refund context to be explicitly provided each turn improves controllability, auditability, and data-minimization, reducing the risk of unintended retention or cross-session leakage. | \n",
"
\n",
" \n",
" | 16 | \n",
" Prompt-cache token comparison | \n",
" table | \n",
" [\\n {\\n \"request\":\"first\",\\n \"input_tokens\":3970,\\n \"cached_input_tokens\":0,\\n \"output_tokens\":66,\\n \"total_tokens\":4036\\n },\\n {\\n \"request\":\"second\",\\n \"input_tokens\":3970,\\n \"cached_input_tokens\":0,\\n \"output_tokens\":48,\\n \"total_tokens\":4018\\n }\\n] | \n",
"
\n",
" \n",
" | 17 | \n",
" Cached support-policy reply | \n",
" text | \n",
" Hi Maya, I’m sorry your replacement order ORDER-8831 is delayed. I’m checking the latest replacement and carrier status now so I can confirm the best next step for you without making you wait longer than necessary. | \n",
"
\n",
" \n",
" | 18 | \n",
" Background manager summary | \n",
" text | \n",
" theme: Shipping delays dominate, especially West Coast distribution lane.\\nrisk: Rising dissatisfaction from delays, replacements, and return exceptions.\\nnext action: Escalate West Coast lane issues and review holiday return policy. | \n",
"
\n",
" \n",
" | 19 | \n",
" Compacted support context | \n",
" json | \n",
" {\\n \"feature\": \"Compaction\",\\n \"how_to_apply\": \"Summarize older support turns into durable facts, open questions, policy constraints, and next actions before continuing the workflow.\",\\n \"brightcart_example\": {\\n \"durable_facts\": [\\n \"Customer Maya Chen\",\\n \"ORDER-8831\",\\n \"replacement delayed\",\\n \"carrier scan stale\"\\n ],\\n \"policy_constraints\": [\\n \"Do not promise refund without eligibility\",\\n \"Offer expedited replacement or 15% concession after 48-hour delay with approval\"\\n ],\\n \"next_action\": \"Check latest carrier scan and supervisor callback status.\"\\n }\\n} | \n",
"
\n",
" \n",
" | 20 | \n",
" Endpoint responsiveness summary | \n",
" json | \n",
" {\\n \"region_hint\": \"us-west-2\",\\n \"base_url_host\": \"bedrock-mantle.us-west-2.api.aws\",\\n \"sample_count\": 3,\\n \"success_rate\": 1.0,\\n \"completed_rate\": 1.0,\\n \"avg_latency_seconds\": 0.362,\\n \"p50_latency_seconds\": 0.377,\\n \"p90_latency_seconds\": 0.4,\\n \"total_output_tokens\": 34,\\n \"total_tokens\": 544,\\n \"samples\": [\\n {\\n \"ok\": true,\\n \"latency_seconds\": 0.4,\\n \"output_tokens\": 14,\\n \"total_tokens\": 184,\\n \"status\": \"completed\",\\n \"sample_output\": \"We apologize for the delay with your replacement order.\"\\n },\\n {\\n \"ok\": true,\\n \"latency_seconds\": 0.31,\\n \"output_tokens\": 6,\\n \"total_tokens\": 174,\\n \"status\": \"completed\",\\n \"sample_output\": \"Resolution Rate\"\\n },\\n {\\n \"ok\": true,\\n \"latency_seconds\": 0.377,\\n \"output_tokens\": 14,\\n \"total_tokens\": 186,\\n \"status\": \"completed\",\\n \"sample_output\": \"I\\u2019m e\\n... | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" example response_type \\\n",
"0 Endpoint verification text \n",
"1 First raw HTTPS request text \n",
"2 SDK text generation text \n",
"3 Create and retrieve response text \n",
"4 Service tier and prompt cache request text \n",
"5 Structured ticket triage json \n",
"6 JSON support handoff json \n",
"7 Compact policy guidance text \n",
"8 Detailed policy guidance text \n",
"9 Order-status tool answer text \n",
"10 Parallel order lookup fallback answer text \n",
"11 Normalized support note text \n",
"12 Support transcript extraction text text \n",
"13 Stateful support handoff text \n",
"14 Stateless support handoff text \n",
"15 State strategy recommendation text \n",
"16 Prompt-cache token comparison table \n",
"17 Cached support-policy reply text \n",
"18 Background manager summary text \n",
"19 Compacted support context json \n",
"20 Endpoint responsiveness summary json \n",
"\n",
" response \n",
"0 ok \n",
"1 Empathy: I’m sorry, Maya — your replacement order ORDER-8831 is delayed because the carrier reported a temporary transit hold at the regional sorting facility.\\nAction: We’re monitoring the shipment closely and will send you an updated delivery estimate within 24 hours; if there’s no movement by then, we’ll review the next replacement or refund options with you. \n",
"2 Use the Responses API to build a BrightCart support assistant that can answer customer questions, summarize policies, and guide users through common workflows like order tracking, refunds, and account updates. Ground the assistant in BrightCart documentation and connect it to relevant backend tools or APIs so it can retrieve live order data, check account status, and provide accurate, context-aware support responses. Design the experience around clear system instructions, structured tool calling, and conversation state management so the assistant stays on-brand, reliable, and safe when handling customer issues. \n",
"3 goal: Help support agents explain delayed replacement orders, set expectations, and suggest next steps.\\ndata needed: Order ID, replacement order status, shipment/tracking events, delay reason, estimated ship/delivery date, customer contact history, inventory/backorder status, and applicable refund or reship policy.\\nhuman-review rule: Escalate to a human if the delay exceeds policy thresholds, tracking is inconsistent or missing, the order appears lost, the customer is high-risk or highly upset, or any refund/reship exception is requested. \n",
"4 Latency benefit: Prompt caching lets the BrightCart support assistant reuse previously processed context, reducing response time for repeated or similar requests.\\nConsistency benefit: Prompt caching helps the BrightCart support assistant return more uniform answers by reusing the same established prompt context across interactions. \n",
"5 {\\n \"ticket_id\": \"TICKET-7429\",\\n \"category\": \"delivery_delay\",\\n \"priority\": \"urgent\",\\n \"customer_sentiment\": \"frustrated and time-sensitive\",\\n \"summary\": \"Customer Maya Chen reports that ORDER-8831 is a replacement shipment for a previously damaged standing desk. The replacement is now 2 days late, carrier tracking has not updated, and she needs the desk delivered before Monday. She is requesting a supervisor callback and wants to know refund options if the replacement cannot arrive in time.\",\\n \"required_actions\": [\\n \"Review ORDER-8831 shipment status and confirm last carrier scan/update.\",\\n \"Contact carrier or open a trace/escalation for stalled tracking.\",\\n \"Check expedited reshipment or alternative fulfillment options to meet the before-Monday deadline.\",\\n \"Arrange supervisor callback per customer request.\",\\n \"Review and communicate refund options, including refund\\n... \n",
"6 {\\n \"customer_name\": \"Maya Chen\",\\n \"order_id\": \"ORDER-8831\",\\n \"issue_summary\": \"Customer is asking about a delayed replacement order. The carrier tracking scan is stale and has not updated.\",\\n \"next_step\": \"Handoff to support to investigate the carrier delay, verify shipment status, and provide Maya Chen with an update or resolution.\",\\n \"metrics_to_watch\": [\\n \"tracking_scan_recency\",\\n \"carrier_exception_status\",\\n \"replacement_order_delivery_eta\",\\n \"customer_follow_up_time\"\\n ]\\n} \n",
"7 BrightCart’s delayed-replacement policy lets customers keep using the original item until the replacement arrives, then return the defective product within the allowed return window. \n",
"8 1. BrightCart sends replacements after customers return the original item and warehouse receipt is confirmed.\\n2. This delay prevents duplicate shipments, verifies eligibility, and reduces fraud or inventory errors.\\n3. Agents should explain timelines clearly, offer return instructions, and reassure customers once receipt is logged. \n",
"9 Status: ORDER-8831 is delayed; carrier shows no movement for 36 hours at the Denver sort center, with promised delivery on 2026-06-01.\\nNext best action: Monitor until the 48-hour threshold; if no movement then, contact the customer and offer either an expedited replacement or a 15% concession with agent approval. \n",
"10 Order statuses: ORDER-8831 is delayed and ORDER-2044 was delivered yesterday.\\nShipping problems: Maya has one active shipping problem, because only ORDER-8831 is currently delayed while ORDER-2044 is already delivered. \n",
"11 ORDER_ID: ORDER-8831\\nCUSTOMER_ID: CUST-1042\\nISSUE: REPLACEMENT DELAYED\\nCUSTOMER_REQUEST: CUSTOMER WANTS SUPERVISOR\\nPOLICY_OPTION: OFFER EXPEDITED REPLACEMENT OR 15% CONCESSION \n",
"12 {\"ticket_id\":\"TICKET-7429\",\"customer\":\"Maya Chen\",\"order_id\":\"ORDER-8831\",\"product\":\"Standing desk replacement\",\"issue\":\"Replacement for a damaged item is delayed and carrier scan has not moved\",\"requested_resolution\":\"Supervisor callback and refund options\",\"policy_options\":\"expedited replacement or 15% concession with agent approval after 48-hour delay\"} \n",
"13 Ticket ID: TICKET-4812\\nOrder ID: ORDER-8831\\nCustomer Name: Maya Chen\\nIssue: Replacement standing desk shipment for damaged delivery has had no carrier movement for 36 hours; customer is frustrated because this is the second attempt\\nNext Best Action: Monitor until 48 hours without movement, then offer expedited replacement or 15% concession and escalate to Tier 2 Returns if needed \n",
"14 Customer: Jordan Lee reported ORDER-7718 arrived with a cracked monitor stand.\\nIssue: Damaged item; monitor stand is cracked on arrival.\\nRequested Resolution: Customer wants a replacement shipped this week.\\nOpen Question: Jordan asked whether the damaged item must be returned before replacement is sent.\\nStatus: Damage claim captured and awaiting next-agent confirmation on replacement timing and return requirement. \n",
"15 Recommendation: Stateless continuation\\nReason: In a regulated support workflow, requiring names, order IDs, and refund context to be explicitly provided each turn improves controllability, auditability, and data-minimization, reducing the risk of unintended retention or cross-session leakage. \n",
"16 [\\n {\\n \"request\":\"first\",\\n \"input_tokens\":3970,\\n \"cached_input_tokens\":0,\\n \"output_tokens\":66,\\n \"total_tokens\":4036\\n },\\n {\\n \"request\":\"second\",\\n \"input_tokens\":3970,\\n \"cached_input_tokens\":0,\\n \"output_tokens\":48,\\n \"total_tokens\":4018\\n }\\n] \n",
"17 Hi Maya, I’m sorry your replacement order ORDER-8831 is delayed. I’m checking the latest replacement and carrier status now so I can confirm the best next step for you without making you wait longer than necessary. \n",
"18 theme: Shipping delays dominate, especially West Coast distribution lane.\\nrisk: Rising dissatisfaction from delays, replacements, and return exceptions.\\nnext action: Escalate West Coast lane issues and review holiday return policy. \n",
"19 {\\n \"feature\": \"Compaction\",\\n \"how_to_apply\": \"Summarize older support turns into durable facts, open questions, policy constraints, and next actions before continuing the workflow.\",\\n \"brightcart_example\": {\\n \"durable_facts\": [\\n \"Customer Maya Chen\",\\n \"ORDER-8831\",\\n \"replacement delayed\",\\n \"carrier scan stale\"\\n ],\\n \"policy_constraints\": [\\n \"Do not promise refund without eligibility\",\\n \"Offer expedited replacement or 15% concession after 48-hour delay with approval\"\\n ],\\n \"next_action\": \"Check latest carrier scan and supervisor callback status.\"\\n }\\n} \n",
"20 {\\n \"region_hint\": \"us-west-2\",\\n \"base_url_host\": \"bedrock-mantle.us-west-2.api.aws\",\\n \"sample_count\": 3,\\n \"success_rate\": 1.0,\\n \"completed_rate\": 1.0,\\n \"avg_latency_seconds\": 0.362,\\n \"p50_latency_seconds\": 0.377,\\n \"p90_latency_seconds\": 0.4,\\n \"total_output_tokens\": 34,\\n \"total_tokens\": 544,\\n \"samples\": [\\n {\\n \"ok\": true,\\n \"latency_seconds\": 0.4,\\n \"output_tokens\": 14,\\n \"total_tokens\": 184,\\n \"status\": \"completed\",\\n \"sample_output\": \"We apologize for the delay with your replacement order.\"\\n },\\n {\\n \"ok\": true,\\n \"latency_seconds\": 0.31,\\n \"output_tokens\": 6,\\n \"total_tokens\": 174,\\n \"status\": \"completed\",\\n \"sample_output\": \"Resolution Rate\"\\n },\\n {\\n \"ok\": true,\\n \"latency_seconds\": 0.377,\\n \"output_tokens\": 14,\\n \"total_tokens\": 186,\\n \"status\": \"completed\",\\n \"sample_output\": \"I\\u2019m e\\n... "
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from __future__ import annotations\n",
"\n",
"cleanup_rows = []\n",
"tracked_ids = list(dict.fromkeys(STORED_RESPONSE_IDS))\n",
"\n",
"if not tracked_ids:\n",
" cleanup_rows.append({\"response_id\": \"none\", \"status\": \"no stored responses tracked\", \"detail\": \"\"})\n",
"elif not CLEAN_UP_STORED_RESPONSES:\n",
" for stored_id in tracked_ids:\n",
" cleanup_rows.append({\"response_id\": stored_id, \"status\": \"skipped\", \"detail\": \"BEDROCK_CLEANUP_STORED_RESPONSES is disabled\"})\n",
" record_check(\"Stored response cleanup\", \"skipped\", cleanup_rows)\n",
"else:\n",
" for stored_id in tracked_ids:\n",
" try:\n",
" delete_result = delete_response(stored_id)\n",
" cleanup_rows.append({\"response_id\": stored_id, \"status\": \"deleted\", \"detail\": compact_text(to_dict(delete_result), 240)})\n",
" except Exception as exc:\n",
" cleanup_rows.append({\"response_id\": stored_id, \"status\": \"warn\", \"detail\": describe_api_error(exc)})\n",
" cleanup_status = \"pass\" if all(row[\"status\"] == \"deleted\" for row in cleanup_rows) else \"warn\"\n",
" record_check(\"Stored response cleanup\", cleanup_status, cleanup_rows)\n",
"\n",
"print_label(\"Stored response cleanup\")\n",
"display_wrapped_table(pd.DataFrame(cleanup_rows), max_col_width_px=520)\n",
"\n",
"summary_df = pd.DataFrame(RESULTS_SUMMARY)\n",
"print_label(\"Run summary\")\n",
"display_wrapped_table(summary_df, max_col_width_px=620)\n",
"\n",
"print_label(\"Example responses\")\n",
"print_response_gallery()\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv (3.11.8)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.8"
}
},
"nbformat": 4,
"nbformat_minor": 5
}