{ "cells": [ { "metadata": {}, "cell_type": "markdown", "source": [ "# Drive the browser with Playwright MCP and Koog\n", "\n", "In this notebook, you'll connect a Koog agent to Playwright's Model Context Protocol (MCP) server and let it drive a real browser to complete a task: open jetbrains.com, accept cookies, and click the AI section in the toolbar.\n", "\n", "We'll keep things simple and reproducible, focusing on a minimal but realistic agent + tools setup you can publish and reuse.\n" ] }, { "metadata": {}, "cell_type": "code", "outputs": [], "execution_count": null, "source": [ "%useLatestDescriptors\n", "%use koog\n" ] }, { "metadata": {}, "cell_type": "markdown", "source": [ "## Prerequisites\n", "- An OpenAI API key exported as an environment variable: `OPENAI_API_KEY`\n", "- Node.js and npx available on your PATH\n", "- Kotlin Jupyter notebook environment with Koog available via `%use koog`\n", "\n", "Tip: Run the Playwright MCP server in headful mode to watch the browser automate the steps.\n" ] }, { "metadata": {}, "cell_type": "markdown", "source": [ "## 1) Provide your OpenAI API key\n", "We read the API key from the `OPENAI_API_KEY` environment variable. This keeps secrets out of the notebook.\n" ] }, { "metadata": {}, "cell_type": "code", "outputs": [], "execution_count": null, "source": [ "// Get the API key from environment variables\n", "val openAIApiToken = System.getenv(\"OPENAI_API_KEY\") ?: error(\"OPENAI_API_KEY environment variable not set\")\n" ] }, { "metadata": {}, "cell_type": "markdown", "source": [ "## 2) Start the Playwright MCP server\n", "We'll launch Playwright's MCP server locally using `npx`. 
Passing `--port` makes it listen on that port and expose an SSE endpoint we can connect to from Koog; without the flag, the server talks over stdio instead.\n" ] }, { "metadata": {}, "cell_type": "code", "outputs": [], "execution_count": null, "source": [ "// Start the Playwright MCP server via npx\n", "val process = ProcessBuilder(\n", " \"npx\",\n", " \"@playwright/mcp@latest\",\n", " \"--port\",\n", " \"8931\"\n", ").start()\n", "\n", "// Give the server a moment to start listening before we connect (2 seconds is a rough guess; adjust as needed)\n", "Thread.sleep(2000)\n" ] }, { "metadata": {}, "cell_type": "markdown", "source": [ "## 3) Connect from Koog and run the agent\n", "We build a minimal Koog `AIAgent` with an OpenAI executor and point its tool registry to the MCP server over SSE. Then we ask it to complete the browser task strictly via tools.\n" ] }, { "metadata": {}, "cell_type": "code", "outputs": [], "execution_count": null, "source": [ "import kotlinx.coroutines.runBlocking\n", "\n", "runBlocking {\n", " println(\"Connecting to Playwright MCP server...\")\n", " val toolRegistry = McpToolRegistryProvider.fromTransport(\n", " transport = McpToolRegistryProvider.defaultSseTransport(\"http://localhost:8931\")\n", " )\n", " println(\"Successfully connected to Playwright MCP server\")\n", "\n", " // Create the agent\n", " val agent = AIAgent(\n", " executor = simpleOpenAIExecutor(openAIApiToken),\n", " llmModel = OpenAIModels.Chat.GPT4o,\n", " toolRegistry = toolRegistry,\n", " )\n", "\n", " val request = \"Open a browser, navigate to jetbrains.com, accept all cookies, click AI in toolbar\"\n", " println(\"Sending request: $request\")\n", "\n", " agent.run(\n", " request + \". \" +\n", " \"You can only call tools. 
Use the Playwright tools to complete this task.\"\n", " )\n", "}\n" ] }, { "metadata": {}, "cell_type": "markdown", "source": [ "## 4) Shut down the MCP process\n", "Always clean up the external process at the end of your run.\n" ] }, { "metadata": {}, "cell_type": "code", "outputs": [], "execution_count": null, "source": [ "// Shutdown the Playwright MCP process\n", "println(\"Closing connection to Playwright MCP server\")\n", "process.destroy()\n" ] }, { "metadata": {}, "cell_type": "markdown", "source": [ "## Troubleshooting\n", "- If the agent can't connect, make sure the MCP server is running on `http://localhost:8931`.\n", "- If you don't see the browser, ensure Playwright is installed and able to launch a browser on your system.\n", "- If you get authentication errors from OpenAI, double-check the `OPENAI_API_KEY` environment variable.\n", "\n", "## Next steps\n", "- Try different websites or flows. The MCP server exposes a rich set of Playwright tools.\n", "- Swap the LLM model, or add more tools to the Koog agent.\n", "- Integrate this flow into your app, or publish the notebook as documentation." ] } ], "metadata": { "kernelspec": { "display_name": "Kotlin", "language": "kotlin", "name": "kotlin" }, "language_info": { "name": "kotlin", "version": "2.2.20-Beta2", "mimetype": "text/x-kotlin", "file_extension": ".kt", "pygments_lexer": "kotlin", "codemirror_mode": "text/x-kotlin", "nbconvert_exporter": "" } }, "nbformat": 4, "nbformat_minor": 0 }