---
title: "The Simon Says System: Empowering LLMs with Visual UI Guidance"
permalink: /futureproof/simon-says-llm-ui-guidance/
description: "Our journey into dynamic UI guidance began unexpectedly, triggered by a seemingly trivial bug in a 'cat fact' API call. This minor format inconsistency, once debugged, provided the critical validation needed to tackle the more ambitious goal: enabling LLMs to visually highlight UI elements. The breakthrough came from realizing that a robust MCP system, combined with real-time WebSocket communication and precise prompt engineering, could transform abstract AI commands into immediate, tangible user interface feedback."
meta_description: Build AI-powered visual UI guidance with MCP. Debug LLM tool calls & implement real-time flashing animations for enhanced user experience.
meta_keywords: AI UI guidance, LLM interaction, Model Control Protocol, MCP, UI flashing, WebSocket, prompt engineering, cat fact bug, user experience, front-end development, Python, JavaScript, CSS animation, observability, debugging, Pipulate framework
layout: post
sort_order: 1
---
{% raw %}
## Setting the Stage: Context for the Curious Book Reader
This entry delves into the intricate world of training Large Language Models (LLMs) to effectively interact with and guide users through complex web interfaces. It chronicles a real-world development journey within the "Pipulate" framework, which emphasizes transparency in its Model Control Protocol (MCP) architecture. The core problem tackled is bridging the gap between an LLM's understanding and its ability to provide real-time visual cues, culminating in the creation of a sophisticated "Simon Says MCP UI Flash System." Understanding MCP as a standardized way for LLMs to make external tool calls is fundamental to appreciating this exploration.
---
# The Complete Guide to Building a Simon Says MCP UI Flash System: From Broken Cat Facts to Working User Interface Guidance
## Introduction: When AI Meets Interactive UI Training
In the rapidly evolving landscape of AI-powered applications, one of the most challenging problems is teaching Large Language Models (LLMs) to interact meaningfully with user interfaces. How do you train an AI to not just understand what a button does, but to actively guide users by highlighting specific interface elements? This article chronicles the complete development journey of a groundbreaking "Simon Says MCP UI Flash System" - a sophisticated tool that enables LLMs to provide visual guidance by flashing any UI element on command.
What started as a simple debugging session for broken cat fact API calls evolved into a comprehensive system that bridges the gap between AI understanding and visual user guidance. By the end of this journey, we had created a system where an LLM could be prompted with simple text commands and respond by making specific parts of a web interface glow with animated effects, complete with helpful explanatory messages.
## The Foundation: Understanding MCP (Model Control Protocol)
Before diving into the implementation details, it's crucial to understand the Model Control Protocol (MCP) architecture that makes this system possible. MCP is a standardized way for LLMs to make external tool calls, enabling them to perform actions beyond simple text generation.
The basic MCP workflow follows this pattern:
1. **LLM receives a prompt** instructing it to use a specific tool
2. **LLM generates an MCP request block** in XML format
3. **MCP system parses the request** and extracts tool name and parameters
4. **Tool executor runs the specified tool** with the provided parameters
5. **Results are returned to the LLM** and formatted for the user
In our Pipulate framework, this system is implemented with extreme transparency - every MCP call is logged with full observability, including execution times, API calls, and complete request/response cycles.
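To make the workflow concrete, here is a minimal sketch of step 3 in Python. The tag names (`<mcp-request>`, `<tool>`, `<params>`) are illustrative assumptions rather than Pipulate's actual schema; the point is simply that the tool name and its parameters get pulled out of the LLM's reply before the executor runs anything.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical MCP request block; the tag names are assumptions, not Pipulate's real schema.
llm_output = """Here is the tool call:
<mcp-request>
  <tool name="get_cat_fact">
    <params></params>
  </tool>
</mcp-request>"""

def parse_mcp_request(text: str):
    """Step 3 of the workflow: extract the tool name and parameters from the LLM's reply."""
    match = re.search(r"<mcp-request>.*?</mcp-request>", text, re.DOTALL)
    if not match:
        return None
    root = ET.fromstring(match.group(0))
    tool = root.find("tool")
    params = {p.tag: (p.text or "") for p in tool.find("params")}
    return tool.get("name"), params

print(parse_mcp_request(llm_output))  # ('get_cat_fact', {})
```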
## The Problem: Broken Cat Facts and Missing UI Guidance
Our journey began with two seemingly unrelated issues:
### Issue 1: The Cat Fact Catastrophe
The existing MCP system had a baseline test - a simple tool that fetched random cat facts from an external API. This tool was critical because it validated that the entire MCP pipeline was working correctly. However, it was completely broken. Despite successful API calls and proper data retrieval, users were seeing error messages instead of cat facts.
The logs told a confusing story:
```
MCP_SUCCESS: Tool 'get_cat_fact' completed successfully
MCP CLIENT: Tool returned non-success status: {'status': 'success', ...}
Sorry, the 'get_cat_fact' tool encountered an error.
```
### Issue 2: The UI Guidance Gap
The second issue was more ambitious: we needed a way for LLMs to provide visual guidance to users navigating complex interfaces. Traditional chatbots can describe what to do ("Click the profile menu"), but they can't show users exactly where to look. In a world of increasingly complex web applications, this limitation significantly impacts user experience.
The vision was clear: create a system where an LLM could be instructed to flash any UI element, making it glow with an animated effect while displaying a helpful message explaining its purpose.
## The Investigation: Debugging the MCP Pipeline
### Tracing the Cat Fact Problem
The first breakthrough came from examining the MCP response handling logic. The cat fact tool was returning this response:
```json
{
  "status": "success",
  "result": {
    "fact": "A group of cats is called a 'clowder.'",
    "length": 38
  }
}
```
However, the MCP client's success detection logic only checked for a boolean flag:
```python
if tool_result.get("success"):
    # Handle success
    ...
```
The problem was a format mismatch. The cat fact tool returned `"status": "success"` (string), but the client expected `"success": true` (boolean). This seemingly minor inconsistency was causing all cat fact requests to be treated as errors despite successful execution.
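A few lines of Python make the failure mode obvious: with the old check, a response that reports success through `status` falls straight through to the error path.

```python
# The cat fact tool's response, as seen in the logs above.
tool_result = {
    "status": "success",
    "result": {"fact": "A group of cats is called a 'clowder.'", "length": 38},
}

if tool_result.get("success"):   # returns None, which is falsy
    print("cat fact delivered")
else:
    print("Sorry, the 'get_cat_fact' tool encountered an error.")  # what users actually saw
```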
### Understanding the UI Flash Challenge
For UI flashing, the challenge was more complex. We needed to:
1. **Identify all flashable UI elements** in the interface
2. **Create a reliable CSS animation system** that wouldn't break layouts
3. **Implement WebSocket-based JavaScript delivery** for real-time effects
4. **Build LLM training prompts** that would reliably generate correct tool calls
5. **Handle different success response formats** across tools
## The Solution: Building the Simon Says MCP System
### Phase 1: Fixing the Success Detection Logic
The first fix was straightforward but critical. We updated the MCP response handler to support both success formats:
```python
# Check for success in multiple formats: "success": true OR "status": "success"
is_success = (tool_result.get("success") is True or
              tool_result.get("status") == "success")
if is_success:
    # Handle successful tool execution
    ...
```
This single change immediately fixed the cat fact baseline, restoring confidence in the MCP system's reliability.
### Phase 2: Designing the UI Flash Architecture
The UI flash system required several interconnected components:
#### Component 1: CSS Animation System
We implemented a robust CSS animation system that could flash any element without causing layout shifts:
```css
.menu-flash {
    animation: menuFlash 0.6s ease-out;
    position: relative;
    z-index: 10;
}

@keyframes menuFlash {
    0%   { box-shadow: 0 0 0 0 rgba(74, 171, 247, 0.9); }
    50%  { box-shadow: 0 0 0 12px rgba(74, 171, 247, 0.7); }
    100% { box-shadow: 0 0 0 0 rgba(74, 171, 247, 0); }
}
```
The animation uses `box-shadow` instead of border changes to avoid layout shifts, and includes theme-aware colors for both light and dark modes.
#### Component 2: WebSocket JavaScript Delivery
The system delivers JavaScript commands via WebSocket to execute flash effects in real-time:
```python
# Simplified sketch of the payload; the full script is built inside the tool shown below.
flash_script = f"""
<script>
    document.getElementById('{element_id}')?.classList.add('menu-flash');
</script>
"""
await chat.broadcast(flash_script)
```
#### Component 3: Comprehensive UI Element Mapping
We created a detailed map of all flashable UI elements:
```python
ui_elements_map = {
    "navigation": {
        "profile-id": "Profile dropdown summary - click to open profile selection menu",
        "app-id": "App dropdown summary - click to open app/workflow selection menu",
        "nav-plugin-search": "Plugin search input - type to find specific features"
    },
    "chat": {
        "msg-list": "Chat message list - scrollable conversation history",
        "msg": "Chat input textarea - where users type messages to the LLM",
        "send-btn": "Send message button - submits chat input to the LLM"
    },
    # ... more categories
}
```
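One payoff of keeping this map in a single place is cheap validation. The helper below is a hypothetical illustration, not one of the shipped tools: it checks a requested ID against the map before any flash script is broadcast.

```python
# Hypothetical helper: confirm a requested ID exists somewhere in the element map.
def is_known_element(element_id: str, elements_map: dict) -> bool:
    return any(element_id in group for group in elements_map.values())

# is_known_element("msg-list", ui_elements_map)  -> True
# is_known_element("bogus-id", ui_elements_map)  -> False
```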
### Phase 3: Creating the Simon Says Training Interface
The Simon Says interface became a sophisticated training tool with four distinct modes:
#### Mode 1: Simple Flash (Guaranteed Success)
This mode provides a foolproof prompt that always works:
```
I need you to flash the chat message list to show the user where their conversation appears. Use this exact tool call:
[MCP block for the ui_flash_element tool with:
  element_id: msg-list
  message: This is where your conversation with the AI appears!
  delay: 500]
Output only the MCP block above. Do not add any other text.
```
#### Mode 2: Cat Fact Baseline
The restored baseline for testing MCP functionality:
```
I need you to fetch a random cat fact to test the MCP system. Use this exact tool call:
[MCP block for the get_cat_fact tool]
Output only the MCP block above. Do not add any other text.
```
#### Mode 3: Advanced Flash
A more sophisticated prompt that gives the LLM choices:
```
You are a UI guidance assistant. Flash ONE of these key interface elements to help the user:
GUARANTEED WORKING ELEMENTS:
- msg-list (chat conversation area)
- app-id (main app menu)
- profile-id (profile selector)
- send-btn (chat send button)
Choose ONE element and use this EXACT format:
[MCP block template]
Replace 'msg-list' with your chosen element ID. Output ONLY the MCP block.
```
#### Mode 4: List Elements
A discovery mode that shows all available UI elements:
```
[Prompt instructing the LLM to call the ui_list_elements tool and output only the resulting MCP block]
```
### Phase 4: Implementing the MCP Tools
Two core MCP tools power the system:
#### Tool 1: ui_flash_element
```python
async def _ui_flash_element(params: dict) -> dict:
    element_id = params.get('element_id', '').strip()
    message = params.get('message', '').strip()
    delay = params.get('delay', 0)

    # Create JavaScript to flash the element (simplified sketch of the payload:
    # apply the .menu-flash class after the requested delay, then remove it)
    flash_script = f"""
    <script>
        setTimeout(() => {{
            const el = document.getElementById('{element_id}');
            if (el) {{
                el.classList.add('menu-flash');
                setTimeout(() => el.classList.remove('menu-flash'), 600);
            }}
        }}, {delay});
    </script>
    """

    # Broadcast via WebSocket
    global chat
    if chat:
        await chat.broadcast(flash_script)
        if message:
            await chat.broadcast(message)

    return {
        "success": True,
        "element_id": element_id,
        "message": message,
        "delay": delay
    }
```
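For orientation, here is roughly how the tool gets exercised with the Simple Flash parameters from Mode 1. The `demo` wrapper is illustrative only; if the module-level `chat` reference is `None`, the broadcast is simply skipped and the result dictionary is still returned.

```python
import asyncio

async def demo():
    # Hypothetical invocation using the Simple Flash parameters from Mode 1.
    result = await _ui_flash_element({
        "element_id": "msg-list",
        "message": "This is where your conversation with the AI appears!",
        "delay": 500,
    })
    print(result)
    # {'success': True, 'element_id': 'msg-list',
    #  'message': 'This is where your conversation with the AI appears!', 'delay': 500}

asyncio.run(demo())
```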
#### Tool 2: ui_list_elements
```python
async def _ui_list_elements(params: dict) -> dict:
    ui_elements = {
        "navigation": {
            "profile-id": "Profile dropdown menu summary",
            "app-id": "App dropdown menu summary",
            # ... more elements
        },
        # ... more categories
    }
    return {
        "success": True,
        "ui_elements": ui_elements,
        "note": "Use ui_flash_element tool with any of these IDs to guide users"
    }
```
## The Technical Deep Dive: Implementation Challenges and Solutions
### Challenge 1: WebSocket Script Execution
One of the trickiest aspects was ensuring that JavaScript delivered via WebSocket would execute reliably. The solution involved:
1. **Proper script tag detection** in the WebSocket message handler
2. **Safe script extraction** and execution using `eval()`
3. **Error handling** for malformed scripts
4. **Timing coordination** between server-side tool execution and client-side effect rendering
```javascript
// WebSocket message handler
if (event.data.trim().startsWith('<script')) {  // detect a server-sent script payload