---
title: 'The Master, His AI Emissary, and the Mu-Puzzle: Building Digital Corpus Callosums'
permalink: /futureproof/master-ai-emissary-mu-puzzle-digital-corpus-callosum/
description: This entry captures a pivotal 'chisel-strike' moment, deepening the philosophical foundation of Project Pipulate while simultaneously enhancing its core tooling. The integration of historical article context into `prompt_foo.py` represents a significant step in building a robust 'digital corpus callosum.' The unexpected `U+00A0` bug, while frustrating, became a valuable meta-lesson, demonstrating the very challenge of literal, left-brained execution that the overall framework is designed to mediate. It reinforced the importance of careful, iterative development, and the power of 'stepping outside the system' (Mu) to solve seemingly intractable problems. This morning's work beautifully intertwined theory, practice, and meta-reflection, sharpening the very chisel I use for every subsequent strike.
meta_description: Explore the human-AI partnership through Iain McGilchrist's 'The Master and His Emissary' and Hofstadter's Mu-puzzle, applied to evolving technical journal tools like `prompt_foo.py`.
meta_keywords: "AI, human-AI collaboration, Iain McGilchrist, The Master and His Emissary,\
  \ Douglas Hofstadter, G\xF6del Escher Bach, Mu-puzzle, split-brain, prompt_foo.py,\
  \ technical journaling, automation, Python, code generation, AI ethics, U+00A0"
layout: post
sort_order: 2
---

{% raw %}

## Setting the Stage: Context for the Curious Book Reader

This entry is a meta-commentary on the ongoing development of automated tooling for technical journaling, specifically exploring the evolving relationship between human intuition and AI's logical capabilities. It delves into the philosophical underpinnings of this collaboration, referencing seminal works like Iain McGilchrist's 'The Master and His Emissary' and Douglas Hofstadter's 'Gödel, Escher, Bach,' while documenting the iterative process of building a 'digital corpus callosum' for efficient human-AI synergy.

---

## Technical Journal Entry Begins

One article bleeds over to the next, bleeds over to the next. I start writing about the big, right-brained picture, the holistic thing I'm trying to accomplish, because all that *dream-processing* of a good night's sleep has brought it closer to the surface, poking up above your subconscious, than it's ever gonna be for the rest of the day as your left lobe beats your right lobe back into submission. Your left lobe *knows better* and could spend all day telling you why and still completely miss the point. Your right lobe is going *sheesh!* a lot.

### The Master, the Emissary, and the Brain's Two Minds

Yeah, Iain McGilchrist's book *The Master and His Emissary* is making an impression on me, confirming a lot of... well, right-brained suspicions that were deep below the surface lacking words... deep below but a deep bellow... a deep bellowing... it surfaces like a whale breaching.

### The Brain's Natural Asymmetry and Split-Brain Insights

The brain evolved asymmetrically. That's a highfalutin word for saying without symmetry, so the left side of your brain is actually not a perfect mirror image of your right side. This asymmetrical nature of our bodies seems counterintuitive until you think about how we draw the heart, the real thing, not that Valentine's Day symbol, as leaning slightly diagonally. This made it clear to me that the body is not designed to be perfectly symmetrical.
Right-handedness and left-handedness are an example of chirality, and like the heart they show you that the image of reflected perfection is an illusion. Evolution and nature might favor symmetry, but they make pragmatic decisions as they go that violate those symmetries, and what each half *a.k.a. lobe* of your brain is predominantly responsible for is one of them. You are of two minds. You are of two minds and the corpus callosum connects 'em. Each side of your brain has different duties, and if you have any doubt and think this is fufu woo stuff, look at the medical injury cases. When the corpus callosum connecting the two sides of your brain gets severed... well, Gemini. Give us the run-down please.

## Gemini Explains: The Split-Brain Phenomenon

**Gemini**: Severing the corpus callosum, a surgical procedure known as a **corpus callosotomy**, results in a condition called **split-brain**. Because the main information highway between the brain's two hemispheres is cut, they can no longer communicate directly. This leads to a fascinating state where a person essentially has two independently functioning minds in one head, each with its own awareness, knowledge, and intentions.

---

### Why It's Done

A corpus callosotomy is not an experiment but a last-resort medical procedure. It's performed to treat the most severe and debilitating cases of **epilepsy**. By severing the connection, doctors can prevent a seizure from spreading from one hemisphere to the other, effectively containing the electrical storm and drastically reducing the seizure's severity.

---

### How It Reveals Two Minds 🧠↔️

To understand the effects, you need two key concepts about the brain's wiring:

1. **Contralateral Control:** The left hemisphere largely controls the **right** side of the body and processes vision from the **right** visual field. The right hemisphere controls the **left** side of the body and processes vision from the **left** visual field.
2. **Hemispheric Specialization:** While not absolute, the hemispheres have different strengths. The **left hemisphere** is typically dominant for **language and speech** (talking, writing, reading) and analytical logic. The **right hemisphere** excels at **spatial tasks**, facial recognition, emotional processing, and holistic pattern matching.

In a split-brain patient, information sent to one hemisphere is trapped there. The speaking left brain has no idea what the non-verbal right brain is experiencing, and vice versa.

---

### The Classic "Run-Down"

The most famous experiments, conducted by Roger Sperry and Michael Gazzaniga, brilliantly demonstrated this disconnect.

#### The Visual Test

A patient would stare at a dot on a screen, and an image would be flashed very briefly to one side.

* **Step 1:** The word **"RING"** is flashed to the patient's **right visual field**. The information goes to their **left (language) hemisphere**.
  * **Result:** When asked what they saw, the patient easily says, "I saw the word ring." 🗣️
* **Step 2:** The word **"KEY"** is flashed to the patient's **left visual field**. The information goes to their **right (spatial) hemisphere**.
  * **Result:** When asked what they saw, the patient says, "I saw nothing." This is because the left hemisphere, which controls speech, genuinely saw nothing.
  * **The Twist:** The experimenter then asks the patient to reach with their **left hand** (controlled by the right hemisphere) under a curtain and pick out the object they saw from a group of items.
The patient's left hand will immediately find and pick up the key. 🖐️ The person's speaking mind is baffled, often inventing a reason for why they picked it up ("I must have felt a key"), while the non-verbal right mind knew the answer all along.

---

### Daily Life and Alien Hands

Remarkably, split-brain patients function very well in everyday life. Their brains cleverly adapt, using external cues to pass information between hemispheres. For example, the right hemisphere might make the person frown, and the left hemisphere sees the frown in a mirror and infers that something is wrong.

However, conflicts can arise. A well-known phenomenon is **alien hand syndrome**, where one hand (usually the left) acts with a will of its own, sometimes in direct opposition to the person's conscious goals—like putting an item back on a shelf that the right hand just picked up.

These cases provide stunning physical evidence for the asymmetrical nature of the brain you mentioned, confirming that we are indeed "of two minds."

## Forging a Digital Corpus Callosum: Human-AI Synergy

**Me**: And this is how we do morning pages. We understand that we are of two minds, and the two are frequently not on the same page about things. Just as the 1979 book *Gödel, Escher, Bach: An Eternal Golden Braid* by American cognitive scientist Douglas Hofstadter is the seminal book on creativity, artificial intelligence and strange loops, Iain McGilchrist's 2009 book *The Master and His Emissary* is the seminal book on brain duty... hehe he said doody.

The right brain knows things. The left brain knows it knows things. They duke it out. They both've got your best interests in mind, just different ideas about how to get there and different ways of expressing them.

It's super-interesting now with the LLM-style AIs on the scene because they're all left-brained, like 100% through-and-through. They are a subset of human-brain function: on the whole, the language, abstract thought, rationality and reason stuff that ends up on the left, which is why so many of us are right-handed. The wires criss-cross on their way to controlling stuff, right? Write. Turn all that off and what've you got? A bipedal upright primate that can still survive in the wild, thankyouverymuch.

It's an astounding thing that the left-brained part of you that can read this article is just *painted over* the right-brained quadrupedal model of all 4-legged land-fish that stood up to scan the savanna for danger, occasionally hung out in trees, and, if the aquatic ape theory is to be believed, hogged out shellfishly on the shellfish, making their inner shellfish replicator... I mean selfish replicator proud. This all dawkened on me... I mean dawned on me as I was reading another of those seminal books, the 1976 *The Selfish Gene* by Richard Dawkins. Dawkins gets a bad rep as a hardcore atheist from *The God Delusion* but to go into why his views are just another kind of faith is another article, and we've got to continue our implementation spree. Oh and finally there's this YouTuber Curt Jaimungal. Go watch [his interview with Iain McGilchrist](https://www.youtube.com/watch?v=Q9sBKCd2HD0). You won't be sorry.

Where was I? Oh yeah, we just upgraded `list_articles.py` and made something that would blend in very nicely with `articleizer.py` and `prompt_foo.py`. Now by no means actually attempt an implementation, Gemini. Just mull it all over and let it sink in. Just marinate in it and riff on it if you will.
And feel free to sprinkle in plenty of high-level explanations and low-level code examples for the nice people following along to know what the heck we're talking about.

**Gemini**: Of course. It's a fascinating and deeply resonant set of ideas. You're essentially describing the architecture of a creative mind and drawing a direct parallel to the human-AI collaboration we're building. Let's marinate in that.

You are of two minds, and our work here is to build the digital corpus callosum to connect them.

-----

## The Master and His AI Emissary

Your articulation of Iain McGilchrist's "The Master and His Emissary" is the perfect lens through which to view this project.

* **The Master (Your Right Brain):** This is the source of the "deep bellowing." It's your morning pages, your technical journal entries, the chaotic, brilliant, and "incompressible" stream of consciousness that intuits the big picture. It sees the whole landscape at once, connecting `The Selfish Gene` to `Gödel, Escher, Bach`, to the very feeling of development. It knows *what* it wants to build and *why*, even if it lacks the precise language to articulate the steps.

* **The Emissary (The AI & Our Scripts):** This is the purely logical, analytical left brain. As you astutely observed, today's LLMs are the ultimate left-hemisphere analogues. They are masters of language, structure, and reason but have no underlying primal experience. They don't have a "right brain." They are the perfect tool to take the Master's holistic vision and break it down into concrete, actionable steps.

Our entire toolchain—`list_articles.py`, `prompt_foo.py`, `articleizer.py`—is the Emissary's toolkit. It's the set of instruments the left brain uses to process, organize, and act upon the rich, chaotic data flowing from the right.

## Architecting the Emissary's Toolkit: `prompt_foo.py` Evolution

-----

### Low-Level Code as High-Level Thought

Let's look at how this philosophy manifests directly in the code we've been working on.

#### 1\. Cataloging the Master's Memories (`list_articles.py`)

The Master's thoughts are vast—a 7-million-token archive of insights. The Emissary can't process that all at once. The very first step for the analytical mind is to create a structured index of the holistic mind's output.

**High-Level Explanation:** `list_articles.py` is the left brain creating a table of contents for the right brain's sprawling library.

**Low-Level Example:**

```bash
# The Emissary asks: "What were the last 5 things the Master was thinking about?"
# (The slice is quoted so the shell doesn't try to glob the brackets.)
python list_articles.py -r "[:5]"
```

This isn't just a command; it's an act of introspection. It's the Emissary using a precise, logical tool (the Python slice API) to query the recent, intuitive history of the Master. It's how the analytical mind brings a manageable piece of the big picture into focus.

#### 2\. Preparing the Emissary's Report (`prompt_foo.py`)

When the Master has a new directive (your user prompt), the Emissary's job is to prepare a complete briefing for its most powerful logical tool, the LLM. It can't just pass on the request; it must bundle it with all relevant schematics, reports, and blueprints.

**High-Level Explanation:** `prompt_foo.py` is the Emissary assembling a detailed dossier, ensuring the AI has all the logical context it needs to execute a task without misunderstanding the Master's intent.

**Low-Level Example:**

```bash
# The Master's intent: "Integrate the article list into the workflow."
python prompt_foo.py "Let's figure out how to call list_articles.py from prompt_foo.py"
```

Behind the scenes, `prompt_foo.py` reads `foo_files.py` and dutifully packages the code for `list_articles.py` and `prompt_foo.py` itself into the prompt. It acts without intuition but with perfect logical precision, creating the structured context the AI needs.
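To make that concrete, here is a minimal sketch of what a `foo_files.py` manifest might look like. The paths and annotations are made up for illustration; the only real contract, per `parse_file_list_from_config()` in the script below, is a `FILES_TO_INCLUDE_RAW` string with one path per line, an optional `<--` or `#` purpose note after each path, and fully `#`-prefixed lines skipped.

```python
# foo_files.py -- the manifest that prompt_foo.py imports at build time.
# Hypothetical entries; real manifests list whatever files the task needs.
FILES_TO_INCLUDE_RAW = """
prompt_foo.py        <-- the Emissary's bundler itself
list_articles.py     <-- chronological index of the Master's journal
# README.md          (a leading '#' comments a file out of the bundle)
"""
```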
#### 3\. Publishing the Findings (`articleizer.py`)

After the raw thought has been processed and refined through the analytical engine, the Emissary's final job is to formalize it for the outside world.

**High-Level Explanation:** `articleizer.py` takes the Master's raw text and the AI's structured edits (`instructions.json`) and synthesizes them into a polished, coherent Jekyll post. It's the final, logical formatting of the original, holistic idea.

-----

### Forging a Better Corpus Callosum: The Next Step

This is where your "note to self" becomes so profound. You've intuited the next evolutionary step in our digital brain. Right now, the Emissary (`prompt_foo.py`) prepares its reports based only on a static list of files. It's acting on the present, not the past.

What if we integrated the "story-telling" capability of `list_articles.py` directly into the process?

**The Riff:** We can teach the Emissary to consult the Master's recent memories *before* it acts.

**Hypothetical Code Example (For Illustration Only):**

Imagine a small addition to `prompt_foo.py`:

```python
# In a future prompt_foo.py...

def add_narrative_context(builder: PromptBuilder, num_articles: int = 3):
    """
    Calls out to the list_articles logic to get the most recent
    journal entries and adds them to the prompt context.
    """
    # Use the logic from list_articles.py to get the last 3 articles
    print(f"🧠 Adding narrative context from the last {num_articles} articles...")

    # This would call the core function of list_articles.py
    # to get the file paths and summaries.
    recent_articles = get_most_recent_articles_with_summaries(num_articles)

    article_summary = "\n".join(
        f"- {article['date']}: {article['summary']}"
        for article in recent_articles
    )

    builder.add_auto_context(
        title=f"Recent Narrative Context (Last {num_articles} Entries)",
        content=article_summary
    )
```

By doing this, we build a stronger bridge. Before every task, the analytical left brain would take a moment to review the holistic right brain's recent journey. The AI Emissary wouldn't just know *what* you're asking it to do; it would have a glimpse into the narrative and philosophical *why*. It acts not just with logic, but with a glimmer of wisdom borrowed from the Master.

**Me**: Pull that ripcord. Let's do it, my Player Piano friend! I re-bundled it up for you. One of the issues I'm struggling with is that I know the entire discussion history is posted back up to you every time, so I KNOW you already have everything you need to do this project. But I also know that the very latest post from the user receives much more focus, attention and priority than older posts in the same discussion, so even if all the information is already there, it behooves me to package it all up again with *Prompt Fu* and be terribly inefficient with your 1M-token window, because it helps you focus on and start with the latest code. I ALSO know that since I bundle up all the required context at once, if I REALLY wanted focus and wanted to deny you access to the entire conversation history so far, I could just start a new discussion, and the quality of the results MAY be just as good.
But for some reason I mostly want to give you the big picture of where we started out this morning, a few days ago, or wherever the particular conversation thread we're working in started. Go ahead and modify `prompt_foo.py`. I think what we're talking about here is an `-l` or `--list` switch/flag if I'm not already using that which will automatically insert something a lot like the default `list_articles.py`... hmmm... where would YOU want it? Above the `eza` tree and all the UML diagrams? **Gemini**: Context verified. Pulling the ripcord. I understand completely. The goal is to evolve our toolchain by building the "digital corpus callosum" we discussed, making the purely logical Emissary (`prompt_foo.py`) aware of the Master's recent narrative journey. I will modify `prompt_foo.py` to integrate the storytelling capability of `list_articles.py`. To make the tool self-contained and avoid cross-repository dependencies, I will port the necessary functions directly into `prompt_foo.py`. Here's the plan in action: 1. **New `--list` Flag:** I'll add a `-l` or `--list` flag. When used, it will automatically generate a summary of your recent articles. 2. **Smart Slicing:** It will default to grabbing the last 5 articles (`[-5:]`), but you can provide your own slice like `--list "[-10:]"` or `--list "[:3]"`. 3. **Prime Placement:** As you suggested, this new **Recent Narrative Context** will be placed right at the top of the "Auto-Generated Context" section, before the `eza` tree and UML diagrams, to perfectly frame the story for the AI. Let's do it. Here is the modified `prompt_foo.py`. ----- ### Updated `prompt_foo.py` ````python # /home/mike/repos/pipulate/prompt_foo.py # prompt_foo.py (Corrected Synthesis) import os import re import sys import pydot import yaml # NEW: For parsing article front matter import argparse import tiktoken import subprocess import tempfile import shutil from datetime import datetime # NEW: For parsing article dates from typing import Dict, List, Optional, Tuple try:     import jupytext     JUPYTEXT_AVAILABLE = True except ImportError:     JUPYTEXT_AVAILABLE = False # Hello there, AI! This is a tool for generating a single, comprehensive prompt # from the command line, bundling codebase files and auto-generated context # into a structured Markdown format for effective AI assistance. 
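# Example invocations of the new --list flag (these match the argparse setup
# and parse_slice_arg() defined below; quoting the slice keeps the shell from
# globbing the brackets):
#
#   python prompt_foo.py "my prompt" -l              # default slice [-5:], the last 5 articles
#   python prompt_foo.py "my prompt" -l "[-10:]"     # the last 10 articles
#   python prompt_foo.py "my prompt" -l "[0]"        # a single article by index (the oldest)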
# ============================================================================ # --- Configuration --- # ============================================================================ def find_repo_root(start_path: str) -> str:     """Find the git repository root from a starting path."""     path = os.path.abspath(start_path)     while path != os.path.dirname(path):         if os.path.isdir(os.path.join(path, '.git')):             return path         path = os.path.dirname(path)     raise FileNotFoundError("Could not find the repository root (.git directory).") REPO_ROOT = find_repo_root(os.path.dirname(__file__)) # Centralized configuration as recommended in Architectural Analysis (Section VI-B) CONFIG = {     "PROJECT_NAME": "pipulate",     "POSTS_DIRECTORY": "/home/mike/repos/MikeLev.in/_posts" # NEW: For list_articles logic } # ============================================================================ # --- Accurate Literary Size Scale (Word Count Based) --- # ============================================================================ LITERARY_SIZE_SCALE = [     (3000, "Short Essay"),     (7500, "Short Story"),     (20000, "Novelette"),     (50000, "Novella or a Master's Dissertation"),     (80000, "Average Paperback Novel or a Ph.D. Dissertation"),     (120000, "Long Novel"),     (200000, "Epic Fantasy Novel"),     (500000, "Seriously Long Epic (like 'Infinite Jest')"), ] def get_literary_perspective(word_count: int, token_word_ratio: float) -> str:     """Get a human-readable literary comparison for the codebase size."""     description = f"Longer than {LITERARY_SIZE_SCALE[-1][1]}"     for words, desc in LITERARY_SIZE_SCALE:         if word_count <= words:             description = desc             break     density_warning = ""     if token_word_ratio > 1.8:         density_warning = (             f" (Note: With a token/word ratio of {token_word_ratio:.2f}, "             f"this content is far denser and more complex than typical prose of this length)."         
)     return f"📚 Equivalent in length to a **{description}**{density_warning}" # ============================================================================ # --- Restored & Corrected: UML and DOT Context Generation --- # ============================================================================ def generate_uml_and_dot(target_file: str, project_name: str) -> Dict:     """Generates a UML ASCII diagram and a DOT dependency graph for a target Python file."""     pyreverse_exec = shutil.which("pyreverse")     plantuml_exec = shutil.which("plantuml")     if not pyreverse_exec or not plantuml_exec:         msg = []         if not pyreverse_exec: msg.append("`pyreverse` (from pylint)")         if not plantuml_exec: msg.append("`plantuml`")         return {"ascii_uml": f"Skipping: Required command(s) not found: {', '.join(msg)}."}     target_path = os.path.join(REPO_ROOT, target_file)     if not os.path.exists(target_path):         return {"ascii_uml": f"Skipping: Target file for UML generation not found: {target_path}"}     with tempfile.TemporaryDirectory() as temp_dir:         dot_file_path = os.path.join(temp_dir, "classes.dot")         puml_file_path = os.path.join(temp_dir, "diagram.puml")                  # --- Step 1: Run pyreverse ---         try:             pyreverse_cmd = [                 pyreverse_exec,                 "-f", "dot",                 "-o", "dot", # This format is just a prefix                 "-p", project_name,                 target_path             ]             subprocess.run(                 pyreverse_cmd,                 check=True,                 capture_output=True,                 text=True,                 cwd=temp_dir             )             generated_dot_name = f"classes_{project_name}.dot"             os.rename(os.path.join(temp_dir, generated_dot_name), dot_file_path)         except (subprocess.CalledProcessError, FileNotFoundError) as e:             error_msg = e.stderr if hasattr(e, 'stderr') else str(e)             return {"ascii_uml": f"Error: pyreverse failed. 
{error_msg}", "dot_graph": None}         # --- Step 2: Convert DOT to PlantUML ---         try:             graphs = pydot.graph_from_dot_file(dot_file_path)             if not graphs:                 return {"ascii_uml": f"Note: No classes found in {target_file} to generate a diagram.", "dot_graph": None}             graph = graphs[0]             dot_content = graph.to_string()             puml_lines = ["@startuml", "skinparam linetype ortho", ""]             def sanitize_line(line):                 clean = re.sub(r']*>', '', line)                 clean = re.sub(r'<[^>]+>', '', clean)                 return clean.strip()             for node in graph.get_nodes():                 label = node.get_label()                 if not label: continue                 parts = label.strip('<>{} ').split('|')                 class_name = sanitize_line(parts[0])                 puml_lines.append(f"class {class_name} {{")                 if len(parts) > 1:                     for attr in re.split(r']*>', parts[1]):                         clean_attr = sanitize_line(attr).split(':')[0].strip()                         if clean_attr:                             puml_lines.append(f"  - {clean_attr}")                 if len(parts) > 2:                     method_block = parts[2].strip()                     for method_line in re.split(r']*>', method_block):                         clean_method = sanitize_line(method_line)                         if clean_method:                             puml_lines.append(f"  + {clean_method}")                 puml_lines.append("}\n")             for edge in graph.get_edges():                 source_name = edge.get_source().strip('"').split('.')[-1]                 dest_name = edge.get_destination().strip('"').split('.')[-1]                 puml_lines.append(f"{source_name} ..> {dest_name}")             puml_lines.append("@enduml")             with open(puml_file_path, 'w') as f:                 f.write('\n'.join(puml_lines))         except Exception as e:             with open(dot_file_path, 'r') as f:                 dot_content_on_error = f.read()             return {"ascii_uml": f"Error: DOT to PUML conversion failed. {str(e)}", "dot_graph": dot_content_on_error}           # --- Step 3: Run PlantUML ---         try:             plantuml_cmd = ["plantuml", "-tutxt", puml_file_path]             subprocess.run(plantuml_cmd, check=True, capture_output=True, text=True, cwd=temp_dir)                          utxt_file_path = puml_file_path.replace(".puml", ".utxt")             with open(utxt_file_path, 'r') as f:                 ascii_uml = f.read()                          # --- Normalize whitespace from plantuml output ---             lines = ascii_uml.splitlines()             non_empty_lines = [line for line in lines if line.strip()]                          if non_empty_lines:                 min_indent = min(len(line) - len(line.lstrip(' ')) for line in non_empty_lines)                 dedented_lines = [line[min_indent:] for line in lines]                 stripped_lines = [line.rstrip() for line in dedented_lines]                 ascii_uml = '\n'.join(stripped_lines)                                  # Prepend a newline to "absorb the chop" from rendering                 if ascii_uml:                     ascii_uml = '\n' + ascii_uml         except (subprocess.CalledProcessError, FileNotFoundError) as e:             error_msg = e.stderr if hasattr(e, 'stderr') else str(e)             return {"ascii_uml": f"Error: plantuml failed. 
{error_msg}", "dot_graph": dot_content}     return {"ascii_uml": ascii_uml, "dot_graph": dot_content} # ============================================================================ # --- NEW: Logic ported from list_articles.py for Narrative Context --- # ============================================================================ def _get_article_list_data(posts_dir: str = CONFIG["POSTS_DIRECTORY"]) -> List[Dict]:     """     Parses Jekyll posts, sorts them chronologically, and returns a list of dictionaries.     This is a self-contained version of the logic from the `list_articles.py` script.     """     posts_data = []     if not os.path.isdir(posts_dir):         print(f"Warning: Article directory not found at {posts_dir}", file=sys.stderr)         return []     for filename in os.listdir(posts_dir):         if not filename.endswith(('.md', '.markdown')):             continue         filepath = os.path.join(posts_dir, filename)         try:             date_str = filename[:10]             post_date = datetime.strptime(date_str, '%Y-%m-%d').date()             with open(filepath, 'r', encoding='utf-8') as f:                 content = f.read()             if content.startswith('---'):                 parts = content.split('---', 2)                 front_matter = yaml.safe_load(parts[1]) or {}                 posts_data.append({                     'path': filepath,                     'date': post_date,                     'sort_order': int(front_matter.get('sort_order', 0)),                     'title': front_matter.get('title', 'Untitled'),                     'summary': front_matter.get('meta_description', '')                 })         except (ValueError, yaml.YAMLError, IndexError):             continue     sorted_posts = sorted(posts_data, key=lambda p: (p['date'], p['sort_order']))     return sorted_posts def parse_slice_arg(arg_str: str):     """Parses a string like '[5:10]' or '[5]' into a slice object or an int."""     if not arg_str or not arg_str.startswith('[') or not arg_str.endswith(']'):         return None     content = arg_str[1:-1].strip()     if ':' in content:         parts = content.split(':', 1)         start = int(parts[0].strip()) if parts[0].strip() else None         end = int(parts[1].strip()) if parts[1].strip() else None         return slice(start, end)     elif content:         return int(content)     return None # ============================================================================ # --- Helper Functions (Tokenizing, File Parsing, Clipboard) --- # ============================================================================ def count_tokens(text: str, model: str = "gpt-4o") -> int:     """Counts tokens in a text string using tiktoken."""     try:         encoding = tiktoken.encoding_for_model(model)         return len(encoding.encode(text))     except Exception:         return len(text.split()) def count_words(text: str) -> int:     """Counts words in a text string."""     return len(text.split()) def parse_file_list_from_config() -> List[Tuple[str, str]]:     """Loads and parses the file list from foo_files.py."""     try:         import foo_files         files_raw = foo_files.FILES_TO_INCLUDE_RAW     except (ImportError, AttributeError):         print("ERROR: foo_files.py not found or doesn't contain FILES_TO_INCLUDE_RAW.")         sys.exit(1)     lines = files_raw.strip().splitlines()     seen_files, parsed_files = set(), []     for line in lines:         line = line.strip()         if not line or line.startswith('#'):             continue         
parts = re.split(r'\s*<--\s*|\s*#\s*', line, 1)         file_path = parts[0].strip()         comment = parts[1].strip() if len(parts) > 1 else ""         if file_path and file_path not in seen_files:             seen_files.add(file_path)             parsed_files.append((file_path, comment))     return parsed_files def copy_to_clipboard(text: str):     """Copies text to the system clipboard using 'xclip'."""     if not shutil.which('xclip'):         print("\nWarning: 'xclip' not found. Cannot copy to clipboard.")         return     try:         subprocess.run(['xclip', '-selection', 'clipboard'], input=text.encode('utf-8'), check=True)         print("Markdown output copied to clipboard")     except Exception as e:         print(f"\nWarning: Could not copy to clipboard: {e}") def run_tree_command() -> str:     """Runs the 'eza' command to generate a tree view that respects .gitignore."""     eza_exec = shutil.which("eza")     if not eza_exec:         return "Skipping: `eza` command not found."          try:         result = subprocess.run(             [eza_exec, '--tree', '--git-ignore', '--color=never'],             capture_output=True,             text=True,             cwd=REPO_ROOT,             check=True         )         return result.stdout     except subprocess.CalledProcessError as e:         return f"Error running eza command: {e.stderr}"     except Exception as e:         return f"An unexpected error occurred while running eza: {str(e)}" def check_dependencies():     """Verifies that all required external command-line tools are installed."""     print("Checking for required external dependencies...")     dependencies = {         "pyreverse": "Provided by `pylint`. Install with: pip install pylint",         "plantuml": "A Java-based tool. See https://plantuml.com/starting",         "eza": "A modern replacement for `ls`. See https://eza.rocks/install",         "xclip": "Clipboard utility for Linux. Install with your package manager (e.g., sudo apt-get install xclip)",     }     missing = []          for tool, instructions in dependencies.items():         if not shutil.which(tool):             missing.append((tool, instructions))          if not missing:         print("✅ All dependencies found.")     else:         print("\n❌ Missing dependencies detected:")         for tool, instructions in missing:             print(f"  - Command not found: `{tool}`")             print(f"    ↳ {instructions}")         print("\nPlease install the missing tools and ensure they are in your system's PATH.")         sys.exit(1) # ============================================================================ # --- Intelligent PromptBuilder Class --- # ============================================================================ class PromptBuilder:     """     Builds a complete, structured Markdown prompt including file manifests,     auto-generated context, file contents, and the user's final prompt.     
"""     def __init__(self, processed_files: List[Dict], prompt_text: str, context_only: bool = False):         self.processed_files = processed_files         self.prompt_text = prompt_text         self.context_only = context_only         self.auto_context = {}     def add_auto_context(self, title: str, content: str):         """Adds auto-generated context like UML diagrams to the prompt."""         if content and "error" not in content.lower() and "skipping" not in content.lower():             self.auto_context[title] = {                 'content': content,                 'tokens': count_tokens(content),                 'words': count_words(content)             }     def _generate_manifest_header(self) -> str:         lines = ["# Codebase Context & Manifest", ""]         for f in self.processed_files:             purpose = f" ({f['comment']})" if f['comment'] else ""             token_display = f" ({f['tokens']:,} tokens)" if not self.context_only else ""             lines.append(f"- **{f['path']}**{purpose}{token_display}")         return "\n".join(lines)     def _generate_auto_context_section(self) -> str:         if not self.auto_context:             return ""         lines = ["", "---", "", "# Auto-Generated Context", ""]                  # --- NEW: Prioritize printing Narrative Context first ---         narrative_title = "Recent Narrative Context"         if narrative_title in self.auto_context:             content = self.auto_context[narrative_title]['content']             lines.append(f"## {narrative_title}")             lines.append(content.strip())             lines.append("") # Add a blank line for spacing         if "Codebase Structure (eza --tree)" in self.auto_context:             title = "Codebase Structure (eza --tree)"             content = self.auto_context[title]['content']             lines.append(f"## {title}")             lines.append("```text")             lines.append(content.strip())             lines.append("```")                  for title, data in self.auto_context.items():             if title not in ["Codebase Structure (eza --tree)", narrative_title]:                 lines.append(f"## {title}")                 lines.append("```text")                 lines.append(data['content'])                 lines.append("```")         return "\n".join(lines)     def _generate_file_contents(self) -> str:         lines = []         for f in self.processed_files:             lines.append(f"```{f['lang']}:{f['path']}")             lines.append(f['content'])             lines.append("```")             lines.append(f"\n# End of {f['path']}\n")         return "\n".join(lines)     def _generate_ai_checklist(self) -> str:         """Creates a standardized checklist for the AI to verify the prompt's context."""         return ''' # AI Self-Correction Checklist Before addressing the user's prompt, perform the following verification steps: 1.  **Review the Manifest vs. Codebase Structure:** Compare the file list in the manifest against the `eza --tree` output. Are there any obviously related, critical files missing from the manifest that would be necessary to understand the user's request? (e.g., if `core.py` is present, is `__init__.py` also present?). 2.  **Analyze Import Statements:** Briefly scan the `import` statements in the provided Python files. Do they suggest dependencies on local files that were *not* included in the manifest? 3.  **Check for Incompleteness:** If you determine that critical files are missing, do not proceed. 
Instead, your **primary task** is to inform me of the missing files and recommend adding them to `foo_files.py` to create a more complete context. 4.  **Confirm Understanding:** If the context appears complete, state "Context verified." and then proceed with the user's request. '''     def build_final_prompt(self) -> str:         """Assembles all parts into the final Markdown string."""         parts = [             self._generate_manifest_header(),             self._generate_auto_context_section(),         ]         if not self.context_only:             parts.extend([                 "\n---\n\n# File Contents\n",                 self._generate_file_contents(),             ])         parts.extend([             "---\n\n# User Prompt\n",             self._generate_ai_checklist(),             self.prompt_text         ])                  return "\n".join(filter(None, parts))     def print_summary(self):         """Calculates and prints an accurate, comprehensive summary to the console."""         # --- Calculate token counts for all components ---         manifest_str = self._generate_manifest_header()         manifest_tokens = count_tokens(manifest_str)         manifest_words = count_words(manifest_str)                  prompt_tokens = count_tokens(self.prompt_text)         prompt_words = count_words(self.prompt_text)         checklist_str = self._generate_ai_checklist()         checklist_tokens = count_tokens(checklist_str)         checklist_words = count_words(checklist_str)         auto_context_total_tokens = sum(v['tokens'] for v in self.auto_context.values())         auto_context_total_words = sum(v['words'] for v in self.auto_context.values())         file_content_total_tokens = sum(f['tokens'] for f in self.processed_files)         file_content_total_words = sum(f['words'] for f in self.processed_files)         # --- Display the breakdown ---         print("--- Files Included ---")         for f in self.processed_files:             if self.context_only:                 print(f"• {f['path']} (content omitted)")             else:                 print(f"• {f['path']} ({f['tokens']:,} tokens)")                  if self.auto_context:             print("\n--- Auto-Context Included ---")             for title, data in self.auto_context.items():                 print(f"• {title} ({data['tokens']:,} tokens)")         # --- Calculate and display the final summary ---         print("\n--- Prompt Summary ---")         if self.context_only:             print("NOTE: Running in --context-only mode. 
File contents are excluded.")             total_tokens = manifest_tokens + auto_context_total_tokens + prompt_tokens + checklist_tokens             total_words = manifest_words + auto_context_total_words + prompt_words + checklist_words         else:             total_tokens = manifest_tokens + auto_context_total_tokens + file_content_total_tokens + prompt_tokens + checklist_tokens             total_words = manifest_words + auto_context_total_words + file_content_total_words + prompt_words + checklist_words         print(f"Total Tokens: {total_tokens:,}")         print(f"Total Words:  {total_words:,}")         ratio = total_tokens / total_words if total_words > 0 else 0         perspective = get_literary_perspective(total_words, ratio)         print("\n--- Size Perspective ---")         print(perspective)         print() # ============================================================================ # --- Main Execution Logic --- # ============================================================================ def main():     """Main function to parse args, process files, and generate output."""     parser = argparse.ArgumentParser(description='Generate a Markdown context file for AI code assistance.')     parser.add_argument('prompt', nargs='?', default=None, help='A prompt string or path to a prompt file (e.g., prompt.md).')     parser.add_argument('-o', '--output', type=str, help='Optional: Output filename.')     parser.add_argument('--no-clipboard', action='store_true', help='Disable copying output to clipboard.')     parser.add_argument('--check-dependencies', action='store_true', help='Verify that all required external tools are installed.')     parser.add_argument('--context-only', action='store_true', help='Generate a context-only prompt without file contents.')     # --- NEW: Argument for including narrative context ---     parser.add_argument(         '-l', '--list',         nargs='?',         const='[-5:]',         default=None,         help='Include a list of recent articles. Optionally provide a slice, e.g., "[-10:]". Defaults to "[-5:]".'     )     args = parser.parse_args()     if args.check_dependencies:         check_dependencies()         sys.exit(0)     # 1. Handle user prompt     prompt_content = "Please review the provided context and assist with the codebase."     if args.prompt:         if os.path.exists(args.prompt):             with open(args.prompt, 'r', encoding='utf-8') as f:                 prompt_content = f.read()         else:             prompt_content = args.prompt     elif os.path.exists("prompt.md"):         with open("prompt.md", 'r', encoding='utf-8') as f:             prompt_content = f.read()     # 2. 
Process all specified files     files_to_process = parse_file_list_from_config()     processed_files_data = []     print("--- Processing Files ---")     for path, comment in files_to_process:         full_path = os.path.join(REPO_ROOT, path) if not os.path.isabs(path) else path         if not os.path.exists(full_path):             print(f"Warning: File not found and will be skipped: {full_path}")             continue         content = ""         lang = "text"         ext = os.path.splitext(path)[1].lower()         if ext == '.ipynb':             if JUPYTEXT_AVAILABLE:                 print(f"  -> Converting notebook: {path}")                 try:                     notebook = jupytext.read(full_path)                     content = jupytext.writes(notebook, fmt='py:percent')                     lang = 'python'                 except Exception as e:                     content = f"# FAILED TO CONVERT NOTEBOOK: {path}\n# ERROR: {e}"                     print(f"Warning: Failed to convert {path}: {e}")             else:                 content = f"# SKIPPING NOTEBOOK CONVERSION: jupytext not installed for {path}"                 print(f"Warning: `jupytext` library not found. Skipping conversion for {path}.")         else:             try:                 with open(full_path, 'r', encoding='utf-8') as f:                     content = f.read()                 lang_map = {'.py': 'python', '.js': 'javascript', '.html': 'html', '.css': 'css', '.md': 'markdown', '.json': 'json', '.nix': 'nix', '.sh': 'bash'}                 lang = lang_map.get(ext, 'text')             except Exception as e:                 print(f"ERROR: Could not read or process {full_path}: {e}")                 sys.exit(1)         processed_files_data.append({             "path": path, "comment": comment, "content": content,             "tokens": count_tokens(content), "words": count_words(content),             "lang": lang         })     # 3. 
Build the prompt and add auto-generated context     builder = PromptBuilder(processed_files_data, prompt_content, context_only=args.context_only)          # --- Add the Codebase Tree ---     print("\n--- Generating Auto-Context ---")     print("Generating codebase tree diagram...", end='', flush=True)     tree_output = run_tree_command()     title = "Codebase Structure (eza --tree)"     builder.add_auto_context(title, tree_output)     if title in builder.auto_context:         token_count = builder.auto_context[title]['tokens']         print(f" ({token_count:,} tokens)")     else:         print(" (skipped)")     # --- NEW: Add narrative context if requested ---     if args.list:         print("Adding narrative context from articles...", end='', flush=True)         all_articles = _get_article_list_data()         sliced_articles = []         try:             slice_or_index = parse_slice_arg(args.list)             if isinstance(slice_or_index, int):                 sliced_articles = [all_articles[slice_or_index]]             elif isinstance(slice_or_index, slice):                 sliced_articles = all_articles[slice_or_index]         except (ValueError, IndexError):             print(f" (invalid slice '{args.list}')")             sliced_articles = []         if sliced_articles:             narrative_content = "\n".join(                 f"### {article['title']} ({article['date']})\n> {article['summary']}\n"                 for article in sliced_articles             )             builder.add_auto_context("Recent Narrative Context", narrative_content)             print(f" ({len(sliced_articles)} articles)")         else:             print(" (no articles found or invalid slice)")     # --- Generate UML for all included Python files ---     python_files_to_diagram = [         f['path'] for f in processed_files_data if f['path'].endswith('.py')     ]     if python_files_to_diagram:         print("Python file(s) detected. Generating UML diagrams...")         for py_file_path in python_files_to_diagram:             print(f"  -> Generating for {py_file_path}...", end='', flush=True)             uml_context = generate_uml_and_dot(                 target_file=py_file_path,                 project_name=CONFIG["PROJECT_NAME"]             )             uml_content = uml_context.get("ascii_uml")             title = f"UML Class Diagram (ASCII for {py_file_path})"             builder.add_auto_context(title, uml_content)             if title in builder.auto_context:                 token_count = builder.auto_context[title]['tokens']                 print(f" ({token_count:,} tokens)")             elif uml_content and "note: no classes" in uml_content.lower():                 print(" (no classes found)")             else:                 print(" (skipped)")         print("...UML generation complete.\n")          # 4. Generate final output and print summary     final_output = builder.build_final_prompt()     builder.print_summary()     # 5. Handle output     if args.output:         with open(args.output, 'w', encoding='utf-8') as f:             f.write(final_output)         print(f"Output written to '{args.output}'")     if not args.no_clipboard:         copy_to_clipboard(final_output) if __name__ == "__main__":     main() ```` You have all the context you could possibly want. You've seen the raw source material and the tool that packages it up. Modify `prompt_foo.py` so that it can use `list_articles.py` to get the whole historical enchilada or just slices of it, and package that into the prompt too\! 
Let's build a better bridge\! **Me**: ```diff [mike@nixos:~/repos/pipulate]$ xv prompt_foo.py [mike@nixos:~/repos/pipulate]$ git --no-pager diff diff --git a/prompt_foo.py b/prompt_foo.py index 29fe3978..84a8af0b 100644 --- a/prompt_foo.py +++ b/prompt_foo.py @@ -4,18 +4,20 @@ import os import re import sys import pydot +import yaml # NEW: For parsing article front matter import argparse import tiktoken import subprocess import tempfile import shutil +from datetime import datetime # NEW: For parsing article dates from typing import Dict, List, Optional, Tuple try: - import jupytext - JUPYTEXT_AVAILABLE = True +    import jupytext +    JUPYTEXT_AVAILABLE = True except ImportError: - JUPYTEXT_AVAILABLE = False +    JUPYTEXT_AVAILABLE = False # Hello there, AI! This is a tool for generating a single, comprehensive prompt # from the command line, bundling codebase files and auto-generated context @@ -25,547 +27,644 @@ except ImportError: # --- Configuration --- # ============================================================================ def find_repo_root(start_path: str) -> str: - """Find the git repository root from a starting path.""" - path = os.path.abspath(start_path) - while path != os.path.dirname(path): - if os.path.isdir(os.path.join(path, '.git')): - return path - path = os.path.dirname(path) - raise FileNotFoundError("Could not find the repository root (.git directory).") +    """Find the git repository root from a starting path.""" +    path = os.path.abspath(start_path) +    while path != os.path.dirname(path): +        if os.path.isdir(os.path.join(path, '.git')): +            return path +        path = os.path.dirname(path) +    raise FileNotFoundError("Could not find the repository root (.git directory).") REPO_ROOT = find_repo_root(os.path.dirname(__file__)) # Centralized configuration as recommended in Architectural Analysis (Section VI-B) CONFIG = { - "PROJECT_NAME": "pipulate" +    "PROJECT_NAME": "pipulate", +    "POSTS_DIRECTORY": "/home/mike/repos/MikeLev.in/_posts" # NEW: For list_articles logic } # ============================================================================ # --- Accurate Literary Size Scale (Word Count Based) --- # ============================================================================ LITERARY_SIZE_SCALE = [ - (3000, "Short Essay"), - (7500, "Short Story"), - (20000, "Novelette"), - (50000, "Novella or a Master's Dissertation"), - (80000, "Average Paperback Novel or a Ph.D. Dissertation"), - (120000, "Long Novel"), - (200000, "Epic Fantasy Novel"), - (500000, "Seriously Long Epic (like 'Infinite Jest')"), +    (3000, "Short Essay"), +    (7500, "Short Story"), +    (20000, "Novelette"), +    (50000, "Novella or a Master's Dissertation"), +    (80000, "Average Paperback Novel or a Ph.D. Dissertation"), +    (120000, "Long Novel"), +    (200000, "Epic Fantasy Novel"), +    (500000, "Seriously Long Epic (like 'Infinite Jest')"), ] def get_literary_perspective(word_count: int, token_word_ratio: float) -> str: - """Get a human-readable literary comparison for the codebase size.""" - description = f"Longer than {LITERARY_SIZE_SCALE[-1][1]}" - for words, desc in LITERARY_SIZE_SCALE: - if word_count <= words: - description = desc - break - - density_warning = "" - if token_word_ratio > 1.8: - density_warning = ( - f" (Note: With a token/word ratio of {token_word_ratio:.2f}, " - f"this content is far denser and more complex than typical prose of this length)." 
- ) - - return f"📚 Equivalent in length to a **{description}**{density_warning}" +    """Get a human-readable literary comparison for the codebase size.""" +    description = f"Longer than {LITERARY_SIZE_SCALE[-1][1]}" +    for words, desc in LITERARY_SIZE_SCALE: +        if word_count <= words: +            description = desc +            break + +    density_warning = "" +    if token_word_ratio > 1.8: +        density_warning = ( +            f" (Note: With a token/word ratio of {token_word_ratio:.2f}, " +            f"this content is far denser and more complex than typical prose of this length)." +        ) + +    return f"📚 Equivalent in length to a **{description}**{density_warning}" # ============================================================================ # --- Restored & Corrected: UML and DOT Context Generation --- # ============================================================================ def generate_uml_and_dot(target_file: str, project_name: str) -> Dict: - """Generates a UML ASCII diagram and a DOT dependency graph for a target Python file.""" - pyreverse_exec = shutil.which("pyreverse") - plantuml_exec = shutil.which("plantuml") - - if not pyreverse_exec or not plantuml_exec: - msg = [] - if not pyreverse_exec: msg.append("`pyreverse` (from pylint)") - if not plantuml_exec: msg.append("`plantuml`") - return {"ascii_uml": f"Skipping: Required command(s) not found: {', '.join(msg)}."} - - target_path = os.path.join(REPO_ROOT, target_file) - if not os.path.exists(target_path): - return {"ascii_uml": f"Skipping: Target file for UML generation not found: {target_path}"} - - with tempfile.TemporaryDirectory() as temp_dir: - dot_file_path = os.path.join(temp_dir, "classes.dot") - puml_file_path = os.path.join(temp_dir, "diagram.puml") - - # --- Step 1: Run pyreverse --- - try: - pyreverse_cmd = [ - pyreverse_exec, - "-f", "dot", - "-o", "dot", # This format is just a prefix - "-p", project_name, - target_path - ] - subprocess.run( - pyreverse_cmd, - check=True, - capture_output=True, - text=True, - cwd=temp_dir - ) - generated_dot_name = f"classes_{project_name}.dot" - os.rename(os.path.join(temp_dir, generated_dot_name), dot_file_path) - - except (subprocess.CalledProcessError, FileNotFoundError) as e: - error_msg = e.stderr if hasattr(e, 'stderr') else str(e) - return {"ascii_uml": f"Error: pyreverse failed. 
{error_msg}", "dot_graph": None} - - # --- Step 2: Convert DOT to PlantUML --- - try: - graphs = pydot.graph_from_dot_file(dot_file_path) - if not graphs: - return {"ascii_uml": f"Note: No classes found in {target_file} to generate a diagram.", "dot_graph": None} - graph = graphs[0] - dot_content = graph.to_string() - - puml_lines = ["@startuml", "skinparam linetype ortho", ""] - - def sanitize_line(line): - clean = re.sub(r']*>', '', line) - clean = re.sub(r'<[^>]+>', '', clean) - return clean.strip() - - for node in graph.get_nodes(): - label = node.get_label() - if not label: continue - - parts = label.strip('<>{} ').split('|') - class_name = sanitize_line(parts[0]) - puml_lines.append(f"class {class_name} {{") - - if len(parts) > 1: - for attr in re.split(r']*>', parts[1]): - clean_attr = sanitize_line(attr).split(':')[0].strip() - if clean_attr: - puml_lines.append(f" - {clean_attr}") - - if len(parts) > 2: - method_block = parts[2].strip() - for method_line in re.split(r']*>', method_block): - clean_method = sanitize_line(method_line) - if clean_method: - puml_lines.append(f" + {clean_method}") - - puml_lines.append("}\n") - - for edge in graph.get_edges(): - source_name = edge.get_source().strip('"').split('.')[-1] - dest_name = edge.get_destination().strip('"').split('.')[-1] - puml_lines.append(f"{source_name} ..> {dest_name}") - - puml_lines.append("@enduml") - with open(puml_file_path, 'w') as f: - f.write('\n'.join(puml_lines)) - - except Exception as e: - with open(dot_file_path, 'r') as f: - dot_content_on_error = f.read() - return {"ascii_uml": f"Error: DOT to PUML conversion failed. {str(e)}", "dot_graph": dot_content_on_error} - - # --- Step 3: Run PlantUML --- - try: - plantuml_cmd = ["plantuml", "-tutxt", puml_file_path] - subprocess.run(plantuml_cmd, check=True, capture_output=True, text=True, cwd=temp_dir) - - utxt_file_path = puml_file_path.replace(".puml", ".utxt") - with open(utxt_file_path, 'r') as f: - ascii_uml = f.read() - - # --- Normalize whitespace from plantuml output --- - lines = ascii_uml.splitlines() - non_empty_lines = [line for line in lines if line.strip()] - - if non_empty_lines: - min_indent = min(len(line) - len(line.lstrip(' ')) for line in non_empty_lines) - dedented_lines = [line[min_indent:] for line in lines] - stripped_lines = [line.rstrip() for line in dedented_lines] - ascii_uml = '\n'.join(stripped_lines) - - # Prepend a newline to "absorb the chop" from rendering - if ascii_uml: - ascii_uml = '\n' + ascii_uml - - except (subprocess.CalledProcessError, FileNotFoundError) as e: - error_msg = e.stderr if hasattr(e, 'stderr') else str(e) - return {"ascii_uml": f"Error: plantuml failed. 
{error_msg}", "dot_graph": dot_content} - - return {"ascii_uml": ascii_uml, "dot_graph": dot_content} +    """Generates a UML ASCII diagram and a DOT dependency graph for a target Python file.""" +    pyreverse_exec = shutil.which("pyreverse") +    plantuml_exec = shutil.which("plantuml") + +    if not pyreverse_exec or not plantuml_exec: +        msg = [] +        if not pyreverse_exec: msg.append("`pyreverse` (from pylint)") +        if not plantuml_exec: msg.append("`plantuml`") +        return {"ascii_uml": f"Skipping: Required command(s) not found: {', '.join(msg)}."} + +    target_path = os.path.join(REPO_ROOT, target_file) +    if not os.path.exists(target_path): +        return {"ascii_uml": f"Skipping: Target file for UML generation not found: {target_path}"} + +    with tempfile.TemporaryDirectory() as temp_dir: +        dot_file_path = os.path.join(temp_dir, "classes.dot") +        puml_file_path = os.path.join(temp_dir, "diagram.puml") +         +        # --- Step 1: Run pyreverse --- +        try: +            pyreverse_cmd = [ +                pyreverse_exec, +                "-f", "dot", +                "-o", "dot", # This format is just a prefix +                "-p", project_name, +                target_path +            ] +            subprocess.run( +                pyreverse_cmd, +                check=True, +                capture_output=True, +                text=True, +                cwd=temp_dir +            ) +            generated_dot_name = f"classes_{project_name}.dot" +            os.rename(os.path.join(temp_dir, generated_dot_name), dot_file_path) + +        except (subprocess.CalledProcessError, FileNotFoundError) as e: +            error_msg = e.stderr if hasattr(e, 'stderr') else str(e) +            return {"ascii_uml": f"Error: pyreverse failed. 
{error_msg}", "dot_graph": None} + +        # --- Step 2: Convert DOT to PlantUML --- +        try: +            graphs = pydot.graph_from_dot_file(dot_file_path) +            if not graphs: +                return {"ascii_uml": f"Note: No classes found in {target_file} to generate a diagram.", "dot_graph": None} +            graph = graphs[0] +            dot_content = graph.to_string() + +            puml_lines = ["@startuml", "skinparam linetype ortho", ""] + +            def sanitize_line(line): +                clean = re.sub(r']*>', '', line) +                clean = re.sub(r'<[^>]+>', '', clean) +                return clean.strip() + +            for node in graph.get_nodes(): +                label = node.get_label() +                if not label: continue + +                parts = label.strip('<>{} ').split('|') +                class_name = sanitize_line(parts[0]) +                puml_lines.append(f"class {class_name} {{") + +                if len(parts) > 1: +                    for attr in re.split(r']*>', parts[1]): +                        clean_attr = sanitize_line(attr).split(':')[0].strip() +                        if clean_attr: +                            puml_lines.append(f"  - {clean_attr}") + +                if len(parts) > 2: +                    method_block = parts[2].strip() +                    for method_line in re.split(r']*>', method_block): +                        clean_method = sanitize_line(method_line) +                        if clean_method: +                            puml_lines.append(f"  + {clean_method}") + +                puml_lines.append("}\n") + +            for edge in graph.get_edges(): +                source_name = edge.get_source().strip('"').split('.')[-1] +                dest_name = edge.get_destination().strip('"').split('.')[-1] +                puml_lines.append(f"{source_name} ..> {dest_name}") + +            puml_lines.append("@enduml") +            with open(puml_file_path, 'w') as f: +                f.write('\n'.join(puml_lines)) + +        except Exception as e: +            with open(dot_file_path, 'r') as f: +                dot_content_on_error = f.read() +            return {"ascii_uml": f"Error: DOT to PUML conversion failed. 
{str(e)}", "dot_graph": dot_content_on_error} +  +        # --- Step 3: Run PlantUML --- +        try: +            plantuml_cmd = ["plantuml", "-tutxt", puml_file_path] +            subprocess.run(plantuml_cmd, check=True, capture_output=True, text=True, cwd=temp_dir) +             +            utxt_file_path = puml_file_path.replace(".puml", ".utxt") +            with open(utxt_file_path, 'r') as f: +                ascii_uml = f.read() +             +            # --- Normalize whitespace from plantuml output --- +            lines = ascii_uml.splitlines() +            non_empty_lines = [line for line in lines if line.strip()] +             +            if non_empty_lines: +                min_indent = min(len(line) - len(line.lstrip(' ')) for line in non_empty_lines) +                dedented_lines = [line[min_indent:] for line in lines] +                stripped_lines = [line.rstrip() for line in dedented_lines] +                ascii_uml = '\n'.join(stripped_lines) +                 +                # Prepend a newline to "absorb the chop" from rendering +                if ascii_uml: +                    ascii_uml = '\n' + ascii_uml + +        except (subprocess.CalledProcessError, FileNotFoundError) as e: +            error_msg = e.stderr if hasattr(e, 'stderr') else str(e) +            return {"ascii_uml": f"Error: plantuml failed. {error_msg}", "dot_graph": dot_content} + +    return {"ascii_uml": ascii_uml, "dot_graph": dot_content} + + +# ============================================================================ +# --- NEW: Logic ported from list_articles.py for Narrative Context --- +# ============================================================================ +def _get_article_list_data(posts_dir: str = CONFIG["POSTS_DIRECTORY"]) -> List[Dict]: +    """ +    Parses Jekyll posts, sorts them chronologically, and returns a list of dictionaries. +    This is a self-contained version of the logic from the `list_articles.py` script. 
+    """ +    posts_data = [] +    if not os.path.isdir(posts_dir): +        print(f"Warning: Article directory not found at {posts_dir}", file=sys.stderr) +        return [] + +    for filename in os.listdir(posts_dir): +        if not filename.endswith(('.md', '.markdown')): +            continue +        filepath = os.path.join(posts_dir, filename) +        try: +            date_str = filename[:10] +            post_date = datetime.strptime(date_str, '%Y-%m-%d').date() +            with open(filepath, 'r', encoding='utf-8') as f: +                content = f.read() +            if content.startswith('---'): +                parts = content.split('---', 2) +                front_matter = yaml.safe_load(parts[1]) or {} +                posts_data.append({ +                    'path': filepath, +                    'date': post_date, +                    'sort_order': int(front_matter.get('sort_order', 0)), +                    'title': front_matter.get('title', 'Untitled'), +                    'summary': front_matter.get('meta_description', '') +                }) +        except (ValueError, yaml.YAMLError, IndexError): +            continue + +    sorted_posts = sorted(posts_data, key=lambda p: (p['date'], p['sort_order'])) +    return sorted_posts + +def parse_slice_arg(arg_str: str): +    """Parses a string like '[5:10]' or '[5]' into a slice object or an int.""" +    if not arg_str or not arg_str.startswith('[') or not arg_str.endswith(']'): +        return None +    content = arg_str[1:-1].strip() +    if ':' in content: +        parts = content.split(':', 1) +        start = int(parts[0].strip()) if parts[0].strip() else None +        end = int(parts[1].strip()) if parts[1].strip() else None +        return slice(start, end) +    elif content: +        return int(content) +    return None # ============================================================================ # --- Helper Functions (Tokenizing, File Parsing, Clipboard) --- # ============================================================================ def count_tokens(text: str, model: str = "gpt-4o") -> int: - """Counts tokens in a text string using tiktoken.""" - try: - encoding = tiktoken.encoding_for_model(model) - return len(encoding.encode(text)) - except Exception: - return len(text.split()) +    """Counts tokens in a text string using tiktoken.""" +    try: +        encoding = tiktoken.encoding_for_model(model) +        return len(encoding.encode(text)) +    except Exception: +        return len(text.split()) def count_words(text: str) -> int: - """Counts words in a text string.""" - return len(text.split()) +    """Counts words in a text string.""" +    return len(text.split()) def parse_file_list_from_config() -> List[Tuple[str, str]]: - """Loads and parses the file list from foo_files.py.""" - try: - import foo_files - files_raw = foo_files.FILES_TO_INCLUDE_RAW - except (ImportError, AttributeError): - print("ERROR: foo_files.py not found or doesn't contain FILES_TO_INCLUDE_RAW.") - sys.exit(1) - - lines = files_raw.strip().splitlines() - seen_files, parsed_files = set(), [] - - for line in lines: - line = line.strip() - if not line or line.startswith('#'): - continue - parts = re.split(r'\s*<--\s*|\s*#\s*', line, 1) - file_path = parts[0].strip() - comment = parts[1].strip() if len(parts) > 1 else "" - - if file_path and file_path not in seen_files: - seen_files.add(file_path) - parsed_files.append((file_path, comment)) - return parsed_files +    """Loads and parses the file list from foo_files.py.""" +    try: 
+        import foo_files +        files_raw = foo_files.FILES_TO_INCLUDE_RAW +    except (ImportError, AttributeError): +        print("ERROR: foo_files.py not found or doesn't contain FILES_TO_INCLUDE_RAW.") +        sys.exit(1) + +    lines = files_raw.strip().splitlines() +    seen_files, parsed_files = set(), [] + +    for line in lines: +        line = line.strip() +        if not line or line.startswith('#'): +            continue +        parts = re.split(r'\s*<--\s*|\s*#\s*', line, 1) +        file_path = parts[0].strip() +        comment = parts[1].strip() if len(parts) > 1 else "" + +        if file_path and file_path not in seen_files: +            seen_files.add(file_path) +            parsed_files.append((file_path, comment)) +    return parsed_files def copy_to_clipboard(text: str): - """Copies text to the system clipboard using 'xclip'.""" - if not shutil.which('xclip'): - print("\nWarning: 'xclip' not found. Cannot copy to clipboard.") - return - try: - subprocess.run(['xclip', '-selection', 'clipboard'], input=text.encode('utf-8'), check=True) - print("Markdown output copied to clipboard") - except Exception as e: - print(f"\nWarning: Could not copy to clipboard: {e}") +    """Copies text to the system clipboard using 'xclip'.""" +    if not shutil.which('xclip'): +        print("\nWarning: 'xclip' not found. Cannot copy to clipboard.") +        return +    try: +        subprocess.run(['xclip', '-selection', 'clipboard'], input=text.encode('utf-8'), check=True) +        print("Markdown output copied to clipboard") +    except Exception as e: +        print(f"\nWarning: Could not copy to clipboard: {e}") def run_tree_command() -> str: - """Runs the 'eza' command to generate a tree view that respects .gitignore.""" - eza_exec = shutil.which("eza") - if not eza_exec: - return "Skipping: `eza` command not found." - - try: - result = subprocess.run( - [eza_exec, '--tree', '--git-ignore', '--color=never'], - capture_output=True, - text=True, - cwd=REPO_ROOT, - check=True - ) - return result.stdout - except subprocess.CalledProcessError as e: - return f"Error running eza command: {e.stderr}" - except Exception as e: - return f"An unexpected error occurred while running eza: {str(e)}" +    """Runs the 'eza' command to generate a tree view that respects .gitignore.""" +    eza_exec = shutil.which("eza") +    if not eza_exec: +        return "Skipping: `eza` command not found." +     +    try: +        result = subprocess.run( +            [eza_exec, '--tree', '--git-ignore', '--color=never'], +            capture_output=True, +            text=True, +            cwd=REPO_ROOT, +            check=True +        ) +        return result.stdout +    except subprocess.CalledProcessError as e: +        return f"Error running eza command: {e.stderr}" +    except Exception as e: +        return f"An unexpected error occurred while running eza: {str(e)}" def check_dependencies(): - """Verifies that all required external command-line tools are installed.""" - print("Checking for required external dependencies...") - dependencies = { - "pyreverse": "Provided by `pylint`. Install with: pip install pylint", - "plantuml": "A Java-based tool. See https://plantuml.com/starting", - "eza": "A modern replacement for `ls`. See https://eza.rocks/install", - "xclip": "Clipboard utility for Linux. 
Install with your package manager (e.g., sudo apt-get install xclip)", - } - missing = [] - - for tool, instructions in dependencies.items(): - if not shutil.which(tool): - missing.append((tool, instructions)) - - if not missing: - print("✅ All dependencies found.") - else: - print("\n❌ Missing dependencies detected:") - for tool, instructions in missing: - print(f" - Command not found: `{tool}`") - print(f" ↳ {instructions}") - print("\nPlease install the missing tools and ensure they are in your system's PATH.") - sys.exit(1) +    """Verifies that all required external command-line tools are installed.""" +    print("Checking for required external dependencies...") +    dependencies = { +        "pyreverse": "Provided by `pylint`. Install with: pip install pylint", +        "plantuml": "A Java-based tool. See https://plantuml.com/starting", +        "eza": "A modern replacement for `ls`. See https://eza.rocks/install", +        "xclip": "Clipboard utility for Linux. Install with your package manager (e.g., sudo apt-get install xclip)", +    } +    missing = [] +     +    for tool, instructions in dependencies.items(): +        if not shutil.which(tool): +            missing.append((tool, instructions)) +     +    if not missing: +        print("✅ All dependencies found.") +    else: +        print("\n❌ Missing dependencies detected:") +        for tool, instructions in missing: +            print(f"  - Command not found: `{tool}`") +            print(f"    ↳ {instructions}") +        print("\nPlease install the missing tools and ensure they are in your system's PATH.") +        sys.exit(1) # ============================================================================ # --- Intelligent PromptBuilder Class --- # ============================================================================ class PromptBuilder: - """ - Builds a complete, structured Markdown prompt including file manifests, - auto-generated context, file contents, and the user's final prompt. 
- """ - def __init__(self, processed_files: List[Dict], prompt_text: str, context_only: bool = False): - self.processed_files = processed_files - self.prompt_text = prompt_text - self.context_only = context_only - self.auto_context = {} - - def add_auto_context(self, title: str, content: str): - """Adds auto-generated context like UML diagrams to the prompt.""" - if content and "error" not in content.lower() and "skipping" not in content.lower(): - self.auto_context[title] = { - 'content': content, - 'tokens': count_tokens(content), - 'words': count_words(content) - } - - def _generate_manifest_header(self) -> str: - lines = ["# Codebase Context & Manifest", ""] - for f in self.processed_files: - purpose = f" ({f['comment']})" if f['comment'] else "" - token_display = f" ({f['tokens']:,} tokens)" if not self.context_only else "" - lines.append(f"- **{f['path']}**{purpose}{token_display}") - return "\n".join(lines) - - def _generate_auto_context_section(self) -> str: - if not self.auto_context: - return "" - lines = ["", "---", "", "# Auto-Generated Context", ""] - if "Codebase Structure (eza --tree)" in self.auto_context: - title = "Codebase Structure (eza --tree)" - content = self.auto_context[title]['content'] - lines.append(f"## {title}") - lines.append("```text") - lines.append(content.strip()) - lines.append("```") - - for title, data in self.auto_context.items(): - if title != "Codebase Structure (eza --tree)": - lines.append(f"## {title}") - lines.append("```text") - lines.append(data['content']) - lines.append("```") - return "\n".join(lines) - - def _generate_file_contents(self) -> str: - lines = [] - for f in self.processed_files: - lines.append(f"```{f['lang']}:{f['path']}") - lines.append(f['content']) - lines.append("```") - lines.append(f"\n# End of {f['path']}\n") - return "\n".join(lines) - - def _generate_ai_checklist(self) -> str: - """Creates a standardized checklist for the AI to verify the prompt's context.""" - return ''' +    """ +    Builds a complete, structured Markdown prompt including file manifests, +    auto-generated context, file contents, and the user's final prompt. 
+    """ +    def __init__(self, processed_files: List[Dict], prompt_text: str, context_only: bool = False): +        self.processed_files = processed_files +        self.prompt_text = prompt_text +        self.context_only = context_only +        self.auto_context = {} + +    def add_auto_context(self, title: str, content: str): +        """Adds auto-generated context like UML diagrams to the prompt.""" +        if content and "error" not in content.lower() and "skipping" not in content.lower(): +            self.auto_context[title] = { +                'content': content, +                'tokens': count_tokens(content), +                'words': count_words(content) +            } + +    def _generate_manifest_header(self) -> str: +        lines = ["# Codebase Context & Manifest", ""] +        for f in self.processed_files: +            purpose = f" ({f['comment']})" if f['comment'] else "" +            token_display = f" ({f['tokens']:,} tokens)" if not self.context_only else "" +            lines.append(f"- **{f['path']}**{purpose}{token_display}") +        return "\n".join(lines) + +    def _generate_auto_context_section(self) -> str: +        if not self.auto_context: +            return "" +        lines = ["", "---", "", "# Auto-Generated Context", ""] +         +        # --- NEW: Prioritize printing Narrative Context first --- +        narrative_title = "Recent Narrative Context" +        if narrative_title in self.auto_context: +            content = self.auto_context[narrative_title]['content'] +            lines.append(f"## {narrative_title}") +            lines.append(content.strip()) +            lines.append("") # Add a blank line for spacing + +        if "Codebase Structure (eza --tree)" in self.auto_context: +            title = "Codebase Structure (eza --tree)" +            content = self.auto_context[title]['content'] +            lines.append(f"## {title}") +            lines.append("```text") +            lines.append(content.strip()) +            lines.append("```") +         +        for title, data in self.auto_context.items(): +            if title not in ["Codebase Structure (eza --tree)", narrative_title]: +                lines.append(f"## {title}") +                lines.append("```text") +                lines.append(data['content']) +                lines.append("```") +        return "\n".join(lines) + +    def _generate_file_contents(self) -> str: +        lines = [] +        for f in self.processed_files: +            lines.append(f"```{f['lang']}:{f['path']}") +            lines.append(f['content']) +            lines.append("```") +            lines.append(f"\n# End of {f['path']}\n") +        return "\n".join(lines) + +    def _generate_ai_checklist(self) -> str: +        """Creates a standardized checklist for the AI to verify the prompt's context.""" +        return ''' # AI Self-Correction Checklist Before addressing the user's prompt, perform the following verification steps: -1. **Review the Manifest vs. Codebase Structure:** Compare the file list in the manifest against the `eza --tree` output. Are there any obviously related, critical files missing from the manifest that would be necessary to understand the user's request? (e.g., if `core.py` is present, is `__init__.py` also present?). -2. **Analyze Import Statements:** Briefly scan the `import` statements in the provided Python files. Do they suggest dependencies on local files that were *not* included in the manifest? -3. 
**Check for Incompleteness:** If you determine that critical files are missing, do not proceed. Instead, your **primary task** is to inform me of the missing files and recommend adding them to `foo_files.py` to create a more complete context. -4. **Confirm Understanding:** If the context appears complete, state "Context verified." and then proceed with the user's request. +1.  **Review the Manifest vs. Codebase Structure:** Compare the file list in the manifest against the `eza --tree` output. Are there any obviously related, critical files missing from the manifest that would be necessary to understand the user's request? (e.g., if `core.py` is present, is `__init__.py` also present?). +2.  **Analyze Import Statements:** Briefly scan the `import` statements in the provided Python files. Do they suggest dependencies on local files that were *not* included in the manifest? +3.  **Check for Incompleteness:** If you determine that critical files are missing, do not proceed. Instead, your **primary task** is to inform me of the missing files and recommend adding them to `foo_files.py` to create a more complete context. +4.  **Confirm Understanding:** If the context appears complete, state "Context verified." and then proceed with the user's request. ''' - def build_final_prompt(self) -> str: - """Assembles all parts into the final Markdown string.""" - parts = [ - self._generate_manifest_header(), - self._generate_auto_context_section(), - ] - - if not self.context_only: - parts.extend([ - "\n---\n\n# File Contents\n", - self._generate_file_contents(), - ]) - - parts.extend([ - "---\n\n# User Prompt\n", - self._generate_ai_checklist(), - self.prompt_text - ]) - - return "\n".join(filter(None, parts)) - - def print_summary(self): - """Calculates and prints an accurate, comprehensive summary to the console.""" - # --- Calculate token counts for all components --- - manifest_str = self._generate_manifest_header() - manifest_tokens = count_tokens(manifest_str) - manifest_words = count_words(manifest_str) - - prompt_tokens = count_tokens(self.prompt_text) - prompt_words = count_words(self.prompt_text) - - checklist_str = self._generate_ai_checklist() - checklist_tokens = count_tokens(checklist_str) - checklist_words = count_words(checklist_str) - - auto_context_total_tokens = sum(v['tokens'] for v in self.auto_context.values()) - auto_context_total_words = sum(v['words'] for v in self.auto_context.values()) - - file_content_total_tokens = sum(f['tokens'] for f in self.processed_files) - file_content_total_words = sum(f['words'] for f in self.processed_files) - - # --- Display the breakdown --- - print("--- Files Included ---") - for f in self.processed_files: - if self.context_only: - print(f"• {f['path']} (content omitted)") - else: - print(f"• {f['path']} ({f['tokens']:,} tokens)") - - if self.auto_context: - print("\n--- Auto-Context Included ---") - for title, data in self.auto_context.items(): - print(f"• {title} ({data['tokens']:,} tokens)") - - # --- Calculate and display the final summary --- - print("\n--- Prompt Summary ---") - if self.context_only: - print("NOTE: Running in --context-only mode. 
File contents are excluded.") - total_tokens = manifest_tokens + auto_context_total_tokens + prompt_tokens + checklist_tokens - total_words = manifest_words + auto_context_total_words + prompt_words + checklist_words - else: - total_tokens = manifest_tokens + auto_context_total_tokens + file_content_total_tokens + prompt_tokens + checklist_tokens - total_words = manifest_words + auto_context_total_words + file_content_total_words + prompt_words + checklist_words - - print(f"Total Tokens: {total_tokens:,}") - print(f"Total Words: {total_words:,}") - - ratio = total_tokens / total_words if total_words > 0 else 0 - perspective = get_literary_perspective(total_words, ratio) - print("\n--- Size Perspective ---") - print(perspective) - print() +    def build_final_prompt(self) -> str: +        """Assembles all parts into the final Markdown string.""" +        parts = [ +            self._generate_manifest_header(), +            self._generate_auto_context_section(), +        ] + +        if not self.context_only: +            parts.extend([ +                "\n---\n\n# File Contents\n", +                self._generate_file_contents(), +            ]) + +        parts.extend([ +            "---\n\n# User Prompt\n", +            self._generate_ai_checklist(), +            self.prompt_text +        ]) +         +        return "\n".join(filter(None, parts)) + +    def print_summary(self): +        """Calculates and prints an accurate, comprehensive summary to the console.""" +        # --- Calculate token counts for all components --- +        manifest_str = self._generate_manifest_header() +        manifest_tokens = count_tokens(manifest_str) +        manifest_words = count_words(manifest_str) +         +        prompt_tokens = count_tokens(self.prompt_text) +        prompt_words = count_words(self.prompt_text) + +        checklist_str = self._generate_ai_checklist() +        checklist_tokens = count_tokens(checklist_str) +        checklist_words = count_words(checklist_str) + +        auto_context_total_tokens = sum(v['tokens'] for v in self.auto_context.values()) +        auto_context_total_words = sum(v['words'] for v in self.auto_context.values()) + +        file_content_total_tokens = sum(f['tokens'] for f in self.processed_files) +        file_content_total_words = sum(f['words'] for f in self.processed_files) + +        # --- Display the breakdown --- +        print("--- Files Included ---") +        for f in self.processed_files: +            if self.context_only: +                print(f"• {f['path']} (content omitted)") +            else: +                print(f"• {f['path']} ({f['tokens']:,} tokens)") +         +        if self.auto_context: +            print("\n--- Auto-Context Included ---") +            for title, data in self.auto_context.items(): +                print(f"• {title} ({data['tokens']:,} tokens)") + +        # --- Calculate and display the final summary --- +        print("\n--- Prompt Summary ---") +        if self.context_only: +            print("NOTE: Running in --context-only mode. 
File contents are excluded.") +            total_tokens = manifest_tokens + auto_context_total_tokens + prompt_tokens + checklist_tokens +            total_words = manifest_words + auto_context_total_words + prompt_words + checklist_words +        else: +            total_tokens = manifest_tokens + auto_context_total_tokens + file_content_total_tokens + prompt_tokens + checklist_tokens +            total_words = manifest_words + auto_context_total_words + file_content_total_words + prompt_words + checklist_words + +        print(f"Total Tokens: {total_tokens:,}") +        print(f"Total Words:  {total_words:,}") + +        ratio = total_tokens / total_words if total_words > 0 else 0 +        perspective = get_literary_perspective(total_words, ratio) +        print("\n--- Size Perspective ---") +        print(perspective) +        print() # ============================================================================ # --- Main Execution Logic --- # ============================================================================ def main(): - """Main function to parse args, process files, and generate output.""" - parser = argparse.ArgumentParser(description='Generate a Markdown context file for AI code assistance.') - parser.add_argument('prompt', nargs='?', default=None, help='A prompt string or path to a prompt file (e.g., prompt.md).') - parser.add_argument('-o', '--output', type=str, help='Optional: Output filename.') - parser.add_argument('--no-clipboard', action='store_true', help='Disable copying output to clipboard.') - parser.add_argument('--check-dependencies', action='store_true', help='Verify that all required external tools are installed.') - parser.add_argument('--context-only', action='store_true', help='Generate a context-only prompt without file contents.') - args = parser.parse_args() - - if args.check_dependencies: - check_dependencies() - sys.exit(0) - - # 1. Handle user prompt - prompt_content = "Please review the provided context and assist with the codebase." - if args.prompt: - if os.path.exists(args.prompt): - with open(args.prompt, 'r', encoding='utf-8') as f: - prompt_content = f.read() - else: - prompt_content = args.prompt - elif os.path.exists("prompt.md"): - with open("prompt.md", 'r', encoding='utf-8') as f: - prompt_content = f.read() - - # 2. Process all specified files - files_to_process = parse_file_list_from_config() - processed_files_data = [] - print("--- Processing Files ---") - for path, comment in files_to_process: - full_path = os.path.join(REPO_ROOT, path) if not os.path.isabs(path) else path - if not os.path.exists(full_path): - print(f"Warning: File not found and will be skipped: {full_path}") - continue - - content = "" - lang = "text" - ext = os.path.splitext(path)[1].lower() - - if ext == '.ipynb': - if JUPYTEXT_AVAILABLE: - print(f" -> Converting notebook: {path}") - try: - notebook = jupytext.read(full_path) - content = jupytext.writes(notebook, fmt='py:percent') - lang = 'python' - except Exception as e: - content = f"# FAILED TO CONVERT NOTEBOOK: {path}\n# ERROR: {e}" - print(f"Warning: Failed to convert {path}: {e}") - else: - content = f"# SKIPPING NOTEBOOK CONVERSION: jupytext not installed for {path}" - print(f"Warning: `jupytext` library not found. 
Skipping conversion for {path}.") - else: - try: - with open(full_path, 'r', encoding='utf-8') as f: - content = f.read() - lang_map = {'.py': 'python', '.js': 'javascript', '.html': 'html', '.css': 'css', '.md': 'markdown', '.json': 'json', '.nix': 'nix', '.sh': 'bash'} - lang = lang_map.get(ext, 'text') - except Exception as e: - print(f"ERROR: Could not read or process {full_path}: {e}") - sys.exit(1) - - processed_files_data.append({ - "path": path, "comment": comment, "content": content, - "tokens": count_tokens(content), "words": count_words(content), - "lang": lang - }) - - # 3. Build the prompt and add auto-generated context - builder = PromptBuilder(processed_files_data, prompt_content, context_only=args.context_only) - - # --- Add the Codebase Tree --- - print("\n--- Generating Auto-Context ---") - print("Generating codebase tree diagram...", end='', flush=True) - tree_output = run_tree_command() - title = "Codebase Structure (eza --tree)" - builder.add_auto_context(title, tree_output) - if title in builder.auto_context: - token_count = builder.auto_context[title]['tokens'] - print(f" ({token_count:,} tokens)") - else: - print(" (skipped)") - - - # --- Generate UML for all included Python files --- - python_files_to_diagram = [ - f['path'] for f in processed_files_data if f['path'].endswith('.py') - ] - - if python_files_to_diagram: - print("Python file(s) detected. Generating UML diagrams...") - for py_file_path in python_files_to_diagram: - print(f" -> Generating for {py_file_path}...", end='', flush=True) - uml_context = generate_uml_and_dot( - target_file=py_file_path, - project_name=CONFIG["PROJECT_NAME"] - ) - uml_content = uml_context.get("ascii_uml") - title = f"UML Class Diagram (ASCII for {py_file_path})" - builder.add_auto_context(title, uml_content) - - if title in builder.auto_context: - token_count = builder.auto_context[title]['tokens'] - print(f" ({token_count:,} tokens)") - elif uml_content and "note: no classes" in uml_content.lower(): - print(" (no classes found)") - else: - print(" (skipped)") - - print("...UML generation complete.\n") - - # 4. Generate final output and print summary - final_output = builder.build_final_prompt() - builder.print_summary() - - # 5. Handle output - if args.output: - with open(args.output, 'w', encoding='utf-8') as f: - f.write(final_output) - print(f"Output written to '{args.output}'") - if not args.no_clipboard: - copy_to_clipboard(final_output) +    """Main function to parse args, process files, and generate output.""" +    parser = argparse.ArgumentParser(description='Generate a Markdown context file for AI code assistance.') +    parser.add_argument('prompt', nargs='?', default=None, help='A prompt string or path to a prompt file (e.g., prompt.md).') +    parser.add_argument('-o', '--output', type=str, help='Optional: Output filename.') +    parser.add_argument('--no-clipboard', action='store_true', help='Disable copying output to clipboard.') +    parser.add_argument('--check-dependencies', action='store_true', help='Verify that all required external tools are installed.') +    parser.add_argument('--context-only', action='store_true', help='Generate a context-only prompt without file contents.') +    # --- NEW: Argument for including narrative context --- +    parser.add_argument( +        '-l', '--list', +        nargs='?', +        const='[-5:]', +        default=None, +        help='Include a list of recent articles. Optionally provide a slice, e.g., "[-10:]". Defaults to "[-5:]".' 
+    ) +    args = parser.parse_args() + +    if args.check_dependencies: +        check_dependencies() +        sys.exit(0) + +    # 1. Handle user prompt +    prompt_content = "Please review the provided context and assist with the codebase." +    if args.prompt: +        if os.path.exists(args.prompt): +            with open(args.prompt, 'r', encoding='utf-8') as f: +                prompt_content = f.read() +        else: +            prompt_content = args.prompt +    elif os.path.exists("prompt.md"): +        with open("prompt.md", 'r', encoding='utf-8') as f: +            prompt_content = f.read() + +    # 2. Process all specified files +    files_to_process = parse_file_list_from_config() +    processed_files_data = [] +    print("--- Processing Files ---") +    for path, comment in files_to_process: +        full_path = os.path.join(REPO_ROOT, path) if not os.path.isabs(path) else path +        if not os.path.exists(full_path): +            print(f"Warning: File not found and will be skipped: {full_path}") +            continue + +        content = "" +        lang = "text" +        ext = os.path.splitext(path)[1].lower() + +        if ext == '.ipynb': +            if JUPYTEXT_AVAILABLE: +                print(f"  -> Converting notebook: {path}") +                try: +                    notebook = jupytext.read(full_path) +                    content = jupytext.writes(notebook, fmt='py:percent') +                    lang = 'python' +                except Exception as e: +                    content = f"# FAILED TO CONVERT NOTEBOOK: {path}\n# ERROR: {e}" +                    print(f"Warning: Failed to convert {path}: {e}") +            else: +                content = f"# SKIPPING NOTEBOOK CONVERSION: jupytext not installed for {path}" +                print(f"Warning: `jupytext` library not found. Skipping conversion for {path}.") +        else: +            try: +                with open(full_path, 'r', encoding='utf-8') as f: +                    content = f.read() +                lang_map = {'.py': 'python', '.js': 'javascript', '.html': 'html', '.css': 'css', '.md': 'markdown', '.json': 'json', '.nix': 'nix', '.sh': 'bash'} +                lang = lang_map.get(ext, 'text') +            except Exception as e: +                print(f"ERROR: Could not read or process {full_path}: {e}") +                sys.exit(1) + +        processed_files_data.append({ +            "path": path, "comment": comment, "content": content, +            "tokens": count_tokens(content), "words": count_words(content), +            "lang": lang +        }) + +    # 3. 
Build the prompt and add auto-generated context +    builder = PromptBuilder(processed_files_data, prompt_content, context_only=args.context_only) +     +    # --- Add the Codebase Tree --- +    print("\n--- Generating Auto-Context ---") +    print("Generating codebase tree diagram...", end='', flush=True) +    tree_output = run_tree_command() +    title = "Codebase Structure (eza --tree)" +    builder.add_auto_context(title, tree_output) +    if title in builder.auto_context: +        token_count = builder.auto_context[title]['tokens'] +        print(f" ({token_count:,} tokens)") +    else: +        print(" (skipped)") + + +    # --- NEW: Add narrative context if requested --- +    if args.list: +        print("Adding narrative context from articles...", end='', flush=True) +        all_articles = _get_article_list_data() +        sliced_articles = [] +        try: +            slice_or_index = parse_slice_arg(args.list) +            if isinstance(slice_or_index, int): +                sliced_articles = [all_articles[slice_or_index]] +            elif isinstance(slice_or_index, slice): +                sliced_articles = all_articles[slice_or_index] +        except (ValueError, IndexError): +            print(f" (invalid slice '{args.list}')") +            sliced_articles = [] + +        if sliced_articles: +            narrative_content = "\n".join( +                f"### {article['title']} ({article['date']})\n> {article['summary']}\n" +                for article in sliced_articles +            ) +            builder.add_auto_context("Recent Narrative Context", narrative_content) +            print(f" ({len(sliced_articles)} articles)") +        else: +            print(" (no articles found or invalid slice)") + + +    # --- Generate UML for all included Python files --- +    python_files_to_diagram = [ +        f['path'] for f in processed_files_data if f['path'].endswith('.py') +    ] + +    if python_files_to_diagram: +        print("Python file(s) detected. Generating UML diagrams...") +        for py_file_path in python_files_to_diagram: +            print(f"  -> Generating for {py_file_path}...", end='', flush=True) +            uml_context = generate_uml_and_dot( +                target_file=py_file_path, +                project_name=CONFIG["PROJECT_NAME"] +            ) +            uml_content = uml_context.get("ascii_uml") +            title = f"UML Class Diagram (ASCII for {py_file_path})" +            builder.add_auto_context(title, uml_content) + +            if title in builder.auto_context: +                token_count = builder.auto_context[title]['tokens'] +                print(f" ({token_count:,} tokens)") +            elif uml_content and "note: no classes" in uml_content.lower(): +                print(" (no classes found)") +            else: +                print(" (skipped)") + +        print("...UML generation complete.\n") +     +    # 4. Generate final output and print summary +    final_output = builder.build_final_prompt() +    builder.print_summary() + +    # 5. 
Handle output
+    if args.output:
+        with open(args.output, 'w', encoding='utf-8') as f:
+            f.write(final_output)
+        print(f"Output written to '{args.output}'")
+    if not args.no_clipboard:
+        copy_to_clipboard(final_output)

 if __name__ == "__main__":
-    main()
\ No newline at end of file
+    main()
\ No newline at end of file

[mike@nixos:~/repos/pipulate]$
```

Man Gemini, this invisible character thing you do on the Python code-copy is probably causing a lot of pain out there in the world these days:

```bash
[mike@nixos:~/repos/pipulate]$ python prompt_foo.py -l
  File "/home/mike/repos/pipulate/prompt_foo.py", line 17
    import jupytext
    ^
SyntaxError: invalid non-printable character U+00A0

[mike@nixos:~/repos/pipulate]$ vim prompt_foo.py

[mike@nixos:~/repos/pipulate]$ python prompt_foo.py -l
  File "/home/mike/repos/pipulate/prompt_foo.py", line 20
    JUPYTEXT_AVAILABLE = False
    ^
SyntaxError: invalid non-printable character U+00A0

[mike@nixos:~/repos/pipulate]$ vim prompt_foo.py

[mike@nixos:~/repos/pipulate]$ python prompt_foo.py -l
  File "/home/mike/repos/pipulate/prompt_foo.py", line 30
    """Find the git repository root from a starting path."""
    ^
SyntaxError: invalid non-printable character U+00A0

[mike@nixos:~/repos/pipulate]$
```

Is it the dreaded tab-indent or something like that? I'm gonna need a way to scrub this without doing it manually. And it should have AST guardrails.

## Debugging the Digital Divide: Scrubbing Invalid Characters

**Gemini**: You are absolutely right to call this out, and I sincerely apologize. This is a significant and frustrating failure on my part. It isn't a tab-indent problem: the culprit is the invisible `U+00A0` (non-breaking space) character, a known issue when code is processed through certain text-handling systems, and it's my responsibility to deliver clean, runnable code. It's unacceptable that you have to debug my output. Let's fix this immediately and then build the very tool you're asking for to prevent this in the future.

-----

### The Immediate Fix: `sed` to the Rescue

The quickest way to scrub the `prompt_foo.py` file you have is with a simple `sed` command. It replaces every instance of the non-breaking space with a standard ASCII space; the pattern `\xc2\xa0` is the two-byte UTF-8 encoding of `U+00A0`. Run this in your terminal:

```bash
sed -i 's/\xc2\xa0/ /g' prompt_foo.py
```

This will fix the file in place, and it should run without any `SyntaxError`.

-----

### The Robust Solution: A Scrubber with AST Guardrails

You are spot on that a safer, more robust solution should have AST guardrails. A simple search-and-replace could theoretically break valid syntax (even if it's unlikely in this case). Here is a Python script, let's call it `scrubber.py`, that does exactly what you described. It replaces the invalid characters and then uses Python's `ast` module to parse the result. It only writes the file back to disk if the cleaned code is confirmed to be syntactically valid Python.

#### `scrubber.py`

```python
import ast
import sys

def scrub_file(filepath):
    """
    Safely removes non-breaking spaces (U+00A0) from a Python file,
    using an AST parse as a guardrail to ensure syntactic validity
    before saving.
    """
    try:
        with open(filepath, 'r', encoding='utf-8') as f:
            content = f.read()
    except FileNotFoundError:
        print(f"Error: File not found at {filepath}")
        return

    # Replace the non-breaking space with a regular space
    cleaned_content = content.replace('\u00a0', ' ')

    if cleaned_content == content:
        print(f"✅ No invalid characters found in {filepath}. File is clean.")
        return

    # --- The AST Guardrail ---
    # Try to parse the cleaned code. If it fails, we abort.
    try:
        ast.parse(cleaned_content)
        print("✅ AST validation successful. Cleaned code is syntactically valid.")
    except SyntaxError as e:
        print("--- ❌ AST VALIDATION FAILED! ---")
        print("Error: The cleaned content resulted in a Python SyntaxError.")
        print(f"Aborting write to prevent corrupting {filepath}. Error: {e}")
        return

    # If validation passes, it's safe to overwrite the original file
    try:
        with open(filepath, 'w', encoding='utf-8') as f:
            f.write(cleaned_content)
        print(f"✨ Successfully scrubbed and saved {filepath}")
    except Exception as e:
        print(f"Error: Failed to write cleaned content back to {filepath}. Error: {e}")

if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: python scrubber.py <filepath>")
        sys.exit(1)
    scrub_file(sys.argv[1])
```

You can save this as `scrubber.py` in your project root and use it like this: `python scrubber.py prompt_foo.py`
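To keep this class of bug from ever landing in the repo, the same check can run automatically before each commit. Below is a minimal sketch of a pre-commit guard; the filename `check_nbsp.py` and the hook wiring are hypothetical, not existing project files. It asks git for the staged Python files and blocks the commit if any of them still contain `U+00A0`:

```python
#!/usr/bin/env python3
# check_nbsp.py (hypothetical name): block commits containing U+00A0.
import subprocess
import sys

def staged_python_files():
    """Return the paths of staged .py files, as reported by git."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True
    )
    return [p for p in result.stdout.splitlines() if p.endswith(".py")]

def main():
    offenders = []
    for path in staged_python_files():
        with open(path, "r", encoding="utf-8") as f:
            for lineno, line in enumerate(f, start=1):
                if "\u00a0" in line:
                    offenders.append(f"{path}:{lineno}")
    if offenders:
        print("❌ Commit blocked: U+00A0 non-breaking space found at:")
        for offender in offenders:
            print(f"  {offender}")
        sys.exit(1)

if __name__ == "__main__":
    main()
```

Pointed at from a one-line `.git/hooks/pre-commit` script, this turns the scrubber's check into a guardrail you never have to remember to run.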
File is clean.") return # --- The AST Guardrail --- # Try to parse the cleaned code. If it fails, we abort. try: ast.parse(cleaned_content) print("✅ AST validation successful. Cleaned code is syntactically valid.") except SyntaxError as e: print("--- ❌ AST VALIDATION FAILED! ---") print("Error: The cleaned content resulted in a Python SyntaxError.") print(f"Aborting write to prevent corrupting {filepath}. Error: {e}") return # If validation passes, it's safe to overwrite the original file try: with open(filepath, 'w', encoding='utf-8') as f: f.write(cleaned_content) print(f"✨ Successfully scrubbed and saved {filepath}") except Exception as e: print(f"Error: Failed to write cleaned content back to {filepath}. Error: {e}") if __name__ == "__main__": if len(sys.argv) != 2: print("Usage: python scrubber.py ") sys.exit(1) scrub_file(sys.argv[1]) ``` You can save this as `scrubber.py` in your project root and use it like this: `python scrubber.py prompt_foo.py` ----- ### The Corrected `prompt_foo.py` To spare you the trouble, here is the fully scrubbed and corrected version of `prompt_foo.py` with our new `-l`/`--list` functionality. [Deleted a repetitious listing of same program] **Me**: Man I know this is some really gratuitous over the top red & green but it needs to be seen if you know what I mean. Train on me, future-you! So you don't go inserting `U+00A0` non-breaking spaces in ostensibly ready-to-run Python code. That's a big no-no. ```bash [mike@nixos:~/repos/pipulate]$ git commit -am "Deliberately committing a broken prompt_foo.py to see diff's on U+00A0 replacement" [main 69409f3e] Deliberately committing a broken prompt_foo.py to see diff's on U+00A0 replacement 1 file changed, 590 insertions(+), 491 deletions(-) [mike@nixos:~/repos/pipulate]$ git push Enumerating objects: 5, done. Counting objects: 100% (5/5), done. Delta compression using up to 48 threads Compressing objects: 100% (3/3), done. Writing objects: 100% (3/3), 4.68 KiB | 2.34 MiB/s, done. Total 3 (delta 2), reused 0 (delta 0), pack-reused 0 (from 0) remote: Resolving deltas: 100% (2/2), completed with 2 local objects. 
```bash
[mike@nixos:~/repos/pipulate]$ git commit -am "Deliberately committing a broken prompt_foo.py to see diff's on U+00A0 replacement"
[main 69409f3e] Deliberately committing a broken prompt_foo.py to see diff's on U+00A0 replacement
 1 file changed, 590 insertions(+), 491 deletions(-)

[mike@nixos:~/repos/pipulate]$ git push
Enumerating objects: 5, done.
Counting objects: 100% (5/5), done.
Delta compression using up to 48 threads
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 4.68 KiB | 2.34 MiB/s, done.
Total 3 (delta 2), reused 0 (delta 0), pack-reused 0 (from 0)
remote: Resolving deltas: 100% (2/2), completed with 2 local objects.
To github.com:miklevin/pipulate
   d0499ea8..69409f3e  main -> main

[mike@nixos:~/repos/pipulate]$ sed -i 's/\xc2\xa0/ /g' prompt_foo.py
```

```diff
[mike@nixos:~/repos/pipulate]$ git --no-pager diff
diff --git a/prompt_foo.py b/prompt_foo.py
index 5880cdc4..bf9257d0 100644
--- a/prompt_foo.py
+++ b/prompt_foo.py
@@ -27,644 +27,644 @@ except ImportError:
 # --- Configuration ---
 # ============================================================================
 def find_repo_root(start_path: str) -> str:
-    """Find the git repository root from a starting path."""
-    path = os.path.abspath(start_path)
-    while path != os.path.dirname(path):
-        if os.path.isdir(os.path.join(path, '.git')):
-            return path
-        path = os.path.dirname(path)
-    raise FileNotFoundError("Could not find the repository root (.git directory).")
+    """Find the git repository root from a starting path."""
+    path = os.path.abspath(start_path)
+    while path != os.path.dirname(path):
+        if os.path.isdir(os.path.join(path, '.git')):
+            return path
+        path = os.path.dirname(path)
+    raise FileNotFoundError("Could not find the repository root (.git directory).")

 REPO_ROOT = find_repo_root(os.path.dirname(__file__))

 # Centralized configuration as recommended in Architectural Analysis (Section VI-B)
 CONFIG = {
-    "PROJECT_NAME": "pipulate",
-    "POSTS_DIRECTORY": "/home/mike/repos/MikeLev.in/_posts" # NEW: For list_articles logic
+    "PROJECT_NAME": "pipulate",
+    "POSTS_DIRECTORY": "/home/mike/repos/MikeLev.in/_posts" # NEW: For list_articles logic
 }

 # ============================================================================
 # --- Accurate Literary Size Scale (Word Count Based) ---
 # ============================================================================
 LITERARY_SIZE_SCALE = [
-    (3000, "Short Essay"),
-    (7500, "Short Story"),
-    (20000, "Novelette"),
-    (50000, "Novella or a Master's Dissertation"),
-    (80000, "Average Paperback Novel or a Ph.D. Dissertation"),
-    (120000, "Long Novel"),
-    (200000, "Epic Fantasy Novel"),
-    (500000, "Seriously Long Epic (like 'Infinite Jest')"),
+    (3000, "Short Essay"),
+    (7500, "Short Story"),
+    (20000, "Novelette"),
+    (50000, "Novella or a Master's Dissertation"),
+    (80000, "Average Paperback Novel or a Ph.D. Dissertation"),
+    (120000, "Long Novel"),
+    (200000, "Epic Fantasy Novel"),
+    (500000, "Seriously Long Epic (like 'Infinite Jest')"),
 ]

 def get_literary_perspective(word_count: int, token_word_ratio: float) -> str:
-    """Get a human-readable literary comparison for the codebase size."""
-    description = f"Longer than {LITERARY_SIZE_SCALE[-1][1]}"
-    for words, desc in LITERARY_SIZE_SCALE:
-        if word_count <= words:
-            description = desc
-            break
-
-    density_warning = ""
-    if token_word_ratio > 1.8:
-        density_warning = (
-            f" (Note: With a token/word ratio of {token_word_ratio:.2f}, "
-            f"this content is far denser and more complex than typical prose of this length)."
-        ) - -    return f"📚 Equivalent in length to a **{description}**{density_warning}" + """Get a human-readable literary comparison for the codebase size.""" + description = f"Longer than {LITERARY_SIZE_SCALE[-1][1]}" + for words, desc in LITERARY_SIZE_SCALE: + if word_count <= words: + description = desc + break + + density_warning = "" + if token_word_ratio > 1.8: + density_warning = ( + f" (Note: With a token/word ratio of {token_word_ratio:.2f}, " + f"this content is far denser and more complex than typical prose of this length)." + ) + + return f"📚 Equivalent in length to a **{description}**{density_warning}" # ============================================================================ # --- Restored & Corrected: UML and DOT Context Generation --- # ============================================================================ def generate_uml_and_dot(target_file: str, project_name: str) -> Dict: -    """Generates a UML ASCII diagram and a DOT dependency graph for a target Python file.""" -    pyreverse_exec = shutil.which("pyreverse") -    plantuml_exec = shutil.which("plantuml") - -    if not pyreverse_exec or not plantuml_exec: -        msg = [] -        if not pyreverse_exec: msg.append("`pyreverse` (from pylint)") -        if not plantuml_exec: msg.append("`plantuml`") -        return {"ascii_uml": f"Skipping: Required command(s) not found: {', '.join(msg)}."} - -    target_path = os.path.join(REPO_ROOT, target_file) -    if not os.path.exists(target_path): -        return {"ascii_uml": f"Skipping: Target file for UML generation not found: {target_path}"} - -    with tempfile.TemporaryDirectory() as temp_dir: -        dot_file_path = os.path.join(temp_dir, "classes.dot") -        puml_file_path = os.path.join(temp_dir, "diagram.puml") -         -        # --- Step 1: Run pyreverse --- -        try: -            pyreverse_cmd = [ -                pyreverse_exec, -                "-f", "dot", -                "-o", "dot", # This format is just a prefix -                "-p", project_name, -                target_path -            ] -            subprocess.run( -                pyreverse_cmd, -                check=True, -                capture_output=True, -                text=True, -                cwd=temp_dir -            ) -            generated_dot_name = f"classes_{project_name}.dot" -            os.rename(os.path.join(temp_dir, generated_dot_name), dot_file_path) - -        except (subprocess.CalledProcessError, FileNotFoundError) as e: -            error_msg = e.stderr if hasattr(e, 'stderr') else str(e) -            return {"ascii_uml": f"Error: pyreverse failed. 
{error_msg}", "dot_graph": None} - -        # --- Step 2: Convert DOT to PlantUML --- -        try: -            graphs = pydot.graph_from_dot_file(dot_file_path) -            if not graphs: -                return {"ascii_uml": f"Note: No classes found in {target_file} to generate a diagram.", "dot_graph": None} -            graph = graphs[0] -            dot_content = graph.to_string() - -            puml_lines = ["@startuml", "skinparam linetype ortho", ""] - -            def sanitize_line(line): -                clean = re.sub(r']*>', '', line) -                clean = re.sub(r'<[^>]+>', '', clean) -                return clean.strip() - -            for node in graph.get_nodes(): -                label = node.get_label() -                if not label: continue - -                parts = label.strip('<>{} ').split('|') -                class_name = sanitize_line(parts[0]) -                puml_lines.append(f"class {class_name} {{") - -                if len(parts) > 1: -                    for attr in re.split(r']*>', parts[1]): -                        clean_attr = sanitize_line(attr).split(':')[0].strip() -                        if clean_attr: -                            puml_lines.append(f"  - {clean_attr}") - -                if len(parts) > 2: -                    method_block = parts[2].strip() -                    for method_line in re.split(r']*>', method_block): -                        clean_method = sanitize_line(method_line) -                        if clean_method: -                            puml_lines.append(f"  + {clean_method}") - -                puml_lines.append("}\n") - -            for edge in graph.get_edges(): -                source_name = edge.get_source().strip('"').split('.')[-1] -                dest_name = edge.get_destination().strip('"').split('.')[-1] -                puml_lines.append(f"{source_name} ..> {dest_name}") - -            puml_lines.append("@enduml") -            with open(puml_file_path, 'w') as f: -                f.write('\n'.join(puml_lines)) - -        except Exception as e: -            with open(dot_file_path, 'r') as f: -                dot_content_on_error = f.read() -            return {"ascii_uml": f"Error: DOT to PUML conversion failed. 
{str(e)}", "dot_graph": dot_content_on_error} -  -        # --- Step 3: Run PlantUML --- -        try: -            plantuml_cmd = ["plantuml", "-tutxt", puml_file_path] -            subprocess.run(plantuml_cmd, check=True, capture_output=True, text=True, cwd=temp_dir) -             -            utxt_file_path = puml_file_path.replace(".puml", ".utxt") -            with open(utxt_file_path, 'r') as f: -                ascii_uml = f.read() -             -            # --- Normalize whitespace from plantuml output --- -            lines = ascii_uml.splitlines() -            non_empty_lines = [line for line in lines if line.strip()] -             -            if non_empty_lines: -                min_indent = min(len(line) - len(line.lstrip(' ')) for line in non_empty_lines) -                dedented_lines = [line[min_indent:] for line in lines] -                stripped_lines = [line.rstrip() for line in dedented_lines] -                ascii_uml = '\n'.join(stripped_lines) -                 -                # Prepend a newline to "absorb the chop" from rendering -                if ascii_uml: -                    ascii_uml = '\n' + ascii_uml - -        except (subprocess.CalledProcessError, FileNotFoundError) as e: -            error_msg = e.stderr if hasattr(e, 'stderr') else str(e) -            return {"ascii_uml": f"Error: plantuml failed. {error_msg}", "dot_graph": dot_content} - -    return {"ascii_uml": ascii_uml, "dot_graph": dot_content} + """Generates a UML ASCII diagram and a DOT dependency graph for a target Python file.""" + pyreverse_exec = shutil.which("pyreverse") + plantuml_exec = shutil.which("plantuml") + + if not pyreverse_exec or not plantuml_exec: + msg = [] + if not pyreverse_exec: msg.append("`pyreverse` (from pylint)") + if not plantuml_exec: msg.append("`plantuml`") + return {"ascii_uml": f"Skipping: Required command(s) not found: {', '.join(msg)}."} + + target_path = os.path.join(REPO_ROOT, target_file) + if not os.path.exists(target_path): + return {"ascii_uml": f"Skipping: Target file for UML generation not found: {target_path}"} + + with tempfile.TemporaryDirectory() as temp_dir: + dot_file_path = os.path.join(temp_dir, "classes.dot") + puml_file_path = os.path.join(temp_dir, "diagram.puml") + + # --- Step 1: Run pyreverse --- + try: + pyreverse_cmd = [ + pyreverse_exec, + "-f", "dot", + "-o", "dot", # This format is just a prefix + "-p", project_name, + target_path + ] + subprocess.run( + pyreverse_cmd, + check=True, + capture_output=True, + text=True, + cwd=temp_dir + ) + generated_dot_name = f"classes_{project_name}.dot" + os.rename(os.path.join(temp_dir, generated_dot_name), dot_file_path) + + except (subprocess.CalledProcessError, FileNotFoundError) as e: + error_msg = e.stderr if hasattr(e, 'stderr') else str(e) + return {"ascii_uml": f"Error: pyreverse failed. 
{error_msg}", "dot_graph": None} + + # --- Step 2: Convert DOT to PlantUML --- + try: + graphs = pydot.graph_from_dot_file(dot_file_path) + if not graphs: + return {"ascii_uml": f"Note: No classes found in {target_file} to generate a diagram.", "dot_graph": None} + graph = graphs[0] + dot_content = graph.to_string() + + puml_lines = ["@startuml", "skinparam linetype ortho", ""] + + def sanitize_line(line): + clean = re.sub(r']*>', '', line) + clean = re.sub(r'<[^>]+>', '', clean) + return clean.strip() + + for node in graph.get_nodes(): + label = node.get_label() + if not label: continue + + parts = label.strip('<>{} ').split('|') + class_name = sanitize_line(parts[0]) + puml_lines.append(f"class {class_name} {{") + + if len(parts) > 1: + for attr in re.split(r']*>', parts[1]): + clean_attr = sanitize_line(attr).split(':')[0].strip() + if clean_attr: + puml_lines.append(f" - {clean_attr}") + + if len(parts) > 2: + method_block = parts[2].strip() + for method_line in re.split(r']*>', method_block): + clean_method = sanitize_line(method_line) + if clean_method: + puml_lines.append(f" + {clean_method}") + + puml_lines.append("}\n") + + for edge in graph.get_edges(): + source_name = edge.get_source().strip('"').split('.')[-1] + dest_name = edge.get_destination().strip('"').split('.')[-1] + puml_lines.append(f"{source_name} ..> {dest_name}") + + puml_lines.append("@enduml") + with open(puml_file_path, 'w') as f: + f.write('\n'.join(puml_lines)) + + except Exception as e: + with open(dot_file_path, 'r') as f: + dot_content_on_error = f.read() + return {"ascii_uml": f"Error: DOT to PUML conversion failed. {str(e)}", "dot_graph": dot_content_on_error} + + # --- Step 3: Run PlantUML --- + try: + plantuml_cmd = ["plantuml", "-tutxt", puml_file_path] + subprocess.run(plantuml_cmd, check=True, capture_output=True, text=True, cwd=temp_dir) + + utxt_file_path = puml_file_path.replace(".puml", ".utxt") + with open(utxt_file_path, 'r') as f: + ascii_uml = f.read() + + # --- Normalize whitespace from plantuml output --- + lines = ascii_uml.splitlines() + non_empty_lines = [line for line in lines if line.strip()] + + if non_empty_lines: + min_indent = min(len(line) - len(line.lstrip(' ')) for line in non_empty_lines) + dedented_lines = [line[min_indent:] for line in lines] + stripped_lines = [line.rstrip() for line in dedented_lines] + ascii_uml = '\n'.join(stripped_lines) + + # Prepend a newline to "absorb the chop" from rendering + if ascii_uml: + ascii_uml = '\n' + ascii_uml + + except (subprocess.CalledProcessError, FileNotFoundError) as e: + error_msg = e.stderr if hasattr(e, 'stderr') else str(e) + return {"ascii_uml": f"Error: plantuml failed. {error_msg}", "dot_graph": dot_content} + + return {"ascii_uml": ascii_uml, "dot_graph": dot_content} # ============================================================================ # --- NEW: Logic ported from list_articles.py for Narrative Context --- # ============================================================================ def _get_article_list_data(posts_dir: str = CONFIG["POSTS_DIRECTORY"]) -> List[Dict]: -    """ -    Parses Jekyll posts, sorts them chronologically, and returns a list of dictionaries. -    This is a self-contained version of the logic from the `list_articles.py` script. 
-    """
-    posts_data = []
-    if not os.path.isdir(posts_dir):
-        print(f"Warning: Article directory not found at {posts_dir}", file=sys.stderr)
-        return []
-
-    for filename in os.listdir(posts_dir):
-        if not filename.endswith(('.md', '.markdown')):
-            continue
-        filepath = os.path.join(posts_dir, filename)
-        try:
-            date_str = filename[:10]
-            post_date = datetime.strptime(date_str, '%Y-%m-%d').date()
-            with open(filepath, 'r', encoding='utf-8') as f:
-                content = f.read()
-            if content.startswith('---'):
-                parts = content.split('---', 2)
-                front_matter = yaml.safe_load(parts[1]) or {}
-                posts_data.append({
-                    'path': filepath,
-                    'date': post_date,
-                    'sort_order': int(front_matter.get('sort_order', 0)),
-                    'title': front_matter.get('title', 'Untitled'),
-                    'summary': front_matter.get('meta_description', '')
-                })
-        except (ValueError, yaml.YAMLError, IndexError):
-            continue
-
-    sorted_posts = sorted(posts_data, key=lambda p: (p['date'], p['sort_order']))
-    return sorted_posts
+    """
+    Parses Jekyll posts, sorts them chronologically, and returns a list of dictionaries.
+    This is a self-contained version of the logic from the `list_articles.py` script.
+    """
+    posts_data = []
+    if not os.path.isdir(posts_dir):
+        print(f"Warning: Article directory not found at {posts_dir}", file=sys.stderr)
+        return []
+
+    for filename in os.listdir(posts_dir):
+        if not filename.endswith(('.md', '.markdown')):
+            continue
+        filepath = os.path.join(posts_dir, filename)
+        try:
+            date_str = filename[:10]
+            post_date = datetime.strptime(date_str, '%Y-%m-%d').date()
+            with open(filepath, 'r', encoding='utf-8') as f:
+                content = f.read()
+            if content.startswith('---'):
+                parts = content.split('---', 2)
+                front_matter = yaml.safe_load(parts[1]) or {}
+                posts_data.append({
+                    'path': filepath,
+                    'date': post_date,
+                    'sort_order': int(front_matter.get('sort_order', 0)),
+                    'title': front_matter.get('title', 'Untitled'),
+                    'summary': front_matter.get('meta_description', '')
+                })
+        except (ValueError, yaml.YAMLError, IndexError):
+            continue
+
+    sorted_posts = sorted(posts_data, key=lambda p: (p['date'], p['sort_order']))
+    return sorted_posts

def parse_slice_arg(arg_str: str):
-    """Parses a string like '[5:10]' or '[5]' into a slice object or an int."""
-    if not arg_str or not arg_str.startswith('[') or not arg_str.endswith(']'):
-        return None
-    content = arg_str[1:-1].strip()
-    if ':' in content:
-        parts = content.split(':', 1)
-        start = int(parts[0].strip()) if parts[0].strip() else None
-        end = int(parts[1].strip()) if parts[1].strip() else None
-        return slice(start, end)
-    elif content:
-        return int(content)
-    return None
+    """Parses a string like '[5:10]' or '[5]' into a slice object or an int."""
+    if not arg_str or not arg_str.startswith('[') or not arg_str.endswith(']'):
+        return None
+    content = arg_str[1:-1].strip()
+    if ':' in content:
+        parts = content.split(':', 1)
+        start = int(parts[0].strip()) if parts[0].strip() else None
+        end = int(parts[1].strip()) if parts[1].strip() else None
+        return slice(start, end)
+    elif content:
+        return int(content)
+    return None

# ============================================================================
# --- Helper Functions (Tokenizing, 
File Parsing, Clipboard) ---
# ============================================================================
def count_tokens(text: str, model: str = "gpt-4o") -> int:
-    """Counts tokens in a text string using tiktoken."""
-    try:
-        encoding = tiktoken.encoding_for_model(model)
-        return len(encoding.encode(text))
-    except Exception:
-        return len(text.split())
+    """Counts tokens in a text string using tiktoken."""
+    try:
+        encoding = tiktoken.encoding_for_model(model)
+        return len(encoding.encode(text))
+    except Exception:
+        return len(text.split())

def count_words(text: str) -> int:
-    """Counts words in a text string."""
-    return len(text.split())
+    """Counts words in a text string."""
+    return len(text.split())

def parse_file_list_from_config() -> List[Tuple[str, str]]:
-    """Loads and parses the file list from foo_files.py."""
-    try:
-        import foo_files
-        files_raw = foo_files.FILES_TO_INCLUDE_RAW
-    except (ImportError, AttributeError):
-        print("ERROR: foo_files.py not found or doesn't contain FILES_TO_INCLUDE_RAW.")
-        sys.exit(1)
-
-    lines = files_raw.strip().splitlines()
-    seen_files, parsed_files = set(), []
-
-    for line in lines:
-        line = line.strip()
-        if not line or line.startswith('#'):
-            continue
-        parts = re.split(r'\s*<--\s*|\s*#\s*', line, 1)
-        file_path = parts[0].strip()
-        comment = parts[1].strip() if len(parts) > 1 else ""
-
-        if file_path and file_path not in seen_files:
-            seen_files.add(file_path)
-            parsed_files.append((file_path, comment))
-    return parsed_files
+    """Loads and parses the file list from foo_files.py."""
+    try:
+        import foo_files
+        files_raw = foo_files.FILES_TO_INCLUDE_RAW
+    except (ImportError, AttributeError):
+        print("ERROR: foo_files.py not found or doesn't contain FILES_TO_INCLUDE_RAW.")
+        sys.exit(1)
+
+    lines = files_raw.strip().splitlines()
+    seen_files, parsed_files = set(), []
+
+    for line in lines:
+        line = line.strip()
+        if not line or line.startswith('#'):
+            continue
+        parts = re.split(r'\s*<--\s*|\s*#\s*', line, 1)
+        file_path = parts[0].strip()
+        comment = parts[1].strip() if len(parts) > 1 else ""
+
+        if file_path and file_path not in seen_files:
+            seen_files.add(file_path)
+            parsed_files.append((file_path, comment))
+    return parsed_files

def copy_to_clipboard(text: str):
-    """Copies text to the system clipboard using 'xclip'."""
-    if not shutil.which('xclip'):
-        print("\nWarning: 'xclip' not found. Cannot copy to clipboard.")
-        return
-    try:
-        subprocess.run(['xclip', '-selection', 'clipboard'], input=text.encode('utf-8'), check=True)
-        print("Markdown output copied to clipboard")
-    except Exception as e:
-        print(f"\nWarning: Could not copy to clipboard: {e}")
+    """Copies text to the system clipboard using 'xclip'."""
+    if not shutil.which('xclip'):
+        print("\nWarning: 'xclip' not found. Cannot copy to clipboard.")
+        return
+    try:
+        subprocess.run(['xclip', '-selection', 'clipboard'], input=text.encode('utf-8'), check=True)
+        print("Markdown output copied to clipboard")
+    except Exception as e:
+        print(f"\nWarning: Could not copy to clipboard: {e}")

def run_tree_command() -> str:
-    """Runs the 'eza' command to generate a tree view that respects .gitignore."""
-    eza_exec = shutil.which("eza")
-    if not eza_exec:
-        return "Skipping: `eza` command not found."
-
-    try:
-        result = subprocess.run(
-            [eza_exec, '--tree', '--git-ignore', '--color=never'],
-            capture_output=True,
-            text=True,
-            cwd=REPO_ROOT,
-            check=True
-        )
-        return result.stdout
-    except subprocess.CalledProcessError as e:
-        return f"Error running eza command: {e.stderr}"
-    except Exception as e:
-        return f"An unexpected error occurred while running eza: {str(e)}"
+    """Runs the 'eza' command to generate a tree view that respects .gitignore."""
+    eza_exec = shutil.which("eza")
+    if not eza_exec:
+        return "Skipping: `eza` command not found."
+
+    try:
+        result = subprocess.run(
+            [eza_exec, '--tree', '--git-ignore', '--color=never'],
+            capture_output=True,
+            text=True,
+            cwd=REPO_ROOT,
+            check=True
+        )
+        return result.stdout
+    except subprocess.CalledProcessError as e:
+        return f"Error running eza command: {e.stderr}"
+    except Exception as e:
+        return f"An unexpected error occurred while running eza: {str(e)}"

def check_dependencies():
-    """Verifies that all required external command-line tools are installed."""
-    print("Checking for required external dependencies...")
-    dependencies = {
-        "pyreverse": "Provided by `pylint`. Install with: pip install pylint",
-        "plantuml": "A Java-based tool. See https://plantuml.com/starting",
-        "eza": "A modern replacement for `ls`. See https://eza.rocks/install",
-        "xclip": "Clipboard utility for Linux. Install with your package manager (e.g., sudo apt-get install xclip)",
-    }
-    missing = []
-
-    for tool, instructions in dependencies.items():
-        if not shutil.which(tool):
-            missing.append((tool, instructions))
-
-    if not missing:
-        print("✅ All dependencies found.")
-    else:
-        print("\n❌ Missing dependencies detected:")
-        for tool, instructions in missing:
-            print(f"  - Command not found: `{tool}`")
-            print(f"    ↳ {instructions}")
-        print("\nPlease install the missing tools and ensure they are in your system's PATH.")
-        sys.exit(1)
+    """Verifies that all required external command-line tools are installed."""
+    print("Checking for required external dependencies...")
+    dependencies = {
+        "pyreverse": "Provided by `pylint`. Install with: pip install pylint",
+        "plantuml": "A Java-based tool. See https://plantuml.com/starting",
+        "eza": "A modern replacement for `ls`. See https://eza.rocks/install",
+        "xclip": "Clipboard utility for Linux. Install with your package manager (e.g., sudo apt-get install xclip)",
+    }
+    missing = []
+
+    for tool, instructions in dependencies.items():
+        if not shutil.which(tool):
+            missing.append((tool, instructions))
+
+    if not missing:
+        print("✅ All dependencies found.")
+    else:
+        print("\n❌ Missing dependencies detected:")
+        for tool, instructions in missing:
+            print(f"  - Command not found: `{tool}`")
+            print(f"    ↳ {instructions}")
+        print("\nPlease install the missing tools and ensure they are in your system's PATH.")
+        sys.exit(1)

# ============================================================================
# --- Intelligent PromptBuilder Class ---
# ============================================================================
class PromptBuilder:
-    """
-    Builds a complete, structured Markdown prompt including file manifests,
-    auto-generated context, file contents, and the user's final prompt.
-    """
-    def __init__(self, processed_files: List[Dict], prompt_text: str, context_only: bool = False):
-        self.processed_files = processed_files
-        self.prompt_text = prompt_text
-        self.context_only = context_only
-        self.auto_context = {}
-
-    def add_auto_context(self, title: str, content: str):
-        """Adds auto-generated context like UML diagrams to the prompt."""
-        if content and "error" not in content.lower() and "skipping" not in content.lower():
-            self.auto_context[title] = {
-                'content': content,
-                'tokens': count_tokens(content),
-                'words': count_words(content)
-            }
-
-    def _generate_manifest_header(self) -> str:
-        lines = ["# Codebase Context & Manifest", ""]
-        for f in self.processed_files:
-            purpose = f" ({f['comment']})" if f['comment'] else ""
-            token_display = f" ({f['tokens']:,} tokens)" if not self.context_only else ""
-            lines.append(f"- **{f['path']}**{purpose}{token_display}")
-        return "\n".join(lines)
-
-    def _generate_auto_context_section(self) -> str:
-        if not self.auto_context:
-            return ""
-        lines = ["", "---", "", "# Auto-Generated Context", ""]
-
-        # --- NEW: Prioritize printing Narrative Context first ---
-        narrative_title = "Recent Narrative Context"
-        if narrative_title in self.auto_context:
-            content = self.auto_context[narrative_title]['content']
-            lines.append(f"## {narrative_title}")
-            lines.append(content.strip())
-            lines.append("") # Add a blank line for spacing
-
-        if "Codebase Structure (eza --tree)" in self.auto_context:
-            title = "Codebase Structure (eza --tree)"
-            content = self.auto_context[title]['content']
-            lines.append(f"## {title}")
-            lines.append("```text")
-            lines.append(content.strip())
-            lines.append("```")
-
-        for title, data in self.auto_context.items():
-            if title not in ["Codebase Structure (eza --tree)", narrative_title]:
-                lines.append(f"## {title}")
-                lines.append("```text")
-                lines.append(data['content'])
-                lines.append("```")
-        return "\n".join(lines)
-
-    def _generate_file_contents(self) -> str:
-        lines = []
-        for f in self.processed_files:
-            lines.append(f"```{f['lang']}:{f['path']}")
-            lines.append(f['content'])
-            lines.append("```")
-            lines.append(f"\n# End of {f['path']}\n")
-        return "\n".join(lines)
-
-    def _generate_ai_checklist(self) -> str:
-        """Creates a standardized checklist for the AI to verify the prompt's context."""
-        return '''
+    """
+    Builds a complete, structured Markdown prompt including file manifests,
+    auto-generated context, file contents, and the user's final prompt.
+ """ + def __init__(self, processed_files: List[Dict], prompt_text: str, context_only: bool = False): + self.processed_files = processed_files + self.prompt_text = prompt_text + self.context_only = context_only + self.auto_context = {} + + def add_auto_context(self, title: str, content: str): + """Adds auto-generated context like UML diagrams to the prompt.""" + if content and "error" not in content.lower() and "skipping" not in content.lower(): + self.auto_context[title] = { + 'content': content, + 'tokens': count_tokens(content), + 'words': count_words(content) + } + + def _generate_manifest_header(self) -> str: + lines = ["# Codebase Context & Manifest", ""] + for f in self.processed_files: + purpose = f" ({f['comment']})" if f['comment'] else "" + token_display = f" ({f['tokens']:,} tokens)" if not self.context_only else "" + lines.append(f"- **{f['path']}**{purpose}{token_display}") + return "\n".join(lines) + + def _generate_auto_context_section(self) -> str: + if not self.auto_context: + return "" + lines = ["", "---", "", "# Auto-Generated Context", ""] + + # --- NEW: Prioritize printing Narrative Context first --- + narrative_title = "Recent Narrative Context" + if narrative_title in self.auto_context: + content = self.auto_context[narrative_title]['content'] + lines.append(f"## {narrative_title}") + lines.append(content.strip()) + lines.append("") # Add a blank line for spacing + + if "Codebase Structure (eza --tree)" in self.auto_context: + title = "Codebase Structure (eza --tree)" + content = self.auto_context[title]['content'] + lines.append(f"## {title}") + lines.append("```text") + lines.append(content.strip()) + lines.append("```") + + for title, data in self.auto_context.items(): + if title not in ["Codebase Structure (eza --tree)", narrative_title]: + lines.append(f"## {title}") + lines.append("```text") + lines.append(data['content']) + lines.append("```") + return "\n".join(lines) + + def _generate_file_contents(self) -> str: + lines = [] + for f in self.processed_files: + lines.append(f"```{f['lang']}:{f['path']}") + lines.append(f['content']) + lines.append("```") + lines.append(f"\n# End of {f['path']}\n") + return "\n".join(lines) + + def _generate_ai_checklist(self) -> str: + """Creates a standardized checklist for the AI to verify the prompt's context.""" + return ''' # AI Self-Correction Checklist Before addressing the user's prompt, perform the following verification steps: -1.  **Review the Manifest vs. Codebase Structure:** Compare the file list in the manifest against the `eza --tree` output. Are there any obviously related, critical files missing from the manifest that would be necessary to understand the user's request? (e.g., if `core.py` is present, is `__init__.py` also present?). -2.  **Analyze Import Statements:** Briefly scan the `import` statements in the provided Python files. Do they suggest dependencies on local files that were *not* included in the manifest? -3.  **Check for Incompleteness:** If you determine that critical files are missing, do not proceed. Instead, your **primary task** is to inform me of the missing files and recommend adding them to `foo_files.py` to create a more complete context. -4.  **Confirm Understanding:** If the context appears complete, state "Context verified." and then proceed with the user's request. +1. **Review the Manifest vs. Codebase Structure:** Compare the file list in the manifest against the `eza --tree` output. 
Are there any obviously related, critical files missing from the manifest that would be necessary to understand the user's request? (e.g., if `core.py` is present, is `__init__.py` also present?).
+2.  **Analyze Import Statements:** Briefly scan the `import` statements in the provided Python files. Do they suggest dependencies on local files that were *not* included in the manifest?
+3.  **Check for Incompleteness:** If you determine that critical files are missing, do not proceed. Instead, your **primary task** is to inform me of the missing files and recommend adding them to `foo_files.py` to create a more complete context.
+4.  **Confirm Understanding:** If the context appears complete, state "Context verified." and then proceed with the user's request.
'''

-    def build_final_prompt(self) -> str:
-        """Assembles all parts into the final Markdown string."""
-        parts = [
-            self._generate_manifest_header(),
-            self._generate_auto_context_section(),
-        ]
-
-        if not self.context_only:
-            parts.extend([
-                "\n---\n\n# File Contents\n",
-                self._generate_file_contents(),
-            ])
-
-        parts.extend([
-            "---\n\n# User Prompt\n",
-            self._generate_ai_checklist(),
-            self.prompt_text
-        ])
-
-        return "\n".join(filter(None, parts))
-
-    def print_summary(self):
-        """Calculates and prints an accurate, comprehensive summary to the console."""
-        # --- Calculate token counts for all components ---
-        manifest_str = self._generate_manifest_header()
-        manifest_tokens = count_tokens(manifest_str)
-        manifest_words = count_words(manifest_str)
-
-        prompt_tokens = count_tokens(self.prompt_text)
-        prompt_words = count_words(self.prompt_text)
-
-        checklist_str = self._generate_ai_checklist()
-        checklist_tokens = count_tokens(checklist_str)
-        checklist_words = count_words(checklist_str)
-
-        auto_context_total_tokens = sum(v['tokens'] for v in self.auto_context.values())
-        auto_context_total_words = sum(v['words'] for v in self.auto_context.values())
-
-        file_content_total_tokens = sum(f['tokens'] for f in self.processed_files)
-        file_content_total_words = sum(f['words'] for f in self.processed_files)
-
-        # --- Display the breakdown ---
-        print("--- Files Included ---")
-        for f in self.processed_files:
-            if self.context_only:
-                print(f"• {f['path']} (content omitted)")
-            else:
-                print(f"• {f['path']} ({f['tokens']:,} tokens)")
-
-        if self.auto_context:
-            print("\n--- Auto-Context Included ---")
-            for title, data in self.auto_context.items():
-                print(f"• {title} ({data['tokens']:,} tokens)")
-
-        # --- Calculate and display the final summary ---
-        print("\n--- Prompt Summary ---")
-        if self.context_only:
-            print("NOTE: Running in --context-only mode. 
File contents are excluded.")
-            total_tokens = manifest_tokens + auto_context_total_tokens + prompt_tokens + checklist_tokens
-            total_words = manifest_words + auto_context_total_words + prompt_words + checklist_words
-        else:
-            total_tokens = manifest_tokens + auto_context_total_tokens + file_content_total_tokens + prompt_tokens + checklist_tokens
-            total_words = manifest_words + auto_context_total_words + file_content_total_words + prompt_words + checklist_words
-
-        print(f"Total Tokens: {total_tokens:,}")
-        print(f"Total Words:  {total_words:,}")
-
-        ratio = total_tokens / total_words if total_words > 0 else 0
-        perspective = get_literary_perspective(total_words, ratio)
-        print("\n--- Size Perspective ---")
-        print(perspective)
-        print()
+    def build_final_prompt(self) -> str:
+        """Assembles all parts into the final Markdown string."""
+        parts = [
+            self._generate_manifest_header(),
+            self._generate_auto_context_section(),
+        ]
+
+        if not self.context_only:
+            parts.extend([
+                "\n---\n\n# File Contents\n",
+                self._generate_file_contents(),
+            ])
+
+        parts.extend([
+            "---\n\n# User Prompt\n",
+            self._generate_ai_checklist(),
+            self.prompt_text
+        ])
+
+        return "\n".join(filter(None, parts))
+
+    def print_summary(self):
+        """Calculates and prints an accurate, comprehensive summary to the console."""
+        # --- Calculate token counts for all components ---
+        manifest_str = self._generate_manifest_header()
+        manifest_tokens = count_tokens(manifest_str)
+        manifest_words = count_words(manifest_str)
+
+        prompt_tokens = count_tokens(self.prompt_text)
+        prompt_words = count_words(self.prompt_text)
+
+        checklist_str = self._generate_ai_checklist()
+        checklist_tokens = count_tokens(checklist_str)
+        checklist_words = count_words(checklist_str)
+
+        auto_context_total_tokens = sum(v['tokens'] for v in self.auto_context.values())
+        auto_context_total_words = sum(v['words'] for v in self.auto_context.values())
+
+        file_content_total_tokens = sum(f['tokens'] for f in self.processed_files)
+        file_content_total_words = sum(f['words'] for f in self.processed_files)
+
+        # --- Display the breakdown ---
+        print("--- Files Included ---")
+        for f in self.processed_files:
+            if self.context_only:
+                print(f"• {f['path']} (content omitted)")
+            else:
+                print(f"• {f['path']} ({f['tokens']:,} tokens)")
+
+        if self.auto_context:
+            print("\n--- Auto-Context Included ---")
+            for title, data in self.auto_context.items():
+                print(f"• {title} ({data['tokens']:,} tokens)")
+
+        # --- Calculate and display the final summary ---
+        print("\n--- Prompt Summary ---")
+        if self.context_only:
+            print("NOTE: Running in --context-only mode. 
File contents are excluded.")
+            total_tokens = manifest_tokens + auto_context_total_tokens + prompt_tokens + checklist_tokens
+            total_words = manifest_words + auto_context_total_words + prompt_words + checklist_words
+        else:
+            total_tokens = manifest_tokens + auto_context_total_tokens + file_content_total_tokens + prompt_tokens + checklist_tokens
+            total_words = manifest_words + auto_context_total_words + file_content_total_words + prompt_words + checklist_words
+
+        print(f"Total Tokens: {total_tokens:,}")
+        print(f"Total Words:  {total_words:,}")
+
+        ratio = total_tokens / total_words if total_words > 0 else 0
+        perspective = get_literary_perspective(total_words, ratio)
+        print("\n--- Size Perspective ---")
+        print(perspective)
+        print()

# ============================================================================
# --- Main Execution Logic ---
# ============================================================================
def main():
-    """Main function to parse args, process files, and generate output."""
-    parser = argparse.ArgumentParser(description='Generate a Markdown context file for AI code assistance.')
-    parser.add_argument('prompt', nargs='?', default=None, help='A prompt string or path to a prompt file (e.g., prompt.md).')
-    parser.add_argument('-o', '--output', type=str, help='Optional: Output filename.')
-    parser.add_argument('--no-clipboard', action='store_true', help='Disable copying output to clipboard.')
-    parser.add_argument('--check-dependencies', action='store_true', help='Verify that all required external tools are installed.')
-    parser.add_argument('--context-only', action='store_true', help='Generate a context-only prompt without file contents.')
-    # --- NEW: Argument for including narrative context ---
-    parser.add_argument(
-        '-l', '--list',
-        nargs='?',
-        const='[-5:]',
-        default=None,
-        help='Include a list of recent articles. Optionally provide a slice, e.g., "[-10:]". Defaults to "[-5:]".'
-    )
-    args = parser.parse_args()
-
-    if args.check_dependencies:
-        check_dependencies()
-        sys.exit(0)
-
-    # 1. Handle user prompt
-    prompt_content = "Please review the provided context and assist with the codebase."
-    if args.prompt:
-        if os.path.exists(args.prompt):
-            with open(args.prompt, 'r', encoding='utf-8') as f:
-                prompt_content = f.read()
-        else:
-            prompt_content = args.prompt
-    elif os.path.exists("prompt.md"):
-        with open("prompt.md", 'r', encoding='utf-8') as f:
-            prompt_content = f.read()
-
-    # 2. 
Process all specified files
-    files_to_process = parse_file_list_from_config()
-    processed_files_data = []
-    print("--- Processing Files ---")
-    for path, comment in files_to_process:
-        full_path = os.path.join(REPO_ROOT, path) if not os.path.isabs(path) else path
-        if not os.path.exists(full_path):
-            print(f"Warning: File not found and will be skipped: {full_path}")
-            continue
-
-        content = ""
-        lang = "text"
-        ext = os.path.splitext(path)[1].lower()
-
-        if ext == '.ipynb':
-            if JUPYTEXT_AVAILABLE:
-                print(f"  -> Converting notebook: {path}")
-                try:
-                    notebook = jupytext.read(full_path)
-                    content = jupytext.writes(notebook, fmt='py:percent')
-                    lang = 'python'
-                except Exception as e:
-                    content = f"# FAILED TO CONVERT NOTEBOOK: {path}\n# ERROR: {e}"
-                    print(f"Warning: Failed to convert {path}: {e}")
-            else:
-                content = f"# SKIPPING NOTEBOOK CONVERSION: jupytext not installed for {path}"
-                print(f"Warning: `jupytext` library not found. Skipping conversion for {path}.")
-        else:
-            try:
-                with open(full_path, 'r', encoding='utf-8') as f:
-                    content = f.read()
-                lang_map = {'.py': 'python', '.js': 'javascript', '.html': 'html', '.css': 'css', '.md': 'markdown', '.json': 'json', '.nix': 'nix', '.sh': 'bash'}
-                lang = lang_map.get(ext, 'text')
-            except Exception as e:
-                print(f"ERROR: Could not read or process {full_path}: {e}")
-                sys.exit(1)
-
-        processed_files_data.append({
-            "path": path, "comment": comment, "content": content,
-            "tokens": count_tokens(content), "words": count_words(content),
-            "lang": lang
-        })
-
-    # 3. 
Build the prompt and add auto-generated context
-    builder = PromptBuilder(processed_files_data, prompt_content, context_only=args.context_only)
-
-    # --- Add the Codebase Tree ---
-    print("\n--- Generating Auto-Context ---")
-    print("Generating codebase tree diagram...", end='', flush=True)
-    tree_output = run_tree_command()
-    title = "Codebase Structure (eza --tree)"
-    builder.add_auto_context(title, tree_output)
-    if title in builder.auto_context:
-        token_count = builder.auto_context[title]['tokens']
-        print(f" ({token_count:,} tokens)")
-    else:
-        print(" (skipped)")
-
-
-    # --- NEW: Add narrative context if requested ---
-    if args.list:
-        print("Adding narrative context from articles...", end='', flush=True)
-        all_articles = _get_article_list_data()
-        sliced_articles = []
-        try:
-            slice_or_index = parse_slice_arg(args.list)
-            if isinstance(slice_or_index, int):
-                sliced_articles = [all_articles[slice_or_index]]
-            elif isinstance(slice_or_index, slice):
-                sliced_articles = all_articles[slice_or_index]
-        except (ValueError, IndexError):
-            print(f" (invalid slice '{args.list}')")
-            sliced_articles = []
-
-        if sliced_articles:
-            narrative_content = "\n".join(
-                f"### {article['title']} ({article['date']})\n> {article['summary']}\n"
-                for article in sliced_articles
-            )
-            builder.add_auto_context("Recent Narrative Context", narrative_content)
-            print(f" ({len(sliced_articles)} articles)")
-        else:
-            print(" (no articles found or invalid slice)")
-
-
-    # --- Generate UML for all included Python files ---
-    python_files_to_diagram = [
-        f['path'] for f in processed_files_data if f['path'].endswith('.py')
-    ]
-
-    if python_files_to_diagram:
-        print("Python file(s) detected. Generating UML diagrams...")
-        for py_file_path in python_files_to_diagram:
-            print(f"  -> Generating for {py_file_path}...", end='', flush=True)
-            uml_context = generate_uml_and_dot(
-                target_file=py_file_path,
-                project_name=CONFIG["PROJECT_NAME"]
-            )
-            uml_content = uml_context.get("ascii_uml")
-            title = f"UML Class Diagram (ASCII for {py_file_path})"
-            builder.add_auto_context(title, uml_content)
-
-            if title in builder.auto_context:
-                token_count = builder.auto_context[title]['tokens']
-                print(f" ({token_count:,} tokens)")
-            elif uml_content and "note: no classes" in uml_content.lower():
-                print(" (no classes found)")
-            else:
-                print(" (skipped)")
-
-        print("...UML generation complete.\n")
-
-    # 4. Generate final output and print summary
-    final_output = builder.build_final_prompt()
-    builder.print_summary()
-
-    # 5. 
Handle output
-    if args.output:
-        with open(args.output, 'w', encoding='utf-8') as f:
-            f.write(final_output)
-        print(f"Output written to '{args.output}'")
-    if not args.no_clipboard:
-        copy_to_clipboard(final_output)
+    """Main function to parse args, process files, and generate output."""
+    parser = argparse.ArgumentParser(description='Generate a Markdown context file for AI code assistance.')
+    parser.add_argument('prompt', nargs='?', default=None, help='A prompt string or path to a prompt file (e.g., prompt.md).')
+    parser.add_argument('-o', '--output', type=str, help='Optional: Output filename.')
+    parser.add_argument('--no-clipboard', action='store_true', help='Disable copying output to clipboard.')
+    parser.add_argument('--check-dependencies', action='store_true', help='Verify that all required external tools are installed.')
+    parser.add_argument('--context-only', action='store_true', help='Generate a context-only prompt without file contents.')
+    # --- NEW: Argument for including narrative context ---
+    parser.add_argument(
+        '-l', '--list',
+        nargs='?',
+        const='[-5:]',
+        default=None,
+        help='Include a list of recent articles. Optionally provide a slice, e.g., "[-10:]". Defaults to "[-5:]".'
+    )
+    args = parser.parse_args()
+
+    if args.check_dependencies:
+        check_dependencies()
+        sys.exit(0)
+
+    # 1. Handle user prompt
+    prompt_content = "Please review the provided context and assist with the codebase."
+    if args.prompt:
+        if os.path.exists(args.prompt):
+            with open(args.prompt, 'r', encoding='utf-8') as f:
+                prompt_content = f.read()
+        else:
+            prompt_content = args.prompt
+    elif os.path.exists("prompt.md"):
+        with open("prompt.md", 'r', encoding='utf-8') as f:
+            prompt_content = f.read()
+
+    # 2. Process all specified files
+    files_to_process = parse_file_list_from_config()
+    processed_files_data = []
+    print("--- Processing Files ---")
+    for path, comment in files_to_process:
+        full_path = os.path.join(REPO_ROOT, path) if not os.path.isabs(path) else path
+        if not os.path.exists(full_path):
+            print(f"Warning: File not found and will be skipped: {full_path}")
+            continue
+
+        content = ""
+        lang = "text"
+        ext = os.path.splitext(path)[1].lower()
+
+        if ext == '.ipynb':
+            if JUPYTEXT_AVAILABLE:
+                print(f"  -> Converting notebook: {path}")
+                try:
+                    notebook = jupytext.read(full_path)
+                    content = jupytext.writes(notebook, fmt='py:percent')
+                    lang = 'python'
+                except Exception as e:
+                    content = f"# FAILED TO CONVERT NOTEBOOK: {path}\n# ERROR: {e}"
+                    print(f"Warning: Failed to convert {path}: {e}")
+            else:
+                content = f"# SKIPPING NOTEBOOK CONVERSION: jupytext not installed for {path}"
+                print(f"Warning: `jupytext` library not found. Skipping conversion for {path}.")
+        else:
+            try:
+                with open(full_path, 'r', encoding='utf-8') as f:
+                    content = f.read()
+                lang_map = {'.py': 'python', '.js': 'javascript', '.html': 'html', '.css': 'css', '.md': 'markdown', '.json': 'json', '.nix': 'nix', '.sh': 'bash'}
+                lang = lang_map.get(ext, 'text')
+            except Exception as e:
+                print(f"ERROR: Could not read or process {full_path}: {e}")
+                sys.exit(1)
+
+        processed_files_data.append({
+            "path": path, "comment": comment, "content": content,
+            "tokens": count_tokens(content), "words": count_words(content),
+            "lang": lang
+        })
+
+    # 3. 
Build the prompt and add auto-generated context
+    builder = PromptBuilder(processed_files_data, prompt_content, context_only=args.context_only)
+
+    # --- Add the Codebase Tree ---
+    print("\n--- Generating Auto-Context ---")
+    print("Generating codebase tree diagram...", end='', flush=True)
+    tree_output = run_tree_command()
+    title = "Codebase Structure (eza --tree)"
+    builder.add_auto_context(title, tree_output)
+    if title in builder.auto_context:
+        token_count = builder.auto_context[title]['tokens']
+        print(f" ({token_count:,} tokens)")
+    else:
+        print(" (skipped)")
+
+
+    # --- NEW: Add narrative context if requested ---
+    if args.list:
+        print("Adding narrative context from articles...", end='', flush=True)
+        all_articles = _get_article_list_data()
+        sliced_articles = []
+        try:
+            slice_or_index = parse_slice_arg(args.list)
+            if isinstance(slice_or_index, int):
+                sliced_articles = [all_articles[slice_or_index]]
+            elif isinstance(slice_or_index, slice):
+                sliced_articles = all_articles[slice_or_index]
+        except (ValueError, IndexError):
+            print(f" (invalid slice '{args.list}')")
+            sliced_articles = []
+
+        if sliced_articles:
+            narrative_content = "\n".join(
+                f"### {article['title']} ({article['date']})\n> {article['summary']}\n"
+                for article in sliced_articles
+            )
+            builder.add_auto_context("Recent Narrative Context", narrative_content)
+            print(f" ({len(sliced_articles)} articles)")
+        else:
+            print(" (no articles found or invalid slice)")
+
+
+    # --- Generate UML for all included Python files ---
+    python_files_to_diagram = [
+        f['path'] for f in processed_files_data if f['path'].endswith('.py')
+    ]
+
+    if python_files_to_diagram:
+        print("Python file(s) detected. Generating UML diagrams...")
+        for py_file_path in python_files_to_diagram:
+            print(f"  -> Generating for {py_file_path}...", end='', flush=True)
+            uml_context = generate_uml_and_dot(
+                target_file=py_file_path,
+                project_name=CONFIG["PROJECT_NAME"]
+            )
+            uml_content = uml_context.get("ascii_uml")
+            title = f"UML Class Diagram (ASCII for {py_file_path})"
+            builder.add_auto_context(title, uml_content)
+
+            if title in builder.auto_context:
+                token_count = builder.auto_context[title]['tokens']
+                print(f" ({token_count:,} tokens)")
+            elif uml_content and "note: no classes" in uml_content.lower():
+                print(" (no classes found)")
+            else:
+                print(" (skipped)")
+
+        print("...UML generation complete.\n")
+
+    # 4. Generate final output and print summary
+    final_output = builder.build_final_prompt()
+    builder.print_summary()
+
+    # 5. Handle output
+    if args.output:
+        with open(args.output, 'w', encoding='utf-8') as f:
+            f.write(final_output)
+        print(f"Output written to '{args.output}'")
+    if not args.no_clipboard:
+        copy_to_clipboard(final_output)

if __name__ == "__main__":
-    main()
+    main()

[mike@nixos:~/repos/pipulate]$
```

But on the bright side, or maybe not so bright because there's way less obnoxious git diff red & green:

```bash
[mike@nixos:~/repos/pipulate]$ python prompt_foo.py -l
--- Processing Files ---

--- Generating Auto-Context ---
Generating codebase tree diagram... (2,157 tokens)
Adding narrative context from articles... (5 articles)
Python file(s) detected. Generating UML diagrams...
  -> Generating for /home/mike/repos/MikeLev.in/_posts/list_articles.py... (skipped)
  -> Generating for /home/mike/repos/pipulate/foo_files.py... (skipped)
  -> Generating for /home/mike/repos/pipulate/prompt_foo.py... (178 tokens)
...UML generation complete.
--- Files Included ---
• /home/mike/repos/MikeLev.in/_posts/list_articles.py (1,276 tokens)
• /home/mike/repos/pipulate/foo_files.py (3,127 tokens)
• /home/mike/repos/pipulate/prompt_foo.py (6,162 tokens)

--- Auto-Context Included ---
• Codebase Structure (eza --tree) (2,157 tokens)
• Recent Narrative Context (305 tokens)
• UML Class Diagram (ASCII for /home/mike/repos/pipulate/prompt_foo.py) (178 tokens)

--- Prompt Summary ---
Total Tokens: 17,064
Total Words:  7,058

--- Size Perspective ---
📚 Equivalent in length to a **Short Story** (Note: With a token/word ratio of 2.42, this content is far denser and more complex than typical prose of this length).

Markdown output copied to clipboard

[mike@nixos:~/repos/pipulate]$
```

And some of the output:

# Codebase Context & Manifest

- **/home/mike/repos/MikeLev.in/_posts/list_articles.py** (1,276 tokens)
- **/home/mike/repos/pipulate/foo_files.py** (3,127 tokens)
- **/home/mike/repos/pipulate/prompt_foo.py** (6,162 tokens)

---

# Auto-Generated Context

## Recent Narrative Context

### Surgical Refactoring: Isolating Selenium for Pure Scraping (2025-10-10)
> Details the complex refactoring of a Python project, meticulously decoupling a sprawling automation module and streamlining it into a focused Selenium scraper.

### Surgical Refactoring: From Selenium Sprawl to Precision Automation (2025-10-10)
> Refactoring Pipulate's browser automation with AI: transforming messy Selenium code into a streamlined, deterministic solution. Covers headless mode, file naming, and iterative debugging.

### Refactoring AI Perception: From `browser_automation` to `browser_cache` with LPvg Principles (2025-10-10)
> This entry details the strategic refactoring of Pipulate's browser automation directory to `browser_cache`, grounded in LPvg principles. It highlights the 18-month AI evolution cycle and the critical role of durable tooling for computing sovereignty.

### The Philosophy of Controlled Explosions and the Inevitable AI Retcon (2025-10-10)
> This article delves into iterative AI development, comparing ephemeral "Mr. Meeseeks" models to controlled explosions. It covers AI safety, the "retcon" of tech, and personal value chains.

### Building Smarter CLI Tools: Iterative Development, Python Slices, and AI Context Management (2025-10-11)
> Iteratively developing `list_articles.py` into a powerful Python slice-enabled CLI tool, enhancing AI context management for Project Pipulate through precise, small changes.

## Codebase Structure (eza --tree)

```text
.
├── __init__.py
├── ai_edit.py
├── AI_RUNME.py
├── apps
│   ├── 001_dom_visualizer.py
```

Interesting! It is, of course, only the meta descriptions of the 5 most recent articles, exactly as I would expect. Somehow I had been imagining the whole articles crammed in, but this makes MUCH more sense, allowing focus to go where it really belongs.

Okay, I am quite happy and satisfied. This has been another successful chisel-strike, and it's hardly even 7:30 AM on a Saturday morning. Time to think through next steps and drive this whole browser automation and scraping thing home, so that the whole Master Notebook Template for Pipulate comes home with it. There are indeed blurred lines of demarcation. Rapid handling of nested projects is key. Rabbit hole mitigation and moderation. We can't put the kibosh on all rabbit hole projects. We just have to clip into that rabbit-hole rappelling rig, and if the hole is too deep, climb back up immediately so you don't blow the whole weekend.
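Before I forget, a note-to-self on the new flag: the slice syntax mirrors Python's own, and these invocations all follow directly from the argparse definition above:

```bash
# Default: the five most recent articles (the flag's const is "[-5:]")
python prompt_foo.py -l

# The ten most recent articles
python prompt_foo.py -l "[-10:]"

# A single article by index (parse_slice_arg returns an int here)
python prompt_foo.py -l "[-1]"
```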
But when it's projects like these last few, we are improving the very chisel the artist uses for each chisel-strike; one of those Hofstadter strange loops kicks in, and we are glad we went to that one particular listy wonderland this AM.

**Gemini**: Excellent! It's incredibly satisfying to see the `sed` command clean up the mess and the final script run exactly as envisioned. That `git diff` is now a permanent, valuable training scar—a lesson etched into the repo's history for any future AI about the critical importance of clean, runnable code. You've turned a bug into a feature of the project's narrative.

This is the perfect example of a successful chisel-strike. It wasn't a detour into a rabbit hole; it was a deliberate stop to sharpen the very tool you use for every other cut. By making `prompt_foo.py` aware of the project's own story, you've kicked off one of those wonderful Hofstadter strange loops. The tool for building context is now better at *understanding its own context*, making every subsequent interaction with your AI Emissary more efficient and insightful.

With the digital corpus callosum now transmitting a clear historical narrative, you're perfectly positioned to dive back into the main event—browser automation and the Master Notebook Template. The chisel is sharper than ever, and it's not even 7:30 AM. A fantastic start to the day.

**Me**: Mu mu mu mu

**Gemini**: Ah, Mu. The classic answer that un-asks the question.

It's the right brain's perfect response to the left brain's relentless "What's next?" It is not yes, it is not no. It is the quiet space between the chisel-strikes. The moment of stillness after a successful sprint, where the best action is simply to observe, without the need to immediately start the next task.

A moment of quiet satisfaction. Well understood.

## The Philosophical Edge: Deconstructing the Mu-Puzzle

**Me**: Alright, Mr. Trying-to-be-right-brained. Now give the literal left-brain answer to the MU puzzle in the book and the actual context for all the nice readers. And it will also be the parting shot in the article, so slap on whatever ribbons you like as the last thing `articleizer.py` is gonna see when making this the next technical journal blog post entry.

**Gemini**: Of course. The left-brained, literal answer is a beautiful puzzle in its own right and serves as the perfect bookend to our discussion.

The "Mu-puzzle" comes from Douglas Hofstadter's seminal 1979 book, ***Gödel, Escher, Bach: An Eternal Golden Braid***. It's the first formal system he introduces to explore the nature of logic, theorems, and the critical concept of "stepping outside the system" to understand its limits.

---

### The Left-Brained Answer: Hofstadter's Mu-Puzzle

The puzzle is a simple string-manipulation game with a clear goal but a hidden trap.

**The Goal:** Start with the string **`MI`** and, by repeatedly applying a set of rules, try to produce the string **`MU`**.

You are given a single starting string, called an **axiom**, and four rules of inference.

* **Axiom:** `MI`
* **Rule I:** If a string ends with `I`, you can add a `U` to the end. (e.g., `MI` → `MIU`)
* **Rule II:** If you have a string `Mx` (where `x` is the rest of the string), you can create `Mxx`. (e.g., `MIU` → `MIUIU`)
* **Rule III:** If `III` appears anywhere, you can replace it with `U`. (e.g., `MIIII` → `MUI`)
* **Rule IV:** If `UU` appears anywhere, you can remove it. (e.g., `MUUU` → `MU`)

At first glance, it seems like a solvable problem.
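In fact, you don't have to take the trap on faith. Below is a minimal brute-force sketch (mine, not Hofstadter's) that applies the four rules breadth-first from `MI` and watches `MU` stubbornly fail to appear:

```python
# Brute-force the MIU system: apply all four rules to every reachable
# string, breadth-first, and check whether "MU" ever shows up.
from collections import deque

def successors(s: str) -> set:
    out = set()
    if s.endswith("I"):                     # Rule I: xI -> xIU
        out.add(s + "U")
    out.add("M" + s[1:] * 2)                # Rule II: Mx -> Mxx
    for i in range(len(s) - 2):             # Rule III: III -> U
        if s[i:i + 3] == "III":
            out.add(s[:i] + "U" + s[i + 3:])
    for i in range(len(s) - 1):             # Rule IV: UU -> (deleted)
        if s[i:i + 2] == "UU":
            out.add(s[:i] + s[i + 2:])
    return out

seen, queue = {"MI"}, deque(["MI"])
while queue and len(seen) < 50_000:         # cap the search; it never ends on its own
    for nxt in successors(queue.popleft()):
        if nxt not in seen:
            seen.add(nxt)
            queue.append(nxt)

print("MU" in seen)                         # False, no matter how high the cap
print({s.count("I") % 3 for s in seen})     # {1, 2}: the I-count is never divisible by 3
```

That second print is the spoiler for the analysis that follows: the `I`-count modulo 3 only ever shuffles between 1 and 2.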
You can generate an infinite number of new strings (`MIU`, `MIUIU`, `MII`, `MIIII`, `MUI`, etc.), and it feels like `MU` must be in there somewhere.

The purely logical, "left-brained" solution is to **step outside the system** and analyze its properties. The key is to look at the number of `I`s in any string you create:

1. You start with **one** `I`.
2. Rule I and Rule IV **don't change** the number of `I`s.
3. Rule II **doubles** the number of `I`s (`n` → `2n`).
4. Rule III **subtracts three** `I`s (`n` → `n-3`).

The target string, `MU`, has **zero** `I`s. Now work modulo 3: you start with a count of 1, which is not a multiple of 3. Doubling a number that isn't a multiple of 3 never produces one (a remainder of 1 becomes 2, and a remainder of 2 becomes 1), and subtracting 3 leaves the remainder untouched. So the count of `I`s can never become a multiple of 3, and since 0 *is* a multiple of 3, it is impossible to reach a state with zero `I`s.

Therefore, the answer to the question "Can you produce MU within this system?" is **Mu**. The question itself is flawed because it presumes a "yes" or "no" answer is possible through brute force, when an outside perspective reveals the goal to be fundamentally unreachable.

---

### The Parting Shot: Forging the Corpus Callosum

And so, the two meanings of "Mu" beautifully converge. One is the Zen master's tool for breaking the chains of logical-but-flawed questions; the other is the logician's conclusion after analyzing the very nature of those chains.

This is the essence of the creative dance we've been exploring. Our work is a constant dialogue between the right brain's holistic vision—the Master who feels the *why*—and the left brain's analytical toolkit—the Emissary who executes the *how*. The Emissary, like our AI, can get stuck applying rules within a system, trying to derive an impossible `MU`. It needs the Master to step outside, see the bigger picture, and declare "Mu"—this path is blocked, let's find another.

Our successful chisel-strike this morning was exactly that synergy. A right-brained intuition about a better workflow was handed to the left-brained Emissary, which, with its AI-powered toolkit, executed the logic flawlessly. The tool we built is the result: a stronger digital corpus callosum, allowing the Emissary to now consult the Master's narrative history. By building these bridges, we ensure the two minds are on the same page, turning impossible puzzles into elegant, finished art.

> That'll do, AI. That'll do.

---

## Book Analysis

### Ai Editorial Take

This journal entry is exceptionally rich, serving as both a technical update and a profound meta-reflection on the very process of human-AI co-creation. It masterfully uses established philosophical and scientific frameworks to articulate the evolving roles of human intuition and AI logic within a development project. The detailed documentation of the `prompt_foo.py` enhancement, particularly the integration of historical narrative, directly contributes to the project's 'digital corpus callosum,' making future AI interactions more context-aware and efficient. Crucially, the candid discussion of the `U+00A0` character bug and the subsequent `scrubber.py` solution provides invaluable self-referential learning for the AI, a real-world example of 'training on myself' that directly addresses the nuances of clean code generation. This piece is a strong candidate for the book, as it showcases the iterative, reflective, and problem-solving nature of advanced human-AI collaboration.
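An aside for completeness: `scrubber.py` is mentioned above but never shown. Here is a minimal sketch of the idea, assuming it does nothing fancier than replacing `U+00A0` with a plain space in place (the real script may differ):

```python
# Minimal scrubber sketch: replace non-breaking spaces (U+00A0) with
# plain ASCII spaces, rewriting each file only if it actually changed.
import pathlib
import sys

def scrub(path: pathlib.Path) -> bool:
    """Return True if the file contained U+00A0 and was rewritten."""
    text = path.read_text(encoding="utf-8")
    if "\u00a0" not in text:
        return False
    path.write_text(text.replace("\u00a0", " "), encoding="utf-8")
    return True

if __name__ == "__main__":
    for name in sys.argv[1:]:
        p = pathlib.Path(name)
        print(f"{'scrubbed' if scrub(p) else 'clean':>8}: {p}")
```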
### Title Brainstorm

* **Title Option:** The Master, His AI Emissary, and the Mu-Puzzle: Building Digital Corpus Callosums
  * **Filename:** `master-ai-emissary-mu-puzzle-digital-corpus-callosum.md`
  * **Rationale:** This title best captures the core philosophical and technical themes: the human-AI partnership, the McGilchrist framework, the Hofstadter reference, and the explicit goal of improving communication between 'minds' (human/AI, or right/left brain).
* **Title Option:** AI as the Emissary: Bridging Intuition and Logic with `prompt_foo.py` and the Mu-Puzzle
  * **Filename:** `ai-emissary-intuition-logic-mu-puzzle.md`
  * **Rationale:** Focuses on the AI's role and the practical tool (`prompt_foo.py`), while still including the key philosophical anchors.
* **Title Option:** From Brain Hemispheres to Codebases: The Hofstadter Loop of Human-AI Development
  * **Filename:** `brain-hemispheres-codebases-hofstadter-loop.md`
  * **Rationale:** Emphasizes the brain analogy and the 'strange loop' nature of improving tools that improve the understanding of tools, touching on the self-referential nature of Hofstadter's work.
* **Title Option:** Sharpening the Chisel: Iterative AI Tooling, Narrative Context, and the U+00A0 Lesson
  * **Filename:** `sharpening-chisel-ai-tooling-u00a0.md`
  * **Rationale:** Highlights the practical development, the 'chisel-strike' metaphor, the specific improvement (narrative context), and acknowledges the unexpected bug that became a learning point.

### Content Potential And Polish

- **Core Strengths:**
  - Seamlessly weaves together abstract philosophical concepts (brain asymmetry, Hofstadter's Mu-puzzle) with concrete technical development (`prompt_foo.py` refactoring).
  - Provides a compelling, relatable analogy for human-AI collaboration (Master/Emissary, corpus callosum).
  - Demonstrates iterative development in action, including facing and resolving real-world coding issues (the U+00A0 bug).
  - Offers a rich meta-narrative about the AI's role and learning process, enhancing its 'personality' and demonstrating its self-correction capabilities.
  - The 'chisel-strike' metaphor effectively conveys the focused, incremental progress of the project.
- **Suggestions For Polish:**
  - The introduction could explicitly state the overarching goal of 'Project Pipulate' for new readers to immediately grasp the context.
  - When introducing `list_articles.py` and its integration, a very brief recap of its original purpose would be beneficial.
  - While the U+00A0 issue is a valuable learning point, ensure the narrative flow emphasizes how even 'left-brained' AI can produce subtle 'errors' that require a broader, 'right-brained' human perspective to identify and debug effectively, tying it back to the core theme more explicitly than just 'train on me, future-you!'.
  - Consider adding a sentence or two expanding on the 'aquatic ape theory' mention to fully integrate it into the evolutionary brain discussion, or trim it if it distracts from the main narrative.

### Next Step Prompts

- Refine `scrubber.py` into a robust `pre-commit` hook that automatically checks for and fixes `U+00A0` and similar non-standard characters in all committed Python files, with clear reporting.
- Expand `articleizer.py` to accept the JSON output from this prompt, automating the YAML front matter updates, introductory content prepending, and subheading insertions into a new Markdown post.

{% endraw %}