---
title: 'Shrinking server.py: A Refactoring Saga'
permalink: /futureproof/python-refactoring-server-logging/
description: I'm really happy with how this entry turned out. It started as a simple desire to clean up my `server.py` file and blossomed into a deeper exploration on my coding philosophy. I love the tangent about FastHTML, HTMX, and the move away from the JavaScript-heavy ecosystem. It captures the 'why' behind the 'what.' The core of the piece, the live-blogging of my refactoring process with Gemini, feels authentic. Documenting the errors and fixes (like the `slog.slog` typo) is crucial because it shows that development isn't a clean, linear path; it's an iterative cycle of trying, failing, and fixing. This process of externalizing the logging functions was a huge win for the codebase, and I think this article does a great job of explaining the practical steps and the high-level reasoning behind them.
meta_description: Follow a live-refactoring session in Python, moving logging functions out of a monolithic server.py to reduce its surface area and improve clarity.
meta_keywords: python refactoring, server.py, clean code, software architecture, fasthtml, code externalization, logging module, vim, neovim, sed
layout: post
sort_order: 2
---

## Setting the Stage: Context for the Curious Book Reader

This journal entry captures a real-time coding session focused on a core principle of sustainable software development: reducing the surface area of monolithic files. I walk through my thought process, starting with a philosophical dive into why simpler web frameworks like FastHTML are preferable to their more complex, JSON-heavy counterparts like FastAPI. The main event is the step-by-step refactoring of Pipulate's `server.py`. I methodically externalize logging and display functions into a dedicated `imports/server_logging.py` module, using namespacing and careful, tool-assisted edits in Vim to ensure a clean transition.
It's a practical look at how iterative cleanup, guided by a clear philosophy, leads to more maintainable and understandable code.

---

## Technical Journal Entry Begins

I've been wanting to shrink the size of the `server.py` file in Pipulate. For those just joining in, `server.py` is just about the most popular name for the file that holds a Flask-like `app`, which, for those new to Python, is like a physical connection between a highly optimized, high-performance webserver (in this case Uvicorn) and the Python code. People see `app` all the time in Flask-like code and don't really "get" that it's an instance of an outside piece of software that you're interacting with, like controlling a robot or something. It's a *robot server!* Python has its own built-in webserver, but Uvicorn is faster.

## The Philosophy: Why Less is More in Web Frameworks

And when I say Flask-like, that's just a tradition of an application programming interface (API) coding style for writing websites. It started as an April Fools joke from the people who wrote the component parts Flask was built on: `werkzeug`, for routing web requests into Python functions using this nifty feature of Python called `@decorators`, and the `jinja2` templates, which are sort of like Python's answer to PHP or Liquid Templates. Jinja2 is an HTML markup template language that lets you blend Python and HTML with a nice clear separation of concerns.

And everyone copied Flask and we got a ton of Flask-like alternatives like FastAPI and FastHTML. FastAPI is made for the JavaScript web full stack world where everything is JSON and vomit. JSON is JavaScript Object Notation, but FastHTML is made for the post-JavaScript world where HTMX eliminates like 80% of what you need to do with JavaScript with 20% of the effort. That makes JavaScript almost unnecessary as the mandatory 2nd language you always end up using even though you're trying to do your webdev in Python. In fact there's a 3rd language too: those Jinja2 templates, remember?
So FastHTML, while Flask-like, eliminates 2 of the 3 languages you need to know for web development and gets it down to just 1 again: Python! Okay, okay, if you really want to count having to know the HTMX attributes it's back up to 2 languages, but you're replacing 2 really big, ugly languages with 1 that's just a short list of HTML attribute extensions.

I don't want to go on too much of a tangent here, but it's about reducing surface area. With FastHTML all you need is Python and a few JavaScript include files for the HTMX trick. With FastAPI, you typically end up needing NodeJS for the JavaScript side of the full stack, because all the internal communication is JSON. That's the "API" in FastAPI: they're talking about JSON. And so you need like 10x the surface area to send stuff as JSON data between components: stringifying, unstringifying, parsing, validating, yadda yadda.

The alternative is *over the wire* HTML. Almost all that stuff you're zapping around as JSON data can just be actual *fragments of HTML* which target elements of the HTML document, so that when a fragment "hits" its target (shot over the wire) it simply replaces it. ZAP! All that JSON overhead is eliminated, and your entire tech stack is reduced maybe even a hundred-fold merely by getting NodeJS and all that `npm` stuff out of the picture.

## The Virtue of a Small Surface Area

So in case you haven't gotten the picture, I like to trim things down to the smallest possible surface area of code that is still entirely understandable and explicit. Nothing is left to the imagination, and there are no ridiculous breadcrumb trails where you reassemble what's actually going on by tracing included dependency after included dependency after included dependency, just to see what the developer could have put in the very place it's used. The concept is over-abstraction, and developers do it under the illusion that it's more elegant or will help with code reuse or something like that.
The truth is it just makes everything way more difficult to figure out. While abstraction is unavoidable and separation of concerns is still generally a good idea, it has been carried to an excessive, obfuscating, and infuriating degree. It's not great for the AIs either: it's hard to fit all that abstracted stuff in the context window, so they end up guessing a lot. The AI coding assistance you get is then grounded less in what the code actually does and more in statistical probability, because the entire herd flocks that way.

My way is different. It starts with Python because Python is more love-worthy than JavaScript. JavaScript was made because something had to be crammed into browsers for automation, and Java was popular at the time and would add some legitimacy to the name, so why not? Python, on the other hand, was made as a fun hobby project over a Christmas break to blend together the strengths of the ABC language, made specifically to be easy for new users to pick up right away and learn programming, and the C programming language, which ran the rest of the world. It was, I believe, also specifically made to appeal to system administrators who, while their role might sound like they're programmers or developers, are anything but. Unix shell scripting, which they already had, was too hard for them, so they turned to something called Perl: a step up, but not much. Python really appealed to these types, who needed it all brought down to their level for making scripts to automate setting up new employees at companies and such. Today, even though there might be other languages underneath, Python is actually the high-level *"control-language"* for pretty much everything. I lean into this.

## Python as the Ultimate Control Language

Because there was a great divide between Python and the Web for like maybe 2 decades (1st you learn Python, then you learn these 3 other languages so you can use it on the web), I pretty much steered clear of web development. Oh, I kept using Python as much as I could.
I just wasn't in the world of webdev because it was absolutely heinous. To give credit where it's due, it was because Google was asserting its control over the entire browser and the world by optimizing it so web apps approached the performance of compiled C programs (very fast) and had all the telemetry built in to call back to mama and track you. So it was a *maturation* process for browsers and the Web that *it had to go through*, but it's done. Browsers are very standard now. The DOM is the DOM is the DOM is the DOM (document object model) and browsers are compatible and MSIE6 is dead... Jim.

## The Post-JavaScript DOM and Over-the-Wire HTML

You don't need the bloated overhead of *virtual DOMs* that ReactJS used back in the day to make sophisticated webapps compatible across browsers. Because browsers are compatible you can drop the virtual DOM and just use the native DOM directly in a reliable and much more high-performance way. This is that *over the wire HTML* concept. You don't need anything other than HTML (JSON, for example) now that HTML is reliable, compatible, high-performance and really the format the data needs to end up in to be visible to the user. It's still *real data.*

Full stack purists will flip over this point. To them, JSON is *real data* and HTML somehow is not. Uhhh... have you heard of the semantic web? RDFa? Microdata? You don't really need to embed that JSON-LD block to have easily parsed structured data. Don't get me wrong; it's still a good idea for SEO, but blending it right in with the pure HTML tags by using good attributes and labeling is *real data* too. Viva the `<table><tr><td>cell 1</td><td>cell 2</td><td>cell 3</td></tr></table>`!

Okay, I feel better. Now back to the clean-up job in `server.py`, because consolidating configuration stuff into `config.py` is a good way to shrink `server.py` without any refactoring. I just sort of have to plow through it, rolling up and consolidating anything config-like that I find. Instead of asking AI to do this, we think methodically. I load 2 files which I conveniently keep on the Pipulate repo root:

    server.py
    config.py

...into vim:

    vim server.py config.py

I have 7 virtual desktop workspaces. I do this on Screen 2, which is where for over a year now I've kept Cursor, so it's in my muscle memory that Screen 2 is where I edit code.

## The Refactoring Begins: Namespacing Logging Functions

Hmmm. Tons of big and little things found. I'm not going to document every little edit, but it's clear this is a great path to clean-up: just going top-down in my code and *"telling the story"* again.

The trick I did with the ASCII artwork import was a really good idea. That code totally obviously belongs in an external library, and I cut down the import-this-and-import-that verbosity by importing everything under a convenient abbreviated namespace. I'm going to do that now with logging functions! I already externalized `imports/server_logging.py` but there's hardly anything in there compared to how much is still in `server.py`.

First I moved one function experimentally from `server.py` to `server_logging.py` and it went well. Now I add the namespace dot prefix to everything imported from there.

Okay, wow. This is very interesting. I can feel a sort of mental hurdle here, like the calories are starting to burn when I address this topic. I'm layering namespacing where there wasn't any before, but "console" is a very common word and there are case-sensitivity issues. This is a job for a very targeted *Prompt Fu* process where I just include this very article as the prompt because there's really no reason to pare it down.
In fact I want snippets here of the larger files Gemini is going to encounter as I talk through the issue.

Now Gemini, I don't really need you to give me the code for the logging-function externalization project I'm discussing here. Rather, I need help with the regular expression search-and-replace pattern to carefully replace all instances of `console` which are actually references to the object that was imported from `server_logging` with `slog.console`, because I'm going to be namespacing everything imported from `server_logging` like this:

BEFORE

```python
from imports.server_logging import console, rich_json_display, setup_logging, safe_print
```

AFTER

```python
import imports.server_logging as slog
```

I work in NeoVim so I have all the vim search and replace capabilities in there. I'm also no slouch with `sed` if that's a better approach, and if I have to do it over multiple files I'm cool with `rg | xargs sed`. And of course I'm going to have to do it with the other imported items too, but those are quite unique names, so I think I want to start with `console`. Give me some guidance here, please and thank you!

## A Play-by-Play of the Refactor

**Gemini**: Of course\! I get what you're doing, and I love the philosophy behind it. Trimming down `server.py` by moving related functions into logical modules is a great way to reduce surface area without creating confusing layers of abstraction. Your tangent on FastHTML vs. FastAPI was spot on: it's all about simplifying the stack and keeping the cognitive load low.

You've correctly identified that refactoring the logging imports is a great next step, and that `console` is the trickiest part because it's such a common name. A blind search-and-replace would be asking for trouble.

Since you're a Vim user, let's stick with that for the most surgical and safe approach. The key is to use word boundaries and the "confirm" flag.
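
To make the word-boundary idea concrete, here is a minimal Python sketch of the same substitution using the standard `re` module. The sample lines, the `namespace_console` helper, and the stricter lookbehind pattern are all assumptions added for illustration; none of them come from Pipulate's code. The sketch also shows one plausible way a plain `\bconsole\.` pattern can misfire: it matches references that are already namespaced, which would produce a `slog.slog` slip. Neither pattern understands strings or comments, which is why an interactive confirm step remains useful.

```python
import re

# Hypothetical stand-ins for lines in server.py (not the real file).
SAMPLE_LINES = [
    'console.print("Server starting")',     # bare reference: should change
    'my_console.print("other object")',     # different name: leave alone
    'msg = "open the console to debug"',    # no attribute access: leave alone
    'slog.console.print("already done")',   # already namespaced: leave alone
]

# Equivalent of the sed pattern: word boundary, the name, then a literal dot.
NAIVE = re.compile(r"\bconsole\.")

# Stricter variant (an assumption, not from the session): the lookbehind
# rejects matches preceded by a word character or a dot, so an existing
# "slog.console." is not prefixed a second time.
STRICT = re.compile(r"(?<![\w.])console\.")


def namespace_console(line: str, alias: str = "slog") -> str:
    """Prefix bare `console.` attribute accesses with the module alias."""
    return STRICT.sub(f"{alias}.console.", line)


if __name__ == "__main__":
    for line in SAMPLE_LINES:
        print(f"{line!r:45} -> {namespace_console(line)!r}")
    # The naive pattern double-prefixes an already namespaced reference.
    print(NAIVE.sub("slog.console.", SAMPLE_LINES[3]))
```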

-----

## Refactoring with Surgical Precision in Vim/Neovim

Here's a step-by-step guide to do this safely right inside Neovim.

### Step 1: Change the Import

First, make the change to the import statement in `server.py`.

**FROM:**

```python
from imports.server_logging import console, rich_json_display, setup_logging, safe_print
```

**TO:**

```python
import imports.server_logging as slog
```

-----

### Step 2: Safely Rename `console` References

Now for the tricky part. We need to replace `console` but *only* when it's the object you imported. The best way to do this is to look for `console` followed by a dot, which indicates it's being used as an object.

In Vim, run this command:

```vim
:%s/\<console\./slog.console./gc
```

The other imported names are distinctive enough to match as whole words.

For `rich_json_display`:

```vim
:%s/\<rich_json_display\>/slog.rich_json_display/gc
```

For `setup_logging`:

```vim
:%s/\<setup_logging\>/slog.setup_logging/gc
```

For `safe_print`:

```vim
:%s/\<safe_print\>/slog.safe_print/gc
```

*(Using `\<` and `\>` ensures you're matching the whole word, which is just good practice.)*

-----

## The `rg` and `sed` Approach (For the Brave)

If you prefer a command-line approach for one-shot changes, you can combine `ripgrep` and `sed`. This is faster but less interactive, so a `git diff` review afterward is a must.

First, make the import change manually. Then, from your repo root:

```bash
# For console
sed -i 's/\bconsole\./slog.console./g' server.py

# For the rest
sed -i 's/\brich_json_display\b/slog.rich_json_display/g' server.py
sed -i 's/\bsetup_logging\b/slog.setup_logging/g' server.py
sed -i 's/\bsafe_print\b/slog.safe_print/g' server.py
```

*Note: `\b` is the word-boundary anchor in GNU `sed`, which plays the same role as Vim's `\<` and `\>`. The `-i` flag edits the file in place.*

After running these, do a thorough `git diff server.py` to make sure nothing unexpected was changed.

I'd personally stick with the Vim `:s/.../c` method for `console` because it blends automation with manual oversight perfectly. Good luck with the refactoring. It's a satisfying feeling to shrink down a core file like that\!
👍

---

> Gemini's advice was spot-on and I used the vim search & replace method to walk
> through each change. Doing it in vim gives the ability to review each
> replacement.

**Me**: Okay I did this and fixed a couple of edge cases that slipped through. I have now namespaced the already externalized logging functions. You can see where I'm going with this code. Give me a few more candidate functions for externalization this way please and thank you!

**Gemini**: Of course\! Glad that worked out. The interactive-confirm (`:s/.../c`) is a lifesaver for this kind of surgical refactoring.

Now that you've established the `slog` namespace pattern, you're in a great position to move more functions over. The goal is to move anything whose primary job is **displaying information** or **formatting data for logs**, rather than handling core application logic.

Based on that theme, here are the next logical candidates in `server.py` to move into `imports/server_logging.py`.

-----

## Next Candidates for Externalization

These functions are excellent choices because they are all concerned with the *presentation* of data, not its generation or manipulation.

### 1\. The Obvious Choice: `print_and_log_table`

This is the perfect next candidate. Its entire purpose is to display a Rich `Table` to the console and then format it for the log file. It fits the theme of `server_logging.py` perfectly.

### 2\. The Formatting Helpers

There's a group of functions that exist solely to prepare data structures for logging. They are tightly coupled and should be moved together:

* `_recursively_parse_json_strings`: A utility used by the next function.
* `_format_records_for_lifecycle_log`: Its whole job is to format database records into a nice string for logging, and it already uses `slog.rich_json_display`.

### 3\. The State Loggers

These three functions are wrappers that use the formatting helpers above to log the state of different data sources.
They are prime candidates for moving:

* `log_dynamic_table_state`
* `log_dictlike_db_to_lifecycle`
* `log_raw_sql_table_to_lifecycle`

### 4\. The Advanced Candidate: `log_pipeline_summary`

This one is also a fantastic candidate, but it requires a small, clean refactoring. Currently, it directly accesses the global `pipeline` object. To make it a pure, portable logging function, we can modify it to accept the `pipeline` object as an argument.

**In `server.py`, the function signature is:**

```python
def log_pipeline_summary(title_prefix: str = ''):
    try:
        records = list(pipeline())  # <-- Global access
        # ...
```

**When you move it to `server_logging.py`, change the signature to:**

```python
def log_pipeline_summary(pipeline, title_prefix: str = ''):
    try:
        records = list(pipeline())  # <-- Now uses the passed-in object
        # ...
```

Then, back in `server.py`, you'll simply call it like this: `slog.log_pipeline_summary(pipeline, title_prefix='...')`. This is a great improvement as it decouples the function from the global state (a pattern called **dependency injection**).

-----

## The Plan

1. **Cut** the following functions from `server.py` and **paste** them into `imports/server_logging.py`:

    * `print_and_log_table`
    * `_recursively_parse_json_strings`
    * `_format_records_for_lifecycle_log`
    * `log_dynamic_table_state`
    * `log_dictlike_db_to_lifecycle`
    * `log_raw_sql_table_to_lifecycle`
    * `log_pipeline_summary`

2. **Add necessary imports** to the top of `imports/server_logging.py`:

    ```python
    import sqlite3
    from rich.panel import Panel
    from rich.table import Table
    # You'll already have json, logger, Console, etc.
    ```

3. **Update `log_pipeline_summary`** in `imports/server_logging.py` to accept `pipeline` as an argument, as described above.

4. **Update the calls in `server.py`**:

    * Change `print_and_log_table(...)` to `slog.print_and_log_table(...)`.
    * Change `log_dynamic_table_state(...)` to `slog.log_dynamic_table_state(...)`.
    * ...and so on for all the moved functions.
    * Crucially, change `log_pipeline_summary(...)` to `slog.log_pipeline_summary(pipeline, ...)`.

This will make a significant dent in `server.py`'s line count and will consolidate almost all your presentation and logging logic into a single, clean module. It's a fantastic step toward the minimalist, high-clarity architecture you're aiming for. 👍

---

> Excellent, that did it.

**Me**: Alright, all that's done. Review what we did please Gemini so that it flows well in this article. Not just the above stuff in this prompt, but this whole conversation please and thank you!

**Gemini**: Of course\! It's been a productive session. Let's recap the journey of shrinking `server.py` into a clear and concise narrative.

-----

## The Refactoring Story: A Leaner `server.py`

The core mission was to reduce the surface area of the main `server.py` file, staying true to a philosophy of clarity and minimal abstraction. We wanted to move functionality that wasn't strictly related to server operations into more logical, focused modules. The `imports/server_logging.py` file was the perfect new home for anything related to console display and log formatting.

### Step 1: Namespacing the Imports

Our first move was to change how logging utilities were brought into `server.py`. Instead of importing individual functions and objects, we opted for a cleaner, namespaced approach.

**Before:** We had a long, specific import line.

```python
from imports.server_logging import console, rich_json_display, setup_logging, safe_print
```

**After:** We switched to a single, elegant import with a short alias.

```python
import imports.server_logging as slog
```

This immediately set the stage for the refactor but required us to update all the places where `console`, `safe_print`, etc., were used. Using Vim's interactive search-and-replace (`:%s/\