---
title: 'Forging the Chameleon: From FastHTML to Python Packaging History'
permalink: /futureproof/forging-the-chameleon-python-packaging-history/
description: "This entry started as a simple exploration of FastHTML's design philosophy but quickly spiraled into something much bigger. As I worked on refactoring my `Pipulate` project to exist in two different worlds—a web server and a Jupyter Notebook—I uncovered an elegant pattern I'm calling the 'Chameleon Constructor'. This practical challenge led me down a rabbit hole into the entire history of Python packaging, from `distutils` to `pyproject.toml`. The resulting AI-generated research paper became a deep lesson in why our tools are the way they are. It's a perfect example of how solving a present-day problem requires understanding the technical debt and design decisions of the past."
meta_description: A technical journal entry exploring FastHTML's opinionated design, leading to a deep dive into Python's packaging history and the craft of future-proofing software.
meta_keywords: Python packaging, distutils, setuptools, wheels, pyproject.toml, FastHTML, Chameleon Constructor, literate programming, nbdev, software history, future-proofing, RegEx, Jekyll
layout: post
sort_order: 1
---

## Setting the Stage: Context for the Curious Book Reader

### A Serendipitous Spiral Into Python Packaging History

This journal entry captures the serendipitous, spiraling nature of deep technical work. It begins with a focused analysis of a modern, opinionated Python web framework, FastHTML, and its tightly integrated database. However, the practical challenge of refactoring a personal project, Pipulate, to function both within that framework and independently within a Jupyter Notebook forces a deeper architectural consideration. This leads to the formulation of a design pattern I've dubbed the 'Chameleon Constructor.' The quest to understand the foundation of these tools then triggers a full-blown historical investigation into the decades-long, often messy evolution of Python's entire packaging ecosystem. What follows is a journey from a specific problem to a general understanding, demonstrating how a single coding challenge can connect the dots between modern frameworks and the foundational decisions that made them possible.

---

## Technical Journal Entry Begins

### The Opinionated World of FastHTML

One of the weird, wonderful things about FastHTML (vs. FastAPI, Flask and such) is that FastHTML provides a built-in database as part of a convenience wrapper called `fast_app`. This is a *factory function*, and it's how you generate the *webserver object* — which everyone who uses Flask-inspired web microframeworks will recognize as the `app` object. The `app` object is the persistent connection to the Uvicorn webserver, a separate product that gets installed when you `pip install python-fasthtml` and is sort of like the equivalent of NodeJS. Uvicorn is what's called an `ASGI` webserver, ASGI standing for Asynchronous Server Gateway Interface. The `app` object COULD be created without any database services running alongside it by using FastHTML *without* going through `fast_app`. Gemini, help me write this example.
Double-check it based on the fact that the canonical example at [GitHub](https://github.com/AnswerDotAI/fasthtml) is:

```python
from fasthtml.common import *

app,rt = fast_app()

@rt('/')
def get(): return Div(P('Hello World!'), hx_get="/change")

serve()
```

...knowing that `from fasthtml.common import *` does this:

```python
import uvicorn
from dataclasses import dataclass
from typing import Any

from .starlette import *
from fastcore.utils import *
from fastcore.xml import *
from apswutils import Database
from fastlite import *
from .basics import *
from .pico import *
from .authmw import *
from .live_reload import *
from .toaster import *
from .js import *
from .fastapp import *
```

...and that `fastapp.py` contains this:

```python
"""The `fast_app` convenience wrapper"""

import inspect,uvicorn
from fastcore.utils import *
from fastlite import *
from .basics import *
from .pico import *
from .starlette import *
from .live_reload import FastHTMLWithLiveReload

__all__ = ['fast_app']

def _get_tbl(dt, nm, schema):
    render = schema.pop('render', None)
    tbl = dt[nm]
    if tbl not in dt: tbl.create(**schema)
    else: tbl.create(**schema, transform=True)
    dc = tbl.dataclass()
    if render: dc.__ft__ = render
    return tbl,dc

def _app_factory(*args, **kwargs) -> FastHTML | FastHTMLWithLiveReload:
    "Creates a FastHTML or FastHTMLWithLiveReload app instance"
    if kwargs.pop('live', False): return FastHTMLWithLiveReload(*args, **kwargs)
    kwargs.pop('reload_attempts', None)
    kwargs.pop('reload_interval', None)
    return FastHTML(*args, **kwargs)

def fast_app(
        db_file:Optional[str]=None, # Database file name, if needed
        render:Optional[callable]=None, # Function used to render default database class
        hdrs:Optional[tuple]=None, # Additional FT elements to add to <head>
        ftrs:Optional[tuple]=None, # Additional FT elements to add to end of <body>
        tbls:Optional[dict]=None, # Experimental mapping from DB table names to dict table definitions
        before:Optional[tuple]|Beforeware=None, # Functions to call prior to calling handler
        middleware:Optional[tuple]=None, # Standard Starlette middleware
        live:bool=False, # Enable live reloading
        debug:bool=False, # Passed to Starlette, indicating if debug tracebacks should be returned on errors
        title:str="FastHTML page", # Default page title
        routes:Optional[tuple]=None, # Passed to Starlette
        exception_handlers:Optional[dict]=None, # Passed to Starlette
        on_startup:Optional[callable]=None, # Passed to Starlette
        on_shutdown:Optional[callable]=None, # Passed to Starlette
        lifespan:Optional[callable]=None, # Passed to Starlette
        default_hdrs=True, # Include default FastHTML headers such as HTMX script?
        pico:Optional[bool]=None, # Include PicoCSS header?
        surreal:Optional[bool]=True, # Include surreal.js/scope headers?
        htmx:Optional[bool]=True, # Include HTMX header?
        exts:Optional[list|str]=None, # HTMX extension names to include
        canonical:bool=True, # Automatically include canonical link?
        secret_key:Optional[str]=None, # Signing key for sessions
        key_fname:str='.sesskey', # Session cookie signing key file name
        session_cookie:str='session_', # Session cookie name
        max_age:int=365*24*3600, # Session cookie expiry time
        sess_path:str='/', # Session cookie path
        same_site:str='lax', # Session cookie same site policy
        sess_https_only:bool=False, # Session cookie HTTPS only?
        sess_domain:Optional[str]=None, # Session cookie domain
        htmlkw:Optional[dict]=None, # Attrs to add to the HTML tag
        bodykw:Optional[dict]=None, # Attrs to add to the Body tag
        reload_attempts:Optional[int]=1, # Number of reload attempts when live reloading
        reload_interval:Optional[int]=1000, # Time between reload attempts in ms
        static_path:str=".", # Where the static file route points to, defaults to root dir
        body_wrap:callable=noop_body, # FT wrapper for body contents
        nb_hdrs:bool=False, # If in notebook include headers inject headers in notebook DOM?
        **kwargs)->Any:
    "Create a FastHTML or FastHTMLWithLiveReload app."
    h = (picolink,) if pico or (pico is None and default_hdrs) else ()
    if hdrs: h += tuple(hdrs)
    app = _app_factory(hdrs=h, ftrs=ftrs, before=before, middleware=middleware, live=live, debug=debug, title=title, routes=routes, exception_handlers=exception_handlers,
                       on_startup=on_startup, on_shutdown=on_shutdown, lifespan=lifespan, default_hdrs=default_hdrs, secret_key=secret_key, canonical=canonical,
                       session_cookie=session_cookie, max_age=max_age, sess_path=sess_path, same_site=same_site, sess_https_only=sess_https_only, sess_domain=sess_domain,
                       key_fname=key_fname, exts=exts, surreal=surreal, htmx=htmx, htmlkw=htmlkw, reload_attempts=reload_attempts, reload_interval=reload_interval,
                       body_wrap=body_wrap, nb_hdrs=nb_hdrs, **(bodykw or {}))
    app.static_route_exts(static_path=static_path)
    if not db_file: return app,app.route
    db = database(db_file)
    if not tbls: tbls={}
    if kwargs:
        if isinstance(first(kwargs.values()), dict): tbls = kwargs
        else:
            kwargs['render'] = render
            tbls['items'] = kwargs
    dbtbls = [_get_tbl(db.t, k, v) for k,v in tbls.items()]
    if len(dbtbls)==1: dbtbls=dbtbls[0]
    return app,app.route,*dbtbls
```

...which, as you can see from the way it returns an `app` object, an `rt` object (a Werkzeug-style `app.route` method, abbreviated, and coming from Starlette), and whatever tables get splatted in from the `fastlite` layer, means we could create the more "pure" `app` object alone, following the classic Flask pattern without going through `fast_app`, for educational purposes that reveal how `fastlite` is layered in.

**Gemini**: Of course. You've done an excellent job breaking down the purpose of the `fast_app` factory function. Your analysis is spot on: `fast_app` is a convenience wrapper that instantiates the core `FastHTML` web application object and optionally layers in database services via `fastlite`.

By bypassing the factory, you can instantiate the `FastHTML` class directly, which mirrors the classic pattern of `app = Flask(__name__)` or `app = FastAPI()`.

Here is the example of how to create the `app` object without the `fast_app` factory, with explanations based on the source code you provided.

### "Pure" FastHTML App Instantiation

This example bypasses the `fast_app` factory to create a web application without the integrated database setup.

```python
# We import the specific names we need instead of using a wildcard.
# FastHTML is the core application class, similar to Flask or FastAPI.
# (Note: `hx_get` is a keyword argument on components, not an importable
# name, so it does not appear in this import.)
from fasthtml.common import FastHTML, serve, Div, P

# 1. Instantiate the app object directly from the FastHTML class.
#    This is the "pure" webserver object without the database factory logic.
app = FastHTML()

# 2. Use the `app.route` decorator directly.
#    The `rt` object returned by `fast_app()` is just an alias for `app.route`,
#    as shown in the factory's return statement: `return app, app.route`.
@app.route('/')
def get():
    # The route handler's logic remains the same.
    return Div(P('Hello from a "pure" FastHTML app!'), hx_get="/change")

@app.route('/change')
def change():
    return P('The content has changed!')

# 3. Use the `serve()` function to run the Uvicorn server.
#    This convenience function is still useful for launching the app.
if __name__ == '__main__':
    serve()
```

### How This Works (and Why It's Educational)

Your investigation of the `fastapp.py` source code perfectly reveals why this works:

1. **The Core Object is `FastHTML`**: The `fast_app` function ultimately calls the internal `_app_factory`, which does one simple thing: it returns an instance of `FastHTML` (or `FastHTMLWithLiveReload` if `live=True`).
Our example simply calls `FastHTML()` directly, achieving the same result for the base app object.

2. **`rt` is just `app.route`**: The body of `fast_app` ends with this conditional logic:

```python
if not db_file: return app,app.route
...
return app,app.route,*dbtbls
```

This shows that the second item returned (`rt` in the canonical example `app,rt = fast_app()`) is always `app.route`. By instantiating `app` ourselves, we can just use the standard `@app.route('/')` decorator, which is more familiar to users of other microframeworks.

3. **Database is an Add-on**: The entire database setup, including the use of `fastlite` and the processing of `tbls`, only happens inside `fast_app` *if a `db_file` is provided*. By bypassing the factory, we skip that entire block of code, demonstrating that the database is a tightly integrated but fundamentally optional component.

This "pure" example is excellent for teaching because it clearly separates the concerns and reveals the layered architecture of FastHTML:

* **Starlette**: The underlying ASGI toolkit providing the routing and web server foundation.
* **`FastHTML` class**: The core application object that manages routes, middleware, and rendering FastHTML components.
* **`fast_app()` factory**: A high-level convenience function that bundles the `FastHTML` app with `fastlite` database integration and sensible defaults for a rapid development experience.

## The End of Templating: Python-Native UI

**Me**: Right, you got it. And so forcing everything through `fast_app` is a radical decision in the traditional Python microframework sense, because it's a highly *opinionated* helper layer, layering in all sorts of opinions and really hitting home how FastHTML is not FastAPI. Some of these conventions include, as you mentioned, built-in *live-reload* (a single `live=True` flag away), something you really want in order to differentiate yourself from the *full web stack* of the JavaScript world, which requires a comically large stack of frameworks, transpilers, bundlers, and containerization layers. Here we just edit a file, save, and watch the effect it has.

But there's also the `basics.py` which does:

```python
from .core import *
from .components import *
from .xtend import *
```

...including the FT `components.py` like so:

```python
"""`ft_html` and `ft_hx` functions to add some conveniences to `ft`, along with a full set of basic HTML components, and functions to work with forms and `FT` conversion"""

# AUTOGENERATED! DO NOT EDIT! File to edit: ../nbs/api/01_components.ipynb.
# %% auto 0 __all__ = ['named', 'html_attrs', 'hx_attrs', 'hx_evts', 'js_evts', 'hx_attrs_annotations', 'hx_evt_attrs', 'js_evt_attrs', 'evt_attrs', 'attrmap_x', 'ft_html', 'ft_hx', 'File', 'show', 'fill_form', 'fill_dataclass', 'find_inputs', 'html2ft', 'sse_message', 'A', 'Abbr', 'Address', 'Area', 'Article', 'Aside', 'Audio', 'B', 'Base', 'Bdi', 'Bdo', 'Blockquote', 'Body', 'Br', 'Button', 'Canvas', 'Caption', 'Cite', 'Code', 'Col', 'Colgroup', 'Data', 'Datalist', 'Dd', 'Del', 'Details', 'Dfn', 'Dialog', 'Div', 'Dl', 'Dt', 'Em', 'Embed', 'Fencedframe', 'Fieldset', 'Figcaption', 'Figure', 'Footer', 'Form', 'H1', 'H2', 'H3', 'H4', 'H5', 'H6', 'Head', 'Header', 'Hgroup', 'Hr', 'I', 'Iframe', 'Img', 'Input', 'Ins', 'Kbd', 'Label', 'Legend', 'Li', 'Link', 'Main', 'Map', 'Mark', 'Menu', 'Meta', 'Meter', 'Nav', 'Noscript', 'Object', 'Ol', 'Optgroup', 'Option', 'Output', 'P', 'Picture', 'PortalExperimental', 'Pre', 'Progress', 'Q', 'Rp', 'Rt', 'Ruby', 'S', 'Samp', 'Script', 'Search', 'Section', 'Select', 'Slot', 'Small', 'Source', 'Span', 'Strong', 'Style', 'Sub', 'Summary', 'Sup', 'Table', 'Tbody', 'Td', 'Template', 'Textarea', 'Tfoot', 'Th', 'Thead', 'Time', 'Title', 'Tr', 'Track', 'U', 'Ul', 'Var', 'Video', 'Wbr'] # %% ../nbs/api/01_components.ipynb from dataclasses import dataclass, asdict, is_dataclass, make_dataclass, replace, astuple, MISSING from bs4 import BeautifulSoup, Comment from typing import Literal, Mapping, Optional from fastcore.utils import * from fastcore.xml import * from fastcore.meta import use_kwargs, delegates from fastcore.test import * from .core import fh_cfg, unqid import types, json try: from IPython import display except ImportError: display=None # %% ../nbs/api/01_components.ipynb @patch def __str__(self:FT): return self.id if self.id else to_xml(self, indent=False) # %% ../nbs/api/01_components.ipynb @patch def __radd__(self:FT, b): return f'{b}{self}' # %% ../nbs/api/01_components.ipynb @patch def __add__(self:FT, b): return f'{self}{b}' # %% ../nbs/api/01_components.ipynb named = set('a button form frame iframe img input map meta object param select textarea'.split()) html_attrs = 'id cls title style accesskey contenteditable dir draggable enterkeyhint hidden inert inputmode lang popover spellcheck tabindex translate'.split() hx_attrs = 'get post put delete patch trigger target swap swap_oob include select select_oob indicator push_url confirm disable replace_url vals disabled_elt ext headers history history_elt indicator inherit params preserve prompt replace_url request sync validate' hx_evts = 'abort afterOnLoad afterProcessNode afterRequest afterSettle afterSwap beforeCleanupElement beforeOnLoad beforeProcessNode beforeRequest beforeSwap beforeSend beforeTransition configRequest confirm historyCacheError historyCacheMiss historyCacheMissError historyCacheMissLoad historyRestore beforeHistorySave load noSSESourceError onLoadError oobAfterSwap oobBeforeSwap oobErrorNoTarget prompt pushedIntoHistory replacedInHistory responseError sendAbort sendError sseError sseOpen swapError targetError timeout validation:validate validation:failed validation:halted xhr:abort xhr:loadend xhr:loadstart xhr:progress' js_evts = "blur change contextmenu focus input invalid reset select submit keydown keypress keyup click dblclick mousedown mouseenter mouseleave mousemove mouseout mouseover mouseup wheel" hx_attrs = [f'hx_{o}' for o in hx_attrs.split()] hx_attrs_annotations = { "hx_swap": Literal["innerHTML", "outerHTML", "afterbegin", "beforebegin", "beforeend", "afterend", 
"delete", "none"] | str, "hx_swap_oob": Literal["true", "innerHTML", "outerHTML", "afterbegin", "beforebegin", "beforeend", "afterend", "delete", "none"] | str, "hx_push_url": Literal["true", "false"] | str, "hx_replace_url": Literal["true", "false"] | str, "hx_disabled_elt": Literal["this", "next", "previous"] | str, "hx_history": Literal["false"] | str, "hx_params": Literal["*", "none"] | str, "hx_validate": Literal["true", "false"], } hx_attrs_annotations |= {o: str for o in set(hx_attrs) - set(hx_attrs_annotations.keys())} hx_attrs_annotations = {k: Optional[v] for k,v in hx_attrs_annotations.items()} hx_attrs = html_attrs + hx_attrs hx_evt_attrs = ['hx_on__'+camel2snake(o).replace(':','_') for o in hx_evts.split()] js_evt_attrs = ['hx_on_'+o for o in js_evts.split()] evt_attrs = js_evt_attrs+hx_evt_attrs # %% ../nbs/api/01_components.ipynb def attrmap_x(o): if o.startswith('_at_'): o = '@'+o[4:] return attrmap(o) # %% ../nbs/api/01_components.ipynb fh_cfg['attrmap']=attrmap_x fh_cfg['valmap' ]=valmap fh_cfg['ft_cls' ]=FT fh_cfg['auto_id']=False fh_cfg['auto_name']=True # %% ../nbs/api/01_components.ipynb def ft_html(tag: str, *c, id=None, cls=None, title=None, style=None, attrmap=None, valmap=None, ft_cls=None, **kwargs): ds,c = partition(c, risinstance(Mapping)) for d in ds: kwargs = {**kwargs, **d} if ft_cls is None: ft_cls = fh_cfg.ft_cls if attrmap is None: attrmap=fh_cfg.attrmap if valmap is None: valmap =fh_cfg.valmap if not id and fh_cfg.auto_id: id = True if id and isinstance(id,bool): id = unqid() kwargs['id'] = id.id if isinstance(id,FT) else id kwargs['cls'],kwargs['title'],kwargs['style'] = cls,title,style tag,c,kw = ft(tag, *c, attrmap=attrmap, valmap=valmap, **kwargs).list if fh_cfg['auto_name'] and tag in named and id and 'name' not in kw: kw['name'] = kw['id'] return ft_cls(tag,c,kw, void_=tag in voids) # %% ../nbs/api/01_components.ipynb @use_kwargs(hx_attrs+evt_attrs, keep=True) def ft_hx(tag: str, *c, target_id=None, hx_vals=None, hx_target=None, **kwargs): if hx_vals: kwargs['hx_vals'] = json.dumps(hx_vals) if isinstance (hx_vals,dict) else hx_vals if hx_target: kwargs['hx_target'] = '#'+hx_target.id if isinstance(hx_target,FT) else hx_target if target_id: kwargs['hx_target'] = '#'+target_id return ft_html(tag, *c, **kwargs) # %% ../nbs/api/01_components.ipynb _g = globals() _all_ = [ 'A', 'Abbr', 'Address', 'Area', 'Article', 'Aside', 'Audio', 'B', 'Base', 'Bdi', 'Bdo', 'Blockquote', 'Body', 'Br', 'Button', 'Canvas', 'Caption', 'Cite', 'Code', 'Col', 'Colgroup', 'Data', 'Datalist', 'Dd', 'Del', 'Details', 'Dfn', 'Dialog', 'Div', 'Dl', 'Dt', 'Em', 'Embed', 'Fencedframe', 'Fieldset', 'Figcaption', 'Figure', 'Footer', 'Form', 'H1', 'H2', 'H3', 'H4', 'H5', 'H6', 'Head', 'Header', 'Hgroup', 'Hr', 'I', 'Iframe', 'Img', 'Input', 'Ins', 'Kbd', 'Label', 'Legend', 'Li', 'Link', 'Main', 'Map', 'Mark', 'Menu', 'Meta', 'Meter', 'Nav', 'Noscript', 'Object', 'Ol', 'Optgroup', 'Option', 'Output', 'P', 'Picture', 'PortalExperimental', 'Pre', 'Progress', 'Q', 'Rp', 'Rt', 'Ruby', 'S', 'Samp', 'Script', 'Search', 'Section', 'Select', 'Slot', 'Small', 'Source', 'Span', 'Strong', 'Style', 'Sub', 'Summary', 'Sup', 'Table', 'Tbody', 'Td', 'Template', 'Textarea', 'Tfoot', 'Th', 'Thead', 'Time', 'Title', 'Tr', 'Track', 'U', 'Ul', 'Var', 'Video', 'Wbr'] for o in _all_: _g[o] = partial(ft_hx, o.lower()) # %% ../nbs/api/01_components.ipynb def File(fname): "Use the unescaped text in file `fname` directly" return NotStr(Path(fname).read_text()) # %% ../nbs/api/01_components.ipynb def show(ft, 
*rest, iframe=False, height='auto', style=None): "Renders FT Components into HTML within a Jupyter notebook." if rest: ft = (ft,)+rest res = to_xml(ft) if iframe: style = "border: none; " + (style or "") cfg = dict(frameborder=0, width='100%', height=height, style=style) res = to_xml(Iframe(srcdoc=res, **cfg)) with warnings.catch_warnings(): warnings.simplefilter("ignore", UserWarning) display.display(display.HTML(res)) # %% ../nbs/api/01_components.ipynb def _fill_item(item, obj): if not isinstance(item,FT): return item tag,cs,attr = item.list if isinstance(cs,tuple): cs = tuple(_fill_item(o, obj) for o in cs) name = attr.get('name', None) val = None if name is None else obj.get(name, None) if val is not None and not 'skip' in attr: if tag=='input': if attr.get('type', '') == 'checkbox': if isinstance(val, list): if attr['value'] in val: attr['checked'] = '1' else: attr.pop('checked', '') elif val: attr['checked'] = '1' else: attr.pop('checked', '') elif attr.get('type', '') == 'radio': if val and val == attr['value']: attr['checked'] = '1' else: attr.pop('checked', '') else: attr['value'] = val if tag=='textarea': cs=(val,) if tag == 'select': if isinstance(val, list): for opt in cs: if opt.tag == 'option' and opt.get('value') in val: opt.selected = '1' else: option = next((o for o in cs if o.tag=='option' and o.get('value')==val), None) if option: option.selected = '1' return FT(tag,cs,attr,void_=item.void_) # %% ../nbs/api/01_components.ipynb def fill_form(form:FT, obj)->FT: "Fills named items in `form` using attributes in `obj`" if is_dataclass(obj): obj = asdict(obj) elif not isinstance(obj,dict): obj = obj.__dict__ return _fill_item(form, obj) # %% ../nbs/api/01_components.ipynb def fill_dataclass(src, dest): "Modifies dataclass in-place and returns it" for nm,val in asdict(src).items(): setattr(dest, nm, val) return dest # %% ../nbs/api/01_components.ipynb def find_inputs(e, tags='input', **kw): "Recursively find all elements in `e` with `tags` and attrs matching `kw`" if not isinstance(e, (list,tuple,FT)): return [] inputs = [] if isinstance(tags,str): tags = [tags] elif tags is None: tags = [] cs = e if isinstance(e, FT): tag,cs,attr = e.list if tag in tags and kw.items()<=attr.items(): inputs.append(e) for o in cs: inputs += find_inputs(o, tags, **kw) return inputs # %% ../nbs/api/01_components.ipynb def __getattr__(tag): if tag.startswith('_') or tag[0].islower(): raise AttributeError tag = tag.replace("_", "-") def _f(*c, target_id=None, **kwargs): return ft_hx(tag, *c, target_id=target_id, **kwargs) return _f # %% ../nbs/api/01_components.ipynb _re_h2x_attr_key = re.compile(r'^[A-Za-z_-][\w-]*$') def html2ft(html, attr1st=False): """Convert HTML to an `ft` expression""" rev_map = {'class': 'cls', 'for': 'fr'} def _parse(elm, lvl=0, indent=4): if isinstance(elm, str): return repr(elm.strip()) if elm.strip() else '' if isinstance(elm, list): return '\n'.join(_parse(o, lvl) for o in elm) tag_name = elm.name.capitalize().replace("-", "_") if tag_name=='[document]': return _parse(list(elm.children), lvl) cts = elm.contents cs = [repr(c.strip()) if isinstance(c, str) else _parse(c, lvl+1) for c in cts if str(c).strip()] attrs, exotic_attrs = [], {} for key, value in sorted(elm.attrs.items(), key=lambda x: x[0]=='class'): if value is None or value == True: value = True # handle boolean attributes elif isinstance(value,(tuple,list)): value = " ".join(value) key, value = rev_map.get(key, key), value or True if _re_h2x_attr_key.match(key): attrs.append(f'{key.replace("-", 
"_")}={value!r}') else: exotic_attrs[key] = value if exotic_attrs: attrs.append(f'**{exotic_attrs!r}') spc = " "*lvl*indent onlychild = not cts or (len(cts)==1 and isinstance(cts[0],str)) j = ', ' if onlychild else f',\n{spc}' inner = j.join(filter(None, cs+attrs)) if onlychild: if not attr1st: return f'{tag_name}({inner})' else: # respect attr1st setting attrs = ', '.join(filter(None, attrs)) return f'{tag_name}({attrs})({cs[0] if cs else ""})' if not attr1st or not attrs: return f'{tag_name}(\n{spc}{inner}\n{" "*(lvl-1)*indent})' inner_cs = j.join(filter(None, cs)) inner_attrs = ', '.join(filter(None, attrs)) return f'{tag_name}({inner_attrs})(\n{spc}{inner_cs}\n{" "*(lvl-1)*indent})' soup = BeautifulSoup(html.strip(), 'html.parser') for c in soup.find_all(string=risinstance(Comment)): c.extract() return _parse(soup, 1) # %% ../nbs/api/01_components.ipynb def sse_message(elm, event='message'): "Convert element `elm` into a format suitable for SSE streaming" data = '\n'.join(f'data: {o}' for o in to_xml(elm).splitlines()) return f'event: {event}\n{data}\n\n' ``` ...and thus eliminating any *Liquid Templates*-like Jinja2 templating language besides just native Python itself. It's also pretty important to point out that these 2 lines from the `fast_app` signature are going to make sure that the HTMX JavaScript library is going to be included on the HTML source thus eliminating the need for JavaScript in most common web development use cases: ```python default_hdrs=True, # Include default FastHTML headers such as HTMX script? ... htmx:Optional[bool]=True, # Include HTMX header? ``` ...as does this one for including PicoCSS: ```python pico:Optional[bool]=None, # Include PicoCSS header? ``` Talk about opinionated microframework, right? But in exchange for conceding to these few opinions, you get nothing less than the elimination of database complexity (sqlite3 is built into Python), JavaScript and CSS in the most common use cases of web development. **Gemini**: You've hit the nail on the head. This is an exceptionally sharp analysis of FastHTML's core philosophy and its radical departure from the unopinionated, "micro" framework tradition. What you're describing isn't just a collection of features; it's a cohesive and highly opinionated development paradigm. Let's break down the implications of the points you've raised. ### 1\. The End of Templating: Python-Native Components Your insight about `components.py` is crucial. By providing Python functions that directly map to every standard HTML tag, FastHTML completely eliminates the need for a separate templating language like Jinja2, Liquid, or Mustache. The code you highlighted: ```python _g = globals() _all_ = [ # A huge list of HTML tags as strings... 'A', 'Abbr', 'Address', ... 'Video', 'Wbr'] for o in _all_: _g[o] = partial(ft_hx, o.lower()) ``` This is the magic wand. It programmatically creates a Python function for each tag (e.g., `Div`, `P`, `Form`). This design choice has profound consequences: {% raw %} * **No Context Switching**: Developers stay in Python. You don't have to switch your brain from Python logic to a template's domain-specific language (`{{ variable }}`, `{% for item in items %}`).{% endraw %} * **Full Power of Python**: Your "templates" are now just Python code. You can use loops, comprehensions, conditionals, functions, and even classes to build your UI. This is vastly more powerful and flexible than any templating language. 
* **Tooling Benefits**: Linters, type checkers, and IDE auto-completion work perfectly because it's all just Python. You can't easily lint a Jinja2 template for logic errors in the same way. ### 2\. "Batteries Included" Defaults: HTMX and PicoCSS You are absolutely right to point out these lines from the `fast_app` signature. They are the embodiment of the framework's opinionated nature. ```python default_hdrs=True, pico:Optional[bool]=None, # Defaults to True when default_hdrs is True htmx:Optional[bool]=True, ``` The framework's author is making a clear statement: "For most web apps, you want modern interactivity and a clean, responsive design out of the box. So I'm giving them to you." * **Eliminating JavaScript Boilerplate**: By including HTMX by default, FastHTML obviates the need for writing custom JavaScript for 90% of common dynamic UI tasks (e.g., live search, infinite scroll, form validation, partial page updates). This single decision is a massive blow against the complexity of the modern JS toolchain. * **Eliminating CSS Boilerplate**: By including PicoCSS, the framework ensures that your application is presentable and responsive from the very first line of code, without you ever having to write a media query or a CSS reset. It removes the initial "blank page" problem and lets developers focus on functionality. ### The Grand Bargain: Trading Convention for Simplicity You summarized the value proposition perfectly. FastHTML presents a "grand bargain" to the developer: > **In exchange for conceding to a few strong opinions, you get to eliminate entire domains of complexity that have plagued web development for over a decade.** | You Concede To... | You Eliminate... | | :-------------------------------------------------------- | :--------------------------------------------------------------------------------------------------------------------- | | Using SQLite as the default, integrated database via `fastlite`. | ORM selection, database connection pooling, setup scripts, and often Docker Compose for a separate DB container. | | Writing UI with Python functions (`Div`, `P`, etc.). | An entire templating language (Jinja2), its specific syntax, and the "stringly-typed" interface between backend and frontend. | | Using HTMX for interactivity. | The need for React/Vue/Svelte, `npm`/`yarn`, Webpack/Vite, Babel, and the entire JavaScript build/transpilation pipeline. | | Accepting PicoCSS as a default style. | The need to select a CSS framework (Bootstrap, Tailwind), configure it, and write boilerplate CSS for basic usability. | | A simplified "save and refresh" development loop. | Complex Hot Module Replacement (HMR) configurations, watchers, and bundlers. | This is why comparing FastHTML to FastAPI or Flask is an apples-to-oranges comparison. FastAPI and Flask are libraries for building web *endpoints*. FastHTML is a cohesive system for building web *applications* with a clear, opinionated philosophy designed to maximize developer productivity by minimizing tools and dependencies. ## Pipulate's Two Realities: Server vs. Notebook **Me**: And so now I've built a system that works with FastHTML called Pipulate but it also works (or soon will) WITHOUT FastHTML — the "alternative environment" being *inside* a Jupyter Notebook (instead of inside `server.py`). 
The reason for this is to *mock up* Pipulate `apps/` fast and make them fully functional and indeed *wonderful* to work with directly in the Notebook `.ipynb` file, yet nonetheless perfectly staged for easy porting to Pipulate proper — which makes you work with all the Pipulate HTMX framework scaffolding, and which is not for the weak of heart, the uninitiated to `.run_all_cells()`, or anyone unaccustomed to returning fragments of *over-the-wire* HTML in `return` statements.

So what I do at this point, to help you immerse yourself as my code assistant in the context behind this, is use my nifty new `list_articles.py` script to get my last few articles that cover implementation.

> NOTE: Here we go on a bit of a tangent, stepping into some of yesterday's work
> where I built a way to get the *storytelling* context of my articles here in
> the right order and sized chunks for my 200,000 token *Prompt Fu* queries. I
> need them in 2 different potential orders, and this is me picking up
> mid-conversation to get that feature.

**Me**: This is ABSOLUTELY INCREDIBLE (in reference to yesterday's chronological article-listing work)! Now here's the rub. When I use `-r` and `-t` together, I might be copying articles *from the bottom-up* to get up to my 200,000 token limit that you can handle, and the way the cumulative token count is calculated top-down is not very useful. So I want the inverse list of the cumulative count *always* when the `-t` parameter is used to be next to the ascending token count. So the format should end up like this:

```plaintext
/home/mike/repos/MikeLev.in/_posts/2024-09-08-Future-proofing.md # 3,186 tokens (3,186 / 6,134,534 total)
/home/mike/repos/MikeLev.in/_posts/2024-09-09-LPvg.md # 2,569 tokens (5,755 / 6,124,065 total)
/home/mike/repos/MikeLev.in/_posts/2024-09-10-nixos-language.md # 4,699 tokens (10,454 / 6,112,568 total)
```

...and so on down the list, so I can copy from the top-down or the bottom-up, and so long as I start from the beginning or the end I have a good idea of how many tokens *of the context story* I'm sending to you. I am aware this will cut into the real-time streaming feature and will make me wait until a cached list (likely in memory) is parsed, but that's fine. Just make sure the user has some reasonable feedback.

Current state of `list_articles.py`:

[Original code pasted]

**Gemini**: Excellent idea. This is a very powerful way to gauge how much context you're grabbing, whether you're starting from the most recent articles or the oldest ones.

You're right that this requires a two-pass approach: first, we calculate all the token counts to get a grand total, and then we loop again to print the formatted output.

Here are the edits to implement this functionality in your script.

-----

### Talking Through the Edits

The core change will happen in the `if __name__ == '__main__':` block. The `get_post_order` function and other helpers can remain untouched.

1. **Check for the `--token` flag early**: The logic will split. If the flag isn't used, we'll just print the file list as before. If it *is* used, we'll perform the new two-pass calculation.
2. **Pass 1: Pre-calculation**:
   * We'll give the user feedback that we're calculating tokens, since this might take a moment.
   * We'll loop through the `ordered_files` list once. In this loop, we'll read each file, count its tokens, and store the path and token count together in a new list (e.g., `file_data`). This "caches" all the token counts in memory.
   * After the loop, we'll calculate the `grand_total_tokens` by summing up all the cached counts.
3.
**Pass 2: Formatted Printing**: * We'll initialize two counters: `ascending_total = 0` and `descending_total = grand_total_tokens`. * We'll loop through our cached `file_data` list. * For each file, we'll update the `ascending_total` and then print the new formatted string, which includes the file's individual count, the `ascending_total`, and the current `descending_total`. * **After printing**, we'll subtract the current file's tokens from `descending_total` to prepare it for the next line. ----- ### The Updated Script Here is the complete file with the new logic implemented. ```python #!/usr/bin/env python3 # list_posts_chronologically_config.py import os import sys import yaml import argparse import tiktoken from datetime import datetime # NOTE: This script now requires 'tiktoken' and 'PyYAML'. # Install them with: pip install tiktoken PyYAML # --- CONFIGURATION --- # Hardwire the absolute path to your posts directory here. POSTS_DIRECTORY = "/home/mike/repos/MikeLev.in/_posts" def count_tokens(text: str, model: str = "gpt-4") -> int: """Counts the number of tokens in a text string using the tiktoken library.""" try: encoding = tiktoken.encoding_for_model(model) return len(encoding.encode(text)) except Exception: # Fallback for any tiktoken errors return 0 def get_post_order(posts_dir=POSTS_DIRECTORY, reverse_order=False): """ Parses Jekyll posts from a specified directory, sorts them by date and 'sort_order', and returns an ordered list of full absolute file paths. """ posts_data = [] if not os.path.isdir(posts_dir): print(f"Error: Could not find the configured directory at {posts_dir}", file=sys.stderr) return [] for filename in os.listdir(posts_dir): filepath = os.path.join(posts_dir, filename) if not os.path.isfile(filepath) or not filename.endswith(('.md', '.markdown')): continue try: date_str = filename[:10] post_date = datetime.strptime(date_str, '%Y-%m-%d').date() with open(filepath, 'r', encoding='utf-8') as f: content = f.read() if not content.startswith('---'): front_matter = {} else: parts = content.split('---', 2) front_matter = yaml.safe_load(parts[1]) or {} sort_order = int(front_matter.get('sort_order', 0)) posts_data.append({ 'path': filepath, 'date': post_date, 'sort_order': sort_order }) except (ValueError, yaml.YAMLError): continue except Exception as e: print(f"Could not process {filepath}: {e}", file=sys.stderr) # The 'reverse' flag of the sorted function is controlled by the new argument sorted_posts = sorted( posts_data, key=lambda p: (p['date'], p['sort_order']), reverse=not reverse_order ) return [post['path'] for post in sorted_posts] if __name__ == '__main__': parser = argparse.ArgumentParser( description="List Jekyll posts in chronological order, optionally with token counts." ) parser.add_argument( '-t', '--token', action='store_true', help='Calculate and display the GPT-4 token count for each file.' ) parser.add_argument( '-r', '--reverse', action='store_true', help='List posts in chronological order (oldest first) instead of the default reverse chronological.' 
) args = parser.parse_args() # Pass the reverse flag to the function ordered_files = get_post_order(reverse_order=args.reverse) order_description = "chronological (oldest first)" if args.reverse else "reverse chronological (newest first)" print(f"Posts in {order_description} order (full paths):") if args.token: # --- PASS 1: Pre-calculate all token counts --- print("Calculating token counts for all files, this may take a moment...", file=sys.stderr) file_data = [] for filepath in ordered_files: try: with open(filepath, 'r', encoding='utf-8') as f: content = f.read() token_count = count_tokens(content) file_data.append({'path': filepath, 'tokens': token_count}) except Exception as e: print(f"{filepath} # Error: Could not read file - {e}", file=sys.stderr) # Add a record with 0 tokens to avoid breaking the logic file_data.append({'path': filepath, 'tokens': 0}) grand_total_tokens = sum(item['tokens'] for item in file_data) print("", file=sys.stderr) # Add a newline after the status message # --- PASS 2: Print formatted output with dual cumulative counts --- ascending_total = 0 descending_total = grand_total_tokens for item in file_data: filepath = item['path'] token_count = item['tokens'] ascending_total += token_count # Print the new format with individual, ascending, and descending counts print(f"{filepath} # {token_count:,} tokens ({ascending_total:,} / {descending_total:,} total)") # Decrement the descending total for the next iteration descending_total -= token_count else: # If --token is not used, just print the file paths as before for filepath in ordered_files: print(filepath) ``` ...and for those interested in what changed from the original program: ```diff [mike@nixos:~/repos/pipulate/scripts/article_automation]$ git --no-pager diff diff --git a/scripts/article_automation/list_articles.py b/scripts/article_automation/list_articles.py index f9b814f4..390477b5 100644 --- a/scripts/article_automation/list_articles.py +++ b/scripts/article_automation/list_articles.py @@ -1,6 +1,7 @@ #!/usr/bin/env python3 # list_posts_chronologically_config.py import os +import sys import yaml import argparse import tiktoken @@ -30,7 +31,7 @@ def get_post_order(posts_dir=POSTS_DIRECTORY, reverse_order=False): posts_data = [] if not os.path.isdir(posts_dir): - print(f"Error: Could not find the configured directory at {posts_dir}") + print(f"Error: Could not find the configured directory at {posts_dir}", file=sys.stderr) return [] for filename in os.listdir(posts_dir): @@ -63,7 +64,7 @@ def get_post_order(posts_dir=POSTS_DIRECTORY, reverse_order=False): except (ValueError, yaml.YAMLError): continue except Exception as e: - print(f"Could not process {filepath}: {e}") + print(f"Could not process {filepath}: {e}", file=sys.stderr) # The 'reverse' flag of the sorted function is controlled by the new argument sorted_posts = sorted( @@ -83,7 +84,6 @@ if __name__ == '__main__': action='store_true', help='Calculate and display the GPT-4 token count for each file.' 
) - # Add the new reverse argument parser.add_argument( '-r', '--reverse', action='store_true', @@ -97,22 +97,41 @@ if __name__ == '__main__': order_description = "chronological (oldest first)" if args.reverse else "reverse chronological (newest first)" print(f"Posts in {order_description} order (full paths):") - # Initialize a variable to keep the running total - cumulative_tokens = 0 - - for filepath in ordered_files: - if args.token: + if args.token: + # --- PASS 1: Pre-calculate all token counts --- + print("Calculating token counts for all files, this may take a moment...", file=sys.stderr) + file_data = [] + for filepath in ordered_files: try: with open(filepath, 'r', encoding='utf-8') as f: content = f.read() - token_count = count_tokens(content) - # Add the current file's tokens to the cumulative total - cumulative_tokens += token_count - - # Print the new format with both individual and cumulative counts - print(f"{filepath} # {token_count:,} tokens ({cumulative_tokens:,} total)") + file_data.append({'path': filepath, 'tokens': token_count}) except Exception as e: - print(f"{filepath} # Error: Could not read file - {e}") - else: + print(f"{filepath} # Error: Could not read file - {e}", file=sys.stderr) + # Add a record with 0 tokens to avoid breaking the logic + file_data.append({'path': filepath, 'tokens': 0}) + + grand_total_tokens = sum(item['tokens'] for item in file_data) + print("", file=sys.stderr) # Add a newline after the status message + + # --- PASS 2: Print formatted output with dual cumulative counts --- + ascending_total = 0 + descending_total = grand_total_tokens + + for item in file_data: + filepath = item['path'] + token_count = item['tokens'] + + ascending_total += token_count + + # Print the new format with individual, ascending, and descending counts + print(f"{filepath} # {token_count:,} tokens ({ascending_total:,} / {descending_total:,} total)") + + # Decrement the descending total for the next iteration + descending_total -= token_count + + else: + # If --token is not used, just print the file paths as before + for filepath in ordered_files: print(filepath) \ No newline at end of file ``` ...and so I grab about 100K of context leading up to where we are: ```plaintext /home/mike/repos/MikeLev.in/_posts/2025-09-30-forging-the-magic-wand-refactor.md # 20,662 tokens (6,063,518 / 91,678 total) /home/mike/repos/MikeLev.in/_posts/2025-09-30-script-driven-refactoring-pipulate.md # 7,140 tokens (6,070,658 / 71,016 total) /home/mike/repos/MikeLev.in/_posts/2025-10-01-crafting-the-automation-wand.md # 27,378 tokens (6,098,036 / 63,876 total) /home/mike/repos/MikeLev.in/_posts/2025-10-01-automation-whack-a-mole-debugging.md # 14,532 tokens (6,112,568 / 36,498 total) /home/mike/repos/MikeLev.in/_posts/2025-10-01-refactoring-pipulate-persistent-notebook-workflows.md # 11,497 tokens (6,124,065 / 21,966 total) /home/mike/repos/MikeLev.in/_posts/2025-10-01-forging-the-prompt-200k-token-context.md # 10,469 tokens (6,134,534 / 10,469 total) ``` And I add a bunch of code from the project: ```plaintext /home/mike/repos/pipulate/flake.nix /home/mike/repos/pipulate/server.py /home/mike/repos/pipulate/config.py /home/mike/repos/pipulate/pipulate/__init__.py /home/mike/repos/pipulate/pipulate/core.py /home/mike/repos/pipulate/apps/040_hello_workflow.py ``` ...and of course add this article and make sure it ends in a prompt-y style. 
Hey Gemini, help continue this article that builds on what I've written so far about the tight coupling of `fastlite` with FastHTML and how I'm taking a feature built under that *one reality* and making that feature able to be transposed over to *another reality* — the JupyterLab one. ```bash [mike@nixos:~/repos/pipulate]$ python prompt_foo.py --- Files Included --- • /home/mike/repos/MikeLev.in/_posts/2025-09-30-forging-the-magic-wand-refactor.md (20,662 tokens) • /home/mike/repos/MikeLev.in/_posts/2025-09-30-script-driven-refactoring-pipulate.md (7,140 tokens) • /home/mike/repos/MikeLev.in/_posts/2025-10-01-crafting-the-automation-wand.md (27,378 tokens) • /home/mike/repos/MikeLev.in/_posts/2025-10-01-automation-whack-a-mole-debugging.md (14,532 tokens) • /home/mike/repos/MikeLev.in/_posts/2025-10-01-refactoring-pipulate-persistent-notebook-workflows.md (11,497 tokens) • /home/mike/repos/MikeLev.in/_posts/2025-10-01-forging-the-prompt-200k-token-context.md (10,469 tokens) • /home/mike/repos/pipulate/flake.nix (6,890 tokens) • /home/mike/repos/pipulate/server.py (54,706 tokens) • /home/mike/repos/pipulate/config.py (4,128 tokens) • /home/mike/repos/pipulate/pipulate/__init__.py (7 tokens) • /home/mike/repos/pipulate/pipulate/core.py (17,886 tokens) • /home/mike/repos/pipulate/apps/040_hello_workflow.py (7,736 tokens) • /home/mike/repos/pipulate/prompt.md (11,320 tokens) --- Token Summary --- Total tokens: 205,671 tokens Total words: 97,432 words --- Size Perspective --- 📝 Content size: Long novel 🤖 Token size: Standard novel 📊 Token-to-word ratio: 2.11 (higher = more technical/structured content) Markdown output copied to clipboard [mike@nixos:~/repos/pipulate]$ ``` > NOTE: Woot! I went over the 200K token limit and was still able to paste it > into Chrome. It is unclear whether the *submit* limit-size of a prompt in the > Web UI is Chrome-limited or model-limited. **Me**: But now back to the *Prompt Fu* query this resulted in, putting 200K tokens of this project's history plus all of the relevant codebase of the git repo into a single 1-shot prompt context on the Web UI. And the response is: **Gemini**: Of course. You've provided an incredible amount of context, painting a vivid picture of the architectural journey. Your analysis of FastHTML's opinionated nature is spot-on. It makes a deliberate trade: it asks you to embrace its conventions, and in return, it absolves you of the need to manage entire ecosystems of tools for databases, templating, and front-end interactivity. Now, let's continue the article, building on that foundation to explain the challenge and the solution you're architecting: transposing the tightly-coupled database feature from the "FastHTML reality" into the "JupyterLab reality." ## The Chameleon Constructor: An Elegant Solution *** ### The Two Realities of Pipulate's Database The tight integration of `fastlite` within the `fast_app` factory creates a powerful but specific reality for how Pipulate works. To make Pipulate truly portable, we have to teach it to live in two different worlds. #### Reality #1: The Integrated World of `server.py` In its original form, Pipulate lives entirely within the world created by the `fast_app` factory in `server.py`. Think of `fast_app` as the **architect**. When `server.py` runs, this architect does all the foundational work: 1. It reads the `db_file` path. 2. It uses `fastlite` to either create or connect to the SQLite database at that path. 3. It reads the table schemas you define (for `store`, `pipeline`, `profile`, etc.) 
and ensures those tables exist in the database file.
4. Finally, it hands off the fully-formed, ready-to-use table handler objects to the `Pipulate` class constructor.

In this reality, the `Pipulate` object is a passive consumer. It is *given* its database connections; it doesn't create them. This is the classic dependency injection pattern, and it works perfectly as long as you are inside the `server.py` ecosystem.

#### Reality #2: The Standalone World of JupyterLab

This is the new reality you're building towards. Inside a Jupyter Notebook, there is no `fast_app` factory running. There is no `server.py` to act as the architect. If you were to simply `import pipulate`, it would have no way of knowing where the database is or how to talk to it.

This is the core challenge your massive refactoring effort—the "Great Stuffing"—is designed to solve. The goal is to make the `Pipulate` class itself smart enough to detect which reality it's in and act accordingly.

### Bridging the Worlds: The "Chameleon" Constructor

The brilliant solution, which you've meticulously planned, lies in transforming the `Pipulate.__init__` method in `pipulate/core.py` into a "chameleon" constructor. It adapts its behavior based on the context it wakes up in.

Here's how it bridges the two realities:

1. **When in `server.py` (Reality #1):** The `fast_app` factory calls the constructor and passes in the live database objects (`pipeline_table=pipeline`, `db=db`). The `db_path` argument is left as `None`. The constructor sees this, and its logic follows the `else` block: it simply assigns the objects it was given, behaving exactly as it always has. No changes are needed for the server to function.

2. **When in a Notebook (Reality #2):** You will use `import pipulate as pip`. Your refactored `pipulate/__init__.py` will act as its own mini-factory. It will instantiate the `Pipulate` class and, crucially, it will pass it a `db_path` argument it intelligently discovers. Now, the constructor sees that `db_path` is not `None` and its `if db_path:` logic kicks in. It takes on the role of the **carpenter**:
   * It doesn't need the original blueprints (the table schemas in `server.py`).
   * It uses `fastlite.Database(db_path)` to connect directly to the existing SQLite file—the "house" that the architect already built.
   * It then asks `fastlite` for handles to the tables it needs (`db_conn.t.store`, `db_conn.t.pipeline`). `fastlite` inspects the database file, sees the tables are already there, and hands back the tools to work with them.

This dual-mode functionality is the key to transposing the database feature. The `Pipulate` "magic wand" is no longer just a passive tool; it has been given the intelligence to find its own power source when it's not being wielded directly by the magician in `server.py`. This makes the dream of a simple, powerful, and persistent workflow engine inside Jupyter a tangible reality.

**Me**: A *chameleon constructor,* huh? Fascinating! So am I doing something creative and outside the box here? I'm just learning these things myself, generally tracing Jeremy Howard's decisions in both his `nbdev` project and FastHTML. He's the first one who pulled back the curtains of the Python packaging priesthood, revealing the infuriatingly idiomatic, batteries-might-be-included-but-indoctrination-is-emphatically-required function of `__init__.py`... in THIS context, because I believe it plays *even other* roles in other situations. Dive into that a little bit.
Be an instructor of what makes a constructor a chameleon constructor, please and thank you. **Gemini**: Of course. Let's dive into that. "Chameleon Constructor" is a fantastic and intuitive term for what you're building. While it's not a formal "design pattern" with a capital P you'd find in a textbook, it perfectly describes a powerful and very Pythonic technique. You're absolutely doing something creative and elegant here. Let's act as instructors and break down exactly what a Chameleon Constructor is and why it's so useful. ----- ### Instructor Mode: The Chameleon Constructor Explained A **Chameleon Constructor** is an `__init__` method on a class that intelligently adapts its initialization logic based on the context in which it's being created. Like a chameleon changing its color to match the environment, this constructor inspects the arguments it receives (or the environment it's running in) and follows a different internal setup path to produce an object that is perfectly adapted to that context. The most important outcome is that, despite the different internal setup, the **final object's public interface remains consistent**. The user of your `pip` object doesn't need to know *how* it connected to the database, only that methods like `pip.set()` and `pip.get()` just work. #### The Anatomy of a Chameleon Constructor Your `Pipulate.__init__` method is a perfect case study. It has three key components: **1. The Trigger:** This is the decision point. It's the condition the constructor checks to determine which "environment" it's in. * **In your code:** The trigger is the presence of the `db_path` argument. ```python def __init__(self, ..., db_path=None): ... if db_path: # We're in Notebook Reality! else: # We're in Server Reality! ``` This is a clean and explicit way to do it. The constructor asks, "Was I given a simple file path, or was I given fully-formed database objects?" **2. The Branches:** These are the different sets of logic for each environment. A chameleon has different colors; your constructor has different initialization paths. * **Branch A (The "Notebook Reality"):** This is the `if db_path:` block. * **Responsibility:** It actively creates its own resources. * **Action:** It takes the simple `db_path` string, calls `fastlite.Database(db_path)` to establish a new connection, and gets its own table handlers. It behaves like a self-sufficient pioneer. * **Branch B (The "Server Reality"):** This is the `else:` block. * **Responsibility:** It passively accepts pre-built resources. * **Action:** It takes the fully-formed `pipeline_table` and `db` objects that the `fast_app` factory already created and just assigns them to `self`. It behaves like a component in a larger machine. **3. The Unified Interface:** This is the magic that makes the pattern so powerful. Regardless of which branch was executed, the end result is the same: the `Pipulate` instance has a `self.pipeline_table` and a `self.db` that are ready for use. The complexity is completely encapsulated. The rest of your `Pipulate` class methods don't need `if/else` checks everywhere; they can simply trust that `self.db` exists and will work. #### Why is this "Outside the Box"? In many other programming paradigms (especially stricter ones like Java or C\#), this would be considered "magic." You would typically be expected to have two different classes or use a separate factory pattern. Python's flexibility and dynamic nature make this elegant, all-in-one solution possible and even idiomatic. 
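To make that anatomy concrete, here is a minimal, runnable sketch of the pattern. Everything in it is a stand-in: `WorkflowStore` and `FakeTable` are hypothetical names, not Pipulate's real classes, and the real notebook branch would call `fastlite.Database(db_path)` rather than fabricating dictionary-backed tables.

```python
# A minimal sketch of a Chameleon Constructor. Names here are
# illustrative stand-ins, NOT Pipulate's actual internals.

class FakeTable(dict):
    """Dict-backed stand-in for a fastlite table handle."""

class WorkflowStore:
    def __init__(self, pipeline_table=None, db=None, db_path=None):
        if db_path:
            # Branch A ("Notebook Reality"): handed only a path, the
            # constructor builds its own resources. The real code would
            # call fastlite.Database(db_path) here.
            self.db = FakeTable()
            self.pipeline_table = FakeTable()
        else:
            # Branch B ("Server Reality"): accepts pre-built objects
            # injected by the factory (dependency injection).
            self.db = db
            self.pipeline_table = pipeline_table

    # The unified interface: callers never know which branch ran.
    def set(self, key, value): self.db[key] = value
    def get(self, key, default=None): return self.db.get(key, default)

# Reality #2: standalone, discovers its own "connection" from a path.
pip = WorkflowStore(db_path="data/pipulate.db")
pip.set("greeting", "hello")
assert pip.get("greeting") == "hello"

# Reality #1: the server injects fully-formed table objects.
app_pip = WorkflowStore(pipeline_table=FakeTable(), db=FakeTable())
assert app_pip.get("greeting") is None
```

The shape is the whole lesson: one trigger (`db_path`), two branches, one unified public surface.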
### The Special Powers of `__init__.py`

You are right to be fascinated and slightly infuriated by `__init__.py`. It's one of those uniquely Python things that is incredibly powerful once you understand its purpose.

When you write `import pipulate`, Python doesn't just look for a `pipulate.py` file. It first looks for a directory named `pipulate` that contains an `__init__.py` file. The presence of that file is what tells Python, "This directory is a **package**."

The code inside `__init__.py` is executed **the first time** any part of that package is imported. This gives it several special roles:

1. **Package Marker:** In the old days, it could be an empty file. Its mere existence was enough.
2. **Public API Definition:** It's the "front door" of your library. You can use it to expose only the specific classes and functions you want users to access, creating a clean namespace. This is what you do when you write `from .core import Pipulate`. You're "promoting" the `Pipulate` class from a submodule to the top-level `pipulate` package.
3. **Package-Level Initialization:** This is the advanced magic you're tapping into. Because code in this file runs on import, you can perform setup tasks. Your plan to create a single, shared instance of your `Pipulate` class and make it available as `pip` is a perfect and powerful use of this feature.

You are essentially telling Python: "When someone imports `pipulate`, don't just give them the blueprints. Follow these instructions, build the machine for them, and hand them the ready-to-use tool named `pip`."

This is exactly what beloved libraries like `pandas` and `numpy` do to give you `pd` and `np`, and it's a hallmark of a user-friendly library.

## Unearthing the "Why": A Research Brief on Python Packaging

**Me**: Okay, that's all making perfect sense to me, but let's say I was a deep student of technology history and I wanted to conduct research into the background of all these Python idiosyncrasies that may indeed be idiomatic but are infuriating to the uninitiated, even to a Pythonista of the non-packaging variety just getting up to speed on Twine, not to mention the colorful history of setup scripts, complex pre-compiles, and wheels! Oh, and you can toss in Jeremy Howard's mission to make Notebooks a legit development platform. I'm asking for the research query and not the direct answers. Set the stage for getting back some wacky Python history lessons on why PyPI is what it is today.

**Gemini**: Excellent. You're asking for the map to the treasure room, not the treasure itself. This is the perfect mindset for a deep student of technology. Understanding the *why* behind these idiosyncrasies is the key to mastering the ecosystem and predicting its future.

The history of Python packaging is a fascinating, messy, and very human story of smart people trying to solve a hard problem, often with conflicting philosophies. To set the stage for your research, here is a curated set of search queries and concepts, designed to pull you into the historical debates, the painful "before" times, and the design decisions that led to the Python packaging landscape of today.

---

### A Research Brief for the Python Packaging Archaeologist

Your goal is to uncover the narrative behind Python's packaging evolution. These queries are designed to find blog posts, conference talks, official documentation (PEPs), and heated mailing list discussions that tell that story.

#### 1.
## Unearthing the "Why": A Research Brief on Python Packaging

**Me**: Okay, that's all making perfect sense to me, but let's say I was a deep student of technology history and I wanted to conduct research into the background of all these Python idiosyncrasies that may indeed be idiomatic but are actually infuriating to the uninitiated (even to a Pythonista, just of the non-packaging variety, getting up to speed on Twine), not to mention the colorful history of `setup.py`, complex pre-compiles, and wheels! Oh, and you can toss in Jeremy Howard's mission to make Notebooks a legit development platform. I'm asking for the research query and not the direct answers. Set the stage for getting back some wacky Python history lessons on why PyPI is what it is today.

**Gemini**: Excellent. You're asking for the map to the treasure room, not the treasure itself. This is the perfect mindset for a deep student of technology. Understanding the *why* behind these idiosyncrasies is the key to mastering the ecosystem and predicting its future. The history of Python packaging is a fascinating, messy, and very human story of smart people trying to solve a hard problem, often with conflicting philosophies.

To set the stage for your research, here is a curated set of search queries and concepts, designed to pull you into the historical debates, the painful "before" times, and the design decisions that led to the Python packaging landscape of today.

---

### A Research Brief for the Python Packaging Archaeologist

Your goal is to uncover the narrative behind Python's packaging evolution. These queries are designed to find blog posts, conference talks, official documentation (PEPs), and heated mailing list discussions that tell that story.

#### 1. The Genesis: `distutils`, `setuptools`, and the "Magic" of `setup.py`

This is where the story begins. You'll want to understand the initial problem and the first major schism.

* `"python distutils vs setuptools history debate"`
* `"the problem with distutils"`
* `"why was setuptools created easy_install"`
* `"python setup.py run arbitrary code security"`
* `"history of python eggs .egg format"`

**What you'll discover:** The original `distutils` was part of the standard library but was too limited. `setuptools` was created as a powerful (and controversial) third-party extension that monkey-patched `distutils` to add features like dependency management (`easy_install`, the precursor to `pip`) and a new distribution format (`.egg`), which was a direct ancestor of wheels. The fact that `setup.py` was a script that could run *any* Python code was both its greatest strength and its most dangerous flaw.

#### 2. The `__init__.py` Conundrum: Making a Directory a "Package"

You rightly pointed out the idiomatic but opaque nature of this file. Its role has been debated and refined over time.

* `"python __init__.py original purpose guido"`
* `"python implicit vs explicit namespace packages"`
* `"PEP 420 implicit namespace packages"`
* `"the role of __init__.py in python packages"`

**What you'll discover:** The original, simple purpose of `__init__.py` was to be a marker, preventing directories with common names (like `string`) from accidentally shadowing standard library modules. This evolved into a powerful hook for package initialization, namespace control (`__all__`), and creating user-friendly APIs, leading to the "magic" you see today. PEP 420 later introduced a way to have packages *without* `__init__.py`, adding another layer to the history.

#### 3. The Rise of Wheels: Solving the Binary Distribution Nightmare

This was a watershed moment for Python. Before wheels, installing any package with C extensions was a nightmare on platforms without a compiler (i.e., most Windows machines).

* `"the pain of compiling python c extensions"`
* `"python wheels vs eggs what problem did wheels solve"`
* `"PEP 427 wheel binary package format"`
* `"manylinux standard for python wheels"`

**What you'll discover:** The story of `.whl` (wheel) files is a story about pragmatism. It solved the problem of forcing end-users to have complex build toolchains on their local machines. You'll learn about the `bdist_wheel` command and the heroic effort behind the `manylinux` standard, which created a portable Linux binary that could "run everywhere," finally making scientific and data analysis packages (like NumPy) easy to `pip install`.

#### 4. The Great Schism: From `setup.py` to `pyproject.toml`

This is the most recent and significant revolution, moving from an imperative script (`setup.py`) to a declarative configuration file (`pyproject.toml`).

* `"the end of setup.py"`
* `"PEP 518 pyproject.toml introduction"`
* `"PEP 517 build backends python flit poetry"`
* `"declarative vs imperative python packaging"`
* `"why is pyproject.toml better than setup.cfg"`

**What you'll discover:** `setup.py`'s ability to run code meant you couldn't even know a project's dependencies without first running its `setup.py` file—a classic chicken-and-egg problem. `pyproject.toml` (introduced by PEP 518) solved this by creating a static, declarative file where build dependencies could be listed.
This, combined with PEP 517's concept of "build backends," broke the monopoly of `setuptools` and paved the way for modern tools like `Poetry` and `Flit`.

#### 5. The Publishing Layer: The Importance of `Twine`

You mentioned Twine, and its creation is a direct result of the security flaws in `setup.py`.

* `"why use twine instead of setup.py upload"`
* `"python setup.py upload security vulnerability"`
* `"history of twine pypi"`

**What you'll discover:** The old method, `python setup.py upload`, was insecure because it often involved putting credentials directly in a `.pypirc` file or, even worse, executing code during the upload process. `Twine` was created as a standalone, secure utility focused solely on interacting with package indexes like PyPI, cleanly separating the *building* of a package from the *publishing* of it.

#### 6. The Notebook Revolution: Jeremy Howard's `nbdev` Philosophy

This is a parallel history, but one that deeply informs your own work. Jeremy Howard wasn't just building a tool; he was proposing a whole new way of developing software.

* `"Jeremy Howard literate programming nbdev"`
* `"the problem with jupyter notebooks for software development"`
* `"nbdev sync notebooks to python modules"`
* `"fastai nbdev philosophy documentation tests"`

**What you'll discover:** Jeremy Howard's central thesis is that the separation between code, its documentation, and its tests is artificial and harmful. `nbdev` is his tool for a "literate programming" approach where the Jupyter Notebook is the single source of truth. From the notebook, `nbdev` can automatically generate the Python library modules, the documentation, and the tests. It's a direct assault on the "hidden state" and reproducibility problems of notebooks, attempting to make them a first-class citizen in the world of serious software engineering.

By chasing down these threads, you will piece together the complete, colorful, and often frustrating history of why Python packaging is the way it is. Happy digging!

---

**Gemini Deep Research**:

# **From Chaos to Convention: A Narrative History of the Python Packaging Ecosystem**

## **Section 1: The Genesis of Distribution (1998-2003)**

The story of Python packaging is a chronicle of necessity, ingenuity, and the gradual, often painful, forging of standards from a landscape of ad-hoc solutions. In the earliest days of Python, sharing code was a fundamentally manual process, reliant on little more than archiving source files and providing instructions for where to place them. The first significant step toward a standardized infrastructure was the creation of the `site` module in 1996, which established the concept of a common directory—`site-packages`—where third-party libraries should reside.1 This simple convention laid the conceptual groundwork for an entire ecosystem, but the mechanisms for populating that directory were still undefined.

### **The distutils Mandate: Standardization Through Execution**

The need for a more robust and uniform method of distributing and installing code led to the formation of the Distutils Special Interest Group (SIG) in late 1998.2 From this group's efforts, spearheaded by Greg Ward, emerged `distutils`, a foundational library that would define the core paradigms of Python packaging for nearly two decades. In 2000, `distutils` was integrated into the Python standard library with the release of Python 1.6, cementing its status as the official, sanctioned tool for the job.2

The central innovation of `distutils` was the `setup.py` script.
By convention, this file, placed at the root of a project, contained a call to a `setup()` function, which accepted arguments defining the project's metadata—such as its name, version, and the packages it contained.2 This script was not merely a static configuration file; it was an executable Python program. This design choice was both pragmatic and, in retrospect, fateful. It provided package authors with the full power and flexibility of Python to define their build and installation logic. This was particularly crucial for the complex but essential task of compiling C extensions, a process `distutils` was explicitly designed to support.2 A developer could run commands like `python setup.py sdist` to create a source tarball or `python setup.py install` to place the package into the local `site-packages` directory.2

This decision to use an executable script for configuration, however, planted the seeds of future challenges. It established an imperative model where determining a package's properties, including its dependencies, required executing arbitrary code. This approach, while natural for a scripting language, would later be recognized as a significant source of security vulnerabilities and tooling complexity. The very flexibility it offered meant that any tool wishing to inspect a package had to be prepared to run a piece of untrusted code downloaded from the internet—a practice that directly contradicts modern security principles.6 The convenience of the year 2000 would become the critical security flaw of the 2020s, a foundational problem that would take the community nearly two decades to fully address through a shift to declarative, static configuration formats.
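To make that era tangible, here is a minimal sketch of what a `distutils`-style `setup.py` looked like. The project name, version, and package list are invented placeholders, not taken from any real project:

```python
# setup.py -- a minimal distutils-era sketch (the module still ships
# with older Pythons; modern projects use setuptools or pyproject.toml).
from distutils.core import setup

setup(
    name='example-project',            # hypothetical project name
    version='0.1',
    description='An example package',
    packages=['example_project'],      # directories with __init__.py
)
```

Running `python setup.py sdist` against a file like this produced the source tarball, and `python setup.py install` executed the script, with all the power and peril that entailed.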
### **The Centralization Imperative: The Need for a "Cheese Shop"**

While `distutils` provided the *how* of packaging, it did not address the *where*. In its early days, the Python world lacked a central, trusted repository for discovering and downloading packages.3 This fragmentation hindered the growth of the third-party library ecosystem. To address this, the Catalog SIG was formed in 2000, with the initial goal of standardizing the metadata that any such catalog would need to index.2 This effort produced PEP 241 in 2001, which defined a standard for package metadata.2

With a metadata standard in place, the path was clear for a centralized index. In November 2002, Richard Jones authored PEP 301, "Package Index and Metadata for Distutils," which formally proposed the creation of what would become the Python Package Index (PyPI).2 The core motivation was to provide a simple, official discovery mechanism for the growing number of Python modules. Hosting the index within the python.org domain was seen as crucial for giving it an "air of legitimacy" that previous, unofficial cataloging efforts had lacked.3 In 2003, the Python Package Index was launched, and `distutils` was extended to allow packages and their metadata to be uploaded to it.2

In a nod to the culture of the early Python community, which was heavily influenced by the British comedy troupe, PyPI was affectionately nicknamed the "Cheese Shop." This moniker is a direct reference to a famous Monty Python sketch, a piece of trivia that speaks to the playful and community-oriented spirit of the language's development.2 The launch of PyPI marked the end of the genesis period; Python now had both a standardized mechanism for building packages and a central repository for sharing them.

## **Section 2: The Setuptools Revolution and Its Discontents (2004-2008)**

The foundational layer established by `distutils` and PyPI was functional but incomplete. The most significant missing piece was a standardized way for a package to declare its dependencies on other packages. This critical shortcoming was the primary impetus for the next major evolutionary leap in Python packaging: the creation of `setuptools`.

### **Beyond distutils: The Problem of Dependencies**

The `distutils` framework provided no mechanism for specifying that one package required another to function.1 This meant that installing a complex library often involved a frustrating manual process of identifying, downloading, and installing each of its dependencies in the correct order. In 2004, Phillip Eby released `setuptools` as a powerful, third-party extension to solve this problem and many others.1

`setuptools` was not a replacement for `distutils` but an enhancement. It achieved its functionality by "monkey-patching" the standard library module, a technique where a library modifies another library's code at runtime.13 This allowed `setuptools` to extend the `setup()` function with a host of new arguments. The most important of these was `install_requires`, which allowed a developer to list a package's dependencies as a simple list of strings.1 Beyond this crucial feature, `setuptools` introduced a vast array of advanced capabilities, including support for namespace packages, optional dependencies (via `extras_require`), automatic discovery of packages in a source tree, and even the ability to find packages by scraping web pages.2
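A sketch of how that changed the `setup.py` shown earlier; again, the names and version pins are illustrative:

```python
# setup.py -- the same sketch, upgraded to setuptools conventions.
from setuptools import setup, find_packages

setup(
    name='example-project',
    version='0.2',
    packages=find_packages(),        # automatic package discovery
    install_requires=[               # the crucial new argument:
        'requests>=2.0',             # dependencies, declared as strings
    ],
    extras_require={                 # optional dependency groups
        'dev': ['pytest'],
    },
)
```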
### **easy_install and the egg: A Flawed First Step in Automation**

To leverage the new dependency metadata, `setuptools` introduced `easy_install`, Python's first widely adopted, automated package installer.5 When a user ran `easy_install` on a package, the tool would connect to PyPI, download the package, and then recursively find, download, and install all the dependencies listed in its `install_requires` field.1

`easy_install` also popularized the `.egg` distribution format. An egg was a binary distribution packaged as a single zip file, containing the project's code, resources, and metadata.2 Eggs could be imported directly from their zipped format, which was intended to simplify deployment. However, the `.egg` format was never formalized through a PEP and had significant technical drawbacks. It included version-specific `.pyc` files, which made eggs less portable across different Python versions, and its design made uninstallation notoriously difficult.1

### **A Fork in the Road: The distribute Schism**

By 2008, `setuptools` had become an indispensable part of the Python development infrastructure. Its success, however, also became a liability. Development was heavily centralized around its creator, Phillip Eby, whose availability had decreased over time.2 As a result, the project's development stalled. A growing backlog of bug fixes and feature requests from the community went unaddressed, with the most pressing issue being the lack of support for the upcoming Python 3.2

This situation created a classic open-source governance crisis. A critical piece of community infrastructure was dependent on a single point of failure. The community, needing to move forward, resorted to a standard, if contentious, solution: a fork. After attempts to open up the `setuptools` project itself failed, a group of prominent developers led by Tarek Ziadé forked the codebase in 2008 to create a new project named `distribute`.1 The explicit goal of `distribute` was to be a community-maintained, drop-in replacement for `setuptools`, allowing existing projects to switch over with minimal changes while benefiting from active maintenance and, crucially, progress on Python 3 support.2

The fork caused considerable friction and debate within the Python community but ultimately served its purpose by breaking the development logjam. The two projects would coexist for several years before the changes and community momentum from `distribute` were eventually merged back into a revitalized `setuptools` project in 2013.5 This episode was a key catalyst in the eventual formation of the Python Packaging Authority (PyPA), a neutral governance body created to steward key packaging projects and prevent such a crisis from recurring.8

## **Section 3: The Modern Toolchain Emerges (2008-2012)**

The `setuptools` era, with its powerful but complex and sometimes fragile tools, set the stage for a new wave of innovation. The period between 2008 and 2012 saw the emergence of a more modular and robust set of tools, each created as a direct response to the shortcomings of its predecessors. This "toolbox" approach, heavily influenced by the Unix philosophy of small, interoperable programs that do one thing well, formed the foundation of the modern Python developer experience.

### **Fixing Installation: The Birth of pip**

While `easy_install` was a revolutionary step in automating dependency management, its day-to-day use was fraught with problems. It often installed packages as opaque, zipped `.egg` files, which were difficult to inspect. Its uninstallation capabilities were nonexistent, forcing users to manually delete files and edit path configuration files.14 Furthermore, because it installed packages as they were downloaded, a network failure could leave a system in a broken, partially-installed state.1

In 2008, Ian Bicking, a key figure in the Python packaging community, created `pip` (a recursive acronym for "Pip Installs Packages") with the explicit goal of fixing these shortcomings.1 `pip` introduced several key innovations that made it a superior alternative:

* **Reliable Uninstallation:** `pip` tracked every file it installed, allowing it to provide a clean and reliable uninstall command.1
* **Atomic Installations:** It first downloaded all required packages into a cache before beginning the installation process. This ensured that a failed download would not result in a partially-completed installation.1
* **Environment Reproducibility:** It introduced the concept of `requirements.txt` files. This simple text file format allowed developers to list their project's dependencies and their exact versions, providing a declarative and reliable way to replicate a development environment across different machines.1
* **"Flat" Installs:** Instead of using `.egg` archives, `pip` installed packages as regular directories within `site-packages`. This "flat" installation made it much easier for developers to browse the source code, debug issues, and understand what was installed on their system.1

Following the short sketch below, a table provides a direct comparison of the features and philosophies of `easy_install` and its successor, `pip`.
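The `requirements.txt` idea is simple enough to show in full; the package names and version pins here are invented for illustration:

```text
# requirements.txt -- a declarative snapshot of an environment.
# Exact pins make the environment reproducible on another machine.
requests==2.31.0
lxml==4.9.3
```

Recreating the environment elsewhere then becomes a single command: `pip install -r requirements.txt`.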
| Feature | easy_install (The Monolith) | pip (The Toolbox) |
| :---- | :---- | :---- |
| **Uninstallation** | Not supported; requires manual deletion of files.22 | Fully supported via pip uninstall.1 |
| **Installation Atomicity** | Installs packages as they download; can result in partial installs.1 | Downloads all packages first, then installs, preventing partial states.1 |
| **Environment Replication** | No standardized mechanism.26 | requirements.txt files provide a simple, declarative standard.1 |
| **Package Format** | Installs from and creates .egg archives, which can be opaque.22 | Installs packages "flat" as standard directories; does not install from eggs.1 |
| **Dependency Resolution** | Basic, can struggle with complex or conflicting dependencies.22 | More advanced dependency resolution, with clearer output.1 |
| **Virtual Environments** | Limited integration; requires activation before use.22 | First-class support; designed to work seamlessly with isolated environments.22 |
| **Core Philosophy** | An integrated part of the setuptools suite, handling many tasks.14 | A focused, standalone tool for reliably installing and managing packages.1 |

### **Solving the Environment Problem: virtualenv**

The proliferation of third-party packages created another significant problem: "dependency hell." As projects grew, it became common for two different applications on the same system to require conflicting versions of the same library. Installing all packages into a single, global `site-packages` directory was no longer a scalable solution.

Ian Bicking also pioneered the solution to this challenge. His work on virtual-python and workingenv culminated in the release of `virtualenv` in 2007.2 `virtualenv` is a tool for creating isolated Python environments. Each environment contains its own Python interpreter and its own `site-packages` directory, completely independent of the system-wide Python installation and any other virtual environments.2 This allowed developers to manage dependencies on a per-project basis, eliminating conflicts and making development and deployment far more predictable. The combination of `pip` and `virtualenv` became the de facto standard for Python development, a pairing so essential that the concept was eventually incorporated into the Python standard library as the `venv` module in PEP 405.8 A typical session with the modern stdlib incarnation is sketched below.
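This sketch uses today's `venv` module rather than the original `virtualenv` tool, and the installed package is a placeholder:

```bash
# Create an isolated environment in the .venv directory.
python -m venv .venv

# Activate it (POSIX shells; Windows uses .venv\Scripts\activate).
source .venv/bin/activate

# Installs now land in .venv, not the global site-packages.
pip install requests

# Record the environment so it can be reproduced elsewhere.
pip freeze > requirements.txt
```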
### **A Superior Binary Format: The wheel**

The final piece of the modern toolchain to fall into place was a replacement for the problematic `.egg` format. While eggs had served a purpose, their lack of a formal specification, inclusion of `.pyc` files, and difficult installation process made them a source of ongoing issues.1 In 2012, Daniel Holth proposed a new binary package format called "wheel" in PEP 427.1 The wheel format was designed from the ground up to be a superior distribution format, addressing the key flaws of the egg.

| Feature | .egg Format | .whl (wheel) Format |
| :---- | :---- | :---- |
| **Standardization** | No official PEP; behavior defined by setuptools implementation.15 | Officially specified in PEP 427, providing a clear standard for all tools.15 |
| **Installation Process** | Can be importable at runtime; installation can be complex.15 | A pure distribution format; installation is a fast and simple file copy operation.15 |
| **Compiled Code** | Handled binary distributions but with less standardized naming. | Designed for pre-compiled extensions, speeding up installation for users.15 |
| **Compatibility Tagging** | Limited naming convention.15 | Rich filename convention with tags for Python version, ABI, and platform.15 |
| **Bytecode (.pyc) Files** | Included .pyc files, making them Python-version specific.1 | Intentionally excludes .pyc files, allowing for "universal" wheels for pure-Python code.15 |
| **Metadata Directory** | Used a non-standard .egg-info directory.15 | Uses the standard .dist-info directory, compatible with modern tooling.15 |

Wheels provided a faster, more reliable, and more interoperable way to distribute pre-built packages. For end-users, this meant that installing complex scientific packages with compiled C or Fortran extensions could be reduced from a lengthy, error-prone compilation process to a near-instantaneous download and copy. The introduction of the wheel format marked a significant step in the maturation of the Python packaging ecosystem, and PyPI quickly became populated with wheels for a vast number of packages. In August 2023, PyPI officially stopped accepting new egg uploads, marking the final end of the format's reign.15

## **Section 4: Security, Standardization, and the Declarative Future (2013-Present)**

With a robust set of community-developed tools like `pip`, `virtualenv`, and wheel in place, the focus of the Python packaging ecosystem shifted toward maturation. This era has been defined by two major trends: a concerted effort to address long-standing security vulnerabilities and a monumental push to replace imperative, ad-hoc configuration with formal, declarative standards. This work, largely coordinated by the Python Packaging Authority (PyPA), has culminated in a comprehensive architectural redesign of how Python projects are defined and built.

### **Securing the Gates: The Necessity of Twine**

For many years, the standard method for publishing a package to PyPI was the `python setup.py upload` command, provided by `distutils` and `setuptools`. This simple command hid two critical vulnerabilities. First, older versions of Python would transmit a user's PyPI username and password over unencrypted HTTP, making them susceptible to interception and theft.29 While this was eventually fixed in newer Python releases, the risk remained for anyone using an older, unpatched environment.

Second, the command tightly coupled the build process with the upload process. A developer would run a single command that would build the distribution artifacts and immediately upload them. This workflow provided no opportunity to inspect or test the exact files that were being published to the public repository, creating a risk of accidentally releasing a broken or incomplete package.29

To solve these problems, Twine was created as a standalone, dedicated utility for securely uploading packages to PyPI.32 Twine addresses the security and usability issues of `setup.py upload` in two key ways:

1. **Secure Connections:** It enforces the use of verified HTTPS (TLS) for all communication with the package index, protecting user credentials from being sniffed in transit.29
2. **Decoupled Workflow:** It is purely an upload tool. The developer first builds the distribution files (e.g., `python -m build`), creating sdist and wheel files in a `dist/` directory. They can then inspect, validate, and test these exact files before using Twine to upload them (`twine upload dist/*`).30 This separation of concerns makes the release process significantly more robust and secure.
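That decoupled workflow fits in three commands. This sketch assumes the `build` and `twine` packages are installed (`pip install build twine`); the project is whatever lives in the current directory:

```bash
# Step 1: build the artifacts; an sdist and a wheel land in dist/.
python -m build

# Step 2: sanity-check the exact files that will be published.
twine check dist/*

# Step 3: upload over verified HTTPS, as a separate, deliberate act.
twine upload dist/*
```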
### **The pyproject.toml Reformation: From Execution to Declaration**

The most profound shift in modern Python packaging has been the move away from the executable `setup.py` file to a static, declarative configuration file: `pyproject.toml`. This transition was not a single event but a multi-year process carried out through a series of landmark Python Enhancement Proposals (PEPs).

The core motivation was to solve the fundamental problems inherent in an executable configuration format. The most severe was the security risk of arbitrary code execution; simply to determine a package's build dependencies, an installer like `pip` had to execute its `setup.py` script.7 This also created a logistical "Catch-22": a tool could not know the dependencies required to run `setup.py` without first running `setup.py`.33 This architectural flaw was resolved by the following PEPs:

* **PEP 518 (2016):** This PEP introduced the `pyproject.toml` file format (based on TOML) and defined a `[build-system]` table within it. This table allows a project to declaratively specify its build-time dependencies (e.g., `requires = ["setuptools>=61.0", "wheel"]`). This broke the Catch-22, enabling tools to install the necessary build environment *before* attempting to execute any build logic.33
* **PEP 517 (2017):** With build dependencies now declaratively available, this PEP defined a standardized API for build backends. It specified a set of "hooks" (like `build_wheel` and `build_sdist`) that a build tool must provide. This decoupled the frontend (the tool invoking the build, like `pip`) from the backend (the tool performing the build, like `setuptools`). This finally broke the de facto monopoly of `setuptools` and allowed for the creation of alternative build systems like Flit and Poetry.34
* **PEP 621 (2020):** This was the final piece of the puzzle. It standardized a `[project]` table within `pyproject.toml` for specifying the vast majority of a project's core metadata—name, version, description, dependencies, entry points, etc.—in a static, tool-agnostic format. This made it possible to define a complete package without needing a `setup.py` or `setup.cfg` file at all.34

As a result of this reformation, the old python setup.py \