{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "#### Neural next-step prediction | part 5: `llmsuggest` co-pilot \n", "Tutorial on neural theorem proving\\\n", "Author: Sean Welleck\n", "\n", "----------------" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "#### High-level goal\n", "\n", "Finally, we will see how our neural model can act as a helpful \"co-pilot\" when we are writing proofs. \\\n", "We will make an interactive tool that uses our neural next-step model to suggest next-steps in VSCode.\n", "\n", "Concretely, we'll create a `llmsuggest` tactic (in essence, a function) that displays generated suggestions in VSCode. `llmsuggest` is a minimal version of the [**llmstep** [Welleck & Saha 2023]](https://github.com/wellecks/llmstep) tactic, aimed at learning and building off of.\n", "\n", "Here is a preview of `llmsuggest`:" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "<img src=\"images/llmsuggest/llmsuggest.gif\" />\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "On the top, the user entered `llmsuggest`, then suggestions from our next-step prediction model appear in the Lean Infoview. Clicking a suggestion adds it to the proof." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "-----------------------\n", "\n", "### High-level approach\n", "\n", "Implementing `llmsuggest` involves three components:\n", "\n", "1. A Lean *tactic* that sends the current state to a Python script.\n", "2. A Python script that sends the current state to a server via a POST request.\n", "3. A Python server that runs our next-step suggestion model on the current state.\n", "\n", "The suggestions (3) are sent back to (2), and the tactic (1) displays the result in VSCode.\n", "\n", "\n", "### Implementing `llmsuggest`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 1. 
Tactic\n", "\n", "At a technical level, the proofs we have seen are actually sequences of *tactics*. \n", "For instance, `intro` is a tactic and `rw [...]` is a tactic. In general, a *tactic* is a Lean program that manipulates a state. \n", "\n", "To build a new tactic, we use *Lean metaprogramming*, which gives us tools to define new syntax, access the proof state, and more. \\\n", "`llmsuggest` only requires basic metaprogramming. To learn more, see the [Lean 4 metaprogramming book](https://github.com/leanprover-community/lean4-metaprogramming-book/tree/master).\n", "\n", "`llmsuggest` is implemented in `ntp_lean/LLMsuggest.lean`. The main definition specifies the syntax (i.e., `\"llmsuggest\"`), then defines the tactic. \\\n", "You can see below that the tactic gets the main goal (the \"tactic state\"), pretty-prints it, and converts it to a string.\n", "Then it runs a `runSuggest` function, and passes the output to an `addSuggestions` function:\n", "\n", "```lean\n", "-- `llmsuggest` tactic.\n", "syntax \"llmsuggest\" : tactic\n", "elab_rules : tactic\n", " | `(tactic | llmsuggest%$tac) =>\n", " Lean.Elab.Tactic.withMainContext do\n", " let goal ← Lean.Elab.Tactic.getMainGoal\n", " let ppgoal ← Lean.Meta.ppGoal goal\n", " let ppgoalstr := toString ppgoal\n", " let suggest ← runSuggest #[ppgoalstr]\n", " addSuggestions tac $ suggest.splitOn \"[SUGGESTION]\"\n", "```\n", "\n", "The `runSuggest` function calls a Python script (step 2 above), and the `addSuggestions` uses a Lean widget to display the results in VSCode. \\\n", "We won't look at these in detail, but please see `ntp_lean/LLMsuggest.lean` if you are curious. \\\n", "Hopefully with a small amount of effort, you can modify the tactic or make your own in the future.\n", "\n", "\n", "#### 2. Python script\n", "\n", "The `runSuggest` function in the tactic calls a Python script, `ntp_python/llmsuggest/suggest.py`. 
It passes the current tactic state as a command line argument.\\\n", "The script is simple: it sends a POST request containing the current tactic state to a server:\n", "\n", "```python\n", "import http.client\n", "import json\n", "import sys\n", "\n", "def suggest(tactic_state):\n", "    conn = http.client.HTTPConnection(\"localhost\", 5000)\n", "    headers = {'Content-type': 'application/json'}\n", "    body = json.dumps({\"tactic_state\": tactic_state})\n", "    conn.request(\"POST\", \"/\", body, headers)\n", "    response = conn.getresponse()\n", "    data = response.read()\n", "    data_dict = json.loads(data)\n", "    print('[SUGGESTION]'.join(data_dict['suggestions']))\n", "    conn.close()\n", "\n", "if __name__ == \"__main__\":\n", "    suggest(sys.argv[1])\n", "```\n", "\n", "After receiving the suggestions, the script prints them, and the printed output is read back by the `runSuggest` function.\n", "\n", "#### 3. Server\n", "Finally, in `ntp_python/llmsuggest/server.py` we define a web server that handles the POST request and hosts the language model.\n", "Specifically, the server initializes our language model and uses it to\n", "generate suggestions given a tactic state received in a POST request.\n", "```python\n", "model = transformers.GPTNeoXForCausalLM.from_pretrained('wellecks/llmstep-mathlib4-pythia2.8b')\n", "\n", "def generate ...\n", "\n", "app = Flask(__name__)\n", "\n", "@app.route('/', methods=['POST'])\n", "def process_request():\n", "    data = request.get_json()\n", "    tactic_state = data.get('tactic_state')\n", "    prompt = \"\"\"[GOAL]%s[PROOFSTEP]\"\"\" % (tactic_state)\n", "    texts = generate(prompt, args.num_samples)\n", "    response = {\"suggestions\": texts}\n", "    return jsonify(response)\n", "\n", "if __name__ == '__main__':\n", "    app.run(host='0.0.0.0', port=args.port)\n", "```\n", "\n", "This server is minimal; one can imagine adding several features."
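, "\n", "To make the round trip concrete, here is a small, self-contained sketch of the message formats the three components exchange. The goal string and suggestions below are made up for illustration, and the code mirrors (rather than imports) the logic in `suggest.py` and `server.py`:\n", "\n", "```python\n", "import json\n", "\n", "# What suggest.py (component 2) sends: a JSON body with the tactic state.\n", "tactic_state = '⊢ 1 + 1 = 2'  # hypothetical goal, for illustration\n", "request_body = json.dumps({'tactic_state': tactic_state})\n", "\n", "# What server.py (component 3) does: parse the request and build the\n", "# [GOAL]...[PROOFSTEP] prompt format that the model was fine-tuned on.\n", "data = json.loads(request_body)\n", "prompt = '[GOAL]%s[PROOFSTEP]' % data.get('tactic_state')\n", "\n", "# Stand-in for the model's generations (the real server calls `generate`).\n", "suggestions = ['norm_num', 'simp', 'rfl']\n", "response_body = json.dumps({'suggestions': suggestions})\n", "\n", "# What suggest.py prints, and what the Lean tactic (component 1) splits on.\n", "printed = '[SUGGESTION]'.join(json.loads(response_body)['suggestions'])\n", "print(prompt)   # [GOAL]⊢ 1 + 1 = 2[PROOFSTEP]\n", "print(printed)  # norm_num[SUGGESTION]simp[SUGGESTION]rfl\n", "```"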
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Running `llmsuggest`\n", "\n", "To run `llmsuggest`, first start the server:\n", "```bash\n", "python python/server.py\n", "```\n", "\n", "Then open `ntp_lean/LLMsuggest.lean` in VS Code and try out `llmsuggest`. There are some example theorems and proofs at the bottom of the page:\n", "\n", "<img src=\"images/llmsuggest/llmsuggest_examples.png\" alt=\"\" width=\"450\" />" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "-----------------------\n", "\n", "### `llmstep`: [L]LM proofstep suggestions in Lean\n", "\n", "[`llmstep`](https://github.com/wellecks/llmstep) is an expanded version of the `llm_suggest` tactic: https://github.com/wellecks/llmstep\n", "\n", "`llmstep` includes features such as:\n", "1. **Type checking**: suggestions are checked by Lean and marked as completing a proof, valid, or invalid (but still possibly useful).\n", "2. **Prefixed generation**: e.g. `llmstep \"exact\"` returns suggestions that start with `\"exact\"`\n", "3. **Fast inference**: fast inference via [PagedAttention](https://vllm.ai/) for near real-time suggestions\n", "4. **Other models**: support for other models, e.g. `llmstep-llama2`\n", "\n", "Here's an example of using `llmstep`:\n", "\n", "<img src=\"images/llmsuggest/llmstep_gif.gif\" alt=\"\" width=\"400\" />\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first invocation (`llmstep \"\"`) gives 5 suggestions, with `intro h n` and `intro h` outlined in blue since they type check.\n", "\n", "The second invocation (`llmstep \"exact\"`) gives suggestions that start with `exact`. The first three are outlined in green since they complete the proof." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "----------\n", "\n", "## Next steps\n", "\n", "This concludes part 1 of the tutorial. 
We have seen how to build a neural next-step suggestion tool from scratch: collecting data, learning a model, measuring performance with proof search and evaluation sets, and deploying the model as an interactive tactic.\n", "\n", "In part 2, we will look at a generalization called language cascades, in which a language model implements a \"function\" that does more than predict the next step. We will see example cascades for drafting informal proofs, sketching the high-level structure of a proof, and refining proofs." ] }, { "cell_type": "markdown", "metadata": {}, "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.11" } }, "nbformat": 4, "nbformat_minor": 4 }