{ "cells": [ { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "=====================\n", "Working with Matchers\n", "=====================\n", "Matchers provide a flexible way of comparing LibCST nodes in order to build ", "more complex transforms. See :doc:`Matchers ` for the complete ", "documentation.\n", "\n", "Basic Matcher Usage\n", "===================\n", "Let's say you are visiting a LibCST :class:`~libcst.Call` node and you want ", "to know if all arguments provided are the literal ``True`` or ``False``. ", "You look at the documentation and see that ``Call.args`` is a sequence of ", ":class:`~libcst.Arg`, and each ``Arg.value`` is a :class:`~libcst.BaseExpression`. ", "In order to verify that each argument is either ``True`` or ``False`` you ", "would have to first loop over ``node.args``, and then check ", "``isinstance(arg.value, cst.Name)`` for each ``arg`` in the loop before ", "finally checking ``arg.value.value in (\"True\", \"False\")``. \n", "\n", "Here's a short example of that in action:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "nbsphinx": "hidden" }, "outputs": [], "source": [ "import sys\n", "sys.path.append(\"../../\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import libcst as cst\n", "\n", "def is_call_with_booleans(node: cst.Call) -> bool:\n", " for arg in node.args:\n", " if not isinstance(arg.value, cst.Name):\n", " # This can't be the literal True/False, so bail early.\n", " return False\n", " if cst.ensure_type(arg.value, cst.Name).value not in (\"True\", \"False\"):\n", " # This is a Name node, but not the literal True/False, so bail.\n", " return False\n", " # We got here, so all arguments are literal boolean values.\n", " return True\n" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "We can see from a few examples that this does work as intended. ", "However, it is an awful lot of boilerplate that was fairly cumbersome to write.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "call_1 = cst.Call(\n", " func=cst.Name(\"foo\"),\n", " args=(\n", " cst.Arg(cst.Name(\"True\")),\n", " ),\n", ")\n", "is_call_with_booleans(call_1)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "call_2 = cst.Call(\n", " func=cst.Name(\"foo\"),\n", " args=(\n", " cst.Arg(cst.Name(\"None\")),\n", " ),\n", ")\n", "is_call_with_booleans(call_2)\n" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Let's try to do a bit better with matchers. We can make a better function ", "that takes advantage of matchers to get rid of both the instance check and ", "the ``ensure_type`` call, like so:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import libcst.matchers as m\n", "\n", "def better_is_call_with_booleans(node: cst.Call) -> bool:\n", " for arg in node.args:\n", " if not m.matches(arg.value, m.Name(\"True\") | m.Name(\"False\")):\n", " # Oops, this isn't a True/False literal!\n", " return False\n", " # We got here, so all arguments are literal boolean values.\n", " return True\n" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "This is a lot shorter and is easier to read as well! We made use of the ", "fact that matchers handles instance checking for us in a safe way. We also ", "made use of the fact that matchers allows us to concisely express multiple ", "match options with the use of Python's or operator. We can also see that ", "it still works on our previous examples:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "better_is_call_with_booleans(call_1)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "better_is_call_with_booleans(call_2)\n" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "We still have one more trick up our sleeve though. Matchers don't just ", "allow us to specify which attributes we want to match on exactly. It ", "also allows us to specify rules for matching sequences of nodes, like ", "the list of :class:`~libcst.Arg` nodes that appears in :class:`~libcst.Call`. ", "Let's make use of that, turning our original ``is_call_with_booleans`` ", "function into a call to :func:`~libcst.matchers.matches`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def best_is_call_with_booleans(node: cst.Call) -> bool:\n", " return m.matches(\n", " node,\n", " m.Call(\n", " args=(\n", " m.ZeroOrMore(m.Arg(m.Name(\"True\") | m.Name(\"False\"))),\n", " ),\n", " ),\n", " )\n" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "We've turned our original function into a single call to ", ":func:`~libcst.matchers.matches`. As an added benefit, the match node can ", "be read from left to right in a way that makes sense in english: \"match ", "any call with zero or more arguments that are the literal ``True`` or ", "``False``\". As we can see, it works as intended: " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "best_is_call_with_booleans(call_1)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "best_is_call_with_booleans(call_2)\n" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Matcher Decorators\n", "==================\n", "You can already do a lot with just :func:`~libcst.matchers.matches`. It ", "lets you define the shape of nodes you want to match and LibCST takes ", "care of the rest. However, you still need to include a lot of boilerplate ", "into your :ref:`libcst-visitors` in order to identify which nodes you care ", "about. Matcher :ref:`libcst-matcher-decorators` help reduce that boilerplate.\n", "\n", "Say you wanted to invert the boolean literals in functions which ", "match the above ``best_is_call_with_booleans``. You could build something ", "that looks like the following:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class BoolInverter(cst.CSTTransformer):\n", " def __init__(self) -> None:\n", " self.in_call: int = 0\n", "\n", " def visit_Call(self, node: cst.Call) -> None:\n", " if m.matches(node, m.Call(args=(\n", " m.ZeroOrMore(m.Arg(m.Name(\"True\") | m.Name(\"False\"))),\n", " ))):\n", " self.in_call += 1\n", "\n", " def leave_Call(self, original_node: cst.Call, updated_node: cst.Call) -> cst.Call:\n", " if m.matches(original_node, m.Call(args=(\n", " m.ZeroOrMore(m.Arg(m.Name(\"True\") | m.Name(\"False\"))),\n", " ))):\n", " self.in_call -= 1\n", " return updated_node\n", "\n", " def leave_Name(self, original_node: cst.Name, updated_node: cst.Name) -> cst.Name:\n", " if self.in_call > 0:\n", " if updated_node.value == \"True\":\n", " return updated_node.with_changes(value=\"False\")\n", " if updated_node.value == \"False\":\n", " return updated_node.with_changes(value=\"True\")\n", " return updated_node\n" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "We can try it out on a source file to see that it works:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "source = \"def some_func(*params: object) -> None:\\n pass\\n\\nsome_func(True, False)\\nsome_func(1, 2, 3)\\nsome_func()\\n\"\n", "module = cst.parse_module(source)\n", "print(source)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "new_module = module.visit(BoolInverter())\n", "print(new_module.code)\n" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "While this works its not super elegant. We have to track where we are in ", "the tree so we know when its safe to invert boolean literals which means ", "we have to create a constructor and we have to duplicate matching logic. ", "We could refactor that into a helper like the ``best_is_call_with_booleans`` ", "above, but it only makes things so much better.\n", "\n", "So, let's try rewriting it with matcher decorators instead. Note that this ", "includes changing the class we inherit from to ", ":class:`~libcst.matchers.MatcherDecoratableTransformer` in order to enable ", "the matcher decorator feature:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class BetterBoolInverter(m.MatcherDecoratableTransformer):\n", " @m.call_if_inside(m.Call(args=(\n", " m.ZeroOrMore(m.Arg(m.Name(\"True\") | m.Name(\"False\"))),\n", " )))\n", " def leave_Name(self, original_node: cst.Name, updated_node: cst.Name) -> cst.Name:\n", " if updated_node.value == \"True\":\n", " return updated_node.with_changes(value=\"False\")\n", " if updated_node.value == \"False\":\n", " return updated_node.with_changes(value=\"True\")\n", " return updated_node\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "new_module = module.visit(BetterBoolInverter())\n", "print(new_module.code)\n" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Using matcher decorators we successfully removed all of the boilerplate ", "around state tracking! The only thing that ``leave_Name`` needs to concern ", "itself with is the actual business logic of the transform. However, it ", "still needs to check to see if the value of the node should be inverted. ", "This is because the ``Call.func`` is a :class:`~libcst.Name` in this case. ", "Let's use another matcher decorator to make that problem go away:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class BestBoolInverter(m.MatcherDecoratableTransformer):\n", " @m.call_if_inside(m.Call(args=(\n", " m.ZeroOrMore(m.Arg(m.Name(\"True\") | m.Name(\"False\"))),\n", " )))\n", " @m.leave(m.Name(\"True\") | m.Name(\"False\"))\n", " def invert_bool_literal(self, original_node: cst.Name, updated_node: cst.Name) -> cst.Name:\n", " return updated_node.with_changes(value=\"False\" if updated_node.value == \"True\" else \"True\")\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "new_module = module.visit(BestBoolInverter())\n", "print(new_module.code)\n" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "That's it! Instead of using a ``leave_Name`` which modifies all ", ":class:`~libcst.Name` nodes we instead created a matcher visitor that ", "only modifies :class:`~libcst.Name` nodes with the value of ``True`` or ", "``False``. We decorate *that* with :func:`~libcst.matchers.call_if_inside` ", "to ensure we run this on :class:`~libcst.Name` nodes found inside of ", "function calls that only take boolean literals. Using two matcher ", "decorators we got rid of all of the state management as well as all of the ", "cases where we needed to handle nodes we weren't interested in." ] } ], "metadata": { "celltoolbar": "Edit Metadata", "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 2 }