{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Structure and Schema of AuthoritySpoke Holdings" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This tutorial will show how to create and load objects representing legal Holdings in AuthoritySpoke." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To get ready, we need to repeat some setup steps we already saw in the `Introduction to AuthoritySpoke` guide. First, import the package." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import os\n", "from dotenv import load_dotenv\n", "\n", "import authorityspoke\n", "from authorityspoke.io.loaders import load_decision\n", "from authorityspoke.io.downloads import CAPClient\n", "\n", "load_dotenv()\n", "CAP_API_KEY = os.getenv('CAP_API_KEY')\n", "client = CAPClient(api_token=CAP_API_KEY)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Again, you have the choice of using either the real API clients or mockups that supply only the testing data for these examples." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "USE_REAL_CASE_API = False\n", "USE_REAL_LEGISLICE_API = False" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we can download the judicial decisions we're going to compare \n", "and convert the JSON responses from the API\n", "into :class:`authorityspoke.decisions.Decision` objects." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "import os\n", "from dotenv import load_dotenv\n", "load_dotenv()\n", "\n", "from authorityspoke import Decision, DecisionReading\n", "from authorityspoke.io.loaders import load_decision_as_reading\n", "\n", "if USE_REAL_CASE_API:\n", " CAP_API_KEY = os.getenv('CAP_API_KEY')\n", " \n", " oracle_decision = client.read_cite(\n", " cite=\"750 F.3d 1339\", \n", " full_case=True, \n", " api_key=CAP_API_KEY)\n", " \n", " lotus_decision = client.read_cite(\n", " cite=\"49 F.3d 807\", \n", " full_case=True, \n", " api_key=CAP_API_KEY)\n", " oracle = DecisionReading(oracle_decision)\n", " lotus = DecisionReading(lotus_decision)\n", "\n", "else:\n", " oracle = load_decision_as_reading(\"oracle_h.json\")\n", " lotus = load_decision_as_reading(\"lotus_h.json\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And we need a download `Client` for accessing legislative provisions." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "import json\n", "\n", "from authorityspoke.io.downloads import LegisClient\n", "from authorityspoke.io.fake_enactments import FakeClient\n", "\n", "if USE_REAL_LEGISLICE_API:\n", " \n", " LEGISLICE_API_TOKEN = os.getenv(\"LEGISLICE_API_TOKEN\")\n", " legis_client = LegisClient(api_token=LEGISLICE_API_TOKEN)\n", "\n", "else:\n", " legis_client = FakeClient.from_file(\"usc.json\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading Holdings from Existing JSON" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we're ready to look at the process of describing legal `Holding`s and loading that information into AuthoritySpoke. In version 0.6, although there's not yet a web interface for loading this data, there is an interface for loading JSON files, and there's an OpenAPI schema specification for the input data (see below). \n", "\n", "There are several interfaces for loading Authorityspoke objects in the `authorityspoke.io.loaders` and `authorityspoke.io.schemas_json` modules. One way to load data is to create a YAML document that contains a list of objects, where each object represents one Holding. Then you can load the Holdings into AuthoritySpoke objects using the `loaders.read_holdings_from_file` function." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "from authorityspoke.io.loaders import read_holdings_from_file\n", "\n", "oracle_holdings = read_holdings_from_file(\"holding_oracle.yaml\", client=legis_client)\n", "lotus_holdings = read_holdings_from_file(\"holding_lotus.yaml\", client=legis_client)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you want to open one of the input YAML files in your own text editor for comparison, you can find them in the folder `example_data/holdings/`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`holding_oracle.yaml` contains a list of holdings. These are places where the text of the _Oracle_ opinion endorses legal rules (or sometimes, rejects legal rules). Each of these rules is described procedurally, in terms of inputs and outputs. \n", "\n", "Each holding in the YAML input may also include an `anchors` field indicating where the holding can be found in the opinion. For instance, the first holding of _Oracle America v. Google_ is derived from the following sentence from the majority opinion:\n", "\n", "> By statute, a work must be “original” to qualify for copyright protection. 17 U.S.C. § 102(a).\n", "\n", "The `anchors` field doesn't do much yet in AuthoritySpoke version 0.6, but in future versions it'll help link each Holding to the relevant parts of the Opinion." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The Parts of a Holding as a Python Dictionary" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's look at the part of `holding_oracle.yaml` representing that first holding. This will convert the YAML file to a Python dictionary (with a structure similar to JSON), but won't yet load it as an AuthoritySpoke object." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'inputs': {'type': 'fact',\n", " 'content': '{the Java API} was an original work',\n", " 'truth': False},\n", " 'outputs': {'type': 'fact',\n", " 'content': 'the Java API was copyrightable',\n", " 'truth': False},\n", " 'mandatory': True,\n", " 'enactments': {'node': '/us/usc/t17/s102/a',\n", " 'exact': 'Copyright protection subsists, in accordance with this title, in original works of authorship fixed in any tangible medium of expression, now known or later developed, from which they can be perceived, reproduced, or otherwise communicated, either directly or with the aid of a machine or device.',\n", " 'name': 'copyright protection provision',\n", " 'anchors': ['qualify for copyright protection. |17 U.S.C. § 102(a)|.']},\n", " 'anchors': 'By statute, a work |must be “original” to qualify| for'}" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from authorityspoke.io.loaders import load_holdings\n", "\n", "holdings_to_read = load_holdings(\"holding_oracle.yaml\")\n", "holdings_to_read[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To compare the input data to the created Python objects, you can link the Holdings to the Opinions using the `.posit` method. As we look at the parts of the JSON file, the code cells will show how fields from the JSON affect the structure of the Holding object." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "the Holding to ACCEPT\n", " the Rule that the court MUST SOMETIMES impose the\n", " RESULT:\n", " the fact it was false that was copyrightable\n", " GIVEN:\n", " the fact it was false that was an original work\n", " GIVEN the ENACTMENT:\n", " \"Copyright protection subsists, in accordance with this title, in original works of authorship fixed in any tangible medium of expression, now known or later developed, from which they can be perceived, reproduced, or otherwise communicated, either directly or with the aid of a machine or device.…\" (/us/usc/t17/s102/a 2013-07-18)\n" ] } ], "source": [ "oracle.posit(oracle_holdings)\n", "lotus.posit(lotus_holdings)\n", "\n", "print(oracle.holdings[0])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This Holding means that according to the cited enactment, if it's false that \"the Java API was an original work\", then it's mandatory for the court to find it to be false that \"the Java API was copyrightable\".\n", "\n", "The JSON file represented these Factors inside an \"inputs\" field (labeled as the \"GIVEN\" Factors when you print the Holding object) and an \"outputs\" field (labeled as \"RESULT\" Factors). Inputs are the preconditions for applying the Holding, and outputs are the results. Not shown here, Rules can also have \"despite\" Factors, which are Factors that don't need to be present to trigger the rule, but that don't prevent the rule from applying if they're present. There can be more than one Factor in the \"inputs\", \"outputs\" or \"despite\" categories, and if so they would be listed together in square brackets in the JSON." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "the fact it was false that was an original work\n" ] } ], "source": [ "print(oracle.holdings[0].inputs[0])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The curly brackets around `{the Java API}` indicate that the parser should consider that phrase to be a reference to an Entity object, which becomes one of the input's `terms`. If such an object hasn't been referenced before in the file, it will be created." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(Entity(name='the Java API'),)\n" ] } ], "source": [ "print(oracle.holdings[0].inputs[0].terms)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The JSON representation of a Rule can also have \"mandatory\" and \"universal\" fields. If omitted, the values of these fields are implied as False. \"universal\" means that the Rule applies whenever its inputs are present. \"mandatory\" means that when Rule applies, the court has no discretion and must accept the outputs." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "True\n" ] } ], "source": [ "print(oracle.holdings[0].mandatory)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The JSON can also contain fields representing Enactments. It identifies a passage of legislative text with a [United States Legislative Markup](https://github.com/usgpo/uslm) identifier that shows the \"path\" to the text. In this case, \"us\" refers to the jurisdiction (the US federal government), \"usc\" refers to the Code (the United States Code), \"t17\" specifies Title 17 of the United States Code, \"s102\" specifies Section 102 of Title 17, and \"a\" specifies subsection (a) of Section 102. If the relevant passage is less than the entire section or subsection, an \"exact\" field can identify the full text of the passage or \"prefix\" and \"suffix\" fields can be used to the phrase by what comes immediately before or after it. You don't need to include \"prefix\" and \"suffix\" if you're sure the phrase you're trying to select only occurs once in the statute subdivision you've cited. Alternatively, a passage can be saved as a `text` field with pipe characters that split it into three parts for \"prefix\", \"exact\", and \"suffix\" fields.\n", "\n", "For instance, to get just the phrase \"original works of authorship\", we could have included the following field in the JSON when loading:\n", "```\n", "\"text\": \"in accordance with this title, in|original works of authorship|fixed\"\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also select that same string to change the Enactment's selected text after loading the Enactment:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "to_select = \"in accordance with this title, in|original works of authorship|fixed\"\n", "oracle.holdings[0].enactments[0].select(to_select)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And we can use the `selected_text` method to verify that the Enactment's selected text has changed." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'…original works of authorship…'" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "oracle.holdings[0].enactments[0].selected_text()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The \"name\" field is a nickname that can be used to refer to the passage again later in the same file. For any Factor or Enactment object, you can add a \"name\" field and assign a unique string value as the name. If you need to refer to the object again in the list of Holdings you're importing, you can replace the object with the name string. This means a Holding object could have \"input\", \"despite\" and \"output\" fields containing lists of string indentifiers of Factors defined elsewhere. Enactment objects can be replaced the same way in the \"enactments\" and \"enactments_despite\" fields." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'copyright protection provision'" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "holdings_to_read[0][\"enactments\"][\"name\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the second holding in the JSON file, you can see where the enactment is referenced by its name \"copy protection provision\" instead of being repeated in its entirety." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'inputs': [{'type': 'fact',\n", " 'content': 'the Java API was independently created by the author, as opposed to copied from other works'},\n", " {'type': 'fact',\n", " 'content': 'the Java API possessed at least some minimal degree of creativity'}],\n", " 'outputs': {'type': 'fact', 'content': 'the Java API was an original work'},\n", " 'mandatory': True,\n", " 'universal': True,\n", " 'enactments': 'copyright protection provision'}" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "holdings_to_read[1]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There can also be an \"enactments_despite\" field, which identifies legislative text that doesn't need to be present for the Rule to apply, but that also doesn't negate the validity of the Rule." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## JSON API Specification" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you want to view the schema specification, you can find it in the `io.api_spec` module. When you read it, you might be surprised to see that every Holding object contains a Rule, and every Rule contains a Procedure. \n", "\n", "If you prefer, instead of nesting a Rule object and Procedure object inside the Holding object, AuthoritySpoke's YAML data loading library allows you to place all the properties of the Rule and the Procedure directly into the Holding object, as shown in the examples above. But a JSON API that transfers AuthoritySpoke objects should conform to the schema below." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "components:\n", " schemas:\n", " Allegation:\n", " properties:\n", " absent:\n", " default: false\n", " type: boolean\n", " generic:\n", " default: false\n", " type: boolean\n", " name:\n", " default: null\n", " nullable: true\n", " type: string\n", " pleading:\n", " allOf:\n", " - $ref: '#/components/schemas/Pleading'\n", " default: null\n", " nullable: true\n", " statement:\n", " allOf:\n", " - $ref: '#/components/schemas/Fact'\n", " default: null\n", " nullable: true\n", " type: object\n", " CrossReference:\n", " properties:\n", " reference_text:\n", " type: string\n", " target_node:\n", " type: integer\n", " target_uri:\n", " type: string\n", " target_url:\n", " format: url\n", " type: string\n", " required:\n", " - reference_text\n", " - target_uri\n", " - target_url\n", " type: object\n", " Enactment:\n", " properties:\n", " node:\n", " format: url\n", " type: string\n", " heading:\n", " default: ''\n", " type: string\n", " text_version:\n", " allOf:\n", " - $ref: '#/components/schemas/TextVersion'\n", " default: null\n", " nullable: true\n", " start_date:\n", " format: date\n", " type: string\n", " end_date:\n", " default: null\n", " format: date\n", " nullable: true\n", " type: string\n", " known_revision_date:\n", " type: boolean\n", " selection:\n", " items:\n", " $ref: '#/components/schemas/PositionSelector'\n", " type: array\n", " anchors:\n", " items:\n", " $ref: '#/components/schemas/PositionSelector'\n", " type: array\n", " citations:\n", " items:\n", " $ref: '#/components/schemas/CrossReference'\n", " type: array\n", " children:\n", " items:\n", " $ref: '#/components/schemas/Enactment'\n", " type: array\n", " required:\n", " - node\n", " - start_date\n", " type: object\n", " Entity:\n", " properties:\n", " generic:\n", " default: true\n", " type: boolean\n", " name:\n", " default: null\n", " nullable: true\n", " type: string\n", " plural:\n", " type: boolean\n", " type: object\n", " Evidence:\n", " properties:\n", " absent:\n", " default: false\n", " type: boolean\n", " exhibit:\n", " allOf:\n", " - $ref: '#/components/schemas/Exhibit'\n", " default: null\n", " nullable: true\n", " generic:\n", " default: false\n", " type: boolean\n", " name:\n", " default: null\n", " nullable: true\n", " type: string\n", " to_effect:\n", " allOf:\n", " - $ref: '#/components/schemas/Fact'\n", " default: null\n", " nullable: true\n", " type: object\n", " Exhibit:\n", " properties:\n", " absent:\n", " default: false\n", " type: boolean\n", " form:\n", " default: null\n", " nullable: true\n", " type: string\n", " generic:\n", " default: false\n", " type: boolean\n", " name:\n", " default: null\n", " nullable: true\n", " type: string\n", " statement:\n", " allOf:\n", " - $ref: '#/components/schemas/Fact'\n", " default: null\n", " nullable: true\n", " statement_attribution:\n", " allOf:\n", " - $ref: '#/components/schemas/Entity'\n", " default: null\n", " nullable: true\n", " type: object\n", " Fact:\n", " properties:\n", " absent:\n", " default: false\n", " type: boolean\n", " generic:\n", " default: false\n", " type: boolean\n", " name:\n", " default: null\n", " nullable: true\n", " type: string\n", " predicate:\n", " $ref: '#/components/schemas/Predicate'\n", " standard_of_proof:\n", " default: null\n", " nullable: true\n", " type: string\n", " terms:\n", " items:\n", " $ref: '#/components/schemas/Factor'\n", " type: array\n", " type: object\n", " Factor:\n", " discriminator:\n", " mapping:\n", " Allegation: '#/components/schemas/Allegation'\n", " Entity: '#/components/schemas/Entity'\n", " Evidence: '#/components/schemas/Evidence'\n", " Exhibit: '#/components/schemas/Exhibit'\n", " Fact: '#/components/schemas/Fact'\n", " Pleading: '#/components/schemas/Pleading'\n", " propertyName: type\n", " oneOf:\n", " - $ref: '#/components/schemas/Allegation'\n", " - $ref: '#/components/schemas/Entity'\n", " - $ref: '#/components/schemas/Evidence'\n", " - $ref: '#/components/schemas/Exhibit'\n", " - $ref: '#/components/schemas/Fact'\n", " - $ref: '#/components/schemas/Pleading'\n", " Holding:\n", " properties:\n", " anchors:\n", " items:\n", " $ref: '#/components/schemas/Selector'\n", " type: array\n", " decided:\n", " default: true\n", " type: boolean\n", " exclusive:\n", " default: false\n", " type: boolean\n", " generic:\n", " default: false\n", " type: boolean\n", " rule:\n", " $ref: '#/components/schemas/Rule'\n", " rule_valid:\n", " default: true\n", " type: boolean\n", " type: object\n", " Pleading:\n", " properties:\n", " absent:\n", " default: false\n", " type: boolean\n", " filer:\n", " allOf:\n", " - $ref: '#/components/schemas/Entity'\n", " default: null\n", " nullable: true\n", " generic:\n", " default: false\n", " type: boolean\n", " name:\n", " default: null\n", " nullable: true\n", " type: string\n", " type: object\n", " PositionSelector:\n", " properties:\n", " start:\n", " type: integer\n", " end:\n", " default: null\n", " nullable: true\n", " type: integer\n", " include_start:\n", " default: true\n", " type: boolean\n", " writeOnly: true\n", " include_end:\n", " default: false\n", " type: boolean\n", " writeOnly: true\n", " type: object\n", " Predicate:\n", " properties:\n", " content:\n", " type: string\n", " expression:\n", " default: null\n", " nullable: true\n", " sign:\n", " default: null\n", " enum:\n", " - ''\n", " - '>='\n", " - ==\n", " - '!='\n", " - <=\n", " - <>\n", " - '>'\n", " - <\n", " nullable: true\n", " type: string\n", " truth:\n", " default: true\n", " type: boolean\n", " type: object\n", " Procedure:\n", " properties:\n", " despite:\n", " items:\n", " $ref: '#/components/schemas/Factor'\n", " type: array\n", " inputs:\n", " items:\n", " $ref: '#/components/schemas/Factor'\n", " type: array\n", " outputs:\n", " items:\n", " $ref: '#/components/schemas/Factor'\n", " type: array\n", " type: object\n", " Rule:\n", " properties:\n", " enactments:\n", " items:\n", " $ref: '#/components/schemas/Enactment'\n", " type: array\n", " enactments_despite:\n", " items:\n", " $ref: '#/components/schemas/Enactment'\n", " type: array\n", " generic:\n", " default: false\n", " type: boolean\n", " mandatory:\n", " default: false\n", " type: boolean\n", " name:\n", " default: null\n", " nullable: true\n", " type: string\n", " procedure:\n", " $ref: '#/components/schemas/Procedure'\n", " universal:\n", " default: false\n", " type: boolean\n", " type: object\n", " Selector:\n", " properties:\n", " exact:\n", " default: null\n", " nullable: true\n", " type: string\n", " prefix:\n", " default: null\n", " nullable: true\n", " type: string\n", " suffix:\n", " default: null\n", " nullable: true\n", " type: string\n", " start:\n", " type: integer\n", " end:\n", " default: null\n", " nullable: true\n", " type: integer\n", " include_start:\n", " default: true\n", " type: boolean\n", " writeOnly: true\n", " include_end:\n", " default: false\n", " type: boolean\n", " writeOnly: true\n", " type: object\n", " TextVersion:\n", " properties:\n", " content:\n", " type: string\n", " required:\n", " - content\n", " type: object\n", "info:\n", " description: An interface for annotating judicial holdings\n", " title: AuthoritySpoke Holding API Schema\n", " version: 0.3.0\n", "openapi: 3.0.2\n", "paths: {}\n", "\n" ] } ], "source": [ "from authorityspoke.io.api_spec import make_spec\n", "\n", "yaml = make_spec().to_yaml()\n", "print(yaml)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exporting AuthoritySpoke Holdings back to JSON" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, if you want to convert an AuthoritySpoke object back to JSON or to a Python dictionary, you can do so with the `io.dump` module. Although no API exists yet for serving and ingesting data using the AuthoritySpoke Holding Schema, this JSON format is easier to store and share over the web." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'{\"terms\": [{\"name\": \"the Java API\", \"plural\": false, \"generic\": true, \"type\": \"Entity\"}], \"standard_of_proof\": null, \"generic\": false, \"absent\": false, \"predicate\": {\"content\": \"${the_java_api} was copyrightable\", \"truth\": false, \"expression\": null}, \"name\": \"\"}'" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from authorityspoke.io import dump\n", "\n", "dump.to_json(oracle.holdings[0].outputs[0])" ] } ], "metadata": { "kernelspec": { "display_name": "authorityspoke", "language": "python", "name": "authorityspoke" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.5" } }, "nbformat": 4, "nbformat_minor": 4 }