{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "e86d9d2b",
   "metadata": {},
   "source": [
    "# Jacobs' Fairy Tales\n",
    "\n",
    "This recipe shows how to scrape Jacobs' fairy tale collections from the OCR search text documents available from the Internet Archive.\n",
    "\n",
    "The works include:\n",
    "\n",
    "- [*English Fairy Tales*](https://archive.org/details/englishfairytal00jacogoog/)\n",
    "- [*More English Fairy Tales*](https://archive.org/details/moreenglishfairy00jaco2/)\n",
    "- [*Celtic Fairy Tales*](https://archive.org/details/celticfairytale00conggoog)\n",
    "- [*More Celtic Fairy Tales*](https://archive.org/details/morecelticfairyt00jaco/)\n",
    "- [*Indian Fairy Tales*](https://archive.org/details/indianfairytales00jaco)\n",
    "- [*European Folk and Fairy Tales*](https://archive.org/details/europeanfolkfair00jaco/)\n",
    "\n",
    "Most of the texts can also be found on the [*Sacred Texts*](https://www.sacred-texts.com/) website:\n",
    "\n",
    "- https://www.sacred-texts.com/neu/eng/eft/index.htm\n",
    "- https://www.sacred-texts.com/neu/eng/meft/index.htm\n",
    "- https://sacred-texts.com/neu/celt/cft/index.htm\n",
    "- https://sacred-texts.com/neu/celt/mcft/index.htm\n",
    "- https://sacred-texts.com/hin/ift/index.htm\n",
    "- *European Folk and Fairy Tales* does not appear to be available.\n",
    "\n",
    "The approach explores how we can \"chunk\" the original text into separate stories, and suggests that a combined human + machine strategy may be more realistic than a purely automated one."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "75836861",
   "metadata": {},
   "source": [
    "```{warning}\n",
    "For each of the works on archive.org, several different scanned versions of the text may be available. A quick look at the full text document for each version will give a feel for how effective the OCR process was. Ideally, we're looking for full text that was recognised cleanly and is not full of typographical errors.\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 103,
   "id": "fb093b30-e088-44f2-b35c-20b1cb95b1d2",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Support dynamic reloading if we update saved module files\n",
    "%load_ext autoreload\n",
    "%autoreload 2"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d7c52c68-0c83-4d1e-a6ea-64a4bd55eba8",
   "metadata": {},
   "source": [
    "## Simple Book Indexer\n",
    "\n",
    "We can reuse various recipes we have developed previously to create a simple, searchable database over Jacobs' fairy tale collections.\n",
    "\n",
    "The original texts are available (in various forms) via the Internet Archive. However, the text quality may be quite poor.\n",
    "\n",
    "Most of the books are also available from the *Sacred Texts* website."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "id": "89165fb0-230a-491a-b022-a6fdc0e6dafa",
   "metadata": {},
   "outputs": [],
   "source": [
    "book_ids = {\"English Fairy Tales\": {\"ia\": \"englishfairytal00jacogoog\",\n",
    "                                    \"st\": \"neu/eng/eft/index.htm\" },\n",
    "            \"More English Fairy Tales\": {\"ia\": \"moreenglishfairy00jaco2\",\n",
    "                                         \"st\": \"neu/eng/meft/index.htm\"},\n",
    "            \"Celtic Fairy Tales\": {\"ia\": \"celticfairytale00conggoog\",\n",
    "                                   \"st\": \"neu/celt/cft/index.htm\"},\n",
    "            \"More Celtic Fairy Tales\": {\"ia\": \"morecelticfairyt00jaco\",\n",
    "                                        \"st\": \"neu/celt/mcft/index.htm\"},\n",
    "            \"Indian Fairy Tales\": {\"ia\": \"indianfairytales00jaco\",\n",
    "                                   \"st\": \"hin/ift/index.htm\"},\n",
    "            \"European Fairy Tales\": {\"ia\": \"europeanfolkfair00jaco\"}\n",
    "           }"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ac646827-3c46-4abf-ae57-52fe20c26e79",
   "metadata": {},
   "source": [
    "Create a simple database."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 106,
   "id": "ff27b46a-8156-421b-af00-bfaac3fdd1d2",
   "metadata": {},
   "outputs": [],
   "source": [
    "from sqlite_utils import Database\n",
    "\n",
    "db_name = \"jacobs_fairy_tale.db\"\n",
    "\n",
    "# Uncomment the following line to connect to a pre-existing database\n",
    "#db = Database(db_name)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 107,
   "id": "57eeaa0b-f3c4-4740-81a1-346fe09e1ca7",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Do not run this cell if your database already exists!\n",
    "\n",
    "# While developing the script, recreate database each time...\n",
    "db = Database(db_name, recreate=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4d5885bf-e863-4956-9bbd-a103c01fb1a5",
   "metadata": {},
   "source": [
    "The following function starts to build on the schema developed to index the Lang Fairy Tales collection."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 114,
   "id": "53ddd828-38a4-4fcf-8992-e431466b693b",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Overwriting ia_utils/create_db_tables_book.py\n"
     ]
    }
   ],
   "source": [
    "%%writefile ia_utils/create_db_tables_book.py\n",
    "def create_db_tables_book(db, drop=True):\n",
    "    \"\"\"Create a database table and an associated full-text search table.\"\"\"\n",
    "    # If required, drop any previously defined tables of the same name\n",
    "    table_name = \"stories\"\n",
    "    if drop:\n",
    "        db[table_name].drop(ignore=True)\n",
    "        db[f\"{table_name}_fts\"].drop(ignore=True)\n",
    "    elif db[table_name].exists():\n",
    "        print(f\"Table {table_name} exists...\")\n",
    "        return\n",
    "\n",
    "    # This schema has been evolved iteratively as I have identified structure\n",
    "    # that can be usefully mined...\n",
    "\n",
    "    db[table_name].create({\n",
    "        \"book_id\": str,\n",
    "        \"book_title\": str,\n",
    "        \"story_id\": str,\n",
    "        \"story_title\": str,\n",
    "        \"story_text\": str,\n",
    "        \"last_para\": str, # sometimes contains provenance\n",
    "        \"first_line\": str, # maybe we want to review the openings, or create an index...\n",
    "        \"provenance\": str, # attempt at provenance\n",
    "        \"chapter_order\": int, # Sort order of stories in book\n",
    "    }, pk=(\"story_id\"))\n",
    "\n",
    "    # Enable full text search\n",
    "    # This creates an extra virtual table (stories_fts) to support the full text search\n",
    "    # No stemmer is configured here; passing tokenize=\"porter\" would apply one\n",
    "    db[table_name].enable_fts([\"story_title\", \"story_text\"], create_triggers=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6fdc2b39-9d21-4e94-aadc-868536549c93",
   "metadata": {},
   "source": [
    "Create a `stories` table in the database, along with a full-text search index for it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 115,
   "id": "8fabc7f2-ce0a-41e5-b5e4-bdc098d94614",
   "metadata": {},
   "outputs": [],
   "source": [
    "from ia_utils.create_db_tables_book import create_db_tables_book\n",
    "\n",
    "create_db_tables_book(db)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d506843f-b538-492a-a313-ca74532d5731",
   "metadata": {},
   "source": [
    "Preview the tables and their columns:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 116,
   "id": "7bba17ff-2d23-4f33-9365-57727f280523",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[<Table stories (book_id, book_title, story_id, story_title, story_text, last_para, first_line, provenance, chapter_order)>,\n",
       " <Table stories_fts (story_title, story_text)>,\n",
       " <Table stories_fts_data (id, block)>,\n",
       " <Table stories_fts_idx (segid, term, pgno)>,\n",
       " <Table stories_fts_docsize (id, sz)>,\n",
       " <Table stories_fts_config (k, v)>]"
      ]
     },
     "execution_count": 116,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "db.tables"
   ]
  },
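  {
   "cell_type": "markdown",
   "id": "3f9a1c7e-5b2d-4e8a-9c41-7d6e2f0a1b35",
   "metadata": {},
   "source": [
    "To get a feel for how the full-text search index behaves, here is a minimal sketch that uses a throwaway in-memory database rather than the `jacobs_fairy_tale.db` file. The `demo_db` name and the two toy rows are made up for illustration; the `enable_fts()` call mirrors the one in `create_db_tables_book()`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8c4d2e6a-1f73-4b09-a5d8-2e9b7c3f6a10",
   "metadata": {},
   "outputs": [],
   "source": [
    "from sqlite_utils import Database\n",
    "\n",
    "# Throwaway in-memory database (illustrative only; not the notebook's db)\n",
    "demo_db = Database(memory=True)\n",
    "\n",
    "# Two made-up rows, keyed on story_id as in the stories table\n",
    "demo_db['stories'].insert_all([\n",
    "    {'story_id': 'demo_01', 'story_title': 'The Well',\n",
    "     'story_text': 'A maiden sat beside the well.'},\n",
    "    {'story_id': 'demo_02', 'story_title': 'The Giant',\n",
    "     'story_text': 'A giant lived on a hill.'},\n",
    "], pk='story_id')\n",
    "\n",
    "# Index the same columns as create_db_tables_book() does\n",
    "demo_db['stories'].enable_fts(['story_title', 'story_text'], create_triggers=True)\n",
    "\n",
    "# search() runs a full-text query and yields matching rows as dicts\n",
    "[row['story_id'] for row in demo_db['stories'].search('giant')]"
   ]
  },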
  {
   "cell_type": "markdown",
   "id": "7e1cf49e-e892-4aaa-8702-4e5bf0630ffc",
   "metadata": {},
   "source": [
    "## Scrape the Sacred Texts Website\n",
    "\n",
    "Downloadable zip files of the text from the *Sacred Texts* website only seem to be available for the *Celtic Fairy Tales* collection, so let's write a simple scraper to pull the texts, a story at a time, from each book page."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8bc5effa-0066-47ca-905f-1e0d23e08667",
   "metadata": {},
   "source": [
    "First, we need to get the links to the chapters from a book page:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 117,
   "id": "4454ea50-50fb-43f4-8c20-6a3c971c144a",
   "metadata": {},
   "outputs": [],
   "source": [
    "# These packages make it easy to download web pages so that we can work with them\n",
    "import requests\n",
    "# \"Caching\" pages means grabbing a local copy of the page so we only need to download it once\n",
    "import requests_cache\n",
    "from datetime import timedelta\n",
    "\n",
    "requests_cache.install_cache('web_cache',\n",
    "                             backend='sqlite',\n",
    "                             expire_after=timedelta(days=1000))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 118,
   "id": "430e28b3-31e6-4eaa-ad16-986d1d2b29be",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Specify the base URL of the website we want to download pages from\n",
    "BASE_URL = \"https://www.sacred-texts.com\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 119,
   "id": "436f1d52-1744-4b40-a53e-21b357a89900",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'https://www.sacred-texts.com/neu/eng/eft/index.htm'"
      ]
     },
     "execution_count": 119,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def get_st_url(book, base_url=BASE_URL):\n",
    "    stub = book_ids[book][\"st\"]\n",
    "    return f'{base_url}/{stub}'\n",
    "\n",
    "example_book_url = get_st_url(\"English Fairy Tales\")\n",
    "example_book_url"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 120,
   "id": "42378375-c813-4983-9ae2-e5a77abfe68f",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'<HTML>\\n <HEAD>\\n<!-- Global site tag (gtag.js) - Google Analytics -->\\n<script async src=\"https://www.googletagmanager.com/gtag/js?id=UA-12241170-1\"></script>\\n<script>\\n  window.dataLayer = window.dataLayer || [];\\n  function gtag(){dataLayer.push(arguments);}\\n  gtag(\\'js\\', new Date());\\n\\n  gtag(\\'config\\', \\'UA-12241170-1\\');\\n  gtag(\\'config\\', \\'GA_MEASUREMENT_ID\\', {\\n    \\'linker\\': {\\n      \\'domains\\': [\\'sacred-texts.com\\', \\'next.sacred-texts.com\\', \\'happyvegan.jp\\', \\'next.happyvegan.jp\\', \\'sacred-texts.online\\', \\'next.sacred-texts.online\\']\\n    }\\n  });\\n</script>\\n<!-- End Global site tag (gtag.js) - Google Analytics -->\\n<META name=\"description\" content=\"English Fairy Tales, by Joseph Jacobs, at sacred-texts.com\">\\n <META name=\"keywords\" content=\"English Fairytale Fairy Tales Folklore Mythology England\">\\n <TITLE>English Fairy Tales Index</TITLE>\\n </HEAD>\\n <BODY>\\n \\n \\n <CENTER>\\n <A HREF=\"../../../cdshop/index.htm\"><IMG SRC=\"../../../cdshop/cdinfo.jpg\" BORDER=\"0\"></A><BR><A HREF=\"../../../index.htm\">Sacred Texts</A>&nbsp;\\n <A HREF=\"../../index.htm\">Legends and Sagas</A>&nbsp;\\n <A HREF=\"../index.htm\">English Folklore</A>&nbsp;\\n </CENTER>\\n <HR>\\n <CENTER>\\n <TABLE WIDTH=\"75%\">\\n <TR>\\n <TD WIDTH=\"50%\" VALIGN=\"TOP\" ALIGN=\"CENTER\">\\n <IMG SRC=\"img/goblin2.jpg\" HEIGHT=\"256\" alt=\"Goblin: public domain image\">\\n </TD>\\n <TD WIDTH=\"50%\" VALIGN=\"TOP\" ALIGN=\"CENTER\">\\n <H1 ALIGN=\"CENTER\">English Fairy Tales</H1>\\n <H2 ALIGN=\"CENTER\">by Joseph Jacobs</H2>\\n <H4 ALIGN=\"CENTER\">[1890]</H4>\\n </TD>\\n </TR>\\n </TABLE>\\n <HR>\\n </CENTER>\\n <A HREF=\"eft00.htm\">Title Page</A><BR>\\n <A HREF=\"eft01.htm\">Preface</A><BR>\\n <A HREF=\"eft02.htm\">Tom Tit Tot</A><BR>\\n <A HREF=\"eft03.htm\">The Three Sillies</A><BR>\\n <A HREF=\"eft04.htm\">The Rose-Tree</A><BR>\\n <A HREF=\"eft05.htm\">The Old Woman and Her Pig</A><BR>\\n <A 
HREF=\"eft06.htm\">How Jack Went to Seek his Fortune</A><BR>\\n <A HREF=\"eft07.htm\">Mr Vinegar</A><BR>\\n <A HREF=\"eft08.htm\">Nix Nought Nothing</A><BR>\\n <A HREF=\"eft09.htm\">Jack Hannaford</A><BR>\\n <A HREF=\"eft10.htm\">Binnorie</A><BR>\\n <A HREF=\"eft11.htm\">Mouse and Mouser</A><BR>\\n <A HREF=\"eft12.htm\">Cap O\\' Rushes</A><BR>\\n <A HREF=\"eft13.htm\">Teeny-Tiny</A><BR>\\n <A HREF=\"eft14.htm\">Jack and the Beanstalk</A><BR>\\n <A HREF=\"eft15.htm\">The Story of the Three Little Pigs</A><BR>\\n <A HREF=\"eft16.htm\">The Master and His Pupil</A><BR>\\n <A HREF=\"eft17.htm\">Titty Mouse and Tatty Mouse</A><BR>\\n <A HREF=\"eft18.htm\">Jack and His Golden Snuff-Box</A><BR>\\n <A HREF=\"eft19.htm\">The Story of the Three Bears</A><BR>\\n <A HREF=\"eft20.htm\">Jack the Giant-Killer</A><BR>\\n <A HREF=\"eft21.htm\">Henny-Penny</A><BR>\\n <A HREF=\"eft22.htm\">Childe Rowland</A><BR>\\n <A HREF=\"eft23.htm\">Molly Whuppie</A><BR>\\n <A HREF=\"eft24.htm\">The Red Ettin</A><BR>\\n <A HREF=\"eft25.htm\">The Golden Arm</A><BR>\\n <A HREF=\"eft26.htm\">The History of Tom Thumb</A><BR>\\n <A HREF=\"eft27.htm\">Mr Fox</A><BR>\\n <A HREF=\"eft28.htm\">Lazy Jack</A><BR>\\n <A HREF=\"eft29.htm\">Johnny-Cake</A><BR>\\n <A HREF=\"eft30.htm\">Earl Mar\\'s Daughter</A><BR>\\n <A HREF=\"eft31.htm\">Mr Miacca</A><BR>\\n <A HREF=\"eft32.htm\">Whittington and His Cat</A><BR>\\n <A HREF=\"eft33.htm\">The Strange Visitor</A><BR>\\n <A HREF=\"eft34.htm\">The Laidly Worm of Spindleston Heugh</A><BR>\\n <A HREF=\"eft35.htm\">The Cat and the Mouse</A><BR>\\n <A HREF=\"eft36.htm\">The Fish and the Ring</A><BR>\\n <A HREF=\"eft37.htm\">The Magpie\\'s Nest</A><BR>\\n <A HREF=\"eft38.htm\">Kate Crackernuts</A><BR>\\n <A HREF=\"eft39.htm\">The Cauld Lad of Hilton</A><BR>\\n <A HREF=\"eft40.htm\">The Ass, The Table and the Stick</A><BR>\\n <A HREF=\"eft41.htm\">Fairy Ointment</A><BR>\\n <A HREF=\"eft42.htm\">The Well of the World\\'s End</A><BR>\\n <A HREF=\"eft43.htm\">Master of all 
Masters</A><BR>\\n <A HREF=\"eft44.htm\">The Three Heads of the Well</A><BR>\\n <A HREF=\"eft45.htm\">Introductory Notes</A><BR>\\n <A HREF=\"eft46.htm\">Notes</A><BR>\\n </BODY>\\n </HTML>'"
      ]
     },
     "execution_count": 120,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# And then grab the page\n",
    "html = requests.get(example_book_url)\n",
    "\n",
    "# Preview some of the raw web page / HTML text in the page we just downloaded\n",
    "html.text[:5000]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ed22280c-6f72-47d1-8dd5-e1839f5fd159",
   "metadata": {},
   "source": [
    "The book index pages contain links to separate chapter (i.e. story) pages:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 121,
   "id": "43792a99-e0ee-413e-b8e2-ff34d6f111a0",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[<a href=\"eft01.htm\">Preface</a>,\n",
       " <a href=\"eft02.htm\">Tom Tit Tot</a>,\n",
       " <a href=\"eft03.htm\">The Three Sillies</a>,\n",
       " <a href=\"eft04.htm\">The Rose-Tree</a>,\n",
       " <a href=\"eft05.htm\">The Old Woman and Her Pig</a>,\n",
       " <a href=\"eft06.htm\">How Jack Went to Seek his Fortune</a>,\n",
       " <a href=\"eft07.htm\">Mr Vinegar</a>,\n",
       " <a href=\"eft08.htm\">Nix Nought Nothing</a>,\n",
       " <a href=\"eft09.htm\">Jack Hannaford</a>,\n",
       " <a href=\"eft10.htm\">Binnorie</a>,\n",
       " <a href=\"eft11.htm\">Mouse and Mouser</a>,\n",
       " <a href=\"eft12.htm\">Cap O' Rushes</a>,\n",
       " <a href=\"eft13.htm\">Teeny-Tiny</a>,\n",
       " <a href=\"eft14.htm\">Jack and the Beanstalk</a>,\n",
       " <a href=\"eft15.htm\">The Story of the Three Little Pigs</a>,\n",
       " <a href=\"eft16.htm\">The Master and His Pupil</a>,\n",
       " <a href=\"eft17.htm\">Titty Mouse and Tatty Mouse</a>,\n",
       " <a href=\"eft18.htm\">Jack and His Golden Snuff-Box</a>,\n",
       " <a href=\"eft19.htm\">The Story of the Three Bears</a>,\n",
       " <a href=\"eft20.htm\">Jack the Giant-Killer</a>,\n",
       " <a href=\"eft21.htm\">Henny-Penny</a>,\n",
       " <a href=\"eft22.htm\">Childe Rowland</a>,\n",
       " <a href=\"eft23.htm\">Molly Whuppie</a>,\n",
       " <a href=\"eft24.htm\">The Red Ettin</a>,\n",
       " <a href=\"eft25.htm\">The Golden Arm</a>,\n",
       " <a href=\"eft26.htm\">The History of Tom Thumb</a>,\n",
       " <a href=\"eft27.htm\">Mr Fox</a>,\n",
       " <a href=\"eft28.htm\">Lazy Jack</a>,\n",
       " <a href=\"eft29.htm\">Johnny-Cake</a>,\n",
       " <a href=\"eft30.htm\">Earl Mar's Daughter</a>,\n",
       " <a href=\"eft31.htm\">Mr Miacca</a>,\n",
       " <a href=\"eft32.htm\">Whittington and His Cat</a>,\n",
       " <a href=\"eft33.htm\">The Strange Visitor</a>,\n",
       " <a href=\"eft34.htm\">The Laidly Worm of Spindleston Heugh</a>,\n",
       " <a href=\"eft35.htm\">The Cat and the Mouse</a>,\n",
       " <a href=\"eft36.htm\">The Fish and the Ring</a>,\n",
       " <a href=\"eft37.htm\">The Magpie's Nest</a>,\n",
       " <a href=\"eft38.htm\">Kate Crackernuts</a>,\n",
       " <a href=\"eft39.htm\">The Cauld Lad of Hilton</a>,\n",
       " <a href=\"eft40.htm\">The Ass, The Table and the Stick</a>,\n",
       " <a href=\"eft41.htm\">Fairy Ointment</a>,\n",
       " <a href=\"eft42.htm\">The Well of the World's End</a>,\n",
       " <a href=\"eft43.htm\">Master of all Masters</a>,\n",
       " <a href=\"eft44.htm\">The Three Heads of the Well</a>,\n",
       " <a href=\"eft45.htm\">Introductory Notes</a>,\n",
       " <a href=\"eft46.htm\">Notes</a>]"
      ]
     },
     "execution_count": 121,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# The BeautifulSoup package provides a range of tools\n",
    "# that help us work with the downloaded web page,\n",
    "# such as extracting particular elements from it\n",
    "from bs4 import BeautifulSoup\n",
    "\n",
    "# The \"soup\" is a parsed and structured form of the page we downloaded\n",
    "soup = BeautifulSoup(html.content, \"html.parser\")\n",
    "\n",
    "# Find the anchor (<a>) elements containing the links\n",
    "links_ = soup.find_all(\"a\")\n",
    "\n",
    "# Preview the extracted <a> elements, skipping the first five (navigation and title page) links\n",
    "links_[5:]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "26b5c859-d566-4992-8dc2-7f82e35b371a",
   "metadata": {},
   "source": [
    "We notice that the page links share a common key that we can also obtain from the URL of the book index page:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 122,
   "id": "5a042d8c-2a4a-40c6-93b2-426f38ed3372",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'eft'"
      ]
     },
     "execution_count": 122,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "stub = example_book_url.split(\"/\")[-2]\n",
    "stub"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bf12324b-0218-4602-9a3d-a3f98347d6af",
   "metadata": {},
   "source": [
    "Create a simple list of the story links:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 123,
   "id": "46685ddb-a2ca-430d-9c90-0ecce98960c9",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('Tom Tit Tot', 'eft02.htm'),\n",
       " ('The Three Sillies', 'eft03.htm'),\n",
       " ('The Rose-Tree', 'eft04.htm'),\n",
       " ('The Old Woman and Her Pig', 'eft05.htm'),\n",
       " ('How Jack Went to Seek his Fortune', 'eft06.htm'),\n",
       " ('Mr Vinegar', 'eft07.htm'),\n",
       " ('Nix Nought Nothing', 'eft08.htm'),\n",
       " ('Jack Hannaford', 'eft09.htm'),\n",
       " ('Binnorie', 'eft10.htm'),\n",
       " ('Mouse and Mouser', 'eft11.htm'),\n",
       " (\"Cap O' Rushes\", 'eft12.htm'),\n",
       " ('Teeny-Tiny', 'eft13.htm'),\n",
       " ('Jack and the Beanstalk', 'eft14.htm'),\n",
       " ('The Story of the Three Little Pigs', 'eft15.htm'),\n",
       " ('The Master and His Pupil', 'eft16.htm'),\n",
       " ('Titty Mouse and Tatty Mouse', 'eft17.htm'),\n",
       " ('Jack and His Golden Snuff-Box', 'eft18.htm'),\n",
       " ('The Story of the Three Bears', 'eft19.htm'),\n",
       " ('Jack the Giant-Killer', 'eft20.htm'),\n",
       " ('Henny-Penny', 'eft21.htm'),\n",
       " ('Childe Rowland', 'eft22.htm'),\n",
       " ('Molly Whuppie', 'eft23.htm'),\n",
       " ('The Red Ettin', 'eft24.htm'),\n",
       " ('The Golden Arm', 'eft25.htm'),\n",
       " ('The History of Tom Thumb', 'eft26.htm'),\n",
       " ('Mr Fox', 'eft27.htm'),\n",
       " ('Lazy Jack', 'eft28.htm'),\n",
       " ('Johnny-Cake', 'eft29.htm'),\n",
       " (\"Earl Mar's Daughter\", 'eft30.htm'),\n",
       " ('Mr Miacca', 'eft31.htm'),\n",
       " ('Whittington and His Cat', 'eft32.htm'),\n",
       " ('The Strange Visitor', 'eft33.htm'),\n",
       " ('The Laidly Worm of Spindleston Heugh', 'eft34.htm'),\n",
       " ('The Cat and the Mouse', 'eft35.htm'),\n",
       " ('The Fish and the Ring', 'eft36.htm'),\n",
       " (\"The Magpie's Nest\", 'eft37.htm'),\n",
       " ('Kate Crackernuts', 'eft38.htm'),\n",
       " ('The Cauld Lad of Hilton', 'eft39.htm'),\n",
       " ('The Ass, The Table and the Stick', 'eft40.htm'),\n",
       " ('Fairy Ointment', 'eft41.htm'),\n",
       " (\"The Well of the World's End\", 'eft42.htm'),\n",
       " ('Master of all Masters', 'eft43.htm'),\n",
       " ('The Three Heads of the Well', 'eft44.htm')]"
      ]
     },
     "execution_count": 123,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "story_links = [(l.text, l.get('href')) for l in links_ if l.get('href') and l.get('href').startswith(stub)]\n",
    "\n",
    "# Tidy out links that aren't stories\n",
    "story_links = [s for s in story_links if \"00.\" not in s[1] and not any(x in s[0] for x in [\"Preface\", \"Notes\", \"Title\"])]\n",
    "story_links"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "972b4fe5-7335-4b33-a6e3-e6f796575bf7",
   "metadata": {},
   "source": [
    "The structure of the HTML page for each story may differ in certain respects, but in every case except one there is *some* structure we can exploit: the title appears as the only header, followed by the story.\n",
    "\n",
    "We can then parse that structure out:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 124,
   "id": "dfe5e301-5a61-49ab-baf9-d1cfcb753ca7",
   "metadata": {},
   "outputs": [],
   "source": [
    "from markdownify import markdownify\n",
    "\n",
    "def get_stories_from_book(book):\n",
    "    book_index_url = get_st_url(book)\n",
    "    print(book_index_url)\n",
    "    stories = []\n",
    "    \n",
    "    story = requests.get(book_index_url)\n",
    "    # The \"soup\" is a parsed and structured form of the page we downloaded\n",
    "    soup = BeautifulSoup(story.content, \"html.parser\")\n",
    "    links_ = soup.find_all(\"a\")\n",
    "    stub = book_index_url.split(\"/\")[-2]\n",
    "    \n",
    "    story_links = [(l.text, l.get('href')) for l in links_ if l.get('href') and l.get('href').startswith(stub)]\n",
    "    # Tidy out links that aren't stories\n",
    "    story_links = [s for s in story_links if \"00.\" not in s[1] and not any(x in s[0] for x in [\"Preface\", \"Notes\", \"Title\"])]\n",
    "\n",
    "    # We need a heuristic to get the text of the story and not any other text\n",
    "    for story_link in story_links:\n",
    "        story_url = book_index_url.replace(\"index.htm\", story_link[1])\n",
    "        #print(f\"Getting {story_link[0]} from {story_url}\")\n",
    "        story_ = requests.get(story_url)\n",
    "        try:\n",
    "            story_text = [markdownify(x).strip() for x in story_.text.split(\"<HR>\") if \"===\" in markdownify(x)][0]\n",
    "            stories.append((book, book_index_url, f'{stub}_{story_link[1]}'.split(\".\")[0], story_text))\n",
    "        except Exception:\n",
    "            # These are not handled\n",
    "            print(\"Error\", story_link[0], story_url)\n",
    "            # The only one we want to capture is https://sacred-texts.com/hin/ift/ift11.htm\n",
    "            # which does not have the title of the story as a title\n",
    "        \n",
    "    return stories"
   ]
  },
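  {
   "cell_type": "markdown",
   "id": "5d1b8f2c-7a46-4c3e-b0e9-4a7c1d9e3f82",
   "metadata": {},
   "source": [
    "The heuristic in `get_stories_from_book()` relies on `markdownify` rendering a top-level HTML heading, by default, as a title line underlined with `=` characters. A minimal sketch on a made-up page fragment (the `sample_html` string is purely illustrative and much simpler than a real *Sacred Texts* page) shows how splitting on `<HR>` and looking for `===` picks out the chunk that contains the story:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e2a7c5b9-3d18-4f6a-8b04-6c9f1e7d2a53",
   "metadata": {},
   "outputs": [],
   "source": [
    "from markdownify import markdownify\n",
    "\n",
    "# Made-up page: navigation, then the story, then more navigation,\n",
    "# with the three parts separated by <HR> rules\n",
    "sample_html = (\n",
    "    '<CENTER>Navigation links</CENTER>'\n",
    "    '<HR>'\n",
    "    '<H1>A Toy Story</H1>'\n",
    "    '<P>Once upon a time there was a very short story.</P>'\n",
    "    '<HR>'\n",
    "    '<CENTER>More navigation links</CENTER>'\n",
    ")\n",
    "\n",
    "# Convert each chunk to markdown\n",
    "chunks = [markdownify(x).strip() for x in sample_html.split('<HR>')]\n",
    "\n",
    "# Keep only chunks containing an underlined (=== style) heading\n",
    "story_chunks = [c for c in chunks if '===' in c]\n",
    "\n",
    "# If no chunk has such a heading, this index lookup raises an IndexError,\n",
    "# which is the case reported as an error by get_stories_from_book()\n",
    "story_chunks[0]"
   ]
  },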
  {
   "cell_type": "code",
   "execution_count": 125,
   "id": "694e3f37-9ad7-409c-a600-99656d6d2690",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "https://www.sacred-texts.com/neu/eng/eft/index.htm\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "('English Fairy Tales',\n",
       " 'https://www.sacred-texts.com/neu/eng/eft/index.htm',\n",
       " 'eft_eft44',\n",
       " \"The Three Heads of the Well\\n===========================\\n\\n\\nLONG before Arthur and the Knights of the Round Table, there reigned in the eastern part of England a king who kept his court at Colchester.\\n\\n\\nIn the midst of all his glory, his queen died, leaving behind her an only daughter, about fifteen years of age, who for her beauty and kindness was the wonder of all that knew her. But the king, hearing of a lady who had likewise an only daughter, had a mind to marry her for the sake of her riches, though she was old, ugly, hook-nosed, and hump-backed. Her daughter was a yellow dowdy, full of envy and ill-nature; and, in short, was much of the same mould as her mother. But in a few weeks the king, attended by the nobility and gentry, brought his deformed bride to the palace, where the marriage rites were performed. She had not been long in the court before she set the king against his own beautiful daughter by false reports. The young princess, having lost her father's love, grew weary of the court, and one day, meeting with her father in the garden, she begged him, with tears in her eyes, to let her go and seek her fortune; to which the king consented, and ordered her stepmother to give her what she pleased. She went to the queen, who gave her a canvas bag of brown bread and hard cheese, with a bottle of beer. Though this was but a pitiful dowry for a king's daughter, she took it, with thanks, and proceeded on her journey, passing through groves, woods, and valleys, till at length she saw an old man sitting on a stone at the mouth of a cave, who said: 'Good morrow, fair maiden, whither away so fast?'\\n\\n\\n'Aged father,' says she, 'I am going to seek my fortune.'\\n\\n\\n'What have you got in your bag and bottle?'\\n\\n\\n'In my bag I have got bread and cheese, and in my bottle good small beer. 
Would you like to have some?'\\n\\n\\n'Yes,' said he, 'with all my heart.'\\n\\n\\nWith that the lady pulled out the provisions, and bade him eat and welcome. He did so, and gave her many thanks, and said: 'There is a thick thorny hedge before you, which you cannot get through, but take this wand in your hand, strike it three times, and say, 'Pray, hedge, let me come through', and it will open immediately; then, a little further, you will find a well; sit down on the brink of it, and there will come up three golden heads, which will speak; and whatever they require, that do.' Promising she would, she took her leave of him. Coming to the hedge and using the old man's wand, it divided, and let her through; then, coming to the well, she had no sooner sat down than a golden head came up singing:\\n\\n\\n'Wash me and comb me,  \\n\\n And lay me down softly.  \\n\\n And lay me on a bank to dry,  \\n\\n That I may look pretty,  \\n\\n When somebody passes by.'\\n\\n\\n'Yes,' said she, and taking it in her lap combed it with a silver comb, and then placed it upon a primrose bank. Then up came a second and a third head, saying the same as the former. So she did the same for them, and then, pulling out her provisions, sat down to eat her dinner.\\n\\n\\n\\n[![](tn/062.jpg)  \\nClick to enlarge](img/062.jpg)\\n\\n\\nThen said the heads one to another: 'What shall we weird for this damsel who has used us so kindly?'\\n\\n\\nThe first said: 'I weird her to be so beautiful that she shall charm the most powerful prince in the world.'\\n\\n\\nThe second said: 'I weird her such a sweet voice as shall far exceed the nightingale.'\\n\\n\\nThe third said: 'My gift shall be none of the least, as she is a king's daughter; I'll weird her so fortunate that she shall become queen to the greatest prince that reigns.'\\n\\n\\nShe then let them down into the well again, and so went on her journey. She had not travelled long before she saw a king hunting in the park with his nobles. 
She would have avoided him, but the king, having caught a sight of her, approached, and what with her beauty and sweet voice, fell desperately in love with her, and soon induced her to marry him.\\n\\n\\nThis king, finding that she was the king of Colchester's daughter, ordered some chariots to be got ready, that he might pay the king, his father-in-law, a visit. The chariot in which the king and queen rode was adorned with rich gems of gold. The king, her father, was at first astonished that his daughter had been so fortunate, till the young king let him know of all that had happened. Great was the joy at court amongst all, with the exception of the queen and her club-footed daughter, who were ready to burst with envy. The rejoicings, with feasting and dancing, continued many days. Then at length they returned home with the dowry her father gave her.\\n\\n\\nThe hump-backed princess, perceiving that her sister had been so lucky in seeking her fortune, wanted to do the same; so she told her mother, and all preparations were made, and she was furnished with rich dresses, and with sugar, almonds, and sweetmeats, in great quantities, and a large bottle of Malaga sack. With these she went the same road as her sister; and coming near the cave, the old man said: 'Young woman, whither so fast?'\\n\\n\\n'What's that to you?' said she.\\n\\n\\n'Then,' said he, 'what have you in your bag and bottle?'\\n\\n\\nShe answered: 'Good things, which you shall not be troubled with.'\\n\\n\\n'Won't you give me some?' said he.\\n\\n\\n'No, not a bit, nor a drop, unless it would choke you.'\\n\\n\\nThe old man frowned, saying: 'Evil fortune attend ye!'\\n\\n\\nGoing on, she came to the hedge, through which she espied a gap, and thought to pass through it; but the hedge closed, and the thorns ran into her flesh, so that it was with great difficulty that she got through. Being now all over blood, she searched for water to wash herself, and, looking round she saw the well. 
She sat down on the brink of it, and one of the heads came up saying: 'Wash me, comb me, and lay me down softly', as before, but she banged it with her bottle, saying, 'Take that for your washing.' So the second and third heads came up, and met with no better treatment than the first. Whereupon the heads consulted among themselves what evils to plague her with for such usage.\\n\\n\\nThe first said: 'Let her be struck with leprosy in her face.'\\n\\n\\nThe second: 'Let her voice be as harsh as a corncrake's.'\\n\\n\\nThe third said: 'Let her have for husband but a poor country cobbler.'\\n\\n\\nWell, on she went till she came to a town, and it being market-day, the people looked at her, and, seeing such an ugly face, and hearing such a squeaky voice, all fled but a poor country cobbler. Now he not long before had mended the shoes of an old hermit, who, having no money, gave him a box of ointment for the cure of the leprosy, and a bottle of spirits for a harsh voice. So the cobbler, having a mind to do an act of charity, was induced to go up to her and ask her who she was.\\n\\n\\n'I am,' said she, 'the king of Colchester's step-daughter.'\\n\\n\\n'Well,' said the cobbler, 'if I restore you to your natural complexion, and make a sound cure both in face and voice, will you in reward take me for a husband?'\\n\\n\\n'Yes, friend,' replied she, 'with all my heart!'\\n\\n\\nWith this the cobbler applied the remedies, and they made her well in a few weeks; after which they were married, and so set forward for the court at Colchester. When the queen found that her daughter had married nothing but a poor cobbler, she hanged herself in wrath. The death of the queen so pleased the king, who was glad to get rid of her so soon, that he gave the cobbler a hundred pounds, to quit the court with his lady, and take to a remote part of the kingdom, where he lived many years mending shoes, his wife spinning the thread for him.\")"
      ]
     },
     "execution_count": 125,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# For example:\n",
    "stories = get_stories_from_book(\"English Fairy Tales\")\n",
    "\n",
    "story = stories[-1]\n",
    "story"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6fcdd576-1bb6-4acc-a3a0-b8f8d60dff21",
   "metadata": {},
   "source": [
    "Some of the story texts may contain images or web links. We can clean those out:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 126,
   "id": "b3165643-a1df-4f10-afe3-794bc4ab73e7",
   "metadata": {},
   "outputs": [],
   "source": [
    "import re\n",
    "\n",
    "def clean_story_text(txt):\n",
    "    \"\"\"Clean the story text.\"\"\"\n",
    "\n",
    "    # Remove images, e.g. ![](path/to/image.jpg)\n",
    "    cleaner = re.sub(r'!\\[\\]\\([^\\)]*\\)', '', markdownify(txt))\n",
    "    # Remove links (this drops the link text as well as the target)\n",
    "    cleaner = re.sub(r'\\[[^\\]]*\\]\\([^\\)]*\\)', '', cleaner)\n",
    "    # Collapse runs of line breaks into a single blank line\n",
    "    cleaner = re.sub(r'\\n+', '\\n\\n', cleaner)\n",
    "\n",
    "    # Remove whitespace around line breaks\n",
    "    cleaner = \"\\n\\n\".join(s.strip() for s in cleaner.split(\"\\n\\n\"))\n",
    "\n",
    "    return cleaner"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 127,
   "id": "de5365fe-1114-4ab6-99c4-71444fde5d36",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\"The Three Heads of the Well\\n\\n===========================\\n\\nLONG before Arthur and the Knights of the Round Table, there reigned in the eastern part of England a king who kept his court at Colchester.\\n\\nIn the midst of all his glory, his queen died, leaving behind her an only daughter, about fifteen years of age, who for her beauty and kindness was the wonder of all that knew her. But the king, hearing of a lady who had likewise an only daughter, had a mind to marry her for the sake of her riches, though she was old, ugly, hook-nosed, and hump-backed. Her daughter was a yellow dowdy, full of envy and ill-nature; and, in short, was much of the same mould as her mother. But in a few weeks the king, attended by the nobility and gentry, brought his deformed bride to the palace, where the marriage rites were performed. She had not been long in the court before she set the king against his own beautiful daughter by false reports. The young princess, having lost her father's love, grew weary of the court, and one day, meeting with her father in the garden, she begged him, with tears in her eyes, to let her go and seek her fortune; to which the king consented, and ordered her stepmother to give her what she pleased. She went to the queen, who gave her a canvas bag of brown bread and hard cheese, with a bottle of beer. Though this was but a pitiful dowry for a king's daughter, she took it, with thanks, and proceeded on her journey, passing through groves, woods, and valleys, till at length she saw an old man sitting on a stone at the mouth of a cave, who said: 'Good morrow, fair maiden, whither away so fast?'\\n\\n'Aged father,' says she, 'I am going to seek my fortune.'\\n\\n'What have you got in your bag and bottle?'\\n\\n'In my bag I have got bread and cheese, and in my bottle good small beer. Would you like to have some?'\\n\\n'Yes,' said he, 'with all my heart.'\\n\\nWith that the lady pulled out the provisions, and bade him eat and welcome. 
He did so, and gave her many thanks, and said: 'There is a thick thorny hedge before you, which you cannot get through, but take this wand in your hand, strike it three times, and say, 'Pray, hedge, let me come through', and it will open immediately; then, a little further, you will find a well; sit down on the brink of it, and there will come up three golden heads, which will speak; and whatever they require, that do.' Promising she would, she took her leave of him. Coming to the hedge and using the old man's wand, it divided, and let her through; then, coming to the well, she had no sooner sat down than a golden head came up singing:\\n\\n'Wash me and comb me,\\n\\nAnd lay me down softly.\\n\\nAnd lay me on a bank to dry,\\n\\nThat I may look pretty,\\n\\nWhen somebody passes by.'\\n\\n'Yes,' said she, and taking it in her lap combed it with a silver comb, and then placed it upon a primrose bank. Then up came a second and a third head, saying the same as the former. So she did the same for them, and then, pulling out her provisions, sat down to eat her dinner.\\n\\nThen said the heads one to another: 'What shall we weird for this damsel who has used us so kindly?'\\n\\nThe first said: 'I weird her to be so beautiful that she shall charm the most powerful prince in the world.'\\n\\nThe second said: 'I weird her such a sweet voice as shall far exceed the nightingale.'\\n\\nThe third said: 'My gift shall be none of the least, as she is a king's daughter; I'll weird her so fortunate that she shall become queen to the greatest prince that reigns.'\\n\\nShe then let them down into the well again, and so went on her journey. She had not travelled long before she saw a king hunting in the park with his nobles. 
She would have avoided him, but the king, having caught a sight of her, approached, and what with her beauty and sweet voice, fell desperately in love with her, and soon induced her to marry him.\\n\\nThis king, finding that she was the king of Colchester's daughter, ordered some chariots to be got ready, that he might pay the king, his father-in-law, a visit. The chariot in which the king and queen rode was adorned with rich gems of gold. The king, her father, was at first astonished that his daughter had been so fortunate, till the young king let him know of all that had happened. Great was the joy at court amongst all, with the exception of the queen and her club-footed daughter, who were ready to burst with envy. The rejoicings, with feasting and dancing, continued many days. Then at length they returned home with the dowry her father gave her.\\n\\nThe hump-backed princess, perceiving that her sister had been so lucky in seeking her fortune, wanted to do the same; so she told her mother, and all preparations were made, and she was furnished with rich dresses, and with sugar, almonds, and sweetmeats, in great quantities, and a large bottle of Malaga sack. With these she went the same road as her sister; and coming near the cave, the old man said: 'Young woman, whither so fast?'\\n\\n'What's that to you?' said she.\\n\\n'Then,' said he, 'what have you in your bag and bottle?'\\n\\nShe answered: 'Good things, which you shall not be troubled with.'\\n\\n'Won't you give me some?' said he.\\n\\n'No, not a bit, nor a drop, unless it would choke you.'\\n\\nThe old man frowned, saying: 'Evil fortune attend ye!'\\n\\nGoing on, she came to the hedge, through which she espied a gap, and thought to pass through it; but the hedge closed, and the thorns ran into her flesh, so that it was with great difficulty that she got through. Being now all over blood, she searched for water to wash herself, and, looking round she saw the well. 
She sat down on the brink of it, and one of the heads came up saying: 'Wash me, comb me, and lay me down softly', as before, but she banged it with her bottle, saying, 'Take that for your washing.' So the second and third heads came up, and met with no better treatment than the first. Whereupon the heads consulted among themselves what evils to plague her with for such usage.\\n\\nThe first said: 'Let her be struck with leprosy in her face.'\\n\\nThe second: 'Let her voice be as harsh as a corncrake's.'\\n\\nThe third said: 'Let her have for husband but a poor country cobbler.'\\n\\nWell, on she went till she came to a town, and it being market-day, the people looked at her, and, seeing such an ugly face, and hearing such a squeaky voice, all fled but a poor country cobbler. Now he not long before had mended the shoes of an old hermit, who, having no money, gave him a box of ointment for the cure of the leprosy, and a bottle of spirits for a harsh voice. So the cobbler, having a mind to do an act of charity, was induced to go up to her and ask her who she was.\\n\\n'I am,' said she, 'the king of Colchester's step-daughter.'\\n\\n'Well,' said the cobbler, 'if I restore you to your natural complexion, and make a sound cure both in face and voice, will you in reward take me for a husband?'\\n\\n'Yes, friend,' replied she, 'with all my heart!'\\n\\nWith this the cobbler applied the remedies, and they made her well in a few weeks; after which they were married, and so set forward for the court at Colchester. When the queen found that her daughter had married nothing but a poor cobbler, she hanged herself in wrath. The death of the queen so pleased the king, who was glad to get rid of her so soon, that he gave the cobbler a hundred pounds, to quit the court with his lady, and take to a remote part of the kingdom, where he lived many years mending shoes, his wife spinning the thread for him.\""
      ]
     },
     "execution_count": 127,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "txt = clean_story_text(story[3])\n",
    "txt"
   ]
  },
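  {
   "cell_type": "markdown",
   "id": "c4e1a7f2-regex-cleaning-sketch",
   "metadata": {},
   "source": [
    "As a minimal sketch of what the regular expressions in `clean_story_text()` do, the following applies the same patterns to a made-up markdown string (the `sample` text and its file paths are invented for illustration). Note that the link pattern removes the link text as well as the link target, which suits page-reference links but would also drop any words that happened to be wrapped in a link.\n",
    "\n",
    "```python\n",
    "import re\n",
    "\n",
    "# Hypothetical markdown fragment with an image, a page-reference link and surplus line breaks\n",
    "sample = \"![](img/fig1.jpg)\\n\\nOnce upon a time [p. 12](page12.htm) there was a king.\\n\\n\\n\\nThe end.\"\n",
    "\n",
    "# Same three substitutions as in clean_story_text(), applied step by step\n",
    "no_images = re.sub(r'!\\[\\]\\([^\\)]*\\)', '', sample)\n",
    "no_links = re.sub(r'\\[[^\\]]*\\]\\([^\\)]*\\)', '', no_images)\n",
    "print(re.sub(r'\\n+', '\\n\\n', no_links).strip())\n",
    "```"
   ]
  },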
  {
   "cell_type": "markdown",
   "id": "bcbd0b5d-34bf-4b62-8273-e0af941e2bcb",
   "metadata": {},
   "source": [
    "It will also be useful to parse out separate components of each story, such as the title, the body of the text, the first sentence and the closing paragraph."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 128,
   "id": "2765a85a-0022-4cd4-8583-6591f7273c84",
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_story_components(txt):\n",
    "    \"\"\"Extract components of story for db table.\"\"\"\n",
    "    txt = txt.strip()\n",
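    "    # The title is separated from the body by a markdown underline (a run of \"=\").\n",
    "    # Split on the first run only, so a later \"=\" cannot truncate the body.\n",
    "    _parts = re.split(r'=+', txt, maxsplit=1)\n",
    "    title = _parts[0].strip()\n",
    "    # Drop any page reference (\"p. ...\") from the title\n",
    "    title = re.sub(r'p\\.[^\\n]*', '', title).strip()\n",
    "    body = _parts[1].strip()\n",
    "    # Naive first sentence: the text up to the first full stop.\n",
    "    # Use a proper sentence parser?\n",
    "    first_sent = body.split(\".\")[0].strip()\n",
    "    last_para = body.split(\"\\n\\n\")[-1].strip()\n",
    "\n",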
    "    return title, body, first_sent, last_para"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 129,
   "id": "782cddc6-c453-422b-bb93-0c7da03f3dc8",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "('The Three Heads of the Well',\n",
       " \"LONG before Arthur and the Knights of the Round Table, there reigned in the eastern part of England a king who kept his court at Colchester.\\n\\nIn the midst of all his glory, his queen died, leaving behind her an only daughter, about fifteen years of age, who for her beauty and kindness was the wonder of all that knew her. But the king, hearing of a lady who had likewise an only daughter, had a mind to marry her for the sake of her riches, though she was old, ugly, hook-nosed, and hump-backed. Her daughter was a yellow dowdy, full of envy and ill-nature; and, in short, was much of the same mould as her mother. But in a few weeks the king, attended by the nobility and gentry, brought his deformed bride to the palace, where the marriage rites were performed. She had not been long in the court before she set the king against his own beautiful daughter by false reports. The young princess, having lost her father's love, grew weary of the court, and one day, meeting with her father in the garden, she begged him, with tears in her eyes, to let her go and seek her fortune; to which the king consented, and ordered her stepmother to give her what she pleased. She went to the queen, who gave her a canvas bag of brown bread and hard cheese, with a bottle of beer. Though this was but a pitiful dowry for a king's daughter, she took it, with thanks, and proceeded on her journey, passing through groves, woods, and valleys, till at length she saw an old man sitting on a stone at the mouth of a cave, who said: 'Good morrow, fair maiden, whither away so fast?'\\n\\n'Aged father,' says she, 'I am going to seek my fortune.'\\n\\n'What have you got in your bag and bottle?'\\n\\n'In my bag I have got bread and cheese, and in my bottle good small beer. Would you like to have some?'\\n\\n'Yes,' said he, 'with all my heart.'\\n\\nWith that the lady pulled out the provisions, and bade him eat and welcome. 
He did so, and gave her many thanks, and said: 'There is a thick thorny hedge before you, which you cannot get through, but take this wand in your hand, strike it three times, and say, 'Pray, hedge, let me come through', and it will open immediately; then, a little further, you will find a well; sit down on the brink of it, and there will come up three golden heads, which will speak; and whatever they require, that do.' Promising she would, she took her leave of him. Coming to the hedge and using the old man's wand, it divided, and let her through; then, coming to the well, she had no sooner sat down than a golden head came up singing:\\n\\n'Wash me and comb me,\\n\\nAnd lay me down softly.\\n\\nAnd lay me on a bank to dry,\\n\\nThat I may look pretty,\\n\\nWhen somebody passes by.'\\n\\n'Yes,' said she, and taking it in her lap combed it with a silver comb, and then placed it upon a primrose bank. Then up came a second and a third head, saying the same as the former. So she did the same for them, and then, pulling out her provisions, sat down to eat her dinner.\\n\\nThen said the heads one to another: 'What shall we weird for this damsel who has used us so kindly?'\\n\\nThe first said: 'I weird her to be so beautiful that she shall charm the most powerful prince in the world.'\\n\\nThe second said: 'I weird her such a sweet voice as shall far exceed the nightingale.'\\n\\nThe third said: 'My gift shall be none of the least, as she is a king's daughter; I'll weird her so fortunate that she shall become queen to the greatest prince that reigns.'\\n\\nShe then let them down into the well again, and so went on her journey. She had not travelled long before she saw a king hunting in the park with his nobles. 
She would have avoided him, but the king, having caught a sight of her, approached, and what with her beauty and sweet voice, fell desperately in love with her, and soon induced her to marry him.\\n\\nThis king, finding that she was the king of Colchester's daughter, ordered some chariots to be got ready, that he might pay the king, his father-in-law, a visit. The chariot in which the king and queen rode was adorned with rich gems of gold. The king, her father, was at first astonished that his daughter had been so fortunate, till the young king let him know of all that had happened. Great was the joy at court amongst all, with the exception of the queen and her club-footed daughter, who were ready to burst with envy. The rejoicings, with feasting and dancing, continued many days. Then at length they returned home with the dowry her father gave her.\\n\\nThe hump-backed princess, perceiving that her sister had been so lucky in seeking her fortune, wanted to do the same; so she told her mother, and all preparations were made, and she was furnished with rich dresses, and with sugar, almonds, and sweetmeats, in great quantities, and a large bottle of Malaga sack. With these she went the same road as her sister; and coming near the cave, the old man said: 'Young woman, whither so fast?'\\n\\n'What's that to you?' said she.\\n\\n'Then,' said he, 'what have you in your bag and bottle?'\\n\\nShe answered: 'Good things, which you shall not be troubled with.'\\n\\n'Won't you give me some?' said he.\\n\\n'No, not a bit, nor a drop, unless it would choke you.'\\n\\nThe old man frowned, saying: 'Evil fortune attend ye!'\\n\\nGoing on, she came to the hedge, through which she espied a gap, and thought to pass through it; but the hedge closed, and the thorns ran into her flesh, so that it was with great difficulty that she got through. Being now all over blood, she searched for water to wash herself, and, looking round she saw the well. 
She sat down on the brink of it, and one of the heads came up saying: 'Wash me, comb me, and lay me down softly', as before, but she banged it with her bottle, saying, 'Take that for your washing.' So the second and third heads came up, and met with no better treatment than the first. Whereupon the heads consulted among themselves what evils to plague her with for such usage.\\n\\nThe first said: 'Let her be struck with leprosy in her face.'\\n\\nThe second: 'Let her voice be as harsh as a corncrake's.'\\n\\nThe third said: 'Let her have for husband but a poor country cobbler.'\\n\\nWell, on she went till she came to a town, and it being market-day, the people looked at her, and, seeing such an ugly face, and hearing such a squeaky voice, all fled but a poor country cobbler. Now he not long before had mended the shoes of an old hermit, who, having no money, gave him a box of ointment for the cure of the leprosy, and a bottle of spirits for a harsh voice. So the cobbler, having a mind to do an act of charity, was induced to go up to her and ask her who she was.\\n\\n'I am,' said she, 'the king of Colchester's step-daughter.'\\n\\n'Well,' said the cobbler, 'if I restore you to your natural complexion, and make a sound cure both in face and voice, will you in reward take me for a husband?'\\n\\n'Yes, friend,' replied she, 'with all my heart!'\\n\\nWith this the cobbler applied the remedies, and they made her well in a few weeks; after which they were married, and so set forward for the court at Colchester. When the queen found that her daughter had married nothing but a poor cobbler, she hanged herself in wrath. The death of the queen so pleased the king, who was glad to get rid of her so soon, that he gave the cobbler a hundred pounds, to quit the court with his lady, and take to a remote part of the kingdom, where he lived many years mending shoes, his wife spinning the thread for him.\",\n",
       " 'LONG before Arthur and the Knights of the Round Table, there reigned in the eastern part of England a king who kept his court at Colchester',\n",
       " 'With this the cobbler applied the remedies, and they made her well in a few weeks; after which they were married, and so set forward for the court at Colchester. When the queen found that her daughter had married nothing but a poor cobbler, she hanged herself in wrath. The death of the queen so pleased the king, who was glad to get rid of her so soon, that he gave the cobbler a hundred pounds, to quit the court with his lady, and take to a remote part of the kingdom, where he lived many years mending shoes, his wife spinning the thread for him.')"
      ]
     },
     "execution_count": 129,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "get_story_components(txt)"
   ]
  },
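  {
   "cell_type": "markdown",
   "id": "d7b2e9a4-first-sentence-sketch",
   "metadata": {},
   "source": [
    "The `body.split(\".\")` approach only recognises full stops, so an opening sentence that ends with a question mark or exclamation mark would run on into the next sentence. A slightly more robust sketch, still using only the standard library, is shown below. The `first_sentence()` helper is a hypothetical name rather than part of the recipe, and abbreviations such as \"Mr.\" would still trip it up, so a dedicated sentence tokenizer may be the better long-term choice.\n",
    "\n",
    "```python\n",
    "import re\n",
    "\n",
    "def first_sentence(body):\n",
    "    \"\"\"Return the text up to the first ., ! or ? that is followed by whitespace or the end of the text.\"\"\"\n",
    "    match = re.search(r\".+?[.!?](?=\\s|$)\", body, flags=re.DOTALL)\n",
    "    # Fall back to the whole text if no sentence-ending punctuation is found\n",
    "    return match.group(0).strip() if match else body.strip()\n",
    "\n",
    "first_sentence(\"Who goes there? It was the king.\")\n",
    "```"
   ]
  },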
  {
   "cell_type": "markdown",
   "id": "2d8f11eb-0c1d-4b6c-8908-113c68ac62fc",
   "metadata": {},
   "source": [
    "We can parse all the stories into components and then add them to the database."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 130,
   "id": "28a032a2-64b0-47b4-aaa2-2a4ab0657733",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "https://www.sacred-texts.com/neu/eng/eft/index.htm\n",
      "https://www.sacred-texts.com/neu/eng/meft/index.htm\n",
      "https://www.sacred-texts.com/neu/celt/cft/index.htm\n",
      "Error Text [Zipped] https://www.sacred-texts.com/neu/celt/cft/cft.txt.gz\n",
      "https://www.sacred-texts.com/neu/celt/mcft/index.htm\n",
      "https://www.sacred-texts.com/hin/ift/index.htm\n",
      "Error The Soothsayers Son https://www.sacred-texts.com/hin/ift/ift11.htm\n"
     ]
    }
   ],
   "source": [
    "items = []\n",
    "\n",
    "for book in book_ids:\n",
    "    if \"st\" in book_ids[book]:\n",
    "        stories = get_stories_from_book(book)\n",
    "        for (book_title, book_id, _id, story) in stories:\n",
    "            (title, body, first_sent, last_para) = get_story_components(clean_story_text(story))\n",
    "            items.append({\"book_id\": book_id,\n",
    "                          \"book_title\": book_title,\n",
    "                          \"story_id\": _id,\n",
    "                          \"story_title\": title,\n",
    "                          \"story_text\": body,\n",
    "                          \"last_para\": last_para, # sometimes contains provenance\n",
    "                          \"first_line\": first_sent, # maybe we want to review the openings, or create an index...\n",
    "                          \"provenance\": \"\", # attempt at provenance\n",
    "                          \"chapter_order\": \"\", # Sort order of stories in book\n",
    "                         })\n",
    "\n",
    "# The upsert means \"add or replace\"; run it once, after all the books have been parsed\n",
    "db[\"stories\"].upsert_all(items, pk=\"story_id\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "07955294-9dcc-47e7-833c-a053c07c3f5e",
   "metadata": {},
   "source": [
    "Run a test query on the database:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 132,
   "id": "0d7dd539-b832-45be-aba9-c1f7cc747188",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'story_title': 'The Black Horse', 'story_text': 'ONCE\\n\\nthere was a king and he had three sons, and when the king died, they did not\\n\\ngive a shade of anything to the youngest son, but an old white limping garron.\\n\\n\"If I get but this,\" quoth he, \"it seems that I\\n\\nhad best go with this same.\\n\\nHe was going with it right before him, sometimes walking,\\n\\nsometimes riding. When he had been riding a good while he thought that the\\n\\ngarron would need a while of eating, so he came down to earth, and what should\\n\\nhe see coming out of the heart of the western airt towards him but a rider\\n\\nriding high, well, and right well.\\n\\n\"AllI hail, my lad,\" said he.\\n\\n\"Hail, king\\'s son,\" said the other.\\n\\n\"What\\'s your news?\" said the king\\'s son.\\n\\n\"I have got that,\" said the lad who came. \"I am\\n\\nafter breaking my heart riding this ass of a horse ; but will you give me the\\n\\nlimping white garron for him?\"\\n\\n\"No,\" said the prince; \"it would be a bad\\n\\nbusiness for me*.*\"\\n\\n\"You need not fear,\" said the man that came,\\n\\n\"there is no saying but that you might make better use of him than I. He\\n\\nhas one value, there is no single place that you can think of in the four\\n\\nparts of the wheel of the world that the black horse will not take you\\n\\nthere.\"\\n\\nSo the king\\'s son got the black horse, and he gave the limping\\n\\nwhite garron.\\n\\nWhere should he think of being when he mounted but in the\\n\\nRealm Underwaves. He went, and before sunrise on the morrow he was there. What\\n\\nshould he find when he got there but the son of the King Underwaves holding a\\n\\nCourt, and the people of the realm gathered to see if there was any one who\\n\\nwould undertake to go to seek the daughter of the King of the Greeks to be the\\n\\nprince\\'s wife. 
No one came forward, when who should come up but the rider of\\n\\nthe black horse.\\n\\n\"You, rider of the black horse,\" said the prince,\\n\\n\"I lay you under crosses and under spells to have the daughter of the\\n\\nKing of the Greeks here before the sun rises to-morrow.\"\\n\\nHe went out and he reached the black horse and leaned his\\n\\nelbow on his mane, and he heaved a sigh.\\n\\n\"Sigh of a king\\'s son under spells I\". said the\\n\\nhorse; but have no care; we shall do the thing that was set before you.\"\\n\\nAnd so off they went.\\n\\n\"Now,\" said the horse, \"when\\n\\nwe get near the great town of the Greeks, you will notice that the four feet of\\n\\na horse never went to the town before. The king\\'s daughter will see me from the\\n\\ntop of the castle looking out of a window, and she will not be content without a\\n\\nturn of a ride upon me. Say that she may have that, but the horse will suffer no\\n\\nman but you to ride before a woman on him.\"\\n\\nThey came near the big town, and he fell to horsemanship; and\\n\\nthe princess was looking out of the windows, and noticed the horse. The\\n\\nhorsemanship pleased her, and she came out just as the horse had come.\\n\\n\"Give me a ride on the horse,\" said she.\\n\\n\"You shall have that,\" said he, \"but the horse\\n\\nwill let no man ride him before a woman but me.\"\\n\\n\"I have a horseman of my own,\" said she.\\n\\n\"If so, set him in front,\" said he.\\n\\nBefore the horseman mounted at all, when he tried to get up, the\\n\\nhorse lifted his legs and kicked him off.\\n\\n\"Come then yourself and mount before me,\" said she;\\n\\n\"I won\\'t leave the matter so.\"\\n\\nHe mounted the horse and she behind him, and before she glanced\\n\\nfrom her she was nearer sky than earth. He was in Realm Underwaves with her\\n\\nbefore sunrise.\\n\\n\"You are come,\" said Prince Underwaves.\\n\\n\"I am come,\" said he.\\n\\n\"There you are, my hero,\" said the prince. 
\"You\\n\\nare the son of a king, but I am a son of success. Anyhow, we shall have no delay\\n\\nor neglect now, but a wedding.\"\\n\\n\"Just gently,\" said the princess; \"your wedding\\n\\nis not so short a way off as you suppose. Till I get the silver cup that my\\n\\ngrandmother had at her wedding, and that my mother had as well, I will not\\n\\nmarry, for I need to have it at my own wedding.\"\\n\\n\"You, rider of the black horse,\" said the Prince\\n\\nUnderwaves, \"I set you under spells and under crosses unless the silver cup\\n\\nis here before dawn to-morrow.\"\\n\\nOut he went and reached the horse and leaned his elbow on his\\n\\nmane, and he heaved a sigh.\\n\\n\"Sigh of a king\\'s son under spells !\" said the horse;\\n\\n\"mount and you shall get the silver cup. The people of the realm are\\n\\ngathered about the king tonight, for he has missed his daughter, and when you\\n\\nget to the palace go in and leave me without; they will have the cup there going\\n\\nround the company. Go in and sit in their midst. Say nothing, and seem to be as\\n\\none of the people of the place. But when the cup comes round to you, take it\\n\\nunder your oxter, and come out to me with it, and we\\'ll go.\"\\n\\nAway they went and they got to Greece, and he went in to the\\n\\npalace and did as the black horse bade. He took the cup and came out and\\n\\nmounted, and before sunrise he was in the Realm Underwaves.\\n\\n\"You are come,\" said Prince Underwaves.\\n\\n\"I am come,\" said he.\\n\\n\"We had better get married now,\" said the prince to\\n\\nthe Greek princess.\\n\\n\"Slowly and softly,\" said she. \"I will not marry\\n\\ntill I get the silver ring that my grandmother and my mother wore when they were\\n\\nwedded.\"\\n\\n\"You, rider of the black horse,\" said the Prince\\n\\nUnderwaves, \"do that. 
Let\\'s have that ring here to-morrow at sunrise.\"\\n\\nThe lad went to the black horse and put his elbow on his crest\\n\\nand told him how it was.\\n\\n\"There never was a matter set before me harder than this\\n\\nmatter which has now been set in front of me,\" said the horse, \" but\\n\\nthere is no help for it at any rate. Mount me. There is a snow mountain and an\\n\\nice mountain and a mountain of fire between us and the winning of that ring. It\\n\\nis right hard for us to pass them.\"\\n\\nThus they went as they were, and about a mile from the snow\\n\\nmountain they were in a bad case with cold. As they came near it he struck the\\n\\nhorse, and with the bound he gave the black horse was on the top of the snow\\n\\nmountain ; at the next bound he was on the top of the ice mountain; at the third\\n\\nbound he went through the mountain of fire. When he had passed the mountains he\\n\\nwas dragging at the horse\\'s neck, as though he were about to lose himself. He\\n\\nwent on before him down to a town below.\\n\\n\"Go down,\\'\\' said the black horse, \\'\\'to a smithy; make an\\n\\niron spike for every bone end in me.\"\\n\\nDown he went as the horse desired, and he got the spikes made,\\n\\nand back he came with them.\\n\\n\"Stick them into me,\" said the horse, \"every\\n\\nspike of them in every bone end that I have.\"\\n\\nThat he did ; he stuck the spikes into the horse.\\n\\n\"There is a loch here,\" said the horse, \"four\\n\\nmiles long and four miles wide, and when I go out into it the loch will take\\n\\nfire and blaze. If you see the Loch of Fire going out before the sun rises,\\n\\nexpect me, and if not, go your way.\"\\n\\nOut went the black horse into the lake, and the lake became\\n\\nflame. Long was he stretched about the lake, beating his palms and roaring. 
Day\\n\\ncame, and the loch did not go out.\\n\\nBut at the hour when the sun was rising out of the water the\\n\\nlake went out.\\n\\nAnd the black horse rose in the middle of the water with one\\n\\nsingle spike in him, and the ring upon its end.\\n\\nHe came on shore, and down he fell beside the loch.\\n\\nThen down went the rider. He got the ring, and he dragged the\\n\\nhorse down to the side of a hill. He fell to sheltering him with his arms about\\n\\nhim, and as the sun was rising he got better and better, till about. midday,\\n\\nwhen he rose on his feet.\\n\\n\"Mount,\" said the horse, \"and let us\\n\\nbegone.\"\\n\\nHe mounted on the black horse, and away they went. He reached\\n\\nthe mountains, and he leaped the horse at the fire mountain and was on the top.\\n\\nFrom the mountain of fire he leaped to the mountain of ice, and from the\\n\\nmountain of ice to the mountain of snow. He put the mountains past him, and by\\n\\nmorning he was in realm under the waves.\\n\\n\"You are come,\" said the prince.\\n\\n\"I am,\" said he.\\n\\n\"That\\'s true,\" said Prince Underwaves. \"A king\\'s\\n\\nson are you, but a son of success am I. We shall have no more mistakes and\\n\\ndelays, but a wedding this time.\"\\n\\n\"Go easy,\" said the Princess of the Greeks. \"Your\\n\\nwedding is not so near as you think yet. Till you make a castle, I won\\'t marry\\n\\nyou. 
Not to your father\\'s castle nor to your mother\\'s will I go to dwell; but\\n\\nmake me a castle for which your father\\'s castle will not make washing\\n\\nwater.\"\\n\\n\"You, rider of the black horse, make that,\" said\\n\\nPrince Underwaves, \"before the morrow\\'s sun rises.\"\\n\\nThe lad went out to the horse and leaned his elbow on his neck\\n\\nand sighed, thinking that this castle never could be made for ever.\\n\\n\"There never came a turn in my road yet that is easier for\\n\\nme to pass than this,\" said the black horse.\\n\\nGlance that the lad gave from him he saw all that there were,\\n\\nand ever so many wrights and stone masons at work, and the castle was ready\\n\\nbefore the sun rose.\\n\\nHe shouted at the Prince Underwaves, and he saw the castle. He\\n\\ntried to pluck out his eye, thinking that it was a false sight.\\n\\n\"Son of King Underwaves,\" said the rider of the black\\n\\nhorse, \"don\\'t think that you have a false sight ; this is a true\\n\\nsight.\"\\n\\n\"That\\'s true,\" said the prince. \"You are a son of\\n\\nsuccess, but I am a son of success too. There will be no more mistakes and\\n\\ndelays, but a wedding now.\"\\n\\n\"No,\" said she. The time is come. Should we not go to\\n\\nlook at the castle ? There\\'s time enough to get married before the night\\n\\ncomes.\"\\n\\nThey went to the castle and the castle was without a \" but\\n\\n\" ----\\n\\n\"I see one,\" said the prince. \"One want at least\\n\\nto be made good. 
A well to be made inside, so that water may not be far to fetch\\n\\nwhen there is a feast or a wedding in the castle.\"\\n\\n\"That won\\'t be long undone,\" said the rider of the\\n\\nblack horse.\\n\\nThe well was made, and it was seven fathoms deep and two or\\n\\nthree fathoms wide, and they looked at the well on the way to the wedding.\\n\\n\"It is very well made,\" said she, \"but for one\\n\\nlittle fault yonder.\"\\n\\n\"Where is it?\" said Prince Underwaves.\\n\\n\"There,\" said she.\\n\\nHe bent him down to look. She came out, and she put her two\\n\\nhands at his back, and cast him in.\\n\\n\"Be thou there,\" said she. \"If I go to be\\n\\nmarried, thou art not the man; but the man who did each exploit that has been\\n\\ndone, and, if he chooses, him will I have.\"\\n\\nAway she went with the rider of the little black horse to the\\n\\nwedding.\\n\\nAnd at the end of three years after that so it was that he first\\n\\nremembered the black horse or where be left him.\\n\\nHe got up and went out, and he was very sorry for his neglect of\\n\\nthe black horse. He found him just where he left him.\\n\\n\"Good luck to you, gentleman,\" said the horse.\\n\\n\"You seem as if you had got something that you like better than me.\"\\n\\n\"I have not got that, and I won\\'t; but it came over me to\\n\\nforget you,\" said he.\\n\\n\"I don\\'t mind,\" said the horse, \"it will make no\\n\\ndifference. 
Raise your sword and smite off my head.\"\\n\\n\"Fortune will now allow that I should do that,\" said\\n\\nhe.\\n\\n\"Do it instantly, or I will do it to you,\" said the\\n\\nhorse.\\n\\nSo the lad drew his sword and smote off the horse\\'s head ; then\\n\\nhe lifted his two palms and uttered a doleful cry.\\n\\nWhat should he hear behind him but \" All hail, my\\n\\nbrother-in-law.\"\\n\\nHe looked behind him, and there was the finest man he ever set\\n\\neyes upon.\\n\\n\"What set you weeping for the black horse?\" said he.\\n\\n\"This,\" said the lad, \"that there never was born\\n\\nof man or beast a creature in this world that I was fonder of.\"\\n\\n\"Would you take me for him ?\" said the stranger.\\n\\n\"If I could think you the horse, I would ; but if not, I\\n\\nwould rather the horse,\" said the rider.\\n\\n\"I am the black horse,\" said the\\n\\nlad, \"and if I were not, how should you have all these things that you went\\n\\nto seek in my father\\'s house. Since I went under spells, many a man have I ran\\n\\nat before you met me. They had but one word amongst them : they could not keep\\n\\nme, nor manage me, and they never kept me a couple of days. But when I fell in\\n\\nwith you, you kept me till the time ran out that was to come from the spells.\\n\\nAnd now you shall go home with me, and we will make a wedding in my father\\'s\\n\\nhouse.\"'}\n",
      "{'story_title': 'The King of England and his Three Sons', 'story_text': 'ONCE upon a time there was an old king who had three sons; and the old king fell very sick one time and there was nothing at all could make him well but some golden apples from a far country. So the three brothers went on horseback to look for some of these apples. They set off together, and when they came to cross-roads they halted and refreshed themselves a bit; and then they agreed to meet on a certain time, and not one was to go home before the other. So Valentine took the right, and Oliver went straight on, and poor Jack took the left.\\n\\nTo make my long story short, I shall follow poor Jack, and let the other two take their chances, for I don\\'t think there was much good in them. Off poor Jack rides over hills, dales, valleys, and mountains, through woolly woods and sheepwalks, where the old chap never sounded his hollow bugle-horn, farther than I can tell you tonight or ever intend to tell you.\\n\\nAt last he came to an old house, near a great forest, and there was an old man sitting out by the door, and his look was enough to frighten you or anyone else; and the old man said to him:\\n\\n\\'Good morning, my king\\'s son.\\'\\n\\n\\'Good morning to you, old gentleman,\\' was the young prince\\'s answer; frightened out of his wits though he was, he didn\\'t like to give in.\\n\\nThe old gentleman told him to dismount and to go in to have some refreshment, and to put his horse in the stable, such as it was. Jack soon felt much better after having something to eat, and began to ask the old gentleman how he knew he was a king\\'s son.\\n\\n\\'Oh dear!\\' said the old man, \\'I knew that you were a king\\'s son, and I know what is your business better than what you do yourself. So you will have to stay here tonight; and when you are in bed you mustn\\'t be frightened whatever you may hear. 
There will come all manner of frogs and snakes, and some will try to get into your eyes and your mouth, but mind, don\\'t stir the least bit or you will turn into one of those things yourself.\\'\\n\\nPoor Jack didn\\'t know what to make of this, but, however, he ventured to go to bed. Just as he thought to have a bit of sleep, round and over and under him they came, but he never stirred an inch all night.\\n\\n\\'Well, my young son, how are you this morning?\\'\\n\\n\\'Oh, I am very well, thank you, but I didn\\'t have much rest.\\'\\n\\n\\'Well, never mind that; you have got on very well so far, but you have a great deal to go through before you can have the golden apples to go to your father. You\\'d better come and have some breakfast before you start on your way to my other brother\\'s house. You will have to leave your own horse here with me until you come back again, and tell me everything about how you get on.\\'\\n\\nAfter that out came a fresh horse for the young prince, and the old man gave him a ball of yarn, and he flung it between the horse\\'s two ears.\\n\\nOff he went as fast as the wind, which the wind behind could not catch the wind before, until he came to the second oldest brother\\'s house. When he rode up to the door he had the same salute as from the first old man, but this one was even uglier than the first one. He had long grey hair, and his teeth were curling out of his mouth, and his finger- and toe-nails had not been cut for many thousand years. He put the horse into a much better stable, and called Jack in, and gave him plenty to eat and drink, and they had a bit of a chat before they went to bed.\\n\\n\\'Well, my young son,\\' said the old man, \\'I suppose you are one of the king\\'s children come to look for the golden apples to bring him back to health.\\'\\n\\n\\'Yes, I am the youngest of the three brothers, and I should like to get them to go back with.\\'\\n\\n\\'Well, don\\'t mind, my young son. 
Before you go to bed tonight I will send to my eldest brother, and will tell him what you want, and he won\\'t have much trouble in sending you on to the place where you must get the apples. But mind not to stir tonight no matter how you get bitten and stung, or else you will work great mischief to yourself.\\'\\n\\nThe young man went to bed and bore all, as he did the first night, and got up the next morning well and hearty. After a good breakfast out comes a fresh horse, and a ball of yarn to throw between his ears. The old man told him to jump up quick, and said that he had made it all right with his eldest brother, not to delay for anything whatever, \\'For,\\' said he, \\'you have a good deal to go through with in a very short and quick time.\\'\\n\\nHe flung the ball, and off he goes as quick as lightning, and comes to the eldest brother\\'s house. The old man received him very kindly and told him he long wished to see him, and that he would go through his work like a man and come back safe and sound. \\'Tonight,\\' said he, \\'I will give you rest; there shall nothing come to disturb you, so that you may not feel sleepy for tomorrow. And you must mind to get up middling early, for you\\'ve got to go and come all in the same day; there will be no place for you to rest within thousands of miles of that place; and if there was, you would stand in great danger never to come from there in your own form. Now, my young prince, mind what I tell you. Tomorrow, when you come in sight of a very large castle, which will be surrounded with black water, the first thing you will do you will tie your horse to a tree, and you will see three beautiful swans in sight, and you will say, \"Swan, swan, carry me over in the name of the Griffin of the Greenwood\", and the swans will swim you over to the earth. There will be three great entrances, the first guarded by four giants with drawn swords in their hands, the second by lions, the other by fiery serpents and dragons. 
You will have to be there exactly at one o\\'clock; and mind and leave there precisely at two, and not a moment later. When the swans carry you over to the castle, you will pass all these things, all fast asleep, but you must not notice any of them.\\n\\n\\'When you go in, you will turn up to the right; you will see some grand rooms, then you will go downstairs through the cooking kitchen, and through a door on your left you go into a garden, where you will find the apples you want for your father to get well. After you fill your wallet, you make all speed you possibly can, and call out for the swans to carry you over the same as before. After you get on your horse, should you hear anything shouting or making any noise after you, be sure not to look back, as they will follow you for thousands of miles; but when the time is up and you get near my place, it will all be over. Well now, my young man, I have told you all you have to do tomorrow; and mind, whatever you do, don\\'t look about you when you see all those frightful things asleep. Keep a good heart, and make haste from there, and come back to me with all the speed you can. I should like to know how my two brothers were when you left them, and what they said to you about me.\\'\\n\\n\\'Well, to tell the truth, before I left London my father was sick, and said I was to come here to look for the golden apples, for they were the only things that would do him good; and when I came to your youngest brother, he told me many things I had to do before I came here. And I thought once that your youngest brother put me in the wrong bed, when he put all those snakes to bite me all night long, until your second brother told me \"So it was to be\", and said, \"It is the same here\", but said you had none in your beds.\\'\\n\\n\\'Well, let\\'s go to bed. You need not fear. There are no snakes here.\\'\\n\\nThe young man went to bed, and had a good night\\'s rest, and got up the next morning as fresh as newly caught trout. 
Breakfast being over, out comes the other horse, and, while saddling and fettling, the old man began to laugh, and told the young gentleman that if he saw a pretty young lady, not to stay with her too long, because she might waken, and then he would have to stay with her or to be turned into one of those unearthly monsters, like those he would have to pass by going into the castle.\\n\\n\\'Ha! ha! ha! you make me laugh so that I can scarcely buckle the saddle-straps. I think I shall make it all right, my uncle, if I see a young lady there, you may depend.\\'\\n\\n\\'Well, my boy, I shall see how you will get on.\\'\\n\\nSo he mounts his Arab steed, and off he goes like a shot out of a gun. At last he comes in sight of the castle. He ties his horse safe to a tree, and pulls out his watch. It was then a quarter to one, when he called out, \\'Swan, swan, carry me over, in the name of the old Griffin of the Greenwood.\\' No sooner said than done. A swan under each side, and one in front, took him over in a crack. He got on his legs, and walked quietly by all those giants, lions, fiery serpents, and all manner of other frightful things too numerous to mention, while they were fast asleep, and that only for the space of one hour, when into the castle he goes neck or nothing. Turning to the right, upstairs he runs, and enters into a very grand bedroom, and sees a beautiful princess lying full stretch on a gold bedstead, fast asleep. He gazed on her beautiful form with admiration, and he takes her garter off, and buckles it on his own leg, and he buckles his on hers; he also takes her gold watch and pocket-handkerchief, and exchanges his for hers; after that he ventures to give her a kiss, when she very nearly opens her eyes. Seeing the time short, off he runs downstairs, and passing through the kitchen to go into the garden for the apples, he could see the cook all-fours on her back on the middle of the floor, with the knife in one hand and the fork in the other. 
He found the apples, and filled the wallet; and on passing through the kitchen the cook near wakened, but he was obliged to make all the speed he possibly could, as the time was nearly up. He called out for the swans, and they managed to take him over; but they found that he was a little heavier than before. No sooner than he had mounted his horse he could hear a tremendous noise, the enchantment was broke, and they tried to follow him, but all to no purpose. He was not long before he came to the oldest brother\\'s house; and glad enough he was to see it, for the sight and the noise of all those things that were after him nearly frightened him to death.\\n\\n\\'Welcome, my boy; I am proud to see you. Dismount and put the horse in the stable, and come in and have some refreshments; I know you are hungry after all you have gone through in that castle. And tell me all you did, and all you saw there. Other kings\\' sons went by here to go to that castle, but they never came back alive, and you are the only one that ever broke the spell. And now you must come with me, with a sword in your hand, and must cut my head off, and must throw it in that well.\\'\\n\\nThe young prince dismounts, and puts his horse in the stable, and they go in to have some refreshments, for I can assure you he wanted some; and after telling everything that passed, which the old gentleman was very pleased to hear, they both went for a walk together, the young prince looking around and seeing the place looking dreadful, as did the old man. He could scarcely walk from his toe-nails curling up like ram\\'s horns that had not been cut for many hundred years, and big long hair. They come to a well, and the old man gives the prince a sword, and tells him to cut his head off, and throw it in that well. 
The young man has to do it against his wish, but has to do it.\\n\\nNo sooner has he flung the head in the well, than up springs one of the finest young gentlemen you would wish to see; and instead of the old house and the frightful-looking place, it was changed into a beautiful hall and grounds. And they went back and enjoyed themselves well, and had a good laugh about the castle.\\n\\nThe young prince leaves this young gentleman in all his glory, and he tells the prince before leaving that he will see him again before long. They have a jolly shake-hands, and off he goes to the next oldest brother; and, to make my long story short, he has to serve the other two brothers the same as the first.\\n\\nNow the youngest brother began to ask him how things went on. \\'Did you see my two brothers?\\'\\n\\n\\'Yes.\\'\\n\\n\\'How did they look?\\'\\n\\n\\'Oh! they looked very well. I liked them much. They told me many things what to do.\\'\\n\\n\\'Well, did you go to the castle?\\'\\n\\n\\'Yes, my uncle.\\'\\n\\n\\'And will you tell me what you see in there? Did you see the young lady?\\'\\n\\n\\'Yes, I saw her, and plenty of other frightful things.\\'\\n\\n\\'Did you hear any snake biting you in my oldest brother\\'s bed?\\'\\n\\n\\'No, there were none there; I slept well.\\'\\n\\n\\'You won\\'t have to sleep in the same bed tonight. You will have to cut my head off in the morning.\\'\\n\\nThe young prince had a good night\\'s rest, and changed all the appearance of the place by cutting his friend\\'s head off before he started in the morning. A jolly shake-hands, and the uncle tells him it\\'s very probable he shall see him again soon when he is not aware of it. This one\\'s mansion was very pretty, and the country around it beautiful, after his head was cut off. Off Jack goes, over hills, dales, valleys, and mountains, and very near losing his apples again.\\n\\nAt last he arrives at the cross-roads, where he had to meet his brothers, on the very day appointed. 
Coming up to the place, he sees no tracks of horses, and, being very tired, he lays himself down to sleep, by tying the horse to his leg, and putting the apples under his head. Presently up come the other brothers the same time to the minute, and found him fast asleep; and they would not waken him, but said one to another, \\'Let us see what sort of apples he has got under his head.\\' So they took and tasted them, and found they were different to theirs. They took and changed his apples for theirs, and off to London as fast as they could, and left the poor fellow sleeping.\\n\\nAfter a while he awoke, and, seeing the tracks of other horses, he mounted and off with him, not thinking anything about the apples being changed. He had still a long way to go, and by the time he got near London he could hear all the bells in the town ringing, but did not know what was the matter till he rode up to the palace, when he came to know that his father was recovered by his brothers\\' apples. When he got there his two brothers were off to some sports for a while; and the king was glad to see his youngest son, and very anxious to taste his apples. But when he found out that they were not good, and thought that they were more for poisoning him, he sent immediately for the headsman to behead his youngest son, who was taken away there and then in a carriage. But instead of the headsman taking his head off, he took him to a forest not far from the town, because he had pity on him, and there left him to take his chance, when presently up comes a big hairy bear, limping upon three legs. The prince, poor fellow, climbed up a tree, frightened of him, but the bear told him to come down, that it was no use of him to stop here. With hard persuasion poor Jack comes down, and the bear speaks to him and bids him: \\'Come here to me; I will not do you any harm. 
It\\'s better for you to come with me and have some refreshments; I know that you are hungry all this time.\\'\\n\\nThe poor young prince says, \\'No, I am not hungry; but I was very frightened when I saw you coming to me first, as I had no place to run away from you.\\'\\n\\nThe bear said, \\'I was also afraid of you when I saw that gentleman setting you down from the carriage. I thought you would have guns with you, and that you would not mind killing me if you saw me; but when I saw the gentleman going away with the carriage, and leaving you behind by yourself, I made bold to come to you, to see who you were, and now I know who you are very well. Are you not the king\\'s youngest son? I have seen you and your brothers and lots of other gentlemen in this wood many times. Now before we go from here, I must tell you that I am in disguise; and I shall take you where we are stopping.\\'\\n\\nThe young prince tells him everything from first to last, how he started in search of the apples, and about the three old men, and about the castle, and how he was served at last by his father after he came home; and instead of the headsman taking his head off, he was kind enough to leave him his life, \\'and here I am now, under your protection.\\'\\n\\nThe bear tells him, \\'Come on, my brother; there shall no harm come to you as long as you are with me.\\'\\n\\nSo he takes him up to the tents; and when they see \\'em coming, the girls begin to laugh, and say, \\'Here is our Jubal coming with a young gentleman.\\' When he advanced nearer the tents, they all knew that he was the young prince that had passed by that way many times before; and when Jubal went to change himself, he called most of them together into one tent, and told them all about him, and to be kind to him. And so they were, for there was nothing that he desired but what he had, the same as if he was in the palace with his father and mother. 
Jubal, after he pulled off his hairy coat, was one of the finest young men amongst them, and he was the young prince\\'s closest companion. The young prince was always very sociable and merry, only when he thought of the gold watch he had from the young princess in the castle, and which he had lost he knew not where.\\n\\nHe passed off many happy days in the forest; but one day he and poor Jubal were strolling through the trees, when they came to the very spot where they first met, and, accidentally looking up, he could see his watch hanging in the tree which he had to climb when he first saw poor Jubal coming to him in the form of a bear; and he cries out, \\'Jubal, Jubal, I can see my watch up in that tree.\\'\\n\\n\\'Well, I am sure, how lucky!\\' exclaimed poor Jubal, \\'shall I go and get it down?\\'\\n\\n\\'No, I\\'d rather go myself,\\' said the young prince.\\n\\nNow whilst all this was going on, the young princess in that castle, seeing that one of the King of England\\'s sons had been there by the changing of the watch and other things, got herself ready with a large army, and sailed off for England. She left her army a little out of the town, and she went with her guards straight up to the palace to see the king, and also demanded to see his sons. They had a long conversation together about different things. At last she demands one of the sons to come before her; and the oldest comes, when she asks him, \\'Have you ever been at the Castle of Melvales?\\' and he answers, \\'Yes.\\' She throws down a pocket handkerchief and bids him to walk over it without stumbling. He goes to walk over it, and no sooner did he put his foot on it, than he fell down and broke his leg. He was taken off immediately and made a prisoner of by her own guards. The other was called upon, and was asked the same questions, and had to go through the same performance, and he also was made a prisoner of. 
Now she says, \\'Have you not another son?\\' when the king began to [to] shiver and shake and knock his two knees together that he could scarcely stand upon his legs, and did not know what to say to her, he was so much frightened. At last a thought came to him to send for his headsman, and inquire of him particularly, Did he behead his son, or was he alive?\\n\\n\\'He is saved, O King.\\'\\n\\n\\'Then bring him here immediately, or else I shall be done for.\\'\\n\\nTwo of the fastest horses they had were put in the carriage, to go and look for the poor prince; and when they got to the very spot where they left him, it was the time when the prince was up the tree, getting his watch down, and poor Jubal standing a distance off. They cried out to him, Had he seen another young man in this wood? Jubal, seeing such a nice carriage, thought something, and did not like to say No, and said Yes, and pointed up the tree; and they told him to come down immediately, as there was a young lady in search of him.\\n\\n\\'Ha! ha! ha! Jubal, did you ever hear such a thing in all your life, my brother?\\'\\n\\n\\'Do you call him your brother?\\'\\n\\n\\'Well, he has been better to me than my brothers.\\'\\n\\n\\'Well, for his kindness he shall accompany you to the palace, and see how things turn out.\\'\\n\\nAfter they go to the palace, the prince has a good wash, and appears before the princess, when she asks him, Had he ever been at the Castle of Melvales? With a smile upon his face, he gives a graceful bow. And says my lady, \\'Walk over that handkerchief without stumbling.\\' He walks over it many times, and dances upon it, and nothing happened to him. She said, with a proud and smiling air, \\'That is the young man\\'; and out come the objects exchanged by both of them. 
Presently she orders a very large box to be brought in and to be opened, and out come some of the most costly uniforms that were ever worn on an emperor\\'s back; and when he dressed himself up, the king could scarcely look upon him from the dazzling of the gold and diamonds on his coat. He orders his two brothers to be in confinement for a period of time; and before the princess asks him to go with her to her own country, she pays a visit to the bear\\'s camp, and she makes some very handsome presents for their kindness to the young prince. And she gives Jubal an invitation to go with them, which he accepts; wishes them a hearty farewell for a while, promising to see them all again in some little time.\\n\\nThey go back to the king and bid farewell, and tell him not to be so hasty another time to order people to be beheaded before having a proper cause for it. Off they go with all their army with them; but while the soldiers were striking their tents, the prince bethought himself of his Welsh harp, and had it sent for immediately to take with him in a beautiful wooden case. They called to see each of those three brothers whom the prince had to stay with when he was on his way to the Castle of Melvales; and I can assure you, when they all got together, they had a very merry time of it. And there we will leave them.'}\n"
     ]
    }
   ],
   "source": [
    "q = 'king \"three sons\" princess sword'\n",
    "\n",
    "# The `.search()` method knows how to find the full text search table\n",
    "# given the original table name\n",
    "for story in db[\"stories\"].search(db.quote_fts(q), columns=[\"story_title\", \"story_text\"]):\n",
    "    print(story)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "917985e6-aaf3-469e-9ec0-eb2ae61599d9",
   "metadata": {},
   "outputs": [],
   "source": [
    "# We could manually close the databse.\n",
    "#db.conn.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dd170242-559b-4f19-bd3c-3e58a0f8f010",
   "metadata": {},
   "source": [
    "https://huggingface.co/course/chapter5/6?fw=tf and use the doc2vec recipe?\n",
    "\n",
    "https://github.com/neuml/txtai ?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1d9ee37d-4d53-4e1b-be68-838430b8fe4f",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}