{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "CWPK \\#60: Adding a SPARQL Endpoint - Part II\n",
    "=======================================\n",
    "\n",
    "Finally Getting a Working Instance\n",
    "--------------------------\n",
    "\n",
    "<div style=\"float: left; width: 305px; margin-right: 10px;\">\n",
    "\n",
    "<img src=\"http://kbpedia.org/cwpk-files/cooking-with-kbpedia-305.png\" title=\"Cooking with KBpedia\" width=\"305\" />\n",
    "\n",
    "</div>\n",
    "\n",
    "Yesterday's installment of [*Cooking with Python and KBpedia*](https://www.mkbergman.com/cooking-with-python-and-kbpedia/) presented the first part of this two-part series on developing a [SPARQL](https://en.wikipedia.org/wiki/SPARQL) endpoint for KBpedia on a remote server. This concluding part picks up with **step #7** in the stepwise approach I took to complete this task. \n",
    "\n",
    "At the outset I thought it would progress rapidly: After all, is not SPARQL a proven query language with central importance to [knowledge graphs](https://en.wikipedia.org/wiki/Knowledge_graph)? But, possibly because our focus in the series is [Python](https://en.wikipedia.org/wiki/Python_(programming_language)), or perhaps for other reasons, I have found a dearth of examples to follow regarding setting up a Python SPARQL endpoint (there are some resources available related to [REST](https://en.wikipedia.org/wiki/Representational_state_transfer) [APIs](https://en.wikipedia.org/wiki/API)).\n",
    "\n",
    "The first six steps in yesterday's installment covered getting our environment set up on the remote Linux server, including installing the Web framework [Flask](https://en.wikipedia.org/wiki/Flask_(web_framework)) and creating a virtual environment. We also presented the Web page form design and template for our SPARQL query form. This second part covers the steps of tieing this template form into actual endpoint code, which proved to be simple in presentation but exceeding difficult to formulate and debug. Once this working endpoint is in hand, I next cover the steps of giving the site an external URL address, starting and stopping the service on the server, and packaging the code for GitHub distribution. I conclude this two-part series with some lessons learned, some comments on the use and relevance of [linked data](https://en.wikipedia.org/wiki/Linked_data), and point to additional documentation."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Step-wise Approach (con't)\n",
    "We pick up our step-wise approach here.\n",
    "\n",
    "**7. Tie SPARQL Form to Local Instance**\n",
    "\n",
    "So far, we have a local instance that works from the command line and an empty SPARQL form. We need to relate thest two pieces together. In the last installment, I noted two SPARQL-related efforts, [pyLDAPI](https://github.com/RDFLib/pyLDAPI) (and its [GNAF](https://github.com/CSIRO-enviro-informatics/gnaf-dataset/blob/master/view/templates/page_sparql.html) example) and [adhs](https://github.com/nareike/adhs/blob/master/templates/sparql.html). I could not find working examples for either, but I did consult their code frequently while testing various options.\n",
    "\n",
    "Thus, unlike many areas throughout this **CWPK** series, I really had no working examples from which to build or modify our current SPARQL endpoint needs. While the related efforts above and other examples could provide single functions or small snippets, possibly as use for guidance or some modification, it was pretty clear I was going to need to build up the code step-by-step, in a similar stepwise manner to what I was following for the entire endpoint. Fortunately, as described in **step #6**, I did have a starting point for the Web page template using the GNAF example.\n",
    "\n",
    "From a code standpoint, the first area we need to address is to convert our example start-up stub, what was called <code>test_sparql.py</code> in the [**CWPK #58**](https://www.mkbergman.com/2407/cwpk-58-setting-up-a-remote-instance-and-web-page-server/) installment, to our main application for this endpoint. We choose to call it <code>cowpoke-endpoint.py</code> in keeping with its role. We will build on the earlier stub by adding most of our <code>import</code> and Flask-routing ('<code>@app.route(\"/\")</code>', for example) statements, as well as the initialization code for the endpoint functions. We will call out some specific aspects of this file as we build it.\n",
    "\n",
    "The second coding area we need to address is how to tie the text areas in our Web form template to the actual Python code. We will put some of that code in the template and some of that code in <code>cowpoke-endpoint.py</code> governing getting and processing the SPARQL query. There is a useful pattern for how to relate templates to Python code via what might be entered into a text area from [StackOverflow](https://stackoverflow.com/questions/37345215/retrieve-text-from-textarea-in-flask). Here is the code example that should be put within the governing template, using the important <code>{{ url_for('submit') }}</code>:\n",
    "\n",
    "<pre>\n",
    "&lt;form action=\"{{ url_for('submit') }}\" method=\"post\">\n",
    "    &lt;textarea name=\"text\"></textarea>\n",
    "    &lt;input type=\"submit\">\n",
    "&lt;/form>\n",
    "</pre>\n",
    "\n",
    "and here is the matching code that needs to go into the governing Python file:\n",
    "\n",
    "<pre>\n",
    "from flask import Flask, request, render_template\n",
    "\n",
    "app = Flask(__name__)\n",
    "\n",
    "@app.route('/')\n",
    "def index():\n",
    "    return render_template('form.html')\n",
    "\n",
    "@app.route('/submit', methods=['POST'])\n",
    "def submit():\n",
    "    return 'You entered: {}'.format(request.form['text'])\n",
    "</pre>\n",
    "\n",
    "Note that file names, form names and routes all need to be properly identified and matched. Also note that imports need to be complete. Further notice in the file listing below that we modify the <code>return</code> statement. We also repeat this form related to the SPARQL results text area.\n",
    "\n",
    "One of the challenging needs in the code development was working with a remote instance, as opposed to local code. I was also now dealing with a Linux environment, not my local Windows one. After much trial-and-error, which I'm sure is quite familiar to professional developers working in a client-server framework, I learned some valuable (essential!) lessons:\n",
    "\n",
    "1. First, with my <code>miniconda</code> approach and its minimal starting Python basis, I needed to check every new package import required by the code and check whether it was already in the remote instance. The <code>conda list</code> command is important here to first check whether the package is already in the Python environment or not. If not, I would need to find the proper repository for the package and install it per the instructions in [**CWPK #58**](https://www.mkbergman.com/2407/cwpk-58-setting-up-a-remote-instance-and-web-page-server/)\n",
    "\n",
    "\n",
    "2. I needed to make sure that the permission (Linux <code>chmod</code>) and ownership (Linux <code>chown</code> settings were properly set on the target directories for the remote instance such that I could use my SSH-based file transfer program ([WinSCP](xxx) in my case; [Filezilla](xxx) is another leading option). I simply do not do enough Linux work to be comfortable with remote editors. SSH transfer would enable me to work on the developing code in my local Windows instance\n",
    "\n",
    "\n",
    "3. I needed to get basic templates working early, since I needed Web page targets for where the outputs or traces of the running code would display \n",
    "\n",
    "\n",
    "4. I needed to restart the Apache2 server whenever there was a new code update to be tested. This resulted in a fairly set workflow of edit &rarr; upload &rarr; re-start Apache &rarr; call up remote Web template form (e.g., http://xx.xxx.xxx.xxx/sparql) &rarr; inspect trace or logs &rarr; rinse and repeat\n",
    "\n",
    "\n",
    "5. Be attentive to and properly [set content types](https://stackoverflow.com/questions/11773348/python-flask-how-to-set-content-type), since we are moving data and results from Web forms to code and back again. Content header information can be tricky, and one needs to use [cURL](https://en.wikipedia.org/wiki/CURL) or [wget](https://en.wikipedia.org/wiki/Wget) (or [Postman](https://www.postman.com/), which is often referenced, but I did not use). One way to inspect headers and content types is in the output Web page templates, using this code:\n",
    "<pre>\n",
    "  req = request.form\n",
    "  print(req)\n",
    "</pre> \n",
    "\n",
    "\n",
    "6. In HTML forms, use the <code>&lt;</code> code for the left angle bracket symbol (used in SPARQL queries to denote a URI link), otherwise the link will not display on the Web page since this character is reserved\n",
    "\n",
    "\n",
    "7. Used the standard [W3C validator](https://validator.w3.org/i18n-checker/check) when needing to check encodings and Web addresses\n",
    "\n",
    "\n",
    "8. Be **extremely attentive** to the use of tabs v white spaces in your Python code. Get in the habit of using spaces only, and not tabbing for indents. Editors are more forgiving in a Windows development environment; Linux ones are not.\n",
    "\n",
    "\n",
    "The reason I began assembling these lessons arose from the frustrations I had in early code development. Since I was getting small pieces of the functionality running directly in Python from the command line, some of which is shown in the prior two installments, my initial failures to import these routines in a code file (<code>*.py</code>) and get them to work had me pulling my hair out. I simply could not understand why routines that worked directly from the command line did not work once embedded into a code file.\n",
    "\n",
    "One discovery is that Flask does not play well with the Python <code>list</code> command. If one inspects prior SPARQL examples in this series (for example, [**CWPK #25**](https://www.mkbergman.com/2358/cwpk-25-querying-kbpedia-with-sparql/)), one can see that this construct is common with the standard query code. One adjustment, therefore, was to remove the <code>list</code> generator, and install a looping function for the query output. This applied to both RDFLib and owlready2.\n",
    "\n",
    "Besides the lessons presented above, some of the hypotheses I tested to get things to work included the use of <code>CDATA</code> (which only applies to XML), pasting to or saving and retrieving from intermediate text files, changing <code>content-type</code> or <code>mimetype</code>, treatment of the Python multi-line convention (<code>\"\"\"</code>), possible use of JavaScript, and more. Probably the major issue I needed to overcome was turning on space and tab display in my local editor to remove their mixed use. This experience really brought home to me the fundamental adherence to indentation in the Python language.\n",
    "\n",
    "Nonetheless, by following these guidelines and with eventual multiple tries, I was finally able to get a basic code block working, as documented under the next step."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**8. Create and validate an external SPARQL query using SPARQLwrapper to this endpoint.**\n",
    "\n",
    "Since the approach that worked above got closer to the standard RDFLib approach, I decided to expand the query form to allow for external searches as well. Besides modifications to the Web page template, the use of external sources also invokes the [SPARQLwrapper](https://sparqlwrapper.readthedocs.io/en/latest/main.html) extension to RDFLib. Though its results presentation is a bit different, and we now have a requirement to also input and retrieve the URL of the external SPARQL endpoint, we were able to add this capability fairly easily.\n",
    "\n",
    "The resulting code is actually quite simple, though the path to get there was anything but. I present below the eventual code file so developed, with code notes following the listing. You will see that, aside from the Flask code conventions and decorators, that our code file is quite similar to others developed throughout *cowpoke*:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from flask import Flask, Response, request, render_template            # Note 1\n",
    "from owlready2 import *\n",
    "import rdflib\n",
    "from rdflib import Graph\n",
    "import json\n",
    "from SPARQLWrapper import SPARQLWrapper, JSON, XML\n",
    "\n",
    "# load knowledge graph files\n",
    "main = '/var/data/kbpedia/kbpedia_reference_concepts.owl'              # Note 2\n",
    "skos_file = 'http://www.w3.org/2004/02/skos/core' \n",
    "kko_file = '/var/data/kbpedia/kko.owl'\n",
    "\n",
    "# set up scopes and namespaces\n",
    "world = World()                                                        # Note 2 \n",
    "kb = world.get_ontology(main).load()\n",
    "rc = kb.get_namespace('http://kbpedia.org/kko/rc/')\n",
    "skos = world.get_ontology(skos_file).load()\n",
    "kb.imported_ontologies.append(skos)\n",
    "kko = world.get_ontology(kko_file).load()\n",
    "kb.imported_ontologies.append(kko)\n",
    "\n",
    "graph = world.as_rdflib_graph()\n",
    "\n",
    "# set up Flask microservice\n",
    "app = Flask(__name__)                                                  # Note 3\n",
    "\n",
    "@app.route(\"/\")\n",
    "def sparql_form():\n",
    "    return render_template('sparql_page.html')\n",
    "\n",
    "# set up route for submitting query, receiving results \n",
    "@app.route('/submit', methods=['POST'])                                # Note 4\n",
    "def submit():\n",
    "#    if request.method == 'POST':\n",
    "    q_submit = None\n",
    "    results = ''\n",
    "    if request.form['q_submit'] is None or len(request.form['q_submit']) < 5:\n",
    "        return Response(\n",
    "        'Your request to the SPARQL endpoint must contain a \\'query\\'.',\n",
    "        mimetype = 'text/plain'\n",
    "        )\n",
    "    else:\n",
    "        data = request.form['q_submit']                                # Note 5\n",
    "        source = request.form['selectSource']\n",
    "        format = request.form['selectFormat']\n",
    "        q_url = request.values.get('q_url')\n",
    "        try:                                                           # Note 6\n",
    "            if source == 'kbpedia' and format == 'owlready':           # Note 7\n",
    "                q_query = graph.query_owlready(data)                   # Note 8\n",
    "                for row in q_query:\n",
    "                    row = str(row)\n",
    "                    results = results + row\n",
    "                results = results.replace(']', ']\\n')\n",
    "            elif source == 'kbpedia' and format == 'rdflib':\n",
    "                q_query = graph.query(data)                            # Note 8\n",
    "                for row in q_query:\n",
    "                    row = str(row)\n",
    "                    results = results + row\n",
    "                results = results.replace('))', '))\\n')\n",
    "            elif source == 'kbpedia' and format == 'xml':\n",
    "                q_query = graph.query(data)\n",
    "                for row in q_query:\n",
    "                    row = str(row)\n",
    "                    results = results + row\n",
    "                results = q_query.serialize(format='xml')\n",
    "                results = str(results)\n",
    "                results = results.replace('<result>', '\\n<result>')\n",
    "            elif source == 'kbpedia' and format == 'json':\n",
    "                q_query = graph.query(data)\n",
    "                for row in q_query:\n",
    "                    row = str(row)\n",
    "                    results = results + row\n",
    "                results = q_query.serialize(format='json')\n",
    "                results = str(results)\n",
    "                results = results.replace('}}, ', '}}, \\n')\n",
    "            elif source == 'kbpedia' and format == 'html':             #Note 9\n",
    "                q_query = graph.query(data)\n",
    "                for row in q_query:\n",
    "                    row = str(row)\n",
    "                    results = results + row\n",
    "                results = q_query.serialize(format='csv')\n",
    "                results = str(results)\n",
    "                results = results.readlines()\n",
    "#                table = '<html><table>'\n",
    "                for row in results:\n",
    "#                     row = str(row)                    \n",
    "                     result = row[0]\n",
    "#                    row = row.replace('\\r\\n', '')\n",
    "#                    row = row.replace(',', '</td><td>')\n",
    "#                    table += '<tr><td>' + row + '</td></tr>' + '\\n'\n",
    "#                table += '</table><br></html>' \n",
    "#                results = table\n",
    "                return result\n",
    "            elif source == 'kbpedia' and format == 'txt':\n",
    "                q_query = graph.query(data)\n",
    "                for row in q_query:\n",
    "                    row = str(row)\n",
    "                    results = results + row\n",
    "                results = q_query.serialize(format='txt')\n",
    "            elif source == 'kbpedia' and format == 'csv':\n",
    "                q_query = graph.query(data)\n",
    "                for row in q_query:\n",
    "                    row = str(row)\n",
    "                    results = results + row\n",
    "                results = q_query.serialize(format='csv')\n",
    "            elif source == 'external' and format == 'rdfxml':\n",
    "                q_url = str(q_url)\n",
    "                results = q_url\n",
    "            elif source == 'external' and format == 'xml':\n",
    "                sparql = SPARQLWrapper(q_url)\n",
    "                data = data.replace('\\r', '')\n",
    "                sparql.setQuery(data)\n",
    "                results = sparql.query()\n",
    "            elif source == 'external' and format == 'json':            #Note 10\n",
    "                sparql = SPARQLWrapper(q_url)\n",
    "                data = data.replace('\\r', '')\n",
    "#                data = data.replace(\"\\n\", \"\\n' + '\")\n",
    "#                data = '\"' + data + '\"'\n",
    "                sparql.setQuery(data)\n",
    "                sparql.setReturnFormat(JSON)                           #Note 10\n",
    "                results = sparql.queryAndConvert()\n",
    "#                q_sparql = str(sparql)\n",
    "#                results = q_sparql\n",
    "            else:                                                      #Note 11\n",
    "                results = ('This combination of Source + Format is not available. Here are the possible combinations:\\n\\n' + \n",
    "                           '    Kbpedia:   owlready2:    Formats:  owlready2\\n' + \n",
    "                           '                  rdflib:                 rdflib\\n' +\n",
    "                           '                                             xml\\n' +\n",
    "                           '                                            json\\n' +\n",
    "                           '                                           *html\\n' +\n",
    "                           '                                            text\\n' +\n",
    "                           '                                             csv\\n' +\n",
    "                           '   External:  as entered:                rdf/xml\\n' +\n",
    "                           '                                           *json\\n\\n' +\n",
    "                           '            * combo still buggy')\n",
    "            if format == 'html':\n",
    "                return Response(results, mimetype='text/html')         # Note 9, 12\n",
    "            else:\n",
    "                return Response(results, mimetype='text/plain')\n",
    "        except Exception as e:                                         # Note 6\n",
    "            return Response(\n",
    "            'Error(s) found in query: ' + str(e),\n",
    "            mimetype = 'text/plain'\n",
    "            )\n",
    "\n",
    "if __name__ == \"__main__\":\n",
    "    app.run(debug=true)                     "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here are some annotation notes related to this code, as keyed by note number above:\n",
    "\n",
    "1. There are many specific packages needed for this SPARQL application, as discussed in the main text. The major point to make here is that each of these packages needs to be loaded into the remote virtual environment, per the discussion in [**CWPK #58**](https://www.mkbergman.com/2407/cwpk-58-setting-up-a-remote-instance-and-web-page-server/)\n",
    "\n",
    "\n",
    "2. Like other *cowpoke* modules, these are pretty standard calls to the needed knowledge graphs and configuration settings\n",
    "\n",
    "\n",
    "3. These are the standard Flask calls, as discussed in the prior installment\n",
    "\n",
    "\n",
    "4. The main routine for the application is located here. We could have chosen to break this routine into multiple files and templates, but since this application is rather straightforward, we have placed all functionality into this one function block\n",
    "\n",
    "\n",
    "5. These are the calls that bring the assignments from the actual Web page (template) into the application\n",
    "\n",
    "\n",
    "6. We set up a standard <code>try . . . exception</code> block, which allows an error, if it occurs, to exit gracefully with a possible error explanation\n",
    "\n",
    "\n",
    "7. We set up all execution options as a two-part condition. One part is whether the source is the internal KBpedia knowledge graph (which may use either the standard <code>rdflib</code> or <code>owlready2</code> methods) or is external (which uses the <code>sparqlwrapper</code> method). The second part is which of eight format options might be used for the output, though not all are available to the source options; see further **Note 11**. Also, most of the routines have some minor code to display results line-by-line\n",
    "\n",
    "\n",
    "8. Here is where the graph query function differs by whether RDFLib or owlready2 is used\n",
    "\n",
    "\n",
    "9. As of the time of release of this installment, I am still getting errors in this HTML output routine. I welcome any suggestions for working code here\n",
    "\n",
    "\n",
    "10. As of the time of release of this installment, I am still getting errors in this JSON output routine. I have tried the standard SPARQLwrapper code, SPARQLwrapper2, and adding the JSON format to the initial <code>sparql</code>, all to no avail. It appears there may be some character or encoding issue in moving the query on the Web form to the function. The error also appears to occur in the line indicated. I welcome any suggestions for working code here\n",
    "\n",
    "\n",
    "11. This is where any of the two-part combos discussed in **Note #7** that do not work get captured\n",
    "\n",
    "\n",
    "12. This <code>if . . . else</code> enables the HTML output option."
   ]
  },
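  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Regarding **Note 9**, one possible direction for the HTML option, offered only as an untested suggestion and not as the endpoint's actual code, is to parse the CSV serialization with the standard <code>csv</code> module and escape each cell before wrapping it in table tags. The function name <code>csv_to_html_table</code> and the sample CSV text are hypothetical. The sketch also assumes the CSV arrives as text; if the serializer hands back bytes, they would need to be decoded first:\n",
    "\n",
    "```python\n",
    "import csv\n",
    "import html\n",
    "import io\n",
    "\n",
    "def csv_to_html_table(csv_text):\n",
    "    # Convert CSV text (header row first) into a simple HTML table string\n",
    "    rows = list(csv.reader(io.StringIO(csv_text)))\n",
    "    lines = ['<table>']\n",
    "    for index, row in enumerate(rows):\n",
    "        tag = 'th' if index == 0 else 'td'\n",
    "        cells = ''.join('<{0}>{1}</{0}>'.format(tag, html.escape(cell)) for cell in row)\n",
    "        lines.append('<tr>' + cells + '</tr>')\n",
    "    lines.append('</table>')\n",
    "    return '\\n'.join(lines)\n",
    "\n",
    "# Hypothetical CSV, shaped like what a two-variable SELECT might serialize to\n",
    "sample = 's,label\\r\\nhttp://kbpedia.org/kko/rc/Mammal,Mammal\\r\\n'\n",
    "print(csv_to_html_table(sample))\n",
    "```\n",
    "\n",
    "Escaping each cell with <code>html.escape</code> keeps any angle brackets in the results from being read as markup, which is the same concern raised for the query form itself."
   ]
  },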
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**9. Set up an external URI to the localhost instance**\n",
    "With this working code instance now in place, it was time to expose the service through a standard external URI. (During development we used http://xx.xxx.xxx.xxx/sparql). The URL we chose for the service is http://sparql.kbpedia.org/.\n",
    "\n",
    "We first needed to set up a subdomain pointing to the service via our DNS provider. While we generally provide SSL support for all of our Web sites (the secure protocol behind the <code>https:</code> Web prefix), we decided the minor use of this SPARQL site did not warrant keeping the certificates enabled and current. So, this site is configured for <code>http:</code> alone.\n",
    "    \n",
    "We first configured our Flask sites as described in [**CWPK #58**](https://www.mkbergman.com/2407/cwpk-58-setting-up-a-remote-instance-and-web-page-server/). To get this site working under the new URL, I only needed to make two changes to the earlier configuration. This configuration file is <code>000-default.conf</code> and is found on my server at the <code>/etc/apache2/sites-enabled</code> directory. Here at the two changes, called out by note:\n",
    "    \n",
    "<pre>\n",
    "&lt;VirtualHost *:80>\n",
    "  ServerName sparql.kbpedia.org                        #Note 1\n",
    "  ServerAdmin mike@mkbergman.com\n",
    "  DocumentRoot /var/www/html\n",
    "\n",
    "  WSGIDaemonProcess sparql python-path=/usr/bin/python-projects/miniconda3/envs/sparql/lib/python3.8/site-packages\n",
    "  WSGIScriptAlias / /var/www/html/sparql/wsgi.py       #Note 2\n",
    "  &lt;Directory /var/www/html/sparql>\n",
    "     WSGIProcessGroup sparql\n",
    "     WSGIApplicationGroup %{GLOBAL}\n",
    "     Order deny,allow\n",
    "       Allow from all\n",
    "  &lt;/Directory>\n",
    "\n",
    "  ErrorLog ${APACHE_LOG_DIR}/error.log\n",
    "  CustomLog ${APACHE_LOG_DIR}/access.log combined\n",
    "&lt;/VirtualHost>\n",
    "</pre>\n",
    "    \n",
    "The first change was to add the new domain <code>sparql.kbpedia.org</code> under <code>ServerName</code>. The second change was to replace the <code>/sparql</code> alias to <code>/</code> under the <code>WSGIScriptAlias</code> directive."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**10. Set up an automatic start/re-start cron job.**\n",
    "\n",
    "The last step under our endpoint process is to schedule a <code>cron</code> job on the remote server to start up the <code>sparql</code> virtual environment in the case of an unintended shut down or breaking of the Web site. This last task means we can let the endpoint run virtually unattended. First, let's look at how simple a re-activation (<code>xxx</code>) script may look:\n",
    "<pre>\n",
    "#!/bin/sh\n",
    "\n",
    "conda activate sparql\n",
    "</pre>\n",
    "\n",
    "Note the standard bash script header on this file. Also note our standard activation statement. One can create this file and then place it in a logical, findable location. In our instance, we will put it where the same <code>sparql</code> scripts exist, namely in <code>/var/www/html/sparql/</code>.\n",
    "\n",
    "We next need to make sure this script is readable by our <code>cron</code> job. So we navigate to the the directory where this bash script is located and change its permissions:\n",
    "<pre>\n",
    "chmod +x re_activate.sh\n",
    "</pre>\n",
    "\n",
    "Once these items are set, we are now able to add this bash script to our scheduled <code>cron</code> jobs. We find this specification and invoke our editor of that file by using:\n",
    "<pre>\n",
    "nano /etc/crontab\n",
    "</pre>\n",
    "\n",
    "Using the <code>nano</code> editor conventions (or those of your favored editor), we can now add our new <code>cron </code> job in a new entry between the asterisk (\\*) specifications:\n",
    "<pre>\n",
    "30 * * * * /bin/sh /var/www/html/sparql/re_activate.sh\n",
    "</pre>\n",
    "\n",
    "We have now completed all of our desired development steps for the [KBpedia SPARQL endpoint](http://sparql.kbpedia.org/). As of the release of today's installment, the site is active."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Endpoint Packaging\n",
    "I will package up this code as a separate project and repository on GitHub per the steps outlined in [**CWPK #46**](https://www.mkbergman.com/2388/cwpk-46-creating-the-cowpoke-package-and-unit-tests/) under the MIT license, same as *cowpoke*. Since there are only a few files, we did not create a formal <code>pip</code> package. Here will be the package address:\n",
    "\n",
    "<code>[https://github.com/Cognonto/cowpoke-endpoint](https://github.com/Cognonto/cowpoke-endpoint)</code>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Linked Data and Why Not Employed\n",
    "\n",
    "My original plan was to have this SPARQL site offer [linked data](https://en.wikipedia.org/wiki/Linked_data). Linked data is where the requesting user agent may be served either semantic data such as RDF in various formats or standard HTML if the requester is a browser. It is a useful approach for the [semantic Web](https://en.wikipedia.org/wiki/Semantic_Web) and has a series of requirements to qualify as '[5-star](https://5stardata.info/en/)' linked open data. \n",
    "\n",
    "From a technical standpoint, the nature of the requesting user agent is determined by the local Web server (Apache2 in our case), which then routes the request to produce either structured data or semi-structured HTML for displaying in a Web page through a process known as [content negotiation](https://en.wikipedia.org/wiki/Content_negotiation) (the term is sometimes shortened to 'conneg'). In this manner, our item of interest can be denoted with a single URI, but the content version that gets served to the requesting agent may differ based on the nature of the agent or its request. In a data-oriented setting, for example, the requested data may be served up in a choice of formats to make it easier to consume on the receiving end.\n",
    "\n",
    "As I noted in my initial investigations regarding Python ([**CWPK #58**](https://www.mkbergman.com/2407/cwpk-58-setting-up-a-remote-instance-and-web-page-server/)), there are not many options compared to other languages such as Java or JavaScript. One of the reasons I initially targeted pyLDAPI was that it promised to provide linked data. ([RDFLib-web](https://github.com/RDFLib/rdflib-web) used to provide an option, but it is no longer maintained and does not work on Python 3.) Unfortunately, I could find no working instances of the pyLDAPI code and, when inspecting the code base itself, I was concerned about the duplicated number of Flask templates required by this approach. Given the number and diversity of classes and properties in KBpedia, my initial review suggested pyLDAPI was not a tenable approach, even if I could figure out how to get the code working.\n",
    "\n",
    "Given the current state of development, my suggestion is to go with an established triple store with linked data support if one wants to provide linked data. It does not appear that Python has a sufficiently mature option available to make linked data available at acceptable effort."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Lessons and Possible Enhancements\n",
    "\n",
    "The last section summarized the relative immature state of Python for SPARQL and linked data purposes. In order to get the limited SPARQL functionality working in this **CWPK** series I have kept my efforts limited to the SPARQL 'SELECT' statement and have noted many gotchas and workarounds in the dicussions over this and the prior two installments. Here are some additional lessons not already documented:\n",
    "\n",
    "1. Flask apparently does not like 'return None'\n",
    "1. Our minimal conda installation can cause problems with 'standard' Python packages dropped from the <code>miniconda3</code> distro. One of these is <code>json</code>, which I ultimately needed to obtain from <code>conda install -c jmcmurray json</code>.\n",
    "\n",
    "Clearly, some corners were cut above and some aspects ignored. If one wanted to fully commercialize a Python endpoint for SPARQL based on the efforts in this and the previous **CWPK** installments, here are some useful additions:\n",
    "\n",
    "- Add the full suite of SPARQL commands to the endpoint (e.g., CONSTRUCT, ASK, DESCRIBE, plus other nuances)\n",
    "- Expand the number of output formats\n",
    "- Add further error trapping and feedback for poorly-formed queries, and\n",
    "- Make it easier to add linked data content negotiation.\n",
    "\n",
    "Of course, these enhancements do not include more visual or user-interface assists for creating SPARQL queries in the first place. These are useful efforts in their own right."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### End of Part V\n",
    "This installment marks the end of our **Part V: Mapping, Stats, and Other Tools**. We begin **Part VI** next week governing natural language applications and machine learning involving KBpedia. We are also now 80% of the way through our entire **CWPK** series."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Additional Documentation\n",
    "\n",
    "Here are related documents, some which extend the uses discussed herein:\n",
    "\n",
    "#### Flask Resources\n",
    "- https://hackersandslackers.com/flask-routes/\n",
    "- https://www.digitalocean.com/community/tutorials/how-to-structure-large-flask-applications\n",
    "- https://www.digitalocean.com/community/tutorials/processing-incoming-request-data-in-flask\n",
    "\n",
    "#### RDFLib\n",
    "- https://rdflib.readthedocs.io/en/stable/intro_to_sparql.html\n",
    "- https://rdflib.readthedocs.io/en/stable/apidocs/rdflib.plugins.sparql.html\n",
    "- https://rdflib.readthedocs.io/en/stable/modules/rdflib/query.html\n",
    "- https://rdflib.readthedocs.io/en/stable/apidocs/rdflib.plugins.stores.html#module-rdflib.plugins.stores.sparqlconnector\n",
    "- https://rdflib.readthedocs.io/en/stable/apidocs/rdflib.extras.html#module-rdflib.extras.infixowl\n",
    "- https://rebeccabilbro.github.io/rdflib-and-sparql/\n",
    "\n",
    "#### SPARQLwrapper\n",
    "- https://rdflib.dev/sparqlwrapper/\n",
    "- https://sparqlwrapper.readthedocs.io/en/latest/main.html\n",
    "- https://readthedocs.org/projects/sparqlwrapper/downloads/pdf/stable/\n",
    "- https://pypi.org/project/SPARQLWrapper/\n",
    "\n",
    "#### Other\n",
    "- [Using cURL for SPARQL](http://www.snee.com/bobdc.blog/2019/02/curling-sparql.html)\n",
    "- [RDFLib JSON-LD](https://github.com/RDFLib/rdflib-jsonld)\n",
    "- [Flask-RDF](https://pypi.org/project/flask_rdf/), a Flask decorator to output RDF using content negotiation\n",
    "- [Cautions about SPARQL endpoints](https://grensesnittet.computas.com/linked-data-a-special-note-on-sparql-endpoints-limitations-dangers-and-usage-practices/)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    " <div style=\"background-color:#ffecec; border:1px dotted #f5aca6; vertical-align:middle; margin:15px 60px; padding:8px;\"> \n",
    "  <span style=\"font-weight: bold;\">NOTE:</span> This article is part of the <a href=\"https://www.mkbergman.com/cooking-with-python-and-kbpedia/\" style=\"font-style: italic;\">Cooking with Python and KBpedia</a> series. See the <a href=\"https://www.mkbergman.com/cooking-with-python-and-kbpedia/\"><strong>CWPK</strong> listing</a> for other articles in the series. <a href=\"http://kbpedia.org/\">KBpedia</a> has its own Web site. The <em>cowpoke</em> Python <a href=\"https://github.com/Cognonto/cowpoke\">code listing covering the series</a> is also available from GitHub.\n",
    "  </div>\n",
    "\n",
    "<div style=\"background-color:#ebf8e2; border:1px dotted #71c837; vertical-align:middle; margin:15px 60px; padding:8px;\"> \n",
    "\n",
    "<span style=\"font-weight: bold;\">NOTE:</span> This <strong>CWPK \n",
    "installment</strong> is available both as an online interactive\n",
    "file <a href=\"https://mybinder.org/v2/gh/Cognonto/CWPK/master\" ><img src=\"https://mybinder.org/badge_logo.svg\" style=\"display:inline-block; vertical-align: middle;\" /></a> or as a <a href=\"https://github.com/Cognonto/CWPK\" title=\"CWPK notebook\" alt=\"CWPK notebook\">direct download</a> to use locally. Make sure and pick the correct installment number. For the online interactive option, pick the <code>*.ipynb</code> file. It may take a bit of time for the interactive option to load.</div>\n",
    "\n",
    "<div style=\"background-color:#feeedc; border:1px dotted #f7941d; vertical-align:middle; margin:15px 60px; padding:8px;\"> \n",
    "<div style=\"float: left; margin-right: 5px;\"><img src=\"http://kbpedia.org/cwpk-files/warning.png\" title=\"Caution!\" width=\"32\" /></div>I am at best an amateur with Python. There are likely more efficient methods for coding these steps than what I provide. I encourage you to experiment -- which is part of the fun of Python -- and to <a href=\"mailto:mike@mkbergman.com\">notify me</a> should you make improvements.    \n",
    "\n",
    "</div>"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}