{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "colab": { "name": "2021-07-06-book-recommender.api.ipynb", "provenance": [], "collapsed_sections": [], "authorship_tag": "ABX9TyPpIY4aGExwEKL0epX+N+6n" }, "kernelspec": { "name": "python3", "display_name": "Python 3" }, "language_info": { "name": "python" } }, "cells": [ { "cell_type": "markdown", "metadata": { "id": "gVBiHPtWsxMP" }, "source": [ "# Book Recommender API\n", "> Converting book short descriptions into vectors with the Universal Sentence Encoder model and wrapping the result in an interactive Flask API with a front-end HTML page\n", "\n", "- toc: true\n", "- badges: true\n", "- comments: true\n", "- categories: [Book, Flask, API, FrontEnd, NLP, TFHub, KNN]\n", "- author: \"staniher\"\n", "- image:" ] }, { "cell_type": "markdown", "metadata": { "id": "PnVizeduss0V" }, "source": [ "## Setup" ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "Q6ukaNKTfraS", "outputId": "c76aa79d-3410-413d-df24-38d84cd22e09" }, "source": [ "!pip install -q tensorflow_text" ], "execution_count": 5, "outputs": [ { "output_type": "stream", "text": [ "\u001b[K |████████████████████████████████| 4.3MB 8.2MB/s \n", "\u001b[?25h" ], "name": "stdout" } ] }, { "cell_type": "code", "metadata": { "id": "b2gBnmftfEuB" }, "source": [ "import numpy as np\n", "import pandas as pd\n", "import nltk\n", "import json\n", "import re\n", "import csv\n", "import pickle\n", "\n", "from sklearn.metrics.pairwise import euclidean_distances\n", "from sklearn.metrics.pairwise import cosine_similarity\n", "\n", "import tensorflow_hub as hub\n", "import tensorflow_text" ], "execution_count": 20, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "TTKCRg9bsq7O" }, "source": [ "## Data loading" ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 615 }, "id": "fxAe0x-FfK0j", "outputId": "3a5b64d3-9865-46c7-87ac-04b77f156952" }, "source": [ "data = 
pd.read_json('https://raw.githubusercontent.com/sparsh-ai/reco-data/master/books.json', lines=True)\n", "data.head()" ], "execution_count": 3, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
| | _id | title | isbn | pageCount | publishedDate | thumbnailUrl | shortDescription | longDescription | status | authors | categories |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Unlocking Android | 1933988673 | 416 | {'$date': '2009-04-01T00:00:00.000-0700'} | https://s3.amazonaws.com/AKIAJC5RLADLUMVRPFDQ.... | Unlocking Android: A Developer's Guide provide... | Android is an open source mobile phone platfor... | PUBLISH | [W. Frank Ableson, Charlie Collins, Robi Sen] | [Open Source, Mobile] |
| 1 | 2 | Android in Action, Second Edition | 1935182722 | 592 | {'$date': '2011-01-14T00:00:00.000-0800'} | https://s3.amazonaws.com/AKIAJC5RLADLUMVRPFDQ.... | Android in Action, Second Edition is a compreh... | When it comes to mobile apps, Android can do a... | PUBLISH | [W. Frank Ableson, Robi Sen] | [Java] |
| 2 | 3 | Specification by Example | 1617290084 | 0 | {'$date': '2011-06-03T00:00:00.000-0700'} | https://s3.amazonaws.com/AKIAJC5RLADLUMVRPFDQ.... | NaN | NaN | PUBLISH | [Gojko Adzic] | [Software Engineering] |
| 3 | 4 | Flex 3 in Action | 1933988746 | 576 | {'$date': '2009-02-02T00:00:00.000-0800'} | https://s3.amazonaws.com/AKIAJC5RLADLUMVRPFDQ.... | NaN | New web applications require engaging user-fri... | PUBLISH | [Tariq Ahmed with Jon Hirschi, Faisal Abid] | [Internet] |
| 4 | 5 | Flex 4 in Action | 1935182420 | 600 | {'$date': '2010-11-15T00:00:00.000-0800'} | https://s3.amazonaws.com/AKIAJC5RLADLUMVRPFDQ.... | NaN | Using Flex, you can create high-quality, effec... | PUBLISH | [Tariq Ahmed, Dan Orlando, John C. Bland II, J... | [Internet] |