{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "c6bfcf58-b95d-45b4-a395-6e851c347f7f",
   "metadata": {
    "tags": []
   },
   "source": [
    "# About"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5d083f9e-dafd-4ca6-89a9-c76e9513c587",
   "metadata": {},
   "source": [
    "This notebook explores the Unsplash conversions dataset.\n",
    "\n",
    "The `conversions.tsv` dataset has one row per search conversion.\n",
    "\n",
    "Each row records which photo was downloaded for a search, the country of origin, and an anonymous identifier that distinguishes unique users.\n",
    "\n",
    "[Source](https://github.com/unsplash/datasets/blob/master/DOCS.md)\n",
    "\n",
    "\n",
    "We will use this dataset to understand the types of queries that users on the platform search for."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6020f52f-145b-45ce-8b21-c05f060ad301",
   "metadata": {
    "tags": []
   },
   "source": [
    "# Exploring the data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "34f2eede-56da-4319-8df5-5cf45336484f",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-04-25T16:30:53.133017Z",
     "iopub.status.busy": "2023-04-25T16:30:53.132603Z",
     "iopub.status.idle": "2023-04-25T16:30:53.215007Z",
     "shell.execute_reply": "2023-04-25T16:30:53.213149Z",
     "shell.execute_reply.started": "2023-04-25T16:30:53.132981Z"
    }
   },
   "outputs": [],
   "source": [
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "91428786-d229-4748-a91e-2829d834a674",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-04-25T16:30:53.216683Z",
     "iopub.status.busy": "2023-04-25T16:30:53.216378Z",
     "iopub.status.idle": "2023-04-25T16:30:53.308980Z",
     "shell.execute_reply": "2023-04-25T16:30:53.308050Z",
     "shell.execute_reply.started": "2023-04-25T16:30:53.216661Z"
    }
   },
   "outputs": [],
   "source": [
    "pd.set_option('display.max_rows', 100)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "9d2b5364-acf5-47c4-a765-1a691435e139",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-04-25T16:30:53.310150Z",
     "iopub.status.busy": "2023-04-25T16:30:53.309865Z",
     "iopub.status.idle": "2023-04-25T16:30:53.319703Z",
     "shell.execute_reply": "2023-04-25T16:30:53.318898Z",
     "shell.execute_reply.started": "2023-04-25T16:30:53.310123Z"
    }
   },
   "outputs": [],
   "source": [
    "path = \"../data/raw/conversions.tsv000\"\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "dc7046a8-6a05-41ce-811e-c5df1223a226",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-04-25T16:30:53.321113Z",
     "iopub.status.busy": "2023-04-25T16:30:53.320603Z",
     "iopub.status.idle": "2023-04-25T16:31:18.524555Z",
     "shell.execute_reply": "2023-04-25T16:31:18.523773Z",
     "shell.execute_reply.started": "2023-04-25T16:30:53.321087Z"
    }
   },
   "outputs": [],
   "source": [
    "df = pd.read_csv(path, sep=\"\\t\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "a156cb80-e608-40d2-81ba-cdd0e874b819",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-04-25T16:31:18.527688Z",
     "iopub.status.busy": "2023-04-25T16:31:18.527041Z",
     "iopub.status.idle": "2023-04-25T16:31:18.533732Z",
     "shell.execute_reply": "2023-04-25T16:31:18.532900Z",
     "shell.execute_reply.started": "2023-04-25T16:31:18.527652Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "12166088"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "len(df)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e9aa396d-c335-47cd-b01d-ffb3f7b743ab",
   "metadata": {},
   "source": [
    "A sample view of the data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "745cd38a-a41e-44c4-a836-8601ebfd043f",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-04-25T16:31:18.535011Z",
     "iopub.status.busy": "2023-04-25T16:31:18.534779Z",
     "iopub.status.idle": "2023-04-25T16:31:18.627594Z",
     "shell.execute_reply": "2023-04-25T16:31:18.626599Z",
     "shell.execute_reply.started": "2023-04-25T16:31:18.534991Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>converted_at</th>\n",
       "      <th>conversion_type</th>\n",
       "      <th>keyword</th>\n",
       "      <th>photo_id</th>\n",
       "      <th>anonymous_user_id</th>\n",
       "      <th>conversion_country</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2020-07-29 00:08:04.221</td>\n",
       "      <td>download</td>\n",
       "      <td>clouds</td>\n",
       "      <td>ABmygVJcYgY</td>\n",
       "      <td>dd01ebdd-7691-4518-ab19-b2105782ae8b</td>\n",
       "      <td>VE</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2020-07-29 00:25:23.426</td>\n",
       "      <td>download</td>\n",
       "      <td>shark</td>\n",
       "      <td>fB2jl6Rb3l4</td>\n",
       "      <td>c48ba6e0-c6a7-4a92-b569-fe57808a8a2c</td>\n",
       "      <td>QA</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2020-07-29 00:26:13.122</td>\n",
       "      <td>download</td>\n",
       "      <td>dogs</td>\n",
       "      <td>k1hbfag2na0</td>\n",
       "      <td>62c4f043-579c-438f-8815-eb8ba3c54d34</td>\n",
       "      <td>KR</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2020-07-29 00:37:03.308</td>\n",
       "      <td>download</td>\n",
       "      <td>astronaut</td>\n",
       "      <td>-SyUjRlHauQ</td>\n",
       "      <td>7ad6dc18-a02e-4ba2-b93c-fd7ea2e551d8</td>\n",
       "      <td>JP</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>2020-07-29 00:54:28.942</td>\n",
       "      <td>download</td>\n",
       "      <td>red roses</td>\n",
       "      <td>A0iTJUhK4es</td>\n",
       "      <td>f03a5708-32e4-4fae-8210-3c5d2632cbfb</td>\n",
       "      <td>NZ</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "              converted_at conversion_type    keyword     photo_id  \\\n",
       "0  2020-07-29 00:08:04.221        download     clouds  ABmygVJcYgY   \n",
       "1  2020-07-29 00:25:23.426        download      shark  fB2jl6Rb3l4   \n",
       "2  2020-07-29 00:26:13.122        download       dogs  k1hbfag2na0   \n",
       "3  2020-07-29 00:37:03.308        download  astronaut  -SyUjRlHauQ   \n",
       "4  2020-07-29 00:54:28.942        download  red roses  A0iTJUhK4es   \n",
       "\n",
       "                      anonymous_user_id conversion_country  \n",
       "0  dd01ebdd-7691-4518-ab19-b2105782ae8b                 VE  \n",
       "1  c48ba6e0-c6a7-4a92-b569-fe57808a8a2c                 QA  \n",
       "2  62c4f043-579c-438f-8815-eb8ba3c54d34                 KR  \n",
       "3  7ad6dc18-a02e-4ba2-b93c-fd7ea2e551d8                 JP  \n",
       "4  f03a5708-32e4-4fae-8210-3c5d2632cbfb                 NZ  "
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3cd9509a-2cc9-452a-ad92-4c4ea5fd7df0",
   "metadata": {},
   "source": [
    "Get the most frequent queries:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "6738ac94-950f-45ef-89a9-f558e83ea151",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-04-25T16:31:18.629372Z",
     "iopub.status.busy": "2023-04-25T16:31:18.628998Z",
     "iopub.status.idle": "2023-04-25T16:31:20.872575Z",
     "shell.execute_reply": "2023-04-25T16:31:20.871772Z",
     "shell.execute_reply.started": "2023-04-25T16:31:18.629345Z"
    }
   },
   "outputs": [],
   "source": [
    "# Count conversions per search keyword, most frequent first.\n",
    "df_res = (\n",
    "    df.groupby(\"keyword\", as_index=False)\n",
    "    .size()\n",
    "    .sort_values(\"size\", ascending=False)\n",
    "    .rename(columns={\"size\": \"num_searches\"})\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "6500ce65-b8f2-4abf-b796-e80a554c92b4",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-04-25T16:31:20.874034Z",
     "iopub.status.busy": "2023-04-25T16:31:20.873707Z",
     "iopub.status.idle": "2023-04-25T16:31:20.879505Z",
     "shell.execute_reply": "2023-04-25T16:31:20.878517Z",
     "shell.execute_reply.started": "2023-04-25T16:31:20.873970Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Number of unique queries: 569996 \n"
     ]
    }
   ],
   "source": [
    "print(f\"Number of unique queries: {len(df_res)} \")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "98918b39-a639-42ea-a25c-e17bc0c51c8d",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-04-25T16:31:20.881180Z",
     "iopub.status.busy": "2023-04-25T16:31:20.880529Z",
     "iopub.status.idle": "2023-04-25T16:31:20.894394Z",
     "shell.execute_reply": "2023-04-25T16:31:20.893375Z",
     "shell.execute_reply.started": "2023-04-25T16:31:20.881155Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>keyword</th>\n",
       "      <th>num_searches</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>334943</th>\n",
       "      <td>nature</td>\n",
       "      <td>381173</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>445718</th>\n",
       "      <td>sky</td>\n",
       "      <td>239848</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>193034</th>\n",
       "      <td>flowers</td>\n",
       "      <td>202391</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>333735</th>\n",
       "      <td>natural</td>\n",
       "      <td>196189</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>189492</th>\n",
       "      <td>flower</td>\n",
       "      <td>175126</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>431887</th>\n",
       "      <td>sea</td>\n",
       "      <td>165744</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>325200</th>\n",
       "      <td>mountain</td>\n",
       "      <td>161816</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>198609</th>\n",
       "      <td>forest</td>\n",
       "      <td>153677</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>350461</th>\n",
       "      <td>ocean</td>\n",
       "      <td>145435</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>45100</th>\n",
       "      <td>beach</td>\n",
       "      <td>136862</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>460237</th>\n",
       "      <td>space</td>\n",
       "      <td>120184</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>146484</th>\n",
       "      <td>dog</td>\n",
       "      <td>112637</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>328262</th>\n",
       "      <td>mountains</td>\n",
       "      <td>111005</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>533443</th>\n",
       "      <td>water</td>\n",
       "      <td>109987</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>320914</th>\n",
       "      <td>moon</td>\n",
       "      <td>106111</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>550361</th>\n",
       "      <td>winter</td>\n",
       "      <td>89541</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>90851</th>\n",
       "      <td>cat</td>\n",
       "      <td>87984</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>528686</th>\n",
       "      <td>wallpaper</td>\n",
       "      <td>87079</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19880</th>\n",
       "      <td>animal</td>\n",
       "      <td>79378</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>507852</th>\n",
       "      <td>tree</td>\n",
       "      <td>78697</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>345303</th>\n",
       "      <td>night sky</td>\n",
       "      <td>77892</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>482869</th>\n",
       "      <td>sunset</td>\n",
       "      <td>75404</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>343674</th>\n",
       "      <td>night</td>\n",
       "      <td>74551</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>278629</th>\n",
       "      <td>landscape</td>\n",
       "      <td>72824</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>481555</th>\n",
       "      <td>sunrise</td>\n",
       "      <td>72290</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>161638</th>\n",
       "      <td>earth</td>\n",
       "      <td>70303</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21578</th>\n",
       "      <td>animals</td>\n",
       "      <td>68001</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>518127</th>\n",
       "      <td>universe</td>\n",
       "      <td>66711</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>475829</th>\n",
       "      <td>summer</td>\n",
       "      <td>66019</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>387151</th>\n",
       "      <td>plant</td>\n",
       "      <td>64406</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          keyword  num_searches\n",
       "334943     nature        381173\n",
       "445718        sky        239848\n",
       "193034    flowers        202391\n",
       "333735    natural        196189\n",
       "189492     flower        175126\n",
       "431887        sea        165744\n",
       "325200   mountain        161816\n",
       "198609     forest        153677\n",
       "350461      ocean        145435\n",
       "45100       beach        136862\n",
       "460237      space        120184\n",
       "146484        dog        112637\n",
       "328262  mountains        111005\n",
       "533443      water        109987\n",
       "320914       moon        106111\n",
       "550361     winter         89541\n",
       "90851         cat         87984\n",
       "528686  wallpaper         87079\n",
       "19880      animal         79378\n",
       "507852       tree         78697\n",
       "345303  night sky         77892\n",
       "482869     sunset         75404\n",
       "343674      night         74551\n",
       "278629  landscape         72824\n",
       "481555    sunrise         72290\n",
       "161638      earth         70303\n",
       "21578     animals         68001\n",
       "518127   universe         66711\n",
       "475829     summer         66019\n",
       "387151      plant         64406"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_res.head(30)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c8349ec7-0112-4b7a-b1c5-a52e4b33a221",
   "metadata": {
    "tags": []
   },
   "source": [
    "## What can we say about the typical queries?"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0548f316-6452-4eab-9740-838176ea683b",
   "metadata": {},
   "source": [
    "- Most of the top queries seem to be short, with fewer than 3 words.\n",
    "- Users on the platform are mostly interested in nature.\n",
    "- Queries are not normalized: `animal` and `animals`, or `mountain` and `mountains`, are counted as separate queries."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0ab13652-0433-40c0-9044-e21fc7ac22c6",
   "metadata": {},
   "source": [
    "Broad-intent queries like those above are not very useful for comparing search results."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fc047f35-f156-4a2c-a0e3-d629ddaca3ff",
   "metadata": {},
   "source": [
    "## Exploring Longer Queries"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "a5f9bcd7-6c07-4b87-8910-81df7affaea4",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-04-25T16:31:20.895617Z",
     "iopub.status.busy": "2023-04-25T16:31:20.895394Z",
     "iopub.status.idle": "2023-04-25T16:31:21.322740Z",
     "shell.execute_reply": "2023-04-25T16:31:21.321881Z",
     "shell.execute_reply.started": "2023-04-25T16:31:20.895597Z"
    }
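   },
   "outputs": [],
   "source": [
    "# Number of space-separated tokens per query.\n",
    "# Note: split(\" \") yields an empty token for every repeated space, so queries\n",
    "# with consecutive spaces get an inflated count; x.split() would avoid this.\n",
    "df_res[\"num_keywords\"] = df_res[\"keyword\"].apply(lambda x: len(x.split(\" \")))"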
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "049b1fe8-cc95-4e1a-956f-81564bd75c16",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-04-25T16:31:21.324017Z",
     "iopub.status.busy": "2023-04-25T16:31:21.323750Z",
     "iopub.status.idle": "2023-04-25T16:31:21.369322Z",
     "shell.execute_reply": "2023-04-25T16:31:21.368403Z",
     "shell.execute_reply.started": "2023-04-25T16:31:21.323993Z"
    }
   },
   "outputs": [],
   "source": [
    "df_long_queries = df_res[df_res[\"num_keywords\"] > 1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "c06f5c52-1240-432b-aa0a-07b6fb6427ce",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-04-25T16:31:21.370917Z",
     "iopub.status.busy": "2023-04-25T16:31:21.370534Z",
     "iopub.status.idle": "2023-04-25T16:31:21.385868Z",
     "shell.execute_reply": "2023-04-25T16:31:21.385104Z",
     "shell.execute_reply.started": "2023-04-25T16:31:21.370890Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>keyword</th>\n",
       "      <th>num_searches</th>\n",
       "      <th>num_keywords</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>327457</th>\n",
       "      <td>mountain star landscape night sky</td>\n",
       "      <td>779</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>287590</th>\n",
       "      <td>light at the end of the tunnel</td>\n",
       "      <td>308</td>\n",
       "      <td>7</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>499894</th>\n",
       "      <td>there is no planet b</td>\n",
       "      <td>242</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>276561</th>\n",
       "      <td>lago di braies, braies, italy</td>\n",
       "      <td>118</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>534678</th>\n",
       "      <td>water droplets on a leaf</td>\n",
       "      <td>106</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>224699</th>\n",
       "      <td>great sand dunes national park</td>\n",
       "      <td>94</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>258115</th>\n",
       "      <td>image of a man in a desert</td>\n",
       "      <td>82</td>\n",
       "      <td>7</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>274846</th>\n",
       "      <td>konkan beach resort, ratnagiri, india</td>\n",
       "      <td>73</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>335652</th>\n",
       "      <td>nature backgrounds water ripple reflection</td>\n",
       "      <td>67</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>459722</th>\n",
       "      <td>south georgia and the south sandwich islands</td>\n",
       "      <td>54</td>\n",
       "      <td>7</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>426421</th>\n",
       "      <td>samsung note 10 lite wallpaper</td>\n",
       "      <td>52</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>37977</th>\n",
       "      <td>background image for google doc</td>\n",
       "      <td>51</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>141140</th>\n",
       "      <td>desert sunset nature landscape sky</td>\n",
       "      <td>49</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>61505</th>\n",
       "      <td>black grapes with wood plates</td>\n",
       "      <td>44</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>454677</th>\n",
       "      <td>snow mountain clear blue sky</td>\n",
       "      <td>44</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>364313</th>\n",
       "      <td>palm trees at the beach</td>\n",
       "      <td>43</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>349030</th>\n",
       "      <td>nova scotia duck tolling retriever</td>\n",
       "      <td>43</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>499403</th>\n",
       "      <td>the surface of the moon</td>\n",
       "      <td>37</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>499623</th>\n",
       "      <td>the waves of the sea</td>\n",
       "      <td>37</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>263540</th>\n",
       "      <td>iphone 11 pro max wallpaper</td>\n",
       "      <td>36</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>27378</th>\n",
       "      <td>art of table potted flower</td>\n",
       "      <td>35</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>337991</th>\n",
       "      <td>nature photos  light colours</td>\n",
       "      <td>35</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>504377</th>\n",
       "      <td>torres del paine national park</td>\n",
       "      <td>32</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>134706</th>\n",
       "      <td>dark side of the moon</td>\n",
       "      <td>31</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>67340</th>\n",
       "      <td>blue sky and white clouds</td>\n",
       "      <td>31</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>375214</th>\n",
       "      <td>person on top of mountain</td>\n",
       "      <td>31</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>277772</th>\n",
       "      <td>lake with lotus and lilies photos</td>\n",
       "      <td>29</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>50421</th>\n",
       "      <td>beauitful wallpaper nature  8k</td>\n",
       "      <td>29</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>287582</th>\n",
       "      <td>light at end of tunnel</td>\n",
       "      <td>28</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>26499</th>\n",
       "      <td>ariel view of the ocean</td>\n",
       "      <td>28</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>177964</th>\n",
       "      <td>farmhouse rustic yellow and pink</td>\n",
       "      <td>28</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>161094</th>\n",
       "      <td>eagle flying in the sky</td>\n",
       "      <td>27</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>569329</th>\n",
       "      <td>沙漠青蛙      沙漠青蛙      desert frog</td>\n",
       "      <td>26</td>\n",
       "      <td>14</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>415209</th>\n",
       "      <td>ripley's aquarium of canada, toronto, canada</td>\n",
       "      <td>26</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>425941</th>\n",
       "      <td>salar de uyuni uyuni bolivia</td>\n",
       "      <td>25</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>143650</th>\n",
       "      <td>dew drops on a grass</td>\n",
       "      <td>25</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>123267</th>\n",
       "      <td>couple romdik love photo in tamil</td>\n",
       "      <td>25</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>304156</th>\n",
       "      <td>man on top of mountain</td>\n",
       "      <td>25</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>193247</th>\n",
       "      <td>flowers and plants and trees</td>\n",
       "      <td>24</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>497287</th>\n",
       "      <td>the butterfly atrium at hershey gardens</td>\n",
       "      <td>24</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>277773</th>\n",
       "      <td>lake with lotus and lily</td>\n",
       "      <td>24</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>479245</th>\n",
       "      <td>sun rise on a mountain</td>\n",
       "      <td>24</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>203350</th>\n",
       "      <td>free hd luminious backgrounds for photos</td>\n",
       "      <td>24</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>298593</th>\n",
       "      <td>lower antelope canyon, page, united states</td>\n",
       "      <td>24</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>393216</th>\n",
       "      <td>por do sol no mar</td>\n",
       "      <td>23</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>142678</th>\n",
       "      <td>desktop wallpapers 1920 x 1080</td>\n",
       "      <td>22</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>534769</th>\n",
       "      <td>water drops on the rose</td>\n",
       "      <td>22</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>437160</th>\n",
       "      <td>seven wonders of the world</td>\n",
       "      <td>20</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>66424</th>\n",
       "      <td>blue lake and green shore</td>\n",
       "      <td>19</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>295937</th>\n",
       "      <td>looking up to the sky</td>\n",
       "      <td>19</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                             keyword  num_searches  \\\n",
       "327457             mountain star landscape night sky           779   \n",
       "287590                light at the end of the tunnel           308   \n",
       "499894                          there is no planet b           242   \n",
       "276561                 lago di braies, braies, italy           118   \n",
       "534678                      water droplets on a leaf           106   \n",
       "224699                great sand dunes national park            94   \n",
       "258115                    image of a man in a desert            82   \n",
       "274846         konkan beach resort, ratnagiri, india            73   \n",
       "335652    nature backgrounds water ripple reflection            67   \n",
       "459722  south georgia and the south sandwich islands            54   \n",
       "426421                samsung note 10 lite wallpaper            52   \n",
       "37977                background image for google doc            51   \n",
       "141140            desert sunset nature landscape sky            49   \n",
       "61505                  black grapes with wood plates            44   \n",
       "454677                  snow mountain clear blue sky            44   \n",
       "364313                       palm trees at the beach            43   \n",
       "349030            nova scotia duck tolling retriever            43   \n",
       "499403                       the surface of the moon            37   \n",
       "499623                          the waves of the sea            37   \n",
       "263540                   iphone 11 pro max wallpaper            36   \n",
       "27378                     art of table potted flower            35   \n",
       "337991                  nature photos  light colours            35   \n",
       "504377                torres del paine national park            32   \n",
       "134706                         dark side of the moon            31   \n",
       "67340                      blue sky and white clouds            31   \n",
       "375214                     person on top of mountain            31   \n",
       "277772             lake with lotus and lilies photos            29   \n",
       "50421                 beauitful wallpaper nature  8k            29   \n",
       "287582                        light at end of tunnel            28   \n",
       "26499                        ariel view of the ocean            28   \n",
       "177964              farmhouse rustic yellow and pink            28   \n",
       "161094                       eagle flying in the sky            27   \n",
       "569329               沙漠青蛙      沙漠青蛙      desert frog            26   \n",
       "415209  ripley's aquarium of canada, toronto, canada            26   \n",
       "425941                  salar de uyuni uyuni bolivia            25   \n",
       "143650                          dew drops on a grass            25   \n",
       "123267             couple romdik love photo in tamil            25   \n",
       "304156                        man on top of mountain            25   \n",
       "193247                  flowers and plants and trees            24   \n",
       "497287       the butterfly atrium at hershey gardens            24   \n",
       "277773                      lake with lotus and lily            24   \n",
       "479245                        sun rise on a mountain            24   \n",
       "203350      free hd luminious backgrounds for photos            24   \n",
       "298593    lower antelope canyon, page, united states            24   \n",
       "393216                             por do sol no mar            23   \n",
       "142678                desktop wallpapers 1920 x 1080            22   \n",
       "534769                       water drops on the rose            22   \n",
       "437160                    seven wonders of the world            20   \n",
       "66424                      blue lake and green shore            19   \n",
       "295937                         looking up to the sky            19   \n",
       "\n",
       "        num_keywords  \n",
       "327457             5  \n",
       "287590             7  \n",
       "499894             5  \n",
       "276561             5  \n",
       "534678             5  \n",
       "224699             5  \n",
       "258115             7  \n",
       "274846             5  \n",
       "335652             5  \n",
       "459722             7  \n",
       "426421             5  \n",
       "37977              5  \n",
       "141140             5  \n",
       "61505              5  \n",
       "454677             5  \n",
       "364313             5  \n",
       "349030             5  \n",
       "499403             5  \n",
       "499623             5  \n",
       "263540             5  \n",
       "27378              5  \n",
       "337991             5  \n",
       "504377             5  \n",
       "134706             5  \n",
       "67340              5  \n",
       "375214             5  \n",
       "277772             6  \n",
       "50421              5  \n",
       "287582             5  \n",
       "26499              5  \n",
       "177964             5  \n",
       "161094             5  \n",
       "569329            14  \n",
       "415209             6  \n",
       "425941             5  \n",
       "143650             5  \n",
       "123267             6  \n",
       "304156             5  \n",
       "193247             5  \n",
       "497287             6  \n",
       "277773             5  \n",
       "479245             5  \n",
       "203350             6  \n",
       "298593             6  \n",
       "393216             5  \n",
       "142678             5  \n",
       "534769             5  \n",
       "437160             5  \n",
       "66424              5  \n",
       "295937             5  "
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# First 50 rows among the queries with more than four keywords\n",
    "df_long_queries[df_long_queries.num_keywords > 4].head(50)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7c4d9566-4f47-4e50-8dde-5790c6466c4d",
   "metadata": {},
   "source": [
    "## Interesting Queries"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "49f0b698-ffde-4da4-9975-2ea8ea63c6b8",
   "metadata": {},
   "source": [
    "Detailed Intent:\n",
    "- water droplets on a leaf\n",
    "- image of a man in a desert\n",
    "- person on top of mountain\n",
    "\n",
    "Location:\n",
    "- ripley's aquarium of canada, toronto, canada\n",
    "- the butterfly atrium at hershey gardens\n",
    "\n",
    "Non-English Queries:\n",
    "- salar de uyuni uyuni bolivia\n",
    "- 沙漠青蛙 沙漠青蛙 desert frog\n",
    "- por do sol no mar (Portuguese for sunset over the sea)\n",
    "- conhece te a ti mesmo (Portuguese for know thyself)\n",
    "\n",
    "Metaphors / Slogans:\n",
    "- light at the end of the tunnel\n",
    "- there is no planet b\n",
    "\n",
    "Multiple Candidates:\n",
    "- seven wonders of the world\n",
    "\n",
    "Long Query / Single Intent:\n",
    "- nova scotia duck tolling retriever (dog breed)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8b57c441-ae37-47c4-b0b6-bf07ecb16ca4",
   "metadata": {},
   "source": [
    "Infrequently searched queries (each searched only once)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "5e2d990b-b36f-4274-8b55-e8b71db3ef43",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-04-25T16:31:21.387111Z",
     "iopub.status.busy": "2023-04-25T16:31:21.386821Z",
     "iopub.status.idle": "2023-04-25T16:31:21.401449Z",
     "shell.execute_reply": "2023-04-25T16:31:21.400704Z",
     "shell.execute_reply.started": "2023-04-25T16:31:21.387073Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>keyword</th>\n",
       "      <th>num_searches</th>\n",
       "      <th>num_keywords</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>313105</th>\n",
       "      <td>mid night star picture for youtube thumbnail</td>\n",
       "      <td>1</td>\n",
       "      <td>7</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>119583</th>\n",
       "      <td>cool gamer pics for free</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>313060</th>\n",
       "      <td>mid century gothic style rose painting</td>\n",
       "      <td>1</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>313079</th>\n",
       "      <td>mid century modern interior design</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>313077</th>\n",
       "      <td>mid century modern home interior</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>313076</th>\n",
       "      <td>mid century modern home decor</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>313160</th>\n",
       "      <td>middle aged women  beauty</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>313148</th>\n",
       "      <td>middle age is an age of many colors.</td>\n",
       "      <td>1</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>313185</th>\n",
       "      <td>middle east night in the desert</td>\n",
       "      <td>1</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>313694</th>\n",
       "      <td>milky way at the sea</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>313709</th>\n",
       "      <td>milky way by the nasa</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>119302</th>\n",
       "      <td>cool adventurous places one can visit with a b...</td>\n",
       "      <td>1</td>\n",
       "      <td>16</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>119308</th>\n",
       "      <td>cool and colorful  wallpapers</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>119310</th>\n",
       "      <td>cool and fun pictures of animals</td>\n",
       "      <td>1</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>313799</th>\n",
       "      <td>milky way moon  3000x3000</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>119239</th>\n",
       "      <td>cooking over  a flame</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>313744</th>\n",
       "      <td>milky way galaxy and man</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>313765</th>\n",
       "      <td>milky way galaxy with people</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>119390</th>\n",
       "      <td>cool beach romance for familly</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>119374</th>\n",
       "      <td>cool backgrounds with cool wolves</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>313519</th>\n",
       "      <td>miles pond, vt chamber of commerce</td>\n",
       "      <td>1</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>315756</th>\n",
       "      <td>minimalist autumn wallpaper for mac</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>118529</th>\n",
       "      <td>constantia, cape town, south africa</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>118506</th>\n",
       "      <td>conserve energy  hd images</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>315504</th>\n",
       "      <td>minimal windows 10 wallpaper plants</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>315497</th>\n",
       "      <td>minimal white pot with green leaves</td>\n",
       "      <td>1</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>316032</th>\n",
       "      <td>minimalist nature black and white</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>316000</th>\n",
       "      <td>minimalist lotus whitte background flower</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>316020</th>\n",
       "      <td>minimalist motivation work hard beach</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>316101</th>\n",
       "      <td>minimalist qoutes for travel wallpaper</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>118232</th>\n",
       "      <td>conhece te a ti mesmo</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>315931</th>\n",
       "      <td>minimalist gentle monochrome simple macro text...</td>\n",
       "      <td>1</td>\n",
       "      <td>7</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>315959</th>\n",
       "      <td>minimalist home decor with plants</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>315905</th>\n",
       "      <td>minimalist flower black and white</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>314828</th>\n",
       "      <td>minimal black and white background</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>118809</th>\n",
       "      <td>contaminated and counterfeited bottled water</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>314994</th>\n",
       "      <td>minimal flower on white  background</td>\n",
       "      <td>1</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>314905</th>\n",
       "      <td>minimal colorful art on white background</td>\n",
       "      <td>1</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>314803</th>\n",
       "      <td>minimal background texture nature plants</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>314801</th>\n",
       "      <td>minimal background stacks of magazines</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>314765</th>\n",
       "      <td>minimal background dark double screen</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>118848</th>\n",
       "      <td>contemporary architecture made from wood</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>314790</th>\n",
       "      <td>minimal background nature soft brown</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>118850</th>\n",
       "      <td>contemporary art gallery at night</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>314729</th>\n",
       "      <td>minimal art black and white</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>315333</th>\n",
       "      <td>minimal scene with geometric forms.</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>118589</th>\n",
       "      <td>constellations in the night sky</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>315010</th>\n",
       "      <td>minimal food flat lay background</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>118714</th>\n",
       "      <td>construction worker at the beach</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>118716</th>\n",
       "      <td>construction worker in the winter</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                                  keyword  num_searches  \\\n",
       "313105       mid night star picture for youtube thumbnail             1   \n",
       "119583                           cool gamer pics for free             1   \n",
       "313060             mid century gothic style rose painting             1   \n",
       "313079                 mid century modern interior design             1   \n",
       "313077                   mid century modern home interior             1   \n",
       "313076                      mid century modern home decor             1   \n",
       "313160                          middle aged women  beauty             1   \n",
       "313148               middle age is an age of many colors.             1   \n",
       "313185                    middle east night in the desert             1   \n",
       "313694                               milky way at the sea             1   \n",
       "313709                              milky way by the nasa             1   \n",
       "119302  cool adventurous places one can visit with a b...             1   \n",
       "119308                      cool and colorful  wallpapers             1   \n",
       "119310                   cool and fun pictures of animals             1   \n",
       "313799                          milky way moon  3000x3000             1   \n",
       "119239                              cooking over  a flame             1   \n",
       "313744                           milky way galaxy and man             1   \n",
       "313765                       milky way galaxy with people             1   \n",
       "119390                     cool beach romance for familly             1   \n",
       "119374                  cool backgrounds with cool wolves             1   \n",
       "313519                 miles pond, vt chamber of commerce             1   \n",
       "315756                minimalist autumn wallpaper for mac             1   \n",
       "118529                constantia, cape town, south africa             1   \n",
       "118506                         conserve energy  hd images             1   \n",
       "315504                minimal windows 10 wallpaper plants             1   \n",
       "315497                minimal white pot with green leaves             1   \n",
       "316032                  minimalist nature black and white             1   \n",
       "316000          minimalist lotus whitte background flower             1   \n",
       "316020              minimalist motivation work hard beach             1   \n",
       "316101             minimalist qoutes for travel wallpaper             1   \n",
       "118232                              conhece te a ti mesmo             1   \n",
       "315931  minimalist gentle monochrome simple macro text...             1   \n",
       "315959                  minimalist home decor with plants             1   \n",
       "315905                  minimalist flower black and white             1   \n",
       "314828                 minimal black and white background             1   \n",
       "118809       contaminated and counterfeited bottled water             1   \n",
       "314994                minimal flower on white  background             1   \n",
       "314905           minimal colorful art on white background             1   \n",
       "314803           minimal background texture nature plants             1   \n",
       "314801             minimal background stacks of magazines             1   \n",
       "314765              minimal background dark double screen             1   \n",
       "118848           contemporary architecture made from wood             1   \n",
       "314790               minimal background nature soft brown             1   \n",
       "118850                  contemporary art gallery at night             1   \n",
       "314729                        minimal art black and white             1   \n",
       "315333                minimal scene with geometric forms.             1   \n",
       "118589                    constellations in the night sky             1   \n",
       "315010                   minimal food flat lay background             1   \n",
       "118714                   construction worker at the beach             1   \n",
       "118716                  construction worker in the winter             1   \n",
       "\n",
       "        num_keywords  \n",
       "313105             7  \n",
       "119583             5  \n",
       "313060             6  \n",
       "313079             5  \n",
       "313077             5  \n",
       "313076             5  \n",
       "313160             5  \n",
       "313148             8  \n",
       "313185             6  \n",
       "313694             5  \n",
       "313709             5  \n",
       "119302            16  \n",
       "119308             5  \n",
       "119310             6  \n",
       "313799             5  \n",
       "119239             5  \n",
       "313744             5  \n",
       "313765             5  \n",
       "119390             5  \n",
       "119374             5  \n",
       "313519             6  \n",
       "315756             5  \n",
       "118529             5  \n",
       "118506             5  \n",
       "315504             5  \n",
       "315497             6  \n",
       "316032             5  \n",
       "316000             5  \n",
       "316020             5  \n",
       "316101             5  \n",
       "118232             5  \n",
       "315931             7  \n",
       "315959             5  \n",
       "315905             5  \n",
       "314828             5  \n",
       "118809             5  \n",
       "314994             6  \n",
       "314905             6  \n",
       "314803             5  \n",
       "314801             5  \n",
       "314765             5  \n",
       "118848             5  \n",
       "314790             5  \n",
       "118850             5  \n",
       "314729             5  \n",
       "315333             5  \n",
       "118589             5  \n",
       "315010             5  \n",
       "118714             5  \n",
       "118716             5  "
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
    ],
   "source": [
    "# Last 50 rows among the queries with more than four keywords\n",
    "df_long_queries[df_long_queries.num_keywords > 4].tail(50)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "49f898d8-5fd1-4475-a8f3-68c2e3d7f7b9",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "environment": {
   "kernel": "python3",
   "name": "pytorch-gpu.1-13.m107",
   "type": "gcloud",
   "uri": "gcr.io/deeplearning-platform-release/pytorch-gpu.1-13:m107"
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}