\n",
"
All (444)\n",
"
Featured\n",
"
\n",
"Alarm\n",
"(9)\n",
"\n",
"...\n",
"```\n",
"\n",
"The line `
All (444)` contains the counter. "
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"URL = 'https://home-assistant.io/components/'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"With `requests` the website is retrieved and with `BeautifulSoup` parsed."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"raw_html = requests.get(URL).text\n",
"data = BeautifulSoup(raw_html, 'html.parser')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now you have the complete content of the page. [CSS selectors](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#css-selectors) can be used to identify the counter. We have several options to get the part in question. As `BeautifulSoup` is giving us a list with the findings, we only need to identify the position in the list."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"
All (791)\n"
]
}
],
"source": [
"print(data.select('a')[10])"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"
All (791)\n"
]
}
],
"source": [
"print(data.select('.btn')[0])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`nth-of-type(x)` gives you element `x` back."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[
All (791)]\n"
]
}
],
"source": [
"print(data.select('a:nth-of-type(11)'))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make your selector as robust as possible, it's recommended to look for unique elements like `id`, `URL`, etc."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[
All (791)]\n"
]
}
],
"source": [
"print(data.select('a[href=\"#all\"]'))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The value extration is handled with `value_template` by the [`scrape` sensor](https://home-assistant.io/components/sensor.scrape/). The next two step are only shown here to show all manual steps.\n",
"\n",
"We only need the actual text."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"All (791)\n"
]
}
],
"source": [
"print(data.select('a[href=\"#all\"]')[0].text)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is a string and can be manipulated. We focus on the number."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"791\n"
]
}
],
"source": [
"print(data.select('a[href=\"#all\"]')[0].text[5:8])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is the number of the current platforms/components from the [Component overview](https://home-assistant.io/components/) which are available in Home Assistant.\n",
"\n",
"The details you identified here can be re-used to configure [`scrape` sensor](https://home-assistant.io/components/sensor.scrape/)'s `select`. This means that the most efficient way is to apply `nth-of-type(x)` to your selector."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Send the value to the Home Assistant frontend\n",
"The [\"Using the Home Assistant Python API\"](http://nbviewer.jupyter.org/github/home-assistant/home-assistant-notebooks/blob/master/home-assistant-python-api.ipynb) notebooks contains an intro to the [Python API](https://home-assistant.io/developers/python_api/) of Home Assistant and Jupyther notebooks. Here we are sending the scrapped value to the Home Assistant frontend."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import homeassistant.remote as remote\n",
"\n",
"HOST = '127.0.0.1'\n",
"PASSWORD = 'YOUR_PASSWORD'\n",
"\n",
"api = remote.API(HOST, PASSWORD)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"new_state = data.select('a[href=\"#all\"]')[0].text[5:8]\n",
"attributes = {\n",
" \"friendly_name\": \"Home Assistant Implementations\",\n",
" \"unit_of_measurement\": \"Count\"\n",
"}\n",
"remote.set_state(api, 'sensor.ha_implement', new_state=new_state, attributes=attributes)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.2"
}
},
"nbformat": 4,
"nbformat_minor": 1
}