{ "cells": [ { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "12RuCnFuJ_Qe" }, "source": [ "# \"Smart maintenance based on vehicle CAN bus data from scratch in Python\"\n", "> \"In this article we prototype an algorithm that automatically scores engine health based on vehicle CAN bus data.\"\n", "- toc: true\n", "- branch: master\n", "- badges: true\n", "- comments: true\n", "- categories: [python, numpy, data-driven products, smart maintenance, scikit-learn]" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "iRkM1heyaWC_" }, "source": [ "# Summary\n", "\n", "Equipment that behaves anomalously or breaks down unexpectedly is a major cost driver in **manufacturing**, **logistics**, **public transport**, and any other sector that relies on complex machinery.\n", "\n", "A big promise of data analytics and machine learning in this space is to detect anomalies in machinery automatically and to alert its users to faults as they occur.\n", "By extension, predicting machinery faults and breakdowns before they happen is an important field of application.\n", "\n", "Automated detection and prediction of machinery breakdown is a key algorithmic approach behind **smart and predictive maintenance**.\n", "\n", "In this article we showcase a simple anomaly detection approach for automated engine health monitoring.\n", "\n", "This approach can serve as a starting point for the development of **smart telematics solutions** for automated and predictive vehicle breakdown detection." 
] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "RsZflPAVgvDx" }, "source": [ "# Fetch the data\n", "\n", "We'll make use of an open data set of vehicle CAN bus data, called\n", "[Automotive CAN bus data: An Example Dataset from the AEGIS Big Data Project](https://zenodo.org/record/3267184#.XmCx8hNKh24).\n", "\n", "A CAN bus is a local network of sensors and actuators in modern vehicles that provides a stream of data for all important signals of a vehicle - such as its present velocity, interior temperature, and potentially hundreds of other signals.\n", "\n", "This data set encompasses time series data (traces) of various vehicles driven by different drivers.\n", "\n", "Let's go ahead and download one trip recorded for driver 1 and one trip recorded for driver 2:" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "8DX2qNW6uNDA" }, "outputs": [], "source": [ "!wget --quiet https://zenodo.org/record/3267184/files/20181113_Driver1_Trip1.hdf" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "lR_mX1hpZ1NG" }, "outputs": [], "source": [ "!wget --quiet https://zenodo.org/record/3267184/files/20181114_Driver2_Trip3.hdf" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "n-pblzCli3G2" }, "source": [ "# Load libraries\n", "\n", "Here we import all necessary Python libraries for our analysis and algorithm:" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "5g5oAL2-ua78" }, "outputs": [], "source": [ "import h5py\n", "from matplotlib import pyplot as plt\n", "import numpy as np\n", "import pandas as pd\n", "import seaborn as sns\n", "\n", "from sklearn.mixture import GaussianMixture" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "d83ODF49_bBW" }, "outputs": [], "source": [ "plt.rcParams['figure.figsize'] = (10, 10)\n", 
"sns.set(style=\"darkgrid\")" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "9xOull5TjHGw" }, "source": [ "# Load vehicle data\n", "\n", "Let's load the data for driver 1 and driver 2 into memory:" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "L6QLV8xJuj9S" }, "outputs": [], "source": [ "driver_1 = h5py.File('20181113_Driver1_Trip1.hdf', 'r')\n", "driver_2 = h5py.File('20181114_Driver2_Trip3.hdf', 'r')" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "VlMFracSoYEm" }, "source": [ "Both files contain multiple subgroups of data, one of which holds the aforementioned CAN bus data:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 34 }, "colab_type": "code", "id": "blcoMtWhwY9P", "outputId": "1deb3ba2-5d2a-4ce8-a7bf-3263f2106722" }, "outputs": [ { "data": { "text/plain": [ "['AI', 'CAN', 'GPS', 'Math', 'Plugins']" ] }, "execution_count": 6, "metadata": { "tags": [] }, "output_type": "execute_result" } ], "source": [ "list(driver_1.keys())" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 34 }, "colab_type": "code", "id": "voKJBUDAmkdq", "outputId": "9f577b23-9d90-49a6-eb1e-047413b3e4fb" }, "outputs": [ { "data": { "text/plain": [ "['AI', 'CAN', 'GPS', 'Math', 'Plugins']" ] }, "execution_count": 7, "metadata": { "tags": [] }, "output_type": "execute_result" } ], "source": [ "list(driver_2.keys())" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "vUWTNiQhumai" }, "source": [ "# Turn time series data into tables\n", "\n", "The CAN bus data is stored channel by channel in a nested format: each channel is written out as its own series.\n", "\n", "To make the CAN bus data easier to inspect and handle, we'll turn it into tables." 
] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "lipkE3xL0TFA" }, "outputs": [], "source": [ "data_driver_1 = {}\n", "data_driver_2 = {}\n", "\n", "# Column 0 of each channel holds the signal values.\n", "for channel_name, channel_data in driver_1['CAN'].items():\n", "    data_driver_1[channel_name] = channel_data[:, 0]\n", "\n", "# Column 1 holds the time stamps; those of the last channel read serve as the index.\n", "table_driver_1 = pd.DataFrame(\n", "    data=data_driver_1,\n", "    index=channel_data[:, 1]\n", ")\n", "# Keep only channels that take more than one distinct value.\n", "table_driver_1 = table_driver_1.loc[:, table_driver_1.nunique() > 1]\n", "\n", "# Same steps for driver 2.\n", "for channel_name, channel_data in driver_2['CAN'].items():\n", "    data_driver_2[channel_name] = channel_data[:, 0]\n", "\n", "table_driver_2 = pd.DataFrame(\n", "    data=data_driver_2,\n", "    index=channel_data[:, 1]\n", ")\n", "table_driver_2 = table_driver_2.loc[:, table_driver_2.nunique() > 1]" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "6oXRvEyRwcw-" }, "source": [ "The tabular data for driver 1 looks as follows. It holds 158,659 measured time points across the 28 channels that vary over the trip, which are the ones we deem relevant:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 439 }, "colab_type": "code", "id": "Viffbre46ovg", "outputId": "d8914d55-ef16-4a3c-a94c-79c268201699" }, "outputs": [ { "data": { "text/html": [ "
| | AccPedal | AirIntakeTemperature | AmbientTemperature | BoostPressure | BrkVoltage | ENG_Trq_DMD | ENG_Trq_ZWR | ENG_Trq_m_ex | EngineSpeed_CAN | EngineTemperature | Engine_02_BZ | Engine_02_CHK | OilTemperature1 | SCS_01_BZ | SCS_01_CHK | SCS_Cancel | SCS_Tip_Down | SCS_Tip_Set | SCS_Tip_Up | SteerAngle1 | Trq_FrictionLoss | Trq_Indicated | VehicleSpeed | WheelSpeed_FL | WheelSpeed_FR | WheelSpeed_RL | WheelSpeed_RR | Yawrate1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.000000 | 0.0 | 31.5 | 8.0 | 0.97 | 1.0 | 18.0 | 20.0 | 27.000000 | 809.500000 | 93.0 | 6.000000 | 168.000000 | 82.0 | 9.000000 | 27.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 125.599998 | 29.0 | 27.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.190000 |
| 0.050000 | 0.0 | 31.5 | 8.0 | 0.97 | 1.0 | 18.0 | 20.0 | 27.000000 | 809.500000 | 93.0 | 6.000000 | 168.000000 | 82.0 | 9.000000 | 27.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 125.599998 | 29.0 | 27.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.190000 |
| 0.100000 | 0.0 | 31.5 | 8.0 | 0.97 | 1.0 | 18.0 | 20.0 | 27.000000 | 810.215759 | 93.0 | 10.331579 | 163.663162 | 82.0 | 9.333000 | 26.000999 | 0.0 | 0.0 | 0.0 | 0.0 | 125.599998 | 29.0 | 27.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.209900 |
| 0.150000 | 0.0 | 31.5 | 8.0 | 0.97 | 1.0 | 18.0 | 20.0 | 27.384237 | 807.657654 | 93.0 | 9.605911 | 165.674881 | 82.0 | 9.833000 | 24.500999 | 0.0 | 0.0 | 0.0 | 0.0 | 125.599998 | 29.0 | 27.384237 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.204887 |
| 0.200000 | 0.0 | 31.5 | 8.0 | 0.97 | 1.0 | 18.0 | 20.0 | 27.000000 | 805.500000 | 93.0 | 4.358586 | 172.641418 | 82.0 | 10.333000 | 24.333000 | 0.0 | 0.0 | 0.0 | 0.0 | 125.599998 | 28.0 | 27.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.200000 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 7932.700195 | 0.0 | 25.5 | 9.5 | 0.98 | 0.0 | 20.0 | 19.0 | 29.000000 | 797.500000 | 96.0 | 8.174129 | 84.825874 | 96.0 | 10.493239 | 24.493240 | 0.0 | 0.0 | 0.0 | 0.0 | 14.100000 | 29.0 | 29.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.130000 |
| 7932.750000 | 0.0 | 25.5 | 9.5 | 0.98 | 0.0 | 19.0 | 19.0 | 29.000000 | 800.500000 | 96.0 | 13.164251 | 145.835754 | 96.0 | 10.993991 | 24.993992 | 0.0 | 0.0 | 0.0 | 0.0 | 14.100000 | 29.0 | 29.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.123609 |
| 7932.799805 | 0.0 | 25.5 | 9.5 | 0.98 | 0.0 | 19.0 | 19.0 | 29.000000 | 797.790588 | 96.0 | 2.177665 | 156.822342 | 96.0 | 11.494000 | 27.469999 | 0.0 | 0.0 | 0.0 | 0.0 | 14.100000 | 28.0 | 29.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.178596 |
| 7932.850098 | 0.0 | 25.5 | 9.5 | 0.98 | 0.0 | 19.0 | 19.0 | 29.000000 | 796.000000 | 96.0 | 7.180095 | 153.739334 | 96.0 | 11.994000 | 29.969999 | 0.0 | 0.0 | 0.0 | 0.0 | 14.100000 | 28.0 | 29.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.157268 |
| 7932.899902 | 0.0 | 25.5 | 9.5 | 0.98 | 0.0 | 19.0 | 19.0 | 29.000000 | 794.500000 | 96.0 | 12.155440 | 146.844559 | 96.0 | 12.494247 | 30.494247 | 0.0 | 0.0 | 0.0 | 0.0 | 14.100000 | 28.0 | 29.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.140000 |

158659 rows × 28 columns

| | AccPedal | AirIntakeTemperature | AmbientTemperature | BoostPressure | BrkVoltage | ENG_Trq_DMD | ENG_Trq_ZWR | ENG_Trq_m_ex | EngineSpeed_CAN | EngineTemperature | Engine_02_BZ | Engine_02_CHK | OilTemperature1 | SCS_01_BZ | SCS_01_CHK | SCS_Cancel | SCS_Tip_Down | SCS_Tip_Restart | SCS_Tip_Set | SCS_Tip_Up | SteerAngle1 | Trq_FrictionLoss | Trq_Indicated | VehicleSpeed | WheelSpeed_FL | WheelSpeed_FR | WheelSpeed_RL | WheelSpeed_RR | Yawrate1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.000000 | 0.0 | 44.25 | 14.5 | 1.00 | 1.0 | 20.000000 | 21.0 | 30.000000 | 791.000000 | 96.0 | 6.000000 | 59.000000 | 94.0 | 10.000000 | 24.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.800000 | 31.0 | 30.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.140000 |
| 0.050000 | 0.0 | 44.25 | 14.5 | 1.00 | 1.0 | 21.000000 | 20.0 | 30.000000 | 791.000000 | 96.0 | 10.688680 | 102.688683 | 94.0 | 10.000000 | 24.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.800000 | 31.0 | 30.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.120000 |
| 0.100000 | 0.0 | 44.25 | 14.5 | 1.00 | 1.0 | 21.000000 | 20.0 | 30.000000 | 791.165039 | 96.0 | 4.381188 | 105.079208 | 94.0 | 10.110166 | 24.110165 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.800000 | 31.0 | 30.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.148120 |
| 0.150000 | 0.0 | 44.25 | 14.5 | 1.00 | 1.0 | 21.000000 | 20.0 | 30.751268 | 787.500000 | 96.0 | 4.725888 | 104.725891 | 94.0 | 10.610916 | 24.610916 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.800000 | 31.0 | 30.751268 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.126384 |
| 0.200000 | 0.0 | 44.25 | 14.5 | 1.00 | 1.0 | 21.000000 | 20.0 | 31.000000 | 791.000000 | 96.0 | 9.733668 | 101.733665 | 94.0 | 11.111500 | 25.557501 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.800000 | 31.0 | 31.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.135592 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 6807.450195 | 0.0 | 25.50 | 10.5 | 0.98 | 1.0 | 21.000000 | 21.0 | 31.000000 | 815.500000 | 94.5 | 10.693069 | 118.693069 | 95.0 | 5.992500 | 22.394501 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 37.400002 | 30.0 | 31.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.130326 |
| 6807.500000 | 0.0 | 25.50 | 10.5 | 0.98 | 1.0 | 21.000000 | 21.0 | 30.682692 | 813.500000 | 94.5 | 5.192307 | 120.884613 | 95.0 | 0.100500 | 18.100500 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 37.400002 | 30.0 | 30.682692 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.150000 |
| 6807.549805 | 0.0 | 25.50 | 10.5 | 0.98 | 1.0 | 21.000000 | 21.0 | 31.000000 | 816.000000 | 94.5 | 4.676768 | 120.676765 | 95.0 | 0.600500 | 18.600500 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 37.476120 | 30.0 | 31.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.140351 |
| 6807.600098 | 0.0 | 25.50 | 10.5 | 0.98 | 1.0 | 20.323383 | 21.0 | 30.000000 | 819.587524 | 94.5 | 9.676617 | 74.726372 | 95.0 | 1.100550 | 18.698349 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 37.476616 | 30.0 | 30.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.140000 |
| 6807.649902 | 0.0 | 25.50 | 10.5 | 0.98 | 1.0 | 22.000000 | 21.0 | 30.722773 | 810.048096 | 94.5 | 14.693069 | 178.693069 | 95.0 | 1.600800 | 17.197599 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 37.500000 | 31.0 | 30.722773 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.129625 |

136154 rows × 29 columns
\n", "