{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"provenance": [],
"authorship_tag": "ABX9TyMIyN9ZHyHznB4VrfulleeQ",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
""
]
},
{
"cell_type": "markdown",
"source": [
"# Hidden Markov Models in Python\n",
"\n",
"## Model sequences and uncover patterns in time series data\n",
"\n",
"|  |\n",
"|:--:|\n",
"| Image Generated Using Canva|\n",
"\n",
"### Introduction\n",
"Hidden Markov Models (HMMs) are powerful statistical tools used in temporal pattern recognition, such as speech, bioinformatics, and finance. They model systems where the states are hidden but produce observable outputs. Despite their complexity, Python libraries make implementing HMMs cleanly and effectively possible.\n",
"\n",
"---\n",
"\n",
"### Why it’s so important\n",
"* Helps model sequences where underlying patterns are not directly observable.\n",
"* Widely used in fields like natural language processing, genomics, and market prediction.\n",
"* Builds a foundation for mastering other sequence models like RNNs and Transformers.\n",
"\n",
"---\n",
"\n",
"### Code Implementation"
],
"metadata": {
"id": "KmlG1OqA_xnb"
}
},
{
"cell_type": "code",
"source": [
"!pip install hmmlearn"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "YUR7Vn2PAWqQ",
"outputId": "3cebab6e-0aaf-4062-edd3-db57eedb74fa"
},
"execution_count": 2,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Collecting hmmlearn\n",
" Downloading hmmlearn-0.3.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.0 kB)\n",
"Requirement already satisfied: numpy>=1.10 in /usr/local/lib/python3.11/dist-packages (from hmmlearn) (2.0.2)\n",
"Requirement already satisfied: scikit-learn!=0.22.0,>=0.16 in /usr/local/lib/python3.11/dist-packages (from hmmlearn) (1.6.1)\n",
"Requirement already satisfied: scipy>=0.19 in /usr/local/lib/python3.11/dist-packages (from hmmlearn) (1.15.2)\n",
"Requirement already satisfied: joblib>=1.2.0 in /usr/local/lib/python3.11/dist-packages (from scikit-learn!=0.22.0,>=0.16->hmmlearn) (1.4.2)\n",
"Requirement already satisfied: threadpoolctl>=3.1.0 in /usr/local/lib/python3.11/dist-packages (from scikit-learn!=0.22.0,>=0.16->hmmlearn) (3.6.0)\n",
"Downloading hmmlearn-0.3.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (165 kB)\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m165.9/165.9 kB\u001b[0m \u001b[31m2.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25hInstalling collected packages: hmmlearn\n",
"Successfully installed hmmlearn-0.3.3\n"
]
}
]
},
{
"cell_type": "code",
"source": [
"from hmmlearn.hmm import CategoricalHMM\n",
"import numpy as np\n",
"\n",
"# Example: Weather states (0 = Sunny, 1 = Rainy)\n",
"observations = np.array([[0], [1], [0], [1], [1], [0]])\n",
"\n",
"model = CategoricalHMM(n_components=2, n_iter=100)\n",
"model.fit(observations)\n",
"\n",
"logprob, hidden_states = model.decode(observations, algorithm=\"viterbi\")\n",
"\n",
"print(\"Most likely hidden states:\", hidden_states)\n"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "vRtLXyrAANFj",
"outputId": "3c7e2ab6-7b35-4bab-ffc2-4597cd207213"
},
"execution_count": 4,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Most likely hidden states: [1 1 1 1 1 1]\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"### Understanding the Code, Hidden Markov Models with CategoricalHMM\n",
"This code demonstrates how to use a **Hidden Markov Model (HMM)** with **categorical (discrete)** observations to uncover hidden states from a sequence of events. In this example, you're modeling weather states like **Sunny (0)** and **Rainy (1)** using hmmlearn's `CategoricalHMM`.\n",
"\n",
"#### What Each Part Does\n",
"\n",
"1. Input Observations\n",
"\n"
],
"metadata": {
"id": "e9HwsYlVB1ct"
}
},
{
"cell_type": "code",
"source": [
"observations = np.array([[0], [1], [0], [1], [1], [0]])"
],
"metadata": {
"id": "Qs2A0nKxCC-R"
},
"execution_count": 5,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"This is a sequence of observed weather conditions: Sunny, Rainy, Sunny, etc.\n",
"\n",
"The values must be in a 2D array format `([[0], [1], ...])`, as required by hmmlearn.\n",
"\n",
"2. Model Initialization\n",
"\n",
"\n",
"\n"
],
"metadata": {
"id": "EjpbxKiiCEqy"
}
},
{
"cell_type": "code",
"source": [
"model = CategoricalHMM(n_components=2, n_iter=100)"
],
"metadata": {
"id": "agJuiK1VCMm6"
},
"execution_count": 6,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"* `n_components=2` sets the number of **hidden states** (for example, “Dry” and “Wet” weather systems).\n",
"\n",
"* `n_iter=100` allows up to 100 iterations during training to ensure the model converges.\n",
"\n",
"3. Training the Model\n",
"\n"
],
"metadata": {
"id": "1eT-LtB2CO1Q"
}
},
{
"cell_type": "code",
"source": [
"model.fit(observations)"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 97
},
"id": "QjIpXViPCWaf",
"outputId": "673e0832-5de9-4aab-dc06-e0313b95e799"
},
"execution_count": 7,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"CategoricalHMM(n_components=2, n_features=np.int64(2), n_iter=100,\n",
" random_state=RandomState(MT19937) at 0x7BF73FEA1540)"
],
"text/html": [
"
CategoricalHMM(n_components=2, n_features=np.int64(2), n_iter=100,\n",
" random_state=RandomState(MT19937) at 0x7BF73FEA1540)In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. CategoricalHMM(n_components=2, n_features=np.int64(2), n_iter=100,\n",
" random_state=RandomState(MT19937) at 0x7BF73FEA1540)