{
"cells": [
{
"cell_type": "markdown",
"id": "4f034c95-5954-4aac-ac6e-6daf4348aab1",
"metadata": {},
"source": [
"# CWE"
]
},
{
"cell_type": "markdown",
"id": "166bea8a-75cf-4c87-8c88-eccac6c2efb4",
"metadata": {},
"source": [
"**Common Weakness Enumeration (CWE™)** is a formal list or dictionary of common software and hardware weaknesses that can occur in architecture, design, code, or implementation that can lead to exploitable security vulnerabilities. CWE was created to serve as a common language for describing security weaknesses; serve as a standard measuring stick for security tools targeting these weaknesses; and to provide a common baseline standard for weakness identification, mitigation, and prevention efforts. “Weaknesses” are flaws, faults, bugs, and other errors in software and hardware design, architecture, code, or implementation that if left unaddressed could result in systems, networks, and hardware being vulnerable to attack.\n",
"\n",
"> source: [cwe.mitre.org](https://cwe.mitre.org/about/faq.html#what_is_cwe_weakness_meaning)"
]
},
{
"cell_type": "markdown",
"id": "565d190c-11e6-42c0-b9b0-1560e440fa3f",
"metadata": {},
"source": [
"You can see this notebook directly via:\n",
"- [GitHub](https://github.com/LimberDuck/limberduck_org_julio_7/blob/main/docs/notebooks/cwe/cwe.ipynb)\n",
"- [Jupyter nbviewer](https://nbviewer.org/github/LimberDuck/limberduck_org_julio_7/blob/main/docs/notebooks/cwe/cwe.ipynb)"
]
},
{
"cell_type": "markdown",
"id": "976eafb3-02cb-4ad9-a716-70783e9a6434",
"metadata": {},
"source": [
"## Generation time"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "3cd59ed4-798a-4aae-b84a-56ad8a520c24",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2025-04-05 18:23:26 +0000\n"
]
}
],
"source": [
"from datetime import datetime, timezone, timedelta\n",
"\n",
"timezone_offset = 0.0\n",
"tzinfo = timezone(timedelta(hours=timezone_offset))\n",
"generation_time = datetime.now(tzinfo).strftime('%Y-%m-%d %H:%M:%S %z')\n",
"print(generation_time)"
]
},
{
"cell_type": "markdown",
"id": "11ed7cf0-c8eb-4300-b084-4b7d18713ea9",
"metadata": {},
"source": [
"## Creative Commons"
]
},
{
"cell_type": "markdown",
"id": "02463631-dbdc-4d1b-9339-ad2f78729669",
"metadata": {},
"source": [
"This notebook and the generated diagrams are released under the [Creative Commons license (CC BY 4.0)](https://creativecommons.org/licenses/by/4.0/deed.en)."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "a360acfb-784e-41f6-a41d-59147352a07d",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"cc.xlarge.png\n",
"by.xlarge.png\n"
]
}
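],
"source": [
"import requests\n",
"import urllib3\n",
"\n",
"# TLS verification is disabled below (verify=False), so silence the resulting warnings\n",
"urllib3.disable_warnings()\n",
"\n",
"urls = ['https://mirrors.creativecommons.org/presskit/icons/cc.xlarge.png',\n",
"        'https://mirrors.creativecommons.org/presskit/icons/by.xlarge.png']\n",
"for url in urls:\n",
"    # The file name is the last path segment of the URL\n",
"    file_name = url.split(\"/\")[-1]\n",
"    print(file_name)\n",
"\n",
"    response = requests.get(url, verify=False)\n",
"    response.raise_for_status()  # stop here instead of saving an error page\n",
"    with open(file_name, 'wb') as f:\n",
"        f.write(response.content)"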
]
},
{
"cell_type": "markdown",
"id": "c2876bbe-8f68-44d3-a2b6-87b074a652fa",
"metadata": {},
"source": [
"## CWE data downloading"
]
},
{
"cell_type": "markdown",
"id": "1a8deaba-c660-4760-80fa-c5b239d8b654",
"metadata": {},
"source": [
"All CWE IDs are taken from [cwe.mitre.org/data/downloads.html](https://cwe.mitre.org/data/downloads.html)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "1e8428da-eb78-4d60-be57-59bee6a7d5a4",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"cwec_latest.xml.zip\n"
]
}
],
"source": [
"url = 'https://cwe.mitre.org/data/xml/cwec_latest.xml.zip'\n",
"file_name = url.split(\"/\")[-1]\n",
"print(file_name)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "03ab49bf-14af-4668-ab9a-074e7775d304",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1779050"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import requests\n",
"import urllib3\n",
"\n",
"urllib3.disable_warnings()\n",
"\n",
"file = requests.get(url, verify=False)\n",
"open(file_name, 'wb').write(file.content)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "27240ec4-b81a-474e-ba33-ae4b3b4997bf",
"metadata": {},
"outputs": [],
"source": [
"import zipfile\n",
"\n",
"with zipfile.ZipFile(file_name, 'r') as zip_ref:\n",
" zip_ref.extractall()"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "ae4022a9-c4bf-40a5-9edc-1af6cb330341",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"cwec_v4.17.xml\n"
]
}
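],
"source": [
"import glob\n",
"import os\n",
"\n",
"# glob returns files in arbitrary order, so pick the most recently written XML file (the one just extracted)\n",
"file_name = max(glob.glob('*.xml'), key=os.path.getmtime)\n",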
"print(file_name)"
]
},
{
"cell_type": "markdown",
"id": "f43928f0-26fc-4351-af46-f9abbba8f35c",
"metadata": {},
"source": [
"## CWE data parsing"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "e1088dbd-1b57-4057-8d90-0278a7405013",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" number year\n",
"0 1004 2017\n",
"1 1007 2017\n",
"2 102 2006\n",
"3 1021 2017\n",
"4 1022 2017\n",
".. ... ...\n",
"963 95 2006\n",
"964 96 2006\n",
"965 97 2006\n",
"966 98 2006\n",
"967 99 2006\n",
"\n",
"[968 rows x 2 columns]\n"
]
}
],
"source": [
"import pandas as pd\n",
"import xml.etree.ElementTree as et\n",
"\n",
"# XML namespace used by the CWE schema\n",
"ns = '{http://cwe.mitre.org/cwe-7}'\n",
"\n",
"tree = et.parse(file_name)\n",
"root = tree.getroot()\n",
"df_cols = [\"number\", \"year\"]\n",
"rows = []\n",
"\n",
"weaknesses = root.find(f'{ns}Weaknesses')\n",
"if weaknesses is not None:\n",
"    for weakness in weaknesses:\n",
"        weakness_id = weakness.get(\"ID\")\n",
"        content_history = weakness.find(f'{ns}Content_History')\n",
"        submission = content_history.find(f'{ns}Submission')\n",
"        submission_date = submission.find(f'{ns}Submission_Date').text\n",
"        # Submission_Date starts with the four-digit year (YYYY-MM-DD)\n",
"        submission_year = submission_date[0:4]\n",
"\n",
"        rows.append({\"number\": weakness_id, \"year\": submission_year})\n",
"\n",
"df = pd.DataFrame(rows, columns=df_cols)\n",
"\n",
"print(df)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "a587e39c-fe69-4ece-ba2f-2b9d7bf32970",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"<table>\n",
"  <thead>\n",
"    <tr><th></th><th>year</th><th>number</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1</th><td>2006</td><td>533</td></tr>\n",
"    <tr><th>2</th><td>2007</td><td>27</td></tr>\n",
"    <tr><th>3</th><td>2008</td><td>67</td></tr>\n",
"    <tr><th>4</th><td>2009</td><td>44</td></tr>\n",
"    <tr><th>5</th><td>2010</td><td>20</td></tr>\n",
"    <tr><th>6</th><td>2011</td><td>11</td></tr>\n",
"    <tr><th>7</th><td>2012</td><td>5</td></tr>\n",
"    <tr><th>8</th><td>2013</td><td>14</td></tr>\n",
"    <tr><th>9</th><td>2014</td><td>5</td></tr>\n",
"    <tr><th>10</th><td>2017</td><td>4</td></tr>\n",
"    <tr><th>11</th><td>2018</td><td>94</td></tr>\n",
"    <tr><th>12</th><td>2019</td><td>21</td></tr>\n",
"    <tr><th>13</th><td>2020</td><td>95</td></tr>\n",
"    <tr><th>14</th><td>2021</td><td>9</td></tr>\n",
"    <tr><th>15</th><td>2022</td><td>9</td></tr>\n",
"    <tr><th>16</th><td>2023</td><td>7</td></tr>\n",
"    <tr><th>17</th><td>2024</td><td>2</td></tr>\n",
"    <tr><th>18</th><td>2025</td><td>1</td></tr>\n",
"  </tbody>\n",
"</table>"