{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# **Use ML techniques for layer stackup modeling** \n", "## Table of contents: \n", "### 1.**[Motivation](#Motivation)** ...\n", "### 2.**[Problem Statements](#Problem_Statements)** ...\n", "### 3.**[Generate Data](#Generate_Data)** ...\n", "### 4.**[Prepare Data](#Prepare_Data)** ...\n", "### 5.**[Choose a Model](#Choose_a_Model)** ...\n", "### 6.**[Training](#Training)** ...\n", "### 7.**[Neural Network](#Neural_Network)** ...\n", "### 8.**[Deploy](#Deploy)** ...\n", "### 9.**[Conclusion](#Conclusion)** ..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## **Motivation:**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When planning a PCB layer stackup, we often would like to know the trade-off between various layout options and their signal integrity performance. For example, a wider trace may provide lower impedance yet occupy more routing area. Narrowing the spacing between differential pairs may save some space but will also increase crosstalk. Most EDA tools involving system-level signal integrity analysis provide a \"transmission line calculator\", like the one shown below, for designers to quickly make estimates and determine the trade-off: \n", "![TLineCalc](https://github.com/SPISim/ML_LStkModeling/blob/master/assets/images/KiCad_LineCalc.png?raw=true)\n", "\n", "However, all such \"calculators\" I have seen, even commercial ones, only consider a single trace or one differential pair by itself. They do not take crosstalk into account. Moreover, stackup parameters such as conductivity and permittivity must be entered as individual values instead of as a range. As a result, users cannot easily visualize the relationships between performance parameters and stackup properties. Thus, an enhanced version of such a \"T-Line calculator\", which can address the aforementioned gaps, would be very useful. 
Such a tool requires a prediction model that links the various stackup parameters to their performance targets. Data science/machine learning techniques can thus be used to build such a model." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## **Problem Statements:**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We would like to build a prediction model such that, given a set of stackup parameters such as trace width and spacing, its performance, such as impedance, attenuation, and near-end and far-end crosstalk, can be quickly estimated. This model can then be deployed into a stand-alone tool for range-based sweeps, so that a visual plot can be generated to show the relations between various parameters and help decide design trade-offs." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## **Generate Data**:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Overview: \n", "The model to be built here is for numerical prediction (i.e. regression) with around 10 attributes, i.e. input variables. Various stackup configurations will be generated via sampling, and their corresponding stackup models, in the form of frequency-dependent R/L/G/C matrices, will be simulated via a field solver. This process is deterministic. Post-processing steps will read these solved models and calculate performance. Here we define the performance to be predicted as impedance, attenuation, near-end/far-end crosstalk and propagation speed." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### ***Define stackup structure:*** \n", "There are many possible stackup structures, as shown below. 
For more accurate prediction, we are going to generate one prediction model per structure.\n", "![Presets](https://github.com/SPISim/ML_LStkModeling/blob/master/assets/images/LStkPresets.png?raw=true)\n", "\n", "Using three single-ended traces (victim in the middle) in a strip-line setup as an example, various attributes may be defined as shown below:\n", "![Setup](https://github.com/SPISim/ML_LStkModeling/blob/master/assets/images/SLSE3Setup.png?raw=true)\n", "\n", "These parameters, such as S (spacing), W (width), Sigma (conductivity), Er (permittivity), H (height) etc., are represented as variables to be sampled." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### ***Define sampling points:*** \n", "The next step is to define the ranges of variable values and the sampling points. Since there are about 10 parameters, a full combinatorial sweep is impractical. Thus we need to apply sampling algorithms such as design-of-experiments or space filling to establish the best coverage of the solution space.\n", "For this setup, we have generated 10,000 cases to be simulated.\n", "![Sample](https://github.com/SPISim/ML_LStkModeling/blob/master/assets/images/SLSE3Sample.png?raw=true)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### ***Generate inputs setup and simulate:*** \n", "Once we have the sample points, the layer stackup configurations will be generated as inputs to the solver to be used. Each field solver has a different syntax, thus a flow will be needed...\n", "![TProFlow](https://github.com/SPISim/ML_LStkModeling/blob/master/assets/images/SLSE3FlowGUI.png?raw=true)\n", "\n", "In this case, we use HSpice from Synopsys as the field solver, thus each of the 10K parameter combinations will be used to generate a spice input file for simulation:\n", "![Simulate](https://github.com/SPISim/ML_LStkModeling/blob/master/assets/images/SLSE3Simulate.png?raw=true)\n", "\n", "The next step is to perform circuit simulation for all these cases. 
This may be a time-consuming process, so a distributed environment or simulation farm may be used." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### ***Performance measurement:*** \n", "The outcome of each simulation is a frequency-dependent tabular model corresponding to its layer stackup settings. HSpice's tabular format looks like this:\n", "![Tabular](https://github.com/SPISim/ML_LStkModeling/blob/master/assets/images/SLSE3Tabular.png?raw=true)\n", "The next step is to load these models and do performance measurement:\n", "![Measure](https://github.com/SPISim/ML_LStkModeling/blob/master/assets/images/SLSE3Measure.png?raw=true)\n", "Matrix manipulation such as eigenvalue decomposition will be applied in order to obtain the characteristic impedance, propagation speed etc. The measurement output of each model should be a set of parameters, which will be combined with the original inputs to form the dataset for our prediction modeling." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## **Prepare Data:**\n", "From this point, we can start the modeling process using Python and various packages." ] }, { "cell_type": "code", "execution_count": 101, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "\n", "## Initial set-up for data\n", "import os\n", "import pandas as pd\n", "import matplotlib\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "\n", "prjHome = 'C:/Temp/WinProj/LStkMdl'\n", "workDir = prjHome + '/wsp/'\n", "srcFile = prjHome + '/dat/SLSE3.csv'\n", "\n", "def save_fig(fig_id, tight_layout=True, fig_extension=\"png\", resolution=300):\n", "    path = os.path.join(workDir, fig_id + \".\" + fig_extension)\n", "    print(\"Saving figure\", fig_id)\n", "    if tight_layout:\n", "        plt.tight_layout()\n", "    plt.savefig(path, format=fig_extension, dpi=resolution)" ] }, { "cell_type": "code", "execution_count": 102, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
H1H3ER1ER3TD1TD3H2WSEDWSIGMAHFNAMEZ0SE(1_SE)S0SE(1_SE)KBSENB(1_1)KFSENB(1_1)A(1_1)
04.93115.58593.69643.32770.0137710.0305650.912367.557330.66100.17753050000000.015SPIMDL00001.TAB37.7150881.601068e+080.000035-7.297762e-150.497289
14.89185.31564.04103.84040.0220080.0086000.531267.03786.92170.73337050000000.015SPIMDL00002.TAB39.8319541.510345e+080.019608-1.093612e-120.249591
21.790711.52303.69843.35340.0138590.0451372.009903.696427.32300.70775050000000.015SPIMDL00003.TAB35.6639281.587798e+080.000305-2.797417e-130.705953
32.85954.72593.96053.39810.0104810.0180280.365472.38727.28030.73521050000000.015SPIMDL00004.TAB59.4564381.558444e+080.010125-5.478920e-120.327952
45.59468.25533.41763.32490.0244340.0036632.088106.377620.16800.05358550000000.015SPIMDL00005.TAB46.0827221.633106e+080.004136-4.787903e-130.169760
\n", "
" ], "text/plain": [ " H1 H3 ER1 ER3 TD1 TD3 H2 W \\\n", "0 4.9311 5.5859 3.6964 3.3277 0.013771 0.030565 0.91236 7.5573 \n", "1 4.8918 5.3156 4.0410 3.8404 0.022008 0.008600 0.53126 7.0378 \n", "2 1.7907 11.5230 3.6984 3.3534 0.013859 0.045137 2.00990 3.6964 \n", "3 2.8595 4.7259 3.9605 3.3981 0.010481 0.018028 0.36547 2.3872 \n", "4 5.5946 8.2553 3.4176 3.3249 0.024434 0.003663 2.08810 6.3776 \n", "\n", " S EDW SIGMA H FNAME Z0SE(1_SE) \\\n", "0 30.6610 0.177530 50000000.0 15 SPIMDL00001.TAB 37.715088 \n", "1 6.9217 0.733370 50000000.0 15 SPIMDL00002.TAB 39.831954 \n", "2 27.3230 0.707750 50000000.0 15 SPIMDL00003.TAB 35.663928 \n", "3 7.2803 0.735210 50000000.0 15 SPIMDL00004.TAB 59.456438 \n", "4 20.1680 0.053585 50000000.0 15 SPIMDL00005.TAB 46.082722 \n", "\n", " S0SE(1_SE) KBSENB(1_1) KFSENB(1_1) A(1_1) \n", "0 1.601068e+08 0.000035 -7.297762e-15 0.497289 \n", "1 1.510345e+08 0.019608 -1.093612e-12 0.249591 \n", "2 1.587798e+08 0.000305 -2.797417e-13 0.705953 \n", "3 1.558444e+08 0.010125 -5.478920e-12 0.327952 \n", "4 1.633106e+08 0.004136 -4.787903e-13 0.169760 " ] }, "execution_count": 102, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Let's read the data and do some statistic\n", "srcData = pd.read_csv(srcFile)\n", "\n", "# Take a peek:\n", "srcData.head()" ] }, { "cell_type": "code", "execution_count": 103, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 10000 entries, 0 to 9999\n", "Data columns (total 18 columns):\n", "H1 10000 non-null float64\n", "H3 10000 non-null float64\n", "ER1 10000 non-null float64\n", "ER3 10000 non-null float64\n", "TD1 10000 non-null float64\n", "TD3 10000 non-null float64\n", "H2 10000 non-null float64\n", "W 10000 non-null float64\n", "S 10000 non-null float64\n", "EDW 10000 non-null float64\n", "SIGMA 10000 non-null float64\n", "H 10000 non-null int64\n", "FNAME 10000 non-null object\n", "Z0SE(1_SE) 10000 non-null float64\n", "S0SE(1_SE) 10000 
non-null float64\n", "KBSENB(1_1) 9998 non-null float64\n", "KFSENB(1_1) 9998 non-null float64\n", "A(1_1) 10000 non-null float64\n", "dtypes: float64(16), int64(1), object(1)\n", "memory usage: 1.4+ MB\n" ] } ], "source": [ "srcData.info()" ] }, { "cell_type": "code", "execution_count": 104, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
H1H3ER1ER3TD1TD3H2WSEDWSIGMAHZ0SE(1_SE)S0SE(1_SE)KBSENB(1_1)KFSENB(1_1)A(1_1)
count10000.00000010000.00000010000.00000010000.00000010000.00000010000.00000010000.00000010000.00000010000.00000010000.00000010000.010000.010000.0000001.000000e+049.998000e+039.998000e+0310000.000000
mean4.1999997.5000073.8500013.8500010.0250000.0250001.4000005.50000118.5000070.37500050000000.015.042.9853451.533140e+081.475118e-02-2.155868e-120.509534
std1.6166613.1755910.4907730.4907720.0144340.0144340.6351172.0208299.5267500.2165150.00.0172.1638357.085288e+062.847164e-023.787910e-110.288974
min1.4000002.0010003.0000003.0001000.0000030.0000020.3001402.0001002.0026000.00000250000000.015.011.6304241.384625e+081.009146e-10-2.796522e-090.011486
25%2.7999254.7503253.4250753.4250500.0125000.0125010.8500383.75035010.2500000.18751050000000.015.032.1739331.479284e+082.216767e-04-1.112024e-120.328938
50%4.2000007.5005003.8500003.8499500.0250000.0250001.3999505.50015018.5000000.37498550000000.015.040.0316141.527750e+081.969018e-03-1.928585e-150.507237
75%5.60007510.2495004.2749504.2749500.0374990.0374981.9498757.24965026.7480000.56250750000000.015.048.6880361.583456e+081.376321e-027.447876e-130.680500
max6.99990013.0000004.6998004.6999000.0500000.0499952.5000008.99960034.9990000.74993050000000.015.017097.9735701.725066e+083.184933e-019.613918e-1114.524624
\n", "
" ], "text/plain": [ " H1 H3 ER1 ER3 TD1 \\\n", "count 10000.000000 10000.000000 10000.000000 10000.000000 10000.000000 \n", "mean 4.199999 7.500007 3.850001 3.850001 0.025000 \n", "std 1.616661 3.175591 0.490773 0.490772 0.014434 \n", "min 1.400000 2.001000 3.000000 3.000100 0.000003 \n", "25% 2.799925 4.750325 3.425075 3.425050 0.012500 \n", "50% 4.200000 7.500500 3.850000 3.849950 0.025000 \n", "75% 5.600075 10.249500 4.274950 4.274950 0.037499 \n", "max 6.999900 13.000000 4.699800 4.699900 0.050000 \n", "\n", " TD3 H2 W S EDW \\\n", "count 10000.000000 10000.000000 10000.000000 10000.000000 10000.000000 \n", "mean 0.025000 1.400000 5.500001 18.500007 0.375000 \n", "std 0.014434 0.635117 2.020829 9.526750 0.216515 \n", "min 0.000002 0.300140 2.000100 2.002600 0.000002 \n", "25% 0.012501 0.850038 3.750350 10.250000 0.187510 \n", "50% 0.025000 1.399950 5.500150 18.500000 0.374985 \n", "75% 0.037498 1.949875 7.249650 26.748000 0.562507 \n", "max 0.049995 2.500000 8.999600 34.999000 0.749930 \n", "\n", " SIGMA H Z0SE(1_SE) S0SE(1_SE) KBSENB(1_1) \\\n", "count 10000.0 10000.0 10000.000000 1.000000e+04 9.998000e+03 \n", "mean 50000000.0 15.0 42.985345 1.533140e+08 1.475118e-02 \n", "std 0.0 0.0 172.163835 7.085288e+06 2.847164e-02 \n", "min 50000000.0 15.0 11.630424 1.384625e+08 1.009146e-10 \n", "25% 50000000.0 15.0 32.173933 1.479284e+08 2.216767e-04 \n", "50% 50000000.0 15.0 40.031614 1.527750e+08 1.969018e-03 \n", "75% 50000000.0 15.0 48.688036 1.583456e+08 1.376321e-02 \n", "max 50000000.0 15.0 17097.973570 1.725066e+08 3.184933e-01 \n", "\n", " KFSENB(1_1) A(1_1) \n", "count 9.998000e+03 10000.000000 \n", "mean -2.155868e-12 0.509534 \n", "std 3.787910e-11 0.288974 \n", "min -2.796522e-09 0.011486 \n", "25% -1.112024e-12 0.328938 \n", "50% -1.928585e-15 0.507237 \n", "75% 7.447876e-13 0.680500 \n", "max 9.613918e-11 14.524624 " ] }, "execution_count": 104, "metadata": {}, "output_type": "execute_result" } ], "source": [ "srcData.describe()" ] }, { "cell_type": 
"markdown", "metadata": {}, "source": [ "### ** Note that:** \n", "- Sigma(Conductivity) and H(default layer height) are constants in this setup;\n", "- FNAME (File name) is not needed for modeling\n", "- Z0 (impedance) has outliers\n", "- Forward/Backward crosstalk (Kb/kf) have missing terms" ] }, { "cell_type": "code", "execution_count": 105, "metadata": {}, "outputs": [], "source": [ "# drop constant and file name columns\n", "stkData = srcData.drop(columns=['H', 'SIGMA', 'FNAME'])" ] }, { "cell_type": "code", "execution_count": 106, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Saving figure attribute_histogram_plots\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# plot distributions before dropping measurement outliers\n", "stkData.hist(bins=50, figsize=(20,15))\n", "save_fig(\"attribute_histogram_plots\")\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": 107, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Int64Index: 9994 entries, 0 to 9999\n", "Data columns (total 15 columns):\n", "H1 9994 non-null float64\n", "H3 9994 non-null float64\n", "ER1 9994 non-null float64\n", "ER3 9994 non-null float64\n", "TD1 9994 non-null float64\n", "TD3 9994 non-null float64\n", "H2 9994 non-null float64\n", "W 9994 non-null float64\n", "S 9994 non-null float64\n", "EDW 9994 non-null float64\n", "Z0SE(1_SE) 9994 non-null float64\n", "S0SE(1_SE) 9994 non-null float64\n", "KBSENB(1_1) 9994 non-null float64\n", "KFSENB(1_1) 9994 non-null float64\n", "A(1_1) 9994 non-null float64\n", "dtypes: float64(15)\n", "memory usage: 1.2 MB\n" ] } ], "source": [ "# drop outliers and invalid Kb/Kf cells\n", "# These may be caused by unphysical stakup model or calculation during post-processing\n", "maxZVal = 200\n", "minZVal = 10\n", "stkTemp = stkData[(stkData['Z0SE(1_SE)'] < maxZVal) & \\\n", " 
                  (stkData['Z0SE(1_SE)'] > minZVal) & \\\n", "                  (np.abs(stkData['KBSENB(1_1)']) > 0.0) & \\\n", "                  (np.abs(stkData['KFSENB(1_1)']) > 0.0)]\n", "\n", "# Check again to make sure the data are now clean\n", "stkTemp.info()" ] }, { "cell_type": "code", "execution_count": 108, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Saving figure attribute_histogram_plots\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# now plot distributions again, should see proper distributions now\n", "stkData = stkTemp\n", "stkData.hist(bins=50, figsize=(20,15))\n", "save_fig(\"attribute_histogram_plots\")\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": 109, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Z0SE(1_SE) 1.000000\n", "W 0.664396\n", "H1 0.535432\n", "H3 0.344356\n", "H2 0.186618\n", "ER1 0.119390\n", "ER3 0.110507\n", "EDW 0.067261\n", "TD3 0.061906\n", "TD1 0.017937\n", "S 0.000107\n", "Name: Z0SE(1_SE), dtype: float64" ] }, "execution_count": 109, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# rank input attributes by absolute correlation with Z0\n", "corr_matrix = stkData.drop(columns=['KBSENB(1_1)', 'KFSENB(1_1)', 'S0SE(1_SE)', 'A(1_1)']).corr()\n", "corr_matrix['Z0SE(1_SE)'].abs().sort_values(ascending=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "From the correlation ranking above, it can be seen that trace width (W) and height (H1, H3) are the dominant factors for the trace's impedance." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## **Choose a Model:**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since we are building a numerical estimator (a regressor) here, I will try a simple linear regressor first:" ] }, { "cell_type": "code", "execution_count": 110, "metadata": {}, "outputs": [], "source": [ "# Separate input and output attributes\n", "allTars = ['Z0SE(1_SE)', 'KBSENB(1_1)', 'KFSENB(1_1)', 'S0SE(1_SE)', 'A(1_1)']\n", "varList = [e for e in list(stkData) if e not in allTars]\n", "varData = stkData[varList]" ] }, { "cell_type": "code", "execution_count": 111, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3.478480854776983" ] }, "execution_count": 111, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# We have 10,000 cases here, try in-memory normal equation directly first:\n", "\n", "# LinearRegression Fit Impedance\n", "from sklearn.linear_model import LinearRegression\n", "\n", "tarData = stkData['Z0SE(1_SE)']\n", "lin_reg = LinearRegression()\n", "lin_reg.fit(varData, tarData)\n", "\n", "# Check predictions on the training data using RMSE\n", "from sklearn.metrics import mean_squared_error, mean_absolute_error\n", "predict = lin_reg.predict(varData)\n", "resRMSE = np.sqrt(mean_squared_error(tarData, predict))\n", "resRMSE" ] }, { "cell_type": "code", "execution_count": 112, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Attribute: ['H1', 'H3', 'ER1', 'ER3', 'TD1', 'TD3', 'H2', 'W', 'S', 'EDW']\n", "Scores: [3.25291562 4.16881619 3.22429058 3.32251943 3.70096691 3.46812853\n", " 3.34169206 3.5511994 3.28259776 3.39876687]\n", "Mean: 3.4711893352193854\n", "Standard deviation: 0.270957271481177\n" ] } ], "source": [ "# Use 10-fold cross-validation:\n", "def display_scores(attribs, scores):\n", "    print(\"Attribute:\", attribs)\n", "    print(\"Scores:\", scores)\n", "    print(\"Mean:\", scores.mean())\n", "    print(\"Standard deviation:\", scores.std())\n", " 
\n", "from sklearn.model_selection import cross_val_score\n", "lin_scores = cross_val_score(lin_reg, varData, tarData,\n", " scoring=\"neg_mean_squared_error\", cv=10)\n", "lin_rmse_scores = np.sqrt(-lin_scores)\n", "display_scores(varList, lin_rmse_scores)" ] }, { "cell_type": "code", "execution_count": 113, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3.4921877809274595" ] }, "execution_count": 113, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# try Regularization it self\n", "from sklearn.linear_model import Ridge\n", "ridge_reg = Ridge(alpha=1, solver=\"cholesky\")\n", "ridge_reg.fit(varData, tarData)\n", "predict = ridge_reg.predict(varData)\n", "resRMSE = np.sqrt(mean_squared_error(tarData, predict))\n", "resRMSE" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Thus a 3 ohms or so difference may be obtained from this estimator. What if higher order regression is used:" ] }, { "cell_type": "code", "execution_count": 114, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1.2780298094097002" ] }, "execution_count": 114, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from sklearn.preprocessing import PolynomialFeatures\n", "poly_features = PolynomialFeatures(degree=2, include_bias=False)\n", "varPoly = poly_features.fit_transform(varData)\n", "lin_reg = LinearRegression()\n", "lin_reg.fit(varPoly, tarData)\n", "predict = lin_reg.predict(varPoly)\n", "resRMSE = np.sqrt(mean_squared_error(tarData, predict))\n", "resRMSE" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A more accurate model thus may be obtained this way. 
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## **Training and Evaluation:**" ] }, { "cell_type": "code", "execution_count": 115, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Saving figure underfitting_learning_curves_plot\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from sklearn.metrics import mean_squared_error\n", "from sklearn.model_selection import train_test_split\n", "\n", "def plot_learning_curves(model, X, y):\n", " X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=10)\n", " train_errors, val_errors = [], []\n", " for m in range(1, len(X_train)):\n", " model.fit(X_train[:m], y_train[:m])\n", " y_train_predict = model.predict(X_train[:m])\n", " y_val_predict = model.predict(X_val)\n", " train_errors.append(mean_squared_error(y_train_predict, y_train[:m]))\n", " val_errors.append(mean_squared_error(y_val_predict, y_val))\n", "\n", " plt.plot(np.sqrt(train_errors), \"r-+\", linewidth=2, label=\"Training set\")\n", " plt.plot(np.sqrt(val_errors), \"b-\", linewidth=3, label=\"Validation set\")\n", " plt.legend(loc=\"upper right\", fontsize=14)\n", " plt.xlabel(\"Training set size\", fontsize=14)\n", " plt.ylabel(\"RMSE\", fontsize=14)\n", "\n", "lin_reg = LinearRegression()\n", "plot_learning_curves(lin_reg, varData, tarData)\n", "plt.axis([0, 8000, 0, 20])\n", "save_fig(\"underfitting_learning_curves_plot\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## **Neural Network:**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As the difference between prediction to actual measurement is about two ohms, it has met our modeling goals. 
As an alternative approach, let's try neural network modeling below:" ] }, { "cell_type": "code", "execution_count": 116, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "_________________________________________________________________\n", "Layer (type) Output Shape Param # \n", "=================================================================\n", "dense_10 (Dense) (None, 64) 704 \n", "_________________________________________________________________\n", "dropout_7 (Dropout) (None, 64) 0 \n", "_________________________________________________________________\n", "dense_11 (Dense) (None, 64) 4160 \n", "_________________________________________________________________\n", "dropout_8 (Dropout) (None, 64) 0 \n", "_________________________________________________________________\n", "dense_12 (Dense) (None, 1) 65 \n", "=================================================================\n", "Total params: 4,929\n", "Trainable params: 4,929\n", "Non-trainable params: 0\n", "_________________________________________________________________\n" ] } ], "source": [ "from keras.models import Sequential\n", "from keras.layers import Dense, Dropout\n", "\n", "numInps = len(varList)\n", "nnetMdl = Sequential()\n", "# input layer\n", "nnetMdl.add(Dense(units=64, activation='relu', input_dim=numInps))\n", "\n", "# hidden layers\n", "nnetMdl.add(Dropout(0.3, noise_shape=None, seed=None))\n", "nnetMdl.add(Dense(64, activation = \"relu\"))\n", "nnetMdl.add(Dropout(0.2, noise_shape=None, seed=None))\n", " \n", "# output layer\n", "nnetMdl.add(Dense(units=1, activation='sigmoid'))\n", "nnetMdl.compile(loss='mean_squared_error', optimizer='adam')\n", "\n", "# Provide some info\n", "#from keras.utils import plot_model\n", "#plot_model(nnetMdl, to_file= workDir + 'model.png')\n", "nnetMdl.summary()" ] }, { "cell_type": "code", "execution_count": 117, "metadata": {}, "outputs": [], "source": [ "# Prepare Training (tran) and Validation (test) dataset\n", 
"varTran, varTest, tarTran, tarTest = train_test_split(varData, tarData, test_size=0.2)\n", "\n", "# scale the data\n", "from sklearn import preprocessing\n", "varScal = preprocessing.MinMaxScaler()\n", "varTran = varScal.fit_transform(varTran)\n", "varTest = varScal.transform(varTest)\n", "\n", "tarScal = preprocessing.MinMaxScaler()\n", "tarTran = tarScal.fit_transform(tarTran.values.reshape(-1, 1))" ] }, { "cell_type": "code", "execution_count": 118, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Train on 7195 samples, validate on 800 samples\n", "Epoch 1/50\n", "7195/7195 [==============================] - 0s 43us/step - loss: 0.0457 - val_loss: 0.0205\n", "Epoch 2/50\n", "7195/7195 [==============================] - 0s 4us/step - loss: 0.0202 - val_loss: 0.0150\n", "Epoch 3/50\n", "7195/7195 [==============================] - 0s 4us/step - loss: 0.0173 - val_loss: 0.0157\n", "Epoch 4/50\n", "7195/7195 [==============================] - 0s 4us/step - loss: 0.0169 - val_loss: 0.0134\n", "Epoch 5/50\n", "7195/7195 [==============================] - 0s 5us/step - loss: 0.0148 - val_loss: 0.0111\n", "Epoch 6/50\n", "7195/7195 [==============================] - 0s 6us/step - loss: 0.0129 - val_loss: 0.0094\n", "Epoch 7/50\n", "7195/7195 [==============================] - 0s 5us/step - loss: 0.0110 - val_loss: 0.0078\n", "Epoch 8/50\n", "7195/7195 [==============================] - 0s 4us/step - loss: 0.0093 - val_loss: 0.0061\n", "Epoch 9/50\n", "7195/7195 [==============================] - 0s 5us/step - loss: 0.0078 - val_loss: 0.0045\n", "Epoch 10/50\n", "7195/7195 [==============================] - 0s 4us/step - loss: 0.0063 - val_loss: 0.0034\n", "Epoch 11/50\n", "7195/7195 [==============================] - 0s 5us/step - loss: 0.0055 - val_loss: 0.0026\n", "Epoch 12/50\n", "7195/7195 [==============================] - 0s 4us/step - loss: 0.0047 - val_loss: 0.0021\n", "Epoch 13/50\n", "7195/7195 
[==============================] - 0s 5us/step - loss: 0.0043 - val_loss: 0.0018\n", "Epoch 14/50\n", "7195/7195 [==============================] - 0s 6us/step - loss: 0.0039 - val_loss: 0.0017\n", "Epoch 15/50\n", "7195/7195 [==============================] - 0s 6us/step - loss: 0.0037 - val_loss: 0.0016\n", "Epoch 16/50\n", "7195/7195 [==============================] - 0s 5us/step - loss: 0.0034 - val_loss: 0.0015\n", "Epoch 17/50\n", "7195/7195 [==============================] - 0s 5us/step - loss: 0.0032 - val_loss: 0.0015\n", "Epoch 18/50\n", "7195/7195 [==============================] - 0s 5us/step - loss: 0.0030 - val_loss: 0.0015\n", "Epoch 19/50\n", "7195/7195 [==============================] - ETA: 0s - loss: 0.002 - 0s 8us/step - loss: 0.0029 - val_loss: 0.0015\n", "Epoch 20/50\n", "7195/7195 [==============================] - 0s 6us/step - loss: 0.0028 - val_loss: 0.0014\n", "Epoch 21/50\n", "7195/7195 [==============================] - 0s 6us/step - loss: 0.0028 - val_loss: 0.0014\n", "Epoch 22/50\n", "7195/7195 [==============================] - 0s 5us/step - loss: 0.0026 - val_loss: 0.0014\n", "Epoch 23/50\n", "7195/7195 [==============================] - 0s 6us/step - loss: 0.0025 - val_loss: 0.0013\n", "Epoch 24/50\n", "7195/7195 [==============================] - 0s 5us/step - loss: 0.0024 - val_loss: 0.0013\n", "Epoch 25/50\n", "7195/7195 [==============================] - 0s 4us/step - loss: 0.0024 - val_loss: 0.0013\n", "Epoch 26/50\n", "7195/7195 [==============================] - 0s 4us/step - loss: 0.0023 - val_loss: 0.0012\n", "Epoch 27/50\n", "7195/7195 [==============================] - 0s 4us/step - loss: 0.0023 - val_loss: 0.0012\n", "Epoch 28/50\n", "7195/7195 [==============================] - 0s 4us/step - loss: 0.0022 - val_loss: 0.0012\n", "Epoch 29/50\n", "7195/7195 [==============================] - 0s 4us/step - loss: 0.0022 - val_loss: 0.0012\n", "Epoch 30/50\n", "7195/7195 [==============================] - 0s 5us/step - loss: 
0.0020 - val_loss: 0.0011\n", "Epoch 31/50\n", "7195/7195 [==============================] - 0s 5us/step - loss: 0.0021 - val_loss: 0.0011\n", "Epoch 32/50\n", "7195/7195 [==============================] - 0s 5us/step - loss: 0.0020 - val_loss: 0.0011\n", "Epoch 33/50\n", "7195/7195 [==============================] - 0s 5us/step - loss: 0.0020 - val_loss: 0.0011\n", "Epoch 34/50\n", "7195/7195 [==============================] - 0s 4us/step - loss: 0.0019 - val_loss: 0.0010\n", "Epoch 35/50\n", "7195/7195 [==============================] - 0s 4us/step - loss: 0.0018 - val_loss: 0.0010\n", "Epoch 36/50\n", "7195/7195 [==============================] - 0s 5us/step - loss: 0.0018 - val_loss: 9.9226e-04\n", "Epoch 37/50\n", "7195/7195 [==============================] - 0s 4us/step - loss: 0.0018 - val_loss: 9.8783e-04\n", "Epoch 38/50\n", "7195/7195 [==============================] - 0s 4us/step - loss: 0.0017 - val_loss: 9.5433e-04\n", "Epoch 39/50\n", "7195/7195 [==============================] - 0s 5us/step - loss: 0.0017 - val_loss: 9.5077e-04\n", "Epoch 40/50\n", "7195/7195 [==============================] - 0s 4us/step - loss: 0.0017 - val_loss: 9.2018e-04\n", "Epoch 41/50\n", "7195/7195 [==============================] - 0s 4us/step - loss: 0.0016 - val_loss: 9.1261e-04\n", "Epoch 42/50\n", "7195/7195 [==============================] - 0s 5us/step - loss: 0.0016 - val_loss: 9.0183e-04\n", "Epoch 43/50\n", "7195/7195 [==============================] - 0s 4us/step - loss: 0.0016 - val_loss: 8.8474e-04\n", "Epoch 44/50\n", "7195/7195 [==============================] - 0s 5us/step - loss: 0.0016 - val_loss: 8.6719e-04\n", "Epoch 45/50\n", "7195/7195 [==============================] - 0s 5us/step - loss: 0.0015 - val_loss: 8.4854e-04\n", "Epoch 46/50\n", "7195/7195 [==============================] - 0s 5us/step - loss: 0.0016 - val_loss: 8.2477e-04\n", "Epoch 47/50\n", "7195/7195 [==============================] - 0s 5us/step - loss: 0.0015 - val_loss: 8.0656e-04\n", 
"Epoch 48/50\n", "7195/7195 [==============================] - 0s 6us/step - loss: 0.0014 - val_loss: 8.1118e-04\n", "Epoch 49/50\n", "7195/7195 [==============================] - 0s 5us/step - loss: 0.0014 - val_loss: 7.8844e-04\n", "Epoch 50/50\n", "7195/7195 [==============================] - 0s 5us/step - loss: 0.0014 - val_loss: 7.9169e-04\n" ] }, { "data": { "text/plain": [ "1.982441126153167" ] }, "execution_count": 118, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hist = nnetMdl.fit(varTran, tarTran, epochs=50, batch_size=1000, validation_split=0.1)\n", "tarTemp = nnetMdl.predict(varTest, batch_size=1000)\n", "predict = tarScal.inverse_transform(tarTemp)\n", "resRMSE = np.sqrt(mean_squared_error(tarTest, predict))\n", "resRMSE" ] }, { "cell_type": "code", "execution_count": 119, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plt.plot(hist.history['loss'])\n", "plt.title('Model loss')\n", "plt.ylabel('Loss')\n", "plt.xlabel('Epoch')\n", "plt.legend(['Train', 'Val'], loc='upper right')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With epoc increased to 100, we can even obtain 1.5 ohms accuracy. It seems this neural network model is comparable to the polynominal regressor and meet our needs." ] }, { "cell_type": "code", "execution_count": 120, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Saved model to disk\n" ] } ], "source": [ "# save model and architecture to single file\n", "nnetMdl.save(workDir + \"LStkMdl.h5\")\n", "\n", "# finally\n", "print(\"Saved model to disk\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## **Deploy:**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using the SESL3 data set as an example, we follow the similar process and built 10+ prediction models for different stakup structure setup. 
The polynomial model or neural network can be implemented in Java/C++ to avoid dependencies on Python packages when distributing the tool. The implemented front-end, shown below, provides a quick and easy way for system designers to plan stackup and routing:\n", "![Deploy](https://github.com/SPISim/ML_LStkModeling/blob/master/assets/images/SLSE3Deploy.png?raw=true)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## **Conclusion:**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this post/markdown document, we describe the stackup modeling process using data science/machine learning techniques. The outcome is a deployed front-end backed by a trained neural network model that lets users evaluate performance instantly. The data set and this markdown document are published on this project's GitHub page." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 2 }