{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Machine Learning (ML) to Predict the Stability of Planetary Systems\n", "One key question planetary scientists strive to answer is whether exoplanetary systems are stable over the long term. That is, they would like to know whether, over billions of orbits, planets will collide or be ejected from the system. Due to the [chaotic](https://en.wikipedia.org/wiki/Chaos_theory) nature of planetary systems, the \"answer\" of whether a particular planetary system is long-term stable can only be explored statistically. For example, [Laskar & Gastineau (2009)](https://www.nature.com/articles/nature08096) studied the long-term stability of the Solar System by performing 2,501 N-body simulations, each 5 billion years in length, and found that 1% of solutions lead to a large, destabilizing increase in Mercury’s eccentricity. However, this study would have taken roughly **200 years** to complete on a standard workstation (the authors had access to a very large computing cluster), which motivates the search for faster methods. \n", "\n", "One such method is to use machine learning to predict the long-term behaviour of a planetary system from its initial conditions. Once the model is trained, it can generate new predictions in as little as a second. Such a method is described and presented in [Tamayo, Silburt, et al. (2016)](https://arxiv.org/abs/1610.05359). In this notebook, we will explore a simplified version of this work, using a dataset of 25,000 simulated 3-planet systems to train and test a variety of machine learning models." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "%matplotlib inline\n", "\n", "from sklearn.model_selection import GridSearchCV\n", "from sklearn.model_selection import KFold\n", "from sklearn import metrics\n", "from sklearn.metrics import precision_recall_curve, roc_curve\n", "pd.options.mode.chained_assignment = None" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| | runstring | a1 | e1 | omega1 | inc1 | m1 | Omega1 | true_anom1 | mean_anom1 | a2 | ... | a3 | e3 | omega3 | inc3 | m3 | Omega3 | true_anom3 | mean_anom3 | Stable | instability_time |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 24995 | 0024995.bin | 1.0 | 0.001983 | -0.103719 | 0.034756 | 6.384047e-07 | 1.891501 | 2.323505 | 2.320607 | 1.171907 | ... | 1.387978 | 0.136176 | 2.998349 | 0.013797 | 0.000001 | -1.687570 | -0.102447 | -0.077197 | 0.0 | 1.843085e+04 |
| 24996 | 0024996.bin | 1.0 | 0.000435 | -0.528805 | 0.009183 | 4.383173e-06 | -1.320371 | 2.943623 | 2.943452 | 1.118284 | ... | 1.220488 | 0.000275 | 2.799539 | 0.006669 | 0.000033 | -0.174781 | -4.296110 | 1.986572 | 0.0 | 1.868255e+03 |
| 24997 | 0024997.bin | 1.0 | 0.000159 | 0.995542 | 0.001832 | 3.176214e-05 | 1.857300 | 0.520523 | 0.520365 | 1.579004 | ... | 1.697671 | 0.014659 | 0.939074 | 0.090331 | 0.000011 | -2.313583 | 1.605318 | 1.576007 | 0.0 | 6.939622e+04 |
| 24998 | 0024998.bin | 1.0 | 0.042915 | 2.803428 | 0.034103 | 4.817579e-07 | 0.320698 | -2.821058 | -2.793167 | 1.055424 | ... | 1.286817 | 0.002571 | -1.435588 | 0.002550 | 0.000011 | -0.858794 | 1.509610 | 1.504478 | 0.0 | 3.230627e+04 |
| 24999 | 0024999.bin | 1.0 | 0.000022 | -0.388908 | 0.019481 | 4.806837e-05 | 0.756854 | 2.089563 | 2.089526 | 1.316695 | ... | 1.849302 | 0.001253 | -2.367310 | 0.005140 | 0.000041 | -0.267279 | 0.119263 | 0.118965 | 1.0 | 1.000000e+09 |
5 rows × 27 columns
| | a1 | e1 | omega1 | inc1 | m1 | Omega1 | true_anom1 | mean_anom1 | a2 | e2 | ... | true_anom2 | mean_anom2 | a3 | e3 | omega3 | inc3 | m3 | Omega3 | true_anom3 | mean_anom3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.0 | 0.005031 | 1.400202 | 0.008978 | 3.414016e-05 | 1.770839 | -1.534327 | -1.524274 | 1.221502 | 0.059391 | ... | -0.567081 | -0.505606 | 1.589043 | 0.002515 | 1.084278 | 0.065498 | 1.826900e-06 | -1.192440 | -0.614709 | -0.611814 |
| 1 | 1.0 | 0.010350 | -2.070391 | 0.001541 | 2.529838e-07 | -1.493638 | 0.013233 | 0.012961 | 1.190647 | 0.013048 | ... | 1.437355 | 1.411526 | 1.671403 | 0.072450 | 0.275475 | 0.046939 | 1.955797e-05 | -0.619350 | 0.192203 | 0.165935 |
| 2 | 1.0 | 0.051912 | -1.710828 | 0.016289 | 7.380789e-05 | 2.705977 | 2.472633 | 2.406234 | 1.125660 | 0.042326 | ... | -0.317958 | -0.292271 | 1.936762 | 0.003395 | 0.624064 | 0.014535 | 1.477931e-07 | 2.795101 | 1.685406 | 1.678659 |
| 3 | 1.0 | 0.000152 | -2.300890 | 0.003301 | 5.174804e-07 | -1.027543 | 2.993015 | 2.992970 | 1.216684 | 0.000026 | ... | -2.294238 | -2.294199 | 1.503164 | 0.057380 | 2.166845 | 0.098014 | 1.287851e-06 | -2.995941 | -1.625763 | -1.510968 |
| 4 | 1.0 | 0.001457 | 0.653565 | 0.033910 | 5.106746e-07 | 1.084883 | 0.667169 | 0.665367 | 1.034058 | 0.020334 | ... | -1.074927 | -1.039417 | 1.059346 | 0.010325 | 0.038984 | 0.011840 | 1.542321e-06 | -1.253823 | 1.215392 | 1.196085 |
5 rows × 24 columns
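
The table above has 24 columns against 27 in the full dataset, which is consistent with setting aside the three non-feature columns (`runstring`, `Stable`, `instability_time`) before training. A minimal sketch of such a split follows. The names `data`, `non_feature_cols`, `X` and `y` are assumptions for illustration, the toy frame only carries a few of the real columns, and reading `Stable == 1.0` as "stable" is inferred from the row whose `instability_time` equals 1e9.

```python
import pandas as pd

# Tiny stand-in for the full dataset (assumption: the real frame has 27 columns).
# Values are copied from the last two rows displayed earlier.
data = pd.DataFrame({
    "runstring": ["0024998.bin", "0024999.bin"],
    "a1": [1.0, 1.0],
    "e1": [0.042915, 0.000022],
    "Stable": [0.0, 1.0],
    "instability_time": [3.230627e+04, 1.000000e+09],
})

# Columns that identify a run or encode its outcome; they are not model inputs.
non_feature_cols = ["runstring", "Stable", "instability_time"]

X = data.drop(columns=non_feature_cols)  # initial conditions only
y = data["Stable"].astype(int)           # assumed encoding: 1 = stable, 0 = unstable

print(X.columns.tolist())
print(y.tolist())
```

Keeping `instability_time` out of `X` matters because it is derived from the simulation outcome and would leak the label into the features.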
| | a1 | e1 | omega1 | inc1 | m1 | Omega1 | true_anom1 | mean_anom1 | a2 | e2 | ... | a3 | e3 | omega3 | inc3 | m3 | Omega3 | true_anom3 | mean_anom3 | Rhill_12 | Rhill_23 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.0 | 0.005031 | 1.400202 | 0.008978 | 3.414016e-05 | 1.770839 | -1.534327 | -1.524274 | 1.221502 | 0.059391 | ... | 1.589043 | 0.002515 | 1.084278 | 0.065498 | 1.826900e-06 | -1.192440 | -0.614709 | -0.611814 | 0.028916 | 0.026716 |
| 1 | 1.0 | 0.010350 | -2.070391 | 0.001541 | 2.529838e-07 | -1.493638 | 0.013233 | 0.012961 | 1.190647 | 0.013048 | ... | 1.671403 | 0.072450 | 0.275475 | 0.046939 | 1.955797e-05 | -0.619350 | 0.192203 | 0.165935 | 0.024868 | 0.037600 |
| 2 | 1.0 | 0.051912 | -1.710828 | 0.016289 | 7.380789e-05 | 2.705977 | 2.472633 | 2.406234 | 1.125660 | 0.042326 | ... | 1.936762 | 0.003395 | 0.624064 | 0.014535 | 1.477931e-07 | 2.795101 | 1.685406 | 1.678659 | 0.038585 | 0.043732 |
| 3 | 1.0 | 0.000152 | -2.300890 | 0.003301 | 5.174804e-07 | -1.027543 | 2.993015 | 2.992970 | 1.216684 | 0.000026 | ... | 1.503164 | 0.057380 | 2.166845 | 0.098014 | 1.287851e-06 | -2.995941 | -1.625763 | -1.510968 | 0.012972 | 0.016725 |
| 4 | 1.0 | 0.001457 | 0.653565 | 0.033910 | 5.106746e-07 | 1.084883 | 0.667169 | 0.665367 | 1.034058 | 0.020334 | ... | 1.059346 | 0.010325 | 0.038984 | 0.011840 | 1.542321e-06 | -1.253823 | 1.215392 | 1.196085 | 0.006304 | 0.008739 |
5 rows × 26 columns
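
The two extra columns, `Rhill_12` and `Rhill_23`, look like Hill-radius features for the inner and outer planet pairs. The cell that computes them is not visible here, so the sketch below is only one plausible definition: the standard mutual Hill radius, with planet masses expressed in units of the stellar mass. The helper `mutual_hill_radius`, the `m_star` parameter and the numeric values in the toy frame are all hypothetical and are not taken from the dataset.

```python
import pandas as pd


def mutual_hill_radius(a_in, m_in, a_out, m_out, m_star=1.0):
    """Mutual Hill radius of two adjacent planets.

    Assumed definition (not confirmed by the notebook):
    R_H = (a_in + a_out) / 2 * ((m_in + m_out) / (3 * m_star)) ** (1/3)
    """
    return 0.5 * (a_in + a_out) * ((m_in + m_out) / (3.0 * m_star)) ** (1.0 / 3.0)


# Illustrative values only; they are NOT rows from the dataset.
X = pd.DataFrame({
    "a1": [1.0], "m1": [1e-5],
    "a2": [1.2], "m2": [1e-5],
    "a3": [1.5], "m3": [1e-5],
})

# Works element-wise on whole columns, so no explicit loop is needed.
X["Rhill_12"] = mutual_hill_radius(X["a1"], X["m1"], X["a2"], X["m2"])
X["Rhill_23"] = mutual_hill_radius(X["a2"], X["m2"], X["a3"], X["m3"])

print(X[["Rhill_12", "Rhill_23"]])
```

Features of this kind are useful because the spacing between neighbouring planets, measured in Hill radii, is a common heuristic for how strongly they perturb one another.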
| | a1 | e1 | omega1 | inc1 | m1 | Omega1 | true_anom1 | mean_anom1 | a2 | e2 | ... | a3 | e3 | omega3 | inc3 | m3 | Omega3 | true_anom3 | mean_anom3 | Rhill_12 | Rhill_23 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 744 | 1.0 | 0.005276 | -2.280183 | 0.003580 | 8.450904e-07 | -0.521908 | 4.171273 | -2.102849 | 1.046257 | 0.007543 | ... | 1.059626 | 0.022941 | -1.575458 | 0.004367 | 1.018926e-06 | 2.793342 | 1.065387 | 1.025576 | 0.012080 | 0.012576 |
| 745 | 1.0 | 0.007163 | 1.572270 | 0.018378 | 4.589269e-07 | -1.765554 | -1.770789 | -1.756734 | 1.414292 | 0.004847 | ... | 1.454929 | 0.017930 | -1.248724 | 0.072972 | 1.261339e-06 | 3.067091 | 1.311294 | 1.276756 | 0.018345 | 0.022342 |
| 746 | 1.0 | 0.045097 | 1.730455 | 0.061635 | 1.867993e-06 | 2.250357 | -3.620189 | 2.620180 | 1.155877 | 0.012655 | ... | 1.671401 | 0.010231 | -1.424115 | 0.002171 | 2.352218e-05 | -1.661276 | 0.842566 | 0.827372 | 0.009669 | 0.028202 |
| 747 | 1.0 | 0.005446 | -0.626045 | 0.081258 | 7.862898e-07 | 2.284007 | 2.358184 | 2.350475 | 1.125095 | 0.015416 | ... | 1.503039 | 0.021079 | -3.070809 | 0.007092 | 2.187174e-05 | 1.096155 | 2.599894 | 2.577860 | 0.014999 | 0.028160 |
| 748 | 1.0 | 0.004916 | -2.321251 | 0.013209 | 1.638547e-07 | 0.331210 | 0.190913 | 0.189054 | 1.349109 | 0.000609 | ... | 1.761103 | 0.184905 | -0.267640 | 0.020949 | 8.206347e-07 | 2.488754 | -2.440396 | -2.174502 | 0.033689 | 0.044742 |
5 rows × 26 columns
\n", "