{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Title: msticpy - GeoIP Lookup\n", "\n", "## Introduction\n", "This module contains two classes that allow you to look up the geolocation of IP addresses.\n", "\n", "You must have msticpy installed to run this notebook:\n", "```\n", "%pip install --upgrade msticpy\n", "```\n", "\n", "\n", "### MaxMind GeoIPLite\n", "This product includes GeoLite2 data created by MaxMind, available from\n", "https://www.maxmind.com.\n", "\n", "This uses a local database, which is downloaded the first time the class is instantiated. It gives very fast lookups, but you need to download updates regularly. MaxMind offers a free tier of this database, updated monthly. For greater accuracy and more detailed information, they offer several levels of paid service. Please check out their site for more details.\n", "\n", "The geoip module uses the official MaxMind PyPI package, geoip2, and also has options to customize the behavior of the local MaxMind database.\n", "* ```db_folder``` : Specify a custom path containing the local MaxMind city database. If not specified, the database is downloaded to the .msticpy directory under the user's home directory.\n", "* ```force_update``` : Can be set to True/False. If True, forces an update regardless of the age check.\n", "* The age of the MaxMind city database is checked using the database info, and a new copy is downloaded if it has not been updated in the last 30 days.\n", "* ```auto_update``` : Can be set to True/False. Set to False to override the automatic update if you do not want to update a database older than 30 days.\n", "\n", "### IPStack\n", "This library uses services provided by ipstack.\n", "https://ipstack.com\n", "\n", "IPStack is an online service and also offers a free tier of their service. Again, the paid tiers offer greater accuracy, more detailed information and higher throughput. 
Please check out their site for more details.\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## Table of Contents\n", "- [Maxmind GeoIP Lookup](#geoip_lookups)\n", "- [IPStack GeoIP Lookup](#ipstack_lookups)\n", "- [Dataframe input](#dataframe_input)\n", "- [Creating your own GeoIP Class](#custom_lookup)\n", "- [Calculating Geographical Distances](#calc_distance)" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecuteTime": { "end_time": "2020-02-08T02:38:51.269375Z", "start_time": "2020-02-08T02:38:49.282504Z" }, "scrolled": true }, "outputs": [ { "data": { "text/html": [ "\n", "This product includes GeoLite2 data created by MaxMind, available from\n", "https://www.maxmind.com.\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Imports\n", "import sys\n", "MIN_REQ_PYTHON = (3,6)\n", "if sys.version_info < MIN_REQ_PYTHON:\n", " print('Check the Kernel->Change Kernel menu and ensure that Python 3.6')\n", " print('or later is selected as the active kernel.')\n", " sys.exit(\"Python %s.%s or later is required.\\n\" % MIN_REQ_PYTHON)\n", "\n", "\n", "from IPython.display import display\n", "import pandas as pd\n", "\n", "import msticpy\n", "from msticpy.context.geoip import GeoLiteLookup, IPStackLookup" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[Contents](#contents)\n", "## Maxmind GeoIP Lite Lookup Class\n", "Signature:\n", "```\n", "iplocation.lookup_ip(ip_address: str = None, \n", " ip_addr_list: collections.abc.Iterable = None,\n", " ip_entity: msticpy.nbtools.entityschema.IpAddress = None)\n", "Docstring:\n", "Lookup IP location from GeoLite2 data created by MaxMind.\n", "\n", "Keyword Arguments:\n", " ip_address {str} -- a single address to look up (default: {None})\n", " ip_addr_list {Iterable} -- a collection of addresses to lookup (default: {None})\n", " ip_entity {IpAddress} -- an IpAddress entity\n", "\n", "Returns:\n", " tuple(list{dict}, 
list{entity}) -- returns raw geolocation results and\n", " same results as IP/Geolocation entities\n", "```" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2020-02-08T02:38:51.294362Z", "start_time": "2020-02-08T02:38:51.270375Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "No local Maxmind City Database found. Attempting to downloading new database to .\n", "Downloading and extracting GeoLite DB archive from MaxMind....\n", "Extraction complete. Local Maxmind city DB: GeoLite2-City.mmdb.13983.tar.gz\n", "Raw result\n" ] }, { "data": { "text/plain": [ "[{'continent': {'code': 'EU',\n", " 'geoname_id': 6255148,\n", " 'names': {'de': 'Europa',\n", " 'en': 'Europe',\n", " 'es': 'Europa',\n", " 'fr': 'Europe',\n", " 'ja': 'ヨーロッパ',\n", " 'pt-BR': 'Europa',\n", " 'ru': 'Европа',\n", " 'zh-CN': '欧洲'}},\n", " 'country': {'geoname_id': 2017370,\n", " 'iso_code': 'RU',\n", " 'names': {'de': 'Russland',\n", " 'en': 'Russia',\n", " 'es': 'Rusia',\n", " 'fr': 'Russie',\n", " 'ja': 'ロシア',\n", " 'pt-BR': 'Rússia',\n", " 'ru': 'Россия',\n", " 'zh-CN': '俄罗斯联邦'}},\n", " 'location': {'accuracy_radius': 1000,\n", " 'latitude': 55.7386,\n", " 'longitude': 37.6068,\n", " 'time_zone': 'Europe/Moscow'},\n", " 'registered_country': {'geoname_id': 2017370,\n", " 'iso_code': 'RU',\n", " 'names': {'de': 'Russland',\n", " 'en': 'Russia',\n", " 'es': 'Rusia',\n", " 'fr': 'Russie',\n", " 'ja': 'ロシア',\n", " 'pt-BR': 'Rússia',\n", " 'ru': 'Россия',\n", " 'zh-CN': '俄罗斯联邦'}},\n", " 'traits': {'ip_address': '90.156.201.97', 'prefix_len': 21}}]" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "IP Address Entity\n" ] }, { "data": { "text/html": [ "

ipaddress

{ 'Address': '90.156.201.97',
  'Location': { 'CountryCode': 'RU',
                'CountryName': 'Russia',
                'Latitude': 55.7386,
                'Longitude': 37.6068,
                'TimeGenerated': datetime.datetime(2022, 5, 27, 22, 53, 0, 737221),
                'Type': 'geolocation'},
  'TimeGenerated': datetime.datetime(2022, 5, 27, 22, 53, 0, 737221),
  'Type': 'ipaddress'}" ], "text/plain": [ "IpAddress(Address=90.156.201.97, Location={ 'CountryCode': 'RU',\n", " 'CountryName': 'Russia'...)" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "iplocation = GeoLiteLookup()\n", "loc_result, ip_entity = iplocation.lookup_ip(ip_address='90.156.201.97')\n", "\n", "print('Raw result')\n", "display(loc_result)\n", "\n", "print('IP Address Entity')\n", "display(ip_entity[0])" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "ExecuteTime": { "end_time": "2020-02-08T02:38:51.311353Z", "start_time": "2020-02-08T02:38:51.296360Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Raw result\n" ] }, { "data": { "text/plain": [ "[{'continent': {'code': 'EU',\n", " 'geoname_id': 6255148,\n", " 'names': {'de': 'Europa',\n", " 'en': 'Europe',\n", " 'es': 'Europa',\n", " 'fr': 'Europe',\n", " 'ja': 'ヨーロッパ',\n", " 'pt-BR': 'Europa',\n", " 'ru': 'Европа',\n", " 'zh-CN': '欧洲'}},\n", " 'country': {'geoname_id': 2017370,\n", " 'iso_code': 'RU',\n", " 'names': {'de': 'Russland',\n", " 'en': 'Russia',\n", " 'es': 'Rusia',\n", " 'fr': 'Russie',\n", " 'ja': 'ロシア',\n", " 'pt-BR': 'Rússia',\n", " 'ru': 'Россия',\n", " 'zh-CN': '俄罗斯联邦'}},\n", " 'location': {'accuracy_radius': 1000,\n", " 'latitude': 55.7386,\n", " 'longitude': 37.6068,\n", " 'time_zone': 'Europe/Moscow'},\n", " 'registered_country': {'geoname_id': 2017370,\n", " 'iso_code': 'RU',\n", " 'names': {'de': 'Russland',\n", " 'en': 'Russia',\n", " 'es': 'Rusia',\n", " 'fr': 'Russie',\n", " 'ja': 'ロシア',\n", " 'pt-BR': 'Rússia',\n", " 'ru': 'Россия',\n", " 'zh-CN': '俄罗斯联邦'}},\n", " 'traits': {'ip_address': '90.156.201.97', 'prefix_len': 21}}]" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "IP Address Entity\n" ] }, { "data": { "text/html": [ "

ipaddress

{ 'Address': '90.156.201.97',
  'Location': { 'CountryCode': 'RU',
                'CountryName': 'Russia',
                'Latitude': 55.7386,
                'Longitude': 37.6068,
                'TimeGenerated': datetime.datetime(2022, 5, 27, 22, 53, 0, 954578),
                'Type': 'geolocation'},
  'TimeGenerated': datetime.datetime(2022, 5, 27, 22, 53, 0, 954578),
  'Type': 'ipaddress'}" ], "text/plain": [ "IpAddress(Address=90.156.201.97, Location={ 'CountryCode': 'RU',\n", " 'CountryName': 'Russia'...)" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import tempfile\n", "from pathlib import Path\n", "tmp_folder = tempfile.gettempdir()\n", "iplocation = GeoLiteLookup(db_folder=str(Path(tmp_folder).joinpath('geolite')))\n", "loc_result, ip_entity = iplocation.lookup_ip(ip_address='90.156.201.97')\n", "\n", "print('Raw result')\n", "display(loc_result)\n", "\n", "print('IP Address Entity')\n", "display(ip_entity[0])" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "ExecuteTime": { "end_time": "2020-02-08T02:38:54.589392Z", "start_time": "2020-02-08T02:38:51.312351Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "force_update is set to True. Attempting to download new database to .\n", "Downloading and extracting GeoLite DB archive from MaxMind....\n", "Raw result\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "e:\\src\\msticpy\\msticpy\\context\\geoip.py:769: UserWarning: GeoIpLookup: Cannot overwrite GeoIP DB file: GeoLite2-City.mmdb.12635.tar.gz. The file may be in use or you do not have permission to overwrite.\n", " - [WinError 5] Access is denied: 'GeoLite2-City_20220527\\\\GeoLite2-City.mmdb' -> 'GeoLite2-City.mmdb'\n", " warnings.warn(\n", "e:\\src\\msticpy\\msticpy\\context\\geoip.py:769: UserWarning: GeoIpLookup: DB download failed\n", " warnings.warn(\n", "e:\\src\\msticpy\\msticpy\\context\\geoip.py:769: UserWarning: GeoIpLookup: Continuing with cached database. 
Results may inaccurate.\n", " warnings.warn(\n" ] }, { "data": { "text/plain": [ "[{'continent': {'code': 'EU',\n", " 'geoname_id': 6255148,\n", " 'names': {'de': 'Europa',\n", " 'en': 'Europe',\n", " 'es': 'Europa',\n", " 'fr': 'Europe',\n", " 'ja': 'ヨーロッパ',\n", " 'pt-BR': 'Europa',\n", " 'ru': 'Европа',\n", " 'zh-CN': '欧洲'}},\n", " 'country': {'geoname_id': 2017370,\n", " 'iso_code': 'RU',\n", " 'names': {'de': 'Russland',\n", " 'en': 'Russia',\n", " 'es': 'Rusia',\n", " 'fr': 'Russie',\n", " 'ja': 'ロシア',\n", " 'pt-BR': 'Rússia',\n", " 'ru': 'Россия',\n", " 'zh-CN': '俄罗斯联邦'}},\n", " 'location': {'accuracy_radius': 1000,\n", " 'latitude': 55.7386,\n", " 'longitude': 37.6068,\n", " 'time_zone': 'Europe/Moscow'},\n", " 'registered_country': {'geoname_id': 2017370,\n", " 'iso_code': 'RU',\n", " 'names': {'de': 'Russland',\n", " 'en': 'Russia',\n", " 'es': 'Rusia',\n", " 'fr': 'Russie',\n", " 'ja': 'ロシア',\n", " 'pt-BR': 'Rússia',\n", " 'ru': 'Россия',\n", " 'zh-CN': '俄罗斯联邦'}},\n", " 'traits': {'ip_address': '90.156.201.97', 'prefix_len': 21}}]" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "IP Address Entity\n" ] }, { "data": { "text/html": [ "

ipaddress

{ 'Address': '90.156.201.97',
  'Location': { 'CountryCode': 'RU',
                'CountryName': 'Russia',
                'Latitude': 55.7386,
                'Longitude': 37.6068,
                'TimeGenerated': datetime.datetime(2022, 5, 27, 22, 53, 3, 191866),
                'Type': 'geolocation'},
  'TimeGenerated': datetime.datetime(2022, 5, 27, 22, 53, 3, 191866),
  'Type': 'ipaddress'}" ], "text/plain": [ "IpAddress(Address=90.156.201.97, Location={ 'CountryCode': 'RU',\n", " 'CountryName': 'Russia'...)" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "iplocation = GeoLiteLookup(force_update=True)\n", "loc_result, ip_entity = iplocation.lookup_ip(ip_address='90.156.201.97')\n", "\n", "print('Raw result')\n", "display(loc_result)\n", "\n", "print('IP Address Entity')\n", "display(ip_entity[0])" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "ExecuteTime": { "end_time": "2020-02-08T02:38:54.601359Z", "start_time": "2020-02-08T02:38:54.590367Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Raw result\n" ] }, { "data": { "text/plain": [ "[{'continent': {'code': 'EU',\n", " 'geoname_id': 6255148,\n", " 'names': {'de': 'Europa',\n", " 'en': 'Europe',\n", " 'es': 'Europa',\n", " 'fr': 'Europe',\n", " 'ja': 'ヨーロッパ',\n", " 'pt-BR': 'Europa',\n", " 'ru': 'Европа',\n", " 'zh-CN': '欧洲'}},\n", " 'country': {'geoname_id': 2017370,\n", " 'iso_code': 'RU',\n", " 'names': {'de': 'Russland',\n", " 'en': 'Russia',\n", " 'es': 'Rusia',\n", " 'fr': 'Russie',\n", " 'ja': 'ロシア',\n", " 'pt-BR': 'Rússia',\n", " 'ru': 'Россия',\n", " 'zh-CN': '俄罗斯联邦'}},\n", " 'location': {'accuracy_radius': 1000,\n", " 'latitude': 55.7386,\n", " 'longitude': 37.6068,\n", " 'time_zone': 'Europe/Moscow'},\n", " 'registered_country': {'geoname_id': 2017370,\n", " 'iso_code': 'RU',\n", " 'names': {'de': 'Russland',\n", " 'en': 'Russia',\n", " 'es': 'Rusia',\n", " 'fr': 'Russie',\n", " 'ja': 'ロシア',\n", " 'pt-BR': 'Rússia',\n", " 'ru': 'Россия',\n", " 'zh-CN': '俄罗斯联邦'}},\n", " 'traits': {'ip_address': '90.156.201.97', 'prefix_len': 21}}]" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "IP Address Entity\n" ] }, { "data": { "text/html": [ "

ipaddress

{ 'Address': '90.156.201.97',
  'Location': { 'CountryCode': 'RU',
                'CountryName': 'Russia',
                'Latitude': 55.7386,
                'Longitude': 37.6068,
                'TimeGenerated': datetime.datetime(2022, 5, 27, 22, 53, 3, 465434),
                'Type': 'geolocation'},
  'TimeGenerated': datetime.datetime(2022, 5, 27, 22, 53, 3, 465434),
  'Type': 'ipaddress'}" ], "text/plain": [ "IpAddress(Address=90.156.201.97, Location={ 'CountryCode': 'RU',\n", " 'CountryName': 'Russia'...)" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "iplocation = GeoLiteLookup(auto_update=False)\n", "loc_result, ip_entity = iplocation.lookup_ip(ip_address='90.156.201.97')\n", "\n", "print('Raw result')\n", "display(loc_result)\n", "\n", "print('IP Address Entity')\n", "display(ip_entity[0])" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "ExecuteTime": { "end_time": "2020-02-08T02:38:54.612353Z", "start_time": "2020-02-08T02:38:54.603359Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['151.101.128.223', '151.101.64.223', '151.101.0.223', '151.101.192.223']\n" ] }, { "data": { "text/plain": [ "[IpAddress(Address=151.101.128.223, Location={ 'CountryCode': 'US',\n", " 'CountryName': 'Unite...),\n", " IpAddress(Address=151.101.64.223, Location={ 'CountryCode': 'US',\n", " 'CountryName': 'United...),\n", " IpAddress(Address=151.101.0.223, Location={ 'CountryCode': 'US',\n", " 'CountryName': 'United ...),\n", " IpAddress(Address=151.101.192.223, Location={ 'CountryCode': 'US',\n", " 'CountryName': 'Unite...)]" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import socket\n", "socket_info = socket.getaddrinfo(\"pypi.org\",0,0,0,0)\n", "\n", "ips = [res[4][0] for res in socket_info]\n", "print(ips)\n", "\n", "_, ip_entities = iplocation.lookup_ip(ip_addr_list=ips)\n", "display(ip_entities)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[Contents](#contents)\n", "## IPStack Geo-lookup Class\n", "\n", "#### Class Initialization\n", "\n", "Note: this class requires an IPStack API key. The optional parameter bulk_lookup allows multiple IPs in a single request. 
This is only available with the paid Professional tier and above.\n", "```\n", "Init signature: IPStackLookup(api_key: str, bulk_lookup: bool = False)\n", "Docstring: \n", "GeoIP Lookup using IPStack web service.\n", "\n", "Raises:\n", " ConnectionError -- Invalid status returned from http request\n", " PermissionError -- Service refused request (e.g. requesting batch of addresses\n", " on free tier API key)\n", "Init docstring:\n", "Create a new instance of IPStackLookup.\n", "\n", "Arguments:\n", " api_key {str} -- API Key from IPStack - see https://ipstack.com\n", " bulk_lookup {bool} -- For Professional and above tiers allowing you to\n", " submit multiple IPs in a single request.\n", " \n", "```\n", "\n", "#### lookup_ip method\n", "```\n", "Signature:\n", "iplocation.lookup_ip(\n", " ip_address: str = None,\n", " ip_addr_list: collections.abc.Iterable = None,\n", " ip_entity: msticpy.nbtools.entityschema.IpAddress = None,\n", ") -> tuple\n", "Docstring:\n", "Lookup IP location from IPStack web service.\n", "\n", "Keyword Arguments:\n", " ip_address {str} -- a single address to look up (default: {None})\n", " ip_addr_list {Iterable} -- a collection of addresses to lookup (default: {None})\n", " ip_entity {IpAddress} -- an IpAddress entity\n", "\n", "Raises:\n", " ConnectionError -- Invalid status returned from http request\n", " PermissionError -- Service refused request (e.g. requesting batch of addresses\n", " on free tier API key)\n", "\n", "Returns:\n", " tuple(list{dict}, list{entity}) -- returns raw geolocation results and\n", " same results as IP/Geolocation entities\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[Contents](#contents)\n", "### You will need an IPStack API key\n", "You will get more detailed results and a higher throughput allowance if you have a paid tier. 
See IPStack website for more details" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "ExecuteTime": { "end_time": "2020-02-08T02:38:54.643336Z", "start_time": "2020-02-08T02:38:54.613352Z" } }, "outputs": [ { "data": { "text/html": [ "\n", "This library uses services provided by ipstack.\n", "https://ipstack.com" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from msticpy.common.provider_settings import get_provider_settings\n", "iplocation = IPStackLookup()\n", "\n", "# Enter your IPStack Key here (if not set in msticpyconfig.yaml)\n", "ips_key = msticpy.nbwidgets.GetEnvironmentKey(env_var='IPSTACK_AUTH',\n", " help_str='To obtain an API key sign up here https://www.ipstack.com/',\n", " prompt='IPStack API key:'\n", ")\n", "\n", "ipstack_settings = get_provider_settings(config_section=\"OtherProviders\").get(\"IPStack\")\n", "if not ipstack_settings:\n", " ips_key.display()" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "ExecuteTime": { "end_time": "2020-02-08T02:38:54.761206Z", "start_time": "2020-02-08T02:38:54.646333Z" }, "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Raw result\n" ] }, { "data": { "text/plain": [ "[({'ip': '90.156.201.97',\n", " 'type': 'ipv4',\n", " 'continent_code': 'AS',\n", " 'continent_name': 'Asia',\n", " 'country_code': 'RU',\n", " 'country_name': 'Russia',\n", " 'region_code': 'MOW',\n", " 'region_name': 'Moscow',\n", " 'city': 'Moscow',\n", " 'zip': '115088',\n", " 'latitude': 55.712608337402344,\n", " 'longitude': 37.68056869506836,\n", " 'location': {'geoname_id': 524901,\n", " 'capital': 'Moscow',\n", " 'languages': [{'code': 'ru', 'name': 'Russian', 'native': 'Русский'}],\n", " 'country_flag': 'https://assets.ipstack.com/flags/ru.svg',\n", " 'country_flag_emoji': '🇷🇺',\n", " 'country_flag_emoji_unicode': 'U+1F1F7 U+1F1FA',\n", " 'calling_code': '7',\n", " 'is_eu': False}},\n", " 200)]" ] }, "metadata": {}, 
"output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "IP Address Entity\n" ] }, { "data": { "text/html": [ "

ipaddress

{ 'Address': '90.156.201.97',
  'Location': { 'City': 'Moscow',
                'CountryCode': 'RU',
                'CountryName': 'Russia',
                'Latitude': 55.712608337402344,
                'Longitude': 37.68056869506836,
                'State': 'Moscow',
                'TimeGenerated': datetime.datetime(2022, 5, 27, 22, 53, 4, 600662),
                'Type': 'geolocation'},
  'TimeGenerated': datetime.datetime(2022, 5, 27, 22, 53, 4, 600662),
  'Type': 'ipaddress'}" ], "text/plain": [ "IpAddress(Address=90.156.201.97, Location={ 'City': 'Moscow',\n", " 'CountryCode': 'RU',\n", " 'Co...)" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import os\n", "if not ipstack_settings and not ips_key.value:\n", " raise ValueError(\"No Authentication key in config/environment or supplied by user.\")\n", "if ips_key.value:\n", " iplocation = IPStackLookup(api_key=ips_key.value)\n", "\n", "if \"MSTICPY_SKIP_IPSTACK_TEST\" not in os.environ:\n", " loc_result, ip_entity = iplocation.lookup_ip(ip_address='90.156.201.97')\n", " print('Raw result')\n", " display(loc_result)\n", "\n", " if ip_entity:\n", " print('IP Address Entity')\n", " display(ip_entity[0])\n", " else:\n", " print(\"No result returned\")" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "ExecuteTime": { "end_time": "2020-02-08T02:38:55.196405Z", "start_time": "2020-02-08T02:38:54.762206Z" }, "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Raw results\n" ] }, { "data": { "text/plain": [ "[({'ip': '151.101.128.223',\n", " 'type': 'ipv4',\n", " 'continent_code': 'NA',\n", " 'continent_name': 'North America',\n", " 'country_code': 'US',\n", " 'country_name': 'United States',\n", " 'region_code': 'CA',\n", " 'region_name': 'California',\n", " 'city': 'San Francisco',\n", " 'zip': '94107',\n", " 'latitude': 37.76784896850586,\n", " 'longitude': -122.39286041259766,\n", " 'location': {'geoname_id': 5391959,\n", " 'capital': 'Washington D.C.',\n", " 'languages': [{'code': 'en', 'name': 'English', 'native': 'English'}],\n", " 'country_flag': 'https://assets.ipstack.com/flags/us.svg',\n", " 'country_flag_emoji': '🇺🇸',\n", " 'country_flag_emoji_unicode': 'U+1F1FA U+1F1F8',\n", " 'calling_code': '1',\n", " 'is_eu': False}},\n", " 200),\n", " ({'ip': '151.101.64.223',\n", " 'type': 'ipv4',\n", " 'continent_code': 'NA',\n", " 'continent_name': 'North America',\n", " 'country_code': 'US',\n", 
" 'country_name': 'United States',\n", " 'region_code': 'CA',\n", " 'region_name': 'California',\n", " 'city': 'San Francisco',\n", " 'zip': '94107',\n", " 'latitude': 37.76784896850586,\n", " 'longitude': -122.39286041259766,\n", " 'location': {'geoname_id': 5391959,\n", " 'capital': 'Washington D.C.',\n", " 'languages': [{'code': 'en', 'name': 'English', 'native': 'English'}],\n", " 'country_flag': 'https://assets.ipstack.com/flags/us.svg',\n", " 'country_flag_emoji': '🇺🇸',\n", " 'country_flag_emoji_unicode': 'U+1F1FA U+1F1F8',\n", " 'calling_code': '1',\n", " 'is_eu': False}},\n", " 200),\n", " ({'ip': '151.101.0.223',\n", " 'type': 'ipv4',\n", " 'continent_code': 'NA',\n", " 'continent_name': 'North America',\n", " 'country_code': 'US',\n", " 'country_name': 'United States',\n", " 'region_code': 'CA',\n", " 'region_name': 'California',\n", " 'city': 'San Francisco',\n", " 'zip': '94107',\n", " 'latitude': 37.76784896850586,\n", " 'longitude': -122.39286041259766,\n", " 'location': {'geoname_id': 5391959,\n", " 'capital': 'Washington D.C.',\n", " 'languages': [{'code': 'en', 'name': 'English', 'native': 'English'}],\n", " 'country_flag': 'https://assets.ipstack.com/flags/us.svg',\n", " 'country_flag_emoji': '🇺🇸',\n", " 'country_flag_emoji_unicode': 'U+1F1FA U+1F1F8',\n", " 'calling_code': '1',\n", " 'is_eu': False}},\n", " 200),\n", " ({'ip': '151.101.192.223',\n", " 'type': 'ipv4',\n", " 'continent_code': 'NA',\n", " 'continent_name': 'North America',\n", " 'country_code': 'US',\n", " 'country_name': 'United States',\n", " 'region_code': 'CA',\n", " 'region_name': 'California',\n", " 'city': 'San Francisco',\n", " 'zip': '94107',\n", " 'latitude': 37.76784896850586,\n", " 'longitude': -122.39286041259766,\n", " 'location': {'geoname_id': 5391959,\n", " 'capital': 'Washington D.C.',\n", " 'languages': [{'code': 'en', 'name': 'English', 'native': 'English'}],\n", " 'country_flag': 'https://assets.ipstack.com/flags/us.svg',\n", " 'country_flag_emoji': '🇺🇸',\n", " 
'country_flag_emoji_unicode': 'U+1F1FA U+1F1F8',\n", " 'calling_code': '1',\n", " 'is_eu': False}},\n", " 200)]" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "IP Address Entities\n" ] }, { "data": { "text/plain": [ "[IpAddress(Address=151.101.128.223, Location={ 'City': 'San Francisco',\n", " 'CountryCode': 'U...),\n", " IpAddress(Address=151.101.64.223, Location={ 'City': 'San Francisco',\n", " 'CountryCode': 'US...),\n", " IpAddress(Address=151.101.0.223, Location={ 'City': 'San Francisco',\n", " 'CountryCode': 'US'...),\n", " IpAddress(Address=151.101.192.223, Location={ 'City': 'San Francisco',\n", " 'CountryCode': 'U...)]" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "if \"MSTICPY_SKIP_IPSTACK_TEST\" not in os.environ:\n", " loc_result, ip_entities = iplocation.lookup_ip(ip_addr_list=ips)\n", " print('Raw results')\n", " display(loc_result)\n", "\n", " print('IP Address Entities')\n", " display(ip_entities)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[Contents](#contents)\n", "## Taking input from a pandas DataFrame\n", "\n", "The base class for both implementations has a method that sources the ip addresses from a dataframe column and returns a new dataframe with the location information merged with the input frame\n", "```\n", "Signature: iplocation.df_lookup_ip(data: pandas.core.frame.DataFrame, column: str)\n", "Docstring:\n", "Lookup Geolocation data from a pandas Dataframe.\n", "\n", "Keyword Arguments:\n", " data {pd.DataFrame} -- pandas dataframe containing IpAddress column\n", " column {str} -- the name of the dataframe column to use as a source\n", "```" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "ExecuteTime": { "end_time": "2020-02-08T02:38:55.262352Z", "start_time": "2020-02-08T02:38:55.197381Z" }, "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
AllExtIPsCountryCodeCountryNameStateCityLongitudeLatitudeTimeGeneratedTypeIpAddress
065.55.44.109USUnited StatesVirginiaBoydton-78.375036.65342022-05-27 22:53:05.650469geolocation65.55.44.109
113.71.172.128CACanadaOntarioToronto-79.362343.65472022-05-27 22:53:05.651468geolocation13.71.172.128
213.71.172.130CACanadaOntarioToronto-79.362343.65472022-05-27 22:53:05.651468geolocation13.71.172.130
340.124.45.19USUnited StatesTexasSan Antonio-98.492729.42272022-05-27 22:53:05.651468geolocation40.124.45.19
4104.43.212.12USUnited StatesIowaDes Moines-93.612441.60212022-05-27 22:53:05.652477geolocation104.43.212.12
.................................
8220.41.41.23USUnited StatesVirginiaBoydton-78.375036.65342022-05-27 22:53:05.686740geolocation20.41.41.23
8352.179.17.38USUnited StatesVirginiaTappahannock-76.854537.92732022-05-27 22:53:05.686740geolocation52.179.17.38
84157.55.134.142USUnited StatesVirginiaTappahannock-76.854537.92732022-05-27 22:53:05.687739geolocation157.55.134.142
85172.217.15.110USUnited StatesNaNNaN-97.822037.75102022-05-27 22:53:05.687739geolocation172.217.15.110
8640.91.75.5USUnited StatesWashingtonNaN-122.341447.60342022-05-27 22:53:05.688740geolocation40.91.75.5
\n", "

87 rows × 10 columns

\n", "
" ], "text/plain": [ " AllExtIPs CountryCode CountryName State City \\\n", "0 65.55.44.109 US United States Virginia Boydton \n", "1 13.71.172.128 CA Canada Ontario Toronto \n", "2 13.71.172.130 CA Canada Ontario Toronto \n", "3 40.124.45.19 US United States Texas San Antonio \n", "4 104.43.212.12 US United States Iowa Des Moines \n", ".. ... ... ... ... ... \n", "82 20.41.41.23 US United States Virginia Boydton \n", "83 52.179.17.38 US United States Virginia Tappahannock \n", "84 157.55.134.142 US United States Virginia Tappahannock \n", "85 172.217.15.110 US United States NaN NaN \n", "86 40.91.75.5 US United States Washington NaN \n", "\n", " Longitude Latitude TimeGenerated Type \\\n", "0 -78.3750 36.6534 2022-05-27 22:53:05.650469 geolocation \n", "1 -79.3623 43.6547 2022-05-27 22:53:05.651468 geolocation \n", "2 -79.3623 43.6547 2022-05-27 22:53:05.651468 geolocation \n", "3 -98.4927 29.4227 2022-05-27 22:53:05.651468 geolocation \n", "4 -93.6124 41.6021 2022-05-27 22:53:05.652477 geolocation \n", ".. ... ... ... ... \n", "82 -78.3750 36.6534 2022-05-27 22:53:05.686740 geolocation \n", "83 -76.8545 37.9273 2022-05-27 22:53:05.686740 geolocation \n", "84 -76.8545 37.9273 2022-05-27 22:53:05.687739 geolocation \n", "85 -97.8220 37.7510 2022-05-27 22:53:05.687739 geolocation \n", "86 -122.3414 47.6034 2022-05-27 22:53:05.688740 geolocation \n", "\n", " IpAddress \n", "0 65.55.44.109 \n", "1 13.71.172.128 \n", "2 13.71.172.130 \n", "3 40.124.45.19 \n", "4 104.43.212.12 \n", ".. ... 
\n", "82 20.41.41.23 \n", "83 52.179.17.38 \n", "84 157.55.134.142 \n", "85 172.217.15.110 \n", "86 40.91.75.5 \n", "\n", "[87 rows x 10 columns]" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "netflow_df = pd.read_csv(\"data/az_net_flows.csv\")\n", "netflow_df = netflow_df[[\"AllExtIPs\"]].drop_duplicates()\n", "iplocation = GeoLiteLookup()\n", "iplocation.df_lookup_ip(netflow_df, column=\"AllExtIPs\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[Contents](#contents)\n", "## Creating a Custom GeoIP Lookup Class\n", "\n", "You can derive a class that implements the same operations to use with a different GeoIP service.\n", "\n", "The class signature is as follows:\n", "```\n", "class GeoIpLookup(ABC):\n", "    \"\"\"Abstract base class for GeoIP Lookup classes.\"\"\"\n", "\n", "    @abstractmethod\n", "    def lookup_ip(self, ip_address: str = None, ip_addr_list: Iterable = None,\n", "                  ip_entity: IpAddress = None):\n", "        \"\"\"\n", "        Lookup IP location.\n", "\n", "        Keyword Arguments:\n", "            ip_address {str} -- a single address to look up (default: {None})\n", "            ip_addr_list {Iterable} -- a collection of addresses to lookup (default: {None})\n", "            ip_entity {IpAddress} -- an IpAddress entity\n", "\n", "        Returns:\n", "            tuple(list{dict}, list{entity}) -- returns raw geolocation results and\n", "                same results as IP/Geolocation entities\n", "\n", "        \"\"\"\n", "```\n", "You should override the lookup_ip method with your own implementation of GeoIP lookup." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[Contents](#contents)\n", "## Calculating Geographical Distances\n", "\n", "Use the geo_distance function from msticpy.sectools.geoip to calculate the distance between two locations.\n", "I am indebted to Martin Thoma, who posted this solution (which I've modified slightly) on Stack Overflow.\n", "\n", "\n", "```\n", "Signature: geo_distance(origin: Tuple[float, float], destination: Tuple[float, float]) -> float\n", "Docstring:\n", "Calculate the Haversine distance.\n", "\n", "Author: Martin Thoma - stackoverflow\n", "\n", "Parameters\n", "----------\n", "origin : tuple of float\n", " (lat, long)\n", "destination : tuple of float\n", " (lat, long)\n", "\n", "Returns\n", "-------\n", "distance_in_km : float\n", "```\n", "\n", "\n", "Alternatively, where you have source and destination IpAddress entities, you can use the wrapper entity_distance.\n", "```\n", "Signature:\n", "entity_distance(\n", " ip_src: msticpy.nbtools.entityschema.IpAddress,\n", " ip_dest: msticpy.nbtools.entityschema.IpAddress,\n", ") -> float\n", "Docstring:\n", "Return distance between two IP Entities.\n", "\n", "Arguments:\n", " ip_src {IpAddress} -- Source IpAddress Entity\n", " ip_dest {IpAddress} -- Destination IpAddress Entity\n", "\n", "Raises:\n", " AttributeError -- if either entity has no location information\n", "\n", "Returns:\n", " float -- Distance in kilometers.\n", "```" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "ExecuteTime": { "end_time": "2020-02-08T02:38:55.326316Z", "start_time": "2020-02-08T02:38:55.319320Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{ 'Address': '90.156.201.97',\n", " 'Location': { 'CountryCode': 'RU',\n", " 'CountryName': 'Russia',\n", " 'Latitude': 55.7386,\n", " 'Longitude': 37.6068,\n", " 'TimeGenerated': datetime.datetime(2022, 5, 27, 22, 53, 5, 986390),\n", " 'Type': 'geolocation'},\n", " 'TimeGenerated': datetime.datetime(2022, 5, 27, 22, 53, 5, 
986390),\n", " 'Type': 'ipaddress'}\n", "{ 'Address': '151.101.64.223',\n", " 'Location': { 'CountryCode': 'US',\n", " 'CountryName': 'United States',\n", " 'Latitude': 37.751,\n", " 'Longitude': -97.822,\n", " 'TimeGenerated': datetime.datetime(2022, 5, 27, 22, 53, 5, 986390),\n", " 'Type': 'geolocation'},\n", " 'TimeGenerated': datetime.datetime(2022, 5, 27, 22, 53, 5, 986390),\n", " 'Type': 'ipaddress'}\n", "\n", "Distance between IP Locations = 8796.8km\n" ] } ], "source": [ "from msticpy.context.geoip import geo_distance\n", "_, ip_entity1 = iplocation.lookup_ip(ip_address='90.156.201.97')\n", "_, ip_entity2 = iplocation.lookup_ip(ip_address='151.101.64.223')\n", "\n", "print(ip_entity1[0])\n", "print(ip_entity2[0])\n", "dist = geo_distance(origin=(ip_entity1[0].Location.Latitude, ip_entity1[0].Location.Longitude),\n", " destination=(ip_entity2[0].Location.Latitude, ip_entity2[0].Location.Longitude))\n", "print(f'\\nDistance between IP Locations = {round(dist, 1)}km')" ] } ], "metadata": { "celltoolbar": "Tags", "hide_input": false, "kernelspec": { "display_name": "Python 3.9.7 ('msticpy')", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.7" }, "latex_envs": { "LaTeX_envs_menu_present": true, "autoclose": false, "autocomplete": true, "bibliofile": "biblio.bib", "cite_by": "apalike", "current_citInitial": 1, "eqLabelWithNumbers": true, "eqNumInitial": 1, "hotkeys": { "equation": "Ctrl-E", "itemize": "Ctrl-I" }, "labels_anchors": false, "latex_user_defs": false, "report_style_numbering": false, "user_envs_cfg": false }, "toc": { "base_numbering": 1, "nav_menu": { "height": "318.996px", "width": "320.994px" }, "number_sections": false, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", 
"toc_cell": false, "toc_position": { "height": "calc(100% - 180px)", "left": "10px", "top": "150px", "width": "165px" }, "toc_section_display": true, "toc_window_display": true }, "varInspector": { "cols": { "lenName": 16, "lenType": 16, "lenVar": 40 }, "kernels_config": { "python": { "delete_cmd_postfix": "", "delete_cmd_prefix": "del ", "library": "var_list.py", "varRefreshCmd": "print(var_dic_list())" }, "r": { "delete_cmd_postfix": ") ", "delete_cmd_prefix": "rm(", "library": "var_list.r", "varRefreshCmd": "cat(var_dic_list()) " } }, "position": { "height": "406.193px", "left": "1468.4px", "right": "20px", "top": "120px", "width": "456.572px" }, "types_to_exclude": [ "module", "function", "builtin_function_or_method", "instance", "_Feature" ], "window_display": false }, "vscode": { "interpreter": { "hash": "0f1a8e166ce5c1ec1911a36e4fdbd34b2f623e2a3442791008b8ac429a1d6070" } }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 4 }