{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Introduction to Reference Data in the Vortexa Python SDK\n", "\n", "When working with Vortexa’s Python SDK, one of the foundational components you'll encounter is reference data. But what exactly is reference data, and why is it crucial for your analytics and data manipulation tasks?\n", "\n", "### What is Reference Data?\n", "\n", "Reference data, also known as master data, encompasses the core datasets that define and categorize the entities within Vortexa’s ecosystem. These datasets are used to standardize and provide context to the event data you’ll be working with. In essence, reference data acts as the backbone for ensuring consistency, accuracy, and reliability in your data operations.\n", "\n", "### Three Main Types of Reference Data in Vortexa\n", "\n", "**1. Geographies**: Detailed information about various locations, such as ports, terminals, and regions. This allows you to accurately track and analyze movements and activities related to these locations.\n", "\n", "**2. Vessels**: Comprehensive data about vessels, including their names, types, sizes, and other attributes. This is essential for tracking vessel movements, understanding fleet compositions, and analyzing shipping trends.\n", "\n", "**3. Products**: Information about different types of products, their classifications, and hierarchies. This helps in analyzing trade flows, market dynamics, and commodity-specific trends." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Import Libraries" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import vortexasdk as v\n", "import pandas as pd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To start with, let's explore what you can extract from the Geographies endpoint." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2024-07-23 13:31:07,189 vortexasdk.client — WARNING — You are using vortexasdk version 0.72.5, however version 0.72.6 is available.\n", "You should consider upgrading via the 'pip install vortexasdk --upgrade' command.\n" ] } ], "source": [ "# To load all geographies\n", "all_geographies = v.Geographies().load_all().to_df()\n", "\n", "# To load all geographies with a specific type (e.g. country)\n", "country_list = v.Geographies().search(filter_layer=['country']).to_df()\n", "\n", "# To load all geographies with a name containing 'China'\n", "china_list = v.Geographies().search(term='China').to_df()" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " id \\\n", "0 0eb7b43e3d4e62db74e187bc4eadadfb878a210201e14a... \n", "1 b27430b57d617a855d44f91fb70441bae69c19a3c0deb4... \n", "2 d38a8f7bf8ed422b439ad5270be65b60b964bed9568936... \n", "3 3ebcf6f2e43e3a8b06e9b0ae31df3d87f3fbc8d032b72d... \n", "4 0c8a30f40639257e5352e2c6ac52af2f93b2c6bfed7187... \n", "\n", " name layer \n", "0 21st Century Shipbuilding [terminal] \n", "1 2x1 Holding Cape Midia Shipyard [terminal] \n", "2 A Pobra Do Caraminal [ES] [port] \n", "3 A and P Group Falmouth Shipyard [terminal] \n", "4 A and P Group Tees Shipyard [terminal] " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "all_geographies.head(5)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " id name \\\n", "0 1cd3c07221f9e9b3296c859d0bcd3da17ac6072bfdcc84... Afghanistan \n", "1 5e4e7b5040b933b5a0f0d2357ed27ebec432c749b9d63a... Albania \n", "2 87269b28eaea324d2c35e97b0ecc837ebc9a244faf94e2... Algeria \n", "3 f4435d4fffa5b2ba7a340e3a8e7d421f619d9f7832fa0c... American Samoa \n", "4 db3cb74043b8fd438087a3e0e04e3e498b78c3c9790fce... Andorra \n", "\n", " layer \n", "0 [country] \n", "1 [country] \n", "2 [country] \n", "3 [country] \n", "4 [country] " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "country_list.head(5)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " id name \\\n", "0 934c47f36c16a58d68ef5e007e62a23f5f036ee3f3d1f5... China \n", "1 781cacc7033f877caa4b4106d096b74afe006a96391bf5... South China \n", "2 a63890260e29d859390fd1a23c690181afd4bd152943a0... North China \n", "3 d1d5a3d3666d6ebb65a1b53e626b07f6b8540e8048a524... Shipyard Nansha China \n", "4 ae0f224030f7337d0ffe5a54d290a9f0bd029f636eaf12... China Yangfan Group \n", "\n", " layer \n", "0 [country] \n", "1 [alternative_region] \n", "2 [alternative_region] \n", "3 [terminal] \n", "4 [terminal] " ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "china_list.head(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This shows how to extract the IDs, which other endpoints may require as inputs. To retrieve more information about each location, such as centroids and hierarchies, call `.to_df(columns='all')`." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " id name \\\n", "0 934c47f36c16a58d68ef5e007e62a23f5f036ee3f3d1f5... China \n", "1 781cacc7033f877caa4b4106d096b74afe006a96391bf5... South China \n", "2 a63890260e29d859390fd1a23c690181afd4bd152943a0... North China \n", "3 d1d5a3d3666d6ebb65a1b53e626b07f6b8540e8048a524... Shipyard Nansha China \n", "4 ae0f224030f7337d0ffe5a54d290a9f0bd029f636eaf12... China Yangfan Group \n", "\n", " layer leaf \\\n", "0 [country] False \n", "1 [alternative_region] False \n", "2 [alternative_region] False \n", "3 [terminal] True \n", "4 [terminal] True \n", "\n", " parent \\\n", "0 [{'name': 'Far East', 'layer': ['alternative_r... \n", "1 [{'name': 'China (excl. HK & Macau)', 'layer':... \n", "2 [{'name': 'China (excl. HK & Macau)', 'layer':... \n", "3 [{'name': 'Nansha [CN]', 'layer': ['port', 'st... \n", "4 [{'name': 'Zhoushan [CN]', 'layer': ['port', '... \n", "\n", " exclusion_rule ref_type \\\n", "0 [{'name': 'China', 'layer': ['country'], 'id':... geography \n", "1 [{'name': 'South China', 'layer': ['alternativ... geography \n", "2 [{'name': 'North China', 'layer': ['alternativ... geography \n", "3 [{'name': 'Shipyard Nansha China', 'layer': ['... geography \n", "4 [{'name': 'China Yangfan Group', 'layer': ['te... geography \n", "\n", " hierarchy \\\n", "0 [{'label': 'China', 'layer': 'country', 'id': ... \n", "1 [{'id': 'b5fafce6e20de2dc307fb7e0b89978ee91a49... \n", "2 [{'id': 'b5fafce6e20de2dc307fb7e0b89978ee91a49... \n", "3 [{'label': 'Shipyard Nansha China', 'layer': '... \n", "4 [{'label': 'China Yangfan Group', 'layer': 'te... \n", "\n", " pos aliases \\\n", "0 [105.4525116754, 35.4496039032] [] \n", "1 [106.3418341843, 28.1283930445] [] \n", "2 [104.7737174548, 40.9901656588] [] \n", "3 [113.5214914215, 22.7496639804] [] \n", "4 [122.2921859199, 29.9282176371] [] \n", "\n", " tags \n", "0 {'importProductTags': [], 'exportProductTags':... \n", "1 {'importProductTags': [], 'exportProductTags':... 
\n", "2 {'importProductTags': [], 'exportProductTags':... \n", "3 {'importProductTags': [], 'exportProductTags':... \n", "4 {'importProductTags': [], 'exportProductTags':... " ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "china_list_enhanced = v.Geographies().search(term='China').to_df(columns='all')\n", "china_list_enhanced.head(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We have now seen how to extract reference data from the Geographies endpoint. The same approach works for the Products and Vessels endpoints." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " id \\\n", "0 887940a6cf2d527a20d82a5f163ecce502878ceb1cd59f... \n", "1 8bb096fb847f92af86235002b2a78ca0437543722cdb8c... \n", "2 35f8222ff81fe5befafee9c64c1d76618e4cc53e74021a... \n", "3 881be476857ff08dcf6a8708a2fc279d26770cecf245f7... \n", "4 a6ef13d2f3145a1b67a81300c1cfa4f21874f24fab4f8f... \n", "\n", " name layer.0 parent.0.name \n", "0 0.005 / 50ppm grade Gasoil \n", "1 0.05 / 500ppm grade Gasoil \n", "2 0.1 / 1000ppm grade Gasoil \n", "3 0.1+ / 1000ppm Plus (HS) grade Gasoil \n", "4 180 CST grade High Sulphur Fuel Oil " ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# To load all products\n", "products = v.Products().load_all().to_df()\n", "products.head(5)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " id name \\\n", "0 b2034f1ad3a4ac269e962f00b9914d6b909923cf904d99... Gasoil \n", "1 deda35eb9ca56b54e74f0ff370423f9a8c61cf6a3796fc... Diesel/Gasoil \n", "2 e06296595e1d554008a70172440d5582c923bdb8182af5... Coker Gasoil \n", "3 feb8190865392ab6caecd7709077a58645ec4828c23d94... Marine Gasoil \n", "\n", " layer.0 parent.0.name \n", "0 category Diesel/Gasoil \n", "1 group_product Clean Petroleum Products \n", "2 grade Dirty Feedstocks \n", "3 grade Gasoil " ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# To load all products with a name containing 'Gasoil'\n", "gasoil = v.Products().search(term='Gasoil').to_df()\n", "gasoil.head(5)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " id name imo \\\n", "0 62f3f3c1f5a663d621fe6cf9537c7d936b547497932f5d... \\tATHINEA 9291248.0 \n", "1 e6b259c04da30a57db353665e7e61f67a0a3222b96c457... \\tBORA 9276004.0 \n", "2 f351708121bce4d357ac5fad967cb1bf7fe5072773f05a... 0051-04 \n", "3 1761da4fb069cd6ce153b6ad1c48e15cdb994eb386e4aa... 058 \n", "4 c817b5994efe14621949533d6777b22ce11db1c6bf9e48... 1011 \n", "\n", " vessel_class \n", "0 oil_lr2 \n", "1 oil_mr2 \n", "2 oil_coastal \n", "3 oil_coastal \n", "4 oil_coastal " ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# To load all vessels\n", "vessels = v.Vessels().load_all().to_df()\n", "vessels.head(5)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " id name \\\n", "0 00d89be99f08890c9122c326aa32c83ae6d557629bc124... VS REMLIN \n", "1 03c191caf28a554c8c1adfc11ff2a7a08ad5fa21a83892... SOUTH LOYALTY \n", "2 07e562b509f2617c18f9e848c51bdec8e97488c4a41bbc... HENRIETTE MAERSK \n", "3 085ab83b592713e314070620a8cb9f4d795a2b486075f4... VS \n", "4 09332aaa6067574eac586030b5b81cf5554337c4f48d17... BULL SULAWESI \n", "\n", " imo vessel_class \n", "0 9252307 oil_mr1 \n", "1 9537769 oil_vlcc \n", "2 9399349 oil_mr1 \n", "3 9252292 oil_mr1 \n", "4 9180920 oil_lr2 " ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# To load all vessels with a name containing 'Maersk' (current or previous names)\n", "maersk_vessels = v.Vessels().search(term='Maersk').to_df()\n", "maersk_vessels.head(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Conclusion\n", "\n", "In this tutorial, we covered the essentials of working with reference data using the Vortexa Python SDK. We began by discussing the importance of reference data and its role in supporting accurate and consistent data operations. We then worked through examples for geographies, products, and vessels.\n", "\n", "You’ve learned how to query and retrieve reference data based on criteria such as name or layer, and how to obtain the IDs that other endpoints may require. Mastering the use of reference data enables you to drive more accurate insights, improve data consistency, and enhance your understanding of the energy markets." ] }, { "cell_type": "markdown", "metadata": {}, "source": [] } ], "metadata": { "kernelspec": { "display_name": "base", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.9" } }, "nbformat": 4, "nbformat_minor": 2 }