{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Updated 2024-03-30 / Aki Taanila\n" ] } ], "source": [ "from datetime import datetime\n", "print(f'Updated {datetime.now().date()} / Aki Taanila')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Timestamps\n", "\n", "From many services, such as Yahoo Finance, I can fetch time series whose timestamps arrive automatically in the dataframe index. When that is not the case, I have to convert the timestamps and move them into the index myself." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Reading time information from a string\n", "\n", "Time information stored as strings (text) must be converted into timestamps\n", "that Python understands. I can do this, for example, as follows:\n", "\n", "* I open the data and check which format the time information is in.\n", "* I convert the time information with the pandas **to_datetime** function and place it in the index.\n", "\n", "For the conversion I need format codes, which I can find, for example, at:\n", "https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Example 1" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>Kuukausi</th><th>CO2</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1999-12</td><td>368.04</td></tr>\n",
"    <tr><th>1</th><td>2000-1</td><td>369.25</td></tr>\n",
"    <tr><th>2</th><td>2000-2</td><td>369.50</td></tr>\n",
"    <tr><th>3</th><td>2000-3</td><td>370.56</td></tr>\n",
"    <tr><th>4</th><td>2000-4</td><td>371.82</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " Kuukausi CO2\n", "0 1999-12 368.04\n", "1 2000-1 369.25\n", "2 2000-2 369.50\n", "3 2000-3 370.56\n", "4 2000-4 371.82" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Avaan aikasarjatietoa sisältävän datan ja katson aikatietojen esitysmuodon\n", "df1 = pd.read_excel('http://taanila.fi/CO2.xlsx')\n", "df1.head()" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>CO2</th></tr>\n",
"    <tr><th>Kuukausi</th><th></th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1999-12-01</th><td>368.04</td></tr>\n",
"    <tr><th>2000-01-01</th><td>369.25</td></tr>\n",
"    <tr><th>2000-02-01</th><td>369.50</td></tr>\n",
"    <tr><th>2000-03-01</th><td>370.56</td></tr>\n",
"    <tr><th>2000-04-01</th><td>371.82</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " CO2\n", "Kuukausi \n", "1999-12-01 368.04\n", "2000-01-01 369.25\n", "2000-02-01 369.50\n", "2000-03-01 370.56\n", "2000-04-01 371.82" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Sijoitan aikatiedon indeksiin\n", "# to_datetime-funktio muuntaa merkkijonot aikaleimoiksi\n", "# %Y tarkoittaa vuosilukua, vuoden ja kuukauden välissä väliviiva -, %m tarkoittaa kuukauden numeroa\n", "df1.index = pd.to_datetime(df1['Kuukausi'], format='%Y-%m')\n", "\n", "# Poistan alkuperäisen 'Kuukausi'-sarakkeen\n", "df1 = df1.drop('Kuukausi', axis=1)\n", "df1.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Esimerkki 2" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>DATE</th><th>IPG2211A2N</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1/1/1985</td><td>72.5052</td></tr>\n",
"    <tr><th>1</th><td>2/1/1985</td><td>70.6720</td></tr>\n",
"    <tr><th>2</th><td>3/1/1985</td><td>62.4502</td></tr>\n",
"    <tr><th>3</th><td>4/1/1985</td><td>57.4714</td></tr>\n",
"    <tr><th>4</th><td>5/1/1985</td><td>55.3151</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " DATE IPG2211A2N\n", "0 1/1/1985 72.5052\n", "1 2/1/1985 70.6720\n", "2 3/1/1985 62.4502\n", "3 4/1/1985 57.4714\n", "4 5/1/1985 55.3151" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Avaan aikasarjatietoa sisältävän datan ja katson aikatietojen esitysmuodon\n", "df2 = pd.read_csv('http://taanila.fi/Electric_Production.csv')\n", "df2.head()" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>IPG2211A2N</th></tr>\n",
"    <tr><th>DATE</th><th></th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1985-01-01</th><td>72.5052</td></tr>\n",
"    <tr><th>1985-02-01</th><td>70.6720</td></tr>\n",
"    <tr><th>1985-03-01</th><td>62.4502</td></tr>\n",
"    <tr><th>1985-04-01</th><td>57.4714</td></tr>\n",
"    <tr><th>1985-05-01</th><td>55.3151</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " IPG2211A2N\n", "DATE \n", "1985-01-01 72.5052\n", "1985-02-01 70.6720\n", "1985-03-01 62.4502\n", "1985-04-01 57.4714\n", "1985-05-01 55.3151" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Sijoitan aikatiedon indeksiin\n", "# to_datetime-funktio muuntaa merkkijonot aikaleimoiksi \n", "df2.index = pd.to_datetime(df2['DATE'], format='%m/%d/%Y')\n", "df2 = df2.drop('DATE', axis=1)\n", "df2.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Aikaleimojen luominen\n", "\n", "Voin luoda sarjan aikaleimoja pandas-kirjaston **date_range**-funktiolla. Funktiolle annan parametreina täsmälleen kolme seuraavista: start, end, periods, freq. Lue lisää: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.date_range.html.\n", "\n", "**freq**-parametrin mahdolliset arvot löydät seuraavasta: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>Kysyntä</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>2021-09-30</th><td>500</td></tr>\n",
"    <tr><th>2021-12-31</th><td>350</td></tr>\n",
"    <tr><th>2022-03-31</th><td>250</td></tr>\n",
"    <tr><th>2022-06-30</th><td>400</td></tr>\n",
"    <tr><th>2022-09-30</th><td>450</td></tr>\n",
"    <tr><th>2022-12-31</th><td>350</td></tr>\n",
"    <tr><th>2023-03-31</th><td>200</td></tr>\n",
"    <tr><th>2023-06-30</th><td>300</td></tr>\n",
"    <tr><th>2023-09-30</th><td>350</td></tr>\n",
"    <tr><th>2023-12-31</th><td>200</td></tr>\n",
"    <tr><th>2024-03-31</th><td>150</td></tr>\n",
"    <tr><th>2024-06-30</th><td>400</td></tr>\n",
"    <tr><th>2024-09-30</th><td>550</td></tr>\n",
"    <tr><th>2024-12-31</th><td>350</td></tr>\n",
"    <tr><th>2025-03-31</th><td>250</td></tr>\n",
"    <tr><th>2025-06-30</th><td>550</td></tr>\n",
"    <tr><th>2025-09-30</th><td>550</td></tr>\n",
"    <tr><th>2025-12-31</th><td>400</td></tr>\n",
"    <tr><th>2026-03-31</th><td>350</td></tr>\n",
"    <tr><th>2026-06-30</th><td>600</td></tr>\n",
"    <tr><th>2026-09-30</th><td>750</td></tr>\n",
"    <tr><th>2026-12-31</th><td>500</td></tr>\n",
"    <tr><th>2027-03-31</th><td>400</td></tr>\n",
"    <tr><th>2027-06-30</th><td>650</td></tr>\n",
"    <tr><th>2027-09-30</th><td>850</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " Kysyntä\n", "2021-09-30 500\n", "2021-12-31 350\n", "2022-03-31 250\n", "2022-06-30 400\n", "2022-09-30 450\n", "2022-12-31 350\n", "2023-03-31 200\n", "2023-06-30 300\n", "2023-09-30 350\n", "2023-12-31 200\n", "2024-03-31 150\n", "2024-06-30 400\n", "2024-09-30 550\n", "2024-12-31 350\n", "2025-03-31 250\n", "2025-06-30 550\n", "2025-09-30 550\n", "2025-12-31 400\n", "2026-03-31 350\n", "2026-06-30 600\n", "2026-09-30 750\n", "2026-12-31 500\n", "2027-03-31 400\n", "2027-06-30 650\n", "2027-09-30 850" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Aikasarjan luvut listana\n", "data = [500, 350, 250, 400, 450, 350, 200, 300, 350, 200, 150, 400, 550,\n", " 350, 250, 550, 550, 400, 350, 600, 750, 500, 400, 650, 850]\n", "\n", "# Aikaleimojen luominen vuosineljänneksittäin (QE)\n", "index = pd.date_range(start='2021-9-30', periods=len(data), freq='QE')\n", "\n", "# Dataframen luominen\n", "df3 = pd.DataFrame(data=data, index=index, columns=['Kysyntä'])\n", "\n", "df3" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.2" } }, "nbformat": 4, "nbformat_minor": 4 }