{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Päivitetty 2024-03-30 / Aki Taanila\n"
]
}
],
"source": [
"from datetime import datetime\n",
"print(f'Päivitetty {datetime.now().date()} / Aki Taanila')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Aikaleimat\n",
"\n",
"Useista palveluista, esimerkiksi Yahoo Finance, voin noutaa aikasarjoja, joiden aikaleimat tulevat automaattisesti dataframen indeksiin. Jos näin ei ole, niin joudun itse huolehtimaan aikaleimojen muuntamisesta ja siirtämisestä indeksiin."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Aikatiedon lukeminen merkkijonosta\n",
"\n",
"Merkkijonona (tekstinä) esitetyt aikatiedot pitää muuntaa Pythonin ymmärtämiksi\n",
"aikaleimoiksi. Tämän voin tehdä esimerkiksi seuraavasti:\n",
"\n",
"* Avaan datan ja katson, missä muodossa aikatiedot ovat?\n",
"* Muunnan aikatiedot pandas-kirjaston **to_datetime**-funktiolla ja sijoitan ne indeksiin.\n",
"\n",
"Muuntamisessa tarvitsen muotoilukoodeja, jotka löydän esimerkiksi osoitteesta:\n",
"https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import numpy as np"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Esimerkki 1"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Kuukausi | \n",
" CO2 | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 1999-12 | \n",
" 368.04 | \n",
"
\n",
" \n",
" 1 | \n",
" 2000-1 | \n",
" 369.25 | \n",
"
\n",
" \n",
" 2 | \n",
" 2000-2 | \n",
" 369.50 | \n",
"
\n",
" \n",
" 3 | \n",
" 2000-3 | \n",
" 370.56 | \n",
"
\n",
" \n",
" 4 | \n",
" 2000-4 | \n",
" 371.82 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Kuukausi CO2\n",
"0 1999-12 368.04\n",
"1 2000-1 369.25\n",
"2 2000-2 369.50\n",
"3 2000-3 370.56\n",
"4 2000-4 371.82"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Avaan aikasarjatietoa sisältävän datan ja katson aikatietojen esitysmuodon\n",
"df1 = pd.read_excel('http://taanila.fi/CO2.xlsx')\n",
"df1.head()"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" CO2 | \n",
"
\n",
" \n",
" Kuukausi | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" 1999-12-01 | \n",
" 368.04 | \n",
"
\n",
" \n",
" 2000-01-01 | \n",
" 369.25 | \n",
"
\n",
" \n",
" 2000-02-01 | \n",
" 369.50 | \n",
"
\n",
" \n",
" 2000-03-01 | \n",
" 370.56 | \n",
"
\n",
" \n",
" 2000-04-01 | \n",
" 371.82 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" CO2\n",
"Kuukausi \n",
"1999-12-01 368.04\n",
"2000-01-01 369.25\n",
"2000-02-01 369.50\n",
"2000-03-01 370.56\n",
"2000-04-01 371.82"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Sijoitan aikatiedon indeksiin\n",
"# to_datetime-funktio muuntaa merkkijonot aikaleimoiksi\n",
"# %Y tarkoittaa vuosilukua, vuoden ja kuukauden välissä väliviiva -, %m tarkoittaa kuukauden numeroa\n",
"df1.index = pd.to_datetime(df1['Kuukausi'], format='%Y-%m')\n",
"\n",
"# Poistan alkuperäisen 'Kuukausi'-sarakkeen\n",
"df1 = df1.drop('Kuukausi', axis=1)\n",
"df1.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Esimerkki 2"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" DATE | \n",
" IPG2211A2N | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 1/1/1985 | \n",
" 72.5052 | \n",
"
\n",
" \n",
" 1 | \n",
" 2/1/1985 | \n",
" 70.6720 | \n",
"
\n",
" \n",
" 2 | \n",
" 3/1/1985 | \n",
" 62.4502 | \n",
"
\n",
" \n",
" 3 | \n",
" 4/1/1985 | \n",
" 57.4714 | \n",
"
\n",
" \n",
" 4 | \n",
" 5/1/1985 | \n",
" 55.3151 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" DATE IPG2211A2N\n",
"0 1/1/1985 72.5052\n",
"1 2/1/1985 70.6720\n",
"2 3/1/1985 62.4502\n",
"3 4/1/1985 57.4714\n",
"4 5/1/1985 55.3151"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Avaan aikasarjatietoa sisältävän datan ja katson aikatietojen esitysmuodon\n",
"df2 = pd.read_csv('http://taanila.fi/Electric_Production.csv')\n",
"df2.head()"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" IPG2211A2N | \n",
"
\n",
" \n",
" DATE | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" 1985-01-01 | \n",
" 72.5052 | \n",
"
\n",
" \n",
" 1985-02-01 | \n",
" 70.6720 | \n",
"
\n",
" \n",
" 1985-03-01 | \n",
" 62.4502 | \n",
"
\n",
" \n",
" 1985-04-01 | \n",
" 57.4714 | \n",
"
\n",
" \n",
" 1985-05-01 | \n",
" 55.3151 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" IPG2211A2N\n",
"DATE \n",
"1985-01-01 72.5052\n",
"1985-02-01 70.6720\n",
"1985-03-01 62.4502\n",
"1985-04-01 57.4714\n",
"1985-05-01 55.3151"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Sijoitan aikatiedon indeksiin\n",
"# to_datetime-funktio muuntaa merkkijonot aikaleimoiksi \n",
"df2.index = pd.to_datetime(df2['DATE'], format='%m/%d/%Y')\n",
"df2 = df2.drop('DATE', axis=1)\n",
"df2.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Aikaleimojen luominen\n",
"\n",
"Voin luoda sarjan aikaleimoja pandas-kirjaston **date_range**-funktiolla. Funktiolle annan parametreina täsmälleen kolme seuraavista: start, end, periods, freq. Lue lisää: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.date_range.html.\n",
"\n",
"**freq**-parametrin mahdolliset arvot löydät seuraavasta: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Kysyntä | \n",
"
\n",
" \n",
" \n",
" \n",
" 2021-09-30 | \n",
" 500 | \n",
"
\n",
" \n",
" 2021-12-31 | \n",
" 350 | \n",
"
\n",
" \n",
" 2022-03-31 | \n",
" 250 | \n",
"
\n",
" \n",
" 2022-06-30 | \n",
" 400 | \n",
"
\n",
" \n",
" 2022-09-30 | \n",
" 450 | \n",
"
\n",
" \n",
" 2022-12-31 | \n",
" 350 | \n",
"
\n",
" \n",
" 2023-03-31 | \n",
" 200 | \n",
"
\n",
" \n",
" 2023-06-30 | \n",
" 300 | \n",
"
\n",
" \n",
" 2023-09-30 | \n",
" 350 | \n",
"
\n",
" \n",
" 2023-12-31 | \n",
" 200 | \n",
"
\n",
" \n",
" 2024-03-31 | \n",
" 150 | \n",
"
\n",
" \n",
" 2024-06-30 | \n",
" 400 | \n",
"
\n",
" \n",
" 2024-09-30 | \n",
" 550 | \n",
"
\n",
" \n",
" 2024-12-31 | \n",
" 350 | \n",
"
\n",
" \n",
" 2025-03-31 | \n",
" 250 | \n",
"
\n",
" \n",
" 2025-06-30 | \n",
" 550 | \n",
"
\n",
" \n",
" 2025-09-30 | \n",
" 550 | \n",
"
\n",
" \n",
" 2025-12-31 | \n",
" 400 | \n",
"
\n",
" \n",
" 2026-03-31 | \n",
" 350 | \n",
"
\n",
" \n",
" 2026-06-30 | \n",
" 600 | \n",
"
\n",
" \n",
" 2026-09-30 | \n",
" 750 | \n",
"
\n",
" \n",
" 2026-12-31 | \n",
" 500 | \n",
"
\n",
" \n",
" 2027-03-31 | \n",
" 400 | \n",
"
\n",
" \n",
" 2027-06-30 | \n",
" 650 | \n",
"
\n",
" \n",
" 2027-09-30 | \n",
" 850 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Kysyntä\n",
"2021-09-30 500\n",
"2021-12-31 350\n",
"2022-03-31 250\n",
"2022-06-30 400\n",
"2022-09-30 450\n",
"2022-12-31 350\n",
"2023-03-31 200\n",
"2023-06-30 300\n",
"2023-09-30 350\n",
"2023-12-31 200\n",
"2024-03-31 150\n",
"2024-06-30 400\n",
"2024-09-30 550\n",
"2024-12-31 350\n",
"2025-03-31 250\n",
"2025-06-30 550\n",
"2025-09-30 550\n",
"2025-12-31 400\n",
"2026-03-31 350\n",
"2026-06-30 600\n",
"2026-09-30 750\n",
"2026-12-31 500\n",
"2027-03-31 400\n",
"2027-06-30 650\n",
"2027-09-30 850"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Aikasarjan luvut listana\n",
"data = [500, 350, 250, 400, 450, 350, 200, 300, 350, 200, 150, 400, 550,\n",
" 350, 250, 550, 550, 400, 350, 600, 750, 500, 400, 650, 850]\n",
"\n",
"# Aikaleimojen luominen vuosineljänneksittäin (QE)\n",
"index = pd.date_range(start='2021-9-30', periods=len(data), freq='QE')\n",
"\n",
"# Dataframen luominen\n",
"df3 = pd.DataFrame(data=data, index=index, columns=['Kysyntä'])\n",
"\n",
"df3"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.2"
}
},
"nbformat": 4,
"nbformat_minor": 4
}