{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"provenance": [],
"authorship_tag": "ABX9TyOQnUSvTA1MjKkT7NVxddyL",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "markdown",
"source": [
"**Hypothesis 6**: \"Female labor force participation (LNS12000002) grows faster than male labor force participation (LNS12300001) during economic expansions.\"\n",
"\n",
"**Rationale:** Economic expansions may create more diverse job opportunities, encouraging more women to join the labor force."
],
"metadata": {
"id": "wOtXHA1sWxYY"
}
},
{
"cell_type": "markdown",
"source": [
"**FETCHING DATA:**"
],
"metadata": {
"id": "Vx5aTMp5XPz6"
}
},
{
"cell_type": "code",
"source": [
"import pandas_datareader.data as web\n",
"from datetime import datetime\n",
"\n",
"# Set the date range\n",
"start = datetime(1948, 1, 1)\n",
"end = datetime(2024, 1, 1)\n",
"\n",
"# Fetch data for female and male labor force participation\n",
"female_data = web.DataReader('LNS12300002', 'fred', start, end)\n",
"male_data = web.DataReader('LNS12300001', 'fred', start, end)\n",
"\n",
"# Display the first few rows to verify\n",
"print(female_data.head())\n",
"print(male_data.head())"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "-_c533uDXLtS",
"outputId": "0aa388b3-ec70-44c8-9863-d06bcb798b20"
},
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
" LNS12300002\n",
"DATE \n",
"1948-01-01 30.9\n",
"1948-02-01 31.0\n",
"1948-03-01 30.7\n",
"1948-04-01 31.6\n",
"1948-05-01 30.9\n",
" LNS12300001\n",
"DATE \n",
"1948-01-01 83.8\n",
"1948-02-01 83.9\n",
"1948-03-01 83.0\n",
"1948-04-01 83.3\n",
"1948-05-01 83.1\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"**DEFINING EXPANSION PERIODS:**"
],
"metadata": {
"id": "qw6Iu0GKXSIS"
}
},
{
"cell_type": "code",
"source": [
"import pandas as pd\n",
"\n",
"# Define recession periods as tuples of (start, end)\n",
"recession_periods = [\n",
" ('1948-11-01', '1949-10-31'),\n",
" ('1953-07-01', '1954-05-31'),\n",
" ('1957-08-01', '1958-04-30'),\n",
" ('1960-04-01', '1961-02-28'),\n",
" ('1969-12-01', '1970-11-30'),\n",
" ('1973-11-01', '1975-03-31'),\n",
" ('1980-01-01', '1980-07-31'),\n",
" ('1981-07-01', '1982-11-30'),\n",
" ('1990-07-01', '1991-03-31'),\n",
" ('2001-03-01', '2001-11-30'),\n",
" ('2007-12-01', '2009-06-30'),\n",
" ('2020-02-01', '2020-04-30')\n",
"]\n",
"\n",
"# Convert the recession periods to datetime format\n",
"recession_periods = [(pd.to_datetime(start), pd.to_datetime(end)) for start, end in recession_periods]\n",
"\n",
"# Ensure 'date' is in datetime format and remove any timezone info\n",
"female_data.index = pd.to_datetime(female_data.index).tz_localize(None)\n",
"male_data.index = pd.to_datetime(male_data.index).tz_localize(None)\n",
"\n",
"# Function to flag expansion periods\n",
"def is_expansion(date, recession_periods):\n",
" for start, end in recession_periods:\n",
" if start <= date <= end:\n",
" return 0 # 0 for recession, 1 for expansion\n",
" return 1\n",
"\n",
"# Apply the expansion flagging function\n",
"female_data['is_expansion'] = female_data.index.to_series().apply(lambda x: is_expansion(x, recession_periods))\n",
"male_data['is_expansion'] = male_data.index.to_series().apply(lambda x: is_expansion(x, recession_periods))\n",
"\n",
"# Display the updated data with expansion flag\n",
"print(female_data.head())\n",
"print(male_data.head())"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "OIjc44hUXLz9",
"outputId": "91462e9a-f03f-4e60-9856-49ecad3c35b8"
},
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
" LNS12300002 is_expansion\n",
"DATE \n",
"1948-01-01 30.9 1\n",
"1948-02-01 31.0 1\n",
"1948-03-01 30.7 1\n",
"1948-04-01 31.6 1\n",
"1948-05-01 30.9 1\n",
" LNS12300001 is_expansion\n",
"DATE \n",
"1948-01-01 83.8 1\n",
"1948-02-01 83.9 1\n",
"1948-03-01 83.0 1\n",
"1948-04-01 83.3 1\n",
"1948-05-01 83.1 1\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"**CALCULATING GROWTH RATES:**"
],
"metadata": {
"id": "qgr9TSliYpoU"
}
},
{
"cell_type": "code",
"source": [
"import numpy as np\n",
"\n",
"# Function to calculate growth rate\n",
"def calculate_cagr(start_value, end_value, periods):\n",
" return (end_value / start_value) ** (1 / periods) - 1\n",
"\n",
"# Initialize lists to store growth rates\n",
"female_growth_rates = []\n",
"male_growth_rates = []\n",
"\n",
"# Calculate CAGR for expansion and non-expansion periods\n",
"for period in female_data['is_expansion'].unique():\n",
" female_subset = female_data[female_data['is_expansion'] == period]['LNS12300002']\n",
" male_subset = male_data[male_data['is_expansion'] == period]['LNS12300001']\n",
"\n",
" # Ensure there are at least 2 data points to calculate CAGR\n",
" if len(female_subset) > 1 and len(male_subset) > 1:\n",
" female_cagr = calculate_cagr(female_subset.iloc[0], female_subset.iloc[-1], len(female_subset))\n",
" male_cagr = calculate_cagr(male_subset.iloc[0], male_subset.iloc[-1], len(male_subset))\n",
" else:\n",
" female_cagr = np.nan\n",
" male_cagr = np.nan\n",
"\n",
" female_growth_rates.append(female_cagr)\n",
" male_growth_rates.append(male_cagr)\n",
"\n",
"# Convert to a DataFrame for easier analysis\n",
"growth_rates_df = pd.DataFrame({\n",
" 'Period': ['Expansion', 'Recession'],\n",
" 'Female Growth Rate': female_growth_rates,\n",
" 'Male Growth Rate': male_growth_rates\n",
"})\n",
"\n",
"# Display the growth rates\n",
"print(growth_rates_df)"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "CB-oQUaNXL2t",
"outputId": "683d726b-7a90-4b69-b9f4-029d4b0f7288"
},
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
" Period Female Growth Rate Male Growth Rate\n",
"0 Expansion 0.000754 -0.000323\n",
"1 Recession 0.002803 -0.002773\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"It looks like the female labor force participation rate grew during both expansion and recession periods, while the male labor force participation rate slightly declined in both periods."
],
"metadata": {
"id": "rmvy72Q9Y4TD"
}
},
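{
"cell_type": "markdown",
"source": [
"The rates above compare only the first and last observation of each group of months, and those groups are not contiguous, so they may largely reflect the long-run trend in each series. As a complementary view, the cell below is a minimal sketch (not part of the original analysis) that averages month-over-month percentage changes within each phase. It assumes `female_data` and `male_data` from the earlier cells are still in memory; `monthly_growth` is a name introduced here only for illustration."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"import pandas as pd\n",
"\n",
"# Month-over-month percentage change for each series, tagged with the phase flag\n",
"monthly_growth = pd.DataFrame({\n",
"    'Female': female_data['LNS12300002'].pct_change(),\n",
"    'Male': male_data['LNS12300001'].pct_change(),\n",
"    'is_expansion': female_data['is_expansion']\n",
"})\n",
"\n",
"# Mean monthly growth by phase (0 = recession, 1 = expansion); the NaN in the first row is skipped\n",
"print(monthly_growth.groupby('is_expansion')[['Female', 'Male']].mean())"
],
"metadata": {},
"execution_count": null,
"outputs": []
},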
{
"cell_type": "markdown",
"source": [
"**Statistical T-Test to compare growth rates between males and females during epansion periods:**"
],
"metadata": {
"id": "dZ29zk4tY5xg"
}
},
{
"cell_type": "code",
"source": [
"# Step 1: Calculate growth rates for both female and male labor force participation\n",
"female_data['Female Growth Rate'] = female_data['LNS12300002'].pct_change()\n",
"male_data['Male Growth Rate'] = male_data['LNS12300001'].pct_change()\n",
"\n",
"# Step 2: Combine the growth rates into a single DataFrame\n",
"growth_data = pd.DataFrame({\n",
" 'Date': female_data.index,\n",
" 'Female Growth Rate': female_data['Female Growth Rate'],\n",
" 'Male Growth Rate': male_data['Male Growth Rate']\n",
"})\n",
"\n",
"# Step 3: Filter for expansion periods (using the previously defined periods or custom ones)\n",
"\n",
"# Assuming 'recession_periods' is already defined:\n",
"def is_expansion(date, recession_periods):\n",
" for start, end in recession_periods:\n",
" if start <= date <= end:\n",
" return False\n",
" return True\n",
"\n",
"# Flagging expansion periods\n",
"growth_data['is_expansion'] = growth_data['Date'].apply(lambda x: is_expansion(x, recession_periods))\n",
"\n",
"# Step 4: Filter the data for expansion periods\n",
"expansion_data = growth_data[growth_data['is_expansion'] == True]\n",
"\n",
"# Step 5: Perform the t-test between the female and male growth rates during expansion periods\n",
"from scipy import stats\n",
"\n",
"# Drop NaN values if any\n",
"expansion_female_growth = expansion_data['Female Growth Rate'].dropna()\n",
"expansion_male_growth = expansion_data['Male Growth Rate'].dropna()\n",
"\n",
"# Ensure enough data points exist\n",
"if len(expansion_female_growth) > 1 and len(expansion_male_growth) > 1:\n",
" t_stat, p_value = stats.ttest_ind(expansion_female_growth, expansion_male_growth)\n",
" print(f\"T-statistic: {t_stat}\")\n",
" print(f\"P-value: {p_value}\")\n",
"else:\n",
" print(\"Not enough data points for a t-test.\")"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "b7YmN9-1a7v1",
"outputId": "ac1b43f2-b271-41d5-fe34-b5b6cdc162ad"
},
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"T-statistic: 3.0578846240731274\n",
"P-value: 0.0022672027857231834\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"The t-test results show a T-statistic of approximately 3.06 and a P-value of 0.0023, indicating that there is a statistically significant difference between the growth rates of female and male labor force participation during expansion periods. The low P-value suggests that the null hypothesis (that there is no difference between the two growth rates) can be rejected.\n",
"\n",
"This insight supports the hypothesis that female labor force participation grows faster than male labor force participation during economic expansions."
],
"metadata": {
"id": "0xJU_3dlcNh4"
}
},
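{
"cell_type": "markdown",
"source": [
"The test above treats the two samples as independent with equal variances. Since both series are observed in the same months, a possible robustness check is to repeat the comparison with Welch's unequal-variance test and with a paired t-test. The cell below is a minimal sketch under that assumption, not part of the original analysis; it relies on `expansion_data` from the t-test cell still being in memory, and the names `paired`, `welch_t`, `welch_p`, `paired_t` and `paired_p` are introduced here only for illustration."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"from scipy import stats\n",
"\n",
"# Keep only months where both growth rates are available, so the two samples stay aligned\n",
"paired = expansion_data[['Female Growth Rate', 'Male Growth Rate']].dropna()\n",
"\n",
"# Welch's t-test: does not assume equal variances in the two groups\n",
"welch_t, welch_p = stats.ttest_ind(paired['Female Growth Rate'], paired['Male Growth Rate'], equal_var=False)\n",
"\n",
"# Paired t-test: compares the female and male growth rates month by month\n",
"paired_t, paired_p = stats.ttest_rel(paired['Female Growth Rate'], paired['Male Growth Rate'])\n",
"\n",
"print(f\"Welch t-test:  t = {welch_t:.3f}, p = {welch_p:.4f}\")\n",
"print(f\"Paired t-test: t = {paired_t:.3f}, p = {paired_p:.4f}\")"
],
"metadata": {},
"execution_count": null,
"outputs": []
},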
{
"cell_type": "markdown",
"source": [
"**BAR PLOT:**"
],
"metadata": {
"id": "YWQaTxrHcTtn"
}
},
{
"cell_type": "code",
"source": [
"import matplotlib.pyplot as plt\n",
"\n",
"# Calculate the mean growth rates\n",
"mean_female_growth = expansion_female_growth.mean()\n",
"mean_male_growth = expansion_male_growth.mean()\n",
"\n",
"# Create the bar plot\n",
"plt.figure(figsize=(8, 6))\n",
"plt.bar(['Female Growth Rate', 'Male Growth Rate'], [mean_female_growth, mean_male_growth], color=['blue', 'orange'])\n",
"plt.title('Average Growth Rates During Economic Expansions')\n",
"plt.ylabel('Average Growth Rate')\n",
"plt.show()"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 545
},
"id": "hKRm8aEKcVep",
"outputId": "85354754-58b5-4063-c869-c2b74b312b04"
},
"execution_count": null,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"