{"cells":[{"cell_type":"markdown","metadata":{},"source":["## Sentiment Analysis using LDA\n","\n","1. Data Collection: Collect the top 20 news summaries for each company in the Dow Jones Industrial Average from the Yahoo Finance RSS feed, using the yahoo_fin package.\n","\n","2. Initial Sentiment Analysis: Score each summary with TextBlob polarity and average the scores to get an initial sentiment score for each company.\n","\n","3. Topic Modeling: Use Latent Dirichlet Allocation (LDA) to identify five key topics across these news summaries.\n","\n","4. Topic-Specific Sentiment Analysis: Calculate the average sentiment of the news summaries assigned to each topic.\n","\n","5. Weighted Sentiment Analysis: Use these topic-specific sentiment scores to calculate a topic-weighted sentiment score for each company.\n","\n","6. Comparison: Compare the original and topic-weighted sentiment scores to evaluate the difference."]},
{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["!pip install -q yahoo_fin pandas_datareader gensim textblob"]},
{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[{"data":{"text/plain":["[nltk_data] Downloading package stopwords to\n","[nltk_data] /home/hexuser/nltk_data...\n","[nltk_data] Unzipping corpora/stopwords.zip.\n","[nltk_data] Downloading package punkt to /home/hexuser/nltk_data...\n","[nltk_data] Unzipping tokenizers/punkt.zip.\n"]},"execution_count":null,"metadata":{},"output_type":"execute_result"},{"data":{"text/plain":["True"]},"execution_count":null,"metadata":{},"output_type":"execute_result"}],"source":["import nltk\n","nltk.download('stopwords')\n","nltk.download('punkt')"]},
{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["import requests\n","import pandas as pd\n","from yahoo_fin import stock_info as info\n","from yahoo_fin import news\n","from pandas_datareader import DataReader\n","import numpy as np\n","import warnings\n","warnings.filterwarnings('ignore')\n","\n","from gensim import corpora, models\n","from nltk.corpus import stopwords\n","from nltk.tokenize import word_tokenize\n","import string\n"]},
{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[{"data":{"text/plain":["['AAPL',\n"," 'AMGN',\n"," 'AMZN',\n"," 'AXP',\n"," 'BA',\n"," 'CAT',\n"," 'CRM',\n"," 'CSCO',\n"," 'CVX',\n"," 'DIS',\n"," 'DOW',\n"," 'GS',\n"," 'HD',\n"," 'HON',\n"," 'IBM',\n"," 'INTC',\n"," 'JNJ',\n"," 'JPM',\n"," 'KO',\n"," 'MCD',\n"," 'MMM',\n"," 'MRK',\n"," 'MSFT',\n"," 'NKE',\n"," 'PG',\n"," 'TRV',\n"," 'UNH',\n"," 'V',\n"," 'VZ',\n"," 'WMT']"]},"execution_count":null,"metadata":{},"output_type":"execute_result"}],"source":["# Get the list of tickers that comprise the Dow Jones Industrial Average\n","tickers = info.tickers_dow()\n","tickers"]},
{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[{"data":{"application/vnd.hex.export+parquet":"{\"success\":true,\"exportKey\":\"4a8043e5-f038-4821-9c1d-4b8d3d5b0fcd/4f22f623-94d2-4685-a2bb-957a6cfa4229/exports/de3ed2a9-623b-4a20-a897-fce77b20c7f5\"}","text/html":["
<table>\n",
"<thead><tr><th></th><th>Ticker</th><th>Summaries</th></tr></thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>AAPL</td><td>[Magnificent Seven stocks, including AI leader...</td></tr>\n",
"<tr><th>1</th><td>AMGN</td><td>[Amgen's shares have come under pressure this ...</td></tr>\n",
"<tr><th>2</th><td>AMZN</td><td>[Amazon.com said on Wednesday it plans to push...</td></tr>\n",
"<tr><th>3</th><td>AXP</td><td>[The pair both declared substantial improvemen...</td></tr>\n",
"<tr><th>4</th><td>BA</td><td>[Boeing’s global fleet of 787 Dreamliner jets ...</td></tr>\n",
"</tbody>\n",
"</table>"]},"execution_count":null,"metadata":{},"output_type":"execute_result"}],"source":[
"# Collect one row per ticker and build the DataFrame once\n",
"# (DataFrame.append was removed in pandas 2.0)\n",
"rows = []\n",
"for ticker in tickers:\n",
"    # Fetch the Yahoo Finance RSS news items for this ticker\n",
"    ticker_news = news.get_yf_rss(ticker)\n",
"    summaries = [article['summary'] for article in ticker_news]\n",
"    rows.append({'Ticker': ticker, 'Summaries': summaries})\n",
"dow_news_df = pd.DataFrame(rows, columns=['Ticker', 'Summaries'])\n",
"dow_news_df.head()"]},
{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[{"data":{"application/vnd.hex.export+parquet":"{\"success\":true,\"exportKey\":\"4a8043e5-f038-4821-9c1d-4b8d3d5b0fcd/4f22f623-94d2-4685-a2bb-957a6cfa4229/exports/2420f88f-a6f1-40cd-8392-d445b2bd5720\"}","text/html":["
<table>\n",
"<thead><tr><th></th><th>Ticker</th><th>Summaries</th></tr></thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>AAPL</td><td>[Magnificent Seven stocks, including AI leader...</td></tr>\n",
"<tr><th>1</th><td>AMGN</td><td>[Amgen's shares have come under pressure this ...</td></tr>\n",
"<tr><th>2</th><td>AMZN</td><td>[Amazon.com said on Wednesday it plans to push...</td></tr>\n",
"<tr><th>3</th><td>AXP</td><td>[The pair both declared substantial improvemen...</td></tr>\n",
"<tr><th>4</th><td>BA</td><td>[Boeing’s global fleet of 787 Dreamliner jets ...</td></tr>\n",
"<tr><th>5</th><td>CAT</td><td>[The bull and bear debate over the cyclical st...</td></tr>\n",
"<tr><th>6</th><td>CRM</td><td>[Key Insights Institutions' substantial holdin...</td></tr>\n",
"<tr><th>7</th><td>CSCO</td><td>[Cisco Systems (CSCO) concluded the recent tra...</td></tr>\n",
"<tr><th>8</th><td>CVX</td><td>[(Bloomberg) -- President Joe Biden’s administ...</td></tr>\n",
"<tr><th>9</th><td>DIS</td><td>[Workers who help bring Disneyland’s beloved c...</td></tr>\n",
"<tr><th>10</th><td>DOW</td><td>[Should corporate executives’ pay be tied to c...</td></tr>\n",
"<tr><th>11</th><td>GS</td><td>[(Bloomberg) -- Banks have found another way t...</td></tr>\n",
"<tr><th>12</th><td>HD</td><td>[Home Depot (HD) has been one of the stocks mo...</td></tr>\n",
"<tr><th>13</th><td>HON</td><td>[Honeywell (HON) gains from solid momentum in ...</td></tr>\n",
"<tr><th>14</th><td>IBM</td><td>[IBM (IBM) doesn't possess the right combinati...</td></tr>\n",
"<tr><th>15</th><td>INTC</td><td>[Shares of the chip equipment manufacturer pul...</td></tr>\n",
"<tr><th>16</th><td>JNJ</td><td>[Johnson &amp; Johnson (JNJ) continued with its lo...</td></tr>\n",
"<tr><th>17</th><td>JPM</td><td>[(Bloomberg) -- JPMorgan Chase &amp; Co. Chief Exe...</td></tr>\n",
"<tr><th>18</th><td>KO</td><td>[Mastercard, Netflix, Coca-Cola, Berkshire Hat...</td></tr>\n",
"<tr><th>19</th><td>MCD</td><td>[With over 38,000 locations in more than 100 c...</td></tr>\n",
"<tr><th>20</th><td>MMM</td><td>[NORTHAMPTON, MA / ACCESSWIRE / April 15, 2024...</td></tr>\n",
"<tr><th>21</th><td>MRK</td><td>[The fact that multiple Merck &amp; Co., Inc. ( NY...</td></tr>\n",
"<tr><th>22</th><td>MSFT</td><td>[Consumers face the prospect of permanently hi...</td></tr>\n",
"<tr><th>23</th><td>NKE</td><td>[Adidas's strong first-quarter figures suggest...</td></tr>\n",
"<tr><th>24</th><td>PG</td><td>[The Management Top 250 ranking, compiled by r...</td></tr>\n",
"<tr><th>25</th><td>TRV</td><td>[Travelers' (TRV) first-quarter results reflec...</td></tr>\n",
"<tr><th>26</th><td>UNH</td><td>[UnitedHealth Group (UNH) breezed past the Zac...</td></tr>\n",
"<tr><th>27</th><td>V</td><td>[Visa (V) has an impressive earnings surprise ...</td></tr>\n",
"<tr><th>28</th><td>VZ</td><td>[Looking beyond Wall Street's top -and-bottom-...</td></tr>\n",
"<tr><th>29</th><td>WMT</td><td>[The stock has significantly outperformed the ...</td></tr>\n",
"</tbody>\n",
"</table>
"]},"execution_count":null,"metadata":{},"output_type":"execute_result"}],"source":["dow_news_df"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[{"data":{"application/vnd.hex.export+parquet":"{\"success\":true,\"exportKey\":\"4a8043e5-f038-4821-9c1d-4b8d3d5b0fcd/4f22f623-94d2-4685-a2bb-957a6cfa4229/exports/64783261-bb2c-4fb3-be16-258583a23e81\"}","text/html":["
<table>\n",
"<thead><tr><th></th><th>Ticker</th><th>Average Sentiment</th></tr></thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>AAPL</td><td>0.195268</td></tr>\n",
"<tr><th>1</th><td>AMGN</td><td>0.125121</td></tr>\n",
"<tr><th>2</th><td>AMZN</td><td>0.143147</td></tr>\n",
"<tr><th>3</th><td>AXP</td><td>0.158369</td></tr>\n",
"<tr><th>4</th><td>BA</td><td>0.145588</td></tr>\n",
"</tbody>\n",
"</table>"]},"execution_count":null,"metadata":{},"output_type":"execute_result"}],"source":[
"from textblob import TextBlob\n",
"\n",
"# Function to calculate sentiment polarity (TextBlob returns a float in [-1.0, 1.0])\n",
"def calculate_sentiment(text):\n",
"    return TextBlob(text).sentiment.polarity\n",
"\n",
"# Average the polarity of each ticker's summaries; tickers without summaries are skipped.\n",
"# Rows are collected in a list because DataFrame.append was removed in pandas 2.0.\n",
"rows = []\n",
"for index, row in dow_news_df.iterrows():\n",
"    ticker = row['Ticker']\n",
"    summaries = row['Summaries']\n",
"    if summaries:\n",
"        avg_sentiment = np.mean([calculate_sentiment(summary) for summary in summaries])\n",
"        rows.append({'Ticker': ticker, 'Average Sentiment': avg_sentiment})\n",
"dow_sentiment_df = pd.DataFrame(rows, columns=['Ticker', 'Average Sentiment'])\n",
"dow_sentiment_df.head()"]},
{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[{"data":{"application/vnd.hex.export+parquet":"{\"success\":true,\"exportKey\":\"4a8043e5-f038-4821-9c1d-4b8d3d5b0fcd/4f22f623-94d2-4685-a2bb-957a6cfa4229/exports/b007097d-7793-4eef-9bd0-523f0dd3dd89\"}","text/html":["
<table>\n",
"<thead><tr><th></th><th>Ticker</th><th>Average Sentiment</th></tr></thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>AAPL</td><td>0.195268</td></tr>\n",
"<tr><th>1</th><td>AMGN</td><td>0.125121</td></tr>\n",
"<tr><th>2</th><td>AMZN</td><td>0.143147</td></tr>\n",
"<tr><th>3</th><td>AXP</td><td>0.158369</td></tr>\n",
"<tr><th>4</th><td>BA</td><td>0.145588</td></tr>\n",
"<tr><th>5</th><td>CAT</td><td>0.099819</td></tr>\n",
"<tr><th>6</th><td>CRM</td><td>0.134925</td></tr>\n",
"<tr><th>7</th><td>CSCO</td><td>0.088520</td></tr>\n",
"<tr><th>8</th><td>CVX</td><td>0.124590</td></tr>\n",
"<tr><th>9</th><td>DIS</td><td>0.169991</td></tr>\n",
"<tr><th>10</th><td>DOW</td><td>0.180742</td></tr>\n",
"<tr><th>11</th><td>GS</td><td>0.239865</td></tr>\n",
"<tr><th>12</th><td>HD</td><td>0.168376</td></tr>\n",
"<tr><th>13</th><td>HON</td><td>0.109561</td></tr>\n",
"<tr><th>14</th><td>IBM</td><td>0.148592</td></tr>\n",
"<tr><th>15</th><td>INTC</td><td>0.043373</td></tr>\n",
"<tr><th>16</th><td>JNJ</td><td>0.087794</td></tr>\n",
"<tr><th>17</th><td>JPM</td><td>0.075948</td></tr>\n",
"<tr><th>18</th><td>KO</td><td>0.215687</td></tr>\n",
"<tr><th>19</th><td>MCD</td><td>0.155715</td></tr>\n",
"<tr><th>20</th><td>MMM</td><td>0.157566</td></tr>\n",
"<tr><th>21</th><td>MRK</td><td>0.140685</td></tr>\n",
"<tr><th>22</th><td>MSFT</td><td>0.105799</td></tr>\n",
"<tr><th>23</th><td>NKE</td><td>0.073771</td></tr>\n",
"<tr><th>24</th><td>PG</td><td>0.160547</td></tr>\n",
"<tr><th>25</th><td>TRV</td><td>0.138650</td></tr>\n",
"<tr><th>26</th><td>UNH</td><td>0.114048</td></tr>\n",
"<tr><th>27</th><td>V</td><td>0.124004</td></tr>\n",
"<tr><th>28</th><td>VZ</td><td>0.145537</td></tr>\n",
"<tr><th>29</th><td>WMT</td><td>0.099878</td></tr>\n",
"</tbody>\n",
"</table>
"]},"execution_count":null,"metadata":{},"output_type":"execute_result"}],"source":["dow_sentiment_df"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":[]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[{"data":{"application/vnd.hex.export+parquet":"{\"success\":true,\"exportKey\":\"4a8043e5-f038-4821-9c1d-4b8d3d5b0fcd/4f22f623-94d2-4685-a2bb-957a6cfa4229/exports/0d236868-ff6d-48fe-834b-ac2342b966fe\"}","text/html":["
<table>\n",
"<thead><tr><th></th><th>Ticker</th><th>Summary</th></tr></thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>AAPL</td><td>Magnificent Seven stocks, including AI leader ...</td></tr>\n",
"<tr><th>1</th><td>AAPL</td><td>So much for the \"pay or okay\" model that the F...</td></tr>\n",
"<tr><th>2</th><td>AAPL</td><td>Apple is opening up web distribution for iOS a...</td></tr>\n",
"<tr><th>3</th><td>AAPL</td><td>These four stocks will be the cream of the cro...</td></tr>\n",
"<tr><th>4</th><td>AAPL</td><td>Apple has fixed a bug that suggested the Pales...</td></tr>\n",
"<tr><th>5</th><td>AAPL</td><td>Apple CEO Tim Cook says ‘the investment abilit...</td></tr>\n",
"<tr><th>6</th><td>AAPL</td><td>These are stocks you should always consider bu...</td></tr>\n",
"<tr><th>7</th><td>AAPL</td><td>Amazon, Apple initiated: Wall Street's top ana...</td></tr>\n",
"<tr><th>8</th><td>AAPL</td><td>The tech giant is no longer the world's top sm...</td></tr>\n",
"<tr><th>9</th><td>AAPL</td><td>These companies are at earlier stages in their...</td></tr>\n",
"<tr><th>10</th><td>AAPL</td><td>Which of these tech titans is the better buy r...</td></tr>\n",
"<tr><th>11</th><td>AAPL</td><td>After Vietnam, Cook then flew further south to...</td></tr>\n",
"<tr><th>12</th><td>AAPL</td><td>Apple will consider making some of its product...</td></tr>\n",
"<tr><th>13</th><td>AAPL</td><td>The iShares Expanded Tech Sector ETF is outper...</td></tr>\n",
"<tr><th>14</th><td>AAPL</td><td>In just a few days, India will commence the wo...</td></tr>\n",
"<tr><th>15</th><td>AAPL</td><td>The legendary investor has made tens of billio...</td></tr>\n",
"<tr><th>16</th><td>AAPL</td><td>Not all AI stocks trade at stratospheric valua...</td></tr>\n",
"<tr><th>17</th><td>AAPL</td><td>This ETF contains some of the world's most imp...</td></tr>\n",
"<tr><th>18</th><td>AAPL</td><td>(Bloomberg) -- Apple Inc. is weighing the poss...</td></tr>\n",
"<tr><th>19</th><td>AAPL</td><td>Apple CEO Tim Cook said the company will “look...</td></tr>\n",
"<tr><th>20</th><td>AMGN</td><td>Amgen's shares have come under pressure this y...</td></tr>\n",
"<tr><th>21</th><td>AMGN</td><td>Not every business in operation today is built...</td></tr>\n",
"<tr><th>22</th><td>AMGN</td><td>Amgen (NASDAQ:AMGN) today provided an update r...</td></tr>\n",
"<tr><th>23</th><td>AMGN</td><td>Zacks.com users have recently been watching Am...</td></tr>\n",
"<tr><th>24</th><td>AMGN</td><td>In the most recent trading session, Amgen (AMG...</td></tr>\n",
"<tr><th>25</th><td>AMGN</td><td>Amgen's stock isn't expensive, but the busines...</td></tr>\n",
"<tr><th>26</th><td>AMGN</td><td>GLP-1 medications will likely expand beyond th...</td></tr>\n",
"<tr><th>27</th><td>AMGN</td><td>In this article, we discuss 11 best biotech ET...</td></tr>\n",
"<tr><th>28</th><td>AMGN</td><td>It’s been a volatile start to the second quart...</td></tr>\n",
"<tr><th>29</th><td>AMGN</td><td>Looking ahead, the future of the U.S. economy ...</td></tr>\n",
"<tr><th>30</th><td>AMGN</td><td>Joseph Artuso of Easterly Investment Partners ...</td></tr>\n",
"<tr><th>31</th><td>AMGN</td><td>In this piece, we will take a look at the ten ...</td></tr>\n",
"<tr><th>32</th><td>AMGN</td><td>In this article, we discuss 13 best cheap divi...</td></tr>\n",
"<tr><th>33</th><td>AMGN</td><td>These drugmakers have made important moves ove...</td></tr>\n",
"<tr><th>34</th><td>AMGN</td><td>Amgen (AMGN) concluded the recent trading sess...</td></tr>\n",
"<tr><th>35</th><td>AMGN</td><td>Use the recent short-term weakness from these ...</td></tr>\n",
"<tr><th>36</th><td>AMGN</td><td>A Phase 3 study will compare Merck’s experimen...</td></tr>\n",
"<tr><th>37</th><td>AMGN</td><td>Amgen (AMGN) expects strong sales growth of pr...</td></tr>\n",
"<tr><th>38</th><td>AMGN</td><td>Amgen (AMGN) closed the most recent trading da...</td></tr>\n",
"<tr><th>39</th><td>AMGN</td><td>Amgen (AMGN) has an impressive earnings surpri...</td></tr>\n",
"</tbody>\n",
"</table>"]},"execution_count":null,"metadata":{},"output_type":"execute_result"}],"source":[
"# Collect the top 20 news summaries for each ticker, one row per summary.\n",
"# Rows are gathered in a list because DataFrame.append was removed in pandas 2.0.\n",
"rows = []\n",
"for ticker in tickers:\n",
"    ticker_news = news.get_yf_rss(ticker)[:20]\n",
"    for article in ticker_news:\n",
"        summary = article['summary']\n",
"        rows.append({'Ticker': ticker, 'Summary': summary})\n",
"dow_top20_summaries_df = pd.DataFrame(rows, columns=['Ticker', 'Summary'])\n",
"dow_top20_summaries_df.head(40)"]},
{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[{"data":{"application/vnd.hex.export+parquet":"{\"success\":true,\"exportKey\":\"4a8043e5-f038-4821-9c1d-4b8d3d5b0fcd/4f22f623-94d2-4685-a2bb-957a6cfa4229/exports/200645fa-f332-4423-96e3-6c89de913ed6\"}","text/html":["
<table>\n",
"<thead><tr><th></th><th>Ticker</th><th>Summary</th></tr></thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>AAPL</td><td>Magnificent Seven stocks, including AI leader ...</td></tr>\n",
"<tr><th>1</th><td>AAPL</td><td>So much for the \"pay or okay\" model that the F...</td></tr>\n",
"<tr><th>2</th><td>AAPL</td><td>Apple is opening up web distribution for iOS a...</td></tr>\n",
"<tr><th>3</th><td>AAPL</td><td>These four stocks will be the cream of the cro...</td></tr>\n",
"<tr><th>4</th><td>AAPL</td><td>Apple has fixed a bug that suggested the Pales...</td></tr>\n",
"<tr><th>...</th><td>...</td><td>...</td></tr>\n",
"<tr><th>595</th><td>WMT</td><td>The price reductions come as consumers feel th...</td></tr>\n",
"<tr><th>596</th><td>WMT</td><td>This retailer's faster growth helped fund a bi...</td></tr>\n",
"<tr><th>597</th><td>WMT</td><td>Nichole Hart walks 20,000 steps as she searche...</td></tr>\n",
"<tr><th>598</th><td>WMT</td><td>Alaska Permanent, the largest U.S. state wealt...</td></tr>\n",
"<tr><th>599</th><td>WMT</td><td>The retailer could be a more exciting stock to...</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>600 rows × 2 columns</p>
"]},"execution_count":null,"metadata":{},"output_type":"execute_result"}],"source":["dow_top20_summaries_df"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[{"data":{"application/vnd.hex.export+parquet":"{\"success\":true,\"exportKey\":\"4a8043e5-f038-4821-9c1d-4b8d3d5b0fcd/4f22f623-94d2-4685-a2bb-957a6cfa4229/exports/c5f1dec3-12cf-4f26-828a-7975cc587610\"}","text/html":["
<table>\n",
"<thead><tr><th></th><th>Ticker</th><th>Summary</th><th>Sentiment</th></tr></thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>AAPL</td><td>Magnificent Seven stocks, including AI leader ...</td><td>1.000000</td></tr>\n",
"<tr><th>1</th><td>AAPL</td><td>So much for the \"pay or okay\" model that the F...</td><td>0.233333</td></tr>\n",
"<tr><th>2</th><td>AAPL</td><td>Apple is opening up web distribution for iOS a...</td><td>0.225000</td></tr>\n",
"<tr><th>3</th><td>AAPL</td><td>These four stocks will be the cream of the cro...</td><td>0.000000</td></tr>\n",
"<tr><th>4</th><td>AAPL</td><td>Apple has fixed a bug that suggested the Pales...</td><td>0.100000</td></tr>\n",
"<tr><th>5</th><td>AAPL</td><td>Apple CEO Tim Cook says ‘the investment abilit...</td><td>-0.125000</td></tr>\n",
"<tr><th>6</th><td>AAPL</td><td>These are stocks you should always consider bu...</td><td>0.000000</td></tr>\n",
"<tr><th>7</th><td>AAPL</td><td>Amazon, Apple initiated: Wall Street's top ana...</td><td>0.500000</td></tr>\n",
"<tr><th>8</th><td>AAPL</td><td>The tech giant is no longer the world's top sm...</td><td>0.250000</td></tr>\n",
"<tr><th>9</th><td>AAPL</td><td>These companies are at earlier stages in their...</td><td>0.062500</td></tr>\n",
"<tr><th>10</th><td>AAPL</td><td>Which of these tech titans is the better buy r...</td><td>0.392857</td></tr>\n",
"<tr><th>11</th><td>AAPL</td><td>After Vietnam, Cook then flew further south to...</td><td>0.100000</td></tr>\n",
"<tr><th>12</th><td>AAPL</td><td>Apple will consider making some of its product...</td><td>0.000000</td></tr>\n",
"<tr><th>13</th><td>AAPL</td><td>The iShares Expanded Tech Sector ETF is outper...</td><td>0.000000</td></tr>\n",
"<tr><th>14</th><td>AAPL</td><td>In just a few days, India will commence the wo...</td><td>-0.200000</td></tr>\n",
"<tr><th>15</th><td>AAPL</td><td>The legendary investor has made tens of billio...</td><td>0.500000</td></tr>\n",
"<tr><th>16</th><td>AAPL</td><td>Not all AI stocks trade at stratospheric valua...</td><td>0.000000</td></tr>\n",
"<tr><th>17</th><td>AAPL</td><td>This ETF contains some of the world's most imp...</td><td>0.450000</td></tr>\n",
"<tr><th>18</th><td>AAPL</td><td>(Bloomberg) -- Apple Inc. is weighing the poss...</td><td>0.066667</td></tr>\n",
"<tr><th>19</th><td>AAPL</td><td>Apple CEO Tim Cook said the company will “look...</td><td>0.350000</td></tr>\n",
"<tr><th>20</th><td>AMGN</td><td>Amgen's shares have come under pressure this y...</td><td>0.300000</td></tr>\n",
"<tr><th>21</th><td>AMGN</td><td>Not every business in operation today is built...</td><td>0.000000</td></tr>\n",
"<tr><th>22</th><td>AMGN</td><td>Amgen (NASDAQ:AMGN) today provided an update r...</td><td>0.000000</td></tr>\n",
"<tr><th>23</th><td>AMGN</td><td>Zacks.com users have recently been watching Am...</td><td>0.150000</td></tr>\n",
"<tr><th>24</th><td>AMGN</td><td>In the most recent trading session, Amgen (AMG...</td><td>0.058333</td></tr>\n",
"<tr><th>25</th><td>AMGN</td><td>Amgen's stock isn't expensive, but the busines...</td><td>-0.500000</td></tr>\n",
"<tr><th>26</th><td>AMGN</td><td>GLP-1 medications will likely expand beyond th...</td><td>0.000000</td></tr>\n",
"<tr><th>27</th><td>AMGN</td><td>In this article, we discuss 11 best biotech ET...</td><td>0.384091</td></tr>\n",
"<tr><th>28</th><td>AMGN</td><td>It’s been a volatile start to the second quart...</td><td>0.125000</td></tr>\n",
"<tr><th>29</th><td>AMGN</td><td>Looking ahead, the future of the U.S. economy ...</td><td>0.188333</td></tr>\n",
"<tr><th>30</th><td>AMGN</td><td>Joseph Artuso of Easterly Investment Partners ...</td><td>0.000000</td></tr>\n",
"<tr><th>31</th><td>AMGN</td><td>In this piece, we will take a look at the ten ...</td><td>0.428571</td></tr>\n",
"<tr><th>32</th><td>AMGN</td><td>In this article, we discuss 13 best cheap divi...</td><td>0.487143</td></tr>\n",
"<tr><th>33</th><td>AMGN</td><td>These drugmakers have made important moves ove...</td><td>0.075000</td></tr>\n",
"<tr><th>34</th><td>AMGN</td><td>Amgen (AMGN) concluded the recent trading sess...</td><td>0.000000</td></tr>\n",
"<tr><th>35</th><td>AMGN</td><td>Use the recent short-term weakness from these ...</td><td>0.000000</td></tr>\n",
"<tr><th>36</th><td>AMGN</td><td>A Phase 3 study will compare Merck’s experimen...</td><td>0.100000</td></tr>\n",
"<tr><th>37</th><td>AMGN</td><td>Amgen (AMGN) expects strong sales growth of pr...</td><td>0.433333</td></tr>\n",
"<tr><th>38</th><td>AMGN</td><td>Amgen (AMGN) closed the most recent trading da...</td><td>0.058333</td></tr>\n",
"<tr><th>39</th><td>AMGN</td><td>Amgen (AMGN) has an impressive earnings surpri...</td><td>0.214286</td></tr>\n",
"</tbody>\n",
"</table>"]},"execution_count":null,"metadata":{},"output_type":"execute_result"}],"source":[
"# Function to calculate sentiment polarity (TextBlob returns a float in [-1.0, 1.0])\n",
"def calculate_sentiment(text):\n",
"    return TextBlob(text).sentiment.polarity\n",
"\n",
"# Score every summary in one pass instead of appending row by row\n",
"# (DataFrame.append was removed in pandas 2.0)\n",
"dow_top20_sentiment_df = dow_top20_summaries_df[['Ticker', 'Summary']].copy()\n",
"dow_top20_sentiment_df['Sentiment'] = dow_top20_sentiment_df['Summary'].apply(calculate_sentiment)\n",
"dow_top20_sentiment_df.head(40)"]},
{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[{"data":{"text/plain":[
"[(0, '0.014*\"’\" + 0.008*\"2024\" + 0.007*\"\'s\" + 0.006*\"april\"'),\n",
" (1, '0.014*\"stocks\" + 0.014*\"\'s\" + 0.009*\"trading\" + 0.007*\"earnings\"'),\n",
" (2, '0.012*\"\'s\" + 0.007*\"2024\" + 0.006*\"stock\" + 0.006*\"market\"'),\n",
" (3, '0.009*\"stocks\" + 0.008*\"earnings\" + 0.008*\"company\" + 0.007*\"\'s\"'),\n",
" (4, '0.012*\"\'s\" + 0.010*\"’\" + 0.006*\"u.s.\" + 0.005*\"rate\"')]"]},"execution_count":null,"metadata":{},"output_type":"execute_result"}],"source":[
"# Function to clean and tokenize text\n",
"# Note: tokens such as 's and ’ are neither NLTK stop words nor ASCII punctuation,\n",
"# so they pass this filter and show up in the topics below.\n",
"def clean_tokenize(text):\n",
"    stop_words = set(stopwords.words('english'))\n",
"    tokens = word_tokenize(text.lower())\n",
"    tokens = [word for word in tokens if word not in stop_words and word not in string.punctuation]\n",
"    return tokens\n",
"\n",
"# Tokenize the summaries\n",
"tokenized_summaries = dow_top20_summaries_df['Summary'].apply(clean_tokenize)\n",
"\n",
"# Create a dictionary and corpus from the tokenized summaries\n",
"dictionary = corpora.Dictionary(tokenized_summaries)\n",
"corpus = [dictionary.doc2bow(text) for text in tokenized_summaries]\n",
"\n",
"# Apply LDA model\n",
"lda_model = models.ldamodel.LdaModel(corpus, num_topics=5, id2word=dictionary, passes=15)\n",
"topics = lda_model.print_topics(num_words=4)\n",
"\n",
"topics"]},
{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[{"data":{"text/plain":[
"[(0, '0.013*\"trading\" + 0.011*\"stocks\" + 0.011*\"day\" + 0.010*\"’\"'),\n",
" (1, '0.011*\"\'s\" + 0.011*\"’\" + 0.009*\"stocks\" + 0.007*\"market\"'),\n",
" (2, '0.013*\"earnings\" + 0.012*\"’\" + 0.011*\"2024\" + 0.008*\"company\"'),\n",
" (3, '0.011*\"\'s\" + 0.011*\"stocks\" + 0.006*\"’\" + 0.006*\"company\"'),\n",
" (4, '0.019*\"\'s\" + 0.008*\"said\" + 0.007*\"street\" + 0.007*\"wall\"')]"]},"execution_count":null,"metadata":{},"output_type":"execute_result"}],"source":[
"# Re-run the LDA topic modeling code after downloading the required NLTK resources\n",
"from gensim import corpora, models\n",
"from nltk.corpus import stopwords\n",
"from nltk.tokenize import word_tokenize\n",
"import string\n",
"\n",
"# Function to clean and tokenize text\n",
"def clean_tokenize(text):\n",
"    stop_words = set(stopwords.words('english'))\n",
"    tokens = word_tokenize(text.lower())\n",
"    tokens = [word for word in tokens if word not in stop_words and word not in string.punctuation]\n",
"    return tokens\n",
"\n",
"# Tokenize the summaries\n",
"tokenized_summaries = dow_top20_summaries_df['Summary'].apply(clean_tokenize)\n",
"\n",
"# Create a dictionary and corpus from the tokenized summaries\n",
"dictionary = corpora.Dictionary(tokenized_summaries)\n",
"corpus = [dictionary.doc2bow(text) for text in tokenized_summaries]\n",
"\n",
"# Apply LDA model\n",
"lda_model = models.ldamodel.LdaModel(corpus, num_topics=5, id2word=dictionary, passes=15)\n",
"topics = lda_model.print_topics(num_words=4)\n",
"\n",
"topics"]},
{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[{"data":{"application/vnd.hex.export+parquet":"{\"success\":true,\"exportKey\":\"4a8043e5-f038-4821-9c1d-4b8d3d5b0fcd/4f22f623-94d2-4685-a2bb-957a6cfa4229/exports/fe4fd92e-0c89-4b96-a036-2719999dcfeb\"}","text/html":["
<table>\n",
"<thead><tr><th></th><th>Topic</th><th>Sentiment</th></tr></thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>0</td><td>0.124343</td></tr>\n",
"<tr><th>1</th><td>1</td><td>0.110615</td></tr>\n",
"<tr><th>2</th><td>2</td><td>0.126383</td></tr>\n",
"<tr><th>3</th><td>3</td><td>0.178993</td></tr>\n",
"<tr><th>4</th><td>4</td><td>0.126170</td></tr>\n",
"</tbody>\n",
"</table>"]},"execution_count":null,"metadata":{},"output_type":"execute_result"}],"source":[
"# Function to assign topics to summaries based on LDA model\n",
"def assign_topic_to_summary(summary):\n",
"    bow = dictionary.doc2bow(clean_tokenize(summary))\n",
"    topic_scores = lda_model[bow]\n",
"    dominant_topic = max(topic_scores, key=lambda x: x[1])[0]\n",
"    return dominant_topic\n",
"\n",
"# Assign topics to each summary\n",
"dow_top20_summaries_df['Topic'] = dow_top20_summaries_df['Summary'].apply(assign_topic_to_summary)\n",
"\n",
"# Perform sentiment analysis on each summary\n",
"dow_top20_summaries_df['Sentiment'] = dow_top20_summaries_df['Summary'].apply(calculate_sentiment)\n",
"\n",
"# Group by topic and calculate average sentiment\n",
"topic_sentiment_df = dow_top20_summaries_df.groupby('Topic')['Sentiment'].mean().reset_index()\n",
"\n",
"topic_sentiment_df"]},
{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[{"data":{"application/vnd.hex.export+parquet":"{\"success\":true,\"exportKey\":\"4a8043e5-f038-4821-9c1d-4b8d3d5b0fcd/4f22f623-94d2-4685-a2bb-957a6cfa4229/exports/3e79ced9-c958-471a-9f7b-bda913cf5601\"}","text/html":["
<table>\n",
"<thead><tr><th></th><th>Ticker</th><th>Original_Sentiment</th><th>New_Weighted_Sentiment</th></tr></thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>AAPL</td><td>0.195268</td><td>0.027609</td></tr>\n",
"<tr><th>1</th><td>AMGN</td><td>0.125121</td><td>0.018886</td></tr>\n",
"<tr><th>2</th><td>AMZN</td><td>0.143147</td><td>0.019433</td></tr>\n",
"<tr><th>3</th><td>AXP</td><td>0.158369</td><td>0.022741</td></tr>\n",
"<tr><th>4</th><td>BA</td><td>0.145588</td><td>0.021359</td></tr>\n",
"<tr><th>5</th><td>CAT</td><td>0.099819</td><td>0.012247</td></tr>\n",
"<tr><th>6</th><td>CRM</td><td>0.134925</td><td>0.018329</td></tr>\n",
"<tr><th>7</th><td>CSCO</td><td>0.088520</td><td>0.011163</td></tr>\n",
"<tr><th>8</th><td>CVX</td><td>0.124590</td><td>0.017225</td></tr>\n",
"<tr><th>9</th><td>DIS</td><td>0.169991</td><td>0.022478</td></tr>\n",
"<tr><th>10</th><td>DOW</td><td>0.180742</td><td>0.024173</td></tr>\n",
"<tr><th>11</th><td>GS</td><td>0.239865</td><td>0.030563</td></tr>\n",
"<tr><th>12</th><td>HD</td><td>0.168376</td><td>0.024265</td></tr>\n",
"<tr><th>13</th><td>HON</td><td>0.109561</td><td>0.015721</td></tr>\n",
"<tr><th>14</th><td>IBM</td><td>0.148592</td><td>0.022415</td></tr>\n",
"<tr><th>15</th><td>INTC</td><td>0.043373</td><td>0.006019</td></tr>\n",
"<tr><th>16</th><td>JNJ</td><td>0.087794</td><td>0.010825</td></tr>\n",
"<tr><th>17</th><td>JPM</td><td>0.075948</td><td>0.010967</td></tr>\n",
"<tr><th>18</th><td>KO</td><td>0.215687</td><td>0.033600</td></tr>\n",
"<tr><th>19</th><td>MCD</td><td>0.155715</td><td>0.022637</td></tr>\n",
"<tr><th>20</th><td>MMM</td><td>0.157566</td><td>0.022900</td></tr>\n",
"<tr><th>21</th><td>MRK</td><td>0.140685</td><td>0.018020</td></tr>\n",
"<tr><th>22</th><td>MSFT</td><td>0.105799</td><td>0.016556</td></tr>\n",
"<tr><th>23</th><td>NKE</td><td>0.073771</td><td>0.011753</td></tr>\n",
"<tr><th>24</th><td>PG</td><td>0.160547</td><td>0.023201</td></tr>\n",
"<tr><th>25</th><td>TRV</td><td>0.138650</td><td>0.021667</td></tr>\n",
"<tr><th>26</th><td>UNH</td><td>0.114048</td><td>0.015892</td></tr>\n",
"<tr><th>27</th><td>V</td><td>0.124004</td><td>0.017746</td></tr>\n",
"<tr><th>28</th><td>VZ</td><td>0.145537</td><td>0.017689</td></tr>\n",
"<tr><th>29</th><td>WMT</td><td>0.099878</td><td>0.012784</td></tr>\n",
"</tbody>\n",
"</table>"]},"execution_count":null,"metadata":{},"output_type":"execute_result"}],"source":[
"# Weight each summary's sentiment by the mean sentiment of its topic.\n",
"# Note: this is a product of two polarity scores, not a normalized weighted average,\n",
"# so the results are on a smaller scale than the original scores.\n",
"topic_weights = topic_sentiment_df.set_index('Topic')['Sentiment']\n",
"dow_top20_summaries_df['Weighted_Sentiment'] = (\n",
"    dow_top20_summaries_df['Sentiment'] * dow_top20_summaries_df['Topic'].map(topic_weights)\n",
")\n",
"\n",
"# Calculate new average sentiment for each company based on weighted sentiment\n",
"new_dow_sentiment_df = dow_top20_summaries_df.groupby('Ticker')['Weighted_Sentiment'].mean().reset_index()\n",
"\n",
"# Merge with original dow_sentiment_df to compare\n",
"comparison_df = pd.merge(dow_sentiment_df, new_dow_sentiment_df, on='Ticker', how='inner')\n",
"comparison_df.columns = ['Ticker', 'Original_Sentiment', 'New_Weighted_Sentiment']\n",
"\n",
"comparison_df"]},
{"cell_type":"markdown","metadata":{},"source":["## Conclusions:\n","\n",
"1. Nuanced Understanding: The weighted sentiment scores provide a more nuanced view of the news landscape for each company. They reflect not just the sentiment of each news summary but also the average sentiment of the topic it belongs to, which this notebook uses as a stand-in for topic importance. Because each score is multiplied by a topic average between about 0.11 and 0.18, the weighted scores are several times smaller than the original ones, so the two columns are best compared by ranking rather than by magnitude.\n","\n",
"2. Risk Mitigation: By focusing on topic-specific sentiment, investors can potentially mitigate risks. For example, if a company has negative sentiment in a critical topic, such as one an analyst might label \"Corporate Announcements,\" it might be a red flag. LDA returns only word distributions, so labels like this have to be assigned by hand.\n","\n",
"3. Strategic Investment: The topic-weighted sentiment can be used to fine-tune investment strategies. For instance, you might prioritize companies with positive news in topics that are currently trending or are of strategic importance, such as one labeled \"Stock Market Trends.\"\n","\n",
"4. Dynamic Adaptation: As the importance of topics changes over time (e.g., during earnings season or product launches), the topic model and the weighted sentiment scores can be recomputed on fresh news to keep the insights timely.\n","\n",
"5. Comprehensive Analysis: Combining general and topic-specific sentiment gives a more rounded view, supporting better-informed investment decisions.\n","\n",
"By using weighted sentiment scores, investors can make more nuanced and strategic decisions, potentially leading to better investment outcomes."]}],"metadata":{"hex_info":{"author":"Brandon Doey","exported_date":"Wed Apr 17 2024 18:05:36 GMT+0000 (Coordinated Universal Time)","project_id":"4f22f623-94d2-4685-a2bb-957a6cfa4229","version":"draft"},"kernelspec":{"display_name":"Python 3","language":"python","name":"python3"},"orig_nbformat":4},"nbformat":4,"nbformat_minor":4}