top of page
R script to scrape data from Yahoo Finance and uploading to SQL server
Software / programming language used:
R
SQL
Web Scraping
Left Joins
Project Details
This project was made to support a series of stock market dashboards based on the US S&P 500 market. The script scrapes yahoo finance and can be programmed to run on a daily basis to extract data, run calculations such as growth and the volatility and overvaluation status of stocks. The daily stock script also combines other information such as market cap, EPS ratio, Annual reports, etc.
​
Script file is available below for download
​
Highlights

Stock data daily refresh, daily stock information is loaded into a dataframe from the yahoo finance web scrape. Data is then transformed with additional columns such as month and year added to help with the joins from other data sources

Financial information, Market cap information as well as information regarding company industry and financial report information are loaded into R and prepared before being joined with the daily web scraped data.

Some special features such as a overvaluation or hype rating score was calculated using volatility, working capital, cash flow, ebit, revenue among other things were calculated against market cap to get a overvaluation rating for each stock from Slightly undervalued to Slightly Overvalued.

Lastly, 3 files are written and updated back to the SQL server, 1 table contains raw information of the web scraping only, 2nd table contains market cap information as well as the hype rating and the last table contains the company financials taken from financial reports. These 3 data tables can be connected easily in PowerBI or other frontend BI applications.
Additional Datasets from:
Kaggle - Payton Fisher
Kaggle - Hanseo Park
Kaggle - LARXEL
bottom of page