Stock Market Analysis
University of Calcutta, India — 2024
My Role
Project lead — ML code, model fitting, data management, and documentation.
Mentors
Prof. Shirsendu Mukherjee, Project Guide
Prof. Dhiman Dutta, Partial Guidance
Prof. Amlan Chakravarti, ML Guide
Timeline & Status
4 Months, Completed in June 2024
Overview
The following project was completed under the rules of the University of Calcutta, guided by the Department of Statistics, Asutosh College.

I made this page to demonstrate the whole forecasting process live. Though it may not be very interactive, it should be clearly understandable.

All predictions reach roughly 80% accuracy. The actual, test, and train plots almost coincide with one another.
HIGHLIGHTS
An end-to-end live demonstration of forecasting.
0.1 Google Prediction
IMAGE
0.2 Microsoft Prediction
IMAGE
0.3 IBM Prediction
IMAGE
0.4 Amazon Prediction
IMAGE
CONTEXT
A final year university project.
It was more than 85% accurate.
The goal was a journal-level project. Above all else, it was the right thing to do, and an opportunity to overdeliver.
LinkedIn Profile.
IMAGE
THE PROBLEM
It was not just analysis work; prediction was included as well.
A huge data set.
Data analysis on its own is routine. In my case, however, I wanted the work to be predictive as well.

The decision was made to write a large amount of code, which came with its own set of constraints and challenges:
A deadline of a few months, because the work and its documentation had to be finished before our external exams.
High GPU consumption. The system had to be powerful enough to train and predict correctly.
The data had to be clean. With such a large chunk of data, cleaning it was a huge pressure.
Unreliable data sources. Stock data is very sensitive, so collecting it was a huge task.
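The cleaning constraint above can be sketched with pandas. This is a minimal illustration on made-up rows; the column names (`Date`, `Close`, `Symbol`) are assumptions, not the project's actual NSE schema.

```python
import pandas as pd

# Hypothetical sample of NSE-style rows; the real data had far more
# columns and years of trading days.
raw = pd.DataFrame({
    "Date": ["2024-01-02", "2024-01-02", "2024-01-03", "2024-01-04"],
    "Close": [140.5, 140.5, None, 142.1],
    "Symbol": ["GOOG"] * 4,  # constant column, adds no signal
})
raw["Date"] = pd.to_datetime(raw["Date"])

# Keep only what the model needs; drop missing rows and duplicate days.
clean = (raw[["Date", "Close"]]
         .dropna()
         .drop_duplicates(subset="Date")
         .sort_values("Date")
         .reset_index(drop=True))
print(len(clean))  # 2 rows survive: one NaN and one duplicate removed
```

The same chained pattern scales to the full dataset, since every step is vectorised in pandas.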
THE CHALLENGE
Get reliable data, clean it, and predict with minimum time cost and maximum accuracy.
North Star Machine Learning principles:
01
Clean Data
Clear, to-the-point, and only the necessary unbiased data.
02
Less Time cost in Training
We need to fit a model that gives more accurate predictions at a lower time cost.
03
Least Error
Driving the RMSE down to get more accurate results.
UPDATE FLOW
Choosing the platform and making it the backbone.
You gotta start somewhere.
I chose Google Colab as the Jupyter notebook provider. A link to the GitHub repository is given below (click on Figure 3.0).
3.0 A preview of the project's README.md file on my GitHub.
IMAGE
Finding accuracy amidst the chaos of messy data.
Collecting the data from the NSE and following it all the way through to prediction takes a huge amount of effort.

As shown in Figure 3.1, the instructions are categorised into stages from beginning to end.

Additionally, overly technical terms were revised to better cater to a general audience.
Pinpointing the issues.
Clarifying, summarising, and then normalising the data under analysis were the most difficult parts of this whole work.
Unnecessary variables should be omitted.
Data causing bias in the analysis must be sorted out and deleted.
System out of memory — or is the error just vague?
The data-storage issue was solved by offloading storage to the notebook environment.
No validation that an action succeeded, and no indication of progress.
The test, train, and actual plots are almost collinear.
Getting the quick fixes in.
1. Data collected from the NSE, so no worry about bias. First problem fixed.

2. Pandas is imported as the analysis library, making data analysis smoother. Second problem fixed.

3. The GRU model is selected for fitting: less effort and more accuracy. Third problem fixed.

(Figure 3.3) — with the objective of fixing these problems.
National Stock Exchange data resolved.
Pandas DataFrame resolved.
Gated Recurrent Unit model resolved.
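To show why the GRU was an attractive choice, here is the single-step GRU update written in plain NumPy. The weights are random and the hidden size (4) is illustrative; this is a sketch of the gating mechanism, not the project's trained model.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, W, U):
    """One GRU step; W and U hold the (z, r, h) input/recurrent weights."""
    z = sigmoid(W["z"] @ x_t + U["z"] @ h_prev)             # update gate
    r = sigmoid(W["r"] @ x_t + U["r"] @ h_prev)             # reset gate
    h_cand = np.tanh(W["h"] @ x_t + U["h"] @ (r * h_prev))  # candidate state
    return (1 - z) * h_prev + z * h_cand                    # blended state

# Toy random weights: 1 input feature (price), hidden size 4.
rng = np.random.default_rng(0)
W = {k: rng.normal(scale=0.1, size=(4, 1)) for k in "zrh"}
U = {k: rng.normal(scale=0.1, size=(4, 4)) for k in "zrh"}

h = np.zeros(4)
for price in [1.00, 1.02, 1.05]:  # toy normalised closing prices
    h = gru_step(np.array([price]), h, W, U)
print(h.shape)  # (4,)
```

With only two gates instead of the LSTM's three, the GRU has fewer parameters to train, which is exactly the "less effort" trade-off mentioned above.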
Garbage in the data — with a static solution?
High RMSE — with a static solution?
Data storage and a high-end device — once Jupyter notebooks came, now a revolution.
While facing the high-end-device and data-storage issues, I noticed that Google Colaboratory (click to go to Google Colab) exists for exactly this kind of work, and its hosted instances resolved my problem.
KEY DISCOVERY
Discovering Google Colaboratory fixed all my problems with storing the data and working on a low-end device.
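On Colab, the GPU runtime can be verified before training begins. This is a hedged sketch that degrades gracefully where TensorFlow is absent (it ships preinstalled on Colab, but not everywhere).

```python
# Check whether a GPU is visible to TensorFlow before starting training.
try:
    import tensorflow as tf
    gpu_count = len(tf.config.list_physical_devices("GPU"))
    print("GPUs visible:", gpu_count)
except ImportError:
    gpu_count = None
    print("TensorFlow is not installed in this environment")
```

On a Colab GPU runtime this typically reports one visible device; on a CPU runtime it reports zero.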
Seasonal Decomposition - Trend.
Google data : Seasonal Decompose.
For Google there is a very slow increasing trend until 2012, but after 2012 the trend rises exponentially.

Very high seasonality.
3.1 Google Decomposition
IMAGE


Microsoft data : Seasonal Decompose.
The Microsoft data is the same: a very slow increasing trend until 2012, but after 2012 the trend rises exponentially.
Very high seasonality.



3.2 Microsoft Decomposition
IMAGE

IBM data : Seasonal Decompose.
The IBM data is similar to Google and Microsoft; its trend spikes up from 2021.
3.3 IBM Decomposition
IMAGE
Amazon data : Seasonal Decompose.

The Amazon data is similar to Google's.
3.4 Amazon Decomposition
IMAGE
DATA PATTERNS
Google, Microsoft, IBM and Amazon - Individual
Analysing Data of Google — Normalising.
4.1 Google Normalise.
IMAGE
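The normalisation shown in Figures 4.1-4.4 maps each price series into [0, 1] before model fitting. A minimal sketch with scikit-learn's `MinMaxScaler`, on made-up prices:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Toy closing prices standing in for one company's series.
prices = np.array([[120.0], [135.0], [150.0], [180.0]])

# Scale into [0, 1]: (x - min) / (max - min).
scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(prices)
print(scaled.ravel())  # [0.   0.25 0.5  1.  ]
```

The same fitted scaler's `inverse_transform` maps the model's predictions back to the original price scale, which is needed before plotting them against the actual data.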
Analysing Data of Microsoft — Normalise.
4.2 Microsoft Normalise
IMAGE
Analysing Data of IBM — Normalise.
4.3 IBM Normalise.
IMAGE
Analysing Data of Amazon — Normalise.
4.4 Amazon Normalise.
IMAGE
FINAL PREDICTION
A low-error forecast.
The plots almost coincide.
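How closely the plots coincide is quantified by the RMSE used throughout this project. A minimal sketch with made-up actual and predicted prices (the project's real values ranged from 5.23 to 12.23):

```python
import numpy as np

# Hypothetical actual vs. predicted closing prices for a few days.
actual = np.array([150.0, 152.0, 155.0, 153.0])
predicted = np.array([149.0, 153.5, 154.0, 151.0])

# Root mean squared error: sqrt of the mean squared deviation.
rmse = np.sqrt(np.mean((actual - predicted) ** 2))
print(round(rmse, 3))  # 1.436
```

A lower RMSE means the predicted curve hugs the actual curve more tightly, which is exactly what "almost coinciding" plots look like.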
Google Prediction : Forecasting.
Click to focus
7.1 Google Prediction
INTERACTIVE
Microsoft Prediction : Forecasting.
Click to Focus
7.2 Microsoft Prediction
INTERACTIVE
IBM Prediction : Forecasting.
Click to focus
7.3 IBM Prediction
INTERACTIVE
Amazon Prediction : Forecasting.
Click to focus
7.4 Amazon Prediction
INTERACTIVE
RETROSPECTIVE
A bittersweet ending.
A HUGE SUCCESS
The forecasting plots were generated with over 85% accuracy, beating the usual error bounds!
Project Takeaways:
01
Big brand Stock Market
Viewing and analysing the stock prices of unicorn brands in the tech industry.
02
Leveraging existing resources
It led me into the deep ocean of stock market and machine learning resources.
03
Breaking habits
The habit of defaulting to ARIMA and plain LSTM models for time-series forecasting is finally broken.
04
Satisfaction for over accuracy
In general the RMSE sits around 15-20, but in my case it was 5.23-12.23, which pushed the model to roughly 85% real-world accuracy.