GitHub - xixi0222/Quantitative-Analysis-of-Fundamentals-by-Machine-Learning: Quantitative analysis of fundamentals in quarterly reports by Machine Learning

Quantitative Analysis of Fundamentals by Machine Learning

Brief Introduction

Quantitative analysis of fundamentals in quarterly reports using machine learning.
Preprocess the raw data before model training.
Use 20 fundamental attributes to fit the models.
Use 10 machine learning models for training and prediction.
Analyze the results after training each model.

Data Details

Target stock: 600188.SH (Yanzhou Coal Mining, 兖州煤业).
Time period: 2003 to 2019.
Fundamentals are extracted from the quarterly reports of the target company.
Fundamental attributes: gross revenue, revenue, total operating cost, selling expense, administration expense, financial expense, net investment income, operating profit, retained profit, net income attributable to parent company, basic EPS, total shares, total assets, current assets, total liabilities, current liabilities, minority equity, intangible assets, goodwill.
Machine Learning Models: AdaBoost Model, Decision Tree Model, Dummy Classifier Baseline Model, Gradient Boost Model, KNN Model, LogReg Model, Naive Bayes Model, Random Forest Model, SVM Model, XGBoost Model.
Data source: Wind
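
As a rough illustration of the model list above, the classifiers could be collected in one dictionary so the same training loop can be run over each of them. This is only a sketch: the scikit-learn class chosen for each name (for example `GaussianNB` for Naive Bayes) and the hyperparameters are assumptions, not taken from the notebooks. XGBoost lives in the separate `xgboost` package and is left out here to keep the example dependent on scikit-learn only.

```python
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical registry: names follow the README, classes are assumed mappings.
models = {
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Dummy Baseline": DummyClassifier(strategy="most_frequent"),
    "Gradient Boost": GradientBoostingClassifier(random_state=0),
    "KNN": KNeighborsClassifier(),
    "LogReg": LogisticRegression(max_iter=1000),
    "Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(random_state=0),
    "SVM": SVC(),
}

for name in models:
    print(name, type(models[name]).__name__)
```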

Algorithm Pipeline

Data Preprocessing

Preprocess the data: clean it and replace the raw values with percent differences.
Export the preprocessed data.
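
A minimal sketch of this step, assuming the percent difference is taken between consecutive rows (one row per report) and that cleaning means coercing values to numbers and dropping rows that cannot be computed. The function name `to_percent_change` and the sample columns are hypothetical; the actual notebook may clean the data differently.

```python
import numpy as np
import pandas as pd


def to_percent_change(df: pd.DataFrame) -> pd.DataFrame:
    """Replace each value with its percent change from the previous row."""
    # Coerce everything to numbers; unparseable cells become NaN.
    numeric = df.apply(pd.to_numeric, errors="coerce")
    # Percent change relative to the previous row.
    pct = numeric / numeric.shift(1) - 1.0
    # A zero in the previous row produces +/-inf; treat it as missing.
    pct = pct.replace([np.inf, -np.inf], np.nan)
    # The first row has no predecessor, so it is dropped with other gaps.
    return pct.dropna(how="any")


# Hypothetical sample with two of the fundamental attributes.
raw = pd.DataFrame(
    {"revenue": [100.0, 110.0, 99.0], "total_assets": [200.0, 200.0, 220.0]}
)
processed = to_percent_change(raw)
print(processed)
```

The result could then be written out with `processed.to_excel(...)`, which requires an Excel writer such as `openpyxl` to be installed.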

Create Labels

Derive trade decisions from the percent change of the high and low prices.
There are 3 trade decisions: sell, buy, and hold. These serve as the labels for machine learning later.
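
The README does not state the exact labeling rule, so the following is only a sketch under assumed rules: sell when the low price falls by at least a threshold, buy when the high price rises by at least that threshold, and hold otherwise. The 5% threshold, the function name `trade_decision`, the column names, and the choice to check the sell condition first are all hypothetical.

```python
import pandas as pd


def trade_decision(high_pct: float, low_pct: float, threshold: float = 0.05) -> str:
    """Map percent changes of the high and low price to a label (assumed rule)."""
    if low_pct <= -threshold:
        return "sell"  # the low price dropped sharply
    if high_pct >= threshold:
        return "buy"  # the high price rose sharply
    return "hold"  # neither move is large enough


# Hypothetical percent changes for three quarters.
prices = pd.DataFrame(
    {"high_pct": [0.08, 0.01, 0.02], "low_pct": [0.01, -0.07, -0.01]}
)
prices["decision"] = [
    trade_decision(h, l) for h, l in zip(prices["high_pct"], prices["low_pct"])
]
print(prices)
```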

Explore the data

Visualize the count of Buy, Hold, and Sell.
Check for any correlation between the future price and the current quarter's features.
Get and visualize feature importances using ExtraTreesClassifier.
Select the top 10 features by correlation and, separately, by importance, and save each list to an Excel file.
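
The two rankings could be computed roughly as follows. This sketch uses synthetic data, and the helper name `rank_features`, the column names, and the `ExtraTreesClassifier` settings are assumptions made for illustration; only the use of `ExtraTreesClassifier` and of correlation with the future price comes from the steps above.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import ExtraTreesClassifier


def rank_features(features, future_price, labels, n=10):
    """Return the top-n features by |correlation| and by tree importance."""
    # Absolute Pearson correlation of each feature with the future price.
    correlation = features.corrwith(future_price).abs().sort_values(ascending=False)
    # Impurity-based importances from an extra-trees ensemble.
    forest = ExtraTreesClassifier(n_estimators=100, random_state=0)
    forest.fit(features, labels)
    importance = pd.Series(
        forest.feature_importances_, index=features.columns
    ).sort_values(ascending=False)
    return correlation.head(n), importance.head(n)


# Synthetic stand-in data: only "revenue" drives the price and the label.
rng = np.random.default_rng(0)
demo = pd.DataFrame(
    rng.normal(size=(200, 4)),
    columns=["revenue", "operating_profit", "total_assets", "goodwill"],
)
future_price = 2.0 * demo["revenue"] + rng.normal(scale=0.1, size=200)
labels = np.where(demo["revenue"] > 0, "buy", "sell")

top_corr, top_importance = rank_features(demo, future_price, labels, n=2)
print(top_corr)
print(top_importance)
```

Each ranking is a pandas `Series`, so it could be saved with `to_excel`, for instance to files named like `top10_corr_features.xlsx` and `top10_features.xlsx`.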

Training and prediction of Machine Learning Model

Scale the Data.
Split the Data.
Plot confusion matrix.
Fit and Train.
Print out Evaluation Metrics.
Tune the model parameters.
Show the tuned results and choose the optimal one.
Refit the model with the best parameters.
Show the results and confusion matrix for the optimal parameters.
Save the optimal trained model to a .m file.
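
The steps above can be sketched for a single model as follows. Everything specific here is an assumption: the data is synthetic, KNN stands in for any of the models, the parameter grid is invented, and `joblib` is assumed as the way the `.m` file is written. The scaler is placed inside a `Pipeline` so that it is fitted on the training split only; that is a design choice of this sketch and not necessarily how the notebooks do it.

```python
import os
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic 3-class data standing in for the buy/hold/sell dataset.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale, then fit and evaluate an untuned model.
pipeline = Pipeline([("scale", StandardScaler()), ("clf", KNeighborsClassifier())])
pipeline.fit(X_train, y_train)
baseline_accuracy = accuracy_score(y_test, pipeline.predict(X_test))
print("baseline accuracy:", baseline_accuracy)

# Tune parameters with cross-validated grid search (hypothetical grid).
param_grid = {
    "clf__n_neighbors": [3, 5, 7, 9],
    "clf__weights": ["uniform", "distance"],
}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)
print("best parameters:", search.best_params_)

# Evaluate the refitted best model and show its confusion matrix.
best_model = search.best_estimator_
tuned_predictions = best_model.predict(X_test)
cm = confusion_matrix(y_test, tuned_predictions)
print("tuned accuracy:", accuracy_score(y_test, tuned_predictions))
print(cm)

# Save the tuned model; the file name is hypothetical.
model_path = os.path.join(tempfile.gettempdir(), "knn_model.m")
joblib.dump(best_model, model_path)
```

With only a few dozen quarterly rows, as in this project, a smaller number of cross-validation folds may be needed so that every fold still contains each class.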

Classify new data using well-trained models

Load a trained classifier and the new data for prediction.
Preprocess the new data.
Run the prediction and analyze the results.
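
A minimal sketch of reusing a saved classifier, assuming the model was stored with `joblib` and that the new rows have already been preprocessed the same way as the training data. The stand-in model, the random data, and the file name are hypothetical and exist only to make the example self-contained.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression

# Stand-in for a model saved earlier by the training step.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(90, 4))
y_train = np.repeat(["buy", "hold", "sell"], 30)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
model_path = os.path.join(tempfile.gettempdir(), "logreg_model.m")
joblib.dump(model, model_path)

# Load the classifier back and predict on new, already preprocessed rows.
loaded = joblib.load(model_path)
new_data = rng.normal(size=(5, 4))
predictions = loaded.predict(new_data)

# Summarize how often each trade decision was predicted.
decisions, counts = np.unique(predictions, return_counts=True)
for decision, count in zip(decisions, counts):
    print(decision, count)
```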

Limitation and future work

Because the data comes from the quarterly reports of a single firm, the dataset is quite small for training machine learning models, so the accuracy of the results cannot be guaranteed.
This project is best viewed as a complete practice exercise in applying machine learning models to the analysis of fundamentals, which then goes on to derive trade decisions from the trained classifiers.
In the future, I may collect data from more firms to train the models and achieve higher accuracy.

Folders and files

machine-learning-models-top-10-correlation
machine-learning-models-top-10-importance
save_models
README.md
class_count.png
classify_new_data.ipynb
correlation_feature_price.png
explore_dataset.ipynb
final_data.xlsx
fundamental_data.xlsx
preprocess_data.ipynb
price_data.xlsx
top10_corr_features.xlsx
top10_features.xlsx
