
AWS S3 & Sentiment Analysis, Basic Plotting with Matplotlib, & Supervised Learning & Machine Learning with Sklearn.


catherman/Data-Science-Miscellaneous


Overview

A. AWS S3 & Sentiment Analysis.ipynb


  1. Objective: Engineer a model for sentiment analysis of product reviews.

  2. Data: Download Amazon Product Reviews from AWS’s Registry of Open Data.

  3. Data prep: Use pandas, seaborn, and sklearn to clean, transform, and export the data in preparation for training a sentiment analysis model.

  4. Visualization: Plot sentiment metrics with word clouds, violin plots, & bar charts.

  5. Data lake: Take the first step in data lake formation by cataloging metadata with AWS Glue & querying it with Athena. Use Jupyter's custom magic commands with SQL to run Athena queries from the notebook.
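
The data prep step above could look roughly like the sketch below. It is an illustration under stated assumptions, not the notebook's actual code: the column names `star_rating` and `review_body`, the tiny made-up sample, and the rule mapping star ratings to sentiment labels (4 to 5 positive, 3 neutral, 1 to 2 negative) are all assumed here.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Tiny made-up sample standing in for the downloaded reviews.
# The column names are assumptions for illustration.
reviews = pd.DataFrame({
    "star_rating": [5, 4, 3, 2, 1, 5, 1, 3, 4, 2, 5, 1],
    "review_body": [
        "Love it!", "Pretty good", "It is okay", "Not great", "Terrible",
        "Excellent quality", None, "Average", "Works well", "Broke quickly",
        "  Would buy again  ", "Waste of money",
    ],
})


def rating_to_sentiment(rating: int) -> int:
    """Map a 1-5 star rating to -1 (negative), 0 (neutral), or 1 (positive).

    This mapping is an assumed convention, not taken from the notebook.
    """
    if rating >= 4:
        return 1
    if rating == 3:
        return 0
    return -1


# Clean: drop rows with no review text and trim surrounding whitespace.
clean = reviews.dropna(subset=["review_body"]).copy()
clean["review_body"] = clean["review_body"].str.strip()

# Transform: derive the sentiment label from the star rating.
clean["sentiment"] = clean["star_rating"].apply(rating_to_sentiment)

# Split: hold out part of the data, stratified by sentiment label.
train_df, test_df = train_test_split(
    clean, test_size=0.3, stratify=clean["sentiment"], random_state=42
)

# Export: serialize to CSV text; passing a file path to to_csv
# would write the split to disk instead.
train_csv = train_df.to_csv(index=False)
test_csv = test_df.to_csv(index=False)
```

Stratifying the split keeps the share of each sentiment class similar in the training and test sets, which matters when review ratings are skewed toward one end.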


B. Basic Plotting with Matplotlib.ipynb

C. Supervised Learning & Machine Learning with Sklearn.ipynb

  1. Objective: Use machine learning models to classify tumor observations as benign or malignant.

  2. Data: Breast cancer data. Exploratory data analysis with visualization. Data preparation: Train/test split, plus a dummy classifier as a baseline.

  3. Models: Fit logistic regression, decision tree, & random forest. Analyze & visualize models' performance: Confusion matrix, accuracy, precision, recall, & ROC/AUC.
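
The workflow in section C can be sketched as follows. This is a minimal example under assumptions, not the notebook's code: it assumes the data is sklearn's bundled `load_breast_cancer` dataset, and the split ratio, scaling step, and hyperparameters are illustrative choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Assumed data source: sklearn's bundled breast cancer dataset,
# where label 0 = malignant and 1 = benign.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The dummy classifier always predicts the most frequent class and
# serves as a baseline the real models should beat.
models = {
    "dummy": DummyClassifier(strategy="most_frequent"),
    "logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # Probability of class 1 (benign), used as the score for ROC AUC.
    y_score = model.predict_proba(X_test)[:, 1]
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred),
        "roc_auc": roc_auc_score(y_test, y_score),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
    }

for name, metrics in results.items():
    print(
        f"{name}: accuracy={metrics['accuracy']:.3f} "
        f"precision={metrics['precision']:.3f} "
        f"recall={metrics['recall']:.3f} "
        f"roc_auc={metrics['roc_auc']:.3f}"
    )
```

Precision and recall here treat class 1 (benign) as the positive class, which is sklearn's default for binary labels. If malignant cases are the class of interest, passing `pos_label=0` to `precision_score` and `recall_score` would report them for that class instead.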