The binary build of LEO CDP Free Edition for training purposes
Updated Jul 16, 2024 - HTML
One-line data quality profiling and exploratory data analysis for Pandas and Spark DataFrames.
Visual, interactive queries against big databases
A multi-cloud framework for big data analytics and embarrassingly parallel jobs that provides a universal API for building parallel applications in the cloud ☁️🚀
Degree diploma project
Google Cloud Platform Projects, Workshop Training and Skill Badge
Analyzed Zomato's dataset for insights into restaurant ratings and consumer behavior using advanced data preprocessing, EDA, machine learning, and interactive dashboards for intuitive exploration.
Leveraging PySpark to analyze the IMDB database, answer various queries, and develop machine learning models to predict a movie's popularity based on its cast
vineyard (v6d): an in-memory immutable data manager. (Project under CNCF, TAG-Storage)
TIL (Today I Learned)
Unofficial PySpark implementation of the study "Home monitoring for older singles: A gas sensor array system"
Logistic regression modeling of swing state voter turnout to support U.S. political campaign proposals
This project utilizes big data analytics, machine learning, and statistical methods to identify and classify adverse effects of COVID-19 vaccinations. By analyzing large datasets, it aims to uncover patterns and correlations, providing valuable insights into vaccine safety and efficacy.
📘 FIWARE 306: Real-time Processing of Context Data using Apache Spark
📘 FIWARE 305: Real-time Processing of Context Data using Apache Flink
Study material for data visualization, SMUR, BDA, etc.
Big data analysis of NYC fire incident data to examine the causal relationship between fires, government inspections, socioeconomic factors, and the environment. Uses Hadoop MapReduce for data preprocessing, Trino for complex queries, and Tableau for visualizations and interactive dashboards.
A credit card fraud detection system that sends transaction data to a Kafka topic and processes it to detect fraud using predefined rules or a machine learning model, triggering alerts for fraudulent transactions.