POC in Apache Kafka and Spark Streaming using Avro serialization.
Updated Sep 6, 2018 - Scala
ETL pipeline with AWS Redshift orchestrated with Airflow
Code for data flow between models, data post-processing, and visualization
Udacity Data Engineering Nanodegree - Project #2
Short course: Introduction to Machine Learning
Transforms an Airbnb data set using dbt and Snowflake, then visualizes the data using Preset
Data pipeline to gather data from chess website APIs using Airflow.
Analysis of computer game sales
An end-to-end data pipeline deployed on GCP that extracts cryptocurrency data for analytics.
Convolutional neural network capable of detecting brain tumors and their respective locations from 5,712 MRI brain scans
A big data project that builds a real-time data pipeline to analyze the popularity and sentiment of trending topics on Twitter.
Mini project for the Database Technologies course. The task is to take in data via a pipeline built with Spark Streaming and Kafka, and store the processed data in a SQLite database for further manipulation.
A data pipeline project that leverages Docker and PostgreSQL for efficient data processing and analysis tasks. Uses containerization to ensure portability and reproducibility of the data pipeline.
Deployable AWS data platform to process powerlifting data extracted from openpowerlifting.org.
Python module that adds Unix-like pipe operations and adapts common Python functions
Proof of concept application for medium and large scale data acquisition
In this project, you'll find a data set containing real messages that were sent during disaster events. You will create a machine learning pipeline to categorize these events so that the messages can be sent to an appropriate disaster relief agency.
ETL pipeline for Turkish football events
This is the repository of RD Investments, founded by Andrew Chen and Nathan Ng.
Intro to Data Engineering