delta-lake
Here are 145 public repositories matching this topic...
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Updated Jul 17, 2024 - Java
StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries.
Updated Jul 16, 2024 - Java
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
Updated Jul 16, 2024 - Scala
The GitHub repository for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
Updated May 8, 2024 - Scala
A native Rust library for Delta Lake, with bindings into Python
Updated Jul 16, 2024 - Rust
Create full-fledged APIs for slowly moving datasets without writing a single line of code.
Updated May 29, 2024 - Rust
An open protocol for secure data sharing
Updated Jul 16, 2024 - Scala
Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.
Updated Jul 16, 2024 - Java
Amazon SageMaker Local Mode Examples
Updated Jun 19, 2024 - Python
Real-time data warehouse with Apache Flink, Apache Kafka, and Apache Hudi
Updated Dec 15, 2023 - Dockerfile
Module 1 exercises - Bootcamp EDC - IGTI 2021
Updated Mar 26, 2023 - Python
The Lakehouse Engine is a configuration-driven Spark framework, written in Python, that serves as a scalable, distributed engine for several lakehouse algorithms, data flows, and utilities for Data Products.
Updated Jun 17, 2024 - Python
Jupyter notebooks and AWS CloudFormation template to show how Hudi, Iceberg, and Delta Lake work
Updated Jul 13, 2022 - Jupyter Notebook
Sample project to demonstrate data engineering best practices
Updated Feb 24, 2024 - Python
A pipeline for streaming data changes to a data lake with Debezium and Delta Lake
Updated Feb 15, 2023 - HTML
Python framework for building efficient data pipelines. It promotes modularity and collaboration, enabling the creation of complex pipelines from simple, reusable components.
Updated Jul 9, 2024 - Python
Spark Structured Streaming examples using version 3.5.1
Updated Apr 27, 2024 - Scala