The open-source tool for building high-quality datasets and computer vision models
Archaeological Map of the Czech Republic (Archeologická mapa České republiky)
Digital archive of the AMČR
Scalable toolkit for data curation
A web service for semi-automated conversion of raw imaging data to BIDS
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
Interactively explore unstructured datasets from your dataframe.
Client interface for all things Cleanlab Studio
Rebalancing chemical reaction
Curation of BIDS (CuBIDS): A sanity-preserving software package for processing BIDS datasets.
🧼🔎 A holistic self-supervised data cleaning strategy to detect irrelevant samples, near duplicates and label errors.
A curated, but incomplete, list of data-centric AI resources.
fastdup is a free tool for rapidly extracting insights from your image and video datasets. It helps you improve the quality of your dataset's images and labels and reduce your data operations costs at scale.
Package that builds a JSON inventory/manifest from public primary or derived datasets
Python package to make URL extraction, generalization, validation, and filtration easy.
Repository for Data Curation Process Ontology
Code and data for "Target-oriented Proactive Dialogue Systems with Personalization: Problem Formulation and Dataset Curation" (EMNLP 2023)
Gene Curator is an open-source platform for managing and curating genetic data. It facilitates gene data analysis, entry, and reporting, serving genetics researchers with tools for efficient data handling.
Demo showing how the Trustworthy Language Model adds reliability to LLM outputs and improves RAG, agents, and data enrichment workflows. It can be used to improve fine-tuning of LLMs, accuracy of LLM outputs, and smart routing for RAG and agents.