asr

Here are 1,040 public repositories matching this topic...

NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

machine-translation tts speech-synthesis neural-networks deeplearning speaker-recognition asr multimodal speech-translation large-language-models speaker-diariazation generative-ai

Updated Jul 16, 2024
Python

mkiol / dsnote

Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.

text-to-speech translator translation offline machine-translation sailfishos tts speech-synthesis speech-recognition speech-to-text nmt linux-desktop stt asr flatpak-applications

Updated Jul 16, 2024
C++

AssemblyAI / assemblyai-ruby-sdk

The AssemblyAI Ruby SDK provides an easy-to-use interface for interacting with the AssemblyAI API, which supports async and real-time transcription, audio intelligence models, as well as the latest LeMUR models.

ruby ai speech-to-text transcription stt asr assemblyai llm

Updated Jul 16, 2024
Ruby

AssemblyAI / assemblyai-java-sdk

The AssemblyAI Java SDK provides an easy-to-use interface for interacting with the AssemblyAI API, which supports async and real-time transcription, audio intelligence models, as well as the latest LeMUR models.

java ai speech-to-text transcription stt asr assemblyai llm

Updated Jul 16, 2024
Java

FunAudioLLM / SenseVoice

Multilingual Voice Understanding Model

python ai pytorch speech-recognition speech-to-text asr cross-lingual speech-emotion-recognition audio-event-classification aigc llm gpt-4o

Updated Jul 16, 2024
Python

speechbrain / speechbrain

A PyTorch-based Speech Toolkit

Updated Jul 16, 2024
Python

k2-fsa / sherpa-onnx

Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter

android windows macos linux raspberry-pi ios text-to-speech csharp cpp dotnet speech-to-text aarch64 mfc risc-v asr arm32 onnx vits openkylin

Updated Jul 16, 2024
C++

PeterH0323 / Streamer-Sales

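Streamer-Sales (销冠, "Sales Champion"): a live-stream sales host LLM 🛒🎁 that presents a product based on its given features, pitched to spark the viewer's desire to buy. 🚀⭐ Includes a detailed data-generation pipeline ❗ 📦 It also integrates LMDeploy accelerated inference 🚀, RAG (retrieval-augmented generation) 📚, TTS (text-to-speech) 🔊, digital human generation 🦸, an Agent that queries the web for real-time information 🌐, and ASR (speech-to-text) 🎙️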

chat chatbot text-generation tts gpt chat-application asr rag digital-human llm chatgpt internlm-chat-7b internlm2 meta-human

Updated Jul 16, 2024
Python

winstxnhdw / CapGen

A fast CPU-first video/audio transcriber for generating caption files with Whisper and CTranslate2, hosted on Hugging Face Spaces.

docker caddy automatic-speech-recognition whisper asr fastapi uvicorn-gunicorn huggingface huggingface-spaces ctranslate2

Updated Jul 16, 2024
Python

jmaczan / asr-dysarthria

Research on Automatic Speech Recognition for dysarthric speech

deep-learning automatic-speech-recognition asr self-supervised-learning dysarthric-speech wav2vec2 dysarthria

Updated Jul 16, 2024
Jupyter Notebook

PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Updated Jul 16, 2024
Python

flozi00 / atra

An open-source NLP-as-a-service project focused on making state-of-the-art systems easy to use. Training and inference run through simple Docker commands.

chatbot speech transformers inference speech-recognition asr llm stable-diffusion

Updated Jul 16, 2024
Jupyter Notebook

k2-fsa / sherpa

Speech-to-text server framework with next-gen Kaldi

python cpp websocket pytorch speech-recognition transducer asr ctc end-to-end-asr

Updated Jul 16, 2024
C++

wzpan / wukong-robot

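🤖 wukong-robot is a simple, flexible, and elegant Chinese-language voice-dialogue robot / smart speaker project. It supports multi-turn conversation with ChatGPT and may be the first open-source smart speaker project to support brain-computer interaction.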

alexa ai amazon-echo muse tts openai google-home unit bci speaker homeassistant snowboy asr anyq raspeberry-pi gpt3 chatgpt

Updated Jul 16, 2024
Python

innerNULL / mia

My Implementations' Archive

audio nlp training crawler machine-learning youtube deep-learning corpus youtube-dl dataset youtube-downloader data-collection asr paper-implementations youtube-crawler

Updated Jul 16, 2024
Python

DmitryRyumin / ICASSP-2023-24-Papers

ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!