ASR 2Pass onnxruntime and websocket server, based on FunASR (https://github.com/alibaba-damo-academy/FunASR).
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
On-device voice activity detection (VAD) powered by deep learning
Code for ICASSP 2024 paper WhisperSeg: Positive Transfer of the Whisper Speech Transformer to Human and Animal Voice Activity Detection
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Runtime Audio Importer plugin for Unreal Engine. Importing audio of various formats at runtime.
Speech-to-Text based on silero-vad + whisper.cpp (GGML STT) for ROS 2
An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines
A Python package to build AI-powered real-time audio applications
CNN-based audio segmentation toolkit. Detects speech, music, noise, and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.
Automatically synchronize and translate subtitles, or create new ones by transcribing, using pre-trained DNNs, Forced Alignments and Transformers. https://subaligner.readthedocs.io/
Tr-VAD: An Efficient Transformer based Voice Activity Detection Model
Introduction to Speech Processing
OVOS plugin for voice activity detection using Silero VAD
Voice activity detection (VAD) library for speech-end detection, based on WebRTC's VAD engine
Hello, and welcome to my Data Science Portfolio. It collects what I have learned along the way, including case studies, papers, and code. Please check the README.
♂️♀️ Detect a person's gender from a voice file (90.7% +/- 1.3% accuracy).
Voice Activity Detection (VAD) AudioWorklet
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies