Text To Speech

PulseAugur coverage of Text To Speech: every cluster mentioning Text To Speech across labs, papers, and developer communities, ranked by signal.

Total: 9 over 90d

Releases: 0 over 90d

Papers: 7 over 90d

Sentiment · 30d: 3 day(s) with sentiment data

RECENT · PAGE 1/1 · 9 TOTAL

TOOL · CL_79909 · Jun 9 · 04:00

End-to-end training unifies TTS components for better speech generation

Researchers have developed a novel end-to-end training framework for discrete token Large Language Model (LLM) based Text-to-Speech (TTS) systems. This approach unifies the training of the speech tokenizer, LLM, a flow-…

TOOL · CL_69223 · Jun 3 · 16:29

Tongyi Labs releases top-charting STT/TTS models, open weights questioned

Tongyi Labs has released new speech-to-text (STT) and text-to-speech (TTS) models that are reportedly topping charts. The models were released without significant fanfare, leading to community questions about whether th…

RESEARCH · CL_68139 · Jun 2 · 17:46

LLMs generate synthetic conversations to boost ASR training

Researchers have developed a novel method to enhance Automatic Speech Recognition (ASR) training for low-resource languages by generating synthetic conversational data. This pipeline uses LLMs to create dialogues, maps …

TOOL · CL_19446 · May 6 · 13:58

AMD EPYC CPUs show competitive performance for LLM and TTS inference workloads

A recent analysis by Leaseweb benchmarks the performance of AMD EPYC 9334 CPUs for Large Language Model (LLM) and Text-to-Speech (TTS) inference workloads. The study reveals that while GPUs offer higher throughput, CPUs…

RESEARCH · CL_13577 · May 3 · 07:47

Sakana AI's KAME architecture injects LLM knowledge into speech AI without latency

Sakana AI has developed KAME, a novel tandem architecture for speech-to-speech AI that aims to combine the speed of direct systems with the knowledge depth of LLM-based approaches. KAME operates with two asynchronous co…

RESEARCH · CL_09296 · Apr 29 · 16:36

Tamazight single-speaker speech dataset released on Hugging Face

A new single-speaker speech dataset for the Tamazight language has been released on Hugging Face and the Mozilla Data Collective. This dataset is intended for use in AI applications such as automatic speech recognition …

RESEARCH · CL_08610 · Apr 29 · 04:00

Researchers enhance elderly ASR with LLM paraphrasing and speech synthesis

Researchers have developed a novel data augmentation technique to improve automatic speech recognition (ASR) for elderly individuals. This method utilizes large language models to paraphrase existing transcripts, genera…

RESEARCH · CL_08270 · Apr 28 · 10:28

New benchmark evaluates Indic TTS accent fidelity across six dimensions

Researchers have introduced PSP, a new benchmark designed to evaluate the accent accuracy of text-to-speech (TTS) systems for Indic languages. Unlike existing metrics that focus on intelligibility and naturalness, PSP s…

RESEARCH · CL_02967 · Apr 23 · 09:44

New study evaluates 7 TTS systems for 10 Indian languages

Researchers have developed a new framework for evaluating Text-to-Speech (TTS) systems in Indian languages, addressing the high variance typically seen in crowdsourced evaluations. This framework uses controlled, multid…
