PulseAugur

Text To Speech

PulseAugur coverage of Text To Speech — every cluster mentioning Text To Speech across labs, papers, and developer communities, ranked by signal.

Total · 30d: 9 (9 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 7 (7 over 90d)
Sentiment · 30d: 3 days with sentiment data

RECENT · PAGE 1/1 · 9 TOTAL
  1. TOOL · CL_79909 ·

    End-to-end training unifies TTS components for better speech generation

    Researchers have developed a novel end-to-end training framework for discrete-token, Large Language Model (LLM)-based Text-to-Speech (TTS) systems. This approach unifies the training of the speech tokenizer, LLM, a flow-…

  2. TOOL · CL_69223 ·

    Tongyi Labs releases top-charting STT/TTS models, open weights questioned

    Tongyi Labs has released new speech-to-text (STT) and text-to-speech (TTS) models that are reportedly topping charts. The models were released without significant fanfare, leading to community questions about whether th…

  3. RESEARCH · CL_68139 ·

    LLMs generate synthetic conversations to boost ASR training

    Researchers have developed a novel method to enhance Automatic Speech Recognition (ASR) training for low-resource languages by generating synthetic conversational data. This pipeline uses LLMs to create dialogues, maps …

  4. TOOL · CL_19446 ·

    AMD EPYC CPUs show competitive performance for LLM and TTS inference workloads

    A recent analysis by Leaseweb benchmarks the performance of AMD EPYC 9334 CPUs for Large Language Model (LLM) and Text-to-Speech (TTS) inference workloads. The study reveals that while GPUs offer higher throughput, CPUs…

  5. RESEARCH · CL_13577 ·

    Sakana AI's KAME architecture injects LLM knowledge into speech AI without latency

    Sakana AI has developed KAME, a novel tandem architecture for speech-to-speech AI that aims to combine the speed of direct systems with the knowledge depth of LLM-based approaches. KAME operates with two asynchronous co…

  6. RESEARCH · CL_09296 ·

    Tamazight single-speaker speech dataset released on Hugging Face

    A new single-speaker speech dataset for the Tamazight language has been released on Hugging Face and the Mozilla Data Collective. This dataset is intended for use in AI applications such as automatic speech recognition …

  7. RESEARCH · CL_08610 ·

    Researchers enhance elderly ASR with LLM paraphrasing and speech synthesis

    Researchers have developed a novel data augmentation technique to improve automatic speech recognition (ASR) for elderly individuals. This method utilizes large language models to paraphrase existing transcripts, genera…

  8. RESEARCH · CL_08270 ·

    New benchmark evaluates Indic TTS accent fidelity across six dimensions

    Researchers have introduced PSP, a new benchmark designed to evaluate the accent accuracy of text-to-speech (TTS) systems for Indic languages. Unlike existing metrics that focus on intelligibility and naturalness, PSP s…

  9. RESEARCH · CL_02967 ·

    New study evaluates 7 TTS systems for 10 Indian languages

    Researchers have developed a new framework for evaluating Text-to-Speech (TTS) systems in Indian languages, addressing the high variance typically seen in crowdsourced evaluations. This framework uses controlled, multid…