Classical Arabic Text-to-Speech Corpus
- 12 hours
- 1 male speaker
- 9,705 utterances
- TTS, ASR
BibTeX
@inproceedings{kulkarni2023clartts,
title={ClArTTS: An Open-Source Classical Arabic Text-to-Speech Corpus},
author={Kulkarni, Ajinkya and Kulkarni, Atharva and Shatnawi, Sara Abedalmon'em Mohammad and Aldarmaki, Hanan},
booktitle={Proc. Interspeech 2023},
pages={5511--5515},
year={2023},
doi={10.21437/Interspeech.2023-2224}
}
The Classical Arabic Text-to-Speech (ClArTTS) corpus is built from audio in the LibriVox project (public domain). Specifically, we used a single audiobook, Kitab Adab al-Dunya w'al-Din (972 - 1058 AD), recorded by a male speaker. The audio is sampled at 40,100 Hz. We processed the original audio and segmented it into utterances of 2 to 10 seconds, discarding samples that diverged in speaking style. In total, we kept around 12 hours of audio and split it into train and test subsets of 9,500 and 205 utterances, respectively. Before segmentation, we recruited native Arabic speakers to manually transcribe and validate the audio, including full diacritics. The dataset has been used for research on Arabic text-to-speech, ASR, and diacritic restoration. See the paper for more details on dataset construction and text-to-speech baselines.
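
As a quick consistency check on the figures stated above, the following Python sketch derives the total utterance count, the approximate mean utterance duration, and the test share from the numbers on this page. It is illustrative only and not part of the dataset tooling: the helper `keep_segment` is a hypothetical name, and the duration filter is an assumption about how the 2 to 10 second range might be applied, not the authors' actual segmentation procedure.

```python
# Figures taken from this page; "12 hours" is approximate.
TOTAL_HOURS = 12
TRAIN_UTTS = 9_500
TEST_UTTS = 205
MIN_SEG_S, MAX_SEG_S = 2.0, 10.0  # stated segment length range, in seconds

total_utts = TRAIN_UTTS + TEST_UTTS
avg_seconds = TOTAL_HOURS * 3600 / total_utts
test_fraction = TEST_UTTS / total_utts


def keep_segment(duration_s, min_s=MIN_SEG_S, max_s=MAX_SEG_S):
    """Hypothetical filter: True if the duration lies within the stated range."""
    return min_s <= duration_s <= max_s


print(f"utterances: {total_utts}")             # utterances: 9705
print(f"mean duration: {avg_seconds:.2f} s")   # mean duration: 4.45 s
print(f"test share: {test_fraction:.1%}")      # test share: 2.1%
```

The mean of roughly 4.45 seconds falls inside the stated 2 to 10 second range, and the train and test counts sum to the 9,705 utterances listed in the summary.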