Li Ai-jun, Yin Zhi-gang
Speech corpora are the basis both for analyzing the characteristics of speech signals and for developing speech synthesis and recognition systems. As computing power and speech technology have developed, corpus content has become increasingly complex and corpus size increasingly large. Chinese speech corpora can be categorized according to content, speaking style, channel property, phonetic coverage, dialectal accent or application area.
In mainland
We have so many kinds and such a large number of Chinese speech corpora that it is important to be able to share them conveniently, both to avoid wasting time and money and to make research more efficient. One of the problems in sharing these corpora is the lack of general specifications for corpus collection, annotation and distribution.
The primary goal of this research is to find a standard procedure for speech corpus production, so that corpora can be established more efficiently and used or shared more easily.
RASC863 (Regional Accent Speech Corpus funded by the National 863 Project), a large speech corpus of Chinese spoken with 10 regional accents, is introduced to illustrate the standardization of speech corpus production.
A Chinese regional accent speech corpus is a corpus of spoken Chinese, a language that comprises many regional variants called dialects. Although these dialects share a common written form, they are to a large extent mutually unintelligible. There are 10 major dialects in
The standardization system of the RASC863 speech corpus consists of two parts: the steps of speech corpus production, and the specifications that should be followed in these steps.
Generally speaking, corpus production can be divided into 9 steps: making the various specifications, preparing for collection, pre-collection, pre-validation, the real collection, annotation, compiling lexical dictionaries or lexical frequency tables, post-validation and distribution.
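As an illustration only, the nine steps listed above can be written down as an ordered enumeration. This is a minimal sketch, not part of RASC863: the names `ProductionStep`, `next_step` and `remaining_steps` are hypothetical, and treating the steps as strictly sequential is an assumption made here for simplicity.

```python
from enum import IntEnum


class ProductionStep(IntEnum):
    """The nine production steps, in the order the text lists them."""
    MAKE_SPECIFICATIONS = 1
    PREPARE_COLLECTION = 2
    PRE_COLLECTION = 3
    PRE_VALIDATION = 4
    REAL_COLLECTION = 5
    ANNOTATION = 6
    COMPILE_LEXICON = 7      # lexical dictionaries or frequency tables
    POST_VALIDATION = 8
    DISTRIBUTION = 9


def next_step(current):
    """Return the step after `current`, or None once distribution is reached.

    Assumes the steps run strictly in sequence (an assumption of this sketch).
    """
    if current is ProductionStep.DISTRIBUTION:
        return None
    return ProductionStep(current + 1)


def remaining_steps(completed):
    """List the steps not yet completed, in production order."""
    return [step for step in ProductionStep if step not in completed]
```

For example, `next_step(ProductionStep.PRE_VALIDATION)` gives `ProductionStep.REAL_COLLECTION`, which reflects that the real collection starts only after the pilot data has been validated.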
The primary specifications include: specification of speakers, describing speaker information such as age, sex, education, accent, etc.; specification of prompt design, describing the rules of the prompt design process, the speaking type, and the phonetic and linguistic requirements; specification of recording, describing the recording software, the recording equipment and the acoustic environment; specification of data, describing the format and index of the data; specification of annotation, describing the annotation system; legal documents; specification of validation, evaluating the value of the corpus; and specification of distribution, describing the plan, rules and media (CD/DVD) of distribution.
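To make the speaker specification concrete, the sketch below shows one way a speaker record could be stored and checked for completeness. It is a hypothetical illustration, not the RASC863 data format: only the fields age, sex, education and accent come from the text, while `speaker_id`, `SpeakerRecord` and `missing_fields` are names assumed here.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class SpeakerRecord:
    """Speaker information named in the speaker specification."""
    speaker_id: str                    # hypothetical identifier, not from the source
    age: Optional[int] = None
    sex: Optional[str] = None
    education: Optional[str] = None
    accent: Optional[str] = None


def missing_fields(record):
    """Return the names of fields that have not been filled in yet."""
    return [name for name, value in asdict(record).items() if value is None]
```

A check of this kind could run during pre-validation, so that incomplete speaker information is caught before the real collection begins.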
Key words: phonetics, speech corpus, standardization, production, specification
Name: Li-Aijun
Sex: Female
Birth: Oct. 1966
Title: Professor
Duty: Director of Phonetics Laboratory,
Research Interest: Speech prosody, expressive speech, accented speech in L2, speech corpus production and annotation.
E-mail: liaj@cass.org.cn, Tel: +86-10-65237408
Name: Yin-Zhigang
Sex: Male
Birth: Oct. 1977
Title: Research Assistant
Research Interest: Speech prosody, speech corpus production and annotation
E-mail: yinzhg@cass.org.cn, Tel: +86-10-65237408
===================================================================
Li-Aijun:
Director and professor of Phonetics Laboratory,
speech corpus production and
annotation.
E-mail: liaj@cass.org.cn, Tel: +86-10-65237408
Prof. Yin-Zhigang: Research assistant of Phonetics Laboratory,
E-mail: yinzhg@cass.org.cn, Tel: +86-10-85195394