BEIJING, Dec. 15, 2020 /PRNewswire/ -- iQIYI Inc. (NASDAQ: IQ)
("iQIYI" or the "Company"), an innovative market-leading online
entertainment service in China, is pleased to announce that it has
partnered with multiple organizations to hold a Multi-Speaker
Multi-Style Voice Cloning Challenge (M2VoC) scheduled to run from
27 November 2020 to 11 February 2021.
M2VoC aims to enhance the quality of synthetic speech while
reducing the dependence on the quantity and quality of training
datasets. The Company hopes that participants can improve the
intelligibility and naturalness of synthetic speech even under
conditions in which there are limited resources.
iQIYI released detailed guidelines for M2VoC, the first
low-resource voice cloning challenge in the world, on 27 November.
Organized by a team of iQIYI experts and a number of organizations,
the challenge aims to provide a common dataset and a fair test
platform that facilitate research on voice cloning tasks.
As an ICASSP 2021 Signal Processing Grand Challenge, M2VoC
encourages researchers from academia and the computing industry to
participate.
The competition comprises two categories: the 'few-shots'
category and the 'one-shot' category. Target speakers for voice
cloning validation and evaluation are provided for both
categories.
In the few-shots category, each speaker has a different speaking
style with 100 available samples. In the one-shot category, each
speaker has a different speaking style with only 5 samples.
For both categories, contestants will be provided with two base
datasets for base model training, with each dataset containing
5,000 different training samples of different speech styles.
The winners will be selected for each category based on a
weighted value of four criteria: speaker similarity, speech
quality, style/expressiveness and pronunciation accuracy.
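As a rough illustration of how a weighted value of the four criteria could be computed, the sketch below combines per-criterion ratings into a single score. The equal weights, the 1-to-5 rating scale, the example ratings and the name `weighted_score` are all assumptions made for illustration; the announcement does not state the actual weights or rating scale.

```python
# Hypothetical weights: the announcement does not publish the real ones.
CRITERIA_WEIGHTS = {
    "speaker_similarity": 0.25,
    "speech_quality": 0.25,
    "style_expressiveness": 0.25,
    "pronunciation_accuracy": 0.25,
}


def weighted_score(ratings, weights=CRITERIA_WEIGHTS):
    """Return the weighted average of the per-criterion ratings."""
    missing = set(weights) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    total_weight = sum(weights.values())
    # Dividing by the total keeps the result on the rating scale
    # even if the weights do not sum to exactly 1.
    return sum(weights[name] * ratings[name] for name in weights) / total_weight


# Made-up ratings on an assumed 1-5 scale for one submission.
example_ratings = {
    "speaker_similarity": 4.0,
    "speech_quality": 3.0,
    "style_expressiveness": 5.0,
    "pronunciation_accuracy": 4.0,
}
print(weighted_score(example_ratings))
```

Changing the entries in `CRITERIA_WEIGHTS` shows how a different emphasis, for example on speaker similarity, would reorder submissions with otherwise similar ratings.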
As an innovative technology in the field of artificial
intelligence (AI), speech synthesis is essential for creating a
good interactive experience. As speech synthesis has valuable
applications in areas such as voice assistants, broadcasting and
audio books, it is a fast-growing field. The global market for
speech recognition and speech-related technologies is projected to
expand to $16 billion in the next seven to eight years, at a
compound annual growth rate of 16%, according to market research
firm Global Industry Analysts.
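The projection gives a target size and a growth rate but not the starting market size. Under the standard compound-growth relation, final = initial x (1 + rate)^years, the implied starting size can be back-computed. The sketch below does this for both ends of the stated horizon. It assumes the 16% rate applies uniformly each year and that the horizon is exactly seven or eight years; the function name `implied_initial_size` is hypothetical, and the results are an arithmetic consequence of those assumptions, not figures from the cited research firm.

```python
def implied_initial_size(final_size, annual_rate, years):
    """Invert final = initial * (1 + rate) ** years to recover initial."""
    if years < 0:
        raise ValueError("years must be non-negative")
    return final_size / (1.0 + annual_rate) ** years


TARGET_BILLION_USD = 16.0  # projected market size from the text
CAGR = 0.16                # compound annual growth rate from the text

for horizon in (7, 8):     # "seven to eight years"
    start = implied_initial_size(TARGET_BILLION_USD, CAGR, horizon)
    print(f"{horizon} years: implied starting size of about ${start:.2f} billion")
```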
Thanks to deep learning, speech synthesis has been able to
produce very realistic and natural-sounding speech in specific
areas. However, the technology requires large amounts of training
data and highly demanding recording conditions. As a result,
technological advancement in the field has been hindered by the
capital and time required for dataset creation. There is still much
room for improvement in the expressiveness and robustness of
synthetic speech with different speakers and various styles,
especially in real-world or low-resource conditions. iQIYI hopes
that M2VoC will help to address these issues and accelerate the
development of AI voice technology.
The competition will also drive the development of cutting-edge
technologies such as voice cloning and speech recognition, further
broadening the application scope of AI and creating new
opportunities in the audiovisual industry. Through this challenge,
iQIYI hopes to team up with talented researchers and build
solutions for low-resource voice cloning with advanced
deep-learning technology and multi-stylistic voice morphing
technology. The Company also anticipates that M2VoC will further
elevate the interactive experience of video and drive the
development and application of voice cloning technology.
In recent years, iQIYI has been leveraging AI to enable content
creation, enhance users' entertainment experience and improve
iQIYI's growing entertainment ecosystem. Currently, iQIYI's AI
technology has been applied to a whole set of processes including
content creation, production, distribution and commercialization.
In the years ahead, iQIYI will continue to explore AI voice
technology, unlocking its tremendous potential for use in the
multi-media entertainment industry so that the Company can create a
better audio-visual world for its users.
View original content: http://www.prnewswire.com/news-releases/iqiyi-holds-worlds-first-low-resource-voice-cloning-challenge-to-accelerate-development-of-ai-voice-technology-301192766.html
SOURCE iQIYI