CREATE

How this engineer is improving speech recognition for young tech users

By Michelle Wheeler
7 April 2021
in Software
Speech recognition software struggles to understand children’s voices. A new Australian project hopes to change that.

Researchers are hoping to capture the voices of hundreds of Australian children in a bid to improve speech recognition for young tech users. 

Until now, the speech recognition software behind virtual assistants like Google Assistant, Alexa and Siri has relied on a database of adult voices. But AusKidTalk — a new joint project of five universities — aims to change that.

The project’s team of engineers, linguists, psychologists and speech pathologists are creating a unique database of Australian kids’ voices, and they say the benefits could extend to new learning and speech therapy tools for children.

Beena Ahmed is an expert in the use of signal processing to understand speech.

University of New South Wales electrical engineer Dr Beena Ahmed researches the use of signal processing to understand speech.

She’s been studying children’s speech for more than a decade, coming up with new tools and technology to help Australian kids.

“The biggest issue that I have faced in my research is that we don’t have children’s speech databases that we can use in developing these tools,” Ahmed said.

“There are huge databases for adult speech but for children’s speech, there are not that many databases around the world. And with speech, accents make a huge difference … so something built for American accents might not necessarily work well with Australian ones.”

Unable to find a database of children’s voices to support her research, Ahmed decided to build one. She reached out to like-minded researchers in Sydney and Melbourne, and AusKidTalk was born.

Ahmed and her fellow engineers will use the database to develop new speech recognition systems for younger users. Linguists and psychologists, meanwhile, will use it to better understand how children develop their speech and language. 

Ahmed says young children can struggle with consonant clusters, so “clown” becomes “cown” and “brick” becomes “bick”. And certain sounds, such as “r” and “th”, typically don’t come until children are five or six years old.

Children also don’t have a fully developed vocabulary and may not use correct grammar or sentence structure — something that Ahmed says technology doesn’t recognise.

One of the goals of AusKidTalk is to collect different kinds of speech. The researchers are aiming to record 750 Australian children between the ages of three and 12, including 50 with disordered speech. 

It sounds like a lot, but Ahmed says it’s not a huge amount of data for building speech recognition algorithms.  

“To develop a really robust model, you need thousands and thousands of hours of speech,” she said.

“We’re only getting 20 to 30 minutes per child.”
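
To put those figures in perspective, here is a rough back-of-the-envelope calculation using only the numbers quoted above (750 children, 20 to 30 minutes each). The function name is illustrative, and the result assumes every child is recorded for the full session.

```python
def total_hours(children, minutes_per_child):
    """Convert a per-child recording time in minutes to total hours."""
    return children * minutes_per_child / 60


# Figures quoted in the article: 750 children, 20 to 30 minutes each.
low = total_hours(750, 20)
high = total_hours(750, 30)

print(f"{low:.0f} to {high:.0f} hours of speech")
```

That works out to a few hundred hours at most, well short of the “thousands and thousands” Ahmed says a robust model needs.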

The team is having to develop techniques to get the most out of the limited speech they’re able to collect. One is known as “domain adaptation”. 

AusKidTalk aims to collect speech samples from 750 children aged between three and 12.

“Say you have a model for American speech, or adult speech, and then you use the children’s speech to improve the model so it works better with children’s speech,” Ahmed said.

“From an engineering perspective, it’s the AI algorithms that will be our focus.”
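
As a loose illustration of the idea behind domain adaptation, the sketch below uses MAP (maximum a posteriori) adaptation of a single mean, a classic way to nudge an existing model towards a small amount of new data. This is not necessarily the technique the AusKidTalk team uses; the function name, the `tau` weighting and all the numbers are made up for illustration.

```python
def map_adapt_mean(prior_mean, samples, tau=10.0):
    """Blend a mean learned from plentiful data with a few new samples.

    tau controls how strongly the original (e.g. adult) model is trusted:
    with no new samples the prior mean is returned unchanged, and with
    many samples the result approaches the mean of the new data.
    """
    n = len(samples)
    return (tau * prior_mean + sum(samples)) / (tau + n)


# Hypothetical values for one acoustic feature (illustrative only).
adult_mean = 120.0              # learned from a large adult database
child_samples = [250.0] * 10    # a handful of child recordings

adapted = map_adapt_mean(adult_mean, child_samples)
print(adapted)  # 185.0
```

With only ten child samples and `tau` set to 10, the adapted value lands halfway between the adult model and the children's data. More child recordings would pull it further towards the children.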

Right now, one of the biggest engineering challenges is in the annotation. 

“Once you’ve developed something, you then need to manually validate it, to make sure it’s doing the correct annotation,” Ahmed said. “So it’s a lot of cost involved.

“Then to train those annotation tools, we need Australian speech somehow, which we don’t have already anyway. It’s sort of a Catch-22.”

Ahmed is also looking at recognising emotion in children’s speech — something that could be used to triage phone calls to kids’ helplines, for instance. And since going public with their plans, the AusKidTalk team has been approached by commercial companies interested in accessing their database of recordings.

“That’s something we’re still discussing,” Ahmed said. 

“At the moment, the major priority is our own research.” 

Building a database

Ahmed said the different speech samples AusKidTalk collects will first be evaluated and categorised. 

“At the moment, we’re developing automated annotation tools so that we can mark what is said where,” she said. 

“Then the next step is actually developing some algorithms to recognise some of the common sounds.”

The algorithms could feed into applications like automated reading or speech therapy tools. 

“In our algorithms, we develop strategies to perhaps use the new data with existing data and adapt what we call ‘acoustic models’ … which model the individual sounds and speech,” she said.

“Then we have language models as well … for the individual words.”

With children’s speech changing so quickly, the researchers will divide the recordings into four age groups for their initial analysis. 

Eventually, the recordings will be combined.

Tags: AI, voice recognition, algorithm

Michelle Wheeler

Michelle is a science and technology journalist, which makes perfect sense given where you're reading this. Her work has seen her drive to the remote SKA site in a 2WD Hyundai, test great white shark detectors in a tinny, visit a tiger snake-infested island dubbed the most dangerous in the world and meet isolated tribes in the jungle.

    create is brought to you by Engineers Australia, Australia's national body for engineers and the voice of more than 120,000 members. Backing today's problem-solvers so they can shape a better tomorrow.
    © 2024 Engineers Australia
