Reducing pronunciation errors by 30% in enterprise voiceover generation

Customer satisfaction with a generated voiceover depends heavily on pronunciation accuracy. For enterprises that generate large volumes of voiceovers, there is usually a set of words - proper nouns such as industry-specific technical terms or brand names - that often get mispronounced because the model fails to recognise them. This forces creators to correct these pronunciations manually, which is time-consuming and repetitive. To reduce this execution time for enterprises, we came up with a feature that lets users define the pronunciations of such words beforehand, so that across the workspace these words are pronounced as intended.

Date

December 2024

Role

Product Designer

Company

Murf AI

What was my role in this feature?

I joined this feature very early on, right from getting the problem statement from the product team and talking to our enterprise users. We then brainstormed solutions, acknowledging the constraints we had, and finally delivered the polished designs.

Way forward

While talking to users, we realised that our enterprise users grouped these specific words into different buckets depending on their workflow. For example, one user grouped such words by language and another grouped them by industry. We decided to give users that flexibility. Since this was one of the main features requested by the majority of our enterprise users, we decided to place the touchpoint in the main navigation.

Once inside the Pronunciation Library, users can create a list by entering a name and description.

Once they have created a list, users can start adding words to it. They can search for a word to set its pronunciation, add a new word, or add a word that was previously used in the workspace.

Users can also go through the pronunciations they have previously applied across the workspace and add them to lists.

Once they have added words to a list, they can apply it to the required root folders. All projects inside those folders will then respect the pronunciations defined in the list.
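The relationship between lists, root folders and projects can be sketched in a few lines of Python. This is only an illustration of the behaviour described above, not how the product is built: the class names, the path-based folder lookup, the case-insensitive matching, the "later lists win" rule and the sample word are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class PronunciationList:
    """A named bucket of word -> pronunciation overrides (hypothetical model)."""
    name: str
    description: str = ""
    entries: dict[str, str] = field(default_factory=dict)


@dataclass
class Workspace:
    # Maps a root folder name to the lists that have been applied to it.
    applied: dict[str, list[PronunciationList]] = field(default_factory=dict)

    def apply_list(self, root_folder: str, plist: PronunciationList) -> None:
        self.applied.setdefault(root_folder, []).append(plist)

    def pronunciations_for(self, project_path: str) -> dict[str, str]:
        """Collect the overrides a project inherits from its root folder."""
        # Assumption: the first path segment identifies the root folder.
        root_folder = project_path.strip("/").split("/")[0]
        resolved: dict[str, str] = {}
        for plist in self.applied.get(root_folder, []):
            for word, pronunciation in plist.entries.items():
                # Assumption: matching is case-insensitive and later lists win.
                resolved[word.lower()] = pronunciation
        return resolved


# Hypothetical sample data: one list applied to one root folder.
brands = PronunciationList("Brand names", "Product and company names")
brands.entries["Acme"] = "AK-mee"

workspace = Workspace()
workspace.apply_list("Marketing", brands)

inherited = workspace.pronunciations_for("Marketing/Q4 launch/Intro video")
unaffected = workspace.pronunciations_for("Training/Onboarding video")
```

Here `inherited` contains the override for "acme" because the project sits under the "Marketing" root folder, while `unaffected` is empty because no list was applied to "Training".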

When a user is inside a project and comes across a pronunciation they wish to add to a list, they can select the word and the required pronunciation, then choose the list to add it to, on the go.

Within a project, users can also check which list is applied by opening the Pronunciation Library widget, as shown.

About the launch

As this was one of the most requested enterprise features, we had a clear understanding of the problem we were solving and how effectively the feature reduces execution time for creators. We enabled it only for users who actively asked for it, and we have been closely watching their behaviour.

After analysing the data for the users for whom we enabled the feature, we found that the WER (word error rate) had decreased from 5% to 3.5%, a relative reduction of 30% - primarily because, with this library, users could pre-define the pronunciations of brand names and other proper nouns, which contributed to better generations.
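For readers who want to check how the drop from 5% to 3.5% relates to the 30% in the title, here is a minimal sketch of the arithmetic. The function name is made up for this example; the two rates are the ones reported above.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fraction by which a rate dropped, relative to its starting value."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


wer_before = 0.05   # WER before the feature was enabled
wer_after = 0.035   # WER after the feature was enabled

reduction = relative_reduction(wer_before, wer_after)
print(f"{reduction:.0%}")  # prints "30%"
```

The absolute drop is 1.5 percentage points; dividing that by the starting rate of 5% gives the 30% relative reduction.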