Junwoo
2024-08-27 12:38:59

UX Details to Make All TTS Engines Properly Read Korean

VideoStew offers a variety of AI voices, from Google WaveNet, Amazon Polly, KT AI Voice, Naver Clova, and Azure to ElevenLabs.

Naturally, this creates an issue: each model is trained differently, so even the same sentence is read slightly differently from engine to engine. The difference is particularly noticeable when reading units.

In this post, we’ll discuss the considerations that went into making sure units are read properly in Korean when generating TTS.

Getting Measurement Units Right

When using various TTS engines, we encountered an issue where even the expression "100kg" was read differently by each engine. Some would read it as "hundred K-G," while others would trail off awkwardly at the end with "hundred krr..." (Yes, even AI can get flustered...)

Of course, there were engines that read it accurately as "hundred kilograms."

To address this, we developed a preprocessing library to standardize how these units are read across all engines.
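As a rough illustration of what such preprocessing might look like (a minimal sketch, not VideoStew's actual library), the snippet below spells out unit abbreviations in Hangul before the text is sent to any engine. The `UNIT_READINGS` table and the `normalize_units` function are hypothetical names, and the table covers only a handful of lowercase metric units.

```python
import re

# Hypothetical table: unit abbreviation -> Korean spelling to send to the TTS engine.
UNIT_READINGS = {
    "kg": "킬로그램",
    "g": "그램",
    "km": "킬로미터",
    "cm": "센티미터",
    "mm": "밀리미터",
    "m": "미터",
    "ml": "밀리리터",
    "l": "리터",
}

# Try longer abbreviations first so "mm" is not consumed as "m".
_units = sorted(UNIT_READINGS, key=len, reverse=True)

# A number, optional whitespace, then a known unit that is not followed by
# another Latin letter (so "100kgf" is left alone). A Hangul particle right
# after the unit, as in "70kg입니다", is still allowed.
UNIT_PATTERN = re.compile(
    r"(\d+(?:\.\d+)?)\s*(" + "|".join(map(re.escape, _units)) + r")(?![A-Za-z])"
)


def normalize_units(text: str) -> str:
    """Replace unit abbreviations that follow a number with their Korean spelling."""
    return UNIT_PATTERN.sub(lambda m: m.group(1) + UNIT_READINGS[m.group(2)], text)


print(normalize_units("몸무게 100kg"))
```

Because every engine now receives the same spelled-out unit, the reading no longer depends on how each model was trained to expand abbreviations.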


Getting Numbers Right

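While running this service, we ran into another unexpected issue. (Korean is truly a fascinating language...)

Korean has two ways of reading numbers, which native speakers switch between instinctively without noticing.

For instance, when telling time, "10시 10분" is read as "yeol si sip bun." Why...? The hour takes the native Korean number ("yeol"), while the minute takes the Sino-Korean one ("sip"), even though both are written as "10."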

And while general numbers are read in Sino-Korean ("il, i, sam"), when a quantity unit is added, they are read in native Korean ("han gae, du gae, se gae"). Official measurement units use Sino-Korean. (e.g., 10cm = "sip sen-ti-mi-teo")
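To make the two readings concrete, here is a small sketch (an illustration under stated assumptions, not the service's real code) that converts 1 to 99 into Sino-Korean and native Korean words and then reads a clock time. The function names `sino`, `native`, and `read_time` are hypothetical; `native` returns the form used in front of a counter (for example "한" rather than "하나").

```python
SINO_DIGITS = ["", "일", "이", "삼", "사", "오", "육", "칠", "팔", "구"]
# Native Korean forms as used before a counter (한 개, 두 개, 세 개, ...).
NATIVE_ONES = ["", "한", "두", "세", "네", "다섯", "여섯", "일곱", "여덟", "아홉"]
NATIVE_TENS = ["", "열", "스물", "서른", "마흔", "쉰", "예순", "일흔", "여든", "아흔"]


def _check_range(n: int) -> None:
    if not 1 <= n <= 99:
        raise ValueError("this sketch only handles 1 to 99")


def sino(n: int) -> str:
    """Sino-Korean reading, e.g. 10 -> 십, 21 -> 이십일."""
    _check_range(n)
    tens, ones = divmod(n, 10)
    if tens == 0:
        tens_part = ""
    elif tens == 1:
        tens_part = "십"  # 10 is read 십, not 일십
    else:
        tens_part = SINO_DIGITS[tens] + "십"
    return tens_part + SINO_DIGITS[ones]


def native(n: int) -> str:
    """Native Korean reading before a counter, e.g. 10 -> 열, 21 -> 스물한."""
    _check_range(n)
    if n == 20:
        return "스무"  # 스물 shortens to 스무 directly before a counter
    tens, ones = divmod(n, 10)
    return NATIVE_TENS[tens] + NATIVE_ONES[ones]


def read_time(hour: int, minute: int) -> str:
    """Hours take native Korean numbers, minutes take Sino-Korean ones."""
    return f"{native(hour)} 시 {sino(minute)} 분"


print(read_time(10, 10))  # the "yeol si sip bun" example from the text
```

The sketch does not handle a minute value of 0 or numbers of 100 and above; a real preprocessor would need rules for those cases as well.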

Alright, let me give you a number. “90”. How would you read it?

Strictly speaking, every number up to the 90s has its own word form, so this one should be read as “ninety.” Some of you might have read it as “nine zero” instead. As numbers get larger, we tend to fall back on whichever reading is more convenient.

Let’s think about smaller numbers. “9.” Yes, everyone would read this as “nine.”

Following this principle, we developed a preprocessing system that reads numbers from 1 to 99 in their unique word forms. However, this is just a guideline, and we’re open to changing it based on customer feedback. For instance, “forty” might feel more natural than “four zero.”

Anyway, to ensure a consistent user experience, we built a library of the units that are read with word forms like “one, two, three.”
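One way to picture such a unit library (purely a sketch; the entries and names below are assumptions, not the published library) is a set of counters that call for native Korean numbers, combined with the 1 to 99 guideline described above. Anything outside the set, or outside the range, falls back to the Sino-Korean reading.

```python
# Hypothetical, deliberately incomplete list of counters read with native Korean numbers.
NATIVE_COUNTERS = {"개", "명", "살", "시", "마리", "권", "잔", "병"}

# Upper bound of the guideline; kept as a constant so it is easy to tune.
NATIVE_MAX = 99


def numeral_system(number: int, unit: str) -> str:
    """Return "native" or "sino" for a number followed by the given unit."""
    if unit in NATIVE_COUNTERS and 1 <= number <= NATIVE_MAX:
        return "native"
    return "sino"


for number, unit in [(3, "개"), (10, "시"), (10, "분"), (10, "센티미터"), (100, "개")]:
    print(number, unit, numeral_system(number, unit))
```

Keeping the threshold in a single constant means the guideline can be adjusted from feedback without touching the counter list.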

If you're looking to implement a TTS service, this library could be useful. However, keep in mind that it's not a complete version as we continuously discover unexpected units through customer feedback.

Detail in UX = Success in SaaS

We faced many challenges while providing a video editing SaaS solution in Korea, including running into the term “quantifier” for the first time. It was yet another reminder of how dynamic and remarkable the Korean language is…

While we sometimes envy English-speaking services, they surely have their own set of challenges.

Of course, VideoStew offers a feature called [Manual Text Designation]. This allows you to generate TTS regardless of what's displayed on the screen.

< Setting the sound to be read by TTS regardless of the subtitles displayed on the screen >

With this method, you can spell the text out exactly the way it should sound, which makes the TTS pronunciation more natural and lets users correct any misread units directly.
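Conceptually, this kind of override can be modeled as an optional spoken text stored next to the subtitle. The sketch below is only an assumption about the shape of the data; `Slide`, `subtitle`, `tts_override`, and `tts_input` are hypothetical names, not VideoStew's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Slide:
    subtitle: str                       # text shown on screen
    tts_override: Optional[str] = None  # text to read aloud instead, if the user set one

    def tts_input(self) -> str:
        """Text to send to the TTS engine: the override when present, else the subtitle."""
        return self.tts_override if self.tts_override else self.subtitle


auto = Slide(subtitle="몸무게 100kg")
manual = Slide(subtitle="몸무게 100kg", tts_override="몸무게 백 킬로그램")
print(auto.tts_input())
print(manual.tts_input())
```

The subtitle stays untouched on screen, while the override gives the user the final say over what is actually spoken.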

The reason we focus on such details is our conviction that these small differences determine the growth of the service.

The moment users type a sentence with numerical expressions, they already have a preconceived notion of how it should be read. If the output doesn't match that expectation, that is a problem from the very start. Our ultimate UX goal is to produce the expected result as quickly as possible without additional editing.

We will continue to post about UX considerations like this. As mentioned earlier, now that every service is leveling up, we believe that thinking through such details determines whether a service succeeds or fails.
