Samsung Mobile Press

The Learning Curve, Part 6: Four Things We Learned From the Global Samsung Research Teams Who Brought 16 Languages to Galaxy AI

June 20, 2024

What does it take to teach AI new languages? To find out, we went around the world to visit various Samsung Research centers, which play a crucial role in allowing Galaxy AI to support 16 languages¹. This support enables more people to expand their language capabilities, even when offline, thanks to on-device translation features such as Live Translate and Interpreter. As we learned about the development of AI languages, we also gained insight into technology’s role in preserving and furthering culture. Here are four things you should know about the process of AI innovation and the people who make it happen.

1) Quality Data Is Crucial
The first step in introducing a new language to Galaxy AI is planning what data is needed. The language tools of Galaxy AI consist of three processes: automatic speech recognition (ASR), neural machine translation (NMT) and text-to-speech (TTS). For each process, the teams at Samsung Research centers must target distinct sets of information. Only after establishing data targets can they secure and process the data they need.

ASR needs extensive speech recordings in numerous environments, each paired with text transcriptions. Noise levels have to be accounted for, and it’s not enough just to add noises to recordings. For instance, the team developing for the Indonesian language had to go out and capture sounds from real life, such as people talking in coffee shops.

NMT needs references particular to a language, such as its grammar, idioms, loan words and dialects — anything that helps AI understand rules of communicating in that language. TTS needs data on pronunciation, such as a high-quality database of voices, plus context on how parts of words sound in different circumstances.
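
As a rough illustration of how the three processes chain together, the Python sketch below passes audio through speech recognition, then translation, then speech synthesis. Every name in it (`SpeechTranslationPipeline`, `toy_asr`, `toy_nmt`, `toy_tts`) is hypothetical, and the toy stages are simple stand-ins for trained models. It is not a description of how Galaxy AI is implemented.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures; real stages would be trained models.
Recognizer = Callable[[bytes], str]    # ASR: audio -> source-language text
Translator = Callable[[str], str]      # NMT: source text -> target text
Synthesizer = Callable[[str], bytes]   # TTS: target text -> audio


@dataclass
class SpeechTranslationPipeline:
    asr: Recognizer
    nmt: Translator
    tts: Synthesizer

    def run(self, audio: bytes) -> bytes:
        text = self.asr(audio)         # 1) recognize speech
        translated = self.nmt(text)    # 2) translate the text
        return self.tts(translated)    # 3) synthesize speech


# Toy stand-ins so the sketch runs end to end (assumptions, not real models).
def toy_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def toy_nmt(text: str) -> str:
    glossary = {"hola": "hello"}
    return " ".join(glossary.get(word, word) for word in text.split())


def toy_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = SpeechTranslationPipeline(toy_asr, toy_nmt, toy_tts)
result = pipeline.run("hola mundo".encode("utf-8"))
```

Because each stage consumes the previous stage's output, an error made early (for example a misheard word) carries through to the spoken result, which is one reason each stage needs its own carefully targeted data.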

2) Language Is Beautifully Complex
Language is a fundamental building block of culture, so each team must dive deep into the unique linguistic conditions of each country in order to understand what data is required, and to train a language model in the appropriate way.

Arabic, for example, is a language of many dialects. This meant the team in Jordan had to collect a diverse range of audio recordings of dialects from many sources, all of which had to be transcribed with a focus on unique sounds. The work was made more complex by the fact that written Arabic is often missing diacritics — symbols that guide pronunciation — so the team had to create a neural network that accurately predicts them to convert text to speech.
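
To show why missing diacritics matter, the Python sketch below strips the short-vowel marks from two different Arabic words and shows that they collapse into the same written form. It uses only the standard `unicodedata` module. The helper name `strip_diacritics` is an assumption made for this illustration, and the sketch demonstrates only the information that is lost, not the neural network the team built to restore it.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop nonspacing combining marks (Unicode category Mn)."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


# Two fully vocalized words that share the consonants kaf, ta, ba.
kataba = "\u0643\u064E\u062A\u064E\u0628\u064E"  # kataba, "he wrote"
kutub = "\u0643\u064F\u062A\u064F\u0628"         # kutub, "books"

bare_kataba = strip_diacritics(kataba)
bare_kutub = strip_diacritics(kutub)

# Without diacritics both words are written with the same three letters,
# so a text-to-speech system has to infer the vowels from context.
same_spelling = bare_kataba == bare_kutub
```

Here `same_spelling` is `True`: the written form alone no longer says which word, and therefore which pronunciation, is intended.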

Good AI translation is not just about communicating the meaning of words but also conveying the identity of a culture.

3) In the Era of AI, Humans Still Shine
AI is useful for many things, but the complex nature of languages means misspelled words or incorrect pronunciation can easily be introduced, and some things must be handled by humans with expertise.

For example, in Vietnamese, the words for ghost, grave and mother are “ma,” “mả” and “má.” They are differentiated in speech only by slight tonal variations and are easily mistaken for one another. Errors can also creep in because of regional variations. A swimming pool is “alberca” in Mexico, but it is “pileta” in Argentina, Paraguay, and Uruguay. Meanwhile in Colombia, Bolivia, and Venezuela, it is a “piscina,” which is the same word used in Brazil but pronounced with a slight tonal shift.
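
The Vietnamese example can be made concrete with a short Python sketch that separates each word into its base letters and its tone mark, using Unicode decomposition from the standard `unicodedata` module. The function name `split_tone` is hypothetical. Note that this simple approach treats every combining mark as a tone mark, which holds for these three words but not for Vietnamese in general, where some marks change vowel quality instead.

```python
import unicodedata


def split_tone(word: str) -> tuple[str, list[str]]:
    """Return the base letters and the names of any combining marks."""
    decomposed = unicodedata.normalize("NFD", word)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    marks = [unicodedata.name(ch) for ch in decomposed if unicodedata.combining(ch)]
    return base, marks


words = {
    "ghost": "ma",
    "grave": "m\u1EA3",   # ma with hook above
    "mother": "m\u00E1",  # ma with acute accent
}

analysis = {meaning: split_tone(word) for meaning, word in words.items()}
# All three share the base letters "ma"; only the tone mark differs.
```

A single dropped or mistyped mark in a transcript therefore turns one word into another, which is why this kind of data benefits from review by people who know the language.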

These complications mean it takes human care and attention to ensure an AI model interprets data correctly. The audio and text data that each team gathers must go through reviews, corrections, random checks for overall quality, then normalization and cleaning — removing background noise or filtering out inappropriate data — before an AI language model can be trained. Any errors call for further data refinement and model training.
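
A minimal Python sketch of the text side of such a workflow might look like the following. It covers only three of the steps mentioned above (normalization, filtering and random spot checks), and every name, the blocklist and the sample data are assumptions made for illustration rather than details of the teams' actual tooling.

```python
import random
import re
import unicodedata

# Hypothetical placeholder; a real filter would be far more careful.
BLOCKLIST = {"badword"}


def normalize(text: str) -> str:
    """Apply Unicode NFC and collapse runs of whitespace."""
    text = unicodedata.normalize("NFC", text)
    return re.sub(r"\s+", " ", text).strip()


def is_acceptable(text: str) -> bool:
    """Reject empty lines and lines containing blocklisted words."""
    if not text:
        return False
    return set(text.lower().split()).isdisjoint(BLOCKLIST)


def clean(transcripts: list[str]) -> list[str]:
    normalized = (normalize(t) for t in transcripts)
    return [t for t in normalized if is_acceptable(t)]


def sample_for_review(transcripts: list[str], k: int, seed: int = 0) -> list[str]:
    """Pick a reproducible random subset for a human spot check."""
    rng = random.Random(seed)
    return rng.sample(transcripts, min(k, len(transcripts)))


raw = ["  Selamat   pagi ", "", "this has a badword in it", "Xin ch\u00E0o"]
cleaned = clean(raw)
spot_check = sample_for_review(cleaned, k=1)
```

The automated steps only narrow the data down. The sampled lines in `spot_check` are the ones a human reviewer would read to judge overall quality.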

4) Open Collaboration Yields the Most Positive Results
Galaxy AI’s language expansion is a perfect example of Samsung’s openly collaborative approach to innovation and the belief that working together to share expertise enables new perspectives and experiences.

From an internal perspective, the teams had to work seamlessly across borders and time zones. For example, the development team in Brazil is three hours ahead of the quality assurance team in Mexico, and 12 hours behind the management team in Korea. This meant creating new communication channels and processes to align results and share progress, but this way of working led to a fiesta of ideas for Galaxy AI.
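
The time differences can be checked with a little arithmetic. The Python sketch below assumes fixed offsets of UTC-3 for the Brazil team, UTC-6 for the Mexico team and UTC+9 for the Korea team. These offsets, the zone labels and the example meeting time are assumptions, since the locations of the teams within each country are not stated here.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offsets (no daylight saving adjustments).
BRAZIL = timezone(timedelta(hours=-3), "Brazil")
MEXICO = timezone(timedelta(hours=-6), "Mexico")
KOREA = timezone(timedelta(hours=9), "Korea")


def hours_ahead(a: timezone, b: timezone) -> float:
    """Hours by which zone a is ahead of zone b (negative means behind)."""
    return (a.utcoffset(None) - b.utcoffset(None)) / timedelta(hours=1)


brazil_vs_mexico = hours_ahead(BRAZIL, MEXICO)  # positive: Brazil is ahead
brazil_vs_korea = hours_ahead(BRAZIL, KOREA)    # negative: Brazil is behind

# A hypothetical 9:00 meeting in Korea, seen from the other two teams.
kickoff = datetime(2024, 6, 20, 9, 0, tzinfo=KOREA)
in_brazil = kickoff.astimezone(BRAZIL)
in_mexico = kickoff.astimezone(MEXICO)
```

Under these assumptions a 9:00 start in Korea falls at 21:00 the previous evening in Brazil and 18:00 in Mexico, which illustrates how narrow the shared working window is.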

Samsung’s global scale of operation was not the only factor. It was also important to get input from local experts. Some teams collaborated with science and technology institutes, while others called in experts in linguistics and machine learning.

On a global level, it was also important to work closely with other leaders in AI. That’s why Galaxy partnered with Google in many countries around the world, as well as Baidu and Meitu in China. Embracing the unique strengths and expertise that partners could bring is a key part of Galaxy’s approach to creating great mobile experiences for users.

As culture continues to evolve, Galaxy AI will keep exploring new ways to help people communicate more easily, breaking down technological and cultural barriers one language at a time. Galaxy AI has enabled a more open world and there is much to explore together.

Users can take advantage of the 16 supported languages today on the millions of devices with Galaxy AI, including the Galaxy S24, S23, S23 FE and S22 series, the Galaxy Z Fold5 and Flip5, the Galaxy Z Fold4 and Flip4, and the Galaxy Tab S9 and S8 series.

¹ Supported languages include Arabic, Chinese, English (India, United Kingdom, United States), French, German, Hindi, Indonesian, Italian, Japanese, Korean, Polish, Portuguese (Brazil), Russian, Spanish (Mexico, Spain, United States), Thai and Vietnamese.
