5 Q’s For Olga Beregovaya, Vice President of AI at Smartling – Center for Data Innovation

The Center for Data Innovation spoke with Olga Beregovaya, vice president of AI at Smartling, a New York-based global content delivery platform that uses AI to streamline translation, transcription, and content management. Beregovaya discussed how the company’s platform leverages large language models and machine learning to adapt content for diverse markets and tackle the challenges of translating under-resourced languages.

Martin Makaryan: Can you give us an overview of Smartling and the services it provides?

Olga Beregovaya: Smartling is an AI-powered translation platform that delivers automated translations with quality close to human-level, especially for Romance and Nordic languages. But we do much more than just translation. We help businesses adapt their content for global audiences, enabling them to deliver content effectively across different markets. We offer multimedia localization services, workflow automation, and project management. Essentially, we’re an end-to-end global content delivery platform, with translation being just one part of that.

To ensure top-quality translations, we follow a human-in-the-loop approach. AI models, including neural machine translation, can sometimes produce errors or “hallucinations,” so our tools allow human translators to refine AI-generated translations to achieve 100 percent accuracy. 

As VP of AI at Smartling, I oversee research and development (R&D) efforts related to AI, lead a deployment team that focuses on training and fine-tuning models for customers, and guide strategy and product vision for AI advancements. Initially, my role was VP of Machine Translation and AI, but it became clear that machine translation is just a subset of AI, so we integrated it into a broader AI strategy. My work revolves around ensuring our AI solutions are cutting-edge, scalable, and meet our clients’ needs. 

Makaryan: What are some of the key challenges Smartling has faced with the rise of Generative AI?

Beregovaya: The past two years have been a period of rapid AI experimentation, with many projects thriving while others faced unexpected setbacks. According to Gartner, 75 percent of AI projects fail, but we’ve managed to position ourselves within the successful 25 percent.

One of the biggest challenges has been latency—the time it takes for AI-generated translations to be delivered. Large language models (LLMs) require extensive computational resources, which can slow down response times—something unacceptable for real-time content delivery. To speed up responses, we’ve made our systems more efficient by handling multiple tasks at once, using different AI models together, and improving the way we process and deliver translations. Compliance and security concerns have also been major challenges, as enterprises are increasingly cautious about how their data is processed and stored. Many of our customers’ legal departments have set strict data governance requirements, pushing us to strengthen our SLAs, data privacy policies, and security frameworks.
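As a rough illustration of "handling multiple tasks at once," the sketch below sends several text segments to a translation call concurrently while capping how many requests are in flight. It is a minimal sketch under stated assumptions, not Smartling's implementation: `translate_segment` is a hypothetical stand-in that only simulates latency, and `max_concurrency` is an illustrative parameter.

```python
import asyncio


async def translate_segment(text: str, target_lang: str) -> str:
    """Hypothetical stand-in for a call to a translation model."""
    await asyncio.sleep(0.05)  # simulated network and inference latency
    return f"[{target_lang}] {text}"


async def translate_batch(
    segments: list[str], target_lang: str, max_concurrency: int = 8
) -> list[str]:
    """Translate segments concurrently, with a cap on in-flight requests."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(text: str) -> str:
        # The semaphore keeps at most max_concurrency calls running at once.
        async with semaphore:
            return await translate_segment(text, target_lang)

    # asyncio.gather returns results in the same order as the inputs.
    return await asyncio.gather(*(bounded(s) for s in segments))


def run_batch(segments: list[str], target_lang: str) -> list[str]:
    """Synchronous entry point for callers that are not already async."""
    return asyncio.run(translate_batch(segments, target_lang))
```

Because the simulated calls overlap instead of running one after another, the total wait for a batch is closer to the latency of a single call than to the sum of all calls.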

Another challenge is choosing the right AI model for each use case. Early on, many in the industry relied on a single LLM, but we’ve taken a provider-neutral approach. Instead of relying on just one provider, we carefully evaluate and integrate the best available models, including GPT-4, Google Gemini, and Claude, based on the task at hand. 
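One way to picture a provider-neutral setup is a routing step that picks, for each task type, whichever model scored best in an internal evaluation. The sketch below is illustrative only: the task names, the placeholder model names (`model_a`, `model_b`, `model_c`), and the scores are all made up for the example and do not reflect Smartling's evaluations or any real benchmark.

```python
# Hypothetical evaluation scores per task type (higher is better).
# All names and numbers are placeholders for illustration.
EVAL_SCORES = {
    "ui_strings": {"model_a": 0.91, "model_b": 0.88, "model_c": 0.84},
    "legal_patents": {"model_a": 0.79, "model_b": 0.86, "model_c": 0.83},
}


def choose_model(task: str, scores: dict = EVAL_SCORES, default: str = "model_a") -> str:
    """Return the best-scoring model for a task, or a default if the task is unknown."""
    task_scores = scores.get(task)
    if not task_scores:
        return default
    # Pick the model name whose score is highest for this task.
    return max(task_scores, key=task_scores.get)
```

Keeping the scores in a table rather than hard-coding a provider means a new model can be added, or an existing one swapped out, by updating the evaluation data alone.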

Finally, hallucinations remain a common issue. AI models can generate fluent but factually incorrect or entirely irrelevant translations. For example, in one instance, the AI system was supposed to translate a simple phrase, but instead, it generated an entire Italian dating profile—completely unrelated to the original text. This highlights the importance of continuous monitoring and refinement.

Makaryan: How does the Smartling platform address the specific needs of clients?

Beregovaya: Our clients span multiple industries, including tech, streaming, entertainment, e-commerce, and legal. Some well-known clients include Twitch, H&M, and IBM. Each industry has its own localization needs, which we address through AI-powered tools and workflow automation.

Because different industries face different challenges, Smartling tailors its solutions accordingly. In the high-tech sector, for instance, we focus on automating the translation of user interfaces (UI) for apps and websites, which can often contain incomplete, inconsistent, or grammatically inaccurate text strings. For e-commerce, a common challenge is translating user-generated content, which can be messy, full of abbreviations, or written in informal language. Our platform cleans and prepares this content for translation, ensuring it flows smoothly through the pipeline. Similarly, in the legal industry, we handle complex patent translations, which require precise terminology and can vary widely in subject matter, from consumer products to cutting-edge technologies.

Another significant challenge we address is automating the quality assurance process. Many clients invest heavily in internal language quality evaluation, and our platform uses AI to improve this process. Instead of randomly sampling content, our system automatically detects the most critical issues, allowing human reviewers to focus their efforts more effectively. This combination of automation and AI-driven quality assurance helps clients reduce costs and increase efficiency in their content delivery processes.
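The contrast between random sampling and targeted review can be sketched as a simple prioritization step: given a risk score for each translated segment, send only the highest-risk segments to human reviewers. This is a minimal sketch under assumptions. The `risk` field is a hypothetical 0 to 1 score from some automated quality check, and how such a score would be computed is outside the scope of the example.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Segment:
    segment_id: str
    source: str
    translation: str
    risk: float  # hypothetical 0-1 score from an automated quality check


def select_for_review(segments: list[Segment], budget: int) -> list[Segment]:
    """Return up to `budget` segments with the highest risk, highest first."""
    return heapq.nlargest(budget, segments, key=lambda s: s.risk)
```

With a fixed review budget, this spends reviewer time on the segments most likely to contain serious issues, whereas a random sample of the same size would mostly contain segments that are already fine.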

Makaryan: How do you see the trends in AI impacting where Smartling is headed in the future? 

Beregovaya: Traditionally, Smartling has operated as a software-as-a-service (SaaS) platform, where customers actively managed their translation workflows—using a combination of AI-powered tools and human oversight. But AI is transforming how we approach localization.

We’re shifting toward what we call “service-as-software,” meaning AI will handle more of the translation process automatically, reducing the need for manual intervention. Instead of customers setting up workflows and managing translators, our system will automate more of these steps, delivering high-quality translations with minimal effort on their part.

To achieve this, we’re investing in fully automated global content delivery, leveraging machine translation with automated post-editing and smaller, fine-tuned AI models specialized for translation tasks. General-purpose models like GPT can translate, but custom-trained models reduce errors and improve accuracy.

Another key trend is multi-modality, where AI can process and generate content in multiple formats, including text, audio, and visuals. This opens up opportunities beyond translation, expanding into content creation and adaptation.

Makaryan: What challenges does Smartling face in obtaining data for under-resourced languages it is trying to expand into?

Beregovaya: One of the biggest challenges in expanding to new languages is the lack of available data for training machine learning models. While languages like English have vast amounts of training data, others, such as Swahili, have far fewer digital resources, making it harder to develop high-quality translation models.

Most AI translation models are primarily trained on English and other widely spoken languages, which can lead to culturally irrelevant or linguistically inaccurate translations for languages with smaller datasets. To address this, Smartling uses a combination of strategies, including data scraping, collecting audio recordings from regions with limited written content, and generating synthetic data to expand smaller datasets.

Another approach is leveraging data from related languages within the same language family. By using acoustic models and applying shared linguistic patterns, we can improve translation accuracy for under-resourced languages. Additionally, we collaborate closely with customers, building training datasets through human feedback. For example, Smartling partnered with the African Language Lab to help collect data for African languages, strengthening resources for long-tail languages.

The good news is that progress is being made. Initiatives like Meta’s “No Language Left Behind” project are helping to supply more diverse and relevant training data for under-resourced languages. At Smartling, we continuously refine our models and training processes and recognize the importance of a diverse data science team. Different cultural perspectives provide valuable insights when developing language models. A great example is our former Chief Data Scientist, who was Chinese; her perspective helped improve how we selected and fine-tuned datasets for different languages.

Through these efforts, we’re seeing significant improvements in translation quality for languages that were once overlooked.
