Work Notes: Translation Technology Round Table 2024

Abby Chung | Work | Sep 22, 2024

In January 2024, I participated in the Translation Technology Round Table held in Boulder, Colorado, organized by The Localization Institute. Participants included professionals from the gaming industry, translation companies, and AI technology development teams. The two-day round table was rewarding, featuring rigorous discussions and sunny days. Below are my notes from the sessions.

Future landscape

  • Traditional roles like translators are being replaced by newer positions such as prompt engineers and post editors.
  • Sampling/fine-tuning LLMs might also be the future of a translator’s job.
  • TMS providers and LSPs are moving into content management.
  • There was a focus on the importance of source quality and predictability for future translation work.
    • For example, the source is well prepared when the transcriptions/scripts are properly organized (paragraph and sentence layout, accuracy of the text, etc.) before being fed into the translation system, and when the translation output is consistent and predictable for the same source input.
    • Constraining the input (input control) is the first step toward automation; a minimal sketch of what that could look like follows this list.
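
To make the input-control point concrete, here is a minimal sketch (my own, not something shown at the round table) of normalizing and validating source segments before they enter a translation pipeline. The specific checks and the 500-character limit are illustrative assumptions.

```python
import re

def normalize_source(text: str) -> str:
    """Normalize a source segment before it enters the translation pipeline."""
    text = text.replace("\u00a0", " ")   # non-breaking spaces from copy-paste
    text = re.sub(r"\s+", " ", text)     # collapse stray whitespace and line breaks
    return text.strip()

def validate_source(text: str, max_len: int = 500) -> list[str]:
    """Return a list of problems; an empty list means the segment may proceed."""
    problems = []
    if not text:
        problems.append("empty segment")
    if len(text) > max_len:
        problems.append(f"segment longer than {max_len} characters; split it before translation")
    if text.count("{{") != text.count("}}"):
        problems.append("unbalanced {{placeholder}} braces")
    return problems

segment = normalize_source("  The  Save button is now\ncalled “Save & Exit”. ")
issues = validate_source(segment)
print(issues or segment)   # a clean segment is allowed into the pipeline
```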

The use of GenAI

  • The use of GenAI was discussed in terms of its capabilities in content generation, predictability, accuracy, and the ability to switch between various translation tones. However, client satisfaction with GenAI was reported as low.
  • Gaming companies have a firm policy of restricting any usage of GenAI and related products in order to prevent data leaks.
  • Though big companies are very firm about not using any type of AI tool for security reasons, some of them have internal teams working out a framework that is accessible only within their own company. Azure is the main option the big companies are looking into, since Microsoft guarantees it will not use any of their data for training.
  • LLMs can be used to identify and define glossary/terminology (a rough sketch follows this list).
  • GenAI in marketing - a question was raised about the necessity of translation teams, since GenAI can generate content in multiple languages at once.
    • Content management departments in small companies want to implement LLMs on their websites (such as an “AskAI” or “Generate” function within their CMS), so they can create a blog post in a desired language within the CMS without going through the localization process.
    • Participants emphasized that content generated by GenAI still needs to go through translation management systems (TMS) and human reviewers before being published.
    • Some LSPs are repositioning themselves as content companies, shifting from translation to transcreation; sometimes the transcreated content performs better in the local market than a translation that stays true to the source.
    • Shared by Company E (a gaming company): regional teams (localization team/country manager) use the English material as a reference to create their own content, skipping the localization process. This gives them more freedom and autonomy, but also makes them accountable for compliance and legal concerns.
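
To illustrate the glossary/terminology point above, here is a rough sketch of asking an LLM to extract candidate glossary terms from a batch of source strings. It assumes the OpenAI Python client (v1 chat-completions style); the model name, prompt wording, and output format are my own placeholders rather than anything demonstrated at the round table, and companies restricted to Azure-hosted models would point the client at their own deployment instead.

```python
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def extract_terms(source_strings: list[str], domain: str = "gaming UI") -> list[dict]:
    """Ask the model for domain terms that should be translated consistently."""
    prompt = (
        f"You are building a {domain} glossary. From the source strings below, "
        "list domain-specific terms that should be translated consistently. "
        "Return only a JSON array of objects with keys 'term' and 'definition'.\n\n"
        + "\n".join(f"- {s}" for s in source_strings)
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",             # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        temperature=0,                   # keep the output predictable
    )
    # A real integration would parse defensively; this sketch assumes clean JSON.
    return json.loads(resp.choices[0].message.content)

terms = extract_terms(["Claim your daily login bonus", "Open the loot box to reveal rewards"])
for entry in terms:
    print(entry["term"], "->", entry["definition"])
```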

Continuous localization – automation is inevitable

  • E-commerce companies often have micro updates, which makes them a good example of continuous localization.
  • Continuous localization doesn’t always get implemented when it comes to UI.
  • Budget is the biggest problem on the client side when it comes to continuous localization; clients don’t necessarily know how to estimate the volume.
  • A successful case for continuous localization: have translators on standby during specific periods; they get paid based on how much they translate, and if the volume doesn’t surpass a predetermined threshold, they receive an agreed-upon guaranteed minimum (a quick illustration follows this list).
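
A back-of-the-envelope version of that standby arrangement, with made-up numbers rather than rates quoted at the round table: the translator's pay for a period is the per-word total or the agreed minimum, whichever is higher.

```python
def standby_pay(words_translated: int, per_word_rate: float, guaranteed_minimum: float) -> float:
    """Pay for one standby period: per-word earnings, but never below the agreed minimum."""
    return max(words_translated * per_word_rate, guaranteed_minimum)

# Illustrative numbers only.
print(standby_pay(words_translated=12_000, per_word_rate=0.08, guaranteed_minimum=500.0))  # 960.0
print(standby_pay(words_translated=3_000, per_word_rate=0.08, guaranteed_minimum=500.0))   # 500.0, minimum kicks in
```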

Questions raised:

  • What is the ideal training data for LLMs if one is looking to implement it into the existing TMS? How to control the result and maintain its consistency?
    • Perhaps documents where translators add annotations on why they make certain decisions, such as their reasoning behind choosing one word over another (a sketch of what such records could look like follows this list).
    • Following this, we can examine what post editors do, why they correct the translation, what the intent is, and train another LLM specifically for correcting the translation produced by another LLM.
    • Observe how translators do their translations. Their actions and thinking process can help LLMs improve translation processing. For instance, translators often work in a non-linear way: they focus on context and might translate one sentence while skipping others, then go back to fill in the gaps. If we incorporate this kind of thinking process into LLMs, they might generate more consistent translations.
  • PM side – how to measure PM efficiency and how to reduce manual work for requests?
  • Economy aspect – per-word rates are getting extremely low and will only get lower; what pricing model can be adopted?
    • Non-word-based cost estimates for the future:
      • Participants mentioned referencing the value of the final work, the potential profit that the translation can generate.
      • Another pricing model uses tokens when an LLM is specifically requested as part of the translation process.
    • Cost side – how to control cost when using LLMs to generate translations? How to use prompts efficiently to control the number of tokens used?
      • Implement narrow constraints to enhance certain steps, minimize errors, and drive down expenses.
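
One possible shape for the annotated-decision training data mentioned above, written as a small JSONL export; the field names and the English-to-Traditional-Chinese examples are my own assumptions for illustration, not a format anyone proposed at the round table.

```python
import json

# Each record pairs a source/target segment with the translator's reasoning and
# any later post-edit, so an LLM can be tuned on the "why" as well as the "what".
records = [
    {
        "source": "Tap to claim your reward",
        "target": "點擊領取獎勵",
        "decision": "Chose 領取 over 索取 because 索取 sounds transactional; 領取 matches in-game reward flows.",
        "post_edit": None,
    },
    {
        "source": "Your session has expired",
        "target": "工作階段已逾時",
        "decision": "Kept the platform glossary term 工作階段 for 'session' rather than a literal rendering.",
        "post_edit": "Post-editor shortened the string to fit UI width limits.",
    },
]

with open("annotated_decisions.jsonl", "w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```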

What some translation companies are doing

  • Linguists generate prompts, vetting and iterating until the best results appear. They vote, provide feedback, and then decide what results to deploy in the training model. This feedback loop helps create more examples.
  • LLM-Augmented CAT – get the contextual meaning of a certain term or word within the MT output; this serves as a reference for translators, so they get an immediate understanding of what the source text is saying (a rough sketch follows this list).
  • Quality Estimation – what the standard is and how to estimate it are still under discussion.
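
A rough sketch of the LLM-augmented CAT idea: given a source segment and a highlighted term, ask the model for the contextual meaning so the translator sees it inline. The same assumptions as the glossary sketch earlier apply (OpenAI-style client, placeholder model and prompt); a real CAT plug-in would cache results and respect the client's security policy.

```python
from openai import OpenAI

client = OpenAI()

def explain_term_in_context(segment: str, term: str) -> str:
    """Return a short, translator-facing note on what `term` means in this segment."""
    prompt = (
        f"In the sentence below, explain in one or two sentences what '{term}' means "
        "in this specific context, as a quick reference for a translator.\n\n"
        f"Sentence: {segment}"
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",             # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return resp.choices[0].message.content

print(explain_term_in_context("The patch nerfed the rogue's burst damage.", "nerfed"))
```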

On another note...

First time having bison.