Imagine the endless number of software programs that store content: website builders like WordPress and Typo3, product information systems like inRiver, email newsletter tools and marketing automation platforms like Mailchimp and HubSpot, game assembly kits like Unity3D and Unreal Engine, design workbenches like Canva and Figma, blueprint labs like Adobe Illustrator, and writing tools like the Hemingway App and Google Docs. It would take a book to list the names of all the products in all categories. But if we wrote only the names of tools that can handle multilingual content, well, one or two pages would suffice.
In 2021, at a time when Facebook and Microsoft support over a hundred languages, being unable to deal with just two or three is a major disadvantage. The reason for this disparity is that developers who embark on the journey to create a CMS are often not familiar with localization. When they begin working on their products they don’t know about multilingual content or don’t focus on it. Even developers who are aware of localization often prioritize more glamorous features that will drive sales and leave localization for later. Or they assume that localization is part of the CMS: that a translator will log into the CMS and work in the same way as a content author does, clicking on each individual text box in the CMS and retyping it in another language. In reality, things are different. Translators work with professional translation tools that include translation memory, machine translation, terminology support, spell checkers and automatic QA.
As a result, CMSs are rarely created to be multilingual. It’s a journey to build and improve language support – a journey initiated by language teams who understand the professional localization process and the fact that people around the world speak various languages.
With this article, I hope to provide guidance to CMS developers: pointing out where products fail at being multilingual and what needs to improve to achieve professional support for localization. Perhaps the reader will be the person who sets the path to developing a multilingual product.
When big CMSs fall short in language: WordPress
Let’s start with the most popular website CMS in the world: WordPress powers a mind-boggling half a billion websites. Yet, despite the incredible spread of this CMS around the globe, WordPress has no built-in multilingual capability. The system’s core creators elected to rely on partners to develop components for multilingual websites. Owners of WordPress websites need to install one of the plugins, such as WPML, Polyglot, TranslatePress, or MultilingualPress. And this is where the problems begin.
Figure 1: WordPress multilingual website plugins and TMS integrations
First problem: decentralized content. Plugins like WPML can extract most of the content from core WordPress, such as pages, posts, menus, and media. But that’s not all the content there is. Most WordPress websites have 20-30 plugins installed, and some of these plugins store additional content independently of the core. wpDataTables is an example of a plugin that creates beautiful chart visualizations, while Amira is a plugin to schedule appointments. They store extra content externally and create shortcodes such as [wpdatachart id=22].
The page renderer then picks up these shortcodes and generates the visual, such as a chart or a booking page. However, because this content sits outside core WordPress, translation plugins don’t pick it up. As a result, the translation task contains only the core page text, and the visual stays untranslated. To fix this, a translator needs to manually access the plugin and select the missing parts. In other words, whatever automation has been put in place, it invariably breaks here. Handling these exceptions, which might make up only 20 percent of the content, takes 90 percent of the time in a localization project.
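To make the gap visible before a project starts, a localization engineer could scan page content for shortcodes that the translation plugin does not cover. The following is a minimal sketch in Python, not part of WordPress or of any plugin: the function name, the regular expression, and the set of "covered" tags are all assumptions chosen for illustration.

```python
import re

# Matches opening WordPress-style shortcodes such as [wpdatachart id=22].
# Closing tags like [/gallery] are skipped on purpose.
SHORTCODE_RE = re.compile(r"\[(?P<tag>[A-Za-z][\w-]*)(?P<attrs>[^\]]*)\]")

# Hypothetical set of shortcode tags whose content the translation plugin
# already exports. Anything else needs manual handling.
COVERED_TAGS = frozenset({"gallery", "caption"})


def find_uncovered_shortcodes(page_html, covered=COVERED_TAGS):
    """Return (tag, attributes) pairs for shortcodes not listed in `covered`."""
    found = []
    for match in SHORTCODE_RE.finditer(page_html):
        tag = match.group("tag").lower()
        if tag not in covered:
            found.append((tag, match.group("attrs").strip()))
    return found


page = '<p>Quarterly sales</p>[wpdatachart id=22]<p>Photos</p>[gallery ids="1,2"]'
print(find_uncovered_shortcodes(page))  # [('wpdatachart', 'id=22')]
```

A report like this does not translate anything, but it tells the project manager up front which pieces of a page will fall out of the automated workflow.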
These issues can be solved by developing custom fixes, such as an integration between WPML and wpDataTables, but this leads to the next problem: relying too heavily on crutches.
Second problem: fragmented integrations. WordPress plugins for multilingual websites support a limited set of translation tools. To be specific, WPML is integrated with MemSource and Wordbee, while Polyglot is integrated with Smartcat and Phrase; GlotPress supports Weglot. If you change a TMS, you have to change the plugin and vice versa.
Imagine you are using one translation company for French, and the company uses MemSource, so you integrate your plugins and custom code with WPML. Then, your company expands into Japan and you add a new provider for Japanese who uses Smartcat. Now you need another plugin to support all the new website components. And this integration story repeats itself with every new language.
As a workaround, cleverly built multilingual plugins support manual export via XLIFF, CSV, JSON, or HTML. Export from WPML can work in Phrase despite the lack of integration. On the downside, this is a step backward from today’s elegant continuous localization process. In the case of a website with ten language versions, exporting a page every time the writer changes a few words and then importing the translation back ten times is just frustrating. In the case of a digital company such as Airbnb, which supports 60+ languages, this process is simply unsustainable.
Third problem: dependency on micro-companies. WordPress is omnipresent, flexible, and extremely resilient. There is little chance of this huge CMS being displaced from the market. Plugin developers, on the other hand, are typically micro-companies that come and go. If you are working for a large company such as Volkswagen or Deutsche Bahn, it’s safe to build a website on WordPress, but it is not safe to add a plugin from a small company. There is no guarantee that a plugin developer will be around in two or three years’ time to upgrade the plugin for new types of content, new components, and the new processes of the day.
Outcome: WordPress is an immensely popular CMS but a difficult customer when it comes to building a multilingual website. Fishing out pieces of content and fixing exceptions, relying on third-party plugins developed by micro-companies, and the lack of built-in support for professional translation workflow tools create a mountain of challenges for any website manager. The story of WordPress is not unique. Many other CMSs rely on plugins, have decentralized content, and share the same whirlpool of multilingual woes as the biggest CMS of all time.
Mailchimp: the quest for a magic multilingual button
With about 14 million users, Mailchimp is the world’s most popular tool for email campaigns. Like WordPress, the market leader is big on core features and on its partner ecosystem but offers little support for multilingual content.
I used to be a marketing manager in a software startup, running a biweekly email newsletter to 100,000 users around the world. The newsletter was published in English, but the big reader hubs were in Japan, Russia, the Nordics, and Germany. While we sent thousands of emails, it was always a struggle to maintain open rates above the industry’s average 22 percent. After all, this was a technical newsletter detailing new features and the progress of the company from startup to scaleup, not a Stephen King epistolary novel.
At some point, we mounted a particularly important promotional campaign for Japan, and I decided to translate the email contents with the help of my Japanese team member. The resulting open rate exceeded 36 percent, a record for the company at that time. Smiling at our accomplishment, I vowed to localize every big campaign so that people could finally read all the stuff we were sending them. However, with the time pressures and deadlines on the publishing calendar and my teammate in Japan having little familiarity with our emailing tool and working in a different time zone, things didn’t turn out the way I had hoped.
It was then that I realized how awesome it would be to have a magic button to automatically replicate a monolingual campaign into a multilingual one with translations automatically pushed through the translation tool. Millions of email marketers would stand to benefit as would readers from around the world who would finally be able to read our biweekly corporate marketing newsletter in their own language. Unfortunately, years later, this magic button still remains a fantasy.
For those unfortunates among us who need to generate a multilingual campaign in Mailchimp, a set of three options is available. But frankly speaking, each approach has its shortfalls.
Option A is to provide a link to a Google translation of the email. In this case, you have to trust the machine to render the message correctly and to generate emphatic verbiage that convinces the customers to trust your brand. Good luck avoiding unfortunate accidents!
Option B is to create a single huge HTML file in which texts in every target language follow one after another. Merge tags hide all languages but one, and the readers will only see the relevant part of the email. This approach is convenient for reporting because it provides a single campaign to track, but it is a patchwork quilt during preparation. First, it does not allow for automation, and everything has to be assembled manually with extremely rigorous testing. Second, to change even a couple of words in the text, the marketer will need to send this to a layout designer with HTML skills. This is bulky and cumbersome at best, and an absolute monster at worst.
Figure 2: Mailchimp campaign with multiple languages crammed into one HTML file via merge tags
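To show the shape of the single file described in Option B, here is a minimal sketch in Python that wraps per-language HTML in conditional blocks. It is an illustration only. The `*|IF:...|*`, `*|ELSEIF:...|*`, `*|ELSE:|*` and `*|END:IF|*` tags follow my understanding of Mailchimp's conditional merge tag syntax and should be checked against the current documentation, and the `LANGUAGE` field name is a hypothetical audience field holding each subscriber's language.

```python
def build_multilingual_body(bodies, default_lang, field="LANGUAGE"):
    """Wrap per-language HTML in conditional merge-tag blocks.

    `bodies` maps a language code to the HTML for that language.
    `field` is a hypothetical merge field with the subscriber's language.
    Subscribers matching no listed language get the default body.
    """
    if default_lang not in bodies:
        raise ValueError(f"missing body for default language {default_lang!r}")

    others = [lang for lang in bodies if lang != default_lang]
    if not others:
        # Only one language: no conditional block is needed.
        return bodies[default_lang]

    parts = []
    for index, lang in enumerate(others):
        keyword = "IF" if index == 0 else "ELSEIF"
        parts.append(f"*|{keyword}:{field}={lang}|*")
        parts.append(bodies[lang])
    parts.append("*|ELSE:|*")
    parts.append(bodies[default_lang])
    parts.append("*|END:IF|*")
    return "\n".join(parts)


print(build_multilingual_body(
    {"en": "<p>Hello</p>", "ja": "<p>こんにちは</p>", "de": "<p>Hallo</p>"},
    default_lang="en",
))
```

Scripting the assembly outside the email tool removes some of the manual cut and paste, but it does not remove the need to test every language branch before sending, and any text change still means regenerating the whole file.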
Finally, Option C is to duplicate the campaign for each language and import translations via HTML. This is the closest to a professional approach and some translation tools like Language Exchange have been integrated to eliminate import and export operations. The downside is the reporting, where the marketer is obliged to track results across numerous campaigns per language instead of one consolidated report. Tough luck, marketing managers!
To go from zero to hero and solve this problem, Mailchimp needs to do two things:
- Add the ability to nest campaigns in multiple languages under one title for reporting
- Promote more integrations for translation tools
Thousands of email marketers and millions of their readers around the world will give Mailchimp a collective “Phew, obrigado/arigato/thank you”.
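Until campaigns can be nested, the consolidated report from the first wish can be approximated outside the tool. The sketch below, in Python, groups per-language campaign results under a shared title and computes one combined open rate. The `CampaignStats` record and its field names are hypothetical and do not correspond to any Mailchimp export format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class CampaignStats:
    title: str     # shared parent title, e.g. "Spring launch" (hypothetical)
    language: str  # language code of this campaign copy
    sent: int      # emails delivered
    opened: int    # emails opened


def consolidate(campaigns):
    """Sum sent and opened counts per title and add a combined open rate."""
    totals = defaultdict(lambda: {"sent": 0, "opened": 0})
    for campaign in campaigns:
        totals[campaign.title]["sent"] += campaign.sent
        totals[campaign.title]["opened"] += campaign.opened

    report = {}
    for title, counts in totals.items():
        rate = counts["opened"] / counts["sent"] if counts["sent"] else 0.0
        report[title] = {**counts, "open_rate": rate}
    return report


report = consolidate([
    CampaignStats("Spring launch", "en", sent=1000, opened=220),
    CampaignStats("Spring launch", "ja", sent=500, opened=180),
])
print(report["Spring launch"])
```

Note that the combined rate is weighted by the number of emails sent, so a small but enthusiastic language audience does not distort the overall picture.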
What constitutes perfect support for translation and localization?
Let’s imagine that these two examples have convinced you – the product owner in a CMS company – to leave Ithaca and start teaching your software to speak in tongues. A perilous journey over uncertain waters lies ahead.
Given the speed at which technology moves today, it’s hard to predict which specific controls and features will be needed one or two years from now, so it is important to get the basic architecture right and build for flexibility. With a strong frame, your boat can weather many storms.
Figure 3: Headless content repo as an enabler of an automated localization workflow
The journey begins with content centralization. To enable an unbroken automated translation workflow, all content – be it text, images, media, or navigation items – should be accessible via a central repository. This content repository should ideally be decoupled from publishing following the modern “headless CMS” approach in which one system rules all. Website CMSs, mobile app CMSs, a product catalog, a car app, even a smart TV app – all of them connect to the central CMS, the single source of truth, to get the latest version of the content, while storing nothing themselves.
Next, everything should be accessible via an API. This means that the CMS’s API, in addition to the usual pages, posts, and product descriptions, needs to support navigation text, menus, CDN items, URL text, tag names, category names, chart elements, accordions, anchors, SEO metadata, etc. No element that contains text should be left unsupported. Otherwise, the automation can’t be completed.
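As a rough illustration of what "everything via the API" buys a translation system, the Python sketch below walks a nested content payload and lists every piece of text together with its path. The payload shape and key names are invented for the example and do not describe any particular CMS.

```python
def collect_text(node, path=""):
    """Yield (path, text) for every non-empty string in a nested payload."""
    if isinstance(node, str):
        if node.strip():
            yield path, node
    elif isinstance(node, dict):
        for key, value in node.items():
            yield from collect_text(value, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from collect_text(value, f"{path}[{index}]")
    # Numbers, booleans and None carry no text and are skipped.


# Hypothetical API response for one page.
payload = {
    "title": "Pricing",
    "seo": {"description": "Plans and prices"},
    "menu": [{"label": "Home"}, {"label": "Contact"}],
    "chart": {"id": 22, "legend": ["Q1", "Q2"]},
}

for path, text in collect_text(payload):
    print(path, "=>", text)
```

In a real integration, not every string is translatable (slugs and internal identifiers are not), so the API would also need to mark which fields are meant for translation. The point of the sketch is the reverse case: text that never appears in the payload can never reach the translator.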
Last but not least come the localization interoperability features to help translation management system developers create a rich app ecosystem. One of the features is built-in support for XLIFF, the standard localization interchange format (its files usually carry the .xliff or .xlf extension). Another one is webhooks that detect changes in the content and allow translation systems to pull content into the localization workflow as soon as it changes. The third feature is a set of visual library components to build a page preview in a third-party system so that translators can see the pages they are working on in layout as they go line by line. With previews, translators can understand the context better and can adapt translations to fit the text box size.
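The first two features can work together: a change notification arrives, and the changed text leaves the CMS as XLIFF, one file per target language. Below is a minimal sketch in Python using only the standard library. It produces a bare-bones XLIFF 1.2 document. The webhook payload keys (`entry_id`, `language`, `changed_fields`) are hypothetical, and a production export would carry more metadata, such as notes and inline markup.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"


def to_xliff(units, source_lang, target_lang, original="cms-export"):
    """Serialize (id, text) pairs as a minimal XLIFF 1.2 document."""
    ET.register_namespace("", XLIFF_NS)  # emit XLIFF as the default namespace
    root = ET.Element(f"{{{XLIFF_NS}}}xliff", {"version": "1.2"})
    file_el = ET.SubElement(root, f"{{{XLIFF_NS}}}file", {
        "original": original,
        "source-language": source_lang,
        "target-language": target_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(file_el, f"{{{XLIFF_NS}}}body")
    for unit_id, text in units:
        unit = ET.SubElement(body, f"{{{XLIFF_NS}}}trans-unit", {"id": unit_id})
        ET.SubElement(unit, f"{{{XLIFF_NS}}}source").text = text
    return ET.tostring(root, encoding="unicode")


def handle_change_event(event, target_langs):
    """Turn a hypothetical 'content changed' payload into one XLIFF per language."""
    units = [(field["path"], field["text"]) for field in event["changed_fields"]]
    return {
        lang: to_xliff(units, event["language"], lang, original=event["entry_id"])
        for lang in target_langs
    }


event = {
    "entry_id": "page-42",
    "language": "en",
    "changed_fields": [{"path": "title", "text": "Terms & conditions"}],
}
jobs = handle_change_event(event, ["ja", "de"])
print(jobs["ja"])
```

Because only the changed fields travel, the writer can fix a few words without anyone exporting and importing whole pages by hand for each language.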
With centralized content accessible via the API and interoperability features in place, the frame is complete, and the next step is to add multilingual functionality beyond translation – functionality unique to your type of CMS. This encompasses magic buttons to make an email campaign multilingual for marketing automation platforms, language layers for videogame construction kits, interlinks between different language versions of product descriptions in PIM solutions, and stacks of text strings for blueprint design tools. At this point, you may venture into uncharted waters and pioneer new features not yet discovered or find new and more convenient ways to solve the many challenges of serving a global community of users with a multitude of cultures and tongues.
The future vision
When developing localization features, the pool of stakeholders includes all the experts who support CMS buyers: translators, project managers, and localization engineers in agencies as well as in-house language teams. These experts will be the ones to recommend your CMS to partners, clients, friends, and family.
In 2021, multilingual content in a CMS should be a given. It should take zero clicks and should give the user the ability to plug in any translation technology. Few developers and CMS product owners outside the localization industry have a clear vision of this, and the task of advocating and pushing for this support falls on the shoulders of localization managers.