Previously, we outlined our six-step process for managing Global Content Quality. Any company or team that produces global, multilingual content at scale can apply it to drive consistent, positive impact on their international business goals.
We have already talked about content requirements management (how to define what “good quality” really means for your content) and data-driven quality evaluation (how to gather both objective and subjective information about quality from multiple sources for a 360-degree view).
This time, we’ll provide a detailed walkthrough of Error Typology, a popular quality measurement (evaluation) method that you can apply to extract value from the data already stored in your Content Management Systems (CMS) and/or Translation Management Systems (TMS).
Error Typology is a venerable content quality evaluation method that remains very common in the modern Translation & Localization industry. Although it was popularized for translated multilingual content, it can be applied in a single-language context just as well, with only minor changes. Here’s how to use it.
Any modern CMS or TMS will typically store multiple revisions of each of your content assets automatically. This means you already have a wealth of information for potential analysis at your fingertips! The challenge is mostly about picking the right revisions, since comparing every revision against every other is very time-consuming.
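Once you have picked two revisions, a plain text diff is often enough as a first look at what changed between them. Here is a minimal sketch in Python using the standard library’s `difflib`; the revision strings are invented for illustration, and in practice you would pull them from your CMS/TMS via its API or an export:

```python
import difflib

# Two hypothetical revisions of the same content asset.
revision_a = "Our product help center is availible in 12 languages."
revision_b = "Our product help center is available in 12 languages."

# A unified diff shows removed lines (-) and added lines (+).
diff = list(difflib.unified_diff(
    revision_a.splitlines(),
    revision_b.splitlines(),
    fromfile="revision_a",
    tofile="revision_b",
    lineterm="",
))
print("\n".join(diff))
```

A diff like this only tells you *that* something changed, not *why*; turning each change into a classified error is what the rest of the Error Typology process is about.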
If you don’t have another revision yet, consider revising the content yourself — or ask your peers, subject matter experts, editors, or in-country reviewers to do it for you (whatever works best for your content production process). Note down which errors to correct and which improvements to make so that this piece of content better matches your requirements.
If you store these lists of corrections separately from the changed content itself, you get a “virtual” revision (a quality evaluation) that can be applied to the content at a later stage (e.g. by your writers or translators in your CMS or TMS). For Error Typology analysis, it doesn’t matter which approach you take: in either case, you still have two revisions of content to compare.
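A “virtual” revision can be as simple as a structured list of correction records kept alongside the content. The sketch below shows one possible shape for such a record; all field names and example values are illustrative assumptions, not a standard schema:

```python
from dataclasses import dataclass

@dataclass
class Correction:
    # All field names here are illustrative, not a standard schema.
    segment_id: str    # which segment/sentence the issue is in
    found: str         # the problematic text as written
    suggested: str     # the proposed correction
    category: str      # e.g. "Terminology", "Grammar", "Spelling"
    severity: str      # e.g. "minor", "major", "critical"
    comment: str = ""  # optional reviewer note

# A "virtual" revision is simply a list of such corrections,
# stored alongside (not inside) the content asset.
virtual_revision = [
    Correction("seg-12", "availible", "available", "Spelling", "minor"),
    Correction("seg-18", "login", "log in", "Grammar", "minor",
               "Verb form needed here"),
]
```

Because each record already carries a category and a severity, a virtual revision doubles as raw material for Error Typology classification later on.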
Classification is most frequently performed by human experts, but can also be produced by automatic tools. Those tools “read” your content and find various types of content quality issues using algorithms (including Natural Language Processing, Artificial Intelligence, and Machine Learning).
Since automatic tools sometimes produce false positives (reported issues that are not really issues), it’s usually advisable to remove those first if you want an accurate Error Typology analysis. However, even the raw output is sometimes enough to quickly gauge certain aspects of quality and guide further decisions.
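Filtering out dismissed false positives before analysis can be a one-liner once reviewers have flagged them. A minimal sketch, assuming a hypothetical issue-record format with a reviewer-set `false_positive` flag:

```python
# Raw output from an automatic checker, after a human reviewer has
# marked which findings are false positives. The record format and
# the flag name are hypothetical.
raw_issues = [
    {"id": 1, "category": "Terminology", "false_positive": False},
    {"id": 2, "category": "Grammar",     "false_positive": True},
    {"id": 3, "category": "Style",       "false_positive": False},
]

# Keep only confirmed issues for Error Typology analysis.
confirmed = [i for i in raw_issues if not i["false_positive"]]
print(f"{len(confirmed)} of {len(raw_issues)} issues confirmed")
```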
Classification can either be performed at the same time as the actual revision or done separately at a later stage (potentially by another party). Essentially, any document revision can be turned into a quality evaluation at any time! This can be very useful for post-project analysis since it allows your global content teams to focus on producing top-notch content first and analyze their work later.
Note: we describe just one possible way to perform content quality scoring, based on MQM recommendations. Many alternatives exist!
Now you have a simple single-number representation of how different aspects of quality have played out in your content according to your requirements. In other words, TQ is a content quality score. This score can be easily stored over time in large quantities (e.g. in an Excel spreadsheet, in a database, or even in a dedicated content quality management system that directly connects all types of quality evaluations to specific content items). It also lends itself extremely well to all sorts of quantitative analysis techniques. We’ll talk about those in a later post.
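To make the idea of a TQ score concrete, here is a minimal sketch of an MQM-style penalty-based score. The severity weights (minor = 1, major = 5, critical = 10) are common defaults but are an assumption here — MQM lets you configure them to fit your own requirements:

```python
# Assumed severity weights; MQM treats these as configurable.
SEVERITY_WEIGHTS = {"minor": 1, "major": 5, "critical": 10}

def tq_score(error_severities, word_count):
    """Penalty-based quality score: 1.0 means no penalties at all.

    The score subtracts the per-word penalty density from 1,
    so more (or more severe) errors lower the score.
    """
    penalty = sum(SEVERITY_WEIGHTS[sev] for sev in error_severities)
    return 1 - penalty / word_count

# Example: 3 minor errors and 1 major error in a 500-word sample.
errors = ["minor", "minor", "minor", "major"]
score = tq_score(errors, word_count=500)
print(f"TQ = {score:.3f}")  # TQ = 0.984
```

A single number like this is easy to store per content item and per evaluation date, which is exactly what makes the quantitative analyses mentioned above possible.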
Error Typology analysis is rather time-consuming and requires well-trained and well-instructed content professionals (writers and translators) to do it consistently right. That’s why, in practice, companies usually apply it to subsets (or samples) of data in statistically sound ways that allow drawing conclusions about a larger whole (e.g. a set of documents) from one of its parts (e.g. a chapter). However, the level of detail you can get from this analysis and the resulting learning potential for your global content teams are unparalleled.
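Drawing such a sample can be done with a simple random draw over your segments. The sketch below uses Python’s standard library; the 10% sample rate is an arbitrary illustration, not a statistical recommendation — the right sample size depends on your content volume and the confidence level you need:

```python
import random

# Hypothetical pool of 200 content segments to evaluate.
segments = [f"seg-{n}" for n in range(1, 201)]

random.seed(42)  # fixed seed so the sample is reproducible
sample = random.sample(segments, k=len(segments) // 10)
print(f"Reviewing {len(sample)} of {len(segments)} segments")
```

Because `random.sample` draws without replacement, every segment appears at most once in the review set.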
While Error Typology is very useful for detailed internal analysis, it is an atomistic, expert-based quality evaluation method. Thus, it doesn’t accurately predict the holistic perception of content by the reader, and might not be a good leading metric for content performance in many cases. For a true 360-degree view of quality, Error Typology should be paired with holistic quality evaluation methods and content performance metrics.
Are you currently using Error Typology or similar approaches when analyzing content quality in your company or team? If yes — how exactly do you do this now, and how does it help your team? If no — why not, and what obstacles do you face? Please share with us in the Comments section!