
Glossary of Standards Terms

Every industry follows particular standards and uses specific technologies and methodologies. The localization industry is no exception.

Standards bodies pull together experts from around the globe to ensure that their work represents current practices. The resulting standards encode best practices and help facilitate interoperability, information transfer, and process improvement.

The standards that are most integral to localization, including Internationalization Tag Set, interoperability, SRX, TMX, Unicode, and XLIFF, are described in detail in the main text and are not repeated here. This glossary highlights additional standards, technologies, and methodologies that are relevant to the localization industry. You can find a comprehensive list of translation and localization industry standards at GALA (Globalization and Localization Association)[Lommel-Arle 1].

About the Author: Arle Lommel

Photo of Arle Lommel

Arle Lommel is a senior analyst with Common Sense Advisory (CSA Research), where he focuses on language technology and translation quality. A noted writer and speaker on localization and translation, he headed standards development at the Localization Industry Standards Association (LISA) and later at GALA, before working on translation quality topics at the German Research Center for Artificial Intelligence (DFKI). He has a PhD from Indiana University and currently resides in Bloomington, Indiana.

Terms: Augmented Translation, Standards Terms

Email: arle.lommel@gmail.com

Website: commonsenseadvisory.com

Twitter: @ArleLommel

LinkedIn: linkedin.com/in/arlelommel/

Standards Terms


accessibility

The ease with which a person can make use of a product, process, service, or content. In the context of localization, making a product accessible means eliminating barriers of language, usability, disability, and technology. Some of the standards and initiatives that govern accessibility include the Americans with Disabilities Act (USA)[Lommel-Arle 2], Section 508 (USA)[Lommel-Arle 3], Web Content Accessibility Guidelines (global)[Lommel-Arle 4], Level Access (EU)[Lommel-Arle 5], and ICTA Global (International Commission on Technology and Accessibility)[Lommel-Arle 6].


Agile methodology

An iterative project management methodology that divides tasks into short phases of work (often called sprints), with frequent re-assessment and adaptation. The originators of this method wrote a manifesto[Lommel-Arle 7] that describes its principles. The manifesto forms the basis for several different Agile software methodologies, the most widespread being Scrum and Kanban. In the past few years, there has been a push to include localization in the Agile process.


application programming interface (API)

A specification that defines how software components interact and exchange data. APIs simplify automation of routine tasks. For example, a client can use an API to send content that is ready for translation from its content management system (CMS) directly and automatically to the localization vendor’s project management system. In turn, the localization vendor can use an API to push translated content back into the client’s CMS.
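As a rough illustration, the following Python sketch shows what such an exchange might look like on the client side. The endpoint URL, payload fields, and authentication scheme are hypothetical assumptions for the example; every vendor's API defines its own.

# Hypothetical sketch: sending content from a CMS to a translation vendor's
# project management system over a REST API. The endpoint, field names, and
# token are illustrative assumptions, not a real vendor API.
import requests

API_TOKEN = "example-token"                             # assumed credential
VENDOR_URL = "https://vendor.example.com/api/v1/jobs"   # hypothetical endpoint

payload = {
    "source_language": "en-US",
    "target_languages": ["de-DE", "ja-JP"],
    "format": "html",
    "content": "<p>Release notes for version 2.1</p>",
}

response = requests.post(
    VENDOR_URL,
    json=payload,
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    timeout=30,
)
response.raise_for_status()
print("Created translation job:", response.json().get("job_id"))

The reverse direction works the same way: the vendor's system calls an endpoint exposed by the client's CMS to push the translated content back.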


Bayesian spam filtering

A method that uses statistical analysis of an incoming email's header and content to estimate the likelihood that the message is spam. Bayesian spam filtering compares the frequency of particular words in an incoming message with the frequency of those words in known spam and legitimate email.
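The following minimal Python sketch illustrates the idea with a naive Bayes comparison over toy word counts; the training messages, equal priors, and Laplace smoothing are simplifying assumptions, not a production filter.

# Toy illustration of Bayesian (naive Bayes) spam scoring.
import math
from collections import Counter

spam_msgs = ["win money now", "free money offer", "claim your free prize"]
ham_msgs = ["project status meeting", "please review the translation", "meeting notes attached"]

def word_counts(messages):
    counts = Counter()
    for msg in messages:
        counts.update(msg.split())
    return counts

spam_counts, ham_counts = word_counts(spam_msgs), word_counts(ham_msgs)
spam_total, ham_total = sum(spam_counts.values()), sum(ham_counts.values())
vocab = set(spam_counts) | set(ham_counts)

def log_likelihood(message, counts, total):
    # Laplace smoothing avoids zero probabilities for unseen words.
    return sum(math.log((counts[w] + 1) / (total + len(vocab))) for w in message.split())

def looks_like_spam(message):
    # Equal prior probabilities assumed for spam and legitimate mail.
    return log_likelihood(message, spam_counts, spam_total) > log_likelihood(message, ham_counts, ham_total)

print(looks_like_spam("free money"))      # True
print(looks_like_spam("meeting notes"))   # False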


code page

A table of values used to map binary codes to textual characters, including some control characters. Today, most code pages have been superseded by Unicode, but legacy code pages still play an important role in many environments.
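A short Python example shows why the correct code page matters: the same byte maps to different characters under different legacy encodings (the specific code pages chosen here are just common examples).

# The byte 0xE9 decoded under two legacy Windows code pages, plus the
# UTF-8 byte sequence for the same character.
data = bytes([0xE9])

print(data.decode("cp1252"))   # 'é' under Windows-1252 (Western European)
print(data.decode("cp1251"))   # 'й' under Windows-1251 (Cyrillic)
print("é".encode("utf-8"))     # b'\xc3\xa9', the UTF-8 encoding of 'é'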


DITA

Darwin Information Typing Architecture. An XML data model and architecture for authoring and publishing user documentation. It is an open standard that was developed by IBM and is now maintained by the OASIS DITA Technical Committee. Localization vendors need to understand the DITA architecture and unique file management requirements of processing DITA files for translation.


DocBook

An XML standard originally designed around a book model, DocBook has evolved to be useful for help systems, ebooks, and other forms of technical documentation. DocBook is widely used in the open source community. It is an open standard that was originally designed and implemented by HAL Computer Systems and O’Reilly and Associates, and is now maintained by the OASIS DocBook Technical Committee.


Dublin Core

An open organization that supports and promotes innovation and best practices in metadata design. The group maintains affiliations with several standards bodies and other metadata associations[Creekmore-Laura 1].


European Machinery Directive

Directive 2006/42/EC of the European Parliament and of the Council of 17 May 2006 governs the safety of machinery placed on the market and mandates that safety notices be in the local language. It also places requirements on the localization of some aspects of the user documentation.


faceted search

A technique for browsing or accessing information organized using semantic categories and multiple explicit dimensions, which allows the user to apply multiple filters. Amazon.com and many other websites use faceted search to narrow down search results by letting shoppers filter by particular categories (e.g., books or music).
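A minimal Python sketch of the idea, using a made-up catalog and facet names:

# Faceted filtering over a toy product catalog; items and facets are
# illustrative assumptions for the example.
catalog = [
    {"title": "Dictionary", "category": "books", "language": "German"},
    {"title": "Style Guide", "category": "books", "language": "English"},
    {"title": "Folk Songs", "category": "music", "language": "German"},
]

def faceted_search(items, **facets):
    # Keep only items that match every selected facet value.
    return [item for item in items
            if all(item.get(facet) == value for facet, value in facets.items())]

print(faceted_search(catalog, category="books"))
print(faceted_search(catalog, category="books", language="German"))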


ISO

The International Organization for Standardization coordinates the efforts of 150 countries to develop common standards for technology and products. These standards are intended to provide more efficient, safer, more consistent, and cleaner development practices (https://www.iso.org).


LISA OSCAR

A standards special interest group in the Localization Industry Standards Association (LISA). When LISA folded in 2011, its portfolio of standards was placed under a Creative Commons Attribution 3.0 license. These standards include: TMX, TBX/ISO 30042, and SRX. The GALA website provides links to these standards[Lommel-Arle 8].


LISA QA model

A software tool and abstract model that was used for evaluating the quality of translated documents and user interfaces[Lommel-Arle 9]. It provided a way to categorize and count errors in translated products to determine adherence to expectations. Widely implemented and modified, it remains highly influential, but has been supplanted by Multidimensional Quality Metrics (MQM).


multidimensional quality metrics (MQM)

A framework for defining task-specific, translation-quality metrics, based on a shared vocabulary of error types[Lommel-Arle 10]. It is currently in the standardization process in ASTM Committee F43 (American Society for Testing and Materials; now known as ASTM International). The DQF (Dynamic Quality Framework) Error Typology is a subset of MQM that has been widely adopted in the translation industry.
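As a very rough illustration of how an MQM-style metric turns categorized errors into a score, here is a small Python sketch; the error types, severity weights, normalization, and pass threshold are assumptions made for the example, not normative MQM values.

# Illustrative weighted error scoring; weights and threshold are assumed.
severity_weights = {"minor": 1, "major": 5, "critical": 10}

errors = [
    ("accuracy/mistranslation", "major"),
    ("fluency/spelling", "minor"),
    ("terminology/inconsistency", "minor"),
]

word_count = 250   # size of the evaluated sample
penalty = sum(severity_weights[severity] for _, severity in errors)
score = 100 * (1 - penalty / word_count)

print(f"Penalty points: {penalty}, quality score: {score:.1f}")
print("Pass" if score >= 95 else "Fail")   # assumed acceptance threshold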


NSGCIS

National Standard Guide for Community Interpreting. This Canadian standard is used to certify localization service providers as community interpreters (https://tlolink.com/2fU6r6A).


OASIS

The Organization for the Advancement of Structured Information Standards (https://oasis-open.org). A global nonprofit consortium that works to develop and harmonize standards for computer- and Internet-related activities, particularly as they relate to data structure and exchange (e.g., security, the Internet of Things, XLIFF, DITA, and DocBook).


OAXAL

Open Architecture for XML Authoring and Localization. An OASIS initiative that combines various OASIS and OSCAR standards into an overarching architecture for XML authoring and localization.


Okapi Framework

An open source, cross-platform set of tools for localizing and translating content and software. The Okapi framework supports interoperability. You can download tools from the Okapi website: http://okapiframework.org/.


pattern matching

The process of checking a given sequence against known sequences to identify similarities. When enough similarity exists, the matches are flagged in the system as possible equivalents.
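A minimal Python sketch of the idea, flagging a stored segment as a possible equivalent of an incoming one when a similarity ratio passes an arbitrarily chosen threshold:

# Flag a stored segment as a possible equivalent when the similarity
# ratio passes a threshold (0.75 here is an arbitrary assumption).
from difflib import SequenceMatcher

stored = "Click the Save button to keep your changes."
incoming = "Click the Save button to store your changes."

similarity = SequenceMatcher(None, stored, incoming).ratio()
if similarity >= 0.75:
    print(f"Possible equivalent found (similarity {similarity:.0%})")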


Quality Metric for Language Translation of Service Information (SAE J2450)

A standard that establishes quality metrics for translating automotive service information. The goal is to provide more objective measures of quality and to ensure that the translations are consistent and accurate, regardless of target language. The standard is available from the SAE website (http://standards.sae.org/j2450_201608/).


TAUS

Translation Automation User Society (https://taus.net). An organization that works to promote and improve machine translation by facilitating innovation, open platforms, and collaboration. This group develops standards, APIs, and other tools to support machine translation.


TermBase eXchange (TBX)

A standard (also known as ISO 30042) that provides an XML-based framework for representing structured terminology data. This framework facilitates interoperability of terminology management systems[Lommel-Arle 11].


Translation Services—Requirements for Translation Services (ISO 17100)

A standard, based on the historic EN 15038 standard, that specifies minimum requirements for quality and service delivery in the human translation process. It gives localization service providers (LSPs) guidelines for managing the core processes, resources, and other activities involved in delivering a quality translation service. The standard is available for purchase from the ISO website[Lommel-Arle 12].


W3C

World Wide Web Consortium (https://w3.org). This international community works to develop standards for the Web. It is led by Tim Berners-Lee and Dr. Jeffrey Jaffe. Working groups focus on different aspects of the Web, e.g., HTML5.


Web Content Accessibility Guidelines (WCAG)

Guidelines developed by a W3C working group to ensure that the Web works for all people, regardless of their locale and culture, as well as technological, physical, and mental limitations[Lommel-Arle 4].


Web Ontology Language (OWL)

A semantic web language created by the W3C that is designed to enable computers to understand complex knowledge about things and the relationships between them.


UTF-8 and UTF-16

Unicode Transformation Format. An algorithmic mapping of every Unicode code point to a unique byte sequence. Every UTF type supports lossless round-tripping. UTF-8 is most common on the Web. UTF-16 is used in Java and Windows. The number denotes the code unit size (8 bits or 16 bits in this case). Refer to the Unicode FAQ for more information[Lunde-Ken 3].
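A short Python example makes the difference in code units concrete; the sample string is arbitrary.

# The same text encoded as UTF-8 and UTF-16 (little-endian) byte
# sequences, then decoded back without loss.
text = "Aé€"

utf8 = text.encode("utf-8")        # 1-, 2-, and 3-byte sequences per character
utf16 = text.encode("utf-16-le")   # one 16-bit code unit (2 bytes) per character here

print(utf8)    # b'A\xc3\xa9\xe2\x82\xac'
print(utf16)   # b'A\x00\xe9\x00\xac '
print(utf8.decode("utf-8") == utf16.decode("utf-16-le") == text)   # True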


XML

Extensible Markup Language. A standard developed by the World Wide Web Consortium (W3C) that defines rules for creating text-based languages that are both human- and machine-readable. XML is the basis for many localization standards, including XLIFF, TMX, and TBX, as well as content standards such as DocBook and DITA. XML markup allows great flexibility in designing and managing multilingual content.
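To make the human- and machine-readable point concrete, here is a minimal Python sketch that parses a small, made-up XML document with the standard library; the element and attribute names are illustrative and not taken from any particular localization standard.

# Parse a toy XML document and read its elements and attributes.
import xml.etree.ElementTree as ET

doc = """
<glossary>
  <term lang="en">accessibility</term>
  <term lang="de">Barrierefreiheit</term>
</glossary>
"""

root = ET.fromstring(doc)
for term in root.findall("term"):
    print(term.get("lang"), term.text)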