How to Build a Termbase: From Scattered Glossaries to Governed Terminology

Building a termbase comes down to a repeatable sequence: audit the terms you already use, agree the preferred term and rules for each concept, and put the result somewhere your writers and translators actually work. The point is to stop the same concept being worded three ways across your product, documentation, and markets.

That inconsistency carries real cost. It shows up as reviewer debates over wording, avoidable queries from translators, and, in regulated medical or technical content, a reader who cannot be sure that two terms describe the same part.

This article shows you how to build a termbase that fixes that: what it is, what goes in it, how to stand up a first release in weeks, who owns it, and how it sharpens both human and AI translation. It is the practical core of terminology management, written for the teams who have to make it work day to day.

TL;DR

  • A termbase is a structured, concept-based database of approved terms, broader and more governed than a glossary or a translation memory.
  • Start with a useful first release of around 150 concepts, built in four to six weeks, then grow it in small, regular updates.
  • Capture a core set of fields per concept (preferred term, definition, example, domain, owner, status) and add optional fields only when you will maintain them.
  • Give it one accountable owner, a light approval workflow, and a read-only lookup wherever people write.
  • A governed termbase sharpens both human and AI translation, because constraining terminology removes the most common quality failure.
  • The payoff is fewer wording debates, shorter review cycles, and language that reads the same across every market.

What a termbase is, and what it solves

A termbase is a structured, searchable database that stores each concept your organisation uses, its approved term in each language, and the rules for using it. It lets writers and translators stop reinventing wording every time a concept comes up.

It takes a specific kind of friction out of your teams’ work. When people use different words for the same thing, reviewers argue about phrasing, translators raise queries that should never have been needed, and customers see inconsistent language across your product, site, and support.

A governed termbase also makes your other language assets work harder. As phrasing stabilises, your translation memory matches more often and review cycles shorten, so the benefit compounds over time.

Glossary vs termbase vs translation memory

A glossary is a flat list of terms and equivalents. A termbase is a structured, concept-based database with definitions, rules, and governance.

A translation memory stores previously translated segments, whereas a termbase stores terms. The three work together while doing different jobs.

| | Glossary | Termbase | Translation memory |
|---|---|---|---|
| What it stores | terms and equivalents | concepts, terms, rules, metadata | previously translated segments |
| Structure | flat list | concept-based, structured | sentence or segment pairs |
| Governance | minimal | owned, versioned, approved | auto-generated from past work |
| Best for | a quick start | governing terminology at scale | reusing past translation |

In practice you will run a termbase and a translation memory side by side. The glossary is where many teams start, and the point of this guide is how to graduate from that to something governed.

What a good termbase contains

Treat the termbase as a data model rather than a flat spreadsheet. Each entry is a concept with one or more terms per language, plus a small set of fields that do real work. Capture the core fields from day one, and add optional ones only when you will maintain them.

| Field | Required? | Why it earns its place |
|---|---|---|
| Concept ID and preferred term per language | required | the spine of the entry |
| Plain-language definition and example sentence | required | removes ambiguity for writers and translators |
| Domain tag (legal, marketing, UI, support, finance) | required | routes review and prevents clashes |
| Synonyms, forbidden terms, and the approved alternative | required | stops word debates and prevents rework |
| Do-not-translate flag and handling notes | required | protects product and brand names, sets inflection rules |
| Owner, approver, status, review date, change history | required | makes change governable |
| Grammar or UI attributes, character limits, SEO notes, pronunciation | optional | add only if you will maintain them |

The temptation is to model everything, and it is worth resisting. A termbase that is slow to fill is a termbase nobody updates, so start lean and add fields when a real need appears.
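To make the "data model rather than a flat spreadsheet" idea concrete, here is a minimal sketch of one concept entry in Python. The class names, field names, status values, and the concept ID format are all illustrative assumptions; a real terminology tool will have its own schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Term:
    text: str
    language: str               # e.g. "en", "de"
    status: str = "preferred"   # assumed values: "preferred", "synonym", "forbidden"


@dataclass
class ConceptEntry:
    concept_id: str
    definition: str
    example: str
    domain: str                 # e.g. "UI", "legal", "marketing"
    owner: str
    status: str = "proposed"
    do_not_translate: bool = False
    review_date: Optional[date] = None
    notes: str = ""
    terms: list[Term] = field(default_factory=list)

    def preferred(self, language: str) -> Optional[str]:
        """Return the preferred term for a language, or None if none is recorded."""
        for term in self.terms:
            if term.language == language and term.status == "preferred":
                return term.text
        return None

    def forbidden(self, language: str) -> list[str]:
        """Return the forbidden terms recorded for a language."""
        return [t.text for t in self.terms
                if t.language == language and t.status == "forbidden"]


# Hypothetical entry for illustration only.
dashboard = ConceptEntry(
    concept_id="C-0001",
    definition="The main screen a user sees after signing in.",
    example="Open the dashboard to see this week's activity.",
    domain="UI",
    owner="localisation lead",
    terms=[
        Term("dashboard", "en"),
        Term("console", "en", status="forbidden"),
        Term("control panel", "en", status="forbidden"),
    ],
)
```

Keeping the terms as a list under the concept, rather than one column per language, is what lets you add a language or a forbidden variant without changing the structure.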

A worked example: what your first entries look like

Here is the template populated, so the fields stop being abstract. Three realistic entries:

| Concept | Preferred term | Domain | Do not translate? | Note |
|---|---|---|---|---|
| the main user screen | dashboard | UI | no | forbidden: console, control panel; 20-character UI limit |
| the company product name | SmartConnect | brand | yes | never inflect or translate; keep camel case |
| the cooling-off window | right of withdrawal | legal | no | use the exact statutory phrase per market; legal sign-off required |

Notice what each entry does. The UI concept forbids the two rival names and records the character limit; the brand concept locks the form; the legal concept points to the statutory wording and flags who approves it.
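Rules like these only pay off when something checks them. As a rough sketch, the forbidden terms and the character limit from the dashboard entry could be enforced as below. The `RULES` structure and both function names are assumptions made for illustration, not the behaviour of any particular CAT tool or editor plug-in.

```python
import re

# Hypothetical rule set mirroring the "dashboard" entry above.
RULES = {
    "dashboard": {"forbidden": ["console", "control panel"], "max_chars": 20},
}


def check_text(text, rules=RULES):
    """Return (forbidden term, approved alternative) pairs found in the text."""
    findings = []
    for preferred, rule in rules.items():
        for bad in rule["forbidden"]:
            # Whole-word, case-insensitive match. Inflected forms such as
            # "consoles" are NOT caught by this simple pattern.
            if re.search(rf"\b{re.escape(bad)}\b", text, flags=re.IGNORECASE):
                findings.append((bad, preferred))
    return findings


def fits_ui_limit(label, max_chars):
    """True when a UI label fits within the recorded character limit."""
    return len(label) <= max_chars


print(check_text("Open the Control Panel to see your stats."))
print(fits_ui_limit("dashboard", RULES["dashboard"]["max_chars"]))
```

A production check would also need to handle plurals and inflection per language, which is one reason the handling notes in each entry matter.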

How to build your first termbase

Aim for a useful first release of around 150 concepts in four to six weeks, then iterate in small updates. Do not try to capture everything at once; momentum matters more than completeness in the first release.

Step 1: audit what you already have

Collect existing glossaries, brand word lists, style guides, UI string exports, help-centre articles, and recent translations. Read across them to find collisions: product names that change between teams, features described three ways, legal phrases that drift between documents. That collision list is your starter backlog.
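If your existing lists can be exported as simple source-to-target pairs, a short script can produce a first collision list for you. The sketch below assumes each glossary is a plain mapping from a source term to the target term one team uses; the glossary names and German equivalents are invented examples.

```python
from collections import defaultdict


def find_collisions(glossaries):
    """Find source terms that different glossaries render differently.

    glossaries: {glossary_name: {source_term: target_term}}
    Returns {source_term: {target_term: [glossary_names]}} for every source
    term that has more than one distinct target term.
    """
    seen = defaultdict(lambda: defaultdict(list))
    for name, entries in glossaries.items():
        for source_term, target_term in entries.items():
            # Normalise case and spacing on the source side only.
            seen[source_term.strip().lower()][target_term.strip()].append(name)
    return {term: dict(variants)
            for term, variants in seen.items() if len(variants) > 1}


# Invented sample data for illustration.
glossaries = {
    "marketing": {"dashboard": "Dashboard", "invoice": "Rechnung"},
    "support": {"Dashboard": "Übersicht", "invoice": "Rechnung"},
}
collisions = find_collisions(glossaries)
print(collisions)
```

This only catches collisions where teams share the same source term. Features described three different ways in the source language still need a human read across the material.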

Step 2: extract candidate terms

Run a term extraction on high-value sources such as product pages, user interfaces, contracts, technical docs, and release notes. Automated extraction surfaces volume; human judgement trims it to the concepts that actually affect meaning, legal risk, or user action. Keep the signal, drop the noise.
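Dedicated extraction tools do this far better, but the principle can be shown with a frequency count. The following is a deliberately naive sketch: it counts one- and two-word sequences, skips those that start or end with a stopword, and keeps anything seen at least twice. The stopword list, thresholds, and function name are assumptions, and it handles English only.

```python
import re
from collections import Counter

# A tiny, illustrative stopword list; a real run needs a fuller one per language.
STOPWORDS = {"the", "a", "an", "of", "to", "in", "on", "and", "or", "is",
             "for", "your", "you"}


def candidate_terms(texts, max_words=2, min_count=2):
    """Return (term, count) pairs for repeated one- to max_words-word sequences."""
    counts = Counter()
    for text in texts:
        # Punctuation is dropped, so sequences may span sentence boundaries.
        words = re.findall(r"[a-z][a-z\-]*", text.lower())
        for n in range(1, max_words + 1):
            for i in range(len(words) - n + 1):
                gram = words[i:i + n]
                if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                    continue
                counts[" ".join(gram)] += 1
    return [(term, c) for term, c in counts.most_common() if c >= min_count]


sample = [
    "Open the dashboard to export the usage report.",
    "The usage report appears on the dashboard.",
]
candidates = candidate_terms(sample)
print(candidates)
```

The output is the volume; deciding which candidates affect meaning, legal risk, or user action remains a human call.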

Step 3: define and decide

For each candidate, write a one-sentence definition someone can understand without specialist knowledge. Choose a preferred term per language, record synonyms, and mark any forbidden terms with the approved alternative. Where a term should not be translated, flag it as do-not-translate and note how to handle inflection or plurals in target languages.

Step 4: translate and review

In-market linguists add target-language entries, adapting for morphology, register, and constraints such as UI character limits. Lock the high-impact items first (product names, UI labels, legal phrases, and KPI names), because these drive the most queries and the most rework if left vague. High-risk domains such as medical, legal, and technical content deserve the earliest attention.

Step 5: publish and integrate

Make the termbase available where people work. Integrate it with your authoring tools and your translation environment so terms appear as suggestions and are checked automatically. Give everyone else a fast, read-only lookup, and publish a short change log when you release updates so people can see what moved.
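A change log does not need to be written by hand if each release can be exported. As a sketch, assuming every release is available as a mapping from concept ID to preferred term, the differences between two releases can be listed like this. The concept IDs, labels, and the second product name are made up for the example.

```python
def change_log(previous, current):
    """Compare two releases given as {concept_id: preferred_term} mappings."""
    lines = []
    for cid in sorted(current.keys() - previous.keys()):
        lines.append(f"ADDED {cid}: {current[cid]}")
    for cid in sorted(previous.keys() & current.keys()):
        if previous[cid] != current[cid]:
            lines.append(f"CHANGED {cid}: {previous[cid]} -> {current[cid]}")
    for cid in sorted(previous.keys() - current.keys()):
        lines.append(f"REMOVED {cid}: {previous[cid]}")
    return lines


# Invented sample releases.
release_1 = {"C-0001": "control panel", "C-0002": "SmartConnect", "C-0003": "basket"}
release_2 = {"C-0001": "dashboard", "C-0002": "SmartConnect", "C-0004": "right of withdrawal"}

log = change_log(release_1, release_2)
print("\n".join(log))
```

In a governed termbase a "removed" concept would normally be archived as deprecated rather than deleted, so the label here describes the export, not what happens to the entry.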

Who owns a termbase, and how terms get approved

A termbase needs one accountable owner, a curator who maintains entries, and a light, visible approval path. Without that, it drifts back into the scattered lists you started with.

A workable ownership model:

  • Owner: one accountable person in content, localisation, or product marketing.
  • Curator: a linguist or terminologist who de-duplicates, routes, and finalises entries.
  • Contributors: product managers, writers, support, and translators who propose terms when they spot gaps.
  • Approvers: subject-matter leads per domain, with legal signing off sensitive items.

How a term moves from idea to approved entry

The lifecycle is the backbone of a healthy termbase. A proposal carries the source sentence, a definition, the domain, a suggested term, known synonyms, and whether it should be translated. It then moves through four stages:

  • Proposed: anyone close to the content submits a term; it lands with an auto-assigned curator.
  • In review: the curator de-duplicates and routes to the right domain owner, who sets scope and usage notes.
  • Translation and adaptation: in-market linguists add target entries and validate do-not-translate cases.
  • Approved: the domain owner signs off meaning and fit, legal clears sensitive items, and the entry publishes and syncs to CAT tools.
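The four stages above amount to a small state machine. A minimal sketch, assuming a strictly linear flow and illustrative status names, might look like this. A real workflow would probably also need a way to send an entry back or reject it, which is left out here.

```python
# Linear lifecycle taken from the four stages above; status names are assumptions.
NEXT_STATUS = {
    "proposed": "in_review",
    "in_review": "translation",
    "translation": "approved",
}


def advance(status):
    """Return the next lifecycle stage, or raise if there is none."""
    if status not in NEXT_STATUS:
        raise ValueError(f"No further stage after {status!r}")
    return NEXT_STATUS[status]


def path_to_approval(start="proposed"):
    """List every stage an entry passes through from start to the final stage."""
    path = [start]
    while path[-1] in NEXT_STATUS:
        path.append(advance(path[-1]))
    return path


print(path_to_approval())
```

Encoding the allowed moves in one table means an entry cannot jump from proposed to approved without passing review and translation.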

Service-level targets keep momentum: triage within two working days, domain review within five, translation and in-market review within five to seven, and monthly publication with a short change log. Critical items (product names, legal phrases, safety messages, and KPI labels) can bypass the monthly cycle on a same-week fast track.
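To see what those targets mean on a calendar, the deadlines can be computed from the proposal date. This sketch makes several assumptions: all targets are counted in working days, each stage starts when the previous one is due, the upper bound of seven days is used for translation, and public holidays are ignored.

```python
from datetime import date, timedelta

# Assumed reading of the targets above, in working days per stage.
SLA_WORKING_DAYS = [("triage", 2), ("domain review", 5), ("translation", 7)]


def add_working_days(start, days):
    """Return the date that falls `days` working days (Mon to Fri) after start."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday is 0, Friday is 4
            remaining -= 1
    return current


def sla_deadlines(proposed_on):
    """Return {stage: due date}, with each stage starting when the previous is due."""
    deadlines = {}
    cursor = proposed_on
    for stage, days in SLA_WORKING_DAYS:
        cursor = add_working_days(cursor, days)
        deadlines[stage] = cursor
    return deadlines


for stage, due in sla_deadlines(date(2025, 3, 3)).items():
    print(f"{stage}: due {due.isoformat()}")
```

For a term proposed on Monday 3 March 2025, this gives triage by 5 March, domain review by 12 March, and translation by 21 March, which fits inside a monthly publication cycle.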

How a termbase sharpens AI and human translation

A governed termbase is one of the most useful things you can give both a human translator and a machine, because it constrains the most common quality failure: inconsistent or wrong terminology.

On the human side, a reviewer who can point to an approved term stops re-litigating wording, which is where reviewers quietly lose hours. Queries fall, output stays on brand, and translators keep terms consistent across markets.

On the machine side, the effect is measurable. Customising terminology, tone, and consistency has been shown to deliver 5 to 10 times fewer errors than baseline output (Intento, 2025), and strong terminology-constrained systems now target over 95% term success (research, 2024 to 2025). It matters most in specialised domains, where correct, consistent terms remain an unsolved problem for general machine translation (WMT25 Terminology Task), and where it feeds directly into medical translation quality assurance.
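You can track a rough version of term success on your own output. The sketch below uses one simple reading of the idea: of all the places where a termbase source term appears in a source segment, in what share does the approved target term appear in the translation? This is an assumption for illustration, not the metric definition used by the sources cited above, and the sample segments are invented.

```python
def term_success_rate(pairs, termbase):
    """Share of expected approved terms that appear in the translations.

    pairs: list of (source_segment, target_segment)
    termbase: {source_term: approved_target_term}
    Returns a float between 0 and 1, or None if no source term occurs at all.
    """
    expected = 0
    hits = 0
    for source, target in pairs:
        for source_term, target_term in termbase.items():
            # Plain substring matching: inflected target forms will be missed.
            if source_term.lower() in source.lower():
                expected += 1
                if target_term.lower() in target.lower():
                    hits += 1
    return hits / expected if expected else None


termbase = {"dashboard": "Dashboard", "right of withdrawal": "Widerrufsrecht"}
pairs = [
    ("Open the dashboard.", "Öffnen Sie das Dashboard."),
    ("You have a right of withdrawal.", "Sie haben ein Rücktrittsrecht."),
    ("Save your changes.", "Speichern Sie Ihre Änderungen."),
]
rate = term_success_rate(pairs, termbase)
print(rate)
```

Here one of the two expected terms is used, so the rate is 0.5. Run on real output before and after the termbase is connected, the same number gives you a simple trend line.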

This is also why a termbase pairs naturally with the routing decisions covered separately in our article on machine translation and AI: the higher the stakes, the more the approved terms earn their place.

Getting the tooling and integrations right

Pick tools that fit your team size, security posture, and existing systems, and wire the termbase into the places where words are actually written and translated. A separate portal that needs another login gets ignored.

Selection criteria that matter:

  • Role-based access with audit trails and single sign-on.
  • Connectors or APIs for your CMS, design tools, and translation memory.
  • Automatic term highlighting and checks in CAT tools.
  • Workflows for propose, review, approve, and deprecate.
  • Fast search, inline previews, and friendly editing.

Suggestions should appear in your CMS or editor and in your CAT tool. Our SmartConnect links a CMS, PIM, or shop system to us so approved terms surface in the tools your teams already use, rather than in a portal they have to remember to open.

Keeping it healthy, and measuring impact

Treat the termbase as a living product that needs regular care. Release in small, regular cycles so you avoid big, risky drops, review high-risk domains such as legal, safety, and finance quarterly, and archive deprecated entries rather than deleting them so old terms do not creep back via copy-paste.

Usage analytics help you focus: searches with no results show gaps to fill, and entries that never surface may be candidates for pruning. An hour a month with your curator is enough to plan the next set of tweaks.
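Both signals are easy to pull from a lookup log, if your tool exposes one. The sketch below assumes the log is available as (query, result count) pairs and that you can list which entry IDs were viewed; both formats and all names are hypothetical.

```python
from collections import Counter


def zero_result_queries(search_log, top=5):
    """Return the most frequent searches that found nothing: gaps to fill.

    search_log: iterable of (query, result_count) pairs.
    """
    misses = Counter(query.strip().lower()
                     for query, result_count in search_log
                     if result_count == 0)
    return misses.most_common(top)


def never_surfaced(entry_ids, viewed_ids):
    """Return entry IDs that were never viewed: candidates for pruning."""
    return sorted(set(entry_ids) - set(viewed_ids))


# Invented sample data.
search_log = [("Widget", 0), ("widget ", 0), ("dashboard", 3), ("sso", 0)]
gaps = zero_result_queries(search_log)
stale = never_surfaced(["C-0001", "C-0002", "C-0003"], ["C-0001", "C-0001", "C-0003"])
print(gaps)
print(stale)
```

A short list like this is a practical agenda for the monthly hour with your curator.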

Define what good looks like at the start, then watch a few measures:

  • Terminology queries from translators and reviewers, which should trend down as the database stabilises.
  • Review time, which should fall as people stop word-smithing and start checking meaning.
  • Translation-memory reuse, which should improve as phrasing repeats.
  • Cross-platform consistency, so your product and UI language reads the same everywhere.

What public institutions can teach you

Public institutions that publish in dozens of languages treat terminology as a core governance function, and their models translate directly to a company, right-sized.

The European Union runs its interinstitutional database, IATE, across the Commission, Parliament, and Council, with concept-oriented, richly annotated entries reviewed under defined workflows. You can explore the EU’s IATE terminology database. The United Nations maintains UNTERM across its agencies on the same principles.

What to copy, right-sized:

  • Concept first: one entry per concept, with variants and forbidden terms recorded per language.
  • Ownership: domain leads and legal reviewers approve high-risk items.
  • Metadata: definitions, contexts, sources, status, and change history on every entry.
  • Integration: term checks in the tools where writing and translation happen.
  • Openness: a read-only lookup for partners and vendors to reduce drift.

Frequently asked questions about building a termbase

What is the difference between a glossary and a termbase?

A glossary is a flat list of terms and their equivalents, quick to start and easy to lose control of. A termbase is a structured, concept-based database with definitions, rules, ownership, and version history, built to govern terminology at scale. Most teams start with a glossary and graduate to a termbase as languages and contributors grow.

Do I need a termbase, or is a glossary enough?

A glossary is enough when one team works in one or two languages and consistency is easy to keep by eye. You need a termbase once several departments produce content, or once you translate into multiple markets, because that is when the same concept starts being worded differently. Drift is the trigger, and it can appear at any company size.

What should a termbase contain?

At minimum, capture a concept ID and preferred term per language, a plain-language definition with an example, a domain tag, synonyms and forbidden terms with the approved alternative, do-not-translate flags, and ownership and status fields. Add optional fields such as character limits or pronunciation only when you will maintain them. Aim for fields that do real work, so the entry stays quick to fill.

How long does it take to build a termbase?

A useful first release of around 150 concepts is realistic in four to six weeks, then you iterate with small, regular updates. Trying to capture everything before launch is the most common reason termbase projects stall. Start with the highest-impact terms and grow from there.

What is the difference between a termbase and a translation memory?

A termbase stores concepts, approved terms, and the rules for using them. A translation memory stores previously translated segments so they can be reused. They complement each other: the termbase governs the words, the translation memory reuses the sentences.

Can a termbase improve machine translation quality?

Yes. Constraining terminology with an approved termbase materially reduces errors in both neural machine translation and large language model output, and it matters most in specialised domains where a single wrong term carries real cost. A termbase is one of the most effective ways to make AI translation safe to use on higher-stakes content.

How we can help at AdHoc Translations

We build and maintain practical termbases that plug into your writing and translation stack, so the database does real work inside the tools your teams already use. We pair native-language specialists with a light governance model that keeps decisions clear.

In practice, we map your languages and domains, propose a field template, and deliver a first release you can grow without changing platform later. We hold ISO 17100, and we keep terminology consistent across human and AI-supported work, leaning on people, process, and technology.

If you are ready to turn scattered glossaries into a governed terminology database, see how we structure your multilingual content.

Sources

  • Intento. The State of Translation Automation 2025 (terminology customisation reduces errors 5 to 10x). 2025. inten.to.
  • WMT25 Terminology Translation Task. Conference on Machine Translation. 2025. statmt.org.
  • Weglot. Machine Translation Quality in 2025 (LLM hallucination rates). 2025. weglot.com.
  • IATE, the EU’s interinstitutional terminology database. European Parliament. europarl.europa.eu.
  • UNTERM, the United Nations terminology database. unterm.un.org.
  • ISO 17100:2015 (Translation services. Requirements for translation services). International Organization for Standardization. iso.org.