How the Glotix Translation Engine Works — A Technical Deep Dive

The Problem with Flat Text Translation

Most website translation tools extract text from your page, translate it, and paste it back. This "flat" approach breaks in three ways:

1. Inline elements get destroyed: "Click here to start", with part of the phrase wrapped in an inline tag, becomes one string, losing the tag
2. Context is lost — the word "Post" in a blog means something different than "Post" on a form button
3. Dynamic content is missed — SPAs, React apps, and JS-rendered text never get translated

Glotix solves all three.

DOM Tree Walking

When the Glotix SDK loads, it walks your DOM using the browser's native TreeWalker API with NodeFilter.SHOW_TEXT. This finds every Text node individually — not elements, not innerHTML, but the actual text content nodes.
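As a rough sketch of this walk (not the actual Glotix source), the loop below drains a TreeWalker-style object and keeps the text nodes it yields. The names WalkerLike, TextNodeLike, and collectTextNodes are hypothetical, and skipping whitespace-only nodes is our assumption rather than documented behavior. The structural types keep the sketch independent of a browser; on a real page the walker would come from document.createTreeWalker(root, NodeFilter.SHOW_TEXT).

```typescript
// Minimal structural types so the sketch does not depend on a browser.
// On a page, pass: document.createTreeWalker(root, NodeFilter.SHOW_TEXT)
interface TextNodeLike {
  textContent: string | null;
}

interface WalkerLike {
  nextNode(): TextNodeLike | null;
}

// Collect every text node the walker yields, skipping whitespace-only ones.
// (The whitespace filter is an assumption, not documented Glotix behavior.)
function collectTextNodes(walker: WalkerLike): TextNodeLike[] {
  const found: TextNodeLike[] = [];
  let node = walker.nextNode();
  while (node !== null) {
    if ((node.textContent ?? "").trim() !== "") {
      found.push(node);
    }
    node = walker.nextNode();
  }
  return found;
}
```

Because the walker only ever yields Text nodes, element boundaries are never flattened: each piece of text is handled where it sits in the tree.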

For each text node, we compute a contextual hash:

hash = FNV-1a(textContent + "|" + parentPath)

Where parentPath is the chain of ancestor tag names, for example BODY>DIV>NAV>A. This gives every text occurrence a fingerprint based on both its content and its position in the DOM tree. Occurrences with identical text and an identical ancestor path share a hash.

The same word "Home" gets two different hashes depending on where it appears:

  • fnv1a("Home|BODY>NAV>UL>LI>A") — navigation link
  • fnv1a("Home|BODY>MAIN>H1") — page heading

This means each occurrence translates independently with full context awareness.
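To make the hashing concrete, here is a minimal sketch of the formula above. The post only names FNV-1a, so the 32-bit width and the UTF-8 byte encoding are assumptions, as are the helper names parentPath, contextualHash, and the ElementLike type. The path stops at BODY to match the examples above.

```typescript
// Structural stand-in for a DOM Element (hypothetical, keeps the sketch browser-free).
interface ElementLike {
  tagName: string;
  parentElement: ElementLike | null;
}

// 32-bit FNV-1a over the UTF-8 bytes of the input.
// Width and encoding are assumptions; the post only names the algorithm.
function fnv1a(input: string): number {
  const bytes = new TextEncoder().encode(input);
  let hash = 0x811c9dc5; // FNV offset basis (32-bit)
  for (let i = 0; i < bytes.length; i++) {
    hash ^= bytes[i];
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep unsigned 32 bits
  }
  return hash >>> 0;
}

// Build the ancestor chain, outermost first, e.g. "BODY>NAV>UL>LI>A".
function parentPath(parent: ElementLike | null): string {
  const tags: string[] = [];
  let el = parent;
  while (el !== null) {
    tags.unshift(el.tagName);
    if (el.tagName === "BODY") break; // paths in the post start at BODY
    el = el.parentElement;
  }
  return tags.join(">");
}

// hash = FNV-1a(textContent + "|" + parentPath)
function contextualHash(text: string, parent: ElementLike | null): number {
  return fnv1a(text + "|" + parentPath(parent));
}
```

Calling contextualHash("Home", linkElement) and contextualHash("Home", headingElement) feeds two different strings into the hash, which is what lets the two occurrences be translated separately.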

Translation Queue

Discovered text nodes are batched into a translation queue. The queue debounces for 300ms (to batch rapid DOM changes), then fires a single POST /api/translate request with up to 100 items.
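The batching behavior can be sketched as a small debounced queue. This is an illustration under assumptions, not the SDK's code: TranslationQueue, QueueItem, and the send callback are hypothetical names, and splitting an oversized backlog into several 100-item chunks is our guess at what happens when more than 100 items are waiting.

```typescript
interface QueueItem {
  hash: number;
  text: string;
}

type SendBatch = (batch: QueueItem[]) => void;

class TranslationQueue {
  private pending: QueueItem[] = [];
  private timer: ReturnType<typeof setTimeout> | undefined = undefined;
  private send: SendBatch;
  private delayMs: number;
  private maxBatch: number;

  // In a browser, `send` might POST the batch to /api/translate
  // (the request body shape is not specified in the post).
  constructor(send: SendBatch, delayMs: number = 300, maxBatch: number = 100) {
    this.send = send;
    this.delayMs = delayMs;
    this.maxBatch = maxBatch;
  }

  // Add an item and restart the debounce timer, so a burst of
  // discoveries results in one flush after things go quiet.
  enqueue(item: QueueItem): void {
    this.pending.push(item);
    if (this.timer !== undefined) clearTimeout(this.timer);
    this.timer = setTimeout(() => this.flush(), this.delayMs);
  }

  // Send everything queued so far, at most maxBatch items per call.
  flush(): void {
    if (this.timer !== undefined) clearTimeout(this.timer);
    this.timer = undefined;
    while (this.pending.length > 0) {
      this.send(this.pending.splice(0, this.maxBatch));
    }
  }
}
```

Restarting the timer on every enqueue is what turns a burst of DOM changes into a single request instead of one request per node.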

The server checks its KV cache first. Cache hits return instantly. Cache misses go to GPT-4o-mini for AI translation, then get cached permanently.
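The server-side flow described here is a cache-aside lookup, which might look roughly like the sketch below. Everything concrete in it is an assumption for illustration: a Map stands in for the KV store, the Translator callback stands in for the model call, and the "lang:hash" key format is invented.

```typescript
// Stand-in for the AI translation call (hypothetical signature).
type Translator = (texts: string[], targetLang: string) => Promise<string[]>;

interface TranslateItem {
  hash: number;
  text: string;
}

// Cache-aside: serve hits from the cache, translate only the misses,
// then store the new results so later requests become hits.
async function translateWithCache(
  items: TranslateItem[],
  targetLang: string,
  cache: Map<string, string>, // in-memory stand-in for the KV store
  translate: Translator,
): Promise<Map<number, string>> {
  const result = new Map<number, string>();
  const misses: TranslateItem[] = [];

  for (const item of items) {
    const cached = cache.get(targetLang + ":" + item.hash); // key format is an assumption
    if (cached !== undefined) {
      result.set(item.hash, cached);
    } else {
      misses.push(item);
    }
  }

  if (misses.length > 0) {
    const translated = await translate(misses.map((m) => m.text), targetLang);
    misses.forEach((m, i) => {
      cache.set(targetLang + ":" + m.hash, translated[i]);
      result.set(m.hash, translated[i]);
    });
  }
  return result;
}
```

With this shape, a string is only ever sent to the model once per target language; every later request for the same hash is answered from the cache.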

Client-side, translations are stored in window.__glotix_translateMap, a Map keyed by hash. Repeat encounters of the same hash (from re-renders or navigation) apply instantly without any API call.

MutationObserver

After the initial walk, a MutationObserver watches for DOM changes:

  • childList: true — new elements added
  • characterData: true — text content changed
  • subtree: true — deep watching

When a mutation fires, the affected subtree is re-walked. Only NEW text nodes (hashes not in the known set) get enqueued. A requestAnimationFrame debounce prevents rapid-fire mutations from overwhelming the queue.

Critically, the observer ignores mutations caused by Glotix itself (translation application and restoration) using a WeakSet tracking system.
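One way such a WeakSet scheme could work is sketched below. It is a guess at the shape, not the SDK's implementation: selfWrites, applyText, and filterForeignMutations are hypothetical names, and MutationLike is a structural stand-in for the browser's MutationRecord so the logic can run outside a page.

```typescript
// Structural stand-in for MutationRecord (only the fields used here).
interface MutationLike {
  type: string;
  target: object;
}

// The three flags listed above. In a browser they would be passed to
// new MutationObserver(callback).observe(document.body, observerOptions).
const observerOptions = { childList: true, characterData: true, subtree: true };

// Nodes the SDK has just written to. A WeakSet does not keep them alive.
const selfWrites = new WeakSet<object>();

// Write translated text, marking the node so the observer can skip the
// mutation record that this write produces.
function applyText(node: { textContent: string | null }, text: string): void {
  selfWrites.add(node);
  node.textContent = text;
}

// Drop records caused by our own writes; keep everything the host page did.
function filterForeignMutations(records: MutationLike[]): MutationLike[] {
  return records.filter((record) => {
    if (selfWrites.has(record.target)) {
      selfWrites.delete(record.target); // consume the mark: one write, one skipped record
      return false;
    }
    return true;
  });
}
```

Consuming the mark after one skipped record means a later edit to the same node by the host page is still picked up and translated.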

WeakRef for Memory Safety

Every registered text node is stored as a WeakRef. This means if the host page removes a DOM element, the SDK doesn't hold a reference that prevents garbage collection.

When applying a translation, the SDK calls nodeRef.deref() — if it returns undefined, the node was garbage collected and the translation is skipped silently.
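A minimal sketch of that registry, assuming one entry per hash for brevity (the real SDK may track several nodes under one hash). The names registry, register, and applyTranslation are hypothetical; WeakRef and deref() are the standard JavaScript API.

```typescript
interface TextNodeLike {
  textContent: string | null;
}

// Anything with deref() works here, which also makes the logic easy to exercise.
interface RefLike<T> {
  deref(): T | undefined;
}

interface RegistryEntry {
  ref: RefLike<TextNodeLike>;
  original: string; // original text, kept so it can be restored later
}

const registry = new Map<number, RegistryEntry>();

// Store only a weak reference, so the registry never keeps a removed node alive.
function register(hash: number, node: TextNodeLike): void {
  registry.set(hash, { ref: new WeakRef(node), original: node.textContent ?? "" });
}

// Returns true if the translation was applied, false if the node is unknown or gone.
function applyTranslation(hash: number, translated: string): boolean {
  const entry = registry.get(hash);
  if (entry === undefined) return false;
  const node = entry.ref.deref();
  if (node === undefined) {
    registry.delete(hash); // node was collected: drop the stale entry and skip
    return false;
  }
  node.textContent = translated;
  return true;
}
```

Deleting the stale entry on a failed deref() keeps the registry from slowly filling up with references to nodes that no longer exist.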

Language Switching

The language switcher widget lives in a Shadow DOM to prevent style conflicts with the host page. When switching languages:

1. The MutationObserver is paused for the duration of restoration, to prevent feedback loops
2. All nodes are restored to their original text (stored at registration time)
3. The in-memory translation cache is cleared
4. All known hashes are re-enqueued for the new target language
5. New translations are fetched and applied
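The switching steps above can be sketched as a single function over the SDK's state. This is an illustrative outline only: SdkState, ObserverControl, and switchLanguage are hypothetical names, and the pause/resume pair is an assumption about how the observer is silenced (in a browser, pause could map to observer.disconnect() and resume to calling observer.observe() again).

```typescript
interface ObserverControl {
  pause(): void;  // e.g. observer.disconnect() in a browser (assumption)
  resume(): void; // e.g. observer.observe(root, options) in a browser (assumption)
}

interface Registered {
  node: { textContent: string | null };
  original: string;
}

interface SdkState {
  targetLang: string;
  registry: Map<number, Registered>; // hash -> node and its original text
  translateMap: Map<number, string>; // in-memory translation cache
  pendingHashes: number[];           // stand-in for the translation queue
  observer: ObserverControl;
}

function switchLanguage(state: SdkState, newLang: string): void {
  // Pause observation so restoring text does not look like new content.
  state.observer.pause();
  try {
    // Restore every node to the text stored at registration time.
    state.registry.forEach((entry) => {
      entry.node.textContent = entry.original;
    });
  } finally {
    state.observer.resume();
  }

  // Old translations belong to the previous language.
  state.translateMap.clear();
  state.targetLang = newLang;

  // Re-enqueue all known hashes; fetching and applying the new
  // translations is then left to the normal queue flush.
  state.registry.forEach((_entry, hash) => {
    state.pendingHashes.push(hash);
  });
}
```

Restoring the originals first means the new language is always translated from the source text, never from a previous translation.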

A 2-second polling interval serves as a safety net, catching any text nodes that the MutationObserver might have missed.

The Result

An 11KB script that translates any website (static, dynamic, SPA, or server-rendered) while preserving DOM structure, understanding context, and handling real-time content changes. DOM walking, hashing, and applying translations all run client-side; the server is only involved for the batched /api/translate requests.
