Monday, September 29, 2025

Manifesto for a New Pricing Model in Translation: From Words… to Risk


*

A Translator’s Voice: A Risk-Indexed Pricing Model for Translation (French version)

*

Beyond the “faster, cheaper, max-quality” mantra

For decades, translation has been treated as a commodity: we count words, squeeze deadlines, and declare a vague “quality.” That race to the bottom hides real costs—errors, rework, delays, disputes, reputational damage, and burned-out suppliers. 

Thesis: Move from counting words to pricing exposure to risk.

The non-negotiables (after Saint Jerome):

  1. Fidelity to truth — sources, facts, citations.
  2. Intelligibility — readers, use cases, context.
  3. Assumption of responsibility — accountable choices, checks, remediation.
Models, metrics, and tools—including LLMs—are only means in service of this threefold duty.

This post argues for replacing per-word pricing with risk-indexed pricing, and for making accountability explicit across clients, LSPs, and translators, especially in the LLM era. Meanwhile, the entire responsibility for constantly offering the client the best DEADLINES, the best COSTS, and the best QUALITY falls squarely on the translator, who no longer has any say in the matter, or so everyone seems to believe...

The original error was to associate translation with a commodity: a false friend in French, where "commodité" means amenity, advantage, comfort, or convenience, and a true enemy in English, where it means a basic product, a raw material, simple merchandise, with no difference between a translation and a kilo of potatoes! So the more kilos there are (beyond a few thousand words), the bigger the discount must be...

As I once said, the only raw material in translation is the translator's gray matter. Some will object that in the era of LLMs this claim no longer holds, since 95% of the translation is done in seconds (in theory...) by the neural-translation-plus-transformer duo. That is the myth. What remains is 100% of the accountability: knowing where to intervene, and why.

But currently the reasoning, and the calculation, of clients and LSPs runs as follows: if an LLM translates 95% of a text well, only 5% of the work remains for the "finisher." In other words, on a 10,000-word text, the finisher is paid for only 500 words. That is about as absurd as paying a bridge inspector per square meter of rust found...

Serious post-editing works the same way. On 10,000 words, correcting 500 may take only a moment; but reviewing 100% of the text provided by the AI to find them, and knowing where and how to intervene, is the fruit of many years of experience and takes far more time! The fluent draft is not the finished work; accountability is. Hence the well-known joke, often told to illustrate the value of expertise and experience compared with the time spent on a task:

A cargo ship is broken down. After days without managing to get it going again, they bring in an old mechanic. He listens, touches a few pipes, takes out a hammer, gives a single blow—the engine starts right away. He sends a bill for €15,000. The shipowner, stunned, demands the details. The mechanic writes:
Hitting with the hammer: €10
Knowing where to hit: €14,990
Total: €15,000

The punchline: compared with the cold speed of AI, "knowing where to hit" (which keys on the keyboard :-) stands for human expertise in nuances, contexts, cultures, techniques, and so on.

We must therefore eliminate, once and for all, this logic of pricing translation by the kilo, which inevitably feeds the harmful spiral of lowest-bidder contracts and abnormally low offers. When a call for tenders rests on unrealistic promises, the LSP claws the margin back later in the production chain (time, revision, profiles), transferring all the residual risk, unrecognized and even less remunerated, to the translator. The number of words can therefore no longer be the only adjustment variable: we must price the risk, not the words!
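
To make the contrast concrete, here is a minimal sketch, in Python, of what risk-indexed pricing could look like next to the "5% of words" logic. The rates, risk tiers, and multipliers are illustrative assumptions, not industry figures; the point is the shape of the formula: review effort scales with the full text and with the exposure it carries, not with the handful of words that end up changed.

```python
# Minimal sketch of risk-indexed pricing. All figures are assumptions
# for illustration, not industry rates.

# Hypothetical risk multipliers per content domain.
RISK_MULTIPLIER = {
    "general": 1.0,    # low exposure: internal notes, blog posts
    "marketing": 1.5,  # brand and reputation exposure
    "legal": 2.5,      # contractual and liability exposure
    "medical": 3.0,    # safety-critical exposure
}

def per_word_price(words_changed: int, rate: float = 0.12) -> float:
    """The current logic: pay only for the words the 'finisher' edits."""
    return words_changed * rate

def risk_indexed_price(total_words: int, domain: str,
                       review_rate: float = 0.04,
                       base_audit_fee: float = 150.0) -> float:
    """Price the exposure: full-text review weighted by risk tier,
    plus a fixed fee for traceability (checks, audit log, sign-off)."""
    return base_audit_fee + total_words * review_rate * RISK_MULTIPLIER[domain]

if __name__ == "__main__":
    total_words, words_changed = 10_000, 500  # the 95% / 5% scenario above
    print(f"Per-word ('finisher') pay: {per_word_price(words_changed):,.2f}")
    print(f"Risk-indexed, legal text:  {risk_indexed_price(total_words, 'legal'):,.2f}")
```

The exact numbers matter far less than the structure: the word count enters only as a proxy for review effort, while the risk tier and the audit obligations drive the price.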

Out of curiosity, I asked the AI to list the risks associated with this "LLM 95% / finisher 5%" approach; the list is impressive:

1. Quality & Accuracy

  • Factual hallucinations (invented info, erroneous citations).
  • Reasoning errors (broken logical chains, poor prioritization).
  • Critical omissions (missing legal/technical details).
  • Over- or under-confidence (categorical tone on dubious matters, or vice versa).
  • Drift from instructions (responses out of scope, non-compliance with brief).
  • Sensitivity to phrasing (prompt variants ⟶ inconsistent results).
  • Performance drop on rare cases (long tail, specialized domains).
  • Poor context management (lost instructions, truncated windows).
  • Risky formatting (dates, numbers, units, tags, placeholders).

2. Security & Privacy

  • Data leaks (PII, secrets, NDA-covered material, client documents).
  • Prompt injection / data exfiltration via source content or tools.
  • Exposure to third-party plugins/tools (supply chain).
  • Uncontrolled logging (logs containing sensitive data).
  • Model inversion / extraction of training data (IP/PII risk).
  • Poisoning (corruption of sources, memories, TM/glossaries).

3. Legal & Compliance

  • Non-compliance with sectoral regulations (GDPR, HIPAA, finance, health, defense).
  • Copyright / IP (reproductions too close, data licenses).
  • Defamation (unfounded accusations against people/organizations).
  • Regulated advice (legal, medical, tax) poorly flagged.
  • Insufficient traceability (impossibility to audit/justify an output).

4. Brand & Reputation (clients and LSPs)

  • Brand tone/voice not respected (too familiar, aggressive, flat).
  • Multi-channel inconsistency (responses diverging across touchpoints).
  • Cultural missteps (intercultural gaffes, misplaced humor).
  • Biases / stereotypes (political, gender, ethnicity, religion).
  • Public crises (viral screenshot of an inappropriate response).
  • Loss of trust (marketing promises not kept).

5. LSP / Localization Specific

  • Misinterpretations and false friends (legal, medical, technical).
  • Non-compliant terminology (glossaries/termbases ignored).
  • Variable errors ({name}, {amount}, misplaced placeholders).
  • Regional formatting (dates, currencies, separators, reading direction).
  • Gender and inclusivity (agreements, neutrals, local sensitivities).
  • Local SEO/ASO (inadequate keywords, loss of ranking).
  • Broken layout (expansion/length constraints).
  • Client content confidentiality (leaks via MT/LLM).
  • TMS/CAT chain (poor synchronization, locked segments overwritten).

6. Operational & Product

  • Latency / availability (SLAs not met, timeouts).
  • Unpredictable cost (token drift, tool calls).
  • Model versioning (silent regressions on upgrade).
  • Data drift (real-world changes not absorbed).
  • Poor fallback (uncontrolled degradations when LLM fails).
  • Lacking observability (no metrics, no alerts).
  • Biased evaluations (benchmarks not representative of real traffic).

7. User & Organization

  • Automation bias (humans validating too quickly).
  • Proofreading fatigue (human-in-the-loop, or HITL, reviewers whose vigilance drops).
  • Shadow prompts (teams tinkering with unvalidated prompts).
  • Lack of training (misuse, unrealistic expectations).
  • Process change (friction, partial adoption ⟶ control gaps).

8. Sensitive Content & Product Safety

  • Toxicity / harassment (offensive outputs).
  • Disinformation (propagation of plausible errors).
  • Physical/IT security if AI drives actions (executes code, commands).
  • Openness to abuse (jailbreaks, diversion from intended use).

9. Governance & Ethics

  • Absence of clear policies (when to use / not use the LLM).
  • Lack of access control (who can send what to the model).
  • No RAPID® decision framework (Recommend, Agree, Perform, Input, Decide): no one clearly assigned to recommend, approve, perform, provide input, or decide.
  • Insufficient documentation (prompts, test sets, decisions).

When you consider that these risks—real, even if rarely all present at once—are effectively offloaded to the so-called “finisher” without legal or financial recognition, this isn’t optimization; it’s a disguised transfer of risk. For the translator—the one person with no control over the tool, no say on compensation, and no insurance coverage—accepting such a setup is untenable economically and ethically. It is, in practice, agreeing to work without a safety net.

The conclusion is clear: if we want to leverage LLMs without creating a professional scapegoat, we must reverse the equation of responsibility and value.

Concretely:

  • Redefine the role. The “finisher” is a translator-editor accountable for accuracy, compliance, and voice. Price and schedule for an audit, not for touch-ups.
  • Contract for responsibility. Define LLM use limits; allocate liabilities; require professional indemnity; include a right to refuse high-risk drafts.
  • Make traceability mandatory. Deliver an audit log (sources, checks, decisions), versions, prompt ID/model/parameters, and passed checklists; one possible shape is sketched after this list.
  • Install technical safeguards. PII filters, hallucination checks, regression tests, HITL sampling, and a defined fallback path.
  • Govern usage. Policies on when to use and not use LLMs; RAPID® roles; access controls; mandatory training.
  • Index price to risk. Higher-risk contexts (medical, legal, global brand) warrant higher control costs; otherwise you build risk debt. Price the risk, not the words.
  • Be transparent with clients. Declare AI use, limits, and the scope of human review. Trust is earned upstream, not post-mortem (in other words, before the incident occurs).

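As an illustration of the traceability point above, here is one possible shape for an audit-log entry, again in Python. The field names (job_id, prompt_id, params, checks, and so on) are assumptions chosen for the example, not a standard schema; any real log would follow whatever the contract and the TMS/CAT chain require.

```python
# Hypothetical audit-log entry for one delivered job. Field names are
# illustrative assumptions, not a standard schema.
audit_entry = {
    "job_id": "JOB-2025-0929-001",
    "source_hash": "sha256:…",           # which source text was processed
    "prompt_id": "PRM-legal-fr-en-v3",   # which prompt template was used
    "model": "example-llm-v2",           # model name and version
    "params": {"temperature": 0.2},      # generation parameters
    "checks": {                          # verifications actually run
        "terminology_vs_glossary": "pass",
        "placeholders_intact": "pass",
        "pii_filter": "pass",
        "numbers_dates_units": "fail, corrected by reviewer",
    },
    "decision": "delivered after human revision of flagged segments",
    "reviewer": "translator-editor (accountable signatory)",
}
```

Delivered alongside the translation, such a record is what turns "trust me" into something a client, an auditor, or an insurer can actually verify.
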
Translation has never been, is not, and will never be a commodity; it is a governed risk activity. The only raw material is expert judgment. Per-word pricing rewards keystrokes, not accountability, and pushes hidden risks downstream—often onto the translator. A risk-indexed model aligns incentives with reality: price the exposure, make roles and liabilities explicit, require traceability, and install technical and organizational safeguards. LLMs can reduce effort; they do not remove responsibility. 

Trust is earned upstream—by design, not apology. Commodity pricing has had its run; risk-indexed pricing should be the norm.


P.S. The ‘5%’ figure is a convenience for illustration; the argument holds at 7–10% or other orders of magnitude.
