# i18n‑ai‑translate

[![npm version](https://img.shields.io/npm/v/i18n-ai-translate.svg)](https://www.npmjs.com/package/i18n-ai-translate)
[![npm downloads](https://img.shields.io/npm/dw/i18n-ai-translate.svg)](https://www.npmjs.com/package/i18n-ai-translate)
[![Build](https://img.shields.io/github/actions/workflow/status/taahamahdi/i18n-ai-translate/build.yml?branch=master)](https://github.com/taahamahdi/i18n-ai-translate/actions/workflows/build.yml)
[![License: GPL‑3.0](https://img.shields.io/npm/l/i18n-ai-translate.svg)](https://github.com/taahamahdi/i18n-ai-translate/blob/master/LICENSE)

AI‑powered localization for your translation catalogues. Automate translating single files or entire directories with ChatGPT, Gemini, Claude, or local Ollama models, while keeping translations accurate, formatting consistent, and placeholders intact. Works with **i18next‑style** JSON out of the box, plus Gettext `.po`, Java `.properties`, and iOS `.strings`.

_For a detailed walkthrough and advanced tips, see [ADVANCED_GUIDE.md](ADVANCED_GUIDE.md)._

---

## Why use it?

| Feature           | What it means                                                                            |
| ----------------- | ---------------------------------------------------------------------------------------- |
| **Multi‑engine**  | Choose OpenAI, Google, Anthropic, or your own Ollama models                              |
| **Fast**          | Parallel per-batch workers share one rate limiter; translate 20 locales concurrently     |
| **Safe**          | Translations verified against the source before being written                            |
| **Diff‑aware**    | Only re‑translate keys you changed; existing translations are preserved                  |
| **Check mode**    | Audit existing translations for drift, missing placeholders, or quality regressions      |
| **Format‑aware**  | i18next JSON, Gettext `.po`, Java `.properties`, iOS `.strings`; round‑tripped intact    |
| **Context-aware** | `--context` flag injects product info so the model picks domain-appropriate terminology  |
| **Dry‑run**       | Preview updates before touching disk                                                     |
| **Everywhere**    | Use as a CLI, GitHub Action, or Node library                                             |

---

## Quick start

### 1 · Install

```bash
npm i -g i18n-ai-translate        # or yarn add i18n-ai-translate --dev
export OPENAI_API_KEY=•••         # or GEMINI_API_KEY / ANTHROPIC_API_KEY
```

### 2 · Translate a file

```bash
i18n-ai-translate translate -i i18n/en.json -o fr \
  -e chatgpt -m gpt-5.2
```

Need more languages? Pass multiple codes (`-o fr es de`) or `-A` for **all** 180+. Filenames like `es-ES.json` / `pt-BR.json` are accepted too; the language subtag is extracted automatically. Skip specific locales with `--exclude-languages fr de` (handy for locales you maintain by hand).

**Other formats:** besides i18next JSON, Gettext `.po`, Java `.properties`, and iOS `.strings` files work too; the format is inferred from the file extension (override with `--file-format json|po|properties|strings`). Non-translatable structure round-trips losslessly: PO comments, `msgctxt`, and plural forms; `.properties` comments, separators, and line continuations; `.strings` `/* */` and `//` comments and quoting.
Native placeholders (`printf` `%s`/`%1$s`/`%@`, MessageFormat `{0}`/`{1}`) are preserved across the translation. Works across `translate` (file + folder), `diff`, and `check`.

### 3 · Translate a folder

```bash
i18n-ai-translate translate -i i18n/en -o fr es de \
  -e chatgpt -m gpt-5.2
```

Recursively translates every `*.json` file in `en` and writes the results to `i18n/fr`, `i18n/es`, and `i18n/de`.

### 4 · Translate only what changed

```bash
i18n-ai-translate diff \
  -b i18n/en-before.json -a i18n/en.json \
  -l en -e claude -m claude-sonnet-4-6
```

Preserves every existing translation; only added/modified keys are re-translated, only deleted keys are removed. Per-locale writes are persisted as each language finishes, so a mid-run crash doesn't discard completed work.

### 5 · Check an existing translation

```bash
i18n-ai-translate check -i i18n/en.json -o fr de \
  -e chatgpt -m gpt-5.2 --format json
```

Runs the verification pipeline against your existing translations without writing anything. Emits a structured report of keys the model flagged. Exits non-zero if any issue is found, so you can gate CI on it.

### 6 · Keep PRs up‑to‑date

Add a one‑liner GitHub Action to auto‑translate whenever `en.json` changes:

```yaml
- uses: taahamahdi/i18n-ai-translate@master
  with:
    json-file-path: i18n/en.json
    api-key: ${{ secrets.OPENAI_API_KEY }}
```

---

## CLI cheat‑sheet

```bash
translate -i <src> -o <lang…> [options]   # Translate a file or folder
diff -b <before> -a <after> [options]     # Re‑translate only edited keys
check -i <src> -o <lang…> [options]       # Verify existing translations (no writes)
```

Common flags (all subcommands accept these unless noted):

| Flag                     | Default         | Description                                                                     |
| ------------------------ | --------------- | ------------------------------------------------------------------------------- |
| `-e, --engine`           | chatgpt         | chatgpt · gemini · claude · ollama                                              |
| `-m, --model`            | gpt‑5.2         | e.g. `gemini-2.5-flash`, `claude-sonnet-4-6`, `llama3.3`                        |
| `-l, --input-language`   | from filename   | ISO‑639‑1 code or English name (`en`, `French`); BCP‑47 tags like `pt-BR` OK    |
| `-r, --rate-limit-ms`    | engine‑specific | Minimum gap between requests                                                    |
| `--concurrency`          | 2               | Batches to run in parallel within one language                                  |
| `--language-concurrency` | 1               | Target languages to translate in parallel (shares pool + rate limit)            |
| `--tokens-per-minute`    | off             | Extra TPM cap across all workers; pair with `--concurrency` to stay under tier  |
| `--context <string>`     | none            | Product/domain context, e.g. `"a B2B invoicing SaaS"`                           |
| `--glossary <path>`      | none            | JSON file: keep-verbatim terms + forced per-language translations               |
| `--exclude-languages`    | none            | Locales to skip (for manually‑maintained targets)                               |
| `--no-continue-on-error` | continue        | Abort on first key/batch failure instead of skipping                            |
| `--dry-run`              | false           | Don't write files, preview instead (translate/diff only)                        |
| `--cache [path]`         | off             | Reuse a translation memory across runs; skip unchanged strings (translate/diff) |
| `--file-format`          | from extension  | File format: `json`, `po`, `properties`, `strings` (translate/diff/check)       |
| `--format`               | table           | `table` or `json` report output (check only)                                    |

Full flag list: `i18n-ai-translate <subcommand> --help`.
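
To make the interplay between `--rate-limit-ms` and `--concurrency` more concrete, here is a minimal sketch of how several parallel workers could share one "minimum gap between requests" limiter. It is an illustration under assumptions, not the tool's actual implementation: `MinGapScheduler` and `reserve` are hypothetical names, and the real limiter may behave differently (for example, it also applies backoff and an optional TPM cap).

```typescript
// Hypothetical helper, NOT part of i18n-ai-translate's API.
// It hands out request start times that are at least `minGapMs` apart,
// no matter how many workers ask at once.
class MinGapScheduler {
    private readonly minGapMs: number;
    private nextFreeAt = 0;

    constructor(minGapMs: number) {
        this.minGapMs = minGapMs;
    }

    /** Returns the earliest timestamp (in ms) at which the caller may send its request. */
    reserve(nowMs: number): number {
        // Either start right away, or wait for the slot after the previous reservation.
        const startAt = Math.max(nowMs, this.nextFreeAt);
        this.nextFreeAt = startAt + this.minGapMs;
        return startAt;
    }
}

// Three workers asking at the same instant (t = 1000 ms) with a 500 ms gap.
const scheduler = new MinGapScheduler(500);
const slots = [scheduler.reserve(1000), scheduler.reserve(1000), scheduler.reserve(1000)];
console.log(slots); // prints [ 1000, 1500, 2000 ]
```

Under this model, raising `--concurrency` does not raise the request rate beyond what the gap allows; extra workers simply queue for the next free slot, which is why a single shared limiter keeps parallel runs within provider limits.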

---

## Use as a library

```ts
import { translate, translateDiff, check } from "i18n-ai-translate";

const fr = await translate({
    inputJSON: require("./en.json"),
    inputLanguageCode: "en",
    outputLanguageCode: "fr",
    engine: "chatgpt",
    model: "gpt-5.2",
    apiKey: process.env.OPENAI_API_KEY,
    context: "a music trivia game for Discord", // optional
    concurrency: 4, // optional
});

const report = await check({
    inputJSON: require("./en.json"),
    targetJSON: require("./fr.json"),
    inputLanguageCode: "en",
    outputLanguageCode: "fr",
    engine: "chatgpt",
    model: "gpt-5.2",
    apiKey: process.env.OPENAI_API_KEY,
});
// report.issues = [{ key, original, translated, issue, suggestion }]
```

---

## Advanced topics

* **Prompt modes**: `csv` (faster, GPT‑class models only) vs `json` (structured output, works with weaker models too)
* **Custom prompts**: swap in your own generation/verification prompts via `--override-prompt`
* **Translation memory**: `--cache [path]` stores translations in a JSON file (default `.i18n-ai-translate-cache.json`) and reuses them on later runs, so unchanged strings are never re-sent to the model. The key is the source text + languages + `--context`, independent of engine/model, so the cache survives a provider switch. Library callers can pass their own `cache` object.
* **Glossary**: `--glossary <path>` points to a JSON file that steers terminology: `doNotTranslate` keeps brand/product names verbatim, and `terms` forces exact per-language translations: `{ "doNotTranslate": ["Acme"], "terms": { "fr": { "Account": "Compte" } } }`. The rules are injected into both the generation and verification prompts; only the run's target language is applied (with BCP-47 base-subtag fallback, so `pt` covers `pt-BR`).
* **Plural awareness**: keys ending in `_one`/`_other`/`_few`/`_many` get a CLDR plural hint in JSON mode
* **Placeholders**: `{{variables}}` are preserved; customise delimiters with `-p`/`-s`
* **Rate-limit handling**: per-engine defaults + exponential backoff; `--tokens-per-minute` adds a TPM cap
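
The glossary's base-subtag fallback described above can be sketched as follows. This is a hedged illustration, not the library's code: `termsForLanguage` is a hypothetical helper, the `Glossary` shape is inferred from the JSON example in the glossary bullet, the `pt` entry is made up for the demo, and letting an exact-tag entry override the base-language entry is an assumption.

```typescript
// Shape inferred from the README's glossary example.
interface Glossary {
    doNotTranslate?: string[];
    terms?: Record<string, Record<string, string>>;
}

// Hypothetical helper, NOT part of i18n-ai-translate's API.
// Selects the forced translations that apply to one target language,
// falling back from a regional tag such as "pt-BR" to its base subtag "pt".
function termsForLanguage(glossary: Glossary, targetLanguage: string): Record<string, string> {
    const terms = glossary.terms ?? {};
    const base = targetLanguage.split("-")[0] ?? targetLanguage;
    // Base-language entries go first so that an exact-tag entry can override them
    // (this precedence is an assumption of the sketch).
    return { ...(terms[base] ?? {}), ...(terms[targetLanguage] ?? {}) };
}

const glossary: Glossary = {
    doNotTranslate: ["Acme"],
    terms: {
        fr: { Account: "Compte" },
        pt: { Account: "Conta" }, // illustrative entry, not from the README
    },
};

console.log(termsForLanguage(glossary, "pt-BR")); // prints { Account: 'Conta' }
console.log(termsForLanguage(glossary, "de"));    // prints {}
```

Only the entries returned for the run's target language would be injected into the prompts, so a `fr` rule never leaks into a `pt-BR` run.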