# Localization Design Principles
## Summary
This document goes over the core principle and design decisions that went into
the Shaka Player UI Localization system. Its goal is to help those looking at
the system understand the reasons behind its design and implementation.
## Definitions
```
Term             | Definition                      | Example
-----------------|---------------------------------|------------------------------
Locale           | See "Talking About Languages".  | "en-CA"
                 |                                 |
Phrase           | Any series of words that have   | -
                 | some meaning in some locale.    |
                 |                                 |
Context          | A description of a phrase.      | "The title for a button that
                 |                                 | stops video playback"
                 |                                 |
Localized Phrase | A phrase in a specific locale   | "arrêt" (fr-CA)
                 | that maps to a context.         |
```
References:
- [Talking About Languages](talking-about-languages.md): A detailed
  explanation of the terms and relationships used when talking about languages
  in the Shaka Player project.
## Core Principle
Find the "closest" localized phrase for a context and locale while favoring
consistency over accuracy.
The idea is that we should be able to give you something that will make sense
to you in the least jarring way possible. We'll go over the "least jarring way
possible" a bit more later, but for now let's look at a simple example to get
an idea of what the problem is:
- The user wants a localized phrase for "the title for a button that stops
video playback" in "en-US".
- We don't have one in "en-US" but we have one in "en", "en-CA", "fr-CA".
Now let's bring in the jarring problem. People often know more than one
language, so their preferences may include more than one language. Let's look
at a slightly more realistic example:
- The user wants a localized phrase for "the title for a button that stops
video playback" in "en-US" or "fr-CA".
- We don't have one in "en-US" but we have one in "en", "en-CA", "fr-CA".
## Finding The Closest Match
There are two levels of searches that we make when looking for the closest
localized phrase. There is the search within a specific locale and the search
between locales.
## Searching Within A Locale
When searching within a locale for a match, we sort related locales into four
groups: self, parent, siblings, and children. Each group may contain zero or
more locales. For example, if we had the locale "en-US" our groups could be:
- Self: [ en-US ]
- Parent: [ en ]
- Siblings: [ en-CA, en-GB ]
- Children: [ ]
and if we had the locale "en" our groups could be:
- Self: [ en ]
- Parent: [ ]
- Siblings: [ ]
- Children: [ en-CA, en-GB, en-US ]
When looking for a localized phrase, we go group-by-group in order of accuracy
(self, parent, siblings, children). When a localized phrase is found for the
given context, we stop searching and return that result.
```
Available locales:
  "en", "en-US", "en-GB", "en-CA", "fr", "fr-CA"
Search order for "en":
| 1. self  | -> | 2. parent | -> | 3. siblings | -> | 4. children |
-------------------------------------------------------------------
| a. en    |    |           |    |             |    | a. en-US    |
|          |    |           |    |             |    | b. en-GB    |
|          |    |           |    |             |    | c. en-CA    |
```
```
Available locales:
  "en", "en-US", "en-GB", "en-CA", "fr", "fr-CA"
Search order for "en-US":
| 1. self  | -> | 2. parent | -> | 3. siblings | -> | 4. children |
-------------------------------------------------------------------
| a. en-US |    | a. en     |    | a. en-GB    |    |             |
|          |    |           |    | b. en-CA    |    |             |
```
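The group ordering above can be sketched in a few lines. This is only an illustration, not the Shaka Player implementation: `languageOf` and `searchOrder` are hypothetical names, locales are assumed to be either `language` or `language-REGION`, and the order inside the siblings and children groups simply follows the order of the available list, which this document does not specify.

```typescript
// Hypothetical helper: the language part of a locale ("en-US" -> "en").
function languageOf(locale: string): string {
  return locale.split("-")[0];
}

// Returns the available locales related to `locale`, ordered by group:
// self, parent, siblings, children.
function searchOrder(locale: string, availableLocales: string[]): string[] {
  const language = languageOf(locale);
  const isRegional = locale !== language;
  const related = availableLocales.filter((l) => languageOf(l) === language);

  const selfGroup = related.filter((l) => l === locale);
  // Only a regional locale ("en-US") has a parent ("en").
  const parentGroup = isRegional ? related.filter((l) => l === language) : [];
  // Siblings are the other regional locales under the same parent.
  const siblingGroup = isRegional
    ? related.filter((l) => l !== locale && l !== language)
    : [];
  // Only a language-only locale ("en") has children ("en-US", ...).
  const childGroup = isRegional ? [] : related.filter((l) => l !== locale);

  return [...selfGroup, ...parentGroup, ...siblingGroup, ...childGroup];
}

const availableLocales = ["en", "en-US", "en-GB", "en-CA", "fr", "fr-CA"];
const orderForEn = searchOrder("en", availableLocales);
// ["en", "en-US", "en-GB", "en-CA"]
const orderForEnUs = searchOrder("en-US", availableLocales);
// ["en-US", "en", "en-GB", "en-CA"]
```

Both results match the two search-order diagrams above.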
## Searching Between Locales
When we have multiple locales (e.g. "en-US" or "fr-CA") we only move to a later
locale if no matches were found in the earlier locales. This is because no
matter how loosely we match, we prefer displaying everything in one language.
For example, suppose we returned the best matches across all locales. The user
could end up seeing some English and some French. While that may be more
accurate case-by-case, it would be less desirable overall.
```
Available locales:
  "en", "en-US", "en-GB", "en-CA", "fr", "fr-CA", "fr-FR", ...
Search order for multiple preferences:
| Preference 1 | -> | Preference 2 | -> ... -> | Preference N |
| (en-US)      |    | (fr-CA)      |           | (...)        |
---------------------------------------------------------------
| a. en-US     |    | a. fr-CA     |           | ...          |
| b. en        |    | b. fr        |           |              |
| c. en-GB     |    | c. fr-FR     |           |              |
| d. en-CA     |    |              |           |              |
```
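As a rough sketch of the rule above, a look-up can walk each preference's search order in turn and stop at the first hit. The names here (`LocaleTables`, `findPhrase`, the `STOP` and `PLAY` context ids, and the sample phrases) are hypothetical and chosen only for illustration; the per-preference search orders are assumed to have been computed already.

```typescript
// Hypothetical shape: locale -> (context id -> localized phrase).
type LocaleTables = Map<string, Map<string, string>>;

// `orders` holds one within-locale search order per preference, most
// preferred first. A later preference is only reached when no locale
// related to an earlier preference has a phrase for the context.
function findPhrase(
  context: string,
  orders: string[][],
  tables: LocaleTables
): string | undefined {
  for (const order of orders) {
    for (const locale of order) {
      const table = tables.get(locale);
      const phrase = table ? table.get(context) : undefined;
      if (phrase !== undefined) {
        return phrase;
      }
    }
  }
  return undefined;
}

const phraseTables = new Map<string, Map<string, string>>([
  ["en-GB", new Map([["STOP", "Stop"]])],
  ["fr-CA", new Map([["STOP", "Arrêt"], ["PLAY", "Lecture"]])],
]);
const preferenceOrders = [
  ["en-US", "en", "en-GB", "en-CA"], // preference 1: en-US
  ["fr-CA", "fr", "fr-FR"], // preference 2: fr-CA
];

const stopTitle = findPhrase("STOP", preferenceOrders, phraseTables);
// "Stop": a loose English match wins over the exact fr-CA entry.
const playTitle = findPhrase("PLAY", preferenceOrders, phraseTables);
// "Lecture": no English locale has this context, so French is used.
```

The first call shows the trade-off this section describes: "en-GB" is a loose match for "en-US", yet it is still preferred over an exact match in the second preference.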
## Prioritize for Look-up
There are two key operations for our localization system: insertion and look-up.
Since we assume that insertions will happen far less often than look-ups, we
decided that our system must prioritize the efficiency of look-ups.
To achieve the simplest look-up possible, we take all the localization tables
that we would end up searching and flatten them into a single map. We do this
once before any requests are made, so that each request is a single table
look-up rather than several. When we merge the tables, we go in
reverse-preference order. This allows the more preferred entries to override
the less preferred ones.
```
Preference Order : en-US > en > en-GB > en-CA
Merge Order      : en-CA > en-GB > en > en-US
Merged : [ A4 ] [ B1 ] [ C2 ] [ D1 ] [ E2 ] [ F3 ] [ G1 ]
           ^      ^      ^      ^      ^      ^      ^
en-US  :   ^    [ B1 ]   ^    [ D1 ]   ^      ^    [ G1 ]
en     :   ^    [ B2 ] [ C2 ] [ D2 ] [ E2 ]   ^    [ G2 ]
en-GB  :   ^    [ B3 ] [ C3 ] [ D3 ] [ E3 ] [ F3 ] [ G3 ]
en-CA  : [ A4 ] [ B4 ]        [ D4 ] [ E4 ]        [ G4 ]
...
```
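The merge in the diagram above can be reproduced with a minimal sketch. `PhraseTable` and `flattenTables` are hypothetical names, not part of the Shaka Player API, and the entries use the diagram's placeholder values, where the digit marks the locale an entry came from.

```typescript
// Hypothetical shape: context id -> localized phrase.
type PhraseTable = Map<string, string>;

// Flattens per-locale tables into one map. `preferenceOrder` lists locales
// from most to least preferred; merging in reverse order lets the more
// preferred entries overwrite the less preferred ones.
function flattenTables(
  preferenceOrder: string[],
  tables: Map<string, PhraseTable>
): PhraseTable {
  const merged: PhraseTable = new Map<string, string>();
  for (const locale of [...preferenceOrder].reverse()) {
    const table = tables.get(locale);
    if (table) {
      table.forEach((phrase, context) => merged.set(context, phrase));
    }
  }
  return merged;
}

const localeTables = new Map<string, PhraseTable>([
  ["en-US", new Map([["B", "B1"], ["D", "D1"], ["G", "G1"]])],
  ["en", new Map([["B", "B2"], ["C", "C2"], ["D", "D2"], ["E", "E2"], ["G", "G2"]])],
  ["en-GB", new Map([["B", "B3"], ["C", "C3"], ["D", "D3"], ["E", "E3"], ["F", "F3"], ["G", "G3"]])],
  ["en-CA", new Map([["A", "A4"], ["B", "B4"], ["D", "D4"], ["E", "E4"], ["G", "G4"]])],
]);

const mergedTable = flattenTables(["en-US", "en", "en-GB", "en-CA"], localeTables);
// A -> A4, B -> B1, C -> C2, D -> D1, E -> E2, F -> F3, G -> G1
```

After this one-time merge, every request is a single `mergedTable.get(context)` call, which is the look-up cost the design aims for.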