# foldcase: JavaScript implementation of the Unicode Case Folding Algorithm

This is a small Node module that implements the [Unicode Case Folding Algorithm][1], in both Simple and Full variants, by generating a conversion table from the [official case folding table at `unicode.org`][2].

    const foldcase = require('@ar-nelson/foldcase');
    foldcase('ABC123'); // => 'abc123'

`foldcase` is similar to `String.prototype.toLowerCase`, but not the same. Its intended use is case-insensitive string comparisons. [Case folding][3] remains the same across Unicode versions, although `toLowerCase`'s behavior may change.

For example, because the Cherokee script originally had only uppercase characters in Unicode, and lowercase characters were added later, `foldcase` converts Cherokee text to *uppercase*.

    foldcase('ꮳꮃꭹ'); // => 'ᏣᎳᎩ'

This module has no dependencies, and should run in Node and on modern browsers. Because it uses a few ES6 features (`\u{…}` escapes, `Array.from`), it won't run on any version of IE without transpilation and polyfills.

Use `npm test` to lint and test. Use `npm run codegen` to regenerate `case-tables.js` from the table at `unicode.org` (probably only works on Linux).

**Caveat emptor:** Because it's implemented in pure JS, and it operates on 32-bit codepoints instead of 16-bit characters, this algorithm is *slow*. Only use this over `toLowerCase` if you know you need it for compatibility.

[1]: https://www.w3.org/International/wiki/Case_folding
[2]: http://www.unicode.org/Public/UNIDATA/CaseFolding.txt
[3]: https://unicode.org/faq/casemap_charprop.html#2

## API

### `foldcase(String)`

Alias for `foldcase.full`.

### `foldcase.full(String)`

Applies the full Unicode case folding algorithm. This algorithm may convert some single characters into multiple characters.

    foldcase.full('Weiß'); // => 'weiss'

### `foldcase.simple(String)`

Applies the simple Unicode case folding algorithm.
The simple algorithm will not change the length (in code points) of its input string.

    foldcase.simple('Weiß'); // => 'weiß'

### `foldcase.charFull(String)`

Applies the full Unicode case folding algorithm to a single code point. The code point may be a surrogate pair, so the argument may be a string of length 1 or 2. If the argument is not a single code point, or does not have a case folding conversion, the argument will be returned unchanged.

    foldcase.charFull('A'); // => 'a'
    foldcase.charFull('FOO'); // => 'FOO'

This version of case folding may convert single characters to multiple characters.

    foldcase.charFull('ß'); // => 'ss'

### `foldcase.charSimple(String)`

Applies the simple Unicode case folding algorithm to a single code point. The code point may be a surrogate pair, so the argument may be a string of length 1 or 2. If the argument is not a single code point, or does not have a case folding conversion, the argument will be returned unchanged.

    foldcase.charSimple('A'); // => 'a'
    foldcase.charSimple('FOO'); // => 'FOO'

This version of case folding always returns a single code point when given a single code point.

    foldcase.charSimple('ß'); // => 'ß'
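
## Sketch: simple vs. full folding in a comparison

The difference between the two variants matters most in case-insensitive comparisons. The sketch below is not this module's implementation. It is a self-contained illustration that uses a tiny hand-written table (a few entries copied from `CaseFolding.txt`) in place of the generated one. The names `sketchFull`, `sketchSimple`, `foldWith`, and `equalsIgnoreCase` are hypothetical and exist only for this example.

```javascript
// Hypothetical miniature tables. The real module generates complete
// tables from CaseFolding.txt; only a handful of entries appear here.

// Status C ("common") entries: shared by simple and full folding.
const common = new Map();
for (let cp = 0x41; cp <= 0x5a; cp++) {
  // ASCII 'A'..'Z' fold to 'a'..'z'.
  common.set(String.fromCodePoint(cp), String.fromCodePoint(cp + 0x20));
}
// Cherokee small letter A folds to the uppercase form.
common.set('\u{AB70}', '\u{13A0}');

// Status F ("full") entries: may expand to several code points.
const fullOnly = new Map([
  ['\u{DF}', 'ss'],   // ß
  ['\u{1E9E}', 'ss'], // ẞ
  ['\u{FB01}', 'fi'], // ﬁ ligature
]);

// Status S ("simple") entries: always a single code point.
const simpleOnly = new Map([
  ['\u{1E9E}', '\u{DF}'], // ẞ -> ß
]);

// Fold a string one code point at a time, consulting the tables in order.
// Array.from iterates by code point, so surrogate pairs stay intact.
function foldWith(tables, str) {
  return Array.from(str, (ch) => {
    for (const table of tables) {
      if (table.has(ch)) return table.get(ch);
    }
    return ch; // no folding defined: leave unchanged
  }).join('');
}

const sketchFull = (str) => foldWith([fullOnly, common], str);
const sketchSimple = (str) => foldWith([simpleOnly, common], str);

// Case-insensitive equality: fold both sides, then compare.
function equalsIgnoreCase(a, b, fold = sketchFull) {
  return fold(a) === fold(b);
}

console.log(sketchFull('Weiß'));   // 'weiss'
console.log(sketchSimple('Weiß')); // 'weiß'
console.log(equalsIgnoreCase('Straße', 'STRASSE'));               // true
console.log(equalsIgnoreCase('Straße', 'STRASSE', sketchSimple)); // false
```

With the module installed, the same comparison helper should work by passing `foldcase.full` or `foldcase.simple` as the `fold` argument. Full folding treats `ß` and `ss` as equal. Simple folding does not, but it preserves the code point count of its input.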