# unicode-to-plain-text

Convert fancy Unicode text to plain ASCII with smart language preservation

## Install

```
npm i unicode-to-plain-text
```

## Usage

Basic usage:

```js
import { toPlainText } from 'unicode-to-plain-text'

// Mathematical styles
toPlainText('𝐇𝐞𝐥𝐥𝐨 𝐖𝐨𝐫𝐥𝐝') // => 'Hello World'

// Enclosed characters
toPlainText('🅣🅔🅢🅣') // => 'TEST'

// Fullwidth forms
toPlainText('ＨＥＬＬＯ') // => 'HELLO'
```

Language preservation:

```js
// Real languages are automatically preserved
toPlainText('Hello Γεια σας') // => 'Hello Γεια σας' (Greek preserved)
toPlainText('Test Привет') // => 'Test Привет' (Cyrillic preserved)

// But lookalike characters are converted
toPlainText('Α test') // => 'A test' (Greek Alpha → Latin A)
```

Custom pipelines:

```js
import {
  pipe,
  handleUpsideDown,
  mapCharacters,
  normalizeUnicode,
  removeDecorations,
  normalizeWhitespace,
  normalizeCasing
} from 'unicode-to-plain-text'

// Create a custom pipeline
const customTransform = pipe(
  handleUpsideDown,
  mapCharacters,
  normalizeUnicode,
  removeDecorations,
  normalizeWhitespace
)

const result = customTransform('𝐓𝐄𝐒𝐓')
```

## API

### toPlainText(text, options?)

Converts fancy Unicode text to plain ASCII

| Property  | Type   | Description                        |
| --------- | ------ | ---------------------------------- |
| `text`    | string | Input text with Unicode characters |
| `options` | object | Optional configuration object      |

#### Options

| Option            | Type    | Default | Description                                                                          |
| ----------------- | ------- | ------- | ------------------------------------------------------------------------------------ |
| `normalizeSpaces` | boolean | `true`  | Collapse multiple spaces and trim whitespace                                         |
| `skipEmoji`       | boolean | `false` | Preserve emoji characters (still removes other decorations like box drawing, arrows) |

#### Examples

```js
// Default behavior - emojis removed
toPlainText('Hello 🎉 World') // => 'Hello World'

// Preserve emojis
toPlainText('Hello 🎉 World', { skipEmoji: true }) // => 'Hello 🎉 World'

// Preserve spacing
toPlainText('Hello   World', { normalizeSpaces: false }) // => 'Hello   World'

// Combined options
toPlainText('𝐇𝐞𝐥𝐥𝐨 🎉 𝐖𝐨𝐫𝐥𝐝', { skipEmoji: true, normalizeSpaces: false }) // => 'Hello 🎉 World'
```

Returns a plain ASCII string with normalized whitespace and casing

### Individual Functions

- `handleUpsideDown(text)` - Reverses upside-down text
- `mapCharacters(text)` - Maps Unicode to ASCII equivalents
- `normalizeUnicode(text)` - Removes diacritics from Latin text
- `removeDecorations(text)` - Removes emojis and decorations
- `normalizeWhitespace(text)` - Normalizes and trims whitespace
- `normalizeCasing(text)` - Normalizes inconsistent casing
- `pipe(...fns)` - Composes functions into a pipeline

## License

Apache-2.0
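
## Appendix: how a pipeline helper can compose stages

The README describes `pipe(...fns)` only as composing functions into a pipeline. The sketch below shows one way such a helper could work. It assumes stages run left to right, which the README does not state. `pipeSketch`, `foldCompat`, and `collapseSpaces` are hypothetical stand-ins written for this illustration, not the package's own functions, and the built-in NFKC normalization used here covers only part of what `mapCharacters` is described as doing.

```js
// Hypothetical sketch, not the package's source.
// Assumption: stages are applied left to right, each receiving the previous result.
const pipeSketch = (...fns) => (input) => fns.reduce((text, fn) => fn(text), input)

// Stand-in stage: NFKC normalization folds compatibility characters such as
// fullwidth forms and mathematical bold letters to their base letters.
const foldCompat = (text) => text.normalize('NFKC')

// Stand-in stage: collapse runs of whitespace to one space and trim the ends.
const collapseSpaces = (text) => text.replace(/\s+/g, ' ').trim()

const transform = pipeSketch(foldCompat, collapseSpaces)

console.log(transform('  ＨＥＬＬＯ   𝐖𝐨𝐫𝐥𝐝  ')) // prints 'HELLO World'
```

Because each stage is a plain string-to-string function, stages can be reordered or dropped without touching the others, which is what makes the custom pipeline in the Usage section possible.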