string-punctuation-tokenizer

Small library that provides functions to tokenize a string into an array of words with or without punctuation

## v0.9.0

- Fixed Hindi tokenization: the zero-width joiner (`\u200D`) no longer breaks a word.
  - http://unicode.scarfboy.com/?s=%E0%A4%B8%E0%A4%A8%E0%A5%8D%E2%80%8D%E0%A4%A4%E0%A4%BE%E0%A4%A8
- Extracted the Occurrences functions into a separate file for better organization.
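
To illustrate the `\u200D` issue, here is a minimal sketch of why a zero-width joiner can split a Hindi word and how including it in the word character class avoids that. The function `tokenizeWords` and both regular expressions are hypothetical and written only for this example; they are not this library's API or its actual implementation. The sample word is the one from the URL above, written with escapes so the invisible joiner is visible in the source.

```javascript
// Hypothetical sketch, not the library's actual code.
// A "word" here is a run of letters, combining marks, and digits.
const NAIVE_WORD = /[\p{L}\p{M}\p{N}]+/gu;

// Same class, but U+200D (zero-width joiner) also counts as a word character.
const ZWJ_AWARE_WORD = /[\p{L}\p{M}\p{N}\u200D]+/gu;

// Return every run of word characters found in the text.
function tokenizeWords(text, pattern = ZWJ_AWARE_WORD) {
  return text.match(pattern) || [];
}

// The Hindi word from the linked page: U+0938 U+0928 U+094D U+200D U+0924 U+093E U+0928
const word = '\u0938\u0928\u094D\u200D\u0924\u093E\u0928';

// U+200D is a format character (category Cf), so the naive class stops at it.
const naiveTokens = tokenizeWords(word, NAIVE_WORD);

// With the joiner included, the word stays in one piece.
const fixedTokens = tokenizeWords(word);

console.log(naiveTokens.length); // 2
console.log(fixedTokens.length); // 1
```

The sketch relies on Unicode property escapes, which require the `u` flag and a reasonably recent JavaScript engine.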