gpt-tokenizer
A pure JavaScript implementation of a BPE tokenizer (Encoder/Decoder) for GPT-2 / GPT-3 / GPT-4 and other OpenAI models
EncodingName: r50k_base
Sample:
Encoded: []
EncodingName: r50k_base
Sample: a
Encoded: [64]
EncodingName: r50k_base
Sample: 1
Encoded: [16]
EncodingName: r50k_base
Sample: a a
Encoded: [64, 257]
EncodingName: r50k_base
Sample: hello
Encoded: [31373]
EncodingName: r50k_base
Sample: Hello, World! How are you today? 🌍
Encoded: [15496, 11, 2159, 0, 1374, 389, 345, 1909, 30, 12520, 234, 235]
EncodingName: r50k_base
Sample: こんにちは、世界！お元気ですか？
Encoded: [46036, 22174, 28618, 2515, 94, 31676, 23513, 10310, 244, 45911, 234, 171, 120, 223, 2515, 232, 17739, 225, 36365, 245, 30640, 33623, 27370, 171, 120, 253]
EncodingName: r50k_base
Sample: Hola, mundo! ¿Cómo estás hoy? 🇪🇸
Encoded: [39, 5708, 11, 27943, 78, 0, 1587, 123, 34, 10205, 5908, 1556, 40138, 289, 726, 30, 12520, 229, 103, 8582, 229, 116]
EncodingName: r50k_base
Sample: Привет, мир! Как дела?
Encoded: [140, 253, 21169, 18849, 38857, 16843, 20375, 11, 12466, 120, 18849, 21169, 0, 12466, 248, 16142, 31583, 12466, 112, 16843, 30143, 16142, 30]
EncodingName: r50k_base
Sample: 안녕하세요, 세상! 오늘 기분이 어때요? 🇰🇷
Encoded: [168, 243, 230, 167, 227, 243, 47991, 246, 168, 226, 116, 168, 248, 242, 11, 23821, 226, 116, 168, 225, 223, 0, 23821, 246, 97, 167, 232, 246, 220, 166, 116, 108, 167, 114, 226, 35975, 112, 23821, 244, 112, 167, 243, 234, 168, 248, 242, 30, 12520, 229, 108, 8582, 229, 115]
EncodingName: r50k_base
Sample: Bonjour, le monde ! Comment ça va aujourd'hui ? 🇫🇷
Encoded: [20682, 73, 454, 11, 443, 285, 14378, 5145, 18957, 6184, 100, 64, 46935, 257, 23577, 454, 67, 6, 71, 9019, 5633, 12520, 229, 104, 8582, 229, 115]
EncodingName: r50k_base
Sample: The quick brown fox jumps over 13 lazy dogs. 😺
Encoded: [464, 2068, 7586, 21831, 18045, 625, 1511, 16931, 6844, 13, 30325, 118]
EncodingName: r50k_base
Sample: 1234567890!@#$%^&*()-=_+[]{};:'",.<>?/|`~ 🎉
Encoded: [10163, 2231, 30924, 3829, 0, 31, 29953, 4, 61, 5, 9, 3419, 12, 28, 62, 10, 21737, 90, 19629, 32105, 1600, 29847, 29, 30, 14, 91, 63, 93, 12520, 236, 231]
EncodingName: r50k_base
Sample: C# is a great programming language for building apps.
Encoded: [34, 2, 318, 257, 1049, 8300, 3303, 329, 2615, 6725, 13]
EncodingName: r50k_base
Sample: El área de un triángulo es (base * altura) / 2.
Encoded: [9527, 6184, 94, 21468, 390, 555, 1333, 6557, 782, 43348, 1658, 357, 8692, 1635, 5988, 5330, 8, 1220, 362, 13]
EncodingName: r50k_base
Sample: Здравствуйте, это мой первый раз здесь. Что мне делать?
Encoded: [140, 245, 43666, 21169, 16142, 38857, 21727, 20375, 38857, 35072, 140, 117, 20375, 16843, 11, 220, 141, 235, 20375, 15166, 12466, 120, 25443, 117, 12466, 123, 16843, 21169, 38857, 45035, 140, 117, 220, 21169, 16142, 140, 115, 12466, 115, 43666, 16843, 21727, 45367, 13, 12466, 100, 20375, 15166, 12466, 120, 22177, 16843, 12466, 112, 16843, 30143, 16142, 20375, 45367, 30]
EncodingName: r50k_base
Sample: હેલો, વિશ્વ! તમે આજે કેમ છો? 🇮🇳
Encoded: [156, 103, 117, 156, 104, 229, 156, 103, 110, 156, 104, 233, 11, 220, 156, 103, 113, 156, 103, 123, 156, 103, 114, 156, 104, 235, 156, 103, 113, 0, 220, 156, 103, 97, 156, 103, 106, 156, 104, 229, 220, 156, 103, 228, 156, 103, 250, 156, 104, 229, 220, 156, 103, 243, 156, 104, 229, 156, 103, 106, 220, 156, 103, 249, 156, 104, 233, 30, 12520, 229, 106, 8582, 229, 111]
EncodingName: r50k_base
Sample: ความรักและการเป็นกันเองเป็นสิ่งสำคัญที่สุดในโลก 🇹🇭
Encoded: [19567, 226, 19567, 100, 19567, 110, 19567, 94, 19567, 96, 19567, 109, 19567, 223, 31479, 223, 19567, 98, 19567, 108, 19567, 223, 19567, 110, 19567, 96, 31479, 222, 19567, 249, 31479, 229, 19567, 247, 19567, 223, 19567, 109, 19567, 247, 31479, 222, 19567, 255, 19567, 229, 31479, 222, 19567, 249, 31479, 229, 19567, 247, 19567, 103, 19567, 112, 31479, 230, 19567, 229, 19567, 103, 19567, 111, 19567, 226, 19567, 109, 19567, 235, 19567, 245, 19567, 113, 31479, 230, 19567, 103, 19567, 116, 19567, 242, 31479, 225, 19567, 247, 31479, 224, 19567, 98, 19567, 223, 12520, 229, 117, 8582, 229, 255]
EncodingName: r50k_base
Sample: Python vs Java: Which programming language should you learn first?
Encoded: [37906, 3691, 7349, 25, 9022, 8300, 3303, 815, 345, 2193, 717, 30]
EncodingName: r50k_base
Sample: A journey of a thousand miles begins with a single step. - Lao Tzu
Encoded: [32, 7002, 286, 257, 7319, 4608, 6140, 351, 257, 2060, 2239, 13, 532, 4689, 78, 309, 27624]
EncodingName: r50k_base
Sample: Die Grenzen meiner Sprache bedeuten die Grenzen meiner Welt. 🇩🇪
Encoded: [32423, 19674, 4801, 502, 7274, 5522, 4891, 3996, 68, 7809, 4656, 19674, 4801, 502, 7274, 370, 2120, 13, 12520, 229, 102, 8582, 229, 103]
EncodingName: r50k_base
Sample: יש לי כמה שאלות בנוגע לפרויקט החדש שלך. 🇮🇱
Encoded: [33951, 102, 14360, 250, 25529, 14360, 249, 49168, 38269, 14360, 102, 42973, 40010, 27072, 42064, 14360, 239, 147, 254, 27072, 147, 240, 147, 95, 14360, 250, 147, 97, 37778, 27072, 33951, 100, 147, 246, 14360, 242, 147, 245, 147, 241, 50227, 14360, 102, 40010, 147, 248, 13, 12520, 229, 106, 8582, 229, 109]
EncodingName: r50k_base
Sample: Det är en vacker dag i Sverige. 🇸🇪
Encoded: [11242, 6184, 97, 81, 551, 410, 10735, 48924, 1312, 311, 332, 10045, 13, 12520, 229, 116, 8582, 229, 103]
EncodingName: r50k_base
Sample: A ∀ x (P(x) → Q(x)) ∧ (∃x P(x)) → ∃x Q(x)
Encoded: [32, 18872, 222, 2124, 357, 47, 7, 87, 8, 15168, 1195, 7, 87, 4008, 18872, 100, 357, 24861, 225, 87, 350, 7, 87, 4008, 15168, 18872, 225, 87, 1195, 7, 87, 8]
EncodingName: r50k_base
Sample: O Brasil é o maior país da América do Sul. 🇧🇷
Encoded: [46, 39452, 346, 38251, 267, 17266, 1504, 14187, 41200, 12379, 1703, 2634, 30997, 466, 29357, 13, 12520, 229, 100, 8582, 229, 115]
EncodingName: r50k_base
Sample: L'amore è una forza potente che unisce le persone. 🇮🇹
Encoded: [43, 6, 321, 382, 6184, 101, 555, 64, 329, 4496, 16739, 68, 1125, 555, 271, 344, 443, 2774, 505, 13, 12520, 229, 106, 8582, 229, 117]
EncodingName: r50k_base
Sample: Είναι μια ηλιόλουστη ημέρα στην Ελλάδα. 🇬🇷
Encoded: [138, 243, 138, 107, 26180, 17394, 29945, 18919, 29945, 17394, 7377, 115, 39377, 29945, 139, 234, 39377, 26517, 139, 227, 38392, 32830, 138, 115, 7377, 115, 34703, 138, 255, 33643, 17394, 18074, 225, 32830, 138, 115, 26180, 7377, 243, 39377, 39377, 138, 105, 138, 112, 17394, 13, 12520, 229, 105, 8582, 229, 115]
EncodingName: r50k_base
Sample: Teslim tarihi yaklaşıyor, projeyi zamanında bitirmemiz gerekiyor. 🇹🇷
Encoded: [36504, 2475, 256, 2743, 5303, 46251, 5031, 46481, 30102, 88, 273, 11, 386, 73, 2959, 72, 1976, 10546, 30102, 45658, 1643, 2533, 368, 528, 308, 567, 4106, 88, 273, 13, 12520, 229, 117, 8582, 229, 115]
EncodingName: r50k_base
Sample: Det finnes ingen bedre tid enn nå for å starte noe nytt. 🇳🇴
Encoded: [11242, 957, 2516, 27016, 3996, 260, 29770, 551, 77, 299, 29090, 329, 6184, 98, 923, 68, 645, 68, 299, 88, 926, 13, 12520, 229, 111, 8582, 229, 112]
EncodingName: r50k_base
Sample: Aanvaard de uitdagingen van het leven met moed en vastberadenheid. 🇳🇱
Encoded: [32, 272, 6862, 446, 390, 334, 270, 67, 3039, 268, 5719, 339, 83, 443, 574, 1138, 6941, 276, 551, 5909, 527, 40780, 28420, 13, 12520, 229, 111, 8582, 229, 109]
EncodingName: r50k_base
Sample: Chào mừng bạn đến với thế giới của lập trình. 🇻🇳
Encoded: [1925, 24247, 78, 285, 157, 119, 104, 782, 275, 157, 118, 94, 77, 34754, 239, 157, 118, 123, 77, 410, 157, 119, 249, 72, 294, 157, 118, 123, 308, 72, 157, 119, 249, 72, 269, 157, 119, 100, 64, 300, 157, 118, 255, 79, 491, 127, 105, 77, 71, 13, 12520, 229, 119, 8582, 229, 111]
EncodingName: r50k_base
Sample: Dlaczego warto uczyć się języków obcych? 🇵🇱
Encoded: [35, 75, 330, 89, 1533, 78, 32943, 78, 334, 66, 7357, 38325, 33721, 128, 247, 474, 128, 247, 46355, 10205, 86, 909, 948, 354, 30, 12520, 229, 113, 8582, 229, 109]
EncodingName: r50k_base
Sample: E = mc², uma equação famosa na física. 🇵🇹
Encoded: [36, 796, 36650, 31185, 11, 334, 2611, 1602, 64, 16175, 28749, 1145, 8546, 12385, 277, 41200, 3970, 13, 12520, 229, 113, 8582, 229, 117]
EncodingName: r50k_base
Sample: 你今天遇到什么有趣的事情了吗？🇨🇳
Encoded: [19526, 254, 20015, 232, 25465, 34402, 229, 26344, 108, 20015, 222, 20046, 230, 17312, 231, 164, 114, 96, 21410, 12859, 233, 46349, 227, 12859, 228, 28938, 245, 171, 120, 253, 8582, 229, 101, 8582, 229, 111]
EncodingName: r50k_base
Sample: Nå er det tid for å feire med familie og venner. 🇳🇴
Encoded: [45, 29090, 1931, 1062, 29770, 329, 6184, 98, 730, 557, 1117, 1145, 346, 494, 267, 70, 8710, 1008, 13, 12520, 229, 111, 8582, 229, 112]
EncodingName: r50k_base
Sample: Þetta er góður dagur til að læra eitthvað nýtt. 🇮🇸
Encoded: [127, 252, 15253, 1931, 308, 10205, 27214, 333, 48924, 333, 21502, 257, 27214, 300, 21241, 430, 304, 270, 400, 6862, 27214, 299, 127, 121, 926, 13, 12520, 229, 106, 8582, 229, 116]
EncodingName: r50k_base
Sample: გამარჯობა! როგორ ხართ დღეს? 🇬🇪
Encoded: [157, 225, 240, 157, 225, 238, 157, 225, 249, 157, 225, 238, 157, 225, 254, 157, 225, 107, 157, 225, 251, 157, 225, 239, 157, 225, 238, 0, 28053, 225, 254, 157, 225, 251, 157, 225, 240, 157, 225, 251, 157, 225, 254, 28053, 225, 106, 157, 225, 238, 157, 225, 254, 157, 225, 245, 28053, 225, 241, 157, 225, 99, 157, 225, 242, 157, 225, 94, 30, 12520, 229, 105, 8582, 229, 103]
EncodingName: r50k_base
Sample: Mā te whakawhiti kōrero e whai hua ai tātou. 🇳🇿
Encoded: [44, 10235, 573, 348, 461, 707, 71, 8846, 479, 13090, 34785, 304, 348, 1872, 289, 6413, 257, 72, 256, 10235, 83, 280, 13, 12520, 229, 111, 8582, 229, 123]
EncodingName: r50k_base
Sample: Это был незабываемый опыт, который я буду помнить всегда.
Encoded: [140, 255, 20375, 15166, 12466, 109, 45035, 30143, 12466, 121, 16843, 140, 115, 16142, 140, 109, 45035, 38857, 16142, 16843, 43108, 45035, 140, 117, 12466, 122, 140, 123, 45035, 20375, 11, 12466, 118, 15166, 20375, 15166, 21169, 45035, 140, 117, 220, 40623, 12466, 109, 35072, 43666, 35072, 12466, 123, 25443, 120, 22177, 18849, 20375, 45367, 12466, 110, 21727, 16843, 140, 111, 43666, 16142, 13]
EncodingName: r50k_base
Sample: Διαβάζοντας βιβλία, εμπλουτίζουμε τον εαυτό μας με γνώσεις.
Encoded: [138, 242, 29945, 17394, 26638, 138, 105, 138, 114, 26517, 26180, 32830, 17394, 35558, 27169, 29945, 26638, 39377, 138, 107, 17394, 11, 7377, 113, 34703, 46582, 39377, 26517, 139, 227, 32830, 138, 107, 138, 114, 26517, 139, 227, 34703, 30950, 46651, 26517, 26180, 7377, 113, 17394, 139, 227, 32830, 139, 234, 18919, 17394, 35558, 18919, 30950, 7377, 111, 26180, 139, 236, 38392, 30950, 29945, 35558, 13]
EncodingName: r50k_base
Sample: A számítástechnika világa tele van izgalmas lehetőségekkel. 🇭🇺
Encoded: [32, 264, 89, 6557, 76, 8836, 83, 6557, 4169, 1349, 9232, 39796, 6557, 4908, 5735, 5719, 220, 528, 13528, 5356, 443, 3202, 129, 239, 82, 2634, 469, 74, 7750, 13, 12520, 229, 255, 8582, 229, 118]
EncodingName: r50k_base
Sample: Vždy je dobré mít plán B, pokud něco nevyjde. 🇨🇿
Encoded: [53, 129, 122, 9892, 11223, 466, 1671, 2634, 285, 8836, 83, 458, 21162, 347, 11, 279, 482, 463, 299, 128, 249, 1073, 497, 7670, 73, 2934, 13, 12520, 229, 101, 8582, 229, 123]
EncodingName: r50k_base
Sample: Dragostea e un sentiment minunat care ne unește pe toți. 🇷🇴
Encoded: [46022, 455, 18213, 304, 555, 15598, 949, 403, 265, 1337, 497, 17809, 132, 247, 660, 613, 284, 132, 249, 72, 13, 12520, 229, 115, 8582, 229, 112]
EncodingName: r50k_base
Sample: دیکھو، آسمان میں کتنی تارے ہیں! 🇵🇰
Encoded: [38843, 151, 234, 150, 102, 150, 122, 30335, 148, 234, 17550, 95, 45692, 25405, 12919, 23338, 47048, 151, 234, 150, 118, 220, 150, 102, 41486, 23338, 151, 234, 17550, 103, 12919, 26897, 151, 240, 220, 151, 223, 151, 234, 150, 118, 0, 12520, 229, 113, 8582, 229, 108]
EncodingName: r50k_base
Sample: Nenda polepole na ujifunze kila siku. 🇹🇿
Encoded: [45, 7438, 16825, 36869, 12385, 334, 73, 361, 403, 2736, 8769, 64, 264, 28643, 13, 12520, 229, 117, 8582, 229, 123]
EncodingName: r50k_base
Sample: Каква е твоята любима храна? 🇧🇬
Encoded: [140, 248, 16142, 31583, 38857, 16142, 12466, 113, 220, 20375, 38857, 15166, 40623, 20375, 16142, 12466, 119, 141, 236, 140, 109, 18849, 43108, 16142, 220, 141, 227, 21169, 16142, 22177, 16142, 30, 12520, 229, 100, 8582, 229, 105]
EncodingName: r50k_base
Sample: Sträva alltid efter att bli en bättre version av dig själv.
Encoded: [13290, 11033, 6862, 477, 83, 312, 304, 637, 708, 698, 72, 551, 275, 11033, 926, 260, 2196, 1196, 3100, 264, 73, 11033, 6780, 13]
EncodingName: r50k_base
Sample: Філософія - це наука про знання. 🇺🇦
Encoded: [140, 97, 141, 244, 30143, 15166, 21727, 15166, 141, 226, 141, 244, 40623, 532, 220, 141, 228, 16843, 12466, 121, 16142, 35072, 31583, 16142, 12466, 123, 21169, 15166, 12466, 115, 22177, 16142, 22177, 22177, 40623, 13, 12520, 229, 118, 8582, 229, 99]
EncodingName: r50k_base
Sample: Το πρόγραμμα αυτό είναι πολύ ενδιαφέρον. 🇬🇷
Encoded: [138, 97, 26517, 18074, 222, 33643, 139, 234, 42063, 33643, 17394, 34703, 34703, 17394, 26367, 139, 227, 32830, 139, 234, 7377, 113, 138, 107, 26180, 17394, 29945, 18074, 222, 26517, 39377, 139, 235, 7377, 113, 26180, 138, 112, 29945, 17394, 139, 228, 138, 255, 33643, 26517, 26180, 13, 12520, 229, 105, 8582, 229, 115]
EncodingName: r50k_base
Sample: ^$%#*@!&)(_+=}{|:;"?><,~`'-./][
Encoded: [61, 3, 4, 2, 9, 31, 0, 5, 5769, 62, 47932, 18477, 91, 25, 26, 13984, 6927, 11, 93, 63, 29001, 19571, 7131]
EncodingName: r50k_base
Sample: 4gH@!0sT*#(9^%$[x{}j+|Yz6;Q]~8
Encoded: [19, 70, 39, 31, 0, 15, 82, 51, 9, 2, 7, 24, 61, 4, 3, 58, 87, 90, 92, 73, 10, 91, 56, 89, 21, 26, 48, 60, 93, 23]
EncodingName: r50k_base
Sample: wNb)I<>#:i^P]*cR8ytUx1Q`6O@z/
Encoded: [86, 45, 65, 8, 40, 27, 29, 2, 25, 72, 61, 47, 60, 9, 66, 49, 23, 20760, 52, 87, 16, 48, 63, 21, 46, 31, 89, 14]
EncodingName: r50k_base
Sample: ÄÜö¿¡¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶·¸¹º»¼½¾¿
Encoded: [127, 226, 127, 250, 9101, 126, 123, 126, 94, 44359, 14988, 126, 97, 126, 98, 126, 99, 16273, 37102, 16224, 126, 103, 24328, 126, 105, 7461, 5196, 7200, 22519, 31185, 126, 111, 18265, 126, 113, 26604, 9129, 126, 116, 126, 117, 36165, 17730, 126, 120, 23141, 126, 122, 126, 123]
EncodingName: r50k_base
Sample: ƒšŠŒŽƒšŠŒŽƒšŠŒŽƒšŠŒŽƒšŠŒŽƒšŠŒŽ
Encoded: [130, 240, 32790, 129, 254, 129, 240, 129, 121, 130, 240, 32790, 129, 254, 129, 240, 129, 121, 130, 240, 32790, 129, 254, 129, 240, 129, 121, 130, 240, 32790, 129, 254, 129, 240, 129, 121, 130, 240, 32790, 129, 254, 129, 240, 129, 121, 130, 240, 32790, 129, 254, 129, 240, 129, 121]
EncodingName: r50k_base
Sample: 5ħÅŸēýïūē$%#^*()_+{[ö&!@#?>|,.<>
Encoded: [20, 128, 100, 127, 227, 129, 116, 27092, 127, 121, 26884, 20317, 27092, 3, 4, 2, 61, 9, 3419, 62, 10, 90, 58, 9101, 5, 0, 41573, 30, 29, 91, 11, 29847, 29]
EncodingName: r50k_base
Sample: 1B4t#%&*()_+dF5g^hJk7LmN0pQrS<>?
Encoded: [16, 33, 19, 83, 2, 4, 5, 9, 3419, 62, 10, 67, 37, 20, 70, 61, 71, 41, 74, 22, 43, 76, 45, 15, 79, 48, 81, 50, 27, 29, 30]
EncodingName: r50k_base
Sample: ¬§±²³µ¶·¹ºª«»¦©¯°±!@#$%^&*()_+
Encoded: [126, 105, 16273, 22519, 31185, 126, 111, 126, 113, 26604, 9129, 126, 117, 36165, 126, 103, 24328, 17730, 126, 99, 16224, 5196, 7200, 22519, 0, 31, 29953, 4, 61, 5, 9, 3419, 62, 10]
EncodingName: r50k_base
Sample: 8mR5*w7^a$!F(0%#J9@X6vZ1)nU3]_Y/
Encoded: [23, 76, 49, 20, 9, 86, 22, 61, 64, 3, 0, 37, 7, 15, 4, 2, 41, 24, 31, 55, 21, 85, 57, 16, 8, 77, 52, 18, 60, 62, 56, 14]
EncodingName: r50k_base
Sample: 😊😀😁😂🤣😃😄😅😆😉😊😋😎😍😘😗😙😚☺️🙂🤗🤔
Encoded: [47249, 232, 47249, 222, 47249, 223, 47249, 224, 8582, 97, 96, 47249, 225, 47249, 226, 8582, 11805, 47249, 228, 47249, 231, 47249, 232, 47249, 233, 47249, 236, 47249, 235, 47249, 246, 47249, 245, 47249, 247, 47249, 248, 24583, 118, 37929, 8582, 25081, 8582, 97, 245, 8582, 97, 242]
EncodingName: r50k_base
Sample: 🤨😐😑😶🙄😏😣😥😮🤐😯😪😫😴😌🤓😛😜😝🤤
Encoded: [8582, 97, 101, 47249, 238, 47249, 239, 47249, 114, 8582, 247, 226, 47249, 237, 47249, 96, 47249, 98, 47249, 106, 8582, 97, 238, 47249, 107, 47249, 103, 47249, 104, 47249, 112, 47249, 234, 8582, 97, 241, 47249, 249, 47249, 250, 47249, 251, 8582, 97, 97]
EncodingName: r50k_base
Sample: 😒😓😔😕🙃🤑😲😷🤒🤕🤢🤧😈👿👹👺💀☠️
Encoded: [47249, 240, 47249, 241, 47249, 242, 47249, 243, 8582, 247, 225, 8582, 97, 239, 47249, 110, 47249, 115, 8582, 97, 240, 8582, 97, 243, 8582, 97, 95, 8582, 97, 100, 47249, 230, 41840, 123, 41840, 117, 41840, 118, 8582, 240, 222, 24583, 254, 37929]
EncodingName: r50k_base
Sample: 😾😿🙀😽😼😻🙈🙉🙊👶👦👧👨👩👴👵👨⚕️👩⚕️
Encoded: [47249, 122, 47249, 123, 8582, 247, 222, 47249, 121, 47249, 120, 47249, 119, 8582, 247, 230, 8582, 247, 231, 8582, 247, 232, 41840, 114, 41840, 99, 41840, 100, 41840, 101, 41840, 102, 41840, 112, 41840, 113, 41840, 101, 447, 235, 158, 248, 243, 37929, 41840, 102, 447, 235, 158, 248, 243, 37929]
EncodingName: r50k_base
Sample: 🌞🌝🌚🌛🌜🌙⭐️🌟💫✨🔥💥☄️🌈☀️🌤️⛅️🌥️
Encoded: [8582, 234, 252, 8582, 234, 251, 8582, 234, 248, 8582, 234, 249, 8582, 234, 250, 8582, 234, 247, 158, 255, 238, 37929, 8582, 234, 253, 8582, 240, 104, 26486, 101, 8582, 242, 98, 8582, 240, 98, 24583, 226, 37929, 8582, 234, 230, 24583, 222, 37929, 8582, 234, 97, 37929, 158, 249, 227, 37929, 8582, 234, 98, 37929]
EncodingName: r50k_base
Sample: 🍏🍎🍐🍊🍋🍌🍉🍇🍓🍈🍒🍑
Encoded: [8582, 235, 237, 8582, 235, 236, 8582, 235, 238, 8582, 235, 232, 8582, 235, 233, 8582, 235, 234, 8582, 235, 231, 8582, 235, 229, 8582, 235, 241, 8582, 235, 230, 8582, 235, 240, 8582, 235, 239]
EncodingName: p50k_base
Sample:
Encoded: []
EncodingName: p50k_base
Sample: a
Encoded: [64]
EncodingName: p50k_base
Sample: 1
Encoded: [16]
EncodingName: p50k_base
Sample: a a
Encoded: [64, 257]
EncodingName: p50k_base
Sample: hello
Encoded: [31373]
EncodingName: p50k_base
Sample: Hello, World! How are you today? 🌍
Encoded: [15496, 11, 2159, 0, 1374, 389, 345, 1909, 30, 12520, 234, 235]
EncodingName: p50k_base
Sample: こんにちは、世界！お元気ですか？
Encoded: [46036, 22174, 28618, 2515, 94, 31676, 23513, 10310, 244, 45911, 234, 171, 120, 223, 2515, 232, 17739, 225, 36365, 245, 30640, 33623, 27370, 171, 120, 253]
EncodingName: p50k_base
Sample: Hola, mundo! ¿Cómo estás hoy? 🇪🇸
Encoded: [39, 5708, 11, 27943, 78, 0, 1587, 123, 34, 10205, 5908, 1556, 40138, 289, 726, 30, 12520, 229, 103, 8582, 229, 116]
EncodingName: p50k_base
Sample: Привет, мир! Как дела?
Encoded: [140, 253, 21169, 18849, 38857, 16843, 20375, 11, 12466, 120, 18849, 21169, 0, 12466, 248, 16142, 31583, 12466, 112, 16843, 30143, 16142, 30]
EncodingName: p50k_base
Sample: 안녕하세요, 세상! 오늘 기분이 어때요? 🇰🇷
Encoded: [168, 243, 230, 167, 227, 243, 47991, 246, 168, 226, 116, 168, 248, 242, 11, 23821, 226, 116, 168, 225, 223, 0, 23821, 246, 97, 167, 232, 246, 220, 166, 116, 108, 167, 114, 226, 35975, 112, 23821, 244, 112, 167, 243, 234, 168, 248, 242, 30, 12520, 229, 108, 8582, 229, 115]
EncodingName: p50k_base
Sample: Bonjour, le monde ! Comment ça va aujourd'hui ? 🇫🇷
Encoded: [20682, 73, 454, 11, 443, 285, 14378, 5145, 18957, 6184, 100, 64, 46935, 257, 23577, 454, 67, 6, 71, 9019, 5633, 12520, 229, 104, 8582, 229, 115]
EncodingName: p50k_base
Sample: The quick brown fox jumps over 13 lazy dogs. 😺
Encoded: [464, 2068, 7586, 21831, 18045, 625, 1511, 16931, 6844, 13, 30325, 118]
EncodingName: p50k_base
Sample: 1234567890!@#$%^&*()-=_+[]{};:'",.<>?/|`~ 🎉
Encoded: [10163, 2231, 30924, 3829, 0, 31, 29953, 4, 61, 5, 9, 3419, 12, 28, 62, 10, 21737, 90, 19629, 32105, 1600, 29847, 29, 30, 14, 91, 63, 93, 12520, 236, 231]
EncodingName: p50k_base
Sample: C# is a great programming language for building apps.
Encoded: [34, 2, 318, 257, 1049, 8300, 3303, 329, 2615, 6725, 13]
EncodingName: p50k_base
Sample: El área de un triángulo es (base * altura) / 2.
Encoded: [9527, 6184, 94, 21468, 390, 555, 1333, 6557, 782, 43348, 1658, 357, 8692, 1635, 5988, 5330, 8, 1220, 362, 13]
EncodingName: p50k_base
Sample: Здравствуйте, это мой первый раз здесь. Что мне делать?
Encoded: [140, 245, 43666, 21169, 16142, 38857, 21727, 20375, 38857, 35072, 140, 117, 20375, 16843, 11, 220, 141, 235, 20375, 15166, 12466, 120, 25443, 117, 12466, 123, 16843, 21169, 38857, 45035, 140, 117, 220, 21169, 16142, 140, 115, 12466, 115, 43666, 16843, 21727, 45367, 13, 12466, 100, 20375, 15166, 12466, 120, 22177, 16843, 12466, 112, 16843, 30143, 16142, 20375, 45367, 30]
EncodingName: p50k_base
Sample: હેલો, વિશ્વ! તમે આજે કેમ છો? 🇮🇳
Encoded: [156, 103, 117, 156, 104, 229, 156, 103, 110, 156, 104, 233, 11, 220, 156, 103, 113, 156, 103, 123, 156, 103, 114, 156, 104, 235, 156, 103, 113, 0, 220, 156, 103, 97, 156, 103, 106, 156, 104, 229, 220, 156, 103, 228, 156, 103, 250, 156, 104, 229, 220, 156, 103, 243, 156, 104, 229, 156, 103, 106, 220, 156, 103, 249, 156, 104, 233, 30, 12520, 229, 106, 8582, 229, 111]
EncodingName: p50k_base
Sample: ความรักและการเป็นกันเองเป็นสิ่งสำคัญที่สุดในโลก 🇹🇭
Encoded: [19567, 226, 19567, 100, 19567, 110, 19567, 94, 19567, 96, 19567, 109, 19567, 223, 31479, 223, 19567, 98, 19567, 108, 19567, 223, 19567, 110, 19567, 96, 31479, 222, 19567, 249, 31479, 229, 19567, 247, 19567, 223, 19567, 109, 19567, 247, 31479, 222, 19567, 255, 19567, 229, 31479, 222, 19567, 249, 31479, 229, 19567, 247, 19567, 103, 19567, 112, 31479, 230, 19567, 229, 19567, 103, 19567, 111, 19567, 226, 19567, 109, 19567, 235, 19567, 245, 19567, 113, 31479, 230, 19567, 103, 19567, 116, 19567, 242, 31479, 225, 19567, 247, 31479, 224, 19567, 98, 19567, 223, 12520, 229, 117, 8582, 229, 255]
EncodingName: p50k_base
Sample: Python vs Java: Which programming language should you learn first?
Encoded: [37906, 3691, 7349, 25, 9022, 8300, 3303, 815, 345, 2193, 717, 30]
EncodingName: p50k_base
Sample: A journey of a thousand miles begins with a single step. - Lao Tzu
Encoded: [32, 7002, 286, 257, 7319, 4608, 6140, 351, 257, 2060, 2239, 13, 532, 4689, 78, 309, 27624]
EncodingName: p50k_base
Sample: Die Grenzen meiner Sprache bedeuten die Grenzen meiner Welt. 🇩🇪
Encoded: [32423, 19674, 4801, 502, 7274, 5522, 4891, 3996, 68, 7809, 4656, 19674, 4801, 502, 7274, 370, 2120, 13, 12520, 229, 102, 8582, 229, 103]
EncodingName: p50k_base
Sample: יש לי כמה שאלות בנוגע לפרויקט החדש שלך. 🇮🇱
Encoded: [33951, 102, 14360, 250, 25529, 14360, 249, 49168, 38269, 14360, 102, 42973, 40010, 27072, 42064, 14360, 239, 147, 254, 27072, 147, 240, 147, 95, 14360, 250, 147, 97, 37778, 27072, 33951, 100, 147, 246, 14360, 242, 147, 245, 147, 241, 50227, 14360, 102, 40010, 147, 248, 13, 12520, 229, 106, 8582, 229, 109]
EncodingName: p50k_base
Sample: Det är en vacker dag i Sverige. 🇸🇪
Encoded: [11242, 6184, 97, 81, 551, 410, 10735, 48924, 1312, 311, 332, 10045, 13, 12520, 229, 116, 8582, 229, 103]
EncodingName: p50k_base
Sample: A ∀ x (P(x) → Q(x)) ∧ (∃x P(x)) → ∃x Q(x)
Encoded: [32, 18872, 222, 2124, 357, 47, 7, 87, 8, 15168, 1195, 7, 87, 4008, 18872, 100, 357, 24861, 225, 87, 350, 7, 87, 4008, 15168, 18872, 225, 87, 1195, 7, 87, 8]
EncodingName: p50k_base
Sample: O Brasil é o maior país da América do Sul. 🇧🇷
Encoded: [46, 39452, 346, 38251, 267, 17266, 1504, 14187, 41200, 12379, 1703, 2634, 30997, 466, 29357, 13, 12520, 229, 100, 8582, 229, 115]
EncodingName: p50k_base
Sample: L'amore è una forza potente che unisce le persone. 🇮🇹
Encoded: [43, 6, 321, 382, 6184, 101, 555, 64, 329, 4496, 16739, 68, 1125, 555, 271, 344, 443, 2774, 505, 13, 12520, 229, 106, 8582, 229, 117]
EncodingName: p50k_base
Sample: Είναι μια ηλιόλουστη ημέρα στην Ελλάδα. 🇬🇷
Encoded: [138, 243, 138, 107, 26180, 17394, 29945, 18919, 29945, 17394, 7377, 115, 39377, 29945, 139, 234, 39377, 26517, 139, 227, 38392, 32830, 138, 115, 7377, 115, 34703, 138, 255, 33643, 17394, 18074, 225, 32830, 138, 115, 26180, 7377, 243, 39377, 39377, 138, 105, 138, 112, 17394, 13, 12520, 229, 105, 8582, 229, 115]
EncodingName: p50k_base
Sample: Teslim tarihi yaklaşıyor, projeyi zamanında bitirmemiz gerekiyor. 🇹🇷
Encoded: [36504, 2475, 256, 2743, 5303, 46251, 5031, 46481, 30102, 88, 273, 11, 386, 73, 2959, 72, 1976, 10546, 30102, 45658, 1643, 2533, 368, 528, 308, 567, 4106, 88, 273, 13, 12520, 229, 117, 8582, 229, 115]
EncodingName: p50k_base
Sample: Det finnes ingen bedre tid enn nå for å starte noe nytt. 🇳🇴
Encoded: [11242, 957, 2516, 27016, 3996, 260, 29770, 551, 77, 299, 29090, 329, 6184, 98, 923, 68, 645, 68, 299, 88, 926, 13, 12520, 229, 111, 8582, 229, 112]
EncodingName: p50k_base
Sample: Aanvaard de uitdagingen van het leven met moed en vastberadenheid. 🇳🇱
Encoded: [32, 272, 6862, 446, 390, 334, 270, 67, 3039, 268, 5719, 339, 83, 443, 574, 1138, 6941, 276, 551, 5909, 527, 40780, 28420, 13, 12520, 229, 111, 8582, 229, 109]
EncodingName: p50k_base
Sample: Chào mừng bạn đến với thế giới của lập trình. 🇻🇳
Encoded: [1925, 24247, 78, 285, 157, 119, 104, 782, 275, 157, 118, 94, 77, 34754, 239, 157, 118, 123, 77, 410, 157, 119, 249, 72, 294, 157, 118, 123, 308, 72, 157, 119, 249, 72, 269, 157, 119, 100, 64, 300, 157, 118, 255, 79, 491, 127, 105, 77, 71, 13, 12520, 229, 119, 8582, 229, 111]
EncodingName: p50k_base
Sample: Dlaczego warto uczyć się języków obcych? 🇵🇱
Encoded: [35, 75, 330, 89, 1533, 78, 32943, 78, 334, 66, 7357, 38325, 33721, 128, 247, 474, 128, 247, 46355, 10205, 86, 909, 948, 354, 30, 12520, 229, 113, 8582, 229, 109]
EncodingName: p50k_base
Sample: E = mc², uma equação famosa na física. 🇵🇹
Encoded: [36, 796, 36650, 31185, 11, 334, 2611, 1602, 64, 16175, 28749, 1145, 8546, 12385, 277, 41200, 3970, 13, 12520, 229, 113, 8582, 229, 117]
EncodingName: p50k_base
Sample: 你今天遇到什么有趣的事情了吗？🇨🇳
Encoded: [19526, 254, 20015, 232, 25465, 34402, 229, 26344, 108, 20015, 222, 20046, 230, 17312, 231, 164, 114, 96, 21410, 12859, 233, 46349, 227, 12859, 228, 28938, 245, 171, 120, 253, 8582, 229, 101, 8582, 229, 111]
EncodingName: p50k_base
Sample: Nå er det tid for å feire med familie og venner. 🇳🇴
Encoded: [45, 29090, 1931, 1062, 29770, 329, 6184, 98, 730, 557, 1117, 1145, 346, 494, 267, 70, 8710, 1008, 13, 12520, 229, 111, 8582, 229, 112]
EncodingName: p50k_base
Sample: Þetta er góður dagur til að læra eitthvað nýtt. 🇮🇸
Encoded: [127, 252, 15253, 1931, 308, 10205, 27214, 333, 48924, 333, 21502, 257, 27214, 300, 21241, 430, 304, 270, 400, 6862, 27214, 299, 127, 121, 926, 13, 12520, 229, 106, 8582, 229, 116]
EncodingName: p50k_base
Sample: გამარჯობა! როგორ ხართ დღეს? 🇬🇪
Encoded: [157, 225, 240, 157, 225, 238, 157, 225, 249, 157, 225, 238, 157, 225, 254, 157, 225, 107, 157, 225, 251, 157, 225, 239, 157, 225, 238, 0, 28053, 225, 254, 157, 225, 251, 157, 225, 240, 157, 225, 251, 157, 225, 254, 28053, 225, 106, 157, 225, 238, 157, 225, 254, 157, 225, 245, 28053, 225, 241, 157, 225, 99, 157, 225, 242, 157, 225, 94, 30, 12520, 229, 105, 8582, 229, 103]
EncodingName: p50k_base
Sample: Mā te whakawhiti kōrero e whai hua ai tātou. 🇳🇿
Encoded: [44, 10235, 573, 348, 461, 707, 71, 8846, 479, 13090, 34785, 304, 348, 1872, 289, 6413, 257, 72, 256, 10235, 83, 280, 13, 12520, 229, 111, 8582, 229, 123]
EncodingName: p50k_base
Sample: Это был незабываемый опыт, который я буду помнить всегда.
Encoded: [140, 255, 20375, 15166, 12466, 109, 45035, 30143, 12466, 121, 16843, 140, 115, 16142, 140, 109, 45035, 38857, 16142, 16843, 43108, 45035, 140, 117, 12466, 122, 140, 123, 45035, 20375, 11, 12466, 118, 15166, 20375, 15166, 21169, 45035, 140, 117, 220, 40623, 12466, 109, 35072, 43666, 35072, 12466, 123, 25443, 120, 22177, 18849, 20375, 45367, 12466, 110, 21727, 16843, 140, 111, 43666, 16142, 13]
EncodingName: p50k_base
Sample: Διαβάζοντας βιβλία, εμπλουτίζουμε τον εαυτό μας με γνώσεις.
Encoded: [138, 242, 29945, 17394, 26638, 138, 105, 138, 114, 26517, 26180, 32830, 17394, 35558, 27169, 29945, 26638, 39377, 138, 107, 17394, 11, 7377, 113, 34703, 46582, 39377, 26517, 139, 227, 32830, 138, 107, 138, 114, 26517, 139, 227, 34703, 30950, 46651, 26517, 26180, 7377, 113, 17394, 139, 227, 32830, 139, 234, 18919, 17394, 35558, 18919, 30950, 7377, 111, 26180, 139, 236, 38392, 30950, 29945, 35558, 13]
EncodingName: p50k_base
Sample: A számítástechnika világa tele van izgalmas lehetőségekkel. 🇭🇺
Encoded: [32, 264, 89, 6557, 76, 8836, 83, 6557, 4169, 1349, 9232, 39796, 6557, 4908, 5735, 5719, 220, 528, 13528, 5356, 443, 3202, 129, 239, 82, 2634, 469, 74, 7750, 13, 12520, 229, 255, 8582, 229, 118]
EncodingName: p50k_base
Sample: Vždy je dobré mít plán B, pokud něco nevyjde. 🇨🇿
Encoded: [53, 129, 122, 9892, 11223, 466, 1671, 2634, 285, 8836, 83, 458, 21162, 347, 11, 279, 482, 463, 299, 128, 249, 1073, 497, 7670, 73, 2934, 13, 12520, 229, 101, 8582, 229, 123]
EncodingName: p50k_base
Sample: Dragostea e un sentiment minunat care ne unește pe toți. 🇷🇴
Encoded: [46022, 455, 18213, 304, 555, 15598, 949, 403, 265, 1337, 497, 17809, 132, 247, 660, 613, 284, 132, 249, 72, 13, 12520, 229, 115, 8582, 229, 112]
EncodingName: p50k_base
Sample: دیکھو، آسمان میں کتنی تارے ہیں! 🇵🇰
Encoded: [38843, 151, 234, 150, 102, 150, 122, 30335, 148, 234, 17550, 95, 45692, 25405, 12919, 23338, 47048, 151, 234, 150, 118, 220, 150, 102, 41486, 23338, 151, 234, 17550, 103, 12919, 26897, 151, 240, 220, 151, 223, 151, 234, 150, 118, 0, 12520, 229, 113, 8582, 229, 108]
EncodingName: p50k_base
Sample: Nenda polepole na ujifunze kila siku. 🇹🇿
Encoded: [45, 7438, 16825, 36869, 12385, 334, 73, 361, 403, 2736, 8769, 64, 264, 28643, 13, 12520, 229, 117, 8582, 229, 123]
EncodingName: p50k_base
Sample: Каква е твоята любима храна? 🇧🇬
Encoded: [140, 248, 16142, 31583, 38857, 16142, 12466, 113, 220, 20375, 38857, 15166, 40623, 20375, 16142, 12466, 119, 141, 236, 140, 109, 18849, 43108, 16142, 220, 141, 227, 21169, 16142, 22177, 16142, 30, 12520, 229, 100, 8582, 229, 105]
EncodingName: p50k_base
Sample: Sträva alltid efter att bli en bättre version av dig själv.
Encoded: [13290, 11033, 6862, 477, 83, 312, 304, 637, 708, 698, 72, 551, 275, 11033, 926, 260, 2196, 1196, 3100, 264, 73, 11033, 6780, 13]
EncodingName: p50k_base
Sample: Філософія - це наука про знання. 🇺🇦
Encoded: [140, 97, 141, 244, 30143, 15166, 21727, 15166, 141, 226, 141, 244, 40623, 532, 220, 141, 228, 16843, 12466, 121, 16142, 35072, 31583, 16142, 12466, 123, 21169, 15166, 12466, 115, 22177, 16142, 22177, 22177, 40623, 13, 12520, 229, 118, 8582, 229, 99]
EncodingName: p50k_base
Sample: Το πρόγραμμα αυτό είναι πολύ ενδιαφέρον. 🇬🇷
Encoded: [138, 97, 26517, 18074, 222, 33643, 139, 234, 42063, 33643, 17394, 34703, 34703, 17394, 26367, 139, 227, 32830, 139, 234, 7377, 113, 138, 107, 26180, 17394, 29945, 18074, 222, 26517, 39377, 139, 235, 7377, 113, 26180, 138, 112, 29945, 17394, 139, 228, 138, 255, 33643, 26517, 26180, 13, 12520, 229, 105, 8582, 229, 115]
EncodingName: p50k_base
Sample: ^$%#*@!&)(_+=}{|:;"?><,~`'-./][
Encoded: [61, 3, 4, 2, 9, 31, 0, 5, 5769, 62, 47932, 18477, 91, 25, 26, 13984, 6927, 11, 93, 63, 29001, 19571, 7131]
EncodingName: p50k_base
Sample: 4gH@!0sT*#(9^%$[x{}j+|Yz6;Q]~8
Encoded: [19, 70, 39, 31, 0, 15, 82, 51, 9, 2, 7, 24, 61, 4, 3, 58, 87, 90, 92, 73, 10, 91, 56, 89, 21, 26, 48, 60, 93, 23]
EncodingName: p50k_base
Sample: wNb)I<>#:i^P]*cR8ytUx1Q`6O@z/
Encoded: [86, 45, 65, 8, 40, 27, 29, 2, 25, 72, 61, 47, 60, 9, 66, 49, 23, 20760, 52, 87, 16, 48, 63, 21, 46, 31, 89, 14]
EncodingName: p50k_base
Sample: ÄÜö¿¡¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶·¸¹º»¼½¾¿
Encoded: [127, 226, 127, 250, 9101, 126, 123, 126, 94, 44359, 14988, 126, 97, 126, 98, 126, 99, 16273, 37102, 16224, 126, 103, 24328, 126, 105, 7461, 5196, 7200, 22519, 31185, 126, 111, 18265, 126, 113, 26604, 9129, 126, 116, 126, 117, 36165, 17730, 126, 120, 23141, 126, 122, 126, 123]
EncodingName: p50k_base
Sample: ƒšŠŒŽƒšŠŒŽƒšŠŒŽƒšŠŒŽƒšŠŒŽƒšŠŒŽ
Encoded: [130, 240, 32790, 129, 254, 129, 240, 129, 121, 130, 240, 32790, 129, 254, 129, 240, 129, 121, 130, 240, 32790, 129, 254, 129, 240, 129, 121, 130, 240, 32790, 129, 254, 129, 240, 129, 121, 130, 240, 32790, 129, 254, 129, 240, 129, 121, 130, 240, 32790, 129, 254, 129, 240, 129, 121]
EncodingName: p50k_base
Sample: 5ħÅŸēýïūē$%#^*()_+{[ö&!@#?>|,.<>
Encoded: [20, 128, 100, 127, 227, 129, 116, 27092, 127, 121, 26884, 20317, 27092, 3, 4, 2, 61, 9, 3419, 62, 10, 90, 58, 9101, 5, 0, 41573, 30, 29, 91, 11, 29847, 29]
EncodingName: p50k_base
Sample: 1B4t#%&*()_+dF5g^hJk7LmN0pQrS<>?
Encoded: [16, 33, 19, 83, 2, 4, 5, 9, 3419, 62, 10, 67, 37, 20, 70, 61, 71, 41, 74, 22, 43, 76, 45, 15, 79, 48, 81, 50, 27, 29, 30]
EncodingName: p50k_base
Sample: ¬§±²³µ¶·¹ºª«»¦©¯°±!@#$%^&*()_+
Encoded: [126, 105, 16273, 22519, 31185, 126, 111, 126, 113, 26604, 9129, 126, 117, 36165, 126, 103, 24328, 17730, 126, 99, 16224, 5196, 7200, 22519, 0, 31, 29953, 4, 61, 5, 9, 3419, 62, 10]
EncodingName: p50k_base
Sample: 8mR5*w7^a$!F(0%#J9@X6vZ1)nU3]_Y/
Encoded: [23, 76, 49, 20, 9, 86, 22, 61, 64, 3, 0, 37, 7, 15, 4, 2, 41, 24, 31, 55, 21, 85, 57, 16, 8, 77, 52, 18, 60, 62, 56, 14]
EncodingName: p50k_base
Sample: 😊😀😁😂🤣😃😄😅😆😉😊😋😎😍😘😗😙😚☺️🙂🤗🤔
Encoded: [47249, 232, 47249, 222, 47249, 223, 47249, 224, 8582, 97, 96, 47249, 225, 47249, 226, 8582, 11805, 47249, 228, 47249, 231, 47249, 232, 47249, 233, 47249, 236, 47249, 235, 47249, 246, 47249, 245, 47249, 247, 47249, 248, 24583, 118, 37929, 8582, 25081, 8582, 97, 245, 8582, 97, 242]
EncodingName: p50k_base
Sample: 🤨😐😑😶🙄😏😣😥😮🤐😯😪😫😴😌🤓😛😜😝🤤
Encoded: [8582, 97, 101, 47249, 238, 47249, 239, 47249, 114, 8582, 247, 226, 47249, 237, 47249, 96, 47249, 98, 47249, 106, 8582, 97, 238, 47249, 107, 47249, 103, 47249, 104, 47249, 112, 47249, 234, 8582, 97, 241, 47249, 249, 47249, 250, 47249, 251, 8582, 97, 97]
EncodingName: p50k_base
Sample: 😒😓😔😕🙃🤑😲😷🤒🤕🤢🤧😈👿👹👺💀☠️
Encoded: [47249, 240, 47249, 241, 47249, 242, 47249, 243, 8582, 247, 225, 8582, 97, 239, 47249, 110, 47249, 115, 8582, 97, 240, 8582, 97, 243, 8582, 97, 95, 8582, 97, 100, 47249, 230, 41840, 123, 41840, 117, 41840, 118, 8582, 240, 222, 24583, 254, 37929]
EncodingName: p50k_base
Sample: 😾😿🙀😽😼😻🙈🙉🙊👶👦👧👨👩👴👵👨⚕️👩⚕️
Encoded: [47249, 122, 47249, 123, 8582, 247, 222, 47249, 121, 47249, 120, 47249, 119, 8582, 247, 230, 8582, 247, 231, 8582, 247, 232, 41840, 114, 41840, 99, 41840, 100, 41840, 101, 41840, 102, 41840, 112, 41840, 113, 41840, 101, 447, 235, 158, 248, 243, 37929, 41840, 102, 447, 235, 158, 248, 243, 37929]
EncodingName: p50k_base
Sample: 🌞🌝🌚🌛🌜🌙⭐️🌟💫✨🔥💥☄️🌈☀️🌤️⛅️🌥️
Encoded: [8582, 234, 252, 8582, 234, 251, 8582, 234, 248, 8582, 234, 249, 8582, 234, 250, 8582, 234, 247, 158, 255, 238, 37929, 8582, 234, 253, 8582, 240, 104, 26486, 101, 8582, 242, 98, 8582, 240, 98, 24583, 226, 37929, 8582, 234, 230, 24583, 222, 37929, 8582, 234, 97, 37929, 158, 249, 227, 37929, 8582, 234, 98, 37929]
EncodingName: p50k_base
Sample: 🍏🍎🍐🍊🍋🍌🍉🍇🍓🍈🍒🍑
Encoded: [8582, 235, 237, 8582, 235, 236, 8582, 235, 238, 8582, 235, 232, 8582, 235, 233, 8582, 235, 234, 8582, 235, 231, 8582, 235, 229, 8582, 235, 241, 8582, 235, 230, 8582, 235, 240, 8582, 235, 239]
EncodingName: p50k_edit
Sample:
Encoded: []
EncodingName: p50k_edit
Sample: a
Encoded: [64]
EncodingName: p50k_edit
Sample: 1
Encoded: [16]
EncodingName: p50k_edit
Sample: a a
Encoded: [64, 257]
EncodingName: p50k_edit
Sample: hello
Encoded: [31373]
EncodingName: p50k_edit
Sample: Hello, World! How are you today? 🌍
Encoded: [15496, 11, 2159, 0, 1374, 389, 345, 1909, 30, 12520, 234, 235]
EncodingName: p50k_edit
Sample: こんにちは、世界!お元気ですか?
Encoded: [46036, 22174, 28618, 2515, 94, 31676, 23513, 10310, 244, 45911, 234, 171, 120, 223, 2515, 232, 17739, 225, 36365, 245, 30640, 33623, 27370, 171, 120, 253]
EncodingName: p50k_edit
Sample: Hola, mundo! ¿Cómo estás hoy? 🇪🇸
Encoded: [39, 5708, 11, 27943, 78, 0, 1587, 123, 34, 10205, 5908, 1556, 40138, 289, 726, 30, 12520, 229, 103, 8582, 229, 116]
EncodingName: p50k_edit
Sample: Привет, мир! Как дела?
Encoded: [140, 253, 21169, 18849, 38857, 16843, 20375, 11, 12466, 120, 18849, 21169, 0, 12466, 248, 16142, 31583, 12466, 112, 16843, 30143, 16142, 30]
EncodingName: p50k_edit
Sample: 안녕하세요, 세상! 오늘 기분이 어때요? 🇰🇷
Encoded: [168, 243, 230, 167, 227, 243, 47991, 246, 168, 226, 116, 168, 248, 242, 11, 23821, 226, 116, 168, 225, 223, 0, 23821, 246, 97, 167, 232, 246, 220, 166, 116, 108, 167, 114, 226, 35975, 112, 23821, 244, 112, 167, 243, 234, 168, 248, 242, 30, 12520, 229, 108, 8582, 229, 115]
EncodingName: p50k_edit
Sample: Bonjour, le monde ! Comment ça va aujourd'hui ? 🇫🇷
Encoded: [20682, 73, 454, 11, 443, 285, 14378, 5145, 18957, 6184, 100, 64, 46935, 257, 23577, 454, 67, 6, 71, 9019, 5633, 12520, 229, 104, 8582, 229, 115]
EncodingName: p50k_edit
Sample: The quick brown fox jumps over 13 lazy dogs. 😺
Encoded: [464, 2068, 7586, 21831, 18045, 625, 1511, 16931, 6844, 13, 30325, 118]
EncodingName: p50k_edit
Sample: 1234567890!@#$%^&*()-=_+[]{};:'",.<>?/|`~ 🎉
Encoded: [10163, 2231, 30924, 3829, 0, 31, 29953, 4, 61, 5, 9, 3419, 12, 28, 62, 10, 21737, 90, 19629, 32105, 1600, 29847, 29, 30, 14, 91, 63, 93, 12520, 236, 231]
EncodingName: p50k_edit
Sample: C# is a great programming language for building apps.
Encoded: [34, 2, 318, 257, 1049, 8300, 3303, 329, 2615, 6725, 13]
EncodingName: p50k_edit
Sample: El área de un triángulo es (base * altura) / 2.
Encoded: [9527, 6184, 94, 21468, 390, 555, 1333, 6557, 782, 43348, 1658, 357, 8692, 1635, 5988, 5330, 8, 1220, 362, 13]
EncodingName: p50k_edit
Sample: Здравствуйте, это мой первый раз здесь. Что мне делать?
Encoded: [140, 245, 43666, 21169, 16142, 38857, 21727, 20375, 38857, 35072, 140, 117, 20375, 16843, 11, 220, 141, 235, 20375, 15166, 12466, 120, 25443, 117, 12466, 123, 16843, 21169, 38857, 45035, 140, 117, 220, 21169, 16142, 140, 115, 12466, 115, 43666, 16843, 21727, 45367, 13, 12466, 100, 20375, 15166, 12466, 120, 22177, 16843, 12466, 112, 16843, 30143, 16142, 20375, 45367, 30]
EncodingName: p50k_edit
Sample: હેલો, વિશ્વ! તમે આજે કેમ છો? 🇮🇳
Encoded: [156, 103, 117, 156, 104, 229, 156, 103, 110, 156, 104, 233, 11, 220, 156, 103, 113, 156, 103, 123, 156, 103, 114, 156, 104, 235, 156, 103, 113, 0, 220, 156, 103, 97, 156, 103, 106, 156, 104, 229, 220, 156, 103, 228, 156, 103, 250, 156, 104, 229, 220, 156, 103, 243, 156, 104, 229, 156, 103, 106, 220, 156, 103, 249, 156, 104, 233, 30, 12520, 229, 106, 8582, 229, 111]
EncodingName: p50k_edit
Sample: ความรักและการเป็นกันเองเป็นสิ่งสำคัญที่สุดในโลก 🇹🇭
Encoded: [19567, 226, 19567, 100, 19567, 110, 19567, 94, 19567, 96, 19567, 109, 19567, 223, 31479, 223, 19567, 98, 19567, 108, 19567, 223, 19567, 110, 19567, 96, 31479, 222, 19567, 249, 31479, 229, 19567, 247, 19567, 223, 19567, 109, 19567, 247, 31479, 222, 19567, 255, 19567, 229, 31479, 222, 19567, 249, 31479, 229, 19567, 247, 19567, 103, 19567, 112, 31479, 230, 19567, 229, 19567, 103, 19567, 111, 19567, 226, 19567, 109, 19567, 235, 19567, 245, 19567, 113, 31479, 230, 19567, 103, 19567, 116, 19567, 242, 31479, 225, 19567, 247, 31479, 224, 19567, 98, 19567, 223, 12520, 229, 117, 8582, 229, 255]
EncodingName: p50k_edit
Sample: Python vs Java: Which programming language should you learn first?
Encoded: [37906, 3691, 7349, 25, 9022, 8300, 3303, 815, 345, 2193, 717, 30]
EncodingName: p50k_edit
Sample: A journey of a thousand miles begins with a single step. - Lao Tzu
Encoded: [32, 7002, 286, 257, 7319, 4608, 6140, 351, 257, 2060, 2239, 13, 532, 4689, 78, 309, 27624]
EncodingName: p50k_edit
Sample: Die Grenzen meiner Sprache bedeuten die Grenzen meiner Welt. 🇩🇪
Encoded: [32423, 19674, 4801, 502, 7274, 5522, 4891, 3996, 68, 7809, 4656, 19674, 4801, 502, 7274, 370, 2120, 13, 12520, 229, 102, 8582, 229, 103]
EncodingName: p50k_edit
Sample: יש לי כמה שאלות בנוגע לפרויקט החדש שלך. 🇮🇱
Encoded: [33951, 102, 14360, 250, 25529, 14360, 249, 49168, 38269, 14360, 102, 42973, 40010, 27072, 42064, 14360, 239, 147, 254, 27072, 147, 240, 147, 95, 14360, 250, 147, 97, 37778, 27072, 33951, 100, 147, 246, 14360, 242, 147, 245, 147, 241, 50227, 14360, 102, 40010, 147, 248, 13, 12520, 229, 106, 8582, 229, 109]
EncodingName: p50k_edit
Sample: Det är en vacker dag i Sverige. 🇸🇪
Encoded: [11242, 6184, 97, 81, 551, 410, 10735, 48924, 1312, 311, 332, 10045, 13, 12520, 229, 116, 8582, 229, 103]
EncodingName: p50k_edit
Sample: A ∀ x (P(x) → Q(x)) ∧ (∃x P(x)) → ∃x Q(x)
Encoded: [32, 18872, 222, 2124, 357, 47, 7, 87, 8, 15168, 1195, 7, 87, 4008, 18872, 100, 357, 24861, 225, 87, 350, 7, 87, 4008, 15168, 18872, 225, 87, 1195, 7, 87, 8]
EncodingName: p50k_edit
Sample: O Brasil é o maior país da América do Sul. 🇧🇷
Encoded: [46, 39452, 346, 38251, 267, 17266, 1504, 14187, 41200, 12379, 1703, 2634, 30997, 466, 29357, 13, 12520, 229, 100, 8582, 229, 115]
EncodingName: p50k_edit
Sample: L'amore è una forza potente che unisce le persone. 🇮🇹
Encoded: [43, 6, 321, 382, 6184, 101, 555, 64, 329, 4496, 16739, 68, 1125, 555, 271, 344, 443, 2774, 505, 13, 12520, 229, 106, 8582, 229, 117]
EncodingName: p50k_edit
Sample: Είναι μια ηλιόλουστη ημέρα στην Ελλάδα. 🇬🇷
Encoded: [138, 243, 138, 107, 26180, 17394, 29945, 18919, 29945, 17394, 7377, 115, 39377, 29945, 139, 234, 39377, 26517, 139, 227, 38392, 32830, 138, 115, 7377, 115, 34703, 138, 255, 33643, 17394, 18074, 225, 32830, 138, 115, 26180, 7377, 243, 39377, 39377, 138, 105, 138, 112, 17394, 13, 12520, 229, 105, 8582, 229, 115]
EncodingName: p50k_edit
Sample: Teslim tarihi yaklaşıyor, projeyi zamanında bitirmemiz gerekiyor. 🇹🇷
Encoded: [36504, 2475, 256, 2743, 5303, 46251, 5031, 46481, 30102, 88, 273, 11, 386, 73, 2959, 72, 1976, 10546, 30102, 45658, 1643, 2533, 368, 528, 308, 567, 4106, 88, 273, 13, 12520, 229, 117, 8582, 229, 115]
EncodingName: p50k_edit
Sample: Det finnes ingen bedre tid enn nå for å starte noe nytt. 🇳🇴
Encoded: [11242, 957, 2516, 27016, 3996, 260, 29770, 551, 77, 299, 29090, 329, 6184, 98, 923, 68, 645, 68, 299, 88, 926, 13, 12520, 229, 111, 8582, 229, 112]
EncodingName: p50k_edit
Sample: Aanvaard de uitdagingen van het leven met moed en vastberadenheid. 🇳🇱
Encoded: [32, 272, 6862, 446, 390, 334, 270, 67, 3039, 268, 5719, 339, 83, 443, 574, 1138, 6941, 276, 551, 5909, 527, 40780, 28420, 13, 12520, 229, 111, 8582, 229, 109]
EncodingName: p50k_edit
Sample: Chào mừng bạn đến với thế giới của lập trình. 🇻🇳
Encoded: [1925, 24247, 78, 285, 157, 119, 104, 782, 275, 157, 118, 94, 77, 34754, 239, 157, 118, 123, 77, 410, 157, 119, 249, 72, 294, 157, 118, 123, 308, 72, 157, 119, 249, 72, 269, 157, 119, 100, 64, 300, 157, 118, 255, 79, 491, 127, 105, 77, 71, 13, 12520, 229, 119, 8582, 229, 111]
EncodingName: p50k_edit
Sample: Dlaczego warto uczyć się języków obcych? 🇵🇱
Encoded: [35, 75, 330, 89, 1533, 78, 32943, 78, 334, 66, 7357, 38325, 33721, 128, 247, 474, 128, 247, 46355, 10205, 86, 909, 948, 354, 30, 12520, 229, 113, 8582, 229, 109]
EncodingName: p50k_edit
Sample: E = mc², uma equação famosa na física. 🇵🇹
Encoded: [36, 796, 36650, 31185, 11, 334, 2611, 1602, 64, 16175, 28749, 1145, 8546, 12385, 277, 41200, 3970, 13, 12520, 229, 113, 8582, 229, 117]
EncodingName: p50k_edit
Sample: 你今天遇到什么有趣的事情了吗?🇨🇳
Encoded: [19526, 254, 20015, 232, 25465, 34402, 229, 26344, 108, 20015, 222, 20046, 230, 17312, 231, 164, 114, 96, 21410, 12859, 233, 46349, 227, 12859, 228, 28938, 245, 171, 120, 253, 8582, 229, 101, 8582, 229, 111]
EncodingName: p50k_edit
Sample: Nå er det tid for å feire med familie og venner. 🇳🇴
Encoded: [45, 29090, 1931, 1062, 29770, 329, 6184, 98, 730, 557, 1117, 1145, 346, 494, 267, 70, 8710, 1008, 13, 12520, 229, 111, 8582, 229, 112]
EncodingName: p50k_edit
Sample: Þetta er góður dagur til að læra eitthvað nýtt. 🇮🇸
Encoded: [127, 252, 15253, 1931, 308, 10205, 27214, 333, 48924, 333, 21502, 257, 27214, 300, 21241, 430, 304, 270, 400, 6862, 27214, 299, 127, 121, 926, 13, 12520, 229, 106, 8582, 229, 116]
EncodingName: p50k_edit
Sample: გამარჯობა! როგორ ხართ დღეს? 🇬🇪
Encoded: [157, 225, 240, 157, 225, 238, 157, 225, 249, 157, 225, 238, 157, 225, 254, 157, 225, 107, 157, 225, 251, 157, 225, 239, 157, 225, 238, 0, 28053, 225, 254, 157, 225, 251, 157, 225, 240, 157, 225, 251, 157, 225, 254, 28053, 225, 106, 157, 225, 238, 157, 225, 254, 157, 225, 245, 28053, 225, 241, 157, 225, 99, 157, 225, 242, 157, 225, 94, 30, 12520, 229, 105, 8582, 229, 103]
EncodingName: p50k_edit
Sample: Mā te whakawhiti kōrero e whai hua ai tātou. 🇳🇿
Encoded: [44, 10235, 573, 348, 461, 707, 71, 8846, 479, 13090, 34785, 304, 348, 1872, 289, 6413, 257, 72, 256, 10235, 83, 280, 13, 12520, 229, 111, 8582, 229, 123]
EncodingName: p50k_edit
Sample: Это был незабываемый опыт, который я буду помнить всегда.
Encoded: [140, 255, 20375, 15166, 12466, 109, 45035, 30143, 12466, 121, 16843, 140, 115, 16142, 140, 109, 45035, 38857, 16142, 16843, 43108, 45035, 140, 117, 12466, 122, 140, 123, 45035, 20375, 11, 12466, 118, 15166, 20375, 15166, 21169, 45035, 140, 117, 220, 40623, 12466, 109, 35072, 43666, 35072, 12466, 123, 25443, 120, 22177, 18849, 20375, 45367, 12466, 110, 21727, 16843, 140, 111, 43666, 16142, 13]
EncodingName: p50k_edit
Sample: Διαβάζοντας βιβλία, εμπλουτίζουμε τον εαυτό μας με γνώσεις.
Encoded: [138, 242, 29945, 17394, 26638, 138, 105, 138, 114, 26517, 26180, 32830, 17394, 35558, 27169, 29945, 26638, 39377, 138, 107, 17394, 11, 7377, 113, 34703, 46582, 39377, 26517, 139, 227, 32830, 138, 107, 138, 114, 26517, 139, 227, 34703, 30950, 46651, 26517, 26180, 7377, 113, 17394, 139, 227, 32830, 139, 234, 18919, 17394, 35558, 18919, 30950, 7377, 111, 26180, 139, 236, 38392, 30950, 29945, 35558, 13]
EncodingName: p50k_edit
Sample: A számítástechnika világa tele van izgalmas lehetőségekkel. 🇭🇺
Encoded: [32, 264, 89, 6557, 76, 8836, 83, 6557, 4169, 1349, 9232, 39796, 6557, 4908, 5735, 5719, 220, 528, 13528, 5356, 443, 3202, 129, 239, 82, 2634, 469, 74, 7750, 13, 12520, 229, 255, 8582, 229, 118]
EncodingName: p50k_edit
Sample: Vždy je dobré mít plán B, pokud něco nevyjde. 🇨🇿
Encoded: [53, 129, 122, 9892, 11223, 466, 1671, 2634, 285, 8836, 83, 458, 21162, 347, 11, 279, 482, 463, 299, 128, 249, 1073, 497, 7670, 73, 2934, 13, 12520, 229, 101, 8582, 229, 123]
EncodingName: p50k_edit
Sample: Dragostea e un sentiment minunat care ne unește pe toți. 🇷🇴
Encoded: [46022, 455, 18213, 304, 555, 15598, 949, 403, 265, 1337, 497, 17809, 132, 247, 660, 613, 284, 132, 249, 72, 13, 12520, 229, 115, 8582, 229, 112]
EncodingName: p50k_edit
Sample: دیکھو، آسمان میں کتنی تارے ہیں! 🇵🇰
Encoded: [38843, 151, 234, 150, 102, 150, 122, 30335, 148, 234, 17550, 95, 45692, 25405, 12919, 23338, 47048, 151, 234, 150, 118, 220, 150, 102, 41486, 23338, 151, 234, 17550, 103, 12919, 26897, 151, 240, 220, 151, 223, 151, 234, 150, 118, 0, 12520, 229, 113, 8582, 229, 108]
EncodingName: p50k_edit
Sample: Nenda polepole na ujifunze kila siku. 🇹🇿
Encoded: [45, 7438, 16825, 36869, 12385, 334, 73, 361, 403, 2736, 8769, 64, 264, 28643, 13, 12520, 229, 117, 8582, 229, 123]
EncodingName: p50k_edit
Sample: Каква е твоята любима храна? 🇧🇬
Encoded: [140, 248, 16142, 31583, 38857, 16142, 12466, 113, 220, 20375, 38857, 15166, 40623, 20375, 16142, 12466, 119, 141, 236, 140, 109, 18849, 43108, 16142, 220, 141, 227, 21169, 16142, 22177, 16142, 30, 12520, 229, 100, 8582, 229, 105]
EncodingName: p50k_edit
Sample: Sträva alltid efter att bli en bättre version av dig själv.
Encoded: [13290, 11033, 6862, 477, 83, 312, 304, 637, 708, 698, 72, 551, 275, 11033, 926, 260, 2196, 1196, 3100, 264, 73, 11033, 6780, 13]
EncodingName: p50k_edit
Sample: Філософія - це наука про знання. 🇺🇦
Encoded: [140, 97, 141, 244, 30143, 15166, 21727, 15166, 141, 226, 141, 244, 40623, 532, 220, 141, 228, 16843, 12466, 121, 16142, 35072, 31583, 16142, 12466, 123, 21169, 15166, 12466, 115, 22177, 16142, 22177, 22177, 40623, 13, 12520, 229, 118, 8582, 229, 99]
EncodingName: p50k_edit
Sample: Το πρόγραμμα αυτό είναι πολύ ενδιαφέρον. 🇬🇷
Encoded: [138, 97, 26517, 18074, 222, 33643, 139, 234, 42063, 33643, 17394, 34703, 34703, 17394, 26367, 139, 227, 32830, 139, 234, 7377, 113, 138, 107, 26180, 17394, 29945, 18074, 222, 26517, 39377, 139, 235, 7377, 113, 26180, 138, 112, 29945, 17394, 139, 228, 138, 255, 33643, 26517, 26180, 13, 12520, 229, 105, 8582, 229, 115]
EncodingName: p50k_edit
Sample: ^$%#*@!&)(_+=}{|:;"?><,~`'-./][
Encoded: [61, 3, 4, 2, 9, 31, 0, 5, 5769, 62, 47932, 18477, 91, 25, 26, 13984, 6927, 11, 93, 63, 29001, 19571, 7131]
EncodingName: p50k_edit
Sample: 4gH@!0sT*#(9^%$[x{}j+|Yz6;Q]~8
Encoded: [19, 70, 39, 31, 0, 15, 82, 51, 9, 2, 7, 24, 61, 4, 3, 58, 87, 90, 92, 73, 10, 91, 56, 89, 21, 26, 48, 60, 93, 23]
EncodingName: p50k_edit
Sample: wNb)I<>#:i^P]*cR8ytUx1Q`6O@z/
Encoded: [86, 45, 65, 8, 40, 27, 29, 2, 25, 72, 61, 47, 60, 9, 66, 49, 23, 20760, 52, 87, 16, 48, 63, 21, 46, 31, 89, 14]
EncodingName: p50k_edit
Sample: ÄÜö¿¡¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶·¸¹º»¼½¾¿
Encoded: [127, 226, 127, 250, 9101, 126, 123, 126, 94, 44359, 14988, 126, 97, 126, 98, 126, 99, 16273, 37102, 16224, 126, 103, 24328, 126, 105, 7461, 5196, 7200, 22519, 31185, 126, 111, 18265, 126, 113, 26604, 9129, 126, 116, 126, 117, 36165, 17730, 126, 120, 23141, 126, 122, 126, 123]
EncodingName: p50k_edit
Sample: ƒšŠŒŽƒšŠŒŽƒšŠŒŽƒšŠŒŽƒšŠŒŽƒšŠŒŽ
Encoded: [130, 240, 32790, 129, 254, 129, 240, 129, 121, 130, 240, 32790, 129, 254, 129, 240, 129, 121, 130, 240, 32790, 129, 254, 129, 240, 129, 121, 130, 240, 32790, 129, 254, 129, 240, 129, 121, 130, 240, 32790, 129, 254, 129, 240, 129, 121, 130, 240, 32790, 129, 254, 129, 240, 129, 121]
EncodingName: p50k_edit
Sample: 5ħÅŸēýïūē$%#^*()_+{[ö&!@#?>|,.<>
Encoded: [20, 128, 100, 127, 227, 129, 116, 27092, 127, 121, 26884, 20317, 27092, 3, 4, 2, 61, 9, 3419, 62, 10, 90, 58, 9101, 5, 0, 41573, 30, 29, 91, 11, 29847, 29]
EncodingName: p50k_edit
Sample: 1B4t#%&*()_+dF5g^hJk7LmN0pQrS<>?
Encoded: [16, 33, 19, 83, 2, 4, 5, 9, 3419, 62, 10, 67, 37, 20, 70, 61, 71, 41, 74, 22, 43, 76, 45, 15, 79, 48, 81, 50, 27, 29, 30]
EncodingName: p50k_edit
Sample: ¬§±²³µ¶·¹ºª«»¦©¯°±!@#$%^&*()_+
Encoded: [126, 105, 16273, 22519, 31185, 126, 111, 126, 113, 26604, 9129, 126, 117, 36165, 126, 103, 24328, 17730, 126, 99, 16224, 5196, 7200, 22519, 0, 31, 29953, 4, 61, 5, 9, 3419, 62, 10]
EncodingName: p50k_edit
Sample: 8mR5*w7^a$!F(0%#J9@X6vZ1)nU3]_Y/
Encoded: [23, 76, 49, 20, 9, 86, 22, 61, 64, 3, 0, 37, 7, 15, 4, 2, 41, 24, 31, 55, 21, 85, 57, 16, 8, 77, 52, 18, 60, 62, 56, 14]
EncodingName: p50k_edit
Sample: 😊😀😁😂🤣😃😄😅😆😉😊😋😎😍😘😗😙😚☺️🙂🤗🤔
Encoded: [47249, 232, 47249, 222, 47249, 223, 47249, 224, 8582, 97, 96, 47249, 225, 47249, 226, 8582, 11805, 47249, 228, 47249, 231, 47249, 232,