wink-nlp
Developer friendly Natural Language Processing ✨
# [Operational update](https://github.com/winkjs/wink-nlp/releases/tag/2.3.2)
## Version 2.3.2 November 30, 2024
### ⚙️ Updates
- Updated test cases for new model release.
- Added word vector example link in README.
# [Fixed some type definitions](https://github.com/winkjs/wink-nlp/releases/tag/2.3.1)
## Version 2.3.1 November 24, 2024
### 🐛 Fixes
- Updated some BM25Vectorizer method types to match the implementation, thanks to @pavloDeshko.
# [Enabled more special space characters handling](https://github.com/winkjs/wink-nlp/releases/tag/2.3.0)
## Version 2.3.0 May 19, 2024
### ✨ Features
- Detokenization now restores em/en, third/quarter, thin/hair, medium math space characters & narrow non-breaking space characters besides the regular nbsp.
# [Improved error handling in contextual vectors](https://github.com/winkjs/wink-nlp/releases/tag/2.2.2)
## Version 2.2.2 May 08, 2024
### ✨ Features
- `.contextualVectors()` now throws an error if (a) word vectors are not loaded, or (b) `lemma: true` is set and "pos" is missing in the NLP pipe.
### 🐛 Fixes
- Refined typescript definitions further.
# [Added missing typescript definitions](https://github.com/winkjs/wink-nlp/releases/tag/2.2.1)
## Version 2.2.1 May 06, 2024
### 🐛 Fixes
- Added missing typescript definitions for word embeddings, besides a few other typescript fixes.
# [Added non-breaking space handling capabilities](https://github.com/winkjs/wink-nlp/releases/tag/2.2.0)
## Version 2.2.0 April 03, 2024
### ✨ Features
- Detokenization restores both regular and non-breaking spaces to their original positions.
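To illustrate the idea of restoring spaces to their original positions, here is a minimal sketch in plain JavaScript. It is not winkNLP's tokenizer: the `tokenize` and `detokenize` functions and the `precedingSpaces` field are hypothetical names, and the sketch assumes that each token simply remembers the exact whitespace that came before it.

```javascript
// Illustrative only: a toy tokenizer that splits on regular spaces and
// non-breaking spaces (U+00A0), remembering what preceded each token.
function tokenize(text) {
  const tokens = [];
  const pattern = /([ \u00a0]*)([^ \u00a0]+)/g;
  let match = pattern.exec(text);
  while (match !== null) {
    tokens.push({ value: match[2], precedingSpaces: match[1] });
    match = pattern.exec(text);
  }
  return tokens;
}

// Rebuild the text by putting each token's recorded spaces back in front of it.
function detokenize(tokens) {
  return tokens.map((t) => t.precedingSpaces + t.value).join('');
}

const text = '10\u00a0km away';
const rebuilt = detokenize(tokenize(text));
console.log(rebuilt === text); // the non-breaking space survives the round trip
```

Trailing whitespace is dropped by this toy version; a fuller implementation would need to record it separately.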
# [Introducing cosine similarity for word vectors](https://github.com/winkjs/wink-nlp/releases/tag/2.1.0)
## Version 2.1.0 March 24, 2024
### ✨ Features
- You can now use `similarity.vector.cosine( vectorA, vectorB )` to compute similarity between two vectors on a scale of 0 to 1.
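For readers unfamiliar with the measure, the following is a minimal sketch of textbook cosine similarity in plain JavaScript. The `cosine` function here is a hypothetical stand-in, not winkNLP's implementation. Raw cosine can be negative for arbitrary vectors, and how the library arrives at its 0 to 1 scale is not covered in this entry.

```javascript
// Illustrative only: textbook cosine similarity between two numeric arrays.
function cosine(vectorA, vectorB) {
  if (vectorA.length !== vectorB.length) {
    throw new Error('Vectors must have the same length.');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < vectorA.length; i += 1) {
    dot += vectorA[i] * vectorB[i];
    normA += vectorA[i] * vectorA[i];
    normB += vectorB[i] * vectorB[i];
  }
  // A zero vector has no direction; treat its similarity as 0 (an assumption).
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

console.log(cosine([1, 0], [1, 0])); // 1 (same direction)
console.log(cosine([1, 0], [0, 1])); // 0 (orthogonal)
```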
# [Word embeddings have arrived!](https://github.com/winkjs/wink-nlp/releases/tag/2.0.0)
## Version 2.0.0 March 24, 2024
### ✨ Features
- Seamless word embedding integration enhances winkNLP's semantic capabilities.
- Pre-trained 100-dimensional word embeddings for over 350,000 English words released: [wink-embeddings-sg-100d](https://github.com/winkjs/wink-embeddings-sg-100d).
- API remains unchanged, so no code updates are needed for existing projects. The new APIs include:
  - **Obtain vector for a token:** Use the `.vectorOf( token )` API.
  - **Compute sentence/document embeddings:** Employ the `as.vector` helper: use `.out( its.lemma, as.vector )` on tokens of a sentence or document. You can also use `its.value` or `its.normal`. Tokens can be pre-processed to remove stop words etc. using the `.filter()` API. Note that the `as.vector` helper uses an averaging technique.
  - **Generate contextual vectors:** Leverage the `.contextualVectors()` method on a document. Useful for pure browser-side applications! Generate custom vectors contextually relevant to your corpus and use them in place of larger pre-trained wink embeddings.
- Comprehensive documentation along with interesting examples is coming up shortly. Stay tuned for updates!
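The averaging technique mentioned for the `as.vector` helper can be pictured with a small sketch in plain JavaScript. This is an assumption-laden illustration, not the library's code: `averageVectors` is a hypothetical name, the 2-dimensional vectors are made up, and details such as handling of out-of-vocabulary tokens or normalization may differ in winkNLP.

```javascript
// Illustrative only: element-wise mean of token vectors as a sentence vector.
function averageVectors(vectors) {
  if (vectors.length === 0) return [];
  const size = vectors[0].length;
  const sum = new Array(size).fill(0);
  for (const vector of vectors) {
    for (let i = 0; i < size; i += 1) {
      sum[i] += vector[i];
    }
  }
  return sum.map((total) => total / vectors.length);
}

// Toy token vectors; real embeddings in this release are 100-dimensional.
const tokenVectors = [[1, 2], [3, 4], [5, 6]];
console.log(averageVectors(tokenVectors)); // [ 3, 4 ]
```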
# [Added Deno example](https://github.com/winkjs/wink-nlp/releases/tag/1.14.3)
## Version 1.14.3 July 21, 2023
### ✨ Features
- Added a live example for how to run winkNLP on Deno.
# [Fixed a bug](https://github.com/winkjs/wink-nlp/releases/tag/1.14.2)
## Version 1.14.2 July 1, 2023
### 🐛 Fixes
- Parameters in `markup()` are now optional in TS code; squashed a [typescript declaration bug](https://github.com/winkjs/wink-nlp/commit/e6314658766cfa4d40f96b89c211d2d98358cfae).
# [Squashed a bug](https://github.com/winkjs/wink-nlp/releases/tag/1.14.1)
## Version 1.14.1 June 11, 2023
### 🐛 Fixes
- Fixed a [typescript declaration](https://github.com/winkjs/wink-nlp/commit/0ad0690e93f59397dbdde7b876f60c2e5875215b).
# [Introducing helper for extracting important sentences from a document](https://github.com/winkjs/wink-nlp/releases/tag/1.14.0)
## Version 1.14.0 May 20, 2023
### ✨ Features
- You can now use the `its.sentenceWiseImportance` helper to obtain sentence-wise importance (on a scale of 0 to 1) of a document, if it is supported by the language model.
- Check out the live example [How to visualize key sentences in a document?](https://observablehq.com/@winkjs/how-to-visualize-key-sentences-in-a-document)
# [Operational update](https://github.com/winkjs/wink-nlp/releases/tag/1.13.1)
## Version 1.13.1 March 27, 2023
### ⚙️ Updates
- Some behind-the-scenes model improvements.
- Added clarity on typescript configuration in README.
# [Improving mark's functionality in custom entities](https://github.com/winkjs/wink-nlp/releases/tag/1.13.0)
## Version 1.13.0 December 09, 2022
### ✨ Features
- Mark allows marking w.r.t. the last element of the pattern. For example, if a pattern matches `a fluffy cat` then `mark: [-2, -1]` will extract `fluffy cat`, which is especially useful when the match length is unknown.
- Improved error handling while processing mark's arguments.
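The negative-index behavior of `mark` can be sketched in plain JavaScript as follows. This is only one interpretation of the example above: `applyMark` is a hypothetical function, and both the index resolution and the error condition are assumptions rather than the library's actual logic.

```javascript
// Illustrative only: resolve a [start, end] mark against the matched tokens.
function applyMark(matchedTokens, mark) {
  const n = matchedTokens.length;
  // Negative indexes count back from the last element (-1 is the last token).
  const resolve = (index) => (index < 0 ? n + index : index);
  const start = resolve(mark[0]);
  const end = resolve(mark[1]);
  if (start < 0 || end >= n || start > end) {
    throw new Error('Mark is outside the matched span.');
  }
  return matchedTokens.slice(start, end + 1);
}

console.log(applyMark(['a', 'fluffy', 'cat'], [-2, -1])); // [ 'fluffy', 'cat' ]
```

Because the indexes are resolved from the end, the same mark picks the last two tokens no matter how many tokens the pattern matched.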
# [Operational update](https://github.com/winkjs/wink-nlp/releases/tag/1.12.3)
## Version 1.12.3 November 18, 2022
### ⚙️ Updates
- README is now more informative and links to examples and benchmarks.
- Benchmarked on the latest machine and browser versions.
# [Ready for Node.js version 18](https://github.com/winkjs/wink-nlp/releases/tag/1.12.2)
## Version 1.12.2 October 13, 2022
### 🐛 Fixes
- Fixed incorrect install command in README.
# [Ready for Node.js version 18](https://github.com/winkjs/wink-nlp/releases/tag/1.12.1)
## Version 1.12.1 October 13, 2022
### ⚙️ Updates
- Ready for the future: we have tested winkNLP, including its models, on Node.js version 18.
# [Some enhancements plus earned OpenSSF best practices passing badge](https://github.com/winkjs/wink-nlp/releases/tag/1.12.0)
## Version 1.12.0 May 13, 2022
### ✨ Features
- winkNLP earned [Open Source Security Foundation (OpenSSF) Best Practices passing badge](https://bestpractices.coreinfrastructure.org/en/projects/6035).
- `.bowOf()` API of [BM25Vectorizer](https://winkjs.org/wink-nlp/bm25-vectorizer.html) now supports processing of OOV tokens, which is useful for cosine similarity computation.
- [Document](https://winkjs.org/wink-nlp/document.html) has a new API, `.pipeConfig()`, to inquire about the active processing pipeline.
# [Enhancing custom entities & BM25Vectorizer](https://github.com/winkjs/wink-nlp/releases/tag/1.11.0)
## Version 1.11.0 January 30, 2022
### ✨ Features
- Obtain bag-of-words for a tokenized text from BM25Vectorizer using the `.bowOf()` API, which is useful for bow-based [similarity](https://winkjs.org/wink-nlp/similarity.html) computation.
- [`learnCustomEntities()`](https://winkjs.org/wink-nlp/learn-custom-entities.html) displays a console warning if a complex [shorthand pattern](https://winkjs.org/wink-nlp/custom-entities.html) is likely to cause a learning/execution slowdown.
# [Enabling loading of BM25Vectorizer model](https://github.com/winkjs/wink-nlp/releases/tag/1.10.0)
## Version 1.10.0 November 18, 2021
### ✨ Features
- Easily load BM25Vectorizer's model using the newly introduced `.loadModel()` API.
# [Enhancing Typescript support](https://github.com/winkjs/wink-nlp/releases/tag/1.9.0)
## Version 1.9.0 November 06, 2021
### ✨ Features
- We have enhanced typescript support to allow easy addition of new typescript enabled language models.
### ⚙️ Updates
- Added naive wikification showcase in README.
# [Operational update](https://github.com/winkjs/wink-nlp/releases/tag/1.8.1)
## Version 1.8.1 September 22, 2021
### ⚙️ Updates
- Included NLP Pipe details in the README file.
# [Introducing Typescript support](https://github.com/winkjs/wink-nlp/releases/tag/1.8.0)
## Version 1.8.0 July 31, 2021
### ✨ Features
- We have added support for Typescript.
# [Operational update](https://github.com/winkjs/wink-nlp/releases/tag/1.7.2)
## Version 1.7.2 July 15, 2021
### ⚙️ Updates
- Some behind-the-scenes updates & fixes.
# [Operational update](https://github.com/winkjs/wink-nlp/releases/tag/1.7.1)
## Version 1.7.1 July 09, 2021
### ⚙️ Updates
- Improved documentation.
# [Adding more similarity methods & an as helper](https://github.com/winkjs/wink-nlp/releases/tag/1.7.0)
## Version 1.7.0 July 01, 2021
### ✨ Features
- Supported similarity methods are now cosine for bag of words, and Tversky & Otsuka-Ochiai (oo) for sets.
- Obtain a JS set via the `as.set` helper.
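As a rough guide to what the two set measures compute, here is a sketch of their textbook definitions in plain JavaScript. The function names are hypothetical and do not mirror winkNLP's API, and the default `alpha` and `beta` weights of 0.5 for Tversky are an assumption made for this example.

```javascript
// Illustrative only: count the items two sets have in common.
function intersectionSize(setA, setB) {
  let count = 0;
  for (const item of setA) {
    if (setB.has(item)) count += 1;
  }
  return count;
}

// Otsuka-Ochiai coefficient: |A ∩ B| / sqrt(|A| * |B|).
function otsukaOchiai(setA, setB) {
  if (setA.size === 0 || setB.size === 0) return 0;
  return intersectionSize(setA, setB) / Math.sqrt(setA.size * setB.size);
}

// Tversky index: |A ∩ B| / (|A ∩ B| + alpha * |A - B| + beta * |B - A|).
function tversky(setA, setB, alpha = 0.5, beta = 0.5) {
  const common = intersectionSize(setA, setB);
  const onlyA = setA.size - common;
  const onlyB = setB.size - common;
  const denominator = common + alpha * onlyA + beta * onlyB;
  return denominator === 0 ? 0 : common / denominator;
}

const a = new Set(['rain', 'go', 'away']);
const b = new Set(['rain', 'come', 'again']);
console.log(otsukaOchiai(a, b).toFixed(3)); // 0.333
console.log(tversky(a, b).toFixed(3)); // 0.333
```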
# [Enabling configurable annotation pipeline](https://github.com/winkjs/wink-nlp/releases/tag/1.6.0)
## Version 1.6.0 June 27, 2021
### ✨ Features
- No need to run the entire annotation pipeline: you can now select only the annotations you want, or run just tokenization by specifying an empty pipe.
# [Operational update](https://github.com/winkjs/wink-nlp/releases/tag/1.5.0)
## Version 1.5.0 June 22, 2021
### ⚙️ Updates
- Exposed `its` and `as` helpers via the instance of winkNLP as well.
# [Introducing cosine similarity & readability stats helper](https://github.com/winkjs/wink-nlp/releases/tag/1.4.0)
## Version 1.4.0 June 15, 2021
### ✨ Features
- Cosine similarity is available on Bag of Words.
- You can now use the `its.readabilityStats` helper to obtain a document's readability statistics, if it is supported by the language model.
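Cosine similarity on a bag of words treats each term count as one dimension. The sketch below shows the idea in plain JavaScript, assuming a bag of words is a plain object mapping terms to counts. `bowCosine` is a hypothetical name and this is not winkNLP's implementation.

```javascript
// Illustrative only: cosine similarity between two term -> count objects.
function bowCosine(bowA, bowB) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (const term of Object.keys(bowA)) {
    normA += bowA[term] * bowA[term];
    // Only terms present in both bags contribute to the dot product.
    if (Object.prototype.hasOwnProperty.call(bowB, term)) {
      dot += bowA[term] * bowB[term];
    }
  }
  for (const term of Object.keys(bowB)) {
    normB += bowB[term] * bowB[term];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

const bowA = { rain: 2, go: 1 };
const bowB = { rain: 1, come: 1 };
console.log(bowCosine(bowA, bowB).toFixed(3)); // 0.632
```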
# [Adding long pending lemmatizer support](https://github.com/winkjs/wink-nlp/releases/tag/1.3.0)
## Version 1.3.0 May 22, 2021
### ✨ Features
- You can now use the `its.lemma` helper to obtain the lemma of words.
# [Introducing support for browser ready language model](https://github.com/winkjs/wink-nlp/releases/tag/1.2.0)
## Version 1.2.0 December 24, 2020
### ✨ Features
- We have added support for browser ready language model.
- Now easily vectorize text using the bm25-based vectorizer.
### ⚙️ Updates
- Examples in README now run on [RunKit](https://npm.runkit.com/wink-nlp) using web model!
# [Enabling add-ons to support new language model](https://github.com/winkjs/wink-nlp/releases/tag/1.1.0)
## Version 1.1.0 September 18, 2020
### ✨ Features
- We have enabled add-ons to support enhanced language models, paving the way for new `its` helpers.
- Now use the [`its.stem`](https://winkjs.org/wink-nlp/its-as-helper.html) helper to obtain stems of the words using Porter Stemmer Algorithm V2.
# [Operational update](https://github.com/winkjs/wink-nlp/releases/tag/1.0.1)
## Version 1.0.1 August 24, 2020
### ⚙️ Updates
- Also benchmarked on Node.js v12 & v14 and updated the reported speed to the minimum observed.
# [Announcing the stable version 1.0.0](https://github.com/winkjs/wink-nlp/releases/tag/1.0.0)
## Version 1.0.0 August 21, 2020
### ⚙️ Updates
- Happy to release version 1.0.0 for you!
- You can optionally include custom entity detection while running speed benchmark.
# [Operational update](https://github.com/winkjs/wink-nlp/releases/tag/0.4.0)
## Version 0.4.0 August 9, 2020
### ⚙️ Updates
- Getting ready to move to version 1.0.0; almost there!
# [Operational updates](https://github.com/winkjs/wink-nlp/releases/tag/0.3.1)
## Version 0.3.1 August 3, 2020
### ⚙️ Updates
- Some behind-the-scenes updates to test cases.
- Updated the version of English light language model to the latest, 0.3.0.
# [Simplified language model installation](https://github.com/winkjs/wink-nlp/releases/tag/0.3.0)
## Version 0.3.0 July 29, 2020
### ✨ Features
- No need to remember or copy/paste a long GitHub URL for language model installation. The new script installs the latest version for you automatically.
# [Improved custom entities](https://github.com/winkjs/wink-nlp/releases/tag/0.2.0)
## Version 0.2.0 July 21, 2020
### ✨ Features
- We have added `.parentCustomEntity()` API to `.tokens()` API.
### 🐛 Fixes
- Accessing custom entities was failing whenever there were no custom entities. Now things are as they should be: it tells you that there are none!
# [Improved interface with language model](https://github.com/winkjs/wink-nlp/releases/tag/0.1.0)
## Version 0.1.0 June 24, 2020
### ✨ Features
- We have improved the interface with the language model; it now supports the new format.