markov-strings
A Markov string generator
---
This is the readme for markov-strings **3.x.x**. The docs for the older **2.x.x** are [here](https://github.com/scambier/markov-strings/tree/v2).
---
A simplistic Markov chain text generator.
Give it an array of strings, and it will output a randomly generated string.
<sup><i>A Rust port of this library is available [here](https://github.com/scambier/markov-strings-rust).</i></sup>
This module was created for the Mastodon bot [@BelgicaNews](https://botsin.space/@BelgicaNews).
- Markov-strings
  - Prerequisites
  - Installing
  - Usage
  - API
    - `new Markov([options])`
    - `.addData(data)`
    - `.generate([options])`
    - `.export()` and `.import(data)`
  - Unit tests
  - Changelog
  - Running the tests
## Prerequisites

Built and tested with Node.js 18.

## Installing

`npm install --save markov-strings`

## Usage

```js
import Markov from 'markov-strings'
// Not recommended: you can use `require()` if needed, instead of `import`
// const Markov = require('markov-strings').default

// Build the Markov generator
const markov = new Markov({ stateSize: 2 })

// Add data for the generator
const data = [/* insert a few hundred or a few thousand sentences here */]
markov.addData(data)

const options = {
  maxTries: 20, // Give up if no sentence has been produced after 20 tries (default is 10)

  // If you want to get seeded results, you can provide an external PRNG.
  prng: Math.random, // Default value if left empty

  // You'll often need to manually filter raw results to get something that fits your needs.
  filter: (result) => {
    return result.string.split(' ').length >= 5 && // At least 5 words
      result.string.endsWith('.') // End sentences with a dot.
  }
}

// Generate a sentence
const result = markov.generate(options)
console.log(result)
/*
{
  string: 'lorem ipsum dolor sit amet etc.',
  score: 42,
  tries: 5,
  refs: [ an array of objects ]
}
*/
```
Markov-strings is built in TypeScript, and exports several types to help you. Take a look at [the source](https://github.com/scambier/markov-strings/blob/master/src/index.ts) to see how it works.
## API

### `new Markov([options])`

Create a generator instance.

#### options

```js
{
  stateSize: number
}
```
The `stateSize` is the number of words in each "link" of the generated sentence. `1` will output mostly nonsensical sentences. `2` is a sensible default for most cases. `3` or more can produce good sentences, provided your corpus is large enough to support it.
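
To make the effect of `stateSize` more concrete, here is a minimal sketch of a generic word-level Markov transition table. It is an illustration only and is not taken from the library's source; the function name `buildTransitions` and the sample sentences are assumptions made for this example. Each state is a run of `stateSize` words, mapped to the words that were seen right after it.

```js
// Illustrative only: a generic n-word transition table, not markov-strings internals.
// `buildTransitions` is a hypothetical helper written for this example.
function buildTransitions(sentences, stateSize) {
  const transitions = new Map()
  for (const sentence of sentences) {
    const words = sentence.split(' ')
    // Slide a window of `stateSize` words and record the word that follows it
    for (let i = 0; i + stateSize < words.length; i++) {
      const state = words.slice(i, i + stateSize).join(' ')
      const next = words[i + stateSize]
      if (!transitions.has(state)) transitions.set(state, [])
      transitions.get(state).push(next)
    }
  }
  return transitions
}

const corpus = ['the cat sat on the mat', 'the cat ran on the road']

// With a state size of 1, "the" can be followed by unrelated words
const t1 = buildTransitions(corpus, 1)
console.log(t1.get('the')) // [ 'cat', 'mat', 'cat', 'road' ]

// With a state size of 2, each state carries more context
const t2 = buildTransitions(corpus, 2)
console.log(t2.get('the cat')) // [ 'sat', 'ran' ]
```

A larger state size leaves fewer possible continuations per state, which is why it tends to need a bigger corpus to produce varied output.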
### `.addData(data)`

To function correctly, the Markov generator needs its internal data to be correctly structured. `.addData(data)` lets you add raw data, which is automatically formatted to fit the internal structure.
You can call `.addData(data)` as often as you need, **with new data each time (!)**. Calling `.addData()` several times with the same data is not recommended, because it skews the random generation of results.
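
One possible way to avoid feeding the same sentences twice is to track what has already been added and pass only the new ones to `.addData()`. The sketch below assumes plain string data; the `seen` set and the `onlyNew` helper are hypothetical names for this example and are not part of the library.

```js
// Hypothetical helper: remembers sentences that were already handed to the generator
const seen = new Set()

function onlyNew(batch) {
  const fresh = []
  for (const sentence of batch) {
    if (!seen.has(sentence)) {
      seen.add(sentence)
      fresh.push(sentence)
    }
  }
  return fresh
}

const firstBatch = onlyNew(['lorem ipsum', 'dolor sit amet'])
const secondBatch = onlyNew(['dolor sit amet', 'consectetur adipiscing'])

console.log(firstBatch) // [ 'lorem ipsum', 'dolor sit amet' ]
console.log(secondBatch) // [ 'consectetur adipiscing' ]

// With a generator instance, each filtered batch would then be added once:
// markov.addData(firstBatch)
// markov.addData(secondBatch)
```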
#### data
```js
string[] | Array<{ string: string }>
```
`data` is an array of strings (sentences), or an array of objects. If you wish to use objects, each one must have a `string` attribute. The bigger the array, the better and more varied the results.
Examples:
```js
[ 'lorem ipsum', 'dolor sit amet' ]
```
or
```js
[
  { string: 'lorem ipsum', attr: 'value' },
  { string: 'dolor sit amet', attr: 'other value' }
]
```
The additional data passed with objects will be returned in the `refs` array of the generated sentence.
### `.generate([options])`

Returns an object of type `MarkovResult`:

```ts
{
  string: string, // The resulting sentence
  score: number, // A relative "score" based on the number of possible permutations. Higher is "better", but the actual value depends on your corpus
  refs: Array<{ string: string }>, // The array of references used to build the sentence
  tries: number // The number of tries it took to output this result
}
```
The `refs` array contains every object that was used to build the sentence. This can be useful for fetching metadata or computing statistics.
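
As a rough sketch of the kind of statistics `refs` makes possible, the example below counts how often each value of an extra property appears among the references. The `sampleResult` object is hand-written to match the documented result shape and is not real library output; `countByKey` is a hypothetical helper.

```js
// Hand-written stand-in for a generated result, following the documented shape
const sampleResult = {
  string: 'lorem ipsum dolor sit amet',
  score: 3,
  tries: 1,
  refs: [
    { string: 'lorem ipsum dolor', attr: 'value' },
    { string: 'dolor sit amet', attr: 'other value' },
    { string: 'lorem ipsum', attr: 'value' }
  ]
}

// Hypothetical helper: count how many refs share each value of `key`
function countByKey(refs, key) {
  const counts = {}
  for (const ref of refs) {
    const value = ref[key]
    counts[value] = (counts[value] || 0) + 1
  }
  return counts
}

const attrCounts = countByKey(sampleResult.refs, 'attr')
console.log(attrCounts) // { value: 2, 'other value': 1 }
```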
#### options

```ts
{
  maxTries: number, // The maximum number of attempts before giving up (default is 10)
  prng: () => number, // An external pseudo-random number generator, if you want seeded results (default is Math.random)
  filter: (result: MarkovResult) => boolean // A callback to filter results (see example above)
}
```
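
For seeded results, any function that behaves like `Math.random` (no arguments, returns a number in `[0, 1)`) should be usable as `prng`. The sketch below uses mulberry32, a small well-known seedable generator; it is one option among many and is not something the library ships. The names `mulberry32`, `rngA` and `rngB` are chosen for this example.

```js
// mulberry32: a small seedable PRNG returning floats in [0, 1)
function mulberry32(seed) {
  let a = seed >>> 0
  return function () {
    a = (a + 0x6d2b79f5) | 0
    let t = Math.imul(a ^ (a >>> 15), 1 | a)
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296
  }
}

// Two generators built from the same seed yield the same sequence
const rngA = mulberry32(1234)
const rngB = mulberry32(1234)
const sequenceA = [rngA(), rngA(), rngA()]
const sequenceB = [rngB(), rngB(), rngB()]
console.log(sequenceA.every((value, i) => value === sequenceB[i])) // true

// With a generator instance, the seeded function would be passed as an option:
// const result = markov.generate({ prng: mulberry32(1234) })
```

Reusing the same seed with the same corpus and options is what makes a run reproducible; a different seed gives a different, but still repeatable, sequence.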
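### `.export()` and `.import(data)`

You can export and import the corpus built by the generator. The exported data is a serializable object; if you serialize it (for example to JSON), you must deserialize it back into an object before re-importing it.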
[ ](https://github.com/scambier/markov-strings/issues/9)
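
A minimal sketch of the serialize/deserialize round trip, assuming JSON as the storage format. The `exported` object here is a placeholder standing in for the return value of `markov.export()`; its contents are invented for this example and do not reflect the real export structure.

```js
// Placeholder for `markov.export()`; the real exported object has its own structure
const exported = { placeholder: ['any', 'serializable', 'data'] }

// Serialize to a string, for example before writing it to a file or a database
const serialized = JSON.stringify(exported)

// Deserialize back into an object before handing it to `.import()`
const restored = JSON.parse(serialized)
console.log(typeof serialized) // string
console.log(typeof restored) // object

// With real instances, the round trip would look like this:
// const serialized = JSON.stringify(markov.export())
// otherMarkov.import(JSON.parse(serialized))
```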
## Running the tests

`npm test`