shashi
Version:
Shashi, a simple module to generate, using pseudo-randomness, a universal family/set of hash functions, which produce integer values within the selected range (a prime number)..
197 lines (151 loc) • 8.02 kB
Markdown
### Shashi
[](https://www.npmjs.org/package/shashi)
[](https://www.codacy.com/public/44gatti/shashi)
[](https://codeclimate.com/github/rootslab/shashi)
[](https://codeclimate.com/github/rootslab/shashi)
[](https://github.com/rootslab/shashi#mit-license)
[](http://travis-ci.org/rootslab/shashi)
[](https://david-dm.org/rootslab/shashi)
[](https://david-dm.org/rootslab/shashi#info=devDependencies)
[](http://npm-stat.com/charts.html?package=shashi)
[](https://nodei.co/npm/shashi/)
[](https://nodei.co/npm/shashi/)
[](https://sourcegraph.com/github.com/rootslab/shashi)
[](https://sourcegraph.com/github.com/rootslab/shashi)
[](https://sourcegraph.com/github.com/rootslab/shashi)
> __Shashi__, a simple module to generate, using __pseudo-randomness__, a __universal family/set of hash functions__, which produce __integer values__ within the selected range (a __prime__ number).
###A random bit of theory
> A family __H__ of hash functions is __universal__ if, for any two items in the universe, the __probability of collision__ is as small as possible.
> Briefly, assumed that for every input key k ∈ K and for every hash function h ∈ H:
> - h(k) map to { 0, 1, .., m−1 }
> then, __H is universal__ when, for any two items x != y:
> - Prob( h(x) == h(y) ) = __1/m__
> In other words, chosen __h__ uniformly at random, from a __universal set H__ of hash functions, if h is
> used to hash __n__ arbitrary keys into __m__ slots, then, for a given key __k__, we expect that the total
> number of __collisions__ with k is < __n/m__.
> Universal hash functions could also be used to compute __perfect hashing__, for a determined set of keys.
> __Choosing properly a prime p and a constant c__:
> - p > c*|set of keys| (c >= ~2.09)
> We expect to find a perfect hash mapping for values, in a __finite time__ (with __high probability__), interpreting lists of resulting hashed values as edges to insert in a __bipartite__ or __tripartite__ (hyper) __graph G__ = {V,E}:
> - |E| = n (keys)
> - |V| = p
> we set:
> - |V| = c * |E|
> then we choose a prime >= 2*n:
> - |V| = p = c * |E| = c * n > n * 2
> Algo example; we choose |H| = 2, then H = { h0, h1 }:
>
> - while ( true ) :
> - pick-up at random 2 hash functions from H
>
> - for every key k ∈ K, add resulting edge and vertices to the 2-graph:
> - edge = (v0, v1), with v0 = h0(k), v1 = h1(k)
>
> - now testing the 2-graph acyclicity (it could be performed in linear time).
> - if 2-graph is acyclic:
> - break the loop, well done!
> - from now on, you can build a perfect hashing function for your set of keys,
> in a deterministic way.
> - otherwise let's gamble!!
> - continue loop, picking-up 2 others random hash functions
> See also:
> - __[Universal Hashing](http://en.wikipedia.org/wiki/Universal_hashing)__
> - __[Perfect Hash Function](http://en.wikipedia.org/wiki/Perfect_hash_function)__
> - __[Brando](https://github.com/rootslab/brando)__
> - __[Nprime](https://github.com/rootslab/nprime)__
###Install
```bash
$ npm install shashi [-g]
```
> __require__:
```javascript
var Shashi = require( 'shashi' );
```
###Run Tests
> __to run all test files, install devDependecies:__
```bash
$ cd shashi/
# install or update devDependecies
$ npm install
# run tests
$ npm test
```
> __to execute a single test file simply do__:
```bash
$ node test/file-name.js
```
###Run Benchmarks
> run miscellaneous benchmarks for Shashi:
```bash
$ cd shashi/
$ npm run bench
# or to run a single file
$ node bench/file-name-bench.js
```
----------------------------------------------------------------------------------------------
###Method
> Arguments within [ ] are optional.
#####Generate seed sequence using Math.random.
> get a method to use a family of hash functions.
```javascript
/*
* A trivial example, Shashi( 2, 16, 257 ) generates:
* - 2 hash functions, h0 and h1 (indexes 0, 1)
* - every hash fn, expects/accepts at most 16 items to encode
* - the range of items should be: [0,255], then using 1 byte
* per item in the seed sequence.
*
* NOTE:
* - if the p number is not a prime, the next greater prime number will be calculated.
* - if the p number is out of range (> 2^53), you'll get an Error.
*/
Shashi( Number hash_fn, Number items_to_hash, Number prime_for_seed_range ) : Array
```
> It returns an Array composed by:
> - a method used to retrieve and execute a particular hash function, using an index (0-k).
> - the underlying seed __[Sequence](https://github.com/rootslab/brando)__, used for generating the hashed value (an integer).
```javascript
/*
* When you only need to regenerate random data for seeding your hash functions,
* you can simply refresh the seed sequence using Sequence#fill or Sequence#parse.
*
* See also https://github.com/rootslab/brando.
*/
Array : [ Function uhash, Sequence seed ]
// universal set of hash functions
uhash : function ( Number fn_index, Buffer input_data [, Number bytes_per_item ] )
```
#####Generate seed sequence using a random sample.
> When a data Buffer is used to fill the seed sequence, the method automatically switches to the callback mode.
```javascript
Shashi( Number h, Number i, Number p [, Buffer src [, Function cback ] ] ) : undefined
```
> the callback gets 3 arguments:
```javascript
/*
* NOTE: an error was returned when:
* - the number supplied for the seed sequence range is not a prime.
* - the sample data are not enough to fill the random seed Sequence.
*/
cback : function ( Error e, Function uhash, Sequence seed ) { .. }
```
> See also __[examples](example/)__.
### MIT License
> Copyright (c) 2015 < Guglielmo Ferri : 44gatti@gmail.com >
> Permission is hereby granted, free of charge, to any person obtaining
> a copy of this software and associated documentation files (the
> 'Software'), to deal in the Software without restriction, including
> without limitation the rights to use, copy, modify, merge, publish,
> distribute, sublicense, and/or sell copies of the Software, and to
> permit persons to whom the Software is furnished to do so, subject to
> the following conditions:
> __The above copyright notice and this permission notice shall be
> included in all copies or substantial portions of the Software.__
> THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND,
> EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
> IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
> CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
> TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
> SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
[](https://github.com/igrigorik/ga-beacon)