npm-kludge-search
=================
Kludgy fast search of the `npm` registry.
## INSTALLATION
```
$ npm i -g npm-kludge-search
```
Please be patient -- downloading and building the index takes 3-4 minutes.
## COMPLETION (experimental)
You can get completion over package names. This feature is experimental and only
works in the `bash` shell at present.
```
$ . <(npm-kludge-search --script)
$ ni grunt-fi<TAB>
```
This should show you all 67 possibilities, from `grunt-figlet` to `grunt-fixtures2js`.
The alias `ni` is short for `npm install`. Making completion work for
subcommands of `npm` is tricky; this is just a proof of concept.
### Troubleshooting completion
The syntax `. <(cmd)` does not work with `bash 3.2`, which is the standard shell
on OS X. If you have `brew`, you can install a newer version of bash:
```
$ brew install bash
$ exec bash -l
$ echo $BASH_VERSION
4.3.33(1)-release
```
## COMPLETION API (experimental)
When you `require` npm-kludge-search as a module, it exposes one API function,
`complete`:
#### `complete( term, stream, done )`
This will write each completion of `term` to `stream`. After all completions
are written, the callback `done` is called.
**Example:**
```
var nks = require('npm-kludge-search');
nks.complete('foo', process.stdout, function () {});
```
Writes the completions of `foo` to stdout:
```
foo
foo-bar-baz
...
foounit
```
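If you want the completions as an array rather than on stdout, a small wrapper around a `Writable` stream should do. This is only a sketch: `collectCompletions` is a hypothetical helper, not part of this module, and it assumes completions arrive newline-separated, as in the sample output above.

```javascript
var Writable = require('stream').Writable;

// Hypothetical helper (not part of npm-kludge-search): gather completions
// into an array. `complete` is passed in so the helper can be tried with a stub.
function collectCompletions(complete, term, callback) {
  var chunks = [];
  var sink = new Writable({
    write: function (chunk, encoding, next) {
      chunks.push(chunk.toString());
      next();
    }
  });
  complete(term, sink, function () {
    // Assumption: completions are newline-separated, as in the sample output.
    var names = chunks.join('').split('\n').filter(Boolean);
    callback(names);
  });
}

// Usage with the real module (requires the package to be installed):
// var nks = require('npm-kludge-search');
// collectCompletions(nks.complete, 'foo', function (names) {
//   console.log(names.length + ' completions');
// });
```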
## USE
Now you have fast local searching. For example, here is a worst-case
search, where the entire index is searched in order to find a non-indexed
substring, and only a single hit is returned:
```
$ time npm-kludge-search kludge
NAME DESCRIPTION AUTHOR DATE VERSION KEYWORDS
npm-kludge-search Kludgy fast npm searcher =smikes 2015-01-27 2.5.0 npm, search, fast
Found: 1 packages
real 0m1.026s
```
### Searching for a specific module by name
Use `-n` to search for a specific module by name. Exact match only.
```
$ time npm-kludge-search -n npm-kludge-search
NAME DESCRIPTION AUTHOR DATE VERSION KEYWORDS
npm-kludge-search Kludgy fast npm searcher =smikes 2015-01-27 2.5.0 npm, search, fast
Found: 1 packages
real 0m0.302s
```
### Full-text searching
The full text index includes package names, descriptions, keywords,
and the author fields. Only one search term is supported, and there
is no stemming or other fancy stuff (as yet).
To find out how many modules mention food:
```
$ time npm-kludge-search food |wc
28 322 3411
real 0m0.309s
```
Unicode characters work, provided your terminal supports them:
```
$ time npm-kludge-search 目 |wc
53 364 7860
real 0m0.930s
```
On my machine, indexed searches take about 0.5s and full table scans take 1-2s.
### Searching by author
There isn't an explicit author index, but because of the convention of
prefixing author names with `=`, we can fake it pretty easily:
```
$ time npm-kludge-search =substack |wc
473 5365 67151
real 0m0.584s
```
### Reporters
There are currently three reporters: `slow` (the default), `fast`, and `json`.
The `slow` reporter waits until all the results are returned and then formats
them nicely in a table using `columnify`.
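To see why a table reporter has to wait for every result, here is a rough sketch of the idea. It is not how `columnify` is implemented; `formatTable` and the row fields are made up for illustration. Each column is as wide as its longest value, so nothing can be printed until all rows are known.

```javascript
// Hypothetical sketch, not the columnify implementation: column widths
// depend on the longest value, so every row must be known before printing.
function formatTable(rows, columns) {
  var widths = columns.map(function (col) {
    return rows.reduce(function (max, row) {
      return Math.max(max, String(row[col]).length);
    }, col.length);
  });
  // Pad each value to its column width, then trim trailing spaces.
  function formatLine(values) {
    return values.map(function (value, i) {
      var text = String(value);
      return text + new Array(widths[i] - text.length + 1).join(' ');
    }).join(' ').replace(/\s+$/, '');
  }
  var header = formatLine(columns.map(function (col) {
    return col.toUpperCase();
  }));
  var body = rows.map(function (row) {
    return formatLine(columns.map(function (col) { return row[col]; }));
  });
  return [header].concat(body).join('\n');
}
```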
The `fast` reporter emits a line at a time in fixed format. Paradoxically, it
is slightly slower than the `slow` reporter.
The `json` reporter emits a stream of unformatted `JSON` objects.
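The `json` output is meant for other programs to consume. A minimal parsing sketch follows, assuming the reporter writes one object per line. That framing is an assumption, not something documented here, and `parseJsonLines` is a hypothetical helper.

```javascript
// Hypothetical helper: parse text containing one JSON object per line.
// Assumption: the `json` reporter separates objects with newlines.
function parseJsonLines(text) {
  return text
    .split('\n')
    .map(function (line) { return line.trim(); })
    .filter(function (line) { return line.length > 0; })
    .map(function (line) { return JSON.parse(line); });
}

// Possible usage from the package directory (see "Search the database"):
// var execFile = require('child_process').execFile;
// execFile('node', ['./bin/search-db.js', '--reporter', 'json', 'food'],
//   function (err, stdout) {
//     if (err) throw err;
//     console.log(parseJsonLines(stdout).length + ' packages');
//   });
```

If the objects turn out to be concatenated without newlines, a streaming JSON parser would be needed instead.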
### Advanced Search: Boolean Operators & Regular Expressions
Not implemented yet.
## MAINTENANCE/DEVELOPMENT
### Rebuilding the index with fresh data
The simplest way is to uninstall and reinstall the module:
```
npm u -g npm-kludge-search
npm i -g npm-kludge-search
```
### Manually rebuilding the index
```
cd $(dirname `which npm-kludge-search`)/../lib/node_modules/npm-kludge-search
rm -fr npm-all-cache.json npmdb.pft
./build-zip.sh
```
### Get a snapshot of the npm registry index
```
$ curl https://registry.npmjs.org/-/all > npm-all-cache.json
```
This is >60 MB, so it takes a while to download.
### Build a database
```
$ node ./bin/populate.js [--db <name> || npmdb.pft] [--from <source-json> || npm-all-cache.json]
```
This builds an uncompressed (directory) index; to build a one-file
(zip-compressed) database, see the `build-zip.sh` script.
### Search the database
```
$ node ./bin/search-db.js [--db <dbfile> || npmdb.pft] [--reporter <rep> || slow] [--name <name> || <term>]
```
If `--name` (or `-n`) is specified, only an exact name search is performed.
Otherwise, the search `term` is checked against the package name and substring-matched against the description.
### BUGS
Sometimes duplicate results are displayed. Believed fixed as of 2.5.0.
### TODO
(Some of these are features on `pure-fts`, but all are listed here for
simplicity.)
Support for passing module-name completion lists to `npm`. This will
require returning a range of values from `keys`.

Faster startup by only loading the indexes that are needed.

More tests, especially performance tests, so we can detect performance
regressions.

More tests of Unicode characters.

Pre-cook the `npmdb.fts` and distribute it via CDN.

Pure-fts is reasonably fast now, because all three main data tables
(keys, values, fts) are indexed, and only the index is loaded at
startup.

Search is currently case-sensitive; it should not be.

Regexp search is not supported.

Better documentation, more command-line options.

An author index. (This may not be needed, as searching for `=name`
works pretty well.)

When populating the index, use streaming JSON.stringify and compression.

Streaming output (currently all results are collected and analyzed to
choose correct column widths).

Other reporters. CSV? Others?

Maybe add more control over what is / is not removed from the `all`
file when indexing.

Test coverage is incorrectly reported as 100% because the `bin`-like
modules populateDb and searchDb are not included in coverage stats.
Exceptional conditions tend to be uncovered.