# indentmon

Detects changes in indentation over several lines using Pythonic rules.

Node API used to detect meaningful indent levels a la Python,
specifically by handling mixed tabs and spaces.
## Node API usage
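    const {
      indentmonitor,
      IndentError,
    } = require('indentmon');

    const indentlevel = indentmonitor();

    try {
      // `somearray` holds the lines to scan, one string per element.
      for (const line of somearray) {
        const [level, trimmed] = indentlevel(line);
      }
    } catch (e) {
      if (e instanceof IndentError) {
        console.error('Your indentation ruined everything');
        console.error(e);
      } else {
        // Anything else is not an indentation problem, so let it propagate.
        throw e;
      }
    }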
## Motivation
When writing my own tooling I keep finding new reasons to write DSLs.
Having a way to parse out meaningful indents aware of mixed tabs and
spaces lets me quickly establish scope.
## Indent rules
When `indentmon` scans each line, it follows these rules
(paraphrased from [Antti Haapala](https://stackoverflow.com/a/25471702/394397)):
* If both number of tabs and number of spaces matches the previous
line (no matter the order), then the indent level does not change.
* If the count of one of (tabs, spaces) is greater than on the
previous line and the count of the other is at least equal to that
on the previous line, this is an indented block.
* If the tuple (tabs, spaces) matches an indent from a previous block,
dedent to that block.
* Otherwise, raise `IndentError`.
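
The rules above can be sketched in a few lines of JavaScript. This is a minimal illustration, not `indentmon`'s actual source: `makeSketchMonitor`, `measure`, and `SketchIndentError` are hypothetical names, and the `[level, trimmed]` return shape is only assumed to mirror the usage example.

```javascript
// Hypothetical sketch of the indent rules; not indentmon's real implementation.
class SketchIndentError extends Error {}

// Count tabs and spaces in a line's leading whitespace, in any order.
function measure(line) {
  const lead = line.match(/^[\t ]*/)[0];
  const tabs = (lead.match(/\t/g) || []).length;
  return { tabs, spaces: lead.length - tabs, trimmed: line.slice(lead.length) };
}

function makeSketchMonitor() {
  // Stack of open blocks; the outermost block has no indentation.
  const stack = [{ tabs: 0, spaces: 0 }];

  return function next(line) {
    const { tabs, spaces, trimmed } = measure(line);
    const top = stack[stack.length - 1];

    if (tabs === top.tabs && spaces === top.spaces) {
      // Same counts as the previous line: the level does not change.
    } else if (tabs >= top.tabs && spaces >= top.spaces) {
      // One count grew and the other did not shrink: a new indented block.
      stack.push({ tabs, spaces });
    } else {
      // Otherwise the counts must match an enclosing block to dedent to.
      const i = stack.findIndex((b) => b.tabs === tabs && b.spaces === spaces);
      if (i === -1) {
        throw new SketchIndentError(`inconsistent indent: ${JSON.stringify(line)}`);
      }
      stack.length = i + 1;
    }
    return [stack.length - 1, trimmed];
  };
}
```

Feeding the lines `'a'`, `'\tb'`, `'\t  c'`, `'\td'`, `'e'` to one monitor in turn yields levels 0, 1, 2, 1, 0. A line indented with four spaces followed by a line indented with one tab throws, because neither count pair dominates the other and no enclosing block matches.
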
## Testing
While testing may be automated using the below command, you may also
use the Node.js REPL to interact with `indentmon` directly.

    $ npm run test MINLEN NTRIALS

The test suite generates `MINLEN*NTRIALS` indented code samples and
uses brute force to ensure they are all parsed correctly. It also
intentionally indents code incorrectly to make sure `indentmon`
raises exceptions.
* `MINLEN`: The minimum lines of code to create in each production.
* `NTRIALS`: The number of productions belonging to each family
to generate for testing.

Test reports have the format `FM#: PROD`, where `FM` is a two-letter
code for one of the production families below, `#` is the trial (the
instance of the production tested), and `PROD` is the indent DSL used
to generate code according to a given indent style and pattern.
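
If you want to post-process test reports, a report line can be split into its parts with a regular expression. The sketch below is only an illustration: `parseReport` is a hypothetical helper, not part of `indentmon`, and it assumes `#` is a decimal integer and that exactly one space follows the colon.

```javascript
// Hypothetical helper; the `FM#: PROD` shape comes from the text above,
// but treating `#` as a decimal integer is an assumption.
const REPORT = /^(CT|IN|ND|NM|AN|DO)(\d+): ([-<>0-9]+)$/;

function parseReport(line) {
  const match = REPORT.exec(line);
  if (match === null) return null; // not a report line
  const [, family, trial, production] = match;
  return { family, trial: Number(trial), production };
}
```

For example, `parseReport('IN2: ->>>')` returns `{ family: 'IN', trial: 2, production: '->>>' }`, and a line with an unknown family code returns `null`.
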
## Production families
### Constant (`CT`)
Indentation never changes.

    foo
    bar
    baz
    snafu

### Increasing (`IN`)
Indentation increases each line.

    foo
        bar
            baz
                snafu

### Nondecreasing (`ND`)
Indentation may or may not increase per line, but will never decrease.

    foo
        bar
        baz
            snafu
            fubar
                barbaz

### Nonmonotonic (`NM`)
Indentation grows, then shrinks.

    foo
        bar
            baz
                snafu
                    fubar
                snafu
            goofus
        gallant
        archangel of shamalama

### Anchored (`AN`)
Indentation always returns to column 0 at the end.

    foo
        bar
            baz
                snafu
                fubar
            barbaz
        goofus
        gallant
    archangel of shamalama

### Dropoff (`DO`)
Indentation grows sharply, and falls to a low indent level.

    foo
        bar
            baz
                snafu
                    fubar
                        barbaz
                            foo
                                bar
                                    baz
                                        snafu
                                            fubar
                                                barbaz
        gallant
        archangel of shamalama

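
Taken together, most of the family descriptions above can be read as predicates over a list of zero-based indent levels, one per line. The sketch below is one possible reading, not code from the test suite: `pairs` and `families` are hypothetical names, and `DO` is left out because "sharply" and "low" are not defined precisely here.

```javascript
// Hypothetical predicates over indent levels; one reading of the prose above.
// pairs([0, 1, 1]) gives [[0, 1], [1, 1]]: each level with its predecessor.
const pairs = (levels) => levels.slice(1).map((cur, i) => [levels[i], cur]);

const families = {
  // Constant: no line changes level.
  CT: (levels) => pairs(levels).every(([prev, cur]) => cur === prev),
  // Increasing: every line is deeper than the one before.
  IN: (levels) => pairs(levels).every(([prev, cur]) => cur > prev),
  // Nondecreasing: no line is shallower than the one before.
  ND: (levels) => pairs(levels).every(([prev, cur]) => cur >= prev),
  // Nonmonotonic: climbs to a peak, then falls, with real growth and shrinkage.
  NM: (levels) => {
    const max = Math.max(...levels);
    const peak = levels.indexOf(max);
    return levels[0] < max && levels[levels.length - 1] < max &&
      pairs(levels.slice(0, peak + 1)).every(([prev, cur]) => cur >= prev) &&
      pairs(levels.slice(peak)).every(([prev, cur]) => cur <= prev);
  },
  // Anchored: the last line is back at column 0.
  AN: (levels) => levels.length > 0 && levels[levels.length - 1] === 0,
};
```

Note that the families overlap under this reading: a constant sequence is also nondecreasing, and a sequence can be both nonmonotonic and anchored.
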
## Indent DSL
DSL productions use the following charset: `><-0123456789`
* `-`: Print line number, then move to next line.
* `>`: Indent one level, then '-' command.
* `<`: Dedent one level, then '-' command.
* `0-9`: Go to indicated indent level, then '-' command.

The following rules apply:

* Indent levels are zero-based.
* You cannot use `<` before the first `>`.
* There cannot be more `<`s than `>`s.
* Digits may exceed the current level by at most 1.

Examples ('+' indicates an indent, 'n' indicates a newline):

    ----        prints 1n2n3n4n
    ->>-        prints 1n+2n++3n++4n
    ><>         prints +1n2n+3n
    ->>>0><>>1  prints 1n+2n++3n+++4n5n+6n7n+8n++9n+10n
    ->>59       is invalid. At most 6 can follow 5.
    ->>95       is invalid. At most 3 can follow 2 (implied by >>).
    >>3         is valid. (Same as >>>)
    >>>1        is valid.

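
To make the DSL concrete, here is a small interpreter that expands a production into the `+`/`n` notation used in the examples. It is a sketch under stated assumptions, not the `render` function from `test.js`: `expandProduction` is a hypothetical name, and rejecting any dedent below level 0 is this sketch's simplification of the two `<` rules.

```javascript
// Hypothetical interpreter for the indent DSL; not the `render` from test.js.
function expandProduction(production) {
  let level = 0;      // current zero-based indent level
  let lineNumber = 0; // last line number printed
  let out = '';

  for (const ch of production) {
    if (ch === '>') {
      level += 1;
    } else if (ch === '<') {
      // Simplification: any dedent below level 0 is treated as invalid.
      if (level === 0) throw new RangeError('cannot dedent below level 0');
      level -= 1;
    } else if (ch >= '0' && ch <= '9') {
      const target = Number(ch);
      // A digit may exceed the current level by at most 1.
      if (target > level + 1) {
        throw new RangeError(`level ${target} cannot follow level ${level}`);
      }
      level = target;
    } else if (ch !== '-') {
      throw new RangeError(`unexpected character: ${ch}`);
    }
    // Every command ends with the '-' action: print the line number.
    lineNumber += 1;
    out += '+'.repeat(level) + lineNumber + 'n';
  }
  return out;
}
```

With this sketch, `expandProduction('->>-')` returns `'1n+2n++3n++4n'`, while `expandProduction('->>59')` and `expandProduction('->>95')` both throw a `RangeError`.
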
## Using in Node.js REPL
    $ node
    > .load ./test.js
    > render('>>--->><<--')
        1
            2
            3
            4
            5
                6
                    7
                8
            9
            10
            11

You can also generate productions from the families like so, where `L` is
an integer for the *minimum* number of lines you want to print. Removing
the call to `render` will show the indent DSL used.

    > render(createConstantProduction(L))
    > render(createIncreasingProduction(L))
    > render(createNonDecreasingProduction(L))
    > render(createAnchoredProduction(L))
    > render(createNonMonotonicProduction(L))
    > render(createDropOffProduction(L))
## TODO
Consider using a generator like so:

    const {
      indentmonitor,
      IndentError,
    } = require('indentmon');

    try {
      for (const [level, trimmed] of indentmonitor(someiterable)) {
        // Should user keep access to original line?
        // If so, is it worth storing two strings for each line?
      }
    } catch (e) {
      ...
    }

Also, distinguish between `TabError` and `IndentError`.