# TS Regex Builder
Build maintainable regular expressions for TypeScript and JavaScript.
[API docs](https://callstack.github.io/ts-regex-builder/api) | [Examples](https://callstack.github.io/ts-regex-builder/examples)
## Goal
Regular expressions are a powerful tool for matching text patterns, yet they are notorious for their hard-to-parse syntax, especially in the case of more complex patterns.
This library allows users to create regular expressions in a structured way, making them easier to write and review. It provides a domain-specific language for defining regular expressions, which are then compiled into JavaScript-native `RegExp` objects for fast execution.
```ts
import { buildRegExp, capture, choiceOf, endOfString, optional, repeat, startOfString } from 'ts-regex-builder';

// Regular JS RegExp
const hexColorRegExp = /^#?([a-fA-F0-9]{6}|[a-fA-F0-9]{3})$/;

// TS Regex Builder DSL
const hexDigit = /[a-fA-F0-9]/; // or: charClass(charRange('a', 'f'), charRange('A', 'F'), charRange('0', '9'));
const hexColor = buildRegExp([
  startOfString,
  optional('#'),
  capture(
    choiceOf(
      repeat(hexDigit, 6), // #rrggbb
      repeat(hexDigit, 3), // #rgb
    ),
  ),
  endOfString,
]);
```
## Installation
```sh
npm install ts-regex-builder
```
or
```sh
yarn add ts-regex-builder
```
## Basic usage
```js
import { buildRegExp, capture, oneOrMore, word } from 'ts-regex-builder';
// /Hello (\w+)/
const regex = buildRegExp(['Hello ', capture(oneOrMore(word))]);
```
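Since the result is described above as a JavaScript-native `RegExp`, the standard `RegExp` methods should apply to it. The following is a minimal sketch under that assumption; the variable names and sample strings are illustrative only.

```ts
import { buildRegExp, capture, oneOrMore, word } from 'ts-regex-builder';

// Equivalent to /Hello (\w+)/
const greeting = buildRegExp(['Hello ', capture(oneOrMore(word))]);

// exec() exposes the capture group at index 1.
const greetingMatch = greeting.exec('Hello World');
const greetedName = greetingMatch?.[1];

// test() reports whether the pattern occurs anywhere in the input.
const matchesGoodbye = greeting.test('Goodbye World');
```

Here `greetedName` holds the text captured by `capture(...)`, and `matchesGoodbye` is `false` because the literal `'Hello '` prefix is absent.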
## Regex domain-specific language
TS Regex Builder allows you to build complex regular expressions using a domain-specific language.
Terminology:
- regex construct (`RegexConstruct`) - common name for the DSL building blocks, such as character classes, quantifiers, and anchors.
- regex element (`RegexElement`) - a fundamental building block of a regular expression, defined as either a regex construct, a string, or `RegExp` literal (`/.../`).
- regex sequence (`RegexSequence`) - a sequence of regex elements forming a regular expression. For developer convenience, it also accepts a single element instead of an array.
Most of the regex constructs accept a regex sequence as their argument.
Examples of sequences:
- single element (construct): `capture('Hello')`
- single element (string): `'Hello'`
- single element (`RegExp` literal): `/Hello/`
- array of elements: `['USD', oneOrMore(digit), /Hello/]`
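The sequence forms listed above can be sketched as follows. This is an illustrative example rather than a definitive reference; the variable names and test inputs are made up for the purpose.

```ts
import { buildRegExp, digit, oneOrMore } from 'ts-regex-builder';

// A single string element is a valid sequence.
const currency = buildRegExp('USD');

// An array mixing a string, a construct, and a RegExp literal.
const price = buildRegExp(['USD', oneOrMore(digit), /[a-z]/]);

// Constructs accept sequences too: here an array of two elements.
const digitPairs = buildRegExp(oneOrMore([digit, digit]));

const currencyFound = currency.test('Total: USD');
const priceFound = price.test('USD42x');
const pairedDigits = digitPairs.exec('12345')?.[0];
```

In the last case the quantifier applies to the whole two-digit sequence, so only an even number of digits is consumed.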
Regex constructs can be composed into a tree structure:
```ts
import { buildRegExp, capture, charRange, choiceOf, digit, oneOrMore, optional, repeat } from 'ts-regex-builder';

const currencyCode = repeat(charRange('A', 'Z'), 3);
const currencyAmount = buildRegExp([
  choiceOf('$', '€', currencyCode), // currency
  capture(
    oneOrMore(digit), // integer part
    optional(['.', repeat(digit, 2)]), // fractional part
  ),
]);
```
See [Types API doc](https://callstack.github.io/ts-regex-builder/api/types) for more info.
### Regex Builders
| Builder | Regex Syntax | Description |
| ---------------------------------------- | ------------ | ----------------------------------- |
| `buildRegExp(...)` | `/.../` | Create `RegExp` instance |
| `buildRegExp(..., { ignoreCase: true })` | `/.../i` | Create `RegExp` instance with flags |
See [Builder API doc](https://callstack.github.io/ts-regex-builder/api/builder) for more info.
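As a small sketch of the flags argument from the table above, the example below builds the same pattern with and without `ignoreCase`. Only the `ignoreCase` flag shown in the table is used; the variable names are illustrative.

```ts
import { buildRegExp } from 'ts-regex-builder';

// Same pattern, built without flags and with the ignoreCase flag.
const caseSensitive = buildRegExp('hello');
const caseInsensitive = buildRegExp('hello', { ignoreCase: true });

const strictResult = caseSensitive.test('HELLO');
const relaxedResult = caseInsensitive.test('HELLO');

// The flag is visible on the resulting RegExp instance.
const appliedFlags = caseInsensitive.flags;
```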
### Regex Constructs
| Construct | Regex Syntax | Notes |
| ------------------- | ------------ | ------------------------------- |
| `choiceOf(x, y, z)` | `x\|y\|z` | Match one of provided sequences |
| `capture(...)` | `(...)` | Create a capture group |
See [Constructs API doc](https://callstack.github.io/ts-regex-builder/api/constructs) for more info.
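The sketch below illustrates `choiceOf` and `capture` together, and also shows a quantifier applied to a multi-character string, a case where the library is expected to add grouping on its own without creating a capture group. The file names and variable names are hypothetical.

```ts
import { buildRegExp, capture, choiceOf, endOfString, oneOrMore, startOfString } from 'ts-regex-builder';

// Capture one of several alternatives.
const imageExtension = buildRegExp(['.', capture(choiceOf('png', 'jpg', 'gif')), endOfString]);
const extension = imageExtension.exec('photo.jpg')?.[1];

// The quantifier applies to the whole string 'ab', not just to 'b'.
const repeatedAb = buildRegExp([startOfString, oneOrMore('ab'), endOfString]);
const matchesAbab = repeatedAb.test('ababab');
const matchesAbb = repeatedAb.test('abb');

// No capture group was requested, so the match array has only index 0.
const groupCount = (repeatedAb.exec('abab')?.length ?? 0) - 1;
```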
> [!NOTE]
> TS Regex Builder does not have a construct for non-capturing groups. Such groups are implicitly added when required.
### Quantifiers
| Quantifier                        | Regex Syntax | Description                                       |
| --------------------------------- | ------------ | ------------------------------------------------- |
| `zeroOrMore(x)`                   | `x*`         | Zero or more occurrences of a pattern             |
| `oneOrMore(x)`                    | `x+`         | One or more occurrences of a pattern              |
| `optional(x)`                     | `x?`         | Zero or one occurrence of a pattern               |
| `repeat(x, n)`                    | `x{n}`       | Pattern repeats exactly n times                   |
| `repeat(x, { min: n })`           | `x{n,}`      | Pattern repeats at least n times                  |
| `repeat(x, { min: n1, max: n2 })` | `x{n1,n2}`   | Pattern repeats between n1 and n2 times           |
See [Quantifiers API doc](https://callstack.github.io/ts-regex-builder/api/quantifiers) for more info.
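To make the quantifier table more concrete, here is a minimal sketch combining `repeat` and `optional`. The ZIP-code and PIN formats are illustrative assumptions chosen for the example, not part of the library.

```ts
import { buildRegExp, digit, endOfString, optional, repeat, startOfString } from 'ts-regex-builder';

// Five digits, optionally followed by a hyphen and four more digits.
const zipCode = buildRegExp([
  startOfString,
  repeat(digit, 5),
  optional(['-', repeat(digit, 4)]),
  endOfString,
]);

// Between four and six digits.
const pin = buildRegExp([startOfString, repeat(digit, { min: 4, max: 6 }), endOfString]);

const shortZipOk = zipCode.test('12345');
const longZipOk = zipCode.test('12345-6789');
const pinTooShort = pin.test('123');
const pinOk = pin.test('123456');
```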
### Assertions
| Assertion | Regex Syntax | Description |
| ------------------------- | ------------ | ------------------------------------------------------------------------ |
| `startOfString` | `^` | Match the start of the string (or the start of a line in multiline mode) |
| `endOfString` | `$` | Match the end of the string (or the end of a line in multiline mode) |
| `wordBoundary` | `\b` | Match the start or end of a word without consuming characters |
| `lookahead(...)` | `(?=...)` | Match subsequent text without consuming it |
| `negativeLookahead(...)` | `(?!...)` | Reject subsequent text without consuming it |
| `lookbehind(...)` | `(?<=...)` | Match preceding text without consuming it |
| `negativeLookbehind(...)` | `(?<!...)` | Reject preceding text without consuming it |
See [Assertions API doc](https://callstack.github.io/ts-regex-builder/api/assertions) for more info.
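The following sketch shows how assertions constrain a match without becoming part of it, using `lookahead` and `wordBoundary` from the table above. The sample inputs and variable names are made up for illustration.

```ts
import { buildRegExp, digit, lookahead, oneOrMore, wordBoundary } from 'ts-regex-builder';

// Digits that are immediately followed by 'px'; 'px' itself is not consumed.
const pixelValue = buildRegExp([oneOrMore(digit), lookahead('px')]);
const pixels = pixelValue.exec('width: 120px')?.[0];
const matchesEm = pixelValue.test('width: 120em');

// 'cat' as a whole word only.
const wholeWordCat = buildRegExp([wordBoundary, 'cat', wordBoundary]);
const inSentence = wholeWordCat.test('a cat sat');
const insideWord = wholeWordCat.test('concatenate');
```

The matched text in `pixels` contains only the digits, because the lookahead checks for `'px'` without adding it to the match.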
### Character classes
> [!TIP]
> You may also use inline regexes for specifying character classes, as they offer a concise yet readable syntax. For example, `/[a-z0-9_]/`.
| Character class | Regex Syntax | Description |
| --------------------- | ------------ | ------------------------------------------------- |
| `any` | `.` | Any character |
| `word` | `\w` | Word character: letter, digit, underscore |
| `digit` | `\d` | Digit character: 0 to 9 |
| `whitespace` | `\s` | Whitespace character: space, tab, line break, ... |
| `anyOf('abc')` | `[abc]` | Any of provided characters |
| `charRange('a', 'z')` | `[a-z]` | Character in a range |
| `charClass(...)` | `[...]` | Union of multiple character classes |
| `negated(...)` | `[^...]` | Negation of a given character class |
See [Character Classes API doc](https://callstack.github.io/ts-regex-builder/api/character-classes) and [Unicode API doc](https://callstack.github.io/ts-regex-builder/api/unicode) for more info.
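As a rough sketch of composing character classes, the example below combines ranges with `charClass` and negates a set with `negated`, using only the helpers listed in the table above. The variable names and test words are illustrative.

```ts
import {
  anyOf,
  buildRegExp,
  charClass,
  charRange,
  endOfString,
  negated,
  oneOrMore,
  startOfString,
} from 'ts-regex-builder';

// Union of three ranges, equivalent to [a-fA-F0-9].
const hexDigit = charClass(charRange('a', 'f'), charRange('A', 'F'), charRange('0', '9'));
const hexString = buildRegExp([startOfString, oneOrMore(hexDigit), endOfString]);

// Any character except the listed vowels, equivalent to [^aeiou].
const notVowel = negated(anyOf('aeiou'));
const vowelFree = buildRegExp([startOfString, oneOrMore(notVowel), endOfString]);

const isHex = hexString.test('c0ffee');
const isNotHex = hexString.test('coffee');
const hasNoVowels = vowelFree.test('rhythm');
```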
## Examples
See [Examples](https://callstack.github.io/ts-regex-builder/examples).
## Performance
Regular expressions created with this library are built at runtime, so you should avoid constructing them in code that runs repeatedly, e.g., inside loops or frequently called functions. We recommend that you create each required regex once as a top-level object and reuse it.
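A minimal sketch of this recommendation follows. The function names `containsNumber` and `containsNumberSlow` are hypothetical and exist only to contrast the two approaches.

```ts
import { buildRegExp, digit, oneOrMore } from 'ts-regex-builder';

// Preferred: built once, when the module is first loaded.
const numberRegex = buildRegExp(oneOrMore(digit));

// Reuses the module-level regex on every call.
function containsNumber(text: string): boolean {
  return numberRegex.test(text);
}

// Avoid: this rebuilds an identical regex on every call.
function containsNumberSlow(text: string): boolean {
  return buildRegExp(oneOrMore(digit)).test(text);
}

const inputs = ['order 66', 'no digits here', 'route 101'];
const withNumbers = inputs.filter(containsNumber);
```

Both functions return the same results; the difference is only in how often the regex is constructed.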
## Contributing
See the [contributing guide](CONTRIBUTING.md) to learn how to contribute to the repository and the development workflow.
See the [project guidelines](GUIDELINES.md) to understand our core principles.
## License
MIT
## Inspiration
TS Regex Builder is inspired by [Swift Regex Builder API](https://developer.apple.com/documentation/regexbuilder).
## Reference
- [ECMAScript Regular Expression BNF Grammar](https://tc39.es/ecma262/#sec-regular-expressions)
- [Unicode Regular Expressions](https://www.unicode.org/reports/tr18/)
- [Swift Evolution 351: Regex Builder DSL](https://github.com/apple/swift-evolution/blob/main/proposals/0351-regex-builder.md)
- [Swift Regex Builder API docs](https://developer.apple.com/documentation/regexbuilder)
---
Made with [create-react-native-library](https://github.com/callstack/react-native-builder-bob)