# Redpanda Connect Connector Documentation Automation

## Overview

This automation generates comprehensive reference documentation for Redpanda Connect connectors, including inputs, outputs, processors, buffers, caches, rate limiters, metrics, tracers, scanners, and optionally Bloblang functions/methods.

The automation handles **multi-release attribution**, automatically detecting and processing intermediate releases that may have been missed, ensuring that changes are accurately attributed to their actual release version rather than being lumped together.

## Goals

### Primary Goals

1. **Generate Comprehensive Reference Docs**: Create AsciiDoc documentation for all Redpanda Connect components with:
   - Field descriptions with types, defaults, and options
   - Working code examples (minimal and advanced configurations)
   - Cross-references to related components
   - Metadata (status badges: stable, beta, experimental, deprecated)

2. **Accurate Version Attribution**: Track when each component and field was introduced:
   - Detect releases between the last documented version and latest
   - Process each release pair sequentially
   - Generate per-version change tracking
   - Maintain historical accuracy even when releases are skipped

3. **Platform Support Detection**: Identify and document platform availability:
   - **Cloud-supported**: Available in Redpanda Cloud (both serverless and BYOC)
   - **Self-hosted only**: Available only in self-hosted deployments
   - **Cloud-only**: Exclusive to Redpanda Cloud
   - **Cgo-required**: Requires cgo-enabled builds

4. **Change Detection & Reporting**: Generate detailed change reports:
   - New connectors and fields
   - Removed/deprecated components
   - Changed default values
   - Breaking changes (removed fields)

## How It Works

### Architecture Overview

```text
┌─────────────────────────────────────────────────────────────────┐
│  1. Version Detection                                           │
│     - Read current version from antora.yml                      │
│     - Detect latest version from GitHub releases or rpk         │
│     - Discover all intermediate releases                        │
└──────────────────────┬──────────────────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────────────────┐
│  2. Sequential Release Processing (for each version pair)       │
│  ┌───────────────────────────────────────────────────────┐      │
│  │ For each pair (v[n] → v[n+1]):                        │      │
│  │   a. Fetch connector data for both versions           │      │
│  │   b. Run binary analysis (OSS, Cloud, cgo)            │      │
│  │   c. Generate version-specific diff                   │      │
│  │   d. Track changes with version attribution           │      │
│  └───────────────────────────────────────────────────────┘      │
└──────────────────────┬──────────────────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────────────────┐
│  3. Final Version Processing                                    │
│     - Generate AsciiDoc partials (fields & examples)            │
│     - Create full page drafts for new connectors (optional)     │
│     - Update navigation files                                   │
│     - Create master diff aggregating all changes                │
└──────────────────────┬──────────────────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────────────────┐
│  4. Output Generation                                           │
│     - Individual diffs: connect-diff-X.X.X_to_Y.Y.Y.json        │
│     - Master diff: connect-diff-master-X.X.X_to_Z.Z.Z.json      │
│     - PR summary with per-version attribution                   │
│     - Writer action items                                       │
└─────────────────────────────────────────────────────────────────┘
```

### Multi-Release Processing

**Scenario**: antora.yml has version `4.50.0`, latest release is `4.54.0`

**Without Multi-Release (OLD behavior)**:

```text
4.50.0 ─────────────────────────► 4.54.0
         (all changes lumped)
```

Result: All changes from 4.51.0, 4.52.0, 4.53.0, and 4.54.0 are attributed to 4.54.0

**With Multi-Release (NEW behavior)**:

```text
4.50.0 ──► 4.51.0 ──► 4.52.0 ──► 4.53.0 ──► 4.54.0
      diff1      diff2      diff3      diff4
  └──────────┴──────────┴──────────┴──────────┘
     master-diff.json (accurate attribution)
```

Result: Each change is attributed to its actual release version

### Version Detection Flow

1. **Determine Starting Version**:
   - Read `asciidoc.attributes.latest-connect-version` from `antora.yml`
   - OR use `--from-version` flag override
   - OR fallback to latest JSON file in `docs-data/`

2. **Determine Target Version**:
   - Use `--connect-version` flag (explicit version)
   - OR auto-detect latest stable release from GitHub
   - OR use local `rpk connect --version`

3. **Discover Intermediate Releases**:
   - Query GitHub Releases API: `repos/redpanda-data/connect/releases`
   - Filter to stable releases (exclude beta, RC, alpha)
   - Parse semver and find all versions between start and target
   - Sort chronologically

### Data Collection

For each version being processed:

1. **Connector Metadata** (via `rpk connect list`):

   ```json
   {
     "name": "kafka",
     "type": "inputs",
     "status": "stable",
     "description": "...",
     "summary": "...",
     "config": { /* full schema */ }
   }
   ```

2. **Binary Analysis** (download and inspect binaries):
   - **OSS binary**: Standard self-hosted build
   - **Cloud binary**: Redpanda Cloud serverless/BYOC build
   - **Cgo binary**: Build with cgo-enabled components

   Compares which connectors exist in each binary to determine:
   - `inCloud`: Present in both OSS and Cloud
   - `notInCloud`: Only in OSS (self-hosted only)
   - `cloudOnly`: Only in Cloud binary
   - `cgoOnly`: Only in cgo-enabled binary

3. **Metadata CSV** (optional, from GitHub):
   - Commercial names for connectors
   - Additional categorization info

### Change Detection

For each version pair `(oldVersion → newVersion)`:

```javascript
{
  "comparison": {
    "oldVersion": "4.50.0",
    "newVersion": "4.51.0",
    "timestamp": "2026-04-01T00:00:00.000Z"
  },
  "summary": {
    "newComponents": 3,        // New connectors
    "newFields": 15,           // New fields added to existing connectors
    "removedComponents": 0,
    "removedFields": 2,        // Breaking changes!
    "deprecatedComponents": 0,
    "deprecatedFields": 1,
    "changedDefaults": 0
  },
  "details": {
    "newComponents": [
      {
        "name": "postgres_cdc",
        "type": "inputs",
        "status": "beta",
        "version": "4.51.0",   // Attribution!
        "description": "..."
      }
    ],
    "newFields": [
      {
        "component": "inputs:kafka",
        "field": "rack_id",
        "description": "..."
      }
    ],
    // ... other change categories
  },
  "binaryAnalysis": {
    "ossVersion": "4.51.0",
    "cloudVersion": "4.52.0-rc1",
    "comparison": {
      "inCloud": [/*...*/],
      "notInCloud": [/*...*/],
      "cloudOnly": [/*...*/]
    },
    "cgoOnly": [/*...*/]
  }
}
```

## Output Specifications

### 1. AsciiDoc Documentation Files

#### Field Partials (`modules/components/partials/fields/{type}/{name}.adoc`)

```asciidoc
// This content is autogenerated. Do not edit manually.

== Fields

=== `field_name`

Description of the field with details.

*Type*: `string`

*Default*: `"default_value"`

*Options*: `option1`, `option2`, `option3`

[source,yaml]
----
# Example:
field_name: example_value
----

=== `another_field`
...
```

**Requirements**:
- All fields documented with type, default, options
- Code examples in YAML
- Cross-references using `xref:` syntax
- Conditional content for deprecated/experimental fields

#### Example Partials (`modules/components/partials/examples/{type}/{name}.adoc`)

```asciidoc
// This content is autogenerated. Do not edit manually.

== Examples

=== Minimal configuration

Basic setup with required fields only

[source,yaml]
----
input:
  kafka:
    addresses: ["localhost:9092"]
    topics: ["my_topic"]
----

=== Advanced configuration

Complete configuration with optional fields

[source,yaml]
----
input:
  kafka:
    addresses: ["localhost:9092"]
    topics: ["my_topic"]
    consumer_group: "my_group"
    checkpoint_limit: 1000
    # ... all fields
----
```

**Requirements**:
- Minimal example (required fields only)
- Advanced example (all meaningful fields)
- **Only output one example if they're identical** (no tabs needed)
- Use leading sentence: "Here's an example configuration:"
- Real-world, working configurations

#### Full Page Drafts (`modules/components/pages/{type}/{name}.adoc`)

Generated with `--draft-missing` flag for NEW connectors:

```asciidoc
= Connector Name
:type: input
:status: beta
:page-commercial-names: Commercial Name, Alternative Name

// tag::single-source[]

Brief summary of what this connector does.

== Common Use Cases

* Use case 1
* Use case 2

[tabs]
====
Common config::
+
include::connect:components:partial$examples/inputs/connector_name.adoc[tag=common]

Advanced config::
+
include::connect:components:partial$examples/inputs/connector_name.adoc[tag=advanced]
====

include::connect:components:partial$fields/inputs/connector_name.adoc[]

// end::single-source[]
```

**Requirements**:
- Frontmatter with metadata
- Single-source tags for reuse in cloud docs
- Tabs for common vs advanced configs (only if different)
- Platform indicators (☁️ for cloud, 🔧 for cgo)

### 2. Data Files

#### Connector Data JSON (`docs-data/connect-{version}.json`)

Complete connector metadata for a specific version:

```json
{
  "inputs": [
    {
      "name": "kafka",
      "status": "stable",
      "plugin": true,
      "description": "...",
      "summary": "...",
      "config": {
        "type": "object",
        "fields": [
          {
            "name": "addresses",
            "type": "array",
            "description": "...",
            "default": [],
            "kind": "scalar"
          }
        ]
      },
      "requiresCgo": false,
      "cloudOnly": false
    }
  ],
  "outputs": [...],
  "processors": [...],
  // ... other component types
}
```

**Retention**: Only the latest version is kept after processing completes.
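
The retention rule above can be sketched in a few lines of Node.js. This is a minimal illustration, not the tool's actual cleanup code: the function names `parseDataFileVersion`, `compareVersions`, and `planRetention` are hypothetical, and the only detail taken from this document is the `connect-{version}.json` file-name pattern. The sketch compares versions numerically, so `4.9.0` sorts before `4.50.0`, and it ignores diff files and anything else in `docs-data/`.

```javascript
// Hypothetical helper names; only the connect-{version}.json pattern
// comes from this document.
const DATA_FILE = /^connect-(\d+)\.(\d+)\.(\d+)\.json$/;

// Parse "connect-4.54.0.json" into [4, 54, 0]. Returns null for any other
// file, including diff files such as connect-diff-4.50.0_to_4.51.0.json.
function parseDataFileVersion(fileName) {
  const match = DATA_FILE.exec(fileName);
  return match ? match.slice(1).map(Number) : null;
}

// Compare two [major, minor, patch] arrays numerically.
function compareVersions(a, b) {
  for (let i = 0; i < 3; i += 1) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// Decide which connector data file to keep (the newest) and which older
// data files are candidates for removal. Nothing is deleted here.
function planRetention(fileNames) {
  const dataFiles = fileNames
    .map((name) => ({ name, version: parseDataFileVersion(name) }))
    .filter((entry) => entry.version !== null)
    .sort((x, y) => compareVersions(x.version, y.version));

  if (dataFiles.length === 0) return { keep: null, remove: [] };

  return {
    keep: dataFiles[dataFiles.length - 1].name,
    remove: dataFiles.slice(0, -1).map((entry) => entry.name),
  };
}
```

Returning a plan instead of deleting files directly keeps the decision easy to test; a caller could pass the result of `fs.readdirSync('docs-data')` and then remove the listed files itself.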

#### Individual Diff JSON (`docs-data/connect-diff-{v1}_to_{v2}.json`)

Changes between two consecutive versions:

```json
{
  "comparison": {
    "oldVersion": "4.50.0",
    "newVersion": "4.51.0",
    "timestamp": "2026-04-01T10:00:00.000Z"
  },
  "summary": {
    "newComponents": 3,
    "newFields": 15,
    "removedFields": 2,
    "deprecatedFields": 1
  },
  "details": {
    "newComponents": [...],
    "newFields": [...],
    "removedFields": [...],
    "deprecatedFields": [...],
    "changedDefaults": [...]
  },
  "binaryAnalysis": {
    "versions": {
      "oss": "4.51.0",
      "cloud": "4.52.0",
      "cgo": "4.52.0"
    },
    "comparison": {
      "inCloud": [...],
      "notInCloud": [...],
      "cloudOnly": [...]
    },
    "cgoOnly": [...],
    "details": {
      "cloudSupported": [...],
      "selfHostedOnly": [...],
      "cloudOnly": [...]
    }
  }
}
```

**Retention**: Kept for intermediate versions during processing, cleaned up after master diff is created.

#### Master Diff JSON (`docs-data/connect-diff-master-{v1}_to_{vN}.json`)

Aggregated changes across multiple releases:

```json
{
  "metadata": {
    "generatedAt": "2026-04-01T10:00:00.000Z",
    "startVersion": "4.50.0",
    "endVersion": "4.54.0",
    "processedReleases": 4
  },
  "totalSummary": {
    "versions": ["4.51.0", "4.52.0", "4.53.0", "4.54.0"],
    "releaseCount": 4,
    "newComponents": 12,
    "newFields": 45,
    "removedFields": 5,
    "deprecatedFields": 3
  },
  "releases": [
    {
      "fromVersion": "4.50.0",
      "toVersion": "4.51.0",
      "date": "2024-05-01T00:00:00.000Z",
      "summary": {...},
      "details": {...},
      "binaryAnalysis": {...}
    },
    {
      "fromVersion": "4.51.0",
      "toVersion": "4.52.0",
      // ...
    }
    // ... one entry per release
  ]
}
```

**Purpose**: Provides writers with accurate per-version attribution for changelog/release notes.

### 3. PR Summary

Automatically generated PR description with platform indicators and action items:

```markdown
## 📊 Redpanda Connect Documentation Update

**📦 Multi-Release Update:** 4.50.0 → 4.54.0
**Releases Processed:** 4
**Cloud Version:** 4.55.0

### Total Changes Across All Releases

- **12** new connectors
- **45** new fields across 4 release(s)
- **5** removed fields ⚠️
- **3** deprecated fields

### Changes Per Release

#### 🔖 Version 4.51.0

**New Connectors (3):**
- `postgres_cdc` (inputs, beta) ☁️
- `tigerbeetle_cdc` (inputs, beta) 🔧
- `mongodb_cdc` (inputs, stable) ☁️

**New Fields:** 12 added

**⚠️ Removed Fields:** 2

#### 🔖 Version 4.52.0

**New Connectors (5):**
- `oracledb_cdc` (inputs, experimental) ☁️
- `elasticsearch_v9` (outputs, stable)
- ...

**New Fields:** 18 added

#### 🔖 Version 4.53.0

_No changes in this release_

#### 🔖 Version 4.54.0

**New Connectors (4):**
...

### ✍️ Writer Action Items

**Document New Connectors:**
- [ ] Document new `postgres_cdc` inputs from **4.51.0** ☁️
- [ ] Document new `tigerbeetle_cdc` inputs from **4.51.0** 🔧
- [ ] Document new `mongodb_cdc` inputs from **4.51.0** ☁️
- [ ] Document new `oracledb_cdc` inputs from **4.52.0** ☁️
- [ ] Document new `a2a_message` processors from **4.54.0** ☁️

### ☁️ Cloud Docs Update Required

**12** new connectors are available in Redpanda Cloud.

**Action:** Submit a separate PR to cloud-docs repository.

**For connectors in pages:**
\```asciidoc
include::connect:components:page$type/name.adoc[tag=single-source]
\```

**For cloud-only connectors (in partials):**
\```asciidoc
include::connect:components:partial$components/cloud-only/type/name.adoc[tag=single-source]
\```

### 🔧 Cgo Requirements

The following new connectors require cgo-enabled builds:
- `tigerbeetle_cdc` (inputs)
- `zmq4` (inputs, outputs)
- `ffi` (processors)

[Cgo installation instructions included]

<details>
<summary><strong>📋 Detailed Changes</strong> (click to expand)</summary>

[Comprehensive breakdown of all changes]

</details>
```

**Indicators**:
- ☁️ = Cloud-supported (available in Redpanda Cloud)
- 🔧 = Requires cgo-enabled build
- ⚠️ = Breaking change (removed fields)

## CLI Usage

### Basic Usage

```bash
# Generate docs for latest version
npx doc-tools generate rpcn-connector-docs --fetch-connectors

# Generate docs for specific version
npx doc-tools generate rpcn-connector-docs \
  --fetch-connectors \
  --connect-version 4.54.0

# Process intermediate releases with custom starting version
npx doc-tools generate rpcn-connector-docs \
  --fetch-connectors \
  --from-version 4.50.0 \
  --connect-version 4.54.0
```

### CLI Flags

| Flag | Description | Default |
|------|-------------|---------|
| `--fetch-connectors` | Fetch fresh connector data using rpk | - |
| `--connect-version <version>` | Target Connect version to process | Auto-detect latest |
| `--from-version <version>` | Override starting version (instead of antora.yml) | Read from antora.yml |
| `--skip-intermediate` | Disable multi-release processing (legacy mode) | Multi-release enabled |
| `--cloud-version <version>` | Specific cloud binary version | Auto-detect latest |
| `--cgo-version <version>` | Specific cgo binary version | Same as cloud |
| `--draft-missing` | Generate full page drafts for new connectors | false |
| `--update-whats-new` | Update whats-new.adoc with changes | false |
| `--include-bloblang` | Include Bloblang functions/methods | false |
| `--overrides <path>` | JSON file with description overrides | `docs-data/overrides.json` |

### Examples

#### Catchup After Missed Releases

```bash
# antora.yml has 4.50.0, but latest is 4.54.0
# This will process all 4 intermediate releases
npx doc-tools generate rpcn-connector-docs \
  --fetch-connectors \
  --draft-missing \
  --update-whats-new
```

**Output**:
- Processes: 4.50.0→4.51.0, 4.51.0→4.52.0, 4.52.0→4.53.0, 4.53.0→4.54.0
- Creates: 4 individual diffs + 1 master diff
- Generates: Partials, drafts, PR summary with per-version attribution

#### Legacy Single-Version Mode

```bash
# Disable multi-release processing (old behavior)
npx doc-tools generate rpcn-connector-docs \
  --fetch-connectors \
  --skip-intermediate
```

#### Custom Version Range

```bash
# Process releases between specific versions
npx doc-tools generate rpcn-connector-docs \
  --fetch-connectors \
  --from-version 4.52.0 \
  --connect-version 4.54.0
```

## File Structure

```
docs-extensions-and-macros/
├── tools/redpanda-connect/
│   ├── rpcn-connector-docs-handler.js         # Main orchestration
│   ├── generate-rpcn-connector-docs.js        # Doc generation logic
│   ├── report-delta.js                        # Diff generation
│   ├── pr-summary-formatter.js                # PR summary formatting
│   ├── github-release-utils.js                # GitHub API integration
│   ├── multi-version-summary.js               # Master diff aggregation
│   ├── connector-binary-analyzer.js           # Binary download & analysis
│   └── update-whats-new.js                    # Release notes updates
├── docs-data/                                 # Generated data (gitignored)
│   ├── connect-{version}.json                 # Connector metadata (latest only)
│   ├── connect-diff-{v1}_to_{v2}.json         # Individual diffs (intermediate)
│   └── connect-diff-master-{v1}_to_{vN}.json  # Master diff (kept)
└── modules/components/
    ├── pages/{type}/{name}.adoc               # Full pages (manually created or drafted)
    └── partials/
        ├── fields/{type}/{name}.adoc          # Auto-generated field docs
        └── examples/{type}/{name}.adoc        # Auto-generated examples
```

## Testing

### Unit Tests

```bash
# Run all tests
npm test

# Test specific modules
npm test -- __tests__/tools/github-release-utils.test.js
npm test -- __tests__/tools/pr-summary-formatter.test.js
```

**Coverage**:
- Version parsing and semver comparisons
- Prerelease filtering (beta/RC/alpha)
- Intermediate release discovery
- Platform detection (cloud vs self-hosted)
- PR summary formatting (single and multi-version)
- Diff generation and change detection

### Integration Testing

```bash
# Create test environment
mkdir -p /tmp/test-automation/{docs-data,modules/components/pages}

# Create mock antora.yml
echo 'name: test
version: main
asciidoc:
  attributes:
    latest-connect-version: "4.50.0"' > /tmp/test-automation/antora.yml

# Run automation
cd /tmp/test-automation
npx doc-tools generate rpcn-connector-docs \
  --from-version 4.50.0 \
  --connect-version 4.54.0 \
  --fetch-connectors
```

**Verify**:
- Multiple diffs created (one per release pair)
- Master diff with accurate attribution
- AsciiDoc partials generated
- antora.yml updated to latest version
- Only latest JSON retained

## Dependencies

### Runtime Dependencies

- `@octokit/rest` - GitHub API client for release discovery
- `semver` - Semantic version parsing and comparison
- `handlebars` - Template engine for doc generation
- `js-yaml` - YAML parsing for antora.yml

### External Tools

- `rpk` - Redpanda CLI for fetching connector metadata
- `git` - For cloning repositories to fetch binary versions

## Error Handling

### GitHub API Rate Limiting

- **Without token**: 60 requests/hour
- **With token**: 5,000 requests/hour
- **Handling**: Cache responses, graceful degradation to single-version mode

### Missing Intermediate Data

- Automatically fetches from GitHub releases
- Downloads binaries for specified versions
- Falls back to rpk if available locally

### Network Failures

- Retries with exponential backoff
- Continues processing with partial data
- Logs warnings for manual review

## Edge Cases

### No Intermediate Releases

- Behaves like legacy single-version mode
- No master diff created
- Standard PR summary generated

### Beta/RC Versions

- Automatically filtered out
- Only stable GA releases processed
- Explicit override with `--include-prerelease` (not yet implemented)

### Identical Consecutive Versions

- Skips diff generation
- Logs "No changes detected"
- Updates metadata only

### Binary Unavailability

- Continues without binary analysis
- Platform indicators omitted from output
- Warning logged for manual verification

## Future Enhancements

1. **Bloblang Full Support**: Currently optional, could be made default
2. **Automated PR Creation**: Auto-submit PRs with generated content
3. **CI/CD Integration**: GitHub Actions workflow for weekly runs
4. **Historical Backfill**: Process all historical releases for complete attribution
5. **Diff Visualization**: Web UI to browse changes across versions
6. **Custom Templates**: User-provided Handlebars templates for docs

## Maintenance

### Updating Templates

Templates are in `tools/redpanda-connect/templates/`:
- `connector-fields.hbs` - Field documentation template
- `connector-examples.hbs` - Examples template
- `connector-full.hbs` - Full page draft template

### Updating Overrides

Override file: `docs-data/overrides.json` (or `--overrides` flag)

```json
{
  "inputs": {
    "kafka": {
      "fields": {
        "addresses": {
          "description": "Custom description override",
          "examples": ["localhost:9092"]
        }
      }
    }
  }
}
```

Supports `$ref` syntax for deduplication:

```json
{
  "inputs": {
    "kafka": {
      "fields": {
        "tls": {
          "$ref": "#/common/tls"
        }
      }
    }
  },
  "common": {
    "tls": {
      "description": "TLS configuration (reused across components)"
    }
  }
}
```

## Troubleshooting

### "No releases found in the specified range"

- **Cause**: Invalid version range or no releases exist between versions
- **Fix**: Verify versions exist on GitHub, check semver format

### "GitHub API rate limit exceeded"

- **Cause**: Too many API requests without authentication
- **Fix**: Set `GITHUB_TOKEN` environment variable

### "Binary analysis failed"

- **Cause**: Unable to download binaries (network, permissions, etc.)
- **Fix**: Check network, ensure write permissions to temp directories

### "Versions match, skipping diff"

- **Cause**: Already at target version, no work needed
- **Fix**: This is expected behavior, no action needed

## Contact & Support

For issues or questions:
- **GitHub Issues**: [docs-extensions-and-macros repository](https://github.com/redpanda-data/docs-extensions-and-macros/issues)
- **Slack**: #docs channel
- **Docs**: Internal Confluence documentation

---

**Last Updated**: 2026-04-01
**Version**: 1.0.0
**Maintainers**: Redpanda Docs Team