UNPKG

@octofhir/ucum-wasm

Version:

UCUM (Unified Code for Units of Measure) library for FHIRPath quantity operations with CLI, WASM, and FHIR integration

645 lines (478 loc) โ€ข 22 kB
# UCUM-RS [![CI](https://github.com/octofhir/ucum-rs/workflows/CI/badge.svg)](https://github.com/octofhir/ucum-rs/actions/workflows/ci.yml) [![Crates.io](https://img.shields.io/crates/v/octofhir-ucum.svg)](https://crates.io/crates/octofhir-ucum) [![npm](https://img.shields.io/npm/v/@octofhir/ucum-wasm.svg)](https://www.npmjs.com/package/@octofhir/ucum-wasm) [![Docs.rs](https://docs.rs/octofhir-ucum/badge.svg)](https://docs.rs/octofhir-ucum) High-performance Unified Code for Units of Measure (UCUM) implementation in Rust 2024 edition. ## Quick Start ```sh # Add to your project cargo add octofhir-ucum # Or use the CLI cargo install octofhir-ucum # Example: Convert 100 kPa to mm[Hg] octofhir-ucum convert --value 100 --from kPa --to mm[Hg] ``` ## Features ### ๐Ÿš€ Enhanced API (ADR-001 + Phase 3 Implementation) | Feature | Status | Notes | |------------------------|----------|----------------------------------------| | **Comprehensive Validation** | โœ… | `validate()` with detailed error reporting | | **Unit Analysis** | โœ… | `analyse()` with dimensions, factors, properties | | **Unit Arithmetic** | โœ… | `unit_multiply()`, `unit_divide()` operations | | **Advanced Search** | โœ… | Text, property, fuzzy, and regex search | | **Property Validation** | โœ… | `validate_in_property()` for physical quantities | | **Unit Compatibility** | โœ… | `is_comparable()` for commensurability checking | | **Canonical Forms** | โœ… | `get_canonical_units()` for normalization | | **Special Unit System** | โœ… | Extensible handlers for temperature, logarithmic units | | **Precision Arithmetic** | โœ… | Optional `rust_decimal` support for high precision | | **๐Ÿ†• Model Introspection** | โœ… | `get_model()`, `validate_ucum()`, `get_properties()` | | **๐Ÿ†• Enhanced Display Names** | โœ… | `get_common_display()` with prefixed unit support | | **๐Ÿ†• Advanced Conversion** | โœ… | `convert_with_context()` with precision control | ### ๐Ÿ”ง Core Capabilities | Feature | Status | Notes | |------------------------|----------|----------------------------------------| | SI base/derived units | โœ… | Full support with 7-dimensional vectors | | Customary units | โœ… | Imperial, US customary, etc. | | Specialized units | โœ… | Medical, laboratory, information units | | Prefix handling | โœ… | e.g., kPa, mL, ยตg with precision support | | Expression parsing | โœ… | **Zero-copy architecture** with robust error messages | | Unit conversion | โœ… | Handles factors, offsets, temperature | | Temperature support | โœ… | Celsius, Fahrenheit, Rankine with offsets | | **Performance Optimization** | โœ… | **Phase 1 Complete** - 40% parsing improvement | ### ๐Ÿ› ๏ธ Tools & Integration | Feature | Status | Notes | |------------------------|----------|----------------------------------------| | CLI tool | โœ… | `octofhir-ucum` binary | | WASM support | โœ… | npm package: `@octofhir/ucum-wasm` | | Interactive playground | โœ… | Svelte 5 web application | | FHIR integration | โœ… | FHIR Quantity data type support | | Property-based tests | โœ… | `proptest` | | Fuzzing | โœ… | `cargo-fuzz` targets for parser/eval | ### ๐Ÿ“Š Test Conformance (98.6% Overall) | Test Category | Status | Results | |------------------------|----------|----------------------------------------| | **Overall Conformance** | โœ… | **98.6%** (1120/1136 tests passing) | | Validation tests | โœ… | **99.5%** (1048/1053) | | Conversion tests | โš ๏ธ | **83.1%** (49/59) - acceptable precision differences | | Division tests | โœ… | **100%** (3/3) - precision arithmetic fixed | | Multiplication tests | โœ… | **100%** (4/4) | | Display name tests | โœ… | **94.1%** (16/17) | ### โšก Performance **Current Performance (v0.5.0 with Unified Optimized Parser):** - **Simple parsing**: ~7,900,000 ops/second (~126 ns per operation) **[+2700% improvement]** - **Prefixed units**: ~6,800,000 ops/second (~147 ns per operation) **[+2300% improvement]** - **Unicode handling**: ~6,200,000 ops/second (~161 ns per operation) **[+2100% improvement]** - **Complex expressions**: ~1,560,000 ops/second (~640 ns per operation) **[+450% improvement]** - **Evaluation**: ~1,390,000 ops/second (~718 ns per operation) - **Analysis**: ~606,000 ops/second (~1.65 ยตs per operation) **Performance Optimizations Implemented:** ### High-Performance Parser Architecture - โœ… **Zero-copy string parsing** - Avoids unnecessary string allocations during parsing - โœ… **Lazy Unicode normalization** - Only normalizes ยต characters when detected - โœ… **Fast pattern validation** - Single-pass scanning with optimized character handling - โœ… **Dual AST architecture** - `UnitExpr<'a>` (zero-copy) and `OwnedUnitExpr` (owned) - โœ… **Enhanced prefix lookup** - O(1) HashMap-based prefix resolution ### Parser Features (`parser.rs`) - โœ… **ASCII lookup tables** - Fast character classification with compile-time tables - โœ… **SIMD-ready validation** - Infrastructure for x86_64 SSE2 acceleration - โœ… **Perfect hash maps** - Compile-time perfect hashing for common units (time units) - โœ… **Small vector optimization** - Most UCUM expressions have โ‰ค4 factors - โœ… **Single-pass tokenization** - Efficient tokenizer with minimal backtracking - โœ… **UTF-8 micro sign handling** - Proper handling of ยต (0xC2 0xB5) sequences **Benchmarking Infrastructure:** Comprehensive benchmarks track performance across multiple dimensions: - **Complexity categories**: Simple units, prefixed units, complex expressions, edge cases - **Parser features**: Unicode handling, annotations, ten-power notation, leading division - **Real-world usage**: Medical dosing, engineering calculations, batch processing - **Memory patterns**: Zero-copy vs owned allocations, pathological cases - **Allocation tracking**: Measures memory allocation patterns for different expression types **Technical Implementation Notes:** The unified parser implementation achieves exceptional performance: - `parser.rs` - Single high-performance parser with advanced optimizations - **26x faster** parsing for simple units compared to baseline - **23x faster** for prefixed units with full validation - All optimizations maintain 100% compatibility with UCUM specification - Passes all 117 tests including official conformance tests (98.6% overall) ## WASM Package The UCUM library is available as a WebAssembly package for use in JavaScript/TypeScript applications. ### Installation ```sh # Using npm npm install @octofhir/ucum-wasm # Using yarn yarn add @octofhir/ucum-wasm # Using pnpm pnpm add @octofhir/ucum-wasm ``` ### Usage ```typescript import { start, validate, get_unit_info, convert, evaluate_expression, arithmetic, // Phase 3 functions get_ucum_model, get_unit_display_name, convert_advanced_simple } from '@octofhir/ucum-wasm'; // Initialize the WASM module start(); // Validate a UCUM expression const isValid = validate('mg/dL'); // true // Get information about a unit const unitInfo = get_unit_info('mg'); console.log(unitInfo.factor); // 0.000001 console.log(unitInfo.dimensions); // [1, 0, 0, 0, 0, 0, 0] // Convert between units const result = convert(100, 'kPa', 'mm[Hg]'); // 750.06... // Evaluate a UCUM expression const evalResult = evaluate_expression('mg/dL'); console.log(evalResult.factor); // 0.00001 // Perform arithmetic operations const arithResult = arithmetic('mg', 'mul', 'mL', 1); console.log(arithResult.dimensions); // [1, 3, 0, 0, 0, 0, 0] // Phase 3: Model introspection const model = get_ucum_model(); console.log(model.version); // '2.1' console.log(model.total_units); // 312 // Phase 3: Enhanced display names console.log(get_unit_display_name('kg')); // 'kilogram' console.log(get_unit_display_name('m/s')); // '(meter) / (second)' // Phase 3: Advanced conversion with precision const advResult = convert_advanced_simple(1000, 'g', 'kg', 3); console.log(advResult.value); // 1.000 console.log(advResult.precision_info); // '3 decimal places' ``` ## Phase 3 API Completeness Phase 3 introduces comprehensive model introspection and advanced conversion capabilities to enhance the UCUM implementation. ### Model Introspection ```rust use octofhir_ucum_core::{get_model, validate_ucum, get_properties, get_common_display}; // Get model information let model = get_model(); println!("UCUM Version: {}", model.version); // "2.1" println!("Total Units: {}", model.units.len()); // 312 println!("Total Prefixes: {}", model.prefixes.len()); // 24 // Validate implementation self-consistency let issues = validate_ucum(); if issues.is_empty() { println!("UCUM implementation is valid"); } else { println!("Issues found: {:?}", issues); } // Get all available properties let properties = get_properties(); println!("Available properties: {}", properties.len()); // 101 // Enhanced display names (handles prefixed units) println!("{}", get_common_display("kg")); // "kilogram" println!("{}", get_common_display("cm")); // "centimeter" println!("{}", get_common_display("m/s")); // "(meter) / (second)" ``` ### Advanced Conversion with Precision Control ```rust use octofhir_ucum_core::{ convert_with_context, AdvancedConversionContext, DecimalPrecision, RoundingMode, TemperatureScale }; // Create conversion context with precise control let context = AdvancedConversionContext { precision: DecimalPrecision::Fixed(3), rounding: RoundingMode::Nearest, temperature_scale: TemperatureScale::Celsius, use_special_units: true, }; // Convert with advanced precision let result = convert_with_context(1000.0, "g", "kg", &context)?; println!("Value: {}", result.value); // 1.000 println!("Precision: {}", result.precision_info); // "3 decimal places" println!("Used special units: {}", result.used_special_units); // false // Temperature conversion with special handling let temp_result = convert_with_context(100.0, "Cel", "K", &context)?; println!("Value: {}", temp_result.value); // 373.150 println!("Used special units: {}", temp_result.used_special_units); // true ``` ### CLI Integration All Phase 3 features are available through the CLI: ```sh # Model introspection octofhir-ucum model octofhir-ucum self-validate octofhir-ucum properties --limit 10 # Enhanced display names octofhir-ucum display kg # kilogram octofhir-ucum display "m/s" # (meter) / (second) # Advanced conversion with precision octofhir-ucum convert-advanced --value 1000 --from g --to kg --precision 3 octofhir-ucum convert-advanced --value 100 --from Cel --to K --precision 2 ``` ### WASM Integration Phase 3 functions are fully exposed in the WASM package: ```javascript // Model introspection const model = get_ucum_model(); const validation = validate_ucum_implementation(); const properties = get_ucum_properties(); // Enhanced display names const displayName = get_unit_display_name('kg'); // Advanced conversion const result = convert_advanced_simple(1000, 'g', 'kg', 3); const advancedResult = convert_advanced(100, 'Cel', 'K', { precision_type: 'fixed', precision_value: 2, rounding_mode: 'nearest', temperature_scale: 'celsius', use_special_units: true }); ``` ## Interactive Playground An interactive web-based playground is available to explore the UCUM library's capabilities. ### Features - **Validation**: Validate UCUM expressions - **Unit Information**: Get detailed information about units - **Conversion**: Convert values between compatible units - **Arithmetic**: Perform arithmetic operations on units - **Phase 3 Capabilities**: Model introspection, enhanced display names, and advanced conversion with precision control ### Running Locally ```sh # Navigate to the playground directory cd playground # Install dependencies pnpm install # Start the development server (use npm due to pnpm script execution issues) npm run dev ``` The playground will be available at http://localhost:6000. ## FHIR Integration The UCUM library provides integration with FHIR (Fast Healthcare Interoperability Resources) through the `fhir` feature. ### Installation ```sh # Add to your project cargo add octofhir-ucum --features fhir ``` ### Features - **FHIR Quantity**: FHIR Quantity data type implementation - **Conversion**: Convert between FHIR Quantity and UCUM Quantity - **Unit Conversion**: Convert FHIR Quantities between different units - **Equivalence**: Check if two FHIR Quantities are equivalent - **Error Handling**: Comprehensive error handling for invalid inputs ### Usage ```rust use octofhir_ucum_fhir::{FhirQuantity, convert_quantity, are_equivalent}; // Create a FHIR Quantity with a UCUM code let quantity = FhirQuantity::with_ucum_code(1000.0, "mg"); // Convert to a different unit let converted = convert_quantity(&quantity, "g").unwrap(); assert_eq!(converted.value, 1.0); assert_eq!(converted.code, Some("g".to_string())); // Check if two quantities are equivalent let quantity2 = FhirQuantity::with_ucum_code(1.0, "g"); assert!(are_equivalent(&quantity2, &converted).unwrap()); ``` ## Fuzzing The UCUM library includes fuzzing infrastructure to identify potential bugs and edge cases using `cargo-fuzz`. ### Setup ```sh # Install cargo-fuzz cargo install cargo-fuzz ``` ### Fuzzing Targets - **Parser Fuzzer**: Tests the `parse_expression` function with arbitrary input strings - **Evaluator Fuzzer**: Tests the `evaluate` function with valid UCUM expressions ### Running the Fuzzers ```sh # Run the parser fuzzer cargo fuzz run -p octofhir-ucum-fuzz fuzz_parser # Run the evaluator fuzzer cargo fuzz run -p octofhir-ucum-fuzz fuzz_evaluator ``` ### Continuous Fuzzing For continuous fuzzing, you can set up a CI job that runs the fuzzers for a fixed amount of time: ```sh # Run the parser fuzzer for 5 minutes cargo fuzz run -p octofhir-ucum-fuzz fuzz_parser -- -max_total_time=300 ``` For more details, see the [ucum-fuzz README](ucum-fuzz/README.md). ## Official Test Validation The UCUM library includes validation against the official UCUM test cases from the FHIR/Ucum-java repository to ensure compliance with the UCUM specification. ### Test Coverage Our implementation achieves **91.4% conformance** to the official UCUM functional test suite: - **Total Tests:** 1,068 official UCUM test cases - **Passed:** 976 tests - **Failed:** 92 tests - **Success Rate:** 91.4% ### Running Official Tests ```sh # Run all official validation tests cargo test official_tests # Run with detailed output to see individual test results cargo test run_official_validation_tests -- --nocapture # Run tests from the second official test file cargo test run_official_validation_tests_2 -- --nocapture ``` ## Contributing Guide We welcome contributions to the UCUM-RS project! This guide will help you get started with development and ensure your contributions align with the project's standards. ### Getting Started 1. **Fork and clone the repository:** ```sh git clone https://github.com/YOUR_USERNAME/ucum-rs.git cd ucum-rs ``` 2. **Install dependencies:** - Rust 1.70+ (edition 2021) - `wasm-pack` for WebAssembly builds - `pnpm` for playground development 3. **Build the project:** ```sh # Build all workspace crates cargo build --all # Build with specific features cargo build --features cli cargo build --features wasm cargo build --features fhir ``` ### Development Workflow #### Testing ```sh # Run all tests (recommended before submitting PR) cargo test --all # Run tests with output for debugging cargo test --all -- --nocapture # Run specific test suites cargo test official_tests # Official UCUM conformance tests cargo test test_micro_normalization # UTF-8 handling tests # Run benchmarks cargo bench ``` #### Code Quality ```sh # Format code (required before commit) cargo fmt --all # Check formatting without changes cargo fmt --all -- --check # Run linter with strict warnings cargo clippy --all -- -D warnings # Pre-publish validation (runs all quality checks) just publish-prep # or cargo fmt && cargo clippy --all -- -D warnings && cargo test --all ``` #### Documentation ```sh # Generate and open documentation cargo doc --open --no-deps --all # Validate documentation examples cargo test --doc ``` ### Development Areas #### Core Parser (`src/parser.rs`) - **High-Performance Parser**: Advanced implementation with zero-copy optimizations - Handles all UCUM edge cases with comprehensive error reporting - Must maintain compatibility with UCUM specification **Adding new parser features:** 1. Implement feature in `parser.rs` maintaining performance optimizations 2. Add comprehensive tests covering edge cases 3. Ensure backward compatibility with existing API 4. Validate against official UCUM conformance tests #### AST and Evaluation (`src/ast.rs`, `src/evaluator.rs`) - Follow zero-copy patterns where possible - Use `UnitExpr<'a>` for borrowed data, `OwnedUnitExpr` for owned data - Maintain dimensional analysis consistency #### Registry and Build System (`src/registry.rs`, `build.rs`) - Registry is generated at compile-time from `ucum-essence.xml` - Changes to build system must maintain WASM compatibility - Avoid thread-local storage for cross-platform support ### Feature Development #### Adding New Features 1. **Create issue** describing the feature and use case 2. **Write tests first** - we follow TDD principles 3. **Implement feature** maintaining backward compatibility 4. **Update documentation** including code examples 5. **Add CLI support** if user-facing (optional) 6. **Add WASM bindings** if relevant (optional) #### Performance Optimizations 1. **Benchmark first** - establish baseline performance 2. **Profile bottlenecks** using `cargo bench` and `perf` 3. **Implement optimizations** in `parser_optimized.rs` if parser-related 4. **Validate correctness** - all tests must still pass 5. **Document performance gains** with before/after metrics ### Testing Guidelines #### Test Categories 1. **Unit tests** - Test individual functions and components 2. **Integration tests** - Test complete parsing and evaluation flows 3. **Official conformance tests** - UCUM specification compliance (98.6% pass rate) 4. **Property-based tests** - Using `proptest` for edge case discovery 5. **Fuzzing tests** - Located in `ucum-fuzz/` directory #### Writing Tests ```rust #[test] fn test_new_feature() { // Test successful case let result = parse_expression("your_expression").unwrap(); assert_eq!(result, expected_ast); // Test error cases assert!(parse_expression("invalid_expression").is_err()); // Test edge cases assert_eq!(parse_expression(""), Ok(UnitExpr::Numeric(1.0))); } ``` ### Code Style - **Formatting**: Use `cargo fmt` (rustfmt) for consistent formatting - **Linting**: Address all `clippy` warnings with `cargo clippy --all -- -D warnings` - **Documentation**: Document all public APIs with examples - **Error handling**: Use descriptive error messages with context - **Performance**: Prefer zero-copy patterns, avoid unnecessary allocations ### Submitting Changes 1. **Create feature branch** from `main` 2. **Write tests** covering your changes 3. **Run quality checks**: `just publish-prep` or equivalent commands 4. **Update documentation** if adding public APIs 5. **Submit pull request** with clear description of changes 6. **Address review feedback** promptly #### Pull Request Checklist - [ ] All tests pass (`cargo test --all`) - [ ] Code is formatted (`cargo fmt --all -- --check`) - [ ] No clippy warnings (`cargo clippy --all -- -D warnings`) - [ ] Documentation updated for public APIs - [ ] CHANGELOG.md updated if applicable - [ ] Backward compatibility maintained ### Project-Specific Guidelines #### Unicode Handling - Always handle ยต (micro sign) properly in both parsers - Use UTF-8 byte sequences (0xC2 0xB5) for micro sign detection - Test with both ASCII 'u' and Unicode 'ยต' variants #### WASM Compatibility - Avoid thread-local storage (`thread_local!`) - Test WASM builds: `wasm-pack build --target web --features wasm` - Ensure `no_std` compatibility where possible #### Error Messages - Provide precise error locations with spans - Include suggestions for common mistakes - Test error message clarity with real users ### Getting Help - **Documentation**: Check `CLAUDE.md` for development commands - **Issues**: Search existing issues before creating new ones - **Discussions**: Use GitHub Discussions for questions - **Code Review**: All changes require review before merging ### Playground Development ```sh cd playground pnpm install # Use npm for development due to pnpm script execution issues npm run dev # Runs on http://localhost:6000 ``` The playground provides a real-time testing environment for UCUM expressions and helps validate user-facing functionality. ## Project Structure - `src/` โ€“ Core library (parsing, evaluation, registry) - `src/bin/cli.rs` โ€“ Command-line interface - `src/wasm.rs` โ€“ WebAssembly bindings for JavaScript/TypeScript (@octofhir/ucum-wasm) - `src/fhir.rs` โ€“ FHIR integration (FHIR Quantity data type support) - `ucum-fuzz/` โ€“ Fuzzing infrastructure (cargo-fuzz targets) - `playground/` โ€“ Interactive web-based playground (Svelte 5) - `spec/` โ€“ UCUM specification assets ## License Apache-2.0