@octofhir/ucum-wasm
Version:
UCUM (Unified Code for Units of Measure) library for FHIRPath quantity operations with CLI, WASM, and FHIR integration
645 lines (478 loc) โข 22 kB
Markdown
# UCUM-RS
[](https://github.com/octofhir/ucum-rs/actions/workflows/ci.yml)
[](https://crates.io/crates/octofhir-ucum)
[](https://www.npmjs.com/package/@octofhir/ucum-wasm)
[](https://docs.rs/octofhir-ucum)
High-performance Unified Code for Units of Measure (UCUM) implementation in Rust 2024 edition.
## Quick Start
```sh
# Add to your project
cargo add octofhir-ucum
# Or use the CLI
cargo install octofhir-ucum
# Example: Convert 100 kPa to mm[Hg]
octofhir-ucum convert --value 100 --from kPa --to mm[Hg]
```
## Features
### ๐ Enhanced API (ADR-001 + Phase 3 Implementation)
| Feature | Status | Notes |
|------------------------|----------|----------------------------------------|
| **Comprehensive Validation** | โ
| `validate()` with detailed error reporting |
| **Unit Analysis** | โ
| `analyse()` with dimensions, factors, properties |
| **Unit Arithmetic** | โ
| `unit_multiply()`, `unit_divide()` operations |
| **Advanced Search** | โ
| Text, property, fuzzy, and regex search |
| **Property Validation** | โ
| `validate_in_property()` for physical quantities |
| **Unit Compatibility** | โ
| `is_comparable()` for commensurability checking |
| **Canonical Forms** | โ
| `get_canonical_units()` for normalization |
| **Special Unit System** | โ
| Extensible handlers for temperature, logarithmic units |
| **Precision Arithmetic** | โ
| Optional `rust_decimal` support for high precision |
| **๐ Model Introspection** | โ
| `get_model()`, `validate_ucum()`, `get_properties()` |
| **๐ Enhanced Display Names** | โ
| `get_common_display()` with prefixed unit support |
| **๐ Advanced Conversion** | โ
| `convert_with_context()` with precision control |
### ๐ง Core Capabilities
| Feature | Status | Notes |
|------------------------|----------|----------------------------------------|
| SI base/derived units | โ
| Full support with 7-dimensional vectors |
| Customary units | โ
| Imperial, US customary, etc. |
| Specialized units | โ
| Medical, laboratory, information units |
| Prefix handling | โ
| e.g., kPa, mL, ยตg with precision support |
| Expression parsing | โ
| **Zero-copy architecture** with robust error messages |
| Unit conversion | โ
| Handles factors, offsets, temperature |
| Temperature support | โ
| Celsius, Fahrenheit, Rankine with offsets |
| **Performance Optimization** | โ
| **Phase 1 Complete** - 40% parsing improvement |
### ๐ ๏ธ Tools & Integration
| Feature | Status | Notes |
|------------------------|----------|----------------------------------------|
| CLI tool | โ
| `octofhir-ucum` binary |
| WASM support | โ
| npm package: `@octofhir/ucum-wasm` |
| Interactive playground | โ
| Svelte 5 web application |
| FHIR integration | โ
| FHIR Quantity data type support |
| Property-based tests | โ
| `proptest` |
| Fuzzing | โ
| `cargo-fuzz` targets for parser/eval |
### ๐ Test Conformance (98.6% Overall)
| Test Category | Status | Results |
|------------------------|----------|----------------------------------------|
| **Overall Conformance** | โ
| **98.6%** (1120/1136 tests passing) |
| Validation tests | โ
| **99.5%** (1048/1053) |
| Conversion tests | โ ๏ธ | **83.1%** (49/59) - acceptable precision differences |
| Division tests | โ
| **100%** (3/3) - precision arithmetic fixed |
| Multiplication tests | โ
| **100%** (4/4) |
| Display name tests | โ
| **94.1%** (16/17) |
### โก Performance
**Current Performance (v0.5.0 with Unified Optimized Parser):**
- **Simple parsing**: ~7,900,000 ops/second (~126 ns per operation) **[+2700% improvement]**
- **Prefixed units**: ~6,800,000 ops/second (~147 ns per operation) **[+2300% improvement]**
- **Unicode handling**: ~6,200,000 ops/second (~161 ns per operation) **[+2100% improvement]**
- **Complex expressions**: ~1,560,000 ops/second (~640 ns per operation) **[+450% improvement]**
- **Evaluation**: ~1,390,000 ops/second (~718 ns per operation)
- **Analysis**: ~606,000 ops/second (~1.65 ยตs per operation)
**Performance Optimizations Implemented:**
### High-Performance Parser Architecture
- โ
**Zero-copy string parsing** - Avoids unnecessary string allocations during parsing
- โ
**Lazy Unicode normalization** - Only normalizes ยต characters when detected
- โ
**Fast pattern validation** - Single-pass scanning with optimized character handling
- โ
**Dual AST architecture** - `UnitExpr<'a>` (zero-copy) and `OwnedUnitExpr` (owned)
- โ
**Enhanced prefix lookup** - O(1) HashMap-based prefix resolution
### Parser Features (`parser.rs`)
- โ
**ASCII lookup tables** - Fast character classification with compile-time tables
- โ
**SIMD-ready validation** - Infrastructure for x86_64 SSE2 acceleration
- โ
**Perfect hash maps** - Compile-time perfect hashing for common units (time units)
- โ
**Small vector optimization** - Most UCUM expressions have โค4 factors
- โ
**Single-pass tokenization** - Efficient tokenizer with minimal backtracking
- โ
**UTF-8 micro sign handling** - Proper handling of ยต (0xC2 0xB5) sequences
**Benchmarking Infrastructure:**
Comprehensive benchmarks track performance across multiple dimensions:
- **Complexity categories**: Simple units, prefixed units, complex expressions, edge cases
- **Parser features**: Unicode handling, annotations, ten-power notation, leading division
- **Real-world usage**: Medical dosing, engineering calculations, batch processing
- **Memory patterns**: Zero-copy vs owned allocations, pathological cases
- **Allocation tracking**: Measures memory allocation patterns for different expression types
**Technical Implementation Notes:**
The unified parser implementation achieves exceptional performance:
- `parser.rs` - Single high-performance parser with advanced optimizations
- **26x faster** parsing for simple units compared to baseline
- **23x faster** for prefixed units with full validation
- All optimizations maintain 100% compatibility with UCUM specification
- Passes all 117 tests including official conformance tests (98.6% overall)
## WASM Package
The UCUM library is available as a WebAssembly package for use in JavaScript/TypeScript applications.
### Installation
```sh
# Using npm
npm install @octofhir/ucum-wasm
# Using yarn
yarn add @octofhir/ucum-wasm
# Using pnpm
pnpm add @octofhir/ucum-wasm
```
### Usage
```typescript
import {
start,
validate,
get_unit_info,
convert,
evaluate_expression,
arithmetic,
// Phase 3 functions
get_ucum_model,
get_unit_display_name,
convert_advanced_simple
} from '@octofhir/ucum-wasm';
// Initialize the WASM module
start();
// Validate a UCUM expression
const isValid = validate('mg/dL'); // true
// Get information about a unit
const unitInfo = get_unit_info('mg');
console.log(unitInfo.factor); // 0.000001
console.log(unitInfo.dimensions); // [1, 0, 0, 0, 0, 0, 0]
// Convert between units
const result = convert(100, 'kPa', 'mm[Hg]'); // 750.06...
// Evaluate a UCUM expression
const evalResult = evaluate_expression('mg/dL');
console.log(evalResult.factor); // 0.00001
// Perform arithmetic operations
const arithResult = arithmetic('mg', 'mul', 'mL', 1);
console.log(arithResult.dimensions); // [1, 3, 0, 0, 0, 0, 0]
// Phase 3: Model introspection
const model = get_ucum_model();
console.log(model.version); // '2.1'
console.log(model.total_units); // 312
// Phase 3: Enhanced display names
console.log(get_unit_display_name('kg')); // 'kilogram'
console.log(get_unit_display_name('m/s')); // '(meter) / (second)'
// Phase 3: Advanced conversion with precision
const advResult = convert_advanced_simple(1000, 'g', 'kg', 3);
console.log(advResult.value); // 1.000
console.log(advResult.precision_info); // '3 decimal places'
```
## Phase 3 API Completeness
Phase 3 introduces comprehensive model introspection and advanced conversion capabilities to enhance the UCUM implementation.
### Model Introspection
```rust
use octofhir_ucum_core::{get_model, validate_ucum, get_properties, get_common_display};
// Get model information
let model = get_model();
println!("UCUM Version: {}", model.version); // "2.1"
println!("Total Units: {}", model.units.len()); // 312
println!("Total Prefixes: {}", model.prefixes.len()); // 24
// Validate implementation self-consistency
let issues = validate_ucum();
if issues.is_empty() {
println!("UCUM implementation is valid");
} else {
println!("Issues found: {:?}", issues);
}
// Get all available properties
let properties = get_properties();
println!("Available properties: {}", properties.len()); // 101
// Enhanced display names (handles prefixed units)
println!("{}", get_common_display("kg")); // "kilogram"
println!("{}", get_common_display("cm")); // "centimeter"
println!("{}", get_common_display("m/s")); // "(meter) / (second)"
```
### Advanced Conversion with Precision Control
```rust
use octofhir_ucum_core::{
convert_with_context,
AdvancedConversionContext,
DecimalPrecision,
RoundingMode,
TemperatureScale
};
// Create conversion context with precise control
let context = AdvancedConversionContext {
precision: DecimalPrecision::Fixed(3),
rounding: RoundingMode::Nearest,
temperature_scale: TemperatureScale::Celsius,
use_special_units: true,
};
// Convert with advanced precision
let result = convert_with_context(1000.0, "g", "kg", &context)?;
println!("Value: {}", result.value); // 1.000
println!("Precision: {}", result.precision_info); // "3 decimal places"
println!("Used special units: {}", result.used_special_units); // false
// Temperature conversion with special handling
let temp_result = convert_with_context(100.0, "Cel", "K", &context)?;
println!("Value: {}", temp_result.value); // 373.150
println!("Used special units: {}", temp_result.used_special_units); // true
```
### CLI Integration
All Phase 3 features are available through the CLI:
```sh
# Model introspection
octofhir-ucum model
octofhir-ucum self-validate
octofhir-ucum properties --limit 10
# Enhanced display names
octofhir-ucum display kg # kilogram
octofhir-ucum display "m/s" # (meter) / (second)
# Advanced conversion with precision
octofhir-ucum convert-advanced --value 1000 --from g --to kg --precision 3
octofhir-ucum convert-advanced --value 100 --from Cel --to K --precision 2
```
### WASM Integration
Phase 3 functions are fully exposed in the WASM package:
```javascript
// Model introspection
const model = get_ucum_model();
const validation = validate_ucum_implementation();
const properties = get_ucum_properties();
// Enhanced display names
const displayName = get_unit_display_name('kg');
// Advanced conversion
const result = convert_advanced_simple(1000, 'g', 'kg', 3);
const advancedResult = convert_advanced(100, 'Cel', 'K', {
precision_type: 'fixed',
precision_value: 2,
rounding_mode: 'nearest',
temperature_scale: 'celsius',
use_special_units: true
});
```
## Interactive Playground
An interactive web-based playground is available to explore the UCUM library's capabilities.
### Features
- **Validation**: Validate UCUM expressions
- **Unit Information**: Get detailed information about units
- **Conversion**: Convert values between compatible units
- **Arithmetic**: Perform arithmetic operations on units
- **Phase 3 Capabilities**: Model introspection, enhanced display names, and advanced conversion with precision control
### Running Locally
```sh
# Navigate to the playground directory
cd playground
# Install dependencies
pnpm install
# Start the development server (use npm due to pnpm script execution issues)
npm run dev
```
The playground will be available at http://localhost:6000.
## FHIR Integration
The UCUM library provides integration with FHIR (Fast Healthcare Interoperability Resources) through the `fhir` feature.
### Installation
```sh
# Add to your project
cargo add octofhir-ucum --features fhir
```
### Features
- **FHIR Quantity**: FHIR Quantity data type implementation
- **Conversion**: Convert between FHIR Quantity and UCUM Quantity
- **Unit Conversion**: Convert FHIR Quantities between different units
- **Equivalence**: Check if two FHIR Quantities are equivalent
- **Error Handling**: Comprehensive error handling for invalid inputs
### Usage
```rust
use octofhir_ucum_fhir::{FhirQuantity, convert_quantity, are_equivalent};
// Create a FHIR Quantity with a UCUM code
let quantity = FhirQuantity::with_ucum_code(1000.0, "mg");
// Convert to a different unit
let converted = convert_quantity(&quantity, "g").unwrap();
assert_eq!(converted.value, 1.0);
assert_eq!(converted.code, Some("g".to_string()));
// Check if two quantities are equivalent
let quantity2 = FhirQuantity::with_ucum_code(1.0, "g");
assert!(are_equivalent(&quantity2, &converted).unwrap());
```
## Fuzzing
The UCUM library includes fuzzing infrastructure to identify potential bugs and edge cases using `cargo-fuzz`.
### Setup
```sh
# Install cargo-fuzz
cargo install cargo-fuzz
```
### Fuzzing Targets
- **Parser Fuzzer**: Tests the `parse_expression` function with arbitrary input strings
- **Evaluator Fuzzer**: Tests the `evaluate` function with valid UCUM expressions
### Running the Fuzzers
```sh
# Run the parser fuzzer
cargo fuzz run -p octofhir-ucum-fuzz fuzz_parser
# Run the evaluator fuzzer
cargo fuzz run -p octofhir-ucum-fuzz fuzz_evaluator
```
### Continuous Fuzzing
For continuous fuzzing, you can set up a CI job that runs the fuzzers for a fixed amount of time:
```sh
# Run the parser fuzzer for 5 minutes
cargo fuzz run -p octofhir-ucum-fuzz fuzz_parser -- -max_total_time=300
```
For more details, see the [ucum-fuzz README](ucum-fuzz/README.md).
## Official Test Validation
The UCUM library includes validation against the official UCUM test cases from the FHIR/Ucum-java repository to ensure compliance with the UCUM specification.
### Test Coverage
Our implementation achieves **91.4% conformance** to the official UCUM functional test suite:
- **Total Tests:** 1,068 official UCUM test cases
- **Passed:** 976 tests
- **Failed:** 92 tests
- **Success Rate:** 91.4%
### Running Official Tests
```sh
# Run all official validation tests
cargo test official_tests
# Run with detailed output to see individual test results
cargo test run_official_validation_tests -- --nocapture
# Run tests from the second official test file
cargo test run_official_validation_tests_2 -- --nocapture
```
## Contributing Guide
We welcome contributions to the UCUM-RS project! This guide will help you get started with development and ensure your contributions align with the project's standards.
### Getting Started
1. **Fork and clone the repository:**
```sh
git clone https://github.com/YOUR_USERNAME/ucum-rs.git
cd ucum-rs
```
2. **Install dependencies:**
- Rust 1.70+ (edition 2021)
- `wasm-pack` for WebAssembly builds
- `pnpm` for playground development
3. **Build the project:**
```sh
# Build all workspace crates
cargo build --all
# Build with specific features
cargo build --features cli
cargo build --features wasm
cargo build --features fhir
```
### Development Workflow
#### Testing
```sh
# Run all tests (recommended before submitting PR)
cargo test --all
# Run tests with output for debugging
cargo test --all -- --nocapture
# Run specific test suites
cargo test official_tests # Official UCUM conformance tests
cargo test test_micro_normalization # UTF-8 handling tests
# Run benchmarks
cargo bench
```
#### Code Quality
```sh
# Format code (required before commit)
cargo fmt --all
# Check formatting without changes
cargo fmt --all -- --check
# Run linter with strict warnings
cargo clippy --all -- -D warnings
# Pre-publish validation (runs all quality checks)
just publish-prep # or cargo fmt && cargo clippy --all -- -D warnings && cargo test --all
```
#### Documentation
```sh
# Generate and open documentation
cargo doc --open --no-deps --all
# Validate documentation examples
cargo test --doc
```
### Development Areas
#### Core Parser (`src/parser.rs`)
- **High-Performance Parser**: Advanced implementation with zero-copy optimizations
- Handles all UCUM edge cases with comprehensive error reporting
- Must maintain compatibility with UCUM specification
**Adding new parser features:**
1. Implement feature in `parser.rs` maintaining performance optimizations
2. Add comprehensive tests covering edge cases
3. Ensure backward compatibility with existing API
4. Validate against official UCUM conformance tests
#### AST and Evaluation (`src/ast.rs`, `src/evaluator.rs`)
- Follow zero-copy patterns where possible
- Use `UnitExpr<'a>` for borrowed data, `OwnedUnitExpr` for owned data
- Maintain dimensional analysis consistency
#### Registry and Build System (`src/registry.rs`, `build.rs`)
- Registry is generated at compile-time from `ucum-essence.xml`
- Changes to build system must maintain WASM compatibility
- Avoid thread-local storage for cross-platform support
### Feature Development
#### Adding New Features
1. **Create issue** describing the feature and use case
2. **Write tests first** - we follow TDD principles
3. **Implement feature** maintaining backward compatibility
4. **Update documentation** including code examples
5. **Add CLI support** if user-facing (optional)
6. **Add WASM bindings** if relevant (optional)
#### Performance Optimizations
1. **Benchmark first** - establish baseline performance
2. **Profile bottlenecks** using `cargo bench` and `perf`
3. **Implement optimizations** in `parser_optimized.rs` if parser-related
4. **Validate correctness** - all tests must still pass
5. **Document performance gains** with before/after metrics
### Testing Guidelines
#### Test Categories
1. **Unit tests** - Test individual functions and components
2. **Integration tests** - Test complete parsing and evaluation flows
3. **Official conformance tests** - UCUM specification compliance (98.6% pass rate)
4. **Property-based tests** - Using `proptest` for edge case discovery
5. **Fuzzing tests** - Located in `ucum-fuzz/` directory
#### Writing Tests
```rust
#[test]
fn test_new_feature() {
// Test successful case
let result = parse_expression("your_expression").unwrap();
assert_eq!(result, expected_ast);
// Test error cases
assert!(parse_expression("invalid_expression").is_err());
// Test edge cases
assert_eq!(parse_expression(""), Ok(UnitExpr::Numeric(1.0)));
}
```
### Code Style
- **Formatting**: Use `cargo fmt` (rustfmt) for consistent formatting
- **Linting**: Address all `clippy` warnings with `cargo clippy --all -- -D warnings`
- **Documentation**: Document all public APIs with examples
- **Error handling**: Use descriptive error messages with context
- **Performance**: Prefer zero-copy patterns, avoid unnecessary allocations
### Submitting Changes
1. **Create feature branch** from `main`
2. **Write tests** covering your changes
3. **Run quality checks**: `just publish-prep` or equivalent commands
4. **Update documentation** if adding public APIs
5. **Submit pull request** with clear description of changes
6. **Address review feedback** promptly
#### Pull Request Checklist
- [ ] All tests pass (`cargo test --all`)
- [ ] Code is formatted (`cargo fmt --all -- --check`)
- [ ] No clippy warnings (`cargo clippy --all -- -D warnings`)
- [ ] Documentation updated for public APIs
- [ ] CHANGELOG.md updated if applicable
- [ ] Backward compatibility maintained
### Project-Specific Guidelines
#### Unicode Handling
- Always handle ยต (micro sign) properly in both parsers
- Use UTF-8 byte sequences (0xC2 0xB5) for micro sign detection
- Test with both ASCII 'u' and Unicode 'ยต' variants
#### WASM Compatibility
- Avoid thread-local storage (`thread_local!`)
- Test WASM builds: `wasm-pack build --target web --features wasm`
- Ensure `no_std` compatibility where possible
#### Error Messages
- Provide precise error locations with spans
- Include suggestions for common mistakes
- Test error message clarity with real users
### Getting Help
- **Documentation**: Check `CLAUDE.md` for development commands
- **Issues**: Search existing issues before creating new ones
- **Discussions**: Use GitHub Discussions for questions
- **Code Review**: All changes require review before merging
### Playground Development
```sh
cd playground
pnpm install
# Use npm for development due to pnpm script execution issues
npm run dev # Runs on http://localhost:6000
```
The playground provides a real-time testing environment for UCUM expressions and helps validate user-facing functionality.
## Project Structure
- `src/` โ Core library (parsing, evaluation, registry)
- `src/bin/cli.rs` โ Command-line interface
- `src/wasm.rs` โ WebAssembly bindings for JavaScript/TypeScript (@octofhir/ucum-wasm)
- `src/fhir.rs` โ FHIR integration (FHIR Quantity data type support)
- `ucum-fuzz/` โ Fuzzing infrastructure (cargo-fuzz targets)
- `playground/` โ Interactive web-based playground (Svelte 5)
- `spec/` โ UCUM specification assets
## License
Apache-2.0