# BODS Data Extractor

[![Test BODS Data Extractor](https://github.com/DRFR0ST/bods-data-extractor-js/actions/workflows/test.yml/badge.svg)](https://github.com/DRFR0ST/bods-data-extractor-js/actions/workflows/test.yml)

A TypeScript library and CLI tool for converting BODS (Bus Open Data Service) UK dataset bus line data from XML to structured JSON format.

## Features

- 🚌 Converts BODS XML files to structured JSON
- 🔄 Extracts stop points, vehicle journeys, and location data
- 🎯 Type-safe TypeScript implementation
- 🖥️ Command-line interface for batch processing
- 📦 Can be used as a library in other projects
- ✅ Comprehensive test coverage with snapshot testing
- ⚡ Built with Bun for fast performance

## Installation

### As a CLI tool

```bash
# Clone the repository
git clone https://github.com/DRFR0ST/bods-data-extractor-js.git
cd bods-data-extractor-js

# Install dependencies
bun install

# Make CLI globally available (optional)
bun link
```

### As a library

```bash
bun add bods-data-extractor
# or
npm install bods-data-extractor
```

## Usage

### Command Line Interface

```bash
# Convert a single XML file
bun run cli input/file.xml

# Convert to specific output directory
bun run cli input/file.xml output/

# Process multiple files
for file in input/*.xml; do
  bun run cli "$file" output/
done
```

### As a Library

```typescript
import { convertBodsXmlToJson } from 'bods-data-extractor';

// Convert XML file to structured JSON
const result = convertBodsXmlToJson('./path/to/bods-file.xml');

console.log(result);
// {
//   stopPoints: [...],
//   location: [...],
//   startTime: [...]
// }
```

### Output Structure

The converter produces a structured JSON object with the following format:

```typescript
interface BodsOutput {
  stopPoints: StopPoint[];      // Bus stops with IDs and names
  location: Location[];         // Geographic coordinates for route segments
  startTime: VehicleJourney[];  // Journey schedules and timing
}

interface StopPoint {
  id: string;
  name: string;
}

interface Location {
  from: string;
  to: string;
  longitude: string;
  latitude: string;
}

interface VehicleJourney {
  time: string;
  VehicleJourneyCode: string;
  routeSegments: RouteSegment[];
}
```

## Development

### Prerequisites

- [Bun](https://bun.sh/) runtime
- TypeScript 5+

### Setup

```bash
# Clone and install dependencies
git clone https://github.com/DRFR0ST/bods-data-extractor-js.git
cd bods-data-extractor-js
bun install
```

### Running Tests

```bash
# Run all tests
bun test

# Run tests in watch mode
bun test --watch

# Run with coverage
bun test --coverage
```

### Project Structure

```
src/
├── types/bods.ts              # TypeScript type definitions
├── utils/xml-parser.ts        # XML parsing utilities
├── extractors/
│   ├── stop-points.ts         # Stop point extraction
│   ├── vehicle-journeys.ts    # Vehicle journey extraction
│   ├── locations.ts           # Location data extraction
│   └── journey-pattern-timing-links.ts
├── converter.ts               # Main conversion logic
├── cli.ts                     # Command line interface
└── index.ts                   # Library exports

test/
├── converter.test.ts          # Unit tests for converter
├── cli.test.ts                # CLI integration tests
├── fixtures/                  # Test XML files
└── snapshots/                 # Expected output snapshots
```

### Scripts

```bash
bun run start               # Run the main script
bun run cli                 # Run CLI tool
bun run test                # Run tests
bun run test:watch          # Run tests in watch mode
bun run test:coverage       # Run tests with lcov coverage report
bun run test:coverage-text  # Run tests with text coverage output
bun run dev                 # Run CLI in development mode with watch
bun run build               # Build for distribution
bun run typecheck           # Type checking only
```

## Testing

The project includes comprehensive tests:

- **Unit Tests**: Test individual components and functions
- **Integration Tests**: Test the CLI and end-to-end conversion
- **Snapshot Tests**: Ensure output format consistency across changes

### Snapshot Testing

The project uses snapshot tests to ensure the output structure remains consistent. When the expected output changes legitimately, you can update snapshots by deleting the snapshot files and running tests again.

## CI/CD

The project includes GitHub Actions workflows that:

- Run tests on multiple operating systems (Ubuntu, Windows, macOS)
- Test with different Bun versions
- Generate test coverage reports
- Validate CLI functionality with real files
- Test package builds

## Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Make your changes
4. Add tests for your changes
5. Ensure all tests pass (`bun test`)
6. Commit your changes (`git commit -m 'Add amazing feature'`)
7. Push to the branch (`git push origin feature/amazing-feature`)
8. Open a Pull Request

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## About BODS

The Bus Open Data Service (BODS) is a UK government initiative that provides access to bus data across England. This tool helps convert the XML format used by BODS into a more developer-friendly JSON structure.

For more information about BODS, visit: https://www.bus-data.dft.gov.uk/
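
## Example: Working with the Output

The interfaces listed under Output Structure can be consumed without any further parsing. The following is a minimal sketch of one way to do that, not part of the library itself: `indexStopNames` and `describeSegments` are hypothetical helpers, the sample data is made up, and the code assumes that `Location.from` and `Location.to` hold stop point IDs (the README does not state this). `RouteSegment` is not defined in this README, so it is left opaque.

```typescript
// Types copied from the "Output Structure" section.
// RouteSegment is not defined in the README, so it is kept opaque here.
type RouteSegment = unknown;

interface StopPoint { id: string; name: string; }
interface Location { from: string; to: string; longitude: string; latitude: string; }
interface VehicleJourney { time: string; VehicleJourneyCode: string; routeSegments: RouteSegment[]; }
interface BodsOutput { stopPoints: StopPoint[]; location: Location[]; startTime: VehicleJourney[]; }

// Hypothetical helper: build a lookup from stop ID to stop name.
function indexStopNames(output: BodsOutput): Map<string, string> {
  return new Map(output.stopPoints.map((s): [string, string] => [s.id, s.name]));
}

// Hypothetical helper: describe each location entry using stop names.
// Assumes `from` and `to` are stop IDs; unknown IDs are shown as-is.
// Coordinates are strings in the output, so they are converted to numbers.
function describeSegments(output: BodsOutput): string[] {
  const names = indexStopNames(output);
  return output.location.map((loc) => {
    const from = names.get(loc.from) ?? loc.from;
    const to = names.get(loc.to) ?? loc.to;
    return `${from} -> ${to} (${Number(loc.latitude)}, ${Number(loc.longitude)})`;
  });
}

// Made-up sample data in the documented shape.
const sample: BodsOutput = {
  stopPoints: [
    { id: "S1", name: "High Street" },
    { id: "S2", name: "Station Road" },
  ],
  location: [
    { from: "S1", to: "S2", longitude: "-1.25", latitude: "53.5" },
    { from: "S2", to: "S3", longitude: "-1.3", latitude: "53.6" },
  ],
  startTime: [],
};

const segments = describeSegments(sample);
console.log(segments);
```

In real use, `sample` would be replaced by the value returned from `convertBodsXmlToJson`. Building the `Map` once keeps each name lookup constant-time, which matters for datasets with many stops.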