# AI Agentic Data Stack Framework - Community Edition

[![License](https://img.shields.io/badge/license-MIT-blue)](LICENSE.txt)
[![Version](https://img.shields.io/badge/version-1.1.2-green)](package.json)
[![Framework](https://img.shields.io/badge/framework-Community%20Edition-orange)](https://github.com/barnyp/agentic-data-stack-framework-community)

**Open source data engineering and analytics framework with interactive AI agents, comprehensive templates, and complete example projects.**

## 🚀 Quick Start

```bash
# Install globally
npm install -g agentic-data-stack-community

# Try the complete example project first
cd examples/simple-ecommerce-analytics
python sample-data/generate-sample-data.py

# Activate interactive agents
agentic-data agent data-analyst
*analyze-data

# Or run structured workflows
agentic-data workflow community-analytics-workflow

# Create your own project
agentic-data init my-analytics-project
```

## 🌟 What's Included

### 🤖 4 Interactive AI Agents

- **Data Engineer** (Emma ⚙️): Pipeline development, ETL processes, infrastructure setup
- **Data Analyst** (Riley 📈): Customer segmentation, RFM analysis, business insights
- **Data Product Manager** (Morgan 📊): Requirements gathering, stakeholder coordination
- **Data Quality Engineer** (Quinn 🔍): 3-dimensional quality validation and monitoring

### 📋 20 Essential Templates

- **Data Contracts**: Customer data, order processing, product catalogs
- **Implementation**: SQL analysis, Python validation scripts
- **Project Setup**: Business requirements, architecture planning
- **Quality Validation**: Automated testing and monitoring
- **Documentation**: User guides, technical specifications

### 🔍 3-Dimensional Quality Framework

- **Completeness**: Data availability and coverage validation
- **Accuracy**: Format checking and type validation
- **Consistency**: Cross-reference validation and uniqueness checks

### 🎯 Interactive Agent System

- **Agent Activation**: `@data-analyst` for guided assistance
- **Command Execution**: `*analyze-data` for task-specific operations
- **Interactive Shell**: `agentic-data interactive` for persistent agent sessions
- **Multi-Agent Workflows**: Advanced orchestration with context handoffs
- **Progressive Disclosure**: 12+ elicitation methods for quality content creation
- **Session Persistence**: Workflow continuity and progress tracking

### 📊 Complete E-commerce Example

- Customer segmentation with RFM analysis
- Data quality validation scripts
- Business requirements documentation
- Sample data generation tools
- Interactive agent walkthroughs

## 📦 Installation

### Global Installation (Recommended)

```bash
npm install -g agentic-data-stack-community
```

### Local Project Installation

```bash
npm install agentic-data-stack-community
npx agentic-data init my-project
```

### Development Installation

```bash
git clone https://github.com/barnyp/agentic-data-stack-framework-community
cd agentic-data-stack-framework-community
npm install
npm link  # Make CLI available globally
```

## 🛠️ CLI Commands

```bash
# Framework Information
agentic-data info                        # Display framework overview
agentic-data --version                   # Show version

# Interactive Shell (Recommended)
agentic-data interactive                 # Enter interactive shell mode

# Interactive Agents
agentic-data agent <agent-name>          # Activate interactive agent (legacy)
agentic-data agents list                 # List available agents
agentic-data agents show <agent>         # Show agent details

# Workflows and Tasks
agentic-data workflow <workflow-name>    # Execute structured workflow
agentic-data task <task-name>            # Execute specific task

# Templates and Examples
agentic-data templates list              # List available templates
agentic-data templates show <template>   # Show template details
agentic-data examples list               # List available examples

# Project Management
agentic-data init [project-name]         # Create new project
agentic-data validate                    # Run quality validation
```

## 🐚 Interactive Shell Mode

The interactive shell provides a persistent, conversational interface with AI agents:

```bash
# Enter interactive mode
agentic-data interactive

# Inside the shell:
@data-analyst                  # Activate Data Analyst agent
*help                          # Show agent capabilities
*task                          # List available tasks
*analyze-data                  # Execute data analysis task
*create-doc analysis-report    # Create document from template
*exit                          # Deactivate current agent
exit                           # Exit interactive shell
```

### Interactive Commands

- **Agent Activation**: `@data-engineer`, `@data-analyst`, `@data-product-manager`, `@data-quality-engineer`
- **Task Commands**: `*task <name>`, `*analyze-data`, `*create-dashboard`, `*define-metrics`
- **Document Commands**: `*create-doc <template>`, `*shard-doc <path>`, `*manage-docs`
- **Knowledge Commands**: `*kb-mode`, `*search <query>`
- **Expansion Commands**: `*manage-packs`, `*install-pack <name>`, `*create-pack`

## 🏗️ Framework Architecture

```
AI Agentic Data Stack Framework - Community Edition
├── 🤖 Interactive AI Agents (4)
│   ├── Data Engineer (Emma ⚙️)
│   ├── Data Analyst (Riley 📈)
│   ├── Data Product Manager (Morgan 📊)
│   └── Data Quality Engineer (Quinn 🔍)
├── 📋 Templates & Tasks (30)
│   ├── Templates (20): Data contracts, analysis, dashboards
│   ├── Tasks (10): Pipeline building, analysis, quality checks
│   └── Checklists (8): Quality validation, deployment
├── 🔄 Workflows (9)
│   ├── Brownfield (5): System integration workflows
│   └── Greenfield (4): New project workflows
├── 🔍 Quality Framework
│   ├── Completeness Validation
│   ├── Accuracy Checking
│   └── Consistency Verification
└── 📚 Complete Examples
    ├── E-commerce Analytics (SQL + Python)
    ├── Interactive CLI Interface
    └── Sample Data Generation
```

## 🎯 Use Cases

### Customer Analytics

- **RFM Segmentation**: Recency, Frequency, Monetary analysis
- **Customer Journey**: Lifecycle and behavior tracking
- **Marketing Optimization**: Targeted campaign development

### Data Quality Management

- **Automated Validation**: 3-dimensional quality checks
- **Data Monitoring**: Continuous quality tracking
- **Issue Detection**: Format and consistency validation

### Business Intelligence

- **Reporting**: Automated insight generation
- **Dashboard Development**: Self-service analytics
- **Performance Tracking**: KPI monitoring and alerts

## 📊 Complete Example: E-commerce Customer Segmentation

### 1. Try the Built-in Example

```bash
# Navigate to the included example
cd examples/simple-ecommerce-analytics

# Generate realistic sample data
python sample-data/generate-sample-data.py
```

### 2. Use Interactive Shell Mode

```bash
# Enter interactive mode (recommended)
agentic-data interactive

# Start with requirements gathering
@data-product-manager
*gather-requirements
*exit

# Perform data analysis
@data-analyst
*analyze-data
*segment-customers
*exit

# Validate data quality
@data-quality-engineer
*implement-quality-checks
*exit

# Exit interactive shell
exit
```

### 3. Or Use Structured Workflows

```bash
# Execute the complete workflow with agent handoffs
agentic-data workflow community-analytics-workflow

# Follow the interactive prompts for each step
```

### Expected Results

- **5-7 Customer Segments**: Champions, Loyal Customers, At Risk, etc.
- **90%+ Data Quality**: Across completeness, accuracy, consistency
- **Marketing Ready Lists**: Exportable customer segments with campaign recommendations

## 🔧 Configuration

### Project Structure

```
my-project/
├── data-contracts/     # Data specifications
├── implementation/     # SQL scripts & Python code
├── documentation/      # Project documentation
├── validation/         # Quality validation scripts
├── sample-data/        # Test data and generators
└── README.md           # Project overview
```

### Data Contracts Example

```yaml
# customer-data-contract.yaml
contract_metadata:
  name: "customer_data_contract_community"
  framework_version: "AI Agentic Data Stack Community v1.0"

business_context:
  objective: "Customer segmentation for targeted marketing"

quality_framework:
  dimensions:
    completeness:
      customer_id: {threshold: 100.0, criticality: "critical"}
      email: {threshold: 95.0, criticality: "high"}
    accuracy:
      email_format: {threshold: 95.0, validation: "regex_email"}
    consistency:
      customer_id_unique: {threshold: 100.0, check: "uniqueness"}
```

## 🚀 Getting Started Tutorial

### Step 1: Install and Try Example

```bash
npm install -g agentic-data-stack-community

# Start with the complete example (recommended)
cd examples/simple-ecommerce-analytics
python sample-data/generate-sample-data.py
```

### Step 2: Explore Interactive Shell

```bash
# See what's available
agentic-data info
agentic-data agents list

# Enter interactive shell mode
agentic-data interactive

# Activate your first agent
@data-analyst
*help
*task
*analyze-data
*exit

# Exit shell
exit
```

### Step 3: Try Workflows

```bash
# Execute structured multi-agent workflows
agentic-data workflow community-analytics-workflow

# Follow the interactive prompts for each step
```

### Step 4: Create Your Own Project

```bash
# Initialize your own project
agentic-data init my-analytics-project
cd my-analytics-project

# Copy patterns from the example
cp -r ../examples/simple-ecommerce-analytics/implementation .
```

### Step 5: Interactive Shell

```bash
# Enter persistent interactive mode
agentic-data interactive

# Try different agents and commands
```

## 📈 Performance and Scale

### Community Edition Capabilities

- **Data Volume**: Up to 1M records per analysis
- **Processing**: Single-machine processing optimized
- **Quality Checks**: 3-dimensional framework
- **Export Formats**: CSV, JSON for marketing tools
- **Update Frequency**: Daily batch processing

### Performance Benchmarks

- **Segmentation Analysis**: ~30 seconds for 100K customers
- **Quality Validation**: ~15 seconds for 500K records
- **Data Export**: ~5 seconds for 50K customer lists

## 🤝 Community & Support

### Community Resources

- **GitHub Discussions**: Ask questions, share insights
- **Documentation**: Complete guides and tutorials
- **Examples**: Real-world implementations
- **Contributing**: Help improve the framework

### Getting Help

1. **Check Documentation**: Start with README and examples
2. **Search Issues**: Look for similar questions on GitHub
3. **Ask Community**: Post in GitHub Discussions
4. **Report Bugs**: Create detailed issue reports

### Contributing Guidelines

We welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for:

- Code contribution process
- Documentation improvements
- Example submissions
- Bug reporting guidelines

## 🏢 Enterprise Edition

Ready for advanced features? Enterprise Edition includes:

### Additional Capabilities

- **8 Specialized Agents**: Including Data Scientist, Governance Officer, Experience Designer
- **88 Interactive Templates**: Industry-specific solutions and advanced patterns
- **7-Dimensional Quality**: ML-enhanced validation with predictive analytics
- **Real-time Collaboration**: Multi-user workflows and approval processes
- **Advanced Compliance**: HIPAA, GDPR, SOX automation
- **Professional Support**: Training, consulting, and technical support

### Industry Solutions

- **Healthcare**: HIPAA-compliant patient analytics
- **Financial Services**: Risk modeling and compliance
- **Retail**: Advanced recommendation engines
- **Manufacturing**: Supply chain optimization

### Contact Enterprise

📞 **Sales**: enterprise@agenticdatastack.com

🌐 **Website**: [Enterprise Features](https://www.agenticdatastack.com/)

📅 **Demo**: Schedule a personalized demonstration

## 📄 License & Legal

### Community Edition License

This Community Edition is licensed under the [MIT License](LICENSE.txt).

### Comparison

| Feature | Community Edition | Enterprise Edition |
|---------|------------------|-------------------|
| AI Agents | 4 Core Agents | 8 Specialized Agents |
| Templates | 20 Essential | 88 Interactive |
| Quality Framework | 3-Dimensional | 7-Dimensional + ML |
| Support | Community | Professional |
| License | MIT (Open Source) | Commercial |
| Compliance | Basic | Advanced (HIPAA, GDPR) |

<!--
## 🗺️ Roadmap

### Community Edition v1.1 (Q4 2025)

- Additional example implementations
- Enhanced CLI with project templates
- Improved documentation and tutorials
- Community-contributed templates

### Future Releases

- Integration with popular data tools
- Advanced visualization templates
- Multi-language support
- Performance optimizations
-->

<!--
---

## 🎉 Success Stories

> *"The Community Edition helped us implement customer segmentation in just 2 days. The RFM analysis template saved us weeks of development time."*
> **- Sarah Chen, Marketing Analytics Manager**

> *"Love the 3-dimensional quality framework. It caught data issues we didn't even know we had."*
> **- Mike Rodriguez, Data Engineer**

> *"Perfect for learning data engineering patterns. The examples are realistic and well-documented."*
> **- Lisa Park, Data Science Student**
-->

---

**🚀 Ready to transform your data operations? Start with `cd examples/simple-ecommerce-analytics` and explore interactive agents!**

**Framework**: AI Agentic Data Stack - Community Edition v1.1.2

**License**: MIT

**Community**: [GitHub Discussions](https://github.com/barnyp/agentic-data-stack-framework-community/discussions)

**Enterprise**: enterprise@agenticdatastack.com
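
To make the 3-dimensional quality framework and the data contract example above more concrete, here is a minimal sketch of how completeness, accuracy, and consistency scores might be computed and compared against contract thresholds. It is an illustration only, not the framework's own validation code: the names `CONTRACT`, `EMAIL_RE`, `score_records`, and `failed_checks` are hypothetical, the thresholds are copied by hand from the sample contract, and the email pattern is a simplified stand-in for the contract's `regex_email` rule.

```python
import re

# Hypothetical thresholds, copied by hand from the sample data contract.
CONTRACT = {
    "completeness": {"customer_id": 100.0, "email": 95.0},
    "accuracy": {"email_format": 95.0},
    "consistency": {"customer_id_unique": 100.0},
}

# Deliberately simple pattern; an assumed stand-in for "regex_email".
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def pct(part, whole):
    """Return part/whole as a percentage; an empty denominator counts as 100%."""
    return 100.0 if whole == 0 else 100.0 * part / whole


def score_records(records):
    """Compute one percentage score per contract check for a list of dict records."""
    total = len(records)
    ids = [r.get("customer_id") for r in records if r.get("customer_id")]
    emails = [r.get("email") for r in records if r.get("email")]
    valid_emails = sum(1 for e in emails if EMAIL_RE.match(e))
    return {
        "completeness": {
            # Share of records with a non-empty value in the field.
            "customer_id": pct(len(ids), total),
            "email": pct(len(emails), total),
        },
        "accuracy": {
            # Share of non-empty emails that match the pattern.
            "email_format": pct(valid_emails, len(emails)),
        },
        "consistency": {
            # Share of non-empty customer IDs that are distinct.
            "customer_id_unique": pct(len(set(ids)), len(ids)),
        },
    }


def failed_checks(scores, contract=CONTRACT):
    """List (dimension, check) pairs whose score is below the contract threshold."""
    return [
        (dim, check)
        for dim, checks in contract.items()
        for check, threshold in checks.items()
        if scores[dim][check] < threshold
    ]


if __name__ == "__main__":
    sample = [
        {"customer_id": "C001", "email": "ana@example.com"},
        {"customer_id": "C002", "email": "not-an-email"},
        {"customer_id": "C002", "email": "bo@example.com"},
        {"customer_id": "C003", "email": ""},
    ]
    scores = score_records(sample)
    for dim, check in failed_checks(scores):
        print(f"FAIL {dim}.{check}: {scores[dim][check]:.1f}%")
```

Keeping the thresholds in a plain dictionary mirrors the contract's `dimensions` layout, so a real project could load them from the YAML file instead of hard-coding them.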