# Getting Started with ADSF Community Edition
Welcome to the AI Agentic Data Stack Framework Community Edition! This guide will help you set up and start using the interactive agent system for your data engineering and analytics projects.
## 📋 Prerequisites
### System Requirements
- **Node.js**: Version 14.0.0 or higher
- **Operating System**: macOS, Linux, or Windows
- **Memory**: Minimum 4GB RAM
- **Storage**: At least 1GB free space
### Skills
- Basic command line knowledge
- Understanding of data concepts
- Familiarity with SQL (helpful but not required)
- Basic Python knowledge (for quality validation scripts)
## 🚀 Installation
### Method 1: NPM Global Installation (Recommended)
```bash
# Install globally
npm install -g agentic-data-stack-community
# Verify installation
agentic-data --version
```
### Method 2: Local Project Installation
```bash
# Create project directory
mkdir my-data-project
cd my-data-project
# Install locally
npm install agentic-data-stack-community
# Use via npx
npx agentic-data --help
```
### Method 3: Clone Repository
```bash
# Clone the repository
git clone https://github.com/barnyp/agentic-data-stack-framework-community.git
cd agentic-data-stack-framework-community
# Install dependencies
npm install
# Run from source
node tools/cli.js --help
```
## 🎯 Your First Interactive Experience
### Step 1: Try the Complete Example (Recommended)
```bash
# Navigate to the included example
cd examples/simple-ecommerce-analytics
# Generate realistic sample data
python sample-data/generate-sample-data.py
# Review what's included
ls implementation/
cat README.md
```
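If you are curious what a generator step like this produces, the rough sketch below writes a small synthetic orders CSV in plain Python. It is illustrative only, not the bundled `generate-sample-data.py`; the column names, row count, and output file name are assumptions.
```python
import csv
import random
from datetime import date, timedelta

random.seed(42)  # reproducible output for a tutorial run

# Build 100 hypothetical order records with plausible-looking values.
rows = []
for i in range(1, 101):
    rows.append({
        "order_id": f"O{i:04d}",
        "customer_id": f"C{random.randint(1, 25):03d}",
        "order_date": (date(2024, 1, 1) + timedelta(days=random.randint(0, 89))).isoformat(),
        "amount": round(random.uniform(5, 250), 2),
    })

# Write the synthetic dataset to a CSV file (file name is a placeholder).
with open("orders_sample.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=rows[0].keys())
    writer.writeheader()
    writer.writerows(rows)
```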
### Step 2: Use Interactive Shell Mode
```bash
# List all available interactive agents
agentic-data agents list
# Enter interactive shell mode (recommended)
agentic-data interactive
# Activate the data analyst (Riley)
@data-analyst
# Interactive agent appears:
# 📈 Riley activated
# 📈 Riley: *help
# 📈 Riley: *task
# 📈 Riley: *analyze-data
# 📈 Riley: *exit
# Exit interactive shell
exit
```
### Step 3: Try Structured Workflows
```bash
# Run a complete multi-agent workflow
agentic-data workflow community-analytics-workflow
# Follow the interactive prompts for each step:
# 1. Requirements gathering (data-product-manager)
# 2. Data profiling (data-analyst)
# 3. Analysis execution (data-analyst)
# 4. Quality validation (data-quality-engineer)
```
### Step 4: Create Your Own Project
```bash
# Initialize new project with patterns from example
agentic-data init my-analytics-project
cd my-analytics-project
# Copy example patterns
cp -r ../examples/simple-ecommerce-analytics/implementation .
```
## 🤖 Working with Interactive AI Agents
### Data Engineer Agent (Emma ⚙️)
Emma helps with pipeline development and infrastructure setup.
```bash
# Activate Emma for interactive assistance
agentic-data agent data-engineer
# Available commands in Emma's session:
# ⚙️ Emma: *build-pipeline # Build data pipelines
# ⚙️ Emma: *setup-monitoring # Setup monitoring systems
# ⚙️ Emma: *implement-quality-checks # Add quality validation
# ⚙️ Emma: *profile-data # Analyze data characteristics
# ⚙️ Emma: *help # Show all commands
# ⚙️ Emma: *exit # Leave Emma's session
```
**Common Use Cases:**
- Setting up data ingestion pipelines
- Designing ETL processes (a minimal sketch appears after this list)
- Infrastructure planning
- Performance optimization
- Deployment strategies
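To make the pipeline and ETL use cases above concrete, here is a minimal extract-transform-load sketch in plain Python. It is independent of the framework; the CSV path, column names, and SQLite target are hypothetical stand-ins for your own sources and warehouse.
```python
import csv
import sqlite3

def extract(path):
    """Read raw order rows from a CSV file (path and columns are hypothetical)."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def transform(rows):
    """Drop incomplete records and normalize types before loading."""
    cleaned = []
    for row in rows:
        if not row.get("order_id") or not row.get("amount"):
            continue  # skip records missing required fields
        cleaned.append((row["order_id"], row.get("customer_id", ""), float(row["amount"])))
    return cleaned

def load(rows, db_path="analytics.db"):
    """Write cleaned rows into a local SQLite table as a stand-in warehouse."""
    with sqlite3.connect(db_path) as conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS orders (order_id TEXT, customer_id TEXT, amount REAL)"
        )
        conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)

if __name__ == "__main__":
    load(transform(extract("sample-data/orders.csv")))
```
In a real project, Emma's commands would typically help you wrap steps like these with monitoring and quality checks rather than leave them as a bare script.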
### Data Analyst Agent (Riley 📈)
Riley specializes in business intelligence and customer analytics.
```bash
# Enter interactive shell and activate Riley
agentic-data interactive
@data-analyst
# Available commands in Riley's session:
# 📈 Riley: *analyze-data # Perform comprehensive analysis
# 📈 Riley: *segment-customers # Customer segmentation
# 📈 Riley: *create-dashboard # Build interactive dashboards
# 📈 Riley: *define-metrics # Define business metrics
# 📈 Riley: *help # Show all commands
# 📈 Riley: *exit # Leave Riley's session
```
**Common Use Cases:**
- Customer segmentation analysis
- RFM (Recency, Frequency, Monetary) analysis (sketched in code after this list)
- Business intelligence dashboards
- Data visualization design
- Insight generation
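As a concrete illustration of the RFM use case, the pandas sketch below scores customers on recency, frequency, and monetary value. It is not the framework's own implementation; the inline data and column names are assumptions, and real projects usually score the generated sample data with quintiles rather than the terciles used here for brevity.
```python
import pandas as pd

# Hypothetical orders; in the example project these would come from the generated sample data.
orders = pd.DataFrame({
    "customer_id": ["C1", "C1", "C2", "C3", "C3", "C3"],
    "order_date": pd.to_datetime(
        ["2024-01-05", "2024-03-01", "2024-02-10", "2024-01-20", "2024-02-15", "2024-03-10"]
    ),
    "amount": [120.0, 80.0, 45.0, 200.0, 60.0, 90.0],
})

# Reference date: one day after the most recent order.
snapshot = orders["order_date"].max() + pd.Timedelta(days=1)

rfm = orders.groupby("customer_id").agg(
    recency=("order_date", lambda d: (snapshot - d.max()).days),
    frequency=("order_date", "count"),
    monetary=("amount", "sum"),
)

# Score each dimension into terciles (1 = weakest, 3 = strongest).
rfm["r_score"] = pd.qcut(rfm["recency"], 3, labels=[3, 2, 1]).astype(int)
rfm["f_score"] = pd.qcut(rfm["frequency"].rank(method="first"), 3, labels=[1, 2, 3]).astype(int)
rfm["m_score"] = pd.qcut(rfm["monetary"], 3, labels=[1, 2, 3]).astype(int)
rfm["rfm_total"] = rfm[["r_score", "f_score", "m_score"]].sum(axis=1)

print(rfm.sort_values("rfm_total", ascending=False))
```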
### Data Product Manager Agent (Morgan 📊)
Morgan handles project coordination and stakeholder management.
```bash
# Enter interactive shell and activate Morgan
agentic-data interactive
@data-product-manager
# Available commands in Morgan's session:
# 📊 Morgan: *gather-requirements # Stakeholder requirements
# 📊 Morgan: *create-data-contract # Create data contracts
# 📊 Morgan: *define-metrics # Success metrics
# 📊 Morgan: *help # Show all commands
# 📊 Morgan: *exit # Leave Morgan's session
```
**Common Use Cases:**
- Business requirements gathering
- Project planning and coordination
- Stakeholder communication
- Value mapping and ROI planning
- Data contract creation
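Data contracts created with Morgan are documents (see `data-contract-tmpl`), but the core idea can be sketched in code. The snippet below is illustrative only; the dataset name, columns, and freshness field are hypothetical.
```python
# Hypothetical contract for an orders feed; the data-contract-tmpl template covers
# the same ideas (ownership, schema, freshness) in document form.
CONTRACT = {
    "dataset": "orders",
    "owner": "analytics-team",
    "columns": {"order_id": str, "customer_id": str, "amount": float},
    "freshness_hours": 24,
}

def check_record(record):
    """Return a list of contract violations for a single record."""
    issues = []
    for column, expected_type in CONTRACT["columns"].items():
        if column not in record:
            issues.append(f"missing column: {column}")
        elif not isinstance(record[column], expected_type):
            issues.append(f"{column} should be {expected_type.__name__}")
    return issues

print(check_record({"order_id": "A-100", "customer_id": "C1", "amount": "not-a-number"}))
# ['amount should be float']
```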
### Data Quality Engineer Agent (Quinn 🔍)
Quinn focuses on the 3-dimensional quality framework.
```bash
# Activate Quinn for quality assurance
agentic-data agent data-quality-engineer
# Available commands in Quinn's session:
# 🔍 Quinn: *validate-data-quality # Comprehensive quality validation
# 🔍 Quinn: *profile-data # Statistical data profiling
# 🔍 Quinn: *setup-quality-monitoring # Quality monitoring setup
# 🔍 Quinn: *help # Show all commands
# 🔍 Quinn: *exit # Leave Quinn's session
```
**Common Use Cases:**
- Data completeness validation
- Data accuracy verification
- Consistency checking across systems
- Quality monitoring setup
- Issue investigation and resolution
## 📊 E-commerce Analytics Example
The included e-commerce analytics example demonstrates real-world interactive agent usage:
### What's Included
- **Customer Segmentation**: RFM analysis with interactive agent guidance
- **Data Quality Validation**: 3-dimensional quality framework via Quinn
- **Business Requirements**: Stakeholder templates via Morgan
- **Sample Data**: Realistic e-commerce dataset generation
### Running the Example with Agents
```bash
# Navigate to the example
cd examples/simple-ecommerce-analytics
# Generate sample data
python sample-data/generate-sample-data.py
# Use interactive shell with agents
agentic-data interactive
# Gather requirements
@data-product-manager
*gather-requirements
# Perform analysis
@data-analyst
*analyze-data
*segment-customers
# Validate quality
@data-quality-engineer
*validate-data-quality
# Exit shell
exit
```
## 🔍 3-Dimensional Quality Framework
### Interactive Quality Validation with Quinn
The community edition provides quality validation through the Data Quality Engineer agent.
```bash
# Activate Quinn for interactive quality validation
agentic-data agent data-quality-engineer
# Available quality commands in Quinn's session:
# 🔍 Quinn: *validate-data-quality # Full 3-dimensional validation
# 🔍 Quinn: *profile-data # Data profiling and statistics
# 🔍 Quinn: *setup-quality-monitoring # Quality monitoring setup
```
### Quality Dimensions
- **Completeness**: Data availability and coverage validation
- **Accuracy**: Format checking and business rule validation
- **Consistency**: Cross-reference validation and uniqueness checks
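For readers who want to see what these three dimensions mean in practice, here is a small illustrative pandas sketch, separate from the framework's own validation scripts; the columns, rules, and inline data are assumptions.
```python
import pandas as pd

# Hypothetical data with deliberate problems: a missing email, a malformed email,
# a duplicated order_id, and a negative amount.
orders = pd.DataFrame({
    "order_id": ["A1", "A2", "A2", "A4"],
    "email": ["a@example.com", None, "bad-address", "c@example.com"],
    "amount": [10.0, 25.0, 25.0, -5.0],
})

# Completeness: share of non-null values across required columns.
completeness = orders[["order_id", "email", "amount"]].notna().mean().mean()

# Accuracy: share of rows passing simple format and business rules.
accuracy = (orders["email"].str.contains("@", na=False) & (orders["amount"] > 0)).mean()

# Consistency: uniqueness of the primary key within the dataset.
consistency = orders["order_id"].nunique() / len(orders)

print(f"completeness={completeness:.2f} accuracy={accuracy:.2f} consistency={consistency:.2f}")
```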
### Command Line Quality Checks
```bash
# Run framework validation on project
agentic-data validate
# Validate specific example project
agentic-data validate --path ./examples/simple-ecommerce-analytics
```
## 📚 Templates Overview
The community edition includes 20 essential templates accessible through agents.
### View Available Templates
```bash
# List all community templates
agentic-data templates list
# Show specific template details
agentic-data templates show data-contract-tmpl
```
### Template Categories
#### Data Engineering Templates
- `data-pipeline-tmpl`: Core pipeline structure
- `infrastructure-tmpl`: Infrastructure setup
- `etl-patterns-tmpl`: ETL workflow patterns
#### Analysis Templates
- `data-analysis-tmpl`: Analysis workflow patterns
- `customer-segmentation-tmpl`: Segmentation methodology
- `dashboard-tmpl`: Dashboard design templates
#### Quality Templates
- `quality-checks-tmpl`: Quality validation framework
- `data-profiling-tmpl`: Data exploration patterns
- `quality-monitoring-tmpl`: Monitoring setup
#### Business Templates
- `business-requirements-tmpl`: Requirements documentation
- `data-contract-tmpl`: Data contracts and specifications
- `stakeholder-engagement-tmpl`: Communication planning
### Using Templates with Agents
```bash
# Activate an agent and create documents
agentic-data agent data-analyst
*create-doc data-analysis-tmpl
```
## 🛠️ Configuration
### Project Configuration
Create a `.adsf-config.yaml` file in your project root:
```yaml
project:
  name: "My Analytics Project"
  version: "1.0.0"
  agents: ["data-engineer", "data-analyst"]
quality:
  dimensions: ["completeness", "accuracy", "consistency"]
  thresholds:
    completeness: 0.95
    accuracy: 0.98
    consistency: 0.90
templates:
  default_format: "yaml"
  output_directory: "./generated"
```
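If you also want to read these settings from your own scripts, a minimal sketch using PyYAML might look like this; it assumes PyYAML is installed and simply mirrors the keys shown above.
```python
import yaml  # pip install pyyaml

with open(".adsf-config.yaml") as f:
    config = yaml.safe_load(f)

thresholds = config["quality"]["thresholds"]
print(f"Project: {config['project']['name']}")

# Compare a measured score against the configured threshold (the score here is hypothetical).
measured_completeness = 0.97
if measured_completeness < thresholds["completeness"]:
    print("Completeness below threshold; investigate before publishing.")
else:
    print("Completeness threshold met.")
```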
### Global Configuration
Configure global settings:
```bash
# Set default project template
agentic-data config set default-template business-requirements-tmpl
# Set quality thresholds
agentic-data config set quality.completeness.threshold 0.95
# View current configuration
agentic-data config list
```
## 🚀 Next Steps
1. **Explore the Example**: Start with the e-commerce analytics example
2. **Try Different Agents**: Experiment with each of the 4 core agents
3. **Generate Templates**: Create templates for your specific use case
4. **Implement Quality Checks**: Set up 3-dimensional quality validation
5. **Build Your First Pipeline**: Create a complete data pipeline
## 🆘 Getting Help
### Command Line Help
```bash
# General help
agentic-data --help
# Agent-specific help
agentic-data agents --help
# Template help
agentic-data templates --help
```
### Documentation
- **User Guide**: `docs/user-guide.md`
- **Core Concepts**: `docs/core-concepts.md`
- **Examples**: `./examples/`
- **API Reference**: `docs/api-reference.md`
### Community Support
- **GitHub Issues**: Report bugs and request features
- **Discussions**: Ask questions and share experiences
- **Contributing**: See `CONTRIBUTING.md` for contribution guidelines
### Enterprise Features
For advanced capabilities, including 4 additional agents, 68 more templates, and a 7-dimensional quality framework:
- **Website**: https://www.agenticdatastack.com
- **Email**: enterprise@agenticdatastack.com
- **Migration Guide**: `docs/migration/enterprise-upgrade.md`
## 📈 Success Tips
1. **Start Small**: Begin with simple use cases and grow gradually
2. **Use Examples**: Learn from the included e-commerce example
3. **Focus on Quality**: Implement quality checks from the beginning
4. **Document Everything**: Use templates to maintain good documentation
5. **Engage Community**: Participate in discussions and share learnings
Welcome to the ADSF Community! Happy data engineering! 🎉