@clduab11/gemini-flow

Revolutionary AI agent swarm coordination platform with Google Services integration, multimedia processing, and production-ready monitoring. Features 8 Google AI services, quantum computing capabilities, and enterprise-grade security.

github.com/claude-ai/gemini-flow

# Configuration Management Guide

## Overview

This document outlines configuration management best practices for Google Services in the Gemini-Flow platform, ensuring consistent, secure, and maintainable configuration across all environments.

## Table of Contents

1. [Configuration Architecture](#configuration-architecture)
2. [Environment Management](#environment-management)
3. [Secret Management](#secret-management)
4. [Configuration Validation](#configuration-validation)
5. [Deployment Strategies](#deployment-strategies)
6. [Rollback Procedures](#rollback-procedures)
7. [Automation Scripts](#automation-scripts)
8. [Best Practices](#best-practices)

## Configuration Architecture

### Configuration Hierarchy

```
config/
├── base/                          # Base configurations
│   ├── service-defaults.yaml      # Default service settings
│   ├── resource-limits.yaml       # Resource constraints
│   └── feature-flags.yaml         # Feature toggles
├── environments/                  # Environment-specific configs
│   ├── development/
│   │   ├── services.yaml          # Dev service configurations
│   │   ├── scaling.yaml           # Dev scaling parameters
│   │   └── monitoring.yaml        # Dev monitoring settings
│   ├── staging/
│   │   ├── services.yaml          # Staging configurations
│   │   ├── scaling.yaml           # Staging scaling parameters
│   │   └── monitoring.yaml        # Staging monitoring settings
│   └── production/
│       ├── services.yaml          # Production configurations
│       ├── scaling.yaml           # Production scaling parameters
│       └── monitoring.yaml        # Production monitoring settings
├── secrets/                       # Secret templates (encrypted)
│   ├── service-accounts.yaml      # Service account templates
│   ├── api-keys.yaml              # API key templates
│   └── certificates.yaml          # Certificate templates
└── schemas/                       # Configuration schemas
    ├── service-schema.json        # Service config validation
    ├── scaling-schema.json        # Scaling config validation
    └── monitoring-schema.json     # Monitoring config validation
```

### Configuration Sources Priority

When the same setting is defined in more than one source, the higher-priority source wins:

1. **Command Line Arguments** (highest priority)
2. **Environment Variables**
3. **Configuration Files**
4. **Default Values** (lowest priority)

## Environment Management

### Development Environment

```yaml
# config/environments/development/services.yaml
vertex_ai:
  endpoint: "https://us-central1-aiplatform.googleapis.com"
  project_id: "gemini-flow-dev"
  location: "us-central1"
  model_settings:
    default_model: "gemini-2.0-flash"
    temperature: 0.7
    max_tokens: 1000
  rate_limits:
    requests_per_minute: 60
    concurrent_requests: 5

workspace:
  oauth2:
    scopes:
      - "https://www.googleapis.com/auth/drive.readonly"
      - "https://www.googleapis.com/auth/spreadsheets"
  api_version: "v1"
  timeout_seconds: 30

streaming:
  webrtc:
    stun_servers:
      - "stun:stun.l.google.com:19302"
  buffer_size_mb: 10
  compression_enabled: true
```

### Staging Environment

```yaml
# config/environments/staging/services.yaml
vertex_ai:
  endpoint: "https://us-central1-aiplatform.googleapis.com"
  project_id: "gemini-flow-staging"
  location: "us-central1"
  model_settings:
    default_model: "gemini-2.5-pro"
    temperature: 0.5
    max_tokens: 2000
  rate_limits:
    requests_per_minute: 300
    concurrent_requests: 20

workspace:
  oauth2:
    scopes:
      - "https://www.googleapis.com/auth/drive"
      - "https://www.googleapis.com/auth/spreadsheets"
      - "https://www.googleapis.com/auth/documents"
  api_version: "v1"
  timeout_seconds: 60

streaming:
  webrtc:
    stun_servers:
      - "stun:stun.l.google.com:19302"
      - "stun:stun1.l.google.com:19302"
  buffer_size_mb: 50
  compression_enabled: true
  cdn_enabled: true
```

### Production Environment

```yaml
# config/environments/production/services.yaml
vertex_ai:
  endpoint: "https://us-central1-aiplatform.googleapis.com"
  project_id: "gemini-flow-prod"
  location: "us-central1"
  model_settings:
    default_model: "gemini-2.5-pro"
    temperature: 0.3
    max_tokens: 4000
  rate_limits:
    requests_per_minute: 1000
    concurrent_requests: 100
  failover:
    enabled: true
    backup_regions: ["us-east1", "europe-west1"]

workspace:
  oauth2:
    scopes:
      - "https://www.googleapis.com/auth/drive"
      - "https://www.googleapis.com/auth/spreadsheets"
      - "https://www.googleapis.com/auth/documents"
      - "https://www.googleapis.com/auth/presentations"
  api_version: "v1"
  timeout_seconds: 120
  retry_policy:
    max_retries: 3
    backoff_multiplier: 2

streaming:
  webrtc:
    stun_servers:
      - "stun:stun.l.google.com:19302"
      - "stun:stun1.l.google.com:19302"
      - "stun:stun2.l.google.com:19302"
  buffer_size_mb: 100
  compression_enabled: true
  cdn_enabled: true
  edge_locations: ["us", "eu", "asia"]
```

## Secret Management

### Google Cloud Secret Manager Integration

```bash
#!/bin/bash
# scripts/secret-management.sh

# Create secrets in Google Cloud Secret Manager
create_secrets() {
    local env=$1

    # Service Account Key
    gcloud secrets create "gemini-flow-${env}-service-account" \
        --data-file="secrets/${env}/service-account.json"

    # OAuth2 Client Secret
    gcloud secrets create "gemini-flow-${env}-oauth2-secret" \
        --data-file="secrets/${env}/oauth2-client-secret.txt"

    # API Keys
    gcloud secrets create "gemini-flow-${env}-api-keys" \
        --data-file="secrets/${env}/api-keys.json"
}

# Retrieve secrets for deployment
retrieve_secrets() {
    local env=$1
    local output_dir=$2

    mkdir -p "$output_dir"

    # Retrieve service account
    gcloud secrets versions access latest \
        --secret="gemini-flow-${env}-service-account" \
        > "$output_dir/service-account.json"

    # Retrieve OAuth2 secret
    gcloud secrets versions access latest \
        --secret="gemini-flow-${env}-oauth2-secret" \
        > "$output_dir/oauth2-client-secret.txt"

    # Retrieve API keys
    gcloud secrets versions access latest \
        --secret="gemini-flow-${env}-api-keys" \
        > "$output_dir/api-keys.json"
}

# Rotate secrets
rotate_secrets() {
    local env=$1
    local key_file="new-service-account.json"

    echo "Rotating secrets for $env environment..."

    # Generate new service account key
    # NOTE: "project" in the account address is a placeholder for the real project ID
    gcloud iam service-accounts keys create "$key_file" \
        --iam-account="gemini-flow-${env}@project.iam.gserviceaccount.com"

    # Update secret
    gcloud secrets versions add "gemini-flow-${env}-service-account" \
        --data-file="$key_file"

    # Do not leave the plain-text key on disk after it has been uploaded
    rm -f "$key_file"

    echo "Secret rotation completed for $env"
}
```

### Kubernetes Secret Management

```yaml
# k8s/secrets.yaml
# The {{ ... }} expressions are Helm template syntax, so this file must be
# rendered by Helm before it is applied.
apiVersion: v1
kind: Secret
metadata:
  name: gemini-flow-secrets
  namespace: gemini-flow
type: Opaque
data:
  service-account.json: |
    {{ .Files.Get "secrets/service-account.json" | b64enc }}
  oauth2-client-secret: |
    {{ .Files.Get "secrets/oauth2-client-secret.txt" | b64enc }}
stringData:
  vertex-ai-endpoint: "https://us-central1-aiplatform.googleapis.com"
  project-id: "{{ .Values.projectId }}"
---
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: gcpsm-secret-store
  namespace: gemini-flow
spec:
  provider:
    gcpsm:
      projectId: "{{ .Values.projectId }}"
      auth:
        workloadIdentity:
          clusterLocation: us-central1
          clusterName: gemini-flow-cluster
          serviceAccountRef:
            name: external-secrets-sa
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: gemini-flow-external-secret
  namespace: gemini-flow
spec:
  # A 15s interval polls Secret Manager very frequently; a longer interval
  # may be more appropriate outside of testing.
  refreshInterval: 15s
  secretStoreRef:
    name: gcpsm-secret-store
    kind: SecretStore
  target:
    name: gemini-flow-secrets-external
    creationPolicy: Owner
  data:
    - secretKey: service-account.json
      remoteRef:
        key: gemini-flow-prod-service-account
    - secretKey: oauth2-client-secret
      remoteRef:
        key: gemini-flow-prod-oauth2-secret
```

## Configuration Validation

### Schema Validation

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "Gemini-Flow Service Configuration",
  "type": "object",
  "properties": {
    "vertex_ai": {
      "type": "object",
      "properties": {
        "endpoint": {
          "type": "string",
          "format": "uri",
          "pattern": "^https://.*aiplatform\\.googleapis\\.com$"
        },
        "project_id": {
          "type": "string",
          "pattern": "^[a-z][a-z0-9-]{4,28}[a-z0-9]$"
        },
        "location": {
          "type": "string",
          "enum": ["us-central1", "us-east1", "europe-west1", "asia-southeast1"]
        },
        "model_settings": {
          "type": "object",
          "properties": {
            "default_model": {
              "type": "string",
              "enum": ["gemini-2.0-flash", "gemini-2.5-pro", "gemini-2.5-flash"]
            },
            "temperature": {
              "type": "number",
              "minimum": 0,
              "maximum": 2
            },
            "max_tokens": {
              "type": "integer",
              "minimum": 1,
              "maximum": 1000000
            }
          },
          "required": ["default_model", "temperature", "max_tokens"]
        }
      },
      "required": ["endpoint", "project_id", "location", "model_settings"]
    }
  },
  "required": ["vertex_ai"]
}
```

### Configuration Validation Script

```bash
#!/bin/bash
# scripts/validate-config.sh
#
# The functions return non-zero on failure instead of calling exit, so that
# callers such as rollback_config can react to a failed validation.

validate_config() {
    local env=$1
    local config_file="config/environments/$env/services.yaml"
    local schema_file="config/schemas/service-schema.json"

    echo "Validating configuration for $env environment..."

    # Convert YAML to JSON for validation
    yq eval -o=json "$config_file" > "/tmp/config-$env.json" || return 1

    # Validate against schema. The schema uses "format": "uri", which ajv only
    # recognizes when the ajv-formats plugin is installed and loaded with -c.
    if npx ajv-cli validate -c ajv-formats -s "$schema_file" -d "/tmp/config-$env.json"; then
        echo "✅ Configuration validation passed for $env"
    else
        echo "❌ Configuration validation failed for $env"
        return 1
    fi

    # Additional custom validations
    validate_project_access "$env" || return 1
    validate_service_accounts "$env" || return 1
    validate_quotas "$env"
}

validate_project_access() {
    local env=$1
    local project_id
    project_id=$(yq eval ".vertex_ai.project_id" "config/environments/$env/services.yaml")

    if gcloud projects describe "$project_id" > /dev/null 2>&1; then
        echo "✅ Project access verified for $project_id"
    else
        echo "❌ Cannot access project $project_id"
        return 1
    fi
}

validate_service_accounts() {
    local env=$1
    local project_id
    project_id=$(yq eval ".vertex_ai.project_id" "config/environments/$env/services.yaml")

    if gcloud iam service-accounts describe "gemini-flow-${env}@${project_id}.iam.gserviceaccount.com" > /dev/null 2>&1; then
        echo "✅ Service account verified for $env"
    else
        echo "❌ Service account not found for $env"
        return 1
    fi
}

validate_quotas() {
    local env=$1
    local project_id location quota_usage
    project_id=$(yq eval ".vertex_ai.project_id" "config/environments/$env/services.yaml")
    location=$(yq eval ".vertex_ai.location" "config/environments/$env/services.yaml")

    # Check Vertex AI quotas
    # NOTE: confirm that this gcloud command is available in your SDK version.
    quota_usage=$(gcloud ai quotas list --project="$project_id" --location="$location" --format="value(usage)" --filter="metric:aiplatform.googleapis.com/predict_requests")

    # Guard against empty or non-numeric output before the numeric comparison
    if [[ ! "$quota_usage" =~ ^[0-9]+$ ]]; then
        echo "⚠️ Could not determine quota usage for $env"
    elif [[ $quota_usage -lt 1000 ]]; then
        echo "✅ Quota availability verified for $env"
    else
        echo "⚠️ High quota usage detected for $env: $quota_usage"
    fi
}

# Run the validation only when the script is executed directly (as the CI
# workflow does), not when it is sourced by another script.
if [[ "${BASH_SOURCE[0]}" == "$0" ]]; then
    validate_config "${1:?usage: validate-config.sh <environment>}"
fi
```

## Deployment Strategies

### Blue-Green Deployment

```bash
#!/bin/bash
# scripts/blue-green-deploy.sh
#
# update_config and health_check are not shown here; they are expected to be
# defined elsewhere in the deployment scripts.

deploy_blue_green() {
    local env=$1
    local new_version=$2
    local current_slot
    current_slot=$(kubectl get service gemini-flow-service -o jsonpath='{.spec.selector.slot}')
    local target_slot="green"

    if [[ "$current_slot" == "green" ]]; then
        target_slot="blue"
    fi

    echo "Deploying $new_version to $target_slot slot in $env environment"

    # Update configuration
    update_config "$env" "$target_slot" "$new_version"

    # Deploy to target slot
    kubectl set image "deployment/gemini-flow-$target_slot" \
        "app=gemini-flow:$new_version"

    # Wait for deployment
    kubectl rollout status "deployment/gemini-flow-$target_slot" --timeout=600s

    # Health check
    if health_check "$target_slot"; then
        echo "Health check passed, switching traffic"
        switch_traffic "$target_slot"
    else
        echo "Health check failed, rolling back"
        kubectl rollout undo "deployment/gemini-flow-$target_slot"
        exit 1
    fi
}

switch_traffic() {
    local target_slot=$1

    # Update service selector
    kubectl patch service gemini-flow-service \
        -p "{\"spec\":{\"selector\":{\"slot\":\"${target_slot}\"}}}"

    echo "Traffic switched to $target_slot slot"
}
```

### Canary Deployment

```bash
#!/bin/bash
# scripts/canary-deploy.sh
#
# get_error_rate_for_version, get_latency_p95_for_version, rollback_canary and
# promote_canary are not shown here; they are expected to be defined elsewhere.

deploy_canary() {
    local env=$1
    local new_version=$2
    local canary_percentage=${3:-10}

    echo "Starting canary deployment: $canary_percentage% traffic to $new_version"

    # Deploy canary version
    kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gemini-flow-canary
  namespace: gemini-flow
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gemini-flow
      version: canary
  template:
    metadata:
      labels:
        app: gemini-flow
        version: canary
    spec:
      containers:
      - name: gemini-flow
        image: gemini-flow:$new_version
        envFrom:
        - configMapRef:
            name: gemini-flow-config-$env
        - secretRef:
            name: gemini-flow-secrets
EOF

    # Update Istio VirtualService for traffic splitting.
    # The "stable" and "canary" subsets must be declared in an Istio
    # DestinationRule, which is not shown here.
    kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: gemini-flow-vs
  namespace: gemini-flow
spec:
  hosts:
  - gemini-flow-service
  http:
  - match:
    - headers:
        canary:
          exact: "true"
    route:
    - destination:
        host: gemini-flow-service
        subset: canary
  - route:
    - destination:
        host: gemini-flow-service
        subset: stable
      weight: $((100 - canary_percentage))
    - destination:
        host: gemini-flow-service
        subset: canary
      weight: $canary_percentage
EOF

    # Monitor canary deployment
    monitor_canary "$new_version" "$canary_percentage"
}

monitor_canary() {
    local version=$1
    local percentage=$2
    local duration=300  # 5 minutes
    local interval=30
    local error_rate latency_p95

    echo "Monitoring canary deployment for $duration seconds"

    for ((i=0; i<duration; i+=interval)); do
        # Check error rate and latency
        error_rate=$(get_error_rate_for_version "$version")
        latency_p95=$(get_latency_p95_for_version "$version")

        echo "Canary metrics - Error rate: $error_rate%, P95 latency: ${latency_p95}ms"

        if (( $(echo "$error_rate > 1.0" | bc -l) )); then
            echo "❌ Canary error rate too high, rolling back"
            rollback_canary
            exit 1
        fi

        # Use bc here as well so that fractional latency values are handled
        if (( $(echo "$latency_p95 > 1000" | bc -l) )); then
            echo "❌ Canary latency too high, rolling back"
            rollback_canary
            exit 1
        fi

        sleep "$interval"
    done

    echo "✅ Canary monitoring completed successfully"
    promote_canary "$version"
}
```

## Rollback Procedures

### Configuration Rollback

```bash
#!/bin/bash
# scripts/config-rollback.sh
#
# validate_config comes from scripts/validate-config.sh and must be available
# (for example by sourcing that script) before rollback_config is called.
# restore_backup_config is not shown here.

rollback_config() {
    local env=$1
    local target_version=${2:-"previous"}

    echo "Rolling back configuration for $env to $target_version"

    # Get previous configuration version from Git: the second most recent
    # commit that touched this environment's configuration
    if [[ "$target_version" == "previous" ]]; then
        target_version=$(git log -n 2 --format="%H" -- "config/environments/$env/" | tail -1)
    fi

    # Create backup of current config
    backup_current_config "$env"

    # Checkout previous configuration
    git checkout "$target_version" -- "config/environments/$env/"

    # Validate rollback configuration
    if validate_config "$env"; then
        echo "✅ Configuration rollback validation passed"
        apply_config "$env"
    else
        echo "❌ Configuration rollback validation failed"
        restore_backup_config "$env"
        exit 1
    fi
}

backup_current_config() {
    local env=$1
    local backup_dir
    backup_dir="backups/config/$(date +%Y%m%d_%H%M%S)_$env"

    mkdir -p "$backup_dir"
    cp -r "config/environments/$env/" "$backup_dir/"

    echo "Current configuration backed up to $backup_dir"
}

apply_config() {
    local env=$1

    # Update Kubernetes ConfigMaps
    kubectl create configmap "gemini-flow-config-$env" \
        --from-file="config/environments/$env/" \
        --dry-run=client -o yaml | kubectl apply -f -

    # Restart deployments to pick up new config
    kubectl rollout restart deployment/gemini-flow-api
    kubectl rollout restart deployment/gemini-flow-agents

    # Wait for rollout completion
    kubectl rollout status deployment/gemini-flow-api --timeout=300s
    kubectl rollout status deployment/gemini-flow-agents --timeout=300s
}
```

## Automation Scripts

### Configuration Deployment Pipeline

```yaml
# .github/workflows/config-deployment.yml
name: Configuration Deployment

on:
  push:
    paths:
      - 'config/**'
    branches:
      - main
      - staging
      - development

jobs:
  validate-config:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '18'

      - name: Install dependencies
        # The scripts use mikefarah/yq v4 syntax (`yq eval`); install it
        # separately if the runner image does not already provide it.
        run: |
          npm install -g ajv-cli ajv-formats

      - name: Validate configuration schemas
        run: |
          for env in development staging production; do
            if [[ -f "config/environments/$env/services.yaml" ]]; then
              echo "Validating $env configuration..."
              scripts/validate-config.sh "$env"
            fi
          done

      - name: Security scan
        run: |
          # Check for exposed secrets
          scripts/scan-secrets.sh config/

  deploy-config:
    needs: validate-config
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    steps:
      - uses: actions/checkout@v4

      # setup-gcloud v1 and later no longer accept a service account key;
      # authentication is handled by the separate auth action.
      - name: Authenticate to Google Cloud
        uses: google-github-actions/auth@v2
        with:
          credentials_json: ${{ secrets.GCP_SA_KEY }}

      - name: Setup Google Cloud CLI
        uses: google-github-actions/setup-gcloud@v2
        with:
          project_id: ${{ secrets.GCP_PROJECT_ID }}

      - name: Deploy to staging
        run: |
          scripts/deploy-config.sh staging

      - name: Run integration tests
        run: |
          scripts/run-integration-tests.sh staging

      - name: Deploy to production
        run: |
          scripts/deploy-config.sh production
```

### Configuration Monitoring

```bash
#!/bin/bash
# scripts/config-monitor.sh
#
# create_incident_ticket is not shown here; SLACK_WEBHOOK_URL must be set in
# the environment.

monitor_config_drift() {
    local env=$1

    echo "Monitoring configuration drift for $env environment"

    # Get current config from Kubernetes. The ConfigMap is created with
    # --from-file, so each key is a file name and each value is that file's
    # raw content; extract services.yaml and convert it to JSON.
    kubectl get configmap "gemini-flow-config-$env" \
        -o jsonpath='{.data.services\.yaml}' \
        | yq eval -o=json - > "/tmp/k8s-config-$env.json"

    # Get expected config from Git
    yq eval -o=json "config/environments/$env/services.yaml" > "/tmp/git-config-$env.json"

    # Compare configurations
    if diff "/tmp/k8s-config-$env.json" "/tmp/git-config-$env.json" > "/tmp/config-diff-$env.txt"; then
        echo "✅ No configuration drift detected for $env"
    else
        echo "⚠️ Configuration drift detected for $env"
        cat "/tmp/config-diff-$env.txt"

        # Send alert
        send_drift_alert "$env" "/tmp/config-diff-$env.txt"
    fi
}

send_drift_alert() {
    local env=$1
    local diff_file=$2

    # Send Slack notification
    curl -X POST -H 'Content-type: application/json' \
        --data "{\"text\":\"🚨 Configuration drift detected in $env environment. Check the logs for details.\"}" \
        "$SLACK_WEBHOOK_URL"

    # Create incident ticket
    create_incident_ticket "Configuration Drift" "$env" "$diff_file"
}
```

## Best Practices

### 1. Configuration Versioning

- Store all configurations in Git
- Use semantic versioning for configuration releases
- Tag configuration versions with deployment information

### 2. Environment Isolation

- Separate configurations by environment
- Use different Google Cloud projects for isolation
- Implement proper IAM boundaries

### 3. Secret Security

- Never store secrets in plain text
- Use Google Cloud Secret Manager for production
- Rotate secrets regularly
- Implement least privilege access

### 4. Validation and Testing

- Validate all configurations before deployment
- Test configuration changes in staging first
- Implement automated configuration testing

### 5. Monitoring and Alerting

- Monitor for configuration drift
- Alert on configuration deployment failures
- Track configuration change impact on system metrics

---

**Document Owner**: SRE Team
**Last Updated**: August 14, 2025
**Next Review**: November 14, 2025
**Version**: 1.0
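
### Example: Resolving a Setting by Source Priority

The order given under Configuration Sources Priority (command-line arguments, then environment variables, then configuration files, then default values) can be sketched in Bash as shown below. This is only an illustration under stated assumptions, not the platform's actual loader: `resolve_setting` is a hypothetical helper, the sample values are made up, and the file value is assumed to have been read from `services.yaml` beforehand (for example with `yq`).

```bash
#!/bin/bash
# Hypothetical helper (not part of Gemini-Flow): print the first non-empty
# argument. Pass the candidates in priority order:
#   1 = command-line value, 2 = environment value, 3 = file value, 4 = default
resolve_setting() {
    local candidate
    for candidate in "$@"; do
        if [[ -n "$candidate" ]]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1  # no source supplied a value
}

# Illustrative inputs: no command-line flag, an environment override,
# a value from the configuration file, and a built-in default.
cli_location=""
env_location="europe-west1"
file_location="us-central1"

# The environment value wins because the command-line value is empty.
resolve_setting "$cli_location" "$env_location" "$file_location" "us-central1"
# prints "europe-west1"
```

Keeping the precedence in a single function means every setting is resolved the same way, and a missing value in all four sources is reported through the non-zero return code rather than silently becoming an empty string.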