# Validation Examples

This document provides comprehensive examples of all validation types supported by the Kubeasy CLI validation system.

## Table of Contents

1. [Condition Validation](#condition-validation)
2. [Status Validation](#status-validation)
3. [Advanced Field Path Syntax](#advanced-field-path-syntax)
4. [Log Validation](#log-validation)
5. [Event Validation](#event-validation)
6. [Connectivity Validation](#connectivity-validation)
7. [Complete Challenge Example](#complete-challenge-example)
8. [Metrics Validation (REMOVED)](#metrics-validation-removed)
9. [Best Practices](#best-practices)
10. [Troubleshooting](#troubleshooting)
11. [Reference](#reference)

---

## Condition Validation

Checks Kubernetes resource conditions (e.g., Pod Ready, ContainersReady). This is a shorthand for common condition checks.

### Basic Pod Ready Check

```yaml
validations:
  - key: pod-ready
    title: "Pod Ready"
    description: "The application pod must be in Ready state"
    order: 1
    type: condition
    spec:
      target:
        kind: Pod
        labelSelector:
          app: my-application
      checks:
        - type: Ready
          status: "True"
```

**When to use**: Verify that pods are running and passing readiness probes.

**What it checks**:
- Finds all pods matching the label selector
- Checks if each pod has a `Ready` condition with status `True`
- All matching pods must meet the condition

---

### Multiple Conditions

```yaml
validations:
  - key: pod-healthy
    title: "Pod Healthy"
    description: "Pod must be both Ready and Initialized"
    order: 1
    type: condition
    spec:
      target:
        kind: Pod
        labelSelector:
          app: database
      checks:
        - type: Ready
          status: "True"
        - type: Initialized
          status: "True"
```

**When to use**: Verify multiple aspects of pod health.
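
The rule described under "What it checks" (every matching pod must report every listed condition) can be illustrated with a minimal Python sketch. It is not the CLI's code: the function names and the plain-dict pod shape are hypothetical, and treating an empty match as a failure is an assumption.

```python
def pod_meets_conditions(pod, checks):
    """Return True if the pod reports every required condition with the required status."""
    conditions = pod.get("status", {}).get("conditions", [])
    actual = {c["type"]: c["status"] for c in conditions}
    return all(actual.get(check["type"]) == check["status"] for check in checks)


def validate_condition(pods, checks):
    """All matching pods must satisfy all checks."""
    if not pods:
        # Assumption: no matching pods counts as a failed validation.
        return False
    return all(pod_meets_conditions(pod, checks) for pod in pods)


# Hypothetical pods, reduced to the fields this sketch reads.
ready_pod = {"status": {"conditions": [
    {"type": "Initialized", "status": "True"},
    {"type": "Ready", "status": "True"},
]}}
starting_pod = {"status": {"conditions": [
    {"type": "Initialized", "status": "True"},
    {"type": "Ready", "status": "False"},
]}}

checks = [{"type": "Ready", "status": "True"}, {"type": "Initialized", "status": "True"}]
print(validate_condition([ready_pod], checks))
print(validate_condition([ready_pod, starting_pod], checks))
```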

---

### Common Condition Types

| Resource | Condition Types |
|----------|-----------------|
| Pod | `Ready`, `ContainersReady`, `Initialized`, `PodScheduled` |
| Deployment | `Available`, `Progressing`, `ReplicaFailure` |
| StatefulSet | `Ready` |
| Job | `Complete`, `Failed` |

---

## Status Validation

Validates arbitrary status fields using operators. Use this for numeric comparisons, string values, or any status field access.

### Replica Count

```yaml
validations:
  - key: scaled-replicas
    title: "Scaled to 3 Replicas"
    description: "Deployment must have exactly 3 ready replicas"
    order: 1
    type: status
    spec:
      target:
        kind: Deployment
        name: web-app
      checks:
        - field: readyReplicas
          operator: "=="
          value: 3
        - field: availableReplicas
          operator: ">="
          value: 3
```

**When to use**: Verify horizontal scaling has been applied.

**Available operators**: `==`, `!=`, `>`, `<`, `>=`, `<=`

**Note**: Field paths are relative to `status` (no prefix needed).

---

### Restart Count with Array Access

```yaml
validations:
  - key: low-restarts
    title: "Low Restart Count"
    description: "Pod must have fewer than 3 restarts"
    order: 1
    type: status
    spec:
      target:
        kind: Pod
        labelSelector:
          app: stable-app
      checks:
        - field: containerStatuses[0].restartCount
          operator: "<"
          value: 3
```

**When to use**: Verify pod stability over time.

**Field path syntax**:
- Simple field: `readyReplicas`
- Array index: `containerStatuses[0].restartCount`
- Array filter: `conditions[type=Ready].status`

---

### Condition via Status (Advanced)

```yaml
validations:
  - key: deployment-available
    title: "Deployment Available"
    description: "The deployment must be available"
    order: 1
    type: status
    spec:
      target:
        kind: Deployment
        name: web-app
      checks:
        - field: conditions[type=Available].status
          operator: "=="
          value: "True"
```

**When to use**: Check conditions on any resource type using the flexible status validation.

**Tip**: For simple condition checks, prefer the `condition` type.
Use the `status` type when you need operators or complex field paths.

---

### StatefulSet Replicas

```yaml
validations:
  - key: statefulset-ready
    title: "StatefulSet Ready"
    description: "All StatefulSet replicas must be ready"
    order: 1
    type: status
    spec:
      target:
        kind: StatefulSet
        name: database
      checks:
        - field: readyReplicas
          operator: "=="
          value: 3
        - field: currentReplicas
          operator: "=="
          value: 3
```

**When to use**: Verify stateful applications are fully deployed.

---

### Boolean and String Fields

```yaml
validations:
  - key: phase-running
    title: "Pod Running"
    description: "Pod must be in Running phase"
    order: 1
    type: status
    spec:
      target:
        kind: Pod
        name: my-pod
      checks:
        - field: phase
          operator: "=="
          value: "Running"
```

**Supported value types**: string, integer, boolean, float

---

## Advanced Field Path Syntax

The `status` validation type supports advanced field path syntax for accessing nested fields, arrays, and filtering.

### Simple Field Access

Access direct fields on the status object:

```yaml
checks:
  - field: phase
    operator: "=="
    value: "Running"

  - field: readyReplicas
    operator: ">="
    value: 3
```

**Note**: Field paths are relative to `status` (no prefix needed). Internally, the system automatically prepends `status.` to your field path.

---

### Nested Field Access

Access fields within nested objects using dot notation:

```yaml
checks:
  - field: loadBalancer.ingress[0].hostname
    operator: "!="
    value: ""
```

---

### Array Index Access

Access specific array elements using `[index]` notation:

```yaml
# Access first container's restart count
checks:
  - field: containerStatuses[0].restartCount
    operator: "<"
    value: 5

  # Access second container's ready state
  - field: containerStatuses[1].ready
    operator: "=="
    value: true
```

**Bounds checking**: The system validates array bounds at runtime and returns a clear error if the index is out of range.
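
To make the path forms concrete (dot notation, `[index]`, and the `[field=value]` filter listed under "Field path syntax"), here is a minimal Python sketch of how such a path could be resolved against a status object and compared with an operator. This is an illustration under assumptions, not the CLI's Go code: `resolve_field`, `evaluate_check`, and the error wording are hypothetical, and the sketch does not handle filter values that contain a dot.

```python
import operator
import re

# One path segment: a field name with an optional [index] or [key=value] selector.
_SEGMENT = re.compile(r"^(?P<name>[A-Za-z_]\w*)(?:\[(?P<sel>[^\]]+)\])?$")

OPERATORS = {
    "==": operator.eq, "!=": operator.ne,
    ">": operator.gt, "<": operator.lt,
    ">=": operator.ge, "<=": operator.le,
}


def resolve_field(status, path):
    """Walk `path` through the `status` dict and return the value found there."""
    current = status
    for segment in path.split("."):
        match = _SEGMENT.match(segment)
        if match is None:
            raise ValueError(f"invalid path segment: {segment!r}")
        name, selector = match.group("name"), match.group("sel")
        if not isinstance(current, dict) or name not in current:
            raise KeyError(f"field {name!r} not found")
        current = current[name]
        if selector is None:
            continue
        if not isinstance(current, list):
            raise TypeError(f"field {name!r} is not an array")
        if "=" in selector:
            # Array filter: pick the first element whose key equals the wanted value.
            key, _, wanted = selector.partition("=")
            matches = [item for item in current if str(item.get(key)) == wanted]
            if not matches:
                raise KeyError(f"no element of {name!r} has {key}={wanted}")
            current = matches[0]
        else:
            # Array index with an explicit bounds check.
            index = int(selector)
            if not 0 <= index < len(current):
                raise IndexError(f"index {index} out of range for {name!r}")
            current = current[index]
    return current


def evaluate_check(status, check):
    """Resolve the field, then apply the check's operator to the expected value."""
    actual = resolve_field(status, check["field"])
    return OPERATORS[check["operator"]](actual, check["value"])


# Hypothetical Pod status, reduced to the fields used in this document's examples.
pod_status = {
    "phase": "Running",
    "containerStatuses": [{"ready": True, "restartCount": 2}],
    "conditions": [
        {"type": "Initialized", "status": "True"},
        {"type": "Ready", "status": "True"},
    ],
}
print(evaluate_check(pod_status, {"field": "containerStatuses[0].restartCount", "operator": "<", "value": 5}))
print(evaluate_check(pod_status, {"field": "conditions[type=Ready].status", "operator": "==", "value": "True"}))
```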

---

### Array Filtering

Filter arrays by field value using `[field=value]` notation:

```yaml
# Find condition by type and check its status
checks:
  - field: conditions[type=Ready].status
    operator: "=="
    value: "True"

  # Check Available condition
  - field: conditions[type=Available].status
    operator: "=="
    value: "True"

  # Check Progressing condition reason
  - field: conditions[type=Progressing].reason
    operator: "=="
    value: "NewReplicaSetAvailable"
```

**When to use**: Array filtering is useful when the array order is not guaranteed, which is common for Kubernetes conditions.

---

### Complex Path Examples

Combining multiple syntax features:

```yaml
validations:
  - key: complex-check
    title: "Complex Status Check"
    type: status
    spec:
      target:
        kind: Pod
        name: my-pod
      checks:
        # First container must be ready
        - field: containerStatuses[0].ready
          operator: "=="
          value: true

        # Ready condition must be True
        - field: conditions[type=Ready].status
          operator: "=="
          value: "True"

        # No restarts on first container
        - field: containerStatuses[0].restartCount
          operator: "=="
          value: 0

        # Pod must be in Running phase
        - field: phase
          operator: "=="
          value: "Running"
```

---

### Supported Value Types

| Type | YAML Example | Operators |
|------|--------------|-----------|
| String | `value: "Running"` | `==`, `!=` |
| Integer | `value: 3` | `==`, `!=`, `>`, `<`, `>=`, `<=` |
| Boolean | `value: true` | `==`, `!=` |
| Float | `value: 0.95` | `==`, `!=`, `>`, `<`, `>=`, `<=` |

**Type coercion**: Integer and float values are coerced for comparison (e.g., `3` equals `3.0`).

---

### Available Operators

| Operator | Description | Supported Types |
|----------|-------------|-----------------|
| `==` | Equal | All types |
| `!=` | Not equal | All types |
| `>` | Greater than | Integer, Float |
| `<` | Less than | Integer, Float |
| `>=` | Greater than or equal | Integer, Float |
| `<=` | Less than or equal | Integer, Float |

---

### Field Validation Errors

Field paths are validated at parse time using Go reflection.
If a field doesn't exist, you'll get a helpful error message:

```
Error: check 0: field "readyReplica" not found in DeploymentStatus
Available fields: availableReplicas, collisionCount, conditions, observedGeneration, readyReplicas, replicas, unavailableReplicas, updatedReplicas
```

**Common mistakes**:
- Typo in field name: `readyReplica` instead of `readyReplicas`
- Wrong case: `ReadyReplicas` instead of `readyReplicas`
- Including `status.` prefix: `status.readyReplicas` instead of `readyReplicas`

---

## Log Validation

Searches container logs for expected strings.

### Basic Log Search

```yaml
validations:
  - key: app-started
    title: "Application Started"
    description: "Application must log startup message"
    order: 1
    type: log
    spec:
      target:
        kind: Pod
        labelSelector:
          app: web-server
      expectedStrings:
        - "Server listening on port 8080"
```

**When to use**: Verify that an application has started successfully.

**What it checks**:
- Fetches logs from all matching pods (last 5 minutes by default)
- Searches for each expected string in logs
- All expected strings must be found in at least one pod's logs

---

### Specific Container Logs

```yaml
validations:
  - key: sidecar-logs
    title: "Sidecar Running"
    description: "Sidecar container must be operational"
    order: 1
    type: log
    spec:
      target:
        kind: Pod
        labelSelector:
          app: multi-container-app
      container: logging-sidecar  # Specify exact container
      expectedStrings:
        - "Log forwarder initialized"
        - "Connected to log aggregator"
```

**When to use**: Check logs from a specific container in a multi-container pod.

---

### Custom Time Window

```yaml
validations:
  - key: recent-activity
    title: "Recent Activity"
    description: "Application must have logged activity in last minute"
    order: 1
    type: log
    spec:
      target:
        kind: Pod
        labelSelector:
          app: worker
      expectedStrings:
        - "Processing job"
      sinceSeconds: 60  # Only check last 60 seconds of logs
```

**When to use**: Verify recent activity or recent configuration changes.
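
The effect of the time window on a log search can be sketched in Python as follows. This is only an illustration for a single container: the `missing_strings` helper and the `(timestamp, line)` input shape are hypothetical, and the sketch assumes plain substring matching.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_SINCE_SECONDS = 300  # the documented default window of 5 minutes


def missing_strings(entries, expected, now, since_seconds=DEFAULT_SINCE_SECONDS):
    """Return the expected strings that do not appear in the recent log lines.

    `entries` is a list of (timestamp, line) tuples for one container.
    """
    cutoff = now - timedelta(seconds=since_seconds)
    recent = "\n".join(line for ts, line in entries if ts >= cutoff)
    return [text for text in expected if text not in recent]


now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
entries = [
    (now - timedelta(seconds=400), "Server listening on port 8080"),
    (now - timedelta(seconds=30), "Processing job 17"),
]

# The job line is 30 seconds old, so a 60-second window still sees it.
print(missing_strings(entries, ["Processing job"], now, since_seconds=60))
# The startup line is 400 seconds old, so the default window misses it.
print(missing_strings(entries, ["Server listening on port 8080"], now))
```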

**Default**: `sinceSeconds: 300` (5 minutes)

---

### Database Connection Check

```yaml
validations:
  - key: db-connected
    title: "Database Connected"
    description: "Application must successfully connect to database"
    order: 1
    type: log
    spec:
      target:
        kind: Pod
        labelSelector:
          app: api-server
      expectedStrings:
        - "Database connection established"
        - "Running migrations"
        - "Migration complete"
```

**When to use**: Verify successful database initialization.

---

## Event Validation

Detects forbidden Kubernetes events (e.g., OOMKilled, Evicted, BackOff).

### OOMKilled Detection

```yaml
validations:
  - key: no-oom
    title: "No OOM Kills"
    description: "Pod must not be killed due to out of memory"
    order: 1
    type: event
    spec:
      target:
        kind: Pod
        labelSelector:
          app: memory-intensive
      forbiddenReasons:
        - OOMKilled
      sinceSeconds: 300
```

**When to use**: Verify that pods have sufficient memory configured.

**What it checks**:
- Lists all events in the namespace
- Filters events for matching pods
- Checks if any events have forbidden reasons
- Only considers events within the time window

---

### Eviction and Scheduling Failures

```yaml
validations:
  - key: pod-stability
    title: "Pod Stability"
    description: "Pod must not be evicted or fail to schedule"
    order: 1
    type: event
    spec:
      target:
        kind: Pod
        labelSelector:
          app: critical-service
      forbiddenReasons:
        - Evicted
        - FailedScheduling
        - FailedMount
      sinceSeconds: 600  # Check last 10 minutes
```

**When to use**: Verify pod stability and resource availability.

---

### Crash Loop Detection

```yaml
validations:
  - key: no-crashes
    title: "No Crash Loops"
    description: "Pod must not be in crash loop backoff"
    order: 1
    type: event
    spec:
      target:
        kind: Pod
        labelSelector:
          app: unstable-app
      forbiddenReasons:
        - BackOff
        - CrashLoopBackOff
```

**When to use**: Verify that application starts successfully without crashes.
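
The filtering steps listed under "What it checks" for events can be sketched in Python as below. The event dictionaries use a simplified, hypothetical shape (`pod`, `reason`, `timestamp`) rather than the real Kubernetes Event object, and `find_forbidden_events` is an illustrative name, not part of the CLI.

```python
from datetime import datetime, timedelta, timezone


def find_forbidden_events(events, pod_names, forbidden_reasons, now, since_seconds=300):
    """Return events for the matching pods that carry a forbidden reason
    and occurred within the last `since_seconds` seconds."""
    cutoff = now - timedelta(seconds=since_seconds)
    return [
        event for event in events
        if event["pod"] in pod_names
        and event["reason"] in forbidden_reasons
        and event["timestamp"] >= cutoff
    ]


now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
events = [
    {"pod": "api-1", "reason": "BackOff", "timestamp": now - timedelta(seconds=120)},
    {"pod": "api-1", "reason": "Pulled", "timestamp": now - timedelta(seconds=100)},
    {"pod": "api-1", "reason": "BackOff", "timestamp": now - timedelta(seconds=900)},
    {"pod": "other", "reason": "BackOff", "timestamp": now - timedelta(seconds=10)},
]

# The validation would pass only if this list is empty.
violations = find_forbidden_events(events, {"api-1"}, {"BackOff", "CrashLoopBackOff"}, now)
print(len(violations))
```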

---

### Image Pull Failures

```yaml
validations:
  - key: image-pull-success
    title: "Image Pull Success"
    description: "Pod must successfully pull container images"
    order: 1
    type: event
    spec:
      target:
        kind: Pod
        labelSelector:
          app: new-deployment
      forbiddenReasons:
        - Failed
        - ErrImagePull
        - ImagePullBackOff
```

**When to use**: Verify that container images are accessible.

---

## Connectivity Validation

Tests HTTP connectivity between pods.

### Basic Service Connectivity

```yaml
validations:
  - key: service-reachable
    title: "Backend Service Reachable"
    description: "Frontend can reach backend service"
    order: 1
    type: connectivity
    spec:
      sourcePod:
        labelSelector:
          app: frontend
      targets:
        - url: http://backend-service:8080/health
          expectedStatusCode: 200
```

**When to use**: Verify network connectivity and service discovery.

**What it checks**:
- Finds a running pod matching sourcePod selector
- Executes `curl` (or `wget` fallback) from that pod
- Checks HTTP status code matches expected value

---

### Multiple Endpoints

```yaml
validations:
  - key: all-services-reachable
    title: "All Services Reachable"
    description: "Application can reach all dependent services"
    order: 1
    type: connectivity
    spec:
      sourcePod:
        name: app-pod-12345  # Specific pod name
      targets:
        - url: http://database-service:5432
          expectedStatusCode: 200
        - url: http://cache-service:6379
          expectedStatusCode: 200
        - url: http://api-gateway:80/health
          expectedStatusCode: 200
```

**When to use**: Verify all service dependencies are accessible.

---

### Custom Timeout

```yaml
validations:
  - key: slow-service
    title: "Slow Service Responds"
    description: "Service responds within 10 seconds"
    order: 1
    type: connectivity
    spec:
      sourcePod:
        labelSelector:
          app: client
      targets:
        - url: http://slow-service:8080/heavy-operation
          expectedStatusCode: 200
          timeoutSeconds: 10  # Custom timeout
```

**When to use**: Test connectivity to slower services.
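
As a rough illustration of a status-code check with a timeout, the Python sketch below builds a `curl` invocation and compares its output with the expected code. The exact command the CLI runs inside the pod is not documented here, so the flag combination is an assumption (the flags themselves are standard `curl` options), and both function names are hypothetical.

```python
def curl_command(url, timeout_seconds=5):
    """Build a curl invocation that prints only the HTTP status code.

    -s silences progress output, -o /dev/null discards the body,
    -w "%{http_code}" prints the status code, --max-time bounds the request.
    """
    return [
        "curl", "-s", "-o", "/dev/null",
        "-w", "%{http_code}",
        "--max-time", str(timeout_seconds),
        url,
    ]


def status_matches(curl_output, expected_status_code):
    """Compare curl's printed status code with the expected one."""
    try:
        return int(curl_output.strip()) == expected_status_code
    except ValueError:
        # Anything that is not a number counts as a failed check.
        return False


print(curl_command("http://slow-service:8080/heavy-operation", timeout_seconds=10))
print(status_matches("200", 200))
print(status_matches("000", 200))  # curl prints 000 when no response was received
```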

**Default**: `timeoutSeconds: 5`

---

### External Connectivity

```yaml
validations:
  - key: internet-access
    title: "Internet Access"
    description: "Pod can reach external services"
    order: 1
    type: connectivity
    spec:
      sourcePod:
        labelSelector:
          app: worker
      targets:
        - url: https://api.github.com
          expectedStatusCode: 200
```

**When to use**: Verify egress network policies allow external access.

**Note**: Requires pods to have `curl` or `wget` available.

---

### Cross-Namespace Communication

```yaml
validations:
  - key: cross-namespace
    title: "Cross-Namespace Access"
    description: "App can reach service in another namespace"
    order: 1
    type: connectivity
    spec:
      sourcePod:
        labelSelector:
          app: frontend
      targets:
        - url: http://backend.production.svc.cluster.local:8080
          expectedStatusCode: 200
```

**When to use**: Verify multi-namespace service communication.

---

## Complete Challenge Example

Here's a complete `challenge.yaml` with multiple validation types:

```yaml
title: Microservices Deployment
description: |
  Deploy and configure a microservices application with proper resource
  limits, scaling, and network connectivity.
theme: microservices
difficulty: medium
estimated_time: 30
initial_situation: |
  You have three microservices: frontend, backend, and database.
  The deployment is failing due to configuration issues.
objective: |
  1. Fix resource limits to prevent OOM kills
  2. Scale backend to 3 replicas
  3. Ensure all services can communicate
  4. Verify application startup

objectives:
  # 1. Resource Limits Fixed
  - key: no-oom-kills
    title: "No Memory Issues"
    description: "Pods must not be killed due to out of memory"
    order: 1
    type: event
    spec:
      target:
        kind: Pod
        labelSelector:
          tier: backend
      forbiddenReasons:
        - OOMKilled
        - Evicted
      sinceSeconds: 300

  # 2. Scaling Applied
  - key: backend-scaled
    title: "Backend Scaled"
    description: "Backend must have 3 ready replicas"
    order: 2
    type: status
    spec:
      target:
        kind: Deployment
        name: backend
      checks:
        - field: readyReplicas
          operator: "=="
          value: 3
        - field: availableReplicas
          operator: ">="
          value: 3

  # 3. All Pods Ready
  - key: all-pods-ready
    title: "All Pods Ready"
    description: "Frontend, backend, and database pods must be ready"
    order: 3
    type: condition
    spec:
      target:
        kind: Pod
        labelSelector:
          app: microservices
      checks:
        - type: Ready
          status: "True"

  # 4. Application Started
  - key: app-started
    title: "Application Started"
    description: "Backend must log successful startup"
    order: 4
    type: log
    spec:
      target:
        kind: Pod
        labelSelector:
          tier: backend
      expectedStrings:
        - "Server listening on port 8080"
        - "Database connection established"
      sinceSeconds: 300

  # 5. Database Connectivity
  - key: db-connection
    title: "Database Reachable"
    description: "Backend can connect to database"
    order: 5
    type: connectivity
    spec:
      sourcePod:
        labelSelector:
          tier: backend
      targets:
        - url: http://database-service:5432
          expectedStatusCode: 200
          timeoutSeconds: 5

  # 6. Frontend to Backend
  - key: frontend-backend
    title: "Frontend to Backend"
    description: "Frontend can reach backend API"
    order: 6
    type: connectivity
    spec:
      sourcePod:
        labelSelector:
          tier: frontend
      targets:
        - url: http://backend-service:8080/api/health
          expectedStatusCode: 200

  # 7. No Crash Loops
  - key: no-crashes
    title: "No Crash Loops"
    description: "No pods should be in crash loop"
    order: 7
    type: event
    spec:
      target:
        kind: Pod
        labelSelector:
          app: microservices
      forbiddenReasons:
        - BackOff
        - CrashLoopBackOff
      sinceSeconds: 600
```

---

## Metrics Validation (REMOVED)

**Breaking Change**: The `metrics` validation type has been removed in v2.0.0.

### Why Was It Removed?

The `metrics` type was redundant with the enhanced `status` type:
- Both validated status fields with operators
- The `status` type now supports all the same functionality
- Removing `metrics` simplifies the codebase and reduces confusion

### Migration Guide

To migrate from `metrics` to `status`:

1. Change `type: metrics` to `type: status`
2. **Remove** the `status.` prefix from field paths
3. Keep the `checks` array unchanged (same structure)

**Before (v1.x):**
```yaml
type: metrics
spec:
  target:
    kind: Deployment
    name: web-app
  checks:
    - field: status.readyReplicas
      operator: ">="
      value: 3
    - field: status.availableReplicas
      operator: "=="
      value: 3
```

**After (v2.0+):**
```yaml
type: status
spec:
  target:
    kind: Deployment
    name: web-app
  checks:
    - field: readyReplicas  # No "status." prefix!
      operator: ">="
      value: 3
    - field: availableReplicas  # No "status." prefix!
      operator: "=="
      value: 3
```

### Migration Checklist

- [ ] Find all `type: metrics` in your challenge.yaml files
- [ ] Replace `type: metrics` with `type: status`
- [ ] Remove `status.` prefix from all field paths
- [ ] Test your validations with `kubeasy challenge submit`

### Automated Migration

You can use these `sed` commands to help migrate:

```bash
# Note: `sed -i` without a suffix is GNU sed syntax; BSD/macOS sed needs `sed -i ''`

# Replace type: metrics with type: status
sed -i 's/type: metrics/type: status/g' challenge.yaml

# Remove status. prefix from field paths (manual review recommended)
sed -i 's/field: status\./field: /g' challenge.yaml
```

**Note**: Always review changes manually after automated migration.

---

## Best Practices

### 1. Choosing Between Condition and Status

| Use `condition` when | Use `status` when |
|---------------------|-------------------|
| Checking standard K8s conditions | Checking numeric fields (replicas, restarts) |
| Simple Ready/Available checks | Using operators (>, <, >=, <=) |
| Targeting Pods directly | Accessing nested fields with array syntax |

### 2. Ordering Validations

Order validations from most basic to most complex:

```yaml
objectives:
  - order: 1  # Basic: Pods exist and are ready
    type: condition

  - order: 2  # Scaling: Correct replica count
    type: status

  - order: 3  # Application: Logs show startup
    type: log

  - order: 4  # Stability: No crash events
    type: event

  - order: 5  # Advanced: Network connectivity
    type: connectivity
```

### 3. Meaningful Titles

Use titles that describe the success state, not the check:
- Good: "Pod Ready", "Database Connected"
- Bad: "Check Pod Status", "Validate Database"

### 4. Clear Descriptions

Explain what the user should achieve:
```yaml
description: "The application pod must be running and ready to accept traffic"
```

Not what the validation does:
```yaml
description: "Checks if pod has Ready condition set to True"
```

### 5. Label Selectors vs Names

Prefer label selectors for flexibility:
```yaml
# ✅ Good: Works with any matching pod
target:
  kind: Pod
  labelSelector:
    app: my-app

# ⚠️ Less flexible: Requires exact name
target:
  kind: Pod
  name: my-app-abc123
```

### 6. Time Windows

Adjust time windows based on application behavior:
- Fast-starting apps: `sinceSeconds: 60`
- Slow-starting apps: `sinceSeconds: 600`
- Default: `sinceSeconds: 300` (5 minutes)

---

## Troubleshooting

### Common Issues

**"No matching pods found"**
- Check label selectors match pod labels
- Verify namespace is correct
- Ensure pods are created: `kubectl get pods -l app=your-app`

**"Missing strings in logs"**
- Check container name is correct
- Verify application actually logs the expected string
- Increase `sinceSeconds` if app starts slowly

**"Invalid response from URL"**
- Ensure service returns HTTP status codes
- Check service DNS name is correct
- Verify network policies allow connectivity

**"Field not found"**
- Check field path is correct: `readyReplicas` (not `status.readyReplicas`)
- Use `kubectl get deployment -o yaml` to see available fields
- For arrays: use `[0]` for index or `[field=value]` for filtering

> **Note**: For supported Kubernetes resources (Pod, Deployment, StatefulSet, etc.), field paths are validated at parse time using reflection. Invalid field paths will cause an error when loading `challenge.yaml`, with a helpful message listing available fields. This early validation catches typos before runtime.
>
> However, some fields are conditionally populated (e.g., `containerStatuses` only exists after containers start). These fields pass parse-time validation but may return "field not found" at runtime if the resource isn't in the expected state.
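
The parse-time check described in the note can be approximated with a short Python sketch that validates only the first path segment against a known field list. The CLI itself uses Go reflection; here the field list is copied from the example error message in this document, and `check_field_name` is a hypothetical helper.

```python
# Field names taken from the example error message for DeploymentStatus.
DEPLOYMENT_STATUS_FIELDS = [
    "availableReplicas", "collisionCount", "conditions", "observedGeneration",
    "readyReplicas", "replicas", "unavailableReplicas", "updatedReplicas",
]


def check_field_name(field_path, known_fields, type_name):
    """Return None if the first path segment is a known field, else an error message."""
    # Strip everything after the first "." or "[" to get the top-level field name.
    name = field_path.split(".")[0].split("[")[0]
    if name in known_fields:
        return None
    return (
        f'field "{name}" not found in {type_name}\n'
        f"Available fields: {', '.join(sorted(known_fields))}"
    )


for path in ["readyReplicas", "readyReplica", "status.readyReplicas",
             "conditions[type=Available].status"]:
    error = check_field_name(path, DEPLOYMENT_STATUS_FIELDS, "DeploymentStatus")
    print(path, "->", "ok" if error is None else error.splitlines()[0])
```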

---

## Reference

### Validation Types

| Type | Purpose | Common Use Cases |
|------|---------|-----------------|
| `condition` | Check K8s conditions | Pod Ready, Deployment Available |
| `status` | Check any status field | Replica counts, restart counts, phase |
| `log` | Search container logs | Startup messages, error detection |
| `event` | Detect forbidden events | OOMKilled, CrashLoopBackOff |
| `connectivity` | Test HTTP endpoints | Service discovery, network policies |

### Supported Resource Kinds

| Kind | Group | Version | Notes |
|------|-------|---------|-------|
| Pod | core | v1 | Direct pod checks |
| Deployment | apps | v1 | Checks owned pods |
| StatefulSet | apps | v1 | Checks owned pods |
| DaemonSet | apps | v1 | Checks owned pods |
| ReplicaSet | apps | v1 | Checks owned pods |
| Job | batch | v1 | Checks owned pods |
| Service | core | v1 | Connectivity targets only |

### Default Values

| Parameter | Default | Validation Type |
|-----------|---------|-----------------|
| sinceSeconds | 300 (5 min) | log, event |
| timeoutSeconds | 5 | connectivity |
| container | first container | log |

---

For migration from the old CRD-based system, see [MIGRATION.md](../MIGRATION.md).