oneie

Version:

Build apps, websites, and AI agents in English. Zero-interaction setup for AI agents (Claude Code, Cursor, Windsurf). Download to your computer, run in the cloud, deploy to the edge. Open source and free forever.

one.ie

one-ie/one

639 lines (513 loc) • 16.1 kB

Markdown

# Video Backend Plan (Multi-Tenant Database Layer) **Version:** 1.0.0 **Status:** Planning **Created:** 2025-11-08 **Prerequisites:** video-frontend.md must be completed first **Total Cycles:** 24 (backend infrastructure) --- ## Executive Summary Add multi-tenant database layer to support creator uploads, progress tracking, analytics, and AI-powered features. Migrates static content collections to Convex with full 6-dimension ontology compliance. **Architecture:** Convex Backend (Effect.ts services + Convex mutations/queries) **Quick Win:** Cycle 10 - Multi-tenant video database live with groupId scoping **Prerequisite:** Frontend working (see `/one/things/plans/video-frontend.md`) --- ## 6-Dimension Ontology Mapping ### GROUPS - Organization media libraries (groupId scoping) - Course collections (nested groups with parentGroupId) - Playlist containers (group-scoped) ### PEOPLE - Content creators (upload, manage) → `creator` thing type - Viewers (consume, track progress) → `user` thing type - Roles: `org_owner` (full access), `org_user` (view only), `customer` (purchased content) ### THINGS Core video/podcast entities: ```typescript { type: "video", groupId: Id<"groups">, name: "Video Title", properties: { description: string, youtubeId?: string, // For YouTube embeds videoUrl?: string, // For native hosting (Cloudinary/Mux) thumbnail: string, duration: number, // Seconds transcript?: string, // Full transcript text categories: string[], tags: string[], viewCount: number, publishedAt: number, }, status: "draft" | "published" | "archived" } { type: "podcast", groupId: Id<"groups">, name: "Episode Title", properties: { description: string, audioUrl: string, // MP3/M4A file (Cloudinary) thumbnail: string, duration: number, transcript?: string, episodeNumber?: number, seasonNumber?: number, viewCount: number, publishedAt: number, }, status: "draft" | "published" | "archived" } { type: "playlist", groupId: Id<"groups">, name: "Playlist Title", properties: { description: string, playlistType: "video" | "podcast" | "course", itemCount: number, }, status: "active" } { type: "course", groupId: Id<"groups">, name: "Course Title", properties: { description: string, instructor: string, level: "beginner" | "intermediate" | "advanced", estimatedHours: number, totalLessons: number, }, status: "draft" | "published" } { type: "lesson", groupId: Id<"groups">, name: "Lesson Title", properties: { description: string, order: number, videoId?: Id<"things">, // Reference to video thing duration: number, lessonType: "video" | "text" | "quiz", }, status: "draft" | "published" } ``` ### CONNECTIONS ```typescript { type: "contains", fromThingId: playlistId, toThingId: videoId, metadata: { order: 1 } } { type: "authored", fromThingId: creatorId, toThingId: videoId, metadata: { role: "creator" | "producer" | "editor", createdAt: number } } { type: "enrolled_in", fromThingId: userId, toThingId: courseId, metadata: { startedAt: number, progress: 0.65, // 0-1 scale lastAccessedAt: number } } { type: "watched", fromThingId: userId, toThingId: videoId, metadata: { lastPosition: 245, // Seconds completed: false, watchedAt: number } } ``` ### EVENTS ```typescript { type: "content_event", action: "video_viewed", actorId: userId, targetId: videoId, groupId: Id<"groups">, metadata: { duration: number, completionRate: 0.85, watchTime: number, deviceType: "mobile" | "desktop" } } { type: "content_event", action: "video_uploaded", actorId: creatorId, targetId: videoId, groupId: Id<"groups">, metadata: { size: number, // Bytes format: "mp4" | "webm", processingTime: number } } { type: "entity_created", action: "playlist_created", actorId: creatorId, targetId: playlistId, groupId: Id<"groups">, metadata: { itemCount: 0 } } ``` ### KNOWLEDGE - Video transcripts chunked for RAG (500-word chunks, 50-word overlap) - Podcast transcripts with speaker diarization - Auto-generated labels: topics, keywords, sentiment - Vector embeddings for semantic search (text-embedding-3-large, 1536-dim) - Thematic connections between content --- ## ⚡ Quick Wins (Cycles 1-10) **Goal:** Multi-tenant video database live by Cycle 10 ### Cycle 1-3: Schema Design (3 cycles) **Agent:** agent-backend ✓ **Cycle 1:** Update schema.ts - Add video/podcast/playlist/course/lesson thing types to schema - Add watched/contains/enrolled_in connection types - Add content_event event types - Validate against 6-dimension ontology ✓ **Cycle 2:** Add validation schemas - Zod schemas for video/podcast properties - Connection metadata validators - Event metadata validators ✓ **Cycle 3:** Test schema - Create sample video thing - Create sample connections - Create sample events - Verify multi-tenant isolation (groupId) ### Cycle 4-7: Core Mutations (4 cycles) **Agent:** agent-backend ✓ **Cycle 4:** Video mutations - `createVideo` - Add video to library - `updateVideo` - Update metadata - `deleteVideo` - Archive video - All scoped by groupId ✓ **Cycle 5:** Progress tracking mutations - `updateWatchProgress` - Track viewing position - `markVideoComplete` - Mark as watched - Create watched connection + content_event ✓ **Cycle 6:** Playlist mutations - `createPlaylist` - Create video collections - `addToPlaylist` - Add video to playlist (creates contains connection) - `removeFromPlaylist` - Remove video - `reorderPlaylist` - Update order metadata ✓ **Cycle 7:** Course mutations - `createCourse` - Create course thing - `createLesson` - Create lesson thing - `enrollInCourse` - Create enrolled_in connection - `updateCourseProgress` - Track lesson completion ### Cycle 8-10: Core Queries (3 cycles) **Agent:** agent-backend ✓ **Cycle 8:** Video queries - `listVideos` - Get all videos for group (groupId scoped) - `getVideoById` - Single video with metadata - `getCreatorVideos` - Videos by creator (authored connections) ✓ **Cycle 9:** Playlist queries - `listPlaylists` - Get all playlists for group - `getPlaylistVideos` - Videos in playlist order (contains connections) - `getUserPlaylists` - Playlists created by user ✓ **Cycle 10:** Progress queries - `getWatchProgress` - Get user's watch history (watched connections) - `getCourseProgress` - Get enrollment progress (enrolled_in connections) - ✅ **MILESTONE: Multi-tenant video database live** --- ## 📋 Full Plan (24 Cycles) ### Phase 1: Schema (Cycles 1-3) **Agent:** agent-backend **Deliverable:** Database schema with 6-dimension compliance - Cycle 1: Update schema.ts with video types - Cycle 2: Add Zod validation schemas - Cycle 3: Test schema with sample data ### Phase 2: Video Mutations (Cycles 4-7) **Agent:** agent-backend **Deliverable:** Video CRUD operations - Cycle 4: Video mutations (create, update, delete) - Cycle 5: Progress tracking mutations - Cycle 6: Playlist mutations - Cycle 7: Course mutations ### Phase 3: Video Queries (Cycles 8-10) **Agent:** agent-backend **Deliverable:** Video retrieval APIs - Cycle 8: Video queries (list, get, search) - Cycle 9: Playlist queries - Cycle 10: Progress queries - ✅ **MILESTONE: Multi-tenant database live** ### Phase 4: Upload Infrastructure (Cycles 11-14) **Agent:** agent-backend **Deliverable:** File upload to Cloudinary - Cycle 11: Cloudinary integration setup - Install Cloudinary SDK - Configure upload presets - Test video upload - Cycle 12: Video upload mutation - Accept file upload - Upload to Cloudinary - Create video thing with videoUrl - Create entity_created event - Cycle 13: Thumbnail generation - Auto-generate from video - Upload to Cloudinary - Store in properties.thumbnail - Cycle 14: Audio upload mutation - Upload podcast MP3/M4A - Create podcast thing - Create entity_created event ### Phase 5: Transcription (Cycles 15-17) **Agent:** agent-backend **Deliverable:** AI-powered transcripts - Cycle 15: Whisper API integration - Install OpenAI SDK - Configure Whisper API - Test transcription - Cycle 16: Transcription mutation - Extract audio from video - Send to Whisper API - Store in properties.transcript - Cycle 17: Podcast transcription - Transcribe podcast audio - Store in properties.transcript ### Phase 6: Knowledge Extraction (Cycles 18-20) **Agent:** agent-backend **Deliverable:** RAG-ready knowledge chunks - Cycle 18: Chunk transcripts - Split into 500-word chunks (50-word overlap) - Create knowledge things (type: "chunk") - Link to video/podcast via thingKnowledge junction - Cycle 19: Generate embeddings - Use text-embedding-3-large (1536-dim) - Store in knowledge.vector - Batch process for efficiency - Cycle 20: Semantic search query - `searchVideos` - Vector similarity search - Filter by groupId (multi-tenant) - Return ranked results ### Phase 7: Analytics (Cycles 21-23) **Agent:** agent-backend **Deliverable:** Analytics tracking - Cycle 21: View tracking - Create content_event (action: "video_viewed") - Update viewCount in properties - Track device type, completion rate - Cycle 22: Engagement analytics - Track watch time - Calculate completion rates - Identify popular content - Cycle 23: Creator analytics queries - `getCreatorAnalytics` - Total views, watch time - `getVideoAnalytics` - Per-video metrics - `getCourseAnalytics` - Enrollment, completion rates ### Phase 8: Deploy Backend (Cycle 24) **Agent:** agent-ops **Deliverable:** Production backend live - Cycle 24: Production deployment - Deploy to Convex production (shocking-falcon-870.convex.cloud) - Configure Cloudinary production keys - Configure OpenAI API keys - ✅ **LAUNCH: Full backend live** --- ## Technology Stack ### Backend - **Database:** Convex (real-time sync, multi-tenant) - **Business Logic:** Effect.ts services (pure functions) - **Storage:** Cloudinary (video + audio hosting) - **Transcription:** Whisper API (OpenAI) - **Embeddings:** text-embedding-3-large (OpenAI) - **Validation:** Zod ### External Services - **Cloudinary:** Video/audio upload, transformation, CDN streaming - Free tier: 25GB storage, 25GB bandwidth/month - Video transformation (resize, compress, thumbnail generation) - **OpenAI:** Transcription (Whisper) + embeddings - Whisper API: $0.006/minute - Embeddings: $0.13/1M tokens - **Convex:** Real-time database + file storage - Production: shocking-falcon-870.convex.cloud --- ## Migration from Content Collections ### Step 1: Export Existing Videos (Cycle 1 of frontend integration) ```typescript // scripts/migrate-videos-to-convex.ts import { getCollection } from 'astro:content'; const videos = await getCollection('videos'); for (const video of videos) { // Transform to Convex format const videoData = { type: "video", groupId: DEFAULT_GROUP_ID, // Migration group name: video.data.title, properties: { description: video.data.description, youtubeId: video.data.youtubeId, thumbnail: video.data.thumbnail, duration: video.data.duration, categories: video.data.categories, tags: video.data.tags, viewCount: 0, publishedAt: video.data.publishedAt.getTime(), }, status: "published" }; // Create in Convex await createVideo(videoData); } ``` ### Step 2: Update Frontend to Use Convex (Cycle 2 of frontend integration) ```typescript // Before (content collections) import { getCollection } from 'astro:content'; const videos = await getCollection('videos'); // After (Convex) import { useQuery } from 'convex/react'; import { api } from '@/convex/_generated/api'; const videos = useQuery(api.queries.videos.listVideos, { groupId: currentGroupId }); ``` --- ## Success Metrics ### Cycle 10 Milestone ✅ Multi-tenant database live - [ ] Video/podcast things created - [ ] Multi-tenant isolation working (groupId) - [ ] Progress tracking functional - [ ] Playlist management working ### Cycle 14 Milestone ✅ File upload working - [ ] Video upload to Cloudinary - [ ] Thumbnail auto-generation - [ ] Audio upload for podcasts ### Cycle 20 Milestone ✅ AI features live - [ ] Transcription working (Whisper) - [ ] Knowledge chunks created - [ ] Semantic search functional ### Cycle 24 Launch ✅ Full backend in production - [ ] All mutations/queries deployed - [ ] Analytics tracking live - [ ] Cloudinary CDN configured - [ ] Multi-tenant tested --- ## Risk Assessment ### Technical Risks **Medium Risk: Cloudinary Integration** - Mitigation: Well-documented React SDK - Fallback: Convex file storage (1GB free tier) **Low Risk: Transcription** - Whisper API is stable and accurate - Can defer to manual transcripts initially **Low Risk: Vector Search** - Convex supports vector queries natively - OpenAI embeddings are proven ### Business Risks **Medium Risk: Storage Costs** - Cloudinary free tier: 25GB storage, 25GB bandwidth/month - Estimate: $20-50/month for moderate usage - Mitigation: Use YouTube embeds where possible **Low Risk: API Costs** - Whisper: $0.006/minute (~$0.36/hour of video) - Embeddings: $0.13/1M tokens (~$0.01/video) - Estimate: $10-20/month for 100 videos --- ## Dependencies ### Before Starting - [x] Frontend working (video-frontend.md completed) - [x] Convex backend operational - [x] Multi-tenant groups working - [x] Authentication (Better Auth) - [ ] Cloudinary account setup - [ ] OpenAI API key ### Blocking Dependencies - Cycle 1 blocks Cycle 4 (schema before mutations) - Cycle 8 blocks frontend integration (queries needed) - Cycle 11 blocks Cycle 12 (Cloudinary setup before upload) - Cycle 15 blocks Cycle 16 (Whisper setup before transcription) --- ## Frontend Integration (Post-Backend) **See:** Frontend cycles in video-frontend.md should be updated to use Convex **Changes Required:** 1. Replace content collections with Convex queries 2. Add upload UI (drag-and-drop) 3. Add progress tracking UI (resume playback) 4. Add analytics dashboard (creator view) 5. Add semantic search UI (vector search) **Estimated:** 6 additional cycles (frontend integration) --- ## Next Steps ### To Start Execution ```bash # 1. Review this plan cat /Users/toc/Server/ONE/one/things/plans/video-backend.md # 2. Ensure frontend is complete cat /Users/toc/Server/ONE/one/things/plans/video-frontend.md # 3. Start Cycle 1 (schema design) # Agent: agent-backend # 4. Execute cycles sequentially /next # Advance to next cycle /done # Mark current cycle complete ``` ### To Modify Plan ```bash /plan optimize # Reduce cycle count further /plan add-feature [X] # Add new feature to plan /plan skip [N] # Skip cycle N (not applicable) ``` --- ## Appendix: External Service Setup ### Cloudinary Setup ```bash # 1. Create account at cloudinary.com # 2. Get credentials from dashboard CLOUDINARY_CLOUD_NAME=your_cloud_name CLOUDINARY_API_KEY=your_api_key CLOUDINARY_API_SECRET=your_api_secret # 3. Configure upload preset # Settings → Upload → Upload presets # - Preset name: "video_uploads" # - Mode: Unsigned # - Folder: "videos/" # - Resource type: Video # - Format: mp4 # - Transformations: Auto quality, auto format ``` ### OpenAI Setup ```bash # 1. Create account at platform.openai.com # 2. Generate API key OPENAI_API_KEY=sk-... # 3. Enable services # - Whisper API (transcription) # - Embeddings API (text-embedding-3-large) ``` ### Convex Environment Variables ```bash # backend/.env.local CLOUDINARY_CLOUD_NAME=... CLOUDINARY_API_KEY=... CLOUDINARY_API_SECRET=... OPENAI_API_KEY=... ``` --- **Built for scale, powered by the 6-dimension ontology.** **Plan Status:** Ready for execution (after frontend complete) **Next Command:** Complete video-frontend.md first, then start Cycle 1