# Video Backend Plan (Multi-Tenant Database Layer)

**Version:** 1.0.0
**Status:** Planning
**Created:** 2025-11-08
**Prerequisites:** video-frontend.md must be completed first
**Total Cycles:** 24 (backend infrastructure)

---

## Executive Summary

Add a multi-tenant database layer to support creator uploads, progress tracking, analytics, and AI-powered features. The plan migrates the static content collections to Convex with full 6-dimension ontology compliance.

**Architecture:** Convex Backend (Effect.ts services + Convex mutations/queries)
**Quick Win:** Cycle 10 - Multi-tenant video database live with groupId scoping
**Prerequisite:** Frontend working (see `/one/things/plans/video-frontend.md`)

---

## 6-Dimension Ontology Mapping

### GROUPS

- Organization media libraries (groupId scoping)
- Course collections (nested groups with parentGroupId)
- Playlist containers (group-scoped)

### PEOPLE

- Content creators (upload, manage): `creator` thing type
- Viewers (consume, track progress): `user` thing type
- Roles: `org_owner` (full access), `org_user` (view only), `customer` (purchased content)

### THINGS

Core video/podcast entities:

```typescript
{
  type: "video",
  groupId: Id<"groups">,
  name: "Video Title",
  properties: {
    description: string,
    youtubeId?: string,      // For YouTube embeds
    videoUrl?: string,       // For native hosting (Cloudinary/Mux)
    thumbnail: string,
    duration: number,        // Seconds
    transcript?: string,     // Full transcript text
    categories: string[],
    tags: string[],
    viewCount: number,
    publishedAt: number,
  },
  status: "draft" | "published" | "archived"
}

{
  type: "podcast",
  groupId: Id<"groups">,
  name: "Episode Title",
  properties: {
    description: string,
    audioUrl: string,        // MP3/M4A file (Cloudinary)
    thumbnail: string,
    duration: number,
    transcript?: string,
    episodeNumber?: number,
    seasonNumber?: number,
    viewCount: number,
    publishedAt: number,
  },
  status: "draft" | "published" | "archived"
}

{
  type: "playlist",
  groupId: Id<"groups">,
  name: "Playlist Title",
  properties: {
    description: string,
    playlistType: "video" | "podcast" | "course",
    itemCount: number,
  },
  status: "active"
}

{
  type: "course",
  groupId: Id<"groups">,
  name: "Course Title",
  properties: {
    description: string,
    instructor: string,
    level: "beginner" | "intermediate" | "advanced",
    estimatedHours: number,
    totalLessons: number,
  },
  status: "draft" | "published"
}

{
  type: "lesson",
  groupId: Id<"groups">,
  name: "Lesson Title",
  properties: {
    description: string,
    order: number,
    videoId?: Id<"things">,  // Reference to video thing
    duration: number,
    lessonType: "video" | "text" | "quiz",
  },
  status: "draft" | "published"
}
```

### CONNECTIONS

```typescript
{
  type: "contains",
  fromThingId: playlistId,
  toThingId: videoId,
  metadata: { order: 1 }
}

{
  type: "authored",
  fromThingId: creatorId,
  toThingId: videoId,
  metadata: {
    role: "creator" | "producer" | "editor",
    createdAt: number
  }
}

{
  type: "enrolled_in",
  fromThingId: userId,
  toThingId: courseId,
  metadata: {
    startedAt: number,
    progress: 0.65,          // 0-1 scale
    lastAccessedAt: number
  }
}

{
  type: "watched",
  fromThingId: userId,
  toThingId: videoId,
  metadata: {
    lastPosition: 245,       // Seconds
    completed: false,
    watchedAt: number
  }
}
```

### EVENTS

```typescript
{
  type: "content_event",
  action: "video_viewed",
  actorId: userId,
  targetId: videoId,
  groupId: Id<"groups">,
  metadata: {
    duration: number,
    completionRate: 0.85,
    watchTime: number,
    deviceType: "mobile" | "desktop"
  }
}

{
  type: "content_event",
  action: "video_uploaded",
  actorId: creatorId,
  targetId: videoId,
  groupId: Id<"groups">,
  metadata: {
    size: number,            // Bytes
    format: "mp4" | "webm",
    processingTime: number
  }
}

{
  type: "entity_created",
  action: "playlist_created",
  actorId: creatorId,
  targetId: playlistId,
  groupId: Id<"groups">,
  metadata: { itemCount: 0 }
}
```

### KNOWLEDGE

- Video transcripts chunked for RAG (500-word chunks, 50-word overlap)
- Podcast transcripts with speaker diarization
- Auto-generated labels: topics, keywords, sentiment
- Vector embeddings for semantic search (text-embedding-3-large, shortened to 1536 dimensions via the `dimensions` parameter; the model's default is 3072)
- Thematic connections between content

---

## ⚡ Quick Wins (Cycles 1-10)

**Goal:** Multi-tenant video database live by Cycle 10

### Cycles 1-3: Schema Design (3 cycles)

**Agent:** agent-backend

**Cycle 1:** Update schema.ts
- Add video/podcast/playlist/course/lesson thing types to schema
- Add watched/contains/enrolled_in connection types
- Add content_event event types
- Validate against 6-dimension ontology

**Cycle 2:** Add validation schemas
- Zod schemas for video/podcast properties
- Connection metadata validators
- Event metadata validators

**Cycle 3:** Test schema
- Create sample video thing
- Create sample connections
- Create sample events
- Verify multi-tenant isolation (groupId)

### Cycles 4-7: Core Mutations (4 cycles)

**Agent:** agent-backend

**Cycle 4:** Video mutations
- `createVideo` - Add video to library
- `updateVideo` - Update metadata
- `deleteVideo` - Archive video
- All scoped by groupId

**Cycle 5:** Progress tracking mutations
- `updateWatchProgress` - Track viewing position
- `markVideoComplete` - Mark as watched
- Create watched connection + content_event

**Cycle 6:** Playlist mutations
- `createPlaylist` - Create video collections
- `addToPlaylist` - Add video to playlist (creates contains connection)
- `removeFromPlaylist` - Remove video
- `reorderPlaylist` - Update order metadata

**Cycle 7:** Course mutations
- `createCourse` - Create course thing
- `createLesson` - Create lesson thing
- `enrollInCourse` - Create enrolled_in connection
- `updateCourseProgress` - Track lesson completion

### Cycles 8-10: Core Queries (3 cycles)

**Agent:** agent-backend

**Cycle 8:** Video queries
- `listVideos` - Get all videos for group (groupId scoped)
- `getVideoById` - Single video with metadata
- `getCreatorVideos` - Videos by creator (authored connections)

**Cycle 9:** Playlist queries
- `listPlaylists` - Get all playlists for group
- `getPlaylistVideos` - Videos in playlist order (contains connections)
- `getUserPlaylists` - Playlists created by user

**Cycle 10:** Progress queries
- `getWatchProgress` - Get user's watch history (watched connections)
- `getCourseProgress` - Get enrollment progress (enrolled_in connections)
- **MILESTONE: Multi-tenant video database live**

---

## 📋 Full Plan (24 Cycles)

### Phase 1: Schema (Cycles 1-3)

**Agent:** agent-backend
**Deliverable:** Database schema with 6-dimension compliance

- Cycle 1: Update schema.ts with video types
- Cycle 2: Add Zod validation schemas
- Cycle 3: Test schema with sample data

### Phase 2: Video Mutations (Cycles 4-7)

**Agent:** agent-backend
**Deliverable:** Video CRUD operations

- Cycle 4: Video mutations (create, update, delete)
- Cycle 5: Progress tracking mutations
- Cycle 6: Playlist mutations
- Cycle 7: Course mutations

### Phase 3: Video Queries (Cycles 8-10)

**Agent:** agent-backend
**Deliverable:** Video retrieval APIs

- Cycle 8: Video queries (list, get, by creator)
- Cycle 9: Playlist queries
- Cycle 10: Progress queries
- **MILESTONE: Multi-tenant database live**

### Phase 4: Upload Infrastructure (Cycles 11-14)

**Agent:** agent-backend
**Deliverable:** File upload to Cloudinary

- Cycle 11: Cloudinary integration setup
  - Install Cloudinary SDK
  - Configure upload presets
  - Test video upload
- Cycle 12: Video upload mutation
  - Accept file upload
  - Upload to Cloudinary
  - Create video thing with videoUrl
  - Create entity_created event
- Cycle 13: Thumbnail generation
  - Auto-generate from video
  - Upload to Cloudinary
  - Store in properties.thumbnail
- Cycle 14: Audio upload mutation
  - Upload podcast MP3/M4A
  - Create podcast thing
  - Create entity_created event

### Phase 5: Transcription (Cycles 15-17)

**Agent:** agent-backend
**Deliverable:** AI-powered transcripts

- Cycle 15: Whisper API integration
  - Install OpenAI SDK
  - Configure Whisper API
  - Test transcription
- Cycle 16: Transcription mutation
  - Extract audio from video
  - Send to Whisper API
  - Store in properties.transcript
- Cycle 17: Podcast transcription
  - Transcribe podcast audio
  - Store in properties.transcript

### Phase 6: Knowledge Extraction (Cycles 18-20)

**Agent:** agent-backend
**Deliverable:** RAG-ready knowledge chunks

- Cycle 18: Chunk transcripts
  - Split into 500-word chunks (50-word overlap)
  - Create knowledge things (type: "chunk")
  - Link to video/podcast via thingKnowledge junction
- Cycle 19: Generate embeddings
  - Use text-embedding-3-large (request 1536 dimensions via the `dimensions` parameter; the default is 3072)
  - Store in knowledge.vector
  - Batch process for efficiency
- Cycle 20: Semantic search query
  - `searchVideos` - Vector similarity search
  - Filter by groupId (multi-tenant)
  - Return ranked results

### Phase 7: Analytics (Cycles 21-23)

**Agent:** agent-backend
**Deliverable:** Analytics tracking

- Cycle 21: View tracking
  - Create content_event (action: "video_viewed")
  - Update viewCount in properties
  - Track device type, completion rate
- Cycle 22: Engagement analytics
  - Track watch time
  - Calculate completion rates
  - Identify popular content
- Cycle 23: Creator analytics queries
  - `getCreatorAnalytics` - Total views, watch time
  - `getVideoAnalytics` - Per-video metrics
  - `getCourseAnalytics` - Enrollment, completion rates

### Phase 8: Deploy Backend (Cycle 24)

**Agent:** agent-ops
**Deliverable:** Production backend live

- Cycle 24: Production deployment
  - Deploy to Convex production (shocking-falcon-870.convex.cloud)
  - Configure Cloudinary production keys
  - Configure OpenAI API keys
  - **LAUNCH: Full backend live**

---

## Technology Stack

### Backend

- **Database:** Convex (real-time sync, multi-tenant)
- **Business Logic:** Effect.ts services (pure functions)
- **Storage:** Cloudinary (video + audio hosting)
- **Transcription:** Whisper API (OpenAI)
- **Embeddings:** text-embedding-3-large (OpenAI)
- **Validation:** Zod

### External Services

- **Cloudinary:** Video/audio upload, transformation, CDN streaming
  - Free tier: 25GB storage, 25GB bandwidth/month
  - Video transformation (resize, compress, thumbnail generation)
- **OpenAI:** Transcription (Whisper) + embeddings
  - Whisper API: $0.006/minute
  - Embeddings: $0.13/1M tokens
- **Convex:** Real-time database + file storage
  - Production: shocking-falcon-870.convex.cloud

---

## Migration from Content Collections

### Step 1: Export Existing Videos (Cycle 1 of frontend integration)

```typescript
// scripts/migrate-videos-to-convex.ts
import { getCollection } from 'astro:content';

// DEFAULT_GROUP_ID and createVideo are placeholders that must be defined
// elsewhere, e.g. a wrapper around the Cycle 4 `createVideo` mutation.
const videos = await getCollection('videos');

for (const video of videos) {
  // Transform to Convex format
  const videoData = {
    type: "video",
    groupId: DEFAULT_GROUP_ID, // Migration group
    name: video.data.title,
    properties: {
      description: video.data.description,
      youtubeId: video.data.youtubeId,
      thumbnail: video.data.thumbnail,
      duration: video.data.duration,
      categories: video.data.categories,
      tags: video.data.tags,
      viewCount: 0,
      publishedAt: video.data.publishedAt.getTime(),
    },
    status: "published"
  };

  // Create in Convex
  await createVideo(videoData);
}
```

### Step 2: Update Frontend to Use Convex (Cycle 2 of frontend integration)

```typescript
// Before (content collections)
import { getCollection } from 'astro:content';
const videos = await getCollection('videos');

// After (Convex)
import { useQuery } from 'convex/react';
import { api } from '@/convex/_generated/api';

const videos = useQuery(api.queries.videos.listVideos, {
  groupId: currentGroupId
});
```

---

## Success Metrics

### Cycle 10 Milestone

Multi-tenant database live

- [ ] Video/podcast things created
- [ ] Multi-tenant isolation working (groupId)
- [ ] Progress tracking functional
- [ ] Playlist management working

### Cycle 14 Milestone

File upload working

- [ ] Video upload to Cloudinary
- [ ] Thumbnail auto-generation
- [ ] Audio upload for podcasts

### Cycle 20 Milestone

AI features live

- [ ] Transcription working (Whisper)
- [ ] Knowledge chunks created
- [ ] Semantic search functional

### Cycle 24 Launch

Full backend in production

- [ ] All mutations/queries deployed
- [ ] Analytics tracking live
- [ ] Cloudinary CDN configured
- [ ] Multi-tenant tested

---

## Risk Assessment

### Technical Risks

**Medium Risk: Cloudinary Integration**
- Mitigation: Well-documented React SDK
- Fallback: Convex file storage (1GB free tier)

**Low Risk: Transcription**
- Whisper API is stable and accurate
- Can defer to manual transcripts initially

**Low Risk: Vector Search**
- Convex supports vector queries natively
- OpenAI embeddings are proven

### Business Risks

**Medium Risk: Storage Costs**
- Cloudinary free tier: 25GB storage, 25GB bandwidth/month
- Estimate: $20-50/month for moderate usage
- Mitigation: Use YouTube embeds where possible

**Low Risk: API Costs**
- Whisper: $0.006/minute (~$0.36/hour of video)
- Embeddings: $0.13/1M tokens (~$0.01/video)
- Estimate: $10-20/month for 100 videos

---

## Dependencies

### Before Starting

- [x] Frontend working (video-frontend.md completed)
- [x] Convex backend operational
- [x] Multi-tenant groups working
- [x] Authentication (Better Auth)
- [ ] Cloudinary account setup
- [ ] OpenAI API key

### Blocking Dependencies

- Cycle 1 blocks Cycle 4 (schema before mutations)
- Cycle 8 blocks frontend integration (queries needed)
- Cycle 11 blocks Cycle 12 (Cloudinary setup before upload)
- Cycle 15 blocks Cycle 16 (Whisper setup before transcription)

---

## Frontend Integration (Post-Backend)

**See:** video-frontend.md. Its frontend cycles should be updated to use Convex.

**Changes Required:**

1. Replace content collections with Convex queries
2. Add upload UI (drag-and-drop)
3. Add progress tracking UI (resume playback)
4. Add analytics dashboard (creator view)
5. Add semantic search UI (vector search)

**Estimated:** 6 additional cycles (frontend integration)

---

## Next Steps

### To Start Execution

```bash
# 1. Review this plan
cat /Users/toc/Server/ONE/one/things/plans/video-backend.md

# 2. Ensure frontend is complete
cat /Users/toc/Server/ONE/one/things/plans/video-frontend.md

# 3. Start Cycle 1 (schema design)
# Agent: agent-backend

# 4. Execute cycles sequentially
/next  # Advance to next cycle
/done  # Mark current cycle complete
```

### To Modify Plan

```bash
/plan optimize         # Reduce cycle count further
/plan add-feature [X]  # Add new feature to plan
/plan skip [N]         # Skip cycle N (not applicable)
```

---

## Appendix: External Service Setup

### Cloudinary Setup

```bash
# 1. Create account at cloudinary.com

# 2. Get credentials from dashboard
CLOUDINARY_CLOUD_NAME=your_cloud_name
CLOUDINARY_API_KEY=your_api_key
CLOUDINARY_API_SECRET=your_api_secret

# 3. Configure upload preset
# Settings → Upload → Upload presets
# - Preset name: "video_uploads"
# - Mode: Unsigned
# - Folder: "videos/"
# - Resource type: Video
# - Format: mp4
# - Transformations: Auto quality, auto format
```

### OpenAI Setup

```bash
# 1. Create account at platform.openai.com

# 2. Generate API key
OPENAI_API_KEY=sk-...

# 3. Enable services
# - Whisper API (transcription)
# - Embeddings API (text-embedding-3-large)
```

### Convex Environment Variables

```bash
# backend/.env.local
CLOUDINARY_CLOUD_NAME=...
CLOUDINARY_API_KEY=...
CLOUDINARY_API_SECRET=...
OPENAI_API_KEY=...
```

---

**Built for scale, powered by the 6-dimension ontology.**

**Plan Status:** Ready for execution (after frontend complete)
**Next Command:** Complete video-frontend.md first, then start Cycle 1
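
---

## Appendix: Transcript Chunking Sketch

The chunking step in Cycle 18 (500-word chunks with a 50-word overlap) could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name `chunkTranscript` is hypothetical, words are split on whitespace, and nothing here touches Convex or the knowledge tables. The plan does not specify how the final short chunk is handled, so this version simply keeps it.

```typescript
// Hypothetical helper (not part of the plan's codebase): splits a transcript
// into overlapping word windows. Defaults follow the plan: 500 words per
// chunk, 50 words shared between neighbouring chunks.
function chunkTranscript(
  transcript: string,
  chunkSize: number = 500,
  overlap: number = 50,
): string[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("Require chunkSize > 0 and 0 <= overlap < chunkSize");
  }

  // Split on any whitespace and drop empty tokens (leading/trailing spaces).
  const words = transcript.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];

  // Each new window starts (chunkSize - overlap) words after the previous one.
  const step = chunkSize - overlap;
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + chunkSize).join(" "));
    // Stop once this window has reached the end of the transcript,
    // so no chunk consists solely of already-covered overlap words.
    if (start + chunkSize >= words.length) break;
  }
  return chunks;
}
```

With the defaults, a 1,000-word transcript yields three chunks that start at words 0, 450, and 900; the last one holds the remaining 100 words. Each returned string would then become one knowledge thing of type "chunk" in Cycle 18 and be embedded in Cycle 19.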