UNPKG

aiwg

Version:

Cognitive architecture for AI-augmented software development with structured memory, ensemble validation, and closed-loop correction. FAIR-aligned artifacts, 84% cost reduction via human-in-the-loop, standards adopted by 100+ organizations.

492 lines (371 loc) 15.3 kB
# Research Framework **Version:** 1.0.0 **Status:** Beta **Installation:** `aiwg use research` ## Overview The Research Framework provides a complete workflow for academic research management: discovering papers via semantic search, acquiring PDFs with FAIR validation, documenting papers using RAG-grounded LLM summarization, and synthesizing knowledge into literature notes. It automates 75% of the research process while maintaining high quality standards and preventing hallucinations. ## Why Use This Framework | Problem | Solution | Time Savings | |---------|----------|--------------| | Manual paper search | Semantic search + gap detection | 60% faster | | Tracking paper sources | Persistent REF-XXX identifiers | Always organized | | Paywalled papers | Unpaywall integration + manual upload | Streamlined access | | Reading papers | RAG-based summarization | 75% faster (20min 5min) | | Preventing hallucinations | Claim validation against source | 0% vs 56% hallucination rate | | Quality assessment | Automated GRADE scoring | Consistent standards | | Note organization | Zettelkasten literature notes | Scalable knowledge base | ## Research Workflow Stages The framework implements an 8-stage workflow: ``` 1. Discovery Find papers via semantic search 2. Acquisition Download PDFs and extract metadata 3. Documentation Summarize using RAG and extract structured data 4. Citation Track citations and generate bibliographies 5. Quality Assess source quality using GRADE methodology 6. Synthesis Create literature notes and link related work 7. Gap Analysis Identify research gaps and contradictions 8. Archival Version control and corpus management ``` **Current Implementation:** All 8 stages (Discovery, Acquisition, Documentation, Citation, Quality, Archival, Provenance, Workflow) ## Agent Catalog ### Implemented Agents (v1.0.0) | Agent | Purpose | Model | Tools | |-------|---------|-------|-------| | **Discovery Agent** | Search academic databases, rank by relevance, detect gaps | sonnet | Bash, Read, Write, Grep, Glob, WebFetch | | **Acquisition Agent** | Download PDFs, validate FAIR compliance, assign REF-XXX IDs | sonnet | Bash, Read, Write, Glob, Grep, WebFetch | | **Documentation Agent** | RAG summarization, structured extraction, literature notes | sonnet | Bash, Read, Write, Grep, Glob, WebFetch | | **Citation Agent** | Format citations (9000+ styles), build citation networks, manage bibliographies | sonnet | Bash, Read, Write, Grep, Glob, WebFetch | | **Quality Agent** | GRADE assessments, FAIR validation, multi-dimensional quality scoring | sonnet | Bash, Read, Write, Grep, Glob, WebFetch | | **Archival Agent** | OAIS-compliant archival, SIP/AIP/DIP packages, integrity verification | sonnet | Bash, Read, Write, Glob, Grep | | **Provenance Agent** | W3C PROV logging, lineage tracking, reproducibility packages | sonnet | Bash, Read, Write, Glob, Grep | | **Workflow Agent** | DAG-based orchestration, multi-stage pipelines, progress tracking | sonnet | Bash, Read, Write, Glob, Grep, Task | ## Installation ### Quick Start ```bash # Install AIWG CLI npm install -g aiwg # Deploy Research Framework aiwg use research # Verify installation aiwg research --version ``` ### What Gets Installed ``` .claude/agents/ ├── discovery-agent.md # Semantic search and gap detection ├── acquisition-agent.md # PDF download and FAIR validation ├── documentation-agent.md # RAG summarization and note creation ├── citation-agent.md # Citation formatting and networks ├── quality-agent.md # GRADE assessments and FAIR validation ├── archival-agent.md # OAIS-compliant archival ├── provenance-agent.md # W3C PROV lineage tracking └── workflow-agent.md # Multi-stage orchestration .aiwg/research/ ├── discovery/ # Search results and strategies ├── sources/ # PDFs and metadata ├── pdfs/ ├── metadata/ └── checksums.txt ├── knowledge/ # Summaries and notes ├── summaries/ ├── extractions/ └── notes/ └── config/ # Agent configurations ├── discovery-agent.yaml ├── acquisition-agent.yaml ├── documentation-agent.yaml ├── citation-agent.yaml ├── quality-agent.yaml ├── archival-agent.yaml ├── provenance-agent.yaml └── workflow-agent.yaml ``` ## Usage Examples ### 1. Discover Papers ```bash # Basic search aiwg research search "OAuth2 security best practices" # Advanced search with filters aiwg research search "LLM agent safety" --year 2022-2024 --venue conference --limit 100 # With citation network traversal aiwg research search "retrieval augmented generation" --citation-network # Systematic review (PRISMA-compliant) aiwg research discover --preregister ``` **Output:** - `.aiwg/research/discovery/search-results-{timestamp}.json` - `.aiwg/research/discovery/search-strategy.md` - `.aiwg/research/discovery/gap-report-{timestamp}.md` - `.aiwg/research/discovery/acquisition-queue.json` ### 2. Acquire Papers ```bash # Single paper aiwg research acquire REF-025 # Bulk acquisition from discovery queue aiwg research acquire --from-queue # Manual upload for paywalled paper aiwg research acquire --upload /path/to/paper.pdf --ref REF-027 # Retry failed downloads aiwg research acquire --retry-failed ``` **Output:** - `.aiwg/research/sources/pdfs/REF-XXX-{slug}.pdf` - `.aiwg/research/sources/metadata/REF-XXX-metadata.json` - `.aiwg/research/sources/acquisition-report-{timestamp}.md` - `.aiwg/research/sources/checksums.txt` (updated) ### 3. Document Papers ```bash # Summarize single paper aiwg research summarize REF-025 # Progressive summarization (multi-level) aiwg research summarize REF-025 --progressive # Bulk documentation aiwg research summarize --from-acquired # Create permanent note from literature note aiwg research note-create --permanent --based-on REF-025 # Generate Map of Content (topic overview) aiwg research moc-create "LLM Evaluation Methods" ``` **Output:** - `.aiwg/research/knowledge/summaries/REF-XXX-summary.md` - `.aiwg/research/knowledge/extractions/REF-XXX-extraction.json` - `.aiwg/research/knowledge/notes/REF-XXX-literature-note.md` ## Complete Workflow Example ```bash # Step 1: Discover papers on LLM agent safety aiwg research search "LLM agent safety alignment" --year 2022-2024 --limit 50 # Output: Found 50 papers, identified 2 research gaps, created acquisition queue # Step 2: Acquire papers from queue aiwg research acquire --from-queue # Output: Acquired 45/50 papers (90%), 5 paywalled (flagged for manual upload) # Step 3: Document acquired papers aiwg research summarize --from-acquired # Output: Generated summaries for 45 papers, average time 4min 30sec per paper # Total time saved: ~11 hours vs manual reading # Step 4: Review gap analysis cat .aiwg/research/discovery/gap-report-latest.md # Output: Identifies under-researched topics for future investigation ``` ## Key Features ### Semantic Search with Gap Detection - Natural language queries (no boolean operators required) - Relevance ranking: similarity (40%), citations (30%), venue (20%), recency (10%) - Automatic gap detection via topic clustering - Citation network traversal (forward and backward) - PRISMA-compliant search documentation ### FAIR-Compliant Acquisition - REF-XXX persistent identifiers - SHA-256 checksum integrity verification - FAIR scoring: Findable, Accessible, Interoperable, Reusable (0-100) - Shared corpus deduplication (avoid duplicate downloads) - Paywalled paper workflow with manual upload support ### RAG-Based Documentation - **Zero hallucination target**: All claims validated against source text - Multi-level summarization: 1-sentence, 1-paragraph, 1-page - Structured extraction: Claims, methods, datasets, findings - GRADE quality assessment (High/Moderate/Low/Very Low) - Zettelkasten literature notes with atomic note principles ## Configuration ### Discovery Agent Settings ```yaml # .aiwg/research/config/discovery-agent.yaml discovery_agent: api: primary: semantic-scholar fallback: [arxiv, crossref] timeout_ms: 30000 ranking: relevance_weight: 0.40 citation_weight: 0.30 venue_weight: 0.20 recency_weight: 0.10 gap_detection: sparse_cluster_threshold: 5 contradiction_variance: 0.50 ``` ### Acquisition Agent Settings ```yaml # .aiwg/research/config/acquisition-agent.yaml acquisition_agent: download: timeout_seconds: 60 concurrent_downloads: 5 fair_scoring: findable: doi_present: 40 metadata_complete: 50 accessible: persistent_url: 50 clear_license: 50 ``` ### Documentation Agent Settings ```yaml # .aiwg/research/config/documentation-agent.yaml documentation_agent: llm: default_model: claude-opus-4 fallback_model: claude-sonnet-4 temperature: 0.3 hallucination: enabled: true confidence_threshold: 0.9 user_review_required: true ``` ## Architecture ### Data Flow ``` Discovery Agent (search results + acquisition queue) Acquisition Agent (PDFs + metadata + checksums) Documentation Agent (summaries + extractions + literature notes) Knowledge Base ``` ### Storage Structure ``` .aiwg/research/ ├── discovery/ # Stage 1: Discovery ├── search-results-*.json ├── search-strategy.md ├── gap-report-*.md └── acquisition-queue.json ├── sources/ # Stage 2: Acquisition ├── pdfs/ └── REF-XXX-{slug}.pdf ├── metadata/ └── REF-XXX-metadata.json ├── checksums.txt └── ref-counter.txt ├── knowledge/ # Stage 3: Documentation ├── summaries/ └── REF-XXX-summary.md ├── extractions/ └── REF-XXX-extraction.json └── notes/ ├── REF-XXX-literature-note.md └── permanent-*.md └── config/ # Configuration ├── discovery-agent.yaml ├── acquisition-agent.yaml └── documentation-agent.yaml ``` ## Quality Standards ### Discovery Quality Targets - Search completion time: <10 seconds for 100 results - Gap analysis time: <30 seconds - API success rate: >95% - Relevance precision: >80% (user-validated) ### Acquisition Quality Targets - Download time per paper: <60 seconds (median) - Success rate: >90% - FAIR score average: >80/100 - Duplicate detection: 100% accuracy ### Documentation Quality Targets - Summarization time: <5 minutes per paper - Hallucination detection rate: >95% recall - Extraction completeness: >90% fields populated - GRADE consistency: >80% agreement with expert assessments ## Research Foundations This framework's design is grounded in academic research: | Practice | Source | Key Finding | |----------|--------|-------------| | Semantic Search | Semantic Scholar API | 200M+ papers indexed, relevance ranking | | Gap Detection | Systematic review methodology | Cluster analysis identifies under-researched topics | | FAIR Principles | GO-FAIR Initiative | Findability, Accessibility, Interoperability, Reusability | | RAG Pattern | Lewis et al. (2020) | Retrieval-Augmented Generation reduces hallucinations | | GRADE Framework | GRADE Working Group | Standardized quality assessment for evidence | | Zettelkasten | Ahrens (2017) | Atomic notes with linking enables knowledge synthesis | ## Integration with SDLC Framework The Research Framework can be used alongside the SDLC Framework: ```bash # During Inception: Discover research on requirements elicitation aiwg research search "requirements elicitation agile" --year 2018-2024 # During Elaboration: Research architectural patterns aiwg research search "microservices patterns" --venue conference # During Construction: Find test automation best practices aiwg research search "test automation best practices" --citation-network # Documentation: Cite acquired research aiwg research cite REF-025 --format bibtex ``` ## Troubleshooting ### Discovery Issues **Problem:** Empty search results ```bash # Check API connectivity aiwg research discovery --health-check # Try alternative query formulation aiwg research search "your query" --refine-from last ``` **Problem:** Rate limit hit ```bash # Configure API key for higher limits export SEMANTIC_SCHOLAR_API_KEY="your-api-key" # Use fallback APIs # Automatic fallback to arXiv and CrossRef ``` ### Acquisition Issues **Problem:** Download failures ```bash # Retry failed downloads aiwg research acquire --retry-failed # Check disk space df -h .aiwg/research/sources/ ``` **Problem:** Paywalled papers ```bash # Manual upload workflow aiwg research acquire --upload /path/to/paper.pdf --ref REF-XXX ``` ### Documentation Issues **Problem:** Hallucination detected ```bash # Review flagged content cat .aiwg/research/knowledge/logs/REF-XXX-hallucination-detected.log # Regenerate with stricter validation aiwg research summarize REF-XXX --strict-validation ``` **Problem:** Poor text extraction ```bash # Enable OCR for scanned PDFs aiwg research summarize REF-XXX --ocr ``` ## Roadmap ### v1.0.0 (Current - Beta) - Discovery Agent (semantic search, gap detection, citation network traversal) - Acquisition Agent (PDF download, FAIR validation, REF-XXX assignment) - Documentation Agent (RAG summarization, literature notes, concept mapping) - Citation Agent (9000+ styles, citation networks, bibliography management) - Quality Agent (GRADE assessments, FAIR validation, quality reporting) - Archival Agent (OAIS compliance, SIP/AIP/DIP packages, integrity verification) - Provenance Agent (W3C PROV logging, lineage tracking, reproducibility) - Workflow Agent (DAG orchestration, parallel execution, failure recovery) ### v1.1.0 (Q2 2026) - Enhanced gap detection (contradiction analysis) - Multi-language support (non-English papers) - Corpus health monitoring dashboard ### v1.2.0 (Q3 2026) - Synthesis Agent (permanent notes, knowledge graphs) - ML-based recommendation engine - Advanced deduplication with fuzzy matching ## Contributing Research Framework is part of the AIWG project. See main project documentation for contribution guidelines. ## References - @.aiwg/flows/research-framework/elaboration/ - Complete elaboration specifications - @.aiwg/flows/research-framework/inception/vision-document.md - Framework vision and goals - @.claude/rules/thought-protocol.md - Agent reasoning protocol - [Semantic Scholar API](https://www.semanticscholar.org/product/api) - [FAIR Principles](https://www.go-fair.org/fair-principles/) - [GRADE Framework](https://www.gradeworkinggroup.org/) ## Support - **Documentation**: `@agentic/code/frameworks/research-complete/` - **Issues**: https://github.com/jmagly/aiwg/issues - **Discord**: https://discord.gg/BuAusFMxdA - **Telegram**: https://t.me/+oJg9w2lE6A5lOGFh --- **License:** MIT **Maintainers:** AIWG Research Team **Last Updated:** 2026-02-03