# Literature Note Template
---
template_id: literature-note
version: 1.0.0
reasoning_required: true
framework: research-complete
---
## Ownership & Collaboration
- Document Owner: Research Analyst
- Contributor Roles: Domain Expert, Technical Researcher
- Automation Inputs: PDF extraction, citation metadata
- Automation Outputs: `literature-note-REF-XXX.md` capturing key insights
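The output naming convention above can be sketched as a small helper. This is a minimal illustration, not part of any AIWG tooling: the function name `note_filename` is hypothetical, and the three-digit ID shape is an assumption based on the `REF-018` style example used in this template.

```python
import re

# Assumed ID shape: "REF-" followed by three digits, as in the REF-018 example.
REF_ID_PATTERN = re.compile(r"REF-\d{3}")


def note_filename(ref_id: str) -> str:
    """Return the literature-note filename for a reference ID such as REF-018."""
    if not REF_ID_PATTERN.fullmatch(ref_id):
        raise ValueError(f"unexpected reference ID: {ref_id!r}")
    return f"literature-note-{ref_id}.md"


print(note_filename("REF-018"))
```

Rejecting an unfilled placeholder such as `REF-XXX` early keeps template stubs from being written out as real notes.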
## Phase 1: Core (ESSENTIAL)
### Paper Identification
**Reference ID:** REF-XXX
<!-- EXAMPLE: REF-018 -->
**Title:** [Full paper title]
<!-- EXAMPLE: ReAct: Synergizing Reasoning and Acting in Language Models -->
**Authors:** [Author list]
<!-- EXAMPLE: Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., Cao, Y. -->
**Year:** YYYY
<!-- EXAMPLE: 2022 -->
**Source:** [Journal/Conference/Preprint]
<!-- EXAMPLE: ICLR 2023 (International Conference on Learning Representations) -->
### One-Sentence Summary
> [Single sentence capturing the core contribution]
<!-- EXAMPLE: ReAct interleaves reasoning traces with tool actions, improving ALFWorld success rate by 34% (absolute) over imitation and reinforcement learning baselines and curbing the hallucination seen in chain-of-thought reasoning. -->
## Reasoning
> Complete this section BEFORE detailed note-taking, as required by @.claude/rules/reasoning-sections.md
1. **Relevance Analysis**: Why is this paper important for AIWG?
> [Explain how this research connects to AIWG's mission, which components it affects, and priority level]
<!-- EXAMPLE:
ReAct is critical for AIWG because it provides the foundational pattern for agent reasoning loops (Thought→Action→Observation). This directly impacts:
- All SDLC agents that use tools (Read, Write, Bash, etc.)
- Ralph loop implementation (iteration structure)
- Agent debugging and transparency
Priority: HIGH - Core pattern used throughout framework
-->
2. **Key Insight Extraction**: What are the 2-3 most important findings?
> [Identify the findings that matter most for our use case, not necessarily what the authors emphasized]
<!-- EXAMPLE:
1. Tool grounding curbs hallucination: it accounts for 56% of CoT failures but 0% of ReAct failures in the HotpotQA error analysis
2. Explicit reasoning traces enable better human oversight and debugging
3. 34% absolute success-rate gain on ALFWorld (10% on WebShop) over imitation and RL baselines demonstrates real-world value
-->
3. **Application Planning**: How will we apply these insights?
> [Concrete plans for integrating this research into AIWG]
<!-- EXAMPLE:
- Standardize all agents to use TAO loop (@.claude/rules/tao-loop.md)
- Add thought protocol with 6 thought types (@.claude/rules/thought-protocol.md)
- Implement tool grounding in Ralph loop
- Add TAO logging for agent debugging
-->
4. **Limitation Assessment**: What are the boundaries of applicability?
> [What this research doesn't cover, or contexts where it may not apply]
<!-- EXAMPLE:
- ReAct tested on question-answering tasks, not full SDLC workflows
- Single-agent focus; multi-agent coordination not addressed
- No discussion of human-in-the-loop patterns
- Performance on code generation tasks not evaluated
-->
5. **Gap Identification**: What follow-up research is needed?
> [Questions this paper leaves unanswered that we should investigate]
<!-- EXAMPLE:
- How does ReAct scale to long-running agent sessions? (Ralph loops run 10+ iterations)
- Can reasoning quality be measured automatically?
- How do multiple agents coordinate with ReAct patterns?
-->
## Phase 2: Detailed Analysis (EXPAND WHEN READY)
<details>
<summary>Click to expand detailed findings and analysis</summary>
### Research Question
> What problem is the paper solving?
<!-- EXAMPLE:
How can we improve LLM reasoning on tasks requiring both internal knowledge and external tool use, while reducing hallucinations?
-->
### Methodology
**Study Type:** [Experimental, Survey, Case Study, Systematic Review, etc.]
<!-- EXAMPLE: Experimental - comparative evaluation across multiple benchmarks -->
**Approach:**
- [Key methods used]
- [Experimental design]
- [Evaluation metrics]
<!-- EXAMPLE:
- Prompting methodology: Interleave reasoning thoughts with tool actions
- Baselines: Standard prompting, Chain-of-Thought, Act-only
- Benchmarks: HotpotQA, FEVER, ALFWorld, WebShop
- Metrics: Success rate, fact accuracy, trajectory efficiency
-->
**Sample Size/Scope:** [N=?, datasets, domains]
<!-- EXAMPLE:
- HotpotQA: 500 multi-hop questions
- FEVER: 500 fact verification claims
- ALFWorld: 134 household tasks
- WebShop: 251 online shopping tasks
-->
### Key Findings (Detailed)
#### Finding 1: [Specific finding]
**Result:** [Quantitative or qualitative result]
<!-- EXAMPLE:
**Result:** ReAct outperforms imitation and reinforcement learning baselines by an absolute success rate of 34% on ALFWorld and 10% on WebShop
-->
**Significance:** [Why this matters]
<!-- EXAMPLE:
**Significance:** Demonstrates that explicit reasoning traces improve performance beyond pure action execution. The reasoning→action→observation loop enables error detection and course correction.
-->
**Evidence Quality:** [HIGH/MODERATE/LOW per GRADE]
<!-- EXAMPLE:
**Evidence Quality:** HIGH - Controlled experiment with clear baselines, multiple tasks, reproducible methodology
-->
**Application to AIWG:**
<!-- EXAMPLE:
**Application to AIWG:**
- Implement TAO loop in all tool-using agents
- Track thought types (goal, progress, extraction, reasoning, exception, synthesis)
- Enable iteration-level debugging via thought logs
-->
#### Finding 2: [Specific finding]
[Repeat structure from Finding 1]
<!-- EXAMPLE:
**Result:** In the HotpotQA failure analysis, hallucination accounts for 0% of ReAct failures versus 56% of chain-of-thought failures
**Significance:** External tool use provides factual grounding that prevents fabrication
**Evidence Quality:** HIGH - Clear metrics, multiple evaluations
**Application to AIWG:** Require agents to ground claims in tool observations (Read, Grep results)
-->
#### Finding 3: [Specific finding]
[Repeat structure from Finding 1]
### Supporting Evidence
| Claim | Evidence | Page/Section | Quality |
|-------|----------|--------------|---------|
| [Claim 1] | [Data/quote] | p. X, Fig Y | HIGH |
| [Claim 2] | [Data/quote] | p. X, Table Y | MODERATE |
<!-- EXAMPLE:
| Claim | Evidence | Page/Section | Quality |
|-------|----------|--------------|---------|
| ReAct improves task success | 34% absolute success-rate gain on ALFWorld over imitation and RL baselines | Abstract | HIGH |
| Tool grounding curbs hallucination | Hallucination in 0% of ReAct failures vs 56% of CoT failures (HotpotQA) | Table 2 | HIGH |
| Works across domains | Evaluated on 4 benchmarks (HotpotQA, FEVER, ALFWorld, WebShop) | Abstract | HIGH |
-->
### Limitations & Caveats
- [Limitation 1: What the research doesn't prove or cover]
- [Limitation 2: Methodological constraints]
- [Limitation 3: Generalizability concerns]
<!-- EXAMPLE:
- Task focus: QA and simple interactive tasks, not complex SDLC workflows
- Single-agent: No multi-agent coordination patterns
- Context length: Not tested on long-running sessions (Ralph loops run 10+ iterations)
- Human oversight: Doesn't address HITL gate patterns
-->
### Related Work
**Builds on:**
- [Prior work 1 with @reference]
- [Prior work 2 with @reference]
<!-- EXAMPLE:
**Builds on:**
- @.aiwg/research/findings/REF-016-chain-of-thought.md - CoT reasoning baseline
- @.aiwg/research/findings/REF-019-toolformer.md - Tool use in LLMs
-->
**Extends/Contradicts:**
- [Related work 3 with relationship]
<!-- EXAMPLE:
**Extends:**
- Extends CoT by adding action execution and observation feedback
- Extends Toolformer by adding explicit reasoning traces
-->
**Cited by:** [If known, list key papers citing this work]
</details>
## Phase 3: Implementation Details (ADVANCED)
<details>
<summary>Click to expand implementation notes and technical details</summary>
### Technical Implementation
**Algorithm/Method:**
```
[Pseudocode or detailed description of core method]
```
<!-- EXAMPLE:
```
ReAct Loop:
1. THOUGHT: Generate reasoning about current state and next action
2. ACTION: Execute tool call with parameters
3. OBSERVATION: Capture tool output
4. Repeat until task complete or max iterations
```
-->
**Key Parameters:**
- [Parameter 1: Value or range]
- [Parameter 2: Value or range]
<!-- EXAMPLE:
**Key Parameters:**
- Max iterations: 5-10 depending on task complexity
- Temperature: 0.7 for reasoning, 0 for action generation
- Prompt format: "Thought: ... Action: ... Observation: ..."
-->
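The Thought, Action, Observation loop and its iteration cap can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the paper's or AIWG's implementation: `run_tao_loop`, `scripted_policy`, the `lookup` tool, and the `finish` action name are all hypothetical, and the scripted policy stands in for a real model call.

```python
from typing import Callable, Dict, List, Tuple

# A step is (thought, action name, action argument); "finish" ends the loop.
Step = Tuple[str, str, str]


def run_tao_loop(
    policy: Callable[[List[str]], Step],
    tools: Dict[str, Callable[[str], str]],
    max_iterations: int = 5,
) -> List[str]:
    """Run Thought -> Action -> Observation until "finish" or the iteration cap."""
    trace: List[str] = []
    for _ in range(max_iterations):
        thought, action, argument = policy(trace)
        trace.append(f"Thought: {thought}")
        trace.append(f"Action: {action}[{argument}]")
        if action == "finish":
            break
        tool = tools.get(action)
        # Unknown tools produce an observation instead of raising, so the
        # policy can react to the error on the next iteration.
        observation = tool(argument) if tool else f"unknown tool: {action}"
        trace.append(f"Observation: {observation}")
    return trace


def scripted_policy(trace: List[str]) -> Step:
    """Hypothetical stand-in for a model call: look something up once, then finish."""
    if not any(line.startswith("Observation:") for line in trace):
        return ("I need the publication venue.", "lookup", "ReAct")
    return ("The observation answers the question.", "finish", "ICLR 2023")


tools = {"lookup": lambda query: f"{query}: ICLR 2023"}
for line in run_tao_loop(scripted_policy, tools):
    print(line)
```

Keeping the trace as plain `Thought:` / `Action:` / `Observation:` lines mirrors the prompt format in the example parameters and makes each iteration easy to log and inspect.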
### Code/Artifacts
**Available Resources:**
- Repository: [URL if available]
- Demo: [URL if available]
- Datasets: [URL if available]
<!-- EXAMPLE:
**Available Resources:**
- Repository: https://github.com/ysymyth/ReAct
- Demo: https://react-lm.github.io/
- Datasets: HotpotQA, FEVER (public)
-->
### Reproducibility Notes
- [Note 1: What's needed to reproduce]
- [Note 2: Known challenges in replication]
<!-- EXAMPLE:
- Requires OpenAI API access (GPT-3.5 or GPT-4)
- Prompt engineering critical - exact wording matters
- Tool implementations must be reliable (search, calculator, etc.)
-->
### Integration Points
**AIWG Components Affected:**
- [Component 1 with @reference]
- [Component 2 with @reference]
<!-- EXAMPLE:
**AIWG Components Affected:**
- @.claude/rules/tao-loop.md - Core loop structure
- @.claude/rules/thought-protocol.md - Thought type taxonomy
- @agentic/code/addons/ralph/schemas/iteration-analytics.yaml - Logging format
- All tool-using agents in @agentic/code/frameworks/sdlc-complete/agents/
-->
**Implementation Status:**
- [ ] Rule defined (@.claude/rules/)
- [ ] Schema created (if applicable)
- [ ] Agents updated
- [ ] Tests written
- [ ] Documentation complete
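
The checklist above is easy to summarize mechanically. The sketch below is one possible approach, assuming notes keep the standard Markdown task-list syntax (`- [ ]` and `- [x]`); the function name `implementation_status` is hypothetical.

```python
import re

# Matches Markdown task-list items such as "- [ ] Tests written" or "- [x] Rule defined".
TASK_ITEM = re.compile(r"^\s*[-*] \[( |x|X)\] (.+)$")


def implementation_status(note_text: str) -> dict:
    """Split task-list items in a note into done and pending labels."""
    status = {"done": [], "pending": []}
    for line in note_text.splitlines():
        match = TASK_ITEM.match(line)
        if match:
            key = "pending" if match.group(1) == " " else "done"
            status[key].append(match.group(2).strip())
    return status


sample = """\
- [x] Rule defined (@.claude/rules/)
- [ ] Schema created (if applicable)
- [ ] Agents updated
"""
result = implementation_status(sample)
print(f"{len(result['done'])} done, {len(result['pending'])} pending")
```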
</details>
## Connections & Links
### Upstream (Papers this builds on)
- @.aiwg/research/findings/REF-XXX.md - [Relationship]
- @.aiwg/research/findings/REF-YYY.md - [Relationship]
<!-- EXAMPLE:
- @.aiwg/research/findings/REF-016-chain-of-thought.md - Foundational reasoning pattern
- @.aiwg/research/findings/REF-019-toolformer.md - Tool augmentation pattern
-->
### Downstream (Papers citing this)
- @.aiwg/research/findings/REF-XXX.md - [Relationship]
<!-- EXAMPLE:
- @.aiwg/research/findings/REF-022-autogen.md - Multi-agent extension
-->
### Lateral (Related topics)
- @.aiwg/research/findings/REF-XXX.md - [Relationship]
<!-- EXAMPLE:
- @.aiwg/research/synthesis/topic-04-tool-grounding.md - Tool use patterns
- @.aiwg/research/synthesis/topic-03-cognitive-scaffolding.md - Reasoning structure
-->
### AIWG Implementation
- @.claude/rules/tao-loop.md - [Implementation of this research]
- @.aiwg/requirements/use-cases/UC-XXX.md - [Use case driven by this research]
<!-- EXAMPLE:
- @.claude/rules/tao-loop.md - TAO loop standardization
- @.claude/rules/thought-protocol.md - Six thought types
- @.aiwg/requirements/use-cases/UC-AP-002-track-reasoning.md - Reasoning transparency
-->
## Personal Notes & Insights
> Space for open-ended observations, questions, and connections
<!-- EXAMPLE:
This paper is the foundation for agent transparency. The TAO loop makes agent thinking visible, which is critical for debugging and trust.
Question: Can we extend TAO to multi-agent conversations? Each agent maintains TAO, but how do they coordinate?
Insight: The thought types in @.claude/rules/thought-protocol.md map well to different TAO phases:
- Goal/Progress thoughts → Pre-action planning
- Extraction/Reasoning thoughts → Post-observation analysis
- Exception thoughts → Error detection in observation
- Synthesis thoughts → Task completion assessment
Need to investigate: How does TAO scale to 10+ iteration loops in Ralph? Does reasoning quality degrade?
-->
## References
- @.aiwg/research/sources/[PDF-filename].pdf - Original paper
- @.aiwg/research/fixity-manifest.json - PDF checksum record
- @.aiwg/research/provenance/records/REF-XXX.prov.yaml - Provenance record
- @agentic/code/frameworks/research-complete/schemas/frontmatter-schema.yaml - Metadata schema
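
A PDF checksum record like the one referenced above is typically verified by re-hashing the source files. The sketch below shows the idea only. The manifest layout (a flat JSON object mapping relative paths to SHA-256 hex digests) is an assumption, since the real `fixity-manifest.json` format is not described here, and the function names are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large PDFs are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(manifest_path: Path, root: Path) -> list:
    """Return the manifest entries whose file is missing or has a different hash.

    Assumed manifest layout: {"relative/path.pdf": "<sha256 hex digest>"}.
    """
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    mismatches = []
    for relative_path, expected in manifest.items():
        target = root / relative_path
        if not target.is_file() or sha256_of(target) != expected:
            mismatches.append(relative_path)
    return mismatches
```

An empty return value means every listed file is present and unchanged; anything else names the files that need attention.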
## Template Usage Notes
**When to create a literature note:**
- When adding a new paper to research corpus
- During deep reading of existing papers
- When synthesizing findings across papers
**Note-taking approach:**
- Read the paper first, then fill in Phase 1: Core (ESSENTIAL) immediately
- Complete the Reasoning section while the paper is fresh in mind
- Expand Phase 2: Detailed Analysis (EXPAND WHEN READY) during the synthesis phase
- Fill in Phase 3: Implementation Details (ADVANCED) when implementing findings
**Anti-patterns to avoid:**
- Copying abstract verbatim (synthesize in your own words)
- Including findings not relevant to AIWG
- Skipping limitations section (critical for proper application)
- Not tracking implementation status
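
The "skipping limitations" anti-pattern can be caught with a simple check. The sketch below assumes a note keeps the `Limitations & Caveats` heading from this template and that unfilled bullets still look like bracketed placeholders; the function name `limitations_filled` is hypothetical, and example text inside HTML comments is ignored.

```python
import re

HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def limitations_filled(note_text: str) -> bool:
    """Return True if the Limitations & Caveats section has a non-placeholder bullet."""
    visible = HTML_COMMENT.sub("", note_text)
    in_section = False
    for line in visible.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            # Any heading either opens the target section or closes it.
            in_section = stripped.lstrip("#").strip() == "Limitations & Caveats"
            continue
        if in_section and stripped.startswith("- "):
            body = stripped[2:].strip()
            # Template placeholders are wrapped in square brackets.
            if body and not (body.startswith("[") and body.endswith("]")):
                return True
    return False
```

A check like this could run before a note is accepted into the corpus, so an unfilled limitations section is flagged rather than silently skipped.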
## Metadata
- **Template Type:** research-literature-note
- **Framework:** research-complete
- **Primary Agent:** @agentic/code/frameworks/research-complete/agents/discovery-agent.md
- **Related Templates:**
  - @agentic/code/frameworks/research-complete/templates/summary.md
  - @agentic/code/frameworks/research-complete/templates/extraction.yaml
- **Version:** 1.0.0
- **Last Updated:** 2026-02-03
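
The front matter block near the top of this template (`template_id`, `version`, `reasoning_required`, `framework`) repeats values listed in the metadata above, so a quick consistency check is possible. The sketch below is a deliberately small reader for flat `key: value` pairs, not a YAML parser. Treating these four keys as required is an assumption, and the function name `read_front_matter` is hypothetical.

```python
REQUIRED_KEYS = ("template_id", "version", "reasoning_required", "framework")


def read_front_matter(text: str) -> dict:
    """Parse flat `key: value` pairs between the first two `---` lines.

    This is a deliberately small reader, not a YAML parser.
    """
    lines = text.splitlines()
    try:
        start = lines.index("---")
        end = lines.index("---", start + 1)
    except ValueError:
        # No complete front matter block was found.
        return {}
    fields = {}
    for line in lines[start + 1:end]:
        key, separator, value = line.partition(":")
        if separator:
            fields[key.strip()] = value.strip()
    return fields


template = """\
# Literature Note Template
---
template_id: literature-note
version: 1.0.0
reasoning_required: true
framework: research-complete
---
"""
fields = read_front_matter(template)
missing = [key for key in REQUIRED_KEYS if key not in fields]
print(fields["template_id"], fields["version"], missing)
```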