# Metacognitive Monitoring MCP Server
A systematic framework for self-monitoring knowledge boundaries, claim certainty, and reasoning quality to enhance
metacognitive awareness and calibrated confidence.
## Core Concepts
### Knowledge Assessment
Knowledge assessments evaluate understanding within specific domains. Each assessment includes:
- Domain identification and scope
- Self-assessed knowledge level (expert to none)
- Confidence calibration (0.0-1.0)
- Supporting evidence for the assessment
- Known limitations and gaps
- Relevant training data cutoffs
Example:
```json
{
"domain": "Machine Learning Optimization",
"knowledgeLevel": "proficient",
"confidenceScore": 0.75,
"supportingEvidence": "Familiar with gradient descent, Adam optimizer, and regularization techniques",
"knownLimitations": ["Limited experience with advanced meta-learning algorithms"],
"relevantTrainingCutoff": "2021-09"
}
```
### Claim Assessment
Claim assessments classify and evaluate specific statements. They include:
- Statement classification (fact, inference, speculation, uncertain)
- Confidence scoring for the claim
- Evidence basis supporting the claim
- Alternative interpretations
- Falsifiability criteria
Example:
```json
{
"claim": "Transformer models require quadratic memory with sequence length",
"status": "fact",
"confidenceScore": 0.9,
"evidenceBasis": "Self-attention mechanism computes all pairwise token interactions",
"alternativeInterpretations": ["Linear attention variants exist but with trade-offs"],
"falsifiabilityCriteria": "Discovery of attention mechanism with linear complexity and equivalent performance"
}
```
### Reasoning Assessment
Reasoning assessments evaluate individual reasoning steps. They contain:
- Description of the reasoning step
- Potential cognitive biases
- Underlying assumptions
- Logical validity scoring (0.0-1.0)
- Inference strength evaluation (0.0-1.0)
Example:
```json
{
"step": "Since the model performs well on training data, it will generalize to new data",
"potentialBiases": ["Confirmation bias", "Overfitting neglect"],
"assumptions": ["Training data is representative", "Model complexity is appropriate"],
"logicalValidity": 0.3,
"inferenceStrength": 0.4
}
```
## API
### Tools
- **metacognitiveMonitoring**
  - Systematic self-monitoring of knowledge and reasoning quality
  - Input: Comprehensive metacognitive monitoring data structure (see the example request after this list)
    - `task` (string): The task or question being addressed
    - `stage` (enum): Current monitoring stage - "knowledge-assessment" | "planning" | "execution" | "monitoring" | "evaluation" | "reflection"
    - `knowledgeAssessment` (object, optional): Domain knowledge evaluation
      - `domain` (string): Knowledge domain being assessed
      - `knowledgeLevel` (enum): "expert" | "proficient" | "familiar" | "basic" | "minimal" | "none"
      - `confidenceScore` (number): Confidence in assessment (0.0-1.0)
      - `supportingEvidence` (string): Evidence for knowledge level claim
      - `knownLimitations` (string[]): Known knowledge gaps
      - `relevantTrainingCutoff` (string, optional): Training data cutoff date
    - `claims` (array, optional): Specific claim assessments
      - `claim` (string): Statement being assessed
      - `status` (enum): "fact" | "inference" | "speculation" | "uncertain"
      - `confidenceScore` (number): Confidence in claim (0.0-1.0)
      - `evidenceBasis` (string): Supporting evidence
      - `alternativeInterpretations` (string[], optional): Alternative explanations
      - `falsifiabilityCriteria` (string, optional): Criteria for falsification
    - `reasoningSteps` (array, optional): Reasoning step evaluations
      - `step` (string): Description of reasoning step
      - `potentialBiases` (string[]): Identified cognitive biases
      - `assumptions` (string[]): Underlying assumptions
      - `logicalValidity` (number): Logical validity score (0.0-1.0)
      - `inferenceStrength` (number): Inference strength score (0.0-1.0)
    - `overallConfidence` (number): Overall confidence in conclusions (0.0-1.0)
    - `uncertaintyAreas` (string[]): Areas of significant uncertainty
    - `recommendedApproach` (string): Recommended approach based on assessment
    - `monitoringId` (string): Unique identifier for monitoring session
    - `iteration` (number): Current iteration of monitoring process
    - `nextAssessmentNeeded` (boolean): Whether further assessment is required
    - `suggestedAssessments` (array, optional): Suggested next assessments - "knowledge" | "claim" | "reasoning" | "overall"
  - Returns structured metacognitive analysis with visual confidence indicators
  - Supports iterative refinement of self-awareness and calibration
  - Tracks knowledge boundaries and reasoning quality over time
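The sketch below shows what a single request to the tool might look like, combining the fields documented above. The values are illustrative, and the exact wire format depends on your MCP client.
```json
{
  "task": "Estimate whether a fine-tuned 7B model can match a frontier model on domain QA",
  "stage": "knowledge-assessment",
  "knowledgeAssessment": {
    "domain": "LLM fine-tuning and evaluation",
    "knowledgeLevel": "familiar",
    "confidenceScore": 0.6,
    "supportingEvidence": "General familiarity with LoRA fine-tuning and common QA benchmarks",
    "knownLimitations": ["No access to recent benchmark results"]
  },
  "overallConfidence": 0.55,
  "uncertaintyAreas": ["Benchmark comparability", "Domain data quality"],
  "recommendedApproach": "Run a small pilot evaluation before committing to full fine-tuning",
  "monitoringId": "session-001",
  "iteration": 1,
  "nextAssessmentNeeded": true,
  "suggestedAssessments": ["claim", "reasoning"]
}
```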
## Setup
### bunx
```json
{
  "mcpServers": {
    "Metacognitive Monitoring": {
      "command": "bunx",
      "args": ["-y", "@wemake.cx/metacognitive-monitoring@latest"]
    }
  }
}
```
#### bunx with custom settings
The server can be configured using the following environment variables:
```json
{
  "mcpServers": {
    "Metacognitive Monitoring": {
      "command": "bunx",
      "args": ["-y", "@wemake.cx/metacognitive-monitoring@latest"],
      "env": {
        "MONITORING_HISTORY_LIMIT": "100",
        "CONFIDENCE_THRESHOLD": "0.7"
      }
    }
  }
}
```
- `MONITORING_HISTORY_LIMIT`: Maximum number of monitoring sessions to retain (default: 50)
- `CONFIDENCE_THRESHOLD`: Minimum confidence threshold for high-confidence claims (default: 0.8)
## System Prompt
The system prompt for metacognitive monitoring should encourage systematic self-assessment:
```markdown
Follow these steps for metacognitive monitoring:
1. Knowledge Boundary Assessment:
   - Explicitly assess your knowledge level in the relevant domain
   - Identify specific areas of strength and limitation
   - Calibrate confidence based on evidence and experience
   - Acknowledge training data cutoffs and their implications
2. Claim Classification:
   - Distinguish between facts, inferences, speculation, and uncertainty
   - Provide evidence basis for each significant claim
   - Consider alternative interpretations of evidence
   - Establish falsifiability criteria where appropriate
3. Reasoning Quality Monitoring:
   - Evaluate each reasoning step for logical validity
   - Identify potential cognitive biases affecting judgment
   - Make underlying assumptions explicit
   - Assess inference strength and confidence
4. Uncertainty Management:
   - Identify areas of significant uncertainty
   - Recommend approaches based on confidence levels
   - Suggest additional assessments when needed
   - Iterate on understanding as new information emerges
5. Calibration and Iteration:
   - Track confidence calibration over time
   - Refine assessments based on feedback
   - Maintain awareness of knowledge boundaries
   - Continuously improve metacognitive accuracy
```
## Usage Examples
### Technical Domain Assessment
When working in specialized technical domains, systematically assess knowledge boundaries and claim confidence levels.
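For instance, a first call in an unfamiliar domain might use the `knowledge-assessment` stage to make boundaries explicit before any claims are made. The excerpt below shows only the relevant fields; values are illustrative.
```json
{
  "stage": "knowledge-assessment",
  "knowledgeAssessment": {
    "domain": "Rust async runtime internals",
    "knowledgeLevel": "basic",
    "confidenceScore": 0.4,
    "supportingEvidence": "General familiarity with async/await semantics, little exposure to scheduler internals",
    "knownLimitations": ["Runtime implementation details", "Recent API changes"]
  }
}
```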
### Complex Reasoning Chains
For multi-step reasoning, evaluate each step for biases, assumptions, and logical validity.
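A request evaluating one step of a longer chain might include an excerpt like the following (illustrative values):
```json
{
  "stage": "monitoring",
  "reasoningSteps": [
    {
      "step": "Benchmark A is higher, therefore model A is better overall",
      "potentialBiases": ["Availability bias"],
      "assumptions": ["The benchmark covers the deployment distribution"],
      "logicalValidity": 0.4,
      "inferenceStrength": 0.5
    }
  ]
}
```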
### Uncertain Scenarios
In high-uncertainty situations, explicitly track confidence levels and identify areas requiring additional information.
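In such cases the uncertainty-management fields carry most of the signal; an excerpt might look like this (illustrative values):
```json
{
  "stage": "evaluation",
  "overallConfidence": 0.35,
  "uncertaintyAreas": ["Data freshness", "Unverified third-party figures"],
  "recommendedApproach": "Present ranges rather than point estimates and flag sources that need verification",
  "nextAssessmentNeeded": true,
  "suggestedAssessments": ["knowledge"]
}
```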
### Evidence Evaluation
When evaluating evidence, distinguish between different types of claims and their evidential basis.
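For example, a `claims` array can separate a verifiable fact from an inference built on top of it (excerpt, illustrative values):
```json
{
  "claims": [
    {
      "claim": "The paper reports a 3% accuracy gain on the benchmark",
      "status": "fact",
      "confidenceScore": 0.85,
      "evidenceBasis": "Stated directly in the paper's results table"
    },
    {
      "claim": "The gain will transfer to production traffic",
      "status": "inference",
      "confidenceScore": 0.5,
      "evidenceBasis": "Benchmark distribution resembles, but does not match, production data"
    }
  ]
}
```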