@dreamhorizonorg/sentinel
Open-source, zero-dependency tool that blocks compromised packages BEFORE download. Built to counter supply chain and credential theft attacks like Shai-Hulud.
# 📊 Data Sources for Vulnerability Detection
Sentinel Package Manager supports **three types of data sources** for vulnerability detection.
## 📑 Table of Contents
- [Overview](#-how-they-work-together) - How data sources work together
- [Local JSON Files](#1️-local-json-files) - Custom blacklist files
- [API Endpoints](#2️-api-endpoints) - Remote JSON endpoints
- [Vulnerability Providers](#3️-vulnerability-providers) - OSV, GitHub, Snyk
> **Related:** See [Providers Guide](PROVIDERS.md) for detailed provider configuration.
> **Quick Start:** See [Usage Guide](USAGE.md) for basic usage.
---
## 🎯 How They Work Together
Sentinel supports three types of data sources:
1. **Local JSON Files** - Custom blacklist files
2. **API Endpoints** - HTTP/HTTPS endpoints returning JSON
3. **Vulnerability Providers** - Established security databases (OSV, GitHub, Snyk)
### **Validation Flow:**
```
1. Check Local Blacklist (JSON file or API endpoint)
↓
2. Check Vulnerability Providers (OSV, GitHub, Snyk)
↓
3. Fallback to npm audit (only when scanning projects with lockfiles)
↓
├─ Vulnerability Found → BLOCK ❌
└─ All Clear → Install ✅
```
**All three sources are checked** - if any source finds a vulnerability, installation is blocked.
> **📊 See [Priority & Conflict Resolution](#-priority--conflict-resolution) below for detailed priority order and merge behavior.**
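The flow above can be sketched as an ordered list of checks in which the first hit blocks the install. This is a minimal illustration, not Sentinel's actual implementation: `Check`, `Verdict`, and `validate` are hypothetical names, and each source is reduced to a plain callback.

```typescript
// Hypothetical types for illustration only; not Sentinel's real API.
type Verdict = { blocked: boolean; source?: string };

interface Check {
  source: string;
  applies: boolean; // e.g. npm audit only applies when a lockfile is present
  findsVulnerability: (name: string, version: string) => boolean;
}

// Walk the checks in priority order; the first source that reports a
// vulnerability blocks the install.
function validate(name: string, version: string, checks: Check[]): Verdict {
  for (const check of checks) {
    if (!check.applies) continue;
    if (check.findsVulnerability(name, version)) {
      return { blocked: true, source: check.source };
    }
  }
  return { blocked: false };
}

// Stand-in checks: only the local blacklist knows about "malicious-package".
const checks: Check[] = [
  { source: "local-blacklist", applies: true, findsVulnerability: (n) => n === "malicious-package" },
  { source: "providers", applies: true, findsVulnerability: () => false },
  { source: "npm-audit", applies: false, findsVulnerability: () => false },
];

const blockedVerdict = validate("malicious-package", "1.0.0", checks);
const cleanVerdict = validate("some-safe-package", "1.0.0", checks);
```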
---
## 1️⃣ **Local JSON Files**
### **What It Is:**
A local JSON file containing a list of compromised packages (like the Shai-Hulud blacklist).
### **Configuration:**
**Option 1: Config File**
```json
{
"dataSourcePath": "./config/compromised-packages.json"
}
```
**Option 2: CLI Argument**
```bash
sentinel scan package-name --localDataSource="./config/blacklist.json"
```
**Option 3: Default Locations** (automatic)
- `./config/compromised-packages.json` (repository root)
- `~/.sentinel/config/compromised-packages.json` (user-wide)
### **JSON Format:**
```json
[
{
"name": "malicious-package",
"compromisedVersions": ["1.0.0", "1.0.1"],
"notes": "Shai-Hulud worm"
},
{
"name": "another-package",
"compromisedVersions": [],
"notes": "All versions compromised"
}
]
```
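For teams that generate this file from scripts, a small validator can catch malformed entries before Sentinel reads them. The sketch below is an assumption-laden illustration: `BlacklistEntry` and `parseBlacklist` are hypothetical names, and treating `notes` as optional is a guess, since the examples always include it.

```typescript
// Shape of one blacklist entry, following the JSON format above.
// Assumption: `notes` is optional; the examples only ever show it as a string.
interface BlacklistEntry {
  name: string;
  compromisedVersions: string[]; // an empty array means all versions
  notes?: string;
}

// Hypothetical helper (not part of Sentinel): parse and validate blacklist JSON.
function parseBlacklist(jsonText: string): BlacklistEntry[] {
  const data: unknown = JSON.parse(jsonText);
  if (!Array.isArray(data)) {
    throw new Error("Blacklist must be a JSON array");
  }
  const entries: BlacklistEntry[] = [];
  for (let i = 0; i < data.length; i++) {
    const item: unknown = data[i];
    if (typeof item !== "object" || item === null) {
      throw new Error("Entry " + i + " is not an object");
    }
    const raw = item as { name?: unknown; compromisedVersions?: unknown; notes?: unknown };
    const versions = raw.compromisedVersions;
    if (typeof raw.name !== "string") {
      throw new Error("Entry " + i + " needs a string name");
    }
    if (!Array.isArray(versions) || !versions.every((v: unknown) => typeof v === "string")) {
      throw new Error("Entry " + i + " needs compromisedVersions as an array of strings");
    }
    const entry: BlacklistEntry = { name: raw.name, compromisedVersions: versions as string[] };
    if (typeof raw.notes === "string") {
      entry.notes = raw.notes;
    }
    entries.push(entry);
  }
  return entries;
}
```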
### **Use Cases:**
- ✅ Company-wide blacklists
- ✅ Custom security policies
- ✅ Known compromised packages (Shai-Hulud, etc.)
- ✅ Offline environments
### **Priority:**
- Highest priority (checked first)
- Overrides providers if both find the same package
---
## 2️⃣ **API Endpoints**
### **What It Is:**
An HTTP/HTTPS endpoint that returns JSON in the expected format.
### **Configuration:**
**Option 1: Config File**
```json
{
"endpoint": "https://api.example.com/compromised-packages.json"
}
```
**Option 2: CLI Argument**
```bash
sentinel scan package-name --remoteDataSource="https://api.example.com/blacklist.json"
```
### **Expected JSON Format:**
The endpoint must return a JSON array in this format:
```json
[
{
"name": "compromised-package",
"compromisedVersions": ["1.0.0", "1.0.1"],
"notes": "Security vulnerability"
}
]
```
### **Requirements:**
- ✅ Must return valid JSON
- ✅ Must be accessible via HTTP/HTTPS
- ✅ Should return an array (a single object is wrapped in an array automatically)
- ✅ Should have reasonable response time (< 5 seconds)
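The array-or-single-object rule can be pictured as a small normalization step applied to the parsed response body. This is a sketch under assumptions: `normalizeResponse` is a hypothetical name, and rejecting non-object bodies is a guess at reasonable behavior rather than documented Sentinel behavior.

```typescript
// Hypothetical helper: accept either an array or a single entry object
// and always hand back an array.
function normalizeResponse(body: unknown): unknown[] {
  if (Array.isArray(body)) {
    return body;
  }
  if (typeof body === "object" && body !== null) {
    return [body]; // a single object is wrapped
  }
  // Assumption: anything else (string, number, null) is treated as invalid.
  throw new Error("Endpoint response must be a JSON array or object");
}

const wrapped = normalizeResponse({ name: "compromised-package", compromisedVersions: ["1.0.0"] });
const passedThrough = normalizeResponse([{ name: "a", compromisedVersions: [] }, { name: "b", compromisedVersions: [] }]);
```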
### **Use Cases:**
- ✅ Centralized security database
- ✅ Company security API
- ✅ Real-time blacklist updates
- ✅ Multi-team coordination
### **Priority:**
- Second priority (after local files)
- Overrides default locations if specified
---
## 3️⃣ **Vulnerability Providers**
### **What It Is:**
Integration with established security databases (OSV, GitHub Advisories, Snyk) for real-time vulnerability checks.
### **Available Providers:**
| Provider | Token Required | Default | Description |
|----------|----------------|---------|-------------|
| **OSV** | ❌ No | ✅ Enabled | Google's comprehensive vulnerability database |
| **GitHub** | ⚠️ Optional | ✅ Enabled | GitHub Security Advisories (npm ecosystem) |
| **Snyk** | ✅ Required | ❌ Disabled | Enterprise-grade vulnerability database |
### **Quick Configuration:**
```json
{
"providers": {
"osv": { "enabled": true },
"github": { "enabled": true, "token": null },
"snyk": { "enabled": false, "token": null }
}
}
```
**Providers run automatically out of the box** - OSV and GitHub are enabled by default, no configuration needed.
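To make the defaults concrete, the sketch below merges a user's `providers` block with the defaults from the table and lists which providers would run. It is illustrative only: `enabledProviders` and the type names are hypothetical, and skipping Snyk when no token is set is an assumption drawn from the "Token Required" column.

```typescript
type ProviderName = "osv" | "github" | "snyk";

interface ProviderSettings {
  enabled: boolean;
  token?: string | null;
}

type ProvidersConfig = { [K in ProviderName]?: ProviderSettings };

// Defaults as listed in the provider table above.
const DEFAULTS: { [K in ProviderName]: ProviderSettings } = {
  osv: { enabled: true },
  github: { enabled: true, token: null },
  snyk: { enabled: false, token: null },
};

// Hypothetical helper: which providers run for a given user config.
function enabledProviders(user: ProvidersConfig): ProviderName[] {
  const order: ProviderName[] = ["osv", "github", "snyk"];
  const result: ProviderName[] = [];
  for (const name of order) {
    const override = user[name];
    const enabled = override ? override.enabled : DEFAULTS[name].enabled;
    const token = override && override.token ? override.token : DEFAULTS[name].token;
    if (!enabled) continue;
    // Assumption: Snyk is skipped without a token, since its token is required.
    if (name === "snyk" && !token) continue;
    result.push(name);
  }
  return result;
}
```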
### **Priority:**
- Third priority (after local files and API endpoints)
- Providers are queried in parallel with one another, after the blacklist checks
- First vulnerability found blocks installation
**📚 For detailed provider configuration, troubleshooting, and best practices, see [Providers Guide](PROVIDERS.md).**
---
## 📊 Priority & Conflict Resolution
### **For Blacklist Data Sources (JSON/API):**
| Priority | Source | Config Key | Description |
|----------|--------|------------|-------------|
| 1st | Local file/folder | `dataSourcePath` | If specified, uses exact file or folder |
| 2nd | API endpoint | `endpoint` | If specified, uses HTTP/HTTPS endpoint |
| 3rd | Default locations | Auto-detect | `./config/compromised-packages.json` or `~/.sentinel/config/compromised-packages.json` |
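One possible reading of this table, sketched below, is that explicitly configured sources are consulted in order and the default locations are used only when neither `dataSourcePath` nor `endpoint` is set. That interpretation is an assumption, and `blacklistSources` and its types are hypothetical names, not Sentinel's API.

```typescript
interface BlacklistConfig {
  dataSourcePath?: string;
  endpoint?: string;
}

interface BlacklistSource {
  kind: "file" | "endpoint" | "default";
  location: string;
}

// Default locations as listed in the table above.
const DEFAULT_LOCATIONS: string[] = [
  "./config/compromised-packages.json",
  "~/.sentinel/config/compromised-packages.json",
];

// Hypothetical helper: ordered list of blacklist sources to consult.
// Assumption: defaults apply only when nothing is configured explicitly.
function blacklistSources(config: BlacklistConfig): BlacklistSource[] {
  const sources: BlacklistSource[] = [];
  if (config.dataSourcePath) {
    sources.push({ kind: "file", location: config.dataSourcePath });
  }
  if (config.endpoint) {
    sources.push({ kind: "endpoint", location: config.endpoint });
  }
  if (sources.length === 0) {
    for (const location of DEFAULT_LOCATIONS) {
      sources.push({ kind: "default", location });
    }
  }
  return sources;
}
```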
### **For Vulnerability Detection:**
| Priority | Source | Description | Behavior |
|----------|--------|-------------|----------|
| 1st | Local blacklist | JSON file or API endpoint | Highest priority, always checked first. If local blacklist marks package as compromised → **BLOCK** (even if providers say safe) |
| 2nd | Vulnerability providers | OSV, GitHub, Snyk | Checked in parallel. If any provider finds vulnerability → **BLOCK** |
| 3rd | npm audit | Fallback | Only when scanning projects with lockfiles. Cannot check standalone packages |
**Merge Behavior:** Sources are checked in priority order, and the first source to report a vulnerability blocks installation. The local blacklist always takes precedence.
---
## 🎯 **Complete Example**
### **Using All Three Sources:**
```json
{
"dataSourcePath": "./config/company-blacklist.json",
"endpoint": "https://security-api.company.com/compromised-packages.json",
"providers": {
"osv": { "enabled": true },
"github": { "enabled": true, "token": "ghp_..." },
"snyk": { "enabled": true, "token": "..." }
}
}
```
**What happens:**
1. ✅ Checks local file: `./config/company-blacklist.json`
2. ✅ Checks API endpoint: `https://security-api.company.com/compromised-packages.json`
3. ✅ Checks OSV provider (real-time)
4. ✅ Checks GitHub Advisories (real-time)
5. ✅ Checks Snyk (real-time)
6. ✅ Falls back to npm audit if nothing is found (only when scanning a project with a lockfile)
**If any source finds a vulnerability → Installation blocked** ❌
---
## ✅ **Summary**
| Data Source | Type | Config Key | Priority | Token Required | Use Case |
|-------------|------|------------|----------|----------------|----------|
| **Local JSON** | File | `dataSourcePath` | 1st | ❌ No | Custom blacklists, offline |
| **API Endpoint** | HTTP | `endpoint` | 2nd | ⚠️ Depends on API | Centralized database |
| **OSV Provider** | API | `providers.osv` | 3rd | ❌ No | Real-time CVE data |
| **GitHub Provider** | API | `providers.github` | 3rd | ⚠️ Optional | npm ecosystem advisories |
| **Snyk Provider** | API | `providers.snyk` | 3rd | ✅ Required | Enterprise-grade |
**All three options are fully supported and work together!** 🎉
---
## ⚠️ Important Limitations
### **API Endpoint Merge Behavior**
- API endpoints are checked **after** local blacklist
- If local blacklist marks package as compromised → **BLOCK** (API result ignored)
- If API endpoint fails → Continues to providers (fail-open behavior)
- Multiple API endpoints: First endpoint to find vulnerability blocks installation
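The fail-open rule is easiest to see in code: an endpoint error is recorded and the check moves on to the providers instead of aborting. The sketch below is a simplified, synchronous illustration; `checkPackage`, `Lookup`, and `Outcome` are hypothetical names, and real lookups would be asynchronous network calls.

```typescript
// Hypothetical lookup signature: returns true when the source flags the package.
// A lookup may throw, for example when a remote endpoint is unreachable.
type Lookup = (name: string, version: string) => boolean;

interface Outcome {
  blocked: boolean;
  source?: "local-blacklist" | "api-endpoint" | "providers";
  endpointFailed: boolean;
}

function checkPackage(name: string, version: string, local: Lookup, endpoint: Lookup, providers: Lookup): Outcome {
  // A local blacklist hit blocks immediately; the endpoint is never consulted.
  if (local(name, version)) {
    return { blocked: true, source: "local-blacklist", endpointFailed: false };
  }
  let endpointFailed = false;
  try {
    if (endpoint(name, version)) {
      return { blocked: true, source: "api-endpoint", endpointFailed: false };
    }
  } catch (err) {
    // Fail open: remember the failure and continue to the providers.
    endpointFailed = true;
  }
  if (providers(name, version)) {
    return { blocked: true, source: "providers", endpointFailed };
  }
  return { blocked: false, endpointFailed };
}
```

Note that fail-open trades strictness for availability: when the endpoint is down, packages listed only there are not blocked.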
### **Local Blacklist Format**
- Only exact version matching supported (e.g., `"1.2.3"` matches `1.2.3`)
- Semver ranges not supported (e.g., `">=1.2.0"` won't work)
- Use an empty array to block all versions: `"compromisedVersions": []`
- Scoped packages are supported: `{ "name": "@scope/package", "compromisedVersions": ["1.0.0"] }`
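The matching rules above amount to a string-equality lookup. The following sketch shows why a semver range in the list never matches anything; `isCompromised` and `Entry` are hypothetical names used for illustration, not Sentinel's internals.

```typescript
interface Entry {
  name: string;
  compromisedVersions: string[];
}

// Hypothetical matcher: exact string comparison only, no semver parsing.
function isCompromised(entries: Entry[], name: string, version: string): boolean {
  for (const entry of entries) {
    if (entry.name !== name) continue;
    // An empty list blocks every version of the package.
    if (entry.compromisedVersions.length === 0) return true;
    if (entry.compromisedVersions.indexOf(version) !== -1) return true;
  }
  return false;
}

const sampleEntries: Entry[] = [
  { name: "malicious-package", compromisedVersions: ["1.2.3"] },
  { name: "range-package", compromisedVersions: [">=1.2.0"] }, // treated as a literal string
  { name: "another-package", compromisedVersions: [] },
  { name: "@scope/package", compromisedVersions: ["1.0.0"] },
];
```

With these entries, `malicious-package@1.2.3` and any version of `another-package` are flagged, while `range-package@1.2.3` is not, because `">=1.2.0"` is compared as plain text.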
### **Priority Order**
See [Priority & Conflict Resolution](#-priority--conflict-resolution) above for complete details.
---
## 📚 **More Information**
- **[Usage Guide](USAGE.md)** - Complete configuration examples
- **[Providers Guide](PROVIDERS.md)** - Detailed provider documentation
- **[Troubleshooting](TROUBLESHOOTING.md)** - Common issues and solutions