# DocxMCP
A Model Context Protocol (MCP) server that processes .docx files and converts them to markdown format with image extraction.
## Quick Start
Run the server without installation using npx:
```bash
npx -y docxmcp@latest
```
## Features
- Converts .docx documents to clean markdown format
- Extracts embedded images as base64-encoded data
- Provides a simple MCP tool interface for document processing
- Zero configuration required
## Usage
### Starting the Server
```bash
# Run with default settings (stdio)
npx -y docxmcp@latest
# Run with debug logging
npx -y docxmcp@latest --log-level debug
# Run on a specific port (for testing)
npx -y docxmcp@latest --port 3000
# Show help
npx -y docxmcp@latest --help
```
### Available Options
- `--port <port>` - Port to listen on (default: none; the server uses stdio)
- `--log-level <level>` - Log level: debug, info, warn, error (default: info)
- `--help, -h` - Show help message
- `--version, -v` - Show version
### MCP Tool Interface
The server exposes a single MCP tool:
#### `process_docx`
Processes a .docx file and returns its contents as markdown with extracted images.
**Input:**
```json
{
  "filePath": "/path/to/document.docx"
}
```
**Output:**
```json
{
  "content": [
    {
      "type": "text",
      "text": "# Document Title\n\nDocument content in markdown format..."
    },
    {
      "type": "image",
      "data": "base64-encoded-image-data",
      "mimeType": "image/png"
    }
  ]
}
```
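A client usually wants the markdown and the images separately. The following is a minimal Python sketch of one way to split a result shaped like the output above; the function name `split_content` and the sample payload are illustrative and not part of docxmcp.

```python
import base64


def split_content(result):
    """Separate a process_docx result into markdown text and decoded images."""
    markdown_parts = []
    images = []
    for item in result.get("content", []):
        if item.get("type") == "text":
            markdown_parts.append(item["text"])
        elif item.get("type") == "image":
            # Image data arrives base64-encoded; decode it to raw bytes.
            images.append((item["mimeType"], base64.b64decode(item["data"])))
    return "\n\n".join(markdown_parts), images


# Hypothetical sample payload in the documented output shape.
sample = {
    "content": [
        {"type": "text", "text": "# Document Title\n\nBody text."},
        {
            "type": "image",
            "data": base64.b64encode(b"fake-png-bytes").decode("ascii"),
            "mimeType": "image/png",
        },
    ]
}

markdown, images = split_content(sample)
print(markdown)
print([(mime, len(data)) for mime, data in images])
```

The decoded bytes can then be written to disk with a file extension chosen from `mimeType`.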
## Integration with MCP Clients
### Claude Desktop
Add the following to your Claude Desktop configuration file (`claude_desktop_config.json`):
```json
{
  "mcpServers": {
    "docxmcp": {
      "command": "npx",
      "args": ["-y", "docxmcp@latest"]
    }
  }
}
```
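If the configuration file already lists other servers, the `docxmcp` entry has to be merged in rather than pasted over them. The sketch below shows one way to do that on an already-parsed configuration; `add_docxmcp` and the `other-server` entry are hypothetical names used only for illustration, and reading or writing the file itself is left out.

```python
import copy
import json


def add_docxmcp(config):
    """Return a copy of the config with the docxmcp server entry added."""
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    servers = updated.setdefault("mcpServers", {})
    servers["docxmcp"] = {"command": "npx", "args": ["-y", "docxmcp@latest"]}
    return updated


# Hypothetical existing configuration with one unrelated server.
existing = {
    "mcpServers": {
        "other-server": {"command": "node", "args": ["server.js"]}
    }
}

print(json.dumps(add_docxmcp(existing), indent=2))
```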
### Goose
Configure in your Goose settings:
```yaml
mcp_servers:
  - name: docxmcp
    command: npx
    args: ["-y", "docxmcp@latest"]
```
## Quick Validation Example
To test that the server is working correctly:
1. Start the server:
   ```bash
   npx -y docxmcp@latest --log-level debug
   ```
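2. Send a `process_docx` request for a .docx file from an MCP client. In the default stdio mode the client talks to the server over its stdin and stdout, so the request comes from the process that launched the server rather than from a separate terminal.
3. The response should contain the document's markdown, plus an image entry for each extracted image.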
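Without a full MCP client, the check above can be approximated by launching the server as a child process and exchanging JSON-RPC messages over stdio. The sketch below rests on assumptions not stated in this README: that the server follows the standard MCP handshake (`initialize`, then `notifications/initialized`, then `tools/call`) with newline-delimited JSON, and that it accepts protocol version `2024-11-05`. The function names and the `docxmcp-smoke-test` client name are made up for illustration.

```python
import json
import subprocess


def build_session_messages(file_path):
    """Return the JSON-RPC messages for one process_docx call, one per line."""
    initialize = {
        "jsonrpc": "2.0", "id": 1, "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",  # assumed; the server may negotiate another
            "capabilities": {},
            "clientInfo": {"name": "docxmcp-smoke-test", "version": "0.0.0"},
        },
    }
    initialized = {"jsonrpc": "2.0", "method": "notifications/initialized"}
    call = {
        "jsonrpc": "2.0", "id": 2, "method": "tools/call",
        "params": {"name": "process_docx", "arguments": {"filePath": file_path}},
    }
    return [json.dumps(m) for m in (initialize, initialized, call)]


def read_reply(lines, request_id):
    """Scan lines until the JSON-RPC reply with the given id appears."""
    for raw in lines:
        try:
            message = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON message
        if isinstance(message, dict) and message.get("id") == request_id:
            return message
    return None


def run_smoke_test(file_path):
    """Start the server over stdio and return its reply to process_docx."""
    # On Windows the executable may need to be "npx.cmd".
    proc = subprocess.Popen(
        ["npx", "-y", "docxmcp@latest"],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
    )
    try:
        initialize, initialized, call = build_session_messages(file_path)
        proc.stdin.write(initialize + "\n")
        proc.stdin.flush()
        read_reply(proc.stdout, 1)  # wait for the initialize response
        proc.stdin.write(initialized + "\n" + call + "\n")
        proc.stdin.flush()
        return read_reply(proc.stdout, 2)
    finally:
        proc.kill()
```

Calling `run_smoke_test("/path/to/document.docx")` with a real absolute path should return the JSON-RPC reply, whose `result` is expected to carry the `content` array described under `process_docx`. The call blocks until the server answers or exits.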
## Troubleshooting
### Common Issues
1. **"File must be a .docx document" error**
   - Ensure the file path ends with `.docx`
   - The server only processes Word documents in .docx format
2. **"File does not exist or is not accessible" error**
   - Verify that the file path is correct and absolute
   - Check that the file is readable by the user running the server
3. **No output or server not responding**
   - Ensure you're using stdio mode (the default, without `--port`) when integrating with MCP clients
   - Run with `--log-level debug` for more detailed debugging information
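The first two issues can be caught before a request is sent. The following is a small client-side sketch of such a pre-flight check; it mirrors the advice above rather than the server's actual validation logic, and `check_docx_path` is a hypothetical helper name.

```python
import os


def check_docx_path(file_path):
    """Return a list of likely problems with a path before sending it."""
    problems = []
    # Exact-case match; whether the server accepts ".DOCX" is not documented.
    if not file_path.endswith(".docx"):
        problems.append("path does not end with .docx")
    if not os.path.isabs(file_path):
        problems.append("path is not absolute")
    if not os.path.isfile(file_path):
        problems.append("file does not exist")
    elif not os.access(file_path, os.R_OK):
        problems.append("file is not readable")
    return problems


# A relative path with the wrong extension reports several problems at once.
print(check_docx_path("report.doc"))
```

An empty list means the path passed every check here, which does not guarantee that the server will accept the file.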
## Requirements
- Node.js >= 16.0.0
- Works on macOS, Linux, and Windows
## License
ISC
## Repository
[https://github.com/alephnull1678/DocxMCP](https://github.com/alephnull1678/DocxMCP)