invoice_ocr_mcp

An enterprise invoice OCR MCP server built on RapidOCR and the ModelScope ecosystem, supporting structured recognition of multiple invoice types.

# Deployment Guide

## Deployment Options Overview

Invoice OCR MCP supports several deployment options:

1. **Local development deployment**: suited to development and testing
2. **Docker container deployment**: the recommended option for production
3. **Kubernetes cluster deployment**: suited to large-scale production environments
4. **Cloud deployment**: supported on the major cloud platforms

## 1. Local Development Deployment

### Requirements

- Python 3.8+
- At least 4 GB of memory
- GPU support (optional, used to accelerate inference)
- A ModelScope account and API token

### Installation Steps

```bash
# 1. Clone the project
git clone https://github.com/your-org/invoice-ocr-mcp.git
cd invoice-ocr-mcp

# 2. Create a virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Configure environment variables
cp env.example .env
# Edit the .env file and add your ModelScope API token

# 5. Download the models
python scripts/download_models.py

# 6. Start the server
python src/invoice_ocr_mcp/server.py
```

### Verifying the Deployment

```bash
# Test the MCP connection
python examples/client_example.py
```

## 2. Docker Container Deployment

### Quick Start

```bash
# 1. Clone the project
git clone https://github.com/your-org/invoice-ocr-mcp.git
cd invoice-ocr-mcp

# 2. Configure environment variables
cp env.example .env
# Edit the .env file

# 3. Build and start
docker-compose up -d

# 4. View the logs
docker-compose logs -f invoice-ocr-mcp
```

### Production Configuration

```bash
# Start with the production overrides applied
docker-compose -f docker-compose.yml -f docker-compose.prod.yml up -d
```

### GPU Support

```bash
# Start the GPU variant
docker-compose --profile gpu up -d
```

## 3. Kubernetes Deployment

### Prerequisites

- Kubernetes 1.20+
- kubectl configured for the target cluster
- Helm 3.0+ (optional)

### Deployment Manifests

Create a `k8s/` directory and add the following files:

#### namespace.yaml

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: invoice-ocr
```

#### configmap.yaml

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: invoice-ocr-config
  namespace: invoice-ocr
data:
  # Copy the configuration files from the configs/ directory here
```

#### secret.yaml

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: invoice-ocr-secrets
  namespace: invoice-ocr
type: Opaque
data:
  modelscope-token: <base64-encoded-token>
```

#### deployment.yaml

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: invoice-ocr-mcp
  namespace: invoice-ocr
spec:
  replicas: 3
  selector:
    matchLabels:
      app: invoice-ocr-mcp
  template:
    metadata:
      labels:
        app: invoice-ocr-mcp
    spec:
      containers:
        - name: invoice-ocr-mcp
          image: your-registry/invoice-ocr-mcp:latest
          ports:
            - containerPort: 8000
          env:
            - name: MODELSCOPE_API_TOKEN
              valueFrom:
                secretKeyRef:
                  name: invoice-ocr-secrets
                  key: modelscope-token
          resources:
            requests:
              memory: "2Gi"
              cpu: "500m"
            limits:
              memory: "4Gi"
              cpu: "2000m"
          volumeMounts:
            - name: config
              mountPath: /app/configs
            - name: cache
              mountPath: /app/cache
      volumes:
        - name: config
          configMap:
            name: invoice-ocr-config
        - name: cache
          emptyDir: {}
```

#### service.yaml

```yaml
apiVersion: v1
kind: Service
metadata:
  name: invoice-ocr-service
  namespace: invoice-ocr
spec:
  selector:
    app: invoice-ocr-mcp
  ports:
    - port: 80
      targetPort: 8000
  type: ClusterIP
```

#### ingress.yaml

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: invoice-ocr-ingress
  namespace: invoice-ocr
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
    - host: invoice-ocr.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: invoice-ocr-service
                port:
                  number: 80
```

### Deployment Commands

```bash
# Apply all manifests
kubectl apply -f k8s/

# Check the deployment status
kubectl get pods -n invoice-ocr

# List the services
kubectl get services -n invoice-ocr
```

## 4. Cloud Deployment

### Alibaba Cloud Container Service (ACK)

```bash
# 1. Create an ACK cluster
#    (cluster parameters are omitted here; supply them as the ACK CLI documentation requires)
aliyun cs CreateCluster --region cn-hangzhou

# 2. Configure kubectl
aliyun cs GET kubeconfig --region cn-hangzhou --cluster-id <cluster-id>

# 3. Deploy the application
kubectl apply -f k8s/
```

### AWS EKS

```bash
# 1. Create an EKS cluster
eksctl create cluster --name invoice-ocr --region us-west-2

# 2. Deploy the application
kubectl apply -f k8s/
```

### Azure AKS

```bash
# 1. Create an AKS cluster
az aks create --resource-group myResourceGroup --name invoice-ocr

# 2. Fetch credentials
az aks get-credentials --resource-group myResourceGroup --name invoice-ocr

# 3. Deploy the application
kubectl apply -f k8s/
```

## 5. Configuration

### Environment Variables

The main settings are:

```bash
# ModelScope settings
MODELSCOPE_API_TOKEN=your_token_here
MODELSCOPE_CACHE_DIR=./cache/modelscope

# Server settings
HOST=0.0.0.0
PORT=8000
LOG_LEVEL=INFO

# Performance settings
MAX_CONCURRENT_REQUESTS=10
MAX_BATCH_SIZE=50
```

### Configuration Files

#### configs/models.yaml

```yaml
models:
  text_detection:
    name: "damo/cv_resnet18_ocr-detection-line-level_damo"
    cache_dir: "./cache/models/text_detection"
  text_recognition:
    name: "damo/cv_convnextTiny_ocr-recognition-general_damo"
    cache_dir: "./cache/models/text_recognition"
  invoice_classification:
    name: "damo/cv_resnest50_ocr-invoice-classification"
    cache_dir: "./cache/models/classification"
```

#### configs/server.yaml

```yaml
server:
  host: "0.0.0.0"
  port: 8000
  workers: 4

performance:
  max_concurrent_requests: 10
  max_batch_size: 50
  timeout: 30

cache:
  enabled: true
  expire_time: 86400
  redis_url: "redis://localhost:6379/0"
```

## 6. Monitoring and Logging

### Logging Configuration

```yaml
# configs/logging.yaml
version: 1
formatters:
  standard:
    format: "%(asctime)s [%(levelname)s] %(name)s: %(message)s"
handlers:
  file:
    class: logging.handlers.RotatingFileHandler
    filename: logs/app.log
    maxBytes: 104857600  # 100 MB
    backupCount: 10
    formatter: standard
loggers:
  invoice_ocr_mcp:
    level: INFO
    handlers: [file]
    propagate: false
```

### Prometheus Monitoring

To enable Prometheus monitoring:

```bash
# Start the monitoring stack
docker-compose --profile monitoring up -d
```

The monitoring dashboards are then available at:

- Prometheus: http://localhost:9090
- Grafana: http://localhost:3000 (admin/admin123)

## 7. Performance Tuning

### GPU Acceleration

To enable GPU support:

```bash
# Docker deployment
docker-compose --profile gpu up -d

# Kubernetes deployment:
# add a GPU resource limit to the container spec in deployment.yaml
#   resources:
#     limits:
#       nvidia.com/gpu: 1
```

### Cache Tuning

To set up the Redis cache:

```bash
# Start Redis
docker-compose --profile cache up -d
```

### Load Balancing

To put an Nginx reverse proxy in front of the service:

```bash
# Start the proxy
docker-compose --profile proxy up -d
```

## 8. Security Configuration

### SSL/TLS

To configure HTTPS:

```bash
# Generate a self-signed certificate (for testing only)
mkdir -p ssl
openssl req -x509 -newkey rsa:4096 -keyout ssl/key.pem -out ssl/cert.pem -days 365 -nodes

# Update the environment variables (in .env)
ENABLE_HTTPS=true
SSL_CERT_PATH=./ssl/cert.pem
SSL_KEY_PATH=./ssl/key.pem
```

### API Authentication

To enable API key authentication:

```bash
# Set the API key
API_KEY=your_secure_api_key_here
```

## 9. Troubleshooting

### Common Issues

1. **Model loading fails**

   ```bash
   # Check the connection to ModelScope
   python scripts/test_modelscope.py

   # Clear the cache and download the models again
   rm -rf cache/modelscope
   python scripts/download_models.py
   ```

2. **Out of memory**

   ```bash
   # Reduce concurrency
   MAX_CONCURRENT_REQUESTS=5

   # Enable GPU acceleration
   USE_GPU=true
   ```

3. **Network connectivity problems**

   ```bash
   # Check the firewall settings
   # Make sure port 8000 is open
   ```

### Log Analysis

```bash
# Follow the error log
tail -f logs/app.log | grep ERROR

# Inspect recent processing times
grep "processing_time" logs/app.log | tail -100
```

## 10. Upgrade Guide

### Version Upgrades

```bash
# Docker upgrade
docker-compose pull
docker-compose up -d

# Kubernetes upgrade
kubectl set image deployment/invoice-ocr-mcp invoice-ocr-mcp=your-registry/invoice-ocr-mcp:new-version -n invoice-ocr
```

### Data Migration

Back up important data before upgrading:

```bash
# Back up configuration and data
tar -czf backup-$(date +%Y%m%d).tar.gz configs/ data/ cache/
```

## Support

If you run into deployment problems:

1. See the [troubleshooting documentation](troubleshooting.md)
2. Open a [GitHub Issue](https://github.com/your-org/invoice-ocr-mcp/issues)
3. Contact technical support: support@example.com
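
A note on the `secret.yaml` manifest in the Kubernetes section: its `modelscope-token` field takes the token in base64 form, and the guide does not show how to produce that value. The following is a minimal sketch of one way to do it in Python. The helper names `encode_secret_value` and `render_secret_manifest` are hypothetical and not part of this project; the Secret name, namespace, and key are the ones used in `secret.yaml`, and the token is assumed to be available in the `MODELSCOPE_API_TOKEN` environment variable.

```python
import base64
import os


def encode_secret_value(token: str) -> str:
    """Return the base64 text that a Kubernetes Secret expects under `data`."""
    return base64.b64encode(token.encode("utf-8")).decode("ascii")


def render_secret_manifest(token: str) -> str:
    """Build a secret.yaml body with the same names the guide's manifest uses."""
    # Hypothetical helper: it only fills in the placeholder from secret.yaml.
    return "\n".join(
        [
            "apiVersion: v1",
            "kind: Secret",
            "metadata:",
            "  name: invoice-ocr-secrets",
            "  namespace: invoice-ocr",
            "type: Opaque",
            "data:",
            f"  modelscope-token: {encode_secret_value(token)}",
            "",
        ]
    )


if __name__ == "__main__":
    # Assumption: the token has been exported in the shell beforehand.
    token = os.environ.get("MODELSCOPE_API_TOKEN", "")
    if not token:
        raise SystemExit("MODELSCOPE_API_TOKEN is not set")
    print(render_secret_manifest(token), end="")
```

Run as a script, this prints a manifest that can be redirected to `k8s/secret.yaml`. Base64 is an encoding rather than encryption, so the generated file should be kept out of version control.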