inference-server

Libraries and server to build AI applications. Adapters to various native bindings allowing local inference. Integrate it with your application, or use as a microservice.

github.com/iimez/inference-server

### CLI

The package comes with a few CLI commands to help you manage model files.

```bash
$ infs --help
infs <command>

Commands:
  infs list [configPath]     List stored models  [aliases: ls, dir]
  infs show <modelName>      Print details of a model  [aliases: info, details]
  infs prepare <configPath>  Prepare models defined in configuration  [aliases: prep, download]
  infs remove <pattern>      Delete models matching the pattern  [aliases: rm, del]
```

#### List

Print currently stored models.

```bash
$ infs list
Models cache path: /home/user/.cache/inference-server/models
Total cache size: 392 GB

156 files:
└── huggingface.co (342 GB)
    ├── Combatti (2.02 GB)
    │   └── llama3.2-3B-FunctionCalling-main (2.02 GB)
    ├── Comfy-Org (6.53 GB)
    │   └── stable-diffusion-3.5-fp8-main (6.53 GB)
    │       └── text_encoders (6.53 GB)
    ├── HuggingFaceTB (1.86 GB)
    │   ├── SmolLM2-1.7B-Instruct-main (1.72 GB)
    │   │   └── onnx (1.71 GB)
    │   └── smollm-135M-instruct-v0.2-Q8_0-GGUF-main (145 MB)
```

Positional arguments:

- `configPath`: Path to the configuration file. If not specified, the command looks for `'infs.config.js', 'infs.config.mjs', 'infs.config.json', 'package.json'` in the current working directory. `list` only prints models that are defined in the configuration file. If no configuration file is found, all models in the cache are shown.

Flags:

- `--all` `-a`: Show all models in the cache, regardless of the configuration file.
- `--json` `-j`: Output in JSON format.
- `--files` `-f`: Only directories are shown by default. Use this flag to show files as well.
- `--list` `-l`: Output as a flat list instead of a tree, for easier parsing and copying.

#### Remove

Delete one or more models from the cache by their path pattern.

```bash
$ infs remove huggingface.co/Combatti/llama3.2-3B-FunctionCalling-main
└── llama3.2-3B-FunctionCalling-main (2.02 GB)
    └── unsloth.Q4_K_M.gguf (2.02 GB)

This will remove one file freeing 2.02 GB total.
Delete huggingface.co/Combatti/llama3.2-3B-FunctionCalling-main from disk?
(y/N): y
Deleting llama3.2-3B-FunctionCalling-main ... Done
```

Note that this command takes a glob pattern, so you can, for example, delete only certain quants or all of a hub organization's models.

Positional arguments:

- `pattern`: Path pattern to match models to delete. Use `infs list` to see the available models.

Flags:

- `--yes` `-y`: Skip the confirmation prompt.

#### Prepare

Run preparation tasks for all models defined in the configuration file. This command downloads and validates the model files.

```bash
$ infs prepare tests/testmodels.config.js
```

Positional arguments:

- `configPath`: Path to the configuration file. If not specified, the command looks for `'infs.config.js', 'infs.config.mjs', 'infs.config.json', 'package.json'` in the current working directory.

Flags:

- `--concurrency` `-c`: Number of concurrent preparation tasks. Default is 1.
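
The configuration lookup described for `list` and `prepare` can be approximated in a script, for instance to check which file would be picked up before running a long download. The following is a minimal sketch, not the tool's actual implementation: `find_infs_config` is a hypothetical helper, and it assumes the candidates are tried in the order they are listed above and that the first existing file wins. It does not inspect the contents of `package.json`.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of infs): print the first configuration
# file candidate that exists in the given directory.
# Assumption: candidates are tried in the order the documentation lists them.
find_infs_config() {
  local dir="${1:-.}"   # directory to search, defaults to the current one
  local name
  for name in infs.config.js infs.config.mjs infs.config.json package.json; do
    if [ -f "$dir/$name" ]; then
      printf '%s\n' "$dir/$name"
      return 0
    fi
  done
  return 1              # no candidate found
}
```

A script could then pass the result on explicitly, for example `config="$(find_infs_config .)" && infs prepare "$config" --concurrency 2`, which uses only the positional argument and flag documented above.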