
@lobehub/chat

Lobe Chat - an open-source, high-performance chatbot framework that supports speech synthesis, multimodal, and extensible Function Call plugin system. Supports one-click free deployment of your private ChatGPT/LLM web application.

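The body of the file is a single JSON object that maps each provider model ID to an object carrying a localized description string. A minimal TypeScript sketch of that shape and of one possible lookup follows; the type names and the describeModel helper are illustrative assumptions for reading this data, not exports of @lobehub/chat.

// Illustrative sketch only: the types and helper below describe the shape of the
// JSON that follows (model ID -> { description }); they are not part of the package's API.
interface ModelDescriptionEntry {
  description: string;
}

type ModelDescriptionMap = Record<string, ModelDescriptionEntry>;

// Hypothetical lookup: return the localized description for a model ID,
// falling back to a placeholder when the ID is not present in the map.
function describeModel(map: ModelDescriptionMap, modelId: string): string {
  return map[modelId]?.description ?? 'No description available.';
}

// Example entry copied (abridged) from the data below.
const models: ModelDescriptionMap = {
  'Qwen/Qwen3-32B': {
    description:
      'Qwen3 is a next-generation model with significantly enhanced capabilities ...',
  },
};

console.log(describeModel(models, 'Qwen/Qwen3-32B'));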
{ "01-ai/yi-1.5-34b-chat": { "description": "Zero One Everything, the latest open-source fine-tuned model with 34 billion parameters, supports various dialogue scenarios with high-quality training data aligned with human preferences." }, "01-ai/yi-1.5-9b-chat": { "description": "Zero One Everything, the latest open-source fine-tuned model with 9 billion parameters, supports various dialogue scenarios with high-quality training data aligned with human preferences." }, "360/deepseek-r1": { "description": "[360 Deployment Version] DeepSeek-R1 extensively utilizes reinforcement learning techniques in the post-training phase, significantly enhancing model inference capabilities with minimal labeled data. It performs comparably to OpenAI's o1 official version in tasks such as mathematics, coding, and natural language reasoning." }, "360gpt-pro": { "description": "360GPT Pro, as an important member of the 360 AI model series, meets diverse natural language application scenarios with efficient text processing capabilities, supporting long text understanding and multi-turn dialogue." }, "360gpt-pro-trans": { "description": "A translation-specific model, finely tuned for optimal translation results." }, "360gpt-turbo": { "description": "360GPT Turbo offers powerful computation and dialogue capabilities, with excellent semantic understanding and generation efficiency, making it an ideal intelligent assistant solution for enterprises and developers." }, "360gpt-turbo-responsibility-8k": { "description": "360GPT Turbo Responsibility 8K emphasizes semantic safety and responsibility, designed specifically for applications with high content safety requirements, ensuring accuracy and robustness in user experience." }, "360gpt2-o1": { "description": "360gpt2-o1 builds a chain of thought using tree search and incorporates a reflection mechanism, trained with reinforcement learning, enabling the model to self-reflect and correct errors." }, "360gpt2-pro": { "description": "360GPT2 Pro is an advanced natural language processing model launched by 360, featuring exceptional text generation and understanding capabilities, particularly excelling in generation and creative tasks, capable of handling complex language transformations and role-playing tasks." }, "360zhinao2-o1": { "description": "360zhinao2-o1 uses tree search to build a chain of thought and introduces a reflection mechanism, utilizing reinforcement learning for training, enabling the model to possess self-reflection and error-correction capabilities." }, "4.0Ultra": { "description": "Spark4.0 Ultra is the most powerful version in the Spark large model series, enhancing text content understanding and summarization capabilities while upgrading online search links. It is a comprehensive solution for improving office productivity and accurately responding to demands, leading the industry as an intelligent product." }, "AnimeSharp": { "description": "AnimeSharp (also known as “4x-AnimeSharp”) is an open-source super-resolution model developed by Kim2091 based on the ESRGAN architecture, focusing on upscaling and sharpening anime-style images. It was renamed from “4x-TextSharpV1” in February 2022, originally also suitable for text images but significantly optimized for anime content." }, "Baichuan2-Turbo": { "description": "Utilizes search enhancement technology to achieve comprehensive links between large models and domain knowledge, as well as knowledge from the entire web. 
Supports uploads of various documents such as PDF and Word, and URL input, providing timely and comprehensive information retrieval with accurate and professional output." }, "Baichuan3-Turbo": { "description": "Optimized for high-frequency enterprise scenarios, significantly improving performance and cost-effectiveness. Compared to the Baichuan2 model, content creation improves by 20%, knowledge Q&A by 17%, and role-playing ability by 40%. Overall performance is superior to GPT-3.5." }, "Baichuan3-Turbo-128k": { "description": "Features a 128K ultra-long context window, optimized for high-frequency enterprise scenarios, significantly improving performance and cost-effectiveness. Compared to the Baichuan2 model, content creation improves by 20%, knowledge Q&A by 17%, and role-playing ability by 40%. Overall performance is superior to GPT-3.5." }, "Baichuan4": { "description": "The model is the best in the country, surpassing mainstream foreign models in Chinese tasks such as knowledge encyclopedias, long texts, and creative generation. It also boasts industry-leading multimodal capabilities, excelling in multiple authoritative evaluation benchmarks." }, "Baichuan4-Air": { "description": "The leading model in the country, surpassing mainstream foreign models in Chinese tasks such as knowledge encyclopedias, long texts, and creative generation. It also possesses industry-leading multimodal capabilities, excelling in multiple authoritative evaluation benchmarks." }, "Baichuan4-Turbo": { "description": "The leading model in the country, surpassing mainstream foreign models in Chinese tasks such as knowledge encyclopedias, long texts, and creative generation. It also possesses industry-leading multimodal capabilities, excelling in multiple authoritative evaluation benchmarks." }, "ByteDance-Seed/Seed-OSS-36B-Instruct": { "description": "Seed-OSS is a series of open-source large language models developed by ByteDance's Seed team, designed specifically for powerful long-context processing, reasoning, agents, and general capabilities. The Seed-OSS-36B-Instruct in this series is an instruction-tuned model with 36 billion parameters, natively supporting ultra-long context lengths, enabling it to handle massive documents or complex codebases in a single pass. This model is specially optimized for reasoning, code generation, and agent tasks (such as tool usage), while maintaining balanced and excellent general capabilities. A key feature of this model is the \"Thinking Budget\" function, which allows users to flexibly adjust the reasoning length as needed, effectively improving reasoning efficiency in practical applications." }, "DeepSeek-R1": { "description": "A state-of-the-art efficient LLM, skilled in reasoning, mathematics, and programming." }, "DeepSeek-R1-Distill-Llama-70B": { "description": "DeepSeek R1— the larger and smarter model in the DeepSeek suite— distilled into the Llama 70B architecture. Based on benchmark testing and human evaluation, this model is smarter than the original Llama 70B, particularly excelling in tasks requiring mathematical and factual accuracy." }, "DeepSeek-R1-Distill-Qwen-1.5B": { "description": "The DeepSeek-R1 distillation model based on Qwen2.5-Math-1.5B optimizes inference performance through reinforcement learning and cold-start data, refreshing the benchmark for open-source models across multiple tasks." 
}, "DeepSeek-R1-Distill-Qwen-14B": { "description": "The DeepSeek-R1 distillation model based on Qwen2.5-14B optimizes inference performance through reinforcement learning and cold-start data, refreshing the benchmark for open-source models across multiple tasks." }, "DeepSeek-R1-Distill-Qwen-32B": { "description": "The DeepSeek-R1 series optimizes inference performance through reinforcement learning and cold-start data, refreshing the benchmark for open-source models across multiple tasks, surpassing the level of OpenAI-o1-mini." }, "DeepSeek-R1-Distill-Qwen-7B": { "description": "The DeepSeek-R1 distillation model based on Qwen2.5-Math-7B optimizes inference performance through reinforcement learning and cold-start data, refreshing the benchmark for open-source models across multiple tasks." }, "DeepSeek-V3": { "description": "DeepSeek-V3 is a MoE model developed in-house by Deep Seek Company. Its performance surpasses that of other open-source models such as Qwen2.5-72B and Llama-3.1-405B in multiple assessments, and it stands on par with the world's top proprietary models like GPT-4o and Claude-3.5-Sonnet." }, "DeepSeek-V3-1": { "description": "DeepSeek V3.1: Next-generation reasoning model, enhancing complex reasoning and chain-of-thought capabilities, ideal for tasks requiring in-depth analysis." }, "DeepSeek-V3-Fast": { "description": "Model provider: sophnet platform. DeepSeek V3 Fast is the high-TPS ultra-fast version of DeepSeek V3 0324, fully powered without quantization, featuring enhanced coding and mathematical capabilities for faster response!" }, "DeepSeek-V3.1": { "description": "DeepSeek-V3.1 - Non-Thinking Mode; DeepSeek-V3.1 is a newly launched hybrid reasoning model by DeepSeek, supporting both thinking and non-thinking reasoning modes, with higher thinking efficiency compared to DeepSeek-R1-0528. Post-training optimization significantly enhances agent tool usage and agent task performance." }, "DeepSeek-V3.1-Fast": { "description": "DeepSeek V3.1 Fast is the high-TPS, ultra-fast version of DeepSeek V3.1. Hybrid Thinking Mode: By changing the chat template, a single model can support both thinking and non-thinking modes simultaneously. Smarter Tool Invocation: Post-training optimization significantly improves the model's performance in tool usage and agent tasks." }, "DeepSeek-V3.1-Think": { "description": "DeepSeek-V3.1 - Thinking Mode; DeepSeek-V3.1 is a newly launched hybrid reasoning model by DeepSeek, supporting both thinking and non-thinking reasoning modes, with higher thinking efficiency compared to DeepSeek-R1-0528. Post-training optimization significantly enhances agent tool usage and agent task performance." }, "DeepSeek-V3.2-Exp": { "description": "DeepSeek V3.2 is the latest general-purpose large model released by DeepSeek, supporting a hybrid inference architecture and featuring enhanced Agent capabilities." }, "DeepSeek-V3.2-Exp-Think": { "description": "DeepSeek V3.2 Thinking Mode. Before outputting the final answer, the model first generates a chain of thought to improve the accuracy of the final response." }, "Doubao-lite-128k": { "description": "Doubao-lite offers ultra-fast response times and better cost-effectiveness, providing customers with more flexible options for different scenarios. Supports inference and fine-tuning with a 128k context window." }, "Doubao-lite-32k": { "description": "Doubao-lite offers ultra-fast response times and better cost-effectiveness, providing customers with more flexible options for different scenarios. 
Supports inference and fine-tuning with a 32k context window." }, "Doubao-lite-4k": { "description": "Doubao-lite offers ultra-fast response times and better cost-effectiveness, providing customers with more flexible options for different scenarios. Supports inference and fine-tuning with a 4k context window." }, "Doubao-pro-128k": { "description": "The best-performing flagship model, suitable for handling complex tasks. It excels in scenarios such as reference Q&A, summarization, creative writing, text classification, and role-playing. Supports inference and fine-tuning with a 128k context window." }, "Doubao-pro-32k": { "description": "The best-performing flagship model, suitable for handling complex tasks. It excels in scenarios such as reference Q&A, summarization, creative writing, text classification, and role-playing. Supports inference and fine-tuning with a 32k context window." }, "Doubao-pro-4k": { "description": "The best-performing flagship model, suitable for handling complex tasks. It excels in scenarios such as reference Q&A, summarization, creative writing, text classification, and role-playing. Supports inference and fine-tuning with a 4k context window." }, "DreamO": { "description": "DreamO is an open-source image customization generation model jointly developed by ByteDance and Peking University, designed to support multi-task image generation through a unified architecture. It employs an efficient compositional modeling approach to generate highly consistent and customized images based on multiple user-specified conditions such as identity, subject, style, and background." }, "ERNIE-3.5-128K": { "description": "Baidu's self-developed flagship large-scale language model, covering a vast amount of Chinese and English corpus. It possesses strong general capabilities, meeting the requirements for most dialogue Q&A, creative generation, and plugin application scenarios; it supports automatic integration with Baidu's search plugin to ensure the timeliness of Q&A information." }, "ERNIE-3.5-8K": { "description": "Baidu's self-developed flagship large-scale language model, covering a vast amount of Chinese and English corpus. It possesses strong general capabilities, meeting the requirements for most dialogue Q&A, creative generation, and plugin application scenarios; it supports automatic integration with Baidu's search plugin to ensure the timeliness of Q&A information." }, "ERNIE-3.5-8K-Preview": { "description": "Baidu's self-developed flagship large-scale language model, covering a vast amount of Chinese and English corpus. It possesses strong general capabilities, meeting the requirements for most dialogue Q&A, creative generation, and plugin application scenarios; it supports automatic integration with Baidu's search plugin to ensure the timeliness of Q&A information." }, "ERNIE-4.0-8K-Latest": { "description": "Baidu's self-developed flagship ultra-large-scale language model, which has achieved a comprehensive upgrade in model capabilities compared to ERNIE 3.5, widely applicable to complex task scenarios across various fields; supports automatic integration with Baidu search plugins to ensure the timeliness of Q&A information." 
}, "ERNIE-4.0-8K-Preview": { "description": "Baidu's self-developed flagship ultra-large-scale language model, which has achieved a comprehensive upgrade in model capabilities compared to ERNIE 3.5, widely applicable to complex task scenarios across various fields; supports automatic integration with Baidu search plugins to ensure the timeliness of Q&A information." }, "ERNIE-4.0-Turbo-8K-Latest": { "description": "Baidu's self-developed flagship ultra-large-scale language model, demonstrating excellent overall performance, suitable for complex task scenarios across various fields; supports automatic integration with Baidu search plugins to ensure the timeliness of Q&A information. It offers better performance compared to ERNIE 4.0." }, "ERNIE-4.0-Turbo-8K-Preview": { "description": "Baidu's self-developed flagship ultra-large-scale language model, demonstrating excellent overall performance, widely applicable to complex task scenarios across various fields; supports automatic integration with Baidu search plugins to ensure the timeliness of Q&A information. It outperforms ERNIE 4.0 in performance." }, "ERNIE-Character-8K": { "description": "Baidu's self-developed vertical scene large language model, suitable for applications such as game NPCs, customer service dialogues, and role-playing conversations, featuring more distinct and consistent character styles, stronger adherence to instructions, and superior inference performance." }, "ERNIE-Lite-Pro-128K": { "description": "Baidu's self-developed lightweight large language model, balancing excellent model performance with inference efficiency, offering better results than ERNIE Lite, suitable for inference on low-power AI acceleration cards." }, "ERNIE-Speed-128K": { "description": "Baidu's latest self-developed high-performance large language model released in 2024, with outstanding general capabilities, suitable as a base model for fine-tuning, effectively addressing specific scenario issues while also exhibiting excellent inference performance." }, "ERNIE-Speed-Pro-128K": { "description": "Baidu's latest self-developed high-performance large language model released in 2024, with outstanding general capabilities, providing better results than ERNIE Speed, suitable as a base model for fine-tuning, effectively addressing specific scenario issues while also exhibiting excellent inference performance." }, "FLUX-1.1-pro": { "description": "FLUX.1.1 Pro" }, "FLUX.1-Kontext-dev": { "description": "FLUX.1-Kontext-dev is a multimodal image generation and editing model developed by Black Forest Labs based on the Rectified Flow Transformer architecture, featuring 12 billion parameters. It specializes in generating, reconstructing, enhancing, or editing images under given contextual conditions. The model combines the controllable generation advantages of diffusion models with the contextual modeling capabilities of Transformers, supporting high-quality image output and widely applicable to image restoration, completion, and visual scene reconstruction tasks." }, "FLUX.1-Kontext-pro": { "description": "FLUX.1 Kontext [pro]" }, "FLUX.1-dev": { "description": "FLUX.1-dev is an open-source multimodal language model (MLLM) developed by Black Forest Labs, optimized for vision-and-language tasks by integrating image and text understanding and generation capabilities. 
Built upon advanced large language models such as Mistral-7B, it achieves vision-language collaborative processing and complex task reasoning through a carefully designed visual encoder and multi-stage instruction fine-tuning." }, "Gryphe/MythoMax-L2-13b": { "description": "MythoMax-L2 (13B) is an innovative model suitable for multi-domain applications and complex tasks." }, "HelloMeme": { "description": "HelloMeme is an AI tool that automatically generates memes, GIFs, or short videos based on the images or actions you provide. It requires no drawing or programming skills; simply prepare reference images, and it will help you create visually appealing, fun, and stylistically consistent content." }, "HiDream-I1-Full": { "description": "HiDream-E1-Full is an open-source multimodal image editing large model launched by HiDream.ai, based on the advanced Diffusion Transformer architecture combined with powerful language understanding capabilities (embedded LLaMA 3.1-8B-Instruct). It supports image generation, style transfer, local editing, and content repainting through natural language instructions, demonstrating excellent vision-language comprehension and execution abilities." }, "HunyuanDiT-v1.2-Diffusers-Distilled": { "description": "hunyuandit-v1.2-distilled is a lightweight text-to-image model optimized through distillation, capable of rapidly generating high-quality images, especially suitable for low-resource environments and real-time generation tasks." }, "InstantCharacter": { "description": "InstantCharacter is a tuning-free personalized character generation model released by Tencent AI team in 2025, designed to achieve high-fidelity, cross-scene consistent character generation. The model supports character modeling based on a single reference image and can flexibly transfer the character to various styles, actions, and backgrounds." }, "InternVL2-8B": { "description": "InternVL2-8B is a powerful visual language model that supports multimodal processing of images and text, capable of accurately recognizing image content and generating relevant descriptions or answers." }, "InternVL2.5-26B": { "description": "InternVL2.5-26B is a powerful visual language model that supports multimodal processing of images and text, capable of accurately recognizing image content and generating relevant descriptions or answers." }, "Kolors": { "description": "Kolors is a text-to-image model developed by the Kuaishou Kolors team. Trained with billions of parameters, it excels in visual quality, Chinese semantic understanding, and text rendering." }, "Kwai-Kolors/Kolors": { "description": "Kolors is a large-scale latent diffusion text-to-image generation model developed by the Kuaishou Kolors team. Trained on billions of text-image pairs, it demonstrates significant advantages in visual quality, complex semantic accuracy, and Chinese and English character rendering. It supports both Chinese and English inputs and performs exceptionally well in understanding and generating Chinese-specific content." }, "Llama-3.2-11B-Vision-Instruct": { "description": "Exhibits outstanding image reasoning capabilities on high-resolution images, suitable for visual understanding applications." }, "Llama-3.2-90B-Vision-Instruct\t": { "description": "Advanced image reasoning capabilities suitable for visual understanding agent applications." }, "Meta-Llama-3-3-70B-Instruct": { "description": "Llama 3.3 70B: A versatile Transformer model suitable for conversational and generative tasks." 
}, "Meta-Llama-3.1-405B-Instruct": { "description": "Llama 3.1 instruction-tuned text model optimized for multilingual dialogue use cases, performing excellently on common industry benchmarks among many available open-source and closed chat models." }, "Meta-Llama-3.1-70B-Instruct": { "description": "Llama 3.1 instruction-tuned text model optimized for multilingual dialogue use cases, performing excellently on common industry benchmarks among many available open-source and closed chat models." }, "Meta-Llama-3.1-8B-Instruct": { "description": "Llama 3.1 instruction-tuned text model optimized for multilingual dialogue use cases, performing excellently on common industry benchmarks among many available open-source and closed chat models." }, "Meta-Llama-3.2-1B-Instruct": { "description": "An advanced cutting-edge small language model with language understanding, excellent reasoning capabilities, and text generation abilities." }, "Meta-Llama-3.2-3B-Instruct": { "description": "An advanced cutting-edge small language model with language understanding, excellent reasoning capabilities, and text generation abilities." }, "Meta-Llama-3.3-70B-Instruct": { "description": "Llama 3.3 is the most advanced multilingual open-source large language model in the Llama series, offering performance comparable to a 405B model at a very low cost. Based on the Transformer architecture, it enhances usability and safety through supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF). Its instruction-tuned version is optimized for multilingual dialogue and outperforms many open-source and closed chat models on various industry benchmarks. Knowledge cutoff date is December 2023." }, "Meta-Llama-4-Maverick-17B-128E-Instruct-FP8": { "description": "Llama 4 Maverick: A large-scale model based on Mixture-of-Experts, offering an efficient expert activation strategy for superior inference performance." }, "MiniMax-M1": { "description": "A newly developed inference model. World-leading: 80K chain-of-thought x 1M input, delivering performance on par with top-tier international models." }, "MiniMax-M2": { "description": "Purpose-built for efficient coding and agent workflows." }, "MiniMax-Text-01": { "description": "In the MiniMax-01 series of models, we have made bold innovations: for the first time, we have implemented a linear attention mechanism on a large scale, making the traditional Transformer architecture no longer the only option. This model has a parameter count of up to 456 billion, with a single activation of 45.9 billion. Its overall performance rivals that of top overseas models while efficiently handling the world's longest context of 4 million tokens, which is 32 times that of GPT-4o and 20 times that of Claude-3.5-Sonnet." }, "MiniMaxAI/MiniMax-M1-80k": { "description": "MiniMax-M1 is a large-scale hybrid attention inference model with open-source weights, featuring 456 billion parameters, with approximately 45.9 billion parameters activated per token. The model natively supports ultra-long contexts of up to 1 million tokens and, through lightning attention mechanisms, reduces floating-point operations by 75% compared to DeepSeek R1 in tasks generating 100,000 tokens. Additionally, MiniMax-M1 employs a Mixture of Experts (MoE) architecture, combining the CISPO algorithm with an efficient reinforcement learning training design based on hybrid attention, achieving industry-leading performance in long-input inference and real-world software engineering scenarios." 
}, "Moonshot-Kimi-K2-Instruct": { "description": "With a total of 1 trillion parameters and 32 billion activated parameters, this non-thinking model achieves top-tier performance in cutting-edge knowledge, mathematics, and coding, excelling in general agent tasks. It is carefully optimized for agent tasks, capable not only of answering questions but also taking actions. Ideal for improvisational, general chat, and agent experiences, it is a reflex-level model requiring no prolonged thinking." }, "NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO": { "description": "Nous Hermes 2 - Mixtral 8x7B-DPO (46.7B) is a high-precision instruction model suitable for complex computations." }, "OmniConsistency": { "description": "OmniConsistency enhances style consistency and generalization in image-to-image tasks by introducing large-scale Diffusion Transformers (DiTs) and paired stylized data, effectively preventing style degradation." }, "Phi-3-medium-128k-instruct": { "description": "The same Phi-3-medium model, but with a larger context size for RAG or few-shot prompting." }, "Phi-3-medium-4k-instruct": { "description": "A 14B parameter model that provides better quality than Phi-3-mini, focusing on high-quality, reasoning-dense data." }, "Phi-3-mini-128k-instruct": { "description": "The same Phi-3-mini model, but with a larger context size for RAG or few-shot prompting." }, "Phi-3-mini-4k-instruct": { "description": "The smallest member of the Phi-3 family, optimized for both quality and low latency." }, "Phi-3-small-128k-instruct": { "description": "The same Phi-3-small model, but with a larger context size for RAG or few-shot prompting." }, "Phi-3-small-8k-instruct": { "description": "A 7B parameter model that provides better quality than Phi-3-mini, focusing on high-quality, reasoning-dense data." }, "Phi-3.5-mini-instruct": { "description": "An updated version of the Phi-3-mini model." }, "Phi-3.5-vision-instrust": { "description": "An updated version of the Phi-3-vision model." }, "Pro/Qwen/Qwen2-7B-Instruct": { "description": "Qwen2-7B-Instruct is an instruction-tuned large language model in the Qwen2 series, with a parameter size of 7B. This model is based on the Transformer architecture and employs techniques such as the SwiGLU activation function, attention QKV bias, and group query attention. It can handle large-scale inputs. The model excels in language understanding, generation, multilingual capabilities, coding, mathematics, and reasoning across multiple benchmark tests, surpassing most open-source models and demonstrating competitive performance comparable to proprietary models in certain tasks. Qwen2-7B-Instruct outperforms Qwen1.5-7B-Chat in multiple evaluations, showing significant performance improvements." }, "Pro/Qwen/Qwen2.5-7B-Instruct": { "description": "Qwen2.5-7B-Instruct is one of the latest large language models released by Alibaba Cloud. This 7B model shows significant improvements in coding and mathematics. It also provides multilingual support, covering over 29 languages, including Chinese and English. The model has made notable advancements in instruction following, understanding structured data, and generating structured outputs, especially JSON." }, "Pro/Qwen/Qwen2.5-Coder-7B-Instruct": { "description": "Qwen2.5-Coder-7B-Instruct is the latest version in Alibaba Cloud's series of code-specific large language models. This model significantly enhances code generation, reasoning, and repair capabilities based on Qwen2.5, trained on 55 trillion tokens. 
It not only improves coding abilities but also maintains advantages in mathematics and general capabilities, providing a more comprehensive foundation for practical applications such as code agents." }, "Pro/Qwen/Qwen2.5-VL-7B-Instruct": { "description": "Qwen2.5-VL is the newest addition to the Qwen series, featuring enhanced visual comprehension capabilities. It can analyze text, charts, and layouts within images, comprehend long videos while capturing events. The model supports reasoning, tool manipulation, multi-format object localization, and structured output generation. It incorporates optimized dynamic resolution and frame rate training for video understanding, along with improved efficiency in its visual encoder." }, "Pro/THUDM/GLM-4.1V-9B-Thinking": { "description": "GLM-4.1V-9B-Thinking is an open-source vision-language model (VLM) jointly released by Zhipu AI and Tsinghua University's KEG Lab, designed specifically for handling complex multimodal cognitive tasks. Based on the GLM-4-9B-0414 foundation model, it significantly enhances cross-modal reasoning ability and stability by introducing the Chain-of-Thought reasoning mechanism and employing reinforcement learning strategies." }, "Pro/THUDM/glm-4-9b-chat": { "description": "GLM-4-9B-Chat is the open-source version of the GLM-4 series pre-trained models launched by Zhipu AI. This model excels in semantics, mathematics, reasoning, code, and knowledge. In addition to supporting multi-turn dialogues, GLM-4-9B-Chat also features advanced capabilities such as web browsing, code execution, custom tool invocation (Function Call), and long-text reasoning. The model supports 26 languages, including Chinese, English, Japanese, Korean, and German. In multiple benchmark tests, GLM-4-9B-Chat has demonstrated excellent performance, such as in AlignBench-v2, MT-Bench, MMLU, and C-Eval. The model supports a maximum context length of 128K, making it suitable for academic research and commercial applications." }, "Pro/deepseek-ai/DeepSeek-R1": { "description": "DeepSeek-R1 is a reinforcement learning (RL) driven inference model that addresses issues of repetitiveness and readability in models. Prior to RL, DeepSeek-R1 introduced cold start data to further optimize inference performance. It performs comparably to OpenAI-o1 in mathematical, coding, and reasoning tasks, and enhances overall effectiveness through carefully designed training methods." }, "Pro/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B": { "description": "DeepSeek-R1-Distill-Qwen-7B is a model derived from Qwen2.5-Math-7B through knowledge distillation. It was fine-tuned using 800,000 carefully selected samples generated by DeepSeek-R1, demonstrating exceptional reasoning capabilities. The model achieves outstanding performance across multiple benchmarks, including 92.8% accuracy on MATH-500, a 55.5% pass rate on AIME 2024, and a score of 1189 on CodeForces, showcasing strong mathematical and programming abilities for a 7B-scale model." }, "Pro/deepseek-ai/DeepSeek-V3": { "description": "DeepSeek-V3 is a mixed expert (MoE) language model with 671 billion parameters, utilizing multi-head latent attention (MLA) and the DeepSeekMoE architecture, combined with a load balancing strategy without auxiliary loss to optimize inference and training efficiency. Pre-trained on 14.8 trillion high-quality tokens and fine-tuned with supervision and reinforcement learning, DeepSeek-V3 outperforms other open-source models and approaches leading closed-source models." 
}, "Pro/deepseek-ai/DeepSeek-V3.1-Terminus": { "description": "DeepSeek-V3.1-Terminus is an updated version of the V3.1 model released by DeepSeek, positioned as a hybrid agent large language model. This update focuses on fixing user-reported issues and improving stability while maintaining the model's original capabilities. It significantly enhances language consistency, reducing the mixing of Chinese and English and the occurrence of abnormal characters. The model integrates both \"Thinking Mode\" and \"Non-thinking Mode,\" allowing users to switch flexibly between chat templates to suit different tasks. As a key optimization, V3.1-Terminus improves the performance of the Code Agent and Search Agent, making tool invocation and multi-step complex task execution more reliable." }, "Pro/deepseek-ai/DeepSeek-V3.2-Exp": { "description": "DeepSeek-V3.2-Exp is an experimental version released by DeepSeek as an intermediate step toward the next-generation architecture. Building on V3.1-Terminus, it introduces the DeepSeek Sparse Attention (DSA) mechanism to enhance training and inference efficiency for long-context scenarios. It features targeted optimizations for tool use, long-document comprehension, and multi-step reasoning. V3.2-Exp serves as a bridge between research and production, ideal for users seeking higher inference efficiency in high-context-budget applications." }, "Pro/moonshotai/Kimi-K2-Instruct-0905": { "description": "Kimi K2-Instruct-0905 is the latest and most powerful version of Kimi K2. It is a top-tier Mixture of Experts (MoE) language model with a total of 1 trillion parameters and 32 billion activated parameters. Key features of this model include enhanced agent coding intelligence, demonstrating significant performance improvements in public benchmark tests and real-world agent coding tasks; and an improved frontend coding experience, with advancements in both aesthetics and practicality for frontend programming." }, "QwQ-32B-Preview": { "description": "QwQ-32B-Preview is an innovative natural language processing model capable of efficiently handling complex dialogue generation and context understanding tasks." }, "Qwen/QVQ-72B-Preview": { "description": "QVQ-72B-Preview is a research-oriented model developed by the Qwen team, focusing on visual reasoning capabilities, with unique advantages in understanding complex scenes and solving visually related mathematical problems." }, "Qwen/QwQ-32B": { "description": "QwQ is the inference model of the Qwen series. Compared to traditional instruction-tuned models, QwQ possesses reasoning and cognitive abilities, achieving significantly enhanced performance in downstream tasks, especially in solving difficult problems. QwQ-32B is a medium-sized inference model that competes effectively against state-of-the-art inference models (such as DeepSeek-R1 and o1-mini). This model employs technologies such as RoPE, SwiGLU, RMSNorm, and Attention QKV bias, featuring a 64-layer network structure and 40 Q attention heads (with 8 KV heads in the GQA architecture)." }, "Qwen/QwQ-32B-Preview": { "description": "QwQ-32B-Preview is Qwen's latest experimental research model, focusing on enhancing AI reasoning capabilities. By exploring complex mechanisms such as language mixing and recursive reasoning, its main advantages include strong analytical reasoning, mathematical, and programming abilities. However, it also faces challenges such as language switching issues, reasoning loops, safety considerations, and differences in other capabilities." 
}, "Qwen/Qwen-Image": { "description": "Qwen-Image is a foundational image generation model developed by Alibaba's Tongyi Qianwen team, featuring 20 billion parameters. The model has made significant advancements in complex text rendering and precise image editing, excelling particularly at generating images with high-fidelity Chinese and English text. Qwen-Image can handle multi-line layouts and paragraph-level text while maintaining coherent typography and contextual harmony in generated images. Beyond its exceptional text rendering capabilities, the model supports a wide range of artistic styles—from photorealism to anime aesthetics—adapting flexibly to diverse creative needs. It also boasts powerful image editing and understanding capabilities, supporting advanced operations such as style transfer, object addition/removal, detail enhancement, text editing, and even human pose manipulation. Qwen-Image is designed to be a comprehensive foundational model for intelligent visual creation and processing, integrating language, layout, and imagery." }, "Qwen/Qwen-Image-Edit-2509": { "description": "Qwen-Image-Edit-2509 is the latest image editing version of Qwen-Image, released by Alibaba's Tongyi Qianwen team. Built upon the 20B-parameter Qwen-Image model, it has been further trained to extend its unique text rendering capabilities into the domain of image editing, enabling precise manipulation of text within images. Qwen-Image-Edit employs an innovative architecture that feeds the input image into both Qwen2.5-VL (for visual semantic control) and a VAE Encoder (for visual appearance control), enabling dual editing capabilities in both semantics and appearance. This allows for not only localized visual edits such as adding, removing, or modifying elements, but also high-level semantic edits like IP creation and style transfer that require semantic consistency. The model has demonstrated state-of-the-art (SOTA) performance across multiple public benchmarks, making it a powerful foundational model for image editing." }, "Qwen/Qwen2-72B-Instruct": { "description": "Qwen2 is an advanced general-purpose language model that supports various types of instructions." }, "Qwen/Qwen2-7B-Instruct": { "description": "Qwen2-72B-Instruct is an instruction-tuned large language model in the Qwen2 series, with a parameter size of 72B. This model is based on the Transformer architecture and employs techniques such as the SwiGLU activation function, attention QKV bias, and group query attention. It can handle large-scale inputs. The model excels in language understanding, generation, multilingual capabilities, coding, mathematics, and reasoning across multiple benchmark tests, surpassing most open-source models and demonstrating competitive performance comparable to proprietary models in certain tasks." }, "Qwen/Qwen2-VL-72B-Instruct": { "description": "Qwen2-VL is the latest iteration of the Qwen-VL model, achieving state-of-the-art performance in visual understanding benchmarks." }, "Qwen/Qwen2.5-14B-Instruct": { "description": "Qwen2.5 is a brand new series of large language models designed to optimize the handling of instruction-based tasks." }, "Qwen/Qwen2.5-32B-Instruct": { "description": "Qwen2.5 is a brand new series of large language models designed to optimize the handling of instruction-based tasks." 
}, "Qwen/Qwen2.5-72B-Instruct": { "description": "A large language model developed by the Alibaba Cloud Tongyi Qianwen team" }, "Qwen/Qwen2.5-72B-Instruct-128K": { "description": "Qwen2.5 is a new large language model series with enhanced understanding and generation capabilities." }, "Qwen/Qwen2.5-72B-Instruct-Turbo": { "description": "Qwen2.5 is a new large language model series designed to optimize instruction-based task processing." }, "Qwen/Qwen2.5-7B-Instruct": { "description": "Qwen2.5 is a brand new series of large language models designed to optimize the handling of instruction-based tasks." }, "Qwen/Qwen2.5-7B-Instruct-Turbo": { "description": "Qwen2.5 is a new large language model series designed to optimize instruction-based task processing." }, "Qwen/Qwen2.5-Coder-32B-Instruct": { "description": "Qwen2.5-Coder focuses on code writing." }, "Qwen/Qwen2.5-Coder-7B-Instruct": { "description": "Qwen2.5-Coder-7B-Instruct is the latest version in Alibaba Cloud's series of code-specific large language models. This model significantly enhances code generation, reasoning, and repair capabilities based on Qwen2.5, trained on 55 trillion tokens. It not only improves coding abilities but also maintains advantages in mathematics and general capabilities, providing a more comprehensive foundation for practical applications such as code agents." }, "Qwen/Qwen2.5-VL-32B-Instruct": { "description": "Qwen2.5-VL-32B-Instruct is a multimodal large language model developed by the Tongyi Qianwen team, representing part of the Qwen2.5-VL series. This model excels not only in recognizing common objects but also in analyzing text, charts, icons, graphics, and layouts within images. It functions as a visual agent capable of reasoning and dynamically manipulating tools, with the ability to operate computers and mobile devices. Additionally, the model can precisely locate objects in images and generate structured outputs for documents like invoices and tables. Compared to its predecessor Qwen2-VL, this version demonstrates enhanced mathematical and problem-solving capabilities through reinforcement learning, while also exhibiting more human-preferred response styles." }, "Qwen/Qwen2.5-VL-72B-Instruct": { "description": "Qwen2.5-VL is the vision-language model in the Qwen2.5 series. This model demonstrates significant improvements across multiple dimensions: enhanced visual comprehension capable of recognizing common objects, analyzing text, charts, and layouts; serving as a visual agent that can reason and dynamically guide tool usage; supporting understanding of long videos exceeding 1 hour while capturing key events; able to precisely locate objects in images by generating bounding boxes or points; and capable of producing structured outputs particularly suitable for scanned data like invoices and forms." }, "Qwen/Qwen3-14B": { "description": "Qwen3 is a next-generation model with significantly enhanced capabilities, achieving industry-leading levels in reasoning, general tasks, agent functions, and multilingual support, with a switchable thinking mode." }, "Qwen/Qwen3-235B-A22B": { "description": "Qwen3 is a next-generation model with significantly enhanced capabilities, achieving industry-leading levels in reasoning, general tasks, agent functions, and multilingual support, with a switchable thinking mode." 
}, "Qwen/Qwen3-235B-A22B-Instruct-2507": { "description": "Qwen3-235B-A22B-Instruct-2507 is a flagship mixture-of-experts (MoE) large language model developed by Alibaba Cloud Tongyi Qianwen team within the Qwen3 series. It has 235 billion total parameters with 22 billion activated per inference. Released as an update to the non-thinking mode Qwen3-235B-A22B, it focuses on significant improvements in instruction following, logical reasoning, text comprehension, mathematics, science, programming, and tool usage. Additionally, it enhances coverage of multilingual long-tail knowledge and better aligns with user preferences in subjective and open-ended tasks to generate more helpful and higher-quality text." }, "Qwen/Qwen3-235B-A22B-Thinking-2507": { "description": "Qwen3-235B-A22B-Thinking-2507 is a member of the Qwen3 large language model series developed by Alibaba Tongyi Qianwen team, specializing in complex reasoning tasks. Based on a mixture-of-experts (MoE) architecture with 235 billion total parameters and approximately 22 billion activated per token, it balances strong performance with computational efficiency. As a dedicated “thinking” model, it significantly improves performance in logic reasoning, mathematics, science, programming, and academic benchmarks requiring human expertise, ranking among the top open-source thinking models. It also enhances general capabilities such as instruction following, tool usage, and text generation, natively supports 256K long-context understanding, and is well-suited for scenarios requiring deep reasoning and long document processing." }, "Qwen/Qwen3-30B-A3B": { "description": "Qwen3 is a next-generation model with significantly enhanced capabilities, achieving industry-leading levels in reasoning, general tasks, agent functions, and multilingual support, with a switchable thinking mode." }, "Qwen/Qwen3-30B-A3B-Instruct-2507": { "description": "Qwen3-30B-A3B-Instruct-2507 is an updated version of the Qwen3-30B-A3B non-thinking mode. It is a Mixture of Experts (MoE) model with a total of 30.5 billion parameters and 3.3 billion active parameters. The model features key enhancements across multiple areas, including significant improvements in instruction following, logical reasoning, text comprehension, mathematics, science, coding, and tool usage. Additionally, it has made substantial progress in covering long-tail multilingual knowledge and better aligns with user preferences in subjective and open-ended tasks, enabling it to generate more helpful responses and higher-quality text. Furthermore, its long-text comprehension capability has been extended to 256K tokens. This model supports only the non-thinking mode and does not generate `<think></think>` tags in its output." }, "Qwen/Qwen3-30B-A3B-Thinking-2507": { "description": "Qwen3-30B-A3B-Thinking-2507 is the latest “thinking” model in the Qwen3 series released by Alibaba’s Tongyi Qianwen team. As a mixture-of-experts (MoE) model with 30.5 billion total parameters and 3.3 billion active parameters, it is designed to improve capabilities for handling complex tasks. The model demonstrates significant performance gains on academic benchmarks requiring logical reasoning, mathematics, science, programming, and domain expertise. At the same time, its general abilities—such as instruction following, tool use, text generation, and alignment with human preferences—have been substantially enhanced. The model natively supports long-context understanding of 256K tokens and can scale up to 1 million tokens. 
This version is tailored for “thinking mode,” intended to solve highly complex problems through detailed step-by-step reasoning, and it also exhibits strong agent capabilities." }, "Qwen/Qwen3-32B": { "description": "Qwen3 is a next-generation model with significantly enhanced capabilities, achieving industry-leading levels in reasoning, general tasks, agent functions, and multilingual support, with a switchable thinking mode." }, "Qwen/Qwen3-8B": { "description": "Qwen3 is a next-generation model with significantly enhanced capabilities, achieving industry-leading levels in reasoning, general tasks, agent functions, and multilingual support, with a switchable thinking mode." }, "Qwen/Qwen3-Coder-30B-A3B-Instruct": { "description": "Qwen3-Coder-30B-A3B-Instruct is a code model in the Qwen3 series developed by Alibaba's Tongyi Qianwen team. As a streamlined and optimized model, it focuses on enhancing code-handling capabilities while maintaining high performance and efficiency. The model demonstrates notable advantages among open-source models on complex tasks such as agentic coding, automated browser operations, and tool invocation. It natively supports a long context of 256K tokens and can be extended up to 1M tokens, enabling better understanding and processing at the codebase level. Additionally, the model provides robust agentic coding support for platforms like Qwen Code and CLINE, and it employs a dedicated function-calling format." }, "Qwen/Qwen3-Coder-480B-A35B-Instruct": { "description": "Qwen3-Coder-480B-A35B-Instruct, released by Alibaba, is the most agentic code model to date. It is a mixture-of-experts (MoE) model with 480 billion total parameters and 35 billion active parameters, striking a balance between efficiency and performance. The model natively supports a 256K (~260k) token context window and can be extended to 1,000,000 tokens through extrapolation methods such as YaRN, enabling it to handle large codebases and complex programming tasks. Qwen3-Coder is designed for agent-style coding workflows: it not only generates code but can autonomously interact with development tools and environments to solve complex programming problems. On multiple benchmarks for coding and agent tasks, this model achieves top-tier results among open-source models, with performance comparable to leading models like Claude Sonnet 4." }, "Qwen/Qwen3-Next-80B-A3B-Instruct": { "description": "Qwen3-Next-80B-A3B-Instruct is the next-generation foundational model released by Alibaba's Tongyi Qianwen team. It is based on the brand-new Qwen3-Next architecture, designed to achieve ultimate training and inference efficiency. The model employs an innovative hybrid attention mechanism (Gated DeltaNet and Gated Attention), a highly sparse mixture-of-experts (MoE) structure, and multiple training stability optimizations. As a sparse model with a total of 80 billion parameters, it activates only about 3 billion parameters during inference, significantly reducing computational costs. When handling long-context tasks exceeding 32K tokens, its inference throughput is more than 10 times higher than the Qwen3-32B model. This model is an instruction-tuned version designed for general tasks and does not support the Thinking mode. In terms of performance, it is comparable to Tongyi Qianwen's flagship Qwen3-235B model on some benchmarks, especially demonstrating clear advantages in ultra-long context tasks." 
}, "Qwen/Qwen3-Next-80B-A3B-Thinking": { "description": "Qwen3-Next-80B-A3B-Thinking is the next-generation foundational model released by Alibaba's Tongyi Qianwen team, specifically designed for complex reasoning tasks. It is based on the innovative Qwen3-Next architecture, which integrates a hybrid attention mechanism (Gated DeltaNet and Gated Attention) and a highly sparse mixture-of-experts (MoE) structure, aiming for ultimate training and inference efficiency. As a sparse model with a total of 80 billion parameters, it activates only about 3 billion parameters during infe