@lobehub/chat
Version:
Lobe Chat - an open-source, high-performance chatbot framework that supports speech synthesis, multimodal, and extensible Function Call plugin system. Supports one-click free deployment of your private ChatGPT/LLM web application.
153 lines (94 loc) • 6.26 kB
text/mdx
---
title: Anthropic Claude 系列 Tools Calling 评测
description: >-
使用 LobeChat 测试 Anthropic Claude 系列模型(Claude 3.5 sonnet / Claude 3 Opus / Claude 3 haiku) 的工具调用(Function Calling)能力,并展现评测结果
tags:
- Tools Calling
- Benchmark
- Function Calling 评测
- 工具调用
- 插件
---
# Anthropic Claude Series Tools Calling
Overview of Anthropic Claude Series model Tools Calling capabilities:
| Model | Support Tools Calling | Stream | Parallel | Simple Instruction Score | Complex Instruction |
| ----------------- | --------------------- | ------ | -------- | ------------------------ | ------------------- |
| Claude 3.5 Sonnet | ✅ | ✅ | ✅ | 🌟🌟🌟 | 🌟🌟 |
| Claude 3 Opus | ✅ | ✅ | ❌ | 🌟 | ⛔️ |
| Claude 3 Sonnet | ✅ | ✅ | ❌ | 🌟🌟 | ⛔️ |
| Claude 3 Haiku | ✅ | ✅ | ❌ | 🌟🌟 | ⛔️ |
## Claude 3.5 Sonnet
### Simple Instruction Call: Weather Query
Test Instruction: Instruction ①
<Video src="https://github.com/lobehub/lobe-chat/assets/28616219/42a6980c-ea2a-44fd-b61f-a7989827f5a5" />
<Image alt="Claude 3.5 Sonnet Tools Calling for Simple Instruction" src="https://github.com/lobehub/lobe-chat/assets/28616219/71146b75-2c73-48c3-9688-1d8814d2a791" />
<details>
<summary>Tools Calling Raw Output:</summary>
```yml
```
</details>
### Complex Instruction Call: Literary Map
Test Instruction: Instruction ②
<Video src="https://github.com/lobehub/lobe-chat/assets/28616219/a9a40899-d5f3-4ef2-aa08-922751b05ca6" />
From the above video:
1. Sonnet 3.5 supports Stream Tools Calling and Parallel Tools Calling;
2. In Stream Tools Calling, it is observed that creating long sentences will cause a delay (as seen in the Tools Calling raw output `[chunk 40]` and `[chunk 41]` with a delay of 6s). Therefore, there will be a relatively long waiting time at the beginning stage of Tools Calling.
<Image alt="Claude 3.5 Sonnet Tools Calling for Complex Instruction" src="https://github.com/lobehub/lobe-chat/assets/28616219/23e2d7e5-a6f3-4f4c-9c6a-5651f35a5910" />
<details>
<summary>Tools Calling Raw Output:</summary>
```yml
```
</details>
## Claude 3 Opus
### Simple Instruction Call: Weather Query
Test Instruction: Instruction ①
<Video src="https://github.com/lobehub/lobe-chat/assets/28616219/0e120fa2-8410-4552-a947-5ab7a91d994d" />
From the above video:
1. Claude 3 Opus outputs a `<thinking>` tag at the beginning of Tools Calling, which is not very helpful for users and consumes more tokens;
2. Opus triggers Tools Calling twice, indicating that it does not support Parallel Tools Calling;
3. The raw output of Tools Calling shows that Opus also supports Stream Tools Calling.
<Image alt="Claude 3 Opus Tools Calling for Simple Instruction" src="https://github.com/lobehub/lobe-chat/assets/28616219/fa2f89bc-b9d5-43e3-a15e-1e79174d002c" />
<details>
<summary>Tools Calling Raw Output:</summary>
</details>
### Complex Instruction Call: Literary Map
Test Instruction: Instruction ②
<Video src="https://github.com/lobehub/lobe-chat/assets/28616219/b2dc8cd9-2582-43fe-9121-29c20a1cdc7b" />
From the above video:
1. Combining with simple tasks, Opus will always output a `<thinking>` tag, which significantly impacts the user experience;
2. Opus outputs the prompts field as a string instead of an array, causing an error and preventing the plugin from being called correctly.
<Image alt="Claude 3 Opus Tools Calling for Complex Instruction" src="https://github.com/lobehub/lobe-chat/assets/28616219/1eee785d-932f-4320-845e-eed0bee4b1ae" />
<details>
<summary>Tools Calling Raw Output:</summary>
</details>
## Claude 3 Sonnet
### Simple Instruction Call: Weather Query
Test Instruction: Instruction ①
<Video src="https://github.com/lobehub/lobe-chat/assets/28616219/600becd5-7f12-4a9a-86c7-e5cca0db6b1b" />
From the above video, it can be seen that Claude 3 Sonnet triggers Tools Calling twice, indicating that it does not support Parallel Tools Calling.
<Image alt="Claude 3 Sonnet Tools Calling for Simple Instruction" src="https://github.com/lobehub/lobe-chat/assets/28616219/e82f5c69-7607-488f-8c10-0482fb380c6c" />
<details>
<summary>Tools Calling Raw Output:</summary>
</details>
### Complex Instruction Call: Literary Map
Test Instruction: Instruction ②
<Video src="https://github.com/lobehub/lobe-chat/assets/28616219/c150aa5f-36bc-40f2-a779-9c4fdcf2cd4c" />
From the above video, it can be seen that Sonnet 3 fails in the complex instruction call. The error is due to prompts being expected as an array but generated as a string.
<Image alt="Claude 3.5 Sonnet Tools Calling for Complex Instruction" src="https://github.com/lobehub/lobe-chat/assets/28616219/b7d84e26-920d-4a82-8798-1b1060ebb341" />
<details>
<summary>Tools Calling Raw Output:</summary>
</details>
## Claude 3 Haiku
<Video src="https://github.com/lobehub/lobe-chat/assets/28616219/02b3e872-735a-4928-8245-a90786acea8b" />
From the above video:
1. Claude 3 Haiku triggers Tools Calling twice, indicating that it also does not support Parallel Tools Calling;
2. Haiku does not provide a good response and directly calls the tool;
<Image alt="Claude 3 Haiku Tools Calling for Simple Instruction" src="https://github.com/lobehub/lobe-chat/assets/28616219/9081b586-cf43-440f-8ef8-1de5d8658694" />
### Complex Instruction Call: Literary Map
Test Instruction: Instruction ②
<Video src="https://github.com/lobehub/lobe-chat/assets/28616219/d1e3f804-0b89-4b90-9d78-69aee0db1c4d" />
From the above video, it can be seen that Haiku 3 also fails in the complex instruction call. The error is the same as prompts generating a string instead of an array.
<Image alt="Claude 3 Haiku Tools Calling for Complex Instruction" src="https://github.com/lobehub/lobe-chat/assets/28616219/cde80220-4615-43bb-934f-35fe0de88754" />
<details>
<summary>Tools Calling Raw Output:</summary>
</details>