---
title: Google Gemini Series Tools Calling Evaluation
description: >-
  Using LobeChat to test the tool calling (Function Calling) capability of the Google Gemini series models (Gemini 1.5 Pro / Gemini 1.5 Flash) and present the evaluation results
tags:
  - Tools Calling
  - Benchmark
  - Function Calling Evaluation
  - Tool Calling
  - Plugins
---

# Google Gemini Series Tools Calling

Overview of the Tools Calling capabilities of the Google Gemini series models:

| Model            | Tools Calling support | Streaming | Parallel | Simple instruction score | Complex instruction score |
| ---------------- | --------------------- | --------- | -------- | ------------------------ | ------------------------- |
| Gemini 1.5 Pro   | ✅                    | ❌        | ✅       | ⛔                       | ⛔                        |
| Gemini 1.5 Flash | ❌                    | ❌        | ❌       | ⛔                       | ⛔                        |

<Callout type={'important'}>
  Based on our own testing, we strongly recommend not enabling plugins for Gemini: as of 2024-07-07, its Tools Calling capability is simply too poor to rely on.
</Callout>

## Gemini 1.5 Pro

### Simple instruction: weather query

Test instruction: Instruction ①

<Video src="https://github.com/lobehub/lobe-chat/assets/28616219/a5a35431-2a15-4e79-97d5-502637f829bc" />

In the JSON that Gemini returns, the function `name` is wrong, so LobeChat cannot tell which plugin was called. (In the request, the weather plugin's name is `realtime-weather____fetchCurrentWeather`, whereas Gemini returned `weather____fetchCurrentWeather`.)

<Image alt="Tools Calling for the Gemini 1.5 Pro simple instruction" src="https://github.com/lobehub/lobe-chat/assets/28616219/1e077799-c25e-43c7-8492-c5c0bb9aed9b" />

<details>
<summary>Raw Tools Calling output:</summary>

```yml
[stream start] 2024-7-7 17:53:25.647
[chunk 0] 2024-7-7 17:53:25.654
{"candidates":[{"content":{"parts":[{"text":"好的"}],"role":"model"},"finishReason":"STOP","index":0}],"usageMetadata":{"promptTokenCount":95,"candidatesTokenCount":1,"totalTokenCount":96}}
[chunk 1] 2024-7-7 17:53:26.288
{"candidates":[{"content":{"parts":[{"text":"\n\n"}],"role":"model"},"finishReason":"STOP","index":0,"safetyRatings":[{"category":"HARM_CATEGORY_SEXUALLY_EXPLICIT","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_HATE_SPEECH","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_HARASSMENT","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_DANGEROUS_CONTENT","probability":"NEGLIGIBLE"}]}],"usageMetadata":{"promptTokenCount":95,"candidatesTokenCount":1,"totalTokenCount":96}}
[chunk 2] 2024-7-7 17:53:26.336
{"candidates":[{"content":{"parts":[{"functionCall":{"name":"weather____fetchCurrentWeather","args":{"city":"杭州"}}},{"functionCall":{"name":"weather____fetchCurrentWeather","args":{"city":"北京"}}}],"role":"model"},"finishReason":"STOP","index":0,"safetyRatings":[{"category":"HARM_CATEGORY_SEXUALLY_EXPLICIT","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_HATE_SPEECH","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_HARASSMENT","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_DANGEROUS_CONTENT","probability":"NEGLIGIBLE"}]}],"usageMetadata":{"promptTokenCount":95,"candidatesTokenCount":79,"totalTokenCount":174}}
[stream finished] total chunks: 3
```

</details>

### Complex instruction: text-to-image

Test instruction: Instruction ②

<Image alt="Tools Calling for the Gemini 1.5 Pro complex instruction" src="https://github.com/lobehub/lobe-chat/assets/28616219/a2454a60-3271-4786-861f-d49ceac1316e" />

When testing the complex instruction set, Google returned an error outright:

```json
{
  "message": "[400 Bad Request] Invalid JSON payload received. Unknown name \"maxItems\" at 'tools[0].function_declarations[0].parameters.properties[0].value': Cannot find field.\nInvalid JSON payload received. Unknown name \"minItems\" at 'tools[0].function_declarations[0].parameters.properties[0].value': Cannot find field.\nInvalid JSON payload received. Unknown name \"default\" at 'tools[0].function_declarations[0].parameters.properties[1].value': Cannot find field.\nInvalid JSON payload received. Unknown name \"default\" at 'tools[0].function_declarations[0].parameters.properties[3].value': Cannot find field.\nInvalid JSON payload received. Unknown name \"default\" at 'tools[0].function_declarations[0].parameters.properties[4].value': Cannot find field. [{\"@type\":\"type.googleapis.com/google.rpc.BadRequest\",\"fieldViolations\":[{\"field\":\"tools[0].function_declarations[0].parameters.properties[0].value\",\"description\":\"Invalid JSON payload received. Unknown name \\\"maxItems\\\" at 'tools[0].function_declarations[0].parameters.properties[0].value': Cannot find field.\"},{\"field\":\"tools[0].function_declarations[0].parameters.properties[0].value\",\"description\":\"Invalid JSON payload received. Unknown name \\\"minItems\\\" at 'tools[0].function_declarations[0].parameters.properties[0].value': Cannot find field.\"},{\"field\":\"tools[0].function_declarations[0].parameters.properties[1].value\",\"description\":\"Invalid JSON payload received. Unknown name \\\"default\\\" at 'tools[0].function_declarations[0].parameters.properties[1].value': Cannot find field.\"},{\"field\":\"tools[0].function_declarations[0].parameters.properties[3].value\",\"description\":\"Invalid JSON payload received. Unknown name \\\"default\\\" at 'tools[0].function_declarations[0].parameters.properties[3].value': Cannot find field.\"},{\"field\":\"tools[0].function_declarations[0].parameters.properties[4].value\",\"description\":\"Invalid JSON payload received. Unknown name \\\"default\\\" at 'tools[0].function_declarations[0].parameters.properties[4].value': Cannot find field.\"}]}]"
}
```

The error shows that the API rejects schemas containing `maxItems` (and likewise `minItems` and `default`), so Gemini 1.5 Pro effectively cannot use the DallE plugin. Related issues:

- [Support for minItems and maxItems for FunctionDeclarationSchemaType.ARRAY?](https://github.com/google-gemini/generative-ai-js/issues/200)
- [Gemini Models unusable when dalle plugin is enabled](https://github.com/lobehub/lobe-chat/issues/2537)

Taken together, the two tests suggest that Google's Tool Calling is nominally supported but barely usable in day-to-day work; in my personal view this amounts to false advertising.

## Gemini 1.5 Flash

### Simple instruction: weather query

Test instruction: Instruction ①

<Video src="https://github.com/lobehub/lobe-chat/assets/28616219/6cab77e8-d761-4a91-8325-a61748cebac1" />

Gemini 1.5 Flash behaves even more strangely: it announces that it will make the call and then simply stops. The raw output below shows that Gemini 1.5 Flash emitted no Tool Calling data at all, so it is effectively unusable.

```yml
[stream start] 2024-7-7 19:4:50.936
[chunk 0] 2024-7-7 19:4:50.943
{"candidates":[{"content":{"parts":[{"text":"好的"}],"role":"model"},"finishReason":"STOP","index":0}],"usageMetadata":{"promptTokenCount":96,"candidatesTokenCount":1,"totalTokenCount":97}}
[chunk 1] 2024-7-7 19:4:52.209
{"candidates":[{"content":{"parts":[{"text":"，请稍等，我正在查询杭州和北京的天气信息。 "}],"role":"model"},"finishReason":"STOP","index":0,"safetyRatings":[{"category":"HARM_CATEGORY_SEXUALLY_EXPLICIT","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_HATE_SPEECH","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_HARASSMENT","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_DANGEROUS_CONTENT","probability":"NEGLIGIBLE"}]}],"usageMetadata":{"promptTokenCount":96,"candidatesTokenCount":16,"totalTokenCount":112}}
[chunk 2] 2024-7-7 19:4:53.288
{"candidates":[{"content":{"parts":[{"text":"\n"}],"role":"model"},"finishReason":"STOP","index":0,"safetyRatings":[{"category":"HARM_CATEGORY_SEXUALLY_EXPLICIT","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_HATE_SPEECH","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_HARASSMENT","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_DANGEROUS_CONTENT","probability":"NEGLIGIBLE"}]}],"usageMetadata":{"promptTokenCount":96,"candidatesTokenCount":16,"totalTokenCount":112}}
[stream finished] total chunks: 3
```

### Complex instruction: text-to-image

Test instruction: Instruction ②

As with Gemini 1.5 Pro, this instruction fails immediately with the same error, so it is not covered in detail here.
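
Both models fail the complex instruction because the tool schema contains keywords the API rejects (`maxItems`, `minItems`, and `default`, according to the 400 error shown for Gemini 1.5 Pro). One possible workaround is to strip those keywords from the function declaration parameters before sending the request. The following is only a sketch under stated assumptions, not LobeChat's actual fix. The helper `stripUnsupportedKeywords`, the `UNSUPPORTED_KEYWORDS` list, and the sample schema are all hypothetical. The keyword list is taken solely from the error message, so the API may reject other keywords too.

```typescript
// Minimal JSON value type so the sketch stays self-contained.
type JsonValue = string | number | boolean | null | JsonValue[] | { [key: string]: JsonValue };
type JsonObject = { [key: string]: JsonValue };

// Assumption: only the keywords named in the 400 error are listed here.
const UNSUPPORTED_KEYWORDS = new Set<string>(['maxItems', 'minItems', 'default']);

function isJsonObject(value: JsonValue): value is JsonObject {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Hypothetical helper: returns a copy of the schema without the rejected keywords.
function stripUnsupportedKeywords(schema: JsonValue): JsonValue {
  if (Array.isArray(schema)) return schema.map((item) => stripUnsupportedKeywords(item));
  if (!isJsonObject(schema)) return schema;

  const result: JsonObject = {};
  for (const [key, value] of Object.entries(schema)) {
    if (UNSUPPORTED_KEYWORDS.has(key)) continue;

    if (key === 'properties' && isJsonObject(value)) {
      // Keys under `properties` are parameter names, not schema keywords,
      // so keep every name and only clean the nested schemas.
      const properties: JsonObject = {};
      for (const [name, subSchema] of Object.entries(value)) {
        properties[name] = stripUnsupportedKeywords(subSchema);
      }
      result[key] = properties;
    } else {
      result[key] = stripUnsupportedKeywords(value);
    }
  }
  return result;
}

// Invented sample resembling a text-to-image tool; not the real DallE plugin schema.
const parameters: JsonValue = {
  type: 'object',
  properties: {
    prompts: { type: 'array', items: { type: 'string' }, minItems: 1, maxItems: 10 },
    quality: { type: 'string', enum: ['standard', 'hd'], default: 'standard' },
  },
  required: ['prompts'],
};

const cleaned = stripUnsupportedKeywords(parameters);
console.log(JSON.stringify(cleaned, null, 2));
```

Stripping the keywords only avoids the request-level rejection. The model no longer sees the array bounds or default values, so the plugin would have to apply defaults and validate arguments itself. It also does nothing for the wrong function name or the missing tool call seen in the simple instruction tests.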