UNPKG

@lobehub/chat

Version:

Lobe Chat - an open-source, high-performance chatbot framework that supports speech synthesis, multimodal, and extensible Function Call plugin system. Supports one-click free deployment of your private ChatGPT/LLM web application.

110 lines (72 loc) 8.13 kB
--- title: Google Gemini 系列 Tools Calling 评测 description: >- 使用 LobeChat 测试 Google Gemini 系列模型(Gemini 1.5 Pro / Gemini 1.5 Flash)的工具调用(Function Calling)能力,并展现评测结果 tags: - Tools Calling - Benchmark - Function Calling 评测 - 工具调用 - 插件 --- # Google Gemini 系列 Tools Calling Google Gemini 系列模型 Tools Calling 能力一览: | 模型 | 支持 Tools Calling | 流式 (Stream) | 并发(Parallel) | 简单指令得分 | 复杂指令 | | ---------------- | ---------------- | ----------- | ------------ | ------ | ---- | | Gemini 1.5 Pro | | | | | | | Gemini 1.5 Flash | | | | | | <Callout type={'important'}> 根据我们的的实际测试,强烈建议不要给 Gemini 开启插件,因为目前(截止 2024.07.07)它的 Tools Calling 能力实在太烂了 </Callout> ## Gemini 1.5 Pro ### 简单调用指令:天气查询 测试指令:指令 ① <Video src="https://github.com/lobehub/lobe-chat/assets/28616219/a5a35431-2a15-4e79-97d5-502637f829bc" /> Gemini 输出的 json 中,name 是错误的,因此 LobeChat 无法识别到它调用了什么插件(入参中,天气插件的 name 为 `realtime-weather____fetchCurrentWeather`,而 Gemini 返回的是 `weather____fetchCurrentWeather`) <Image alt="Gemini 1.5 Pro 简单指令的 Tools Calling" src="https://github.com/lobehub/lobe-chat/assets/28616219/1e077799-c25e-43c7-8492-c5c0bb9aed9b" /> <details> <summary>Tools Calling 原始输出:</summary> ```yml [stream start] 2024-7-7 17:53:25.647 [chunk 0] 2024-7-7 17:53:25.654 {"candidates":[{"content":{"parts":[{"text":"好的"}],"role":"model"},"finishReason":"STOP","index":0}],"usageMetadata":{"promptTokenCount":95,"candidatesTokenCount":1,"totalTokenCount":96}} [chunk 1] 2024-7-7 17:53:26.288 {"candidates":[{"content":{"parts":[{"text":"\n\n"}],"role":"model"},"finishReason":"STOP","index":0,"safetyRatings":[{"category":"HARM_CATEGORY_SEXUALLY_EXPLICIT","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_HATE_SPEECH","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_HARASSMENT","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_DANGEROUS_CONTENT","probability":"NEGLIGIBLE"}]}],"usageMetadata":{"promptTokenCount":95,"candidatesTokenCount":1,"totalTokenCount":96}} [chunk 2] 2024-7-7 17:53:26.336 {"candidates":[{"content":{"parts":[{"functionCall":{"name":"weather____fetchCurrentWeather","args":{"city":"杭州"}}},{"functionCall":{"name":"weather____fetchCurrentWeather","args":{"city":"北京"}}}],"role":"model"},"finishReasoSTOP","index":0,"safetyRatings":[{"category":"HARM_CATEGORY_SEXUALLY_EXPLICIT","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_HATE_SPEECH","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_HARASSMENT","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_DANGEROUS_CONTENT","probability":"NEGLIGIBLE"}]}],"usageMetadata":{"promptTokenCount":95,"candidatesTokenCount":79,"totalTokenCount":174}} [stream finished] total chunks: 3 ``` </details> ### 复杂调用指令:文生图 测试指令:指令 ② <Image alt="Gemini 1.5 Pro 复杂指令的 Tools Calling" src="https://github.com/lobehub/lobe-chat/assets/28616219/a2454a60-3271-4786-861f-d49ceac1316e" /> 在测试复杂指令集时,Google 直接抛错: ```json { "message": "[400 Bad Request] Invalid JSON payload received. Unknown name \"maxItems\" at 'tools[0].function_declarations[0].parameters.properties[0].value': Cannot find field.\nInvalid JSON payload received. Unknown name \"minItems\" at 'tools[0].function_declarations[0].parameters.properties[0].value': Cannot find field.\nInvalid JSON payload received. Unknown name \"default\" at 'tools[0].function_declarations[0].parameters.properties[1].value': Cannot find field.\nInvalid JSON payload received. Unknown name \"default\" at 'tools[0].function_declarations[0].parameters.properties[3].value': Cannot find field.\nInvalid JSON payload received. Unknown name \"default\" at 'tools[0].function_declarations[0].parameters.properties[4].value': Cannot find field. [{\"@type\":\"type.googleapis.com/google.rpc.BadRequest\",\"fieldViolations\":[{\"field\":\"tools[0].function_declarations[0].parameters.properties[0].value\",\"description\":\"Invalid JSON payload received. Unknown name \\\"maxItems\\\" at 'tools[0].function_declarations[0].parameters.properties[0].value': Cannot find field.\"},{\"field\":\"tools[0].function_declarations[0].parameters.properties[0].value\",\"description\":\"Invalid JSON payload received. Unknown name \\\"minItems\\\" at 'tools[0].function_declarations[0].parameters.properties[0].value': Cannot find field.\"},{\"field\":\"tools[0].function_declarations[0].parameters.properties[1].value\",\"description\":\"Invalid JSON payload received. Unknown name \\\"default\\\" at 'tools[0].function_declarations[0].parameters.properties[1].value': Cannot find field.\"},{\"field\":\"tools[0].function_declarations[0].parameters.properties[3].value\",\"description\":\"Invalid JSON payload received. Unknown name \\\"default\\\" at 'tools[0].function_declarations[0].parameters.properties[3].value': Cannot find field.\"},{\"field\":\"tools[0].function_declarations[0].parameters.properties[4].value\",\"description\":\"Invalid JSON payload received. Unknown name \\\"default\\\" at 'tools[0].function_declarations[0].parameters.properties[4].value': Cannot find field.\"}]}]" } ``` 上述抛错中提到并不支持包含 `maxItems` 的 schema,因此 Gemini 1.5 Pro 相当于无法使用 DallE 插件 相关 issue: - [Support for minItems and maxItems for FunctionDeclarationSchemaType.ARRAY?](https://github.com/google-gemini/generative-ai-js/issues/200) - [Gemini Models unusable when dalle plugin is enabled](https://github.com/lobehub/lobe-chat/issues/2537) 综合以上两个测试来看,GoogleTool Calling 能力似乎是支持了,但是几乎没法在日常中使用,我个人认为已经等于虚假宣传了 ## Gemini 1.5 Flash ### 简单调用指令:天气查询 测试指令:指令 ① <Video src="https://github.com/lobehub/lobe-chat/assets/28616219/6cab77e8-d761-4a91-8325-a61748cebac1" />Gemini 1.5 flash 更为抽象,说完调用就结束了结合以下原始输出可以看到,Gemini 1.5 Flash 并没有输出 Tool Calling 的数据,因此可以说是完全不可用 ```yml stream start] 2024-7-7 19:4:50.936 [chunk 0] 2024-7-7 19:4:50.943 {"candidates":[{"content":{"parts":[{"text":"好的"}],"role":"model"},"finishReason":"STOP","index":0}],"usageMetadata":{"promptTokenCount":96,"candidatesTokenCount":1,"totalTokenCount":97}} [chunk 1] 2024-7-7 19:4:52.209 {"candidates":[{"content":{"parts":[{"text":",请稍等,我正在查询杭州和北京的天气信息。 "}],"role":"model"},"finishReason":"STOP","index":0,"safetyRatings":[{"category":"HARM_CATEGORY_SEXUALLY_EXPLICIT","probability":"NEGLIGIBLE"ATEGORY_HATE_SPEECH","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_HARASSMENT","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_DANGEROUS_CONTENT","probability":"NEGLIGIBLE"}]}],"usageMetadata":{"promptTokenCount":96,"candidatesTokenCount":16,"totalTokenCount":112}} [chunk 2] 2024-7-7 19:4:53.288 {"candidates":[{"content":{"parts":[{"text":"\n"}],"role":"model"},"finishReason":"STOP","index":0,"safetyRatings":[{"category":"HARM_CATEGORY_SEXUALLY_EXPLICIT","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_HATE_SPEECH","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_HARASSMENT","probability":"NEGLIGIBLE"},{"category":"HARM_CATEGORY_DANGEROUS_CONTENT","probability":"NEGLIGIBLE"}]}],"usageMetadata":{"promptTokenCount":96,"candidatesTokenCount":16,"totalTokenCount":112}} [stream finished] total chunks: 3 ``` ### 复杂调用指令:文生图 测试指令:指令 ② 该指令和 Gemini 1.5 Pro 的复杂指令一样,直接抛错,因此不再详细展开。