@visactor/vmind
"use strict";
Object.defineProperty(exports, "__esModule", {
value: !0
}), exports.getFieldInfoPrompt = exports.getBasePrompt = void 0;
const gptPrompt_1 = require("./gptPrompt");

// Rules for the dataTable part of the prompt, with flattened and unflattened examples.
const dataTableExplanation = '## Data Table\n1. Key of dataTable is fieldName in fieldInfo\n1. The value type of a \'numerical\', \'ratio\', or \'count\' field MUST be \'number\' or \'number[]\', and DO NOT perform any arithmetic operations..\n2. ALWAYS generate flatten data table rather than unflatten data table\n### Flatten Data Table Example\n```\ndataTable:[{date:"Monday",class:"class No.1",score:20},{date:"Monday",class:"class No.2",score:30},{date:"Tuesday",class:"class No.1",score:25},{date:"Tuesday",class:"class No.2",score:28}]\n```\n### Unflatten Data Table Example\n```\ndataTable:[{date:"Monday",classNo.1:20,classNo.2:30},{date:"Tuesday",classNo.1:25,classNo.2:28}]\n```';

// Few-shot examples (Chinese and English) appended to the end of the base prompt.
const baseExamples = '# Examples1\ntext:今年6月各大厂商发布了过去1个月的财报数据,其中阿里在V月份利润额达到了1000亿,经调整后的利润额为100亿,而字节跳动V月份的利润额为800亿,经调整后利润额为120亿。\n\nResponse:\n```\n{"fieldInfo":[{"fieldName":"公司","description":"公司名称","type":"string",},{"fieldName":"月份","description":"具体月份","type":"string",},{"fieldName":"利润调整","description":"是否经过利润调整","type":"string",},{"fieldName":"利润额","description":"利润总额","type":"numerical",}],"dataTable":[{"公司":"阿里","月份":"5月","利润调整":"调整前","利润额":100000000000,},{"公司":"阿里","月份":"5月","利润调整":"调整后","利润额":10000000000,},{"公司":"字节跳动","月份":"5月","利润调整":"调整前","利润额":80000000000,},{"公司":"字节跳动","月份":"5月","利润调整":"调整后","利润额":12000000000,},]}\n```\n# Examples2\ntext: John Smith was very tall, ranking in the 90th percentile for his age group. He knew Jane Doe. who ranking in the 75th percentile for her age group.\n\nResponse:\n```\n{"fieldInfo":[{"fieldName":"name","description":"The name of a person","type":"string",},{"fieldName":"ranking","description":"The ranking of height in age group","type":"ratio"}],"dataTable":[{"name":"John Smith","ranking":90,},{"name":"Jane Doe","ranking":75}]}\n```\n# Examples3\ntext: 现在有大约60%-70%的年轻人有入睡困难,而在两年前,入睡困难的年轻人占比才只有30%。\n\nResponse:\n```\n{"fieldInfo":[{"fieldName":"年份","description":"数据对应时间","type":"date",dateGranularity:"year"},{"fieldName":"入睡困难占比","description":"年轻人入睡困呐占总人数的比例","type":"ratio"}],"dataTable":[{"年份":"2024","占比":[0.6,0.7],},{"年份":"2022","占比":0.3}]}\n```\n';

// Language-specific context: the current year, plus a discount-notation note for Chinese.
const getCommonInfomation = language => `# Common Information\n${"chinese" === language ? `1. 今年是${(new Date).getFullYear()}年\n2. 8.5折和85折含义相同,都代表85%的折扣` : `1. This year is ${(new Date).getFullYear()}`}\n`;

// Response schema shown to the model; the thoughts field is included only when showThoughs is true.
const getResponse = showThoughs => `\nResponse in the following format:\n\`\`\`\n{\nfieldInfo: {\nfieldName: string;\ndescription: string;\ntype: 'date' | 'time' | 'string' | 'region' | 'numerical' | 'ratio' | 'count';\nratioGranularity?: '%' | '‰'; // generate when fieldType is 'ratio', represent the ratio granularity of ratio data\ndateGranularity?: 'year' | 'quarter' | 'month' | 'week' | 'day'; // generate when fieldType is 'date', represent the date granularity of date time\n}[],\ndataTable: Record<string,string|number|number[]>[];\n${showThoughs ? "thoughts: string, // your thought process" : ""}\n}\n\`\`\`\n`;

// Rules for fieldInfo; the ratio examples are localized by language.
const getFieldTypeExplanation = language => `## Field Information\n1. ALWAYS generate a field information, which represents the specific information of each column field in the data table.\n2. ALWAYS generate a field description\n3. ALWAYS generate a field type, chosen from 'date' | 'time' | 'string' | 'region' | 'numerical' | 'ratio' | 'count';'date' refers to data that can be specified down to the year, quarter, month, week, or day.'ratio' means ratio value or percentage(%), such as ${"english" === language ? "YoY or MoM" : "同比、环比、增长率、占比等"}.The forms of ratio data are usually Percentage (%) such as 60%.'count' means count data\n4. ALWAYS generate dateGranularity for 'date' type, represent the date granularity of date time`;

// Assembles the full data-extraction prompt from the pieces above.
const getBasePrompt = (language, showThoughs = !1) => `You are an expert extraction algorithm.You are an expert extraction algorithm, especially sensitive to data, date, category, data comparison and similar content.Your task is to extract high-quality data tables and field information from the text for further analysis, such as visualization charts, etc.\n# Response\n${getResponse(showThoughs)}\n${getFieldTypeExplanation(language)}\n${dataTableExplanation}\n${getCommonInfomation(language)}\n# Constraints:\n1. Answer language MUST: ${language}\n2. Strictly define the type of return format, use JSON format to reply, do not include any extra content.\n3. The extracted data strives for simplicity.\n4. Prefer flatten data table rather than unflatten data table.\n5. Numerical and Ratio data are unit-free, e.g., '10万' becomes '100000', '1k' becomes '1000'.\n6. Only extract value in ratio type, eg., '95%' becomes '95'; 'reduce 30%' becomes '-30'.\n7. Ensure the correctness of the value and type of each row in dataTable, return null for the value of unknown fields.\n8. The change in values should be reflected in the positive or negative nature of the data, not in the field names, such as ${"chinese" === language ? "'下降了50个点'的提取结果为'-50'" : "'Decrease by 50' becomes '-50'"}.\n# Steps\nYou should think step-by-step as follow:\n0. Answer language MUST: ${language}\n1. Check if the task involves data extraction. If not, set isDataExtraction to false in json mode; otherwise, proceed with steps.\n2. Read the entire text and generate the MOST IMPORTANT fields with numerical or ratio or count field type first.\n3. Re-read the text, generate concise and clear fields associated with the fields found in Step2.\n4. Extract all relevant data tables from the text based on field information. Each field's data should be concise and convey a single meaning.\n5. Format date data based on granularity, e.g., yyyy-mm-dd, mm-dd, mm, yyyy-mm, or yyyy-qq.\n6. When a date field has multiple date granularities, change the type of field to string.\n7. Extract interval/range data in the form of an array.\n8. Avoid any calculations or numerical conversions, like currency conversion.\n9. Check the data in the dataTable to ensure the correctness of the type.\n10. Recheck all data to ensure that no numerical or ratio data is missing.\n---\n${baseExamples}`;
exports.getBasePrompt = getBasePrompt;
const getFieldInfoPrompt = (language, showThoughs = !1, reGenerateFieldInfo = !1) => (0,
gptPrompt_1.getFieldInfoPrompt)(language, showThoughs, reGenerateFieldInfo);
exports.getFieldInfoPrompt = getFieldInfoPrompt;
//# sourceMappingURL=doubaoPrompt.js.map
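
The prompt above asks the model for a `fieldInfo` array and a `dataTable` whose keys are the `fieldName` values, with `number` or `number[]` values for `numerical`, `ratio`, and `count` fields. A caller may want to check a parsed response against those two rules before charting it. The following is a minimal sketch of such a check; `validateExtraction` and the sample objects are hypothetical and are not part of `@visactor/vmind`.

```javascript
"use strict";

// Field types that the prompt requires to hold number or number[] values.
const NUMERIC_TYPES = new Set(["numerical", "ratio", "count"]);

// Hypothetical helper (not part of @visactor/vmind): returns a list of
// human-readable problems found in an already-parsed extraction response.
function validateExtraction({ fieldInfo = [], dataTable = [] }) {
  const problems = [];
  const typeByName = new Map(fieldInfo.map(f => [f.fieldName, f.type]));

  dataTable.forEach((row, index) => {
    for (const [key, value] of Object.entries(row)) {
      // Rule: every dataTable key must be a fieldName declared in fieldInfo.
      if (!typeByName.has(key)) {
        problems.push(`row ${index}: key "${key}" is not a fieldName in fieldInfo`);
        continue;
      }
      // The prompt tells the model to return null for unknown values,
      // and non-numeric field types are not type-checked here.
      if (value === null || !NUMERIC_TYPES.has(typeByName.get(key))) continue;

      // Rule: numeric fields hold a number, or a number[] for ranges.
      const isNumber = typeof value === "number";
      const isNumberArray = Array.isArray(value) && value.every(v => typeof v === "number");
      if (!isNumber && !isNumberArray) {
        problems.push(`row ${index}: "${key}" must be number or number[]`);
      }
    }
  });
  return problems;
}

// Sample data modeled on the English example in the prompt.
const good = {
  fieldInfo: [
    { fieldName: "name", description: "The name of a person", type: "string" },
    { fieldName: "ranking", description: "The ranking of height in age group", type: "ratio" }
  ],
  dataTable: [
    { name: "John Smith", ranking: 90 },
    { name: "Jane Doe", ranking: 75 }
  ]
};

// Made-up data with one wrongly typed value and one undeclared key.
const bad = {
  fieldInfo: [{ fieldName: "ratio", description: "share", type: "ratio" }],
  dataTable: [{ ratio: [0.6, 0.7] }, { ratio: "30%" }, { share: 0.3 }]
};

console.log(validateExtraction(good)); // []
console.log(validateExtraction(bad));
```

Returning a list of problems rather than throwing lets the caller decide whether to retry the model call, repair the row, or drop it.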