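InternLM (书生·浦语) Large Model Hands-on Series: Table of Contents
Development history and features of the InternLM full-chain open-source ecosystem (lesson 1)
Deploying InternLM2-Chat-1.8B (lesson 2-1)
Deploying the Bajie (八戒) demo with InternLM2-Chat-1.8B (lesson 2-2)
Deploying the InternLM2-Chat-7B model (lesson 2-3)
Deploying the InternLM-XComposer2 (浦语·灵笔2) model (lesson 2-4)
Deploying the "HuixiangDou" (茴香豆) knowledge assistant on InternLM Studio (lesson 3)
Fine-tuning LLMs with XTuner: 1.8B, multimodal, and Agent (lesson 4)
Quantized deployment of LLMs & VLMs with LMDeploy (lesson 5)
Building agent applications with Lagent & AgentLego (lesson 6)
OpenCompass large model evaluation in practice (lesson 7)
OpenCompass large model evaluation assignment (lesson 7)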
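- InternLM (书生·浦语) Large Model Hands-on Series: Table of Contents
- I. Environment Setup
- 1.1 Create a dev machine and conda environment
- 1.2 Installation
- 1.3 Data preparation
- 1.4 Launch the evaluation (10% A100, 8GB resources)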
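Model evaluation runs in four stages: configuration -> inference -> evaluation -> visualization.
I. Environment Setup
1.1 Create a dev machine and conda environment
Select the Cuda11.7-conda image and choose 10% A100 as the GPU.
1.2 Installation
Installation for GPU environments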
studio-conda -o internlm-base -t opencompass
source activate opencompass
git clone -b 0.2.4 https://github.com/open-compass/opencompass
cd opencompass
pip install -e .
If pip install -e . does not succeed, run:
pip install -r requirements.txt
1.3 Data preparation
Extract the evaluation datasets into data/:
cp /share/temp/datasets/OpenCompassData-core-20231110.zip /root/opencompass/
unzip OpenCompassData-core-20231110.zip
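The copy-and-unzip step above can also be scripted. What follows is a minimal Python sketch, not part of OpenCompass: the helper name extract_dataset is hypothetical, and it assumes the archive unpacks into a top-level data/ folder, as the step description suggests.

```python
import zipfile
from pathlib import Path


def extract_dataset(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the sorted top-level entries."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest)
        # Collect the first path component of every archive member.
        top_level = sorted({name.split("/")[0] for name in zf.namelist()})
    return top_level


if __name__ == "__main__":
    # Paths taken from the commands above.
    zip_file = Path("/root/opencompass/OpenCompassData-core-20231110.zip")
    if zip_file.exists():
        print(extract_dataset(zip_file, "/root/opencompass"))
    else:
        print(f"{zip_file} not found; copy it from /share/temp/datasets first")
```

Printing the top-level entries gives a quick confirmation that the datasets landed where the evaluation expects them.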
List the supported datasets and models
List all configs related to InternLM and C-Eval:
python tools/list_configs.py internlm ceval
This raised an error:
(opencompass) root-studio-50092202:~/opencompass# python tools/list_configs.py internlm ceval
Traceback (most recent call last):
File "/root/opencompass/tools/list_configs.py", line 3, in <module>
import tabulate
ModuleNotFoundError: No module named 'tabulate'
Install the missing module:
(opencompass) root-studio-50092202:~/opencompass# pip install tabulate
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting tabulate
Using cached https://pypi.tuna.tsinghua.edu.cn/packages/40/44/4a5f08c96eb108af5cb50b41f76142f0afa346dfa99d5296fe7202a11854/tabulate-0.9.0-py3-none-any.whl (35 kB)
Installing collected packages: tabulate
Successfully installed tabulate-0.9.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
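A ModuleNotFoundError like the one above can be caught ahead of time with a quick import check. This is a small sketch using only the standard library; the helper name missing_modules and the shortlist of module names are illustrative assumptions, not the actual contents of requirements.txt.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Hypothetical shortlist; extend it with whatever the scripts import.
    print(missing_modules(["tabulate", "numpy", "torch"]))
```

An empty list means every module on the shortlist is importable; anything else names the packages still to install.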
With tabulate installed, run the list_configs.py script again:
python tools/list_configs.py internlm ceval
It still fails, so install the full set of requirements:
pip install -r requirements.txt
Run the list_configs.py script once more:
python tools/list_configs.py internlm ceval
This time it prints the following:
(opencompass) root-studio-50092202:~/opencompass# python tools/list_configs.py internlm ceval
+----------------------------------------+----------------------------------------------------------------------+
| Model | Config Path |
|----------------------------------------+----------------------------------------------------------------------|
| hf_internlm2_1_8b | configs/models/hf_internlm/hf_internlm2_1_8b.py |
| hf_internlm2_20b | configs/models/hf_internlm/hf_internlm2_20b.py |
| hf_internlm2_7b | configs/models/hf_internlm/hf_internlm2_7b.py |
| hf_internlm2_base_20b | configs/models/hf_internlm/hf_internlm2_base_20b.py |
| hf_internlm2_base_7b | configs/models/hf_internlm/hf_internlm2_base_7b.py |
| hf_internlm2_chat_1_8b | configs/models/hf_internlm/hf_internlm2_chat_1_8b.py |
| hf_internlm2_chat_1_8b_sft | configs/models/hf_internlm/hf_internlm2_chat_1_8b_sft.py |
| hf_internlm2_chat_20b | configs/models/hf_internlm/hf_internlm2_chat_20b.py |
| hf_internlm2_chat_20b_sft | configs/models/hf_internlm/hf_internlm2_chat_20b_sft.py |
| hf_internlm2_chat_20b_with_system | configs/models/hf_internlm/hf_internlm2_chat_20b_with_system.py |
| hf_internlm2_chat_7b | configs/models/hf_internlm/hf_internlm2_chat_7b.py |
| hf_internlm2_chat_7b_sft | configs/models/hf_internlm/hf_internlm2_chat_7b_sft.py |
| hf_internlm2_chat_7b_with_system | configs/models/hf_internlm/hf_internlm2_chat_7b_with_system.py |
| hf_internlm2_chat_math_20b | configs/models/hf_internlm/hf_internlm2_chat_math_20b.py |
| hf_internlm2_chat_math_20b_with_system | configs/models/hf_internlm/hf_internlm2_chat_math_20b_with_system.py |
| hf_internlm2_chat_math_7b | configs/models/hf_internlm/hf_internlm2_chat_math_7b.py |
| hf_internlm2_chat_math_7b_with_system | configs/models/hf_internlm/hf_internlm2_chat_math_7b_with_system.py |
| hf_internlm_20b | configs/models/hf_internlm/hf_internlm_20b.py |
| hf_internlm_7b | configs/models/hf_internlm/hf_internlm_7b.py |
| hf_internlm_chat_20b | configs/models/hf_internlm/hf_internlm_chat_20b.py |
| hf_internlm_chat_7b | configs/models/hf_internlm/hf_internlm_chat_7b.py |
| hf_internlm_chat_7b_8k | configs/models/hf_internlm/hf_internlm_chat_7b_8k.py |
| hf_internlm_chat_7b_v1_1 | configs/models/hf_internlm/hf_internlm_chat_7b_v1_1.py |
| internlm_7b | configs/models/internlm/internlm_7b.py |
| lmdeploy_internlm2_chat_20b | configs/models/hf_internlm/lmdeploy_internlm2_chat_20b.py |
| lmdeploy_internlm2_chat_7b | configs/models/hf_internlm/lmdeploy_internlm2_chat_7b.py |
| ms_internlm_chat_7b_8k | configs/models/ms_internlm/ms_internlm_chat_7b_8k.py |
+----------------------------------------+----------------------------------------------------------------------+
+--------------------------------+------------------------------------------------------------------+
| Dataset | Config Path |
|--------------------------------+------------------------------------------------------------------|
| ceval_clean_ppl | configs/datasets/ceval/ceval_clean_ppl.py |
| ceval_contamination_ppl_810ec6 | configs/datasets/contamination/ceval_contamination_ppl_810ec6.py |
| ceval_gen | configs/datasets/ceval/ceval_gen.py |
| ceval_gen_2daf24 | configs/datasets/ceval/ceval_gen_2daf24.py |
| ceval_gen_5f30c7 | configs/datasets/ceval/ceval_gen_5f30c7.py |
| ceval_internal_ppl_1cd8bf | configs/datasets/ceval/ceval_internal_ppl_1cd8bf.py |
| ceval_ppl | configs/datasets/ceval/ceval_ppl.py |
| ceval_ppl_1cd8bf | configs/datasets/ceval/ceval_ppl_1cd8bf.py |
| ceval_ppl_578f8d | configs/datasets/ceval/ceval_ppl_578f8d.py |
| ceval_ppl_93e5ce | configs/datasets/ceval/ceval_ppl_93e5ce.py |
| ceval_zero_shot_gen_bd40ef | configs/datasets/ceval/ceval_zero_shot_gen_bd40ef.py |
+--------------------------------+------------------------------------------------------------------+
(opencompass) root-studio-50092202:~/opencompass#
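1.4 Launch the evaluation (10% A100, 8GB resources)
Use the following command to evaluate the InternLM2-Chat-1.8B model on the C-Eval dataset. Because OpenCompass launches evaluation tasks in parallel by default, it is worth starting the first run in --debug mode to check for problems. In --debug mode, tasks run sequentially and their output is printed in real time.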
python run.py --datasets ceval_gen --hf-path /share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b --tokenizer-path /share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b --tokenizer-kwargs padding_side='left' truncation='left' trust_remote_code=True --model-kwargs trust_remote_code=True device_map='auto' --max-seq-len 1024 --max-out-len 16 --batch-size 2 --num-gpus 1 --debug
Command breakdown
--datasets ceval_gen
--hf-path /share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b  # HuggingFace model path
--tokenizer-path /share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b  # HuggingFace tokenizer path (can be omitted if it is the same as the model path)
--tokenizer-kwargs padding_side='left' truncation='left' trust_remote_code=True  # arguments for building the tokenizer
--model-kwargs device_map='auto' trust_remote_code=True  # arguments for building the model
--max-seq-len 1024  # maximum sequence length the model can accept
--max-out-len 16  # maximum number of tokens to generate
--batch-size 2  # batch size
--num-gpus 1  # number of GPUs needed to run the model
--debug
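When the same evaluation is launched repeatedly, it can be convenient to assemble the command in a script. The sketch below builds the argument list from the flags explained above; the function name build_eval_command is hypothetical. The quotes around values such as 'left' are dropped because the shell would strip them anyway.

```python
import shlex


def build_eval_command(model_path, dataset="ceval_gen", debug=True):
    """Build the run.py argument list using the flags from the command above."""
    cmd = [
        "python", "run.py",
        "--datasets", dataset,
        "--hf-path", model_path,
        "--tokenizer-path", model_path,
        "--tokenizer-kwargs",
        "padding_side=left", "truncation=left", "trust_remote_code=True",
        "--model-kwargs", "trust_remote_code=True", "device_map=auto",
        "--max-seq-len", "1024",
        "--max-out-len", "16",
        "--batch-size", "2",
        "--num-gpus", "1",
    ]
    if debug:
        # Sequential execution with live output, as recommended for a first run.
        cmd.append("--debug")
    return cmd


if __name__ == "__main__":
    model = "/share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b"
    print(shlex.join(build_eval_command(model)))
```

The returned list can be passed to subprocess.run with cwd set to the opencompass checkout, which avoids shell quoting issues altogether.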
This produced a large number of errors:
(opencompass) root-studio-50092202:~/opencompass# python run.py --datasets ceval_gen --hf-path /share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b --tokenizer-path /share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b --tokenizer-kwargs padding_side='left' truncation='left' trust_remote_code=True --model-kwargs trust_remote_code=True device_map='auto' --max-seq-len 1024 --max-out-len 16 --batch-size 2 --num-gpus 1 --debug
06/09 07:17:23 - OpenCompass - INFO - Loading ceval_gen: configs/datasets/ceval/ceval_gen.py
06/09 07:17:23 - OpenCompass - INFO - Loading example: configs/summarizers/example.py
06/09 07:17:24 - OpenCompass - WARNING - SlurmRunner is not used, so the partition argument is ignored.
06/09 07:17:27 - OpenCompass - INFO - Partitioned into 1 tasks.
Error: mkl-service + Intel(R) MKL: MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1 library.
Try to import numpy first or set the threading layer accordingly. Set MKL_SERVICE_FORCE_INTEL to force it.
06/09 07:17:31 - OpenCompass - INFO - Partitioned into 52 tasks.
06/09 07:17:33 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-computer_network]: No predictions found.
06/09 07:17:35 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-operating_system]: No predictions found.
06/09 07:17:37 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-computer_architecture]: No predictions found.
06/09 07:17:38 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_programming]: No predictions found.
06/09 07:17:40 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_physics]: No predictions found.
06/09 07:17:42 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_chemistry]: No predictions found.
06/09 07:17:44 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-advanced_mathematics]: No predictions found.
06/09 07:17:46 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-probability_and_statistics]: No predictions found.
06/09 07:17:48 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-discrete_mathematics]: No predictions found.
06/09 07:17:49 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-electrical_engineer]: No predictions found.
06/09 07:17:51 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-metrology_engineer]: No predictions found.
06/09 07:17:53 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_mathematics]: No predictions found.
06/09 07:17:55 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_physics]: No predictions found.
06/09 07:17:57 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_chemistry]: No predictions found.
06/09 07:17:59 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_biology]: No predictions found.
06/09 07:18:00 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_mathematics]: No predictions found.
06/09 07:18:02 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_biology]: No predictions found.
06/09 07:18:04 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_physics]: No predictions found.
06/09 07:18:06 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_chemistry]: No predictions found.
06/09 07:18:08 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-veterinary_medicine]: No predictions found.
06/09 07:18:10 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_economics]: No predictions found.
06/09 07:18:11 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-business_administration]: No predictions found.
06/09 07:18:13 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-marxism]: No predictions found.
06/09 07:18:15 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-mao_zedong_thought]: No predictions found.
06/09 07:18:17 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-education_science]: No predictions found.
06/09 07:18:19 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-teacher_qualification]: No predictions found.
06/09 07:18:21 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_politics]: No predictions found.
06/09 07:18:22 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_geography]: No predictions found.
06/09 07:18:24 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_politics]: No predictions found.
06/09 07:18:26 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_geography]: No predictions found.
06/09 07:18:28 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-modern_chinese_history]: No predictions found.
06/09 07:18:30 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-ideological_and_moral_cultivation]: No predictions found.
06/09 07:18:32 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-logic]: No predictions found.
06/09 07:18:33 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-law]: No predictions found.
06/09 07:18:35 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-chinese_language_and_literature]: No predictions found.
06/09 07:18:37 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-art_studies]: No predictions found.
06/09 07:18:39 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-professional_tour_guide]: No predictions found.
06/09 07:18:41 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-legal_professional]: No predictions found.
06/09 07:18:43 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_chinese]: No predictions found.
06/09 07:18:45 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_history]: No predictions found.
06/09 07:18:46 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_history]: No predictions found.
06/09 07:18:48 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-civil_servant]: No predictions found.
06/09 07:18:50 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-sports_science]: No predictions found.
06/09 07:18:52 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-plant_protection]: No predictions found.
06/09 07:18:54 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-basic_medicine]: No predictions found.
06/09 07:18:56 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-clinical_medicine]: No predictions found.
06/09 07:18:57 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-urban_and_rural_planner]: No predictions found.
06/09 07:18:59 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-accountant]: No predictions found.
06/09 07:19:01 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-fire_engineer]: No predictions found.
06/09 07:19:03 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-environmental_impact_assessment_engineer]: No predictions found.
06/09 07:19:05 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-tax_accountant]: No predictions found.
06/09 07:19:06 - OpenCompass - ERROR - /root/opencompass/opencompass/tasks/openicl_eval.py - _score - 238 - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-physician]: No predictions found.
dataset version metric mode opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b
---------------------------------------------- --------- -------- ------ ---------------------------------------------------------------------------------------
ceval-computer_network - - - -
ceval-operating_system - - - -
ceval-computer_architecture - - - -
ceval-college_programming - - - -
ceval-college_physics - - - -
ceval-college_chemistry - - - -
ceval-advanced_mathematics - - - -
ceval-probability_and_statistics - - - -
ceval-discrete_mathematics - - - -
ceval-electrical_engineer - - - -
ceval-metrology_engineer - - - -
ceval-high_school_mathematics - - - -
ceval-high_school_physics - - - -
ceval-high_school_chemistry - - - -
ceval-high_school_biology - - - -
ceval-middle_school_mathematics - - - -
ceval-middle_school_biology - - - -
ceval-middle_school_physics - - - -
ceval-middle_school_chemistry - - - -
ceval-veterinary_medicine - - - -
ceval-college_economics - - - -
ceval-business_administration - - - -
ceval-marxism - - - -
ceval-mao_zedong_thought - - - -
ceval-education_science - - - -
ceval-teacher_qualification - - - -
ceval-high_school_politics - - - -
ceval-high_school_geography - - - -
ceval-middle_school_politics - - - -
ceval-middle_school_geography - - - -
ceval-modern_chinese_history - - - -
ceval-ideological_and_moral_cultivation - - - -
ceval-logic - - - -
ceval-law - - - -
ceval-chinese_language_and_literature - - - -
ceval-art_studies - - - -
ceval-professional_tour_guide - - - -
ceval-legal_professional - - - -
ceval-high_school_chinese - - - -
ceval-high_school_history - - - -
ceval-middle_school_history - - - -
ceval-civil_servant - - - -
ceval-sports_science - - - -
ceval-plant_protection - - - -
ceval-basic_medicine - - - -
ceval-clinical_medicine - - - -
ceval-urban_and_rural_planner - - - -
ceval-accountant - - - -
ceval-fire_engineer - - - -
ceval-environmental_impact_assessment_engineer - - - -
ceval-tax_accountant - - - -
ceval-physician - - - -
06/09 07:19:07 - OpenCompass - INFO - write summary to /root/opencompass/outputs/default/20240609_071723/summary/summary_20240609_071723.txt
06/09 07:19:07 - OpenCompass - INFO - write csv to /root/opencompass/outputs/default/20240609_071723/summary/summary_20240609_071723.csv
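A run like this one still writes a summary file even though every score is "-", so it helps to check the CSV rather than trust the exit status. The sketch below lists datasets that have no score. It assumes the CSV mirrors the table printed above (dataset, version, metric, mode, then one column per model); that layout is an assumption drawn from the log, and the helper name datasets_without_scores is hypothetical.

```python
import csv
import io
from pathlib import Path


def datasets_without_scores(csv_text):
    """Return dataset names whose model score columns are all empty or '-'."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    missing = []
    for row in reader:
        if not row:
            continue
        # Assumed layout: the first four columns are metadata, the rest are scores.
        scores = [cell.strip() for cell in row[4:]]
        if all(score in ("", "-") for score in scores):
            missing.append(row[0])
    return missing


if __name__ == "__main__":
    # Summary path copied from the log above.
    summary = Path("/root/opencompass/outputs/default/20240609_071723/"
                   "summary/summary_20240609_071723.csv")
    if summary.exists():
        print(datasets_without_scores(summary.read_text(encoding="utf-8")))
```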
Try installing protobuf:
(opencompass) root-studio-50092202:~/opencompass# pip install protobuf
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting protobuf
Downloading https://pypi.tuna.tsinghua.edu.cn/packages/3d/50/85a4f42e14eebfc9091e8e38e77ba4874d52fae08984c97725ba3265a1e1/protobuf-5.27.1-cp38-abi3-manylinux2014_x86_64.whl (309 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 309.2/309.2 kB 4.0 MB/s eta 0:00:00
Installing collected packages: protobuf
Successfully installed protobuf-5.27.1
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Launching again still fails, so set this environment variable:
export MKL_SERVICE_FORCE_INTEL=1
Launching once more, the errors appear to be gone:
(opencompass) root-studio-50092202:~/opencompass# python run.py --datasets ceval_gen --hf-path /share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b --tokenizer-path /share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b --tokenizer-kwargs padding_side='left' truncation='left' trust_remote_code=True --model-kwargs trust_remote_code=True device_map='auto' --max-seq-len 1024 --max-out-len 16 --batch-size 2 --num-gpus 1 --debug
06/09 07:35:11 - OpenCompass - INFO - Loading ceval_gen: configs/datasets/ceval/ceval_gen.py
06/09 07:35:11 - OpenCompass - INFO - Loading example: configs/summarizers/example.py
06/09 07:35:11 - OpenCompass - WARNING - SlurmRunner is not used, so the partition argument is ignored.
06/09 07:35:11 - OpenCompass - INFO - Partitioned into 1 tasks.
Error: mkl-service + Intel(R) MKL: MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1 library.
Try to import numpy first or set the threading layer accordingly. Set MKL_SERVICE_FORCE_INTEL to force it.
06/09 07:35:24 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_economics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-accountant,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-tax_accountant,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-physician,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-civil_servant,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-urban_and_rural_planner,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-teacher_qualification,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_programming,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-electrical_engineer,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-business_administration,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-art_studies,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-fire_engineer,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-environmental_impact_assessment_engineer,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-education_science,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-professional_tour_guide,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_chemistry,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-metrology_engineer,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_in
ternlm2-chat-1_8b/ceval-mao_zedong_thought,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-law,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-veterinary_medicine,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-modern_chinese_history,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-chinese_language_and_literature,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-legal_professional,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-logic,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_history,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-plant_protection,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-clinical_medicine,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-computer_architecture,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_biology,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_politics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_chemistry,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_history,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-computer_network,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-operating_system,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_physics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_inter
nlm2-chat-1_8b/ceval-advanced_mathematics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_physics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_chemistry,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_biology,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_mathematics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_physics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-marxism,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_politics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_geography,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-ideological_and_moral_cultivation,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_chinese,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-sports_science,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-basic_medicine,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-probability_and_statistics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_mathematics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-discrete_mathematics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_geography]
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:38<00:00, 19.36s/it]
...
This message still looks like a problem:
Error: mkl-service + Intel(R) MKL: MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1 library.
Try to import numpy first or set the threading layer accordingly. Set MKL_SERVICE_FORCE_INTEL to force it.
If you hit the error mkl-service + Intel(R) MKL: MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1 …, the fix is:
export MKL_SERVICE_FORCE_INTEL=1
# or: export MKL_THREADING_LAYER=GNU
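The same setting can be applied from Python when the evaluation is launched through a wrapper script, so it does not depend on the current shell session. This is a minimal sketch; the helper name mkl_safe_env is hypothetical, and it only sets the two variables mentioned above.

```python
import os


def mkl_safe_env(base_env=None, use_gnu_layer=False):
    """Return a copy of the environment with one of the two MKL workarounds set."""
    env = dict(os.environ if base_env is None else base_env)
    if use_gnu_layer:
        # Alternative workaround: switch MKL to the GNU threading layer.
        env["MKL_THREADING_LAYER"] = "GNU"
    else:
        # Workaround used in this walkthrough.
        env["MKL_SERVICE_FORCE_INTEL"] = "1"
    return env


if __name__ == "__main__":
    env = mkl_safe_env()
    print(env["MKL_SERVICE_FORCE_INTEL"])
```

Passing the result as the env argument of subprocess.run means child processes started by run.py inherit the variable as well.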
(opencompass) root-studio-50092202:~/opencompass# export MKL_SERVICE_FORCE_INTEL=1
(opencompass) root-studio-50092202:~/opencompass# python run.py --datasets ceval_gen --hf-path /share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b --tokenizer-path /share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b --tokenizer-kwargs padding_side='left' truncation='left' trust_remote_code=True --model-kwargs trust_remote_code=True device_map='auto' --max-seq-len 1024 --max-out-len 16 --batch-size 2 --num-gpus 1 --debug
06/09 07:35:11 - OpenCompass - INFO - Loading ceval_gen: configs/datasets/ceval/ceval_gen.py
06/09 07:35:11 - OpenCompass - INFO - Loading example: configs/summarizers/example.py
06/09 07:35:11 - OpenCompass - WARNING - SlurmRunner is not used, so the partition argument is ignored.
06/09 07:35:11 - OpenCompass - INFO - Partitioned into 1 tasks.
Error: mkl-service + Intel(R) MKL: MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1 library.
Try to import numpy first or set the threading layer accordingly. Set MKL_SERVICE_FORCE_INTEL to force it.
06/09 07:35:24 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_economics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-accountant,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-tax_accountant,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-physician,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-civil_servant,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-urban_and_rural_planner,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-teacher_qualification,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_programming,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-electrical_engineer,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-business_administration,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-art_studies,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-fire_engineer,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-environmental_impact_assessment_engineer,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-education_science,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-professional_tour_guide,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_chemistry,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-metrology_engineer,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_in
ternlm2-chat-1_8b/ceval-mao_zedong_thought,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-law,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-veterinary_medicine,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-modern_chinese_history,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-chinese_language_and_literature,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-legal_professional,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-logic,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_history,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-plant_protection,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-clinical_medicine,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-computer_architecture,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_biology,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_politics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_chemistry,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_history,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-computer_network,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-operating_system,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_physics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_inter
nlm2-chat-1_8b/ceval-advanced_mathematics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_physics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_chemistry,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_biology,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_mathematics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_physics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-marxism,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_politics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_geography,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-ideological_and_moral_cultivation,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_chinese,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-sports_science,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-basic_medicine,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-probability_and_statistics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_mathematics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-discrete_mathematics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_geography]
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:38<00:00, 19.36s/it]
06/09 07:36:56 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_economics]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 55/55 [00:00<00:00, 1460042.53it/s]
[2024-06-09 07:36:56,575] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 28/28 [01:00<00:00, 2.15s/it]
06/09 07:37:56 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-accountant]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 49/49 [00:00<00:00, 1684597.51it/s]
[2024-06-09 07:37:56,864] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [01:01<00:00, 2.48s/it]
06/09 07:38:58 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-tax_accountant]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 49/49 [00:00<00:00, 1533738.03it/s]
[2024-06-09 07:38:58,918] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:55<00:00, 2.22s/it]
06/09 07:39:54 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-physician]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 49/49 [00:00<00:00, 1657426.58it/s]
[2024-06-09 07:39:54,521] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:37<00:00, 1.51s/it]
06/09 07:40:32 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-civil_servant]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 47/47 [00:00<00:00, 1564541.97it/s]
[2024-06-09 07:40:32,398] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 24/24 [01:06<00:00, 2.79s/it]
06/09 07:41:39 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-urban_and_rural_planner]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 46/46 [00:00<00:00, 1429170.25it/s]
[2024-06-09 07:41:39,474] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 23/23 [00:41<00:00, 1.80s/it]
06/09 07:42:21 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-teacher_qualification]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 44/44 [00:00<00:00, 1453144.69it/s]
[2024-06-09 07:42:21,081] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 22/22 [00:36<00:00, 1.65s/it]
06/09 07:42:57 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_programming]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 37/37 [00:00<00:00, 1326403.83it/s]
[2024-06-09 07:42:57,517] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:39<00:00, 2.09s/it]
06/09 07:43:37 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-electrical_engineer]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 37/37 [00:00<00:00, 1337838.34it/s]
[2024-06-09 07:43:37,399] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:36<00:00, 1.94s/it]
06/09 07:44:14 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-business_administration]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 33/33 [00:00<00:00, 1235821.71it/s]
[2024-06-09 07:44:14,280] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 17/17 [00:30<00:00, 1.82s/it]
06/09 07:44:45 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-art_studies]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 33/33 [00:00<00:00, 302870.97it/s]
[2024-06-09 07:44:45,339] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 17/17 [00:25<00:00, 1.49s/it]
06/09 07:45:10 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-fire_engineer]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 31/31 [00:00<00:00, 1160923.43it/s]
[2024-06-09 07:45:10,721] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 16/16 [00:35<00:00, 2.19s/it]
06/09 07:45:45 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-environmental_impact_assessment_engineer]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 31/31 [00:00<00:00, 1150649.77it/s]
[2024-06-09 07:45:45,836] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 16/16 [00:30<00:00, 1.91s/it]
06/09 07:46:16 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-education_science]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 29/29 [00:00<00:00, 273336.67it/s]
[2024-06-09 07:46:16,516] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 15/15 [00:23<00:00, 1.59s/it]
06/09 07:46:40 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-professional_tour_guide]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 29/29 [00:00<00:00, 1136773.98it/s]
[2024-06-09 07:46:40,406] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 15/15 [00:24<00:00, 1.64s/it]
06/09 07:47:04 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_chemistry]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 24/24 [00:00<00:00, 949653.74it/s]
[2024-06-09 07:47:05,015] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:25<00:00, 2.13s/it]
06/09 07:47:30 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-metrology_engineer]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 24/24 [00:00<00:00, 949653.74it/s]
[2024-06-09 07:47:30,681] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:25<00:00, 2.16s/it]
06/09 07:47:56 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-mao_zedong_thought]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 24/24 [00:00<00:00, 932067.56it/s]
[2024-06-09 07:47:56,665] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:23<00:00, 1.98s/it]
06/09 07:48:20 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-law]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 24/24 [00:00<00:00, 996666.30it/s]
[2024-06-09 07:48:20,468] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:29<00:00, 2.46s/it]
06/09 07:48:49 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-veterinary_medicine]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 23/23 [00:00<00:00, 955138.53it/s]
[2024-06-09 07:48:50,029] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:19<00:00, 1.66s/it]
06/09 07:49:09 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-modern_chinese_history]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 23/23 [00:00<00:00, 910084.83it/s]
[2024-06-09 07:49:09,981] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:24<00:00, 2.05s/it]
06/09 07:49:34 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-chinese_language_and_literature]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 23/23 [00:00<00:00, 910084.83it/s]
[2024-06-09 07:49:34,721] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:19<00:00, 1.62s/it]
06/09 07:49:54 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-legal_professional]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 23/23 [00:00<00:00, 910084.83it/s]
[2024-06-09 07:49:54,224] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:37<00:00, 3.11s/it]
06/09 07:50:31 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-logic]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 22/22 [00:00<00:00, 913610.77it/s]
[2024-06-09 07:50:31,735] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 11/11 [00:34<00:00, 3.12s/it]
06/09 07:51:06 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_history]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 22/22 [00:00<00:00, 887256.62it/s]
[2024-06-09 07:51:06,088] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 11/11 [00:15<00:00, 1.42s/it]
06/09 07:51:21 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-plant_protection]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 22/22 [00:00<00:00, 904653.80it/s]
[2024-06-09 07:51:21,801] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 11/11 [00:15<00:00, 1.40s/it]
06/09 07:51:37 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-clinical_medicine]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 22/22 [00:00<00:00, 971312.51it/s]
[2024-06-09 07:51:37,281] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 11/11 [00:20<00:00, 1.90s/it]
06/09 07:51:58 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-computer_architecture]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:00<00:00, 863533.18it/s]
[2024-06-09 07:51:58,276] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 11/11 [00:23<00:00, 2.14s/it]
06/09 07:52:21 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_biology]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:00<00:00, 898779.43it/s]
[2024-06-09 07:52:21,905] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 11/11 [00:17<00:00, 1.62s/it]
06/09 07:52:39 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_politics]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 21/21 [00:00<00:00, 855149.36it/s]
[2024-06-09 07:52:39,795] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 11/11 [00:26<00:00, 2.42s/it]
06/09 07:53:06 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_chemistry]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:00<00:00, 855980.41it/s]
[2024-06-09 07:53:06,519] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:21<00:00, 2.16s/it]
06/09 07:53:28 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_history]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:00<00:00, 855980.41it/s]
[2024-06-09 07:53:28,157] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:19<00:00, 1.94s/it]
06/09 07:53:47 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-computer_network]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 796917.76it/s]
[2024-06-09 07:53:47,609] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:15<00:00, 1.51s/it]
06/09 07:54:02 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-operating_system]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 804967.43it/s]
[2024-06-09 07:54:02,804] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:13<00:00, 1.31s/it]
06/09 07:54:15 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_physics]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 198732.61it/s]
[2024-06-09 07:54:15,940] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:25<00:00, 2.53s/it]
06/09 07:54:41 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-advanced_mathematics]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 744782.95it/s]
[2024-06-09 07:54:41,363] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:31<00:00, 3.11s/it]
06/09 07:55:12 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_physics]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 198732.61it/s]
[2024-06-09 07:55:12,508] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:16<00:00, 1.69s/it]
06/09 07:55:29 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_chemistry]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 565189.90it/s]
[2024-06-09 07:55:29,494] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:18<00:00, 1.86s/it]
06/09 07:55:48 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_biology]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 711533.71it/s]
[2024-06-09 07:55:48,150] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:18<00:00, 1.80s/it]
06/09 07:56:06 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_mathematics]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 866214.96it/s]
[2024-06-09 07:56:06,263] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:24<00:00, 2.46s/it]
06/09 07:56:30 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_physics]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 773706.56it/s]
[2024-06-09 07:56:30,951] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:18<00:00, 1.83s/it]
06/09 07:56:49 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-marxism]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 766267.08it/s]
[2024-06-09 07:56:49,334] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:14<00:00, 1.47s/it]
06/09 07:57:04 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_politics]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 758969.30it/s]
[2024-06-09 07:57:04,081] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:30<00:00, 3.01s/it]
06/09 07:57:34 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_geography]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 781291.92it/s]
[2024-06-09 07:57:34,239] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:13<00:00, 1.38s/it]
06/09 07:57:48 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-ideological_and_moral_cultivation]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 813181.39it/s]
[2024-06-09 07:57:48,118] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:12<00:00, 1.27s/it]
06/09 07:58:00 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_chinese]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 821564.70it/s]
[2024-06-09 07:58:00,899] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:32<00:00, 3.24s/it]
06/09 07:58:33 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-sports_science]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 201751.33it/s]
[2024-06-09 07:58:33,373] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:13<00:00, 1.31s/it]
06/09 07:58:46 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-basic_medicine]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 796917.76it/s]
[2024-06-09 07:58:46,584] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:12<00:00, 1.26s/it]
06/09 07:58:59 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-probability_and_statistics]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 18/18 [00:00<00:00, 794710.23it/s]
[2024-06-09 07:58:59,282] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:29<00:00, 3.32s/it]
06/09 07:59:29 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_mathematics]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 18/18 [00:00<00:00, 747499.72it/s]
[2024-06-09 07:59:29,271] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:26<00:00, 2.99s/it]
06/09 07:59:56 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-discrete_mathematics]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 16/16 [00:00<00:00, 677867.31it/s]
[2024-06-09 07:59:56,234] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:14<00:00, 1.80s/it]
06/09 08:00:10 - OpenCompass - INFO - Start inferencing [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_geography]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:00<00:00, 131414.22it/s]
[2024-06-09 08:00:10,676] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:08<00:00, 1.48s/it]
06/09 08:00:19 - OpenCompass - INFO - time elapsed: 1494.75s
06/09 08:00:22 - OpenCompass - INFO - Partitioned into 52 tasks.
06/09 08:00:24 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-computer_network]: {'accuracy': 47.368421052631575}
06/09 08:00:26 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-operating_system]: {'accuracy': 47.368421052631575}
06/09 08:00:28 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-computer_architecture]: {'accuracy': 23.809523809523807}
06/09 08:00:30 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_programming]: {'accuracy': 13.513513513513514}
06/09 08:00:32 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_physics]: {'accuracy': 42.10526315789473}
06/09 08:00:34 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_chemistry]: {'accuracy': 33.33333333333333}
06/09 08:00:36 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-advanced_mathematics]: {'accuracy': 10.526315789473683}
06/09 08:00:37 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-probability_and_statistics]: {'accuracy': 38.88888888888889}
06/09 08:00:39 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-discrete_mathematics]: {'accuracy': 25.0}
06/09 08:00:41 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-electrical_engineer]: {'accuracy': 27.027027027027028}
06/09 08:00:43 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-metrology_engineer]: {'accuracy': 54.166666666666664}
06/09 08:00:45 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_mathematics]: {'accuracy': 16.666666666666664}
06/09 08:00:47 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_physics]: {'accuracy': 42.10526315789473}
06/09 08:00:49 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_chemistry]: {'accuracy': 47.368421052631575}
06/09 08:00:50 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_biology]: {'accuracy': 26.31578947368421}
06/09 08:00:52 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_mathematics]: {'accuracy': 36.84210526315789}
06/09 08:00:54 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_biology]: {'accuracy': 80.95238095238095}
06/09 08:00:56 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_physics]: {'accuracy': 47.368421052631575}
06/09 08:00:58 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_chemistry]: {'accuracy': 80.0}
06/09 08:01:00 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-veterinary_medicine]: {'accuracy': 43.47826086956522}
06/09 08:01:02 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_economics]: {'accuracy': 32.72727272727273}
06/09 08:01:03 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-business_administration]: {'accuracy': 36.36363636363637}
06/09 08:01:05 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-marxism]: {'accuracy': 68.42105263157895}
06/09 08:01:07 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-mao_zedong_thought]: {'accuracy': 70.83333333333334}
06/09 08:01:09 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-education_science]: {'accuracy': 55.172413793103445}
06/09 08:01:11 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-teacher_qualification]: {'accuracy': 59.09090909090909}
06/09 08:01:13 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_politics]: {'accuracy': 57.89473684210527}
06/09 08:01:14 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_geography]: {'accuracy': 47.368421052631575}
06/09 08:01:16 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_politics]: {'accuracy': 71.42857142857143}
06/09 08:01:18 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_geography]: {'accuracy': 75.0}
06/09 08:01:20 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-modern_chinese_history]: {'accuracy': 52.17391304347826}
06/09 08:01:22 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-ideological_and_moral_cultivation]: {'accuracy': 73.68421052631578}
06/09 08:01:24 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-logic]: {'accuracy': 27.27272727272727}
06/09 08:01:26 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-law]: {'accuracy': 29.166666666666668}
06/09 08:01:28 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-chinese_language_and_literature]: {'accuracy': 47.82608695652174}
06/09 08:01:30 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-art_studies]: {'accuracy': 42.42424242424242}
06/09 08:01:32 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-professional_tour_guide]: {'accuracy': 51.724137931034484}
06/09 08:01:33 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-legal_professional]: {'accuracy': 34.78260869565217}
06/09 08:01:35 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_chinese]: {'accuracy': 42.10526315789473}
06/09 08:01:37 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_history]: {'accuracy': 65.0}
06/09 08:01:39 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_history]: {'accuracy': 86.36363636363636}
06/09 08:01:41 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-civil_servant]: {'accuracy': 42.5531914893617}
06/09 08:01:43 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-sports_science]: {'accuracy': 52.63157894736842}
06/09 08:01:45 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-plant_protection]: {'accuracy': 40.909090909090914}
06/09 08:01:47 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-basic_medicine]: {'accuracy': 68.42105263157895}
06/09 08:01:48 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-clinical_medicine]: {'accuracy': 31.818181818181817}
06/09 08:01:50 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-urban_and_rural_planner]: {'accuracy': 47.82608695652174}
06/09 08:01:52 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-accountant]: {'accuracy': 36.734693877551024}
06/09 08:01:54 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-fire_engineer]: {'accuracy': 38.70967741935484}
06/09 08:01:56 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-environmental_impact_assessment_engineer]: {'accuracy': 51.61290322580645}
06/09 08:01:58 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-tax_accountant]: {'accuracy': 36.734693877551024}
06/09 08:02:00 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-physician]: {'accuracy': 42.857142857142854}
dataset version metric mode opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b
---------------------------------------------- --------- ------------- ------ ---------------------------------------------------------------------------------------
ceval-computer_network db9ce2 accuracy gen 47.37
ceval-operating_system 1c2571 accuracy gen 47.37
ceval-computer_architecture a74dad accuracy gen 23.81
ceval-college_programming 4ca32a accuracy gen 13.51
ceval-college_physics 963fa8 accuracy gen 42.11
ceval-college_chemistry e78857 accuracy gen 33.33
ceval-advanced_mathematics ce03e2 accuracy gen 10.53
ceval-probability_and_statistics 65e812 accuracy gen 38.89
ceval-discrete_mathematics e894ae accuracy gen 25
ceval-electrical_engineer ae42b9 accuracy gen 27.03
ceval-metrology_engineer ee34ea accuracy gen 54.17
ceval-high_school_mathematics 1dc5bf accuracy gen 16.67
ceval-high_school_physics adf25f accuracy gen 42.11
ceval-high_school_chemistry 2ed27f accuracy gen 47.37
ceval-high_school_biology 8e2b9a accuracy gen 26.32
ceval-middle_school_mathematics bee8d5 accuracy gen 36.84
ceval-middle_school_biology 86817c accuracy gen 80.95
ceval-middle_school_physics 8accf6 accuracy gen 47.37
ceval-middle_school_chemistry 167a15 accuracy gen 80
ceval-veterinary_medicine b4e08d accuracy gen 43.48
ceval-college_economics f3f4e6 accuracy gen 32.73
ceval-business_administration c1614e accuracy gen 36.36
ceval-marxism cf874c accuracy gen 68.42
ceval-mao_zedong_thought 51c7a4 accuracy gen 70.83
ceval-education_science 591fee accuracy gen 55.17
ceval-teacher_qualification 4e4ced accuracy gen 59.09
ceval-high_school_politics 5c0de2 accuracy gen 57.89
ceval-high_school_geography 865461 accuracy gen 47.37
ceval-middle_school_politics 5be3e7 accuracy gen 71.43
ceval-middle_school_geography 8a63be accuracy gen 75
ceval-modern_chinese_history fc01af accuracy gen 52.17
ceval-ideological_and_moral_cultivation a2aa4a accuracy gen 73.68
ceval-logic f5b022 accuracy gen 27.27
ceval-law a110a1 accuracy gen 29.17
ceval-chinese_language_and_literature 0f8b68 accuracy gen 47.83
ceval-art_studies 2a1300 accuracy gen 42.42
ceval-professional_tour_guide 4e673e accuracy gen 51.72
ceval-legal_professional ce8787 accuracy gen 34.78
ceval-high_school_chinese 315705 accuracy gen 42.11
ceval-high_school_history 7eb30a accuracy gen 65
ceval-middle_school_history 48ab4a accuracy gen 86.36
ceval-civil_servant 87d061 accuracy gen 42.55
ceval-sports_science 70f27b accuracy gen 52.63
ceval-plant_protection 8941f9 accuracy gen 40.91
ceval-basic_medicine c409d6 accuracy gen 68.42
ceval-clinical_medicine 49e82d accuracy gen 31.82
ceval-urban_and_rural_planner 95b885 accuracy gen 47.83
ceval-accountant 002837 accuracy gen 36.73
ceval-fire_engineer bc23f5 accuracy gen 38.71
ceval-environmental_impact_assessment_engineer c64e2d accuracy gen 51.61
ceval-tax_accountant 3a5e3c accuracy gen 36.73
ceval-physician 6e277d accuracy gen 42.86
ceval-stem - naive_average gen 39.21
ceval-social-science - naive_average gen 57.43
ceval-humanities - naive_average gen 50.23
ceval-other - naive_average gen 44.62
ceval-hard - naive_average gen 32
ceval - naive_average gen 46.19
06/09 08:02:00 - OpenCompass - INFO - write summary to /root/opencompass/outputs/default/20240609_073511/summary/summary_20240609_073511.txt
06/09 08:02:00 - OpenCompass - INFO - write csv to /root/opencompass/outputs/default/20240609_073511/summary/summary_20240609_073511.csv
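The `naive_average` rows in the summary are plain unweighted means of the per-subject accuracies. The sketch below recomputes the `ceval-hard` figure from rows copied out of the table above and rewritten as CSV. Two things here are assumptions rather than facts from the log: the short `score` column header (the real header is the full model name), and the membership of the hard subset, which is taken from the C-Eval benchmark's eight "hard" subjects and happens to reproduce the reported 32.

```python
import csv
import io

# Rows copied from the summary table above; "score" is a shortened,
# hypothetical header standing in for the long model-name column.
SUMMARY_CSV = """\
dataset,version,metric,mode,score
ceval-college_physics,963fa8,accuracy,gen,42.11
ceval-college_chemistry,e78857,accuracy,gen,33.33
ceval-advanced_mathematics,ce03e2,accuracy,gen,10.53
ceval-probability_and_statistics,65e812,accuracy,gen,38.89
ceval-discrete_mathematics,e894ae,accuracy,gen,25.00
ceval-high_school_mathematics,1dc5bf,accuracy,gen,16.67
ceval-high_school_physics,adf25f,accuracy,gen,42.11
ceval-high_school_chemistry,2ed27f,accuracy,gen,47.37
"""


def load_scores(csv_text):
    """Parse CSV text into a {dataset_name: accuracy} mapping."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["dataset"]: float(row["score"]) for row in reader}


def naive_average(scores, subjects):
    """Unweighted mean of the accuracies of the given subjects."""
    return sum(scores[name] for name in subjects) / len(subjects)


scores = load_scores(SUMMARY_CSV)
# Assumption: ceval-hard is exactly these eight subjects.
hard_avg = naive_average(scores, list(scores))
print(f"ceval-hard: {hard_avg:.2f}")
```

Because the mean is unweighted, a subject with 16 questions counts as much as one with 49, which is worth keeping in mind when comparing the category scores.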
To resolve the error
... is incompatible with libgomp.so.1 library. Try to import numpy first or set the threading layer accordingly. Set MKL_SERVICE_FORCE_INTEL to force it.
set the MKL threading layer to GNU before running the evaluation:
export MKL_THREADING_LAYER=GNU
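If the evaluation is launched from a Python script rather than an interactive shell, the same variable can be passed through the child process's environment. This is only a sketch of that idea: `build_env` and `run_with_gnu_threading` are hypothetical helper names, not part of OpenCompass, and the demo launches a trivial child process instead of `run.py`.

```python
import os
import subprocess
import sys


def build_env(base=None):
    """Return a copy of the environment with MKL_THREADING_LAYER set to GNU."""
    env = dict(os.environ if base is None else base)
    env["MKL_THREADING_LAYER"] = "GNU"
    return env


def run_with_gnu_threading(cmd):
    """Run cmd (a list of arguments) with the GNU threading layer selected."""
    return subprocess.run(cmd, env=build_env(), check=False)


if __name__ == "__main__":
    # Demo: a child Python process sees the variable set by build_env().
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.environ['MKL_THREADING_LAYER'])"],
        env=build_env(),
        capture_output=True,
        text=True,
        check=True,
    )
    print(result.stdout.strip())
```

The variable has to be in place before the child process imports numpy or torch, which is why it is set on the environment handed to `subprocess.run` rather than inside the evaluated code.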
After that, the error no longer appeared:
(opencompass) root-studio-50092202:~/opencompass# export MKL_THREADING_LAYER=GNU
(opencompass) root-studio-50092202:~/opencompass# python run.py --datasets ceval_gen --hf-path /share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b --tokenizer-path /share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b --tokenizer-kwargs padding_side='left' truncation='left' trust_remote_code=True --model-kwargs trust_remote_code=True device_map='auto' --max-seq-len 1024 --max-out-len 16 --batch-size 2 --num-gpus 1 --debug
06/09 09:20:40 - OpenCompass - INFO - Loading ceval_gen: configs/datasets/ceval/ceval_gen.py
06/09 09:20:40 - OpenCompass - INFO - Loading example: configs/summarizers/example.py
06/09 09:20:40 - OpenCompass - WARNING - SlurmRunner is not used, so the partition argument is ignored.
06/09 09:20:40 - OpenCompass - INFO - Partitioned into 1 tasks.
06/09 09:20:51 - OpenCompass - INFO - Task [opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_economics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-accountant,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-tax_accountant,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-physician,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-civil_servant,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-urban_and_rural_planner,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-teacher_qualification,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_programming,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-electrical_engineer,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-business_administration,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-art_studies,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-fire_engineer,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-environmental_impact_assessment_engineer,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-education_science,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-professional_tour_guide,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_chemistry,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-metrology_engineer,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-mao_zedong_thought,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-law,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-veterinary_medicine,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-modern_chinese_history,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-chinese_language_and_literature,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-legal_professional,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-logic,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_history,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-plant_protection,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-clinical_medicine,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-computer_architecture,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_biology,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_politics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_chemistry,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_history,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-computer_network,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-operating_system,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-college_physics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-advanced_mathematics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_physics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_chemistry,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_biology,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_mathematics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_physics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-marxism,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_politics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_geography,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-ideological_and_moral_cultivation,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_chinese,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-sports_science,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-basic_medicine,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-probability_and_statistics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-high_school_mathematics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-discrete_mathematics,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b/ceval-middle_school_geography]
20240609_092040
tabulate format
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
dataset version metric mode opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b
---------------------------------------------- --------- ------------- ------ ---------------------------------------------------------------------------------------
ceval-computer_network db9ce2 accuracy gen 47.37
ceval-operating_system 1c2571 accuracy gen 47.37
ceval-computer_architecture a74dad accuracy gen 23.81
ceval-college_programming 4ca32a accuracy gen 13.51
ceval-college_physics 963fa8 accuracy gen 42.11
ceval-college_chemistry e78857 accuracy gen 33.33
ceval-advanced_mathematics ce03e2 accuracy gen 10.53
ceval-probability_and_statistics 65e812 accuracy gen 38.89
ceval-discrete_mathematics e894ae accuracy gen 25
ceval-electrical_engineer ae42b9 accuracy gen 27.03
ceval-metrology_engineer ee34ea accuracy gen 54.17
ceval-high_school_mathematics 1dc5bf accuracy gen 16.67
ceval-high_school_physics adf25f accuracy gen 42.11
ceval-high_school_chemistry 2ed27f accuracy gen 47.37
ceval-high_school_biology 8e2b9a accuracy gen 26.32
ceval-middle_school_mathematics bee8d5 accuracy gen 36.84
ceval-middle_school_biology 86817c accuracy gen 80.95
ceval-middle_school_physics 8accf6 accuracy gen 47.37
ceval-middle_school_chemistry 167a15 accuracy gen 80
ceval-veterinary_medicine b4e08d accuracy gen 43.48
ceval-college_economics f3f4e6 accuracy gen 32.73
ceval-business_administration c1614e accuracy gen 36.36
ceval-marxism cf874c accuracy gen 68.42
ceval-mao_zedong_thought 51c7a4 accuracy gen 70.83
ceval-education_science 591fee accuracy gen 55.17
ceval-teacher_qualification 4e4ced accuracy gen 59.09
ceval-high_school_politics 5c0de2 accuracy gen 57.89
ceval-high_school_geography 865461 accuracy gen 47.37
ceval-middle_school_politics 5be3e7 accuracy gen 71.43
ceval-middle_school_geography 8a63be accuracy gen 75
ceval-modern_chinese_history fc01af accuracy gen 52.17
ceval-ideological_and_moral_cultivation a2aa4a accuracy gen 73.68
ceval-logic f5b022 accuracy gen 27.27
ceval-law a110a1 accuracy gen 29.17
ceval-chinese_language_and_literature 0f8b68 accuracy gen 47.83
ceval-art_studies 2a1300 accuracy gen 42.42
ceval-professional_tour_guide 4e673e accuracy gen 51.72
ceval-legal_professional ce8787 accuracy gen 34.78
ceval-high_school_chinese 315705 accuracy gen 42.11
ceval-high_school_history 7eb30a accuracy gen 65
ceval-middle_school_history 48ab4a accuracy gen 86.36
ceval-civil_servant 87d061 accuracy gen 42.55
ceval-sports_science 70f27b accuracy gen 52.63
ceval-plant_protection 8941f9 accuracy gen 40.91
ceval-basic_medicine c409d6 accuracy gen 68.42
ceval-clinical_medicine 49e82d accuracy gen 31.82
ceval-urban_and_rural_planner 95b885 accuracy gen 47.83
ceval-accountant 002837 accuracy gen 36.73
ceval-fire_engineer bc23f5 accuracy gen 38.71
ceval-environmental_impact_assessment_engineer c64e2d accuracy gen 51.61
ceval-tax_accountant 3a5e3c accuracy gen 36.73
ceval-physician 6e277d accuracy gen 42.86
ceval-stem - naive_average gen 39.21
ceval-social-science - naive_average gen 57.43
ceval-humanities - naive_average gen 50.23
ceval-other - naive_average gen 44.62
ceval-hard - naive_average gen 32
ceval - naive_average gen 46.19
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
-------------------------------------------------------------------------------------------------------------------------------- THIS IS A DIVIDER --------------------------------------------------------------------------------------------------------------------------------
csv format
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
dataset,version,metric,mode,opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b
ceval-computer_network,db9ce2,accuracy,gen,47.37
ceval-operating_system,1c2571,accuracy,gen,47.37
ceval-computer_architecture,a74dad,accuracy,gen,23.81
ceval-college_programming,4ca32a,accuracy,gen,13.51
ceval-college_physics,963fa8,accuracy,gen,42.11
ceval-college_chemistry,e78857,accuracy,gen,33.33
ceval-advanced_mathematics,ce03e2,accuracy,gen,10.53
ceval-probability_and_statistics,65e812,accuracy,gen,38.89
ceval-discrete_mathematics,e894ae,accuracy,gen,25.00
ceval-electrical_engineer,ae42b9,accuracy,gen,27.03
ceval-metrology_engineer,ee34ea,accuracy,gen,54.17
ceval-high_school_mathematics,1dc5bf,accuracy,gen,16.67
ceval-high_school_physics,adf25f,accuracy,gen,42.11
ceval-high_school_chemistry,2ed27f,accuracy,gen,47.37
ceval-high_school_biology,8e2b9a,accuracy,gen,26.32
ceval-middle_school_mathematics,bee8d5,accuracy,gen,36.84
ceval-middle_school_biology,86817c,accuracy,gen,80.95
ceval-middle_school_physics,8accf6,accuracy,gen,47.37
ceval-middle_school_chemistry,167a15,accuracy,gen,80.00
ceval-veterinary_medicine,b4e08d,accuracy,gen,43.48
ceval-college_economics,f3f4e6,accuracy,gen,32.73
ceval-business_administration,c1614e,accuracy,gen,36.36
ceval-marxism,cf874c,accuracy,gen,68.42
ceval-mao_zedong_thought,51c7a4,accuracy,gen,70.83
ceval-education_science,591fee,accuracy,gen,55.17
ceval-teacher_qualification,4e4ced,accuracy,gen,59.09
ceval-high_school_politics,5c0de2,accuracy,gen,57.89
ceval-high_school_geography,865461,accuracy,gen,47.37
ceval-middle_school_politics,5be3e7,accuracy,gen,71.43
ceval-middle_school_geography,8a63be,accuracy,gen,75.00
ceval-modern_chinese_history,fc01af,accuracy,gen,52.17
ceval-ideological_and_moral_cultivation,a2aa4a,accuracy,gen,73.68
ceval-logic,f5b022,accuracy,gen,27.27
ceval-law,a110a1,accuracy,gen,29.17
ceval-chinese_language_and_literature,0f8b68,accuracy,gen,47.83
ceval-art_studies,2a1300,accuracy,gen,42.42
ceval-professional_tour_guide,4e673e,accuracy,gen,51.72
ceval-legal_professional,ce8787,accuracy,gen,34.78
ceval-high_school_chinese,315705,accuracy,gen,42.11
ceval-high_school_history,7eb30a,accuracy,gen,65.00
ceval-middle_school_history,48ab4a,accuracy,gen,86.36
ceval-civil_servant,87d061,accuracy,gen,42.55
ceval-sports_science,70f27b,accuracy,gen,52.63
ceval-plant_protection,8941f9,accuracy,gen,40.91
ceval-basic_medicine,c409d6,accuracy,gen,68.42
ceval-clinical_medicine,49e82d,accuracy,gen,31.82
ceval-urban_and_rural_planner,95b885,accuracy,gen,47.83
ceval-accountant,002837,accuracy,gen,36.73
ceval-fire_engineer,bc23f5,accuracy,gen,38.71
ceval-environmental_impact_assessment_engineer,c64e2d,accuracy,gen,51.61
ceval-tax_accountant,3a5e3c,accuracy,gen,36.73
ceval-physician,6e277d,accuracy,gen,42.86
ceval-stem,-,naive_average,gen,39.21
ceval-social-science,-,naive_average,gen,57.43
ceval-humanities,-,naive_average,gen,50.23
ceval-other,-,naive_average,gen,44.62
ceval-hard,-,naive_average,gen,32.00
ceval,-,naive_average,gen,46.19
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
-------------------------------------------------------------------------------------------------------------------------------- THIS IS A DIVIDER --------------------------------------------------------------------------------------------------------------------------------
raw format
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-------------------------------
Model: opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b
ceval-computer_network: {'accuracy': 47.368421052631575}
ceval-operating_system: {'accuracy': 47.368421052631575}
ceval-computer_architecture: {'accuracy': 23.809523809523807}
ceval-college_programming: {'accuracy': 13.513513513513514}
ceval-college_physics: {'accuracy': 42.10526315789473}
ceval-college_chemistry: {'accuracy': 33.33333333333333}
ceval-advanced_mathematics: {'accuracy': 10.526315789473683}
ceval-probability_and_statistics: {'accuracy': 38.88888888888889}
ceval-discrete_mathematics: {'accuracy': 25.0}
ceval-electrical_engineer: {'accuracy': 27.027027027027028}
ceval-metrology_engineer: {'accuracy': 54.166666666666664}
ceval-high_school_mathematics: {'accuracy': 16.666666666666664}
ceval-high_school_physics: {'accuracy': 42.10526315789473}
ceval-high_school_chemistry: {'accuracy': 47.368421052631575}
ceval-high_school_biology: {'accuracy': 26.31578947368421}
ceval-middle_school_mathematics: {'accuracy': 36.84210526315789}
ceval-middle_school_biology: {'accuracy': 80.95238095238095}
ceval-middle_school_physics: {'accuracy': 47.368421052631575}
ceval-middle_school_chemistry: {'accuracy': 80.0}
ceval-veterinary_medicine: {'accuracy': 43.47826086956522}
ceval-college_economics: {'accuracy': 32.72727272727273}
ceval-business_administration: {'accuracy': 36.36363636363637}
ceval-marxism: {'accuracy': 68.42105263157895}
ceval-mao_zedong_thought: {'accuracy': 70.83333333333334}
ceval-education_science: {'accuracy': 55.172413793103445}
ceval-teacher_qualification: {'accuracy': 59.09090909090909}
ceval-high_school_politics: {'accuracy': 57.89473684210527}
ceval-high_school_geography: {'accuracy': 47.368421052631575}
ceval-middle_school_politics: {'accuracy': 71.42857142857143}
ceval-middle_school_geography: {'accuracy': 75.0}
ceval-modern_chinese_history: {'accuracy': 52.17391304347826}
ceval-ideological_and_moral_cultivation: {'accuracy': 73.68421052631578}
ceval-logic: {'accuracy': 27.27272727272727}
ceval-law: {'accuracy': 29.166666666666668}
ceval-chinese_language_and_literature: {'accuracy': 47.82608695652174}
ceval-art_studies: {'accuracy': 42.42424242424242}
ceval-professional_tour_guide: {'accuracy': 51.724137931034484}
ceval-legal_professional: {'accuracy': 34.78260869565217}
ceval-high_school_chinese: {'accuracy': 42.10526315789473}
ceval-high_school_history: {'accuracy': 65.0}
ceval-middle_school_history: {'accuracy': 86.36363636363636}
ceval-civil_servant: {'accuracy': 42.5531914893617}
ceval-sports_science: {'accuracy': 52.63157894736842}
ceval-plant_protection: {'accuracy': 40.909090909090914}
ceval-basic_medicine: {'accuracy': 68.42105263157895}
ceval-clinical_medicine: {'accuracy': 31.818181818181817}
ceval-urban_and_rural_planner: {'accuracy': 47.82608695652174}
ceval-accountant: {'accuracy': 36.734693877551024}
ceval-fire_engineer: {'accuracy': 38.70967741935484}
ceval-environmental_impact_assessment_engineer: {'accuracy': 51.61290322580645}
ceval-tax_accountant: {'accuracy': 36.734693877551024}
ceval-physician: {'accuracy': 42.857142857142854}
ceval-stem: {'naive_average': 39.210234139009884}
ceval-social-science: {'naive_average': 57.43003472631422}
ceval-humanities: {'naive_average': 50.22940845801545}
ceval-other: {'naive_average': 44.618935819046335}
ceval-hard: {'naive_average': 31.99926900584795}
ceval: {'naive_average': 46.18916955944266}
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
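The `naive_average` rows in the summary above appear to be plain unweighted means of the per-subject accuracies. The sketch below is one way to check that against the raw-format block: it parses the `name: {...}` lines and recomputes `ceval-hard`. The helper name `parse_raw_summary` and the `CEVAL_HARD` list are illustrative assumptions, not part of OpenCompass. In particular, the eight-subject membership of `ceval-hard` is assumed here rather than taken from the log; the values themselves are copied from the output above.

```python
import ast
from statistics import fmean

# Assumed membership of the "ceval-hard" group (not shown in the log itself).
CEVAL_HARD = [
    "advanced_mathematics", "discrete_mathematics",
    "probability_and_statistics", "college_chemistry",
    "college_physics", "high_school_mathematics",
    "high_school_chemistry", "high_school_physics",
]

# Excerpt of the raw-format block, copied from the summary output.
RAW = """\
Model: opencompass.models.huggingface.HuggingFace_Shanghai_AI_Laboratory_internlm2-chat-1_8b
ceval-college_physics: {'accuracy': 42.10526315789473}
ceval-college_chemistry: {'accuracy': 33.33333333333333}
ceval-advanced_mathematics: {'accuracy': 10.526315789473683}
ceval-probability_and_statistics: {'accuracy': 38.88888888888889}
ceval-discrete_mathematics: {'accuracy': 25.0}
ceval-high_school_mathematics: {'accuracy': 16.666666666666664}
ceval-high_school_physics: {'accuracy': 42.10526315789473}
ceval-high_school_chemistry: {'accuracy': 47.368421052631575}
ceval-hard: {'naive_average': 31.99926900584795}
"""


def parse_raw_summary(text):
    """Hypothetical helper: map each 'name: {metric: value}' line to its value."""
    scores = {}
    for line in text.splitlines():
        name, sep, payload = line.partition(": ")
        # Skip dividers and the 'Model: ...' header, which carry no dict payload.
        if not sep or not payload.startswith("{"):
            continue
        metrics = ast.literal_eval(payload)  # safe parsing of the dict literal
        scores[name.strip()] = next(iter(metrics.values()))
    return scores


scores = parse_raw_summary(RAW)
hard_mean = fmean(scores[f"ceval-{subject}"] for subject in CEVAL_HARD)
print(f"recomputed ceval-hard: {hard_mean:.2f}")
print(f"reported   ceval-hard: {scores['ceval-hard']:.2f}")
```

Under the assumed grouping, the recomputed mean agrees with the reported `ceval-hard` value of 32.00. The same check can be run on the other groups by pasting in the full raw-format block and swapping the subject list.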