目录
- 前情提要
- torch_npu框架
- mindsport框架
- mindnlp框架
- 下载模型
- 国外
- 国内
- 环境设置
- 代码适配(非流式)
- Main
- Branch
- 结果展示
- 代码适配(流式)
前情提要
torch_npu框架
官方未适配
mindsport框架
官方未适配
mindnlp框架
官方适配了,但是速度非常非常慢,10秒一个字
下载模型
国外
Hugging Face
国内
modelscope
环境设置
pip install transformers==4.39.2
pip3 install torch==2.1.0
pip3 install torch-npu==2.1.0.post4
pip3 install accelerate==0.24.1
pip3 install transformers-stream-generator==0.0.5
代码适配(非流式)
Main
import torch
import torch_npu
import os
import platform
torch_device = "npu:1" # 0~7
torch.npu.set_device(torch.device(torch_device))
torch.npu.set_compile_mode(jit_compile=False)
option = {}
option["NPU_FUZZY_COMPILE_BLACKLIST"] = "Tril"
torch.npu.set_option(option)
from transformers import AutoModelForCausalLM, AutoTokenizer
# device = "cuda" # the device to load the model onto
DEFAULT_CKPT_PATH = '/root/.cache/modelscope/hub/qwen/Qwen2-7B-Instruct'
model = AutoModelForCausalLM.from_pretrained(
DEFAULT_CKPT_PATH,
torch_dtype=torch.float16,
device_map=torch_device
).npu().eval()
tokenizer = AutoTokenizer.from_pretrained(DEFAULT_CKPT_PATH)
while True:
prompt = input("user:")
if prompt == "exit":
break
messages = [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
messages,
tokenize=False,
add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(torch_device)
generated_ids = model.generate(
model_inputs.input_ids,
max_new_tokens=512
)
generated_ids = [
output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print("Qwen2-7B-Instruct:",response)
Branch
找到自己虚拟环境
which python
我的是/root/anaconda3/envs/sakura/bin/python
找到/lib/python3.9/site-packages/transformers/generation/utils.py
示例:
/root/anaconda3/envs/sakura/lib/python3.9/site-packages/transformers/generation/utils.py
找到第2708行,注释掉2708行~2712行
在2709行添加
next_token_scores = outputs.logits[:, -1, :]
示例:
出错就是在这里,如果进行了pre-process distribution,就会报错
/root/anaconda3/envs/sakura/lib/python3.9/site-packages/transformers/generation/logits_process.py:455: UserWarning: AutoNonVariableTypeMode is deprecated and will be removed in 1.10 release. For kernel implementations please use AutoDispatchBelowADInplaceOrView instead, If you are looking for a user facing API to enable running your inference-only workload, please use c10::InferenceMode. Using AutoDispatchBelowADInplaceOrView in user code is under risk of producing silent wrong result in some edge cases. See Note [AutoDispatchBelowAutograd] for more details. (Triggered internally at build/CMakeFiles/torch_npu.dir/compiler_depend.ts:74.)
sorted_indices_to_remove[..., -self.min_tokens_to_keep :] = 0
Traceback (most recent call last):
File "/root/Qwen_test.py", line 63, in <module>
generated_ids = model.generate(
File "/root/anaconda3/envs/sakura/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/root/anaconda3/envs/sakura/lib/python3.9/site-packages/transformers/generation/utils.py", line 1576, in generate
result = self._sample(
File "/root/anaconda3/envs/sakura/lib/python3.9/site-packages/transformers/generation/utils.py", line 2736, in _sample
next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
RuntimeError: Sync:build/CMakeFiles/torch_npu.dir/compiler_depend.ts:158 NPU error, error code is 507018
[Error]: The aicpu execution is abnormal.
Rectify the fault based on the error information in the ascend log.
E39999: Inner Error!
E39999: 2024-07-02-14:14:50.735.070 An exception occurred during AICPU execution, stream_id:23, task_id:2750, errcode:21008, msg:inner error[FUNC:ProcessAicpuErrorInfo][FILE:device_error_proc.cc][LINE:730]
TraceBack (most recent call last):
rtStreamSynchronizeWithTimeout execute failed, reason=[aicpu exception][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:53]
synchronize stream failed, runtime result = 507018[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]
DEVICE[1] PID[864803]:
EXCEPTION TASK:
Exception info:TGID=864803, model id=65535, stream id=23, stream phase=SCHEDULE, task id=2750, task type=aicpu kernel, recently received task id=2750, recently send task id=2749, task phase=RUN
Message info[0]:aicpu=0,slot_id=0,report_mailbox_flag=0x5a5a5a5a,state=0x5210
Other info[0]:time=2024-07-02-14:14:50.091.974, function=proc_aicpu_task_done, line=970, error code=0x2a
[W compiler_depend.ts:368] Warning: NPU warning, error code is 507018[Error]:
[Error]: The aicpu execution is abnormal.
Rectify the fault based on the error information in the ascend log.
EZ9999: Inner Error!
EZ9999: 2024-07-02-14:14:50.743.702 Kernel task happen error, retCode=0x2a, [aicpu exception].[FUNC:PreCheckTaskErr][FILE:task_info.cc][LINE:1776]
TraceBack (most recent call last):
Aicpu kernel execute failed, device_id=1, stream_id=23, task_id=2750, errorCode=2a.[FUNC:PrintAicpuErrorInfo][FILE:task_info.cc][LINE:1579]
Aicpu kernel execute failed, device_id=1, stream_id=23, task_id=2750, fault op_name=[FUNC:GetError][FILE:stream.cc][LINE:1512]
rtDeviceSynchronize execute failed, reason=[aicpu exception][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:53]
wait for compute device to finish failed, runtime result = 507018.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]
(function npuSynchronizeDevice)
[W compiler_depend.ts:368] Warning: NPU warning, error code is 507018[Error]:
[Error]: The aicpu execution is abnormal.
Rectify the fault based on the error information in the ascend log.
EH9999: Inner Error!
rtDeviceSynchronize execute failed, reason=[aicpu exception][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:53]
EH9999: 2024-07-02-14:14:50.745.695 wait for compute device to finish failed, runtime result = 507018.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]
TraceBack (most recent call last):
(function npuSynchronizeDevice)
[W compiler_depend.ts:368] Warning: NPU warning, error code is 507018[Error]:
[Error]: The aicpu execution is abnormal.
Rectify the fault based on the error information in the ascend log.
EH9999: Inner Error!
rtDeviceSynchronize execute failed, reason=[aicpu exception][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:53]
EH9999: 2024-07-02-14:14:50.747.300 wait for compute device to finish failed, runtime result = 507018.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]
TraceBack (most recent call last):
(function npuSynchronizeDevice)
[W compiler_depend.ts:368] Warning: NPU warning, error code is 507018[Error]:
[Error]: The aicpu execution is abnormal.
Rectify the fault based on the error information in the ascend log.
EH9999: Inner Error!
rtDeviceSynchronize execute failed, reason=[aicpu exception][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:53]
EH9999: 2024-07-02-14:14:50.814.377 wait for compute device to finish failed, runtime result = 507018.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]
TraceBack (most recent call last):
(function npuSynchronizeDevice)
[W compiler_depend.ts:368] Warning: NPU warning, error code is 507018[Error]:
[Error]: The aicpu execution is abnormal.
Rectify the fault based on the error information in the ascend log.
EH9999: Inner Error!
rtDeviceSynchronize execute failed, reason=[aicpu exception][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:53]
EH9999: 2024-07-02-14:14:50.816.023 wait for compute device to finish failed, runtime result = 507018.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]
TraceBack (most recent call last):
(function npuSynchronizeDevice)
[W compiler_depend.ts:368] Warning: NPU warning, error code is 507018[Error]:
[Error]: The aicpu execution is abnormal.
Rectify the fault based on the error information in the ascend log.
EH9999: Inner Error!
rtDeviceSynchronize execute failed, reason=[aicpu exception][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:53]
EH9999: 2024-07-02-14:14:50.817.628 wait for compute device to finish failed, runtime result = 507018.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]
TraceBack (most recent call last):
(function npuSynchronizeDevice)
[W compiler_depend.ts:368] Warning: NPU warning, error code is 507018[Error]:
[Error]: The aicpu execution is abnormal.
Rectify the fault based on the error information in the ascend log.
EH9999: Inner Error!
rtDeviceSynchronize execute failed, reason=[aicpu exception][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:53]
EH9999: 2024-07-02-14:14:50.819.236 wait for compute device to finish failed, runtime result = 507018.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]
TraceBack (most recent call last):
(function npuSynchronizeDevice)
[W compiler_depend.ts:368] Warning: NPU warning, error code is 507018[Error]:
[Error]: The aicpu execution is abnormal.
Rectify the fault based on the error information in the ascend log.
EH9999: Inner Error!
rtDeviceSynchronize execute failed, reason=[aicpu exception][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:53]
EH9999: 2024-07-02-14:14:50.820.843 wait for compute device to finish failed, runtime result = 507018.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]
TraceBack (most recent call last):
(function npuSynchronizeDevice)
[W compiler_depend.ts:368] Warning: NPU warning, error code is 507018[Error]:
[Error]: The aicpu execution is abnormal.
Rectify the fault based on the error information in the ascend log.
EH9999: Inner Error!
rtDeviceSynchronize execute failed, reason=[aicpu exception][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:53]
EH9999: 2024-07-02-14:14:50.822.422 wait for compute device to finish failed, runtime result = 507018.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]
TraceBack (most recent call last):
(function npuSynchronizeDevice)
结果展示
最后运行Main文件
代码适配(流式)
未完待续