Large Model Learning and Practice Notes (9)

2025/4/7 7:02:42

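I. Deploying with LMDeploy

1. Local chat deployment: use LMDeploy to deploy the InternLM-Chat-7B model in local chat mode and have it generate a 300-word short story.

2. API deployment

Run:

Result:

GPU memory usage: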
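One way to record GPU memory usage while the model is being served is to query `nvidia-smi` from Python. The following is a minimal sketch, not the method used in these notes: it assumes `nvidia-smi` is on the PATH, and the helper names `parse_memory_csv` and `query_gpu_memory` are hypothetical.

```python
import subprocess

# nvidia-smi query that prints one "used, total" line per GPU, in MiB,
# without a header row or unit suffixes.
QUERY_CMD = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_csv(text):
    """Parse the CSV output of QUERY_CMD into a list of (used_mib, total_mib)."""
    usage = []
    for line in text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        usage.append((used, total))
    return usage


def query_gpu_memory():
    """Run nvidia-smi and return (used_mib, total_mib) for every visible GPU."""
    result = subprocess.run(QUERY_CMD, capture_output=True, text=True, check=True)
    return parse_memory_csv(result.stdout)
```

Calling `query_gpu_memory()` before and after loading the model gives a rough estimate of how much memory the deployment takes.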

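II. Errors and Solutions

Installing lmdeploy with the pip command below failed while building the wheel for its flash-attn dependency.

1. Installation command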

pip install 'lmdeploy[all]==v0.1.0'

2. Error message:

Building wheels for collected packages: flash-attn
  Building wheel for flash-attn (setup.py) ... error
  error: subprocess-exited-with-error
  
  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [9 lines of output]
      fatal: not a git repository (or any of the parent directories): .git
      
      
      torch.__version__  = 2.0.1
      
      
      running bdist_wheel
      Guessing wheel URL:  https://github.com/Dao-AILab/flash-attention/releases/download/v2.4.2/flash_attn-2.4.2+cu118torch2.0cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
      error: <urlopen error Tunnel connection failed: 503 Service Unavailable>
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for flash-attn
  Running setup.py clean for flash-attn
Failed to build flash-attn
ERROR: Could not build wheels for flash-attn, which is required to install pyproject.toml-based projects

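3. Solution

(1) Download the prebuilt wheel that matches your environment from https://github.com/Dao-AILab/flash-attention/releases/. The file name encodes the flash-attn version, CUDA version, torch version, C++11 ABI setting, and Python version.

(2) Install the downloaded wheel with pip: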

pip install flash_attn-2.3.5+cu117torch2.0cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
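To pick the right file on the releases page, it can help to compute the wheel name your environment expects. The sketch below rebuilds the naming pattern visible in the "Guessing wheel URL" line of the error log. The pattern is inferred from that log only and may differ for other flash-attn releases, and the helper names `guess_wheel_name` and `guess_wheel_url` are hypothetical.

```python
import sys

# Base URL taken from the "Guessing wheel URL" line in the error log.
RELEASE_BASE = "https://github.com/Dao-AILab/flash-attention/releases/download"


def guess_wheel_name(flash_version, cuda_version, torch_version, cxx11_abi,
                     py_tag=None, platform="linux_x86_64"):
    """Build a wheel file name following the pattern seen in the error log."""
    if py_tag is None:
        # Default to the running interpreter, e.g. "cp310" for Python 3.10.
        py_tag = f"cp{sys.version_info.major}{sys.version_info.minor}"
    # "11.8" -> "cu118"
    cuda_tag = "cu" + "".join(cuda_version.split(".")[:2])
    # "2.0.1" or "2.0.1+cu118" -> "torch2.0"
    torch_tag = "torch" + ".".join(torch_version.split("+")[0].split(".")[:2])
    abi_tag = "cxx11abi" + ("TRUE" if cxx11_abi else "FALSE")
    return (f"flash_attn-{flash_version}+{cuda_tag}{torch_tag}{abi_tag}"
            f"-{py_tag}-{py_tag}-{platform}.whl")


def guess_wheel_url(flash_version, cuda_version, torch_version, cxx11_abi,
                    py_tag=None, platform="linux_x86_64"):
    """Build the download URL for the wheel under the release tag v<version>."""
    name = guess_wheel_name(flash_version, cuda_version, torch_version,
                            cxx11_abi, py_tag, platform)
    return f"{RELEASE_BASE}/v{flash_version}/{name}"


if __name__ == "__main__":
    # Values from the error log: flash-attn 2.4.2, CUDA 11.8, torch 2.0.1, cp310.
    print(guess_wheel_url("2.4.2", "11.8", "2.0.1", False, "cp310"))
```

In a real environment, the inputs can be read from `torch.__version__`, `torch.version.cuda`, and `torch.compiled_with_cxx11_abi()`. A wheel may not be published for every combination, so check that the computed name is actually listed on the releases page.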

4. Reference link

https://github.com/Dao-AILab/flash-attention/issues/224
