Merging Llama-7b-hf and vicuna-7b-delta-v0 into vicuna-7b-v0

news2024/11/19 15:33:37

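I recently needed vicuna-7b-v0 for PandaGPT, so I went through the merge process again. I had deployed vicuna-7b-v1.3 a while ago, and the two differ quite a bit: updates to transformers and fastchat leave many things mismatched and cause a lot of errors, so I am recording them here.

For related content, see "Deploying vicuna-7b-v1.3 with FastChat in practice" (Spielberg_1's blog, CSDN).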

I. Set up the environment

conda create -n fastchat python=3.9   # FastChat officially recommends Python >= 3.8

Activate the fastchat environment:

conda activate fastchat

Install torch==1.13.1, torchvision==0.14.1, and torchaudio==0.13.1:

pip install torch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 

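II. Install fastchat and transformers

Install fschat==0.1.10. According to the official compatibility notes, the vicuna-7b-delta-v0 weights require a FastChat version no higher than 0.1.10:

pip install fschat==0.1.10

Install transformers: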

pip install transformers
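Because most of the errors later in this post come from version mismatches, it helps to record what actually got installed. The following is a minimal sketch using only the standard library; the helper name installed_versions is made up for illustration, and the package list simply mirrors the pip commands above.

```python
from importlib import metadata

# Distribution names as used in the pip commands above ("fschat" is FastChat).
PACKAGES = ["torch", "torchvision", "torchaudio", "fschat", "transformers"]


def installed_versions(names):
    """Return {name: version string, or None if the package is not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name:<14}{version or 'not installed'}")
```

Run it inside the activated fastchat environment and keep the output next to any error report.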

III. Merge the weights to produce the vicuna-7b-v0 model

python -m fastchat.model.apply_delta \
       --base /root/LLaMA-7B-hf/llama-7b-hf \
       --target /root/vicuna-7b-v0 \
       --delta /root/vicuna-7b-delta-v0
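--base: path to the llama-7b model in Hugging Face (hf) format
--target: output path for the merged vicuna-7b-v0 model; needed later when starting FastChat
--delta: path to the vicuna-7b-delta-v0 weights downloaded from Hugging Face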

Merging the vicuna-7b model needs about 30 GB of RAM, so check your available memory before starting.
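A rough back-of-the-envelope check of that memory figure can be sketched as follows. It assumes that the base and delta checkpoints are both held in memory as 16-bit floats and that LLaMA-7B has roughly 6.74 billion parameters; both numbers are assumptions for illustration, not measurements from this run.

```python
def checkpoint_gib(n_params, bytes_per_param=2):
    """Approximate in-memory size, in GiB, of n_params weights."""
    return n_params * bytes_per_param / 2**30


N_PARAMS = 6.74e9  # assumed approximate parameter count of LLaMA-7B

base_gib = checkpoint_gib(N_PARAMS)   # base model in fp16 (assumption)
delta_gib = checkpoint_gib(N_PARAMS)  # delta weights in fp16 (assumption)

print(f"base  ~ {base_gib:.1f} GiB")
print(f"delta ~ {delta_gib:.1f} GiB")
print(f"both  ~ {base_gib + delta_gib:.1f} GiB, before any loading or saving overhead")
```

Two fp16 copies come to roughly 25 GiB, so a requirement of around 30 GB is plausible once temporary buffers are included.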

The merged model is saved to /root/vicuna-7b-v0.

Command-line output:

(fastchat) root@dl-230904040428gxb-pod-jupyter-7599dcdb54-qjppf:~# python -m fastchat.model.apply_delta --base /root/LLaMA-7B-hf/llama-7b-hf --target /root/vicuna-7b-v0 --delta /root/vicuna-7b-delta-v0
Loading the base model from /root/LLaMA-7B-hf/llama-7b-hf
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 33/33 [02:12<00:00,  4.03s/it]
Loading the delta from /root/vicuna-7b-delta-v0
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [02:56<00:00, 88.21s/it]
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. If you see this, DO NOT PANIC! This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=True`. This should only be set if you understand what it means, and thouroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
You are resizing the embedding layer without providing a `pad_to_multiple_of` parameter. This means that the new embeding dimension will be 32001. This might induce some performance reduction as *Tensor Cores* will not be available. For more details  about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
Applying the delta
Applying delta: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 291/291 [00:24<00:00, 11.96it/s]
Saving the target model to /root/vicuna-7b-v0
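The log shows three phases: load the base, load the delta, then apply the delta to every parameter. Conceptually, applying a delta means adding it to the base weights element by element. The toy sketch below illustrates only that idea, using flat Python lists in place of tensors; the function and parameter names are hypothetical, and this is not FastChat's actual code.

```python
def apply_delta(base, delta):
    """Return target weights where target[name] = base[name] + delta[name]."""
    if base.keys() != delta.keys():
        raise KeyError("base and delta must contain the same parameter names")
    target = {}
    for name, base_values in base.items():
        delta_values = delta[name]
        if len(base_values) != len(delta_values):
            raise ValueError(f"shape mismatch for {name}")
        # Element-wise addition recovers the fine-tuned weights.
        target[name] = [b + d for b, d in zip(base_values, delta_values)]
    return target


# Hypothetical two-parameter "model" used only for illustration.
base = {"layer.weight": [0.5, -1.0, 2.0], "layer.bias": [0.0, 1.0]}
delta = {"layer.weight": [0.25, 0.5, -1.0], "layer.bias": [1.0, -1.0]}
target = apply_delta(base, delta)
print(target)
```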

IV. Inference

python3 -m fastchat.serve.cli --model-path /root/vicuna-7b-v0
--model-path: the target path produced by the merge in the previous step

Console output (Human is the input, Assistant is the model's reply):

(fastchat) root@dl-230904040428gxb-pod-jupyter-7599dcdb54-qjppf:~# python3 -m fastchat.serve.cli --model-path /root/vicuna-7b-v0
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. If you see this, DO NOT PANIC! This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=True`. This should only be set if you understand what it means, and thouroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [02:00<00:00, 60.38s/it]
Human: hello
Assistant: Hello! How can I help you today?
Human: can you tell me a jok
Assistant: Sure, here's a joke for you:
Why was the math book sad?
Because it had too many problems!
I hope you found that joke amusing. Do you have any other questions or topics you'd like to discuss?
Human: a guide to touring in China
Assistant: China is a vast and diverse country with a rich history and culture, and there is no shortage of things to see and do. Here are some tips for touring in China:
1. Plan ahead: Research the destinations you want to visit, the best time to go, and the logistics of getting there. Consider factors such as cost, time, transportation, and accommodation.
2. Get a visa: Depending on your nationality, you may need to obtain a visa before you travel to China. Make sure you have enough time to apply for a visa and that you have all the necessary documents.
3. Pack appropriately: Pack clothes for the season, and bring a mix of casual and formal wear. Also, don't forget to pack any necessary medication and travel accessories like a first aid kit, phone charger, and a portable battery pack.
4. Learn some basic Mandarin: While many people in China speak English, having some basic Mandarin will help you communicate with locals and navigate the country more easily.
5. Be respectful: China is a highly cultured country, and it's important to be respectful of local customs and traditions. Dress modestly, avoid pointing with your chin, and avoid discussing sensitive political or religious topics.
6. Try the local food: Chinese cuisine is diverse and delicious, and there are many local specialties to try. Don't be afraid to try new things, but be aware of food safety concerns and only eat at reputable restaurants.
7. Stay safe: Be aware of your surroundings and take proper safety precautions, such as avoiding unfamiliar or poorly lit areas at night, and keeping valuable items safe.
8. Enjoy the experience: China is a beautiful and fascinating country, and there is no shortage of things to see and do. Enjoy the experience, take it all in, and have a great time!
Human: how to study vicuna
Assistant: Studying vicuñas can be a challenging and rewarding experience, but it requires careful planning and preparation. Here are some tips for studying vicuñas:
1. Research the species: Learn about the biology, behavior, and ecology of vicuñas, including their habitat, diet, and social structure.
2. Identify the best sites: Identify the best sites to study vicuñas, such as national parks, wildlife reserves, or other protected areas. Make sure you have permission to study the vicuñas and that you are familiar with the local regulations and laws.
3. Develop your skills: Develop the necessary skills to study vicuñas, such as knowledge of the local language, tracking and observation skills, and experience in capturing and handling animals.
4. Set up your equipment: Set up your equipment, such as cameras, GPS devices, and other necessary tools, to monitor and study the vicuñas.
5. Observe and collect data: Observe the vicuñas in their natural habitat and collect data on their behavior, such as their movement patterns, feeding habits, and social interactions.
6. Analyze your data: Analyze the data you have collected and draw conclusions about the behavior and ecology of the vicuñas.
7. Communicate your findings: Communicate your findings to other researchers and conservationists, and use your research to inform conservation efforts and protect the vicuñas.
8. Consider the ethics: Remember to consider the ethical implications of your study and to minimize any negative impacts on the vicuñas and their habitat.
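The transcript alternates Human and Assistant turns. A chat front end usually flattens such turns into a single prompt string before calling the model. The sketch below shows one way this could be done. The system sentence and the "###" separator are assumptions about the v0 conversation template and are not confirmed anywhere in this post, so check fastchat/conversation.py in your installed version for the real values.

```python
# Both constants are assumptions for illustration, not values taken from this post.
SYSTEM = (
    "A chat between a curious human and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the human's questions."
)
SEP = "###"


def build_prompt(turns, system=SYSTEM, sep=SEP):
    """Flatten (role, text) turns into one prompt; text=None marks the slot to generate."""
    prompt = system + sep
    for role, text in turns:
        if text is None:
            prompt += f"{role}:"
        else:
            prompt += f"{role}: {text}{sep}"
    return prompt


if __name__ == "__main__":
    print(build_prompt([("Human", "hello"), ("Assistant", None)]))
```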

Summary: vicuna-7b handles English, but its answering ability is limited.

Problems encountered

ImportError: cannot import name 'is_tokenizers_available' from 'transformers.utils' 

Cause: the installed transformers version does not match.

Fix: install transformers:

pip install transformers

The version installed this way was 4.32.1.

ValueError: Tokenizer class LLaMATokenizer does not exist or is not currently imported.

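Cause: transformers renamed the tokenizer class from LLaMATokenizer to LlamaTokenizer, but the llama-7b checkpoint's tokenizer_config.json still names the old class, so AutoTokenizer cannot resolve it. The workaround is to load the model with LlamaTokenizer and LlamaForCausalLM directly instead of AutoTokenizer and AutoModelForCausalLM, and to correct the class name in the config file.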

Fix:

1. Open fastchat/model/apply_delta.py and replace every AutoTokenizer with LlamaTokenizer and every AutoModelForCausalLM with LlamaForCausalLM.

2. Find the llama-7b model directory and edit its tokenizer_config.json file:

change "tokenizer_class": "LLaMATokenizer" to "tokenizer_class": "LlamaTokenizer".
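The tokenizer_config.json change can also be scripted rather than done by hand. This is a minimal sketch using only the standard library; the function name fix_tokenizer_class is made up for illustration, and the directory you pass in is whatever your --base path is.

```python
import json
from pathlib import Path


def fix_tokenizer_class(model_dir, old="LLaMATokenizer", new="LlamaTokenizer"):
    """Rewrite tokenizer_class in <model_dir>/tokenizer_config.json.

    Returns True if the file was changed, False if it did not use the old name.
    """
    config_path = Path(model_dir) / "tokenizer_config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    if config.get("tokenizer_class") != old:
        return False
    config["tokenizer_class"] = new
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return True


# Example call (path taken from the merge command in this post):
# fix_tokenizer_class("/root/LLaMA-7B-hf/llama-7b-hf")
```

Running it a second time returns False and leaves the file untouched, so it is safe to repeat.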

ImportError: cannot import name 'LlamaTokenizerFast' from 'transformers'

Cause: the installed transformers version does not provide LlamaTokenizerFast.

Fix:

Make sure the latest transformers library is installed. You can upgrade it with pip:

pip install --upgrade transformers

Then check that LlamaTokenizerFast can be imported from transformers:

python -c "from transformers import LlamaTokenizerFast"

If the command reports no error, the LlamaTokenizerFast class is available.
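The same kind of check can be done from Python for any name that an ImportError complains about. The helper below is a small sketch, and can_import is a made-up name; it approximates "from module import attr" by importing the module and looking the attribute up, and treats any import-related failure as "not available".

```python
import importlib


def can_import(module_name, attr):
    """Return True if attr can be obtained from module_name, False otherwise."""
    try:
        module = importlib.import_module(module_name)
        getattr(module, attr)
    except (ImportError, AttributeError, RuntimeError):
        # RuntimeError is included because lazily loaded packages may raise it
        # when an optional dependency behind the attribute is missing.
        return False
    return True


if __name__ == "__main__":
    for module_name, attr in [
        ("transformers", "LlamaTokenizerFast"),
        ("transformers", "LlamaTokenizer"),
        ("transformers.utils", "is_tokenizers_available"),
    ]:
        print(f"{module_name}.{attr}: {can_import(module_name, attr)}")
```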

UnboundLocalError: local variable 'sentencepiece_model_pb2' referenced before assignment

Fix: this error appeared when the protobuf package was missing, so install it:

pip install protobuf

Reference: Error:AutoTokenizer.from_pretrained,UnboundLocalError: local variable 'sentencepiece_model_pb2' referenced before assignment · Issue #25848 · huggingface/transformers · GitHub

RuntimeError: The size of tensor a (32000) must match the size of tensor b (32001) at non-singleton dimension 0

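Cause: the fastchat version does not match the v0 delta weights. Downgrade fastchat to 0.1.10.

See the FastChat weights version compatibility document: https://github.com/lm-sys/FastChat/blob/main/docs/vicuna_weights_version.md

What worked: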

pip install fschat==0.1.10
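The 32000 versus 32001 in the error message refers to the number of rows in the token-embedding matrix: the merge log earlier in this post shows the embedding layer being resized to 32001. The toy sketch below reproduces the same kind of failure and one way around it, with nested lists standing in for tensors. The function names are hypothetical, and filling the extra row with zeros is a simplification for illustration, not what transformers or FastChat actually do.

```python
def add_matrices(a, b):
    """Element-wise sum of two matrices that must have the same number of rows."""
    if len(a) != len(b):
        raise RuntimeError(
            f"The size of a ({len(a)}) must match the size of b ({len(b)})"
        )
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def pad_rows(matrix, n_rows):
    """Return a copy of matrix extended with zero rows up to n_rows."""
    if n_rows < len(matrix):
        raise ValueError("cannot shrink the matrix")
    width = len(matrix[0])
    extra = [[0.0] * width for _ in range(n_rows - len(matrix))]
    return [row[:] for row in matrix] + extra


# 3 rows stand in for the base model's 32000; 4 rows for the delta's 32001.
base_emb = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
delta_emb = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [9.0, 9.0]]

try:
    add_matrices(base_emb, delta_emb)
except RuntimeError as err:
    print("without resizing:", err)

merged = add_matrices(pad_rows(base_emb, len(delta_emb)), delta_emb)
print("after resizing:", merged)
```

A FastChat release that matches the delta version takes care of this resizing during the merge, which is why pinning fschat==0.1.10 resolves the error.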

Reference: RuntimeError: The size of tensor a (32000) must match the size of tensor b (32001) at non-singleton dimension 0 · Issue #132 · Vision-CAIR/MiniGPT-4 · GitHub

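Ubuntu commands used:

chmod +rwx file      Add read, write, and execute permissions to file (r = read, w = write, x = execute)

chmod -rwx file       Remove read, write, and execute permissions from file

nvidia-smi -l 5          Refresh nvidia-smi every 5 seconds to watch GPU utilization and memory usage in real time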
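For scripts, the permission changes can also be made from Python. The sketch below is an approximation rather than an exact equivalent: it adds rwx for the owner only (like chmod u+rwx) and removes rwx for everyone (like chmod a-rwx), whereas plain +rwx and -rwx on the command line are additionally filtered by the umask. The function names are made up for illustration.

```python
import os
import stat


def add_owner_rwx(path):
    """Add read, write, and execute permission for the file owner."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode | stat.S_IRWXU)


def remove_all_rwx(path):
    """Remove read, write, and execute permission for owner, group, and others."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & ~(stat.S_IRWXU | stat.S_IRWXG | stat.S_IRWXO))


def permission_string(path):
    """Return the familiar ls-style string, e.g. '-rwx------'."""
    return stat.filemode(os.stat(path).st_mode)
```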

References

"Deploying vicuna-7b-v1.3 with FastChat in practice", Spielberg_1's blog, CSDN

"Using the nvidia-smi command to watch GPU utilization and memory usage in real time", 我们是宇宙中最孤独的孩子's blog, CSDN

"Deploying MiniGPT-4 locally on an RTX 3090", Zhihu

"Fixing ValueError: Tokenizer class LLaMATokenizer does not exist or is not currently imported", wx6176918821622's tech blog, 51CTO

http://www.kuazhi.com/post/445223.html

"Bugs and fixes when deploying ChatGPT DeepSpeed", Kuazhi (夸智网) blog

"Error:AutoTokenizer.from_pretrained,UnboundLocalError: local variable 'sentencepiece_model_pb2' referenced before assignment", Issue #25848, huggingface/transformers, GitHub

"How to change read/write permission settings on Ubuntu", 小小蚂蚁

"Notes on pitfalls when running the FastChat-vicuna model", Zhihu

"Deploying Vicuna-7B and Vicuna-13B on Windows 10/11 with GPU or CPU", babytiger's blog, CSDN

