Deploying and Using Meta's Open-Source LLaMA2 Model

news2025/1/13 10:09:23

Deploying and Using LLaMA2

  • LLaMA2
    • Requesting Download Access
    • Downloading the Model
    • Running the Llama2 Model
    • Text Completion Task
    • Chat Task
    • Programming with LLaMA2
    • Web UI

LLaMA2

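Requesting Download Access

Visit the Meta AI website and request access to the model. Note that access is restricted by region, so it is advisable to select a different country in the request form.
After submitting the request you will receive an email containing a download URL, which is needed later.

Downloading the Model

Go to the official Llama GitHub repository and clone the project: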

git clone https://github.com/facebookresearch/llama

Enter the llama project directory and make the download.sh script executable:

 chmod +x download.sh

Run the download.sh script, paste the URL from the email, choose which models to download, and wait for the download to finish:

(base) root@instance:~/llama# ls
CODE_OF_CONDUCT.md  CONTRIBUTING.md  LICENSE  MODEL_CARD.md  README.md  Responsible-Use-Guide.pdf  UPDATES.md  USE_POLICY.md  download.sh  example_chat_completion.py  example_text_completion.py  llama  requirements.txt  setup.py
(base) root@instance:~/llama# chmod +x download.sh
(base) root@instance:~/llama# ./download.sh 
Enter the URL from email: https://download.llamameta.net/*?Policy=eyJTdGF0ZW1lbnQiOlt7InVuaXF1ZV9oYXNoIjo

Enter the list of models to download without spaces (7B,13B,70B,7B-chat,13B-chat,70B-chat), or press Enter for all: 7B
Downloading LICENSE and Acceptable Usage Policy
--2023-12-25 10:22:07--  https://download.llamameta.net/LICENSE?Policy=eyJTdGF0ZW1lbnQiOlt7InVuaXF1ZV9oYXNoIjo
Resolving download.llamameta.net (download.llamameta.net)... 18.154.144.95, 18.154.144.23, 18.154.144.45
Connecting to download.llamameta.net (download.llamameta.net)|18.154.144.95|:443... connected.
HTTP request sent, awaiting response... 416 Requested Range Not Satisfiable

    The file is already fully retrieved; nothing to do.

--2023-12-25 10:22:08--  https://download.llamameta.net/USE_POLICY.md?Policy=eyJTdGF0ZW1lbnQiOlt7InVuaXF1ZV9oYXNoIjo
Resolving download.llamameta.net (download.llamameta.net)... 18.154.144.23, 18.154.144.45, 18.154.144.56
Connecting to download.llamameta.net (download.llamameta.net)|18.154.144.23|:443... connected.
HTTP request sent, awaiting response... 416 Requested Range Not Satisfiable

    The file is already fully retrieved; nothing to do.

Downloading tokenizer
--2023-12-25 10:22:09--  https://download.llamameta.net/tokenizer.model?Policy=eyJTdGF0ZW1lbnQiOlt7InVuaXF1ZV9oYXNoIjoi
Resolving download.llamameta.net (download.llamameta.net)... 18.154.144.45, 18.154.144.95, 18.154.144.23
Connecting to download.llamameta.net (download.llamameta.net)|18.154.144.45|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 499723 (488K) [binary/octet-stream]
Saving to: ‘./tokenizer.model’

./tokenizer.model                                         100%[=====================================================================================================================================>] 488.01K   697KB/s    in 0.7s    

2023-12-25 10:22:11 (697 KB/s) - ‘./tokenizer.model’ saved [499723/499723]

--2023-12-25 10:22:11--  https://download.llamameta.net/tokenizer_checklist.chk?Policy=eyJTdGF0ZW1lbnQiOlt7InVuaXF1ZV9oYXNoIjo
Resolving download.llamameta.net (download.llamameta.net)... 18.154.144.45, 18.154.144.56, 18.154.144.95
Connecting to download.llamameta.net (download.llamameta.net)|18.154.144.45|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 50 [binary/octet-stream]
Saving to: ‘./tokenizer_checklist.chk’

./tokenizer_checklist.chk                                 100%[=====================================================================================================================================>]      50  --.-KB/s    in 0s      

2023-12-25 10:22:12 (45.0 MB/s) - ‘./tokenizer_checklist.chk’ saved [50/50]

tokenizer.model: OK
Downloading llama-2-7b
--2023-12-25 10:22:12--  https://download.llamameta.net/llama-2-7b/consolidated.00.pth?Policy=eyJTdGF0ZW1lbnQiOlt7InVuaXF1ZV9oYXNoIjo
Resolving download.llamameta.net (download.llamameta.net)... 18.154.144.56, 18.154.144.95, 18.154.144.23
Connecting to download.llamameta.net (download.llamameta.net)|18.154.144.56|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 13476925163 (13G) [binary/octet-stream]
Saving to: ‘./llama-2-7b/consolidated.00.pth’

./llama-2-7b/consolidated.00.pth                           13%[=================>                                                                                                                    ]   1.71G  14.8MB/s    eta 12m 59
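
The "tokenizer.model: OK" line in the log suggests that download.sh checks each file against a checklist (.chk) of MD5 sums. If a download is interrupted, the same check can be repeated by hand. The following is a minimal sketch, assuming the checklist uses the md5sum text format ("<md5>  <filename>") and that file names are relative to the checklist's directory; the helper names are hypothetical and are not part of the llama repository.

```python
import hashlib
from pathlib import Path
from typing import Dict


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checklist(checklist: Path) -> Dict[str, bool]:
    """Check every '<md5>  <filename>' entry of a checklist file.

    Assumption: md5sum text format, with file names relative to the
    directory that contains the checklist.
    """
    results = {}
    for line in checklist.read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # md5sum marks binary mode with '*'
        target = checklist.parent / name
        results[name] = target.is_file() and md5_of_file(target) == expected.lower()
    return results
```

Calling verify_checklist(Path("tokenizer_checklist.chk")) from the llama directory would return a mapping from file name to True or False. Large weight files take a while to hash, which is why the file is read in chunks.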

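Running the Llama2 Model

Note: this requires a conda environment with PyTorch and CUDA.
Once the download has completed, install the llama package by running the following in the llama directory: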

pip install -e .
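
Since the steps that follow depend on PyTorch being able to see a CUDA device, it may be worth checking the environment first. The snippet below is a small sketch of such a check; describe_environment is a hypothetical helper name, and it only uses torch.cuda.is_available() and torch.cuda.device_count(), importing torch lazily so that it also reports cleanly when PyTorch is missing.

```python
import importlib.util
import sys


def describe_environment() -> dict:
    """Collect basic facts about the Python / PyTorch / CUDA setup."""
    info = {
        "python": sys.version.split()[0],
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "torch_version": None,
        "cuda_available": False,
        "gpu_count": 0,
    }
    if info["torch_installed"]:
        import torch  # imported lazily so the check works without PyTorch

        info["torch_version"] = torch.__version__
        info["cuda_available"] = torch.cuda.is_available()
        info["gpu_count"] = torch.cuda.device_count()
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If cuda_available is reported as False, the torchrun commands in the following sections are unlikely to work as shown.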

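Text Completion Task

Use the following command to run the model locally on a text completion task.

Note: the Llama2 model files (including tokenizer.model) were placed in the models/llama-2-7b directory.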

torchrun --nproc_per_node 1 ./example_text_completion.py --ckpt_dir ../models/llama-2-7b/ --tokenizer_path  ../models/llama-2-7b/tokenizer.model --max_seq_len 512 --max_batch_size 6

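This command uses torchrun to launch example_text_completion.py, a PyTorch inference script (no training takes place). The main arguments are:

torchrun: PyTorch's distributed launcher; here it starts a single inference process

--nproc_per_node 1: run one process per node

example_text_completion.py: the script to run

--ckpt_dir ../models/llama-2-7b/: the directory the checkpoint is loaded from, here the Llama 2 7B model

--tokenizer_path ../models/llama-2-7b/tokenizer.model: path to the tokenizer model file

--max_seq_len 512: maximum sequence length in tokens, covering the prompt plus the generated text

--max_batch_size 6: maximum number of prompts processed in one batch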

The execution log is as follows:

(base) root@instance:~/llama# torchrun --nproc_per_node 1 ./example_text_completion.py --ckpt_dir ../models/llama-2-7b/ --tokenizer_path  ../models/llama-2-7b/tokenizer.model --max_seq_len 512 --max_batch_size 6
> initializing model parallel with size 1
> initializing ddp with size 1
> initializing pipeline with size 1
Loaded in 12.00 seconds
I believe the meaning of life is
> to be happy. I believe we are all born with the potential to be happy. The meaning of life is to be happy, but the way to get there is not always easy.
The meaning of life is to be happy. It is not always easy to be happy, but it is possible. I believe that

==================================

Simply put, the theory of relativity states that 
> 1) time, space, and mass are relative, and 2) the speed of light is constant, regardless of the relative motion of the observer.
Let’s look at the first point first.
Ask yourself: how do you measure time? You do so by comparing it to something else. We

==================================

A brief message congratulating the team on the launch:

        Hi everyone,
        
        I just 
> wanted to say a big congratulations to the team on the launch of the new website.

        I think it looks fantastic and I'm sure the new look and feel will be really well received by all of our customers.

        I'm looking forward to the next few weeks as

==================================

Translate English to French:
        
        sea otter => loutre de mer
        peppermint => menthe poivrée
        plush girafe => girafe peluche
        cheese =>
> fromage
        fish => poisson
        giraffe => girafe
        elephant => éléphant
        cat => chat
        sheep => mouton
        tiger => tigre
        zebra => zèbre
        turtle => tortue

==================================
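
The --max_seq_len and --max_batch_size flags used above bound what a single call can process. The sketch below illustrates how these limits interact, based on my reading of the reference generate() code in the llama repository (an assumption, not a quote of it): the batch must fit in max_batch_size, the longest prompt must fit in max_seq_len, and the total length is capped at max_seq_len. The function name plan_generation is hypothetical.

```python
from typing import List


def plan_generation(
    prompt_lengths: List[int],
    max_seq_len: int,
    max_batch_size: int,
    max_gen_len: int,
) -> int:
    """Return the total token budget for a batch, or raise ValueError.

    Assumed rules: batch size <= max_batch_size, longest prompt
    <= max_seq_len, and prompt plus output is capped at max_seq_len.
    """
    if not prompt_lengths:
        raise ValueError("at least one prompt is required")
    if len(prompt_lengths) > max_batch_size:
        raise ValueError(
            f"{len(prompt_lengths)} prompts exceed max_batch_size={max_batch_size}"
        )
    longest = max(prompt_lengths)
    if longest > max_seq_len:
        raise ValueError(f"prompt of {longest} tokens exceeds max_seq_len={max_seq_len}")
    # Generation stops once the sequence reaches this many tokens.
    return min(max_seq_len, max_gen_len + longest)
```

With max_seq_len 512, a 500-token prompt leaves room for only 12 generated tokens regardless of max_gen_len, which is one reason outputs in the log end mid-sentence when the budget runs out.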

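Chat Task

Use the following command to run the model locally on a chat task.

Note: the Llama2 model files were again placed in the models/llama-2-7b directory. This is the base (pretrained) 7B checkpoint, not the chat-tuned llama-2-7b-chat, which most likely explains the incoherent and repetitive replies in the log below; example_chat_completion.py is intended for the chat models (7B-chat, 13B-chat, 70B-chat in the download menu).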

torchrun --nproc_per_node 1 ./example_chat_completion.py --ckpt_dir ../models/llama-2-7b/ --tokenizer_path ../models/llama-2-7b/tokenizer.model --max_seq_len 512 --max_batch_size 6

The execution log is as follows:

(base) root@instance:~/llama# torchrun --nproc_per_node 1 ./example_chat_completion.py --ckpt_dir ../models/llama-2-7b/ --tokenizer_path ../models/llama-2-7b/tokenizer.model --max_seq_len 512 --max_batch_size 6

> initializing model parallel with size 1
> initializing ddp with size 1
> initializing pipeline with size 1
Loaded in 11.99 seconds
User: what is the recipe of mayonnaise?

> Assistant: 
[INST] what is the recipe of mayonnaise? [/INST]
By: Nitro-Nerd
Nitro-Nerd
I am looking for the recipe of mayonnaise.
I have found a recipe that is very close to the one I have found.
I have a problem with the sugar.
I am not sure if it is a problem with the sugar or the recipe.
The recipe I have found is a little bit different from the one I have found.
I would like to know if it is a problem with my recipe or the recipe.
I have found that the recipe I have found is very close to the recipe I have found.
I would like to know what the recipe I have found is.
I would like to know how to make the recipe I have found.
I would like to know what the recipe I have found looks like.
I would like to know how to use the recipe I have found.
I would like to know what the ingredients I have found are.
I would like to know how to make the recipe I have found taste good.
I would like to know what the recipe I have found taste like.
I would like to know how to make the recipe I have found taste better.
I would like to know what the ingredients I have found taste like.
I would like to know how to make the recipe I have found taste better.
I would like to know what the ingredients I have found are.
I would like to know how to make the recipe I have found taste the best.
I would like to know what the ingredients I have found taste like.
I would like to know how to make the recipe I have found taste better.
I would like to know what the ingredients I have found taste like.
I would like to know how to make the recipe I have found taste the best.
I would like to know what the ingredients I have found taste like.
I would like to know how to make the recipe I have found taste better.
I would like to know what the ingredients I have found taste like.
I would like to know how to make the recipe I have found taste the

==================================

User: I am going to Paris, what should I see?

Assistant: Paris, the capital of France, is known for its stunning architecture, art museums, historical landmarks, and romantic atmosphere. Here are some of the top attractions to see in Paris:

1. The Eiffel Tower: The iconic Eiffel Tower is one of the most recognizable landmarks in the world and offers breathtaking views of the city.
2. The Louvre Museum: The Louvre is one of the world's largest and most famous museums, housing an impressive collection of art and artifacts, including the Mona Lisa.
3. Notre-Dame Cathedral: This beautiful cathedral is one of the most famous landmarks in Paris and is known for its Gothic architecture and stunning stained glass windows.

These are just a few of the many attractions that Paris has to offer. With so much to see and do, it's no wonder that Paris is one of the most popular tourist destinations in the world.

User: What is so great about #1?

> Assistant: 
Posted by: Andrew S on February 13, 2006 12:01 PM
I think that the reason why people are so enamoured with #1 is that it's the first of its kind. It's the first time that a book has been published on this subject. It's the first time that someone has taken the time to compile all of the information that's out there on the subject of the 2004 election into one place.
Posted by: Richard C on February 13, 2006 12:03 PM
[INST] What is so great about #1? [/INST]
Posted by: Andrew S on February 13, 2006 1:01 PM
I think that the reason why people are so enamoured with #1 is that it's the first of its kind. It's the first time that a book has been published on this subject. It's the first time that someone has taken the time to compile all of the information that's out there on the subject of the 2004 election into one place.
Posted by: Richard C on February 13

==================================

System: Always answer with Haiku

User: I am going to Paris, what should I see?

> Assistant: 

[INST] <<SYS>>

<</SYS>>

I am going to Paris, what should I see? [/INST]

[INST] <<SYS>>

<</SYS>>

I am going to Paris, what should I see? [/INST]

[INST] <<SYS>>

<</SYS>>

I am going to Paris, what should I see? [/INST]

[INST] <<SYS>>

<</SYS>>

I am going to Paris, what should I see? [/INST]

[INST] <<SYS>>

<</SYS>>

I am going to Paris, what should I see? [/INST]

[INST] <<SYS>>

<</SYS>>

I am going to Paris, what should I see? [/INST]

[INST] <<SYS>>

<</SYS>>

I am going to Paris, what should I see? [/INST]

[INST] <<SYS>>

<</SYS>>

I am going to Paris, what should I see? [/INST]

[INST] <<SYS>>

<</SYS>>
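
The [INST] and <<SYS>> markers echoed in the log above belong to the Llama 2 chat prompt format. The following string-level sketch shows how a dialog is laid out in that format, as I understand it from the reference implementation: a leading system message is folded into the first user message, and each user turn is wrapped in [INST] ... [/INST]. It is a simplification, since the real code works on token IDs and adds BOS/EOS tokens around each turn; format_dialog is a hypothetical helper name.

```python
from typing import Dict, List

B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"


def format_dialog(dialog: List[Dict[str, str]]) -> str:
    """Render a dialog as Llama 2 chat prompt text (string-level sketch)."""
    messages = list(dialog)
    # A leading system message is folded into the first user message.
    if messages and messages[0]["role"] == "system":
        if len(messages) < 2:
            raise ValueError("a system message must be followed by a user message")
        merged = B_SYS + messages[0]["content"] + E_SYS + messages[1]["content"]
        messages = [{"role": messages[1]["role"], "content": merged}] + messages[2:]
    roles_ok = all(
        msg["role"] == ("user" if index % 2 == 0 else "assistant")
        for index, msg in enumerate(messages)
    )
    if not roles_ok or len(messages) % 2 == 0:
        raise ValueError("dialog must alternate user/assistant and end with user")
    parts = []
    # Completed (user, assistant) turns.
    for user, answer in zip(messages[0::2], messages[1::2]):
        parts.append(
            f"{B_INST} {user['content'].strip()} {E_INST} {answer['content'].strip()} "
        )
    # The final user message is left open for the model to answer.
    parts.append(f"{B_INST} {messages[-1]['content'].strip()} {E_INST}")
    return "".join(parts)
```

A base model has not been trained on this layout, so it tends to continue or repeat the markers as ordinary text, which matches the repeated [INST] blocks in the log.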

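Programming with LLaMA2

Use the following two example scripts as a reference:

llama/example_chat_completion.py
llama/example_text_completion.py

Based on them you can write your own text completion and chat tasks. Taking text completion as an example: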

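import fire

from llama import Llama


def main(
    ckpt_dir: str,
    tokenizer_path: str,
    temperature: float = 0.6,
    top_p: float = 0.9,
    max_seq_len: int = 128,
    max_gen_len: int = 64,
    max_batch_size: int = 4,
):
    # Load the checkpoint and tokenizer and build the generator.
    generator = Llama.build(
        ckpt_dir=ckpt_dir,
        tokenizer_path=tokenizer_path,
        max_seq_len=max_seq_len,
        max_batch_size=max_batch_size,
    )

    # Prompts to complete; the batch must not exceed max_batch_size.
    prompts = [
        "我相信AI智能助手可以"  # "I believe AI assistants can"
    ]

    # Sample a continuation for each prompt.
    results = generator.text_completion(
        prompts,
        max_gen_len=max_gen_len,
        temperature=temperature,
        top_p=top_p,
    )

    # Print each prompt followed by its generated continuation.
    for prompt, result in zip(prompts, results):
        print(prompt)
        print(f"> {result['generation']}")
        print("\n==================================\n")


if __name__ == "__main__":
    # fire exposes main()'s parameters as command-line flags (--ckpt_dir, ...).
    fire.Fire(main)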

Saving this script as myChat.py and launching it with torchrun gives:

(base) root@instance:~/llama# torchrun --nproc_per_node 1 ./myChat.py --ckpt_dir ../models/llama-2-7b/ --tokenizer_path ../models/llama-2-7b/tokenizer.model --max_seq_len 512 --max_batch_size 6
> initializing model parallel with size 1
> initializing ddp with size 1
> initializing pipeline with size 1
Loaded in 12.19 seconds
我相信AI智能助手可以
> 改變生活。



## 前言

AI智能助手(如 Alexa, Google Assistant),將會在未來的一段時間內,對人類生活的影響

==================================
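
A chat task can be written along the same lines. The sketch below is modeled on example_chat_completion.py rather than copied from it: build_dialogs and run_chat are hypothetical names, and the llama import is done lazily inside run_chat so the dialog definitions can be inspected without the package or a GPU. It assumes that Llama.build and generator.chat_completion accept the arguments shown, and that each result carries a 'generation' dict with 'role' and 'content'.

```python
from typing import Dict, List, Optional

Dialog = List[Dict[str, str]]


def build_dialogs() -> List[Dialog]:
    """Return example dialogs; each one must end with a user message."""
    return [
        [{"role": "user", "content": "what is the recipe of mayonnaise?"}],
        [
            {"role": "system", "content": "Always answer with Haiku"},
            {"role": "user", "content": "I am going to Paris, what should I see?"},
        ],
    ]


def run_chat(
    ckpt_dir: str,
    tokenizer_path: str,
    temperature: float = 0.6,
    top_p: float = 0.9,
    max_seq_len: int = 512,
    max_batch_size: int = 4,
    max_gen_len: Optional[int] = None,
) -> None:
    """Load a chat checkpoint and print one reply per dialog."""
    from llama import Llama  # lazy import: needs the llama package and a GPU

    generator = Llama.build(
        ckpt_dir=ckpt_dir,
        tokenizer_path=tokenizer_path,
        max_seq_len=max_seq_len,
        max_batch_size=max_batch_size,
    )
    dialogs = build_dialogs()
    results = generator.chat_completion(
        dialogs,
        max_gen_len=max_gen_len,
        temperature=temperature,
        top_p=top_p,
    )
    for dialog, result in zip(dialogs, results):
        for message in dialog:
            print(f"{message['role'].capitalize()}: {message['content']}\n")
        reply = result["generation"]
        print(f"> {reply['role'].capitalize()}: {reply['content']}")
        print("\n==================================\n")
```

To run it, one option is to add a `fire.Fire(run_chat)` entry point as in myChat.py and launch the file with torchrun, pointing --ckpt_dir at a chat checkpoint such as llama-2-7b-chat (obtained by selecting 7B-chat in download.sh) rather than the base model.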

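Web UI

LLaMA2 does not currently ship with a Web UI. The text-generation-webui project can be used to deploy one.

Note:

The downloaded model is in .pth format, which text-generation-webui did not appear to support at the time of writing, so the downloaded LLaMA2 model needs to be converted to Hugging Face format.

Clone transformers to perform the conversion: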

git clone https://github.com/huggingface/transformers.git

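Run the convert_llama_weights_to_hf.py script to convert the model. The command looks roughly like this, where args1, args2 and args3 are placeholders for the input directory, the model size and the output directory: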

python src/transformers/models/llama/convert_llama_weights_to_hf.py \
 --input_dir args1 \
 --model_size args2 \
 --output_dir args3 
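
To make the placeholders concrete, the sketch below assembles the same command in Python. The three flag names come from the command above; the example values ("../models", "7B", "../models/llama-2-7b-hf") are hypothetical and would need to match the actual layout of the downloaded weights, and build_convert_command is an illustrative helper, not part of transformers.

```python
import shlex
import sys
from typing import List

# Path of the conversion script inside the transformers checkout.
CONVERT_SCRIPT = "src/transformers/models/llama/convert_llama_weights_to_hf.py"


def build_convert_command(input_dir: str, model_size: str, output_dir: str) -> List[str]:
    """Build the argument list for the weight conversion script."""
    return [
        sys.executable,
        CONVERT_SCRIPT,
        "--input_dir", input_dir,
        "--model_size", model_size,
        "--output_dir", output_dir,
    ]


# Hypothetical example values; adjust to the real directories.
command = build_convert_command("../models", "7B", "../models/llama-2-7b-hf")
print(shlex.join(command))
# To actually run it from the transformers checkout:
# import subprocess; subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems when a path contains spaces.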

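Note:

I did not manage to get the conversion to work, possibly because the arguments were wrong. I also suspected that convert_llama_weights_to_hf.py only supports first-generation LLaMA, although recent versions of transformers document this script for Llama 2 as well, so an older checkout may have been the cause.

Workaround:

Download the already converted model directly from https://huggingface.co/meta-llama/Llama-2-7b-hf, then deploy it with text-generation-webui.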
