Putting Open-Source Models into Practice: A First Try of the Qwen2.5-Coder Model, Code Without End (Part 1)

2025/7/14 3:41:49

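I. Preface

A code-specialist model is a large language model trained on large code corpora so that it learns common coding patterns and best practices. Drawing on deep learning and natural language processing, it offers code suggestions that help developers avoid common errors and pitfalls as they write.

With such a model, developers get fast, relevant code assistance across different technology stacks. Automated code analysis takes over routine work, which improves productivity and leaves more time for the creative parts of programming.

In short, code-specialist models change how software is built as much as they advance the underlying technology. They give developers stronger tools for working in complex environments, and as the technology matures and spreads, software development should become more efficient and more automated.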

II. Terminology

2.1.Qwen2.5-Coder

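Qwen2.5-Coder is the code-specific series of large language models in the new generation of open-source Tongyi Qianwen (Qwen) models; its predecessor is CodeQwen. Its main characteristics are:

1. Training data and scale

It was trained on up to 5.5 trillion tokens of programming-related data, covering source code, text-code grounding data, and synthetic data, which gives the model broad knowledge of programming patterns.

2. Performance

Compared with CodeQwen1.5, it improves markedly in code generation, code reasoning, and code repair.

3. Foundation for applications

It provides a more complete foundation for real-world code applications such as code agents, so the model is not limited to generating code and adapts better to practical code-related tasks.

4. Context handling

It supports long contexts of up to 128k tokens, enough to handle large bodies of code and complex program logic.

5. Model lineup

Base and instruction-tuned models are released in three parameter sizes: 1.5B, 7B, and a 32B version that was still in development at the time of writing.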

2.2.Qwen2.5-Coder-7B-Instruct

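Qwen2.5-Coder-7B-Instruct is obtained by instruction-tuning Qwen2.5-Coder. Tuning further improves performance on many tasks, and the model stands out in multi-language programming, code reasoning, mathematics, and general capabilities.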

III. Prerequisites

3.1.Basic environment

Operating system: CentOS 7

GPU: Tesla V100-SXM2-32GB, CUDA Version: 12.2

3.2.Download the model

Hugging Face:

https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct/tree/main

ModelScope:

git clone https://www.modelscope.cn/qwen/Qwen2.5-Coder-7B-Instruct.git

PS:

1. Choose the model size that fits your hardware and use case
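
Before loading a downloaded model, it can help to confirm that the local directory is complete. The following is a minimal sketch, not part of the original workflow: `check_model_dir` is a hypothetical helper, and the file names it looks for are assumptions based on the usual Hugging Face checkpoint layout (a `config.json`, a `tokenizer_config.json`, and one or more `*.safetensors` weight files).

```python
from pathlib import Path

# Assumed file names, based on the usual Hugging Face checkpoint layout
REQUIRED_FILES = ("config.json", "tokenizer_config.json")


def check_model_dir(model_dir):
    """Return a list of problems found in model_dir (an empty list means it looks complete)."""
    root = Path(model_dir)
    if not root.is_dir():
        return [f"not a directory: {root}"]
    problems = [
        f"missing file: {name}"
        for name in REQUIRED_FILES
        if not (root / name).is_file()
    ]
    # Weights are assumed to be stored as one or more *.safetensors files
    if not any(root.glob("*.safetensors")):
        problems.append("no *.safetensors weight files found")
    return problems


if __name__ == "__main__":
    # Same local path as the one used by the scripts in this article
    issues = check_model_dir("/data/model/qwen2.5-coder-7b-instruct")
    print("model directory looks complete" if not issues else "\n".join(issues))
```

The check only looks at file names, so it will not notice a truncated or partially downloaded weight file.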

3.3.Create a virtual environment

conda create --name qwen2.5 python=3.10

3.4.Install dependencies

conda activate qwen2.5
pip install transformers torch accelerate
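
After installation, a quick check that the three packages are visible inside the activated environment can save a failed model load later. This is a minimal sketch using only the standard library; `installed_versions` is a hypothetical helper name, and it reads package metadata without importing the packages themselves.

```python
from importlib import metadata

# Packages installed by the pip command in this section
PACKAGES = ("transformers", "torch", "accelerate")


def installed_versions(packages=PACKAGES):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Once torch is present, `torch.cuda.is_available()` can additionally confirm that the GPU is visible to PyTorch.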

IV. Usage

4.1.Code generation capability

# -*- coding: utf-8 -*-
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

# Local directory that holds the downloaded model
modelPath = '/data/model/qwen2.5-coder-7b-instruct'


def loadTokenizer():
    # print("loadTokenizer: ", modelPath)
    tokenizer = AutoTokenizer.from_pretrained(modelPath)
    return tokenizer


def loadModel(config):
    print("loadModel: ", modelPath)
    # device_map="auto" lets accelerate place the weights on the available GPU(s)
    model = AutoModelForCausalLM.from_pretrained(
        modelPath,
        torch_dtype="auto",
        device_map="auto"
    )
    model.generation_config = config
    return model


if __name__ == '__main__':
    prompt = "Write an example of the bubble sort algorithm in Python"
    messages = [
        {"role": "system", "content": "You are a Python programming assistant focused on generating efficient, clear, and readable Python code. Make sure the code follows best practices and add appropriate comments so it is easy to understand."},
        {"role": "user", "content": prompt}
    ]

    # Sampling settings; the keyword arguments override the values stored with the model
    config = GenerationConfig.from_pretrained(modelPath, top_p=0.9, temperature=0.45, repetition_penalty=1.1,
                                              do_sample=True, max_new_tokens=8192)
    tokenizer = loadTokenizer()
    model = loadModel(config)

    # Render the chat messages into the prompt text the model expects
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )
    # Move the inputs to the device that holds the model
    model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

    # Pass the attention mask together with the input ids
    generated_ids = model.generate(**model_inputs)
    # Drop the prompt tokens so that only the newly generated tokens are decoded
    generated_ids = [
        output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]

    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    print(response)
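
Two steps in the script above are easy to gloss over: `apply_chat_template` turns the message list into one prompt string, and the list comprehension after `generate` cuts the prompt tokens off the output. The sketch below imitates both in plain Python. It assumes the Qwen2.5 template follows the ChatML layout with `<|im_start|>` and `<|im_end|>` markers; `render_chatml` and `strip_prompt` are hypothetical names, and the template shipped with the tokenizer remains authoritative (it also covers cases this sketch ignores, such as tool calls).

```python
def render_chatml(messages, add_generation_prompt=True):
    """Approximate the prompt text built from a list of chat messages."""
    parts = []
    for message in messages:
        parts.append(f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


def strip_prompt(input_ids, output_ids):
    """Keep only the tokens that were generated after the prompt."""
    return output_ids[len(input_ids):]


if __name__ == "__main__":
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write bubble sort in Python"},
    ]
    print(render_chatml(messages))
    # Made-up token ids: the first three stand for the prompt
    print(strip_prompt([1, 2, 3], [1, 2, 3, 40, 50]))
```

Because `generate` returns the prompt followed by the continuation, decoding without the slice would echo the whole prompt back in the response.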

Result:

Running the model-generated code in IDEA

Conclusion:

The model can generate runnable code from a stated requirement
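
To run generated code in an IDE, the code first has to be separated from the explanatory text around it. The following is a minimal sketch that assumes the response wraps code in Markdown-style fences (three backticks, optionally followed by a language tag), which is common for chat models but not guaranteed; `extract_code_blocks` is a hypothetical helper.

```python
import re

# The fence marker is built from single backticks to keep this listing easy to embed
FENCE = "`" * 3
CODE_BLOCK = re.compile(FENCE + r"[ \t]*([\w+-]*)[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code_blocks(response):
    """Return (language, code) pairs for every fenced block in the response."""
    return [(lang.lower(), code.rstrip("\n")) for lang, code in CODE_BLOCK.findall(response)]


if __name__ == "__main__":
    # Hypothetical response text, shaped like a typical chat-model answer
    sample = "Here is the code:\n" + FENCE + "python\nprint('hello')\n" + FENCE + "\nDone."
    for lang, code in extract_code_blocks(sample):
        print(f"[{lang or 'unknown'}]")
        print(code)
```

Generated code should still be reviewed before it is executed; extraction only saves the copy-and-paste step.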

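4.2.Code repair capability

Example description:

The correct bubble sort code is deliberately broken. Note that the snippet in the prompt does not actually raise UnboundLocalError: local variable 'j' referenced before assignment; its defect is the flipped comparison (< instead of >), which sorts the list in descending order.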
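
The snippet embedded in the prompt of the script below can also be run on its own to see how it really behaves. This is a minimal sketch: `bubble_sort_broken` and `bubble_sort_fixed` are names chosen for illustration, both work on a copy of the input (unlike the in-place original), and the fix shown is one possible repair, not the model's recorded output.

```python
def bubble_sort_broken(nums):
    """Same logic as the function in the prompt: the comparison uses < instead of >."""
    nums = list(nums)  # work on a copy so the caller's list is untouched
    n = len(nums)
    for i in range(n):
        for j in range(0, n - i - 1):
            if nums[j] < nums[j + 1]:
                nums[j], nums[j + 1] = nums[j + 1], nums[j]
    return nums


def bubble_sort_fixed(nums):
    """Ascending bubble sort: swap when the left neighbor is larger."""
    nums = list(nums)
    n = len(nums)
    for i in range(n):
        for j in range(0, n - i - 1):
            if nums[j] > nums[j + 1]:
                nums[j], nums[j + 1] = nums[j + 1], nums[j]
    return nums


if __name__ == "__main__":
    data = [64, 34, 25, 12, 22, 11, 90]
    print("broken:", bubble_sort_broken(data))
    print("fixed: ", bubble_sort_fixed(data))
    print("fixed matches sorted():", bubble_sort_fixed(data) == sorted(data))
```

Comparing against the built-in `sorted()` is a cheap way to judge whether a repair suggested by the model is correct.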

# -*- coding: utf-8 -*-
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

# Local directory that holds the downloaded model
modelPath = '/data/model/qwen2.5-coder-7b-instruct'


def loadTokenizer():
    # print("loadTokenizer: ", modelPath)
    tokenizer = AutoTokenizer.from_pretrained(modelPath)
    return tokenizer


def loadModel(config):
    # print("loadModel: ",modelPath)
    # device_map="auto" lets accelerate place the weights on the available GPU(s)
    model = AutoModelForCausalLM.from_pretrained(
        modelPath,
        torch_dtype="auto",
        device_map="auto"
    )
    model.generation_config = config
    return model


if __name__ == '__main__':
    # The code inside the prompt is intentionally wrong: it compares with < instead of >
    prompt = '''
I wrote a bubble sort example in Python, but the output is not what I expected. Please fix it. The code is as follows:
def bubble_sort(nums):
    n = len(nums)
    for i in range(n):
        for j in range(0, n-i-1):
            if nums[j] < nums[j+1]:
                nums[j], nums[j+1] = nums[j+1], nums[j]
    return nums

if __name__ == "__main__":
    unsorted_list = [64, 34, 25, 12, 22, 11, 90]
    print("Original list:", unsorted_list)
    sorted_list = bubble_sort(unsorted_list)
    print("Sorted list:", sorted_list)
'''

    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": prompt}
    ]

    # A low temperature keeps the repair close to deterministic
    config = GenerationConfig.from_pretrained(modelPath, top_p=0.9, temperature=0.1, repetition_penalty=1.1,
                                              do_sample=True, max_new_tokens=8192)
    tokenizer = loadTokenizer()
    model = loadModel(config)

    # Render the chat messages into the prompt text the model expects
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )
    # Move the inputs to the device that holds the model
    model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

    # Pass the attention mask together with the input ids
    generated_ids = model.generate(**model_inputs)
    # Drop the prompt tokens so that only the newly generated tokens are decoded
    generated_ids = [
        output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]

    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    print(response)

Result: