LangChain Deployment Component: LangServe


Source: 🦜️🏓 LangServe | 🦜️🔗 LangChain

LangServe 

🚩 We will be releasing a hosted version of LangServe for one-click deployments of LangChain applications. Sign up here to get on the waitlist.

Overview​

LangServe helps developers deploy LangChain runnables and chains as a REST API.

This library is integrated with FastAPI and uses pydantic for data validation.

In addition, it provides a client that can be used to call into runnables deployed on a server. A JavaScript client is available in LangChain.js.

Features​

  • Input and Output schemas automatically inferred from your LangChain object, and enforced on every API call, with rich error messages
  • API docs page with JSONSchema and Swagger (insert example link)
  • Efficient /invoke, /batch and /stream endpoints with support for many concurrent requests on a single server
  • /stream_log endpoint for streaming all (or some) intermediate steps from your chain/agent
  • Playground page at /playground with streaming output and intermediate steps
  • Built-in (optional) tracing to LangSmith, just add your API key (see instructions)
  • All built with battle-tested open-source Python libraries like FastAPI, Pydantic, uvloop and asyncio.
  • Use the client SDK to call a LangServe server as if it was a Runnable running locally (or call the HTTP API directly)
  • LangServe Hub

Limitations​

  • Client callbacks are not yet supported for events that originate on the server
  • OpenAPI docs will not be generated when using Pydantic V2. FastAPI does not support mixing pydantic v1 and v2 namespaces. See the section below for more details.

Hosted LangServe​

We will be releasing a hosted version of LangServe for one-click deployments of LangChain applications. Sign up here to get on the waitlist.

Security​

  • Vulnerability in Versions 0.0.13 - 0.0.15 -- playground endpoint allows accessing arbitrary files on server. Resolved in 0.0.16.

Installation​

For both client and server:

pip install "langserve[all]"

or pip install "langserve[client]" for client code, and pip install "langserve[server]" for server code.

LangChain CLI 🛠️​

Use the LangChain CLI to bootstrap a LangServe project quickly.

To use the langchain CLI, make sure that you have a recent version of langchain-cli installed. You can install it with pip install -U langchain-cli.

langchain app new ../path/to/directory
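Assuming a standard langchain-cli project layout, and after installing the project's dependencies, you can then typically start a development server from inside the new project with the CLI's serve command:

cd ../path/to/directory
langchain serve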

Examples​

Get your LangServe instance started quickly with LangChain Templates.

For more examples, see the templates index or the examples directory.

Server​

Here's a server that deploys an OpenAI chat model, an Anthropic chat model, and a chain that uses the Anthropic model to tell a joke about a topic.

#!/usr/bin/env python
from fastapi import FastAPI
from langchain.prompts import ChatPromptTemplate
from langchain.chat_models import ChatAnthropic, ChatOpenAI
from langserve import add_routes


app = FastAPI(
  title="LangChain Server",
  version="1.0",
  description="A simple api server using Langchain's Runnable interfaces",
)

add_routes(
    app,
    ChatOpenAI(),
    path="/openai",
)

add_routes(
    app,
    ChatAnthropic(),
    path="/anthropic",
)

model = ChatAnthropic()
prompt = ChatPromptTemplate.from_template("tell me a joke about {topic}")
add_routes(
    app,
    prompt | model,
    path="/joke",
)

if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="localhost", port=8000)

Docs​

If you've deployed the server above, you can view the generated OpenAPI docs using:

⚠️ If using pydantic v2, docs will not be generated for invoke/batch/stream/stream_log. See Pydantic section below for more details.

curl localhost:8000/docs

Make sure to add the /docs suffix.

⚠️ The index page / is not defined by design, so curl localhost:8000 or visiting the URL will return a 404. If you want content at /, define an endpoint with @app.get("/").
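If you do want something at the root, a minimal sketch (added to the server above) is to redirect / to the API docs:

from fastapi.responses import RedirectResponse


@app.get("/")
async def redirect_root_to_docs():
    return RedirectResponse("/docs")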

Client​

Python SDK

from langchain.schema import SystemMessage, HumanMessage
from langchain.prompts import ChatPromptTemplate
from langchain.schema.runnable import RunnableMap
from langserve import RemoteRunnable

openai = RemoteRunnable("http://localhost:8000/openai/")
anthropic = RemoteRunnable("http://localhost:8000/anthropic/")
joke_chain = RemoteRunnable("http://localhost:8000/joke/")

joke_chain.invoke({"topic": "parrots"})

# or async
await joke_chain.ainvoke({"topic": "parrots"})

prompt = [
    SystemMessage(content='Act like either a cat or a parrot.'),
    HumanMessage(content='Hello!')
]

# Supports astream
async for msg in anthropic.astream(prompt):
    print(msg, end="", flush=True)

prompt = ChatPromptTemplate.from_messages(
    [("system", "Tell me a long story about {topic}")]
)

# Can define custom chains
chain = prompt | RunnableMap({
    "openai": openai,
    "anthropic": anthropic,
})

chain.batch([{ "topic": "parrots" }, { "topic": "cats" }])

In TypeScript (requires LangChain.js version 0.0.166 or later):

import { RemoteRunnable } from "langchain/runnables/remote";

const chain = new RemoteRunnable({
  url: `http://localhost:8000/joke/`,
});
const result = await chain.invoke({
  topic: "cats",
});

Python using requests:

import requests
response = requests.post(
    "http://localhost:8000/joke/invoke/",
    json={'input': {'topic': 'cats'}}
)
response.json()

You can also use curl:

curl --location --request POST 'http://localhost:8000/joke/invoke/' \
    --header 'Content-Type: application/json' \
    --data-raw '{
        "input": {
            "topic": "cats"
        }
    }'

Endpoints​

The following code:

...
add_routes(
  app,
  runnable,
  path="/my_runnable",
)

adds these endpoints to the server:

  • POST /my_runnable/invoke - invoke the runnable on a single input
  • POST /my_runnable/batch - invoke the runnable on a batch of inputs
  • POST /my_runnable/stream - invoke on a single input and stream the output
  • POST /my_runnable/stream_log - invoke on a single input and stream the output, including output of intermediate steps as it's generated
  • GET /my_runnable/input_schema - json schema for input to the runnable
  • GET /my_runnable/output_schema - json schema for output of the runnable
  • GET /my_runnable/config_schema - json schema for config of the runnable

These endpoints match the LangChain Expression Language interface -- please reference this documentation for more details.
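For example, assuming the /joke route from the server above, you can fetch the inferred input schema directly; the response is a JSON Schema describing the expected input (for the joke chain, an object with a topic field):

curl localhost:8000/joke/input_schema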

Playground​

You can find a playground page for your runnable at /my_runnable/playground. This exposes a simple UI to configure and invoke your runnable with streaming output and intermediate steps.

Widgets​

The playground supports widgets and can be used to test your runnable with different inputs.

In addition, for configurable runnables, the playground will allow you to configure the runnable and share a link with the configuration.

Legacy Chains​

LangServe works with both Runnables (constructed via LangChain Expression Language) and legacy chains (inheriting from Chain). However, some of the input schemas for legacy chains may be incomplete/incorrect, leading to errors. This can be fixed by updating the input_schema property of those chains in LangChain. If you encounter any errors, please open an issue on THIS repo, and we will work to address it.

Deployment​

Deploy to GCP​

You can deploy to GCP Cloud Run using the following command:

gcloud run deploy [your-service-name] --source . --port 8001 --allow-unauthenticated --region us-central1 --set-env-vars=OPENAI_API_KEY=your_key

Pydantic​

LangServe provides support for Pydantic 2 with some limitations.

  1. OpenAPI docs will not be generated for invoke/batch/stream/stream_log when using Pydantic V2. FastAPI does not support mixing pydantic v1 and v2 namespaces.
  2. LangChain uses the v1 namespace in Pydantic v2. Please read the following guidelines to ensure compatibility with LangChain.

Except for these limitations, we expect the API endpoints, the playground and any other features to work as expected.
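In practice, ensuring compatibility usually comes down to importing pydantic objects through the v1 namespace when pydantic 2 is installed (the same pattern appears in the widget snippet below); a minimal sketch:

try:
    from pydantic.v1 import BaseModel  # pydantic 2 installed
except ImportError:
    from pydantic import BaseModel  # pydantic 1 installed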

Advanced​

Handling Authentication​

If you need to add authentication to your server, please reference FastAPI's security documentation and middleware documentation.
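As a minimal sketch (not LangServe-specific), an app-wide FastAPI dependency can gate every route that add_routes registers; the X-Token header name and the expected value below are placeholders:

from fastapi import Depends, FastAPI, Header, HTTPException
from langchain.schema.runnable import RunnableLambda

from langserve import add_routes


async def verify_token(x_token: str = Header(...)) -> None:
    # Reject requests whose X-Token header does not match the expected value.
    if x_token != "secret-token":
        raise HTTPException(status_code=401, detail="X-Token header invalid")


# Applying the dependency at the app level protects every added route.
app = FastAPI(dependencies=[Depends(verify_token)])

add_routes(app, RunnableLambda(lambda x: x), path="/echo")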

Files​

LLM applications often deal with files. There are several possible architectures for implementing file processing; at a high level:

  1. The file may be uploaded to the server via a dedicated endpoint and processed using a separate endpoint
  2. The file may be uploaded by either value (bytes of file) or reference (e.g., s3 url to file content)
  3. The processing endpoint may be blocking or non-blocking
  4. If significant processing is required, the processing may be offloaded to a dedicated process pool

You should determine the appropriate architecture for your application.

Currently, to upload files by value to a runnable, use base64 encoding for the file (multipart/form-data is not supported yet).

Here's a sketch of how a client might use base64 encoding to send a file to a remote runnable (the /process_file path and the "file" input key below are assumptions; match them to your own route and schema):
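import base64

from langserve import RemoteRunnable

# Hypothetical route; the server must expose a file-processing runnable here.
runnable = RemoteRunnable("http://localhost:8000/process_file/")

with open("example.pdf", "rb") as f:
    encoded_file = base64.b64encode(f.read()).decode("utf-8")

# The "file" key must match the input schema of the deployed runnable.
result = runnable.invoke({"file": encoded_file})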

Remember, you can always upload files by reference (e.g., s3 url) or upload them as multipart/form-data to a dedicated endpoint.

Custom Input and Output Types​

Input and Output types are defined on all runnables.

You can access them via the input_schema and output_schema properties.

LangServe uses these types for validation and documentation.

If you want to override the default inferred types, you can use the with_types method.

Here's a toy example to illustrate the idea:

from typing import Any

from fastapi import FastAPI
from langchain.schema.runnable import RunnableLambda

from langserve import add_routes

app = FastAPI()


def func(x: Any) -> int:
    """Mistyped function that should accept an int but accepts anything."""
    return x + 1


runnable = RunnableLambda(func).with_types(
    input_type=int,
)

add_routes(app, runnable)

Custom User Types​

Inherit from CustomUserType if you want the data to deserialize into a pydantic model rather than the equivalent dict representation.

At the moment, this type only works server side and is used to specify desired decoding behavior. If inheriting from this type the server will keep the decoded type as a pydantic model instead of converting it into a dict.

from fastapi import FastAPI
from langchain.schema.runnable import RunnableLambda

from langserve import add_routes
from langserve.schema import CustomUserType

app = FastAPI()


class Foo(CustomUserType):
    bar: int


def func(foo: Foo) -> int:
    """Sample function that expects a Foo type which is a pydantic model"""
    assert isinstance(foo, Foo)
    return foo.bar

# Note that the input and output type are automatically inferred!
# You do not need to specify them.
# runnable = RunnableLambda(func).with_types( # <-- Not needed in this case
#     input_type=Foo,
#     output_type=int,
# )
add_routes(app, RunnableLambda(func), path="/foo")

Playground Widgets​

The playground allows you to define custom widgets for your runnable from the backend.

  • A widget is specified at the field level and shipped as part of the JSON schema of the input type
  • A widget must contain a key called type with the value being one of a well known list of widgets
  • Other widget keys will be associated with values that describe paths in a JSON object

General schema:

type JsonPath = number | string | (number | string)[];
type NameSpacedPath = { title: string; path: JsonPath }; // Using title to mimic json schema, but can use namespace
type OneOfPath = { oneOf: JsonPath[] };

type Widget = {
    type: string // Some well known type (e.g., base64file, chat etc.)
    [key: string]: JsonPath | NameSpacedPath | OneOfPath;
};

File Upload Widget​

Allows creation of a file upload input in the UI playground for files that are uploaded as base64 encoded strings. Here's the full example.

Snippet:

try:
    from pydantic.v1 import Field
except ImportError:
    from pydantic import Field

from langserve import CustomUserType


# ATTENTION: Inherit from CustomUserType instead of BaseModel otherwise
#            the server will decode it into a dict instead of a pydantic model.
class FileProcessingRequest(CustomUserType):
    """Request including a base64 encoded file."""

    # The extra field is used to specify a widget for the playground UI.
    file: str = Field(..., extra={"widget": {"type": "base64file"}})
    num_chars: int = 100


