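Anthropic recently released its latest family of large models, Claude 3. At the time of writing, Claude 3 Opus, Claude 3 Sonnet, and Claude 3 Haiku are all available on Amazon Bedrock. As Amazon Bedrock offers a growing range of models, you can bring them into your own applications to extend what your business can do.
In this post we build a sample front end (based on the open-source project ChatGPT-Next-Web), deploy it with the AWS CDK, and show how to connect your AI chat assistant to Amazon Bedrock. The sample supports quick questions for different scenarios (through preset prompts), multimodal input, and streaming output (a typewriter effect).
The solution mainly uses the following services and components:
- Amazon Elastic Container Service (Amazon ECS) is a fully managed container orchestration service that helps you deploy, manage, and scale containerized applications more efficiently.
- Amazon Elastic Container Registry (Amazon ECR) is a fully managed container registry that offers high-performance hosting, so you can reliably deploy application images and artifacts anywhere.
- Amazon Bedrock is a fully managed service that offers high-performing foundation models (FMs) from leading AI companies such as AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon through a single API, along with a broad set of capabilities for building generative AI applications with security, privacy, and responsible AI.
- Amazon Cognito provides customer identity and access management, and handles user sign-in for this solution.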
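Prerequisites:
- You need an AWS global account with access to at least one of US East 1 (us-east-1), US West 2 (us-west-2), Tokyo (ap-northeast-1), or Singapore (ap-southeast-1), and you must be able to request access to Claude 3 in the Amazon Bedrock console. See: Amazon Bedrock endpoints and quotas – AWS General Reference. If you need Claude 3 Opus, note that at the time of writing it is available only in US West 2 (us-west-2).
- Prepare an Access Key and Secret Access Key for that account in one of the regions above. The account must have sufficient IAM permissions to invoke Amazon Bedrock.
- Prepare a computer or EC2 instance with Docker, Node.js (v18 or later), and the AWS CLI installed. Configure the Access Key and Secret Access Key from the previous step with aws configure, and make sure all of these services are running.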
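Demo
Architecture
Overview:
- The project code is implemented with Next.js (an SSR framework for React.js).
- The Next.js UI layer handles page rendering, UI logic, requests to the server, preset prompts, and model parameters.
- The Next.js API layer exposes APIs for the UI layer to call. The server layer assembles the parameters for the model request, implements the agent, and calls Amazon Bedrock through the AWS SDK for JavaScript.
- Amazon Cognito handles authentication and is embedded in the UI layer code. For the implementation details, see Build & connect backend – JavaScript – AWS Amplify Documentation. Depending on your situation, you can also modify the code in app/components/home.tsx to change the global rendering logic and plug in a different authentication scheme.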
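Request flow:
- The user opens the URL of the deployed project in a browser (exposed through an NLB).
- Traffic first reaches Amazon ECS. If the user is not signed in, they are redirected to the Amazon Cognito sign-in page. After a successful sign-in, requests go to Amazon ECS, which returns the normal page.
- After the user enters a question, the JS code in the UI layer sends the conversation history and the parameter configuration to the server-side API. The server assembles the parameters and calls Amazon Bedrock.
- When the response arrives, the server converts the body of the Bedrock response directly into a ReadableStream and returns it to the UI layer. The UI layer implements the streaming response and the on-screen rendering.
- If you choose to compress the conversation history in the settings, then after each successful reply the UI layer automatically sends the chat context to the server and asks the model to summarize it.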
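Sending requests to Bedrock
UI layer changes
Out of the box, the open-source code for this project does not support Claude 3, or Claude 3 hosted on Amazon Bedrock, so the code has to be modified by hand.
First, add the Claude 3 models in app/constant.ts.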
export enum ServiceProvider {
Claude = "Claude",
......
}
export enum ModelProvider {
AmazonClaude = "AmazonClaude",
......
}
export const SUMMARIZE_MODEL = "anthropic-claude3-sonnet";
export const AmazonPath = {
ChatPath: "v1/chat/completions",
};
export const DEFAULT_MODELS = [
{
name: "Claude 3 Opus",
available: true,
provider: {
id: "anthropic",
providerName: "Amazon Bedrock",
providerType: "anthropic-claude3-opus",
},
},
{
name: "Claude 3 Sonnet",
available: true,
provider: {
id: "anthropic",
providerName: "Amazon Bedrock",
providerType: "anthropic-claude3-sonnet",
},
},
{
name: "Claude 3 Haiku",
available: true,
provider: {
id: "anthropic",
providerName: "Amazon Bedrock",
providerType: "anthropic-claude3-haiku",
},
},
{
name: "Claude 2.1",
available: true,
provider: {
id: "anthropic",
providerName: "Amazon Bedrock",
providerType: "anthropic-claude21",
},
},
{
name: "Claude 2",
available: true,
provider: {
id: "anthropic",
providerName: "Amazon Bedrock",
providerType: "anthropic-claude2",
},
},
] as const;
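Under app/client/platforms, create claude.ts. Its logic can follow the other files in that folder.
[Optional] Because the project does not support setting Claude parameters such as top_p and top_k by default, the settings screen needs to be modified.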
app/components/model-config.tsx
<ListItem
title={Locale.Settings.TopP.Title}
subTitle={Locale.Settings.TopP.SubTitle}
>
<InputRange
value={(props.modelConfig.top_p ?? 1).toFixed(1)}
min="0"
max="1"
step="0.1"
onChange={(e) => {
props.updateConfig(
(config) =>
(config.top_p = ModalConfigValidator.top_p(
e.currentTarget.valueAsNumber
))
);
}}
></InputRange>
</ListItem>
{/* Added */}
<ListItem
title={Locale.Settings.TopK.Title}
subTitle={Locale.Settings.TopK.SubTitle}
>
<InputRange
value={(props.modelConfig.top_k ?? 250).toFixed(1)}
min="0"
max="500"
step="1"
onChange={(e) => {
props.updateConfig(
(config) =>
(config.top_k = ModalConfigValidator.top_k(
e.currentTarget.valueAsNumber
))
);
}}
></InputRange>
</ListItem>
app/store/config.ts
......
modelConfig: {
model: "Claude 3 Sonnet" as ModelType,
temperature: 0.5,
top_p: 0.9,
top_k: 250,
max_tokens: 2048,
presence_penalty: 0,
frequency_penalty: 0,
sendMemory: true,
historyMessageCount: 4,
compressMessageLengthThreshold: 1000,
enableInjectSystemPrompts: true,
template: DEFAULT_INPUT_TEMPLATE,
},
......
export const ModalConfigValidator = {
......
temperature(x: number) {
return limitNumber(x, 0, 2, 1);
},
top_p(x: number) {
return limitNumber(x, 0, 1, 1);
},
// Added
top_k(x: number) {
return limitNumber(x, 0, 500, 1);
},
};
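The validator above relies on a limitNumber helper that is not shown in this post. The following is a minimal sketch of what such a clamp could look like, written under assumptions: the function body, the validateTopK name, the rounding step, and the fallback of 250 (taken from the default config above) are all illustrative and may differ from the project's own helper.

```typescript
// Hypothetical re-implementation of the limitNumber helper used by
// ModalConfigValidator; the project's own helper may differ.
function limitNumber(
  x: number,
  min: number,
  max: number,
  defaultValue: number,
): number {
  // Fall back to the default when the input is not a usable number
  if (typeof x !== "number" || Number.isNaN(x)) {
    return defaultValue;
  }
  // Clamp the value into the closed range [min, max]
  return Math.min(max, Math.max(min, x));
}

// Illustrative validator: top_k is a count of candidate tokens,
// so this sketch also rounds the clamped value to an integer.
function validateTopK(x: number): number {
  return Math.round(limitNumber(x, 0, 500, 250));
}
```

With this sketch, an out-of-range slider value is clamped to the nearest bound, and a non-numeric value falls back to the assumed default.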
app/client/api.ts
export interface LLMConfig {
model: string;
temperature?: number;
top_p?: number;
stream?: boolean;
presence_penalty?: number;
frequency_penalty?: number;
top_k?: number;
max_tokens?: number;
}
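Server layer
Create the API: under app/api, create two levels of folders, claude/[...path], then create route.ts under app/api/claude/[...path] to receive the request data sent by the UI layer.
Create the request-handling logic: add a new file app/api/claudeServices.ts, following the AWS SDK for JavaScript v3 documentation.
import { NextRequest, NextResponse } from "next/server";
import {
BedrockRuntimeClient,
InvokeModelWithResponseStreamCommand,
} from "@aws-sdk/client-bedrock-runtime";
import { AWSBedrockAnthropicStream, AWSBedrockStream } from "ai";
// AWS_PARAM, streamToString, toInt, getClaude2Prompt and changeMsgToStand are project helpers that are defined elsewhere and not shown in this listing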
export const requestAmazonClaude = async (req: NextRequest) => {
const controller = new AbortController();
const timeoutId = setTimeout(
() => {
controller.abort();
},
10 * 60 * 1000 // 10-minute timeout
);
const requestBodyStr = await streamToString(req.body);
const requestBody = JSON.parse(requestBodyStr);
// Create the Bedrock client
const bedrockruntime = new BedrockRuntimeClient(AWS_PARAM);
let response;
// Send the request to Bedrock
try {
response = await bedrockResponse(
bedrockruntime,
requestBody.model.includes(CLAUDE3_KEY)
? requestBody.messages
: getClaude2Prompt(requestBody.messages),
requestBody.temperature,
toInt(requestBody.max_tokens, 8192),
requestBody.top_p,
requestBody.top_k,
requestBody.model
);
} finally {
clearTimeout(timeoutId);
}
if (typeof response === "string") {
return new NextResponse(response, {
headers: {
"Content-Type": "text/plain",
},
});
}
// Handle the Claude 3 response: https://sdk.vercel.ai/docs/guides/providers/aws-bedrock
if (requestBody.model.includes(CLAUDE3_KEY)) {
const stream = AWSBedrockStream(
response,
undefined,
(chunk) => chunk.delta?.text
);
return new NextResponse(stream, {
headers: {
"Content-Type": "application/octet-stream",
},
});
}
// Handle the Claude 2 response
const stream = AWSBedrockAnthropicStream(response);
return new NextResponse(stream, {
headers: {
"Content-Type": "application/octet-stream",
},
});
};
// Logic that assembles the request parameters
const bedrockResponse = async (
bedrockruntime: { send: (arg0: any) => any },
bodyData: any,
temperature: number,
max_tokens_to_sample: number,
top_p: number,
top_k: number,
model: string
) => {
if (model.includes(CLAUDE2_KEY)) {
const requestBodyPrompt = {
prompt: bodyData,
max_tokens_to_sample,
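temperature, // Controls randomness/creativity, 0.1-1, with 0.3-0.5 recommended. Too low a value leaves no creativity: a small input error or ambiguous wording can cause a large deviation, and there is no self-correction
top_p, // Nucleus sampling: sample only from the smallest set of tokens whose cumulative probability reaches top_p
top_k, // Sample only from the k most likely tokens; larger values make the output more varied, smaller values make it more fixed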
};
console.log("<----- Request Claude2 Body -----> ", requestBodyPrompt);
const command = new InvokeModelWithResponseStreamCommand({
body: JSON.stringify(requestBodyPrompt),
modelId:
model === "anthropic-claude21"
? "anthropic.claude-v2:1"
: "anthropic.claude-v2",
contentType: "application/json",
accept: "application/json",
});
const response = await bedrockruntime.send(command);
return response;
}
if (model.includes(CLAUDE3_KEY)) {
const requestBody = await changeMsgToStand(bodyData, true);
const requestBodyPrompt = {
anthropic_version:
(ClaudeEnum as any)[model]?.knowledge_date ||
ClaudeEnum["anthropic-claude3-sonnet"].knowledge_date,
messages: requestBody,
max_tokens: max_tokens_to_sample,
temperature,
top_p,
top_k,
};
console.log("<----- Request Claude3 Body -----> ", requestBodyPrompt);
const command = new InvokeModelWithResponseStreamCommand({
body: JSON.stringify(requestBodyPrompt),
modelId:
(ClaudeEnum as any)[model]?.model_id ||
ClaudeEnum["anthropic-claude3-sonnet"].model_id,
contentType: "application/json",
accept: "application/json",
});
const response = await bedrockruntime.send(command);
// Note: the streamed data returned by Claude 3 has a different structure from Claude 2, so handle it accordingly
return response;
}
};
export const ClaudeEnum = {
"anthropic-claude3-sonnet": {
knowledge_date: "bedrock-2023-05-31",
model_id: "anthropic.claude-3-sonnet-20240229-v1:0",
},
"anthropic-claude3-haiku": {
knowledge_date: "bedrock-2023-05-31",
model_id: "anthropic.claude-3-haiku-20240307-v1:0",
},
"anthropic-claude3-opus": {
knowledge_date: "bedrock-2023-05-31",
model_id: "anthropic.claude-3-opus-20240229-v1:0",
},
};
export const CLAUDE3_KEY = "anthropic-claude3";
export const CLAUDE2_KEY = "anthropic-claude2";
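The listing above calls a getClaude2Prompt helper that is not shown. As a rough sketch of what it has to produce: the Claude 2 text-completion API expects a single prompt string made of alternating "\n\nHuman:" and "\n\nAssistant:" turns that ends with an open "\n\nAssistant:" turn. The ChatMessage type and the buildClaude2Prompt name below are assumptions for illustration, and system-prompt handling is left out.

```typescript
// Minimal message shape assumed for this sketch (no system role)
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Hypothetical stand-in for the getClaude2Prompt helper referenced above:
// flattens the chat history into the Human/Assistant turn format.
function buildClaude2Prompt(messages: ChatMessage[]): string {
  let prompt = "";
  for (const message of messages) {
    const speaker = message.role === "user" ? "Human" : "Assistant";
    prompt += `\n\n${speaker}: ${message.content}`;
  }
  // Leave an open Assistant turn for the model to complete
  return `${prompt}\n\nAssistant:`;
}
```

Claude 3 does not need this step, because its Messages API accepts the array of role/content objects directly.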
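Streaming output
Server side: do not parse the response as soon as you receive it. The logic is as follows:
- Call the InvokeModelWithResponseStream API to send the request to Amazon Bedrock.
- You receive the response, but do not parse the response body (which is a stream) at this point. Otherwise the intermediate server layer still waits serially and performance is poor. If you need the whole response body at once, use the InvokeModel API instead. Note: the response body format differs between models on Amazon Bedrock, and also between Claude 2 and Claude 3. For details, see the AWS SDK for JavaScript v3 documentation.
- The body of the response must be passed through as a stream and returned to the browser. You can convert the body into a stream by hand, or use a third-party library.
- We recommend the ai package from npm here (AWS Bedrock – Vercel AI SDK), although some cases (such as the Claude 3 response) still need manual handling. For example: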
// Server
import { AWSBedrockAnthropicStream, AWSBedrockStream } from "ai";
......
// Claude 2/2.1
const claude2Stream = AWSBedrockAnthropicStream(claude2Response);
return new NextResponse(claude2Stream, {
  headers: {
    "Content-Type": "application/octet-stream",
  },
});
// Claude 3 changed the response format
const claude3Stream = AWSBedrockStream(
  claude3Response,
  undefined,
  (chunk) => chunk.delta?.text
);
return new NextResponse(claude3Stream, {
  headers: {
    "Content-Type": "application/octet-stream",
  },
});
// Other models on Bedrock are handled similarly
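If you prefer to convert the response body into a stream by hand rather than through a library, the idea can be sketched as follows. This is an illustration under assumptions, not the project's code: BedrockStreamEvent is a reduced stand-in for the events yielded by the SDK's response body, and extractDeltaText and toTextStream are hypothetical names. It assumes each chunk's bytes hold one JSON event and that Claude 3 text arrives in delta.text.

```typescript
// Reduced stand-in for one event of the SDK response body (assumption)
interface BedrockStreamEvent {
  chunk?: { bytes?: Uint8Array };
}

// Decode one streamed chunk and return its text delta, or "" if it has none
function extractDeltaText(bytes: Uint8Array): string {
  const payload = JSON.parse(new TextDecoder("utf-8").decode(bytes));
  return payload?.delta?.text ?? "";
}

// Wrap the async-iterable body in a web ReadableStream of UTF-8 text bytes
function toTextStream(
  body: AsyncIterable<BedrockStreamEvent>,
): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  const iterator = body[Symbol.asyncIterator]();
  return new ReadableStream<Uint8Array>({
    async pull(controller) {
      // Keep reading until there is text to enqueue or the body ends,
      // so that every pull either enqueues a chunk or closes the stream
      while (true) {
        const { done, value } = await iterator.next();
        if (done) {
          controller.close();
          return;
        }
        const bytes = value.chunk?.bytes;
        const text = bytes ? extractDeltaText(bytes) : "";
        if (text) {
          controller.enqueue(encoder.encode(text));
          return;
        }
      }
    },
    async cancel() {
      // Stop consuming the upstream body if the client disconnects
      await iterator.return?.();
    },
  });
}
```

The resulting stream can be passed to NextResponse in the same way as the library-produced streams above. Because chunks are forwarded as they arrive, the server never buffers the full reply.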
UI layer (receiving the streamed response)
......
fetchEventSource(chatPath, {
  ...chatPayload,
  async onopen(res) {
    clearTimeout(requestTimeoutId);
    const contentType = res.headers.get("content-type");
    console.log("[Claude] request response content type: ", contentType);
    if (contentType?.startsWith("text/plain")) {
      responseText = await res.clone().text();
      return finish();
    }
    if (res.body && contentType?.startsWith("application/octet-stream")) {
      try {
        const reader = res.clone().body?.getReader();
        const decoder = new TextDecoder("utf-8");
        let result = (await reader?.read()) || { done: true };
        while (!result.done) {
          responseText += decoder.decode(result.value, { stream: true });
          continueMsg();
          try {
            result = (await reader?.read()) || { done: true };
          } catch {
            break;
          }
        }
      } finally {
      }
      return finish();
    }
    ......
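Amazon Bedrock limits
At the time of writing, Amazon Bedrock limits each account to 60 requests per minute in each region. If you need a higher request volume, contact AWS Support to have the limit raised. For details, see: Quotas for Amazon Bedrock – Amazon Bedrock.
If you only want to throttle requests on your own side, you can use an Nginx reverse proxy, request limits at the NLB, or similar approaches. The following example applies the limit in Next.js middleware:
Create middleware.ts in the project root.
You can refer to the official Next.js documentation to implement API interception, request limiting, and similar features: Routing: Middleware | Next.js.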
// middleware.ts
import { NextRequest, NextResponse } from "next/server";
import { rateLimit } from "./app/utils/rate-limiter";
export async function middleware(req: NextRequest) {
  const path = req.nextUrl.pathname;
  // Only rate-limit the /api/claude routes
  if (path.includes("api/claude/")) {
    try {
      // At most 60 requests per minute, matching the Bedrock limit of 60 requests per minute; open a support case if you need a higher limit
      // See https://console.aws.amazon.com/support/home#/case/create?issueType=service-limit-increase
      const result = await rateLimit(60, 60 * 1000);
      if (!result.success) {
        return NextResponse.json(
          { error: "Too many requests, please try again later." },
          { status: 429 }
        );
      }
    } catch (err) {
      console.error("Error rate limiting", err);
      return NextResponse.json(
        { error: "Error rate limiting" },
        { status: 500 }
      );
    }
  }
  return NextResponse.next();
}
export const config = {
  matcher: "/api/claude/:path*",
};
// app/utils/rate-limiter.ts
// The counters live at module level so that they persist between calls within one server instance; they are not shared across instances
let requestCount = 0; // total requests in the current window
let lastResetTime = Date.now(); // start of the current window
export async function rateLimit(limit: number, duration: number) {
  const now = Date.now();
  // Reset the counter once the window (one minute here) has elapsed
  if (now - lastResetTime > duration) {
    requestCount = 1;
    lastResetTime = now;
  } else {
    requestCount++;
  }
  if (requestCount > limit) {
    return { success: false, count: requestCount };
  }
  return { success: true, count: requestCount };
}
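Implementing multimodal input
All Claude 3 models currently support image input. See the official documentation: Vision.
[Code change]: in the isVisionModel method in app/utils.ts, set the check for whether a model supports images, which enables multimodal support in the project.
Note: in this project, the message part that carries a base64 image uses the type/object name image_url, whereas Claude 3 uses the type name image, so you need to convert it manually.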
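The conversion from an image_url part to a Claude 3 image part can be sketched as follows. This is a minimal illustration under assumptions: the type names and toClaudePart are hypothetical, and the image is assumed to arrive as a base64 data URL (for example data:image/png;base64,...). The target shape, with type "image" and a base64 source carrying media_type and data, follows the Claude 3 Messages API.

```typescript
// Content-part shapes assumed for this sketch
interface TextPart {
  type: "text";
  text: string;
}
interface ImageUrlPart {
  type: "image_url";
  image_url: { url: string };
}
interface ClaudeImagePart {
  type: "image";
  source: { type: "base64"; media_type: string; data: string };
}
type SourcePart = TextPart | ImageUrlPart;
type ClaudePart = TextPart | ClaudeImagePart;

// Hypothetical converter: text parts pass through unchanged,
// image_url parts are split into media type and base64 payload.
function toClaudePart(part: SourcePart): ClaudePart {
  if (part.type === "text") {
    return part;
  }
  const match = /^data:(image\/[a-zA-Z0-9.+-]+);base64,(.+)$/.exec(
    part.image_url.url,
  );
  if (!match) {
    throw new Error("Expected a base64 data URL for image content");
  }
  return {
    type: "image",
    source: { type: "base64", media_type: match[1], data: match[2] },
  };
}
```

A message whose content is an array of parts can then be converted with content.map(toClaudePart) before the request body is assembled.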
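For other file types such as PDF, you can use a third-party package (such as react-pdf: react-pdf – npm) to convert them to images in the browser. For performance and security reasons, we also recommend doing file parsing and conversion in the browser rather than on the Next.js server.
When building with react-pdf, make sure your build environment has the correct canvas version (or build in an environment with a graphical interface), and that next.config.js is configured as described in its npm documentation.
Deployment
You can refer to the following CDK code, or run the application on EC2 with PM2.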
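// Imports assumed for this listing: aws-cdk-lib v2, constructs, and the cdk-ecr-deployment package
import * as path from "path";
import { CfnOutput, Duration, RemovalPolicy } from "aws-cdk-lib";
import {
GatewayVpcEndpointAwsService,
InterfaceVpcEndpointAwsService,
ISecurityGroup,
IVpc,
Peer,
Port,
SecurityGroup,
SubnetSelection,
SubnetType,
} from "aws-cdk-lib/aws-ec2";
import { IRepository, Repository, TagMutability } from "aws-cdk-lib/aws-ecr";
import { DockerImageAsset, Platform } from "aws-cdk-lib/aws-ecr-assets";
import {
AppProtocol,
Cluster,
ContainerImage,
CpuArchitecture,
FargatePlatformVersion,
FargateService,
FargateTaskDefinition,
LogDriver,
OperatingSystemFamily,
PropagatedTagSource,
Protocol,
} from "aws-cdk-lib/aws-ecs";
import {
NetworkLoadBalancer,
NetworkTargetGroup,
Protocol as LBProtocol,
TargetType,
} from "aws-cdk-lib/aws-elasticloadbalancingv2";
import {
Effect,
IRole,
ManagedPolicy,
PolicyStatement,
Role,
ServicePrincipal,
} from "aws-cdk-lib/aws-iam";
import { DockerImageName, ECRDeployment } from "cdk-ecr-deployment";
import { Construct } from "constructs";
export interface ApplicationLoadBalancerProps {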
readonly internetFacing: boolean;
}
export interface NetworkProps {
readonly vpc: IVpc;
readonly subnets?: SubnetSelection;
}
export interface AiChatECSProps {
readonly networkProps: NetworkProps;
readonly region: string;
readonly accountId: string;
readonly cognitoInfo: any;
readonly accessKey?: string;
readonly secretAccessKey?: string;
readonly bucketName?: string;
readonly bucketArn?: string;
readonly partition: string;
}
export class EcsStack extends Construct {
readonly securityGroup: ISecurityGroup;
public ecsIP: string;
constructor(scope: Construct, id: string, props: AiChatECSProps) {
super(scope, id);
const repositoryName = "commonchatui-docker-repo";
// create ecr
const repository = new Repository(this, "CommonchatuiRepository", {
repositoryName: "commonchatui-docker-repo",
imageTagMutability: TagMutability.MUTABLE,
removalPolicy: RemovalPolicy.DESTROY,
});
// deploy front docker image
const ui_image = new DockerImageAsset(this, "commonchatui_image", {
directory: path.join(__dirname, "your_next_app_dic"), // change this to the directory of your front-end application code
file: "Dockerfile",
platform: Platform.LINUX_AMD64,
});
const imageTag = "latest";
const dockerImageUri = `${props.accountId}.dkr.ecr.${props.region}.amazonaws.com/${repositoryName}:${imageTag}`;
// upload front docker image to ecr
const ecrDeploy = new ECRDeployment(this, "commonchat_image_deploy", {
src: new DockerImageName(ui_image.imageUri),
dest: new DockerImageName(dockerImageUri),
});
new CfnOutput(this, "ECRRepositories", {
description: "ECR Repositories",
value: ecrDeploy.node.addr,
}).overrideLogicalId("ECRRepositories");
new CfnOutput(this, "ECRImageUrl", {
description: "ECR image url",
value: dockerImageUri,
}).overrideLogicalId("ECRImageUrl");
// create ecs security group
this.securityGroup = new SecurityGroup(this, "commonchat_sg", {
vpc: props.networkProps.vpc,
description: "Common chart security group",
allowAllOutbound: true,
});
// add ingress rules
this.securityGroup.addIngressRule(
Peer.anyIpv4(),
Port.tcp(443),
"Default ui 443 port"
);
this.securityGroup.addIngressRule(
Peer.anyIpv4(),
Port.tcp(80),
"Default ui 80 port"
);
// add endpoint
props.networkProps.vpc.addInterfaceEndpoint("CommonChatVPCECREP", {
service: InterfaceVpcEndpointAwsService.ECR,
privateDnsEnabled: true,
securityGroups: [this.securityGroup],
});
props.networkProps.vpc.addInterfaceEndpoint("CommonChatVPCECRDockerEP", {
service: InterfaceVpcEndpointAwsService.ECR_DOCKER,
privateDnsEnabled: true,
securityGroups: [this.securityGroup],
});
props.networkProps.vpc.addInterfaceEndpoint("CommonChatVPCLogEP", {
service: InterfaceVpcEndpointAwsService.CLOUDWATCH_LOGS,
privateDnsEnabled: true,
securityGroups: [this.securityGroup],
});
props.networkProps.vpc.addGatewayEndpoint("CommonChatVPCS3", {
service: GatewayVpcEndpointAwsService.S3,
});
props.networkProps.vpc.addInterfaceEndpoint("CommonChatVPCLogECS", {
service: InterfaceVpcEndpointAwsService.ECS,
privateDnsEnabled: true,
securityGroups: [this.securityGroup],
});
props.networkProps.vpc.addInterfaceEndpoint("CommonChatVPCLogECSAgent", {
service: InterfaceVpcEndpointAwsService.ECS_AGENT,
privateDnsEnabled: true,
securityGroups: [this.securityGroup],
});
props.networkProps.vpc.addInterfaceEndpoint(
"CommonChatVPCLogECSTelemetry",
{
service: InterfaceVpcEndpointAwsService.ECS_TELEMETRY,
privateDnsEnabled: true,
securityGroups: [this.securityGroup],
}
);
// create ecr service
const ecsService = this.createECSGroup(props, imageTag, repository);
// create lb to ecs
this.createNlbEndpoint(props, ecsService, [80]);
}
private createECSGroup(
props: AiChatECSProps,
imageTag: string,
repository: IRepository
) {
const ecsClusterName = "CommonchatUiCluster";
const cluster = new Cluster(this, ecsClusterName, {
clusterName: "commonchat-ui-front",
vpc: props.networkProps.vpc,
enableFargateCapacityProviders: true,
});
const taskDefinition = new FargateTaskDefinition(
this,
"commonchatui_deploy",
{
cpu: 2048,
memoryLimitMiB: 4096,
runtimePlatform: {
operatingSystemFamily: OperatingSystemFamily.LINUX,
cpuArchitecture: CpuArchitecture.of("X86_64"),
},
family: "CommonchatuiDeployTask",
taskRole: this.getTaskRole(this, "CommonchatuiDeployTaskRole"),
executionRole: this.getExecutionTaskRole(
this,
"CommonchatuiDeployExecutionTaskRole"
),
}
);
const portMappings = [
{
containerPort: 80,
hostPort: 80,
protocol: Protocol.TCP,
appProtocol: AppProtocol.http,
name: "app_port",
},
{
containerPort: 443,
hostPort: 443,
protocol: Protocol.TCP,
appProtocol: AppProtocol.http,
name: "app_port_443",
},
];
const envConfig: any = {
DEFAULT_REGION: props.region,
BUCKET_NAME: props.bucketName,
USER_POOL_ID: props.cognitoInfo.userPoolId,
USER_POOL_CLIENT_ID: props.cognitoInfo.userPoolClientId,
};
if (props.accessKey && props.secretAccessKey) {
envConfig.ACCESS_KEY = props.accessKey;
envConfig.SECRET_ACCESS_KEY = props.secretAccessKey;
}
taskDefinition.addContainer("CommonchatuiContainer", {
containerName: "commonchatui_container",
image: ContainerImage.fromEcrRepository(repository, imageTag),
essential: true,
cpu: 2048,
memoryLimitMiB: 4096,
portMappings: portMappings,
environment: envConfig,
logging: LogDriver.awsLogs({
streamPrefix: "commonchat_ui",
}),
});
// Grant permissions to the role used by the ECS container
taskDefinition.addToTaskRolePolicy(
new PolicyStatement({
effect: Effect.ALLOW,
actions: [
"bedrock:InvokeModel",
"bedrock:InvokeModelWithResponseStream",
],
resources: ["*"],
})
);
return new FargateService(this, "CommonchatuiService", {
serviceName: "commonchat-ui-service",
cluster: cluster,
taskDefinition: taskDefinition,
desiredCount: 1,
assignPublicIp: false,
platformVersion: FargatePlatformVersion.LATEST,
securityGroups: [this.securityGroup],
vpcSubnets: { subnetType: SubnetType.PRIVATE_WITH_EGRESS },
capacityProviderStrategies: [{ capacityProvider: "FARGATE", weight: 2 }],
propagateTags: PropagatedTagSource.TASK_DEFINITION,
maxHealthyPercent: 100,
minHealthyPercent: 0,
});
}
private getExecutionTaskRole(self: this, roleId: string): IRole {
// throw new Error('Method not implemented.');
return new Role(self, roleId, {
assumedBy: new ServicePrincipal("ecs-tasks.amazonaws.com"),
});
}
private getTaskRole(self: this, roleId: string): IRole {
return new Role(self, roleId, {
assumedBy: new ServicePrincipal("ecs-tasks.amazonaws.com"),
managedPolicies: [
ManagedPolicy.fromAwsManagedPolicyName(
"service-role/AmazonECSTaskExecutionRolePolicy"
),
],
});
}
// If you use your own certificate, choose an ALB instead
private createNlbEndpoint(
props: AiChatECSProps,
ecsService: FargateService,
servicePorts: Array<number>
) {
const nlb = new NetworkLoadBalancer(this, "CommonchatUiLoadBalancer", {
loadBalancerName: "commonchat-ui-service",
internetFacing: true,
crossZoneEnabled: false,
vpc: props.networkProps.vpc,
vpcSubnets: { subnetType: SubnetType.PUBLIC } as SubnetSelection,
});
servicePorts.forEach((itemPort) => {
const listener = nlb.addListener(`CommonchatUiLBListener-${itemPort}`, {
port: itemPort,
protocol: LBProtocol.TCP_UDP,
});
const targetGroup = new NetworkTargetGroup(
this,
`CommonchatUiLBTargetGroup-${itemPort}`,
{
targetGroupName: "commonchat-ui-service-target",
vpc: props.networkProps.vpc,
port: itemPort,
protocol: LBProtocol.TCP_UDP,
targetType: TargetType.IP,
healthCheck: {
enabled: true,
interval: Duration.seconds(180),
healthyThresholdCount: 2,
unhealthyThresholdCount: 2,
port: itemPort.toString(),
protocol: LBProtocol.TCP,
timeout: Duration.seconds(10),
},
}
);
listener.addTargetGroups(
`CommonChatUiLBTargetGroups-${itemPort}`,
targetGroup
);
targetGroup.addTarget(
// targetGroups
ecsService.loadBalancerTarget({
containerName: "commonchatui_container",
containerPort: itemPort,
})
);
});
new CfnOutput(this, "FrontUiUrlDefault", {
description: "Common chart ui url by default",
value: `http://${nlb.loadBalancerDnsName}`, // alb
}).overrideLogicalId("FrontUiUrlDefault");
}
}
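Summary
In this post, we showed how to use the open-source framework ChatGPT-Next-Web to connect to the Amazon Bedrock models in your AWS account [Claude 3 Sonnet (Text, Image) and Claude 3 Haiku (Text, Image)], build a lightweight multimodal chatbot, and implement streaming output and deployment.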