IntelliNode: A Unified Interface Library for Accessing Large Models in Node.js [Gen AI]

news2025/4/14 12:56:24

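Keeping an application up to date with the latest AI models can be challenging, because it means understanding the quirks of each model and managing many dependencies. IntelliNode is an open-source library that addresses this integration problem by providing a unified, easy-to-use interface. This lets developers build AI prototypes quickly and add advanced AI features to their applications, opening up a wide range of business scenarios.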


1. Why use different AI models?

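Each AI model has its own strengths and distinctive capabilities. Cohere specializes in customizable text generation models, while OpenAI's ChatGPT improves user interaction and contextual understanding. Google's DeepMind text-to-speech models deliver high-quality synthetic audio, while DALL·E and Stable Diffusion are known for image generation. By drawing on these models, developers get access to state-of-the-art AI and can tailor it to their applications.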

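IntelliNode simplifies integrating multiple AI models through a single client, decoupling the application's business logic from the differences between models. With IntelliNode, developers can generate text with Cohere language models, produce image descriptions with ChatGPT, create images with Stable Diffusion, or synthesize audio with Google DeepMind's models, all in a few lines of code.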
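To make the idea of decoupling business logic from model differences concrete, here is a minimal sketch of a single client that dispatches to interchangeable backends. It does not show how IntelliNode is implemented; `UnifiedTextClient` and the stub backends are hypothetical names used only for illustration.

```javascript
// Hypothetical sketch: one client, many interchangeable backends.
// None of these names come from IntelliNode itself.
class UnifiedTextClient {
  constructor(backends) {
    // backends: { name: async (prompt) => string }
    this.backends = new Map(Object.entries(backends));
  }

  async generate(backendName, prompt) {
    const backend = this.backends.get(backendName);
    if (!backend) {
      throw new Error(`Unsupported backend: ${backendName}`);
    }
    return backend(prompt);
  }
}

// Stub backends stand in for real API calls.
const client = new UnifiedTextClient({
  cohere: async (prompt) => `[cohere] ${prompt}`,
  openai: async (prompt) => `[openai] ${prompt}`,
});
```

The calling code only names a backend, so swapping providers does not change the surrounding application logic.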

2. Getting started with IntelliNode

To show how simple it is to integrate AI models with IntelliNode, consider an e-commerce tool that generates product descriptions, images, and dynamic audio content.

First, add the module to your Node.js project:

npm i intellinode

Let's generate a product description for a gaming chair we plan to sell:

const {RemoteLanguageModel,LanguageModelInput} = require('intellinode');

// MyKeys is a user-defined object that holds your API keys.
// In a CommonJS script, run the await calls inside an async function.
const textModelInput = 'Write a creative product description for gaming chair with black and red colors';
const textProductDesc = await generateProductDescription(textModelInput, MyKeys.cohere, 'cohere', 'command-xlarge-20221108');

// reusable function for any text generation backend
async function generateProductDescription(textInput, apiKey, modelBackend, modelName) {
  const langModel = new RemoteLanguageModel(apiKey, modelBackend);
  const results = await langModel.generateText(new LanguageModelInput({
    prompt: textInput,
    model: modelName,
    maxTokens: 300
  }));
  return results[0].trim();
}
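The snippets in this article read keys from a `MyKeys` object that is never defined. One possible way to build it is sketched below, reading from environment variables. The variable names (`COHERE_API_KEY` and so on) and the helpers `loadKeys` and `requireKey` are assumptions made for illustration, not an IntelliNode convention.

```javascript
// Build the key dictionary from an environment-like object.
// The variable names below are assumptions, not an IntelliNode convention.
function loadKeys(env = process.env) {
  const names = {
    cohere: 'COHERE_API_KEY',
    openai: 'OPENAI_API_KEY',
    stability: 'STABILITY_API_KEY',
    google: 'GOOGLE_API_KEY',
  };
  const keys = {};
  for (const [provider, variable] of Object.entries(names)) {
    keys[provider] = env[variable] || null;
  }
  return keys;
}

// Fail early with a readable message if a required key is missing.
function requireKey(keys, provider) {
  if (!keys[provider]) {
    throw new Error(`Missing API key for ${provider}`);
  }
  return keys[provider];
}

const MyKeys = loadKeys();
```

Calling `requireKey(MyKeys, 'cohere')` instead of reading `MyKeys.cohere` directly surfaces a missing key before any request is sent.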

The next step is to turn the product details into an image description:

const { Chatbot, ChatGPTInput } = require('intellinode');

const imageDescription = await getImageDescription(textProductDesc, MyKeys.openai, 'openai');

// common function to use with any future code
async function getImageDescription(textInput, apiKey, modelBackend) {
  const chatbot = new Chatbot(apiKey, modelBackend);
  const input = new ChatGPTInput('generate image description from paragraph to use it as prompt to generate image from DALL·E or stable diffusion image model. return only the image description to use it as direct input');
  input.addUserMessage(textInput);
  const responses = await chatbot.chat(input);
  return responses[0].trim();
}
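A chat reply may span several lines or arrive wrapped in quotes, so it can help to normalize it before passing it to an image model. The helper below is a small sketch of that step; `cleanImagePrompt` and its 400-character default are hypothetical choices, not part of IntelliNode.

```javascript
// Normalize a chat reply before using it as an image prompt.
// The 400-character default is an arbitrary illustrative limit.
function cleanImagePrompt(reply, maxLength = 400) {
  // Collapse newlines and repeated spaces into single spaces.
  const singleLine = reply.replace(/\s+/g, ' ').trim();
  // Drop one pair of wrapping quotes, if present.
  const unquoted = singleLine.replace(/^["'](.*)["']$/, '$1').trim();
  return unquoted.length > maxLength
    ? unquoted.slice(0, maxLength).trim()
    : unquoted;
}
```

For example, `cleanImagePrompt(imageDescription)` could be applied to the result of `getImageDescription` before the image generation step.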

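At this point, we can use the image description to generate high-quality images with Stable Diffusion or DALL·E 2. The code below uses Stable Diffusion; switching to another model requires only a few changes: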

const {RemoteImageModel,SupportedImageModels,ImageModelInput} = require('intellinode');

const images = await generateImage(imageDescription, MyKeys.stability, SupportedImageModels.STABILITY);

// common function for future use
async function generateImage(imageText, apiKey, modelBackend) {
  const imgModel = new RemoteImageModel(apiKey, modelBackend);
  const imageInput = new ImageModelInput({
    prompt: imageText,
    numberOfImages: 3,
    width: 512,
    height: 512
  });
  return await imgModel.generateImages(imageInput);
}

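To generate the images with OpenAI instead and compare which output suits your case, change just two parameters; the rest of the code and the output flow stay the same: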

// optional code change to use DALL·E instead of Diffusion
// 1. MyKeys is a dictionary to store multiple keys.
// 2. SupportedImageModels provided by the library.
const images = await generateImage(imageDescription, 
                    MyKeys.openai, 
                    SupportedImageModels.OPENAI);
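Because only the key and the backend change between providers, the comparison can be automated. The sketch below runs one prompt against several backends and records which calls succeeded. `compareImageBackends` and the shape of `targets` are hypothetical; `generate` is expected to behave like the `generateImage` helper defined earlier, taking `(prompt, apiKey, backend)` and returning a promise.

```javascript
// Run one prompt against several image backends and collect the outcomes.
// `generate` is any async function of the form (prompt, apiKey, backend).
// `targets` is a hypothetical list of { apiKey, backend } pairs.
async function compareImageBackends(generate, prompt, targets) {
  // allSettled keeps going when one backend fails, unlike Promise.all.
  const settled = await Promise.allSettled(
    targets.map(({ apiKey, backend }) => generate(prompt, apiKey, backend))
  );
  return settled.map((outcome, i) => ({
    backend: targets[i].backend,
    ok: outcome.status === 'fulfilled',
    images: outcome.status === 'fulfilled' ? outcome.value : [],
    error:
      outcome.status === 'rejected'
        ? String(outcome.reason?.message ?? outcome.reason)
        : null,
  }));
}
```

A call might pass `generateImage` together with two targets, one for `SupportedImageModels.STABILITY` and one for `SupportedImageModels.OPENAI`, and then inspect the `ok` flag of each result.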


For a more interactive experience, we can generate audio for the product description:

const {RemoteSpeechModel, Text2SpeechInput, AudioHelper} = require('intellinode');

const decodedAudio = await generateSpeech(textProductDesc, MyKeys.google, 'google');

// common function for future use
async function generateSpeech(textProductDesc, apiKey, modelBackend) {
  const speechModel = new RemoteSpeechModel(apiKey, modelBackend);
  const input = new Text2SpeechInput({ text: textProductDesc, language: 'en-gb' });
  const audioContent = await speechModel.generateSpeech(input);
  const audioHelper = new AudioHelper();
  return audioHelper.decode(audioContent);
}
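The decoded audio still has to be written somewhere before it can be played. The following sketch saves it with Node's built-in `fs` module. It assumes `decodedAudio` is a `Buffer` (or `Uint8Array`) of audio bytes, and `saveAudio` is a hypothetical helper written for this example.

```javascript
const fs = require('node:fs/promises');
const path = require('node:path');

// Write decoded audio bytes to disk and return the full file path.
// Assumes `decodedAudio` is a Buffer (or Uint8Array) of audio bytes.
async function saveAudio(decodedAudio, directory, fileName) {
  // Create the target directory if it does not exist yet.
  await fs.mkdir(directory, { recursive: true });
  const filePath = path.join(directory, fileName);
  await fs.writeFile(filePath, decodedAudio);
  return filePath;
}
```

A call such as `await saveAudio(decodedAudio, './output', 'product.mp3')` would then produce a playable file, provided the file extension matches the audio encoding the service returns.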


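We have seen how various AI models can be integrated seamlessly, letting developers focus on the core functionality of their application while using AI features to improve the user experience.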
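The individual steps can also be chained into one call. The sketch below does this with the step functions passed in as parameters, so the pipeline stays independent of any particular backend. `buildProductAssets` and the `steps` property names are hypothetical; each step is expected to wrap one of the helpers defined earlier.

```javascript
// Chain the text, image-prompt, image, and speech steps into one call.
// `steps` holds async functions supplied by the caller.
async function buildProductAssets(productPrompt, steps) {
  const description = await steps.describe(productPrompt);
  const imagePrompt = await steps.toImagePrompt(description);
  // Images and audio depend only on earlier results, so run them together.
  const [images, audio] = await Promise.all([
    steps.toImages(imagePrompt),
    steps.toSpeech(description),
  ]);
  return { description, imagePrompt, images, audio };
}
```

For instance, `describe` could call `generateProductDescription` with the Cohere key, and `toSpeech` could call `generateSpeech` with the Google key.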

3. Exploring business use cases

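IntelliNode opens up opportunities for businesses across many industries. Beyond e-commerce applications, here are some other potential use cases we can build with the library: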

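Customer support: improve the customer service experience by deploying AI chatbots that understand user queries and give relevant, timely responses. The library's language and audio models can be used for this.
Voice assistants: build voice-driven applications, or add voice command features to existing products, using Google DeepMind's text-to-speech models.
Visual content generation: use image and language models to automatically produce visually appealing content for digital marketing campaigns, social media posts, or website designs. Businesses can create distinctive visuals by combining DALL·E's creative capabilities with powerful language models such as GPT-3 or Cohere.ai.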