Add AI summaries to your website using Elastic


Author: Gustavo Llermaly, from Elastic

Search as we know it (search bar, results, filters, pagination, and so on) has come a long way and covers many different scenarios. It works especially well when we know the keywords needed to find what we are looking for, or when we know which documents contain the information we want. However, when the results are documents with long texts, we need an extra step beyond reading and summarizing them to get a final answer. So, to simplify this process, companies like Google, with its Search Generative Experience (SGE), use AI to complement search results with an AI summary.

What if I told you that you can do the same with Elastic?

In this article, you will learn how to create a React component that displays an AI summary answering the user's question alongside the search results, helping users answer their questions faster. We will also ask the model to provide citations, so the answer is grounded in the search results.

The end result will look like this:

You can find the full working example repository here.

Steps

  1. Create the endpoints
  2. Index the data
  3. Create the proxy
  4. Create the component
  5. Ask questions

Creating the endpoints

Before creating the endpoints, take a look at the high-level architecture of this project.

The recommended way to consume Elasticsearch from a UI is to proxy the calls, so we will spin up a backend for this purpose that the UI can connect to. You can read more about this approach here.

Important: The approach outlined in this article is a simple way of handling Elasticsearch queries and summary generation. Consider your specific use case and requirements before implementing this solution. A more suitable architecture would involve running both the search and the completion under the same API call behind the proxy.

The embeddings endpoint

To enable semantic search, we will use the ELSER model, which helps us find documents not only by word matching but also by semantic meaning.

You can create the ELSER endpoint using the Kibana UI:

Or via the _inference API:

PUT _inference/sparse_embedding/elser-embeddings
{
  "service": "elser",
  "service_settings": {
    "model_id": ".elser_model_2",
    "num_allocations": 1,
    "num_threads": 1
  }
}
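
Once the endpoint is created, you can sanity-check it with a direct inference call before indexing anything (a quick test; the exact response shape can vary across versions):

POST _inference/sparse_embedding/elser-embeddings
{
  "input": "What is semantic search?"
}

If the model is deployed correctly, the response should contain a sparse_embedding object with the weighted tokens for the input text.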

The completion endpoint

To generate the AI summaries, we must send the relevant documents as context, together with the user query, to the model. To do this, we create a completion endpoint connected to OpenAI. If you would rather not work with OpenAI, you can choose from a growing list of other providers.

PUT _inference/completion/summaries-completion
{
  "service": "openai",
  "service_settings": {
    "api_key": "<API_KEY>",
    "model_id": "gpt-4o-mini"
  }
}

We will call the model every time a user runs a search, so we need speed and cost efficiency, which makes this a good opportunity to test the new gpt-4o-mini.
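
You can test this endpoint the same way before wiring it into the application (the input below is only an illustration):

POST _inference/completion/summaries-completion
{
  "input": "Say hello in one short sentence."
}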

Indexing data

Since we are adding a search experience to a website, we can use the Elastic web crawler to index the site's content and test with our own documents. For this example, I will use the Elastic Search Labs blog.

To create the crawler, follow the instructions in the documentation.

For this example, we will use the following settings:

Note: I added some extraction rules to clean up the field values. I also used a semantic_text field in the crawler and associated it with the article_content field.

A brief explanation of the extracted fields:

- meta_img: the article image, used as a thumbnail.
- meta_author: the author's name, so we can filter by author.
- article_content: we index only the main content div of the article, excluding irrelevant data such as the header and footer. This optimization improves search relevance and reduces cost by generating shorter embeddings.
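
For reference, if you were wiring the field manually instead of through the crawler UI, the relevant part of the mapping might look like the sketch below, where the semantic_text field points at the inference endpoint we created earlier. This is an assumption about a manual setup: whether article_content can copy_to a semantic_text field depends on your Elasticsearch version, and the crawler UI can configure the association for you.

PUT search-labs-index
{
  "mappings": {
    "properties": {
      "article_content": {
        "type": "text",
        "copy_to": "semantic_text"
      },
      "semantic_text": {
        "type": "semantic_text",
        "inference_id": "elser-embeddings"
      }
    }
  }
}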

After applying the rules and running a successful crawl, the document will look like this:

{
    "_index": "search-labs-index",
    "_id": "66a5568a30cc8eb607eec315",
    "_version": 1,
    "_seq_no": 6,
    "_primary_term": 3,
    "found": true,
    "_source": {
      "last_crawled_at": "2024-07-27T20:20:25Z",
      "url_path_dir3": "langchain-collaboration",
      "meta_img": "https://www.elastic.co/search-labs/assets/images/langchain-partner-blog.png?5c6faef66d5699625c50453e356927d0",
      "semantic_text": {
        "inference": {
          "inference_id": "elser_model_2",
          "model_settings": {
            "task_type": "sparse_embedding"
          },
          "chunks": [
            {
              "text": """Tutorials Integrations Blog Start Free Trial Contact Sales Open navigation menu Blog / Generative AI LangChain and Elastic collaborate to add vector database and semantic reranking for RAG In the last year, we have seen a lot of movement in generative AI. Many new services and libraries have emerged. LangChain has separated itself as the most popular library for building applications with large language models (LLMs), for example Retrieval Augmented Generation (RAG) systems. The library makes it really easy to prototype and experiment with different models and retrieval systems. To enable the first-class support for Elasticsearch in LangChain, we recently elevated our integration from a community package to an official LangChain partner package . This work makes it straightforward to import Elasticsearch capabilities into LangChain applications. The Elastic team manages the code and the release process through a dedicated repository . We will keep improving the LangChain integration there, making sure that users can take full advantage of the latest improvements in Elasticsearch. Our collaboration with Elastic in the last 12 months has been exceptional, particularly as we establish better ways for developers and end users to build RAG applications from prototype to production," said Harrison Chase, Co-Founder and CEO at LangChain. "The LangChain-Elasticsearch vector database integrations will help do just that, and we're excited to see this partnership grow with future feature and integration releases. Elasticsearch is one of the most flexible and performant retrieval systems that includes a vector database. One of our goals at Elastic is to also be the most open retrieval system out there. In a space as fast-moving as generative AI, we want to have the developer's back when it comes to utilizing emerging tools and libraries. This is why we work closely with libraries like LangChain and add native support to the GenAI ecosystem. From using Elasticsearch as a vector database to hybrid search and orchestrating a full RAG application. Elasticsearch and LangChain have collaborated closely this year. We are putting our extensive experience in building search tools into making your experience of LangChain easier and more flexible. Let's take a deeper look in this blog. Rapid RAG prototyping RAG is a technique for providing users with highly relevant answers to questions. The main advantages over using LLMs directly are that user data can be easily integrated, and hallucinations by the LLM can be minimized. This is achieved by adding a document retrieval step that provides relevant context for the""",
              "embeddings": {
                "rag": 2.2831416,
                "elastic": 2.1994505,
                "genera": 1.990228,
                "lang": 1.9417559,
                "vector": 1.7541072,
                "##ai": 1.5763651,
                "integration": 1.5619806,
                "##sea": 1.5154194,
                "##rank": 1.4946039,
                "retrieval": 1.3957807,
                "ll": 1.362704 
                // more embeddings ...
              }
            }
          ]
        }
      },
      "additional_urls": [
        "https://www.elastic.co/search-labs/blog/langchain-collaboration"
      ],
      "body_content": """Tutorials Integrations Blog Start Free Trial Contact Sales Open navigation menu Blog / Generative AI LangChain and Elastic collaborate to add vector database and semantic reranking for RAG In the last year, we have seen a lot of movement in generative AI. Many new services and libraries have emerged. LangChain has separated itself as the most popular library for building applications with large language models (LLMs), for example Retrieval Augmented Generation (RAG) systems. The library makes it really easy to prototype and experiment with different models and retrieval systems. To enable the first-class support for Elasticsearch in LangChain, we recently elevated our integration from a community package to an official LangChain partner package . This work makes it straightforward to import Elasticsearch capabilities into LangChain applications. The Elastic team manages the code and the release process through a dedicated repository . We will keep improving the LangChain integration there, making sure that users can take full advantage of the latest improvements in Elasticsearch. Our collaboration with Elastic in the last 12 months has been exceptional, particularly as we establish better ways for developers and end users to build RAG applications from prototype to production," said Harrison Chase, Co-Founder and CEO at LangChain. "The LangChain-Elasticsearch vector database integrations will help do just that, and we're excited to see this partnership grow with future feature and integration releases. Elasticsearch is one of the most flexible and performant retrieval systems that includes a vector database. One of our goals at Elastic is to also be the most open retrieval system out there. In a space as fast-moving as generative AI, we want to have the developer's back when it comes to utilizing emerging tools and libraries. This is why we work closely with libraries like LangChain and add native support to the GenAI ecosystem. From using Elasticsearch as a vector database to hybrid search and orchestrating a full RAG application. Elasticsearch and LangChain have collaborated closely this year. We are putting our extensive experience in building search tools into making your experience of LangChain easier and more flexible. Let's take a deeper look in this blog. Rapid RAG prototyping RAG is a technique for providing users with highly relevant answers to questions. The main advantages over using LLMs directly are that user data can be easily integrated, and hallucinations by the LLM can be minimized. This is achieved by adding a document retrieval step that provides relevant context for the LLM. Since its inception, Elasticsearch has been the go-to solution for relevant document retrieval and has since been a leading innovator, offering numerous retrieval strategies. When it comes to integrating Elasticsearch into LangChain, we have made it easy to choose between the most common retrieval strategies, for example, dense vector, sparse vector, keyword or hybrid. And we enabled power users to further customize these strategies. Keep reading to see some examples. (Note that we assume we have an Elasticsearch deployment .) LangChain integration package In order to use the langchain-elasticsearch partner package, you first need to install it: pip install langchain-elasticsearch Then you can import the classes you need from the langchain_elasticsearch module, for example, the ElasticsearchStore , which gives you simple methods to index and search your data. 
In this example, we use Elastic's sparse vector model ELSER (which has to be deployed first) as our retrieval strategy. from langchain_elasticsearch import ElasticsearchStore es_store = ElasticsearchStore( es_cloud_id="your-cloud-id", es_api_key="your-api-key", index_name="rag-example", strategy=ElasticsearchStore.SparseVectorRetrievalStrategy(model_id=".elser_model_2"), ), A simple RAG application Now, let's build a simple RAG example application. First, we add some example documents to our Elasticsearch store. texts = [ "LangChain is a framework for developing applications powered by large language models (LLMs).", "Elasticsearch is a distributed, RESTful search and analytics engine capable of addressing a growing number of use cases.", ... ] es_store.add_texts(texts) Next, we define the LLM. Here, we use the default gpt-3.5-turbo model offered by OpenAI, which also powers ChatGPT. from langchain_openai import ChatOpenAI llm = ChatOpenAI(api_key="sk-...") # or set the OPENAI_API_KEY environment variable Now we are ready to plug together our RAG system. For simplicity we take a standard prompt for instructing the LLM. We also transform the Elasticsearch store into a LangChain retriever. Finally, we chain together the retrieval step with adding the documents to the prompt and sending it to the LLM. from langchain import hub from langchain_core.runnables import RunnablePassthrough prompt = hub.pull("rlm/rag-prompt") # standard prompt from LangChain hub retriever = es_store.as_retriever() def format_docs(docs): return "\n\n".join(doc.page_content for doc in docs) rag_chain = ( {"context": retriever | format_docs, "question": RunnablePassthrough()} | prompt | llm | StrOutputParser() ) With these few lines of code, we now already have a simple RAG system. Users can now ask questions on the data: rag_chain.invoke("Which frameworks can help me build LLM apps?") "LangChain is a framework specifically designed for building LLM-powered applications. ..." It's as simple as this. Our RAG system can now respond with info about LangChain, which ChatGPT (version 3.5) cannot. Of course there are many ways to improve this system. One of them is optimizing the way we retrieve the documents. Full retrieval flexibility through the Retriever The Elasticsearch store offers common retrieval strategies out-of-the-box, and developers can freely experiment with what works best for a given use case. But what if your data model is more complex than just text with a single field? What, for example, if your indexing setup includes a web crawler that yields documents with texts, titles, URLs and tags and all these fields are important for search? Elasticsearch's Query DSL gives users full control over how to search their data. And in LangChain, the ElasticsearchRetriever enables this full flexibility directly. All that is required is to define a function that maps the user input query to an Elasticsearch request. Let's say we want to add semantic reranking capabilities to our retrieval step. By adding a Cohere reranking step, the results at the top become more relevant without extra manual tuning. For this, we define a Retriever that takes in a function that returns the respective Query DSL structure. 
def text_similarity_reranking(search_query: str) -> Dict: return { "retriever": { "text_similarity_reranker": { "retriever": { "standard": { "query": { "match": { "text_field": search_query } } } }, "field": "text_field", "inference_id": "cohere-rerank-service", "inference_text": search_query, "window_size": 10 } } } retriever = ElasticsearchRetriever.from_es_params( es_cloud_id="your-cloud-id", es_api_key="your-api-key", index_name="rag-example", content_field=text_field, body_func=text_similarity_reranking, ) (Note that the query structure for similarity reranking is still being finalized. It will be available in an upcoming release.) This retriever can slot seamlessly into the RAG code above. The result is that the retrieval part of our RAG pipeline is much more accurate, leading to more relevant documents being forwarded to the LLM and, most importantly, to more relevant answers. Conclusion Elastic's continued investment into LangChain's ecosystem brings the latest retrieval innovations to one of the most popular GenAI libraries. Through this collaboration, Elastic and LangChain enable developers to rapidly and easily build RAG solutions for end users while providing the necessary flexibility for in-depth tuning of results quality. Ready to try this out on your own? Start a free trial . Looking to build RAG into your apps? Want to try different LLMs with a vector database? Check out our sample notebooks for LangChain, Cohere and more on Github, and join Elasticsearch Relevance Engine training now. Max Jakob 5 min read 11 June 2024 Generative AI Integrations Share Twitter Facebook LinkedIn Recommended Articles Integrations How To Generative AI • 25 July 2024 Protecting Sensitive and PII information in RAG with Elasticsearch and LlamaIndex How to protect sensitive and PII data in a RAG application with Elasticsearch and LlamaIndex. Srikanth Manvi How To Generative AI • 19 July 2024 Build a Conversational Search for your Customer Success Application with Elasticsearch and OpenAI Explore how to enhance your customer success application by implementing a conversational search feature using advanced technologies such as Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) Lionel Palacin Integrations How To Generative AI Vector Database • 11 July 2024 semantic_text with Amazon Bedrock Using semantic_text new feature, and AWS Bedrock as inference endpoint service Gustavo Llermaly Integrations How To Generative AI Vector Database • 10 July 2024 Elasticsearch open inference API adds Amazon Bedrock support Elasticsearch open inference API adds support for embeddings generated from models hosted on Amazon Bedrock." Mark Hoy Hemant Malik Vector Database How To Generative AI • 10 July 2024 Playground: Experiment with RAG using Bedrock Anthropic Models and Elasticsearch in minutes Playground is a low code interface for developers to explore grounding LLMs of their choice with their own private data, in minutes. Joe McElroy Aditya Tripathi Max Jakob 5 min read 11 June 2024 Generative AI Integrations Share Twitter Facebook LinkedIn Jump to Rapid RAG prototyping LangChain integration package A simple RAG application Full retrieval flexibility through the Retriever Conclusion Sitemap RSS Feed Search Labs Repo Elastic.co ©2024. Elasticsearch B.V. All Rights Reserved.""",
      "article_content": """In the last year, we have seen a lot of movement in generative AI. Many new services and libraries have emerged. LangChain has separated itself as the most popular library for building applications with large language models (LLMs), for example Retrieval Augmented Generation (RAG) systems. The library makes it really easy to prototype and experiment with different models and retrieval systems. To enable the first-class support for Elasticsearch in LangChain, we recently elevated our integration from a community package to an official LangChain partner package . This work makes it straightforward to import Elasticsearch capabilities into LangChain applications. The Elastic team manages the code and the release process through a dedicated repository . We will keep improving the LangChain integration there, making sure that users can take full advantage of the latest improvements in Elasticsearch. Our collaboration with Elastic in the last 12 months has been exceptional, particularly as we establish better ways for developers and end users to build RAG applications from prototype to production," said Harrison Chase, Co-Founder and CEO at LangChain. "The LangChain-Elasticsearch vector database integrations will help do just that, and we're excited to see this partnership grow with future feature and integration releases. Elasticsearch is one of the most flexible and performant retrieval systems that includes a vector database. One of our goals at Elastic is to also be the most open retrieval system out there. In a space as fast-moving as generative AI, we want to have the developer's back when it comes to utilizing emerging tools and libraries. This is why we work closely with libraries like LangChain and add native support to the GenAI ecosystem. From using Elasticsearch as a vector database to hybrid search and orchestrating a full RAG application. Elasticsearch and LangChain have collaborated closely this year. We are putting our extensive experience in building search tools into making your experience of LangChain easier and more flexible. Let's take a deeper look in this blog. Rapid RAG prototyping RAG is a technique for providing users with highly relevant answers to questions. The main advantages over using LLMs directly are that user data can be easily integrated, and hallucinations by the LLM can be minimized. This is achieved by adding a document retrieval step that provides relevant context for the LLM. Since its inception, Elasticsearch has been the go-to solution for relevant document retrieval and has since been a leading innovator, offering numerous retrieval strategies. When it comes to integrating Elasticsearch into LangChain, we have made it easy to choose between the most common retrieval strategies, for example, dense vector, sparse vector, keyword or hybrid. And we enabled power users to further customize these strategies. Keep reading to see some examples. (Note that we assume we have an Elasticsearch deployment .) LangChain integration package In order to use the langchain-elasticsearch partner package, you first need to install it: pip install langchain-elasticsearch Then you can import the classes you need from the langchain_elasticsearch module, for example, the ElasticsearchStore , which gives you simple methods to index and search your data. In this example, we use Elastic's sparse vector model ELSER (which has to be deployed first) as our retrieval strategy. 
from langchain_elasticsearch import ElasticsearchStore es_store = ElasticsearchStore( es_cloud_id="your-cloud-id", es_api_key="your-api-key", index_name="rag-example", strategy=ElasticsearchStore.SparseVectorRetrievalStrategy(model_id=".elser_model_2"), ), A simple RAG application Now, let's build a simple RAG example application. First, we add some example documents to our Elasticsearch store. texts = [ "LangChain is a framework for developing applications powered by large language models (LLMs).", "Elasticsearch is a distributed, RESTful search and analytics engine capable of addressing a growing number of use cases.", ... ] es_store.add_texts(texts) Next, we define the LLM. Here, we use the default gpt-3.5-turbo model offered by OpenAI, which also powers ChatGPT. from langchain_openai import ChatOpenAI llm = ChatOpenAI(api_key="sk-...") # or set the OPENAI_API_KEY environment variable Now we are ready to plug together our RAG system. For simplicity we take a standard prompt for instructing the LLM. We also transform the Elasticsearch store into a LangChain retriever. Finally, we chain together the retrieval step with adding the documents to the prompt and sending it to the LLM. from langchain import hub from langchain_core.runnables import RunnablePassthrough prompt = hub.pull("rlm/rag-prompt") # standard prompt from LangChain hub retriever = es_store.as_retriever() def format_docs(docs): return "\n\n".join(doc.page_content for doc in docs) rag_chain = ( {"context": retriever | format_docs, "question": RunnablePassthrough()} | prompt | llm | StrOutputParser() ) With these few lines of code, we now already have a simple RAG system. Users can now ask questions on the data: rag_chain.invoke("Which frameworks can help me build LLM apps?") "LangChain is a framework specifically designed for building LLM-powered applications. ..." It's as simple as this. Our RAG system can now respond with info about LangChain, which ChatGPT (version 3.5) cannot. Of course there are many ways to improve this system. One of them is optimizing the way we retrieve the documents. Full retrieval flexibility through the Retriever The Elasticsearch store offers common retrieval strategies out-of-the-box, and developers can freely experiment with what works best for a given use case. But what if your data model is more complex than just text with a single field? What, for example, if your indexing setup includes a web crawler that yields documents with texts, titles, URLs and tags and all these fields are important for search? Elasticsearch's Query DSL gives users full control over how to search their data. And in LangChain, the ElasticsearchRetriever enables this full flexibility directly. All that is required is to define a function that maps the user input query to an Elasticsearch request. Let's say we want to add semantic reranking capabilities to our retrieval step. By adding a Cohere reranking step, the results at the top become more relevant without extra manual tuning. For this, we define a Retriever that takes in a function that returns the respective Query DSL structure. 
def text_similarity_reranking(search_query: str) -> Dict: return { "retriever": { "text_similarity_reranker": { "retriever": { "standard": { "query": { "match": { "text_field": search_query } } } }, "field": "text_field", "inference_id": "cohere-rerank-service", "inference_text": search_query, "window_size": 10 } } } retriever = ElasticsearchRetriever.from_es_params( es_cloud_id="your-cloud-id", es_api_key="your-api-key", index_name="rag-example", content_field=text_field, body_func=text_similarity_reranking, ) (Note that the query structure for similarity reranking is still being finalized. It will be available in an upcoming release.) This retriever can slot seamlessly into the RAG code above. The result is that the retrieval part of our RAG pipeline is much more accurate, leading to more relevant documents being forwarded to the LLM and, most importantly, to more relevant answers. Conclusion Elastic's continued investment into LangChain's ecosystem brings the latest retrieval innovations to one of the most popular GenAI libraries. Through this collaboration, Elastic and LangChain enable developers to rapidly and easily build RAG solutions for end users while providing the necessary flexibility for in-depth tuning of results quality.""",
      "domains": [
        "https://www.elastic.co"
      ],
      "title": "LangChain and Elastic collaborate to add vector database and semantic reranking for RAG — Search Labs",
      "meta_author": [
        "Max Jakob"
      ],
      "url": "https://www.elastic.co/search-labs/blog/langchain-collaboration",
      "url_scheme": "https",
      "meta_description": "Learn how LangChain and Elasticsearch can accelerate your speed of innovation in the LLM and GenAI space.",
      "headings": [
        "LangChain and Elastic collaborate to add vector database and semantic reranking for RAG",
        "Rapid RAG prototyping",
        "LangChain integration package",
        "A simple RAG application",
        "Full retrieval flexibility through the Retriever",
        "Conclusion",
        "Protecting Sensitive and PII information in RAG with Elasticsearch and LlamaIndex",
        "Build a Conversational Search for your Customer Success Application with Elasticsearch and OpenAI",
        "semantic_text with Amazon Bedrock",
        "Elasticsearch open inference API adds Amazon Bedrock support",
        "Playground: Experiment with RAG using Bedrock Anthropic Models and Elasticsearch in minutes"
      ],
      "links": [
        "https://cloud.elastic.co/registration?onboarding_token=search&cta=cloud-registration&tech=trial&plcmt=navigation&pg=search-labs",
        "https://discuss.elastic.co/c/search/84",
        "https://github.com/elastic/elasticsearch-labs",
        "https://github.com/langchain-ai/langchain-elastic",
        "https://pypi.org/project/langchain-elasticsearch/",
        "https://python.langchain.com/v0.2/docs/integrations/providers/elasticsearch/",
        "https://search.elastic.co/?location%5B0%5D=Search+Labs&referrer=https://www.elastic.co/search-labs/blog/langchain-collaboration",
        "https://www.elastic.co/contact",
        "https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html",
        "https://www.elastic.co/guide/en/machine-learning/current/ml-nlp-elser.html#download-deploy-elser",
        "https://www.elastic.co/search-labs",
        "https://www.elastic.co/search-labs/blog",
        "https://www.elastic.co/search-labs/blog",
        "https://www.elastic.co/search-labs/blog/category/generative-ai",
        "https://www.elastic.co/search-labs/blog/elasticsearch-cohere-rerank",
        "https://www.elastic.co/search-labs/blog/langchain-collaboration#a-simple-rag-application",
        "https://www.elastic.co/search-labs/blog/langchain-collaboration#conclusion",
        "https://www.elastic.co/search-labs/blog/langchain-collaboration#full-retrieval-flexibility-through-the-retriever",
        "https://www.elastic.co/search-labs/blog/langchain-collaboration#langchain-integration-package",
        "https://www.elastic.co/search-labs/blog/langchain-collaboration#rapid-rag-prototyping",
        "https://www.elastic.co/search-labs/blog/retrieval-augmented-generation-rag",
        "https://www.elastic.co/search-labs/blog/semantic-reranking-with-retrievers",
        "https://www.elastic.co/search-labs/integrations",
        "https://www.elastic.co/search-labs/tutorials",
        "https://www.elastic.co/search-labs/tutorials/install-elasticsearch"
      ],
      "id": "66a5568a30cc8eb607eec315",
      "url_port": 443,
      "url_host": "www.elastic.co",
      "url_path_dir2": "blog",
      "url_path": "/search-labs/blog/langchain-collaboration",
      "url_path_dir1": "search-labs"
    }
  }

Creating the proxy

To set up the proxy server, we will use express.js. Following best practices, we will create two endpoints: one to handle the _search calls, and another for the completion calls.

Start by creating a new directory called es-proxy, navigate into it with cd es-proxy, and then initialize the project with npm init.

Next, install the necessary dependencies with the following command:

yarn add express axios dotenv cors

A brief description of each package:

- express: creates the proxy server that handles incoming requests and forwards them to Elasticsearch.
- axios: a popular HTTP client that simplifies making requests to the Elasticsearch API.
- dotenv: lets you manage sensitive data, such as API keys, by storing it in environment variables.
- cors: enables your UI to make requests to a different origin (in this case, the proxy server) by handling Cross-Origin Resource Sharing (CORS). This is essential to avoid problems when your frontend and backend are hosted on different domains or ports.

Now, create a .env file to store your Elasticsearch URL and API key securely:

ELASTICSEARCH_URL=https://<your_elasticsearch_url>
API_KEY=<your_api_key>

Make sure the API key you create is restricted to the required index and is read-only.
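
For example, a key scoped like that can be created with the security API (a sketch; note that the proxy also forwards completion calls with this key, so your deployment may additionally require an inference-related cluster privilege):

POST /_security/api_key
{
  "name": "search-labs-readonly",
  "role_descriptors": {
    "search_only": {
      "indices": [
        {
          "names": ["search-labs-index"],
          "privileges": ["read"]
        }
      ]
    }
  }
}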

Finally, create an index.js file with the following content:

index.js

require("dotenv").config();

const express = require("express");
const cors = require("cors");
const app = express();
const axios = require("axios");

app.use(express.json());
app.use(cors());

const { ELASTICSEARCH_URL, API_KEY } = process.env;

// Handle all _search requests
app.post("/api/:index/_search", async (req, res) => {
  try {
    const response = await axios.post(
      `${ELASTICSEARCH_URL}/${req.params.index}/_search`,
      req.body,
      {
        headers: {
          "Content-Type": "application/json",
          Authorization: `ApiKey ${API_KEY}`,
        },
      }
    );
    res.json(response.data);
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});

// Handle completion requests (forwarded to the Elasticsearch inference API)
app.post("/api/completion", async (req, res) => {
  try {
    const response = await axios.post(
      `${ELASTICSEARCH_URL}/_inference/completion/summaries-completion`,
      req.body,
      {
        headers: {
          "Content-Type": "application/json",
          Authorization: `ApiKey ${API_KEY}`,
        },
      }
    );
    res.json(response.data);
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});

// Start the server
const PORT = process.env.PORT || 1337;
app.listen(PORT, () => {
  console.log(`Server is running on port ${PORT}`);
});

Now start the server by running node index.js. By default this starts the server on port 1337, or on the port defined in your .env file.
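
You can confirm the proxy works end to end with a quick search through it (assuming the index name used in this example):

curl -X POST "http://localhost:1337/api/search-labs-index/_search" \
  -H "Content-Type: application/json" \
  -d '{"size": 1, "query": {"match_all": {}}}'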

Creating the component

For the UI component, we will use search-ui, Elastic's React library for building search experiences. We will create a custom component so that every time a user runs a search, the top results are sent to the LLM using the completion inference endpoint we created, and the answer is then displayed to the user.

You can find a full tutorial on configuring an instance here. You can run search-ui on your machine, or use our online sandbox here.

Once the example is running and connected to your data, run the following installation step in a terminal inside the starter application folder:

yarn add axios antd html-react-parser

After installing the additional dependencies, create a new AiSummary.js file for the new component. It will include a simple prompt that gives the AI instructions and rules.

AiSummary.js

import { withSearch } from "@elastic/react-search-ui";
import { useState, useEffect } from "react";
import axios from "axios";
import { Card } from "antd";
import parse from "html-react-parser";

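// Flatten the top 3 results into the plain-text context block used in the prompt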
const formatSearchResults = (results) => {
  return results
    .slice(0, 3)
    .map(
      (result) => `
    Article Author(s): ${result.meta_author.raw.join(",")}
    Article URL: ${result.url.raw}
    Article title: ${result.title.raw}
    Article content: ${result.article_content.raw}
  `
    )
    .join("\n");
};

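// Build the prompt and send it, through the proxy, to the completion inference endpoint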
const fetchAiSummary = async (searchTerm, results) => {
  const prompt = `
    You are a search assistant. Your mission is to complement search results with an AI Summary to address the user request.
    User request: ${searchTerm}
    Top search results: ${formatSearchResults(results)}
    Rules:
    - The answer must be short. No more than one paragraph.
    - Use HTML
    - Use content from the most relevant search results only to answer the user request
    - Add highlights wrapping in <i><b></b></i> tags the most important phrases of your answer
    - At the end of the answer add a citations section with links to the articles you got the answer on this format:
    <h4>Citations</h4>
    <ul>
      <li><a href="{url}"> {title} </a></li>
    </ul>
    - Only provide citations from the top search results I showed you, and only if they are relevant to the user request.
  `;
  const responseData = await axios.post(
    "http://localhost:1337/api/completion",
    { input: prompt },
    {
      headers: {
        "Content-Type": "application/json",
      },
    }
  );
  return responseData.data.completion[0].result;
};

const AiSummary = ({ results, searchTerm, resultSearchTerm }) => {
  const [aiSummary, setAiSummary] = useState("");
  const [isLoading, setIsLoading] = useState(false);

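  // Regenerate the summary whenever a new search completes (resultSearchTerm updates after results arrive)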
  useEffect(() => {
    if (searchTerm) {
      setIsLoading(true);
      fetchAiSummary(searchTerm, results).then((summary) => {
        setAiSummary(summary);
        setIsLoading(false);
      });
    }
  }, [resultSearchTerm]);

  return (
    <Card style={{ width: "100%" }} loading={isLoading}>
      <div>
        <h2>AI Summary</h2>
        {!resultSearchTerm ? "Ask anything!" : parse(aiSummary)}
      </div>
    </Card>
  );
};

export default withSearch(({ results, searchTerm, resultSearchTerm }) => ({
  results,
  searchTerm,
  resultSearchTerm,
}))(AiSummary);

Updating App.js

Now that the custom component is created, it's time to add it to the application. Your App.js should look like this:

App.js

import React from "react";
import ElasticsearchAPIConnector from "@elastic/search-ui-elasticsearch-connector";
import {
  ErrorBoundary,
  SearchProvider,
  SearchBox,
  Results,
  Facet,
} from "@elastic/react-search-ui";
import { Layout } from "@elastic/react-search-ui-views";
import "@elastic/react-search-ui-views/lib/styles/styles.css";
import AiSummary from "./AiSummary";

const connector = new ElasticsearchAPIConnector(
  {
    host: "http://localhost:1337/api",
    index: "search-labs-index",
  },
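  // Second argument: modify the outgoing request to use a semantic query against the semantic_text field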
  (requestBody, requestState) => {
    if (!requestState.searchTerm) return requestBody;
    requestBody.query = {
      semantic: {
        query: requestState.searchTerm,
        field: "semantic_text",
      },
    };
    return requestBody;
  }
);

const config = {
  debug: true,
  searchQuery: {
    search_fields: {
      semantic_text: {},
    },
    result_fields: {
      title: {
        snippet: {},
      },
      article_content: {
        snippet: {
          size: 10,
        },
      },
      meta_description: {},
      url: {},
      meta_author: {},
      meta_img: {},
    },
    facets: {
      "meta_author.enum": { type: "value" },
    },
  },
  apiConnector: connector,
  alwaysSearchOnInitialLoad: false,
};

export default function App() {
  return (
    <SearchProvider config={config}>
      <div className="App">
        <ErrorBoundary>
          <Layout
            header={<SearchBox />}
            bodyHeader={<AiSummary />}
            bodyContent={
              <Results
                titleField="title"
                thumbnailField="meta_img"
                urlField="url"
              />
            }
            sideContent={
              <Facet key={"1"} field={"meta_author.enum"} label={"author"} />
            }
          />
        </ErrorBoundary>
      </div>
    </SearchProvider>
  );
}

Note how, in the connector instance, we override the default query with a semantic query that takes advantage of the semantic_text mapping we created.
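
With this override in place, a search like "What is LangChain?" sends a request body along these lines through the proxy (a sketch of the relevant part; Search UI adds result fields, paging, and other options on top):

POST /api/search-labs-index/_search
{
  "query": {
    "semantic": {
      "field": "semantic_text",
      "query": "What is LangChain?"
    }
  }
}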

Asking questions

Now it's time to test it. Ask any question about the documents you indexed, and above the search results you should see a card with the AI summary:

Conclusion

Rethinking your search experience is important to keep users engaged and to save them the time of scanning through results to find the answers to their questions. With the Elastic open inference service and Search UI, designing this kind of experience is easier than ever.

Ready to try it yourself? Start a free trial.

Elasticsearch has integrations with tools like LangChain, Cohere, and more. Join our Beyond RAG Basics webinar to build your next GenAI application!

Original article: Generate AI summaries with Elastic — Search Labs

