[A Practical Tool for RAG] Qdrant Vector Database Tutorial

news2025/2/28 6:34:24

TrustRAG project repository 🌟: https://github.com/gomate-community/TrustRAG

A configurable, modular RAG framework

Requirements

This tutorial installs Qdrant with Docker, so please install Docker first.

  • Docker - The easiest way to use Qdrant is to run a pre-built Docker image.
  • Python version >=3.8

Starting the Qdrant container

1. Pull the image

docker pull qdrant/qdrant

2. Start the Qdrant container service

docker run -d \
    --name qdrant_server \
    -v $(pwd)/qdrant_storage:/qdrant/storage \
    -p 6333:6333 \
    qdrant/qdrant
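  • Creates a container named qdrant_server from the qdrant/qdrant image.
  • Mounts the host directory $(pwd)/qdrant_storage at /qdrant/storage in the container so that data persists.
  • Maps host port 6333 to container port 6333 so that the Qdrant service can be reached from the host.
  • Runs the container in the background (-d), so it does not occupy the current terminal.

Check the container logs to confirm that the service started:

docker logs qdrant_server

The logs should show Qdrant's startup messages.

The web UI is available at http://localhost:6333/dashboard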
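
Before moving on, it can help to confirm from Python that the server is reachable. The following is a minimal sketch using only the standard library. It assumes the default port mapping shown above and that the root endpoint answers with a small JSON document containing a version field; the helper name qdrant_is_up is my own and is not part of any Qdrant API.

```python
import http.client
import json
import urllib.request


def qdrant_is_up(base_url="http://localhost:6333", timeout=2.0):
    """Return True if the root endpoint answers with JSON that has a version field."""
    try:
        with urllib.request.urlopen(base_url + "/", timeout=timeout) as resp:
            info = json.load(resp)
    except (OSError, ValueError, http.client.HTTPException):
        # OSError covers refused connections and timeouts (URLError is a subclass);
        # ValueError covers a body that is not valid JSON.
        return False
    return isinstance(info, dict) and "version" in info


print(qdrant_is_up())
```

If this prints False, check docker logs qdrant_server and the port mapping before continuing.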

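Working with the vector database through the RESTful API

Step 1: Create a collection

A collection in Qdrant is comparable to a table in MySQL: it stores vector data of the same kind in one place. Each record stored in a collection is called a point. The term is meant in roughly the geometric sense, since a point is the representation of a vector in space, but in practice you can simply think of it as one record.

First, we create a collection named star_charts to store the colony data. Each location is represented by a four-dimensional vector, and we use the dot product (Dot) as the similarity metric for search.

Run the following command to create the collection: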

PUT collections/star_charts
{
  "vectors": {
    "size": 4,
    "distance": "Dot"
  }
}
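
The console syntax above is shorthand for an ordinary HTTP request. As a rough sketch of what gets sent, the snippet below builds the same PUT request with the Python standard library without sending it. The function name create_collection_request and the base URL http://localhost:6333 are assumptions for illustration.

```python
import json
import urllib.request


def create_collection_request(name, size, distance, base_url="http://localhost:6333"):
    """Build (but do not send) the PUT request that creates a collection."""
    body = {"vectors": {"size": size, "distance": distance}}
    return urllib.request.Request(
        url=f"{base_url}/collections/{name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = create_collection_request("star_charts", size=4, distance="Dot")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# To send it against a running server:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```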

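Step 2: Load data into the collection

Once the collection exists, we can add vector data to it. In Qdrant, vector data is represented as points, and each point has three parts: an id, a payload (the associated data), and the vector itself.

With the collection set up, let's add some data. Each location has a vector plus some extra information (the payload), such as its name.

Run the following request to add the data: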

PUT collections/star_charts/points
{
  "points": [
    {
      "id": 1,
      "vector": [0.05, 0.61, 0.76, 0.74],
      "payload": {
        "colony": "Mars"
      }
    },
    {
      "id": 2,
      "vector": [0.19, 0.81, 0.75, 0.11],
      "payload": {
        "colony": "Jupiter"
      }
    },
    {
      "id": 3,
      "vector": [0.36, 0.55, 0.47, 0.94],
      "payload": {
        "colony": "Venus"
      }
    },
    {
      "id": 4,
      "vector": [0.18, 0.01, 0.85, 0.80],
      "payload": {
        "colony": "Moon"
      }
    },
    {
      "id": 5,
      "vector": [0.24, 0.18, 0.22, 0.44],
      "payload": {
        "colony": "Pluto"
      }
    }
  ]
}
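
When the data comes from a program rather than being typed by hand, the request body can be assembled from records. The sketch below is one way to do that; the helper build_points_body is hypothetical. It checks each vector's length first, since a vector whose dimension differs from the collection's size (4 here) would be rejected by the server.

```python
import json


def build_points_body(records, dim):
    """Turn (id, vector, payload) tuples into the body of PUT collections/<name>/points."""
    points = []
    for point_id, vector, payload in records:
        if len(vector) != dim:
            raise ValueError(
                f"point {point_id}: expected {dim} dimensions, got {len(vector)}"
            )
        points.append({"id": point_id, "vector": list(vector), "payload": dict(payload)})
    return {"points": points}


records = [
    (1, [0.05, 0.61, 0.76, 0.74], {"colony": "Mars"}),
    (2, [0.19, 0.81, 0.75, 0.11], {"colony": "Jupiter"}),
]
body = build_points_body(records, dim=4)
print(json.dumps(body, indent=2))
```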

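Step 3: Run a search query

Now let's search for the three colonies closest to a given vector, which represents a location in space. The query returns those colonies along with their payloads.

Run the following query to find the nearest colonies: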

POST collections/star_charts/points/search
{
  "vector": [0.2, 0.1, 0.9, 0.7],
  "limit": 3,
  "with_payload": true
}

This returns the three colonies closest to the given vector.
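
To see what "closest" means under the Dot metric, the ranking can be reproduced by hand. The sketch below scores all five colonies against the query vector with a plain dot product and keeps the top three. It is a brute-force illustration only; Qdrant itself uses an index rather than scanning every point.

```python
colonies = {
    "Mars": [0.05, 0.61, 0.76, 0.74],
    "Jupiter": [0.19, 0.81, 0.75, 0.11],
    "Venus": [0.36, 0.55, 0.47, 0.94],
    "Moon": [0.18, 0.01, 0.85, 0.80],
    "Pluto": [0.24, 0.18, 0.22, 0.44],
}


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_k(query, vectors, k):
    """Score every vector against the query and return the k highest scores."""
    scored = [(name, dot(query, vec)) for name, vec in vectors.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


results = top_k([0.2, 0.1, 0.9, 0.7], colonies, k=3)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

With a higher dot product meaning a better match, this ranks Moon (1.362) first, then Mars (1.273) and Venus (1.208).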


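All of the commands above can be run in the dashboard console.

Click Collections to see the example we just created.

Click Visualize to see the vectors (points) in the collection.

For more advanced usage, see the tutorial in the dashboard: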

http://localhost:6333/dashboard#/tutorial

Working with the vector database through qdrant_client


# Qdrant Quick Start Guide

## Install the `qdrant-client` package (Python)

```bash
pip install qdrant-client
```

Initialize the client

from qdrant_client import QdrantClient

client = QdrantClient(url="http://localhost:6333")

Create a collection

All vector data is stored in Qdrant collections. Create a collection named test_collection that uses the dot product as the metric for comparing vectors.

from qdrant_client.models import Distance, VectorParams

client.create_collection(
    collection_name="test_collection",
    vectors_config=VectorParams(size=4, distance=Distance.DOT),
)
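
Running create_collection a second time against an existing collection fails, which is easy to hit when re-running a script. One possible guard, sketched below, checks collection_exists first. The helper ensure_collection is my own naming; it is written against any client object that offers collection_exists and create_collection, as recent qdrant-client releases do.

```python
def ensure_collection(client, name, vectors_config):
    """Create the collection only if it is missing; return True if it was created."""
    if client.collection_exists(collection_name=name):
        return False
    client.create_collection(collection_name=name, vectors_config=vectors_config)
    return True


# Usage against a real server (requires qdrant-client and a running Qdrant):
# from qdrant_client import QdrantClient
# from qdrant_client.models import Distance, VectorParams
# client = QdrantClient(url="http://localhost:6333")
# ensure_collection(client, "test_collection", VectorParams(size=4, distance=Distance.DOT))
```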

Add vectors with payloads

A payload is the data associated with a vector.

from qdrant_client.models import PointStruct

operation_info = client.upsert(
    collection_name="test_collection",
    wait=True,
    points=[
        PointStruct(id=1, vector=[0.05, 0.61, 0.76, 0.74], payload={"city": "Berlin"}),
        PointStruct(id=2, vector=[0.19, 0.81, 0.75, 0.11], payload={"city": "London"}),
        PointStruct(id=3, vector=[0.36, 0.55, 0.47, 0.94], payload={"city": "Moscow"}),
        PointStruct(id=4, vector=[0.18, 0.01, 0.85, 0.80], payload={"city": "New York"}),
        PointStruct(id=5, vector=[0.24, 0.18, 0.22, 0.44], payload={"city": "Beijing"}),
        PointStruct(id=6, vector=[0.35, 0.08, 0.11, 0.44], payload={"city": "Mumbai"}),
    ]
)

print(operation_info)

Run a query

search_result = client.query_points(
    collection_name="test_collection",
    query=[0.2, 0.1, 0.9, 0.7],
    with_payload=False,  # omit payloads, matching the output shown below
    limit=3,
).points

print(search_result)

Output

[
  {
    "id": 4,
    "version": 0,
    "score": 1.362,
    "payload": null,
    "vector": null
  },
  {
    "id": 1,
    "version": 0,
    "score": 1.273,
    "payload": null,
    "vector": null
  },
  {
    "id": 3,
    "version": 0,
    "score": 1.208,
    "payload": null,
    "vector": null
  }
]

Add a filter

from qdrant_client.models import Filter, FieldCondition, MatchValue

search_result = client.query_points(
    collection_name="test_collection",
    query=[0.2, 0.1, 0.9, 0.7],
    query_filter=Filter(
        must=[FieldCondition(key="city", match=MatchValue(value="London"))]
    ),
    with_payload=True,
    limit=3,
).points

print(search_result)

Output

[
    {
        "id": 2,
        "version": 0,
        "score": 0.871,
        "payload": {
            "city": "London"
        },
        "vector": null
    }
]
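
The effect of a must filter can be mimicked in a few lines: only points whose payload satisfies every condition are scored, and the survivors are then ranked as before. The sketch below is a pure-Python illustration of those semantics, not how Qdrant implements filtering; the function names are hypothetical.

```python
points = [
    (1, [0.05, 0.61, 0.76, 0.74], {"city": "Berlin"}),
    (2, [0.19, 0.81, 0.75, 0.11], {"city": "London"}),
    (3, [0.36, 0.55, 0.47, 0.94], {"city": "Moscow"}),
    (4, [0.18, 0.01, 0.85, 0.80], {"city": "New York"}),
    (5, [0.24, 0.18, 0.22, 0.44], {"city": "Beijing"}),
    (6, [0.35, 0.08, 0.11, 0.44], {"city": "Mumbai"}),
]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def matches_must(payload, conditions):
    """True only if every (key, value) condition equals the payload field."""
    return all(payload.get(key) == value for key, value in conditions)


def filtered_search(query, candidates, must, limit):
    """Drop points that fail the filter, then rank the rest by dot product."""
    hits = [
        (point_id, dot(query, vector), payload)
        for point_id, vector, payload in candidates
        if matches_must(payload, must)
    ]
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits[:limit]


hits = filtered_search([0.2, 0.1, 0.9, 0.7], points, must=[("city", "London")], limit=3)
for point_id, score, payload in hits:
    print(point_id, round(score, 3), payload)
```

Only one point has city equal to London, so a single hit comes back even though limit is 3.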

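Getting started with semantic search

Following the official tutorial, I wrapped and adapted Qdrant in TrustRAG:

Official tutorial: https://qdrant.tech/documentation/beginner-tutorials/neural-search/
TrustRAG implementation (QdrantEngine): https://github.com/gomate-community/TrustRAG/blob/main/trustrag/modules/engine/qdrant.py

The complete usage code is as follows: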

from trustrag.modules.engine.qdrant import QdrantEngine
from trustrag.modules.engine.qdrant import SentenceTransformerEmbedding
if __name__ == "__main__":
    # Initialize embedding generators
    local_embedding_generator = SentenceTransformerEmbedding(model_name_or_path="all-MiniLM-L6-v2", device="cpu")
    # openai_embedding_generator = OpenAIEmbedding(api_key="your_key", base_url="https://ark.cn-beijing.volces.com/api/v3", model="your_model_id")

    # Initialize QdrantEngine with local embedding generator
    qdrant_engine = QdrantEngine(
        collection_name="startups",
        embedding_generator=local_embedding_generator,
        qdrant_client_params={"host": "192.168.1.5", "port": 6333},
    )

    # Sample records; URLs use plain "/" ("\/" is a JSON escape, not a valid Python one)
    documents = [
        {"name": "SaferCodes", "images": "https://safer.codes/img/brand/logo-icon.png",
         "alt": "SaferCodes Logo QR codes generator system forms for COVID-19",
         "description": "QR codes systems for COVID-19.\nSimple tools for bars, restaurants, offices, and other small proximity businesses.",
         "link": "https://safer.codes", "city": "Chicago"},
        {"name": "Human Practice",
         "images": "https://d1qb2nb5cznatu.cloudfront.net/startups/i/373036-94d1e190f12f2c919c3566ecaecbda68-thumb_jpg.jpg?buster=1396498835",
         "alt": "Human Practice -  health care information technology",
         "description": "Point-of-care word of mouth\nPreferral is a mobile platform that channels physicians\u2019 interest in networking with their peers to build referrals within a hospital system.\nHospitals are in a race to employ physicians, even though they lose billions each year ($40B in 2014) on employment. Why ...",
         "link": "http://humanpractice.com", "city": "Chicago"},
        {"name": "StyleSeek",
         "images": "https://d1qb2nb5cznatu.cloudfront.net/startups/i/3747-bb0338d641617b54f5234a1d3bfc6fd0-thumb_jpg.jpg?buster=1329158692",
         "alt": "StyleSeek -  e-commerce fashion mass customization online shopping",
         "description": "Personalized e-commerce for lifestyle products\nStyleSeek is a personalized e-commerce site for lifestyle products.\nIt works across the style spectrum by enabling users (both men and women) to create and refine their unique StyleDNA.\nStyleSeek also promotes new products via its email newsletter, 100% personalized ...",
         "link": "http://styleseek.com", "city": "Chicago"},
        {"name": "Scout",
         "images": "https://d1qb2nb5cznatu.cloudfront.net/startups/i/190790-dbe27fe8cda0614d644431f853b64e8f-thumb_jpg.jpg?buster=1389652078",
         "alt": "Scout -  security consumer electronics internet of things",
         "description": "Hassle-free Home Security\nScout is a self-installed, wireless home security system. We've created a more open, affordable and modern system than what is available on the market today. With month-to-month contracts and portable devices, Scout is a renter-friendly solution for the other ...",
         "link": "http://www.scoutalarm.com", "city": "Chicago"},
        {"name": "Invitation codes", "images": "https://invitation.codes/img/inv-brand-fb3.png",
         "alt": "Invitation App - Share referral codes community ",
         "description": "The referral community\nInvitation App is a social network where people post their referral codes and collect rewards on autopilot.",
         "link": "https://invitation.codes", "city": "Chicago"},
        {"name": "Hyde Park Angels",
         "images": "https://d1qb2nb5cznatu.cloudfront.net/startups/i/61114-35cd9d9689b70b4dc1d0b3c5f11c26e7-thumb_jpg.jpg?buster=1427395222",
         "alt": "Hyde Park Angels - ",
         "description": "Hyde Park Angels is the largest and most active angel group in the Midwest. With a membership of over 100 successful entrepreneurs, executives, and venture capitalists, the organization prides itself on providing critical strategic expertise to entrepreneurs and ...",
         "link": "http://hydeparkangels.com", "city": "Chicago"},
        {"name": "GiveForward",
         "images": "https://d1qb2nb5cznatu.cloudfront.net/startups/i/1374-e472ccec267bef9432a459784455c133-thumb_jpg.jpg?buster=1397666635",
         "alt": "GiveForward -  health care startups crowdfunding",
         "description": "Crowdfunding for medical and life events\nGiveForward lets anyone to create a free fundraising page for a friend or loved one's uncovered medical bills, memorial fund, adoptions or any other life events in five minutes or less. Millions of families have used GiveForward to raise more than $165M to let ...",
         "link": "http://giveforward.com", "city": "Chicago"},
        {"name": "MentorMob",
         "images": "https://d1qb2nb5cznatu.cloudfront.net/startups/i/19374-3b63fcf38efde624dd79c5cbd96161db-thumb_jpg.jpg?buster=1315734490",
         "alt": "MentorMob -  digital media education ventures for good crowdsourcing",
         "description": "Google of Learning, indexed by experts\nProblem: Google doesn't index for learning. Nearly 1 billion Google searches are done for \"how to\" learn various topics every month, from photography to entrepreneurship, forcing learners to waste their time sifting through the millions of results.\nMentorMob is ...",
         "link": "http://www.mentormob.com", "city": "Chicago"},
        {"name": "The Boeing Company",
         "images": "https://d1qb2nb5cznatu.cloudfront.net/startups/i/49394-df6be7a1eca80e8e73cc6699fee4f772-thumb_jpg.jpg?buster=1406172049",
         "alt": "The Boeing Company -  manufacturing transportation", "description": "",
         "link": "http://www.boeing.com", "city": "Berlin"},
        {"name": "NowBoarding \u2708\ufe0f",
         "images": "https://static.above.flights/img/lowcost/envelope_blue.png",
         "alt": "Lowcost Email cheap flights alerts",
         "description": "Invite-only mailing list.\n\nWe search the best weekend and long-haul flight deals\nso you can book before everyone else.",
         "link": "https://nowboarding.club/", "city": "Berlin"},
        {"name": "Rocketmiles",
         "images": "https://d1qb2nb5cznatu.cloudfront.net/startups/i/158571-e53ddffe9fb3ed5e57080db7134117d0-thumb_jpg.jpg?buster=1361371304",
         "alt": "Rocketmiles -  e-commerce online travel loyalty programs hotels",
         "description": "Fueling more vacations\nWe enable our customers to travel more, travel better and travel further. 20M+ consumers stock away miles & points to satisfy their wanderlust.\nFlying around or using credit cards are the only good ways to fill the stockpile today. We've built the third way. Customers ...",
         "link": "http://www.Rocketmiles.com", "city": "Berlin"},
    ]
    vectors = qdrant_engine.embedding_generator.generate_embedding([doc["description"] for doc in documents])
    print(vectors.shape)
    payload = [doc for doc  in documents]

    # Upload vectors and payload
    qdrant_engine.upload_vectors(vectors=vectors, payload=payload)

    # Build a filter on the city field
    conditions = [
        {"key": "city", "match": "Berlin"},
    ]
    custom_filter = qdrant_engine.build_filter(conditions)

    # Search for startups related to "vacations" in Berlin
    results = qdrant_engine.search(text="vacations", query_filter=custom_filter, limit=5)
    for result in results:
        print(result)

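References

  • Official tutorial: https://qdrant.tech/documentation/beginner-tutorials/search-beginners/
  • Introduction to the Qdrant vector database: https://www.tizi365.com/topic/8144.html
  • Simplified version of the official Qdrant quick start and tutorials: https://www.cnblogs.com/shizidushu/p/18385637
  • [RAG tool] Various ways to use the Qdrant vector database and multiple embedding generation methods: https://www.cnblogs.com/zxporz/p/18336698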
