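Features
Milvus is written mainly in Go (63.4%), Python (17.0%), C++ (16.6%), and Shell (1.3%). It offers Python, Go, and Java SDKs (C++, Rust, and C# SDKs are still in development), supports standalone and cluster deployment, and can run on both CPU and GPU. All search and query operations in Milvus are executed in memory. The maximum supported vector dimension is currently 32,768. Other limits also apply.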
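Because searches and queries run in memory, a rough size estimate can help when planning a deployment. The sketch below is a back-of-the-envelope helper, not a Milvus API: it assumes vectors are stored as 32-bit floats (4 bytes per component) and counts only the raw vector payload, ignoring index structures, scalar fields, and server overhead. The names raw_vector_bytes, MAX_DIM, and BYTES_PER_FLOAT32 are hypothetical.

```python
MAX_DIM = 32_768        # maximum vector dimension quoted above
BYTES_PER_FLOAT32 = 4   # assumption: each vector component is a 32-bit float


def raw_vector_bytes(num_vectors: int, dim: int) -> int:
    """Return the raw size in bytes of num_vectors float32 vectors of length dim."""
    if not 1 <= dim <= MAX_DIM:
        raise ValueError(f"dim must be between 1 and {MAX_DIM}, got {dim}")
    if num_vectors < 0:
        raise ValueError("num_vectors must be non-negative")
    return num_vectors * dim * BYTES_PER_FLOAT32


if __name__ == "__main__":
    # One million 200-dimensional vectors, the dimension used later in this post
    size = raw_vector_bytes(1_000_000, 200)
    print(f"{size / 1024**2:.1f} MiB")  # 762.9 MiB
```

The real memory footprint will be larger than this figure, so treat it as a lower bound.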
Usage steps:
Unlike faiss and usearch, which were discussed earlier, Milvus requires the server-side Milvus to be installed before use; otherwise the following error occurs: pymilvus.exceptions.MilvusException: <MilvusException: (code=2, message=Fail connecting to server on 127.0.0.1:19530. Timeout)>
Install Milvus: depending on your operating system and requirements, choose a suitable installation method, such as a Docker container, a binary, or building from source.
Docker
curl -fsSL https://get.docker.com | bash -s docker --mirror Aliyun
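# Assumes the docker-compose-linux-x86_64 binary has already been downloaded from the Docker Compose releases page (https://github.com/docker/compose/releases) into the current directory
sudo chmod +x ./docker-compose-linux-x86_64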
sudo cp ./docker-compose-linux-x86_64 /usr/bin/docker-compose
docker-compose version
- Links: Quick start with Docker; Using GPUs in Docker; Docker Compose: installing the tool for managing groups of Docker containers
Installation
wget https://github.com/milvus-io/milvus/releases/download/v2.3.4/milvus-standalone-docker-compose.yml -O docker-compose.yml
sudo docker compose up -d
$ sudo docker compose up -d
[+] Running 23/23
✔ standalone 7 layers [⣿⣿⣿⣿⣿⣿⣿] 0B/0B Pulled 13.8s
✔ d5fd17ec1767 Pull complete 2.4s
✔ 7ab813dbf013 Pull complete 2.6s
✔ 971f9356e3f1 Pull complete 4.1s
✔ 278f4560205e Pull complete 4.2s
✔ b83f734869d9 Pull complete 10.0s
✔ 1f27396f6efc Pull complete 10.1s
✔ fe556ec02776 Pull complete 10.1s
✔ etcd 7 layers [⣿⣿⣿⣿⣿⣿⣿] 0B/0B Pulled 15.8s
✔ dbba69284b27 Pull complete 10.6s
✔ 270b322b3c62 Pull complete 10.7s
✔ 7c21e2da1038 Pull complete 10.8s
✔ cb4f77bfee6c Pull complete 10.8s
✔ e5485096ca5d Pull complete 10.8s
✔ 3ea3736f61e1 Pull complete 10.9s
✔ 1e815a2c4f55 Pull complete 10.9s
✔ minio 6 layers [⣿⣿⣿⣿⣿⣿] 0B/0B Pulled 14.1s
✔ c7e856e03741 Pull complete 6.6s
✔ c1ff217ec952 Pull complete 6.6s
✔ b12cc8972a67 Pull complete 6.6s
✔ 4324e307ea00 Pull complete 6.9s
✔ 152089595ebc Pull complete 6.9s
✔ 05f217fb8612 Pull complete 10.3s
[+] Building 0.0s (0/0)
[+] Running 4/4
✔ Network milvus Created 0.1s
✔ Container milvus-minio Started 4.6s
✔ Container milvus-etcd Started 3.5s
✔ Container milvus-standalone Started
$ sudo docker compose ps
NAME IMAGE COMMAND SERVICE CREATED STATUS PORTS
milvus-etcd quay.io/coreos/etcd:v3.5.5 "etcd -advertise-cli…" etcd About a minute ago Up About a minute (healthy) 2379-2380/tcp
milvus-minio minio/minio:RELEASE.2023-03-20T20-16-18Z "/usr/bin/docker-ent…" minio About a minute ago Up About a minute (healthy) 0.0.0.0:9000-9001->9000-9001/tcp, :::9000-9001->9000-9001/tcp
milvus-standalone milvusdb/milvus:v2.3.4 "/tini -- milvus run…" standalone About a minute ago Up About a minute (healthy) 0.0.0.0:9091->9091/tcp, :::9091->9091/tcp, 0.0.0.0:19530->19530/tcp, :::19530->19530/tcp
Test the connection
docker port milvus-standalone 19530/tcp
# The docker port command shows how a given port of a running container is mapped
$ sudo docker port milvus-standalone 19530/tcp
0.0.0.0:19530
[::]:19530
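The same check can be made from Python before the client is used, which helps distinguish "the server is not running" from other causes of the connection timeout shown earlier. This is a minimal sketch using only the standard library. The helper name is_port_open is hypothetical, and a reachable TCP port only shows that something is listening on it, not that Milvus has finished starting up.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False


if __name__ == "__main__":
    # 127.0.0.1:19530 is the address used throughout this post
    if is_port_open("127.0.0.1", 19530):
        print("Port 19530 is reachable")
    else:
        print("Nothing is listening on 127.0.0.1:19530; is the Milvus container running?")
```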
Stop the Milvus service
To stop Milvus standalone, run:
sudo docker compose down
To delete the data after stopping Milvus, run:
sudo rm -rf volumes
Client usage
Installation
$ pip3 install pymilvus # https://github.com/milvus-io/pymilvus
Usage
from pymilvus import connections, FieldSchema, CollectionSchema, DataType, Collection, utility

# --------------------------------------------------------------------------------------------
# Server address
HOST = '127.0.0.1'
PORT = '19530'
# Vector settings
DIM = 200  # dimension
COLLECTION_NAME = 'test'
# --------------------------------------------------------------------------------------------

# Create a Milvus collection; see https://milvus.io/docs/create_collection.md
def create_milvus_collection(collection_name, dim):
    # If a collection with the same name already exists, drop it
    if utility.has_collection(collection_name):
        utility.drop_collection(collection_name)
    # Define the collection's fields. Note: to simplify data insertion, Milvus lets you
    # specify a default value for each scalar field, except the primary key field
    fields = [
        # 'path' field: stores the image path and serves as the primary key
        FieldSchema(name='path', dtype=DataType.VARCHAR, description='image path', max_length=500,
                    is_primary=True, auto_id=False),
        # 'embedding' field: stores the image embedding vector
        FieldSchema(name='embedding', dtype=DataType.FLOAT_VECTOR, description='image embedding vector', dim=dim)
    ]
    # Build the collection schema
    schema = CollectionSchema(fields=fields, description='collection description')
    # Create the collection from the schema; it is usable from this point on
    collection = Collection(name=collection_name, schema=schema)
    # Index parameters: this example builds an IVF_FLAT index with 10 clusters,
    # using Euclidean distance (L2) as the similarity metric
    index_params = {
        "metric_type": "L2",
        "index_type": "IVF_FLAT",
        "params": {"nlist": 10}
    }
    # Create the index on the 'embedding' field with the parameters above
    collection.create_index(field_name='embedding', index_params=index_params)
    # Return the newly created collection object
    return collection

# Connect to the server
connections.connect(host=HOST, port=PORT)
# Create the collection
collection = create_milvus_collection(COLLECTION_NAME, DIM)
print(f'A new collection created: {COLLECTION_NAME}')
# Alternatively, open an existing collection directly: collection = Collection("book")
import random

# Column-based data: one list per field, in schema order (path, embedding)
data = [
    [str(i) for i in range(2000)],
    [[random.random() for _ in range(DIM)] for _ in range(2000)],
]
print(len(data))  # number of fields (columns), not the number of rows
mr = collection.insert(data)
search_params = {
    "metric_type": "L2",
    "offset": 0,
    "ignore_growing": False,
    "params": {"nprobe": 10}
}
# The collection must be loaded into memory before searching
collection.load()
results = collection.search(
    data=[[random.random() for _ in range(DIM)]],
    anns_field="embedding",  # Name of the field to search on.
    param=search_params,
    limit=10,
    expr=None,  # Boolean expression for filtering on attributes; see https://milvus.io/docs/boolean.md
    output_fields=['embedding'],  # Names of the fields to return; Milvus supports returning vector fields (optional)
    # consistency_level="Strong"  # Consistency level of the search (optional)
)
print(results[0].ids)
print(results[0].distances)
hit = results[0][0]
print(hit.entity.get('embedding'))  # requires output_fields to be specified
# Sample output (values differ between runs because the data is random):
# ['537', '1228', '389', '1527', '395', '190', '1221', '555', '1789', '886']
# [25.513811111450195, 26.030805587768555, 26.122865676879883, 26.59450912475586, 26.952003479003906, 27.123659133911133, 27.264328002929688, 27.28336524963379, 27.417621612548828, 27.71729278564453]
# [0.15461023, 0.30096045, 0.26865703, 0.25927073, 0.33812553, 0.54217076, 0.15246719, 0.731632, 0.45709008, 0.79914236, 0.9088526, 0.02686498, 0.42263803, 0.69333476, 0.39840952, 0.6991515, 0.5305877, 0.6620755, 0.5817265, 0.21614578, 0.8906462, 0.64077824, 0.09763326, 0.8131759, 0.31869066, 0.7435266, 0.727443, 0.6023419, 0.665456, 0.3228657, 0.10494679, 0.7091096, 0.3667962, 0.3149366, 0.15853179, 0.24909244, 0.23726037, 0.17990382, 0.3514512, 0.116617575, 0.5656539, 0.36453706, 0.7430549, 0.5163423, 0.17115992, 0.3062062, 0.9076736, 0.5650338, 0.43389124, 0.6029854, 0.3382137, 0.38251325, 0.7953752, 0.19413383, 0.21625121, 0.04543528, 0.97489053, 0.76131046, 0.17360009, 0.32513952, 0.7822587, 0.99820197, 0.97119784, 0.11839666, 0.004737074, 0.18586244, 0.21051529, 0.5463567, 0.28732273, 0.59985745, 0.35132825, 0.17821868, 0.08039577, 0.22121702, 0.51074564, 0.9789643, 0.91906327, 0.3212936, 0.9785981, 0.70479745, 0.77640325, 0.03191031, 0.12803258, 0.8522966, 0.48946765, 0.8437068, 0.17805281, 0.3471558, 0.7912329, 0.19458486, 0.9588124, 0.5400154, 0.3107983, 0.08004966, 0.40348408, 0.8400167, 0.255088, 0.29406822, 0.69000036, 0.7577903, 0.6970145, 0.99666446, 0.5368813, 0.25070563, 0.10906121, 0.6366669, 0.75897807, 0.2470287, 0.83007634, 0.17270081, 0.37081972, 0.5600866, 0.47211888, 0.48388532, 0.09467795, 0.43837216, 0.3848784, 0.33862317, 0.5992313, 0.49879825, 0.21382369, 0.4665225, 0.20776376, 0.41195828, 0.77341104, 0.41533098, 0.1488313, 0.29170626, 0.90135145, 0.9490258, 0.5797127, 0.046041798, 0.032213394, 0.9823944, 0.22410004, 0.01474563, 0.54565424, 0.84022516, 0.3146623, 0.60868996, 0.8468924, 0.5047047, 0.44784358, 0.76461, 0.39477462, 0.4341565, 0.04060842, 0.7913311, 0.3800782, 0.76624304, 0.27977547, 0.5467395, 0.7406536, 0.051075574, 0.859247, 0.16734485, 0.55351096, 0.77330744, 0.21997604, 0.6573193, 0.47392654, 0.22703278, 0.21453229, 0.5354482, 0.68723947, 0.3444063, 0.19725236, 0.63618726, 0.20056139, 0.41761643, 0.3148263, 0.0072599854, 
# 0.14207017, 0.96439177, 0.727712, 0.61615413, 0.67021996, 0.73491627, 0.64917046, 0.6545984, 0.6521858, 0.86778504, 0.65002567, 0.65721965, 0.57199746, 0.27476418, 0.5959397, 0.17169125, 0.30866027, 0.6539025, 0.83966345, 0.18539791, 0.64870465, 0.9470506, 0.6794907, 0.75711423, 0.88191146, 0.075844504, 0.9600152, 0.38191438]
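To make the roles of nlist (index side) and nprobe (search side) more concrete, here is a toy, pure-Python sketch of the inverted-file idea behind IVF_FLAT. It is not how Milvus implements the index. As a simplification, the centroids are randomly chosen vectors rather than the result of k-means training, and all function names are hypothetical. Distances are squared L2 without the square root, which is an assumption about how the L2 metric is reported, although it is consistent with the magnitude of the sample distances above for 200-dimensional uniform random data.

```python
import random


def sq_l2(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_ivf(vectors, nlist, seed=0):
    """Partition vector indices into nlist buckets by nearest centroid.

    Simplification: centroids are nlist randomly chosen vectors, not k-means centers.
    """
    rng = random.Random(seed)
    centroids = rng.sample(vectors, nlist)
    buckets = [[] for _ in range(nlist)]
    for idx, vec in enumerate(vectors):
        nearest = min(range(nlist), key=lambda c: sq_l2(vec, centroids[c]))
        buckets[nearest].append(idx)
    return centroids, buckets


def ivf_search(query, vectors, centroids, buckets, nprobe, limit):
    """Scan only the nprobe buckets whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: sq_l2(query, centroids[c]))
    candidates = [i for c in order[:nprobe] for i in buckets[c]]
    candidates.sort(key=lambda i: (sq_l2(query, vectors[i]), i))
    return candidates[:limit]


def brute_force(query, vectors, limit):
    """Exact nearest neighbours by scanning every vector."""
    ranked = sorted(range(len(vectors)), key=lambda i: (sq_l2(query, vectors[i]), i))
    return ranked[:limit]


if __name__ == "__main__":
    rng = random.Random(42)
    dim, n, nlist = 16, 500, 10
    vectors = [[rng.random() for _ in range(dim)] for _ in range(n)]
    query = [rng.random() for _ in range(dim)]
    centroids, buckets = build_ivf(vectors, nlist)

    exact = brute_force(query, vectors, 10)
    full = ivf_search(query, vectors, centroids, buckets, nprobe=nlist, limit=10)
    partial = ivf_search(query, vectors, centroids, buckets, nprobe=2, limit=10)

    print(full == exact)  # True: with nprobe == nlist every bucket is scanned
    print(len(set(partial) & set(exact)), "of 10 exact neighbours found with nprobe=2")
```

With nlist=10 and nprobe=10, as in the example above, every cluster is probed, so the search is effectively exhaustive. A smaller nprobe trades recall for speed.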
Related projects
reverse_image_search
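- Towhee can generate embedding vectors through pipelines of ML models and other operations. It aims to democratize embedding generation, allowing everyone, from beginner developers to large organizations, to produce dense embeddings with just a few lines of code. Towhee can be used to analyze unstructured data, for tasks such as reverse image search, reverse video search, audio classification, question answering systems, and molecular search.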
- https://github.com/towhee-io/examples/blob/main/image/reverse_image_search/workflow.png
Project bootcamp
osschat
- https://osschat.io/chat: Enhanced ChatGPT with documentation, issues, blog posts, community Q&A as knowledge bases. Built for every community and developer.
Easily build a Milvus-based text retrieval system