1. Deploy the MinIO environment
docker pull minio/minio
Volume mappings between the host and the container:
Host path | Container path |
---|---|
/data/minio/data | /data |
/data/minio/config | /root/.minio |
Start the container:
docker run -p 9000:9000 -p 9090:9090 --name minio \
-d --restart=always \
-e "MINIO_ACCESS_KEY=admin" \
-e "MINIO_SECRET_KEY=admin123456" \
-v /data/minio/data:/data \
-v /data/minio/config:/root/.minio \
minio/minio \
server /data --console-address ":9090"
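Before moving on, it can help to confirm that the container is actually serving requests. The sketch below probes MinIO's liveness endpoint (`/minio/health/live`) on the S3 API port. The host `127.0.0.1` and the helper names are assumptions for illustration; adjust them to your setup.

```python
import urllib.request


def minio_live_url(host: str = "127.0.0.1", port: int = 9000) -> str:
    """Build the URL of MinIO's liveness probe on the S3 API port."""
    return f"http://{host}:{port}/minio/health/live"


def is_minio_live(url: str, timeout: float = 3.0) -> bool:
    """Return True if the liveness endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection refused, timeouts and HTTP error statuses
        # (urllib's URLError and HTTPError are OSError subclasses).
        return False
```

Calling `is_minio_live(minio_live_url())` on the Docker host should return `True` once the container started with `-p 9000:9000` is up.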
2. Prepare the StarRocks environment
Deploy StarRocks with Docker; see "Deploy StarRocks with Docker" (deploy_with_docker) in the StarRocks Docs.
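The SQL in section 3 addresses MinIO as `172.17.0.3:9000`, which looks like the MinIO container's address on Docker's default bridge network (an assumption; your address may differ). A minimal sketch for looking the address up with `docker inspect`, where the container name `minio` comes from the `docker run` command above and the helper names are hypothetical:

```python
import subprocess
from typing import List


def inspect_ip_command(container: str = "minio") -> List[str]:
    """Build a `docker inspect` call that prints the container's IP address(es)."""
    template = "{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}"
    return ["docker", "inspect", "-f", template, container]


def container_ip(container: str = "minio") -> str:
    """Run the command on the Docker host and return its trimmed output."""
    result = subprocess.run(
        inspect_ip_command(container),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()
```

The returned address, combined with port 9000, is what the `aws.s3.endpoint` property should point to when StarRocks runs in another container on the same Docker network.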
3. Hands-on: querying files on MinIO and backing up a full database
Generate a Parquet file with Python:
xiuchenggong@xiuchengdeMacBook-Pro ~ % python3
Python 3.9.10 (main, Jan 15 2022, 11:48:04)
[Clang 13.0.0 (clang-1300.0.29.3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> pf = pd.read_csv("/Users/xiuchenggong/test.csv")
>>> pf.to_parquet("/Users/xiuchenggong/test.parquet",engine="pyarrow")
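The session above assumes that `test.csv` already exists and leaves open how `test.parquet` reaches MinIO. The sketch below fills both gaps under stated assumptions: the CSV content mirrors the two rows returned by the query in section 3.1, and the upload uses `boto3` (not mentioned in the original, any S3 client would do) with placeholder credentials. The bucket name `starrocks` matches the `s3a://starrocks/test.parquet` path used later.

```python
import csv
from pathlib import Path


def write_sample_csv(path: str) -> None:
    """Write a two-column CSV (name, id) matching the rows queried in section 3.1."""
    rows = [("gongxiucheng", 1), ("gongzixi", 2)]
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "id"])
        writer.writerows(rows)


def upload_to_minio(local_path: str, bucket: str, endpoint_url: str,
                    access_key: str, secret_key: str) -> str:
    """Upload a file to a MinIO bucket and return its s3a:// URI."""
    # boto3 is imported lazily so write_sample_csv works without it.
    import boto3
    from botocore.config import Config
    from botocore.exceptions import ClientError

    s3 = boto3.client(
        "s3",
        endpoint_url=endpoint_url,            # e.g. "http://127.0.0.1:9000"
        aws_access_key_id=access_key,         # placeholder, use your own key
        aws_secret_access_key=secret_key,     # placeholder, use your own key
        region_name="us-east-1",
        config=Config(s3={"addressing_style": "path"}),
    )
    try:
        s3.head_bucket(Bucket=bucket)
    except ClientError:
        # Any head_bucket failure is treated here as "bucket missing".
        s3.create_bucket(Bucket=bucket)
    key = Path(local_path).name
    s3.upload_file(local_path, bucket, key)
    return f"s3a://{bucket}/{key}"
```

Path-style addressing is chosen to match the `aws.s3.enable_path_style_access` setting used in the StarRocks statements.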
3.1 Query the Parquet data stored on MinIO (file external tables can read Parquet or ORC data):
StarRocks > CREATE EXTERNAL TABLE table_1
-> (
-> name string,
-> id int
-> )
-> ENGINE=file
-> PROPERTIES
-> (
-> "path" = "s3a://starrocks/test.parquet",
-> "format" = "parquet",
-> "aws.s3.enable_ssl" = "false",
-> "aws.s3.enable_path_style_access" = "true",
-> "aws.s3.endpoint" = "172.17.0.3:9000",
-> "aws.s3.access_key" = "0OnU8H9YwTNTJUBC2r7F",
-> "aws.s3.secret_key" = "vFQ3fIcs90woUS4200L0BYfxelE86iF6cI4vVzYC"
-> );
Query OK, 0 rows affected (0.009 sec)
StarRocks > show tables;
+-------------------+
| Tables_in_test_db |
+-------------------+
| table_1 |
| test1 |
| test2 |
+-------------------+
3 rows in set (0.003 sec)
StarRocks > select * from table_1;
+--------------+------+
| name | id |
+--------------+------+
| gongxiucheng | 1 |
| gongzixi | 2 |
+--------------+------+
2 rows in set (0.073 sec)
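When several files need external tables, the statement above can be generated instead of typed by hand. This is a minimal sketch, not a StarRocks API: the helper names are hypothetical, the keys are placeholders, and values are interpolated without escaping, so it is only suitable for trusted input.

```python
from typing import Dict, List, Tuple


def render_properties(props: Dict[str, str]) -> str:
    """Render a dict as the body of a PROPERTIES (...) clause."""
    return ",\n".join(f'    "{k}" = "{v}"' for k, v in props.items())


def file_table_ddl(table: str, columns: List[Tuple[str, str]], path: str,
                   fmt: str, s3_props: Dict[str, str]) -> str:
    """Build a CREATE EXTERNAL TABLE ... ENGINE=file statement."""
    cols = ",\n".join(f"    {name} {ctype}" for name, ctype in columns)
    props = {"path": path, "format": fmt, **s3_props}
    return (
        f"CREATE EXTERNAL TABLE {table}\n(\n{cols}\n)\n"
        f"ENGINE=file\nPROPERTIES\n(\n{render_properties(props)}\n);"
    )


# Connection settings shared by every table that reads from the same MinIO.
S3_PROPS = {
    "aws.s3.enable_ssl": "false",
    "aws.s3.enable_path_style_access": "true",
    "aws.s3.endpoint": "172.17.0.3:9000",
    "aws.s3.access_key": "<access_key>",   # placeholder
    "aws.s3.secret_key": "<secret_key>",   # placeholder
}

ddl = file_table_ddl(
    "table_1",
    [("name", "string"), ("id", "int")],
    "s3a://starrocks/test.parquet",
    "parquet",
    S3_PROPS,
)
```

The resulting string can be pasted into the MySQL client session or sent through any MySQL-protocol driver.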
3.2 Full backup to MinIO (external tables cannot be backed up)
Create the repository:
StarRocks > create repository starrocks_backup_01
-> with broker
-> on location "s3a://starrocks"
-> properties(
-> "aws.s3.enable_ssl" = "false",
-> "aws.s3.enable_path_style_access" = "true",
-> "aws.s3.access_key" = "0OnU8H9YwTNTJUBC2r7F",
-> "aws.s3.secret_key" = "vFQ3fIcs90woUS4200L0BYfxelE86iF6cI4vVzYC",
-> "aws.s3.endpoint" = "172.17.0.3:9000"
-> )
-> ;
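Start the backup (the external table table_1 is dropped first, because external tables cannot be backed up):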
StarRocks > drop table table_1;
Query OK, 0 rows affected (0.010 sec)
StarRocks > backup snapshot test_db.snapshot_minio to starrocks_backup_01 properties("type"="full");
Query OK, 0 rows affected (0.024 sec)
StarRocks > show backup\G
*************************** 1. row ***************************
JobId: 11047
SnapshotName: snapshot_minio
DbName: test_db
State: SAVE_META
BackupObjs: [test_db.test1], [test_db.test2]
CreateTime: 2023-09-05 01:58:42
SnapshotFinishedTime: 2023-09-05 01:58:48
UploadFinishedTime: 2023-09-05 01:58:54
FinishedTime: NULL
UnfinishedTasks:
Progress:
TaskErrMsg:
Status: [OK]
Timeout: 86400
1 row in set (0.003 sec)
StarRocks > show backup\G
*************************** 1. row ***************************
JobId: 11047
SnapshotName: snapshot_minio
DbName: test_db
State: FINISHED
BackupObjs: [test_db.test1], [test_db.test2]
CreateTime: 2023-09-05 01:58:42
SnapshotFinishedTime: 2023-09-05 01:58:48
UploadFinishedTime: 2023-09-05 01:58:54
FinishedTime: 2023-09-05 01:59:00
UnfinishedTasks:
Progress:
TaskErrMsg:
Status: [OK]
Timeout: 86400
1 row in set (0.004 sec)
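The backup runs asynchronously, which is why `show backup` is issued repeatedly until `State` becomes `FINISHED`. That polling can be sketched as below. The loop itself is plain Python. The `pymysql`-based fetcher is an assumption (any MySQL-protocol client works), as is treating `CANCELLED` as the failure state.

```python
import time
from typing import Callable

# FINISHED appears in the output above; CANCELLED is assumed to be the
# state reported for a failed or aborted job.
TERMINAL_STATES = {"FINISHED", "CANCELLED"}


def wait_for_backup(fetch_state: Callable[[], str],
                    poll_seconds: float = 5.0,
                    timeout_seconds: float = 600.0,
                    sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until the backup job reaches a terminal state and return it."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        state = fetch_state()
        if state in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"backup still in state {state!r}")
        sleep(poll_seconds)


def starrocks_state_fetcher(host: str, port: int, user: str,
                            password: str, database: str) -> Callable[[], str]:
    """Return a fetch_state callable backed by SHOW BACKUP (needs pymysql)."""
    import pymysql  # imported lazily; only this helper depends on it

    def fetch_state() -> str:
        conn = pymysql.connect(host=host, port=port, user=user,
                               password=password, database=database,
                               cursorclass=pymysql.cursors.DictCursor)
        try:
            with conn.cursor() as cur:
                cur.execute("SHOW BACKUP")
                row = cur.fetchone()
                return row["State"] if row else "UNKNOWN"
        finally:
            conn.close()

    return fetch_state
```

A typical call would be `wait_for_backup(starrocks_state_fetcher("127.0.0.1", 9030, "root", "", "test_db"))`, assuming the FE query port is the default 9030 and the root user has no password.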
Check the files on MinIO: the backup succeeded.
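Besides the MinIO console, the bucket contents can be checked from a script. The sketch below lists every object in the `starrocks` bucket and sums the sizes. It assumes `boto3` is installed and uses placeholder credentials; the helper names are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def summarize_objects(objects: Iterable[Dict]) -> Tuple[int, int]:
    """Return (object count, total size in bytes) for S3 listing entries."""
    count = 0
    total = 0
    for obj in objects:
        count += 1
        total += obj["Size"]
    return count, total


def list_bucket(bucket: str, endpoint_url: str,
                access_key: str, secret_key: str) -> List[Dict]:
    """Collect all objects in a bucket, following pagination."""
    # boto3 is imported lazily so summarize_objects works without it.
    import boto3
    from botocore.config import Config

    s3 = boto3.client(
        "s3",
        endpoint_url=endpoint_url,            # e.g. "http://127.0.0.1:9000"
        aws_access_key_id=access_key,         # placeholder, use your own key
        aws_secret_access_key=secret_key,     # placeholder, use your own key
        region_name="us-east-1",
        config=Config(s3={"addressing_style": "path"}),
    )
    objects: List[Dict] = []
    for page in s3.get_paginator("list_objects_v2").paginate(Bucket=bucket):
        objects.extend(page.get("Contents", []))
    return objects
```

After a successful backup, `summarize_objects(list_bucket("starrocks", ...))` should report a non-zero object count alongside the `test.parquet` file uploaded earlier.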