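Contents
Basic HBase operations
hbase shell: enter the HBase shell
status: show cluster status
version: show version information
create: create a table
drop: delete a table
list: list all tables
desc: show the table structure
exists: check whether a table exists
Enabling and disabling tables
Adding, deleting, and modifying column families
Inserting data
Deleting row 2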
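HBase is an open-source Java implementation of Bigtable. It is built on top of HDFS and provides a highly reliable, high-performance, column-oriented, scalable NoSQL database system with real-time reads and writes.
HBase retrieves data only by row key (the primary key) or by a range of row keys, and it supports only single-row transactions.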
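To make the row key access pattern concrete, here is a minimal Python sketch of a store that keeps rows sorted by key and answers only point lookups and range scans. It is a toy model, not HBase code: the class `SortedRows` and its methods are hypothetical names, and the half-open range (start inclusive, stop exclusive) is chosen to mirror the default behaviour of an HBase scan.

```python
import bisect


class SortedRows:
    """Toy model (not the HBase API): rows kept sorted by row key."""

    def __init__(self):
        self._keys = []   # row keys in sorted byte order
        self._rows = {}   # row key -> {"family:qualifier": value}

    def put(self, row_key: bytes, value: dict) -> None:
        # Insert the key in sorted position the first time it is seen.
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
        self._rows[row_key] = value

    def get(self, row_key: bytes):
        # Point lookup by exact row key; None if the row does not exist.
        return self._rows.get(row_key)

    def scan(self, start: bytes, stop: bytes):
        # Range lookup: start is inclusive, stop is exclusive.
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        return [(k, self._rows[k]) for k in self._keys[lo:hi]]


rows = SortedRows()
for key in (b"row3", b"row1", b"row2"):
    rows.put(key, {"hbaseinfo:name": key.decode()})

print(rows.get(b"row2"))
print([k for k, _ in rows.scan(b"row1", b"row3")])
```

There is deliberately no way to look a row up by a column value here; anything other than a key or key-range lookup would require visiting every row.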
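HBase is mainly used to store loosely organized data, both structured and semi-structured.
HBase's query capabilities are simple: it does not support joins or other complex operations, and it does not support complex transactions. Technically, HBase is closer to a data store than to a database.
The only data type HBase supports is byte[].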
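Because everything is stored as raw bytes, the client decides how values are encoded, and keys are ordered byte by byte rather than numerically. The Python sketch below illustrates both points. The helper names `encode_long` and `decode_long` are hypothetical, and the 8-byte big-endian layout is just one common encoding choice, not something the text above prescribes.

```python
import struct


def encode_long(n: int) -> bytes:
    """Encode a signed integer as 8 big-endian bytes."""
    return struct.pack(">q", n)


def decode_long(raw: bytes) -> int:
    """Decode 8 big-endian bytes back into a signed integer."""
    return struct.unpack(">q", raw)[0]


# Text keys sort byte by byte, so b"10" comes before b"2".
text_keys = sorted([b"1", b"2", b"10"])

# Zero-padding the keys to a fixed width restores numeric order.
padded_keys = sorted([b"01", b"02", b"10"])

print(text_keys)                     # [b'1', b'10', b'2']
print(padded_keys)                   # [b'01', b'02', b'10']
print(decode_long(encode_long(20)))  # 20
```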
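Like Hadoop, HBase scales mainly horizontally: adding inexpensive commodity servers increases storage and processing capacity. For example, expanding a cluster from 10 nodes to 20 nodes roughly doubles both storage and processing capacity.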
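Tables in HBase typically have these characteristics:
Large: a table can have billions of rows and millions of columns.
Column-oriented: storage and access control are organized by column (family), and column families are retrieved independently.
Sparse: empty (null) columns take up no storage space, so tables can be designed to be very sparse.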
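The sparse, column-family-oriented layout described above can be sketched in Python as a nested mapping in which only written cells exist. This is an illustrative model only: `SparseTable` and its methods are hypothetical names, and real HBase stores cells in files per column family rather than in a dictionary.

```python
from collections import defaultdict


class SparseTable:
    """Toy model: only cells that were actually written take up space."""

    def __init__(self, families):
        self.families = set(families)   # column families are fixed up front
        self.cells = defaultdict(dict)  # row key -> {"family:qualifier": value}

    def put(self, row: str, column: str, value: str) -> None:
        # The family must exist; the qualifier after ":" can be anything.
        family = column.split(":", 1)[0]
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self.cells[row][column] = value

    def cell_count(self) -> int:
        # Absent columns are simply not stored, so they are not counted.
        return sum(len(columns) for columns in self.cells.values())


table = SparseTable(["hbaseinfo", "specialinfo"])
table.put("1", "hbaseinfo:name", "wang")
table.put("2", "hbaseinfo:age", "20")
print(table.cell_count())  # 2
```

Row "1" has no age and row "2" has no name, yet nothing is stored for the missing columns.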
Basic HBase operations
hbase shell: enter the HBase shell
[hadoop@vm2 ~]$ hbase shell
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
Version 2.4.8, rf844d09157d9dce6c54fcd53975b7a45865ee9ac, Wed Oct 27 08:48:57 PDT 2021
Took 0.0016 seconds
status: show cluster status
hbase:001:0> status
1 active master, 1 backup masters, 3 servers, 0 dead, 0.6667 average load
Took 0.7874 seconds
version: show version information
hbase:002:0> version
2.4.8, rf844d09157d9dce6c54fcd53975b7a45865ee9ac, Wed Oct 27 08:48:57 PDT 2021
Took 0.0018 seconds
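create: create a table
Command format 1: create 'table name', 'column family 1', 'column family 2', …
You do not have to declare every column family when creating a table (more can be added later with alter), and individual columns are never declared up front. Empty (null) columns take up no storage space, so tables can be designed to be very sparse.
# Create a table named test with two column families, hbaseinfo and specialinfo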
hbase:004:0> create 'test','hbaseinfo','specialinfo'
Created table test
Took 1.3269 seconds
=> Hbase::Table - test
drop: delete a table
# A table must be disabled before it can be dropped
hbase:005:0> disable 'test'
Took 1.0000 seconds
# Drop the table
hbase:006:0> drop 'test'
Took 0.7162 seconds
list: list all tables
hbase:008:0> list
TABLE
test
1 row(s)
Took 0.0131 seconds
=> ["test"]
desc: show the table structure
hbase:010:0> desc 'test'
Table test is ENABLED
test
COLUMN FAMILIES DESCRIPTION
{NAME => 'hbaseinfo', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'specialinfo', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s)
Quota is disabled
Took 0.1308 seconds
exists: check whether a table exists
hbase:011:0> exists 'test'
Table test does exist
Took 0.0172 seconds
=> true
Enabling and disabling tables
# Disable the table
hbase:013:0> disable 'test'
Took 0.6758 seconds
# Check whether the table is disabled
hbase:014:0> is_disabled 'test'
true
Took 0.0222 seconds
=> true
# Enable the table
hbase:015:0> enable 'test'
Took 0.6949 seconds
# Check whether the table is enabled
hbase:016:0> is_enabled 'test'
true
Took 0.0168 seconds
=> true
Adding, deleting, and modifying column families
# Add a column family
hbase:018:0> alter 'test','name'
Updating all regions with the new schema...
1/1 regions updated.
Done.
Took 2.0527 seconds
# Delete a column family
hbase:019:0> alter 'test',{NAME=>'specialinfo',METHOD=>'delete'}
Updating all regions with the new schema...
1/1 regions updated.
Done.
Took 1.8760 seconds
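# Change the number of versions a column family stores
# By default a column family keeps only one version of each cell; to keep multiple versions, change the column family's VERSIONS attribute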
hbase:021:0> alter 'test',{NAME=>'hbaseinfo',VERSIONS=>3}
Updating all regions with the new schema...
1/1 regions updated.
Done.
Took 1.8611 seconds
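What the VERSIONS attribute means for a single cell can be sketched in Python as follows. This is a simplified model under stated assumptions: `VersionedCell` is a hypothetical class, timestamps are plain integers, and old versions are dropped immediately on write, whereas a real cluster hides surplus versions on read and physically removes them later.

```python
class VersionedCell:
    """Toy model of one cell that keeps at most max_versions values."""

    def __init__(self, max_versions: int = 1):
        self.max_versions = max_versions
        self._versions = []  # (timestamp, value) pairs, newest first

    def put(self, timestamp: int, value: str) -> None:
        # A write with an existing timestamp replaces that version.
        self._versions = [(t, v) for t, v in self._versions if t != timestamp]
        self._versions.append((timestamp, value))
        self._versions.sort(key=lambda tv: tv[0], reverse=True)
        # Keep only the newest max_versions entries.
        del self._versions[self.max_versions:]

    def get(self, versions: int = 1):
        # Return up to `versions` entries, newest first.
        return self._versions[:versions]


cell = VersionedCell(max_versions=3)
for ts, name in [(1, "a"), (2, "b"), (3, "c"), (4, "d")]:
    cell.put(ts, name)

print(cell.get())            # [(4, 'd')]
print(cell.get(versions=3))  # [(4, 'd'), (3, 'c'), (2, 'b')]
```

With the default of one version, the same four writes would leave only the newest value readable.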
Inserting data
Command format: put 'table name', 'row key', 'column family:column name', 'value', [timestamp]
hbase:022:0> put 'test','1','hbaseinfo:name','wang'
Took 0.2073 seconds
hbase:023:0> put 'test','1','hbaseinfo:age','20'
Took 0.0137 seconds
hbase:024:0> put 'test','1','hbaseinfo:birthday','2000-01-01'
Took 0.0106 seconds
hbase:025:0> put 'test','1','hbaseinfo:location','Boston'
Took 0.0191 seconds
Getting a row, a column family within a row, or a single column
# Get all columns of row 1
hbase:026:0> get 'test','1'
COLUMN CELL
hbaseinfo:age timestamp=2023-04-13T02:49:42.913, value=20
hbaseinfo:birthday timestamp=2023-04-13T02:50:07.241, value=2000-01-01
hbaseinfo:location timestamp=2023-04-13T02:52:06.107, value=Boston
hbaseinfo:name timestamp=2023-04-13T02:49:11.233, value=wang
1 row(s)
Took 0.0808 seconds
# Get the data in the hbaseinfo column family of row 1
hbase:027:0> get 'test','1','hbaseinfo'
COLUMN CELL
hbaseinfo:age timestamp=2023-04-13T02:49:42.913, value=20
hbaseinfo:birthday timestamp=2023-04-13T02:50:07.241, value=2000-01-01
hbaseinfo:location timestamp=2023-04-13T02:52:06.107, value=Boston
hbaseinfo:name timestamp=2023-04-13T02:49:11.233, value=wang
1 row(s)
Took 0.0422 seconds
# Get the name column in hbaseinfo for row 1
hbase:028:0> get 'test','1','hbaseinfo:name'
COLUMN CELL
hbaseinfo:name timestamp=2023-04-13T02:49:11.233, value=wang
1 row(s)
Took 0.0215 seconds
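The three forms of get shown above differ only in how much of the row they select. A rough Python sketch of that selection logic follows. The function `get_columns` is a hypothetical helper operating on one row held as a dictionary, and the `name:nick` cell is made-up sample data added so that the family filter has something to exclude.

```python
def get_columns(row_cells: dict, column: str = None) -> dict:
    """Select cells from one row.

    column=None                -> every cell in the row
    column="family"            -> every cell in that column family
    column="family:qualifier"  -> that single cell
    """
    if column is None:
        return dict(row_cells)
    if ":" in column:
        return {c: v for c, v in row_cells.items() if c == column}
    return {c: v for c, v in row_cells.items() if c.split(":", 1)[0] == column}


row1 = {
    "hbaseinfo:age": "20",
    "hbaseinfo:birthday": "2000-01-01",
    "hbaseinfo:location": "Boston",
    "hbaseinfo:name": "wang",
    "name:nick": "w",  # hypothetical cell in the 'name' column family
}

print(get_columns(row1, "hbaseinfo:name"))  # {'hbaseinfo:name': 'wang'}
print(sorted(get_columns(row1, "hbaseinfo")))
```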
Deleting row 2
deleteall 'test','2'
Deleting a specific column of row 2
delete 'test','2','hbaseinfo:name'
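The difference between removing a whole row and removing one column of a row can be sketched in Python as below. This is a toy model: `delete_row` and `delete_cell` are hypothetical helpers working on a plain dictionary, and the cells are removed immediately, whereas HBase itself records a delete marker and cleans up the data later.

```python
def delete_row(table: dict, row: str) -> None:
    """Remove every cell of the row; a missing row is ignored."""
    table.pop(row, None)


def delete_cell(table: dict, row: str, column: str) -> None:
    """Remove one 'family:qualifier' cell from the row."""
    columns = table.get(row)
    if columns is None:
        return
    columns.pop(column, None)
    # A row with no remaining cells no longer exists.
    if not columns:
        del table[row]


table = {
    "1": {"hbaseinfo:name": "wang", "hbaseinfo:age": "20"},
    "2": {"hbaseinfo:name": "li", "hbaseinfo:age": "30"},  # sample data
}

delete_cell(table, "2", "hbaseinfo:name")
print(table["2"])  # {'hbaseinfo:age': '30'}

delete_row(table, "2")
print(sorted(table))  # ['1']
```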
Namespaces
# Create a namespace
hbase:029:0> create_namespace 'wang'
Took 0.4548 seconds
# List all namespaces
hbase:030:0> list_namespace
NAMESPACE
default
hbase
wang
3 row(s)
Took 0.0567 seconds
# Create a table in a specific namespace
hbase:031:0> create 'wang:company','abt'
Created table wang:company
Took 1.1369 seconds
=> Hbase::Table - wang:company
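A table name such as wang:company combines a namespace and a table name, while a bare name such as test lives in the default namespace listed above. The small Python helper below shows that naming rule; `split_table_name` is a hypothetical function written for illustration and is not part of any HBase client.

```python
def split_table_name(full_name: str):
    """Split 'namespace:table' into its parts.

    A name without ':' is treated as belonging to the 'default' namespace.
    """
    namespace, separator, table = full_name.partition(":")
    if not separator:
        return "default", full_name
    return namespace, table


print(split_table_name("wang:company"))  # ('wang', 'company')
print(split_table_name("test"))          # ('default', 'test')
```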