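Adding one value at a time hardly counts as big data development, and real projects inevitably involve bulk data operations.
Batch operations in Java come down to adding data to Put objects in a loop and then submitting them through a Table object.
So how do we perform a batch operation? The first thing that comes to mind is probably a loop.
Indeed, a loop is enough to add multiple values. For example: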
Table tableStep3 = connection.getTable(tableStep3Name);
// Add four columns to a single row in a loop
byte[] row = Bytes.toBytes("20001");
Put put = new Put(row);
for (int i = 1; i <= 4; i++) {
    byte[] columnFamily = Bytes.toBytes("data");
    byte[] qualifier = Bytes.toBytes(String.valueOf(i));
    byte[] value = Bytes.toBytes("value" + i);
    put.addColumn(columnFamily, qualifier, value);
}
tableStep3.put(put);
Running this code adds four columns to the same row, 20001.
How should we add multiple rows? As you have probably guessed: use a collection!
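The loop above builds each qualifier with String.valueOf(i) before converting it to bytes, so the qualifier is stored as text rather than as a number. The following is a minimal plain-JDK sketch of that difference. The class ByteEncodingSketch and its two helpers are hypothetical stand-ins, written on the assumption that Bytes.toBytes(String) produces UTF-8 bytes and Bytes.toBytes(int) produces four big-endian bytes; they are not part of the HBase API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class; plain-JDK stand-ins, not HBase code.
final class ByteEncodingSketch {

    // Assumed equivalent of Bytes.toBytes(String): the UTF-8 bytes of the text.
    static byte[] utf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Assumed equivalent of Bytes.toBytes(int): 4 bytes in big-endian order.
    static byte[] bigEndianInt(int number) {
        return ByteBuffer.allocate(4).putInt(number).array();
    }

    public static void main(String[] args) {
        for (int i = 1; i <= 4; i++) {
            byte[] asText = utf8(String.valueOf(i));
            byte[] asInt = bigEndianInt(i);
            System.out.println(i + " as text: " + Arrays.toString(asText)
                    + ", as int: " + Arrays.toString(asInt));
        }
    }
}
```

Under these assumptions, a qualifier written as text must also be read back as text: looking it up with the integer encoding would address a different column.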
List<Put> puts = new ArrayList<>();
// Build one Put per row and collect them in a list
for (int i = 1; i <= 4; i++) {
    byte[] row = Bytes.toBytes("row" + i);
    Put put = new Put(row);
    byte[] columnFamily = Bytes.toBytes("data");
    byte[] qualifier = Bytes.toBytes(String.valueOf(i));
    byte[] value = Bytes.toBytes("value" + i);
    put.addColumn(columnFamily, qualifier, value);
    puts.add(put);
}
Table table = connection.getTable(tableName);
table.put(puts);
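The code above adds four rows to HBase. Compared with the previous exercise, it shows that the put() method of the Table object is overloaded: it accepts either a single Put object or a list of Put objects.
After the data is added, the table holds rows row1 through row4, each with one column in the data column family.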
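When the list grows well beyond four entries, one option is to submit it in fixed-size chunks instead of one very large call. The sketch below shows only the splitting step in plain Java, with strings standing in for Put objects. The class BatchSplitSketch, the method partition, and the batch size of 4 are hypothetical names and values chosen for illustration, not something the exercise requires.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; strings stand in for Put objects.
final class BatchSplitSketch {

    // Splits items into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> rowKeys = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            rowKeys.add("row" + i);
        }
        for (List<String> batch : partition(rowKeys, 4)) {
            // With a real client, each batch of Put objects would be
            // passed to table.put(batch) at this point.
            System.out.println(batch);
        }
    }
}
```

With ten row keys and a batch size of 4, the loop sees three batches: two of four elements and a final one of two.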
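Programming Requirements
Now it is your turn. In the editor on the right, between the Begin and End markers, write Java code that adds the following data to the HBase table stu (you need to create the table yourself):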
Table | Row key | Column family:qualifier | Value |
---|---|---|---|
stu | 20181122 | basic_info:name | 阿克蒙德 |
stu | 20181122 | basic_info:gender | male |
stu | 20181122 | basic_info:birthday | 1987-05-23 |
stu | 20181122 | basic_info:connect | tel:13974036666 |
stu | 20181122 | basic_info:address | HuNan-ChangSha |
stu | 20181122 | school_info:college | ChengXing |
stu | 20181122 | school_info:class | class 1 grade 2 |
stu | 20181122 | school_info:object | Software |
stu | 20181123 | basic_info:name | 萨格拉斯 |
stu | 20181123 | basic_info:gender | male |
stu | 20181123 | basic_info:birthday | 1986-05-23 |
stu | 20181123 | basic_info:connect | tel:18774036666 |
stu | 20181123 | basic_info:address | HuNan-ChangSha |
stu | 20181123 | school_info:college | ChengXing |
stu | 20181123 | school_info:class | class 2 grade 2 |
stu | 20181123 | school_info:object | Software |
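Notice that there are two column families here. How do we add more than one?
The setColumnFamily(family) method we used earlier when creating a table can be called multiple times, once for each column family.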
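Before writing the HBase calls, it can help to model one row of the target table in plain Java and check that every family:qualifier pair is produced. The sketch below is a stand-in only: the class RowLayoutSketch and the method addFamily are hypothetical, and a map keyed by "family:qualifier" replaces the Put object. The point it illustrates is that the inner loop runs over the qualifiers of a family, not over the number of rows.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class; a map stands in for a Put object.
final class RowLayoutSketch {

    // Adds one entry per qualifier of the given family to the row.
    static void addFamily(Map<String, String> row, String family,
                          String[] qualifiers, String[] values) {
        if (qualifiers.length != values.length) {
            throw new IllegalArgumentException("qualifiers and values differ in length");
        }
        // Loop bound is the number of qualifiers in this family.
        for (int i = 0; i < qualifiers.length; i++) {
            row.put(family + ":" + qualifiers[i], values[i]);
        }
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        addFamily(row, "basic_info",
                new String[] {"gender", "birthday", "address"},
                new String[] {"male", "1987-05-23", "HuNan-ChangSha"});
        addFamily(row, "school_info",
                new String[] {"college", "class", "object"},
                new String[] {"ChengXing", "class 1 grade 2", "Software"});
        row.forEach((column, value) -> System.out.println(column + " = " + value));
    }
}
```

In the real solution each row.put(...) call corresponds to one put.addColumn(family, qualifier, value) call on the Put for that row key.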
package step3;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableDescriptors;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptor;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.client.TableDescriptor;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
import org.apache.hadoop.hbase.util.Bytes;
public class Task {
    public void batchPut() throws Exception {
        /********* Begin *********/
        // HBaseConfiguration.create() also loads hbase-default.xml and hbase-site.xml
        Configuration config = HBaseConfiguration.create();
        Connection conn = ConnectionFactory.createConnection(config);
        Admin admin = conn.getAdmin();
        // Create the stu table with two column families
        TableName tableName = TableName.valueOf(Bytes.toBytes("stu"));
        TableDescriptorBuilder builder = TableDescriptorBuilder.newBuilder(tableName);
        ColumnFamilyDescriptor family = ColumnFamilyDescriptorBuilder.newBuilder(Bytes.toBytes("basic_info")).build();
        ColumnFamilyDescriptor family2 = ColumnFamilyDescriptorBuilder.newBuilder(Bytes.toBytes("school_info")).build();
        builder.setColumnFamily(family);
        builder.setColumnFamily(family2);
        admin.createTable(builder.build());
        List<Put> puts = new ArrayList<>();
        String[] rows = {"20181122", "20181123"};
        // Values follow the table in the requirements, one inner array per row key
        String[][] basic_infos = {
            {"阿克蒙德", "male", "1987-05-23", "tel:13974036666", "HuNan-ChangSha"},
            {"萨格拉斯", "male", "1986-05-23", "tel:18774036666", "HuNan-ChangSha"}
        };
        String[] basic_columns = {"name", "gender", "birthday", "connect", "address"};
        String[][] school_infos = {
            {"ChengXing", "class 1 grade 2", "Software"},
            {"ChengXing", "class 2 grade 2", "Software"}
        };
        String[] school_columns = {"college", "class", "object"};
        for (int x = 0; x < rows.length; x++) {
            // One Put per row key
            Put put = new Put(Bytes.toBytes(rows[x]));
            // Iterate over the qualifiers of the family, not over the number of rows
            for (int i = 0; i < basic_columns.length; i++) {
                byte[] columnFamily = Bytes.toBytes("basic_info");
                byte[] qualifier = Bytes.toBytes(basic_columns[i]);
                byte[] value = Bytes.toBytes(basic_infos[x][i]);
                put.addColumn(columnFamily, qualifier, value);
            }
            for (int i = 0; i < school_columns.length; i++) {
                byte[] columnFamily = Bytes.toBytes("school_info");
                byte[] qualifier = Bytes.toBytes(school_columns[i]);
                byte[] value = Bytes.toBytes(school_infos[x][i]);
                put.addColumn(columnFamily, qualifier, value);
            }
            puts.add(put);
        }
        // Submit both rows in a single call
        Table table = conn.getTable(tableName);
        table.put(puts);
        // Release client resources
        table.close();
        admin.close();
        conn.close();
        /********* End *********/
    }
}