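Background
HBase data is distributed across multiple machines running the RegionServer role. Each RegionServer hosts one or more Regions, and each Region manages the data for a distinct rowkey range. Choosing sensible split points and a sensible number of Regions before creating a table therefore avoids read/write hotspots and makes full use of every RegionServer's resources. vmaster-hbase provides a pre-splitting feature for this purpose.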
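Manual pre-splitting
The user supplies the split points, based on the characteristics of the data and the number of machines in the resource group.
1.1 Split points are strings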
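As a rough illustration of how string split points partition the rowkey space, the sketch below maps a rowkey to a region index by comparing unsigned bytes lexicographically, which is how HBase orders rowkeys. The class `StringSplitSketch`, the method `regionIndex`, and the split points "g", "n", "t" are hypothetical and not part of HBase or vmaster-hbase.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for illustration only; not an HBase or vmaster-hbase API.
class StringSplitSketch {
    /** Returns the index of the region that would hold rowKey, given sorted split points. */
    static int regionIndex(String[] splitPoints, String rowKey) {
        byte[] key = rowKey.getBytes(StandardCharsets.UTF_8);
        int index = 0;
        for (String split : splitPoints) {
            byte[] boundary = split.getBytes(StandardCharsets.UTF_8);
            // A split point is the inclusive start key of the next region.
            if (Arrays.compareUnsigned(key, boundary) >= 0) {
                index++;
            }
        }
        return index;
    }

    public static void main(String[] args) {
        String[] splits = {"g", "n", "t"}; // assumed split points: 3 points give 4 regions
        System.out.println(regionIndex(splits, "apple")); // 0
        System.out.println(regionIndex(splits, "golf"));  // 1
        System.out.println(regionIndex(splits, "zebra")); // 3
    }
}
```

With n split points the table starts with n + 1 regions, so the split points should follow the real distribution of rowkey prefixes rather than the alphabet alone.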
   

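1.2 Split points are integers
HBase stores everything as raw bytes, so every Int split point must be passed in as its hexadecimal representation. For example, given the split points 1, 10, and 15, each of type Int, Bytes.toHex(Bytes.toBytes(splitPoint)) yields the hex digits of the 4-byte encoding (for example 0000000a for 10). Section 1.2.1 lists them in \x-escaped form.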
1.2.1 Hexadecimal representation of the split points
| Int | Hexadecimal representation | 
|---|---|
| 1 | \x00\x00\x00\x01 | 
| 10 | \x00\x00\x00\x0a | 
| 15 | \x00\x00\x00\x0f | 
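The table can be reproduced without an HBase dependency. The following sketch assumes the 4-byte big-endian layout that Bytes.toBytes(int) uses; the class `IntSplitPointSketch` and its methods are hypothetical names introduced here for illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only; it mimics the table with plain JDK classes.
class IntSplitPointSketch {
    /** Encodes an int as 4 big-endian bytes (ByteBuffer's default byte order). */
    static byte[] toBytes(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    /** Renders every byte as \xNN, matching the notation used in the table. */
    static String toEscapedHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("\\x%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (int splitPoint : new int[] {1, 10, 15}) {
            System.out.println(splitPoint + " -> " + toEscapedHex(toBytes(splitPoint)));
        }
    }
}
```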
1.2.2 Split point test

 
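Automatic pre-splitting
2.1 HexStringSplit
Choose the number of regions according to the number of machines; 20 to 30 regions per machine is recommended.
This algorithm is recommended when the rowkey is an integer. HexStringSplit divides the full unsigned integer range 00000000~FFFFFFFF evenly by the number of regions, converts each split point to a hexadecimal string, left-pads it with '0' when it is shorter than 8 characters, and calls Bytes.toBytes(bigIntegerString) to turn it into a byte array. Note that the resulting split points are the ASCII bytes of the hex string, not the 4-byte binary encoding described in 1.2, so the rowkeys should be written in the same hex-string form. The core code is as follows:
2.1.1 Rowkey range splitting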
```java
public byte[][] split(int n) {
  Preconditions.checkArgument(lastRowInt.compareTo(firstRowInt) > 0,
      "last row (%s) is configured less than first row (%s)", lastRow,
      firstRow);
  // +1 to range because the last row is inclusive
  BigInteger range = lastRowInt.subtract(firstRowInt).add(BigInteger.ONE);
  Preconditions.checkState(range.compareTo(BigInteger.valueOf(n)) >= 0,
      "split granularity (%s) is greater than the range (%s)", n, range);
  BigInteger[] splits = new BigInteger[n - 1];
  BigInteger sizeOfEachSplit = range.divide(BigInteger.valueOf(n));
  for (int i = 1; i < n; i++) {
    // NOTE: this means the last region gets all the slop.
    // This is not a big deal if we're assuming n << MAXHEX
    splits[i - 1] = firstRowInt.add(sizeOfEachSplit.multiply(BigInteger
        .valueOf(i)));
  }
  return convertToBytes(splits);
}
```
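To make the arithmetic concrete, here is a minimal standalone sketch of the same range division using only `java.math.BigInteger`. The class `HexRangeSplitSketch` and the method `splitPoints` are hypothetical names, and the precondition checks of the original are left out for brevity.

```java
import java.math.BigInteger;

// Hypothetical standalone sketch of the range arithmetic; not the HBase class itself.
class HexRangeSplitSketch {
    /** Returns the n - 1 split points that divide [firstRow, lastRow] (hex strings) into n regions. */
    static BigInteger[] splitPoints(String firstRow, String lastRow, int n) {
        BigInteger first = new BigInteger(firstRow, 16);
        BigInteger last = new BigInteger(lastRow, 16);
        // +1 because the last row is inclusive
        BigInteger range = last.subtract(first).add(BigInteger.ONE);
        BigInteger sizeOfEachSplit = range.divide(BigInteger.valueOf(n));
        BigInteger[] splits = new BigInteger[n - 1];
        for (int i = 1; i < n; i++) {
            splits[i - 1] = first.add(sizeOfEachSplit.multiply(BigInteger.valueOf(i)));
        }
        return splits;
    }

    public static void main(String[] args) {
        // 4 regions over 00000000~FFFFFFFF: prints 40000000, 80000000, c0000000
        for (BigInteger split : splitPoints("00000000", "FFFFFFFF", 4)) {
            System.out.println(split.toString(16));
        }
    }
}
```

When the range is not divisible by n, the remainder lands in the last region. For example, splitting 0~9 into 3 regions gives the split points 3 and 6, so the last region covers 6 through 9.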
2.1.2 Converting split points to byte arrays
```java
/**
 * Returns the bytes corresponding to the BigInteger
 *
 * @param bigInteger number to convert
 * @param pad padding length
 * @return byte corresponding to input BigInteger
 */
public static byte[] convertToByte(BigInteger bigInteger, int pad) {
  String bigIntegerString = bigInteger.toString(16);
  bigIntegerString = StringUtils.leftPad(bigIntegerString, pad, '0');
  return Bytes.toBytes(bigIntegerString);
}
```
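The left padding matters because HBase compares rowkeys byte by byte, so hex strings of different lengths would not sort in numeric order. The sketch below shows this with plain JDK calls; `HexPadSketch` is a hypothetical class, and the padding loop stands in for `StringUtils.leftPad`.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch; the loop below stands in for StringUtils.leftPad.
class HexPadSketch {
    /** Converts value to a hex string left-padded with '0' to pad characters, then to bytes. */
    static byte[] convertToByte(BigInteger value, int pad) {
        String hex = value.toString(16);
        StringBuilder sb = new StringBuilder();
        for (int i = hex.length(); i < pad; i++) {
            sb.append('0');
        }
        return sb.append(hex).toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] raw255 = "ff".getBytes(StandardCharsets.UTF_8);
        byte[] raw256 = "100".getBytes(StandardCharsets.UTF_8);
        // Unpadded: "ff" sorts after "100", which contradicts numeric order.
        System.out.println(Arrays.compareUnsigned(raw255, raw256) > 0); // true

        byte[] padded255 = convertToByte(BigInteger.valueOf(255), 8);   // "000000ff"
        byte[] padded256 = convertToByte(BigInteger.valueOf(256), 8);   // "00000100"
        // Padded: byte order now agrees with numeric order.
        System.out.println(Arrays.compareUnsigned(padded255, padded256) < 0); // true
    }
}
```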
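2.2 UniformSplit
Choose the number of regions according to the number of machines; 20 to 30 regions per machine is recommended.
Use this algorithm when the rowkey is a raw byte array (byte[]) whose bytes range over \x00~\xff and whose values are close to uniformly random, such as hashes. UniformSplit converts the split points with BigInteger's toByteArray().
2.2.1 Split point algorithm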
```java
/**
 * Iterate over keys within the passed range.
 */
public static Iterable<byte[]> iterateOnSplits(
    final byte[] a, final byte[] b, boolean inclusive, final int num)
{
  byte[] aPadded;
  byte[] bPadded;
  if (a.length < b.length) {
    aPadded = padTail(a, b.length - a.length);
    bPadded = b;
  } else if (b.length < a.length) {
    aPadded = a;
    bPadded = padTail(b, a.length - b.length);
  } else {
    aPadded = a;
    bPadded = b;
  }
  if (compareTo(aPadded, bPadded) >= 0) {
    throw new IllegalArgumentException("b <= a");
  }
  if (num <= 0) {
    throw new IllegalArgumentException("num cannot be <= 0");
  }
  byte[] prependHeader = {1, 0};
  final BigInteger startBI = new BigInteger(add(prependHeader, aPadded));
  final BigInteger stopBI = new BigInteger(add(prependHeader, bPadded));
  BigInteger diffBI = stopBI.subtract(startBI);
  if (inclusive) {
    diffBI = diffBI.add(BigInteger.ONE);
  }
  final BigInteger splitsBI = BigInteger.valueOf(num + 1);
  // when diffBI < splitBI, use an additional byte to increase diffBI
  if (diffBI.compareTo(splitsBI) < 0) {
    byte[] aPaddedAdditional = new byte[aPadded.length + 1];
    byte[] bPaddedAdditional = new byte[bPadded.length + 1];
    for (int i = 0; i < aPadded.length; i++) {
      aPaddedAdditional[i] = aPadded[i];
    }
    for (int j = 0; j < bPadded.length; j++) {
      bPaddedAdditional[j] = bPadded[j];
    }
    aPaddedAdditional[aPadded.length] = 0;
    bPaddedAdditional[bPadded.length] = 0;
    return iterateOnSplits(aPaddedAdditional, bPaddedAdditional, inclusive, num);
  }
  final BigInteger intervalBI;
  try {
    intervalBI = diffBI.divide(splitsBI);
  } catch (Exception e) {
    LOG.error("Exception caught during division", e);
    return null;
  }
  final Iterator<byte[]> iterator = new Iterator<byte[]>() {
    private int i = -1;
    @Override
    public boolean hasNext() {
      return i < num + 1;
    }
    @Override
    public byte[] next() {
      i++;
      if (i == 0) return a;
      if (i == num + 1) return b;
      BigInteger curBI = startBI.add(intervalBI.multiply(BigInteger.valueOf(i)));
      byte[] padded = curBI.toByteArray();
      if (padded[1] == 0)
        padded = tail(padded, padded.length - 2);
      else
        padded = tail(padded, padded.length - 1);
      return padded;
    }
    @Override
    public void remove() {
      throw new UnsupportedOperationException();
    }
  };
  return new Iterable<byte[]>() {
    @Override
    public Iterator<byte[]> iterator() {
      return iterator;
    }
  };
}
```
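The core idea of the listing can be sketched in a much smaller form. The version below is a simplification under stated assumptions, not the HBase implementation: it requires start and stop keys of equal length, treats the range as exclusive, and uses the `BigInteger(int signum, byte[])` constructor instead of the `{1, 0}` header trick to keep the values non-negative. `UniformSplitSketch` and its methods are hypothetical names.

```java
import java.math.BigInteger;

// Simplified, hypothetical sketch of uniform byte-range splitting; not the HBase code.
class UniformSplitSketch {
    /** Returns num evenly spaced keys strictly between start and stop (equal-length keys). */
    static byte[][] split(byte[] start, byte[] stop, int num) {
        if (start.length != stop.length) {
            throw new IllegalArgumentException("this sketch assumes equal-length keys");
        }
        BigInteger a = new BigInteger(1, start); // signum 1: read bytes as unsigned
        BigInteger b = new BigInteger(1, stop);
        if (a.compareTo(b) >= 0) {
            throw new IllegalArgumentException("b <= a");
        }
        BigInteger interval = b.subtract(a).divide(BigInteger.valueOf(num + 1));
        byte[][] result = new byte[num][];
        for (int i = 1; i <= num; i++) {
            BigInteger cur = a.add(interval.multiply(BigInteger.valueOf(i)));
            result[i - 1] = toFixedWidth(cur, start.length);
        }
        return result;
    }

    /** Drops a leading sign byte or left-pads with zeros so the key has exactly width bytes. */
    static byte[] toFixedWidth(BigInteger value, int width) {
        byte[] raw = value.toByteArray();
        byte[] out = new byte[width];
        int copy = Math.min(raw.length, width);
        System.arraycopy(raw, raw.length - copy, out, width - copy, copy);
        return out;
    }

    public static void main(String[] args) {
        // 3 split points between \x00 and \xff: prints 3f, 7e, bd
        for (byte[] key : split(new byte[] {0x00}, new byte[] {(byte) 0xff}, 3)) {
            System.out.println(String.format("%02x", key[0] & 0xff));
        }
    }
}
```

The fixed-width conversion plays the role of the `tail(...)` calls in the listing: `toByteArray()` may prepend a sign byte, which must be stripped before the result is used as a rowkey boundary.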