Let's look at some of CopyTable's usage parameters:
- Usage: CopyTable [general options] [--starttime=X] [--endtime=Y] [--new.name=NEW] [--peer.adr=ADR] <tablename>
- Options:
- rs.class hbase.regionserver.class of the peer cluster
- specify if different from current cluster
- rs.impl hbase.regionserver.impl of the peer cluster
- startrow the start row
- stoprow the stop row
- starttime beginning of the time range (unixtime in millis)
- without endtime means from starttime to forever
- endtime end of the time range. Ignored if no starttime specified.
- versions number of cell versions to copy
- new.name new table's name
- peer.adr Address of the peer cluster given in the format
- hbase.zookeeper.quorum:hbase.zookeeper.client.port:zookeeper.znode.parent
- families comma-separated list of families to copy
- To copy from cf1 to cf2, give sourceCfName:destCfName.
- To keep the same name, just give "cfName"
- all.cells also copy delete markers and deleted cells
- Args:
- tablename Name of the table to copy
- Examples:
- To copy 'TestTable' to a cluster that uses replication for a 1 hour window:
- $ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=1265875194289 --endtime=1265878794289 --peer.adr=server1,server2,server3:2181:/hbase --families=myOldCf:myNewCf,cf2,cf3 TestTable
- For performance consider the following general options:
- -Dhbase.client.scanner.caching=100
- -Dmapred.map.tasks.speculative.execution=false
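As the parameters above show, CopyTable lets you restrict the copy to a time range, a number of cell versions, specific column families, and a start/stop row key range, and lets you set the address of the peer cluster. The options are quite flexible.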
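As a rough illustration of combining these options, the sketch below wraps CopyTable in a small helper and passes a row-key range, a version count, and a family mapping. The table names, row keys, and family names are made-up placeholders, and the `copytable` helper and `RUN` variable are not part of HBase. The helper only prints the command unless `RUN=1` is set, because actually running it assumes an `hbase` launcher on the PATH and a destination table that already exists.

```shell
#!/bin/sh
# Sketch: wrap CopyTable so the full command is printed by default.
# Set RUN=1 to actually launch the job (assumes `hbase` is on PATH).
copytable() {
  if [ "${RUN:-0}" = "1" ]; then
    hbase org.apache.hadoop.hbase.mapreduce.CopyTable "$@"
  else
    printf '%s\n' "hbase org.apache.hadoop.hbase.mapreduce.CopyTable $*"
  fi
}

# Hypothetical values: copy part of TestTable into TestTableCopy,
# limited to a row-key range, 3 versions per cell, and two families
# (myOldCf is renamed to myNewCf, cf2 keeps its name).
copytable \
  -Dhbase.client.scanner.caching=100 \
  -Dmapred.map.tasks.speculative.execution=false \
  --startrow=row0100 \
  --stoprow=row0200 \
  --versions=3 \
  --families=myOldCf:myNewCf,cf2 \
  --new.name=TestTableCopy \
  TestTable
```

Printing first makes it easy to review the assembled options before starting a MapReduce job against a live table.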
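CopyTable supports the following scenarios:
1. Deep table copy: this works like a snapshot, except that the copy contains the original table's actual data. Releases before 0.94.x do not support the snapshot command, so CopyTable is a way to get a full copy of the original table there. Usage: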
- create 'table_snapshot',{NAME=>"i"}
- hbase org.apache.hadoop.hbase.mapreduce.CopyTable --new.name=tableCopy table_snapshot
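2. Copy between clusters: synchronize a table's data from one cluster to another, one table at a time. Usage:
- create 'table_test',{NAME=>"i"} # first create a table on the destination cluster with the same schema as the source table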
- hbase org.apache.hadoop.hbase.mapreduce.CopyTable --peer.adr=zk-addr1,zk-addr2,zk-addr3:2181:/hbase table_test
3. Incremental backup: back up a table's data incrementally. The options accept a time range, so you can specify the window to back up. Usage:
- hbase org.apache.hadoop.hbase.mapreduce.CopyTable ... --starttime=start_timestamp --endtime=end_timestamp
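The incremental and cross-cluster scenarios can be combined. The sketch below computes a one-hour window in epoch milliseconds and assembles the `--peer.adr` value in the quorum:port:znode format described in the usage text. The helper names `peer_adr` and `time_window_ms` are hypothetical, as are the ZooKeeper host names; the final command is only printed, not executed.

```shell
#!/bin/sh
# Sketch: incremental copy of the last hour to a peer cluster.
# Host names, port, znode, and table name are hypothetical.

# Join quorum, client port, and parent znode into the --peer.adr format.
peer_adr() {
  printf '%s:%s:%s\n' "$1" "$2" "$3"
}

# Print "<start_ms> <end_ms>" for a window of $1 seconds
# ending at $2 (epoch seconds).
time_window_ms() {
  echo "$(( ($2 - $1) * 1000 )) $(( $2 * 1000 ))"
}

NOW_S=$(date +%s)
# Split the two printed numbers into positional parameters.
set -- $(time_window_ms 3600 "$NOW_S")
START_MS=$1
END_MS=$2
PEER=$(peer_adr "zk-addr1,zk-addr2,zk-addr3" 2181 /hbase)

# Printed for review rather than executed.
echo "hbase org.apache.hadoop.hbase.mapreduce.CopyTable" \
  "--starttime=$START_MS --endtime=$END_MS --peer.adr=$PEER table_test"
```

A scheduled job could record `END_MS` after each successful run and reuse it as the next `--starttime`, so that consecutive windows line up without gaps.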