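Reposted from: 老大的博客
A time synchronization problem brought down the entire HBase cluster. After some repair work the cluster is running again. The repair went roughly as follows:
hdfs fsck / -delete    Delete the files whose blocks are missing
hadoop dfsadmin -report    Check the state of the Hadoop cluster
zkCli.sh    Log in to ZooKeeper
rmr /hbase    Delete the /hbase znode
Stop all regionservers
Run hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair to repair the metadata; once the repair finishes, start the master
Start the regionservers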
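The sequence above can be sketched as a small shell script. This is only a sketch under assumptions that are not in the original post: it assumes the daemons are managed with `hbase-daemon.sh` (which acts on the local node only, so stopping all regionservers means running it on every regionserver host), that `zkCli.sh` accepts the command on its command line, and that the ZooKeeper address `localhost:2181` is a placeholder. Newer ZooKeeper releases deprecate `rmr` in favor of `deleteall`. The script defaults to a dry run that only prints each command, because `hdfs fsck / -delete` and `rmr /hbase` are destructive.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 only after reviewing every step for your cluster.
DRY_RUN="${DRY_RUN:-1}"
ZK_SERVER="${ZK_SERVER:-localhost:2181}"   # placeholder address (assumption)

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Same order as the steps in the post; stops at the first failing step.
recover_hbase() {
  run hdfs fsck / -delete &&
  run hadoop dfsadmin -report &&
  run zkCli.sh -server "$ZK_SERVER" rmr /hbase &&
  run hbase-daemon.sh stop regionserver &&    # local node only (assumption)
  run hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair &&
  run hbase-daemon.sh start master &&
  run hbase-daemon.sh start regionserver      # local node only (assumption)
}

recover_hbase
```

Chaining the steps with `&&` means a failed repair step prevents the master and regionservers from being started on top of a half-repaired state.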
At this point the HBase cluster starts and the HBase Shell works normally, but any scan of table data fails, as shown below:
hbase(main):010:0> scan 'ApplicationIndex'
ROW COLUMN+CELL
ERROR: No server address listed in hbase:meta for region ApplicationIndex,,1517480748304.31070658ad5552726d4b24c47f47a727. containing row
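The error suggests that the table exists but its region has not been assigned, so hbase:meta holds no server information for it. The following command checks whether that is the case: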
hbase(main):011:0> scan 'hbase:meta' , {LIMIT=>10,FILTER=>"PrefixFilter('ApplicationIndex')"}
ROW COLUMN+CELL
ApplicationIndex,,1517480748304.31070658ad5552726d4b24c47f47a727. column=info:regioninfo, timestamp=1523277765889, value={ENCODED => 31070658ad5552726d4b24c47f47a727, NAME => 'ApplicationIndex,,1517480748304.31070658ad5552726d4b24c47f47a727.', STARTKEY => '', ENDKEY => ''}
1 row(s) in 0.0360 seconds
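Only info:regioninfo is present for this region; there is no server record. For comparison, the hbase:namespace regions do carry one:
hbase(main):013:0> scan 'hbase:meta' , {LIMIT=>10,FILTER=>"PrefixFilter('hbase')"}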
ROW COLUMN+CELL
hbase:namespace,,1517480288768.ab4a22852177a867ae631aa109fe2b81. column=info:regioninfo, timestamp=1523278150687, value={ENCODED => ab4a22852177a867ae631aa109fe2b81, NAME => 'hbase:namespace,,1517480288768.ab4a22852177a867ae631aa109fe2b81.', STARTKEY => '', ENDKEY => ''}
hbase:namespace,,1517480288768.ab4a22852177a867ae631aa109fe2b81. column=info:seqnumDuringOpen, timestamp=1523278150687, value=\x00\x00\x00\x00\x00\x00\x00\x05
hbase:namespace,,1517480288768.ab4a22852177a867ae631aa109fe2b81. column=info:server, timestamp=1523278150687, value=OP-APM-08:16020
hbase:namespace,,1517480288768.ab4a22852177a867ae631aa109fe2b81. column=info:serverstartcode, timestamp=1523278150687, value=1523278113491
hbase:namespace,,1523276566308.73f980027b0b2d3badb4627dc1fc5c67. column=info:regioninfo, timestamp=1523278150793, value={ENCODED => 73f980027b0b2d3badb4627dc1fc5c67, NAME => 'hbase:namespace,,1523276566308.73f980027b0b2d3badb4627dc1fc5c67.', STARTKEY => '', ENDKEY => ''}
hbase:namespace,,1523276566308.73f980027b0b2d3badb4627dc1fc5c67. column=info:seqnumDuringOpen, timestamp=1523278150793, value=\x00\x00\x00\x00\x00\x00\x00\x0C
hbase:namespace,,1523276566308.73f980027b0b2d3badb4627dc1fc5c67. column=info:server, timestamp=1523278150793, value=OP-APM-06:16020
hbase:namespace,,1523276566308.73f980027b0b2d3badb4627dc1fc5c67. column=info:serverstartcode, timestamp=1523278150793, value=1523277988017
2 row(s) in 0.0160 seconds
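The comparison above (a region that has info:regioninfo but no info:server) can also be checked mechanically. The following is a minimal sketch, not part of the original procedure: the function name `regions_without_server` is hypothetical, it assumes each output line starts with the row key followed by `column=...` as in the listings here, and it assumes row keys contain no whitespace. If the shell wraps long row keys, the printed key is the displayed prefix only.

```shell
#!/bin/sh
# Reads the output of scan 'hbase:meta' on stdin and prints the row keys
# that have an info:regioninfo cell but no info:server cell.
regions_without_server() {
  awk '
    / column=info:regioninfo,/ { seen[$1] = 1 }        # region is known to meta
    / column=info:server,/     { has_server[$1] = 1 }  # region has an assigned server
    END {
      for (r in seen)
        if (!(r in has_server)) print r
    }
  '
}

# Possible usage, assuming non-interactive mode (hbase shell -n) is available:
#   echo "scan 'hbase:meta'" | hbase shell -n | regions_without_server
```

The pattern `info:server,` includes the trailing comma so that info:serverstartcode cells are not mistaken for info:server.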
So the info:server entry is missing for ApplicationIndex. The fix is simple:
First disable the table
hbase(main):014:0> disable 'ApplicationIndex'
Then enable it again
hbase(main):015:0> enable 'ApplicationIndex'
The region is now automatically assigned to a server. Checking the result:
hbase(main):016:0> scan 'hbase:meta' , {LIMIT=>10,FILTER=>"PrefixFilter('ApplicationIndex')"}
ROW COLUMN+CELL
ApplicationIndex,,1517480748304.31070658ad5552726d4b24c47f47a727. column=info:regioninfo, timestamp=1523325348714, value={ENCODED => 31070658ad5552726d4b24c47f47a727, NAME => 'ApplicationIndex,,1517480748304.31070658ad5552726d4b24c47f47a727.', STARTKEY => '', ENDKEY => ''}
ApplicationIndex,,1517480748304.31070658ad5552726d4b24c47f47a727. column=info:seqnumDuringOpen, timestamp=1523325348714, value=\x00\x00\x00\x00\x00\x00\x03\xB8
ApplicationIndex,,1517480748304.31070658ad5552726d4b24c47f47a727. column=info:server, timestamp=1523325348714, value=OP-APM-06:16020
ApplicationIndex,,1517480748304.31070658ad5552726d4b24c47f47a727. column=info:serverstartcode, timestamp=1523325348714, value=1523277988017
1 row(s) in 0.0190 seconds
The table can now be queried.
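If several tables are affected, the disable/enable cycle could be generated rather than typed by hand. This is a small sketch under assumptions: the function name `reassign_cmds` is hypothetical, and piping into `hbase shell -n` assumes the non-interactive mode is available in your HBase version. The block only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Prints a disable/enable pair of HBase Shell commands for each table name given.
reassign_cmds() {
  for t in "$@"; do
    printf "disable '%s'\nenable '%s'\n" "$t" "$t"
  done
}

# Review the generated commands first:
reassign_cmds ApplicationIndex

# Then, if they look right, feed them to the shell, for example:
#   reassign_cmds ApplicationIndex | hbase shell -n
```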