HBase Interview Questions and Answers Part-5
42. Which file in HBase is modeled after the SSTable file of BigTable?
The HFile, which stores the actual data (not metadata) in HBase, is modeled after the SSTable file of BigTable.
43. Why do we pre-create empty regions?
By default, an HBase table is created with a single region. During a bulk import, all clients therefore write to that one region until it grows large enough to split and its daughter regions are distributed across the cluster. Pre-creating empty regions (pre-splitting) spreads the writes across the cluster from the start, which makes the import faster.
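As a rough illustration, the sketch below computes evenly spaced split points for pre-splitting. It assumes the rowkeys are fixed-width lowercase hex strings (for example, hashed keys); the helper name `hex_split_points` and the `key_width` parameter are hypothetical and not part of HBase.

```python
def hex_split_points(num_regions, key_width=8):
    """Return num_regions - 1 evenly spaced split keys.

    Assumption: rowkeys are lowercase hex strings of exactly key_width characters.
    """
    if num_regions < 2:
        return []  # a single region needs no split points
    key_space = 16 ** key_width
    return [
        format(i * key_space // num_regions, "0{}x".format(key_width))
        for i in range(1, num_regions)
    ]


if __name__ == "__main__":
    # Four regions over a two-character hex key space
    print(hex_split_points(4, key_width=2))
```

A list like this could then be supplied as the split keys when the table is created, for example through the `SPLITS` option of the HBase shell `create` command.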
44. What is hotspotting in HBase?
Hotspotting is a situation in which a large amount of client traffic is directed at one node, or only a few nodes, of a cluster. The traffic may consist of reads, writes, or other operations. It overwhelms the single machine responsible for hosting the affected region, degrading performance and potentially making the region unavailable.
45. What are the approaches to avoid hotspotting?
Hotspotting can be avoided or minimized by designing rowkeys so that they are distributed across multiple regions. Two common techniques are salting and hashing.
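The two techniques can be sketched as follows. This is a minimal illustration in plain Python, not HBase client code: the bucket count, the `-` separator, the use of MD5, and all function names are assumptions chosen for the example.

```python
import hashlib
import random

NUM_BUCKETS = 4  # assumed bucket count; in practice often tied to the number of regions


def salted_key(rowkey, rng=random):
    """Salting: prepend a randomly chosen bucket prefix to the rowkey."""
    bucket = rng.randrange(NUM_BUCKETS)
    return "{}-{}".format(bucket, rowkey)


def candidate_salted_keys(rowkey):
    """Because the salt is random, a reader has to try every bucket."""
    return ["{}-{}".format(b, rowkey) for b in range(NUM_BUCKETS)]


def hash_prefixed_key(rowkey):
    """Hashing: derive the prefix from the rowkey itself, so it is reproducible."""
    digest = hashlib.md5(rowkey.encode("utf-8")).digest()
    return "{}-{}".format(digest[0] % NUM_BUCKETS, rowkey)


if __name__ == "__main__":
    print(salted_key("row-0001"))         # prefix varies from call to call
    print(hash_prefixed_key("row-0001"))  # same prefix on every call
```

The trade-off is on the read side: with a random salt, a single-row lookup has to check every bucket, whereas a prefix derived from a hash lets the reader recompute the full key directly.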
46. Why should we try to minimize the row name and column name sizes in HBase?
In HBase, values are always freighted with their coordinates: as a cell value passes through the system, it is accompanied by its row, column name, and timestamp. If the row and column names are large, especially compared to the size of the cell value, the indices kept on HBase store files (StoreFile/HFile) to facilitate random access may occupy a large share of the RAM allotted to HBase, because the cell value coordinates are large.
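To make the overhead concrete, the sketch below estimates the serialized size of a single cell from the KeyValue layout (length fields, row, family, qualifier, timestamp, key type, value). It is an approximation that ignores tags, compression, and data block encoding; the function name and the sample keys are made up for the example.

```python
def keyvalue_size(row, family, qualifier, value):
    """Approximate serialized size in bytes of one cell, ignoring tags."""
    key_len = (
        2 + len(row)       # 2-byte row length field + row
        + 1 + len(family)  # 1-byte family length field + family
        + len(qualifier)   # qualifier bytes
        + 8                # timestamp
        + 1                # key type
    )
    # 4-byte key length field + 4-byte value length field + key + value
    return 4 + 4 + key_len + len(value)


if __name__ == "__main__":
    value = b"Ada"
    short = keyvalue_size(b"u42", b"d", b"n", value)
    long_ = keyvalue_size(b"customer-account-0000000042", b"details", b"first_name", value)
    print("short names:", short, "bytes for a", len(value), "byte value")
    print("long names: ", long_, "bytes for a", len(value), "byte value")
```

The value is identical in both cases, so the entire difference in size comes from the coordinates stored alongside it.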
47. What is the scope of a rowkey in HBase?
Rowkeys are scoped to ColumnFamilies. The same rowkey could exist in each ColumnFamily that exists in a table without collision.
48. What information is stored in the hbase:meta table?
The hbase:meta table stores the details of every region in the system in the following columns:
info:regioninfo (serialized HRegionInfo instance for this region)
info:server (server:port of the RegionServer containing this region)
info:serverstartcode (start-time of the RegionServer process containing this region)
49. What is a Namespace in HBase?
A Namespace is a logical grouping of tables. It is similar to a database in a relational database system.
50. How do we get the complete list of columns that exist in a column family?
The complete list of columns in a column family can be obtained only by querying all the rows for that column family, because column qualifiers are not part of the table schema and can differ from row to row.
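The idea can be sketched with an in-memory stand-in for scan results. The dictionary layout (rowkey mapped to `family:qualifier` cells) and the function name are assumptions for illustration only, not the HBase client API.

```python
def columns_in_family(rows, family):
    """Collect every distinct qualifier seen under `family` across all rows."""
    qualifiers = set()
    for cells in rows.values():
        for column in cells:
            fam, _, qualifier = column.partition(":")
            if fam == family:
                qualifiers.add(qualifier)
    return sorted(qualifiers)


if __name__ == "__main__":
    # Hypothetical scan results: each row carries a different set of columns
    rows = {
        "row1": {"info:name": "Ada", "info:email": "ada@example.com"},
        "row2": {"info:name": "Bob", "info:phone": "555-0100", "stats:visits": "3"},
    }
    print(columns_in_family(rows, "info"))
```

No single row lists all the qualifiers, which is why every row has to be visited.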