11.Is Hbase a scale out or scale up process?
Hbase runs on top of Hadoop which is a distributed system. Haddop can only scale uo as and when required by adding more machines on the fly. So Hbase is a scale out process.
12.What are the step in writing something into Hbase by a client?
In Hbase the client does not write directly into the HFile. The client first writes to WAL(Write Access Log), which then is accessed by Memdtore. The Memstore Flushes the data into permanent memory from time to time.
13.What is compaction in Hbase?
As more and more data is written to Hbase, many HFiles get created. Compaction is the process of merging these HFiles to one file and after the merged file is created successfully, discard the old file.
14.What are the different compaction types in Hbase?
There are two types of compaction. Major and Minor compaction. In minor compaction, the adjacent small HFiles are merged to create a single HFile without removing the deleted HFiles. Files to be merged are chosen randomly.
In Major compaction, all the HFiles of a column are emerged and a single HFiles is created. The delted HFiles are discarded and it is generally triggered manually.
15.What is the difference between the commands delete column and delete family?
The Delete column command deletes all versions of a column but the delete family deletes all columns of a particular family.
16.What is a cell in Hbase?
A cell in Hbase is the smallest unit of a Hbase table which holds a piece of data in the form of a tuple{row,column,version}
17.What is the role of the class HColumnDescriptor in Hbase?
This class is used to store information about a column family such as the number of versions, compression settings, etc. It is used as input when creating a table or adding a column.
18.What is the lower bound of versions in Hbase?
The lower bound of versions indicates the minimum number of versions to be stored in Hbase for a column. For example If the value is set to 3 then three latest version wil be maintained and the older ones will be removed.
19.What is TTL (Time to live) in Hbase?
TTL is a data retention technique using which the version of a cell can be preserved till a specific time period.Once that timestamp is reached the specific version will be removed.
20.Does Hbase support table joins?
Hbase does not support table jons. But using a mapreduce job we can specify join queries to retrieve data from multiple Hbase tables.