Column Families in HBase

A column family is a group of columns that share the same attributes, such as compression, the number of versions kept, and time to live.

These attributes are set once on the family and apply to every column that belongs to it.

The naming convention for an HBase column is family:qualifier, where the qualifier is the column's name within its family.

For example, a column science that belongs to the column family subject is written as subject:science
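As a small illustration of the naming convention, the Python sketch below splits a full column name at the first colon. The function name split_column is made up for this post and is not part of any HBase client library.

```python
def split_column(name):
    """Split 'family:qualifier' at the first colon.

    Hypothetical helper for illustration only; not an HBase API.
    """
    family, sep, qualifier = name.partition(":")
    if not sep or not family:
        raise ValueError(f"expected 'family:qualifier', got {name!r}")
    return family, qualifier


print(split_column("subject:science"))  # ('subject', 'science')
```

Splitting at the first colon only means a qualifier that itself contains a colon stays intact, while the family part never does.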

Column families should be declared up front as part of the table schema, while columns can be added at any later point in time.
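The difference between families and columns can be sketched with a toy in-memory table in Python. This is only a model of the idea, not how HBase is implemented, and the class ToyTable and its methods are invented for this example.

```python
class ToyTable:
    """Toy model: families are fixed at creation, columns appear on write."""

    def __init__(self, name, families):
        self.name = name
        # One store per column family, decided when the table is created.
        self.stores = {family: {} for family in families}

    def put(self, row, column, value):
        family, sep, qualifier = column.partition(":")
        if not sep or family not in self.stores:
            raise KeyError(f"unknown column family in {column!r}")
        # The qualifier needs no prior declaration.
        self.stores[family].setdefault(row, {})[qualifier] = value

    def columns(self, family):
        """Return the sorted qualifiers written so far in one family."""
        rows = self.stores[family].values()
        return sorted({qualifier for cells in rows for qualifier in cells})


subjects = ToyTable("subjects", ["commerce", "science"])
subjects.put("student1", "science:maths", 91)
subjects.put("student1", "science:physics", 84)  # new column, no schema change

try:
    subjects.put("student1", "arts:history", 77)  # family was never declared
except KeyError as err:
    print("rejected:", err)
```

Writing to science:physics works without any schema change, while writing to the undeclared family arts is rejected.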

Column families let the data of a table be stored across multiple locations.

Physically, all columns of one column family are stored together, in files that are separate from those of the other families.

Creation using the command line:

In the HBase shell, a table and its column families are declared as:

create 'subjects', 'commerce', 'science'

where the syntax is

create 'tablename', 'colfamily1', ..., 'colfamilyn'

Columns are not declared when the table is created. A column comes into existence the first time a value is written to it, for example:

put 'subjects', 'row1', 'commerce:accounts', 'value'

Advantages:

  • It is easier to tune and manage storage at the column family level.
  • Columns in the same family share the same access pattern and storage characteristics.
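To make family-level tuning concrete, here is a toy Python sketch in which each family keeps its own limit on how many versions of a cell it retains. It only imitates the idea of a per-family versions setting. The class VersionedFamily and its parameters are assumptions made for this example, not HBase code.

```python
from collections import deque


class VersionedFamily:
    """Toy column family keeping at most max_versions values per cell."""

    def __init__(self, max_versions=1):
        self.max_versions = max_versions
        self.cells = {}

    def put(self, row, qualifier, value):
        cell = self.cells.setdefault(
            (row, qualifier), deque(maxlen=self.max_versions)
        )
        cell.appendleft(value)  # newest first; the oldest value falls off

    def get(self, row, qualifier, versions=1):
        return list(self.cells.get((row, qualifier), ()))[:versions]


# Each family is tuned on its own.
science = VersionedFamily(max_versions=3)
commerce = VersionedFamily(max_versions=1)

for mark in (70, 80, 90, 95):
    science.put("student1", "maths", mark)
    commerce.put("student1", "accounts", mark)

print(science.get("student1", "maths", versions=3))      # [95, 90, 80]
print(commerce.get("student1", "accounts", versions=3))  # [95]
```

The setting lives on the family, so every column inside science gets three versions without any per-column configuration.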

Disadvantages:

  • Performance degrades as the number of column families grows, so the number of families per table is best kept small.
  • Managing the data becomes an overhead when there are many column families.
