Column Families in HBase

A column family is a group of columns that share the same attributes.

The family defines settings, such as compression, the number of versions kept and time-to-live, that apply to every column belonging to it.

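A column in HBase is addressed as family:qualifier, that is, the column family name, a colon, and the column name (also called the qualifier).

For example, a column science which belongs to the column family subject is written as subject:science. Names are case-sensitive.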

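Column families are part of the table schema: they are declared when the table is created, or added later through a schema change. Columns need no declaration and can be added at any time simply by writing a value to them.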

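Column families also determine how the data is laid out on disk.

Physically, all columns of one column family are stored together, in that family's own store files (HFiles) within each region. Different column families are stored in separate files.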

Creation using Command Line:

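In the HBase shell, a table and its column families are created as follows:

      create 'subjects', 'commerce', 'science'

where the syntax is

      create 'tablename', 'colfamily1', ..., 'colfamilyN'

Note that the command is lowercase, the arguments are separated by commas, and the names are enclosed in plain single quotes.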

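Columns cannot be declared in the create statement, and a column family name may not contain a colon. A column comes into existence when a value is first written to it with put. Here student1 is a hypothetical row key and the values are made up:

      put 'subjects', 'student1', 'commerce:accounts', '85'
      put 'subjects', 'student1', 'science:maths', '91'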
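
Once the table exists, the schema and the data can be inspected per column family. The following is a minimal sketch in the HBase shell, assuming the 'subjects' table with the families 'commerce' and 'science' from above. The row key student1, the column science:physics and the value are made up for illustration.

      # writing to a new qualifier is all it takes to add a column
      put 'subjects', 'student1', 'science:physics', '78'

      # read every cell stored for one row, across all families
      get 'subjects', 'student1'

      # read only the columns of the 'science' family
      scan 'subjects', {COLUMNS => 'science'}

      # list the column families and their attributes
      describe 'subjects'

The output of describe lists only the families, not the individual columns, because columns are not part of the schema.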

Advantages:

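  • Storage can be tuned and managed at the column family level, since settings such as compression and versioning are defined per family.
  • Columns in a family are stored together, so columns that share the same access pattern and characteristics can be grouped and read efficiently.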
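
As an illustration of tuning at the family level, the sketch below changes attributes of individual families in the HBase shell. It assumes the 'subjects' table from above. The values chosen are arbitrary examples, not recommendations, and older HBase releases may require the table to be disabled before alter is run.

      # keep up to 3 versions of each cell in the 'science' family
      alter 'subjects', {NAME => 'science', VERSIONS => 3}

      # compress the 'commerce' family and expire its cells after 30 days
      # (TTL is given in seconds: 30 * 86400 = 2592000)
      alter 'subjects', {NAME => 'commerce', COMPRESSION => 'GZ', TTL => 2592000}

      # confirm the new settings
      describe 'subjects'

Each setting applies to all columns of the named family and leaves the other family untouched.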

Disadvantages:

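  • Performance degrades as the number of column families grows, because each family has its own store files that must be flushed and compacted. The HBase reference guide advises keeping the number of families low, typically no more than two or three.
  • Data management becomes an overhead when a table has many column families, since each one has to be configured and maintained separately.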
