Friday, June 1, 2012

Options for Table Inheritance for Hibernate

There are several ways to map class inheritance to tables in Hibernate, and if you expect to work with large data sets, selecting the right approach can be tricky.

1) Table per class hierarchy approach
This is the approach where all the data for the entire class hierarchy (parents & children included) is stored in the same table.

Advantages:
  1. Fast performance in most situations.
  2. Ease of implementation.
Disadvantages:
  1. The table clutters up with too many columns and a lot of null values if the child classes have many columns unique to them.
  2. This in turn makes it less flexible / scalable (e.g. when some of the child classes need more columns).
Recommendation:
  1. Use only where there are very few class-specific columns (i.e. the vast majority of columns are shared between classes in the hierarchy).
  2. In addition, use it only if requirements to add columns to child classes are unlikely to arise.
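
As a rough sketch of this approach in an hbm.xml mapping, the whole hierarchy maps to one table and a discriminator column tells the rows apart. The Payment classes and the table and column names here are illustrative assumptions, not taken from a real schema:

```xml
<!-- Hypothetical classes: Payment, CreditCardPayment, CashPayment. -->
<class name="Payment" table="PAYMENT">
    <id name="id" type="long" column="PAYMENT_ID">
        <generator class="native"/>
    </id>
    <!-- The discriminator column records which class each row belongs to. -->
    <discriminator column="PAYMENT_TYPE" type="string"/>
    <!-- Shared column, used by every class in the hierarchy. -->
    <property name="amount" column="AMOUNT"/>
    <subclass name="CreditCardPayment" discriminator-value="CREDIT">
        <!-- Sub-class column: stored in PAYMENT as well, null for other types. -->
        <property name="creditCardType" column="CCTYPE"/>
    </subclass>
    <subclass name="CashPayment" discriminator-value="CASH"/>
</class>
```

Because every sub-class column lives in the shared table, those columns cannot carry NOT NULL constraints, which is one concrete form of the null-value clutter described above.
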
2) Table per concrete class approach (with union-subclass)
Each sub-class has its own table containing ALL columns for the class, i.e. none of the columns are shared in a parent table.

Advantages:
  1. Data is split to multiple tables. For large data-sets, there won't be any gigantic parent tables for shared data.
  2. For large data sets, queries on the child classes are much simpler with better performance (i.e. no joins needed).
Disadvantages:
  1. No common location to store common data. If you want to go into the DB and manually run a query at the abstracted parent level across all tables, good luck!
  2. Changes in DB structure for the parent class require changes to the DB tables of all child classes.
  3. Queries at the parent class level will be expensive due to unions required.
Recommendations:
  1. Use only when a very small percentage of queries are at the parent class level.
  2. This is more for situations where there is a need to run simple & infrequent logic at a more abstracted level, while most queries & operations are at the respective child-class levels.
  3. Use it only when the abstraction at the parent level is not important to you (i.e. only as a convenience in certain situations).
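
A minimal sketch of a union-subclass mapping, again with hypothetical Payment classes and table names, and assuming a database that supports sequences:

```xml
<!-- Hypothetical classes and tables. abstract="true" means no PAYMENT table is needed. -->
<class name="Payment" abstract="true">
    <id name="id" type="long" column="PAYMENT_ID">
        <!-- identity generation is not allowed here: ids must be unique
             across all of the sub-class tables, so a sequence is assumed. -->
        <generator class="sequence"/>
    </id>
    <!-- Inherited property: repeated as a column in every sub-class table. -->
    <property name="amount" column="AMOUNT"/>
    <union-subclass name="CreditCardPayment" table="CREDIT_PAYMENT">
        <property name="creditCardType" column="CCTYPE"/>
    </union-subclass>
    <union-subclass name="CashPayment" table="CASH_PAYMENT"/>
</class>
```

In this sketch AMOUNT ends up in both CREDIT_PAYMENT and CASH_PAYMENT, so a change to a parent property touches every sub-class table, and a query against Payment has to union those tables.
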
3) Table per sub-class approach (with joined-subclass)
This is the approach where sub-classes & the parent share primary keys. The shared columns of the sub-classes will reside together in a parent table.

Advantages:
  1. Common data is stored in common parent tables. Therefore easy to maintain, easy to query.
  2. Sub-class tables are compact & do not contain redundant columns.
  3. Sub-classes can be very different in structure & this will still work well.
Disadvantages:
  1. Outer joins are used to query data for each instance. On big data-sets, this can be a very expensive option.
  2. The above is exacerbated when there are multiple layers & complex class hierarchies.
Recommendations:
  1. Logically the most elegant approach. This is the recommended approach unless you have a large data-set.
  2. However, with large data-sets and/or complex class hierarchies, performance will be a very real concern.
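
For illustration, a joined-subclass mapping might look roughly like this. The class, table and column names are assumptions made up for the example:

```xml
<!-- Hypothetical classes and tables. -->
<class name="Payment" table="PAYMENT">
    <id name="id" type="long" column="PAYMENT_ID">
        <generator class="native"/>
    </id>
    <!-- Shared column: stored once, in the parent table. -->
    <property name="amount" column="AMOUNT"/>
    <joined-subclass name="CreditCardPayment" table="CREDIT_PAYMENT">
        <!-- The sub-class table's primary key is also a foreign key to PAYMENT. -->
        <key column="PAYMENT_ID"/>
        <property name="creditCardType" column="CCTYPE"/>
    </joined-subclass>
    <joined-subclass name="CashPayment" table="CASH_PAYMENT">
        <key column="PAYMENT_ID"/>
    </joined-subclass>
</class>
```

Loading a Payment without knowing its concrete type means joining PAYMENT to each sub-class table, which is where the cost on large data-sets comes from.
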
4) Table per sub-class (with discriminator)
This has the same DB structure as (3) above except for the addition of a discriminator column.

Advantages:
  1. Same as (3) with the added advantage of having more control over the joins (via the fetch="select" option).
Disadvantages:
  1. Without the joins, sub-class data is loaded with a separate select per instance, which makes this more vulnerable to the N+1 query problem.
Recommendation:
  1. Use this over (3) when you have large data sets & the N+1 query problem isn't severe in your situation.
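
A sketch of the discriminator variant, using the same hypothetical Payment classes: each sub-class is declared with subclass plus a nested join, and fetch="select" on the join is what switches off the outer join for that sub-class.

```xml
<!-- Hypothetical classes and tables. -->
<class name="Payment" table="PAYMENT">
    <id name="id" type="long" column="PAYMENT_ID">
        <generator class="native"/>
    </id>
    <!-- The discriminator lets Hibernate tell the row type without joining. -->
    <discriminator column="PAYMENT_TYPE" type="string"/>
    <property name="amount" column="AMOUNT"/>
    <subclass name="CreditCardPayment" discriminator-value="CREDIT">
        <!-- Default fetch: sub-class data is still outer-joined on parent queries. -->
        <join table="CREDIT_PAYMENT">
            <key column="PAYMENT_ID"/>
            <property name="creditCardType" column="CCTYPE"/>
        </join>
    </subclass>
    <subclass name="ChequePayment" discriminator-value="CHEQUE">
        <!-- fetch="select": CHEQUE_PAYMENT is read with a follow-up select
             only when a row turns out to be a ChequePayment. -->
        <join table="CHEQUE_PAYMENT" fetch="select">
            <key column="PAYMENT_ID"/>
            <property name="chequeNumber" column="CHEQUE_NO"/>
        </join>
    </subclass>
</class>
```

The setting can be chosen per sub-class, so a rarely used sub-class could use fetch="select" while the common ones stay on the default join.
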
References:
  1. docs.jboss.org/hibernate/core/3.6/reference/en-US/html/inheritance.html
  2. simsonlive.wordpress.com/2008/03/09/how-inheritance-works-in-hibernate/
