Tuesday, October 28, 2014

Fetching Large Datasets With Hibernate

The Query.list() method in Hibernate returns a List containing every result of the query. This works fine in most cases, but if the query returns a large number of rows, memory utilisation can become an issue because the whole List is held in memory at once.

The solution is to stream the results and scroll through them using the Query.scroll() method, which returns a ScrollableResults cursor instead of a List. For best results, the following should also be done:

  • Scroll with ScrollMode.FORWARD_ONLY
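  • Use a StatelessSession (it has no first-level cache, so rows already processed are not kept in memory)
  • Set fetch size to Integer.MIN_VALUE (I'm not 100% sure about this one, but it's recommended in some blogs; it appears to be specific to the MySQL JDBC driver, which treats it as a signal to stream rows one at a time)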
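
Put together, the three settings above might look like the sketch below. It assumes the Hibernate 3.x to 5.x API (org.hibernate.Query) and a MySQL database; the class name LargeResultReader and the RowHandler callback are made up for illustration, and the SessionFactory is assumed to be configured elsewhere.

```java
import org.hibernate.Query;
import org.hibernate.ScrollMode;
import org.hibernate.ScrollableResults;
import org.hibernate.SessionFactory;
import org.hibernate.StatelessSession;

public class LargeResultReader {

    // Hypothetical callback; replace with whatever each row needs.
    public interface RowHandler {
        void handle(Object row);
    }

    public static void scrollAll(SessionFactory sessionFactory, String hql, RowHandler handler) {
        // A StatelessSession has no first-level cache, so Hibernate
        // does not hold on to rows that were already processed.
        StatelessSession session = sessionFactory.openStatelessSession();
        try {
            Query query = session.createQuery(hql);
            query.setReadOnly(true);
            // Assumption: MySQL Connector/J. Integer.MIN_VALUE asks that
            // driver to stream rows; other drivers generally expect a
            // non-negative fetch size here.
            query.setFetchSize(Integer.MIN_VALUE);

            ScrollableResults results = query.scroll(ScrollMode.FORWARD_ONLY);
            try {
                while (results.next()) {
                    // get(0) is the first (here: only) selected value of the row.
                    handler.handle(results.get(0));
                }
            } finally {
                results.close();
            }
        } finally {
            session.close();
        }
    }
}
```

The handler should finish with each row before returning, since the point of scrolling is that only the current row needs to be in memory.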

Sunday, October 19, 2014

Open Source Columnar Databases

Columnar databases are SQL databases with a different storage format: data is organised by column rather than by row. This is usually much more efficient for fact table queries, which tend to aggregate a few columns across a very large number of rows.
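
As a rough in-memory illustration of the two layouts (not how any particular database stores its data), the sketch below keeps the same sales facts once as row objects and once as one array per column. The names SaleRow, SaleColumns, and the fields are made up for the example.

```java
public class ColumnarSketch {

    // Row-oriented layout: each row keeps all of its fields together.
    static final class SaleRow {
        final int productId;
        final int storeId;
        final double amount;

        SaleRow(int productId, int storeId, double amount) {
            this.productId = productId;
            this.storeId = storeId;
            this.amount = amount;
        }
    }

    // Column-oriented layout: one array per column, aligned by row index.
    static final class SaleColumns {
        final int[] productId;
        final int[] storeId;
        final double[] amount;

        SaleColumns(SaleRow[] rows) {
            productId = new int[rows.length];
            storeId = new int[rows.length];
            amount = new double[rows.length];
            for (int i = 0; i < rows.length; i++) {
                productId[i] = rows[i].productId;
                storeId[i] = rows[i].storeId;
                amount[i] = rows[i].amount;
            }
        }
    }

    // Walks whole row objects even though only `amount` is needed.
    static double totalFromRows(SaleRow[] rows) {
        double total = 0;
        for (SaleRow row : rows) {
            total += row.amount;
        }
        return total;
    }

    // Reads only the one contiguous array the aggregate needs.
    static double totalFromColumns(SaleColumns columns) {
        double total = 0;
        for (double value : columns.amount) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        SaleRow[] rows = {
            new SaleRow(1, 10, 2.5),
            new SaleRow(2, 10, 4.0),
            new SaleRow(1, 11, 3.5)
        };
        System.out.println(totalFromRows(rows));
        System.out.println(totalFromColumns(new SaleColumns(rows)));
    }
}
```

Both methods return the same total; the difference is how much unrelated data has to be read to compute it, which is what matters once the table lives on disk.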

If cloud hosting is the only option, then the best solution is probably Amazon Redshift. However, if you need to install and operate your own, the open source options (which have not been evaluated) are:

If these are not suitable, there is also a commercial option in Infobright: https://www.infobright.com/

Wednesday, October 8, 2014

Google Drive: Revoke a User's Access to All Shared Documents


  1. From the list of folders on the left, find an item that says "All Items"
  2. Check all items
  3. Click More -> Share
  4. You will see the list of all users that you have shared documents with. Delete the user from the list to revoke their access