Tuesday, August 28, 2012

POST vs PUT in REST

When designing REST APIs, it's often said that you should map POST to create operations and PUT to update operations, but most people don't really explain why. Here's a good explanation I found:
http://jcalcote.wordpress.com/2008/10/16/put-or-post-the-rest-of-the-story/

In a nutshell:
  • Browsers treat PUT as an idempotent operation, and will not hesitate to call the server again with the same URL if the user hits the "back" button.
  • Browsers do not treat POST as an idempotent operation, so when the user hits the "back" button, the browser asks whether to submit the data again.
Since the server cannot guarantee idempotency when asked to create something new (i.e. it can't know whether the 2nd, 3rd, etc. calls are genuine requests to create more entries with the same data), POST is used for creating new entries.
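
To make the idempotency point concrete, here is a minimal sketch in Java. It is not a real HTTP server: NoteStore, put and post are hypothetical names for an in-memory store that only mimics how a server might handle the two verbs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory store, used only to illustrate idempotency.
class NoteStore {
    private final Map<Integer, String> notes = new LinkedHashMap<>();
    private int nextId = 1;

    // PUT-like: the client names the target id, so repeating the same
    // call leaves the store in the same state.
    void put(int id, String text) {
        notes.put(id, text);
        // Keep server-generated ids from colliding with client-chosen ones.
        nextId = Math.max(nextId, id + 1);
    }

    // POST-like: the server picks a fresh id on every call, so repeating
    // the same call creates another entry.
    int post(String text) {
        int id = nextId++;
        notes.put(id, text);
        return id;
    }

    int size() {
        return notes.size();
    }
}
```

Calling put(7, "a") twice leaves one entry, while calling post("b") twice leaves two entries with different ids. The store has no way to tell whether the second post was a retry or a genuine request.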

Monday, August 27, 2012

Naming Convention for REST URLs

Separate words with camel case or dashes?

Not a very important issue, but dashes win it for me because URLs are mostly expected to be case-insensitive.

Sunday, August 26, 2012

MongoDB Database Design: Large vs Small Documents

For relational databases, the approach is to have as many tables as you need to create the perfectly normalized database design. For MongoDB (and, to the same or a lesser extent, for other NoSQL databases, depending on their indexing capabilities), the recommended approach seems to be: when in doubt, use fewer, larger documents (i.e. embed rather than link).

Unlike other NoSQL databases, you don't have to create many small collections just because you need to query with different fields as keys; in MongoDB you can use indexing to solve this problem. Furthermore, MongoDB allows indexes to be created on objects nested within the document hierarchy.

I can think of the following key benefits of using large documents:
  • Fewer joins, fewer follow-up queries, and much less code to write
  • Operations within a single document are atomic, so you don't have to worry about writing to 3 separate documents and what happens if one of the writes fails
The main downside to large documents shows up when you have to run queries that result in table scans. You'll just have to make sure that you have indexes covering all the queries you need to make.

Sunday, August 19, 2012

Java Reflection: Access a Private Field

It is possible to access private fields in Java classes using the Reflection API (as long as the Security Manager isn't configured to prevent this).

This is done by adding "field.setAccessible(true);" before trying to obtain the value of the field.

Example code:


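import java.lang.reflect.Field;

// getDeclaredField also finds private fields declared on SomeClass
Field field = SomeClass.class.getDeclaredField(fieldName);
field.setAccessible(true);  // bypass the access check
return field.get(obj);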
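
For something you can compile and run, here is a small self-contained sketch of the same technique. PrivateFieldReader, Account and readField are hypothetical names made up for this example, and wrapping the checked exceptions in an IllegalStateException is just one way to handle them.

```java
import java.lang.reflect.Field;

class PrivateFieldReader {
    // Hypothetical class with a private field, used only for this demo.
    static class Account {
        private final String owner;

        Account(String owner) {
            this.owner = owner;
        }
    }

    // Reads a field declared directly on the target's class, private or not.
    static Object readField(Object target, String fieldName) {
        try {
            Field field = target.getClass().getDeclaredField(fieldName);
            field.setAccessible(true); // skip the usual access check
            return field.get(target);
        } catch (ReflectiveOperationException e) {
            // Covers NoSuchFieldException and IllegalAccessException
            throw new IllegalStateException("Cannot read field " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        Account account = new Account("alice");
        System.out.println(readField(account, "owner"));
    }
}
```

Running main prints "alice". Note that getDeclaredField only looks at the class itself, so a field inherited from a superclass would need a lookup on that superclass instead.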

JUnit Reference Equality Comparison

This can be done using JUnit's "assertSame" method, which passes only if both arguments refer to the same object.
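
To show what a reference comparison means without needing JUnit on the classpath, here is a plain Java sketch. ReferenceEqualityDemo and sameReference are hypothetical names; the helper just stands in for the identity check ("==") that assertSame relies on, as opposed to the equals() comparison used by assertEquals.

```java
class ReferenceEqualityDemo {
    // Hypothetical stand-in for the check behind assertSame:
    // true only when both variables point at the very same object.
    static boolean sameReference(Object expected, Object actual) {
        return expected == actual;
    }

    public static void main(String[] args) {
        String a = new String("x");
        String b = new String("x"); // equal content, different object

        System.out.println(a.equals(b));         // value equality
        System.out.println(sameReference(a, b)); // reference equality
        System.out.println(sameReference(a, a));
    }
}
```

Here a and b are equal according to equals() but are not the same object, so an assertSame on them would fail while an assertEquals would pass.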

Wednesday, August 15, 2012

Pentaho: MySQL Bulk Load

In MySQL, bulk load operations (i.e. "LOAD DATA") are much faster than INSERT statements. In Pentaho's Data Integration module, it is possible to perform a bulk load directly into MySQL using the "MySQL Bulk Loader" step (in the "Bulk loading" folder).

However, one "gotcha" with it (version 4.3) is that it will crash on an empty stream. You'll either have to find a way to handle it, use it only in situations where that either doesn't matter or never happens, or hope they fix this in a future version.

Thursday, August 9, 2012

Pentaho Data Integration 4.3 With MySQL

With version 4.3 of Pentaho Data Integration, the MySQL driver is not bundled because of "license compatibility issues". So if you need to connect to a MySQL database, you'll need to download the JDBC driver separately and copy it into the "libext/JDBC" folder.