<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Basil Vandegriend: Professional Software Development &#187; Hibernate</title>
	<atom:link href="http://www.basilv.com/psd/blog/tag/hibernate/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.basilv.com/psd</link>
	<description></description>
	<lastBuildDate>Wed, 25 Jan 2012 13:23:47 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
		<item>
		<title>Streaming Data to Reduce Memory Usage</title>
		<link>http://www.basilv.com/psd/blog/2011/streaming-data-to-reduce-memory-usage</link>
		<comments>http://www.basilv.com/psd/blog/2011/streaming-data-to-reduce-memory-usage#comments</comments>
		<pubDate>Thu, 05 May 2011 13:01:52 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[design]]></category>
		<category><![CDATA[coding]]></category>
		<category><![CDATA[Hibernate]]></category>
		<category><![CDATA[memory]]></category>
		<category><![CDATA[software development]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/?p=641</guid>
		<description><![CDATA[I recently performed a series of optimizations to reduce an application's memory usage. After completing several of these I noticed that there was a common theme to many of my optimizations that I could explicitly apply to help identify further opportunities for improvement. As a recurring solution, this qualifies as a design pattern which I [...]]]></description>
			<content:encoded><![CDATA[<p>I recently performed a series of optimizations to reduce an application's memory usage. After completing several of these I noticed that there was a common theme to many of my optimizations that I could explicitly apply to help identify further opportunities for improvement. As a recurring solution, this qualifies as a design pattern which I refer to as <em>Streaming Data</em>.</p>
<h3>Context</h3>
<p>This pattern applies when you need to process a significant volume of data but the processing can be done incrementally on small subsets of the data. A typical example is loading a list of entities and then iterating through the list to process each one. While the results (output) of processing can be combined across all the entities, it is important that the input to the processing only requires a small subset of all the data, and not the entire list of entities. A code example illustrating this problem context is shown below:</p>
<pre class=" prettyprint">
List&lt;Entity&gt; entities = loadEntities();
List&lt;ProcessingResult&gt; results = new ArrayList&lt;ProcessingResult&gt;();
for (Entity entity : entities) {
  ProcessingResult result = processEntity(entity);
  results.add(result);
}
</pre>
<h3>Solution</h3>
<p>Reducing the memory usage in the above example is based on the observation that loading the entire list of objects to process can consume a large amount of memory and is not necessary since we only use one object at a time. So the solution is to stream - incrementally retrieve - these objects instead of loading them all at once. For the consumer of this data the only change required is to first obtain a reference to the stream such as an <code>Iterable</code> that incrementally fetches data. Updating our prior code example results in the following (changed lines shown in green background):</p>
<pre class=" prettyprint">
<span style="background:#97FF77;">Iterable&lt;Entity&gt; entities = streamEntities();</span>
List&lt;ProcessingResult&gt; results = new ArrayList&lt;ProcessingResult&gt;();
for (Entity entity : entities) {
  ProcessingResult result = processEntity(entity);
  results.add(result);
}
</pre>
<h3>Examples</h3>
<p>The mechanism to use for streaming objects will depend on the source of the data and may require significant changes compared to a bulk load. Here are some specific examples.</p>
<h4>Parsing XML</h4>
<p><a href="http://www.basilv.com/psd/blog/2008/simple-xml-parsing-using-jaxb">Parsing XML files using JAXB</a> is a convenient approach for converting the entire file into a tree of Java objects, but it populates the entire tree at once. To stream such data instead, use the SAX parser provided as part of the JAXP API. The SAX parser is event-based, which means that it iterates over the elements (and attributes) of your XML and for each item invokes callbacks you define.</p>
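<p>As a minimal sketch of the event-based approach, the following uses the SAX classes that ship with the JDK. The <code>record</code> element, its <code>amount</code> attribute, and the class names <code>RecordTotalHandler</code> and <code>SaxStreamingExample</code> are assumptions made up for illustration; the point is only that the handler keeps running totals rather than a tree of objects.</p>

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: counts <record> elements and totals their "amount"
// attribute. Only the current element is ever held in memory.
class RecordTotalHandler extends DefaultHandler {
  int recordCount = 0;
  long totalAmount = 0;

  @Override
  public void startElement(String uri, String localName, String qName,
      Attributes attributes) {
    if ("record".equals(qName)) {
      recordCount++;
      totalAmount += Long.parseLong(attributes.getValue("amount"));
    }
  }
}

class SaxStreamingExample {

  // Parses the given XML text and returns the handler holding the totals.
  static RecordTotalHandler parse(String xml) {
    try {
      SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
      RecordTotalHandler handler = new RecordTotalHandler();
      byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
      parser.parse(new ByteArrayInputStream(bytes), handler);
      return handler;
    } catch (ParserConfigurationException | SAXException | IOException e) {
      throw new IllegalStateException("Unable to parse XML", e);
    }
  }

  public static void main(String[] args) {
    String xml = "<records><record amount=\"5\"/><record amount=\"7\"/></records>";
    RecordTotalHandler result = parse(xml);
    System.out.println(result.recordCount + " records, total " + result.totalAmount);
  }
}
```

<p>For a real file you would pass a <code>File</code> or a file-backed <code>InputStream</code> to <code>parse</code> instead of an in-memory string, so that the input itself is also streamed.</p>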
<h4>Querying Databases using Hibernate</h4>
<p>When using Hibernate to query for a collection of entities it is convenient to simply ask Hibernate for the entire collection. A typical example of doing this using the query by criteria API within Hibernate is below:</p>
<pre class=" prettyprint">
public List&lt;Entity&gt; queryData() {
  Criteria criteria = session.createCriteria(Entity.class);
  // Add appropriate restrictions
  // ...
  List&lt;Entity&gt; result = criteria.list();
  return result;
}
</pre>
<p>When the criteria returns a large volume of data, however, this approach will consume a large amount of memory. Instead use the <code>scroll</code> method on <code>Criteria</code> to return a <code>ScrollableResults</code> instance that can be used to iterate through the results. If you prefer to not expose the rest of the application to Hibernate classes, you can wrap the <code>ScrollableResults</code> in a special implementation of <code>Iterator</code> (which I leave as an exercise to the reader). The revision of the above example using streaming looks like the following (changed lines shown in green background):</p>
<pre class=" prettyprint">
<span style="background:#97FF77;">public Iterator&lt;Entity&gt; queryData() {</span>
  Criteria criteria = session.createCriteria(Entity.class);
  // Add appropriate restrictions
  // ...
<span style="background:#97FF77;">  ScrollableResults scrollableResults = criteria.scroll();</span>
<span style="background:#97FF77;">  Iterator&lt;Entity&gt; result = new ScrollableResultsIterator(scrollableResults);</span>
  return result;
}
</pre>
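<p>One possible shape for such an iterator is sketched below. To keep the example self-contained it is written against <code>RowCursor</code>, a hypothetical stand-in interface that assumes the <code>next()</code>, <code>get(int)</code> and <code>close()</code> shape of Hibernate 3's <code>ScrollableResults</code>; <code>CursorIterator</code> and <code>ListCursor</code> are likewise illustrative names, not Hibernate classes. In a real application the adapter would delegate to the <code>ScrollableResults</code> directly.</p>

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for the parts of a scrollable result that the adapter uses.
interface RowCursor {
  boolean next();          // advance one row; false once the rows are exhausted
  Object get(int column);  // value of the given column in the current row
  void close();            // release the underlying database resources
}

// Adapts a cursor to java.util.Iterator by reading one row ahead.
class CursorIterator<T> implements Iterator<T> {
  private final RowCursor cursor;
  private final Class<T> type;
  private T lookahead;
  private boolean lookaheadLoaded = false;
  private boolean exhausted = false;

  CursorIterator(RowCursor cursor, Class<T> type) {
    this.cursor = cursor;
    this.type = type;
  }

  @Override
  public boolean hasNext() {
    if (exhausted) {
      return false;
    }
    if (!lookaheadLoaded) {
      if (cursor.next()) {
        lookahead = type.cast(cursor.get(0));
        lookaheadLoaded = true;
      } else {
        // No more rows: close the cursor as soon as we know.
        exhausted = true;
        cursor.close();
      }
    }
    return lookaheadLoaded;
  }

  @Override
  public T next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    T result = lookahead;
    lookahead = null;
    lookaheadLoaded = false;
    return result;
  }

  @Override
  public void remove() {
    throw new UnsupportedOperationException("read-only iterator");
  }
}

// In-memory cursor used only to exercise the adapter without a database.
class ListCursor implements RowCursor {
  private final List<?> rows;
  private int position = -1;
  boolean closed = false;

  ListCursor(List<?> rows) {
    this.rows = rows;
  }

  public boolean next() {
    position++;
    return position < rows.size();
  }

  public Object get(int column) {
    return rows.get(position);
  }

  public void close() {
    closed = true;
  }
}
```

<p>Note that this sketch only closes the cursor when iteration runs to the end; a caller that stops early would need some other way to close it.</p>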
<p>This scroll approach only works when all the data can be processed within the same database transaction since the Hibernate session must remain open for the <code>ScrollableResults</code> to be able to continue fetching data. If this is not suitable then another option is to load the data using multiple queries that each return a subset of the data. One common example of this is when displaying search results to a user. Rather than showing all the results (which may number in the hundreds or thousands) show one page at a time and let the user step through the various pages of results. Due to the frequency with which this occurs I refer to this solution as <em>paging</em>. To implement this in Hibernate using the query by criteria API is fairly simple:</p>
<ol>
<li>Start by creating your criteria object and defining its restrictions as you normally would.</li>
<li>Apply an ordering to the criteria. It is best if this ordering is consistent, by which I mean that database updates or inserts between queries will not result in invalid or unexpected results being returned. This assumes each query for a page executes in a separate database transaction which provides no guarantees of transactional isolation for the group of queries as a whole. In some contexts, consistency is not required. If it is then I prefer to use an auto-incrementing surrogate primary key as the field to sort by in order to achieve the highest level of consistency. </li>
<li>Apply restrictions to retrieve only the specific page. This is done using the methods <code>setFirstResult</code> and <code>setMaxResults</code> on the <code>Criteria</code> object.</li>
</ol>
<h3>Consequences</h3>
<p>One potential consequence of streaming data is a reduction in performance because data is loaded piece by piece rather than in bulk. To mitigate this, the solution is to use what I call <em>loading sets</em>: define subsets of the total data volume that are small enough to not impact memory usage but large enough to minimize performance impacts. Then load the data one set at a time. The consuming API does not need to change: it can still iterate or stream over each loaded set, and then fetch the next set once the current one is exhausted.</p>
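<p>A loading-set iterator could be sketched as follows. <code>SetLoader</code> and <code>LoadingSetIterator</code> are hypothetical names introduced for this example; the loader's two parameters are assumed to play the same role as <code>setFirstResult</code> and <code>setMaxResults</code> in the paging steps earlier, so a Hibernate-backed loader would run one paged query per call.</p>

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical callback: returns at most maxResults items, starting at firstResult.
interface SetLoader<T> {
  List<T> load(int firstResult, int maxResults);
}

// Iterates over all items while holding only one loading set in memory at a time.
class LoadingSetIterator<T> implements Iterator<T> {
  private final SetLoader<T> loader;
  private final int setSize;
  private Iterator<T> current = Collections.<T>emptyList().iterator();
  private int nextFirstResult = 0;
  private boolean lastSetLoaded = false;

  LoadingSetIterator(SetLoader<T> loader, int setSize) {
    if (setSize <= 0) {
      throw new IllegalArgumentException("setSize must be positive");
    }
    this.loader = loader;
    this.setSize = setSize;
  }

  @Override
  public boolean hasNext() {
    // Fetch the next set only once the current one is exhausted.
    while (!current.hasNext() && !lastSetLoaded) {
      List<T> set = loader.load(nextFirstResult, setSize);
      nextFirstResult += set.size();
      // A set shorter than requested means there is no more data to load.
      lastSetLoaded = set.size() < setSize;
      current = set.iterator();
    }
    return current.hasNext();
  }

  @Override
  public T next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    return current.next();
  }

  @Override
  public void remove() {
    throw new UnsupportedOperationException("read-only iterator");
  }
}
```

<p>The set size is the tuning knob: a size of one degenerates to loading piece by piece, while a very large size degenerates to a bulk load.</p>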
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2011/streaming-data-to-reduce-memory-usage/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Avoiding Caching To Improve Hibernate Performance</title>
		<link>http://www.basilv.com/psd/blog/2010/avoiding-caching-to-improve-hibernate-performance</link>
		<comments>http://www.basilv.com/psd/blog/2010/avoiding-caching-to-improve-hibernate-performance#comments</comments>
		<pubDate>Mon, 08 Feb 2010 14:18:54 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[tools]]></category>
		<category><![CDATA[design]]></category>
		<category><![CDATA[Hibernate]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[software development]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/?p=486</guid>
		<description><![CDATA[I was recently doing some performance tuning and made the surprising discovery that doing less caching in Hibernate actually improved performance in a particular scenario. When I discovered the problem this seemed very counter-intuitive. In fact, my original design maximized the use of caching in order to improve performance, but the opposite happened in practice. [...]]]></description>
			<content:encoded><![CDATA[<p>I was recently doing some performance tuning and made the surprising discovery that doing less caching in <a href="http://www.hibernate.org/">Hibernate</a> actually improved performance in a particular scenario. When I discovered the problem this seemed very counter-intuitive. In fact, my original design maximized the use of caching in order to improve performance, but the opposite happened in practice. In hindsight, naturally, the reason for this was fairly obvious. So I thought I would share the details of this situation to help you avoid making the same mistake.</p>
<p>I was tuning a batch processing application that received XML input data sets, each consisting of thousands of separate input records. The processing logic converted each input record into multiple Hibernate entities – as many as several hundred. This logic required a number of queries to implement - some to load related, preexisting entities, and others to verify consistency with existing data. This queried data would often be needed for multiple input records in the same data set. Based on this, I decided to use a single Hibernate session to process the entire data set, committing after each input record but keeping the session open to be able to make use of cached entities for subsequent processing.</p>
<p>When initial performance tests were carried out, they showed a disturbing trend: the processing time required per input record in the data set increased linearly. This meant that the total time required to process a data set increased quadratically with the size of the set! This is illustrated by the diagrams below.<br />
<img src="http://www.basilv.com/psd/wp-content/uploads/2010/02/RecordProcessingPerformance.png" alt="Record Processing Performance" title="Record Processing Performance" width="559" height="670" class="size-full wp-image-487" /></p>
<p>An analysis of where the time was being spent showed that the majority of the processing logic required only constant time per record. Where was the extra time going? The culprit seemed to be the call to commit the transaction to the database. I knew that even a few hundred database insert/update statements would execute quickly in nearly constant time (databases are built to scale, after all). The actual database commit was equally speedy. Normally by default I assume that network calls will be the source of performance delays. But in this case this assumption appeared to be incorrect.</p>
<p>So what exactly was happening when I committed the Hibernate transaction, before the calls to the database? Hibernate's first step is to perform a flush to write all entities with changes (called dirty entities) to the database via insert/update/delete calls. How exactly does Hibernate determine which entities are dirty? For loaded entities Hibernate uses byte-code instrumentation to add logic to track when entities become dirty. But my scenario involved new entities, for which Hibernate could not work its magic. So on each flush Hibernate scanned the fields of each entity to see if there were changes. A linearly-increasing number of entities naturally led to a linearly-increasing time per flush. To make matters worse, Hibernate's flush algorithm apparently has a <a href="https://jira.jboss.org/jira/browse/EJBTHREE-649">performance problem</a> when dealing with cascaded collections, which I was using in my scenario.</p>
<p>The solution to my performance problem was to evict all the entities from the session after committing, thus making all entities detached, and then reattaching to the session the few entities I reused in subsequent processing.</p>
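<p>To make the cost difference concrete, here is a toy model rather than Hibernate code. <code>ToySession</code> and <code>FlushCostSimulation</code> are invented for illustration, and the model assumes, as described above, that each flush inspects every entity still attached to the session. It simply counts inspections with and without evicting after each commit.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Toy model, not Hibernate: a "session" that inspects every tracked entity on flush.
class ToySession {
  private final List<Object> tracked = new ArrayList<Object>();
  long entitiesInspected = 0;

  void save(Object entity) {
    tracked.add(entity);
  }

  // Dirty checking is modelled as one inspection per tracked entity.
  void flush() {
    entitiesInspected += tracked.size();
  }

  void evictAll() {
    tracked.clear();
  }
}

class FlushCostSimulation {

  // Returns the total number of entity inspections needed to process all records.
  static long inspections(int records, int entitiesPerRecord, boolean evictAfterCommit) {
    ToySession session = new ToySession();
    for (int r = 0; r < records; r++) {
      for (int e = 0; e < entitiesPerRecord; e++) {
        session.save(new Object());
      }
      session.flush(); // happens as part of each commit
      if (evictAfterCommit) {
        session.evictAll();
      }
    }
    return session.entitiesInspected;
  }

  public static void main(String[] args) {
    System.out.println("keep entities:  " + inspections(1000, 100, false));
    System.out.println("evict entities: " + inspections(1000, 100, true));
  }
}
```

<p>With R records of K entities each, the model performs K &times; R(R+1)/2 inspections when entities are kept and K &times; R when they are evicted, which matches the linear growth per record that the performance tests showed.</p>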
<p>This article is one of a series on <a href="http://www.basilv.com/psd/blog/2008/hibernate-tips-and-tricks">Hibernate Tips &#038; Tricks</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2010/avoiding-caching-to-improve-hibernate-performance/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Automatically Populating Audit Columns in Hibernate</title>
		<link>http://www.basilv.com/psd/blog/2008/automatically-populating-audit-columns-in-hibernate</link>
		<comments>http://www.basilv.com/psd/blog/2008/automatically-populating-audit-columns-in-hibernate#comments</comments>
		<pubDate>Sun, 13 Apr 2008 22:32:30 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[Java]]></category>
		<category><![CDATA[Hibernate]]></category>
		<category><![CDATA[software development]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/blog/2008/automatically-populating-audit-columns-in-hibernate</guid>
		<description><![CDATA[Audit columns are a common design pattern used to record data creation and modification information for database tables. A typical implementation of this pattern is to add four columns to every non-static database table: CREATE_USER, CREATE_TIMESTAMP, UPDATE_USER, and UPDATE_TIMESTAMP. The create columns are populated only when a record is initially created, while the update columns [...]]]></description>
			<content:encoded><![CDATA[<p>Audit columns are a common design pattern used to record data creation and modification information for database tables. A typical implementation of this pattern is to add four columns to every non-static database table: CREATE_USER, CREATE_TIMESTAMP, UPDATE_USER, and UPDATE_TIMESTAMP. The create columns are populated only when a record is initially created, while the update columns are populated each time the record is updated. </p>
<p>While database triggers can be used to populate the timestamp columns, the user columns are trickier to populate. In a typical web application or three-tier architecture, individual clients do not communicate directly with the database but go through an intermediate layer – the web or application server. The data source used by the application server to connect to the database manages a pool of connections using a common application id to authenticate to the database. This avoids the overhead of a new database connection for each client request and allows a large number of user requests to be serviced by a smaller number of connections. Since it is the application, and not the user, who authenticates with the database, the database does not know the identity of the user behind the database operations, so triggers to populate user columns will not work as desired. The solution is to instead populate these columns within the application code. </p>
<p>Explicitly populating the audit columns in the code for every insert or update, however, is far from ideal - especially when using an object-relational mapping framework such as Hibernate. One of the major advantages of using Hibernate is its ability to encapsulate and hide (in a leaky fashion) the work involved in persisting to a relational database. This allows the business logic, expressed in terms of persisted domain objects, to be kept as readable and simple as possible. Cluttering this logic with calls to set the audit columns complicates the code and is error-prone – missing a single update or creation means there is a hole in the auditing. From the viewpoint of both maintainability and security, the ideal solution would be to configure Hibernate to automatically populate these audit columns whenever a record is created or updated.</p>
<p>The idea of adding orthogonal, or cross-cutting, functionality automatically to a code base is the realm of aspect-oriented programming. Hibernate supports this programming model through the use of interceptors, which allow client code to be executed as part of Hibernate's core processing. We can create an <code>AuditInterceptor</code> to set the audit columns of non-static domain objects, as identified by an <code>Auditable</code> interface. One significant issue is how to obtain the user id to set in the <code>AuditInterceptor</code>. Since the <code>AuditInterceptor</code> is registered with Hibernate and never called directly, there is no way to directly pass in the user id. The typical solution is to use a thread local singleton (i.e. an instance of <code>ThreadLocal</code>) to store the user id for the current thread. For a typical web application, at the start of processing a user's session, the user's id must therefore be registered with the <code>AuditInterceptor</code>. The code for <code>Auditable</code> and <code>AuditInterceptor</code> is shown below:</p>
<pre class="prettyprint">
public interface Auditable
{

  public String getCreateUserId();

  public void setCreateUserId(String createUserId);

  public String getUpdateUserId();

  public void setUpdateUserId(String updateUserId);

}

public class AuditInterceptor extends EmptyInterceptor
{

  private static ThreadLocal&lt;String&gt; userPerThread = new ThreadLocal&lt;String&gt;();

  /**
   * Store the user for the current thread.
   * @param user Cannot be null or empty.
   */
  public static void setUserForCurrentThread(String user) {
    userPerThread.set(user);
  }

  /**
   * Get the user for the current thread.
   * (Used primarily for testing).
   * @return the current user.
   */
  public static String getUserForCurrentThread() {
    return userPerThread.get();
  }

  @Override public boolean onFlushDirty(Object entity,
    Serializable id, Object[] currentState,
    Object[] previousState, String[] propertyNames,
    Type[] types) {

    boolean changed = false;

    if (entity instanceof Auditable) {
      changed = updateAuditable(currentState, propertyNames);
    }
    return changed;
  }

  @Override public boolean onSave(Object entity,
    Serializable id, Object[] currentState,
    String[] propertyNames, Type[] types) {

    boolean changed = false;

    if (entity instanceof Auditable) {
      changed = updateAuditable(currentState, propertyNames);
    }
    return changed;

  }

  private boolean updateAuditable(Object[] currentState,
    String[] propertyNames) {
    boolean changed = false;
    for (int i = 0; i &lt; propertyNames.length; i++) {
      if ("createUserId".equals(propertyNames[i])) {
        if (currentState[i] == null) {
          currentState[i] = userPerThread.get();
          changed = true;
        }
      }
      if ("updateUserId".equals(propertyNames[i])) {
        currentState[i] = userPerThread.get();
        changed = true;
      }
    }
    return changed;
  }

}
</pre>
<p>The <code>AuditInterceptor</code> must be registered with Hibernate. This can be done when the Hibernate session is created, as shown below. Depending on how sessions are created within your code base, you could also provide the current user id to the constructor of <code>AuditInterceptor</code>.</p>
<pre class="prettyprint">
Session session = sessionFactory.openSession(new AuditInterceptor());
</pre>
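<p>One detail worth sketching is the lifecycle of the thread-local value. Application servers typically reuse threads across requests, so a user id left in place could be picked up by the next request handled on the same thread. The following stdlib-only sketch shows one way to guard against that. <code>CurrentUser</code> is a hypothetical holder that mirrors the static <code>ThreadLocal</code> inside <code>AuditInterceptor</code>, and <code>runAs</code> is an invented helper; the <code>AuditInterceptor</code> shown above has no clear method, so adding one is an assumption of this sketch.</p>

```java
import java.util.function.Supplier;

// Hypothetical holder mirroring the static ThreadLocal inside AuditInterceptor.
class CurrentUser {
  private static final ThreadLocal<String> USER = new ThreadLocal<String>();

  static void set(String user) {
    USER.set(user);
  }

  static String get() {
    return USER.get();
  }

  static void clear() {
    USER.remove();
  }
}

class RequestHandling {

  // Runs one unit of work as the given user and always clears the
  // thread-local afterwards, even if the work throws an exception.
  static <T> T runAs(String user, Supplier<T> work) {
    CurrentUser.set(user);
    try {
      return work.get();
    } finally {
      CurrentUser.clear();
    }
  }

  public static void main(String[] args) {
    String seenDuringWork = runAs("alice", () -> CurrentUser.get());
    System.out.println("during work: " + seenDuringWork);
    System.out.println("after work:  " + CurrentUser.get());
  }
}
```

<p>In a web application the same set-then-clear structure would usually live in whatever component wraps each request, so that business logic never has to touch the thread-local itself.</p>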
<p>The source code listed in this article is provided in the <em>Java Examples</em> project which can be downloaded from the <a href="http://www.basilv.com/psd/software">Software</a> page.</p>
<p>This article is one of a series on <a href="http://www.basilv.com/psd/blog/2008/hibernate-tips-and-tricks">Hibernate Tips &#038; Tricks</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2008/automatically-populating-audit-columns-in-hibernate/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Improving Performance via Eager Fetching in Hibernate</title>
		<link>http://www.basilv.com/psd/blog/2008/improving-performance-via-eager-fetching-in-hibernate</link>
		<comments>http://www.basilv.com/psd/blog/2008/improving-performance-via-eager-fetching-in-hibernate#comments</comments>
		<pubDate>Sun, 16 Mar 2008 01:35:46 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[Java]]></category>
		<category><![CDATA[Hibernate]]></category>
		<category><![CDATA[software development]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/blog/2008/improving-performance-via-eager-fetching-in-hibernate</guid>
		<description><![CDATA[My previous article started discussing Hibernate relationships, focusing on lazy versus non-lazy relationships. This article continues the theme by discussing how to improve performance when dealing with relationships in Hibernate through a feature called eager fetching. Hibernate's abstraction of database access behind getter and setter methods on domain objects hides potentially inefficient database access. Mindlessly [...]]]></description>
			<content:encoded><![CDATA[<p>My <a href="http://www.basilv.com/psd/blog/2008/avoid-non-lazy-relationships-in-hibernate">previous article</a> started discussing Hibernate relationships, focusing on lazy versus non-lazy relationships. This article continues the theme by discussing how to improve performance when dealing with relationships in Hibernate through a feature called eager fetching.</p>
<p>Hibernate's abstraction of database access behind getter and setter methods on domain objects hides potentially inefficient database access. Mindlessly using Hibernate without considering the implications for the database operations being performed can lead to significant performance hits. One of the most significant performance problems is the N+1 query problem, where iterating over a collection of entities and accessing a related entity for each will result in one query to return the collection, and then a separate query for each entity to retrieve its related entity. For N entities in the collection, this is a total of N+1 queries. This is quite inefficient, especially for larger values of N, considering that a single database query can be written to return all of the related entities for all the entities of the collection. I touched on this issue previously in the context of non-lazy relationships, but it still occurs with lazy relationships. To provide a concrete example, I will refer to the following object model (the same as in my previous article):<br />
<a href='http://www.basilv.com/psd/wp-content/uploads/2008/03/hibernaterelationshipsclassdiagram.png' title='Class Diagram'><img src='http://www.basilv.com/psd/wp-content/uploads/2008/03/hibernaterelationshipsclassdiagram.png' alt='Class Diagram' /></a></p>
<p>The corresponding classes are summarized as follows:</p>
<pre class="prettyprint">
class Customer {
  public Collection&lt;Order&gt; getOrders();
  public void addOrder(Order order);
}
class Order {
  public Customer getCustomer();
  public PaymentMethod getPaymentMethod();
  public void setPaymentMethod(PaymentMethod m);
}
class PaymentMethod {
  public Order getOrder();
  public boolean isCash();
}
</pre>
<p>Let us say I want to give a bonus to a customer who has only paid by cash. The naive implementation for a given customer would be the following:</p>
<pre class="prettyprint">
public boolean qualifiesForBonus(Customer customer) {
  for (Order order : customer.getOrders() ) {
    PaymentMethod paymentMethod = order.getPaymentMethod();
    if (!paymentMethod.isCash()) {
      return false;
    }
  }
  return true;
}
</pre>
<p>This implementation results in N+1 queries, which may not be so bad if the customer has only a few orders. But what if we want to calculate this for all customers? Now it is M(N+1) + 1 queries (where M is the number of customers), and the cost becomes much higher. This question can be answered with a single SQL query:</p>
<pre class="prettyprint">select * from Customer c join Order o on o.customer_FK = c.id
where not exists (
select * from PaymentMethod m where m.order_FK = o.id and m.type &lt;&gt; 'CASH'
)</pre>
<p>It would be nice to be able to calculate this logic across relationships using our domain objects while efficiently retrieving them via a single query. Hibernate's eager fetching feature lets you accomplish this. Eager fetching allows you to load related entities at the same time using a single query. You simply tell Hibernate in your query by criteria or HQL to fetch the associated entity. Behind the scenes, Hibernate generates an SQL statement that joins to the associated entity table. The result set includes the columns from both the base entity and the associated entity, which Hibernate converts into the appropriate domain objects.</p>
<p>For an example, we will use Hibernate HQL to retrieve all customers in order to calculate if they qualify for a bonus. We want to eagerly fetch the orders and payment methods used in the calculation ahead of time to avoid multiple SQL queries being issued. The following code does this:</p>
<pre class="prettyprint">List&lt;Customer&gt; list = session.createQuery(
  "select c from Customer c left join fetch c.orders o " +
  "left join fetch o.paymentMethod p").list();</pre>
<p>The list of customers returned already has the Order and PaymentMethod objects loaded in a single SQL query, so now when you iterate over the list and call the <code>qualifiesForBonus()</code> method, the processing will be done entirely in memory with no additional SQL queries.</p>
<p>As wonderful as eager fetching is, it comes with several surprises. Understanding the different types of SQL joins is important. The above HQL specifying "left join" is shorthand for "left outer join", which means that Customers without Orders will still be returned. If you just specify "join", which is shorthand for "inner join", then such Customers will not be returned. Let us say that the database holds three customers with two orders each and one customer with no orders. The resulting SQL using "left join" will return seven rows (3 * 2 + 1), whereas using "join" it will return only six rows. For the "left join" query, how many elements will the resulting list created by Hibernate contain? You might expect the answer to be four, for the four customers in the database, but the answer is seven. Hibernate does return only four instances of Customer, as per its guarantee of having a single Java instance of a given domain object per session, but three of these instances are referenced twice in the list. Given the strong object-oriented nature of HQL and Hibernate, I found it surprising that such an object-based query language would return duplicate references to match what the SQL query returns. The apparent explanation from the Hibernate website for this behavior is that there are situations where this is the desired behavior (although I cannot think of any myself). Fortunately, there is a simple solution. Adding the keyword distinct to the query (i.e. "select distinct c from ...") will not change the SQL query or the number of results it returns, but it will cause Hibernate to eliminate the duplicate references and return four elements in the list as expected.</p>
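<p>The effect of the duplicate references can be reproduced in plain Java. The sketch below is not how Hibernate implements <code>distinct</code>; <code>DistinctReferences</code> is a made-up helper that only mimics the outcome, reducing a list of seven references to the four underlying instances while keeping their original order.</p>

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class DistinctReferences {

  // Keeps the first occurrence of each instance, comparing by identity
  // (the same Java object) rather than by equals(), and preserving order.
  static <T> List<T> distinctByIdentity(List<T> rows) {
    Set<T> seen = Collections.newSetFromMap(new IdentityHashMap<T, Boolean>());
    List<T> result = new ArrayList<T>();
    for (T row : rows) {
      if (seen.add(row)) {
        result.add(row);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Plain objects stand in for Customer instances: three customers with
    // two orders each appear twice, the customer without orders appears once.
    Object a = new Object();
    Object b = new Object();
    Object c = new Object();
    Object d = new Object();
    List<Object> rows = new ArrayList<Object>();
    Collections.addAll(rows, a, a, b, b, c, c, d);
    System.out.println(rows.size() + " references, "
        + distinctByIdentity(rows).size() + " distinct instances");
  }
}
```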
<p>To use eager fetching when doing query by criteria, use the <code>setFetchMode()</code> method. For example:</p>
<pre class="prettyprint">List&lt;Customer&gt; list = session.createCriteria(Customer.class)
  .setFetchMode("orders", FetchMode.JOIN).list();</pre>
<p>Hibernate will determine whether to do an inner or outer join: if the relationship is optional, an outer join will be used, potentially returning duplicate references as described above.</p>
<p>The entire point of eager fetching is to achieve better performance. If you try to eagerly fetch too many relationships, however, especially optional relationships resulting in outer joins, you may find your performance becomes worse, not better. Each additional outer join adds another dimension to the Cartesian product representing the result of the query, and too large a result set will cause performance to be worse than without the eager fetching. Hibernate 2 actually limited eager fetching to a single to-many relationship to help prevent this from occurring. Hibernate 3 drops this restriction, leaving you responsible for the consequences.</p>
<p>There are other surprises lurking in Hibernate's implementation of eager fetching. I recommend referring to <a href="http://www.hibernate.org/">Hibernate's online documentation</a> for more information, particularly the FAQs.</p>
<p>This article is one of a series on <a href="http://www.basilv.com/psd/blog/2008/hibernate-tips-and-tricks">Hibernate Tips &#038; Tricks</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2008/improving-performance-via-eager-fetching-in-hibernate/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Avoid Non-Lazy Relationships in Hibernate</title>
		<link>http://www.basilv.com/psd/blog/2008/avoid-non-lazy-relationships-in-hibernate</link>
		<comments>http://www.basilv.com/psd/blog/2008/avoid-non-lazy-relationships-in-hibernate#comments</comments>
		<pubDate>Thu, 06 Mar 2008 05:39:07 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[Java]]></category>
		<category><![CDATA[Hibernate]]></category>
		<category><![CDATA[software development]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/blog/2008/avoid-non-lazy-relationships-in-hibernate</guid>
		<description><![CDATA[Support for entity relationships is a great time-saving feature in Hibernate, but it can also be a trap for the unsuspecting developer. Handling relationships between entities can be a complex business, and I for one am glad for all the support that Hibernate provides. Hibernate's assistance, however, can do more harm than good when it [...]]]></description>
			<content:encoded><![CDATA[<p>Support for entity relationships is a great time-saving feature in <a href="http://www.hibernate.org/">Hibernate</a>, but it can also be a trap for the unsuspecting developer. Handling relationships between entities can be a complex business, and I for one am glad for all the support that Hibernate provides. Hibernate's assistance, however, can do more harm than good when it comes to performance considerations. Because Hibernate does such a good job of hiding what SQL is generated for entity relationships and when it is executed, it is all too easy to produce poorly performing code and not realize it. Hibernate does provide a number of features to optimize performance, especially for relationships, such as the first and second level cache, lazy loading, and eager fetching. This multitude of features, however, can compound the problem when developers are unaware of them or of how they work. </p>
<p>This article focuses on lazy versus non-lazy relationships. In Hibernate 3 relationships are by default lazy: when an entity is loaded, the related entity or collection will not be loaded until it is accessed. In Hibernate 2, the default was non-lazy relationships. This default was changed in Hibernate 3 because of all the problems people encountered with non-lazy relationships. Despite the change in default in Hibernate 3, I have still seen developers explicitly use non-lazy relationships in the mistaken belief that they will provide a performance boost. I believe their assumption is that because Hibernate loads the non-lazy related entities at the same time as it loads the base entity, Hibernate must do this in a single SQL statement which is therefore more efficient. This is not true. </p>
<p>Hibernate's actual behavior is easy to see once you <a href="http://www.basilv.com/psd/blog/2008/hibernate-and-logging">turn on SQL logging</a>. Hibernate will issue one query to load the entities, and then issue one or more additional SQL queries in order to load the related entities. This does not seem so bad for a single relationship, but consider a typical application with a network of related entities. If all the relationships are non-lazy, then loading one entity will load all the related entities in the network. If collection relationships are involved the problem can become even worse. In certain scenarios, such as when loading a collection of child records that each have a non-lazy relationship to another entity, Hibernate will perform 1+N queries: one query to load the collection of children, and one query per child entity (N queries total) to load the related record for each child. </p>
<p>To provide a concrete example, consider the following object model.<br />
<a href='http://www.basilv.com/psd/wp-content/uploads/2008/03/hibernaterelationshipsclassdiagram.png' title='Class Diagram'><img src='http://www.basilv.com/psd/wp-content/uploads/2008/03/hibernaterelationshipsclassdiagram.png' alt='Class Diagram' /></a></p>
<p>The corresponding classes are summarized as follows:</p>
<pre class="prettyprint">
class Customer {
  public Collection&lt;Order&gt; getOrders();
}
class Order {
  public Customer getCustomer();
  public PaymentMethod getPaymentMethod();
}
class PaymentMethod {
  public Order getOrder();
}
</pre>
<p>Let us assume that the relationship between Order and PaymentMethod is non-lazy. What happens when we have a single Customer and access its orders? Calling the getOrders() method on Customer when Customer is persistent and the orders collection has not yet been initialized causes Hibernate to issue a query to load the orders:</p>
<pre class="prettyprint">select * from Order where Customer_FK = ?</pre>
<p>Each row returned by this query is converted by Hibernate into an Order instance which is then associated with the existing Customer object. Because the relationship to PaymentMethod is non-lazy, however, Hibernate must also load the corresponding PaymentMethod object via an additional query:</p>
<pre class="prettyprint">select * from PaymentMethod where Order_FK = ?</pre>
<p>This PaymentMethod query is executed for each Order, so if there are N orders on the customer, Hibernate will execute N+1 queries. This is despite the relationship between Order and PaymentMethod being one-to-one. If it were a one-to-many relationship instead with M PaymentMethod instances for each Order, the performance would be even worse: 1+N*M queries total. On a recent project where multiple relationships were configured as non-lazy (compared to just one relationship in this example), I have seen a single operation result in hundreds of queries being executed to load almost the entire related object graph. </p>
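<p>The one-to-one case described above can be sketched with a toy counter. This is only an illustration of the arithmetic, not Hibernate code: the class <code>NonLazyQueryCount</code> and its methods are hypothetical names, and each "query" is just an increment of a counter.</p>

```java
/** Toy model of the query counts described above; it does not use Hibernate. */
public class NonLazyQueryCount {

    /** Incremented once per simulated SQL statement. */
    static int queriesIssued = 0;

    /** Simulates "select * from Order where Customer_FK = ?" returning orderCount rows. */
    static int[] loadOrderIds(int orderCount) {
        queriesIssued++;
        int[] ids = new int[orderCount];
        for (int i = 0; i < orderCount; i++) {
            ids[i] = i + 1;
        }
        return ids;
    }

    /** Simulates "select * from PaymentMethod where Order_FK = ?" for one order. */
    static String loadPaymentMethod(int orderId) {
        queriesIssued++;
        return "PaymentMethod for order " + orderId;
    }

    /** Non-lazy behavior: every loaded Order immediately loads its PaymentMethod. */
    static int queriesForNonLazyLoad(int orderCount) {
        queriesIssued = 0;
        for (int orderId : loadOrderIds(orderCount)) {
            loadPaymentMethod(orderId);
        }
        return queriesIssued;
    }

    public static void main(String[] args) {
        System.out.println(queriesForNonLazyLoad(100)); // prints 101
    }
}
```

<p>With 100 orders the counter reaches 101: one query for the collection plus one per order, whether or not the caller ever looks at a PaymentMethod.</p>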
<p>The solution to this problem is easy: use only lazy relationships. I have a hard time thinking of any scenario where a non-lazy relationship would be a good idea. They certainly do not help performance, since the same set of SQL queries is issued for lazy and non-lazy relationships; lazy relationships simply defer each query until it is needed. If you do want to improve the performance of loading related entities then <a href="http://www.basilv.com/psd/blog/2008/improving-performance-via-eager-fetching-in-hibernate">eager fetching is the feature to use</a>.</p>
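<p>To illustrate what deferring a query means in practice, here is a minimal sketch that models a lazy reference by hand. It does not use Hibernate: <code>LazyLoadingSketch</code> and <code>LazyPaymentMethod</code> are hypothetical names, and the inner class only stands in for the kind of placeholder Hibernate substitutes for an entity that has not been loaded yet.</p>

```java
/** Toy model of a lazy one-to-one reference; it does not use Hibernate. */
public class LazyLoadingSketch {

    /** Incremented once per simulated SQL statement. */
    static int queriesIssued = 0;

    /** Stands in for the placeholder used for an entity that is not loaded yet. */
    static final class LazyPaymentMethod {
        private final int orderId;
        private String value; // null until first access

        LazyPaymentMethod(int orderId) {
            this.orderId = orderId;
        }

        String get() {
            if (value == null) {
                queriesIssued++; // simulated "select * from PaymentMethod where Order_FK = ?"
                value = "PaymentMethod for order " + orderId;
            }
            return value;
        }
    }

    /** Loads orderCount orders, then reads the payment method of the first `touched` orders. */
    static int queriesWhenTouching(int orderCount, int touched) {
        queriesIssued = 1; // the one query that loads the orders collection
        LazyPaymentMethod[] refs = new LazyPaymentMethod[orderCount];
        for (int i = 0; i < orderCount; i++) {
            refs[i] = new LazyPaymentMethod(i + 1);
        }
        for (int i = 0; i < touched; i++) {
            refs[i].get();
            refs[i].get(); // repeated access reuses the loaded value
        }
        return queriesIssued;
    }

    public static void main(String[] args) {
        System.out.println(queriesWhenTouching(100, 0));   // prints 1
        System.out.println(queriesWhenTouching(100, 3));   // prints 4
        System.out.println(queriesWhenTouching(100, 100)); // prints 101
    }
}
```

<p>The worst case matches the non-lazy count, which is the point made above: laziness never adds queries, it only skips the ones nobody asks for.</p>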
<p>This article is one of a series on <a href="http://www.basilv.com/psd/blog/2008/hibernate-tips-and-tricks">Hibernate Tips &#038; Tricks</a>. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2008/avoid-non-lazy-relationships-in-hibernate/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Hibernate and Logging</title>
		<link>http://www.basilv.com/psd/blog/2008/hibernate-and-logging</link>
		<comments>http://www.basilv.com/psd/blog/2008/hibernate-and-logging#comments</comments>
		<pubDate>Sat, 23 Feb 2008 22:08:40 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[Java]]></category>
		<category><![CDATA[Hibernate]]></category>
		<category><![CDATA[log4j]]></category>
		<category><![CDATA[software development]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/blog/2008/hibernate-and-logging</guid>
		<description><![CDATA[Hibernate tries to hide the details of dealing with relational databases, but it is at best a leaky abstraction. At its most basic level, Hibernate is a framework that issues SQL commands to the database. Sometimes it does not do what you would expect or want (more on that in future articles). Therefore it is [...]]]></description>
			<content:encoded><![CDATA[<p>Hibernate tries to hide the details of dealing with relational databases, but it is at best a <a href="http://www.joelonsoftware.com/articles/LeakyAbstractions.html">leaky abstraction</a>. At its most basic level, Hibernate is a framework that issues SQL commands to the database. Sometimes it does not do what you would expect or want (more on that in future articles). Therefore it is very useful at times to monitor or review the SQL being produced by Hibernate. The book <a href="http://www.amazon.ca/gp/product/1932394885?ie=UTF8&#038;tag=basilvandegri-20&#038;linkCode=as2&#038;camp=15121&#038;creative=330641&#038;creativeASIN=1932394885">Java Persistence with Hibernate</a><img src="http://www.assoc-amazon.ca/e/ir?t=basilvandegri-20&#038;l=as2&#038;o=15&#038;a=1932394885" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /> covers this topic primarily by discussing the <code>hibernate.show_sql</code> configuration property, which writes generated SQL to the console. I have found this far too limiting: I instead want the SQL logged in my application's log (i.e. as produced by log4j) to receive all the advantages that a central logging system can provide. This can be accomplished by turning on <code>DEBUG</code> logging for the logging context <code>org.hibernate.SQL</code>. If you are using log4j, add the following line to your log4j.properties file: </p>
<pre class="prettyprint">
log4j.logger.org.hibernate.SQL=DEBUG
</pre>
<p>One limitation of this SQL logging is that it reports the SQL statement but not the values of the parameters. Since Hibernate almost always uses prepared statements with parameters, this is often an annoying limitation. Fortunately, Hibernate does allow the logging of parameter values: turn on <code>DEBUG</code> logging for the logging context <code>org.hibernate.type</code>. Unfortunately, this results in very verbose logs since each parameter value for each query is a separate log entry. Sample log output for a single insert statement is shown below.</p>
<pre class="prettyprint">
07 Feb 2008 13:30:52,596 DEBUG insert into example.customer (CATEGORY,
CREATE_USER_ID, CREATE_TIMESTAMP, UPDATE_USER_ID, UPDATE_TIMESTAMP, OID)
values (?, ?, ?, ?, ?, ?) - org.hibernate.SQL [main] [44513 ms]

07 Feb 2008 13:30:52,596 DEBUG binding 'S' to parameter: 1 -
hibernate.type.StringType [main] [44513 ms]

07 Feb 2008 13:30:52,596 DEBUG binding 'junit test' to parameter: 2 -
hibernate.type.StringType [main] [44513 ms]

07 Feb 2008 13:30:52,596 DEBUG binding '2008-02-07 13:30:52' to parameter: 3 -
hibernate.type.DbTimestampType [main] [44513 ms]

07 Feb 2008 13:30:52,596 DEBUG binding '2008-02-07 13:30:52' to parameter: 4 -
hibernate.type.TimestampType [main] [44513 ms]

07 Feb 2008 13:30:52,596 DEBUG binding 'junit test' to parameter: 5 -
hibernate.type.StringType [main] [44513 ms]

07 Feb 2008 13:30:52,596 DEBUG binding '2691102' to parameter: 6 -
hibernate.type.LongType [main] [44513 ms]
</pre>
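<p>Putting the two settings from this article together, a log4j.properties fragment that produces output like the sample above might look as follows. Both logger names are the ones named in the text; the <code>DEBUG</code> level for <code>org.hibernate.type</code> reflects the Hibernate version used here, and other versions may log parameter bindings at a different level.</p>

```properties
# Log each SQL statement Hibernate generates
log4j.logger.org.hibernate.SQL=DEBUG
# Log the value bound to each statement parameter (very verbose)
log4j.logger.org.hibernate.type=DEBUG
```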
<p>I much prefer the way that Spring JDBC logs SQL statements – both the SQL and the list of parameter values are logged as a single statement. If you know how to do this in Hibernate, I would appreciate hearing from you.</p>
<p>Hibernate uses Apache's <a href="http://commons.apache.org/logging/">commons-logging</a> to abstract the actual logging mechanism: typically either <a href="http://logging.apache.org/log4j/index.html">log4j</a> or Java logging (introduced in Java 1.4). I always use log4j myself, and normally including the log4j jar file in the classpath is sufficient to have commons-logging use log4j. When running code in an application server, however, this is not always the case. I have had issues with both WebSphere and WebLogic application servers configuring commons-logging for some other behavior and having that setting 'leak' into my application. As a result, instead of having log messages from Hibernate appear in the application log along with the rest of the log statements produced directly with the log4j API, the Hibernate log messages appear elsewhere. In WebSphere 6.1, I had them going to the console (standard out). The simplest solution is to add a configuration file named <code>commons-logging.properties</code> to the root of your application's classpath with the following content that instructs commons-logging to use log4j for logging:</p>
<pre class="prettyprint">
org.apache.commons.logging.Log=org.apache.commons.logging.impl.Log4JLogger
</pre>
<p>This article is one of a series on <a href="http://www.basilv.com/psd/blog/2008/hibernate-tips-and-tricks">Hibernate Tips &#038; Tricks</a>. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2008/hibernate-and-logging/feed</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Hibernate Tips and Tricks</title>
		<link>http://www.basilv.com/psd/blog/2008/hibernate-tips-and-tricks</link>
		<comments>http://www.basilv.com/psd/blog/2008/hibernate-tips-and-tricks#comments</comments>
		<pubDate>Sat, 23 Feb 2008 21:58:22 +0000</pubDate>
		<dc:creator>Basil Vandegriend</dc:creator>
				<category><![CDATA[Java]]></category>
		<category><![CDATA[coding]]></category>
		<category><![CDATA[Hibernate]]></category>
		<category><![CDATA[Java EE]]></category>
		<category><![CDATA[software development]]></category>

		<guid isPermaLink="false">http://www.basilv.com/psd/blog/2008/hibernate-tips-and-tricks</guid>
		<description><![CDATA[Hibernate is a de facto standard for object-relational mapping. One of my recent projects involved the use of the latest version of Hibernate (3.2). Since I had not used Hibernate since its version 2 days, I picked up the authoritative reference Java Persistence with Hibernate which is co-authored by Gavin King, the founder of Hibernate. [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.hibernate.org/">Hibernate</a> is a de facto standard for object-relational mapping. One of my recent projects involved the use of the latest version of Hibernate (3.2). Because I had not used Hibernate since its version 2 days, I picked up the authoritative reference <a href="http://www.amazon.ca/gp/product/1932394885?ie=UTF8&#038;tag=basilvandegri-20&#038;linkCode=as2&#038;camp=15121&#038;creative=330641&#038;creativeASIN=1932394885">Java Persistence with Hibernate</a><img src="http://www.assoc-amazon.ca/e/ir?t=basilvandegri-20&#038;l=as2&#038;o=15&#038;a=1932394885" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /> which is co-authored by Gavin King, the founder of Hibernate. I carefully perused the book prior to starting development and found it useful and comprehensive. </p>
<p>As development proceeded, however, Hibernate threw a number of surprises my way, and I ran into a number of seemingly common issues that were either missing from the book or inadequately covered. So I wanted to share these Hibernate tips &#038; tricks with you. Rather than produce one large article consisting of a collection of unrelated tips, I decided to produce separate articles, each on a focused subject. Below I list these specific tips &#038; tricks articles:</p>
<ul>
<li><a href="http://www.basilv.com/psd/blog/2008/hibernate-and-logging">Hibernate and Logging</a>: Provides tips on how to log the SQL created by Hibernate.</li>
<li><a href="http://www.basilv.com/psd/blog/2008/avoid-non-lazy-relationships-in-hibernate">Avoid Non-Lazy Relationships</a>: Why lazy relationships should always be used.</li>
<li><a href="http://www.basilv.com/psd/blog/2008/improving-performance-via-eager-fetching-in-hibernate">Improving Performance via Eager Fetching in Hibernate</a>: How eager fetching works and how to use it.</li>
<li><a href="http://www.basilv.com/psd/blog/2008/automatically-populating-audit-columns-in-hibernate">Automatically Populating Audit Columns in Hibernate</a>: How to easily populate audit columns for your entire application.</li>
<li><a href="http://www.basilv.com/psd/blog/2010/avoiding-caching-to-improve-hibernate-performance">Avoiding Caching To Improve Hibernate Performance</a>: When overuse of Hibernate's cache can hurt performance.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.basilv.com/psd/blog/2008/hibernate-tips-and-tricks/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

