The Micro Framework Approach

December 3, 2010

The demand for Grails and Groovy is clearly rising these days, at least here in Austria and Germany. Although most of my workshops are adapted individually (depending on the participants' previous knowledge and programming language experience), some parts show up more or less unmodified in every workshop: Groovy essentials and advanced topics, and what I call micro framework examples. This article is about the idea behind micro framework examples and why I find them so useful as workshop examples.

What is a Micro Framework Example?

I strongly believe that a true understanding of the patterns behind frameworks like Hibernate and Spring can’t easily be conveyed in a bunch of slides. Explaining patterns is one thing, but actually seeing how they are applied is another. One approach I’ve found to be really useful in workshops is the use of micro framework examples. A micro framework example implements the core functionality behind a specific framework, reduced to the very fundamentals. One advantage of implementing a micro framework example together with participants is that it forces them to think about what functionality is needed and how it can be implemented. A nice side effect is that it gently introduces the original framework’s ubiquitous language, simply by using the same class and method names.

Let me give you an example. The most daunting topic for many of my clients is understanding Hibernate and its persistence patterns. One way to build a better understanding is to implement a micro Hibernate framework example. This can be done in a simple Groovy script, MicroHibernate.groovy, which defines two classes and a simple test case. The first class implements the registry pattern and is called SessionFactory:

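class SessionFactory {

    private def storage

    def SessionFactory(def storage)  {
        this.storage = storage
    }

    // hands out the storage "connection"; in this example simply a Map
    def newStorageConnection()  {
        return storage
    }

    // creates a new persistence context bound to this factory (used by the test case)
    def newSession()  {
        return new Session(this)
    }
}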

The SessionFactory acts as the main access point for getting a reference to some storage connection. In the micro framework example this will simply be a Map. Dealing with SQL or even a real database would needlessly complicate the example, and we want to concentrate on the core essentials. Let’s go on to the next class, which implements the persistence context pattern:


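// these imports belong at the top of MicroHibernate.groovy
import org.apache.commons.logging.Log
import org.apache.commons.logging.LogFactory

class Session {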

    static Log log = LogFactory.getLog(Session)

    private def sessionFactory

    def Session(def sessionFactory) { this.sessionFactory = sessionFactory }

    def snapshots = [:] // a Map(Domain-Class, Map(Identifier, Properties))
    def identityMap = [:] // a Map(Domain-Class, Map(Identifier, ObjectRef))
    def modifiedPersistentObjects = [:] // a Map(Domain-Class, List(Identifier))
    
    def propertyChanged(def obj)  {
        if (!modifiedPersistentObjects[obj.getClass()]) modifiedPersistentObjects[obj.getClass()] = []

        modifiedPersistentObjects[obj.getClass()] << obj.id

        log.info "propertyChanged of object: ${obj} with id ${obj.id}"
    }

    def load(Class<?> domainClassType, Long identifier)  {
        
        if (identityMap[domainClassType] && identityMap[domainClassType][identifier])  {
            return identityMap[domainClassType][identifier]
        }
        
        def conn = sessionFactory.newStorageConnection()
        def loadedObj = conn[domainClassType][identifier]
        if (!loadedObj) throw new Exception("Object of type ${domainClassType} with id ${identifier} could not be found!")
        
        if (!snapshots[domainClassType]) snapshots[domainClassType] = [:]
        if (!identityMap[domainClassType]) identityMap[domainClassType] = [:]


        def properties = loadedObj.getProperty("props")
        snapshots[domainClassType][identifier] = properties.inject([:], { m, property -> m[property] = loadedObj[property]; m })

        log.info "create snapshot of ${domainClassType} id ${identifier} with properties ${snapshots[domainClassType][identifier]}"

        identityMap[domainClassType][identifier] = loadedObj
        
        loadedObj.metaClass.getId = { -> identifier } 
        loadedObj.metaClass.setProperty = { String name, Object arg ->
            def metaProperty = delegate.metaClass.getMetaProperty(name)

            if (metaProperty)  {
                owner.propertyChanged(loadedObj)
                
                metaProperty.setProperty(delegate, arg)
            }
        }
        
        return loadedObj
    }
}

A Session object can be used to retrieve already persistent objects and to persist so-called transient objects. I like to start by implementing the load method, which loads a persistent object from the storage connection of the current session factory. Of course, this is not an example for Groovy beginners, but with a little knowledge of the MOP (Meta-Object Protocol) and some programming guidance it should not be hard to understand what is going on. Finally, let’s define a test case which shows how both classes are actually used:

def storage = [:]

class Person {
    String name

    String toString() { name }

    static props = ['name']
}

storage[Person] = [:]
storage[Person][1 as Long] = new Person(name: 'Max Mustermann')
storage[Person][2 as Long] = new Person(name: 'Erika Mustermann')

def sessionFactory = new SessionFactory(storage)
def session = sessionFactory.newSession()

def person = session.load(Person, 1L)   // Long literal matches load(Class, Long)

Interestingly, even without considering SQL, DB connection handling, threading issues etc., participants already get a feeling for several Hibernate gotchas that beginners otherwise often struggle with:

  • the first level cache
  • the need for proxies or MOP modifications
  • Hibernate’s use of object snapshots
  • the IdentityMap pattern
  • repeatable read transaction isolation level
  • etc.

It is amazing how much can be explained by implementing a framework’s core functionality in about 5 minutes. The Session functionality then gets extended by flush, discard and save/delete functionality. Programmers who have been through the process of implementing such a micro Hibernate example often gain a basic but fundamental understanding of how an ORM framework could work and what the main challenges are. By keeping the class and method names in sync with the concrete Hibernate implementations, participants learn the framework’s basic domain language.
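
As a rough idea of what the flush extension could look like, here is a minimal sketch in plain Java rather than Groovy. It is not taken from GSamples: the class name MicroSessionSketch, the use of property maps as entities and the snapshot comparison in flush are all assumptions made for illustration (the Groovy version above tracks changes through the MOP instead).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical Java analogue of the micro Session; entities are plain property maps.
class MicroSessionSketch {
    private final Map<Long, Map<String, Object>> storage;                        // stands in for the database
    private final Map<Long, Map<String, Object>> identityMap = new HashMap<>();  // first-level cache
    private final Map<Long, Map<String, Object>> snapshots = new HashMap<>();    // state at load time

    MicroSessionSketch(Map<Long, Map<String, Object>> storage) {
        this.storage = storage;
    }

    // Returns the same object instance for the same identifier (identity map).
    Map<String, Object> load(long id) {
        Map<String, Object> cached = identityMap.get(id);
        if (cached != null) {
            return cached;
        }
        Map<String, Object> row = storage.get(id);
        if (row == null) {
            throw new IllegalArgumentException("No object with id " + id);
        }
        Map<String, Object> entity = new HashMap<>(row);  // working copy handed to the caller
        identityMap.put(id, entity);
        snapshots.put(id, new HashMap<>(row));            // snapshot for later comparison
        return entity;
    }

    // Compares each loaded object with its snapshot and writes back only the changed ones.
    // Returns the identifiers that were written.
    List<Long> flush() {
        List<Long> written = new ArrayList<>();
        for (Map.Entry<Long, Map<String, Object>> e : identityMap.entrySet()) {
            if (!e.getValue().equals(snapshots.get(e.getKey()))) {
                storage.put(e.getKey(), new HashMap<>(e.getValue()));
                snapshots.put(e.getKey(), new HashMap<>(e.getValue()));
                written.add(e.getKey());
            }
        }
        return written;
    }
}
```

Loading the same identifier twice returns the same instance, and flush only touches storage for objects whose state differs from the snapshot taken at load time. Those are two of the gotchas from the list above in a few lines.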

GSamples – A Repository for Sharing Workshop Examples

The example mentioned above is available in a public GitHub repository which I called GSamples [0], a collection of Groovy and Grails workshop examples. At the time of publishing this article it contains two micro framework examples; the other one is a simple dependency injection container. In addition, GSamples holds Groovy scripts dealing with Groovy essentials and with advanced Groovy topics like the Meta-Object Protocol and closures. Feel free to extend, distribute or use it!

[0] GSamples – https://github.com/andresteingress/gsamples

GroovyMag October 2010 Issue is Out!

GroovyMag 2010/10 features an article of mine about GORM and persistence context patterns. Check it out – really worth the 4.99 😉

Content:

  • Hibernate, GORM and the Persistence Context by Andre Steingress
  • Getting Started with Gaelyk by Peter Bell
  • Lean Groovy Part VII by Hamlet D’Arcy
  • Groovy Under the Hood – Closure Class by Kirsten Schwank
  • … and much more
Categories: hibernate, Intro, patterns

Getting the Persistence Context Picture (Part III)

April 20, 2010

Part 3 of this series deals with more advanced topics, requiring knowledge about persistence patterns and Hibernate APIs.

  • [0] Getting the Persistence Context Picture (Part I)
  • [1] Getting the Persistence Context Picture (Part II)

Conversational State Management

One advanced use case when using persistence frameworks is the realization of conversations.

A conversation spans multiple user interactions and, most of the time, realizes a well-defined process. The best way to think of a conversation is as some kind of wizard, e.g. a newsletter registration wizard.

A Newsletter Registration Conversation

A newsletter registration wizard typically spans multiple user interactions, where each interaction needs user input and further validation before moving on:

  1. a user needs to provide basic data, e.g. firstname, lastname, birthdate, etc.
  2. a user needs to register for several newsletter categories
  3. a user gets a summary and needs to confirm that information

Each user interaction is part of the overall newsletter registration process. Technically speaking, whenever the user aborts the process at some point, or an unrecoverable error occurs, this must have no consequences for the underlying persistent data structures. E.g. if a user registers for newsletter categories (step 2) and then abandons the registration by closing the browser window so that the HTTP session times out, the registration and the newly created newsletter user need to be rolled back.

A first naive approach to realizing conversations is to use a single database transaction. Modern applications hardly ever use that approach because it is error-prone and hard to justify in terms of performance. In order to really get a grasp of the problems we would face, let us take a look at some basics of database transactions.

A Small Intro to Database Transactions

Whenever a database transaction is started, all data modifications are tracked by the database. For example, in MySQL (InnoDB) databases, pages (think of a special data structure) are modified in a buffer pool, and modifications are tracked in a redo log which is kept in sync with the disk. When a transaction is committed, its changes are made durable on disk; if the transaction is rolled back instead, the changes are undone.

It depends on the current isolation level whether a transaction sees changes made by transactions executing in parallel (more details on MySQL transactions can be found at [2]). MySQL’s default isolation level is “repeatable read”: all reads within the same transaction return the same results, even if another transaction has changed the data in the meantime. InnoDB (a transactional MySQL storage engine, integrated in the MySQL server) achieves this behavior by creating a snapshot when the first query is executed.

The isolation levels defined by the SQL-92 standard are, in increasing order of strictness: “read uncommitted” < “read committed” < “repeatable read” < “serializable”. The order reflects the amount of locking necessary to realize the respective isolation level.
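
In code, the four levels map directly onto constants of the JDBC Connection interface. The following is a small sketch, with the enum name IsolationLevelSketch and its isAtLeast helper being made up for this illustration; only the java.sql.Connection constants are part of the real API.

```java
import java.sql.Connection;

// The four SQL-92 isolation levels, weakest first, mapped to their JDBC constants.
enum IsolationLevelSketch {
    READ_UNCOMMITTED(Connection.TRANSACTION_READ_UNCOMMITTED),
    READ_COMMITTED(Connection.TRANSACTION_READ_COMMITTED),
    REPEATABLE_READ(Connection.TRANSACTION_REPEATABLE_READ),
    SERIALIZABLE(Connection.TRANSACTION_SERIALIZABLE);

    final int jdbcConstant;

    IsolationLevelSketch(int jdbcConstant) {
        this.jdbcConstant = jdbcConstant;
    }

    // True if this level is at least as strict as the other one
    // (relies on the declaration order above).
    boolean isAtLeast(IsolationLevelSketch other) {
        return ordinal() >= other.ordinal();
    }
}
```

Given an open JDBC connection, a level would be applied with connection.setTransactionIsolation(IsolationLevelSketch.REPEATABLE_READ.jdbcConstant). Whether a particular database actually supports every level depends on the driver.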

A Naive Approach

Single DB Transaction

Back to conversational state management: as mentioned above, a naive approach would be to use a single database transaction for a single conversation. This approach obviously has many problems:

  • if data is modified and DML statements are generated, locks are usually acquired, preventing other transactions from changing that data.
  • databases are designed for transactions that are as short as possible; a transaction is seen as an atomic unit of work, not a long-living session, and long-running transactions are typically aborted by the database management system.
  • especially in web applications, it is hard for an application to detect that a conversation has been aborted: when users close the browser window in the middle of a transaction, or kill the browser process, there is hardly a chance for the application to notice.
  • a transaction is typically linked to a database connection, and the number of database connections available to the application is typically limited.

As you can see, spanning a conversation with a database transaction is not an option. But a pattern already known from the previous articles comes to the rescue: the persistence context.

Extended Persistence Context Pattern

As we’ve already seen in the second part of this series [1] Grails uses a so-called Session-per-Request pattern.

Session per Request Pattern

Whenever a controller’s method is called, a new Hibernate session spans the method call and, with flush mode turned to manual, the view rendering. When the view rendering is done, the session is closed. Of course, this pattern is not an option when implementing conversations, since changes made in a controller’s method call are committed when the method returns. One could bypass Grails’ standard behavior using detached objects, but let me tell you: life only gets more complicated when detaching modified objects, especially in advanced domain models.

What we need to implement a conversation is a mechanism that spans the persistence context over several user requests. That pattern is called the extended persistence context pattern.

Extended Persistence Context

An extended persistence context reuses the persistence context for all interactions within the same conversation. In Hibernate speak: we need to find a way to (re)use a single org.hibernate.Session instance for the lifetime of the conversation.
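
To make the idea concrete without any Hibernate machinery, here is a minimal sketch in plain Java. Everything in it is hypothetical (the class ConversationContextSketch, string-valued "changes", a list standing in for the database); it only illustrates that changes are collected across several requests and reach the database in one go when the conversation ends.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical conversation-scoped context: changes are buffered across requests
// and only written to the (simulated) database when the conversation ends.
class ConversationContextSketch {
    private final Map<String, List<String>> pendingByConversation = new HashMap<>();
    private final List<String> database = new ArrayList<>();

    // Starts a conversation with an empty change buffer.
    void begin(String conversationId) {
        pendingByConversation.put(conversationId, new ArrayList<>());
    }

    // Called once per user request; nothing reaches the database yet.
    void record(String conversationId, String change) {
        List<String> pending = pendingByConversation.get(conversationId);
        if (pending == null) {
            throw new IllegalStateException("Unknown conversation " + conversationId);
        }
        pending.add(change);
    }

    // Final step confirmed: write all buffered changes at once.
    void commit(String conversationId) {
        List<String> pending = pendingByConversation.remove(conversationId);
        if (pending != null) {
            database.addAll(pending);
        }
    }

    // Abort or timeout: dropping the buffer is the whole rollback.
    void abort(String conversationId) {
        pendingByConversation.remove(conversationId);
    }

    // Returns a copy of what has been written so far.
    List<String> databaseContents() {
        return new ArrayList<>(database);
    }
}
```

Note that an aborted conversation needs no database rollback at all, because nothing was written in the first place. That is exactly the property the single long-running transaction could not offer cheaply.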

Fortunately, there is a Grails plugin which serves that purpose perfectly: the web flow plugin.

Conversational Management with Web Flows

The Grails web flow plugin is based on Spring Web Flow [3]. Spring Web Flow uses XML configuration data to specify web flows:

<flow xmlns="http://www.springframework.org/schema/webflow"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.springframework.org/schema/webflow
  http://www.springframework.org/schema/webflow/spring-webflow-2.0.xsd">

  <view-state id="enterBasicUserData">
    <transition on="submit" to="registerForNewsletters" />
  </view-state>
	
  <view-state id="registerForNewsletters">
    <transition on="submit" to="showSummary" />
    <!-- ... -->
  </view-state>

  <view-state id="showSummary">
    <transition on="save" to="newsletterConfirmed" />
    <transition on="cancel" to="newsletterCanceled" />
  </view-state>  
	
  <end-state id="newsletterConfirmed" >
    <output name="newsletterId" value="newsletter.id"/>
  </end-state>

  <end-state id="newsletterCanceled" />		
</flow>	

Grails uses its own DSL, implemented in org.codehaus.groovy.grails.webflow.engine.builder.FlowBuilder. This approach has the advantage of being tightly integrated into the Grails controller concept:

// ...
def newsletterRegistrationFlow = {
  step1 {
    on("save")   {
      def User userInstance = new User(params)
      flow.userInstance = userInstance

      if (!userInstance.validate()) {
        log.error "User could not be saved: ${userInstance.errors}"
            
        return error()
      }
    }.to "step2"
  }

  step2 {
    on("save")  {
      def categoryIds = params.list('newsletter.id')*.toLong()
      // ...
      def User userInstance = flow.userInstance
      newsletterService.registerUserForNewsletterCategories(userInstance, categoryIds)
      // ...
   }.to "step3"
  }

  step3 {
     on("save")  {
        def userInstance = flow.userInstance
        // ...
        userInstance.save()
     }
  }
}

In this case, the closure property newsletterRegistrationFlow is placed in a dedicated controller class and is automatically recognized by the web flow plugin. The plugin is responsible for instantiating a Grails web flow builder object which needs a closure as one of its input parameters.

Leaving the DSL aside, the best thing about web flows is that they realize the extended persistence context, aka flow managed persistence context (FMPC). The HibernateFlowExecutionListener is the place where the Hibernate session is created and then reused over multiple user interactions. It implements the FlowExecutionListener interface.

The flow execution listener provides callbacks for various states in the lifecycle of a conversation. Grails’ HibernateFlowExecutionListener uses these callbacks to implement the extended persistence context pattern. On conversation start, it creates a new Hibernate session:

public void sessionStarting(RequestContext context, FlowSession session, MutableAttributeMap input) {
	// ...
	Session hibernateSession = createSession(context);
	session.getScope().put(PERSISTENCE_CONTEXT_ATTRIBUTE, hibernateSession);
	bind(hibernateSession);
	// ...
}

Whenever the session is paused, in between separate user requests, it is disconnected from the current database connection:

public void paused(RequestContext context) {
	if (isPersistenceContext(context.getActiveFlow())) {
		Session session = getHibernateSession(context.getFlowExecutionContext().getActiveSession());
		unbind(session);
		session.disconnect();
	}
}

When the current web flow is resumed, the session is connected to a database connection again. When a web flow has completed its last step, the session is resumed one last time and all changes are flushed in a single transaction:

public void sessionEnding(RequestContext context, FlowSession session, String outcome, MutableAttributeMap output) {
	// ...
	final Session hibernateSession = getHibernateSession(session);
	// ...
	transactionTemplate.execute(new TransactionCallbackWithoutResult() {
		protected void doInTransactionWithoutResult(TransactionStatus status) {
			// joins the flow's session to the transaction; the flush happens on commit
			sessionFactory.getCurrentSession();
		}
	});
	// ...
	unbind(hibernateSession);
	hibernateSession.close();
	// ...
}

A call to sessionFactory.getCurrentSession() causes the current session to be joined to the transaction; at the end of the transaction template, all changes are committed within that transaction. All changes which have been tracked in memory so far are then synchronized with the database state.

The price to be paid for conversations is higher memory consumption. In order to estimate the cost involved, we need to take a closer look at how Hibernate realizes loading and caching of entities. Besides conversations, memory consumption is especially important in Hibernate-based batch jobs.

Using Hibernate in Batch Jobs

The most important thing when working with Hibernate is to remember: the persistence context references all persistent entities loaded, but entities don’t know anything about it. As long as the persistence context is alive it does not discard references automatically.

This is particularly important in batch jobs. When processing queries with large result sets, you have to clear the Hibernate session manually at regular intervals, otherwise the program will eventually run out of memory:

def counter = 0
for (def item : newsletters)  {
  // process item...
  if (++counter % 50 == 0)  {
    session.flush()
    session.clear()
  }
  // ...
}

Session provides a clear method that detaches all persistent objects tracked by this session instance. The evict method, invoked with a specific object instance, removes just that persistent object from the session.
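
The effect of clearing periodically can be shown with a tiny simulation in plain Java. This is only a sketch: BatchClearSketch is a made-up name and a simple list stands in for the persistence context, so no Hibernate API is involved.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical batch loop; a list stands in for the persistence context.
class BatchClearSketch {
    // Processes all items, clearing the tracked list every batchSize items
    // (batchSize must be positive). Returns the largest number of items
    // that were tracked at any one time.
    static int process(List<String> items, int batchSize) {
        List<String> tracked = new ArrayList<>();
        int peak = 0;
        int counter = 0;
        for (String item : items) {
            tracked.add(item);                       // loading an entity attaches it
            peak = Math.max(peak, tracked.size());
            if (++counter % batchSize == 0) {
                tracked.clear();                     // a real job would flush() first, then clear()
            }
        }
        return peak;
    }
}
```

With 120 items and a batch size of 50, no more than 50 objects are ever tracked at once. Without the periodic clear, the number of tracked objects would grow with the size of the result set.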

In this context, it might be worth taking a look at Hibernate’s StatefulPersistenceContext class. This is the piece of code that actually implements the persistence context pattern. As you can see in the following code snippet, invoking clear removes all references to all tracked objects:

public void clear() {
	// ...
	entitiesByKey.clear();
	entitiesByUniqueKey.clear();
	entityEntries.clear();
	entitySnapshotsByKey.clear();
	collectionsByKey.clear();
	collectionEntries.clear();
	// ...
}

Another thing to keep in mind when processing large result sets and keeping persistence contexts in memory is that Hibernate uses state snapshots to recognize modifications of persistent objects (remember how InnoDB realizes repeatable-read transaction isolation ;-)).

Whenever a persistent object is loaded, Hibernate creates a snapshot of the current state and keeps that snapshot in internal data-structures:

// ..
EntityEntry entry = getSession().getPersistenceContext().getEntry( instance );
if ( entry == null ) {
	throw new AssertionFailure( "possible nonthreadsafe access to session" );
}
		
if ( entry.getStatus()==Status.MANAGED || persister.isVersionPropertyGenerated() ) {

TypeFactory.deepCopy(
		state,
		persister.getPropertyTypes(),
		persister.getPropertyCheckability(),
	        state,
		session
);
// ...

If you don’t want Hibernate to create snapshot objects, you have to use read-only queries or objects. Marking a query as read-only is as easy as calling setReadOnly(true) on it. In read-only mode, no snapshots are created and modified persistent objects are not marked as dirty.

Newsletter.withSession { org.hibernate.classic.Session session ->

    def query = session.createQuery("from Newsletter").setReadOnly(true)
    def newsletters = query.list()

    for (def item : newsletters)  {
        // ...
    }
}

If your batch only needs read access, there is another way to optimize DB access: using a stateless session. SessionFactory has an openStatelessSession method that creates a fully stateless session, without caching, modification tracking etc. In Grails, obtaining a stateless session requires nothing more than injecting the current sessionFactory bean and calling openStatelessSession on it:

// openStatelessSession() returns an org.hibernate.StatelessSession, not a Session
StatelessSession statelessSession = sessionFactory.openStatelessSession()
statelessSession.beginTransaction()

// ...

statelessSession.getTransaction().commit()
statelessSession.close()

In connection with stateless sessions, it is worth mentioning that if you want to modify data, there is an interface for doing so at the JDBC level:

public void doWork(Work work) throws HibernateException;

Where interface Work has a single method declaration:

public interface Work {
	/**
	 * Execute the discrete work encapsulated by this work instance using the supplied connection.
	 *
	 * @param connection The connection on which to perform the work.
	 * @throws SQLException Thrown during execution of the underlying JDBC interaction.
	 * @throws HibernateException Generally indicates a wrapped SQLException.
	 */
	public void execute(Connection connection) throws SQLException;
}

As you can see, execute receives a reference to the current JDBC Connection, which can be used to issue raw SQL statements.

If your batch is processing large chunks of data, paging might be interesting too. Again, this can be done by setting the appropriate properties of Hibernate’s Query class.

// ...
def Query query = session.createQuery("from Newsletter")
query.setFirstResult(0)
query.setMaxResults(50)
query.setReadOnly(true)
query.setFlushMode(FlushMode.MANUAL)
// ...

The code snippet above explicitly sets the flush mode to “manual”, since flushing does not make sense in this context (all retrieved objects are readonly).
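
To process a whole table this way, the first-result offset is advanced page by page until a page comes back empty. The following plain Java sketch simulates that loop on an in-memory list; PagingSketch and its two methods are hypothetical helpers, with page playing the role of a query configured via setFirstResult and setMaxResults.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical paging helper working on an in-memory list instead of a query.
class PagingSketch {
    // Returns the slice a query with the given first result and max results would return.
    // Assumes firstResult is not negative.
    static <T> List<T> page(List<T> rows, int firstResult, int maxResults) {
        if (firstResult >= rows.size()) {
            return new ArrayList<>();
        }
        int end = Math.min(rows.size(), firstResult + maxResults);
        return new ArrayList<>(rows.subList(firstResult, end));
    }

    // Walks all pages and returns how many non-empty pages were needed.
    // pageSize must be positive, otherwise the loop would never advance.
    static <T> int countPages(List<T> rows, int pageSize) {
        int pages = 0;
        for (int first = 0; ; first += pageSize) {
            List<T> chunk = page(rows, first, pageSize);
            if (chunk.isEmpty()) {
                break;
            }
            pages++;   // a real batch job would process the chunk here
        }
        return pages;
    }
}
```

For 120 rows and a page size of 50 the loop sees pages of 50, 50 and 20 rows. In a real job each page would be a separate query, ideally combined with a stable ordering so that pages do not overlap.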

A similar API can be found in the Criteria class, which Grails supports through its own Criteria Builder DSL [6].

Conclusion

As you can see, there are various options for using Hibernate even for batch processing of large data sets. Programmers are not restricted to predefined methodologies, although understanding the fundamental patterns is crucial. Adjusting Hibernate’s behavior and the generated SQL is a matter of knowing the right extension points.

I hope you had a good time reading this article series. I know a lot of things have been left unsaid, but if you are really missing something or want more insight into a particular topic related to Hibernate, GORM, Grails etc., just drop a comment and I’ll try to cover it in one of the following blog posts.

[0] Getting the Persistence Context Picture (Part I)
[1] Getting the Persistence Context Picture (Part II)
[2] MySQL InnoDB Transactions
[3] Spring Web Flow Project
[4] Hibernate – Project Home Page
[5] Hibernate Documentation – Chapter: Improving Performance
[6] Criteria Builder DSL

Categories: basic, grails, hibernate, patterns

Getting the Persistence Context Picture (Part II)

April 8, 2010

The first article of this series [0] took a look at the basic patterns found in today’s persistence frameworks. In this article we will have a look at how Hibernate APIs relate to those patterns and how Hibernate is utilized in Grails.

A Closer Look at Hibernate’s Persistence Context APIs

All data creation, modification and deletion has to be done in a persistence context. A persistence context is a concrete software element that maintains a list of all object modifications and, in a nutshell, synchronizes them with the current database state at the end of the persistence context’s lifetime or business transaction.

When developing with a modern web framework such as Grails, you most likely don’t even have to care about opening or closing a persistence context, or even know how this could be done.

But as application complexity rises, you have to know Hibernate’s persistence context APIs and understand how they are integrated into the web application framework of your choice. Let us take a look at the most important APIs and how they correspond to persistence patterns.

The Registry Pattern or org.hibernate.SessionFactory

The SessionFactory class implements the registry pattern. A registry is used by the infrastructure-layer to obtain a reference to the current persistence context, or to create a new one if not found in the current context.

Usually, as noted in SessionFactory’s Java documentation, a session-factory refers to a single persistence provider. Most of the time, an application needs just a single session-factory. Only if an application wanted to work across multiple databases would it have to maintain multiple SessionFactory instances, one for each database.

Imagine a session-factory to be a configuration hub – it is the place where all configuration settings are read and used for constructing persistence contexts.
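
Stripped of all configuration concerns, the registry idea fits into a few lines of plain Java. The sketch below is an illustration only: ContextRegistrySketch is a hypothetical class, and binding the context to the current thread is one common way to define "current", chosen here as an assumption rather than taken from Hibernate’s source.

```java
import java.util.function.Supplier;

// Hypothetical registry: hands out the context bound to the current thread,
// creating one on first access.
class ContextRegistrySketch<C> {
    private final Supplier<C> contextFactory;
    private final ThreadLocal<C> current = new ThreadLocal<>();

    ContextRegistrySketch(Supplier<C> contextFactory) {
        this.contextFactory = contextFactory;
    }

    // Returns the current context, or creates and binds a new one if none is found.
    C getCurrentContext() {
        C context = current.get();
        if (context == null) {
            context = contextFactory.get();
            current.set(context);
        }
        return context;
    }

    // Unbinds the context, e.g. at the end of a request.
    void closeCurrentContext() {
        current.remove();
    }
}
```

Repeated calls to getCurrentContext on the same thread return the same instance until closeCurrentContext is called, which mirrors the "obtain the current one or create a new one" behavior described above.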

In a Grails application, the application’s session-factory can be easily obtained by declaring a property of type org.hibernate.SessionFactory:


import org.hibernate.SessionFactory

class SomeService {

    SessionFactory sessionFactory

    void myServiceMethod()  {
        def session = sessionFactory.getCurrentSession()
        // ...
    }
}

The Grails application’s session-factory is then injected via dependency injection, since every Grails service class is a Spring-managed component. Other such components include controllers, domain-classes and custom beans (declared in beans.groovy, beans.xml, other bean definition XMLs or annotated Groovy/Java classes).

A Grails application’s session-factory is set up by the HibernatePluginSupport class, a utility class used by the Grails hibernate plugin. When taking a look at the source code, you’ll find that it uses Grails’ Spring builder DSL to declare a ConfigurableLocalSessionFactoryBean. This type of bean is usually used in Spring applications to create a Hibernate session-factory instance during application bootstrap and to keep track of it during the entire lifetime of the application-context.

//...
sessionFactory(ConfigurableLocalSessionFactoryBean) {
    dataSource = dataSource
    // ...
    hibernateProperties = hibernateProperties
    lobHandler = lobHandlerDetector
}
//...

Btw, if we had to create a session-factory within a plain Groovy application, it wouldn’t get much harder:


def configuration = new Configuration()
    .setProperty("hibernate.dialect", "org.hibernate.dialect.MySQLInnoDBDialect")
    .setProperty("hibernate.connection.datasource", "java:comp/env/jdbc/test")
    .setProperty("hibernate.order_updates", "true")
    // ...

def sessionFactory = configuration.buildSessionFactory()

The Configuration class can be used to configure a session-factory programmatically; other options are a properties file or an XML file (named hibernate.properties or hibernate.cfg.xml). If you use one of the file-based configurations, make sure your configuration file can be loaded by the current class-loader, i.e. put it in the class-path’s root directory.

Grails’ configuration of Hibernate’s session-factory is pretty much hidden from application developers. In order to provide a custom hibernate.cfg.xml, just put it in the grails-app/conf/hibernate folder.

The Persistence-Context Pattern or org.hibernate.Session

The Session is the heart of Hibernate: it represents a persistence context. Thus, whenever the application needs to access object-mapping functionality in any form, it needs to work with an instance of type Session.

The Session interface provides a bunch of methods letting the application infrastructure interact with the persistence context:

        // ...
	public Query createQuery(String queryString) throws HibernateException;

	public SQLQuery createSQLQuery(String queryString) throws HibernateException;

	public Query createFilter(Object collection, String queryString) throws HibernateException;

	public Query getNamedQuery(String queryName) throws HibernateException;

	public void clear();

	public Object get(Class clazz, Serializable id) throws HibernateException;

	public void setReadOnly(Object entity, boolean readOnly);

	public void doWork(Work work) throws HibernateException;

	Connection disconnect() throws HibernateException;

	void reconnect() throws HibernateException;

	void reconnect(Connection connection) throws HibernateException;
        // ...

Whenever a query created by one of the querying methods is executed, all retrieved objects are automatically linked to the session. For each attached (Hibernate term for “linked”) object, the session holds a reference and meta-data about it. Whenever a transient Groovy object is saved, it is automatically attached to the current session. Notice that this is a unidirectional relationship: the session knows everything, whereas the attached object instances don’t know anything about being linked to a session.

Lazy and Eager Loading

In regard to attaching objects to the current session, you need to know the concepts of lazy and eager loading of object relationships.

Whenever a persistent class A references another persistent class B, A is said to have a relationship with B. Object relationships are mapped either with foreign keys or with relationship tables in the underlying database schema. By default, Hibernate uses a lazy loading approach: whenever a root object with relations to other objects is loaded, the related objects are not loaded. The other approach would be eager loading, where the related objects are loaded together with the root object.

Lazy vs. Eager Loading

Lazy loading does not hurt as long as objects are attached to an open persistence context. However, once the persistence context is closed, there is no way to navigate over a lazily loaded relationship that has not been initialized yet. If application code accesses such a relationship anyway, Hibernate throws a LazyInitializationException.
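
The mechanics can be imitated with a small holder object in plain Java. This is a sketch under assumptions: LazyReferenceSketch is a made-up class, and it throws an IllegalStateException where Hibernate would throw its own LazyInitializationException.

```java
import java.util.function.Supplier;

// Hypothetical lazy reference: the related object is fetched on first access,
// which only works while the owning context is still open.
class LazyReferenceSketch<T> {
    private final Supplier<T> loader;
    private boolean contextOpen = true;
    private boolean initialized = false;
    private T value;

    LazyReferenceSketch(Supplier<T> loader) {
        this.loader = loader;
    }

    // Loads the value on first access; later calls return the cached value.
    T get() {
        if (!initialized) {
            if (!contextOpen) {
                throw new IllegalStateException("context closed before relationship was loaded");
            }
            value = loader.get();
            initialized = true;
        }
        return value;
    }

    // Called when the persistence context is closed.
    void contextClosed() {
        contextOpen = false;
    }
}
```

A reference that was touched while the context was open keeps working afterwards, whereas one that was never touched fails on first access. This is why such errors often only surface during view rendering or in code that runs after the session has been closed.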

Obtaining a Session

By default, a session instance can be obtained using a session-factory instance:


def session = sessionFactory.openSession()
def tx = session.beginTransaction()

// ... work with the session

tx.commit()
session.close()

As in the code sample above, most of the time application code works in a transactional context, that is, the current method is executed within a single transaction. Therefore, it is a common idiom to begin a transaction right after opening a session, although this is not enforced by Hibernate’s API. If we did not want transaction boundaries, we could just omit the call to beginTransaction:


def session = sessionFactory.openSession()

// ... work with the session

session.close()

You need to be careful in this scenario. When Hibernate obtains a JDBC connection, it automatically turns autocommit mode off by calling jdbcConnection.setAutoCommit(false). Indeed, this is the JDBC way to tell the database to start a new transaction. However, how the database driver handles a transaction that is still pending when the connection is closed is not specified, so the application runs into undefined behavior.

General Session-Handling in Grails

As manual session handling can get tricky, web frameworks like Grails hide most of these problems. Grails’ Object Relational Mapping (GORM) layer is a thin layer on top of Hibernate 3. Grails uses this layer to enrich domain classes with a lot of DAO-like functionality. For example, so-called dynamic finders are added to each domain class, which most of the time completely remove the need for data access objects (DAOs). Handling of Hibernate sessions is mostly hidden by GORM, which internally uses Hibernate and Spring’s Hibernate integration.

Whenever a GORM query is executed, Grails internally creates a HibernateTemplate. A HibernateTemplate is a neat way to get well-defined access to a Hibernate session. It completely hides looking up the session factory and retrieving a session. Clients only need to implement callback methods, which are then called when the template is executed. Let’s take a look at how such templates are used when executing, for example, a dynamic finder method like findBy.


class SomeController {
  def myAction()  {
    User user = User.findByLogin(params.id)
    // ...
  }
}

When invoking the static findBy dynamic finder method, the following code is executed:


// ...
return super.getHibernateTemplate().execute( new HibernateCallback() {
    public Object doInHibernate(Session session) throws HibernateException, SQLException {
        Criteria crit = getCriteria(session, additionalCriteria, clazz);
        // ... do some criteria building

        final List list = crit.list();
        if(!list.isEmpty()) {
            return GrailsHibernateUtil.unwrapIfProxy(list.get(0));
        }
        return null;
    }
});

As can be seen, Grails internally does nothing more than create a Spring HibernateTemplate and, in its doInHibernate callback, a plain Hibernate Criteria object which is used to specify object queries. Spring hides the code for finding the current session and setting properties according to the current program context. GORM builds on this functionality and does all the Groovy meta-programming (adding static methods etc. to the domain class’s meta class).
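
The template/callback idea itself is small enough to sketch in plain Java. The classes below are hypothetical and only illustrate the pattern; they are not Spring's HibernateTemplate. In particular, whether the real template closes a session or reuses one bound to the current context depends on the surrounding setup, whereas this sketch always creates and closes its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical minimal session that only records what happened to it. */
class TemplateSession {
    final List<String> log = new ArrayList<>();
    boolean closed;

    String findUserByLogin(String login) {
        log.add("query login=" + login);
        return "user:" + login;
    }
}

/** Hypothetical template: it owns session handling, clients only pass a callback. */
class MicroTemplate {
    TemplateSession lastSession; // kept only so the sketch can be inspected

    <T> T execute(Function<TemplateSession, T> callback) {
        TemplateSession session = new TemplateSession(); // "obtain" a session
        lastSession = session;
        try {
            return callback.apply(session);              // client code runs here
        } finally {
            session.closed = true;                       // cleanup always happens in the template
        }
    }
}

class TemplateSketch {
    public static void main(String[] args) {
        MicroTemplate template = new MicroTemplate();
        // The caller never sees how the session is obtained or released.
        String user = template.execute(session -> session.findUserByLogin("jane"));
        System.out.println(user);
    }
}
```

The point of the design is that resource handling lives in exactly one place, while the varying part (the query) is passed in from outside.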

The same is true for saving domain objects using GORM’s save method:

protected Object performSave(final Object target, final boolean flush) {
        HibernateTemplate ht = getHibernateTemplate();
        return ht.execute(new HibernateCallback() {
            public Object doInHibernate(Session session) throws HibernateException, SQLException {
                session.saveOrUpdate(target);
                if(flush) {
                    // ...
                    getHibernateTemplate().flush();
                    // ...
                }
                return target;
            }
        });
}

Session Flushing

Since a session spans a business transaction (remember, not the same as a technical transaction), it might be left open by the infrastructure layer over several user interaction requests. The application needs to ensure that the session’s state is synchronized with the database at some point, which is called flushing the session, and that the session is eventually closed.

As we have already seen in the Grails source code above, flushing is mainly handled by the web framework, but programmers should know the points at which it actually happens:

  • whenever a Transaction gets committed
  • before a query is executed
  • if session.flush() is called explicitly

Be aware that flushing is a costly operation in a persistence context, as Hibernate needs to synchronize the current in-memory object model with the database. Programmers can change the default behavior described above by setting an alternate flush mode on the current session:

def session = sessionFactory.openSession()
session.setFlushMode(FlushMode.NEVER)

FlushMode.NEVER in this case means that automatic session flushing is deactivated; only explicit calls to session.flush() trigger a flush.
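
The three flush points and the effect of switching automatic flushing off can be illustrated with a small plain Java sketch. FlushSession and its Mode enum are hypothetical. Mode.MANUAL stands in for FlushMode.NEVER, and a simple list stands in for the database.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical session that queues changes in memory and writes them on flush. */
class FlushSession {
    enum Mode { AUTO, MANUAL }

    private Mode mode = Mode.AUTO;
    private final List<String> pending = new ArrayList<>();
    final List<String> database = new ArrayList<>(); // stands in for the real database

    void setMode(Mode mode) { this.mode = mode; }

    void save(String entity) { pending.add(entity); } // the change stays in memory for now

    void flush() {                                     // an explicit flush always synchronizes
        database.addAll(pending);
        pending.clear();
    }

    List<String> query() {
        if (mode == Mode.AUTO) flush();                // flush point: before a query
        return new ArrayList<>(database);
    }

    void commit() {
        if (mode == Mode.AUTO) flush();                // flush point: transaction commit
    }
}

class FlushSketch {
    public static void main(String[] args) {
        FlushSession session = new FlushSession();

        session.save("a");
        System.out.println("rows seen by query: " + session.query().size());

        session.setMode(FlushSession.Mode.MANUAL);
        session.save("b");
        session.commit();  // no automatic flush in MANUAL mode
        System.out.println("rows after commit: " + session.database.size());

        session.flush();   // only the explicit call writes the pending change
        System.out.println("rows after flush: " + session.database.size());
    }
}
```

In the second half of the example the saved object only reaches the stand-in database once flush() is called explicitly, which is the behavior to expect after setting the flush mode as shown above.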

In Grails, session flushing is done after each controller call, due to Spring’s OpenSessionInView interception mechanism. In order to allow access to lazy-loaded properties in GSP pages, the session is not closed directly after a controller’s method call but only after response rendering is done. Therefore, Grails sets the session’s flush mode to FlushMode.NEVER after the controller method call, to avoid database modifications caused by GSP page code.

Another place where sessions get flushed is at the end of each service method call (as long as the service is marked as being transactional or is annotated with @Transactional):

class SomeService {
    static transactional = true

    def someMethod()  {
       // ... transactional code
    }
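}

// alternatively, annotate individual methods with Spring's @Transactional
import org.springframework.transaction.annotation.Transactional

class SomeService {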

    @Transactional
    def someMethod()  {
       // ... transactional code
    }
}

When writing integration tests for Grails classes, you need to keep these flush points in mind. To make things even more complicated, one additional thing is different in integration tests: each integration test method runs in its own transaction, which is rolled back at the end by the Grails testing classes. For example, when testing a controller’s save method, chances are you won’t find an SQL INSERT or UPDATE statement in the database logs. This is the intended behavior, but it causes confusion when bugs dealing with persistence issues need to be reproduced by test cases.

For integration tests, there is a way to deactivate these transaction boundaries:

class SomeTest extends GrailsTestCase {
    static transactional = false

    @Test
    void testWithoutTransactionBoundary()  {
       // ... test code that runs without a surrounding test transaction
    }
}

Summary

In this article we took a look at how Grails and GORM handle Hibernate’s basic APIs: the SessionFactory and the Session. The next article in this series will deal with more advanced features: GORM in batch jobs and conversational state management.

[0] Getting the Persistence Context Picture (Part I)
[1] Hibernate Project
[2] Spring Hibernate Integration
[3] Grails GORM Documentation

Categories: basic, grails, hibernate, patterns

Getting the Persistence Context Picture (Part I)

March 23, 2010 2 comments



This article series deals with Hibernate’s basic APIs and how Hibernate is used in Grails applications. The first part of the series is meant as an overall introduction to objects, persisting objects and Hibernate as a persistence framework.

I have been in a lot of projects where Hibernate and the persistence layer were treated as the application’s holy grail: whenever an error was thrown, programmers did not try to understand the concrete problem but spent their time finding work-arounds. In my experience, that behavior was simply caused by a lack of knowledge about basic persistence context patterns. In this article I will try to explain the most fundamental patterns and concepts, which should already help in understanding how the core data structures of persistence frameworks work.

Objects, objects everywhere…

Let’s start with object-orientation. Implementing a persistence framework mainly involves the question of how to map objects from an object-oriented domain into a relational data model, which is found in most databases we deal with today. In order to understand persistence mechanisms from the bottom up, we should revisit the basic concepts of objects and classes. The basic definition of objects is:


Whenever we talk about objects in an object-oriented context, we speak of runtime representatives of classes, whereas classes can be seen as construction plans that are used to construct new instances while the program is running. A class consists of attributes and operations on those attributes. Objects at runtime hold the attributes’ values, which are tightly connected with the class’s operations on them. Seen from a logical view, an object is represented by its state and operations.

Attributes might be of any data type available in the programming environment. Most programming languages distinguish between simple data types and class data types, where class data types comprise custom as well as API classes. At runtime, attribute values therefore contain either scalar values (e.g. a number, a string, a boolean value) or references to other objects. A reference either points to another object or is void (null).

Every object created during the execution of an object-oriented program has an object identity. Depending on the context, object identity has two meanings:

1. By reference: an object A is considered equal to another object B if their references are equal.

2. By state: an object A is considered equal to another object B if their attribute values are equal.

In object-oriented theory the first one is named “object identity” and the latter one “object equality”; being identical is thus not the same as being equal.

Unfortunately, in the Java environment being identical means the same as being equal unless a class overrides equals(), since java.lang.Object.equals() implements reference comparison with the == operator by default.
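
A short Java example makes the difference between the two notions visible. GridPoint is a hypothetical class introduced only for this illustration. It overrides equals() to compare by state, while plain java.lang.Object instances fall back to reference comparison.

```java
import java.util.Objects;

/** Hypothetical value class that overrides equals() to compare by state. */
class GridPoint {
    private final int x;
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;               // identical implies equal
        if (!(other instanceof GridPoint)) return false;
        GridPoint p = (GridPoint) other;
        return x == p.x && y == p.y;                  // equality by attribute values
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);                    // must be consistent with equals()
    }
}

class IdentitySketch {
    public static void main(String[] args) {
        GridPoint a = new GridPoint(1, 2);
        GridPoint b = new GridPoint(1, 2);
        Object c = new Object();
        Object d = new Object();

        System.out.println(a == b);       // false: two distinct instances
        System.out.println(a.equals(b));  // true: same attribute values
        System.out.println(c.equals(d));  // false: Object.equals() falls back to ==
    }
}
```
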

Let’s now switch to the relational side of the mapping. In a first naive approach our tables would correspond to classes, whereas columns represent the class’s attributes.

At runtime an object instance would be represented by a single database row filled with values for each of the available columns. In fact, this is how it is done most of the time when we are using persistence frameworks like Hibernate.
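
As a rough illustration of this naive mapping, the following Java sketch turns one object into one row and derives a parameterized INSERT statement from it. Person, NaiveMapper and the table name are assumptions made for this example. Real frameworks like Hibernate derive the mapping from metadata instead of hand-written code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical persistent class with two attributes. */
class Person {
    final String name;
    final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

/** Hypothetical naive mapper: class -> table, attribute -> column, object -> row. */
class NaiveMapper {

    /** One column per attribute; LinkedHashMap keeps the column order stable. */
    static Map<String, Object> toRow(Person person) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("name", person.name);
        row.put("age", person.age);
        return row;
    }

    /** Builds a parameterized INSERT statement for the given row. */
    static String insertSql(String table, Map<String, Object> row) {
        StringJoiner columns = new StringJoiner(", ");
        StringJoiner placeholders = new StringJoiner(", ");
        for (String column : row.keySet()) {
            columns.add(column);
            placeholders.add("?");
        }
        return "INSERT INTO " + table + " (" + columns + ") VALUES (" + placeholders + ")";
    }
}

class MappingSketch {
    public static void main(String[] args) {
        Map<String, Object> row = NaiveMapper.toRow(new Person("Jane", 30));
        // prints: INSERT INTO person (name, age) VALUES (?, ?)
        System.out.println(NaiveMapper.insertSql("person", row));
    }
}
```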

The examples above show very simply structured classes. In practice, persistence frameworks also need a way to map relationships (1:n, m:n, m:1, 1:1) between objects to database tables. Usually this is done using foreign keys in the relational model, but imagine a collection with a lot of referring objects: the persistence framework’s APIs need to provide mechanisms for batch loading, cursor support etc.

We have already seen object identity and its two characteristics. With database persistence a third identity comes into play: the object’s primary key. The problem is that objects don’t necessarily know about their primary key until a key is explicitly requested from the database. It gets even trickier if you think of relationships between objects: how can a programmer ensure that relationships are written in the correct order, depending on the foreign key constraints between the objects’ database tables?

Since programmers really should not have to deal with issues like object-relational mapping, object identity or batch loading of relationships for relationship traversal, various persistence or object-relational mapping (ORM) frameworks have prospered. They all have in common that they provide functionality that persists objects of an object-oriented programming environment into some persistent store. Objects persisted this way are called persistent objects.

The Persistence Context

For persistent objects there needs to be some explicitly defined context in which creation, modification and retrieval of persistent objects can happen. This context is known as the Persistence Context (or Persistent Closure). In fact, most persistence frameworks provide APIs that give access to a persistence context, even though the persistence context’s functionality is often split across several APIs.

Let’s take a look at the basic definition of persistence contexts ([0] “Persistence Context” pattern):



Notice that the term “business transaction” does not refer to database transactions. A business transaction is a logical transaction that might span several operations. For example, ordering a pizza is a single business transaction from the customer’s view, but it might involve several technical transactions to complete the request.

To look up the current persistence context during execution, classes might use a Registry which provides access to the current persistence context.

The persistence context deals with a few problems caused by the object-oriented/relational mapping mismatch. It ensures that all operations on objects are tracked for the persistence context’s lifetime, to keep database transactions as short as possible. It handles the problem of object identity and ensures that there will never be multiple object instances with the same database primary key. It tracks associations and resolves them in order to satisfy foreign key constraints. It implements a cache mechanism to automatically gain a certain transaction isolation level, to solve repeatable read problems and to lower the number of executed SQL statements. Overall, a persistence provider already embodies a lot of knowledge about database systems and ORM, so I would consider any decision to implement a custom persistence context highly risky.
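
The identity guarantee and the caching effect can be sketched with a simple identity map in Java. IdentityMapContext and its loader function are hypothetical. The sketch leaves out change tracking and association handling and only shows that one primary key always yields the same instance and is loaded only once.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

/** Hypothetical persistence context reduced to an identity map keyed by primary key. */
class IdentityMapContext<T> {
    private final Map<Long, T> loaded = new HashMap<>();
    private final LongFunction<T> loader; // stands in for the SQL SELECT by primary key
    int loadCount;                        // counts simulated database round trips

    IdentityMapContext(LongFunction<T> loader) {
        this.loader = loader;
    }

    T get(long id) {
        T entity = loaded.get(id);
        if (entity == null) {
            entity = loader.apply(id);    // first access: hit the "database"
            loadCount++;
            loaded.put(id, entity);
        }
        return entity;                    // later accesses: same instance, no query
    }
}

class IdentityMapSketch {
    public static void main(String[] args) {
        IdentityMapContext<StringBuilder> context =
                new IdentityMapContext<>(id -> new StringBuilder("row " + id));

        StringBuilder first = context.get(1L);
        StringBuilder second = context.get(1L);

        System.out.println(first == second);    // true: one instance per primary key
        System.out.println(context.loadCount);  // 1: the second lookup was served from the map
    }
}
```
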

Hibernate’s Persistence Context APIs

Let’s take a look at how Hibernate implements the Persistence Context and related patterns. Overall, Hibernate’s persistence context API mainly consists of the following classes:

org.hibernate.SessionFactory

The SessionFactory represents the component that manages persistence contexts in Hibernate: the registry. This is the central place for global configuration and set-up. At runtime there will only be a single session factory instance in your application’s environment (except when there are multiple data sources, e.g. legacy databases, LDAP providers etc.), which acts as a factory for retrieving persistence context objects.

org.hibernate.Session

A Session represents a component which tracks attachment, detachment, modification and retrieval of persistent objects. This is Hibernate’s persistence context (finally). Whenever you need a persistence context in your application, you have to look up a SessionFactory and create a new Session using the openSession() method. Be aware that Hibernate does not restrict programmers in how they handle sessions. If you decide to implement a long-running business transaction (aka conversation) with a single session instance, you are free to do so.

org.hibernate.Transaction

A Transaction is used to implement a Unit of Work within the application. A single session might span multiple transactions, and it is recommended that there is at least a single uncommitted transaction when working in a session. Note that how the transaction is actually handled on the database side is hidden by Hibernate’s implementation of that interface.

Still, if you use Hibernate without an explicit call to session.beginTransaction(), Hibernate will operate in auto-commit mode (be sure to specify connection.autocommit=“true” in your configuration’s XML).

Lifecycle Management

So far we have heard of persistent objects as the type of objects which have been persisted by the persistence context. But that persistent state is just a single station in the life-cycle of objects managed by Hibernate as a persistence provider.

In fact, I assume that in most applications the domain model entities will have to be kept in some persistent store. Therefore an entity instance will run through several states between its creation and actually being stored in the persistent store of your choice. The operations that move it between those states are:

  1. (De-) Attaching Instances
  2. Saving/Updating Instances
  3. Removing Instances

(De-) Attaching Instances

Imagine your application’s first bootstrap. Chances are good that you have to create some persistent objects on startup to get the system working. Object instances which have never been persisted are called transient objects. These objects have not been connected to a persistence context in any way.

The process of letting the persistence context know of the existence of a transient object is called “attaching”. Therefore, newly attached objects are called either attached or persistent objects.

The other way around, the process of disconnecting persistent objects from the current persistence context is known as “detaching” or “evicting”.

Saving or Updating Instances

Save or update operations can only be applied to persistent objects. If you need to save a transient or detached object, you as an application developer have to attach that object instance to the current persistence context first. Hibernate combines these two steps (attach and save), since it provides update/save methods which automatically attach transient or detached objects.

Removing Instances

Removing can only be applied to persistent objects. As is the case with save/update operations, transient or detached objects first need to be attached to the current persistence context before they can be removed. Whenever a persistent object is removed, it is actually just marked as being in the state “removed” by the underlying persistence mechanisms.
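
The states and transitions described above can be summarized as a tiny state machine. The following Java sketch is an illustration only. LifecycleContext and its State enum are hypothetical, and keeping the detached state inside the context is a simplification that a real persistence context would not need.

```java
import java.util.IdentityHashMap;
import java.util.Map;

/** Hypothetical context that only tracks the life-cycle state of each instance. */
class LifecycleContext {
    enum State { TRANSIENT, PERSISTENT, DETACHED, REMOVED }

    // Keyed by reference, since the context tracks concrete instances.
    private final Map<Object, State> states = new IdentityHashMap<>();

    State stateOf(Object entity) {
        // Anything the context has never seen is transient.
        return states.getOrDefault(entity, State.TRANSIENT);
    }

    void attach(Object entity) {
        states.put(entity, State.PERSISTENT);
    }

    void detach(Object entity) {
        requirePersistent(entity, "detach");
        states.put(entity, State.DETACHED);
    }

    void remove(Object entity) {
        requirePersistent(entity, "remove");
        states.put(entity, State.REMOVED);  // only marked, as described above
    }

    private void requirePersistent(Object entity, String operation) {
        if (stateOf(entity) != State.PERSISTENT) {
            throw new IllegalStateException(operation + " requires a persistent object");
        }
    }
}

class LifecycleSketch {
    public static void main(String[] args) {
        LifecycleContext context = new LifecycleContext();
        Object entity = new Object();

        System.out.println(context.stateOf(entity)); // TRANSIENT
        context.attach(entity);
        System.out.println(context.stateOf(entity)); // PERSISTENT
        context.detach(entity);
        System.out.println(context.stateOf(entity)); // DETACHED
        context.attach(entity);                      // re-attach before removing
        context.remove(entity);
        System.out.println(context.stateOf(entity)); // REMOVED
    }
}
```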

Summary

The knowledge gained about the Persistence Context pattern and the life-cycle of persistent objects already equips us with a basic understanding of how persistence frameworks like Hibernate operate. In the next part we will take a look at using Hibernate APIs in applications and at how Grails utilizes Hibernate.

[0] Patterns of Enterprise Application Architecture, Martin Fowler

Categories: grails, hibernate, patterns