Tuesday, September 13, 2011

What are XA transactions? What is an XA datasource?

An XA transaction, in the most general terms, is a "global transaction" that may span multiple resources. A non-XA transaction always involves just one resource. An XA transaction involves a coordinating transaction manager, with one or more databases (or other resources, like JMS) all involved in a single global transaction. Non-XA transactions have no transaction coordinator; a single resource does all of its transaction work itself (these are sometimes called local transactions).

XA transactions come from the X/Open group specification on distributed, global transactions. JTA includes the X/Open XA spec, in modified form. Most stuff in the world is non-XA - a Servlet or EJB or plain old JDBC in a Java application talking to a single database. XA gets involved when you want to work with multiple resources - two or more databases, a database and a JMS connection, all of those plus maybe a JCA resource - all in a single transaction. In this scenario, you'll have an app server like WebSphere or WebLogic or JBoss acting as the Transaction Manager, and your various resources (Oracle, Sybase, IBM MQ JMS, SAP, whatever) acting as transaction resources. Your code can then update/delete/publish/whatever across the many resources. When you say "commit", the results are committed across all of the resources. When you say "rollback", _everything_ is rolled back across all resources.

The Transaction Manager coordinates all of this through a protocol called Two Phase Commit (2PC). This protocol also has to be supported by the individual resources. In terms of datasources, an XA datasource is a data source that can participate in an XA global transaction. A non-XA datasource generally can't participate in a global transaction (sort of - some people implement what's called a "last participant" optimization that can let you do this for exactly one non-XA item).
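To make the two phases concrete, here is a minimal in-memory sketch of what a coordinator does. The `Participant` interface, `InMemoryResource`, and `TwoPhaseCoordinator` are hypothetical names invented for this illustration; real resources expose comparable operations through `javax.transaction.xa.XAResource`, and a real transaction manager also writes a recovery log, which this sketch leaves out.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical contract for a resource that can take part in 2PC.
// Real resources expose similar operations through javax.transaction.xa.XAResource.
interface Participant {
    boolean prepare();  // phase 1: vote true (can commit) or false (cannot)
    void commit();      // phase 2: make the prepared work permanent
    void rollback();    // discard the work
}

// Toy in-memory resource that only tracks its own state.
class InMemoryResource implements Participant {
    final String name;
    final boolean votesYes;
    String state = "ACTIVE";

    InMemoryResource(String name, boolean votesYes) {
        this.name = name;
        this.votesYes = votesYes;
    }

    public boolean prepare() {
        if (votesYes) {
            state = "PREPARED";
        }
        return votesYes;
    }

    public void commit() { state = "COMMITTED"; }

    public void rollback() { state = "ROLLED_BACK"; }
}

class TwoPhaseCoordinator {
    // Returns true if every participant committed, false if all were rolled back.
    static boolean complete(List<? extends Participant> participants) {
        // Phase 1: ask every participant to prepare; a single "no" aborts everything.
        for (Participant p : participants) {
            if (!p.prepare()) {
                for (Participant q : participants) {
                    q.rollback();
                }
                return false;
            }
        }
        // Phase 2: all voted yes, so tell every participant to commit.
        for (Participant p : participants) {
            p.commit();
        }
        return true;
    }

    public static void main(String[] args) {
        InMemoryResource db = new InMemoryResource("ordersDb", true);
        InMemoryResource jms = new InMemoryResource("jmsQueue", true);
        boolean committed = complete(Arrays.asList(db, jms));
        System.out.println(committed + " " + db.state + " " + jms.state);
    }
}
```

The point of the split is that no participant makes its work permanent until every participant has promised that it can.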

Most developers have at least heard of XA, which describes the standard protocol that allows coordination, commitment, and recovery between transaction managers and resource managers.

Products such as CICS, Tuxedo, and even BEA WebLogic Server act as transaction managers, coordinating transactions across different resource managers. Typical XA resources are databases, message queuing products such as JMS providers or WebSphere MQ, mainframe applications, ERP packages, or anything else that can be coordinated with the transaction manager. XA is used to coordinate what is commonly called a two-phase commit (2PC) transaction. The classic example of a 2PC transaction is when two different databases need to be updated atomically. Most people think of something like a bank that has one database for savings accounts and a different one for checking accounts. If a customer wants to transfer money between checking and savings accounts, both databases have to participate in the transaction or the bank risks losing track of some money.

The problem is that most developers think, "Well, my application uses only one database, so I don't need to use XA on that database." This may not be true. The question that should be asked is, "Does the application require shared access to multiple resources that need to ensure the integrity of the transaction being performed?" For instance, does the application use Java 2 Connector Architecture adapters, the BEA WebLogic Server Messaging Bridge, or the Java Message Service (JMS)? If the application needs to update the database and any of these other resources in the same transaction, then both the database and the other resource need to be treated as XA resources.

In addition to Web or EJB applications that may touch different resources, XA is often needed when building Web services or BEA WebLogic Integration applications. Integration applications often span disparate resources and involve asynchronous interfaces. As a result, they frequently require 2PC. An extremely common use case for WebLogic Integration that calls for XA is to pull a message from WebSphere MQ, do some business processing with the message, make updates to a database, and then place another message back on MQ. Usually this whole process has to occur in a guaranteed and transactional manner. There is a tendency to shy away from XA because of the performance penalty it imposes. Still, if transaction coordination across multiple resources is needed, there is no way to avoid XA. If the requirements for an application include phrases such as "persistent messaging with guaranteed once and only once message delivery," then XA is probably needed.

Figure 1 shows a common, though extremely simplified, BEA WebLogic Integration process definition that needs to use XA. A JMS message is received to start the process. Assume the message is a customer order. The order then has to be placed in the order shipment database and placed on another message queue for further processing by a legacy billing application. Unless XA is used to coordinate the transaction between the database and JMS, we risk updating the shipment database without updating the billing application. This could result in the order being shipped, but the customer might never be billed.

Once we've determined that the application does in fact need to use XA, how do we make sure it is used correctly? Fortunately, J2EE and the Java Transaction API (JTA) hide the implementation details of XA. Coding changes are not required to enable XA for an application. Using XA properly is a matter of configuring the resources that need to be enrolled in the same transaction. Depending on the application, the BEA WebLogic Server resources that most often need to be configured for XA are connection pools, data sources, JMS Servers, JMS connection factories, and messaging bridges. Fortunately, the entire configuration needed on the WebLogic side can be done from the WebLogic Server Console.

Before worrying about the WebLogic configuration for XA, we have to ensure that the resources we want to access are XA enabled. Check with the database administrator, the WebSphere MQ administrator, or whoever is in charge of the resources that are outside WebLogic. These resources do not always enable XA by default, nor do all resources support the X/Open XA interface, which is required to truly do XA transactions. For example, some databases require that additional scripts be run in order to enable XA.

For those resources that do not support XA at all, some transaction managers allow for a "one-phase" optimization (the "last participant" optimization mentioned earlier). In a one-phase optimization, the transaction manager first issues a "prepare to commit" command to all of the XA resources. If all of the XA resources respond affirmatively, the transaction manager commits the non-XA resource, and then commits all of the XA resources. This allows the transaction manager to work with a non-XA resource, but normally only one non-XA resource per transaction is allowed. There is a small chance that something will go wrong after committing the non-XA resource and before the XA resources all commit, but this is the best alternative if a resource just doesn't support XA.
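The ordering described above can be sketched as follows. This is only an illustration under stated assumptions: `PreparableResource`, `OnePhaseResource`, `LastResourceCoordinator`, and the two logging classes are hypothetical names made up for this example, not part of WebLogic or JTA, and each transaction manager implements this optimization in its own way.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical contracts, invented for this sketch.
interface PreparableResource {      // stands in for an XA resource
    boolean prepare();
    void commit();
    void rollback();
}

interface OnePhaseResource {        // stands in for the single non-XA resource
    boolean commit();               // true if the local commit succeeded
    void rollback();
}

// Toy XA resource that always votes yes and records each call in a shared log.
class LoggingXaResource implements PreparableResource {
    private final String name;
    private final List<String> log;

    LoggingXaResource(String name, List<String> log) {
        this.name = name;
        this.log = log;
    }

    public boolean prepare() { log.add(name + ".prepare"); return true; }
    public void commit() { log.add(name + ".commit"); }
    public void rollback() { log.add(name + ".rollback"); }
}

// Toy non-XA resource whose local commit succeeds or fails as configured.
class LoggingLocalResource implements OnePhaseResource {
    private final String name;
    private final boolean succeeds;
    private final List<String> log;

    LoggingLocalResource(String name, boolean succeeds, List<String> log) {
        this.name = name;
        this.succeeds = succeeds;
        this.log = log;
    }

    public boolean commit() { log.add(name + ".commit"); return succeeds; }
    public void rollback() { log.add(name + ".rollback"); }
}

class LastResourceCoordinator {
    static boolean complete(List<? extends PreparableResource> xaResources, OnePhaseResource last) {
        // Step 1: prepare every XA resource; any "no" vote rolls everything back.
        for (PreparableResource r : xaResources) {
            if (!r.prepare()) {
                rollbackAll(xaResources);
                last.rollback();
                return false;
            }
        }
        // Step 2: the non-XA commit decides the outcome of the whole transaction.
        if (!last.commit()) {
            rollbackAll(xaResources);
            return false;
        }
        // Step 3: commit the XA resources. A failure between step 2 and the end
        // of this loop is the small window of risk described in the text.
        for (PreparableResource r : xaResources) {
            r.commit();
        }
        return true;
    }

    private static void rollbackAll(List<? extends PreparableResource> xaResources) {
        for (PreparableResource r : xaResources) {
            r.rollback();
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        boolean committed = complete(
                Arrays.asList(new LoggingXaResource("db", log), new LoggingXaResource("jms", log)),
                new LoggingLocalResource("legacy", true, log));
        System.out.println(committed + " " + log);
    }
}
```

Because the non-XA resource cannot vote in a prepare phase, its actual commit is used as its vote, which is why only one such resource can safely take part.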

Connection pools are where most people start configuring WebLogic for XA. The connection pool needs to use an XA driver. Most database vendors provide XA drivers for their databases. BEA WebLogic Server 8.1 SP2 ships with a number of XA drivers for Oracle, DB2, Informix, SQL Server, and Sybase. We need to ensure that the Driver classname on the connection pool page of the BEA WebLogic Console is in fact an XA driver. When using the configuration wizards in BEA WebLogic Server 8.1, the wizards always note which drivers are XA enabled.
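One rough way to double-check a Driver classname outside the console is to see whether the class implements the standard `javax.sql.XADataSource` interface. The sketch below assumes that the configured class is the driver's XA data source class and that the driver jar is on the classpath; `XaDriverCheck` and its `classify` methods are hypothetical helpers written for this post, not a WebLogic utility.

```java
import javax.sql.XADataSource;

class XaDriverCheck {
    // "XA" if the class implements javax.sql.XADataSource, otherwise "NON_XA".
    static String classify(Class<?> driverClass) {
        return XADataSource.class.isAssignableFrom(driverClass) ? "XA" : "NON_XA";
    }

    // Looks the class up by name without initializing it.
    // Returns "NOT_FOUND" if the class cannot be loaded from the classpath.
    static String classify(String className) {
        try {
            Class<?> c = Class.forName(className, false, XaDriverCheck.class.getClassLoader());
            return classify(c);
        } catch (ClassNotFoundException | LinkageError e) {
            return "NOT_FOUND";
        }
    }

    public static void main(String[] args) {
        // Pass one or more Driver classnames as command-line arguments.
        for (String name : args) {
            System.out.println(name + " -> " + classify(name));
        }
    }
}
```

For example, with the Oracle driver jar on the classpath, this could be run as `java XaDriverCheck oracle.jdbc.xa.client.OracleXADataSource`. A "NON_XA" result only says that the class does not implement the standard interface, so treat it as a hint and still check the vendor documentation.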

When more than one XA driver is available for the database involved, be sure to run some benchmarks to determine which driver gives the best performance. Sometimes different drivers for the same database implement XA in completely different ways. This leads to wide variances in performance. For example, the Oracle 9.2 OCI Driver implements XA natively, while the Oracle 9.2 Thin Driver relies on stored procedures in the database to implement XA. As a result, the Oracle 9.2 OCI driver generally performs XA transactions much faster than the Thin driver. Oracle's newest Type 4 driver, the 10g Thin Driver, also implements XA natively and is backwards compatible with some previous versions of the Oracle database. Taking the time to fully evaluate alternative drivers can lead to significant performance improvements.
