I have no special talent. I'm only passionately curious - Albert Einstein
Using DBUnit to transfer data between databases Comment on Using DBUnit to transfer data between databases 0

  At my current client, I've ran into the need to have the ability to transfer data from our production database to our lower level environments for QA and other user testing purposes.  One of the other consultants here mentioned a tool called DBUnit, and he said he thought it might have the capabilities I was looking for.

DBUnit has turned out to be a great tool for performing database tasks with Ant.  I was able to set up an Ant task and create a flat file (XML Format) data dump of the entire database.  The task was very straightforward, called Export, and has a quite a few options for generating the exported database file.

  Export options include creating a DTD for your XML files using the database model, generating the exported data into either a 'flat' or XML file.  Although the flat file is an XML file anyways, the XML option offers a nested element structure which is easier to read. 

  The caveat I encountered immediately was memory concerns while exporting large databases.  It seems that DBUnit will read the entire database into memory when performing an export task.  I ended up breaking out the export into multiple files, specifying a specific subset of tables for each file:

Export Data to Flat Files

<dbunit driver="${sql.driver}"
            password="${sql.password}"
            userid="${sql.userid}"
            url="${sql.url}"
            schema="${target.schema}"           
            classpath="${sql.classpath}">
            <dbconfig>
                <property name="datatypeFactory" value="org.dbunit.ext.db2.Db2DataTypeFactory" />               
                <property name="batchSize" value="50" />
                <!--feature name="qualifiedTableNames" value="true" /-->
                <feature name="batchedStatements" value="true" />
                <!--feature name="datatypeWarning" value="false" / -->               
            </dbconfig>
           
            <export dest="${target.schema}.export.dtd" format="dtd"/>           
            <!-- Export the dataset to three files.  If you try to export the database
            to one file you will run into local OutOfMemory Exceptions -->
            <export dest="${targetDir}\${target.schema}.export.part1.xml" doctype="${target.schema}.export.dtd" encoding="ISO-8859-1">
                <table name="table_1" />
                ...
                <table name="table_9" />                               
            </export>
            <export dest="${targetDir}\${target.schema}.export.part2.xml" doctype="${target.schema}.export.dtd"  encoding="ISO-8859-1">
                <table name="table_10" />
                ...
                <table name="table_22" />
            </export>
            <export dest="${targetDir}\${target.schema}.export.part3.xml" doctype="${target.schema}.export.dtd"  encoding="ISO-8859-1">
                <table name="table_23" />               
            </export>

        </dbunit>

The end result is an xml file which is portable to any database (any vendor, any version, etc.).  As you can see in my configuration above, I'm performing a transfer from a DB2 database.  My next configuration shows how to import the database into another DB2 database, but you could transfer the data into an SQL Server, Oracle, or any other database which contained the necessary schema.  It's important to note that I am importing and exporting to three seperate data files due to memory concerns.  If you have a smaller database, you may not need to break it out like I did. 

Import Data from Flat files to any database you want:

        <dbunit driver="${sql.driver}"
            password="${sql.password}"
            userid="${sql.userid}"
            url="${sql.url}"
            schema="${target.schema}"           
            classpath="${sql.classpath}">
            <dbconfig>
                <property name="datatypeFactory" value="org.dbunit.ext.db2.Db2DataTypeFactory" />               
                <property name="batchSize" value="50" />
                <!--feature name="qualifiedTableNames" value="true" /-->
                <feature name="batchedStatements" value="true" />
                <!--feature name="datatypeWarning" value="false" / -->               
            </dbconfig>
            <!--Import the dataset from three files.  If you try to import the database
                from one file you will run into local OutOfMemory Exceptions.  The
                files here are created using the db-export tasks.  For Operation type,
                use CLEAN_INSERT to completely delete all existing data and insert data from
                the export files, or use REFRESH, which will update existing data and will
                in a delta-style format using your export files.  See http://www.dbunit.org/components.html
                under DatabaseOperation for more information -->
            <operation type="CLEAN_INSERT" src="${targetDir}\${fromschema}.export.part1.xml" />
            <operation type="CLEAN_INSERT" src="${targetDir}\${fromschema}.export.part2.xml" />
            <operation type="CLEAN_INSERT" src="${targetDir}\${fromschema}.export.part3.xml" />
        </dbunit>

Resources:

DBUnit DatabaseOperations Documentation

DBUnit Configuable Features and Properties

DBUnit Javadoc (use this to find your Database Specific dataTypeFactory)


0 comments

Add a comment

Please provide your name, email address (won't be published) and a comment

About

David Malone is a Java developer residing in the Twin Cities area.  He has been developing enterprise applications since 2004.  This is his personal blog, as well as his design and development workspace.