Geometry Database Replication

Up-to-date instructions are on the Atlas Wiki


ColManReplicator

Utility for replicating Atlas Geometry SQL databases. It is available from ColMan which is based on SQLTuple.

SQLTuple provides AIDA NTuple interface of SQL databases. It can support any SQL database locally or over the  networks. The current distribution is configures for use with MySQL, PostgreSQL, McKoi and Oracle, but any other database can be easily added. SQLTuple is based on FreeHEP AIDA implementation and is completely AIDA compliant and interoperable with other AIDA components (so it can transparently use other AIDA storages, like XML files and HBook or Root files (in read mode)). The interaction with SQL is completely customizable at run-time via three (textual) configuration files:
  1. Implementation.properties specifies general features of database backend (protocol name, drive, ...).
  2. Type.properties specifies mapping of SQL types (in different backend technologies) to AIDA types.
  3. StmtSrc.properties specifies SQL commands used to perform standard (AIDA) operations (possibly per backend technology).

ColMan provides Collection Management utilities on top of SQLTuple: replicating, filtering, merging, analysing,... of Event Collection metadata. It performs all manipulations via AIDA NTuple API.

ColManReplicator is one utility (in fact one class) which uses SQLTuple API to replicate Atlas Geometry databases. It simply reads AIDA NTuple from one database (e.g. Oracle) and writes in into another (e.g. MySQL). This is possible as long as tables use only SQL types which are supportable by SQLTuple AIDA mapping. In case those SQL types are not directly supported by default, a customized mapping (using Type.properties configuration file) can be easily provided as long as there is one-to-one correspondence between tables and types in two used technologies. Additional table characteristics (key, indexes, ...) can be incorporated using StmtSrc.properties configuration file.

ColManReplicator introduces certain overhead to replication by interpreting SQL tables as AIDA NTuples. This overhead, however, is not too big and enables transparent (network-wide) replication between various technologies (in principle any SQL database and file formats supportable by AIDA). Afterall, SQL tables have to be interpreted somehow to be replicated between different technologies.

ColManReplicator is available from ColMan 0.9.6, which is installed as Atlas External package, but not yet available from releases.

Note:
ColManReplicator can't be used to replicate ConditionsDB because its Oracle and MySQL incarnations have different schemas and actual application code is needed to perform the translation.

Note:
ColMan contains its own copy of Oracle JDBC driver, as this is not supported by External/AtlasOracle package - see Bug Report #4426.

ColManReplicator can be used in a following way:

# Setup Java
. /afs/cern.ch/sw/java/share/bin/setjdk sun 1.5.0beta2
export PATH="${PATH}:${JAVA_HOME}/bin"

# Write a file listing all tables to be replicated: tables.txt
# (ColManReplicator can replicate the whole database too, but
# that may be quite long operation for just testing)
# For example:
ALMN_DATA
ALMN_DATA2VERS

# Replicate Oracle tables into MySQL tables
# (this works also to PostgreSQL, Cloudscape and others)
java
  -jar /afs/cern.ch/atlas/offline/external/Colman/ColMan-0.9.6/lib/ColManReplicator.exe.jar

  jdbc:oracle:thin:@oradev9.cern.ch:1521:D9
  jdbc:mysql://<mysql server>/<database>
  <oracle id> <oracle passwd> <mysql id> <mysql passwd>
  tables.txt

# Replicate Oracle tables into AIDA file
# (which can be then read by any AIDA tool)
java
  -jar /afs/cern.ch/atlas/offline/external/Colman/ColMan-0.9.6/lib/ColManReplicator.exe.jar

  jdbc:oracle:thin:@oradev9.cern.ch:1521:D9
  CollectionTest.aida
  <oracle id> <oracle passwd> <anything> <anything>
  tables.txt

The key part of net.hep.atlas.Database.DetectorDescription.Management.Replicator class looks like this, it is pure AIDA code, it should work with any AIDA implementations which supports SQL storage:

IAnalysisFactory af = IAnalysisFactory.create();
ITree inTree  = af.createTreeFactory().create(inUrl, ...);
ITree outTree = af.createTreeFactory().create(outUrl, ...);
ITupleFactory tf = af.createTupleFactory(outTree);
IFilter filter = tf.createFilter(true);
String[] tables = inTree.listObjectNames();
ITuple inTuple;
ITuple outTuple;
for (String table : tables) {
  inTuple = inTree.find(table);
  outTuple = tf.createFiltered(table, inTuple, filter);
  }
inTree.close();
outTree.commit();
outTree.close();


JDBC Importer

JDBC Importer is a SourceForge utility for flexible exporting/importing of SQL databases. It supports all JDBC-complient databases (i.e. practicaly all existing SQL databases). It is available here.

Exporting/importing rules are described by an XML configuration file (see example).

JDBC Importer supports not only direct column-by-column copy of databases, but also sophisticated conversions involving manipulating several tables. Simple conversions can be described directly by XML configuration files, more complex conversions can be described by ExportEngine subclasses (see example).

JDBC Importer seems to be based on similar concepts as Generic Replicator (see below), except that it is already functional product.


Generic Replicator

Generic Replicator is a prototype of a generic Utility for replicating SQL database.

Database schema (in all required technologies) should be described by a XML file. It can contain three types of mapping:
  1. Simple one-to-one column mapping based on declaration of column types in different technologies. This is equivalent to ColManReplicator configuration.
  2. Embedded scripts to convert columns from/to different technologies.
  3. Embedded scripts to convert whole tables (rows) from/to different technologies. Here, columns can be re-arranged (e.g. (x,y,z)->(rho,phi,theta)).
Generic Replicator will be more flexible than ColManReplicator as it will allow replication even when table schemas are different in different technoloies (by providing conversion scripts). It requires, however, formal specification (as an XML file) of SQL schemas in all supported technologies.

An example of the mapping file can be seen here (conversion using embedded scripts is not yet implemented):

Actual conversion is then performed by simple Groovy script, available here, its key part looks like this:

doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(new File(schema))

input  = Sql.newInstance(inUrl,
                         inUser,
                         inPasswd,
                         inDriver)
                       
output = Sql.newInstance(outUrl,
                         outUser,
                         outPasswd,
                         outDriver)

XPathAPI.selectNodeList(doc, "/database/table").each {
  output.execute(createStmt)
  input.eachRow(selectStmt) {
    output.execute(insertStmt) {
      }
    }


Comment to the Geometric Database

The content of Geometric Database (i.e. a hierarchical storage of versioned data) could be directly stored in an XML database.  An XML database can store XML files and can put tags (versions) on any element (at any level). The access pattern looks a bit like CVS. Elements (nested) can be committed, retrieved updated, deleted, searched and tagged. Most XML databases use XML:DB standard for access and other XML standards (XUpdate, XQuery, XPath,...) for operations.

For example, following command extracts XML representation of LAr BarrelGeometry version 02 (using Xindice DB):

xindice retreive_document
  -c xindice://xmldbhost.domain.org:8080/ATLAS/LAr/BarrelGeometry
  -n BarrelGeometry-02
  -f File.xml

Equivalent call can be done also directly from the code, in which case it delivers DOM image of requested data.

Existing SQL Geometry Database can be easily converted to XML files suitable for storing in XML databases.

There is certainly no problem in replication between XML databases, they just export/import their XML content.

There are already many XML databases (Xindice, eXist, BerkeleyDB, ...) with APIs in many languages. WebService Frameworks usually already contain  an XML Database (for example JWSDP, which is installed as Atlas External packages, contains XIndice).

J.Hrivnac, Sep'04