AIDA NTuple with SQL backend

Based on FreeHEP

Version 1.2.0.0 (09/Nov/2010 at 11:11:32 CET)

Overview,
Web Service
*** JAS3 HowTo ***
Download, Download as JAS3Plugin
Google Gadgets:
- SQLTuple Tags Characteristics Histograms:
JavaDoc
HowTos
Bugs/ToDos/Requests
Release History
Presentations

Overview

SQLTuple extends FreeHEP implementation of ITuple AIDA interface so that ITuples can be stored in an SQL database. It supports any relational DB backend via JDBC standard interface. Configuration is provided for MySQL, PostgreSQL, Derby, Oracle and C-JDBC, other databases can be used at least in read mode when their JDBC library is provided. All AIDA operations (projections, filters, evaluators,... ) are supported in a standard way. Some new functions have been included on top of standard AIDA Interface.

SQLTuple can be used in any AIDA-complient tool.

Types mapping:

On default, SQLTuple (on MySQL) uses type mapping compatible with AIDA mapping. Mapping rules different from standard JDBC ones are specified in the Types.properties file. Most of other numeric types are supported for reading, they are mapped to AIDA types as appropriate. MySQL boolean, byte, char and short are all read as short because their proper type can't be extracted from MySQL Schema.

Advantages over alternative Tools:

SQLTuple runs on any platform without recompilation and can be compiled on any platform using any Java compiler version 1.5+.
All Relational Databases can be supported in local (embedded) or remote (server) mode (as long as such modes are supported by the database).
The performance of SQLTuple is in most cases higher than performance of equivalent C++ implementation.
SQLTuple can be easily used from other languages, like Python, Ruby, Groovy, PNuts or C++ or as a language-neutral XML-RPC Web-Service.
All SQL mapping (both types and commands) is customisable via text files.

Added Values for Pool:

Standard API, already used in many Applications Frameworks and well know to Users.
Platform-independent, multi-language API.
Many AIDA tools ready to access/proccess metadata.
Support for wide range of RDBSs.
Global operations (chaining, merging, filtering,...) on NTuples.

Added Values for AIDA:

SQL databases to store NTuples.

Multilanguage support:

Java programs can directly use SQLTuple. JAS3 provides a convenient graphical way to access SQLTuple through SQLTuple JAS3 Plugin.
Python code can access SQLTuple within JAS3 or directly using Jython or JPype.
PNuts code can access SQLTuple within JAS3.
C++ programs can access SQLTuple through the AIDA API exported by AIDA-JNI package.
Groovy can use SQLTuple directly.
Language-neutral XML-RPC Web-Service access to SQLTuple is provided.

Documentation

AIDA Persistency Architecture

Concerning persistency mechanism, there are two kinds of AIDA objects:

Objects, which need direct access to the persistency service only for their IO operations (read/write/update/...): IHistograms, IClouds, IProfiles, ... All Persistency requirements for those objects can be implemented by standard persistency techniques based on reflective Transient-Persistent Separation (JDO, Hibernate, Serialisation,...)

Objects, which need direct access to the persistency service for some of their standard operations: ITuples. It is not feasible to completely separate Transient and Persistent form of those objects. Their Persistency should be tightly interfaced with their transient form. One possibility (chosen here for SQL-based persistency) is to implement an ITuple extension for each persistency mechanism.

SQLTuple Domain Decomposition and Dependencies

SQLTuple implements AIDA interface as an extension of the FreeHEP JAIDA implementation. It contains several extensions (Inspector, Plotter) using AIDA to perform global operations on NTuples (not only SQLTuples). All the functionality is available using standard AIDA interfaces and WebService access.

FreeHEP AIDA Data Store Technologies

AIDA NTuples can be stored using many different storage technologies (Compressed XML files, Root files, HBook files, several SQL databases). Operations between technologies (filterring, merging,...) are possible via standard AIDA channels. Within SQL technology, native SQL channels are used to speed up operations.

SQLTupleFactory Design:

Documentations:

Ant Build files build.xml manages building and testing SQLTuple.

Configuration files (they are searched in the CLASSPATH):

hep/aida/ref/sql/Implementation.properties defines the features of supported SQL implementations (default exists).
hep/aida/ref/sql/Type.properties defines the customised mapping between Java and SQL types (default exists).
hep/aida/ref/sql/StmtSrc.properties defines the SQL commands used to perform all SQL operations (default exists).
log4j.properties defines the propertis of logging system (default exists). Debugging information is written to SQLTuple.log file. Graphical logger lf5 can be used too.

The package distribution is available here. JAS3 Plugin is available here. Java 1.5+ is required to run. Ant 1.6.5+ is required to build.

HowTos

Access to SQL database is specified via following properties (supplied via ITreeFactory.create(...) factory):

hep.aida.ref.sql.db
hep.aida.ref.sql.schema (optional)
hep.aida.ref.sql.table (optional)
hep.aida.ref.sql.forwardonly (optional)
hep.aida.ref.sql.name (optional)
hep.aida.ref.sql.src (default=hep/aida/ref/sql/StmtSrc.properties)
hep.aida.ref.sql.user
hep.aida.ref.sql.passwd (optional)

HowToRun in JAS3

HowToRun

# Untar full distribution
tar zxvf SQLTuple-dist.tar.gz

# Go to steering directory
cd SQLTuple/ant

# Setup Java (1.5+) and Ant (1.5+)
# If you are on CERN afs, you can do:
. /afs/cern.ch/user/h/hrivnac/public/Ant/setup-CERN.sh

# Build
ant jar

# Local Test
# Tests are performed on an embedded Derby database
# Example logfile
ant lcheck

# Remote Test
# Tests are performed on MySQL and PostgreSQL servers
# MySQL and PostgreSQL databases "Tuples" should exist on localhost
#   and be accessible for user test with password test
# Example logfile
ant rcheck

# Full Test
# Runs both Local and Remote Tests
ant check

# Build documentation
ant doc

# Build full distribution
ant dist

# Ask for generic Ant help
ant -h[elp]

# Ask for SQLTuple Ant help
ant -p[rojecthelp]

Interactive commands:

Usage: java -jar SQLTupleInspector.exe.jar [-options]

options:
  -help                  show command line args
  -connections           show known logical connections
  -logical               use logical connection from database (user and password are then not needed)
  -tuples <url>          show list of ntuples in a database specified as jdbc:<technology>://<database> or <logical connection>
  -columns <url>         show list of columns in a table specified as jdbc:<technology>://<database>/<table> or <logical connection>/<table>
  -rows <url>            show list of rows in a table specified as jdbc:<technology>://<database>/<table> or <logical connection>/<table>
  -user <user>           user
  -password <password>   password

Usage: java -jar SQLTuplePlotter.exe.jar [-options] <url>

parameters:
  <url>   jdbc:<technology>://<database>/<table> or <logical connection>/<table>

options:
  -help                  show command line args
  -connections           show known logical connections
  -logical               use logical connection from database (user and password are then not needed)
  -x <x>                 x axis formula
  -y <y>                 y axis formula (optional)
  -weight <weight>       weight formula (optional)
  -filter <filter>       filter formula (optional)
  -user <user>           user
  -password <password>   password (optional)
  -xmin <xmin>           x min (optional, speeds up creation of histograns from big ntuples)
  -xmax <xmax>           x max (optional, speeds up creation of histograns from big ntuples)
  -ymin <ymin>           y min (optional, speeds up creation of histograns from big ntuples)
  -ymax <ymax>           y max (optional, speeds up creation of histograns from big ntuples)

API Examples:

Writing new NTuple: see hep.aida.ref.sql.Test.Write for an example.
Reading existing NTuple satisfying a Filter: see hep.aida.ref.sql.Test.Read for an example.
Chaining two existing NTuples satisfying a Filter: see hep.aida.ref.sql.Test.Chain for an example.

Notes:

Databases used in examples should already exists (as defined in file build.xml). Existing NTuples will be overwritten.
Data can be on network and can use various technologies (SQL DB, XML, Root files,...).
All operations can be performed from all AIDA languages (Java, C++, Python, PNuts) in a compiled or scribed way.
All standard operations can be performed directly using AIDA API.
All standard operations can be performed graphically in JAS3.
Some extensions to AIDA ITuple work only on SQLTuples.

Bugs/ToDos/Requests

Bugs:

Chains of different SQL Implementation can't be created due to incompatible mapping.
byte, char and long types and not conservating.
IFilter.rowToProcess(...) and IFilter.startingRow(...) are ignored. (AIDA semantics of those calls can't be supported by naked SQL tables.)
Columns of type Object or ITuple are not supported.
ResultSets and Statements are often not properly closed, mainly in case of a Query and Metadata.
Database properties (defined/used via StmtSrc.properties and Implementation.properties) are bound to JDBC driver. But one JDBC driver can carry connection to different backends (like in case of C-JDBC driver).
ITree.setOverwrite() is not set.
Case sensitiveness/blindeness of databases is not handled correctly.
boolean, byte and String columns don't accept default values.
Passing the NTuple signiture as one string doesn't work.
FTupleCursor is ignored in SQLTuple.columnValue(int, FTupleCursor, Value).
PreparedStatements are created even on readonly ITrees. (It doesn't do any harm, but slows the initialisation down.)
Readonly ITrees are not handled correctly.
Titles and names are sometimes confused.

ToDos:

User should be able to supply conversion code to map (unsupported) SQL types into AIDA types.
Existing tables in database should be scanned lazily for inclusion in ITree. (Currently all tables schemas are requested at the beginning to created complete ITree. It is slow in case of database with many tables.)
Standard URL should be supported: jdbc:postgresql://localhost:5432/test?auth=password&user=someuser&password=somepassword
Currently, filters and evaluators can be defined only by SQL queries. Simple Java-like code (like JDOQL) should be possible.
Multiple level Trees should be handled (currently, just one level is possible, corresponding to a Database).
Generic ITuple.getObject(...) and ITuple.fill(..., Object) should be available.
SQLTuple methods should be re-factored.
100% conformity with AIDA standard should be provided. All standard Tests should run. Tests for multiple storage technologies should be introduced into standard test-set.
Unit tests should be available.
Oracle should be tested.
Negative tests (testing functionality which shouldn't work) should exist.
Axamol should be avaluated.
Batch insert should be used.

Requests:

Customised mapping between column names/types and Java names/types should be supported (to allow LCG/Pool/RAL AttributeList access).
Support for HSQLDB should be included.

Requests to AIDA and FreeHEP:

It should be possible to create ITuple from an independent Specification object. This Specification object will carry all information needed to create/access an NTuple. Specification should be constructable by constructors similar to those available today in the ITupleFactory. Specification should be externalisable (into an XML file). Also ITuple row should be creatable from its Specification so that it can be later added to several ITuples. Possible API follows:
a) ITupleSpecification which would be created by ITupleFactory with arguments similar to current ITupleFactory.create(...) arguments:
```
ITupleSpecification spec = tupleFactory.createSpec(...);
```
This ITupleSpecification could be then used to create several ITuples:
```
ITuple tuple = tupleFactory.create(spec);
```
b) ITupleRow which would be created by ITupleSpecification and filled in a similar way as current ITuple:
```
ITupleRow row = spec.createRow();
for (...) {
  row.fill(1, ...);
  ...
  }
```
This ITupleRow could be then added into several ITuples:
```
tuple.addRow(row);
```
ITuple stored in SQL are not ordered. It should be possible to specify it.
Options handling (via long string) is not expressive enough. There should be a special Options object (like Java Properties).
It should be possible to append NTuples (using ITupleFactory.create(...) and ITupleFactory.createFiltered(...)).
There should be visible Accessor attached to ITree. It should be set up both for read and write (commit) access during ITree initialisation. It should contain not only connection string (name), but also used StmtSrc,... (static Accessor Accessor.getAccessor(...) provides a temporary fix.)
Interfaces should throw exceptions related to databases.
Strategy concerning overwriting and appending should be cleanly described and available (from ITree).
New methods should be introduced (as done in SQLTuple and SQLTupleFactory).
AidaUtils should handle options with dots.
IFilter and IEvaluator should not expect processing per lines.
IStore should be part of AIDA.
AIDA should standartise on SQL types corresponding to AIDA types (as done in Type.properties).
AIDA should standartise on SQL commands corresponding to AIDA calls (as done in StmSrc.properties).
BUG: JAS3 should not close Tree during work (public void AidaSQLStore.close() provides a temporary fix.)
In JAS3, SQLTuple is asked for select (*) and doesn't use projections, tuple s loaded into memory. (Related discussion is recorded in FreeHEP Forum. SQLTuple JAS3 plugin provides additional actions which do use AIDA calls directly.)

Release History

α:
- Initial release.
β1:
- Set of ready-to-use examples.
- Support for McKoi (embedded) SQL DB.
- Full support for multiple storage formats within one application.
- Complete distribution ready to be used. All components needed for running and building are included. Only Java (1.4+) and Ant (1.5+) are required.
- JavaDoc documentation (also for Java 1.5 constructs).
- Instalation of Java 1.5 not required. All Java 1.5 constructs are build and run using Java 1.4 compiler and runtime.
- Proposal for documentation of Overall Architecture of AIDA Persistency.
- Interoperability between SQL and XML provided.
- Better handling of exceptions.
β2:
- SQL Statements refactored into one StmtSrc.properties.
- Log4J logging added.
- Exception handling refactored.
- Migrated to FreeHEP 1.2.1.
- Migrated to AIDA 3.2.1.
- Merger added.
- Architecture documented.
- Support for Pool Collections containing LinkTable added.
β3:
- Prototype Pool Event Selector added.
- Systematic BenchMarking added.
- Support for Root files added.
- Complete JavaDoc (incl. Java 1.5 constructs) provided.
- Usable under Java 1.5 or Java 1.4 + 1.5 extensions.
- C++ API added.
- JWSDP Web Service prototype (of EventSelector) added.
β4:
- Only supported with Java 1.5.
- Collection Management, C++ and WebSevice interface moved to other packages.
- Nonstandard StmtSrc.properties can be specified as a Property.
- Available in Atlas (since 7.6.0 release).
0.9.1:
- First functional non-beta.
0.9.2:
- Complete JavaDoc documentation.
0.9.3:
- Java-SQL mapping customisable via Type.properties file.
0.9.4:
- Implementation features customisable via Implementation.properties file.
- No dependency on SQL in code.
- Support for JAS3.
- Arguments can be passed to StmtSrc directly.
- Oracle is supported.
- Writing is more optimised for speed.
0.9.5:
- SQLTuple can be run as a standard JAS3 plugin.
- Upgrade to new JDBC drivers: MySQL/3.1.4, PostgreSQL/3.0.7, McKoi/1.0.3.
- BugFix: ITuple.get*(...) returns specified default values in case of failure.
- Support for C-JDBC.
- Simplifid handling ot types mapping. Default mapping is used when possible.
- Any database (with JDBC driver) can be used.
- Many small bug-fixes.
- Better Log4J Appender.
1.0.0:
- Support for Schema specification.
- Better support for uppercase-only databases (like Oracle).
- ITuple.rows() reports correct number of rows even after a filter.
- Adding rows uses PreparedStatement.
1.0.1:
- Made up-to-date with FreeHEP used in JAS3 0.8.1.
- ITuple.start() resets ITuple to its original shape even after customised ITuple.start(...) calls.
1.0.2:
- Made up-to-date with JAS3 0.8.2 and JAIDA 3.2.4.
- JAS3 plugin adds actions using directly calls to AIDA ITuple. It profits from SQLTuple optimisation.
1.1.0:
- Plotter and Inspector Utilities included.
- JSP Web Service included.
- Oracle long-style URL allowed.
- Database (meta)data can be cached.
- Tree View added to Web Service.
- Scanner added to Web Service.
- McKoi replaced with Apache Derby.
1.2.0:
- Histogramming for Tag databases suported.
- tnsnames.ora file is translated into context.xml file
- hep.aida.ref.sql.forwardonly option trigger FORWARD_ONLY ResultSet (for big Ntuples).
- Standalone Plotter fills histograms progressively.
- Plotter can use logical connections defined in XML files.
- Set of standard histograms of tag database is created.
- Google Gadget added.
- Workarounds FreeHEP problems in CERN J2EE server added.

SQLTuple Domain Decomposition and Dependencies
	SQLTuple implements AIDA interface as an extension of the FreeHEP JAIDA implementation. It contains several extensions (Inspector, Plotter) using AIDA to perform global operations on NTuples (not only SQLTuples). All the functionality is available using standard AIDA interfaces and WebService access.

FreeHEP AIDA Data Store Technologies
	AIDA NTuples can be stored using many different storage technologies (Compressed XML files, Root files, HBook files, several SQL databases). Operations between technologies (filterring, merging,...) are possible via standard AIDA channels. Within SQL technology, native SQL channels are used to speed up operations.

AIDA NTuple with SQL backend

Based on FreeHEP

Overview

Documentation

AIDA Persistency Architecture

SQLTuple Domain Decomposition and Dependencies

FreeHEP AIDA Data Store Technologies

HowTos

Bugs/ToDos/Requests

Release History

Related Documentation and Presentations