GAUDI User Guide

Chapter 12
Converters

12.1  Overview

Consider a small piece of the LHCb detector; a silicon wafer for example. This "object" will appear in many contexts: it may be drawn in an event display, it may be traversed by particles in a Geant4 simulation, its position and orientation may be stored in a database, the layout of its strips may be queried in an analysis program, etc. All of these uses or views of the silicon wafer will require code.

How to encompass the need for these different views within Gaudi was one of the key issues in the design of the framework. In this chapter we outline the design adopted for the framework and look at how the conversion process works. This is followed by two sections which deal with the technicalities of writing converters for reading SicB data and for reading from and writing to ROOT files.

12.2  Persistency converters

Since release 2, Gaudi can read event data from either Zebra or ROOT files and write data back to disk in ROOT files. This data may then of course be read again at a later date. Figure 16 is a schematic illustrating how converters fit into the transient-persistent translation of event data. We will not discuss in detail how the transient data store (e.g. the event data service) or the persistency service work, but simply look at the flow of data in order to understand how converters are used.

 

Figure 16 Persistency conversion services in Gaudi

One of the issues considered when designing the Gaudi framework was the capability for users to "create their own data types and save objects of those types along with references to already existing objects". A related issue was the possibility of having links between objects which reside in different stores (i.e. files and databases) and even between objects in different types of store.

The Gaudi framework can save data objects into ROOT-based files. Since our principal store of data for the moment is still SicB data in Zebra files, we see immediately an application of the issues mentioned above. Figure 16 shows that data may be read from SicB files and ROOT files into the transient event data store and that data may be written into ROOT files. It is the job of the persistency service to orchestrate this transfer of data between memory and disk.

The figure shows two "slave" services: the SicB conversion service and the RIO (ROOT I/O) service. These services are responsible for managing the conversion of objects between their transient and persistent representations. Each one has a number of converter objects which are actually responsible for the conversion itself. As illustrated by the figure a particular converter object converts between the transient representation and one other form, here either Zebra or ROOT.

12.3  Collaborators in the conversion process

In general the conversion process occurs between the transient representation of an object and some other representation. In this chapter we will be using persistent forms, but it should be borne in mind that this could be any other "transient" form such as those required for visualisation or those which serve as input into other packages (e.g. Geant4).

Figure 17 shows the interfaces (classes with names beginning in I) which must be implemented in order for the conversion process to function. The conversion process is essentially a collaboration between the following types:

Figure 17 The classes (and interfaces) collaborating in the conversion process.

 

For each persistent technology, or "non-transient" representation, a specific conversion service is required. This is illustrated in the figure by the class AConversionSvc which implements the IConversionSvc interface.

A given conversion service will have at its disposal a set of converters. These converters are both type and technology specific. In other words, a converter knows how to convert a single transient type (e.g. MuonHit) into a single persistent type (e.g. RootMuonHit) and vice versa. Specific converters implement the IConverter interface, possibly by extending an existing converter base class.

The third collaborators in this process are the opaque address objects. A concrete opaque address class must implement the IOpaqueAddress interface. This interface allows the address to be passed around between the transient data service, the persistency service, and the conversion services without any of them being able to actually decode the address. Opaque address objects are also technology specific. The internals of a SicbAddress object are different from those of a RootAddress object.

Only the converters themselves know how to decode an opaque address. In other words only converters are permitted to invoke those methods of an opaque address object which do not form a part of the IOpaqueAddress interface.
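The split between the public address interface and the technology-specific part can be sketched as follows. This is a minimal illustration, not the Gaudi declarations: the class SicbAddressSketch, its bankName() accessor, the helper decodeBank() and the storage type values are all hypothetical; only the method names svcType() and clID() are taken from the text.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins; the real Gaudi declarations differ in detail.
typedef unsigned long CLID;
const unsigned char SICB_StorageType = 1;  // illustrative values only
const unsigned char ROOT_StorageType = 2;

// The services that pass addresses around see only these two methods.
class IOpaqueAddress {
public:
  virtual ~IOpaqueAddress() {}
  virtual unsigned char svcType() const = 0;  // technology identifier
  virtual const CLID& clID() const = 0;       // class identifier
};

// Technology-specific address: the extra accessor is meant for converters only.
class SicbAddressSketch : public IOpaqueAddress {
public:
  SicbAddressSketch(const CLID& clid, const std::string& bank)
    : m_clid(clid), m_bank(bank) {}
  unsigned char svcType() const { return SICB_StorageType; }
  const CLID& clID() const { return m_clid; }
  const std::string& bankName() const { return m_bank; }  // hypothetical accessor
private:
  CLID m_clid;
  std::string m_bank;
};

// A converter "decodes" the address by casting to the concrete type it expects.
// An address of another technology yields an empty string here.
std::string decodeBank(const IOpaqueAddress* addr) {
  assert(addr != 0);
  const SicbAddressSketch* sicb = dynamic_cast<const SicbAddressSketch*>(addr);
  return sicb ? sicb->bankName() : std::string();
}
```

Code that holds only an IOpaqueAddress pointer can route the request but cannot reach bankName(), which is the property the design relies on.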

Converter objects must be "registered" with the conversion service in order to be usable. For the "standard" converters this will be done automatically. For user defined converters (for user defined types) this registration must be done at initialisation time (see Chapter 6).

12.4  The conversion process

As an example (see Figure 18) we consider a request from the event data service to the persistency service for an object to be loaded from a data file.

Figure 18 A trace of the creation of a new transient object.

 

As we saw previously, the persistency service has one conversion service slave for each persistent technology in use. The persistency service receives the request in the form of an opaque address object. In order to decide which conversion service the request should be passed on to, the svcType() method of the IOpaqueAddress interface is invoked. This returns a "technology identifier" which allows the persistency service to choose a conversion service.

The request to load an object (or objects) is then passed on to a specific conversion service. This service then invokes another method of the IOpaqueAddress interface, clID(), in order to decide which converter will actually perform the conversion. The opaque address is then passed on to the concrete converter, which knows how to decode it and create the appropriate transient object.

The converter handles exactly one type, thus it may immediately create an object of that type with the new operator. The converter must now "unpack" the opaque address, i.e. make use of accessor methods specific to the address type in order to get the necessary information from the persistent store.

For example, a SicB converter might get the name of a bank from the address and use that to locate the required information in the SicB common block. On the other hand, a ROOT converter may extract a file name, the name of a ROOT TTree and an index from the address and use these to load an object from a ROOT file. The converter would then use the accessor methods of this "persistent" object in order to extract the information necessary to build the transient object.

We can see that the detailed steps performed within a converter depend very much on the nature of the non-transient data and (to a lesser extent) on the type of the object being built.
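The two-level dispatch described above (technology identifier first, class identifier second) can be sketched with plain maps. All class names and signatures below are simplified stand-ins chosen for this illustration; the real services work with IOpaqueAddress objects and return status codes, and the identifier values 2 and 101 are invented.

```cpp
#include <cassert>
#include <map>
#include <string>

typedef unsigned long CLID;

struct DataObject { virtual ~DataObject() {} std::string origin; };
struct MuonHit : DataObject {};

// Stand-in for a concrete opaque address.
class Address {
public:
  Address(unsigned char type, CLID clid, const std::string& where)
    : m_type(type), m_clid(clid), m_where(where) {}
  unsigned char svcType() const { return m_type; }
  CLID clID() const { return m_clid; }
  const std::string& where() const { return m_where; }  // converter-only detail
private:
  unsigned char m_type;
  CLID m_clid;
  std::string m_where;
};

class Converter {
public:
  virtual ~Converter() {}
  virtual DataObject* createObj(const Address& a) = 0;
};

class MuonHitCnv : public Converter {
public:
  DataObject* createObj(const Address& a) {
    MuonHit* hit = new MuonHit;  // the converter knows the concrete type
    hit->origin = a.where();     // "unpack" the address
    return hit;
  }
};

// Chooses a converter by class identifier.
class ConversionSvc {
public:
  void addConverter(CLID id, Converter* c) { assert(c != 0); m_cnv[id] = c; }
  DataObject* createObj(const Address& a) {
    std::map<CLID, Converter*>::iterator it = m_cnv.find(a.clID());
    return it == m_cnv.end() ? 0 : it->second->createObj(a);
  }
private:
  std::map<CLID, Converter*> m_cnv;
};

// Chooses a conversion service by technology identifier.
class PersistencySvc {
public:
  void addCnvService(unsigned char type, ConversionSvc* s) { m_svc[type] = s; }
  DataObject* createObj(const Address& a) {
    std::map<unsigned char, ConversionSvc*>::iterator it = m_svc.find(a.svcType());
    return it == m_svc.end() ? 0 : it->second->createObj(a);
  }
private:
  std::map<unsigned char, ConversionSvc*> m_svc;
};
```

In this sketch a request with an unknown technology or class identifier simply yields a null pointer; registering a converter corresponds to the addConverter() call.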

If all transient objects were independent, i.e. if there were no references between objects, then the job would be finished. However, in general objects in the transient store do contain references to other objects.

These references can be of two kinds:

12.5  Converter implementation - general considerations

After covering the ground work in the preceding sections, let us look at exactly what needs to be implemented in a specific converter class. The starting point is the Converter base class from which a user converter should be derived. For concreteness let us partially develop a converter for the UDO class of Chapter 6.

Listing 38 An example converter class

 
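// Converter for class UDO.
extern const CLID& CLID_UDO;
extern unsigned char Objectivity_StorageType;

class UDOCnv : public Converter {
public:
  UDOCnv(ISvcLocator* svcLoc) :
      Converter(Objectivity_StorageType, CLID_UDO, svcLoc) { }

  // Transient -> persistent and persistent -> transient conversion.
  StatusCode createRep(DataObject* pO, IOpaqueAddress*& a);
  StatusCode createObj(IOpaqueAddress* pa, DataObject*& pO);

  // Only needed if UDO objects hold links to other objects.
  StatusCode fillObjRefs( ... );
  StatusCode fillRepRefs( ... );
};

// Factory used by the framework to instantiate the converter.
static CnvFactory<UDOCnv> s_factory;
const ICnvFactory& UDOCnvFactory = s_factory;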

The converter shown in Listing 38 is responsible for the conversion of UDO type objects into objects that may be stored into an Objectivity database and vice-versa. The UDOCnv constructor passes this information to the Converter base class constructor in its first two arguments: Objectivity_StorageType, which is defined elsewhere, and CLID_UDO, defined in the UDO class. The two extern statements simply state that these two identifiers are defined elsewhere.

All of the "book-keeping" can now be done by the Converter base class. It only remains to fill in the guts of the converter. If objects of type UDO have no links to other objects, then it suffices to implement the methods createRep() for conversion from the transient form (to Objectivity in this case) and createObj() for the conversion to the transient form.

If the object contains links to other objects then it is also necessary to implement the methods fillRepRefs() and fillObjRefs().
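The reason for splitting object creation from link filling can be illustrated with a small sketch: the object is first built on its own, and its links are resolved in a second step once the referenced objects are available. The types, the string-keyed store and the method signatures below are assumptions made for this illustration only; the real fillObjRefs() signature is not reproduced here.

```cpp
#include <cassert>
#include <map>
#include <string>

struct DataObject { virtual ~DataObject() {} };

struct Vertex : DataObject {
  double z;
  Vertex() : z(0.0) {}
};

struct Particle : DataObject {
  Vertex* origin;  // link to another object in the transient store
  Particle() : origin(0) {}
};

// Hypothetical persistent form: links are stored as keys, not pointers.
struct PersistentParticle { std::string originKey; };

// Hypothetical stand-in for the transient store.
typedef std::map<std::string, DataObject*> Store;

class ParticleCnvSketch {
public:
  // Step 1: build the object alone; links stay unset.
  Particle* createObj(const PersistentParticle&) { return new Particle; }

  // Step 2: resolve the stored key against objects already in the store.
  bool fillObjRefs(const PersistentParticle& rep, Particle* obj,
                   const Store& store) {
    assert(obj != 0);
    Store::const_iterator it = store.find(rep.originKey);
    if (it == store.end()) return false;
    obj->origin = dynamic_cast<Vertex*>(it->second);
    return obj->origin != 0;
  }
};
```

Keeping the two steps apart means the order in which objects are created does not matter, as long as all of them exist before the links are filled.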

12.6  SICB Converters

12.6.1  General considerations

As mentioned previously, a converter must implement the IConverter interface by deriving from a specific base class. In this way any actions which are common to all converters of a specific technology may be implemented in a single place.

Access to the SicB data sets is basically via wrappers to the Fortran code. A complete event is read into the Zebra common block, and then the conversion to transient objects is done on request.

Two base classes which implement the IConverter interface are provided in order to ease the writing of specific converters. These are SicbItemCnv for converting "simple" data objects, and SicbSingletoListCnv for converting a group of objects into a container of non-identifiable objects, e.g. an ObjectVector of tracks. This is shown in Figure 19.

Figure 19 The SicB specific converter classes and an example trace of transient object creation.

 

The createObj() methods are implemented in the base classes and instantiate the appropriate objects (and container if required). Additionally they decode the SicbAddress objects passed from the event data service and set a pointer into the SicB common block. The actual setting of the object attributes is done by implementing the updateObj(int*,DataObject*) method.
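The division of labour between base class and specific converter can be sketched as follows. The template ItemCnvSketch, the class RadiatorHitSketch and the two-word bank layout are invented for this illustration; only the idea that the base class creates the object and a derived updateObj(int*, DataObject*) fills its attributes comes from the text.

```cpp
#include <cassert>

struct DataObject { virtual ~DataObject() {} };

// Hypothetical base class: owns the creation step, defers attribute filling.
template <class T>
class ItemCnvSketch {
public:
  virtual ~ItemCnvSketch() {}
  // 'bank' stands for the pointer into the common block.
  DataObject* createObj(int* bank) {
    T* obj = new T;
    if (!updateObj(bank, obj)) { delete obj; return 0; }
    return obj;
  }
protected:
  virtual bool updateObj(int* bank, DataObject* obj) = 0;
};

struct RadiatorHitSketch : DataObject {
  int radiator;
  int photons;
  RadiatorHitSketch() : radiator(0), photons(0) {}
};

// Specific converter: only copies data words into the object.
class RadiatorHitCnvSketch : public ItemCnvSketch<RadiatorHitSketch> {
protected:
  bool updateObj(int* bank, DataObject* obj) {
    assert(obj != 0);
    RadiatorHitSketch* hit = dynamic_cast<RadiatorHitSketch*>(obj);
    if (!hit || !bank) return false;
    hit->radiator = bank[0];  // word layout is invented for this sketch
    hit->photons  = bank[1];
    return true;
  }
};
```

With this arrangement a new converter only has to describe how the bank words map onto attributes; creation and clean-up on failure stay in one place.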

In the following section we give detailed instructions on how to implement converters within the SicbCnv package. These are intended primarily for Gaudi developers themselves.

12.6.2  Implementing converters in the SicbCnv package

SicB converters are currently available only for reading; writing back into persistent storage (ZEBRA files) is not possible at present.

Let us assume one wants to introduce a new class MCRichRadiatorHit, which contains Monte Carlo data from the SicB bank RIRW (defined in rirw.ddf). In the package LHCbEvent, one should do these modifications:

  1. Define your class MCRichRadiatorHit and place the files MCRichRadiatorHit.h (and maybe also MCRichRadiatorHit.cpp ) in the directory LHCbEvent/MonteCarlo (output of DAQ would be placed in the subdirectory Raw, output of reconstruction program would be placed in the subdirectory Rec).
  2. To access objects of your newly defined class MCRichRadiatorHit, you have to register the logical path to it. The whole logical structure of the LHCb event is in the files LHCbEvent/TopLevel/EventModel.h and .cpp. In EventModel.h, declare the logical path to your objects inside the namespace MC:

    _EXTERN_ std::string MCRichRadiatorHitPath;

    In the file LHCbEvent/TopLevel/EventModel.cpp, implement that logical path

    EventModel::MC::MCRichRadiatorHitPath
        = EventModel::MC::EventPath + "/MCRichRadiatorHit";

    and define the class identification number

    const CLID& CLID_MCRichRadiatorHit = 240;    // Agreed unique integer number

To write the corresponding converter, one has to do the following modifications in the package SicbCnv:

  1. Write the converter SicbMCRichRadiatorHitCnv and place the files SicbMCRichRadiatorHitCnv.h and SicbMCRichRadiatorHitCnv.cpp in the directory SicbCnv/Sicbxx.
  2. There may be only one converter per class (which can be used for more objects of the same type, e.g. SicbMCCalorimeterHitCnv is used to convert hadron calorimeter hits, electromagnetic calorimeter hits, and preshower hits, all of the same type, MCCalorimeterHit). The data members will be copied from the common block to objects in this converter.
  3. The new data leaf containing the Rich radiator hits must be made known to the parent object (how otherwise would you ever be able to navigate to it?). For this purpose insert the following entry in the function SicbMCEventCnv::updateObjRefs. This is an example of how to add macroscopic references.

    addDataLeaf(ent,
                new RegistryEntry( EventModel::MC::MCRichRadiatorHitPath, 0 ),
                new SicbAddress( CLID_MCRichRadiatorHit, fid, recid, "RIRW" )
               );

  4. References have to be filled. Internal references can be initialized using the load on demand mechanism, which ensures that unused references will not be automatically converted. See any implementation file in the package SicbCnv for an example of a class using SmartRefs.
  5. The converter factory must be made known to the system. This in fact depends on the linking mechanism: if the converter is linked into the executable as an object file, no action is necessary. However, usually the converter code resides in a shared or archive library. In this case the library must have an initialisation routine which creates an artificial reference to the created converter and forces the linker to include the code in the executable. An example of creating such a reference can be found in the file SicbCnv/SicbCnvDll/SicbCnv_load.cpp. The convention for these initialization files is the following: for any other package replace the string "SicbCnv" with "OtherPackage".

12.7  Storing Data using the ROOT I/O Engine (RIO)

One possibility for storing data is to use the ROOT I/O engine to write ROOT files. Although ROOT by itself is not an object oriented database, with modest effort a structure can be built on top to allow the Converters to emulate this behaviour. In particular, the issue of object linking had to be solved in order to resolve pointers in the transient world.

ROOT's concept of paged tuples, called trees and branches, is adequate for storing bulk event data. Trees split into one or several branches containing individual leaves with data. The data structure within the Gaudi data store is tree like (see Figure 20).

Figure 20 The transient data store and its mapping in the ROOT file. Note that the "/" used within the data store to separate layers is converted to "#", since "/" within ROOT denotes directory entries.

 

In the transient world, Gaudi objects are instances of subclasses of DataObject. The DataObject offers some basic functionality, such as the implicit data directory, which allows e.g. a data store to be browsed. This tree structure is mapped to a flat structure in the ROOT file, resulting in a separate tree representing each leaf of the data store. Each data tree contains a single branch containing objects of the same type. The Gaudi tree is split up into individual ROOT trees in order to give easy access to individual items represented in the transient model without the need to load complete events from the ROOT file, i.e. to allow for selective data retrieval. The ROOT feature of selective data reading using split trees did not seem attractive, since generally complete nodes in the transient store should be made available in one go.
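The flattening of a data store path into a name usable for a ROOT tree, as noted in the caption of Figure 20, can be sketched with a one-line character substitution. The function name treeNameFromPath is hypothetical, and whether the leading separator is kept in the real implementation is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper: derive a flat tree name from a data store path by
// replacing every "/" with "#". The leading separator is kept (assumption).
std::string treeNameFromPath(std::string path) {
  // '#' is treated as reserved for the separator role in this sketch.
  assert(path.find('#') == std::string::npos);
  std::replace(path.begin(), path.end(), '/', '#');
  return path;
}
```

Each distinct leaf path in the transient store therefore yields a distinct flat name, which is what allows one tree per leaf.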

However, ROOT expects "ROOT" objects, i.e. objects which inherit from TObject. Therefore the objects from the transient store have to be converted to objects understandable by ROOT.

The following sections are an introduction to the machinery provided by the Gaudi framework to achieve the migration of transient objects to persistent objects. The ROOT specific aspects are not discussed here; the documentation of the ROOT I/O engine can be found at the ROOT web site (http://root.cern.ch). Note that Gaudi only uses the I/O engine, so not all ROOT classes are available.

12.8  The Conversion from Transient Objects to ROOT Objects

As for any conversion of data from one representation to another within the Gaudi framework, conversion to/from ROOT objects is based on Converters. A "generic" Converter is supported, which accesses pre-defined entry points in each object: the transient object converts itself to an abstract byte stream.

However, for specialized objects specific converters can be built by virtual overrides of the base class.

Whenever objects must change their representation within Gaudi, data converters are involved. For the ROOT case the converters must have some knowledge of ROOT internals and of the service finally used to migrate ROOT objects (-> TObject) to a file. In the same way the converter must be able to translate the functionality of the DataObject component to/from the ROOT storage. The persistent equivalent of the DataObject is the RootObject, and the job of the RootBaseConverter is to preserve this functionality when storing and reading back. Hence, the RootObject and its converter are common to all objects identifiable in the transient data store.

The instantiation of the appropriate converter is done by a macro. The macro also instantiates the converter factory used to instantiate the requested converter. Hence, all other user code is shielded from the implementation and definitions of the ROOT specific code.

Listing 39 Implementing a "generic" converter for the transient class Event.

/// Converter implementation
#include "RootCnv/RootSvc/RootEvtDataCnv.h"
RootEventItemConverterImp(Event)

The macro needs a few words of explanation: the instantiated converter is able to create transient objects of type Event. The corresponding persistent type is of a generic type; the data are stored as a machine independent byte stream. It is mandatory that the Event class implements a streamer method "serialize". An example of the Event class is shown in Listing 40.

The instantiated converter is of the type RootGenericConverter and the instance of the instantiating factory has the instance name RootEventCnvFactory.

Listing 40 Serialisation of the class Event.

/// Serialize the object for writing
virtual StreamBuffer& serialize( StreamBuffer& s ) const {
  DataObject::serialize(s);
  return s
    << m_event
    << m_run
    << m_time;
}
/// Serialize the object for reading
virtual StreamBuffer& serialize( StreamBuffer& s ) {
  DataObject::serialize(s);
  return s
    >> m_event
    >> m_run
    >> m_time;
}
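The pair of serialize methods, one const for writing and one non-const for reading, can be exercised with a much simplified buffer. MiniBuffer and EventSketch are hypothetical stand-ins written for this illustration: MiniBuffer copies raw bytes and is therefore not machine independent, unlike the byte stream described above, and the use of long for the three data members is an assumption.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Hypothetical, much simplified stand-in for a stream buffer.
class MiniBuffer {
public:
  MiniBuffer() : m_pos(0) {}
  MiniBuffer& operator<<(long v) {
    const char* p = reinterpret_cast<const char*>(&v);
    m_data.insert(m_data.end(), p, p + sizeof v);
    return *this;
  }
  MiniBuffer& operator>>(long& v) {
    assert(m_pos + sizeof v <= m_data.size());  // refuse to read past the end
    std::memcpy(&v, &m_data[m_pos], sizeof v);
    m_pos += sizeof v;
    return *this;
  }
  void rewind() { m_pos = 0; }
private:
  std::vector<char> m_data;
  std::size_t m_pos;
};

class EventSketch {
public:
  EventSketch(long event = 0, long run = 0, long time = 0)
    : m_event(event), m_run(run), m_time(time) {}
  // Writing does not modify the object, hence const.
  MiniBuffer& serialize(MiniBuffer& s) const {
    return s << m_event << m_run << m_time;
  }
  // Reading overwrites the data members, in the same order as written.
  MiniBuffer& serialize(MiniBuffer& s) {
    return s >> m_event >> m_run >> m_time;
  }
  long event() const { return m_event; }
  long run() const { return m_run; }
  long time() const { return m_time; }
private:
  long m_event, m_run, m_time;
};
```

Because the two methods differ only in const-ness, the write version is selected only when the object is accessed through a const reference or pointer; calling serialize on a non-const object selects the read version.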

12.8.1  Non Identifiable Objects

Non identifiable objects cannot be retrieved from or stored in the data store directly. Usually they are small and in any case they are contained by a container object. Examples are particles (class MCParticle), hits (class MCHitBase and others) or vertices (class MCVertex). These classes can be converted using a generic container converter. Container converters exist currently for lists and vectors. The containers rely on the serialize methods of the contained objects. The serialisation is able to understand smart references to other objects within the same data store, e.g. the reference from the MCParticle to the MCVertex. Listing 41 shows an example of the serialize methods of the MCParticle class.

 

Listing 41 Entry points for the container converter of non identifiable objects.

/// Serialize the object for writing
inline StreamBuffer& MCParticle::serialize( StreamBuffer& s ) const {
  ContainedObject::serialize(s);
  unsigned char u = (m_oscillationFlag) ? 1 : 0;
  return s
    << m_fourMomentum
    << m_particleID
    << m_flavourHistory
    << u
    << m_originMCVertex(this)    // Stream a reference to another object
    << m_decayMCVertices(this);  // Stream a vector of references
}

/// Serialize the object for reading
inline StreamBuffer& MCParticle::serialize( StreamBuffer& s ) {
  ContainedObject::serialize(s);
  unsigned char u;
  s >> m_fourMomentum
    >> m_particleID
    >> m_flavourHistory
    >> u
    >> m_originMCVertex(this)    // Stream a reference to another object
    >> m_decayMCVertices(this);  // Stream a vector of references
  m_oscillationFlag = (u) ? true : false;
  return s;
}
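How a generic container converter can rely on the serialize methods of the contained objects may be sketched as follows. This is an illustration only: it uses standard text streams in place of StreamBuffer, the names HitSketch, writeContainer and readContainer are hypothetical, and smart references between objects are not covered.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <vector>

// Hypothetical contained object with the two serialize entry points.
struct HitSketch {
  int channel;
  double energy;
  HitSketch(int c = 0, double e = 0.0) : channel(c), energy(e) {}
  std::ostream& serialize(std::ostream& s) const {
    return s << channel << ' ' << energy << ' ';
  }
  std::istream& serialize(std::istream& s) {
    return s >> channel >> energy;
  }
};

// Generic writer: stores the element count, then delegates to each element.
template <class T>
void writeContainer(std::ostream& s, const std::vector<T>& items) {
  s << items.size() << ' ';
  for (std::size_t i = 0; i < items.size(); ++i) items[i].serialize(s);
}

// Generic reader: recreates the elements and lets each one fill itself.
template <class T>
std::vector<T> readContainer(std::istream& s) {
  std::size_t n = 0;
  s >> n;
  assert(!s.fail());  // the element count must be readable
  std::vector<T> items(n);
  for (std::size_t i = 0; i < n; ++i) items[i].serialize(s);
  return items;
}
```

The container code never looks at the data members of the contained type, so one such converter can serve any class that provides the two serialize methods.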