
This page was last edited by PK on June 28, 2005.

 

How to access data quicker using the stripped ETC

This would be the preferred way to analyse stripped DSTs if everything were working as it should. It is much quicker than accessing all DSTs, since at most 40 of the roughly 700 events on each DST are useful to you. Make sure you read the big "BUT" at the end of the page before you try anything.

To get stripped ETC (SETC), go to the bookkeeping database, select DC04v1 data, inclusive bb, SETC1 datatype and SETC1 as output1. Submit and click on the Gaudi logo.

The options file that the bookkeeping currently produces is not in the correct format. You need to edit it so that it looks like this:

ApplicationMgr.ExtSvc += {"TagCollectionSvc/EvtTupleSvc"};
EventSelector.Input   = {
  "COLLECTION='TagCreator/1' DATAFILE='PFN:castor:/castor/cern.ch/lhcb/STRIP04/00710190/ETC_00710190_00000001.root'  TYP='POOL_ROOTTREE' SEL='(PreselBu2LLK>=1)'",
  "COLLECTION='TagCreator/1' DATAFILE='PFN:castor:/castor/cern.ch/lhcb/STRIP04/00710194/ETC_00710194_00000001.root'  TYP='POOL_ROOTTREE' SEL='(PreselBu2LLK>=1)'"
};
//-- End of Data cards

where "PreselBu2LLK" in the SEL criterion "(PreselBu2LLK>=1)" has to be replaced by the name of your preselection algorithm. This block replaces the DST input data and provides pointers to the events that have been accepted by your preselection. For the rest, follow the recommendations in the section about DST (the CheckSelResult algorithm is no longer needed, but does no harm either).
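With many ETC files, editing the block by hand gets tedious. The sketch below builds the same fragment from a list of castor paths and a preselection name. The helper name etc_input_block is hypothetical, and it assumes that every entry follows exactly the pattern of the example above (collection 'TagCreator/1', a 'PFN:castor:' prefix and a criterion of the form '(Name>=1)').

```python
def etc_input_block(etc_paths, presel, collection="TagCreator/1"):
    """Return a Gaudi options fragment reading ETC files for one preselection.

    Hypothetical helper: the entry layout is copied from the example on this
    page and is assumed to be the same for every file.
    """
    entries = [
        f"  \"COLLECTION='{collection}' DATAFILE='PFN:castor:{path}'"
        f"  TYP='POOL_ROOTTREE' SEL='({presel}>=1)'\""
        for path in etc_paths
    ]
    lines = [
        'ApplicationMgr.ExtSvc += {"TagCollectionSvc/EvtTupleSvc"};',
        "EventSelector.Input   = {",
        ",\n".join(entries),  # comma between entries, none after the last
        "};",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    paths = [
        "/castor/cern.ch/lhcb/STRIP04/00710190/ETC_00710190_00000001.root",
        "/castor/cern.ch/lhcb/STRIP04/00710194/ETC_00710194_00000001.root",
    ]
    print(etc_input_block(paths, "PreselBu2LLK"))
```

Run as a script, this prints a block for the two files used in the example; paste it into your options file in place of the DST input.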

You can also use the FETC file, which then points to the initial BB events. This is the only way of recovering the downscaled events.

To run on ETCs you need a file catalogue. This can be obtained by running the following command:

> genCatalog options.opts -p pool.xml -s CERN

where options.opts is the list of logical file names (select this option!) you get from the bookkeeping when retrieving the stripped BB data (for running on SETC) or the BB data (for running on FETC), pool.xml is the XML file to generate and CERN is the site where you are running (use CERN unless the data is replicated at your site).

This file has to be declared in your DaVinci job as:

PoolDbCacheSvc.Catalog = { "xmlcatalog_file:pool.xml" };


BUT

All the above is theory. The problem is that the bookkeeping has been corrupted at the production level. The file catalogue you get from the bookkeeping by running the genCatalog script is wrong: every "File ID" in it is incorrect. There is in principle a way to correct that, since the file ID is stored in the DST. One needs to run the $GAUDIPOOLDBROOT/cmt/xmlCatalog.C ROOT script on each file to generate a correct entry in the file catalogue. This is feasible for the stripped DST but not for the BB DST, where one would have to stage the whole BB statistics to do it.
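When comparing a catalogue from genCatalog with one regenerated from the files themselves, it can help to list which File ID each catalogue assigns to each physical file. The sketch below does that for an XML catalogue string. It assumes the usual POOL layout, in which each File element carries an ID attribute and contains pfn elements with a name attribute; that layout is not described on this page, and the function name catalog_entries and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET


def catalog_entries(xml_text):
    """Return (file_id, pfn) pairs found in a POOL-style XML catalogue.

    Assumed layout (not confirmed by this page):
    <File ID="..."><physical><pfn name="..."/></physical></File>
    """
    root = ET.fromstring(xml_text)
    pairs = []
    for file_node in root.iter("File"):
        file_id = file_node.get("ID")
        for pfn in file_node.iter("pfn"):
            pairs.append((file_id, pfn.get("name")))
    return pairs


if __name__ == "__main__":
    # Made-up catalogue content, for illustration only.
    sample = (
        "<POOLFILECATALOG>"
        '<File ID="AAAA-0001">'
        '<physical><pfn filetype="ROOT_All" name="castor:/some/file.root"/></physical>'
        "</File>"
        "</POOLFILECATALOG>"
    )
    for file_id, pfn in catalog_entries(sample):
        print(file_id, pfn)
```

Printing the pairs from both catalogues side by side shows which entries differ, without having to read the XML by eye.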


Any comment is very welcome. Flame me here.