Introduction to CM2
Recently, BaBar's data storage and bookkeeping have
changed to a completely new system. The new model is called CM2
("Computing Model 2").
New users will almost certainly use only CM2, though it is a good
idea to be conversant with some aspects of the previous system as much
of the documentation was written in the pre-CM2 era, and many
BaBarians will use pre-CM2 language.
The original BaBar Event Store used two data-storage formats. The
Objectivity database was a large object-oriented database
with several levels of detail stored for each event. It could be used
for almost any analysis or detector study. Kanga (Kind ANd
Gentler Analysis) datasets stored only the micro-level information
(see below) in ROOT-type files. This
is the level of detail required for most physics analysis jobs, and
using Kanga avoided having to interact with the full Objectivity database
and the complications that often arose with it. The idea was to have
Objectivity as the main database, and to use Kanga files at remote sites.
The Objectivity database had four levels of detail: raw, reco,
micro, and nano (or "tag"). Raw and reco were very big databases that
kept virtually all of the details from every event. Micro was a
smaller, more user-friendly database that kept only the information likely
to be useful for physics analyses, as opposed to detector studies or
more refined analysis tasks. Nano ("tag") contained even less detail,
and was used only to skim data on a few key characteristics, which
avoided the time-consuming process of loading the full information for
every event. The original idea was to keep raw and reco
information for jobs like detector studies. In practice, raw and reco were
infrequently used, and only a small part of their information was ever
accessed.
The new CM2 Event Store has just one database, the "Mini". The
Mini database is basically an extended version of the micro, with
the additional capability to store information written into
"skims" by users ("user data", see below). The Mini contains all of the
information from the old micro database, plus the small part of raw
and reco that was actually used. The new data storage format is more
like Kanga than
anything else, so you may hear people refer to the CM2 Mini database
as "CM2 Kanga", "new Kanga" or (since old Kanga is obsolete) just
"kanga". You will also find that a lot of CM2 things are labeled
"kanga". For example, jobs accessing the CM2 Mini database must be
run on the kanga queue.
As a general rule, if you are not sure whether to use something (a
queue, a file, a module) with an "objy" ("Objectivity") label or a
"kan" ("kanga") label, you should go with the "kanga". You should also
favor "Mini", "root" and "Bbk", over "Micro", "hbook" and "bdb".
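The rule of thumb above can be written down as a small lookup. This is only a sketch: the label pairs come from the text, but the function names and the example queue names are hypothetical, and matching on substrings is an assumption made for illustration, not a BaBar convention.

```python
# Label pairs taken from the rule of thumb above; everything else is illustrative.
CM2_LABELS = ("kan", "mini", "root", "bbk")
LEGACY_LABELS = ("objy", "micro", "hbook", "bdb")


def classify(name):
    """Return "cm2", "legacy", or "unknown" based on substrings of the name."""
    lowered = name.lower()
    if any(label in lowered for label in CM2_LABELS):
        return "cm2"
    if any(label in lowered for label in LEGACY_LABELS):
        return "legacy"
    return "unknown"


def pick_preferred(candidates):
    """Return the first candidate with a CM2-era label, or None if there is none."""
    for name in candidates:
        if classify(name) == "cm2":
            return name
    return None


# Hypothetical names, invented for this example only.
options = ["example-objy-queue", "example-kanga-queue"]
print(pick_preferred(options))
```

A name that matches neither list is reported as "unknown", so the sketch does not guess in cases the rule of thumb does not cover.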
Another difference between the CM2 Mini and the old database
system is that the CM2 Mini allows for the storage of "user data":
user-defined composite candidate lists and user-calculated quantities.
Finally, the bookkeeping system has also been improved. Previously,
users had to use a variety of different tools to get information about
data, such as "skimData", "ir2runs", and "lumi". The bookkeeping
has now been integrated into one system, and all bookkeeping commands
begin with "Bbk" ("BaBar bookkeeping"). For example, you use
"BbkDatasetTcl" to find data, "BbkSPModes" to look for SP mode numbers,
and "BbkLumi" to determine luminosity. (To see them all, just type
"Bbk[TAB]". To find out more about one of them, say BbkDatasetTcl,
do "BbkDatasetTcl --help".)
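As a quick reference, the three commands named above can be collected in a table keyed by task. This is a minimal sketch: the helper name is hypothetical, and it assumes that "--help" works for every Bbk command, which the text only shows explicitly for BbkDatasetTcl. It builds the command lines but does not run them.

```python
# Task descriptions and command names are the three examples quoted in the text.
BBK_COMMANDS = {
    "find data": "BbkDatasetTcl",
    "look for SP mode numbers": "BbkSPModes",
    "determine luminosity": "BbkLumi",
}


def help_command(task):
    """Build the argument list for `<command> --help` (hypothetical helper)."""
    return [BBK_COMMANDS[task], "--help"]


# Print one help invocation per task; nothing is executed here.
for task in BBK_COMMANDS:
    print(f"{task}: {' '.join(help_command(task))}")
```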
Last modification: 13 June 2005
Last significant update: 13 June 2005 (page created)