
Editing Code

This section is intended to teach the user just enough about C++ to be able to make minor changes to existing code. A more detailed discussion of writing C++ is reserved for the later section, Writing Code. The links and references at the bottom of this page are useful resources for learning C++ more thoroughly.

Some of the conceptual information and terminology defined in the previous Object Orientation section might be useful background reading for this section.


Choose a text editor

You can edit C++ and tcl code in any text editor. Popular choices are xemacs and emacs.

emacs "knows" about different types of code, so if you name your file "foo.cc" or "foo.hh" then it will "know" that it is C++, and if you call it "foo.tcl" then it will "know" that it is tcl. emacs will then adopt a mode which applies a colour coding to different aspects of the file, a process called "fontifying". It can also be used to spot errors like failure to close brackets or to appropriately indent code (as in Fortran mode). For example, on my emacs editor, the first part of the Quicktour's NTrkExample.cc looks like this:


// in general, a module constructor should not do much.  The begin(job) or
// begin(run) members are better places to put initialization
NTrkExample::NTrkExample( const char* const theName, 
			  const char* const theDescription )
  : AppModule( theName, theDescription )
    , _btaChargedList("trackCandidates", this, "ChargedTracks")
{
}

Other text editors you could try include pico (the underlying editor for the pine email program), kwrite, nedit, and kate, all of which are pretty straightforward. vi is less intuitive to use.

The xemacs OO-Browser

The xemacs OO-Browser is a source code browser for developers. It allows easy navigation between the header files and implementation files. In addition, it can display the inheritance trees and with simple mouse-clicks, it can edit the corresponding header files. Note that it can be a bit problematic to run on some networks, particularly SunOS.

See the detailed page, The xemacs OO-Browser. (This page hasn't been updated for CM2, but the results are independent of computing model.)


C++ Syntax

Basic Syntax

C++ code is case sensitive: two variables named Test and test are distinct. However, the use of variable or function names that differ only by case is strongly discouraged. Variable and function names may consist of letters (upper and lower case), digits, and the underscore character '_'. Names must begin with either a letter or the underscore. Most of the C++ language itself consists of lower-case key words and punctuation.

Comments are indicated by a double slash '//'. All text from the double slash to the end of the line is a comment and is ignored by the compiler. An alternate syntax uses '/*' to begin a comment and '*/' to end a comment. This style of commenting is supported to be compatible with C code. The double slash is a preferred convention in BaBar C++ code.

A declaration is a statement that introduces a name into a program. It consists of four parts, two optional and two mandatory. The general structure is:

[specifier] <base type> <declarator> [initializer];
When the optional initializer is included the statement is both a declaration and a definition. With the exception of function definitions, declarations are ended with a semicolon.
int x;	//declaration 
int x = 5; //declaration and definition
The '= 5' in the last statement is the initializer. The equals sign '=' is the assignment operator.

A function declaration has the format:

<return type> ClassName::FunctionName(<type> <name>, <type> <name>, ...);

The type and name pairs in parentheses are the arguments of the function. When a function has more than one argument they are separated by commas. If a function has no arguments then the parentheses are still included but with nothing in between. When a function does not return any values or objects the return type is void. This declaration syntax may also be referred to as the function prototype.

The definition of a function typically consists of a series of statements. The code that comprises the function's definition is demarcated by curly braces. There is no semicolon at the end of a function's definition. For example:

int ExampleFunction(int i){ 
   int j; //declaration of j
   statement one;
   statement two;
   ...
   return j;
}
ExampleFunction takes one argument, an integer. Within the body of the function an integer named j is declared. Some calculations using the argument integer i and the function's integer j are performed. The final value of j is returned by ExampleFunction.
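
As a concrete illustration of the pattern above, here is a minimal sketch of a complete function with the placeholder statements filled in. The name doubleAndAddOne and the calculation it performs are assumptions chosen for illustration; they are not taken from BaBar code.

```cpp
// Hypothetical example: returns 2*i + 1.
int doubleAndAddOne(int i) {
   int j;          // declaration of j
   j = 2 * i;      // statement one: calculation using the argument i
   j = j + 1;      // statement two: update j
   return j;       // the final value of j is returned
}
```

Calling doubleAndAddOne(3) passes the integer 3 as the argument i, and the value of j computed inside the function is handed back to the caller.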

Typically a statement is one line of code, in either the body of a program or the definition of a function. Statements are a base unit of code and in C++, as in C, they must be terminated by a semicolon. A statement may fulfill one of many roles: declaration, definition, call to a function, allocation of memory, assignment, calculation, and so forth.

The main way to use a member function such as ClassName::ExampleFunction is to make an object (an "instantiation") of the class:

ClassName myClassObject;
Then call that function through your new instance of the class:
int returnedinteger = myClassObject.ExampleFunction(3);
where in this example, the integer 3 is passed to the example function, and the new integer "returnedinteger" is assigned the returned value of the function. (If you have a pointer to the object rather than the object itself, use '->' in place of '.'.)

Data Types, Pointers, and Arrays

When a declaration is made, memory is set aside for the declared variable. The compiler needs a type associated with all memory so that it can properly interpret the stored data. The most commonly used built-in types of C++ are: int (integer), double (floating point), bool (boolean), and char (character).

In addition to memory being one of these types, memory can be defined as a pointer to a type. A variable declared as a pointer is interpreted as an address of memory. The syntax of this declaration is to append an asterisk '*' to the type. For example:

int* x;
The variable x can store the address of an integer. It is important to recognize that the declaration sets aside (allocates) memory only for the pointer itself. The memory for the data pointed to must be allocated separately, and the pointer must be made to point to it before it is used. The ampersand '&' (the address-of operator) gives the address of an existing variable, and the asterisk '*' (the dereference operator) gives access to the memory that a pointer points to. For example:
int y = 0;  //declare an int named y
x = &y;     //store the address of y in x

*x = 5;     //assign 5 to the memory x points to (y is now 5)
int z = *x; //assign z the value x points to (both ints)
A built in data structure of C++ is the array. An array is a block of memory set aside for the contiguous storage of like elements. The declaration of an array must include the type and number of elements to be stored. An array is indicated by appending a set of square brackets '[ ]' to the variable name. For example:
int myarray[10]; //declare array of 10 integers
To define any element of the array, one may use the assignment operator and place the element index between the square brackets. Array indexing begins with zero, 0. Thus, for an array of ten integers valid indices are 0 through 9.
myarray[4] = 6; //assign 6 to the 5th element
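
To tie pointers and arrays together, here is a minimal sketch of two small functions. Both function names are hypothetical and exist only for this illustration.

```cpp
// Hypothetical helper: writes 'value' into the int that 'target' points to.
void setThroughPointer(int* target, int value) {
   *target = value;                 // dereference: assign to the pointed-to memory
}

// Hypothetical helper: sums the first n elements of an array of ints.
int sumArray(const int values[], int n) {
   int total = 0;
   for (int i = 0; i < n; ++i) {    // valid indices run from 0 to n-1
      total += values[i];
   }
   return total;
}
```

A caller with an int named y would write setThroughPointer(&y, 5); the '&' supplies the address of y, and the function changes y itself through that address.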

Class Syntax

User-defined data types are central to the design of C++. The most common unit of user-defined data in C++, and in BaBar code specifically, is the class.

To create a class a user must define it to the compiler. A class is identified by the name determined in the declaration and consists of data members and member functions (also called methods). The syntax of this declaration is:

class ClassName{
data members and member functions
};
This defines a new type and is thus referred to as a class definition. However, historically and analogous to general declarations, it is also called a class declaration. To add to the linguistic confusion, the implementation of the class (the code that defines each member function), is also called the class definition. For consistency I will refer to this initial step as the declaration and the implementation as the definition.

The data members and member functions of a class can be placed in one of three categories:

public
accessible to all code
protected
accessible only to members and friends of this class, and to members and friends of classes that inherit from this one
private
accessible only to members and friends of this class (not to classes that inherit from it)

Typically data members that store the state of the object should be private. This protects the implementation of the class: though users can read the declaration of a class, client code is not allowed to access its private data members, so the data is protected from being altered. The functions supporting the class, those that make it a useful entity for applications, should be made public.

Data members of a class are declared with the same syntax as variables. Similarly, member functions are declared with the same syntax as general functions. However, outside the class declaration the full name of a member function is ClassName::FunctionName. The '::' symbol is called the scope resolution operator. It indicates that FunctionName is a function of the ClassName class, so multiple classes with the same function names do not give rise to conflicts or ambiguity. When a member function is called on an object, it is accessed using the member access operator '.' (or '->' when the object is reached through a pointer). For example:

ClassName example;         //declare an object named example, type ClassName
example.FunctionName();    // call the function FunctionName associated
                           // with ClassName to act on example
Two special member functions are the constructor, which has the same name as the class, and the destructor, which has the class name prepended with a tilde '~' character. The constructor is called whenever an object of its class type is created. Similarly the destructor is called whenever an object of the class type is destroyed, for example when it goes out of scope or is deleted.

When the class is declared, the member functions and data members are also declarations. The body of the class declaration is within curly braces and is terminated with a semicolon. The class definition provides each member function's definition. The definition of each member function is contained within curly braces and is not terminated with a semicolon.

Here is an example class declaration and class definition.
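
As a minimal sketch of how a declaration and a definition fit together, consider the small class here. TrackCounter and its members are hypothetical names invented for this illustration; they do not exist in the BaBar software.

```cpp
// Declaration: normally placed in a header file such as TrackCounter.hh
class TrackCounter {
public:
   TrackCounter();              // constructor: same name as the class
   ~TrackCounter();             // destructor: class name with a tilde
   void addTracks(int n);       // member function that changes the state
   int total() const;           // member function that reports the state
private:
   int _total;                  // data member storing the object's state
};

// Definition: normally placed in an implementation file such as TrackCounter.cc
TrackCounter::TrackCounter()
   : _total(0)
{
}

TrackCounter::~TrackCounter()
{
}

void TrackCounter::addTracks(int n)
{
   _total += n;
}

int TrackCounter::total() const
{
   return _total;
}
```

Client code can declare a TrackCounter and call addTracks and total on it, but it cannot read or change _total directly because that data member is private.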

Loops and Conditional Statements

C++ offers a set of operators for performing comparison. If the relationship is satisfied then a boolean (bool) true is returned, else a bool false is returned.

Operator   Definition
==         equal to
!=         not equal
>          greater than
>=         greater or equal
<          less than
<=         less or equal
Some built-in C++ statements take a boolean value as their argument. If the conditional argument is true the rest of the statement is executed; if it is false, it is not. When the body to be executed consists of only one statement, that statement is terminated with a semicolon. When there are multiple statements to be executed, they are bounded by curly braces. Each statement in the statement body is terminated by a semicolon, whereas the body itself is not (no semicolon after the closing curly brace). Perhaps the most common conditional statements for analysis are the if and the while statements.

If statements are useful for execution of a block of code once after a test has been satisfied. For example, only if a particle is within a mass range should analysis be continued. The conditional test is performed, if it evaluates to true the statement body is executed. In the if/else block if the condition evaluates to false then the else body of statements is executed.

  • if (condition) statement;
  • if (condition) { statements.... }
  • if (condition) { statements... } else { statements... }
While statements are used when a block of code should be executed as long as the test/conditional argument is true. If the condition evaluates to true then the statements in the body are executed. Execution of code then returns to the condition and evaluates it again. This sequence will continue until the condition evaluates to false. While loops are very useful for running quick checks and making plots.
  • while (condition) statement;
  • while (condition) { statements... }
A common source of bugs is accidentally typing the assignment operator '=' where the equality comparison '==' was intended, or vice versa.
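
The effect of this mistake can be seen in a short sketch. The function names are hypothetical. The second function spells out what a condition written as (nTracks = 2) actually evaluates to.

```cpp
// Correct: '==' compares the two values and leaves nTracks unchanged.
bool isTwoTrackEvent(int nTracks) {
   if (nTracks == 2) {
      return true;
   }
   return false;
}

// What the typo does: '=' stores 2 in nTracks, and the expression itself
// has the value 2, which counts as true when used as a condition.
bool assignmentUsedAsCondition(int nTracks) {
   bool conditionValue = (nTracks = 2);   // same value as: if (nTracks = 2)
   return conditionValue;                 // true whatever was passed in
}
```

Many compilers can warn about an assignment used as a condition, so it is worth reading compiler warnings rather than ignoring them.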

General Code Structure

While loops and if statements can be used on their own, as part of a function, and be nested within themselves and each other. The body of code associated with each statement is marked by curly braces. Each open curly brace must be matched with a closed curly brace. When there is nesting the entire nested statement must be within the body of the outer statement. In a multiply nested sequence the closed braces will be associated in reverse order from the open braces. That is the first open curly brace will be matched by the last closed curly brace.

When a variable is declared, memory is allocated for it (in an area of memory called the stack). The scope of a variable begins at its declaration and lasts until the body of code in which it was declared has finished executing.

For example, a variable declared within a member function is allocated at the time of its declaration. When execution of the code moves past the closing curly brace of the function the variable is said to have gone out of scope. When this happens the memory that was allocated to the variable is no longer reserved. That memory can now be reused by the program, and the variable that has gone out of scope should not be accessed.

It is important to keep track of the body delimiters of these statements for compile purposes, for illuminating the scope of variables, and for making code readable. A standard convention, also used in BaBar code, is the use of indentation when nesting occurs. All code in the body of a statement should be indented systematically. The amount of indentation should correspond to the level of nesting. The curly brace closing the body of a statement should be placed on its own line at the same level of indentation as the opening part of the statement.

Look here for an example.
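
A minimal sketch of nested statements, indented by nesting level and annotated with the scope of each variable, might look like this. The function name and its arguments are assumptions made for the illustration.

```cpp
// Hypothetical example: counts the entries of 'values' (length n) that lie
// strictly between 'low' and 'high'.
int countInRange(const double values[], int n, double low, double high) {
   int count = 0;                 // in scope until the function's closing brace
   int i = 0;
   while (i < n) {
      double v = values[i];       // v exists only inside the while body
      if (v > low) {
         if (v < high) {
            count = count + 1;
         }
      }
      i = i + 1;
   }                              // v goes out of scope here
   return count;
}
```

Each closing brace sits on its own line at the same indentation as the statement it closes, so the extent of every body can be read off the left margin.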

Iterators

Sequential storage of data is a common occurrence. Often general packages of code or libraries make available typical structures such as arrays, lists, and vectors. For these containers to be functional, a user must be able to traverse the elements, often in a systematic or all-inclusive manner. At the same time the encapsulation of implementation details must be preserved. The notion of an iterator satisfies both of these requirements.

An iterator is an abstraction of a pointer to an element. It is typically implemented as a class or function associated with a given container class. The iterator points to one element in the sequence and has access to the information needed to move to the next element of the sequence. It also has access to information that will determine the end of the data sequence. Concepts supported by a general iterator are the idea of the current element, next element/incrementation, and equality/comparison.

In BaBar each reconstructed event contains many lists of like data, for example lists of pions, charged tracks, and so forth. Iterators used in conjunction with loops facilitate execution of a segment of analysis code on each element in a list. For example, an iterator is used to access a charged track in an event's list, and then the momentum is plotted in a histogram. This sequence continues to loop until each track of the list has been plotted.
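
The three iterator concepts (current element, incrementation, and comparison) can be seen in a short sketch that uses the standard library's std::vector rather than a BaBar list class. The function name is hypothetical, and the momenta are assumed to be non-negative magnitudes.

```cpp
#include <vector>

// Hypothetical example: returns the largest value in a list of momenta,
// or 0 if the list is empty.
double largestMomentum(const std::vector<double>& momenta) {
   double largest = 0.0;
   std::vector<double>::const_iterator it = momenta.begin();  // first element
   while (it != momenta.end()) {   // comparison: stop after the last element
      if (*it > largest) {         // *it is the current element
         largest = *it;
      }
      ++it;                        // incrementation: move to the next element
   }
   return largest;
}
```

The loop never needs to know how the vector stores its elements, which is the encapsulation that iterators are meant to preserve.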


C++ Structure

Program Organization

For large programs it is not reasonable for all of the code to exist in one file. This is due to readability, maintenance, and primarily compile time. If all of the code were in one unit, even the smallest change would require re-compilation of all code. To avoid this very costly dependence, code is partitioned into a set of coherent modules. The physical structure, the system of code files, is likely to reflect the logical structure of the program.

The many units of a source code in a large program must be mutually consistent. For one, types in declarations must be uniform throughout all units of code. A primary method of accomplishing this is to gather all declarations and interface information into one place, a header file, while placing the definition code into an implementation file.

Header File

Header files contain the declarations an implementation file wants to make available to other units of code. A header file should typically include type definitions, function declarations, and name declarations. By BaBar convention header files have names with the suffix '.hh'.

Units of code, files, access the code declared in a header file by using a preprocessor include command. The syntax is:

#include "<header file name>"

Before code is compiled the preprocessor inserts a copy of the header file into every file that includes it, at the position of the include command. A header file may be included in many code files, and even more than once in the same file (for example, indirectly through other header files), but its declarations must be seen only once in each file being compiled. To prevent repeated inclusion of header file code the following macro syntax is used.

#ifndef <definevalue>
#define <definevalue>

...header file contents

#endif

The first time the preprocessor sees the header file while processing a given source file, definevalue is not yet defined, so it is defined and the header file contents are passed on to the compiler. When the preprocessor comes to the header file again within the same source file, definevalue is already defined, so everything between the ifndef and the endif is ignored. BaBar convention sets definevalue to the name of the header file in all capital letters ended by _HH. For example, the NTrkExample.hh file is defined NTRKEXAMPLE_HH.

Implementation File

All of the source code for the implementation and definition of a header file's declarations is placed in an implementation file. Complete function definitions should be placed in the implementation file. By BaBar convention implementation files have names with the suffix '.cc'.

The implementation must have access to the declarations and types that it defines, so it must include its own header file.
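
The split between a guarded header file and its implementation file can be sketched as follows. MassWindow is a hypothetical class invented for this illustration. The two files are shown one after the other in a single listing, with the self-include left as a comment so that the listing stands on its own.

```cpp
// ---- Contents of a hypothetical header file, MassWindow.hh ----
#ifndef MASSWINDOW_HH
#define MASSWINDOW_HH

// A class declaration: seeing it twice in one source file would be an
// error, which is what the guard prevents.
class MassWindow {
public:
   MassWindow(double low, double high);
   bool contains(double mass) const;
private:
   double _low;
   double _high;
};

#endif

// ---- Contents of the matching implementation file, MassWindow.cc ----
// #include "MassWindow.hh"    // the implementation includes its own header

MassWindow::MassWindow(double low, double high)
   : _low(low), _high(high)
{
}

bool MassWindow::contains(double mass) const
{
   return mass > _low && mass < _high;
}
```

Following the convention described above, the guard name is the header file name in capital letters followed by _HH.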

Standard Libraries

Standard libraries are included with the C++ language to provide commonly used and needed functions and types. Accessing the code of a standard library is analogous to using user-written source code. Any code making use of a standard library must include it. The include syntax is the same as for including header files except the standard library name is enclosed by angle brackets instead of double quotes.

#include <<library name>>
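
For instance, the standard header cmath provides mathematical functions such as std::sqrt. The helper function in this sketch is a hypothetical example, not part of any BaBar package.

```cpp
#include <cmath>    // standard library header: angle brackets, not quotes

// Hypothetical helper: magnitude of a three-vector (px, py, pz).
double magnitude(double px, double py, double pz) {
   return std::sqrt(px * px + py * py + pz * pz);
}
```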

The Class and Package Structure of BaBar Software

Classes

In BaBar analysis software each analysis module is implemented as a class. Each module/class has an associated header and implementation file dedicated to its definition and implementation.

The important member functions of a module/class for analysis work are the constructor and the beginJob(), event(), and endJob() functions. The role of these functions has been covered in a previous chapter: Framework: the Environment for Physics Event Reconstruction.

Packages

BaBar (reconstruction and simulation) software is organized into packages. A package is a self-contained piece of software intended to perform a well-defined task, e.g. finding calorimeter clusters or simulating the drift chamber response. Each package has a unique name and its own library and include files. Some packages may not be usable on their own and require integration with others, for example the individual subsystem simulation packages which together form the Geant simulation of BaBar.


Histograms and the HepTuple Package

The BaBar environment offers a facility to book (that is, create) histograms in C++. The package which allows one to perform this task is HepTuple. In particular, the HepTuple package includes the histogram class HepHistogram.

The analysis module NTrkExample class (from the sample analysis job) books a histogram of the number of tracks per event. If an analysis module is to use classes from the HepTuple histogramming package, its header file must declare the collaborating classes. In NTrkExample.hh, you have:

    //------------------------------------//
    // Collaborating Class Declarations   //
    //------------------------------------//
    class HepHistogram;
An analysis module that will book a histogram needs to have a (preferably private) histogram data member.
     HepHistogram* _numTrkHisto;
The beginJob method of the analysis module (in the .cc file) contains the code that calls the histogram manager to define the histogram data member. To make use of the C++ HBOOK code the source file of the analysis module needs to include the defining header files:
   #include "HepTuple/TupleManager.h"
   #include "HepTuple/Histogram.h"
From within the beginJob function a histogram manager needs to be declared via:
   HepTupleManager* manager = gblEnv->getGen()->ntupleManager();
and then used to book a histogram, which initializes the histogram data member of the class.
  _numTrkHisto = manager->histogram("Tracks per Event",  20, 0., 20. );
The histogram is declared with four arguments:
  1. the title ("Tracks per Event"),
  2. the number of bins (20),
  3. the lower limit of the histogram (0.), and
  4. the upper limit of the histogram (20.).
This completes the declaration and the definition of the histogram. The histogram can be filled for each event (from within the event function of the analysis module) with a call to the accumulate member function of the histogram object.
 _numTrkHisto->accumulate( trkList->length() );
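
To make the idea of booking and filling more concrete, here is a toy histogram class written in plain C++. ToyHistogram, its constructor arguments, and its binContent function are assumptions made for this sketch. It is not the HepTuple HepHistogram class and does not reproduce that class's interface beyond borrowing the name accumulate.

```cpp
#include <cstddef>
#include <vector>

// ToyHistogram is a hypothetical stand-in written for this illustration;
// it is NOT the HepTuple HepHistogram class.
class ToyHistogram {
public:
   ToyHistogram(int nBins, double low, double high);   // assumes nBins > 0
   void accumulate(double x);        // add one entry at value x
   int binContent(int bin) const;    // number of entries in a bin
private:
   std::vector<int> _bins;
   double _low;
   double _high;
};

ToyHistogram::ToyHistogram(int nBins, double low, double high)
   : _bins(nBins, 0), _low(low), _high(high)
{
}

void ToyHistogram::accumulate(double x)
{
   if (x < _low || x >= _high) {
      return;                        // this toy simply ignores out-of-range values
   }
   double width = (_high - _low) / _bins.size();
   std::size_t bin = static_cast<std::size_t>((x - _low) / width);
   if (bin >= _bins.size()) {
      bin = _bins.size() - 1;        // guard against floating-point rounding
   }
   _bins[bin] += 1;
}

int ToyHistogram::binContent(int bin) const
{
   return _bins[bin];
}
```

With 20 bins between 0 and 20, as in the booking call above, each bin is one unit wide, so an event with 3 tracks lands in the bin with index 3.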
The default name for the output file from MyMiniAnalysis.tcl is MyMiniAnalysis.root. If you want a different name you will need to make changes from the framework. For example, in the Quicktour you overrode this default with:
 set histFileName myHistogram.root 
in your snippet.tcl file.

BaBar Analysis Examples

Annotated Quick Tour Analysis Code

As a first step in becoming familiar with some of the analysis code, I have annotated the C++ header and source files for the NTrkExample analysis module. This module is appended to the MyAnalysis sequence in the quick tour analysis, and is used to generate the histogram of the number of tracks per event. The comments inserted for these purposes are in blue. Everything else is as these files will be found in their respective directories (circa Jan 2006).

Begin with the NTrkExample.hh file and note the use of the HepTuple histogram package.

Using Loops and Lists to Plot a Histogram

With the future intention of modifying the NTrkExample module, let's compose a segment of code that would histogram the momentum of charged tracks in an event. To begin with we need a list of Beta candidates.
  // get list of input track candidates
  HepAList<BtaCandidate>* trkList =
    Ifd<HepAList< BtaCandidate > >::get(anEvent, _btaChargedList.value());
This declares a variable named trkList, which is a pointer to a HepAList of BtaCandidates. It then calls the function Ifd<HepAList<BtaCandidate> >::get, passing it the event and a key word. A BaBar strategy for objects with multiple data members of one type, such as an event with many lists of Beta candidates, is to use key words to differentiate between them. This function call initializes the trkList variable to the event's list associated with the key word returned by _btaChargedList.value( ).

Now for each event, we add the number of tracks in that event to the histogram:

 _numTrkHisto->accumulate( trkList->length() ); 

Booking Another Histogram

At this point it is a relatively straightforward task to add another histogram to the quick tour analysis. You have two options: modify the existing NTrkExample module class or create a new module class. Creating a new module class involves a few more steps but is likely to be useful information. It involves the following steps: create a header and implementation file, add the histogram code, and load the new module into the framework.

It is simplest to create a header and implementation file by copying a template. To start you can copy the NTrkExample .cc and .hh to files called PExample (.cc and .hh respectively), or any other name you wish. You will need to replace all instances of NTrkExample with PExample (or your chosen name). Most importantly, this will include the preprocessor definition name in the header file, the #included (self) header file name in the implementation, and all instances in member functions in both the header and implementation file.

Once the name has been consistently modified, add the new code. For the data to persist over multiple events the histogram needs to be of a greater scope than the event( ) function. The logical and conventional place to add this histogram is as a data member of the PExample class. The PExample class is declared in the header file and this is where the modification should be made. To do this you need only add the line:

   HepHistogram*            _pHisto;
as a private data member.

A PExample object will now have two HepHistogram pointers at the time of instantiation. Only the pointer memory has been allocated. Before the histogram is used the pointer must be initialized. This should be done by adding the following lines to the definition of the beginJob( ) member function of the PExample class (in the PExample.cc file):

   HepTupleManager* manager = gblEnv->getGen()->ntupleManager();
   assert(manager != 0);
In this case these lines already exist in the beginJob( ) member function from the NTrkExample class that we copied. They do not need to be added again. Once a HepTupleManager exists, you can ask it to book a new histogram on your behalf.
   _pHisto = manager->histogram("Momentum",  25,  0.,  1. ); 
The first argument (in quotes) is the title of the histogram, the second argument (an integer) is the number of bins, the third and fourth arguments (doubles) are the low and high values of the x-axis.

The block of code developed in the previous section will histogram the momentum per track given an event. Now you will add a code segment to the definition of the PExample::event( ) member function in PExample.cc to ensure that it is executed on every event in the analysis job.

In your number-of-tracks histogram, all you needed was the length of the tracks list, which is a property of the tracks list as a whole. But momentum is a property of a single track. So next we need to declare an iterator associated with the list of Beta candidates, and a pointer to a Beta candidate:

  // Loop over track candidates to plot momentum 
   HepAListIterator<BtaCandidate> iterTrk(*trkList);
   BtaCandidate* trk;
   while ( 0 != ( trk = iterTrk()) ) {
    _pHisto->accumulate( trk->p() );
   }

This loops over each member of *trkList, and adds its momentum to the histogram _pHisto.
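
The loop relies on the iterator returning a null pointer (0) once the list is exhausted. A plain C++ sketch of that idiom follows. ToyTrack and ToyTrackIterator are hypothetical stand-ins written for this illustration. They are not the BtaCandidate or HepAListIterator classes, and only the shape of the loop is taken from the code above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a track with a momentum.
class ToyTrack {
public:
   explicit ToyTrack(double p) : _p(p) {}
   double p() const { return _p; }
private:
   double _p;
};

// Hypothetical stand-in for a list iterator.
class ToyTrackIterator {
public:
   explicit ToyTrackIterator(std::vector<ToyTrack>& list)
      : _list(list), _next(0) {}
   // Returns a pointer to the next track, or 0 when the list is exhausted.
   ToyTrack* operator()() {
      if (_next >= _list.size()) {
         return 0;
      }
      return &_list[_next++];
   }
private:
   std::vector<ToyTrack>& _list;
   std::size_t _next;
};

// Sums the momenta of all tracks using the same loop shape as above.
double sumMomenta(std::vector<ToyTrack>& tracks) {
   double sum = 0.0;
   ToyTrackIterator iterTrk(tracks);
   ToyTrack* trk;
   while (0 != (trk = iterTrk())) {   // assignment first, then test against 0
      sum += trk->p();
   }
   return sum;
}
```

Here the assignment inside the condition is deliberate: trk receives the next pointer, and the comparison with 0 ends the loop after the last track.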

The block of code added to the PExample::event() function makes use of some functions that were not used by the NTrkExample class. Whenever you make use of new code, you need to verify that the defining header file has been included by the current .cc file. In this example a HepAListIterator has been introduced into the PExample module. For the PExample module to compile the following line must be added to the top of the .cc file (along with the many other included files).

#include "CLHEP/Alist/AIterator.h"

CLHEP, the Class Library for High Energy Physics, is a package that contains general utility classes. If you are looking for the home of a class or function named HepSomething, a good place to start is in this package. (analysis-30 packages are all located in $BFROOT/dist/releases/analysis-30.)

In addition to writing the module class code you will also need to modify AppUserBuildBase.cc and MyMiniAnalysis.tcl so that the new PExample module is available to the framework. To do this, once again you need to replace the lines with "NTrkExample" with the corresponding "PExample" lines. First, in AppUserBuildBase.cc you need to #include the header file for the module:

   #include "BetaMiniUser/PExample.hh"

Then, to create and load the module in the framework include the following line in the constructor of the AppUserBuildBase class:

  theBuild->add(new PExample("PExample", "Workbook example module"));

Don't forget to append the module to your analysis path, which is defined in the MyMiniAnalysis.tcl file. A line similar to the following should accomplish this.

   path append Everything PExample 

Once a module is available to the framework (i.e. it has been written, compiled, and loaded via the AppUserBuildBase class) it can be easily included in or excluded from your analysis by making changes to the .tcl file.

Working examples of the PExample .hh, .cc, and AppUserBuildBase.cc files are in the WorkBook's PExample directory, at

  $BFROOT/www/doc/workbook/NewExamples/PExample/
The C++ code must be re-compiled and re-linked (gmake all) before the changes will be incorporated into the executable. You will do this in the next section of the workbook: Compile, Link, and Run.

General Related Documents:



Author: Tracey Marsh
Contributors:
Joseph Perl
James Weatherall
Last updated: Jenny Williams

Last modification: 1 July 2005
Last significant update: 7 June 2005