ATLAS analysis
Overview
These wiki pages collect information related to ATLAS data analysis. There are several ways to analyse ATLAS data, and that is not very likely to change in the next few months, when the first data become available. The information provided in these pages focuses on an analysis procedure that will surely be allowed by the collaboration, does not rely on additional software that would need extra debugging, and can be applied to ESD (Event Summary Data), AOD (Analysis Object Data) and DPD (Derived Physics Data). The main points of the proposed way of analysis are:
- write the analysis code in the form of Athena Algorithms and Tools
- if convenient, reduce the size of the AOD and write the reduced AOD out (using common python scripts)
- produce histograms in Athena or alternatively dump some information into ROOT NTuples (as late in the analysis chain as possible)
This approach makes it possible to use, and to contribute to, the collaboration's (at least C++) tools (which is not necessarily the case when using EventView or ARA). The downside of the proposed analysis path is that a lot of code must be written (and debugged). In my opinion, however, common collaboration-recommended Athena tools will start appearing very soon (this is already true for the top group), and will be available for each published analysis.
After the AOD analysis description, the proposed code for the top mass analysis in ttbar events is presented. The analysis code is split into tools, which are easily reused and can be replaced by tools written, debugged and approved by the collaboration. The goal of this organisation is that analysis selection and reconstruction tools and methods can be included without a substantial danger of introducing bugs.
The wiki pages also contain notes on ROOT analysis, which might be useful in cases when results are needed quickly or for analysis which is not done on the official samples (such as systematics studies). Other analysis options, which might gain importance in the future, are summarized. There are also some (local) grid how-tos, notes on the use of TAGs and on luminosity information extraction. At the bottom of this page I have collected links to the pages and documents which I find particularly useful.
Athena (C++) AOD analysis
Running Athena
Up-to-date instructions on how to run Athena on lxplus can be found in the ATLAS workbook; the commands needed to run the local Athena (kit) in r14 are available here.
Inspect the AOD
From Athena it is possible to find out what the AOD (in fact any POOL format) contains if you run:
checkFile.py myfile.pool.root
prints Mem Size, Disk Size, Size/Evt, #items and ContainerName for all Trees and Branches
checkSG.py myfile.pool.root
prints a table of Container type | StoreGate keys (you need this to retrieve containers from SG for AOD analysis)
By using the StoreGateService Dump method, you can get information on Container KEYS and Classes (useful for AOD analysis, because you can use LXR to see what info is available for the objects you are dealing with). The jobOptions for this are here.
Truth info of the event can be inspected with DumpMC or PrintMC, by adding these lines to your jobOptions:
theApp.Dlls += ["TruthExamples" ]
theApp.TopAlg = ["DumpMC"]
DumpMC=Algorithm("DumpMC")
# change as appropriate: "GEN_EVENT" for gen.pool file, "TruthEvent" for ESD
DumpMC.McEventKey='GEN_AOD'
DumpMC.McEventOutputKey='GEN_AOD'
Reducing AOD size and DPD production
Within Athena, code is partitioned into units called packages. User interaction with Athena proceeds by using Athena Services and writing Athena Tools and Algorithms.
For information on available Athena Services please consult . A summary of useful information on Tools and Algorithms is available
here.
Basic AOD/DPD analysis example
Generator-level AOD/DPD analysis example (top mass)
ttbar at gen level from A-Z:
Reconstruction-level AOD/DPD analysis example: ttbar top mass
In this section a ttbar analysis framework is presented. The framework is very simple and relies on Athena code / utilities. It is organised as follows:
- selection (particle, MET, truth, event-level (e.g. eventtype))
- overlap removal (implemented: jet-electron)
- truth-matching (jets to light- and b-quarks)
- top mass analysis (possibly for multiple selection cuts and using multiple reconstruction methods).
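The overlap removal stage listed above can be illustrated with a minimal sketch in plain Python (not the framework's Athena C++ code). It assumes that a jet is dropped when a selected electron lies within a cone of ΔR around it. The function names, the dictionary-based objects and the cut value of 0.2 are hypothetical choices made for this illustration only. The same ΔR distance is what a truth-matching stage would typically use to associate jets with quarks.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in the (eta, phi) plane, with dphi wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def remove_overlapping_jets(jets, electrons, dr_cut=0.2):
    """Return the jets that have no electron closer than dr_cut (hypothetical cut value)."""
    return [
        jet for jet in jets
        if all(delta_r(jet["eta"], jet["phi"], el["eta"], el["phi"]) >= dr_cut
               for el in electrons)
    ]


# Toy event: the first jet sits on top of the electron (across the phi = +-pi boundary),
# the second jet is far away from it.
electrons = [{"eta": 0.50, "phi": 3.10}]
jets = [{"eta": 0.55, "phi": -3.12}, {"eta": -1.20, "phi": 0.40}]
kept = remove_overlapping_jets(jets, electrons)
```

Wrapping the azimuthal difference matters here: without it the first jet would appear to be more than 6 units away from the electron in phi and would be kept by mistake.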
For each stage the code is organised in (Athena) tools. The benefits of such organisation are:
- easy code-reuse
- simpler debugging (of a single tool instead of the whole analysis)
- easy replacement of tools with "official ATLAS" tools
- easy implementation of new selection cuts / analysis methods
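As an illustration of what a single "analysis method" in the top mass stage might compute, here is a minimal plain-Python sketch. It assumes one simple and commonly used reconstruction choice: the hadronic top candidate is the three-jet combination with the largest vector-sum pT, and its invariant mass is the observable. This is an assumption for illustration, not necessarily one of the methods implemented in the framework, and all function names are hypothetical.

```python
import math
from itertools import combinations


def four_vector(pt, eta, phi, mass=0.0):
    """Build (E, px, py, pz) from pt, eta, phi and mass (any consistent energy unit)."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)


def invariant_mass(vectors):
    """Invariant mass of the sum of the given four-vectors."""
    energy, px, py, pz = (sum(component) for component in zip(*vectors))
    m2 = energy * energy - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding errors


def vector_sum_pt(vectors):
    """Transverse momentum of the vector sum of the given four-vectors."""
    return math.hypot(sum(v[1] for v in vectors), sum(v[2] for v in vectors))


def hadronic_top_candidate_mass(jets):
    """Mass of the three-jet combination with the largest vector-sum pT, or None."""
    if len(jets) < 3:
        return None
    best = max(combinations(jets, 3), key=vector_sum_pt)
    return invariant_mass(best)


# Toy event with four massless jets (values are made up).
jets = [
    four_vector(80.0, 0.3, 0.1),
    four_vector(60.0, -0.5, 1.0),
    four_vector(40.0, 1.1, -0.7),
    four_vector(25.0, 0.0, 3.0),
]
mass = hadronic_top_candidate_mass(jets)
```

Because the method is a single function of the selected jets, a different reconstruction method can be added next to it without touching the selection or overlap removal code.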
The framework is relatively lightweight and effort has been made to keep it as transparent as possible: the main algorithm (configurable via Athena jobOptions) is the class AnalysisDriver : public Algorithm, into which the necessary tools (deriving from AlgTool and also configurable via jobOptions) are plugged. These tools can be replaced by "official ATLAS" tools with a similar purpose. I think that the general ATLAS analysis will also be organised into approximately the proposed stages, so official ATLAS tools with a similar purpose will become available. The plan is to drop the framework tools when official tools become available.
The framework will then become just a placeholder where the official tools are to be plugged in, making the analysis acceptable as the official ATLAS analysis (as far as I understand). The proposed analysis code / procedure should work on any POOL format data (ESD, AOD and (at least primary and secondary) DPD). By using AOD size reduction techniques it enables the analysis to progress until the histograms are produced. Dumping some quantities into a ROOT Tree is implemented since it might be useful for short tests or (non-official) test analysis. Some properties of the proposed framework are inspired by EventView (in particular the tool-composed analysis in stages), but in contrast to EventView the tools inherit from AlgTool rather than a dedicated BaseTool.
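The idea of a driver algorithm into which configurable tools are plugged can be sketched in plain Python. This is only a structural illustration: it does not use the Athena Algorithm or AlgTool interfaces, and the class names AnalysisDriverSketch, JetPtSelector and JetCounter, as well as the dictionary-based event, are hypothetical stand-ins.

```python
class JetPtSelector:
    """Stand-in for a selection tool: keeps jets with pt >= pt_min."""

    def __init__(self, pt_min):
        self.pt_min = pt_min  # plays the role of a jobOptions-configurable property

    def __call__(self, event):
        out = dict(event)
        out["jets"] = [jet for jet in event["jets"] if jet["pt"] >= self.pt_min]
        return out


class JetCounter:
    """Stand-in for a final analysis tool: records the jet multiplicity."""

    def __call__(self, event):
        out = dict(event)
        out["n_jets"] = len(event["jets"])
        return out


class AnalysisDriverSketch:
    """Runs the plugged-in tools in order; it contains no analysis logic itself."""

    def __init__(self, tools):
        self.tools = list(tools)

    def execute(self, event):
        for tool in self.tools:
            event = tool(event)
        return event


event = {"jets": [{"pt": 45.0}, {"pt": 12.0}, {"pt": 30.0}]}

# Framework tools plugged into the driver.
driver = AnalysisDriverSketch([JetPtSelector(pt_min=20.0), JetCounter()])
result = driver.execute(event)

# Replacing one tool (here: a differently configured selector) leaves the driver untouched.
tighter = AnalysisDriverSketch([JetPtSelector(pt_min=40.0), JetCounter()])
tighter_result = tighter.execute(event)
```

The driver only fixes the order of the stages, so swapping a framework tool for another tool with the same interface needs no change to the driver, which is the property that lets it shrink to a placeholder.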
At present there is no generally accepted ATLAS analysis framework, and I expect that ATLAS code will develop in the form of AlgTool production rather than common framework production. Therefore I think a lightweight, modular, Athena-based analysis framework (which should eventually become just a placeholder) is currently the best analysis starting point.
The scheme of the framework is shown in the Figure below, while a more complete description, a link to the code and info on the framework performance evaluation are available here.
ROOT and ROOT NTuple analysis
ROOT NTuple analysis cannot be used for the official ATLAS analysis. Because NTuple analysis is simple and fast, and NTuples can be customized easily, it might be useful for private studies, or when you want to test the results of your large-scale analysis before running it on the grid and the DPD production scripts and UserData service are not working. In any case the final analysis plots will be produced with ROOT, so a basic script for plotting histograms is provided.
A good source of ROOT-related information is http://root.cern.ch/ .
Overview of other analysis options
ARA - Athena ROOT Access
ARA (in principle) enables analysis of POOL files to be migrated from Athena to ROOT (pyROOT). When a POOL file is opened with ARA, a transient tree is created whose branches are named after the StoreGate keys of the POOL file's containers. TTree methods can then be used to get some quick plots.
For a proof of principle, you can try these 14.2.(pyROOT) instructions.
For a more complete working ARA example I recommend the ARA section of the PAT 14.2.10 Tutorial.
More information on ARA is available on the
ATLAS wiki pages:
https://twiki.cern.ch/twiki/bin/view/Atlas/AthenaROOTAccess .
ARA has been working since spring 2008 and currently there is considerable interest within the PAT group in maintaining and developing it.
As an analysis option it is much closer to Athena analysis than ROOT ntuple analysis is, because ARA analysis can be done on POOL files and because it supports the use of ARA/Athena dual-use tools (ARA tools that can be used as valid Athena tools). Apart from reducing the complexity of analysis I can't see much sense in migrating analysis from Athena to ROOT, so I don't plan to use ARA for analysis. ARA might be useful for doing quick test plots of the contents of (small) POOL files using TTree methods (and it has a nice visualisation utility called PED).
Python-based Athena analysis: PyAthena
PyAthena (by S. Binet) is a framework which enables Athena analysis to be done from within python. It has been available since Athena r14 and I will try to test it soon. Until then please see the ATLAS PyAthena wiki page or the PyAthena tutorial, and please update the wiki if you have or find more useful information.
Using the Grid for analysis
Analysis that uses large data samples or would take a lot of time to run on a PC can be submitted to the grid. The local grid usually works faster and is more reliable than CERN's LCG, but the LCG has to be used when the required datasets are not accessible on the local grid.
Using the Local Grid
Sending Analysis Job to Grid from lxplus
Links
Links to information I have found particularly useful are provided.
General information on ATLAS analysis:
contains (mostly) up-to-date workbook on
ATLAS simulation and analysis, Athena and Grid.
Athena framework:
The wiki information on Athena is modest and not very useful at the time of writing (Sept. 08). I recommend:
- Gaudi User Guide
- Athena User Guide: the link on the ATLAS wiki page is not working (and has not been working for a while); please use the copy at /afs/f9.ijs.si/home/liza/CERN/doc/Athena.pdf (found on some page on the web).
Both documents are somewhat old and lack information on the use of python scripting, jobOptions and configurables, which can be found on the ATLAS wiki pages.
Configuration Management Tool:
ROOT:
Useful packages and code examples (LXR links):
- Athena package and algorithm example
- Small AOD Analysis example
- Large(r)-scale AOD analysis examples
Grid-related:
ATLAS software:
- LXR : for searching files by name/contents
- CVS : for comparing code-differences between releases
AnalysisModelForum report:
--
LizaMijovic - 28 Oct 2008