These wiki pages collect information related to ATLAS data analysis. There are several ways to analyze ATLAS data, and that is unlikely to change in the next few months, when the first data become available. The information provided in these pages focuses on an analysis procedure that will surely be allowed by the collaboration, does not rely on additional software (which would need extra debugging) and can be applied to ESD (Event Summary Data), AOD (Analysis Object Data) and DPD (Derived Physics Data). The main points of the proposed way of analysis are:
- write the analysis code in the form of Athena Algorithms and Tools
- if convenient, reduce the size of the AOD and dump it to AOD (using common python scripts)
- produce histograms in Athena or, alternatively, dump some information into ROOT NTuples (as late in the analysis chain as possible)
This way of analysis makes it possible to use and to provide (at least C++) collaboration tools, which is not necessarily the case when using EventView or ARA. The downside of the proposed analysis path is that a lot of code must be written (and debugged). In my opinion, however, common Athena tools recommended by the collaboration will start appearing very soon (this is already true for the top group) and will be available for each published analysis.
After the AOD analysis description, the proposed code for the top mass analysis in ttbar events is presented. The analysis code is split into tools, which are easily reused and can be replaced by tools written, debugged and approved by the collaboration. The goal of this organisation is that analysis selection and reconstruction tools and methods can be included without a (substantial) danger of introducing bugs.
The wiki pages also contain notes on ROOT analysis, which might be useful in cases when results are needed quickly or for analysis which is not done on the official samples (such as systematics studies). Other analysis options, which might gain importance in the future, are summarized. There are also some (local) grid how-tos and notes on the use of TAGs and on luminosity information extraction. At the bottom of this page, links to the pages and documents which I find particularly useful are collected.
Athena (C++) AOD analysis
Up-to-date instructions on how to run Athena on lxplus can be found in the ATLAS workbook; the commands needed to run the local Athena (kit) in r14 are available here
Inspect the AOD
From Athena it is possible to find out what the AOD (in fact any POOL format) contains if you run:
- prints Mem Size, Disk Size, Size/Evt, #items and ContainerName for all Trees and Branches
- prints a Container type | StoreGate keys table (you need this to retrieve containers from SG for AOD analysis)
- by using the StoreGateService Dump method, you can get information on Container KEYS and Classes (useful for AOD analysis, because you can use LXR to see what info is available for the objects you are dealing with); the jobOptions for this are here
- truth info of the event can be inspected with DumpMC, adding these lines to your jobOptions:
theApp.Dlls += ["TruthExamples" ]
theApp.TopAlg = ["DumpMC"]
# change as appropriate: "GEN_EVENT" for gen.pool file, "TruthEvent" for ESD
Reducing AOD size and DPD production
Within Athena, code is partitioned into units called packages. User interaction with Athena proceeds by using Athena Services and writing Athena Tools and Algorithms.
For information on available Athena Services please consult . A summary of useful information on Tools and Algorithms is available
Basic AOD/DPD analysis example
Generator-level AOD/DPD analysis example (top mass)
ttbar at gen level from A-Z:
Reconstruction-level AOD/DPD analysis example: ttbar top mass
In this section a ttbar analysis framework is presented. The framework is very simple and relies on Athena code / utilities. It is organised in the following stages:
- selection (particle, MET, truth, event level (e.g. eventtype))
- overlap removal (implemented: jet-electron)
- truth-matching (jets to light- and b-quarks)
- top mass analysis (possibly for multiple selection cuts and using multiple reconstruction methods).
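Two of the stages listed above can be illustrated with a minimal sketch in plain Python (not Athena code, and not the framework's actual implementation). It assumes that jets and electrons are simple dictionaries with `eta` and `phi` entries and that four-vectors are `(E, px, py, pz)` tuples; the function names and the cone size of 0.2 are hypothetical placeholders chosen only for illustration.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Distance in the (eta, phi) plane; delta-phi is wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def remove_overlapping_jets(jets, electrons, cone=0.2):
    """Keep only jets that lie outside `cone` of every selected electron.

    `cone` is a placeholder value, not a recommended cut.
    """
    return [
        jet for jet in jets
        if all(delta_r(jet["eta"], jet["phi"], el["eta"], el["phi"]) > cone
               for el in electrons)
    ]


def invariant_mass(four_vectors):
    """Invariant mass of the sum of (E, px, py, pz) four-vectors."""
    e = sum(v[0] for v in four_vectors)
    px = sum(v[1] for v in four_vectors)
    py = sum(v[2] for v in four_vectors)
    pz = sum(v[3] for v in four_vectors)
    m2 = e * e - (px * px + py * py + pz * pz)
    # Guard against small negative values caused by rounding.
    return math.sqrt(max(m2, 0.0))
```

In such a sketch a hadronic top candidate mass would be obtained by passing the four-vectors of the three chosen jets to `invariant_mass`; how the three jets are chosen is exactly what the different reconstruction methods would vary.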
For each stage, code is organised in (Athena) tools. The benefits of such organisation are:
- easy code-reuse
- simpler debugging (of a single tool instead of the whole analysis)
- easy replacement of tools with "official ATLAS" tools
- easy implementation of new selection cuts / analysis methods
The framework is relatively lightweight and effort has been made to keep it as transparent as possible: the main algorithm (configurable through Athena jobOptions) is the class AnalysisDriver : public Algorithm, into which the necessary tools (: AlgTool, also configurable through Athena jobOptions) are plugged. These tools can be replaced by "official ATLAS" tools with a similar purpose. I think that an organisation of the analysis into approximately the proposed stages will also be used for the general ATLAS analysis, so official ATLAS tools with a similar purpose will become available. The plan is to drop the framework tools when official tools become available. The framework will then become just a placeholder into which the official tools are plugged, making the analysis acceptable as an official ATLAS analysis (as far as I understand). The proposed analysis code / procedure should work on any POOL format data (ESD, AOD and (at least primary and secondary) DPD). With AOD size reduction techniques, the analysis can proceed in Athena until the histograms are produced. Dumping some quantities into a ROOT Tree is also implemented, since it might be useful for short tests or (non-official) test analysis. Some properties of the proposed framework are inspired by EventView (in particular the tool-composed analysis in stages), but in contrast to EventView the tools inherit from AlgTool rather than from a dedicated BaseTool.
At present there is no generally accepted ATLAS analysis framework, and I expect that ATLAS code will develop in the form of AlgTool production rather than the production of a common framework. Therefore I think a lightweight, modular, Athena-based analysis framework (which should eventually become just a placeholder) is currently the best starting point for analysis.
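The idea of a driver into which interchangeable tools are plugged can be sketched in plain Python. This is only an illustration of the pattern under stated assumptions, not the framework's C++ code: the `execute(event)` interface, the tool classes `JetPtSelector` and `MinJetCount`, and the event being a dictionary are all hypothetical; only the name AnalysisDriver is taken from the description above.

```python
class JetPtSelector:
    """Hypothetical selection tool: keeps jets above a pT threshold (GeV)."""

    def __init__(self, min_pt=20.0):
        self.min_pt = min_pt

    def execute(self, event):
        event["jets"] = [j for j in event["jets"] if j["pt"] > self.min_pt]
        return True  # object selection never rejects the event by itself


class MinJetCount:
    """Hypothetical event-level cut: requires at least n_min selected jets."""

    def __init__(self, n_min=4):
        self.n_min = n_min

    def execute(self, event):
        return len(event["jets"]) >= self.n_min


class AnalysisDriver:
    """Runs the configured tools in order; stops at the first failing tool."""

    def __init__(self, tools):
        self.tools = list(tools)
        self.n_passed = 0

    def execute(self, event):
        for tool in self.tools:
            if not tool.execute(event):
                return False
        self.n_passed += 1
        return True


# Swapping a private tool for another one only changes this list.
driver = AnalysisDriver([JetPtSelector(min_pt=20.0), MinJetCount(n_min=4)])
```

Because the driver only relies on the common `execute` interface, replacing a tool means changing the configuration, not the driver, which is the property the framework relies on for swapping in official tools.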
The scheme of the framework is shown in the Figure below, while a more complete description, a link to the code and information on the framework performance evaluation are available here
ROOT and ROOT NTuple analysis
ROOT NTuple analysis cannot be used for the official ATLAS analysis. Because NTuple analysis is simple and fast and NTuples can be customized easily, it might be useful when you want to test the results of your large-scale analysis before running it on the grid, when DPD production scripts and UserData are not working, or for private studies. In any case the final analysis plots will be produced with ROOT, so a basic script for plotting histograms is provided.
A good resource of ROOT-related information is http://root.cern.ch/
Overview of other analysis options
ARA - Athena ROOT Access
ARA (in principle) enables POOL file analysis to be migrated from Athena to ROOT (pyROOT). When a POOL file is opened with ARA, a transient tree is created whose branches are named after the StoreGate keys of the POOL file's containers. TTree methods can then be used to get some quick plots.
As a proof of principle, you can try these 14.2.(pyROOT) instructions
For a more complete working ARA example I recommend the ARA section of the PAT 14.2.10 Tutorial
More information on ARA is available on the ATLAS wiki pages: https://twiki.cern.ch/twiki/bin/view/Atlas/AthenaROOTAccess
ARA has been working since spring 2008 and currently there is considerable interest in maintaining and developing it within the PAT group.
As an analysis option it is much closer to Athena analysis than the ROOT ntuple analysis, because ARA analysis can be done on POOL files and because it supports the use of ARA/Athena dual-use tools (ARA tools that can be used as valid Athena tools). Apart from reducing the complexity of analysis, I can't see much sense in migrating analysis from Athena to ROOT, so I don't plan to use ARA for analysis. ARA might be useful for doing quick test plots of the contents of (small) POOL files using TTree methods (and it has a nice visualisation utility called PED).
Python-based Athena analysis: PyAthena
PyAthena (by S. Binet) is a framework which enables Athena analysis to be done from within Python. It has been available since Athena r14 and I will try to test it soon. Until then please see the ATLAS PyAthena wiki page and please update the wiki if you have (or google) more useful information.
Using the Grid for analysis
Analysis that uses large data samples or would take a lot of time to run on a PC can be submitted to the grid. The local grid usually works faster and is more reliable than CERN's LCG, but the LCG has to be used when the required datasets are not accessible on the local grid.
Using the Local Grid
Sending Analysis Job to Grid from lxplus
Links to information I have found particularly useful are provided.
General information on ATLAS analysis:
contains a (mostly) up-to-date workbook on ATLAS simulation and analysis, Athena and Grid.
The wiki information on Athena is modest and not very useful at the time of writing (Sept. 08). I recommend:
- Gaudi User Guide
- Athena User Guide: the link on the ATLAS wiki page is not working (and has not been working for a while), so please use the copy at /afs/f9.ijs.si/home/liza/CERN/doc/Athena.pdf (found on some xy page on the web).
Both documents are somewhat old and lack information on the use of python scripting, joboptions and configurables, which can be found on ATLAS
Configuration Management Tool:
Useful packages and code examples (LXR links):
- Athena package and algorithm example
- Small AOD Analysis example
- Large(r)-scale AOD analysis examples
- LXR : for searching files by name/contents
- CVS : for comparing code-differences between releases
- 28 Oct 2008