Introduction to STAF
Offline computing tutorial | Maintained by Tull and Wenaus |
Most of the contents are taken from Craig's STAF pages. Detailed information on STAF components can be found there: ASPs, PAMs, the software bus, the user interface, and the STAR IDL compiler (stic). A glossary is also available there.
During normal operation, the STAR collaboration expects a data acquisition rate of approximately 250 TBytes of raw data per year for the ten-year operating lifetime of RHIC. Factoring in Data Summary Tapes (DSTs) derived from the raw data and anticipating comparable statistics from Monte Carlo simulations yields a total data volume of approximately 1 PByte per year that STAR must manage, process, and analyze.
Due to its size and geographical distribution, its data volume, and its long lifetime, STAR faces many of the same software and computing challenges that the next generation of high-energy physics experiments will face at facilities such as the LHC at CERN, starting in 2004.
The STAR Analysis Framework (STAF) is a major part of the software solution that has been, and is still being, developed by the Software Infrastructure (SOFI) group within the STAR collaboration.
STAF is a highly modular framework written (largely) in C++ and designed around a CORBA-compliant software bus package. STAF provides a CORBA-compliant encapsulation of data analysis algorithms written in FORTRAN, C, and C++, which allows the seamless integration of physics software components and system-like software components, controlled at run time by a high-level scripting language and/or by Graphical User Interfaces.
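To make the encapsulation idea concrete, the following C++ fragment is a purely illustrative sketch: the routine name trackfit_, its argument list, and the wrapper class are invented here, and in STAF the real interface code is generated by the STAR IDL compiler (stic) rather than written by hand. It shows how a FORTRAN physics routine might be hidden behind a C++ call interface so that the rest of the system never sees the FORTRAN calling convention.

    #include <iostream>

    // Hypothetical FORTRAN PAM entry point (FORTRAN passes arguments by
    // reference; compilers typically append an underscore to the symbol).
    extern "C" void trackfit_(int* ntracks, float* hits, int* status);

    // Hypothetical C++ encapsulation of the module. The framework sees only
    // this interface; the FORTRAN calling convention stays hidden behind it.
    class TrackFitModule {
    public:
        int invoke(int ntracks, float* hits) {
            int status = 0;
            trackfit_(&ntracks, hits, &status);   // delegate to the FORTRAN code
            return status;
        }
    };

    // Dummy stand-in so this sketch is self-contained; in STAF the real
    // routine would live in its own FORTRAN source file.
    extern "C" void trackfit_(int* ntracks, float* hits, int* status) {
        std::cout << "trackfit called on " << *ntracks << " tracks" << std::endl;
        (void)hits;
        *status = 0;
    }

    int main() {
        float hits[3] = {1.0f, 2.0f, 3.0f};
        TrackFitModule module;
        return module.invoke(3, hits);
    }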
The first production release of STAF to the STAR collaboration occurred in June of 1996. Software developers and physicists within STAR began immediately to convert their code to STAF Physics Analysis Modules (PAMs) and to run simulation and analysis tasks with the new system.
As is typical for this early stage in the lifecycle of a large physics collaboration, almost all of the existing analysis software is undergoing intense development. Although much of the reconstruction software will approach an asymptote of finality early in the production stage of analysis (i.e. when a steady state is reached between data acquisition and data analysis rates), much of the physics software will continue to change significantly over the entire duration of the experiment and data analysis. Over this same period, the physicists in charge of various aspects of the physics analysis software will also change continually as graduate students, post-doctoral researchers, and senior scientists join and leave the collaboration or shift their analysis and physics focus.
This fluid nature of analysis software is fundamental to a dynamic physics analysis process. However, without careful planning, maintaining a complex analysis system over a prolonged period can require a great deal of programming overhead.
Much of the software needed in the analysis of experimental data is not strictly related to the detectors used or to the physics being investigated. Hence, though crucial to a successful analysis of the data, this software infrastructure has often been relegated to a much lower priority than the physics algorithms comprising the data analysis. Though the primacy of the physics software in a physics analysis is indisputable, this software infrastructure gains in importance as an experimental collaboration grows larger and lasts longer.
In STAR, the SOFtware Infrastructure (SOFI) group is charged with developing and maintaining general-purpose tools and software for use by physicists in the analysis of data and the investigation of theory and results. STAF is a major part of the software solution developed by the SOFI group for STAR.
In a collaboration with many writers and users of analysis code, some mechanisms for co-ordination of and communication between analysis software elements must exist to facilitate an efficient overall analysis process.
One way of doing this is to write each analysis element as a stand-alone program executing one step in the analysis process and communicating with other analysis elements (other programs) through some well-defined file formats. The chain of analysis is then normally a batch job which executes the programs in order, cleaning up temporary files and saving results to permanent files.
Though this approach has been used in many experiments, it works best when a very limited number of people are responsible for the entire analysis. This allows changes to data formats or data flow to be made with the reasonable expectation that all appropriate programs will be updated in a synchronised fashion. One drawback of this approach is that it lends itself to duplication of effort and/or code: each analysis program must handle its own I/O, memory management, etc. This drawback can be mitigated with a set of common utility functions in a central library, but the approach still does not lend itself easily to global changes in data format or in the analysis chain.
Another approach which has been successful is the use of an analysis shell. An analysis shell is a generic program which handles the system-like functions of data analysis such as data I/O, memory management, flow control, etc. without doing any real analysis of data. The actual data analysis is done by "analysis modules" conforming to some API (Application Programming Interface) which allows the modules to be plugged into the analysis shell in a modular fashion. The analysis shell invokes each analysis module within the analysis chain, either passing data to the module via the API, or presenting data when requested by shell functions invoked within the module. Often the analysis shell also contains other general purpose tools for investigation and analysis of data (e.g. histogramming, plotting, sorting, simple calculations).
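As a rough sketch of this plug-in pattern (the class names and the processEvent signature below are invented for illustration and are not the actual STAF API), an analysis shell might drive conforming modules over an event stream like this:

    #include <iostream>
    #include <string>
    #include <vector>

    // Hypothetical event record handed to each module by the shell.
    struct Event {
        int    id;
        double energy;
    };

    // Hypothetical module API: every analysis module conforms to this
    // interface so it can be plugged into the shell, which knows no physics.
    class AnalysisModule {
    public:
        virtual ~AnalysisModule() {}
        virtual std::string name() const = 0;
        virtual void processEvent(Event& evt) = 0;   // invoked once per event
    };

    // Example user module: plain physics code behind the common API.
    class EnergyCut : public AnalysisModule {
    public:
        std::string name() const { return "EnergyCut"; }
        void processEvent(Event& evt) {
            if (evt.energy < 1.0) evt.id = -1;       // flag low-energy events
        }
    };

    // The "analysis shell": owns I/O, memory, and flow control, and simply
    // drives each registered module over the event stream.
    class AnalysisShell {
        std::vector<AnalysisModule*> chain_;
    public:
        void addModule(AnalysisModule* m) { chain_.push_back(m); }
        void run(std::vector<Event>& events) {
            for (size_t i = 0; i < events.size(); ++i)
                for (size_t j = 0; j < chain_.size(); ++j)
                    chain_[j]->processEvent(events[i]);
        }
    };

    int main() {
        std::vector<Event> events(2);
        events[0].id = 1; events[0].energy = 0.5;
        events[1].id = 2; events[1].energy = 2.0;

        EnergyCut cut;
        AnalysisShell shell;
        shell.addModule(&cut);
        shell.run(events);

        std::cout << events[0].id << " " << events[1].id << std::endl;  // prints "-1 2"
        return 0;
    }

The point of the pattern is that the shell and the modules only ever meet at the narrow API, so either side can evolve independently.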
We have extended this concept of plug-in modules from the user-written analysis code to the central, system-like analysis shell. By dividing the functions of the analysis shell into "domains" and adopting an interface standard for these system-like functions, we provide a customisable framework for data analysis.
STAR begins taking data in late 1999, and will continue to take data for at least ten years. This means that short of completely replacing the analysis system with another system part way through the experiment, the software design must have a lifespan of ~15 years. This is an incredibly long time in the dynamic world of computer software and hardware. To put this into perspective, one need only consider the state of computing in physics 15 years ago. In 1982, the hot new machine in physics was the VAX 780, FORTRAN 77 was a newer language than FORTRAN 90 is today, and your choice of color terminals was green or amber.
We conclude that it is unrealistic to expect any software system written today to survive unmodified for 15 years. Hence, any sensible design for a software system that must last that long should incorporate, at a fundamental level, the concept of graceful retirement (i.e. replacement in a controllable manner) of any and all of its constituent components.
Consider further that STAR encompasses many distinct physics programs, each with its own unique analysis needs, and that scores, perhaps hundreds, of physicists of diverse backgrounds and skills will use and contribute to the analysis system over its lifetime, and the magnitude of the challenge begins to be appreciated. Our approach to the STAR Analysis Framework addresses each of these challenges in a manner which we believe can succeed over the long term as well as the short.
By making the division between the framework kernel (i.e. the software bus) and the plug-in service packages (see below) clean and well defined (what we have termed vertical modularity), we provide for the graceful retirement of communication protocols and interface standards. By dividing the system-like services into autonomous packages (horizontal modularity), we allow graceful retirement of code libraries, as well as easing the burden of code maintenance.
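A minimal sketch of these two kinds of modularity follows (again with invented names, not the actual STAF interfaces): the bus kernel sees only an abstract service-package interface, so an individual package can be retired and replaced without disturbing the kernel or the user's analysis code.

    #include <map>
    #include <string>

    // Hypothetical abstract interface that every plug-in service package
    // (I/O, memory management, histogramming, ...) presents to the bus kernel.
    class ServicePackage {
    public:
        virtual ~ServicePackage() {}
        virtual std::string domain() const = 0;   // e.g. "io", "memory", "hist"
    };

    // The bus kernel knows only the abstract interface (vertical modularity);
    // individual packages can be installed or retired independently
    // (horizontal modularity) without touching user analysis code.
    class SoftwareBus {
        std::map<std::string, ServicePackage*> packages_;
    public:
        void install(ServicePackage* p)        { packages_[p->domain()] = p; }
        void retire(const std::string& domain) { packages_.erase(domain); }
        ServicePackage* find(const std::string& domain) {
            std::map<std::string, ServicePackage*>::iterator it = packages_.find(domain);
            return (it == packages_.end()) ? 0 : it->second;
        }
    };

    // Two interchangeable I/O packages: retiring one and installing the other
    // replaces the implementation behind the "io" domain with no other change.
    class FileIoV1 : public ServicePackage {
    public:
        std::string domain() const { return "io"; }
    };
    class FileIoV2 : public ServicePackage {
    public:
        std::string domain() const { return "io"; }
    };

    int main() {
        SoftwareBus bus;
        FileIoV1 oldIo;
        FileIoV2 newIo;

        bus.install(&oldIo);      // original package in service
        bus.retire("io");         // graceful retirement of the old package
        bus.install(&newIo);      // replacement plugs into the same domain

        return bus.find("io") == &newIo ? 0 : 1;
    }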
Finally, rather than try to define all of our own interface standards, external data representations, etc., we have tried wherever possible to adopt open software standards from the computer industry and computer engineering communities. These standards are often better designed and supported than home-grown standards due to the hundreds or even thousands of man-hours devoted to their development. They are generally well documented, providing guidance for programmers attempting to collaborate over long distances. Some of these standards allow use of powerful commercial software. And even standards which don't survive over the long term often provide migration paths and/or tools to other, functionally equivalent standards.
All this means that there are many advantages to adopting a framework architecture, both from the users' perspective and from the perspective of the framework programmer.