Latency in StdEdxY2Maker
Updated on Thu, 2016-08-25 16:31. Originally created by genevb on 2016-08-25 15:00.
On December 24, 2015, all of our reconstruction chains saw a significant increase in CPU time per event, as can be seen from this plot of nightly tests for data spanning 2012-2015:
There were no documented infrastructure changes at that time that I know of.
There were commits to the following libraries relevant to BFC reconstruction on December 23, 2015:
StEvent
StDetectorDbMaker
StBichsel
StdEdxY2Maker
I looked at reconstruction times event-by-event comparing optimized SL16a as already compiled (SL16a was the first library to include these new codes), versus SL16a with the above libraries from CVS as of December 22, 2015 (I also needed to re-compile SL16a's StTpcDb to get proper symbols) and after multiple tests (to ensure reproducibility), I found that I could reproduce the slow down using the nightly test for Run 12 UU (I used 800 events). This is demonstrated in the following plots of ratios of event-by-event times.
The first plot shows two distributions: in black is just one of many reproducibility tests to show the variance of reproducibility (~6% fluctuations, but ~1% deviation for the mean), while red shows one of the SL16a tests divided by one of the back-modified SL16a tests, indicating a ~13% slow down.
The second plot shows the individual events from the red distribution as a function of log10(event reconstruction time) showing that there is some structure. The single red point is the first event, which is considerably longer than other events, but adds very little to the overall reconstruction time (it only accounts for 0.3% of the time for these 800 events).
I tried a few different modifications of SL16a using subsets of the aforementioned libraries and found that the slowing is specifically tied to StdEdxY2Maker. Here is the mean ratio of event-by-event reconstrcutrion times of SL16a modified with only the library (libraries) listed in the column header, divided by SL16a modified with all aforementioned libraries. I processed only 40 events, so the measurement is a little less precise here.
Congruous with these findings, my collection of tests consistently show (via statistics presented at the end of the log files) that ~6% of the entire reconstruction time is spent in StdEdxY2Maker when the older StdEdxY2Maker is used, and ~18% when the newer version is used.
The commits on that date to StdEdxY2Maker modified 6 files with the comment "Reshape histograms, add new dEdx method". I understand from Yuri that this is specifically with regard to the new dE/dx model he is studying, which he calls "dN/dx". At the moment, I know only of this for documentation. The commit also coincides with the insertion of the Run 15 dE/dx calibration into the offline database. Yuri has informed me that the Run 15 (and ongoing Run 16) dE/dx calibration which we are presently using for physics productions uses the traditional dE/dx method and is not dependent on the dN/dx codes. I also understand from Yuri that dN/dx is calculated during reconstruction, but is not stored in MuDsts as an alternative PID method at this time.
Jason has performed a simple You do not have access to view this node using the same Run 12 UU chain of 40 events that I have used, comparing SL15e (pre-commit) to the current DEV (post-commit). He finds that this code now spends over 80% of its time in StdEdxY2Maker::DoFitN(), and most of that time is spent calling TMinuit minimization. DoFitN() was introduced at the time of the commit and it seems likely that this is for the dN/dx studies.
Our recommendation at this time is for the author to provide a mechanism to turn off the calculation of dN/dx in productions which are not specifically relevant to dN/dx studies.
-Gene
There were no documented infrastructure changes at that time that I know of.
There were commits to the following libraries relevant to BFC reconstruction on December 23, 2015:
StEvent
StDetectorDbMaker
StBichsel
StdEdxY2Maker
I looked at reconstruction times event-by-event comparing optimized SL16a as already compiled (SL16a was the first library to include these new codes), versus SL16a with the above libraries from CVS as of December 22, 2015 (I also needed to re-compile SL16a's StTpcDb to get proper symbols) and after multiple tests (to ensure reproducibility), I found that I could reproduce the slow down using the nightly test for Run 12 UU (I used 800 events). This is demonstrated in the following plots of ratios of event-by-event times.
The first plot shows two distributions: in black is just one of many reproducibility tests to show the variance of reproducibility (~6% fluctuations, but ~1% deviation for the mean), while red shows one of the SL16a tests divided by one of the back-modified SL16a tests, indicating a ~13% slow down.
The second plot shows the individual events from the red distribution as a function of log10(event reconstruction time) showing that there is some structure. The single red point is the first event, which is considerably longer than other events, but adds very little to the overall reconstruction time (it only accounts for 0.3% of the time for these 800 events).
I tried a few different modifications of SL16a using subsets of the aforementioned libraries and found that the slowing is specifically tied to StdEdxY2Maker. Here is the mean ratio of event-by-event reconstrcutrion times of SL16a modified with only the library (libraries) listed in the column header, divided by SL16a modified with all aforementioned libraries. I processed only 40 events, so the measurement is a little less precise here.
StEvent | StDetectorDbMaker | StdEdxY2Maker+StBichsel | StdEdxY2Maker | |
---|---|---|---|---|
Ratio | 1.13 | 1.13 | 0.97 | 0.99 |
Congruous with these findings, my collection of tests consistently show (via statistics presented at the end of the log files) that ~6% of the entire reconstruction time is spent in StdEdxY2Maker when the older StdEdxY2Maker is used, and ~18% when the newer version is used.
The commits on that date to StdEdxY2Maker modified 6 files with the comment "Reshape histograms, add new dEdx method". I understand from Yuri that this is specifically with regard to the new dE/dx model he is studying, which he calls "dN/dx". At the moment, I know only of this for documentation. The commit also coincides with the insertion of the Run 15 dE/dx calibration into the offline database. Yuri has informed me that the Run 15 (and ongoing Run 16) dE/dx calibration which we are presently using for physics productions uses the traditional dE/dx method and is not dependent on the dN/dx codes. I also understand from Yuri that dN/dx is calculated during reconstruction, but is not stored in MuDsts as an alternative PID method at this time.
Jason has performed a simple You do not have access to view this node using the same Run 12 UU chain of 40 events that I have used, comparing SL15e (pre-commit) to the current DEV (post-commit). He finds that this code now spends over 80% of its time in StdEdxY2Maker::DoFitN(), and most of that time is spent calling TMinuit minimization. DoFitN() was introduced at the time of the commit and it seems likely that this is for the dN/dx studies.
Our recommendation at this time is for the author to provide a mechanism to turn off the calculation of dN/dx in productions which are not specifically relevant to dN/dx studies.
-Gene
»
- genevb's blog
- Login or register to post comments