Data - Monte Carlo Comparison, Take 2
I re-examined the data-Pythia comparison in my analysis after some good questions were raised during my preliminary results presentation at today's Spin PWG meeting. In particular, there was some concern over occasionally erratic error bars in the simulation histograms, and also some questions about the shape of the z-vertex distributions.
Well, the error bars were a pretty easy thing to nail down once I plotted my simulation histograms separately for each partonic pt sample. You can browse all of the plots at
http://deltag5.lns.mit.edu/~kocolosk/datamc/samples/
If you look in particular at the lower partonic pt samples you'll see that I have very few counts from simulation in the triggered histograms. This makes perfect sense, of course: any jets that satisfy the jet-patch triggers at these low energies must have a high neutral energy content. Unfortunately, the addition of one or two particles in a bin from these samples can end up dominating that bin because they are weighted so heavily. I checked that this was in fact the problem by requiring at least 10 particles in a bin before combining it with bins from the other partonic samples. Incidentally, I was already applying this cut to the trigger bias histograms, so nothing changes there. The results are at
http://deltag5.lns.mit.edu/~kocolosk/datamc/cut_10/
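In case it helps to see exactly what I mean by the cut, here's a minimal sketch (in Python, with made-up names and numbers, not the actual analysis code) of combining the weighted histograms from the partonic pt samples while requiring at least 10 raw particles in a bin before that sample's bin is allowed to contribute:

```python
import numpy as np

MIN_COUNTS = 10  # require at least this many raw (unweighted) particles in a bin

def combine_samples(samples, nbins):
    """Combine weighted histograms from several partonic pt samples,
    skipping any bin where a given sample has too few raw entries.

    `samples` is a list of dicts with:
      'raw'    -- unweighted particle counts per bin (length nbins)
      'weight' -- the event weight applied to this partonic pt sample
    (the structure is illustrative; the real analysis uses ROOT histograms)
    """
    contents = np.zeros(nbins)
    err2 = np.zeros(nbins)
    for s in samples:
        raw = np.asarray(s['raw'], dtype=float)
        w = s['weight']
        keep = raw >= MIN_COUNTS            # the >= 10 particle cut
        contents[keep] += w * raw[keep]     # weighted bin contents
        err2[keep] += (w ** 2) * raw[keep]  # Poisson errors scaled by the weight
    return contents, np.sqrt(err2)

# toy example: a low partonic pt sample with a huge weight and sparse bins
low_pt  = {'raw': [0, 1, 2, 15, 30],    'weight': 500.0}
high_pt = {'raw': [40, 55, 60, 48, 20], 'weight': 0.02}
contents, errors = combine_samples([low_pt, high_pt], nbins=5)
```

Without the cut, that single weighted particle in the second bin of the low pt sample would dominate the combined bin and blow up its error bar; with the cut it simply drops out.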
If I compare, e.g., the z-vertex distribution for pi+ from my presentation (on the left) with the same distribution after requiring at least 10 particles per bin (on the right), it's clear that things settle down nicely after the cut:
(The one point on the "after" plot that is still way too high can be fixed if I raise the cut to 15 particles, but in general the plots look worse with the cut that high.) Unfortunately, it's also clear that the dip in the middle of the z-vertex distribution is still there. At this point I think it's instructive to dig up a few of the individual distributions from the partonic pt samples. Here are eta (left) and v_z (right) plots for pi+ from the 7_9, 15_25, and above_35 samples (normalized to 2M events from data instead of the full 25M sample because I didn't feel like waiting around):
7_9:
15_25:
above_35:
You can see that as the events get harder and harder there's actually a bias towards events with *positive* z vertices. At the same time, the pseudorapidity distributions of the harder samples are more nearly uniform around zero. I guess what's happening is that the jets from these hard events are emitted perpendicular to the beam line, and so in order for them to hit the region of the BEMC included in the trigger the vertices are biased to the west.
So, that's all well and good, but we still have the case that the combined vertex distribution from simulation does not match the data. The implication from this mismatch is that the event-weighting procedure is a little bit off; maybe the hard samples are weighted a little too heavily? I tried tossing out the 45_55 and 55_65 samples, but it didn't improve matters appreciably. I'm open to suggestions, but at the same time I'm not cutting on the offline vertex position, so this comparison isn't quite as important as some of the other ones.
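To be concrete about what I mean by the event-weighting procedure: each partonic pt sample gets a weight proportional to its cross section divided by the number of generated events, and the combined distributions are just the weighted sum over samples. A toy version (the cross sections and event counts below are invented, and dropping samples by name is purely illustrative) shows where an overweighted hard sample could pull the combined v_z shape:

```python
import numpy as np

# toy per-sample weights: sigma_i / N_generated,i
# (numbers are made up, not the real Pythia cross sections or event counts)
samples = {
    '7_9':      {'sigma_mb': 1.5e0,  'n_events': 2.0e6},
    '15_25':    {'sigma_mb': 2.0e-2, 'n_events': 2.0e6},
    'above_35': {'sigma_mb': 5.0e-5, 'n_events': 1.0e6},
}

def sample_weight(s):
    # weight applied to every event (and every histogram entry) from a sample
    return s['sigma_mb'] / s['n_events']

def combined_vz(vz_hists, drop=()):
    """Weighted sum of per-sample v_z histograms, optionally dropping
    the hardest samples (e.g. drop=('45_55', '55_65'))."""
    total = None
    for name, hist in vz_hists.items():
        if name in drop:
            continue
        w = sample_weight(samples[name])
        total = w * hist if total is None else total + w * hist
    return total

# toy v_z histograms, one per sample
rng = np.random.default_rng(0)
vz_hists = {name: np.histogram(rng.normal(0.0, 60.0, 5000),
                               bins=20, range=(-150, 150))[0]
            for name in samples}
nominal = combined_vz(vz_hists)
no_hard = combined_vz(vz_hists, drop=('above_35',))
```

This is only meant to illustrate the kind of reweighting / sample-dropping check described above, not to reproduce the actual numbers.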
One other thing: while I've got your attention, I may as well post the agreement for MB triggers, since I didn't bother to show it in the PPT today. Here are pt, eta, phi, and vz distributions for pi+. The pt distribution in simulation is too hard, but that's something I've shown before:
Conclusions:
New data-MC comparisons requiring at least 10 particles per bin in each simulation sample result in improved error bars and less jittery simulation distributions. The event vertex distribution in simulation still does not match the data, and a review of the event vertex distributions from the individual partonic pt samples suggests that perhaps the hard samples are weighted a bit too heavily.