> Thank you for pointing out the  --trace-children option, Nicholas!
> Then, like gamma, it too becomes really slow. Is there any way to improve
> the speed?

Do less? :-)  If you're running a benchmark, you can try to reduce its
size.  I don't use Valgrind, but I do use Pin with HotSpot, and it's
manageable if you keep the code size down.
