This adds volume scaling for 1- and 2-channel software volume scaling
using Orc. While testing the MMX and SSE backends on a Core2, I see an
~2x performance benefit over the hand-rolled MMX and SSE code. Since I
haven't been able to test on other architectures, the Orc code is only
used when MMX/SSE* is present. This can be changed in the future after
testing on AMD and ARM machines.