Thanks to Abramo for pointing this out.
Support dmix on generic architectures that lack atomic operations by using a semaphore to serialize concurrent accesses. This is less efficient than atomic operations but should work on every system.
Split the arch-dependent dmix code into separate files.
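
For illustration only, the semaphore fallback can be sketched roughly as below. This is not the code from this change: the names (struct mix_area, mix_area_init, mix_s16_generic), the fixed buffer size and the choice of a POSIX unnamed semaphore are all assumptions made for the example. It shows a binary semaphore held around the read-modify-write on a shared sum buffer, which is the step an atomic add would otherwise cover.

```c
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MIX_FRAMES 256  /* hypothetical buffer size, for the sketch only */

/* Hypothetical shared mix state; not the real ALSA structures. */
struct mix_area {
    sem_t lock;                /* binary semaphore guarding sum[] */
    int32_t sum[MIX_FRAMES];   /* 32-bit accumulation buffer */
};

/* Set pshared to non-zero when the area lives in memory shared
 * between processes; returns 0 on success, -1 on error. */
int mix_area_init(struct mix_area *a, int pshared)
{
    assert(a != NULL);
    memset(a->sum, 0, sizeof(a->sum));
    return sem_init(&a->lock, pshared, 1);
}

/* Saturate a 32-bit sum to the signed 16-bit sample range. */
static int16_t clip16(int32_t v)
{
    if (v > INT16_MAX)
        return INT16_MAX;
    if (v < INT16_MIN)
        return INT16_MIN;
    return (int16_t)v;
}

/* Add n samples from src into the shared sums and write the clipped
 * result to dst. The whole update runs under the semaphore, so
 * concurrent writers cannot interleave their read-modify-write steps. */
int mix_s16_generic(struct mix_area *a, int16_t *dst,
                    const int16_t *src, size_t n)
{
    size_t i;

    if (n > MIX_FRAMES)
        return -1;

    /* Retry if the wait is interrupted by a signal. */
    while (sem_wait(&a->lock) == -1) {
        if (errno != EINTR)
            return -1;
    }

    for (i = 0; i < n; i++) {
        a->sum[i] += src[i];
        dst[i] = clip16(a->sum[i]);
    }

    sem_post(&a->lock);
    return 0;
}
```

Holding the lock for a whole block rather than per sample keeps the number of semaphore operations low, at the cost of making other writers wait longer, which is the trade-off against atomic operations noted above.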