One common strategy when looping problematic organ samples is to employ a cross-fade. This is an irreversible audio modification that gradually transitions the samples leading up to the end of a loop to equal the samples that were leading into the start. The goal is to completely eliminate any sort of impulsive glitch and hopefully also create a “good spectral match”. While the ability to eliminate glitches is possible, creating a good spectral match might not be so simple. We can think of an organ sample as being a linear combination of two signals:
- A voiced/correlated/predictable component (choose your word) which represents the tonal part of the sample. Strictly speaking, the tonal component is not entirely predictable - there are continuous subtle variations in the speech of a pipe… but they are typically small and we will assume predictability.
- An unvoiced/uncorrelated/unpredictable component which represents the pipe-noise of the sample.
Both of these components are necessary for realism in a sample.
The following image shows a cross-fade of two entirely correlated signals. The top picture contains the input signals and the cross-fade transition that will be used, the bottom contains the input signals with the cross-faded signal overlaid on top. There is nothing interesting about this picture: the two input signals were the same. The cross-fade transitions between two identical signals i.e. the output signal is equal to both of the input signals.
The next image shows a cross-fade of the same correlated signal with some uniform white noise overlaid on top. What is going on in the middle of output signal? It looks like it’s been attenuated a little bit and doesn’t appear to have a uniform distribution anymore.
This final image is a cross-fade purely consisting of two uniform random signals to add some more detail.
It turns out that summing two uniformly distributed random variables yields a new random variable with a triangular distribution (this is how noise for triangular dither gets generated). During the cross-fade, the distribution of the uncorrelated components of the signal is actually changed. Not only that, but the power of the signal is reduced in the middle of the transition. Here is an audio sample of two white noise streams being cross-faded over 2 seconds:
Listen for the level drop in the middle.
If the cross-fade is too long when looping a sample, it could make the pipe wind-noise duck over the duration of the cross-fade. While the effect is subtle, I have heard it in some samples and it’s not particularly natural. I suppose the TLDR advice I can give is:
- If you can avoid cross-fading - avoid it.
- If you cannot avoid cross-fading - make the cross-fade as short as possible (milliseconds) to avoid an obvious level-drop of noise during the transition.