Collapsing Shepard's Relative Pitch Circularity:
A Pitch Invariant Sound
David Van Brink
Apple Computer
1995
Background: Shepard's Tones
I was recently meditating on the subject of the paradoxical "musical staircase" effect,
also known as "Shepard's Tones". This is a well-known aural illusion in which a scale,
or a continuously rising tone, appears to ascend indefinitely. Its explanation is
straightforward: start with a frequency F, and construct a tone containing harmonics
at 2F, 4F, 8F, ..., and also, F/2, F/4, F/8, ... . Small transpositions of this
tone are perceived as a new note, but a shift of one octave yields a tone with all
the same harmonics as the original tone. In practice, only a small number of harmonics
within the range of human hearing are needed, if their intensities are adjusted appropriately
to "fade in" at the low end, and "fade out" at the high end. [Shepard 1964]
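The construction above can be sketched in a few lines of Python. This is only an illustration: the function name shepard_components, the 20 Hz to 20 kHz band limits, and the raised-cosine fade over log frequency are assumptions made here, not details taken from the text.

```python
import math

def shepard_components(base_freq, lo=20.0, hi=20000.0):
    """Return (frequency, amplitude) pairs for an octave-spaced tone.

    Components sit at base_freq * 2**k for every integer k that lands
    inside [lo, hi]. The amplitude envelope is an assumed raised cosine
    over log frequency: zero at both band edges, one in the middle.
    """
    k = math.ceil(math.log2(lo / base_freq))  # lowest octave inside the band
    components = []
    while base_freq * 2.0 ** k <= hi:
        freq = base_freq * 2.0 ** k
        # Position of this component across the band, 0..1 on a log scale.
        x = math.log(freq / lo) / math.log(hi / lo)
        amp = 0.5 * (1.0 - math.cos(2.0 * math.pi * x))
        components.append((freq, amp))
        k += 1
    return components

tone = shepard_components(440.0)
octave_up = shepard_components(880.0)
```

Because the amplitude depends only on where a component falls in the band, tone and octave_up contain the same components, while a shift of one semitone produces a different set.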
The tone described above is invariant to shifting up or down by a full octave. Similar
tones can be constructed which are invariant over other transpositions, sometimes
with interesting effects. If, for example, the harmonics are placed at intervals
just greater than an octave, then the perceived pitch of the tone decreases when the tone
is shifted up by one octave. [Schroeder 1986, Risset 1986] Similarly, a tone which
sounds the same when shifted up by three semitones would contain harmonics at
2^(1/4)F, 2^(2/4)F, 2^(3/4)F, ... , and F/2^(1/4), F/2^(2/4), F/2^(3/4), ... .
By placing the harmonics at appropriate intervals, we can generate a tone which "sounds
the same" for any desired pitch shift.
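As a rough check of the three-semitone case, the following Python sketch lists the components spaced by a ratio of 2^(1/4) and compares the list before and after a shift of three semitones. The helper names components and same_set and the 20 Hz to 20 kHz window are assumptions chosen for this illustration.

```python
import math

def components(base_freq, ratio, lo=20.0, hi=20000.0):
    """Frequencies base_freq * ratio**k that fall inside [lo, hi]."""
    k_min = math.ceil(math.log(lo / base_freq) / math.log(ratio))
    k_max = math.floor(math.log(hi / base_freq) / math.log(ratio))
    return [base_freq * ratio ** k for k in range(k_min, k_max + 1)]

def same_set(a, b, rel_tol=1e-9):
    """True when both ascending lists hold the same frequencies, up to rounding."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(a, b)
    )

ratio = 2.0 ** (1.0 / 4.0)            # component spacing: three semitones
three_semitones = 2.0 ** (3.0 / 12.0)
one_semitone = 2.0 ** (1.0 / 12.0)

original = components(440.0, ratio)
shifted_3 = components(440.0 * three_semitones, ratio)
shifted_1 = components(440.0 * one_semitone, ratio)
```

Here same_set(original, shifted_3) holds, while same_set(original, shifted_1) does not, which matches the claim that the spacing of the components determines which shifts leave the tone unchanged.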
A Pitch Invariant Sound
The question occurred to me, "are there sounds which sound the same when pitch-shifted
by any amount?" Two trivial solutions are immediately obvious. The first, which can be
derived by extension from the above explanation, is the sound containing all harmonics,
or white noise. The second is the sound which contains no harmonics: silence.
Are there any other such sounds?
We proceed with a simple equation. Let f(t) represent a signal over time, so that
f(pt) is that same signal played at rate p. We want a signal where
f(t) = f(pt).
The only solution that holds for every rate p is where f(t) is a constant,
f(t) = f(pt) = d.
This signal sounds like silence. However, our ears are insensitive to DC offsets,
or the "zeroth harmonic". So we may look for a function which satisfies
f(t) = f(pt) + c.
This problem has a simple solution, namely, the logarithm function.
log t = log pt + c, where c = -log p.
The shape of the log function is invariant under rate change; stretching its graph
by any amount horizontally merely shifts the function vertically. While this is a
solution, and indeed produces a tone which "sounds the same at any pitch", it is
not a very interesting one. First of all, the "sound" that a log function makes is mostly silence,
as it is monotonically increasing. Second, the range of this function is -∞ to +∞,
making it inconvenient to download into a keyboard sample player.
Both of these issues can be addressed by taking the sine of the logarithm function.
Consider
f(t) = sin log t.
Inasmuch as a signal may be said to have a frequency at a particular time, the "instantaneous
pitch" of any function of the form f(t) = sin g(t), at time t, is the derivative
of g(t) divided by the period of the sine function, 2π, giving F(t) = g'(t)/(2π).
Letting F(t) stand for this interpretation of the instantaneous pitch of f(t), we have that
f(t) = sin log t
F(t) = (d/dt log t) / (2π) = 1/(2πt).
The function shifted by pitch p becomes
f(pt) = sin (log pt) = sin (log t + log p), which is an inaudible phase shift,
F(pt) = (d/dt (log t + log p)) / (2π) = 1/(2πt).
That is, the instantaneous pitch of sin log t, at time t, is the same as the instantaneous
pitch of sin log pt! This is a sound which "sounds the same" played at any pitch.
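The invariance can also be checked numerically. The sketch below estimates the instantaneous pitch of sin(log(pt)) with a central difference of the phase, for several rates p. The function names instantaneous_pitch and phase_at_rate and the step size h are choices made for this illustration.

```python
import math

def instantaneous_pitch(phase, t, h=1e-6):
    """Central-difference estimate of phase'(t) / (2*pi).

    With t in seconds the result is in Hz. The step h is an assumed
    value that is small compared with the times sampled below.
    """
    return (phase(t + h) - phase(t - h)) / (2.0 * h) / (2.0 * math.pi)

def phase_at_rate(p):
    """Phase g(t) = log(p * t) of the signal sin(log(p * t))."""
    return lambda t: math.log(p * t)

t = 0.01  # seconds
pitches = [instantaneous_pitch(phase_at_rate(p), t) for p in (0.5, 1.0, 3.0)]
expected = 1.0 / (2.0 * math.pi * t)
```

Each entry of pitches agrees with expected to within the accuracy of the finite difference, regardless of p.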
As can be seen from F(t) above, the signal is a rapidly falling tone, beginning at
infinite pitch, and rapidly descending. It is illustrated in figure 1.

For practical purposes, we will want to scale the function appropriately for a human
listener. The function as given, if we take t in seconds, will have fallen from 20 kHz
at time t = 8e-6 to 20 Hz at time t = 8e-3. The sound will have passed through the entire
range of human hearing in less than a hundredth of a second. Again, this is uninteresting
when downloaded to a sampling keyboard. Scaling the function slightly, so that the
frequency at time t = 1 second is 100 Hz, gives
f(t) = sin (200π log t),
F(t) = 100/t.
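To see what the scaling buys, it helps to write both versions as sin(2π k log t), so that F(t) = k/t and the time at which a given pitch is reached is t = k/F. The short Python sketch below does this arithmetic; the names time_at_pitch, UNSCALED_K and SCALED_K are introduced here for illustration only.

```python
import math

def time_at_pitch(freq_hz, k):
    """Time t in seconds at which sin(2*pi*k*log t) has pitch freq_hz.

    Follows from F(t) = k / t, hence t = k / freq_hz.
    """
    return k / freq_hz

UNSCALED_K = 1.0 / (2.0 * math.pi)  # sin(log t), where F(t) = 1/(2*pi*t)
SCALED_K = 100.0                    # sin(200*pi*log t), where F(t) = 100/t

unscaled_sweep = (time_at_pitch(20000.0, UNSCALED_K), time_at_pitch(20.0, UNSCALED_K))
scaled_sweep = (time_at_pitch(20000.0, SCALED_K), time_at_pitch(20.0, SCALED_K))
```

With the scaled function the fall from 20 kHz to 20 Hz runs from t = 0.005 seconds to t = 5 seconds, rather than finishing within the first hundredth of a second.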
One final equation shall be presented. Suppose we wish to construct this waveform
from discrete samples, with a sampling rate of R. Let us call the samples S1, S2, ... ,
where Sn = f(n/R), giving
Sn = sin (200π log (n/R)) = sin (200π (log n - log R)).
The log R term provides only an inaudible phase shift of the sine function, and may
be omitted without altering the perceived sound, leaving
Sn = sin (200π log n) for any sample rate R.
The sample rate does not matter; this sound is safe at any speed.
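A minimal Python sketch of the sample formula follows, using only the standard library. The function names, the 16-bit mono WAV format, the 44100 Hz default rate and the output filename in the usage note are all assumptions made for this illustration.

```python
import math
import struct
import wave

def pitch_invariant_samples(count, k=100.0):
    """Samples S_n = sin(2*pi*k*log n) for n = 1..count.

    k = 100 gives sin(200*pi*log n), the formula in the text.
    """
    return [math.sin(2.0 * math.pi * k * math.log(n)) for n in range(1, count + 1)]

def write_wav(path, samples, sample_rate=44100):
    """Write floats in [-1, 1] as 16-bit mono PCM (an assumed format)."""
    frames = b"".join(struct.pack("<h", int(round(s * 32767))) for s in samples)
    with wave.open(path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)
        out.setframerate(sample_rate)
        out.writeframes(frames)
```

For example, write_wav("pis.wav", pitch_invariant_samples(5 * 44100)) would produce five seconds of the sound at the assumed rate. One caveat when listening: in units of cycles per sample the pitch is 100/n, which exceeds one half for n below 200, so the first couple of hundred samples lie above the Nyquist limit and will alias.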
Bibliography
Shepard, Roger. "Circularity in Judgments of Relative Pitch". Journal of the Acoustical
Society of America V36n12 (December 1964) pp 2346-2353.
Schroeder, Manfred R. "Auditory Paradox based on fractal waveform". Journal of the
Acoustical Society of America V79n1 (January 1986) pp 186-189.
Risset, Jean-Claude. "Pitch and rhythm paradoxes: Comments on 'Auditory paradox based
on fractal waveform'". Journal of the Acoustical Society of America V80n3 (September
1986) pp 961-962.

4/14/96.18:59 - 7/19/96.16:53