8.5.2 Harmonic balance for a simplified reed instrument

The simplified model of a clarinet introduced in section 8.5 is very crude, but we can learn some useful things from it by applying the method of harmonic balance, introduced in section 8.2.2. We don’t even need to assume that the instrument is specifically a clarinet to begin with: the argument we are about to give applies equally well to any kind of reed instrument which has pressure $p(t)$ and volume flow $v(t)$ connected in two ways: via the input impedance $Z(\omega)$ of the instrument tube, and via a nonlinear function $v(p)$ with the general shape shown in Fig. 1. This plot is slightly different from the version shown in section 8.5: it has been shifted sideways by the pressure $p_0$ inside the player’s mouth, so that the curve crosses the horizontal axis at $p=p_0$ (which has the particular numerical value 2.5 in the plot). Note that both $p(t)$ and $p_0$ are pressures relative to ambient atmospheric pressure.

Figure 1. Nonlinear function relating the pressure $p(t)$ to the volume flow rate $v(t)$ into the mouthpiece

The question of interest here is whether the model “instrument” is capable of sustaining a steady, periodic note at small amplitude. If it can, we expect the activity to be confined to a small region of the curve in Fig. 1, close to the position where it crosses the vertical axis. The curve is smooth in that region, so we can express it as a power series

$$v(p) = \sum_{n=0}^\infty{a_n p^n} \tag{1}$$

in which we will assume that the coefficients $a_n$ decrease rapidly as $n$ increases. Since both $p(t)$ and $v(t)$ are periodic, we can express them both in terms of Fourier series expansions:

$$p(t) = \sum_{j=-\infty}^\infty{P_j e^{ij \Omega t}} , \qquad v(t) = \sum_{j=-\infty}^\infty{V_j e^{ij \Omega t}} \tag{2}$$

where $\Omega$ is the (as yet unknown) fundamental frequency of the note. For our small-amplitude oscillations we will assume that the coefficients $P_j$ and $V_j$ become negligible when $j$ is not small, so that we can approximate the behaviour by keeping just a few low-order terms. The complex form of the Fourier series has been adopted here, because it makes the manipulations we are about to require very much easier. Because both $p(t)$ and $v(t)$ are real functions, the Fourier coefficients with negative indices must be simply the complex conjugates of the corresponding positive terms:

$$P_{-j} = P_j^*, \qquad V_{-j} = V_j^* . \tag{3}$$
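The conjugate symmetry in eq. (3) is easy to confirm numerically. The following minimal sketch uses a made-up real waveform (the amplitudes 0.8 and 0.3, the phase 0.5 and the sample count `N` are illustrative assumptions, not values from the model) and estimates the coefficients $P_j$ by direct summation over one period.

```python
import cmath
import math

N = 64  # samples per period (illustrative choice)

def waveform(n):
    """Hypothetical real periodic test signal, sampled at phase theta = 2*pi*n/N."""
    theta = 2 * math.pi * n / N
    return 0.8 * math.cos(theta) + 0.3 * math.sin(2 * theta + 0.5)

def fourier_coeff(harm):
    """Discrete estimate of P_harm in eq. (2): average of p(t) * exp(-i * harm * Omega * t)."""
    total = sum(
        waveform(n) * cmath.exp(-1j * harm * 2 * math.pi * n / N) for n in range(N)
    )
    return total / N

# Coefficients for harmonics -3..3
P = {harm: fourier_coeff(harm) for harm in range(-3, 4)}

# Eq. (3): negative-index coefficients are conjugates of the positive ones
conjugate_ok = all(abs(P[-h] - P[h].conjugate()) < 1e-12 for h in range(1, 4))
print("conjugate symmetry holds:", conjugate_ok)
print("|P_1| =", abs(P[1]), " |P_2| =", abs(P[2]))
```

For this test signal the cosine term of amplitude 0.8 splits equally between $P_1$ and $P_{-1}$, which is why the complex coefficients are half the size of the corresponding real amplitudes.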

We also know that $P_0 \approx 0$ because there can’t be a steady pressure difference between the inside and the outside of the instrument tube: it is open to the atmosphere at the bell end.

What we now need to do is substitute the Fourier series for $p$ into the power series (1), then collect up the terms corresponding to different harmonics of the frequency $\Omega$. The calculation is straightforward but messy: for example,

$$p^2=\sum_{j=-\infty}^\infty{\sum_{k=-\infty}^\infty{P_j P_k e^{i(j+k) \Omega t}}}$$

$$= P_1 P_1^* + P_2 P_2^* +\cdots$$

$$+ 2 e^{i \Omega t} \left( P_2 P_1^* +P_3 P_2^* + \cdots \right)$$

$$+ 2 e^{2i \Omega t} \left( P_1^2/2 + P_3 P_1^* + P_4 P_2^* + \cdots \right)$$

$$+ 2 e^{3i \Omega t} \left( P_1 P_2 + P_4 P_1^* +P_5 P_2^* + \cdots \right)$$

$$+ \mathrm{~higher~terms} + \mathrm{C.C.} \tag{4}$$

where “$+~$higher terms” means “plus all the terms involving higher harmonics”, and “$+~$C.C.” means “plus the complex conjugate of the whole thing” because everything is repeated with terms involving $e^{-i \Omega t}$, $e^{-2i \Omega t}$ and so on.
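The bookkeeping in eq. (4) can be checked by brute force. The sketch below is only an illustration: the coefficient values and the truncation `J` are hypothetical. It computes each harmonic of $p^2$ as a direct sum of $P_j P_k$ over all pairs with $j+k=m$, and compares it with the grouped form of eq. (4), with the pattern of cross terms continued up to $P_J$.

```python
J = 6  # highest harmonic kept (truncation chosen for illustration)

# Hypothetical coefficients: P_0 = 0, |P_j| decaying with j, P_{-j} = conj(P_j)
P = {0: 0j}
for j in range(1, J + 1):
    P[j] = (0.3 ** j) * complex(1.0, 0.5 * j)
    P[-j] = P[j].conjugate()

def square_coeff(m):
    """Coefficient of exp(i*m*Omega*t) in p^2: sum of P_j * P_k over all j + k = m."""
    return sum(P[j] * P[m - j] for j in range(-J, J + 1) if abs(m - j) <= J)

def eq4_coeff(m):
    """The same coefficient in the grouped form of eq. (4), for m = 0, 1, 2, 3."""
    # Cross terms 2 * P_j * conj(P_{j-m}); for m = 0 this is 2 * sum |P_j|^2
    cross = 2 * sum(P[j] * P[j - m].conjugate() for j in range(m + 1, J + 1))
    # Terms built only from positive-index coefficients
    direct = {0: 0, 1: 0, 2: P[1] ** 2, 3: 2 * P[1] * P[2]}
    return direct[m] + cross

for m in range(4):
    print(m, abs(square_coeff(m) - eq4_coeff(m)))
```

The printed differences should be at the level of rounding error, confirming the factors of 2 and the $P_1^2/2$ term in eq. (4).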

In a similar way,

$$p^3= 3e^{i \Omega t} \left( P_1^3 +P_1^2 P_3 + \cdots \right)$$

$$+ 3e^{2i \Omega t} \left( P_1^2 P_4 + 2 P_1^2 P_2 + \cdots \right)$$

$$+ 3e^{3i \Omega t} \left( P_1^3/3 + P_1^2 P_5 + 2P_1^2 P_3 + \cdots \right)$$

$$+ \mathrm{~higher~terms} + \mathrm{C.C.} \tag{5}$$

where this time we have added in a further simplification: we can assume that $P_1$ is not complex but real, so that we don’t have to distinguish $P_1$ from $P_1^*$. The reason we can do this is that the overall phase of our periodic oscillation is arbitrary, and we can fix it by adopting this assumption.
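A quick sanity check on the leading terms of eqs. (4) and (5) is the special case of a pure fundamental: take $P_1$ real and every other coefficient zero, so that $p = 2P_1 \cos \Omega t$. Standard trigonometric identities then give

$$p^2 = 4P_1^2 \cos^2 \Omega t = 2P_1^2 + P_1^2 \left( e^{2i \Omega t} + e^{-2i \Omega t} \right)$$

$$p^3 = 8P_1^3 \cos^3 \Omega t = 3P_1^3 \left( e^{i \Omega t} + e^{-i \Omega t} \right) + P_1^3 \left( e^{3i \Omega t} + e^{-3i \Omega t} \right) .$$

These agree with the first terms in each bracket of eqs. (4) and (5): the coefficient of $e^{2i \Omega t}$ in $p^2$ is $2 \times P_1^2/2$, while in $p^3$ the coefficients of $e^{i \Omega t}$ and $e^{3i \Omega t}$ are $3 \times P_1^3$ and $3 \times P_1^3/3$ respectively.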

Now we can substitute into eq. (1), and separately equate the terms in $e^{i \Omega t}$, $e^{2i \Omega t}$ and $e^{3i \Omega t}$ (this is the “harmonic balancing” step of the method). In each of these equations, in the spirit of the approximations we are making, we will ignore small terms and just keep the dominant terms of each one. The underlying assumption used here is that the power series coefficients $a_j$ and the Fourier coefficients $P_j$, $V_j$ decay at comparable rates as $j$ increases. The first three resulting equations then read

$$V_1 \approx a_1 P_1 \tag{6}$$

$$V_2 \approx a_1 P_2 +a_2 P_1^2 \tag{7}$$

$$V_3 \approx a_1 P_3 +2 a_2 P_1 P_2 + a_3 P_1^3 . \tag{8}$$
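The claim that these are the dominant terms can be tested numerically. In the sketch below, all the numbers are assumptions made for illustration: a cubic $v(p)$ with $a_1=a_2=a_3=1$, a small real $P_1$, and $P_2$, $P_3$ scaled like $P_1^2$, $P_1^3$. The code builds $p(t)$ from eq. (2), evaluates $v(p(t))$ over one period, extracts $V_1$, $V_2$, $V_3$ by direct summation, and compares them with the right-hand sides of eqs. (6)–(8).

```python
import cmath
import math

a = [0.0, 1.0, 1.0, 1.0]   # hypothetical a_0..a_3 in eq. (1)
P1 = 0.01                  # small, real fundamental amplitude (assumed)
# Illustrative higher coefficients, scaled like P1^2 and P1^3
P = {1: P1, 2: (1 + 0.5j) * P1**2, 3: (0.8 - 0.2j) * P1**3}

N = 64  # samples per period; enough to resolve harmonics up to 9 without aliasing

def p_of(theta):
    """Eq. (2) with P_0 = 0 and P_{-j} = conj(P_j): p = 2 * Re(sum_j P_j e^{i j theta})."""
    return 2 * sum((Pj * cmath.exp(1j * j * theta)).real for j, Pj in P.items())

def v_of(p):
    """Truncated power series of eq. (1)."""
    return sum(an * p**n for n, an in enumerate(a))

def V(harm):
    """Fourier coefficient V_harm of v(p(t)), by direct summation over one period."""
    total = sum(
        v_of(p_of(2 * math.pi * n / N)) * cmath.exp(-1j * harm * 2 * math.pi * n / N)
        for n in range(N)
    )
    return total / N

approx = {
    1: a[1] * P[1],                                         # eq. (6)
    2: a[1] * P[2] + a[2] * P[1]**2,                        # eq. (7)
    3: a[1] * P[3] + 2 * a[2] * P[1] * P[2] + a[3] * P[1]**3,  # eq. (8)
}
rel_err = {j: abs(V(j) - approx[j]) / abs(approx[j]) for j in (1, 2, 3)}
print(rel_err)
```

With these assumed values the relative errors come out well below one percent, of order $P_1^2$, which is the size of the neglected terms relative to those kept in eqs. (6)–(8).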

These must be supplemented by the other relation between $v(t)$ and $p(t)$, via the input impedance of the tube. At each harmonic of our playing frequency $\Omega$, the corresponding Fourier coefficients $V_j$ and $P_j$ are related by

$$V_j = Y(j \Omega) P_j \tag{9}$$

where $Y(\omega)=1/Z(\omega)$ is the admittance of the tube. Combining with eqs. (6–8), we find

$$a_1 \approx Y(\Omega) \tag{10}$$

$$P_2 \approx \dfrac{a_2}{Y(2 \Omega) - Y(\Omega)} P_1^2 \tag{11}$$

$$P_3 \approx \left[ \dfrac{2 a_2^2}{Y(2 \Omega) - Y(\Omega)} + a_3 \right] \dfrac{P_1^3}{Y(3 \Omega) - Y(\Omega)} . \tag{12}$$
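The algebra behind eqs. (10)–(12) is short enough to write out. Substituting $V_j = Y(j \Omega) P_j$ from eq. (9) into eqs. (6)–(8) gives

$$Y(\Omega) P_1 \approx a_1 P_1$$

$$Y(2 \Omega) P_2 \approx a_1 P_2 + a_2 P_1^2$$

$$Y(3 \Omega) P_3 \approx a_1 P_3 + 2 a_2 P_1 P_2 + a_3 P_1^3 .$$

Dividing the first of these by $P_1$ (which is non-zero if a note is sounding) gives eq. (10). Replacing $a_1$ by $Y(\Omega)$ in the other two and collecting terms,

$$\left[ Y(2 \Omega) - Y(\Omega) \right] P_2 \approx a_2 P_1^2$$

$$\left[ Y(3 \Omega) - Y(\Omega) \right] P_3 \approx 2 a_2 P_1 P_2 + a_3 P_1^3 .$$

The first line is eq. (11), and substituting that expression for $P_2$ into the second line produces eq. (12).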

We can deduce several interesting things from these equations. Equation (10) is different from the others, because it does not involve any of the Fourier coefficients $P_j$: the term $P_1$ cancelled out at the last stage of the calculation. That is not a surprise: this equation represents the linearised approximation to the problem we are addressing, and it is normal that linear theory (for example of vibration modes) does not predict the amplitude of non-forced motion.

But eq. (10) allows us to deduce (within the limits of this approximate calculation) two important things: the playing frequency $\Omega$, and the required mouth pressure $p_0$. The power series coefficients $a_j$ are all real numbers, but the admittance function $Y(\omega)$ is usually complex. So the imaginary part of eq. (10) tells us that the playing frequency $\Omega$ must take a very special value, such that $Y(\Omega)$, and thus also the impedance $Z(\Omega)$, is real. Looking all the way back at Fig. 2 of section 2.2.7, we expect this to happen at a frequency very close to each resonance frequency of the instrument tube.
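The condition that $Z(\Omega)$ is real can be explored with a toy calculation. The sketch below is not based on measured data: it assumes a hypothetical two-mode impedance, with peaks at frequencies 1 and 3 (in units of the first resonance, loosely like a cylindrical tube) and made-up peak heights and Q factors. It then uses bisection to find the frequency near the first peak at which the imaginary part of $Z$ vanishes.

```python
# Hypothetical two-mode model: (resonance frequency, peak height, Q factor) for each mode.
# Frequencies are in units of the first resonance; all values are illustrative.
MODES = [(1.0, 1.0, 30.0), (3.0, 0.7, 30.0)]

def Z(w):
    """Toy input impedance: a sum of simple resonance peaks."""
    return sum(zk / (1 + 1j * q * (w / wk - wk / w)) for wk, zk, q in MODES)

def find_real_Z(lo, hi, tol=1e-12):
    """Bisection for Im Z(w) = 0, assuming Im Z changes sign exactly once in [lo, hi]."""
    f_lo = Z(lo).imag
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = Z(mid).imag
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid   # sign change lies in the upper half
        else:
            hi = mid                # sign change lies in the lower half
    return 0.5 * (lo + hi)

Omega = find_real_Z(0.95, 1.05)
Y_at_Omega = 1 / Z(Omega)
print("playing frequency / first resonance:", Omega)
print("Y(Omega):", Y_at_Omega)
```

In this toy model the second mode contributes a small imaginary part near the first peak, so the frequency at which $Z$ is real is shifted very slightly away from the first resonance frequency, and the admittance there is real and positive.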

So far, so good: we expect our “instrument” to play a note close to one of the resonances. But now we need to see what the real part of eq. (10) tells us. Having fixed the value of $\Omega$, the (real) value of $Y(\Omega)$ is known: it can be measured from the input impedance of the instrument. So in order to satisfy eq. (10), the value of $a_1$ must match this value of $Y(\Omega)$. We are close to a resonance, which means that $Z(\Omega)$ is expected to be large, and therefore $Y(\Omega)$ is small: but we know that it is always a positive quantity, because it determines the energy dissipation of the acoustic duct.

So $a_1$ must have a particular small, positive value. This is achieved by changing the mouth pressure $p_0$, which has the result of sliding the curve in Fig. 1 sideways. The coefficient $a_1$ has a simple geometric interpretation: it is the slope of the tangent to the curve, at the point where it crosses the vertical axis. If the player begins by blowing very gently into the instrument, and then gradually increases the mouth pressure, we can deduce that the instrument should first “light up” and start to produce a note the first time that it is possible to satisfy eq. (10). This will occur when the tangent slope first reaches a positive value matching the value of $Y(\omega)$ at one of the frequencies where this is a real number. The smallest value of $Y$ corresponds to the largest value of $Z$, so the prediction is that the first note to play will correspond to the highest peak of the impedance $Z(\omega)$: or at least, to a frequency very close to that peak, since the vanishing of the imaginary part of $Z$ doesn’t happen exactly at the peak value of $|Z|$. Usually, this highest peak of $Z(\omega)$ occurs close to the lowest resonance of the instrument tube.

Equations (11) and (12) now tell us the amplitudes $P_2$ and $P_3$ of the second and third harmonics, in terms of the amplitude $P_1$ of the fundamental. Of course, we could have extended the analysis to include higher harmonics if we wished. The first thing these equations reveal is that provided the function $v(p)$ really is nonlinear, so that the power series coefficients $a_2$, $a_3$ and so on are not all zero, the pressure waveform $p(t)$ inside the tube will inevitably contain harmonics.

The equations then tell us how those harmonic amplitudes are influenced by the properties of the tube. Finally, we have to decide whether we are really thinking of a clarinet, a saxophone, an oboe or whatever: these different instruments will have different impedance characteristics. The most important difference is between cylindrical instruments like the clarinet, and conical instruments like the saxophone or oboe. As we know from section 4.2, as a first approximation a conical tube will be expected to have resonances close to every harmonic, whereas a cylindrical tube only has them close to the odd-numbered harmonics 1, 3, 5, ….

Equation (11) then tells us something important. For a conical instrument, we can expect $Z(\omega)$ to have peaks at $\Omega$ and at $2 \Omega$, if the note being played is based on the lowest tube resonance. This means that $Y(\Omega)$ and $Y(2\Omega)$ are both small, so that $P_2$ will be relatively large. If we have a cylindrical instrument, on the other hand, $Y(\Omega)$ will be small but $Y(2\Omega)$ will be large (it is close to an antiresonance of the tube). The result is that $P_2$ will be much smaller than in the conical instruments. For both types of instrument, however, $Y(3\Omega)$ is small, because both shapes of tube have a resonance near $3 \Omega$. So both types of instrument will have a significant level of third harmonic in the internal pressure spectrum, and thus in the radiated sound.
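To get a feel for the size of this effect, eqs. (11) and (12) can be evaluated with some made-up numbers. Everything in the sketch below is an assumption chosen for illustration (the values of $a_2$, $a_3$, $P_1$ and the admittances are not taken from any real instrument, and the function names are hypothetical): the "cone" has a small admittance at $2\Omega$, the "cylinder" a large one.

```python
def second_harmonic(P1, a2, Y1, Y2):
    """P_2 from eq. (11), given P_1, a_2 and the admittances Y(Omega), Y(2*Omega)."""
    return a2 * P1**2 / (Y2 - Y1)

def third_harmonic(P1, a2, a3, Y1, Y2, Y3):
    """P_3 from eq. (12), with Y3 = Y(3*Omega)."""
    return (2 * a2**2 / (Y2 - Y1) + a3) * P1**3 / (Y3 - Y1)

# Illustrative values only (arbitrary units)
P1, a2, a3 = 0.01, 1.0, 1.0
Y1 = 0.05  # real and small: the playing frequency sits near an impedance peak

tubes = {
    "cone": {"Y2": 0.08 + 0.02j, "Y3": 0.10 + 0.03j},      # near peaks at 2*Omega and 3*Omega
    "cylinder": {"Y2": 20.0 + 0.0j, "Y3": 0.10 + 0.03j},   # near an antiresonance at 2*Omega
}

results = {}
for name, Y in tubes.items():
    P2 = second_harmonic(P1, a2, Y1, Y["Y2"])
    P3 = third_harmonic(P1, a2, a3, Y1, Y["Y2"], Y["Y3"])
    results[name] = (abs(P2), abs(P3))
    print(f"{name}: |P2| = {abs(P2):.3e}, |P3| = {abs(P3):.3e}")
```

With these assumed admittances the second harmonic of the "cylinder" is several hundred times weaker than that of the "cone", while both have a non-zero third harmonic.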

Equations (11) and (12) also illustrate a result sometimes called “Worman’s theorem” [1, see section 21.3]. The amplitude of the $j$th harmonic is, in this approximation, proportional to the $j$th power of the fundamental amplitude $P_1$. In practice, this effect comes into play as the blowing pressure is increased a little above the threshold to make the note louder. The harmonic content then grows, with a simple pattern. In decibel terms, if the fundamental increases by 1 dB, the second harmonic will increase by 2 dB, the third by 3 dB, and so on. This is likely to be perceived by a listener as an increase of brightness, accompanying the increase of loudness. That description is surely in qualitative agreement with the common experience of players on instruments like the clarinet or saxophone. Some experimental support for this pattern has been given by Benade [1, see Fig. 21.6].
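The decibel rule follows in one line. Suppose, as an idealisation, that $P_j = c_j P_1^j$ where the factor $c_j$ (a symbol introduced here for illustration) is treated as independent of amplitude. Writing $L_j = 20 \log_{10} |P_j|$ for the level of the $j$th harmonic in decibels,

$$L_j = 20 \log_{10} |c_j| + j \, L_1 , \qquad \mathrm{so~that} \qquad \Delta L_j = j \, \Delta L_1 .$$

A change of $\Delta L_1 = 1$ dB in the fundamental thus corresponds to 2 dB in the second harmonic and 3 dB in the third. In the present model the factors $c_j$ depend on the coefficients $a_n$, which themselves shift slightly as the mouth pressure changes, so the rule should only be expected to hold approximately.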

Some more sophisticated applications of harmonic balance to the clarinet can be found in section 9.4 of Chaigne and Kergomard [2].


[1] Arthur H. Benade; “Fundamentals of Musical Acoustics”, Oxford University Press (1976), reprinted by Dover (1990).

[2] Antoine Chaigne and Jean Kergomard; “Acoustics of musical instruments”, Springer/ASA Press (2013).