10.1 Why measure?

This “underpinnings” chapter is rather different in character from the previous ones. It will not address a particular technical topic, like acoustics or nonlinearity. Rather, it will give an overview of a very diverse range of activities that come under the umbrella of “measurement and experimentation”. This will give an excuse to include various interesting odds and ends. For this particular chapter, don’t be afraid to check the side links: some of them are picture collections or “how to…” guides aimed at instrument makers wishing to do measurements themselves.

Some experiments involve expensive laboratory-based kit, like a laser vibrometer or a scanning electron microscope. At the other extreme are the kind of “what will happen if…” explorations that instrument makers often pursue. These will be workshop-based, and may not involve any equipment beyond the maker’s usual tools. But there is a wide range of things that fall between these two extremes. Some “science” measurements need only simple equipment. On the other hand, many instrument makers with a systematic mindset want to keep detailed records of each instrument as they build it. Such record-keeping would begin with dimensions and weights, but may extend to more technical things like wood properties, vibration frequencies of components such as violin plates, and acoustic response measurements of one kind or another on the completed instrument.

Measurements of all these kinds can involve traps for the unwary, and getting the best out of a measurement involves knowing some “tricks of the trade” for avoiding them. In the course of this chapter we will meet some examples. But before getting down to specific details, it is useful to explore the general question of why people might want to do measurements at all. There is no single answer to that: different people bring different agendas. This chapter is about physical measurements, so I will not talk again about psychoacoustical experiments: these are very important, as discussed back in Chapter 6, but they involve a different mindset, and a different set of skills, pitfalls and tricks of the trade. For our present purpose, there are two main types of agenda, coming from instrument makers and physicists.

Instrument makers are likely to be motivated to do measurements or experiments by practical concerns of making better instruments, or addressing particular problems that have arisen. They may ask questions like “How do I make another one just like this?”; “Can I replace traditional materials, either with something better or with something more sustainable?”; “How should I adjust constructional details to address a particular tonal problem?” or simply “What will happen if…?”. For some instrument makers, there are also forensic questions about accurate copying of famous old instruments. There is an old joke: “How many violin makers does it take to change a light bulb?” The answer: several. One to do the job, and the rest to debate how Strad would have done it. But it is not only the violin world that has iconic makers of the past: the same goes for Lloyd Loar mandolins and Hermann Hauser guitars, for example.

The physicists’ agenda is rather different. They all want to know “how does it work”, but they fall into three different camps on how to approach that question (although some individual physicists have a foot in more than one of these camps — and some must have three feet!). There are those who are primarily interested in theory and mathematical models. There are those who aim to understand behaviour by detailed computer modelling. And then there are those who are dedicated experimentalists.

Measurement has a role for all three approaches, but it feels different to committed members of each camp. For the theorists, the role of measurements is to test, confirm and calibrate the theoretical models: but those models really hold the key to understanding. The computational folk are somewhat similar. Measurements are for “validation” of the computational model: once it is well enough validated, they expect to use the computer for the real business of “doing science”. The committed experimentalists take a quite different view. They are likely to assert that “measurement is ground truth”. Theory is only useful if it agrees with the measurements, otherwise the theory must surely be wrong? In extreme form, these people may think “surely we can sort the whole thing out if only we had enough data?” Indeed, this is quite a fashionable idea these days, with “big data” all around us.

The committee that awards Nobel Prizes for physics seems to have a strong bias in favour of experimentalists. Although they have, of course, awarded prizes to theorists like Einstein, they sometimes delay for a very long time, and only award a prize once there is experimental confirmation. Einstein’s prize was not for his crowning achievement, the theory of relativity, but for much less well-known work on the photoelectric effect, which had more immediate experimental evidence. There have been controversial decisions…but this isn’t the place to air those. No-one has yet been given a Nobel prize for studying musical instruments — although some Nobel laureates like Rayleigh and Raman were also interested in musical problems.

Cutting across these two broad agendas, there are several specific motivations for doing a measurement or experiment. The first is measurement aimed purely at data gathering. This might be an instrument maker wanting to measure wood properties for their records, or wanting detailed geometric information on a famous old instrument, but it could also be a physicist in the third camp, collecting extensive acoustical and vibration data on an instrument with “big data” and “data-driven science” in mind. The first two are uncontroversial, but the third one raises important doubts: I’ll say more about that shortly.

A second motivation for measurement is particular to instrument makers: monitoring for quality control. A potential client may say “I really liked the instrument you made for so-and-so: could you make me one like that?” The maker immediately has a challenge: they are working with natural materials, and no two pieces of wood will be exactly the same. The client doesn’t just want an instrument that looks like the one they admired: they want one that sounds like it. So the maker needs to try to create a similar acoustical performance in the new instrument, despite inevitable differences in the raw materials. Can measurements help them in this task?

A third category of motivation is rather different: measurement for hypothesis testing. You have an idea for how something works, or what will be the effect of a particular modification, so you design an experiment specifically to probe this idea. Crucially, such experiments need not make a “better” instrument, and they need not be subtle. The aim is to test the hypothesis, and that test may be more clear and convincing with a deliberately crude modification, far larger than you would necessarily want to use in a real instrument. If your hypothesis stands up to this test, you can then use it to inform more subtle and graduated changes.

From the perspective of a physicist, such experiments illustrate the classic “scientific method”. But an instrument maker might also perform such tests, not in order to extend scientific knowledge but as part of the practical business of developing their skill. As a simple example, they may have an idea about what will happen if they change the weight of a violin bridge. So they might deliberately fit a very heavy bridge and a very light one, to see if these extreme changes give tonal effects that follow what they were expecting — even if the light bridge was too fragile to stand up to the rigours of normal violin playing.

The final category of motivation is measurement as exploration. Try some changes, and see (or hear) what happens. Such experiments can blur into the previous category: perhaps you have only a rather vague hypothesis, but you do the test anyway. The example of changing bridge weight could fit here too: perhaps you don’t have a clear idea of what the effect of weight might be, but it is an easy experiment so let’s try it and see…

But there is a very important snag. My example of changing bridge weight seems clear, but it is misleading. Violin makers and violinists already know quite a lot about the effect of bridge weight, because players regularly make temporary changes to the weight, by fitting a mute. Makers are indeed interested in bridge adjustment, but not just in the effect of weight. There are many options for tweaking a violin bridge: wood choice, spacing of the feet, thickness, and all manner of details of the shaping of those elaborate cut-outs. Figure 1 reminds you of the complicated shape.

Figure 1. A violin bridge.

Well, you could still go for the “try it and see” approach, but you would need to make many, many bridges to cover all the different variations, and also probe how they work in combination with each other. In practice you would probably only do a small number, then try to guess the bigger picture from these limited results. This is fraught with danger! Attempts to generalise and spot patterns from inadequate data have sometimes led to important insights, but more often they have led to misleading claims, about musical instruments and in the wider world.

This simple example illustrates a fatal weakness in the notion that an experimenter could “just collect enough data and then see the answer”. Of course there have been triumphs of “big data” and Artificial Intelligence systems to extract patterns — the targeted advertisements in your web browser are a familiar example, although whether you regard those as a triumph may depend on whether you are an advertiser or a consumer. In any case, such systems can only be developed and deployed in a context where large resources can be devoted to the problem. That is never an option in the world of musical instruments.

The preferable alternative is to combine the data gathering with some kind of modelling, with a view to finding an informed way to interpret the measurements. This could take several different forms: theoretical models could involve technical mathematical theory of some kind or less formal logical argument about how different factors might interact; alternatively, the model could be a computer program that attempted to capture the essential physics of the system.

We have already seen an example of such modelling in a context relevant to my bridge-adjustment example, back in section 5.3 when we talked about the “bridge hill” of a violin. A simple idealisation of one aspect of bridge behaviour was represented in Fig. 12 in that section, and then applied to a crude model of the violin body in order to see how the “hill” feature could arise as a result of coupling of the bridge and body behaviour. The model predictions (Fig. 8) matched, at least qualitatively, the measurement of violin bridge admittance shown in Fig. 7. That match can give us some confidence in using insights from the super-simple bridge model to make informed guesses about how different aspects of bridge adjustment might influence the behaviour. This then opens the way to replacing “scattergun” exploratory experiments (with many, many bridges) with targeted hypothesis-testing experiments, involving only a limited number of modified bridges.

There is one important tool for generating insights from modelling which deserves a special mention. This is an approach called dimensional analysis. It is a very common experience that you know (at least approximately) the governing mathematical equations for something, but you can’t solve them. Dimensional analysis sometimes allows you to learn useful things about the solutions by a very simple argument.

The idea is based on looking to see which physical parameters enter into the equations, and thinking about the units they are measured in. We are interested in mechanical problems, so all units can be reduced to combinations of mass, length and time. For example, density is mass per unit volume, so it has units kilograms per cubic metre: mass divided by length cubed. For another example, the units of force can be deduced from Newton’s law: force is mass times acceleration, so its units are kilograms times metres per second per second, in other words mass $\times$ length / time squared. There is one very simple thing we can say about any equation expressing some aspect of physics: all the terms in the equation must have the same units. You can’t add a mass to a length, for example; it simply doesn’t make sense: what would 1 kg plus 1 m mean?

This simple insight is the basis of the method of dimensional analysis. Some details are explained in the next link. I will illustrate the power of the method by examples. Think first about a bending beam, like a xylophone bar or the tines of a tuning fork. We found the governing equation for vibration of beams back in section 3.2.1. For any particular problem, there are only three physical parameters involved: the bending stiffness, the mass per unit length, and the total length of the beam. As the next link explains, we can use dimensional analysis to deduce that the frequency of any particular mode of such a beam must be expressed in a particular form. It must be proportional to the square root of the bending stiffness, inversely proportional to the square root of the mass per unit length, and inversely proportional to the square of the length. That is quite a lot of information to be able to deduce simply by thinking about the units everything is measured in!
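As a minimal sketch of how the argument runs, the following uses notation that is assumed here for illustration: $EI$ for the bending stiffness, $m$ for the mass per unit length, $\ell$ for the length of the beam, and $\mathsf{M}$, $\mathsf{L}$, $\mathsf{T}$ for the dimensions of mass, length and time. The dimensions of the three parameters and of a frequency $f$ are

$$[EI] = \mathsf{M}\,\mathsf{L}^{3}\,\mathsf{T}^{-2}, \qquad [m] = \mathsf{M}\,\mathsf{L}^{-1}, \qquad [\ell] = \mathsf{L}, \qquad [f] = \mathsf{T}^{-1} .$$

Suppose the frequency is a product of powers, $f = C\,(EI)^{a}\,m^{b}\,\ell^{c}$, where $C$ is a pure number. Matching the powers of mass, length and time on the two sides gives

$$a + b = 0, \qquad 3a - b + c = 0, \qquad -2a = -1 ,$$

so that $a = \tfrac{1}{2}$, $b = -\tfrac{1}{2}$, $c = -2$, and

$$f = \frac{C}{\ell^{2}}\sqrt{\frac{EI}{m}} .$$

The number $C$ depends on the mode and on how the beam is supported. Dimensional analysis cannot supply it: that requires solving the governing equation.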


We can go a bit further. Suppose our beam has a rectangular cross-section. The analysis shows that the frequency must be independent of the width, proportional to the thickness, and inversely proportional to the square of the length. These “scaling laws” immediately tell you how to modify the shape to make a set of xylophone bars or tuning forks tuned to different frequencies. You can make the frequency lower by making the beam longer or by making it thinner, or by some combination of those. Any change which reduces the ratio of thickness to length squared by about 6% will lower the pitch by one semitone, for example. Conversely, a change of thickness and length which kept that ratio the same would result in two bars or forks with the same pitch, even though they looked different. It might have taken you quite a long time to deduce this fact if you simply did measurements on bars of many different shapes.
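To see where these proportionalities come from, here is a short worked version. The symbols are assumptions introduced for illustration: $b$ is the width, $h$ the thickness, $\ell$ the length, $E$ the Young’s modulus and $\rho$ the density of the material, and $C$ is a dimensionless constant that depends on the mode and the boundary conditions. For a rectangular section the bending stiffness is $EI$ with $I = bh^{3}/12$, and the mass per unit length is $m = \rho b h$. The scaling with stiffness, mass per unit length and length stated earlier then becomes

$$f = \frac{C}{\ell^{2}}\sqrt{\frac{EI}{m}} = \frac{C}{\ell^{2}}\sqrt{\frac{E\,b\,h^{3}}{12\,\rho\,b\,h}} = C\,\frac{h}{\ell^{2}}\sqrt{\frac{E}{12\,\rho}} .$$

The width $b$ cancels, leaving the frequency proportional to $h/\ell^{2}$ for a given material.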

As a final step, notice what happens if you scale both the length and the thickness by the same factor. A bit of cancellation then happens, so that the frequency scales inversely with that factor. This is an example of a rather general property: if you make a scaled replica of a structure, with all dimensions scaled by the same factor, then all the frequencies will scale by the inverse of that factor. For example, if you scale down a xylophone bar, or a violin, by about 6%, all the frequencies will rise by about one semitone (assuming that you use the same material).
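These scaling laws are easy to turn into a small calculator. The following is a minimal sketch in Python, assuming only that the frequency of a bar of fixed material is proportional to thickness divided by length squared. The function names are invented for illustration.

```python
import math

# Frequency ratio corresponding to one equal-tempered semitone.
SEMITONE = 2.0 ** (1.0 / 12.0)


def frequency_ratio(thickness_ratio, length_ratio):
    """New/old frequency for a bar of fixed material.

    Assumes frequency is proportional to thickness / length**2.
    Each argument is the new dimension divided by the old one.
    """
    return thickness_ratio / length_ratio ** 2


def semitone_shift(thickness_ratio, length_ratio):
    """Pitch change in semitones (positive means the pitch rises)."""
    return 12.0 * math.log2(frequency_ratio(thickness_ratio, length_ratio))


if __name__ == "__main__":
    # Thinning a bar by 6% at fixed length.
    print(f"6% thinner: {semitone_shift(0.94, 1.0):+.2f} semitones")
    # Scaling every dimension down by one semitone factor.
    s = 1.0 / SEMITONE
    print(f"uniform scale {s:.4f}: {semitone_shift(s, s):+.2f} semitones")
    # Doubling the length while quadrupling the thickness.
    print(f"same pitch check: ratio {frequency_ratio(4.0, 2.0):.3f}")
```

The last case illustrates the point made in the text that two bars can look quite different and still share a pitch, provided the ratio of thickness to length squared is unchanged.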

This idea of making scaled structures gives a link to the second important idea that is thrown up by dimensional analysis. Most physics problems are more complicated than a vibrating beam, and they involve a larger number of physical parameters. You can still follow the methodology explained in the previous link, but dimensional analysis will no longer tell you the complete answer to your problem. But it tells you something else which is also interesting. If the number of parameters is bigger than the number of units involved (for mechanical problems this number is 3: mass, length and time) then the analysis tells you that there are combinations of the parameters which have no units: these are called dimensionless parameters.

Because they have no units, you cannot say anything about how the values of those parameters affect the solution (without doing a lot more work, to solve the equations). But what you can say is that if you change the structure in such a way that the values of the dimensionless parameters do not change, then the answer will also not change. This idea lies at the heart of the familiar process of doing laboratory tests on scale models of structures, such as the model aeroplane in the wind tunnel shown in Fig. 2.

Figure 2. A scale model of an aeroplane being tested in the Markham wind tunnel, at the Cambridge University Engineering Department.

Once you have built your scale model, how do you choose the air speed in the tunnel, in order to reproduce more or less the same flow patterns around the model as you would expect in the full-scale aeroplane? Fluid dynamicists are very keen on dimensionless parameters, and they have defined many of them, useful in different contexts. The most important thing you must do in a wind tunnel test is to make sure to keep the same value of a dimensionless parameter called the Reynolds number — the previous link gives some details about what this number is and why it is important.
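As a rough illustration of the matching condition, the sketch below assumes the usual definition of the Reynolds number, $Re = UL/\nu$, where $U$ is a flow speed, $L$ a characteristic length and $\nu$ the kinematic viscosity of the fluid. The function names and the example numbers are hypothetical, chosen only to show the arithmetic.

```python
def reynolds_number(speed, length, kinematic_viscosity):
    """Reynolds number Re = U * L / nu (dimensionless)."""
    return speed * length / kinematic_viscosity


def matched_tunnel_speed(full_speed, scale, nu_full, nu_tunnel):
    """Tunnel speed that gives the model the full-scale Reynolds number.

    scale is model length divided by full-scale length (e.g. 0.1).
    nu_full and nu_tunnel are the kinematic viscosities of the fluid
    around the real object and in the tunnel respectively.
    """
    return full_speed * (nu_tunnel / nu_full) / scale


if __name__ == "__main__":
    nu_air = 1.5e-5  # m^2/s, approximate value for air at room temperature
    # Hypothetical case: 3 m reference length at 60 m/s, 1:10 model.
    u_model = matched_tunnel_speed(60.0, 0.1, nu_air, nu_air)
    print("full scale Re:", reynolds_number(60.0, 3.0, nu_air))
    print("model Re:     ", reynolds_number(u_model, 0.3, nu_air))
    print("tunnel speed: ", u_model, "m/s")
```

With the same fluid in both cases, a model at one tenth of the size needs ten times the speed to keep the Reynolds number unchanged.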

Another example of scale-model testing relates to the performance of infrastructure like buildings or dams, for example in an earthquake. The problem with a scale model in this context is that the total mass of a model of the building or the dam scales down with the cube of the linear dimensions. The result is that small models are, in relative terms, too light. The solution to that problem, via a suitable dimensionless number, is to make gravity stronger. This can be done by using a geotechnical centrifuge, which spins your sample of soil and your model structure at a high rate: exactly the same idea as the centrifuges used to train astronauts to withstand the acceleration of space missions. Figure 3 shows what a geotechnical centrifuge looks like.

Figure 3. Researchers inside the geotechnical centrifuge at the Schofield Centre of Cambridge University. This centrifuge can spin a “bucket” of soil with a scale model of a structure, using centrifugal force to simulate an increase in the force of gravity in order to match the relevant dimensionless parameter. The two researchers are standing near the sample bucket, which is hinged so that it can pivot up as the centrifuge spins. On the right-hand side is a counterweight. The whole system is in an underground space, for safety in the event of something going wrong.
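The text does not name the dimensionless number involved, so the sketch below rests on an assumption: that what must be preserved is the ratio of self-weight stress (density $\times$ gravity $\times$ height) to the strength of the material. With the same soil in the model and the full-scale structure, a model at scale 1/N then needs N times normal gravity. The arm radius and scale in the example are hypothetical.

```python
import math

STANDARD_GRAVITY = 9.81  # m/s^2


def required_g_level(scale):
    """Gravity multiplier needed for a model at the given scale.

    scale is model size divided by full size. Assumes the same material
    in model and prototype, so that density * gravity * height is
    preserved when gravity is multiplied by 1 / scale.
    """
    return 1.0 / scale


def spin_rate_rpm(g_level, radius_m):
    """Rotation rate giving centripetal acceleration g_level * g.

    Uses acceleration = omega**2 * radius, with omega in rad/s.
    """
    omega = math.sqrt(g_level * STANDARD_GRAVITY / radius_m)
    return omega * 60.0 / (2.0 * math.pi)


if __name__ == "__main__":
    n = required_g_level(1.0 / 100.0)  # hypothetical 1:100 model
    print(f"g-level: {n:.0f}")
    print(f"spin rate at 4 m radius: {spin_rate_rpm(n, 4.0):.0f} rpm")
```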

Returning to our main theme, there is an aside that we might note. The kind of interplay between experiment and “theory” that we have been talking about has an interesting parallel in the context of debugging computer models. If you have never tried to write a computer program to solve a problem, you may not be aware of the tricky and time-consuming process of debugging. Your first effort to code something up invariably has errors in it. If you have no idea at all what you expect the answer to look like, you are in danger of believing the results of your first version that seems to run successfully.

But almost certainly, these results are wrong. How do you find out? The process relies on you thinking about the problem, in order to generate some ideas about how the solution ought to behave. This might involve special cases of the parameter values which should result in recognisably extreme behaviour, or it might involve cross-checks with standard physics results (“it must be consistent with Newton’s laws, or conservation of energy”). So you check these things out with your program. Probably something doesn’t work, and after a bit of head-scratching you realise what you have done wrong to create the disparity.

So how do you know when to stop looking for further bugs? There is no answer to that, except that after a while you run out of ideas for cases to check. The result is a bit counter-intuitive: the time it takes to debug a program goes up, in proportion to the number of prejudices you can come up with about the answer. The less you know, the quicker the process is — but of course that doesn’t mean you get correct code quicker, it simply means that you stop looking, and run the risk that there is still something wrong. This kind of interplay with “theory” is crucial if you want to end up with a program that has a good chance of being correct.
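As a concrete illustration of such cross-checks, here is a minimal sketch in Python. It simulates an undamped mass on a spring, a case where the “prejudices” are easy to state: the total energy should stay constant, and the period should be $2\pi\sqrt{m/k}$. The integration scheme (velocity Verlet) and all the names are choices made for this example, not taken from the text.

```python
import math


def simulate_oscillator(mass, stiffness, x0, v0, dt, n_steps):
    """Velocity-Verlet integration of an undamped mass-spring system.

    Returns a list of (time, displacement, velocity) tuples.
    """
    x, v = x0, v0
    a = -stiffness * x / mass
    history = [(0.0, x, v)]
    for step in range(1, n_steps + 1):
        x += v * dt + 0.5 * a * dt * dt
        a_new = -stiffness * x / mass
        v += 0.5 * (a + a_new) * dt
        a = a_new
        history.append((step * dt, x, v))
    return history


def total_energy(mass, stiffness, x, v):
    """Kinetic plus potential energy of the oscillator."""
    return 0.5 * mass * v * v + 0.5 * stiffness * x * x


def max_relative_energy_drift(mass, stiffness, history):
    """Largest fractional departure from the initial energy."""
    e0 = total_energy(mass, stiffness, history[0][1], history[0][2])
    return max(
        abs(total_energy(mass, stiffness, x, v) - e0) / e0
        for _, x, v in history
    )


def estimate_period(history):
    """Time between the first two upward zero crossings of displacement."""
    crossings = []
    for (t0, x0, _), (t1, x1, _) in zip(history, history[1:]):
        if x0 < 0.0 <= x1:
            # Linear interpolation between the two samples.
            crossings.append(t0 + (t1 - t0) * (-x0) / (x1 - x0))
    return crossings[1] - crossings[0]


if __name__ == "__main__":
    m, k = 1.0, (2.0 * math.pi) ** 2  # chosen so the period should be 1
    hist = simulate_oscillator(m, k, x0=1.0, v0=0.0, dt=0.001, n_steps=3000)
    print("expected period:", 2.0 * math.pi * math.sqrt(m / k))
    print("measured period:", estimate_period(hist))
    print("energy drift:   ", max_relative_energy_drift(m, k, hist))
```

If either check fails, for example because a sign was flipped in the acceleration, the discrepancy shows up immediately, which is exactly the kind of signal the debugging process relies on.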

Exactly the same is true for interpreting experiments. It is only too easy to do something wrong in the way a measurement is carried out, or in the way the results are interpreted — these are equivalent to the bugs in a computer code. Eliminating such problems involves, first, recognising that something is wrong. You do that by scratching your head to come up with exactly the same kind of prejudices about how the results ought to behave, and then checking those aspects carefully in your measurement. We will see some examples in later sections, when we look at particular types of measurement.

It is particularly important to think about this kind of interplay with “theory” if you have done an undirected, exploratory investigation, and you are now trying to spot patterns in the results. How do you recognise a “significant pattern” and decide what it means? That is the million-dollar question. The crucial first stage is to ask the question, and to be on the lookout for patterns all the time — patterns that might be some interesting feature of the results of your measurements, or might be indicating that there is something wrong. How do you recognise “wrong”? Quite often from a “gut feeling” based on experience. Data not meeting an expectation is not necessarily “wrong” but it may be the germ which will lead to new understanding.

As we emphasised right back in Chapter 1, there are some things for which there is believed to be a good theoretical understanding, at least in general terms. If you see something which contradicts previous experimental and/or theoretical expectation, it either means you are on track for your Nobel prize, or more likely that there is something wrong with your measurement or your interpretation of it. Even if you are in fact on track for the Nobel prize with some remarkable discovery which flies in the face of what everyone believes, you still have to convince people that your results are right. So in either case, seriously unexpected results have to be checked, checked and checked again. Your new material shows up with ten times the stiffness-to-weight ratio of any other known material? Better check your rig by measuring something familiar, like steel or aluminium. You see nonlinear response when everyone expects linear response? Better check that the nonlinearity is really coming from the test object, not from some artefact of your rig and the way it is put together.

To end this discussion of general issues surrounding the business of measurement and experimentation, we should note one more very important idea that must be kept in mind when planning any experiment, including the kind of relatively informal experiment that an instrument maker might do. Suppose you want to investigate the influence of bracing pattern on the soundboard of a classical guitar — something guitar makers worry about a lot. You might think of making three guitars, the same except that they have three different bracing patterns. This would be a mistake! It would tempt you into a trap.

If you build the three guitars and then play them, no doubt they will all sound somewhat different from each other. Are you hearing the characteristic sound of the three different bracing patterns? Well, you might be, but you could also be seriously misled. The problem stems from the fact that no two guitars can be made exactly identical. Every piece of wood is different, and every aspect of the making process is only controlled within a certain tolerance: thickness distribution, details of glue joints, details of the bridge, details of the setup. All those things will come out a little different every time, even when the maker aims to keep them the same. The differences of sound you hear between your three guitars might be caused by the thing you intended to be different (the bracing pattern), but they might not: you may be hearing the accumulated effect of all those other small variations.

The solution to this problem is to realise that your experiment, like every other experiment, needs a control. At a minimum, you should make four guitars: the three you already planned, plus a fourth one that has the same bracing pattern as one of the others. In other words, you try to build two identical guitars among your set. Now you listen to them all, and the question to decide is not “Do the ones with different bracing sound different?”, but the more subtle question “Do the ones with different bracing sound more different than the two supposed identical ones?”
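The comparison can be written down as a small calculation. The sketch below assumes you have some numerical “difference score” for each pair of guitars, perhaps from listening tests or from comparing measured responses; how to obtain such a score is not addressed here. The labels and numbers are invented placeholders, not measurements.

```python
def pairs_exceeding_control(pair_scores, control_pair, margin=1.0):
    """Pairs whose difference score exceeds that of the control pair.

    pair_scores maps a pair of guitar labels to a difference score
    (larger means more different). control_pair is the key for the two
    nominally identical guitars. A pair is reported only if its score
    is greater than margin times the control score.
    """
    baseline = pair_scores[control_pair]
    return sorted(
        pair
        for pair, score in pair_scores.items()
        if pair != control_pair and score > margin * baseline
    )


# Invented placeholder scores, for illustration only (not measurements).
# A and A2 share a bracing pattern; B and C use different patterns.
example_scores = {
    ("A", "A2"): 2.0,  # control pair: nominally identical guitars
    ("A", "B"): 2.5,
    ("A", "C"): 6.0,
    ("A2", "B"): 1.8,
    ("A2", "C"): 5.5,
    ("B", "C"): 4.0,
}

if __name__ == "__main__":
    print(pairs_exceeding_control(example_scores, ("A", "A2"), margin=2.0))
```

With a single control pair the baseline is itself only one sample of the build-to-build variation, so setting the margin above 1 is one cautious option.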