From Scratch: Writings in Music Theory

James Tenney, Larry Polansky, and Lauren Pratt

Print publication date: 2015

Print ISBN-13: 9780252038723

Published to Illinois Scholarship Online: April 2017

DOI: 10.5406/illinois/9780252038723.001.0001


Excerpts from “An Experimental Investigation of Timbre—the Violin” (1966)

Chapter 5
Source: From Scratch
Author(s): James Tenney, Larry Polansky, Lauren Pratt, Robert Wannamaker, Michael Winter
Publisher: University of Illinois Press
DOI: 10.5406/illinois/9780252038723.003.0005

Abstract and Keywords

James Tenney presents excerpts from his 1966 essay “An Experimental Investigation of Timbre—the Violin.” The research was carried out at the School of Music and the Computation Center at Yale University. Tenney first provides a description of the experiment as well as the equipment and computer programs he used in his investigation. In particular, he discusses the basic approach to sound analysis and synthesis that employs a digital computer with peripheral equipment for translating a signal from “analog” to digital form (for analysis) and from digital to analog form (for synthesis). The analysis programs used in this study comprise a “pitch-synchronous” system, while the sound-generating program used to synthesize violin tones is Max V. Mathews's “Music IV Compiler.” Tenney then explains the experimental results and concludes with a proposal for further research and a request for continued support by the National Science Foundation, laying special emphasis on spectral parameters and envelope and modulation parameters.

Keywords: timbre, James Tenney, computer program, sound analysis, sound, violin, tone, spectral parameters, envelope, modulation parameters

Preface

This report covers the research that has been completed to date on the project “An Experimental Investigation of Timbre,” although certain aspects of this work have already been described in published papers (Mathews et al. 1965; Tenney 1965). The research has so far been limited to a single instrument—the violin—although the concepts and methods used here are entirely applicable to other musical instruments as well. A description of the equipment and computer programs used in the investigation is given in section 1 of this report. The description is brief, since most of the techniques are relatively standard. More detailed descriptions are readily available in the literature on speech analysis and computer systems. The experimental results of the research are dealt with in section 2.

In the course of the investigation, new methods have been developed, though some of the most interesting of these emerged too late to be put into practice. Because of this, and because of the need to extend the investigation to other musical instruments, the last section of this report is in the form of a proposal for further research and a request for continued support by the National Science Foundation.

The work described here was done at the School of Music and the Computation Center at Yale University, with frequent consultations with and valuable assistance from former colleagues at Bell Telephone Laboratories. I have recently been appointed an associate professor of electrical engineering at the Polytechnic Institute of Brooklyn, and I anticipate that this new affiliation will provide much more in the way of laboratory facilities and technical assistance than have been available to me at Yale.

Excerpts from Section 1. Equipment and Procedures

[…]

The basic approach to sound analysis and synthesis described in this report was, in fact, originally developed in speech research (David, Mathews, and McDonald 1958, 1959; Mathews, Miller, and David 1961; David 1961) and employs a digital computer with peripheral equipment for translating a signal from “analog” to digital form (for analysis) and from digital to analog form (for synthesis). Sounds are first recorded on ordinary magnetic tape. From this tape, a second recording is made on digital tape in a format that can be read by the computer. The computer is then used to carry out various kinds of mathematical analysis of the signal, printing out the results in numerical and graphic form. From these results, parameters are derived for the sound-synthesis program. Using this information as input, the computer produces another digital tape from which, finally, another analog tape recording may be made. A comparison of this tape with the original recording then provides a direct aural test of the success of the analysis. In addition, manipulation of the parameters used in the computer synthesis may indicate the relative importance of each parameter in the perception of timbre.
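The round trip just described (analog to digital, numerical analysis, parameter derivation, digital synthesis, and back to analog for an aural comparison) can be summarized in a brief sketch. The WAV files and numpy arrays below are modern stand-ins for the digital-tape converters of the original system, and `analyze` and `synthesize` are hypothetical placeholders for the programs described in the following pages:

```python
import wave
import numpy as np

def read_wav(path):
    """Read a mono 16-bit WAV file into a float array in [-1, 1]."""
    with wave.open(path, "rb") as w:
        rate = w.getframerate()
        data = np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)
    return rate, data.astype(np.float64) / 32768.0

def write_wav(path, rate, signal):
    """Write a float array in [-1, 1] out as a mono 16-bit WAV file."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes((np.clip(signal, -1, 1) * 32767).astype(np.int16).tobytes())

# Hypothetical use of the loop described in the text:
# rate, original = read_wav("violin_tone.wav")    # digitized recording
# params = analyze(original, rate)                # analysis programs (section 1)
# resynth = synthesize(params, rate)              # sound-generating program
# write_wav("violin_resynth.wav", rate, resynth)  # compare with original by ear
```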

[…]

Analysis Programs

The analysis programs used in this study comprise a “pitch-synchronous” system (Mathews, Miller, and David 1961). That is, the computer steps through the signal period by period, carrying out all the primary analytical operations on a given period and printing out the results of these operations before proceeding to the next period. Some of the information is stored in the memory so that after the last period of a given tone has been analyzed, certain averaging operations may be carried out. …

Since the program deals with the signal one period at a time, the first thing that must be done is to measure the period-length, defined in this program as the number of samples between successive signal amplitude peaks. This requires that the computer search for the point of maximum amplitude within a predetermined range of probable sample-distances. … In the course of this frequency-measuring process, peak and RMS amplitudes are also determined for the period, and this information is printed out along with the frequency information.
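As a concrete illustration of this period-by-period stepping, here is a minimal sketch in Python, assuming the fundamental is known in advance to lie between two bounding frequencies (the 180–260 Hz range below is an arbitrary example); the peak-picking details of Tenney's actual program are not preserved:

```python
import numpy as np

def next_period(signal, start, min_len, max_len):
    """Find the next period boundary: the sample of maximum amplitude
    within the window of plausible period lengths after `start`."""
    window = signal[start + min_len : start + max_len]
    return start + min_len + int(np.argmax(window))

def pitch_synchronous_scan(signal, rate, f_lo=180.0, f_hi=260.0):
    """Step through the tone period by period, recording frequency,
    peak amplitude, and RMS amplitude for each period."""
    min_len = int(rate / f_hi)                 # shortest plausible period
    max_len = int(rate / f_lo)                 # longest plausible period
    start = int(np.argmax(signal[:max_len]))   # first amplitude peak
    results = []
    while start + max_len < len(signal):
        end = next_period(signal, start, min_len, max_len)
        period = signal[start:end]
        results.append({
            "freq": rate / (end - start),                 # Hz
            "peak": float(np.max(np.abs(period))),        # peak amplitude
            "rms": float(np.sqrt(np.mean(period ** 2))),  # RMS amplitude
        })
        start = end
    return results
```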

[…]

Fourier series coefficients are next computed for the period. … Amplitudes and phases of the harmonics are printed out, and a printer-plot is made of the amplitude spectrum. In addition, a spectral envelope is computed by interpolation through all the harmonic amplitude values, and the frequency-position of all relative maxima and minima are determined. These positions are assumed to represent possible “poles” and “zeros” of the waveform function and turn out to be important in the later synthesis of the tones.
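A sketch of this per-period spectral analysis, using an FFT for the Fourier coefficients and simple linear interpolation for the spectral envelope (the original program's interpolation method is not specified in the excerpt):

```python
import numpy as np

def analyze_period(period, rate):
    """Fourier coefficients of one period plus spectral-envelope extrema."""
    n = len(period)
    spectrum = np.fft.rfft(period) / n
    amps = 2.0 * np.abs(spectrum[1:])      # harmonic amplitudes
    phases = np.angle(spectrum[1:])        # harmonic phases
    f0 = rate / n                          # fundamental of this period
    harm_freqs = f0 * np.arange(1, len(amps) + 1)

    # Spectral envelope by interpolation through the harmonic
    # amplitudes; dense linear interpolation is a stand-in here.
    grid = np.linspace(harm_freqs[0], harm_freqs[-1], 2048)
    env = np.interp(grid, harm_freqs, amps)

    # Relative maxima ("poles") and minima ("zeros") of the envelope.
    d = np.diff(env)
    maxima = grid[1:-1][(d[:-1] > 0) & (d[1:] <= 0)]
    minima = grid[1:-1][(d[:-1] < 0) & (d[1:] >= 0)]
    return amps, phases, maxima, minima
```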

At this point, the program shifts to the next period, and the whole process is repeated until the end of the tone has been reached. The program then produces two printer-plots showing, respectively, the changes of peak amplitude and frequency in the course of the tone. These amplitude and frequency-envelope plots are used later to determine the nature of various types of modulation such as vibrato and tremolo.

Synthesis Program

The sound-generating program used in this study to synthesize violin tones is Max V. Mathews’s “Music IV Compiler” (Mathews 1961; Tenney 1963). This program allows for the precise specification of all parameters of a sound. In addition, provision is made for altering the structure of the sound-generating program itself in order to simulate musical instruments of any degree of complexity. … From the user’s point of view, the computer-simulated “instrument” to be designed will consist of a configuration of “unit generators,” each of which performs some function that has an easily understandable physical or acoustical analog. These unit generators include, for example, the periodic function generator (oscillator), the random function generator, the adder (mixer), the multiplier, the bandpass filter, etc. Each unit generator has a single output and a number of control inputs, one or more of which are generally taken from the output of another unit generator. …
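A toy Python analog of the unit-generator idea may make the patching scheme concrete; the class below is illustrative and is not Mathews's implementation. One oscillator's output feeds the frequency input of another, producing a vibrato tone:

```python
import numpy as np

class Oscillator:
    """Periodic function generator: table lookup at a controllable
    frequency, in the spirit of a Music IV-style unit generator."""
    def __init__(self, table, rate):
        self.table, self.rate, self.phase = table, rate, 0.0

    def render(self, freq, amp):
        """Both control inputs are sample-by-sample arrays, so either
        may be taken from the output of another unit generator."""
        out = np.empty(len(freq))
        for i, (f, a) in enumerate(zip(freq, amp)):
            out[i] = a * self.table[int(self.phase) % len(self.table)]
            self.phase += f * len(self.table) / self.rate
        return out

rate = 44100
sine_table = np.sin(2 * np.pi * np.arange(512) / 512)
n = rate  # one second

# A 6 Hz oscillator's output patched into the frequency input of a
# 440 Hz oscillator whose amplitude input is a ramp (values illustrative).
vibrato = Oscillator(sine_table, rate).render(np.full(n, 6.0), np.full(n, 5.0))
tone = Oscillator(sine_table, rate).render(
    np.full(n, 440.0) + vibrato,      # frequency input from another unit
    np.linspace(0.0, 0.3, n))         # ramped amplitude input
```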

Excerpts from Section 2. Experimental Results

[…]

Summary

[…] By way of summarizing these results, the data will be recast in the form of a description of the temporal evolution of the violin tone itself. That is, I shall describe first the initial transient portion of the tone and then the steady-state and decay portions in terms of all the parameters that appear to be significant in the determination of timbre.

The initial buildup in amplitude during the attack segment, while quite irregular in shape, approximates an exponential curve. …

During this initial buildup of the amplitude of the tone, the fundamental frequency is very unsteady. This unsteadiness is generally of two kinds. If the tone is within a legato-group (thus immediately following a tone of another pitch), there is nearly always a glide (“portamento”) from the frequency of the previous tone to that of the current tone. This glide is not usually a simple interpolation between the two frequencies, however, but generally includes some degree of “overshoot”—which may occur more than once, and thus in both directions—before the frequency settles down to what will be the central frequency of the steady-state portion of the tone. …

The second kind of unsteadiness in the fundamental frequency during the initial transient portion of the tone is a random frequency modulation, the bandwidth of which is relatively wide at the beginning, thereafter decreasing more or less gradually toward the steady-state bandwidth. This kind of fluctuation is of very great importance in the determination of violin timbre (or, more generally, of bowed-string timbre).
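Both kinds of unsteadiness can be imitated in a few lines. In the sketch below the overshoot is modeled, purely illustratively, as a decaying oscillation about the target frequency, and the random modulation is piecewise-constant noise whose bandwidth shrinks toward a steady-state value; all the constants are placeholders, not measured values:

```python
import numpy as np

rng = np.random.default_rng(0)
rate = 44100
dur = 0.5
t = np.arange(int(rate * dur)) / rate

# Portamento from the previous pitch with overshoot, modeled as an
# exponentially decaying oscillation about the target frequency.
f_prev, f_target = 392.0, 440.0
glide = f_target + (f_prev - f_target) * np.exp(-t / 0.03) * np.cos(2 * np.pi * 8.0 * t)

# Random frequency modulation whose bandwidth starts wide and decays
# toward its steady-state value (numbers are placeholders).
bw = 12.0 * np.exp(-t / 0.1) + 2.0                  # Hz, wide -> narrow
noise = np.repeat(rng.standard_normal(len(t) // 441 + 1), 441)[:len(t)]
f_inst = glide + bw * noise                         # instantaneous frequency

phase = 2 * np.pi * np.cumsum(f_inst) / rate        # integrate frequency -> phase
attack = (1 - np.exp(-t / 0.05)) * np.sin(phase)    # roughly exponential buildup
```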

During the buildup of the tone, the amplitude spectrum varies irregularly, though it already shows many of the characteristic features of the steady-state spectrum.

Some 120 to 180 milliseconds after the beginning of the tone, there begins a quasi-periodic frequency modulation (the vibrato) that continues throughout the steady-state portion of the tone (in all tones except those played on an open string). … A corresponding amplitude modulation is sometimes evident at the same rate as the frequency modulation but with highly variable ranges and in varying phase relationships with the frequency modulation from one tone to another.
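A sketch of the steady-state modulations described here: a vibrato fading in about 150 ms after the onset (within the reported 120–180 ms range), with a corresponding amplitude modulation at the same rate but different depth and phase. The rate, depths, and phase offset are illustrative values, not measurements:

```python
import numpy as np

rate = 44100
t = np.arange(rate) / rate                  # one second of samples

# Vibrato beginning ~150 ms after the onset, faded in over 50 ms.
onset, fade_time = 0.15, 0.05
fade = np.clip((t - onset) / fade_time, 0.0, 1.0)
vib_rate = 5.5                              # Hz, assumed vibrato rate
fm = fade * 6.0 * np.sin(2 * np.pi * vib_rate * t)              # +/-6 Hz deep
am = 1.0 + fade * 0.1 * np.sin(2 * np.pi * vib_rate * t + 1.0)  # same rate,
                                            # different depth and phase
phase = 2 * np.pi * np.cumsum(440.0 + fm) / rate  # frequency -> phase
tone = am * np.sin(phase)
```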

[…]

In addition to these more nearly periodic modulations during the steady-state portion of the tone, there are random modulations in both frequency and amplitude. …

The spectrum does not become absolutely constant during the steady-state portion of the tone, though the fluctuations from one period to the next are not as great as during the initial transient portion. The spectral envelopes exhibit formant peaks at approximately 500, 1,700, and 3,000 cycles per second, and, in addition, antiresonances or zeros appear at approximately periodically spaced intervals along the frequency axis. Whereas the peaks in the spectral envelopes reflect fixed resonances in the instrument, the zeros reflect discontinuities in the excitation waveform due to the mechanism of bowed-string oscillation. The frequency locations of those zeros depend primarily on the distance of the bow from the bridge and secondarily on bow-speed and pressure.
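The reported shape of the spectral envelope can be sketched from these figures: three resonant peaks near 500, 1,700, and 3,000 cycles per second multiplied by a comb of periodically spaced zeros. The comb model below, with nulls at multiples of f0/β for a bow placed at a fraction β of the string length, is an assumed idealization consistent with the described dependence on bow position:

```python
import numpy as np

f = np.linspace(50.0, 5000.0, 2000)        # frequency axis, Hz

def resonance(f, fc, bw):
    """Magnitude of a simple resonant peak (a Lorentzian stand-in)."""
    return 1.0 / np.sqrt(1.0 + ((f - fc) / (bw / 2.0)) ** 2)

# Three formant peaks near the reported 500, 1,700, and 3,000 cps
# (the bandwidths are placeholders).
peaks = (resonance(f, 500.0, 300.0)
         + resonance(f, 1700.0, 400.0)
         + resonance(f, 3000.0, 500.0))

# Periodically spaced zeros: an idealized bowed string excited at a
# fraction beta of its length has excitation nulls near multiples of
# f0 / beta, which shift when the bow moves relative to the bridge.
beta = 1.0 / 9.0                           # assumed bow position
f0 = 220.0
comb = np.abs(np.sin(np.pi * beta * f / f0))   # zeros at k * f0 / beta

envelope = peaks * comb                    # combined spectral envelope
```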

The experimental data did not show any very important differences between conditions during the decay portion and those during the steady-state portion of the tones analyzed. The form of the amplitude envelope during the decay segment was clearly linear, however. …

One of the most noticeable characteristics of the tones is the high degree of fluctuation that takes place in the course of their evolution in time. This fluctuation is only slightly less prominent during the steady-state region than it is during the initial transient period, so that the very term “steady state” begins to seem inappropriate. That such fluctuations are an essential aspect of the timbre of instruments like the violin may easily be demonstrated by synthesizing tones without them. By comparison with other synthetic tones in which such fluctuations are included, the former seem quite lifeless and mechanical. And though the experiments in synthesis that have been carried out so far have not yet resulted in a fully successful simulation of the timbre of the violin, they have provided a great deal of insight into the question of what it is that characterizes the timbre of a musical instrument played by a human being.

[…]

Section 3. Proposal for Continued Research

Introduction

In order to synthesize the tone of a musical instrument on the basis of data derived from computer-analysis, these data need to be in the form of a set of parameters representing inputs to an “instrument” in the sound-generating program. Thus, the design of a computer-instrument to simulate the real instrument must be done before the computer-analysis, rather than after it, as was done previously. Such a computer-instrument actually constitutes a kind of “model” of the real musical instrument whose tones we want to synthesize, and its design will be determined by all a priori knowledge we have or may gain about the instrument’s physical structure and mode of operation and about the way in which the instrument is played. In the case of the violin (and other bowed-stringed instruments), for example, we know that the spectrum of the tone will be conditioned by a number of fixed resonances, that there will generally be a set of slowly varying antiresonances in addition to the resonances, that there will be a quasi-periodic frequency-modulation (and perhaps also a similar amplitude-modulation) whenever the player is producing the tone with vibrato, etc.

The need for physical analysis of the instrument had not been anticipated at the beginning of the work described in earlier sections of this report, only becoming apparent as the work progressed. Now it is evident that this kind of analysis should be done at the very beginning of the study of an instrument.

The complete analysis of the sounds of a given musical instrument will thus involve several stages, as outlined below:

  1. a physical (and/or mathematical) analysis of the mechanical action of the instrument and of the “system” comprised of instrument and player;

  2. the design of a computer-instrument to simulate this instrument-player system;

  3. a complete computer-analysis of recorded tones of the real instrument;

  4. the computer-synthesis of these tones, using the “instrument” designed in stage 2, with input parameters derived from stage 3; and

  5. listening tests comparing the original recorded tones with the synthesized tones to evaluate the relative success of the analysis.

With regard to the way in which the computer is used to carry out the analysis of a tone (stage 3, above), certain revisions seem to be called for. First, Fourier series analysis assumes perfect periodicity in the tone being analyzed, and since no real tone produced by a musical instrument is ever perfectly periodic, Fourier analysis ought to be applied only to that part of the signal that is truly periodic—or to a truly periodic function that may be derived from the signal in some meaningful way. The presence of a salient pitch in musical tones indicates that such signals are at least approximately periodic, and the procedure to be outlined here assumes, in fact, that there is an essential periodicity in the signal that is “perturbed” in various well-defined ways. That is, the deviations from strict periodicity are assumed to be due to a set of modulating and additive functions that can be isolated from the signal along with the periodic function. This possibility of isolating various aspects of the signal would be extremely useful later also, because it would make it possible to study the subjective effect of each such single aspect separately.

Second, if the computer-analysis is to provide data that are immediately applicable to the synthesis of the tones—without “interpretation”—an analysis program must be written that does much more than simply compute Fourier coefficients, plot amplitude-spectra, and plot amplitude- and frequency-envelopes. It will have to compute, for example, rates and ranges of the various kinds of modulation present in the signal. The kinds of data required of the program thus depend on the design of the computer-instrument that will be used for synthesis.

In order to illustrate the procedures proposed for the analysis program itself, a computer-instrument has been designed to simulate the tones of the violin and other bowed-stringed instruments (figure 1). It is based on what is already known about these instruments, but in fact it would probably be adequate to simulate the tones of most of the more common instruments of the orchestra (more than adequate for some, since they might not require such an elaborate model). The computer-instrument shown in figure 1 would generate each tone in three segments (representing the attack, steady-state, and decay regions of the tone, respectively), with linear interpolations between an initial and a final value for all parameters except the formant-filter parameters (1 through 6) that determine center frequencies and bandwidths, which remain constant during the tone. The design of this instrument assumes, further, that the actual fluctuations of amplitude and frequency in the course of the (real) tone


Figure 1. Model computer-instrument representing generalized musical sound-source.

can each be replaced by a combination of one periodic and one random modulation-function with simplified parameters and that the more slowly varying amplitude- and frequency-envelopes can be effectively approximated by linear (ramp) functions (in three segments).

The process of analysis now involves simply the derivation of appropriate values for all the external input-parameters in our model instrument (the points numbered from 1 through 32 in the diagram, figure 1). It is not possible to derive all these values directly, however. A step-by-step procedure is necessary that gradually isolates each of the major kinds of variation in the signal, subjecting these to further, information-reducing analytical operations, employing simplifying approximations whenever possible. It will be seen in the outline that follows that the analytical process moves, essentially, from the bottom of the computer-instrument (“OUT”) to the top, in a stepwise progression that gradually fills in the various control parameters.
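Figure 1 itself is not reproducible here, but the parameter numbering the text goes on to assign (1–6 formant filters, 7–10 zeros, 11 the stored waveform, 12 the segment durations, and 13–32 the envelope and modulation values) can be collected in a schematic sketch. Which ten-parameter block (13–22 or 23–32) controls amplitude and which frequency is not stated in the excerpt, so the two are left generic below:

```python
# Schematic inventory of the 32 external input-parameters of the model
# instrument (figure 1), as numbered in the text; field values are
# placeholders except the reported formant center frequencies.
from dataclasses import dataclass, field

@dataclass
class FormantFilters:                 # parameters 1-6, constant during the tone
    center_freqs: tuple = (500.0, 1700.0, 3000.0)  # Hz, the reported peaks
    bandwidths: tuple = (300.0, 400.0, 500.0)      # Hz, placeholder values

@dataclass
class ZeroFilter:                     # parameters 7-10, slowly varying zeros
    freq_factor_ramp: tuple = (0.0, 0.0)   # ZF(t): initial, final
    bandwidth_ramp: tuple = (0.0, 0.0)     # ZB(t): initial, final

@dataclass
class ModulationBlock:                # parameters 13-22 or 23-32
    linear_env: tuple = (0.0, 0.0)         # L(t) ramp values
    quasi_range: tuple = (0.0, 0.0)        # Q(t) range: A1, A2
    quasi_rate: tuple = (0.0, 0.0)         # Q(t) rate: F1, F2
    random_range: tuple = (0.0, 0.0)       # random-generator range ramp
    random_rate: tuple = (0.0, 0.0)        # random-generator rate ramp

@dataclass
class ModelInstrument:
    formants: FormantFilters = field(default_factory=FormantFilters)
    zeros: ZeroFilter = field(default_factory=ZeroFilter)
    waveform: tuple = ()                        # parameter 11: stored WF samples
    segment_durations: tuple = (0.0, 0.0, 0.0)  # parameter 12: attack/steady/decay
    modulation_a: ModulationBlock = field(default_factory=ModulationBlock)
    modulation_b: ModulationBlock = field(default_factory=ModulationBlock)
```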

Spectral Parameters

We define the original recorded signal, S(t), as composed of several functions, as listed below:

  1. a basic waveform, WF(t), which is a single-period function assumed to be repeated periodically in the course of the tone. The spectrum of this is assumed to have been altered by two kinds of filters, symbolized by the transfer functions …

     a. P, representing a set of (three) resonances or poles (“formant filters”), and

     b. Zt, representing a set of slowly varying antiresonances or zeros, periodically spaced in frequency. In addition, the signal includes

  2. a frequency modulating function, FMt;

  3. an amplitude modulating function, AM(t);

  4. a low-frequency additive function, LF(t), which will include inharmonic, “DC,” and noise components generally lower in frequency than the fundamental of the tone; and finally

  5. a higher-frequency additive function, HF(t), which may include some low-frequency components but will involve mostly higher-frequency inharmonic and noise components, appearing as fine-structure fluctuations in the waveform from period to period in the original signal. (Note: LF(t) and HF(t) are not represented in figure 1.)

The combination of these various functions in the signal is then represented by the following expression:¹

$$S(t) = P[Z_t[FM_t[LF(t) + AM(t) \times (WF(t) + HF(t))]]].$$
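Read from the inside out, the expression says: add the fine-structure noise HF(t) to the repeated waveform WF(t), amplitude-modulate, add LF(t), frequency-modulate, and pass the result through the zero and pole filters. A toy synthesis in that order, with every component a simple stand-in (in particular, P and Zt are collapsed into a single smoothing filter purely to mark where the filtering stage sits):

```python
import numpy as np

rate = 44100
n = rate                                   # one second of samples
t = np.arange(n) / rate
rng = np.random.default_rng(1)

# Toy stand-ins for the component functions, built at a constant
# fundamental of 220 Hz; all shapes and amounts are illustrative.
f0 = 220.0
wf = np.sign(np.sin(2 * np.pi * f0 * t))          # WF(t): repeated basic waveform
hf = 0.01 * rng.standard_normal(n)                # HF(t): fine-structure noise
lf = 0.02 * np.sin(2 * np.pi * 3.0 * t)           # LF(t): sub-fundamental component
am = 0.3 * (1.0 + 0.1 * np.sin(2 * np.pi * 5.5 * t))  # AM(t)

inner = lf + am * (wf + hf)                       # LF + AM x (WF + HF)

# FM_t realized as a resampler: read `inner` at a rate proportional
# to the desired instantaneous frequency (a vibrato around f0).
f_inst = f0 + 4.0 * np.sin(2 * np.pi * 5.5 * t)
pos = np.cumsum(f_inst / f0)                      # warped read positions
fm_out = np.interp(np.clip(pos, 0, n - 1.0), np.arange(n), inner)

# P and Z_t collapsed into one FIR smoothing pass, purely to mark
# where the pole/zero filtering sits in the chain.
s = np.convolve(fm_out, np.ones(8) / 8.0, mode="same")
```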

As each of these component functions is extracted from the signal, the values of the function will be stored on digital tape for later use. The steps in the analysis of the signal are as follows:

  1. Find P (thereby determining parameters 1–6 in figure 1) and inverse-filter to obtain a new function²

     $$S_1(t) = P^{-1}[S(t)] = Z_t[FM_t[LF(t) + AM(t) \times (WF(t) + HF(t))]].$$

  2. Find Zt (parameters 7–10) and inverse-filter to obtain³

     $$S_2(t) = Z_t^{-1}[S_1(t)] = FM_t[LF(t) + AM(t) \times (WF(t) + HF(t))].$$

     Steps 1 and 2 together are intended to isolate any constant or slowly varying spectral-envelope characteristics (i.e., those that are varying independently of the fundamental frequency of the signal) before the frequency-demodulation (step 3, below) is carried out, since spurious spectral characteristics may then be introduced if this “prewhitening” has not been done. In addition, it should lessen the effect of phase-shifts in adjacent harmonics that sometimes cause artifactual discontinuities in the frequency-measuring program.

  3. Find FMt and frequency-demodulate (i.e., resample, with polynomial interpolations—a quadratic should be sufficiently precise here) to obtain

     $$S_3(t) = FM_t^{-1}[S_2(t)] = LF(t) + AM(t) \times (WF(t) + HF(t)).$$

     S3(t) will now be a signal with constant fundamental frequency throughout, or at least constant time-intervals between successive amplitude-peaks.

  4. Find LF(t) and subtract to obtain

     $$S_4(t) = S_3(t) - LF(t) = AM(t) \times (WF(t) + HF(t)).$$

     This first additive function, LF(t), will be the mean value of the positive and negative peak-amplitude envelopes of S3(t). These envelopes would be computed by polynomial interpolations through the points representing peak amplitudes on the positive and negative sides of the zero-axis.

  5. Find AM(t) and amplitude-demodulate (i.e., divide) to obtain

     $$S_5(t) = S_4(t) / AM(t) = WF(t) + HF(t).$$

     S5(t) will now be a signal with a relatively constant amplitude-spectrum, constant peak-amplitudes, and constant period-lengths. The signal is still not perfectly periodic, however, since the waveforms will generally be slightly different in different periods. We must derive from S5(t) a single waveform that represents an average of the periods in its steady-state region (the boundaries of which will have been determined in a preliminary run).

  6. Find WF(t) (parameter 11), by averaging corresponding samples in successive periods of the steady-state region of the tone, and subtract to obtain HF(t). HF(t) is only part of the total inharmonic, DC, and noise components in the tone and should be recombined with LF(t), as in step 7 below.

  7.1. Remodulate (in amplitude) HF(t) to obtain

     $$S_6(t) = AM(t) \times HF(t).$$

  7.2. Remodulate (in frequency) S6(t) and LF(t) to obtain

     $$S_7(t) = FM_t[LF(t) + AM(t) \times HF(t)].$$

  7.3. Filter S7(t) with Zt to obtain

     $$S_8(t) = Z_t[FM_t[LF(t) + AM(t) \times HF(t)]].$$

  7.4. Filter S8(t) with P to obtain a “residue” function

     $$S_r(t) = P[Z_t[FM_t[LF(t) + AM(t) \times HF(t)]]].$$

This “residue” function, Sr(t), is now in the same form it is assumed to have in the original function, S(t), being the difference between S(t) and a quasi-periodic function, Sq(t), where

$$S_q(t) = S(t) - S_r(t) = P[Z_t[FM_t[AM(t) \times WF(t)]]].$$

Both Sq(t) and Sr(t) should be generated as sound so they can be listened to. (Sr(t) should be of very small amplitude, so it may be useful to amplify it digitally.) If the process has failed to keep any true harmonic components out of Sr(t), this should be immediately audible. In addition, listening to Sq(t) should indicate how important Sr(t) may be in determining or conditioning the timbre of the tone. If Sr(t) does seem to be important, it will have to be analyzed by some other method—perhaps by that used for the preliminary run or by that used to analyze the random modulations in the amplitude- and frequency-envelopes (see step 5, below). WF(t) is now only a single-period function, and this, of course, may be Fourier-analyzed and its spectrum compared to P and Zt.
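A compressed sketch of the demodulation core of this outline (steps 3 through 6), assuming the inverse filtering of steps 1 and 2 has already been applied and that period boundaries are available from a pitch-synchronous pass. Linear interpolation stands in for the quadratic resampling, and one positive and one negative peak per period stand in for the full peak-envelope computation:

```python
import numpy as np

def demodulate(s2, period_bounds, target_len):
    """Steps 3-6 in simplified form. `period_bounds` is an array of
    sample indices from the pitch-synchronous pass; `target_len` could
    be, e.g., the median period length."""
    # Step 3: frequency-demodulate by resampling every period to the
    # same length (quadratic interpolation in the text; linear here).
    periods = []
    for a, b in zip(period_bounds[:-1], period_bounds[1:]):
        src = np.linspace(0.0, 1.0, b - a, endpoint=False)
        dst = np.linspace(0.0, 1.0, target_len, endpoint=False)
        periods.append(np.interp(dst, src, s2[a:b]))
    s3 = np.concatenate(periods)

    # Step 4: LF(t) = mean of the positive and negative peak-amplitude
    # envelopes (one peak of each sign per period, interpolated).
    pos = np.array([p.max() for p in periods])
    neg = np.array([p.min() for p in periods])
    centers = (np.arange(len(periods)) + 0.5) * target_len
    x = np.arange(len(s3))
    lf = np.interp(x, centers, (pos + neg) / 2.0)
    s4 = s3 - lf

    # Step 5: AM(t) from the remaining peak envelope; divide it out.
    am = np.interp(x, centers, (pos - neg) / 2.0)
    s5 = s4 / np.maximum(am, 1e-9)

    # Step 6: WF = average of corresponding samples over the periods;
    # HF is whatever is left after the average is removed.
    stack = s5.reshape(len(periods), target_len)
    wf = stack.mean(axis=0)
    hf = (stack - wf).ravel()
    return s3, lf, am, wf, hf
```

Steps 7.1–7.4 would then push `hf` and `lf` back through the modulation and filter stages in forward order to obtain the residue Sr(t).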

We have now isolated each of the several functions assumed to compose the signal. In addition, we have another signal, Sr(t), which will contain much of the random noise in the tone, and a more nearly periodic signal, Sq(t), representing the original signal with Sr(t) removed. To help make clear what will have been achieved by the analysis so far, a second diagram is shown in figure 2, representing schematically the nature of our analytical results at this intermediate stage in the whole process. Several functions (AM(t), FMt, etc.) will have been stored on digital tape (denoted by the circular symbols in the diagram). The inputs to the formant filters will have been reduced to six constants (determining the center frequencies and bandwidths of the three filters), and a basic (excitation-function) waveform will have been stored (in sampled form). Thus, the major spectral parameters have been derived, but the various enveloping and modulating functions have yet to be reduced to their final (simplest) form. Although some of the noise in the tone will be contained in Sr(t), there will generally be random fluctuations in AM(t) and FMt that may produce perceptible noise in the tone. And these two modulating functions will usually exhibit some quasi-periodic fluctuations, too, whose parameters need to be determined. The following procedure also requires that a preliminary run on the computer has been made, producing amplitude- and frequency-envelope plots.


Figure 2. Data representation at intermediate stage of analysis.

Envelope and Modulation Parameters

Let E(t) represent either of the modulation functions (FMt or AM(t)) extracted from the signal by the foregoing analysis. We assume that E(t) is composed of several functions (as with the signal itself, in the earlier stages of the analysis) such that

  • E(t) = L(t) + C(t) + R(t), where

  • E(t) is the original envelope function,

  • L(t) is the best-fitting (least-squares) linear function (in three segments),

  • C(t) is a quasi-periodic (cosinusoidal) modulation, and

  • R(t) is a random modulation (to be simulated by the random function generator in the music compiler).

  1. After visual inspection of plots produced in the preliminary run, divide E(t) into three segments (whose durations will specify parameter 12 in figure 1) and estimate the rate of C(t). (Steps 2 through 5 below are illustrated in the sketch following this list.)

  2. Compute L(t) (parameters 13–14 and 23–24) for each of the three segments and subtract (thus removing this “basic envelope”) to obtain a modulation function

     $$M(t) = E(t) - L(t) = C(t) + R(t).$$

  3. Determine C(t) (by peak-detection and cosine interpolation) and subtract to obtain the random modulation by itself,

     $$R(t) = M(t) - C(t).$$

     C(t) is assumed to be a sequence of ramp-modulated cosines, each quasi-period of which has the form

     $$C_i(t) = \left(a_{1_i} + \frac{t - t_i}{T_i}\,(a_{2_i} - a_{1_i})\right)\cos\!\left[\phi + 2\pi\left(f_{1_i} + \frac{t - t_i}{T_i}\,(f_{2_i} - f_{1_i})\right)t\right]$$

     or

     $$C_i(t) = A_i\cos(\phi + 2\pi F_i t),\quad\text{with}\quad A_i = a_{1_i} + \frac{t - t_i}{T_i}\,(a_{2_i} - a_{1_i})\quad\text{and}\quad F_i = f_{1_i} + \frac{t - t_i}{T_i}\,(f_{2_i} - f_{1_i}),$$

     where the index, i, denotes successive cycles of the cosine modulation, the subscripts 1 and 2 indicate initial and final values, ϕ is a constant (phase) determining starting position only, $t_i$ is the time at the beginning of each cosine period, $T_i$ is the duration of the period, $a_{1_i} = a_{2_{i-1}}$, and $f_{1_i} = f_{2_{i-1}}$.⁴

     But we want to simulate C(t) more simply, as a cosine-function with both rate and range enveloped by single ramp-functions for each of the three segments of the tone. This simpler function, Q(t), may be derived as follows:

  4. Compute best-fitting linear functions (in three segments) for ai and fi. These will then reduce to one initial and one final value for each, A1 and A2, F1 and F2 (parameters 15–18 and 25–28). We can now represent the simplified quasi-periodic modulation, Q(t), as follows:

     $$Q(t) = \left(A_1 + \frac{t}{T}\,(A_2 - A_1)\right)\cos\!\left[\phi + 2\pi\left(F_1 + \frac{t}{T}\,(F_2 - F_1)\right)t\right],$$

     where T is the duration of the whole segment of the function.⁵

  5. Simulate R(t) as the output of the random function-generator in the music compiler, with rate and range enveloped as for Q(t) (i.e., by linear functions in three segments). This means (for the range) finding straight-line segments on both positive and negative sides of the zero-axis that contain all peaks inside them. But in order that their slopes be correct, a least-squares fit to relative peaks on each side should be found first and then shifted outward. For the rate, we can assume that the output of the random function-generator changes slope from positive to negative or from negative to positive at about half the rate at which new values are generated. Thus (for rate),

  5.1. locate points in R(t) where the slope changes from positive to negative or from negative to positive;

  5.2. store a function representing time-intervals between these successive points of change of slope; and

  5.3. compute a best-fitting straight line through this function. Initial and final rates for the random function-generator (parameters 21–22 and 31–32) will be double the values at each end of the line derived in step 5.3. Then (for range),

  5.4. select from the points derived in step 5.1 those that change from positive to negative and from negative to positive;

  5.5. compute a best-fitting straight line through each (positive and negative) set of points from step 5.4; and

  5.6. add a (positive or negative) constant to each of these straight lines so they are shifted just outside (or touching?) the outermost points on their respective sides of the zero-axis. Initial and final ranges for the random function-generator (parameters 19–20 and 29–30) will be the average of the two functions at each end (or one-half the distance between them at each end).
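A simplified single-segment sketch of steps 2 through 5, with a moving average standing in for the peak-detected cosine interpolation of step 3 and plain least-squares line fits throughout; the window size is a placeholder, and rates come out in reciprocal samples:

```python
import numpy as np

def line_fit(x, y):
    """Least-squares straight line through (x, y) -> (slope, intercept)."""
    return tuple(np.polyfit(x, y, 1))

def decompose_envelope(e):
    """E(t) = L(t) + C(t) + R(t) for one segment, in simplified form."""
    x = np.arange(len(e), dtype=float)

    # Step 2: subtract the best-fitting linear "basic envelope" L(t).
    a, b = line_fit(x, e)
    m = e - (a * x + b)                         # M(t) = C(t) + R(t)

    # Step 3: the text derives C(t) by peak-detection and cosine
    # interpolation; a moving-average smoothing stands in for it here.
    k = 51
    c = np.convolve(m, np.ones(k) / k, mode="same")
    r = m - c                                   # R(t), the random modulation

    # Steps 5.1-5.3 (rate): slope-change points of R(t), a line fitted
    # to the intervals between them; the generator rate is taken as
    # double the slope-change rate, i.e., 2 / interval.
    dr = np.diff(r)
    flips = np.where(np.sign(dr[:-1]) != np.sign(dr[1:]))[0] + 1
    intervals = np.diff(flips).astype(float)
    ia, ib = line_fit(np.arange(len(intervals), dtype=float), intervals)
    end_intervals = np.array([ib, ia * (len(intervals) - 1) + ib])
    rates = 2.0 / end_intervals                 # initial, final generator rate

    # Steps 5.4-5.6 (range): lines through positive and negative peaks,
    # shifted outward to contain all points; range = half their distance.
    pos, neg = flips[r[flips] > 0], flips[r[flips] < 0]
    pa, pb = line_fit(x[pos], r[pos])
    na, nb = line_fit(x[neg], r[neg])
    up = pa * x + pb + np.max(r - (pa * x + pb))    # shifted just outside
    dn = na * x + nb + np.min(r - (na * x + nb))
    ranges = ((up - dn) / 2.0)[[0, -1]]         # initial, final range
    return rates, ranges
```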

Discussion

The analytical procedure outlined above has the obvious advantage that the results will be in a form that makes them immediately applicable in synthesizing the sound of the instrument being analyzed. A direct link is thus provided between the analysis and the synthesis programs, so that the entire process could eventually be carried out in a single computer run (or at most, two, if we include the preliminary run needed to estimate certain parameters). The procedure has another advantage, however, perhaps more important than the first one. This was mentioned earlier, but it should be considered here in more detail. This second advantage has to do with the fact that the various component functions isolated from the original signal can be used to test the relative importance of different aspects of the signal—different components and types of variation—in the perception of timbre. This, in turn, would make possible an approach to an optimal information-reduction in the numerical description of the sounds. The successful synthesis of a given sound—in itself—does not guarantee that any such optimal description has been found. That is, while it does indicate that our analysis has provided a numerical description that is sufficient, it does not prove that this description is necessary in all its details. The only way to be sure that a particular component in a signal makes a real difference in the perception of the tone is to synthesize the tone with that component eliminated or replaced by some other component.

Such a strategy becomes very simple with the analytical procedure outlined here. For example, it has already been mentioned that Sr(t) and Sq(t) should be generated as sound and listened to, but many other possibilities emerge at that same intermediate stage of the analysis at which these two functions have been derived. Referring to figure 2, tones could be generated with other waveforms substituted for WF(t), with AM(t) replaced by simple linear functions (while FMt remains unchanged) and vice versa, etc. At the end of the analysis, it would be possible to make direct aural comparisons between the final synthesized tones and tones employing one or more of the original (unsimplified) modulating functions (AM(t) or FMt). By such means as these, then, it would become possible to make meaningful evaluations of the aural effects of the various simplifications, substitutions, and other operations that occur at the several stages in the analysis and synthesis of the tone.

Equipment, Facilities, and Personnel Costs

The equipment necessary for this project is already available at the Polytechnic Institute of Brooklyn, where the proposed research will be done. In addition to the principal investigator, two half-time graduate assistants will be needed to carry out some of the detailed work on certain aspects of the research. One of these will assist in problems of mathematical analysis and computer operations, the other in problems of physical analysis and electronic instrumentation. Funds are also being requested to pay for the use of the computer facilities. The research would be carried out over a period of two years, beginning in September 1966.

Principal Investigator

The principal investigator will be James C. Tenney. Since February 1959 he has been engaged in both experimental studies and practical utilization of various techniques of electronic music. His musical training previous to that time had been as a composer, pianist, and conductor, and he has remained active in these areas up to the present time (for further information on these activities, see the résumé attached to this report). But his interest in the new musical possibilities of electronic media began as early as 1952, when he first entered college. He became convinced that the fullest realization of the enormous resources of these new media would require more than a passing knowledge of mathematics, acoustics, and electronics, though these would be of little use until he had acquired a firm musical foundation. Accordingly, his studies have always included as much that was of a technical nature as was possible while still pursuing the ordinary musical curriculum. Thus, he holds the degree of master of music from the University of Illinois, while his schooling has also included two years in the engineering school of the University of Denver. He received additional training in acoustics and electronics at the University of Illinois and was laboratory assistant in the Electronic Music Laboratory there for two years.

From September 1961 through March 1964 he was an associate member of technical staff at the Bell Telephone Laboratories, doing research in physical acoustics, psychoacoustics, and electronic music, employing a digital computer for the generation of the sounds and sound-sequences used in these studies. During this time he became a proficient computer-programmer and gained additional training and experience in mathematics, electronics, and sound analysis.

Since April 1964 he has been research associate in the theory of music at Yale University, engaged in the two-year research project “An Experimental Investigation of Timbre” described in sections 1 and 2 of this report on a grant from the National Science Foundation.

References


David, E. E. 1961. “Digital Simulation in Research on Human Communication.” Proceedings of the Institute of Radio Engineers 49(1): 319–29.

David, E. E., M. V. Mathews, and H. S. McDonald. 1958. “Description and Results of Experiments with Speech Using Digital Computer Simulation.” In Proceedings of the 1958 National Electronics Conference, 766–75. New York: Institute of Radio Engineers.

———. 1959. “A High-Speed Data Translator for Computer Simulation of Speech and Television Devices.” In Proceedings of the Western Joint Computer Conference, 354–57. New York: Institute of Radio Engineers.

Mathews, M. V. 1961. “An Acoustic Compiler for Music and Psychological Stimuli.” Bell System Technical Journal 40: 677–94.

Mathews, M. V., J. E. Miller, and E. E. David. 1961. “Pitch Synchronous Analysis of Voiced Sounds.” Journal of the Acoustical Society of America 33(2): 179–86.

Mathews, M. V., J. E. Miller, J. R. Pierce, and J. Tenney. 1965. “Computer Study of Violin Tones.” Journal of the Acoustical Society of America 38(5): 912–13.

Tenney, J. C. 1963. “Sound-Generation by Means of a Digital Computer.” Journal of Music Theory 7(1): 24–70.

———. 1965. “The Physical Correlates of Timbre.” Gravesaner Blätter 26: 106–9.

Notes:

[These are excerpts from an unapproved proposal to the National Science Foundation dated June 30, 1966. The proposal was in three sections. Tenney originally planned to publish only the third of those sections in this volume. He later decided that some information from the first two sections should be included for context, but he left no prescription for how this material was to be chosen or incorporated. We have decided to preface the third section with selected excerpts from the first two sections that we believe may provide clarifying context.—Ed.]

(1.) [Note that LF, AM, WF, and HF are each real-valued functions of a real argument (time). P, Zt, and FMt, on the other hand, are all operators whose arguments and values are themselves functions—not real numbers, as this expression might suggest. Tenney described the expression as serving a mnemonic purpose. Where t appears as a subscript it indicates an operator that is time-varying.—Ed.]

(2.) Probably the best procedure for carrying out step 1 would be as follows:

  (1.) Fourier-analyze the steady-state region of the tone and compute spectral envelopes;

  (2.) compute an average center-frequency and an average bandwidth for each of the major peaks in these spectral envelopes (parameters 1–6); and

  (3.) use digital band-rejection filters (with fixed parameters) to flatten these peaks (thus compensating for the effect of any fixed resonances in the instrument). This would be done throughout the whole tone, not just the steady-state region.

(3.) The procedure for determining Zt would be as follows:

  (1.) Fourier-analyze the whole (already fixed-filtered) signal (S1(t)), compute spectral envelopes (again, as in step 1.2, above), and compute a best estimate of zero-positions for each period (e.g., assuming periodic spacing of the zeros in the spectrum, find the frequency-factor that, with its multiples, touches the lowest points in the spectral envelope; a sketch of this search follows this note);

  (2.) derive two functions, ZF(t) and ZB(t), representing the variations in time of the “frequency-factor” (from step 2.1, above) and the (average) bandwidth, respectively, of the zeros in the spectrum. Since these characteristics should be slowly varying, linear (ramp) functions (derived by computing a least-squares best fit to the sets of points in ZF(t) and ZB(t)) should be sufficient in precision (this step will specify parameters 7–10); and

  (3.) use emphasis-filters (i.e., digital bandpass filters) with variable parameters to remove the zeros and flatten the spectrum still further than in step 1.3, thus compensating for the effect of the bow (reed, lip, etc., depending on the instrument). In some cases (e.g., the flute), physical considerations might eliminate any necessity for locating zeros in the spectrum, and these steps could be skipped.
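The zero-position search in step (1) can be sketched as a scan over candidate frequency-factors, scoring each by how low the spectral envelope sits at its multiples; the scoring rule below is an assumption, since the note does not specify one:

```python
import numpy as np

def find_zero_factor(env_freqs, env_amps, lo, hi, steps=400):
    """Assuming periodic spacing of the zeros, find the frequency-factor
    whose multiples best touch the lowest points of a spectral envelope
    sampled at (env_freqs, env_amps)."""
    best, best_score = None, np.inf
    for factor in np.linspace(lo, hi, steps):
        multiples = np.arange(factor, env_freqs[-1], factor)
        # Mean envelope amplitude at the multiples of this candidate:
        # lower means the multiples sit closer to the envelope's dips.
        score = np.mean(np.interp(multiples, env_freqs, env_amps))
        if score < best_score:
            best, best_score = factor, score
    return best
```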

(4.) [At the time when he was reviewing this manuscript for publication in this volume, Tenney was aware of an error in the two expressions for Ci(t) here and the one for Q(t) below, but he did not complete their revision. If the frequency of Ci(t) is specified by the given linear interpolation function Fi, then the total phase (the argument of the cosine) will be given by an antiderivative of Fi multiplied by 2π. This yields

$$C_i(t) = A_i\cos\!\left[\phi + 2\pi\left(f_{1_i} + \frac{t - t_i}{2T_i}\,(f_{2_i} - f_{1_i})\right)(t - t_i)\right],$$

where Ai is as given. Similarly,

$$Q(t) = \left(A_1 + \frac{t}{T}\,(A_2 - A_1)\right)\cos\!\left[\phi + 2\pi\left(F_1 + \frac{t}{2T}\,(F_2 - F_1)\right)t\right]$$

below.—Ed.]
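The correction follows from the fact that a time-varying instantaneous frequency contributes a total phase of 2π times its antiderivative. A short check in the editors' notation:

```latex
% Total phase accumulated by the time-varying frequency F_i(t):
%   \phi(t) = \phi + 2\pi \int_{t_i}^{t} F_i(\tau)\, d\tau
\begin{aligned}
F_i(\tau) &= f_{1_i} + \frac{\tau - t_i}{T_i}\,(f_{2_i} - f_{1_i}) \\
\int_{t_i}^{t} F_i(\tau)\, d\tau
  &= f_{1_i}(t - t_i) + \frac{(t - t_i)^2}{2T_i}\,(f_{2_i} - f_{1_i}) \\
  &= \left( f_{1_i} + \frac{t - t_i}{2T_i}\,(f_{2_i} - f_{1_i}) \right)(t - t_i),
\end{aligned}
```

which is exactly the argument of the corrected Ci(t) above.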

(5.) It might be asked why two separate functions (C(t) and Q(t)) are involved in the analysis and how one can justify subtracting a function (C(t)) that is different from the function (Q(t)) that will later be used in the synthesis of the envelope. The answer is that some such procedure seems both necessary and sufficient. Necessary, because if the simpler function (Q(t)) were the one subtracted from M(t) (in step 3, above), there would be, in general, some of this quasi-periodic modulation left in the random modulation function, R(t) (wherever phase-differences occurred); sufficient, because any differences between C(t) and Q(t) should be scarcely perceptible in a synthesized tone. This is not an arbitrary assumption but is based on experiments in sound-synthesis with various kinds of enveloping on the quasi-periodic modulation parameters, where it was found that surprisingly large differences in the temporal evolution of these modulation parameters in two tones were imperceptible. However, if the procedure eventually proved to be inadequate, still another level of analysis could be undertaken to approximate the actual fluctuations in these parameters (probably by way of slower random functions). Such a further degree of complexity does not seem necessary now, however. It should also be noted that some of the discrepancies between C(t) and Q(t)—in terms of the general type of fluctuation they represent—will be compensated for by the random modulation. That is, the relative regularity of Q(t) will be more or less distorted by the random function-generator output, the input parameters for which are derived in the next few steps of the analysis.