ORIGINAL COMMAND LINE FORMAT
INFORMATION PAGE
CONTROLLING PARAMETERS WITH FUNCTIONS
(2) ANALYSIS ONLY:
PVANALYSIS (PVCANAL)
(3) SYNTHESIS ONLY: TWARP
(4) ADDITIVE SYNTHESIS
CHORDMAPPER
HARMONIZER
(5) Subtractive synthesis: CONVOLVER NOISEFILTER BANDAMP INHARMONATOR FILTER CHORDRESPONSEMAKER FILTRESPONSEMAKER
(6) Amplitude warping: SPECTWARPER
Feature extraction: ENVELOPE
Feature extraction: PITCHTRACKER (ECMC script exists, but this program currently is broken)
Routines for which NO ECMC scripts and examples exist: (Most of you can skip this entire section) FREQRESPONSE
CONTROL FUNCTION PROCESSING : RESHAPE (usable on ECMC Linux systems)
OVERLAP/ADD METHOD VS. OSCILLATOR BANK METHOD AND RESYNTHESIS THRESHOLDS
SOURCE
MULTIPLE CHANNELS
FLOATING-POINT AMPLITUDE RESCALING
OUTPUT STATISTICS
FREQUENCY RESPONSE TERMINAL OUTPUT
ANALYSIS FILES
DECIBELS
LOW/HI SHELF EQUALIZATION
WARP INDEX
PITCH TRANSPOSITION
FREQUENCY SHIFT
ENVELOPE RESPONSE TIME
RING DECAY TIME
FFT SIZE
WINDOW SIZE
WINDOW TYPE
FRAMES PER SECOND
TIME EXPANSION/CONTRACTION
BEGIN/END TIMES
GAIN
FILTERING: SOURCE SIGNAL LEVEL
TRANSPOSITION/SHIFT APPLICATION FLAG
FILTER TYPES: PASS OR REJECT
RESPONSE FUNCTION SMOOTHING
ANALYSIS DATA ACCESS MODE
CONVOLVER PANPOT
FREQUENCY RESPONSE ACCUMULATION METHOD
RING ROUTINES: FILTER PLACEMENT
COMPRESSION AND EXPANSION
UTILITIES
FILE CONVERSION: aiffs, aiffd, nexts, nextd, nextfloats (Most of you will never use these)
FUNCTION VIEWING: showme, showspect (Most of you will never use these)
PVC is a collection of phase vocoder signal processing routines and accompanying shell scripts for use in the transformation and manipulation of sounds. It is written in C and designed to be used in a UNIX environment. It has come about as a result of my path of education and research into phase vocoder technology. It follows in the spirit of the work by Eric Lyon (out of which PVC is built) and Chris Penrose whose particular DSP research springs from the coding and tutorial work of F.R. Moore and Mark Dolson. Moore's book, Elements of Computer Music, published by Prentice Hall, is therefore a great resource for making sense of the phase vocoder engine which I am unable to go into here. Curtis Roads's book, The Computer Music Tutorial, published by MIT Press, has sections on the phase vocoder as well; these may better introduce the beginner to the practical concerns of this technology. Short of the explanations these sources provide, I have attempted to offer below some explanations, particularly as needed for control of the parameters in these routines. A manual and tutorial would be great to have; unfortunately time has not yet made it so.
These routines reflect my need for tools which can perform different spectral resynthesis tasks; both simple and experimental. Their refinement has advanced with my growing skills and curiosity, which I expect will continue as long as I have questions about sound. Most of these routines can be viewed in terms of traditional additive or subtractive synthesis tasks, coming about as they did from the desire for greater finesse and control of these two basic types of synthesis. While the speculative nature of some gives them an idiosyncratic character, most should, with practice, reveal the transparency of their names if not the role they can play in the shaping of sound. All require a good ear tuned towards sound and idea as none of these routines are automatic, although many hold great potential for the diligent.
This 3.0 release contains only those routines which I think are stable, useful and moderately transparent. Some earlier versions have been omitted, replaced or consolidated into newer routines. For example, compander remains, but the ideas behind bandamp have ripened into spectwarper, a remarkable "super companding" tool for windowing amplitude, and balancing the resonance/noise-residues of a sound. The harmonic tone reorganizer, chordmapper, has continued to grow in its controls (however arcane), offering increasingly subtle ways to reorganize harmonic spectra. The noisefilter routine is now very good, having become a PVC first encounter routine for many whose noisy lives cross my path. Tvfiltdeviator now joins the arcane but novel filtdeviator routine. In addition, I have added a set of feature analysis routines (pitchtracker, centroid, envelope, fluxoid), which should be useful in generating function files to control different synthesis strategies. There are other, more experimental routines (some actually appeared in 2.0) which are still proving themselves; in time they will appear or reappear. As with 2.0, floating-point files (combined with a rescale feature) continue to be readable and writable. Someday I will deal with AIFF headers (although they do not offer floating-point values), but not for now.
Paul Koonce
koonce@music.princeton.edu
This version of the PVC html documentation has been edited from Paul Koonce's
original PVC 3.0 manual by Allan Schindler to
reflect usage of the PVC programs at the Eastman Computer Music Center.
Some portions of Koonce's original documentation have been omitted,
other portions shortened, edited or reworded, and I have placed some passages, like the paragraph above,
in a smaller font. These small font passages can be skipped by ECMC users who are just
beginning to use the PVC programs, but may be of interest to more advanced users.
Paul Koonce's complete original version 3.0 HTML documentation on the PVC programs
is available at
http://www.esm.rochester.edu/onlinedocs/PVC.3.0.README.html
and at
http://www.music.princeton.edu:80/winham/PSK/PVC.3.0.README.html
Within the ECMC version you are reading, information specific to using these
programs at Eastman,
like this passage, is printed in green font in the online version.
To make these ECMC annotations easier to spot in the printed grayscale version of this
document, the section
headers of these ECMC-specific annotations are enclosed in asterisks, as in "*** ECMC Notes: ***".
At the ECMC, we are running Koonce's PVC programs in
a Linux port by John Gibson. We likely will soon be installing Koonce's 4.0 Macintosh
version of
The PVC package includes about 20 separate programs, or "routines," briefly
described in the
However, while an initial glance at the PVC scripts may seem daunting,
things often are not so bad after all. All of the parameter options are provided
with default values. In general, if we accept all of the defaults, the result
will be straight resynthesis, and if all goes well the output soundfile should sound
identical to the original.
In your initial attempts at analysis/resynthesis, you
can skip over most of the parameter options, relying upon the default
values, and change only those parameters necessary to achieve the desired
musical result. Often, the result will be fine. But if not, you then can
adjust other parameters in attempting to achieve a more satisfactory
result or to eliminate artifacts.
All of the executable PVC programs, like plainpv and twarp, originally were designed to be run from a shell window
with the standard Unix command line syntax:
The cost, or downside, of these many options can be complexity in usage -- finding
your way to the particular parameter you wish to change, or in certain cases,
many parameter decisions (some of which you may not understand very well)
that must be made.
(*** End of this ECMC note ***)
Information about any routine can be seen by typing the name of the routine without any arguments or file name. Typing:
plainpv
produces the following information about plainpv.
plainpv: generic phase vocoder with dynamic controls
plainpv [flags] [input file (16-bit shorts)] [output file (optional)]
(values in brackets denote defaults)
N: FFT length (must be a power of 2) [1024]
M: window size in samples (must be a power of 2) [2*FFT]
(0 will automatically set window to 2*FFT size or larger)
w: window type: 0 = hamming, 1 = rectangular
2 = Blackman, 3 = Bartlett triangular [0.]
4-12 = Kaiser windows for alpha = 4-12, respectively
(representative sidelobe levels for alpha:
4 = -30dB, 8 = -58 dB, 12 = -90 dB)
D: analysis frames per second [200]
I: time expansion/contraction factor [1.]
(duration = duration * factor, 1. = original time)
P: pitch transposition in semitones (func) [0]
a: frequency shift factor
(bin frequency adder, before -P )(func) [0.]
b: begin time in seconds [0.]
e: end time in seconds ( 0. = end of file) [0.]
C: resynthesis channel (1 -> ?) (0 = all) [0]
SHELF EQ:(post transpose/shift)
H: SHELF EQ: Low shelf gain in dB (func) [0.]
X: SHELF EQ: High shelf gain in dB (func) [0.]
m: SHELF EQ: Low shelf frequency in Hz (func) [200.]
R: SHELF EQ: High shelf frequency in Hz (func) [2000.]
W: warp index for reshaping magnitude response (func) [0.]
Values > 0 expand the dynamic range,
values < 0 compress the dynamic range.
A: gain in decibels (func) [0.]
l: envelope attack time (func) [0.]
L: envelope release time (func) [0.]
T: BRICKWALL FILTER TYPE: 0 = bandpass, not 0 = band reject [0]
f: frequency window: low boundary
(before -P and -a) (in Hz) [0.]
F: frequency window: high boundary
(before -P and -a)(in Hz) [Nyquist frequency]
p: amplitude reports print mode: 0 = off, 1 = on [0]
i: time interval between amplitude reports [.25]
_: OUTPUT FORMAT: 0 = taken from input file
1 = 16-bit integer, 2 = 32-bit floats [0]
=: PEAK RESCALE LEVEL (float output only) 0 to -96 dB
Set to 1 to rescale to level of input file. [ 1 ]
TERMINAL DISPLAY AND GRAPH FILE OUTPUT
n: number of frames [0]
u: low bin frequency [-1]
U: high bin frequency
(-1 = nyquist) [Nyquist frequency]
S: TERMINAL DISPLAY: display option [0]
(0 = off, 1 = phase data, 2 = amp data, 3 = both)
c: GRAPH FILE: WRITE ascii to FILE
0 = off, 1 = freq, 2 = decibels [0]
3 = decibels - waterfall plot
(When on, this flag writes ascii point pairs
(with time frame on x axis) for plotting
with gnuplot.)
d: TERMINAL DISPLAY FILE NAME for -c [./ascii.out]
t: oscillator resynthesis threshold in decibels [ -96 ]
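Several of the numeric parameters on this info page relate to one another through standard conversions. The following Python sketch is only an illustration and is not part of PVC: the function names are hypothetical, and it assumes that the analysis hop is simply the sampling rate divided by the -D value and that bins are spaced at the sampling rate divided by the -N value.

```python
def bin_spacing_hz(sample_rate, fft_length):
    """Frequency distance between adjacent FFT bins (assumed: sample_rate / N)."""
    return sample_rate / fft_length


def hop_size_samples(sample_rate, frames_per_second):
    """Samples between successive analysis frames (assumed: sample_rate / D)."""
    return sample_rate / frames_per_second


def semitones_to_ratio(semitones):
    """Equal-tempered frequency ratio for a transposition given in semitones."""
    return 2.0 ** (semitones / 12.0)


def db_to_amplitude(db):
    """Linear amplitude factor for a gain given in decibels."""
    return 10.0 ** (db / 20.0)
```

With the defaults shown above (N = 1024, D = 200) and a 44100 Hz soundfile, this gives a bin spacing of about 43 Hz and a hop of about 220 samples. A -P value of 12 corresponds to a frequency ratio of 2, and an -A value of -6 to an amplitude factor of about 0.5.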
Parameters which have the word (func) on the info page just before the default as in:
W: warp index for reshaping magnitude response (func) [0.]
*** ECMC note: Information on using the function generating routines
is provided within the
GEN FUNCTION CONTROL OF PARAMETERS
section of this document.
(*** End of this ECMC note ***)
*** ECMC Notes: ***
As of this writing (December, 2003) templates ARE available for the following PVC programs:
To obtain a usage summary on using one of the available ECMC template scripts,
type the script name with no arguments.
Example: Typing plainpvtp with no arguments displays:
plainpvtp syntax: plainpvtp insound [outsound] [> scriptfile]
where "insound" is the name (and, if necessary, path) of the input
soundfile and the optional "outsound" argument is the name of the
output resynthesis soundfile. If the "outsound" argument is omitted
the output soundfile will be named "test."
After capturing this template in an ascii file and editing this
file, run plainpv with this script file with the command:
sh scriptfile
To see a "plainpv" template file without providing soundfile arguments type
plainpvtp -
As this usage summary explains, if we simply wish to see what a plainpvtp script file looks like, without providing any arguments, we can type
plainpvtp -
To obtain a script file to run plainpv using the soundfile /sflib/wind/fl.c4 as our input sound and to write the resynthesized output to a soundfile called pvcflutetest1.wav in our current working soundfile directory, we would type
To obtain a script to run pvanalysis use either the command pvanalysistp or else the alias pvcanaltp. To obtain a script to run twarp, use the command twarptp, and so on.
In addition to these tp script templates, I have created
example script files for many (but, again, not for all) of the PVC
programs. To obtain a listing of these example files, type
pvcex filename(s) or else getpvcex filename(s)
To display one or more of these example PVC script files through the paging program "less," type:
pvcex filename(s) | less or else getpvcex filename(s) | less
To capture one or more of these files, type:
pvcex filename(s) > outfile or else getpvcex filename(s) > outfile
where outfile is the name you want to give to this file.
Soundfiles in the sflib/x directory exist for all of these examples except for a few that do not create soundfiles, but rather analysis files or some other type of file.
To learn how to use plainpv, the most basic program
in the
An ECMC help file called pvc summarizes the usage information above and can be consulted for quick reference when using these programs.
After you have gained some experience working with plainpv, I
recommend exploring pvanalysis and twarp, and then (in any order)
chordmapper, convolver and harmonizer.
(*** End of this ECMC note ***)
*** ECMC Notes: ***
These scripts include a parameter called output_data_format, through which you can change the output format should you wish. Setting this parameter to 0 (the default) will cause the output resynthesis soundfile to be written to the same format as the input soundfile. Setting output_data_format to 1 will cause the output format to be integers -- 16 bit if the sampling rate is less than 50000, 24 bit if the sampling rate is 96k or any other rate higher than 50000. Setting output_data_format to 2 will cause the output format to be 32 bit floats, regardless of the input sampling rate and bit depth. In most cases, the output_data_format parameter should be left at the default value of 0, so that the output resynthesis soundfile has the same bit depth format as the input soundfile.
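The output_data_format rule just described can be summarized as a small decision function. This is a sketch only, not code from the ECMC scripts: the function name and the format labels ("int16", "int24", "float32") are hypothetical, and because the text does not say what happens at exactly 50000 Hz, the sketch assumes that rate is treated like the higher rates.

```python
def resolve_output_format(output_data_format, input_rate, input_format):
    """Return the output sample format implied by output_data_format.

    input_format is a label such as "int16", "int24" or "float32"
    (labels chosen for this illustration only).
    """
    if output_data_format == 0:
        # Default: match whatever the input soundfile uses.
        return input_format
    if output_data_format == 1:
        # Integers: 16 bit below 50000 Hz, 24 bit otherwise (boundary assumed).
        return "int16" if input_rate < 50000 else "int24"
    if output_data_format == 2:
        # 32 bit floats regardless of the input rate and bit depth.
        return "float32"
    raise ValueError("output_data_format must be 0, 1 or 2")
```

For instance, a 96k 24 bit input with output_data_format set to 2 resolves to 32 bit floats, while the same input with the default of 0 stays 24 bit.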
The ECMC script that runs the PVC program twarp, however, requires the user to explicitly specify whether the output soundfile should contain 16 bit ints, 24 bit ints or 32 bit floats. See the discussion of twarp for details.
Most of the ECMC pvcex example files employ 44.1k 16 bit input and output format. However, for each of the ECMC tp scripts, I have provided
at least one example that employs 96k 24 bit input and output format. These
96/24 example files include the character string 96 within the
example file name, and include plainpv96gourd, pvanalysis.96metrattle,
twarp96.metrattle, chordmapper96.1 and convolver96-1.
(*** End of this ECMC note ***)
*** ECMC Notes: ***
Phase vocoder jobs sometimes can take a long time to run. The PVC programs do update the output soundfile headers frequently, so that partially completed output soundfiles can be played before the job has completed. This must be done carefully, however. First you must suspend the PVC job (so that it does not continue to append samples to the soundfile while you are trying to play the soundfile) by typing ^z (control z). Then to play the partially completed output soundfile:
*** ECMC Notes: ***
Below is a listing of the routines contained in this release along with
a description of what each does. These programs are divided here into
two groups:
(*** End of this ECMC note ***)
*** (1) PVC Programs for which an ECMC tp script exists: ***
Plainpv is a basic phase vocoder with control of pitch transposition, frequency
shift, time scale, amplitude warp and low/high shelf equalization. It also
has some nice controls for looking at the data produced by the phase vocoder.
*** ECMC Notes: *** At Eastman, obtain a shell script template for
this routine with plainpvtp; edit this template, and then run the script
with the command: sh scriptfilename (where scriptfilename is the name
of the input script file you have created to run plainpv).
All example files listed here and below are available
in the hardcopy ECMC PVC EXAMPLE FILES binder in the studios.
Pvanalysis is the time-varying form of freqresponse that creates a phase vocoder analysis for use by other routines. The routines which
require pvanalysis files are twarp,
convolver, tvfilter, ringtvfilter and tvfiltdeviator.
*** ECMC Note: *** At Eastman, use
pvanalysistp (or else the easier-to-type
alias pvcanaltp) to obtain a template file to run this routine. Edit
this file, then type sh filename to create the analysis file.
See ECMC example files
pvanalysis.voicetest,
pvcanal2 and
pvanalysis.96metrattle.
Twarp is like plainpv except that it works from an analysis file rather than a soundfile.
This allows you to move forwards/backwards through time according to a time function
file,
and also allows you to introduce dithering (random deviations in time point
while reading the analysis file). Use of subtle dithering can help eliminate
artifacts that often result from time expansion with phase vocoder techniques.
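The idea of time point dithering can be pictured with a short Python sketch. It is not twarp's actual algorithm, which this document does not describe: the function name, the uniform random distribution and the clamping to the file duration are all assumptions made for illustration. Each output frame reads the analysis at its nominal time point plus a small random offset.

```python
import random


def dithered_read_times(duration, expansion, frames_per_second,
                        dither_seconds, seed=0):
    """Analysis read times (seconds) for a time-expanded resynthesis.

    duration          length of the analyzed sound in seconds
    expansion         time expansion factor (2.0 = twice as long)
    dither_seconds    maximum random deviation of each read point
    """
    rng = random.Random(seed)
    n_out = int(duration * expansion * frames_per_second)
    times = []
    for k in range(n_out):
        # Nominal read point: output frames advance more slowly through
        # the analysis when expansion is greater than 1.
        t = k / (expansion * frames_per_second)
        # Dither: nudge the read point by a small random amount.
        t += rng.uniform(-dither_seconds, dither_seconds)
        # Keep the read point inside the analysis file.
        times.append(min(max(t, 0.0), duration))
    return times
```

With dither_seconds set to 0 the read points advance in perfectly even steps, which is the situation in which time expansion artifacts tend to be most audible; a small nonzero value breaks up that regularity.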
*** ECMC Notes: ***
To use twarp:
Example twarp6 illustrates
time point dithering, and is a mix of ECMC examples
twarp6-1, twarp6-2, twarp6-3,
twarp6-4 and twarp6-5.
ADDITIVE SYNTHESIS -- HARMONIZER and CHORDMAPPER
These routines all allow for a kind of additive synthesis based on the remapping
of phase vocoder data according to some model. Each requires an ascii data
file specifying how phase vocoder information will be replicated or mapped.
This mapping is constant for the run of the routine.
Harmonizer works much like a commercial harmonizer in that it allows you to create
harmony against the source by adding a transposed copy of it. Here the concept
is extended by allowing for multiple harmonizations, each taken from a different
band of frequencies, output with separate gain.
*** ECMC Note: *** At Eastman, run
this program with a script initially obtained with the
command harmonizertp.
Chordmapper lets you specify how harmonically related groups of partials will be replicated
or mapped to produce chords. An input data file organizes the remapping
into tone groups, and includes ways to tune or neutralize the frequency
deviations of partials. Time-varying control of these features is available
as well. You can use this routine to build up thick chords from single tones,
or to delicately reorganize a harmonic spectrum.
*** ECMC Note: *** At Eastman,
run this routine with a script file obtained with the chordmappertp command.
See example files
An ECMC help file on
chordmapper
also is available.
Inharmonator lets you specify how the partials of one fundamental will be remapped or
deviated. While the more recent and developed routine chordmapper is probably better for this task, I have decided to leave this routine
in for now. (Think chordmapper.)
Subtractive synthesis: CONVOLVER
In its setup and control, convolver is very
similar to
tvfilter. Its processing, however, is different. In tvfilter, filtering is produced by multiplying the magnitudes from the polar form of the two analyses, leaving the phases (or frequencies) of the source intact while modifying the amplitudes
of those frequencies. Convolver goes a bit further by multiplying the two analyses in their Cartesian forms. This produces an intersection of the two spectra. Unlike tvfilter, which produces a shadowlike intersection, shadowing the analysis file
characteristic onto the input sound file, convolver creates a true spectral intersection, allowing only that which is common
to both sounds to be heard. The effect is a sound which is somewhat garbled as it outputs the more
intermittently common spectral components of the two. The form of the multiplication
in convolver does not allow some of the filter transposition controls associated with tvfilter. There is, however, a convolution panpot which offers control of the mix between the convolution and source sounds.
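The difference between the two kinds of multiplication described above can be seen on a single frame of complex FFT bins. The Python sketch below is an illustration under simplifying assumptions, not the PVC source: the function names are hypothetical, and it ignores windowing, overlap and the panpot. The polar-form product scales the source magnitudes and keeps the source phases; the Cartesian product is an ordinary complex multiplication, which yields the same magnitudes but adds the phases of the two inputs.

```python
import cmath


def magnitude_filter(source_bins, filter_bins):
    """tvfilter-style product: multiply magnitudes, keep the source phase."""
    return [cmath.rect(abs(s) * abs(f), cmath.phase(s))
            for s, f in zip(source_bins, filter_bins)]


def cartesian_product(source_bins, filter_bins):
    """convolver-style product: complex multiplication of the two spectra."""
    return [s * f for s, f in zip(source_bins, filter_bins)]
```

For a source bin of 1j and a filter bin of 2j, the first function returns approximately 2j (the source phase is kept), while the second returns -2 (the phases have been summed). In both cases a bin survives only to the extent that both inputs have energy there.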
*** ECMC Note: *** At Eastman,
use convolvertp to create a script file to run this routine, and see the
example files
convolver1, convolver2, convolver3 (a mix of two sources: convolver3-1 and convolver3-2), convolver4 and convolver5.
Spectwarper uses an expanded compansion scheme to highlight either a sound's stronger,
resonant components or its weaker noise/residual components. Spectwarper is fairly similar to compander; however, unlike compander, which compands bins against the constant peak of an input response file, spectwarper compands bins using a peak drawn (in the current frame) from a narrow frequency
band centered around the value being processed. This causes the compansion or "warping" of the
amplitudes to accentuate (expansion) or mask (compression) formants located within the frequency bands, the result being the noise/pitch highlighting mentioned earlier. Part of this
comes from the treatment of compression in spectwarper. Unlike compander, which only reduces the amplitude above the threshold when compressing, spectwarper reduces the amplitude of the entire range, becoming, in effect, an expander
of the strongest amplitudes that expands them (when the compression level
is severe) out of the picture. Spectwarper is one of my favorite routines of late simply because it provides such
a simple and powerful control over the noise and pitch characteristics of
a sound.
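The notion of companding each bin against a peak drawn from its own neighborhood, rather than against one fixed peak, can be sketched in a few lines of Python. This is a loose illustration and not spectwarper's actual formula: the function name, the half_width and exponent parameters, and the power-law shaping are all assumptions, and the sketch does not model the special treatment of compression described above.

```python
def local_peak_warp(magnitudes, half_width, exponent):
    """Reshape one frame of bin magnitudes relative to local peaks.

    half_width   number of neighboring bins on each side that define the band
    exponent     > 1 pushes bins below the local peak further down (expansion),
                 < 1 pulls them up toward the local peak (compression)
    """
    n = len(magnitudes)
    out = []
    for k, m in enumerate(magnitudes):
        # Peak of the narrow band centered on the bin being processed.
        lo = max(0, k - half_width)
        hi = min(n, k + half_width + 1)
        peak = max(magnitudes[lo:hi])
        if peak <= 0.0:
            out.append(0.0)
        else:
            # Normalize to the local peak, reshape, and scale back.
            out.append(peak * (m / peak) ** exponent)
    return out
```

Because the reference peak moves with the bin, a weak formant surrounded by weaker bins is treated as a local maximum and preserved, which a single global peak would not do.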
*** ECMC Note: *** ECMC users can obtain a script file to
run Spectwarper with spectwarpertp.
Example spectwarper96-1 illustrates the use of spectwarper to emphasize, or exaggerate,
the noise elements within a 96k marimba note.
ECMC example spectwarper96-2 illustrates
the reverse, emphasizing the pitched elements within the same marimba note
and de-emphasizing the percussive noise elements.
Using the pvcex or getpvcex command,
see the following example files :
plainpv1, plainpv2, plainpv3 (a mix of examples plainpv3-1 and plainpv3-2),
plainpv4, plainpv5 and plainpv6 (a mix of examples plainpv6-1 and plainpv6-2).
Example plainpv7
imposes the amplitude envelope of a maraca roll on a gong tone. The maraca roll
envelope was created by ECMC PVC example envelope1
Example plainpv8
incorporates a pitch analysis file created by ECMC PVC example
pitchtracker1
(However, as of this writing, the Linux version of pitchtracker is
broken, so example plainpv8 cannot be re-created and is rather academic.)
Example plainpv96gourd
is a 96/24 example.
The following ECMC example files illustrate various aspects and possibilities
of this program:
twarp1, twarp2 (a mix of example files twarp2-1 and twarp2-2),
twarp3, twarp4 (a mix of example files twarp4-1 and twarp4-2) and twarp5
Example twarp96.metrattle
is a 96/24 example employing a metallic rattle from /sflib96/perc.
See also example file
harmonizer1
(a mix of examples
harmonizer1-1 and
harmonizer1-2)
and example file
harmonizer2
(a mix of examples
harmonizer2-1 and
harmonizer2-2)
chordmapper1
(a mix of four source
soundfiles created by examples
chordmapper1-1,
chordmapper1-2,
chordmapper1-3 and
chordmapper1-4),
example
chordmapper2, and
chordmapper3 and
its four source files:
chordmapper3-1,
chordmapper3-2,
chordmapper3-3 and
chordmapper3-4.
chordmapper3 and its four sources are very similar to inharmonator1
and its four sources, using
slightly different procedures to obtain almost identical results.
*** (2) PVC Programs for which NO ECMC tp script exists: ***
Almost all ECMC users can skip the small font descriptions of the following programs,
and jump ahead to the
FEATURE EXTRACTION : ENVELOPE
Because no ECMC tp utility exists to create script files for the
PVC programs that follow,
usage of these programs is more difficult for ECMC users: you will have
to create your own script files, based upon examples provided by Koonce and
located in the directory
Freqresponse is a routine used by several others to prepare a spectrum for use with routines that filter, compress or limit. The response can be normalized or not depending on the needs of the routine which will use the response.
Noisefilter filters out the noise in a sound by subtracting out a frequency response. The frequency response is analyzed from a short segment in the file where noise alone is found. For sounds that do not have segments of isolated noise, there is a threshold mode.
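Subtracting a noise response from a sound can be sketched as follows. This Python fragment is only an illustration of the general spectral subtraction idea, not noisefilter's implementation: the function names are hypothetical, the noise response is assumed to be a per-bin average over the noise-only frames, and the threshold mode is not shown.

```python
def average_response(frames):
    """Per-bin average magnitude over a list of frames (noise-only segment)."""
    n_frames = len(frames)
    return [sum(column) / n_frames for column in zip(*frames)]


def subtract_noise(frame_magnitudes, noise_response):
    """Subtract the noise response from one frame, never going below zero."""
    return [max(m - n, 0.0) for m, n in zip(frame_magnitudes, noise_response)]
```

A bin whose magnitude is no greater than the measured noise level is silenced, while stronger bins are reduced by the noise level and otherwise left in place.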
Compander is a classic compressor/expander. What is different here is the use of a peaks response file. The peaks response file is a frequency response, analyzed from a segment of the sound, that is taken to represent the peak bin amplitudes for the sound. Each frequency bin of the peaks frequency response functions as the 0 dB reference point for that frequency bin. The amplitude of the frequency bin is companded relative to this reference.
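Companding one bin against its own 0 dB reference might look like the Python sketch below. It shows compression only, and it is an assumption-laden illustration rather than compander's code: the function name and the threshold_db and ratio parameters are hypothetical, borrowed from the way conventional compressors are usually described.

```python
import math


def compand_bin(amplitude, peak, threshold_db, ratio):
    """Compress one bin relative to that bin's entry in the peaks response.

    peak          the bin's value in the peaks response (its 0 dB reference)
    threshold_db  level, in dB relative to peak, above which compression acts
    ratio         compression ratio (2.0 halves the dB excess over threshold)
    """
    if amplitude <= 0.0 or peak <= 0.0:
        return 0.0
    # Level of this bin in decibels relative to its own reference.
    level_db = 20.0 * math.log10(amplitude / peak)
    if level_db > threshold_db:
        # Only the portion above the threshold is reduced.
        level_db = threshold_db + (level_db - threshold_db) / ratio
    return peak * 10.0 ** (level_db / 20.0)
```

A bin sitting at its reference peak (0 dB) with a threshold of -20 dB and a ratio of 2 comes out at -10 dB, while a bin at -40 dB is below the threshold and passes through unchanged.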
Bandamp is an older PVC program, no longer included in the current PVC distribution. (Its capabilities also can be realized with the newer spectwarper program.)
This program
is an amplitude windowing routine. Like compander, it uses a response
file, previously created with
Filter is a very useful routine for filtering a sound by a frequency response. Filtering is achieved by first creating the frequency response through either synthesis or analysis, followed by filtering with filter. Synthetic responses are created using either chordresponsemaker (which synthesizes a spectrum as a collection of harmonic tones), or filtresponsemaker (which synthesizes a frequency response using lines and breakpoints). Analyzed responses can be made with freqresponse (which analyzes a sound file segment and constructs a response representing the peak or average amplitudes). Once made, the magnitudes of the FFT response are multiplied against the time varying magnitudes of the input sound's FFT. Filter allows time-varying control of the response shape (warp), transposition/shift, compansion, smoothing, and source/filter mix, making this a very useful tool for quickly manipulating the spectral characteristics of a sound according to your synthetic or analytic goals. The synthetic forms can be run with the scripts S.filter_with_chord_synthesis or S.filter_with_breakpoint_synthesis; the analysis-based form with S.filter_with_analysis. The analytic form is a powerful tool for bringing the color of one sound into the realm of another.
Chordresponsemaker is a routine that uses a collection of harmonic tones, variable in size, to create a synthetic frequency response. It is found in various filtering scripts.
Filtresponsemaker is a routine that uses breakpoints and straight lines to create a synthetic frequency response. It is found in various filtering scripts.
Tvfilter is the time-varying (tv) form of filter. Tvfilter uses a pvanalysis file to change the magnitudes of the input sound file. As it is with filter, tvfilter multiplies the magnitudes of the analysis FFT against the magnitudes of the input sound's FFT, while preserving the frequency/phase characteristics of the input sound. Preserving the phase of the input sound file results in a cross-synthesis which sounds like the input sound file covered or suppressed by the shadow of the analysis file. Like filter, tvfilter offers a variety of controls for manipulating the filter characteristic. The use of a phase vocoder analysis to represent the filter characteristic also makes possible the temporal control of the filter file (i.e. backwards/forwards control) as found with twarp. Run this using the script S.tvfilter.
Ring uses the phase vocoder to create an all-pass resonator. It works by structuring the FFT resynthesis as a bank of feedback filters that feed back the sinusoid of each bin in a strength proportional to the amplitude of that bin (after adjustment by global feedback controls). This allows the sound to "ring" in a way something like reverb or comb filter resonance. The difference from comb filtering is that with ring spectral resonance is created not through a collection of comb filters selected for their ability to resonate various pulse wave spectra, but rather, through an array of feedback filters (sized by the FFT) that resonate a sine wave spectrum while dynamically tuning their feedback frequencies to the frequencies of the input sound. In short, it creates a kind of "self resonance". Ring is a nice way of increasing the resonant pitch characteristics of a sound, although it has its weaknesses. Ring works best with larger FFT sizes as it is attempting to synthesize or accentuate the more pitched/harmonic characteristics of the sound; this is something larger FFTs, with their increased frequency resolution, handle better. Use of the Kaiser window, with its low sidelobe amplitudes, helps as well. In addition, there is a threshold for preventing the noise features of a sound from being resonated, plus an EQ which can be positioned to filter either the source input to the feedback loop, or the feedback return. Run this using the script S.ring.
Ringfilter marries filter with ring by allowing a frequency response to be imposed on the resonance created with ring. Ringfilter begins to look more like multiple-delay, comb filter resonance since the static frequency response selects which frequencies will feed back. What is unique here is that the frequency response can come from an analysis, allowing the input sound to be resonated by the average spectral characteristic of another sound. A synthesized frequency response can be used as well. Like the EQ in ring, the filter in ringfilter can be positioned to either filter the source input to the feedback loop, or the feedback return where it will have the effect of introducing the filter characteristic more slowly through the resulting variable rates of decay. Run ringfilter with S.ringfilter_with_chord_synthesis to create a synthetic frequency response, and with S.ringfilter_with_analysis for an analyzed frequency response.
Ringtvfilter is to ringfilter what tvfilter is to filter; that is, it makes the filter in ringfilter time-varying. This is a sophisticated idea, that is, time-varying filtering of the resonance of a time-varying sound. The best characterization would be to say that Ringtvfilter imprints the shadow of one sound onto the reverb of another. Ringtvfilter requires some thought and finesse in order to separate and articulate the evolutions of the source, resonance, and filter. The best results are created using dynamic, high-profiled source sounds, rich with transient noise; and more constant, pitch/harmonic sounds for the time-varying filter. Like tvfilter, ringtvfilter requires an analysis file. Run this routine using S.ringtvfilter.
The idea behind filtdeviator is to use a frequency response function to not only filter a sound (as with filter), but to create a topology of frequency deviation working in correlation with the filter. Consequently, filtdeviator is filter with added parameters for specifying how the filter frequency response function will be mapped into the deviation of frequency. The added parameters set the base and peak deviation for how the response will be mapped into both pitch transposition and frequency shift, and how the function will be warped within the range set by these limits. There is also a master (0-1) deviation control for globally controlling the deviation. All the controls of filtdeviator allow you to dynamically vary the presence and effect of amplitude filtering and frequency deviation, making filtdeviator an interesting routine for exploring the way filters can be used to impede/transform the resonant signature of a sound. Using small amounts of frequency deviation, with no amplitude filtering, and a sweeping transposition of the filter will produce an effect something akin to the commercial guitar phase shifter; larger amounts of deviation take it into another place entirely. Adding the correlated amplitude filtering conceals the deviation more (positioning it more at the edges of formants), producing a sound something like the floppy resonant behavior of slide whistles. The scripts to run filtdeviator -- S.filtdeviator_with_chord_synthesis and S.filtdeviator_with_analysis -- are designed with frequency response synthesis/analysis sections like those for filter and ringfilter. Run this routine using either S.filtdeviator_with_analysis or S.filtdeviator_with_chord_synthesis.
Tvfiltdeviator is to filtdeviator what tvfilter is to filter; i.e. it uses a time-varying filter response in place of the constant one. This routine blows the lid off of what was unusual about filtdeviator. It's great for making wacky sounds out of ones with nice, fixed harmonies. The best use is to use it to deviate itself. Try taking something like a harpsichord or guitar (pitched stuff with decay) and do an analysis of the sound with pvanalysis. Then use the analysis to deviate the same sound. What happens is the strength of each of the sound's components becomes a control over the frequency deviation of that component, one that causes the sound to go "sproing" whenever it has any amplitude. Makes tonal music sound really broken. Run this routine with tvfiltdeviator.
Envelope is a routine for tracking the amplitude envelope of a sound. Output can
be ASCII, floats or a NeXT soundfile. Selecting floats or ASCII will produce
a file suitable for use in the control of a parameter.
*** ECMC Notes: ***
At Eastman, obtain a script to run this routine with envelopetp.
See example file envelope1, which is used in example plainpv7.
Pitchtracker is a routine for tracking the fundamental pitch trajectory of a sound. It
is an experimental routine that works, I believe, but forever has its quirks.
Three detection methods are available for following the 1) fundamental of
the harmonic collection, 2) the strongest formant, or 3) a band-limited
centroid. Different output formats let you see, hear and eventually use
the fruits of your pitch tracking.
*** ECMC Note: *** ECMC users can use pitchtrackertp to obtain a
template script file, edit this file and then use it to run pitchtracker.
See example file pitchtracker1. The analysis output produced by this example is used
in example plainpv8.
IMPORTANT NOTE: As of this writing the Linux version of
PITCHTRACKER IS BROKEN, AND CANNOT BE USED.
Centroid is a routine for tracking the centroid of a sound. The centroid is the average
of all the frequencies weighted by their amplitudes. It essentially gives
you a kind of center frequency value for your spectrum. The analysis can
be restricted to a band of frequencies, allowing the centroid to track a
particular frequency component (although pitchtracker can do this as well). Selecting floats or ASCII will produce a file suitable
for use in the control of a parameter.
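The amplitude-weighted average described above can be sketched in a few lines of Python. This is only an illustration of the definition, not the centroid routine itself; the function name spectral_centroid and its low/high band arguments are hypothetical.

```python
def spectral_centroid(freqs, amps, low=0.0, high=float("inf")):
    """Average of the bin frequencies, weighted by their amplitudes.

    low and high (hypothetical names) restrict the average to a band,
    mirroring the band-limited analysis mentioned in the text.
    """
    weighted_sum = 0.0
    amp_sum = 0.0
    for f, a in zip(freqs, amps):
        if low <= f <= high:
            weighted_sum += f * a
            amp_sum += a
    # A silent frame has no meaningful centroid; return 0 by convention here.
    return weighted_sum / amp_sum if amp_sum > 0 else 0.0


# Three equally strong partials: the centroid sits on the middle one.
print(spectral_centroid([100.0, 200.0, 300.0], [1.0, 1.0, 1.0]))
# A strong high partial is ignored once the band is limited to 500 Hz.
print(spectral_centroid([100.0, 200.0, 1000.0], [1.0, 1.0, 5.0], high=500.0))
```

Restricting the band is what lets the centroid follow a single component instead of the whole spectrum.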
Fluxoid is a routine for tracking the average frequency change of a sound. The average can be weighted (best) or not by the amplitudes.
Selecting floats or ASCII will produce a file suitable for use in the control
of a parameter.
CONTROL FUNCTION PROCESSING : RESHAPE
Reshape is a routine for transforming function streams to meet the needs of different
parameters. It takes a headerless float or ASCII function file as input and outputs
a headerless stream of float or ASCII values. With the appropriate flags,
it can be used to limit, resample, translate, warp, expand, shrink, invert, quantize, and lowpass filter the input values. The output can be translated into different amp or pitch
units depending on your needs. Run reshape at the command line.

Below are various terms, parameters, or ways of doing things which are common
to many of the routines.

OVERLAP/ADD VS. OSCILLATOR BANK METHODS AND RESYNTHESIS THRESHOLDS
The phase vocoder resynthesizes the signal using one of two methods, depending
on the type of changes made to the FFT. If the changes are only to the magnitudes
(amplitudes), then the faster overlap/add method is used. If however changes
in frequency are made, then the FFT integrity is compromised, necessitating
use of the oscillator bank method in which each bin is synthesized as a
sine wave changing in frequency and amplitude. This method is slower, although
a resynthesis threshold is available which can be used to increase the computation
speed by turning off bins whose amplitude falls below the threshold. A threshold
of -60dB is appropriate, although safety warrants using a lower threshold if the spectrum is thin and its decays exposed; use your ear.

SOURCE
The source sound is the original input sound. Some routines allow for the
mix of the processed sound with the original source sound.

MULTIPLE CHANNELS
All routines allow both monophonic and multi-channel input files to be processed.
With multi-channelled files, you can either select one channel and produce
a monophonic output file, or process all the channels. Channels are numbered
beginning with 1. Processing of multi-channelled files is done one channel
at a time beginning with channel 1, with zeros written to channels which
have yet to be processed. Processing one channel at a time requires less
memory and allows you to audition the output sooner than if you did all
channels at once.

FLOATING-POINT AMPLITUDE RESCALING
Selection of the floating-point output-file format invokes an amplitude rescaling feature. Once processing is complete,
a second pass through the sound file is made to rescale the values to the
decibel level specified. A dB rescale level of 1 causes rescaling to the
level of the original input file.
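As a rough illustration of the second-pass rescaling described above, the Python sketch below scales a list of samples so that its peak lands on a requested decibel level, with 0 dB taken as an amplitude of 1.0. The function names are hypothetical, and the special rescale value of 1 (return to the original input level) is not modeled.

```python
import math


def db_to_amp(db):
    """Convert decibels to linear amplitude (0 dB -> 1.0)."""
    return 10.0 ** (db / 20.0)


def amp_to_db(amp):
    """Convert a positive linear amplitude to decibels."""
    return 20.0 * math.log10(amp)


def rescale_to_db(samples, target_db):
    """Scale samples so that the peak magnitude equals target_db."""
    peak = max(abs(s) for s in samples)
    if peak == 0.0:
        return list(samples)  # silence cannot be rescaled
    gain = db_to_amp(target_db) / peak
    return [s * gain for s in samples]


# Floating-point output may exceed 1.0 before the second pass.
processed = [0.5, -2.0, 1.0]
rescaled = rescale_to_db(processed, -6.0)
print(max(abs(s) for s in rescaled), amp_to_db(0.5))
```

Because the gain is computed from the true peak, nothing needs to be clipped during the first pass.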
OUTPUT STATISTICS
Two flags are provided for controlling the output amplitude statistics;
one turns the statistics on or off, and the other sets how often they will
be reported. The statistics provide the peak output level in amplitude and
decibels. With integer format output files, output values exceeding the normalized
peak amplitude of 1.0 (0 dB) are clipped to a value of 1.0, and the statistics
are placed in clip mode; in clip mode reports are made only for frames where
clipping occurs. The peak amplitude, its time, and the number of clipped
samples are reported at the end of processing. With floating-point format
output files, output values exceeding the normalized peak amplitude of 1.0
are not clipped since they will be rescaled in the second pass; output statistics
proceed normally throughout. The levels before and after rescaling are reported
at the end of processing.

FREQUENCY RESPONSE TERMINAL OUTPUT
In many filtering or companding routines, a crude terminal print of the
frequency response is available. A flag sets the high cutoff frequency
for this output; a value of 0 (0 Hz) turns printing off.

ANALYSIS FILES
Analysis files are binary, 32-bit floating-point files written by pvanalysis, containing frames of FFT analysis data for one or more channels. Analysis
file data is preceded by a header containing information about the analysis.
Analysis files are much larger than the sound files they represent, and
increase in proportion to the FFT size used. As such, files can become very
large, so it is advisable to only make them when needed unless you have
disk space to spare.

DECIBELS
Amplitude is always handled in decibel units. The greatest magnitude of
the 16-bit short integer is equated with an amplitude of 1.0 or 0 dB. 0
dB functions as unity gain, and the peak amplitude in issues of compression,
expansion, and amplitude windowing. A change of +/- 6 dB represents a doubling
or halving of the amplitude. Increments of 10 dB are loosely associated
with one change in dynamic level. 16-bit shorts allow for a 96 dB dynamic
range. Take care not to lose signal level as a consequence of processing
since quantization noise will emerge when you attempt to regain your signal
level by amplifying the integer sound file.

LOW/HI SHELF EQUALIZATION
Equalization has been provided at various points in routines to allow for
the needed adjustment of spectra. The EQ consists of low and hi shelf segments,
whose width is adjusted through control of the shelf breakpoint frequency.
The region between the shelf segments is represented by a linear decibel
gradient between the decibel levels of the two shelves. Some routines implement
the EQ before pitch changes, others after. EQ placed before pitch changes
(pre-transpose/shift) will cause the EQ to be transposed with the pitch
changes, whereas afterwards (post-transpose/shift) will keep them fixed
as shifts and transpositions occur.

WARP INDEX
Many of the routines employ the principle of warping in which a distribution
of values is transformed by an identity function. In these places an exponential
function is employed to remap a 0-1 range of values into a new orientation
that preserves the minima (0) and maxima (1) while bringing the distribution
closer to either extreme as a result of the curvature of the exponential
function selected. The curvature of the exponential function is selected
through a warp index. Specifically, warp index w will reorient the input x through the function below (^ = exponentiation).

y = (1. - (e^(x * w))) / (1. - (e^w))

In this function, the warp index of 0 produces a linear function and an
untransformed output (taken as the limit y = x, since the expression itself is 0/0 at w = 0). Positive warp index values of increasing magnitude
produce curves of increasing concavity (increasing slope) that draw values towards the 0-valued minima, and reduce the function integral.
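A minimal Python sketch of the warp function given above may help in getting a feel for the index. The function name warp and the guard for very small indices are assumptions made for illustration; they are not taken from the PVC source.

```python
import math


def warp(x, w):
    """Remap x in the 0-1 range with warp index w.

    The formula is 0/0 at w = 0, so tiny indices are treated as the
    linear limit y = x (an assumption of this sketch).
    """
    if abs(w) < 1e-9:
        return x
    return (1.0 - math.exp(x * w)) / (1.0 - math.exp(w))


# Print a few points of the curve for a negative, zero and positive index.
for w in (-4.0, 0.0, 4.0):
    print(w, [round(warp(x, w), 3) for x in (0.0, 0.25, 0.5, 0.75, 1.0)])
```

Whatever the index, the endpoints 0 and 1 stay fixed; only the path between them bends.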
Negative values do the opposite, drawing values towards the maxima of 1,
increasing the integral. The practical use of this mechanism is found in various places. One such
place is the reshaping of the frequency response distribution characteristics.
In this, positive warp indices cause the peaks of the response to be accentuated
while the weaker frequencies are expanded out (i.e. pushed towards 0). Negative values have the opposite effect as they compress
the dynamic range of the response and raise the relative level of the weaker noise components. Another place where warp applies is in the remapping
of FFT amplitudes through the spectrum warpshape. In this, the successive
FFT frames have their amplitudes remapped by the identity function, similarly
expanding or compressing the dynamic range depending upon the warp specified;
0 (linear warp function) leaves the amplitudes unchanged.

PITCH TRANSPOSITION
With the pitch transposition control, a constant or function value is multiplied
against all bin frequencies. This is classic transposition, here specified
in semitones of transposition (12 semitones equals an octave). Conversion
is made to produce the appropriate frequency multiplier, 2^(semitones/12).

FREQUENCY SHIFT
With the frequency shift control, a constant or function value is added
to all the bin frequencies to produce a nonlinear pitch domain translation of the spectrum.
Frequency shift is related to things like ring modulation and their similarly
nonlinear shifts of pitch characteristics. Use this to create small distortions of the harmonic integrity of a sound.

ENVELOPE RESPONSE TIME
The rate at which amplitude changes are allowed to occur affects how smooth spectral evolutions will be. To control this, many routines contain attack and decay response time
controls; once translated, these controls manipulate the coefficients of the following filter.

y(n) = (1. - A) * x(n) + A * y(n-1)

The filter is a lowpass designed to increasingly smooth the sudden changes in a signal
as the value of the coefficient, A, is increased. Its control is through the response time parameter which
is the time in seconds it takes a signal, shifting from one state to another, to decay to -60 dB of its former state. Response times are transformed to create the necessary coefficients for the
selected frame rate. The response time is separated into attack and decay;
this allows separate control of the smoothing of the signal depending upon whether
it is increasing or decreasing in amplitude. Short attack/decay response
times can be used in places where dynamic processing induces garble or even
pops. You can use longer response times to generally smooth or blur the
onset/offset of sound components, particularly if the response controls
are being applied to a time-varying filter. When applied to amplitudes,
longer decay response times do not sound good, for in their delay of the decay, they
end up amplifying the residual noise of a sound.

RING DECAY TIME
Decay time is an issue in the feedback of the ring routines. Like response
time, it is the time it takes the signal to decay to -60dB of its former
state, or better, the time it takes the reverb to decay to -60dB.

FFT SIZE
The FFT size must be a power of 2. Larger FFT sizes resolve frequencies
better but transient behavior more poorly. Choose your FFT size according
to the sound you are working with. A size of 1024 or 2048 works well in
most cases.
*** ECMC Note: ***
FFT SIZE when working with 9624 soundfiles:
In my tests it is rarely a good idea to lower this parameter below 1024 when
the sampling rate is 96k. Artifacts usually will result if you try this.
(When working with 44.1k soundfiles it sometimes IS useful to use a value of 512.)
With 9624 soundfiles, raising the FFT size from the default of 1024 to 2048, or
in certain cases even to 4096, sometimes produces better results.

WINDOW SIZE
The window size is a less opaque parameter; like the FFT size, it must be a power of 2. Windows which are twice the size
of the FFT work well. Larger window sizes may resolve frequencies better.
Specifying 0 for the window size will automatically set the window to twice
the FFT size, a feature I have always used.
*** ECMC Note: ***
WINDOW SIZE when working with 9624 soundfiles:
This is a critical parameter with 9624 input soundfiles. The argument is in samples, and specifies how many samples are included in
each analysis frame. The frames need to be large enough to include at least one
cycle of the fundamental frequency.
Large window sizes create better frequency resolution but poorer temporal resolution,
in particular often smearing attacks.
At sampling rates of 48k and below setting this value to 2 * FFT_length
often produces the best compromise and audio quality.
The default PVC windowsize value of "0" actually is a flag that sets
the windowsize to 2 * FFT_length.
Thus if windowsize is set to 0, the sampling rate is 44.1k and the FFT_length is
1024, the window size is set to 2048 samples, equal to about .046 seconds (46
milliseconds).
For high sampling rates such as 96k and 88.2k, however, a window size of 2048 samples
analyzes a smaller unit of time (roughly half as long, slightly more than .02 seconds). This
can result in poor frequency resolution, especially for low pitched sounds.
In the ECMC PVC templates, therefore, if the sampling rate is above 48k and the
windowsize argument is set to the default 0, the ECMC script doubles the
default window size so that it is 4 * the FFT_length value, thus analyzing
approximately the same amount of time as when windowsize is set to 0 at a
44.1k sampling rate.
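The arithmetic behind this note is simple enough to check directly. The Python sketch below computes how much time a window covers and applies the default rule as described here (2 * FFT_length at 48k and below, 4 * FFT_length above 48k). The function names are hypothetical, and the rule is my reading of this note rather than code taken from the ECMC templates.

```python
def window_seconds(window_size, sample_rate):
    """Duration in seconds covered by a window of window_size samples."""
    return window_size / sample_rate


def default_window_size(fft_length, sample_rate):
    """Window size chosen when windowsize is left at 0 (assumed rule)."""
    factor = 4 if sample_rate > 48000 else 2
    return factor * fft_length


# Compare the time covered by the default window at several sampling rates.
for rate in (44100, 48000, 88200, 96000):
    size = default_window_size(1024, rate)
    print(rate, size, round(window_seconds(size, rate), 4))
```

With the doubled default, a 96k file is analyzed in windows of roughly the same duration as a 44.1k file, which is the point of the adjustment.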
In most cases this will solve the problem. But if you are unhappy with the audio
quality of the resynthesis and want to try fiddling with the windowsize argument,
be aware that you generally will need to set it to either double or four times
the value you would use if the input soundfile had a 44.1k sampling rate.
Also, if you set the FFT_length to a high value, such as 2048 or 4096 to
attempt to produce better frequency resolution, windowsize may need to be set
to a very high value: 4096, 8192 or even 16384.
With FFT_length raised from the default 1024 to 2048, setting the window_size to
4096 or to 8192 generally will produce good results.
With FFT_length raised to 4096, try setting the window_size to 8192, and if
this doesn't help, to 16384.
windowsize almost always should be set to at least twice the FFT_length value.
(*** End of this ECMC note ***)
WINDOW TYPE
The FFT and inverse FFT are computed using a window. Like the FFT size,
the shape of the window used can affect the quality of the analysis and
resynthesis. (See F.R. Moore, Steiglitz, or Roads for further explanation.)
A variety of windows are available including: Hamming, Rectangular, Blackman,
Triangular, and Kaiser (in 8 different forms as related to 8 different alpha
values). Blackman (-w2) or Kaiser (-w8) are recommended for most applications.
In some unusual cases where transient behavior is being lost, consider using
other windows such as the Rectangular, although take care to assure that
it is not producing pops or a buzzy sound.

FRAMES PER SECOND
The frames per second parameter controls how often the phase vocoder will perform an analysis on the
signal. It is a translation of the classic decimation control which specifies
how many samples to skip between analysis frames (at a 44.1k sampling rate, 200 frames per second means a skip of about 220 samples). More frames increase
the resolution of time but decrease speed. 200 frames per second is a good
reference point. If you expand time you should increase this proportionately
to maintain about 200 or more frames per second.

TIME EXPANSION/CONTRACTION
Once the spectral modifications are made to the FFT analysis, an inverse
FFT is invoked to produce the samples of a time-domain signal. The classic
phase vocoder paradigm controls the number of samples through the interpolation
value and its relation to the decimation. The arcane relationship of decimation
and interpolation is here translated into the parameter of time expansion/contraction,
allowing for the direct scaling of time. Use values greater than 1 to expand time, less than 1 to contract it.

BEGIN/END TIMES
Processing may be performed on an entire file or a segment of it by specifying
begin and end times. End times less than or equal to 0 default to the end
of the input file.

GAIN
The output and other components can be gained. 0 dB represents unity gain,
no change. See decibels.

FILTERING: SOURCE SIGNAL LEVEL
The mix of source and filtered sounds in the filter routines can be controlled by the source
decibels floor. This value, taken from the -96 to 0 dB range, specifies
the level of the source signal. The filtered signal level is equal to (1
- source amplitude floor). Consequently, the source level functions as a
floor over which lies the filtered signal. A source floor of 0 dB would neutralize
filtering since there would be no filter range above the floor, a floor
of -96 dB would produce the full effect of the filter.

TRANSPOSITION/SHIFT APPLICATION FLAG
Filter routines which allow for transposition and frequency shifting of
both filter and source have a flag which specifies whether transposition/shift
should be applied before or after filtering. If it is applied before, the
pitch transposition trajectory will evolve independent of the filter's trajectory
of transposition. If it is applied after, then the pitch transposition trajectory
will be added to the filter transposition trajectory, causing the filter
to move in parallel with the pitch transposition movements plus any movements
the filter transposition parameter adds.

FILTER TYPES: PASS OR REJECT
Filters can be toggled to use frequency responses in pass or rejection mode.
In pass mode, the response's stronger magnitudes are used to pass source through the filter; in rejection mode, to impede or reject components. In rejection mode, the response is created by inversion
in the decibel range, not amplitude. In time-varying filtering (tvfilter), rejection can be in mode 1 in which the response is inverted against
a constant 0 dB peak, or in mode 2 in which the response is inverted against
the current analysis frame's peak amp. Spectral warping is always applied
after the response has been transformed by rejection.

RESPONSE FUNCTION SMOOTHING
Many routines which use frequency response files to filter or warp amplitudes have a control which allows the response to be smoothed. The smoothing
is produced by replacing the magnitude of a frequency bin with an average
taken from a band centered around that bin. The degree of smoothing is controlled through manipulation of a bandwidth value, specified in octave units. Larger bandwidths produce greater
degrees of smoothness; 0 turns smoothing off.

ANALYSIS DATA ACCESS MODE
Routines which use analysis data made with pvanalysis (twarp, convolver, tvfilter, ringtvfilter, and tvfiltdeviator) access the data in the same way: using the time-point, rate, and data window boundary parameters, set to function in either rate or explicit mode. In rate mode, the rate determines the speed of movement through a data file; the time-point sets the starting position. The rate may be positive (forward in time) or negative (backwards in time), and
vary according to a function. Explicit mode uses the time point parameter to specify exactly where the analysis data should come from (units here are in the time of the analyzed sound). (Explicit mode does not use the rate control, and makes sense only if the time-point is controlled with a function.) Both rate and explicit modes abide by the upper and lower data window boundaries which delimit the data range. When the time-pointer moves beyond the specified upper and lower time boundaries, it re-enters
the window from the other end, making the window into a circular/modular
structure. The boundaries can be controlled with functions as well, giving
this mode an expressive dimension far surpassing the time expansion/contraction
parameter. There is also an auto-stop feature that, when turned on, causes processing to stop when it reaches
the end of the analysis.

CONVOLVER PANPOT
The convolver routine has a unique panpot mechanism for controlling the mix of input
sounds (A and B) with their convolution. The panpot is a crossfade mechanism
that uses a -1 to 1 control range to accentuate either sound A, B or their
convolution. A value of -1 produces an output consisting entirely of sound
A, a value of 1, sound B. The 0 between these extremes produces the convolution of A and B. Values between these points produce
a crossfade mix of either A or B and the convolution. For example, a trajectory from -1 to 1 would crossfade from sound A into the convolution,
and on to sound B. Separate gain controls for A, B and the convolution make it possible to tune the continuity of this
trajectory. In addition, the presence or spread of the convolution into the crossfade range can be tuned with the domain warp controls. The domain warp reshapes the movement through the crossfade range, allowing you to
create a more gradual approach from A or B into the convolution center.
This is achieved through a simple nonlinearizing of the crossfade domain
in warp index style. Increasingly positive domain warp values (specified independently for each side) transform the linear trajectory towards the convolution into a decelerating
one, causing the subtle mix area around 0 to be expanded. Therefore, if you want to hear more convolution in your crossfade, increase
the panpot domain warps.

FREQUENCY RESPONSE ACCUMULATION METHOD
Several of the response-producing routines have the option of accumulating
the response by either peak or average means. Whereas peak responses represent
the record of a sound's thresholds (or synthesis specification's highest values), average responses represent
the most common characteristics. If the sound you are analyzing has intermittent
moments of sound whose peak characteristics you wish fully represented in
the response, use the peak mode; otherwise use the average.

RING ROUTINES: FILTER PLACEMENT
Ringfilter and ringtvfilter
(for which there are no ECMC tp scripts) use frequency response functions to filter the reverb.
Two filtering modes are available, in which either the source input to the feedback is filtered, or the feedback itself. When the response is used to filter the source input, it filters the signal before it enters
the feedback mechanism, imposing its characteristic, from the start, on the feedback. However, when positioned to filter the feedback component, the response's spectral characteristic appears in the reverb gradually as the signal
decays. In this mode, the time it takes the signal to decay into the response
characteristic is controlled by an additional decay time associated with
the filter.

COMPRESSION AND EXPANSION
Spectral compression and expansion play a role in many routines. Their implementation
here is according to the traditional model that uses thresholds and magnitudes of compression/expansion to reduce or enlarge
the dynamic range of a signal. With spectral compression, amplitudes that
exceed the specified compression threshold are reduced by an amount determined
by the decibels of compression (a multiplier of the bin's amplitude lying
above the threshold). Expansion works in a similar fashion, except that
it changes the amplitudes below, rather than above, the expansion threshold;
this results in an expansion of the dynamic range as the bins falling below the threshold
are made to cover a wider range. The term companding or compander is a merging of the two names, useful in
situations where they are both available. While compander is the most obvious example of a routine using companding, traditional
compression can be found in several other routines that involve filtering.
It is not uncommon, in those routines, to reduce the dynamic range of an
analyzed frequency response, particularly if it is time-varying, since the
goal in filtering is more about color than dynamic range. In all routines that use some form of companding, the dynamic range of the
unprocessed signal/response is assumed to lie between 0 and -96 dB; thresholds
are chosen from within this range. The degree of compression or expansion,
expressed in decibels, represents how much the signal lying beyond the threshold
will be reduced. A value of -6 dB would halve the dynamic range above the
threshold in compression, or double the range below the threshold in expansion. Compander applies compression for each frequency bin separately rather than as a
macro gain change. It does this by using a frequency response file, created
with freqresponse, to establish a unique, 0 dB point of reference for every bin; using its unique point of reference, every bin is compressed or expanded.

UTILITIES
FILE CONVERSION: aiffs, aiffd, nexts, nextd, nextfloats
The sound file conversion scripts aiffs, aiffd, nexts, nextd, and nextfloats
are shell scripts available for converting sound files back and forth between
aiff and next formats, or from next to floats. They are all effectively
SGI scripts since they use the SGI sound file format conversion utility, sfconvert. Aiffs and aiffd take next integer files and write new aiff files, nexts and nextd the opposite; in addition aiffs and aiffd can be used to write new aiff integer files converted from next float files.
Nextfloats writes a new float file from a next integer file. The s or d following the aiff or next in the name stands for the action taken on the original file once the new
file is made; the s saves the original file (i.e. does not delete it), the d causes it to be deleted. Multiple files may be converted with the same run
of the command. Running the command without any input files will produce
a description of the routine.
FUNCTION VIEWING: showme, showspect
Two graphing scripts are available for viewing functions and spectral data.
You must have gnuplot installed on your computer to use them (Type gnuplot
<CR> to see if you do). Showme is a simple script for viewing function files. Run without an input file
for a description. Showme takes headerless floating-point or ASCII (give -a flag) function files
and plots them. Showspect plots the file of FFT amplitude or frequency data
produced by the plainpv script, S.plainpv_with_printout_and_graph_files. Showspect is useful for seeing a graphic representation of a very particular part
of an analysis; it is not a substitute for a standard spectrogram application.

GEN FUNCTION CONTROL OF PARAMETERS
Any parameter whose flag on the routine's information page has the word (func) after it
gen4 -L1000 0 -3 0 1 3 > $SFDIR/ptrans ;
Such function file definitions may be
placed near the top of a script file that runs a PVC
program, before the arguments to
plainpv or whatever other PVC program is being used,
and we may group all of these function file definitions together.
Alternatively they can be created within the body of a PVC file, perhaps
just before, or even just after, the parameter which they control.
Lines in shells can be continued onto new lines with the backslash, which
comes in handy with gen functions. The above, for example, could be entered
as:
Many of the ECMC PVC example files, including plainpv5, twarp1 and harmonizer1,
include function table definitions. These function generating routines
are similar in several respects to the gen routine in
Csound. However, whereas
Csound stores function tables in RAM, PVC requires
that these tables be written to disk files. To create a function file, copy the appropriate gen routine
template line to a new line, removing the leading # comment
symbol, edit the line, and specify a file name for the output. Although
these files are fairly small -- typically 1 kb --
I recommend writing them to your $SFDIR ("current working
soundfile") directory, rather than to your current Unix directory
or to /tmp.
The gen routines you are most likely to find useful are those that
create time/value envelope shapes: gen1, gen3 and gen4.
A quick tutorial on gen1 through gen5 is provided below.
Those who want additional information on CMUSIC gen routines can
consult Appendix D in F. Richard Moore's Elements of Computer
Music text (on reserve at Sibley for CMP 421-2).
Excerpts from this appendix are included as an appendix to the hardcopy
ECMC PVC EXAMPLES binder available in the studios.
gen1 creates linear {straight line} segments, like Csound gen 7
Note: You cannot look at the values within these function tables,
since they are in binary format. If you would like to see the values
in a table, to make sure you are getting what you want, before you
run a job, remove the file redirect at the ends of lines like those
above and include an exit command:
*** ECMC Note: *** No ECMC tp script or examples exist for centroidtp
*** ECMC Note: *** No ECMC tp script or examples exist for fluxoid
(*** End of this ECMC note ***)
*** ECMC Note: *** Most ECMC users will never use the floating point
option, and thus will never use this rescaling option, although I have included
it near the bottom of the user parameter section of the ECMC tp script files.
(*** End of this ECMC note ***)
*** ECMC Note: *** ECMC users probably will never
need to use these file format conversion utilities.
Some of the file conversion utilities mentioned here do not work
on the ECMC Linux systems anyway.
[or, within an ECMC template script file, includes
the comment # int, float or FUNC]
can be controlled by a
function file.
To make these files, complete CMUSIC gen command lines are inserted
into a script, like this:
##### Cmusic function file generator templates #####
# gen0 normalizes function files previously created with other gen routines
# gen0 -Llength max < inputfuncfile > outputfuncfile
# gen1 creates linear {straight line} segments, like Csound gen 7
# gen1 -Llength t1 v1 ... tN vN
# gen2 generates harmonic waveforms from sine {a} & cosine {b} amps
# gen2 -Llength [-o (default) or -c] a1 ... aN b0 ... bM N
# gen3 generates amp values & linear connections at equally spaced time points
# gen3 -Llength v1 v2 ... vN
# gen4 generates exponential segments; "a" values determine shape &
# depth of curve: 0 = linear, neg. = exponential, pos. = inverse expo.
# gen4 -Llength t1 v1 a1 ... tN vN
# gen5 is like Csound gen 9 : harmonic1/amp/phase harmonic2/amp/phase
# gen5 -Llength h1 a1 p1 ... hN aN pN
# gen6 generates a table of random numbers between +1 and -1
# gen6 -Llength
# cspline: smooth curve {cubic spline} interpolator
# cspline len_flag [flags] x0 y0 x1 y1 ... xN yN
# genraw reads in a previously created function file
# genraw -LN filename (where N is the length of the output function.)
# For a usage summary of "reshape" type "reshape" with no arguments.
##### End of gen routine function generator templates #####
syntax: gen1 -Llength t1 v1 ... tN vN
Examples: Either of the following two lines would generate an identical result:
gen1 -L100 0 0 .5 2.5 1. 0 > $SFDIR/updown
gen1 -L100 0 0 50 2.5 100 0 > $SFDIR/updown
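To see why the two gen1 lines above give the same table, here is a small Python stand-in for a breakpoint generator. It assumes that the time values are relative and are normalized by the last time value; that assumption is consistent with the claim above, but the code is not the CMUSIC implementation, and the name gen1_like is hypothetical.

```python
def gen1_like(length, breakpoints):
    """Linear segments through t1 v1 ... tN vN (times in increasing order).

    Assumption of this sketch: times are rescaled so that the first maps
    to the start of the table and the last to its end.
    """
    times = breakpoints[0::2]
    values = breakpoints[1::2]
    t0, t_end = times[0], times[-1]
    norm = [(t - t0) / (t_end - t0) for t in times]
    table = []
    for i in range(length):
        x = i / (length - 1)
        # Advance to the segment that contains x.
        k = 0
        while k < len(norm) - 2 and x > norm[k + 1]:
            k += 1
        span = norm[k + 1] - norm[k]
        frac = (x - norm[k]) / span if span > 0 else 0.0
        table.append(values[k] + frac * (values[k + 1] - values[k]))
    return table


a = gen1_like(100, [0, 0, .5, 2.5, 1., 0])
b = gen1_like(100, [0, 0, 50, 2.5, 100, 0])
print(a == b)
```

Only the proportions between the time values matter under this assumption, so you can use whichever time scale is most convenient.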
gen1 -L100 0 0 .5 2.5 1. 0
exit
This will cause the table values to be displayed in your shell window.
syntax: gen2 -Llength [-o (default) or -c] a1 ... aN b0 ... bM N
Example:
gen2 -L100 1. 0 1/3 0 1/5 0 1/7 0 # > $SFDIR/square
Result: harmonics 1,3,5,7 in sine phase are created with proportions of a square wave
gen3 generates amplitude values and linear connections at
equally spaced time points
syntax: gen3 -Llength v1 v2 ... vN
Example:
gen3 -L100 1 .35 1.2 0
Result: Values decrease linearly from 1. to .35 1/3 of the way through the table, then increase from .35 to 1.2 at 2/3 of the way through the table, then decrease from 1.2 to 0 at the end of the table
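The result described for the gen3 example can be checked with a short Python stand-in. It assumes the values sit at equally spaced points that include both ends of the table; the name gen3_like is hypothetical and the real gen3 may place its points slightly differently.

```python
def gen3_like(length, values):
    """Linear interpolation through equally spaced values (a stand-in for gen3)."""
    segments = len(values) - 1
    table = []
    for i in range(length):
        pos = i / (length - 1) * segments   # position measured in segments
        k = min(int(pos), segments - 1)     # index of the left-hand value
        frac = pos - k
        table.append(values[k] + frac * (values[k + 1] - values[k]))
    return table


table = gen3_like(100, [1, .35, 1.2, 0])
# Values at the start, at 1/3, at 2/3 and at the end of the table.
print(table[0], table[33], table[66], table[99])
```

With four values there are three equal segments, which is why the breakpoints fall at 1/3 and 2/3 of the table.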
gen4 can be a powerful but complicated envelope generating routine to use because one must specify 3 values for each breakpoint except the last, where only 2 arguments are necessary. These arguments for each breakpoint are:
time (t), value (v), and (a), which determines the slope of the curve between this breakpoint and the next.
Syntax: gen4 -Llength t1 v1 a1 t2 v2 a2 ... tN vN
Example:
gen4 -L50 0 -2. 1 .33 4 1. .67 2.5 -1 1. 0 > $SFDIR/gliassando
t1 v1 a1 t2 v2 a2 t3 v3 a3 t4 v4
Result: Values in the table move with an inverse exponential slope from -2. to 4. over the first 1/3 of the table, then from 4. to 2.5 over the second third of the table, then exponentially from 2.5 to 0 over the final third of the table.
gen5 is similar to Csound's gen9, generating harmonic (or, less often, inharmonic) waveforms. The user specifies one or more partials, and for each partial, three arguments: the partial frequency (as a multiplier of a fundamental of 1), its relative amplitude (on a scale of 0 to 1), and its phase (between 0 and 360 degrees). The resulting table numbers typically have values between +1. and -1.
Syntax: gen5 -Llength h1 a1 p1 ... hN aN pN
Four examples:
(1) gen5 -L1000 1 1 0 > $SFDIR/sine # sine wave
h1 a1 p1
Result: One cycle of a sine wave with values between +1. and -1.
(2) gen5 -L1000 3 1 90 > $SFDIR/harm3
Result: Three cycles (the third harmonic) of a cosine wave (a sine wave with a 90 degree phase shift)
(3) gen5 -L1000 2 1 0 4 .5 0 7 .2 0 > $SFDIR/harm247
h1 a1 p1 h2 a2 p2 h3 a3 p3
(4) gen5 -L1000 1 1 0 # | reshape -b0 -B1. # > $SFDIR/tempfunc
Result: A sine wave with values rescaled between 0 and 1.
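The gen5 examples above can be approximated with the following Python sketch, which sums sine partials given as (harmonic, amplitude, phase in degrees) triples. The names gen5_table and rescale are hypothetical, the table is assumed to hold one period of the fundamental, and the rescale step is only a guess at what reshape -b0 -B1. does (a linear mapping of the table minimum and maximum onto 0 and 1), based on the result described in example (4).

```python
import math


def gen5_table(length, *triples):
    """Sketch: sum of sine partials given as (harmonic, amplitude, phase) triples."""
    table = [0.0] * length
    for h, a, p in zip(triples[0::3], triples[1::3], triples[2::3]):
        for i in range(length):
            table[i] += a * math.sin(2 * math.pi * h * i / length
                                     + math.radians(p))
    return table


def rescale(table, lo, hi):
    """Linearly map the table's minimum to lo and its maximum to hi (assumed)."""
    tmin, tmax = min(table), max(table)
    return [lo + (v - tmin) * (hi - lo) / (tmax - tmin) for v in table]


sine = gen5_table(1000, 1, 1, 0)                      # example (1)
cos3 = gen5_table(1000, 3, 1, 90)                     # example (2)
harm247 = gen5_table(1000, 2, 1, 0, 4, .5, 0, 7, .2, 0)  # example (3)
unipolar = rescale(sine, 0, 1)                        # example (4)
print(round(sine[250], 3), round(cos3[0], 3), min(unipolar), round(max(unipolar), 3))
```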
From example file plainpv5:
gen4 -L1000 0 -90 0 \
.1 12 0 \
.8 3 0 1 -90 > $SFDIR/ampfunc
Two functions from example file plainpv6-1:
gen3 -L1000 5 5 -5 -5 5 > $SFDIR/spectrumfunc
The values remain at 5 during the first 1/4 of the table, then move linearly from +5. to -5. during the second quarter of the table. They remain at -5 during the third quarter of the table, then move linearly from -5. to +5. during the last 1/4 of the table.
gen4 -L1000 0 -2 1 \
.25 -2 1 \
.5 4 1 \
.8 4 1 \
1 2 > $SFDIR/pitchfunc
The floating point values in the file
To avoid cluttering your soundfile directory with these temporary function files, include lines at the very end of a script file (after the OFFICE USE ONLY section) deleting these temporary files, as at the end of example file plainpv5:
Note: If a function definition you create contains an error that makes it impossible for the gen routine to create this function, an error message will be displayed, and the program will ignore this function and use the default values for the parameter(s) where this non-existent function is intended to be used. However, this error message will scroll by quickly near the top of the voluminous diagnostic messages from the program, and it is easy to miss this error. To do a test run of your function definition(s), place an exit command immediately after your function definitions, which will terminate the program at this point, without running the PVC analysis and resynthesis:
gen1 -L1000 0 0 .2 0 3.4 -96 3.68 -96 > $SFDIR/rampdown
gen1 -L1000 0 -96 .2 -96 3.4 0 3.68 0 > $SFDIR/rampup
exit
Run the program. If you get an error message, check your function definition(s) for errors, make any necessary corrections, and run the program again. If you get no error messages, remove the exit line and run the program.
Below is a sample of the output from plainpv.
plainpv -N1024 -M0 -w2 -D400 -I2 -a-0 -P2 -A0 -C0 -t-96 -b0 -e0 -H0 -m200
-X0 -R2000 -L0 -l0 -W0 -T0 -f-1 -F-1 -_1 -=1 -p1 -i.25 /S1/t.snd /S1/cm.mix.snd
/////////////////////////////////////////////////////////////////////
---------------------------------------------------------------------
============================== PLAINPV ==============================
---------------------------------------------------------------------
========================== INPUT SOUNDFILE ==========================
INPUT FILE: FILENAME = /S1/t.snd
INPUT FILE: SAMPLE RATE = 44100
INPUT FILE: NUMBER OF CHANNELS = 2
INPUT FILE: DURATION = 2.770386
INPUT FILE: BEGIN TIME = 0.000000
INPUT FILE: END TIME = 2.770386
INPUT FILE FORMAT: 16-BIT INTEGER
========================== OUTPUT SOUNDFILE =========================
OUTPUT FILE: FILENAME = /S1/cm.mix.snd
OUTPUT FILE: SAMPLE RATE = 44100
OUTPUT FILE: NUMBER OF CHANNELS = 2
OUTPUT FILE FORMAT: 16-BIT INTEGER
OUTPUT FILE: DURATION = 5.540771
======================== ANALYSIS PARAMETERS ========================
FFT SIZE = 1024
*
FUNDAMENTAL ANALYSIS FREQUENCY = 43.066406
*
WINDOW SIZE = 2048
FRAMES/SECOND = 400
DECIMATION SAMPLES (samples between analysis frames) = 110
======================= RESYNTHESIS PARAMETERS ======================
TIME EXPANSION/CONTRACTION FACTOR = 2
*
INTERPOLATION SAMPLES (samples between resynthesis frames) = 220
*
OSCILLATOR RESYNTHESIS THRESHOLD (in dB) = -96.000000
*
GAIN (in dB) = 0.000
PITCH TRANSPOSITION (in semitones) = 2.000
FREQUENCY SHIFT (in Hz) = 0.000
*
ENVELOPE ATTACK TIME (in seconds) = 0.000
ENVELOPE RELEASE TIME (in seconds) = 0.000
*
SPECTRUM WARPSHAPE INDEX = 0.000
*
FREQUENCY WINDOW: LOW BOUNDARY = 0.000000
FREQUENCY WINDOW: HIGH BOUNDARY = 22050.000000
*
*............. LOW/HIGH SHELF EQ............*
LOW SHELF FREQUENCY = 200.000
.......... LOW SHELF DECIBELS = 0.000
HIGH SHELF FREQUENCY = 2000.000
.......... HIGH SHELF DECIBELS = 0.000
*...........................................*
*
=====================================================================
ANALYSIS: CHANNEL = 1
..............USING BLACKMAN WINDOW
.....USING OSCILLATOR BANK RESYNTHESIS
*********************************************************************
** PEAK AMPLITUDE STATISTICS **
*********************************************************************
TIME PEAKAMP DECIBELS (LAST DECIBELS PEAK)
*********************************************************************
( 0.00 - 0.25) 0.0005 -66.295 -66.295
( 0.25 - 0.50) 0.2052 -13.754 -13.754
( 0.50 - 0.75) 0.3285 -9.668 -9.668
( 0.75 - 1.00) 0.3066 -10.269
( 1.00 - 1.25) 0.3176 -9.962
( 1.25 - 1.50) 0.2731 -11.275
( 1.50 - 1.75) 0.2655 -11.518
( 1.75 - 2.00) 0.2416 -12.337
( 2.00 - 2.25) 0.2930 -10.661
( 2.25 - 2.50) 0.2915 -10.707
( 2.50 - 2.75) 0.3067 -10.267
( 2.75 - 3.00) 0.4094 -7.757 -7.757
( 3.00 - 3.25) 0.3076 -10.241
( 3.25 - 3.50) 0.2841 -10.930
( 3.50 - 3.75) 0.2843 -10.924
( 3.75 - 4.00) 0.3241 -9.786
( 4.00 - 4.25) 0.3340 -9.524
( 4.25 - 4.50) 0.3612 -8.845
( 4.50 - 4.75) 0.3113 -10.136
( 4.75 - 5.00) 0.3094 -10.189
( 5.00 - 5.25) 0.3141 -10.058
( 5.25 - 5.50) 0.1142 -18.846
============= PEAK AMPLITUDE ========================================
CHANNEL TIME PEAKAMP DECIBELS (CLIPPED SAMPLES)
.....................................................................
1 2.898 0.4094 -7.757
*********************************************************************
=====================================================================
ANALYSIS: CHANNEL = 2
..............USING BLACKMAN WINDOW
*********************************************************************
** PEAK AMPLITUDE STATISTICS **
*********************************************************************
TIME PEAKAMP DECIBELS (LAST DECIBELS PEAK)
*********************************************************************
( 0.00 - 0.25) 0.0004 -67.948 -67.948
( 0.25 - 0.50) 0.2301 -12.763 -12.763
( 0.50 - 0.75) 0.2477 -12.122 -12.122
( 0.75 - 1.00) 0.1969 -14.115
( 1.00 - 1.25) 0.2631 -11.599 -11.599
( 1.25 - 1.50) 0.2086 -13.613
( 1.50 - 1.75) 0.2559 -11.840
( 1.75 - 2.00) 0.2671 -11.465 -11.465
( 2.00 - 2.25) 0.2768 -11.157 -11.157
( 2.25 - 2.50) 0.1762 -15.082
( 2.50 - 2.75) 0.2113 -13.502
( 2.75 - 3.00) 0.2549 -11.872
( 3.00 - 3.25) 0.2673 -11.460
( 3.25 - 3.50) 0.2869 -10.847 -10.847
( 3.50 - 3.75) 0.2841 -10.931
( 3.75 - 4.00) 0.1991 -14.019
( 4.00 - 4.25) 0.2131 -13.427
( 4.25 - 4.50) 0.2540 -11.904
( 4.50 - 4.75) 0.2235 -13.014
( 4.75 - 5.00) 0.2407 -12.369
( 5.00 - 5.25) 0.2941 -10.629 -10.629
( 5.25 - 5.50) 0.1166 -18.667
============= PEAK AMPLITUDE ========================================
CHANNEL TIME PEAKAMP DECIBELS (CLIPPED SAMPLES)
.....................................................................
2 5.103 0.2941 -10.629
*********************************************************************
=====================================================================
PEAK AMPLITUDES: ALL CHANNELS
---------------------------------------------------------------------
CHANNEL TIME PEAKAMP DECIBELS (CLIPPED SAMPLES)
.....................................................................
1 2.898 0.4094 -7.757
2 5.103 0.2941 -10.629
=====================================================================
PLAINPV: RESYNTHESIS COMPLETED
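Several of the figures in this sample output follow from one another by simple arithmetic. The Python sketch below reproduces them from the values shown in the listing. The relationships are inferred from the numbers themselves rather than from the plainpv source, and the truncation of the decimation value to a whole number of samples is an assumption (44100 / 400 is 110.25, while the listing shows 110).

```python
import math

# Values copied from the sample output above.
sample_rate = 44100
fft_size = 1024
frames_per_second = 400
time_factor = 2            # time expansion/contraction factor
input_duration = 2.770386

fundamental = sample_rate / fft_size            # fundamental analysis frequency
decimation = sample_rate // frames_per_second   # samples between analysis frames
interpolation = decimation * time_factor        # samples between resynthesis frames
output_duration = input_duration * time_factor


def amp_to_db(amp):
    """Convert a linear peak amplitude (1.0 = full scale) to decibels."""
    return 20 * math.log10(amp)


print(fundamental, decimation, interpolation, output_duration)
print(round(amp_to_db(0.4094), 3), round(amp_to_db(0.2941), 3))
```

The decibel values computed this way can differ from the listing in the last decimal place, because the peak amplitudes in the listing are themselves rounded to four digits.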