A nearly constant presence in my electroacoustic work since 2012 has been CataRT, a tool for concatenative synthesis developed by Diemo Schwarz at IRCAM. 

In concatenative synthesis, a database of prerecorded or live-recorded sound is created by segmenting it into units, usually the size of a note, grain, phoneme, or beat. Each unit is analyzed for a number of sound descriptors which describe sonic characteristics, such as loudness, spectral centroid (a measure of brightness), or periodicity. These values are stored and recalled at the moment of synthesis, when one or more target values are sent to the synthesis engine, and the unit closest to those target values (usually in the sense of minimizing a weighted Euclidean distance) is selected. The selected units are then concatenated and played, possibly after further transformations.
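As a rough illustration of the selection step (a minimal sketch, not CataRT's actual code), the Python below picks the unit that minimizes a weighted Euclidean distance to a set of target descriptor values. The descriptor names, the sample values, the weights, and the function name select_unit are all hypothetical; a real system would typically also normalize each descriptor before comparing.

```python
import math

def select_unit(corpus, target, weights):
    """Return the index of the unit closest to the target descriptors.

    corpus  -- list of dicts mapping descriptor name -> stored value
    target  -- dict of descriptor name -> desired value (may be a subset)
    weights -- dict of descriptor name -> weight applied to the squared difference
    """
    def distance(unit):
        return math.sqrt(sum(
            weights[name] * (unit[name] - value) ** 2
            for name, value in target.items()
        ))
    return min(range(len(corpus)), key=lambda i: distance(corpus[i]))

# Hypothetical three-unit corpus with made-up descriptor values.
corpus = [
    {"loudness": -30.0, "centroid": 800.0, "periodicity": 0.9},
    {"loudness": -12.0, "centroid": 2500.0, "periodicity": 0.4},
    {"loudness": -20.0, "centroid": 1500.0, "periodicity": 0.7},
]
target = {"loudness": -18.0, "centroid": 1600.0}
# The small weight on centroid compensates for its much larger numeric range.
weights = {"loudness": 1.0, "centroid": 0.001}

selected = select_unit(corpus, target, weights)
print(selected)
```

Only the descriptors named in the target take part in the distance, so a target can constrain one dimension and leave the others free.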

CataRT, available as a standalone application or a modular series of Max patches and abstractions, contains an LCD display that charts descriptors in two or more dimensions, allowing one to visualize a descriptor space and navigate through it with a mouse or another type of external controller. Introductory videos on the CataRT homepage demonstrate the basics.


In 2011 Aaron Einbond and I began the joint project of integrating feature modulation synthesis into CataRT. In between the selection and synthesis stages—in other words, just before the chosen grain is played back—its values can be altered according to external parameters. A basic example is loudness modulation, where an independent mixer gives levels for subsets of the corpus. Before a grain is played, its stored loudness value is recalled, then a lookup function finds the mixer coefficient to be applied, and the resulting level is sent to CataRT in lieu of the original level.
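The loudness example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: levels are taken to be in decibels, the mixer coefficient is treated as a dB offset added to the stored loudness, and the names modulated_level, subset, and loudness_db are invented for the example rather than taken from the Max patches.

```python
def modulated_level(grain, mixer_gains_db):
    """Return the level to send to the synthesis engine for one grain.

    grain          -- dict with the grain's corpus subset and stored loudness (dB)
    mixer_gains_db -- dict mapping subset name -> mixer offset in dB
    """
    # Look up the mixer coefficient for the grain's subset;
    # subsets without a fader are left unchanged (0 dB offset).
    gain_db = mixer_gains_db.get(grain["subset"], 0.0)
    # The modulated level replaces the stored loudness at playback.
    return grain["loudness_db"] + gain_db

# Hypothetical mixer state and grain.
mixer = {"piano": -6.0, "voice": 3.0}
grain = {"subset": "piano", "loudness_db": -20.0}
level = modulated_level(grain, mixer)
print(level)
```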

Targeted Transposition applies this same principle to pitch. Using the bach library developed by Andrea Agostini and Daniele Ghisi, a “target pitches” interface has been implemented to combine with existing CataRT modules; one or more target pitches are defined before playback in this window. As grains are selected from the corpus by proximity to target descriptors, which may include pitch itself and/or other descriptors, their note number content is examined, and a transposition value equivalent to the difference between the estimated note number and the target pitch is sent to CataRT before playback. If more than one target pitch is defined, for example a harmonic field of possible pitches, the pitch of each sample can be either drawn at random (with or without replacement) or chosen based on the shortest distance to the original pitch of the unit.
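The arithmetic behind this can be sketched as follows, assuming pitches are expressed as (possibly fractional) MIDI note numbers so that a difference is a transposition in semitones. The function names are hypothetical, and reshuffling the pool once every target has been used is my assumption about how drawing without replacement might continue.

```python
import random

def nearest_target(estimated, targets):
    """Pick the target pitch closest to the unit's estimated pitch."""
    return min(targets, key=lambda t: abs(t - estimated))

def draw_without_replacement(targets, rng):
    """Yield targets in random order, reshuffling once all have been used."""
    while True:
        pool = list(targets)
        rng.shuffle(pool)
        yield from pool

def transposition(estimated, target):
    """Semitones to transpose so the unit sounds at the target pitch."""
    return target - estimated

harmonic_field = [60, 64, 67]   # hypothetical target pitches (C, E, G)
estimated = 61.3                # estimated pitch of the selected unit

# Mode 1: shortest distance to the unit's original pitch.
shift_nearest = transposition(estimated, nearest_target(estimated, harmonic_field))

# Mode 2: random draw with replacement.
rng = random.Random(0)
shift_random = transposition(estimated, rng.choice(harmonic_field))

# Mode 3: random draw without replacement.
draws = draw_without_replacement(harmonic_field, rng)
first_cycle = [next(draws) for _ in harmonic_field]
```

Choosing the nearest target keeps each transposition as small as possible, which tends to limit the artifacts of retuning a sample far from its original pitch.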

This model of targeted transposition works best with a corpus whose grains are clearly segmented into units of definable and constant pitch, for example by using segmentation based on change of pitch on a harmonic sound, or by loading banks of samples. Here is a basic demo of the process, with a few sound examples:


My first piece to use this technique extensively was Five Out of Six, where an ensemble of six instruments on stage interacts with both CataRT and live video (by Things Happen). The teaser video below features only CataRT output, webs of retuned samples, with images from the piece.

Targeted Transposition was the subject of an ICMC paper in 2012: Precise Pitch Control in Real Time Corpus-Based Concatenative Synthesis by Aaron Einbond, Christopher Trapani, and Diemo Schwarz.


The next step of our work was the implementation of a transcription module that records the output of CataRT. The selected grains and their playback parameters are recorded in real time, and can be visualized in the non-metered bach object called bach.roll. This format can be stored as text and quickly recalled for playback.

After transcribing, it is possible to interact with the roll, editing onset, duration, or pitch data. As a note is repositioned, the necessary adjustments are automatically made for the playback engine. It is also possible to batch edit any of the recorded CataRT playback parameters, such as attack time, release time, panning, reverse playback (an on/off switch), or gain. The tools visible in this video were developed in part with Christophe Lebreton at GRAME, during the composition of Convergence Lines.
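To make the idea concrete, here is a hypothetical sketch of what one transcribed event might hold and how an edit could propagate to the playback parameters. It is not the bach.roll format or the actual module: the GrainEvent fields are invented for illustration, and JSON stands in for the text storage purely as an assumption of the example.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class GrainEvent:
    onset_ms: float
    duration_ms: float
    unit_index: int        # which corpus unit was selected
    source_pitch: float    # estimated pitch of the unit (MIDI note number)
    pitch: float           # sounding pitch shown in the roll
    gain_db: float = 0.0
    reverse: bool = False

    @property
    def transposition(self):
        """Semitone shift the playback engine needs for this note."""
        return self.pitch - self.source_pitch

def repitch(event, new_pitch):
    """Move a note in the roll; the transposition follows automatically."""
    event.pitch = new_pitch
    return event.transposition

def batch_set(events, **params):
    """Apply the same playback parameters to every event."""
    for event in events:
        for name, value in params.items():
            setattr(event, name, value)

def dump(events):
    """Serialize events to text (JSON here, as a stand-in format)."""
    return json.dumps([asdict(e) for e in events])

def load(text):
    """Rebuild events from text produced by dump()."""
    return [GrainEvent(**d) for d in json.loads(text)]

events = [GrainEvent(0.0, 250.0, 12, 61.0, 60.0),
          GrainEvent(300.0, 180.0, 7, 66.5, 67.0)]
new_shift = repitch(events[0], 64.0)   # drag the first note up to E
batch_set(events, gain_db=-3.0, reverse=True)
restored = load(dump(events))
```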

This second demo video shows the transcription of an outwardly expanding spiral movement, visible in the CataRT LCD, through a corpus of prepared piano samples, retuned using Targeted Transposition. This spiral was an early sketch for Spinning in Infinity, a piece whose electronics consisted exclusively of CataRT playback piloted by bach.rolls, recalled by a MIDI keyboard in the orchestra.

This research was presented at the 2014 International Computer Music Conference in a paper titled Fine-tuned Control of Concatenative Synthesis with CataRT Using the bach Library for Max by Aaron Einbond, Christopher Trapani, Andrea Agostini, Daniele Ghisi, and Diemo Schwarz.