Cakewalk Music Software/MIDI Editing: MIDI is not an ideal data format for editing. Sequencers *could* improve it.

MIDI -- not an ideal data format for editing/
how sequencers can improve on it

John S. Allen

The MIDI protocol was devised in the early 1980s, primarily to control the synthesizers of that era in real time. It was conceived as a device control protocol, like the PostScript or PCL protocol that sends data from a desktop computer to its printer.

Today's word processing applications don't make the user edit printer data. Yet today's music sequencers are very much bound up in the MIDI protocol. Editing in these sequencers has been viewed by their developers largely as the editing of MIDI data.

The structure of the MIDI protocol has led to editing difficulties as synthesizers have become more versatile. In particular, the way that MIDI organizes "continuous controller" functions such as volume, pitch wheel and modulation makes editing difficult. This is because the structure of MIDI data does not correspond to the structure of musical data as it is commonly understood. For example, while MIDI lets you assign a trombone section to a single MIDI channel, it does not give you full, separate control over the individual trombone voices and the individual notes they play.

The sequencer I use, Cakewalk for Windows, like most others, does not compensate for most quirks of the MIDI data structure. Previous articles in this series have discussed practical aspects of dealing with problems caused by these quirks. This article gives an overview of MIDI quirks, the problems they cause, and existing and potential solutions. I refer to Cakewalk frequently, though most of what I say applies to other sequencers.

Though some teams of software developers -- mostly in academic settings -- are looking into alternatives to the MIDI protocol, it is my opinion that sequencing packages which process MIDI data intelligently can largely solve the problems in software. Besides, whatever its advantages and disadvantages, MIDI is a deeply entrenched standard which will be with us for a long time.

Let's define the primary problems with direct editing of MIDI data:

MIDI parameters relate well to musical and perceptual parameters: this is not the problem. Parameters such as note on time, note off time, volume, stereo pan and pitch bend all are easy for musicians to understand, and easy to implement in hardware and software. The problems are, rather:

1) The time structure of MIDI data does not conform to the concepts of selection, cutting and pasting as used in computer software -- or to the analogy of a music sequencer as a sound recorder.

2) There is overlap and redundancy in the actions performed by MIDI parameters. The actions performed by MIDI key velocity, volume controller and expression parameters overlap considerably, as do key number, track pitch bend, polyphonic aftertouch and modulation. This redundancy is not managed so as to serve the goals of editing or of economizing on data.

3) There are problems with the relationship of track structure to channel structure. Since it is possible to place data from any channel in any track, conflicts between data in different tracks can occur easily, but they are hard to locate.

4) There are problems with the relationship of messages to notes. Messages correspond to MIDI channels or key numbers, rather than to notes, and when there is more than one note at a time in a channel or on a key, the notes can not be controlled with complete independence.

5) Bandwidth is limited.

Now, let's look at specifics.

MIDI is structured in terms of messages which change the status of the synthesizer. In technical terms, the MIDI data stream is "stateless": each message carries only a change, and the current status is held in the receiving device rather than in the data. The synthesizer or other receiving device keeps track of the change that each new message brings about. This approach is appropriate in a device control protocol, in order to reduce the amount of data which must be sent. When there are no changes, no messages need be sent.

But for editing, you must know the status between messages. This is not only because your memory is not as perfect as the synthesizer's, but also because you may jump from one place to another in the music while editing.

If you open up a Cakewalk editing window or an event list where there are no messages, you don't know what values are in effect. As a result, you may find that a musical segment you paste in sounds very different from the way it did in its original location.

[Image: control2.gif (4542 bytes)]

The Cakewalk Controllers View at the left shows two messages, but gives no information about the setting for the first half-measure. If there were no messages in the window, you couldn't tell what the setting is anywhere in the time segment shown in the window.

The solution to this problem is intelligent graphic displays with a "searchback" function which produces a line graph instead of spikes.
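A minimal sketch in Python of what such a searchback might compute for a display. The names `value_at` and `step_graph`, and the layout of events as (tick, value) pairs, are assumptions made for illustration; this is not Cakewalk or CAL code.

```python
from bisect import bisect_right

def value_at(events, tick, default=None):
    """Return the controller value in effect at `tick`.

    `events` is a list of (tick, value) pairs sorted by tick. `default`
    stands in for whatever the synthesizer held before the first message.
    """
    ticks = [t for t, _ in events]
    i = bisect_right(ticks, tick)   # number of events at or before `tick`
    return events[i - 1][1] if i else default

def step_graph(events, start, end, default=None):
    """Return (tick, value) points for a step-line display of [start, end)."""
    points = [(start, value_at(events, start, default))]
    points += [(t, v) for t, v in events if start < t < end]
    return points

volume = [(0, 100), (960, 64), (1440, 89)]
print(value_at(volume, 1200))          # 64
print(step_graph(volume, 1000, 1920))  # [(1000, 64), (1440, 89)]
```

The first point returned by `step_graph` is the one that a spike display leaves out: the setting inherited from before the left edge of the window.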

Cakewalk does use a line graph for its tempo map, as shown in the image on the right, but not for controllers. This inconsistency probably reflects the fact that the tempo setting is always available within Cakewalk. Tempo can not be stateless, because it controls the sequencer's clock. The clock is located in the sequencer rather than in the receiving device, and must always be running during playback.

[Image: Tempo map after deletion (5 KB GIF)]

Controller messages, on the other hand, are MIDI data sent to the receiving device. Cakewalk does have a controllers "searchback" option which it uses during playback, but Cakewalk does not use this function to search for old MIDI controller settings and display them during editing.

Cutting and pasting MIDI data results in unexpected and unwanted consequences if MIDI message structure is not taken into account. The effect of a message can extend past the section which is deleted or pasted in, because the status of MIDI controllers depends on the most recent message. If that message is deleted, then an earlier message takes effect. If a new message is pasted in, its effect continues until another message overrides it -- often, all the way to the end of a piece of music.

Cakewalk accounts for this problem only as it affects notes, by storing the note off message within the note on message. That way, deleting the note on message deletes the entire note. Cakewalk does not account for the problem with controller messages.

The solution to this problem is a searchback and translation function which adjusts MIDI data to the requirements of cut-and-paste editing. Another article on this site discusses this problem in detail in connection with deletion and cutting, and one will be posted soon in connection with pasting.
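As a rough illustration of the idea for deletion, the Python sketch below cuts a time span out of a controller track, closes the gap, and inserts one message at the cut point when the music that follows would otherwise inherit the wrong setting. The function names and the (tick, value) event layout are hypothetical, and a real implementation would have to handle every controller and channel, not one list.

```python
def value_at(events, tick, default=None):
    """Return the value set by the latest event at or before `tick`."""
    value = default
    for t, v in events:          # events are sorted by tick
        if t > tick:
            break
        value = v
    return value

def cut(events, start, end, default=None):
    """Delete [start, end) and close the gap, preserving the setting that follows."""
    length = end - start
    after = value_at(events, end, default)        # setting the later music expects
    kept = [(t, v) for t, v in events if t < start]
    kept += [(t - length, v) for t, v in events if t >= end]
    if after is not None and value_at(kept, start, default) != after:
        kept.append((start, after))               # restore the expected setting
        kept.sort()
    return kept

volume = [(0, 100), (480, 40), (1200, 80)]
print(cut(volume, 400, 960))   # [(0, 100), (400, 40), (640, 80)]
```

Without the inserted message at tick 400, the deleted message at tick 480 would be lost and the music after the cut would play at volume 100 instead of 40 until the next message arrived.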

MIDI synthesizers give you only one control input on each MIDI channel for each controller parameter, though several independent inputs are needed for user-friendly control.

By way of comparison, an electric guitar typically has one or more volume controls on the instrument, plus a volume pedal and a volume control on the amplifier. Each of these controls serves a different purpose.

You set the volume control on the amplifier before you start to play, to establish the balance with other instruments. Since you play a guitar with your hands, not your feet, the volume pedal is always available for special effects, such as muting the attack of a note.

You don't use the volume knob on the guitar much while playing, but it does let you cut back on volume for accompaniment, while bringing it up during solos.

If your performance is being recorded, or amplified by a house PA system, the recording engineer has a fader for the guitar, and separate faders for other instruments, making it possible to balance them. The engineer also may position microphones and performers to achieve balance. The outputs of several faders may be sent to another, group fader, for example, for the entire rhythm section. Finally, there is a master fader that sets the level of the entire, fully mixed audio signal.

In all, I have described six different faders, and the positioning of microphones. Each of these serves a unique purpose, yet all of them affect the volume setting of the electric guitar as it is being recorded. Some affect other instruments as well. There are three major conveniences provided by having several volume controls.

Setting certain volume controls for the entire length of the performance, others for different sections of the performance, and yet others for transient changes and expression control.
Having some volume controls affect only one instrument, while others affect groupings of instruments or the overall level of the complete mix.
Returning precisely to preset levels by setting them at the extremes of a volume control's range.

What volume controls does my Cakewalk sequencer give me?

Well, I can input a volume signal sent from a MIDI input device. There is also a volume setting in Cakewalk's Track View, and a volume control fader in Faders View, and a volume control graph that I can edit in Controllers View. That's four different ways to input volume controls -- not bad.

But there's a catch. All of these controls adjust the same MIDI parameter, and as soon as I change one of these controls, it replaces the adjustment I made with another. The separate controls are no more than "Windows dressing."

For example, I may insert a volume setting in track view. This sends a volume control message before the first note sounds. But if I insert other, later volume control messages in the track, the initial setting in track view only works until the first new message.

This problem is worst when working with polytimbral synthesizers, which can synthesize sounds on as many as 16 different channels with different tone colors, all at the same time and through the same audio outputs. As you add more instruments to your arrangement, the volume goes up, and when it overloads the synthesizer output, you have a real nightmare if you have inserted volume settings anywhere except at the beginning of your tracks.

Musicians with a multi-synthesizer rig have more flexibility in keeping the signals for different instruments separate, but still nothing like what is usual when recording a group of live musicians. General MIDI does give you master volume and pan controllers; however, Cakewalk does not support these master controllers in its graphics windows.

[Image: Multiple inputs to same volume control (22 KB GIF)]

In the Cakewalk screen shot above, I wanted the accordion to "breathe." I set an initial volume in the Track View (top). Then I created a crescendo using the Faders View (left). Finally, I drew in a decrescendo using the line drawing tool in Controllers View (in the middle, and still highlighted as active). All of these volume changes are visible in the Controllers View. The "now" line, which is just before the fourth beat of the measure, shows a controller setting of 89, which is reflected in the Faders View. All of my volume adjustments, no matter how I input them, are to the same controller function, leaving me no way to rebalance my tracks without going back and changing all of the volume settings through the whole track, or else writing a CAL program. Excuse me for saying so, but computers are supposed to make things easier!

The MIDI "expression" function, controller #11, which also controls volume, could ease the situation somewhat -- but two controllers still aren't enough if I want to have expression control, a global volume setting for track balance, and settings for different sections of the track. And not all synthesizers support the expression function as a volume control. My Turtle Beach Monterey doesn't in its default General MIDI configuration.

Similar problems occur with all of the other MIDI controller functions.

You may download the MIDI file from which I made my screen shot and play with the volume adjustments yourself.

The solution to the multiple controller problem? A sequencer with multiple, configurable, one-to-many assignable and gangable inputs to MIDI controller functions. This approach will be described in a separate article.
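To give a flavor of the idea, here is a small Python sketch in which several independent "virtual faders" are combined into the single volume value that MIDI allows per channel. The function name, the three layers, and the choice to multiply levels as linear fractions of full scale are all assumptions for illustration; real synthesizers map volume values to loudness along their own curves.

```python
def combine_faders(*levels):
    """Combine several virtual fader levels (0-127 each) into one CC7 value.

    Each level is treated as a gain fraction of full scale and the fractions
    are multiplied, so any one fader can be changed without touching the others.
    """
    gain = 1.0
    for level in levels:
        gain *= level / 127
    return round(gain * 127)

# Hypothetical layers for one track: overall balance, section level, expression.
balance, section, expression = 100, 127, 89
print(combine_faders(balance, section, expression))  # 70
```

With this arrangement, rebalancing a track means changing one stored `balance` number. The sequencer would recompute the outgoing volume messages, and the expression curve drawn earlier would stay as it was.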

A workaround, if your synthesizer supports it, is to assign an unused controller input, such as #2, breath controller, to volume. Then you will be able to use one of the inputs (probably the standard controller 7, which is supported in Cakewalk's Track View) for rebalancing tracks, the breath controller for changes between sections of the music, and #11 for local expression control. This still doesn't give you as many independent controls as our guitar example, and it adds more controller messages than you really need -- but all in all, three volume controls are better than two.

It can be useful to use a similar approach for other controller functions -- more about that in another article to come.

Note that "virtual mixers" which let you control outboard mixing boards from a sequencer do not let you adjust MIDI channels of the same synthesizer separately, since the channels of a polytimbral synthesizer are all mixed together by the time they reach the outboard mixing board.

MIDI controllers work on all notes in a channel, but polyphonic, polytimbral synthesizers produce multiple notes per channel. How do I control these notes separately?

The problem of a single controller for multiple notes is bound up with the history of MIDI. In the early days of MIDI, this problem did not exist, since synthesizers could produce only one note at a time per channel. Today's polytimbral synthesizers can produce more than one note at a time on each channel, and they know which MIDI patch to use for a note according to the channel on which it is sent.

The MIDI polyphonic key aftertouch control is one exception -- it does work separately on each key -- though it is not widely supported, and there is no standard assignment as to what it controls. That there is only one controller for the multiple notes on each channel is not a problem with most keyboard controllers and traditional keyboard sounds, since most keyboards don't have polyphonic aftertouch anyway. But to sequence a trombone section or a guitar, you must have separate control over each individual note of multiple notes being played at the same time. If you don't, the only way to get the independence of control you need is to put each trombone, or each guitar string, on a separate MIDI channel. And in Cakewalk, the only way to identify those channels is to put each of them on a separate track, which puts each one on a separate Piano Roll View and staff line. This makes editing inconvenient.

The solution to this problem is a sequencer with dynamic voice allocation, combined with the ability to select the individual notes, voice, channel, track or group of channels/tracks on which a controller function operates. These advances will be discussed in a separate article.

For now, you have to split out notes to separate MIDI channels to get independent controllers. You may do this manually or by means of a Cakewalk CAL program module.
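The splitting step can be sketched in a few lines of Python. This is only an illustration of the logic, not a CAL program: the function name and the (start, duration, key) note layout are assumptions, and it presumes that every channel in the list has been set to the same patch.

```python
def assign_channels(notes, channels):
    """Give each note a channel so that no two overlapping notes share one.

    `notes` is a list of (start, duration, key) tuples; `channels` lists the
    MIDI channels set aside for this part. Returns (start, duration, key, channel).
    """
    busy_until = {ch: 0 for ch in channels}   # tick at which each channel frees up
    result = []
    for start, duration, key in sorted(notes):
        free = [ch for ch in channels if busy_until[ch] <= start]
        if not free:
            raise ValueError(f"no free channel for note at tick {start}")
        ch = free[0]
        busy_until[ch] = start + duration
        result.append((start, duration, key, ch))
    return result

chord = [(0, 480, 60), (0, 480, 64), (240, 480, 60)]
print(assign_channels(chord, channels=[1, 2, 3]))
# [(0, 480, 60, 1), (0, 480, 64, 2), (240, 480, 60, 3)]
```

Once each sounding note has a channel to itself, channel-wide controller messages affect that note alone.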

MIDI does not provide any way to confine the timing of controller settings to individual notes. While adjusting controller settings for a sequence of notes, you must be very careful to place a new setting precisely at the beginning of a new note, or else there will be a "glitch" as the new setting either affects the end of the old note or comes after the start of the new note.

The same problem happens when cutting and pasting controller settings along with notes. You must be very careful to include the controller settings in the time span of the note or notes you are cutting and pasting, and no others.

The solution to this problem is the same as described in the previous section of this article, and will be discussed in a separate article.

Even the polyphonic aftertouch function still has some of these problems.

Polyphonic aftertouch messages include a key assignment. This allows different notes in the same MIDI channel to be controlled separately as long as they have different key assignments. Polyphonic aftertouch messages may extend past the beginning and end of a note they affect, with no audible consequences as long as there are not other notes with the same key assignment in the affected time period.

Different notes which have the same key number can not, however, be controlled separately. There is no way in MIDI to send separate controller messages for different notes that have the same key number. (MIDI also can not sound such notes correctly, as described in the next section).

The solution to this problem is the same as described in the previous section of this article, and will be discussed in a separate article.

Why do some of my notes cut off shortly after they start?

A MIDI channel does not support ensemble polyphony, which allows unisons (as in a trombone section with two trombones playing the same note). A MIDI channel supports only keyboard polyphony: on a keyboard, unisons can not occur because there is only one key for each MIDI note number. This limitation also occurs in MIDI because the MIDI "note off" message is associated with a key number rather than with any particular note. ("Note off" is misleading -- "key off" is more accurate.) But Cakewalk MIDI editing lets you make notes overlap. MIDI stores the key on and key off messages separately, but Cakewalk stores both together, as a note which has a start time and a duration. This way, each note on is associated with a note off, even after cutting and pasting sections of music. But some problems occur, as shown below.

[Image: Unison as displayed by Cakewalk, from .wrk file (0.9 KB GIF)]

The first of the two half notes in the Cakewalk staff view shown at the right sounds for its full duration. The second half note sounds as an eighth note, because the "note off" (actually "key off") message for the sixteenth note ends all notes with the same key assignment. You may download the Cakewalk for Windows v. 3.0 .wrk file of this example (it also works with newer Cakewalk versions) and check this out for yourself.

[Image: Unison as displayed by Cakewalk from .mid file (0.9 KB GIF)]

When we save a MIDI file of this example in Cakewalk and play it back, it sounds the same as the Cakewalk file. But the display of the MIDI file in Cakewalk looks different, as shown in the next image (right). The unison notes have traded key off messages. Cakewalk reasonably assigns the first key off message to the note that starts earlier, since MIDI, unlike Cakewalk's proprietary file format, provides no information on which key off message belongs with which key on message. When you play the file back, the first key off message ends all notes with its key assignment, just as with the Cakewalk file.
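The "first key off goes to the earliest key on" rule can be sketched in Python as follows. This is my guess at the general approach, inferred from the behavior described above, not Cakewalk's actual code; the function name and the (tick, kind, key) message layout are assumptions.

```python
from collections import defaultdict, deque

def pair_notes(messages):
    """Turn (tick, kind, key) messages into (start, duration, key) notes.

    `kind` is "on" or "off". Each key off is matched with the oldest
    unmatched key on for the same key number (first in, first out).
    """
    pending = defaultdict(deque)   # key number -> start ticks awaiting a key off
    notes = []
    for tick, kind, key in messages:
        if kind == "on":
            pending[key].append(tick)
        elif pending[key]:
            start = pending[key].popleft()
            notes.append((start, tick - start, key))
    return sorted(notes)

# A long note (ticks 0-960) with a shorter unison inside it (ticks 240-480),
# flattened into the separate key on and key off messages of a MIDI file.
stream = [(0, "on", 60), (240, "on", 60), (480, "off", 60), (960, "off", 60)]
print(pair_notes(stream))   # [(0, 480, 60), (240, 720, 60)]
```

The two notes come back with their endings traded: the note that began at tick 0 now ends at tick 480, and the note that began at tick 240 runs on to tick 960.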

[Image: Unison in Staff view with no trim (0.8 KB GIF)]

The Cakewalk Trim option in staff view has no effect on the appearance of either of the above examples, but it does affect the example to the right. This example does not have one note entirely within another, but only one note which overlaps another. Such overlaps are common when merging takes, or as a result of slight timing errors. The second note is cut off shortly after it starts, as you can hear in the MIDI file. (In this example, the Cakewalk file sounds the same and generates the same Staff View.)

The Cakewalk Trim option hides the overlap, as shown in the next illustration, and makes it harder to figure out what the problem is. For this reason, I recommend against using the Trim option when editing.

The unison problem as shown in these examples can be completely baffling -- what you see in Cakewalk's Staff and Piano Roll views does not match what you hear. There is no explanation of this problem in the Cakewalk documentation.

[Image: Unison conflict on two tracks (1.2 KB GIF)]

In Cakewalk, and in MIDI files, you can assign the same MIDI channel to more than one track, as shown in the next illustration (right). If you do this, conflicts are even more likely and are harder to find. The MIDI file or Cakewalk file of the two-track version produces the Staff View shown at the right, and sounds the same as the one-track version.

With unisons, as with controllers, we are dealing with a historical problem. MIDI was developed without the idea that a channel might play more than one note at a time. And then, when multiple notes began to be played on a channel, the input controller was usually a keyboard. While a sustain pedal does make unisons possible with a MIDI keyboard, it still does not allow proper playing of unisons: any key number for which a note off message is more recent than the last note on message cuts off when you raise the pedal. If you merge two tracks that both contain sustain pedal messages, similar problems occur, and the results can be very unpredictable.

A trombone section (to give an example) can play unison notes that end at different times, and you also can create overlapping unisons by overdubbing or combining MIDI tracks. When you do this, all notes of the unisons will be cut off when the first note ends (or when the sustain pedal lifts, if that is later).

The unison problem can crop up during editing even in a single-note line. If you adjust the timing of a note slightly so it overlaps another at the same pitch, the later note will be cut off shortly after it begins.

This problem is solved by dynamic voice allocation, by use of polyphonic pitch control to reassign a conflicting note to a different key number with a pitch offset, and by truncating notes which slightly overlap others. Sustain pedal data merging problems can also be resolved through dynamic voice allocation or by retarding sustain pedal depressions slightly so they do not immediately precede releases.

The truncation of overlapping notes is a well-known technique, though it is usually done manually, or in Cakewalk, through CAL programs. Combining truncation of overlapping notes with dynamic voice allocation and key reallocation makes it more versatile and easier to implement. I have posted overlap removal CAL programs on this site.
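For readers who do not use CAL, the truncation step looks roughly like this in Python. It is a sketch of the general technique, not a translation of the CAL programs mentioned above; the function name, the `gap` parameter and the (start, duration, key) note layout are assumptions.

```python
def trim_overlaps(notes, gap=1):
    """Shorten any note that overlaps a later note on the same key.

    `notes` is a list of (start, duration, key) tuples. Each note is cut so
    that it ends `gap` ticks before the next note on its key begins.
    Notes that start at the same tick on the same key are left alone.
    """
    ordered = sorted(notes)
    trimmed = []
    for i, (start, duration, key) in enumerate(ordered):
        later = [s for s, _, k in ordered[i + 1:] if k == key and s > start]
        if later and start + duration > later[0] - gap:
            duration = max(1, later[0] - gap - start)
        trimmed.append((start, duration, key))
    return trimmed

line = [(0, 500, 60), (480, 480, 60), (480, 480, 64)]
print(trim_overlaps(line))   # [(0, 479, 60), (480, 480, 60), (480, 480, 64)]
```

This suits slight overlaps such as the one in the example. Where one note lies entirely inside a longer unison, truncation changes the music, and that case calls for moving one of the notes to another channel or key instead.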

MIDI bandwidth is limited.

Finally, there is the problem of bandwidth. Standard, serial MIDI can send only about 3,000 bytes, or 1,000 messages, per second. This is not enough data to avoid audible delays or allow complete flexibility of control with today's polytimbral synthesizers. Continuous controller messages, in particular, gobble up data bandwidth.
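The arithmetic behind those round figures, worked in Python. The 31,250 bits-per-second rate and the 10-bit serial frame come from the MIDI 1.0 hardware specification; the three-byte message length assumes typical note and controller messages sent without running status.

```python
BAUD = 31_250            # bits per second on a standard MIDI cable
BITS_PER_BYTE = 10       # 8 data bits plus a start bit and a stop bit
BYTES_PER_MESSAGE = 3    # status byte plus two data bytes, no running status

bytes_per_second = BAUD / BITS_PER_BYTE
messages_per_second = bytes_per_second / BYTES_PER_MESSAGE
ms_per_message = 1000 / messages_per_second

print(bytes_per_second)                 # 3125.0
print(round(messages_per_second))       # 1042
print(round(ms_per_message, 2))         # 0.96
```

At just under a millisecond per message, the last note of a ten-note chord leaves the cable roughly 10 milliseconds after the first, before any controller data is counted.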

Bandwidth is an editing-related problem, because editing can greatly increase the amount of MIDI data -- for example, by cloning tracks or by ganging controller functions.

MIDI data is sometimes sent at high speeds over SCSI connections, and in the future, it will probably be sent over other types of fast connections. Also, other protocols may replace MIDI. But for the next several years, the problem of MIDI's slow data rate will be with us.

The bandwidth problem is widely recognized and understood, though today's sequencers offer only piecemeal solutions. Cakewalk's user manual recommends turning off controller functions of input devices which are not needed -- and that's good advice. Cakewalk also offers several routines in its proprietary programming language, CAL, to "thin" controller messages by eliminating, for example, 3 out of every 4 messages. These functions can produce unwanted results at times, but they are under user control.

Cakewalk also reduces the data rate by brute force; for example, the pitch wheel function in Controllers View gives only 1/128 the resolution which the MIDI protocol allows (also, conveniently for the programmers, making it possible to use a graphics window with only 128 levels, as with the other controllers). The coarse resolution makes graphic editing of the pitch wheel function unusable for large pitch bends, and marginal even with small pitch bends. The 16,384 levels of the pitch wheel function in the MIDI specification are enough to provide a pitch resolution of 0.78 cent over the full 128-semitone range of MIDI key allocations, close enough for most musical purposes. The Cakewalk graphic input gives a resolution of 3.125 cents over the default General MIDI 4-semitone pitch bend range, coarse enough to result in noticeable detuning of sustained notes relative to one another.

[Image: Pitch wheel graphics window showing only 32 data points for 1 semitone (5.4 KB GIF)]

In the image at the right, note that there are only 32 data spikes for the pitch bend shown, which covers 1/4 of the total range. Also, note that there is no zoom function to expand the window vertically -- only horizontally.
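The resolution figures quoted above can be checked with a few lines of Python. The helper name `cents_per_step` is mine; the numbers are the ones given in the text.

```python
def cents_per_step(levels, range_semitones):
    """Pitch change per step when `levels` steps span `range_semitones`."""
    return range_semitones * 100 / levels

print(cents_per_step(16_384, 128))   # 0.78125  (full MIDI resolution, 128 semitones)
print(cents_per_step(128, 4))        # 3.125    (128 display levels, 4 semitones)

steps_per_semitone = 128 // 4        # 128 display levels spread over 4 semitones
print(steps_per_semitone)            # 32
```

The same function shows the cost of widening the bend range: with 128 levels over a 24-semitone range, each step would be 18.75 cents.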

For small bend ranges, the resolution which the MIDI specification provides is not needed. On the other hand, this resolution is needed for larger bends. Similar situations occur with other controller functions.

The redundancy of MIDI controller functions can generate unnecessary controller messages. Cakewalk also creates bandwidth problems without warning the user or providing assistance in solving these problems. Ganging of controllers in Cakewalk's Faders View can produce data overload very quickly.

The best available answer to these problems is for a sequencer to use the great power and speed of today's host computers to monitor and allocate MIDI bandwidth. There is much that can be done, along the same lines as the lossy data compression which is so successfully used in digital audio and imaging systems. Several approaches suggest themselves:

Automated and semi-automated reassignment of tracks to different MIDI ports, if more than one port is available, according to the amount of data which the device at each port can accept.
Automatic thinning of controller data to reflect perceptual limits. A minimum increment for the sending of a new controller message would reflect the type of function, its level, its sensitivity setting (as, for example, with the range of the pitch wheel function as described above), and its rate of change. Some functions (for example, Pan and Modulation, or Pitch Wheel with small bends) provide far more resolution than human listeners can detect.
Substitution of less data-intensive functions when these lead to little or no change in the audible result: for example, using changes in key velocity messages (which must in any case be sent with every new note) to allow the elimination of volume messages.
Automatically advancing initial controller settings ahead of the notes which they affect, into time segments where MIDI data is relatively sparse. (This is, however, possible only with stored data and where the advanced messages will not affect sounding notes.)
Automatic muting of data which has no audible effect; for example, controller data when there are no notes that they affect. (There may still be notes on other MIDI channels which are playing and which are affected by data overload.)
Controlled, prioritized "shedding" of messages when overload nonetheless occurs. Messages to be shed first would be those for smaller increments of controller functions related to quieter notes, and for functions whose incorrect setting does not produce a musically unpleasant result (i.e. almost anything but the pitch wheel function).
Shedding of notes, tracks or entire channels which "double" or "layer" other channels. The same notes will still be played, though their tone quality will be altered.
MIDI data stream monitoring (like the CPU usage graphs in computer system monitor programs) to show what functions are causing the heavy data flow, and to allow the user to decide what to do about them.
In an audio-capable sequencer, archiving MIDI tracks as audio tracks. The MIDI data can be retained in case it needs further work. The substitution of audio tracks for MIDI tracks and vice versa could be semi-automated if the software and hardware are designed to do this. This approach has the advantage that it increases the number of available MIDI channels, not just the density of MIDI data which can be sent over a given number of channels.
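As one concrete case, the automatic thinning suggested in the list above might start from something like the following Python sketch. It uses a fixed minimum increment, which is a simplification: the suggestion is for an increment that varies with the function, its level and its rate of change. The function name and the (tick, value) event layout are assumptions.

```python
def thin(events, min_step):
    """Drop controller messages that change the value by less than `min_step`.

    `events` is a list of (tick, value) pairs sorted by tick. The first and
    last messages are always kept so the starting and final settings survive.
    """
    if not events:
        return []
    kept = [events[0]]
    for tick, value in events[1:-1]:
        if abs(value - kept[-1][1]) >= min_step:
            kept.append((tick, value))
    if len(events) > 1:
        kept.append(events[-1])
    return kept

ramp = [(t * 10, v) for t, v in enumerate(range(64, 97))]   # 33 messages, 64..96
print(len(thin(ramp, 4)))   # 9
```

Unlike dropping 3 out of every 4 messages by count, thinning by increment never discards a large jump, and it always leaves the final value of a ramp in place.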

When using these techniques, a sequencer should maintain an unprocessed version of a file for flexibility in future editing, while using the processed version for real-time output. This goes without saying for the 9th suggestion above, but also applies to the others.

***

This is the end of the presentation on problems inherent in MIDI itself. Forthcoming articles will discuss problems which are added by conventional sequencer design concepts, and by shallow and incorrect application of music theory.