2008 News Archive

ECE Music Program

February 15, 2008

For many years, ECE has supported research and courses at the intersection of engineering and music, and in fact, collaboration between the Department and the Eastman School of Music seemed natural to Chair Mark Bocko as long ago as 1993. It was back then that Professor Bocko began working with Eastman's Professor David Headlam, who integrates technology into his music research. Today, Bocko and Headlam are co-directors of the ECE-Eastman Music Research Lab (MRL), whose other faculty include ECE Professors Edward Titlebaum and Jack Mottley and Eastman Professor David Temperley.

Given the strength of both the Eastman School of Music and the Department of ECE, it's not surprising that students at every level, from undergraduate to MS to PhD, are expressing a keen interest in learning about digital signal processing as it relates to audio and music. To name only two recent examples, ECE recently awarded PhDs to Gordana Velikic, whose thesis was "The Use of Phase in Automatic Music Transcription," and to Xiaoxiao Dong, whose thesis was "Musical Sound Synthesis and a New Music Representation Based on Empirical Physical Modeling of Musical Instruments."

Because of this student interest, the Department is adding a new Audio and Music Signal Processing track to the Master's in EE and is expanding its undergraduate course offerings, with the intention of eventually offering a new undergraduate major in music engineering.

Current Research

The ECE-Eastman Music Research Lab (MRL) currently runs projects in these general areas:

  • Music telepresence, or playing music together over the Internet
  • Music transcription and sound source separation
  • Music representations and musical languages
  • Hiding data in music

The first three projects are aimed at enhancing the enjoyment of music with digital technologies and teaching computers to do what musicians already do.
In music telepresence, Bocko and Headlam have been working for years to help musicians rehearse together when they are in remote locations.

With NSF funding, the MRL collaborated with the University of Miami and Wright State University in Dayton, Ohio, and developed very low-delay ways of communicating using the Internet-2.

In the second area of study, music transcription and sound source separation, ECE PhD graduate Gordana Velikic developed a technique that helps resolve ambiguities in single-instrument recordings and boosts automated transcription accuracy to the 95-98% range.

Music collaboration using the Internet-2

Another aspect of Velikic's research explores methods of disentangling the sounds made by multiple instruments playing together, again using phase. For example, suppose someone is playing a clarinet and applying a small amount of vibrato by slightly changing his embouchure. The changes he makes affect all of the clarinet's harmonics in a similar way, keeping them in phase with one another. By analyzing these phases, Velikic's system predicts the origin of each spectral component: whether it comes from the clarinet or from another instrument.
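A minimal sketch of this idea (not Velikic's actual algorithm) can be built with numpy: harmonics that share a vibrato pattern stay correlated in their frequency behavior, while an unrelated instrument's partials do not. All signals and parameters below are invented for illustration.

```python
import numpy as np

# Toy scene: instrument "A" (f0 ~ 220 Hz) has a shared 5 Hz vibrato on all
# of its harmonics; instrument "B" (f0 = 300 Hz) is steady. Every figure
# here is an illustrative assumption, not data from the MRL.
fs = 8000
t = np.arange(0, 1.0, 1 / fs)
rng = np.random.default_rng(0)

vibrato = 3.0 * np.sin(2 * np.pi * 5 * t)            # +/-3 Hz, 5 Hz vibrato
phase_a = 2 * np.pi * np.cumsum(220 + vibrato) / fs
sig = sum(np.sin(k * phase_a) for k in (1, 2, 3))    # A's harmonics move together
sig += sum(np.sin(2 * np.pi * k * 300 * t) for k in (1, 2, 3))  # B is steady
sig += 0.01 * rng.standard_normal(len(t))            # a little noise

def freq_track(x, f0, bw=30):
    """Instantaneous frequency (Hz) of the partial near f0, obtained from a
    one-sided FFT mask that yields an analytic-like narrowband signal."""
    n = len(x)
    f = np.fft.fftfreq(n, 1 / fs)
    y = np.fft.ifft(np.fft.fft(x) * (np.abs(f - f0) < bw))
    track = np.diff(np.unwrap(np.angle(y))) * fs / (2 * np.pi)
    return track[n // 10 : -n // 10]                 # trim edge artifacts

# Normalize each harmonic's track by its harmonic number, then correlate:
# partials from the same instrument share the vibrato, so they correlate.
a1 = freq_track(sig, 220) / 1
a2 = freq_track(sig, 440) / 2
b1 = freq_track(sig, 300) / 1
c_same = np.corrcoef(a1, a2)[0, 1]    # A's own harmonics: strongly correlated
c_cross = np.corrcoef(a1, b1)[0, 1]   # across instruments: essentially uncorrelated
```

Grouping spectral components by this kind of coherence is one way to decide which partials belong to which instrument.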

Currently, an accurate digital reproduction of sound, such as CD-quality audio, uses 16 bits per sample and 44,100 samples per second in each of two stereo channels, with one CD holding about 70 minutes of music. Reproducing this near-perfect sound requires about 1.4 million bits of data per second. Also available are various compact machine music languages such as MIDI, with the sounds actually coming from a synthesizer; MIDI music sounds noticeably mechanical compared with the purity of CD audio.
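The CD figures above can be checked with quick arithmetic; note that the 1.4 million bits per second depends on CD audio being stereo:

```python
bits_per_sample = 16
samples_per_second = 44_100
channels = 2                     # CD audio is stereo

cd_bitrate = bits_per_sample * samples_per_second * channels
print(cd_bitrate)                # 1411200, i.e. about 1.4 million bits/s

seconds = 70 * 60                # about 70 minutes per CD
cd_megabytes = cd_bitrate * seconds / 8 / 1e6
print(round(cd_megabytes))       # about 740 MB of raw audio
```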

Recent PhD graduate Xiaoxiao Dong and ECE PhD student Mark Sterling have created a system that produces high-quality yet compact sound files using physical modeling. In short, they measure a real musical instrument and then make a computer reproduce the sound that the instrument makes through simulation.

Using a clarinet as a test case, they built a computer model of the physics of how the instrument works. Controlling the model requires several parameters, such as the fingering, how hard the performer blows into the clarinet, and how firmly the lips grip the reed. The sound is then encoded as the time history of these control parameters. Because performers don't change the parameters rapidly, the software updates them only about 10-20 times per second.
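To see why a control-parameter stream is so compact, compare its data rate with raw CD audio. The parameter count and precision below are illustrative assumptions, not figures from Dong and Sterling's model:

```python
# Illustrative assumptions: the real model's parameter count and numeric
# precision are not specified in the article.
n_params = 4                 # e.g. fingering, breath pressure, lip force, ...
bits_per_param = 16
updates_per_second = 20      # upper end of the 10-20 Hz range

control_bitrate = n_params * bits_per_param * updates_per_second
print(control_bitrate)       # 1280 bits/s

cd_bitrate = 16 * 44_100 * 2             # 1,411,200 bits/s of raw CD audio
print(cd_bitrate // control_bitrate)     # roughly a thousandfold reduction
```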

As for inserting hidden, inaudible data in sound files: digitally recorded audio contains a great deal of redundancy, meaning there is far more data than is needed to reproduce the sound. The mp3 format is a good example. As mentioned earlier, CD audio requires about 1.4 million bits of data per second; mp3 compression reduces that amount by a factor of approximately 10. In other words, mp3 removes about 90% of the sound data, yet the resulting audio is still of excellent quality.
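The mp3 numbers work out as follows:

```python
cd_bitrate = 16 * 44_100 * 2       # 1,411,200 bits/s of raw CD audio
mp3_bitrate = cd_bitrate // 10     # compression by roughly a factor of 10
fraction_removed = 1 - mp3_bitrate / cd_bitrate

print(mp3_bitrate)                 # 141120 bits/s, near a typical 128 kbit/s mp3
print(f"{fraction_removed:.0%}")   # 90%
```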

Using a number of techniques, the MRL group has successfully hidden everything from text to binary data to musical scores in these unneeded bits. Because of the fascinating opportunities in areas such as data hiding, music representation, and music transcription, the University is considering a substantial new ECE-Eastman Music Program. We're awaiting developments with great excitement.
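As a toy illustration of the general idea, and not one of the MRL's actual techniques, the sketch below hides bytes in the least significant bit of 16-bit audio samples, changing each sample by at most 1 (an inaudible perturbation):

```python
import numpy as np

def hide(samples, message):
    """Embed message bytes in the least significant bits of int16 samples."""
    bits = np.unpackbits(np.frombuffer(message, dtype=np.uint8))
    out = samples.copy()
    out[: len(bits)] = (out[: len(bits)] & ~1) | bits   # overwrite the LSBs
    return out

def recover(samples, n_bytes):
    """Read the LSBs back out and repack them into bytes."""
    bits = (samples[: n_bytes * 8] & 1).astype(np.uint8)
    return np.packbits(bits).tobytes()

# One second of fake "audio" (random samples, for illustration only).
rng = np.random.default_rng(0)
audio = rng.integers(-2**15, 2**15, size=44_100, dtype=np.int16)

stego = hide(audio, b"hidden score")
recovered = recover(stego, len(b"hidden score"))
print(recovered)                                        # b'hidden score'
print(np.max(np.abs(stego.astype(np.int32) - audio)))   # at most 1
```

Real systems choose which bits to alter using perceptual models, much as mp3 chooses which bits to discard; plain LSB replacement is only the simplest starting point.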

-lhg