Originally developed for telecommunications and military applications in the 1930s, the vocoder (a portmanteau of “voice encoder”) has become indelibly associated as a musical tool with artists such as Daft Punk, Kraftwerk, and the Electric Light Orchestra (ELO), thanks to its ability to merge the characteristics of the human voice with the playability of a synthesizer.
Wendy Carlos recorded the first mainstream performance of a vocoder for her soundtrack to Stanley Kubrick’s A Clockwork Orange, using a 10-band version she developed with Bob Moog in 1970. From there, it was quickly established as a featured component of countless pop, funk, and dance hits. It is often confused with an effect known as the talkbox, which is an entirely different method for embossing the characteristics of the human voice onto a musical performance.
Over the past 45 years, there have been many successful vocoders, both hardware and software. The earliest hardware units from Bode, Roland, Moog and Korg found their way into top artists’ studios. Nowadays, software versions are baked into leading DAWs from Ableton, Apple, and Propellerhead, to name a few.
Although vocoders are typically used for creating “robot vocals,” they are capable of generating a wide variety of effects. But to get the most from this unique processor, it is essential to understand how a vocoder works.
Let’s talk basics
If you’ve ever seen the display of a graphic EQ respond to incoming audio, you will have witnessed an essential concept behind the vocoding effect. A vocoder consists of several filter bands, tuned across the audio spectrum, that respond to a set of envelope followers corresponding to their specific frequency ranges.
Vintage hardware, such as the Roland SVC-350, offered 8 to 12 analogue filter bands and provided enough intelligibility to be the lyrical focus of pop tracks like ELO’s “Mr. Blue Sky.” In software, the number of filter bands is far greater, with Logic’s EVOC 20 offering up to 20 bands, Ableton delivering 40 bands, and Reason’s BV512 including an FFT-like, 512-band setting.
In a classic configuration, the vocoder’s filters are applied to a bright synth patch (such as strings), while the incoming signal (usually a vocal) stimulates the envelope followers, which raise and lower the volume of their associated frequency band; attack and release parameters are provided for fine-tuning their response times. Thus, the characteristics of the human voice control the behaviour of the filters that are processing that synth.
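To make that signal flow concrete, here is a minimal sketch of a channel vocoder in plain Python. It is not how any particular hardware or plug-in is built. The function names (`bandpass`, `envelope`, `vocode`), the band centres, the Q value, the smoothing amount and the test signals are all assumptions chosen for illustration. Each band filters both signals, follows the level of the modulator's band, and uses that level to scale the carrier's band.

```python
import math

def bandpass(signal, centre_hz, q, sample_rate):
    """Second-order band-pass filter (RBJ cookbook form, 0 dB peak gain)."""
    w0 = 2.0 * math.pi * centre_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0 = alpha / a0
    b2 = -alpha / a0
    a1 = -2.0 * math.cos(w0) / a0
    a2 = (1.0 - alpha) / a0
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in signal:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def envelope(signal, smoothing=0.99):
    """Rectify the signal, then smooth it with a one-pole low-pass."""
    level = 0.0
    out = []
    for x in signal:
        level = smoothing * level + (1.0 - smoothing) * abs(x)
        out.append(level)
    return out

def vocode(carrier, modulator, centres_hz, q=4.0, sample_rate=16000):
    """Sum the carrier's bands, each scaled by the matching modulator envelope."""
    n = min(len(carrier), len(modulator))
    mixed = [0.0] * n
    for centre in centres_hz:
        carrier_band = bandpass(carrier[:n], centre, q, sample_rate)
        level = envelope(bandpass(modulator[:n], centre, q, sample_rate))
        for i in range(n):
            mixed[i] += carrier_band[i] * level[i]
    return mixed

# Hypothetical test signals: a 110 Hz sawtooth as the carrier and a 1 kHz
# tone standing in for the voice.
SR = 16000
saw = [2.0 * ((110.0 * i / SR) % 1.0) - 1.0 for i in range(4000)]
tone = [math.sin(2.0 * math.pi * 1000.0 * i / SR) for i in range(4000)]
out = vocode(saw, tone, centres_hz=[250, 500, 1000, 2000, 4000], sample_rate=SR)
```

With a silent modulator every envelope stays at zero, so nothing of the carrier gets through, which is exactly what you hear on a real vocoder when you stop talking.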
Carriers and modulators
As with FM synthesis and ring modulation, knowing how the carrier and modulator relate to each other is the crucial first step in understanding how vocoding works. The carrier is the signal that is processed by the vocoder. It is usually a synth, but other common options include distorted guitar and orchestral timbres because the carrier requires a broad frequency spectrum in order for the vocoder to work its magic.
When using a synthesizer as the carrier, the sound should have plenty of high-end so the vocoder’s filter array can be heard. Traditionally, an unfiltered sawtooth or noise waveform serves best for the basis of that sound. Square waves will also work, but their hollow timbre sometimes reduces the intelligibility of the effect. Modern wavetable synths, with their rich and complex character, will also work. The bottom line is that the carrier’s sound should contain a full spectrum of frequencies, with a lot of mid and high content, and no additional filtering.
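One way to see why the waveform matters is to count how many harmonics each one places inside a given filter band. The sketch below uses the textbook Fourier series for ideal sawtooth and square waves. The helper names and the 110 Hz note and 400 to 1000 Hz band are assumptions for illustration, and real oscillators will deviate from the ideal shapes.

```python
import math

def sawtooth_harmonic(n):
    """Amplitude of harmonic n for an ideal sawtooth with peak value 1."""
    return 2.0 / (math.pi * n)

def square_harmonic(n):
    """Amplitude of harmonic n for an ideal square wave (odd harmonics only)."""
    return 4.0 / (math.pi * n) if n % 2 == 1 else 0.0

def harmonics_in_band(fundamental_hz, low_hz, high_hz, amplitude):
    """List (frequency, amplitude) for every non-zero harmonic inside the band."""
    found = []
    n = 1
    while fundamental_hz * n <= high_hz:
        freq = fundamental_hz * n
        level = amplitude(n)
        if freq >= low_hz and level > 0.0:
            found.append((freq, level))
        n += 1
    return found

# Hypothetical example: a 110 Hz note and a band from 400 Hz to 1 kHz.
saw_content = harmonics_in_band(110.0, 400.0, 1000.0, sawtooth_harmonic)
square_content = harmonics_in_band(110.0, 400.0, 1000.0, square_harmonic)
```

In this example the sawtooth puts twice as many harmonics in the band as the square wave, so the filter has more material to shape.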
The modulator is the signal that activates the vocoder’s envelope followers, each of which controls a single filter band. In the case of the human voice, every vowel’s formant characteristics correspond with one or more of the filter bands that process the carrier signal. Thus, when you speak into a connected microphone, these envelopes open and close different filters, shaping the tone of the carrier’s signal.
A number of other parameters let you finesse the character of the vocoder’s sound. Not every vocoder includes all of them, and some versions feature additional options, but these are the most common tools you’ll find on modern examples.
Number of bands. This determines the total number of filter bands that are applied to the carrier signal. A lower number (in the 4 to 6 range) delivers a vintage sound, while anything higher than 16 bands offers more detail, often at the expense of higher CPU utilisation. As a rule of thumb, you can think of this number as the vocoder’s overall resolution.
Frequency range. For software vocoders that support these parameters, you can set an upper and lower frequency for the vocoder bands, which will be divided across the number of bands available. Generally, these parameters are handy for tuning the filter bands to accommodate the higher or lower vocal ranges of female and male vocalists, respectively. You can also use these to precisely adjust the detail of the vocoding effect when using other instruments as modulators.
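As a rough illustration of how a range might be divided, the sketch below spaces the band edges logarithmically, so every band spans the same musical interval. That spacing is an assumption: individual vocoders may distribute their bands differently, and the function names and frequency values here are hypothetical.

```python
def band_edges(low_hz, high_hz, bands):
    """Split the range low_hz..high_hz into bands of equal musical width."""
    ratio = (high_hz / low_hz) ** (1.0 / bands)
    return [low_hz * ratio ** i for i in range(bands + 1)]

def band_centres(edges):
    """Geometric mean of each pair of neighbouring edges."""
    return [(lo * hi) ** 0.5 for lo, hi in zip(edges, edges[1:])]

# Hypothetical settings: a lower and a higher range, both with 8 bands.
low_voice = band_edges(80.0, 8000.0, 8)
high_voice = band_edges(160.0, 12000.0, 8)
```

Raising the lower limit moves every band upward, which is why these two controls are a quick way to retune the whole filter bank for a different singer or instrument.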
Envelope. These parameters are most often attack and release (or attack and decay). As with other envelope-following processes, such as compression and auto-wah, the envelope settings determine how quickly the effect responds to dynamic changes in volume. For vocals, these are generally set to fairly quick response times, but longer times are great for more impressionistic results.
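Here is a minimal sketch of an envelope follower with separate attack and release times, written in plain Python. The function name, the exponential smoothing approach and the example times are assumptions for illustration rather than a description of any specific vocoder.

```python
import math

def envelope_follower(signal, attack_s, release_s, sample_rate):
    """Track the level of a signal with separate rise and fall times."""
    attack = math.exp(-1.0 / (attack_s * sample_rate))
    release = math.exp(-1.0 / (release_s * sample_rate))
    level = 0.0
    out = []
    for x in signal:
        target = abs(x)
        # Use the attack coefficient while rising, the release one while falling.
        coeff = attack if target > level else release
        level = coeff * level + (1.0 - coeff) * target
        out.append(level)
    return out

# Hypothetical test: one second of signal followed by one second of silence.
SR = 1000
burst = [1.0] * SR + [0.0] * SR
quick = envelope_follower(burst, attack_s=0.005, release_s=0.05, sample_rate=SR)
slow = envelope_follower(burst, attack_s=0.005, release_s=0.5, sample_rate=SR)
```

A tenth of a second after the burst ends, the quick setting has nearly closed its band while the slow one is still mostly open. That lingering tail is what gives long release times their smeared, impressionistic quality.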
Unvoiced. While a vocoder’s filters can cover a broad spectrum of sound, especially when using a high number of bands, intelligibility can suffer when plosives (the letters t, d, b, and such) and sibilants (s, z, x) are inaccurately processed by the filter bank. In this case, an additional highpass filter will help improve intelligibility for these unvoiced elements.
Confusingly, this feature goes by many different names. In Reason’s Vocoder, it is called “HF Emph” (for high-frequency emphasis). In Ableton Live, it’s simply “Unvoiced.” In Logic’s EVOC 20, the label is “U/V Detection.” But in all three cases, it is essentially the same feature: a highpass filter that blends in a small amount of treble content from the modulator input to emphasize the consonants. Some vocoders include an additional sensitivity parameter to further adjust this process.
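The idea can be sketched in a few lines: highpass the modulator and mix a little of it into the vocoded output. This is only an approximation of the concept. The one-pole filter, the 4 kHz cutoff, the blend amount and the function names are assumptions, and the real implementations in Reason, Live and Logic are not documented here.

```python
import math

def highpass(signal, cutoff_hz, sample_rate):
    """One-pole high-pass filter."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    a = rc / (rc + dt)
    out = []
    prev_x = prev_y = 0.0
    for x in signal:
        y = a * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out

def add_unvoiced(vocoded, modulator, amount=0.2, cutoff_hz=4000.0,
                 sample_rate=44100):
    """Blend high-passed modulator content into the vocoder output."""
    treble = highpass(modulator, cutoff_hz, sample_rate)
    return [v + amount * t for v, t in zip(vocoded, treble)]
```

Because the filter removes most of the voice below the cutoff, what gets blended in is mainly the hiss and click of consonants rather than the pitched part of the vocal.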
Formant Shift. Users of plug-ins such as Auto-Tune and Melodyne probably know that formant-shifting a vocal results in either chipmunk (upward) or Darth Vader (downward) effects. In the case of vocoders, the results are very similar. Here, the technique is applied to the vocoder filter bands directly, shifting their ranges up or down across the frequency spectrum. It’s worth mentioning here that Reason’s Vocoder refers to this parameter as “Shift.”
Bandwidth. Called “Formant Stretch” on the EVOC 20 and “BW” on Live, the bandwidth parameter controls the total width of each frequency band. The default value is 100%, with 200% creating overlap between the ranges and extremely low values imparting a shimmery, crystalline texture to the vocoder output.
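Both controls can be thought of as simple arithmetic on the edges of each filter band. The sketch below is one plausible interpretation, not the documented behaviour of any plug-in: it assumes the shift is expressed in semitones and that bandwidth scales each band around its centre on a logarithmic frequency scale. The function names and example values are hypothetical.

```python
def shift_band(low_hz, high_hz, semitones):
    """Move a band up or down by a musical interval."""
    ratio = 2.0 ** (semitones / 12.0)
    return low_hz * ratio, high_hz * ratio

def scale_band(low_hz, high_hz, percent):
    """Widen or narrow a band around its centre; 100 leaves it unchanged."""
    centre = (low_hz * high_hz) ** 0.5
    half_width = (high_hz / low_hz) ** 0.5
    scaled = half_width ** (percent / 100.0)
    return centre / scaled, centre * scaled

# Hypothetical band from 400 Hz to 800 Hz.
shifted_up = shift_band(400.0, 800.0, 12)
overlapping = scale_band(400.0, 800.0, 200)
narrow = scale_band(400.0, 800.0, 25)
```

At 200% the band reaches well past its original edges and into its neighbours, while at 25% it covers only a thin slice around the centre, leaving gaps between bands.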
Now that you have a grasp of the fundamental properties and controls on a standard vocoder, here are a few tips for making the most of its features—both for vocals and special effects.
Articulation. Newcomers to vocoding often sing or speak naturally into a vocoder, but are then disappointed when the results are mushy or unclear. Here are two tricks for getting a robot voice to sound right when working with a vocoder.
Over-articulate your words. Since the original voice won’t be heard in the mixed vocoder output, don’t be shy about this. It’s okay to sound ridiculous, as long as the vocoding effect sounds good. If you’re going for a hybrid sung vocal with vocoder, dedicate a separate track with exaggerated pronunciation to drive the modulator input.
Add compression to the modulator signal. Natural phrasing often de-emphasizes some syllables, which can make it harder for a given filter band to respond appropriately. Compressing the vocal before it hits the modulator input will help keep the voice clear and consistent. The amount of compression will vary depending on your goal, but if you find a well-articulated vocal isn’t driving the vocoder properly, try adding a compressor to it.
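For readers who want to see what the compressor is doing to the modulator, here is a bare-bones feed-forward design in Python. It is a sketch under simple assumptions (instant attack, a fixed release coefficient, no make-up gain), and the function name and default settings are hypothetical rather than taken from any product.

```python
import math

def compress(signal, threshold_db=-20.0, ratio=4.0, release=0.999):
    """Reduce gain whenever the tracked level exceeds the threshold."""
    level = 0.0
    out = []
    for x in signal:
        # Peak tracker: jump up instantly, fall back slowly.
        level = max(abs(x), release * level)
        level_db = 20.0 * math.log10(max(level, 1e-9))
        over_db = max(0.0, level_db - threshold_db)
        gain_db = -over_db * (1.0 - 1.0 / ratio)
        out.append(x * 10.0 ** (gain_db / 20.0))
    return out

# Hypothetical syllables: one loud, one quiet.
loud = compress([1.0] * 100)
quiet = compress([0.05] * 100)
```

The loud syllable is pulled down while the quiet one passes untouched, so the gap between them shrinks and the envelope followers see a more even signal.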
Vintage Robots. The classic sounds that we know from Kraftwerk and ELO were derived in the ’70s when the number of vocoder bands was generally twelve or fewer. If you’re going for that timbre, you should use a similar number. Some vocoders, such as Ableton Live’s, allow you to select as few as four bands, which is cool for subtle processing of chords and leads.
For Daft Punk-style vocoder and talkbox-like effects, increase the number of bands to the maximum, as this will dramatically increase intelligibility for a more modern sound.
Carrier Options. While sawtooth-based carriers will deliver the most familiar results, here are a few areas for experimentation. Just remember that if a wide frequency range isn’t present in the carrier, the vocoder filters won’t have enough audio information to process.
1. Add a touch of glide to a monophonic sawtooth carrier, which will make the vocoder effect “sing” more gracefully. A good reference for this is Daft Punk’s “Around the World” (which sounds more like a talkbox than a vocoder).
2. If you want to learn a lot about the relationship between voice and pitch for different speaking styles and emotions, experiment with pitch LFOs. While these are great for adding vibrato, try using different waveforms, rates and depths for a creative range that goes from a questioning voice to laughter effects.
3. The quickest way to create whispers and crowds is to use slightly filtered noise as your carrier. The results are often creepy.
4. Old-school hip-hop vocoder effects often involve a single bass sawtooth as their carrier. Try using two sawtooth oscillators an octave apart as your source for added depth and character.
5. If you have a sufficiently rich choir patch or a sample with plenty of mids and highs, you can simulate a singing chorus by using it as the carrier.
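If you would rather prototype some of these carriers in code than on a synth, the sketch below generates a few of them in plain Python: a sawtooth whose pitch can glide or wobble, two sawtooths an octave apart, and unfiltered white noise. All function names, rates and depths are assumptions for illustration, and the naive sawtooth here will alias at high pitches.

```python
import math
import random

def sawtooth(freqs_hz, sample_rate):
    """Phase-accumulator sawtooth; freqs_hz gives the pitch for each sample."""
    phase = 0.0
    out = []
    for f in freqs_hz:
        out.append(2.0 * phase - 1.0)
        phase = (phase + f / sample_rate) % 1.0
    return out

def glide(start_hz, end_hz, n):
    """Exponential pitch glide from one note to the next (tip 1)."""
    return [start_hz * (end_hz / start_hz) ** (i / max(n - 1, 1))
            for i in range(n)]

def vibrato(base_hz, rate_hz, depth_semitones, n, sample_rate):
    """Per-sample pitch with a sine LFO applied (tip 2)."""
    return [base_hz * 2.0 ** (depth_semitones
            * math.sin(2.0 * math.pi * rate_hz * i / sample_rate) / 12.0)
            for i in range(n)]

def noise(n, seed=0):
    """White noise, before any filtering (tip 3)."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]

def octave_stack(base_hz, n, sample_rate):
    """Two sawtooths an octave apart, averaged (tip 4)."""
    low = sawtooth([base_hz] * n, sample_rate)
    high = sawtooth([2.0 * base_hz] * n, sample_rate)
    return [0.5 * (a + b) for a, b in zip(low, high)]

# Hypothetical usage: a one-second carrier gliding up an octave.
SR = 8000
gliding_carrier = sawtooth(glide(110.0, 220.0, SR), SR)
```

Any of these lists can be fed to a vocoder as the carrier signal; swapping the LFO's sine for another shape is an easy way to explore the questioning and laughing effects described in tip 2.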
Ethereal Voices. DJ Sasha and his team of producers have used haunting vocoder tones on remixes like “Talk Amongst Yourselves (Sasha Involver Remix)” by Grand National. This unusual effect is created by using very narrow filter bandwidths, in the 20% to 70% range.
Drums. Substituting full drum kit or percussion loops for the vocal at the modulator input is a fantastic way to emboss natural rhythmic elements onto chords and pads. The more complex and bright the performance is, the more intense the effect will be. Try using drum loops with lots of hi-hats and cymbals or mixed percussion loops that include shakers, tambourines and congas.
Whooshes. Using an envelope- or LFO-modulated filter on white noise as your modulator is a great way to create unusual sweeps in conjunction with chords or leads. For more nuanced control, do those sweeps with your mouth, by whooshing and swishing ridiculously. The precision you can achieve with this approach is striking.
Transporter. If you use swept white noise as your modulator, a thick sawtooth pad as your carrier, and extremely narrow filter bands on the vocoder, followed by reverb and delay, you’ll have a great starting point for re-creating the transporter effect in Star Trek: The Next Generation.