Sound-source identity preservation in a stuffed-to-surfeit composition’s mix
Posted by runagate on July 6, 2009
As my first blog aptly demonstrated, it is extremely easy to get caught up in the sheer complexity of trying to come to grips with this new technological ecosystem we computer musicians find ourselves in.
One point I’d like to make clear up front is that I realize that the method of the preponderance of my fellow experimental composers is to take a small collection of tools as their palette in painting what P.K. calls “holographic” audio. I think this is a sensible approach, and a valid one, and one that dovetails nicely with their aesthetics.
I, myself, taking as a jumping off point something few have considered attempting, have a different set of problems to overcome. I firmly believe that “more is more” when it comes to having an embarrassment of riches, an overabundance of options.
To put my aesthetic into a graspable form, to make it more explicit - since audio itself does not tell you what its creator is attempting - and since words are fairly impotent at pointing directly with our visually-oriented language, let me toss out a few visual metaphors:
* the self-transforming machine entelechies of Terence McKenna’s travelogues of the DMT dimension
* the sensory overload of psilocybin-induced technologically-themed “download to consciousness” visions in the throes of a “heroic dose”
* the embarrassment of riches of godlings, avatars and multiplicity of aspects of Brahman Atman in a tapestry of the Vedic cosmology
* the sensorium-blast of popular movies’ conception of Virtual Reality (all of which seem to come from the same wellspring despite VR not actually existing yet): “Lawnmower Man,” “Johnny Mnemonic” (which of course is based on Gibson’s story), “The Cell,” “The Matrix”
* the blooming, buzzing confusion of William James’s infant
* the traceries of a Cordoban book cover or the Alhambra’s purulent, twining calligraphy
Not that I have these things in mind when composing. I simply recognize a common archetypal root. To impose the rococo upon the baroque, to gild the lily and then add a fiber optic light show to that. Sharks with friggin’ laser beams on their heads.
These choices of “depth of detail” in non-repeating variations are not arbitrary. Once one is committed to one sound-source which is peculiar to the digital techniques that we have access to now, and for which there is often no habituated or genre-based normative usage, it behooves one to carefully balance the other sounds “in the mix.” For let’s not kid ourselves: no one is going to have ears forgiving of these sounds the way they would of pre-conditioned auditory events like a cassette recording.
More than anything it leads me to making bizarre proof-of-concept compositions. Which is not to say that I don’t truly enjoy my KVR-type music. But I have a lot more on my “to-do list” and a 400 song backlog from the past plus new ideas all the time. I can’t seem to “commit” to a genre; at least with experimental digital composition it’s all so new that it somehow all fits together as a group. It’s not like if I made a calypso song and a 7/8 locrian blues song and then had to ponder whether they’d fit into the same performance or album.
Also, it’s very freeing to make such “avant” songs as 1) I am making them to listen to myself and 2) there’s no pre-conditioning in the listener. Point 2 has its good and bad points. The good is that you don’t have to deal with possible expectations in a genre-form that you disagree with. I’m not a purist by any means but it seems that our contemporary culture’s way of achieving “natural selection” amongst variations and elaborations on a theme rewards the least common denominator far more often than not. Well, one need only look at hip-hop’s current state, or electronica’s manifestations that actually cross popular consciousness etc. etc.
The drawback of not having a pre-conditioned tradition in the ear of the listener is, of course, that without a grounding in familiarity most people aren’t going to bother allowing the song to play in their presence in the first place let alone give it attention enough to decide whether they are intrigued by what they hear.
Which is, of course, why so many types of music find a loving home mostly amongst connoisseurs. If you know enough about the style of music, or the instrument, or whatever, you’re much more likely to be drawn in by the shock factor of an artist showing off what’s possible.
The reason that I use “naive” melodies and rhythms in this kind of music is so as not to distract the listener from the timbral morphing or the depth of field trickery and movement or whatever happens to be the non-standard compositional raison d’être of the song. I’ve spent a ludicrous amount of time testing - sadly, only on myself - how human perception ends up focusing its attention during all the goings-on in my music, through various types of transformations. I recall dropping a track with the heading “foreground/background swivel” or something like that… the idea being that different elements, as they underwent a wide variety of transformations (pitch, including Doppler or filtering; loudness; time dilation - speeding and slowing; 3D movement, whether it be near/far or in one case falling into “orbit” at various speeds around the listener; &c.), would capture the attention of the listener at different times and in different ways.
Of course, the complexity of what kind of change would draw one’s attention was compounded by the fact that a large variety of different sounds were all in the same stereo field competing to be the one that “stood out.” As we all know (even if we’ve never consciously put it into words as such), loudness matters the most, followed by anything whose frequency range falls into the “sweet spot” that the length of our ear canal resonates at - the frequency of human voices - and then it gets really complicated really fast. Startling introductions of new sounds (or re-introduced intermittent sounds) work well because we’re wired to react quickly to being ambushed by something that’s likely to eat us.
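For the curious, the ear canal “sweet spot” can be roughed out with a few lines of arithmetic. This is only a sketch: it treats the canal as a simple tube closed at the eardrum end, and the 2.5 cm length is an assumed typical adult figure, not a measurement of anybody’s actual ear.

```python
# Quarter-wavelength resonance of a tube closed at one end (the eardrum).
# Both constants are assumptions chosen for illustration.
SPEED_OF_SOUND_M_S = 343.0   # dry air at roughly 20 degrees C
EAR_CANAL_LENGTH_M = 0.025   # assumed typical adult canal, about 2.5 cm


def quarter_wave_resonance_hz(length_m, speed_m_s=SPEED_OF_SOUND_M_S):
    """Lowest resonance of a tube open at one end and closed at the other."""
    return speed_m_s / (4.0 * length_m)


resonance_hz = quarter_wave_resonance_hz(EAR_CANAL_LENGTH_M)
print(round(resonance_hz))  # about 3430 Hz under these assumptions
```

Under these assumptions the figure lands a bit above 3 kHz, which is higher than the fundamental of a speaking voice but right in the band where consonants and vocal “presence” live.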
After that… well, I’m not sure how to put anything else I discovered into words. But the attempt was a success. Although I’m the one who created this particular soundscape I find it fascinating to listen and observe how my attention flickers amongst the happenings. Sometimes the density of material crescendos and suddenly several foreground sounds fall away and you’re left involuntarily attending to the tapestry of background noises that’d been “edited out” by your ear while foreground sounds had demanded your focus. It’s almost as though it were crafted to train the mind to catch itself in the act of paying attention.
Anyone reading this is likely to be well advanced into the stage where it’s actually hard to turn off their critical listening faculties. That’s the curse of we who spend a lot of time mixing audio. At least we’re mixing our own; I’ve read about many studio engineers that can’t listen to music recreationally after decades of service to other people’s muses. But what of a normal listener? For the most part when people mix music a lot of effort goes into hiding the process of how one’s attention is directed here and there.
For the “more is more” composing approach one thing that immediately became an issue is that we still only use two speakers and all of the sounds have to deploy in space from them. When I first started out making computer music I knew nothing about mixing, had no decent equipment and, worst of all, no internet access whereby I could conceivably learn such things. The FL manual was no help. It’s fairly well organized but its job is not to explain the jargon of recording, DSP and computer audio in general. So my first forays into mercilessly packing the stereo field had, shall we say, a generous amount of amateur mistakes. But unlike in previous eras my amateurish mistakes were backed up by digital audio editing and a lot of effects processing power. Luckily for my mixes I had but a 1 GHz Pentium III processor so simply changing the volume on my 10-minute-long magnum opus could take 20 minutes of watching a blue progress bar slowly inching forward before I even got to hear if the new volume setting was correct.
About the only useful trick I learned from my first batch of freeware was abusing the hell out of multiband compression. If some of my cherished sounds were “hidden” in the mix simply smashing them with compression until they’re all equally audible (and equally lacking in dynamics) was the only solution I had available. I’m hardly unique in that. But I knew there had to be a better way.
As it turns out, the best tricks I’ve found haven’t, to the best of my knowledge, been used before, at least not as a conscious way around the limitations of our two-speaker stereo systems. I’m sure I’ve mentioned somewhere around here my use of 3D binaural panning and how, perceptually, it makes our incredibly amazing hearing system do the work for me. E.g. if you have a kick drum and a bassline, and you pan the bass “back” away from the listener (and narrow the stereo field in doing so) and “downwards,” the filtering that comprises the binaural effect to give your ears “directional cues” actually makes those two sounds, which share the same frequency range, more distinct. You can tell them apart easily when they’re positioned in three dimensions away from one another. This example is rather remarkable as binaural filtering takes place far above the bass frequency range, so you’re really getting these “directional cues” from the treble-range harmonics and attack transients and the like.
This is useful as far as it goes, and I’m sure I’ll get into it more elsewhere, especially given that sidechain compression, stereo width, traditional panning and the like can also contribute to differentiating two sound sources that share the same frequency range.
But there’s another overriding factor in the “more is more” approach. When that transformational filigree involves change over time, or morphing, or what have you, as the vast majority of my favorite effect processing does, it’s all too easy for the listener, especially one not already steeped in the esoteric DSP arts as we are, to “lose track” of the supposedly distinct threads of auditory events…
What do I mean by this? Well, take a more “minimal” approach to timbral morphing, say, a filtered bassline in an electronica song. Although the sound changes over time the listener is likely to realize that it comprises “one intentional sound source” because, although the filter cutoff may change suddenly, the resulting sound is still obviously a bass and it’s probably pretty repetitive, aiding the process of maintaining its identity over time due to the melodic content (which itself may be rather static).
Or let’s take the famous “quacking banjo” morph. If it starts out as a duck’s quack at point A and morphs to a banjo at point B it has to occur slowly enough that there are in-between states interpolated between the two sounds for the listener to even realize the fun audio trick being played. In more minimal approaches to composition you’ll tend to hear quite slow transformations… sometimes even too slow, such that the listener isn’t even paying attention closely enough to realize that the interpolated states are related to the intended beginning and end points!
So when packing a mix full of a number of wildly transforming sounds a lot more thought has to go into maintaining the integrity of the “identity” of the streams of sounds if that’s required by the composition. Obviously, there’s plenty of examples of both disregarding the “identity” of a piece’s used noises and just as many where only the creator knows of the original sound source, such as using pictures as oscillators or most granular synthesis. Those are of interest to me, too, but simply not germane to what I’m trying to explain. It’s an issue of the purpose of one’s use of transformative processing. But not such that those hearing your sounds can identify the sound per se but that if a changing sound source is to be regarded as one event-over-time despite significant alterations to its time domain, timbre, or what have you, then one must consider how it comes across to the beholder in the context of the mix which may also have many similarly drastically changing “sound sources.”
To illustrate that I’ll give you an example that utilizes no processing. Consider a clarinet. Most anyone who hears your recording of a clarinet is likely to recognize it (from its envelope and its timbre), even if they don’t specifically recognize it as a clarinet (don’t laugh, I’ve seen many, many people unable to do so, guessing as far afield as a “flute” and “some horn” or the like). Not everyone that reads this may be able to differentiate a clarinet from an oboe, for that matter. But in most cases a listener will recognize the sound for what it is. But what of a shrill squeak from a beginner? Well, it’s not all that obvious that a clarinet is causing that sound unless 1) there’s nothing else happening in the “audio timeline” to have caused the sound or 2) the listener has already encountered someone trying to play a woodwind for the first time. How about a purposefully-played overtone/harmonic? Well, the same is true as above except that, in the hands of a musician capable of executing this extended technique, its “identity over time” has a couple of factors in its favor. It’s even more likely, due to rhythmic continuity, to be parsed as a purposefully-played part of what’s preceded in the song and their “breath technique” is likely to make the horn’s envelope even more instantly recognizable than a beginner’s mistake.
To take the example further, consider the extended technique of multiphonics. Depending on what “broken chord” is being played the sound ranges from something fairly obviously harmonized-woodwind-esque (albeit atonally) to something whose very “density” would seem to divest it of woodwind-ness as clarinets’ normal sound is always monophonic and rather non-dense, not varying that much from a sine wave like the human voice or a flute. Is the average listener who has possibly never even heard a woodwind player executing extended technique likely to have any idea of what they’re hearing? Well, depending upon the context it might not matter. I’m not dealing strictly with a sound being “identifiable” as a specific, known instrument or sonic event but with its being regarded as being caused intentionally from the same sound-generating source regardless of what that is or whether one can give a name to it.
In this example I’m pretty sure that any listener would eventually figure it out. Even if the sound is startlingly different the player would likely repeat a similar technique more than once, or the mind would pick up on the mouth causing tremolo or vibrato in a recognizably woodwind-like way, or there simply wouldn’t be any other sonic material in the recording to confuse the issue. I can think of counterexamples I’ve heard, but I think I’m homing in on a description that conveys what I’m sketching out here.
When I look at my sound palette I see that I use both sounds that we haven’t got a name for and processes over time that can potentially obscure what is in reality a single source of sound (be it an instrument, a field recording, a voice, or whatever) and thus, if it’s compositionally important, ruin the whole point of morphing.
Consider some of the specifics: For the last several years I’ve used almost exclusively synthesis to generate sound, only rarely samples and those mostly for drums that’re processed beyond recognition anyways (though real drums are laudably capable of retaining their distinct attack and other envelope characteristics better than synth drums for my purposes). Synthesizers already are likely to create an “unnamed” sound in a recording. But I also love to change them in myriad ways over time with parameter automation. This immediately brings up the possibility of changing the sound significantly enough over time that it’s not grokked as “the same instrument.” Obviously this barely matters if the melodic content draws the listener’s attention through the timeline of change anyways. For synthesizers we typically have to take into account change to waveform type, waveform mix (in the case of multiple oscillators), waveform modulation (frequency modulation, ring modulation, phase distortion, even pulse width modulation or even wavecycle offset of supersaw effects), timbre (in the case of additive and some other types of synthesis), filter type, frequency and resonance, and most especially the ADSR envelope. Changing the envelope of anything obscures its “identity” much more drastically than any other parameter I can think of, especially the attack. This is just a short and simple list - there’s a great deal of variation amongst synthesizers that’d have to be considered on a case-by-case basis. For instance, I’m quite fond of automating the modulation in a modulation matrix which dramatically increases the complexity of these considerations.
As a special case, because I use them so often, I’m going to mention physical model synthesizers. Given that one has access to the waveguide parameters that create a distinct “model” one can quite easily transform a PM sound source beyond recognition with tiny little parameter changes. I’ve played with the “identifiableness” of PM tuned plates, like metallophones, in any number of ways: subtly changing over time, allowing them to mutate beyond recognition suddenly and return, and even setting their parameters so it’s never obvious that they’re tuned plates in the first place. I love the fact that PM sounds have a visceral “there-ness” to them regardless of whether they’re emulating something that could actually exist or not.
Then there’s processing those possibly-already transforming synthesizer sounds with automated or modulated DSP. The number of effects boggles the mind: delay (which, when automated, creates often lovely “interpolation” effects beyond simple echo, resulting in a sound like the speeding up or slowing down of a tape recording which I shall hereafter refer to as “dub echo scrubbing”), flanger/chorus/phaser (which are all modulated micro-delays of one form or another but, given the right kind of modulation, can achieve a wide variety of effects many of which can easily obscure the affected sound’s “identity”), comb filtering and other “resonators,” distortion (already fairly likely to mask a sound’s identity), panning which includes autopanning and 3D binaural stereo placement/movement, vibrato, tremolo, bitcrushing and decimation, dopplering, filters (a wide variety typically with resonance, and including envelope wah effects), formant effects like the talkbox (vocoding isn’t really an effect since the “sound generation” is typically a synthesizer, but there’s also the “morph” effect of side-chaining two sounds through a vocoder but that typically refers to a type of spectral or convolution effect), gates (including rhythmic gating and the “swell” effect of a slow volume envelope), pitch- and formant-shifting (including interesting variations such as the auto-arpeggiator and harmonizers), reverb, ring modulation, (takes a deep breath)
Plus a whole list of even more severe or esoteric processes. There’s a variety of spectral effects such as warped filters (a la mutagene’s VSTs), spectral accumulation, spectral exaggeration, side-chaining to force the spectral overtone series of one sound onto another’s, spectral vocoding/morphing, resynthesis (which, like vocoding, is a bit beyond mere effect processing), a wide variety of “buffer override” effects which, though sharing a root DSP process don’t really constitute one “effect” but includes the stutter effect, time stretching in order to accentuate the algorithm’s artifacts, dub echo scrubbing (such as moving the tape across the read head like dub reggae or the “turntable scratching” effect which is entirely possible with realtime audio processing, or the “tapestop” effect, or the “playing backwards” effect), waveshaping (which like feedback delay networks can either be used with essentially random settings or set with a particular algorithm which can cause a wide variety of specific effects), convolution (which can range from reverbs to hardware emulation to purely abstract effects using waves not even meant to be impulses), and finally granulization which comes in many flavors that don’t necessarily sound anything alike any more so than chorus, phaser and slapback are all forms of delay effects.
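The “dub echo scrubbing” mentioned above has a tidy bit of arithmetic behind it, at least for an idealized delay that reads its buffer with interpolation the way a tape or bucket-brigade unit effectively does. A minimal sketch follows; the function names are made up for illustration and real plugins may smooth or quantize the delay time differently.

```python
import math


def scrub_pitch_ratio(delay_start_s, delay_end_s, sweep_time_s):
    """Playback-speed ratio heard while a delay time moves linearly.

    The delayed signal is y(t) = x(t - D(t)), so its read position advances
    at a rate of 1 - dD/dt relative to real time.
    """
    delay_slope = (delay_end_s - delay_start_s) / sweep_time_s
    return 1.0 - delay_slope


def ratio_to_semitones(ratio):
    """Convert a speed ratio to a pitch offset in equal-tempered semitones."""
    if ratio <= 0:
        raise ValueError("read position is stopped or running backwards")
    return 12.0 * math.log2(ratio)


# Shortening the delay from 500 ms to 250 ms over one second speeds the echo up.
up_ratio = scrub_pitch_ratio(0.500, 0.250, 1.0)
# Lengthening it again over the same second slows the echo down.
down_ratio = scrub_pitch_ratio(0.250, 0.500, 1.0)

print(up_ratio, ratio_to_semitones(up_ratio))
print(down_ratio, ratio_to_semitones(down_ratio))
```

So a fairly gentle automation curve is already worth several semitones of bend, which is part of why automated delays can blur a sound’s identity so quickly.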
Another layer of complexity is the automation and/or modulation you utilize. One popular form is parameters linked to a step sequencer. This aids sound-source identity over time due to our minds interpreting the periodicity of the rhythm as continuity regardless of how drastic a timbral change is. This is true even with non-integer ratios of rhythm: I love to automate the “clock” of step sequencers for “dilating” rhythms, and bouncing ball rhythms are popular, too. There’s a certain range of LFO frequency that we find doesn’t obscure our perception of “identity,” usually a different rate for each kind of effect, I suppose having something to do with how drastic a change the effect causes. I wonder if there’s some sort of formula for each type of process based on how we perceive, much as the “thickening” of a chorus sound happens around 20ms whereas we hear distinct echoes from a delay somewhere around 30ms and higher? How much timbral obscurement can our ability to identify a sound source tolerate? How fast and by how much can a vibrato fluctuate a sound source’s pitch before we re-categorize it as a newly-occurring sound? Or, for my purposes, how quickly can one transition from unaffected to wildly-affected sound without losing that sense of “identity?” Or how much can one rely on repetition, melodic continuity or rhythmicality in an effect to preserve that identity?
I don’t know that there are answers to such analytical questions. For my purposes, again, it probably doesn’t matter and will have to be dealt with on a case-by-case basis due to my insistence on putting several such transforming sounds into a soundscape simultaneously.
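As a small aside on the “bouncing ball” rhythms mentioned above, here is one way such a pattern could be generated. It is a sketch under the assumption that each gap is a fixed fraction of the previous one; the function name and the numbers are purely illustrative and no particular sequencer works exactly this way.

```python
def bouncing_ball_onsets(first_gap_s, ratio, hits):
    """Onset times (in seconds) for a rhythm whose gaps shrink geometrically."""
    onsets = []
    time_s = 0.0
    gap_s = first_gap_s
    for _ in range(hits):
        onsets.append(time_s)
        time_s += gap_s
        gap_s *= ratio   # each bounce lasts a fixed fraction of the last one
    return onsets


# Half a second to the first bounce, each later gap half as long as the one before.
onsets = bouncing_ball_onsets(0.5, 0.5, 4)
print(onsets)  # [0.0, 0.5, 0.75, 0.875]
```

Feeding onsets like these to a parameter instead of a note trigger gives the “dilating clock” flavor: the periodicity is never an integer ratio, yet the ear still hears one continuous gesture.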
I’d like to add audio-based modulation to the foregoing, despite not having fully fleshed out all the types of modulation with which you’re probably already familiar. In its simplest form you’ll encounter an envelope detector which then uses that data to modulate some parameter of an effect. The most famous is the auto-wah: the volume envelope of the incoming audio is used to modulate the cutoff of a filter, so that depending on how you play more or less wah effect occurs. To the best of my knowledge there’s only three more types of audio input-dependent modulation: peak-detection (which is just a simpler version of envelope detection), zero-crossing detection (it counts the number of times the incoming waveform crosses the “zero” line, thus a simple sine causes little modulation whereas a chord causes more, and a “broken” chord of a woodwind multiphonic causes a great deal of very rapid modulation), and pitch-tracking (whereby the pitch is analyzed and sent as the modulation level).
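Two of these detectors are simple enough to sketch in plain Python. The following is a rough illustration rather than how any particular auto-wah is built: the attack and release times, the cutoff range and all of the function names are assumptions made up for the example.

```python
import math

SAMPLE_RATE = 8000  # assumed rate, kept low so the example runs quickly


def envelope_follower(samples, sample_rate, attack_s=0.005, release_s=0.100):
    """Track loudness: rectify, then smooth with separate attack and release."""
    attack_coeff = math.exp(-1.0 / (attack_s * sample_rate))
    release_coeff = math.exp(-1.0 / (release_s * sample_rate))
    level = 0.0
    envelope = []
    for sample in samples:
        rectified = abs(sample)
        coeff = attack_coeff if rectified > level else release_coeff
        level = coeff * level + (1.0 - coeff) * rectified
        envelope.append(level)
    return envelope


def envelope_to_cutoff_hz(level, low_hz=200.0, high_hz=2000.0):
    """Auto-wah style mapping: louder input opens the filter further."""
    clamped = min(max(level, 0.0), 1.0)
    return low_hz + clamped * (high_hz - low_hz)


def zero_crossings(samples):
    """Count sign changes between neighbouring samples."""
    return sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))


def sine(freq_hz, amplitude, phase=0.1):
    """One second of a sine wave; the small phase offset avoids exact zeros."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE + phase)
            for n in range(SAMPLE_RATE)]


soft_cutoff = envelope_to_cutoff_hz(envelope_follower(sine(110.0, 0.2), SAMPLE_RATE)[-1])
loud_cutoff = envelope_to_cutoff_hz(envelope_follower(sine(110.0, 0.9), SAMPLE_RATE)[-1])

plain = sine(110.0, 0.5)
dense = [a + b for a, b in zip(sine(110.0, 0.5), sine(880.0, 0.5))]

print(soft_cutoff, loud_cutoff)                     # louder playing, higher cutoff
print(zero_crossings(plain), zero_crossings(dense))  # denser sound, more crossings
```

The second print shows the behavior described above: stacking a strong upper partial on a plain tone multiplies the crossing count, so a dense multiphonic would drive this kind of modulation much harder than a single sine.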
Each of these input-based modulation types can benefit from a modular DAW’s data manipulation. Give the modulation level an emphasis curve like you can to the velocity of a keyboard, reverse it entirely (so that, with a pitch-tracking modulation, lower notes cause faster modulation) and the like. You can easily dream up more complex ways of filtering, constraining and transforming the data which may be useful. Additionally, having a sidechain input go to the modulation source can be fun, for instance having a ring modulator on a bassline but have the pitch-tracking detector’s input be a trumpet track, so that the ringmod frequency always follows the trumpet melody instead of the bass’s frequency…
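The emphasis curve and the reversal are easy to picture as a single small function sitting between the detector and the effect parameter. This is a sketch assuming the modulation value has already been normalized to the 0..1 range; the name `shape_modulation` and its parameters are hypothetical and not taken from any DAW.

```python
def shape_modulation(value, curve=1.0, invert=False, low=0.0, high=1.0):
    """Reshape a normalized 0..1 modulation value before it reaches an effect.

    curve > 1 de-emphasizes small values, curve < 1 exaggerates them.
    invert flips the direction, and low/high rescale to the target range.
    """
    clamped = min(max(value, 0.0), 1.0)
    shaped = clamped ** curve
    if invert:
        shaped = 1.0 - shaped
    return low + shaped * (high - low)


print(shape_modulation(0.5, curve=2.0))                # 0.25
print(shape_modulation(0.5, curve=2.0, invert=True))   # 0.75
print(shape_modulation(0.5, low=100.0, high=300.0))    # 200.0
```

With `invert=True` on a pitch-tracked value, low notes land at the top of the range, which is the “lower notes cause faster modulation” case.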
What I’d like to obtain (or, if I knew how to, I’d simply create myself) are other forms of modulation based on analyzing the incoming audio. How about the “average of the three most prominent harmonic overtones” as a modulation source? Or “variation of stereo phase” modulating a track’s 3D effect, making it tremble from “near” to “far”…?
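For what it’s worth, the first of those wished-for sources isn’t hard to prototype offline. Below is a minimal sketch that takes “most prominent harmonic overtones” to mean the three strongest peaks in a magnitude spectrum, which is my own interpretation. It uses a slow, dependency-free DFT on one short block, and every name in it is made up for the example; a realtime version would need windowing, an FFT and some smoothing.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Naive DFT magnitudes for bins 0..N/2 (slow, but needs no libraries)."""
    n = len(samples)
    return [abs(sum(s * cmath.exp(-2j * math.pi * k * i / n)
                    for i, s in enumerate(samples)))
            for k in range(n // 2 + 1)]


def mean_of_top_peaks_hz(samples, sample_rate, count=3):
    """Average frequency of the `count` strongest spectral peaks."""
    mags = magnitude_spectrum(samples)
    peaks = [k for k in range(1, len(mags) - 1)
             if mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]]
    strongest = sorted(peaks, key=lambda k: mags[k], reverse=True)[:count]
    if not strongest:
        return 0.0   # silence or a flat spectrum: nothing to report
    bin_hz = sample_rate / len(samples)
    return sum(k * bin_hz for k in strongest) / len(strongest)


# Test tone: partials at 100, 200 and 300 Hz plus a much weaker one at 700 Hz.
SAMPLE_RATE = 2560
BLOCK_SIZE = 256
PARTIALS = [(100.0, 1.0), (200.0, 0.6), (300.0, 0.3), (700.0, 0.05)]
block = [sum(amp * math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
             for freq, amp in PARTIALS)
         for n in range(BLOCK_SIZE)]

modulation_hz = mean_of_top_peaks_hz(block, SAMPLE_RATE)
print(modulation_hz)  # close to 200 Hz, the mean of the three strongest partials
```

The resulting number would still need scaling into whatever range the target parameter expects before it could act as a modulation source.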
Having brought side-chaining into the discussion again let’s clamber up another level of complexity: all of these processes can be chained in the modern digital DAW. One has hardly learned some of the popular possibilities of the run of the mill effect processes! Now you’re expected to guess what one will sound like atop another? Or in a chain of eight sequentially? Well, why not, now that a near-infinitude of possibilities has suddenly been dropped into the laps of we who for whatever reason happened to get hipped to computer-based composition/recording/mixing/etc.?
I often hear the complaint that “there are too many options” - perhaps, for some, there are. I’d been pondering exactly this eventuality long before I had the slightest hope of getting my crazy mitts on all these new tools. I still recall being blown away by Cool Edit’s ability to slice and rearrange! It’s not as though we’re sitting before the modular synthesizer behemoths of over 30 years ago, having to heat the room up to inhuman temperatures while physically patching 1/4″ cables here and there and everywhere - and yet they did it. We should be thankful to be granted this digital paradise! I honestly can’t believe they managed to take some of those units on tour.
Well, that argument won’t be heard from my mouth anytime soon. If anything certain factors still work in our favor (re: my discussion of freeware and crowdsourcing). We don’t actually have to construct every synth and effect ourselves like with SuperCollider or Max/MSP. Nor do we have to learn the algorithms and the appropriate input values. It’s mostly pre-engineered instruments and effects like in previous eras, but now with exquisite control, modularity and instant recall of settings (not to mention composing the movement of those settings!).
So, has the idea of chaining effects sequentially settled in? Obviously, a great deal of practice is required to take full advantage of the giant list of effects I mentioned above. Why do we tend to call this practice “experimentation”… how is one to practice in a non-experimental way with audio processes anyways? It’s not as though words can convey what one should expect when you set DSP effects to certain settings most of the time and even if you could the result would still be awfully dependent on the audio source material. Likewise, even if each effect only did one distinct thing the number of possible ways to chain just four of them in a row is rather startlingly large. A guitar player will most likely know what a flanger put after a distortion will sound like. What is the sound of a spectral accumulation on a guitar that’s been Dopplered and then decimated? There’s really only one way to find out - until you’ve “experimented” sufficiently to know your tools, that is, and then it gets a little easier.
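To put a rough number on “startlingly large,” here is a quick count. The figure of 30 effects is an assumption, a loose tally of the kinds of processes listed earlier rather than an exact inventory, and the count ignores the fact that every effect also has its own settings.

```python
import math

EFFECT_COUNT = 30   # assumed size of the effect palette, for illustration only
CHAIN_LENGTH = 4

# Order matters (flanger into distortion is not distortion into flanger).
ordered_no_repeats = math.perm(EFFECT_COUNT, CHAIN_LENGTH)
# Allowing the same effect to appear more than once in the chain.
ordered_with_repeats = EFFECT_COUNT ** CHAIN_LENGTH

print(ordered_no_repeats)    # 657720
print(ordered_with_repeats)  # 810000
print(math.perm(EFFECT_COUNT, 8))  # chains of eight, no repeats
```

Even before touching a single knob that is well over half a million distinct four-effect chains, which is why “experimentation” is the only realistic form of practice.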
Eventually you can train your “inner ear” to have a rough estimate of what chains of eight effects processes are going to sound like, how to use such a massive amount of processing in a way that results in a useful and cogent sound, and then later use that knowledge so that most anything that passes through your imagination’s flights of fancy can be executed in reality with the vast array of available freeware. Most of the time…
When you get to this far flung point like I have the question becomes, how much can I get away with in one sound field? Contrary to what most people seem to think, even more DSP will come to your rescue sometimes. At this level of complexity (transforming sound sources affected by chains of modulated or automated effects processes) you simply have to take it on a case-by-case basis, just like any other mix.
That is the reasoning behind some of my digital proof-of-concept compositions. There’s DSP being used for basic sound design, compositionally-mandated transformations-over-time, mixing clarity and that thing I’ve been talking about all along, the maintenance of each sound source’s audio stream as identifiably a part of the series of events despite all its changes.
Here’s a sample of things that’ve worked in various situations:
1] Rhythmicality to the modulation, as described above
2] Certain frequencies of modulation, as described above, depending upon the effect in question and what else is going on in the mix
3] Traditional panning because, obviously, if a sound stays resolutely hard-panned to the left it’s most likely going to be interpreted as having contiguity
4] Binaural 3D positioning, due to stereo image narrowing (which works with traditional panorama placement, too), the way our ears interpret binaural filtering lending itself to the maintenance of “sound-source identity” as well as simply moving the sound away from other, potentially conflicting sounds. It’s almost magical how this works but blame your audio sensory system, not me.
5] Rhythmic panning or certain speeds of panoramic movement and especially binaural 3D movement - it’s an art, not a science, but stereo movement can help maintain a sound’s identity regardless of what kinds of transformations it’s undergoing because against a static background our ears track a moving sound as arising from a constant source. One need only think of a rustle of leaves here, a second later a twig snaps over there to the right, and when you add a third unseen sound a bit further to the right you may have just identified something that’s going to eat your sorry ass. That there’s natural selection. It’s also why multichannel audio is used in such a conservative way: the picture is only coming from a smallish area in front of you, but if you hear a startling sound behind you you’re going to involuntarily turn towards it, ruining the illusion of immersion in the video…
6] Speaker sims, ampsims, microphone simulators or the like: somehow the timbre imposed via a sound-source simulation like this helps maintain the sense of sound-source identity amazingly well even with wildly changing underlying overtone changes in whatever you’re running through the sim, plus like PM synthesis they just have a great sense of physicality and locality. Combined with stereo image narrowing and panoramic placement this is almost a sure-fire way to achieve maintenance of identity - possibly too much in the case of using the same sound source simulator to have different sound sources coming from it in the same song! Try it and see… I could also add to this list various other ways of imposing an overarching timbre to a sound source but there’s no good textual shortcut to naming any of the possibilities, just bear it in mind that the more complex the timbre character imposed the better it works. Bitcrushing/decimation and vinylization are two examples.
7] Resonators - this category can include a number of things. The traditional example is a sitar’s resonant strings. They’re called “drone” strings because, much like playing the open E string on a guitar while you play a melody on other strings, a sitar player often strums them as a drone but they’re actually resonators without being played at all. Whenever a note is played it resonates in sympathy with the “drone” strings and emphasizes certain notes and harmonics because, obviously, all of the strings are connected to, and thus causing vibrations in, one instrument’s body. Sounds a bit like the previous example, no? There’s not many resonator VSTs, sadly, but they work well for the same reason, as nothing else in your sound field is causing overtone resonance. Comb filters work great for this, too, but there’s not too many VST that do that, either. Oddly, comb filter effects tend to have great modulation options whereas resonator VSTs tend to have no modulation and in fact the ones I have like to crash when automated. Various kinds of reverb can also have this effect, but that’s the next entry…
8] Reverberation. Anything processed in the same “reverberant” space tends to be interpreted, at the very least, as originating from within that space. Reverbs tend to fill up a stereo field with a lot of sound, though, so you’re equally likely to mask the identity of all of the sounds in your stereo field, so precision and creativity are necessary. But you probably already know the challenges here - reverb can be the thing that perfects a mix or just muddles it in any engineering situation. Here I’ll only address special instances, such as putting a “tiny” space’s reverb on a sound that’s changing a great deal, preferably with stereo narrowing and panning of some sort. 3D binaural placement works wonders with this. Sometimes, with certain reverbs, you can even get away with panoramic movement, but more often this just sounds absurd to the human ear unless you’re trying to give the impression that an entire room is moving in relation to the static sound events in your mix! I’ve always used very specific VSTs for this, mostly AriesVerb for its odd and unique “spaces” such as the “shoebox,” small spherical reverbs or “harmonic resonance” reverbs… anything that doesn’t give the impression of an entire large space. Save that kind of reverb to “settle” the more static mix elements, such as drums. This is much like binaural filtering, as the “depth of field” placement differentiates a sound source from anything without reverberation and gives a sense of locality.
9] Doppler effects. As helpful as panoramic movement is in maintaining the sound-source identity of some transforming sound streams, it works even better with Dopplering movement. Of course, if you use too much Doppler pitch-shift you may alter your sound’s pitch content more than you care to, but for a subtle effect that helps a changing sound seem contiguous a little Dopplering goes a long way. This necessarily involves panoramic movement, and obviously you won’t always be willing to have movement in a composition.
10] Resonant filtering. Good old resonant filtering, depending on the speed of the modulation, can often achieve the desired effect.
11] Time-domain modulation. Echoes, choruses, flangers, etc. Echoes are dangerous because they quickly take up a lot of the “space” you have to work with. The ability of chorus to “thicken” a sound does a decent job here, but flangers and phasers do an even better job, especially if nothing else in the sound field has a modulated delay effect on it. For some reason negative-feedback phasers and flangers tend to work better for this job than positive-feedback ones, perhaps due to their “metalicity.”
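For anyone who finds code easier to follow than adjectives, here is a minimal Python sketch of three of the effects named in the list above: bitcrushing/decimation (item 6), a feedback comb filter used as a resonator (item 7), and a flanger with negative feedback (item 11). None of it is taken from any particular VST; the function names, parameter names and default values are assumptions made up for illustration, and the code favors readability over speed.

```python
import math


def bitcrush(samples, bits=8, hold=4):
    """Hypothetical bitcrusher: quantize to `bits` bits and hold each
    quantized value for `hold` samples (a crude decimation)."""
    levels = 2 ** (bits - 1)
    out = []
    held = 0.0
    for i, x in enumerate(samples):
        if i % hold == 0:
            held = round(x * levels) / levels  # snap to the nearest step
        out.append(held)
    return out


def comb_resonator(samples, delay, feedback=0.9):
    """Feedback comb filter: y[n] = x[n] + feedback * y[n - delay].
    `delay` is in samples; it rings at multiples of sample_rate / delay."""
    out = []
    for n, x in enumerate(samples):
        recirculated = feedback * out[n - delay] if n >= delay else 0.0
        out.append(x + recirculated)
    return out


def flanger(samples, sample_rate, rate_hz=0.25, base=0.003, depth=0.002,
            feedback=-0.6, mix=0.5):
    """Hypothetical flanger: a short delay line whose length is swept by a
    sine LFO between `base` and `base + depth` seconds. A negative
    `feedback` inverts the recirculated signal."""
    size = int((base + depth) * sample_rate) + 2
    buf = [0.0] * size  # circular delay buffer
    out = []
    write = 0
    for n, x in enumerate(samples):
        lfo = 0.5 * (1.0 + math.sin(2.0 * math.pi * rate_hz * n / sample_rate))
        delay = (base + depth * lfo) * sample_rate  # fractional, in samples
        read = (write - delay) % size
        whole = int(read)
        frac = read - whole
        # Linear interpolation between the two nearest stored samples.
        delayed = (buf[whole % size] * (1.0 - frac)
                   + buf[(whole + 1) % size] * frac)
        buf[write] = x + feedback * delayed
        out.append((1.0 - mix) * x + mix * delayed)
        write = (write + 1) % size
    return out


if __name__ == "__main__":
    sr = 44100
    # One second of a 220 Hz sine as a stand-in for a "transforming" source.
    tone = [0.5 * math.sin(2.0 * math.pi * 220.0 * n / sr) for n in range(sr)]
    crushed = bitcrush(tone, bits=6, hold=8)
    resonated = comb_resonator(tone, delay=sr // 110, feedback=0.8)
    flanged = flanger(tone, sr)
    print(len(crushed), len(resonated), len(flanged))
```

Each function takes and returns a plain list of floats in the range -1.0 to 1.0, so they can be chained in any order. The point of the exercise is only to show how little machinery it takes to stamp one recognizable timbral "fingerprint" onto a signal whose content keeps changing underneath it.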
Having covered those bases, there are obviously nearly limitless other ways to accomplish this task, both with effects and more traditional mixing concepts, mixing and matching whatever works in a given situation. There’s no way to present a recipe; there’s not even any good way to put it all into words, evidently. I suppose I should buckle down and write a description of something I’ve actually recorded and explain the methodology sometime.
I hate to end it here. I’d love to present specific examples and abstruse solutions. I’d love to cover in detail how to use each of the available effect types, but then I’d want to elaborate on all the other mad tricks you can pull out. The important things are to make you aware of this previously-undescribed aspect of the new era of digital music-making and to present some of the easier-to-describe ways of maintaining sound-source identity. Obviously we all have access to the same ways of looking at audio: rhythm, pitch, harmony and the suddenly much more necessary timbre, phase, panorama placement, modulation-over-time, etc. My hope is that my words can pare down the time between imagination and audible art for whoever is next in line learning all these miraculous tools and techniques. And, of course, the more we become aware of just how much is now possible, the more our imaginations can take flight into an ever-vaster array of possibilities and tear down any limitations to their realization in music.