Andrey Dergachev. Notes on sound

Before anyone interested in film sound reads this text, I want to say a few words about how I feel about what I have written. For me, everything said here is information for contemplation rather than an instruction or a guide to action. It is basically a simple description of my current understanding of certain aspects of sound in film: what attracts me to it and how I try to arrive at a sound that leaves me satisfied.

I like films in which the plot is not the absolutely dominant component. In such films, image and sound help the viewer not only to perceive an event or a sequence of events, but also to feel their relative scale. If you are familiar with painting, this resembles some paintings by Pieter Bruegel, in which there is not only a visual scale but also a scale of events. The boundaries between the event and its embodiment are blurred. The incident ceases to be important only in the context of the story, the plot. It becomes significant in itself. Films in which this appears cease to be exclusively narrative, and sound, possibly because it does not itself carry any semantic load, is a convenient and invisible tool not only for sustaining the story but also for dissolving it.

There is another side to the matter, the so-called technical one. I am not very strong in engineering, but my knowledge is enough to say with confidence that accurate monitoring of what you are doing is absolutely necessary. If you do not hear all the nuances and mistakes of the soundtrack (at least at the last stage, mixing), or hear them incorrectly, then the chance that the final result will be close to what you wanted tends to zero. Some of what is discussed below would be impossible to hear and evaluate without high-quality studio monitoring.

Mixing the lavalier mic and the boom: how do you bring them together in post-production?
          I'll start from the beginning. When I got involved in film sound, which could be called a coincidence, the situation in this field was rather confusing. In the 90s, during the transition from analogue to digital cinema sound, much of the Russian film industry itself sank into oblivion along with the outdated technologies. The educational system (not necessarily the formal one) was broken, and with it the continuity of the culture of sound and sound recording, whatever it was at that time. A lot of young, quick guys who were well versed in computers but had a poor understanding of sound appeared on the scene. I was among them. I was a little older than most of my friends and colleagues, but we all shared the strange ideas about sound that the new wave of cinema sound engineers generated. To give an idea of the degree of ignorance prevalent among us at that time, I will mention that many of us thought that encoding the finished soundtrack of a movie in Dolby Digital brought softness and a certain saturation to the sound. In fact, converting the sound to a format that provides only a 384 kbit/s digital stream for 6 channels (the Dolby Digital bitrate for movie theaters; a format that allows the sound information to be printed on the film in sync with the image) at best just masked flaws in the soundtrack, reducing the amount of information in the sound by about 15 times. Most likely there were people who understood much more than the rest about how the final soundtrack is shaped, but most of us, eager to come closer to a collective ideal called “sound in western films”, were far from understanding how sound is formed and how it is perceived. Also quite common was the belief that there was one specific method by which the desired result could be achieved. All this is a rather long story which, it seems to me, has not quite ended. So let's return to the “boom” and the “lavalier mic”.
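The “about 15 times” figure can be checked with simple arithmetic. The sketch below is only an illustration: the 384 kbit/s bitrate and the 6 channels come from the text, while the 48 kHz sample rate and the bit depths of the uncompressed master are assumptions, and the function names are hypothetical.

```python
# Rough compression-ratio arithmetic for a 6-channel soundtrack.
# The 384 kbit/s figure is the one quoted in the text; the PCM sample
# rate and bit depths are assumptions made only for illustration.

ENCODED_BITRATE = 384_000  # bits per second, as quoted in the text


def pcm_bitrate(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Bit rate of uncompressed linear PCM in bits per second."""
    return sample_rate_hz * bit_depth * channels


def reduction_factor(sample_rate_hz: int, bit_depth: int, channels: int = 6) -> float:
    """How many times smaller the encoded stream is than the PCM master."""
    return pcm_bitrate(sample_rate_hz, bit_depth, channels) / ENCODED_BITRATE


if __name__ == "__main__":
    for depth in (16, 20, 24):
        factor = reduction_factor(48_000, depth)
        print(f"48 kHz / {depth}-bit / 6 channels: {factor:.0f}x")
```

Under these assumptions a 20-bit master gives a factor of 15, a 16-bit master 12 and a 24-bit master 18, so the order of magnitude holds whichever master format was actually used.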

I must say that I first encountered production sound (in Russia it is sometimes called “live sound”) on the film “The Banishment” (2007, dir. Andrey Zvyagintsev), when Stanislav Krechkov and I decided to record the sound on set at all costs, contrary to the Russian tradition of re-recording all the actors in the studio afterwards. The result was far from ideal (40-60% of the dialogue from the set made it into the film). In fact, it is now difficult to assess exactly how much dialogue was successfully recorded on set, since a lot of time has passed. Plus, all the dialogue of the main character Vera (Maria Bonnevie) was dubbed by Elena Lyadova. Maria spoke practically no Russian. On set she pronounced a memorized transcription of the Russian text (so that the dubbing would later be easier to lip-sync) and understood the meaning of the dialogue. But, unfortunately, her pronunciation betrayed her as a foreigner. However, despite the lack of experience and various technical difficulties (one of them was extremely noisy lighting, and this applied not only to the equipment but also to the lighting crew itself), we managed to avoid ADR (Automated Dialogue Replacement) for the children and for half of the scenes that were demanding in terms of acting.

At that time, the process of mixing lay beyond my comprehension. I firmly believed that the result of mixing depended entirely on the skill of the mixer and hardly at all on the quality of the recorded material. So at that moment I knew nothing about blending the lavalier mic and the boom; I learned about it later, when I started working with Dmitry Grigoriev on the post-production of the film “Elena” (2011, dir. Andrey Zvyagintsev). Dima loved to blend them, because it made it possible to achieve the desired (or nearly the desired) naturalness and, at the same time, intelligibility in the sound of the voices, and to approach the desired distance. The problem with this method was that the signals from the “lavalier mic” and the “boom” were almost always shifted in phase relative to each other and, even worse, this shift was not constant in time, since the “boom” and the “lavalier mic” are never completely motionless while sound is being recorded on set. Music engineers are very familiar with the phase distortion associated with recording on two microphones. In film sound, phase distortion is often understood as either an explicit “flanger” effect or very noticeable changes in the lower part of the signal spectrum. Only in these cases was the problem considered to exist. In part, these difficulties were solved by the editor, who, relying on hearing and on the waveform display, compensated for phase shifts between the two microphones by shifting a phrase, a word or part of a word in one of them. Usually it was the “lavalier mic”. However, the rooms in which the production sound was edited were mostly very poor acoustically, and in them it was impossible to hear the distortions and the “wavering” of the timbre of voices edited in this way.
In addition, the lavalier mic timbre was usually far from natural, so it was difficult to assess how much worse the sound became after the lavalier mic was added to the boom. As a consequence, in the dubbing studio we heard material already edited in this way and took its sound as a given, rarely comparing the result with the original recording. Plus, even in the dubbing studio, we were not able to hear the drop in speech intelligibility that occurs when two waves that are not identical in phase are combined. This was partly because the studios where we mixed did not allow us to hear these nuances, and partly because we did not understand what to pay attention to and were deceived by the obvious difference between the combined signal and the signal from each microphone on its own, which we could hear during rare tests. Nevertheless, the result was more than satisfactory, given that the production sound did not allow us to do otherwise. We had chosen microphones that were too narrow for recording in interiors, and without the lavalier mics all the interior dialogue would have sounded too distant.
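What happens when two copies of one voice are summed with a small time offset can be sketched numerically. This is a simplified model, not a description of any real recording: it assumes the second microphone receives an identical signal that is only delayed and scaled, and the path difference, speed of sound and function names are illustrative assumptions.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def delay_from_path_difference(extra_distance_m: float) -> float:
    """Arrival-time difference (seconds) for a given extra path length."""
    return extra_distance_m / SPEED_OF_SOUND


def summed_gain_db(freq_hz: float, delay_s: float, second_mic_gain: float = 1.0) -> float:
    """Level of (mic A + delayed mic B) relative to mic A alone, in dB."""
    response = 1.0 + second_mic_gain * cmath.exp(-2j * math.pi * freq_hz * delay_s)
    # Clamp to avoid log10(0) at a perfect cancellation.
    return 20.0 * math.log10(max(abs(response), 1e-12))


def notch_frequencies(delay_s: float, max_hz: float = 20_000.0) -> list:
    """Frequencies at which the two signals arrive in opposite phase."""
    notches = []
    k = 0
    while True:
        freq = (2 * k + 1) / (2.0 * delay_s)
        if freq > max_hz:
            return notches
        notches.append(freq)
        k += 1


if __name__ == "__main__":
    delay = delay_from_path_difference(0.343)  # about 1 ms
    print("first notches (Hz):", [round(f) for f in notch_frequencies(delay)[:4]])
    for gain in (1.0, 0.5):
        depth = summed_gain_db(500.0, delay, gain)
        print(f"second mic gain {gain}: level at 500 Hz = {depth:.1f} dB")
```

In this model a path difference of about 34 cm corresponds to roughly 1 ms, which puts the first notch near 500 Hz, with further notches every 1 kHz above it. The notches are deepest when the two signals have similar levels; when the second microphone is 6 dB quieter, the cancellation at a notch is limited to about 6 dB. Because the microphones move, the delay and therefore the notch frequencies drift during a take.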

Over time, we began to better understand how the directionality (width) of a microphone influences the recorded signal in the timbral and acoustic senses. In “Leviathan” (2014, dir. Andrey Zvyagintsev), the quiet environment made it possible to use cardioid and even omni microphones not only in interiors but also outdoors, and the production sound allowed us to use only one of the microphones in most scenes: either the “boom” or the “lavalier mic”. However, out of habit, and because we still did not understand all the nuances that arise when two signals of similar volume from one source are added together, we continued to blend the “boom” and the “lavalier mic”. Although the mixing took place in a studio recently designed by Philip Newell and Julius Newell (FlySound studio), which now allows us to hear what happens when the microphones are summed, we did not take advantage of this. Only after the fact, following frequent discussions about blending two microphones that arose periodically at FlySound and tests with Sergey Bolshakov, did I begin to hear the disadvantages that mixing two signals of similar volume from one source inevitably entails. As a result, I came to the conclusion that blending the “lavalier mic” and the “boom” should be resorted to only in extreme cases.

A couple of years ago, Sound Radix, which already had an arsenal of tools for working with phase shift, released the Auto-Align Post plug-in, which made it extremely easy to deal not only with a static phase shift between two clips but also, which is extremely important for film post-production, with a variable one. Within a short time, a great many people began to use this tool, both abroad and in Russia. The changes it makes to the corrected sound are so insignificant that in a blind test they are quite difficult to hear even in a good studio. Blending the lavalier mic and the boom has become simple, making it possible to correct deficiencies in the production recording when necessary. But despite the appearance of such a tool (other similar plug-ins will probably appear), I prefer to blend two microphones only if neither microphone alone can provide the desired intelligibility and natural sound.
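The simplest case, a static offset between two clips, can be illustrated with a brute-force cross-correlation. This is only a sketch of the general idea under the assumption of a constant delay; it is not the algorithm used by any particular plug-in and does not handle a shift that varies over time. The function names and sample values are hypothetical.

```python
def estimate_lag(reference: list, other: list, max_lag: int) -> int:
    """Return the lag (in samples) at which `other` best matches `reference`.

    A positive result means `other` is late relative to `reference`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, ref_sample in enumerate(reference):
            j = i + lag
            if 0 <= j < len(other):
                score += ref_sample * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def advance(signal: list, lag: int) -> list:
    """Move `signal` earlier by `lag` samples (later if negative), zero-padded."""
    n = len(signal)
    return [signal[i + lag] if 0 <= i + lag < n else 0.0 for i in range(n)]


if __name__ == "__main__":
    boom = [0.0, 0.0, 1.0, 0.5, -0.3, 0.0, 0.0, 0.0, 0.0, 0.0]
    lav = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.5, -0.3, 0.0, 0.0]  # same shape, 3 samples late
    lag = estimate_lag(boom, lav, max_lag=5)
    print("estimated lag:", lag)
    print("aligned lav equals boom:", advance(lav, lag) == boom)
```

On real material the two microphones never carry identical signals and the offset changes as the actors and the boom move, so an editor (or a tool) has to repeat this kind of estimate for short segments rather than once per clip.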

How do you work with the dynamics of backgrounds? 
          First I need to specify what kind of films will be discussed. It so happens that my preferences and the kind of sound that “suits” the films I work on as a sound designer often coincide. To my pleasure. These are dramas or documentaries in which the plot, no matter how strong it is, nevertheless leaves room for the background, or the environment in which the action takes place, to be an integral part of what is happening. I do not mean a sound environment that is designed to reflect some subjective sensation or experience of the characters and that can be detached from the place where the action happens. I am talking about backgrounds and effects that simply complement, in sound, the feeling of the place where the action takes place. I like it when the backgrounds do not create a sense of artificiality but can still evoke the feeling of being in exactly the place that is captured on the screen. This is similar to how, sitting on the porch of a country house, you hear a cricket chirping loudly nearby and, quite quietly, a dog barking somewhere far away. Such ratios of loudness in space, it seems to me, create an amazing perspective with fairly simple means. The sounds from which such a space is built should, as far as possible, be recognizable to the listener, so that he can judge the remoteness of the object even without acoustic processing. Subsequent acoustic processing further enhances the sense of perspective. Described like this, it looks like something significant and a bit complicated, but in fact I just try to create a sonic space that, in combination with the image, gives me a certain feeling. The distribution of sound sources of different loudness in space and time can be called dynamics. In all this time I have not often wanted to resort to artificially changing the dynamics that already existed in a sound.
When I did, it was usually because I needed to avoid redundant sound information in space or time. Sometimes it was because some of the backgrounds, for example the rustling of leaves or the noise of waves, did not have the desired dynamics within themselves. Sometimes it was because of the image, in which there was a clear movement in space that I wanted to support or emphasize. In such scenes I like to have, among the other atmospheres, a background that was already recorded with movement in space. This does not always work, but sometimes no other trick can achieve a natural approach or retreat. This applies not so much to backgrounds as to SFX, dialogue and foley.

How do you work with voice acoustics, how do you determine what is needed?
          What I love is a natural, I would even say realistic, sound of the voice. The first relates to timbre, the second to the acoustics of the space the characters are in during a particular scene.

I always want the timbre of the voices to be free of flaws. Often, besides equalizing the voice, I find it convenient to use the early reflections of a reverb to give the voice the desired timbre. Rarely, but it does happen, well-chosen early reflections reduce the need for heavy equalization or remove it altogether. Sometimes I mix these reflections, which matter to me for timbre, only into the center channel (mono); more often I distribute them across the entire front. It often happens that the voices of several characters in a dialogue scene sound slightly different, or that even different phrases of one speaker sound different, because they are taken from different takes or different shots. To make these differences less noticeable and, at the same time, to control the distance of the voices, I use a mono reverb, adding it to the tracks in different (usually very small) proportions. If we managed to record good-quality acoustic impulse responses (IRs) for Altiverb on location, I use them, because they combine well with the acoustics already present in the dialogue recorded there.
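Adding a recorded room response to a dry track in a small proportion can be sketched as a convolution followed by a low-level mix. This is a toy illustration of the principle only, not how Altiverb or any other reverb is implemented; the impulse response values, the wet level in dB and the function names are hypothetical.

```python
def convolve(dry: list, impulse_response: list) -> list:
    """Direct (time-domain) convolution of a mono signal with a mono IR."""
    out = [0.0] * (len(dry) + len(impulse_response) - 1)
    for i, sample in enumerate(dry):
        for j, tap in enumerate(impulse_response):
            out[i + j] += sample * tap
    return out


def db_to_gain(level_db: float) -> float:
    """Convert a level in decibels to a linear amplitude factor."""
    return 10.0 ** (level_db / 20.0)


def add_mono_reverb(dry: list, impulse_response: list, wet_db: float = -24.0) -> list:
    """Return the dry signal plus its reverberated copy at `wet_db`."""
    wet = convolve(dry, impulse_response)
    wet_gain = db_to_gain(wet_db)
    # Pad the dry signal so the reverb tail is not cut off.
    padded_dry = list(dry) + [0.0] * (len(wet) - len(dry))
    return [d + wet_gain * w for d, w in zip(padded_dry, wet)]


if __name__ == "__main__":
    voice = [1.0, 0.0]             # a single click standing in for a voice
    room = [0.0, 0.5, 0.25]        # two made-up reflections, no direct sound
    print(add_mono_reverb(voice, room, wet_db=-20.0))
```

Giving each dialogue track its own `wet_db` is one way to picture the “different (usually very small) proportions” described above: the dry timbre stays dominant while the shared room response pulls takes from different shots towards one acoustic.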

When all this is done and the dialogue sounds even in timbre and acoustics and, if necessary, is adjusted in distance, it is the turn of spatial acoustics. I must say, I often try different reverbs, looking for the desired sound of the acoustic environment. The sound quality of a reverb, its transparency, is extremely important to me; otherwise, instead of an acoustic space, I get only a token of it. Sometimes it takes me quite a lot of time to begin to understand how a particular reverb works. When I heard the sound of the TC Electronic VSS3, I decided to try making it the main reverb for a whole film. This was the Quick Silver Chronicles documentary (2018, dir. Alexandra Kulak and Ben Guez). Despite the fact that this is a stereo reverb, I liked experimenting with it and decided to continue the experiments on the next post-documentary film, “Foam” (2019, dir. Ilya Povolotsky). Unlike the Chronicles, it had more unusual spaces, which allowed me to explore a slightly different side of the reverb. Thus, over the course of two projects, I was able to learn more about the features of a plug-in I liked. It has also happened that a reverb initially judged unsatisfactory in studio tests suddenly revealed its beauty and indispensability in certain situations a few months later. On the film “Elena” (2011, dir. Andrey Zvyagintsev), Stanislav Krechkov and I decided to experiment with recording the acoustics of the dialogue directly during the take. Although the microphones in our arsenal were not suitable for this purpose, the result confirmed my desire to repeat the experience whenever possible. Technically, and even more so organizationally, this is a rather difficult task, since almost studio silence must prevail in the rooms where the shooting takes place, which is rarely possible on set.
When adding spatial acoustics, I either look for a natural sensation, so that the viewer does not perceive the voices as separate from the space they are in, or, if I need to emphasize the unusualness of the space, I go towards exaggerating the effect. Sometimes this is also necessary when the reverbs cannot convey the required realism. One of the tools for this is a mono reverb, which lets you add an easily readable additional acoustic layer to an existing reverb, for example a 5-channel one. Besides this, an additional mono or stereo reverb allows you, if desired, to break the symmetry of the acoustic space using panning. It should be noted that such nuances make no sense in films with a lot of music and sound effects, or with quick editing and a lot of events in the frame. The viewer most likely will not notice all the effort, and may even be annoyed by the excess of sound information.

How do you work with acoustics of everything else (not dialogs)?
          Some observations. During various test recordings for the film “Leviathan” (2014, dir. A. Zvyagintsev), Stas (Krechkov) and I noticed that even if we record the steps or the voice of a person standing on a road in the middle of the steppe, there are still various reflections that, to one degree or another, shape the sound we perceive and record. Even earlier, when recording foley with Natalia Zueva for the film “Stalingrad” (2013, dir. Fedor Bondarchuk, sound supervisor Rostislav Alimov), we began to notice that it is better to use acoustic reflections as a necessary addition, or as a tool, for getting a natural foley sound. True, this cannot be done in every foley studio. Over time, it became increasingly apparent that in reality we never hear sounds that are not acoustically colored. Rather, everything we hear is always the sum of a direct signal and numerous reflections. These reflections partly form the timbre of a sound and partly its perception in space. With this in mind, when recording an atmosphere, a sound effect or foley, you can already have an idea of how much acoustics you want in the recorded sound. Later, during mixing, you can either bring the distance of a particular sound to the desired one or leave everything as it is, if the distance suits you. That concerns mono reverb. Next, it remains to understand how well the background or effect is “inscribed” in the space that exists in the image, and whether it sounds sufficiently spacious. In my head I always separate the distance and the spatiality of a sound, bearing in mind, however, that the spatiality of a sound partly affects its remoteness.

Sometimes the acoustics (reverberation) become the only component of a sound that you want to leave in the soundtrack. This happened in the film “Fidelity” (2019, dir. Nigina Sayfulaeva). After trying to create quiet interior atmospheres from recorded interior backgrounds, including quad and 5.0 recordings, Sofia Matrosova, who edited the backgrounds for this film, made several interior atmospheres from reverb obtained with the TC 6000, with almost no dry signal. As a result, it was decided to use this method as the main one for all the quiet interior backgrounds in the film. These atmospheres turned out to be transparent, free of unnecessary detail and not too noticeable, yet they fill the space of the main characters' apartment.

Modern theatres make it possible to convey more acoustic and spatial nuances to the viewer than before. And modern recording studios make it possible to control more accurately how much of these nuances is present in the final soundtrack. But if you neglect the importance of accurate studio monitoring, then the result you hear in theaters will most likely not please you much.
