the musicology of record production

london college of music

The Staging of Recorded Music

Spotify Playlist for this chapter

Section Headings:

1.    Realism, artificiality and the idea of staging

2.    Spatial staging and the Sound Box

3.    Timbral Staging

4.    Functional Staging

5.    Media Based Staging

 

Realism, artificiality and the idea of staging

Edison's Tone Tests With Anna Case: description found in Baldwin, Neil. 2001. Edison: Inventing The Century. Chicago: University of Chicago Press.

Is It Live Or Is It Memorex? Youtube presentation of Ella Fitzgerald's TV advert for Memorex Audio Cassette Tapes

Deutsches Institut für Normung DIN 45500 - 1973

Recording In The Real World: Ted Fletcher's 2005 Paper at the Art of Record Production conference, University of Westminster. (you need to register and log in to access this resource)

McGurk Effect:  Youtube explanation of the McGurk Effect by John Medina from brainrules.net

M. C. Escher's never ending staircase:  Ascending and Descending (1960) by M.C. Escher

Salvador Dali's floppy pocket watches Salvador Dalí. The Persistence of Memory. 1931. Oil on canvas.

Lipsmacks, Mouth Noises and Heavy Breathing: Steve Savage's 2005 paper at the Art of Record Production conference, University of Westminster. (you need to register and log in to access this resource)

Clarke, Eric. 2005. Ways Of Listening: an ecological approach to the perception of musical meaning. New York: Oxford University Press.

 

Spatial staging and the Sound Box

The concept of staging as a tool of analysis in record production comes from the work of William Moylan and Serge Lacasse but is also related to Trevor Wishart’s thoughts on ‘landscaping’ in electroacoustic composition. Lacasse refers specifically to the manipulation of the sound of the human voice, but his ideas are transferable to all recorded sound. The notion of staging refers to the treatment of sound in ways that add meaningful context for the listener. Perhaps the simplest example of this is the addition of ambience to suggest the sound source’s placement in physical space – a church as opposed to a bathroom, for instance. Directional, spatial, and distance cues in audio are more complex than this, however: the human perceptual system uses overall volume, the relative volume at each ear, differences in the arrival time of a sound at each ear, the amount of high-frequency content, the relative volumes of direct and ambient sound, and the perceived performance intensity as suggested by the timbre. For example, two similar but different sounds might be heard at similar volume, both arriving at the right ear slightly louder and slightly earlier than at the left. While this would suggest that both sound sources lie to the right, if one of the sounds had a greater high-frequency content than the other, we would be likely to perceive it as being closer, because high-frequency sound dissipates more rapidly over distance than low-frequency sound. Using equalization (EQ) to emphasize high frequencies in a recording is a standard technique to suggest intimacy and proximity.
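The two interaural cues mentioned above (relative volume and arrival time at each ear) can be illustrated with a minimal sketch. It assumes a mono signal held as a list of floats; the function name `stage_to_right` and the default values for the time and level differences are hypothetical illustrations, not measurements or anything taken from the recordings discussed here.

```python
def stage_to_right(mono, sample_rate, itd_seconds=0.0004, ild_db=3.0):
    """Return (left, right) channels that suggest a source to the right.

    itd_seconds: assumed interaural time difference (left ear hears later).
    ild_db: assumed interaural level difference (left ear hears quieter).
    """
    delay = round(itd_seconds * sample_rate)   # delay expressed in samples
    gain = 10 ** (-ild_db / 20)                # decibels to linear amplitude
    # Right channel: the untouched signal, padded so both channels match in length.
    right = list(mono) + [0.0] * delay
    # Left channel: the same signal, later and quieter.
    left = [0.0] * delay + [gain * s for s in mono]
    return left, right
```

This handles only the left-to-right cues; the distance cues in the paragraph (high-frequency content, the balance of direct and ambient sound) would need filtering and reverberation on top of it.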
Further complexity has been added in record production through the use of conflicting messages. If we listen to the recording of Whitney Houston belting out the final chorus of ‘I Will Always Love You’ and compare it with Jarvis Cocker’s vocal on Pulp’s ‘Common People’, we notice that the volume of the voice, the high-frequency content, and the level of room ambience are similar.  The perceived performance intensities of the two vocal deliveries, on the other hand, are  entirely different. Houston’s vocal timbre suggests high levels of energy being expended, and Cocker’s timbre gives the impression of a throwaway delivery and a world-weary lack of effort. The timbral clue of a quietly ‘spoken’ voice at a high volume level outranks other conflicting clues to stage Cocker’s vocal as intimately close and Houston’s as further away. An extra level of complexity is added in Houston’s case: although the intensity of the vocal can range from a virtual whisper at the start of the track to a powerful ‘roar’ at the end, the actual volume remains almost the same, controlled by a combination of compression and mix volume. The false impression we receive of how loud the vocal is at different points in the song, and our sense of how much energy is being expended in Houston’s singing, is shaped by the vocal timbre, and this overrides the opposing messages conveyed by the equality of volume throughout.

Timbral Staging

Perhaps, therefore, we can extend the concept of staging in a recording to include both the creation of meaning beyond physical placement within a perceived environment, and the way that timbral shaping can suggest emotional meaning. Here again, Lacasse’s work is paralleled in the field of electroacoustic composition. Denis Smalley discusses ways of generating meaning in electronic music by creating morphologies that suggest an action and an object; for example, a string vibrating when plucked, a human sobbing, or the smooth mechanical acceleration of a motor.  Lacasse extends his concept of staging to include electronic treatments of sound that impose a timbral shape (or Smalley’s spectromorphology) on to recorded sound in ways that suggest the physical manifestation of human emotional activity. This is related very closely to theories of embodied cognition and perception, such as those proposed by George Lakoff and Mark Johnson and Antonio Damasio.  Staging a guitar sound by adding overdrive or distortion creates a spectromorphology for that sound which is similar to the timbral shape of a shouting voice. By adding a certain pattern of both harmonic and non-harmonic overtones, the staging conveys meaning through relating the guitar sound to the type of emotional human states that we associate with shouting voices, for example aggression and anger.

Functional Staging

I’m going to start, therefore, with a short taxonomy of the functions to which recorded music can be put:
1.    Dance – playback in informal (party) or formal (club) situations. Production will ensure musical features important to facilitating the attentional synchronisation of dance gestures to musical gestures are highlighted.
2.    Focussed listening – playback for an individual (or small group) to listen attentively – mostly in the home but can be formalised (music society or acousmatic concert) or via headphones in other informal situations. Production will aim for clarity and stylistically appropriate proximity to suggest that the listener is a privileged (best position) witness.
3.    Performance atmosphere – playback to simulate or suggest the atmosphere of a ‘live’ performance. Production will reproduce, simulate or suggest acoustic properties associated with stylistically appropriate communal experience of a performance.
4.    Background – playback used for subliminal or peripheral creation of ambience where listeners’ attention is focused elsewhere. Production will aim to be smooth and without sudden dynamic or timbral variations.

These functions are not mutually exclusive and I will argue that different styles of music combine different aspects of these production approaches in different ways. Before I move on to look at some particular examples from African and Cuban recorded music, I’ll discuss a few wider generic musical descriptions and how these functional categories can be related to broad trends in record production.
One factor common to a wide variety of commercial recordings intended for dance  is that playback will be through a public address system in a large venue. The playback will thus entail the addition of substantial ambience from the dance venue itself as well as any ambience on the original recording. Reverberant spaces will blur the rhythmic characteristics of a piece of music by making the note onsets less distinct. These note onsets are the perceptual cues that we use to establish pulse and to synchronise dance gestures to musical sound. A characteristic of functional staging in recorded music intended for public dancing would therefore be to reduce the ambience on the recordings of the musical elements that are key to establishing the pulse of the music. In western popular music at the end of the 1960s and beginning of the 70s when clubs dedicated to dancing to recorded music started to become more popular, we see a divergence in drum sounds between dance music and rock music that seems to bear this out. (Example 1: Wonder 1973 and Led Zeppelin 1973)
At the same time in dance music, musical elements that are more concerned with generating the party atmosphere – most commonly vocals – are treated with reverb to suggest large scale communal activity (Example 2: KC & The Sunshine Band 1975) and contribute to the club ‘vibe’.
The use of production techniques to create the atmosphere of a large scale communal activity is also widespread in music that is designed for reproduction in a smaller home environment (Example 3: Queen 1977). These recordings do not, though, accurately represent the listening experience of a large concert hall: the muddying influence of reverberating low frequency sound is usually avoided, but the ‘fattening’ of the sound that it creates is often suggested through some sort of electronic or tape based compression of the low end. This gives some aspects of the perception of a large space without the loss of clarity that realistic reverberation would induce.
Many of the conventional techniques of multitrack recording and mixing can be related to this form of virtual staging – of generating psychoacoustic cues that are reminiscent of some features of a particular type of listening experience whilst avoiding other aspects which may have a negative impact on intelligibility or the musical meaning of a particular sonic feature. That, though, is a whole other paper.
Music intended for home listening through domestic hi-fi or personal stereo systems tends to involve some balance between the 2nd and 3rd functions in our list. I’ve already mentioned the approach that suggests or simulates a culturally appropriate communal listening experience; the other approach is aimed towards focused listening. This form of staging will often employ techniques that suggest intimacy and an individual approach – as if the performance is being whispered in your ear, and is solely for you. Close microphone placement, exaggeration of high frequency content and the relatively high volume of dry signals in comparison to reverberation are all common techniques for suggesting proximity to the performer, and when these are combined with low energy level, intimate performances the effect is even stronger.
In fact, these techniques have become so prevalent that in some styles of music they have become merged and confused with questions of recording quality – the closer they sound, the better the recording. This has also been combined with our continued exposure to unnaturally compressed bass frequencies to create expectations about the sonic characteristics of recorded music that constitute a culturally constructed perception of ‘good quality’ recording that extends well beyond questions of frequency and dynamic range.

Media Based Staging

The idea of media based staging takes the idea of ‘location’ a step further to include perceptions of time and place that are associative rather than perceptual: how the aural ‘footprint’ of particular forms of mediation associated with audio reproduction media have been used to generate meaning within the production process.
The sound of particular media - specific limitations in frequency range and dynamic range and particular forms of distortion, ambience and noise - will generate associative meaning for audiences with particular forms of cultural experience.
This is further complicated by issues of familiarity and expertise that may allow for finer or coarser gradations of differentiation. For example, hearing the sound of early recording is a broad based association familiar to most members of post industrial societies, whereas hearing the difference between wax cylinder and acoustic recording and 1920s electric disc recording is a relatively easy skill to acquire but not one that’s common in contemporary society. Or recognising the sound of a voice coming down a phone line - once again, a widely acquired social skill - as opposed to hearing the difference between a land line and a mobile phone. Getting more esoteric still, there is the difference between 16 and 24 bit recording, and the difference between a good quality MP3 and a .wav or an .aif file.
I’ve also distinguished associations related to chronology and those that I’ve called environmental i.e. associations related to contemporary society but to different environments or arenas of experience.
Starting with the idea of environmental forms of media based staging: this relates to media that are associated with specific places or categories of places - such as the public address systems in various types of environment - supermarkets, railway stations, sporting events, aeroplanes etc.
Other examples might include Sound Reproduction Systems such as muzak in an elevator or a supermarket, film sound in a movie theatre, AM Radio sound, or the sound of TV.
Thirdly, this could include the sound of various communication media such as different types of telephone calls, walkie talkie radios, police radios, or the sound of astronauts communicating from the moon.
I’m now going to investigate the way that these forms of staging can create meaning by looking at a few examples.
My first example is Eminem and the supermarket PA system or tannoy on ‘The Real Slim Shady’. Aside from referencing a familiar form of paging - or requesting someone’s presence - that relates to the lyrical content, this is also a culturally familiar form of disembodied voice. The role of the narrator, a very common form of disembodied voice in contemporary media, can be summoned in many more conventional ways than this - the sound of radio announcers or TV voice overs being two examples. Record production is itself another medium that generates disembodied voices - the paradox that Evan Eisenberg has described as the performer without an audience and the audience without a performer. Techniques such as this can allow the creation of multiple levels of disembodiment: the staging identifies this version of Eminem’s voice as different to the main vocal - a step further away and thus a narrator commenting on or preparing us for the lead vocal.
A further example of this can be found on 10.30 Appointment by Soweto Kinch, a UK rap artist from Birmingham. The cultural significance of an interview at a Job Centre in the UK - conjured up by the tannoy announcements and the office environment noises - where the protagonist explains to the employment officer that he wants to be a rapper, is not only very British but is also culturally specific to the unemployed and students claiming state benefits. The ritual humiliation of the ticket and window interview queuing technique is more broadly familiar however.
Moving from public address to sound reproduction systems, one frequently used example involves tuning in a radio station.
On this Skee-Lo example, the introduction to the track is staged as playback with the limited frequency range of a small speaker AM radio and this is itself introduced by the sound of a tuning dial being turned.
The track is referencing the audio medium that is expected to be one of the primary forms of playback. It also works as an interesting twist on an often used popular music arranging tool - arranging the introduction as a lighter version of the main theme. Rather than, for example, a solo piano introduction before the band kicks in, this provides a version of the track with high and low frequency filtering which is then removed as the vocal starts. In a move that dilutes and confuses the message of the radio tuning, there is also the sound of a ringing tone mixed quietly into the introduction, setting up the ‘hello’ of the vocal as answering a phone - although without any phone voice treatment.
The telephone voice utilised by Britney Spears on ‘Oops!... I Did It Again’ is recognisably treated and is dropped into the verse narrative of the song a few times - seemingly quite randomly - as an arrangement tool. The familiar staging of telephone communication is used as a tone colour in the vocal arrangement rather than as a cipher for disembodiment or separation - the form of meaning usually associated with telephone references in popular music.
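The high and low frequency filtering described here can be sketched very roughly as a pair of simple filters. This is a minimal illustration, not a reconstruction of the treatment on the record: the one-pole filter design, the function names and the 300 Hz and 3000 Hz cutoffs are all assumptions chosen only to show the principle of band-limiting a full-range signal.

```python
import math

def one_pole_lowpass(samples, sample_rate, cutoff_hz):
    """Gentle low-pass filter: removes content above cutoff_hz."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)      # move a fraction of the way towards the input
        out.append(y)
    return out

def one_pole_highpass(samples, sample_rate, cutoff_hz):
    """Gentle high-pass filter: the input minus its low-passed version."""
    low = one_pole_lowpass(samples, sample_rate, cutoff_hz)
    return [x - l for x, l in zip(samples, low)]

def small_speaker(samples, sample_rate, low_cut_hz=300.0, high_cut_hz=3000.0):
    """Band-limit a signal to suggest a small radio speaker (assumed cutoffs)."""
    no_bass = one_pole_highpass(samples, sample_rate, low_cut_hz)
    return one_pole_lowpass(no_bass, sample_rate, high_cut_hz)

def sine(freq_hz, sample_rate, seconds=0.2):
    """Test tone used to check what the filter lets through."""
    n = int(sample_rate * seconds)
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

def rms(samples):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))
```

Passing low, middle and high test tones through `small_speaker` and comparing their `rms` levels shows the middle of the range surviving while the extremes are reduced. Removing the treatment as the vocal starts would simply mean switching back to the unfiltered signal.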
Later on in the track, the filmic reference to the Titanic movie - a diamond necklace dropped into the ocean by the old lady narrator - is given the sonic characteristics of the reduced sound quality of a movie theatre.
The media based staging in these instances seems more like a reference to popular culture than a way of creating meaning related to the musical and lyrical content. I shall return to this idea of ‘namechecking’ references that would be familiar to one’s target audience a little later.
The other common reason for using media based staging in record production is to evoke the sound of a particular (or more commonly just a vague) historical period. In the same way that sepia tinting of film and photographs, black and white photography and the particular colour saturation associated with Super8 and other home movie formats are used to denote age, the sound of early recordings is used as well.
On the other hand, another crucial aspect of this that should be mentioned is the way that particular forms of clarity and audio quality are associated with modernity. This has become quite tightly entangled with the distinction between expensive and cheap record production which will crop up again a little later.
Interestingly, my memory of the Beatles’ ‘Honey Pie’ was that it had a vocal treatment simulating a 1920s megaphone, but when I came to listen to it the vocal was full frequency range except for a short fragment of ‘old crackly record’ at the beginning: in this instance, an obvious reference to the stylistic period of the track.
The Buggles’ 1979 track, ‘Video Killed The Radio Star’, uses a limited frequency range and dynamic compression on the vocals to suggest the sound of early radio broadcasts. This is mixed into a contemporary (to 1979) production sound and the production itself juxtaposes perceptions of antiquity with those of modernity - the voice and keyboard sounds have the restricted frequency range of antiquity whilst the female vocals, kick drum and bass have a sound of modernity that was set to become the standard in the 1980s.
This brings us to a further distinction that can be made about the way that media based staging can create meaning: a way that is related to the ideas of familiarity and expertise that were mentioned earlier. The references I mentioned in the Britney Spears tracks related to popular culture in ways that were designed to resonate with the demographic of her projected audience - mobile phone conversations and romantic films.
Historical references can be similarly grounded in ideas of what might be perceived as cool to a particular target audience - to ideas of authenticity and the perceived authority that stems from speaking with a particular voice - the ‘voice’ of late 1960s and early 1970s record production - the sound of analogue tape and valve or tube amplifiers - in this example.
The voice of authority is the perceived golden age of rock - used to distance the sound of Oasis (and other Manchester bands of the early to mid 1990s) from the sound of the 1980s.
There are many other examples of particular types of production technology developing an authenticity within a particular musical style - Roland TR-808 drum machines and TB-303 synthesisers within house and techno in the late 1980s, playing, sampling and pressing to vinyl within the Bristol sound (Roni Size, Portishead) and the anti-synthesiser stance of various rock bands at various points - such as Queen and Rage Against The Machine.
In our post modern age though, the cachet of sonic signatures can go up as well as down and the ‘voice of authority’ can be sincere or it can be ironic.
Whereas on certain hip hop tracks, for instance, the presence of the sound of vinyl crackle is a signifier of authenticity (of sampling from the original repertoire), in the case of the Mike Flowers Pops it is part of the ironic language of retro cheesiness.
This inverted snobbery can be seen in terms of Bourdieu’s ideas on cultural capital - specifically of Thornton’s idea of subcultural capital: only an audience with the habitus of listening within a particular sonic world will understand the cultural resonances - the subtleties of authority and irony - that allow the “correct” reading of this audio event.
Another important way that media based staging can affect the meaning of recorded music is through a dilettante approach - on the face of it, a superficial, amateurish and partially understood approach to recording. Garage bands from the late 1950s onwards have produced rough and unpolished recordings and this has led to it being embraced as a production aesthetic in itself. If the dilettante approach is chosen rather than being accidental then it takes on additional meaning - in this instance, professional quality recording becomes a signifier for the ‘establishment’ and the rejection of it  - the choice to go Lo-Fi - becomes a political statement: a marker of difference.
An important aspect of this, which relates back to the issues of familiarity and expertise that were mentioned earlier, is the fact that the signifier - the characteristic that identifies the media in question - is often highly exaggerated. The crackle on the Mike Flowers record is so loud that it would have been a signifier of a badly worn record, the vocal track on ‘Video Killed The Radio Star’ has a more restricted frequency range than the actuality of early AM radio and the slap back delay on the tannoy in ‘The Real Slim Shady’ has slightly more feedback than the real thing. Gaining the stamp of authenticity - of speaking with the voice of authority - often requires the ‘tone of that voice’ to be exaggerated. In this example by Portishead, the signifying surface noise from the vinyl is exaggerated not only by loudness but also by making it intermittent, so that attention is drawn to it even more.
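What ‘slightly more feedback’ means for a slap back delay can be shown with a minimal sketch of a feedback delay line. The function name `slapback` and all of the parameter values are hypothetical; nothing here is measured from the Eminem recording. The point is only that raising the `feedback` value makes each repeat of the voice die away more slowly, which is one way the signifier can be exaggerated.

```python
def slapback(samples, sample_rate, delay_seconds=0.1, feedback=0.3, mix=0.5):
    """Add a single-tap echo whose output is fed back into the delay line.

    delay_seconds, feedback and mix are assumed illustrative values.
    feedback should stay below 1.0 so that the repeats die away.
    """
    delay = max(1, round(delay_seconds * sample_rate))
    buffer = [0.0] * delay            # circular buffer holding the delayed signal
    out = []
    pos = 0
    for x in samples:
        delayed = buffer[pos]                    # what was written `delay` samples ago
        out.append(x + mix * delayed)            # dry signal plus echo
        buffer[pos] = x + feedback * delayed     # recirculate part of the echo
        pos = (pos + 1) % delay
    return out
```

Feeding in a single click makes the behaviour easy to see: each repeat is the previous one scaled by `feedback`, so a value of 0.5 halves each echo while a value of 0.8 leaves the repeats audible for much longer.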
To sum up then. I have identified various aspects of media based staging: the fact that it can create both environmental and chronological associations and that it can create the perception of authenticity either through association with an established voice of authority or through a dilettante rejection of those types of authority. In any event these techniques rely on an audience that has accumulated the appropriate forms of experience. These can be based on very broad social communities of experience - associations such as the telephone being a cipher for communication but also possibly separation - or they can be based on more tribal or sub-cultural groupings - associations such as Lo-Fi recording with anti-consumerism and rebellion or vinyl crackle with the more authentic sound of records over CDs in DJ culture.
 

Last Updated on Monday, 28 September 2009 16:17  
