This is the final article in Sounding Out!‘s April Forum on “Sound and Technology.” Every Monday this month, you’ve heard new insights on this age-old pairing from the likes of Sounding Out! veteranos Aaron Trammell and Primus Luta along with new voices Andrew Salvati and Owen Marshall. These fast-forward folks have shared their thinking about everything from Auto-tune to techie manifestos. Today, Marshall helps us understand just why we want to shift pitch-time so darn bad. Wait, let me clean that up a little bit. . .so darn badly. . .no wait, run that back one more time. . .jjuuuuust a little bit more. . .so damn badly. Whew! There! Perfect!–JS, Editor-in-Chief
A recording engineer once told me a story about a time when he was tasked with “tuning” the lead vocals from a recording session (identifying details have been changed to protect the innocent). Polishing-up vocals is an increasingly common job in the recording business, with some dedicated vocal producers even making it their specialty. Being able to comp, tune, and repair the timing of a vocal take is now a standard skill set among engineers, but in this case things were not going smoothly. Whereas singers usually tend towards being either consistently sharp or flat (“men go flat, women go sharp” as another engineer explained), in this case the vocalist was all over the map, making it difficult to always know exactly what note they were even trying to hit. Complicating matters further was the fact that this band had a decidedly lo-fi, garage-y reputation, making your standard-issue, Glee-grade tuning job decidedly inappropriate.
Undaunted, our engineer pulled up the Auto-Tune plugin inside Pro-Tools and set to work tuning the vocal, to use his words, “artistically” – that is, not perfectly, but enough to keep it from being annoyingly off-key. When the band heard the result, however, they were incensed – “this sounds way too good! Do it again!” The engineer went back to work, this time tuning “even more artistically,” going so far as to pull the singer’s original performance out of tune here and there to compensate for necessary macro-level tuning changes elsewhere.
The product of the tortuous process of tuning and re-tuning apparently satisfied the band, but the story left me puzzled… Why tune the track at all? If the band was so committed to not sounding overproduced, why go to such great lengths to make it sound like you didn’t mess with it? This, I was told, simply wasn’t an option. The engineer couldn’t in good conscience let the performance go un-tuned. Digital pitch correction, it seems, has become the rule, not the exception, so much so that the accepted solution for too much pitch correction is more pitch correction.
Since 1997, recording engineers have used Auto-Tune (or, more accurately, the growing pantheon of digital pitch correction plugins for which Auto-Tune, Kleenex-like, has become the household name) to fix pitchy vocal takes, lend T-Pain his signature vocal sound, and reveal the hidden vocal talents of political pundits. It’s the technology that can make the tone-deaf sing in key, make skilled singers perform more consistently, and make MLK sound like Akon. And at 17 years of age, “The Gerbil,” as some like to call Auto-Tune, is getting a little long in the tooth (certainly by meme standards). The next U.S. presidential election will include a contingent of voters who have never drawn air that wasn’t once rippled by Cher’s electronically warbling voice in the pre-chorus of “Believe.” A couple of years after that, the Auto-Tune patent will expire and its proprietary status will dissolve into the collective ownership of the public domain.
Growing pains aside, digital vocal tuning doesn’t seem to be leaving any time soon. Exact numbers are hard to come by, but it’s safe to say that the vast majority of commercial music produced in the last decade or so has most likely been digitally tuned. Future Music editor Daniel Griffiths has ballpark-estimated that, as early as 2010, pitch correction was used in about 99% of recorded music. Reports of its death are thus premature at best. If pitch correction seems banal, it doesn’t mean it’s on the decline; rather, it’s a sign that we are increasingly accepting its underlying assumptions and internalizing the habits of thought and listening that go along with them.
Headlines in tech journalism are typically reserved for the newest, most groundbreaking gadgets. Often, though, the really interesting stuff only happens once a technology begins to lose its novelty, recede into the background, and quietly incorporate itself into fundamental ways we think about, perceive, and act in the world. Think, for example, about all the ways your embodied perceptual being has been shaped by and tuned-in to, say, the very computer or mobile device you’re reading this on. Setting value judgments aside for a moment, then, it’s worth thinking about where pitch correction technology came from, what assumptions underlie the way it works and how we work with it, and what it means that it feels like “old news.”
As is often the case with new musical technologies, digital pitch correction has been the target for no small amount of controversy and even hate. The list of indictments typically includes the homogenization of music, the devaluation of “actual talent,” and the destruction of emotional authenticity. Suffice to say, the technological possibility of ostensibly producing technically “pitch-perfect” performances has wreaked a fair amount of havoc on conventional ways of performing and evaluating music. As Primus Luta reminded us in his SO! piece on the powerful-yet-untranscribable “blue notes” that emerged from the idiosyncrasies of early hardware samplers, musical creativity is at least as much about digging-into and interrogating the apparent limits of a technology as it is about the successful removal of all obstacles to total control of the end result.
Paradoxically, it’s exactly in this spirit that others have come to the technology’s defense: Brian Eno, ever open to the unexpected creative agency of perplexing objects, credits the quantized sound of an overtaxed pitch corrector with renewing his interest in vocal performances. SO!’s own Osvaldo Oyola, channeling Walter Benjamin, has similarly offered a defense of Auto-Tune as a democratizing technology, one that both destabilizes conventional ideas about musical ability and allows everyone to sing in-tune, free from the “tyranny of talent and its proscriptive aesthetics.”
Jonathan Sterne, in his book MP3, offers an alternative to normative accounts of media technology (in this case, narratives either of the decline or rise of expressive technological potential) in the form of “compression histories” – accounts of how media technologies and practices directed towards increasing their efficiency, economy, and mobility can take on unintended cultural lives that reshape the very realities they were supposed to capture in the first place. The algorithms behind the MP3 format, for example, were based in part on psychoacoustic research into the nature of human hearing, framed primarily around the question of how many human voices the telephone company could fit into a limited bandwidth electrical cable while preserving signal intelligibility. The way compressed music files sound to us today, along with the way in which we typically acquire (illegally) and listen to them (distractedly), is deeply conditioned by the practical problems of early telephony. The model listener extracted from psychoacoustic research was created in an effort to learn about the way people listen. Over time, however, through our use of media technologies that have a simulated psychoacoustic subject built-in, we’ve actually learned collectively to listen like a psychoacoustic subject.
Pitch-time manipulation runs largely in parallel to Sterne’s bandwidth compression story. The ability to change a recorded sound’s pitch independently of its playback rate had its origins not in the realm of music technology, but in efforts to time-compress signals for faster communication. Instead of reducing a signal’s bandwidth, pitch manipulation technologies were pioneered to reduce the time required to push the message through the listener’s ears and into their brain. As early as the 1920s, the mechanism of the rotating playback head was being used to manipulate pitch and time interchangeably. By spinning a continuous playback head relative to the motion of the magnetic tape, researchers in electrical engineering, educational psychology, and pedagogy of the blind found that they could increase playback rate of recorded voices without turning the speakers into chipmunks. Alternatively, they could rotate the head against a static piece of tape and allow a single moment of recorded sound to unfold continuously in time – a phenomenon that influenced the development of a quantum theory of information.
In the early days of recorded sound some people had found a metaphor for human thought in the path of a phonograph’s needle. When the needle became a head and that head began to spin, ideas about how we think, listen, and communicate followed suit: In 1954 Grant Fairbanks, the director of the University of Illinois’ Speech Research Laboratory, put forth an influential model of the speech-hearing mechanism as a system where the speaker’s conscious intention of what to say next is analogized to a tape recorder full of instructions, its drive “alternately started and stopped, and when the tape is stationary a given unit of instruction is reproduced by a moving scanning head” (136). Pitch-time changing was more a model for thinking than it was for singing, and its imagined applications were thus primarily non-musical.
Take for example the Eltro Information Rate Changer. The first commercially available dedicated pitch-time changer, the Eltro advertised its uses as including “pitch correction of helium speech as found in deep sea; Dictation speed testing for typing and steno; Transcribing of material directly to typewriter by adjusting speed of speech to typing ability; medical teaching of heart sounds, breathing sounds etc. by slow playback of these rapid occurrences.” (It was also, incidentally, used by Kubrick to produce the eerily deliberate vocal pacing of HAL 9000). In short, for the earliest “pitch-time correction” technologies, the pitch itself was largely a secondary concern, of interest primarily because it was desirable for the sake of intelligibility to pitch-change time-altered sounds into a more normal-sounding frequency range.
This coupling of time compression with pitch changing continued well into the era of digital processing. The Eventide Harmonizer, one of the first digital hardware pitch shifters, was initially used to pitch-correct episodes of “I Love Lucy” which had been time-compressed to free up broadcast time for advertising. Similar broadcast time compression techniques have proliferated and become common in radio and television (see, for example, David Foster Wallace’s account of the “cashbox” compressor in his essay on an LA talk radio station). Speed listening technology initially developed for the visually impaired has similarly become a way of producing the audio “fine print” at the end of radio advertisements.
Though the popular conversation about Auto-Tune often leaves this part out, it’s hardly a secret that pitch-time correction is as much about saving time as it is about hitting the right note. As Auto-Tune inventor Andy Hildebrand put it,
[Auto-Tune’s] largest effect in the community is it’s changed the economics of sound studios…Before Auto-Tune, sound studios would spend a lot of time with singers, getting them on pitch and getting a good emotional performance. Now they just do the emotional performance, they don’t worry about the pitch, the singer goes home, and they fix it in the mix.
Whereas early pitch-shifters aimed to speed-up our consumption of recorded voices, the ones now used in recording are meant to reduce the actual time spent tracking musicians in studio. One of the implications of this framing is that emotion, pitch, and the performer take on a very particular relationship, one we can find sketched out in the Auto-Tune patent language:
Voices or instruments are out of tune when their pitch is not sufficiently close to standard pitches expected by the listener, given the harmonic fabric and genre of the ensemble. When voices or instruments are out of tune, the emotional qualities of the performance are lost. Correcting intonation, that is, measuring the actual pitch of a note and changing the measured pitch to a standard, solves this problem and restores the performance. (Emphasis mine. Similar passages can be found in Auto-Tune’s technical documentation.)
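The "changing the measured pitch to a standard" step in that passage can be illustrated with a little arithmetic. What follows is only a minimal sketch, not the patented method: it assumes twelve-tone equal temperament and an A4 reference of 440 Hz (both my assumptions, not the patent's), it takes the measured frequency as given, and it says nothing about how pitch is detected or how audio is resynthesized. The function names are hypothetical.

```python
import math

A4_HZ = 440.0  # assumed reference tuning; the patent text does not specify one


def nearest_note_hz(freq_hz, a4=A4_HZ):
    """Return the equal-tempered pitch closest to a measured frequency."""
    semitones_from_a4 = 12 * math.log2(freq_hz / a4)
    return a4 * 2 ** (round(semitones_from_a4) / 12)


def cents_off(freq_hz, a4=A4_HZ):
    """Return how far (in cents) the measured pitch sits from that target."""
    return 1200 * math.log2(freq_hz / nearest_note_hz(freq_hz, a4))


# A singer landing at 450 Hz is somewhat sharp of A4 and would be pulled to 440 Hz.
target = nearest_note_hz(450.0)
error = cents_off(450.0)
```

Notice what the sketch takes for granted: that the "standard" pitch is always the nearest one. A bent note or a deliberate scoop would be treated as an error, which is exactly the assumption the patent language builds in.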
In the world according to Auto-Tune, the engineer is in the business of getting emotional signals from place to place. Emotion is the message, and pitch is the medium. Incorrect (i.e. unexpected) pitch therefore causes the emotion to be “lost.” While this formulation may strike some people as strange (for example, does it mean that we are unable to register the emotional qualities of a performance from singers who can’t hit notes reliably? Is there no emotionally expressive role for pitched performances that defy their genre’s expectations?), it makes perfect sense within the current division of labor and affective economy of the recording studio. It’s a framing that makes it possible, intelligible, and at least somewhat compulsory to have singers “express emotion” as a quality distinct from the notes they hit and have vocal producers fix up the actual pitches after the fact. Both this emotional model of the voice and the model of the psychoacoustic subject are useful frameworks for the particular purposes they serve. The trick is to pay attention to the ways we might find ourselves bending to fit them.
Owen Marshall is a PhD candidate in Science and Technology Studies at Cornell University. His dissertation research focuses on the articulation of embodied perceptual skills, technological systems, and economies of affect in the recording studio. He is particularly interested in the history and politics of pitch-time correction, cybernetics, and ideas and practices about sensory-technological attunement in general.
Featured image: “Epic iPhone Auto-Tune App” by Flickr user Photo Giddy, CC BY-NC 2.0
This is article 2.0 in Sounding Out!‘s April Forum on “Sound and Technology.” Every Monday this month, you’ll be hearing new insights on this age-old pairing from the likes of Sounding Out! veterano Aaron Trammell along with new voices Andrew Salvati and Owen Marshall. These fast-forward folks will share their thinking about everything from Auto-tune to techie manifestos. So, turn on your quantizing for Sounding Out! and enjoy today’s supersonic in-depth look at sampling from SO! Regular Writer Primus Luta. –JS, Editor-in-Chief
My favorite sample-based composition? No question about it: “Stroke of Death” by Ghostface and produced by The RZA.
As the story goes, RZA was playing records in the studio when he put on the Harlem Underground Band’s album. It is a go-to album in a sample-based composer’s collection because of the open drum breaks. One such break appears in the cover of Bill Withers’s “Ain’t No Sunshine,” notably used by A Tribe Called Quest on “Everything is Fair.”
RZA, a known break beat head, listened as the song approached the open drums, when the unthinkable happened: a scratch in his copy of the record. Suddenly, right before the open drums dropped, the vinyl created its own loop, one that caught RZA’s ear. He recorded it right there and started crafting the beat.
This sample is the only source material for the track. RZA throws a slight turntable backspin in for emphasis, adding to the jarring feel that drives the beat. That backspin provides a pitch shift for the horn that dominates the sample, changing it from a single sound into a three-note melody. RZA also captures some of the open drums so that the track can breathe a bit before coming back to the jarring loop. As accidental as the discovery may have been, it is a very precisely arranged track, tailor-made for the attacking vocals of Ghostface, Solomon Childs, and the RZA himself.
“Stroke of Death” exemplifies how transformative sample-based composition can be. Unless you already know the source material, the sample is hard to identify. You cannot figure out that the original composition is Withers’s “Ain’t No Sunshine” from the one note RZA sampled, especially considering the note has been manipulated into a three-note melody that appears nowhere in either rendition of the composition. It is sample based, yes, but also completely original.
Classifying a composition like this as a ‘happy accident’ downplays just how important the ear is in sample-based composition, particularly on the transformative end of the spectrum. J Dilla once said finding the mistakes in a record excited him and that it was often those mistakes he would try to capture in his production style. Working with vinyl as a source went a long way in that regard, as each piece of vinyl had the potential to have its own physical characteristics that affected what one heard. It’s hard to imagine “Stroke of Death” being inspired from a digital source. While digital files can have their own glitches, one that would create an internal loop on playback would be rare.
There has been a change in the sound of sampling over the past few decades. It is subtle but still perceptible; listeners can hear it even if they do not know what it is they are hearing. It is akin to the difference between hearing a blues man play and hearing a music student play the blues. They technically are both still the blues, but the music student misses all of the blue notes.
The ‘blue notes’ of the blues were those aspects of the music that could not be transcribed yet were directly related to how the song conveyed emotion. It might be the fact that the instrument was not fully in tune, or the way certain notes were bent but others were not; it could even be the way a finger hit the body of a guitar right after the string was strummed. It goes back farther than the blues and ultimately is not exclusive to the African American tradition from which the phrase derives; most folk music traditions around the world have parallels. “The Rite of Spring” can be understood as Stravinsky ‘sampling’ the blue notes of Transylvanian folk music. In many regards sample-based composing is a modern folk tradition, so it should come as no surprise that it has its own blue notes.
The sample-based composition work of today is still sampling, but much of it lacks the blue notes that helped define the golden era of the art. I attribute this discrepancy to the evolution of technology over the last two decades. Many of the things that could be understood as the blue notes of sampling were merely ways around the limits of the technology. In the same way, the blue notes of most folk music happened when the emotion went beyond the standards of the instrument (or alternately the limits imposed upon it by the literal analysis of western theory). By looking at how the technology has evolved we can see how blue notes of sampling are being lost as key limitations are being overcome by “advances.”
First, let’s consider the E-Mu SP-1200, which is still thought to be the most definitive sounding sampler for hip-hop styled sample-based compositions, particularly related to drums. The primary reason for this is its low-resolution sampling and conversion rates. For the SP-1200 the Analog to Digital (A/D) and Digital to Analog (D/A) converters were 12-bit at a sample rate of 26.04 kHz (CD quality is 16-bit 44.1 kHz). No matter what quality the source material, there would be a loss in quality once it was sampled into and played out of the SP-1200. This loss proved desirable for drum sounds particularly when combined with the analog filtering available in the unit, giving them a grit that reflected the environments from which the music was emerging.
On top of this, individual samples could only be 2.5 seconds long, with a total available sample time of only 10 seconds. While the sample and conversion rates directly affected the sound of the samples, the time limits drove the way that composers sampled. Instead of finding loops, beatmakers focused on individual sounds or phrases, using the sequencer to arrange those elements into loops. There were workarounds for the sample time constraints; for example, playing a 33-rpm record at 45 rpm to sample, then pitching it back down post sampling was a quick way to increase the sample time. Doing this would further reduce the sample rate, but again, that could be sonically appealing.
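To put rough numbers on the 45 rpm workaround, here is a back-of-the-envelope sketch. It uses the 2.5 second limit and 26.04 kHz rate quoted above, and assumes the nominal record speeds of 33 1/3 and 45 rpm; the function names are mine and purely illustrative.

```python
import math

SP1200_RATE_HZ = 26_040      # sample rate quoted above
MAX_SAMPLE_SECONDS = 2.5     # per-sample limit quoted above
RPM_NORMAL = 100 / 3         # a "33" record nominally spins at 33 1/3 rpm
RPM_FAST = 45.0


def speedup_factor(normal_rpm=RPM_NORMAL, fast_rpm=RPM_FAST):
    """How much faster the record plays at 45 than at 33 1/3."""
    return fast_rpm / normal_rpm


def material_captured(seconds=MAX_SAMPLE_SECONDS):
    """Seconds of original-speed music that fit into one sample slot."""
    return seconds * speedup_factor()


def effective_rate_hz(rate_hz=SP1200_RATE_HZ):
    """Sample rate relative to the original material once pitched back down."""
    return rate_hz / speedup_factor()


def pitch_down_semitones():
    """How far the sample must be tuned down to restore the original pitch."""
    return 12 * math.log2(speedup_factor())
```

Under these assumptions the speed-up is 1.35 times, so a 2.5 second slot holds about 3.4 seconds of music, the effective sample rate falls to roughly 19.3 kHz, and the sample has to come down a little over five semitones to land back on pitch.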
An underappreciated limitation of the SP-1200, however, was the visual feedback for editing samples. The display of the SP-1200 was completely alphanumeric; there were no visual representations of the sample other than numbers that were controlled by the faders on the interface. The composer had to find the start and end points of the sample solely by ear. Two producers might edit the exact same kick drum with start times 100 samples (a few milliseconds at the SP-1200’s sample rate) apart. Were one of the composers to have recorded the kick at 45 rpm and pitched it down, the actual resolution for the start and end times would be different. When played in a sequence, these 100 samples affect the groove, contributing directly to the feel of the composition. The timing of when the sample starts playback is combined with the quantization setting and the swing percentage of the sequencer. That difference of 100 samples in the edit further offsets the trigger times, which even with quantization turned off fit into the 24 parts per quarter grid limitations of the machine.
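For a sense of scale, the two timing quantities in play here can be worked out directly: the length of a 100-sample offset, and the width of one sequencer tick. This is a minimal sketch using the 26.04 kHz rate and 24 parts per quarter figure from the text; the 90 BPM tempo is a hypothetical value chosen only for illustration.

```python
SP1200_RATE_HZ = 26_040   # sample rate quoted earlier
SP1200_PPQ = 24           # sequencer resolution quoted above


def samples_to_ms(n_samples, rate_hz=SP1200_RATE_HZ):
    """Duration of n_samples in milliseconds at the given sample rate."""
    return 1000.0 * n_samples / rate_hz


def tick_ms(bpm, ppq=SP1200_PPQ):
    """Duration of one sequencer tick in milliseconds at the given tempo."""
    return 60_000.0 / (bpm * ppq)


edit_offset = samples_to_ms(100)   # roughly 3.8 ms
grid_step = tick_ms(90)            # roughly 27.8 ms at a hypothetical 90 BPM
```

So at that tempo the edit-point difference is a small fraction of a single grid step: too fine for the sequencer to express, but baked into the sample itself every time it is triggered.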
Akai’s MPC-60 was the next evolution in sampling technology. It raised the sample and conversion rates to 16-bit and 40 kHz. Sample time increased to a total of 13.1 seconds (upgradable to 26.2). Sequencing resolution increased to 96 parts per quarter. Gone was the crunch of the SP-1200, but the precision went up both in sampling and in sequencing. The main trademark of the MPC series was the swing and groove that came to Akai from Roger Linn’s Linn Drum. For years shrouded in mystery and considered a myth by many, in truth there was a timing difference that Linn says was achieved by delaying certain notes by samples. Combined with the greater PPQ resolution in unquantized mode, even with more precision than the SP-1200, the MPC lent itself to capturing user variation.
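One common way to describe this kind of swing is as a percentage: the share of each eighth-note span given to the first of its two sixteenth notes, with the second sixteenth delayed accordingly. The sketch below assumes that convention and a tick-based grid. It is an illustration of the idea of delaying certain notes, not a reconstruction of Linn's or Akai's actual code, and the function name is hypothetical.

```python
def swung_ticks(steps, ppq=96, swing=0.5):
    """Return tick positions for `steps` consecutive sixteenth notes.

    swing is the fraction of each eighth-note span taken by the first
    sixteenth: 0.5 is straight time, larger values delay the second
    sixteenth of every pair.
    """
    eighth = ppq // 2
    ticks = []
    for i in range(steps):
        pair_start = (i // 2) * eighth
        if i % 2 == 0:
            ticks.append(pair_start)                          # on-grid note
        else:
            ticks.append(pair_start + round(eighth * swing))  # delayed note
    return ticks


straight = swung_ticks(4, ppq=96, swing=0.5)    # [0, 24, 48, 72]
swung = swung_ticks(4, ppq=96, swing=0.66)      # [0, 32, 48, 80]
coarse = swung_ticks(4, ppq=24, swing=0.66)     # [0, 8, 12, 20]
```

The finer the grid, the more distinct swing amounts it can represent, which is one way to read the jump from 24 to 96 parts per quarter.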
Despite these technological advances, sample time and editing limitations, combined with the fact that the higher resolution sampling lacked the character of the SP-1200, kept the MPC from being the complete package sample composers desired. For this reason it was often paired with Akai’s S-950 rack sampler. The S-950 was a 12-bit sampler but had a variable sample rate between 7.5 kHz and 40 kHz. The stock memory could hold 750 KB of samples which at the lowest sample rate could garner upwards of 60 seconds of sampling and at the higher sample rates around 10 seconds. This was expandable to up to 2.5 MB of sample memory.
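The trade-off between sample rate and sample time is simple division. This rough sketch assumes samples are packed at exactly 12 bits each and that a kilobyte is 1,024 bytes; the actual memory layout of the hardware may differ, so treat the results as ballpark figures only.

```python
def seconds_of_sampling(memory_kb, rate_hz, bits=12):
    """Approximate sample time for a given memory size and sample rate.

    Assumes tightly packed samples and 1 KB = 1024 bytes (both assumptions).
    """
    total_bits = memory_kb * 1024 * 8
    n_samples = total_bits // bits
    return n_samples / rate_hz


lowest_rate = seconds_of_sampling(750, 7_500)     # about 68 seconds
highest_rate = seconds_of_sampling(750, 40_000)   # about 13 seconds
```

Those figures sit in the same neighborhood as the "upwards of 60 seconds" and "around 10 seconds" given above, and they show how steeply fidelity was priced in time.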
The editing capabilities made the S-950 such a powerful sampler. Being able to create internal sample loops, key map samples to a keyboard, modify envelopes for playback, and take advantage of early time stretching (which would come of age with the S-1000)—not to mention the filter on the unit—helped take sampling deeper into the sound design territory. This again increased the variable possibilities from composer to composer even when working from the same source material. Often combined with the MPC for sequencing, composers had the ultimate sample-based composition workstation.
Today, there are literally no limitations for sampling. Perhaps the subtlest advances have developed the precision with which samples can be edited. With these advances, the biggest shift has been the reduction of the reliance on ears. Recycle was an early software program that started to replace the ears in the editing process. With Recycle an audio file could be loaded, and the software would chop the sample into component parts by searching for the transients. Utilizing Recycle on the same source, it was more likely two different composers could arrive at a kick sample that was truncated identically.
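To make the idea of chopping by transients concrete, here is a deliberately naive sketch: it flags any short window whose energy jumps well above the window before it, then slices at those points. This is not Recycle's algorithm, which I have not seen; the window size, ratio, and noise floor are arbitrary values chosen for illustration.

```python
def find_transients(samples, window=64, ratio=4.0, floor=1e-4):
    """Return start indices of windows whose energy jumps sharply."""
    onsets = []
    prev_energy = floor
    for start in range(0, len(samples) - window + 1, window):
        block = samples[start:start + window]
        energy = sum(x * x for x in block) / window
        # A transient: audible energy that is several times the previous window's.
        if energy > floor and energy > ratio * prev_energy:
            onsets.append(start)
        prev_energy = max(energy, floor)
    return onsets


def chop(samples, onsets):
    """Slice the audio at each onset; anything before the first onset is dropped."""
    bounds = list(onsets) + [len(samples)]
    return [samples[a:b] for a, b in zip(bounds, bounds[1:])]
```

Given the same file and the same settings, this returns the same slice points every time, for everyone. That repeatability is precisely what ear-based truncation never offered.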
Another factor has been the waveform visualization of samples for editing. Some earlier hardware samplers featured the waveform display for truncating samples, but the graphic resolution within the computer made this even more precise. By looking at the waveform you are able to edit samples at the point where a waveform crosses the middle point between the negative and positive side of the signal, known as the zero-crossing. The advantage of zero-crossing sampling is that it prevents errors that happen when playback goes from either side of the zero point to another point in one sample, which can make the edit point audible because of the break in the waveform. The end result of zero-crossing edited samples is a seamlessness that makes samples sound like they naturally fit into a sequence without audible errors. In many audio applications snap-to settings mean that edits automatically snap to zero-crossing—no ears needed to get a “perfect” sounding sample.
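A snap-to-zero-crossing setting amounts to a small search around the point the user clicked. The following is a minimal sketch of that idea, assuming audio held as a plain list of signed values; it is not taken from any particular application, and the function name is hypothetical.

```python
def nearest_zero_crossing(samples, index):
    """Return the index nearest `index` where the waveform crosses zero.

    A crossing is a sample that is exactly zero, or the first sample after
    the signal changes sign. Falls back to `index` if none is found.
    """
    def is_crossing(i):
        if samples[i] == 0:
            return True
        return i > 0 and (samples[i - 1] < 0) != (samples[i] < 0)

    n = len(samples)
    for distance in range(n):
        # Look outward from the requested point, one sample at a time.
        for i in (index - distance, index + distance):
            if 0 <= i < n and is_crossing(i):
                return i
    return index


wave = [0.5, 0.3, -0.2, -0.6, -0.1, 0.4, 0.7]
start = nearest_zero_crossing(wave, 3)   # 2, where 0.3 drops to -0.2
end = nearest_zero_crossing(wave, 6)     # 5, where -0.1 rises to 0.4
```

The search moves the edit point by at most a handful of samples, but it does so by rule rather than by ear.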
It is interesting to note that with digital files it’s not about recording the sample, but editing it out of the original file. It is much different from having to put the turntable on 45 rpm to fit a sample into 2.5 seconds. Another differentiation between digital sample sources is the quality of the files, whether original digital files (CD quality or higher), lossless compression (FLAC), lossy compressed (MP3, AAC) or the least desirable though most accessible, transcoded (lossy compression recompressed such as YouTube rips). These all result in a different degradation of quality than the SP-1200. Where the SP-1200’s downsampling often led to fatter sounds, these forms of compression trend toward thinner-sounding samples.
Some producers have created their own sound using thinned out samples with the same level of sonic intent as The RZA’s on “Stroke of Death.” The lo-fi aesthetic is often an attempt to capture a sound to parallel the golden era of hardware-based sampling. Some software-based samplers, for example, will have an SP-1200 emulation button that reduces bit depth to 12-bit. Most software sequencers have groove templates that allow the sequencers to emulate grooves like the MPC timing.
Perhaps the most important part of the sample-based composition process, however, cannot be emulated: the ear. The ear in this case is not so much about the identification of the hot sample. Decades of history should tell us that the hot sample is truly a dime a dozen. It takes a keen composer’s ear to hear how to manipulate those sounds into something uniquely theirs. Being able to listen for that and then create that unique sound, utilizing whatever tools: that is the blue note of sampling. And there is simply no way to automate that process.
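At its simplest, bit-depth reduction of this kind means rounding each sample to a coarser grid. The sketch below shows only that step, plus a crude rate reduction; it assumes audio as floats between -1.0 and 1.0, ignores the converters and analog filtering that shaped the real machine's sound, and does not describe how any actual product's emulation works.

```python
def reduce_bit_depth(samples, bits=12):
    """Quantize floats in [-1.0, 1.0] to the grid of a signed `bits`-bit converter."""
    levels = 2 ** (bits - 1)              # 2048 steps per polarity at 12-bit
    out = []
    for x in samples:
        code = round(x * levels)
        code = max(-levels, min(levels - 1, code))   # clamp to the signed range
        out.append(code / levels)
    return out


def decimate(samples, factor):
    """Keep every `factor`-th sample: a crude rate cut with no anti-alias filter."""
    return samples[::factor]


crunchy = reduce_bit_depth([0.12345, 0.5, -1.0, 1.0])
```

Each output value lands within half a quantization step of its input, which at 12 bits is a small error per sample. The character people associate with the hardware came from much more than this rounding alone, which is part of why a button is only ever an approximation.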
Featured image: “Blue note inverted” by Flickr user Tim, CC BY-ND 2.0
Primus Luta is a husband and father of three. He is a writer and an artist exploring the intersection of technology and art, and their philosophical implications. He maintains his own AvantUrb site. Luta was a regular presenter for Rhythm Incursions. As an artist, he is a founding member of the collective Concrète Sound System, which spun off into a record label for the exploratory realms of sound in 2012. Recently Concrète released the second part of their Ultimate Break Beats series for Shocklee.
In our current relationship with technology, we bring our bodies, but our minds rule–Linda Stone, “Conscious Computing”
I begin with an epigraph from Linda Stone, who coined the phrase ‘continuous partial attention’ to describe our mental state in the digital age. The passive cousin of multi-tasking, continuous partial attention is a reaction to our constantly connected lifestyles in which everything is happening right now and where value is increasingly equated with our ability to digest it all. Almost everything we do has the potential to be interrupted, be it by an email, a text or a tweet; often we will give only partial attention to any one thing in anticipation of the next thing that will require our attention. In this internal fight for mental attention, listening to music has been seriously impacted.
The digital era has seen more music releases than ever before. Unfortunately, the massive influx of quantity is by no means a measure of how we are engaging with said music. iPhones and similar devices, for which music players have become mere features, enable listening to become a thing of partial attention. From allowing the shuffle or random modes to choose music selections for you, or even streaming music algorithms to calculate things you might like, to listening while playing Angry Birds or reading your Twitter stream, less commitment is made to the act of listening, and as such only a portion of our working memory is committed to the experience. Without working memory actively processing musical information, it is less likely to be stored for the long term, particularly if other information is continuously vying for space and attention.
These days video games sell better than music. Despite being a digital product, games are able to instill memories (even of the music) into one’s consciousness, because the game interface allows our sensory memories to work together in an active manner with the medium. Iconic memory stores visual cues from the game, echoic memory takes the audible cues from the game, and haptic memory is engaged in controlling game play. There is only so much more that can be done while playing a video game. If something were to interrupt game play, the game would be paused to address the new information rather than giving it partial attention. This is quite different from music, which plays a background role in so much of our lives; even when we are actively putting music on, we tend to engage it with only partial attention.
When I began thinking about turning Concrète Sound System into a record label, one of my main goals was to create works that could engage the audience in active musical experiences that could create long term memories. I felt that as important as the music would be, it would take something material to create these memories, a physical product more evocative of earlier moments in recording history than the CD, its most recent gasp. I wondered if, by creatively evoking the physical object, the listener could be engaged in an active manner that would enable the memory of music and its power to persist through the everyday waves of digital noise.
The first mass duplicated audio medium was the Gold Moulded Edison Cylinder at the turn of the twentieth century. Imagine two cylinder copies of one of these recording today, as musical objects. Each of them would have over a hundred years of physical history. From the wear of the cases to the condition of the wax based on the temperature in which they were stored, each of these cylinders would be unique musical objects, with completely different histories, despite having the same origin. It is reasonable to assume that if the cylinders were played today on the same playback device, despite the fact that the compositions and performances are exactly the same, the differences between the recordings would be audible.
Even without a century of history, there would likely be audible differences between the cylinders. If one cylinder was the first copy made and another the 150th (master cylinders of Gold Moulded Edison Cylinders could only produce 150 copies reliably), the physical wear in the process of reproduction would leave its own imprint, making each of those copies a distinct musical object. In the analog world, as the technology improved, the differences between copies decreased substantially. Cassettes were manufactured in batches of ten to hundreds of thousands without audible differences. But even in circulations so high, over time each of those analog copies took on its own identity and collected its own memories.
The listener as an active agent contributed to the development of these unique musical objects. After a purchase, any number of variables played into the ritual of the first experience of the music. Was there a way to listen upon walking out of the store? Were there liner notes or lyric sheets inside? Would you read those prior to listening or as you listen? Where would you listen? Through headphones? The listening chair in front of the hi-fi stereo? Or on the boombox with some friends? All of these possibilities shaped memories as musical objects that defined the music consumption culture of the past.
For example, I bought the debut 2Pac album 2Pacalypse Now on cassette the day it was released. I loved the album so much I kept it in regular rotation in my Walkman for months until finally the tape popped. Rather than go out and buy a new copy I decided to perform a surgery. It was in a screwless reel case which meant I couldn’t just open it up to retrieve the ends of the tape trapped inside, but rather had to crack the reel case open and transplant the reels into a new body. So, my copy of the 2Pacalypse Now cassette is now inside of a clear reel holder with no visual markings. It also has a piece of tape that was used to splice it back together, which makes an audible warp when played back. I can pretty much be sure that there is no other copy of 2Pacalypse which sounds exactly like mine. While this probably detracts from the resale value of the cassette (not that I’d sell it), it is imbued with a personal history that is priceless.
Cassettes, in particular, played a significant role in the attachment of physical memories to music beyond the recordings they held. They gave birth to the mixtape. The taper community was born from personal tape recorders that allowed concert-goers to record performances they attended, and, prior to the rise of peer-to-peer sharing online, these communities were trading tapes internationally via regular postal mail. European jazz and rock concerts were finding their way back to the States, and South Bronx hip-hop performances were traveling with the military in Asia. All of these instances required a physical commitment, with which came memories that inherently became their own musical objects.
Needless to say, the nature of musical exchange has changed with the rise of the digital age of music. This is not to say that memories as musical objects have gone away, but they are being taken for granted as the objects lose their physicality. I remember going to The Wiz on 96th Street with $10 to spend on music. I spent at least ten minutes trying to decide between Sid and B-Tonn and Arabian Prince. I ended up with Arabian Prince and have regretted it since I got home and listened that day, as I never found Sid and B-Tonn for sale again. Today I could download both in the time it took me to walk to the train station. After skimming through the first few songs of Arabian Prince, I could decide it was not for me and drag and drop it into the trash, where the memory of it would disappear with the files. No matter how I felt about the music then, the memory of it is a permanent fixture in my mind because of the physical actions it took to listen.
The first release for Concrète Sound System, Schrödinger’s Cassette, tackled this issue head on by presenting the audience with its own paradox, an update of physicist Erwin Schrödinger’s famous thought experiment, where the ultimate fate of the cassette inside is left up to the individual. Schrödinger’s Cassette sought to take listeners out of digital modes of consumption by using an analog medium to evoke the physical. The cassette release trend has been growing over the last few years, almost in parallel with the rise of digital music, speaking to the need to separate music from our digital lives and to a desire to work harder for it. At the minimum, listening to a cassette requires having a cassette player, and acquiring one these days takes commitment. Unlike with digital media, listeners cannot instantly skip a song on a cassette or put a favorite on repeat. It takes physical manipulation of the medium to move through its songs, and doing so is a time investment. All these limitations make the cassette a medium that is best for linear listening, from beginning to end (unless you physically cut, rearrange, and splice it yourself).
Schrödinger’s Cassette took the required commitment a step further by encasing the cassette itself in industrial-grade concrete. This required the user to actively crack the concrete (or the French concrète, meaning ‘real’, from which the label derives its name) in order to listen to the music. The paradox is that, depending on the listener’s method for cracking, harm could be done to the cassette that might render it ‘unlistenable’. Upon receiving one of these pieces, the listener holds in their hands a musical object which they must physically act upon in order to create an unrepeatable musical event. Schrödinger’s Cassette has a look, a sound (if shaken you can hear the cassette reels), a feel, a smell, and a taste as well (though I wouldn’t advise it). All of the senses can be actively focused on the object and, as such, the whole of one’s working memory is engaged in the discernment of the object’s musical contents.
For many, Schrödinger’s Cassette was taken as a work of art and left uncracked. The Wire magazine successfully cracked one edition open, revealing a portion of the musical contents on their regular radio program. For those that decided not to crack it, digital versions were made available so that they could listen, though this option was only made available after the listener spent some time with their physical object. In this way, the music from the project, a compilation called Between the Cracks, was directly connected to physical memories spurred by a material presence.
Triggering active memory during the consumption of music through physical objects need not be this complex. Older media such as vinyl and cassette releases inherently have the physical properties required, without the concrete or much else. Perhaps for this reason they show new signs of life despite the rise of digital. No matter how much our reality is augmented by our digital lives, we still inhabit those bodies that we bring with us, and, as far as the memories those bodies carry with them go, physicality rules.
Featured Image: Wax Cylinders in the Library of Congress, Image by Flickr User Photo Phiend
Primus Luta is a husband and father of three. He is a writer and an artist exploring the intersection of technology and art, and their philosophical implications. He is a regular guest contributor to the Create Digital Music website, and maintains his own AvantUrb site. Luta is a regular presenter for the Rhythm Incursions Podcast series with his monthly show RIPL. As an artist, he is a founding member of the live electronic music collective Concrète Sound System, which spun off into a record label for the exploratory realms of sound in 2012.