Text & Sound – Københavns Universitet


Audio books, eBooks and digital archives share a material and bibliographical affinity: all rely on electronic equipment to transform stored information into audible or legible text on a playback or display device for the reader or user.

From our different approaches to the work on mediated sound we will raise a series of questions concerning digitisation. If form affects content, how do we take the playback or display devices into account in our interpretation of different electronic documents? What role does, for example, storage in new formats play (from analogue to digital)? What information is lost and what is gained as new technological methods of analysis become accessible?

The digitisation of texts (both written and auditive) will increasingly subject research in the humanities to similar material conditions. Therefore we would like to ask the participants of this seminar these two questions: What does the future hold in store for the material approach to texts and documents? And how do we meet the challenges that the digitisation of textual material poses?


Text & Sound – Documents & Digitisation
[Note: As yet, this site only contains Klaus Nielsen’s part of the paper.]
We will not be presenting a specific joint project today, but instead two different perspectives on the material approach to auditive texts. First, Bente will tell us about the LARM project and the challenges of establishing a digital sound archive with bibliographic and analytic tools. Then, Klaus will give an example of how sound analyses can be used as a tool for literary studies through a case study (Gitte’s Monologues by Per Højholt) taken from his current PhD project.

But before we hear from Bente we must turn our attention once again to Johnny Weissmüller and the 1930s Tarzan yell that I played earlier. This is what the sound looks like when displayed graphically.

Fig. 1. Johnny Weissmüller in Tarzan the Ape Man, MGM 1932


Perhaps not all are familiar with this type of visualisation – so very briefly explained: Sound is caused by oscillations or vibrations in the air, and it is in fact these oscillations we can see. If we zoom in this becomes more evident. The horizontal axis represents time: the closer together the oscillations are, the higher the pitch of the sound. The vertical axis represents amplitude: the taller the oscillations are, the louder the sound.
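To make the two axes concrete, here is a minimal sketch in Python that builds two artificial tones and measures them. It is not taken from the Tarzan recording: the sample rate, frequencies and amplitudes are assumptions chosen purely for illustration, and the function names are my own.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed value, for illustration only)


def sine_wave(freq_hz, amplitude, duration_s=1.0, sample_rate=SAMPLE_RATE):
    """Return the samples of a pure tone as a list of floats."""
    n = int(duration_s * sample_rate)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]


def count_upward_zero_crossings(samples):
    """Count how often the curve passes from below zero to zero or above.

    Each oscillation produces one such crossing, so more crossings in the
    same time span means oscillations that lie closer together.
    """
    return sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)


def peak_amplitude(samples):
    """Return the largest distance of the curve from the zero line."""
    return max(abs(s) for s in samples)


# Two hypothetical one-second tones: a low, quiet one and a high, loud one.
low_quiet = sine_wave(freq_hz=220, amplitude=0.2)
high_loud = sine_wave(freq_hz=440, amplitude=0.8)

print(count_upward_zero_crossings(low_quiet), peak_amplitude(low_quiet))
print(count_upward_zero_crossings(high_loud), peak_amplitude(high_loud))
```

The higher tone shows roughly twice as many crossings in the same second (horizontal axis), and the louder tone reaches further from the zero line (vertical axis).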

Fig. 2. Sound in close-up


Notice now when I play the Tarzan yell again. First forwards:

And then backwards:

Weissmüller’s yell is a so-called phonetic palindrome: a word or phrase that is identical forwards and backwards in its pronunciation, though not necessarily in its orthography. These are very rare, by the way. But the Tarzan yell is not only a phonetic palindrome; it is also a material palindrome. This characteristic sound has quite simply been cut and pasted together – conjured up by a clever and dexterous sound engineer. The first part of the yell was recorded, copied, flipped over and pasted onto the original.
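The cut-and-paste operation described above can be sketched on digital samples as follows. This is only an illustration of the principle: the sample values are invented, the function names are hypothetical, and the sketch says nothing about the tools the original sound engineer actually used.

```python
def make_material_palindrome(first_part):
    """Append a time-reversed copy of the samples to the original samples."""
    return list(first_part) + list(reversed(first_part))


def is_material_palindrome(samples):
    """Check whether the samples are identical forwards and backwards."""
    return list(samples) == list(reversed(samples))


# Invented sample values standing in for the recorded first part of the yell.
recorded = [0.0, 0.3, 0.7, 0.2, -0.4, -0.1]
yell = make_material_palindrome(recorded)

print(is_material_palindrome(recorded))
print(is_material_palindrome(yell))
```

The check compares the stored values themselves rather than what a listener hears, which is the sense in which the palindrome is material and not only phonetic.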

Calling the yell a text would probably require an unnecessary stretching of the concept of the text. That is not my objective at all. The point, rather, is that the playback device – in this case the computer – opens up a new and different view of the materiality of sound. Here we have a short sound bite that everyone knows – an iconic sound if you will. Using the computer’s audio software as playback device enables us to meet the sound on its own material terms. I believe that in this way we may acquire a more nuanced understanding of sound as text than more traditional forms of transcription offer.


Gitte and Audio Analysis
One of Denmark’s most prominent modernist and post-modernist poets, Per Højholt, published his immensely popular Gitte monologues between 1980 and 1984. They were originally written for Danish National radio and therefore met their audience as readings on radio before they were printed and published. The first collection came in book form in 1981. Then followed two audio releases on vinyl LP and music cassette in 82 and 83. On these Højholt performed the monologues with a live audience in the studio. Finally, in 84, came a complete collected edition in book form. In addition, Højholt toured throughout the country, performing several hundred live shows dedicated to the monologues.

Gitte is a somewhat naïve girl from the provincial outskirts of Jutland. Højholt has on several occasions described Gitte’s very particular take on language as a decidedly physical phenomenon: »she talks with her body«, as opposed to the rest of us, who waste time on syntax and the like. Roland Barthes claims that the body is in fact what is lost when spoken language is transcribed. This is interesting because it calls into question the validity of the printed editions as proper representations of Gitte’s particular nature. But how then can we retain the body in our analysis and interpretation of the work? This is where audio analysis comes in handy.

In addition to the actual monologues, the two audio releases contain a vast amount of text in the form of the author’s own introductions to each monologue. The relationship between these two voices in the work, i.e. Gitte’s and Højholt’s, is particularly interesting, and now it’s time for listening. [I realize that for non-Danish speakers these audio clips will make little sense. But the importance of this particular analysis lies not so much in the linguistic contents of the text as it does in the tonality and phonetic qualities of the voices. Therefore I urge readers to listen even though the texts are incomprehensible. My own rough translations appear on mouse over.]

First, we will listen to two examples from the first LP in 82. This one is Gitte:



And this one is Højholt:



As you can probably hear there is a big difference between the characteristics of the two voices. If we look at a smaller portion – first of Gitte’s voice – we can tell the software to show the pitch curve of Gitte’s speech. This brings us a step closer to the physicality of the voice.

What we see here can be compared to music notation. The blue dots represent measurements of pitch: low pitch at the bottom and high pitch at the top. The software in this case is Praat, and I am indebted to phonetician John Tøndering from the Department of Scandinavian Studies and Linguistics, University of Copenhagen, for his tremendous help in using the software. Here we see that Gitte’s register lies fairly high and that her variations are steep. The two large curves in fact represent impressive modulations of more than an octave.
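For readers who want to see what "more than an octave" means numerically: an octave is a doubling of frequency, or 12 semitones. The following is a minimal sketch that checks the span of a pitch track. The frequency values are hypothetical and are not measurements from the Højholt recordings; `None` is my own stand-in for moments where no pitch could be measured.

```python
import math


def semitones_between(f_low_hz, f_high_hz):
    """Return the musical interval between two frequencies in semitones."""
    return 12 * math.log2(f_high_hz / f_low_hz)


def spans_more_than_an_octave(pitch_track_hz):
    """Check whether the highest and lowest measured pitches lie more than
    12 semitones apart. None marks frames without a measurable pitch."""
    voiced = [f for f in pitch_track_hz if f is not None]
    return semitones_between(min(voiced), max(voiced)) > 12


# Hypothetical pitch measurements in Hz, one per analysis frame.
example_track = [210.0, 260.0, None, 340.0, 450.0, 300.0, 205.0]

print(semitones_between(220.0, 440.0))
print(spans_more_than_an_octave(example_track))
```

The logarithm is what makes the measure musical: the same interval counts as the same number of semitones whether it occurs in a high or a low register.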

If we compare this with Højholt’s voice, we see a striking difference. His register is much lower and his variations are smaller.  

These were both from 82. Now, let us hear an example from the 83 release, where things are quite different. Here we have a transition from the author’s introduction to Gitte’s monologue. Notice how much the two sound alike, both in register and dynamics. Here we find Højholt producing the highest notes and Gitte the lowest.


These are of course only small examples but I have analysed the full contents of both audio releases and calculated the averages of each voice. The results shown in the diagram below tell the same story. In 82 (left column) Gitte’s average pitch lay about 6 semitones higher than Højholt’s. In 83 (right column) the two averages are almost identical. 
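A comparison of this kind can be sketched as follows. Note the assumptions: the text does not say whether the averages were taken in Hz or in semitones, so averaging on a semitone scale is my choice here, and the 100 Hz reference, the function names and all frequency values are hypothetical.

```python
import math
from statistics import mean


def hz_to_semitones(f_hz, ref_hz=100.0):
    """Express a frequency in semitones relative to a reference frequency."""
    return 12 * math.log2(f_hz / ref_hz)


def mean_pitch_semitones(pitch_track_hz):
    """Average the measured pitches of one voice on a semitone scale.
    None marks frames without a measurable pitch and is skipped."""
    return mean(hz_to_semitones(f) for f in pitch_track_hz if f is not None)


def average_gap(track_a_hz, track_b_hz):
    """Return how many semitones voice A lies above voice B on average."""
    return mean_pitch_semitones(track_a_hz) - mean_pitch_semitones(track_b_hz)


# Invented pitch tracks in Hz; not data from the 82 or 83 releases.
higher_voice = [280.0, 300.0, None, 320.0, 290.0]
lower_voice = [200.0, 215.0, 205.0, None, 210.0]

print(round(average_gap(higher_voice, lower_voice), 1))
```

A gap of about 6 semitones corresponds to a frequency ratio of roughly 1.41 between the two averages, while a gap close to zero means that the two voices share the same register.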

Fig. 3. Average pitch of Gitte’s and Højholt’s voices in 82 (left column) and 83 (right column)


This change of vocal characteristics that we can both hear and see through our audio analyses is crucial to our understanding of Gitte’s Monologues. There are several other structural aspects that point similarly to the mutual contamination of Gitte’s diegetic universe, i.e. the narrated world, and the more or less biographical world inhabited by the narrator (and author) Højholt. But this will have to be the subject of another presentation.

In our abstract we posed a series of questions concerning this material approach to sound. Unfortunately, we have not dealt with all of them, and in conclusion we would like to pose one more question: How do we handle the materiality of digitisation in our interpretative endeavours? We have seen examples of how digitisation can give us a new perspective on sound. But what is lost in the process? As with e-books and other forms of electronic texts, audio has two forms of materiality: one linked to storage, and one linked to playback. However, the process of digitisation itself creates yet another material distinction: that between copy and original. When I measure the tonal variations of Højholt’s voice, I am actually measuring the copy. When the LARM archive within the next couple of years opens up a vast portion of our auditive cultural heritage, it will be an archive of copies. Paradoxically, this means that the user is removed one step from the materiality of the text, whilst the accessibility and the potential for analysis are at the same time greatly enhanced. This, I believe, is an important part of the challenges posed by digitisation.