Performance Keynote :: Becoming Latency-Native

Rebekah Wilson, December 4 2019. Trondheim, Norway

Composer, software developer and networked music researcher Rebekah Wilson speaks on the topic of Latency as a property of Music in the context of Networked Music Performance.


Transcript for the video. This video was filmed at the Rockheim venue in Trondheim, Norway, as the opening Performance Keynote for the Web Audio Conference 2019.

I am a composer and a technologist originally from Aotearoa/New Zealand, and I studied traditional acousmatic composition in the 90s. And I had the great fortune after that to be artistic co-director at STEIM in Amsterdam, the studio for electronic and instrumental music. And it was there that I became passionate about interactive electroacoustic music. So instead of statically creating files that would get played back, this was my introduction to live electronics, and it's been a passion of mine ever since. I was a self-taught programmer due to my passionate need for two things, which I'm sure you all share here. One, the need to make things, just stuff, the need to make stuff, and two, a need for self-expression. And in 2003, after my time at STEIM, I co-founded Source Elements in Chicago and began working on a prototype, which took about a year and a half to get out to the public, for sending high-quality audio over the internet. It was initially for the purposes of working with post-production, orchestras and studio musicians. It was not even a consideration at that time, it did not enter our minds that we might use it for making music together.

But over the last four or five years, as the potential of really amazing internet audio applications has exploded and bandwidth has exploded and access to bandwidth has increased around the globe, I became really interested to see how these tools could be used for interactive music performance. Because in my years running Source Elements, I got asked all the time, when will you fix the latency? When will you make the latency go away? As if I could do something about that in the software. It became a really important question for me because I come from an island in the South Pacific, almost as far as you can go before you come back again. So when I arrive home, when I go back for the summer on holiday from the northern hemisphere, I sit down at my computer, I get on the internet, I ssh into a server. There's always a moment, that first day, when I'm like, damn, why is it taking so long? Because we have maybe an extra 100 to 200 milliseconds of latency to get all the way over there. It's a very, very long way away, for those of you who have braved the flight to do that. But after a few days, the latency becomes natural and I don't think about it anymore. But what happens when we go somewhere a bit further away? Say, what happens when we colonize the moon? The distance between us and the moon will be almost 400,000 kilometers. And when Michael Collins circled the moon in Apollo 11, waiting for Armstrong and Aldrin to come back, he communicated back to Mission Control with a two and a half second latency. It might not seem much but, as you can see from this graphic, you really can't share a beat in time with that kind of latency. And when we colonize Mars, we're going to get 3 to 22 minutes of latency, depending on where the two planets are in their orbits, because the distance between us and Mars keeps changing.
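The moon and Mars figures in this passage can be checked with a short calculation. The following is a minimal sketch, not something shown in the talk: the distances are approximate textbook values chosen for illustration (an assumption), and the function names are hypothetical. It suggests that the two and a half seconds is a round trip to the moon, and that 3 to 22 minutes is the one-way delay to Mars at its closest and farthest.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum, km per second

# Approximate distances in km (illustrative assumptions, not from the talk).
MOON_AVERAGE_KM = 384_400
MARS_CLOSEST_KM = 54_600_000
MARS_FARTHEST_KM = 401_000_000


def one_way_delay_s(distance_km: float) -> float:
    """Time for a radio signal to cover the distance once, in seconds."""
    return distance_km / SPEED_OF_LIGHT_KM_S


def round_trip_delay_s(distance_km: float) -> float:
    """Time to send a message and receive the reply, in seconds."""
    return 2 * one_way_delay_s(distance_km)


if __name__ == "__main__":
    print(f"Moon, round trip: {round_trip_delay_s(MOON_AVERAGE_KM):.2f} s")
    print(f"Mars closest, one way: {one_way_delay_s(MARS_CLOSEST_KM) / 60:.1f} min")
    print(f"Mars farthest, one way: {one_way_delay_s(MARS_FARTHEST_KM) / 60:.1f} min")
```

No amount of software optimization changes these numbers, which is the point of the question "when will you fix the latency?".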

How will we interact with this kind of latency? We don't have a social model for this because actually, we don't think a lot about communication latency, and popular media and science fiction in particular have not helped. I mean, when was the last time you saw a movie when they're like, hello, John down on another planet? [... pause of 2 seconds ...] Yes, yes, oh, sorry, can't get that. You know what I mean? The only movie I saw recently that did that was 'The Martian'. But movies are science fiction, to be fair, they also have time travel. But for us in reality, early 21st century physics still holds. So consider the HI-SEAS experiment, the Hawaii Space Exploration Analog and Simulation. This program has been going for a while. In the Hawaiian desert, they shut Martian hopefuls in a small building and pretend they're on Mars. And when they have to go outside, they have to wear spacesuits. And on the inside, the first crew commander, Angelo Vermeulen, told me that they also have a 20 minute delay for communications back to Mission Control, which is fantastic. So they really got to experience how isolated it was being on Mars and how far away they were. And bear in mind, when the sun is between us and Mars, there's going to be radio silence. How will we manage galactic and interstellar travel? These distances are unimaginable for us.

When we think of communicating interactively as humans, we evolved to communicate at really close quarters. Even though we have some latency to our brains, it's natural and we are accustomed to it. We evolved to coordinate our actions and our interactions to this short latency. How we move in time together, how we respond to each other gesturally - high latency really messes with our sense of timing. It disrupts performance and it absolutely destroys behavior learned over millennia of communicating at what we perceive as natural latency. And in this century, aside from thinking about the long galactic epochs of the future, we are already communicating at comparatively long distances, it's part of our 21st century life. Some of us have moved away or we're working with collaborators on different continents. And while traveling is a wonderfully immense thing to be able to do for those of us who can travel, there are many who cannot. Whether this is the cost of travel, flight-guilt with climate change, physical challenges and disabilities, geopolitical conflicts or simply a desire to stay at home with our families, we have many reasons to stay and we have many reasons to go. The Internet makes us feel close, but not close enough. We can't do one of the things that we love to do as humans, make music together. Because when latency is higher than, say, 30 milliseconds, we don't experience musical interactions as fluid. And when we're not interacting fluidly, we stop feeling as if we're in sync with each other. And yet, I think the drive to play music is so strong, so fiercely strong, that we'll find solutions. So on that note, I'm going to perform a live piece of music with a friend of mine, Reinier van Houdt, who is in Rotterdam. Reinier, do we have you on the speakers? Can we... say hi again, please. So Reinier is coming up in the back speakers, he's in Rotterdam and that's about 1325 kilometers away. It's about 4.4 milliseconds at light speed, and we're connecting using a custom WebRTC app that I built called Source-Connect Now, and we are using a program of mine that I'll explain afterwards. You'll see some things happening on the screen and some noises will happen and I hope you enjoy it. Just one moment, please. Reinier, do you hear this?

Reinier: Yes.

Rebekah: Great, I'll just be a minute or less, just putting on some gear. Okay, let's see what we can do here. I have to restart something. Let's see if I'm getting any errors. Not yet. Okay, it seems to be working and here we go. Reinier, do you see series one?

Reinier: Yes.

Rebekah: Are we hearing his piano?

Voice: Yes.

Rebekah: Okay.

[ Music plays. Jump to performance in the video at 11:30 ]

Rebekah: Thank you, Reinier in Rotterdam. So, as I said before, that performance was 1325 kilometers away from us, about 4.4 milliseconds. Although maybe I didn't get the center of Trondheim, it could have been 4.5, I'm not sure. It was fairly effortless for us to play together, and we both have a microphone and really good internet. And so the question is, why are we not all doing this? We play games on the internet together, we chat, we make collaborative documents. But we don't play music over the internet. We perceive that the latency is simply too high. And given that it's a property of physics, it's not going away. Maybe one day with time travel, quantum tunneling, quantum entanglement... sadly enough, I'm not a physicist.

We should change the question, since latency won't go away. So what needs to change for us so that we can play music together? In the remainder of this talk, I'm going to introduce my research into latency aesthetics, how the internet can actually be compatible with music performance. And then hopefully, if I'm not running out of time, I can just show you technically what happened here. If we do run out of time, just come and ask me, I can show you. So while there are a number of really amazing experiments and research projects going on, as it stands today there's no strong cultural movement of playing music together on the internet. And the first reason is latency, as I've discussed. And in performance, there are actually a lot more latencies than we like to think about. For example, now with the performance with Reinier, we have the software processing latency in Chrome - I was using WebRTC with Chromium. And we've got I/O latency for the digital I/O, we have the transmission packetization, buffering, and the transmission over the internet itself, which goes through probably at least 10 hops, maybe more. And then we need extra buffers on the other side for error correction, decoding of the Opus codec and extra processing latency to mask any missing packets, not to mention air propagation of sound, and CPU and I/O latency on Reinier's side. And then we have the neurological, emotional and cognitive latencies. And all of these can be just as high as the internet latency and yet, we accommodate for that.
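To make the chain of latencies listed above concrete, here is a minimal sketch of a one-way latency budget. Every per-stage value is a made-up placeholder chosen only for illustration (an assumption, not a measurement from the performance); only the 1325 km distance comes from the talk, and the stage names are hypothetical labels for the steps the speaker lists.

```python
from dataclasses import dataclass

SPEED_OF_LIGHT_KM_S = 299_792.458  # km per second, in vacuum
SPEED_OF_SOUND_M_S = 343.0         # metres per second in air, roughly room temperature


@dataclass(frozen=True)
class Stage:
    name: str
    delay_ms: float


def light_delay_ms(distance_km: float) -> float:
    """Lower bound on network propagation: the distance at light speed."""
    return distance_km / SPEED_OF_LIGHT_KM_S * 1000.0


def air_delay_ms(distance_m: float) -> float:
    """Time for sound to travel from a loudspeaker to a listener."""
    return distance_m / SPEED_OF_SOUND_M_S * 1000.0


def build_budget() -> list[Stage]:
    # All fixed numbers below are illustrative assumptions, not measurements.
    return [
        Stage("audio input I/O", 10.0),
        Stage("browser processing", 10.0),
        Stage("packetization (one codec frame)", 20.0),
        Stage("propagation, 1325 km at light speed", light_delay_ms(1325.0)),
        Stage("routing and queueing across hops", 15.0),
        Stage("receive buffer for late packets", 40.0),
        Stage("decoding and loss concealment", 5.0),
        Stage("audio output I/O", 10.0),
        Stage("air propagation, speaker 2 m away", air_delay_ms(2.0)),
    ]


def total_ms(stages: list[Stage]) -> float:
    return sum(stage.delay_ms for stage in stages)


if __name__ == "__main__":
    budget = build_budget()
    for stage in budget:
        print(f"{stage.name:40s} {stage.delay_ms:6.1f} ms")
    print(f"{'total (one way)':40s} {total_ms(budget):6.1f} ms")
```

Under these placeholder values the light-speed portion is a small fraction of the total, which matches the point that the stages around the transmission matter as much as the distance itself.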

A second major disruption is mediated presence. When we connect over the internet, we're sending digital representations of ourselves, representations that are very limited or very stylized. And this is mediated presence where we are compressed, filtered, frame-rate limited, cropped, we're effectively sending sensory-limited avatars, which makes it difficult for us to determine the true intention of each other. If I can't see you very well, if I'm not able to glance in your direction for immediate reassurance of what we're playing together, of what you're able to do, I have to rely on other information. I believe we can accommodate for these two problems. And I completely disagree with the idea that the internet will always disrupt musicality. I'm absolutely confident that we can learn to play together over the internet. Our brains already compensate for large amounts of latency and we can learn to compensate for more. So then we should learn as much as we can about the constraints and possibilities of the internet.

So the network experience is constrained by latency and mediated presence, as I've just mentioned. And then it's also multi-located and multi-authorial. I think this part is pretty fun, it's actually one of the more interesting characteristics of network performance. We become multi-located and multi-authored, which means essentially that when we are performing in more than one space at once, we have two events happening there, two real, truthful versions of the performance happening. What Reinier heard on his side was absolutely as valid as what I was hearing, but it was slightly different. And we also need to consider the musical properties themselves: rhythm, harmony and timbre are the main ones. And how we manage these properties depends on how we manage latency, mediated presence and the multi-authorial aspects. So given this core group of characteristics, we can find an unlimited number of aesthetic approaches to work with. I'm going to present four of these aesthetic approaches here, which I've been using myself for a few years. And firstly, harmony, because latency absolutely messes with it. We expect to hear two notes as a chord when they play at once, within a very short timeframe. But as soon as you get that skewed, you know, the verticality gets skewed, we start hearing them as melodic content.

[ Jump to music excerpts in the video at 33:05 ]

So this is a very short excerpt of a Bach chorale, and you'll just hear right now what it would sound like if we were in the same room with each other.

So this is how Bach wrote it to sound, and I'm going to play the same chorale as it might have sounded in another location with other musicians over the internet.

[ Jump to music excerpts in the video at 33:28 ]

This is no longer correct harmony in the traditional sense and the yellow lines show how skewed that synchronization is with the harmonic result. And a third possibility.

[ Jump to music excerpts in the video at 33:49 ]

So, it's quite different. If I had submitted that in my harmony class, my teacher definitely would not have passed me. So that makes it clear, we're going to have to consider network music as multi-authorial and a valid experience on all sides, and we need to consider the consequences of the network when it comes to working with harmony. It's simply not possible to sync up the individual spaces so that they all sound the same when connected over the internet, because the notes are just going to arrive at different times.
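The skew heard in these excerpts can be sketched in a few lines. This is a simplified model under stated assumptions, not the method used to produce the excerpts: two hypothetical performers "A" and "B" play their notes at the same instant on a shared clock, each hears their own notes immediately, and each hears the other's notes one network delay later. The 30 ms window for hearing notes as a chord borrows the figure mentioned earlier in the talk and is used here only as an illustrative threshold.

```python
from typing import NamedTuple

CHORD_WINDOW_MS = 30.0  # assumed threshold for notes to fuse into a chord


class Note(NamedTuple):
    performer: str   # "A" or "B" (hypothetical labels)
    pitch: str
    onset_ms: float  # when the note is played, on a shared clock (an assumption)


def heard_at(site: str, notes: list[Note], one_way_latency_ms: float) -> list[tuple[float, str]]:
    """Arrival time and pitch of every note as heard at one site."""
    heard = [
        (n.onset_ms + (0.0 if n.performer == site else one_way_latency_ms), n.pitch)
        for n in notes
    ]
    return sorted(heard)


def spread_ms(heard: list[tuple[float, str]]) -> float:
    """Gap between the first and last arrival of the intended chord."""
    times = [t for t, _ in heard]
    return max(times) - min(times)


def sounds_like_chord(heard: list[tuple[float, str]]) -> bool:
    return spread_ms(heard) <= CHORD_WINDOW_MS


# One four-part chord, two voices per site, all played "at once".
chord = [
    Note("A", "C4", 0.0), Note("A", "E4", 0.0),
    Note("B", "G4", 0.0), Note("B", "C5", 0.0),
]

if __name__ == "__main__":
    for site in ("A", "B"):
        heard = heard_at(site, chord, one_way_latency_ms=120.0)
        print(site, heard, "chord" if sounds_like_chord(heard) else "melodic")
```

With any delay larger than the window, each site hears its own voices first and the remote voices later, so the two rooms hear different orderings of the same chord. Neither version is the "correct" one, which is the multi-authorial point made above.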

Another really great aspect of the network is that we've got two microphones in two different spaces. Reinier was wearing headphones, but if we had a sound engineer on his side or if he had been in a concert hall, we could have had his speakers open as well, and then we could have actually worked with some acoustic resonance and had some really beautiful feedback. Working with the feedback and the resonance of the frequencies of the room is, I think, quite a beautiful thing that we can do with timbral fusions: we can invent new-sounding instruments. And thirdly, very importantly, we are post-digital. It is an undeniable part of us. It is inevitable that the lives that we lead now are digital, because we're leading digital lives, and the digitalness is also a reference to our life. And because the transmission is digital, our representation on the other side is digital. We don't have any original analog or sensory information on the other side at all. So, we can really choose how we want to present ourselves. We can put ourselves back together in different ways on the other side, there's absolutely no obligation to reproduce reality.

The fourth aesthetic I will mention here today, and then I will go into a bit more detail on that, is performative relationships. One of the grounding requirements of us playing music together is that we flow and we have mutual trust. But how can we do that when there's latency? And how can we do that when we can't see each other properly? And how can we do that when we don't know who is playing from what speakers? How can we communicate clearly when our timbres and harmonies are blurred? But I'm firmly of the belief that we can solve these problems with the same technology that we use to make them. An example of this was the performance with Reinier. It used analytical tools to present to us as performers. You saw the score there, we were both reading exactly the same score. It was slightly different for me than for Reinier because it was using a local random generator, so sometimes the glitches look a bit different. But overall, we had the same kind of motion towards the structure. And so what we're simply doing is classifying patterns using a convolutional neural network. And it does add extra latency, because classification isn't instantaneous, but we're already working with latency over the internet, so it doesn't really seem to matter that we're adding a bit more. And just before I go into the technical details of that, I think I've still got a couple of minutes. I just wanted to thank my collaborators on this project: Adam, who wrote the JavaScript score, and Umer, who worked with me on the training and the model and made it much, much faster than I had, which was really fantastic. So basically, we trained the computer to listen to certain phrases, and you saw there were the five phrases in the score that were repeated. And then we also trained the machine to detect whether or not those phrases might be played at the same time by us. So I had those two classification events that we were seeking. The score just listens for events and moves the music along. And why this works for us: I've done three performances now with this technology, with various sizes of ensembles. And it works because we are trusting the score, the score is telling us what the other side is playing. Sometimes I could hear Reinier playing one of the phrases and then I could choose to do that, but sometimes I can't because there's a lot going on. You know, maybe I'm busy trying to think about what I'm doing or there's a really high volume sound of the electronics. But the score knows, because it's getting a clean feed of the piano from both our sides, and it can tell us exactly what's happening. And that's why the score is really effective for us. It gives us a sense of being together in time. We probably wouldn't even need to hear each other, we could just watch the score and assume what was going on on the other side, and know it sort of matched.
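The idea of a score that listens for classification events and moves the music along can be sketched as a small state machine. This is a hypothetical illustration only: the actual score was written in JavaScript and its rules are not described in the talk. The class names, the two-second "together" window and the rule that a section advances when both sites are detected playing the same phrase are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PhraseEvent:
    site: str        # which performer's feed the classifier heard, e.g. "trondheim"
    phrase: int      # which of the trained phrases was recognised
    time_ms: float   # when the classifier reported it


@dataclass
class SharedScore:
    together_window_ms: float = 2000.0  # assumed tolerance for "at the same time"
    section: int = 1
    last_seen: dict = field(default_factory=dict)  # site -> most recent PhraseEvent

    def on_event(self, event: PhraseEvent) -> bool:
        """Record a detection; return True if the score moved to the next section."""
        self.last_seen[event.site] = event
        others = [e for site, e in self.last_seen.items() if site != event.site]
        together = any(
            e.phrase == event.phrase
            and abs(e.time_ms - event.time_ms) <= self.together_window_ms
            for e in others
        )
        if together:
            # Assumed rule: matching phrases from both sides advance the score.
            self.section += 1
            self.last_seen.clear()
        return together


if __name__ == "__main__":
    score = SharedScore()
    events = [
        PhraseEvent("trondheim", 3, 0.0),
        PhraseEvent("rotterdam", 2, 500.0),
        PhraseEvent("rotterdam", 3, 900.0),
    ]
    for event in events:
        advanced = score.on_event(event)
        print(event, "-> section", score.section, "(advanced)" if advanced else "")
```

Because the window is far wider than the network delay, a few hundred milliseconds of transmission or classification latency does not change the outcome. This is one way to read the remark that the extra latency doesn't really seem to matter.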

So I have no doubt that there's a new age of digital remote music-making to come. In my imagination this music will be rich in texture, rhythms and sonic possibilities. And to achieve this, we just need to develop technology and stick with cultural investigations. And that means our technology must investigate our performance behaviors and, over time, our music and our ears will adapt. Maybe we'll embrace latency as a kind of necessary messiness, the same way orchestras and marching bands have to deal with it. Maybe there will be an entirely new kind of music where latency is a critical property, even unstable, really high latencies. And maybe right now there's a pair of 15-year-olds, you know, sitting in their bedrooms, a continent apart, using our software to invent this kind of music.