Spawn Songs: How AIs Find Their Voices

If you, like me, follow news about AI in music, you’ve probably seen an image like this:

Or this:

No humans here. Just AI, in robotic form, ready to take full control of the creative tasks of producer and composer. In fact, images intimating such a future have been circulating since the early 20th century. Compare the following image from 1930, from a campaign by a musicians’ union against the substitution of recorded music for live musicians in theaters:

[Images: deathofmusic, robot_at_the_helm]

In retrospect, it’s easy to laugh at worries that mechanical reproduction would bring the destruction of art. But we should not be so blasé as to think that it all turned out OK. Yes, the substitution of recorded music for live musicians in theaters threatened the livelihood of countless instrumentalists across the nation. But the representation of recorded music as a robot wresting control of musical culture ignored the many people and new kinds of creative labor involved in producing recorded music. We are left to wonder “what if”: what if the energy poured into a simplified, sensationalist image of mechanical substitution had instead been directed to fueling a more robust public discourse about recording’s new configurations of people and tools, and their implications for musical authorship, ownership, and compensation? Perhaps what some now regard as among the greatest injustices in the history of recorded music – the shadow status of session musicians whose creative contributions were integral to the iconic sound of hit songs, for instance, or the infamous 2005 ruling in Bridgeport Music, Inc. v. Dimension Films, a blanket decision against unlicensed sampling that failed to fathom the creative labor of hip hop musicians – might have played out differently.

Similarly, today’s robot representations do not capture the realities of AI. As Mary Gray and Siddharth Suri demonstrate in their new book Ghost Work: How to Stop Silicon Valley from Building a New Global Underclass, the AI systems in our lives depend on many largely hidden human workers. Think of search engine results, social media feeds, recommender systems – these “rely on a shared pool of on-demand workers amassed by on-demand platforms” (xvii), and these on-demand platforms “allow humans to power many of the websites, apps, online services, and algorithms most consumers think are automated” (170). By selling their services as powered by the magic of AI, companies (and the media hype surrounding them) conceal an underlying dynamic: that as we ask AI to do more, we are generating “new needs and different types of human labor to fill those needs” (xviii).

Recognizing the human labor involved in AI systems, the “humans in the loop” as Gray and Suri put it, calls for fundamentally reimagining what AI is – conjuring not autonomous robots that eliminate humans from the picture, but rather networked systems in which people play a constant, dynamic, essential part. It calls, in other words, for changing the image of AI in popular consciousness.

Re-enter music. Although music has long been shadowed by fears of robot take-over, musicians are also adept at taking in the world around them and reflecting it back in ways that can wake us up to new (and old) realities, helping us apprehend our world differently not just at intellectual but also at emotional, visceral levels.

This past summer, Holly Herndon released PROTO, an album made in collaboration with other human musicians and Spawn, her name for a collection of vocal models created using machine learning techniques. Herndon also calls Spawn her “AI baby,” and the use of AI in creating the album has driven its media coverage. But Herndon’s work resists reduction to “now AI can compose music too!”-style headlines. When asked by Jezebel’s Hazel Cills how to think about the people behind AI, Herndon responded:

That’s one of the biggest problems of AI; it’s this kind of opaque, black box technology, and when we have this glossy press release where it’s like “the machine just wrote this song” you’re totally discounting all the human labor that went into the training set that the thing learns on. That was a really important part of how we set up the project and the way that we did. We wanted the people training Spawn to be visible, to be audible, to be named, to be compensated, because I think that’s a huge part of what we’re facing with this thing today.

The track “Evening Shades (Live Training)” illustrates this desire for public recognition of the people and labor required to make Spawn work. Through the alternation of a full-voiced human choir with Spawn’s oddly filtered and stuttering efforts to repeat back the same phrase, the track makes audible some of the process – some of the human labor – involved in training Spawn to “sing.”

Rather than picture AI as a robot, then, listening to PROTO can help us imagine AI as a gathering of people and machines – a collective endeavor in which many humans work together with algorithmic systems. What if we were to start our discussions of AI from such an understanding – might we better grapple with the implications for authorship, ownership, and compensation, and with how to recognize and value the many “humans in the loop”? What if…?

Further reading:

David Turner, “AI isn’t going to replace your favorite band,” Penny Fractions newsletter, November 6, 2019.

Cherie Hu, “Slave to the ‘rithm? Not so fast: Everything you need to know about the deal between Endel and Warner Music,” Water & Music newsletter, March 25, 2019.

Melissa Avdeeff, “Artificial Intelligence & Popular Music: SKYGGE, Flow Machines, and the Audio Uncanny Valley,” Arts 8/4 (2019).