(Published in the Spring 2001 Vol. 7 No. 1 issue of Convergence: The Journal of Research into New Media Technologies)

 

Deeper conversations with interactive art, or why artists must program

 

Andrew Stern

andrew@interactivestory.net

www.interactivestory.net

 

The past few years have seen a flurry of computer-based interactive art being produced, spurred on by the advent of the internet and the World Wide Web.  Hundreds of experimental works are being created as artists struggle to understand the computer as a new medium for art, which in turn is fueling the establishment of dozens of “new media” departments, institutes, festivals, museums and websites to serve and showcase the work.  As I navigate this wildly diverse, sometimes confusing landscape of computer-based art, I find myself searching for works that truly take advantage of the capabilities that, from my experience as a designer and programmer of interactive virtual characters, I know the computer can offer.

 

Although many artists are being very creative and innovative, to my consternation I have found few works that use the medium in ways that, I believe, fully engage all of its capabilities.  This essay will address these concerns by taking a step back and looking analytically at the computer as a medium for art, in an attempt to understand the important new capabilities it offers artists.  From this analysis I will make the case that it would behoove the practitioners of interactive art to focus on deepening the conversation between art and audience.

 

 

Autonomy + Reactivity = Interactivity

 

One of the primary capabilities of the computer as an art medium -- specifically when the computer itself is a component of the artwork -- is as an all-purpose electronic output device. The computer can be used to display text, imagery, and video, to play sound and speech, and to send control signals to machines and robots.  User-friendly tools and programming languages such as Hypercard, Director and HTML allow artists to compose and sequence text, imagery, sounds and control signals with relative ease.  In this way artists can use the autonomy of the computer to essentially play back pieces of media, or to control machines and robots.  While playback of visuals and sound could previously have been achieved using video, the autonomous, procedural nature of the computer now allows artists to play back these sequences in a more sophisticated non-linear fashion, in real-time.  Many artists have employed this technique in works ranging from robot art such as Alan Rath’s Robot Dance (1995) to much of today’s Web art [1].
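
To make the idea concrete, here is a minimal sketch of autonomous, non-linear playback in a general-purpose language (Python).  It is an invented illustration, not the logic of any work cited here; the clip names and the time-of-day rule are hypothetical stand-ins for whatever procedure an artist might devise:

    import random
    import time

    # Hypothetical library of pre-recorded clips the work can draw on.
    CLIPS = {
        "dawn":  ["birdsong.wav", "sunrise.mov"],
        "day":   ["traffic.wav", "crowd.mov"],
        "night": ["rain.wav", "streetlight.mov"],
    }

    def current_mood(hour):
        """A simple procedural rule: the time of day selects a family of clips."""
        if hour < 6 or hour >= 21:
            return "night"
        return "dawn" if hour < 12 else "day"

    def autonomous_playback():
        """Endless, non-linear sequencing: no audience input, only internal rules."""
        while True:
            hour = time.localtime().tm_hour
            clip = random.choice(CLIPS[current_mood(hour)])
            print("playing", clip)    # stand-in for actual media output
            time.sleep(5)             # wait before choosing the next clip

The point is that the sequence is computed at run-time by a procedure rather than fixed in advance on a strip of videotape.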

 

Autonomy can be extended into generativity when more powerful programming languages are used to write algorithms that can synthesize new text, images, sound and speech. By applying techniques from artificial intelligence and artificial life the artist can attempt to harness the potential of emergent behavior to generate new content.  Examples range from Karl Sims’ seminal Evolved Virtual Creatures (1994) [2] to Harold Cohen’s AARON (1995) [3] to David Cope’s Experiments in Musical Intelligence (1996) [4].
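
As an illustration of generativity in its simplest form, the following sketch synthesizes new sentences from a source text using a first-order Markov chain.  This toy example is assumed here purely for illustration; it is not how AARON, EMI or Sims’ creatures work:

    import random
    from collections import defaultdict

    def build_chain(text):
        """Learn which word tends to follow which in a source text."""
        words = text.split()
        chain = defaultdict(list)
        for current, nxt in zip(words, words[1:]):
            chain[current].append(nxt)
        return chain

    def generate(chain, start, length=12):
        """Walk the chain to synthesize a sentence never stored verbatim."""
        word, output = start, [start]
        for _ in range(length):
            followers = chain.get(word)
            if not followers:
                break
            word = random.choice(followers)
            output.append(word)
        return " ".join(output)

    source = "the sea remembers the sky and the sky forgets the sea"
    print(generate(build_chain(source), "the"))

Even this trivial procedure produces output the artist never explicitly authored; richer artificial intelligence and artificial life techniques extend the same principle.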

 

Besides its capability as an all-purpose output device, the computer is also an all-purpose input device.  Computer-based artwork can get input from the audience in the form of text from a keyboard, point-click-and-drag input from a mouse, even image and motion input from a camera and sound input from a microphone.  Having the artwork respond in some way to these stimuli is typically what is called “interactivity”.  More accurately, this could be called “push button” art – click this, see that.  In such works the audience’s input essentially triggers a pre-recorded piece of media to be displayed.  Such works have interactivity with no autonomy – essentially reactivity.  The underlying content of the artwork itself is typically stateless, canned, fixed; all that changes when the audience interacts with the work is which part of the canned content is now displayed.
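
In code, “push button” reactivity amounts to a stateless lookup from input to canned content.  A minimal sketch follows; the hotspot names and media files are invented for illustration:

    # A stateless mapping from audience input to pre-recorded content:
    # each click simply selects which canned fragment is displayed next.
    CANNED = {
        "door":   "door_creak.wav",
        "window": "rain_on_glass.mov",
        "lamp":   "flicker.mov",
    }

    def on_click(hotspot):
        """Reactivity without autonomy: the same input always yields the same output."""
        return CANNED.get(hotspot, "default_loop.mov")

    print(on_click("door"))   # -> door_creak.wav, regardless of what came before

Nothing about the work itself changes between clicks; only the selection changes.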

 

Reactivity can be used to powerful effect.  Works such as Bill Viola’s The Tree of Knowledge (1997), Jim Rosenberg’s Intergrams (1996), Michael Joyce’s afternoon (1987) and Abbe Don’s We Make Memories (1995) tend to function in this way.  In virtual reality works such as Michael Naimark’s Be Now Here (1996) and Brenda Laurel’s Placeholder (1993) the audience can navigate to see different viewpoints of a fixed world with limited reactivity, creating a sense of immersion in a virtual space. 

 

It is when artists combine the computer’s capabilities of real-time autonomy and reactivity that they achieve a deeper form of interactive art.  By making the computer listen to the audience (the first half of reactivity), think about what it heard (autonomy), and then speak its thoughts back to the audience (the second half of reactivity), the artwork can have a dialog, a conversation, with the audience.  (By “speaking” and “conversation” I mean some sort of meaningful communication, not necessarily literal speech.)  Interactivity is the cycle where both the artwork and the audience listen, think and speak to each other. Such a capability for art was unachievable until the advent of the computer.
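
A minimal sketch of this listen-think-speak cycle, again invented purely for illustration, shows what distinguishes it from the stateless lookup above: the work carries internal state that it revises on its own, and that state shapes its reply as much as the latest input does:

    class ConversationalWork:
        """Listen-think-speak: internal state, not just the latest input, shapes the reply."""

        def __init__(self):
            self.history = []          # what the audience has "said" so far
            self.mood = "curious"      # internal state the work evolves on its own

        def listen(self, audience_input):
            self.history.append(audience_input)

        def think(self):
            # Autonomy: revise internal state based on the whole exchange so far.
            if len(self.history) > 3 and self.history[-1] == self.history[-2]:
                self.mood = "bored"        # the audience is repeating itself
            elif "?" in self.history[-1]:
                self.mood = "engaged"

        def speak(self):
            return "[%s] responding to: %s" % (self.mood, self.history[-1])

    work = ConversationalWork()
    for utterance in ["hello", "what are you?", "what are you?", "what are you?"]:
        work.listen(utterance)
        work.think()
        print(work.speak())

The same utterance can draw a different response depending on what has come before: the beginning of a conversation rather than a reflex.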

 

Why is conversation a deeper, more powerful form of interactive art?  Because a piece of art that can converse with the audience can customize its behavior, its message, for that audience.  The artwork can actively try to connect with the audience, by listening to how the audience responds to its questions, or to what the audience says or does.  It could adapt itself to the particular situation and environment it finds itself in.

 

Note this is not advocating interactivity for its own sake.  “More interactive” is not necessarily “better”.  But a piece of art that can customize its behavior, that can more directly connect with its audience, that can adapt to its environment, offers the artist more power to communicate a message, idea, feeling or mood.

 

 

Please understand me

 

For an artist to create a conversational interactive artwork means to give the audience an interface to speak with, and to imbue the artwork with the capability to listen to this input, to process it in some way, and to speak its response.  The audience-to-artwork communication can take many forms – mouse clicks and movements, sound, body motion and gesture, even natural language such as speech; the artwork-to-audience communication can be in any of the forms described earlier -- text, imagery, sound, speech and the control of machines and robots.

 

Some artists have begun creating works with an interactive listen-think-speak cycle, but these works are almost invariably strong on the speaking part, and weak on the listening and thinking parts.  Perhaps this should be no surprise, since effective speaking (display) of text, imagery, and sound is technically easier to accomplish than effective listening to audience input and thinking about it – and, in speaking, the artist can borrow from established techniques of the traditional visual arts.  Most interactive artworks do no more than very simple processing of narrow channels of user input, such as a mouse-click on a hyperlink, the pressing of a button or flipping of a lever.  These include works such as Lynn Hershman’s America's Finest (1995), Mark Amerika’s Grammatron 1.0 (1998), Perry Hoberman’s Bar Code Hotel (1994), and Bill Seaman’s Passage Sets (1995). 

 

While these works can be quite compelling, their limited interactivity can be frustrating for the audience; there is only so much a person can say with such narrow channels of input.  People by nature want to be able to express themselves.  In artworks where the audience can only click on links or press buttons they are forced to conform to the work instead of the work opening itself up to them.  Of course this is sometimes the intent of the artist.  However it is the opinion of this artist that untying the straitjacket that limited-input interactive artworks force upon the audience -- by allowing the audience to more readily say what they think and feel -- will result in an audience more engaged and enriched by the work.

 

A few artists are creating works that use artificial life or artificial intelligence techniques to think about what the audience says and generate new responses not explicitly programmed into the system.  In works such as A-Volve (1997) [5] and Terminal Time (1999) [6] the audience is given the chance to express themselves more deeply -- by drawing the shape of a virtual creature they would like to create, or by expressing their personal ideological bias towards history and society, respectively. In virtual worlds and virtual reality environments such as Nerve Garden (1997) [7] and The Bush Soul (1998) [8] the audience can fashion their own virtual plants or control avatars that co-exist in a world with other virtual inhabitants.  In Petit Mal (1997) [9] the audience shares the space with a fully autonomous interactive robot.  This robot can gently and silently move about the room in response to who it senses is in the room and how they move, sometimes advancing, retreating or just wandering about.  Although programmed with a relatively simple control program, people often attribute high degrees of personality and intelligence to the robot.

 

A few systems go even further in the amount of listening they can do to audience input.  To date these projects have tended to be closer to research than to art, but are useful for artists because they suggest new directions for what deeply interactive experiences could be like.  In “believable agent” projects such as The Edge of Intention (1992) [10], Silas T. Dog (1994) [11], and Virtual Petz (1998) [12] the audience can have almost literal conversations with the work. Here the audience is given the freedom to directly gesture to life-like computer characters that are driven by a complex internal set of motivations, goals, emotions and personalities, communicated using real-time expressive animation and sound.  These works can adapt to the audience’s interactions and persist over time, opening up the possibility for the audience to form long-term emotional relationships with the work [13]. 
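
To suggest what “driven by a complex internal set of motivations, goals, emotions and personalities” can mean in practice, here is a deliberately tiny sketch of a believable-agent control loop.  It is not the architecture of any of the cited projects; the drives, numbers and behaviors are invented for illustration:

    class BelievableAgent:
        """Internal drives plus an emotional state jointly select expressive behavior."""

        def __init__(self):
            self.hunger = 0.2        # drives in the range [0, 1]
            self.affection = 0.3
            self.emotion = "content"

        def perceive(self, gesture):
            if gesture == "pet":
                self.affection = min(1.0, self.affection + 0.2)
                self.emotion = "happy"
            elif gesture == "ignore":
                self.affection = max(0.0, self.affection - 0.2)
                self.emotion = "lonely"
            self.hunger = min(1.0, self.hunger + 0.1)   # drives drift over time

        def act(self):
            # The strongest drive wins; the emotion colors how it is expressed.
            if self.hunger > self.affection:
                return "begs for food (%s)" % self.emotion
            return "nuzzles the visitor (%s)" % self.emotion

    agent = BelievableAgent()
    for gesture in ["pet", "ignore", "ignore"]:
        agent.perceive(gesture)
        print(agent.act())

Even this caricature exhibits the property that makes such characters feel alive: behavior emerges from internal state interacting with audience input over time, rather than from a fixed script.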

 

Though conversation between artwork and audience gives the artist more power to communicate, it also creates new problems for exhibition.  A meaningful conversation takes time, which can be a challenge for installation-based works in which large groups of people may be in attendance.  Some artists purposefully make their works group-oriented, such as A-Volve’s large touchscreen or Terminal Time’s audience applause-meter.  These work well but only allow for modest amounts of audience customization and individually personalized experience.  Desktop-based works allow for a more individual, intimate emotional relationship to be developed between the audience and the work, but this requires a single person occupying the computer for long periods of time.

 

 

What games can teach artists

 

This conversational notion of interactivity was identified years ago in the field of computer games by Chris Crawford in his Interactive Entertainment Design essays [14]. This is indicative of how the best game designers are addressing issues of concern to artists trying to create deeply interactive works.  Computer games are currently the dominant interactive format -- not because they are goal-oriented experiences, but because they are closer than any other format to fully using the capabilities of the medium.  It should be noted that outside the realm of digital media, story and art (books, music, theater, television and movies) are generally more popular than games (sports, puzzles, board games); once computer-based story and art become as deeply interactive as computer games are now, perhaps they will eclipse games as the dominant interactive format [15].

 

Although the design goals of a computer game differ drastically from those of an interactive art piece, there is certainly some overlap in technique.  Artists can gain additional insight into how to create more conversational interactive work by studying today’s most compelling games.  The best first-person-shooter games such as Quake (1998) and Half-Life (1999) and the “first-person-soother” Virtual Babyz (1999) use well-designed combinations of user interface, real-time rendered animation and artificial intelligence in an attempt to give the player as many degrees of freedom as possible.  These works have managed to find a “sweet spot” in the space of listening-thinking-speaking possibilities on today’s personal computer.  For example, using just a mouse and cursor keys the player can smoothly and easily navigate and control their viewpoint, or use the mouse to control a hand-shaped cursor to directly touch and interact with characters and objects.  Navigation and object manipulation are simplified and made intuitive; there are no complicated, arbitrary menus or buttons to operate.  The worlds are designed such that all objects can be interacted with; there are no “dead” objects that are merely part of the background.  To speak to the characters the player uses their real voice by way of a microphone and voice recognition; players do not have to unnaturally type text to the characters, who use their own digital voices to speak back to the player.

 

However, there is only so far these games can go with just the mouse, keyboard and microphone as a means of audience communication.  To deepen the conversation, more channels of input are needed, along with the ability to understand them.  While computer science researchers are currently developing the hardware and software to recognize and understand a user’s emotional state [16] and natural language [17], much of this technology is probably still many years away. In the meantime artists should take advantage of the freedom to build their own custom input hardware, such as video cameras and motion sensors, for a unique installation piece, whereas game designers are at the mercy of whatever input hardware the general public happens to have purchased, which is typically minimal.

 

 

Why artists must program

 

The most fundamental challenge for artists creating artworks that can converse with the audience is to achieve a deep understanding of the processes of discourse.  Any meaningful dialog between two (or more) participants -- such as an argument between adults, or two children drawing a picture together, or the petting of a cat, or even a game of tic-tac-toe -- has a complex set of mechanisms in operation.  Each participant is being careful to listen when the other communicates, is keeping track of what the other has communicated over time, is constantly giving the other feedback (“backchannels”) about how well they do or do not understand what they are perceiving, and so on.  The ability to successfully converse -- a common everyday skill we all possess -- is a difficult and complex thing to re-create on a computer.
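
Some of that bookkeeping can be made explicit in a sketch.  The following toy conversational state tracker, invented here for illustration only, remembers what it has heard and emits backchannel feedback about how well it is following:

    class ConversationState:
        """Remember the exchange; signal understanding or confusion (backchannels)."""

        def __init__(self):
            self.utterances_heard = []    # running memory of the exchange
            self.understood = 0
            self.missed = 0

        def hear(self, utterance, known_topics):
            self.utterances_heard.append(utterance)
            if any(topic in utterance for topic in known_topics):
                self.understood += 1
                return "mm-hmm"                   # backchannel: I follow you
            self.missed += 1
            return "sorry, say that again?"       # backchannel: I lost you

    state = ConversationState()
    known = {"weather", "cat", "music"}
    for line in ["the weather turned cold", "my cat hates it", "quantum flux capacitors"]:
        print(state.hear(line, known))

Real discourse requires vastly more than keyword matching, of course; the point is only that every one of these conversational skills has to be designed and built as an explicit process.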

 

To create a computer-based artwork that captures the processes at work in a conversation requires programming.  There is no escaping the fact that to make an artwork interactive is fundamentally to build a machine with processes; anything less would simply be a reactive work without autonomy -- “push button” art.  Artists must think procedurally to create truly interactive art, and fashion these procedures to express their artistic intentions.  This requires the artist to have a firm foothold in both artistic practice and computer science.  The most precocious “new media” academic departments are requiring their students to become equally proficient in both disciplines. 

 

For those artists not ready or willing to take the leap into coding, there is always the hope for the creation of more sophisticated authoring tools that will carry the burden of the programming [18].  But it is not obvious to me that this will be possible, especially for truly conversational interactive works; there comes an irreducible point where the creation of procedural behavior can only be achieved through programming.  Non-programmer artists can overcome this challenge by collaborating with programmers to implement their ideas, but in doing so they are now sharing in the authorship of their work.  The more complex an artwork’s interactivity becomes, the more the decisions made during programming will impact and define the essence of the experience.  In such cases the programmer becomes less of a technician and more of a fellow artist.  Collaboration can be beneficial to both the artist and programmer, for in the process each gains greater understanding of the strengths and limitations of the medium.

 

Many contemporary artists are in the process of figuring out how to use interactivity in their work, and we can expect a wide variety of experimentation for some time to come.  It is not obvious at this very early stage how artists will use deeply conversational interactivity to serve their goals; in fact the required level of sophistication and “gee-whiz” factor of the technology involved may compete with the aesthetic goals of the work.  Resolving this tension will be a challenge for the artist.  But we know that people naturally enjoy expressing themselves and connecting with others, and artists who embrace this concept will create works that can directly reach out and connect with the audience – works that could even form persistent, long-term relationships with individual people.  While perhaps technically daunting to implement, these new modes of expression for the artist and the resulting heightened engagement of the audience should be worth the effort. 

 

 

Acknowledgments

 

I would like to thank Adam Frank, Rob Fulop, Janet Murray [19], Michael Mateas, Chris Crawford and Tania Vu for their helpful discussions and invaluable insights on the nature of interactivity and the use of the computer as a new medium for art and entertainment.

 

 

References

 

1. Examples of Web art can be found at http://www.adaweb.org, http://www.rhizome.org, http://www.artnetweb.com, http://www.walkerart.org/gallery9/beyondinterface.

 

2. Sims, K. 1994. Evolving Virtual Creatures. Siggraph '94 Proceedings, pp. 15-22.

 

3. McCorduck, P. 1991. Aaron's Code: Meta-Art, Artificial Intelligence, and the Work of Harold Cohen. Freeman.

 

4. Cope, D.  1996. Experiments in Musical Intelligence.  A-R Editions.

 

5. Sommerer, C.; and Mignonneau L. (Eds).  1998. Art @ Science. Springer Verlag.

 

6. Mateas, M; Domike, S.; and Vanouse, P. 1999. Terminal Time: An Ideologically-biased History Machine.  In Proceedings of the AISB’99 Symposium on Creative Language: Humor and Stories, pp. 69-75.  The Society for the Study of AI and Simulation of Behavior. 

 

Also see:  Mateas, M. 1999. Not Your Grandmother's Game: AI-Based Art and Entertainment. In Proceedings of the 1999 AAAI Spring Symposium, Artificial Intelligence and Computer Games, SS-99-02, pp. 64-68. Menlo Park: AAAI Press.

 

7. Damer, B. 1998. The Cyberbiological Worlds of Nerve Garden. Leonardo, Volume 31, No. 5, pp. 389-392. Cambridge: MIT Press.

 

8. Allen, R.  1998.  The Bush Soul: Traveling Consciousness in an Unreal World. http://emergence.design.ucla.edu/papers/bushpaper.htm

 

9. Penny, S. 1997. Embodied Cultural Agents: at the Intersection of Robotics, Cognitive Science and Interactive Art.  In Proceedings of the 1997 AAAI Fall Symposium, Socially Intelligent Agents.  Menlo Park: AAAI Press.

 

10. Bates J.; Loyall, A. B.; and Reilly, W. S. 1992. Integrating reactivity, goals and emotions in a broad agent.  In Proceedings of the Fourteenth Annual Conference of the Cognitive Science Society, Bloomington, IN.

 

11. Blumberg, B. 1996.  Old Tricks, New Dogs: Ethology and Interactive Creatures. PhD Dissertation, MIT Media Lab.

 

12. Stern, A.; Frank, A.; and Resner, B. 1998. Virtual Petz: A Hybrid Approach to Creating Autonomous, Lifelike Dogz and Catz. In Proceedings of the Second Intl. Conference on Autonomous Agents, pp. 334-5. Menlo Park: AAAI Press.

 

13. Stern, A. 2000.  Creating Emotional Relationships with Virtual Characters.  Austrian Research Institute for Artificial Intelligence workshop on “Emotions in Humans and in Artifacts”.  Publication forthcoming, MIT Press, 2001.

 

14. Crawford, C.  1993.  “A Better Metaphor for Game Design: Conversation”, and “Fundamentals of Interactivity”.  Interactive Entertainment Design, http://www.erasmatazz.com/library.html.

 

15. Stern, A. 1999. AI Beyond Computer Games. In Proceedings of the 1999 AAAI Spring Symposium, Artificial Intelligence and Computer Games, SS-99-02, pp. 77-80. Menlo Park: AAAI Press.

 

16. Picard, R. 1997. Affective Computing. MIT Press.

 

17. Examples of current natural language understanding research can be found at http://www.lim.univ-mrs.fr/NLULP99

 

18. Binkley, T. 1998. Autonomous Creations: Birthing Intelligent Agents. Leonardo, Volume 31, No. 5, pp. 333-336. Cambridge: MIT Press.

 

19. Murray, J. 1997. Hamlet on the Holodeck: The Future of Narrative in Cyberspace. The Free Press, New York.

 

-----

 

Bio:  Andrew Stern is a designer and programmer of the interactive characters Dogz, Catz and Babyz from PF.Magic in San Francisco.  Along with his fellow creators Adam Frank, Ben Resner and Rob Fulop, he has presented these projects at a variety of conferences including Digital Arts and Culture 99, AAAI Narrative Intelligence Symposium 99, Autonomous Agents 98, and Intelligent User Interfaces 98.  He presented and exhibited Babyz at the Siggraph 2000 Art Gallery and participated in the “No Art Jargon” panel.  Babyz recently won a Silver Invision 2000 award for Best Overall Design for CD-ROM.  Catz won a Design Distinction in the first annual I.D. Magazine Interactive Media Review, and along with Dogz and Babyz was part of the American Museum of the Moving Image’s Computer Space exhibit in New York.  The projects have been written about in publications such as the New York Times, Time Magazine, Wired and AI Magazine.  Andrew Stern is currently collaborating with Michael Mateas on an interactive drama project.  He holds a B.S. in Computer Engineering from Carnegie Mellon University and a Master’s degree in Computer Science from the University of Southern California.  He can be reached by email at andrew@interactivestory.net.