
Rabbits, Ducks, and Deepfakes

Multistability/Metapictures in the Age of AI Images
By: Cody Rooney
December 01, 2025
A black-and-white drawing that can be seen as either a rabbit or a duck, depending on how you look at it. The rabbit’s ears double as the duck’s beak, creating a playful optical illusion that shifts between the two animals.
Duck/Rabbit Image, ChatGPT, 2025

Like the Duck–Rabbit, the image before us may shift, but the real movement happens in the spectator.

You’ve probably scrolled past an image recently and wondered: is this real or AI? Lately our entire visual field seems inundated with a wave of synthetic imagery that has collapsed our ability to trust our own sight. Across Instagram, TikTok, and other platforms, discourse abounds about whether a given image or video is AI-generated. “Is this AI?” might be one of the defining phrases of our contemporary era.

That moment of hesitation, when one encounters an image and tries to decipher whether it is AI-generated, might seem like a banal or even slightly frustrating phenomenon endemic to our digital age. But it may in fact be the breeding ground for a new politics of vision.

To speak about images today means thinking about vision as a form of power and as the ground on which one builds their sense of reality. Our ways of seeing are layered, contradictory, and socially produced, a phenomenon Martin Jay calls a “scopic regime,” that is, the cultural frameworks and practices that structure our visual field. Jay describes modern visuality as a “contested terrain,” made up of distinct visual cultures that reveal how complex the act of looking really is. He identifies Cartesian perspectivalism as the dominant structure of Western vision since the fourteenth century. This framework imagines an observer who stands outside the world and sees from a detached, unified point of view, grounded in a naturalized, scientific model of sight that privileges geometric order, uniformity, and the promise of a transparent window onto reality.

In the wake of artificial intelligence, today’s defining visual logic is shifting. ChatGPT, Sora, Midjourney, and other artificial intelligence platforms now produce droves of images and videos that have begun to saturate the visual field at every level, infiltrating advertising ecosystems, populating social platforms, and filtering into the screens and surfaces that mediate everyday experience. We may be on the threshold of a new scopic regime characterized by AI-generated images’ ability to draw on the same visual grammar that perspectivalism established centuries ago.

The goal of AI imagery, largely, is to reproduce the transparent window of vision so convincingly that the difference between simulation and world becomes impossible to parse. These images are assembled through statistical recombination rather than representation. They operate as simulacra, surfaces without origin or interior, often more plausible and polished than the realities they echo. The effect is a new hyperreality in which the line between depiction and fabrication no longer holds. In this sense, AI functions as an algorithmic trompe-l’œil: an image that does not simply fool the eye but unsettles how sight produces knowledge in the first place. This moment marks a turning point in the long legacy of the perspectival regime, which upheld realism as its central promise. AI now weaponizes that same promise, using the tools of realism to produce images whose stability can no longer be guaranteed, and whose truth-value is always in question.

Yet this constant questioning might in fact signal the genesis of an entirely new regime of visuality. W. J. T. Mitchell describes a genre of multistable images he terms “metapictures” that “show themselves in order to know themselves.” These images foreground their own act of representation, producing a doubled relation to seeing. His central example of the metapicture is the Duck–Rabbit, perhaps the most famous “multistable” image in modern psychology. Mitchell writes that the Duck–Rabbit is “the ideal hypericon because it cannot explain anything (it remains always to be explained). If it has a ‘doctrine’ or message, it is only as an emblem of resistance to stable interpretation, to being taken in at a glance.” The Duck–Rabbit draws perception out of hiding by posing a simple yet impossible question: is this a duck or a rabbit? Its effect is the opposite of perspectival clarity. Instead of offering a single stable meaning, it produces oscillation, hesitation, and a continual re-seeing. It reveals the instability of looking itself.

A similar perceptual process now occurs in the consumption of all manner of media in this emerging scopic regime. As the eye flits between perceiving an image as a reproduction of reality and perceiving it as mere verisimilitude produced by artificial intelligence, every image in contemporary culture behaves as a multistable metapicture. Every advertisement, every post on social media, every viral TikTok is subjected to the same process of “oscillation, hesitation, and a continual re-seeing.” Within every act of mediation, the contemporary spectator is subject to the “‘effect of interpellation’... the sense that the image greets or hails or addresses us.”

Jacques Rancière, in “The Pensive Image,” reminds us that the politics of spectatorship must be continually re-examined: “It is these principles that should be re-examined today… gaze and passivity, exteriority and separation, mediation and simulacrum; oppositions between the collective and the individual, the image and living reality, activity and passivity, self-ownership and alienation.” Today this re-examination is occurring continually across our visual field. In consuming AI-generated media, and in questioning the veracity of all media, we are asked to reconsider the very foundations of visuality that we have taken for granted. AI-driven visuality is making mediation itself visible.

As digital culture grows more saturated with synthetic images, the simple act of questioning what we see becomes increasingly politically significant. In this way, the unintended consequences of AI may contain a small kernel of possibility. Even as AI extends the logics of automation and mediation, it also makes those logics visible. It prompts viewers to slow down, to look twice, and to recognize the structures shaping their perception. If this emerging scopic regime teaches us anything, it is that seeing can become an active, critical practice rather than a passive one. Like the Duck–Rabbit, the image before us may shift, but the real movement happens in the spectator.

About the author: Cody Rooney is a writer, editor, creative director, and multimedia artist based in Toronto. He is a PhD student in Communication and Culture at Toronto Metropolitan University and Editor in Chief of Liminul Magazine, where his work explores philosophy of technology, digital mediation, and contemporary visual culture, with a focus on how AI and media systems reshape perception and aesthetic experience. 

Insights & Ideas is a ComCult blog series showcasing the research and expertise of ComCult students. Designed to engage a broad audience, the series features op-ed-style posts that connect academic insights to real-world issues, making complex ideas accessible and relevant. Each entry highlights the unique perspectives and innovative thinking within the ComCult program. We invite you to explore more stories that amplify research and inspire ideas!