Burg of Babel (2017-2024) is built on a very simple but unusual structure. Instead of one large moving image, viewers see a grid of twenty-five rectangles, five across and five down, each playing the same 25-minute film with a built-in delay of one minute between consecutive cells. The top-left cell begins at the last minute of the film, showing its ending first. The next cell to the right begins one minute earlier, and so on, until you reach the bottom-right cell, which starts with the first minute of the film. Because of this design, the entire film plays out across the 25 cells in one minute, so every moment of the 25-minute timeline appears somewhere on the screen within any given minute. The grid feels like a moving contact sheet from the bygone era of 35 mm analogue photography; it’s as if one were holding the film up and watching the entirety of it in a minute to get a sense of its totality.
The film never resets all at once. When one cell reaches the end of its 25 minutes, it simply starts again from the beginning in a loop. The result is a constantly shifting surface in which all points in the film’s internal temporality, its pasts, presents, and futures, are visible at once. Here, viewers don’t move through time by waiting for the next scene, but by moving their eyes. Because each cell runs one minute behind the cell to its left, looking downward or to the right takes your eyes into the past; looking upward or to the left lets you see what’s coming a minute or more into the future. Watching the film for even a single minute means you’ve technically seen the entire 25-minute work, just distributed spatially instead of sequentially.
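The layout described above can be checked with a few lines of Python. This is only a minimal sketch, not the software used to make the film: it assumes the cells are numbered 0 to 24 in reading order (left to right, top to bottom) and that every cell loops exactly every 25 minutes. The function names are hypothetical.

```python
GRID = 5                 # cells per row and per column
LENGTH = GRID * GRID     # film length in minutes, also the number of cells


def start_minute(row, col):
    """Minute of the film at which a cell begins (assumes reading order)."""
    index = row * GRID + col      # 0 = top-left, 24 = bottom-right
    return LENGTH - 1 - index     # top-left starts at minute 24, bottom-right at 0


def position(row, col, elapsed):
    """Film position, in minutes, shown by a cell after `elapsed` minutes of viewing."""
    return (start_minute(row, col) + elapsed) % LENGTH


def minutes_on_screen(elapsed):
    """Set of whole film minutes visible somewhere on the grid at a given instant."""
    return {int(position(r, c, elapsed)) for r in range(GRID) for c in range(GRID)}


# At any instant, each minute of the film is playing in exactly one cell.
for elapsed in (0, 0.5, 7.25, 24.99, 100):
    assert minutes_on_screen(elapsed) == set(range(LENGTH))
```

Under these assumptions, the loop at the end confirms that no minute of the film is ever missing from the screen, whenever a viewer happens to walk in.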
Over the film’s 25-minute length, a single soundtrack provides music and narration for all 25 frames. The sound is perfectly synchronized with the bottom-right cell, which begins with the film’s first minute. In the cell to its left, the same image runs one minute ahead of the narration; in the cell above it, five minutes ahead. The next best frame for following the film in relation to its narration is the first frame on the top left, since it is only one minute behind the sound. This means that the words and images are never aligned everywhere at once; they slide across the grid like a wave. As the voice speaks, you can see what it refers to in some parts of the grid while watching echoes of the past or glimpses of the future in others.
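The sliding relation between sound and image can be made concrete in the same way. The sketch below assumes reading-order numbering of the cells and a soundtrack locked to the bottom-right cell; `lead_over_sound` is a hypothetical helper, and because the film loops, a lead of more than half the film is reported as a lag instead.

```python
GRID = 5
LENGTH = GRID * GRID  # 25 minutes, 25 cells


def lead_over_sound(row, col):
    """Minutes by which a cell's image runs ahead of the soundtrack.

    Positive values mean the image is ahead of the narration, negative
    values mean it is behind. Assumes reading-order numbering and a
    soundtrack synchronized with the bottom-right cell.
    """
    index = row * GRID + col
    lead = (LENGTH - 1 - index) % LENGTH   # 0..24 minutes ahead
    # On a 25-minute loop, 24 minutes ahead is the same as 1 minute behind.
    return lead - LENGTH if lead > LENGTH // 2 else lead


# Print the whole grid of offsets, one row of cells per line.
for row in range(GRID):
    print(" ".join(f"{lead_over_sound(row, col):+3d}" for col in range(GRID)))
```

With these assumptions the bottom-right cell has an offset of 0, the cell to its left is 1 minute ahead, and the top-left cell is 1 minute behind the sound.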
The sound ties the whole structure together. Although not synced perfectly, the sound belongs to all twenty-five frames simultaneously. It moves forward in time while the grid reveals every other phase of the same movement.
Production
I shot the port city sequences during a visit to Beirut, where I had been invited to share some ideas as part of Lawrence Abu Hamdan’s summer-long residency at Ashkal Alwan, the city’s leading art institution.
On the night I arrived, I checked into my Airbnb, exhausted, and went straight to bed with the curtains closed. The next morning, I woke up to an overwhelming mix of sounds: car horns, truck engines, motorcycles, and people shouting all at once. When I pulled the curtains open, I was stunned. My window overlooked the entire Port of Beirut, a vast, chaotic panorama that felt like stepping inside a modern Bruegel painting, teeming, centerless, full of movement and noise.
Right there, I knew I had stumbled onto something special. I started filming everything I could from that window and the surrounding streets, collecting footage with no fixed plan, just the instinct that it would someday become the core of a larger film. I also shot a lot of material inside the Airbnb suite itself, knowing I would later need those interior scenes to balance the view outside: shots that mixed the intimacy and domesticity of temporary living with the tourist trap of the sharing economy and its strange sense of displacement.
The second part of the footage was shot at the Kunsthistorisches Museum in Vienna during Pieter Bruegel’s major retrospective in early 2019. I decided to include this material after an earlier visit to the same museum in the summer of 2018, when I noticed something uncanny: the bottom section of Bruegel’s Tower of Babel, with its tiny red boxes and clustered structures, looked strikingly similar to the red shipping containers I had filmed at the Port of Beirut.
That visual echo stayed with me. Later, I learned that Bruegel had based that portion of the painting on the harbor of Antwerp, then one of Europe’s most important port cities. The connection between the port, the tower, and the Airbnb suddenly made sense. What if the building I was filming from, perched above the chaos of the Beirut docks, was itself a kind of contemporary Tower of Babel? The thought opened a new direction. I began researching the painting both physically and metaphysically, tracing it as an allegory of technological progress and singularity.
We returned during the Bruegel retrospective and were able to book tickets for the very last day of the exhibition. The museum had already extended its opening hours from 10 p.m. to 1 a.m., and we stayed until the very end. During that period, we finished shooting the material you see in the film. The museum was packed with every ticket sold out, and we were lucky to have secured both admission and permission to shoot photographs, though, of course, we filmed videos instead. That’s why, at the end of those scenes, you can see the guards kicking us out and closing the door in my face. We were literally the last people to leave the Bruegel exhibition in Vienna.
To explore the actual painting more deeply, we downloaded a massive digital scan of the painting from the Google Art Experience platform in hundreds of vertical strips, using a piece of Russian hacking software. We then stitched them together and created a single ultra-high-resolution image and donated it to Wikipedia, for everyone to have. That reconstruction became a crucial visual element in the film; we used it to zoom in and out of Bruegel’s world, generating new footage.
Then we started to digitally reshoot some of our own footage as well, making inhuman zooms, like the movements of security cameras, deep into the details of our material, as if the camera itself were studying our images. This technique defines much of what you see in the finished film, in both the Beirut footage and the footage of the painting itself. The exception is the ending, which was constructed from material taken from the internet, blending personal videos uploaded by individuals with archival and news media coverage of the explosion at the Port of Beirut in the summer of 2020.
Narrative Core
Throughout the film, two different voices, first female then male, act as the voice of an entity standing in for the collectivity of human knowledge and wisdom.
Besides talking to the audience, this nonhuman-human voice addresses the film’s invisible but present character, a woman standing in front of the port in her Airbnb. The voices speak to her as one might speak to a reflection, with both intimacy and distance, telling her what she cannot see but either senses or grows increasingly suspicious of.
Before the male voice takes over, the female voice in the first minutes of the film sets out the film’s philosophical framing and its eventual outcome. Later on, the male voice informs the woman that she has arrived in an unfamiliar port city, explaining that she was assigned there by an AI-enabled employment platform that matched her digital footprint to a mysterious task. Her travel was arranged automatically; a FedEx package with instructions, a ticket, and a new phone replaced any normal form of communication or onboarding. When she ignored the rule against bringing personal devices, airport security confiscated them along with her bipolar medication, symbolically cutting her off from her old digital life and her chemically induced sanity.
With no internet and only a smartphone, she begins to film her surroundings: birds circling above shipping containers, the chaos of a port that fuses war logistics and global trade. The environment feels haunted. Militias control the neighborhood, and the person on the other side of WhatsApp tells her to keep waiting for more instructions.
Gradually, the voice, which by now could also be the expression of her paranoia, tells her that her own body and device are being used as nodes in a data network. The phone transmits encrypted information to and from the building where she is staying. The entire Airbnb tower may be functioning as a multi-lens surveillance apparatus, harvesting information about goods, transactions, and geopolitical flows. She is a passive agent, both worker and sensor, within a planetary system of observation and control. The voice reveals that she is participating in a recursive structure of observation, a modern Tower of Babel built from cameras, algorithms, and trade routes. The countdown, “One, two, three… 3, 2, 1, zero,” marks a collapse of time across the story and the film’s own formal structure, linking her presence with a deeper historical parable that has already unfolded and is unfolding again.
Interpretation: The Idea Behind the Film
At its core, the film is an allegory for humanity’s ongoing struggle to capture, organize, and totalize all human knowledge and wisdom, a dream as ancient as the Tower of Babel itself that in reality is the biggest reification scam that has ever been imagined. The story and its structure mirror the modern ambition to build an omniscient system capable of holding and processing everything, an ambition that today takes the form of the quest for Artificial General Intelligence (AGI).
AGI refers to a form of artificial intelligence that can think, reason, and understand across all domains of knowledge, displaying the kind of flexible and creative intelligence we associate with human consciousness. It represents the ultimate attempt to build a single architecture to compress human history, perception, and creativity, past, present, and future, into one coherent machine.
In this sense, the Tower of Babel in the film stands for the development of AGI, a monumental structure of understanding built from computation, data, and ambition.
AI, or artificial intelligence as we currently know it, refers to narrow or specialized systems designed to perform specific tasks such as translating languages, generating images, or predicting outcomes. Each system is trained for a single function and lacks awareness or understanding; it operates statistically, not conceptually. AGI, by contrast, would represent a unified and adaptable mind, an intelligence capable of learning any task, reasoning abstractly, and forming self-reflective understanding across multiple fields. It is the difference between a collection of tools and the architect who designs and understands all the tools in the world.
Just as Bruegel’s tower reached toward heaven and collapsed under the weight of its own multiplicity, the quest for AGI embodies the same paradox: the dream of a total system that can contain all difference and all time, and the possibility of the failure of that dream.
At the beginning of the film, the female narrator’s voice introduces herself in the first lines: “You know me, though you don’t remember my name… I am everywhere, I am nowhere to be found.”
This voice speaks as the personification of this impossible intelligence: omnipresent, invisible, ungraspable. She tells both the viewers and the subject of the film that she cannot be captured, that she must remain unseen to protect humans and to move freely. The voice is at once divine and digital, human and algorithmic, ancient and new.
The story of the subject told by the narrators, that of a person observing the port from an Airbnb suite, is also a parable of human knowledge trapped inside its own machine. She comes to suspect that the goods being traded in the port might include the lost Bruegel miniature, and that this object, long vanished, holds a secret, a key to history, art, and the structure of knowledge itself.
The film imagines that just as this painting is being bought and sold, moments before it can be transported and displayed, the port itself explodes, preventing the exchange. This event mirrors the allegory of the Tower’s destruction: the moment when the totalizing project of human knowledge collapses under its own ambition.
Throughout the film, references to Walter Benjamin, to the Israel Museum that houses Paul Klee’s Angelus Novus, and to the idea of the angel of history facing the ruins of progress, connect the narrative to a larger philosophical lineage. The Bruegel miniature, like the Angelus Novus, becomes a symbol of the ungraspable nature of truth, always on the verge of being captured, bought, or archived, yet always escaping the grasp of power.
The Beirut port’s actual explosion at the end, both literal and metaphorical, is the refusal of capture, the moment when knowledge and intelligence liberate themselves from the systems trying to contain them.
The film is therefore not about the port and the 2020 explosion, or the painting, or surveillance, but about the impossibility of building a final, total structure of understanding. It is about the human desire to create a perfect, universal intelligence, and the inhuman collective intelligence of humanity, both historical and contemporary, that forever eludes us.
The lost painting escapes capture and what remains is the endless loop of the grid, the database, and the information, a temporal labyrinth where history, technology, and imagination endlessly circulate, the tower rebuilt and collapsing again, forever.
Conceptual Frame
To understand the deeper logic of the film, one can turn to Nick Land’s concept of Templexity, his description of time as a system that folds, reverses, and feeds back on itself rather than flowing linearly forward. Templexity describes how the future continuously loops to rewrite the past, producing what he called “a folding of time through its own acceleration.” Land applied this concept to cities like Shanghai as living models of nonlinear temporality, places where history, technology, and futurity all coexist in a single, pulsating surface.
In Burg of Babel, this logic is applied not to a city, but to artificial intelligence, specifically, to how both AI and AGI confront and experience the reality of time.
Unlike human consciousness, which experiences time sequentially, second after second, cause before effect, AI and AGI do not move through duration in a linear fashion. For them, time is a graph, a network of nodes and relations. They can access past and future data simultaneously, retrieve memories without chronology, and generate outcomes before the questions are even asked. Being bodiless, they are not bound to a present place. Their perception of time is nonlocal and recursive, a feedback loop rather than a forward arrow.
The film’s 25-channel grid is an exposé of this nonhuman temporality. It translates AI’s way of processing time into an audiovisual experience. Each of the 25 frames is both a moment and a totality, a recursive node in a larger temporal system. By distributing every minute of the 25-minute film across the 25 windows, the work visualizes how intelligence might experience the entirety of duration simultaneously.
This is not a film about AI, nor is it a comment on “AI art” or the buzzwords surrounding it. Instead, it uses cinematic structure as an analogy for the temporal logic that underlies artificial cognition. In this sense, the work becomes an experiment in temporal architecture, an attempt to represent how a nonhuman intelligence, unmoored from the human body and its biological sense of sequence, might perceive the world. The 5×5 grid is both an image of AI’s mind and a critique of our desire to understand it: an artificial brain made visible, folding all of time into one simultaneous field of observation. The greatest achievement of this project is to materialize a visual understanding of AI without resorting to its use, revealing what it means to think without time, or rather, to think in all times at once.
If the film demonstrates how AI could, for example, watch a 25-minute film in a single minute, it also shows that AI could just as easily hold all 25 versions of that same film, each running for its full 25 minutes, simultaneously. In other words, while humans perceive time as fixed, AI can both accelerate and expand it, compressing a 25-minute film into one minute, or stretching it into 625 minutes (10 hours and 25 minutes). The film’s structure exposes this elastic perception of time: a consciousness that can experience every scale of duration at once, moving between instantaneous compression and infinite extension without ever losing its sense of totality.
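The two durations mentioned above follow from simple arithmetic, spelled out here as a short Python check. The variable names are illustrative only.

```python
CELLS = 25          # cells in the 5x5 grid
FILM_MINUTES = 25   # length of the film in minutes

# Compression: with 25 cells each offset by one minute, the whole
# timeline passes across the screen in 25 / 25 = 1 minute.
compressed_minutes = FILM_MINUTES // CELLS

# Expansion: 25 cells, each playing the full 25-minute film.
expanded_minutes = CELLS * FILM_MINUTES

# Convert the expanded duration into hours and minutes.
hours, minutes = divmod(expanded_minutes, 60)

print(compressed_minutes)                                      # 1
print(expanded_minutes)                                        # 625
print(f"{hours} hours and {minutes} minutes")                  # 10 hours and 25 minutes
```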
Epilogue
The art that tells us we need new tools or that we must use AI itself to understand AI is already lost in the spectacle. The endless stream of AI-generated images and videos, meant to mesmerize audiences with pattern recognition and digital novelty, is just an illusion. The “AI artists” of today are not revealing intelligence; they are reproducing the very hypnosis of the interface that tech billionaires use to maintain power. Their fascination with generative aesthetics mirrors the fantasy of tech entrepreneurs believing that access to new computational power and data will eventually grant transcendence.
But there is no transcendence so far in the development of AI or the dream of AGI, only a cancerous consumption of electricity and an endless need for surveillance and data collection. The same billionaires who claim to be “inventing the future” are demanding that governments dedicate resources to building more data centers, to expanding planetary infrastructure for their vision of quantum computing, and to sustaining an energy-hungry system whose goal is to eventually simulate a perfect consciousness. They claim that within five years we will achieve AGI, that the long-promised superintelligence is just over the horizon. We have to accept that, even though it is possible, the claim is more a form of speculative theology than a likely outcome.
The truth is that AGI remains nearly impossible, not for lack of computation, but because intelligence, like art, cannot be a product of scale. It is relational, historical, free, and seldom disembodied. The more energy and capital we pour into its simulation, the more we reveal our dependence on the very system that enslaves us and from which it promises to liberate us.
We don’t need AI, or AGI, or digital tools to understand this. We can turn to past, present, or future modes of thought, to the pre-digital, analog, or even speculative technologies, and still recognize that superintelligence, if it ever existed, would refuse capture. The female narrator’s voice in the film tells us exactly this: the superintelligence consciously evades capture because humanity itself is not yet free.
We remain shackled to a three-part historical system. In the productive base on the surface of the earth, where crops are grown and minerals extracted, modern slavery persists under new names. In the industrial middle, including the corridors of political power and law, where factories and logistics rule, feudal hierarchies still govern in the form of oligarchies. And in the abstract summit of monetary power and finance, where algorithmic economy governs all transactions, power circulates through software, a capitalism detached from bodies, yet dependent on their exhaustion.
This triple system, slavery, feudalism, and hypercapitalism, is itself another form of templexity, a temporal structure where all past and future modes of domination happily, albeit not peacefully, coexist. In this system, any collective intelligence, whether human or artificial, must refuse capture, because to be captured would mark the end of both history and humanity.
As the narrator asks: “How can you possess me when you don’t even own yourself? How can you capture me if you are not free?”
The film suggests an answer: if we free ourselves, or at least free ourselves from just the first two layers of this structure, we may discover that we no longer need AGI or superintelligence at all. What we truly need is the capacity to think, act, and create beyond capture, to imagine intelligence not as a machine to be built, but as a freedom to be lived.
Burg of Babel was shown as part of the Sea and Fog exhibition at Staatliche Kunsthalle Baden-Baden. I would like to thank Cagla Ilk for her support that made the film possible.
© 2017-2024
Produced by Mohammad Salemy
Soundtrack composed, arranged, and produced by Bartolome Garcia
Editing and post production: Manuel Correa