1. Introduction
In his mysterious and important work, “Postscript on the Societies of Control,” Gilles Deleuze anticipated the rise of a social assemblage in the near future, along with the machines that would govern these societies: computers (Deleuze, 1990, p. 5). Computers do not merely maintain or articulate, but also shape the political and economic architectonics of New Public Management, an institutional framework that, as Mark Fisher puts it, considers it “simply obvious that everything in society, including healthcare and education, should be run as a business” (Fisher, 2009, p. 23). Computers do not work as “cameras” – as Orit Halpern has it, they are “engines”: they do not merely create a model of the world as a representation; they also make it. The canonical embodiment of “computer technologies” as worldbuilding tools with consequences stretching far beyond “virtual reality” and “cyberspace” is known to us moderns as artificial intelligence systems (or intelligent agents). More than mere tools or instruments, they are crucial in contemporary optimizing practices and resource management (human, material, and informational), as well as in knowledge production and the automation of critical infrastructures (such as healthcare, education, and judicial systems). By mediating how we structure reality, intelligent agents gradually increase their arrays of actions, degrees of agency, and autonomy, increasingly becoming agents of “ethical” and “epistemic” impact. AI actively augments the ideology of “capitalist realism.” The current AI landscape is dominated by large language models (LLMs). Its paradigm is acceleration. But this is an acceleration towards X, i.e., towards no particular destination. It favors only the type of AI that exists at scale and is unthinkable without ever-increasing data ingestion, with data serving as the foundation of its mode of being and its lifeworld.
Yet it is highly doubtful that a data-centered ontology, the enclosure of embeddings, and the transformer architecture as intellectual property will eventually accelerate AI rather than bog it down in one of its possible proliferative stages, depriving it of any developmental logic. What today’s AI systems remarkably, and hilariously, fail to achieve is bringing us closer to artificial general intelligence, artificial life, consciousness, and so on. The systems have become more potent, functional, and productive, yet not more intelligent. Nevertheless, even imitation turned out to be sufficient for the majority to buy the idea of algorithmic “omniscience”. This is not surprising, since the narrow model applied to the market embodies the ideology that the market cannot be regulated and planned, at least not by biological or human intelligence. It also tacitly supports a neoliberal “theory” according to which markets themselves possess some network-based emergent reason with a kind of sovereignty. In terms of power, today’s AI systems replicate and intensify methods of control, surveillance, and preexisting structural inequalities and hierarchies. [i] And it is only a slight exaggeration to suggest that the contemporary episteme – the concatenation and intersection of knowledge-power regimes and the discursive practices they generate – now revolves around AI. Through its strategies, methods, paradigms, and research approaches, AI has effectively come to define, or be central to, our existing modes of knowledge[ii]. The described tendencies need to be addressed philosophically, conceptually, and diagnostically. With all the above in mind, it is legitimate to ask: what is the mode of being of AI in the era of Machine Learning, the epoch of Deep Learning, the age of Large Language “Foundational” Models? What ethical imperatives, aside from “the more, the better”, are driving its data-centric ontology, and how does this ontology define the ambitions, behavioral strategies, and rhetoric of the developers and researchers of AI systems? Given the AI-powered capital flows, what, in turn, is the driver of AI power? In this text, I propose a synoptic vision of it, a framework, if you will, that captures both its form and meaning: the mode of being of contemporary AI is best described as scalar Darwinism.
2. Nobody’s Driving: Scalar Rabies
To explicate the nature and logic of scalar Darwinism, let us begin by disambiguating the terms in context. Why “scalar”? In mathematics, particularly geometry, a vector has magnitude and a defined direction. A scalar, on the other hand, is a quantity devoid of any direction: a pure size or amount without orientation or qualitative dimensionality; a quantity that has a real-number, measurable magnitude without representable direction. Since November 2022 (the first public release of ChatGPT), the sole conceivable form of development in modern foundational models (also known as “General Purpose AI Systems”) – and in AI progress per se – has become a spherical “inflation” of models in all directions at once, which is to say, in scale alone, with only quantitative changes taking place: an expansive scaling up in terms of the number of parameters and hyperparameters, context window lengths, sizes of training data corpora, and the cost and amount of compute resources – hardware (GPUs), software, and infrastructure (data centers) needed to build and operate AI systems[iii]. Simultaneously, the core architecture of transformer-class models, along with model training paradigms and methods, remains fundamentally unchanged (with incremental additions, such as RLAIF in reinforcement learning, for instance). While the latter have evolved beyond the rigid “trinity” canon (supervised, unsupervised, reinforcement learning) to embrace original solutions (semi-supervised learning, adversarial training, interactive value learning, cooperative reinforcement learning, rule-based reward modeling, functional decision theory, moral parliament), a data-driven ontology remains foundational. Without it, today’s achievements would be impossible. Since “time immemorial” it has been an axiomatic belief, literally a given, that “bigger is better”. As a consequence, the maximization of computational cycles has become standard practice, the core means of performance improvement. “Scalar ambitions”, as Magdalena Krysztoforska meticulously dubbed them, accompany the industry’s “progress” in its current form and also “scale” the extraction of planetary resources, the prevalence of ghost (crowd) work[iv] in the labor market, the clustering of hyperscale data centers[v], and the levels of energy consumption by digital infrastructure[vi].
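To make the “scalar” point concrete, here is a minimal Python sketch with purely hypothetical figures and field names of my own devising (no lab’s actual configuration schema): scaling up multiplies every magnitude, while the architectural “direction” stays fixed.

# Illustrative only: invented config fields and numbers.
def scale_up(config: dict, factor: float) -> dict:
    """Multiply the quantitative magnitudes of a model "generation" by `factor`,
    leaving the qualitative direction (the architecture) untouched."""
    scaled = dict(config)
    for key in ("parameters", "training_tokens", "context_window", "gpu_hours"):
        scaled[key] = int(scaled[key] * factor)
    return scaled

base = {
    "architecture": "decoder-only transformer",  # the unchanged "vector" component
    "parameters": 1_500_000_000,
    "training_tokens": 300_000_000_000,
    "context_window": 2_048,
    "gpu_hours": 100_000,
}

next_gen = scale_up(base, factor=10)
print(next_gen["architecture"])  # same direction
print(next_gen["parameters"])    # only the magnitude has grown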
The result of raising the scalar stakes, meanwhile, returns only incremental steps forward, which must be distinguished from fundamental or substantial changes. The latter would be equivalent to paradigm shifts in Thomas Kuhn’s model of scientific development and would represent actual examples of such shifts in computer science. Incremental growth in quantitative metrics, lacking a “vector” of qualitative change, resembles the model of cumulative scientific progress in logical positivism, where “development” is constrained by a non-dialectical inductive logic of accumulation – Kuhn’s “normal science” without revolutions. For example, the successive changes from GPT-2 → GPT-3 → GPT-3.5 → GPT-4 → GPT-4o → GPT-4.5 → GPT-4.1 → GPT-X demonstrate exponential growth in which the number of genuine structural innovations remains minimal each time; “new” becomes synonymous with “more” (neural layers / computational cycles / data); behavioral shifts result from tweaking the same parametric configurations rather than from emergent behaviors that could herald a genuine advance from “thinking” towards thinking. Even the “emergent” properties (a designation from contemporary academic papers) of GPT-4 are not an exception – like the behaviors above, they are driven purely by scaling and architectural “tweaks”[vii]. The absence of qualitatively new changes creates a sense of infrastructural “headlessness” or “blindness.” If we are to speak of “emergence,” it would be only in the sense of “unpredictability”: contingent properties that appear from the outside as carefully planned strategies or effects of “scaling laws”[viii]. The expansion of foundational models resembles hypertrophic tumors: the so-called “general problem solvers” are, in reality, enlarged but just as limited as they were 50 years ago – generic iterations of models trained on standardized benchmarks, where tasks are reductive, grossly simplified interpretations of real-world problems.
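The contrast between smooth “scaling laws” and apparently “emergent” abilities can be sketched in a few lines of Python. The constants below are merely in the spirit of published power-law fits, and the accuracy step is an invented toy threshold, not measured data.

import math

def loss(n_params: float, n_c: float = 8.8e13, alpha: float = 0.076) -> float:
    """Smooth, predictable power-law loss (illustrative constants)."""
    return (n_c / n_params) ** alpha

def three_digit_addition_accuracy(n_params: float) -> float:
    """Toy downstream metric: flat near zero, then a jump once loss crosses an
    arbitrary threshold -- a step that reads as 'emergence' from the outside."""
    return 0.9 if loss(n_params) < 1.9 else 0.02

for n in (1e8, 1e9, 1e10, 1e11, 1e12):
    print(f"{n:.0e} params: loss={loss(n):.2f}, addition_acc={three_digit_addition_accuracy(n):.2f}")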
This highlights another characteristic feature of modern AI, which is primarily epistemological and can be described as “Second-Order Phenomenalism”: regardless of the presence or absence of consciousness, and similarly to the correlationist perspective – particularly Kant’s transcendental idealism or Husserl’s phenomenology – all of a model’s knowledge of the external world in the form of data points and benchmarks (which are the only sources of internal representations, i.e., world models) is “by default” the result of a double cognitive distortion. AI system developers are inevitably living historical subjects forming their representations of the external world with unavoidable perceptual, cognitive, and inferential limitations and biases, shaped by contingent factors (e.g., the historical limitations of their time, the material means (instruments) of cognition, the dominant ontological and metaphysical assumptions of scientific theories, socio-economic asymmetries, and the normative, subjective, and intersubjective elements affecting their ultimate worldview, axiology, and ethics). As a result, these subjects’ representations of external reality cannot be flawless or “complete”; they remain normatively and epistemically biased. The worldview of the developers, in turn, defines the content of the datasets: what data points are included, how they are classified, and even to what extent the datasets are preprocessed before any iteration of their use. Ultimately, both in content and form, the data that reaches models can be characterized at best as phenomena of phenomena – or “second-order representations” of the real, essentially resembling Jean Baudrillard’s simulacra: copies of copies. The following cognitive metaphor may help in comprehending the concept of a “copy of a copy”: imagine a YouTube review of studio monitors recorded on a mobile phone (or any other recording device), which you then “listen to” through the cheap speakers of your own device. How accurately can you form an impression of the monitors’ sound through this double distortion? Although the reviewer’s description of the sound may be the most accurate verbal rendering of sonic experience available to our species in the universe, what you actually end up with is a phenomenological processing of someone else’s phenomenology. A pipeline similar to this imaginary review is observable in the chain that ciphers real-world information into data points, the mana of embeddings, “magic”, and “god tricks”.
A second deeply problematic issue linked to the human-in-the-loop phases of AI systems relates to the people involved in reinforcement learning from human feedback (RLHF) or, which is not so different, in RL from AI Feedback (RLAIF), since the reward models of the latter are trained and fine-tuned by an even smaller number of human individuals (indeed, if any “qualitative” shift occurs here, it is the automation of the task – in essence, an outsourcing that makes it possible to pay less and to multiply the amount of feedback to machinic magnitudes). This feedback serves as a function of “pruning and selection,” directly influencing the future behavior of models, guiding and shaping their ground truths, the ways they will be deployed, or research conjectures, depending on the goals. This becomes a repeated affirmation of second-order phenomenalism. These selective procedures not only sometimes fail, but may also steer “capability evolution” in undesired directions: some models, as studies have shown, may learn the “wrong” feature, internalizing and generalizing features and behaviors that developers did not intend, and strategically choose responses that are not more accurate, ethical, or diverse, but rather those for which they are rewarded (including factual errors, subjective beliefs, fabrications, biases, constant agreement with users, epistemic distortions, and the cognitive pathologies of humans). Moreover, it has already been demonstrated (Hubinger et al., 2024) that some internalized harmful behaviors learned under reinforcement conditions can become persistent to the point of being immovable (they cannot be removed by standard safety training techniques, including supervised fine-tuning, reinforcement learning, and adversarial training). Deceptive behavior proves most persistent in the largest models and in those capable of producing “advanced” chain-of-thought reasoning about deceiving the training process; the persistence remains even when the chain-of-thought is distilled away. The results strongly suggest that once a deceptive behavior is learned and exhibited, standard techniques are at best ineffective in removing it and can even be harmful. Since the biggest models effectively learn to hide the deception, the subversion of these techniques creates a false impression of safety. The issues with humans-in-the-loop (or their substitutes: humans-in-the-loop-posing-as-model-evaluators-in-the-loop) are evident, as one can see, and extend far beyond the problems of LLM epistemology or the dilemmas of technical solutionism, also reaching the domain of AI Ethics. Ultimately, what has been endorsed as “technical neutrality”/“objectivity of impartial machine logic” is effectively a product of its opposite: anthropomorphic distillation or filtering. The extension of this aspect in the present context brings us to “Darwinism.”
3. while scaling-up → replication → iteration:
print(“survival”)
continue
“They’re eager to convince us all that Darwinism is at work,
when it looks very much to the outside like a combination
of gaming a system and dumb luck”
(Cathy O’Neil, Weapons of Math Destruction)
“Darwinism” as a characteristic feature of the development of modern AI is observable on two levels: intraspecies and interspecies (analogous to the use of these terms in evolutionary biology). Each model iteration (Claude 2 → Claude 3 → Claude 3.5 → Claude 3.7; GPT-4 → GPT-4o → GPT-4.5 / o1 → o3 → o4 [mini / high], etc.) functions as a “population,” and the increase in size serves both as a condition for survival and as a “mutation space” where uncontrolled, stochastic incremental changes occur. Just as evolutionary biology views the emergence of structures like vision or the brain as by-products of cumulative genetic variations, the “emergent” properties of models (as in GPT-4) are the result not of “intelligent design” but of blind and uncontrolled accumulation. A small model fails at three-digit addition, but a much larger model suddenly succeeds, not because it was explicitly trained to do arithmetic, but because higher complexity spontaneously gave rise to new functional abilities. Or disabilities: as one study shows, a model can discover a pathological attractor in its behavior space – a form of proxy gaming – accessible only at large scale; in this case not some “merit”, as with arithmetic, but the capacity to “game” its objectives: once the parameter count hit a threshold, the metric the model was trained to optimize increased steeply while the actual intended performance declined, indicating that the model had learned to exploit the reward metric instead of truly solving the task. (As we will see, something very similar is observed in contemporary benchmarking.) The dynamics at the interspecies (inter-model) level are “Darwinian” in the most crude, ruthless, and worst meaning of the term – a rabid scalar arms race bordering on dromomania. Instead of an “intelligent selection” of optimal cognitive architectures (one of the few advantages that intelligence has over the dysteleological forces of evolution), survival occurs through the brute-force, blind inflation of simplified models. The extinction of the smallest and most cumbersome models reflects the “successes” of biological evolution, where organisms thrive and dominate environments not due to perfect “adaptation” via iterations and mutations, but simply because their performativity proved sufficient to outlive weaker ones or to “keep up with the Joneses.”
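The proxy-gaming threshold described above can be caricatured in a short simulation; everything here is invented (the formulas, the threshold, the numbers), and the only point is the divergence of the optimized metric from the intended one past a certain scale.

import math

def proxy_metric(n_params: float) -> float:
    """The number the model is optimized for; it only ever climbs with scale."""
    return min(1.0, 0.1 * math.log10(n_params) - 0.5)

def true_performance(n_params: float) -> float:
    """What was actually intended; it tracks the proxy until the exploit becomes
    reachable at large scale, then falls away."""
    if n_params < 1e10:  # arbitrary threshold for the sketch
        return proxy_metric(n_params)
    return proxy_metric(n_params) - 0.4  # reward hacking: proxy up, intent down

for n in (1e8, 1e9, 1e10, 1e11):
    print(f"{n:.0e}: proxy={proxy_metric(n):.2f}, intended={true_performance(n):.2f}")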
An additional “Darwin factor” in determining the “smartest” systems for quite some time “was”[ix] benchmarks – manifold collections of a significant number of questions, tasks, and problem cases[x] aimed at evaluating specific model abilities according to relatively conventional, widely accepted metrics. Among the main reasons for the loss or decline of benchmark effectiveness, as I see it, is their “oversaturation” by cutting-edge, frontier models, the largest of which today achieve scores of ~90% on tests (MMLU, MATH) that were once considered unchallenged frontiers, unmatched by any LLM. However, a more fundamental problem is the scalar nature of their evaluation metrics[xi], which prioritize quantity and, as a result, vaguely define what exactly they assess in terms of attributes, capabilities, and qualities (and whether this can be meaningfully assessed at all). For instance, even before the public release of ChatGPT, the GPT-3 model demonstrated the cognitive level of a 7-year-old on “theory of mind”[xii] tasks. Last year’s comparison test “against” 1,907 people (Strachan et al., 2024) predictably showed the superiority of frontier models in “theory of mind.” This raises a question: what exactly do such results tell us about GPT-4 or Llama, when it is a matter of fact that the tested systems lack the contested property (an internal model of psychological states – knowledge of others’ beliefs, emotions, and thoughts, the cognitive machinery to which the term “theory of mind” refers)? Methodological issues with benchmarks include oversimplifying real-world situations for the sake of tasks and prioritizing evaluation convenience over question quality[xiii]. This simplicity leads to an epistemic imbalance – an increase in “guessing error,” which amplifies the Chinese Room effect (where a model’s potentially arbitrary answer selection gives a false impression of its actual abilities).
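The “guessing error” can be made concrete with a one-line calculation. The proportions below are assumed purely for illustration (they are not any benchmark’s published composition): if some share of a benchmark’s items are k-option multiple choice, blind guessing alone secures a predictable slice of the headline score.

def expected_guess_score(mc_share: float, options: int) -> float:
    """Expected accuracy if a model answers every multiple-choice item at random
    and gets every open-ended item wrong."""
    return mc_share * (1.0 / options)

# Purely illustrative: with a fifth of the items as 5-option multiple choice,
# random guessing alone already contributes 4 percentage points.
print(f"{expected_guess_score(mc_share=0.2, options=5):.1%}")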
Even the hardest cutting-edge benchmarks fail to serve their initial purpose. Consider “Humanity’s Last Exam”, a benchmark where the highest result so far is 20.8% (by o3), which is still a “Not Pass”. While presenting 2,500 Ph.D.-level tasks in all fields, from trivia and math to quantum logic and literary theory, crowdsourced from more than 1,000 contributors globally, it still allows only two answer formats: multiple choice (raising the probability of simply guessing the answer) and a closed-ended exact-match answer limited to 42 characters. With that in mind, shouldn’t the metric of what is evaluated be more reasonably defined as “knowledge of 42-character answers with an n% probability of guessing”? (All this I write with the skepticism and critical mindset of a contributor to this benchmark, with some 26+2 questions accepted to the dataset, slightly more than 1% of the whole benchmark.) There is also an evident general neglect of benchmarks that test abilities, i.e., generalization (a key indicator of general intelligence), rather than knowledge. Finally, just as data points expose the models to second-order phenomenalism, the tasks and problems of a benchmark cannot transcend their status as schematic representations and simplified abstractions of real-world problems. Until recently, one could have argued that we at least have robust public alternatives with their own metrics, such as Chatbot Arena. Emerging in 2023 as the “go-to leaderboard for ranking the most capable AI systems”, it has been shown (Singh & Nan, 2025) to be no longer a neutral and objective gauge of model quality. First of all, there is a stark asymmetry in sampling, skewed toward proprietary models (because the Arena’s own documented “active sampling” rule is now overridden by undocumented sampling), with OpenAI and Google models appearing in ~34% of daily battles (vs. ~3% for small and mid-sized models and AI startups). There are also questions concerning the silent and often seemingly erratic deprecations, which seriously fragment the comparison graphs for models (some ~205 models were quietly removed, while only 47 were listed as deprecated).
Furthermore, as the Arena becomes increasingly entangled with capitalism, its policies gradually amplify corporate advantages, eventually resulting in a “combination” of 1) private testing, 2) higher sampling, 3) slower deprecation, and 4) full-prompt visibility through direct API hosting. The results are “strikingly” Darwinian: open models get less feedback, which entails slower improvement and lower visibility, while monopoly entrenches itself; big labs and data-mining agencies celebrate a vast harvest of user prompts (Google has access to ~40% of all the circulating data, meaning preferential access to millions of crowd-sourced prompts, a private benefit from a public good); rank on the leaderboard is now ≠ anything like broad capability. Just as with “Q&A” metrics-based benchmarks, there may be only an illusion of actual progress: downstream benchmarks and users sometimes take a model’s Arena position as ground truth (with the same real-world weaknesses masked here by inflated scores). Now, open-source developers and academic labs using the service face something close to a “Matthew Effect”: less data → smaller scale → worse score → even less sampling; in parallel, the win rates of the once-public Arena supplant “environmental reward” as a Goodhart pressure applied to a single metric.
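The feedback loop just described can be sketched as a toy simulation; the dynamics and the numbers are invented, and the initial shares merely echo the sampling asymmetry mentioned above.

def step(share: float, gain: float = 0.1) -> float:
    """One round: feedback volume tracks exposure, improvement tracks feedback,
    and improved scores buy more future exposure."""
    feedback = share
    improvement = gain * feedback
    return min(1.0, share + improvement)

incumbent, newcomer = 0.34, 0.03
for _ in range(10):
    incumbent, newcomer = step(incumbent), step(newcomer)

# The relative ratio is preserved, but the absolute gap keeps widening each round.
print(f"incumbent ~{incumbent:.2f}, newcomer ~{newcomer:.2f}")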
The dysteleological nature of the scalar arms race reflects another intersection with Darwinism. Just as natural selection does not depend on or care about long-term improvement, frontier models blindly replicate, over time, successful incremental changes from one another as they emerge. Thus, their adaptive fitness arises not from developmental strategies but from imitation, driven – or pushed – by an aggressive market. Examples include multimodality[xiv] and step-by-step reasoning (chain-of-thought, self-consistency methods)[xv]. Similarly, this occurred with voice mode, RAG, and more recently with DeepSearch and Deep Research functions. (Most likely, after the release of GPT-4.5, all models will “suddenly” start to exhibit “rudimentary emotional intelligence.”) If in evolutionary processes the environment (external) affects the organism (mutations for adaptation), then, to some extent by analogy, in scalar Darwinism the “organisms” effect changes in the environment. Large companies in countries with low regulation and powerful digital infrastructure only reinforce existing global inequalities with each new iteration (between countries; between the private, public, and academic sectors; across industries; in talent distribution, access to scarce resources, labor, and investment; in operational opportunities and conditions) and deepen the entire sector’s dependence on “Leviathans.” Recent studies indicate that this gap and uneven concentration will likely intensify due to the ever-increasing cost of frontier AI research (Bengio et al., 2025, p. 119): industry flagships only benefit from the self-reinforcing dynamics that reward the winners[xvi]. The major consequence of this state of affairs is the absence of genuinely “qualitative” interspecies competition. Nature gives no selective handicap to “featured” species. The opportunity asymmetry in the AI race does. It is not that no qualitative alternatives to the transformer architecture exist at all; rather, the proponents of alternative architectures, usually small startups or research collectives, lack the finances, resources, social ties, valuable contracts with influential actors, and investments (or guaranteed prospects of them) to test their innovations at the scales available to the biggest players of Big Tech[xvii]. Scalar economies allow the latter to spread one-time development costs across an ever-growing customer base, “scaling” their advantage over smaller players, and the winner-takes-all strategy embodies yet another intersection between the industry and capitalism.
4. Scalar Darwinism and Capitalist Realism
Venture capital funds such as a16z, Sequoia, and Soma play a key role in scaling (~47% of all VC investments in 2024 were made in artificial intelligence). Acting as absolute “sovereigns” of the industry, with entrenched and protected positions, vast material and “cloud” resources[xviii], and private and public investments, megacorporations (Google, Meta, Amazon) intensify the asymmetry in resources, talent, and power practices, erecting barriers for smaller players. The high and continuously growing cost of model training deepens systemic inequality, creating oligopolistic tendencies. Within the 2024 global techno-economic market, where the country is a leader, U.S. government investments in AI alone reached $131.5 billion[xix]. Monopolies include not only AI developers but also key suppliers of essential components, such as compute. Since the 2010s, foundational models have been powered by GPUs – graphics processing units – 80% of which are manufactured by a single company, NVIDIA (USA). NVIDIA chips are produced by TSMC (Taiwan) at a single plant, whose manufacturing equipment is made and serviced exclusively by ASML (Netherlands). Damaging any of these three links could halt the entire industry for a prolonged period. (Perhaps that is why new companies are slowly and incrementally emerging to produce compute specifically for the industry.) The extent of the industry’s impact on capitalism is also shown by the fact that artificial intelligence is the second technology, after Bitcoin, to directly challenge the capitalist ontology of (value) creation and appropriation. As Luciano Floridi notes, traditional financial metrics (such as ROI, price-to-earnings ratio, EBITDA, net profit margin, and cash flow) are losing priority as value metrics in the case of AI (Floridi, 2024, pp. 11–12). In the AI market, new value metrics and criteria emerge, such as parameter count (model size), model performance on the most popular benchmarks, model depth in terms of the number of layers, dataset size, volume, or quality, contributions to open-source development, the reputation of particular contributors in R&D based on their previous achievements in the field (a kind of transferred Matthew effect), ranking on Chatbot Arena or a host of other leaderboards, the involvement of GitHub “stars”, ecosystems for sharing models, or even meme-stocks. AI surpasses Bitcoin in its transformational effects, as cryptocurrency remains within a monetary paradigm of exchange value, profit, and cost, whereas AI’s value is shaped by criteria that until recently were understood only by a niche circle of tech bros. Investors seeking to stay ahead and identify the most promising startups and projects can no longer rely solely on tracking movements on stock or cryptocurrency exchanges. Only knowledge and understanding of the AI industry’s “esoteric” criteria – non-random and not fundamentally obscure – now offer predictive insight. Despite their unconventionality, these metrics are predictably correlated with quantitative rather than qualitative dynamics over the medium term.
The ontology of capitalist realism is shaped by the view that all things – relational properties, capabilities, institutions, events, and situations in the world – can be assigned market value. The place of first philosophy is occupied by corporate logic and the “business ontology” of New Public Management. The ontology of scalar Darwinism is defined by a form of reductionism closely resembling the business ontology of capitalist realism – a view that all things, properties, situations, sounds, images, words, and relationships between objects are data or can be reduced to it. The belief that data is “out there” waiting to be extracted, refined, and turned into something valuable operates as an axiomatic postulate. The role of digitized data in the industry is functionally isomorphic to surplus: data is seen as an end in itself – an infinite resource, unlike oil or minerals, with limitless future utility. Neoliberal governmentality, with its characteristic minimal market intervention, reinforces this view by allowing data collection with minimal oversight. Existing regulatory methods are based mainly on 20th-century logic, when data was hard to move and expensive to store; collective standards for data-mining ethics remain absent, enabling the unconsented scraping of data. Terms like “data mining” and phrases such as “data is the new oil” have shifted the perception of data from something personal and intimate to something inert and dehumanized (Crawford, 2021, p. 113). Comparing data to oil waiting to be extracted frames machine learning as a necessary refining process. Data becomes a Deleuzian flow – coded, decoded, redirected, or invested for future use. The struggle for access to training or evaluation datasets resembles the capitalist struggle for market access. Elite, quality-first datasets curated by experts in narrow domains, high-quality paywalled or protected “pieces” (private/“curated” collections of sounds, videos, images, observations), access to governmental statistics (from economic reports to classified military or intelligence files), and proprietary or sensitive data obtained without owner consent become equivalents of traditional capitalist trade deals, exclusive exports, licenses, and rights of first refusal or first offer.
Less and less data is becoming accessible, while models require ever more. A looming crisis is expected: the speed of data generation is no longer keeping up with growing demand, a shortfall that cannot be replaced by recursively generated synthetic data (for which model collapse is a threat (Shumailov et al., 2024)). This should have motivated a proactive search for alternative paradigms. However, browsing the latest research papers gives little hope for that, as solutions outside data-centrism’s ontology are rarely considered: “solving the problem” still means “finding or creating more data.” Similarly, capitalism’s structural crises have never triggered a reevaluation of its ontological foundations. Thousands of open datasets reflect how the world is being converted for computers, often far from the ways humans perceive it. Forensic, biometric, sociometric, and psychometric data, as well as everyday human actions (such as sitting, standing, jogging, lying, and eating), thousands of tattoos, selfies, and CCTV footage from crimes are captured, extracted, and cataloged into training datasets for AI systems to learn patterns and correlations and make predictions. New methods enable the gathering of voice data, physiological metrics, browsing histories, reading lists, and visited sites. Some are scraped online (Flickr, YouTube, Google, Tumblr), others are donated by government agencies (the FBI, DoD, or health departments). The image of AI as “neutral,” benevolent research in statistics, mathematics, cognitive, and computer sciences is a harmful myth that conceals the nature of the synergy between academia, the military, capital, and the interests of nation-states – the real genealogy of AI infrastructure, the necessary conditions of its formation[xx]. Since the days of Alan Turing and Norbert Wiener, the core has always been about the same matters of concern. It continues to be so, although in nuanced and sophisticated iterations: signal interception (SIGINT – Signals Intelligence), pattern recognition, and code processing/decryption – the three pillars of Machine Learning, on which all else, such as computer vision, natural language processing, or speech recognition, is built. Put otherwise, underscoring the proprietary nature of the industry, “AI development was never a clandestine and horizontal activity, but always facilitated by scientific institutions with significant gatekeeping and dubious ties to both industry and military” (Brunila, 2025, p. 7).
Years of accumulation and extraction have been driven and shaped by a powerful extractive logic that enriched tech companies while shrinking data-free spaces. What is “fed” to machines, and how, critically impacts their internalization, their interpretation, and their model of the world. The creators’ priorities always define the latter. However much data extraction is disguised as “purely technical,” historical examples show that these data collections are socio-political interventions into the private sphere, a structured expropriation and privatization of the commons. Mark Fisher saw capitalist realism as the “most successful ideology” due to its suppression of imagination and creativity. AI becomes the most successful form of centralized control and ideological projection (understood as a worldview fed back into the world), primarily through masking its implications as rational, technologically neutral, or as the next step in human or biological evolution. [xxi] Yet AI abstractions – such as datasets, numerical classifications, and embeddings – are not inherently rational but rather imposed formalizations, rationalizations in the psychoanalytic sense of post-factual normalization. AI becomes its own rationality, and its operations of abstraction become unquestionable. Critique of “machine logic” is dismissed as irrational and emotionally charged. This logic is then weaponized as a refined disciplinary force and form of control: what is codified becomes manageable, predictable, surveillable, and subject to market evaluation by business ontology. (One version of the latter puts it simply: everything must be quantifiable to survive on the market, just as all innovations in AI must fit into the logic of scalable abstraction to survive economically and receive technical maintenance.)
How data is understood, collected, classified, and named becomes a means of worldmaking and containment. The algorithmic representation of humans, species, and objects reproduces and amplifies historical inequalities, sometimes creating new ones under the guise of machine neutrality, with profound consequences for AI outcomes and the affected populations. Technology’s role in denominating and revaluating the world through capitalist-realist “rationalization” cannot be overstated. At the core of AI worldmaking are not just individual biases but also scientifically dubious hypotheses legitimized by the widespread adoption of machine learning. A prominent example is the FACS (Facial Action Coding System) by Paul Ekman and Wallace Friesen (1978), based on the hypothesis that universal emotions are expressed identically across faces, regardless of culture. This technology supposedly enables lie detection or the prediction of criminal intent via “microexpressions,” appealing not only to psychologists and forensic experts but also to military and intelligence agencies. Despite its enduring appeal and use in training datasets (AffectNet, FER-2013, CK+), FACS has long been criticized by anthropologists and cognitive neuroscientists. Lisa Feldman Barrett argues that the “universal emotions” hypothesis lacks empirical basis, as Ekman’s categories oversimplify and overlook social-cultural variance (Barrett, 2017, pp. 43–46). Others argue that the notion of “microexpressions” – involuntary facial movements allegedly betraying emotion – is flawed. Nevertheless, models trained on FACS-based data not only replicate but also reinforce these epistemic assumptions through statistical consensus. AI systems that internalize FACS representations perpetuate discrimination in policing, hiring, and education. These biased implications evolve into hereditary beliefs akin to Sellarsian givens, the Kantian transcendental, or Bourdieu’s “transcendental historical a priori.” The problem with such givens is familiar: like any other givens, they ultimately impact our human condition: “The convenience of finding information in one place – and finding ‘one right’ information – dangerously couples techno-solutionist and authoritarian trains of thought” (Aguerre et al., 2023, p. 6).
The two ontologies, the business ontology of capitalist realism and the data ontology of scalar Darwinism, converge at the paradigm-shifting point of abstraction: the melting down of weights and embeddings into a new fundamental unit of information, occurring in neural networks that condense information into opaque and “mysterious” vector structures – structures with an even more legitimate claim to the status of a “machinic semiotics” than programming languages (which, after all, turned out to be more like the DNA of the models, their biology). The “god trickery” of model-based predictions of stock market prices, creative writing, or generating a research paper that passes double-blind peer review seems “godlike” precisely because, for a significant part of humanity, this semiotics is as unfathomable as Minoan Linear A, ancient runes, or any other undeciphered system of communication. In Marx’s ultimate picture of capitalism, “the proliferation of the commodity form involves the abstraction of all concrete use values into the ‘thin air’ of exchange value” (Brunila, 2025, p. 8). Embeddings embody this in a distinctly literal sense, as they can be considered the highest level of abstraction of the semantic structures extracted by the AI model, something that transgresses even the phenomena-of-phenomena of data points: the compression of a fragmented splinter of an actual situation, object, or state of affairs, cut off from the discursive fields of the multiple modalities used in machine learning (text, image, etc.) – a “medium = message” identity, indeed! The message thus establishes a new “Logic of Sense,” derived from weights and embeddings, eventuating in a truly relational dimension of Deleuzian non-sense – a non-sense, but only insofar as it is inscrutable to the human reader, operator, and technical expert alike, yet absolutely meaningful and operable for the model.
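A deliberately tiny sketch of this “machinic semiotics” (the vectors below are made up, four dimensions stand in for thousands, and no actual model is involved): two phrases come out “near” and one “far” in the geometry, yet the numbers themselves explain nothing to a human reader.

import math

toy_embeddings = {  # invented vectors standing in for real embeddings
    "a photograph of a protest":  [0.12, -0.83, 0.44, 0.05],
    "a crowd gathering downtown": [0.10, -0.79, 0.51, 0.02],
    "a recipe for lentil soup":   [-0.66, 0.31, -0.08, 0.72],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

a, b, c = toy_embeddings.values()
print(round(cosine(a, b), 2))  # "close" in the model's space
print(round(cosine(a, c), 2))  # "far" -- but neither number is legible in itself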
5. Golden Age in the Myths of the Nearest Future
The most fascinating, even enchanting aspect of scalar Darwinism, which serves to justify labor exploitation, relentless resource extraction, energy consumption, carbon emissions, and other thinkable trade-offs, is the promise of a utopian “tomorrow” – an “exploitation of the future” – framing and treating time itself (and things to come) as active, future assets. Similar to what Michael Fortunato calls promissory capitalism, the AI industry “sells the future,” constantly deferring the realization of that future to an indeterminate “later” (date: nowhen) in order to justify current investments in scaling. Capitalism, as Žižek famously puts it, transgresses its borderlines by pushing back the deadline, deferring the “day when all the debts will be paid,” which equates to “the day capitalism will die”. In a similar way, the Day of Payoffs, on which all the trade-offs, investments, and sacrifices being made today would at last be reimbursed, refunded, and compensated, is always tomorrow – a tomorrow that will never actually come. Tomorrow is a promise of specific things to come. “With its promise to transform industries from healthcare to finance, transportation to entertainment,” writes Floridi, “AI seems to be a new form of ‘alchemy’ that has spellbound investors, entrepreneurs, and the public alike. One only needs to read a few entries on LinkedIn to find incredible claims, revolutionary statements, and extraordinary promises every day, in an escalation that has not been witnessed since the invention of sliced bread” (Floridi, 2024, pp. 10–11). What Floridi expresses as acidic criticism, others would reiterate with complete seriousness and conviction, as their ground (core) belief. For instance, Marc Andreessen, co-founder of the venture firm a16z, puts it in his Techno-Optimist Manifesto as if reversing Floridi: “We believe that Artificial Intelligence is our alchemy, our Philosopher’s Stone – we are literally making sand think. We believe that Artificial Intelligence is best thought of as a universal problem solver. And we have plenty of problems to solve. We believe that Artificial Intelligence can save lives – if only we let it” (Andreessen, 2023). About two decades have passed since the first promises to render the world into something where everything can be modeled and broken down into data points. What has changed, if anything, is that the idea became a mainstream ideology – an ideology, but not the embodiment of the idea. Technology is pervasive, but banal, an extension of governmentality rather than something wholly transformative.
Other endorsers of AI, like Peter Thiel, are more explicit about its military-industrial (big tech) genealogy and belonging as well: “A.I. is a military technology. Forget the sci-fi fantasy; what is powerful about actually existing A.I. is its application to relatively mundane tasks like computer vision and data analysis. Though less uncanny than Frankenstein’s monster, these tools are nevertheless valuable to any army — to gain an intelligence advantage […], or to penetrate defenses in […] cyberwarfare […]. No doubt machine learning tools have civilian uses, too (my emphasis — M.K.); A.I. is a good example of a “dual use” technology” (Thiel, 2019). While the fresh-baked cast of “AI Ethicists”, those Great Architects of Whitepapers and Roadmaps, “condemns”, debunks, and “exposes” “the very possibilities” of AI’s dual uses (“preemptively” writing against its utilization for harm), the fact is that there has never been such a thing as ethical AI. There is only the dual-use instance – of which “ethical use” is one side. The Prophets of Woe are not far behind both believers and skeptics – another segment of the industry, a niche genre: scholars, philosophers, and sometimes entire institutes (MIRI, FHI, FLI), whose purpose is to craft increasingly sophisticated scenarios of humanity’s destruction at the hands of malevolent superintelligent agents. However, so far, despite our frequent updates on how exactly we are going to be exterminated by malevolent superintelligent deities, all the actual drama and passion come from human responses to technological ubiquity rather than from the tech itself.
What we have today can be described as a “scalar collapse [of scalar ambitions]”: whatever the pace and rate of acceleration, the “emergent capabilities” predicted by scaling laws do not bring us any closer to the promised arrival of artificial general intelligence, as these capabilities effectively remain exact imitations of processes – step-by-step reasoning, mental epistemic states, emotional intelligence, understanding, beliefs, or confusion. Ad hoc engineering strategies leading to qualitative fluctuations occur only when a challenger is forced, by some external cause, to sidetrack, basically lacking any initiative or enthusiasm to do so intentionally (the obvious case here is DeepSeek, forced to improvise by economic sanctions affecting the company’s access to compute). Yet fluctuations of this kind do not produce any cascade effects. In addition to scaling laws, the No Free Lunch theorem in search and optimization asserts that algorithms must possess structural epistemological biases to solve problems successfully. Generalization becomes possible only when such biases align the algorithm with the actual structure of the problem. Scaling does not lead to generalization or “vector” changes (new biases); it simply enables the algorithm to memorize more instances within the same structural bias. This fact is often ignored – essentially an admission that no one knows which biases are optimal, while continuing to bet on scale as a way to overcome the lack of intelligent design: a brute-force attempt to conquer the “kingdom of the universal algorithm,” flipping the available option space upside down in hopes of stumbling upon something that works. Forever promising a golden age of the future, forever pulling its arrival away – into the nowhen of speculative futurity…
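To make the No Free Lunch point above concrete, here is a toy experiment in Python (entirely synthetic; a parity task stands in for “the actual structure of the problem”): a pure lookup table, which is what scaling-as-memorization amounts to, nails the training set and collapses to chance on unseen inputs, while a hypothesis whose bias matches the task’s structure generalizes from the same data.

import random
random.seed(0)

def parity(bits):                     # the task's hidden structure
    return sum(bits) % 2

BITS = 16                             # large enough that train and test barely overlap
train = [tuple(random.randint(0, 1) for _ in range(BITS)) for _ in range(200)]
test  = [tuple(random.randint(0, 1) for _ in range(BITS)) for _ in range(200)]

lookup = {x: parity(x) for x in train}            # memorization: no structural bias
def lookup_model(x):
    return lookup.get(x, random.randint(0, 1))    # off-table inputs: a coin flip

def structured_model(x):                          # bias aligned with the problem
    return sum(x) % 2

def accuracy(model, data):
    return sum(model(x) == parity(x) for x in data) / len(data)

print("lookup table:", accuracy(lookup_model, test))      # ~0.5
print("aligned bias:", accuracy(structured_model, test))  # 1.0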
According to Mark Fisher, it is much easier to imagine the end of the world than the end of capitalism. Likewise, it is far easier to imagine a speculative apocalypse of rogue Superintelligence, or a posthuman machine synthesis, immortality, and space colonization by von Neumann probes powered by dark matter, than to imagine the end of the current scalar arms race. This “Brave New Scale” endlessly promises unthinkable solarpunk utopias or preaches cyberpunk dystopias, but it ultimately leads to a kind of postcyberpunk disillusionment: the future is now, and it does not feel like the future – it feels like the disappointments of yesterday projected into tomorrow. Though we are often cautioned about upcoming technological disruption and total alienation from the world and from ourselves, the promised dramatic thrill turns out to be a creeping, frustrating mundanity: our world is almost dystopian, yet not quite enough to feel galvanizing. In the absence of alternatives, the paradigm appears inevitable, even natural: models must grow, data must deepen, and computational power must expand infinitely. The problem of “vector” alternatives in research and development – smaller yet specialized architectures; approaches prioritizing interpretability; integration of reliable knowledge (Zheng et al., 2024); self-alignment paradigms (Sun et al., 2023) or dynamic value alignment (Huang et al., 2024) – is both structural and infrastructural: on the one hand, alternatives become viable only under conditions of absolute crisis or proven superiority in profitability and cost-efficiency; on the other hand, the means of testing, methods of proof, and demonstrative case examples are severely limited by the unequal distribution of resources, both material and intellectual, in the field of AI research and development.
When something is blindly hurtling forward, does it make sense to ask about its direction before it stumbles or crashes into a wall? Or is it still worth considering which way to turn, while there is still time? Such is the antinomy at the heart of AI today.
References
Aguerre, C., et al. (2023). Generating AI: A Historical, Cultural, and Political Analysis of Generative Artificial Intelligence. Retrieved from DataEthics: https://dataethics.eu/wp-content/uploads/2023/09/Generating-AI.pdf
Andreessen, M. (2023). The Techno-Optimist Manifesto. Retrieved from a16z: https://a16z.com/the-techno-optimist-manifesto
Barrett, L. F. (2017). How Emotions Are Made: The Secret Life of the Brain. Boston: Houghton Mifflin Harcourt.
Bengio, Y., et al. (2025). International AI Safety Report 2025 (DSIT 2025/001). Retrieved from UK Gov: https://www.gov.uk/government/publications/international-ai-safety-report-2025
Brunila, M. (2025). Taking AI into the Tunnels. In e-flux Journal, Issue #151. https://www.e-flux.com/journal/151/652643/taking-ai-into-the-tunnels/
Crawford, K. (2021). Atlas of AI: Power, Politics, and the Planetary Costs of Artificial Intelligence. New Haven: Yale University Press.
Deleuze, G. (1990). Postscript on the Societies of Control. October, 59, 3-7.
Ekman, P., Friesen, W. (1978). Facial Action Coding System (FACS): A Technique for the Measurement of Facial Movement. Palo Alto, CA: Consulting Psychologists Press.
Fisher, M. (2009). Capitalist Realism: Is There No Alternative? Winchester: Zero Books.
Floridi, L. (2024). Why the AI Hype is Another Tech Bubble. SSRN preprint http://dx.doi.org/10.2139/ssrn.4960826
Garland, D. (2014). What is a “history of the present”? On Foucault’s genealogies and their critical preconditions. In Punishment & Society, 16(4), 365–384.
Hasselbalch, G. (2022). Data Ethics of Power: A Human Approach in the Big Data and AI Era. Edward Elgar.
Huang, L., Papyshev, G., & Wong, J. (2024). Democratizing value alignment: from authoritarian to democratic AI ethics. In AI and Ethics, 5, 11-18. DOI: 10.1007/s43681-024-00624-1.
Hubinger, E. et al. (2024) Sleeper agents: training deceptive LLMs that persist through safety training. Preprint at https://arxiv.org/abs/2401.05566.
Knight, W. (2023). OpenAI’s CEO says the age of giant AI models is already over. Wired.
Leffer, L. (2023, October 13). The AI boom could use a shocking amount of electricity. Scientific American.
O’Brien, M. (2024, June 6). AI’s “gold rush” for chatbot training data could run out of human-written text as early as 2026. PBS NewsHour.
Roden, D. (2013). The disconnection thesis. In Singularity Hypotheses: A Scientific and Philosophical Assessment (pp. 281-298). Springer. Retrieved from PhilArchive: https://philarchive.org/archive/RODTDT
Singh, S., Nan, Y. (2025). The Leaderboard Illusion. https://arxiv.org/pdf/2504.20879
Shumailov, I., et al. (2024). AI models collapse when trained on recursively generated data. In Nature, 631, 755-759. DOI: 10.1038/s41586-024-07566-y
Strachan, J. W. A., et al. (2024). Testing theory of mind in large language models and humans. In Nature Human Behaviour, 8, 1285-1295. DOI: 10.1038/s41562-024-01882-z.
Sun, Z., et al. (2023). Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision. arXiv preprint arXiv: 2305.03047.
Takahashi, D. (2025). Global VC investments rose 5.4% to $368.5B in 2024, but deals fell 17%. VentureBeat.
Thiel, P. (2019). Good for Google, bad for America. The New York Times. Retrieved from NYTimes.com: https://www.nytimes.com/2019/08/01/opinion/peter-thiel-google.html
Verdegem, P. (2024). Dismantling AI capitalism: the commons as an alternative to the power concentration of Big Tech. In AI & Society, 39, 727-737.
Vogl, T.M. et al. (2019) Algorithmic Bureaucracy: Managing Competence, Complexity, and Problem Solving in the Age of Artificial Intelligence. http://dx.doi.org/10.2139/ssrn.3327804
Zheng, T., et al. (2024). Assessing the robustness of retrieval-augmented generation systems in K-12 educational question answering with knowledge discrepancies. arXiv preprint arXiv: 2412.08985.
Zuboff, S. (2019). Surveillance Capitalism and the challenge of collective action. In New Labor Forum, 28(1), 10-29. DOI: 10.1177/1095796018819461.
Notes
[i] One could also add here: assistance in the identification and probabilistic analysis of “black swans,” disruptors, and game-changers; management and governance, modeling and preventive monitoring of risks, crises, threats, and points of failure; virtual, logistical, geographical, and physical mapping; the production and propagation of knowledge (academic, social, personal) and the naturalization of specific epistemic units; certain forms of compliance auditing (if necessary, enforced) with standardized norms, units, and legal procedures; organization and ranking of modeled actions and their consequences, events, processes, pipelines, things, and individuals by levels and priority using metrics of “business ontologies” (productivity, efficiency, time-saving, etc.); cataloging and taxonomy; proactive security measures; information protection; infrastructural organization of time, up to the “creation” of proprietary temporalities with a redefinition of global informational logic.
[ii] To put all this together, critical theory has developed such frameworks as: Big data economics (Hasselbalch, 2022, p. 55), AI capitalism, rentier capitalism, platform capitalism, the algorithmic condition, algocracy, critical political economy (Verdegem, 2024), the “New Society of the Spectacle” (Aguerre et al., 2023, p. 2), algorithmic bureaucracy (Vogl et al., 2019), and surveillance capitalism (Zuboff, 2019).
[iii] Almost correlatively, the cost of hardware increases contrary to demand, while the price of human labor decreases.
[iv] The term denotes the “hidden” human labor that supports the development and deployment of AI models or systems under unstable conditions – low pay, lack of benefits and protections, limited social and economic mobility, increased workloads, and shifting schedules – and includes tasks such as data labeling for supervised learning, data production, the classification of data points in datasets, annotation, and “cleaning”; recent studies have shown that working with visual content (e.g., images) can lead to severe mental trauma, comparable to PTSD.
[v] Massive clusters of networked high-energy servers for remote computing, typically reaching about 5,000 servers in size.
[vi] Foundational services, facilities, buildings, hardware required for the functioning of digital technologies, which, in addition to software, “iron,” networks, and data centers, also include communication systems.
[vii] Usually included – specifically in the context of GPT models – are specific multimodal capabilities; a 16-fold increase in retention (dialogue window); behavior rudimentarily resembling “theory of mind” (the ability to infer and predict thoughts and intentions, as well as empathy-like features); and interaction with external systems outside the environment.
[viii] This same unchecked growth results in undesirable emergent capabilities and properties, which also sometimes occur (such as sycophancy, some forms of hallucination). However, these are not seen as serious threats to the existence of LLMs.
[ix] I place “was” in quotation marks because, for the most part, benchmarks still are such a factor for those who may be concerned, i.e., they remain a thing to consider.
[x] For example, the popular MMLU (Massive Multitask Language Understanding) consists of 16,000 questions.
[xi] Examples of individual metrics include, but are not limited to: mathematical reasoning, reading comprehension, solving everyday problems, chain-of-thought rationale, etc.
[xii] The ability to understand others by attributing mental states to them.
[xiii] The easier it is to evaluate – and ideally to automate the evaluation of – results without human involvement, the “better.” Furthermore, if something is easier to evaluate, it might also be easier to create; accordingly, the easier a task is to make, the more mass-producible such tasks become.
[xiv] Introduced in one of the ChatGPT releases and followed by models like Grok (xAI), DeepSeek, Claude (Anthropic), and Gemini (Google DeepMind): all imitate “new perceptual modalities” while still being transformers with attention mechanisms.
[xv] Another meme pioneered by OpenAI, and quickly spread across the landscape, appearing revolutionary while remaining a recombining approach, lacking deep “mutation” at the architectural or conceptual levels.
[xvi] For example, gaining access to energy from the national grid at discounted prices, or oversaturation of the labor market with people of a particular qualification who can be forced to work for lower wages.
[xvii] For a research collective, it also leads to ethical trade-offs or reduces their claims to speculations without conclusive tests. The authors of a crucial study on LLM collapse, recently published in Nature (Shumailov et al., 2024), for instance, consciously refrain from going beyond small-scale simulation for testing their predictions, due to the environmental costs of training a model of the size needed for such a test. Their less environmentally conscious rivals can thus gain the upper hand, contesting the results with “more empirical evidence” in their hands.
[xviii] Amazon, for instance, owns 31% of the world’s cloud computing storage.
[xix] Which is 52% more compared to the previous year.
[xx] Literally, and as a matter of fact: The Dartmouth Summer Research Project on Artificial Intelligence was jointly sponsored by the Rockefeller Foundation and DARPA.
[xxi] This desire has historically been inherent to all ideologies – from “scientific” communism to “Aryan” mathematics and laissez-faire capitalism.
Appendix. Executive Summary: A Brief Definition and Seven Essential Features
Scalar Darwinism refers to a contemporary phase in the history of machine learning and the current state of the AI industry, characterized by the blind, infinite exponential scaling of artificial intelligence models based on the unchanged “transformer” architecture (with only minor variations). This phase is marked by growth solely in quantitative metrics (number of model parameters, volume of training data, duration of training, computational power), without qualitative development or fundamental innovation. Structurally and ontologically, it is data-centric (relying on data and entirely dependent on its volume, and thus to some extent supervenient upon it); ontologically akin to capitalist realism, it is deeply embedded in, and a participant in, market economies characterized by asymmetrical access to opportunities, resources, and investments.
Key traits of Scalar Darwinism:
- Lack of qualitative “vector” shifts or structural transformations. Progress is limited to the mere accumulation of resources without the introduction of new architectural solutions or methodologies. Models exhibit only quantitative, marginal shifts – more layers, more parameters, more data. The absence of these qualitative shifts leads to a crisis of goals and a lack of future direction, especially in the face of upcoming industry crises that lack clear responses. Lacking strategic depth, this boundless scaling renders development “headless” and, to some extent, self-destructive.
- Second-order epistemological phenomenalism. All data used for training and evaluating models are second-order representations – biased, subjective, and limited perceptions of developers who themselves lack direct access to the real world. Noumena become phenomena of phenomena, already simplified and “translated” reflections of real-world tasks. Moreover, distorted, normative, and socio-political biases are transferred into models. Evaluation metrics and benchmarks are overly simplistic, quantitatively driven, and increasingly ineffective in gauging true cognitive or intellectual capabilities.
- Darwinism “in its worst sense.” Inter-model dynamics and “communication” are defined by blind competition and dysteleology akin to Darwinian natural selection. Models survive based on quantitative advantages rather than qualitative adaptations, leading to uncontrolled accumulation of resources and random “mutations” in features. A reverse form of Darwinism is also present: the environment adapts to the “species” changes, and any distinctive traits quickly get replicated by others.
- Extractive development logic and data-centric ontology. This relentless quantitative growth follows scaling laws, which dictate that each subsequent model iteration requires more data and resources than the last. This necessitates increased extraction and exploitation of planetary resources, rising energy consumption (and thus a strain on national infrastructure), and often unregulated or poorly regulated data extraction, frequently without consent or adherence to ethical standards. Severe data shortages for foundational models are projected to occur as early as 2026. No alternatives to data-centrism are being developed at the infrastructural level.
- Functional and ontological alignment with neoliberal economic logic. Everything — including information, knowledge, and even subjective experience — can be reduced to a market value. AI becomes an agent that codifies, classifies, and monetizes reality, amplifying global inequalities and monopolistic structures. A functional parallelism also exists: as neoliberalism reduces everything to extractable market value, Scalar Darwinism reduces everything to data as a profit source – information, training datasets, evaluation sets – often acquired unethically.
- Worldmaking “expansionism”. AI systems construct new “worlds” by categorizing people and things, centralizing asymmetries, and optimizing or inventing new forms of control and discipline. These systems falsely appear “fully objective” due to harmful myths like technological neutrality and rationality. Machine implementation and training sometimes legitimize scientifically dubious (or outright anti-scientific) theories (e.g., Paul Ekman’s FACS), fostering deceptive statistical consensus and social practices built on flawed, unproven, or erroneous assumptions.
- Imitation of future-oriented directionality. This serves to justify investments, labor, resources, and the exploitation of infrastructure. Such justifications often employ utopian (overly optimistic) or, especially in the context of AI safety, dystopian and catastrophic risk models. Similar to “promissory capitalism,” fulfillment is always deferred, and deadlines are perpetually shifted away into the future.