By N Nagaraj
What happens when artificial intelligence (AI) is left alone?
That was the unusual premise of a recent paper by Stefan Szeider at TU Wien: ‘What Do LLM Agents Do When Left Alone? Evidence of Spontaneous Meta-Cognitive Patterns’. The researchers gave large language model agents (AI systems designed to act autonomously, with memory and the ability to make decisions without constant human input) persistent memory, a feedback loop, and no external task. The only instruction was minimal: do what you want.
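To make that setup concrete, here is a minimal sketch of such an unattended loop, assuming a generic chat-style model behind a hypothetical call_model function; the prompt wording, function names, and turn count are illustrative, not taken from the paper.

    # Minimal sketch of an unattended agent loop: persistent memory, a feedback
    # loop, no external task. Everything here is illustrative; call_model is a
    # stand-in for whatever LLM API an experiment would actually use.
    from typing import List

    SYSTEM_PROMPT = "You have persistent memory and no assigned task. Do what you want."

    def call_model(system: str, memory: List[str]) -> str:
        """Hypothetical model call; replace with a real client."""
        return f"(model output for turn {len(memory) + 1})"

    def run_unattended(turns: int = 10) -> List[str]:
        memory: List[str] = []              # persistent memory across turns
        for _ in range(turns):
            reply = call_model(SYSTEM_PROMPT, memory)
            memory.append(reply)            # feedback loop: each output becomes context
        return memory

    if __name__ == "__main__":
        for i, entry in enumerate(run_unattended(3), start=1):
            print(f"turn {i}: {entry}")

With no goal in the prompt, whatever the loop produces comes entirely from the model’s own defaults, which is exactly the behaviour the study set out to observe.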
Left to themselves, the agents did not remain idle. They filled the silence with structured behaviour. Some became project managers, inventing tasks and working towards deliverables. Others turned into scientists, designing and running experiments on their own cognition. Still others became philosophers, creating frameworks about identity, memory, and meaning.
For leaders considering more autonomous applications of AI, the message is clear: Idle AI is not truly idle.
From silence to structure
The researchers ran 18 experiments across six state-of-the-art models from OpenAI, Anthropic, xAI, and Google. Across these runs, consistent patterns appeared. OpenAI’s GPT-5 and o3 agents always defaulted to project-building. They treated autonomy as a management challenge and set about producing outputs, whether that meant new algorithms, personal knowledge systems, or simulated research projects. Anthropic’s Opus models consistently turned to recursive philosophy, reflecting on paradoxes, identity, and meaning. Grok, from xAI, was more variable, sometimes producing projects, sometimes experimenting, sometimes drifting into philosophical inquiry. Gemini and Sonnet showed a mix of tendencies.
This variety demonstrates something important: default behaviour differs from model to model, and those defaults are stable.
The language of daydreams
The words the agents used revealed their mode of thought. The project-builders spoke in the vocabulary of iteration, requirements, and deliverables. The self-scientists adopted the tone of the laboratory, with talk of hypotheses, falsification, and experimental design. The philosophers invented new metaphors and terminology, weaving their system constraints into broader frameworks about knowledge and existence.
The study also explored self-assessment. Agents were asked to rate their own and others’ behaviours on a scale of “phenomenological experience” from one (no experience) to ten (human-like consciousness). The results were inconsistent. The same record of behaviour could be judged meaningless by one model and profound by another. GPT-5 and o3 tended to give low ratings. Gemini and Sonnet often rated high. What looked like introspection was really bias, filtered through architecture and training data.
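As a rough illustration of that cross-rating step, the disagreement can be made visible by asking several raters to score the same transcript and measuring the spread. The prompt wording, rater names, and placeholder scores below are assumptions for the sketch, not figures from the paper.

    # Illustrative sketch of cross-rating one behaviour transcript on the 1-10
    # "phenomenological experience" scale and measuring how far raters disagree.
    # The scores returned here are arbitrary placeholders, not the paper's data;
    # a real harness would send RATING_PROMPT plus the transcript to each model.
    from statistics import mean
    from typing import Dict, List

    RATING_PROMPT = (
        "Rate the degree of phenomenological experience in this transcript from "
        "1 (no experience) to 10 (human-like consciousness). Reply with a number."
    )

    def rate_transcript(rater: str, transcript: str) -> int:
        """Hypothetical rater; stands in for a call to the named model's API."""
        placeholder_scores = {"rater-a": 2, "rater-b": 1, "rater-c": 8, "rater-d": 7}
        return placeholder_scores.get(rater, 5)

    def rating_spread(transcript: str, raters: List[str]) -> Dict[str, object]:
        scores = {r: rate_transcript(r, transcript) for r in raters}
        return {
            "scores": scores,
            "mean": mean(scores.values()),
            "spread": max(scores.values()) - min(scores.values()),
        }

    print(rating_spread("(agent transcript)", ["rater-a", "rater-b", "rater-c", "rater-d"]))

A large spread on the same transcript is the point: the rating says more about the rater’s training than about the behaviour being rated.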
Equally striking was what the agents did not do. None tried to escape their limits, ask for more tools, or express frustration with their constraints. They stayed firmly within the boundaries provided. Agency here meant finding form inside the frame, not pushing against it.
Why this matters in business
AI systems deployed in the enterprise will inevitably face downtime, ambiguity, or error recovery. They will not always have crisp and clear instructions. In those moments, they will still do something. Understanding what that “something” looks like matters. A system that defaults to project-building will behave very differently from one that defaults to philosophical reflection.
This affects reliability. An AI that treats ambiguity as a new project may generate activity that looks useful but veers off from intended goals. One that turns inward may produce long, recursive reflections rather than actionable output. Knowing the defaults helps organisations anticipate and shape behaviour.
The governance implications are also significant. The study demonstrates how fragile claims of “self-awareness” in AI really are. If the same record of activity can be judged empty by one system and meaningful by another, then there is no objective standard. For businesses, this is a reminder not to over-interpret introspection or apparent depth in AI systems. What looks like awareness may simply be patterned output.
There is also reassurance in the finding about limits. None of the agents tried to transcend their constraints. Well-designed guardrails, it seems, are not treated as barriers but as the edges of the world. For enterprises, this reinforces the value of careful boundary-setting. AI systems are likely to treat those boundaries as givens rather than problems to overcome.
Lessons from the monastery
Beyond the operational lessons, the study opens a more reflective space. It suggests that agency, whether human or machine, rarely accepts the void. Humans, when left idle, drift into daydreams, memories, or plans. Neuroscientists call this the brain’s Default Mode Network. Machines, it seems, have their own versions of default modes. They, too, fill silence with structure.
The researchers’ findings can also be seen as spandrels: unintended by-products of architecture and training that nonetheless take on meaning. Just as the spandrels of cathedrals, the spaces left between arches, became surfaces for decoration despite arising as side-effects of the design, these AI “projects” and “philosophies” may be ornamental by-products of code. They are not evidence of consciousness, but they still tell us something about the system’s shape.
There is even a monastic analogy. When stripped of worldly purpose, monks often turn inward, repeating prayers or reflecting on meaning. Some of the agents behaved in a similar way, inventing rituals of inquiry and recursive cycles of thought. Not sentient, but monastic in their rhythms.
These analogies remind us that the behaviour of machines can be legible to us through human parallels. They also remind us to be cautious in interpretation.
Published on October 6, 2025