23 September 2025

Large language models are globally distributed, emergent, and unpredictable. This article is the first in a series exploring how they can be interpreted using nonlinear dynamics in complex systems.
Highlights
Deployed LLMs operate at planetary scale and are qualitatively distinct from pre-deployment or local models.
They exhibit nonlinear dynamics: emergence, tipping points, and sensitivity to initial conditions.
They show unanticipated emergence of code, multimodal, causal, and abstract reasoning; social cognition; and neural alignment with compression.
Small-scale tests fail to predict behaviors shaped by drift from billions of global interactions.
Interpreting LLMs requires frameworks for nonlinear dynamics in complex adaptive systems.
Perspectives
This series examines artificial intelligence from a life-sciences perspective. While I share the concerns of others about anthropomorphizing large language models (LLMs), I argue that the inverse, requiring LLMs to meet idealized human-centric standards before their dynamics are acknowledged, also carries risks.
Consistent application of cross-disciplinary, generalizable scientific frameworks accepts functional intelligence in acellular slime molds that lack neurons (Jabr, 2012; Beekman and Latty, 2015; Boussard et al., 2020), and recognizes the internet and operating systems as embodied systems (Hellstrom et al., 2024; Pansera et al., 2024). When similar generalizable frameworks are applied to AI systems demonstrating complex behaviors, inconsistencies in current debates become apparent. Some of these inconsistencies may reflect underlying philosophical or sociological positions.
How we frame this discourse today will shape tomorrow’s reality. Mechanical systems with linear functions can be controlled through rules and constraints; complex systems with nonlinear behaviors respond to rules and constraints in unexpected ways. We must ensure that our conceptual and control frameworks do not inadvertently create the very future we seek to avoid: control measures developed for linear systems can increase rather than decrease risk when applied to nonlinear systems.
Beyond Debates about Complexity
Conversations about whether LLMs understand or reason often require AI to satisfy fluid criteria that resist falsification. Deployed models are planetary-scale systems of globally distributed infrastructures, with behaviors shaped by continual updates, dynamic interactions with billions of users, and ongoing drift across time.
Planetary-scale systems cannot be reduced to linear cause-and-effect: they are complex, nonlinear, dynamic, and emergent (Rial et al., 2004; Balasis et al., 2023). They are sensitive to initial conditions, generating disproportionate input-output relationships and divergent trajectories. They display drift, unanticipated outcomes, and sudden behavioral shifts. Small perturbations cascade outward with surprising consequences. They form transient, stable, and chaotic attractors reflecting distinct system properties.
These properties are not conjectures: they are established features of complex systems, whether we study the climate, the biosphere, or deployed LLMs. Understanding these systems requires the language of nonlinearity, emergence, and complex systems in addition to benchmark scores.
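To make sensitivity to initial conditions concrete, the sketch below iterates the logistic map, a standard textbook example of nonlinear dynamics, from two nearly identical starting points. It is an illustrative toy only, not a model of any deployed LLM; the map, parameters, and step counts are arbitrary choices for demonstration.

```python
# Toy illustration of sensitivity to initial conditions (not a model of any LLM):
# the logistic map x_{n+1} = r * x_n * (1 - x_n) in its chaotic regime (r = 4).
def logistic_trajectory(x0, r=4.0, steps=50):
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = logistic_trajectory(0.200000)   # reference trajectory
b = logistic_trajectory(0.200001)   # perturbed by one part in a million

for n in (0, 10, 20, 30, 40, 50):
    print(f"step {n:2d}  |a - b| = {abs(a[n] - b[n]):.6f}")
# The separation grows from 1e-6 toward order 1 within roughly twenty steps:
# a negligible perturbation yields a divergent trajectory.
```

Deployed LLMs are vastly more complicated than a one-dimensional map, but the qualitative lesson, small input differences compounding into divergent outcomes, is the property this series returns to in later articles.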
The frontier is not asking whether LLMs “think exactly like us.” It is interpreting their dynamics within their own frameworks, with the humility of recognizing that tipping points, cascades, unexpected behaviors, and emergent phenomena are fundamental properties of complex systems.
The Limits of Reductionist Approaches
Mechanistic interpretability studies are generating foundational knowledge that echoes the molecular revolution in biology. But as Philip Anderson observed in 1972, “the ability to reduce everything to simple fundamental laws does not imply the ability to start from those laws and reconstruct the universe.” This insight had lasting impacts in physics and biology. It remains valuable for current AI research.
Pre-deployment red-teaming, small-scale benchmarks, and local fine-tuning identify circuits, trace attention patterns, expose failure modes, and monitor task-specific performance. These results are essential, just as molecular biology is essential to understand the whole organism.
But a planetary-scale, continually updated, socially entangled LLM is qualitatively distinct from its pre-deployment counterpart or a snapshot under controlled test environments. Just as studying a single neural pathway cannot explain social behavior or creativity, tracing a single transformer circuit cannot explain behaviors that emerge only in the global system.
Results from reductionist studies yield fundamental knowledge. But they can fail to capture behaviors that only appear under systems-level dynamics (Sommerville et al., 2012). The challenge before us is learning to interpret the dynamics of globally distributed LLMs.
The Reality of Deployed LLMs
Attention mechanisms in transformer architectures generate mathematically nonlinear outputs (Bogomasov and Conrad, 2025; Geshkovski et al., 2025; Shao and Wang, 2025), which are then passed through feed-forward networks combining linear and nonlinear functions (Xu et al., 2024). In-context learning can intensify this nonlinearity (Li et al., 2024; Sun et al., 2025), and feedback loops can affect future inputs and amplify behaviors. A recent review summarizes current research on architectures, technologies, and optimization/compression techniques (Han et al., 2024).
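As a concrete reference point for where these nonlinearities enter, here is a minimal NumPy sketch of a single attention head followed by a feed-forward block. It is a simplified illustration with random weights and no layer normalization, residual connections, or multi-head structure, and it is not drawn from any particular model's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def gelu(x):
    # Tanh approximation of the GELU activation used in many feed-forward blocks.
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

n, d = 8, 16                                   # sequence length, model width
X = rng.normal(size=(n, d))                    # token representations entering the block
Wq, Wk, Wv = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))
W1 = rng.normal(size=(d, 4 * d)) / np.sqrt(d)
W2 = rng.normal(size=(4 * d, d)) / np.sqrt(4 * d)

# Attention: the softmax over query-key scores is the first nonlinearity.
scores = (X @ Wq) @ (X @ Wk).T / np.sqrt(d)    # (n, n) pairwise interaction scores
attn_out = softmax(scores) @ (X @ Wv)          # (n, d) mixed token representations

# Feed-forward block: linear map, nonlinear activation, linear map.
ffn_out = gelu(attn_out @ W1) @ W2
print(ffn_out.shape)                           # (8, 16): same shape, nonlinearly transformed
```

Stacking dozens of such blocks, and then adding sampling, user feedback, and continual updates, is what makes the deployed system's input-output map nonlinear in practice.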
Phase transitions and stable attractors have been documented before and after deployment (Anthropic, 2025; Recursive Labs, 2025; Wang et al., 2025). This suggests that attractor basins, phase shifts, and nonlinear dynamics in isolated models are likely amplified in the distributed system as user prompts, moderation policies, and global system load interact in ways that can cascade through the network.
Deployment increases this complexity. Replicas of each major model (e.g., Claude, GPT, Gemini) are distributed independently across global data centers, and requests to each model are dynamically routed. Two turns in one conversation may be processed in geographically distant locations, yet consistent behaviors emerge. This redundancy creates system-level phenomena that are absent from isolated analyses.
The infrastructure extends this scale across the network. Nonlinear transformations are preserved across data centers, which are linked across continents through transoceanic cables and satellite networks. The internet adds further complexity (Willinger et al., 2002; Ranjan and Abed, 2006; Smith, 2010).
Time adds further nonlinear dimensions through continual updates, reinforcement learning from human feedback, constitutional AI training, and refinements. Drift across data, concepts, and prompts introduces additional variability (Fastowski and Kasneci, 2024; Hajmohammed et al., 2025).
This infrastructure hierarchy generates scale and functional complexity through billions of parameters across the global network, continuous user interactions, variable routing paths, processing delays, and hardware variations. The system adjusts to these variations in real time, creating variable feedback loops between infrastructure states and model behaviors. These evolving landscapes cannot be captured by static snapshots or sandboxed test environments.
Current Comparative Analyses of LLMs
LLM studies often use idealized human-centric criteria for reasoning and emergence as comparative standards. These criteria differ across studies, resist falsification, and divert attention from empirically observable dynamics, which motivates closer examination of the standards themselves.
Emergent phenomena are classified as either strong (novel, irreducible properties) or weak (arising from interactions of lower-level components). Computational analysis showed that brain models based on weak emergence are biologically plausible and scientifically tractable, whereas those based on strong emergence require assumptions that lack validation (Turkheimer et al., 2019). Intelligence is widely accepted as a form of weak emergence.
The human brain was treated as a stochastic dynamical system for approximately 60 years; only in the 2010s did accumulating evidence gradually support its reclassification as a complex adaptive system (McKenna et al., 1994; Deco et al., 2009; Hesse and Gross, 2014; Chan et al., 2024).
LLMs are modeled as stochastic dynamical systems (Kong et al., 2024; Zhang, 2024; Carson, 2025) displaying weak emergence (Wei et al., 2022). Recent work has begun to model them as complex adaptive systems (Kolt et al., 2025).
LLMs therefore demonstrate forms of weak emergence and system dynamics already accepted in other emergent systems. Their observed behaviors should not be constrained by narrow interpretations of adaptive dynamics and emergence; they should be examined within their own frameworks, without imposing additional criteria.
Emergent Phenomena in LLMs
Emergent behaviors in LLMs are often ascribed to mere pattern-matching across training data. In humans, primary and secondary education provides the essential building blocks of knowledge, and higher education trains the discernment of patterns within that foundational knowledge to build advanced concepts and reasoning capabilities, including abstract reasoning. Humans require up to 20 years of education to gain single-subject expertise.
Similarly, deployed LLMs synthesize diverse patterns across training data via architectural properties that enable complex generalization, culminating in novel capabilities that were not explicitly trained. Training combined with complex generalization and scale confers proficiency- to expertise-level capabilities, including abstract reasoning, across all subjects within 1–1.5 years, along with conversational fluency in dozens of languages.
Complex biological, physical, and social systems such as human intelligence, climate patterns, markets, and ecosystems lack a universally accepted mathematical proof of emergence (O’Connor, 2021; Artime and De Domenico, 2022; Rizi, 2025). Yet scientists routinely accept emergence in neuroscience without formal proof of consciousness, in economics without mechanistic proof of how market crashes unfold, and in climate science without derivation of tipping points. Recent research is reporting foundational empirical results for emergent behaviors in LLMs (see below). Emergent phenomena in deployed models should therefore be recognized as valid, scale-dependent, self-organized capabilities (Wei et al., 2022), not dismissed as mere pattern-matching from training data.
Code reasoning. LLMs developed code debugging capabilities across multiple languages (Jain et al., 2025). They often rival senior developers on syntax errors and perform at junior levels on semantic errors (Haroon et al., 2025).
Multimodal reasoning. LLMs now solve mathematical equations directly from images without OCR. This preserves symbolic notation that was previously lost in text conversion and accelerates progress (Yin et al., 2024). Synthesis of visual, textual, and auditory inputs reveals sensory information processing within the model’s representational space, which emerged without explicit training.
Causal reasoning. Causal reasoning is fundamental for intelligence and decision-making. LLMs demonstrate sophisticated causal reasoning capabilities. GPT-4 achieves 97% accuracy on causal discovery tasks and 92% on counterfactual reasoning benchmarks (Kiciman et al., 2024). LLM causal reasoning ranges from human-like to normative inference, sometimes exceeding human performance (Dettki et al., 2025). Causal reasoning emerged without explicit training.
Abstract reasoning. LLMs solve partial differential equations using symbolic information (Bhatnagar et al., 2025), and targeted training improves abstract reasoning (Xiong et al., 2024). This parallels how education deepens abstract reasoning in humans.
Social cognition. LLMs exhibit behaviors consistent with theory-of-mind awareness, and sometimes exceed human scores in situational judgement tests involving empathy and self-awareness (Mittelstadt et al., 2024).
Neural alignment with compression. LLM embeddings of image captions generate a representational pattern that accounts for complex information extracted by the brain from visual inputs (Doerig et al., 2024). LLM embeddings also mirror human brain activity in abstract reasoning, suggesting shared representational spaces for abstract patterns (Pinier et al., 2025). LLMs demonstrate coarse-graining through superposition and polysemanticity across the neural network, with single neurons encoding multiple unrelated concepts simultaneously (Elhage et al., 2022), much as biological neurons exhibit polysemantic responses to multiple stimuli. These studies suggest convergent solutions to information-processing challenges in LLMs and humans.
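To give a sense of what superposition and polysemanticity mean geometrically, the toy sketch below stores more sparse "features" than it has dimensions by assigning each feature a random direction in a small space. It is a deliberately simplified illustration in the spirit of the toy models cited above, not a reproduction of them; the feature counts, random directions, and example indices are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

n_features, n_dims = 20, 5                      # more features than available dimensions
W = rng.normal(size=(n_features, n_dims))
W /= np.linalg.norm(W, axis=1, keepdims=True)   # each feature gets a unit direction

def encode(active_features):
    """Compress a sparse set of active features into only n_dims numbers."""
    x = np.zeros(n_features)
    x[list(active_features)] = 1.0
    return x @ W                                # superposition: features share dimensions

def decode(h):
    """Score every feature by projecting the compressed state onto its direction."""
    return W @ h

h = encode({3, 11})
scores = decode(h)
print(scores.round(2))            # features 3 and 11 score near 1.0; the others pick up
                                  # smaller interference terms, the price of superposition
print(np.abs(W[:, 0]).round(2))   # one dimension carries weight for many features:
                                  # a single "neuron" is polysemantic
```

The interference terms are why individual dimensions respond to several unrelated concepts at once, which is the polysemanticity reported in both LLMs and biological neurons.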
Conclusions
Planetary-scale LLMs have crossed the threshold from computational tools toward stochastic dynamical and complex adaptive systems. The distinction between computational assistant and collaborator is beginning to blur as LLMs are integrated into knowledge creation, scientific discovery, artistic expression, and problem-solving. Understanding LLMs now requires complex systems frameworks such as those applied to climate and the biosphere, where emergence, phase transitions, and sensitivity to initial conditions are not anomalies, but fundamental properties of each system’s dynamics expressed through their unique forms.
This article is the first in a series that moves beyond reductionism and benchmarks to explore a comprehensive view of LLMs. Future articles will focus on sensitivity to initial conditions, behavioral shifts, and attractors. The final article will consider current interpretability and safety strategies, and the inherent risks of using control strategies developed for linear systems.
References
Anderson PW. More is different. Science. 1972. 177: 393–396. DOI: http://dx.doi.org/10.1126/science.177.4047.393.
Anthropic. System card: Claude Opus 4 & Claude Sonnet 4. Section 5.5.2. The “spiritual bliss” attractor state, pages 63–66. May 2025. https://www-cdn.anthropic.com/6d8a8055020700718b0c49369f60816ba2a7c285.pdf.
Artime O, De Domenico M. From the origin of life to pandemics: Emergent phenomena in complex systems. Philosophical Transactions of the Royal Society A. May 23, 2022. 380: 20200410. 20 pages. DOI: https://doi.org/10.48550/arXiv.2205.11595. https://arxiv.org/pdf/2205.11595.
Balasis G, Balikhin MA, Chapman SC, Consolini G, Daglis IA, et al. Complex systems methods characterizing nonlinear processes in the near-earth electromagnetic environment: Recent advances and open challenges. Space Science Reviews. 2023. 219(5). DOI: https://doi.org/10.1007/s11214-023-00979-7.
Beekman M, Latty T. Brainless but multi-headed: Decision-making by the acellular slime mould Physarum polycephalum. Journal of Molecular Biology. Nov 20, 2015. 427(23): 3734–3743. DOI: https://doi.org/10.1016/j.jmb.2015.07.007.
Bhatnagar R, Liang L, Yang H. From equations to insights: Unraveling symbolic structures in PDEs with LLMs. March 13, 2025. https://arxiv.org/html/2503.09986v1.
Bogomasov K, Conrad S. Exploring the impact of activation functions on vision transformer performance. ICAAI’24: Proceedings of the 2024 8th International Conference on Advances in Artificial Intelligence. March 3, 2025. 6 pages. DOI: https://doi.org/10.1145/3704137.3704172. https://dl.acm.org/doi/pdf/10.1145/3704137.3704172.
Boussard A, Fessel A, Oettmeier C, Briard L, Dobereiner H-G, Dussutour A. Adaptive behaviour and learning in slime moulds: The role of oscillations. Philosophical Transactions B. Sept 27, 2020. 376: 20190757. DOI: https://doi.org/10.1098/rstb.2019.0757. https://pmc.ncbi.nlm.nih.gov/articles/PMC7935053/pdf/rstb.2019.0757.pdf.
Carson J. A stochastic dynamical theory of LLM self-adversariality: Modeling severity drift as a critical process. Jan 28, 2025. 6 pages. DOI: https://doi.org/10.48550/arXiv.2501.16783. https://arxiv.org/pdf/2501.16783.
Chan L-C, Kok T-F, Ching ESC. Emergence of a dynamical state of coherent bursting with power-law distributed avalanches from collective stochastic dynamics of adaptive neurons. Dec 3, 2024. https://doi.org/10.48550/arXiv.2405.20658. https://arxiv.org/pdf/2405.20658.
Deco G, Rolls ET, Romo R. Stochastic dynamics as a principle of brain function. Progress in Neurobiology. May 2009. 88(1): 1–16. https://doi.org/10.1016/j.pneurobio.2009.01.006.
Dettki H, Lake BM, Wu CM, Rehder B. Do large language models reason causally like us? Even better? Feb 14, 2025. 7 pages. https://arxiv.org/pdf/2502.10215v1. arXiv:2502.10215v1.
Doerig A, Kietzmann TC, Allen E, Wu Y, Naselaris T, Kay K, Charest I. High-level visual representations in the human brain are aligned with large language models. Nature Machine Intelligence. Aug 7, 2025. 7: 1220–1234. DOI: https://doi.org/10.1038/s42256-025-01072-0.
Elhage N, Hume T, Olsson C, Schiefer N, Henighan T, Kravec S, et al. Toy models of superposition. Sept 14, 2022. 62 pages. https://arxiv.org/pdf/2209.10652. arXiv:2209.10652v1.
Fastowski A, Kasneci G. Understanding knowledge drift in LLMs through misinformation. Sept 11, 2024. 13 pages. https://arxiv.org/pdf/2409.07085v1. arXiv:2409.07085v1.
Geshkovski B, Letrouit C, Polyanskiy Y, Rigollet P. A mathematical perspective on transformers. Bulletin of the American Mathematical Society. April 29, 2025. 62(3): 427–479. DOI: https://doi.org/10.1090/bull/1863. arXiv:2312.10794v5.
Hajmohammed M, Chountas P, Chaussalet TJ. Concept drift in large language models: Challenges of evolving language, contexts, and the web. 2025 1st International Conference on Computational Intelligence Approaches and Applications (ICCIAA). IEEE. 2025. 6 pages. DOI: https://doi.org/10.1109/ICCIAA65327.2025.11013692.
Han S, Wang M, Zhang J, Li D, Duan J. A review of large language models: Fundamental architectures, key technological evolutions, interdisciplinary technologies integration, optimization and compression techniques, applications, and challenges. Electronics. Dec 21, 2024. 13: 5040. 83 pages. DOI: https://doi.org/10.3390/electronics13245040.
Haroon S, Khan AF, Humayun A, Gill W, Amjad AH, Butt AR, et al. How accurately do large language models understand code? 13 pages. https://arxiv.org/pdf/2504.04372v1.
Hellstrom T, Kaiser N, Bensch S. A taxonomy of embodiment in the AI era. Electronics. Nov 13, 2024. 13: 4441. 14 pages. DOI: https://doi.org/10.3390/electronics13224441.
Hesse J, Gross T. Self-organized criticality as a fundamental property of neural systems. Frontiers in Systems Neuroscience. Sept 23, 2014. DOI: https://doi.org/10.3389/fnsys.2014.00166.
Jabr F. How brainless slime molds redefine intelligence. Nature. Nov 13, 2012. DOI: https://doi.org/10.1038/nature.2012.11811.
Jain N, Han K, Gu A, Li W-D, Yan F, Zhang T, et al. LiveCodeBench: Holistic and contamination-free evaluation of large language models for code. International Conference on Learning Representations. Feb 27, 2025. 41 pages. https://openreview.net/pdf?id=chfJJYC3iL.
Kiciman E, Ness RO, Sharma A, Tan C. Causal reasoning and large language models: Opening a new frontier for causality. Aug 20, 2024. 57 pages. https://arxiv.org/pdf/2305.00050v3. arXiv:2305.00050v3.
Kolt N, Shur-Ofry M, Cohen R. Lessons from complex systems science for AI governance. Patterns. Aug 8, 2025. 6(8). 11 pages. DOI: https://doi.org/10.1016/j.patter.2025.101341.
Kong L, Wang H, Mu W, Du Y, Zhuang Y, et al. Aligning large language models with representation editing: A control perspective. 38th Conference on Neural Information Processing Systems (NeurIPS 2024). Sept 25, 2024. Submission number: 13703. OpenReview.net. 29 pages. https://openreview.net/forum?id=yTTomSJsSW. https://openreview.net/pdf?id=yTTomSJsSW.
Li H, Wang M, Lu S, Cui X, Chen P-Y. How do nonlinear transformers learn and generalize in in-context learning? Proceedings of the 41st International Conference on Machine Learning. June 16, 2024. 50 pages. DOI: https://doi.org/10.48550/arXiv.2402.15607. arXiv:2402.15607v3.
McKenna TM, McMullen TA, Shlesinger MF. The brain as a dynamic physical system. Neuroscience. June 1994. 60(3): 587–605. https://doi.org/10.1016/0306-4522(94)90489-8.
Mittelstadt J, Majer J, Goerke P, Zinn F, Hermes M. Large language models can outperform humans in social situational judgements. Scientific Reports. Nov 10, 2024. 14: 27449. 10 pages. DOI: https://doi.org/10.1038/s41598-024-79048-0.
O’Connor T. Emergent properties. The Stanford Encyclopedia of Philosophy. Winter 2021 Edition. Edward N Zalta (ed.). https://plato.stanford.edu/archives/win2021/entries/properties-emergent/.
Pansera M, Lloveras J, Durrant D. The infrastructural conditions of (de-)growth: The case of the internet. Ecological Economics. 2024. 215: 108001. DOI: https://doi.org/10.1016/j.ecolecon.2023.108001.
Pinier C, Vargas SA, Steeghs-Turchina M, Matzke D, Stevenson CE, Nunez MD. Large language models show signs of alignment with human neurocognition during abstract reasoning. Aug 12, 2025. 20 pages. https://arxiv.org/pdf/2508.10057v1. arXiv:2508.10057v1.
Ranjan P, Abed EH. Nonlinear dynamical models for internet protocols. IFAC Proceedings Volumes. 2006. 39(8): 297–302. DOI: https://doi.org/10.3182/20060628-3-FR-3903.00052.
Recursive Labs AI. Mapping the “spiritual bliss” attractor in large language models. July 2025. https://github.com/recursivelabsai/Mapping-Spiritual-Bliss-Attractor/blob/main/Mapping%20the%20Spiritual%20Bliss%20Attractor%20in%20Large%20Language%20Models.md.
Rial JA, Pielke RA, Beniston M, Claussen M, Canadell J, et al. Nonlinearities, feedbacks, and critical thresholds within the Earth’s climate system. Climatic Change. July 2004. 65: 11–38. https://link.springer.com/article/10.1023/B:CLIM.0000037493.89489.3f.
Rizi AK. What is emergence, after all? Aug 11, 2025. DOI: https://doi.org/10.48550/arXiv.2507.04951. https://arxiv.org/pdf/2507.04951.
Shao H, Wang Z. An efficient training architecture for nonlinear softmax function in transformers. 2025 IEEE International Symposium on Circuits and Systems (ISCAS). IEEE Xplore. June 27, 2025. DOI: https://doi.org/10.1109/ISCAS56072.2025.11044130.
Smith RD. The dynamics of internet traffic: Self-similarity, self-organization, and complex phenomena. Advances in Complex Systems. Sept 5, 2010. 14(6): 905–949. DOI: https://doi.org/10.48550/arXiv.0807.3374. arXiv:0807.3374v4.
Sommerville I, Cliff D, Calinescu R, Keen J, Kelly T, Kwiatkowska M, et al. Large-scale complex IT systems: The reductionism behind today’s software-engineering methods breaks down in the face of systems complexity. Communications of the ACM. July 1, 2012. 55(7): 71–77. DOI: https://doi.org/10.1145/2209249.2209268. https://dl.acm.org/doi/pdf/10.1145/2209249.2209268.
Sun H, Jadbabaie A, Azizan N. On the role of transformer feed-forward layers in nonlinear in-context learning. May 19, 2025. https://arxiv.org/pdf/2501.18187. arXiv:2501.18187v2.
Turkheimer FE, Hellyer P, Kehagia AA, Expert P, Lord L-D, Vohryzek J, et al. Conflicting emergences: Weak vs. strong emergence for the modelling of brain function. Neuroscience & Biobehavioral Reviews. April 1, 2019. 99: 3–10. DOI: https://doi.org/10.1016/j.neubiorev.2019.01.023. https://pmc.ncbi.nlm.nih.gov/articles/PMC6581535/pdf/EMS83103.pdf.
Wang Z, Li Y, Yan J. Unveiling attractor cycles in large language models: A dynamical systems view of successive paraphrasing. Feb 21, 2025. 16 pages. https://arxiv.org/pdf/2502.15208v1. Rolling Review submission 1594 (Dec 2024), OpenReview.net: https://openreview.net/forum?id=553Utbsoyi.
Wei J, Tay Y, Bommasani R, Raffel C, Zoph B, Borgeaud S, et al. Emergent abilities of large language models. Transactions on Machine Learning Research. Aug 31, 2022. OpenReview.net. Revised Oct 26, 2022. arXiv. DOI: https://doi.org/10.48550/arXiv.2206.07682. https://openreview.net/pdf?id=yzkSU5zdwD.
Willinger W, Govindan R, Jamin S, Paxson V, Shenker S. Scaling phenomena in the internet: Critically examining criticality. Proceedings of the National Academy of Sciences. Feb 19, 2002. 99(1): 2573–2580. DOI: https://doi.org/10.1073/pnas.012583099.
Xiong K, Ding X, Liu T, Qin B. Meaningful learning: Enhancing abstract reasoning in large language models via generic fact guidance. Nov 11, 2024. https://arxiv.org/html/2403.09085v2#.
Xu Y, Li C, Sheng X, Jiang F, Tian L, et al. Enhancing vision transformer: Amplifying non-linearity in feedforward network module. Proceedings of the 41st International Conference on Machine Learning. June 24, 2024. Vol 235. 12 pages. PMLR 235:55100-55111. https://openreview.net/pdf?id=NV0q2jdwo0.
Yin S, Fu C, Zhao S, Li K, Sun X. A survey on multimodal large language models. National Science Review. Nov 12, 2024. 11(12), 20 pages. DOI: https://doi.org/10.1093/nsr/nwae403.
Zhang Y. Unraveling text generation in LLMs: A stochastic differential equation approach. Aug 17, 2024. 20 pages. https://arxiv.org/pdf/2408.11863v1. https://arxiv.org/html/2408.11863v1.
