    World models could unlock the next revolution in artificial intelligence

By Team_Benjamin Franklin Institute | January 18, 2026 | 6 min read


    You’ve probably seen an artificial intelligence system go off track. You ask for a video of a dog, and as the dog runs behind the love seat, its collar disappears. Then, as the camera pans back, the love seat becomes a sofa.

Part of the problem lies in the predictive nature of many AI models. Like the models that power ChatGPT, which are trained to predict the next word in a sequence, video generation models predict what is statistically most plausible to appear next. In neither case does the AI hold a clearly defined model of the world that it continuously updates to make more informed decisions.
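The "most plausible next thing" idea can be made concrete with a toy example. This is an illustration of statistical next-item prediction in general, not how ChatGPT or any real video model is built: a bigram model that picks the continuation it saw most often in training, with no persistent model of the scene it is describing.

```python
from collections import Counter, defaultdict

# Toy corpus; a real model trains on trillions of tokens, not one sentence.
corpus = "the dog ran behind the love seat the dog ran home".split()

# Count which word follows which.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Return the statistically most plausible continuation seen in training."""
    return following[word].most_common(1)[0][0]

print(predict_next("dog"))  # prints "ran"
```

Nothing in the model knows that a dog wears a collar or that a love seat stays a love seat; it only knows which items tend to follow which.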

    But that’s starting to change as researchers across many AI domains work on creating “world models,” with implications that extend beyond video generation and chatbot use to augmented reality, robotics, autonomous vehicles and even humanlike intelligence—or artificial general intelligence (AGI).




A simple way to understand world modeling is through four-dimensional, or 4D, models (three dimensions plus time). To see how, think back to 2012, when Titanic, 15 years after its theatrical release, was painstakingly converted into stereoscopic 3D. If you were to freeze any frame, you would have an impression of the distances between characters and objects on the ship. But if Leonardo DiCaprio had his back to the camera, you couldn't walk around him to see his face. Cinema's illusion of 3D comes from stereoscopy: two slightly different images, often projected in rapid alternation, one for the left eye and one for the right. Everyone in the cinema sees the same pair of images and thus roughly the same perspective.

Multiple perspectives are, however, increasingly possible thanks to the past decade of research. Imagine realizing you should have shot a photo from a different angle and then having AI render the same scene from that new perspective. Starting in 2020, NeRF (neural radiance field) algorithms offered a path to create "photorealistic novel views," though they required combining many photos so that an AI system could build a 3D representation. Other 3D approaches use AI to fill in missing information predictively, deviating further from reality.
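The interface a NeRF learns can be sketched conceptually. The real thing is a trained neural network; the hand-written stand-in below (an opaque glowing sphere) is purely illustrative. The key idea: a function from a 3D point (and viewing direction) to a color and a density, which a renderer queries along camera rays to synthesize novel views.

```python
import math

def radiance_field(x, y, z, view_dir=None):
    """Toy stand-in for a learned field: an opaque unit sphere at the origin."""
    inside = math.sqrt(x * x + y * y + z * z) < 1.0
    density = 1.0 if inside else 0.0          # how opaque the point is
    color = (1.0, 0.5, 0.0) if inside else (0.0, 0.0, 0.0)
    return color, density

def march_ray(origin, direction, steps=50, far=4.0):
    """Accumulate opacity along a camera ray, as a volume renderer would."""
    opacity, transmittance = 0.0, 1.0
    for i in range(steps):
        t = far * i / steps
        p = [o + t * d for o, d in zip(origin, direction)]
        _, density = radiance_field(*p, view_dir=direction)
        alpha = density * (far / steps)       # absorption in this small slab
        opacity += transmittance * alpha
        transmittance *= 1.0 - alpha
    return opacity

# A ray aimed at the sphere accumulates opacity; one aimed away sees nothing.
hit = march_ray((0.0, 0.0, -2.0), (0.0, 0.0, 1.0))
miss = march_ray((0.0, 0.0, -2.0), (0.0, 1.0, 0.0))
```

Because the field can be queried from any origin and direction, a renderer can place the camera anywhere, which is what makes the "novel views" possible.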

    Now, imagine that every frame in Titanic were represented in 3D so that the movie existed in 4D. You could scroll through time to see different moments or scroll through space to watch it from different perspectives. You could also generate new versions of it. For instance, a recent preprint, “NeoVerse: Enhancing 4D World Model with in-the-Wild Monocular Videos,” describes one way of turning videos into 4D models to generate new videos from different perspectives.

    But 4D techniques can also help generate new video content. Another recent preprint, “TeleWorld: Towards Dynamic Multimodal Synthesis with a 4D World Model,” applies to the scenario with which we began: the dog running behind the love seat. The authors argue that the stability of AI video systems improves when a continuously updated 4D world model guides generation. The system’s 4D model would help to prevent the love seat from becoming a couch and the dog from losing its collar.
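The intuition can be sketched in a few lines. This is a hypothetical toy, not TeleWorld's actual method: instead of predicting each frame only from the preceding pixels, the generator consults a persistent scene state that is updated at every step, so objects keep their properties even while out of view.

```python
# Persistent scene state: what the system "knows" about the world right now.
scene = {"dog": {"collar": True, "visible": True},
         "furniture": "love seat"}

def generate_frame(scene, event):
    """Update the scene state, then render the frame from the state."""
    if event == "dog hides":
        scene["dog"]["visible"] = False   # occluded, but still in the state
    elif event == "dog reappears":
        scene["dog"]["visible"] = True
    # The frame is a snapshot of the state, not a guess from prior pixels.
    return {k: dict(v) if isinstance(v, dict) else v for k, v in scene.items()}

frames = [generate_frame(scene, e) for e in ["dog hides", "dog reappears"]]
print(frames[-1]["dog"]["collar"], frames[-1]["furniture"])  # prints "True love seat"
```

A purely pixel-predictive generator has no such state to consult, which is why the collar can vanish and the love seat can drift into being a sofa.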

    These are early results, but they hint at a broader trend: models that update an internal scene map as they generate. Yet 4D modeling has applications far beyond video generation. For augmented reality (AR)—think Meta’s Orion prototype glasses—a 4D world model is an evolving map of the user’s world over time. It allows AR systems to keep virtual objects stable, to make lighting and perspective believable and to have a spatial memory of what recently happened. It also allows for occlusions—when digital objects disappear behind real ones. A 2023 paper puts the requirement bluntly: “To achieve occlusion, a 3D model of the physical environment is required.”
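The depth comparison behind occlusion can be sketched per pixel (hypothetical values; real AR pipelines do this on the GPU with depth buffers): for each pixel, the system compares the depth of the virtual object against the depth its 3D model of the room reports for the real surface there, and draws the virtual object only where it is closer to the viewer.

```python
def composite_pixel(virtual_depth, real_depth, virtual_color, camera_color):
    """Draw the virtual pixel only if nothing real sits in front of it."""
    if virtual_depth is not None and virtual_depth < real_depth:
        return virtual_color
    return camera_color  # the real object occludes the virtual one

# A virtual dog 3 m away, partly behind a real love seat 2 m away:
print(composite_pixel(3.0, 2.0, "dog", "love seat"))  # prints "love seat"
print(composite_pixel(1.5, 2.0, "dog", "love seat"))  # prints "dog"
```

Without a 3D model of the room there is no `real_depth` to compare against, which is exactly the requirement the 2023 paper states.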

    Being able to rapidly convert videos into 4D also provides rich data for training robots and autonomous vehicles on how the real world works. And by generating 4D models of the space they’re in, robots could navigate it better and predict what might happen next. Today’s general-purpose vision-language AI models—which understand images and text but do not generate clearly defined world models—often make errors; a benchmark paper presented at a 2025 conference reports “striking limitations” in their basic world-modeling abilities, including “near-random accuracy when distinguishing motion trajectories.”

    Here’s the catch: “world model” means much more to those pursuing AGI. For instance, today’s leading large language models (LLMs), such as those powering ChatGPT, have an implicit sense of the world from their training data. “In a way, I would say that the LLM already has a very good world model; it’s just we don’t really understand how it’s doing it,” says Angjoo Kanazawa, an assistant professor of electrical engineering and computer sciences at University of California, Berkeley. These conceptual models, though, aren’t a real-time physical understanding of the world because LLMs can’t update their training data in real time. Even OpenAI’s technical report notes that, once deployed, its model GPT-4 “does not learn from experience.”

    “How do you develop an intelligent LLM vision system that can actually have streaming input and update its understanding of the world and act accordingly?” Kanazawa says. “That’s a big open problem. I think AGI is not possible without actually solving this problem.”

    Though researchers debate whether LLMs could ever attain AGI, many see LLMs as a component of future AI systems. The LLM would act as the layer for “language and common sense to communicate,” Kanazawa says; it would serve as an “interface,” whereas a more clearly defined underlying world model would provide the necessary “spatial temporal memory” that current LLMs lack.

In recent years a number of prominent AI researchers have turned toward world models. In 2024 Fei-Fei Li founded World Labs, which recently launched its Marble software to create 3D worlds from "text, images, video, or coarse 3D layouts," according to the start-up's promotional material. And last November AI researcher Yann LeCun announced on LinkedIn that he was leaving Meta to launch a start-up, now called Advanced Machine Intelligence (AMI Labs), to build "systems that understand the physical world, have persistent memory, can reason, and can plan complex action sequences." He seeded these ideas in a 2022 position paper in which he asked why humans can act well in situations they've never encountered and argued the answer "may lie in the ability… to learn world models, internal models of how the world works." Research increasingly shows the benefits of internal models. An April 2025 Nature paper reported results on DreamerV3, an AI agent that, by learning a world model, can improve its behavior by "imagining" future scenarios.
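The kind of "imagining" such an agent does can be caricatured in a few lines. This hypothetical toy planner is nothing like DreamerV3 itself: the agent rolls candidate action sequences forward through its internal transition model and acts on the sequence whose imagined outcome scores best, with no real-world trial needed.

```python
from itertools import product

GOAL = 5  # the state the toy agent wants to reach

def model(state, action):
    """The agent's internal transition model (here a known toy dynamic;
    a real agent would have learned this from experience)."""
    return state + (1 if action == "right" else -1)

def imagine(state, actions):
    """Roll a candidate plan forward entirely inside the model."""
    for a in actions:
        state = model(state, a)
    return state

def plan(state, horizon=3):
    """Pick the action sequence whose imagined end state is nearest the goal."""
    candidates = product(["left", "right"], repeat=horizon)
    return min(candidates, key=lambda seq: abs(GOAL - imagine(state, seq)))

print(plan(2))  # prints ('right', 'right', 'right')
```

The crucial point is that every trial happens inside the model: the agent evaluates futures it has never experienced, which is what an internal world model buys.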

So while, in the context of AGI, "world model" refers to an internal model of how reality works rather than just 4D reconstructions, advances in 4D modeling could supply components that help with viewpoint understanding, memory and even short-term prediction. And meanwhile, on the path to AGI, 4D models can provide rich simulations of reality in which to test AIs, so that when we do let them operate in the real world, they know how to exist in it.


