Learning robust controllers that work across many partially observable environments

By NextTech, November 28, 2025

In intelligent systems, applications range from autonomous robotics to predictive maintenance problems. To control these systems, the essential aspects are captured with a model. When we design controllers for these models, we almost always face the same challenge: uncertainty. We’re rarely able to see the whole picture. Sensors are noisy, models of the system are imperfect; the world never behaves exactly as expected.

Imagine a robot navigating around an obstacle to reach a “goal” location. We abstract this scenario into a grid-like environment. A rock may block the path, but the robot doesn’t know exactly where the rock is. If it did, the problem would be rather easy: plan a route around it. But with uncertainty about the obstacle’s position, the robot must learn to operate safely and efficiently no matter where the rock turns out to be.

This simple story captures a much broader challenge: designing controllers that can cope with both partial observability and model uncertainty. In this blog post, I will guide you through our IJCAI 2025 paper, “Robust Finite-Memory Policy Gradients for Hidden-Model POMDPs”, where we explore designing controllers that perform reliably even when the environment may not be precisely known.

When you can’t see everything

When an agent doesn’t fully observe the state, we describe its sequential decision-making problem using a partially observable Markov decision process (POMDP). POMDPs model situations in which an agent must act, based on its policy, without full knowledge of the underlying state of the system. Instead, it receives observations that provide limited information about the underlying state. To handle that ambiguity and make better decisions, the agent needs some form of memory in its policy to remember what it has seen before. We typically represent such memory using finite-state controllers (FSCs). In contrast to neural networks, these are practical and efficient policy representations that encode internal memory states that the agent updates as it acts and observes.
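To make the idea of an FSC more concrete, here is a minimal sketch in Python. It assumes a deterministic controller for simplicity, and the class name, observation labels, and actions are all hypothetical illustrations, not taken from the paper or from any tool.

```python
from dataclasses import dataclass


@dataclass
class FiniteStateController:
    """A deterministic FSC: a few memory nodes plus two lookup tables.

    Simplifying assumption: the controller is deterministic and all
    names and labels here are made up for illustration.
    """
    initial_node: int
    action_of: dict   # (memory node, observation) -> action
    next_node: dict   # (memory node, observation) -> next memory node

    def run(self, observations):
        """Return the actions chosen for a sequence of observations."""
        node, actions = self.initial_node, []
        for obs in observations:
            actions.append(self.action_of[(node, obs)])
            node = self.next_node[(node, obs)]
        return actions


# Hypothetical two-node controller:
# node 0 = "no wall seen yet", node 1 = "a wall was seen earlier".
fsc = FiniteStateController(
    initial_node=0,
    action_of={(0, "clear"): "forward", (0, "wall"): "left",
               (1, "clear"): "right", (1, "wall"): "left"},
    next_node={(0, "clear"): 0, (0, "wall"): 1,
               (1, "clear"): 1, (1, "wall"): 1},
)

actions = fsc.run(["clear", "wall", "clear"])
print(actions)  # ['forward', 'left', 'right']
```

Note that the same observation, "clear", leads to different actions before and after the wall was seen. That difference is exactly what the memory node encodes.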

From partial observability to hidden models

Real situations rarely fit a single model of the system. POMDPs capture uncertainty in observations and in the outcomes of actions, but not in the model itself. Despite their generality, POMDPs cannot capture sets of partially observable environments. In reality, there may be many plausible variations, as there are always unknowns: different obstacle positions, slightly different dynamics, or varying sensor noise. A controller for a POMDP does not generalize to perturbations of the model. In our example, the rock’s location is unknown, but we still want a controller that works across all possible locations. This is a more realistic, but also a more challenging scenario.

To capture this model uncertainty, we introduced the hidden-model POMDP (HM-POMDP). Rather than describing a single environment, an HM-POMDP represents a set of possible POMDPs that share the same structure but differ in their dynamics or rewards. An important fact is that a controller for one model is also applicable to the other models in the set.

The true environment in which the agent will ultimately operate is “hidden” in this set. This means the agent must learn a controller that performs well across all possible environments. The challenge is that the agent doesn’t just have to reason about what it can’t see but also about which environment it’s operating in.

A controller for an HM-POMDP must be robust: it should perform well across all possible environments. We measure the robustness of a controller by its robust performance: the worst-case performance over all models, providing a guaranteed lower bound on the agent’s performance in the true model. If a controller performs well even in the worst case, we can be confident it will perform acceptably on any model of the set when deployed.
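The notion of robust performance can be written down compactly. The notation below is an assumption made for this post rather than the paper's own: \mathcal{M} denotes the set of POMDPs in the HM-POMDP, J_M(\pi) the performance of controller \pi in model M (higher is better), and M^* the true, hidden model.

```latex
J_{\mathrm{robust}}(\pi) \;=\; \min_{M \in \mathcal{M}} J_M(\pi),
\qquad\text{and therefore}\qquad
J_{M^*}(\pi) \;\ge\; J_{\mathrm{robust}}(\pi)
\quad\text{for every } M^* \in \mathcal{M}.
```

The inequality is the guaranteed lower bound mentioned above: whichever model turns out to be the true one, the controller does at least as well as in its worst case.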

Towards learning robust controllers

So, how can we design such controllers?

We developed the robust finite-memory policy gradient (rfPG) algorithm, an iterative approach that alternates between the following two key steps:

  • Robust policy evaluation: Find the worst case. Determine the environment in the set where the current controller performs the worst.
  • Policy optimization: Improve the controller for the worst case. Adjust the controller’s parameters with gradients from the current worst-case environment to improve robust performance.
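The alternation between these two steps can be sketched on a deliberately tiny toy problem. This is not the rfPG implementation: it uses no FSC and no PAYNT, the "controller" is a single parameter theta, and each hypothetical "environment" is just a pair (a, b) giving the return of two actions. All names and numbers are assumptions for illustration.

```python
import math

# Hypothetical toy "environments": taking action A with probability p
# yields the expected return J_m(p) = p * a + (1 - p) * b.
MODELS = [(1.0, 0.0), (0.0, 1.0), (0.9, 0.7)]


def value(model, p):
    """Expected return of the policy with action probability p in one model."""
    a, b = model
    return p * a + (1.0 - p) * b


def worst_case(p):
    """Step 1, robust policy evaluation: the model with the lowest value."""
    return min(MODELS, key=lambda m: value(m, p))


def robust_policy_gradient(theta=2.0, steps=2000, step_size=0.5):
    """Alternate worst-case selection and a gradient step on that model.

    Returns the best parameter seen and its robust (worst-case) value.
    """
    best_theta, best_value = theta, -math.inf
    for k in range(1, steps + 1):
        p = 1.0 / (1.0 + math.exp(-theta))    # sigmoid policy parameterization
        model = worst_case(p)                 # step 1: find the worst case
        robust_value = value(model, p)
        if robust_value > best_value:         # remember the best iterate
            best_theta, best_value = theta, robust_value
        a, b = model
        grad = (a - b) * p * (1.0 - p)        # dJ_m/dtheta via the chain rule
        theta += step_size / math.sqrt(k) * grad  # step 2: improve worst case
    return best_theta, best_value


theta, robust_value = robust_policy_gradient()
print(f"theta={theta:.3f}, robust value={robust_value:.3f}")
```

In this toy, the first two models pull the policy in opposite directions, so the best robust choice is to mix both actions roughly equally. The diminishing step size is a common choice for subgradient-style methods, because the worst-case model, and with it the gradient, can switch from one iteration to the next.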

Over time, the controller learns robust behavior: what to remember and how to act across the encountered environments. The iterative nature of this approach is rooted in the mathematical framework of “subgradients”. We apply these gradient-based updates, also used in reinforcement learning, to improve the controller’s robust performance. While the details are technical, the intuition is simple: iteratively optimizing the controller for the worst-case models improves its robust performance across all the environments.

Under the hood, rfPG uses formal verification techniques implemented in the tool PAYNT, exploiting structural similarities to represent large sets of models and evaluate controllers across them. Thanks to these developments, our approach scales to HM-POMDPs with many environments. In practice, this means we can reason over more than a hundred thousand models.

What’s the impact?

We tested rfPG on HM-POMDPs that simulate environments with uncertainty, for example navigation problems where obstacles or sensor errors vary between models. In these tests, rfPG produced policies that were not only more robust to these variations but also generalized better to completely unseen environments than several POMDP baselines. In practice, that implies we can render controllers robust to minor variations of the model. Recall our running example, with a robot that navigates a grid-world where the rock’s location is unknown. Excitingly, rfPG solves it near-optimally with only two memory nodes! You can see the controller below.

[Figure: the two-memory-node controller that rfPG finds for the rock example.]

By integrating model-based reasoning with learning-based methods, we develop algorithms for systems that account for uncertainty rather than ignore it. While the results are promising, they come from simulated domains with discrete spaces; real-world deployment will require handling the continuous nature of various problems. Still, the approach is practically relevant for high-level decision-making and trustworthy by design. In the future, we will scale up, for example by using neural networks, and aim to handle broader classes of variations in the model, such as distributions over the unknowns.

Want to know more?

Thank you for reading! I hope you found it interesting and got a sense of our work. You can find out more about my work on marisgg.github.io and about our research group at ai-fm.org.

This blog post is based on the following IJCAI 2025 paper:

  • Maris F. L. Galesloot, Roman Andriushchenko, Milan Češka, Sebastian Junges, and Nils Jansen: “Robust Finite-Memory Policy Gradients for Hidden-Model POMDPs”. In IJCAI 2025, pages 8518–8526.

For more on the techniques we used from the tool PAYNT and, more generally, about using these techniques to compute FSCs, see the paper below:

  • Roman Andriushchenko, Milan Češka, Filip Macák, Sebastian Junges, Joost-Pieter Katoen: “An Oracle-Guided Approach to Constrained Policy Synthesis Under Uncertainty”. In JAIR, 2025.

If you’d like to learn more about another way of handling model uncertainty, have a look at our other papers as well. For instance, in our ECAI 2025 paper, we design robust controllers using recurrent neural networks (RNNs):

  • Maris F. L. Galesloot, Marnix Suilen, Thiago D. Simão, Steven Carr, Matthijs T. J. Spaan, Ufuk Topcu, and Nils Jansen: “Pessimistic Iterative Planning with RNNs for Robust POMDPs”. In ECAI, 2025.

And in our NeurIPS 2025 paper, we study the evaluation of policies:

  • Merlijn Krale, Eline M. Bovy, Maris F. L. Galesloot, Thiago D. Simão, and Nils Jansen: “On Evaluating Policies for Robust POMDPs”. In NeurIPS, 2025.


Maris Galesloot
is an ELLIS PhD Candidate at the Institute for Computing and Information Sciences of Radboud University.
