
The most dangerous mistake in AI robotics is not using an LLM.
It is giving the LLM the wrong kind of authority.
An LLM can interpret intent, summarize context, read manuals, propose plans, call tools, and help operators understand what a robot is doing. That is useful. But a robot is still a cyber-physical system. It has timing constraints, noisy sensors, actuator limits, stale data, bus failures, and safety envelopes that cannot be negotiated by a language model.
So the architecture question is not “should a robot use an LLM?”
The real question is:
Which layer is allowed to decide what?
If you split authority correctly, an LLM can become a valuable semantic layer on top of a robot stack. If you split it badly, you build a demo that looks intelligent until the first stale sensor, delayed callback, blocked action, invalid state transition, or unsafe command.
This article is the authority model I would use for an AI-enabled robot built with an LLM, ROS 2, and a microcontroller.
Key takeaways
- The LLM should own semantic interpretation, operator interaction, task decomposition, and tool selection proposals.
- ROS 2 should own robot state, supervision, action orchestration, interface contracts, observability, and safety envelope checks.
- The microcontroller should own real-time control, hard timing, direct actuator loops, low-level interlocks, and sensor sampling that cannot tolerate jitter.
- The LLM should never publish raw motor commands, GPIO writes, PWM values, relay toggles, or safety-critical mode changes.
- A safe architecture uses typed commands, validation layers, bounded actions, cancellation paths, watchdogs, freshness checks, and degraded modes.
- The best mental model is not “AI brain.” It is “LLM as semantic planner, ROS 2 as supervisor, MCU as real-time authority.”
Citation-ready answer
In an AI-enabled robot, authority should be split by time scale and consequence. The LLM may propose goals and tool calls, ROS 2 should validate and supervise those goals against robot state and safety constraints, and the microcontroller should execute only bounded real-time control loops and hardware-facing commands. The closer a decision is to actuator timing, electrical safety, or irreversible physical motion, the less authority the LLM should have.
The authority stack
Here is the simplest version of the architecture.
| Layer | Owns | Must not own |
|---|---|---|
| LLM | Intent parsing, semantic planning, explanation, tool proposal | Raw actuation, safety state, timing-critical control |
| ROS 2 | State aggregation, supervision, actions, lifecycle, observability, command validation | Hard real-time motor loops, electrical interlocks |
| Microcontroller | PID loops, PWM, encoder reads, emergency hardware behavior, sensor sampling, local watchdogs | Semantic goals, free-form planning, user intent |
This split is not philosophical. It is a direct response to how physical systems fail.
The LLM is slow, probabilistic, text-oriented, and context-sensitive. ROS 2 is a distributed robotics middleware with typed interfaces, topics, services and actions. The microcontroller is close to the hardware and can run deterministic loops at rates that a language model should not even see.
When those boundaries are clean, the system has a chance to degrade safely.
When those boundaries are blurred, a prompt becomes part of the control path.
That is exactly what I argued against in my earlier article on why LLMs should not control motors and robots directly. This article goes one layer deeper: if the LLM does not control motors, what does it control?
Why time scale defines authority
The first rule is simple:
The faster the loop, the lower the layer.
| Time scale | Example | Right owner |
|---|---|---|
| 10 ms or less | Motor current, encoder pulse, inner control loop | MCU or motor controller |
| 10-100 ms | Sensor sampling, local safety interlock, watchdog heartbeat | MCU |
| 50-500 ms | State estimation, local planning, navigation feedback | ROS 2 nodes |
| 500 ms to seconds | Action goals, task progress, operator interaction | ROS 2 supervision plus LLM support |
| Seconds to minutes | Task decomposition, procedure selection, explanation | LLM |
ROS 2 documentation on real-time programming is clear about why this matters: a real-time loop must meet deadlines with a small margin of allowable jitter, and nondeterministic operations such as page faults, dynamic allocation, or blocking synchronization can break timing assumptions.
An LLM call is almost the definition of nondeterministic latency.
That does not make LLMs useless. It means they belong above the timing-critical path.
If a robot needs to balance, stop a motor, debounce an encoder, limit torque, keep a PWM loop stable, or respond to a hard limit switch, the LLM is irrelevant. That logic belongs in the microcontroller, motor controller, safety relay, or real-time control layer.
The LLM can decide that the robot should inspect a shelf.
It should not decide the PWM duty cycle that moves the wheel.
What the LLM is allowed to do
The LLM is useful when the problem is semantic.
Good LLM responsibilities include:
- interpreting operator intent,
- turning natural language into a structured task request,
- choosing among high-level tools,
- summarizing robot state for a human,
- reading manuals or troubleshooting notes,
- proposing a recovery procedure,
- explaining why a task cannot run,
- generating a checklist for the operator,
- producing a bounded action request for ROS 2 to validate.
The keyword is “bounded.”
An LLM output should not be “set motor A to 74 percent and open relay 3.” It should be something closer to:
1 | { |
That object is not a command to the hardware.
It is a proposal to the supervisory layer.
The LLM can be wrong. The architecture must assume it will be wrong sometimes. ROS 2 should treat the output like untrusted intent, not as authority.
This is also why generic “AI agent” architectures do not transfer cleanly to robotics. In a web workflow, a bad tool call might send a poor email. In a robot, a bad tool call can energize a motor, damage hardware, or create a human safety risk.
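Treating model output as untrusted intent can be sketched as a strict parser that turns raw text into a typed proposal or refuses it. This is a minimal sketch, not a definitive schema: the field names (`tool`, `target_id`, `max_speed_m_s`, `timeout_s`, `reason`), the `TaskProposal` type, and the magnitude bound are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass

# Hypothetical schema for illustration; these field names are assumptions.
REQUIRED_FIELDS = {"tool", "target_id", "max_speed_m_s", "timeout_s"}
OPTIONAL_FIELDS = {"reason"}
MAX_MAGNITUDE = 1e6  # assumed sanity bound for any numeric field


@dataclass(frozen=True)
class TaskProposal:
    """A typed proposal for the supervisor. It is never a hardware command."""
    tool: str
    target_id: str
    max_speed_m_s: float
    timeout_s: float
    reason: str = ""


def parse_proposal(raw_text: str) -> TaskProposal:
    """Parse untrusted LLM text into a TaskProposal, or raise ValueError."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("proposal must be a JSON object")

    keys = set(data)
    missing = REQUIRED_FIELDS - keys
    unknown = keys - REQUIRED_FIELDS - OPTIONAL_FIELDS
    if missing or unknown:
        raise ValueError(f"missing={sorted(missing)} unknown={sorted(unknown)}")

    for name in ("tool", "target_id"):
        if not isinstance(data[name], str) or not data[name]:
            raise ValueError(f"{name} must be a non-empty string")
    for name in ("max_speed_m_s", "timeout_s"):
        value = data[name]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"{name} must be a number")
        # Written as not (<=) so that NaN and infinities are rejected too.
        if not (abs(value) <= MAX_MAGNITUDE):
            raise ValueError(f"{name} is out of range")
    reason = data.get("reason", "")
    if not isinstance(reason, str):
        raise ValueError("reason must be a string")

    return TaskProposal(
        tool=data["tool"],
        target_id=data["target_id"],
        max_speed_m_s=float(data["max_speed_m_s"]),
        timeout_s=float(data["timeout_s"]),
        reason=reason,
    )
```

A parser like this only proves the text has the right shape. Whether the robot may act on the proposal is a separate decision that still belongs to the supervisor.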
What ROS 2 should supervise
ROS 2 is the right place for robot-level supervision because it already gives you typed interfaces and distributed node communication.
The ROS 2 documentation describes topics as continuous data streams, services as short request/response interactions, and actions as long-running tasks with feedback and cancellation. That difference matters. A robot stack should not treat all commands as generic function calls. It should choose the interface type based on semantics.
| ROS 2 interface | Use it for | Do not use it for |
|---|---|---|
| Topic | Sensor data, robot state, diagnostics, health, command telemetry | One-off commands that require acceptance/rejection |
| Service | Quick bounded checks, configuration queries, validation requests | Long-running motion or tasks that need cancellation |
| Action | Navigation goals, inspection tasks, manipulation routines, recovery procedures | High-rate control or raw actuator streaming |
This mapping follows the official ROS 2 guide to topics, services and actions: topics for continuous streams, services for quick request/response calls, and actions for long-running tasks with feedback and cancellation.
That distinction gives you a clean authority design:
- The LLM proposes a task.
- A ROS 2 supervisor validates the task.
- The supervisor sends a ROS 2 action goal to a bounded subsystem.
- The subsystem publishes feedback and status.
- The supervisor can cancel, pause, degrade, or reject.
- The MCU never sees free-form intent.
The ROS 2 supervisor should own:
- command schema validation,
- robot mode validation,
- state freshness checks,
- action preconditions,
- speed and workspace limits,
- battery and thermal checks,
- sensor confidence checks,
- operator confirmation rules,
- cancellation and timeout policy,
- logging and traceability,
- degraded-mode selection.
The LLM should not bypass this layer.
If you are building something like a local robot brain on Jetson, this is the layer that keeps the AI useful without making it authoritative. It fits naturally with the architecture I described in my article on building a local robot brain on Jetson Orin Nano Super.
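One of the supervisor duties listed above, state freshness checks, is simple enough to sketch in a few lines. The version below is an assumption-laden illustration: `StampedSignal`, the per-signal `max_age_s` budgets, and the use of a monotonic clock are my choices, and a real ROS 2 node would more likely compare message header stamps against the node clock.

```python
import time
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class StampedSignal:
    name: str
    received_at_s: float  # time.monotonic() reading when the sample arrived
    max_age_s: float      # hypothetical freshness budget for this signal


def stale_signals(signals: Iterable[StampedSignal],
                  now_s: Optional[float] = None) -> List[str]:
    """Return names of signals older than their budget or stamped in the future."""
    now = time.monotonic() if now_s is None else now_s
    stale = []
    for sig in signals:
        age = now - sig.received_at_s
        # A negative age means the stamp is inconsistent, so treat it as stale.
        if age < 0.0 or age > sig.max_age_s:
            stale.append(sig.name)
    return stale
```

The supervisor would refuse to start or continue a task whenever this list is non-empty for the signals that task requires.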
What the microcontroller must own
The MCU is the wrong place for language.
It is the right place for physical truth.
The microcontroller should own:
- encoder reads,
- PWM generation,
- motor enable pins,
- low-level PID loops,
- watchdog timeout behavior,
- hard limits,
- debouncing,
- current or voltage protection,
- emergency local stop behavior,
- actuator state transitions,
- real-time sensor sampling,
- local command expiry.
It should also be allowed to refuse commands.
That last point is important. Many robot architectures treat the microcontroller like a transparent adapter between ROS 2 and hardware. That is a mistake. The MCU is not just a wire. It is the last software authority before electrons become motion.
If ROS 2 sends an expired command, reject it.
If the heartbeat is stale, stop.
If the requested velocity violates a local limit, clamp or reject it.
If the command sequence is invalid, ignore it.
If the mode is not armed, do not energize.
This principle is very close to the argument in my micro-ROS piece on connecting a real-time microcontroller to a ROS 2 brain on Jetson. micro-ROS can make a microcontroller participate in a ROS 2 graph, but participation is not the same thing as surrendering hardware authority.
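The refusal rules above can be modeled as a small command gate. The sketch below uses Python only to show the logic; real firmware would normally be written in C or C++. The command fields, the `GateState` structure, the 200 ms heartbeat timeout, and the 0.5 m/s limit are hypothetical values chosen for the example, and the gate rejects over-limit velocities where clamping would be an equally valid choice.

```python
import math
from dataclasses import dataclass
from enum import Enum

HEARTBEAT_TIMEOUT_MS = 200  # assumed value
MAX_VELOCITY_M_S = 0.5      # assumed local limit


class Verdict(Enum):
    ACCEPT = "accept"
    STOP_HEARTBEAT_STALE = "stop: heartbeat stale"
    REJECT_NOT_ARMED = "reject: not armed"
    REJECT_EXPIRED = "reject: command expired"
    REJECT_SEQUENCE = "reject: invalid sequence"
    REJECT_LIMIT = "reject: velocity limit"


@dataclass(frozen=True)
class VelocityCommand:
    seq: int
    velocity_m_s: float
    expires_at_ms: int


@dataclass
class GateState:
    armed: bool
    last_seq: int
    last_heartbeat_ms: int


def gate(cmd: VelocityCommand, state: GateState, now_ms: int) -> Verdict:
    """Decide locally whether a command may reach the control loop."""
    if now_ms - state.last_heartbeat_ms > HEARTBEAT_TIMEOUT_MS:
        return Verdict.STOP_HEARTBEAT_STALE
    if not state.armed:
        return Verdict.REJECT_NOT_ARMED
    if now_ms >= cmd.expires_at_ms:
        return Verdict.REJECT_EXPIRED
    if cmd.seq <= state.last_seq:
        return Verdict.REJECT_SEQUENCE
    # isfinite guards against NaN, which would slip past a plain comparison.
    if not math.isfinite(cmd.velocity_m_s) or abs(cmd.velocity_m_s) > MAX_VELOCITY_M_S:
        return Verdict.REJECT_LIMIT
    state.last_seq = cmd.seq  # only accepted commands advance the sequence
    return Verdict.ACCEPT
```

The heartbeat check comes first on purpose: if the link to the supervisor is stale, no individual command should be considered at all.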
Where micro-ROS fits
micro-ROS is useful because it gives resource-constrained devices a path into a ROS 2 ecosystem.
The eProsima Micro XRCE-DDS documentation explains that Micro XRCE-DDS uses a client-server protocol where resource-constrained clients communicate with an Agent that bridges them into DDS. That is powerful because the MCU can publish and subscribe through a ROS-like model without pretending to be a workstation.
But you still need to design the contract carefully.
A good MCU contract is not:
`/motor_pwm`
A better contract is:
`/base_controller/cmd_velocity_limited`
The difference is authority.
The first interface exposes actuation.
The second exposes a bounded controller.
If the MCU is running micro-ROS, the same rule applies. Do not publish raw actuator authority unless the higher layer is also a trusted, deterministic controller with a valid reason to own it. For LLM-mediated systems, it almost never is.
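To make the difference between exposing actuation and exposing a bounded controller concrete, here is a minimal sketch of what could sit behind a velocity-limited interface. The `LimitedVelocityController` class and its speed and acceleration limits are assumptions for illustration, and Python stands in for firmware code. The caller only ever supplies a requested velocity; the limits live on the controller side.

```python
import math
from dataclasses import dataclass


def clamp(value: float, low: float, high: float) -> float:
    return max(low, min(high, value))


@dataclass
class LimitedVelocityController:
    max_speed_m_s: float = 0.5    # assumed limit
    max_accel_m_s2: float = 1.0   # assumed limit
    current_m_s: float = 0.0

    def step(self, requested_m_s: float, dt_s: float) -> float:
        """Move the velocity setpoint toward the request, within local limits."""
        # Treat NaN or infinite requests as a request to stop.
        if not math.isfinite(requested_m_s):
            requested_m_s = 0.0
        target = clamp(requested_m_s, -self.max_speed_m_s, self.max_speed_m_s)
        max_delta = self.max_accel_m_s2 * dt_s
        delta = clamp(target - self.current_m_s, -max_delta, max_delta)
        self.current_m_s += delta
        return self.current_m_s
```

Whatever the higher layer asks for, the setpoint that reaches the inner loop stays inside the speed and acceleration envelope. PWM values are computed downstream of this function and never appear in the interface.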
A practical authority flow
Here is a pattern I would use.
Human request → LLM proposal → ROS 2 supervisor → action server → MCU → motor driver
At each step, authority gets narrower.
The human can express broad intent.
The LLM can translate that intent.
ROS 2 can decide whether the robot state permits it.
The action server can run a bounded behavior.
The MCU can execute timing-critical control.
The motor driver can enforce electrical constraints.
That narrowing is what makes the architecture defensible.
The opposite pattern is dangerous:
1 | Human request |
That may work in a demo. It is not a robot architecture.
The validation layer
The command validator is the most important piece in this design.
It should sit between LLM proposals and ROS 2 actions.
It should validate at least:
| Check | Why it matters |
|---|---|
| Schema validity | The LLM output must match a known command contract. |
| Known intent | Unknown tools or goals must be rejected. |
| Robot mode | A command valid in manual mode may be invalid in autonomous mode. |
| State freshness | Stale localization or sensor data makes plans unsafe. |
| Workspace limits | Do not send goals outside allowed physical zones. |
| Speed/force limits | Keep high-level commands inside conservative envelopes. |
| Required sensors | Do not run a task if required perception is degraded. |
| Human proximity | Shared workspaces require stricter gating. |
| Timeout | Every accepted command needs expiry. |
| Cancellation | Every long-running action needs a cancel path. |
| Audit log | Every AI-mediated decision should be explainable later. |
The validator does not need to be clever.
It needs to be boring, explicit, and hard to bypass.
That is the key difference between an AI demo and a cyber-physical system.
A minimal command contract
The first contract should be small.
Do not give the LLM twenty tools. Give it two or three high-level capabilities with tight schemas.
Example:
1 | { |
The validator should check:
- `tool` is on the allowlist,
- `target_id` exists in a known map or semantic registry,
- `max_speed_m_s` is below the allowed mode limit,
- required sensors are healthy,
- localization is fresh,
- no safety fault is active,
- operator confirmation is satisfied if required,
- timeout is within bounds.
Only then should ROS 2 send an action goal.
This pattern works well because the LLM is still useful. It can choose the target, explain the reason, and produce the proposal. But it cannot invent a new actuation path.
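The checklist above maps almost line for line onto a boring validator. The sketch below is one possible shape, not a definitive implementation: the allowlist entries, target names, mode names, numeric limits, and the `Proposal` and `RobotSnapshot` types are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import List

# All names and limits below are hypothetical placeholders.
TOOL_ALLOWLIST = {"navigate_to", "inspect_target"}
KNOWN_TARGETS = {"shelf_3", "dock_1"}
MODE_SPEED_LIMIT_M_S = {"autonomous": 0.5, "shared_workspace": 0.2}
MAX_LOCALIZATION_AGE_S = 0.5
MIN_TIMEOUT_S, MAX_TIMEOUT_S = 1.0, 300.0


@dataclass(frozen=True)
class Proposal:
    tool: str
    target_id: str
    max_speed_m_s: float
    timeout_s: float


@dataclass(frozen=True)
class RobotSnapshot:
    mode: str
    required_sensors_healthy: bool
    localization_age_s: float
    safety_fault_active: bool
    confirmation_required: bool
    operator_confirmed: bool


def rejection_reasons(p: Proposal, robot: RobotSnapshot) -> List[str]:
    """Return every failed check; an empty list means the goal may be sent."""
    reasons = []
    if p.tool not in TOOL_ALLOWLIST:
        reasons.append("tool not on allowlist")
    if p.target_id not in KNOWN_TARGETS:
        reasons.append("unknown target_id")
    limit = MODE_SPEED_LIMIT_M_S.get(robot.mode)
    if limit is None:
        reasons.append("mode does not accept AI-originated goals")
    elif not (0.0 < p.max_speed_m_s <= limit):
        reasons.append("max_speed_m_s outside mode limit")
    if not robot.required_sensors_healthy:
        reasons.append("required sensors degraded")
    if not (0.0 <= robot.localization_age_s <= MAX_LOCALIZATION_AGE_S):
        reasons.append("localization is stale")
    if robot.safety_fault_active:
        reasons.append("safety fault active")
    if robot.confirmation_required and not robot.operator_confirmed:
        reasons.append("operator confirmation missing")
    if not (MIN_TIMEOUT_S <= p.timeout_s <= MAX_TIMEOUT_S):
        reasons.append("timeout out of bounds")
    return reasons
```

Collecting all reasons instead of stopping at the first failure is a deliberate choice here: the full list can go into the audit log and back to the LLM, which can then explain to the operator why the task cannot run.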
ROS 2 actions are often the right boundary
For AI-mediated robot tasks, ROS 2 actions are usually a better boundary than services or topics.
Why?
Because actions provide:
- a goal,
- feedback,
- status,
- result,
- cancellation.
The ROS 2 action documentation describes actions as long-running remote procedure calls with feedback and cancellation. That is exactly the shape of many robot tasks: navigate to a waypoint, inspect an object, dock, recover from a fault, run a calibration sequence, or perform a bounded manipulation routine.
An LLM should not stream commands.
It should request an action.
Then the robot stack should report progress in a way that the LLM can summarize for the operator without controlling the low-level loop.
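The reason an action is a safer boundary than a stream of commands is that a goal has an explicit lifecycle with a cancel path. The sketch below is a simplified, pure-Python model of such a lifecycle, loosely based on the goal states described in the ROS 2 action design (accepted, executing, canceling, succeeded, canceled, aborted). It does not use `rclpy`, and the event names are my own assumptions.

```python
from enum import Enum


class GoalState(Enum):
    ACCEPTED = "accepted"
    EXECUTING = "executing"
    CANCELING = "canceling"
    SUCCEEDED = "succeeded"
    CANCELED = "canceled"
    ABORTED = "aborted"


# Allowed (state, event) pairs. Anything else is an invalid transition.
TRANSITIONS = {
    (GoalState.ACCEPTED, "execute"): GoalState.EXECUTING,
    (GoalState.ACCEPTED, "cancel"): GoalState.CANCELING,
    (GoalState.EXECUTING, "cancel"): GoalState.CANCELING,
    (GoalState.EXECUTING, "succeed"): GoalState.SUCCEEDED,
    (GoalState.EXECUTING, "abort"): GoalState.ABORTED,
    (GoalState.CANCELING, "canceled"): GoalState.CANCELED,
    (GoalState.CANCELING, "succeed"): GoalState.SUCCEEDED,
    (GoalState.CANCELING, "abort"): GoalState.ABORTED,
}

TERMINAL = {GoalState.SUCCEEDED, GoalState.CANCELED, GoalState.ABORTED}


def advance(state: GoalState, event: str) -> GoalState:
    """Apply an event to a goal state, or raise ValueError if it is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"invalid transition: {state.value} + {event}") from None
```

In this model the LLM can at most ask the supervisor to submit a goal or request a cancel. Every other transition is driven by the action server, and terminal states cannot be reopened by anyone.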
How to choose the boundary
Use this table when deciding where a behavior belongs.
| Question | If yes | Authority layer |
|---|---|---|
| Does it require deterministic timing below 100 ms? | Keep it out of the LLM and usually out of high-level ROS. | MCU or real-time controller |
| Does it directly energize hardware? | Require local interlocks and mode checks. | MCU plus safety layer |
| Does it need robot-wide state? | Validate centrally. | ROS 2 supervisor |
| Does it take seconds or minutes and need feedback? | Use an action. | ROS 2 action server |
| Does it require natural language or procedure interpretation? | Let the LLM propose, not execute. | LLM plus validator |
| Can it harm hardware or people if wrong? | Reduce autonomy and require stronger checks. | Supervisor plus safety function |
The more physical consequence a command has, the more boring the interface should be.
That is a good rule.
Failure modes this architecture prevents
This split is not just cleaner. It prevents concrete failures.
| Failure mode | What goes wrong without boundaries | What the authority split does |
|---|---|---|
| Hallucinated tool | LLM invents a command name or parameter | Schema validator rejects it |
| Stale state | Robot acts on old localization or sensor data | ROS 2 supervisor checks freshness |
| Slow model call | Control loop waits on an LLM response | MCU loop never depends on model latency |
| Unsafe target | Robot navigates outside allowed area | Supervisor rejects workspace violation |
| Lost link | Jetson or ROS graph drops connection | MCU watchdog expires to safe state |
| Long task cannot stop | Robot continues despite operator change | ROS 2 action cancellation path exists |
| Bad mode | AI command runs while robot is manual or faulted | Mode gate blocks it |
| Over-broad command | LLM asks for “go faster” | Limits clamp or reject the request |
The point is not to make the LLM perfect.
The point is to make LLM imperfection survivable.
How this differs from a normal AI agent stack
In a normal software agent, the hard part is often reliability: JSON contracts, retries, tool schemas, validation, logs.
I wrote about that in my article on how I built an AI agent architecture.
In robotics, those patterns still matter, but they are not enough.
A physical AI agent has extra constraints:
- time,
- energy,
- heat,
- motion,
- sensor uncertainty,
- local safety,
- actuator wear,
- communication loss,
- humans near the machine.
That is why the validation layer cannot be only a JSON parser.
It must be a robot-state validator.
It must understand modes, faults, stale data, action status, safety envelope, and degraded behavior.
Recommended reference architecture
For a Jetson-based robot, I would start with this:
1 | LLM process |
This is not the only architecture. But it is a good default because every layer has a reason to exist.
Design principles
- The LLM proposes. It does not command hardware.
- ROS 2 validates. It does not abdicate to the LLM.
- The MCU executes bounded real-time control. It does not interpret user intent.
- Every AI-originated request must be typed, logged, bounded, cancellable, and rejectable.
- Every actuator-facing command must expire.
- Every long-running task should expose feedback and cancellation.
- Every safety-relevant state must be observable outside the LLM context.
- The robot must have useful behavior when the LLM, network, or Jetson process fails.
If a design violates one of these principles, it may still work as a demo.
It is probably not ready to be trusted near hardware.
FAQ
Should an LLM ever publish directly to a ROS 2 topic?
It can publish to a non-actuating intent or explanation topic, but it should not publish directly to actuator command topics. A better pattern is for the LLM to produce a structured proposal that a ROS 2 supervisor validates before any action goal or hardware-facing command is created.
Is ROS 2 real-time enough for motor control?
ROS 2 can be used in systems with real-time constraints, but hard low-level motor loops often belong in a microcontroller, motor controller, or real-time control process. Use ROS 2 for orchestration, state, supervision and actions; use the MCU for deterministic loops and hardware interlocks.
Where does micro-ROS fit in this architecture?
micro-ROS is useful when a microcontroller needs to participate in a ROS 2 system, publish status, receive bounded commands, or expose sensor data. It should not turn the MCU into a passive wire adapter. The MCU should still enforce local timing, command expiry, limits, and safety behavior.
Should AI robot commands be services or actions?
For long-running robot behaviors, actions are usually the better boundary because they provide feedback, status, result, and cancellation. Services are better for short checks or configuration queries. Topics are better for continuous streams such as state, diagnostics, sensor data and health.
What is the safest first tool to expose to an LLM?
Start with read-only tools: summarize robot state, inspect logs, explain active faults, search documentation, or propose a task without executing it. After that, expose one bounded action proposal with strict validation. Do not start with raw motor, GPIO, relay or serial-write tools.
How do I know the authority split is too loose?
If the LLM can create new command names, choose raw actuator values, bypass robot modes, ignore stale state, send commands without expiry, or trigger physical action without a validator, the split is too loose.
Final thought
The best AI robot architecture does not make the LLM the brain.
It makes the LLM a semantic layer inside a stack that still respects physics.
ROS 2 should supervise the robot as a distributed system. The microcontroller should protect the timing and hardware boundary. The LLM should help humans and high-level autonomy reason about goals, context and procedures.
That split is less flashy than a single “AI brain” diagram.
It is also much closer to how reliable physical systems should be built.