How to Split Authority Between an LLM, ROS 2, and a Microcontroller

The most dangerous mistake in AI robotics is not the decision to use an LLM.

It is giving the LLM the wrong kind of authority.

An LLM can interpret intent, summarize context, read manuals, propose plans, call tools, and help operators understand what a robot is doing. That is useful. But a robot is still a cyber-physical system. It has timing constraints, noisy sensors, actuator limits, stale data, bus failures, and safety envelopes that cannot be negotiated by a language model.

So the architecture question is not “should a robot use an LLM?”

The real question is:

Which layer is allowed to decide what?

If you split authority correctly, an LLM can become a valuable semantic layer on top of a robot stack. If you split it badly, you build a demo that looks intelligent until the first stale sensor, delayed callback, blocked action, invalid state transition, or unsafe command.

This article lays out the authority model I would use for an AI-enabled robot built with an LLM, ROS 2, and a microcontroller.

Key takeaways

The LLM should own semantic interpretation, operator interaction, task decomposition, and tool selection proposals.
ROS 2 should own robot state, supervision, action orchestration, interface contracts, observability, and safety envelope checks.
The microcontroller should own real-time control, hard timing, direct actuator loops, low-level interlocks, and sensor sampling that cannot tolerate jitter.
The LLM should never publish raw motor commands, GPIO writes, PWM values, relay toggles, or safety-critical mode changes.
A safe architecture uses typed commands, validation layers, bounded actions, cancellation paths, watchdogs, freshness checks, and degraded modes.
The best mental model is not “AI brain.” It is “LLM as semantic planner, ROS 2 as supervisor, MCU as real-time authority.”

Citation-ready answer

In an AI-enabled robot, authority should be split by time scale and consequence. The LLM may propose goals and tool calls, ROS 2 should validate and supervise those goals against robot state and safety constraints, and the microcontroller should execute only bounded real-time control loops and hardware-facing commands. The closer a decision is to actuator timing, electrical safety, or irreversible physical motion, the less authority the LLM should have.

The authority stack

Here is the simplest version of the architecture.

Layer	Owns	Must not own
LLM	Intent parsing, semantic planning, explanation, tool proposal	Raw actuation, safety state, timing-critical control
ROS 2	State aggregation, supervision, actions, lifecycle, observability, command validation	Hard real-time motor loops, electrical interlocks
Microcontroller	PID loops, PWM, encoder reads, emergency hardware behavior, sensor sampling, local watchdogs	Semantic goals, free-form planning, user intent

This split is not philosophical. It is a direct response to how physical systems fail.

The LLM is slow, probabilistic, text-oriented, and context-sensitive. ROS 2 is a distributed robotics middleware with typed interfaces, topics, services and actions. The microcontroller is close to the hardware and can run deterministic loops at rates that a language model should not even see.

When those boundaries are clean, the system has a chance to degrade safely.

When those boundaries are blurred, a prompt becomes part of the control path.

That is exactly what I argued against in why LLMs should not control motors and robots directly. This article goes one layer deeper: if the LLM does not control motors, what does it control?

Why time scale defines authority

The first rule is simple:

The faster the loop, the lower the layer.

Time scale	Example	Right owner
10 ms or less	Motor current, encoder pulse, inner control loop	MCU or motor controller
10-100 ms	Sensor sampling, local safety interlock, watchdog heartbeat	MCU
50-500 ms	State estimation, local planning, navigation feedback	ROS 2 nodes
500 ms to seconds	Action goals, task progress, operator interaction	ROS 2 supervision plus LLM support
Seconds to minutes	Task decomposition, procedure selection, explanation	LLM

The ROS 2 documentation on real-time programming explains why this matters: a real-time loop must meet its deadlines with only a small margin of allowable jitter, and nondeterministic operations such as page faults, dynamic memory allocation, or blocking synchronization can break timing assumptions.

An LLM call is almost the definition of nondeterministic latency.

That does not make LLMs useless. It means they belong above the timing-critical path.

If a robot needs to balance, stop a motor, debounce an encoder, limit torque, keep a PWM loop stable, or respond to a hard limit switch, the LLM is irrelevant. That logic belongs in the microcontroller, motor controller, safety relay, or real-time control layer.

The LLM can decide that the robot should inspect a shelf.

It should not decide the PWM duty cycle that moves the wheel.

What the LLM is allowed to do

The LLM is useful when the problem is semantic.

Good LLM responsibilities include:

interpreting operator intent,
turning natural language into a structured task request,
choosing among high-level tools,
summarizing robot state for a human,
reading manuals or troubleshooting notes,
proposing a recovery procedure,
explaining why a task cannot run,
generating a checklist for the operator,
producing a bounded action request for ROS 2 to validate.

The keyword is “bounded.”

An LLM output should not be “set motor A to 74 percent and open relay 3.” It should be something closer to:

{
  "intent": "inspect_area",
  "target": "charging_station",
  "constraints": {
    "max_speed_m_s": 0.25,
    "avoid_people": true,
    "require_operator_confirm": false
  },
  "reason": "Operator asked for a visual inspection after a charging fault."
}

That object is not a command to the hardware.

It is a proposal to the supervisory layer.

The LLM can be wrong. The architecture must assume it will be wrong sometimes. ROS 2 should treat the output as untrusted intent, not as authority.
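
One way to make "untrusted intent" concrete is to parse the LLM text into a typed object and reject anything that does not match the contract. The following is a minimal sketch in plain Python, not a ROS 2 API: the `TaskProposal` type, the `parse_proposal` function, the `KNOWN_INTENTS` allowlist, and the decision to reject unknown keys are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass

KNOWN_INTENTS = {"inspect_area"}  # hypothetical allowlist
TOP_LEVEL_KEYS = {"intent", "target", "constraints", "reason"}
CONSTRAINT_KEYS = {"max_speed_m_s", "avoid_people", "require_operator_confirm"}


@dataclass(frozen=True)
class TaskProposal:
    intent: str
    target: str
    max_speed_m_s: float
    avoid_people: bool
    require_operator_confirm: bool
    reason: str


def parse_proposal(raw: str) -> TaskProposal:
    """Turn raw LLM text into a typed proposal, or raise ValueError."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict) or set(data) != TOP_LEVEL_KEYS:
        raise ValueError("top-level keys do not match the contract")
    for key in ("intent", "target", "reason"):
        if not isinstance(data[key], str):
            raise ValueError(f"{key} must be a string")
    if data["intent"] not in KNOWN_INTENTS:
        raise ValueError(f"unknown intent: {data['intent']!r}")
    limits = data["constraints"]
    if not isinstance(limits, dict) or set(limits) != CONSTRAINT_KEYS:
        raise ValueError("constraint keys do not match the contract")
    speed = limits["max_speed_m_s"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(speed, bool) or not isinstance(speed, (int, float)):
        raise ValueError("max_speed_m_s must be a number")
    for key in ("avoid_people", "require_operator_confirm"):
        if not isinstance(limits[key], bool):
            raise ValueError(f"{key} must be a boolean")
    return TaskProposal(
        intent=data["intent"],
        target=data["target"],
        max_speed_m_s=float(speed),
        avoid_people=limits["avoid_people"],
        require_operator_confirm=limits["require_operator_confirm"],
        reason=data["reason"],
    )
```

This step only establishes shape and types. Whether the speed is acceptable, the target exists, or the robot is in a state to run the task is a separate question for the supervisor.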

This is also why generic “AI agent” architectures do not transfer cleanly to robotics. In a web workflow, a bad tool call might send a poor email. In a robot, a bad tool call can energize a motor, damage hardware, or create a human safety risk.

What ROS 2 should supervise

ROS 2 is the right place for robot-level supervision because it already gives you typed interfaces and distributed node communication.

The ROS 2 documentation describes topics as continuous data streams, services as short request/response interactions, and actions as long-running tasks with feedback and cancellation. That difference matters. A robot stack should not treat all commands as generic function calls. It should choose the interface type based on semantics.

ROS 2 interface	Use it for	Do not use it for
Topic	Sensor data, robot state, diagnostics, health, command telemetry	One-off commands that require acceptance/rejection
Service	Quick bounded checks, configuration queries, validation requests	Long-running motion or tasks that need cancellation
Action	Navigation goals, inspection tasks, manipulation routines, recovery procedures	High-rate control or raw actuator streaming

The official ROS 2 guide to topics, services and actions draws the same lines: topics for continuous streams, services for short request/response calls, and actions for long-running tasks with feedback.

That distinction gives you a clean authority design:

The LLM proposes a task.
A ROS 2 supervisor validates the task.
The supervisor sends a ROS 2 action goal to a bounded subsystem.
The subsystem publishes feedback and status.
The supervisor can cancel, pause, degrade, or reject.
The MCU never sees free-form intent.

The ROS 2 supervisor should own:

command schema validation,
robot mode validation,
state freshness checks,
action preconditions,
speed and workspace limits,
battery and thermal checks,
sensor confidence checks,
operator confirmation rules,
cancellation and timeout policy,
logging and traceability,
degraded-mode selection.

The LLM should not bypass this layer.

If you are building something like a local robot brain on Jetson, this is the layer that keeps the AI useful without making it authoritative. It fits naturally with the architecture I described in building a local robot brain on Jetson Orin Nano Super.

What the microcontroller must own

The MCU is the wrong place for language.

It is the right place for physical truth.

The microcontroller should own:

encoder reads,
PWM generation,
motor enable pins,
low-level PID loops,
watchdog timeout behavior,
hard limits,
debouncing,
current or voltage protection,
emergency local stop behavior,
actuator state transitions,
real-time sensor sampling,
local command expiry.

It should also be allowed to refuse commands.

That last point is important. Many robot architectures treat the microcontroller like a transparent adapter between ROS 2 and hardware. That is a mistake. The MCU is not just a wire. It is the last software authority before electrons become motion.

If ROS 2 sends an expired command, reject it.

If the heartbeat is stale, stop.

If the requested velocity violates a local limit, clamp or reject it.

If the command sequence is invalid, ignore it.

If the mode is not armed, do not energize.
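
The five rules above can be modeled as one small gate in front of the motor loop. This is a host-side Python sketch of logic that would normally live in firmware, usually in C or C++. The `Command` fields, the `CommandGate` and `Verdict` names, and both limit values are assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass
from enum import Enum
from typing import Optional

MAX_SPEED_M_S = 0.5        # assumed local velocity limit
HEARTBEAT_TIMEOUT_S = 0.2  # assumed watchdog window


class Verdict(Enum):
    APPLY = "apply"    # setpoint updated
    REJECT = "reject"  # command ignored
    STOP = "stop"      # setpoint forced to zero


@dataclass(frozen=True)
class Command:
    seq: int             # must increase with every command
    velocity_m_s: float
    expires_at_s: float  # absolute time on the MCU clock


class CommandGate:
    """Decides which velocity setpoint, if any, the motor loop may track."""

    def __init__(self) -> None:
        self.armed = False
        self.last_seq = -1
        self.last_heartbeat_s: Optional[float] = None
        self.setpoint_m_s = 0.0

    def heartbeat(self, now_s: float) -> None:
        self.last_heartbeat_s = now_s

    def link_alive(self, now_s: float) -> bool:
        if self.last_heartbeat_s is None:
            return False
        return now_s - self.last_heartbeat_s <= HEARTBEAT_TIMEOUT_S

    def submit(self, cmd: Command, now_s: float) -> Verdict:
        if not self.link_alive(now_s):      # stale heartbeat: stop
            self.setpoint_m_s = 0.0
            return Verdict.STOP
        if not self.armed:                  # not armed: do not energize
            self.setpoint_m_s = 0.0
            return Verdict.REJECT
        if now_s >= cmd.expires_at_s:       # expired command: reject
            return Verdict.REJECT
        if cmd.seq <= self.last_seq:        # replayed or out-of-order: ignore
            return Verdict.REJECT
        if not math.isfinite(cmd.velocity_m_s):
            # NaN would slip through min()/max() clamping, so reject it first.
            return Verdict.REJECT
        self.last_seq = cmd.seq
        self.setpoint_m_s = max(-MAX_SPEED_M_S, min(MAX_SPEED_M_S, cmd.velocity_m_s))
        return Verdict.APPLY
```

In this sketch the heartbeat is only checked when a command arrives. On real hardware the same `link_alive` test would also run inside the periodic control loop, so that a silent link stops the motor even when no new command shows up.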

This principle is very close to the argument in my micro-ROS piece on connecting a real-time microcontroller to a ROS 2 brain on Jetson. micro-ROS can make a microcontroller participate in a ROS 2 graph, but participation is not the same thing as surrendering hardware authority.

Where micro-ROS fits

micro-ROS is useful because it gives resource-constrained devices a path into a ROS 2 ecosystem.

The eProsima Micro XRCE-DDS documentation explains that Micro XRCE-DDS uses a client-server protocol where resource-constrained clients communicate with an Agent that bridges them into DDS. That is powerful because the MCU can publish and subscribe through a ROS-like model without pretending to be a workstation.

But you still need to design the contract carefully.

A good MCU contract is not:

/motor_pwm

A better contract is:

/base_controller/cmd_velocity_limited
/base_controller/status
/base_controller/fault
/base_controller/heartbeat
/base_controller/estop_state

The difference is authority.

The first interface exposes actuation.

The second exposes a bounded controller.

If the MCU is running micro-ROS, the same rule applies. Do not expose raw actuator authority unless the higher layer is itself a trusted, deterministic controller with a valid reason to own it. For LLM-mediated systems, it almost never is.

A practical authority flow

Here is a pattern I would use.

Human request
  -> LLM intent parser
  -> structured task proposal
  -> ROS 2 command validator
  -> ROS 2 action goal
  -> subsystem controller
  -> MCU bounded command
  -> motor/sensor loop

At each step, authority gets narrower.

The human can express broad intent.

The LLM can translate that intent.

ROS 2 can decide whether the robot state permits it.

The action server can run a bounded behavior.

The MCU can execute timing-critical control.

The motor driver can enforce electrical constraints.

That narrowing is what makes the architecture defensible.

The opposite pattern is dangerous:

Human request
  -> LLM
  -> shell command
  -> GPIO or serial write
  -> actuator

That may work in a demo. It is not a robot architecture.

The validation layer

The command validator is the most important piece in this design.

It should sit between LLM proposals and ROS 2 actions.

It should validate at least:

Check	Why it matters
Schema validity	The LLM output must match a known command contract.
Known intent	Unknown tools or goals must be rejected.
Robot mode	A command valid in manual mode may be invalid in autonomous mode.
State freshness	Stale localization or sensor data makes plans unsafe.
Workspace limits	Do not send goals outside allowed physical zones.
Speed/force limits	Keep high-level commands inside conservative envelopes.
Required sensors	Do not run a task if required perception is degraded.
Human proximity	Shared workspaces require stricter gating.
Timeout	Every accepted command needs expiry.
Cancellation	Every long-running action needs a cancel path.
Audit log	Every AI-mediated decision should be explainable later.

The validator does not need to be clever.

It needs to be boring, explicit, and hard to bypass.

That is the key difference between an AI demo and a cyber-physical system.
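
A "boring, explicit" validator can be little more than an ordered list of named checks, each returning a reason when it fails. The sketch below covers only a few robot-state rows from the table (mode, freshness, faults, human proximity). The `RobotState` snapshot, its field names, and the freshness threshold are hypothetical; in a real system they would be fed from ROS 2 topics and parameters.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

MAX_LOCALIZATION_AGE_S = 0.5  # assumed freshness threshold


@dataclass(frozen=True)
class RobotState:
    mode: str                        # e.g. "autonomous", "manual", "faulted"
    localization_stamp_s: float      # time of the last localization update
    active_faults: Tuple[str, ...]
    person_nearby: bool


# A check returns None when it passes, or a human-readable reason when it fails.
Check = Callable[[RobotState, float], Optional[str]]


def check_mode(state: RobotState, now_s: float) -> Optional[str]:
    return None if state.mode == "autonomous" else f"mode is {state.mode}"


def check_freshness(state: RobotState, now_s: float) -> Optional[str]:
    age = now_s - state.localization_stamp_s
    # A negative age means clocks disagree, which is also treated as unsafe.
    if 0.0 <= age <= MAX_LOCALIZATION_AGE_S:
        return None
    return f"localization age {age:.2f}s out of range"


def check_faults(state: RobotState, now_s: float) -> Optional[str]:
    if not state.active_faults:
        return None
    return "active faults: " + ", ".join(state.active_faults)


def check_proximity(state: RobotState, now_s: float) -> Optional[str]:
    return "person in workspace" if state.person_nearby else None


CHECKS: List[Tuple[str, Check]] = [
    ("robot_mode", check_mode),
    ("state_freshness", check_freshness),
    ("safety_faults", check_faults),
    ("human_proximity", check_proximity),
]


def validate(state: RobotState, now_s: float) -> Tuple[bool, List[str]]:
    """Run every check without short-circuiting so the audit log is complete."""
    accepted = True
    audit: List[str] = []
    for name, check in CHECKS:
        reason = check(state, now_s)
        audit.append(f"{name}: ok" if reason is None else f"{name}: REJECT {reason}")
        accepted = accepted and reason is None
    return accepted, audit
```

Running all checks even after the first failure is a deliberate choice here: the audit trail then records every reason a proposal was refused, not just the first one.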

A minimal command contract

The first contract should be small.

Do not give the LLM twenty tools. Give it two or three high-level capabilities with tight schemas.

Example:

{
  "tool": "request_navigation_inspection",
  "target_id": "charging_station",
  "max_speed_m_s": 0.25,
  "requires_visual_check": true,
  "timeout_s": 120
}

The validator should check:

tool is on the allowlist,
target_id exists in a known map or semantic registry,
max_speed_m_s is below the allowed mode limit,
required sensors are healthy,
localization is fresh,
no safety fault is active,
operator confirmation is satisfied if required,
timeout is within bounds.

Only then should ROS 2 send an action goal.
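
The proposal-side items in that list (allowlist, target registry, speed limit, timeout bounds) can be sketched as a single function that returns every violation it finds. All of the concrete values here are invented for illustration: `KNOWN_TARGETS`, `MODE_SPEED_LIMITS_M_S`, and `TIMEOUT_BOUNDS_S` stand in for whatever map, mode table, and policy a real robot uses. Sensor health, localization freshness, fault state, and operator confirmation depend on live robot state and are deliberately left out.

```python
import math
from typing import Any, Dict, List

ALLOWED_TOOLS = {"request_navigation_inspection"}
KNOWN_TARGETS = {"charging_station", "dock_a"}             # hypothetical registry
MODE_SPEED_LIMITS_M_S = {"autonomous": 0.5, "shared": 0.3}  # hypothetical limits
TIMEOUT_BOUNDS_S = (5, 300)                                 # hypothetical policy
EXPECTED_KEYS = {
    "tool", "target_id", "max_speed_m_s", "requires_visual_check", "timeout_s",
}


def _is_number(value: Any) -> bool:
    """True for finite ints and floats; bool is excluded on purpose."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    return math.isfinite(value)


def contract_errors(request: Dict[str, Any], mode: str) -> List[str]:
    """Return every contract violation; an empty list means the request passes."""
    errors: List[str] = []
    if set(request) != EXPECTED_KEYS:
        errors.append("fields do not match the contract")
    tool = request.get("tool")
    if not isinstance(tool, str) or tool not in ALLOWED_TOOLS:
        errors.append("tool is not on the allowlist")
    target = request.get("target_id")
    if not isinstance(target, str) or target not in KNOWN_TARGETS:
        errors.append("target_id is not in the registry")
    limit = MODE_SPEED_LIMITS_M_S.get(mode)
    speed = request.get("max_speed_m_s")
    if limit is None:
        errors.append(f"no speed limit defined for mode {mode!r}")
    elif not _is_number(speed) or not 0.0 < speed <= limit:
        errors.append("max_speed_m_s is outside the mode limit")
    if not isinstance(request.get("requires_visual_check"), bool):
        errors.append("requires_visual_check must be a boolean")
    timeout = request.get("timeout_s")
    low, high = TIMEOUT_BOUNDS_S
    if not _is_number(timeout) or not low <= timeout <= high:
        errors.append("timeout_s is out of bounds")
    return errors
```

Returning a list of reasons instead of a single boolean keeps the rejection explainable, both to the operator and to the LLM if it is asked to revise the proposal.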

This pattern works well because the LLM is still useful. It can choose the target, explain the reason, and produce the proposal. But it cannot invent a new actuation path.

ROS 2 actions are often the right boundary

For AI-mediated robot tasks, ROS 2 actions are usually a better boundary than services or topics.

Why?

Because actions provide:

a goal,
feedback,
status,
result,
cancellation.

The ROS 2 action documentation describes actions as long-running remote procedure calls with feedback and cancellation. That is exactly the shape of many robot tasks: navigate to a waypoint, inspect an object, dock, recover from a fault, run a calibration sequence, or perform a bounded manipulation routine.

An LLM should not stream commands.

It should request an action.

Then the robot stack should report progress in a way that the LLM can summarize for the operator without controlling the low-level loop.
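
To show why that shape is a useful boundary, here is a plain-Python model of an action goal's lifecycle. It is not rclpy code. The state names follow the goal state machine described in the ROS 2 actions design article, and the transition table is my reading of that design; the `GoalHandle` class, its methods, and the event strings are a hypothetical model for illustration.

```python
from enum import Enum
from typing import Dict, List, Tuple


class GoalState(Enum):
    ACCEPTED = "accepted"
    EXECUTING = "executing"
    CANCELING = "canceling"
    SUCCEEDED = "succeeded"
    CANCELED = "canceled"
    ABORTED = "aborted"


# (current state, event) -> next state. Anything not listed is illegal.
TRANSITIONS: Dict[Tuple[GoalState, str], GoalState] = {
    (GoalState.ACCEPTED, "execute"): GoalState.EXECUTING,
    (GoalState.ACCEPTED, "cancel_goal"): GoalState.CANCELING,
    (GoalState.EXECUTING, "cancel_goal"): GoalState.CANCELING,
    (GoalState.EXECUTING, "succeed"): GoalState.SUCCEEDED,
    (GoalState.EXECUTING, "abort"): GoalState.ABORTED,
    (GoalState.CANCELING, "canceled"): GoalState.CANCELED,
    (GoalState.CANCELING, "succeed"): GoalState.SUCCEEDED,
    (GoalState.CANCELING, "abort"): GoalState.ABORTED,
}


class GoalHandle:
    """Tracks one accepted goal; feedback is only legal while it is active."""

    def __init__(self) -> None:
        self.state = GoalState.ACCEPTED
        self.feedback: List[str] = []

    def fire(self, event: str) -> None:
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"{event!r} is not allowed in state {self.state.name}")
        self.state = TRANSITIONS[key]

    def publish_feedback(self, message: str) -> None:
        if self.state not in (GoalState.EXECUTING, GoalState.CANCELING):
            raise ValueError(f"no feedback in state {self.state.name}")
        self.feedback.append(message)
```

In this model the LLM-facing side would only ever read `feedback` and `state`, and at most request `cancel_goal` through the supervisor. It has no event that reaches into the control loop itself.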

How to choose the boundary

Use this table when deciding where a behavior belongs.

Question	If yes	Authority layer
Does it require deterministic timing below 100 ms?	Keep it out of the LLM and usually out of high-level ROS 2 nodes.	MCU or real-time controller
Does it directly energize hardware?	Require local interlocks and mode checks.	MCU plus safety layer
Does it need robot-wide state?	Validate centrally.	ROS 2 supervisor
Does it take seconds or minutes and need feedback?	Use an action.	ROS 2 action server
Does it require natural language or procedure interpretation?	Let the LLM propose, not execute.	LLM plus validator
Can it harm hardware or people if wrong?	Reduce autonomy and require stronger checks.	Supervisor plus safety function

The more physical consequence a command has, the more boring the interface should be.

That is a good rule.

Failure modes this architecture prevents

This split is not just cleaner. It prevents concrete failures.

Failure mode	What goes wrong without boundaries	What the authority split does
Hallucinated tool	LLM invents a command name or parameter	Schema validator rejects it
Stale state	Robot acts on old localization or sensor data	ROS 2 supervisor checks freshness
Slow model call	Control loop waits on an LLM response	MCU loop never depends on model latency
Unsafe target	Robot navigates outside allowed area	Supervisor rejects workspace violation
Lost link	Jetson or ROS graph drops connection	MCU watchdog expires to safe state
Long task cannot stop	Robot continues despite operator change	ROS 2 action cancellation path exists
Bad mode	AI command runs while robot is manual or faulted	Mode gate blocks it
Over-broad command	LLM asks for “go faster”	Limits clamp or reject the request

The point is not to make the LLM perfect.

The point is to make LLM imperfection survivable.

How this differs from a normal AI agent stack

In a normal software agent, the hard part is often reliability: JSON contracts, retries, tool schemas, validation, logs.

I wrote about that in how I built an AI agent architecture.

In robotics, those patterns still matter, but they are not enough.

A physical AI agent has extra constraints:

time,
energy,
heat,
motion,
sensor uncertainty,
local safety,
actuator wear,
communication loss,
humans near the machine.

That is why the validation layer cannot be only a JSON parser.

It must be a robot-state validator.

It must understand modes, faults, stale data, action status, safety envelope, and degraded behavior.

Recommended reference architecture

For a Jetson-based robot, I would start with this:

LLM process
  - natural-language interface
  - retrieval over manuals/logs
  - structured task proposal only

ROS 2 supervisor
  - validates LLM proposal
  - owns robot mode
  - checks state freshness
  - sends action goals
  - records decision logs
  - cancels or degrades when needed

ROS 2 subsystem nodes
  - navigation action server
  - perception health
  - localization status
  - diagnostics
  - operator UI

microcontroller
  - motor loop
  - encoder reads
  - watchdog
  - emergency local behavior
  - command expiry
  - bounded velocity or actuator contracts

This is not the only architecture. But it is a good default because every layer has a reason to exist.

Design principles

The LLM proposes. It does not command hardware.
ROS 2 validates. It does not abdicate to the LLM.
The MCU executes bounded real-time control. It does not interpret user intent.
Every AI-originated request must be typed, logged, bounded, cancellable, and rejectable.
Every actuator-facing command must expire.
Every long-running task should expose feedback and cancellation.
Every safety-relevant state must be observable outside the LLM context.
The robot must have useful behavior when the LLM, network, or Jetson process fails.

If a design violates one of these principles, it may still work as a demo.

It is probably not ready to be trusted near hardware.

FAQ

Should an LLM ever publish directly to a ROS 2 topic?

It can publish to a non-actuating intent or explanation topic, but it should not publish directly to actuator command topics. A better pattern is for the LLM to produce a structured proposal that a ROS 2 supervisor validates before any action goal or hardware-facing command is created.

Is ROS 2 real-time enough for motor control?

ROS 2 can be used in systems with real-time constraints, but hard low-level motor loops often belong in a microcontroller, motor controller, or real-time control process. Use ROS 2 for orchestration, state, supervision and actions; use the MCU for deterministic loops and hardware interlocks.

Where does micro-ROS fit in this architecture?

micro-ROS is useful when a microcontroller needs to participate in a ROS 2 system, publish status, receive bounded commands, or expose sensor data. It should not turn the MCU into a passive wire adapter. The MCU should still enforce local timing, command expiry, limits, and safety behavior.

Should AI robot commands be services or actions?

For long-running robot behaviors, actions are usually the better boundary because they provide feedback, status, result, and cancellation. Services are better for short checks or configuration queries. Topics are better for continuous streams such as state, diagnostics, sensor data and health.

What is the safest first tool to expose to an LLM?

Start with read-only tools: summarize robot state, inspect logs, explain active faults, search documentation, or propose a task without executing it. After that, expose one bounded action proposal with strict validation. Do not start with raw motor, GPIO, relay or serial-write tools.

How do I know the authority split is too loose?

If the LLM can create new command names, choose raw actuator values, bypass robot modes, ignore stale state, send commands without expiry, or trigger physical action without a validator, the split is too loose.

Final thought

The best AI robot architecture does not make the LLM the brain.

It makes the LLM a semantic layer inside a stack that still respects physics.

ROS 2 should supervise the robot as a distributed system. The microcontroller should protect the timing and hardware boundary. The LLM should help humans and high-level autonomy reason about goals, context and procedures.

That split is less flashy than a single “AI brain” diagram.

It is also much closer to how reliable physical systems should be built.