AppleOS 26 introduces LowLevelInstanceData, which can reduce CPU draw calls significantly through instancing. However, I have run into trouble animating each individual instance.
As I wanted low-level control, I'm using a custom system and LowLevelInstanceData.replace(using:) to update the transforms each frame. The update closure itself is extremely efficient (Xcode Instruments reports nearly no cost), but I noticed extremely high run loop time, reaching around 20 ms. Time Profiler shows that the CPU is blocked in kernel.release.t6401.
I think it is caused by synchronization between the CPU and GPU. However, since I am already using an MTLCommandBuffer to coordinate it, I don't understand why I am still seeing such high CPU time.
The 20ms CPU stall you're seeing is likely caused by a CPU/GPU synchronization hazard when updating the instance transform buffer.
LowLevelInstanceData provides three distinct methods for writing transform data, each with different synchronization behavior:
- withMutableTransforms: Gives you a mutable pointer to the current backing buffer. If the GPU is still reading this buffer from the previous frame's render, the CPU will block until the GPU finishes. This is the most likely cause of the stall you're seeing: the kernel.release.t6401 time in the Time Profiler is the CPU waiting for the GPU to release the buffer.
- replaceMutableTransforms: Gives you a mutable pointer to a fresh buffer. RealityKit handles buffer rotation internally, so there's no synchronization stall. When your closure completes, RealityKit swaps the new buffer in for subsequent renders. This is the correct method for per-frame CPU-side animation.
- replace(using:) with an MTLCommandBuffer: Returns an MTLBuffer for GPU compute shader writes. Best for very large instance counts where GPU parallelism outperforms CPU iteration.
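To make the difference between the first two methods concrete, here is a minimal, pure-Swift model of in-place writes versus buffer rotation. It is only a sketch and not RealityKit's implementation: ModelInstanceStore, Slot, stallCount, render(), and simulate are hypothetical names, positions stored as SIMD3<Float> stand in for the real per-instance transforms, and the model assumes the GPU is never more than one frame behind.

```swift
// Hypothetical model of the two CPU write paths; not RealityKit's implementation.
struct Slot {
    var transforms: [SIMD3<Float>]   // stand-in for per-instance transform matrices
    var gpuIsReading = false
}

final class ModelInstanceStore {
    private(set) var slots: [Slot]
    private(set) var current = 0      // slot the renderer reads next
    private(set) var stallCount = 0   // how often the CPU would have blocked

    init(instanceCount: Int, slotCount: Int = 3) {
        precondition(slotCount >= 2, "rotation needs at least two slots")
        let zero = [SIMD3<Float>](repeating: .zero, count: instanceCount)
        slots = Array(repeating: Slot(transforms: zero), count: slotCount)
    }

    /// In-place write: has to wait if the GPU still reads the current slot.
    func withMutableTransforms(_ body: (inout [SIMD3<Float>]) -> Void) {
        if slots[current].gpuIsReading {
            stallCount += 1                      // models the blocking wait
            slots[current].gpuIsReading = false  // the GPU has now finished
        }
        body(&slots[current].transforms)
    }

    /// Rotating write: fill a slot the GPU is not reading, then make it current.
    /// The slot holds stale data, so the closure should write every instance.
    func replaceMutableTransforms(_ body: (inout [SIMD3<Float>]) -> Void) {
        let next = (current + 1) % slots.count
        body(&slots[next].transforms)
        current = next
    }

    /// Simulates the renderer starting to read the current slot.
    /// Assumption: every other slot has been released by then.
    func render() {
        for i in slots.indices { slots[i].gpuIsReading = (i == current) }
    }
}

/// Runs a per-frame animation and returns how many times the CPU stalled.
func simulate(frames: Int, useReplace: Bool) -> Int {
    let store = ModelInstanceStore(instanceCount: 4)
    for frame in 0..<frames {
        let animate: (inout [SIMD3<Float>]) -> Void = { t in
            for i in t.indices { t[i] = SIMD3(Float(i), Float(frame), 0) }
        }
        if useReplace {
            store.replaceMutableTransforms(animate)
        } else {
            store.withMutableTransforms(animate)
        }
        store.render()
    }
    return store.stallCount
}
```

In this model the in-place path stalls on every frame after the first, while the rotating path never waits, because it always writes into a slot the renderer has already released. The same closure is passed to both methods, so switching between them changes a single call.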
This three-tier pattern (with… / withMutable… / replaceMutable…) is consistent across all of RealityKit's low-level types (LowLevelBuffer, LowLevelMesh, LowLevelInstanceData).
Regarding your MTLCommandBuffer coordination — RealityKit manages its own internal render pipeline, so your command buffer doesn't affect when RealityKit reads the instance data. The synchronization needs to happen through the LowLevelInstanceData API itself (via the replace variants), not through an external command buffer.
If you're currently using withMutableTransforms for your per-frame updates, switching to replaceMutableTransforms should eliminate the stall — the closure signature is the same, so it's a one-line change.
If this doesn't resolve the issue, could you share the code where you're updating the transforms each frame? Seeing the actual update call and the surrounding context would help narrow down what's happening.