Extend Residual Guidance Factors

Background: Vision-Language-Action (VLA) models lose performance in cluttered scenes because of instance-level grounding failures, where the policy fails to associate the instruction with the correct object instance.

Question / Future Work: Extend the residual guidance mechanism of Target-Agnostic Guidance (TAG) to control factors beyond visual nuisances, such as temporal dynamics or task-specific constraints, to further improve policy robustness and generalization.
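
One way to picture the proposed extension is a guidance rule that sums one residual per controlled factor. The sketch below is illustrative only and is not taken from TAG: it assumes each residual is the difference between the policy output on the unmodified input and the output on an input with that factor's cue removed, in the style of classifier-free guidance. The function `guided_action`, the factor names, and the weights are all hypothetical.

```python
from typing import Dict, List, Sequence


def guided_action(
    full: Sequence[float],
    ablated: Dict[str, Sequence[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Add one weighted residual per factor to the full-input prediction.

    full:    policy output on the unmodified input (assumed given).
    ablated: per-factor outputs with that factor's cue removed
             (hypothetical; how to ablate each factor is the open question).
    weights: per-factor guidance scales; a missing or 0.0 weight
             disables that factor.
    """
    out = list(full)
    for name, base in ablated.items():
        if len(base) != len(full):
            raise ValueError(f"shape mismatch for factor {name!r}")
        w = weights.get(name, 0.0)
        for i, (f, b) in enumerate(zip(full, base)):
            # Residual (f - b) isolates what this factor contributes.
            out[i] += w * (f - b)
    return out


# Toy example with two hypothetical factors.
action = guided_action(
    full=[1.0, 2.0],
    ablated={"visual": [0.5, 2.0], "temporal": [1.0, 1.0]},
    weights={"visual": 2.0, "temporal": 0.5},
)
print(action)  # [2.0, 2.5]
```

With all weights at zero the rule returns the unguided prediction, so each additional factor can be switched on and tuned independently.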
