Habitat Navigation Challenge 2023 [4]

Object navigation. NOT instruction following.

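SPL: Success weighted by (inverse) Path Length

$$\mathrm{SPL} = \frac{1}{N} \sum_{i=1}^{N} S_i \, \frac{\ell_i}{\max(p_i, \ell_i)}$$

where,
$N$ = number of test episodes
$\ell_i$ = length of shortest path between start and goal for episode $i$
$p_i$ = length of path taken by agent in episode $i$
$S_i$ = binary indicator of success in episode $i$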
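
A minimal Python sketch of how SPL could be computed from per-episode results. The function name `spl` and its argument names are illustrative, not taken from the challenge code, and treating a degenerate episode (both lengths zero) as contributing 0 is an assumption made here.

```python
from typing import Sequence


def spl(
    successes: Sequence[bool],
    shortest_lengths: Sequence[float],
    agent_lengths: Sequence[float],
) -> float:
    """Average over episodes of success * shortest / max(agent, shortest)."""
    if not (len(successes) == len(shortest_lengths) == len(agent_lengths)):
        raise ValueError("all inputs must have one entry per episode")
    if len(successes) == 0:
        raise ValueError("need at least one episode")

    total = 0.0
    for success, shortest, taken in zip(successes, shortest_lengths, agent_lengths):
        denom = max(taken, shortest)
        # Assumption: an episode with zero-length paths contributes 0.
        if success and denom > 0:
            total += shortest / denom
    return total / len(successes)


# Three episodes: optimal success, success with a 2x detour, failure.
score = spl([True, True, False], [10.0, 10.0, 10.0], [10.0, 20.0, 10.0])
```

A failed episode contributes 0 regardless of its path length, and a successful episode is discounted only when the agent's path is longer than the shortest path.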

RxR-Habitat Challenge - CVPR 2021 Embodied AI Workshop [5]

Instruction following.

Primary

nDTW: Normalized Dynamic Time Warping

“RxR does not have a shortest path prior, we care more about an agent’s ability to follow a path than its ability to reach the specific endpoint of the path.” [5]

nDTW scores how faithfully an agent’s trajectory shadows a reference path. It aligns the two sequences with Dynamic Time Warping, sums the point-wise distances along the optimal alignment, then normalizes by the reference length and a goal-success radius before passing the result through a negative exponential. The output is a smooth 0-to-1 similarity value that is order-aware, density-agnostic, and works on either graph or continuous representations. Higher means the agent stayed closer to the intended route throughout. (one paragraph explanation courtesy of o3, adapted from [3])
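
A rough Python sketch of the computation described in [3], where nDTW(R, Q) = exp(-DTW(R, Q) / (|R| * d_th)) for reference path R, agent path Q, and success threshold d_th. Euclidean distance between points is an assumption here (graph-based setups typically use shortest-path distance instead), and the names `dtw`, `ndtw`, and `success_radius` are illustrative.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def dtw(reference: Sequence[Point], query: Sequence[Point]) -> float:
    """Minimum summed point-wise distance over all monotonic alignments."""
    n, m = len(reference), len(query)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Assumption: Euclidean distance between path points.
            d = math.dist(reference[i - 1], query[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def ndtw(reference: Sequence[Point], query: Sequence[Point], success_radius: float) -> float:
    """exp(-DTW / (|R| * d_th)); equals 1.0 when the paths coincide."""
    if not reference or not query:
        raise ValueError("both paths must be non-empty")
    return math.exp(-dtw(reference, query) / (len(reference) * success_radius))


ref = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
same = ndtw(ref, ref, success_radius=3.0)
# Agent path runs parallel to the reference, offset by 1 unit.
shifted = ndtw(ref, [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)], success_radius=3.0)
```

The dynamic program is quadratic in the path lengths, which is fine for short trajectories; the value of `success_radius` is task-specific and is chosen arbitrarily in this example.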

Secondary

NE: Navigation Error

Measures the distance between the last node in the predicted path and the last node in the reference path. [1]

SR: Success Rate

Measures how often the last node in the predicted path is within a threshold distance of the last node in the reference path. [1]
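
NE and SR can be sketched in a few lines of Python. Euclidean distance is assumed here for simplicity (benchmarks often use shortest-path or geodesic distance in the environment instead), and the function names and the `threshold` parameter are illustrative.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, ...]
Path = Sequence[Point]


def navigation_error(predicted: Path, reference: Path) -> float:
    """Distance between the final points of the two paths."""
    # Assumption: Euclidean distance stands in for the environment's metric.
    return math.dist(predicted[-1], reference[-1])


def success_rate(episodes: Sequence[Tuple[Path, Path]], threshold: float) -> float:
    """Fraction of (predicted, reference) pairs that end within the threshold."""
    if not episodes:
        raise ValueError("need at least one episode")
    hits = sum(
        1 for predicted, reference in episodes
        if navigation_error(predicted, reference) <= threshold
    )
    return hits / len(episodes)


reference = [(0.0, 0.0), (0.0, 0.0)]
near = [(0.0, 0.0), (1.0, 0.0)]   # ends 1 unit from the reference endpoint
far = [(0.0, 0.0), (3.0, 4.0)]    # ends 5 units from the reference endpoint
ne_far = navigation_error(far, reference)
sr = success_rate([(near, reference), (far, reference)], threshold=3.0)
```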

SPL: Success weighted by (inverse) Path Length

Equipped with a binary definition of episodic success, we conduct $N$ test episodes. In each episode, the agent is tasked with navigating to a goal. Let $\ell_i$ be the shortest path distance from the agent’s starting position to the goal in episode $i$, and let $p_i$ be the length of the path actually taken by the agent in this episode. Let $S_i$ be a binary indicator of success in episode $i$. We define a summary measure of the agent’s navigation performance across the test set as follows: [2]

$$\frac{1}{N} \sum_{i=1}^{N} S_i \, \frac{\ell_i}{\max(p_i, \ell_i)}$$

SDTW: Success weighted by Normalized Dynamic Time Warping

SDTW folds goal completion into nDTW by multiplying the similarity score by a binary success flag: if the agent ends within the specified success radius, SDTW equals its nDTW; otherwise it is forced to zero. (one paragraph explanation courtesy of o3, adapted from [3])
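
As a small illustration, the gating step could look like the following Python sketch. It assumes the nDTW score and the agent's final distance to the goal have already been computed elsewhere; the function and parameter names are hypothetical.

```python
def sdtw(ndtw_score: float, final_distance: float, success_radius: float) -> float:
    """Return nDTW if the episode ends within the success radius, else 0."""
    success = final_distance <= success_radius
    return ndtw_score if success else 0.0


# Same path fidelity, but only the first episode stops close enough to the goal.
kept = sdtw(0.8, final_distance=2.0, success_radius=3.0)
zeroed = sdtw(0.8, final_distance=4.0, success_radius=3.0)
```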

PL: Path Length

Measures the total length of the predicted path; the optimal value equals the length of the reference path. [1]
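
For a path given as a list of coordinates, PL can be sketched as the sum of segment lengths. This assumes Euclidean segments between consecutive points; on a navigation graph the edge lengths would be summed instead. The name `path_length` is illustrative.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def path_length(path: Sequence[Point]) -> float:
    """Sum of straight-line distances between consecutive path points."""
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))


# Two segments of length 5 and 2.
total = path_length([(0.0, 0.0), (3.0, 4.0), (3.0, 6.0)])
```

A path with fewer than two points has length 0 under this definition.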


References

  1. Stay on the Path: Instruction Fidelity in Vision-and-Language Navigation https://arxiv.org/abs/1905.12255
  2. On Evaluation of Embodied Navigation Agents https://arxiv.org/abs/1807.06757
  3. General Evaluation for Instruction Conditioned Navigation using Dynamic Time Warping https://arxiv.org/abs/1907.05446
  4. Habitat Navigation Challenge 2023 https://aihabitat.org/challenge/2023/
  5. RxR-Habitat Challenge - CVPR 2021 Embodied AI Workshop https://www.youtube.com/watch?v=YGwHGgD-9gQ