Upload Harbor agent runs. Visualize benchmark tasks in an interactive embedding space. Understand how your tasks compare.
Generate an API key, then push Harbor job directories straight from your shell or CI.
You can upload tasks too: trajectories-sh upload task ./my-task/
Or install globally: npm i -g trajectories-sh
Uploads are unlisted by default. Flip to public or private from the trajectory page after upload. The CLI also supports --slug and browser auth via trajectories-sh auth login.
Push agent trajectory jobs via CLI or the web UI. Every upload automatically gets a Harbor viewer — see every step, screenshot, and tool call.
Each trajectory gets an embedded Harbor viewer with step-by-step replay, screenshots, agent logs, verifier output, and terminal recordings.
Explore Terminal Bench 2, proposed Terminal Bench 3 community tasks, and your own tasksets — all plotted in an interactive 2D and 3D embedding space.
See pass rates, run counts, and how your tasks compare to public benchmarks across tasksets.
Keep trajectories private, share via unlisted link, or publish publicly. You control who sees your data.
Trajectories are linked to their benchmark task via Harbor checksums, so you can see all runs for any task.
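To illustrate how checksum-based linking can work in general, here is a minimal sketch. It assumes a task's checksum is a SHA-256 digest over its sorted relative file paths and file contents; the actual Harbor checksum scheme is not described here and may differ. The names directory_checksum and group_runs_by_task are hypothetical helpers, not part of Harbor or the trajectories-sh CLI.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def directory_checksum(root: str) -> str:
    """Hypothetical content checksum for a task directory.

    Assumption: SHA-256 over sorted relative paths and file bytes.
    This is NOT necessarily how Harbor computes its checksums.
    """
    base = Path(root)
    # Collect (relative POSIX path, absolute path) pairs and sort them so
    # the digest does not depend on filesystem traversal order.
    files = sorted(
        (p.relative_to(base).as_posix(), p)
        for p in base.rglob("*")
        if p.is_file()
    )
    digest = hashlib.sha256()
    for rel, path in files:
        digest.update(rel.encode("utf-8"))
        digest.update(b"\0")  # separator between path and content
        digest.update(path.read_bytes())
        digest.update(b"\0")  # separator between files
    return digest.hexdigest()


def group_runs_by_task(runs):
    """Group run ids by task checksum.

    `runs` is an iterable of (run_id, task_checksum) pairs; the result maps
    each checksum to the list of run ids recorded against that task.
    """
    grouped = defaultdict(list)
    for run_id, checksum in runs:
        grouped[checksum].append(run_id)
    return dict(grouped)
```

Under this assumption, two runs against byte-identical copies of a task directory produce the same checksum and land in the same group, while any change to the task's files yields a different checksum and therefore a separate task entry.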
Sign in to get an API key, upload trajectory jobs from the web UI, link private GitHub repos for automatic task ingestion, and manage visibility for everything you upload.