TransportationBench
TransportationBench evaluates AI performance across transportation domains using standardized benchmarks. Each TB-{domain} index combines knowledge evaluation (certification exams) and task evaluation (real-world deliverables) into a single score.
| Model | TB-Safety | TB-Operations (coming soon) | TB-Planning (coming soon) | TB-Aviation (coming soon) |
|---|---|---|---|---|
| Claude Opus 4.6 | 56.3% | -- | -- | -- |
| GPT-5.2 | 53.6% | -- | -- | -- |
| Claude Haiku 4.5 | 51.8% | -- | -- | -- |
| GPT-5 mini | 47.4% | -- | -- | -- |
| Gemini 3 Pro Preview | 44.8% | -- | -- | -- |
| Gemini 3 Flash Preview | 41.4% | -- | -- | -- |
| Claude Sonnet 4.5 | -- | -- | -- | -- |
| Claude Sonnet 4.6 | -- | -- | -- | -- |
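The description above says each TB-{domain} index folds a knowledge evaluation and a task evaluation into one score, but it does not state how the two are weighted. As a rough illustration only, the sketch below assumes a simple weighted average with equal weights by default. The names `DomainResult`, `composite_index`, and `knowledge_weight` are hypothetical, and the sample numbers are made up for the example; they are not TransportationBench results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainResult:
    """Hypothetical per-domain results for one model, both on a 0-100 scale."""
    knowledge_pct: float  # score on certification-exam style questions
    task_pct: float       # score on real-world deliverable tasks


def composite_index(result: DomainResult, knowledge_weight: float = 0.5) -> float:
    """Combine the two component scores into a single index.

    ASSUMPTION: a weighted average. The actual TransportationBench
    weighting is not given in the source text.
    """
    if not 0.0 <= knowledge_weight <= 1.0:
        raise ValueError("knowledge_weight must be between 0 and 1")
    task_weight = 1.0 - knowledge_weight
    return knowledge_weight * result.knowledge_pct + task_weight * result.task_pct


# Made-up component scores, used only to show the arithmetic.
example = DomainResult(knowledge_pct=70.0, task_pct=40.0)
print(f"{composite_index(example):.1f}%")  # prints "55.0%"
```

Under this assumption, a model that does well on exam questions but poorly on deliverables lands in the middle of the scale, which is one reason a combined index can rank models differently from either component alone.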
TB-Safety
Safety is a core mission for transportation agencies and a top AI use case. TB-Safety evaluates whether AI can reliably support safety workflows.
TB-Operations
Coming Soon
Traffic management and ITS operations are rapidly adopting AI. TB-Operations evaluates whether AI can support operational workflows.
TB-Planning
Coming Soon
Long-range planning and demand modeling increasingly rely on AI tools. TB-Planning evaluates whether AI can support transportation planning workflows.
TB-Aviation
Coming Soon
Drone operations and airspace management are emerging AI use cases. TB-Aviation evaluates AI readiness for aviation-related transportation tasks.