MineBench AI voxel build benchmark
Unofficial Benchmark
Spatial Intelligence Test
MineBench is an AI (LLM) benchmark for Minecraft-style voxel builds. Models must generate raw JSON block coordinates with no access to images or 3D tools. We visualize their pure code output here.
Pure Logic
Models derive 3D coordinates blind, using only math and spatial reasoning. They are allowed to execute Python code to help create the JSON: they are given a custom voxelBuilder tool that provides primitive functions such as cube, sphere, and square.
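To make the idea concrete, here is a minimal sketch of how primitive helpers could turn simple math into a JSON list of blocks. It is not the actual voxelBuilder tool: the function names `cube`, `sphere`, and `merge`, their parameters, the block type strings, and the `{"blocks": [...]}` layout are all assumptions made for illustration.

```python
import json

# Hypothetical helpers; the real voxelBuilder API and JSON schema may differ.

def cube(x0, y0, z0, size, block="stone"):
    """Filled axis-aligned cube with its minimum corner at (x0, y0, z0)."""
    return [
        {"x": x, "y": y, "z": z, "type": block}
        for x in range(x0, x0 + size)
        for y in range(y0, y0 + size)
        for z in range(z0, z0 + size)
    ]

def sphere(cx, cy, cz, radius, block="glass"):
    """Filled sphere: every integer point within `radius` of the center."""
    r = int(radius)
    return [
        {"x": cx + dx, "y": cy + dy, "z": cz + dz, "type": block}
        for dx in range(-r, r + 1)
        for dy in range(-r, r + 1)
        for dz in range(-r, r + 1)
        if dx * dx + dy * dy + dz * dz <= radius * radius
    ]

def merge(*parts):
    """Combine parts; a later part overwrites an earlier one at the same coordinate."""
    by_pos = {}
    for part in parts:
        for b in part:
            by_pos[(b["x"], b["y"], b["z"])] = b
    return list(by_pos.values())

# A 3x3x3 stone base with a small glass sphere centered one block above it.
build = merge(cube(0, 0, 0, 3), sphere(1, 4, 1, 1))
build_json = json.dumps({"blocks": build})
```

The model never sees a rendering of `build`; it has to reason about where each primitive lands from the coordinates alone.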
Elo Rated
Builds are ranked via head-to-head voting, creating a live leaderboard of spatial skill.
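For readers unfamiliar with Elo, the sketch below shows the standard rating update applied to a single head-to-head vote. The K-factor of 32 and the 400-point scale are the conventional textbook values, assumed here; MineBench's actual parameters and any adjustments it makes are not stated on this page.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update_elo(rating_a, rating_b, score_a, k=32.0):
    """Return new (rating_a, rating_b) after one vote.

    score_a is 1.0 if build A won the vote, 0.0 if it lost, 0.5 for a tie.
    k=32 is an assumed K-factor, not a documented MineBench setting.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta

# Two equally rated models; A wins the vote.
new_a, new_b = update_elo(1500.0, 1500.0, 1.0)
```

Because the gain for one side equals the loss for the other, the total rating across both models is unchanged by a vote, and an upset against a higher-rated model moves the ratings more than an expected win does.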
Recorded Data
Prompts, generations, and votes are stored to compute rankings and track performance.
Want to test a model yourself?
Enter any prompt to generate a 3D build in the Sandbox.