Practical 3D Benchmarks Emerge as Microsoft Pulls Claude Access
Practical evaluations for spatial reasoning in LLMs are appearing at the same time access to established coding tools is being curtailed. These shifts force engineers to reassess both how they measure model capability on geometry tasks and which providers they can rely on for daily work. The pattern points to tooling that is becoming more specialized yet less stable.
Model Releases
Antigravity 2.0 Tops OpenSCAD 3D LLM Benchmark
A small benchmark tested several AI coding tools on generating OpenSCAD code for the Pantheon from architectural reference images covering the rotunda, dome, portico, columns, and pediment, with Antigravity 2.0 ranking first. The evaluation used the OpenSCAD CLI to render previews and iterate on the output, which makes it a harder test than a basic syntax check such as a cube with a hole.
This kind of test measures whether an LLM can turn visual references into usable parametric CAD, which in turn determines which geometry-heavy features teams can realistically ship in engineering workflows. Teams working on spatial applications gain one concrete, if narrow, signal for model selection on these tasks.
The benchmark remains limited to the single Pantheon task, leaving broader applicability across other 3D modeling scenarios untested.
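The render-and-iterate step described above can be sketched in a few lines of Python. This is a minimal illustration, not the benchmark's actual harness, whose invocation is not described in the source. The helper names (build_render_command, render_preview), the file layout, and the image size are assumptions. The flags used (-o, --imgsize, --autocenter, --viewall) are standard OpenSCAD command-line options.

```python
import shutil
import subprocess
from pathlib import Path


def build_render_command(scad_path, png_path, size=(800, 600)):
    """Build an OpenSCAD CLI invocation that renders a .scad file to a PNG preview."""
    return [
        "openscad",
        "-o", str(png_path),                  # output format is inferred from the extension
        f"--imgsize={size[0]},{size[1]}",     # preview resolution (assumed value)
        "--autocenter",
        "--viewall",                          # frame the whole model in the image
        str(scad_path),
    ]


def render_preview(scad_source, workdir, attempt):
    """Write one candidate model to disk and try to render it.

    Returns (ok, stderr_text). In an iteration loop, stderr_text is what
    would be fed back to the model alongside the rendered image.
    """
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    scad_path = workdir / f"attempt_{attempt}.scad"
    png_path = workdir / f"attempt_{attempt}.png"
    scad_path.write_text(scad_source)

    # Fail softly when OpenSCAD is not installed instead of raising.
    if shutil.which("openscad") is None:
        return False, "openscad binary not found on PATH"

    result = subprocess.run(
        build_render_command(scad_path, png_path),
        capture_output=True,
        text=True,
        timeout=300,
    )
    return result.returncode == 0 and png_path.exists(), result.stderr
```

A harness would call render_preview once per model attempt and pass the PNG plus any error text back into the next prompt. On headless Linux machines, PNG export may need a virtual display, so treat the subprocess call as something to verify in your own environment.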
Industry & Company News
Microsoft Cancels Claude Code Licenses
Microsoft has begun discontinuing Claude Code licenses for users. The change directly affects developers who have been routing coding work through Microsoft integrations that previously provided access to Claude.
Engineers must now identify replacement paths for any workflows that depended on those licenses, which adds friction to ongoing projects. Migration details remain sparse, so teams face immediate uncertainty about continuity.
Without clear reasons or documented alternatives, the disruption leaves open questions about how quickly affected users can restore equivalent capability.
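One way to reduce exposure to this kind of change is to route requests through a thin layer that tries providers in order. The sketch below is a hypothetical pattern, not anything Microsoft or Anthropic documents. Each provider is a plain callable supplied by the caller, and the names complete_with_fallback and AllProvidersFailed are invented for illustration.

```python
class AllProvidersFailed(RuntimeError):
    """Raised when every configured provider rejected or failed the request."""


def complete_with_fallback(prompt, providers):
    """Try each (name, call) pair in order.

    Returns (name, reply) from the first provider whose callable succeeds.
    `call` is any function taking a prompt string and returning a reply string.
    """
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:
            # A real client would catch narrower errors (revoked license,
            # quota, network) and decide per error type whether to fall through.
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))
```

Keeping the provider list in configuration rather than code means a revoked license becomes a one-line change instead of a rewrite, though prompts and output parsing usually still need retesting against the replacement model.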
Quick Takes
LLM-Directed Blog Post Published
A newly published blog post addresses LLMs directly and gives them specific reading instructions. The format is unusual but requires no tooling changes.
Bottom Line
Engineers will increasingly need to maintain multiple evaluation harnesses for domain-specific tasks while preparing contingency plans for sudden provider access changes.