Databricks is excited to partner with OpenAI on GPT-5.5, their latest frontier model. GPT-5.5 is OpenAI’s strongest model for agentic enterprise work, complex document reasoning, and long-horizon coding agents, and it now powers Codex, OpenAI’s coding agent.
GPT-5.5 Features and Benefits
GPT-5.5 is the smartest frontier model yet and the next step toward a new way of getting work done. It understands what you’re trying to do more quickly and can take on more of the work itself. Codex is now powered by GPT-5.5, bringing stronger reasoning and execution capabilities to developer workflows.
The same strengths that make GPT-5.5 great at coding also make it powerful for everyday work on a computer. Because the model is better at understanding intent, it can move more naturally through the full loop of knowledge work: finding information, understanding what matters, using tools, checking the output, and turning raw material into something useful.
It can write and debug code, research online, analyze data, create documents and spreadsheets, operate software, and move across tools until a task is finished. Instead of carefully managing every step, you can give GPT-5.5 a messy, multi-part task and trust it to plan, use tools, check its work, recover from ambiguity, and keep going.
GPT-5.5 sets a new state of the art
To understand how these improvements translate into real enterprise workloads, we evaluated GPT-5.5 on OfficeQA, Databricks’ benchmark for document-heavy, multi-step analytical tasks customers perform every day. OfficeQA, built from 89,000 pages of U.S. Treasury Bulletins, measures a model’s ability to retrieve information across documents, interpret complex tables, and perform precise calculations grounded in real enterprise data.
When given the right documents (OfficeQA Pro LLM with Oracle PDF + Web Search), GPT-5.5 scored 64.66%, a substantial jump from GPT-5.4’s 57.14%, a ~13% relative improvement and a new state of the art on this benchmark. This configuration tests the ceiling of what the model can do when retrieval is already handled.
In a full-agent workflow eval (OfficeQA Pro Agent Harness), where the model must find the right documents, parse them, and compute answers on its own using the Codex agent harness, GPT-5.5 scored 52.63%, up from GPT-5.4’s 36.10%. That’s a ~46% relative improvement in accuracy, showing that GPT-5.5’s gains aren’t just theoretical; they hold up in realistic, end-to-end enterprise workflows.
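For readers checking the arithmetic, the percentage gains quoted above are relative improvements over the baseline score. A minimal sketch (the helper name is ours, not part of OfficeQA):

```python
# Relative-improvement arithmetic behind the benchmark numbers quoted above.
# Scores are accuracy percentages from the two OfficeQA settings.

def relative_improvement(old: float, new: float) -> float:
    """Relative gain of `new` over `old`, as a fraction of `old`."""
    return (new - old) / old

# Oracle-retrieval setting: GPT-5.4 scored 57.14%, GPT-5.5 scored 64.66%
oracle = relative_improvement(57.14, 64.66)   # ~0.13, i.e. ~13%

# Full agent-harness setting: GPT-5.4 scored 36.10%, GPT-5.5 scored 52.63%
agent = relative_improvement(36.10, 52.63)    # ~0.46, i.e. ~46%

print(f"Oracle setting: {oracle:.1%} relative improvement")
print(f"Agent setting:  {agent:.1%} relative improvement")
```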
GPT-5.5 is coming soon to Databricks. Bring frontier reasoning to your enterprise data, securely and at scale.
