OpenAI is releasing a lighter, cheaper model for developers to tinker with called GPT-4o Mini. It costs significantly less than full-sized models and is said to be more capable than GPT-3.5.
Building apps using OpenAI’s models can rack up a huge bill. Developers without the means to afford to tinker with it can get priced out of it entirely and may opt for cheaper models like Google’s Gemini 1.5 Flash or Anthropic’s Claude 3 Haiku. Now, OpenAI is entering the light model game.
“I think GPT-4o Mini really gets at the OpenAI mission of making AI more broadly accessible to people. If we want AI to benefit every corner of the world, every industry, every application, we have to make AI much more affordable,” Olivier Godement, who leads the API platform product, told The Verge.
Starting today, ChatGPT users on Free, Plus, and Team plans can use GPT-4o Mini instead of GPT-3.5 Turbo, with Enterprise users getting access next week. That means GPT-3.5 will no longer be an option for ChatGPT users, but it will still be available for developers via the API if they prefer not to switch to GPT-4o Mini. Godement said GPT-3.5 will get retired from the API at some point — they’re just not sure when.
“I think it’s going to be very popular,” Godement said
The new, lightweight model will also support text and vision in the API, and the company says it will soon handle all multimodal inputs and outputs like video and audio. With all these capabilities, this could look like more capable virtual assistants that can understand your travel itinerary and create suggestions. However, the model is meant for simple tasks, so no one is exactly building Siri for cheap.
This new model achieved an 82 percent score on the Measuring Massive Multitask Language Understanding (MMLU), a benchmark exam consisting of about 16,000 multiple-choice questions across 57 academic subjects. When the MMLU was first introduced in 2020, most models were pretty bad at it, which was the goal since the models had gotten too advanced for previous benchmark exams. GPT-3.5 scored 70 percent on this benchmark, GPT-4o scored 88.7 percent, and Google claims Gemini Ultra to have the highest-ever score of 90 percent. In comparison, the competing models Claude 3 Haiku and Gemini 1.5 Flash scored 75.2 percent and 78.9 percent, respectively.
It’s worth noting that researchers are wary of benchmark tests like the MMLU, as how it’s administered varies slightly from company to company. That makes different models’ scores difficult to compare, as The New York Times reported. There’s also the problem of the AI potentially having these answers in its dataset, which essentially lets it cheat, and typically no third-party evaluators are part of the process.
For developers who are hungry to build AI applications for cheap, the launch of GPT-4o Mini gives them another tool to add to their inventory. OpenAI let the financial technology startup Ramp test the model, using GPT-4o Mini to build a tool that extracts expense data on receipts. So, instead of slogging through text boxes, a user can upload a picture of their receipt and the model sorts it all for them. Superhuman, an email client, also tested GPT-4o Mini and used it to create an auto-suggestion feature for email responses.
The goal is to provide something lightweight and inexpensive for developers to create all the apps and tools they couldn’t afford to make with a larger, more expensive model like GPT-4. Many developers would turn to Claude 3 Haiku or Gemini 1.5 Flash before paying the eye-watering compute costs required to run one of the most robust models.
So, what took OpenAI so long? Godement said it was “pure prioritization” as the company was focused on creating bigger and better models like GPT-4, which took a lot of “people and compute efforts.” As time went on, OpenAI noticed a trend of developers eager to use smaller models, so the company decided now was the time to invest its resources into building GPT-4o Mini.
“I think it’s going to be very popular,” Godement said. “Both by existing apps that use all the AI at OpenAI and also many apps that were put out by the pricing before.”