Today, OpenAI released a new version of ChatGPT — ChatGPT 4-o — that is free to everyone. It’s a new, multimodal (“omni”) version of ChatGPT that can engage in live conversation using reasoning across 50 languages by effortlessly integrating and processing text, audio, and visual data in real time.
They are also rolling out a desktop app to make it easier to use.
All of these changes will be available to everyone over the next few weeks. I just refreshed my ChatGPT (I’m a paid 4 user), and I already have access to the new model, though without voice.
These are the basic highlights. I’ll tie each to education.
Basically, it can be used as a free personal tutor by almost anyone, and we are just getting started. Students can chat with it, just like they would their human tutors, on their phones or PCs. The big difference is that it is available to them 24/7 and it’s free.
*ChatGPT4 (ChatGPT4-o) is now FREE
The highest-end frontier models (ChatGPT4, Claude’s Opus, Google’s Gemini Pro & Ultra) each have some unique features and capabilities but are largely competitive with each other. The problem is that each costs $20/month, which puts them out of reach for many people, especially teachers and students.
To be honest, this was causing me a bit of angst, as it was leading people to think that current AI tools were way less sophisticated than they are, as the free versions of ChatGPT and other models can’t do much compared to the paid versions. And it gave those who could afford usage of the higher-end models a substantial advantage over their peers when they used these models to support their education.
This release now makes that capability free, and on all but one metric, it’s better than all of them. The chart below displays the metrics. I explain what each of them measures here, in case you don’t know what they are.
MMLU — Measures multi-task language understanding across a variety of subjects like humanities, social sciences, STEM fields etc.
GPQA — Evaluates graduate-level reasoning with expert-written, “Google-proof” questions in biology, physics, and chemistry.
MATH — Assesses skills in solving various types of mathematical problems.
HumanEval — Tests coding/programming abilities by having models complete short coding tasks.
MGSM — Measures how well a model solves grade-school math word problems posed in multiple languages (Multilingual Grade School Math).
Beyond these metrics, we don’t know much about the difference between ChatGPT4Turbo and ChatGPT4-o, though a few things stand out.
It is twice as fast as ChatGPT4.
Multiple users have also reported significant increases in the ability of Data Analytics/Code Interpreter.
Claire Zhou noted they quietly shipped text-to-3D without even talking about it.
And font manipulation.
Today’s announcement also means the GPTs in the store are free. GPTs are bots that individuals can easily build to help them with their workflows. I’ve made some to help me with my work, but I haven’t made any for my students to use because they previously had to have a paid subscription to ChatGPT4 to use them. Now teachers and professors can easily make and share these classroom support bots with their students, and the students can use them for free.
Of course, it also means students can use many of the bot tools to generate accurate bibliographies, train AIs to write like themselves, and generally “humanize” their writing. Some instructors may not like this, but the full capabilities are now free to everyone. Students who use the free version of ChatGPT are no longer limited to a model that can’t reason and hallucinates bibliographies.
These are some examples of popular bots in the GPT “store.”
It’s “interesting” that three of the top bots in the writing category are “humanizers.”
Anyhow, students now have access to a lot of free intelligence they previously could not access. In reflecting on free GPT4, Nici Sweaney noted:
It’s not just the kids with $ who can now take a photo of a textbook, chat to the AI model in their car, and arrive home to a written presentation, plan for slides, and study notes.
It’s not just kids with $ that can code instantly, upload a spreadsheet and have AI write complex excel macros and formulas, or have ChatGPT analyse data and produce graphs.
*Real-time multimodal conversation enables free tutoring
There are significant updates to ChatGPT that enable strong tutoring for free.
“Real-time” conversation. Previous voice tools worked in separate steps: they first “heard” the voice, transcribed it to text, produced a text response, and then converted that response back into speech. Each step added time, which is why there was a delay in voice conversation.
This new assistant, which likely uses a new process, reduces that to approximately 200 milliseconds, or .2 seconds, which is very close to human conversation, and may demonstrate an ability to listen and speak at the same time. It also allows the user to interrupt the voice assistant in the same way you might interrupt a human and keeps the conversation flowing (and the voice assistant won’t get angry if you interrupt it :)).
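To make the difference concrete, here is a minimal sketch of why a step-by-step voice pipeline feels slow. The per-step delays are hypothetical numbers picked only for illustration, and the single-step design is an assumption (the post itself only says the new assistant "likely" uses a new process). Only the 200 ms figure comes from the text above.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    latency_ms: int  # delay added by this step, in milliseconds


def total_latency_ms(stages):
    """Stages run one after another, so their delays add up."""
    return sum(stage.latency_ms for stage in stages)


# Older approach: three separate steps (delays are hypothetical).
cascaded = [
    Stage("speech-to-text", 500),
    Stage("text model", 1500),
    Stage("text-to-speech", 800),
]

# Assumed new approach: one model handles audio in and audio out.
# 200 ms is the approximate figure cited in the post.
end_to_end = [Stage("single multimodal model", 200)]

print(total_latency_ms(cascaded))    # 2800
print(total_latency_ms(end_to_end))  # 200
```

The point is not the exact numbers but the shape of the problem: every extra hand-off in the chain adds waiting time, so removing the hand-offs is what brings the response close to the pace of human conversation.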
Emotional/human voice conversation. This moves away from one of the 7 stale voices to regular voice patterns that incorporate human emotion.
Mark Silva explained: “It's designed to understand and generate emotional nuances, bringing a new level of empathy and engagement to voice interactions. Imagine interacting with an AI that not only understands what you're saying but also how you're feeling.”
Vision. The model has improved vision capabilities, and its omni-modal approach makes it well suited to tutoring and coaching.
You can see the whole video here, but I broke it down into a few parts.
A tutor that helps you do a math problem.
Here’s another example.
And it can help you code/learn to code.
[Note: A few commentators said Co-Pilot is better than this.]
A tutor that helps you learn to read a chart.
A tutor that works in 50 languages
ChatGPT4-o can support real-time translation in 50 languages, which they claim covers 97% of the world’s population.
This both makes instant translation possible and allows you to practice learning a language with a bot.
You can even get support identifying objects in another language.
And it can moderate a debate and summarize a meeting.
What does this lay the foundation for?
*Real-time video interaction, which is expected soon. Imagine that students will even be able to design the real-time video tutor they interact with.
*A tutor that can provide visual aids and demonstrations and can adapt its teaching style based on your facial expressions and body language (Silva).
*A tutor that can remember (using its memory feature) what you know and what you don’t, enabling it to adapt its instruction.
*A tutor that can (eventually) call any of the bots in the store (“experts”) to provide accurate answers, avoiding hallucinations.
*A tutor that can provide seamless interaction with students, teachers, and entire classrooms.
*Alex Gray suggests a number of innovations, including a “science class where complex concepts are explained through interactive visuals,” “real-time feedback and interaction,” and “multilingual schools, and cultural exchange programmes.” Gray adds: “By processing and generating content in various forms – text, image, and video – the model caters to different learning preferences and abilities.”
Most importantly, in my mind, it’s a tool to help students prepare for the AI World.
Integrating GPT-4o into education is not just about enhancing learning; it's about preparing students for a future where AI will be ubiquitous. By working with GPT-4o, students can develop a critical understanding of AI, its potential impacts, and ethical considerations. They'll be better equipped to navigate and thrive in an AI-driven world.
Contemporary Assessments
Yes, this means it is not only a great tutor; it can also complete most of the assessments used in schools today. Leon Furze outlines this well here.
Beyond Education
Beyond education, these conversational tools will present challenges to sites like Character.ai (friendly bots), emotional conversational tools (hume.ai), and translation sites (TimeKettle). Duolingo’s stock collapsed shortly after the livestream.
Thoughts from Others
“This is the first time an AI system almost delivered the entire tech demo itself.” (Elvis S)
“It's very clear to me that advanced emotional intelligence…is the key to unlocking the future of collaborative AI.” (Elvis S)
Not perfect, but Valuable
The explanation above is not meant to imply that ChatGPT-4o multimodal is the perfect tutor. There are still some hallucinations, and AI generally needs more work on its math and reasoning abilities, but progress is substantial and it’s helping to create a future where the best education is available to everyone. It’s already better than what many students in some countries have access to.