GPT-4o

Forward-looking: OpenAI just introduced GPT-4o (GPT-4 Omni, or "O" for short). The model is no "smarter" than GPT-4, but several remarkable innovations still set it apart: the ability to process text, visual, and audio data simultaneously, almost no latency between asking and answering, and an unbelievably human-sounding voice.

While today's chatbots are some of the most advanced ever created, they all suffer from high latency. Depending on the query, response times can range from a second to several seconds. Some companies, like Apple, want to resolve this with on-device AI processing. OpenAI took a different approach with Omni.

Most of Omni's replies were quick during Monday's demonstration, making the conversation more fluid than a typical chatbot session. It also accepted interruptions gracefully: if the presenter started talking over GPT-4o's reply, it would pause rather than finish its response.

OpenAI credits Omni's low latency to the model's ability to process all three forms of input (text, visual, and audio) natively. Previously, ChatGPT handled mixed input through a chain of separate models. Omni processes everything itself, correlating it into a cohesive response without waiting on another model's output. It still possesses the GPT-4 "brain" but has additional modes of input it can process, which OpenAI CTO Mira Murati says should become the norm.
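To make the latency argument concrete, here is a toy sketch, not OpenAI's actual internals: the model functions and timings below are made-up placeholders that contrast a chained pipeline with a single end-to-end model.

```python
import time

# Hypothetical stand-ins for the separate models in the old pipeline;
# each sleep simulates one model's processing and hand-off delay.
def speech_to_text(audio: bytes) -> str:
    time.sleep(0.3)                      # transcription model
    return "transcribed question"

def text_model(prompt: str) -> str:
    time.sleep(0.5)                      # GPT-4-class text model
    return f"answer to: {prompt}"

def text_to_speech(text: str) -> bytes:
    time.sleep(0.3)                      # voice-synthesis model
    return text.encode()

def pipeline_reply(audio: bytes) -> bytes:
    """Pre-Omni design: three models chained, so their latencies add up."""
    return text_to_speech(text_model(speech_to_text(audio)))

def omni_reply(audio: bytes) -> bytes:
    """Omni-style design: one model maps audio in to audio out directly."""
    time.sleep(0.4)                      # single end-to-end pass
    return b"spoken answer"

for reply in (pipeline_reply, omni_reply):
    start = time.perf_counter()
    reply(b"fake audio")
    print(f"{reply.__name__}: {time.perf_counter() - start:.1f}s")
```

Running it prints roughly 1.1 s for the chained pipeline versus 0.4 s for the single pass, which is the shape of the improvement OpenAI is claiming.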

"GPT-4o provides GPT-4 level intelligence but is much faster," said Murati. "We think GPT-4o is really shifting that paradigm into the future of collaboration, where this interaction becomes much more natural and far easier."

Omni's voice (or voices) stood out the most in the demo. When the presenter spoke to the bot, it responded with casual language interspersed with natural-sounding pauses. It even chuckled, a human quality that made me wonder whether the voice was genuinely computer-generated or faked.

Real and armchair experts will undoubtedly scrutinize the footage to validate or debunk it. We saw the same thing happen when Google unveiled Duplex. Google's digital helper was eventually validated, so we can expect the same from Omni, even though its voice puts Duplex to shame.

However, we might not need the extra scrutiny. OpenAI had GPT-4o talk to itself on two phones. Having two versions of the bot converse with each other broke the human-like illusion somewhat. While the male and female voices still sounded human, the conversation felt less organic and more mechanical, which makes sense given that the only human voice had been removed from the exchange.

At the end of the demo, the presenter asked the bots to sing. It was another awkward moment as he struggled to coordinate the two bots into a duet, again breaking the illusion. Omni's ultra-enthusiastic tone could use some tuning as well.

OpenAI also announced today that it's releasing a ChatGPT desktop app for macOS, with a Windows version coming later this year. Paid ChatGPT users can access the app already, and a free version will follow at an unspecified date. The web version of ChatGPT is already running GPT-4o, and the model is expected to become available, with limitations, to free users as well.
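Developers with API access can also reach the model programmatically. A minimal sketch, assuming the official openai Python SDK (pip install openai) and an OPENAI_API_KEY set in the environment; "gpt-4o" is the model identifier OpenAI published:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "In one sentence, what is GPT-4o?"}],
)
print(response.choices[0].message.content)
```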
