OpenAI’s ChatGPT: Now Seeing, Hearing, and Speaking in a Major Update

ChatGPT. Photo by Jonathan Kemper on Unsplash.

By Ali Al-Rumaih, The-14

September 25, 2023

OpenAI is rolling out a major update to ChatGPT that lets the viral chatbot hold voice conversations and interpret images, moving it closer to popular AI assistants such as Apple’s Siri.

The inclusion of voice capabilities in ChatGPT is poised to open doors to a multitude of creative and accessibility-focused applications, according to OpenAI’s recent blog post. This addition puts ChatGPT in the same league as established AI services like Siri, Google Assistant, and Amazon’s Alexa, which are built into the devices they run on. These voice assistants are commonly used for tasks such as setting alarms, delivering reminders, and fetching information from the internet.

Since its debut less than a year ago, ChatGPT has found uses across various industries, from summarizing documents to generating computer code, sparking a competitive race among tech giants to launch their own generative AI products.

ChatGPT’s new voice feature can narrate bedtime stories, help settle dinner table debates, and read users’ text input aloud. The underlying technology is also being used outside the chatbot: Spotify employs it to let podcasters translate their content into multiple languages, widening their reach.

The addition of image support further augments ChatGPT’s capabilities. Users can now capture images of their surroundings and seek assistance from the chatbot in troubleshooting problems, planning meals based on fridge contents, or analyzing complex data graphs for work-related purposes. Presently, Google Lens is a popular choice for gaining insights from images, but ChatGPT is set to challenge this status quo.

These eagerly awaited features will be gradually rolled out to subscribers of ChatGPT’s Plus and Enterprise plans over the next two weeks.

Breaking Barriers: ChatGPT Can See, Hear, and Speak!

OpenAI’s latest release is a game-changer! You can now engage in seamless, back-and-forth conversations with ChatGPT, all through the power of your voice. Whether you’re on the move, craving a bedtime story, or settling a lively dinner table debate, ChatGPT has got you covered.

But this isn’t just about chatbots; it’s a glimpse into the future of advanced AI capabilities. Here’s everything you need to know about these groundbreaking features:

Voice:

  • Engage in natural, voice-driven conversations.
  • Available on both iOS and Android (opt-in via settings).
  • Powered by an innovative text-to-speech model that creates remarkably human-like audio.
  • Utilizes the Whisper system for precise speech recognition.
  • Voices created in collaboration with professional voice actors.

Image:

  • Users can now present images to ChatGPT for interpretation.
  • Features a drawing tool for pinpointing specific details in images.
  • Image understanding leverages the capabilities of multimodal GPT-3.5 and GPT-4 models.
  • Understands photographs, screenshots, and documents that combine text and images.

Gradual Deployment for Enhanced Safety:

Voice Risks:

  • Potential for impersonation and fraudulent use of synthetic voices.
  • Current use is limited to voice chat, with voices created together with voice actors.
  • Collaborative example: Spotify’s Voice Translation feature for podcasts.

Image Risks:

  • Challenges include model hallucinations and users relying on the model’s interpretation of images in high-stakes domains.
  • Pre-deployment testing in risk-prone domains and with a diverse pool of testers.
  • Collaboration with Be My Eyes to understand uses and limitations.
  • Stringent technical measures are taken to respect individuals’ privacy and minimize analysis of personal data.

With ChatGPT’s newfound abilities, OpenAI is leading the charge towards a future where AI isn’t just a tool but a seamless part of our daily lives, enhancing our interactions and capabilities in unprecedented ways.
