OpenAI adds vision capabilities to Voice Mode, mirroring Gemini's features

By PulseAugur Editorial · [1 sources] · 2024-12-18 09:46

OpenAI has updated its Voice Mode for ChatGPT to include visual input capabilities, allowing users to show the AI objects and receive information about them. This feature, which was previously demonstrated by Google's Gemini, enables ChatGPT to analyze images and provide contextually relevant responses. The update aims to enhance the interactivity and utility of the voice assistant by integrating visual understanding. AI

RANK_REASON Product update adding a new capability to an existing AI tool.

Read on Smol AINews →

AI-generated summary · Google Gemini · from 1 sources. How we write summaries →

COVERAGE [1]

Smol AINews TIER_1 Deutsch(DE) · 2024-12-18 09:46

OpenAI Voice Mode Can See Now - After Gemini Does

**OpenAI** launched **Realtime Video** shortly after **Gemini**, which led to less impact due to Gemini's earlier arrival with lower cost and fewer rate limits. **Google DeepMind** released **Gemini 2.0 Flash** featuring enhanced multimodal capabilities and real-time streaming. *…

COVERAGE [1]

OpenAI Voice Mode Can See Now - After Gemini Does

RELATED TOPICS