- Gemini Live, Google's counterpart to ChatGPT's Advanced Voice Mode, is now launching to Android users in English.
- It is not free: it is only available to Gemini Advanced subscribers, a plan that costs $20/month.
- Gemini Live will roll out in more languages and to iOS users in the coming weeks.
Google's Made by Google event has officially ended, and at it the company launched the latest lineup of its flagship Pixel smartphones. The rumor mill has been hard at work for the past few weeks, and many of those rumors have now become reality. As expected, the event also had more than a few – well, quite a lot actually – mentions of AI.
Among the AI announcements, the most important is the launch of Gemini Live. Google first announced Gemini Live at its I/O conference earlier this year. It is finally rolling out to Gemini Advanced subscribers on Android in English, with more languages and iOS support (via the Google app) coming soon.
With Gemini Live, Gemini can now hold more natural, two-way conversations. You can even interrupt it in the middle of a response, just as you would in any natural conversation. Open the Gemini app on Android to converse with the chatbot.
This is similar to the Advanced Voice Mode experience in the ChatGPT app, which is currently rolling out in a limited alpha to ChatGPT Plus users. For once, Google has put itself ahead of OpenAI in the release timeline by initiating a wider rollout.
Gemini Live is also available hands-free, so you can talk to Gemini in the background or even when your phone is locked. You can also leave conversations in the middle and get back to them later.
Google is rolling out Gemini Live with 10 new voices so your conversations with the AI can feel even more authentic; you can choose the voice and tone that resonates with you.
Notably, Gemini Live cannot simulate any voice other than the 10 available in the app, possibly to avoid copyright issues; ChatGPT's Advanced Voice Mode follows the same policy. There is one area, though, where Gemini Live falls short of Advanced Voice Mode: it cannot infer your emotions from your tone, something OpenAI has demoed its chatbot doing.
Moreover, one capability of Gemini Live that Google demoed at the I/O conference will not be available at launch: multimodal input. If you don't know what that is, no worries; here's a recap. With multimodal input, Gemini Live can take input from your phone's camera (both photos and video) in real time and answer questions or help you identify objects you point at. For example, you can point it at some DJ equipment and ask it to identify the name of a part, or point it at your screen and ask what a certain piece of code does.
But multimodal capabilities are delayed for now, and Google has only said they will arrive later this year, with no specifics. Interestingly, ChatGPT's Advanced Voice Mode is also supposed to gain similar capabilities, but they haven't shipped with the limited alpha rollout either.
Notably, Gemini Live is a step toward Google bringing Project Astra to fruition.
Talking to a chatbot is sometimes far more convenient than typing something out, especially when you want to brainstorm. And with Gemini Live, the conversation can be much more seamless. Or, if the live demos from the Made by Google event are any indication, seamless enough: the chatbot apparently hallucinated during the live demo, and there was some friction when the "interrupt Gemini mid-response" feature was put to the test. Let's see how it fares in the real world, eh? Get ready to try Gemini Live on your Pixel, Samsung, or other Android device over the coming weeks, starting today.