Google showcased Project Astra last summer just as OpenAI introduced their GPT-4o with Vision to the world. However, while OpenAI's product already made it to the masses this December, all Google did was provide another demo of an improved Project Astra.
Needless to say, it was a disappointing development. However, there's a silver lining. While there's no timeline for when Project Astra will be available through the Gemini app (as is the plan), there's already a way to test it right now through Google AI Studio.
Google recently added a new feature to Google AI Studio, Stream Realtime, which works a lot like Project Astra and is a good way to get an idea of its capabilities. While Google AI Studio is meant for developers to test the API, anyone can use the AI Studio interface for free without the API.
With Stream Realtime, you can share your surroundings with Gemini through your phone or computer camera, or share your computer screen, and chat about what you're streaming.
- To use Stream Realtime, i.e., Project Astra in disguise, navigate to aistudio.google.com on your phone or computer.
- Sign in to your Google account.
- Go to 'Stream Realtime' from the options in the left menu.
- Once you switch to 'Stream Realtime', you'll find some options on the right you can customize, like 'Output format' and 'Voice'. There are currently 5 voices available: 'Puck', 'Charon', 'Kore', 'Fenrir', and 'Aoede', with Puck being the default. You cannot change the model from Gemini 2.0 Flash Experimental.
- You can also enable certain tools like 'Code Execution', 'Function calling', 'Automatic function response' and 'Grounding'.
- Once you have configured the settings, select 'Show Gemini' to share your camera feed or 'Share your screen' to share the screen of your PC with Gemini; the latter option is absent on mobile.
- On my PC, I decided to share my screen with Gemini. There was some friction initially (Gemini would not respond), but after a refresh, it worked perfectly. You can choose to share a browser tab, an application, or your entire screen with Gemini.
- Once your screen is visible, start chatting with Gemini about the content of your screen. To stop sharing your screen, click on 'Stop Sharing' at the bottom.
- To end the session completely, click on the 'camera' button in the chat to stop the recording.
- Once you end the session, you can find the video recordings, voice recordings, and transcripts of Gemini's answers in the chat.
You can share your camera feed and chat with Gemini about it in the same manner.
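To keep track of the choices described in the steps above, the settings panel can be summarized as a small Python sketch. This is only an illustrative model of the options listed in this article, not Google's API: the class name `StreamRealtimeSettings` and its fields are hypothetical, and the default output format of "Audio" is an assumption, since the available output formats aren't listed here.

```python
from dataclasses import dataclass, field

# Values below come from the walkthrough; the class itself is hypothetical
# and does not talk to Google AI Studio or any Gemini API.
VOICES = ("Puck", "Charon", "Kore", "Fenrir", "Aoede")
TOOLS = (
    "Code Execution",
    "Function calling",
    "Automatic function response",
    "Grounding",
)
MODEL = "Gemini 2.0 Flash Experimental"  # cannot be changed in Stream Realtime


@dataclass
class StreamRealtimeSettings:
    voice: str = "Puck"            # Puck is the default voice
    output_format: str = "Audio"   # assumption: the article does not list the formats
    tools: set = field(default_factory=set)
    model: str = MODEL

    def __post_init__(self):
        # Reject anything the settings panel would not let you pick.
        if self.voice not in VOICES:
            raise ValueError(f"Unknown voice: {self.voice!r}")
        unknown = set(self.tools) - set(TOOLS)
        if unknown:
            raise ValueError(f"Unknown tools: {sorted(unknown)}")
        if self.model != MODEL:
            raise ValueError("The model is fixed in Stream Realtime")


# Example: pick a different voice and turn on one tool.
settings = StreamRealtimeSettings(voice="Kore", tools={"Grounding"})
print(settings)
```

Writing the options down this way makes the constraints easy to see at a glance: five voices, four optional tools, and one fixed model.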
Things to know:
- Gemini does a great job of identifying stuff on your screen and answering any queries about it.
- It can only see the part of the app or webpage currently visible on the screen and cannot see anything else until you scroll and show it.
- It does not have access to the Internet in Google AI Studio and can only access information up to its training cutoff date, which is August 2024.