Generative AI

Nano Banana Pro vs Grok Imagine for Image Generation and Editing

The AI image world today is split between two giants. One is backed by Google’s Gemini, while the other carries the unmistakable Elon Musk aftertaste. We know the former as Nano Banana Pro – an upgraded, souped-up version of the already-iconic Nano Banana. Challenging it in a head-to-head match is Grok Imagine, the […]

Google Earth AI: Unlocking geospatial insights with foundation models and cross-modal reasoning

Google Earth AI is our family of geospatial AI models and reasoning agents that provides users with actionable insights, grounded in real-world understanding. Today, we’re sharing our latest Earth AI innovations and expanding access to these new capabilities on Google Earth and Google Cloud. For years, Google has developed AI models that enhance our understanding

How we are building the personal health coach

The personal health coach is built with Gemini models to deliver personalized and adaptive coaching, grounded in science and informed by expert oversight. Historically, health and fitness journeys have been fragmented, generic and inaccessible, whether within existing apps or through general health and fitness journeys outside of apps. For instance, a primary care provider might

StreetReaderAI: Towards making street view accessible via context-aware multimodal AI

We introduce StreetReaderAI, a new accessible street view prototype using context-aware, real-time AI and accessible navigation controls. Interactive streetscape tools, available today in every major mapping service, have revolutionized how people virtually navigate and explore the world — from previewing routes and inspecting destinations to remotely visiting world-class tourist locations. But to date, screen readers

Toward provably private insights into AI use

We detail how confidential federated analytics technology is leveraged to understand on-device generative AI features, ensuring strong transparency in user data handling and analysis. Generative AI (GenAI) enables personalized experiences and powers the creation of unstructured data, including summaries, transcriptions, and more. Insights into real-world AI use [1, 2] can help GenAI developers enhance their tools

Introducing Nested Learning: A new ML paradigm for continual learning

We introduce Nested Learning, a new approach to machine learning that views models as a set of smaller, nested optimization problems, each with its own internal workflow, in order to mitigate or even completely avoid the issue of “catastrophic forgetting”, where learning new tasks sacrifices proficiency on old tasks. The last decade has seen incredible

Generative UI: A rich, custom, visual interactive user experience for any prompt

We introduce a novel implementation of generative UI, enabling AI models to create immersive experiences and interactive tools and simulations, all generated completely on the fly for any prompt. This is now rolling out in the Gemini app and Google Search, starting with AI Mode. Generative UI is a powerful capability in which an AI

AfriMed-QA: Benchmarking large language models for global health

AfriMed-QA is a collection of contextually relevant datasets for evaluating LLMs on African health question-answering tasks, developed in partnership with organizations across Africa. Large language models (LLMs) have shown potential for medical and health question answering across various health-related tests spanning different formats and sources, such as multiple choice and short answer exam questions

Towards better health conversations: Research insights on a “wayfinding” AI agent based on Gemini

Google researchers share user insights from a novel research AI agent that helps people find their way to better health information through proactive conversational guidance, goal understanding, and tailored conversations. The ability to find clear, relevant, and personalized health information is a cornerstone of empowerment for medical patients. Yet, navigating the world of online health
