Toonsutra: Immersive Reading Experience

Bringing comics to life with an immersive reading experience powered by the Gemini API, Gemini 2.5 Pro Preview, and Lyria 2

An AI-powered comic experience where stories don't just sit on the page; they respond and engage, transforming static images into dynamic audio narratives.

We worked with Toonsutra and Google to help bring comics to life, literally.

This project introduces an automated approach to generating immersive audio for digital comics. A multi-agent LLM architecture built on Gemini 2.5 Pro uses specialized agents that collaboratively process comic images and transcripts. The system analyzes context, aligns dialogue, composes music, maps voiceover audio to panels, and generates a final synchronized spatial metadata file used to build the immersive experience.
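To make the flow of work between agents concrete, here is a minimal sketch of how such a pipeline could be wired together. It is an illustration under stated assumptions, not the project's actual implementation: every name in it (`Panel`, `ComicState`, `run_pipeline`, and the individual agent functions) is hypothetical, and each agent is a plain stub standing in for what would be a model call in the real system.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Panel:
    """One comic panel plus the annotations the agents attach to it (hypothetical schema)."""
    panel_id: int
    transcript: str                     # raw dialogue text for the panel
    context: str = ""                   # set by the context agent
    lines: List[str] = field(default_factory=list)  # set by the dialogue agent
    music_cue: str = ""                 # set by the music agent
    voiceover: Optional[str] = None     # set by the voiceover agent


@dataclass
class ComicState:
    """Shared state handed from one agent to the next."""
    panels: List[Panel]
    metadata: Dict[str, object] = field(default_factory=dict)


Agent = Callable[[ComicState], ComicState]


def context_agent(state: ComicState) -> ComicState:
    # Stub: a real agent would ask a model to describe each panel image.
    for p in state.panels:
        p.context = "dialogue" if p.transcript.strip() else "silent"
    return state


def dialogue_agent(state: ComicState) -> ComicState:
    # Stub: split the transcript into individual spoken lines per panel.
    for p in state.panels:
        p.lines = [ln for ln in p.transcript.splitlines() if ln.strip()]
    return state


def music_agent(state: ComicState) -> ComicState:
    # Stub: pick a cue label; a real agent would request generated music.
    for p in state.panels:
        p.music_cue = "ambient" if p.context == "silent" else "underscore"
    return state


def voiceover_agent(state: ComicState) -> ComicState:
    # Stub: map a (hypothetical) voiceover file name to each panel that has dialogue.
    for p in state.panels:
        p.voiceover = f"vo_panel_{p.panel_id}.wav" if p.lines else None
    return state


def metadata_agent(state: ComicState) -> ComicState:
    # Collect every annotation into one structure that could be written to a file.
    state.metadata = {
        "panels": [
            {
                "panel": p.panel_id,
                "context": p.context,
                "lines": p.lines,
                "music": p.music_cue,
                "voiceover": p.voiceover,
            }
            for p in state.panels
        ]
    }
    return state


def run_pipeline(state: ComicState, agents: List[Agent]) -> ComicState:
    # Run the agents in order; each one reads and extends the shared state.
    for agent in agents:
        state = agent(state)
    return state


result = run_pipeline(
    ComicState(panels=[Panel(1, "Who goes there?\nA friend."), Panel(2, "")]),
    [context_agent, dialogue_agent, music_agent, voiceover_agent, metadata_agent],
)
print(result.metadata)
```

The sketch runs the agents strictly in sequence over a shared state object, which keeps the ordering dependencies visible (music selection relies on the context annotation, and the metadata step relies on everything before it). Whether the real system runs its agents sequentially or in parallel is not stated here.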

CLIENT
Toonsutra

COLLABORATION
Google

RELEASED
May 2025

AI/ML

LLM

Generative AI