Google LiteRT-LM: A new framework for running LLMs offline on Android, Chrome, and even Raspberry Pi

Google has introduced LiteRT-LM, a new framework for running large language models (LLMs) locally, with no internet connection required. It is the core technology powering Gemini Nano in products such as Chrome, Chromebook Plus, and the Pixel Watch.

The primary goal of LiteRT-LM is to enable AI that is:

  • Fast and cost-effective
  • Completely private and secure
  • Fully functional offline

[Figure: How the framework organizes on-device LLM pipelines for various tasks.]

Core Architecture and Features

The framework is built on two main components, the Engine and the Session, which keep it modular and efficient across a range of on-device tasks.

Engine & Session

The Engine is the core of the framework, responsible for loading the LLM and its associated components like the tokenizer.

// Load the model once; the engine owns the weights and the tokenizer.
auto engine = Engine::Create("gemini_nano.tflite");
auto tokenizer = engine->GetTokenizer();
auto base_decoder = engine->GetTextDecoder();

A Session represents a specific, isolated task. You can load fine-tuned adapters (like LoRA), enable caching, and run inference for different jobs concurrently.

// Each session carries its own adapter and cache settings.
auto session = engine->CreateSession();
session->LoadLoRA("summarizer.lora");
session->SetKVCacheEnabled(true);

std::string result = session->Run("Summarize this text: ...");
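
To make the Engine/Session split concrete, here is a minimal, self-contained sketch of the ownership pattern described above: one engine holds the model assets, and every session shares them while keeping its own adapter and history. The names SketchEngine, SketchSession, ModelAssets, LoadAdapter, HistorySize, and SharedUsers are hypothetical and are not part of the LiteRT-LM API; Run only builds a placeholder string and performs no inference.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for loaded model weights and tokenizer data.
struct ModelAssets {
    std::string model_path;
};

// One isolated task: its own adapter and its own prompt history.
class SketchSession {
public:
    explicit SketchSession(std::shared_ptr<const ModelAssets> assets)
        : assets_(std::move(assets)) {}

    void LoadAdapter(std::string adapter_path) {
        adapter_ = std::move(adapter_path);
    }

    // Records the prompt and returns a placeholder; no inference happens here.
    std::string Run(const std::string& prompt) {
        history_.push_back(prompt);
        std::string tag = assets_->model_path;
        if (!adapter_.empty()) tag += "+" + adapter_;
        return "[" + tag + "] " + prompt;
    }

    std::size_t HistorySize() const { return history_.size(); }

private:
    std::shared_ptr<const ModelAssets> assets_;  // shared, read-only
    std::string adapter_;                        // per-session
    std::vector<std::string> history_;           // per-session
};

// Loads the assets once and hands a shared reference to each session.
class SketchEngine {
public:
    static std::unique_ptr<SketchEngine> Create(const std::string& model_path) {
        std::shared_ptr<const ModelAssets> assets =
            std::make_shared<ModelAssets>(ModelAssets{model_path});
        return std::unique_ptr<SketchEngine>(new SketchEngine(std::move(assets)));
    }

    std::unique_ptr<SketchSession> CreateSession() const {
        return std::make_unique<SketchSession>(assets_);
    }

    // Number of owners of the shared assets (the engine plus live sessions).
    long SharedUsers() const { return assets_.use_count(); }

private:
    explicit SketchEngine(std::shared_ptr<const ModelAssets> assets)
        : assets_(std::move(assets)) {}

    std::shared_ptr<const ModelAssets> assets_;
};
```

In this sketch, creating a second session costs one extra pointer rather than a second copy of the model, and loading an adapter into one session has no effect on the other.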

Optimizations and Compatibility

  • Advanced Optimizations: Includes efficient Context Switching, Session Cloning, and a Copy-on-Write KV-Cache to boost performance.
  • Cross-Platform Support: Ready to deploy on Android, Linux, macOS, Windows, and even Raspberry Pi.
  • Hardware Acceleration: Leverages available CPU, GPU, and NPU resources for faster processing.
  • Developer-Friendly API: Built with C++, it provides a straightforward API for easy integration into your projects.
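
The session cloning and copy-on-write KV-cache ideas from the list above can be sketched in a few lines. This is a simplified illustration of the general technique, not LiteRT-LM's implementation: the names CowKvCache and KvBlock are hypothetical, cached key/value data is represented by plain token ids, and the use_count check assumes single-threaded access.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical fixed-size chunk of cached data (token ids stand in for K/V tensors).
struct KvBlock {
    std::vector<int> tokens;
};

class CowKvCache {
public:
    explicit CowKvCache(std::size_t block_size) : block_size_(block_size) {}

    // The implicit copy constructor copies only the block pointers,
    // so cloning a cache is cheap and all blocks start out shared.

    void Append(int token) {
        if (blocks_.empty() || blocks_.back()->tokens.size() == block_size_) {
            // Tail is missing or full: start a fresh, unshared block.
            blocks_.push_back(std::make_shared<KvBlock>());
        } else if (blocks_.back().use_count() > 1) {
            // Tail is shared with a clone: copy it before writing.
            blocks_.back() = std::make_shared<KvBlock>(*blocks_.back());
        }
        blocks_.back()->tokens.push_back(token);
    }

    std::size_t Size() const {
        std::size_t n = 0;
        for (const auto& b : blocks_) n += b->tokens.size();
        return n;
    }

    // Flattens the cache contents for inspection.
    std::vector<int> Tokens() const {
        std::vector<int> out;
        for (const auto& b : blocks_) {
            out.insert(out.end(), b->tokens.begin(), b->tokens.end());
        }
        return out;
    }

    // Counts positions where both caches point at the same block in memory.
    std::size_t SharedBlocksWith(const CowKvCache& other) const {
        std::size_t n = std::min(blocks_.size(), other.blocks_.size());
        std::size_t shared = 0;
        for (std::size_t i = 0; i < n; ++i) {
            if (blocks_[i] == other.blocks_[i]) ++shared;
        }
        return shared;
    }

private:
    std::size_t block_size_;
    std::vector<std::shared_ptr<KvBlock>> blocks_;
};
```

Copying a CowKvCache and appending to the copy duplicates only the partially filled tail block; full blocks that hold the shared prompt prefix stay shared between the original and the clone.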

This lets developers build powerful language-model applications that run autonomously anywhere, from a web browser to an embedded device.