Last year, I wrote a blog post about an experimental local AI model running inside the Google Chrome browser. That experimental API has since been removed, but the Chrome team has announced a few new variants of it: Chrome split it into several specialized APIs, each with its own focus and capabilities.
And one of them is already available in stable Chrome versions and even built into this very website. If your browser supports the Summarizer API, you'll see a ✨ button in the bottom-right corner. It opens a small chat interface where you can interact with the Summarizer API, which then summarizes the article you're currently reading. And the coolest part: it all happens locally on your device, without any data being sent to external LLM providers, so it costs me nothing to provide this feature.
So let’s see how it works!
The Chrome Summarizer API is part of Chrome's Built-in AI suite. Unlike "traditional" LLMs that require API keys and run on servers in huge data centers, these models run completely offline on your device. That's also why the API isn't available to everyone: not every device has the hardware required to run these models. It's also not built into every browser at the moment, as you can see in the Baseline status:
> Summarizer: This feature is not Baseline because it does not work in some of the most widely-used browsers.
But once it’s available for you it is a really cool little feature that can help some users quickly screen articles to see if they’re interested in reading the full version. Or if you have a documentation for something and want to give your users a quick overview over the content.
When using it, you can choose between a few different types of summaries:

- `key-points`: a bulleted list of the most important points
- `tldr`: a short "too long; didn't read" style overview
- `teaser`: a curiosity-piquing introduction to the content
- `headline`: a single headline-style sentence
In addition, you can specify the length of the summary (short, medium, or long) and also the format (plain text or markdown). This makes it easy to customize the output to fit your needs.
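Put together, the configuration surface looks roughly like this. This is just a sketch of the option shape for illustration, not an official type definition, though the names and values match what `Summarizer.create()` accepts:

```ts
// Rough sketch of the Summarizer.create() options (not an official type definition)
interface SummarizerCreateOptions {
  type?: "key-points" | "tldr" | "teaser" | "headline"; // what kind of summary to produce
  length?: "short" | "medium" | "long"; // how detailed the summary should be
  format?: "plain-text" | "markdown"; // output format
  sharedContext?: string; // extra context shared across all summarize() calls
}
```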
One thing to keep in mind is that the summarizer can only do what the name suggests: summarize text. It cannot generate new content or provide additional information, and you cannot chat with it directly; you can only hand it text to summarize (although a Prompt API is also planned). So it's not comparable to a chatbot like ChatGPT or Gemini.
Since the API is available starting with Chrome 138, you can start using it right away without needing to set any flags.
But if you want to check the current status or see some debugging information, you can navigate to `chrome://on-device-internals`.
Once opened, this page shows you information like your device's performance class, event logs, and some model information.
You can see whether a model is downloaded, which version is installed, and how often it has crashed.
Unfortunately, the model was not at all stable in my testing, depending on the device, so you might need to reset the crash counter if you want to try again. But once it's working, it's super interesting!
Before you can do anything with the Summarizer, you need to check if the model is available. You can do this by calling:
```ts
const availability = await Summarizer.availability();
```
This returns one of the following values:
- `unavailable`: The model is not available on your device.
- `downloadable`: The model is ready to be downloaded.
- `downloading`: The model is currently being downloaded.
- `available`: The model is available and ready to use.

When you then want to download the model, you can just create a new instance of the Summarizer, which will automatically start downloading the model.
```ts
const summarizer = await Summarizer.create({
  monitor(m) {
    m.addEventListener("downloadprogress", (e) => {
      console.log(`Downloaded ${e.loaded * 100}%`);
    });
  },
});
```
And then it's time to wait ⏳. Although the Gemini Nano model is smaller than cloud-scale models like GPT-5 or Claude, it's still a multi-gigabyte download (~4GB on my machine).
But now that it's downloaded, we can finally start using it. So let's see how the summarizer chat is built.
I split the implementation into two separate parts:

- the chat component itself, which renders the UI
- a custom createSummaryChat hook, which manages the state and talks to the model
The chat component itself is super simple:
```tsx
export default function Chat() {
  // Early return if the API is not available
  if (!("Summarizer" in self)) {
    return null;
  }

  const [isOpen, setIsOpen] = createSignal(false);
  const {
    messages,
    availability,
    downloadProgress,
    isGenerating,
    generateSummary,
  } = createSummaryChat();

  // ... rest of component
}
```
The component first checks if the `Summarizer` API exists in the global scope.
This ensures that the chat is not even shown if the API is not available to a user.
If it is available, we need to handle the open/closed state so we can toggle the chat visibility (this is just a plain SolidJS signal). Afterwards, we can initialize the summary chat, which we'll take a look at soon.
One thing we still need to do, though, is parse the messages. Since I didn't want boring plain-text messages, I use the markdown output of the AI model. So we need to parse the markdown, which is done with the `marked` library, and then sanitize the resulting HTML with `DOMPurify`.
```ts
import DOMPurify from "dompurify";
import { marked } from "marked";

// Parse the markdown synchronously, then sanitize the resulting HTML
const html = DOMPurify.sanitize(
  marked(message.content, { async: false, gfm: true }),
);
```
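Inside the component's JSX, the sanitized HTML can then be injected via Solid's `innerHTML` prop. A minimal sketch, assuming a small `renderMarkdown` helper wrapping the snippet above (the helper and class names are mine, not from the actual component):

```tsx
// Sketch: render each chat message as sanitized HTML (helper and class names assumed)
const renderMarkdown = (md: string) =>
  DOMPurify.sanitize(marked(md, { async: false, gfm: true }));

// Inside the component's JSX (`For` is imported from solid-js)
<For each={messages()}>
  {(message) => (
    <div class={`message ${message.actor}`} innerHTML={renderMarkdown(message.content)} />
  )}
</For>;
```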
And that’s already it for the component. I won’t bore you with the details of the component HTML itself, since this is just a default SolidJS component and the styling is completely up to you.
It’s way more interesting to see how the state is managed and how to interact with the AI model.
As said, the core functionality lives in my custom `createSummaryChat` hook, which manages the model availability, the download progress, the generating state, and the list of chat messages:
```ts
import { createSignal } from "solid-js";

export function createSummaryChat() {
  const [availability, setAvailability] = createSignal<Availability | undefined>();
  const [downloadProgress, setDownloadProgress] = createSignal(0);
  const [isGenerating, setIsGenerating] = createSignal(false);
  const [messages, setMessages] = createSignal<Array<ChatMessage>>([
    {
      id: crypto.randomUUID(),
      actor: "ai",
      content: "Hey there 👋!\n\nI'm an AI assistant 🤖 running **offline inside your browser**...",
    },
  ]);

  // ... implementation
}
```
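The post doesn't show the `ChatMessage` and `SummaryType` types, so here is a plausible definition inferred from how they're used (the exact shapes are my assumption):

```ts
// Assumed type definitions, inferred from usage in the hook (not from the original source)
type ChatMessage = {
  id: string;
  actor: "ai" | "user";
  content: string;
};

// Only the three types the chat actually offers
type SummaryType = "tldr" | "key-points" | "teaser";
```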
Before generating summaries, we need to create a summarizer instance:
```ts
async function createSummarizer(type: SummaryType): Promise<SummarizerInstance | undefined> {
  const availability = await Summarizer.availability();
  setAvailability(availability);

  if (availability === "unavailable") {
    // Handle unavailable state
    return undefined;
  }

  return Summarizer.create({
    sharedContext: "This is a blog about web development topics and programming experiments...",
    type,
    format: "markdown",
    length: type === "tldr" ? "long" : "medium",
    monitor: (monitor) => {
      monitor.addEventListener("downloadprogress", (event) => {
        setDownloadProgress(event.loaded); // loaded is a value between 0 and 1
      });
    },
  });
}
```
First, we check whether the model is available at all. Then we decide what to do: if the model is completely unavailable, we cannot generate a summary and have to stop. In any other case we can create our summarizer instance, which will automatically download the model (and update the progress bar) if it's not already present. Once the model is available, a new session is created and we can start summarizing stuff.
But as you can see, I already pre-configured my session with the following options:

- `sharedContext`: some context about this blog to help the model
- `type`: the summary type (`tldr`, `key-points`, `teaser`)
- `format`: `markdown`, so the output can be rendered with rich formatting
- `length`: the summary length (`short`, `medium`, `long`); `long` for `tldr` summaries and `medium` for everything else

Now that we have a way to create a summarizer instance, we can finally generate our first summary.
```ts
async function generateSummary(type: SummaryType): Promise<void> {
  // Create summarizer instance
  const summarizer = await createSummarizer(type);
  if (!summarizer) return;

  // Extract content from the current page
  const postContent = document.querySelector<HTMLElement>(".post-container")?.innerText;
  if (!postContent) return;

  setIsGenerating(true);

  // Generate summary
  const summary = await summarizer.summarize(postContent);

  // Add summary to messages
  setMessages((messages) => [...messages, {
    id: crypto.randomUUID(),
    actor: "ai",
    content: summary,
  }]);

  setIsGenerating(false);
}
```
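In the UI, this function then just needs to be wired up to the buttons for the different summary types. A minimal sketch (labels and markup are my own, not the actual component code):

```tsx
// Sketch: hypothetical trigger buttons inside the chat component
<div class="summary-actions">
  <button onClick={() => generateSummary("tldr")} disabled={isGenerating()}>TL;DR</button>
  <button onClick={() => generateSummary("key-points")} disabled={isGenerating()}>Key points</button>
  <button onClick={() => generateSummary("teaser")} disabled={isGenerating()}>Teaser</button>
</div>;
```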
While this works, it's also kind of boring: you have to wait ⏳ for ages again until the model has finished the complete summary, and only then is it displayed.
So let's do what nearly all AI chat interfaces do and enable real-time streaming. This updates the UI for each chunk of text that is generated, so we can already read the first output while the model is still working on the rest of the summary.
```ts
async function generateSummary(type: SummaryType): Promise<void> {
  /* Prepare summarizer and data */

  const stream = summarizer.summarizeStreaming(postContent);
  let content = "";
  let messageId = "";

  for await (const chunk of stream) {
    // Add new chunk to existing content
    content += chunk;

    if (!messageId) {
      // Create a new message for the first chunk
      messageId = crypto.randomUUID();
      setMessages((messages) => [...messages, {
        id: messageId,
        actor: "ai",
        content,
      }]);
    } else {
      // Update the existing message with the new content
      setMessages((messages) =>
        messages.map((message) =>
          message.id === messageId ? { ...message, content } : message,
        ),
      );
    }
  }

  /* Clean up */
}
```
This creates a more interactive and engaging experience for users, not having to wait for the entire summary to load before seeing any progress 🚀.
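By the way, the `/* Clean up */` placeholder at the end of the function is where the generating flag gets reset again, just like in the non-streaming version. If you're done with the instance, the API also provides a `destroy()` method to release the model session. A minimal sketch:

```ts
// Sketch of the clean-up step (mirrors the non-streaming version; destroy() is part of the API)
setIsGenerating(false);
summarizer.destroy(); // release the model session and its resources
```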
If you have a recent version of Google Chrome installed on your PC (no, it's not available on mobile), you may already see the ✨ button in the bottom-right corner of this page. Have fun trying it out!
For everyone not using Chrome or reading this on mobile, I made a small demo. It just outputs the same summaries that the model generated on my machine once during writing.
Summary chat that summed up this article in different styles. This is not live and uses pre-defined texts.
As said, this is just a small demo, and in my experience the model is not very stable (at the time of writing).
There is also a demo playground by Google if you want to try out summarizing your own texts, or play with the different parameters like length, type, etc.
The Chrome Summarizer API is an exciting new feature that makes building AI into web applications more accessible. I personally don’t want to pay AI providers so that users can use these features on my sites (especially since I don’t make money off of them). So this is a great opportunity for developers like me to experiment with AI without worrying about costs.
But I also have some wishes for the future:
For now, it’s a fascinating experiment that showcases the potential of local AI.
Try the chat component on this page if you have the right browser setup, and experiment with different summary types. The future of local AI in the browser is here - it just needs a bit more time to mature.