This Week in Glean: Looking back at Glean in 2021
(“This Week in Glean” is a series of blog posts that the Glean Team at Mozilla is using to try to communicate better about our work. They could be release notes, documentation, hopes, dreams, or whatever: so long as it is inspired by Glean.) All "This Week in Glean" blog posts are listed in the TWiG index (and on the Mozilla Data blog). This article is cross-posted on the Mozilla Data blog.
A year ago I posted Glean in 2021 as a way to look into the future and set out a vision and plan for the project. Today I'm looking back at 2021, if we were able to follow up on my plans back then and look at all the other things we did for Glean.
Let's start easy: According to the index we wrote 21 This Week in Glean blog posts (including this one). Close enough to one every other week. Communicating about our work is important and TWiGs are one way to put our ideas and thoughts about the project out there.
Let's first look at the topics I identified as important in last year's blog post.
The Glean SDK is a fully self-servable telemetry SDK, usable across different platforms. It enables product owners and engineers to instrument their products and rely on their data collection, while following Mozilla policies & privacy standards.
I'd say we're in pretty good shape here. The docs on how to add Glean to a project need some updates, but we also added a post for a slightly different audience: Integrating Glean for product managers, so it's not only for engineers anymore. We assisted others when they integrated Glean, but more for general data and instrumentation knowledge.
We specified & implemented 3 new metric types this year.
url metric is available in all SDKs, but not yet exposed on Firefox Desktop.
text metric were not usable until recently because of bugs in our schema generation tool.
We had to change how we generate the table schema for these and other metric types in the future a bit,
but the pipeline now correctly ingests data for these metrics
and users can query the data in the columns where they expect them.
Revamped testing APIs
We implemented some of the new testing APIs, namely metric coverage support and
Glean and FOG already use these new testing capabilities, but to my knowledge other Glean users are not yet relying on that.
Certainly an area we should make more progress on in the coming year.
UniFFI - generate all the code
Migration to rely on UniFFI for code generation has started! Right now adding new features to Glean requires changes across all SDKs. With UniFFI we can define the API of metric types once, implement it in Rust and get the code for the SDKs generated automatically. Less manual code means less potential for bugs and easier maintenance.
uniffi branch is used for slowly migrating Glean piece by piece.
Most of the metric types are defined in UDL now and the Kotlin code base has been migrated to it already,
with all tests passing again.
Work on this will continue next year, migrating the Swift, Python and Rust SDKs and then eventually merging it back into the main branch,
so we can ship it to users.
From there adding new metrics will require less implementation work and thus will allow us to follow up with some of the existing but currently paused requests.
All the other things
We released 9 major versions of Glean and did 33 releases in total. We're very liberal with breaking changes & major releases, though we make sure that upgrades require minimal changes where possible. The Kotlin, Swift, Python & Rust SDKs are versioned all the same, so a breaking change in only one SDK will still cause a major release for other SDKs, even without breaking changes.
More products and projects are now using Glean, including Rally studies, the Mozilla VPN client, and the new version of Focus for Android. A full list of products is available in the Glean Dictionary.
To ease integration between telemetry collected in Gecko and telemetry collected in the Android parts of the browser we're now shipping Glean through GeckoView. (After crashing it, fixing it and shipping it again without new bugs cropping up.) There's some work to be done next year to make local development easier again and to automate some of the tasks of upgrading Glean across applications.
Firefox on Glean (FOG)
We invested a lot of time in making the Firefox integration better, fixing bugs and convincing people to migrate their instrumentation to the new Glean-based system. It's going slow, but we now have the first few important metrics in Glean pings coming from Firefox Desktop. Additionally Glean was used to provide telemetry for the new Desktop Background Updater. Next year we hope to convince more people to start using Glean in Desktop, with the added benefit that this data will also be available in Firefox for Android.
A year ago Glean.js was just started. Thanks to Bea's tireless work it's now nearly feature-complete compared to the other Glean SDKs and used in production in applications & web extensions and soon even used to instrument websites. Next year Bea aims for a proper stable v1.0 release and hopefully more use across different products. It's a real alternative for those platforms where the initial set of SDKs don't cut it.
Glean surely is becoming the one telemetry solution across Mozilla products.
Tooling: GLAM, Dictionary, Looker
These are the tools I'm much less directly involved in, but they make up an important part of the Glean ecosystem. After all these are where users actually get to analyze at the data they collect.
GLAM currently only serves data for Firefox Desktop using legacy telemetry data as well as Firefox for Android using Glean data. Things are in place to enable more Glean-powered products soon, once we iron out some details around data expectations from applications.
The Glean Dictionary saw near-weekly releases and went from "promising prototype" into "valuable production tool" this year, thanks to a lot of work from wlach and linh. It was the basis of several initiatives such as metric annotations and tagging support. These things will be helpful to make data at Mozilla more discoverable and better documented.
The primary tool for analyzing data at Mozilla is now Looker. Folks have been very busy making more and more data accessible to both data experts and non-experts alike1. Glean data is (more or less) automatically available in Looker explores, for example event counts. Instead of bringing up ad hoc dashboards for initiatives when needed we can usually recommend that people just build things in Looker. That has the major advantage that exploration beyond what the dashboard summarizes is just one or two clicks away. I'm excited to shift my own use away from fiddling around with SQL to more approachable explores & dashboards, even for one-time analysis.
I'm saving an outlook on the next year of Glean for some other time. I called out some areas of work above that will see more work next year, and then there's also new projects & ideas coming up. I still plan to continue work on Glean for the vast majority of my time.
The This Week in Glean series is going into winter hiatus until January 2022. Thanks to everyone who contributed to This Week in Glean blog posts in 2021:
Alessio, Travis, Chutten, Bea, Raphael, Anthony, Will.
And of course thanks to everyone who contributed to Glean and the wider ecosystem around it.