Mozilla.ai is a new-ish endeavour of the Mozilla Foundation. Mozilla (the foundation and the corporation, it’s a whole thing with weird ownership and power structures) used to be well known for also kinda being in charge of getting Firefox and other free software tools developed. And for spending a lot of money in sorta dubious ways (like buying a strange mix of companies or paying a metric fuckton of money to their executives).
But Mozilla.ai dresses itself up as something different. Sure, Mozilla also wants to be part of the hype because of course they want to, but their goal is (quote):
Democratizing open-source AI to solve real user problems
Mozilla.ai
Great. I kinda get anxious when tech people use “democratizing” these days because it tends to mean “everyone gets to buy the same stuff if they have the same money” but let’s not dwell on that too much.
Mozilla.ai recently went into trying to understand the big trends in generative AI (aside from wasting ungodly amounts of energy and other resources to generate mediocre outputs) and the methodology is a bit … how do you say it in English … fucked up in all imaginable ways.
So Mozilla spoke to 35 organizations and wrote down notes about those conversations. So far, standard practice. In the end they had 18481 words of notes to go through. Which is not nothing but also not that much.
Quick sidebar about those notes: Every qualitative set of data (and that is what those notes are) is biased and subjective to a degree. Which questions you ask, which answers you follow up on and how, which parts of the responses you put into notes, which things you might overlook because of language issues with one or more participants not being native speakers … the list is long. Which isn’t a criticism of Mozilla. It’s just what comes with that kind of process. And there are methods to mitigate these effects to a certain degree but they’ll always be there. Which is something everybody doing research should know.
Mozilla also seems to know. And they had an innovative solution: THEY HAD AN LLM SUMMARIZE THE NOTES TO REDUCE BIAS. Quote:
To avoid confirmation bias and subjective interpretation, we decided to leverage language models for a more objective analysis of the data. By providing the models with the complete set of notes, we aimed to uncover patterns and trends without our pre-existing notions and biases.
Mozilla.ai
Now we are in the clown science department. After this sentence they talk about the MacBook they ran the models on locally (Privacy!!!111) and the models they used with some tech specs which makes it look very sciency in order to confuse the reader. But what they did was this:
They realized that their notes have biases and that they themselves might have biases when interpreting/summarizing them. So they took a bunch of heavily biased, mostly undocumented statistical models to summarize the notes and magically get rid of biases (but potentially add some fabrications/“hallucinations” to keep it spicy). But that’s not how any of this works.
This isn’t a “minus times minus results in a positive” kind of situation. What they are doing is stacking biases on top of one another. Their own biases are in the original data and all the biases from the models they used are now added to the mix. They literally made it worse.
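To make the stacking point a bit more concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the three topics, the preference weights of the note taker and of the model, and the idea that a bias can be modelled as a simple reweighting of topic shares. It says nothing about what Mozilla's actual notes or models look like. It only shows that running one skewed filter after another does not cancel anything by default.

```python
# Toy model (all numbers are hypothetical): what the interviewees actually
# talked about, as shares per topic.
true_share = {"evaluation": 1 / 3, "privacy": 1 / 3, "cost": 1 / 3}

# Hypothetical preferences: values above 1 over-represent a topic,
# values below 1 under-represent it.
note_taker_bias = {"evaluation": 1.5, "privacy": 1.0, "cost": 0.5}
model_bias = {"evaluation": 1.2, "privacy": 1.4, "cost": 0.6}


def apply_bias(shares, weights):
    """Reweight topic shares by a filter's preferences and renormalize."""
    raw = {topic: shares[topic] * weights[topic] for topic in shares}
    total = sum(raw.values())
    return {topic: value / total for topic, value in raw.items()}


def distortion(shares, reference):
    """Total variation distance between two topic distributions (0 = identical)."""
    return 0.5 * sum(abs(shares[t] - reference[t]) for t in reference)


notes = apply_bias(true_share, note_taker_bias)   # bias of the humans
summary = apply_bias(notes, model_bias)           # bias of the model on top

print("notes vs. reality:  ", round(distortion(notes, true_share), 3))
print("summary vs. reality:", round(distortion(summary, true_share), 3))
```

With these invented weights the summary ends up further from the conversations than the notes already were. A second filter could only pull things back if its preferences happened to run against the first one's, and with undocumented models nobody can check whether that is the case.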
Because with their own work they could have gone through the process of trying to identify their own biases, making them explicit (with the help of external parties? Mozilla does have money after all to hire experts.) and allowing people looking at the data and the analysis to use that information to interpret the results and the quality of the work. And what did the LLMs help them uncover from the data? Which insights have we gained?
Across all the models, 3 key takeaways stood out:
- Evaluation: Many organizations highlight the challenges of evaluating LLMs, finding it time-consuming.
- Privacy: Data privacy and security are major concerns influencing tool and platform choices.
- Reusability & Customization: Organizations value reusability and seek customizable models for specific tasks.
Wow. Mind.blown. So that’s the insight you tainted all your work for? The most obvious fucking things? I mean at least they are so obvious that the dangers of them being fabricated are negligible.
What this shows is an organization that lacks scientific rigour and a bit of critical distance to the field it wants to study/work in. This could have been some weird LinkedIn influencer’s post. It’s bad work and it’s not giving me any confidence that the Mozilla Foundation/Mozilla.ai knows what they are doing (aside from following the current hype).
Mozilla.ai is not a serious organization. It seems to be just another “AI” clown car.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.