From democratizing access to scientific data and enterprise-grade ML frameworks to on-device semantic search and interactive art, these projects are pioneering what it means to build AI in the open.
You can join the project leads and thousands of other open-source AI builders, advocates, and hackers in the Mozilla AI Discord.
Projects that focus on improving and customizing AI models for better performance and real-world use.
A platform that provides instant pronunciation feedback and language coaching, Koel Labs uses movie scenes to improve pronunciation in real time.
Designed to provide general and domain-specific AI capabilities for Swahili-speaking regions, Sartify’s Swahili-LLM supports multilingual interactions, including the Kiswa-English mix common in East Africa.
Helping users access high-quality scientific data in multiple languages, ScholasticAI uses open-source AI models to process openly licensed scientific documents. This approach keeps top-tier research accessible while remaining private and reducing the need for remote servers.
Transformer Lab is a comprehensive application for building, training, and managing local LLMs on Windows, macOS, and Linux. It simplifies working with LLMs, allowing users to train, fine-tune, evaluate, and export them across formats.
Tools and systems that make software development easier and more powerful, helping developers build and manage their projects more effectively.
Ersilia Model Hub is a curated set of AI/ML models for drug discovery research in low- and middle-income countries. The models can operate locally, ensuring access even in areas with poor internet connectivity and supporting drug discovery researchers in the Global South.
An AI copilot that learns from continuous feedback, Foyle uses a notebook format to simplify automation and operations tasks across workflows.
Feature Retrieval, Editing, and Understanding for Developers (FREUD) helps developers better control and understand AI model outputs by focusing on the interpretability of speech-to-text models like Whisper.
The NX project aims to create a scalable, distributed machine learning framework that outperforms current solutions by leveraging Elixir’s unique strengths in multitasking and fault-tolerance.
Using advanced OCR for layout, text, and math detection, Marker converts PDFs to Markdown while running locally. This ensures high-quality data extraction without needing constant internet access. Paired with companion tools Surya and Texify, Marker helps create robust training datasets and improves data processing efficiency.
Open WebUI is a popular platform with over 3 million downloads. It offers an easy-to-use interface that lets users run powerful GPT-like AI models without an internet connection. The team is currently focused on building a community where users can share data and insights to improve AI models together.
Offering full transparency, customizable prompts, and the freedom to choose and host preferred language models, Theia AI IDE extends the existing Theia editor to provide a suite of copilot options and agents, enabling developers to customize their tools and workflows with on-device or cloud AI solutions.
Creative projects that find new ways to display data and media, making information easier to understand and more engaging.
Offering on-device semantic search and image recognition, Ente Photos gives users private, durable, and reliable photo storage. It is a fully open source platform for storing family photos securely.
Designed to embed, visualize, cluster, and categorize data using the latest advancements in LLMs, Latent Scope features a pipeline tool for processing datasets and an exploration interface for visualizing and editing categorized data. It can operate entirely locally with open-source models or integrate with popular model providers.
A creative tool for interactive art, Tölvera empowers artists to create and interact with dynamic, self-organizing systems. It is inspired by fields such as artificial life (ALife) and self-organizing systems. It provides creative coding-style APIs that allow users to combine and compose various built-in behaviors, such as flocking, slime mold grow