Our Favorite Open Source Finds of the Year (So Far)
At Letterbrace, we like to explore new open source projects. It’s part of our job as a technical writing team, but it’s also just something we genuinely enjoy. Recently, there’s been a wave of interesting projects that tackle real pain points for developers, and we wanted to spotlight a few that have caught our eye.
Let’s get started.
Morphik: An AI-Native Database
GitHub: https://github.com/morphik-org/morphik-core
Problem
Modern AI applications must retrieve information that is both semantically meaningful and structurally organized (e.g., the steps for X in component Y, based on the documentation). Vector databases like Pinecone handle semantic similarity well but fall short at capturing relationships between concepts. Graph databases like Neo4j have the opposite problem: they are strong on structure but struggle to work with embeddings.
Solution
Morphik merges a vector search engine with a knowledge graph into a single database, built from the ground up in Rust. Unstructured data (text, images, etc.) is ingested, and a graph is spun up with built-in semantic search functionality. Devs can then query the graph (using a Python SDK) to generate structured responses that are used as RAG inputs for LLMs.
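To make that concrete, here’s a rough sketch of what the workflow might look like. The client and method names below are illustrative assumptions rather than the exact SDK surface, so check the Morphik docs for the real API.

```python
# Illustrative sketch only: connection URI and method names are assumptions,
# not the exact Morphik SDK interface.
from morphik import Morphik  # assumed Python SDK client

db = Morphik("morphik://localhost:8000")  # placeholder connection URI

# Ingest unstructured docs; Morphik builds embeddings plus a knowledge graph from them.
db.ingest_file("docs/payments-service.pdf")

# A query combines semantic search with graph traversal; the structured result
# can then be passed to an LLM as RAG context.
answer = db.query("How do I configure retries in the payments component?")
print(answer)
```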
Who It’s For
Morphik is designed for AI application developers who need advanced knowledge retrieval from domain-specific data. Teams can use it to build a QA bot for technical documentation, following a chain of reasoning to find relevant information and its dependencies.
Why It’s Interesting
This project tackles a real pain point for LLM-based tool developers by combining graphs with vector search. It does so efficiently, caching intermediate data along the way. The early traction (over 2.5k stars on GitHub) is a sign that developers appreciate its ability to give AI bots both structure and reasoning.
Klavis AI: MCP Servers for AI Agents
GitHub: https://github.com/Klavis-AI/klavis
Problem
LLM agents are great at generating responses, but triggering actions like sending emails or firing webhooks is harder. Developers end up writing “glue code” to manage authentication, translate between APIs, and configure servers for each tool, which basically defeats the purpose of having an AI agent do the work for you.
Solution
Klavis AI simplifies this process by offering prebuilt MCP (Model Context Protocol) servers, which give agents a uniform way to discover and call external tools. Behind the scenes, it handles authentication and request shaping, so developers don’t have to write “glue code”.
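For a sense of what that uniformity buys you, here’s a minimal sketch using the official MCP Python SDK to talk to an already-hosted MCP server. The server URL, tool name, and arguments are placeholders, and Klavis’s own SDK may wrap these steps differently.

```python
# Minimal MCP client sketch (placeholder URL and tool name; not Klavis-specific code).
import asyncio
from mcp import ClientSession
from mcp.client.sse import sse_client  # SSE transport for remote MCP servers

async def main():
    # A hosted MCP server endpoint, e.g. one provisioned through Klavis (placeholder URL).
    async with sse_client("https://example.com/mcp/gmail/sse") as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()  # agents discover tools the same way everywhere
            print([t.name for t in tools.tools])
            result = await session.call_tool(   # and invoke them through one interface
                "send_email",                   # hypothetical tool name
                {"to": "team@example.com", "subject": "Hi", "body": "Sent via MCP"},
            )
            print(result)

asyncio.run(main())
```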
Who It’s For
Klavis AI is built for developers who need to connect to external APIs like email, CRM, and databases. It is especially useful for teams using agentic frameworks that depend on MCP servers to perform their tasks.
Why It’s Interesting
Aware of MCP’s complexity, the founders structured the project to lower the barrier for developers. The setup is clean, extensible, and easy to deploy, and users have responded positively, with over 3k stars on GitHub.
Chonkie: Intelligent RAG Chunking
GitHub: https://github.com/chonkie-inc/chonkie
Problem
Noisy context makes large documents difficult for AI models to process effectively. Quick fixes like splitting by token count or using newline markers may work in simple cases, but will fail as inputs scale.
Solution
Chonkie uses NLP techniques to split text into self-contained “chunks” which respect both structure and context. This leads to fewer hallucinations and more accurate, faster responses from LLMs.
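Here’s roughly what that looks like in practice, assuming Chonkie’s RecursiveChunker; constructor arguments and chunk attributes may differ slightly from the current release, so treat this as a sketch.

```python
# Sketch based on Chonkie's documented usage; check the docs for exact parameters.
from chonkie import RecursiveChunker

# Splits on document structure (paragraphs, sentences) before falling back to tokens.
chunker = RecursiveChunker(chunk_size=512)

with open("handbook.md") as f:
    text = f.read()

chunks = chunker.chunk(text)
for chunk in chunks:
    # Each chunk is a self-contained span you can embed and index for RAG.
    print(chunk.token_count, chunk.text[:60])
```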
Who It’s For
Developers building RAG pipelines can save time by using Chonkie’s plug-and-play solution instead of spending precious development time to make their text LLM-ready.
Why It’s Interesting
Chonkie has seen early adoption (around 1k GitHub stars), which is quite strong for a utility library. While it’s not as flashy as some of the other projects here, practically every LLM application runs into context limits that make chunking necessary, and Chonkie solves the problem with a reliable, low-config approach.
Cua: Docker for AI Agents
GitHub: https://github.com/trycua/cua
Problem
While AI agents have advanced, having them operate a browser and run shell commands is still painful to manage, especially on macOS. It also raises security concerns when agents are granted system-level access.
Solution
Built on Apple Silicon’s virtualization, Cua spins up lightweight macOS or Linux containers that agents can fully control. It abstracts away the OS setup, letting agents run in a sandboxed environment with ~90% of host speed (and without exposing the host system).
Who It’s For
It targets developers working with autonomous agents that need to interact with software or websites (particularly on macOS) in a secure environment.
Why It’s Interesting
Cua tackles something that is still quite novel: giving agents OS-level control in a way that is both secure and easy to set up. The combination of performance and sandboxing makes it very developer friendly, as its early adoption (over 8k GitHub stars) suggests.
Browser-Use: Autonomous Agents That Browse The Web
GitHub: https://github.com/browser-use/browser-use
Problem
LLMs don’t have native internet capabilities beyond text. Existing tools like Selenium are designed for scripted flows, not agent-driven decision making. Connecting LLM output to real browser actions remains a challenge.
Solution
Browser-Use introduces an agent framework that serves as a “gateway” between LLMs and the browser, letting agents discover the actions available on a page and take them. It supports dynamic workflows where agents navigate, interpret, and act based on context.
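For a sense of the developer experience, here’s a minimal sketch loosely following the project’s README; the LLM wrapper and the shape of the returned history vary between versions.

```python
# Minimal browser-use sketch (model choice and result handling may differ by version).
import asyncio
from browser_use import Agent
from langchain_openai import ChatOpenAI

async def main():
    agent = Agent(
        task="Find a direct flight from SFO to JFK next Friday and list the cheapest option",
        llm=ChatOpenAI(model="gpt-4o"),
    )
    # The agent opens a browser, reads pages, and decides its next action step by step.
    history = await agent.run()
    print(history)

asyncio.run(main())
```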
Who It’s For
Browser-Use is ideal for developers building personal agents that interact with the web, such as booking flights, scraping sites, and filling out online forms.
Why It’s Interesting
With a $17M seed round and over 50k GitHub stars behind it, Browser-Use is a clear signal that demand for reliable browser automation is huge. Sites change frequently, bot detection is common, and authentication flows are complex, but Browser-Use helps bridge those gaps.
Plexe: Custom ML from a Single Prompt
GitHub: https://github.com/plexe-ai/plexe
Problem
Building a production-grade custom ML model typically takes weeks (or months). It involves setting up pipelines, experimenting with algorithms, tuning parameters, and repeatedly iterating across changing requirements.
Solution
Plexe streamlines the entire process. You describe the type of model you need, and a “team” of coordinated AI agents connects to your data, tests various approaches, assesses performance, and converges on a solution.
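A sketch of what that flow might look like, loosely modeled on the project’s README; the class, method, and parameter names are approximations and may not match the current API exactly.

```python
# Illustrative Plexe sketch; names are approximations of the documented usage.
import pandas as pd
import plexe

df = pd.read_csv("customers.csv")  # placeholder dataset

# Describe the model you want in plain language.
model = plexe.Model(intent="Predict whether a customer will churn in the next 30 days")

# Behind the scenes, a team of agents explores features, algorithms, and hyperparameters.
model.build(datasets=[df], provider="openai/gpt-4o-mini")

# Query the finished model (feature names here are made up).
print(model.predict({"tenure_months": 4, "monthly_spend": 29.0}))
```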
Who It’s For
Plexe supports teams that need ML models but don’t have the know-how or the bandwidth to build from scratch. It automates the whole process, making it possible to generate a recommendation engine or a churn predictor without step-by-step manual tuning.
Why It’s Interesting
The concept of generating a working ML model from a single prompt signals a broader evolution in development workflows. Plexe is moving in that direction, and its ~2k GitHub stars reflect growing interest in this simplified approach.
assistant-ui: Drop-in ChatGPT-Style UI Component
GitHub: https://github.com/assistant-ui/assistant-ui
Problem
Apps adding ChatGPT-style chat features need a clean, responsive interface. While this isn’t inherently complex, handling details like autoscroll and accessibility adds repetitive overhead for developers.
Solution
assistant-ui abstracts all of this logic into a TypeScript/React library, with components like <ChatContainer> and <MessageInput>. Developers can get started by running npm install and connecting it to their AI backend.
Who It’s For
Developers building AI chat features can use assistant-ui to save weeks of effort spent on building a production-ready interface. It’s already being used by projects like LangChain.
Why It’s Interesting
It has already hit 50k+ monthly NPM downloads and ~5k GitHub stars. Users value its simplicity and customizability, and it covers core features like streaming, tool-call rendering, and accessibility.
Infisical: Secrets Management Built for Devs
GitHub: https://github.com/Infisical/infisical
Problem
In microservice architectures, secrets live in many places: .env files, CI pipelines, Kubernetes configs, and more. Each service handles secrets differently, creating isolated systems that don’t coordinate. This inconsistency leads to secret sprawl, which only gets worse as applications scale.
Solution
Infisical solves this by centralizing storage and injecting secrets into your environments in real time. With features like versioning, audit logs, and credential rotation (along with an intuitive dashboard), managing secrets becomes significantly more straightforward.
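In the simplest setup, application code doesn’t call an API at all: the Infisical CLI fetches secrets and injects them into the process environment, so your code just reads environment variables. A minimal sketch (the secret names below are placeholders):

```python
# app.py — launched as: infisical run -- python app.py
# The Infisical CLI fetches secrets for the current project/environment and
# injects them as environment variables before starting this process.
import os

DATABASE_URL = os.environ["DATABASE_URL"]      # placeholder secret name
STRIPE_API_KEY = os.environ["STRIPE_API_KEY"]  # placeholder secret name

print("Connecting with", DATABASE_URL[:12], "...")
```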
Who It’s For
It’s built for developers who want secure secrets management without diving deep into platform engineering. Infisical handles much of the heavy lifting, from storage and injection to rotation and scanning.
Why It’s Interesting
Infisical nails what most open-source infra tools miss: simplicity. It offers a clean UI, a straightforward CLI, and thoughtful abstractions that make secrets easier to manage at scale (reflected in its ~18k stars on GitHub).
Tiptap: A Headless, Extension-Based Editor Framework
GitHub: https://github.com/ueberdosis/tiptap
Problem
Many editors either lock developers into specific frameworks or require extra work to support features like collaborative editing. Rather than speeding up development, these limitations often turn into bottlenecks.
Solution
In Tiptap, everything is extension-based, from core features like mentions and tables to real-time collaboration. It supports frontend frameworks like React and Vue while keeping the editor logic decoupled from the UI.
Who It’s For
Tiptap is built for developers who want flexibility without giving up a rich feature set. Its customizable architecture makes it applicable to a wide variety of use cases.
Why It’s Interesting
Tiptap’s headless architecture has helped it gain ~30k GitHub stars and adoption from companies like GitLab and Nextcloud. It abstracts away difficult problems like schema management and parsing, and gives developers a modern, unopinionated engine that they can build on.
ParadeDB: Postgres-Native Search and Analytics
GitHub: https://github.com/paradedb/paradedb
Problem
Postgres wasn’t built for full-text search or large-scale analytics. As applications grow, tools like Elasticsearch can be bolted on to fill those gaps, but this adds complexity: you have to build ETL pipelines, keep data in sync, and manage separate infrastructure.
Solution
ParadeDB is a modern Elasticsearch alternative built directly on top of Postgres. It extends Postgres with a suite of Rust-based extensions that add full text search, vector search, and columnar analytics.
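Because everything stays inside Postgres, querying it is just SQL. Here’s a small sketch using psycopg against a hypothetical table; it assumes the pg_search extension is installed and a BM25 index already exists on the column being searched.

```python
# Sketch: full-text search via ParadeDB's pg_search from Python (table name is made up).
import psycopg  # pip install "psycopg[binary]"

with psycopg.connect("postgresql://user:pass@localhost:5432/appdb") as conn:  # placeholder DSN
    with conn.cursor() as cur:
        # The @@@ operator runs a BM25 full-text query; it requires a bm25 index
        # on the `description` column (created separately with CREATE INDEX ... USING bm25).
        cur.execute(
            "SELECT id, description FROM products WHERE description @@@ %s LIMIT 5",
            ("wireless keyboard",),
        )
        for row in cur.fetchall():
            print(row)
```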
Who It’s For
ParadeDB is designed for developers who need real-time search within their existing PostgreSQL infrastructure. By staying within the Postgres ecosystem, it simplifies deployment, scaling, and management.
Why It’s Interesting
What makes ParadeDB stand out, as its 7k+ GitHub stars suggest, is its emphasis on unification. It enhances Postgres with native search and analytics features and avoids the operational overhead of stitching together external tools.
Final Thoughts
Open source thrives on collaboration and shared curiosity. We’re excited to keep watching these projects grow (and to discover even more).
If you’re building something in this space or know of a project worth exploring, drop us a message. We’d love to check it out!