At Scaleway, we’re convinced that open source is one of the cornerstones of today’s software development. Many cloud players like us might not even exist without it, and that’s why we foster open-source initiatives.
So we were particularly excited to discover FerretDB, a fast and scalable open-source NoSQL document store, and we decided to team up with the people behind it to add FerretDB to Scaleway’s ecosystem.
What is FerretDB?
FerretDB is an open-source alternative to MongoDB® that provides a MongoDB®-compatible API layer over a PostgreSQL engine to store and retrieve data. Adding a managed document database to our products would also be highly valuable, both for our users and for us.
Many developers like being able to query document data through a JSON-structured query language. PostgreSQL is a rock-solid database system we really like and already provide as a managed service. But we also know that writing structured SQL, guarding against SQL injection, or switching to PostgreSQL's JSON operators (->, @>, or ?) can be a turn-off for some, especially when entire codebases depend on a document query API, and migrating a database is far from being a business priority or a de-stressing weekend activity.
FerretDB combines the best of both worlds by building on reliable and proven open-source software and enhancing it with the MongoDB®-like APIs and features developers love.
How does FerretDB work?
FerretDB relies on a modular architecture built on top of PostgreSQL. It stores documents in binary JSON format using the corresponding PostgreSQL column type (jsonb). It also uses several PostgreSQL tables to store structural information such as database names, collection names, and user access management.
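To make this storage layout concrete, here is a minimal sketch of the idea: one table per collection with a single JSON column, plus a metadata table that records which table belongs to which collection. SQLite stands in for PostgreSQL so that the example runs with the Python standard library alone, and the table names `collections_meta` and `movies_docs` are hypothetical. FerretDB's actual schema may well differ.

```python
import json
import sqlite3

# SQLite stands in for PostgreSQL so the sketch runs anywhere; in PostgreSQL
# the document column would be of type jsonb rather than TEXT.
conn = sqlite3.connect(":memory:")

# Hypothetical metadata table mapping each collection name to the table that
# holds its documents. This is an assumption, not FerretDB's real schema.
conn.execute(
    "CREATE TABLE collections_meta (collection TEXT PRIMARY KEY, table_name TEXT)"
)

# One table per collection, with a single column holding the whole document.
conn.execute("CREATE TABLE movies_docs (_jsonb TEXT NOT NULL)")
conn.execute("INSERT INTO collections_meta VALUES (?, ?)", ("movies", "movies_docs"))

doc = {"title": "The Favourite", "year": 2018, "type": "movie"}
conn.execute("INSERT INTO movies_docs (_jsonb) VALUES (?)", (json.dumps(doc),))

# Reading back: resolve the table through the metadata, then decode the JSON.
(table_name,) = conn.execute(
    "SELECT table_name FROM collections_meta WHERE collection = ?", ("movies",)
).fetchone()
stored = [json.loads(row[0]) for row in conn.execute(f"SELECT _jsonb FROM {table_name}")]
```

Keeping the whole document in one column is what lets the relational engine stay unaware of the document's shape, while the metadata table keeps the mapping from collections to tables inside the database itself.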
To be compatible with existing software and tools built on MongoDB®, FerretDB translates MongoDB® wire protocol queries to SQL. As with any project of this type, several incompatibilities can emerge, but these are minor at this stage, and the FerretDB community keeps enhancing compatibility with more and more advanced queries over time.
For example, the following insert request sent to FerretDB…
db.movies.insertOne({
  title: "The Favourite",
  genres: [ "Drama", "History" ],
  runtime: 121,
  rated: "R",
  year: 2018,
  directors: [ "Yorgos Lanthimos" ],
  cast: [ "Olivia Colman", "Emma Stone", "Rachel Weisz" ],
  type: "movie"
})
…is then translated by FerretDB into the following SQL query sent to PostgreSQL:
INSERT INTO mydb.movies_257fbbf4 (_jsonb) VALUES ($1)
Finally, after receiving the PostgreSQL response, FerretDB sends the result of the insert back to the client.
For the moment, search queries such as db.find() are not yet optimized to translate efficiently to SQL, but that should evolve a lot over the coming months, so we won’t go into detail now.
With this overall approach, FerretDB is fully stateless and, by decoupling storage engine and query translation, keeps all PostgreSQL consistency features.
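The insert translation walked through above can be sketched as a pure function: it takes a database name, a collection name, and a document, and returns a parameterized SQL statement plus its bound value. This is only an illustration under stated assumptions. The function names are hypothetical, and the hash suffix scheme (the first 8 hex digits of SHA-256) is invented for the example and is not FerretDB's actual algorithm.

```python
import hashlib
import json
from typing import List, Tuple


def table_name(collection: str) -> str:
    """Derive a table name from a collection name plus a short hash suffix.

    The suffix scheme is an assumption made for illustration only.
    """
    suffix = hashlib.sha256(collection.encode("utf-8")).hexdigest()[:8]
    return f"{collection}_{suffix}"


def translate_insert(database: str, collection: str, document: dict) -> Tuple[str, List[str]]:
    """Turn an insert of one document into a parameterized SQL statement.

    The document travels as a bound parameter ($1), never as SQL text, so its
    content cannot alter the statement. The identifiers are interpolated
    directly, which is only acceptable if names are validated beforehand.
    """
    sql = f"INSERT INTO {database}.{table_name(collection)} (_jsonb) VALUES ($1)"
    params = [json.dumps(document)]
    return sql, params


sql, params = translate_insert("mydb", "movies", {"title": "The Favourite", "year": 2018})
```

Because the function depends only on its inputs, any instance of the translation layer produces the same statement for the same request, which is the property that lets such a layer remain stateless and leave consistency to PostgreSQL.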
Managed FerretDB on Scaleway
There are multiple ways to help open-source initiatives: contributing to them, giving them visibility, and using their technology. FerretDB is still an emerging open-source technology, and after multiple chats with the FerretDB team, we found that the most useful contribution we could make as a cloud provider would be to integrate FerretDB into our Managed Database stack. This also makes deploying and managing databases as easy as possible for developers.
Our integration of FerretDB will give you fast and smooth access to FerretDB’s technology and let you test it and provide FerretDB with feedback on concrete use cases, as well as on any compatibility and performance issues.
Thanks to its architecture, integrating FerretDB is as simple as running a FerretDB container alongside a PostgreSQL container. That made it really easy to add FerretDB to our existing Managed Relational Database stack.
FerretDB can run safely on our virtual machines and benefit from existing failover capabilities so that incoming requests can be redirected to the right instances in case of a failure. It also integrates well with our existing monitoring stack, and we can send FerretDB metrics and logs to our observability product, which is based on Grafana.
Finally, FerretDB’s support of mongodump and mongorestore commands makes upgrading FerretDB as easy as dumping data, restarting FerretDB containers, and restoring dumped data. This is particularly valuable in such an early-stage project, as we want to upgrade to the latest stable FerretDB version every few days to get access to new features and see progress quickly.
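The dump, restart, restore sequence described above could be scripted roughly as follows. This is a sketch, not our actual tooling: the connection URI and dump directory are placeholders, and `restart_ferretdb` is a hypothetical callback standing in for whatever restarts the containers in a given setup. Only the `--uri` and `--out` options of mongodump and the `--uri` option of mongorestore are used.

```python
import subprocess
from typing import Callable, List, Tuple


def upgrade_commands(uri: str, dump_dir: str) -> Tuple[List[str], List[str]]:
    """Build the mongodump and mongorestore command lines for one upgrade."""
    dump = ["mongodump", f"--uri={uri}", f"--out={dump_dir}"]
    restore = ["mongorestore", f"--uri={uri}", dump_dir]
    return dump, restore


def upgrade(
    uri: str,
    dump_dir: str,
    restart_ferretdb: Callable[[], None],  # hypothetical: restarts the containers
    run=subprocess.run,  # injectable so the flow can be exercised without the tools
) -> None:
    """Dump the data, restart FerretDB, then restore the dump."""
    dump, restore = upgrade_commands(uri, dump_dir)
    run(dump, check=True)      # stop here if the dump fails
    restart_ferretdb()
    run(restore, check=True)
```

Passing `check=True` makes a failed dump raise before anything is restarted, so a broken dump never leads to a restore attempt.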
Building the MVP
Integrating FerretDB is fairly straightforward overall, but as with many new technologies, the devil is in the details. In this case, as FerretDB is still being built, some key features are missing and are planned for v1.0, which should be released in April 2023. These features are indexing, aggregation pipelines, and cursor commands.
Moreover, some caveats that stem from structural differences with PostgreSQL or from limitations in the FerretDB context might persist over time, but they will stay limited, and most standard applications should be able to work around them. For instance, FerretDB document fields cannot yet contain infinity or -infinity values, and database and collection names cannot yet contain spaces or dots.
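An application could check for the two caveats mentioned above before sending data. The sketch below is a hypothetical client-side pre-check, not FerretDB's own validation logic, and the function names are invented for the example.

```python
import math


def invalid_name(name: str) -> bool:
    """Return True if a database or collection name contains a space or a dot."""
    return " " in name or "." in name


def infinite_fields(value, path: str = "") -> list:
    """Return the dotted paths of all fields holding +/- infinity, at any depth."""
    found = []
    if isinstance(value, dict):
        for key, item in value.items():
            found += infinite_fields(item, f"{path}.{key}" if path else key)
    elif isinstance(value, list):
        for index, item in enumerate(value):
            found += infinite_fields(item, f"{path}.{index}" if path else str(index))
    elif isinstance(value, float) and math.isinf(value):
        found.append(path)
    return found


doc = {"runtime": 121.0, "stats": {"ratio": float("-inf")}, "scores": [0.5, float("inf")]}
problems = infinite_fields(doc)
```

Reporting the offending paths rather than a plain yes or no makes it easier to decide on a workaround for each field, for example replacing the value with null or a sentinel number.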
We really enjoyed the deep dive into FerretDB, getting to know the team behind it, and seeing its quick evolution. We will continue to follow the development of this open-source alternative closely as it becomes more and more mature and increases feature parity while improving performance for production workloads.
Currently, FerretDB cannot be used as a drop-in database replacement for applications that rely fully on MongoDB® features (such as Rocket.Chat, KeystoneJS, or NodeBB).
But you can already test it for custom applications or side projects and see its feature parity increasing quickly. If you’re interested in testing the technology, you can request access to our managed prototype right here.