Overcome Firestore’s Limitations By Migrating To MongoDB

Michael Cosimo

Cloud Firestore is a flexible, NoSQL database for mobile, web, and server development. Amidst all its glory, Firestore has some notable shortcomings.

For starters, Firestore allows only one instance per project, imposes several query limitations, and provides no straightforward way of cleaning up data spread over multiple collections.

This article highlights certain shortcomings of Firestore and our expert take on migrating from Firestore to MongoDB.

Dissecting the Pitfalls of Firestore

Firestore uses the NoSQL data model to store and sync data for client-and-server-side development. It syncs data in documents across client apps through real-time updates. The documents support many different data types and are further organized into collections. You can also create subcollections within documents and build hierarchical data structures.

While Firestore provides no-maintenance massive scalability, there are some significant pitfalls:

Problems with aggregation queries
An aggregation query processes data from multiple index entries to return a single summary value. Firestore supports the count() aggregation query, which lets you determine the number of documents in a collection or query. These queries rely on the existing index configuration, and their cost scales in proportion to the number of index entries scanned. In other words, the work grows with the number of matching entries instead of staying constant, which hinders efficient scaling.

Tricky latency and slow write frequency
Firestore’s limits on writes and transactions can significantly hamper scaling under load. The maximum sustained write rate to a single document is 1 per second, which is low by industry standards. While this rate may be workable early on, it becomes a growing burden as traffic increases, because writes begin to fail at higher sustained rates.

Full-text search
Firestore doesn’t support native indexing or search for text fields in documents. Consequently, you need to rely on extensions to perform a full-text search (like searching for posts that contain a specific string).

The Cloud Firestore documentation suggests using Algolia – a dedicated third-party search service. However, this alternative creates the problem of syncing data across two databases, making it challenging to maintain document integrity.

Data Integrity and Security
Firestore doesn’t guarantee data integrity because it has very dynamic data structures similar to JSON. Therefore, developers cannot constrain data on the database level or set restrictions such as “only certain fields can be viewed” within a single data collection. As a result, bugs are inevitable and can lead to a data compromise.

Summing Up Our Take on Firestore

Here are our 2 cents on Firestore and its practical limitations.

While leveraging Firestore, we encountered a requirement to move towards a multi-tenant architecture and to isolate each tenant’s data in Firestore.

While Firestore in Datastore mode allows a multitenant application to use separate silos of data for each tenant, a project is limited to one Firestore instance, and that instance is restricted to a single database.

Additionally, we had a requirement to bring up and delete instances. There’s no easy way of cleaning up Firestore data spread over multiple collections. Deleting a collection requires coordinating an unbounded number of individual delete requests or writing cleanup scripts, and we had to modify those scripts whenever we introduced new collections in Firestore. Although data can be updated and deleted atomically, a single transaction is limited to 500 writes.
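To illustrate what such a cleanup script has to do, here is a minimal Python sketch that splits a list of document paths into groups of at most 500, the per-transaction write cap mentioned above. The function name `chunk_deletes` and the sample paths are hypothetical, and the actual delete calls against Firestore are deliberately left out; this only shows the batching logic under those assumptions.

```python
from typing import Iterator, List, Sequence

# Per-transaction write cap cited in the text above.
MAX_WRITES_PER_BATCH = 500


def chunk_deletes(
    doc_paths: Sequence[str], batch_size: int = MAX_WRITES_PER_BATCH
) -> Iterator[List[str]]:
    """Yield successive groups of document paths, each small enough for one batch.

    `doc_paths` is a hypothetical list of document paths gathered by the
    cleanup script; issuing the deletes themselves is out of scope here.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(doc_paths), batch_size):
        yield list(doc_paths[start:start + batch_size])


if __name__ == "__main__":
    paths = [f"baz/{i}" for i in range(1201)]
    for batch in chunk_deletes(paths):
        print(len(batch))
```

Every new collection adds another list of paths to feed through this loop, which is why the scripts needed updating each time the data model grew.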

Together, these limitations pushed us to explore suitable alternatives to Firestore that would correspond to our specific requirements.

MongoDB Emerging as an Effective Alternative to Firestore

While looking for the best alternatives to Firestore, we favoured switching to an open-source, cloud-agnostic database platform. While we explored some open-source alternatives like Cassandra, it turned out to be costly and had a complex setup. Finally, we decided upon MongoDB, a more robust document database known for high performance and security.

Our technical team has rich expertise in MongoDB, and we already have a self-hosted Mongo setup per tenant in our cluster, which would prove to be cost-effective.

SQL vs NoSQL Database
An SQL database is relational: it follows the relational model, which normalizes data across strictly defined tables and therefore suits highly structured data. While SQL remains a standard in organizations worldwide, many other database systems are emerging.

NoSQL databases are among these newer systems. They are designed to “scale horizontally”, which means adding more nodes (computers or servers) to a cluster instead of adding resources to a single node, and they are well matched to the cloud. Moreover, the NoSQL data structure is flexible enough to accommodate different models, including key-value, document, column-oriented, and graph.

We chose NoSQL because our data was deeply nested as subcollections in Firestore. To keep the same schema in a relational database, we would have needed performance-impacting joins across multiple tables. Further, the database schema was prone to change as we kept adding new features. In such a scenario, a NoSQL database would allow more flexibility than an SQL database.

Our Strategy to Alleviate Migration Challenges
MongoDB has several advantages over Firestore. It can be operated on-premises or in the cloud, unlike Firestore, which is purely a cloud database service, and it allows extremely powerful and complex queries to be assembled easily.

While migrating from Firestore to MongoDB, we encountered some rather interesting challenges.

We had a deeply nested structure in Firestore, where each nested object was a subcollection, similar to the one below:


                // collection baz
                {
                    "id": "0001",
                    "type": "ipsum",
                    "name": "lorem",
                    "description": "Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.",
                    "foo": // subcollection foo
                        [
                            { "id": "1001", "value": "dolor" },
                            { "id": "1002", "value": "sit" },
                            { "id": "1003", "value": "amet" },
                            { "id": "1004", "value": "consectetur" }
                        ],
                    "bar": // subcollection bar
                        [
                            { "id": "5001", "value": "adipiscing" },
                            { "id": "5002", "value": "elit" },
                            { "id": "5005", "value": "sed" },
                            { "id": "5007", "value": "incididunt" },
                            { "id": "5006", "value": "dolore" },
                            { "id": "5003", "value": "enim" },
                            { "id": "5004", "value": "minim" }
                        ]
                }

MongoDB vs Firestore

Maintaining this nested structure presented certain challenges that we had to navigate. By default, a Firestore query returns the document at the specified location and omits its subcollections. This allowed us to segregate information into subcollections according to the access patterns in our application logic.

In MongoDB, on the other hand, a direct port of this structure would store each subcollection as part of the parent document itself, so accessing the document would also return its subcollections, increasing the response size and hurting performance. In addition, each MongoDB document has a size limit of 16 MB. This would cap the amount of data we could store in our subcollections, and we anticipated that this data would grow over time and eventually exceed the 16 MB limit.
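As a rough back-of-the-envelope check on that limit, the Python sketch below estimates how many embedded items fit in one document before it reaches 16 MB. The function name and the byte sizes in the usage lines are illustrative assumptions, and the estimate ignores BSON's per-element overhead, so real capacity would be somewhat lower.

```python
# MongoDB's per-document size limit: 16 MB, i.e. 16 * 1024 * 1024 bytes.
MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024


def max_embedded_items(avg_item_bytes: int, parent_bytes: int = 0) -> int:
    """Rough upper bound on embedded items before the document hits the limit.

    avg_item_bytes: assumed average size of one embedded item.
    parent_bytes:   assumed size of the parent's own fields.
    BSON array overhead is ignored, so treat the result as optimistic.
    """
    if avg_item_bytes <= 0:
        raise ValueError("avg_item_bytes must be positive")
    if parent_bytes < 0:
        raise ValueError("parent_bytes must not be negative")
    remaining = max(MAX_BSON_DOCUMENT_BYTES - parent_bytes, 0)
    return remaining // avg_item_bytes


if __name__ == "__main__":
    print(max_embedded_items(1024))                  # 16384
    print(max_embedded_items(2048, 1024 * 1024))     # 7680
```

With items of about 1 KiB, an embedded array tops out at roughly sixteen thousand entries, which is easy to reach for data that grows with usage.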

To tackle this, we flattened the nested structure into separate MongoDB collections, as shown below:


                // collection baz
                {
                    "id": "0001",
                    "type": "ipsum",
                    "name": "lorem",
                    "description": "Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum."
                }

                // collection foo
                [
                    { "id": "1001", "value": "dolor", "baz_id": "0001" },
                    { "id": "1002", "value": "sit", "baz_id": "0001" },
                    { "id": "1003", "value": "amet", "baz_id": "0001" },
                    { "id": "1004", "value": "consectetur", "baz_id": "0001" }
                ]

                // collection bar
                [
                    { "id": "5001", "value": "adipiscing", "baz_id": "0001" },
                    { "id": "5002", "value": "elit", "baz_id": "0001" },
                    { "id": "5005", "value": "sed", "baz_id": "0001" },
                    { "id": "5007", "value": "incididunt", "baz_id": "0001" },
                    { "id": "5006", "value": "dolore", "baz_id": "0001" },
                    { "id": "5003", "value": "enim", "baz_id": "0001" },
                    { "id": "5004", "value": "minim", "baz_id": "0001" }
                ]
                

Once this was done, every Firestore subcollection had been migrated to its own MongoDB collection, with each of its documents referencing the parent document’s ID.
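The flattening step can be sketched in Python as follows. This is a minimal illustration and not our migration tooling: the function name `flatten_document` is hypothetical, it assumes each Firestore document has already been exported as a plain dictionary with its subcollections as lists, and it derives the reference field name (`baz_id`) from the parent collection name as in the sample above.

```python
from typing import Any, Dict, List, Sequence, Tuple

Document = Dict[str, Any]


def flatten_document(
    parent_collection: str,
    doc: Document,
    subcollections: Sequence[str],
) -> Tuple[Document, Dict[str, List[Document]]]:
    """Split a nested document into a flat parent plus per-collection children.

    Each child gains a "<parent_collection>_id" field holding the parent's id,
    mirroring the "baz_id" references in the flattened sample.
    """
    ref_field = f"{parent_collection}_id"
    # Keep only the parent's own fields.
    parent = {k: v for k, v in doc.items() if k not in subcollections}
    children: Dict[str, List[Document]] = {}
    for name in subcollections:
        # Copy each child and attach the reference to its parent.
        children[name] = [
            {**child, ref_field: doc["id"]} for child in doc.get(name, [])
        ]
    return parent, children


if __name__ == "__main__":
    nested = {
        "id": "0001",
        "type": "ipsum",
        "name": "lorem",
        "foo": [{"id": "1001", "value": "dolor"}, {"id": "1002", "value": "sit"}],
        "bar": [{"id": "5001", "value": "adipiscing"}],
    }
    parent, children = flatten_document("baz", nested, ["foo", "bar"])
    print(parent)
    print(children["foo"])
    print(children["bar"])
```

The returned parent and child lists could then be written to their respective MongoDB collections. Reading a parent's children becomes a query on the reference field, so an index on that field is worth considering.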

Conclusion

When selecting a database platform to store and manage your data, it’s imperative to weigh its pros and cons and choose a platform that suits the requirements of your organization. While Firestore is optimized for some use cases, it presents several drawbacks. We therefore migrated our operations to MongoDB, another NoSQL database, which offered us more advantages than Firestore for our requirements.

If you’d like to learn more about how we work with MongoDB to leverage and automate Google Cloud, sign up for a free consultation with D3V’s certified cloud engineers.