In response to an increasing number of customers who cannot run MongoDB at scale, Amazon has implemented the DocumentDB solution.

You may easily scale from 10GB to 64TB with the help of automated data scaling in DocumentDB. Let’s see how this can be done.

What is DocumentDB? 

AWS DocumentDB is a managed document database that emulates the MongoDB API (originally version 3.6). Amazon felt the need to design its own solution for large data volumes and mission-critical workloads. DocumentDB does not use any MongoDB source code; it is a proprietary Amazon implementation.

DocumentDB, just like MongoDB, is a NoSQL document store. A document store typically holds JSON-formatted data; that is, it stores and indexes JSON data structures. AWS positions Amazon DocumentDB as a drop-in replacement for MongoDB, although the two differ under the hood.

Before we take a deep dive into Amazon DocumentDB, it is important to understand NoSQL and why it is currently the next big thing. 

What is a NoSQL Database? 

NoSQL (often read as “not only SQL”) databases do not store data in tables that are related to each other. Document and graph databases are the two most widely used types. They can handle large volumes of data and heavy user loads.

NoSQL databases were designed with performance in mind rather than storage efficiency. NoSQL data is semi-structured and polymorphic, and these databases can easily hold vast amounts of unstructured data.

NoSQL databases are also generally easier to scale out than relational databases, often by relaxing the strict consistency guarantees that relational systems provide. Unlike rows in SQL tables, NoSQL documents can be nested. Schemaless NoSQL databases allow items in the same database to have different structures.
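To make the schemaless idea concrete, here is a minimal sketch in plain Python of two documents that could sit side by side in one collection. The field names and values are invented for illustration and do not come from any real dataset.

```python
import json

# Two hypothetical documents for the same collection.
# Their fields differ, and the second one nests a sub-document.
readings = [
    {"_id": 1, "device": "litter-box-01", "weight_kg": 4.2},
    {
        "_id": 2,
        "device": "thermostat-07",
        "temperature_c": 21.5,
        "location": {"room": "kitchen", "floor": 1},
    },
]


def fields_by_document(docs):
    """Return the sorted field names of each document."""
    return [sorted(doc.keys()) for doc in docs]


if __name__ == "__main__":
    for doc in readings:
        print(json.dumps(doc))  # each document serializes to JSON as-is
    print(fields_by_document(readings))
```

A relational table would need a fixed set of columns for both rows; a document store simply keeps each document with whatever fields it has.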

More developers are starting to host programs and data on public clouds. Scaling out rather than up and intelligently geo-locating data became key requirements. MongoDB offers both.

Companies everywhere employ NoSQL. Financial and healthcare data are significant use cases, as is IoT data (e.g., storing readings from a smart kitty litter box).

What led to DocumentDB?

Amazon came up with the idea because many people were having trouble running MongoDB on a large scale. Amazon thought that none of the current solutions, including MongoDB Atlas, could solve their customers’ problems, so they came up with their own.

For example, DocumentDB grows your database storage from 10GB to 64TB automatically, so you do not have to do anything. Before DocumentDB, it was hard to grow a MongoDB deployment to this size.

Amazon’s solution also has built-in fault tolerance. It automatically divides your storage volume into 10GB chunks spread across many disks. Each 10GB chunk is replicated six times across three availability zones for durability.

Up to two copies of the data can be lost without affecting write availability, and up to three copies can be lost without affecting read availability. The storage is also self-healing: data blocks and disks are checked for errors and repaired automatically.
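The fault-tolerance figures above can be expressed as a small sketch. This only encodes the numbers stated in the text (six copies, two may be lost for writes, three for reads); the function name is hypothetical and this is not how the service is implemented internally.

```python
COPIES = 6               # copies of each 10GB chunk, per the text
MAX_LOST_FOR_WRITES = 2  # writes survive the loss of up to two copies
MAX_LOST_FOR_READS = 3   # reads survive the loss of up to three copies


def availability(lost_copies: int) -> dict:
    """Report whether reads and writes remain available after losing copies."""
    if not 0 <= lost_copies <= COPIES:
        raise ValueError("lost_copies must be between 0 and 6")
    return {
        "writes": lost_copies <= MAX_LOST_FOR_WRITES,
        "reads": lost_copies <= MAX_LOST_FOR_READS,
    }


if __name__ == "__main__":
    for lost in range(COPIES + 1):
        print(lost, availability(lost))
```

If the six copies are spread evenly (an assumption here), each availability zone holds two of them, so losing a whole zone still leaves both reads and writes available.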

Because Amazon hosts the service, most compliance requirements are covered. It meets many standards, including PCI DSS, ISO 9001, SOC 1, SOC 2, SOC 3, and HIPAA.

Benefits of DocumentDB

#1. MongoDB-compatible

Amazon DocumentDB works with MongoDB 3.6 and 4.0 drivers. Customers may use many of the same apps, drivers, and tools with Amazon DocumentDB.

Amazon DocumentDB implements the Apache 2.0 open source MongoDB 3.6 and 4.0 APIs by emulating the responses that a MongoDB client expects from a MongoDB server. This gives mission-critical MongoDB applications the performance, scalability, and availability they require.
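Because the service speaks the MongoDB wire protocol, connecting usually comes down to a MongoDB-style connection string. The sketch below builds one with the standard library only. The host name and credentials are placeholders, and the options shown (TLS with a CA bundle file, `rs0` replica set, `retryWrites=false`) are assumptions based on commonly documented DocumentDB settings; check them against your own cluster.

```python
from urllib.parse import quote_plus, urlencode


def build_docdb_uri(user: str, password: str, host: str, port: int = 27017) -> str:
    """Build a MongoDB-style connection string for a DocumentDB cluster."""
    # Assumed connection options; adjust to match your cluster settings.
    options = urlencode({
        "tls": "true",
        "tlsCAFile": "global-bundle.pem",  # assumed local path to the CA bundle
        "replicaSet": "rs0",
        "readPreference": "secondaryPreferred",
        "retryWrites": "false",
    })
    # Credentials are percent-encoded so characters like "@" or "/" are safe.
    return f"mongodb://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/?{options}"


if __name__ == "__main__":
    # Placeholder host and credentials, for illustration only.
    print(build_docdb_uri("app_user", "p@ss/word", "docdb.example.internal"))
```

The resulting string can be handed to an existing MongoDB driver, for example `pymongo.MongoClient(uri)`, which is the point of the compatibility claim: the application code stays the same.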

#2. Monitoring

Amazon DocumentDB publishes Amazon CloudWatch metrics for your database instances. Using the AWS Management Console, you can monitor your cluster’s performance in areas such as compute and memory usage, query throughput, MongoDB operation counts, and active connections.

#3. Latency

Amazon DocumentDB supports JSON documents, several data types, and fast indexing. An in-memory optimized architecture allows the service to evaluate queries over large documents quickly.

#4. Access Control

Amazon DocumentDB supports role-based access control (RBAC) with built-in and user-defined roles. RBAC allows you to enforce least privilege by limiting what each user can do.
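As a rough sketch of least privilege in practice, the helper below builds a MongoDB-style `createUser` command that grants a single built-in role on a single database. The helper name, user name, and database name are hypothetical; `read` and `readWrite` are standard MongoDB built-in role names, assumed here to be available on your cluster.

```python
def create_user_command(username: str, password: str, database: str, read_only: bool = True) -> dict:
    """Build a createUser command scoped to one role on one database."""
    role = "read" if read_only else "readWrite"
    return {
        "createUser": username,  # the command name must be the first key
        "pwd": password,
        "roles": [{"role": role, "db": database}],
    }


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    print(create_user_command("report_reader", "change-me", "sales"))
```

A driver would then run this command against the cluster, for example with pymongo's `db.command(...)`. A reporting user created this way can query the `sales` database but cannot modify it.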

Through AWS Identity and Access Management (IAM), you can control what IAM users and groups can do with Amazon DocumentDB resources, including clusters, instances, snapshots, and parameter groups. You can also tag your Amazon DocumentDB resources and control which actions IAM users and groups can take on resources that carry a given tag.

#5. Encryption

Using the AWS Key Management Service (KMS), you can encrypt your Amazon DocumentDB databases.

When encryption is enabled, the data in the underlying storage is protected, as are the cluster’s automated backups, snapshots, and replicas. Client connections to Amazon DocumentDB are encrypted with TLS by default.

#6. Compliance Certifications

Amazon DocumentDB was built to the highest security standards to help you satisfy your own regulatory and compliance needs. It is compliant with PCI DSS; ISO 9001, 27001, 27017, and 27018; SOC 1, 2, and 3; and HIPAA.

#7. Global Clusters with High Availability

Amazon DocumentDB Global Clusters enable global reads and disaster recovery. They replicate your data across up to five AWS Regions with minimal performance impact.

#8. Multi-AZ Deployments with Replicas

With up to 15 replicas across three availability zones, Amazon DocumentDB automatically fails over to another instance when one fails. After a failure, Amazon DocumentDB will also try to create a new instance to replace the failed one.

#9. Fault-tolerant and self-healing storage

The storage volume is copied six times across three Availability Zones (AZs). This fault-tolerant storage can lose up to two copies of the data without affecting write availability. Amazon DocumentDB’s storage is also self-healing, automatically repairing failed data blocks and disks.

AWS DocumentDB FAQ

Is AWS DocumentDB the same as MongoDB?

No. Amazon DocumentDB (with MongoDB compatibility) is a separate document database service that is fast, scalable, and fully managed, and that can be used with MongoDB workloads.

As a document database, Amazon DocumentDB stores JSON data. You can store, query, and index that data with ease.

Customers can use the AWS Database Migration Service (DMS) for free for six months to quickly and easily move their on-premises or Amazon Elastic Compute Cloud (EC2) MongoDB databases to Amazon DocumentDB with almost no downtime.

How does Amazon DocumentDB work?

Amazon DocumentDB is a document database that implements the Apache 2.0 open-source MongoDB 3.6 and 4.0 APIs. As a result, you can use the same MongoDB drivers, applications, and tools with Amazon DocumentDB with little or no change.

How does Amazon DocumentDB scale?

Amazon DocumentDB is a web-scale database whose storage scales automatically from 10 GB to 64 TB in increments of 10 GB. Compute capacity can be scaled vertically by changing the instance size, and horizontally (for greater read throughput) by adding replica instances (up to 15) to the cluster.
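The storage arithmetic can be sketched in a few lines. This is a simplified model built only from the figures in the answer above (10 GB steps, 64 TB ceiling); it assumes 1 TB = 1024 GB and is not a description of the service’s actual allocation logic.

```python
import math

INCREMENT_GB = 10
MAX_GB = 64 * 1024  # 64 TB, assuming 1 TB = 1024 GB


def allocated_storage_gb(data_gb: float) -> int:
    """Round a data size up to the next 10 GB step, capped at the maximum."""
    if data_gb < 0 or data_gb > MAX_GB:
        raise ValueError("data size must be between 0 and 64 TB")
    steps = max(1, math.ceil(data_gb / INCREMENT_GB))  # never below one step
    return min(steps * INCREMENT_GB, MAX_GB)


if __name__ == "__main__":
    for size in (0, 10, 10.5, 25, MAX_GB):
        print(size, "->", allocated_storage_gb(size))
```

Under this model a 25 GB data set occupies three 10 GB steps, and growth continues one step at a time with no action from the operator.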

What are the main design features of Amazon DocumentDB?

Amazon DocumentDB has been built from the ground up with a cloud-first architecture, which means that JSON workloads can be scaled easily.

An important part of DocumentDB’s design is the separation of storage and compute, so each can scale independently. DocumentDB has a storage system that is distributed, fault-tolerant, and self-healing. Each database cluster can store up to 64 TB of data without sharding.

Conclusion 

DocumentDB is Amazon’s only managed MongoDB-compatible service. Amazon says DocumentDB has twice the throughput of currently available MongoDB solutions. The alternative would be to manage MongoDB yourself on EC2/EBS, which is challenging.

If you need those guarantees, pick DocumentDB; otherwise, stick with MongoDB. Another reason for choosing DocumentDB is keeping everything in AWS.