Disclaimer: this is my personal opinion and not the opinion of my colleagues or my employer.
We used to store terabytes of data in ElasticSearch in the form of JSON documents. As the size of the data stored in the cluster grew, we had to create new clusters with lots of nodes, and it turned into a maintenance and cost nightmare. The Microsoft Azure team suggested we move to DocumentDB to reduce the cost, and told us that since it can scale infinitely, there would not be any maintenance needed.
Our Use Case
We need to store JSON documents with an average size of 10-15 KB. These documents are rarely written more than 2 times, but could be queried many times. At this time, we have more than 100 million documents stored, and we add about 1 million documents per day.
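To put those numbers in perspective, here is a rough back-of-envelope sketch of the raw data volume they imply. It assumes decimal units (1 KB = 1,000 bytes) and ignores index and replication overhead, so treat the output as an order-of-magnitude estimate rather than a measurement.

```python
# Back-of-envelope sizing from the figures quoted above.
# Assumption: decimal units (1 KB = 1,000 bytes, 1 TB = 10**12 bytes).
DOC_COUNT = 100_000_000      # "more than 100 million documents"
DOCS_PER_DAY = 1_000_000     # "about 1 million documents per day"
AVG_DOC_KB = (10, 15)        # low and high end of the average document size


def volume_bytes(doc_count, doc_kb):
    """Raw JSON volume in bytes, ignoring index and replication overhead."""
    return doc_count * doc_kb * 1_000


stored_tb = tuple(volume_bytes(DOC_COUNT, kb) / 10**12 for kb in AVG_DOC_KB)
daily_gb = tuple(volume_bytes(DOCS_PER_DAY, kb) / 10**9 for kb in AVG_DOC_KB)

print(f"stored: {stored_tb[0]:.1f}-{stored_tb[1]:.1f} TB")
print(f"growth: {daily_gb[0]:.0f}-{daily_gb[1]:.0f} GB/day")
```

So the raw payload is on the order of 1 to 1.5 TB, growing by roughly 10 to 15 GB per day.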
Before making the change, I used the Azure cost calculator and estimated that we would need about 20,000 RUs for our normal operation. Our lack of knowledge about how DocDB works (it relies on partition IDs, and if your queries cross partitions you are doomed) led us to underestimate the cost. To make a long story short, we are running at 700,000 RUs, which translates to a monthly cost of $54,000.
If we used the most elaborate setup of ElasticSearch for this purpose, say, with a replication factor of 3 on SSD machines, we would need about 45 VMs with 100 GB of SSD per VM, at a cost of $540 per VM per month, resulting in a total monthly cost of $24,300, which is less than half the cost of DocumentDB.
If we used a replication factor of 2 with 400 GB non-SSD VMs, the monthly cost would be less than $5,000. You can come up with many other combinations in between.
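The comparison above is simple arithmetic, and a small sketch makes it easy to plug in your own numbers. Every constant below comes from the figures in this post. None of them is a quoted Azure price, and the per-100-RU figure is only what our own bill implies, not an official rate.

```python
# All inputs are the figures quoted in this post, not official price sheets.
DOCDB_MONTHLY_USD = 54_000   # what we pay at 700,000 RUs
DOCDB_RUS = 700_000
ES_VM_COUNT = 45             # replication factor 3, SSD machines
ES_VM_SSD_GB = 100
ES_VM_MONTHLY_USD = 540      # per VM

es_monthly_usd = ES_VM_COUNT * ES_VM_MONTHLY_USD
es_capacity_tb = ES_VM_COUNT * ES_VM_SSD_GB / 1_000
cost_ratio = es_monthly_usd / DOCDB_MONTHLY_USD
implied_usd_per_100_rus = DOCDB_MONTHLY_USD / (DOCDB_RUS / 100)

print(f"ElasticSearch cluster: ${es_monthly_usd:,} per month")
print(f"ElasticSearch SSD capacity: {es_capacity_tb:.1f} TB")
print(f"ElasticSearch / DocumentDB cost: {cost_ratio:.0%}")
print(f"Implied DocumentDB cost per 100 RUs: ${implied_usd_per_100_rus:.2f} per month")
```

With these inputs the ElasticSearch cluster comes out at $24,300 per month, or 45% of the DocumentDB bill. Its 4.5 TB of SSD is about three copies of 1.5 TB of raw data.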
What a terrible deal! I hope we got a lot of new features and functionality for the extra money!
By moving to DocumentDB, we lost a lot of functionality and features. It is almost impossible to run any query that crosses partitions. We are unable to run any quick analytics on the data. Any query on more than one partition is so expensive and slow that we just do not bother with it.
In ElasticSearch, it was so easy to search for anything that we had a lot of wasteful operations gathering data in real time. When we migrated to DocumentDB, we had to turn off a bunch of these operations because they are no longer possible.
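To illustrate why cross-partition queries hurt, here is a toy model of a hash-partitioned store. It is not how DocumentDB is implemented internally. The class, the `customer` and `status` fields, and the partition count of 8 are all made up for illustration. The only point is that a query that names the partition key touches one partition, while a query on any other field has to fan out to all of them.

```python
import hashlib


def partition_for(partition_key, partition_count):
    """Map a partition key to a partition index using a stable hash."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


class ToyPartitionedStore:
    """Hypothetical in-memory store; a teaching aid, not a DocumentDB client."""

    def __init__(self, partition_count, key_field):
        self.key_field = key_field
        self.partitions = [[] for _ in range(partition_count)]

    def put(self, doc):
        index = partition_for(doc[self.key_field], len(self.partitions))
        self.partitions[index].append(doc)

    def query(self, predicate, partition_key=None):
        """Return (matching docs, number of partitions scanned)."""
        if partition_key is not None:
            # The key tells us exactly which partition holds the data.
            targets = [partition_for(partition_key, len(self.partitions))]
        else:
            # No key: every partition has to be scanned.
            targets = range(len(self.partitions))
        matches = [
            doc for i in targets for doc in self.partitions[i] if predicate(doc)
        ]
        return matches, len(targets)


store = ToyPartitionedStore(partition_count=8, key_field="customer")
for n in range(1000):
    store.put({"customer": f"c{n % 50}", "status": "open" if n % 7 == 0 else "closed"})

single_hits, scanned_single = store.query(
    lambda d: d["customer"] == "c7", partition_key="c7"
)
cross_hits, scanned_cross = store.query(lambda d: d["status"] == "open")
print(scanned_single, scanned_cross)  # 1 partition versus all 8
```

In this toy the work for a keyless query grows with the number of partitions, which matches what we saw in practice: the more partitions a query has to touch, the more it costs.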
In summary, we lost a lot of functionality by moving to DocumentDB.
While migrating the data to DocDB, I learned that the Azure team has to manually change our partition count and scale parameters to accommodate our initial needs. At the start, a new collection was initialized with 250 GB of storage. Assuming that a storage size increase is a seamless operation, I started the data migration. Many hours into the migration, all instances of my migration tool stopped. I guess something was triggered on the Azure back end and the capacity was reduced to 100 GB. Since it was Friday afternoon, our migration was delayed for 3 days until someone manually increased our capacity to a higher value.
For the next size increase, we asked the nice folks at Azure to increase the capacity before we hit the wall, but the migration crashed again because the size increase required downtime.
Promises of a great future
The Azure team has been genuinely trying to help us move to DocDB, and they have plans for improvements in all the areas mentioned above. It's on us to make the right decision when choosing technologies.