Azure Cosmos DB is a great product for handling huge amounts of unstructured data. It is a globally distributed and high performance database service that is often used for large-scale solutions.
Data is stored as JSON document in Azure Cosmos DB. This makes the data portable and the data schema flexible, but also introduces a limitation; JSON does not support the DateTime datatype. It only supports string, number, boolean, array, object and null datatypes.
Despite this limitation it is possible to sort and filter by DateTime in SQL queries. In this blog I will describe the problem and provide a reusable solution.
All source code mentioned in this blog can be found on GitHub.
Azure Cosmos DB emulator
For testing purposes, an Azure Cosmos DB environment is needed. Lucky for us, Microsoft has a free Azure Cosmos DB emulator which provides a local environment that emulates the Azure Cosmos DB service. It can be downloaded here.
After downloading, double click the installer, accept the License Agreement and click Install to start the installation. When the installation completes, click Finish to close the installer and the Azure Cosmos DB emulator will be launched. During the installation and launch, several User Account Control popups may appear which you have to approve by clicking on the Yes button.
Next, a database and collection must be created for storing the JSON documents. For this, click on the Explorer icon in the menu bar and then on the New Collection button. Fill in the form as shown in the screenshot below and click on the OK button.
Demo 1: storing DateTime values
Now the Azure Cosmos DB has been created, a console application must be created to insert JSON documents for investigating the problem. The console application is created using Visual Studio 2019 and runs on .NET Core 2.2. The NuGet packages shown below are required to make the console application run properly.
The console application inserts usernames and their last login timestamp into the Azure Cosmos DB collection created earlier. For the last login timestamp different DateTimeKind values are used; Unspecified, Local and UTC.
When taking a closer look at the contents of the Azure Cosmos DB collection after the console application has executed, you can see that the DateTime is stored as a string and that different formats are used.
The value of the string does not always start with the most significant value (year) and end with the least significant value (seconds) because for User 1 the timezone is entirely missing and for User 2 the timezone is added at the end of the string. This prevents correct sorting and filtering in SQL queries. To solve this the DateTime has to be stored in a uniform way. This will be done in the next demo’s.
Demo 2: storing DateTimeOffset values
The first step for storing DateTime values in a uniform format is using the DateTimeOffset datatype for the LastLogin property in the User class (instead of the DateTime datatype). DateTimeOffset refers to a single point in time and stores how much that point in time differs from UTC. Therefore, in the Azure Cosmos DB collection we will now see timezone information being added to every LastLogin string value.
Demo 3: convert to UTC
The next step is converting the DateTime values to UTC. This can be done explicitly in source code but this is a tedious and error prone task. Luckily, Newtonsoft.Json provides functionality for automatic conversion to UTC by implementing the JsonConvert.DefaultSettings. By default, the DateTimeOffset is stored in ISO 8601 format, so for converting to UTC we need to configure a IsoDateTimeConverter with the AdjustToUniversal DateTimeStyles value. If you use the Repository pattern in your source code, I would suggest putting the JsonConvert.DefaultSettings in there.
When the console application has executed, the DateTime values are now stored in a uniform way and always start with the most significant value (year) and end with the least significant value (seconds) so now you can sort and filter string values. Problem solved!
P.S. It is also possible to store DateTime values as UNIX Epoch time, but this is not human readable so I skipped this option in my blog.