Azure Cosmos DB Performance – Throughput

06.02.2018

Recently I was designing a Cosmos DB SQL API solution for a client. During testing we were constantly getting HTTP status code 429 (Too Many Requests) errors, and each time we had to manually increase the throughput for a collection. This became tedious.

In production, this was unacceptable.

We needed to automate the setting of the Cosmos DB Collection Throughput.

Architecture

The solution architecture I built looks like the diagram below and utilizes the Throttling Cloud Design Pattern.

Steps

  1. Create an Azure Cosmos DB alert for HTTP status code 429 (Too Many Requests), which indicates that the collection has exceeded its provisioned throughput limit.
  2. When the condition is met, an Alert is sent to a Function App.
  3. The Function App increases the Request Units for the Collection by 100.
  4. The Function App updates the Cosmos DB Collection Offer (Request Units).
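
Steps 3 and 4 amount to a small calculation followed by an Offer update. The calculation can be sketched as follows. The `ThroughputPlanner` class, the `maxThroughput` ceiling, and its default of 5000 are assumptions added for illustration; the solution described here raises throughput by 100 without an upper bound.

```csharp
using System;

public static class ThroughputPlanner
{
    // Returns the throughput (RU/s) to request after a 429 alert.
    // step: increment in RU/s (this article uses 100).
    // maxThroughput: hypothetical ceiling so repeated alerts cannot grow cost without limit.
    public static int NextThroughput(int current, int step = 100, int maxThroughput = 5000)
    {
        if (current < 0) throw new ArgumentOutOfRangeException(nameof(current));
        if (step <= 0) throw new ArgumentOutOfRangeException(nameof(step));

        // Never lower a value that is already above the ceiling.
        int ceiling = Math.Max(current, maxThroughput);
        return Math.Min(current + step, ceiling);
    }
}
```

With these defaults, `ThroughputPlanner.NextThroughput(1000)` returns 1100 and `ThroughputPlanner.NextThroughput(4950)` returns 5000.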

Throughput SLA

"Throughput Failed Requests" are requests which are throttled by the Azure Cosmos DB Collection resulting in an Error Code, before Consumed RUs have exceeded the Provisioned RUs for a partition in the Collection for a given second.

NOTE: The metric we want to check for is Status Code 429 - Too Many Requests: the collection has exceeded the provisioned throughput limit.
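
As a rough illustration of why 429 responses appear once the provisioned throughput is exceeded, the sketch below models a single one-second window: requests are served while the RUs consumed in that second stay within the provisioned value and are rejected afterwards. This is a deliberately simplified model with hypothetical names (`ThrottleModel`, `CountThrottled`). It is not how the Cosmos DB service is implemented, and it ignores partitions and retries.

```csharp
using System;
using System.Collections.Generic;

public static class ThrottleModel
{
    // Counts the requests that would be rejected in one second,
    // given each request's RU charge and the provisioned RU/s.
    public static int CountThrottled(IEnumerable<double> requestCharges, double provisionedRus)
    {
        double consumed = 0;
        int throttled = 0;
        foreach (var charge in requestCharges)
        {
            if (consumed + charge > provisionedRus)
            {
                throttled++;          // the budget for this second is exhausted
            }
            else
            {
                consumed += charge;   // the request is served and charged
            }
        }
        return throttled;
    }
}
```

In this model, five requests of 250 RU each against 1000 RU/s leave one request throttled, and the same load against 1300 RU/s leaves none.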

Create a new Azure Cosmos DB Alert

We need to create an Alert for our Azure Cosmos DB Database as shown in the following figure.

The following are the configuration settings.

  1. Select our Resource.
  2. Name for our Alert Rule.
  3. Description of our Alert Rule.
  4. Select Throttled Requests for the Metric.
  5. The condition, threshold, and period that determine when the alert activates.
  6. Condition set to greater than.
  7. Threshold Count set to 1.
  8. For Period we will start with Over the last hour.
  9. Select whether the service administrator and co-administrators are emailed when the alert fires.

NOTE: We will configure our WebHook setting later.
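
Settings 6 to 8 together describe a simple predicate over the throttled requests seen in the look-back period. The sketch below expresses that predicate. `AlertCondition` and `ShouldFire` are hypothetical names, and the code illustrates the rule as configured above rather than how Azure evaluates alerts internally.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AlertCondition
{
    // True when the number of throttled requests inside the look-back period
    // is strictly greater than the threshold (the "greater than" condition).
    public static bool ShouldFire(IEnumerable<DateTime> throttledAt, DateTime now,
                                  TimeSpan period, int threshold)
    {
        var windowStart = now - period;
        int count = throttledAt.Count(t => t > windowStart && t <= now);
        return count > threshold;
    }
}
```

Under this reading, a threshold of 1 combined with "greater than" means that a single throttled request within the hour does not satisfy the predicate, while two do.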

Create our Function App

We will create our Function App using the Azure Portal. Optionally we could do this in Visual Studio 2017. I chose the Portal because I planned to add additional functionality.

INFO: We could download the Function App code with a Visual Studio project file. This would allow us to modify the source code, use Visual Studio Team Services (VSTS) for source control, and incorporate a CI/CD pipeline.

The Azure Function App template I used is a genericJson Webhook. Application Insights was added.

The following is the code from the ProcessEvent Function App.

    using System;
    using System.Net;
    using Newtonsoft.Json;
    using Microsoft.Azure.Documents;
    using Microsoft.Azure.Documents.Client;
    using Microsoft.Azure.Documents.Linq;
    using System.Configuration;
    using System.Collections.Generic;

    public static async Task<object> Run(HttpRequestMessage req, TraceWriter log)
    {
        log.Info("Webhook was triggered!");

        // Connection settings come from the Function App's Application Settings
        var endPoint = ConfigurationManager.AppSettings["Endpoint"];
        var authKey = ConfigurationManager.AppSettings["AuthKey"];
        var databaseId = ConfigurationManager.AppSettings["databaseName"];

        var list = new Dictionary<string, int>();
        var newThroughput = 0;

        // The alert payload is read but not used
        string jsonContent = await req.Content.ReadAsStringAsync();
        dynamic data = JsonConvert.DeserializeObject(jsonContent);

        DocumentClient client = new DocumentClient(
            new Uri(endPoint),
            authKey,
            new ConnectionPolicy
            {
                ConnectionMode = ConnectionMode.Direct,
                ConnectionProtocol = Protocol.Tcp
            });

        // Read every collection in the database
        var colls = await client.ReadDocumentCollectionFeedAsync(UriFactory.CreateDatabaseUri(databaseId));

        foreach (var coll in colls)
        {
            // Find the offer that belongs to this collection
            Microsoft.Azure.Documents.Offer offer = client.CreateOfferQuery()
                .Where(r => r.ResourceLink == coll.SelfLink)
                .AsEnumerable()
                .SingleOrDefault();

            // Skip collections that have no matching offer
            if (offer == null) continue;

            // Get the current Throughput for the collection
            var result = ((OfferV2) offer).Content.OfferThroughput;

            // Increase the Throughput by 100
            newThroughput = result + 100;

            // Modify the offer
            offer = new OfferV2(offer, newThroughput);
            list.Add(coll.Id, result);
            await client.ReplaceOfferAsync(offer);
        }

        // Format the response content
        var results = string.Join("; ", list.Select(x => x.Key + "=" + x.Value));
        var res = new HttpResponseMessage(HttpStatusCode.OK)
        {
            Content = new StringContent($"Collection(s): {results} Request Units increased to {newThroughput}")
        };
        return res;
    }

NOTE: We are not using the 'HttpRequestMessage' data in our Function App. It is only triggered when an Event is received.
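
Because the request body is ignored, every call to the webhook raises the throughput. If that ever becomes a concern, one possible refinement is to inspect the payload before scaling. The sketch below assumes the alert posts JSON with a top-level `status` property whose value is `Activated` when the alert fires; that assumption should be checked against the payload actually received. `AlertPayload` and `IsActivated` are hypothetical names, and the code relies on the Newtonsoft.Json package already used by the Function App.

```csharp
using System;
using Newtonsoft.Json.Linq;

public static class AlertPayload
{
    // Returns true only when the payload reports an activated alert.
    // Assumes a top-level "status" property (an assumption, not confirmed by this article).
    public static bool IsActivated(string jsonContent)
    {
        if (string.IsNullOrWhiteSpace(jsonContent)) return false;

        var payload = JObject.Parse(jsonContent);
        var status = (string)payload["status"];   // null when the property is missing
        return string.Equals(status, "Activated", StringComparison.OrdinalIgnoreCase);
    }
}
```

The Function App could then return early when `AlertPayload.IsActivated(jsonContent)` is false, leaving the throughput unchanged.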

The following is an example of an Offer (declared type Microsoft.Azure.Documents.Offer, runtime type Microsoft.Azure.Documents.OfferV2):

    {
      "id": "gCK7",
      "_rid": "gCK7",
      "_self": "offers/gCK7/",
      "_etag": "\"00002b01-0000-0000-0000-5a5d0b6f0000\"",
      "offerVersion": "V2",
      "resource": "dbs/jdUKAA==/colls/jdUKAIUe1QA=/",
      "offerType": "Invalid",
      "offerResourceId": "jdUKAIUe1QA=",
      "content": {
        "offerThroughput": 10600,
        "offerIsRUPerMinuteThroughputEnabled": false
      },
      "_ts": 1516047215
    }
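
The value the Function App reads through `OfferV2.Content.OfferThroughput` corresponds to `content.offerThroughput` in the example Offer. As a small illustration, the sketch below pulls that value out of the raw JSON. `OfferJson` and `ReadThroughput` are hypothetical helper names, and Newtonsoft.Json is assumed to be available as it is in the Function App.

```csharp
using Newtonsoft.Json.Linq;

public static class OfferJson
{
    // Extracts content.offerThroughput from an Offer serialized as JSON.
    // Returns null when the property is absent.
    public static int? ReadThroughput(string offerJson)
    {
        var offer = JObject.Parse(offerJson);
        return (int?)offer.SelectToken("content.offerThroughput");
    }
}
```

For the example Offer, `OfferJson.ReadThroughput` returns 10600.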

Next we need to set our Azure Cosmos DB connection settings.

We need to modify the Function App's Application Settings by adding the following Cosmos DB connection settings:
* Endpoint
* AuthKey
* databaseName

The following figure shows these settings.
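
A missing setting would otherwise surface only as a less obvious failure when the client is created, so it can help to fail early with a clear message. The helper below is a minimal sketch of that idea. `SettingsReader` and `Require` are hypothetical names that are not part of the original Function App.

```csharp
using System;
using System.Collections.Specialized;

public static class SettingsReader
{
    // Returns the named setting, or throws a descriptive error when it is missing or blank.
    public static string Require(NameValueCollection settings, string key)
    {
        var value = settings[key];   // the indexer returns null for an unknown key
        if (string.IsNullOrWhiteSpace(value))
            throw new InvalidOperationException($"Application setting '{key}' is not configured.");
        return value;
    }
}
```

In the Function App this could be called as `SettingsReader.Require(ConfigurationManager.AppSettings, "Endpoint")`, and likewise for `AuthKey` and `databaseName`.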

Next we will test our Function App as shown in the following figure.

Our response is Collection(s): resources=1000; leases=1000 Request Units increased to 1100

We can see that the request units for the resources and leases collections have been increased from 1000 to 1100.

When we run it again, the request units increase by 100.
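
The response lists each collection with its throughput before the change, followed by a single new value taken from the last collection processed. When collections start from different values, that trailing number can be misleading. One possible alternative, sketched below with the hypothetical names `ResponseFormatter` and `Describe`, reports the old and new value for each collection.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ResponseFormatter
{
    // Builds a message such as "resources: 1000 -> 1100; leases: 1000 -> 1100"
    // from each collection's throughput before the update and the step applied.
    public static string Describe(IEnumerable<KeyValuePair<string, int>> previousThroughput, int step = 100)
    {
        return string.Join("; ",
            previousThroughput.Select(x => $"{x.Key}: {x.Value} -> {x.Value + step}"));
    }
}
```

The `list` dictionary built in the Function App can be passed to `Describe` directly, since it holds the throughput read before each update.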

Modify Event Rule

Finally, we need to modify our event rule.

  1. We first need to copy our Function App URL, as shown in the following figure.

  2. We then edit our Event Rule by pasting the Function App URL into the Webhook text box, as shown below.

  3. Save the Event Rule.

Summary

  • Azure Event Rules can be used to send events to a Function App.
  • The built-in Microsoft.Azure.Documents.Offer functionality in the Azure Cosmos DB SQL API makes it easy to get and set the OfferThroughput.
  • The Function App increases the Request Units for all Collections in a Database.
  • We were able to provide the client with an automated process.