DXA 2.0 Model Service timeout and retry

Introduction

Of all the improvements DXA 2.0 brought to the table, arguably the most significant is the introduction of the Model Service. So what exactly is it, and why is it so significant? The Model Service is a new RESTful Content Delivery microservice that should be installed alongside the (Session Enabled) Content Service. In combination with the new R2 data model, it can significantly increase performance by reducing both the number of round-trips the web application makes to the SDL Web Content Delivery data store and the amount of data transferred. Its role, pun intended :), is multifold. Some of its functions are:

  • Model building on the CD side and data expansion
  • Support for both data models (DD4T and R2). It has built-in converters that can work with any combination of published <-> requested data models

The following description is somewhat simplified, but it illustrates the performance gain:

It is the job of the Model Service to build the model of the requested page (and of the entities it contains) and return it to the DXA webapp in a single response. Previously, the DXA webapp would make dozens of requests to the Content Service (as many as 50-60 for complicated pages); now it makes a single request to the Model Service. Note that the total number of requests to the Content Service isn’t all that different, but in the new architecture they are made locally by the Model Service (remember that it should be installed next to the Content Service), which eliminates the network latency of those round-trips.

More about the Model Service can be found here.

The problem

Recently, while working on an implementation, I ran into a scenario where some pages in the DXA webapp failed with error messages that pointed towards the Model Service but did not describe the exact cause. Apart from these few pages, everything else worked normally.

An example error message is below:

ERROR - DXA Model Service returned an unexpected response.
Sdl.Web.Common.DxaException: DXA Model Service returned an unexpected response. ---> Sdl.Web.ModelService.ModelServiceException: DXA Model Service returned an unexpected response from request 'https://model_service_url/PageModel/tcm/123/home/horizontalCarousel?includes=INCLUDE&modelType=R2' of . ---> System.ArgumentNullException: Value cannot be null.
Parameter name: value
   at Newtonsoft.Json.JsonConvert.DeserializeObject(String value, Type type, JsonSerializerSettings settings)
   at Newtonsoft.Json.JsonConvert.DeserializeObject[T](String value, JsonSerializerSettings settings)
   at Sdl.Web.ModelService.ModelServiceClient.PerformRequest[T](IModelServiceRequest request) in [project_path]\Sdl.Web.ModelService\ModelServiceClient.cs:line 135
   --- End of inner exception stack trace ---
   at Sdl.Web.ModelService.ModelServiceClient.PerformRequest[T](IModelServiceRequest request) in [project_path]\Sdl.Web.ModelService\ModelServiceClient.cs:line 144
   at Sdl.Web.Tridion.ModelService.DefaultModelServiceProvider.GetPageModelData(String urlPath, Localization localization, Boolean addIncludes) in [project_path]\Sdl.Web.Tridion\ModelService\DefaultModelServiceProvider.cs:line 94
   --- End of inner exception stack trace ---
   at Sdl.Web.Tridion.ModelService.DefaultModelServiceProvider.GetPageModelData(String urlPath, Localization localization, Boolean addIncludes) in [project_path]\Sdl.Web.Tridion\ModelService\DefaultModelServiceProvider.cs:line 102
   at Sdl.Web.Tridion.Mapping.DefaultContentProvider.LoadPageModel(String& urlPath, Boolean addIncludes, Localization localization) in [project_path]\Sdl.Web.Tridion\Mapping\DefaultContentProvider.cs:line 117
   at Sdl.Web.Tridion.Mapping.DefaultContentProvider.GetPageModel(String urlPath, Localization localization, Boolean addIncludes) in [project_path]\Sdl.Web.Tridion\Mapping\DefaultContentProvider.cs:line 69
   at Sdl.Web.Mvc.Controllers.PageController.Page(String pageUrl) in [project_path]\Sdl.Web.Mvc\Controllers\PageController.cs:line 44

“DXA Model Service returned an unexpected response.” What unexpected response, I thought to myself: malformed or missing data, something else perhaps? There was only one way to find out, so I started Postman. Since the request itself was logged, it was just a matter of getting an OAuth token and hitting that URL. Funnily enough, I got the correct response back. There was no network blockage (the Postman request was initiated from the same machine I was running the local DXA app on) and the JSON looked OK; I even validated it with a tool just to be sure. As it turned out, though, there was indeed something strange: the response took a very long time to be generated, 12+ seconds, which is way too long. Since this was an existing implementation (only now switching to DXA), the data had not been modelled in Tridion following best practices. In any case, that was out of my control.

Let’s jump back to the previous section for a moment and recall that it is the job of the Model Service to return a page’s model (including regions, entities, etc.). Combine that with the problematic data model and the large number of component presentations on those pages (all of them including several levels of linked components, loads of keywords, etc.), and the Model Service ended up having to do loads of resolving and expanding, all of which added up to the previously mentioned 12+ seconds.

The solution

The proper solution would be to adjust the data model in Tridion itself. However, this was an existing implementation and development had to continue without delay, so that was not an option. Even if I had wanted to fiddle around, I didn’t have access to most parts of the system (the microservices, for example), so all I could do was increase the Model Service timeout. To be precise, this is not a timeout of the Model Service microservice itself, but the amount of time the DXA application waits for a response from the Model Service.

As this setting is neither documented nor mentioned anywhere, I had to dig through the DXA framework to see whether it was possible at all. As it turns out, it is. I even stumbled upon another undocumented setting related to the Model Service: the retry count.
Both settings are added as entries in the <appSettings> section (in .NET). When they are not explicitly specified, the default values are used: 10 seconds for the timeout and 4 retries. The default timeout is shorter than the 12+ seconds the problematic pages needed, which is consistent with the errors above.
Example <appSettings> with the relevant config keys below:
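
A minimal sketch of what such a section could look like. The key names and values here are assumptions for illustration, since the text above does not list them. The unit is also an assumption (milliseconds). Check the ModelServiceClient source in Sdl.Web.ModelService for the exact names and units your DXA version reads.

```xml
<appSettings>
  <!-- Assumed key name, not confirmed by the text: how long the DXA web
       application waits for a Model Service response. Value assumed to be
       in milliseconds; the default is described above as 10 seconds.
       30000 is an illustrative value above the observed 12+ seconds. -->
  <add key="model-builder-service-timeout" value="30000" />

  <!-- Assumed key name, not confirmed by the text: number of retries.
       The default is described above as 4. -->
  <add key="model-builder-service-retries" value="4" />
</appSettings>
```

Any timeout above the slowest observed response should do. Setting it much higher than needed only delays the point at which a genuinely unresponsive Model Service is reported as an error.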

With the timeout setting added, the pages loaded successfully. It took some time, but eventually all the data came in and was rendered properly, so development could continue.

If you have any questions or wish to leave some feedback, please feel free to get in touch.