Unveiling blocking operations

Background

Developing an application that follows reactive principles can be quite challenging, especially when working with third-party libraries, as these may not be fit for purpose, i.e. they may perform blocking operations. In such cases, the first challenge is to find the blocking operation and the second to confine it so that it does not affect the rest of the application (also known as the bulkhead pattern).

I recently came across such a situation while working on an Akka application: the application exposes a REST endpoint to its clients and connects to a third party through a SOAP endpoint. The SOAP calls are made with Akka Camel (http://doc.akka.io/docs/akka/current/scala/camel.html) and, as will be seen throughout this post, the connector that makes those calls is blocking.

Unfortunately, this kind of behaviour is difficult to detect with unit tests or end-to-end tests alone. Unit tests normally stub the responses of the third party, whereas end-to-end tests tend to be undemanding in terms of performance. Yet it is precisely when the application is under heavy load that blocking operations reveal themselves and hurt the ability to scale. Only performance tests will unveil these insidious blocking operations effectively.


Running performance tests

When running tests, it is always a good idea to use a controlled environment. To build mine, I use a mock server (built with WireMock) to stand in for the third party and set the response delay to 1 second (http://wiremock.org/docs/simulating-faults/). As a side note, a controlled environment makes it easier to reproduce the results and interpret them. For instance, running the mock server on the same host as the application and setting a fixed delay makes the results more consistent, as fluctuations of the network and responses cached by the third party are avoided. Also, setting the delay to 1 second will simplify the calculations later on.
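As an illustration, such a stub can be set up as in the following sketch, which uses WireMock's Java DSL from Scala; the port and path are hypothetical:

import com.github.tomakehurst.wiremock.WireMockServer
import com.github.tomakehurst.wiremock.client.WireMock._

object MockThirdParty extends App {
  val server = new WireMockServer(8089) // hypothetical port
  server.start()

  // every POST to the (hypothetical) SOAP path is answered after a fixed 1 second delay
  server.stubFor(
    post(urlEqualTo("/thirdparty/soap"))
      .willReturn(aResponse()
        .withStatus(200)
        .withBody("<soap:Envelope>...</soap:Envelope>")
        .withFixedDelay(1000)))
}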

To generate web traffic on the application, I created a JMeter script and ran it several times, trying out different numbers of clients hitting the application at once during a given period of time. These are the results and their interpretation.


1 client

This is our first test and also the ‘control experiment’ to verify that the application behaves correctly. Fortunately, the results are coherent: 1 client sends requests repeatedly for 122 seconds and the response delay is 1 second, so the throughput is 1 request/sec and the average response time is 1000 ms, as expected.

Waiting for possible shutdown message on port 4445
summary + 20 in 21s = 1.0/s Avg: 1019 Min: 1009 Max: 1086 Err: 0 (0.00%) Active: 1 Started: 1 Finished: 0
summary + 29 in 29.5s = 1.0/s Avg: 1016 Min: 1009 Max: 1025 Err: 0 (0.00%) Active: 1 Started: 1 Finished: 0
summary = 49 in 50.2s = 1.0/s Avg: 1017 Min: 1009 Max: 1086 Err: 0 (0.00%)
summary + 30 in 31s = 1.0/s Avg: 1016 Min: 1010 Max: 1023 Err: 0 (0.00%) Active: 1 Started: 1 Finished: 0
summary = 79 in 81s = 1.0/s Avg: 1017 Min: 1009 Max: 1086 Err: 0 (0.00%)
summary + 29 in 30s = 1.0/s Avg: 1019 Min: 1011 Max: 1058 Err: 0 (0.00%) Active: 1 Started: 1 Finished: 0
summary = 108 in 110s = 1.0/s Avg: 1017 Min: 1009 Max: 1086 Err: 0 (0.00%)
summary + 12 in 12.2s = 1.0/s Avg: 1016 Min: 1010 Max: 1027 Err: 0 (0.00%) Active: 0 Started: 1 Finished: 1
summary = 120 in 122s = 1.0/s Avg: 1017 Min: 1009 Max: 1086 Err: 0 (0.00%)
5 clients

Now we can move on to the next test, with 5 clients hitting the application simultaneously. Once again, the system handles the load without any problem: the throughput is nearly 5 requests/sec and the average response time 1000 ms.

summary + 1 in 1.4s = 0.7/s Avg: 1086 Min: 1086 Max: 1086 Err: 0 (0.00%) Active: 1 Started: 1 Finished: 0
summary + 120 in 28s = 4.3/s Avg: 1017 Min: 1008 Max: 1056 Err: 0 (0.00%) Active: 5 Started: 5 Finished: 0
summary = 121 in 29.4s = 4.1/s Avg: 1018 Min: 1008 Max: 1086 Err: 0 (0.00%)
summary + 149 in 30s = 5.0/s Avg: 1019 Min: 1008 Max: 1155 Err: 0 (0.00%) Active: 5 Started: 5 Finished: 0
summary = 270 in 59.4s = 4.5/s Avg: 1018 Min: 1008 Max: 1155 Err: 0 (0.00%)
summary + 146 in 30.2s = 4.8/s Avg: 1016 Min: 1008 Max: 1032 Err: 0 (0.00%) Active: 5 Started: 5 Finished: 0
summary = 416 in 90s = 4.6/s Avg: 1017 Min: 1008 Max: 1155 Err: 0 (0.00%)
summary + 149 in 30s = 5.0/s Avg: 1017 Min: 1009 Max: 1036 Err: 0 (0.00%) Active: 5 Started: 5 Finished: 0
summary = 565 in 119s = 4.7/s Avg: 1017 Min: 1008 Max: 1155 Err: 0 (0.00%)
summary + 59 in 12s = 4.9/s Avg: 1018 Min: 1007 Max: 1040 Err: 0 (0.00%) Active: 0 Started: 5 Finished: 5
summary = 624 in 131s = 4.7/s Avg: 1017 Min: 1007 Max: 1155 Err: 0 (0.00%)
20 clients

Next, the traffic is increased to 20 clients and the system can still handle it without any problem, as the throughput is nearly 20 requests/sec. So far the system absorbs all the traffic. This is no surprise, though, as there is a router of 20 actors in the system making the SOAP calls. Therefore, we need to go beyond 20 clients to observe any blocking operation, in case it exists.
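For reference, such a router might be created as in the sketch below; the actor class and the router type are assumptions, as the post does not show the actual code:

import akka.actor.{Actor, ActorSystem, Props}
import akka.routing.RoundRobinPool

// hypothetical routee: each instance performs the (blocking) SOAP call
class SoapClientActor extends Actor {
  def receive = {
    case request => () // call the third party and reply to sender()
  }
}

val system = ActorSystem("app")
// 20 routees, so at most 20 SOAP calls can be served concurrently
val soapRouter = system.actorOf(RoundRobinPool(20).props(Props[SoapClientActor]), "soapRouter")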

summary + 11 in 5s = 2.4/s Avg: 1056 Min: 1011 Max: 1167 Err: 0 (0.00%) Active: 6 Started: 6 Finished: 0
summary + 470 in 30s = 15.8/s Avg: 1020 Min: 1005 Max: 1167 Err: 0 (0.00%) Active: 20 Started: 20 Finished: 0
summary = 481 in 34.4s = 14.0/s Avg: 1021 Min: 1005 Max: 1167 Err: 0 (0.00%)
summary + 590 in 30s = 19.7/s Avg: 1018 Min: 1007 Max: 1092 Err: 0 (0.00%) Active: 20 Started: 20 Finished: 0
summary = 1071 in 64.3s = 16.7/s Avg: 1019 Min: 1005 Max: 1167 Err: 0 (0.00%)
summary + 590 in 30s = 19.7/s Avg: 1018 Min: 1006 Max: 1179 Err: 0 (0.00%) Active: 20 Started: 20 Finished: 0
summary = 1661 in 94.3s = 17.6/s Avg: 1019 Min: 1005 Max: 1179 Err: 0 (0.00%)
summary + 589 in 30s = 19.6/s Avg: 1017 Min: 1007 Max: 1112 Err: 0 (0.00%) Active: 20 Started: 20 Finished: 0
summary = 2250 in 124s = 18.1/s Avg: 1018 Min: 1005 Max: 1179 Err: 0 (0.00%)
summary + 341 in 17.3s = 19.7/s Avg: 1016 Min: 1006 Max: 1030 Err: 0 (0.00%) Active: 0 Started: 20 Finished: 20
summary = 2591 in 142s = 18.3/s Avg: 1018 Min: 1005 Max: 1179 Err: 0 (0.00%)
40 clients

Things get interesting as we increase the traffic to 40 simultaneous clients, so each routee is now expected to handle 2 requests. The results are not satisfactory: the throughput stays at roughly 20 requests/sec and the average response time climbs to 2 seconds, exactly what we would expect if each routee served its 2 requests sequentially at 1 blocking second each. Clearly, there is a blocking operation.

summary + 9 in 4.1s = 2.2/s Avg: 1030 Min: 1010 Max: 1086 Err: 0 (0.00%) Active: 6 Started: 6 Finished: 0
summary + 462 in 30s = 15.6/s Avg: 1203 Min: 1008 Max: 1946 Err: 0 (0.00%) Active: 34 Started: 34 Finished: 0
summary = 471 in 34s = 14.0/s Avg: 1200 Min: 1008 Max: 1946 Err: 0 (0.00%)
summary + 590 in 30.1s = 19.6/s Avg: 1981 Min: 1297 Max: 2216 Err: 0 (0.00%) Active: 40 Started: 40 Finished: 0
summary = 1061 in 64s = 16.6/s Avg: 1634 Min: 1008 Max: 2216 Err: 0 (0.00%)
summary + 595 in 30s = 19.9/s Avg: 2027 Min: 1811 Max: 2254 Err: 0 (0.00%) Active: 40 Started: 40 Finished: 0
summary = 1656 in 94s = 17.7/s Avg: 1775 Min: 1008 Max: 2254 Err: 0 (0.00%)
summary + 587 in 30s = 19.6/s Avg: 2027 Min: 1767 Max: 2300 Err: 0 (0.00%) Active: 40 Started: 40 Finished: 0
summary = 2243 in 124s = 18.1/s Avg: 1841 Min: 1008 Max: 2300 Err: 0 (0.00%)
summary + 597 in 30.1s = 19.8/s Avg: 2029 Min: 1704 Max: 2376 Err: 0 (0.00%) Active: 40 Started: 40 Finished: 0
summary = 2840 in 154s = 18.5/s Avg: 1881 Min: 1008 Max: 2376 Err: 0 (0.00%)
summary + 586 in 30s = 19.6/s Avg: 2027 Min: 1710 Max: 2412 Err: 0 (0.00%) Active: 40 Started: 40 Finished: 0
summary = 3426 in 184s = 18.7/s Avg: 1906 Min: 1008 Max: 2412 Err: 0 (0.00%)
summary + 595 in 30.1s = 19.8/s Avg: 2029 Min: 1644 Max: 2446 Err: 0 (0.00%) Active: 40 Started: 40 Finished: 0
summary = 4021 in 214s = 18.8/s Avg: 1924 Min: 1008 Max: 2446 Err: 0 (0.00%)
summary + 181 in 10s = 18.9/s Avg: 2029 Min: 1506 Max: 2460 Err: 0 (0.00%) Active: 0 Started: 40 Finished: 40
summary = 4202 in 223s = 18.8/s Avg: 1929 Min: 1008 Max: 2460 Err: 0 (0.00%)
100 clients

A final test to confirm our suspicions. This time there are 100 simultaneous clients; again the throughput remains at 20 requests/sec and, as expected, the response time goes up to 5 seconds (100 clients spread over 20 routees means 5 queued requests per routee, at 1 second each).

Waiting for possible shutdown message on port 4445
summary + 321 in 21.3s = 15.1/s Avg: 1349 Min: 1006 Max: 2117 Err: 0 (0.00%) Active: 44 Started: 44 Finished: 0
summary + 600 in 30.4s = 19.8/s Avg: 3413 Min: 1998 Max: 5070 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 921 in 52s = 17.8/s Avg: 2694 Min: 1006 Max: 5070 Err: 0 (0.00%)
summary + 586 in 30s = 19.8/s Avg: 5059 Min: 4477 Max: 5335 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 1507 in 81.3s = 18.5/s Avg: 3614 Min: 1006 Max: 5335 Err: 0 (0.00%)
summary + 594 in 30.1s = 19.7/s Avg: 5068 Min: 4859 Max: 5357 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 2101 in 111s = 18.9/s Avg: 4025 Min: 1006 Max: 5357 Err: 0 (0.00%)
summary + 586 in 30s = 19.6/s Avg: 5078 Min: 4670 Max: 5405 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 2687 in 141s = 19.0/s Avg: 4254 Min: 1006 Max: 5405 Err: 0 (0.00%)
summary + 597 in 30.1s = 19.8/s Avg: 5081 Min: 4500 Max: 5594 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 3284 in 171s = 19.2/s Avg: 4405 Min: 1006 Max: 5594 Err: 0 (0.00%)
summary + 587 in 30s = 19.6/s Avg: 5074 Min: 4491 Max: 5593 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 3871 in 201s = 19.2/s Avg: 4506 Min: 1006 Max: 5594 Err: 0 (0.00%)
summary + 593 in 30.1s = 19.7/s Avg: 5084 Min: 4504 Max: 5710 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 4464 in 231s = 19.3/s Avg: 4583 Min: 1006 Max: 5710 Err: 0 (0.00%)
summary + 588 in 30s = 19.7/s Avg: 5067 Min: 4547 Max: 5689 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 5052 in 261s = 19.3/s Avg: 4639 Min: 1006 Max: 5710 Err: 0 (0.00%)
summary + 593 in 30.1s = 19.7/s Avg: 5071 Min: 4529 Max: 5632 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 5645 in 291s = 19.4/s Avg: 4685 Min: 1006 Max: 5710 Err: 0 (0.00%)
summary + 595 in 30s = 19.9/s Avg: 5069 Min: 4537 Max: 5605 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 6240 in 321s = 19.4/s Avg: 4721 Min: 1006 Max: 5710 Err: 0 (0.00%)
summary + 591 in 30s = 19.7/s Avg: 5073 Min: 4459 Max: 5706 Err: 0 (0.00%) Active: 82 Started: 100 Finished: 18
summary = 6831 in 351s = 19.4/s Avg: 4752 Min: 1006 Max: 5710 Err: 0 (0.00%)
summary + 81 in 5s = 17.6/s Avg: 5116 Min: 4469 Max: 5717 Err: 0 (0.00%) Active: 0 Started: 100 Finished: 100
summary = 6912 in 356s = 19.4/s Avg: 4756 Min: 1006 Max: 5717 Err: 0 (0.00%)


The solution

The results obtained are consistent with the existence of a 1-second blocking operation. To fix the problem, the blocking call can be wrapped in a Future running on a separate thread pool (the bulkhead pattern), turning the call into an asynchronous one and freeing the threads of the default dispatcher used by the actors. Repeating the tests gives the results below.
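A minimal sketch of this first attempt, where callSoapEndpoint and the request/response types are hypothetical stand-ins for the real Camel call; note that the Future runs on the global execution context:

import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

def callSoapEndpoint(request: String): String = ??? // hypothetical blocking SOAP call

def callAsync(request: String): Future[String] =
  Future {
    // the blocking call no longer occupies a thread of the actors' default
    // dispatcher... but it now occupies a thread of the global execution context
    callSoapEndpoint(request)
  }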

20 clients

Surprise! After “fixing” the application, the results get worse: the throughput has plummeted from 20 to 8 requests/sec and the response time has increased from 1 to 2.5 seconds.

summary + 1 in 3.2s = 0.3/s Avg: 2801 Min: 2801 Max: 2801 Err: 0 (0.00%) Active: 4 Started: 4 Finished: 0
summary + 170 in 24.3s = 7.0/s Avg: 1939 Min: 1030 Max: 3749 Err: 0 (0.00%) Active: 20 Started: 20 Finished: 0
summary = 171 in 27.5s = 6.2/s Avg: 1944 Min: 1030 Max: 3749 Err: 0 (0.00%)
summary + 237 in 30s = 7.9/s Avg: 2541 Min: 1248 Max: 4073 Err: 0 (0.00%) Active: 20 Started: 20 Finished: 0
summary = 408 in 57.3s = 7.1/s Avg: 2291 Min: 1030 Max: 4073 Err: 0 (0.00%)
summary + 238 in 30.4s = 7.8/s Avg: 2556 Min: 1184 Max: 4068 Err: 0 (0.00%) Active: 20 Started: 20 Finished: 0
summary = 646 in 88s = 7.4/s Avg: 2389 Min: 1030 Max: 4073 Err: 0 (0.00%)
summary + 236 in 30s = 8.0/s Avg: 2549 Min: 1289 Max: 3919 Err: 0 (0.00%) Active: 20 Started: 20 Finished: 0
summary = 882 in 117s = 7.5/s Avg: 2432 Min: 1030 Max: 4073 Err: 0 (0.00%)
summary + 202 in 24.4s = 8.3/s Avg: 2469 Min: 1019 Max: 4059 Err: 0 (0.00%) Active: 0 Started: 20 Finished: 20
summary = 1084 in 142s = 7.7/s Avg: 2439 Min: 1019 Max: 4073 Err: 0 (0.00%)
40 clients

Let’s give it another try with 40 simultaneous clients. As shown below, the previous results are confirmed: a throughput of 8 requests/sec and a response time degrading to 5 seconds. Even worse, some requests fail. Examining the log file reveals that the cause is a timeout in some other actor of the system.

summary + 1 in 3.4s = 0.3/s Avg: 2961 Min: 2961 Max: 2961 Err: 0 (0.00%) Active: 4 Started: 4 Finished: 0
summary + 191 in 28.2s = 6.8/s Avg: 2372 Min: 1029 Max: 5148 Err: 0 (0.00%) Active: 32 Started: 32 Finished: 0
summary = 192 in 32s = 6.1/s Avg: 2375 Min: 1029 Max: 5148 Err: 0 (0.00%)
summary + 229 in 30.1s = 7.6/s Avg: 4949 Min: 2363 Max: 8172 Err: 3 (1.31%) Active: 40 Started: 40 Finished: 0
summary = 421 in 62s = 6.8/s Avg: 3775 Min: 1029 Max: 8172 Err: 3 (0.71%)
summary + 240 in 30s = 8.0/s Avg: 5073 Min: 2150 Max: 8358 Err: 3 (1.25%) Active: 40 Started: 40 Finished: 0
summary = 661 in 92s = 7.2/s Avg: 4246 Min: 1029 Max: 8358 Err: 6 (0.91%)
summary + 233 in 30.4s = 7.7/s Avg: 5115 Min: 2812 Max: 8915 Err: 6 (2.58%) Active: 40 Started: 40 Finished: 0
summary = 894 in 122s = 7.3/s Avg: 4473 Min: 1029 Max: 8915 Err: 12 (1.34%)
summary + 237 in 30.4s = 7.8/s Avg: 5077 Min: 2836 Max: 8098 Err: 5 (2.11%) Active: 40 Started: 40 Finished: 0
summary = 1131 in 152s = 7.4/s Avg: 4599 Min: 1029 Max: 8915 Err: 17 (1.50%)
summary + 233 in 29.5s = 7.9/s Avg: 5084 Min: 2667 Max: 8859 Err: 8 (3.43%) Active: 40 Started: 40 Finished: 0
summary = 1364 in 182s = 7.5/s Avg: 4682 Min: 1029 Max: 8915 Err: 25 (1.83%)
summary + 242 in 30.5s = 7.9/s Avg: 5051 Min: 2866 Max: 7482 Err: 3 (1.24%) Active: 40 Started: 40 Finished: 0
summary = 1606 in 212s = 7.6/s Avg: 4738 Min: 1029 Max: 8915 Err: 28 (1.74%)
summary + 115 in 11s = 10.8/s Avg: 4530 Min: 1755 Max: 8201 Err: 1 (0.87%) Active: 0 Started: 40 Finished: 40
summary = 1721 in 223s = 7.7/s Avg: 4724 Min: 1029 Max: 8915 Err: 29 (1.69%)
 

To understand these results, it must be said that the thread pool used to run the Future encapsulating the blocking call is the global execution context,

scala.concurrent.ExecutionContext.Implicits.global

As stated in the documentation, this execution context has a parallelism level equal to the number of available processors. That number, on my machine, happens to be 8!
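This is easy to check, as the value below is what the global execution context uses by default to size its pool:

// prints 8 on the machine used for these tests
println(Runtime.getRuntime.availableProcessors())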

Why did this limitation not show up when the application was executing the blocking call on the default dispatcher? Well, the parallelism level of the default dispatcher equals the number of available processors times 3. That means that the default dispatcher has 24 threads at its disposal to serve the 20 routees.
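For reference, the sketch below makes those fork-join settings explicit when creating the actor system; the values shown are, to the best of my knowledge, Akka's defaults:

import akka.actor.ActorSystem
import com.typesafe.config.ConfigFactory

// pool size = available processors * parallelism-factor, clamped to [min, max];
// with 8 processors and a factor of 3.0 this yields 24 threads
val dispatcherConfig = ConfigFactory.parseString("""
  akka.actor.default-dispatcher.fork-join-executor {
    parallelism-min    = 8
    parallelism-factor = 3.0
    parallelism-max    = 64
  }
""")

val tunedSystem = ActorSystem("app", dispatcherConfig.withFallback(ConfigFactory.load()))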

What is more, since the global execution context is the default one used to run all the Futures of the application, it is now clear why other parts of the system are timing out: all the threads of the global execution context are occupied making the blocking call, and the other parts of the application that rely on those threads are being starved.

There are several ways to solve this problem, such as creating a custom thread pool to handle the blocking calls. Another option is to use the blocking construct described at http://docs.scala-lang.org/overviews/core/futures.html: when code wrapped in blocking runs on the global execution context, additional threads are created to compensate, effectively bypassing the parallelism level.
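Both options are sketched below, reusing the hypothetical callSoapEndpoint helper from before:

import java.util.concurrent.Executors
import scala.concurrent.{ExecutionContext, Future, blocking}

def callSoapEndpoint(request: String): String = ??? // hypothetical blocking SOAP call

// Option 1: a dedicated fixed-size pool acting as a bulkhead for the blocking calls
val blockingEc: ExecutionContext =
  ExecutionContext.fromExecutor(Executors.newFixedThreadPool(20))

def callAsyncBulkhead(request: String): Future[String] =
  Future(callSoapEndpoint(request))(blockingEc)

// Option 2: mark the call with `blocking` so that the global execution context
// can create extra threads while the call waits
import scala.concurrent.ExecutionContext.Implicits.global

def callAsyncBlocking(request: String): Future[String] =
  Future {
    blocking {
      callSoapEndpoint(request)
    }
  }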

These are the results obtained when using the blocking construct.

40 clients

This looks good: a throughput of around 25 requests/sec and a response time near 1.5 seconds. This is the best throughput achieved in the whole series of performance tests.

summary + 1 in 1.5s = 0.7/s Avg: 1128 Min: 1128 Max: 1128 Err: 0 (0.00%) Active: 2 Started: 2 Finished: 0
summary + 460 in 30s = 15.5/s Avg: 1059 Min: 1010 Max: 1690 Err: 0 (0.00%) Active: 32 Started: 32 Finished: 0
summary = 461 in 31.2s = 14.8/s Avg: 1059 Min: 1010 Max: 1690 Err: 0 (0.00%)
summary + 744 in 30s = 24.9/s Avg: 1563 Min: 1064 Max: 1960 Err: 0 (0.00%) Active: 40 Started: 40 Finished: 0
summary = 1205 in 61.2s = 19.7/s Avg: 1371 Min: 1010 Max: 1960 Err: 0 (0.00%)
summary + 733 in 30.1s = 24.4/s Avg: 1621 Min: 1245 Max: 1941 Err: 0 (0.00%) Active: 40 Started: 40 Finished: 0
summary = 1938 in 91.2s = 21.2/s Avg: 1465 Min: 1010 Max: 1960 Err: 0 (0.00%)
summary + 748 in 30.2s = 24.7/s Avg: 1623 Min: 1242 Max: 1928 Err: 0 (0.00%) Active: 40 Started: 40 Finished: 0
summary = 2686 in 121s = 22.1/s Avg: 1509 Min: 1010 Max: 1960 Err: 0 (0.00%)
summary + 731 in 30s = 24.6/s Avg: 1626 Min: 1266 Max: 1921 Err: 0 (0.00%) Active: 40 Started: 40 Finished: 0
summary = 3417 in 151s = 22.6/s Avg: 1534 Min: 1010 Max: 1960 Err: 0 (0.00%)
summary + 744 in 30.1s = 24.7/s Avg: 1623 Min: 1299 Max: 1986 Err: 0 (0.00%) Active: 40 Started: 40 Finished: 0
summary = 4161 in 181s = 23.0/s Avg: 1550 Min: 1010 Max: 1986 Err: 0 (0.00%)
summary + 734 in 30s = 24.5/s Avg: 1625 Min: 1245 Max: 1969 Err: 0 (0.00%) Active: 40 Started: 40 Finished: 0
summary = 4895 in 211s = 23.2/s Avg: 1561 Min: 1010 Max: 1986 Err: 0 (0.00%)
summary + 281 in 11.3s = 24.9/s Avg: 1627 Min: 1278 Max: 1964 Err: 0 (0.00%) Active: 0 Started: 40 Finished: 40
summary = 5176 in 222s = 23.3/s Avg: 1565 Min: 1010 Max: 1986 Err: 0 (0.00%)
100 clients

The throughput remains the same, roughly 25 requests/sec, whereas the response time increases significantly, to nearly 4 seconds. This makes sense: the throughput is unchanged but the number of clients has increased, which means that clients have to wait longer to be served (by Little's law, 100 concurrent clients at 25 requests/sec gives an average response time of 100/25 = 4 seconds).

summary + 61 in 11.1s = 5.5/s Avg: 1024 Min: 1010 Max: 1134 Err: 0 (0.00%) Active: 12 Started: 12 Finished: 0
summary + 641 in 30s = 21.6/s Avg: 1211 Min: 1009 Max: 1992 Err: 0 (0.00%) Active: 42 Started: 42 Finished: 0
summary = 702 in 41s = 17.2/s Avg: 1195 Min: 1009 Max: 1992 Err: 0 (0.00%)
summary + 736 in 30.1s = 24.5/s Avg: 2212 Min: 1116 Max: 2949 Err: 0 (0.00%) Active: 72 Started: 72 Finished: 0
summary = 1438 in 71s = 20.3/s Avg: 1715 Min: 1009 Max: 2949 Err: 0 (0.00%)
summary + 730 in 30s = 24.4/s Avg: 3386 Min: 2501 Max: 4065 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 2168 in 101s = 21.5/s Avg: 2278 Min: 1009 Max: 4065 Err: 0 (0.00%)
summary + 746 in 30.3s = 24.6/s Avg: 4081 Min: 3715 Max: 4287 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 2914 in 131s = 22.2/s Avg: 2739 Min: 1009 Max: 4287 Err: 0 (0.00%)
summary + 729 in 30s = 24.5/s Avg: 4066 Min: 4033 Max: 4228 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 3643 in 161s = 22.6/s Avg: 3005 Min: 1009 Max: 4287 Err: 0 (0.00%)
summary + 739 in 30s = 24.7/s Avg: 4086 Min: 3952 Max: 4591 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 4382 in 191s = 23.0/s Avg: 3187 Min: 1009 Max: 4591 Err: 0 (0.00%)
summary + 730 in 30s = 24.3/s Avg: 4083 Min: 4006 Max: 4441 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 5112 in 221s = 23.1/s Avg: 3315 Min: 1009 Max: 4591 Err: 0 (0.00%)
summary + 741 in 30.1s = 24.6/s Avg: 4060 Min: 3909 Max: 4220 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 5853 in 251s = 23.3/s Avg: 3409 Min: 1009 Max: 4591 Err: 0 (0.00%)
summary + 737 in 30s = 24.6/s Avg: 4062 Min: 3994 Max: 4134 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 6590 in 281s = 23.5/s Avg: 3482 Min: 1009 Max: 4591 Err: 0 (0.00%)
summary + 738 in 30s = 24.6/s Avg: 4059 Min: 4007 Max: 4116 Err: 0 (0.00%) Active: 100 Started: 100 Finished: 0
summary = 7328 in 311s = 23.6/s Avg: 3540 Min: 1009 Max: 4591 Err: 0 (0.00%)
summary + 739 in 30s = 24.6/s Avg: 4065 Min: 4019 Max: 4305 Err: 0 (0.00%) Active: 88 Started: 100 Finished: 12
summary = 8067 in 341s = 23.7/s Avg: 3589 Min: 1009 Max: 4591 Err: 0 (0.00%)
summary + 87 in 4s = 24.8/s Avg: 4065 Min: 4039 Max: 4113 Err: 0 (0.00%) Active: 0 Started: 100 Finished: 100
summary = 8154 in 344s = 23.7/s Avg: 3594 Min: 1009 Max: 4591 Err: 0 (0.00%)


But where does this limit of 24-25 requests/sec come from? The first thought that comes to mind is that it is imposed by the number of threads of the default dispatcher which, as mentioned before, is 24 (number of processors * 3).

However, changing the parallelism-factor does not make any difference: the throughput stays the same. After taking some thread dumps and looking into them, I found out that the limitation derives from the maximum number of threads configured on the thread pool used by Akka Camel.

Being able to analyse the different thread pools used by an application is essential to tune it successfully, and I will leave that for a future post.


Conclusion

This post has shown how to analyse the behaviour of an application to detect blocking operations, how to deal with them while avoiding some common pitfalls, and how performance tests are a valuable tool for conducting this analysis.

For that reason, it is advisable to introduce performance tests into the development process as early as possible.
