Inspired by a blog post by Stephen Toub on .NET performance, we are writing a similar article to highlight the performance improvements ASP.NET Core has made in 6.0.

Benchmarking setup

We used BenchmarkDotNet for most of the examples throughout this article. A repository containing most of the benchmarks used in this article is available at github.com/BrennanConr…

Most of the benchmark results in this article are generated from the following command line:

dotnet run -c Release -f net48 --runtimes net48 netcoreapp3.1 net5.0 net6.0

Then select the specific benchmark to run from the list.

This command line tells BenchmarkDotNet to:

  • Build everything in the Release configuration.
  • Build it targeting the .NET Framework 4.8 surface area.
  • Run each benchmark on .NET Framework 4.8, .NET Core 3.1, .NET 5, and .NET 6.
Some benchmarks run only on .NET 6 (for example, when comparing two ways of writing the same code on the same version):

dotnet run -c Release -f net6.0 --runtimes net6.0

And some compare two versions, for example:

dotnet run -c Release -f net5.0 --runtimes net5.0 net6.0

I'll include the command used to run each benchmark below.

Most of the results in this article were generated by running the benchmarks above on Windows, primarily so that .NET Framework 4.8 could be included in the result set. However, unless otherwise noted, all of these benchmarks generally show comparable improvements when run on Linux or macOS. Just make sure you have installed each runtime you want to measure. These benchmarks use a .NET 6 RC1 build, along with the latest released downloads of .NET 5 and .NET Core 3.1.

Span<T>

Since Span<T> was added in .NET Core 2.1, every release has converted more code to use spans, both internally and as part of public APIs, to improve performance. This release is no exception.
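To illustrate why spans help (a generic sketch, not code from this release): slicing a span yields a view over the original string's memory, while string.Substring allocates a brand-new string.

```csharp
using System;

string path = "/first/second/";

// Substring allocates a new string for the slice.
string allocated = path.Substring(1, 5); // "first"

// AsSpan gives a view over the original string's memory:
// no allocation until (and unless) a real string is needed.
ReadOnlySpan<char> view = path.AsSpan(1, 5);

Console.WriteLine(view.SequenceEqual("first")); // True
```

Many of the changes below follow exactly this shape: replace an intermediate Substring (or byte[] copy) with a span over memory that already exists.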

PR dotnet/aspnetcore#28855 removed a temporary string allocation in PathString, which came from string.Substring when adding two PathString instances, and used a Span<char> as the temporary buffer instead. In the benchmark below, we use a short string and a longer string to show the performance difference gained by avoiding the temporary string.

dotnet run -c Release -f net48 --runtimes net48 net5.0 net6.0 --filter *PathStringBenchmark*

private PathString _first = new PathString("/first/");
private PathString _second = new PathString("/second/");
private PathString _long = new PathString("/longerpathstringtoshowsubstring/");

[Benchmark]
public PathString AddShortString()
{
    return _first.Add(_second);
}

[Benchmark]
public PathString AddLongString()
{
    return _first.Add(_long);
}
Method          Runtime             Toolchain  Mean      Ratio  Allocated
AddShortString  .NET Framework 4.8  net48      23.51 ns  1.00   96 B
AddShortString  .NET 5.0            net5.0     22.73 ns  0.97   96 B
AddShortString  .NET 6.0            net6.0     14.92 ns  0.64   56 B
AddLongString   .NET Framework 4.8  net48      30.89 ns  1.00   201 B
AddLongString   .NET 5.0            net5.0     25.18 ns  0.82   192 B
AddLongString   .NET 6.0            net6.0     15.69 ns  0.51   104 B

dotnet/aspnetcore#34001 introduced a new span-based API for enumerating a query string that is allocation-free in the common case of no encoded characters, and lower-allocation when the query string contains encoded characters.

dotnet run -c Release -f net6.0 --runtimes net6.0 --filter *QueryEnumerableBenchmark*

#if NET6_0_OR_GREATER
    public enum QueryEnum
    {
        Simple = 1,
        Encoded,
    }

    [ParamsAllValues]
    public QueryEnum QueryParam { get; set; }

    private string SimpleQueryString = "?key1=value1&key2=value2";
    private string QueryStringWithEncoding = "?key1=valu%20&key2=value%20";

    [Benchmark(Baseline = true)]
    public void QueryHelper()
    {
        var queryString = QueryParam == QueryEnum.Simple ? SimpleQueryString : QueryStringWithEncoding;
        foreach (var queryParam in QueryHelpers.ParseQuery(queryString))
        {
            _ = queryParam.Key;
            _ = queryParam.Value;
        }
    }

    [Benchmark]
    public void QueryEnumerable()
    {
        var queryString = QueryParam == QueryEnum.Simple ? SimpleQueryString : QueryStringWithEncoding;
        foreach (var queryParam in new QueryStringEnumerable(queryString))
        {
            _ = queryParam.DecodeName();
            _ = queryParam.DecodeValue();
        }
    }
#endif
Method           QueryParam  Mean       Ratio  Allocated
QueryHelper      Simple      243.13 ns  1.00   360 B
QueryEnumerable  Simple      91.43 ns   0.38   -
QueryHelper      Encoded     351.25 ns  1.00   432 B
QueryEnumerable  Encoded     197.59 ns  0.56   152 B

Note that there is no such thing as a free lunch. In the case of the new QueryStringEnumerable API, if you plan to enumerate the query string values multiple times, it can actually be more expensive than using QueryHelpers.ParseQuery and storing the parsed query string values in a dictionary.
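For repeated lookups, a sketch of the parse-once approach the note recommends (QueryHelpers is the existing Microsoft.AspNetCore.WebUtilities API; the query string here is illustrative):

```csharp
using Microsoft.AspNetCore.WebUtilities;

var queryString = "?key1=value1&key2=value2";

// Pay the parsing and dictionary allocation cost once...
var parsed = QueryHelpers.ParseQuery(queryString);

// ...then every subsequent lookup is a cheap dictionary access,
// with no re-parsing or re-decoding of the raw query string.
var key1 = parsed["key1"]; // "value1"
var key2 = parsed["key2"]; // "value2"
```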

dotnet/aspnetcore#29448 from @paulomorgado uses the string.Create method, which lets you initialize a string after it has been created if you know its final size. This was used to remove some temporary string allocations in UriHelper.BuildAbsolute.
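As a sketch of the technique (not the actual UriHelper code; the scheme and host values are illustrative), string.Create writes the characters directly into the final string's buffer instead of building it from concatenated temporaries:

```csharp
using System;

string scheme = "https";
string host = "localhost";

// One allocation for the final string; the callback fills it in place.
string absolute = string.Create(
    scheme.Length + 3 + host.Length, // exact final size: scheme + "://" + host
    (scheme, host),
    static (span, state) =>
    {
        state.scheme.AsSpan().CopyTo(span);
        "://".AsSpan().CopyTo(span.Slice(state.scheme.Length));
        state.host.AsSpan().CopyTo(span.Slice(state.scheme.Length + 3));
    });

Console.WriteLine(absolute); // https://localhost
```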

dotnet run -c Release -f netcoreapp3.1 --runtimes netcoreapp3.1 net6.0 --filter *UriHelperBenchmark*

#if NETCOREAPP
    [Benchmark]
    public void BuildAbsolute()
    {
        _ = UriHelper.BuildAbsolute("https", new HostString("localhost"));
    }
#endif
Method         Runtime        Toolchain      Mean      Ratio  Allocated
BuildAbsolute  .NET Core 3.1  netcoreapp3.1  92.87 ns  1.00   176 B
BuildAbsolute  .NET 6.0       net6.0         52.88 ns  0.57   64 B

PR dotnet/aspnetcore#31267 converted some of the ContentDispositionHeaderValue parsing logic to be Span<T>-based, avoiding temporary string and temporary byte[] allocations in common cases.

dotnet run -c Release -f net48 --runtimes net48 netcoreapp3.1 net5.0 net6.0 --filter *ContentDispositionBenchmark*

[Benchmark]
public void ParseContentDispositionHeader()
{
    var contentDisposition = new ContentDispositionHeaderValue("inline");
    contentDisposition.FileName = "FileAName.bat";
 }
Method                    Runtime             Toolchain      Mean      Ratio  Allocated
ContentDispositionHeader  .NET Framework 4.8  net48          654.9 ns  1.00   570 B
ContentDispositionHeader  .NET Core 3.1       netcoreapp3.1  581.5 ns  0.89   536 B
ContentDispositionHeader  .NET 5.0            net5.0         519.2 ns  0.79   536 B
ContentDispositionHeader  .NET 6.0            net6.0         295.4 ns  0.45   312 B

Idle connections

One of the major components of ASP.NET Core is the hosted server, which brings with it many different problems to optimize. We'll focus on improvements to idle connections in 6.0, where we made a number of changes to reduce the amount of memory used while a connection is waiting for data.

We made three different types of changes. The first was to reduce the size of the objects used by connections; this involved System.IO.Pipelines, SocketConnections, and SocketSenders. The second was to pool frequently accessed objects so we can reuse old instances and save allocations. The third was to take advantage of what are called "zero-byte reads". Here, we try to read from the connection with a zero-byte buffer; if data is available, the read completes with no data, but we then know data is available and can provide a buffer to read it immediately. This avoids pre-allocating a buffer for a read that may only complete in the future, so we can avoid a large allocation until we know data is available.
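A minimal sketch of the zero-byte-read pattern (an assumed helper, not Kestrel's actual code): on streams that support it (such as SslStream or NetworkStream), awaiting a read with an empty buffer completes only when data arrives, so no buffer is held while the connection sits idle.

```csharp
using System;
using System.Buffers;
using System.IO;
using System.Threading.Tasks;

static async Task<int> ReadWithZeroByteAwait(Stream stream)
{
    // Wait for data with a zero-length buffer; nothing is rented
    // during the (potentially long) idle period.
    await stream.ReadAsync(Memory<byte>.Empty);

    // Data is now known to be available: rent a buffer just in time.
    byte[] buffer = ArrayPool<byte>.Shared.Rent(4096);
    try
    {
        return await stream.ReadAsync(buffer);
    }
    finally
    {
        ArrayPool<byte>.Shared.Return(buffer);
    }
}
```

With 10,000 idle connections, the difference between holding 10,000 rented read buffers and holding none is exactly the kind of memory win described below.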


dotnet/aspnetcore#31308 refactored Kestrel's socket layer to avoid some async state machines and shrink the remaining ones, saving 33% of allocations per connection.

dotnet/aspnetcore#30769 removed the per-connection PipeOptions allocation and moved it to the connection factory, so we allocate only once for the entire lifetime of the server and reuse the same options for every connection. dotnet/aspnetcore#31311 from @benaadams replaced well-known header values in WebSocket requests with interned strings, which allows the strings allocated during header parsing to be garbage collected, reducing the memory usage of long-lived WebSocket connections. dotnet/aspnetcore#30771 refactored the sockets layer in Kestrel to avoid allocating both a SocketReceiver object and a SocketAwaitableEventArgs object by combining them into a single object; this saves a few bytes and results in fewer objects allocated per connection. That PR also pools the SocketSender class, so instead of creating one per connection you now have, on average, one per core. In the benchmark below, with 10,000 connections, only 16 were allocated on my machine instead of 10,000, saving ~46 MB!

Another change of a similar size is dotnet/runtime#49123, which adds support for zero-byte reads to SslStream, taking our 10,000 idle connections from ~46 MB down to ~2.3 MB allocated by SslStream. dotnet/runtime#49117 added zero-byte read support to StreamPipeReader, and Kestrel then used that in dotnet/aspnetcore#30863 to start doing zero-byte reads on SslStream.

The net result of all these changes is a significant reduction in memory usage for idle connections.

The following numbers are not from a BenchmarkDotNet app, as they measure idle connections, which was easier to set up with separate client and server applications.

The console client and WebApplication server code are available in the following gist:

gist.github.com/BrennanConr…

Here is the server memory used by 10,000 idle secure WebSocket connections (WSS) on the various frameworks.

Framework  Memory
net48      665.4 MB
net5.0     603.1 MB
net6.0     160.8 MB

That's nearly a 4x reduction in memory compared to .NET 5!

Entity Framework Core

EF Core made many improvements in 6.0: query execution is 31% faster, and the TechEmpower Fortunes benchmark improved by 70% thanks to runtime updates, optimized benchmarks, and the EF improvements.

These improvements came from object pooling, smarter checks for whether telemetry is enabled, and a new option to opt out of thread-safety checks when you know your application uses DbContext safely.

See the blog post announcing Entity Framework Core 6.0 Preview 4: Performance, which highlights many of the improvements in detail.

Blazor

Native byte[] interop

Blazor now has efficient support for byte arrays when performing JavaScript interop. Previously, byte arrays sent to and from JavaScript were Base64 encoded so they could be serialized as JSON, which increased the transfer size and CPU load. This Base64 encoding has been optimized away in .NET 6, allowing users to work transparently with byte[] in .NET and Uint8Array in JavaScript. The documentation describes how to use this feature for both JavaScript-to-.NET and .NET-to-JavaScript interop.

Let's look at a quick benchmark to see the difference in byte[] interop between .NET 5 and .NET 6. The following Razor code creates a 22 kB byte[] and sends it to a JavaScript function named receiveAndReturnBytes, which immediately returns it. This round trip is repeated 10,000 times, and the timing data is printed to the screen. The code is identical for .NET 5 and .NET 6.

<button @onclick="@RoundtripData">Roundtrip Data</button>
<hr />
@Message

@code {
    public string Message { get; set; } = "Press button to benchmark";

    private async Task RoundtripData()
    {
        var bytes = new byte[1024 * 22];
        List<double> timeForInterop = new List<double>();
        var testTime = DateTime.Now;

        for (var i = 0; i < 10_000; i++)
        {
            var interopTime = DateTime.Now;

            var result = await JSRuntime.InvokeAsync<byte[]>("receiveAndReturnBytes", bytes);

            timeForInterop.Add(DateTime.Now.Subtract(interopTime).TotalMilliseconds);
        }

        Message = $"Round-tripped: {bytes.Length / 1024d} kB 10,000 times and it took on average {timeForInterop.Average():F3}ms, and in total {DateTime.Now.Subtract(testTime).TotalMilliseconds:F1}ms";
    }
}

Next, let's look at the receiveAndReturnBytes JavaScript function. In .NET 5, we must first decode the Base64-encoded byte array into a Uint8Array so it can be used in application code. We then have to re-encode it into Base64 before returning the data to the server.

function receiveAndReturnBytes(bytesReceivedBase64Encoded) {
    const bytesReceived = base64ToArrayBuffer(bytesReceivedBase64Encoded);

    // Use Uint8Array data in application

    const bytesToSendBase64Encoded = base64EncodeByteArray(bytesReceived);

    if (bytesReceivedBase64Encoded !== bytesToSendBase64Encoded) {
        throw new Error("Expected input/output to match.");
    }

    return bytesToSendBase64Encoded;
}
// https://stackoverflow.com/a/21797381
function base64ToArrayBuffer(base64) {
    const binaryString = atob(base64);
    const length = binaryString.length;
    const result = new Uint8Array(length);
    for (let i = 0; i < length; i++) {
        result[i] = binaryString.charCodeAt(i);
    }
    return result;
}
function base64EncodeByteArray(data) {
    const charBytes = new Array(data.length);
    for (var i = 0; i < data.length; i++) {
        charBytes[i] = String.fromCharCode(data[i]);
    }
    const dataBase64Encoded = btoa(charBytes.join(''));
    return dataBase64Encoded;
}

The encoding/decoding adds significant overhead on both the client and server, and requires a fair bit of boilerplate code. So how is it done in .NET 6? Well, it's quite a bit simpler:

function receiveAndReturnBytes(bytesReceived) {
    // bytesReceived comes as a Uint8Array ready for use
    // and can be used by the application or immediately returned.
    return bytesReceived;
}

So it's definitely easier to write, but how does it perform? Running these snippets in the blazorserver template under the Release configuration for .NET 5 and .NET 6 respectively, we see that .NET 6 has a 78% performance improvement in byte[] interop!


            .NET 6 (ms)  .NET 5 (ms)  Improvement
Total time  5273         24463        78%

In addition, this byte array interop support is used by the framework to enable bidirectional streaming interop between JavaScript and .NET. Users can now transport arbitrary binary data. Documentation on streaming from .NET to JavaScript is available here, and the JavaScript-to-.NET documentation is here.

Input file

Using the Blazor streaming interop mentioned above, we now support uploading large files via the InputFile component (previously uploads were limited to around 2 GB). The component's speed is also significantly improved by using native byte[] streaming instead of Base64 encoding. For example, a 100 MB file uploads 77% faster than on .NET 5.

.NET 6 (ms)  .NET 5 (ms)  Improvement
2591         10504        75%
2607         11764        78%
2632         11821        78%
                 Average: 77%

Note that streaming interoperability support also effectively downloads (large) files, see the documentation for more details.

The InputFile component has been upgraded to use streaming via dotnet/aspnetcore#33900.

Grab bag

dotnet/aspnetcore#30320 from @benaadams modernized and optimized our TypeScript libraries, so sites load faster. The signalr.min.js file went from 36.8 kB compressed / 132 kB uncompressed to 16.1 kB compressed / 42.2 kB uncompressed, and blazor.server.js went from 86.7 kB compressed / 276 kB uncompressed to 43.9 kB compressed / 130 kB uncompressed.

dotnet/aspnetcore#31322 from @benaadams removed some unnecessary casts when getting common features from the connection's feature collection. This gives roughly a 50% improvement when accessing common features. Unfortunately, it isn't possible to show the improvement with a standalone benchmark because it requires a bunch of internal types; if you're interested in running them, the PR includes benchmarks that run against the internal code.

dotnet/aspnetcore#31519, also from @benaadams, adds default interface methods to the IHeaderDictionary type for accessing common headers via properties named after the header. No more mistyping common header names when accessing the header dictionary! More interesting for this blog post, this change allows server implementations to return a custom header dictionary that implements these new interface methods more optimally. For example, the server can store a header value directly in a field and return the field, rather than querying an internal dictionary for the value, which requires hashing the key and looking up the entry. This change can result in up to a 480% improvement in some cases when getting or setting header values. Once again, properly benchmarking this change requires setting it up with internal types, so I'm including the numbers from the PR, which contains benchmarks that run against the internal code for those interested in trying them.
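A sketch of the default-interface-method pattern (hypothetical types, not ASP.NET Core's actual IHeaderDictionary): the interface supplies a dictionary-lookup default, and an optimized implementation overrides it with direct field access, skipping the hash-and-lookup entirely.

```csharp
using System.Collections.Generic;

public interface IMyHeaderDictionary
{
    string? this[string key] { get; set; }

    // Default implementation: falls back to a generic dictionary lookup.
    string? ContentType
    {
        get => this["Content-Type"];
        set => this["Content-Type"] = value;
    }
}

public class OptimizedHeaders : IMyHeaderDictionary
{
    private string? _contentType; // well-known header kept in a field
    private readonly Dictionary<string, string?> _extra = new();

    public string? this[string key]
    {
        get => key == "Content-Type" ? _contentType : _extra.GetValueOrDefault(key);
        set
        {
            if (key == "Content-Type") _contentType = value;
            else _extra[key] = value;
        }
    }

    // Override: no key hashing, no dictionary entry lookup.
    string? IMyHeaderDictionary.ContentType
    {
        get => _contentType;
        set => _contentType = value;
    }
}
```

Callers that use the typed property transparently get the fast path when the server provides an optimized implementation, and the interface default otherwise.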

Method      Branch  Type       Mean          Ops/sec       Delta
GetHeaders  before  Plaintext  25.793 ns     38,770,569.6
GetHeaders  after   Plaintext  12.775 ns     78,279,480.0  +101.9%
GetHeaders  before  Common     121.355 ns    8,240,299.3
GetHeaders  after   Common     37.598 ns     26,597,474.6  +222.8%
GetHeaders  before  Unknown    366.456 ns    2,728,840.7
GetHeaders  after   Unknown    223.472 ns    4,474,824.0   +64.0%
SetHeaders  before  Plaintext  49.324 ns     20,273,931.8
SetHeaders  after   Plaintext  34.996 ns     28,574,778.8  +40.9%
SetHeaders  before  Common     635.060 ns    1,574,654.3
SetHeaders  after   Common     108.041 ns    9,255,723.7   +487.7%
SetHeaders  before  Unknown    1,439.945 ns  694,470.8
SetHeaders  after   Unknown    517.067 ns    1,933,985.7   +178.4%

dotnet/aspnetcore#31466 uses the new CancellationTokenSource.TryReset() method introduced in .NET 6 to reuse CancellationTokenSources when a connection closes without having been canceled. The numbers below were gathered by running bombardier against Kestrel with 125 connections, for about 100,000 requests.

Branch  Type                     Allocations  Bytes
Before  CancellationTokenSource  98,314       4,719,072
After   CancellationTokenSource  125          6,000
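A minimal sketch of the reuse pattern (an assumed single-slot pool, not Kestrel's actual code): CancellationTokenSource.TryReset() returns true only when the source was never canceled, in which case it can safely go back into a pool instead of being allocated fresh for the next connection.

```csharp
using System.Threading;

class CtsPool
{
    private CancellationTokenSource? _pooled;

    public CancellationTokenSource Rent()
    {
        var cts = _pooled;
        _pooled = null;
        return cts ?? new CancellationTokenSource();
    }

    public void Return(CancellationTokenSource cts)
    {
        // Reuse only if the source was never canceled and can be reset;
        // otherwise it must be disposed rather than reused.
        if (cts.TryReset())
        {
            _pooled = cts;
        }
        else
        {
            cts.Dispose();
        }
    }
}
```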

dotnet/aspnetcore#31528 and dotnet/aspnetcore#34075 made similar changes to reuse CancellationTokenSources for HTTPS handshakes and HTTP/3 streams, respectively.

dotnet/aspnetcore#31660 improved the performance of server-to-client streaming in SignalR by reusing a single allocated StreamItem object for the entire stream instead of allocating one per stream item. And dotnet/aspnetcore#31661 stores the HubCallerClients object on the SignalR connection instead of allocating it for every Hub method call.

dotnet/aspnetcore#31506 from @shreyasjejurkar refactored the internals of the WebSocket handshake to avoid a temporary List<T> allocation. dotnet/aspnetcore#32829 from @gfoidl refactored QueryCollection to reduce allocations and vectorize some of the code. And dotnet/aspnetcore#32234 from @benaadams removed an unused field in the HttpRequestHeaders enumeration, which improves performance by no longer allocating the field for every header being enumerated.

dotnet/aspnetcore#31333 from @martincostello converted Http.Sys to use LoggerMessage.Define, the high-performance logging API. This avoids unnecessary boxing of value types, parsing of the log format string, and, when the log level isn't enabled, allocating strings or objects.
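A sketch of the LoggerMessage.Define pattern (the event id and message here are made up, not Http.Sys's actual log messages): the format string is parsed once into a cached, strongly typed delegate, so a log call doesn't box the int argument or re-parse the template, and does nothing at all when the level is disabled.

```csharp
using System;
using Microsoft.Extensions.Logging;

internal static class Log
{
    // Parsed and cached once, at type initialization.
    private static readonly Action<ILogger, int, Exception?> _connectionAccepted =
        LoggerMessage.Define<int>(
            LogLevel.Debug,
            new EventId(1, "ConnectionAccepted"),
            "Connection accepted on port {Port}");

    public static void ConnectionAccepted(ILogger logger, int port)
        => _connectionAccepted(logger, port, null);
}
```

Compare this with logger.LogDebug("Connection accepted on port {Port}", port), which boxes the int into an object[] and parses the template on every call, even when Debug logging is off.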

dotnet/aspnetcore#31784 adds a new IApplicationBuilder.Use overload for registering middleware that avoids unnecessary per-request allocations when the middleware runs. The old code looks like this:

app.Use(async (context, next) =>
{
    await next();
});

The new code is as follows:

app.Use(async (context, next) =>
{
    await next(context);
});

The benchmark below simulates the middleware pipeline rather than setting up a full server to demonstrate the improvement. An int is used in place of HttpContext for the request, and the middleware returns a completed task.

dotnet run -c Release -f net6.0 --runtimes net6.0 --filter *UseMiddlewareBenchmark*

static private Func<Func<int, Task>, Func<int, Task>> UseOld(Func<int, Func<Task>, Task> middleware)
{
    return next =>
    {
        return context =>
        {
            Func<Task> simpleNext = () => next(context);
            return middleware(context, simpleNext);
        };
    };
}
static private Func<Func<int, Task>, Func<int, Task>> UseNew(Func<int, Func<int, Task>, Task> middleware)
{
    return next => context => middleware(context, next);
}
Func<int, Task> Middleware = UseOld((c, n) => n())(i => Task.CompletedTask);
Func<int, Task> NewMiddleware = UseNew((c, n) => n(c))(i => Task.CompletedTask);
[Benchmark(Baseline = true)]
public Task Use()
{
    return Middleware(10);
}
[Benchmark]
public Task UseNew()
{
    return NewMiddleware(10);
}
Method  Mean       Ratio  Allocated
Use     15.832 ns  1.00   96 B
UseNew  2.592 ns   0.16   -

Conclusion

We hope you enjoyed reading about some of the improvements in ASP.NET Core 6.0! We encourage you to also check out the blog post on performance improvements in the .NET 6 runtime.