Boosting C# Collection Initializer Performance by 87%
Understanding Collection Initializers in C#
My interest was piqued after noticing several discussions on LinkedIn about collection initializers. Surprisingly, some claimed that these features could deliver an 87% performance improvement. This led me to investigate the performance of collection initializers in C# using BenchmarkDotNet.
While micro-optimizations may seem trivial to some, they could be crucial for developers seeking to maximize their application's efficiency. So, let's delve into what collection initializers and collection expressions are in C#.
What Are Collection Initializers and Collection Expressions?
In a recent article, I introduced the fundamentals of collection initializers with straightforward examples. Instead of manually populating a collection like this:
List<string> devLeaderCoolList = new List<string>();
devLeaderCoolList.Add("Hello");
devLeaderCoolList.Add(", ");
devLeaderCoolList.Add("World!");
We can streamline this process into a more concise format:
List<string> devLeaderCoolList = new List<string> { "Hello", ", ", "World!" };
This syntax is not only cleaner but also enhances readability. However, the real question is: How does this impact performance? With various syntax options available, could there really be a significant difference?
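For reference, here is a minimal sketch of the same list built three ways. The class and method names are hypothetical and exist only for illustration, and the collection expression form assumes C# 12 or later:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, not part of the benchmarks in this article.
public static class InitializerForms
{
    // Explicit Add calls: this is what a collection initializer is translated into.
    public static List<string> ManualAdd()
    {
        List<string> list = new List<string>();
        list.Add("Hello");
        list.Add(", ");
        list.Add("World!");
        return list;
    }

    // Classic collection initializer: each element becomes an Add call.
    public static List<string> ClassicInitializer()
    {
        return new List<string> { "Hello", ", ", "World!" };
    }

    // Collection expression (assumes C# 12 or later).
    public static List<string> CollectionExpression()
    {
        return ["Hello", ", ", "World!"];
    }
}
```

All three methods return a list with the same contents; only the way the list gets populated differs, which is what the benchmarks measure.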
Dave Callan's LinkedIn post inspired this inquiry, prompting me to conduct some benchmarks on collection initializer performance.
The video titled "UNEXPECTED 87% Performance Boost! - C# Collection Initializers" details how to leverage these features effectively.
Exploring List Collection Initializer Performance
In this section, I will share benchmarks for initializing lists in C# through various methods. This includes comparing different collection initializers, the new collection expression syntax, and manual initialization techniques. You might expect manually calling Add to be slower than using a collection initializer; the results below put that assumption to the test.
For this analysis, I will not examine the spread operator, as I prefer to focus on collection combinations separately. I will utilize BenchmarkDotNet for all tests. If you're unfamiliar with BenchmarkDotNet, check out the linked video for guidance.
The List Benchmark Code
With BenchmarkDotNet installed, here’s how I set up the entry point for the benchmarks:
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System.Reflection;
BenchmarkRunner.Run(
Assembly.GetExecutingAssembly(),
args: args);
The following code snippet outlines the list benchmarks:
[MemoryDiagnoser]
[MediumRunJob]
public class ListBenchmarks
{
private static readonly string[] _dataAsArray = new string[]
{
"Apple", "Banana", "Orange",
};
private static IEnumerable<string> GetDataAsIterator()
{
yield return "Apple";
yield return "Banana";
yield return "Orange";
}
[Benchmark(Baseline = true)]
public List<string> ClassicCollectionInitializer_NoCapacity()
{
return new List<string>
{
"Apple", "Banana", "Orange",
};
}
[Benchmark]
public List<string> ClassicCollectionInitializer_SetCapacity()
{
return new List<string>(3)
{
"Apple", "Banana", "Orange",
};
}
[Benchmark]
public List<string> CollectionExpression()
{
return ["Apple", "Banana", "Orange"];
}
[Benchmark]
public List<string> CopyConstructor_Array()
{
return new List<string>(_dataAsArray);
}
[Benchmark]
public List<string> CopyConstructor_Iterator()
{
return new List<string>(GetDataAsIterator());
}
[Benchmark]
public List<string> ManuallyAdd_NoCapacitySet()
{
List<string> list = new();
list.Add("Apple");
list.Add("Banana");
list.Add("Orange");
return list;
}
[Benchmark]
public List<string> ManuallyAdd_CapacitySet()
{
List<string> list = new(3);
list.Add("Apple");
list.Add("Banana");
list.Add("Orange");
return list;
}
}
In this setup, the baseline for comparison is the traditional collection initializer without an initial capacity.
The List Benchmark Results
Here are the benchmark outcomes, ranked from least to most efficient based on the Ratio column (higher values indicate worse performance):
- 3.31X — Using a copy constructor with an iterator performed the worst due to iterator overhead.
- 1.76X — The copy constructor with an array also underperformed, suggesting that classic initializers might be preferable if capacity is known.
- 1.10X — Manually adding to a collection without capacity is slightly slower than using a collection initializer with no capacity.
- 1.0X — The classic collection initializer serves as our baseline.
Notably, we begin to see performance improvement:
- 0.64X — Using a collection expression resulted in a significant 36% performance boost.
- 0.61X — Manually adding items to a list with pre-set capacity was actually faster than previous methods.
- 0.53X — Providing capacity for a classic collection initializer nearly halved the time required.
Overall, knowing the collection size beforehand yields considerable performance improvements. It's intriguing to ponder why the compiler doesn’t optimize this automatically.
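One way to see why capacity matters is to watch how often a list has to reallocate its backing array. The following is a minimal sketch using only the base class library; `CapacityDemo` and `CountResizes` are hypothetical names introduced for illustration, and the sketch infers reallocations from changes to `List<T>.Capacity` rather than measuring time:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the benchmark code above.
public static class CapacityDemo
{
    // Adds itemCount strings and counts how often the list's Capacity changes,
    // which corresponds to the backing array being reallocated and copied.
    public static int CountResizes(int itemCount, int initialCapacity)
    {
        List<string> list = new List<string>(initialCapacity);
        int resizes = 0;
        int lastCapacity = list.Capacity;

        for (int i = 0; i < itemCount; i++)
        {
            list.Add("item");
            if (list.Capacity != lastCapacity)
            {
                resizes++;
                lastCapacity = list.Capacity;
            }
        }

        return resizes;
    }
}
```

Calling `CapacityDemo.CountResizes(3, 3)` reports no reallocations, while `CapacityDemo.CountResizes(3, 0)` reports at least one, because an empty list has to allocate a backing array on the first Add. The exact growth policy is an implementation detail of the runtime, so treat the counts as indicative rather than guaranteed.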
Examining Dictionary Collection Initializer Performance
While dictionaries lack the streamlined syntax of collection expressions, we can still benchmark various collection initializer methods.
The Dictionary Benchmark Code
Here’s the code for the dictionary benchmarks:
[MemoryDiagnoser]
[MediumRunJob]
public class DictionaryBenchmarks
{
private static readonly Dictionary<string, string> _sourceData = new()
{
["Apple"] = "The first value",
["Banana"] = "The next value",
["Orange"] = "The last value",
};
private static IEnumerable<KeyValuePair<string, string>> GetDataAsIterator()
{
foreach (var item in _sourceData)
{
yield return item;
}
}
[Benchmark(Baseline = true)]
public Dictionary<string, string> CollectionInitializer_BracesWithoutCapacity()
{
return new Dictionary<string, string>
{
{ "Apple", "The first value" },
{ "Banana", "The next value" },
{ "Orange", "The last value" },
};
}
[Benchmark]
public Dictionary<string, string> CollectionInitializer_BracesWithCapacity()
{
return new Dictionary<string, string>(3)
{
{ "Apple", "The first value" },
{ "Banana", "The next value" },
{ "Orange", "The last value" },
};
}
[Benchmark]
public Dictionary<string, string> CollectionInitializer_BracketsWithoutCapacity()
{
return new Dictionary<string, string>
{
["Apple"] = "The first value",
["Banana"] = "The next value",
["Orange"] = "The last value",
};
}
[Benchmark]
public Dictionary<string, string> CollectionInitializer_BracketsWithCapacity()
{
return new Dictionary<string, string>(3)
{
["Apple"] = "The first value",
["Banana"] = "The next value",
["Orange"] = "The last value",
};
}
[Benchmark]
public Dictionary<string, string> CopyConstructor_Dictionary()
{
return new Dictionary<string, string>(_sourceData);
}
[Benchmark]
public Dictionary<string, string> CopyConstructor_Iterator()
{
return new Dictionary<string, string>(GetDataAsIterator());
}
[Benchmark]
public Dictionary<string, string> ManuallyAdd_NoCapacitySet()
{
Dictionary<string, string> dict = new();
dict.Add("Apple", "The first value");
dict.Add("Banana", "The next value");
dict.Add("Orange", "The last value");
return dict;
}
[Benchmark]
public Dictionary<string, string> ManuallyAdd_CapacitySet()
{
Dictionary<string, string> dict = new(3);
dict.Add("Apple", "The first value");
dict.Add("Banana", "The next value");
dict.Add("Orange", "The last value");
return dict;
}
[Benchmark]
public Dictionary<string, string> ManuallyAssign_NoCapacitySet()
{
Dictionary<string, string> dict = new();
dict["Apple"] = "The first value";
dict["Banana"] = "The next value";
dict["Orange"] = "The last value";
return dict;
}
[Benchmark]
public Dictionary<string, string> ManuallyAssign_CapacitySet()
{
Dictionary<string, string> dict = new(3);
dict["Apple"] = "The first value";
dict["Banana"] = "The next value";
dict["Orange"] = "The last value";
return dict;
}
}
This code illustrates two key points: the distinction between square brackets and curly braces for collection initializers and the different methods of manually populating a dictionary.
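The two initializer styles are not only a matter of appearance: curly-brace elements call Add, while square-bracket elements go through the indexer setter. A minimal sketch of the behavioral difference follows, using a deliberately duplicated key. The class and method names are hypothetical and only serve the illustration:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the benchmark code above.
public static class BracesVersusBrackets
{
    // Curly-brace elements call Add, which rejects a duplicate key.
    public static bool BracesThrowOnDuplicateKey()
    {
        try
        {
            _ = new Dictionary<string, string>
            {
                { "Apple", "The first value" },
                { "Apple", "A replacement value" },
            };
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }

    // Square-bracket elements use the indexer setter, so the last write wins.
    public static string BracketsOverwriteDuplicateKey()
    {
        var dict = new Dictionary<string, string>
        {
            ["Apple"] = "The first value",
            ["Apple"] = "A replacement value",
        };
        return dict["Apple"];
    }
}
```

The same distinction applies to the manual variants in the benchmarks: `dict.Add(key, value)` throws on a duplicate key, whereas `dict[key] = value` silently overwrites it. The two paths do slightly different work, which is one plausible reason their timings differ.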
The Dictionary Benchmark Results
The results for the dictionary benchmarks are as follows:
- 2.03X — The copy constructor with an iterator again ranks lowest due to similar reasons as before.
- 1.02X — Manually adding items with capacity matched the performance of collection initializers.
- 1.0X — The classic collection initializer using curly braces without specified capacity serves as the baseline.
Beyond this baseline, other methods showed improved performance:
- 0.96X — Providing capacity slightly improved efficiency over the baseline.
- 0.95X — Manually assigning without capacity was faster than the baseline by about 5%.
- 0.94X — Classic collection initializer using square brackets without capacity exceeded the baseline performance by ~6%.
- 0.90X — Manually assigning with known capacity was about 11% faster than the baseline.
- 0.87X — Classic collection initializer with square brackets and known capacity was about 15% faster than the baseline.
- 0.86X — Manually adding without capacity was approximately 16% faster.
Interestingly, the copy constructor that takes an existing dictionary yielded the fastest results of all, which suggests that when the source data already lives in a dictionary, copying it can be the most efficient option.
Are These Benchmarks Practical?
Having authored numerous articles, videos, and social media posts, I anticipate some critiques regarding these benchmarks. It's crucial to contextualize these results and consider their applicability.
In the broader landscape of software development, focusing solely on these micro-optimizations may not yield substantial performance improvements. However, the benchmarks demonstrate that there are indeed gains to be had!
When working with a small dataset or executing rapid operations, the overhead of iterators can overshadow the overall efficiency of the process. Additionally, the use case for initializing collections can vary significantly.
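A likely contributor to the iterator overhead is that an iterator cannot report how many items it will produce. The sketch below is an illustration under that assumption rather than a profile of the benchmark itself; `IteratorOverheadDemo` and its members are hypothetical names. It checks whether each source implements `ICollection<T>`, which is how a collection can expose its count up front, and compares the capacity of the resulting lists:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the benchmark code above.
public static class IteratorOverheadDemo
{
    // An array-backed source, typed as IEnumerable<string> like the iterator.
    public static IEnumerable<string> ArraySource()
    {
        return new[] { "Apple", "Banana", "Orange" };
    }

    // A compiler-generated iterator producing the same three items.
    public static IEnumerable<string> IteratorSource()
    {
        yield return "Apple";
        yield return "Banana";
        yield return "Orange";
    }

    // True when the source can report its item count before being enumerated.
    public static bool KnowsItsCount(IEnumerable<string> source)
    {
        return source is ICollection<string>;
    }

    // Capacity of a list built through the copy constructor.
    public static int CapacityAfterCopy(IEnumerable<string> source)
    {
        return new List<string>(source).Capacity;
    }
}
```

`KnowsItsCount(ArraySource())` is true and `KnowsItsCount(IteratorSource())` is false. With the array, the list can be sized exactly (`CapacityAfterCopy(ArraySource())` is 3), whereas the iterator has to be walked item by item, paying for the enumerator as well as any growth of the backing array along the way.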
Providing known capacity appears to be a vital factor in enhancing performance, likely preventing unnecessary resizing. What would the impact be with larger data sets?
These benchmarks aim to inform rather than dictate coding practices. If you're performance-focused, I encourage you to benchmark and profile your own code rather than relying solely on external results.
Final Thoughts on Collection Initializer Performance in C#
Ultimately, the insights presented in this article on collection initializer performance in C# highlight potential micro-optimizations. While these findings are intriguing, code readability should remain the priority, and it is worth remembering that collection initialization is rarely the main source of an application's performance problems.
I hope you enjoyed this exploration into collection initializers and found it enlightening. If you’re eager for more knowledge, consider subscribing to my free weekly software engineering newsletter and exploring my videos on YouTube!