Inspired by this issue being closed: "API request: string.Intern(ReadOnlySpan ...)" #28368
Shared pool is capped; with 2 generation LRU eviction and further evictions on Gen2 GC collections.
Collections
using Ben.Collections;
array = array.ToInternedArray();
list = list.ToInternedList();
dict = dict.ToInternedDictionary();
string val = stringBuilder.Intern();
var conDict = dict.ToInternedConcurrentDictionary();
Sql query
using static Ben.Collections.Specialized.StringCache;
while (await reader.ReadAsync())
{
var id = reader.GetInt32(0);
// Dedupe the string before you keep it
var category = Intern(reader.GetString(1));
// ...
}
namespace Ben.Collections.Specialized
{
public class InternPool
{
// Thread-safe shared intern pool; bounded at some capacity, with some max length
public static SharedInternPool Shared { get; }
// Unbounded
public InternPool();
// Unbounded with prereserved capacity
public InternPool(int capacity);
// Capped size with prereserved capacity, entires evicted based on 2 generation LRU
public InternPool(int capacity, int maxCount);
// Capped size; max pooled string length, with prereserved capacity, entires evicted based on 2 generation LRU
public InternPool(int capacity, int maxCount, int maxLength)
// Deduplicated; unbounded
public InternPool(IEnumerable<string> collection);
// Deduplicated; capped size, entires evicted based on 2 generation LRU
public InternPool(IEnumerable<string> collection, int maxCount);
[return: NotNullIfNotNull("value")]
public string? Intern(string? value);
public string Intern(ReadOnlySpan<char> value);
public string InternAscii(ReadOnlySpan<byte> asciiValue);
public string InternUtf8(ReadOnlySpan<byte> utf8Value);
// Gets the number of strings currently in the pool.
public int Count { get; }
// Count of strings checked
public long Considered { get; }
// Count of strings deduplicated
public long Deduped { get; }
// Count of strings added to the pool, may be larger than Count if there is a maxCount.
public long Added { get; }
}
}
Using the dotnet counters as InternPool
Get the proc id
> dotnet-counters ps
14600 MyApp MyAppPath\MyApp.exe
Then query monitor that process
> dotnet-counters monitor InternPool --process-id 14600
Press p to pause, r to resume, q to quit.
Status: Running
[InternPool]
Considered (Count / 1 sec) 4,497
Deduped (Count / 1 sec) 4,496
Evicted (Count / 1 sec) 0
Total Considered 1,357,121
Total Count 7,811
Total Deduped 1,316,376
Total Evicted 32,934
Total Gen2 Sweeps 3
dotnet build -c Release
cd tests/Benchmarks
dotnet run -c Release
Method | Dataset | Mean | Error | Ratio | Allocated |
---|---|---|---|---|---|
Default | 2M (20k-Values) | 284.7 ms | 5.30 ms | 1.00 | 63991944 B |
StringIntern_Instance | 2M (20k-Values) | 295.3 ms | 4.21 ms | 1.04 | 639920 B |
StringIntern_Shared | 2M (20k-Values) | 330.3 ms | 4.79 ms | 1.16 | 12256 B |
Default | 2M-Unique | 292.3 ms | 7.76 ms | 1.00 | 79199856 B |
StringIntern_Instance | 2M-Unique | 472.7 ms | 7.33 ms | 1.62 | 79199856 B |
StringIntern_Shared | 2M-Unique | 1,035.4 ms | 11.56 ms | 3.54 | 79200144 B |
Default | Taxi-data | 327.3 ms | 4.23 ms | 1.00 | 67108800 B |
StringIntern_Instance | Taxi-data | 291.2 ms | 1.16 ms | 0.89 | 224 B |
StringIntern_Shared | Taxi-data | 311.9 ms | 2.34 ms | 0.95 | 456 B |
Default | 2M-Unique-Long | 836.2 ms | 10.45 ms | 1.00 | 460444288 B |
StringIntern_Instance | 2M-Unique-Long | 1,242.7 ms | 8.06 ms | 1.49 | 460444288 B |
StringIntern_Shared | 2M-Unique-Long | 1,734.8 ms | 12.35 ms | 2.07 | 460444784 B |
Default | 2M (20k-Values-Long) | 736.5 ms | 33.14 ms | 1.00 | 332438368 B |
StringIntern_Instance | 2M (20k-Values-Long) | 368.1 ms | 2.59 ms | 0.50 | 3324384 B |
StringIntern_Shared | 2M (20k-Values-Long) | 413.4 ms | 4.13 ms | 0.56 | 32200 B |
See CONTRIBUTING.md for information on contributing to this project.
This project has adopted the code of conduct defined by the Contributor Covenant to clarify expected behavior in our community. For more information, see the .NET Foundation Code of Conduct.
This project is licensed with the Apache License.