Nemesis.TextParsers
2.9.1
dotnet add package Nemesis.TextParsers --version 2.9.1
NuGet\Install-Package Nemesis.TextParsers -Version 2.9.1
<PackageReference Include="Nemesis.TextParsers" Version="2.9.1" />
paket add Nemesis.TextParsers --version 2.9.1
#r "nuget: Nemesis.TextParsers, 2.9.1"
// Install Nemesis.TextParsers as a Cake Addin
#addin nuget:?package=Nemesis.TextParsers&version=2.9.1
// Install Nemesis.TextParsers as a Cake Tool
#tool nuget:?package=Nemesis.TextParsers&version=2.9.1
Benefits and Features
TL;DR: are you looking for a performant, non-allocating serializer from a structural object to a flat, human-editable string? Look no further. Benchmarks show the potential gains from using Nemesis.TextParsers:
Method | Count | Mean | Ratio | Allocated |
---|---|---|---|---|
TextJson | 10 | 121.02 us | 1.00 | 35200 B |
TextJsonBytes | 10 | 120.79 us | 1.00 | 30400 B |
TextJsonNet | 10 | 137.28 us | 1.13 | 288000 B |
TextParsers | 10 | 49.02 us | 0.41 | 6400 B |
TextJson | 100 | 846.06 us | 1.00 | 195200 B |
TextJsonBytes | 100 | 845.84 us | 1.00 | 163200 B |
TextJsonNet | 100 | 943.71 us | 1.12 | 636800 B |
TextParsers | 100 | 463.33 us | 0.55 | 42400 B |
TextJson | 1000 | 8,142.13 us | 1.00 | 1639200 B |
TextJsonBytes | 1000 | 8,155.41 us | 1.00 | 1247200 B |
TextJsonNet | 1000 | 8,708.12 us | 1.07 | 3880800 B |
TextParsers | 1000 | 4,384.00 us | 0.54 | 402400 B |
More comprehensive examples are here
Other popular choices
When stuck with the task of parsing various items from strings, we often opt for TypeConverter. We tend to create methods like:
public static T FromString<T>(string text) =>
(T)TypeDescriptor.GetConverter(typeof(T))
.ConvertFromInvariantString(text);
or even create similar constructs to be in line with object oriented design:
public abstract class TextTypeConverter : TypeConverter
{
public sealed override bool CanConvertFrom(ITypeDescriptorContext context, Type sourceType) =>
sourceType == typeof(string) || base.CanConvertFrom(context, sourceType);
public sealed override bool CanConvertTo(ITypeDescriptorContext context, Type destinationType) =>
destinationType == typeof(string) || base.CanConvertTo(context, destinationType);
}
public abstract class BaseTextConverter<TValue> : TextTypeConverter
{
public sealed override object ConvertFrom(ITypeDescriptorContext context, CultureInfo culture, object value) =>
value is string text ? ParseString(text) : default;
public abstract TValue ParseString(string text);
public sealed override object ConvertTo(ITypeDescriptorContext context, CultureInfo culture, object value, Type destinationType) =>
destinationType == typeof(string) ?
FormatToString((TValue)value) :
base.ConvertTo(context, culture, value, destinationType);
public abstract string FormatToString(TValue value);
}
What is wrong with that? Well, nothing... except for performance and, possibly, support for generics.
TypeConverter was designed around 2002, when processing power tended to double every now and then, and (in my opinion) it was better suited to GUI-like editors, where performance is usually not an issue. But imagine a service application, such as an exchange trading suite, that has to perform many operations per second; in such cases the processor has more important things to do than parse strings.
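To make the cost concrete, here is a minimal sketch that puts the generic helper from above next to a span-based parse for a known type. The class name `TypeConverterDemo` and the method `ParseInt` are hypothetical names used only for illustration; only BCL calls are used, and this is not how Nemesis.TextParsers itself is implemented.

```csharp
using System;
using System.ComponentModel;
using System.Globalization;

// Hypothetical demo class; not part of Nemesis.TextParsers.
public static class TypeConverterDemo
{
    // The helper from the text: the converter is looked up at run time
    // and the result travels through 'object', so value types get boxed.
    public static T FromString<T>(string text) =>
        (T)TypeDescriptor.GetConverter(typeof(T))
            .ConvertFromInvariantString(text);

    // Alternative for a type known at compile time: no converter lookup,
    // no boxing, and the input does not have to be a string instance.
    // Requires the span overload of int.Parse (.NET Core 2.1+ / .NET Standard 2.1).
    public static int ParseInt(ReadOnlySpan<char> text) =>
        int.Parse(text, NumberStyles.Integer, CultureInfo.InvariantCulture);
}
```

Calling `TypeConverterDemo.FromString<int>("42")` and `TypeConverterDemo.ParseInt("42".AsSpan())` should yield the same value; the difference lies in the lookup and boxing performed on every call of the first variant.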
Features
- as concise as possible - both JSON and XML exist, but they are not well suited to being written by hand by human support staff
- works on various architectures, supports .NET Core and .NET Standard, and is culture independent
- support for basic system types (C#-like type names):
- string
- bool
- byte/sbyte, short/ushort, int/uint, long/ulong
- float/double
- decimal
- BigInteger
- TimeSpan, DateTime/DateTimeOffset
- Guid, Uri
- supports pattern based parsing/formatting via ToString/FromText methods placed inside type or static/instance factory
- supports compound types:
- KeyValuePair<,> and ValueTuple of any arity
- Enums (with underlying number types; code gen and reflection based)
- Nullables
- Dictionaries (built-in i.e. SortedDictionary/SortedList and custom ones)
- Arrays (including jagged arrays)
- Standard collections and collection contracts (List vs IList vs IEnumerable)
- User defined collections
- everything mentioned above but combined with inner elements properly escaped in final string i.e. SortedDictionary<char?, IList<float[][]>>
- ability to fallback to TypeConverter if no parsing/formatting strategy was found
- parsing is fast while allocating as little memory as possible. The following benchmark illustrates this by parsing a 1000-element array
Method | Mean | Ratio | Gen 0 | Gen 1 | Allocated | Remarks |
---|---|---|---|---|---|---|
RegEx parsing | 4,528.99 us | 44.98 | 492.1875 | - | 2089896 B | Regular expression with escaping support |
StringSplitTest_KnownType | 93.41 us | 0.92 | 9.5215 | 0.1221 | 40032 B | string.Split(..).Select(text=>int.Parse(text)) |
StringSplitTest_DynamicType | 474.73 us | 4.69 | 24.4141 | - | 104032 B | string.Split + TypeDescriptor.GetConverter |
SpanSplitTest_NoAlloc | 101.00 us | 1.00 | - | - | - | "1|2|3".AsSpan().Tokenize() |
SpanSplitTest_Alloc | 101.38 us | 1.00 | 0.8545 | - | 4024 B | "1|2|3".AsSpan().Tokenize(); var array = new int[1000]; |
- provides basic building blocks for parser's callers to be able to create their own transformers/factories
- LeanCollection that can store 1,2,3 or more elements
- SpanSplit - string.Split equivalent is provided to accept faster representation of string - ReadOnlySpan<char>. Supports both standard and custom escaping sequences
- access to every implemented parser/formatter
- basic LINQ support
var avg = "1|2|3".AsSpan()
.Tokenize('|', '\\', true)
.Parse('\\', '∅', '|')
.Average(DoubleTransformer.Instance);
- basic support for GUI editors for compound types like collections/dictionaries: CollectionMeta, DictionaryMeta
- lean/frugal implementation of StringBuilder - ValueSequenceBuilder
Span<char> initialBuffer = stackalloc char[32];
using var accumulator = new ValueSequenceBuilder<char>(initialBuffer);
using (var enumerator = coll.GetEnumerator())
while (enumerator.MoveNext())
FormatElement(formatter, enumerator.Current, ref accumulator);
return accumulator.AsSpanTo(accumulator.Length > 0 ? accumulator.Length - 1 : 0).ToString();
- usage of C# 9.0 code-gen (and Incremental Code Generators) to provide several transformers for common cases where parsing logic is straightforward
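The allocation-free splitting idea behind the benchmark above can be sketched with BCL types alone. The following is an illustration under stated assumptions, not the library's `Tokenize` implementation: the names `SpanSplitSketch` and `SumTokens` are hypothetical, escaping sequences are not handled, and an empty token (as in `"1||2"`) makes `int.Parse` throw a `FormatException`.

```csharp
using System;
using System.Globalization;

// Hypothetical sketch; not the library's SpanSplit/Tokenize code.
public static class SpanSplitSketch
{
    // Sums separator-delimited integers by slicing the input span,
    // so no substrings and no intermediate arrays are allocated.
    public static long SumTokens(ReadOnlySpan<char> input, char separator = '|')
    {
        long sum = 0;
        while (!input.IsEmpty)
        {
            int index = input.IndexOf(separator);

            // The current token is everything up to the next separator (or the rest).
            ReadOnlySpan<char> token = index < 0 ? input : input.Slice(0, index);
            sum += int.Parse(token, NumberStyles.Integer, CultureInfo.InvariantCulture);

            // Advance past the separator, or finish if there was none.
            input = index < 0 ? ReadOnlySpan<char>.Empty : input.Slice(index + 1);
        }
        return sum;
    }
}
```

For example, `SpanSplitSketch.SumTokens("1|2|3".AsSpan())` walks the three tokens in place. Escaping support, which the library provides, would require scanning character by character instead of relying on `IndexOf`.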
Todo / road map
- ability to format to buffer i.e. TryFormat pattern
- support for ILookup<,>, IGrouping<,>
- support for native parsing/formatting of F# types (map, collections, records...)
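For readers unfamiliar with the TryFormat pattern mentioned in the first road map item, here is a minimal sketch of what formatting into a caller-supplied buffer could look like. It is only an assumption about the general shape: `TryFormatSketch` and its signature are hypothetical and do not describe a planned Nemesis.TextParsers API. It relies on the BCL method `int.TryFormat`, available from .NET Core 2.1 and .NET Standard 2.1.

```csharp
using System;
using System.Globalization;

// Hypothetical sketch of the TryFormat pattern; not a library API.
public static class TryFormatSketch
{
    // Writes values as "a|b|c" into 'destination'.
    // Returns false (with charsWritten = 0) when the buffer is too small.
    public static bool TryFormat(ReadOnlySpan<int> values, Span<char> destination,
        out int charsWritten, char separator = '|')
    {
        charsWritten = 0;
        for (int i = 0; i < values.Length; i++)
        {
            if (i > 0)
            {
                // Make sure there is room for the separator.
                if (charsWritten >= destination.Length) { charsWritten = 0; return false; }
                destination[charsWritten++] = separator;
            }

            // Format the number directly into the remaining part of the buffer.
            if (!values[i].TryFormat(destination.Slice(charsWritten), out int written,
                    default, CultureInfo.InvariantCulture))
            {
                charsWritten = 0;
                return false;
            }
            charsWritten += written;
        }
        return true;
    }
}
```

A caller would typically pass a `stackalloc` buffer and fall back to a larger rented buffer when the method returns false, which keeps the common case free of heap allocations.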
Links
Funding
Open source software is free to use, but creating and maintaining it is a laborious effort. Should you wish to support us in our noble endeavour, please consider the following donation methods:
If you just want to say thanks, you can buy me a ☕ or ⭐ any of my repositories.
License
Product | Versions (compatible and additional computed target framework versions) |
---|---|
.NET | net5.0 was computed. net5.0-windows was computed. net6.0 is compatible. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 is compatible. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. |
.NET Core | netcoreapp2.0 was computed. netcoreapp2.1 was computed. netcoreapp2.2 was computed. netcoreapp3.0 was computed. netcoreapp3.1 was computed. |
.NET Standard | netstandard2.0 is compatible. netstandard2.1 is compatible. |
.NET Framework | net461 was computed. net462 was computed. net463 was computed. net47 was computed. net471 was computed. net472 was computed. net48 was computed. net481 was computed. |
MonoAndroid | monoandroid was computed. |
MonoMac | monomac was computed. |
MonoTouch | monotouch was computed. |
Tizen | tizen40 was computed. tizen60 was computed. |
Xamarin.iOS | xamarinios was computed. |
Xamarin.Mac | xamarinmac was computed. |
Xamarin.TVOS | xamarintvos was computed. |
Xamarin.WatchOS | xamarinwatchos was computed. |
- .NETStandard 2.0
  - System.Memory (>= 4.5.5)
- .NETStandard 2.1
  - System.Runtime.CompilerServices.Unsafe (>= 6.0.0)
- net6.0
  - No dependencies.
- net7.0
  - No dependencies.
NuGet packages (2)
Showing the top 2 NuGet packages that depend on Nemesis.TextParsers:
Package | Downloads |
---|---|
Nemesis.Demos: Set of utils for showing coding and language/framework features in form of live demos. This package was built from the source at https://github.com/nemesissoft/Nemesis.Demos/tree/dcb94c7943b7275519e5994167a812cba983e01d | |
Nemesis.TextParsers.DependencyInjection: Contains helper methods useful to setup DependencyInjection using Microsoft.Extensions.DependencyInjection. This package was built from the source at https://github.com/nemesissoft/Nemesis.TextParsers/tree/e0a38ccb3e4232df89c2e1737b5316fe91462790 | |
GitHub repositories
This package is not used by any popular GitHub repositories.
Version | Downloads | Last updated |
---|---|---|
2.9.15 | 197 | 8/9/2024 |
2.9.6 | 146 | 8/8/2024 |
2.9.2 | 232 | 1/3/2024 |
2.9.1 | 127 | 1/1/2024 |
2.8.2 | 148 | 12/19/2023 |
2.7.2 | 279 | 7/16/2023 |
2.7.1 | 167 | 7/14/2023 |
2.7.0 | 154 | 7/14/2023 |
2.6.3 | 487 | 5/30/2022 |
2.6.2 | 427 | 3/1/2021 |
2.6.1 | 360 | 2/25/2021 |
2.6.0 | 383 | 2/25/2021 |
2.5.0 | 409 | 12/31/2020 |
2.4.0 | 426 | 11/30/2020 |
2.3.0 | 429 | 11/16/2020 |
2.2.1 | 538 | 5/15/2020 |
2.2.0 | 431 | 5/14/2020 |
2.1.2 | 446 | 5/12/2020 |
2.1.1 | 500 | 5/1/2020 |
2.1.0 | 464 | 4/28/2020 |
2.0.4 | 468 | 4/26/2020 |
2.0.2 | 483 | 4/21/2020 |
2.0.1 | 486 | 4/17/2020 |
2.0.0-alpha | 352 | 4/15/2020 |
1.5.1 | 465 | 3/29/2020 |
1.5.0 | 505 | 3/28/2020 |
1.4.1 | 525 | 3/23/2020 |
1.3.2 | 495 | 3/19/2020 |
1.3.0 | 497 | 3/16/2020 |
1.2.0 | 559 | 3/15/2020 |
1.1.3 | 579 | 3/14/2020 |
1.1.2 | 550 | 2/27/2020 |
1.1.1 | 492 | 2/26/2020 |
1.1.0 | 548 | 2/26/2020 |
1.0.6 | 565 | 2/25/2020 |
1.0.4 | 439 | 2/25/2020 |
1.0.3 | 563 | 2/18/2020 |
1.0.2 | 552 | 11/8/2019 |
1.0.1 | 508 | 11/6/2019 |
1.0.0 | 513 | 9/25/2019 |
0.11.50 | 522 | 9/25/2019 |
0.11.47 | 505 | 9/25/2019 |
0.11.46 | 518 | 9/23/2019 |
0.11.42 | 557 | 9/18/2019 |
0.11.41 | 518 | 9/18/2019 |
0.11.40 | 511 | 9/18/2019 |
0.11.39 | 575 | 9/18/2019 |
0.11.38 | 536 | 9/18/2019 |
0.11.37 | 517 | 9/18/2019 |
0.11.36 | 534 | 9/17/2019 |
0.11.35 | 535 | 9/17/2019 |
0.11.34 | 535 | 9/17/2019 |
0.11.33 | 545 | 9/17/2019 |
0.9.32 | 538 | 9/17/2019 |
0.9.31 | 527 | 9/11/2019 |
0.9.30 | 545 | 9/9/2019 |
0.9.29 | 524 | 9/6/2019 |
0.9.28 | 543 | 8/3/2019 |
0.9.27 | 561 | 8/3/2019 |
0.9.26 | 522 | 8/1/2019 |
0.9.25 | 561 | 7/21/2019 |
0.9.24 | 550 | 7/19/2019 |
0.9.22 | 558 | 6/14/2019 |
0.9.21 | 541 | 6/13/2019 |
0.9.20 | 592 | 6/9/2019 |
0.9.19 | 616 | 6/7/2019 |
0.9.18 | 596 | 6/5/2019 |
0.9.15 | 554 | 5/29/2019 |
0.9.14 | 605 | 5/29/2019 |
0.9.13 | 587 | 5/28/2019 |
0.9.12 | 585 | 5/27/2019 |
0.9.10 | 601 | 5/21/2019 |
0.9.8 | 574 | 5/7/2019 |
0.9.7 | 596 | 5/5/2019 |
0.9.6 | 578 | 5/5/2019 |
0.9.5 | 558 | 5/5/2019 |
0.0.0-alpha.0.335 | 79 | 1/1/2024 |
# Release 2.9.1 - Source code generator for enum types
## What's Changed
* Implement source code generator for enum by @MichalBrylka in https://github.com/nemesissoft/Nemesis.TextParsers/pull/16
**Full Changelog**: https://github.com/nemesissoft/Nemesis.TextParsers/compare/v2.7.2...2.9.1
## Code generator for enum types
With this feature it is enough to annotate an enum with two attributes:
```csharp
[Auto.AutoEnumTransformer(
//1. optionally pass parser settings
CaseInsensitive = true, AllowParsingNumerics = true,
//2. TransformerClassName can be left blank. In that case the name of enum is used with "Transformer" suffix
TransformerClassName = "MonthCodeGenTransformer",
//3. optionally pass namespace to generate the transformer class within. If not provided the namespace of the enum will be used
TransformerClassNamespace = "ABC"
)]
//4. decorate enum with TransformerAttribute that points to automatically generated transformer
[Transformer(typeof(ABC.MonthCodeGenTransformer))]
public enum Month : byte
{
None = 0,
January = 1, February = 2, March = 3,
April = 4, May = 5, June = 6,
July = 7, August = 8, September = 9,
October = 10, November = 11, December = 12
}
```
This in turn generates the following parser using best practices (some lines are omitted for brevity):
<details>
<summary>Source code for generated parser</summary>
```csharp
public sealed class MonthCodeGenTransformer : TransformerBase<Nemesis.TextParsers.CodeGen.Sample.Month>
{
public override string Format(Nemesis.TextParsers.CodeGen.Sample.Month element) => element switch
{
Nemesis.TextParsers.CodeGen.Sample.Month.None => nameof(Nemesis.TextParsers.CodeGen.Sample.Month.None),
Nemesis.TextParsers.CodeGen.Sample.Month.January => nameof(Nemesis.TextParsers.CodeGen.Sample.Month.January),
// ...
Nemesis.TextParsers.CodeGen.Sample.Month.December => nameof(Nemesis.TextParsers.CodeGen.Sample.Month.December),
_ => element.ToString("G"),
};
protected override Nemesis.TextParsers.CodeGen.Sample.Month ParseCore(in ReadOnlySpan<char> input) =>
input.IsWhiteSpace() ? default : (Nemesis.TextParsers.CodeGen.Sample.Month)ParseElement(input);
private static byte ParseElement(ReadOnlySpan<char> input)
{
if (input.IsEmpty || input.IsWhiteSpace()) return default;
input = input.Trim();
if (IsNumeric(input) && byte.TryParse(input
#if NETFRAMEWORK
.ToString() //legacy frameworks do not support parsing from ReadOnlySpan<char>
#endif
, out var number))
return number;
else
return ParseName(input);
static bool IsNumeric(ReadOnlySpan<char> input) =>
input.Length > 0 && input[0] is var first &&
(char.IsDigit(first) || first is '-' or '+');
}
private static byte ParseName(ReadOnlySpan<char> input)
{
if (IsEqual(input, nameof(Nemesis.TextParsers.CodeGen.Sample.Month.None)))
return (byte)Nemesis.TextParsers.CodeGen.Sample.Month.None;
else if (IsEqual(input, nameof(Nemesis.TextParsers.CodeGen.Sample.Month.January)))
return (byte)Nemesis.TextParsers.CodeGen.Sample.Month.January;
else if (IsEqual(input, nameof(Nemesis.TextParsers.CodeGen.Sample.Month.February)))
return (byte)Nemesis.TextParsers.CodeGen.Sample.Month.February;
// ...
else if (IsEqual(input, nameof(Nemesis.TextParsers.CodeGen.Sample.Month.December)))
return (byte)Nemesis.TextParsers.CodeGen.Sample.Month.December;
else throw new FormatException(@$"Enum of type 'Nemesis.TextParsers.CodeGen.Sample.Month' cannot be parsed from '{input.ToString()}'.
Valid values are: [None or January or February or March or April or May or June or July or August or September or October or November or December] or number within byte range.
Ignore case option on.");
static bool IsEqual(ReadOnlySpan<char> input, string label) =>
MemoryExtensions.Equals(input, label.AsSpan(), StringComparison.OrdinalIgnoreCase);
}
}
```
</details>