Utf8StringSplitter 1.0.0
Utf8StringSplitter
Utf8StringSplitter provides methods (Split and SplitAny) to split UTF-8 strings (byte sequences) based on specified separators.
This library is distributed via NuGet, supporting .NET Standard 2.0, .NET Standard 2.1, .NET 6 (.NET 7), .NET 8 and above.
PM> Install-Package Utf8StringSplitter
How to use
using System.Text;
using Utf8StringSplitter;
void Sample()
{
    // u8 suffix is a C# 11 feature
    ReadOnlySpan<byte> utf8Source = "1,2,3,4,5"u8;
    foreach (var str in Utf8Splitter.Split(utf8Source, (byte)','))
    {
        // ToArray for .NET Standard 2.0
        //Console.WriteLine($"{Encoding.UTF8.GetString(str.ToArray())}");
        Console.WriteLine($"{Encoding.UTF8.GetString(str)}");
    }
    Console.WriteLine("--------------------");
    // u8 suffix is a C# 11 feature
    ReadOnlySpan<byte> utf8Source2 = "1--2--3--4--5"u8;
    foreach (var str in Utf8Splitter.Split(utf8Source2, "--"u8))
    {
        Console.WriteLine($"{Encoding.UTF8.GetString(str)}");
    }
    Console.WriteLine("--------------------");
    // u8 suffix is a C# 11 feature
    ReadOnlySpan<byte> utf8Source3 = "1,2-3;4-5"u8;
    foreach (var str in Utf8Splitter.SplitAny(utf8Source3, "-,;"u8))
    {
        Console.WriteLine($"{Encoding.UTF8.GetString(str)}");
    }
}
// output
// 1
// 2
// 3
// 4
// 5
// --------------------
// 1
// 2
// 3
// 4
// 5
// --------------------
// 1
// 2
// 3
// 4
// 5
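The comments in the sample above mention two constraints: the u8 suffix needs C# 11, and the span overload of Encoding.UTF8.GetString is not available on .NET Standard 2.0. A minimal sketch of how a project on an older target might work around both, using only base class library calls. The class Utf8Compat and its methods are hypothetical helpers for illustration and are not part of Utf8StringSplitter.

```csharp
using System;
using System.Text;

// Hypothetical helper class, not part of Utf8StringSplitter.
public static class Utf8Compat
{
    // Encodes a string as UTF-8 bytes without relying on the C# 11 u8 suffix.
    public static byte[] ToUtf8(string text) => Encoding.UTF8.GetBytes(text);

    // Decodes a UTF-8 span by copying it to an array first, because
    // Encoding.GetString(ReadOnlySpan<byte>) is missing on .NET Standard 2.0.
    public static string FromUtf8(ReadOnlySpan<byte> utf8) =>
        Encoding.UTF8.GetString(utf8.ToArray());
}
```

The array returned by ToUtf8 can be passed as the source argument of Utf8Splitter.Split, since byte[] converts implicitly to ReadOnlySpan<byte>. The copy made in FromUtf8 costs one allocation per entry, so the span overload is preferable wherever the target framework provides it.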
Utf8Splitter
API
public static class Utf8Splitter
{
    public static SplitEnumerator Split(ReadOnlySpan<byte> source, byte separator, Utf8StringSplitOptions splitOptions = Utf8StringSplitOptions.None);
    public static SplitEnumerator Split(ReadOnlySpan<byte> source, ReadOnlySpan<byte> separator, Utf8StringSplitOptions splitOptions = Utf8StringSplitOptions.None);
    public static SplitAnyEnumerator SplitAny(ReadOnlySpan<byte> source, ReadOnlySpan<byte> separators, Utf8StringSplitOptions splitOptions = Utf8StringSplitOptions.None, Utf8StringSeparatorOptions separatorOptions = Utf8StringSeparatorOptions.Utf8);
}
Split
Utf8Splitter.Split
Splits a UTF-8 string on the separator and enumerates the pieces as ReadOnlySpan<byte>.
The separator can be a single byte or a ReadOnlySpan<byte>.
The splitOptions parameter takes a Utf8StringSplitOptions value, which is almost equivalent to StringSplitOptions.
void SampleSplit()
{
    // default
    Console.WriteLine("Utf8Splitter.Split");
    ReadOnlySpan<byte> utf8Source = "1,2,3,4,5"u8;
    foreach (var str in Utf8Splitter.Split(utf8Source, (byte)','))
    {
        Console.WriteLine($"{Encoding.UTF8.GetString(str)}");
    }
    Console.WriteLine("Utf8Splitter.Split");
    ReadOnlySpan<byte> utf8Source2 = "1---2---3---4---5"u8;
    foreach (var str in Utf8Splitter.Split(utf8Source2, "---"u8))
    {
        Console.WriteLine($"{Encoding.UTF8.GetString(str)}");
    }
    // splitOptions is TrimEntries.
    Console.WriteLine("Utf8Splitter.Split : Utf8StringSplitOptions.TrimEntries");
    ReadOnlySpan<byte> utf8Source3 = " 1 , 2 , 3 , 4 , 5 "u8;
    foreach (var str in Utf8Splitter.Split(utf8Source3, (byte)',', splitOptions: Utf8StringSplitOptions.TrimEntries))
    {
        Console.WriteLine($"{Encoding.UTF8.GetString(str)}");
    }
    // splitOptions is RemoveEmptyEntries.
    Console.WriteLine("Utf8Splitter.Split : Utf8StringSplitOptions.RemoveEmptyEntries");
    ReadOnlySpan<byte> utf8Source4 = ",1,2,,,,3,,4,,5,,"u8;
    foreach (var str in Utf8Splitter.Split(utf8Source4, (byte)',', splitOptions: Utf8StringSplitOptions.RemoveEmptyEntries))
    {
        Console.WriteLine($"{Encoding.UTF8.GetString(str)}");
    }
    // splitOptions is TrimEntries and RemoveEmptyEntries.
    Console.WriteLine("Utf8Splitter.Split : Utf8StringSplitOptions.TrimEntries and RemoveEmptyEntries");
    ReadOnlySpan<byte> utf8Source5 = " ,1, 2, ,, , 3 ,,4,, 5 ,,"u8;
    foreach (var str in Utf8Splitter.Split(utf8Source5, (byte)',',
        splitOptions: Utf8StringSplitOptions.TrimEntries | Utf8StringSplitOptions.RemoveEmptyEntries))
    {
        Console.WriteLine($"{Encoding.UTF8.GetString(str)}");
    }
}
// output
// Utf8Splitter.Split
// 1
// 2
// 3
// 4
// 5
// Utf8Splitter.Split
// 1
// 2
// 3
// 4
// 5
// Utf8Splitter.Split : Utf8StringSplitOptions.TrimEntries
// 1
// 2
// 3
// 4
// 5
// Utf8Splitter.Split : Utf8StringSplitOptions.RemoveEmptyEntries
// 1
// 2
// 3
// 4
// 5
// Utf8Splitter.Split : Utf8StringSplitOptions.TrimEntries and RemoveEmptyEntries
// 1
// 2
// 3
// 4
// 5
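Since Utf8StringSplitOptions is described as almost equivalent to StringSplitOptions, it may help to see the same input run through System.String.Split for comparison. This sketch uses only the base class library (StringSplitOptions.TrimEntries requires .NET 5 or later) and does not call Utf8StringSplitter. The class name StringSplitComparison is a hypothetical name chosen for illustration.

```csharp
using System;

// Hypothetical comparison helper built on System.String.Split,
// not on Utf8StringSplitter.
public static class StringSplitComparison
{
    // Splits on ',' while trimming whitespace from each entry and
    // dropping entries that are empty after trimming.
    public static string[] SplitTrimmedNonEmpty(string source) =>
        source.Split(',', StringSplitOptions.TrimEntries | StringSplitOptions.RemoveEmptyEntries);
}
```

Calling StringSplitComparison.SplitTrimmedNonEmpty(" ,1, 2, ,, , 3 ,,4,, 5 ,,") returns the five entries "1", "2", "3", "4" and "5", which matches the last block of output above.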
SplitAny
Utf8Splitter.SplitAny
Splits a UTF-8 string at any one of the specified separators and enumerates the pieces as ReadOnlySpan<byte>.
The first option, splitOptions, takes a Utf8StringSplitOptions value, which is almost equivalent to StringSplitOptions.
void SampleSplitAny()
{
    Console.WriteLine("Utf8Splitter.SplitAny");
    foreach (var s in Utf8Splitter.SplitAny("1;2-3,4-5"u8, ",-;"u8))
    {
        Console.WriteLine($"{Encoding.UTF8.GetString(s)}");
    }
    Console.WriteLine("Utf8Splitter.SplitAny");
    foreach (var s in Utf8Splitter.SplitAny("1😀2🙃3😋4😀5"u8, "😀🙃😋"u8))
    {
        Console.WriteLine($"{Encoding.UTF8.GetString(s)}");
    }
    // splitOptions is TrimEntries.
    Console.WriteLine("Utf8Splitter.SplitAny : Utf8StringSplitOptions.TrimEntries");
    ReadOnlySpan<byte> utf8Source3 = " 1 , 2 - 3 ; 4 , 5 "u8;
    foreach (var str in Utf8Splitter.SplitAny(utf8Source3, ",-;"u8, splitOptions: Utf8StringSplitOptions.TrimEntries))
    {
        Console.WriteLine($"{Encoding.UTF8.GetString(str)}");
    }
    // splitOptions is RemoveEmptyEntries.
    Console.WriteLine("Utf8Splitter.SplitAny : Utf8StringSplitOptions.RemoveEmptyEntries");
    ReadOnlySpan<byte> utf8Source4 = ",1,2,--,3,,4;,5;,"u8;
    foreach (var str in Utf8Splitter.SplitAny(utf8Source4, ",-;"u8, splitOptions: Utf8StringSplitOptions.RemoveEmptyEntries))
    {
        Console.WriteLine($"{Encoding.UTF8.GetString(str)}");
    }
    // splitOptions is TrimEntries and RemoveEmptyEntries.
    Console.WriteLine("Utf8Splitter.SplitAny : Utf8StringSplitOptions.TrimEntries and RemoveEmptyEntries");
    ReadOnlySpan<byte> utf8Source5 = " ,1- 2, ,- , 3 ,,4,; 5 ,-"u8;
    foreach (var str in Utf8Splitter.SplitAny(utf8Source5, ",-;"u8,
        splitOptions: Utf8StringSplitOptions.TrimEntries | Utf8StringSplitOptions.RemoveEmptyEntries))
    {
        Console.WriteLine($"{Encoding.UTF8.GetString(str)}");
    }
}
// output
// Utf8Splitter.SplitAny
// 1
// 2
// 3
// 4
// 5
// Utf8Splitter.SplitAny
// 1
// 2
// 3
// 4
// 5
// Utf8Splitter.SplitAny : Utf8StringSplitOptions.TrimEntries
// 1
// 2
// 3
// 4
// 5
// Utf8Splitter.SplitAny : Utf8StringSplitOptions.RemoveEmptyEntries
// 1
// 2
// 3
// 4
// 5
// Utf8Splitter.SplitAny : Utf8StringSplitOptions.TrimEntries and RemoveEmptyEntries
// 1
// 2
// 3
// 4
// 5
The second option, separatorOptions, takes a Utf8StringSeparatorOptions value.
Utf8StringSeparatorOptions.Utf8 treats the separators as a UTF-8 string and processes them one character at a time. Utf8StringSeparatorOptions.Bytes treats the separators as a byte array, so every byte is a separator on its own.
The default value is Utf8StringSeparatorOptions.Utf8.
void SampleSplitAny2()
{
    Console.WriteLine("Utf8Splitter.SplitAny Utf8StringSeparatorOptions.Utf8");
    foreach (var s in Utf8Splitter.SplitAny("1😀2🙃3😋4😀5"u8, "😀🙃😋"u8, separatorOptions: Utf8StringSeparatorOptions.Utf8))
    {
        for (var i = 0; i < s.Length; i++)
        {
            Console.Write($"{s[i]}");
        }
        Console.WriteLine();
    }
    Console.WriteLine("Utf8Splitter.SplitAny Utf8StringSeparatorOptions.Bytes");
    foreach (var s in Utf8Splitter.SplitAny("1😀2🙃3😋4😀5"u8, "😀🙃😋"u8, separatorOptions: Utf8StringSeparatorOptions.Bytes))
    {
        for (var i = 0; i < s.Length; i++)
        {
            Console.Write($"{s[i]}");
        }
        Console.WriteLine();
    }
}
// output
// Utf8Splitter.SplitAny Utf8StringSeparatorOptions.Utf8
// 49
// 50
// 51
// 52
// 53
// Utf8Splitter.SplitAny Utf8StringSeparatorOptions.Bytes
// 49
//
//
//
// 50
//
//
//
// 51
//
//
//
// 52
//
//
//
// 53
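The three empty lines between the digits in the Bytes output are easier to follow once the byte layout of the separators is visible. The sketch below prints the UTF-8 encoding of a string as hex using only base class library calls. The class EmojiBytes is a hypothetical helper for illustration, and the explanation after it is a reading of the output above rather than a statement about the library's internals.

```csharp
using System;
using System.Text;

// Hypothetical helper for inspecting UTF-8 byte layouts,
// not part of Utf8StringSplitter.
public static class EmojiBytes
{
    // Returns the UTF-8 bytes of the text as space-separated hex pairs.
    public static string ToHex(string text) =>
        BitConverter.ToString(Encoding.UTF8.GetBytes(text)).Replace('-', ' ');
}
```

EmojiBytes.ToHex("😀") returns "F0 9F 98 80", and the other two emoji are likewise four bytes long ("F0 9F 99 83" for 🙃 and "F0 9F 98 8B" for 😋). When each of those bytes counts as a separator, every emoji in the source contributes four consecutive separators, which leaves three empty entries between neighbouring digits. Passing Utf8StringSplitOptions.RemoveEmptyEntries should drop those empty entries if they are unwanted.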
License
MIT License.
Product | Versions |
---|---|
.NET | net5.0 was computed. net5.0-windows was computed. net6.0 is compatible. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 is compatible. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. |
.NET Core | netcoreapp2.0 was computed. netcoreapp2.1 was computed. netcoreapp2.2 was computed. netcoreapp3.0 was computed. netcoreapp3.1 was computed. |
.NET Standard | netstandard2.0 is compatible. netstandard2.1 is compatible. |
.NET Framework | net461 was computed. net462 was computed. net463 was computed. net47 was computed. net471 was computed. net472 was computed. net48 was computed. net481 was computed. |
MonoAndroid | monoandroid was computed. |
MonoMac | monomac was computed. |
MonoTouch | monotouch was computed. |
Tizen | tizen40 was computed. tizen60 was computed. |
Xamarin.iOS | xamarinios was computed. |
Xamarin.Mac | xamarinmac was computed. |
Xamarin.TVOS | xamarintvos was computed. |
Xamarin.WatchOS | xamarinwatchos was computed. |
- .NETStandard 2.0
  - System.Memory (>= 4.5.5)
- .NETStandard 2.1
  - System.Runtime.CompilerServices.Unsafe (>= 6.0.0)
- net6.0
  - No dependencies.
- net8.0
  - No dependencies.