KokoroSharp 0.5.1
dotnet add package KokoroSharp --version 0.5.1
NuGet\Install-Package KokoroSharp -Version 0.5.1
<PackageReference Include="KokoroSharp" Version="0.5.1" />
paket add KokoroSharp --version 0.5.1
#r "nuget: KokoroSharp, 0.5.1"
// Install KokoroSharp as a Cake Addin
#addin nuget:?package=KokoroSharp&version=0.5.1
// Install KokoroSharp as a Cake Tool
#tool nuget:?package=KokoroSharp&version=0.5.1
https://github.com/user-attachments/assets/869e13c1-675e-4ff8-b89a-215bb802ca39
KokoroSharp
KokoroSharp is a fully-featured inference engine for Kokoro TTS, built entirely in C# on top of ONNX Runtime. It enables developers to perform flexible and fast text-to-speech synthesis with multiple speakers and languages.
Features
- Plug & Play integration via the NuGet package. All dependencies are handled automatically.
- The NuGet package includes ALL voices released by hexgrad with their Kokoro 82M v1.0 release.
- High-level interface designed to suit both beginners and power users.
- Text-segment streaming for seamless text-to-speech. Responses feel instant.
- Voice mixing with no restrictions on the number of voices mixed, and the ability to save/load mixed voices.
- Linear job scheduling with a background worker as the dispatcher.
- Optional multi-platform playback support with pre-integrated audio queue handling.
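To picture why text-segment streaming makes responses feel instant, here is a minimal sketch that cuts input text into sentence-sized segments, so the first one could be synthesized while the rest are still waiting. The `SplitIntoSegments` helper and its punctuation rule are assumptions made for illustration; KokoroSharp's actual segmentation strategy is not described here and may differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical splitter (not a KokoroSharp API): cuts text after '.', '!' or '?'.
static List<string> SplitIntoSegments(string text)
{
    var result = new List<string>();
    int start = 0;
    for (int i = 0; i < text.Length; i++)
    {
        if (text[i] is '.' or '!' or '?')
        {
            string piece = text.Substring(start, i - start + 1).Trim();
            if (piece.Length > 0) result.Add(piece);
            start = i + 1;
        }
    }
    string tail = text.Substring(start).Trim(); // trailing text without end punctuation
    if (tail.Length > 0) result.Add(tail);
    return result;
}

List<string> segments = SplitIntoSegments("Hi there! This is a longer sentence. And a tail");
foreach (string segment in segments) Console.WriteLine(segment);
```

A short first segment keeps the time to first audio low, since synthesis cost grows with segment length.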
Supports languages/accents: [American English, British English, Spanish, French, Italian, Brazilian/Portuguese].
With a custom phonemization solution, these additional languages are also supported: [Mandarin Chinese, Japanese, Hindi].
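The voice mixing listed among the features can be thought of as a weighted blend of voice embeddings. The sketch below assumes a voice is a flat `float[]` style vector and that mixing is a normalized weighted average. `MixVoices` is a hypothetical helper written for this illustration, not part of KokoroSharp's API, and the library's own mixing may work differently.

```csharp
using System;
using System.Linq;

// Hypothetical helper (not a KokoroSharp API): blends equally sized style vectors by weight.
static float[] MixVoices((float[] Style, float Weight)[] voices)
{
    if (voices.Length == 0) throw new ArgumentException("At least one voice is required.");
    int length = voices[0].Style.Length;
    if (voices.Any(v => v.Style.Length != length))
        throw new ArgumentException("All style vectors must have the same length.");

    float totalWeight = voices.Sum(v => v.Weight);
    if (totalWeight <= 0f) throw new ArgumentException("Weights must sum to a positive value.");

    var mixed = new float[length];
    foreach (var (style, weight) in voices)
    {
        float share = weight / totalWeight; // normalize so all shares sum to 1
        for (int i = 0; i < length; i++) mixed[i] += style[i] * share;
    }
    return mixed;
}

// Toy vectors standing in for real voice embeddings.
float[] voiceA = { 1f, 0f, 2f, 4f };
float[] voiceB = { 0f, 1f, 2f, 0f };
float[] blend = MixVoices(new[] { (voiceA, 3f), (voiceB, 1f) }); // 75% A, 25% B
Console.WriteLine(string.Join(" ", blend));
```

Because the weights are normalized, any number of voices can be passed in without the result drifting out of the range spanned by the inputs.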
How to set up
- On Windows, Linux, and macOS: Install via NuGet (Package Manager or CLI), and you're set!
- On other platforms: Developers are expected to provide their own phonemization solution. The built-in tokenizer supports raw (phonemes -> tokens) conversion.
The package is accessible on all .NET platforms, yet integrated phonemization is currently only available with the eSpeak NG backend.
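For readers supplying their own phonemization, the raw (phonemes -> tokens) step amounts to a vocabulary lookup. The sketch below is only an illustration: the vocabulary, its IDs, and the `Tokenize` helper are made up for this example and do not reflect Kokoro's real vocabulary or KokoroSharp's tokenizer API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical vocabulary: the real Kokoro phoneme-to-ID table is not shown in this document.
var vocab = new Dictionary<char, int>
{
    ['h'] = 1, ['ə'] = 2, ['l'] = 3, ['o'] = 4, ['ʊ'] = 5, [' '] = 6,
};

// Maps each phoneme character to its token ID; characters missing from the table are skipped.
static int[] Tokenize(string phonemes, IReadOnlyDictionary<char, int> table)
{
    var tokens = new List<int>(phonemes.Length);
    foreach (char phoneme in phonemes)
    {
        if (table.TryGetValue(phoneme, out int id)) tokens.Add(id);
    }
    return tokens.ToArray();
}

int[] tokenIds = Tokenize("həloʊ", vocab);
Console.WriteLine(string.Join(" ", tokenIds)); // prints "1 2 3 4 5"
```

Silently skipping unknown characters is one possible policy; a stricter tokenizer might log or reject them instead.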
Getting started
KokoroTTS tts = KokoroTTS.LoadModel(); // Load or download the model (~320MB for full precision)
KokoroVoice heartVoice = KokoroVoiceManager.GetVoice("af_heart"); // Grab a voice of your liking,
while (true) { tts.SpeakFast(Console.ReadLine(), heartVoice); } // .. and have it speak your text!
// Note: Language detection is automated based on what the loaded voice supports.
Above is a simple way to get started on the highest level. For more control, check out the example Program, which covers more advanced parts like job scheduling, voice mixing, and long-term, speaker-agnostic playback queuing.
Models can be found on taylorchu's releases, and can be loaded via KokoroTTS.LoadModel("path/to/model"), or downloaded automatically with KokoroTTS.LoadModel(). Check out the various overloads of KokoroTTS.LoadModel for background loading.
Notes
KokoroSharp prioritizes a smooth developer experience by logging potential misuse instead of throwing exceptions. Wherever possible, the library attempts to automatically resolve issues to minimize disruptions.
All communication with the AI model and playback devices happens on background threads, letting the main thread focus on rendering the UI in peace. The library is carefully designed with thread-safety in mind.
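The general pattern of a background worker dispatching jobs one after another can be sketched with standard .NET types. This is not KokoroSharp's implementation: the "jobs" here are plain delegates, and the `spoken` list merely stands in for synthesis and playback.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-ins: a job is just a delegate, not a KokoroSharp type.
var jobs = new BlockingCollection<Action>();
var spoken = new List<string>(); // only touched by the worker until it finishes

// A single background worker drains the queue, so jobs run one at a time in FIFO order.
Task worker = Task.Run(() =>
{
    foreach (Action job in jobs.GetConsumingEnumerable()) job();
});

// The calling thread only enqueues work; it never runs the jobs itself.
foreach (string segment in new[] { "Hello there.", "How are you?", "Goodbye." })
{
    jobs.Add(() => spoken.Add(segment)); // placeholder for "synthesize and play this segment"
}

jobs.CompleteAdding(); // signal that no more jobs will arrive
worker.Wait();         // in a UI app you would keep rendering instead of waiting here
Console.WriteLine(string.Join(" | ", spoken)); // prints "Hello there. | How are you? | Goodbye."
```

Using one consumer keeps ordering simple, which suits speech output where segments must play in sequence.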
The voices folder is automatically copied to your build path when you build, and is ready to be accessed. The same applies to the eSpeak backends mentioned above. Developers may opt to remove them when shipping their apps.
Mind that LoadVoicesFromPath exists as an option, in case developers want to implement custom voice-loading logic when shipping a project that uses KokoroSharp for text-to-speech synthesis.
In addition, the built-in tokenization (text -> tokens) is NOT mandatory, and can be bypassed on platforms like Android/iOS, provided developers supply pre-phonemized input from their phonemization solution of choice.
License
- This project is licensed under the MIT License.
- The Kokoro 82M model and its voices are released under the Apache License.
- eSpeak NG is licensed under the GPLv3 License.
Product | Versions Compatible and additional computed target framework versions. |
---|---|
.NET | net8.0 is compatible. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. |
- net8.0
  - Microsoft.ML.OnnxRuntime (>= 1.20.1)
  - NAudio (>= 2.2.1)
  - NumSharp (>= 0.30.0)
  - OpenTK.Audio.OpenAL (>= 5.0.0-pre.13)
  - System.Numerics.Tensors (>= 9.0.1)
Version | Downloads | Last updated |
---|---|---|
0.5.1 | 0 | 2/7/2025 |