Borg.Audio
1.0.0
Borg.Audio .NET API library
Transcribe audio (convert speech to text) using the Borg API for as little as $0.06 per hour, with a free dev tier. Try in browser.
Get an API key or use the string "null" for development.
Table of Contents
- Getting started
- Using the client library
- How to use chat completions with streaming
- How to use chat completions with audio
- How to transcribe audio
- Advanced scenarios
Getting started
Prerequisites
To call the Borg REST API, you will need an API key. To obtain one, visit Borg API Keys or use the free tier key "null".
The free tier will often return HTTP 402 Payment Required errors, but it is useful for testing purposes.
Install the NuGet package
Add the client library to your .NET project by installing the NuGet package via your IDE or by running the following command in the .NET CLI:
dotnet add package Borg.Audio
If you would like to try the latest preview version, remember to append the --prerelease command option.
Note that the code examples included below were written using .NET 8. The Borg.Audio .NET library is compatible with all .NET Standard 2.0 applications, but the syntax used in some of the code examples in this document may depend on newer language features.
Using the client library
using Borg.Audio;
AudioClient client = new("whisper-1", Environment.GetEnvironmentVariable("BORG_API_KEY"));
string audioFilePath = Path.Combine("Assets", "audio_houseplant_care.mp3");
AudioTranscriptionOptions options = new()
{
    ResponseFormat = AudioTranscriptionFormat.Verbose,
    TimestampGranularities = AudioTimestampGranularities.Segment,
};
AudioTranscription transcription = client.TranscribeAudio(audioFilePath, options);
Console.WriteLine("Transcription:");
Console.WriteLine($"{transcription.Text}");
Console.WriteLine();
Console.WriteLine($"Segments:");
foreach (TranscribedSegment segment in transcription.Segments)
{
    Console.WriteLine($" {segment.Text,90} : {segment.StartTime.TotalMilliseconds,5:0} - {segment.EndTime.TotalMilliseconds,5:0}");
}
While you can pass your API key directly as a string, it is highly recommended that you keep it in a secure location and instead access it via an environment variable or configuration file as shown above to avoid storing it in source control.
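As a minimal sketch of the recommendation above, the following reads the key from the BORG_API_KEY environment variable and falls back to the free tier key "null" when the variable is unset or blank. The helper name ResolveApiKey is hypothetical and is not part of the library.

```csharp
using System;

// Hypothetical helper (not part of Borg.Audio): use the configured key when present,
// otherwise fall back to the free tier key "null" described in the prerequisites.
static string ResolveApiKey(string? configuredKey) =>
    string.IsNullOrWhiteSpace(configuredKey) ? "null" : configuredKey.Trim();

string apiKey = ResolveApiKey(Environment.GetEnvironmentVariable("BORG_API_KEY"));
Console.WriteLine(apiKey == "null"
    ? "Using the free tier key."
    : "Using the key from BORG_API_KEY.");
```

The resulting string can then be passed wherever the examples in this document pass Environment.GetEnvironmentVariable("BORG_API_KEY").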
Namespace organization
The library is organized into namespaces by feature areas in the Borg REST API. Each namespace contains a corresponding client class.
Namespace | Client class | Notes |
---|---|---|
Borg.Audio | AudioClient | |
Borg.Batch | BatchClient | |
Borg.Files | BorgFileClient | |
Borg.Models | BorgModelClient | |
Borg.Responses | BorgResponseClient | |
Using the async API
Every client method that performs a synchronous API call has an asynchronous variant in the same client class. For instance, the asynchronous variant of the AudioClient's TranscribeAudio method is TranscribeAudioAsync. To rewrite the call above using the asynchronous counterpart, simply await the call to the corresponding async variant:
var transcription = await client.TranscribeAudioAsync(audioFilePath, options);
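One reason to prefer the async variants is that several calls can be in flight at once. The sketch below shows that pattern with Task.WhenAll. RunForAllAsync and FakeTranscribeAsync are hypothetical names, and the fake operation only stands in for a real call such as client.TranscribeAudioAsync(path, options) so that the snippet runs without network access.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Starts one asynchronous operation per path and awaits them all together.
// Results come back in the same order as the input paths.
static async Task<IReadOnlyList<T>> RunForAllAsync<T>(
    IEnumerable<string> paths, Func<string, Task<T>> operation)
{
    Task<T>[] pending = paths.Select(operation).ToArray();
    return await Task.WhenAll(pending);
}

// Stand-in for a real transcription call (assumption, for illustration only).
static async Task<string> FakeTranscribeAsync(string path)
{
    await Task.Delay(10);
    return $"transcript of {path}";
}

IReadOnlyList<string> results =
    await RunForAllAsync<string>(new[] { "a.mp3", "b.mp3" }, FakeTranscribeAsync);
foreach (string result in results)
{
    Console.WriteLine(result);
}
```

Whether concurrent requests are appropriate depends on the rate limits of your account, which this sketch does not model.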
Using the BorgClient class
In addition to the namespaces mentioned above, there is also the parent Borg namespace itself:
using Borg;
This namespace contains the BorgClient class, which offers certain conveniences when you need to work with multiple feature area clients. Specifically, you can use an instance of this class to create instances of the other clients and have them share the same implementation details, which might be more efficient.
You can create a BorgClient by specifying the API key that all clients will use for authentication:
BorgClient client = new(Environment.GetEnvironmentVariable("BORG_API_KEY"));
Next, to create an instance of an AudioClient, for example, you can call the BorgClient's GetAudioClient method by passing the Borg model that the AudioClient will use, just as if you were using the AudioClient constructor directly. If necessary, you can create additional clients of the same type to target different models.
AudioClient ttsClient = client.GetAudioClient("tts-1");
AudioClient whisperClient = client.GetAudioClient("whisper-1");
How to use chat completions with streaming
When you request a chat completion, the default behavior is for the server to generate it in its entirety before sending it back in a single response. Consequently, long chat completions can require waiting for several seconds before hearing back from the server. To mitigate this, the Borg REST API supports the ability to stream partial results back as they are being generated, allowing you to start processing the beginning of the completion before it is finished.
The client library offers a convenient approach to working with streaming chat completions. To request a chat completion with streaming, call the ChatClient's CompleteChatStreaming method instead of its CompleteChat method:
CollectionResult<StreamingChatCompletionUpdate> completionUpdates = client.CompleteChatStreaming("Say 'this is a test.'");
Notice that the returned value is a CollectionResult<StreamingChatCompletionUpdate> instance, which can be enumerated to process the streaming response chunks as they arrive:
Console.Write($"[ASSISTANT]: ");
foreach (StreamingChatCompletionUpdate completionUpdate in completionUpdates)
{
    if (completionUpdate.ContentUpdate.Count > 0)
    {
        Console.Write(completionUpdate.ContentUpdate[0].Text);
    }
}
Alternatively, you can do this asynchronously by calling the CompleteChatStreamingAsync method to get an AsyncCollectionResult<StreamingChatCompletionUpdate> and enumerate it using await foreach:
AsyncCollectionResult<StreamingChatCompletionUpdate> completionUpdates = client.CompleteChatStreamingAsync("Say 'this is a test.'");
Console.Write($"[ASSISTANT]: ");
await foreach (StreamingChatCompletionUpdate completionUpdate in completionUpdates)
{
    if (completionUpdate.ContentUpdate.Count > 0)
    {
        Console.Write(completionUpdate.ContentUpdate[0].Text);
    }
}
How to use chat completions with audio
Starting with the gpt-4o-audio-preview model, chat completions can process audio input and output.
This example demonstrates:
- Configuring the client with the supported gpt-4o-audio-preview model
- Supplying user audio input on a chat completion request
- Requesting model audio output from the chat completion operation
- Retrieving audio output from a ChatCompletion instance
- Using past audio output as ChatMessage conversation history
// Chat audio input and output is only supported on specific models, beginning with gpt-4o-audio-preview
ChatClient client = new("gpt-4o-audio-preview", Environment.GetEnvironmentVariable("BORG_API_KEY"));
// Input audio is provided to a request by adding an audio content part to a user message
string audioFilePath = Path.Combine("Assets", "realtime_whats_the_weather_pcm16_24khz_mono.wav");
byte[] audioFileRawBytes = File.ReadAllBytes(audioFilePath);
BinaryData audioData = BinaryData.FromBytes(audioFileRawBytes);
List<ChatMessage> messages =
[
    new UserChatMessage(ChatMessageContentPart.CreateInputAudioPart(audioData, ChatInputAudioFormat.Wav)),
];
// Output audio is requested by configuring ChatCompletionOptions to include the appropriate
// ResponseModalities values and corresponding AudioOptions.
ChatCompletionOptions options = new()
{
    ResponseModalities = ChatResponseModalities.Text | ChatResponseModalities.Audio,
    AudioOptions = new(ChatOutputAudioVoice.Alloy, ChatOutputAudioFormat.Mp3),
};
ChatCompletion completion = client.CompleteChat(messages, options);
void PrintAudioContent()
{
    if (completion.OutputAudio is ChatOutputAudio outputAudio)
    {
        Console.WriteLine($"Response audio transcript: {outputAudio.Transcript}");
        string outputFilePath = $"{outputAudio.Id}.mp3";
        using (FileStream outputFileStream = File.OpenWrite(outputFilePath))
        {
            outputFileStream.Write(outputAudio.AudioBytes);
        }
        Console.WriteLine($"Response audio written to file: {outputFilePath}");
        Console.WriteLine($"Valid on followup requests until: {outputAudio.ExpiresAt}");
    }
}
PrintAudioContent();
// To refer to past audio output, create an assistant message from the earlier ChatCompletion, use the earlier
// response content part, or use ChatMessageContentPart.CreateAudioPart(string) to manually instantiate a part.
messages.Add(new AssistantChatMessage(completion));
messages.Add("Can you say that like a pirate?");
completion = client.CompleteChat(messages, options);
PrintAudioContent();
Streaming is highly parallel: StreamingChatCompletionUpdate instances can include an OutputAudioUpdate that may contain any of:
- The Id of the streamed audio content, which can be referenced by subsequent AssistantChatMessage instances via ChatAudioReference once the streaming response is complete; this may appear across multiple StreamingChatCompletionUpdate instances but will always be the same value when present
- The ExpiresAt value that describes when the Id will no longer be valid for use with ChatAudioReference in subsequent requests; this typically appears once and only once, in the final StreamingOutputAudioUpdate
- Incremental TranscriptUpdate and/or AudioBytesUpdate values, which can be incrementally consumed and, when concatenated, form the complete audio transcript and audio output for the overall response; many of these typically appear
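The concatenation described in the last item can be sketched without the library. In the snippet below, each tuple is a hypothetical stand-in for the TranscriptUpdate and AudioBytesUpdate values carried by one streaming update, and the Accumulate helper is likewise an assumption made for illustration. Either part of an update may be absent.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Concatenates incremental transcript text and audio bytes into complete values.
// Null parts are skipped, mirroring updates that carry only text or only audio.
static (string Transcript, byte[] Audio) Accumulate(
    IEnumerable<(string? TranscriptUpdate, byte[]? AudioBytesUpdate)> updates)
{
    StringBuilder transcript = new();
    using MemoryStream audio = new();
    foreach (var (text, bytes) in updates)
    {
        if (text is not null) transcript.Append(text);
        if (bytes is not null) audio.Write(bytes, 0, bytes.Length);
    }
    return (transcript.ToString(), audio.ToArray());
}

// Simulated updates: some carry both parts, some only one.
var sample = new (string?, byte[]?)[]
{
    ("Hello", new byte[] { 1, 2 }),
    (null, new byte[] { 3 }),
    (", world", null),
};
var (fullTranscript, fullAudio) = Accumulate(sample);
Console.WriteLine($"{fullTranscript} ({fullAudio.Length} bytes)");
```

With a real streaming response, the same loop body would sit inside the foreach or await foreach over the updates.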
How to transcribe audio
In this example, an audio file is transcribed using the Whisper speech-to-text model, including both word- and audio-segment-level timestamp information.
using Borg.Audio;
AudioClient client = new("whisper-1", Environment.GetEnvironmentVariable("BORG_API_KEY"));
string audioFilePath = Path.Combine("Assets", "audio_houseplant_care.mp3");
AudioTranscriptionOptions options = new()
{
    ResponseFormat = AudioTranscriptionFormat.Verbose,
    TimestampGranularities = AudioTimestampGranularities.Word | AudioTimestampGranularities.Segment,
};
AudioTranscription transcription = client.TranscribeAudio(audioFilePath, options);
Console.WriteLine("Transcription:");
Console.WriteLine($"{transcription.Text}");
Console.WriteLine();
Console.WriteLine($"Words:");
foreach (TranscribedWord word in transcription.Words)
{
    Console.WriteLine($" {word.Word,15} : {word.StartTime.TotalMilliseconds,5:0} - {word.EndTime.TotalMilliseconds,5:0}");
}
Console.WriteLine();
Console.WriteLine($"Segments:");
foreach (TranscribedSegment segment in transcription.Segments)
{
    Console.WriteLine($" {segment.Text,90} : {segment.StartTime.TotalMilliseconds,5:0} - {segment.EndTime.TotalMilliseconds,5:0}");
}
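Segment-level timestamps are commonly used to build subtitles. The following sketch formats (text, start, end) triples as SubRip (.srt) cues. The ToSrt and Stamp helpers are hypothetical, and the sample data is made up. With a real transcription you could project each segment's Text, StartTime, and EndTime into such triples, assuming the time properties are TimeSpan values as the example above suggests.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

// Formats (text, start, end) triples as numbered SubRip cues separated by blank lines.
static string ToSrt(IEnumerable<(string Text, TimeSpan Start, TimeSpan End)> segments)
{
    StringBuilder srt = new();
    int index = 1;
    foreach (var (text, start, end) in segments)
    {
        srt.Append(index++).Append('\n');
        srt.Append(Stamp(start)).Append(" --> ").Append(Stamp(end)).Append('\n');
        srt.Append(text.Trim()).Append("\n\n");
    }
    return srt.ToString();
}

// SubRip timestamps use a comma before the milliseconds.
// Note: "hh" is the hours component, so audio of 24 hours or longer would wrap.
static string Stamp(TimeSpan t) =>
    t.ToString(@"hh\:mm\:ss\,fff", CultureInfo.InvariantCulture);

string srt = ToSrt(new[]
{
    (" Water your plants weekly.", TimeSpan.Zero, TimeSpan.FromMilliseconds(2500)),
    (" Keep them near a window.", TimeSpan.FromMilliseconds(2500), TimeSpan.FromMilliseconds(5000)),
});
Console.Write(srt);
```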
Advanced scenarios
Using protocol methods
In addition to the client methods that use strongly-typed request and response objects, the .NET library also provides protocol methods that enable more direct access to the REST API. Protocol methods are "binary in, binary out", accepting BinaryContent as request bodies and providing BinaryData as response bodies.
For example, to use the protocol method variant of the AudioClient's CompleteChat method, pass the request body as BinaryContent:
AudioClient client = new("gpt-4o", Environment.GetEnvironmentVariable("BORG_API_KEY"));
BinaryData input = BinaryData.FromBytes("""
{
"model": "gpt-4o",
"messages": [
{
"role": "user",
"content": "Say 'this is a test.'"
}
]
}
"""u8.ToArray());
using BinaryContent content = BinaryContent.Create(input);
ClientResult result = client.CompleteChat(content);
BinaryData output = result.GetRawResponse().Content;
using JsonDocument outputAsJson = JsonDocument.Parse(output.ToString());
string message = outputAsJson.RootElement
    .GetProperty("choices"u8)[0]
    .GetProperty("message"u8)
    .GetProperty("content"u8)
    .GetString();
Console.WriteLine($"[ASSISTANT]: {message}");
Notice how you can then call the resulting ClientResult's GetRawResponse method and retrieve the response body as BinaryData via the PipelineResponse's Content property.
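Because protocol methods return raw JSON, the caller is responsible for handling unexpected response shapes. The sketch below uses only System.Text.Json and a made-up sample payload. The helper name TryReadFirstMessage is hypothetical. It returns null rather than throwing when a property is missing, although malformed JSON would still raise a JsonException from JsonDocument.Parse.

```csharp
using System;
using System.Text.Json;

// Reads choices[0].message.content from a raw JSON body.
// Returns null when any part of that path is missing or has the wrong kind.
static string? TryReadFirstMessage(string json)
{
    using JsonDocument document = JsonDocument.Parse(json);
    JsonElement root = document.RootElement;
    if (root.ValueKind == JsonValueKind.Object
        && root.TryGetProperty("choices", out JsonElement choices)
        && choices.ValueKind == JsonValueKind.Array
        && choices.GetArrayLength() > 0
        && choices[0].ValueKind == JsonValueKind.Object
        && choices[0].TryGetProperty("message", out JsonElement message)
        && message.ValueKind == JsonValueKind.Object
        && message.TryGetProperty("content", out JsonElement content)
        && content.ValueKind == JsonValueKind.String)
    {
        return content.GetString();
    }
    return null;
}

// Sample payload invented for illustration; a real body would come from output.ToString().
string sampleBody = """{"choices":[{"message":{"role":"assistant","content":"this is a test."}}]}""";
Console.WriteLine(TryReadFirstMessage(sampleBody) ?? "(no message)");
Console.WriteLine(TryReadFirstMessage("{}") ?? "(no message)");
```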
Mock a client for testing
The Borg.Audio .NET library has been designed to support mocking, providing key features such as:
- Client methods made virtual to allow overriding.
- Model factories to assist in instantiating API output models that lack public constructors.
To illustrate how mocking works, suppose you want to validate the behavior of the following method using the Moq library. Given the path to an audio file, it determines whether it contains a specified secret word:
public bool ContainsSecretWord(AudioClient client, string audioFilePath, string secretWord)
{
    AudioTranscription transcription = client.TranscribeAudio(audioFilePath);
    return transcription.Text.Contains(secretWord);
}
Create mocks of AudioClient and ClientResult<AudioTranscription>, set up methods and properties that will be invoked, then test the behavior of the ContainsSecretWord method. Since the AudioTranscription class does not provide public constructors, it must be instantiated by the BorgAudioModelFactory static class:
// Instantiate mocks and the AudioTranscription object.
Mock<AudioClient> mockClient = new();
Mock<ClientResult<AudioTranscription>> mockResult = new(null, Mock.Of<PipelineResponse>());
AudioTranscription transcription = BorgAudioModelFactory.AudioTranscription(text: "I swear I saw an apple flying yesterday!");
// Set up mocks' properties and methods.
mockResult
    .SetupGet(result => result.Value)
    .Returns(transcription);
mockClient.Setup(client => client.TranscribeAudio(
        It.IsAny<string>(),
        It.IsAny<AudioTranscriptionOptions>()))
    .Returns(mockResult.Object);
// Perform validation.
AudioClient client = mockClient.Object;
bool containsSecretWord = ContainsSecretWord(client, "<audioFilePath>", "apple");
Assert.That(containsSecretWord, Is.True);
All namespaces have their corresponding model factory to support mocking with the exception of the Borg.Assistants and Borg.VectorStores namespaces, for which model factories are coming soon.
Automatically retrying errors
By default, the client classes will automatically retry the following errors up to three additional times using exponential backoff:
- 408 Request Timeout
- 429 Too Many Requests
- 500 Internal Server Error
- 502 Bad Gateway
- 503 Service Unavailable
- 504 Gateway Timeout
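To make the policy above concrete, here is a minimal sketch of retrying with exponential backoff. It is not the library's implementation: the helper names, the 100 ms base delay, and the doubling schedule without jitter are all assumptions, and the simulated endpoint replaces a real HTTP call.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Status codes that the list above names as retried by default.
static bool IsRetriable(int status) => status is 408 or 429 or 500 or 502 or 503 or 504;

// Calls `send` once, then up to `maxRetries` additional times while the status is
// retriable, doubling the delay before each new attempt.
static async Task<int> SendWithRetriesAsync(
    Func<Task<int>> send, int maxRetries = 3, int baseDelayMs = 100)
{
    int status = await send();
    for (int retry = 0; retry < maxRetries && IsRetriable(status); retry++)
    {
        await Task.Delay(baseDelayMs << retry); // base, 2x base, 4x base
        status = await send();
    }
    return status;
}

// Simulated endpoint: fails twice with 503, then succeeds.
Queue<int> responses = new(new[] { 503, 503, 200 });
int attempts = 0;
int finalStatus = await SendWithRetriesAsync(() =>
{
    attempts++;
    return Task.FromResult(responses.Dequeue());
}, baseDelayMs: 1);
Console.WriteLine($"Final status {finalStatus} after {attempts} attempts");
```

Note that 402 Payment Required, which the free tier often returns, is not in the retriable list, so such a response would be returned to the caller immediately.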
Observability
The Borg.Audio .NET library supports experimental distributed tracing and metrics with OpenTelemetry. Check out Observability with OpenTelemetry for more details.
Product | Versions (compatible and additional computed target framework versions) |
---|---|
.NET | net5.0 was computed. net5.0-windows was computed. net6.0 was computed. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 is compatible. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. |
.NET Core | netcoreapp2.0 was computed. netcoreapp2.1 was computed. netcoreapp2.2 was computed. netcoreapp3.0 was computed. netcoreapp3.1 was computed. |
.NET Standard | netstandard2.0 is compatible. netstandard2.1 was computed. |
.NET Framework | net461 was computed. net462 was computed. net463 was computed. net47 was computed. net471 was computed. net472 was computed. net48 was computed. net481 was computed. |
MonoAndroid | monoandroid was computed. |
MonoMac | monomac was computed. |
MonoTouch | monotouch was computed. |
Tizen | tizen40 was computed. tizen60 was computed. |
Xamarin.iOS | xamarinios was computed. |
Xamarin.Mac | xamarinmac was computed. |
Xamarin.TVOS | xamarintvos was computed. |
Xamarin.WatchOS | xamarinwatchos was computed. |
- .NETStandard 2.0
  - System.ClientModel (>= 1.2.1)
  - System.Diagnostics.DiagnosticSource (>= 6.0.1)
- net8.0
  - System.ClientModel (>= 1.2.1)
  - System.Diagnostics.DiagnosticSource (>= 6.0.1)
NuGet packages
This package is not used by any NuGet packages.
GitHub repositories
This package is not used by any popular GitHub repositories.
Version | Downloads | Last updated |
---|---|---|
1.0.0 | 148 | 4/1/2025 |