KernelMemory.Evaluation 0.0.2

.NET CLI:
dotnet add package KernelMemory.Evaluation --version 0.0.2

Package Manager (run this in the Visual Studio Package Manager Console, since it uses the NuGet module's version of Install-Package):
NuGet\Install-Package KernelMemory.Evaluation -Version 0.0.2

PackageReference (for projects that support it, copy this XML node into the project file):
<PackageReference Include="KernelMemory.Evaluation" Version="0.0.2" />

Paket CLI:
paket add KernelMemory.Evaluation --version 0.0.2

Script & Interactive (the #r directive works in F# Interactive and Polyglot Notebooks; copy it into the interactive tool or the script source):
#r "nuget: KernelMemory.Evaluation, 0.0.2"

Cake:
// Install KernelMemory.Evaluation as a Cake Addin
#addin nuget:?package=KernelMemory.Evaluation&version=0.0.2

// Install KernelMemory.Evaluation as a Cake Tool
#tool nuget:?package=KernelMemory.Evaluation&version=0.0.2

KM Evaluation

This repository contains the code for evaluating Kernel Memory (KM). The evaluation is based on the following metrics:

  • Faithfulness: Ensuring the generated text accurately represents the source information.
  • Answer Relevancy: Assessing the pertinence of the answer in relation to the query.
  • Context Recall: Measuring the proportion of relevant context retrieved.
  • Context Precision: Evaluating the accuracy of the retrieved context.
  • Context Relevancy: Determining the relevance of the provided context to the query.
  • Context Entity Recall: Checking the retrieval of key entities within the context.
  • Answer Semantic Similarity: Comparing the semantic similarity between the generated answer and the expected answer.
  • Answer Correctness: Verifying the factual correctness of the generated answers.
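Several of these metrics reduce to simple ratios once each statement or retrieved chunk has been judged. The sketch below follows the RAGAS-style definitions as commonly described; it is not the code of this package. The class MetricSketch and its methods are hypothetical names, and the boolean verdicts (which an LLM judge would normally produce) are supplied by hand.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative only: MetricSketch is hypothetical and not part of KernelMemory.Evaluation.
public static class MetricSketch
{
    // Fraction of verdicts that are true; 0 when there is nothing to judge.
    // Fits faithfulness (answer statements supported by the context) and
    // context recall (expected-answer statements found in the retrieved context).
    public static double SupportedRatio(IReadOnlyCollection<bool> verdicts) =>
        verdicts.Count == 0 ? 0.0 : verdicts.Count(v => v) / (double)verdicts.Count;

    // Context precision as the mean of precision@k over the ranks that hold a relevant chunk.
    public static double ContextPrecision(IReadOnlyList<bool> relevantByRank)
    {
        int hits = 0;
        double sum = 0.0;
        for (int k = 0; k < relevantByRank.Count; k++)
        {
            if (!relevantByRank[k]) continue;
            hits++;
            sum += hits / (double)(k + 1); // precision at rank k + 1
        }
        return hits == 0 ? 0.0 : sum / hits;
    }

    // Cosine similarity between two embedding vectors, as used for semantic similarity.
    public static double CosineSimilarity(IReadOnlyList<double> a, IReadOnlyList<double> b)
    {
        if (a.Count != b.Count)
            throw new ArgumentException("Vectors must have the same length.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Count; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return normA == 0 || normB == 0 ? 0.0 : dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }
}
```

For example, an answer with four statements of which three are supported by the context gets MetricSketch.SupportedRatio(new[] { true, true, false, true }), that is 0.75. Ranking relevant chunks first raises ContextPrecision, which is why that metric is sensitive to retrieval order and not only to what was retrieved.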

Usage

Test set generation

To evaluate KM, you first need a test set containing queries and their expected answers. Writing one by hand is tedious for large datasets, so the package provides a generator that creates a test set from a given KM memory and index.

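using Microsoft.KernelMemory.Evaluation;

// Assumes `memoryBuilder` (the Kernel Memory builder) and `kernel` (the kernel
// used by the evaluator) are already configured elsewhere in your application.
var testSetGenerator = new TestSetGeneratorBuilder(memoryBuilder.Services)
                            .AddEvaluatorKernel(kernel)
                            .Build();

// Share of each question type in the generated test set; the weights sum to 1.
var distribution = new Distribution
{
    Simple = .5f,
    Reasoning = .16f,
    MultiContext = .17f,
    Conditioning = .17f
};

// Returns an async stream of test cases generated from the "default" index.
var testSet = testSetGenerator.GenerateTestSetsAsync(index: "default", count: 10, retryCount: 3, distribution: distribution);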

await foreach (var test in testSet)
{
    Console.WriteLine(test.Question);
}
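To get a feel for what the distribution above means for a small test set, the sketch below turns weights into whole question counts with a largest-remainder allocation. This is only one plausible reading: the README does not say whether the generator allocates counts this way or samples a type per question, and DistributionSketch.Allocate is a hypothetical helper, not part of the package.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of KernelMemory.Evaluation.
public static class DistributionSketch
{
    // Splits `count` questions across question types in proportion to their weights.
    public static Dictionary<string, int> Allocate(IReadOnlyDictionary<string, float> weights, int count)
    {
        float total = weights.Values.Sum();
        if (Math.Abs(total - 1f) > 1e-3f)
            throw new ArgumentException($"Weights must sum to 1 but sum to {total}.");

        // Start from the floor of each exact share.
        var counts = weights.ToDictionary(w => w.Key, w => (int)Math.Floor(w.Value * count));

        // Hand the leftover questions to the largest fractional remainders,
        // breaking ties by name so the result is deterministic.
        int leftover = count - counts.Values.Sum();
        var winners = weights
            .OrderByDescending(w => w.Value * count - Math.Floor(w.Value * count))
            .ThenBy(w => w.Key, StringComparer.Ordinal)
            .Select(w => w.Key)
            .Take(leftover)
            .ToList();

        foreach (var key in winners)
            counts[key]++;

        return counts;
    }
}
```

With the weights shown earlier and count set to 10, this allocation yields 5 Simple, 1 Reasoning, 2 MultiContext, and 2 Conditioning questions. The check on the sum is the useful part in practice: weights that do not add up to 1 usually indicate a typo in the distribution.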

Evaluation

To evaluate KM against a test set, build an evaluator and run it on the index:

var evaluation = new TestSetEvaluatorBuilder()
                            .AddEvaluatorKernel(kernel)
                            .WithMemory(memoryBuilder.Build())
                            .Build();

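// ToArrayAsync materializes the async stream into an array; on .NET 8 this
// extension typically comes from the System.Linq.Async package.
var results = evaluation.EvaluateTestSetAsync(index: "default", await testSet.ToArrayAsync());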

await foreach (var result in results)
{
    Console.WriteLine($"Faithfulness: {result.Metrics.Faithfulness}, ContextRecall: {result.Metrics.ContextRecall}");
}
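Per-question scores are easier to compare across runs once they are averaged. The sketch below shows one way to do that for the two metrics printed above. MetricRow is a hypothetical stand-in for the package's result type, whose exact shape and numeric types are not shown in this README, so you would copy the values into it inside the loop.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the two metrics printed above; not the package's result type.
public record MetricRow(double Faithfulness, double ContextRecall);

public static class ResultSummary
{
    // Returns a row holding the mean of each metric across all evaluated questions.
    public static MetricRow Average(IReadOnlyCollection<MetricRow> rows)
    {
        if (rows.Count == 0)
            throw new ArgumentException("At least one result is required.", nameof(rows));

        return new MetricRow(
            rows.Average(r => r.Faithfulness),
            rows.Average(r => r.ContextRecall));
    }
}
```

For instance, two results scoring (1.0, 0.5) and (0.5, 1.0) average to a MetricRow of (0.75, 0.75). A plain mean hides outliers, so it is worth also printing the individual rows when a run looks surprisingly good or bad.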

Credits

This project is an implementation of RAGAS: Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines.

License

This project is licensed under the MIT License.

Frameworks

.NET: net8.0 is compatible. Computed: net8.0-android, net8.0-browser, net8.0-ios, net8.0-maccatalyst, net8.0-macos, net8.0-tvos, net8.0-windows.


Version  Downloads  Last updated
0.0.2    0          11/24/2024
0.0.1    28         11/23/2024