Unsuccessful requests evaluation metric
The unsuccessful requests metric measures the ratio of questions that are answered unsuccessfully out of the total number of questions.
Metric details
Unsuccessful requests is an answer quality metric for generative AI quality evaluations that helps measure how well models answer questions. Answer quality metrics are calculated with LLM-as-a-judge models; watsonx.governance does not calculate the unsuccessful requests metric with fine-tuned models.
Scope
The unsuccessful requests metric evaluates generative AI assets only.
- Types of AI assets: Prompt templates
- Generative AI tasks:
- Retrieval Augmented Generation (RAG)
- Question answering
- Supported languages: English
Scores and values
The unsuccessful requests metric score indicates how often models fail to provide answers to questions. Higher scores indicate that the model cannot provide answers to the questions.
- Range of values: 0.0-1.0
- Best possible score: 0.0
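Because the metric is a ratio of unsuccessfully answered questions to total questions, it can be sketched as a simple fraction. The following is an illustrative sketch only: watsonx.governance computes this metric with an LLM-as-a-judge model, and the keyword check below is a hypothetical stand-in for the judge's unsuccessful-answer decision.

```python
# Hypothetical refusal phrases that stand in for the judge's decision.
REFUSAL_PHRASES = ("i don't know", "i cannot answer", "unable to answer")

def is_unsuccessful(answer: str) -> bool:
    """Hypothetical stand-in for the judge: flag refusal-style answers."""
    text = answer.lower()
    return any(phrase in text for phrase in REFUSAL_PHRASES)

def unsuccessful_requests(answers: list[str]) -> float:
    """Ratio of unsuccessfully answered questions (0.0 best, 1.0 worst)."""
    if not answers:
        return 0.0
    return sum(is_unsuccessful(a) for a in answers) / len(answers)

answers = [
    "Paris is the capital of France.",
    "I don't know the answer to that question.",
    "The boiling point of water is 100 degrees Celsius at sea level.",
    "I cannot answer that based on the provided context.",
]
print(unsuccessful_requests(answers))  # 2 of 4 answers flagged -> 0.5
```

A score of 0.0 means every question received an answer, while 1.0 means the model declined or failed on every question.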
Settings
- Thresholds:
- Lower limit: 0
- Upper limit: 1
Parent topic: Evaluation metrics