Keyword inclusion evaluation metric
Last updated: May 08, 2025
The keyword inclusion metric measures the similarity of nouns and pronouns between the foundation model output and the reference or ground truth.
Metric details
Keyword inclusion is a metric that measures how well your model generates text that includes the key phrases or keywords from the reference or ground truth. The metric is available only when you use the Python SDK to calculate evaluation metrics. For more information, see Computing Adversarial robustness and Prompt Leakage Risk using IBM watsonx.governance.
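The SDK computes the score internally, but the following minimal sketch illustrates the idea: extract the nouns and pronouns from both texts and report the proportion of reference keywords that appear in the generated output. NLTK is an assumed tool choice here, and the function names are illustrative; this is not the watsonx.governance SDK API.

```python
import nltk

# Tokenizer and tagger models; both older and newer NLTK resource names
# are requested so the sketch works across NLTK versions.
for resource in ("punkt", "punkt_tab", "averaged_perceptron_tagger",
                 "averaged_perceptron_tagger_eng"):
    nltk.download(resource, quiet=True)

def extract_keywords(text: str) -> set[str]:
    """Return the lowercased nouns and pronouns found in the text."""
    tagged = nltk.pos_tag(nltk.word_tokenize(text))
    # NN* tags cover nouns; PRP* tags cover pronouns.
    return {token.lower() for token, tag in tagged
            if tag.startswith(("NN", "PRP"))}

def keyword_inclusion(output: str, reference: str) -> float:
    """Proportion of reference keywords that appear in the model output."""
    ref_keywords = extract_keywords(reference)
    if not ref_keywords:
        return 0.0
    return len(ref_keywords & extract_keywords(output)) / len(ref_keywords)

reference = "The report summarizes quarterly revenue for the sales team."
output = "It covers the revenue that the sales team earned this quarter."
print(keyword_inclusion(output, reference))  # e.g. 0.75 with this tagger
```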
Scope
The keyword inclusion metric evaluates generative AI assets only.
- Types of AI assets: Prompt templates
- Generative AI tasks:
  - Text summarization
  - Question answering
  - Retrieval augmented generation (RAG)
- Supported languages: English
Scores and values
The keyword inclusion metric score indicates the proportion of keywords from the reference or ground truth that also appear in the generated output.
- Range of values: 0.0-1.0
- Best possible score: 1.0
- Ratios:
  - At 0: The output contains none of the keywords from the reference or ground truth.
  - Over 0: A higher score indicates that more of the reference keywords appear in the output.
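For example, under the proportion interpretation above, if the reference contains four keywords and three of them appear in the generated output, the score is 3/4 = 0.75. A minimal sketch of that arithmetic, using hypothetical keyword sets:

```python
# Hypothetical keyword sets that illustrate the score as a proportion.
reference_keywords = {"report", "revenue", "sales", "team"}
output_keywords = {"revenue", "sales", "team", "quarter"}

score = len(reference_keywords & output_keywords) / len(reference_keywords)
print(score)  # 0.75: three of the four reference keywords appear in the output
```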
Parent topic: Evaluation metrics