Micro F1 score evaluation metric
Last updated: May 08, 2025
The micro F1 score is the harmonic mean of precision and recall, computed from true positive, false positive, and false negative counts that are pooled across all classes.
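Expressed as formulas, where TP_c, FP_c, and FN_c denote the true positives, false positives, and false negatives for class c (a standard statement of micro averaging, included here for reference):

```latex
P_{\text{micro}} = \frac{\sum_{c} TP_c}{\sum_{c} (TP_c + FP_c)}, \qquad
R_{\text{micro}} = \frac{\sum_{c} TP_c}{\sum_{c} (TP_c + FN_c)}

F1_{\text{micro}} = \frac{2 \cdot P_{\text{micro}} \cdot R_{\text{micro}}}{P_{\text{micro}} + R_{\text{micro}}}
```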
Metric details
The micro F1 score is a multi-label/multi-class metric for generative AI quality evaluations that measures how well generative AI assets perform entity extraction tasks when predictions can span multiple labels or classes.
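As an illustration of how micro averaging scores extraction predictions, the following sketch uses scikit-learn rather than the product's own implementation, and the entity labels are hypothetical:

```python
# Minimal sketch of micro F1 for entity extraction labels,
# computed with scikit-learn; not the product's implementation.
from sklearn.metrics import f1_score

# Hypothetical gold and predicted entity types for five extracted spans
y_true = ["PERSON", "ORG", "ORG", "DATE", "LOCATION"]
y_pred = ["PERSON", "ORG", "DATE", "DATE", "ORG"]

# average="micro" pools true positives, false positives, and false
# negatives across all classes before computing precision, recall, and F1
score = f1_score(y_true, y_pred, average="micro")
print(f"Micro F1: {score:.2f}")  # 3 of 5 predictions match -> 0.60
```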
Scope
The micro F1 score metric evaluates generative AI assets only.
- Types of AI assets: Prompt templates
- Generative AI tasks: Entity extraction
- Supported languages: English
Scores and values
The micro F1 score metric is the harmonic mean of precision and recall over the pooled counts. Higher scores indicate that predictions are more accurate.
- Range of values: 0.0-1.0
- Best possible score: 1.0
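As a worked example with assumed counts (not values from an actual evaluation): suppose predictions across all classes yield 8 true positives, 2 false positives, and 3 false negatives.

```latex
P_{\text{micro}} = \frac{8}{8 + 2} = 0.8, \qquad
R_{\text{micro}} = \frac{8}{8 + 3} \approx 0.727

F1_{\text{micro}} = \frac{2 \cdot 0.8 \cdot 0.727}{0.8 + 0.727} \approx 0.762
```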
Settings
- Thresholds:
- Lower limit: 0.8
- Upper limit: 1.0
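An evaluated score is typically compared against the lower limit to flag degraded extraction quality. The following is a minimal sketch of such a check mirroring the default settings above; the function and variable names are hypothetical, not a product API:

```python
# Hypothetical threshold check using the default limits listed above
LOWER_LIMIT = 0.8
UPPER_LIMIT = 1.0

def check_micro_f1(score: float) -> str:
    """Return a status for a micro F1 score against the default thresholds."""
    if score < LOWER_LIMIT:
        return "violation"  # below the lower limit of 0.8
    return "ok"             # within [0.8, 1.0]

print(check_micro_f1(0.85))  # ok
print(check_micro_f1(0.60))  # violation
```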
Parent topic: Evaluation metrics