Record latency evaluation metric
Last updated: May 08, 2025
The record latency metric measures the time, in milliseconds (ms), that your model deployment takes to process a record.
Metric details
Record latency is a throughput and latency metric for model health monitor evaluations. It calculates latency by tracking the time, in milliseconds (ms), that your deployment takes to process each transaction record.
Scope
The record latency metric evaluates generative AI assets and machine learning models.
- Generative AI tasks:
- Text summarization
- Text classification
- Content generation
- Entity extraction
- Question answering
- Retrieval Augmented Generation (RAG)
- Machine learning problem type:
- Binary classification
- Multiclass classification
- Regression
- Supported languages: English
Evaluation process
The average, maximum, median, and minimum record latency for scoring requests and transaction records are calculated during model health monitor evaluations.
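For illustration, the following minimal Python sketch (not the monitor's internal implementation) shows how these summary statistics could be derived from a list of recorded response times in milliseconds; the sample values are hypothetical.
from statistics import mean, median

# Hypothetical response times (ms) recorded for individual transaction records
response_times_ms = [112.0, 98.5, 143.2, 87.9, 120.4]

# Summary statistics comparable to what the model health monitor reports
record_latency = {
    "average": mean(response_times_ms),
    "maximum": max(response_times_ms),
    "median": median(response_times_ms),
    "minimum": min(response_times_ms),
}
print(record_latency)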
To calculate the record latency metric, the response_time value from your scoring requests is used to track the time that your model deployment takes to process scoring requests.
For watsonx.ai Runtime deployments, the response_time value is automatically detected when you configure evaluations.
For external and custom deployments, you must specify the response_time value when you send scoring requests to calculate throughput and latency, as shown in the following example from the Python SDK:
from ibm_watson_openscale.supporting_classes.payload_record import PayloadRecord

# Store a payload record that includes the measured response_time (ms)
client.data_sets.store_records(
    data_set_id=payload_data_set_id,
    request_body=[
        PayloadRecord(
            scoring_id=<uuid>,
            request=openscale_input,
            response=openscale_output,
            response_time=<response_time>,
            user_id=<user_id>)
    ]
)
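For external and custom deployments, one way to obtain a response_time value is to time the scoring call yourself. The following sketch is illustrative only: score_deployment and openscale_input are hypothetical placeholders for your own scoring call and request payload, and the elapsed wall-clock time is converted to milliseconds before building the PayloadRecord.
import time
import uuid

from ibm_watson_openscale.supporting_classes.payload_record import PayloadRecord

# Hypothetical request payload; replace with the fields your deployment expects
openscale_input = {"fields": ["feature_1"], "values": [[0.42]]}

def score_deployment(payload):
    # Placeholder: call your external or custom deployment here
    return {"fields": ["prediction"], "values": [[1]]}

start = time.perf_counter()
openscale_output = score_deployment(openscale_input)
# Elapsed wall-clock time in milliseconds, supplied as response_time
response_time = int((time.perf_counter() - start) * 1000)

record = PayloadRecord(
    scoring_id=str(uuid.uuid4()),
    request=openscale_input,
    response=openscale_output,
    response_time=response_time,
)
The record can then be passed in the request_body list of client.data_sets.store_records, as in the preceding example.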
Parent topic: Evaluation metrics