virtex.utils.metrics
This module is a collection of metrics commonly used during pretraining and downstream evaluation. Two main classes here are:
- TopkAccuracy, used for ImageNet linear classification evaluation.
- CocoCaptionsEvaluator, used for caption evaluation (CIDEr and SPICE).
Parts of this module (tokenize(), cider() and spice()) are
adapted from coco-captions evaluation code.
- class virtex.utils.metrics.TopkAccuracy(k: int = 1)[source]
Bases: object
Top-K classification accuracy. This class can accumulate per-batch accuracy that can be retrieved at the end of evaluation. Targets and predictions are assumed to be integers (long tensors).
If used in DistributedDataParallel, results need to be aggregated across GPU processes outside this class.
- Parameters
k – k for computing Top-K accuracy.
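The accumulate-then-retrieve pattern described above can be sketched in plain Python. This is a minimal illustration, not the virtex implementation: the class name `TopkAccuracySketch`, its `update()` and `result()` methods, the helper `combine_counts()`, and the use of nested lists of per-class scores in place of tensors are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple


class TopkAccuracySketch:
    """Hypothetical stand-in that accumulates Top-K accuracy over batches."""

    def __init__(self, k: int = 1) -> None:
        self.k = k
        self.num_correct = 0
        self.num_total = 0

    def update(self, scores: Sequence[Sequence[float]], targets: Sequence[int]) -> None:
        # `scores` holds one row of per-class scores per example (assumed input).
        for row, target in zip(scores, targets):
            # Indices of the k highest-scoring classes for this example.
            topk = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[: self.k]
            self.num_correct += int(target in topk)
            self.num_total += 1

    def result(self) -> float:
        # Guard against division by zero when nothing has been accumulated.
        return self.num_correct / max(self.num_total, 1)


def combine_counts(per_process: List[Tuple[int, int]]) -> float:
    """Combine (num_correct, num_total) pairs gathered from several processes."""
    correct = sum(c for c, _ in per_process)
    total = sum(t for _, t in per_process)
    return correct / max(total, 1)


metric = TopkAccuracySketch(k=2)
metric.update([[0.1, 0.7, 0.2], [0.5, 0.3, 0.2]], [2, 2])  # first batch
metric.update([[0.0, 0.1, 0.9]], [2])                      # second batch
print(metric.result())
```

Summing the raw counts, as `combine_counts()` does, keeps the result correct when processes see different numbers of examples; averaging per-process accuracies would not. In a DistributedDataParallel run, one plausible way to sum the counts is `torch.distributed.all_reduce`, though how this is done is left to the caller.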
- class virtex.utils.metrics.CocoCaptionsEvaluator(gt_annotations_path: str)[source]
Bases: object
A helper class to evaluate caption predictions in COCO format. This uses cider() and spice(), which exactly follow the original COCO Captions evaluation protocol.
- Parameters
gt_annotations_path – Path to ground truth annotations in COCO format (typically this would be the COCO Captions val2017 split).
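For orientation, the sketch below shows the COCO caption data layout: ground truth annotations carry `image_id`, `id` and `caption` fields under an `annotations` key, and predictions are a list of `image_id`/`caption` records. The helper `group_captions()` and the sample captions are hypothetical and only illustrate how both sides can be grouped by image id; they are not part of this module.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_captions(records: List[Dict[str, Any]]) -> Dict[int, List[str]]:
    """Group COCO-style caption records into {image_id: [captions]}."""
    grouped: Dict[int, List[str]] = defaultdict(list)
    for record in records:
        grouped[record["image_id"]].append(record["caption"])
    return dict(grouped)


# Made-up sample in the shape of a COCO Captions annotation file.
ground_truth = {
    "annotations": [
        {"image_id": 42, "id": 1, "caption": "A dog runs on the beach."},
        {"image_id": 42, "id": 2, "caption": "A brown dog running near the sea."},
    ]
}

# Made-up predictions in the COCO results format: one caption per image.
predictions = [{"image_id": 42, "caption": "a dog running on a beach"}]

references = group_captions(ground_truth["annotations"])
candidates = group_captions(predictions)
print(references)
print(candidates)
```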
- virtex.utils.metrics.tokenize(image_id_to_captions: Dict[int, List[str]]) → Dict[int, List[str]][source]
Given a mapping of image id to a list of corresponding captions, tokenize the captions in place using the Penn Treebank tokenizer. This method assumes that the Stanford CoreNLP JAR file is present in the directory of this module.
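The input and output shape of this function can be illustrated without the Java dependency. The sketch below is a deliberately simplified stand-in, not the Penn Treebank tokenizer: the name `simple_tokenize()` and its rule of lowercasing and dropping punctuation with a regular expression are assumptions for this example, and real PTB tokenization handles many more cases.

```python
import re
from typing import Dict, List

# Characters treated as punctuation in this sketch (an assumption, not the PTB rules).
_PUNCT = re.compile(r"[^\w\s']")


def simple_tokenize(image_id_to_captions: Dict[int, List[str]]) -> Dict[int, List[str]]:
    """Lowercase captions and drop punctuation, modifying the mapping in place."""
    for image_id, captions in image_id_to_captions.items():
        # Reassigning an existing key while iterating is safe: the dict size is unchanged.
        image_id_to_captions[image_id] = [
            " ".join(_PUNCT.sub(" ", caption.lower()).split()) for caption in captions
        ]
    return image_id_to_captions


captions = {42: ["A dog, running!"], 7: ["Two cats sit on a sofa."]}
tokenized = simple_tokenize(captions)
print(tokenized)
```

As with the documented function, the returned mapping is the same object that was passed in, so callers that need the raw captions afterwards should copy them first.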