Automatic recognition of surgical phases in surgical videos is a fundamental
task in surgical workflow analysis. In this report, we propose a
Transformer-based method that uses calibrated confidence scores to drive a
two-stage inference pipeline, which dynamically switches between a baseline model
and a separately trained transition model depending on the calibrated
confidence level. Our method outperforms the baseline model on the Cholec80
dataset and can be applied to a variety of action segmentation methods.
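
The confidence-based switching described above can be illustrated with a
minimal sketch. It is not the implementation used in this report: it assumes
per-frame logits from both models, uses temperature scaling as a stand-in
calibration step, and assumes that low baseline confidence routes a frame to
the transition model. The names `two_stage_predict`, `temperature`, and
`threshold`, and their default values, are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax with optional temperature scaling."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def two_stage_predict(
    baseline_logits: Sequence[Sequence[float]],
    transition_logits: Sequence[Sequence[float]],
    temperature: float = 1.5,  # assumed calibration parameter
    threshold: float = 0.8,    # assumed confidence cut-off
) -> List[int]:
    """Pick a phase index per frame, switching models on low confidence."""
    phases = []
    for base, trans in zip(baseline_logits, transition_logits):
        # Stage 1: calibrated confidence of the baseline model.
        base_probs = softmax(base, temperature)
        confidence = max(base_probs)
        if confidence >= threshold:
            # Confident frame: keep the baseline prediction.
            phases.append(base_probs.index(confidence))
        else:
            # Stage 2 (assumed direction): defer to the transition model.
            trans_probs = softmax(trans)
            phases.append(trans_probs.index(max(trans_probs)))
    return phases
```

In this sketch the threshold is applied to the calibrated probability rather
than the raw one, so the switching rule depends on how well the calibration
step matches the baseline model's actual accuracy.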