Title: Semi-supervised Learning for Predicting Total Knee Replacement with Unsupervised Data Augmentation
Authors: Jimin Tan, Bofei Zhang, Kyunghyun Cho, Gregory Chang, and Cem M. Deniz
Published at: SPIE Medical Imaging 2020
Introduction
Osteoarthritis (OA) is a chronic degenerative joint disorder and the most common reason for total knee replacement (TKR). In this paper, we implemented a semi-supervised learning approach based on Unsupervised Data Augmentation (UDA), together with perturbations that are valid for radiographs, to improve the performance of a supervised TKR outcome prediction model.
We used radiographs from the Osteoarthritis Initiative (OAI) dataset and localized the knee joints with a model based on the ResNet-18 architecture to extract the left and right knees.
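Our semi-supervised model consists of a supervised module and a UDA module. The final loss is a weighted combination of the cross-entropy (CE) loss from the supervised module and the Kullback-Leibler divergence (KLD) loss from the UDA module.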
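To make the combined objective concrete, the following is a minimal pure-Python sketch of a weighted CE plus KLD loss. It is not the authors' implementation: the function names, the weight `lam`, and the simple per-batch averaging are illustrative assumptions, since the summary does not state the actual weights or training details.

```python
import math

def softmax(logits):
    """Convert raw logits to class probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """CE loss for one labeled example with an integer class label."""
    return -math.log(softmax(logits)[label])

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def semi_supervised_loss(labeled_logits, labels, clean_logits, aug_logits, lam=1.0):
    """Weighted sum of supervised CE and unsupervised consistency KLD.

    clean_logits / aug_logits are model outputs for unlabeled images and
    their augmented copies. `lam` is a hypothetical weight (assumption).
    In a real training loop the clean prediction is usually treated as a
    fixed target, i.e. no gradient flows through it.
    """
    ce = sum(cross_entropy(z, y) for z, y in zip(labeled_logits, labels)) / len(labels)
    kld = sum(
        kl_divergence(softmax(c), softmax(a))
        for c, a in zip(clean_logits, aug_logits)
    ) / len(clean_logits)
    return ce + lam * kld, ce, kld
```

When the model gives identical predictions for an unlabeled image and its augmented copy, the KLD term is zero and only the supervised CE term remains.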
Result
Our semi-supervised model achieved an average ROC AUC of 0.79, compared with an average of 0.74 for the supervised baseline, under 4-fold cross-validation. The overall increase in accuracy was 6.8%. A paired DeLong test between the two models showed that the difference was significant (p-value = 5.869 × 10^-5).
Ablation Study
Size of the unlabeled dataset
We examined how the size of the unlabeled dataset affects model performance. The table showed a positive correlation between the amount of unlabeled data and validation accuracy, although with diminishing returns.
Augmentation diversity in UDA
We analyzed the effect of increasing the number of augmentations applied in the UDA module. The model performed better and more consistently as the number of augmentations increased.
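One way to picture "number of augmentations" is as the number of perturbations drawn from a pool and applied in sequence to each unlabeled image. The sketch below assumes that reading. The three operations and their parameter ranges are hypothetical placeholders, not the perturbations used in the paper, and images are represented as nested lists of intensities in [0, 1] purely for illustration.

```python
import random

def _clip(value):
    """Keep a pixel intensity inside [0, 1]."""
    return min(1.0, max(0.0, value))

def shift_brightness(img, rng):
    """Add a small random offset to every pixel (hypothetical range)."""
    delta = rng.uniform(-0.1, 0.1)
    return [[_clip(px + delta) for px in row] for row in img]

def scale_contrast(img, rng):
    """Stretch or compress intensities around the image mean."""
    factor = rng.uniform(0.8, 1.2)
    mean = sum(map(sum, img)) / (len(img) * len(img[0]))
    return [[_clip(mean + factor * (px - mean)) for px in row] for row in img]

def translate_columns(img, rng):
    """Shift the image one pixel left or right, repeating the edge pixel."""
    if rng.choice([-1, 1]) == 1:
        return [[row[0]] + row[:-1] for row in img]
    return [row[1:] + [row[-1]] for row in img]

# Hypothetical pool; the actual set of valid radiograph perturbations
# is not listed in this summary.
AUGMENTATION_POOL = [shift_brightness, scale_contrast, translate_columns]

def augment(img, num_ops, rng):
    """Apply `num_ops` distinct, randomly chosen perturbations in sequence."""
    for op in rng.sample(AUGMENTATION_POOL, k=num_ops):
        img = op(img, rng)
    return img
```

Raising `num_ops` from 0 up to the pool size gives a simple knob for studying how augmentation diversity affects the consistency term.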
Multi-task Learning with consistency training
In this experiment, we used the consistency training module as an add-on task to the existing supervised learning module, with both modules using the same labeled data. The figure below showed that multi-task learning reduces overfitting on par with UDA, and that both outperformed the baseline supervised model.
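The multi-task setup described above can be sketched as a single loss in which the consistency term is computed on the labeled images themselves rather than on a separate unlabeled set. This is an illustrative sketch under stated assumptions: `model` and `augment` are hypothetical callables supplied by the caller, and the weight `lam` is not taken from the paper.

```python
import math

def softmax(logits):
    """Convert raw logits to class probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def multitask_loss(model, augment, images, labels, lam=1.0):
    """Supervised CE plus a consistency KLD on the same labeled images.

    model:   hypothetical callable mapping an image to a list of logits
    augment: hypothetical callable returning a perturbed copy of an image
    lam:     hypothetical weight on the consistency task (assumption)
    """
    ce_total, kld_total = 0.0, 0.0
    for x, y in zip(images, labels):
        p_clean = softmax(model(x))
        p_aug = softmax(model(augment(x)))
        ce_total += -math.log(p_clean[y])
        # KL(p_clean || p_aug); terms with zero probability contribute nothing
        kld_total += sum(
            p * math.log(p / q) for p, q in zip(p_clean, p_aug) if p > 0 and q > 0
        )
    n = len(labels)
    return ce_total / n + lam * kld_total / n
```

With an identity `augment` the consistency term vanishes and the loss reduces to the plain supervised baseline, which makes the comparison in this ablation easy to reproduce in a toy setting.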