Accurate preoperative glioma grading is of the utmost importance for treatment planning and prognosis prediction. However, previous grading approaches based on magnetic resonance imaging (MRI) were not sufficiently effective.

Given the remarkable performance of convolutional neural networks (CNNs) in the medical domain, we hypothesized that a deep learning algorithm could achieve high accuracy in distinguishing World Health Organization (WHO) low-grade from high-grade gliomas.

One hundred and thirteen glioma patients were retrospectively included. Tumor images were segmented with a rectangular region of interest (ROI), which contained about 80% of the tumor.

Then, 20% of the data were randomly selected and held out at the patient level as a test dataset. AlexNet and GoogLeNet were each trained from scratch and also fine-tuned on magnetic resonance images from models pre-trained on the large-scale natural image database ImageNet.
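A patient-level hold-out of this kind can be sketched as follows. This is a minimal illustration, not the study's actual code: the function name `patient_level_holdout`, the patient identifiers, the fixed seed, and the three slices per patient are all assumptions made for the example. Only the 113 patients and the 20% fraction come from the text.

```python
import random
from collections import defaultdict


def patient_level_holdout(slice_patient_ids, test_fraction=0.2, seed=0):
    """Split slice indices into train/test so that no patient appears in both.

    slice_patient_ids[i] is the patient ID of the i-th image slice.
    """
    # Group slice indices by the patient they belong to.
    by_patient = defaultdict(list)
    for idx, pid in enumerate(slice_patient_ids):
        by_patient[pid].append(idx)

    # Shuffle patients (not slices) so that the split is made per patient.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)

    n_test = max(1, round(test_fraction * len(patients)))
    test_patients = set(patients[:n_test])

    train_idx = [i for p in patients[n_test:] for i in by_patient[p]]
    test_idx = [i for p in patients[:n_test] for i in by_patient[p]]
    return train_idx, test_idx, test_patients


# Hypothetical input: 113 patients with 3 slices each (slice count is assumed).
slice_patient_ids = [f"P{p:03d}" for p in range(113) for _ in range(3)]
train_idx, test_idx, test_patients = patient_level_holdout(slice_patient_ids)
```

Splitting by patient rather than by slice prevents slices from the same tumor from appearing in both the training and the test data, which would otherwise inflate the test metrics.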

The classification task was evaluated with five-fold cross-validation (CV) on the patient-level split. For GoogLeNet trained from scratch, the validation accuracy, test accuracy, and test area under the curve (AUC), averaged over the five CV folds, were 0.867, 0.909, and 0.939, respectively.
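To make the evaluation protocol concrete, the sketch below shows one way to build patient-level folds and to compute an AUC. It is an illustration under assumptions, not the study's implementation: the helper names `patient_level_folds` and `roc_auc`, the 90 hypothetical patient IDs, and the toy labels and scores are invented for the example. The AUC is computed from its rank interpretation, that is, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
import random


def patient_level_folds(patient_ids, k=5, seed=0):
    """Partition unique patient IDs into k disjoint folds."""
    patients = sorted(set(patient_ids))
    random.Random(seed).shuffle(patients)
    # Deal patients round-robin so that fold sizes differ by at most one.
    return [patients[i::k] for i in range(k)]


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical patient list for the cross-validation portion of the data.
cv_patients = [f"P{p:03d}" for p in range(90)]
folds = patient_level_folds(cv_patients, k=5)

# In each round, one fold is used for validation and the other four for training.
for i, val_patients in enumerate(folds):
    train_patients = [p for j, f in enumerate(folds) if j != i for p in f]
    assert not set(train_patients) & set(val_patients)

# Toy example: 1 = high-grade, 0 = low-grade (labels and scores are made up).
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc)  # 0.75
```

Averaging a metric over the five rounds, as reported above, then amounts to taking the mean of the per-fold values.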

With transfer learning and fine-tuning, performance improved for both AlexNet and GoogLeNet, especially for AlexNet. Meanwhile, GoogLeNet performed better than AlexNet whether trained from scratch or fine-tuned from the pre-trained model.

Traditional machine learning

In conclusion, we demonstrated that applying CNNs, especially when trained with transfer learning and fine-tuning, to preoperative glioma grading improves performance compared with both traditional machine learning methods based on hand-crafted features and CNNs trained from scratch.

There are several possible improvements to this study. First and foremost, cohort size is a limiting factor in training deep CNNs. Although we partially overcame this with data augmentation and transfer learning, a larger patient population would further improve performance.

Second, because the patients were retrospectively enrolled from January 2015 to May 2016, the pathology data were not up to date with the 2016 WHO classification of gliomas. IDH status (mutated vs. wildtype) should be included alongside the histopathological grade in future studies.

Third, the use of multi-modal and multi-view images, which would provide more comprehensive information about the tumor, may improve the generalizability of the model. Fourth, an automatic tumor segmentation model applied before automatic glioma grading would be necessary to further increase precision.