Head-to-head ensemble comparisons across 1,000 experiments. The number displayed is the percentage of experiments where the x-axis ensemble outperformed the y-axis ensemble. M1 represents a single model. A and T refer to all model combinations and top-N ensembles, respectively, and the appended number refers to the number of models in the ensemble. Under each x-axis ensemble is the percentage of experiments where that ensemble achieved the lowest mean absolute deviation.

Ensembling improves machine learning model performance

Ensembles created using models submitted to the RSNA Pediatric Bone Age Machine Learning Challenge convincingly outperformed single-model prediction of bone age, according to a study.

Ensemble learning is a method in machine learning in which different models designed to accomplish the same task are combined into a single model. Model heterogeneity is an important aspect of ensemble learning. Ensembles tend to perform best when each individual model performs well in its own right and the correlation among the models' predictions is relatively low.
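As a rough illustration of the idea, the sketch below combines two models by averaging their predictions and scores the result with the mean absolute deviation. The article does not say how the study combined predictions, so simple averaging is an assumption here, and the bone ages are made-up numbers chosen so that the two models err in opposite directions.

```python
from statistics import mean


def mean_absolute_deviation(predictions, targets):
    """Average absolute difference between predictions and reference values."""
    return mean(abs(p - t) for p, t in zip(predictions, targets))


def average_ensemble(model_predictions):
    """Combine models by averaging their predictions case by case (assumed rule)."""
    return [mean(case) for case in zip(*model_predictions)]


# Hypothetical bone ages in months for five cases (illustrative values only).
reference = [60.0, 96.0, 120.0, 150.0, 180.0]
model_a = [64.0, 92.0, 125.0, 146.0, 186.0]  # tends to err in one direction
model_b = [57.0, 99.0, 116.0, 155.0, 176.0]  # errs the opposite way on each case

ensemble = average_ensemble([model_a, model_b])

mad_a = mean_absolute_deviation(model_a, reference)
mad_b = mean_absolute_deviation(model_b, reference)
mad_ensemble = mean_absolute_deviation(ensemble, reference)
print(mad_a, mad_b, mad_ensemble)
```

Because the two toy models make opposite errors, averaging cancels most of them. Real models are rarely this complementary, so the gain in practice would be smaller.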

Because ensembles benefit from low correlation between model predictions, the greater the underlying differences in approach, the greater the improvement, as long as they achieve similar performance. In this respect, a competition, in which participants are encouraged to submit their best models, provides an ideal setting from which to ensemble high-performing models that use different techniques. “Competitions provide a unique opportunity to study the effects of combining predictions from heterogenous models,” said study author Ian Pan, a medical student at The Warren Alpert Medical School of Brown University in Providence, R.I.
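The role of correlation can be made concrete with a standard textbook identity, which is not taken from the study. Assume N models whose errors e_i are unbiased, share the same variance σ², and have the same pairwise correlation ρ. The variance of the averaged error is then:

```latex
\operatorname{Var}\!\left(\frac{1}{N}\sum_{i=1}^{N} e_i\right)
  = \frac{1}{N^2}\left[N\sigma^2 + N(N-1)\,\rho\,\sigma^2\right]
  = \rho\,\sigma^2 + \frac{(1-\rho)\,\sigma^2}{N}
```

As N grows, the second term shrinks toward zero and the variance approaches ρσ², so under these assumptions a lower correlation means a lower floor on the ensemble's error. The identity describes error variance rather than mean absolute deviation, and it concerns correlation between errors, whereas the study reports correlation between model predictions. It should therefore be read as intuition, not as the study's analysis.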

To investigate improvements in performance for automatic bone age estimation that can be gained through model ensembling, Pan and colleagues used 48 submissions from the 2017 RSNA Pediatric Bone Age Machine Learning Challenge.

To develop their models, participants were provided with 12,611 pediatric hand X-rays whose bone ages had been determined by a pediatric radiologist. The final results were determined using a test set of 200 X-rays labeled with the weighted average of six ratings. For ensembles of up to 10 models, the researchers evaluated the mean pairwise model correlation and the performance of all possible model combinations, using the mean absolute deviation (MAD) as the metric. To estimate the true generalization MAD, they conducted a bootstrap analysis using the 200 test X-rays.
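The evaluation described above can be sketched roughly as follows. This is not the authors' code. It assumes predictions are combined by simple averaging, that correlation means the Pearson correlation between prediction vectors, and that the bootstrap resamples test cases with replacement. The data are synthetic, and the sizes are kept small (6 models, ensembles of up to 3) because enumerating every combination of up to 10 out of 48 models is far more expensive.

```python
import random
from itertools import combinations
from statistics import mean


def mad(pred, truth):
    """Mean absolute deviation between predictions and reference values."""
    return mean(abs(p - t) for p, t in zip(pred, truth))


def pearson(x, y):
    """Pearson correlation between two equally long prediction vectors."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def ensemble(models):
    """Average the member predictions case by case (assumed combination rule)."""
    return [mean(case) for case in zip(*models)]


def mean_pairwise_correlation(models):
    """Average Pearson correlation over all pairs of member models."""
    return mean(pearson(a, b) for a, b in combinations(models, 2))


def search_ensembles(models, truth, max_size):
    """Score every combination of 1..max_size models, best (lowest MAD) first."""
    results = []
    for size in range(1, max_size + 1):
        for idx in combinations(range(len(models)), size):
            members = [models[i] for i in idx]
            corr = mean_pairwise_correlation(members) if size > 1 else None
            results.append((mad(ensemble(members), truth), idx, corr))
    return sorted(results, key=lambda r: r[0])


def bootstrap_mad(pred, truth, n_resamples=1000, seed=0):
    """Average MAD over test sets resampled with replacement (assumed scheme)."""
    rng = random.Random(seed)
    n = len(truth)
    scores = []
    for _ in range(n_resamples):
        sample = [rng.randrange(n) for _ in range(n)]
        scores.append(mean(abs(pred[i] - truth[i]) for i in sample))
    return mean(scores)


# Synthetic stand-in data: 200 test cases and 6 noisy models (not challenge data).
rng = random.Random(42)
truth = [rng.uniform(12, 216) for _ in range(200)]
models = [[t + rng.gauss(0, 6) for t in truth] for _ in range(6)]

ranked = search_ensembles(models, truth, max_size=3)
best_mad, best_members, best_corr = ranked[0]
generalization_mad = bootstrap_mad(
    ensemble([models[i] for i in best_members]), truth
)
print(best_members, best_mad, best_corr, generalization_mad)
```

Picking the best combination on the same 200 cases used to score it gives an optimistic number, which is one reason a resampling step is useful for estimating how the ensemble would perform on new data.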

The estimated generalization MAD of a single model was 4.55 months. The best performing ensemble consisted of four models with a MAD of 3.79 months. The mean pairwise correlation of models within this ensemble was 0.47. In comparison, the lowest achievable MAD by combining the highest-ranking models based on individual scores was 3.93 months using eight models with a mean pairwise model correlation of 0.67. “Our results call attention to a concept that has substantial practical implications, as computer vision and other machine learning algorithms begin to move from research to the clinical environment,” Pan said. “Namely, that the best results are likely to be achieved by combining multiple accurate and diverse models rather than from single models alone.”
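The second strategy in that comparison, combining the highest-ranking models by individual score, can be sketched as follows. The helper name `top_n_scores`, the averaging rule, and the synthetic data are all assumptions made for illustration and are not details from the study.

```python
import random
from statistics import mean


def mad(pred, truth):
    """Mean absolute deviation between predictions and reference values."""
    return mean(abs(p - t) for p, t in zip(pred, truth))


def top_n_scores(models, truth, max_n):
    """MAD of the averaged top-N models, ranked by individual MAD, for N = 1..max_n."""
    ranked = sorted(models, key=lambda m: mad(m, truth))
    scores = {}
    for n in range(1, max_n + 1):
        combined = [mean(case) for case in zip(*ranked[:n])]
        scores[n] = mad(combined, truth)
    return scores


# Synthetic stand-in data: six models with increasing noise (not challenge data).
rng = random.Random(7)
truth = [rng.uniform(12, 216) for _ in range(200)]
noise_levels = [5.0, 5.5, 6.0, 7.0, 9.0, 12.0]
models = [[t + rng.gauss(0, s) for t in truth] for s in noise_levels]

scores = top_n_scores(models, truth, max_n=len(models))
best_n = min(scores, key=scores.get)
print(best_n, scores[best_n])
```

A top-N ensemble considers only one candidate per size, so it is much cheaper than searching all combinations, but it ignores how correlated the selected models are with one another.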

Thus, practitioners aiming to incorporate machine learning algorithms into their workflow would benefit from having predictions obtained from different models, similar to how the accuracy of a radiological interpretation can be bolstered with multiple readers.

Pan added that the findings also highlight the importance of open competitions like the 2017 RSNA Pediatric Bone Age Machine Learning Challenge, as they provide a standardized use case, a common training set, and an objective assessment method applied equally to all models. “Machine learning competitions within radiology should be encouraged to spur development of heterogeneous models whose predictions can be combined to achieve optimal performance,” he said.

For the 2019 RSNA Intracranial Hemorrhage Detection and Classification Challenge, researchers worked to develop algorithms that can identify and classify subtypes of hemorrhages on head CT scans. The dataset, which comprises more than 25,000 head CT scans contributed by several research institutions, is the first multiplanar dataset used in an RSNA artificial intelligence challenge.
