Bias in Machine Learning Methods for Healthcare Informatics: Challenges of Equity and Inclusivity
A range of machine learning methods have been applied to diagnostic problems related to infection with the SARS-CoV-2 virus. Reviews of the literature show that preliminary results were promising for classification and prediction using deep learning and other strategies. Several studies trained models on large image datasets to diagnose radiologically observed complications of COVID-19 infection, and others correlated these findings with relevant natural-language clinical descriptions. However, systematic reviews to date reveal that the evaluation and validation in these studies were frequently inadequate to support the ethical application of these techniques in realistic clinical settings. Significant bias arose from poorly selected training datasets and from inappropriate algorithmic decision and thresholding choices. This raises significant challenges for ensuring unbiased, well-balanced inclusion of cases when training machine learning models, especially for at-risk subpopulations, for whom care is required to ensure equity in treatment and in the allocation of resources for pandemic management. Consequently, reports of the success of machine learning methods for COVID-19-related diagnosis appear to have been premature.