Abstract

https://doi.org/10.58984/smbic250101187d

In sports science, the use of machine learning and artificial intelligence techniques has increased significantly in recent years, especially in the domains of performance evaluation, training process optimization, and sports injury prediction. However, a substantial number of current studies suffer from methodological issues such as weak validation procedures, the risk of information contamination (data leakage), a lack of reporting transparency, and limited generalizability of findings. Through a narrative review of peer-reviewed scientific literature indexed in the Web of Science and Scopus databases, this study aims to identify major sources of methodological bias and to analyze prevailing methodological practices in the application of machine learning in sport. Because sports data have a distinctive temporal and hierarchical structure, time-aware data splitting and grouped validation procedures are required to obtain a meaningful evaluation of model performance. The study also critically examines the use of modern reporting and quality-assessment frameworks, such as TRIPOD+AI and PROBAST+AI, as well as the contribution of interpretable models and explainable AI techniques to improving the practical applicability and trustworthiness of results. Based on the synthesis of the literature, recommendations are developed for future research and for the practical use of machine learning in sports science, with the aim of improving methodological rigor, transparency, and reproducibility.
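
As an illustration of the splitting strategies mentioned above, the following minimal sketch shows how grouped and time-aware cross-validation could be set up with scikit-learn's GroupKFold and TimeSeriesSplit. The data, feature names, grouping variable, and estimator are hypothetical and are not taken from the reviewed studies; the sketch only demonstrates the general principle of keeping an athlete's sessions in a single fold and of never training on observations that postdate the test fold.

```python
# Minimal sketch (illustrative only): grouped and time-aware cross-validation
# for hierarchical, longitudinal sports data. All data are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupKFold, TimeSeriesSplit, cross_val_score

rng = np.random.default_rng(0)
n_sessions = 400                      # training sessions, ordered in time
X = rng.normal(size=(n_sessions, 5))  # e.g. load, wellness, and recovery features
y = rng.integers(0, 2, n_sessions)    # e.g. injury within the following week
athlete_id = rng.integers(0, 25, n_sessions)  # hierarchical grouping variable

model = RandomForestClassifier(n_estimators=200, random_state=0)

# Grouped validation: all sessions of one athlete stay in the same fold,
# so the model is always evaluated on unseen athletes (no identity leakage).
grouped_scores = cross_val_score(
    model, X, y, groups=athlete_id, cv=GroupKFold(n_splits=5), scoring="roc_auc"
)

# Time-aware validation: training folds always precede the test fold,
# so future observations never inform predictions about earlier sessions.
temporal_scores = cross_val_score(
    model, X, y, cv=TimeSeriesSplit(n_splits=5), scoring="roc_auc"
)

print("Grouped CV AUC:  %.3f +/- %.3f" % (grouped_scores.mean(), grouped_scores.std()))
print("Temporal CV AUC: %.3f +/- %.3f" % (temporal_scores.mean(), temporal_scores.std()))
```

In practice, the two concerns would often be combined (for example, athlete-grouped splits applied within a chronologically ordered design), since either source of leakage on its own can inflate the apparent performance of a model.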