Abstract
https://doi.org/10.58984/smbic250101187d
The use of machine learning and artificial intelligence techniques in sports science has increased markedly in recent years, especially in performance evaluation, training process optimization, and sports injury prediction. However, many current studies have serious methodological weaknesses, such as inadequate validation procedures, possible information contamination (data leakage), insufficient reporting transparency, and limited generalizability of findings. Through a narrative review of peer-reviewed scientific literature indexed in the Web of Science and Scopus databases, this study aims to identify the major sources of methodological bias and to analyze prevailing methodological practices in the application of machine learning in sport. Because sports data have a distinctive temporal and hierarchical structure, time-aware data splitting and grouped validation procedures are required to obtain a meaningful estimate of model performance. The study critically examines the use of modern reporting and quality-assessment frameworks, such as TRIPOD+AI and PROBAST+AI, along with the contribution of interpretable models and explainable AI techniques to improving the practical applicability and trustworthiness of results. Based on the synthesis of the literature, recommendations are developed for future research and practical use of machine learning in sports science, with the aim of improving methodological rigor, transparency, and reproducibility.
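The two validation ideas named in the abstract, grouped validation and time-aware splitting, can be illustrated with a minimal sketch. This is not the authors' procedure: the field names `athlete_id` and `day`, the toy records, and both helper functions are assumptions introduced only for illustration, and only the Python standard library is used.

```python
# Illustrative sketch only: hypothetical field names and toy data.

def grouped_folds(records, n_folds, group_key="athlete_id"):
    """Yield (train_idx, test_idx) pairs in which no group (e.g. athlete)
    contributes rows to both the training and the test side."""
    groups = sorted({r[group_key] for r in records})
    if n_folds < 2 or n_folds > len(groups):
        raise ValueError("n_folds must be between 2 and the number of groups")
    for k in range(n_folds):
        # Every n_folds-th group, starting at k, is held out in this fold.
        held_out = set(groups[k::n_folds])
        train = [i for i, r in enumerate(records) if r[group_key] not in held_out]
        test = [i for i, r in enumerate(records) if r[group_key] in held_out]
        yield train, test


def time_aware_split(records, cutoff, time_key="day"):
    """Train on observations strictly before `cutoff`, test on the rest,
    so that no future information is available during training."""
    train = [i for i, r in enumerate(records) if r[time_key] < cutoff]
    test = [i for i, r in enumerate(records) if r[time_key] >= cutoff]
    return train, test


# Toy data: 4 hypothetical athletes, each measured on days 0..5.
records = [{"athlete_id": a, "day": d} for a in "ABCD" for d in range(6)]

if __name__ == "__main__":
    for fold, (train, test) in enumerate(grouped_folds(records, n_folds=2)):
        test_athletes = sorted({records[i]["athlete_id"] for i in test})
        print(f"fold {fold}: held-out athletes {test_athletes}")
    train, test = time_aware_split(records, cutoff=4)
    print(f"time-aware split: {len(train)} training rows, {len(test)} test rows")
```

A random row-level split of the same records would place measurements from one athlete on both sides of the split and mix later days into training, which is one way the data leakage described in the abstract can arise.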
References
- Alzahrani, A., & Ullah, A. (2024). Advanced biomechanical analytics: Wearable technologies for precision health monitoring in sports performance. Digital Health, 10, Article 20552076241256745. https://doi.org/10.1177/20552076241256745
- Bullock, G. S., et al. (2024). Machine learning for understanding and predicting injuries in sport. Sports Medicine – Open. https://sportsmedicine-open.springeropen.com/articles/10.1186/s40798-024-00745-1
- Calderón-Díaz, J. A., et al. (2024). Explainable machine learning in sports performance: Bridging prediction and interpretation. Sensors, 24(8), 2914. https://www.mdpi.com/1424-8220/24/8/2914
- Carey, D. L., Crossley, K. M., & Whiteley, R. (2023). Machine learning approaches to sports injury prediction: Practical applications and methodological challenges. Sports Medicine – Open, 9(1), 52. https://sportsmedicine-open.springeropen.com/articles/10.1186/s40798-023-00589-1
- Cawley, G. C., & Talbot, N. L. C. (2010). On over-fitting in model selection and subsequent selection bias in performance evaluation. Journal of Machine Learning Research, 11, 2079–2107. https://www.jmlr.org/papers/v11/cawley10a.html
- Claudino, J. G., Capanema, D. de O., de Souza, T. V., Serrão, J. C., Pereira, A. C. M., & Nassis, G. P. (2019). Current approaches to the use of artificial intelligence for injury risk assessment and performance prediction in team sports: A systematic review. Sports Medicine – Open, 5(1), 28. https://sportsmedicine-open.springeropen.com/articles/10.1186/s40798-019-0202-3
- Collins, G. S., Moons, K. G. M., Dhiman, P., Riley, R. D., Beam, A. L., Van Calster, B., & Logullo, P. (2024). TRIPOD+AI statement: Updated guidance for reporting clinical prediction models that use regression or machine learning methods. BMJ, 385, e078378. https://pubmed.ncbi.nlm.nih.gov/38626948/
- Cordeiro, M. C., Cathain, C. O., Daly, L., Kelly, D. T., & Rodrigues, T. B. (2025). A synthetic data-driven machine learning approach for athlete performance attenuation prediction. Frontiers in Sports and Active Living, 7, 1607600. https://doi.org/10.3389/fspor.2025.1607600
- Dasic, D. (2018). Sport and industry of sport as a central component of social and economic development process. Srpska Akademska Misao, 3(1), 27–42. http://www.sam.edu.rs/index.php/sam/article/view/16
- Dašić, D., & Vuković, M. (2024). Mixing quantitative and qualitative methods in scientific research in sports. SPORTICOPEDIA – SMB, 2(1), 285–297. https://doi.org/10.58984/smbic240201285d
- Dašić, D. (2023a). Nauka i metod – metodologija naučnoistraživačkog rada u sportu. Službeni glasnik; Fakultet za sport, Univerzitet „Union – Nikola Tesla“, Beograd.
- Dašić, D. (2023b). Application of Delphi method in sports. Sport, mediji i biznis, 9(1), 59–71. https://doi.org/10.58984/smb2301059d
- De Pauw, K., Roelands, B., & Meeusen, R. (2024). Predictive analytics in swimming: Physiological modeling using machine learning. Frontiers in Sports and Active Living, 6, 1372451. https://www.frontiersin.org/articles/10.3389/fspor.2024.1372451/full
- Finzel, B., et al. (2025). Current methods in explainable artificial intelligence and their applications in physiology. Physiological Reports. Advance online publication. https://pmc.ncbi.nlm.nih.gov/articles/PMC11958383/
- Gebru, T., Morgenstern, J., Vecchione, B., Vaughan, J. W., Wallach, H., Daumé, H., & Crawford, K. (2021). Datasheets for datasets. Communications of the ACM, 64(12), 86–92. https://dl.acm.org/doi/10.1145/3458723
- He, H., & Garcia, E. A. (2009). Learning from imbalanced data. IEEE Transactions on Knowledge and Data Engineering, 21(9), 1263–1284. https://doi.org/10.1109/TKDE.2008.239
- Hernandez-Boussard, T., Bozkurt, S., Ioannidis, J. P. A., & Shah, N. H. (2020). MINIMAR: Developing reporting standards for medical AI. Journal of the American Medical Informatics Association, 27(12), 2011–2015. https://academic.oup.com/jamia/article/27/12/2011/5864179
- Huang, Y., et al. (2020). A tutorial on calibration measurements and interpretations in clinical research. NPJ Digital Medicine, 3, 201. https://pmc.ncbi.nlm.nih.gov/articles/PMC7075534/
- Johnson, J. M., & Khoshgoftaar, T. M. (2019). Survey on deep learning with class imbalance. Journal of Big Data, 6, 27. https://journalofbigdata.springeropen.com/articles/10.1186/s40537-019-0192-5
- Jordan, M. J., Fransen, J., & Pappalardo, L. (2023). Quality assessment in AI-based sports science research: A systematic review. PLOS ONE, 18(7), e0290045. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0290045
- Kapoor, S., & Narayanan, A. (2023). Leakage and the reproducibility crisis in machine-learning-based science. Patterns, 4(9), 100804. https://pubmed.ncbi.nlm.nih.gov/37720327/
- Kaufman, S., Rosset, S., & Perlich, C. (2012). Leakage in data mining: Formulation, detection, and avoidance. ACM Transactions on Knowledge Discovery from Data, 6(4), 15. https://doi.org/10.1145/2382577.2382579
- Kolbinger, F. R., et al. (2024). Reporting guidelines in medical artificial intelligence: Literature review. Communications Medicine, 4, 92. https://www.nature.com/articles/s43856-024-00492-0
- López-Fernández, J., Sánchez, J., & Ortega, E. (2022). Applications of machine learning in sports: A systematic review. Applied Sciences, 12(18), 9201. https://www.mdpi.com/2076-3417/12/18/9201
- Lunić, T., & Ćesarević, J. (2025). Integracija TQM i finansijskog menadžmenta: Uticaj na profitabilnost i rizik. Horizonti menadžmenta, 5(1), 93–112.
- Mateus, N., et al. (2024). Empowering the sports scientist with artificial intelligence in training, performance and health management. Frontiers in Sports and Active Living. https://pmc.ncbi.nlm.nih.gov/articles/PMC11723022/
- Mitchell, M., Wu, S., Zaldivar, A., Barnes, P., Vasserman, L., Hutchinson, B., & Gebru, T. (2019). Model cards for model reporting. Proceedings of FAT ’19, 220–229. https://dl.acm.org/doi/10.1145/3287560.3287596
- Moons, K. G. M., et al. (2025). PROBAST+AI: Updated risk-of-bias tool for prediction models using AI. BMJ, 388, e082505. https://pubmed.ncbi.nlm.nih.gov/40127903/
- Musat, C. L., et al. (2024). Diagnostic applications of AI in sports: A comprehensive review. Diagnostics, 14(22), 2516. https://www.mdpi.com/2075-4418/14/22/2516
- Mongan, J., Moy, L., & Kahn, C. E., Jr. (2020). Checklist for Artificial Intelligence in Medical Imaging (CLAIM): A guide for authors and reviewers. Radiology: Artificial Intelligence, 2(2), e200029. https://doi.org/10.1148/ryai.2020200029
- Mlađenović, N. (2025). Između globalne mobilnosti i kulturne pripadnosti. Horizonti menadžmenta, 5(1), 127–134.
- Nosek, B. A., Ebersole, C. R., DeHaven, A. C., & Mellor, D. T. (2018). The preregistration revolution. PNAS, 115(11), 2600–2606. https://www.pnas.org/doi/10.1073/pnas.1708274114
- Naughton, M., Salmon, P. M., Compton, H. R., & McLean, S. (2024). Challenges and opportunities of artificial intelligence implementation within sports science and sports medicine teams. Frontiers in Sports and Active Living, 6, 1332427. https://doi.org/10.3389/fspor.2024.1332427
- Pappalardo, L., et al. (2019). A public data set of spatio-temporal match events in soccer. Scientific Data, 6, 236. https://www.nature.com/articles/s41597-019-0247-7
- Rana, S., Verma, R., & Li, C. (2023). Bias and generalizability in predictive modeling of sports injuries. Frontiers in Physiology, 14, 1173910. https://www.frontiersin.org/articles/10.3389/fphys.2023.1173910/full
- Roberts, D. R., Bahn, V., Ciuti, S., Boyce, M. S., Elith, J., Guillera-Arroita, G., Hauenstein, S., Lahoz-Monfort, J. J., Schröder, B., Thuiller, W., Warton, D. I., Wintle, B. A., Hartig, F., & Dormann, C. F. (2017). Cross-validation strategies for data with temporal, spatial, hierarchical, or phylogenetic structure. Ecography, 40(8), 913–929. https://doi.org/10.1111/ecog.02881
- Rathore, A., Kumar, P., & Singh, D. (2023). Machine learning for performance prediction in basketball. Heliyon, 9(6), e17824. https://doi.org/10.1016/j.heliyon.2023.e17824
- Rebelo, A., et al. (2023). From data to action: Wearable technology in injury prevention. BMC Sports Science, Medicine and Rehabilitation, 15, 117. https://bmcsportsscimedrehabil.biomedcentral.com/articles/10.1186/s13102-023-00783-4
- Reis, F. J. J., Alaiti, R. K., Vallio, C. S., & Hespanhol, L. (2024). Artificial intelligence and machine learning in sports. Brazilian Journal of Physical Therapy, 28(3), 101083. https://www.rbf-bjpt.org.br/en-artificial-intelligence-machine-learningapproaches-articulo-S1413355524004891
- Riley, R. D., et al. (2019). Minimum sample size for prediction models (Part II). Statistics in Medicine, 38(7), 1276–1296. https://pmc.ncbi.nlm.nih.gov/articles/PMC6519266/
- Rudin, C. (2019). Stop explaining black-box machine learning models. Nature Machine Intelligence, 1(5), 206–215. https://pmc.ncbi.nlm.nih.gov/articles/PMC9122117/
- Ruddy, J. D., Shield, A. J., & Duhig, S. J. (2022). Injury risk modeling using machine learning. Journal of Science and Medicine in Sport, 25(10), 825–832. https://doi.org/10.1016/j.jsams.2022.03.004
- Seçkin, A. Ç., et al. (2023). Review on wearable technology in sports. Applied Sciences, 13(18), 10399. https://www.mdpi.com/2076-3417/13/18/10399
- Singh, N., Das, R., & Mukherjee, P. (2023). Deep learning vs regression in sports analytics. IEEE Access, 11, 90324–90335. https://ieeexplore.ieee.org/document/10256644
- Stanković, B., Pavlović, Lj., & Stanković, M. (2024). Education for research and the moral responsibility of researchers. Srpska Akademska Misao, 9(1), 19–33. https://www.sam.edu.rs/index.php/sam/article/view/64
- Van Calster, B., et al. (2019). Calibration: The Achilles heel of predictive analytics. BMC Medicine, 17, 230. https://bmcmedicine.biomedcentral.com/articles/10.1186/s12916-019-1466-7
- Van Eetvelde, H., et al. (2021). Machine learning in sport injury prediction. Journal of Experimental Orthopaedics, 8(1), 27. https://jeo-esska.springeropen.com/articles/10.1186/s40634-021-00346-x
- Varoquaux, G., Raamana, P. R., Engemann, D. A., Hoyos-Idrobo, A., Schwartz, Y., & Thirion, B. (2017). Assessing and tuning brain decoders: Cross-validation, caveats, and guidelines. NeuroImage, 145, 166–179. https://doi.org/10.1016/j.neuroimage.2016.10.038
- Vuković, M., & Dašić, D. (2024). Methodology and research methods in public relations. Ekonomski signali: poslovni magazin, 19(1), 67–87. https://doi.org/10.5937/ekonsig2401067V
- Vuković, M., Urošević, S., & Dašić, D. (2023). Threats to objectivity in the social science research. Sport, media and business, 9(2), 143–158. https://doi.org/10.58984/smb2302143v
- Wang, X., et al. (2023). Wearable sensors for activity monitoring. Journal of Biomedical Informatics: X, 7, 100098. https://www.sciencedirect.com/science/article/pii/S2667379723000037
- West, S. W., et al. (2024). Big data. Big potential. Big problems? BMJ Open Sport & Exercise Medicine, 10(2), e001994. https://bmjopensem.bmj.com/content/10/2/e001994
- Whitaker, L., Gabbett, T. J., & Blanch, P. (2023). Data quality in sports performance analytics. European Journal of Sport Science, 23(5), 782–796. https://doi.org/10.1080/17461391.2022.2120667
- Wynants, L., et al. (2020). Prediction models for COVID-19. BMJ, 369, m1328. https://www.bmj.com/content/369/bmj.m1328
- Zhou, D., Keogh, J. W. L., Ma, Y., Tong, R. K. Y., Khan, A. R., & Jennings, N. R. (2025). Artificial intelligence in sport: A narrative review of applications, challenges and future trends. Journal of Sports Sciences. Advance online publication. https://doi.org/10.1080/02640414.2025.2518694
- Zarić, I., Simić, M., & Bojanić, D. (2024). Explainable artificial intelligence in football analytics. Computers in Human Behavior Reports, 14, 101277. https://doi.org/10.1016/j.chbr.2024.101277

