Creating a random forest model to determine success in women’s collegiate lacrosse

Authors

  • Jennifer Bunn, Sam Houston State University
  • Mary K Reagor
  • Bradley J Myers

DOI:

https://doi.org/10.12922/jshp.v13i1.200

Keywords:

team sport, machine learning, performance match analysis, win probability

Abstract

Predicting sports outcomes is difficult because of the variation introduced by human performance, environmental conditions, and style of play. Linear models have proven ineffective at producing usable equations that hold across these variations. The purpose of this study was to use a random forest model to identify the variables that predict game success (wins and losses) in Division I women’s collegiate lacrosse. Data from the 2013-2018 seasons (103 games) were used to train a basic random forest model, and the 2019 data (17 games) served as a hold-out set to test its accuracy. The model was also tested with data from the other teams in the same conference. After optimization, the accuracy of the model was 88.2% on the 2019 team data and 86.0% on the conference data. The variables with the highest importance all related to shots taken by the team of interest and to preventing shots by the opposing team. Coaches can use these results to design drills that target the most important variables. Because the model’s accuracy was so similar on the two data sets, the designed drills are likely to be transferable to teams of similar capability.
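
The workflow the abstract describes (train on earlier seasons, test on a held-out season, then rank variables by importance) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: scikit-learn is an assumed toolkit, the feature names and the synthetic data are hypothetical placeholders, and the hyperparameters are arbitrary rather than the optimized values from the study. Only the game counts (103 training, 17 hold-out) come from the abstract.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Hypothetical per-game team statistics; the study's actual variables are not listed here.
FEATURES = ["shots", "shots_allowed", "draw_controls", "turnovers", "ground_balls"]
N_TRAIN, N_TEST = 103, 17  # game counts reported in the abstract


def make_synthetic_games(n_games, rng):
    """Generate placeholder game statistics and win/loss labels (not real data)."""
    X = rng.poisson(lam=[28, 26, 14, 16, 18], size=(n_games, len(FEATURES))).astype(float)
    # Assumed toy rule: shot differential plus noise decides the outcome.
    margin = X[:, 0] - X[:, 1] + rng.normal(0.0, 3.0, size=n_games)
    y = (margin > 0).astype(int)  # 1 = win, 0 = loss
    return X, y


def fit_and_evaluate(seed=0):
    """Train on the 'earlier seasons' set, score on the hold-out set, rank features."""
    rng = np.random.default_rng(seed)
    X_train, y_train = make_synthetic_games(N_TRAIN, rng)  # stands in for 2013-2018
    X_test, y_test = make_synthetic_games(N_TEST, rng)     # stands in for the 2019 hold-out

    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    model.fit(X_train, y_train)

    accuracy = accuracy_score(y_test, model.predict(X_test))
    # Impurity-based importances; they sum to 1 across features.
    ranked = sorted(zip(FEATURES, model.feature_importances_),
                    key=lambda pair: pair[1], reverse=True)
    return accuracy, ranked


accuracy, ranked = fit_and_evaluate()
print(f"hold-out accuracy: {accuracy:.3f}")
for name, importance in ranked:
    print(f"{name:>15s}  {importance:.3f}")
```

With a 17-game hold-out set, each game is worth about 5.9 percentage points of accuracy, so the reported 88.2% is consistent with 15 of 17 games classified correctly.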

Published

2025-01-13