I found a dataset on Kaggle that contains match data from 2011 to 2019. The data was not published by the WTA, and some match results may be wrong, but it is still worth digging into for answers to some interesting questions:
- Who won the most matches on each surface from 2011 to 2019?
- Do most matches end 2-0?
- Are players who perform better on their 1st serves more likely to win the match?
There are 5 different kinds of court surface in the data. Guess what? Serena Williams was not among the top 3 winners. Former No. 1 Wozniacki was the best player on one surface, winning 129 matches from 2011 to 2019. Julia Grabher, who is No. 85 in 2022 but was ranked outside the top 200 in 2019, won 87 matches on one kind of surface.
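To make the per-surface count concrete, here is a minimal sketch of how it could be done. The rows, surface names, and player names are made up for illustration, and the real Kaggle file may use different column names and values.

```python
from collections import Counter, defaultdict

# Hypothetical rows: (surface, winner). These are NOT taken from the dataset.
matches = [
    ("Hard", "Player A"),
    ("Hard", "Player A"),
    ("Hard", "Player B"),
    ("Clay", "Player B"),
    ("Clay", "Player C"),
    ("Clay", "Player B"),
    ("Grass", "Player C"),
]

def top_winners_by_surface(rows, n=3):
    """Return {surface: [(player, wins), ...]} with the n most frequent winners."""
    wins = defaultdict(Counter)
    for surface, winner in rows:
        wins[surface][winner] += 1
    return {surface: counter.most_common(n) for surface, counter in wins.items()}

top = top_winners_by_surface(matches)
for surface, ranking in top.items():
    print(surface, ranking)
```

With the full data, the first entry of each ranking would be the player with the most wins on that surface.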
There are 4 possible results for a women's tennis player:
- win by 2-0
- win by 2-1
- lose by 0-2
- lose by 1-2
Based on 131398 match results, 37.56% of matches end with a score of 2-0 for a player.
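A share like this could be computed as in the sketch below. It assumes each match is counted once from each player's point of view (one row for the winner, one for the loser), which is only my reading of "for a player". The scores are invented toy values.

```python
from collections import Counter

# Hypothetical input: sets won by (winner, loser) in each best-of-three match.
match_scores = [(2, 0), (2, 1), (2, 0), (2, 1), (2, 0)]

def outcome_shares(scores):
    """Share of each result over all player-match rows (two rows per match)."""
    counts = Counter()
    for winner_sets, loser_sets in scores:
        counts[f"win {winner_sets}-{loser_sets}"] += 1   # winner's view
        counts[f"lose {loser_sets}-{winner_sets}"] += 1  # loser's view
    total = sum(counts.values())
    return {result: count / total for result, count in counts.items()}

shares = outcome_shares(match_scores)
for result, share in sorted(shares.items()):
    print(f"{result}: {share:.2%}")
```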
I just tried a basic linear regression to analyze players’ performance. Here is what I found:
- The numbers of 1st serves won, 2nd serves won and aces are positive factors; the number of double faults is a negative factor.
- The number of 1st serves won is more relevant than the number of 2nd serves won.
- If a tennis player performs better on her 1st serves, she has a larger probability of winning.
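For readers who want to see what a fit like this looks like, here is a small self-contained sketch of ordinary least squares in plain Python. It is not the exact code behind the findings above. The feature rows are invented, and the target is generated from a made-up linear rule (`true_coefs`) so the fit recovers known coefficients; with real data the target would be something like a win/loss indicator, which is an assumption on my part.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_linear(features, targets):
    """Ordinary least squares via normal equations: [intercept, coef_1, ...]."""
    X = [[1.0] + list(row) for row in features]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(X, targets)) for i in range(k)]
    return solve(xtx, xty)

# Toy rows: (1st serves won, 2nd serves won, aces, double faults). Invented.
rows = [
    (20, 10, 2, 2), (30, 10, 2, 2), (20, 15, 2, 2), (20, 10, 6, 2),
    (20, 10, 2, 5), (30, 15, 6, 5), (25, 12, 4, 3),
]
# Made-up rule used only to generate a target with known coefficients.
true_coefs = [0.1, 0.02, 0.01, 0.03, -0.04]
y = [true_coefs[0] + sum(c * v for c, v in zip(true_coefs[1:], row)) for row in rows]

coefs = fit_linear(rows, y)
names = ["intercept", "1st serves won", "2nd serves won", "aces", "double faults"]
for name, value in zip(names, coefs):
    print(f"{name}: {value:+.4f}")
```

A positive coefficient means the factor pushes the target up and a negative one pushes it down, which is how the signs in the list above would be read.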