Just FYI, there are variants of Elo ratings that take margin of victory into account. It might be worth experimenting with these to see whether they perform better.
For instance, Nate Silver's group at FiveThirtyEight uses the following formula for basketball games (see fivethirtyeight.com/features/how-we-calculate-nba-elo-ratings/):
K * (MOV+3)^(4/5) / (7.5 + 0.006*elo_diff)
where MOV is the margin of victory -- in our case, presumably this would be something like max( 1, (total games won by winner) - (total games won by loser) )
so that the winner of the match always has a positive MOV. Also, elo_diff is
(Elo rating of winner) - (Elo rating of loser),
which will be positive if the higher-rated team wins but negative if the lower-rated team wins. The values 3, 4/5, 7.5, and 0.006 don't seem to be set in stone; it sounds like they were obtained by trying a few values until the rating changes "felt right".
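
In case it helps anyone experiment, here is a minimal Python sketch of the formula above. The function name mov_adjusted_k and its parameter names are made up for illustration and don't come from FiveThirtyEight or from any existing code here; the constants are exposed as keyword arguments only because they look tunable.

```python
def mov_adjusted_k(k, winner_games, loser_games, winner_elo, loser_elo,
                   offset=3.0, exponent=0.8, base=7.5, slope=0.006):
    """Return the margin-of-victory-adjusted K value for one match.

    Hypothetical helper: names and defaults are assumptions for illustration.
    The default constants are the 3, 4/5, 7.5 and 0.006 quoted above.
    """
    # MOV is clamped to at least 1 so the winner always has a positive margin.
    mov = max(1, winner_games - loser_games)
    # Positive when the higher-rated side wins, negative on an upset.
    elo_diff = winner_elo - loser_elo
    # With the default constants the denominator reaches zero at
    # elo_diff = -1250, so extreme upsets would need separate handling.
    return k * (mov + offset) ** exponent / (base + slope * elo_diff)


if __name__ == "__main__":
    # Same 3-game margin, first as a favourite's win, then as an upset.
    print(mov_adjusted_k(20, 5, 2, 1600, 1400))
    print(mov_adjusted_k(20, 5, 2, 1400, 1600))
```

With the same margin, the upset gets the larger value because a negative elo_diff shrinks the denominator. That seems to be the intent: big wins by heavy favourites are damped, upsets are amplified.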