The zero-sum-power issue in the hyper2 package

Robin K. S. Hankin

To cite the hyper2 package in publications, please use Hankin (2017).

The powers of a hyper2 object generally sum to zero. This is implicit in Bradley-Terry type likelihood functions in which probabilities are generally ratios of linear combinations of strengths. Here I discuss this observation in the context of hyper2 package idiom. Usually, powers not summing to zero is an indication of a programming bug. For example, consider the following likelihood function, drawn from the chess dataset:

\[ \frac{p_1^{30}p_2^{36}p_3^{22}}{ \left(p_1+p_2\right)^{35} \left(p_2+p_3\right)^{35} \left(p_1+p_3\right)^{18} } \]

For a time, one package documentation file contained the following hyper2 idiom purporting to create the chess likelihood function:

chess <- hyper2()
chess["Topalov"] <- 30
chess["Anand"  ] <- 36
chess["Karpov" ] <- 22
chess[c("Topalov","Anand" )] <- 35  # bug!  should  be -35
chess[c("Anand","Karpov"  )] <- 35  # bug!  should  be -35
chess[c("Karpov","Topalov")] <- 18  # bug!  should  be -18

However, the above commands include an error [the sign of the denominator is incorrect]. The package includes two mechanisms to detect this type of error. Firstly, the print method (by default) detects such unbalanced likelihood functions and gives a warning:

chess
## Warning in print.hyper2(x): powers have nonzero sum
## log(Anand^36 * (Anand + Karpov)^35 * (Anand + Topalov)^35 * Karpov^22 *
## (Karpov + Topalov)^18 * Topalov^30)

and secondly, function loglik() traps nonzero power sums:

loglik(equalp(chess),chess)
## Error in loglik_single(p, H, log = log): sum(powers(H)) == 0 is not TRUE

The correct idiom would be

chess[c("Topalov","Anand" )] <- -35
chess[c("Anand","Karpov"  )] <- -35
chess[c("Karpov","Topalov")] <- -18
chess
## log(Anand^36 * (Anand + Karpov)^-35 * (Anand + Topalov)^-35 * Karpov^22
## * (Karpov + Topalov)^-18 * Topalov^30)
loglik(equalp(chess),chess)
## [1] -60.9969519

See how the print method gives immediate assurance that its argument is indeed balanced (and besides, unbalanced likelihood functions cannot even be evaluated with loglik()). It is natural to suggest including a check in the creation method—hyper2()—which would prevent the creation of a hyper2 object with unbalanced powers. However, this approach is not consistent with the package; consider the following situation. Suppose we wish to incorporate a new observation into chess, specifically that Anand played Karpov twice, with one win each. We might proceed as follows:

chess["Anand" ] %<>% inc
chess["Karpov"] %<>% inc
chess[c("Anand,Karpov")]  %<>% dec(2)
chess
## log(Anand^37 * (Anand + Karpov)^-35 * (Anand + Topalov)^-35 *
## Anand,Karpov^-2 * Karpov^23 * (Karpov + Topalov)^-18 * Topalov^30)

However, after the increments but before the decrements, second, chess has a nonzero power sum, pending addition of another term. At this point, chess is unbalanced; its nonzero power sum is an indicator that it is a temporary object. That’s OK as long as we remember to add the denominator (as carried out in the next line) which would mean dividing by (Anand+Karpov)^2, thereby restoring the zero power sum. If we forget to do this, the print method gives us a warning, and indeed loglik() returns an error, which should prompt us to check the coding.

The print method

The hyper2 print method is sensitive to option give_warning_on_nonzero_power_sum. If TRUE (the default), a warning is issued if the powers have nonzero sum. This is usually appropriate. If the option is FALSE, the warning is suppressed. Note that the intermediate likelihood functions in the chess example are not printed (or indeed evaluated), so unbalanced likelihood functions are permitted, but only ephemerally.

Function balance()

Sometimes it is convenient to accommodate the numerator terms all together and enforce the zero power sum at the end. This is accomplished by balance():

H <- hyper2()
H['a'] %<>% inc(5)
H['b'] %<>% inc(2)
H['c'] %<>% inc(9)
H %<>% balance
H
## log(a^5 * (a + b + c)^-16 * b^2 * c^9)

References

Hankin, R. K. S. 2017. “Partial Rank Data with the hyper2 Package: Likelihood Functions for Generalized Bradley-Terry Models.” The R Journal 9 (2): 429–39.