R Inconsistencies

A collection of detected R inconsistencies and bugs.

Format Deviations

Unexpected deviations of type or dimensions.

0 dimension matrix

Subsetting a matrix by selecting 0 rows and 0 columns preserves the matrix format even without drop=FALSE, whereas selecting 1 row and 0 columns drops it.

m <- matrix(1:9, nrow=3, ncol=3)


m[1,0]                                          m[0,0]
## integer(0)                                   ## <0 x 0 matrix>
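One way to read this result: dropping only removes dimensions of extent 1, so a 1 x 0 result collapses to a vector while a 0 x 0 result has nothing to drop. A minimal sketch that checks the dimensions directly:

```r
m <- matrix(1:9, nrow = 3, ncol = 3)

# 1 x 0 result: the extent-1 dimension is dropped, leaving a plain vector
dim(m[1, 0])                 # NULL

# 0 x 0 result: no dimension has extent 1, so nothing is dropped
dim(m[0, 0])                 # 0 0

# drop = FALSE keeps the matrix format in both cases
dim(m[1, 0, drop = FALSE])   # 1 0
dim(m[0, 0, drop = FALSE])   # 0 0
```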

qnorm() on empty matrix

qnorm() drops matrix format when input is an empty matrix.

qnorm(matrix(0.1, nrow=1, ncol=2))              qnorm(matrix(0, nrow=0, ncol=0))
##           [,1]      [,2]                     ## numeric(0)
## [1,] -1.281552 -1.281552
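A possible workaround is to restore the dimensions after the call. The helper keep_dim() below is hypothetical (not part of R) and assumes the function returns exactly one value per input element:

```r
# Hypothetical wrapper: apply an elementwise function, then restore the
# dimensions of the input. Assumes f() returns one value per element of x.
keep_dim <- function(f, x, ...) {
  out <- f(x, ...)
  dim(out) <- dim(x)
  out
}

empty <- matrix(0, nrow = 0, ncol = 0)
dim(keep_dim(qnorm, empty))                             # 0 0
dim(keep_dim(qnorm, matrix(0.1, nrow = 1, ncol = 2)))   # 1 2
```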

is.nan() on data.frame

is.nan doesn’t work on data.frames while is.na does.

is.na(iris)                                     is.nan(iris)
## <works>                                      ## Error in is.nan(iris) : default method
                                                ## not implemented for type 'list'
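A column-wise call gives a result comparable to is.na(iris). The helper is_nan_df() is a hypothetical name; the sketch assumes every column is an atomic vector and that the data.frame has more than one row (with a single row, vapply() returns a vector instead of a matrix):

```r
# Hypothetical helper: apply is.nan() to each column and bind the results
# into a logical matrix with one column per variable.
is_nan_df <- function(df) {
  vapply(df, is.nan, logical(nrow(df)))
}

res <- is_nan_df(iris)
dim(res)   # 150 5
any(res)   # FALSE
```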

rbind() on zero-column data.frames

rbind() does not preserve the number of rows of zero-column data.frames, although it does for matrices; cbind() preserves the dimensions in both cases.

mat <- matrix(nrow=2, ncol=0)                   df <- as.data.frame(mat)


dim(mat)                                        dim(df)
## [1] 2 0                                      ## [1] 2 0


dim(rbind(mat, mat))                            dim(rbind(df, df))
## [1] 4 0                                      ## [1] 0 0


dim(cbind(mat, mat))                            dim(cbind(df, df))
## [1] 2 0                                      ## [1] 2 0

The number of columns is always preserved, making rbind() and the number of rows the only exception [1].
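If the row count matters, the zero-column case can be handled separately. The wrapper rbind_zero_col() is a hypothetical name; it rebuilds the result with the construction used above and otherwise defers to rbind():

```r
# Hypothetical wrapper: when every input has zero columns, return a
# zero-column data.frame whose row count is the sum of the input row counts.
rbind_zero_col <- function(...) {
  dfs <- list(...)
  if (all(vapply(dfs, ncol, integer(1)) == 0L)) {
    n <- sum(vapply(dfs, nrow, integer(1)))
    return(as.data.frame(matrix(nrow = n, ncol = 0)))
  }
  rbind(...)
}

df <- as.data.frame(matrix(nrow = 2, ncol = 0))
dim(rbind_zero_col(df, df))   # 4 0
```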

any() on data.frames

any() works on numeric data.frames but not on logical data.frames.

any(data.frame(A=0, B=1))                       any(data.frame(A=TRUE, B=FALSE))
## [1] TRUE                                     ## Error in FUN(X[[i]], ...) : only defined
                                                ## on a data frame with all numeric variables
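Converting the data.frame first sidesteps the restriction. A minimal sketch, assuming all columns are logical:

```r
df <- data.frame(A = TRUE, B = FALSE)

# Either conversion yields a plain logical object that any() accepts
any(as.matrix(df))   # TRUE
any(unlist(df))      # TRUE
```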

rbind() on nested data.frames

rbind() on data.frames does not adjust row names to be unique when the data.frame contains a nested data.frame.

df    <- data.frame(a=1:3)
df$df <- data.frame(a=1:3)

rbind(df, df)
## Error in `.rowNamesDF<-`(x, value = value) :
##   duplicate 'row.names' are not allowed
## In addition: Warning message:
## non-unique values when setting 'row.names': ‘1’, ‘2’, ‘3’

Statistical Hypothesis Tests

Unexpected behaviour when applying null hypothesis tests.

fligner.test() and constant values

fligner.test() can return NA or a highly significant p-value when the values within every group are constant, that is, when each group has zero variance.

fligner.test(c(1,1,2,2), c("a","a","b","b"))
##
##         Fligner-Killeen test of homogeneity of variances
##
## data:  c(1, 1, 2, 2) and c("a", "a", "b", "b")
## Fligner-Killeen:med chi-squared = NaN, df = 1, p-value = NA


fligner.test(c(1,1,1,2,2,2), c("a","a","a","b","b","b"))
##
##         Fligner-Killeen test of homogeneity of variances
##
## data:  c(1, 1, 1, 2, 2, 2) and c("a", "a", "a", "b", "b", "b")
## Fligner-Killeen:med chi-squared = Inf, df = 1, p-value < 2.2e-16
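A simple guard is to check for constant groups before running the test. The function constant_within_groups() is a hypothetical helper, not part of the stats package:

```r
# Hypothetical guard: TRUE when every group contains a single distinct value,
# which is the situation shown in both examples above.
constant_within_groups <- function(x, g) {
  all(tapply(x, g, function(v) length(unique(v)) == 1L))
}

constant_within_groups(c(1, 1, 2, 2), c("a", "a", "b", "b"))   # TRUE
constant_within_groups(c(1, 2, 2, 2), c("a", "a", "b", "b"))   # FALSE
```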

paired wilcox.test() and ties

The paired version of wilcox.test() has tolerance issues when detecting whether ties are present.

wilcox.test(c(4, 3, 2), c(3, 2, 1), paired=TRUE)
##
##         Wilcoxon signed rank test with continuity correction
##
## data:  c(4, 3, 2) and c(3, 2, 1)
## V = 6, p-value = 0.1489
## alternative hypothesis: true location shift is not equal to 0
##
## Warning message:
## In wilcox.test.default(c(4, 3, 2), c(3, 2, 1), paired = TRUE) :
##   cannot compute exact p-value with ties


wilcox.test(c(0.4,0.3,0.2), c(0.3,0.2,0.1), paired=TRUE)
##
##         Wilcoxon signed rank test
##
## data:  c(0.4, 0.3, 0.2) and c(0.3, 0.2, 0.1)
## V = 6, p-value = 0.25
## alternative hypothesis: true location shift is not equal to 0
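The likely cause is floating-point arithmetic: the paired differences of the decimal inputs are not exactly equal, so they are not seen as ties. A small sketch; the rounding to 8 digits is an arbitrary choice for illustration:

```r
d_int <- c(4, 3, 2) - c(3, 2, 1)
d_dec <- c(0.4, 0.3, 0.2) - c(0.3, 0.2, 0.1)

length(unique(d_int))             # 1: all differences are exactly 1
length(unique(d_dec))             # 3: the differences are not exactly equal
print(d_dec, digits = 17)

# After rounding, the three differences compare equal again
length(unique(round(d_dec, 8)))   # 1
```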

paired wilcox.test() and Inf

All versions of wilcox.test() remove infinite values before proceeding, except when paired=TRUE.

In the non-paired version, Inf values are removed, so the results differ:

wilcox.test(c(1,2,3,4), c(0,9,8,7))
##
##         Wilcoxon rank sum test
##
## data:  c(1, 2, 3, 4) and c(0, 9, 8, 7)
## W = 4, p-value = 0.3429
## alternative hypothesis: true location shift is not equal to 0


wilcox.test(c(1,2,3,4), c(0,9,8,Inf))
##
##         Wilcoxon rank sum test
##
## data:  c(1, 2, 3, 4) and c(0, 9, 8, Inf)
## W = 4, p-value = 0.6286
## alternative hypothesis: true location shift is not equal to 0

The paired version keeps Inf and includes it in the ranks, making these results identical:

wilcox.test(c(1,2,3,4), c(0,9,8,7), paired=TRUE)
##
##         Wilcoxon signed rank test
##
## data:  c(1, 2, 3, 4) and c(0, 9, 8, 7)
## V = 1, p-value = 0.25
## alternative hypothesis: true location shift is not equal to 0


wilcox.test(c(1,2,3,4), c(0,9,8,Inf), paired=TRUE)
##
##         Wilcoxon signed rank test
##
## data:  c(1, 2, 3, 4) and c(0, 9, 8, Inf)
## V = 1, p-value = 0.25
## alternative hypothesis: true location shift is not equal to 0
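The identical V statistics can be reproduced by hand: replacing 7 with Inf changes the ranks of the absolute differences, but the only positive difference keeps rank 1. The function signed_rank_v() is a hypothetical helper that computes the sum of ranks of the positive differences:

```r
x <- c(1, 2, 3, 4)
y_fin <- c(0, 9, 8, 7)
y_inf <- c(0, 9, 8, Inf)

d_fin <- x - y_fin   # 1 -7 -5 -3
d_inf <- x - y_inf   # 1 -7 -5 -Inf

# Hypothetical helper: sum of the ranks of |d| over the positive differences
signed_rank_v <- function(d) sum(rank(abs(d))[d > 0])

signed_rank_v(d_fin)   # 1
signed_rank_v(d_inf)   # 1

# Dropping non-finite pairs by hand before a paired test
ok <- is.finite(x) & is.finite(y_inf)
x[ok] - y_inf[ok]      # 1 -7 -5
```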

paired wilcox.test() and error message

Paired wilcox.test() reports an error about 'x' when it is the y observations that are missing.

wilcox.test(c(1,2), c(NA_integer_,NA_integer_), paired=TRUE)
## Error in wilcox.test.default(c(1, 2), c(NA_integer_, NA_integer_), paired = TRUE) :
##   not enough (finite) 'x' observations
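Counting the complete pairs beforehand allows a clearer message. This is a sketch only; the message text is made up for illustration:

```r
x <- c(1, 2)
y <- c(NA_integer_, NA_integer_)

# Number of pairs in which both observations are present
n_pairs <- sum(complete.cases(x, y))
n_pairs   # 0

if (n_pairs < 1) message("no complete (x, y) pairs available")
```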

wilcox.test() output details

The wilcox.test() output does not indicate whether the exact test was used.

wilcox.test(rnorm(10), exact=FALSE, correct=FALSE)
##
##         Wilcoxon signed rank test
##
## data:  rnorm(10)
## V = 30, p-value = 0.7989
## alternative hypothesis: true location is not equal to 0


wilcox.test(rnorm(10), exact=TRUE, correct=FALSE)
##
##         Wilcoxon signed rank test
##
## data:  rnorm(10)
## V = 33, p-value = 0.625
## alternative hypothesis: true location is not equal to 0
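Because the two calls above draw different random samples, their outputs are not directly comparable. A sketch with a fixed, made-up sample (no ties, no zeros) makes the comparison cleaner: the statistic is the same, so any difference in p-value comes from the exact setting alone:

```r
# Made-up sample with distinct non-zero absolute values
x <- c(1.1, -0.4, 2.3, 0.7, -1.8, 0.2, 1.5, -0.9, 3.1, 0.6)

res_approx <- wilcox.test(x, exact = FALSE, correct = FALSE)
res_exact  <- wilcox.test(x, exact = TRUE,  correct = FALSE)

# Same statistic, computed from the same data
c(res_approx$statistic, res_exact$statistic)

# Compare the reported method labels and the p-values
c(res_approx$method, res_exact$method)
c(res_approx$p.value, res_exact$p.value)
```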

wilcox.test() and dead lines

The source of wilcox.test() contains a few correct <- FALSE assignments that appear to have no effect:

                correct <- FALSE
                ESTIMATE <- c(`difference in location` = uniroot(W,
                  lower = mumin, upper = mumax, tol = 0.0001)$root)
            }
            if (exact && TIES) {
                warning("cannot compute exact p-value with ties")
                if (conf.int)
                  warning("cannot compute exact confidence intervals with ties")
            }
        }
    }
    names(mu) <- if (paired || !is.null(y))
        "location shift"
    else "location"
    RVAL <- list(statistic = STATISTIC, parameter = NULL, p.value = as.numeric(PVAL),
        null.value = mu, alternative = alternative, method = METHOD,
        data.name = DNAME)
    if (conf.int)
        RVAL <- c(RVAL, list(conf.int = cint, estimate = ESTIMATE))
    class(RVAL) <- "htest"
    RVAL
}

var.test() and conf.level

var.test() does not accept a conf.level of 0 or 1, while t.test() does.

t.test(rnorm(10), rnorm(10), conf.level=0)
## <works>


var.test(rnorm(10), rnorm(10), conf.level=0)
## Error in var.test.default(rnorm(10), rnorm(10), conf.level = 0) :
## 'conf.level' must be a single number between 0 and 1
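The difference is consistent with one function validating against the closed interval [0, 1] and the other against the open interval (0, 1). The two predicates below are hypothetical illustrations of that distinction, not copies of the R sources:

```r
# Hypothetical validators (assumed for illustration, not taken from R itself)
valid_closed <- function(cl) {
  length(cl) == 1L && is.finite(cl) && cl >= 0 && cl <= 1   # accepts 0 and 1
}
valid_open <- function(cl) {
  length(cl) == 1L && is.finite(cl) && cl > 0 && cl < 1     # rejects 0 and 1
}

valid_closed(0)   # TRUE
valid_open(0)     # FALSE
```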

  1. Worth noting that this behaviour is documented in the manual page ?rbind.data.frame.