I needed a quick and simple way to identify outliers, so I wrote an R function that identifies outliers in a vector. Outliers can be identified in many ways; this one uses the simple 1.5 × IQR rule, where a value counts as an outlier if it lies more than 1.5 times the interquartile range below the first quartile or above the third quartile. The function could easily be modified to identify extreme outliers by changing the IQR multiplication factor from 1.5 to 3.0, for example.

```
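# Make the function to identify outliers (more than 1.5 x IQR beyond the quartiles)
outlier.f <- function(x){
  # Lower limit: first quartile minus 1.5 x IQR
  low <- as.numeric(quantile(x)[2] - IQR(x)*1.5)
  # Upper limit: third quartile plus 1.5 x IQR
  high <- as.numeric(quantile(x)[4] + IQR(x)*1.5)
  # Return both limits and the positions of the values beyond them
  list(lower.limit=low, upper.limit=high,
       lower=which(x<low), upper=which(x>high))
}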
```

This returns a list with four entries: the lower and upper limits (data values) for outliers, and the positions of the values that fall below the lower limit or above the upper limit. Short working example:

```
# Make a vector with some obvious outliers
foo <- c(rnorm(80),rnorm(10,mean=-5),rnorm(10,mean=5))

# Identify outliers
outlier.f(foo)
$lower.limit
[1] -3.718559

$upper.limit
[1] 3.549603

$lower
[1] 81 82 83 84 85 86 88 89 90

$upper
[1] 91 92 93 94 95 96 97 98 99 100
```
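The multiplier change mentioned at the top can also be exposed as an argument instead of editing the function body. Here is a minimal sketch of that idea; the name `outlier.k`, the argument `k`, and the test vector `bar` are my own illustrative choices, not part of the original function. It assumes `x` is a numeric vector with no missing values.

```r
# Hypothetical variant of outlier.f with an adjustable IQR multiplier `k`
# (assumes x is numeric and contains no NA values)
outlier.k <- function(x, k = 1.5) {
  # First and third quartiles, using quantile()'s default method
  q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)
  spread <- q[2] - q[1]          # interquartile range
  low <- q[1] - k * spread       # lower limit
  high <- q[2] + k * spread      # upper limit
  list(lower.limit = low, upper.limit = high,
       lower = which(x < low), upper = which(x > high))
}

# Small fixed vector, so the result does not depend on random draws
bar <- c(1:10, 20, 50)
outlier.k(bar)$upper          # positions 11 and 12 (values 20 and 50)
outlier.k(bar, k = 3)$upper   # position 12 only (value 50)
```

With the default `k = 1.5` the upper limit for `bar` is 17.5, so both 20 and 50 are flagged; with `k = 3` the limit moves out to 25.75 and only 50 remains.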
