## Abstract

One predicted impact of climate change is a poleward shift in the boundaries of species ranges. Existing methods for identifying such a boundary shift based on changes in the observed pattern of occupancy within a grid of cells are sensitive to changes in the overall rate of sightings and their latitudinal distribution that are unconnected to a boundary shift. A formal test for a boundary shift is described that allows for such changes. The test is applied to detect northward shifts in the northern boundary of the Essex skipper (*Thymelicus lineola*) butterfly and the European goldfinch (*Carduelis carduelis*) in Great Britain. A shift is detected in the latter case but not in the former. Results from a simulation study are presented showing that the test performs well.

## 1. Introduction

One predicted impact of climate change is a poleward shift in the boundaries of species ranges [1]. Because a species boundary is unobservable, such a shift must be inferred from information about the locations of individual organisms. This information commonly consists of the presence or absence of species sightings in a grid of cells covering the region of interest. We will refer to this as the occupancy pattern, with a cell being occupied if it contains at least one sighting. Existing methods for detecting a boundary shift based on the occupancy pattern involve a comparison of the average latitude of the 10 most poleward occupied cells in a later period with the average during an earlier period [2–6]. While the boundary imposes a strict limit on the distribution of occupied cells, this distribution also depends on the overall sighting rate—which reflects a combination of abundance and sighting effort—and how it varies within the species range. Failure to account for changes in these factors (which may themselves be related to climate change) can lead to an erroneous conclusion about a boundary shift. For example, even in the absence of a boundary shift, occupancy can increase at high latitude if there is an increase in, or poleward re-distribution of, the sighting rate. To isolate the effect of a boundary shift, it is necessary to control for these factors. The effect of an overall increase in sighting rate on occupancy at high latitude has been recognized in the literature but has not been treated formally in tests for a single species. The purpose of this paper is to describe and illustrate a method that does this. For concreteness, we will focus on detecting a northward shift in the northern boundary of a species range. The method can be used to detect a shift in either direction of either boundary as well as shifts in elevational boundaries.

## 2. Material and method

We assume that the locations of sightings of individuals in period *t* (*t* = 1, 2) follow a non-stationary Poisson process with unknown rate function *λ _{t}*(

*x*,

*y*), where

*x*and

*y*are coordinates in the easterly and northerly directions, respectively. We also assume that the dominant variability in this rate is with

*y*so that The region of interest is divided into a grid of non-overlapping cells of unit area. Consider one such cell centred in the northerly direction at

*y*. Provided that the northerly dimension of the cell is small in relation to any curvature in

_{j}*λ*(

_{t}*y*), the probability that the cell is occupied in period

*t*is 2.1

If the grid contains *m _{j}* cells centred at

*y*, then the number

_{j}*N*of these that are occupied—which we will refer to as the occupancy count—in period

_{jt}*t*has a binomial distribution with probability mass function 2.2

We turn now to a model for *λ _{t}*(

*y*). In this situation, it is natural to restrict attention to cells north of a fixed threshold latitude

*μ*. Our basic assumption is that 2.3where

*β*(

_{t}*β*> 0) is a scale parameter governing the overall sighting rate in period

_{t}*t*,

*U*(

_{t}*U*>

_{t}*μ*) is the location of the boundary in period

*t*and

*κ*(0 <

_{t}*κ*< 1) is a shape parameter governing the way in which sightings are distributed in period

_{t}*t*. This distribution becomes uniform as

*κ*→ 1 and decreases linearly with increasing

_{t}*y*for

*κ*= 0.5. As

_{t}*κ*→ 0, the upper tail of the distribution becomes increasingly thin. Restricting the shape parameter to the unit interval ensures that the sighting rate decreases continuously to 0 as

_{t}*y*→

*U*. It is important to note that this model applies to the locations of species sightings and not the locations of every individual so that both

_{t}*β*and

_{t}*κ*reflect a combination of abundance and sighting effort.

_{t}Conditional on their total number, the northerly sighting locations under this model follow a generalized Pareto distribution. Briefly, our use of it here was motivated by the remarkable result from extreme value statistics that, for a wide class of parent distributions, exceedances of a random variable *Y* above a high level *μ* (such exceedances having the form *Y* − *μ*) tend to follow this distribution [7].

Let *n _{jt}* be the observed occupancy count for the cells centred at

*y*(

_{j}*j*= 1, 2,…,

*J*) in period

*t*. Here,

*J*is the total number of latitude bins in the grid. Interest centres on using the collection of these observed counts to test the null hypothesis

*H*

_{0}:

*U*

_{1}=

*U*

_{2}that the boundaries are the same in both time periods against the one-sided alternative hypothesis

*H*

_{1}:

*U*

_{1}<

*U*

_{2}that the boundary has shifted north. The likelihood ratio statistic for testing

*H*

_{0}against

*H*

_{1}is 2.4where log

*L*

_{0}is the value of the log-likelihood maximized under the constraint that

*U*

_{1}=

*U*

_{2}and log

*L*

_{1}is its value maximized under the constraint that

*U*

_{1}<

*U*

_{2}. For the model outlined above, the log-likelihood is given by 2.5where

*π*is given in (2.1) with

_{jt}*λ*(

_{t}*y*) found from (2.3). This log-likelihood depends on a total of six parameters:

_{j}*β*

_{1},

*β*

_{2},

*κ*

_{1},

*κ*

_{2},

*U*

_{1},

*U*

_{2}. Its constrained maximization over these parameters must be done numerically. By allowing the scale and shape parameters to differ in the two periods, this test accounts for changes in the overall rate of sightings and their latitudinal distribution.

Statistical inference based on the likelihood is reviewed in [8]. Inference about a distributional endpoint is a non-regular problem and the standard results for LR do not hold. Instead, significance can be assessed through the following parametric bootstrap [9]. Generate bootstrap values of *N _{jt}* (

*j*= 1, 2,…,

*J*;

*t*= 1, 2) from the binomial distribution in (2.2) using the parameter estimates found by maximizing log

*L*with

*U*

_{1}=

*U*

_{2}. Form the value of LR for the bootstrap sample. Repeat the procedure a many number of times and approximate the observed significance level (or

*p*-value) by the proportion of bootstrap values of LR that exceed the observed value.

Although the focus here is on testing for a boundary shift, the same general approach can be used to construct a confidence interval for the magnitude *δ* = *U*_{2}*−U*_{1} of the shift. One way to do this is to exploit the connection between confidence intervals and hypothesis tests. Briefly, a 1 − *α* confidence interval for *δ* is given by the set of values of *δ*_{0} for which the null hypothesis *H*_{0}: *δ* = *δ*_{0} cannot be rejected at significance level *α* in favour of the general alternative hypothesis *H*_{1}: *δ* ≠ *δ*_{0}. The likelihood ratio statistic in this case has the same form as (2.4) with log *L*_{1} given by the maximized log-likelihood from fitting the model without constraint and log *L*_{0} is the maximized log-likelihood from fitting the model under the constraint that *U*_{2} = *U*_{1} + *δ*_{0}. The former case involves fitting the model to the two time periods separately. As before, the *p-*value can be assessed via a parametric bootstrap—in this case, involving simulating data from the fitted model with *U*_{2} constrained to equal *U*_{1} + *δ*_{0}.

## 3. Results

### (a) Applications

We applied the test to occupancy data for the Essex skipper butterfly (*Thymelicus lineola*) and the European goldfinch (*Carduelis carduelis*) in 10 km^{2} cells of the British Ordnance Survey grid. This grid comprises 2831 cells covering Great Britain from the Channel Islands to the Shetland Islands and excluding Northern Ireland. The periods of comparison for the Essex skipper were 1970–1982 and 2000–2004 with data extracted from [10] and [11], respectively, and the periods of comparison for the European goldfinch were 1968–1972 and 1988–1991 with data extracted from [12] and [13], respectively.

Figure 1*a*,*b* shows the pattern of occupancy in the two periods for the Essex skipper. To apply the test, it is necessary to choose the threshold parameter *μ*. The choice of threshold is an open problem in generalized Pareto modelling [14]. We adopted the practical approach of choosing *μ* as the latitude beyond which the observed occupancy rate *n _{jt}*/

*m*exhibits a general decline with increasing latitude and ascertaining that the test results are not sensitive to values in this neighbourhood. The value chosen in this way for the Essex skipper was 173 km where latitude is measured from the southernmost point in the grid. The

_{j}*p*-value based on 200 bootstrap samples was 0.38 and the null hypothesis of no northward shift in the northern boundary of this species cannot be rejected. The maximum-likelihood (ML) estimate of this common boundary is 433 km. The corresponding ML estimates of the scale parameters

*β*

_{1}and

*β*

_{2}were 0.52 and 0.99, respectively, indicating an overall increase in occupancy rate. The ML estimates of the shape parameters

*κ*

_{1}and

*κ*

_{2}were 0.39 and 0.59, respectively, additionally indicating a northward re-distribution of sightings within the common boundary. Both of these effects are evident in figure 1

*c*,

*d*, where the fitted values of the occupancy probabilities

*π*

_{1}(

*y*) and

*π*

_{2}(

*y*) are plotted against

*y*along with the observed occupancy rates.

The average latitude of the 10 northernmost cells occupied by this species increased from 361 to 408 km. Our results indicate that this reflects a combined increase in and northward re-distribution of sightings, not a shift in the boundary.

Figure 2 shows the analogous results for the European goldfinch. The value of *μ* was chosen to be 570 km. The *p-*value based on 200 bootstrap samples was 0.02 and the null hypothesis can be rejected. The ML estimate of the location of the northern boundary shifted northward from 940 to 1069 km. The corresponding ML estimate of the scale parameter declined slightly from 2.55 to 2.30 with the estimated shape parameter remaining virtually unchanged at 0.33. As in figure 1, the fitted occupancy probabilities follow the observed occupancy rates well.

In both cases, the robustness of these results to the choice of *μ* was assessed by repeating the test with *μ* shifted 30 km in each direction. This had no effect on the results.

### (b) Simulation results

We conducted a simulation study aimed at assessing the performance of the test. The study involved simulating 100 sets of occupancy counts *n _{jt}*,

*j*= 1, 2,…,

*J*,

*t*= 1, 2, from the model (2.1)–(2.3) for the Ordnance Survey grid north of a fixed threshold latitude

*μ*for selected combinations of the parameters

*β*

_{1},

*β*

_{2},

*κ*

_{1},

*κ*

_{2},

*U*

_{1},

*U*

_{2}and applying the test to each simulated dataset at the 0.05 significance level based on 200 bootstrap samples. The parameter values used in this experiment were based on fitting the model to occupancy data for a large number of British birds and butterflies. For each combination of parameter values, this involved evaluating

*λ*(

_{t}*y*) from (2.3) for

_{j}*t*= 1, 2 and

*j*= 1, 2,…,

*J*and simulating the corresponding values of the occupancy counts

*N*from the binomial distribution in (2.2) with the probability

_{jt}*π*given in (2.1). Results are presented here for the case where

_{jt}*μ*= 400 km.

The first part of the study was aimed at assessing the agreement between the nominal 0.05 significance level and the actual rate at which the null hypothesis is rejected. This involved simulating under the null hypothesis with *U*_{1} = *U*_{2}. Results are presented in table 1 for the case where the common boundary was at 800 km, focusing on cases in which the two periods have different shape and scale parameters. The results in table 1 indicate good agreement between the nominal and actual significance levels. In particular, the test is not sensitive to a combination of an increase in and northward shift in the sighting rate.

The second part of the study was aimed at assessing the power of the test. This involved simulating under the alternative hypothesis with *U*_{1} < *U*_{2}. Results are presented in table 2 for selected combinations of the parameters, in this case focusing on cases in which the two periods have the same shape and scale parameters. The test has good power provided the magnitude of the shift and the scale and shape parameters are not too small. Loosely speaking, when the values of both the scale and shape parameters are low, the sightings are few and strung out towards the boundary and constrain its location only poorly. Referring back to the applications, these results suggest that the power of the test is limited in the case of the Essex skipper (but obviously not in the case of the European goldfinch).

## 4. Discussion

The key feature of the approach described in this paper is that it explicitly decomposes changes in occupancy in the poleward part of a species range into three components: an overall re-scaling, a latitudinal re-distribution and a boundary shift. This allows for inference about one component while controlling for the confounding effects of the others. The focus here has been on a boundary shift. This is justified in at least two ways. First, as noted, such shifts are among the clearest predicted ecological impacts of climate change and there is interest in validating this prediction. Second, and more substantively, the appearance of species in a new location can have both ecological and economic consequences and it is important to determine in specific cases whether such a shift has occurred. It can certainly be argued that a poleward (or other) re-distribution of abundance is at least as important as a boundary shift and the model described here can be used to study this as well. In this regard, however, it is worth re-iterating that the sighting data on which this (and other) methods are based reflects a combination of abundance and sighting effort, so that changes in the shape parameter must be interpreted with caution.

On the technical side, although we stress the importance of model-checking as in figures 1*c*,*d* and 2*c*,*d* and an assessment of sensitivity to the choice of *μ*, an attractive feature of our approach is that the underlying model of the latitudinal distribution of individuals has a basis in extreme value statistics. Experience suggests that, as in the examples presented above, this model provides a good fit to the observed occupancy rate in a high proportion of cases. Similarly, results appear to be insensitive to the choice of the threshold latitude *μ* in a high proportion of cases.

## Data accessibility

MATLAB routines for model-fitting and testing are available at http://hdl.handle.net/1912/6348.

## Acknowledgements

The unusually helpful comments of four anonymous reviewers are acknowledged with gratitude.

- Received September 17, 2013.
- Accepted January 24, 2014.

- © 2014 The Author(s) Published by the Royal Society. All rights reserved.