Climate Variable Selection for Climate Matching

  • Author
    • #6946
      Joseph Stinziano

      Hi IPRRG,

      I’m looking for some feedback on climate variable selection when using the climatch algorithm, and hoping the community can help address some of my questions and concerns.

      The goal of the analysis is to conduct a horizon scan to estimate which regions of Canada are more susceptible to pests based on climate matching, under ‘current’ and ‘future’ climates. We are not necessarily interested in which specific pests could find a suitable climate where, but in which regions have climates more conducive to novel pests.

      We are using the climatch algorithm as implemented in the R package climatchR. We are using the CHELSA climate dataset, focusing on the 16 bioclim variables, while the species occurrence data is being drawn from GBIF with appropriate quality control filters. The resolution is set to 10 arc-minutes for the climate dataset.

      We are trying to determine the optimal approach to run the analysis. Here are the options we are considering, although we are open to other approaches:
      1. Use all climate variables we are considering, even though some may be highly correlated.
      2. Use pairwise comparisons of the climate variables based on pixels with species occurrence data, drawing the correlations with pooled data from all species, then eliminating variables until the remaining variables are not highly correlated.
      3. Option 2, but on a species-specific basis.
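      As an illustration of Option 2, a minimal sketch of greedy correlation pruning might look like the following. It is plain Python rather than climatchR code, and the function names, the toy values, and the 0.7 cutoff are assumptions made for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / sqrt(sxx * syy)


def prune_correlated(table, threshold=0.7, priority=None):
    """Greedily keep variables that are not highly correlated with each other.

    table     : dict mapping variable name -> values at the occurrence pixels
                (pooled across species for Option 2).
    threshold : assumed cutoff on |r|; a variable is dropped if it exceeds
                this against any variable that has already been kept.
    priority  : optional order in which variables are considered, so that
                preferred variables win when a correlated pair is found.
    """
    order = priority if priority is not None else list(table)
    kept = []
    for name in order:
        if all(abs(pearson(table[name], table[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy values (not real CHELSA data): bio5 closely tracks bio1, bio12 does not.
toy = {
    "bio1": [1, 2, 3, 4, 5],
    "bio5": [2, 4, 6, 8, 11],
    "bio12": [5, 1, 4, 2, 3],
}
print(prune_correlated(toy))
print(prune_correlated(toy, priority=["bio5", "bio1", "bio12"]))
```

      The result depends on the order in which variables are considered, so which member of a correlated pair survives is itself a choice. Running the same function on each species' pixels separately would correspond to Option 3.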

      Our concerns are:
      1. Climate variable correlations may change under different climate scenarios and models, such that non-correlated variables in present day data may be highly correlated in projected climate data and vice versa. This may impact our ability to interpret the results of the horizon scan.
      2. For Option 3, this would lead to different variables being used for different species, which may impact the ability to aggregate and interpret the dataset.
      3. Using Option 1 means that some highly correlated variables may be used.
      4. For options 2 and 3, would it be necessary to use more than the 10 arc-minute pixels in which species occurrence data lie, such as all pixels within 1 pixel unit of the occurrence data? Or would the single pixel approach be appropriate due to the coarseness of the data?
      5. Other concerns we are missing?
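      For the question in point 4, the "all pixels within 1 pixel unit" alternative can be sketched as below. This is a minimal sketch that assumes occurrences have already been converted to (row, column) indices on the 10 arc-minute grid; the function name and the toy grid size are hypothetical.

```python
def neighbourhood_cells(occurrence_cells, n_rows, n_cols, radius=1):
    """Return the set of grid cells within `radius` cells of any occurrence.

    occurrence_cells : iterable of (row, col) indices of occupied pixels.
    radius           : 0 reproduces the single-pixel approach; 1 adds the
                       eight surrounding pixels (a 3 x 3 window).
    Cells that would fall outside the grid are skipped. Longitude is not
    wrapped at the grid edge, which is a simplification.
    """
    cells = set()
    for row, col in occurrence_cells:
        for d_row in range(-radius, radius + 1):
            for d_col in range(-radius, radius + 1):
                r, c = row + d_row, col + d_col
                if 0 <= r < n_rows and 0 <= c < n_cols:
                    cells.add((r, c))
    return cells


# Two occurrences on a toy 5 x 5 grid; overlapping windows are counted once.
occurrences = [(2, 2), (2, 3)]
single = neighbourhood_cells(occurrences, 5, 5, radius=0)
buffered = neighbourhood_cells(occurrences, 5, 5, radius=1)
print(len(single), len(buffered))
```

      Computing the variable correlations on both the single-pixel and the buffered sets would show whether the choice changes which variables are retained.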

      Thank you for taking the time to read this, and I’m looking forward to the discussion.

      Joseph Stinziano

    • #7078
      Darren Kriticos

      Hi Joseph,

      Apologies for the delay.

      You are asking a compendium question.

      My first point is that it is terribly easy to over-think or ascribe too much respect to climate matching tools.

      I published a method using CLIMEX Match Climates (Regional) to tackle the problem that you are tackling: Kriticos, D. J. 2012. Regional climate-matching to estimate current and future biosecurity threats. Biological Invasions 14:1533-1544.

      Every one of these methods will require you to set an arbitrary threshold, which dictates the specificity of the result.

      Using all 16 Bioclim variables is, ahem, not a good idea. Typically, 4-5 variables is best if you want to use a climate matching or correlative method. Given that you are trying to identify risk areas (in general), you should focus on variables that indicate stressful conditions. No plant or animal ever persisted or died depending on the mean annual temperature or total precipitation.

      Correlation of your variables is the least of your worries. In fact, the opposite is the problem: if you use all of the first 16 Bioclim variables, you will be over-fitting the model severely.

      A 10 arc-minute spatial resolution is fine (possibly even overkill) for this challenge.

      Kind regards,
