SIMULATION OF A 40-YEAR CLIMATIC TIME SERIES TO ILLUSTRATE A RANDOM TREND

Andrew C. Comrie

Dept. of Geography and Regional Development
University of Arizona, Tucson, AZ 85721, U.S.A.

Abstract: The dangers of assuming apparent trends to be real are highlighted via examination of a 40-year random time series of simulated temperature. The overall trend was examined using linear regression, and shorter-term trends were identified using a 9-year moving average. The likelihood of the observed trend having occurred by chance was evaluated by comparison to trends in 100 random series. The example series displayed annual and decadal variability, as well as a clear upward overall trend (slope = 0.0089; R2 = 0.1616) with a 2 percent chance of occurrence. The findings underscore the care that should be taken when evaluating trends in data for which the controlling processes are not fully understood.

Introduction
Many climate studies have examined trends in quantities such as temperature, precipitation, and carbon dioxide (CO2) based on time series of data collected over the last 50 to 100 years (e.g., Cayan et al., 1998; Peterson and Vose, 1997; Keeling and Whorf, 1998). These studies frequently include time-series plots showing, for example, increases since the middle of the twentieth century. In some cases, these figures include trend lines or smoothed curves to highlight the nature of a particular trend.

The statistical strength or weakness of any such trend is usually detailed in the paper. However, it is not uncommon for a graph of an especially newsworthy trend to be reproduced in the media. Figure 1 shows two examples of this phenomenon, the annual Mauna Loa CO2 curve and the annual mean minimum temperature for Tucson, Arizona. While trends published in scientific articles have undergone review for scientific and statistical robustness, it is easy for the untrained eye to see apparent trends that may not be real in other similar, relatively short time series.

Figure 1. Two examples of climatic time series and trend lines for (a) Mauna Loa CO2 data from Keeling and Whorf (1998) and (b) mean minimum temperature data at Tucson International Airport from Peterson and Vose (1997).

The aim of this paper is to examine the apparent trend in a simulated annual climatic time series using random numbers. Any trends present in the data will have occurred by chance, and will highlight the level of caution required for interpretation.

Methods
A series of 40 random numbers was created in a spreadsheet (Microsoft Excel) to simulate annual temperature, which here stands in for an arbitrary climatic variable. For each data point, the software returns a uniformly distributed random number greater than or equal to 0 and less than 1. The long-term trend in the data was examined by calculating and plotting a straight-line regression against year number. Shorter-term trends approximating a decadal time scale were examined by calculating a centered 9-year moving average to smooth the annual data (i.e., year 1 to year 9 average plotted at year 5, year 2 to year 10 average plotted at year 6, etc.), so that smoothed values exist only for years 5 to 36. The data and results were plotted to enable visual comparison.
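The procedure described above can be sketched in Python. This is only an illustrative sketch and not the spreadsheet used in the study: the function names (`simulate_series`, `linear_trend`, `centered_moving_average`) and the fixed seed are assumptions introduced here, and a different random number generator will not reproduce the example series.

```python
import random


def simulate_series(n_years=40, seed=None):
    """Return n_years uniform random values in [0, 1), one per simulated year."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n_years)]


def linear_trend(values):
    """Least-squares fit of values against year number 1..n.

    Returns (slope, intercept, r_squared).
    """
    n = len(values)
    years = range(1, n + 1)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    syy = sum((y - mean_y) ** 2 for y in values)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared


def centered_moving_average(values, window=9):
    """Centered moving average; years without a full window are left as None."""
    half = window // 2
    smoothed = [None] * len(values)
    for i in range(half, len(values) - half):
        smoothed[i] = sum(values[i - half:i + half + 1]) / window
    return smoothed


# Seed chosen arbitrarily so that the sketch is repeatable (an assumption).
series = simulate_series(seed=1)
slope, intercept, r2 = linear_trend(series)
smoothed = centered_moving_average(series)
print(f"slope={slope:.4f} intercept={intercept:.3f} R2={r2:.4f}")
```

With a window of 9, the first smoothed value falls at year 5 and the last at year 36, which matches the layout of the appendix table.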

To illustrate the likelihood of the observed trend having occurred by chance, 100 versions of the random series were generated for comparison. Slope coefficients were calculated for each series and tabulated by frequency of occurrence.
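A minimal sketch of this comparison follows, again in Python rather than the original spreadsheet. The names `ols_slope` and `slope_frequencies` are hypothetical, the seed is arbitrary, and the rule that a slope falling exactly on a category boundary is counted in the higher category is an assumption, since the treatment of boundaries is not specified. The category boundaries follow those used in Table 1.

```python
import random


def ols_slope(values):
    """Least-squares slope of values against year number 1..n."""
    n = len(values)
    mean_x = (n + 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(1, n + 1))
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in enumerate(values, start=1))
    return sxy / sxx


def slope_frequencies(n_series=100, n_years=40, seed=None):
    """Simulate n_series random series and tabulate their slopes.

    Returns (slopes, edges, percentages); percentages has one entry per
    category: below the first edge, between consecutive edges, and at or
    above the last edge.
    """
    rng = random.Random(seed)
    slopes = [ols_slope([rng.random() for _ in range(n_years)])
              for _ in range(n_series)]
    # Category boundaries -0.008, -0.006, ..., 0.008 (10 categories).
    edges = [k / 500 for k in range(-4, 5)]
    counts = [0] * (len(edges) + 1)
    for s in slopes:
        # Assumption: a slope exactly on a boundary goes to the higher category.
        counts[sum(s >= e for e in edges)] += 1
    percentages = [100 * c / n_series for c in counts]
    return slopes, edges, percentages


slopes, edges, percentages = slope_frequencies(seed=2)  # arbitrary seed
labels = (["< %.3f" % edges[0]]
          + ["%.3f to %.3f" % (lo, hi) for lo, hi in zip(edges, edges[1:])]
          + [">= %.3f" % edges[-1]])
for label, pct in zip(labels, percentages):
    print(f"{label:<18} {pct:5.1f}")
```

Because each run draws new random series, the frequencies will differ from run to run and from the published table; only the general bell shape centered near zero should be expected to recur.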

Results
Figure 2 illustrates the simulated time series and results. Visual inspection shows a clear upward trend in the 40-year series, although there is noticeable annual and decadal variability within the overall trend. The regression line has a calculated slope of 0.0089 and an intercept of 0.401, and it explains about 16 percent of the variance in the data (R2 = 0.1616). The moving average highlights two apparent cycles of rising and falling simulated annual temperatures, with a decrease in the middle of the series that lasts for more than a decade. While the overall spread of data covers the range between 0 and 1, the higher values tend to fall (randomly) in the middle and later part of this particular series, thereby leading to an apparent upward trend.

Figure 2. Random time series of 40 simulated annual temperatures, showing the raw annual values, the smoothed series using a 9-year moving average to highlight decadal variability, and the best-fit regression line highlighting the apparent long-term trend.

If these data were actual temperatures, this would be the point to consider explanations for the observed trends in the data. However, these are randomly generated data, so any trend is known to have occurred by chance. To examine the likelihood of the strong apparent trend in Figure 2, the frequency distribution of slope values representing the long-term trends from 100 simulated data series is provided in Table 1. It can be seen that the chance of a slope as large as the 0.0089 in Figure 2 is about 2 percent, or 1 in 50 occurrences.

Table 1: Percentage frequency of slope coefficients in 10 categories (eight of equal width plus two open-ended tails) from 100 series of 40 years each.

Slope               Frequency (%)
< -0.008            0
-0.008 to -0.006    5
-0.006 to -0.004    8
-0.004 to -0.002    17
-0.002 to 0         18
0 to 0.002          18
0.002 to 0.004      14
0.004 to 0.006      13
0.006 to 0.008      5
> 0.008             2

Discussion and Conclusion
The results show that a remarkably strong apparent trend occurs in this example of a 40-year random time series of simulated temperature. There are also strong apparent shorter-term trends visible in the data. While these trends are real in the sense that they exist for these specific data, the more refined question is to what degree the observed long-term trend might have occurred by chance. The simulation of 100 time series provides an answer to this question, and it mimics what would normally be calculated with some basic statistics.

For these 100 trials, 67 percent of the slopes fell between -0.004 and 0.004, a range that corresponds to roughly one standard deviation on either side of zero. There are few slope values near the upper and lower tails of the frequency distribution, which is approximately normal (or bell shaped), and the trend in this example is therefore relatively unusual. Slopes of this magnitude occur with a frequency of only about 2 percent. This may seem small, but to put it in perspective, if individual random time series were assigned to each member of a class of 25 students, there would be about a 40 percent chance (1 - 0.98^25) of at least one student having a series displaying a trend as strong as this example.
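The link between the -0.004 to 0.004 range and the standard deviation can be checked with a short calculation. The sketch below assumes independent uniform values on [0, 1), whose standard deviation is 1/sqrt(12), and uses the standard result that the least-squares slope then has standard deviation sd(y) / sqrt(sum of squared deviations of the year numbers from their mean). The function name `slope_standard_deviation` is hypothetical, and this calculation is not part of the original spreadsheet analysis.

```python
import math


def slope_standard_deviation(n_years=40, low=0.0, high=1.0):
    """Theoretical standard deviation of the least-squares slope for
    independent uniform(low, high) values regressed on years 1..n_years."""
    sd_y = (high - low) / math.sqrt(12)      # sd of a uniform variable
    mean_x = (n_years + 1) / 2
    sxx = sum((x - mean_x) ** 2 for x in range(1, n_years + 1))
    return sd_y / math.sqrt(sxx)


sd = slope_standard_deviation()
observed_slope = 0.0089                      # slope of the example series
observed_in_sd_units = observed_slope / sd
print(f"sd of slope = {sd:.5f}")
print(f"observed slope = {observed_in_sd_units:.2f} standard deviations")
```

For a 40-year series this gives a standard deviation of roughly 0.004, in line with the range containing about two-thirds of the simulated slopes, and it places the example slope a little more than two standard deviations above zero.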

Notice also that every value between 0 and 1 is equally likely for any simulated temperature in the series (or for the temperature in the next year, year 41). Yet the slope values calculated from the time series are approximately normally distributed, and they have a much greater chance of being near zero.

In conclusion, this paper has examined trends in a simulated annual climatic time series using random numbers. The study identified a clear long-term trend in an example series that is known to have occurred by chance, and it highlights the caution that should be used when interpreting trends in situations where the underlying processes are not fully understood.

References
Cayan, D.R., M.D. Dettinger, H.F. Diaz, and N.E. Graham, 1998: Decadal variability of precipitation over western North America. Journal of Climate, 11, 3148-3166.

Keeling, C.D. and T.P. Whorf, 1998: Atmospheric CO2 concentrations -- Mauna Loa Observatory, Hawaii, 1958-1997. Technical Report NDP-001, Carbon Dioxide Information Analysis Center, Oak Ridge National Laboratory, Oak Ridge, Tennessee.

Peterson, T.C. and R.S. Vose, 1997: An overview of the Global Historical Climatology Network temperature data base. Bulletin of the American Meteorological Society, 78, 2837-2849.

Appendix
Raw data for the example 40-year time series used in the study, and illustrated in Figure 2.

Year    Raw_Data       9-yr_Average
1       0.497138259
2       0.167629143
3       0.594334213
4       0.25999216
5       0.571433527    0.408250071
6       0.24064353     0.400273476
7       0.598618197    0.422243724
8       0.181487353    0.434713576
9       0.562974255    0.475839872
10      0.42534891     0.452854774
11      0.365361371    0.52294034
12      0.706562879    0.543391197
13      0.63012883     0.631537732
14      0.364567641    0.645745307
15      0.871413628    0.696658066
16      0.782675907    0.735152202
17      0.974806165    0.700988974
18      0.690842427    0.670585785
19      0.883563743    0.635680156
20      0.711808598    0.608218434
21      0.399093828    0.562599102
22      0.356500129    0.514998037
23      0.050416981    0.481188078
24      0.62425813     0.455627157
25      0.372101917    0.448180189
26      0.546396576    0.43987523
27      0.386552802    0.510629795
28      0.653515449    0.598826706
29      0.644785886    0.590463604
30      0.324349197    0.648739235
31      0.993291215    0.697518206
32      0.844189183    0.758854334
33      0.548990211    0.755409924
34      0.896582595    0.777210928
35      0.985407317    0.766883809
36      0.938577953    0.766599916
37      0.622515757
38      0.840994928
39      0.231405118
40      0.990736186