What is time series analysis?
Time series analysis is a statistical technique for working with time series data and analyzing trends. Time series data consists of observations measured or collected at successive points in time, usually at regular intervals. In other words, a time series is simply a series of data points ordered in time, and time series analysis is the process of making sense of this data.
In a business context, examples of time series data include any trends that need to be captured over a period of time. A Google Trends report is a type of time series data that can be analyzed. There are also far more complex applications, such as demand and supply forecasting based on past trends.
Examples of time series data
In economics, time series data could be the Gross Domestic Product (GDP), the Consumer Price Index, the S&P 500 Index, or unemployment rates. The data set could be a country’s gross domestic product taken from Federal Reserve Economic Data (FRED).
From a social sciences perspective, time series data could be birth rate, migration data, population rise, and political factors.
The statistical characteristics of time series data do not always fit conventional statistical methods. As a result, analyzing time series data accurately requires a unique set of tools and methods, collectively known as time series analysis.
Certain aspects are an integral part of the time series analysis process. An analyst should be able to assess the following properties of the data:
- Stationarity is a crucial aspect of a time series. A time series is stationary when its statistical properties, such as the mean and the variance, do not change over time. In other words, it has a constant mean and variance, and the covariance between two observations depends only on the lag between them, not on when they were observed.
- Seasonality refers to periodic fluctuations. For example, electricity consumption is typically high during the day and lower at night. In the case of shopping patterns, online sales spike during the holidays before slowing down again.
- Autocorrelation is the similarity between observations as a function of the time lag between them. For data with a repeating cycle, plotting the autocorrelation against the lag often yields a graph that resembles a sinusoidal function.
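To make autocorrelation and seasonality concrete, the following minimal Python sketch computes the sample autocorrelation of a synthetic series at a few lags. The helper `autocorrelation` and the series `load` are illustrative assumptions, not taken from any particular library; the data is an idealized daily cycle standing in for hourly electricity consumption.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag (lag >= 0)."""
    n = len(series)
    mean = sum(series) / n
    # Denominator: total sum of squared deviations from the mean.
    denom = sum((x - mean) ** 2 for x in series)
    # Numerator: products of deviations for observations `lag` steps apart.
    numer = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return numer / denom


# Hypothetical hourly load: a clean daily cycle (period 24) over 10 days.
load = [100 + 20 * math.sin(2 * math.pi * t / 24) for t in range(240)]

acf_full_period = autocorrelation(load, 24)  # observations one full cycle apart
acf_half_period = autocorrelation(load, 12)  # observations half a cycle apart
```

On this idealized series the autocorrelation is strongly positive one full cycle apart and strongly negative half a cycle apart, which is the wave-like pattern that seasonal data tends to show. Real data would be noisier, so the peaks would be less pronounced.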
Data: Types, terms, and concepts
Data, in general, is considered to be one of these three types:
- Time series data: A set of observations on the values that a variable takes on at different points of time.
- Cross-sectional data: Data of one or more variables, collected at the same point in time.
- Pooled data: A combination of time series data and cross-sectional data.
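The difference between the three types can be sketched with a small data structure. The example below is only an illustration: the entities, years, and values are made up, and the helper names `time_series` and `cross_section` are hypothetical.

```python
# Hypothetical pooled data: unemployment rate (%) keyed by (country, year).
# All values are invented for illustration.
pooled = {
    ("A", 2020): 5.1, ("A", 2021): 4.8, ("A", 2022): 4.2,
    ("B", 2020): 7.3, ("B", 2021): 6.9, ("B", 2022): 6.1,
    ("C", 2020): 3.9, ("C", 2021): 4.0, ("C", 2022): 3.7,
}


def time_series(data, entity):
    """One entity observed at many points in time, ordered by time."""
    return [
        (year, value)
        for (name, year), value in sorted(data.items())
        if name == entity
    ]


def cross_section(data, year):
    """Many entities observed at a single point in time."""
    return {name: value for (name, y), value in data.items() if y == year}


series_a = time_series(pooled, "A")          # time series data
snapshot_2021 = cross_section(pooled, 2021)  # cross-sectional data
```

Here `pooled` plays the role of pooled data, since it holds several entities across several years; slicing it one way gives a time series and slicing it the other way gives a cross-section.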
These are some of the terms and concepts associated with time series data analysis:
- Dependence: Dependence refers to the association between two observations of the same variable at different points in time.
- Stationarity: Stationarity describes whether the statistical properties of the series, such as its mean, remain constant over the given time period. If the mean drifts, if the spread of the values changes over time, or if the values tend toward infinity, then the series is not stationary.
- Differencing: Differencing is a technique to make a time series stationary and to control autocorrelation by replacing each value with its change from the previous value. That said, not all time series analyses need differencing, and over-differencing can produce inaccurate estimates.
- Curve fitting: Curve fitting as a regression method is useful for data that does not follow a linear relationship. In such cases, the data, including points that fall far out on the fringes, is “regressed” onto a curve with a distinct formula that systems can use and interpret.
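As a rough illustration of stationarity and differencing, the sketch below differences a series with a linear trend and compares the mean of its two halves before and after. The helper names `difference` and `half_means` are hypothetical, and comparing half-means is only a crude check assumed here for simplicity, not a formal stationarity test.

```python
from statistics import mean


def difference(series, lag=1):
    """Return the differenced series y[t] - y[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def half_means(series):
    """Mean of the first and second half: a crude check for a drifting mean."""
    mid = len(series) // 2
    return mean(series[:mid]), mean(series[mid:])


# Hypothetical series with a linear trend: its mean rises over time.
trend = [10 + 3 * t for t in range(8)]   # 10, 13, 16, ...
first_diff = difference(trend)           # the step-to-step changes
second_diff = difference(first_diff)     # differencing once more
```

The two half-means of `trend` differ, while those of `first_diff` are equal, because the trend has been removed. Differencing a second time leaves only zeros, which hints at why differencing more than necessary can discard useful structure.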
Identifying cross-sectional data vs. time series data
The counterpart of time series data is cross-sectional data. This is when various entities, such as individuals and organizations, are observed at a single point in time to draw inferences. Both forms of data analysis have their own value, and sometimes businesses use both forms of analysis to draw better conclusions.
Time series data can be found in nearly every area of business and organizational application affected by the past. This ranges from economics, social sciences, and anthropology to climate change, business, finance, operations, and even epidemiology. In a time series, time is often the independent variable, and the goal is to make a forecast for the future.
The most prominent advantage of time series analysis is that, because data points in a time series are collected sequentially at adjacent time periods, it can reveal correlations between successive observations. This feature sets time series data apart from cross-sectional data.
Time series analysis techniques
Time series analysis can be an ambitious goal for organizations. To obtain accurate results from model fitting, one of several mathematical models may be used, such as:
- Box-Jenkins autoregressive integrated moving average (ARIMA) models
- Box-Jenkins multivariate models
- Holt-Winters exponential smoothing
While the exact mathematical details are beyond the scope of this article, some specific applications of these models are worth discussing here.
The Box-Jenkins models of both the ARIMA and multivariate varieties use the past behavior of a variable to decide which model is best to analyze it. The assumption is that any time series data for analysis can be characterized by a linear function of its past values, past errors, or both. When the model was first developed, the data used was from a gas furnace and its variable behavior over time.
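The idea of describing a series as a linear function of its past values can be sketched with the simplest such model, a first-order autoregression, y[t] = phi * y[t-1] + e[t]. The code below is a deliberately reduced illustration and not the full Box-Jenkins procedure, which also covers model identification and diagnostic checking. The helper `fit_ar1`, the simulated data, and the chosen coefficient 0.7 are all assumptions made for the example.

```python
import random


def fit_ar1(series):
    """Least-squares estimate of phi in y[t] = phi * y[t-1] + e[t].

    Assumes the series has mean zero.
    """
    numer = sum(series[t] * series[t - 1] for t in range(1, len(series)))
    denom = sum(series[t - 1] ** 2 for t in range(1, len(series)))
    return numer / denom


# Simulate a hypothetical AR(1) process with a known coefficient.
random.seed(0)
true_phi = 0.7
y = [0.0]
for _ in range(1999):
    y.append(true_phi * y[-1] + random.gauss(0, 1))

phi_hat = fit_ar1(y)          # estimated from the past behavior of the series
forecast = phi_hat * y[-1]    # one-step-ahead forecast
```

With enough data the estimate `phi_hat` lands close to the coefficient used in the simulation, and the forecast is simply the fitted linear function applied to the most recent value.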
In contrast, the Holt-Winters exponential smoothing model is best suited to analyzing time series data that exhibits a clear trend and varies by season.
Such mathematical models are a combination of several methods of measurement. The Holt-Winters method uses weighted averages, which can seem simple enough, but these averages are applied within the equations for exponential smoothing, where the weights decrease for older observations.
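A minimal sketch of the additive form of Holt-Winters smoothing shows how the weighted averages are layered: one for the level, one for the trend, and one for the seasonal offsets. This is a hand-written illustration rather than a library implementation; the function name, the way the starting values are chosen, and the sample data are all assumptions.

```python
def holt_winters_additive(series, period, alpha, beta, gamma, horizon):
    """Additive Holt-Winters smoothing.

    Returns `horizon` forecasts past the end of `series`.
    """
    if len(series) < 2 * period:
        raise ValueError("need at least two full seasons of data")

    # Starting values from the first two seasons (a simple heuristic).
    first, second = series[:period], series[period:2 * period]
    first_mean = sum(first) / period
    trend = (sum(second) - sum(first)) / period ** 2  # average change per step
    mid = (period - 1) / 2                            # centre of first season
    level = first_mean - (mid + 1) * trend            # level one step before the data
    seasonal = [
        first[i] - (first_mean + (i - mid) * trend) for i in range(period)
    ]

    for t, y in enumerate(series):
        s = seasonal[t % period]
        prev_level = level
        # Each update is a weighted average of new evidence and the old estimate.
        level = alpha * (y - s) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % period] = gamma * (y - level) + (1 - gamma) * s

    n = len(series)
    return [
        level + h * trend + seasonal[(n - 1 + h) % period]
        for h in range(1, horizon + 1)
    ]


# Hypothetical quarterly data: linear trend plus a fixed seasonal pattern, no noise.
pattern = [5, -3, -6, 4]
quarterly = [50 + 2 * t + pattern[t % 4] for t in range(12)]
forecast = holt_winters_additive(
    quarterly, period=4, alpha=0.5, beta=0.3, gamma=0.2, horizon=4
)
```

Because the made-up data follows a trend-plus-season pattern exactly, the forecasts simply continue that pattern. With real data the smoothing weights `alpha`, `beta`, and `gamma` would need to be tuned, typically by minimizing forecast error on past observations.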
Applications of time series analysis
Time series analysis models serve two goals:
- Obtain an understanding of the underlying forces and structure that produced the observed data patterns. Complex, real-world scenarios very rarely fall into set patterns, and time series analysis allows for their study—along with all of their variables as observed over time. This application is usually meant to understand processes that happen gradually and over a period of time such as the impact of climate change on the rise of infection rates.
- Fit a mathematical model as accurately as possible so the process can move into forecasting, monitoring, or even certain feedback loops. This is a use-case for businesses that look to operate at scale and need all the input they can get to succeed.
While the data is numerical and the analysis process seems mathematical, time series analysis can seem almost abstract. However, any organization can realize a number of present-day applications of such methods. For example, it is interesting to imagine that large, global supply chains such as those of Amazon are only kept afloat due to the interpretation of such complex data across various time periods. Even during the COVID-19 pandemic where supply chains suffered maximum damage, the fact that they have been able to bounce back faster is thanks to the numbers, and the comprehension of these numbers, that continues to happen throughout each day and week.
Time series analysis is used to determine the best model for forecasting business metrics, for instance stock market price fluctuations, sales, turnover, and any other process that can use time series data to make predictions about the future. It enables management to understand time-dependent patterns in data and analyze trends in business metrics.
From a practical standpoint, time series analysis in organizations is mostly used for:
- Economic forecasting
- Sales forecasting
- Utility studies
- Budgetary analysis
- Stock market analysis
- Yield projections
- Census analysis
- Process and quality control
- Inventory studies
- Workload projections
Advantages of time series analysis
Data analysts have much to gain from time series analysis. From cleaning raw data and making sense of it to uncovering patterns that help with projections, much can be accomplished through the application of various time series models.
Here are a few advantages of time series analysis:
It cleans data and removes confounding factors
Data cleansing filters out noise, removes outliers, or applies various averages to gain a better overall perspective of the data. It means zeroing in on the signal by filtering out the noise. The process of time series analysis reduces the noise and allows businesses to get a clearer picture of what is happening day-to-day.
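Two common smoothing steps can be sketched in a few lines: a moving average, which dampens noise, and a rolling median, which is less affected by a single outlier. The helper names and the sales figures below are made up for illustration.

```python
from statistics import median


def moving_average(series, window):
    """Average of each run of `window` consecutive points."""
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


def rolling_median(series, window):
    """Median of each run of `window` consecutive points."""
    return [
        median(series[i:i + window])
        for i in range(len(series) - window + 1)
    ]


# Hypothetical daily sales with one outlier (the 90).
daily_sales = [20, 22, 21, 90, 23, 22, 24]

averaged = moving_average(daily_sales, 3)   # outlier is spread over 3 windows
medians = rolling_median(daily_sales, 3)    # outlier is dropped entirely
```

In this example the moving average still carries the outlier into every window that contains it, while the rolling median ignores it. Which of the two is appropriate depends on whether the spike is an error to be removed or a real event worth keeping.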
Provides understanding of data
The models used in time series analysis help to interpret the true meaning of the data in a data set, making life easier for data analysts. Autocorrelation patterns and seasonality measures can be applied to predict when a certain data point can be expected. Furthermore, stationarity measures can help estimate the value of said data point.
This means that businesses can look at data and see patterns across time and space, rather than a mass of figures and numbers that aren’t meaningful to the core function of the organization.
Forecasting data
Time series analysis can be the basis for forecasting data. It is inherently equipped to uncover patterns in data, which form the basis for predicting future data points. It is this forecasting aspect of time series analysis that makes it extremely popular in business. Where most data analytics use past data to retroactively gain insights, time series analysis helps predict the future. It is this very edge that helps management make better business decisions.
Disadvantages of time series analysis
Time series analysis is not perfect. It can suffer from overgeneralization from a single study where more data points and models were warranted. Human error could also lead to choosing the wrong model for the data, which can have a snowballing effect on the output.
It could also be difficult to obtain the appropriate data points. A major point of difference between time-series analysis and most other statistical problems is that in a time series, observations are not always independent.
For example, a single chance event may affect all later data points, and it is up to every data scientist to accurately gauge which of these events may have an impact on the analysis in question. Are there similarities in predictions that can make historical data useful?
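The effect of a single chance event on dependent observations can be sketched as follows. The example assumes a series in which each value is the previous value plus a new shock (a random-walk-style series); the data is invented for illustration.

```python
from itertools import accumulate

# Hypothetical shocks: nothing happens except one chance event at t = 3.
shocks = [0.0] * 10
shocks[3] = 5.0

# Independent observations: the event shows up once and is gone.
independent = shocks[:]

# Dependent observations: each value is the previous value plus the new shock,
# so the event carries over into every later data point.
dependent = list(accumulate(shocks))
```

In `independent` only one observation is affected, whereas in `dependent` every value from the event onward is shifted. This is why methods that assume independent observations can be misleading when applied to time series.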
Future of time series analysis
Time series analysis represents a highly advanced area of data analysis. It focuses on describing, processing, and forecasting time series, which are time-ordered data sets. When interpreting a time series, autocorrelation patterns, seasonality, and stationarity must be taken into account before selecting the right model for analysis. There are several time series analysis models, ranging from basic to fine-tuned to advanced. Advanced models help data analysts to predict time series behavior with much greater accuracy.
With the advent of automation and machine learning techniques, comprehending this information and conducting complex calculations is not as tough as it once was, paving the way for a better understanding of our past, and our future.