Previous SPTK Post: Examples of Random Variables Next SPTK Post: The Sampling Theorem
In this Signal Processing ToolKit post, I provide an introduction to the concept and use of random processes (also called stochastic processes). This is my perspective on random processes, so although I’ll introduce and use the conventional concepts of stationarity and ergodicity, I’ll end up focusing on the differences between stationary and cyclostationary random processes. The goal is to illustrate those differences with informative graphics and videos; to build intuition in the reader about how the cyclostationarity property comes about, and about how the property relates to the more abstract mathematical object of a random process on one hand and to the concrete data-centric signal on the other.
So … this is the first SPTK post that is also a CSP post.
Jump straight to ‘Significance of Random Processes in CSP’ below.
Introduction
We started our signal-processing toolkit journey by looking at signals, including rectangles, triangles, unit-steps, sinc, etc. We then looked at representing arbitrary signals of time in terms of building-block functions such as Walsh functions, impulses, and harmonically related sine waves. The latter led us to the useful frequency-domain signal representations of the Fourier series and Fourier transform.
We shifted focus to systems–those entities that act on our signals to achieve some goal–and found that our analysis tools enabled much insight into system behavior if the system was linear and time-invariant. We added the crucial tool of convolution to our toolkit, and looked at various kinds of linear time-invariant systems, also known as filters.
We then realized that many of our most important signals (in CSP) don’t quite fit into our analysis framework. In particular, communication signals are well-modeled as random infinite-duration power signals, which are not Fourier transformable. So we started learning probability.
The concept of a random variable bridges the gap between an abstract probability space of events and sets of measurements or numbers that we can operate on with arithmetic, algebra, calculus, and associated computing devices. But random variables are not functions of time. Communication signals are. So we now take the next logical step: going from a random variable to a random function or, as it is more commonly referred to, a random process.
How should we generalize the concept of a signal to include both probability and infinite time? We need infinite time because we want to study our systems’ behavior for arbitrarily long-duration inputs. We need probability because the signals we use in communication theory and practice are inherently unpredictable–they are random.
A useful model of a communication signal is an infinite-duration signal that incorporates some parameters or variables that are in some sense unknown to a receiver of the signal:

$$s(t) = \sum_{k=-\infty}^{\infty} a_k\, p(t - kT_0).$$

Here $a_k$ is the $k$th ‘symbol’ to be transmitted to the receiver, $p(t)$ is a ‘pulse function’ to be modulated (multiplied) by the symbol, and $1/T_0$ is the rate of producing pulses, and therefore the rate of transmitting symbols.
It is possible to use such a model–a single infinite-duration random signal–to construct an entire probabilistic theory of signals. That theory is the fraction-of-time (FOT) probability theory, and I hope to get to that in the SPTK series, or in a CSP post. A more conventional way, and a way that is somewhat easier mathematically even if more obscure physically, is to use a generalization of our previous notion of a probability space.
Probability
A random process is a collection of random variables indexed by time. It is actually more general than that. The indexing can be some other independent variable, such as distance. But for us, swimming in the deep waters of CSP, our random processes are indexed by time $t$.
We use the notation $X(t)$ to denote a random process–typically we use upper-case letters to denote a process, and we’ll continue to use lower-case letters to denote signals. For each $t = t_1$, $X(t_1)$ is a random variable with some cumulative distribution function (CDF) and probability density function (PDF). Moreover, every collection of random variables $\{X(t_1), X(t_2), \ldots, X(t_n)\}$ has an $n$th-order joint PDF and CDF. (Here we are considering real-valued $X(t)$, so the definition of the CDF is straightforward; complex-valued random processes are only a little more tricky.) In this post, we’ll focus on $n = 1$ and $n = 2$. That will be sufficient to ground our discussion in the familiar quantities of mean value (first-order moment), power spectrum, autocorrelation (second-order moment), spectral correlation, and cyclic autocorrelation.
Given the definition of a random process, it follows that the random variable $X(t_1)$ has some PDF we can denote by $f_{X(t_1)}(x)$ and the random variable $X(t_2)$ has PDF $f_{X(t_2)}(x)$. These two PDFs may or may not be identical. The two random variables may or may not be independent, correlated, uncorrelated, etc.

The joint PDF for the two random variables is denoted by $f_{X(t_1)X(t_2)}(x_1, x_2)$.
All the usual probability and random-variable definitions and tools apply.
For random variables, a particular occurrence of the variable, such as the outcome of a particular card draw or die throw, is called an instance or sample of the variable. For a random process, a particular occurrence of the process is a function, and we call it, variously, a sample path, a sample function, a time-series, or a signal.
Moments
Suppose we have knowledge of a random process $X(t)$. Then we know the PDFs for $X(t_j)$ for all $t_j$: the collection of functions $f_{X(t_j)}(x)$. We can therefore compute the mean value for each $t_j$,

$$M_X(t_j) = E\left[X(t_j)\right] = \int_{-\infty}^{\infty} x\, f_{X(t_j)}(x)\, dx,$$

and any particular $M_X(t_j)$ may or may not equal some other $M_X(t_k)$.

Similarly, we can look at the correlation between two elements of the random process,

$$R_X(t_1, t_2) = E\left[X(t_1)X(t_2)\right],$$

which may turn out to be independent of $t_1$ and $t_2$, dependent on them in some general way, or dependent only on their difference $t_1 - t_2$.

The covariance is the correlation of the two variables after their means are removed,

$$C_X(t_1, t_2) = E\left[\left(X(t_1) - M_X(t_1)\right)\left(X(t_2) - M_X(t_2)\right)\right],$$

as usual.
In many cases we don’t actually know the densities (PDFs) of the process itself. However, this is often no barrier to finding the mean, autocorrelation, and covariance functions. Suppose we have a random process that is characterized by a couple of specific random variables, such as a sine-wave phase or amplitude. Then we can consider $X(t)$ as a function of those random variables, say $X(t) = g(t; A, \Phi)$, and we know that the expected value of the process is the expected value of that function of $A$ and $\Phi$, and so all we actually need are the joint PDFs of the random variables $A$ and $\Phi$. An example should make this more clear.
Example: The Random-Phase Sine Wave
Let’s consider the random process defined by

$$X(t) = A\sin(2\pi f_0 t + \Phi).$$

Here $A$ and $f_0$ are constants; they are not random. The only random variable is the sine-wave phase $\Phi$, which is uniform on the interval $[-\pi, \pi)$. This random process models the situation where we receive a sine wave, and we know the amplitude and frequency, but we don’t know anything about the phase. Or, to put it another way, we don’t know the time origin of the wave, the shift of the sine wave relative to the sine wave for $\Phi = 0$.
What is the mean value of this random process? Is it a function of time?
Evaluating the expectation that defines the mean requires the joint distribution of all involved random variables. Since there is only one, $\Phi$, we only require that distribution–no need to know explicitly the various $f_{X(t_j)}(x)$.

The mean value, also known as the first moment of $X(t)$, is given by the expectation

$$M_X(t) = E\left[X(t)\right] = E\left[A\sin(2\pi f_0 t + \Phi)\right],$$

which may be a function of $t$. For our random-phase sine wave and $\Phi$ uniform on $[-\pi, \pi)$, this expectation is equivalent to

$$M_X(t) = \int_{-\pi}^{\pi} A\sin(2\pi f_0 t + \phi)\, \frac{1}{2\pi}\, d\phi.$$

This definite integral is easily evaluated,

$$M_X(t) = \frac{A}{2\pi}\Big[-\cos(2\pi f_0 t + \phi)\Big]_{\phi=-\pi}^{\phi=\pi} = 0.$$

Since $t$ is arbitrary, the mean value (first moment) of the random process is identically zero. We have a ‘zero-mean random process.’

The reason the mean is zero for each and every time instant is that we are averaging over the ensemble of sample functions, where each sample function is delayed by some amount and all delays modulo the period $1/f_0$ are equally represented in that ensemble. A picture helps: see Figure 1.

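As a quick numerical check of the zero-mean claim, here is a minimal Monte Carlo sketch (not code from the post; the use of Python/NumPy and all parameter values are my own arbitrary choices) that averages over many simulated sample paths of the random-phase sine wave:

```python
# Minimal sketch: approximate the ensemble mean of X(t) = A*sin(2*pi*f0*t + Phi),
# with Phi uniform on [-pi, pi), by averaging over many simulated sample paths.
# Parameter values are arbitrary.
import numpy as np

rng = np.random.default_rng(0)
A, f0 = 1.0, 0.05              # assumed amplitude and frequency
t = np.arange(200)             # time instants at which to estimate the mean
num_paths = 10_000             # size of the simulated ensemble

phi = rng.uniform(-np.pi, np.pi, size=(num_paths, 1))   # one phase per sample path
ensemble = A * np.sin(2 * np.pi * f0 * t + phi)          # shape (num_paths, len(t))

mean_estimate = ensemble.mean(axis=0)                    # average down the ensemble
print(np.max(np.abs(mean_estimate)))                     # small, on the order of 1/sqrt(num_paths)
```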
Now let’s look at the autocorrelation function for the random-phase sine wave. We want to find the average value of the product of two values of the process,

$$R_X(t_1, t_2) = E\left[X(t_1)X(t_2)\right].$$

Normally we (and many others) reparameterize the two times $t_1$ and $t_2$ so that $t_1 = t + \tau/2$ and $t_2 = t - \tau/2$, which means that the lag variable is $\tau = t_1 - t_2$, and the time variable $t$ is the center point of the two times. We’ll do that later, but for now I want to keep our two times as $t_1$ and $t_2$.

Our autocorrelation is

$$R_X(t_1, t_2) = E\left[A^2\sin(2\pi f_0 t_1 + \Phi)\sin(2\pi f_0 t_2 + \Phi)\right] = \frac{A^2}{2}\cos\left(2\pi f_0 (t_1 - t_2)\right).$$

So the mean is independent of time, being identically zero, and the autocorrelation is a function only of the difference between the two considered times, $t_1 - t_2$, and is also, therefore, independent of time.
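The ensemble-average autocorrelation can be checked the same way: pick two times, average the product over simulated sample paths, and compare with $\frac{A^2}{2}\cos(2\pi f_0 (t_1 - t_2))$. Again, this is only an illustrative sketch with arbitrary parameter values:

```python
# Sketch: estimate R_X(t1, t2) for the random-phase sine wave by averaging
# X(t1)*X(t2) over simulated sample paths, and compare with the closed form.
import numpy as np

rng = np.random.default_rng(1)
A, f0 = 1.0, 0.05
num_paths = 100_000
t1, t2 = 7.0, 3.0                                  # two arbitrary time instants

phi = rng.uniform(-np.pi, np.pi, num_paths)        # one phase per sample path
x_t1 = A * np.sin(2 * np.pi * f0 * t1 + phi)
x_t2 = A * np.sin(2 * np.pi * f0 * t2 + phi)

estimate = np.mean(x_t1 * x_t2)
theory = 0.5 * A**2 * np.cos(2 * np.pi * f0 * (t1 - t2))
print(estimate, theory)                            # the two values agree closely
```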
What happens when we try to compute the first and second moments (mean and autocorrelation) using only a single sample path? We (and I mean mathematicians, engineers, and statisticians) define the mean and autocorrelation by infinite-time averages in such situations. The mean is just the infinite-time average,

$$\hat{M}_x = \lim_{T\rightarrow\infty} \frac{1}{T}\int_{-T/2}^{T/2} A\sin(2\pi f_0 t + \phi)\, dt.$$

Here $\phi$ is a number, not a random variable–it is an instance of the random variable $\Phi$. It is easy to show that this time-average mean is zero for any $\phi$.

The time-average autocorrelation is typically defined as the following infinite-time average,

$$\hat{R}_x(\tau) = \lim_{T\rightarrow\infty} \frac{1}{T}\int_{-T/2}^{T/2} x(t+\tau/2)\, x(t-\tau/2)\, dt,$$

where we recognize that $t_1 = t + \tau/2$ and $t_2 = t - \tau/2$ so that, again, $\tau = t_1 - t_2$. This average is easy to evaluate too,

$$\hat{R}_x(\tau) = \frac{A^2}{2}\cos(2\pi f_0 \tau),$$

which does not depend on time, and which cannot depend on time, since we integrated over time to obtain the average.
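A rough numerical counterpart for the single-sample-path view is below: a long finite record stands in for the infinite-time limit, and an asymmetric lag product $x(t+\tau)x(t)$ stands in for the symmetric one (the sampling increment, record length, and phase instance are arbitrary choices):

```python
# Sketch: approximate the time-average autocorrelation of one sample path of the
# random-phase sine wave using a long (but finite) record.
import numpy as np

A, f0, phi = 1.0, 0.05, 1.234        # phi is one fixed instance of the random variable Phi
dt = 0.1                             # sampling increment
t = np.arange(100_000) * dt          # long finite record standing in for infinite time
x = A * np.sin(2 * np.pi * f0 * t + phi)

lag = 25                             # lag of tau = 2.5 time units
r_hat = np.mean(x[lag:] * x[:-lag])  # time average of the lag product
print(r_hat, 0.5 * A**2 * np.cos(2 * np.pi * f0 * lag * dt))   # nearly equal
```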
Notice that the ensemble mean matches the time-average mean (zero), and that the ensemble correlation matches the time-average correlation. Does this always happen? No. The ways the two kinds of average can diverge are important, and depend on the particular random variables involved.
For example, consider a different random process given by

$$Y(t) = A\sin(2\pi f_0 t + \phi_0),$$

where $f_0$ and $\phi_0$ are non-random, and $A$ is a uniform random variable on an interval that is symmetric about zero, so that $E[A] = 0$. The mean is easily seen to be zero,

$$M_Y(t) = E[A]\sin(2\pi f_0 t + \phi_0) = 0.$$

On the other hand, the temporal mean of some sample path, where $A$ takes on a concrete value $a$, is

$$\hat{M}_y = \lim_{T\rightarrow\infty}\frac{1}{T}\int_{-T/2}^{T/2} a\sin(2\pi f_0 t + \phi_0)\, dt = 0.$$

So, as with the random phase, the mean of this random-amplitude sine wave is zero (but what if the distribution of $A$ were changed to a uniform distribution on an interval that is not symmetric about zero, so that $E[A] \neq 0$?).
Next let’s compute and compare the probabilistic (ensemble-average) autocorrelation with the temporal (time-average) autocorrelation.
The probabilistic autocorrelation is

$$R_Y(t_1, t_2) = E\left[A^2\right]\sin(2\pi f_0 t_1 + \phi_0)\sin(2\pi f_0 t_2 + \phi_0).$$

If we let $t_1 = t + \tau/2$ and $t_2 = t - \tau/2$, we obtain

$$R_Y(t+\tau/2, t-\tau/2) = \frac{E\left[A^2\right]}{2}\Big[\cos(2\pi f_0 \tau) - \cos(4\pi f_0 t + 2\phi_0)\Big],$$

which is a function of both $t$ and $\tau$.

On the other hand, the temporal correlation is

$$\hat{R}_y(\tau) = \lim_{T\rightarrow\infty}\frac{1}{T}\int_{-T/2}^{T/2} a^2\sin(2\pi f_0 (t+\tau/2) + \phi_0)\sin(2\pi f_0 (t-\tau/2) + \phi_0)\, dt = \frac{a^2}{2}\cos(2\pi f_0 \tau),$$

which is not–cannot be–a function of time $t$. Therefore, for the random-amplitude sine-wave random process, the probabilistic and temporal autocorrelation functions do not match.
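A short simulation makes the mismatch concrete: for the random-amplitude sine wave, the ensemble-average lag product at a fixed lag changes as the center time $t$ changes, which no time average of a single sample path can do. This is an illustrative sketch with arbitrary parameter values:

```python
# Sketch: ensemble-average autocorrelation of Y(t) = A*sin(2*pi*f0*t + phi0) at a
# fixed lag tau, evaluated at several center times t; A is uniform and zero-mean.
import numpy as np

rng = np.random.default_rng(2)
f0, phi0, tau = 0.05, 0.4, 2.0
amps = rng.uniform(-1.0, 1.0, 50_000)              # one amplitude per sample path

def ensemble_autocorr(t):
    # average of Y(t + tau/2) * Y(t - tau/2) over the simulated ensemble
    y1 = amps * np.sin(2 * np.pi * f0 * (t + tau / 2) + phi0)
    y2 = amps * np.sin(2 * np.pi * f0 * (t - tau / 2) + phi0)
    return np.mean(y1 * y2)

# The estimate changes with the center time t: this process is not WSS.
print([round(ensemble_autocorr(t), 3) for t in (0.0, 2.5, 5.0, 7.5)])
```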
Averaging Over the Ensemble vs Over Time
Let’s take a closer look at the differences and similarities between ensemble and temporal averages using a prototypical communication signal: binary pulse-amplitude modulation (PAM) with rectangular pulses. The signal is defined by

$$X(t) = \sum_{k=-\infty}^{\infty} a_k\, p(t - kT_0 - \Phi),$$

where $a_k$ is the $k$th symbol, the symbols are constrained here to be in the set $\{+1, -1\}$, $1/T_0$ is the symbol rate, $\Phi$ is the symbol-clock phase, and $p(t)$ is the pulse function. To make things simple–we’re not trying to be communication-system engineers at the moment–we’ll use the rectangular pulse,

$$p(t) = \begin{cases} 1, & 0 \leq t < T_0, \\ 0, & \text{otherwise}. \end{cases}$$

The symbols are independent and identically distributed–the $+1$s are as probable as the $-1$s, and no symbol depends on any other symbol.
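Something like the sketch below can generate the two kinds of ensembles used in the rest of this section. It is an illustrative stand-in, not the code behind the figures; the helper name pam_ensemble, the use of integer sample shifts to mimic a symbol-clock phase uniform over one symbol interval, and all parameter values are my own choices:

```python
# Sketch: generate an ensemble of rectangular-pulse binary PAM sample paths.
# Set random_phase=True to draw a different symbol-clock phase for each path.
import numpy as np

def pam_ensemble(num_paths=100, num_symbols=1000, samples_per_symbol=10,
                 random_phase=False, rng=np.random.default_rng(3)):
    T0 = samples_per_symbol
    paths = np.empty((num_paths, num_symbols * T0))
    for j in range(num_paths):
        symbols = rng.choice([-1.0, 1.0], size=num_symbols)   # i.i.d., equiprobable
        x = np.repeat(symbols, T0)                             # rectangular pulses
        if random_phase:
            # circularly shift by an integer phase drawn from {0, ..., T0-1},
            # a discrete stand-in for a phase uniform on [0, T0)
            x = np.roll(x, rng.integers(0, T0))
        paths[j] = x
    return paths

ensemble_sync = pam_ensemble(random_phase=False)   # all paths share the same timing
ensemble_rand = pam_ensemble(random_phase=True)    # timing differs from path to path
```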
It is relatively easy to show that the mean value of this random process is zero and that the autocorrelation function is dependent on the nature of the symbol-clock phase random variable $\Phi$. We talked a lot about phase-randomizing in the post on stationary-vs-cyclostationary models, and we obsess (uh, focus) on it here too.

When the symbol-clock phase variable is a constant (nonrandom), the autocorrelation is a periodic function of time; we have a cyclostationary process. When the symbol-clock phase variable is a uniform random variable on an interval with width $T_0$, we have a stationary random process. In this latter case, the autocorrelation function is a unit-height triangle, centered at the origin, with a base of width $2T_0$.

Let’s illustrate these claims using calculations on simulated versions of the PAM signal defined above. We can do both ensemble (probabilistic) averaging and single-sample-path (temporal) averaging, approximately, because we can generate a sizeable ensemble, and each sample path can be made quite long (length equal to many thousands of symbol intervals $T_0$).
Figure 2 shows a portion of an ensemble of rectangular-pulse PAM sample paths for the case where the symbol-clock phase variable $\Phi$ is set to zero. That is, there is no difference between the pulse transition times for all the elements of the ensemble–they have ‘the same timing.’

We can use this non-infinite ensemble to approximate the kinds of ensemble averaging we do with math. For example, we can pick some time $t_1$ and average over all the sample-path values for that time, giving an estimate of $M_X(t_1) = E[X(t_1)]$. If we do that here, we get small numbers like 0.002.

If we pick two times, $t_1$ and $t_2$, we can form an average of the quantity $x_j(t_1)x_j(t_2)$, which is an estimate of the ensemble-average autocorrelation $R_X(t_1, t_2) = E[X(t_1)X(t_2)]$, where $j$ indexes the sample paths. In Figure 2, we’ve indicated the ensemble values for $t_1$ and $t_2$ by the two black vertical lines. If we fix $t_1$, and allow $t_2$ to vary, we can plot the averages as a function of $t_2$. This is done in Figure 3 for the ensemble shown in Figure 2.
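This kind of ensemble-average estimate takes only a few lines of code. In the sketch below (arbitrary parameters, independent of the ensembles used for the figures), the no-phase ensemble is averaged across sample paths at a fixed lag of $T_0/2$ while $t_1$ slides; the estimate swings between roughly one and roughly zero depending on where $t_1$ falls within the symbol interval:

```python
# Sketch: approximate R_X(t1, t2) by averaging x_j(t1)*x_j(t2) over sample paths
# of a rectangular-pulse PAM ensemble with no symbol-clock phase randomization.
import numpy as np

rng = np.random.default_rng(3)
T0, num_paths, num_symbols = 10, 5000, 200
symbols = rng.choice([-1.0, 1.0], size=(num_paths, num_symbols))
paths = np.repeat(symbols, T0, axis=1)          # all paths have the same pulse timing

def ensemble_autocorr(paths, t1, t2):
    # average the lag product down the ensemble (t1, t2 are sample indices)
    return np.mean(paths[:, t1] * paths[:, t2])

# Fix the lag t2 - t1 = T0/2 and slide t1 across two symbol intervals: the estimate
# alternates between ~1 (both times in the same symbol) and ~0 (different symbols).
print([round(ensemble_autocorr(paths, t1, t1 + T0 // 2), 2) for t1 in range(2 * T0)])
```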

When we turn to a time average of a single sample path in the ensemble of Figure 2, we get a much different result, as shown in Figure 4. Moreover, this is the same result we get for any of the sample paths–it does not vary with the index $j$ as defined in the previous paragraph. (Well, that is almost true. You can imagine a sample path for which every symbol $a_k$ is $+1$. For this sample path, the time average is unity. But this sample path occurs ‘almost never,’ which means ‘with probability zero.’)

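The single-sample-path counterpart is a plain time average. In the sketch below (arbitrary parameters, asymmetric lag product), the estimate at lag $T_0/2$ comes out near 0.5 for essentially any sample path, the value of the unit-height triangle $1 - |\tau|/T_0$ at that lag:

```python
# Sketch: time-average autocorrelation of one long rectangular-pulse PAM sample path
# (no phase randomization), estimated at lag T0/2.
import numpy as np

rng = np.random.default_rng(4)
T0, num_symbols = 10, 100_000
x = np.repeat(rng.choice([-1.0, 1.0], size=num_symbols), T0)   # one long sample path

lag = T0 // 2
r_hat = np.mean(x[lag:] * x[:-lag])    # temporal average of the lag product
print(r_hat)                            # approximately 1 - lag/T0 = 0.5
```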
The autocorrelation function estimate for the case of no phase variable in Figures 3 and 4 depends on $t_1$ and $t_2$ in a more general way than through just their difference. This can be seen by forming the estimate for each of a number of choices for $t_1$ and a variety of $t_2$ for each of these choices. The resulting collection of autocorrelation-function estimates can then be viewed as a movie, as shown in Video 1.

Another way of looking at the time variation of the ensemble-average autocorrelation is by fixing the difference $t_2 - t_1$ and varying $t_1$. The autocorrelation estimates in Video 1, on the other hand, fix $t_1$ and allow the difference $t_2 - t_1$ to vary. The former choice is illustrated in Figure 5.

Now let’s look at these same estimated quantities for the ensemble we generated with a uniform random phase variable $\Phi$. Figure 6 shows a portion of the generated ensemble. Since the value of $\Phi$ is randomly chosen for each sample path, the timing of the pulses in the PAM signals differs from path to path.

Figure 7 shows the correspondence between the ensemble-average autocorrelation function and the temporal-average autocorrelation for the random-phase ensemble. Here the two match; you get the same answer whether you average over the ensemble or over time.

Figure 8 shows that the ensemble-average autocorrelation is a function of $t_1$ and $t_2$ only through their difference $t_2 - t_1$. Note that the constant value in each of the subplots of Figure 8 should map to a particular value on the triangle in Video 2. At upper left, $t_2 - t_1 = 0$, and so the constant value there of one maps to the tip of the triangle in Video 2, that is, the value of the triangle for $\tau = 0$, which is satisfyingly also one. The value in Figure 8 for $t_2 - t_1 = T_0/2$ is 0.5, and this is exactly what we see for the triangle in Video 2 when $\tau = T_0/2$.

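A variation of the earlier sketch with a random symbol-clock phase for each sample path (again arbitrary parameters, with integer shifts approximating a phase uniform over one symbol interval) shows both effects at once: sliding $t_1$ at a fixed lag no longer changes the estimate, and sweeping the lag at a fixed $t_1$ traces out the unit-height triangle:

```python
# Sketch: ensemble-average autocorrelation estimates for rectangular-pulse PAM with a
# random symbol-clock phase drawn independently for each sample path.
import numpy as np

rng = np.random.default_rng(5)
T0, num_paths, num_symbols = 10, 5000, 200
symbols = rng.choice([-1.0, 1.0], size=(num_paths, num_symbols))
paths = np.repeat(symbols, T0, axis=1)
shifts = rng.integers(0, T0, size=num_paths)                  # one phase per path
paths = np.array([np.roll(p, s) for p, s in zip(paths, shifts)])

def ensemble_autocorr(paths, t1, t2):
    return np.mean(paths[:, t1] * paths[:, t2])

# Sliding t1 at lag T0/2 now gives (approximately) the same value everywhere ...
print([round(ensemble_autocorr(paths, t1, t1 + T0 // 2), 2) for t1 in range(2 * T0)])
# ... and sweeping the lag at a fixed t1 traces the triangle 1 - |tau|/T0.
print([round(ensemble_autocorr(paths, 50, 50 + lag), 2) for lag in range(T0 + 1)])
```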
Stationarity
When the probabilistic functions of a random process are independent of time, the process possesses a kind of time-invariance that is called stationarity. This property is reminiscent of the all-important system property of time-invariance that we made use of in characterizing linear systems. Systems that are linear and time-invariant are filters, and signals that are time-invariant are stationary, loosely speaking. Of course the signal isn’t actually a constant as a function of time, it is the underlying ensemble that possesses the time-invariance in its statistical functions–stationarity. But we still speak of ‘stationary signals’ in day-to-day work.
There are two kinds of stationary random processes. The stricter kind requires all probabilistic parameters (all moments) to be time-invariant. The weaker kind requires only the first two moments to be time-invariant. This latter kind of stationarity is known as wide-sense stationarity (WSS). Remember, stationarity of any kind is a property of a random process–an ensemble together with a set of joint PDFs for all choices of $n$ and $\{t_1, t_2, \ldots, t_n\}$–not of an individual time-series, signal, or sample path.
Strict-sense stationarity is hard to check because you have to have a simple enough process, and the time, to be able to check all the moments (or PDFs). Wide-sense stationarity saves us from this hassle: we need only check the mean and autocorrelation.
For a wide-sense stationary (WSS) process, the mean must be a constant,

$$M_X(t) = E\left[X(t)\right] = M_X,$$

and the autocorrelation is a function only of the time difference between the two involved times $t_1$ and $t_2$,

$$R_X(t_1, t_2) = E\left[X(t_1)X(t_2)\right] = R_X(t_1 - t_2).$$

We’ve abused the notation a bit here: we’re using $R_X(\cdot)$ to indicate both a function of two variables and a function of one. But we know that for a WSS process the values of $t_1$ and $t_2$ only appear in the formal two-variable autocorrelation as the difference $t_1 - t_2$.
For our two PAM ensembles, only one is wide-sense stationary, and that is the one with the random-phase variable. The other ensemble produces a time-varying autocorrelation (second moment), and is therefore neither wide-sense nor strict-sense stationary.
On the other hand, the sample paths of each of the ensembles are cyclostationary signals. That is, if we compute the cyclic autocorrelation function for a cycle frequency equal to the symbol rate, $\alpha = 1/T_0$, we get a non-zero reliable answer for long processing-block lengths (almost surely).
Ergodicity
Ergodicity is a property of a random process that ensures that the time-averages of the sample paths correspond to the ensemble averages. This is the property that mathematicians must invoke so that they can pursue actual real-world utility of the ensemble sample paths.
Ergodic random processes must be stationary because the time averages of sample paths are, by construction, not functions of time. Therefore, if those averages have any chance to match the ensemble averages, those ensemble averages also cannot be functions of time, and so the process must be stationary.
Our ensemble with the random-phase variable is stationary and ergodic because the mean and the autocorrelation functions in ensemble- and temporal-averaging agree.
Cyclostationarity
A wide-sense cyclostationary random process is one for which the mean and autocorrelation are periodic (or almost-periodic) functions of time, rather than being time-invariant.
A second-order cyclostationary signal, or cyclostationary time-series, is simply a signal for which the infinite-time average defined by

$$R_x^\alpha(\tau) = \lim_{T\rightarrow\infty}\frac{1}{T}\int_{-T/2}^{T/2} x(t+\tau/2)\, x^*(t-\tau/2)\, e^{-i2\pi\alpha t}\, dt$$

or that defined by

$$R_{x^*}^\alpha(\tau) = \lim_{T\rightarrow\infty}\frac{1}{T}\int_{-T/2}^{T/2} x(t+\tau/2)\, x(t-\tau/2)\, e^{-i2\pi\alpha t}\, dt$$

is not identically zero, as a function of the delay $\tau$, for at least one non-trivial cycle frequency $\alpha$. The sole trivial cycle frequency is $\alpha = 0$ for the non-conjugate function $R_x^\alpha(\tau)$. There are no trivial cycle frequencies for the conjugate function $R_{x^*}^\alpha(\tau)$.

It is perfectly possible that there are no non-trivial cycle frequencies for which either the non-conjugate cyclic autocorrelation function $R_x^\alpha(\tau)$ or the conjugate cyclic autocorrelation function $R_{x^*}^\alpha(\tau)$ is non-zero, and yet the signal is still a cyclostationary signal. This can occur if some other, higher-order, moment function possesses a non-trivial cycle frequency. A real-world example is duobinary signaling. Another is simply a square-root raised-cosine QPSK signal with roll-off of zero: it is a wide-sense stationary signal, but possesses many higher-order non-trivial cyclic cumulants.
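To connect the definition to data, here is a simple finite-time approximation (an illustrative sketch using an asymmetric lag product and arbitrary parameters, not a carefully designed estimator) of the non-conjugate cyclic autocorrelation of one rectangular-pulse PAM sample path at cycle frequency $\alpha = 1/T_0$ and lag $\tau = T_0/2$:

```python
# Sketch: crude finite-time estimate of the non-conjugate cyclic autocorrelation of a
# real rectangular-pulse binary PAM sample path at alpha = 1/T0, tau = T0/2.
import numpy as np

rng = np.random.default_rng(6)
T0, num_symbols = 10, 100_000
x = np.repeat(rng.choice([-1.0, 1.0], size=num_symbols), T0)   # one long sample path

alpha, tau = 1.0 / T0, T0 // 2
n = np.arange(len(x) - tau)
lag_product = x[n + tau] * x[n]
r_alpha = np.mean(lag_product * np.exp(-2j * np.pi * alpha * n))
print(abs(r_alpha))   # clearly non-zero (about 0.32); a stationary signal would give ~0 here
```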
Cycloergodicity
If we do have a cyclostationary random process, and we want to use stochastic machinery to, say, derive an algorithm, we will also want to know whether the sample paths of that cyclostationary random process are cyclostationary, and that they possess the same cyclic parameters as the process. This is done by checking for, or invoking, cycloergodicity.
For ergodicity, we check the ensemble-average mean value against the temporal-average mean value for a sample path,

$$E\left[X(t)\right] \overset{?}{=} \lim_{T\rightarrow\infty}\frac{1}{T}\int_{-T/2}^{T/2} x(t)\, dt,$$

and we check the ensemble-average second moment against the temporal-average second moment,

$$E\left[X(t+\tau/2)X(t-\tau/2)\right] \overset{?}{=} \lim_{T\rightarrow\infty}\frac{1}{T}\int_{-T/2}^{T/2} x(t+\tau/2)\, x(t-\tau/2)\, dt.$$

For cycloergodicity, we need a new tool involving time averages to compare with the time-varying ensemble-average autocorrelation. This is the fraction-of-time (FOT) expectation, also called the multiple-sine-wave extractor (or similar), denoted here by $E^{\{\alpha\}}[\cdot]$. This operator returns all finite-strength additive sine-wave components in its argument. We’ll have a post or two in the (far?) future on fraction-of-time probability. Accept for now the claim that the FOT expectation really is an expected-value operation, which means somewhere lurking in here is a new kind of cumulative distribution function and associated probability density function (The Literature [R1]).
For now, a couple of examples should suffice to illustrate the idea. The FOT mean value of a single sine wave is just that sine wave,

$$E^{\{\alpha\}}\left[A\cos(2\pi f_0 t + \phi_0)\right] = A\cos(2\pi f_0 t + \phi_0).$$

(Compare that with the usual temporal average applied to a sine wave, which is zero.)
Suppose we have a periodic function $p(t)$, such as a single sine wave, a square wave, a radar signal, etc., with period $T_0$. Then we know that we can represent $p(t)$ in a Fourier series,

$$p(t) = \sum_{k=-\infty}^{\infty} c_k\, e^{i2\pi k t/T_0}.$$

The FOT sine-wave extractor is linear, so that again we have

$$E^{\{\alpha\}}\left[p(t)\right] = \sum_{k=-\infty}^{\infty} c_k\, e^{i2\pi k t/T_0} = p(t).$$

Suppose some periodic signal $p(t)$ is embedded in noise and interference $w(t)$ that does not possess any periodic component,

$$y(t) = p(t) + w(t).$$

Then

$$E^{\{\alpha\}}\left[y(t)\right] = p(t).$$
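As a crude finite-data stand-in for the sine-wave extractor, synchronized (period-folded) averaging pulls a periodic component out of additive noise; the square-wave signal, noise level, and period below are arbitrary choices for illustration:

```python
# Sketch: recover a periodic component buried in noise by folding the data modulo the
# period and averaging -- a finite-data approximation of extracting E^{alpha}[y(t)].
import numpy as np

rng = np.random.default_rng(7)
period, num_periods = 50, 4000
t = np.arange(period * num_periods)
p = np.sign(np.sin(2 * np.pi * t / period))        # a square wave with period 50 samples
y = p + rng.normal(scale=2.0, size=t.size)         # periodic signal plus strong noise

folded = y.reshape(num_periods, period).mean(axis=0)   # synchronized average over periods
print(np.max(np.abs(folded - p[:period])))             # small: the square wave is recovered
```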
To check the cycloergodic property of a random process, we check the ensemble-average mean value against the FOT mean value,

$$E\left[X(t)\right] \overset{?}{=} E^{\{\alpha\}}\left[x(t)\right].$$

Note that in this check, the ensemble average on the left and the temporal (non-stochastic) average on the right can both be functions of time $t$. We also check the ensemble-average second moment against the FOT second moment,

$$E\left[X(t+\tau/2)X(t-\tau/2)\right] \overset{?}{=} E^{\{\alpha\}}\left[x(t+\tau/2)\, x(t-\tau/2)\right].$$

If these averages match, and the process is cyclostationary, then we say the process is a cycloergodic process.
From this point of view, then, the random process corresponding to the PAM ensemble with a uniform random phase variable $\Phi$ is stationary, ergodic, and non-cycloergodic. The PSDs obtained from ensemble averaging match those obtained from the sample paths (almost surely), and the ensemble and temporal means are zero. We end up with strong observable (sample-path-based) cyclic features, but no cyclic features in the stochastic domain, as illustrated in Figure 9.

On the other hand, if we perform the same operations on the ensemble and sample paths using the ensemble without phase randomization, we see that the cyclic features match (Figure 10). This process is cyclostationary, nonergodic, and cycloergodic.

The properties of the two PAM random processes we have studied in this post are summarized in Table 1.
| Property | PAM with no Symbol-Clock Phase ($\Phi = 0$) | PAM with Random Symbol-Clock Phase ($\Phi$ uniform over one symbol interval) |
| --- | --- | --- |
| Zero Mean Process? | Yes | Yes |
| Stationary Process? | No | Yes |
| Cyclostationary Process? | Yes | No |
| Ergodic Process? | No | Yes |
| Cycloergodic Process? | Yes | No |
| Stationary Sample Paths? | No | No |
| Cyclostationary Sample Paths? | Yes | Yes |
This was a complex, difficult post. I’m sure there are errors. If you find one, please let me know in the Comments.
Significance of Random Processes in CSP
The theory of cyclostationary signals can be formulated in terms of conventional non-stationary random processes, and so from that point of view, random processes are central to CSP. To bridge the gap between a random process (infinite collection of time-series of infinite durations together with a set of probability density functions for all possible collections of random variables plucked from the process) and a signal (one infinite-time member of the ensemble), we need to invoke a kind of ergodicity called cycloergodicity.
A good way to achieve cycloergodicity is to refrain from adding random variables to the random process model that attempt to account for what are simply unknown constants in a given sample path. For instance, refrain from adding random symbol-clock phases and random carrier phases. This will render the process non-stationary (cyclostationary), but will also lead to cycloergodicity. Include random variables in the random process that are models of quantities that also randomly vary in a single sample path–such as transmitted symbols in a PSK or QAM signal.
It is possible to formulate a theory of cyclostationary signals that completely sidesteps the creation of a random process (ensemble plus probability rules). What is needed is a way to characterize the random (erratic, unpredictable) behavior of a signal using only infinite-time averages. This leads to the idea of a fraction-of-time probability to take the place of the ensemble probability. We’ll look at that in a future CSP post.
Previous SPTK Post: Examples of Random Variables Next SPTK Post: The Sampling Theorem