The Cycle Detectors

Let’s take a look at a class of signal-presence detectors that exploit cyclostationarity and in doing so illustrate the good things that can happen with CSP whenever cochannel interference is present, or noise models deviate from simple additive white Gaussian noise (AWGN). I’m referring to the cycle detectors, the first CSP algorithms I ever studied.

Cycle detectors are signal-presence detectors. The basic problem of interest is a binary hypothesis-testing problem, typically formulated as

\displaystyle H_1: x(t) = s(t) + w(t), \hfill (1)

\displaystyle H_0: x(t) = w(t), \hfill (2)

where s(t) is the signal to be detected, if present, and w(t) is white Gaussian noise. We’ll look at some variations on these hypotheses later in the post.

The idea is to construct a signal processor that operates on the received data x(t) to produce a decision about the presence or absence of the signal of interest (“signal to be detected”) s(t). Such processors usually produce a real number Y that is generally much different on H_1 than it is on H_0. The common case is that Y is relatively large on H_1 and relatively small on H_0, but that isn’t required: Y could be consistently small on H_1 and large on H_0.

A typical mathematical approach to this decision-making problem is to model the signals s(t) and w(t) so that their probabilistic structures are simple and easy to manipulate mathematically. This has led to the very common model in which s(t) is a stationary random process that is statistically independent of the stationary random process w(t), which is itself Gaussian and white (it is additive white Gaussian noise [AWGN]). Further simplifications can be had in some cases by assuming that the average power of s(t) is much smaller than that for w(t), which is the weak-signal assumption common in signal surveillance and cognitive-radio settings.

Of course, this is the CSP Blog, so we’ll be modeling the signals of interest as cyclostationary random processes, and by doing so we’ll be able to obtain detectors that are noise- and interference-tolerant.

Detectors for Stationary Signal Models

Throughout this post we are concerned with detecting signals on the basis of their gross statistical nature. This idea contrasts with another, often successful, approach that is based on exploiting some known segment of the waveform. For example, a signal may periodically transmit a known sequence (or one of a small number of known sequences) so that the intended receiver can estimate the propagation channel and compensate for it (equalization), or so that the receiver can perform low-cost reliable synchronization tasks. In this post, we assume we have no such “known-signal components” to detect. An unintended receiver can detect the signal of interest by performing matched filtering using these known components–so I’m saying that matched filtering is not applicable here.

For a signal that is modeled as stationary, a gross statistical characteristic is its power spectral density (PSD) or its average power (the integral of the signal’s PSD). Detectors that attempt to decide between H_1 and H_0 on the basis of power or energy are called energy detectors or radiometers.

A simple energy detector is just the sum of the magnitude-squared values of the observed signal samples,

\displaystyle Y_{ED} = \sum_{j=1}^N \left| x(j) \right|^2. \hfill (3)

This detector does not take into account the distribution of the signal’s energy in the time or frequency domains—it’s just raw energy. It can be highly effective and has low computational cost, but it suffers greatly when the noise or the signal has time-varying behavior such as that caused by time-variant propagation channels, interference, or background noise.
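Here is a minimal sketch of (3), assuming NumPy and a complex-valued sampled signal. The unit-power complex white Gaussian noise, the random ±1 stand-in for s(t), and the name `energy_detector` are all illustrative assumptions, not anything taken from the simulations later in the post.

```python
import numpy as np

def energy_detector(x):
    """Y_ED of Eq. (3): sum of the magnitude-squared samples."""
    return float(np.sum(np.abs(np.asarray(x)) ** 2))

rng = np.random.default_rng(0)
N = 4096

# Assumed noise model: unit-power complex white Gaussian noise.
w = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
# Hypothetical stand-in for s(t): random +/-1 samples, 6 dB below the noise power.
s = 0.5 * rng.choice([-1.0, 1.0], size=N)

y_h0 = energy_detector(w)       # noise only
y_h1 = energy_detector(s + w)   # signal plus noise
print(y_h0 / N, y_h1 / N)       # dividing by N turns energy into average power
```

The statistic is larger on H_1 only because the total power is larger, so anything else that raises the power (stronger noise, an interferer) would move it in the same way.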

The energy and power of a signal are related by a scale factor that is equal to the temporal duration of the measurement (N above). That is, power is energy per unit time. So we can talk about energy detection or power detection, and they are pretty much the same thing. Another way to get at the power of the signal is to integrate the PSD,

\displaystyle Y_{ED} = \int \hat{S}_x^0(f) \, df, \hfill (4)

where \hat{S}_x^0(f) is an estimate of the signal PSD S_x^0(f). If the signal is oversampled (relative to its Nyquist rate), then the PSD estimate will correspond to a frequency range that contains some noise-only intervals, typically the intervals near the edges. The power from those noise-only frequency intervals will be included in Y_{ED} along with the power from the signal-plus-noise interval, which degrades the statistic in proportion to the amount of oversampling.
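To see that (3) and (4) measure the same quantity up to the factor N, here is a small sketch that uses the raw periodogram as the PSD estimate and a sampling rate normalized to one. Both choices are assumptions made for illustration; any reasonable PSD estimator could be substituted.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 1024
# Assumed test data: unit-power complex white Gaussian noise.
x = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)

# Periodogram as a crude PSD estimate (sampling rate normalized to 1).
psd_hat = np.abs(np.fft.fft(x)) ** 2 / N
df = 1.0 / N                                # width of one frequency bin

power_from_psd = np.sum(psd_hat) * df       # discrete version of Eq. (4)
power_from_time = np.mean(np.abs(x) ** 2)   # Eq. (3) divided by N
print(power_from_psd, power_from_time)
```

The two numbers agree because of Parseval's relation for the DFT.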

In contrast to the simple ED, the optimal energy detector (for a signal that is weak relative to the noise) weights the estimated PSD by the true one for s(t), effectively de-emphasizing those noise-only intervals, and emphasizing those intervals throughout the signal’s band having the larger signal-to-noise ratios,

\displaystyle Y_{OED} = \int \hat{S}_x^0(f) S_s^0(f) \, df. \hfill (5)

Y_{OED} is sometimes called the optimum radiometer.

When the exact form of the PSD for s(t) is not known (perhaps the carrier frequency is only roughly known, or the pulse-shaping function is not known in advance), the ideal PSD S_s^0(f) can be replaced by the PSD estimate, forming the detection statistic

\displaystyle Y_{SED} = \int \hat{S}_x^0(f)^2 \, df. \hfill (6)

I call this detector the suboptimal energy detector (SED).
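The two weighted statistics (5) and (6) can be sketched as follows. This is only an illustration under stated assumptions: the PSD estimate is a frequency-smoothed periodogram, the signal is a hypothetical band-limited Gaussian process whose ideal PSD is a rectangle, and the names `smoothed_psd`, `oed`, and `sed` are mine.

```python
import numpy as np

def smoothed_psd(x, width=33):
    """Frequency-smoothed periodogram (sampling rate 1).
    `width` is an odd number (>= 3) of FFT bins, averaged with wrap-around."""
    p = np.abs(np.fft.fft(x)) ** 2 / len(x)
    h = width // 2
    padded = np.concatenate([p[-h:], p, p[:h]])
    return np.convolve(padded, np.ones(width) / width, mode="valid")

def oed(psd_hat, psd_ideal, df):
    """Eq. (5): measured PSD weighted by the ideal signal PSD."""
    return float(np.sum(psd_hat * psd_ideal) * df)

def sed(psd_hat, df):
    """Eq. (6): measured PSD weighted by itself."""
    return float(np.sum(psd_hat ** 2) * df)

rng = np.random.default_rng(2)
N = 8192
df = 1.0 / N
f = np.fft.fftfreq(N)

def cwgn(n):
    """Unit-power complex white Gaussian noise (assumed noise model)."""
    return (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)

# Hypothetical signal: Gaussian noise band-limited to |f| < 0.1, so that its
# ideal PSD is 1 inside that band and 0 outside it.
psd_ideal = (np.abs(f) < 0.1).astype(float)
s = np.fft.ifft(np.fft.fft(cwgn(N)) * np.sqrt(psd_ideal))
w = cwgn(N)

results = {}
for name, x in (("H0", w), ("H1", s + w)):
    p = smoothed_psd(x)
    results[name] = (oed(p, psd_ideal, df), sed(p, df))
print(results)
```

Because the ideal PSD is zero outside the signal band, the OED ignores the noise-only frequencies entirely, whereas the SED still accumulates the squared noise floor there.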

Detectors for Cyclostationary Signal Models (Cycle Detectors)

The various detectors obtained through mathematical derivation using a cyclostationary (rather than stationary) signal model are collectively referred to as cycle detectors. These detectors can be derived in a variety of ways. Perhaps the most familiar is through likelihood analysis, where a likelihood function is maximized. See The Literature ([R7], [R65]) and My Papers ([4]) for derivations.

The optimum weak-signal detector structure is called the optimum multicycle detector, and it is expressed as the sum of individual terms that contain correlation operations between measured and ideal spectral correlation functions,

\displaystyle Y_{OMD} \propto \Re \sum_\alpha \int \hat{S}_x^\alpha (f) S_s^\alpha(f)^* \, df. \hfill (7)

So we sum up the complex-valued correlations between the measured and ideal spectral correlation functions for all cycle frequencies \alpha exhibited by s(t). A single term from the optimum multicycle detector is the optimum single-cycle detector,

\displaystyle Y_{OSD} \propto \left| \int \hat{S}_x^\alpha (f) S_s^\alpha(f)^* \, df \right|. \hfill (8)

The suboptimal versions of the multicycle and single-cycle detectors replace the ideal spectral correlation function with the measured spectral correlation function, essentially measuring the energy in the measured spectral correlation function for one (single-cycle) or more (multicycle) cycle frequencies. So the suboptimal single-cycle detector is

\displaystyle Y_{SSD} \propto \int \left| \hat{S}_x^\alpha(f) \right|^2 \, df. \hfill (9)
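A rough sketch of (9) follows, assuming the spectral correlation function is estimated with a frequency-smoothed cyclic periodogram. The estimator choice, the smoothing width, the rectangular-pulse baseband BPSK test signal, and the off-grid cycle frequency 0.0837 are all assumptions made for illustration; they are not the settings used for the results in this post.

```python
import numpy as np

def smoothed_scf(x, alpha, width=257):
    """Frequency-smoothed cyclic periodogram: an estimate of the non-conjugate
    spectral correlation function at cycle frequency `alpha` (sampling rate 1)."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    n = np.arange(N)
    # Frequency-shift the data so the FFTs give X(f + alpha/2) and X(f - alpha/2).
    upper = np.fft.fft(x * np.exp(-1j * np.pi * alpha * n))
    lower = np.fft.fft(x * np.exp(+1j * np.pi * alpha * n))
    cyclic_pgram = upper * np.conj(lower) / N
    h = width // 2  # width is assumed odd and >= 3
    padded = np.concatenate([cyclic_pgram[-h:], cyclic_pgram, cyclic_pgram[:h]])
    return np.convolve(padded, np.ones(width) / width, mode="valid")

def ssd(x, alpha, width=257):
    """Eq. (9): energy of the measured SCF over frequency."""
    scf_hat = smoothed_scf(x, alpha, width)
    return float(np.sum(np.abs(scf_hat) ** 2) / len(x))

rng = np.random.default_rng(3)
N, T0 = 16384, 10
bits = rng.choice([-1.0, 1.0], size=N // T0 + 1)
s = np.repeat(bits, T0)[:N]   # assumed test signal: rectangular-pulse BPSK at baseband
w = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)

y_h1 = ssd(s + w, alpha=1 / T0)      # true cycle frequency, signal present
y_h0 = ssd(w, alpha=1 / T0)          # same cycle frequency, noise only
y_false = ssd(s + w, alpha=0.0837)   # a cycle frequency the signal does not exhibit
print(y_h1, y_h0, y_false)
```

The optimum single-cycle detector (8) would use the same `smoothed_scf` output, but correlate it against an ideal spectral correlation function instead of against itself.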

However, the multicycle detector is more subtle. Even if we knew the formula for the ideal spectral correlation function for the modulation type possessed by s(t), we’d still have a problem with the coherent sum in (7). The problem is that each term in the sum is a complex number whose phase depends on the phases of the values (over frequency f) of the estimated and ideal spectral correlation functions. These phases are sensitive to the symbol-clock phase and carrier phase of the signal. In other words, the derived detector structure uses the assumed synchronization (timing) parameters for the signal s(t) exactly as it is assumed in the H_1 hypothesis. If we use the proper form of the spectral correlation function, but the synchronization/timing parameters used in creating the ideal functions differ from those associated with the observed signal, the complex-valued terms in the multicycle sum can destructively–rather than constructively–add. This degrades the detector performance.

We’re in the unfortunate position of estimating timing parameters for a signal we have not yet detected.

So, the suboptimal version of the multicycle detector sums the magnitudes of the individual terms, rather than summing the complex-valued terms. This obviates the need for high-quality estimates of the synchronization parameters of the signal.
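A toy calculation shows the difference between the coherent and incoherent sums. The two correlation values below are made-up numbers standing in for the integrals in (7) at the cycle frequencies ±1/T_0, and the only fact used is that delaying a signal by t_0 multiplies its spectral correlation function at cycle frequency α by exp(-j2παt_0).

```python
import numpy as np

T0 = 10.0
alphas = np.array([-1.0, 1.0]) / T0
# Hypothetical per-cycle-frequency correlations when the assumed symbol timing
# matches the observed signal: both terms are real and positive.
terms_matched = np.array([0.6, 0.6], dtype=complex)

def coherent(terms):
    """Real part of the complex sum, as in Eq. (7)."""
    return float(np.real(np.sum(terms)))

def incoherent(terms):
    """Sum of magnitudes, as in the suboptimal multicycle detector."""
    return float(np.sum(np.abs(terms)))

# A quarter-symbol timing error rotates the two terms by -90 and +90 degrees.
t0 = T0 / 4
terms_offset = terms_matched * np.exp(-2j * np.pi * alphas * t0)

print(coherent(terms_matched), incoherent(terms_matched))
print(coherent(terms_offset), incoherent(terms_offset))
```

With the timing error, the coherent sum collapses to zero while the sum of magnitudes is unchanged, which is the robustness the incoherent version buys.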

Finally, let’s consider the delay-and-multiply detectors. These are detectors that use a simple delay-and-multiply device to generate a sine wave. Then the presence or absence of the sine wave is detected by examining the power in a small band of frequencies centered at the frequency of the generated sine wave (The Literature [R66], My Papers [3]).

A delay-and-multiply (DM) detector can operate with a regenerated sine-wave frequency of zero, or with some other frequency that is dependent on the particular modulation type and modulation parameters employed by s(t). For example, DSSS signals can be detected by using a quadratic nonlinearity (delay-and-multiply, say) to generate a sine wave with frequency equal to the chipping rate. Such a detector is called a chip-rate detector. For most signal types of interest to us here on the CSP Blog, a delay of zero is a good choice, as it tends to maximize the strength of the generated sine wave.
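A minimal delay-and-multiply sketch follows, measuring the regenerated sine wave in a single frequency bin rather than a small band. The test signal is an assumption: rectangular-pulse BPSK at baseband, for which a zero delay would produce only a constant (the square of a ±1 waveform is 1), so the example uses a half-symbol delay to regenerate the symbol-rate sine wave. The function name and parameters are illustrative.

```python
import numpy as np

def delay_multiply_detector(x, alpha, delay=0):
    """Magnitude of the sine-wave component at frequency `alpha`
    in the product x(n) x*(n - delay)."""
    x = np.asarray(x, dtype=complex)
    d = int(delay)
    product = x[d:] * np.conj(x[: len(x) - d])
    n = np.arange(d, len(x))
    return float(np.abs(np.mean(product * np.exp(-2j * np.pi * alpha * n))))

rng = np.random.default_rng(4)
N, T0 = 16384, 10
bits = rng.choice([-1.0, 1.0], size=N // T0 + 1)
s = np.repeat(bits, T0)[:N]   # assumed test signal: rectangular-pulse BPSK
w = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)

y_h1 = delay_multiply_detector(s + w, alpha=1 / T0, delay=T0 // 2)
y_h0 = delay_multiply_detector(w, alpha=1 / T0, delay=T0 // 2)
print(y_h1, y_h0)
```

For a bandlimited pulse such as the square-root raised-cosine used later, the zero-delay choice recommended above would be the natural setting for `delay`.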

Illustration Using Simulated Signals and Monte Carlo Simulations

We will illustrate the performance and capabilities of the various detector structures using a textbook BPSK signal so that we can control all aspects of the signal, noise, and detectors. The signal uses independent and identically distributed equi-probable symbols (bits) and a pulse-shaping function that is square-root raised-cosine with roll-off parameter of 1.0.

The BPSK signal has a symbol rate of f_{sym} = 1/T_0 = 1/10 (normalized units) and a carrier frequency of f_c = 0.05. So it is similar to our old friend the textbook rectangular-pulse BPSK signal, but with a more realistic pulse-shaping function.

Our BPSK signal has non-conjugate cycle frequencies of \alpha = k/10 = kf_{sym} and conjugate cycle frequencies of \alpha = 2f_c + kf_{sym} = 0.1 + k/10, all for k \in \{-1, 0, 1\}. The measured spectral correlation function is shown here:


Let’s look at a few signal environment variations, and also introduce a pre-processing step called spectral whitening along the way.

In each simulation, I consider a wide range of inband signal-to-noise ratios (SNRs). By inband I mean that the SNR is the ratio of the signal power to the power of the noise in the signal bandwidth. This is typically a more meaningful SNR for CSP algorithms than the total SNR, which is simply the signal power divided by the noise power in the sampling bandwidth (the noise power in the analysis band).
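As a worked example of the difference between the two SNR definitions, suppose the signal bandwidth is taken to be the occupied bandwidth of the square-root raised-cosine pulse, (1 + roll-off) times the symbol rate. That bandwidth definition is my assumption here, as is the unit sampling rate.

```python
import math

f_sym = 0.1      # symbol rate (normalized)
rolloff = 1.0    # SRRC roll-off
fs = 1.0         # sampling rate, so the sampling bandwidth is 1

# Assumed definition of "signal bandwidth": occupied SRRC bandwidth.
signal_bw = (1.0 + rolloff) * f_sym

# White noise spreads evenly over the sampling bandwidth, so the inband noise
# power is the fraction signal_bw / fs of the total noise power.
offset_db = 10.0 * math.log10(fs / signal_bw)

def total_snr_db(inband_snr_db):
    """Total SNR corresponding to a given inband SNR, for white noise."""
    return inband_snr_db - offset_db

print(offset_db, total_snr_db(-11.0))
```

Under these assumptions the inband SNR exceeds the total SNR by about 7 dB, so an inband SNR of -11 dB corresponds to a total SNR near -18 dB.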

For each set of simulation parameters (SNR, interference, etc.), I use 1000 Monte Carlo trials on each of H_1 and H_0. The result of each trial is one detector output value for each simulated detector. I store these numbers, then analyze them to estimate the probabilities of detection P_D and false alarm P_{FA}.

The detection probability is defined as

\displaystyle P_D = \mbox{\rm Prob} \left[ Y > \eta | H_1 \right], \hfill (10)

and the false-alarm probability is

\displaystyle P_{FA} = \mbox{\rm Prob} \left[ Y > \eta | H_0 \right], \hfill (11)

where \eta is the detection threshold. I won’t be talking in this post about how to choose a threshold. Many researchers and engineers want to plug into a formula that provides some kind of optimum threshold, balancing P_D and P_{FA}, but in my experience such formulas are only possible in highly simplified problems, and must be adjusted using measurement. I suppose one could call them textbook thresholds.
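Given stored detector outputs from the two sets of trials, the probabilities in (10) and (11) can be estimated by counting threshold crossings. The sketch below is one way to do that; the Gaussian placeholder outputs are purely illustrative, and the empirical-quantile threshold in `pd_at_pfa` is just a device for evaluating stored trials at a fixed false-alarm rate, not a recommendation for setting thresholds in practice.

```python
import numpy as np

def roc(y_h1, y_h0):
    """Empirical (P_FA, P_D) pairs of Eqs. (10)-(11), one per candidate threshold."""
    y_h1, y_h0 = np.asarray(y_h1), np.asarray(y_h0)
    etas = np.sort(np.concatenate([y_h1, y_h0]))[::-1]   # sweep from high to low
    p_d = np.array([np.mean(y_h1 > eta) for eta in etas])
    p_fa = np.array([np.mean(y_h0 > eta) for eta in etas])
    return etas, p_fa, p_d

def pd_at_pfa(y_h1, y_h0, p_fa=0.05):
    """P_D at the threshold that the H_0 outputs exceed with probability about p_fa."""
    eta = np.quantile(y_h0, 1.0 - p_fa)
    return float(np.mean(np.asarray(y_h1) > eta)), float(eta)

rng = np.random.default_rng(5)
# Placeholder detector outputs (Gaussian, illustrative only), 1000 trials each.
y_h0 = rng.normal(0.0, 1.0, 1000)
y_h1 = rng.normal(3.0, 1.0, 1000)

etas, p_fa, p_d = roc(y_h1, y_h0)
p_d_05, eta_05 = pd_at_pfa(y_h1, y_h0, p_fa=0.05)
print(p_d_05, eta_05)
```

Plotting `p_d` against `p_fa` gives the ROC, and repeating `pd_at_pfa` over a range of SNRs gives a curve of P_D versus SNR at fixed P_{FA}.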

Baseline Simulation: Constant-Power BPSK in Constant-Power AWGN

So here the BPSK signal has the same power on each trial (on H_1), and the additive white Gaussian noise has the same power on each trial (on both H_1 and H_0). The bits that drive the BPSK modulator are chosen independently each trial as is the noise sequence.

Let’s first look at histograms of the obtained detector output values. Here is a typical one, corresponding to an inband SNR of -11 dB and a block length (observation interval length or processing length or data-record length, same thing) of about 1640 samples:


Here I am just showing three detectors. The first is the optimal energy detector (OED) described above; its statistics are shown in red. The second is the incoherent optimal multicycle detector (IOMD), where the “incoherent” word just means that we add the magnitudes of the terms in the optimal MCD. The final detector shown here is the incoherent suboptimal multicycle detector (ISMD), which is what we described above as simply the suboptimal multicycle detector.

Notice that the distributions (histogram curves) for each detector are nearly separate for the two hypotheses H_1 and H_0. This means good detection performance can be had by choosing a threshold \eta anywhere in the gap between the two curves.

Exactly how does the performance depend on the selection of the threshold \eta, especially when the two histograms for the detector output overlap? This is captured in the receiver operating characteristic (ROC), which plots P_D versus P_{FA}. That is, each value of \eta produces a pair (P_D(\eta), P_{FA}(\eta)). For the histograms above, here are the ROCs (for all the considered detectors)


There are a few things to notice about this set of ROCs. First, the OED is the best-performing detector because its ROC is nearly a right angle at (0, 1), meaning we can achieve a P_D of nearly 1 at a very small P_{FA}. Second, the IOMD (using all cycle frequencies except non-conjugate zero) is very nearly as good as the OED. Third, the OSDs for the features related to the symbol rate 1/T_0 perform similarly to one another and are all better than the corresponding SSDs, which are themselves similar to one another. Finally, the DM for \alpha = 0 and the ISMD for cycle frequencies that are not exhibited by the data are the worst-performing.

So in this benign environment with a constant-power signal in constant-power noise, energy detection reigns supreme. If we look at the ROCs for several SNRs and a constant block length, we can extract useful graphs by fixing P_{FA} and plotting P_D. Let’s fix P_{FA} at 0.05 and see what P_D is for the various detectors:


The performance ordering is maintained as the SNR increases: OED, IOMD, 2f_c SDs, all the other SDs, then the DM, and finally the ISMD with false cycle frequencies. All is well except that last one. As the SNR increases, the value of P_D for P_{FA} = 0.05 for the false-CF ISMD approaches one. So we are reliably detecting a signal that is not actually present!

Why is this? If we recall the post on the resolution product, we may remember that the variance of a spectral correlation estimator is inversely proportional to the time-frequency resolution product \Delta t \ \Delta f of the measurement, but it is also proportional to the ideal value of the noise spectral density N_0. This just means that the variance of the measurement is affected by the measurement parameters as well as how much non-signal energy is present. We can always overcome high noise by increasing the resolution product.

In the case of using false cycle frequencies, the “noise” component on H_1 is the combination of our signal s(t) and the noise itself. So on H_1, the value of our ISMD statistic is greater than it is on H_0, just because there is more “noise” present on H_1 than on H_0. We could confirm this by repeating the experiment where

H_1: x(t) = w_1(t), \hfill (12)

H_0: x(t) = w_2(t), \hfill (13)

where the spectral density for w_1(t) is greater than that for w_2(t). (If you do the experiment, let me know.)
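A rough sketch of that noise-versus-noise experiment is below. Everything in it is an assumption made for illustration: the frequency-smoothed cyclic-periodogram estimator, the arbitrary cycle frequency 0.137, the 3 dB power difference, the block length, and the number of trials. It is not a reproduction of the results in this post.

```python
import numpy as np

def ssd(x, alpha, width=33):
    """Energy over frequency of a frequency-smoothed cyclic periodogram at `alpha`."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    n = np.arange(N)
    upper = np.fft.fft(x * np.exp(-1j * np.pi * alpha * n))   # X(f + alpha/2)
    lower = np.fft.fft(x * np.exp(+1j * np.pi * alpha * n))   # X(f - alpha/2)
    cp = upper * np.conj(lower) / N
    h = width // 2  # width is assumed odd and >= 3
    padded = np.concatenate([cp[-h:], cp, cp[:h]])
    scf_hat = np.convolve(padded, np.ones(width) / width, mode="valid")
    return float(np.sum(np.abs(scf_hat) ** 2) / N)

rng = np.random.default_rng(6)
N, trials, alpha_false = 2048, 200, 0.137

def cwgn(n, power):
    """Complex white Gaussian noise with the given average power."""
    return np.sqrt(power / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))

# H_1 is noise w_1 with twice the power of the H_0 noise w_2; no signal anywhere.
y_h1 = np.array([ssd(cwgn(N, 2.0), alpha_false) for _ in range(trials)])
y_h0 = np.array([ssd(cwgn(N, 1.0), alpha_false) for _ in range(trials)])

eta = np.quantile(y_h0, 0.95)        # threshold giving P_FA of about 0.05
p_d = float(np.mean(y_h1 > eta))
print(p_d)
```

In this sketch the statistic scales roughly with the square of the noise spectral density, so the stronger noise alone pushes nearly every H_1 output above the threshold, even though no cyclostationary signal is present.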

One way around this problem is to spectrally whiten the data prior to applying our detectors. Here, spectral whitening means applying a linear time-invariant filter to the data. The outcome of the filtering yields a signal whose spectral density is a constant over all frequencies in the sampling bandwidth. So, if a data block has a (measured) PSD of S(f), then the transfer function for the whitening filter H_w(f) is given by

\displaystyle H_w(f) = \frac{1}{(S(f))^{1/2}}, \hfill (14)

which follows from elementary random-process theory for the spectral densities of the input and output of a linear time-invariant system.
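One simple way to apply (14) to a block of data is to estimate S(f) with a smoothed periodogram and divide in the frequency domain. The smoothing width, the circular (FFT-based) filtering, and the test signal below are assumptions for illustration.

```python
import numpy as np

def smooth(p, width):
    """Circular moving average over `width` bins (odd, >= 3)."""
    h = width // 2
    padded = np.concatenate([p[-h:], p, p[:h]])
    return np.convolve(padded, np.ones(width) / width, mode="valid")

def whiten(x, width=65):
    """Apply H_w(f) = 1/sqrt(S(f)) of Eq. (14), with S(f) estimated from the block itself."""
    X = np.fft.fft(x)
    psd_hat = smooth(np.abs(X) ** 2 / len(x), width)
    return np.fft.ifft(X / np.sqrt(psd_hat))

def psd_spread_db(v, width=65):
    """Ratio of the largest to smallest smoothed-PSD value, in dB."""
    p = smooth(np.abs(np.fft.fft(v)) ** 2 / len(v), width)
    return float(10 * np.log10(p.max() / p.min()))

rng = np.random.default_rng(7)
N, T0 = 16384, 10
# Assumed test data: strong rectangular-pulse BPSK plus unit-power white noise.
s = 2.0 * np.repeat(rng.choice([-1.0, 1.0], size=N // T0 + 1), T0)[:N]
w = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
x = s + w

x_white = whiten(x)
print(psd_spread_db(x), psd_spread_db(x_white))
```

The filter has zero phase, so it changes only the magnitude of the spectrum; how flat the result is depends on how well the smoothed periodogram tracks the true PSD.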

If we apply whitening to the data on a trial-by-trial basis, we obtain the following performance curves for the baseline case:



Now we see that the performance ordering has changed, and that the false-CF ISMD does not tell us a non-existent signal is present as the actual signal’s SNR increases. Spectral whitening is also useful when inband narrowband interference is present, for much the same reasons as we’ve outlined above.

The spectral whitening is not perfect. The OED begins to detect the signal as the SNR gets large due to this imperfection.

Finally, we note that the use of spectral whitening as a data pre-processing step means that the spectral correlation function estimates used in the various detectors are actually spectral coherence estimates. Coherence strikes again!

Variation: Constant-Power BPSK in Constant-Power AWGN with Variable-Power Interference

The interferer is QPSK and has variable (from trial to trial) carrier frequency, power, and symbol rate. It is present on both H_1 and H_0. Moreover, the randomly chosen interferer carrier frequency is such that the two signals always spectrally overlap, so no linear time-invariant preprocessing step could separate the signals. A typical spectral correlation plot for the combination of the two signals is shown here:


Notice that the two signals cannot be distinguished in the PSD. Relative to the spectral correlation plot for BPSK alone, we see the additional non-conjugate feature that corresponds to the QPSK interferer.

The actual hypotheses for this variation can be expressed as

H_1: x(t) = s(t) + i(t) + w(t), \hfill (15)

H_0: x(t) = i(t) + w(t). \hfill (16)

The QPSK interferer has random power level that is uniformly distributed in the range [-10, 10] dB. The BPSK signal has a constant power of 0 dB, so the interferer power ranges from a tenth of the BPSK power to ten times the BPSK power. The interferer’s center frequency is restricted to lie in an interval to the right of the BPSK center frequency. Finally, the interferer bandwidth ranges from one half the BPSK bandwidth to twice the BPSK bandwidth.
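The per-trial interferer parameters could be drawn along the following lines. The power range in decibels is as stated above; the uniform distribution for the bandwidth factor is my assumption, since only its range is given, and the center-frequency draw is omitted because its interval is not specified.

```python
import numpy as np

rng = np.random.default_rng(8)
trials = 1000

# Interferer power: uniform in decibels over [-10, 10] dB relative to the BPSK power.
power_db = rng.uniform(-10.0, 10.0, size=trials)
power_lin = 10.0 ** (power_db / 10.0)      # 0.1 to 10 in linear units

# Interferer bandwidth relative to the BPSK bandwidth (uniform draw is an assumption).
bw_factor = rng.uniform(0.5, 2.0, size=trials)

# Amplitude scaling to apply to a unit-power interferer waveform.
amplitude = np.sqrt(power_lin)
print(power_lin.min(), power_lin.max())
```

Note that a draw that is uniform in decibels is not uniform in linear power: half of the trials have the interferer weaker than the BPSK signal and half have it stronger.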

Here are some results for this variation, without the use of spectral whitening:




So here there is no need for spectral whitening, because the false-CF detectors will not generally show detection of a false signal. However, spectral whitening works out well in this kind of case, as we will see next.

Variation: Variable-Power BPSK in Variable-Power Noise and Interference

In this last variation for the textbook SRRC BPSK signal, the signal, interference, and noise all have variable power from trial to trial. Everything else is the same. Here are the results without whitening:



And now with spectral whitening applied to the data on each trial:



So, with or without spectral whitening, when the signal environment is difficult–contains variable cochannel interference and/or variable noise–the cycle detectors are vastly superior to energy detectors.

Illustration Using Collected Signals: WCDMA

I captured 10 minutes of a local WCDMA signal using a (complex) sampling rate of 6.25 MHz. For each trial in the WCDMA Monte Carlo simulations, I randomly choose a data segment from this long captured signal and add noise to it. A typical spectral correlation function plot for the WCDMA data is shown here:


The significant non-conjugate cycle frequencies are 15 kHz, 120 kHz, and 3.84 MHz (the chip rate). There are no detected significant conjugate cycle frequencies for this data. Notice the frequency-selective channel implied by the WCDMA PSD, which is normally flat across its bandwidth. The observed channel fluctuates over time.

Baseline Experiment: WCDMA as Captured in Constant-Power AWGN

The block lengths for the WCDMA experiments are reported in terms of the number of DSSS chips, which have length 1/f_{chip} = 1/3.84 \ \ \mu s, or 0.26 microseconds. Here is the result for an inband SNR of -9 dB and a block length of 40206 symbols or chips, and no spectral whitening:



So in this benign environment, energy detection is far superior to CSP detection, but the cycle detectors definitely work. We again observe the false-CF detection problem.

Variation: WCDMA as Captured in Constant-Power AWGN with Whitening

When spectral whitening is used, we obtain the following ROCs and probabilities:



In this case, the cycle detectors are superior by a few decibels compared to the OED. The SDs for the cycle frequency of 120 kHz are rather strongly affected by the whitening relative to the other SDs and the MDs. I don’t yet have an explanation for that, but it is clear that the real-world (non-textbook) signals are much more complicated than the textbook signals, and application of CSP to the non-textbook signals requires care.

Let me, and the CSP Blog readers, know if you’ve had good or bad experiences with cycle detectors by leaving a comment. And, as always, I would appreciate comments that point out any errors in the post.

17 thoughts on “The Cycle Detectors”

  1. Paul says:

    Is there a conjugate SCF version of Equation (9)? I have not been able to find a good definition in the literature, including [1]. For the non-conjugate SCF, Equation (9) can be normalized as shown in Eqn. (34) in [1], which gives you values between [0,1]. What is the equivalent for the conjugate SCF?

    [1] G. Zivanovic and W. Gardner “Degrees of Cyclostationarity and their Application to Detection and Estimation” In Signal Processing March 1991

    • Is there a conjugate SCF version of Equation (9)?

      Yes, just replace the non-conjugate SCF with the conjugate SCF.

      For the non-conjugate SCF, Equation (9) can be normalized as shown in Eqn. (34) in [1], which gives you values between [0,1]. What is the equivalent for the conjugate SCF?

      The conjugate coherence is presented in the coherence post at Equation (14). I believe you can simply replace the (non-conjugate) SCF in (34) with the conjugate SCF, and proceed with a parallel development, substituting the magnitude-squared conjugate coherence for the magnitude-squared non-conjugate coherence in (35).

      If the idea is to figure out which cycle frequencies to include in some multi-cycle detector, or which single cycle-frequency to use in a single-cycle detector (for a signal that exhibits several cycle frequencies), you might just convert your complex-valued signal to a real-valued one (analytically) and then use the formulas in [1] without modification.

      Does that help?

  2. Paul says:

    Thanks Chad, that definitely helps. I’ll work through the derivation and see if that gets me a properly scaled ([0,1]) conjugate version.
