Comments on “Deep Neural Network Feature Designs for RF Data-Driven Wireless Device Classification,” by B. Hamdaoui et al.

Another post-publication review of a paper that is weak on the ‘RF’ in RF machine learning.

Let’s take a look at a recently published paper (The Literature [R148]) on machine-learning-based modulation recognition to get a data point on how some electrical engineers (these are more on the computer-science side, I believe) use mathematics when they turn to radio-frequency problems. You can guess it isn’t pretty, and that I’m not here to exalt their acumen.

The paper addresses modulation recognition both with and without explicit use of cyclostationary signal processing. The two main ideas put forth are that (1) modulation recognition can be significantly improved if you take into account the out-of-band behavior of the signal and that (2) the use of CSP-based input features can substantially decrease neural-network training time relative to the use of complex-valued time-domain data samples as inputs.

Regarding the out-of-band information, I don’t think there is a serious conceptual difficulty with the idea, but it requires two signal characteristics that are in short supply: a large excess SNR and a large number of empty channels on either side of the signal.

For the CSP material, well, I’ll get into that below, but there are some elementary mathematical misconceptions that result in a highly confusing and misleading exposition, especially for less experienced and/or less mathematically sophisticated researchers.

Let’s take a look.

Introductory Material

In the opening material, we find statements that connect cyclostationarity to the existence of repeated signal components, such as

and

but that isn’t a very good description of the origin of the cyclostationarity property in a signal. Yes, periodic functions are cyclostationary, and if a signal does have a periodic component, that component gives rise to cycle frequencies, but this is what I call trivial cyclostationarity. A better word might be degenerate. The periodic component of the second-order moment (autocorrelation or conjugate autocorrelation) in those cases is entirely due to the products of first-order periodic components. This stands in stark contrast to non-trivial cyclostationarity such as that seen in textbook PSK, GPS DSSS BPSK, CDMA, WCDMA, LTE, etc. The periodic components in those second-order moments are not due to the products of first-order periodic components. This distinction is relevant to [R148] because the signals of interest there include 16QAM, and it is equally relevant to the various textbook-ish PSK and QAM signals found in the DeepSig and CSP-Blog Challenge data sets. More on the authors’ ideas about CSP later, but this characterization of cyclostationarity definitely sets up expectations, and not in a good way.

Aside from exploiting cyclostationarity in their machine-learning framework, the authors also want to include out-of-band emissions from the transmitters:

This is more of a specific-emitter identification problem (My Papers [10,33]), which has been renamed RF fingerprinting these days. We’ll see some of the problems with this approach to RF fingerprinting as we get deeper into the paper.

The authors then assert that most neural-network data-driven modulation classification work uses features extracted “from hardware” or from “protocol information,” whatever that is. In the papers I’ve reviewed (see All BPSK Signals), the researchers typically just put complex-valued (I/Q) samples directly into the neural networks. So this is confusing:

The authors are going to use eleven modulation types, and they look awfully familiar. Except for “B-FM.” The 2016 DeepSig data set uses BPSK, QPSK, 8PSK, 4PAM, 16QAM, 64QAM, BFSK, CPFSK, WB-FM, AM-SSB, and AM-DSB, for a total of eleven modulation types. So presumably these are the same signals, if we identify ‘FSK’ with ‘BFSK’ and ‘B-FM’ with ‘WB-FM.’ So I don’t think I’m going out on a limb if I say these researchers are using the DeepSig 2016 RML data set.

Later they describe the eleven types this way:

Now ‘FSK’ is replaced by ‘GFSK.’

Relating their input signal set to the DeepSig set is important because the authors don’t tell us explicitly about the signals’ parameters, such as pulse shape or excess bandwidth or the nature of any embedded periodically repeating structures such as synchronization sequences, which becomes relevant later in the paper. But if we make the small leap to the DeepSig data-set signals (which I believe are related to the MATLAB Deep Learning toolbox signals), then we know that the signals do not have framing, slotting, or any other kind of periodically repeating structure–they are textbook signals.

Using Out-of-Band Energy to Perform RF Fingerprinting

After the introduction, the authors focus on the out-of-band idea. Different local oscillators and different power amplifier parameters lead to different out-of-band noise levels, and maybe even different statistical properties of those out-of-band signal components. No quarrel here. For the phase-noise effects of different local oscillators, they provide the following figure:

and for the different power-amplifier parameters they show this figure:

which is strange because they aren’t holding the signal power constant in the comparisons. The actual values of the power-amplifier polynomial model coefficients are not provided, just a statement that they are similar to “high-performing” radios:

So if you are receiving a signal in isolation, that is, with no occupied adjacent channels for a bandwidth that is something like eight times the occupied bandwidth of the signal, and you have excess SNR on the order of $40$ dB or so, then, yes, maybe you can use the out-of-band information to discriminate between the different signals.

The authors also don’t seem to understand the constant-modulus property of bandwidth-efficient (that is, non-rectangular-pulse) phase-shift-keyed or quadrature-amplitude-modulated signals:

BPSK is almost never constant modulus; it is constant modulus only when the pulse function is rectangular. For practical RF communication, the pulse is most often a square-root raised-cosine pulse, and so the signal does not have constant modulus. This is relevant because the authors do not employ rectangular-pulse signals in their paper. (Which is good, although I use rectangular-pulse signals extensively throughout the CSP Blog to link all the different estimators, theoretical formulas, resolution issues, etc. But not in data sets such as the Challenge data set.) Truly constant-modulus signals include the continuous-phase modulated signals (My Papers [8]), a large class that includes MSK and GMSK.

CSP Material

Let’s turn to the cyclostationary signal processing.

Not too bad, but most of the action in CSP is not for higher orders, it is for second order. The terminology of ‘higher-order statistics’, ‘higher-order moments’, ‘higher-order cumulants’, ‘higher-order cyclic cumulants,’ etc., was invented to distinguish between ubiquitous second-order parameters on the one hand, and all orders greater than two on the other. And of course, as I’ve already said, the direct connection of cyclostationarity to repeated (periodic) components is misleading.

Then it gets weird:

We see the claims that the time-varying autocorrelation peaks at a delay (lag) $\tau$ that is equal to the “period at which the pattern repeats” and that the cyclic autocorrelation itself peaks only at cycle frequencies $\alpha$ and delays (lags) $\tau$ that “correspond to the repeating pattern of the signal.”

The cyclic autocorrelation magnitude typically has a single maximum at $\tau=0$ (an exception occurs for rectangular-pulse signals, shown below and in several other CSP Blog posts). This can be seen from formulas as well as many of the cyclic-autocorrelation surfaces shown in the correlation gallery post. Let’s insert a few here to save you from clicking over there:

On the other hand, for rectangular-pulse signals, the peaks are usually not at $\tau=0$:

Looking at the non-conjugate cyclic autocorrelation for rectangular-pulse BPSK in Figure 5, the fundamental frequency of the periodic time-varying autocorrelation is $1/10$ (a period of $10$ samples), yet no slice peaks at $\tau = 10$ or its multiples. The support in $\tau$ for all the slices of the cyclic autocorrelation is limited to the interval $[-10, 10]$.
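The rectangular-pulse behavior can be checked numerically. The following is a minimal sketch, not the estimator used to make the figures: it assumes a symbol interval of 10 samples, unit-amplitude noise-free BPSK, and a simple asymmetric-lag estimate with circular shifts (chosen for brevity; it differs from a symmetric-lag estimate only by a phase factor, so the magnitudes are comparable). The function and variable names are hypothetical.

```python
import numpy as np

def cyclic_autocorrelation(x, alpha, lags):
    """Estimate the non-conjugate cyclic autocorrelation at cycle frequency alpha.

    Uses the asymmetric lag product x[n + lag] * conj(x[n]) with circular shifts.
    """
    n = np.arange(len(x))
    kernel = np.exp(-2j * np.pi * alpha * n)
    # np.roll(x, -lag)[n] equals x[(n + lag) mod N]
    return np.array([np.mean(np.roll(x, -lag) * np.conj(x) * kernel) for lag in lags])

# Rectangular-pulse BPSK: each +/-1 symbol is held for 10 samples (assumed rate 1/10)
rng = np.random.default_rng(0)
samples_per_symbol = 10
num_symbols = 50_000
symbols = rng.choice([-1.0, 1.0], size=num_symbols)
x = np.repeat(symbols, samples_per_symbol)

lags = np.arange(-20, 21)
caf = cyclic_autocorrelation(x, alpha=1.0 / samples_per_symbol, lags=lags)
caf_mag = np.abs(caf)
peak_lag = lags[np.argmax(caf_mag)]

print("lag of largest |CAF|:", peak_lag)
print("|CAF| at lag 0:", caf_mag[lags == 0][0])
print("largest |CAF| for |lag| > 10:", caf_mag[np.abs(lags) > 10].max())
```

Under these assumptions, the magnitude for the symbol-rate cycle frequency should be essentially zero at $\tau = 0$, largest near $|\tau| = 5$ (half the symbol interval), and small for $|\tau| > 10$, so nothing in the estimate peaks at the repetition period of $10$.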

We then see a connection drawn between the maxima of the cyclic autocorrelation (fixed $\alpha$, maximize over $\tau$) and the maxima of the spectral correlation function (fixed $\alpha$, maximize over spectral frequency $f$):

and

So what happens if the cyclic autocorrelation $R_x^\alpha(\tau)$ peaks at $\tau = 0$? By the authors’ formula, the spectral correlation function would peak at $f = 1/0$. Probably not right.

I don’t think the maxima of $\displaystyle |R_s^\alpha(\tau)|$ and $\displaystyle |S_s^\alpha(f)|$ are closely linked. Let’s indulge ourselves and allow a brief interlude to look at the connectedness of the peaks in one Fourier domain to those in the other.

Interlude: Maxima in Fourier Transform Pairs

First let’s just consider a cosine waveform

$\displaystyle s(t) = \cos(2 \pi (1/T_0) t) \hfill (1)$

This signal has peaks (maxima) at $t = kT_0$, including $t = 0$. The Fourier transform is

$\displaystyle S(f) = (1/2) \delta(f-(1/T_0)) + (1/2)\delta(f+(1/T_0)) \hfill (2)$

If we introduce a delay $D$ into $s(t)$,

$\displaystyle s(t) = \cos(2 \pi(1/T_0) (t-D)) \hfill (3)$

then the locations of the peaks (maxima) of the sine wave are now $t=kT_0 + D$, and we find that the Fourier transform is still two impulse functions at $f = \pm 1/T_0$, but the weights of the impulses change from $1/2$ to $(1/2) e^{\mp i 2 \pi (D/T_0)}$. So the peaks in frequency $f$ are always in exactly the same place no matter where the peaks of $s(t)$ are in time $t$.
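A quick numerical check of the delayed-cosine claim, as a minimal sketch: the period of 50 samples, the delay of 7 samples, and the record length of 1000 samples are arbitrary assumptions chosen so that the sinusoid falls exactly on an FFT bin.

```python
import numpy as np

num_samples = 1000
period = 50   # T_0 in samples (assumed)
delay = 7     # D in samples (assumed)
t = np.arange(num_samples)

s0 = np.cos(2 * np.pi * t / period)
s_delayed = np.cos(2 * np.pi * (t - delay) / period)

# Scale by 1/N so that an on-bin sinusoid produces impulse weights of magnitude 1/2
S0 = np.fft.fft(s0) / num_samples
S_delayed = np.fft.fft(s_delayed) / num_samples

# Indices of the two largest-magnitude bins for each signal
peak_bins_0 = np.sort(np.argsort(np.abs(S0))[-2:])
peak_bins_delayed = np.sort(np.argsort(np.abs(S_delayed))[-2:])

positive_bin = num_samples // period  # bin corresponding to f = +1/T_0
print("peak bins, no delay:  ", peak_bins_0)
print("peak bins, with delay:", peak_bins_delayed)
print("weight at +1/T_0, no delay:  ", S0[positive_bin])
print("weight at +1/T_0, with delay:", S_delayed[positive_bin])
```

The delay moves the time-domain maxima but leaves the spectral peaks in the same bins; only the complex weight at $f = +1/T_0$ picks up the phase factor $e^{-i 2\pi D/T_0}$ (and its conjugate at $f = -1/T_0$).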

Let’s look at a slightly more relevant signal, the triangle with width $2T$ and height $T$, centered at the time origin $t=0$,

$\displaystyle \mbox{\rm tri}(t/T) = \left\{ \begin{array}{ll} (t+T), & -T \leq t \leq 0 \\ (-t+T), & 0 \leq t \leq T \\ 0, & \mbox{\rm otherwise} \end{array} \right . \hfill (4)$

From our brief study of convolution and the convolution theorem, we know the Fourier transform of this function is the squared sinc, because the triangle here is equal to the convolution of a unit-height $T$-width rectangle with itself, so that

$\displaystyle \mbox{\rm tri}(t/T) \Longleftrightarrow T^2 \mbox{\rm sinc}^2 (fT) \hfill (5)$

Here the time-domain function unambiguously peaks at $t = 0$ and the frequency-domain function unambiguously peaks at $f = 0$. We can force the triangle to peak at $t=D$ by using the delayed signal

$\displaystyle s(t) = \mbox{\rm tri}((t-D)/T) \hfill (6)$

which has Fourier transform

$\displaystyle S(f) = e^{-i2\pi D f} T^2 \mbox{\rm sinc}^2(fT). \hfill (7)$

The peak magnitude is still at $f=0$.
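The same point can be illustrated with a sampled triangle. This is a minimal sketch with assumed values ($T = 20$ samples, $D = 35$ samples, a 1024-sample window); the `triangle` helper is hypothetical.

```python
import numpy as np

def triangle(t, half_width, delay=0.0):
    """Triangle of height half_width and total width 2 * half_width, centered at delay."""
    return np.maximum(half_width - np.abs(t - delay), 0.0)

num_samples = 1024
t = np.arange(num_samples) - num_samples // 2  # time axis centered on zero
T = 20   # assumed half-width in samples
D = 35   # assumed delay in samples

tri0 = triangle(t, T)
triD = triangle(t, T, delay=D)

# Magnitude spectra; the linear phase introduced by any time shift drops out
mag0 = np.abs(np.fft.fft(tri0))
magD = np.abs(np.fft.fft(triD))
freqs = np.fft.fftfreq(num_samples)

print("time of peak, undelayed:", t[np.argmax(tri0)])
print("time of peak, delayed:  ", t[np.argmax(triD)])
print("frequency of peak magnitude, undelayed:", freqs[np.argmax(mag0)])
print("frequency of peak magnitude, delayed:  ", freqs[np.argmax(magD)])
```

The time-domain peak moves from $t = 0$ to $t = D$, while the magnitude spectrum is unchanged and still peaks at $f = 0$, where its value is the area of the triangle, $T^2$, consistent with (5).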

We can easily see this basic behavior in action with the cyclic autocorrelation and its Fourier transform, the spectral correlation function. For a frequency-shifted signal (that is, a signal with a non-zero carrier frequency offset rather than one at complex baseband), the cyclic autocorrelation has the same support in $\tau$ as its non-frequency-shifted counterpart, but the spectral correlation will be centered at $f= f_c$, where $f_c$ is the carrier frequency offset.

So the peak of the non-conjugate spectral correlation function typically is at $f=f_c$ and the peak of the non-conjugate cyclic autocorrelation is at $\tau = 0$ (exception: rectangular-pulse signals). This is independent of the numerical value of the symbol rate (“the frequency of the repeated pattern”). For the conjugate spectral correlation function, the peak almost always occurs at $f=0$, and for the conjugate cyclic autocorrelation function, the peak almost always occurs at $\tau=0$.
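To make the frequency-shift behavior concrete, here is a minimal sketch restricted to the $\alpha = 0$ slices, that is, the ordinary autocorrelation and the power spectrum. It assumes rectangular-pulse BPSK with 10 samples per symbol, a carrier offset of $f_c = 0.05$ (normalized), circular-shift correlation estimates, and an averaged periodogram with 200-sample blocks. All names and parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
samples_per_symbol = 10
num_symbols = 20_000
fc = 0.05  # assumed normalized carrier frequency offset

# Baseband rectangular-pulse BPSK and a frequency-shifted copy
x = np.repeat(rng.choice([-1.0, 1.0], size=num_symbols), samples_per_symbol)
n = np.arange(len(x))
y = x * np.exp(2j * np.pi * fc * n)

def autocorrelation_magnitude(sig, lags):
    """Magnitude of the lag product average, using circular shifts."""
    return np.array([np.abs(np.mean(np.roll(sig, -lag) * np.conj(sig))) for lag in lags])

def averaged_periodogram(sig, block_len):
    """Average of block periodograms (no window, no overlap)."""
    num_blocks = len(sig) // block_len
    blocks = sig[: num_blocks * block_len].reshape(num_blocks, block_len)
    return np.mean(np.abs(np.fft.fft(blocks, axis=1)) ** 2, axis=0) / block_len

lags = np.arange(-15, 16)
block_len = 200
freqs = np.fft.fftfreq(block_len)

acf_x = autocorrelation_magnitude(x, lags)
acf_y = autocorrelation_magnitude(y, lags)
psd_x = averaged_periodogram(x, block_len)
psd_y = averaged_periodogram(y, block_len)

lag_peak_x = lags[np.argmax(acf_x)]
lag_peak_y = lags[np.argmax(acf_y)]
f_peak_x = freqs[np.argmax(psd_x)]
f_peak_y = freqs[np.argmax(psd_y)]

print("autocorrelation peak lags (baseband, shifted):", lag_peak_x, lag_peak_y)
print("spectral peak frequencies (baseband, shifted):", f_peak_x, f_peak_y)
```

The frequency shift multiplies the lag product by $e^{i 2 \pi f_c \tau}$, so the correlation magnitude and its support in $\tau$ are unchanged, while the spectral peak moves from near $f = 0$ to near $f = f_c$.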

Back to the CSP Material

If we have prior knowledge of the signal’s parameters, such as symbol rate and carrier frequency offset, we can make use of that in CSP by employing non-blind estimators of spectral correlation, cyclic correlation, cyclic moments, and cyclic cumulants, which saves us the large amount of computation implied by a blind search for cycle frequencies. The authors acknowledge this, but then introduce the notion of “random CFs”:

Why would we want to use “random” cycle frequencies for anything?

The authors acknowledge that the use of higher-order statistics might be beneficial, but do not specifically draw the connection between their use of second-order cyclic parameters and higher-order cyclic parameters, leaving the reader to wonder whether they are using stationary-signal cumulants or cyclic cumulants, a topic I examined in detail in the post on stationarizing cyclostationary signals:

The big reveal:

But this is well known (My Papers [5,6,13,25,26,28,35,38,43], the work of Octavia Dobre, many others). It is as if all received wisdom must be conferred upon us through the use of a machine–don’t bother looking at the literature.

Finally, the remarks about frequency resolution here

will undoubtedly confuse. There are three resolutions in CSP: temporal, spectral (“frequency”), and cycle-frequency. The temporal and cycle-frequency resolutions are determined by the length of the data that is processed by the algorithm–call it $T$ seconds. The temporal resolution is $T$ and the cycle-frequency resolution is about $1/T$. The spectral resolution, also typically called frequency resolution in conventional spectral analysis, is independently chosen–call it $\Delta f$. The accuracy of the spectral correlation estimate depends on the product $T\Delta f.$
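As a small worked example of how the three resolutions relate, the sketch below computes them for an assumed data block. The helper name, the sample rate, the block length, and the chosen $\Delta f$ are all hypothetical; the usual rule of thumb is that the product $T \Delta f$ should be much greater than one for a reliable spectral correlation estimate.

```python
def csp_resolutions(num_samples, sample_rate_hz, spectral_resolution_hz):
    """Return temporal resolution, cycle-frequency resolution, and their product with delta-f."""
    block_duration_s = num_samples / sample_rate_hz  # T, the processed data length
    return {
        "temporal_resolution_s": block_duration_s,
        "cycle_frequency_resolution_hz": 1.0 / block_duration_s,  # approximately 1/T
        "resolution_product": block_duration_s * spectral_resolution_hz,  # T * delta-f
    }

# Assumed example: 32768 samples at 1 MHz, with a 10 kHz spectral resolution
res = csp_resolutions(num_samples=32768, sample_rate_hz=1.0e6, spectral_resolution_hz=10.0e3)
for name, value in res.items():
    print(name, value)
```

Here the cycle-frequency resolution follows from the data length alone, whereas the spectral resolution is a separate design choice; shrinking $\Delta f$ for a fixed $T$ lowers the product $T\Delta f$.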

What’s Going on Here?

Regarding the authors, I think they are more on the computer science part of the electrical engineering and computer science (EECS) continuum, and so their signal-theory training might be lacking. But that’s OK: nobody knows everything; we’re each of us good at and knowledgeable about only a few things. It’s the review process that I’m fingering as the culprit, as usual in these “Comments On” screeds. I don’t see how three reviewers could have given approval for this paper in light of the strange, confusing, and erroneous mathematical statements that are made in connection with Fourier transforms and CSP (not to mention the out-of-band stuff, which strikes me as highly impractical). Yet this paper appears in IEEE Network, which is a peer-reviewed publication. So either the reviewers are incompetent, the reviewers did not discharge their obligation, or …

I have to admit that trying to read and critique this kind of paper twists my mind around and around, so I become more prone to mistakes. Let me know if I’ve interpreted something incorrectly, overlooked something crucial, or made a math error myself.

One final minor thing. The authors’ [11] is

but the authors of [11] should be listed as “Chad M. Spooner and Apurva N. Mody” (I’m the first author) or “C. M. Spooner and A. N. Mody.”