Machine Learning and Modulation Recognition: Comments on “Convolutional Radio Modulation Recognition Networks” by T. O’Shea, J. Corgan, and T. Clancy

In this post I provide some comments on another paper I’ve seen on arxiv.org (I have also received copies of it through email) that relates to modulation classification and cyclostationary signal processing. The paper is by O’Shea et al. and is called “Convolutional Radio Modulation Recognition Networks.” You can find it at this link.

My main interest in commenting on this paper is that it makes reference to cyclic moments as good features for modulation recognition. Although I think cyclic cumulants are a much better way to go, we do know that for order n=2, cyclic moments and cyclic cumulants are equal (provided there are no finite-strength additive sine-wave components in the data). So, the modulation recognition algorithms that use the spectral correlation function or cyclic autocorrelation function can be said to be using cyclic moments. That is, for order two, we can say we are using either cyclic moments or cyclic cumulants as we prefer.
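For reference, the order-two quantity in question is just the cyclic autocorrelation function, shown here in its symmetric single-conjugation form:

\displaystyle R_x^\alpha(\tau) = \left\langle x(t + \tau/2)\, x^*(t - \tau/2)\, e^{-i 2 \pi \alpha t} \right\rangle

where \langle \cdot \rangle denotes infinite-time averaging over t, \alpha is the cycle frequency, and \tau is the lag; the spectral correlation function is the Fourier transform of this function over \tau.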

Let’s start with Section 2.1, titled “Expert Cyclic-Moment Features.” We have the quote:

Integrated cyclic-moment based features [1] are currently widely popular in performing modulation recognition and for forming analytically derived decision trees to sort modulations into different classes.

In general, they take the form given in equation 3

\displaystyle s_{nm} = f_m(x^n(t)\ldots x^n(t+T)) \hfill (3)

By computing the mth order statistic on the nth power of the instantaneous or time delayed received signal r(t), we may obtain a set of statistics which uniquely separate it from other modulations given a decision process on the features. For our expert feature set, we compute 32 features. These consist of cyclic time lags of 0 and 8 samples. And the first 2 moments of the first 2 powers of the complex received signal, the amplitude, the phase, and the absolute value of the phase for each of these lags.

And that’s all that they say about “expert cyclic-moment features.”

I’m not at all sure what they mean, but I think I am an expert in cyclic moments. So I’m going to take the quote from Section 2.1 seriously for a moment to see if there is a charitable interpretation.

In the quote, reference [1] is my paper with Gardner (My Papers [1]), which is titled “Signal Interception: Performance Advantages of Cyclic Feature Detectors.” That paper is all about the cycle detectors, not modulation classification, and it says nothing about moments with orders higher than two except that we intended to study their effectiveness in the future. So the first part of the quote doesn’t make much sense.

Now let’s look at Equation (3). I’ve looked through the paper several times, but I cannot find a definition of f_m(\cdot). Perhaps it is an infinite-time average, but if so, the subscript m doesn’t fit. Perhaps it is the homogeneous mth-order transformation

\displaystyle f_m(x(t)) = x^m(t) \hfill (A)

But then s_{nm} isn’t a moment; it is just a nonlinearly transformed signal, and so isn’t a feature at all.

On the right side of (3), what is T? Perhaps it is the “cyclic time lag” mentioned below the equation. But does it appear in any of the terms represented by the ellipsis \ldots in (3)? Why specify the feature in terms of absolute numbers like 8?

Perhaps f_m(\cdot) is both the mth-order nonlinearity and the infinite time averaging operation, rolled into one functional? But where is the cycle frequency? (Recall the section starts out by talking about “cyclic moment features.”)
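For comparison, even the simplest zero-lag (n,0) cyclic moment has to carry a cycle-frequency parameter \alpha:

\displaystyle M_x^\alpha = \left\langle x^n(t)\, e^{-i 2 \pi \alpha t} \right\rangle

where \langle \cdot \rangle again denotes infinite-time averaging. Nothing resembling \alpha appears anywhere in (3).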

I couldn’t find any other mention of s_{nm} or f_m(\cdot) in the remainder of the paper.

So let’s now ignore (3) and focus on the words that follow it: “And the first 2 moments of the first 2 powers of the complex received signal …” Let’s let the received data be denoted by r(t) and look at what this phrase might mean. The first two powers of r(t) are

\displaystyle y_1(t) = r(t) \hfill (B)

and

\displaystyle y_2(t) = r^2(t) \hfill (C)

The first moments of the first two powers are then

\displaystyle z_1(t) = E[y_1(t)] = E[r(t)] \hfill (D)

and

\displaystyle z_2(t) = E[y_2(t)] = E[r^2(t)] \hfill (E)

The moment z_1(t) is typically zero. Exceptions are signals like OOK and AM with a transmitted carrier.

The second moments of the first two powers are

\displaystyle z_3(t) = E[y_1^2(t)] = z_2(t) \hfill (F)

and

\displaystyle z_4(t) = E[y_2^2(t)] = E[r^4(t)] \hfill (G)

And, I suppose, all of these with one or more of the factors delayed by T = 8 samples.

Out of the four quantities, two are redundant, one is typically zero for the signals of interest to the authors, and the last one is the expected value of the fourth power of the signal. This is what we call here on the CSP Blog the “(4,0) moment,” because the order n is four and there are no applied conjugations (m = 0). This moment is zero for MPSK with M > 4 and for CPM/CPFSK, with a few exceptions such as MSK. But it is a good feature for digital QAM in general, especially if you break out the Fourier-series components (the cyclic moments).
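To make that concrete, here is a minimal sketch (my own Python illustration, not the paper’s feature extractor; the symbol count and seed are arbitrary choices) that estimates |E[r^4(t)]| for a few ideal noiseless constellations:

import numpy as np

rng = np.random.default_rng(1)
N = 100_000  # number of symbols (an arbitrary choice)

def psk(M, N):
    # Unit-power M-PSK symbol sequence with uniformly random symbols.
    return np.exp(2j * np.pi * rng.integers(0, M, N) / M)

def qam16(N):
    # Unit-power 16QAM symbol sequence (levels scaled so E[|s|^2] = 1).
    levels = np.array([-3.0, -1.0, 1.0, 3.0])
    s = rng.choice(levels, N) + 1j * rng.choice(levels, N)
    return s / np.sqrt(10.0)

for name, r in [("BPSK", psk(2, N)), ("QPSK", psk(4, N)),
                ("8PSK", psk(8, N)), ("16QAM", qam16(N))]:
    # The (4,0) moment estimate: fourth power, no conjugations, time average.
    print(f"{name:6s} |E[r^4]| ~ {np.abs(np.mean(r**4)):.3f}")

BPSK and QPSK give values near one, 8PSK collapses to approximately zero, and 16QAM sits near 0.68, consistent with the discussion above.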

Where are the conjugations?
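For complex-valued signals, each of the n factors in an nth-order moment may be conjugated or not, and that choice determines which modulation types produce nonzero features. Schematically, in notation close to what I use here on the CSP Blog,

\displaystyle R_x(t, \boldsymbol{\tau})_{n,m} = E\left[ \prod_{j=1}^{n} x^{(*)_j}(t + \tau_j) \right]

where exactly m of the optional conjugations (*)_j are applied. The quoted passage never says whether any factor is conjugated.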

Later in the paper, the authors describe their simulated signals, and they say

“Data is modulated at a rate of roughly 8 samples per symbol with a normalized average transmit power of 0 dB.”

Now if they use T=8 in true cyclic moments, they will be using cyclic moments that are small or zero (depending on how “rough” the 8 samples per symbol is).
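Here is why, under standard model assumptions (i.i.d. zero-mean symbols a_k, pulse p(t), symbol interval T_0 of 8 samples): the second-order cyclic moment at cycle frequency k/T_0 is, up to a phase factor, proportional to a pulse-overlap integral

\displaystyle R_x^{k/T_0}(\tau) \propto \int p(u + \tau/2)\, p^*(u - \tau/2)\, e^{-i 2 \pi k u / T_0}\, du

For a rectangular pulse of width T_0, the two shifted pulses stop overlapping once |\tau| reaches T_0, so the feature is exactly zero at a lag of one full symbol and merely small when the lag is only approximately one symbol.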

So, in the end, I can’t see how the section on Expert Cyclic-Moment Features lives up to its name.

Much of the rest of the paper is devoted to applying machine learning tools to a large simulated data set, and there are confusing issues there too, but I digress. There are a few more instances of strange comments relating to signals and their properties:

“We treat the complex valued input as an input dimension of 2 real valued inputs and use r(t) as a set of 2xN vectors into a narrow 2D Convolutional Network where the orthogonal synchronously sampled In-Phase and Quadrature (I & Q) samples make up this 2-wide dimension.”
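That representation, at least, is unambiguous. A minimal sketch of the complex-to-2xN conversion they describe (a sketch of mine; the variable names are not from the paper):

import numpy as np

# Toy complex baseband samples standing in for the received signal r(t).
rng = np.random.default_rng(0)
r = np.exp(2j * np.pi * rng.random(128))

# Stack I and Q as two real-valued rows: shape (2, N).
iq = np.stack([r.real, r.imag])
print(iq.shape)  # (2, 128)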

In Figure 2 (which is quite tiny), the authors show high-SNR PSDs for some of their generated signals. BPSK appears to contain a tone (is it really OOK?), AM-SSB appears to span the entire sampling bandwidth, and WBFM looks like a sine wave.

[Figure 2 from the paper: high-SNR power spectral densities of the generated signals.]

In explaining how 8PSK might be confused with QPSK, the authors write:

“An 8PSK symbol containing the specific bits is indiscernible from QPSK since the QPSK constellation points are spanned by 8PSK points.”

But QPSK is not confused for BPSK here, and yet the BPSK constellation points are spanned by the QPSK points.
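The spanning relations are easy to check; a quick sketch (assuming the ideal constellation phases line up, which the paper does not confirm):

import numpy as np

def points(M):
    # Ideal unit-circle M-PSK constellation, rounded so set comparison works.
    return set(np.round(np.exp(2j * np.pi * np.arange(M) / M), 12))

bpsk, qpsk, psk8 = points(2), points(4), points(8)
print(qpsk <= psk8)  # True: every QPSK point is also an 8PSK point
print(bpsk <= qpsk)  # True: so the spanning argument predicts QPSK/BPSK
                     # confusion as well, which is not what the paper reports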

I think we get to the real motivation for the paper in the last sentence of the section on Future Work:

“This application domain is ripe for a wide array of further investigation and applications which will significantly impact the state of the art in wireless signal processing and cognitive radio domains, shifting them more towards machine learning and data driven approaches.”

I would welcome this conclusion, and the new research avenues, if it were based on a study that accurately represented what has come before.

2 thoughts on “Machine Learning and Modulation Recognition: Comments on “Convolutional Radio Modulation Recognition Networks” by T. O’Shea, J. Corgan, and T. Clancy”

  1. Tim says:

    Chad,
    We’ve implemented a much more rigorous version of your excellent prior work as a baseline in this more recent paper on the topic https://arxiv.org/abs/1702.00832. Including all possible combinations of conjugations for the orders considers and a very strong boosted tree classifier operating on them. I appreciate your harping on semantics and minutia within a preliminary conference paper (the term “Expert … Features” is quite, the feature extractors are derived by an expert to do one specific thing) — we make an effort to leverage your work but do not wish to make it the main focus of discussion in this paper (as you’ve shown it takes a lot of time and detail to explain the careful manual feature engineering required which would take up the whole paper) as we are simply exploring a new approach here — if there were better open source tools and implementations of your work and the datasets, perhaps we could advance the community as a whole, and more easily baseline against your best practice, recall we are not all extreme experts in high order moment engineering. Rather than critiquing the dataset, feel free to send pull requests to the github repository and help improve it so we can all have a good baseline to compare with, if you feel I’ve not done justice to your work, please release quantitative results and a reference implementation we can compare against. I’ve personally been reading and appreciate your blog here, but I have not had great success in executing the limited number of matlab scripts you’ve provided —
    Best Regards
    Tim

    • I’ll take a look at your new paper.

      a preliminary conference paper

I wouldn’t have done the post if I had known that. The paper I commented on is found on arxiv.org and has a revision history of posts on Feb 12, Apr 24, and Jun 10 of 2016. It is 15 pages long, and I can’t find any indication that it is a conference-paper draft, much less a preliminary one. My experience with arxiv.org is that people post submitted journal papers in order to get them out while the lengthy review process proceeds. So I figured that after three drafts were uploaded, this was something you were standing by as solid. Is there a way I can tell on arxiv.org that a paper is a preliminary draft of a conference paper? I admit I am not an expert on that site.

      I appreciate your harping on semantics and minutia

      Is that called for?

      (the term “Expert … Features” is quite, the feature extractors are derived by an expert to do one specific thing)

      I don’t get the grammar of that parenthetical remark, but I would very much like to know what you intended.

      I admit my interest is narrow at the CSP Blog. I didn’t even try to critique the ML setup or all the parameters for the algorithm you put forth. I just saw that the cyclic moment feature part was so confusing that I didn’t think any reader would be able to understand what your Machine was doing, and that my name was explicitly connected to that confusion. Was I wrong?
