Another post-publication review of a paper that is weak on the ‘RF’ in RF machine learning.
Let’s take a look at a recently published paper (The Literature [R148]) on machine-learning-based modulation recognition to get a data point on how some electrical engineers (these are more on the side of computer science, I believe) use mathematics when they turn to radio-frequency problems. You can guess it isn’t pretty, and that I’m not here to exalt their acumen.
The Machine Learners think that their “feature engineering” (rooting around in voluminous data) is the same as “features” in mathematically derived signal-processing algorithms. I take a lighthearted look.
One of the things the machine learners never tire of saying is that their neural-network approach to classification is superior to previous methods because, in part, those older methods use hand-crafted features. They put it in different ways, but somewhere in the introductory section of a machine-learning modulation-recognition (ML/MR) paper, you’ll likely see the claim. You can look through the ML/MR papers I’ve cited in The Literature ([R133]-[R146]) if you are curious, but I’ll extract a couple here just to illustrate the idea.
What happens when a cyclostationary time-series is treated as if it were stationary?
In this post let’s consider the difference between modeling a communication signal as stationary or as cyclostationary.
There are two contexts for this kind of issue. The first is when someone recognizes that a particular signal model is cyclostationary, and then takes some action to render it stationary (sometimes called ‘stationarizing the signal’). They then proceed with their analysis or algorithm development using the stationary signal model. The second context is when someone applies stationary-signal processing to a cyclostationary signal model, either without knowing that the signal is cyclostationary, or perhaps knowing but not caring.
At the center of this topic is the difference between the mathematical object known as a random process (or stochastic process) and the mathematical object that is a single infinite-time function (or signal or time-series).
A related paper is The Literature [R68], which discusses the pitfalls of applying tools meant for stationary signals to the samples of cyclostationary signals.
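To make the stationary-versus-cyclostationary distinction concrete, here is a minimal sketch under assumptions that are mine and not taken from the posts: a noise-free rectangular-pulse BPSK signal with eight samples per symbol, and a simple time-average estimate of the cyclic autocorrelation at a single lag. The function names `rect_bpsk` and `cyclic_autocorrelation` are hypothetical. A stationary model only ever looks at cycle frequency zero; the sketch also evaluates the same lag product at the symbol rate and at a nearby frequency that is not a cycle frequency.

```python
import numpy as np

def rect_bpsk(num_symbols, samples_per_symbol, rng):
    """Rectangular-pulse BPSK: each +/-1 symbol is held for samples_per_symbol samples."""
    symbols = rng.choice([-1.0, 1.0], size=num_symbols)
    return np.repeat(symbols, samples_per_symbol)

def cyclic_autocorrelation(x, alpha, lag):
    """Time-average estimate of <x[n+lag] x*[n] exp(-i 2 pi alpha n)> (illustrative form)."""
    n = np.arange(len(x) - lag)
    lag_product = x[n + lag] * np.conj(x[n])
    return np.mean(lag_product * np.exp(-2j * np.pi * alpha * n))

rng = np.random.default_rng(0)
sps = 8                      # assumed samples per symbol
x = rect_bpsk(4096, sps, rng)
lag = sps // 2               # half a symbol

# alpha = 0 is the only thing a stationary model ever looks at.
r_stationary = abs(cyclic_autocorrelation(x, 0.0, lag))
# alpha = symbol rate: a genuine cycle frequency of this signal.
r_cycle = abs(cyclic_autocorrelation(x, 1.0 / sps, lag))
# alpha slightly off the symbol rate: not a cycle frequency.
r_off = abs(cyclic_autocorrelation(x, 0.9 / sps, lag))

print(f"alpha = 0:        {r_stationary:.3f}")
print(f"alpha = 1/T0:     {r_cycle:.3f}")
print(f"alpha = 0.9/T0:   {r_off:.3f}")
```

The value at the symbol rate is comparable in size to the conventional autocorrelation value, while the off-cycle value is only estimation noise. Treating the signal as stationary discards the symbol-rate component entirely.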
DeepSig’s data sets are popular in the machine-learning modulation-recognition community, and in that community there are many claims that the deep neural networks are vastly outperforming any expertly hand-crafted tired old conventional method you care to name (none are usually named though). So I’ve been looking under the hood at these data sets to see what the machine learners think of as high-quality inputs that lead to disruptive upending of the sclerotic mod-rec establishment. In previous posts, I’ve looked at two of the most popular DeepSig data sets from 2016 (here and here). In this post, we’ll look at one more and I will then try to get back to the CSP posts.
Let’s take a look at one more DeepSig data set: 2018.01.OSC.0001_1024x2M.h5.tar.gz.
The second DeepSig data set I analyze: SNR problems and strange PSDs.
I presented an analysis of one of DeepSig’s earlier modulation-recognition data sets (RML2016.10a.tar.bz2) in the post on All BPSK Signals. There we saw several flaws in the data set as well as curiosities. Most notably, the signals in the data set labeled as analog amplitude-modulated single sideband (AM-SSB) were absent: these signals were only noise. DeepSig has several other data sets on offer at the time of this writing:
In this post, I’ll present a few thoughts and results for the “Larger Version” of RML2016.10a.tar.bz2, which is called RML2016.10b.tar.bz2. This is a good post to offer because it is coherent with the first RML post, but also because more papers are being published that use the RML 10b data set, and of course more such papers are in review. Maybe the offered analysis here will help reviewers to better understand and critique the machine-learning papers. The latter do not ever contain any side analysis or validation of the RML data sets (let me know if you find one that does in the Comments below), so we can’t rely on the machine learners to assess their inputs. (Update: I analyze a third DeepSig data set here.)
An analysis of DeepSig’s 2016.10A data set, used in many published machine-learning papers, and detailed comments on quite a few of those papers.
Update June 2020
I’ll be adding new papers to this post as I find them. At the end of the original post there is a sequence of date-labeled updates that briefly describe the relevant aspects of the newly found papers. Some machine-learning modulation-recognition papers deserve their own post, so check back at the CSP Blog from time to time for “Comments On …” posts.
We first met Professor Jang in a “Comments on the Literature” type of post from 2016. In that post, I pointed out fundamental mathematical errors contained in a paper the Professor published in the IEEE Communications Letters in 2014 (The Literature [R71]).
I have just noticed a new paper by Professor Jang, published in the journal IEEE Access, which is a peer-reviewed journal, like the Communications Letters. This new paper is titled “Simultaneous Power Harvesting and Cyclostationary Spectrum Sensing in Cognitive Radios” (The Literature [R144]). Many of the same errors are present in this paper. In fact, the beginning of the paper and the exposition on cyclostationary signal processing are nearly the same as in The Literature [R71].
My friend and colleague Antonio Napolitano has just published a new book on cyclostationary signals and cyclostationary signal processing:
Cyclostationary Processes and Time Series: Theory, Applications, and Generalizations, Academic Press/Elsevier, 2020, ISBN: 978-0-08-102708-0. The book is a comprehensive guide to the structure of cyclostationary random processes and signals, and it also provides pointers to the literature on many different applications. The book is mathematical in nature; use it to deepen your understanding of the underlying mathematics that make CSP possible.
You can check out the book on amazon.com using the following link:
And I still don’t understand how a random variable with infinite variance can be a good model for anything physical. So there.
I’ve seen several published and pre-published (arXiv.org) technical papers over the past couple of years on the topic of cyclic correntropy (The Literature [R123-R127]). I first criticized such a paper ([R123]) here, but the substance of that review was about my problems with the presented mathematics, not impulsive noise and its effects on CSP. Since the papers keep coming, apparently, I’m going to put down some thoughts on impulsive noise and some evidence regarding simple means of mitigation in the context of CSP. Preview: I don’t think we need to go to the trouble of investigating cyclic correntropy as a means of salvaging CSP from the evil clutches of impulsive noise.
What modest academic success I’ve had in the area of cyclostationary signal theory and cyclostationary signal processing is largely due to the patient mentorship of my doctoral adviser, William (Bill) Gardner, and the fact that I was able to build on an excellent foundation put in place by Gardner, his adviser Lewis Franks, and key Gardner students such as William (Bill) Brown.
Learning machine learning for radio-frequency signal-processing problems, continued.
I continue with my foray into machine learning (ML) by considering whether we can use widely available ML tools to create a machine that can output accurate power spectrum estimates. Previously we considered the perhaps simpler problem of learning the Fourier transform. See here and here.
Along the way I’ll expose my ignorance of the intricacies of machine learning and my apparent inability to find the correct hyperparameter settings for any problem I look at. But, that’s where you come in, dear reader. Let me know what to do!
We learned it using abstractions involving various infinite quantities. Can a machine learn it without that advantage?
This post is just a blog post. Just some guy on the internet thinking out loud. If you have relevant thoughts or arguments you’d like to advance, please leave them in the Comments section at the end of the post.
How did this come about? Is it even interesting to ask the question? Well, it is to me. I ask it because of the current hot topic in signal processing: machine learning. And in particular, machine learning applied to modulation recognition (see here and here). The machine learners want to capitalize on the success of machine learning applied to image recognition by directly applying the same sorts of image-recognition techniques to the problem of automatic type-recognition for human-made electromagnetic waves.
Update November 1, 2018: A site called feedspot (blog.feedspot.com) contacted me to tell me I made their “Top 10 Digital Signal Processing Blogs, Websites & Newsletters in 2018” list. Weirdly, there are only eight blogs in the list. What’s most important for this post is the other signal processing blogs on the list. So check it out if you are looking for other sources of online signal processing information. Enjoy! blog.feedspot.com/digital_signal_processing_blogs
But I’d like to be able to refer readers to good websites that discuss related aspects of signal processing and communication signals, such as filtering, spectrum estimation, mathematical models, Fourier analysis, etc. I’ve had little success with the Google searches I’ve tried.
The statistics-oriented wing of electrical engineering is perpetually dazzled by [insert Revered Person]’s Theorem at the expense of, well, actual engineering.
I recently came across the conference paper in the post title (The Literature [R101]). Let’s take a look.
The paper is concerned with “detect[ing] the presence of ACS signals with unknown cycle period.” In other words, blind cyclostationary-signal detection and cycle-frequency estimation. Of particular importance to the authors is the case in which the “period of cyclostationarity” is not equal to an integer number of samples. They seem to think this is a new and difficult problem. By my lights, it isn’t. But maybe I’m missing something. Let me know in the Comments.
Reconsidering my first attempt at teaching a machine the Fourier transform with the help of a CSP Blog reader. Also, the Fourier transform is viewed by Machine Learners as an input data representation, and that representation matters.
I first considered whether a machine (neural network) could learn the (64-point, complex-valued) Fourier transform in this post. I used MATLAB’s Neural Network Toolbox and I failed to get good learning results because I did not properly set the machine’s hyperparameters. A kind reader named Vito Dantona provided a comment to that original post that contained good hyperparameter selections, and I’m going to report the new results here in this post.
Since the Fourier transform is linear, the machine should be set up to do linear processing. It can’t just figure that out for itself. Once I used Vito’s suggested hyperparameters to force the machine to be linear, the results became much better:
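As a side illustration of the linearity point (this is not the MATLAB network from the post, and the training-set size and seed are arbitrary assumptions), the 64-point DFT is just multiplication by a fixed complex matrix. A purely linear model with no bias and no nonlinearity, fit here by ordinary least squares rather than by gradient-based training, therefore recovers the transform essentially exactly from random input/output pairs.

```python
import numpy as np

N = 64                                   # transform length used in the post
rng = np.random.default_rng(1)

# Training pairs: random complex inputs (rows) and their FFTs.
X_train = rng.standard_normal((512, N)) + 1j * rng.standard_normal((512, N))
Y_train = np.fft.fft(X_train, axis=1)

# A single linear layer, Y ~= X @ W, fit by least squares.
W, *_ = np.linalg.lstsq(X_train, Y_train, rcond=None)

# The DFT matrix the linear layer should have learned.
k = np.arange(N)
F = np.exp(-2j * np.pi * np.outer(k, k) / N)
max_err = np.max(np.abs(W - F))

# Check generalization on an input that was not in the training set.
x_new = rng.standard_normal(N) + 1j * rng.standard_normal(N)
generalizes = np.allclose(x_new @ W, np.fft.fft(x_new))

print(f"max |W - F| = {max_err:.2e}, generalizes: {generalizes}")
```

A network with nonlinear activations has to approximate this matrix multiplication indirectly, which is one plausible reading of why forcing linearity helps.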
Let’s talk about another published paper on signal detection involving cyclostationarity and/or cumulants. This one is called “Energy-Efficient Processor for Blind Signal Classification in Cognitive Radio Networks,” (The Literature [R69]), and is authored by UCLA researchers E. Rebeiz and four colleagues.
My focus on this paper is its idea that broad signal-type classes, such as direct-sequence spread-spectrum (DSSS), QAM, and OFDM can be reliably distinguished by the use of a single number: the fourth-order cumulant with two conjugated terms. This kind of cumulant is referred to as the (4, 2) cumulant here at the CSP Blog, and in the paper, because the order is 4 and the number of conjugated terms is 2.
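For reference, the fourth-order cumulant with two conjugated terms of a zero-mean complex sequence x can be written in terms of moments as E|x|^4 - |E[x^2]|^2 - 2(E|x|^2)^2. The sketch below is a sample-average estimate of that quantity, normalized by the squared power, applied to ideal, noise-free, symbol-rate sequences. The function name `c42_normalized` and the test sequences are my assumptions for illustration; this is not the processing chain of the paper.

```python
import numpy as np

def c42_normalized(x):
    """Sample estimate of the fourth-order, two-conjugate cumulant, divided by power squared."""
    x = x - np.mean(x)                    # enforce the zero-mean assumption
    m20 = np.mean(x ** 2)                 # second moment, no conjugates
    m21 = np.mean(np.abs(x) ** 2)         # second moment, one conjugate (power)
    m42 = np.mean(np.abs(x) ** 4)         # fourth moment, two conjugates
    return float((m42 - np.abs(m20) ** 2 - 2.0 * m21 ** 2) / m21 ** 2)

rng = np.random.default_rng(2)
n = 100_000
levels = np.array([-3.0, -1.0, 1.0, 3.0])

sequences = {
    "BPSK": rng.choice([-1.0, 1.0], size=n).astype(complex),
    "QPSK": (rng.choice([-1.0, 1.0], size=n)
             + 1j * rng.choice([-1.0, 1.0], size=n)) / np.sqrt(2),
    "16QAM": (rng.choice(levels, size=n)
              + 1j * rng.choice(levels, size=n)) / np.sqrt(10),
    "complex Gaussian": (rng.standard_normal(n)
                         + 1j * rng.standard_normal(n)) / np.sqrt(2),
}

# Theoretical unit-power values: BPSK -2, QPSK -1, 16QAM -0.68, Gaussian 0.
values = {name: c42_normalized(seq) for name, seq in sequences.items()}
for name, value in values.items():
    print(f"{name:>16s}: {value:+.3f}")
```

The estimate depends on the constellation and says nothing by itself about pulse shaping, noise, or sampling rate, all of which change the number obtained from real captured data.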
Modulation recognition is the process of assigning one or more modulation-class labels to a provided time-series data sequence.
In this post, we start a discussion of what I consider the ultimate application of the theory of cyclostationary signals: Automatic Modulation Recognition. My relevant papers are My Papers [16,17,25,26,28,30,32,33,38,43,44]. See also my machine-learning modulation-recognition critiques by clicking on Machine Learning in the CSP Blog Categories on the right side of any post or page.
We are all susceptible to using bad mathematics to get us where we want to go. Here is an example.
I recently came across the 2014 paper in the title of this post. I mentioned it briefly in the post on the periodogram. But I’m going to talk about it a bit more here because this is the kind of thing that makes things harder for people trying to learn about cyclostationarity, which eventually leads to the need for something like the CSP Blog as a corrective.
The idea behind the paper is that it would be nice to avoid the need for prior knowledge of cycle frequencies when using cycle detectors or the like. If you could just compute the entire spectral correlation function, and then collapse it by integrating (summing) over frequency f, you’d have a one-dimensional function of cycle frequency that you could then process inexpensively to perform detection and classification tasks.
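A small numerical check of what the collapsing operation produces, under assumptions of my own: a short noise-free BPSK signal with a half-sine pulse, the asymmetric cyclic periodogram X(f + alpha) X*(f) / N as the spectral correlation estimate, and cycle frequencies restricted to the FFT bins alpha = k/N. The function name `collapse_cyclic_periodogram` is hypothetical. By Parseval’s relation, summing over f leaves the Fourier-series coefficients of |x|^2, which is the cyclic autocorrelation at lag zero.

```python
import numpy as np

def collapse_cyclic_periodogram(x):
    """Sum X[m+k] X*[m] / N over frequency bin m for every cycle-frequency bin k."""
    n_samples = len(x)
    X = np.fft.fft(x)
    collapsed = np.empty(n_samples, dtype=complex)
    for k in range(n_samples):
        # np.roll(X, -k)[m] is X[(m + k) mod N]; the extra 1/N turns the sum into an average.
        collapsed[k] = np.sum(np.roll(X, -k) * np.conj(X)) / n_samples ** 2
    return collapsed

rng = np.random.default_rng(3)
sps = 8                                                 # assumed samples per symbol
pulse = np.sin(np.pi * (np.arange(sps) + 0.5) / sps)    # half-sine pulse
symbols = rng.choice([-1.0, 1.0], size=128)
x = np.kron(symbols, pulse)                             # 1024-sample BPSK signal

collapsed = collapse_cyclic_periodogram(x)

# Direct route to the same function of alpha: the DFT of the squared magnitude.
lag0_caf = np.fft.fft(np.abs(x) ** 2) / len(x)

# Strongest nonzero cycle-frequency bin in the lower half of the alpha axis.
peak_bin = 1 + int(np.argmax(np.abs(collapsed[1 : len(x) // 2])))

print("collapsed SCF matches lag-zero cyclic autocorrelation:",
      np.allclose(collapsed, lag0_caf))
print("peak cycle frequency (cycles/sample):", peak_bin / len(x))
```

Seen this way, the collapsed function carries only the lag-zero information of the squared magnitude, which can be computed with a single FFT and without forming the full spectral correlation function.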