Update on DeepSig Datasets

‘Insufficient facts always invite danger.’
Spock in Star Trek TOS Episode “Space Seed”

As most CSP Blog readers likely know, I’ve performed detailed critical analyses (one, two, three, and four) of the modulation-recognition datasets put forth publicly by DeepSig in 2016-2018. These datasets are associated with some of their published or arxiv.org papers, such as The Literature [R138], which I also reviewed here.

My conclusion is that the DeepSig datasets are as flawed as the DeepSig papers–it was the highly flawed nature of the papers that got me started down the critical-review path in the first place.

A reader recently alerted me to a change in the Datasets page at deepsig.ai that may indicate they are listening to critics. Let’s take a look and see if there is anything more to say.

Continue reading “Update on DeepSig Datasets”

What is the Minimum Effort Required to Find ‘Related Work?’: Comments on Some Spectrum-Sensing Literature by N. West [R176] and T. Yucek [R178]

Starts as a personal gripe, but ends with weird stuff from the literature.

During my poking around on arxiv.org the other day (Grrrrr…), I came across some postings by O’Shea et al I’d not seen before, including The Literature [R176]: “Wideband Signal Localization and Spectral Segmentation.”

Huh, I thought, they are probably trying to train a neural network to do automatic spectral segmentation that is superior to my published algorithm (My Papers [32]). Yeah, no. I mean yes to a machine, no to nods to me. Let’s take a look.

Continue reading “What is the Minimum Effort Required to Find ‘Related Work?’: Comments on Some Spectrum-Sensing Literature by N. West [R176] and T. Yucek [R178]”

One Last Time …

We take a quick look at a fourth DeepSig dataset called 2016.04C.multisnr.tar.bz2 in the context of the data-shift problem in machine learning.

And if we get this right,

We’re gonna teach ’em how to say

Goodbye …

You and I.

Lin-Manuel Miranda, “One Last Time,” Hamilton

I didn’t expect to have to do this, but I am going to analyze yet another DeepSig dataset. One last time. This one is called 2016.04C.multisnr.tar.bz2, and is described thusly on the DeepSig website:

Figure 1. Description of various DeepSig data sets found on the DeepSig website as of November 2021.

I’ve analyzed the 2018 dataset here, the RML2016.10b.tar.bz2 dataset here, and the RML2016.10a.tar.bz2 dataset here.

Now I’ve come across a manuscript-in-review in which both the RML2016.10a and RML2016.04c data sets are used. The idea is that these two datasets represent two sufficiently distinct datasets so that they are good candidates for use in a data-shift study involving trained neural-network modulation-recognition systems.

The data-shift problem is, as one researcher puts it:

Data shift or data drift, concept shift, changing environments, data fractures are all similar terms that describe the same phenomenon: the different distribution of data between train and test sets

Georgios Sarantitis

But … are they really all that different?

Continue reading “One Last Time …”

DeepSig’s 2018 Dataset: 2018.01.OSC.0001_1024x2M.h5.tar.gz

The third DeepSig dataset I’ve examined. It’s better!

Update February 2021. I added material relating to the DeepSig-claimed variation of the roll-off parameter in a square-root raised-cosine pulse-shaping function. It does not appear that the roll-off was actually varied as stated in Table I of [R137].

DeepSig’s datasets are popular in the machine-learning modulation-recognition community, and in that community there are many claims that the deep neural networks are vastly outperforming any expertly hand-crafted tired old conventional method you care to name (none are usually named though). So I’ve been looking under the hood at these datasets to see what the machine learners think of as high-quality inputs that lead to disruptive upending of the sclerotic mod-rec establishment. In previous posts, I’ve looked at two of the most popular DeepSig datasets from 2016 (here and here). In this post, we’ll look at one more and I will then try to get back to the CSP posts.

Let’s take a look at one more DeepSig dataset: 2018.01.OSC.0001_1024x2M.h5.tar.gz.

Continue reading “DeepSig’s 2018 Dataset: 2018.01.OSC.0001_1024x2M.h5.tar.gz”

More on DeepSig’s RML Datasets

The second DeepSig data set I analyze: SNR problems and strange PSDs.

I presented an analysis of one of DeepSig’s earlier modulation-recognition datasets (RML2016.10a.tar.bz2) in the post on All BPSK Signals. There we saw several flaws in the dataset as well as curiosities. Most notably, the signals in the dataset labeled as analog amplitude-modulated single sideband (AM-SSB) were absent: these signals were only noise. DeepSig has several other datasets on offer at the time of this writing:

In this post, I’ll present a few thoughts and results for the “Larger Version” of RML2016.10a.tar.bz2, which is called RML2016.10b.tar.bz2. This is a good post to offer because it is coherent with the first RML post, but also because more papers are being published that use the RML 10b dataset, and of course more such papers are in review. Maybe the offered analysis here will help reviewers to better understand and critique the machine-learning papers. The latter do not ever contain any side analysis or validation of the RML datasets (let me know if you find one that does in the Comments below), so we can’t rely on the machine learners to assess their inputs. (Update: I analyze a third DeepSig dataset here. And a fourth and final one here.)

Continue reading “More on DeepSig’s RML Datasets”

All BPSK Signals

An analysis of DeepSig’s 2016.10A dataset, used in many published machine-learning papers, and detailed comments on quite a few of those papers.

Update March 2021

See my analyses of three other DeepSig datasets here, here, and here.

Update June 2020

I’ll be adding new papers to this post as I find them. At the end of the original post there is a sequence of date-labeled updates that briefly describe the relevant aspects of the newly found papers. Some machine-learning modulation-recognition papers deserve their own post, so check back at the CSP Blog from time-to-time for “Comments On …” posts.

Continue reading “All BPSK Signals”

Machine Learning and Modulation Recognition: Comments on “Convolutional Radio Modulation Recognition Networks” by T. O’Shea, J. Corgan, and T. Clancy

Update October 2020:

Since I wrote the paper review in this post, I’ve analyzed three of O’Shea’s data sets (O’Shea is with the company DeepSig, so I’ve been referring to the data sets as DeepSig’s in other posts): All BPSK Signals, More on DeepSig’s Data Sets, and DeepSig’s 2018 Data Set. The data set relating to this paper is analyzed in All BPSK Signals. Preview: It is heavily flawed.

Continue reading “Machine Learning and Modulation Recognition: Comments on “Convolutional Radio Modulation Recognition Networks” by T. O’Shea, J. Corgan, and T. Clancy”