And if we get this right,
We’re gonna teach ’em how to say
You and I.
Lin-Manuel Miranda, “One Last Time,” Hamilton
I didn’t expect to have to do this, but I am going to analyze yet another DeepSig dataset. One last time. This one is called 2016.04C.multisnr.tar.bz2, and is described thusly on the DeepSig website:
I’ve analyzed the 2018 dataset here, the RML2016.10b.tar.bz2 dataset here, and the RML2016.10a.tar.bz2 dataset here.
Now I’ve come across a manuscript-in-review in which both the RML2016.10a and RML2016.04c datasets are used. The idea is that the two datasets are sufficiently distinct to be good candidates for a data-shift study involving trained neural-network modulation-recognition systems.
The data-shift problem is, as one researcher puts it:
Data shift or data drift, concept shift, changing environments, data fractures are all similar terms that describe the same phenomenon: the different distribution of data between train and test sets
Georgios Sarantitis
But … are they really all that different?
Our first clue that these aren’t actually good datasets for a data-shift study is that DeepSig says they are quite similar. The 10a and 10b datasets are ‘cleaner and more normalized’ versions of the 4C dataset, and they are meant to ‘supersede’ the 4C dataset, not complement it. But let’s take a look anyway.
The 4C dataset is again in pickle form, and I read the I/Q samples out using a slightly modified version of the Python program I used for the other datasets. In this post, I only look at 100 signal instances for each combination of modulation-type label and SNR label. The two kinds of labels are read directly from the pickle file; they are DeepSig’s labels.
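For readers who want to follow along, here is a minimal sketch of that kind of reader. It is not the program I actually used. It assumes the 4C pickle has the layout commonly reported for the RML2016 files: a dictionary keyed by (modulation label, SNR label) tuples, where each value is an array of instances and each instance holds one row of I samples and one row of Q samples. The function names and the path in the usage note are hypothetical.

```python
import pickle


def load_instances(path, max_per_label=100):
    """Return {(mod, snr): first max_per_label instances} from an RML-style pickle.

    Assumes (not verified here) that the pickle is a dict keyed by
    (modulation label, SNR label) tuples.
    """
    with open(path, "rb") as f:
        # The 2016 pickles were written by Python 2; 'latin1' lets Python 3
        # decode them. NumPy must be installed to unpickle the real arrays.
        data = pickle.load(f, encoding="latin1")
    subset = {}
    for (mod, snr), instances in data.items():
        # Keep only the first max_per_label instances for this label pair.
        subset[(mod, snr)] = instances[:max_per_label]
    return subset


def to_complex(instance):
    """Combine the I row and Q row of one instance into complex samples."""
    i_row, q_row = instance
    return [complex(i, q) for i, q in zip(i_row, q_row)]
```

Usage would look something like `subset = load_instances("path/to/4C.pkl")` followed by `x = to_complex(subset[("BPSK", 18)][0])`, with the path replaced by wherever the extracted pickle lives.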
The dataset contains 11 signal-type labels: BPSK, QPSK, 8PSK, CPFSK, GFSK, AM-DSB, AM-SSB, QAM16, QAM64, PAM4, and WBFM. And as before, each is associated with SNR labels that range from -20 to +18 in steps of 2. Each signal instance consists of 128 complex (in-phase and quadrature) samples.
We see the same kinds of problems as in the other 2016 datasets: lots of instances that appear to be just noise and the lack of any discernible signal component for any SNR label for the SSB signal type. There does not appear to be a coherent way to understand the SNR label in terms of measurable total or inband SNRs. But overall the dataset is similar to the others and will not make a good data-shift complement to 10a or 10b. Moreover, the dataset is flawed and should not be used in any study (SSB is missing!).
Here are 100 PSD estimates for BPSK and each of the SNR labels found in the pickle file:
As in the 2016-A data set, there is no signal component to the AM-SSB signal instances. Even for the largest SNR label of 18, the PSDs are plainly just noise.
The AM-DSB and WBFM signal types are simply very narrowband signals, as evidenced by their PSDs, which are near-perfect rectangles. I’m using the frequency-smoothing method (FSM) of spectrum estimation, together with a rectangular smoothing window, so that any impulse in the PSD will look just like a rectangle (here with width 0.1).
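To make the rectangle effect concrete, here is a rough sketch of a frequency-smoothing estimate: compute the periodogram, then circularly convolve it with a unit-area rectangle. This is an illustration under my own assumptions (NumPy, a window rounded up to an odd number of bins, the hypothetical function name `fsm_psd`), not the estimator code behind the plots in this post.

```python
import numpy as np


def fsm_psd(x, smooth_width=0.1):
    """Frequency-smoothed PSD estimate using a rectangular window.

    smooth_width is the window width in normalized frequency
    (cycles per sample). Returns (freqs, psd), both centered on zero.
    """
    x = np.asarray(x, dtype=complex)
    n = x.size
    # Periodogram: squared DFT magnitude scaled by the record length.
    periodogram = np.abs(np.fft.fft(x)) ** 2 / n
    # Use an odd number of bins so the window is centered on each bin.
    width_bins = max(1, int(round(smooth_width * n)))
    if width_bins % 2 == 0:
        width_bins += 1
    half = width_bins // 2
    # Circular moving average (frequency is periodic for sampled data).
    smoothed = np.zeros(n)
    for k in range(-half, half + 1):
        smoothed += np.roll(periodogram, k)
    smoothed /= width_bins
    freqs = np.fft.fftshift(np.fft.fftfreq(n))
    return freqs, np.fft.fftshift(smoothed)
```

Feeding this a 128-sample complex sine wave whose frequency falls exactly on a DFT bin gives a flat-topped rectangle about 0.1 wide centered on the tone frequency, which is the shape seen for the AM-DSB and WBFM labels.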
It is unclear whether any noise is actually added to the CPFSK signal for the label of 18, as evidenced by the fact that the out-of-band spectral components can vary by tens of dB, as in Figure 5.
The other signal labels produce various curious PSD estimates–you can see all of the PSDs I generated by viewing the movies at the end of this post.
Comparison to 10a and 10b
Our primary purpose here is to assess the suitability of the 4C dataset as sufficiently different from the A or B dataset so that it would be useful in a data-shift study involving it and A or B. So let’s take a look at some PSDs for each of the three datasets side-by-side.
I created many more of these three-way PSD comparison plots. You can find them in a zip archive on the Downloads page.
By comparing the PSDs for a particular signal label over the three datasets, it appears that the parameters of the signals do not vary significantly across the three datasets. I see no evidence of differences in symbol rate, carrier offset, pulse type, or pulse roll-off. The set of considered modulation types is identical, except that the SSB signal label is not present in dataset B, and the SSB signal has zero power in datasets A and 4C.
We could do a more complete statistical (CSP) analysis if the signal instances were significantly longer than their length of 128 samples.
So if you take the datasets ‘as is,’ they are not good candidates for a data-shift machine-learning study–the probability distributions of the underlying random variables appear to be too similar. However, one might convert one of the datasets into a more suitable dataset by taking the higher-SNR elements, performing filtering, frequency-shifting, and resampling operations, and then adding noise to recreate the SNR range of the original datasets. This does not remove the fundamental problem that the datasets all originate from the same researchers, and so likely will jointly possess whatever idiosyncrasies those researchers have wittingly or unwittingly introduced through their signal-generation process.
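A bare-bones sketch of such a conversion for a single instance might look like the following. Everything here is an assumption made for illustration: the function name `shift_instance`, the default parameter values, the crude linear-interpolation resampler, and the way the noise level is set. I have not applied this to the datasets.

```python
import numpy as np


def shift_instance(x, freq_offset=0.05, rate=1.25, taps=None,
                   snr_db=0.0, rng=None):
    """Filter, frequency-shift, resample, and add noise to one instance.

    All parameters are illustrative. snr_db is measured against the
    (already slightly noisy) high-SNR input, so it only approximates
    the final SNR.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=complex)
    # 1. Optional FIR filtering with caller-supplied taps.
    if taps is not None:
        x = np.convolve(x, taps, mode="same")
    # 2. Frequency shift by freq_offset cycles per input sample.
    n = np.arange(x.size)
    x = x * np.exp(2j * np.pi * freq_offset * n)
    # 3. Crude resampling by linear interpolation (real and imaginary parts).
    new_n = np.arange(0, x.size - 1, 1.0 / rate)
    x = np.interp(new_n, n, x.real) + 1j * np.interp(new_n, n, x.imag)
    # 4. Add complex white Gaussian noise at the requested power ratio.
    sig_power = np.mean(np.abs(x) ** 2)
    noise_power = sig_power / 10 ** (snr_db / 10)
    noise = np.sqrt(noise_power / 2) * (
        rng.standard_normal(x.size) + 1j * rng.standard_normal(x.size)
    )
    return x + noise
```

Note that resampling changes the instance length, so the output would have to be cropped (or several instances processed and trimmed) to get back to 128 samples, and a proper polyphase resampler would be preferable to linear interpolation in any serious attempt.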
Or a data-shift (generalization in machine learning) researcher could compare one of these datasets to one of their own creation, or try using one of these and a different publicly available dataset.