DeepSig’s 2018 Data Set: 2018.01.OSC.0001_1024x2M.h5.tar.gz

DeepSig’s data sets are popular in the machine-learning modulation-recognition community, and in that community there are many claims that the deep neural networks are vastly outperforming any expertly hand-crafted tired old conventional method you care to name (none are usually named though). So I’ve been looking under the hood at these data sets to see what the machine learners think of as high-quality inputs that lead to disruptive upending of the sclerotic mod-rec establishment. In previous posts, I’ve looked at two of the most popular DeepSig data sets from 2016 (here and here). In this post, we’ll look at one more and I will then try to get back to the CSP posts.

Let’s take a look at one more DeepSig data set: 2018.01.OSC.0001_1024x2M.h5.tar.gz.

Continue reading “DeepSig’s 2018 Data Set: 2018.01.OSC.0001_1024x2M.h5.tar.gz”

More on DeepSig’s RML Data Sets

I presented an analysis of one of DeepSig’s earlier modulation-recognition data sets (RML2016.10a.tar.bz2) in the post on All BPSK Signals. There we saw several flaws in the data set as well as curiosities. Most notably, the signals in the data set labeled as analog amplitude-modulated single sideband (AM-SSB) were absent: these signals were only noise. DeepSig has several other data sets on offer at the time of this writing:

In this post, I’ll present a few thoughts and results for the “Larger Version” of RML2016.10a.tar.bz2, which is called RML2016.10b.tar.bz2. This is a good post to offer because it is coherent with the first RML post, but also because more papers are being published that use the RML 10b data set, and of course more such papers are in review. Maybe the offered analysis here will help reviewers to better understand and critique the machine-learning papers. The latter do not ever contain any side analysis or validation of the RML data sets (let me know if you find one that does in the Comments below), so we can’t rely on the machine learners to assess their inputs.

Continue reading “More on DeepSig’s RML Data Sets”

Blog Notes: New Page with All CSP Blog Posts in Chronological Order

To aid navigating the CSP Blog, I’ve added a new page called “All CSP Blog Posts.” You can find the page link at the top of the home page, or in various lists on the right side of the Blog, such as “Pages” and “Site Navigation.”

Let me know in the Comments if there are other ways that you think I can improve the usability of the site.

h/t: Reader Clint.

All BPSK Signals

Update June 2020

I’ll be adding new papers to this post as I find them. At the end of the original post there is a sequence of date-labeled updates that briefly describe the relevant aspects of the newly found papers.

Continue reading “All BPSK Signals”

Professor Jang Again Tortures CSP Mathematics Until it Breaks

We first met Professor Jang in a “Comments on the Literature” type of post from 2016. In that post, I pointed out fundamental mathematical errors contained in a paper the Professor published in the IEEE Communications Letters in 2014 (The Literature [R71]).

I have just noticed a new paper by Professor Jang, published in the journal IEEE Access, which is a peer-reviewed journal, like the Communications Letters. This new paper is titled “Simultaneous Power Harvesting and Cyclostationary Spectrum Sensing in Cognitive Radios” (The Literature [R144]). Many of the same errors are present in this paper. In fact, the beginning of the paper, and the exposition on cyclostationary signal processing is nearly the same as in The Literature [R71].

Let’s take a look.

Continue reading “Professor Jang Again Tortures CSP Mathematics Until it Breaks”

Symmetries of Higher-Order Temporal Probabilistic Parameters in CSP

In this post, we continue our study of the symmetries of CSP parameters. The second-order parameters–spectral correlation and cyclic correlation–are covered in detail in the companion post, including the symmetries for ‘auto’ and ‘cross’ versions of those parameters.

Here we tackle the generalizations of cyclic correlation: cyclic temporal moments and cumulants. We’ll deal with the generalization of the spectral correlation function, the  cyclic polyspectra, in a subsequent post. It is reasonable to me to focus first on the higher-order temporal parameters, because I consider the temporal parameters to be much more useful in practice than the spectral parameters.

This topic is somewhat harder and more abstract than the second-order topic, but perhaps there are bigger payoffs in algorithm development for exploiting symmetries in higher-order parameters than in second-order parameters because the parameters are multidimensional. So it could be worthwhile to sally forth.

Continue reading “Symmetries of Higher-Order Temporal Probabilistic Parameters in CSP”

New Look for a New Year and New Decade

2020 is the fifth full year of existence for the CSP Blog, and the beginning of a new decade that will be full of CSP explorations. I thought I’d freshen up the look of the Blog, so I’ve switched the theme. It is a cleaner look with fewer colors and no more hexagons. I’m not completely happy with it, so I might change it yet again. Let me know if you have problems viewing the content or posting a comment (cmspooner at ieee dot org).

Happy New Year to all my readers!

Symmetries of Second-Order Probabilistic Parameters in CSP

As you progress through the various stages of learning CSP (intimidation, frustration, elucidation, puzzlement, and finally smooth operation), the symmetries of the various functions come up over and over again. Exploiting symmetries can result in lower computational costs, quicker debugging, and easier mathematical development.

What exactly do we mean by ‘symmetries of parameters?’ I’m talking primarily about the evenness or oddness of the time-domain functions in the delay \tau and cycle frequency \alpha variables and of the frequency-domain functions in the spectral frequency f and cycle frequency \alpha variables. Or a generalized version of evenness/oddness, such as f(-x) = g(x), where f(x) and g(x) are closely related functions. We have to consider the non-conjugate and conjugate functions separately, and we’ll also consider both the auto and cross versions of the parameters. We’ll look at higher-order cyclic moments and cumulants in a future post.

You can use this post as a resource for mathematical development because I present the symmetry equations. But also each symmetry result is illustrated using estimated parameters via the frequency smoothing method (FSM) of spectral correlation function estimation. The time-domain parameters are obtained from the inverse transforms of the FSM parameters. So you can also use this post as an extension of the second-order verification guide to ensure that your estimator works for a wide variety of input parameters.

Continue reading “Symmetries of Second-Order Probabilistic Parameters in CSP”

The Ambiguity Function and the Cyclic Autocorrelation Function: Are They the Same Thing?

Let’s talk about ambiguity and correlation. The ambiguity function is a core component of radar signal processing practice and theory. The autocorrelation function and the cyclic autocorrelation function, are key elements of generic signal processing and cyclostationary signal processing, respectively. Ambiguity and correlation both apply a quadratic functional to the data or signal of interest, and they both weight that quadratic functional by a complex exponential (sine wave) prior to integration or summation.

Are they the same thing? Well, my answer is both yes and no.

Continue reading “The Ambiguity Function and the Cyclic Autocorrelation Function: Are They the Same Thing?”

CSP Resources: The Ultimate Guides to Cyclostationary Random Processes by Professor Napolitano

My friend and colleague Antonio Napolitano has just published a new book on cyclostationary signals and cyclostationary signal processing:

Cyclostationary Processes and Time Series: Theory, Applications, and Generalizations, Academic Press/Elsevier, 2020, ISBN: 978-0-08-102708-0. The book is a comprehensive guide to the structure of cyclostationary random processes and signals, and it also provides pointers to the literature on many different applications. The book is mathematical in nature; use it to deepen your understanding of the underlying mathematics that make CSP possible.

You can check out the book on amazon.com using the following link:

Cyclostationary Processes and Time Series

Continue reading “CSP Resources: The Ultimate Guides to Cyclostationary Random Processes by Professor Napolitano”

On Impulsive Noise, CSP, and Correntropy

I’ve seen several published and pre-published (arXiv.org) technical papers over the past couple of years on the topic of cyclic correntropy (The Literature [R123-R127]). I first criticized such a paper ([R123]) here, but the substance of that review was about my problems with the presented mathematics, not impulsive noise and its effects on CSP. Since the papers keep coming, apparently, I’m going to put down some thoughts on impulsive noise and some evidence regarding simple means of mitigation in the context of CSP. Preview: I don’t think we need to go to the trouble of investigating cyclic correntropy as a means of salvaging CSP from the clutches of impulsive noise.

Continue reading “On Impulsive Noise, CSP, and Correntropy”

Sponsoring the CSP Blog

I’ve decided to solicit donations to the CSP Blog through PayPal. For the past four years, I’ve been writing blog posts and doing my best to answer comments at no cost to my readers. And it has turned out very well indeed, thanks to all the people that stop by to read and contribute.

Continue reading “Sponsoring the CSP Blog”

For the Beginner at CSP

Here is a list of links to CSP Blog posts that I think are suitable for a beginner: read them in the order given.

How to Obtain Help from the CSP Blog

Introduction to CSP

How to Create a Simple Cyclostationary Signal: Rectangular-Pulse BPSK

The Cyclic Autocorrelation Function

The Spectral Correlation Function

The Cyclic Autocorrelation for BPSK

Continue reading “For the Beginner at CSP”

A Gallery of Cyclic Correlations

There are some situations in which the spectral correlation function is not the preferred measure of (second-order) cyclostationarity. In these situations, the cyclic autocorrelation (non-conjugate and conjugate versions) may be much simpler to estimate and work with in terms of detector, classifier, and estimator structures. So in this post, I’m going to provide plots of the cyclic autocorrelation for each of the signals in the spectral correlation gallery post. The exceptions are those signals I called feature-rich in the spectral correlation gallery post, such as LTE and radar. Recall that such signals possess a large number of cycle frequencies, and plotting their three-dimensional spectral correlation surface is not helpful as it is difficult to interpret with the human eye. So for the cycle-frequency patterns of feature-rich signals, we’ll rely on the stem-style (cyclic-domain profile) plots in the gallery post.

Continue reading “A Gallery of Cyclic Correlations”

On The Shoulders

What modest academic success I’ve had in the area of cyclostationary signal theory and cyclostationary signal processing is largely due to the patient mentorship of my doctoral adviser, William (Bill) Gardner, and the fact that I was able to build on an excellent foundation put in place by Gardner, his advisor Lewis Franks, and key Gardner students such as William (Bill) Brown.

Continue reading “On The Shoulders”

Simple Synchronization Using CSP

In this post I discuss the use of cyclostationary signal processing applied to communication-signal synchronization problems. First, just what are synchronization problems? Synchronize and synchronization have multiple meanings, but the meaning of synchronize that is relevant here is something like:

syn·chro·nize: To cause to occur or operate with exact coincidence in time or rate

If we have an analog amplitude-modulated (AM) signal (such as voice AM used in the AM broadcast bands) at a receiver we want to remove the effects of the carrier sine wave, resulting in an output that is only the original voice or music message. If we have a digital signal such as binary phase-shift keying (BPSK), we want to remove the effects of the carrier but also sample the message signal at the correct instants to optimally recover the transmitted bit sequence. 

Continue reading “Simple Synchronization Using CSP”

100,000 Page Views!

The CSP Blog has reached 100,000 page views! Also, a while back it passed the “20,000 visitors” milestone. All of this for 53 posts and 10 pages. More to come!

yearly_totals

I started the CSP Blog in late 2015, so it has taken a bit over three years to get to 100,000 views. I don’t know if that should be considered fast or slow. But I like it anyway.

I want to thank each and every one of the visitors to the CSP Blog. It has reached so many more people that I though it ever would when I started it.

Thank you for all your clicks, comments, emails, and downloads! If you’d like, leave a comment to this post if you have an idea for a post you’d like to see.

Below the fold, some graphics that show the vital statistics of the CSP Blog as of the 100,000 page-view milestone.

Continue reading “100,000 Page Views!”

Can a Machine Learn a Power Spectrum Estimator?

I continue with my foray into machine learning (ML) by considering whether we can use widely available ML tools to create a machine that can output accurate power spectrum estimates. Previously we considered the perhaps simpler problem of learning the Fourier transform. See here and here.

Along the way I’ll expose my ignorance of the intricacies of machine learning and my apparent inability to find the correct hyperparameter settings for any problem I look at. But, that’s where you come in, dear reader. Let me know what to do!

Continue reading “Can a Machine Learn a Power Spectrum Estimator?”

Data Set for the Machine-Learning Challenge

Update September 2020. I made a mistake when I created the signal-parameter “truth” files signal_record.txt and signal_record_first_20000.txt. Like the DeepSig RML data sets that I analyzed on the CSP Blog here and here, the SNR parameter in the truth files did not match the actual SNR of the signals in the data files. I’ve updated the truth files and the links below. You can still use the original files for all other signal parameters, but the SNR parameter was in error.

Update July 2020. I originally posted 20,000 signals in the posted data set. I’ve now added another 92,000 for a total of 112,000 signals. The original signals are contained in Batches 1-5, the additional signals in Batches 6-28. I’ve placed these additional Batches at the end of the post to preserve the original post’s content.

I’ve posted 20000 PSK/QAM signals to the CSP Blog. These are the signals I refer to in the post I wrote challenging the machine-learners. In this brief post, I provide links to the data and describe how to interpret the text file containing the signal-type labels and signal parameters.

Overview of Data Set

The 20,000 signals are stored in five zip files, each containing 4000 individual signal files:

Batch 1

Batch 2

Batch 3

Batch 4

Batch 5

The zip files are each about 1 GB in size.

The modulation-type labels for the signals, such as “BPSK” or “MSK,” are contained in the text file:

signal_record_first_20000.txt

Each signal file is stored in a binary format involving interleaved real and imaginary parts, which I call ‘.tim’ files. You can read a .tim file into MATLAB using read_binary.m. Or use the code inside read_binary.m to write your own data-reader; the format is quite simple.

The Label and Parameter File

Let’s look at the format of the truth/label file. The first line of signal_record_first_20000.txt is

1 bpsk  11  -7.4433467080e-04  9.8977795076e-01  10  9  5.4532617590e+00  0.0

which comprises 9 fields. All temporal and spectral parameters (times and frequencies) are normalized with respect to the sampling rate. In other words, the sampling rate can be taken to be unity in this data set. These fields are described in the following list:

  1. Signal index. In the case above this is `1′ and that means the file containing the signal is called signal_1.tim. In general, the nth signal is contained in the file signal_n.tim. The Batch 1 zip file contains signal_1.tim through signal_4000.tim.
  2. Signal type. A string indicating the modulation format of the signal in the file. For this data set, I’ve only got eight modulation types: BPSK, QPSK, 8PSK, \pi/4-DQPSK, 16QAM, 64QAM, 256QAM, and MSK. These are denoted by the strings bpsk, qpsk, 8psk, dqpsk, 16qam, 64qam, 256qam, and msk, respectively.
  3. Base symbol period. In the example above (line one of the truth file), the base symbol period is T_0 = 11.
  4. Carrier offset. In this case, it is -7.4433467080\times 10^{-4}.
  5. Excess bandwidth. The excess bandwidth parameter, or square-root raised-cosine roll-off parameter, applies to all of the signal types except MSK. Here it is 9.8977795076\times 10^{-1}. It can be any real number between 0.1 and 1.0.
  6. Upsample factor. The sixth field is an upsampling parameter U.
  7. Downsample factor. The seventh field is a downsampling parameter D. The actual symbol rate of the signal in the file is computed from the base symbol period, upsample factor, and downsample factor: \displaystyle f_{sym} = (1/T_0)*(D/U). So the BPSK signal in signal_1.tim has rate 0.08181818. If the downsample factor is zero in the truth-parameters file, no resampling was done to the signal.
  8. Inband SNR (dB). The ratio of the signal power to the noise power within the signal’s bandwidth, taking into account the signal type and the excess bandwidth parameter.
  9. Noise spectral density (dB). It is always 0 dB. So the various SNRs are generated by varying the signal power.

To ensure that you have correctly downloaded and interpreted my data files, I’m going to provide some PSD plots and a couple of the actual sample values for a couple of the files.

signal_1.tim

The line from the truth file is:

1 bpsk  11  -7.4433467080e-04  9.8977795076e-01  10  9  5.4532617590e+00  0.0

The first ten samples of the file are:

-5.703014e-02   -6.163056e-01
-1.285231e-01   -6.318392e-01
6.664069e-01    -7.007506e-02
7.731103e-01    -1.164615e+00
3.502680e-01    -1.097872e+00
7.825349e-01    -3.721564e-01
1.094809e+00    -3.123962e-01
4.146149e-01    -5.890701e-01
1.444665e+00    7.358724e-01
-2.217039e-01   -1.305001e+00

An FSM-based PSD estimate for signal_1.tim is:

psd_1

And the blindly estimated cycle frequencies (using the SSCA) are:

cfs_signal_1

The previous plot corresponds to the numerical values:

Non-conjugate (\alpha, C, S):

8.181762695e-02  7.480e-01  5.406e+00

Conjugate (\alpha, C, S):

8.032470942e-02  7.800e-01  4.978e+00
-1.493096002e-03  8.576e-01  1.098e+01
-8.331298083e-02  7.090e-01  5.039e+00

signal_4000.tim

The line from the truth file is

4000 256qam  9  8.3914849139e-04  7.2367959637e-01  9  8  1.0566301192e+01  0.0

which means the symbol rate is given by (1/9)*(8/9) = 0.09876543209. The carrier offset is 0.000839 and the excess bandwidth is 0.723. Because the signal type is 256QAM, it has a single (non-zero) non-conjugate cycle frequency of 0.098765 and no conjugate cycle frequencies. But the square of the signal has cycle frequencies related to the quadrupled carrier:

cfs_signal_4000

Final Thoughts

Is 20000 waveforms a large enough data set? Maybe not. I have generated tens of thousands more, but will not post until there is a good reason to do so. And that, my friends, is up to you!

That’s about it. I think that gives you enough information to ensure that you’ve interpreted the data and the labels correctly. What remains is experimentation, machine-learning or otherwise I suppose. Please get back to me and the readers of the CSP Blog with any interesting results using the Comments section of this post or the Challenge post.

For my analysis of a commonly used machine-learning modulation-recognition data set (RML), see the All BPSK Signals post.

Additional Batches of Signals:

Batch 6

Batch 7

Batch 8

Batch 9

Batch 10

Batch 11

Batch 12

Batch 13

Batch 14

Batch 15

Batch 16

Batch 17

Batch 18

Batch 19

Batch 20

Batch 21

Batch 22

Batch 23

Batch 24

Batch 25

Batch 26

Batch 27

Batch 28

Signal parameters text file

MATLAB’s SSCA: commP25ssca.m

In this short post, I describe some errors that are produced by MATLAB’s strip spectral correlation analyzer function commP25ssca.m. I don’t recommend that you use it; far better to create your own function.

Continue reading “MATLAB’s SSCA: commP25ssca.m”