Radio Frequency Scene Analysis – Cyclostationary Signal Processing

Interference Mitigation Course at GTRI

Update December 2024: The likely date for this course at GTRI is February 4-5, 2025.

Update September 2024: This course is postponed until Spring 2025. I’ll post further updates here as they become available.

I’ll be part of a team of researchers and practicing engineers, led by the estimable Dr. Ryan Westafer, that will be teaching a class on radio-frequency interference mitigation in September. The class is hosted by the Georgia Tech Research Institute (GTRI) and will be held on the Georgia Tech campus on September 10-11, 2024.

Desultory CSP: What’s That Under the TV?

“Alive in the Superunknown
First it steals your Mind, and then it steals your … Soul”

–Soundgarden

An advantage of using and understanding the statistics of communication signals ™, the basics of signal processing, and the rich details of cyclostationary signal processing is that a practitioner can deal with, to some useful degree, unknown unknowns. The unknown unknowns I’m talking about here on the CSP Blog are, of course, signals. We know about the by-now-familiar known-type detection, multi-class modulation-recognition, and RF scene-analysis problems, in which it is often assumed that we know the signals we are looking for, but we don’t know their times of arrival, some of their parameters, or how they might overlap in time, frequency, and space. Then there are the less-familiar problems involving unknown unknowns.

Sometimes we just don’t know the signals we are looking for. We still want to do as good a job on RF scene analysis as we can, but there might be signals in the scene that do not conform to the body of knowledge we have, to date, of manmade RF signals. Or, in modern parlance, we didn’t even know we left such signals out of our neural-network training dataset; we’re a couple steps back from even worrying about generalization, because we don’t even know we can’t generalize since we are ignorant about what to generalize to.

In this post I look at the broadcast TV band, seen in downtown Monterey, California, sometime in the recent past. I expect to see ATSC DTV signals (of the older 8VSB/16VSB or the newer OFDM types), and I do. But what else is there? Spoiler: Unknown unknowns.

Let’s take a look.

CSPB.ML.2018R2: Correcting an RNG Flaw in CSPB.ML.2018

KIRK: Everything that is in error must be sterilised.
NOMAD: There are no exceptions.
KIRK: Nomad, I made an error in creating you.
NOMAD: The creation of perfection is no error.
KIRK: I did not create perfection. I created error.

I’ve had to update the original Challenge for the Machine Learners post, and the associated dataset post, a couple times due to flaws in my metadata (truth) files. Those were fairly minor, so I just updated the original posts.

But a new flaw in CSPB.ML.2018 and CSPB.ML.2022 has come to light due to the work of the estimable research engineers at Expedition Technology. The problem is not with labeling or the fundamental correctness of the modulation types, pulse functions, etc., but with the way a random-number generator was applied in my multi-threaded dataset-generation technique.

I’ll explain after the fold, and this post will provide links to an updated version of the dataset, CSPB.ML.2018R2. I’ll keep the original up for continuity and also place a link to this post there. Moreover, the descriptions of the truth files over at CSPB.ML.2018 are still valid–the truth file posted here has the same format as the truth files available on the CSPB.ML.2018 and CSPB.ML.2022 posts.

Is Radio-Frequency Scene Analysis a Wicked Problem?

‘By the pricking of my thumbs, something wicked this way comes …’ Macbeth by W. Shakespeare

I attended a conference on dynamic spectrum access in 2017 and participated in a session on automatic modulation recognition. The session was connected to a live competition within the conference where participants would attempt to apply their modulation-recognition system to signals transmitted in the conference center by the conference organizers. Like a grand modulation-recognition challenge but confined to the temporal, spectral, and spatial constraints imposed by the short-duration conference.

What I didn’t know going in was the level of frustration on the part of the machine-learner organizers regarding the seeming inability of signal-processing and machine-learning researchers to solve the radio-frequency scene analysis problem once and for all. The basic attitude was ‘if the image-processors can have the AlexNet image-recognition solution, and thereby abandon their decades-long attempt at developing serious mathematics-based image-processing theory and practice, why haven’t we solved the RFSA problem yet?’

PSK/QAM Cochannel Dataset for Modulation Recognition Researchers [CSPB.ML.2023]

The next step in dataset complexity at the CSP Blog: cochannel signals.

I’ve developed another dataset for use in assessing modulation-recognition algorithms (machine-learning-based or otherwise) that is more complex than the original sets I posted for the ML Challenge (CSPB.ML.2018 and CSPB.ML.2022). Half of the new dataset consists of one signal in noise and the other half consists of two signals in noise. In most cases the two signals overlap spectrally, which is a signal condition called cochannel interference.

We’ll call it CSPB.ML.2023.

Neural Networks for Modulation Recognition: IQ-Input Networks Do Not Generalize, but Cyclic-Cumulant-Input Networks Generalize Very Well

Neural networks with CSP-feature inputs DO generalize in the modulation-recognition problem setting.

In some recently published papers (My Papers [50,51]), my ODU colleagues and I showed that convolutional neural networks and capsule networks do not generalize well when their inputs are complex-valued data samples, commonly referred to as simply IQ samples, or as raw IQ samples by machine learners.(Unclear why the adjective ‘raw’ is often used as it adds nothing to the meaning. If I just say Hey, pass me those IQ samples, would ya?, do you think maybe he means the processed ones? How about raw-I-mean–seriously-man–I-did-not-touch-those-numbers-OK? IQ samples? All-natural vegan unprocessed no-GMO organic IQ samples? Uncooked IQ samples?) Moreover, the capsule networks typically outperform the convolutional networks.

In a new paper (MILCOM 2022: My Papers [52]; arxiv.org version), my colleagues and I continue this line of research by including cyclic cumulants as the inputs to convolutional and capsule networks. We find that capsule networks outperform convolutional networks and that convolutional networks trained on cyclic cumulants outperform convolutional networks trained on IQ samples. We also find that both convolutional and capsule networks trained on cyclic cumulants generalize perfectly well between datasets that have different (disjoint) probability density functions governing their carrier frequency offset parameters.

That is, convolutional networks do better recognition with cyclic cumulants and generalize very well with cyclic cumulants.

So why don’t neural networks ever ‘learn’ cyclic cumulants with IQ data at the input?

The majority of the software and analysis work is performed by the first author, John Snoap, with an assist on capsule networks by James Latshaw. I created the datasets we used (available here on the CSP Blog [see below]) and helped with the blind parameter estimation. Professor Popescu guided us all and contributed substantially to the writing.

What is the Minimum Effort Required to Find ‘Related Work?’: Comments on Some Spectrum-Sensing Literature by N. West [R176] and T. Yucek [R178]

Starts as a personal gripe, but ends with weird stuff from the literature.

During my poking around on arxiv.org the other day (Grrrrr…), I came across some postings by O’Shea et al I’d not seen before, including The Literature [R176]: “Wideband Signal Localization and Spectral Segmentation.”

Huh, I thought, they are probably trying to train a neural network to do automatic spectral segmentation that is superior to my published algorithm (My Papers [32]). Yeah, no. I mean yes to a machine, no to nods to me. Let’s take a look.

One Last Time …

We take a quick look at a fourth DeepSig dataset called 2016.04C.multisnr.tar.bz2 in the context of the data-shift problem in machine learning.

And if we get this right,

We’re gonna teach ’em how to say

Goodbye …

You and I.
Lin-Manuel Miranda, “One Last Time,” Hamilton

I didn’t expect to have to do this, but I am going to analyze yet another DeepSig dataset. One last time. This one is called 2016.04C.multisnr.tar.bz2, and is described thusly on the DeepSig website:

Figure 1. Description of various DeepSig data sets found on the DeepSig website as of November 2021.

I’ve analyzed the 2018 dataset here, the RML2016.10b.tar.bz2 dataset here, and the RML2016.10a.tar.bz2 dataset here.

Now I’ve come across a manuscript-in-review in which both the RML2016.10a and RML2016.04c data sets are used. The idea is that these two datasets represent two sufficiently distinct datasets so that they are good candidates for use in a data-shift study involving trained neural-network modulation-recognition systems.

The data-shift problem is, as one researcher puts it:

Data shift or data drift, concept shift, changing environments, data fractures are all similar terms that describe the same phenomenon: the different distribution of data between train and test sets
Georgios Sarantitis

But … are they really all that different?

J. Antoni’s Fast Spectral Correlation Estimator

The Fast Spectral Correlation estimator is a quick way to find small cycle frequencies. However, its restrictions render it inferior to estimators like the SSCA and FAM.

Update 2023: I continue with critical analysis of Antoni’s work as applied to ‘using and understanding the statistics of communication signals‘ in a follow-on post.

In this post we take a look at an alternative CSP estimator created by J. Antoni et al (The Literature [R152]). The paper describing the estimator can be found here, and you can get some corresponding MATLAB code, posted by the authors, here if you have a Mathworks account.

Desultory CSP: More Signals from SigIDWiki.com

More real-world data files from SigIDWiki.com. The range of spectral correlation function types exhibited by man-made RF signals is vast.

Let’s look at a few more signals posted to sigidwiki.com. Just for fun.

Cyclostationarity of DMR Signals

Let’s take a brief look at the cyclostationarity of a captured DMR signal. It’s more complicated than one might think.

In this post I look at the cyclostationarity of a digital mobile radio (DMR) signal empirically. That is, I have a captured DMR signal from sigidwiki.com, and I apply blind CSP to it to determine its cycle frequencies and spectral correlation function. The signal is arranged in frames or slots, with gaps between successive slots, so there is the chance that we’ll see cyclostationarity due to the on-burst (or on-frame) signaling and cyclostationarity due to the framing itself.

Comments on “Deep Neural Network Feature Designs for RF Data-Driven Wireless Device Classification,” by B. Hamdaoui et al

Another post-publication review of a paper that is weak on the ‘RF’ in RF machine learning.

Let’s take a look at a recently published paper (The Literature [R148]) on machine-learning-based modulation-recognition to get a data point on how some electrical engineers–these are more on the side of computer science I believe–use mathematics when they turn to radio-frequency problems. You can guess it isn’t pretty, and that I’m not here to exalt their acumen.

More on DeepSig’s RML Datasets

The second DeepSig data set I analyze: SNR problems and strange PSDs.

I presented an analysis of one of DeepSig’s earlier modulation-recognition datasets (RML2016.10a.tar.bz2) in the post on All BPSK Signals. There we saw several flaws in the dataset as well as curiosities. Most notably, the signals in the dataset labeled as analog amplitude-modulated single sideband (AM-SSB) were absent: these signals were only noise. DeepSig has several other datasets on offer at the time of this writing:

In this post, I’ll present a few thoughts and results for the “Larger Version” of RML2016.10a.tar.bz2, which is called RML2016.10b.tar.bz2. This is a good post to offer because it is coherent with the first RML post, but also because more papers are being published that use the RML 10b dataset, and of course more such papers are in review. Maybe the offered analysis here will help reviewers to better understand and critique the machine-learning papers. The latter do not ever contain any side analysis or validation of the RML datasets (let me know if you find one that does in the Comments below), so we can’t rely on the machine learners to assess their inputs. (Update: I analyze a third DeepSig dataset here. And a fourth and final one here.)

All BPSK Signals

An analysis of DeepSig’s 2016.10A dataset, used in many published machine-learning papers, and detailed comments on quite a few of those papers.

Update March 2021

See my analyses of three other DeepSig datasets here, here, and here.

Update June 2020

I’ll be adding new papers to this post as I find them. At the end of the original post there is a sequence of date-labeled updates that briefly describe the relevant aspects of the newly found papers. Some machine-learning modulation-recognition papers deserve their own post, so check back at the CSP Blog from time-to-time for “Comments On …” posts.

Continue reading “All BPSK Signals”

A Gallery of Cyclic Correlations

For your delectation.

There are some situations in which the spectral correlation function is not the preferred measure of (second-order) cyclostationarity. In these situations, the cyclic autocorrelation (non-conjugate and conjugate versions) may be much simpler to estimate and work with in terms of detector, classifier, and estimator structures. So in this post, I’m going to provide surface plots of the cyclic autocorrelation for each of the signals in the spectral correlation gallery post. The exceptions are those signals I called feature-rich in the spectral correlation gallery post, such as DSSS, LTE, and radar. Recall that such signals possess a large number of cycle frequencies, and plotting their three-dimensional spectral correlation surface is not helpful as it is difficult to interpret with the human eye. So for the cycle-frequency patterns of feature-rich signals, we’ll rely on the stem-style (cyclic-domain profile) plots that I used in the spectral correlation gallery post.

Continue reading “A Gallery of Cyclic Correlations”

Dataset for the Machine-Learning Challenge [CSPB.ML.2018]

A PSK/QAM/SQPSK data set with randomized symbol rate, inband SNR, carrier-frequency offset, and pulse roll-off.

Update April 2025: All but the first five batch files have been removed. I needed to make space since WordPress has a hard limit on storage. Use CSPB.ML.2018R2 in any case.

Update September 2023: A randomization flaw has been found and fixed for CSPB.ML.2018, resulting in CSPB.ML.2018R2. Use that one going forward.

Update February 2023: I’ve posted a third challenge dataset here. It is CSPB.ML.2023 and features cochannel signals.

Update April 2022. I’ve also posted a second dataset here. This new dataset is similar to the original ML Challenge dataset except the random variable representing the carrier frequency offset has a slightly different distribution.

If you refer to either of the posted datasets in a published paper, please use the following designators, which I am also using in papers I’m attempting to publish:

Original ML Challenge Dataset: CSPB.ML.2018.

Shifted ML Challenge Dataset: CSPB.ML.2022.

Update September 2020. I made a mistake when I created the signal-parameter “truth” files signal_record.txt and signal_record_first_20000.txt. Like the DeepSig RML datasets that I analyzed on the CSP Blog here and here, the SNR parameter in the truth files did not match the actual SNR of the signals in the data files. I’ve updated the truth files and the links below. You can still use the original files for all other signal parameters, but the SNR parameter was in error.

Update July 2020. I originally posted $20,000$ signals in the posted dataset. I’ve now added another $92,000$ for a total of $112,000$ signals. The original signals are contained in Batches 1-5, the additional signals in Batches 6-28. I’ve placed these additional Batches at the end of the post to preserve the original post’s content.

MATLAB’s SSCA: commP25ssca.m

In this short post, I describe some errors that are produced by MATLAB’s strip spectral correlation analyzer function commP25ssca.m. I don’t recommend that you use it; far better to create your own function.

Continue reading “MATLAB’s SSCA: commP25ssca.m”

A Challenge for the Machine Learners

The machine-learning modulation-recognition community consistently claims vastly superior performance to anything that has come before. Let’s test that.

Update September 2023: A randomization flaw has been found and fixed for CSPB.ML.2018, resulting in CSPB.ML.2018R2. Use that one going forward.

Update February 2023: A third dataset has been posted here. This new dataset, CSPB.ML.2023, features cochannel signals.

Update April 2022: I’ve also posted a second dataset here. This new dataset is similar to the original ML Challenge dataset except the random variable representing the carrier frequency offset has a slightly different distribution.

If you refer to any of the posted datasets in a published paper, please use the following designators, which I am also using in papers I’m attempting to publish:

Original ML Challenge Dataset: CSPB.ML.2018.

Shifted ML Challenge Dataset: CSPB.ML.2022.

Cochannel ML Dataset: CSPB.ML.2023.

Update February 2019

I’ve decided to post the data set I discuss here to the CSP Blog for all interested parties to use. See the new post on the Data Set. If you do use it, please let me and the CSP Blog readers know how you fared with your experiments in the Comments section of either post. Thanks!

Continue reading “A Challenge for the Machine Learners”

CSP Estimators: The FFT Accumulation Method

An alternative to the strip spectral correlation analyzer.

Let’s look at another spectral correlation function estimator: the FFT Accumulation Method (FAM). This estimator is in the time-smoothing category, is exhaustive in that it is designed to compute estimates of the spectral correlation function over its entire principal domain, and is efficient, so that it is a competitor to the Strip Spectral Correlation Analyzer (SSCA) method. I implemented my version of the FAM by using the paper by Roberts et al (The Literature [R4]). If you follow the equations closely, you can successfully implement the estimator from that paper. The tricky part, as with the SSCA, is correctly associating the outputs of the coded equations to their proper $\displaystyle (f, \alpha)$ values.

Continue reading “CSP Estimators: The FFT Accumulation Method”

‘Can a Machine Learn the Fourier Transform?’ Redux, Plus Relevant Comments on a Machine-Learning Paper by M. Kulin et al.

Reconsidering my first attempt at teaching a machine the Fourier transform with the help of a CSP Blog reader. Also, the Fourier transform is viewed by Machine Learners as an input data representation, and that representation matters.

I first considered whether a machine (neural network) could learn the (64-point, complex-valued) Fourier transform in this post. I used MATLAB’s Neural Network Toolbox and I failed to get good learning results because I did not properly set the machine’s hyperparameters. A kind reader named Vito Dantona provided a comment to that original post that contained good hyperparameter selections, and I’m going to report the new results here in this post.

Since the Fourier transform is linear, the machine should be set up to do linear processing. It can’t just figure that out for itself. Once I used Vito’s suggested hyperparameters to force the machine to be linear, the results became much better:

Continue reading “‘Can a Machine Learn the Fourier Transform?’ Redux, Plus Relevant Comments on a Machine-Learning Paper by M. Kulin et al.”

Chad Spooner on Watch Out!May 25, 2026
Welcome to the CSP Blog Tim! Thanks for the thoughtful comment. I've come across the substack called Slow AI by…
Chad Spooner on PSK/QAM Cochannel Dataset for Modulation Recognition Researchers [CSPB.ML.2023]May 12, 2026
Welcome to the CSP Blog Muhammad! Thanks for reaching out and for your interest in CSPB.ML.2023. It will take some…
Muhammad Zakir Khan on PSK/QAM Cochannel Dataset for Modulation Recognition Researchers [CSPB.ML.2023]May 11, 2026
Great post and i am really intrested to move forward. can i get the full dataset link to process?
Tim Meehan on Watch Out!March 20, 2026
Great article Chad, AI use in research has made an existing problem: sloppy research. I have found LLMs very useful…
Simon Clift on SPTK: Interconnection of Linear SystemsMarch 18, 2026
I'll happily defer to you, at least until I can say something coherent. I'm a mathematician and exploring some connections…
RUI WU on Latest Paper on CSP and Deep-Learning for Modulation Recognition: An Extended Version of My Papers [52]March 11, 2026
Thank you very much for your helpful explanation. I noticed that both Ref. 52 and Ref. 56 show relatively weak…
Chad Spooner on Latest Paper on CSP and Deep-Learning for Modulation Recognition: An Extended Version of My Papers [52]March 8, 2026
Welcome to the CSP Blog Rui! Thanks for the questions. Does this mean we don’t need to compute all (11…
RUI WU on Latest Paper on CSP and Deep-Learning for Modulation Recognition: An Extended Version of My Papers [52]March 7, 2026
Hi Chad, thank you so much for your continued contributions to the CSP community. I'm currently very interested in your…
Chad Spooner on SPTK: The Matched FilterMarch 4, 2026
Welcome to the CSP Blog Charles! Thanks for the question. The answer is yes, provided the note is a periodic…
Chad Spooner on SPTK: Interconnection of Linear SystemsMarch 4, 2026
Hi Simon! Welcome to the CSP Blog. I don't see the connection beyond the simple category of "signal processing." The…