Machine Learning – Cyclostationary Signal Processing

The CSP Blog Turns 10

Raise a glass!

I launched the site way back in September 2015. As with most things in my life, the CSP Blog was not the result of some carefully crafted plan, such as to corner the online market on signal-processing instruction, create a side-hustle, or manage my brand, whatever that might mean. It was a lark. I wanted my wife to start a blog or website as a place to share her writing with the world. “Look, dear, it really is super easy to create your own website,” I said to her, after writing a post or two. And it really is easy.

But then the CSP Blog took on a life of its own. Or, better said, it took over my life. Certainly it took a lot of my time, and still does.

The Big Time

“The place where I come from is a small town
They think so small, they use small words
But not me, I’m smarter than that
I worked it out
I’ve been stretching my mouth
To let those big words come right out”

‘Big Time’ by Peter Gabriel

The CSP Blog is now linked-to at the top of cyclostationarity.com, Professor Gardner’s online repository of all things cyclostationary! (See also The Literature [R1].)

CSPB.ML.2018R2.NF

A noise-free version of the 2018 CSP Blog dataset CSPB.ML.2018R2 is posted here. This allows researchers to correctly apply propagation-channel effects to the generated signals, and to easily add their own noise at whatever level they wish.

The format of the files is the same as CSPB.ML.2018R2, and the truth parameters for each file are the same as the truth parameters for the corresponding file in CSPB.ML.2018R2, except for SNR, which is infinite.
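As a rough illustration of the "add your own noise" step, here is a minimal Python sketch using only the standard library. It makes some assumptions that are not from the dataset post: the function name `add_awgn` is made up, the stand-in signal is a synthetic unit-power QPSK-like sequence rather than a real CSPB.ML.2018R2.NF file, and SNR is defined as total signal power over total noise power across the full sampled bandwidth, which may not match the SNR definition used in the truth files.

```python
import math
import random

def add_awgn(signal, snr_db, rng):
    """Return signal plus complex white Gaussian noise at the requested SNR (dB)."""
    # Average power of the noise-free complex samples.
    signal_power = sum(abs(x) ** 2 for x in signal) / len(signal)
    # Total noise power that yields the requested SNR (assumed full-band definition).
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    # Split the noise power equally between the real and imaginary parts.
    sigma = math.sqrt(noise_power / 2.0)
    return [x + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for x in signal]

# Hypothetical stand-in for a noise-free file: unit-power QPSK-like samples.
rng = random.Random(0)
clean = [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) / math.sqrt(2)
         for _ in range(32768)]
noisy = add_awgn(clean, snr_db=5.0, rng=rng)

# Measure the SNR actually achieved (the clean signal power is 1 by construction).
measured_noise_power = sum(abs(n - c) ** 2 for n, c in zip(noisy, clean)) / len(clean)
measured_snr_db = 10.0 * math.log10(1.0 / measured_noise_power)
```

Because the noise is added after any channel model is applied to the clean signal, the same sketch works whether or not a propagation channel has been simulated first.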

Final Snoap Doctoral-Work Journal Paper: My Papers [56] on Novel Network Layers for Modulation Recognition that Generalizes

Dr. Snoap’s final journal paper related to his recently completed doctoral work has been published in IEEE Transactions on Broadcasting (My Papers [56]).

CSPB.ML.2023G1

Another dataset aimed at the continuing problem of generalization in machine-learning-based modulation recognition. This one is a companion to CSPB.ML.2023, which features cochannel situations.

Quality datasets containing digital signals with varied parameters and lengths sufficient to permit many kinds of validation checks by signal-processing experts remain in short supply. In this post, we continue our efforts to provide such datasets by offering a companion unlabeled dataset to CSPB.ML.2023.

Introducing Dr. John A. Snoap

An expert signal processor. An expert machine learner. All in one person!

I am very pleased to announce that my signal-processing, machine-learning, and modulation-recognition collaborator and friend John Snoap has successfully defended his doctoral dissertation and is now Dr. Snoap!

I started working with John after we met in the Comments section of the CSP Blog way back in 2019. John was building his own set of CSP software tools and ran into a small bump in the road and asked for some advice. Just the kind of reader I hope for–independent-minded, gets to the bottom of things, and embraces signal processing.

As we interacted over email and Zoom it became clear that John was thinking of making a contribution in the area of modulation recognition, and was also interested in learning more about machine learning using neural networks. Since I had been recently engaged in hand-to-hand combat with machine learners who were, in my opinion of course, injecting more confusion than elucidation into the field, I figured this might be a friendly way for me to understand machine learning better, and maybe there would be a way or two to marry signal processing with supervised learning. So off we went.

Fast forward four years and we’ve published five papers, with a sixth in review, that I believe are trailblazing. John is that rare person who has mastered two very different technical areas: cyclostationary signal processing and deep learning. Because I believe that neural networks do not actually learn the things that we hope they will, but need not-so-gentle nudges toward learning the truly valuable things, a researcher with one foot firmly in the signal-processing world and the other firmly in the machine-learning world has a very bright future indeed.

The title of John’s dissertation is Deep-Learning-Based Classification of Digitally Modulated Signals, which he wrote as a student in the Department of Electrical and Computer Engineering at Old Dominion University under the direction of his advisor Professor Dimitrie Popescu.

Congratulations Dr. Snoap! And thank you for everything.

CSPB.ML.2022R2: Correcting an RNG Flaw in CSPB.ML.2022

For completeness, I also correct the CSPB.ML.2022 dataset, which is aimed at facilitating neural-network generalization studies.

The same random-number-generator (RNG) error that plagued CSPB.ML.2018 corrupts CSPB.ML.2022, so that some of the files in the dataset correspond to identical signal parameters. This makes the CSPB.ML.2022 dataset potentially problematic for training a neural network using supervised learning.
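One way to see whether a dataset suffers from this kind of problem is to look for files whose truth parameters are identical. The sketch below is only an illustration: the row layout, the parameter names, and the function `find_duplicate_parameter_sets` are hypothetical and do not reflect the actual truth-file format.

```python
from collections import defaultdict

def find_duplicate_parameter_sets(truth_rows):
    """Group file indices whose parameter tuples are identical."""
    groups = defaultdict(list)
    for file_index, params in truth_rows:
        groups[tuple(params)].append(file_index)
    # Keep only the parameter sets shared by more than one file.
    return [indices for indices in groups.values() if len(indices) > 1]

# Hypothetical truth rows: (file index, (modulation, symbol period, carrier offset, SNR in dB)).
truth_rows = [
    (1, ("bpsk", 9.0, 0.0012, 7.5)),
    (2, ("qpsk", 5.0, -0.0004, 3.1)),
    (3, ("bpsk", 9.0, 0.0012, 7.5)),  # same parameters as file 1
    (4, ("8psk", 6.0, 0.0009, 10.2)),
]
duplicates = find_duplicate_parameter_sets(truth_rows)
print(duplicates)  # [[1, 3]]
```

Files with identical parameters are not necessarily sample-for-sample identical, but they shrink the effective variety of the training set, which is why the check works on parameters rather than on the data itself.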

In a recent post, I remedied the error and provided an updated CSPB.ML.2018 dataset and called it CSPB.ML.2018R2. Both are still available on the CSP Blog.

In this post, I provide an update to CSPB.ML.2022, called CSPB.ML.2022R2.

CSPB.ML.2018R2: Correcting an RNG Flaw in CSPB.ML.2018

KIRK: Everything that is in error must be sterilised.
NOMAD: There are no exceptions.
KIRK: Nomad, I made an error in creating you.
NOMAD: The creation of perfection is no error.
KIRK: I did not create perfection. I created error.

I’ve had to update the original Challenge for the Machine Learners post, and the associated dataset post, a couple times due to flaws in my metadata (truth) files. Those were fairly minor, so I just updated the original posts.

But a new flaw in CSPB.ML.2018 and CSPB.ML.2022 has come to light due to the work of the estimable research engineers at Expedition Technology. The problem is not with labeling or the fundamental correctness of the modulation types, pulse functions, etc., but with the way a random-number generator was applied in my multi-threaded dataset-generation technique.

I’ll explain after the fold, and this post will provide links to an updated version of the dataset, CSPB.ML.2018R2. I’ll keep the original up for continuity and also place a link to this post there. Moreover, the descriptions of the truth files over at CSPB.ML.2018 are still valid–the truth file posted here has the same format as the truth files available on the CSPB.ML.2018 and CSPB.ML.2022 posts.

The Next Logical Step in CSP+ML for Modulation Recognition: Snoap’s MILCOM ’23 Paper [Preview]

We are attempting to force a neural network to learn the features that we have already shown deliver simultaneous good performance and good generalization.

ODU doctoral student John Snoap and I have a new paper on the convergence of cyclostationary signal processing, machine learning using trained neural networks, and RF modulation classification: My Papers [55] (arxiv.org link here).

Previously in My Papers [50-52, 54] we have shown that the (multitudinous!) neural networks in the literature that use I/Q data as input and perform modulation recognition (output a modulation-class label) are highly brittle. That is, they minimize the classification error, they converge, but they don’t generalize. A trained neural network generalizes well if it can maintain high classification performance even when some of the probability density functions for the data’s random variables differ between the training inputs (in the lab) and the application inputs (in the field). The problem is also called the dataset-shift problem or the domain-adaptation problem. Generalization is my preferred term because it is simpler and has a strong connection to the human equivalent: we can quite easily generalize our observations and conclusions from one dataset to another without massive retraining of our neural noggins. We can find the cat in the image even if it is upside-down and colored like a giraffe.

Latest Paper on CSP and Deep-Learning for Modulation Recognition: An Extended Version of My Papers [52]

Another step forward in the merging of CSP and ML for modulation recognition, and another step away from the misstep of always relying on convolutional neural networks from image processing for RF-domain problem-solving.

My Old Dominion colleagues and I have published an extended version of the 2022 MILCOM paper My Papers [52] in the journal MDPI Sensors. The first author is John Snoap, who is one of those rare people who are expert in both signal processing and machine learning. Bright future there! Dimitrie Popescu, James Latshaw, and I provided analysis, programming, writing, and research-direction support.

Update on DeepSig Datasets

‘Insufficient facts always invite danger.’
Spock in Star Trek TOS Episode “Space Seed”

As most CSP Blog readers likely know, I’ve performed detailed critical analyses (one, two, three, and four) of the modulation-recognition datasets put forth publicly by DeepSig in 2016-2018. These datasets are associated with some of their published or arxiv.org papers, such as The Literature [R138], which I also reviewed here.

My conclusion is that the DeepSig datasets are as flawed as the DeepSig papers–it was the highly flawed nature of the papers that got me started down the critical-review path in the first place.

A reader recently alerted me to a change in the Datasets page at deepsig.ai that may indicate they are listening to critics. Let’s take a look and see if there is anything more to say.

The Altar of Optimality

Danger Will Robinson! Non-technical post approaching!

When I was a wee engineer, I’d sometimes clash with other engineers that sneered at technical approaches that didn’t set up a linear-algebraic optimization problem as the first step. Never mind that I’ve been relentlessly focused on single-sensor problems, rather than array-processing problems, and so the naturalness of the linear-algebraic mathematical setting was debatable–however there were still ways to fashion matrices and compute those lovely eigenvalues. The real issue wasn’t the dimensionality of the data model, it was that I didn’t have a handy crank I could turn and pop out a provably optimal solution to the posed problem. Therefore I could be safely ignored. And if nobody could actually write down an optimization problem for, say, general radio-frequency scene analysis, then that problem just wasn’t worth pursuing.

Those critical engineers worship at the altar of optimality. Time for another rant.

Is Radio-Frequency Scene Analysis a Wicked Problem?

‘By the pricking of my thumbs, something wicked this way comes …’ Macbeth by W. Shakespeare

I attended a conference on dynamic spectrum access in 2017 and participated in a session on automatic modulation recognition. The session was connected to a live competition within the conference where participants would attempt to apply their modulation-recognition system to signals transmitted in the conference center by the conference organizers. Like a grand modulation-recognition challenge but confined to the temporal, spectral, and spatial constraints imposed by the short-duration conference.

What I didn’t know going in was the level of frustration on the part of the machine-learner organizers regarding the seeming inability of signal-processing and machine-learning researchers to solve the radio-frequency scene analysis problem once and for all. The basic attitude was ‘if the image-processors can have the AlexNet image-recognition solution, and thereby abandon their decades-long attempt at developing serious mathematics-based image-processing theory and practice, why haven’t we solved the RFSA problem yet?’

PSK/QAM Cochannel Dataset for Modulation Recognition Researchers [CSPB.ML.2023]

The next step in dataset complexity at the CSP Blog: cochannel signals.

I’ve developed another dataset for use in assessing modulation-recognition algorithms (machine-learning-based or otherwise) that is more complex than the original sets I posted for the ML Challenge (CSPB.ML.2018 and CSPB.ML.2022). Half of the new dataset consists of one signal in noise and the other half consists of two signals in noise. In most cases the two signals overlap spectrally, which is a signal condition called cochannel interference.

We’ll call it CSPB.ML.2023.
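To make the cochannel condition concrete, here is a toy Python sketch that sums two spectrally overlapping signals and noise. Everything specific in it is an assumption for illustration only: the rectangular pulses, the symbol rates, the carrier offsets, the relative power, and the helper names `qpsk_rect` and `mainlobe_band` are invented and are not the parameters or pulse shapes used in CSPB.ML.2023.

```python
import cmath
import math
import random

def qpsk_rect(num_symbols, samples_per_symbol, cfo, rng):
    """Rectangular-pulse QPSK at complex baseband, shifted by a normalized carrier offset."""
    symbols = [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) / math.sqrt(2)
               for _ in range(num_symbols)]
    # Hold each symbol for samples_per_symbol samples (rectangular pulse).
    samples = [s for s in symbols for _ in range(samples_per_symbol)]
    return [x * cmath.exp(2j * math.pi * cfo * n) for n, x in enumerate(samples)]

def mainlobe_band(samples_per_symbol, cfo):
    """Approximate occupied band (main spectral lobe) in normalized frequency."""
    half_width = 1.0 / samples_per_symbol
    return (cfo - half_width, cfo + half_width)

rng = random.Random(2)
length = 4000
sig1 = qpsk_rect(500, 8, 0.0, rng)    # 500 symbols x 8 samples = 4000 samples
sig2 = qpsk_rect(400, 10, 0.05, rng)  # 400 symbols x 10 samples = 4000 samples
noise = [complex(rng.gauss(0.0, 0.1), rng.gauss(0.0, 0.1)) for _ in range(length)]

# Two signals in noise; the second is attenuated relative to the first.
scene = [a + 0.5 * b + w for a, b, w in zip(sig1, sig2, noise)]

# A positive overlap width means the two main lobes share spectrum (cochannel).
band1 = mainlobe_band(8, 0.0)
band2 = mainlobe_band(10, 0.05)
overlap = min(band1[1], band2[1]) - max(band1[0], band2[0])
```

Dropping `sig2` from the sum gives the one-signal-in-noise case that makes up the other half of the dataset.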

ICARUS: More on Attempts to Merge IQ Data with Extracted-Feature Data in Machine Learning

How can we train a neural network to make use of both IQ data samples and CSP features in the context of weak-signal detection?

I’ve been working with some colleagues at Northeastern University (NEU) in Boston, MA, on ways to combine CSP with machine learning. The work I’m doing with Old Dominion University is focused on basic modulation recognition using neural networks and, in particular, the generalization (dataset-shift) problem that is pervasive in deep learning with convolutional neural networks. In contrast, the NEU work is focused on specific signal detection and classification problems and looks at how to use multiple disparate data types as inputs to neural networks: inputs such as complex-valued samples (IQ data) as well as carefully selected components of spectral correlation and spectral coherence surfaces.

My NEU colleagues and I will be publishing a rather lengthy conference paper on a new multi-input-data neural-network approach called ICARUS at InfoCom 2023 this May (My Papers [53]). You can get a copy of the pre-publication version here or on arxiv.org.

Neural Networks for Modulation Recognition: IQ-Input Networks Do Not Generalize, but Cyclic-Cumulant-Input Networks Generalize Very Well

Neural networks with CSP-feature inputs DO generalize in the modulation-recognition problem setting.

In some recently published papers (My Papers [50,51]), my ODU colleagues and I showed that convolutional neural networks and capsule networks do not generalize well when their inputs are complex-valued data samples, commonly referred to as simply IQ samples, or as raw IQ samples by machine learners. (Unclear why the adjective ‘raw’ is often used as it adds nothing to the meaning. If I just say Hey, pass me those IQ samples, would ya?, do you think maybe he means the processed ones? How about raw-I-mean–seriously-man–I-did-not-touch-those-numbers-OK? IQ samples? All-natural vegan unprocessed no-GMO organic IQ samples? Uncooked IQ samples?) Moreover, the capsule networks typically outperform the convolutional networks.

In a new paper (MILCOM 2022: My Papers [52]; arxiv.org version), my colleagues and I continue this line of research by including cyclic cumulants as the inputs to convolutional and capsule networks. We find that capsule networks outperform convolutional networks and that convolutional networks trained on cyclic cumulants outperform convolutional networks trained on IQ samples. We also find that both convolutional and capsule networks trained on cyclic cumulants generalize perfectly well between datasets that have different (disjoint) probability density functions governing their carrier frequency offset parameters.

That is, convolutional networks do better recognition with cyclic cumulants and generalize very well with cyclic cumulants.

So why don’t neural networks ever ‘learn’ cyclic cumulants with IQ data at the input?

The majority of the software and analysis work was performed by the first author, John Snoap, with an assist on capsule networks by James Latshaw. I created the datasets we used (available here on the CSP Blog [see below]) and helped with the blind parameter estimation. Professor Popescu guided us all and contributed substantially to the writing.

Epistemic Bubbles: Comments on “Modulation Recognition Using Signal Enhancement and Multi-Stage Attention Mechanism” by Lin, Zeng, and Gong.

Another brick in the wall, another drop in the bucket, another windmill on the horizon …

Let’s talk more about The Cult. No, I don’t mean She Sells Sanctuary, for which I do have considerable nostalgic fondness. I mean the Cult(ure) of Machine Learning in RF communications and signal processing. Or perhaps it is more of an epistemic bubble where there are The Things That Must Be Said and The Unmentionables in every paper and a style of research that is strictly adhered to but that, sadly, produces mostly error and promotes mostly hype. So we have shibboleths, taboos, and norms to deal with inside the bubble.

Time to get on my high horse. She’s a good horse named Ravager and she needs some exercise. So I’m going to strap on my claymore, mount Ravager, and go for a ride. Or am I merely tilting at windmills?

Let’s take a close look at another paper on machine learning for modulation recognition. It uses, uncritically, the DeepSig RML 2016 datasets. And the world and the world, the world drags me down…

Some Concrete Results on Generalization in Modulation Recognition using Machine Learning

Neural networks with I/Q data as input do not generalize in the modulation-recognition problem setting.

Update May 20, 2022: Here is the arxiv.org link.

Back in 2018 I posted a dataset consisting of 112,000 I/Q data files, 32,768 samples in length each, as a part of a challenge to machine learners who had been making strong claims of superiority over signal processing in the area of automatic modulation recognition. One part of the challenge was modulation recognition involving eight digital modulation types, and the other was estimating the carrier frequency offset. That dataset is described here, and I’d like to refer to it as CSPB.ML.2018.

Then in 2022 I posted a companion dataset to CSPB.ML.2018 called CSPB.ML.2022. This new dataset uses the same eight modulation types, similar ranges of SNR, pulse type, and symbol rate, but the random variable that governs the carrier frequency offset is different with respect to the random variable in CSPB.ML.2018. The purpose of the CSPB.ML.2022 dataset is to facilitate studies of the dataset-shift, or generalization, problem in machine learning.
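The kind of shift involved can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the two carrier-frequency-offset (CFO) intervals below are invented, non-overlapping values chosen to make the point, not the distributions actually used in CSPB.ML.2018 or CSPB.ML.2022, and the helper names `apply_cfo` and `draw_cfo` are hypothetical.

```python
import cmath
import math
import random

def apply_cfo(samples, cfo):
    """Shift a complex baseband signal by a normalized CFO (cycles per sample)."""
    return [x * cmath.exp(2j * math.pi * cfo * n) for n, x in enumerate(samples)]

def draw_cfo(rng, interval):
    """Draw one CFO uniformly from the given (low, high) interval."""
    low, high = interval
    return rng.uniform(low, high)

# Invented, non-overlapping CFO intervals for the two datasets.
TRAIN_CFO_INTERVAL = (-0.001, 0.001)
FIELD_CFO_INTERVAL = (0.01, 0.02)

rng = random.Random(1)
train_cfos = [draw_cfo(rng, TRAIN_CFO_INTERVAL) for _ in range(1000)]
field_cfos = [draw_cfo(rng, FIELD_CFO_INTERVAL) for _ in range(1000)]

# Same modulation, same processing, but the offsets come from different random variables.
baseband = [complex(1.0, 0.0)] * 8
shifted = apply_cfo(baseband, field_cfos[0])
```

A network trained only on signals built from `train_cfos` never sees an offset like those in `field_cfos`, even though every other signal parameter could be drawn from the same ranges in both sets.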

Throughout the past couple of years I’ve been working with some graduate students and a professor at Old Dominion University on merging machine learning and signal processing for problems involving RF signal analysis, such as modulation recognition. We are starting to publish a sequence of papers that describe our efforts. I briefly describe the results of one such paper, My Papers [51], in this post.

A Great American Science Writer: Lee Smolin

While reading a book on string theory for lay readers, I did a double take…

I don’t know why I haven’t read any of Lee Smolin’s physics books prior to this year, but I haven’t. Maybe blame my obsession with Sean Carroll. In any case, I’ve been reading The Trouble with Physics (The Literature [R175]), which is about string theory and string theorists. Smolin finds it troubling that the string theorist subculture in physics shows some signs of groupthink and authoritarianism. Perhaps elder worship too.

I came across this list of attributes, conceived by Smolin, of the ‘sociology’ of the string-theorist contingent:

The Domain Expertise Trap

The softwarization of engineering continues apace…

I keep seeing people write things like “a major disadvantage of the technique for X is that it requires substantial domain expertise.” Let’s look at a recent good paper that makes many such remarks and try to understand what it could mean, and if having or getting domain expertise is actually a bad thing. Spoiler: It isn’t.

The paper under the spotlight is The Literature [R174], “Interference Suppression Using Deep Learning: Current Approaches and Open Challenges,” published for the nonce on arxiv.org. I’m not calling this post a “Comments On …” post, because once I extract the (many) quotes about domain expertise, I’m leaving the paper alone. The paper is a good paper and I expect it to be especially useful for current graduate students looking to make a contribution in the technical area where machine learning and RF signal processing overlap. I especially like Figure 1 and the various Tables.
