‘Insufficient facts always invite danger.’
Spock in Star Trek TOS Episode “Space Seed”
As most CSP Blog readers likely know, I’ve performed detailed critical analyses (one, two, three, and four) of the modulation-recognition datasets put forth publicly by DeepSig in 2016-2018. These datasets are associated with some of their published or arxiv.org papers, such as The Literature [R138], which I also reviewed here.
My conclusion is that the DeepSig datasets are as flawed as the DeepSig papers–it was the highly flawed nature of the papers that got me started down the critical-review path in the first place.
A reader recently alerted me to a change in the Datasets page at deepsig.ai that may indicate they are listening to critics. Let’s take a look and see if there is anything more to say.
The cyclostationarity of frequency-shift-keyed signals depends strongly on the way the carrier phase evolves over time. Many distinct cycle-frequency patterns and spectral correlation shapes are possible.
Let’s get back to basics by looking at a large class of signals known as frequency-shift-keyed (FSK) signals. We will leave to the side, for the most part, the very large class of signals that goes by the name of continuous-phase modulation (CPM), which includes continuous-phase FSK (CPFSK), MSK, GMSK, and many more (The Literature [R188]-[R190]). Those are treated in My Papers [8] and will be covered in a future CSP Blog post.
Here we want to look at more conventional forms of FSK. These signal types don’t necessarily have a continuous phase function. They are generally easier to demodulate and are more robust to noise and interference than the more complicated CPM signal types, but generally have much lower spectral efficiency. They are like the rectangular-pulse PSK of the FSK/CPM world. But they are still used.
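To make the distinction between continuous and discontinuous phase concrete, here is a minimal NumPy sketch. It is only an illustration under stated assumptions: the function name `fsk_pair`, the tone frequencies, and the samples-per-symbol value are all hypothetical, and the "incoherent" model (each symbol is a tone burst with an independent random starting phase) is just one of the many possible ways the carrier phase can evolve.

```python
import numpy as np

def fsk_pair(symbols, tones, sps, rng):
    """Build two unit-amplitude complex FSK waveforms from the same symbols.

    tones: normalized tone frequencies in cycles/sample (hypothetical values).
    Returns (cpfsk, incoherent).
    """
    symbols = np.asarray(symbols)
    freqs = np.repeat(np.asarray(tones, dtype=float)[symbols], sps)

    # Continuous phase: the phase is the running sum of instantaneous frequency.
    cpfsk = np.exp(2j * np.pi * np.cumsum(freqs))

    # Incoherent FSK (assumed model): each symbol is a fresh tone burst with
    # its own independent, uniformly distributed starting phase.
    n = np.tile(np.arange(sps), symbols.size)
    start_phase = np.repeat(rng.uniform(0.0, 2.0 * np.pi, symbols.size), sps)
    incoherent = np.exp(1j * (2.0 * np.pi * freqs * n + start_phase))
    return cpfsk, incoherent

rng = np.random.default_rng(1)
tones = [-0.05, 0.05]
symbols = rng.integers(0, 2, size=64)
cpfsk, incoherent = fsk_pair(symbols, tones, sps=16, rng=rng)

# Largest sample-to-sample jump: bounded by the tone frequency for
# continuous phase, but it can approach 2 at symbol boundaries otherwise.
jump_cp = np.max(np.abs(np.diff(cpfsk)))
jump_inc = np.max(np.abs(np.diff(incoherent)))
print(jump_cp, jump_inc)
```

The continuous-phase waveform never jumps by more than 2 sin(pi f) between adjacent samples, where f is the largest tone frequency, while the incoherent waveform shows abrupt jumps at symbol boundaries. Those abrupt jumps are what spread the spectrum and lower the spectral efficiency.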
So among the CSP Blog readers who voted, I think the consensus is to produce more “on brand” posts on CSP and the Signal-Processing ToolKit. Also, there is significant interest in doing CSP with GNU Radio, which I have considerable experience with, and so I’ll likely be posting some flowgraph ideas and results at some point in 2023.
Thanks everybody! (But I’ll still rant and rave from time to time; sorry!)
Update June 25, 2023: When I said you can vote multiple times, I didn’t mean to ‘spam’ the poll (as my kids would say). Someone just voted for one of the responses ten times in a row (same IP address ten votes within one minute). I meant you can vote for several different items in the poll! So I did remove some of those identical votes. I’ll close the poll at the end of the day June 30.
Update May 11, 2023: Please vote in the Reader Poll below (multiple times if you’d like) soon! As of today, CSP Applications and Signal Processing ToolKit are in the lead, with Rants and Datasets at the bottom.
The CSP Blog is rolling along here in 2023!
March 2023 broke a record for pageviews in a calendar month with over 7,000 as of this writing early in the day on March 31.
Let’s note some other milestones and introduce a poll.
Milestones
What a month! We’re at about 7,145 views right now, and the previous monthly record is 6,482.
About 84,000 visitors have been counted over the years since the CSP Blog launched in 2015, with 5,500 this year already. I believe this is just a count of the unique IP addresses that have accessed a page. But the number of subscribers is only 198! You can subscribe (“Follow”) to the CSP Blog by entering an email address in the “Follow Blog via Email” box on the right edge of any viewed page, near the top of the page. You’ll get notified through that email address whenever there is a new post. CSP Blog readers cannot see that email address, just as they cannot see the email address associated with any comment, unless there is an associated gravatar.
Reader Poll
I’m planning to have more time available to devote to improving and extending the CSP Blog over the next few months. If you want to have input into that process, consider voting in the poll below.
Danger Will Robinson! Non-technical post approaching!
When I was a wee engineer, I’d sometimes clash with other engineers that sneered at technical approaches that didn’t set up a linear-algebraic optimization problem as the first step. Never mind that I’ve been relentlessly focused on single-sensor problems, rather than array-processing problems, and so the naturalness of the linear-algebraic mathematical setting was debatable–however there were still ways to fashion matrices and compute those lovely eigenvalues. The real issue wasn’t the dimensionality of the data model, it was that I didn’t have a handy crank I could turn and pop out a provably optimal solution to the posed problem. Therefore I could be safely ignored. And if nobody could actually write down an optimization problem for, say, general radio-frequency scene analysis, then that problem just wasn’t worth pursuing.
Those critical engineers worship at the altar of optimality. Time for another rant.
‘By the pricking of my thumbs, something wicked this way comes …’ Macbeth by W. Shakespeare
I attended a conference on dynamic spectrum access in 2017 and participated in a session on automatic modulation recognition. The session was connected to a live competition within the conference where participants would attempt to apply their modulation-recognition system to signals transmitted in the conference center by the conference organizers. Like a grand modulation-recognition challenge but confined to the temporal, spectral, and spatial constraints imposed by the short-duration conference.
What I didn’t know going in was the level of frustration on the part of the machine-learner organizers regarding the seeming inability of signal-processing and machine-learning researchers to solve the radio-frequency scene analysis problem once and for all. The basic attitude was ‘if the image-processors can have the AlexNet image-recognition solution, and thereby abandon their decades-long attempt at developing serious mathematics-based image-processing theory and practice, why haven’t we solved the RFSA problem yet?’
In this post, we’ll switch gears a bit and look at the problem of waveform estimation. This comes up in two situations for me: single-sensor processing and array (multi-sensor) processing. At some point, I’ll write a post on array processing for waveform estimation (using, say, the SCORE algorithm The Literature [R102]), but here we restrict our attention to the case of waveform estimation using only a single sensor (a single antenna connected to a single receiver). We just have one observed sampled waveform to work with. There are also waveform estimation methods that are multi-sensor but not typically referred to as array processing, such as the blind source separation problem in acoustic scene analysis, which is often solved by principal component analysis (PCA), independent component analysis (ICA), and their variants.
The signal model consists of the noisy sum of two or more modulated waveforms that overlap in both time and frequency. If the signals do not overlap in time, then we can separate them by time gating, and if they do not overlap in frequency, we can separate them using linear time-invariant systems (filters).
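The easy case, where the signals overlap in time but not in frequency, can be sketched in a few lines of NumPy. This is an illustration under assumptions, not a method from the post: the two narrowband signals, their center frequencies, the noise level, and the helper names `lowpass_fir` and `rel_err` are all hypothetical, and the filter is a plain Hamming-windowed-sinc FIR design.

```python
import numpy as np

def lowpass_fir(cutoff, num_taps=201):
    """Hamming-windowed-sinc lowpass FIR; cutoff in cycles/sample."""
    k = np.arange(num_taps) - (num_taps - 1) / 2
    h = 2 * cutoff * np.sinc(2 * cutoff * k) * np.hamming(num_taps)
    return h / h.sum()  # unit gain at zero frequency

rng = np.random.default_rng(0)
n = np.arange(8192)

# Two hypothetical narrowband signals that overlap in time but not frequency.
s1 = (1 + 0.5 * np.cos(2 * np.pi * 0.002 * n)) * np.exp(2j * np.pi * 0.05 * n)
s2 = (1 + 0.5 * np.sin(2 * np.pi * 0.003 * n)) * np.exp(2j * np.pi * 0.30 * n)
noise = 0.05 * (rng.standard_normal(n.size) + 1j * rng.standard_normal(n.size))
x = s1 + s2 + noise

# LTI separation: the filter passes |f| < 0.15 and so rejects s2.
h = lowpass_fir(cutoff=0.15)
s1_hat = np.convolve(x, h, mode="same")  # symmetric odd-length h: no net delay

def rel_err(est, ref, trim=200):
    """Relative RMS error, ignoring filter edge transients."""
    e = est[trim:-trim] - ref[trim:-trim]
    return np.sqrt(np.mean(np.abs(e) ** 2) / np.mean(np.abs(ref[trim:-trim]) ** 2))

print(rel_err(x, s1), rel_err(s1_hat, s1))
```

Before filtering, the error relative to the first signal is about as large as the signal itself. After filtering, only the in-band noise remains. Once the two signals share the same band, no choice of `cutoff` can pull them apart, which is why the cochannel case needs something beyond time-invariant filtering.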
The next step in dataset complexity at the CSP Blog: cochannel signals.
I’ve developed another dataset for use in assessing modulation-recognition algorithms (machine-learning-based or otherwise) that is more complex than the original sets I posted for the ML Challenge (CSPB.ML.2018 and CSPB.ML.2022). Half of the new dataset consists of one signal in noise and the other half consists of two signals in noise. In most cases the two signals overlap spectrally, which is a signal condition called cochannel interference.
Update January 31, 2023: I’ve added numbers in square brackets next to the worst of the wrong things. I’ll document the errors at the bottom of the post.
Of course I have to see what ChatGPT has to say about CSP. Including definitions, which I don’t expect it to get too wrong, and code for estimators, which I expect it to get very wrong.
How can we train a neural network to make use of both IQ data samples and CSP features in the context of weak-signal detection?
I’ve been working with some colleagues at Northeastern University (NEU) in Boston, MA, on ways to combine CSP with machine learning. The work I’m doing with Old Dominion University is focused on basic modulation recognition using neural networks and, in particular, the generalization (dataset-shift) problem that is pervasive in deep learning with convolutional neural networks. In contrast, the NEU work is focused on specific signal detection and classification problems and looks at how to use multiple disparate data types as inputs to neural networks; inputs such as complex-valued samples (IQ data) as well as carefully selected components of spectral correlation and spectral coherence surfaces.
My NEU colleagues and I will be publishing a rather lengthy conference paper on a new multi-input-data neural-network approach called ICARUS at InfoCom 2023 this May (My Papers [53]). You can get a copy of the pre-publication version here or on arxiv.org.
The CSP Blog took a big step forward in 2022, with 67,965 page views and counting, which is about 12,000 more than last year’s (record) number of about 56,000. Thanks to all my readers!
The CSP Blog recently received a comment from a signal processor that needed a small amount of debugging help with their Python spectral correlation estimator code.
The code uses a form of the time-smoothing method and aims to compute and plot the spectral correlation estimate as well as the corresponding coherence estimate. What is cool about this code is that it is clear, well-organized, on GitHub, and is written using Jupyter Notebook. Moreover, there is a Google Colab function so that anyone can run the code from a Chrome browser and see the results, even a Python newbie like me. Très moderne.
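For readers who have not seen the idea in code, here is a minimal NumPy sketch of one form of time smoothing. It is not the commenter’s code; it is an illustration under assumptions. The data are frequency-shifted by plus and minus half the cycle frequency, cut into blocks, and the cross and auto periodograms are averaged over the blocks. The function name `scf_time_smoothing`, the block length, the Hann window, and the test signal are all hypothetical choices.

```python
import numpy as np

def scf_time_smoothing(x, alpha, block_len=256):
    """Estimate spectral correlation and coherence at cycle frequency alpha.

    alpha and the returned frequencies are normalized (cycles/sample).
    """
    num_blocks = x.size // block_len
    t = np.arange(num_blocks * block_len)
    x = x[: t.size]

    # Shift the data by -alpha/2 and +alpha/2 so that
    # U(f) = X(f + alpha/2) and V(f) = X(f - alpha/2).
    u = x * np.exp(-1j * np.pi * alpha * t)
    v = x * np.exp(+1j * np.pi * alpha * t)

    w = np.hanning(block_len)
    U = np.fft.fft(u.reshape(num_blocks, block_len) * w, axis=1)
    V = np.fft.fft(v.reshape(num_blocks, block_len) * w, axis=1)

    # Average the cross and auto periodograms over the blocks (time smoothing).
    norm = np.sum(w ** 2)
    s_uv = np.mean(U * np.conj(V), axis=0) / norm
    s_uu = np.mean(np.abs(U) ** 2, axis=0) / norm
    s_vv = np.mean(np.abs(V) ** 2, axis=0) / norm
    coherence = s_uv / np.sqrt(s_uu * s_vv)

    freqs = np.fft.fftshift(np.fft.fftfreq(block_len))
    return freqs, np.fft.fftshift(s_uv), np.fft.fftshift(coherence)

# Test signal (assumed): noise-free rectangular-pulse BPSK, 10 samples/symbol,
# so the symbol rate 0.1 is a cycle frequency and 0.137 is not.
rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=4096)
x = np.repeat(2.0 * bits - 1.0, 10).astype(complex)

_, _, coh_on = scf_time_smoothing(x, alpha=0.1)
_, _, coh_off = scf_time_smoothing(x, alpha=0.137)
print(np.mean(np.abs(coh_on)), np.mean(np.abs(coh_off)))
```

The frequency shifts are applied to the whole record before blocking, so the phase of each block’s cross periodogram is consistent and the blocks average coherently at a true cycle frequency. At a cycle frequency the average coherence magnitude should be large. Away from one it should shrink roughly as one over the square root of the number of blocks.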
It’s too close to home, and it’s too near the bone …
Park the car at the side of the road / You should know / Time’s tide will smother you… / And I will too
“That Joke Isn’t Funny Anymore” by The Smiths
I applaud the intent behind the paper in this post’s title, which is The Literature [R183], apparently accepted in 2022 for publication in IEEE Access, a peer-reviewed journal. That intent is to list all the found ways in which researchers preprocess radio-frequency data (complex sampled data) prior to applying some sort of modulation classification (recognition) algorithm or system.
The problem is that this attempt at gathering up all of the ‘representations’ gets a lot of the math wrong, and so has a high potential to confuse rather than illuminate.
Neural networks with CSP-feature inputs DO generalize in the modulation-recognition problem setting.
In some recently published papers (My Papers [50,51]), my ODU colleagues and I showed that convolutional neural networks and capsule networks do not generalize well when their inputs are complex-valued data samples, commonly referred to as simply IQ samples, or as raw IQ samples by machine learners. (Unclear why the adjective ‘raw’ is often used as it adds nothing to the meaning. If I just say Hey, pass me those IQ samples, would ya?, do you think maybe he means the processed ones? How about raw-I-mean–seriously-man–I-did-not-touch-those-numbers-OK? IQ samples? All-natural vegan unprocessed no-GMO organic IQ samples? Uncooked IQ samples?) Moreover, the capsule networks typically outperform the convolutional networks.
In a new paper (MILCOM 2022: My Papers [52]; arxiv.org version), my colleagues and I continue this line of research by including cyclic cumulants as the inputs to convolutional and capsule networks. We find that capsule networks outperform convolutional networks and that convolutional networks trained on cyclic cumulants outperform convolutional networks trained on IQ samples. We also find that both convolutional and capsule networks trained on cyclic cumulants generalize perfectly well between datasets that have different (disjoint) probability density functions governing their carrier frequency offset parameters.
That is, convolutional networks do better recognition with cyclic cumulants and generalize very well with cyclic cumulants.
So why don’t neural networks ever ‘learn’ cyclic cumulants with IQ data at the input?
The majority of the software and analysis work was performed by the first author, John Snoap, with an assist on capsule networks by James Latshaw. I created the datasets we used (available here on the CSP Blog [see below]) and helped with the blind parameter estimation. Professor Popescu guided us all and contributed substantially to the writing.
Let’s take an excursion outside of “Understanding and Using the Statistics of Communication Signals” by looking at a naturally occurring signal: the human genome.
Another brick in the wall, another drop in the bucket, another windmill on the horizon …
Let’s talk more about The Cult. No, I don’t mean She Sells Sanctuary, for which I do have considerable nostalgic fondness. I mean the Cult(ure) of Machine Learning in RF communications and signal processing. Or perhaps it is more of an epistemic bubble where there are The Things That Must Be Said and The Unmentionables in every paper and a style of research that is strictly adhered to but that, sadly, produces mostly error and promotes mostly hype. So we have shibboleths, taboos, and norms to deal with inside the bubble.
Time to get on my high horse. She’s a good horse named Ravager and she needs some exercise. So I’m going to strap on my claymore, mount Ravager, and go for a ride. Or am I merely tilting at windmills?
Let’s take a close look at another paper on machine learning for modulation recognition. It uses, uncritically, the DeepSig RML 2016 datasets. And the world and the world, the world drags me down…
Introducing swag for the best CSP-Blog commenters.
Update January 2023: You can find the list of winners on this page.
The comments that CSP Blog readers have made over the past six years are arguably the most helpful part of the Blog for do-it-yourself CSP practitioners. In those comments, my many errors have been revealed, which then has permitted me to attempt post corrections. Many unclear aspects of a post have been clarified after pondering a reader’s comment. At least one comment has been elevated to a post of its own.
The readership of the CSP Blog has been steadily growing since its inception in 2015, but the ratio of page views to comments remains huge–the vast majority of readers do not comment. This is understandable and perfectly acceptable. I rarely comment on any of the science and engineering blogs that I frequent. Nevertheless, I would like to encourage more commenting and also reward it.
Starts as a personal gripe, but ends with weird stuff from the literature.
During my poking around on arxiv.org the other day (Grrrrr…), I came across some postings by O’Shea et al. I’d not seen before, including The Literature [R176]: “Wideband Signal Localization and Spectral Segmentation.”
Huh, I thought, they are probably trying to train a neural network to do automatic spectral segmentation that is superior to my published algorithm (My Papers [32]). Yeah, no. I mean yes to a machine, no to nods to me. Let’s take a look.
May 2022 saw 6026 page views at the CSP Blog, a new monthly record!
Thanks so much to all my readers, new and old, signal processors and machine learners, commenters and lurkers.
My next non-ranty post is on frequency-shift (FRESH) filtering. I will go over cyclic Wiener (CW) filtering (The Literature [R6]), which is optimal FRESH filtering, and then describe some interesting puzzles and problems with CW filtering, which may form the seeds of some solid signal-processing research projects of the academic sort.
Back in 2018 I posted a dataset consisting of 112,000 I/Q data files, 32,768 samples in length each, as a part of a challenge to machine learners who had been making strong claims of superiority over signal processing in the area of automatic modulation recognition. One part of the challenge was modulation recognition involving eight digital modulation types, and the other was estimating the carrier frequency offset. That dataset is described here, and I’d like to refer to it as CSPB.ML.2018.
Then in 2022 I posted a companion dataset to CSPB.ML.2018 called CSPB.ML.2022. This new dataset uses the same eight modulation types, similar ranges of SNR, pulse type, and symbol rate, but the random variable that governs the carrier frequency offset is different with respect to the random variable in CSPB.ML.2018. The purpose of the CSPB.ML.2022 dataset is to facilitate studies of the dataset-shift, or generalization, problem in machine learning.
Throughout the past couple of years I’ve been working with some graduate students and a professor at Old Dominion University on merging machine learning and signal processing for problems involving RF signal analysis, such as modulation recognition. We are starting to publish a sequence of papers that describe our efforts. I briefly describe the results of one such paper, My Papers [51], in this post.
Can we fix peer review in engineering by some form of payment to reviewers?
Let’s talk about another paper about cyclostationarity and correntropy. I’ve critically reviewed two previously, which you can find here and here. When you look at the correntropy as applied to a cyclostationary signal, you get something called cyclic correntropy, which is not particularly useful except if you don’t understand regular cyclostationarity and some aspects of garden-variety signal processing. Then it looks great.
But this isn’t a post that primarily takes the authors of a paper to task, although it does do that. I want to tell the tale to get us thinking about what ‘peer’ could mean, these days, in ‘peer-reviewed paper.’ How do we get the best peers to review our papers?