Starts as a personal gripe, but ends with weird stuff from the literature.
During my poking around on arxiv.org the other day (Grrrrr…), I came across some postings by O’Shea et al. I’d not seen before, including The Literature [R176]: “Wideband Signal Localization and Spectral Segmentation.”
Huh, I thought, they are probably trying to train a neural network to do automatic spectral segmentation that is superior to my published algorithm (My Papers ). Yeah, no. I mean yes to a machine, no to nods to me. Let’s take a look.
May 2022 saw 6026 page views at the CSP Blog, a new monthly record!
Thanks so much to all my readers, new and old, signal processors and machine learners, commenters and lurkers.
My next non-ranty post is on frequency-shift (FRESH) filtering. I will go over cyclic Wiener (CW) filtering (The Literature [R6]), which is optimal FRESH filtering, and then describe some interesting puzzles and problems with CW filtering, which may form the seeds of some solid signal-processing research projects of the academic sort.
Back in 2018 I posted a dataset consisting of 112,000 I/Q data files, 32,768 samples in length each, as a part of a challenge to machine learners who had been making strong claims of superiority over signal processing in the area of automatic modulation recognition. One part of the challenge was modulation recognition involving eight digital modulation types, and the other was estimating the carrier frequency offset. That dataset is described here, and I’d like to refer to it as CSPB.ML.2018.
Then in 2022 I posted a companion dataset to CSPB.ML.2018 called CSPB.ML.2022. This new dataset uses the same eight modulation types and similar ranges of SNR, pulse type, and symbol rate, but the random variable that governs the carrier frequency offset differs from the one used in CSPB.ML.2018. The purpose of the CSPB.ML.2022 dataset is to facilitate studies of the dataset-shift, or generalization, problem in machine learning.
Throughout the past couple of years I’ve been working with some graduate students and a professor at Old Dominion University on merging machine learning and signal processing for problems involving RF signal analysis, such as modulation recognition. We are starting to publish a sequence of papers that describe our efforts. I briefly describe the results of one such paper, My Papers , in this post.
Can we fix peer review in engineering by some form of payment to reviewers?
Let’s talk about another paper about cyclostationarity and correntropy. I’ve critically reviewed two previously, which you can find here and here. When you look at the correntropy as applied to a cyclostationary signal, you get something called cyclic correntropy, which is not particularly useful except if you don’t understand regular cyclostationarity and some aspects of garden-variety signal processing. Then it looks great.
But this isn’t a post that primarily takes the authors of a paper to task, although it does do that. I want to tell the tale to get us thinking about what ‘peer’ could mean, these days, in ‘peer-reviewed paper.’ How do we get the best peers to review our papers?
In this Signal Processing ToolKit post we take a close look at the basic sampling theorem used daily by signal-processing engineers. Application of the sampling theorem is a way to choose a sampling rate for converting an analog continuous-time signal to a digital discrete-time signal. The former is ubiquitous in the physical world–for example all the radio-frequency signals whizzing around in the air and through your body right now. The latter is ubiquitous in the computing-device world–for example all those digital-audio files on your DiscmanItunesIpodDVDSmartphoneCloudNeuralink Singularity.
So how are those physical real-world analog signals converted to convenient lists of finite-precision numbers that we can apply arithmetic to? For that’s all [digital or cyclostationary] signal processing is at bottom: arithmetic. You might know the basic rule-of-thumb for choosing a sampling rate: Make sure it is at least twice as big as the largest frequency component in the analog signal undergoing the sampling. But why, exactly, and what does ‘largest frequency component’ mean?
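To make the rule of thumb concrete, here is a minimal Python sketch (not taken from the post) of what goes wrong when the sampling rate is too low. The function name `sample_cosine` and the particular frequencies and rates are assumptions chosen purely for illustration: a 7 Hz cosine sampled at 10 Hz produces exactly the same samples as a 3 Hz cosine, while sampling at 20 Hz keeps the two distinct.

```python
import numpy as np


def sample_cosine(freq_hz, rate_hz, num_samples):
    """Return num_samples samples of cos(2*pi*freq_hz*t) taken at rate_hz."""
    n = np.arange(num_samples)
    return np.cos(2.0 * np.pi * freq_hz * n / rate_hz)


# Rate too low: 10 Hz is less than twice the 7 Hz component.
low_rate_hz = 10.0
x_high = sample_cosine(7.0, low_rate_hz, 50)
x_low = sample_cosine(3.0, low_rate_hz, 50)
aliased = np.allclose(x_high, x_low)  # the 7 Hz tone masquerades as 3 Hz

# Rate high enough: 20 Hz exceeds twice the 7 Hz component.
ok_rate_hz = 20.0
y_high = sample_cosine(7.0, ok_rate_hz, 50)
y_low = sample_cosine(3.0, ok_rate_hz, 50)
distinct = not np.allclose(y_high, y_low)

print("indistinguishable at 10 Hz:", aliased)
print("distinguishable at 20 Hz:", distinct)
```

Once the samples coincide, no amount of arithmetic on them can tell the two sinewaves apart, which is why the rate has to be chosen with the analog signal's frequency content in mind.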
Let’s take a look at an even faster spectral correlation function estimator. How useful is it for CSP applications in communications?
Reader Gideon pointed out that Antoni had published a paper a year after the paper that I considered in my first Antoni post. This newer paper, The Literature [R172], promises a faster fast spectral correlation estimator, and it delivers on that according to the analysis in the paper. However, I think the faster fast spectral correlation estimator is just as limited as the slower fast spectral correlation estimator when considered in the context of communication-signal processing.
And, to be fair, Antoni doesn’t often consider the context of communication-signal processing. His favored application is fault detection in mechanical systems with rotating parts. But I still don’t think the way he compares his fast and faster estimators to conventional estimators is fair. The reason is that his estimators are both severely limited in the maximum cycle frequency that can be processed, relative to the maximum cycle frequency that is possible.
Another RF-signal dataset to help push along our R&D on modulation recognition.
In this post I provide a second dataset for the Machine-Learning Challenge I issued in 2018 (CSPB.ML.2018). This dataset is similar to the original, with one key difference: the probability distribution of the carrier-frequency offset parameter, viewed as a random variable, is not the same, although it is still realistic.
Blog Note: By WordPress’ count, this is the 100th post on the CSP Blog. Together with a handful of pages (like My Papers and The Literature), these hundred posts have resulted in about 250,000 page views. That’s an average of 2,500 page views per post. However, the variance of the per-post pageviews is quite large. The most popular is The Spectral Correlation Function (> 16,000) while the post More on Pure and Impure Sinewaves, from the same era, has only 316 views. A big Thanks to all my readers!!
What are the ranges of spectral frequency and cycle frequency that we need to consider in a discrete-time/discrete-frequency setting for CSP?
Let’s talk about that diamond-shaped region in the plane we so often see associated with CSP. I’m talking about the principal domain for the discrete-time/discrete-frequency spectral correlation function. Where does it come from? Why do we care? When does it come up?
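As a rough sketch of the diamond (my assumptions, not a quote from the post): take normalized frequencies so that the sampling rate is 1, and take the symmetric non-conjugate spectral correlation function, which correlates the spectral components at f + α/2 and f − α/2. Requiring both of those frequencies to lie in [−1/2, 1/2] gives the region |f| + |α|/2 ≤ 1/2. The function name `in_principal_domain` is hypothetical.

```python
def in_principal_domain(f, alpha):
    """Check whether (f, alpha) lies in the diamond |f| + |alpha|/2 <= 1/2.

    Assumes normalized frequencies (sampling rate of 1) and the symmetric
    non-conjugate definition, so that both f + alpha/2 and f - alpha/2
    must fall within [-1/2, 1/2].
    """
    return abs(f) + abs(alpha) / 2.0 <= 0.5


# The four vertices of the diamond lie on its boundary.
vertices = [(0.5, 0.0), (-0.5, 0.0), (0.0, 1.0), (0.0, -1.0)]
print([in_principal_domain(f, a) for f, a in vertices])

# A point inside the square |f| <= 1/2, |alpha| <= 1 but outside the diamond.
print(in_principal_domain(0.4, 0.4))
```

Under these assumptions the largest spectral frequency is available only at α = 0, and the largest cycle frequency only at f = 0, which is where the diamond shape comes from.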
The Fast Spectral Correlation estimator is a quick way to find small cycle frequencies. However, its restrictions render it inferior to estimators like the SSCA and FAM.
In this post we take a look at an alternative CSP estimator created by J. Antoni et al. (The Literature [R152]). The paper describing the estimator can be found here, and you can get some corresponding MATLAB code, posted by the authors, here if you have a Mathworks account.
The merging of conventional probability theory with signal theory leads to random processes, also known as stochastic processes. The ideas involved with random processes are central to cyclostationary signal processing.
In this Signal Processing ToolKit post, I provide an introduction to the concept and use of random processes (also called stochastic processes). This is my perspective on random processes, so although I’ll introduce and use the conventional concepts of stationarity and ergodicity, I’ll end up focusing on the differences between stationary and cyclostationary random processes. The goal is to illustrate those differences with informative graphics and videos; to build intuition in the reader about how the cyclostationarity property comes about, and about how the property relates to the more abstract mathematical object of a random process on one hand and to the concrete data-centric signal on the other.
So … this is the first SPTK post that is also a CSP post.
Just a reminder that if you are getting some value out of the CSP Blog, I would appreciate it if you could make a donation to offset my costs: I do pay WordPress to keep ads off the site! I also pay extra for a class of service that allows me to post large data sets like the one for the Machine-Learner Challenge.
If everyone who derived value from the CSP Blog were to donate $5, I’d have enough left over for at least a couple cups of fancy coffee.
Why does zero-padding help in various estimators of the spectral correlation and spectral coherence functions?
Updates to the exchange: May 7, 2021 and May 14, 2021.
Reader Clint posed a great question about zero-padding in the frequency-smoothing method (FSM) of spectral correlation function estimation. The question prompted some pondering on my part, and I went ahead and did some experiments with the FSM to illustrate my response to Clint. The exchange with Clint (ongoing!) was deep and detailed enough that I thought it deserved to be seen by other CSP-Blog readers. One of the problems with developing material, or refining explanations, in the Comments sections of the CSP Blog is that these sections are not nearly as visible in the navigation tools featured on the Blog as are the Posts and Pages.
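For readers who want the basic mechanics in hand, here is a minimal sketch of what zero-padding does to the frequency grid of a DFT. It does not reproduce the FSM or the experiments from the exchange; the block length, padding factor, and variable names are assumptions chosen for illustration. Padding a length-N block to 4N samples leaves the original N frequency bins unchanged and adds three new, interpolated bins between each adjacent pair.

```python
import numpy as np

rng = np.random.default_rng(0)

num_samples = 64   # assumed block length
pad_factor = 4     # assumed zero-padding factor

# A complex-valued noise block standing in for captured data.
x = rng.standard_normal(num_samples) + 1j * rng.standard_normal(num_samples)

# Unpadded DFT: normalized-frequency spacing of 1/num_samples.
X = np.fft.fft(x)

# Passing n > len(x) makes np.fft.fft append zeros before transforming,
# so the spacing becomes 1/(pad_factor * num_samples).
X_padded = np.fft.fft(x, n=pad_factor * num_samples)

# Every pad_factor-th bin of the padded DFT coincides with an original bin.
original_bins_preserved = np.allclose(X_padded[::pad_factor], X)

print("bin spacing without padding:", 1.0 / num_samples)
print("bin spacing with padding:", 1.0 / (pad_factor * num_samples))
print("original bins preserved:", original_bins_preserved)
```

Zero-padding adds no new information about the signal, but the denser grid gives more candidate frequencies at which shifted spectra can be lined up, which is at least one place where it could matter for a frequency-domain estimator.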
Let’s take a brief look at the cyclostationarity of a captured DMR signal. It’s more complicated than one might think.
In this post I look at the cyclostationarity of a digital mobile radio (DMR) signal empirically. That is, I have a captured DMR signal from sigidwiki.com, and I apply blind CSP to it to determine its cycle frequencies and spectral correlation function. The signal is arranged in frames or slots, with gaps between successive slots, so there is the chance that we’ll see cyclostationarity due to the on-burst (or on-frame) signaling and cyclostationarity due to the framing itself.
Another post-publication review of a paper that is weak on the ‘RF’ in RF machine learning.
Let’s take a look at a recently published paper (The Literature [R148]) on machine-learning-based modulation-recognition to get a data point on how some electrical engineers–these are more on the side of computer science I believe–use mathematics when they turn to radio-frequency problems. You can guess it isn’t pretty, and that I’m not here to exalt their acumen.
Spectral correlation surfaces for real-valued and complex-valued versions of the same signal look quite different.
In the real world, the electromagnetic field is a multi-dimensional time-varying real-valued function (volts/meter or newtons/coulomb). But in mathematical physics and signal processing, we often use complex-valued representations of the field, or of quantities derived from it, to facilitate our mathematics or make the signal processing more compact and efficient.
So throughout the CSP Blog I’ve focused almost exclusively on complex-valued signals and data. However, there is a considerable older literature that uses real-valued signals, such as The Literature [R1, R151]. You can use either real-valued or complex-valued signal representations and data, as you prefer, but there are advantages and disadvantages to each choice. Moreover, an author might not be perfectly clear about which one is used, especially when presenting a spectral correlation surface (as opposed to a sequence of equations, where things are often more clear).
Last evening the CSP Blog crossed the 50,000 page-view threshold for 2020, a yearly total that has not been achieved previously!
I want to thank each reader, each commenter, and each person that’s clicked the Donate button. You’ve made the CSP Blog the success it is, and I am so grateful for the time you spend here.
On these occasions I put some of the more interesting CSP-Blog statistics below the fold. If you have been wanting to see a post on a particular CSP or Signal Processing ToolKit topic, and it just hasn’t appeared, feel free to leave me a note in the Comments section.
The Machine Learners think that their “feature engineering” (rooting around in voluminous data) is the same as “features” in mathematically derived signal-processing algorithms. I take a lighthearted look.
One of the things the machine learners never tire of saying is that their neural-network approach to classification is superior to previous methods because, in part, those older methods use hand-crafted features. They put it in different ways, but somewhere in the introductory section of a machine-learning modulation-recognition paper (ML/MR), you’ll likely see the claim. You can look through the ML/MR papers I’ve cited in The Literature ([R133]-[R146]) if you are curious, but I’ll extract a couple here just to illustrate the idea.
What happens when a cyclostationary time-series is treated as if it were stationary?
In this post let’s consider the difference between modeling a communication signal as stationary or as cyclostationary.
There are two contexts for this kind of issue. The first is when someone recognizes that a particular signal model is cyclostationary, and then takes some action to render it stationary (sometimes called ‘stationarizing the signal’). They then proceed with their analysis or algorithm development using the stationary signal model. The second context is when someone applies stationary-signal processing to a cyclostationary signal model, either without knowing that the signal is cyclostationary, or perhaps knowing but not caring.
At the center of this topic is the difference between the mathematical object known as a random process (or stochastic process) and the mathematical object that is a single infinite-time function (or signal or time-series).
A related paper is The Literature [R68], which discusses the pitfalls of applying tools meant for stationary signals to the samples of cyclostationary signals.
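A small Monte Carlo sketch of the first context, under assumptions of my own choosing: the signal model is white Gaussian noise multiplied by a cosine, and the "stationarizing" action is to give the cosine a phase that is uniformly distributed over a full cycle. Phase randomization is one common way to do this, but it is not necessarily the action taken in any particular paper. The ensemble-average power of the fixed-phase model varies periodically with time, while that of the random-phase model is flat.

```python
import numpy as np

rng = np.random.default_rng(1)

num_trials = 20000   # ensemble size (number of sample paths)
num_samples = 16     # time instants examined per sample path
f0 = 0.125           # assumed normalized sine-wave frequency

n = np.arange(num_samples)
a = rng.standard_normal((num_trials, num_samples))  # unit-variance white noise

# Cyclostationary model: every sample path shares the same carrier phase.
x_cyclo = a * np.cos(2.0 * np.pi * f0 * n)

# Stationarized model: each sample path gets its own uniform random phase.
theta = rng.uniform(0.0, 2.0 * np.pi, size=(num_trials, 1))
x_stat = a * np.cos(2.0 * np.pi * f0 * n + theta)

# Ensemble-average power at each time instant (average over sample paths).
power_cyclo = np.mean(x_cyclo**2, axis=0)
power_stat = np.mean(x_stat**2, axis=0)

swing_cyclo = power_cyclo.max() - power_cyclo.min()
swing_stat = power_stat.max() - power_stat.min()

print("power swing over time, fixed phase:", swing_cyclo)
print("power swing over time, random phase:", swing_stat)
```

Note that each individual sample path of the random-phase model is still noise times a sine wave with some particular phase. The flatness lives in the ensemble, not in any single time-series, which is the distinction between the random process and the single infinite-time function.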
DeepSig’s data sets are popular in the machine-learning modulation-recognition community, and in that community there are many claims that the deep neural networks are vastly outperforming any expertly hand-crafted tired old conventional method you care to name (none are usually named though). So I’ve been looking under the hood at these data sets to see what the machine learners think of as high-quality inputs that lead to disruptive upending of the sclerotic mod-rec establishment. In previous posts, I’ve looked at two of the most popular DeepSig data sets from 2016 (here and here). In this post, we’ll look at one more and I will then try to get back to the CSP posts.
Let’s take a look at one more DeepSig data set: 2018.01.OSC.0001_1024x2M.h5.tar.gz.