## Are Probability Density Functions “Engineered” or “Hand-Crafted” Features?

The Machine Learners think that their “feature engineering” (rooting around in voluminous data) is the same as “features” in mathematically derived signal-processing algorithms. I take a lighthearted look.

One of the things the machine learners never tire of saying is that their neural-network approach to classification is superior to previous methods because, in part, those older methods use hand-crafted features. They put it in different ways, but somewhere in the introductory section of a machine-learning modulation-recognition paper (ML/MR), you’ll likely see the claim. You can look through the ML/MR papers I’ve cited in The Literature ([R133]-[R146]) if you are curious, but I’ll extract a couple here just to illustrate the idea.

Continue reading “Are Probability Density Functions “Engineered” or “Hand-Crafted” Features?”

## Stationary Signal Models Versus Cyclostationary Signal Models

What happens when a cyclostationary time-series is treated as if it were stationary?

In this post let’s consider the difference between modeling a communication signal as stationary or as cyclostationary.

There are two contexts for this kind of issue. The first is when someone recognizes that a particular signal model is cyclostationary, and then takes some action to render it stationary (sometimes called ‘stationarizing the signal’). They then proceed with their analysis or algorithm development using the stationary signal model. The second context is when someone applies stationary-signal processing to a cyclostationary signal model, either without knowing that the signal is cyclostationary, or perhaps knowing but not caring.

At the center of this topic is the difference between the mathematical object known as a random process (or stochastic process) and the mathematical object that is a single infinite-time function (or signal or time-series).

A related paper is The Literature [R68], which discusses the pitfalls of applying tools meant for stationary signals to the samples of cyclostationary signals.

## All BPSK Signals

An analysis of DeepSig’s 2016.10A dataset, used in many published machine-learning papers, and detailed comments on quite a few of those papers.

Update March 2021

See my analyses of three other DeepSig datasets here, here, and here.

Update June 2020

I’ll be adding new papers to this post as I find them. At the end of the original post there is a sequence of date-labeled updates that briefly describe the relevant aspects of the newly found papers. Some machine-learning modulation-recognition papers deserve their own post, so check back at the CSP Blog from time-to-time for “Comments On …” posts.

## Symmetries of Higher-Order Temporal Probabilistic Parameters in CSP

What are the unique parts of the multidimensional cyclic moments and cyclic cumulants?

In this post, we continue our study of the symmetries of CSP parameters. The second-order parameters–spectral correlation and cyclic correlation–are covered in detail in the companion post, including the symmetries for ‘auto’ and ‘cross’ versions of those parameters.

Here we tackle the generalizations of cyclic correlation: cyclic temporal moments and cumulants. We’ll deal with the generalization of the spectral correlation function, the  cyclic polyspectra, in a subsequent post. It is reasonable to me to focus first on the higher-order temporal parameters, because I consider the temporal parameters to be much more useful in practice than the spectral parameters.

This topic is somewhat harder and more abstract than the second-order topic, but perhaps there are bigger payoffs in algorithm development for exploiting symmetries in higher-order parameters than in second-order parameters because the parameters are multidimensional. So it could be worthwhile to sally forth.

## Symmetries of Second-Order Probabilistic Parameters in CSP

Do we need to consider all cycle frequencies, both positive and negative? Do we need to consider all delays and frequencies in our second-order CSP parameters?

As you progress through the various stages of learning CSP (intimidation, frustration, elucidation, puzzlement, and finally smooth operation), the symmetries of the various functions come up over and over again. Exploiting symmetries can result in lower computational costs, quicker debugging, and easier mathematical development.

What exactly do we mean by ‘symmetries of parameters?’ I’m talking primarily about the evenness or oddness of the time-domain functions in the delay $\tau$ and cycle frequency $\alpha$ variables and of the frequency-domain functions in the spectral frequency $f$ and cycle frequency $\alpha$ variables. Or a generalized version of evenness/oddness, such as $f(-x) = g(x)$, where $f(x)$ and $g(x)$ are closely related functions. We have to consider the non-conjugate and conjugate functions separately, and we’ll also consider both the auto and cross versions of the parameters. We’ll look at higher-order cyclic moments and cumulants in a future post.

You can use this post as a resource for mathematical development because I present the symmetry equations. But also each symmetry result is illustrated using estimated parameters via the frequency smoothing method (FSM) of spectral correlation function estimation. The time-domain parameters are obtained from the inverse transforms of the FSM parameters. So you can also use this post as an extension of the second-order verification guide to ensure that your estimator works for a wide variety of input parameters.

## The Ambiguity Function and the Cyclic Autocorrelation Function: Are They the Same Thing?

To-may-to, to-mah-to?

Let’s talk about ambiguity and correlation. The ambiguity function is a core component of radar signal processing practice and theory. The autocorrelation function and the cyclic autocorrelation function, are key elements of generic signal processing and cyclostationary signal processing, respectively. Ambiguity and correlation both apply a quadratic functional to the data or signal of interest, and they both weight that quadratic functional by a complex exponential (sine wave) prior to integration or summation.

Are they the same thing? Well, my answer is both yes and no.

## CSP Resources: The Ultimate Guides to Cyclostationary Random Processes by Professor Napolitano

My friend and colleague Antonio Napolitano has just published a new book on cyclostationary signals and cyclostationary signal processing:

Cyclostationary Processes and Time Series: Theory, Applications, and Generalizations, Academic Press/Elsevier, 2020, ISBN: 978-0-08-102708-0. The book is a comprehensive guide to the structure of cyclostationary random processes and signals, and it also provides pointers to the literature on many different applications. The book is mathematical in nature; use it to deepen your understanding of the underlying mathematics that make CSP possible.

You can check out the book on amazon.com using the following link:

Cyclostationary Processes and Time Series

## On Impulsive Noise, CSP, and Correntropy

And I still don’t understand how a random variable with infinite variance can be a good model for anything physical. So there.

I’ve seen several published and pre-published (arXiv.org) technical papers over the past couple of years on the topic of cyclic correntropy (The Literature [R123-R127]). I first criticized such a paper ([R123]) here, but the substance of that review was about my problems with the presented mathematics, not impulsive noise and its effects on CSP. Since the papers keep coming, apparently, I’m going to put down some thoughts on impulsive noise and some evidence regarding simple means of mitigation in the context of CSP. Preview: I don’t think we need to go to the trouble of investigating cyclic correntropy as a means of salvaging CSP from the evil clutches of impulsive noise.

## On The Shoulders

What modest academic success I’ve had in the area of cyclostationary signal theory and cyclostationary signal processing is largely due to the patient mentorship of my doctoral adviser, William (Bill) Gardner, and the fact that I was able to build on an excellent foundation put in place by Gardner, his advisor Lewis Franks, and key Gardner students such as William (Bill) Brown.

## Simple Synchronization Using CSP

Using CSP to find the exact values of symbol rate, carrier frequency offset, symbol-clock phase, and carrier phase for PSK/QAM signals.

In this post I discuss the use of cyclostationary signal processing applied to communication-signal synchronization problems. First, just what are synchronization problems? Synchronize and synchronization have multiple meanings, but the meaning of synchronize that is relevant here is something like:

syn·chro·nize: To cause to occur or operate with exact coincidence in time or rate

If we have an analog amplitude-modulated (AM) signal (such as voice AM used in the AM broadcast bands) at a receiver we want to remove the effects of the carrier sine wave, resulting in an output that is only the original voice or music message. If we have a digital signal such as binary phase-shift keying (BPSK), we want to remove the effects of the carrier but also sample the message signal at the correct instants to optimally recover the transmitted bit sequence.

## Dataset for the Machine-Learning Challenge [CSPB.ML.2018]

A PSK/QAM/SQPSK data set with randomized symbol rate, inband SNR, carrier-frequency offset, and pulse roll-off.

Update September 2023: A randomization flaw has been found and fixed for CSPB.ML.2018, resulting in CSPB.ML.2018R2. Use that one going forward.

Update February 2023: I’ve posted a third challenge dataset here. It is CSPB.ML.2023 and features cochannel signals.

Update April 2022. I’ve also posted a second dataset here. This new dataset is similar to the original ML Challenge dataset except the random variable representing the carrier frequency offset has a slightly different distribution.

If you refer to either of the posted datasets in a published paper, please use the following designators, which I am also using in papers I’m attempting to publish:

Original ML Challenge Dataset: CSPB.ML.2018.

Shifted ML Challenge Dataset: CSPB.ML.2022.

Update September 2020. I made a mistake when I created the signal-parameter “truth” files signal_record.txt and signal_record_first_20000.txt. Like the DeepSig RML datasets that I analyzed on the CSP Blog here and here, the SNR parameter in the truth files did not match the actual SNR of the signals in the data files. I’ve updated the truth files and the links below. You can still use the original files for all other signal parameters, but the SNR parameter was in error.

Update July 2020. I originally posted $20,000$ signals in the posted dataset. I’ve now added another $92,000$ for a total of $112,000$ signals. The original signals are contained in Batches 1-5, the additional signals in Batches 6-28. I’ve placed these additional Batches at the end of the post to preserve the original post’s content.

Continue reading “Dataset for the Machine-Learning Challenge [CSPB.ML.2018]”

## How we Learned CSP

We learned it using abstractions involving various infinite quantities. Can a machine learn it without that advantage?

This post is just a blog post. Just some guy on the internet thinking out loud. If you have relevant thoughts or arguments you’d like to advance, please leave them in the Comments section at the end of the post.

How did we, as people not machines, learn to do cyclostationary signal processing? We’ve successfully applied it to many real-world problems, such as weak-signal detection, interference-tolerant detection, interference-tolerant time-delay estimation, modulation recognition, joint multiple-cochannel-signal modulation recognition (My Papers [25,26,28,38,43]), synchronization (The Literature [R7]), beamforming (The Literature [R102,R103]), direction-finding (The Literature [R104-R106]), detection of imminent mechanical failures (The Literature [R017-R109]), linear time-invariant system identification (The Literature [R110-R115]), and linear periodically time-variant filtering for cochannel signal separation (FRESH filtering) (My Papers [45], The Literature [R6]).

How did this come about? Is it even interesting to ask the question? Well, it is to me. I ask it because of the current hot topic in signal processing: machine learning. And in particular, machine learning applied to modulation recognition (see here and here and here and here). The machine learners want to capitalize on the success of machine learning as applied to image recognition by directly applying the same sorts of image-recognition techniques to the problem of automatic type-recognition for human-made electromagnetic waves.

## A Challenge for the Machine Learners

The machine-learning modulation-recognition community consistently claims vastly superior performance to anything that has come before. Let’s test that.

Update September 2023: A randomization flaw has been found and fixed for CSPB.ML.2018, resulting in Use that one going forward.

Update February 2023: A third dataset has been posted here. This new dataset, CSPB.ML.2023, features cochannel signals.

Update April 2022: I’ve also posted a second dataset here. This new dataset is similar to the original ML Challenge dataset except the random variable representing the carrier frequency offset has a slightly different distribution.

If you refer to any of the posted datasets in a published paper, please use the following designators, which I am also using in papers I’m attempting to publish:

Original ML Challenge Dataset: CSPB.ML.2018.

Shifted ML Challenge Dataset: CSPB.ML.2022.

Cochannel ML Dataset: CSPB.ML.2023.

### Update February 2019

I’ve decided to post the data set I discuss here to the CSP Blog for all interested parties to use. See the new post on the Data Set. If you do use it, please let me and the CSP Blog readers know how you fared with your experiments in the Comments section of either post. Thanks!

## Computational Costs for Spectral Correlation Estimators

The costs strongly depend on whether you have prior cycle-frequency information or not.

Let’s look at the computational costs for spectral-correlation analysis using the three main estimators I’ve previously described on the CSP Blog: the frequency-smoothing method (FSM), the time-smoothing method (TSM), and the strip spectral correlation analyzer (SSCA).

We’ll see that the FSM and TSM are the low-cost options when estimating the spectral correlation function for a few cycle frequencies and that the SSCA is the low-cost option when estimating the spectral correlation function for many cycle frequencies. That is, the TSM and FSM are good options for directed analysis using prior information (values of cycle frequencies) and the SSCA is a good option for exhaustive blind analysis, for which there is no prior information available.

## CSP Patent: Tunneling

Tunneling == Purposeful severe undersampling of wideband communication signals. If some of the cyclostationarity property remains, we can exploit it at a lower cost.

My colleague Dr. Apurva Mody (of BAE Systems, AiRANACULUS, IEEE 802.22, and the WhiteSpace Alliance) and I have received a patent on a CSP-related invention we call tunneling. The US Patent is 9,755,869 and you can read it here or download it here. We’ve got a journal paper in review and a 2013 MILCOM conference paper (My Papers [38]) that discuss and illustrate the involved ideas. I’m also working on a CSP Blog post on the topic.

Update December 28, 2017: Our Tunneling journal paper has been accepted for publication in the journal IEEE Transactions on Cognitive Communications and Networking. You can download the pre-publication version here.

## Resolution in Time, Frequency, and Cycle Frequency for CSP Estimators

Unlike conventional spectrum analysis for stationary signals, CSP has three kinds of resolutions that must be considered in all CSP applications, not just two.

In this post, we look at the ability of various CSP estimators to distinguish cycle frequencies, temporal changes in cyclostationarity, and spectral features. These abilities are quantified by the resolution properties of CSP estimators.

### Resolution Parameters in CSP: Preview

Consider performing some CSP estimation task, such as using the frequency-smoothing method, time-smoothing method, or strip spectral correlation analyzer method of estimating the spectral correlation function. The estimate employs $T$ seconds of data.

Then the temporal resolution $\Delta t$ of the estimate is approximately $T$, the cycle-frequency resolution $\Delta \alpha$ is about $1/T$, and the spectral resolution $\Delta f$ depends strongly on the particular estimator and its parameters. The resolution product $\Delta f \Delta t$ was discussed in this post. The fundamental result for the resolution product is that it must be very much larger than unity in order to obtain an SCF estimate with low variance.

## CSP Estimators: Cyclic Temporal Moments and Cumulants

How do we efficiently estimate higher-order cyclic cumulants? The basic answer is first estimate cyclic moments, then combine using the moments-to-cumulants formula.

In this post we discuss ways of estimating $n$-th order cyclic temporal moment and cumulant functions. Recall that for $n=2$, cyclic moments and cyclic cumulants are usually identical. They differ when the signal contains one or more finite-strength additive sine-wave components. In the common case when such components are absent (as in our recurring numerical example involving rectangular-pulse BPSK), they are equal and they are also equal to the conventional cyclic autocorrelation function provided the delay vector is chosen appropriately. That is, the two-dimensional delay vector $\boldsymbol{\tau} = [\tau_1\ \ \tau_2]$ is set equal to $[\tau/2\ \ -\tau/2]$.

The more interesting case is when the order $n$ is greater than two. Most communication signal models possess odd-order moments and cumulants that are identically zero, so the first non-trivial order $n$ greater than two is four. Our estimation task is to estimate $n$-th order temporal moment and cumulant functions for $n \ge 4$ using a sampled-data record of length $T$.

## CSP Blog Highlights

Welcome to the CSP Blog!

To help new readers, I’m supplying here links to the posts that have gotten the most attention over the lifetime of the Blog. Omitted from this list are the more esoteric topics as well as most of the posts that comment on the engineering literature.

Click the button below to follow the CSP Blog:

Use this button to donate to the CSP Blog:

Support the CSP Blog and Keep it Ad-Free

Please consider donating to the CSP Blog to keep it ad-free and to support the addition of major new features. The small box below is used to specify the number of $5 donations.$5.00

You can see a pre-publication version of my latest CSP journal paper, on “tunneling”, here.

Here are some highlights:

## Automatic Spectral Segmentation

Radio-frequency scene analysis is much more complex than modulation recognition. A good first step is to blindly identify the frequency intervals for which significant non-noise energy exists.

In this post, I discuss a signal-processing algorithm that has almost nothing to do with cyclostationary signal processing (CSP). Almost. The topic is automatic spectral segmentation, which I also call band-of-interest (BOI) detection. When attempting to perform automatic radio-frequency scene analysis (RFSA), we may be confronted with a data block that contains multiple signals in a number of distinct frequency subbands. Moreover, these signals may be turning on and off within the data block. To apply our cyclostationary signal processing tools effectively, we would like to isolate these signals in time and frequency to the greatest extent possible using linear time-invariant filtering (for separating in the frequency dimension) and time-gating (for separating in the time dimension). Then the isolated signal components can be processed serially using CSP.

It is very important to remember that even perfect spectral and temporal segmentation will not solve the cochannel-signal problem. It is perfectly possible that an isolated subband will contain more than one cochannel signal.

The basics of my BOI-detection approach are published in a 2007 conference paper (My Papers [32]). I’ll describe this basic approach, illustrate it with examples relevant to RFSA, and also provide a few extensions of interest, including one that relates to cyclostationary signal processing.