## Symmetries of Higher-Order Temporal Probabilistic Parameters in CSP

In this post, we continue our study of the symmetries of CSP parameters. The second-order parameters–spectral correlation and cyclic correlation–are covered in detail in the companion post, including the symmetries for ‘auto’ and ‘cross’ versions of those parameters.

Here we tackle the generalizations of cyclic correlation: cyclic temporal moments and cumulants. We’ll deal with the generalization of the spectral correlation function, the  cyclic polyspectra, in a subsequent post. It is reasonable to me to focus first on the higher-order temporal parameters, because I consider the temporal parameters to be much more useful in practice than the spectral parameters.

This topic is somewhat harder and more abstract than the second-order topic, but perhaps there are bigger payoffs in algorithm development for exploiting symmetries in higher-order parameters than in second-order parameters because the parameters are multidimensional. So it could be worthwhile to sally forth.

## On The Shoulders

What modest academic success I’ve had in the area of cyclostationary signal theory and cyclostationary signal processing is largely due to the patient mentorship of my doctoral adviser, William (Bill) Gardner, and the fact that I was able to build on an excellent foundation put in place by Gardner, his advisor Lewis Franks, and key Gardner students such as William (Bill) Brown.

## Simple Synchronization Using CSP

In this post I discuss the use of cyclostationary signal processing applied to communication-signal synchronization problems. First, just what are synchronization problems? Synchronize and synchronization have multiple meanings, but the meaning of synchronize that is relevant here is something like:

syn·chro·nize: To cause to occur or operate with exact coincidence in time or rate

If we have an analog amplitude-modulated (AM) signal (such as voice AM used in the AM broadcast bands) at a receiver we want to remove the effects of the carrier sine wave, resulting in an output that is only the original voice or music message. If we have a digital signal such as binary phase-shift keying (BPSK), we want to remove the effects of the carrier but also sample the message signal at the correct instants to optimally recover the transmitted bit sequence.

## Data Set for the Machine-Learning Challenge

Update July 2020. I originally posted $20,000$ signals in the posted data set. I’ve now added another $92,000$ for a total of $112,000$ signals. The original signals are contained in Batches 1-5, the additional signals in Batches 6-28. I’ve placed these additional Batches at the end of the post to preserve the original post’s content.

I’ve posted $20000$ PSK/QAM signals to the CSP Blog. These are the signals I refer to in the post I wrote challenging the machine-learners. In this brief post, I provide links to the data and describe how to interpret the text file containing the signal-type labels and signal parameters.

### Overview of Data Set

The $20,000$ signals are stored in five zip files, each containing $4000$ individual signal files:

Batch 1

Batch 2

Batch 3

Batch 4

Batch 5

The zip files are each about 1 GB in size.

The modulation-type labels for the signals, such as “BPSK” or “MSK,” are contained in the zipped text file:

signal_record_first_20000.txt.zip

Each signal file is stored in a binary format involving interleaved real and imaginary parts, which I call ‘.tim’ files. You can read a .tim file into MATLAB using read_binary.m. Or use the code inside read_binary.m to write your own data-reader; the format is quite simple.

### The Label and Parameter File

Let’s look at the format of the truth/label file. The first line of signal_record_first_20000.txt is

1 bpsk  11  -7.4433467080e-04  9.8977795076e-01  10  9  5.4532617590e+00  0.0

which comprises $9$ fields. All temporal and spectral parameters (times and frequencies) are normalized with respect to the sampling rate. In other words, the sampling rate can be taken to be unity in this data set. These fields are described in the following list:

1. Signal index. In the case above this is `1′ and that means the file containing the signal is called signal_1.tim. In general, the $n$th signal is contained in the file signal_n.tim. The Batch 1 zip file contains signal_1.tim through signal_4000.tim.
2. Signal type. A string indicating the modulation format of the signal in the file. For this data set, I’ve only got eight modulation types: BPSK, QPSK, 8PSK, $\pi/4$-DQPSK, 16QAM, 64QAM, 256QAM, and MSK. These are denoted by the strings bpsk, qpsk, 8psk, dqpsk, 16qam, 64qam, 256qam, and msk, respectively.
3. Base symbol period. In the example above (line one of the truth file), the base symbol period is $T_0 = 11$.
4. Carrier offset. In this case, it is $-7.4433467080\times 10^{-4}$.
5. Excess bandwidth. The excess bandwidth parameter, or square-root raised-cosine roll-off parameter, applies to all of the signal types except MSK. Here it is $9.8977795076\times 10^{-1}$. It can be any real number between $0.1$ and $1.0$.
6. Upsample factor. The sixth field is an upsampling parameter U.
7. Downsample factor. The seventh field is a downsampling parameter D. The actual symbol rate of the signal in the file is computed from the base symbol period, upsample factor, and downsample factor: $\displaystyle f_{sym} = (1/T_0)*(D/U)$. So the BPSK signal in signal_1.tim has rate $0.08181818$. If the downsample factor is zero in the truth-parameters file, no resampling was done to the signal.
8. Inband SNR (dB). The ratio of the signal power to the noise power within the signal’s bandwidth, taking into account the signal type and the excess bandwidth parameter.
9. Noise spectral density (dB). It is always $0$ dB. So the various SNRs are generated by varying the signal power.

To ensure that you have correctly downloaded and interpreted my data files, I’m going to provide some PSD plots and a couple of the actual sample values for a couple of the files.

### signal_1.tim

The line from the truth file is:

1 bpsk  11  -7.4433467080e-04  9.8977795076e-01  10  9  5.4532617590e+00  0.0

The first ten samples of the file are:

-5.703014e-02   -6.163056e-01
-1.285231e-01   -6.318392e-01
6.664069e-01    -7.007506e-02
7.731103e-01    -1.164615e+00
3.502680e-01    -1.097872e+00
7.825349e-01    -3.721564e-01
1.094809e+00    -3.123962e-01
4.146149e-01    -5.890701e-01
1.444665e+00    7.358724e-01
-2.217039e-01   -1.305001e+00

An FSM-based PSD estimate for signal_1.tim is:

And the blindly estimated cycle frequencies (using the SSCA) are:

The previous plot corresponds to the numerical values:

Non-conjugate $(\alpha, C, S)$:

8.181762695e-02  7.480e-01  5.406e+00

Conjugate $(\alpha, C, S)$:

8.032470942e-02  7.800e-01  4.978e+00
-1.493096002e-03  8.576e-01  1.098e+01
-8.331298083e-02  7.090e-01  5.039e+00

### signal_4000.tim

The line from the truth file is

4000 256qam  9  8.3914849139e-04  7.2367959637e-01  9  8  1.0566301192e+01  0.0

which means the symbol rate is given by $(1/9)*(8/9) = 0.09876543209$. The carrier offset is $0.000839$ and the excess bandwidth is $0.723$. Because the signal type is 256QAM, it has a single (non-zero) non-conjugate cycle frequency of $0.098765$ and no conjugate cycle frequencies. But the square of the signal has cycle frequencies related to the quadrupled carrier:

### Final Thoughts

Is $20000$ waveforms a large enough data set? Maybe not. I have generated tens of thousands more, but will not post until there is a good reason to do so. And that, my friends, is up to you!

That’s about it. I think that gives you enough information to ensure that you’ve interpreted the data and the labels correctly. What remains is experimentation, machine-learning or otherwise I suppose. Please get back to me and the readers of the CSP Blog with any interesting results using the Comments section of this post or the Challenge post.

For my analysis of a commonly used machine-learning modulation-recognition data set (RML), see the All BPSK Signals post.

Batch 6

Batch 7

Batch 8

Batch 9

Batch 10

Batch 11

Batch 12

Batch 13

Batch 14

Batch 15

Batch 16

Batch 17

Batch 18

Batch 19

Batch 20

Batch 21

Batch 22

Batch 23

Batch 24

Batch 25

Batch 26

Batch 28

Signal parameters text file

## How we Learned CSP

This post is just a blog post. Just some guy on the internet thinking out loud. If you have relevant thoughts or arguments you’d like to advance, please leave them in the Comments section at the end of the post.

How did we, as people not machines, learn to do cyclostationary signal processing? We’ve successfully applied it to many real-world problems, such as weak-signal detection, interference-tolerant detection, interference-tolerant time-delay estimation, modulation recognition, joint multiple-cochannel-signal modulation recognition (My Papers [25,26,28,38,43]), synchronization (The Literature [R7]), beamforming (The Literature [R102,R103]), direction-finding (The Literature [R104-R106]), detection of imminent mechanical failures (The Literature [R017-R109]), linear time-invariant system identification (The Literature [R110-R115]), and linear periodically time-variant filtering for cochannel signal separation (FRESH filtering) (My Papers [45], The Literature [R6]).

How did this come about? Is it even interesting to ask the question? Well, it is to me. I ask it because of the current hot topic in signal processing: machine learning. And in particular, machine learning applied to modulation recognition (see here and here). The machine learners want to capitalize on the success of machine learning applied to image recognition by directly applying the same sorts of image-recognition techniques to the problem of automatic type-recognition for human-made electromagnetic waves.

## ‘Can a Machine Learn the Fourier Transform?’ Redux, Plus Relevant Comments on a Machine-Learning Paper by M. Kulin et al.

I first considered whether a machine (neural network) could learn the (64-point, complex-valued)  Fourier transform in this post. I used MATLAB’s Neural Network Toolbox and I failed to get good learning results because I did not properly set the machine’s hyperparameters. A kind reader named Vito Dantona provided a comment to that original post that contained good hyperparameter selections, and I’m going to report the new results here in this post.

Since the Fourier transform is linear, the machine should be set up to do linear processing. It can’t just figure that out for itself. Once I used Vito’s suggested hyperparameters to force the machine to be linear, the results became much better:

## Resolution in Time, Frequency, and Cycle Frequency for CSP Estimators

In this post, we look at the ability of various CSP estimators to distinguish cycle frequencies, temporal changes in cyclostationarity, and spectral features. These abilities are quantified by the resolution properties of CSP estimators.

### Resolution Parameters in CSP: Preview

Consider performing some CSP estimation task, such as using the frequency-smoothing method, time-smoothing method, or strip spectral correlation analyzer method of estimating the spectral correlation function. The estimate employs $T$ seconds of data.

Then the temporal resolution $\Delta t$ of the estimate is approximately $T$, the cycle-frequency resolution $\Delta \alpha$ is about $1/T$, and the spectral resolution $\Delta f$ depends strongly on the particular estimator and its parameters. The resolution product $\Delta f \Delta t$ was discussed in this post. The fundamental result for the resolution product is that it must be very much larger than unity in order to obtain an SCF estimate with low variance.

## CSP Estimators: Cyclic Temporal Moments and Cumulants

In this post we discuss ways of estimating $n$-th order cyclic temporal moment and cumulant functions. Recall that for $n=2$, cyclic moments and cyclic cumulants are usually identical. They differ when the signal contains one or more finite-strength additive sine-wave components. In the common case when such components are absent (as in our recurring numerical example involving rectangular-pulse BPSK), they are equal and they are also equal to the conventional cyclic autocorrelation function provided the delay vector is chosen appropriately.

The more interesting case is when the order $n$ is greater than $2$. Most communication signal models possess odd-order moments and cumulants that are identically zero, so the first non-trivial order $n$ greater than $2$ is $4$. Our estimation task is to estimate $n$-th order temporal moment and cumulant functions for $n \ge 4$ using a sampled-data record of length $T$.

## More on Pure and Impure Sine Waves

Remember when we derived the cumulant as the solution to the pure $n$th-order sine-wave problem? It sounded good at the time, I hope. But here I describe a curious special case where the interpretation of the cumulant as the pure component of a nonlinearly generated sine wave seems to break down.

## Machine Learning and Modulation Recognition: Comments on “Convolutional Radio Modulation Recognition Networks” by T. O’Shea, J. Corgan, and T. Clancy

In this post I provide some comments on another paper I’ve seen on arxiv.org (I have also received copies of it through email) that relates to modulation classification and cyclostationary signal processing. The paper is by O’Shea et al and is called “Convolutional Radio Modulation Recognition Networks.” (The Literature [R138]) You can find it at this link.

## Cyclic Polyspectra

In this post we take a first look at the spectral parameters of higher-order cyclostationarity (HOCS). In previous posts, I have introduced the topic of HOCS and have looked at the temporal parameters, such as cyclic cumulants and cyclic moments. Those temporal parameters have proven useful in modulation classification and parameter estimation settings, and will likely be an important part of my ultimate radio-frequency scene analyzer.

The spectral parameters of HOCS have not proven to be as useful as the temporal parameters, unless you include the trivial case where the moment/cumulant order is equal to two. In that case, the spectral parameters reduce to the spectral correlation function, which is extremely useful in CSP (see the TDOA and signal-detection posts for examples).

## Comments on “Cyclostationary Correntropy: Definition and Application” by Fontes et al

I recently came across a published paper with the title Cyclostationary Correntropy: Definition and Application, by Aluisio Fontes et al. It is published in a journal called Expert Systems with Applications (Elsevier). Actually, it wasn’t the first time I’d seen this work by these authors. I had reviewed a similar paper in 2015 for a different journal.

I was surprised to see the paper published because I had a lot of criticisms of the original paper, and the other reviewers agreed since the paper was rejected. So I did my job, as did the other reviewers, and we tried to keep a flawed paper from entering the literature, where it would stay forever causing problems for readers.

The editor(s) of the journal Expert Systems with Applications did not ask me to review the paper, so I couldn’t give them the benefit of the work I already put into the manuscript, and apparently the editor(s) did not themselves see sufficient flaws in the paper to merit rejection.

It stings, of course, when you submit a paper that you think is good, and it is rejected. But it also stings when a paper you’ve carefully reviewed, and rejected, is published anyway.

Fortunately I have the CSP Blog, so I’m going on another rant. After all, I already did this the conventional rant-free way.

## Cyclostationarity of Digital QAM and PSK

Let’s look into the statistical properties of a class of textbook signals that encompasses digital quadrature amplitude modulation (QAM), phase-shift keying (PSK), and pulse-amplitude modulation (PAM). I’ll call the class simply digital QAM (DQAM), and all of its members have an analytical-signal mathematical representation of the form

$\displaystyle s(t) = \sum_{k=-\infty}^\infty a_k p(t - kT_0 - t_0) e^{i2\pi f_0 t + i \phi_0}. \hfill (1)$

In this model, $k$ is the symbol index, $1/T_0 = f_{sym}$ is the symbol rate, $f_0$ is the carrier frequency (sometimes called the frequency offset), $t_0$ is the symbol-clock phase, and $\phi_0$ is the carrier phase. The finite-energy function $p(t)$ is the pulse function (sometimes called the pulse-shaping function). Finally, the random variable $a_k$ is called the symbol, and has a discrete distribution that is called the constellation.

Model (1) is a textbook signal when the sequence of symbols is independent and identically distributed (IID). This condition rules out real-world communication aids such as periodically transmitted bursts of known symbols, adaptive modulation (where the constellation may change in response to the vagaries of the propagation channel), some forms of coding, etc. Also, when the pulse function $p(t)$ is a rectangle (with width $T_0$), the signal is even less realistic, and therefore more textbook.

We will look at the moments and cumulants of this general model in this post. Although the model is textbook, we could use it as a building block to form more realistic, less textbooky, signal models. Then we could find the cyclostationarity of those models by applying signal-processing transformation rules that define how the cumulants of the output of a signal processor relate to those for the input.

## Signal Processing Operations and CSP

It is often useful to know how a signal processing operation affects the probabilistic parameters of a random signal. For example, if I know the power spectral density (PSD) of some signal $x(t)$, and I filter it using a linear time-invariant transformation with impulse response function $h(t)$, producing the output $y(t)$, then what is the PSD of $y(t)$? This input-output relationship is well known and quite useful. The relationship is

$\displaystyle S_y^0(f) = \left| H(f) \right|^2 S_x^0(f). \hfill (1)$

In (1), the function $H(f)$ is the transfer function of the filter, which is the Fourier transform of the impulse-response function $h(t)$.

Because the mathematical models of real-world communication signals can be constructed by subjecting idealized textbook signals to various signal-processing operations, such as filtering, it is of interest to us here at the CSP Blog to know how the spectral correlation function of the output of a signal processor is related to the spectral correlation function for the input. Similarly, we’d like to know such input-output relationships for the cyclic cumulants and the cyclic polyspectra.

Another benefit of knowing these CSP input-output relationships is that they tend to build insight into the meaning of the probabilistic parameters. For example, in the PSD input-output relationship (1), we already know that the transfer function at $f = f_0$ scales the input frequency component at $f_0$ by the complex number $H(f_0)$. So it makes sense that the PSD at $f_0$ is scaled by the squared magnitude of $H(f_0)$. If the filter transfer function is zero at $f_0$, then the density of averaged power at $f_0$ should vanish too.

So, let’s look at this kind of relationship for CSP parameters. All of these results can be found, usually with more mathematical detail, in My Papers [6, 13].

## Square-Root Raised-Cosine PSK/QAM

Let’s look at a somewhat more realistic textbook signal: The PSK/QAM signal with independent and identically distributed symbols (IID) and a square-root raised-cosine (SRRC) pulse function. The SRRC pulse is used in many practical systems and in many theoretical and simulation studies. In this post, we’ll look at how the free parameter of the pulse function, called the roll-off parameter or excess bandwidth parameter, affects the power spectrum and the spectral correlation function.

## Conjugation Configurations

When we considered complex-valued signals and second-order statistics, we ended up with two kinds of parameters: non-conjugate and conjugate. So we have the non-conjugate autocorrelation, which is the expected value of the normal second-order lag product in which only one of the factors is conjugated (consistent with the normal definition of variance for complex-valued random variables),

$\displaystyle R_x(t, \boldsymbol{\tau}) = E \left[ x(t+\tau_1)x^*(t+\tau_2) \right] \hfill (1)$

and the conjugate autocorrelation, which is the expected value of the second-order lag product in which neither factor is conjugated

$\displaystyle R_{x^*}(t, \boldsymbol{\tau}) = E \left[ x(t+\tau_1)x(t+\tau_2) \right]. \hfill (2)$

The complex-valued Fourier-series amplitudes of these functions of time $t$ are the non-conjugate and conjugate cyclic autocorrelation functions, respectively.

The Fourier transforms of the non-conjugate and conjugate cyclic autocorrelation functions are the non-conjugate and conjugate spectral correlation functions, respectively.

I never explained why both the non-conjugate and conjugate functions are needed. In this post, I rectify that omission. The reason for the many different choices of conjugated factors in higher-order cyclic moments and cumulants is also provided.

## Cyclic Temporal Cumulants

In this post I continue the development of the theory of higher-order cyclostationarity (My Papers [5,6]) that I began here. It is largely taken from my doctoral work (download my dissertation here).

This is a long post. To make it worthwhile, I’ve placed some movies of cyclic-cumulant estimates at the end. Or just skip to the end now if you’re impatient!

In my work on cyclostationary signal processing (CSP), the most useful tools are those for estimating second-order statistics, such as the cyclic autocorrelation, spectral correlation function, and spectral coherence function. However, as we discussed in the post on Textbook Signals, there are some situations (perhaps only academic; see my question in the Textbook post) for which higher-order cyclostationarity is required. In particular, a probabilistic approach to blind modulation recognition for ideal (textbook) digital QAM, PSK, and CPM requires higher-order cyclostationarity because such signals have similar or identical spectral correlation functions and PSDs. (Other high-SNR non-probabilistic approaches can still work, such as blind constellation extraction.)

Recall that in the post introducing higher-order cyclostationarity, I mentioned that one encounters a bit of a puzzle when attempting to generalize experience with second-order cyclostationarity to higher orders. This is the puzzle of pure sine waves (My Papers [5]). Let’s look at pure and impure sine waves, and see how they lead to the probabilistic parameters widely known as cyclic cumulants.

## Introduction to Higher-Order Cyclostationarity

We’ve seen how to define second-order cyclostationarity in the time- and frequency-domains, and we’ve looked at ideal and estimated spectral correlation functions for a synthetic rectangular-pulse BPSK signal. In future posts, we’ll look at how to create simple spectral correlation estimators, but in this post I want to introduce the topic of higher-order cyclostationarity (HOCS).  This post is more conceptual in nature; for mathematical details about HOCS, see the posts on cyclic cumulants and cyclic polyspectra. Estimators of higher-order parameters, such as cyclic cumulants and cyclic moments, are discussed in this post.

To contrast with HOCS, we’ll refer to second-order parameters such as the cyclic autocorrelation and the spectral correlation function as parameters of second-order cyclostationarity (SOCS).

The first question we might ask is Why do we care about HOCS? And one answer is that SOCS does not provide all the statistical information about a signal that we might need to perform some signal-processing task. There are two main limitations of SOCS that drive us to HOCS.