Watch Out!

“Hear my words and bear witness to my vow:

Night gathers, and now my watch begins. It shall not end until my death. I am the fire that burns against the cold, the light that brings the dawn, the horn that wakes the sleepers, the shield that guards the realms of men.”

Night’s Watch Oath, Game of Thrones, by G. R. R. Martin

Due to my pretension to academic worthiness, I have Google Scholar alert me to all my new citations. That is, citations to My Papers. I got an anodyne alert the other day and, as usual, gave it a quick once-over. Anything new or interesting? Any new signal-processing twist or a machine-learning breakthrough, finally smashing the last vestiges of the old order? I’m referring here to The Literature [R205].

Well … no. But some modern AI-related weirdness is there, and it is a concerning variety of weirdness for researchers who attempt to learn from published technical work, and especially for those who attempt to use the references in a published technical paper to dig a little deeper toward foundational material. Let’s take a look.

[R205] is called “Covert Routing with DSSS Signaling Against Cycle Detectors,” and it is concerned with a multi-hop message-transmission scheme using multiple pairs of transmitters and receivers and wireless transmissions. The physical-layer signals are direct-sequence spread-spectrum BPSK signals, with which we are very familiar.

The idea is that a communicator named Alice wants to send a message to a communicator named Bob (see Figure 1), but there is an interceptor named Willie who is trying to detect their transmission (for presumably nefarious reasons). Willie can use an energy detector or what the authors call a cycle detector. I was in on the ground floor of what are commonly called cycle detectors–see the early CSP Blog post on cycle detectors here.

Figure 1. Reproduction of Figure 1 from The Literature [R205].

When the authors get around to defining their cycle detector mathematically, what we see is reproduced in Figure 2.

Figure 2. Definition of cycle detector in The Literature [R205]. Notice that the finite-time spectrum is a function of t, but the spectral correlation estimate is not. More importantly, the spectral correlation estimate is the cyclic periodogram, which does not converge to the spectral correlation function as T_0 increases.

Their reference [11] is my paper My Papers [1], “Signal Interception: Performance Advantages of Cyclic-Feature Detectors,” which is indeed a paper about cycle detectors. Unfortunately for [R205], the “Degree of Cyclostationarity” phrase and definition never appear in My Papers [1]. The authors supply, in Figure 2, an estimate of the spectral correlation function for y(t) that is actually a scaled version of the non-conjugate cyclic periodogram, but they end up integrating its squared magnitude, which doesn’t make sense and won’t work; see below. The notation is a bit sloppy, as the sum over cycle frequency \alpha should explicitly say that only those cycle frequencies exhibited by the signal should be included in the sum.

More disturbing, and more to the point about possible AI misuse, is the reference in Figure 2 to [15] as relating to, or containing a definition of, the finite-time complex spectrum. But [15] is a half-page book review, written by Enders Robinson about Gardner’s book The Literature [R1]. You can see the full review in Figure 3. Note that the citation [15] simply says:

W. A. Gardner and E. A. Robinson, “Statistical Spectral Analysis–A Nonprobabilistic Theory,” 1989.

Figure 3. A review of The Literature [R1] (Gardner’s Statistical Spectral Analysis book) by Enders Robinson. This is reference [15] in The Literature [R205]. Say what?

Note that The Literature [R1] has that same title, but was published in 1987. The Robinson review came out in 1989.

To be clear, the relevant part of the reference list in R205 is shown in Figure 4.

Figure 4. The part of the reference list in R205 that shows the citations under study here: [11], [15], and [16].

It is nice to be cited. But the citations here are essentially dead ends. If someone went looking for the exposition on degree of cyclostationarity in [11] or the finite-time complex spectrum, they would be stymied. Perhaps they would just stop that part of their investigation.

The most pressing question is: How can this happen? There is a paper by Gardner and Zivanovic called “Degrees of Cyclostationarity and Their Application to Signal Detection and Estimation” written in 1991 (The Literature [R206]). That’s actually the only one I know of that defines and discusses DCS. My CSP colleagues and I don’t use it. The spectral coherence is a much better detection statistic. So how can these researchers, who reference their own work that also seems to relate to CSP (their [3] and [5], for example), mangle all these references and citations? Why does their [15] just have authors, title, and year? Where was it published?

Moving on, as the authors gear up for presentation of numerical results, they now reference their [16] (My Papers [17]) in support of DCS, as shown in Figure 5. My Papers [17] never mentions DCS, you will not be surprised to learn.

Figure 5. Now the citation for understanding the degree of cyclostationarity quantity has shifted to [16], or My Papers [17], a paper I wrote in 1988. I never mention DCS in this paper.

These errors have a familiar flavor to me. I’ve tried to use a large language model (LLM) a couple of times when searching for published papers relevant to some technical topic. The citations can be wrong. I believe that is what is happening here. The authors either didn’t use LLMs and are terrible at scholarship or they did, and they are terrible at scholarship.

So watch out! Researchers may now be citing your papers for bad reasons and causing other researchers to miss out on your work because they encounter dead ends in the reference list.

On to Some Technical Concerns

The AI/citation weirdness got me looking too closely at this paper and I dove down the rabbit hole a bit. It ain’t pretty.

Notice that the DCS definition (their (2) in my Figure 2 above) uses the theoretical spectral correlation function for some infinitely long signal or a random process signal model. If we want to convert that to an estimate of the DCS, we have to replace the theoretical spectral correlation function with an estimate. Further, if we want the DCS estimate to converge to its theoretical (limit) value, a sufficient condition is that the spectral correlation estimates themselves converge to the theoretical spectral correlation function values. I say “sufficient” because it isn’t clear that it is necessary; you’d have to prove that. For example, if the spectral correlation estimates converge to the theoretical spectral correlation values except for a countable number of values of frequency f, the integrals in (2) will be unaffected.

But we can also notice that for some spectral correlation estimators, the estimated DCS cannot converge to its theoretical value. Furthermore, the DCS will not be useful in such cases. One of those cases is, in fact, the estimator used by the authors–the cyclic periodogram.

Consider the case where the data y(t) is only stationary noise, such as white Gaussian noise. Then the non-conjugate spectral correlation function for y(t) is zero for all (f, \alpha) except \alpha = 0. Therefore, as desired, the DCS for stationary noise will be zero, as the numerator of (2) is zero and the denominator is not zero.

When we substitute the cyclic periodogram for the theoretical spectral correlation in (2), however, we see that the integrands in the numerator do not get small as the block length increases. This is because the cyclic periodogram (like the periodogram) never converges to anything as the block length increases without bound. The integral is over the squared magnitude of this erratic function, and so cannot converge to zero–the integral must add up a large number of positive numbers. So the DCS for stationary noise with the cyclic periodogram used as the spectral correlation estimate will NOT converge to the theoretical value of zero. Similarly, the DCS for some cyclostationary signal using the cyclic periodogram will NOT converge to its theoretical value.
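A quick way to see this numerically is the sketch below. It is an illustration under stated assumptions, not the code behind the figures in this post. It uses complex white Gaussian noise with unit power, an arbitrary cycle frequency of 0.25 (chosen so the frequency shifts land exactly on FFT bins), and a simple circular moving average as the frequency smoother. The function names are made up for the example. It prints the frequency-averaged squared magnitude of the cyclic periodogram, with and without smoothing, for a few block lengths.

```python
import numpy as np


def cyclic_periodogram(x, shift_bins):
    """Cyclic periodogram of x at alpha = 2 * shift_bins / len(x)."""
    n = x.size
    spectrum = np.fft.fft(x)
    upper = np.roll(spectrum, -shift_bins)  # X(f + alpha/2)
    lower = np.roll(spectrum, shift_bins)   # X(f - alpha/2)
    return upper * np.conj(lower) / n


def smooth(values, width_bins):
    """Circular moving average over width_bins frequency bins."""
    kernel = np.zeros(values.size)
    kernel[:width_bins] = 1.0 / width_bins
    return np.fft.ifft(np.fft.fft(values) * np.fft.fft(kernel))


rng = np.random.default_rng(0)
alpha = 0.25      # assumed test cycle frequency; noise has no feature here
delta_f = 0.02    # smoothing-window width, as a fraction of the sampling rate

results = []
for n in (4096, 16384, 65536):
    # Unit-power complex white Gaussian noise (an assumption for simplicity).
    noise = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
    raw = cyclic_periodogram(noise, int(alpha * n / 2))
    smoothed = smooth(raw, int(round(delta_f * n)))
    raw_power = np.mean(np.abs(raw) ** 2)
    smoothed_power = np.mean(np.abs(smoothed) ** 2)
    results.append((n, raw_power, smoothed_power))
    print(f"N={n:6d}  unsmoothed={raw_power:.4f}  smoothed={smoothed_power:.6f}")
```

For unit-power noise the unsmoothed average should hover near one at every block length, while the smoothed average should fall roughly in proportion to one over the number of bins in the smoothing window.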

Let’s test it out. We select a DSSS BPSK signal (as do the authors) with processing gain 127, chip rate 0.5 (normalized frequencies used here), and square-root raised-cosine pulse shaping with excess bandwidth (roll-off) of 1.0. This implies non-conjugate cycle frequencies of k(1/2)/127, which includes the chip rate 0.5 for k=127.
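The cycle-frequency arithmetic can be written out as a short sketch. The variable names are hypothetical, and restricting k to 1 through 127 is an assumption made only to show that the chip rate appears at k = 127; the signal may exhibit harmonics outside that range.

```python
import numpy as np

chip_rate = 0.5          # normalized chip rate from the text
processing_gain = 127    # chips per bit

# Fundamental cycle frequency: the code-repetition rate, (1/2)/127.
code_rate = chip_rate / processing_gain

# Assumed range of harmonics for illustration: k = 1, ..., 127.
k = np.arange(1, processing_gain + 1)
cycle_freqs = k * code_rate

print(f"fundamental = {code_rate:.6f}")
print(f"k = 127 gives {cycle_freqs[-1]:.6f} (the chip rate)")
```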

We estimate the DCS using two spectral correlation estimators. The first is the authors’ cyclic periodogram and the second is the frequency-smoothed cyclic periodogram with smoothing-window width of 0.02 (2 percent of the sampling rate of 1).
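To make the two estimators concrete, here is a minimal sketch of a DCS estimate. It is not the code used for the figures in this post, and several things in it are assumptions: the DCS is taken to be the sum over nonzero cycle frequencies of the integrated squared magnitude of the spectral correlation estimate, divided by the same integral at \alpha = 0; the spreading code is a random ±1 sequence; and rectangular chip pulses stand in for the square-root raised-cosine pulses. The function names are hypothetical. Setting delta_f to zero gives the unsmoothed cyclic periodogram.

```python
import numpy as np


def spectral_correlation(x, alpha, delta_f):
    """Non-conjugate cyclic periodogram at alpha, smoothed over delta_f."""
    n = x.size
    t = np.arange(n)
    upper = np.fft.fft(x * np.exp(-1j * np.pi * alpha * t))  # X(f + alpha/2)
    lower = np.fft.fft(x * np.exp(+1j * np.pi * alpha * t))  # X(f - alpha/2)
    cp = upper * np.conj(lower) / n
    width = int(round(delta_f * n))
    if width < 2:
        return cp  # delta_f = 0: plain cyclic periodogram
    kernel = np.zeros(n)
    kernel[:width] = 1.0 / width
    # Circular moving average across frequency.
    return np.fft.ifft(np.fft.fft(cp) * np.fft.fft(kernel))


def dcs_estimate(x, cycle_freqs, delta_f):
    """Assumed DCS form: sum of |S^alpha|^2 integrals over the alpha=0 one."""
    num = sum(
        np.mean(np.abs(spectral_correlation(x, a, delta_f)) ** 2)
        for a in cycle_freqs
    )
    den = np.mean(np.abs(spectral_correlation(x, 0.0, delta_f)) ** 2)
    return num / den


rng = np.random.default_rng(1)
n = 8192
gain, samples_per_chip = 127, 2          # chip rate 0.5
code = rng.choice([-1.0, 1.0], size=gain)  # hypothetical spreading code
bits = rng.choice([-1.0, 1.0], size=n // (gain * samples_per_chip) + 1)
chips = np.kron(bits, code)              # each bit times the whole code
signal = np.repeat(chips, samples_per_chip)[:n]  # rectangular chip pulses
noise = rng.standard_normal(n)           # unit-power real white noise

cycle_freqs = np.arange(1, gain + 1) * 0.5 / gain
dcs = {}
for label, data in (("signal+noise", signal + noise), ("noise", noise)):
    for delta_f in (0.0, 0.02):
        dcs[(label, delta_f)] = dcs_estimate(data, cycle_freqs, delta_f)
        print(f"{label:13s} delta_f={delta_f:.2f}  DCS={dcs[(label, delta_f)]:.3f}")
```

Because of the simplified pulse and code, the printed values should not be expected to match Figure 7. The point of the sketch is only the structure of the estimate and the role of the smoothing width.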

The signal and noise both have unit power, which means that the inband SNR is 0 dB since the occupied bandwidth of the signal is approximately equal to the sampling rate.
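The inband SNR claim follows from a few lines of arithmetic, sketched below. The sketch assumes the usual two-sided occupied bandwidth of a raised-cosine-type pulse, (1 + roll-off) times the chip rate, and white noise spread evenly across the sampling bandwidth. The variable names are illustrative.

```python
import math

sampling_rate = 1.0
chip_rate = 0.5
rolloff = 1.0
signal_power = 1.0
noise_power = 1.0   # total noise power across the sampling bandwidth

# Two-sided occupied bandwidth of the pulse-shaped signal.
occupied_bw = (1.0 + rolloff) * chip_rate

# Portion of the white-noise power that falls inside the signal band.
inband_noise = noise_power * min(occupied_bw / sampling_rate, 1.0)

inband_snr_db = 10.0 * math.log10(signal_power / inband_noise)
print(f"occupied bandwidth = {occupied_bw}, inband SNR = {inband_snr_db} dB")
```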

We’ll consider estimator block lengths (in samples) that are powers of two starting with 8192 and ending with 4194304. A typical result is shown in Figure 6. The authors’ estimator corresponds to \Delta{f} = 0, and my frequency-smoothing estimate corresponds to \Delta{f} = 0.02.
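For reference, the block lengths and the matching smoothing-window widths in FFT bins can be tabulated with a short sketch. Rounding the window width to the nearest bin is an assumption of the example, not something stated above.

```python
import numpy as np

delta_f = 0.02  # smoothing-window width as a fraction of the sampling rate

# Powers of two from 8192 (2**13) through 4194304 (2**22).
block_lengths = 2 ** np.arange(13, 23)

# Assumed rounding: nearest whole number of frequency bins.
window_bins = np.round(delta_f * block_lengths).astype(int)

for n, m in zip(block_lengths, window_bins):
    print(f"N = {n:8d}  smoothing window = {m:6d} bins")
```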

Figure 6. DCS estimates for the noisy signal and for the noise using a block length of 262144 samples. When the spectral resolution \Delta{f} is zero, the DCS estimate corresponds to the authors’ estimate.

The DCS estimates for all block lengths are shown in Figure 7. The DCS for the frequency-smoothing method works as expected. As the block length increases, the DCS of the noise-only signal converges to zero as noise is not cyclostationary. The DCS for the noisy signal approaches a constant relating to the particular integrals of the squared spectral correlation functions, which relate to the signal type, spreading code, signal power, and noise power.

Figure 7. DCS estimates for a noisy DSSS signal and for noise only. The green lines use the frequency-smoothing method to estimate the spectral correlation functions in the DCS estimate whereas the red lines use the authors’ cyclic periodogram estimator.

On the other hand, the DCS for the cyclic periodogram estimator does not behave as desired. The DCS for noise does not converge to zero. Moreover, the DCS for the noisy signal is smaller than the DCS for the noise only.

Finally, when the authors do their numerical example, they specify the data length for their cyclic periodogram, as in Figure 8.

Figure 8. Numerical values for simulation parameters. Notably, the data-block length is 100 million bits. At a minimum of two samples per bit (the signal has excess bandwidth parameter of 1), that’s a block size of 200 million samples!

Earlier in the paper, the symbol T_0 is used as the size of the cyclic periodogram data block, and in Section III.B it is defined as T_0 = M T_{\nu_b}, where T_{\nu_b} is the bit duration. So they want to compute the cyclic periodogram for a block length of 200 million samples, minimum. I remain skeptical that they actually did that. I tried to check their code as a footnote says:
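The scale of that block is easy to put in numbers. The sketch below uses the 100 million bits and the two-samples-per-bit minimum from the discussion above; storing the samples as double-precision complex values (16 bytes each) is an assumption made only to get a rough memory figure.

```python
bits_per_block = 100_000_000   # M, from the paper's parameter table
min_samples_per_bit = 2        # minimum quoted above

samples = bits_per_block * min_samples_per_bit

# Assumption: one complex128 value (16 bytes) per sample.
bytes_per_sample = 16
gigabytes = samples * bytes_per_sample / 1e9

print(f"{samples:,} samples, about {gigabytes:.1f} GB for a single copy of the block")
```

An FFT of that length typically needs at least one such array in memory, and more if the transform is not done in place.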

The simulation code is available at https://github.com/swapnil-saha/Covert-Routing-with-DSSS-Signaling-Against-Cycle-Detectors.git

and that repository exists, but it just says that the code is coming soon.

I’ll watch out for it.

Author: Chad Spooner

I'm a signal processing researcher specializing in cyclostationary signal processing (CSP) for communication signals. I hope to use this blog to help others with their cyclo-projects and to learn more about how CSP is being used and extended worldwide.

One thought on “Watch Out!”

  1. Great article Chad,

    AI use in research has made an existing problem worse: sloppy research. I have found LLMs very useful for helping me to identify research papers or new topics, but it is like sending elementary students into the library to find relevant research articles: a large portion of what they bring back is not relevant.

    What concerns me most is that students are treating an LLM as a primary source. This, combined with the ease with which modern LLMs allow one to generate slop papers, will make the situation much worse.

    The average quality of published papers diminishes as the cost of producing a paper drops, and LLMs have accelerated this process.
