ICARUS: More on Attempts to Merge IQ Data with Extracted-Feature Data in Machine Learning

I’ve been working with some colleagues at Northeastern University (NEU) in Boston, MA, on ways to combine CSP with machine learning. The work I’m doing with Old Dominion University is focused on basic modulation recognition using neural networks and, in particular, the generalization (dataset-shift) problem that is pervasive in deep learning with convolutional neural networks. In contrast, the NEU work is focused on specific signal detection and classification problems and looks at how to use multiple disparate data types as inputs to neural networks: complex-valued samples (IQ data) as well as carefully selected components of spectral correlation and spectral coherence surfaces.

My NEU colleagues and I will be publishing a rather lengthy conference paper on a new multi-input-data neural-network approach called ICARUS at InfoCom 2023 this May (My Papers [53]). You can get a copy of the pre-publication version here or on arxiv.org.

The problem is to detect the presence of a low-level direct-sequence spread-spectrum signal that is cochannel with a conventional wideband signal such as LTE. Because the spread-spectrum signal is designed to be demodulatable even when the inband noise and interference power is tens or hundreds of times that of the signal, it can hide, effectively, underneath the innocuous conventional communication signal, and still be received and demodulated by its intended receiver with adequate bit-error performance.
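To make the "tens or hundreds of times" remark concrete, here is a minimal Python sketch of the processing-gain idea. It is not taken from the paper. It assumes a real-valued baseband DSSS BPSK signal, a random ±1 spreading code that repeats every bit, perfect synchronization, and white Gaussian noise standing in for the combined noise and interference (a real LTE interferer is not Gaussian). The function name `dsss_bpsk_ber` and all parameter values are illustrative choices.

```python
import numpy as np


def dsss_bpsk_ber(n_bits=2000, chips_per_bit=256, snr_db=-15.0, seed=0):
    """Return the bit-error rate of a toy DSSS BPSK link after despreading.

    Assumptions (illustrative only): real baseband signal, random +/-1
    spreading code repeated every bit, perfect chip/bit timing, and white
    Gaussian noise in place of real noise-plus-interference.
    """
    rng = np.random.default_rng(seed)
    bits = rng.integers(0, 2, n_bits)
    symbols = 2.0 * bits - 1.0                      # map {0, 1} -> {-1, +1}
    code = 2.0 * rng.integers(0, 2, chips_per_bit) - 1.0

    # Spread: each row is one bit multiplied by the code (unit chip power).
    tx = symbols[:, None] * code[None, :]

    # Noise power relative to unit signal power; -15 dB SNR means the noise
    # is about 32 times stronger than the signal.
    noise_power = 10.0 ** (-snr_db / 10.0)
    rx = tx + np.sqrt(noise_power) * rng.standard_normal(tx.shape)

    # Despread: correlate each bit interval with the code, then take the sign.
    decisions = (rx @ code) > 0
    return float(np.mean(decisions != (bits == 1)))


if __name__ == "__main__":
    print("BER with 256 chips/bit:", dsss_bpsk_ber(chips_per_bit=256))
    print("BER with no spreading: ", dsss_bpsk_ber(chips_per_bit=1))
```

Under these assumptions, correlating over 256 chips raises the signal-to-noise ratio of the decision statistic by a factor of 256, so a signal buried about 15 dB below the noise is still decoded with few errors, while the same signal without spreading is decoded only slightly better than a coin flip.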

The ultimate goal is the construction of a fusion pipeline, which is machine-learning terminology for a neural network that can take in disparate data types, such as IQ data and a cycle-frequency list. But it is also of interest to compare the performance of such a construction with marginal, or single-input-type, networks (pipelines). Therefore, four distinct pipelines (algorithms) are defined:

1. Signal-processing only. Here we use only CSP products and simple signal-processing algorithms to detect the presence of the anomaly.
2. Trained neural network using only IQ samples as input.
3. Trained neural network using only CSP products as input.
4. The fusion pipeline: a trained neural network that makes use of both IQ samples and CSP products.
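The idea of a fusion pipeline can be sketched in a few lines of Python. This is a generic late-fusion forward pass with random, untrained weights. It is not the ICARUS architecture, which is described in the paper. The names `init_params` and `fusion_forward`, the layer sizes, and the two-class output (anomaly absent or present) are all assumptions made for illustration.

```python
import numpy as np


def init_params(n_iq, n_csp, hidden=16, n_classes=2, seed=0):
    """Random (untrained) weights for a toy two-branch fusion network."""
    rng = np.random.default_rng(seed)
    scale = 0.1
    return {
        "W_iq": scale * rng.standard_normal((hidden, 2 * n_iq)),
        "b_iq": np.zeros(hidden),
        "W_csp": scale * rng.standard_normal((hidden, n_csp)),
        "b_csp": np.zeros(hidden),
        "W_out": scale * rng.standard_normal((n_classes, 2 * hidden)),
        "b_out": np.zeros(n_classes),
    }


def fusion_forward(iq, csp_features, params):
    """Embed each input type separately, concatenate, then classify."""
    # Complex IQ samples become a real vector: real parts, then imaginary parts.
    iq_real = np.concatenate([iq.real, iq.imag])

    # One small ReLU layer per input type (the two "marginal" branches).
    h_iq = np.maximum(params["W_iq"] @ iq_real + params["b_iq"], 0.0)
    h_csp = np.maximum(params["W_csp"] @ csp_features + params["b_csp"], 0.0)

    # Fusion step: concatenate the branch embeddings into one feature vector.
    fused = np.concatenate([h_iq, h_csp])

    # Linear classifier followed by a numerically stable softmax.
    logits = params["W_out"] @ fused + params["b_out"]
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    iq = rng.standard_normal(64) + 1j * rng.standard_normal(64)  # stand-in IQ block
    csp = rng.standard_normal(8)          # stand-in CSP feature vector
    params = init_params(n_iq=64, n_csp=8)
    print(fusion_forward(iq, csp, params))  # class probabilities
```

Dropping one of the two branches from `fused` gives the corresponding single-input pipeline, which is the sense in which the IQ-only and CSP-only networks are marginal versions of the fusion network.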

The detection performance is summarized in the paper’s Table II, which is reproduced here as Figure 1.

Figure 1. Summary of anomaly (DSSS) detection performance for each pipeline (algorithm or method) considered in My Papers [53].

A couple of things to note, briefly, because I want you to go look at the full paper.

The first thing is that CSP doesn’t work so well compared to the pipeline that uses only IQ data (where have we seen that before?). This is similar to other published results that don’t take into account the highly brittle nature of networks trained on IQ data: they don’t generalize. But we didn’t try to include generalization here in My Papers [53]; we’ll get to that eventually, though. (In other words, we don’t report “trained on Synthetic, tested on OTA-Cellular,” or the like.)

The second thing is that performance generally is quite dependent on the particular kind of LTE signal involved in the training and performance evaluation. We have three basic types of LTE data: synthetic (MATLAB), PAWR (captured), and wild/cellular (captured). The synthetic data is from the MATLAB LTE toolbox, the PAWR data is from an NEU testbed that uses srsLTE, and the cellular data is captured using an Ettus SDR pointed at various cellular bands in Monterey, CA. The spectral signature of the srsLTE signals is different from the other two, and it produces a huge number of cycle frequencies, overwhelming the signal processor’s ability to sort out the DSSS cycle frequencies from the LTE cycle frequencies. But this is quite helpful information because it goes, yet again, to generalization. You can’t just say “LTE” and mean much by it. Even though each of the three types of LTE signals might be demodulatable by the very same LTE receiver, they are still different in other important ways.

Finally, it is satisfying, and a testament to the ingenuity of my co-authors, that the fusion network always performs best.

Pipelines 2-4 can also perform modulation recognition, in the sense of specifying whether the detected DSSS signal is DSSS BPSK or DSSS QPSK. The high-level results are summarized in the paper’s Figure 7, which is reproduced here as Figure 2.

Figure 2. Modulation recognition performance for three of the four pipelines (algorithms or methods) considered in My Papers [53].

There are several strange trends in the bar graphs of Figure 2, and I don’t have good explanations of them. I believe that this means we still have a lot of work to do to really solve the stated problem. But the paper represents a solid step forward.

A good chunk of the paper is devoted to the problem of choosing one of these pipelines under various constraints on inference time and computational cost. This kind of guidance will be increasingly important as, over time, we improve the performance of all the pipelines, and perhaps introduce new ones with different performance/cost/latency tradeoffs.
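As a rough illustration of what choosing a pipeline under constraints can look like, here is a small Python sketch that picks the best-scoring option that fits within a latency budget and a compute budget. The selection rule, the `PipelineProfile` fields, and every number in `EXAMPLE_PROFILES` are made-up placeholders with generic names; none of them come from the paper, whose actual selection procedure and measurements should be read there.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class PipelineProfile:
    name: str
    detection_score: float   # higher is better (placeholder metric)
    inference_ms: float      # time per decision (placeholder units)
    relative_cost: float     # compute cost relative to the cheapest option


def choose_pipeline(
    profiles: Sequence[PipelineProfile],
    max_inference_ms: float,
    max_relative_cost: float,
) -> Optional[PipelineProfile]:
    """Return the highest-scoring pipeline that meets both budgets, or None."""
    feasible = [
        p for p in profiles
        if p.inference_ms <= max_inference_ms
        and p.relative_cost <= max_relative_cost
    ]
    if not feasible:
        return None
    return max(feasible, key=lambda p: p.detection_score)


# Placeholder values for illustration only; NOT results from the paper.
EXAMPLE_PROFILES = [
    PipelineProfile("pipeline_a", detection_score=0.70, inference_ms=1.0, relative_cost=1.0),
    PipelineProfile("pipeline_b", detection_score=0.85, inference_ms=5.0, relative_cost=4.0),
    PipelineProfile("pipeline_c", detection_score=0.95, inference_ms=20.0, relative_cost=10.0),
]

if __name__ == "__main__":
    choice = choose_pipeline(EXAMPLE_PROFILES, max_inference_ms=10.0, max_relative_cost=5.0)
    print(choice.name if choice else "no pipeline fits the budget")
```

A filter-then-maximize rule like this is the simplest possible choice; it ignores how close a pipeline is to a budget and treats the two constraints as hard limits.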

Go take a look, and maybe I’ll see you at InfoCom!

Author: Chad Spooner

I'm a signal processing researcher specializing in cyclostationary signal processing (CSP) for communication signals. I hope to use this blog to help others with their cyclo-projects and to learn more about how CSP is being used and extended worldwide.
