Update September 2023: A randomization flaw has been found in CSPB.ML.2018 and fixed, resulting in CSPB.ML.2018R2. Use CSPB.ML.2018R2 going forward.
Update February 2023: A third dataset has been posted here. This new dataset, CSPB.ML.2023, features cochannel signals.
Update April 2022: I’ve also posted a second dataset here. This new dataset is similar to the original ML Challenge dataset except the random variable representing the carrier frequency offset has a slightly different distribution.
If you refer to any of the posted datasets in a published paper, please use the following designators, which I am also using in papers I’m attempting to publish:
Original ML Challenge Dataset: CSPB.ML.2018.
Shifted ML Challenge Dataset: CSPB.ML.2022.
Cochannel ML Dataset: CSPB.ML.2023.
Update February 2019: I’ve decided to post the data set I discuss here to the CSP Blog for all interested parties to use. See the new post on the Data Set. If you do use it, please let me and the CSP Blog readers know how you fared with your experiments in the Comments section of either post. Thanks!