I’ve had to update the original Challenge for the Machine Learners post, and the associated dataset post, a couple times due to flaws in my metadata (truth) files. Those were fairly minor, so I just updated the original posts.
But a new flaw in CSPB.ML.2018 and CSPB.ML.2022 has come to light due to the work of the estimable research engineers at Expedition Technology. The problem is not with labeling or the fundamental correctness of the modulation types, pulse functions, etc., but with the way a random-number generator was applied in my multi-threaded dataset-generation technique.
I’ll explain after the fold, and this post will provide links to an updated version of the dataset, CSPB.ML.2018R2. I’ll keep the original up for continuity and also place a link to this post there. Moreover, the descriptions of the truth files over at CSPB.ML.2018 are still valid–the truth file posted here has the same format as the truth files available on the CSPB.ML.2018 and CSPB.ML.2022 posts.
