The Methodological Pitfall of Dataset-Driven Research on Deep Learning: An IoT Example

Nov 1, 2022

Tianshi Wang

Denizhan Kara

Jinyang Li

Shengzhong Liu

Tarek Abdelzaher

Brian Jalaian

Abstract

In this paper, we highlight a dangerous pitfall in the state-of-the-art evaluation methodology of deep learning algorithms. It results in deceptively good evaluation outcomes on test datasets, whereas the underlying algorithms remain prone to catastrophic failure in practice. We illustrate the pitfall in the context of an Internet-of-Things (IoT) application example and show that it occurs despite the use of cross-validation that breaks down the data into separate training, validation, and testing sets. The pitfall is illustrated by designing two target detection and classification algorithms. One is based on a recently proposed neural network architecture for embedded AI, and the other is based on a traditional machine learning approach with domain-inspired feature engineering. The neural network approach outperforms the traditional one on test data. Yet, it fails in deployment. The mechanics behind the failure are explained and linked to the way the algorithms are trained. Suggestions are presented to avoid the pitfall. The paper is a “call to arms” to improve the evaluation methodology of machine learning algorithms for mission-critical systems.

Type

Conference paper

Publication

MILCOM 2022 - 2022 IEEE Military Communications Conference (MILCOM)

Last updated on Oct 19, 2024

Deep Learning, Uncertainty Quantification
