Customs Risk Assessment Based on Unsupervised Anomaly Detection Using Autoencoders

conference paper
In this paper we describe our initial findings on a method for improving anomaly detection on a dataset with scarce labels, based on an ongoing use case with the Belgian Customs Administration (BCA). Data on shipping containers is used to predict the level of risk associated with a shipment, as well as the probability that the shipment is fraudulent. The absence of labeled data prevents the use of supervised machine learning techniques and calls for unsupervised analysis. We employ an autoencoder to learn the distribution of the dataset and detect anomalies, under the assumption that only a small fraction of all shipments is fraudulent. The absence of labels in the dataset also complicates the evaluation of the autoencoder’s performance. A qualitative approach is therefore taken to assess the properties of the detected anomalies. The variable distributions of the anomalies differ significantly from the variable distributions in the complete dataset, and the anomalies are marked as interesting by domain experts. To obtain an impression of the quantitative performance in the absence of ground truth, synthetic data is generated using a variational autoencoder. The preliminary qualitative and quantitative results suggest that autoencoders can provide value for customs risk assessment.
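The idea summarized in the abstract, scoring each record by how poorly an autoencoder reconstructs it, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy data is randomly generated and unrelated to the BCA dataset, the function names (`make_toy_data`, `fit_linear_autoencoder`, `anomaly_scores`, `flag_top_fraction`) are hypothetical, and the model is a deliberately small linear autoencoder trained with plain gradient descent in NumPy. This record does not specify the architecture used in the paper.

```python
import numpy as np


def make_toy_data(n_normal=980, n_anomalous=20, seed=0):
    """Synthetic stand-in data (an assumption, not customs data)."""
    rng = np.random.default_rng(seed)
    # "Normal" rows lie close to a 2-dimensional subspace of an 8-dimensional space.
    mixing = rng.normal(size=(2, 8))
    normal = rng.normal(size=(n_normal, 2)) @ mixing
    normal += 0.1 * rng.normal(size=(n_normal, 8))
    # "Anomalous" rows ignore that structure entirely.
    anomalous = rng.normal(scale=3.0, size=(n_anomalous, 8))
    X = np.vstack([normal, anomalous])
    is_anomaly = np.r_[np.zeros(n_normal, dtype=bool), np.ones(n_anomalous, dtype=bool)]
    # Standardize each feature so that no single column dominates the loss.
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    return X, is_anomaly


def fit_linear_autoencoder(X, n_hidden=2, lr=0.01, epochs=1000, seed=0):
    """Fit x -> (x @ W_enc) @ W_dec by gradient descent on the mean squared error."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W_enc = rng.normal(scale=0.1, size=(d, n_hidden))
    W_dec = rng.normal(scale=0.1, size=(n_hidden, d))
    for _ in range(epochs):
        H = X @ W_enc                      # bottleneck codes
        R = H @ W_dec - X                  # reconstruction residuals
        grad_dec = (2.0 / n) * (H.T @ R)
        grad_enc = (2.0 / n) * (X.T @ (R @ W_dec.T))
        W_enc -= lr * grad_enc
        W_dec -= lr * grad_dec
    return W_enc, W_dec


def anomaly_scores(X, W_enc, W_dec):
    """Squared reconstruction error per row; larger means more anomalous."""
    return np.sum((X @ W_enc @ W_dec - X) ** 2, axis=1)


def flag_top_fraction(scores, fraction=0.02):
    """Boolean mask marking the `fraction` of rows with the highest scores."""
    k = max(1, int(round(fraction * len(scores))))
    mask = np.zeros(len(scores), dtype=bool)
    mask[np.argsort(scores)[-k:]] = True
    return mask


if __name__ == "__main__":
    X, is_anomaly = make_toy_data()
    W_enc, W_dec = fit_linear_autoencoder(X)
    scores = anomaly_scores(X, W_enc, W_dec)
    flagged = flag_top_fraction(scores, fraction=0.02)
    print("flagged rows:", int(flagged.sum()))
    print("injected anomalies among flagged:", int((flagged & is_anomaly).sum()))
```

The model is trained on all rows, anomalies included, which mirrors the unsupervised setting: because anomalies are assumed to be rare, the autoencoder mostly learns the structure of ordinary records and reconstructs unusual ones poorly. The flagged fraction is a free parameter here; in practice it would depend on how many shipments can actually be inspected.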
TNO Identifier
962742
ISSN
2367-3370
ISBN
9783030821920
Publisher
Springer Science and Business Media Deutschland GmbH
Source title
Lecture Notes in Networks and Systems, Intelligent Systems Conference, IntelliSys 2021, 2 September 2021 through 3 September 2021
Editor(s)
Arai, K.
Pages
668-681
Files
To receive the publication files, please send an e-mail request to TNO Repository.