TY - JOUR
T1 - Fund transfer fraud detection
T2 - Analyzing irregular transactions and customer relationships with self-attention and graph neural networks
AU - Shih, Yi Cheng
AU - Dai, Tian Shyr
AU - Chen, Ying Ping
AU - Ti, Yen Wu
AU - Wang, Wun Hao
AU - Kuo, Yun
N1 - Publisher Copyright:
© 2024 Elsevier Ltd
PY - 2025/1/1
Y1 - 2025/1/1
N2 - This paper presents a method for identifying fraudulent fund transfers using real bank data, analyzing customer information, transactional activities, and customer relationships. The preprocessing step transforms high-dimensional, irregular transaction time series into regular time series, which are then compressed into a latent space using a self-attention-based autoencoder. To address the scarcity of fraudulent samples and mitigate training issues caused by data imbalance, deep generative models, including the conditional variational autoencoder and the Wasserstein generative adversarial network, are applied to generate additional fraudulent raw data and to augment fraud samples in the latent space. The reparameterization trick is integrated into the encoder–decoder structure to boost the model's generative capability. Additionally, a graph neural network (GNN) is used to model customer relationships. The proposed approach uses end-to-end learning, integrating the autoencoder's reconstruction loss, the KL divergence loss (when the reparameterization trick is applied), and the classification loss for fraud detection. To optimize computational resources, neighborhood sampling for the GNN is combined with mini-batch training for the autoencoder, improving both training efficiency and model reliability. Comprehensive experiments demonstrate the effectiveness of the proposed fusion network and highlight the importance of each component and preprocessing step. For example, the area under the precision–recall curve for fraud detection improves notably with our model: for suspicious transactions flagged by the bank's rules, other models range from 0.66% to 22.15%, while our model reaches 27%; for non-suspicious transactions, other models range from 2.53% to 22.00%, while our model achieves 22.90%. The model also has potential for wider applications in anomaly detection, particularly in datasets with irregular time series and complex customer relationships.
AB - This paper presents a method for identifying fraudulent fund transfers using real bank data, analyzing customer information, transactional activities, and customer relationships. The preprocessing step transforms high-dimensional, irregular transaction time series into regular time series, which are then compressed into a latent space using a self-attention-based autoencoder. To address the scarcity of fraudulent samples and mitigate training issues caused by data imbalance, deep generative models, including the conditional variational autoencoder and the Wasserstein generative adversarial network, are applied to generate additional fraudulent raw data and to augment fraud samples in the latent space. The reparameterization trick is integrated into the encoder–decoder structure to boost the model's generative capability. Additionally, a graph neural network (GNN) is used to model customer relationships. The proposed approach uses end-to-end learning, integrating the autoencoder's reconstruction loss, the KL divergence loss (when the reparameterization trick is applied), and the classification loss for fraud detection. To optimize computational resources, neighborhood sampling for the GNN is combined with mini-batch training for the autoencoder, improving both training efficiency and model reliability. Comprehensive experiments demonstrate the effectiveness of the proposed fusion network and highlight the importance of each component and preprocessing step. For example, the area under the precision–recall curve for fraud detection improves notably with our model: for suspicious transactions flagged by the bank's rules, other models range from 0.66% to 22.15%, while our model reaches 27%; for non-suspicious transactions, other models range from 2.53% to 22.00%, while our model achieves 22.90%. The model also has potential for wider applications in anomaly detection, particularly in datasets with irregular time series and complex customer relationships.
KW - Fraud detection
KW - Graph neural networks
KW - Irregular time series
KW - Self-attention mechanism
KW - Variational autoencoder
UR - http://www.scopus.com/inward/record.url?scp=85202930924&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2024.125211
DO - 10.1016/j.eswa.2024.125211
M3 - Article
AN - SCOPUS:85202930924
SN - 0957-4174
VL - 259
JO - Expert Systems with Applications
JF - Expert Systems with Applications
M1 - 125211
ER -