Enhancing Cyber Threat Intelligence with Named Entity Recognition Using BERT-CRF

Sheng Shan Chen, Ren Hung Hwang*, Chin Yu Sun, Ying Dar Lin, Tun Wen Pai*

*Corresponding author for this work

Research output: Conference contribution, peer-reviewed

2 Citations (Scopus)

Abstract

Cyber Threat Intelligence (CTI) helps organizations understand the tactics, techniques, and procedures used by potential cyber criminals to defend against cyber threats. To protect the core systems and services of organizations, security analysts must analyze information about threats and vulnerabilities. However, analyzing large amounts of data requires significant time and effort. To streamline this process, we propose an enhanced architecture, BERT-CRF, by removing the BiLSTM layer from the conventional BERT-BiLSTM-CRF model. This model leverages the strengths of deep learning-based language models to extract critical threat intelligence and novel information from threats effectively. In our BERT-CRF model, the token embeddings generated by BERT are directly fed into the Conditional Random Field (CRF) layer for efficient Named Entity Recognition (NER), thus eliminating the need for an intermediate BiLSTM layer. We train and evaluate the model with three publicly available threat entity databases. We also collect open-source threat intelligence data from recent years for evaluating the applicability of the constructed model in a real-world environment. Furthermore, we compare our model with the most popular GPT-3.5 and the most downloaded open-source BERT question-and-answer models. Through this study, our proposed model demonstrated robust usability and outperformed other models, signifying its potential for application in CTI. In a real-world scenario, our model achieved an accuracy of 82.64%, while with malware-specific threat intelligence data, it achieved an impressive accuracy of 93.95%. The code for this research is publicly available at https://github.com/stwater20/ner-bert-crf-open-version.
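
As an illustration of the CRF step described in the abstract, the following is a minimal sketch of Viterbi decoding over per-token tag scores. It is not the authors' implementation (that is in the repository linked above). It assumes the emission scores have already been produced, for example by a linear layer over BERT token embeddings; the tag set, the example sentence, and every number below are hypothetical and chosen only to show how transition scores can correct a tag sequence that per-token argmax would get wrong.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag index sequence for one sentence.

    emissions[t][j]   : score of tag j at token t (assumed to come from a
                        linear layer over BERT's embedding of token t)
    transitions[i][j] : CRF score for moving from tag i to tag j
    """
    num_tags = len(transitions)
    # Best score of any path ending in each tag at the first token.
    score = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(num_tags):
            # Best previous tag for arriving at tag j.
            best_prev = max(range(num_tags),
                            key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_prev] + transitions[best_prev][j]
                             + emissions[t][j])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final tag.
    path = [max(range(num_tags), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical BIO tag set for a single malware entity type.
TAGS = ["O", "B-MAL", "I-MAL"]
FORBIDDEN = -1e4  # large penalty: I-MAL may not directly follow O

# Hypothetical transition scores (rows: from-tag, columns: to-tag).
transitions = [
    [0.0, 0.0, FORBIDDEN],  # from O
    [0.0, 0.0, 1.0],        # from B-MAL
    [0.0, 0.0, 1.0],        # from I-MAL
]

# Hypothetical emission scores for the tokens: the / Emotet / trojan / spread
emissions = [
    [2.0, 0.1, 0.0],
    [0.2, 1.5, 1.6],  # per-token argmax would pick I-MAL here
    [0.5, 0.2, 1.0],
    [2.0, 0.0, 0.1],
]

decoded = [TAGS[i] for i in viterbi_decode(emissions, transitions)]
print(decoded)  # ['O', 'B-MAL', 'I-MAL', 'O']
```

Taking the highest-scoring tag independently at each token would yield O, I-MAL, I-MAL, O here, which is not a valid BIO sequence. Decoding jointly with the transition scores starts the entity with B-MAL instead, which is the kind of sequence-level consistency a CRF layer contributes on top of the per-token scores.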

Original language: English
Title of host publication: GLOBECOM 2023 - 2023 IEEE Global Communications Conference
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 7532-7537
Number of pages: 6
ISBN (Electronic): 9798350310900
Publication status: Published - 2023
Event: 2023 IEEE Global Communications Conference, GLOBECOM 2023 - Kuala Lumpur, Malaysia
Duration: 4 Dec 2023 to 8 Dec 2023

Publication series

Name: Proceedings - IEEE Global Communications Conference, GLOBECOM
ISSN (Print): 2334-0983
ISSN (Electronic): 2576-6813

Conference

Conference: 2023 IEEE Global Communications Conference, GLOBECOM 2023
Country/Territory: Malaysia
City: Kuala Lumpur
Period: 4/12/23 to 8/12/23
