TY - JOUR
T1 - ReFSM
T2 - Reverse engineering from protocol packet traces to test generation by extended finite state machines
AU - Lin, Ying Dar
AU - Lai, Yu Kuen
AU - Bui, Quan Tien
AU - Lai, Yuan Cheng
N1 - Publisher Copyright: © 2020 Elsevier Ltd
PY - 2020/12/1
Y1 - 2020/12/1
N2 - Protocol reverse engineering helps to automatically obtain protocol specifications, which are useful for network management, network security systems and test case generation tools. To achieve better accuracy, these applications require good models that capture not only the order in which messages are exchanged (the control flow aspect) but also the data being transmitted (the data flow aspect). However, current techniques focus only on inferring the control flow, represented as a Finite State Machine (FSM), without interpreting the data flow. The Extended Finite State Machine (EFSM), which embeds memory in the states and data guards in the FSM transitions, is commonly used to represent the data flow. In this work, we propose ReFSM, a novel approach to infer the EFSMs of protocols from network packet traces alone. The proposed method is evaluated on datasets of real-world network traffic traces of four protocols: FTP, SMTP, BitTorrent and PPLive. In the results, the coverage and the accuracy scores for correctness and behavior of the inferred models are always higher than 90%. The precision and recall values of message type identification are well above 94% and 96%, respectively. The inferred EFSMs are close to the correct models derived from the protocol specifications.
AB - Protocol reverse engineering helps to automatically obtain protocol specifications, which are useful for network management, network security systems and test case generation tools. To achieve better accuracy, these applications require good models that capture not only the order in which messages are exchanged (the control flow aspect) but also the data being transmitted (the data flow aspect). However, current techniques focus only on inferring the control flow, represented as a Finite State Machine (FSM), without interpreting the data flow. The Extended Finite State Machine (EFSM), which embeds memory in the states and data guards in the FSM transitions, is commonly used to represent the data flow. In this work, we propose ReFSM, a novel approach to infer the EFSMs of protocols from network packet traces alone. The proposed method is evaluated on datasets of real-world network traffic traces of four protocols: FTP, SMTP, BitTorrent and PPLive. In the results, the coverage and the accuracy scores for correctness and behavior of the inferred models are always higher than 90%. The precision and recall values of message type identification are well above 94% and 96%, respectively. The inferred EFSMs are close to the correct models derived from the protocol specifications.
KW - EFSM inference
KW - Protocol reverse engineering
KW - Protocol semantic deduction
UR - http://www.scopus.com/inward/record.url?scp=85090547004&partnerID=8YFLogxK
U2 - 10.1016/j.jnca.2020.102819
DO - 10.1016/j.jnca.2020.102819
M3 - Article
AN - SCOPUS:85090547004
SN - 1084-8045
VL - 171
JO - Journal of Network and Computer Applications
JF - Journal of Network and Computer Applications
M1 - 102819
ER -