TY - JOUR
T1 - Using Programmable P4 Switches to Reduce Communication Costs of Parallel and Distributed Simulations
AU - Wang, Shie Yuan
AU - Kuo, Nai En
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - In the past, several time synchronization methods have been proposed for parallel and distributed simulation (PDS). Among them, one widely used conservative method is the Chandy-Misra-Bryant (CMB) algorithm. In the CMB algorithm, many null messages may be exchanged among logical processes to advance their clocks so that deadlock will not occur among them. In this work, using a data-plane programmable P4 hardware switch, we design and implement a data fusion-based approach inside the packet processing pipeline of the P4 switch. Our approach extracts the timestamp carried in exchanged null messages, computes the fusion results of these timestamps, drops unnecessary null messages inside the switch, generates new messages carrying the fusion results, and sends these generated messages to only the logical processes that can benefit from receiving these messages. Experimental results show that on an 8-host testbed, our approach can speed up a PDS by a factor of 2.75 and 1.65 when compared with the unicast and multicast approaches, respectively.
KW - P4
KW - data-plane programmable switch
KW - parallel and distributed simulation
KW - software-defined network
UR - http://www.scopus.com/inward/record.url?scp=85146921809&partnerID=8YFLogxK
U2 - 10.1109/GLOBECOM48099.2022.10001195
DO - 10.1109/GLOBECOM48099.2022.10001195
M3 - Conference article
AN - SCOPUS:85146921809
SN - 2334-0983
SP - 4443
EP - 4448
JO - Proceedings - IEEE Global Communications Conference, GLOBECOM
JF - Proceedings - IEEE Global Communications Conference, GLOBECOM
T2 - 2022 IEEE Global Communications Conference, GLOBECOM 2022
Y2 - 4 December 2022 through 8 December 2022
ER -