TY - GEN
T1 - Reinforcement Learning for Accident Risk-Adaptive V2X Networking
AU - Kim, Seungmo
AU - Kim, Byung Jun
N1 - Publisher Copyright:
© 2020 IEEE.
PY - 2020/11
Y1 - 2020/11
AB - The significance of vehicle-to-everything (V2X) communications has grown steadily as connected and autonomous vehicles (CAVs) become more prevalent in practice. The key challenge is dynamicity: each vehicle needs to recognize frequent changes in its surroundings and reflect them in its networking behavior. This is where the need for machine learning arises. However, the same dynamicity also makes the learning itself highly complicated, which requires the learning framework to remain resilient and flexible as the environment changes. This paper therefore proposes a V2X networking framework that integrates reinforcement learning (RL) into the scheduling of multiple access. Specifically, the learning mechanism is formulated as a multi-armed bandit (MAB) problem, which enables a vehicle, without any assistance from external infrastructure, to (i) learn the environment, (ii) quantify its accident risk, and (iii) adapt its backoff counter according to that risk. The results show that the proposed learning protocol is able to (i) evaluate accident risk close to optimally and, as a result, (ii) yield a higher chance of transmission for a vehicle at risk of an accident.
KW - Connected and autonomous vehicles
KW - Contextual multi-armed bandit
KW - Reinforcement learning
KW - Vehicle-to-everything communications
UR - http://www.scopus.com/inward/record.url?scp=85101379888&partnerID=8YFLogxK
U2 - 10.1109/VTC2020-Fall49728.2020.9348445
DO - 10.1109/VTC2020-Fall49728.2020.9348445
M3 - Conference article
AN - SCOPUS:85101379888
T3 - IEEE Vehicular Technology Conference
BT - 2020 IEEE 92nd Vehicular Technology Conference, VTC 2020-Fall - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 92nd IEEE Vehicular Technology Conference, VTC 2020-Fall
Y2 - 18 November 2020
ER -