A Mini 4WD for the Reiwa Era!? Getting Started with Reinforcement Learning Using AWS DeepRacer

ryu-ki
May 01, 2025

Transcript

  1. Improving the reward function | Before

        def reward_function(params):
            track_width = params['track_width']
            distance_from_center = params['distance_from_center']

            # Band boundaries, measured outward from the center line
            marker_1 = 0.1 * track_width
            marker_2 = 0.25 * track_width
            marker_3 = 0.5 * track_width

            if distance_from_center <= marker_1:
                reward = 1.0
            elif distance_from_center <= marker_2:
                reward = 0.5
            elif distance_from_center <= marker_3:
                reward = 0.1
            else:
                reward = 1e-3  # likely off the track

            return float(reward)
  2. Improving the reward function | Before

        def reward_function(params):
            track_width = params['track_width']
            distance_from_center = params['distance_from_center']

            # Band boundaries, measured outward from the center line
            marker_1 = 0.1 * track_width
            marker_2 = 0.25 * track_width
            marker_3 = 0.5 * track_width

            if distance_from_center <= marker_1:
                reward = 1.0
            elif distance_from_center <= marker_2:
                reward = 0.5
            elif distance_from_center <= marker_3:
                reward = 0.1
            else:
                reward = 1e-3  # likely off the track

            return float(reward)

    The farther the car strays from the center of the track, the smaller the reward.
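  3. Improving the reward function | After

        # Fragment from inside reward_function: reward, speed, and
        # steering_angle are defined earlier (not shown on this slide).
        speed_reward = speed / 2.0
        reward += speed_reward

        ABS_STEERING_THRESHOLD = 30
        # abs() so that left and right turns are penalized equally
        steering_penalty = abs(steering_angle) / ABS_STEERING_THRESHOLD
        reward *= (1 - steering_penalty)

        return float(reward)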
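  4. Improving the reward function | After

        # Fragment from inside reward_function: reward, speed, and
        # steering_angle are defined earlier (not shown on this slide).
        speed_reward = speed / 2.0
        reward += speed_reward

        ABS_STEERING_THRESHOLD = 30
        # abs() so that left and right turns are penalized equally
        steering_penalty = abs(steering_angle) / ABS_STEERING_THRESHOLD
        reward *= (1 - steering_penalty)

        return float(reward)

    The higher the speed, the larger the reward.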
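  5. Improving the reward function | After

        # Fragment from inside reward_function: reward, speed, and
        # steering_angle are defined earlier (not shown on this slide).
        speed_reward = speed / 2.0
        reward += speed_reward

        ABS_STEERING_THRESHOLD = 30
        # abs() so that left and right turns are penalized equally
        steering_penalty = abs(steering_angle) / ABS_STEERING_THRESHOLD
        reward *= (1 - steering_penalty)

        return float(reward)

    The smaller the steering angle, the larger the reward.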
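
The slides show the "before" function and the "after" additions separately. The sketch below is one possible way the two could fit together, not the author's confirmed final code. It assumes the "after" fragment is appended to the center-line tiers, that `speed` and `steering_angle` are read from `params['speed']` and `params['steering_angle']`, and that the steering angle is signed (hence `abs()`). The `min(..., 1.0)` clamp is an extra safeguard that does not appear on the slides.

```python
def reward_function(params):
    """Hypothetical combination of the 'before' and 'after' slides."""
    track_width = params['track_width']
    distance_from_center = params['distance_from_center']
    speed = params['speed']                    # assumed input key
    steering_angle = params['steering_angle']  # assumed input key, signed degrees

    # Tiered reward: closer to the center line earns more.
    marker_1 = 0.1 * track_width
    marker_2 = 0.25 * track_width
    marker_3 = 0.5 * track_width

    if distance_from_center <= marker_1:
        reward = 1.0
    elif distance_from_center <= marker_2:
        reward = 0.5
    elif distance_from_center <= marker_3:
        reward = 0.1
    else:
        reward = 1e-3

    # Speed bonus: faster driving adds to the reward.
    reward += speed / 2.0

    # Steering penalty: scale the reward down as the wheels turn further.
    ABS_STEERING_THRESHOLD = 30
    # Clamp is an assumption (not on the slides) to keep the reward non-negative.
    steering_penalty = min(abs(steering_angle) / ABS_STEERING_THRESHOLD, 1.0)
    reward *= (1 - steering_penalty)

    return float(reward)


if __name__ == "__main__":
    # Made-up sample inputs, for illustration only.
    straight = {'track_width': 1.0, 'distance_from_center': 0.05,
                'speed': 2.0, 'steering_angle': 0.0}
    turning = dict(straight, steering_angle=-15.0)
    print(reward_function(straight), reward_function(turning))
```

Because the steering penalty is multiplicative, it also shrinks the speed bonus, so under these assumptions a fast but sharply turning car still scores low, and the reward drops to zero at the full 30-degree threshold.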