<<VectorNet: Encoding HD Maps and Agent Dynamics from Vectorized Representation>>是Google Waymo在2020年放出Paper，如论文题目所表达的，其核心贡献在于提出了一种针对HDMap和车辆轨迹的矢量化Encoding方式，相对于以往图像+Conv的编码方式，不仅极大的降低了神经网络的训练参数，在实际的车辆轨迹预测中也取得了不错的效果。
论文链接: https://arxiv.org/pdf/2005.04259.pdf

网上关于论文的解读已经很多，本文主要记录下代码复现过程和自己的一些理解。

Part1环境准备

论文中使用了公开数据集Argoverse Dataset。

数据集和高精地图下载地址：

https://www.argoverse.org/data.html#download-link

高精地图API:

除了高精地图数据，Argoverse还提供了开源的地图查询的API，GitHub地址如下：

https://github.com/argoai/argoverse-api

安装的方法wiki中有详细的说明，不赘述。API提供了很多数据加载查询和可视化的方法。

Argoverse运动预测竞赛

除此之外，Argoverse还有一个运动预测的挑战赛，官方提供了一些Base Line的算法实现，有些代码可以直接拿过用。代码的Github地址：https://github.com/jagjeet-singh/argoverse-forecasting

Part2数据预处理

高精地图数据和轨迹数据下载完成之后，进行模型训练前先要对数据进行预处理。

Argoverse Motion Forecasting数据集的每个数据序列中类型为Agent的轨迹是用于预测的轨迹，该轨迹采样频率为10HZ，采样时长为5s，通常我们用前2s的轨迹作为观察序列，后3s作为轨迹预测的真值来训练神经网络。

seq_df = pd.read_csv(seq_path)

agent_x = seq_df[seq_df['OBJECT_TYPE'] == 'AGENT']['X']
agent_y = seq_df[seq_df['OBJECT_TYPE'] == 'AGENT']['Y']
agent_traj = np.column_stack((agent_x, agent_y))

agent_timestamp = seq_df[seq_df['OBJECT_TYPE'] == 'AGENT']['TIMESTAMP']

city_name = seq_df['CITY_NAME'].values[0]

1坐标系处理

为了避免神经网络对绝对坐标的依赖，同时降低神经网络的训练难度，论文中对坐标系进行了标准化处理：将坐标系的原点移到Tatget Agent的Observe序列的最后一个点，同时以该点的朝向作为x轴的正向。

论文中对坐标系的描述如下：

实现代码如下：

base_pt = agent_traj[args.obs_len - 1]

offset_agent_traj, angle = normalized_traj(args, agent_traj, base_pt)

平移和旋转坐标序列的代码如下。函数返回标准化的轨迹坐标序列以及旋转的角度。高精地图和所有车辆轨迹的标准化都使用这个函数。

def normalized_traj(args, xy_seq, base_pt):

    # First apply translation
    m = [1, 0, 0, 1, -base_pt[0], -base_pt[1]]
    ls = LineString(xy_seq)

    # Now apply rotation, taking care of edge cases
    ls_offset = affine_transform(ls, m)
    end = ls_offset.coords[args.obs_len]
    if end[0] == 0 and end[1] == 0:
        angle = 0.0
    elif end[0] == 0:
        angle = -90.0 if end[1] > 0 else 90.0
    elif end[1] == 0:
        angle = 0.0 if end[0] > 0 else 180.0
    else:
        angle = math.degrees(math.atan(end[1] / end[0]))
        if (end[0] > 0 and end[1] > 0) or (end[0] > 0 and end[1] < 0):
            angle = -angle
        else:
            angle = 180.0 - angle

    # Rotate the trajetory
    ls_rotate = rotate(ls_offset, angle, origin=(0, 0)).coords[:]

    # Normalized trajectory
    norm_xy = np.array(ls_rotate)

    return norm_xy, angle

2高精地图数据预处理

车道中心线插值

论文中用均匀采样的离散点序列表达高精地图，具体的处理手法如下：

Argoverse Motion Prediction数据集中只提供了矢量的车道中心线，道路边界、车道边界都是多边形，不好处理，因此我只使用了车道中心线。

在将车道中心线表述为向量之前，先对中心线进行插值。这里以0.1m的间隔对车道中心线进行插值。插值代码如下:

def interpolate_polyline_n(polyline, num_points):
    tck, u = interp.splprep(polyline.T, s=0)
    u = np.linspace(0.0, 1.0, num_points)

    return np.column_stack(interp.splev(u, tck))

高精地图的向量表达

高精地图训练数据的总体表达分为三层：每个Seq是一个Graph，每一个车道中心线是一个PolyLine，每个Polyline包含多个vector。在我们的实现中，每个vector是长度为9为向量，每个Polyline最多包含100个vector，每个Graph最多包含200个Polyline。

--Graph_1
------PolyLine_1
-----------vector_1
-----------vector_2
-----------......
------PolyLine_2
-----------vector_1
-----------vector_2
-----------......
--Graph_2
--.........

Vector数据组织

按照论文的表述，将车道中心线表达为向量形式。

其中：、、、是相邻两个Point的坐标，是元素类型(车道、车辆轨迹)，否有红绿灯，是转向类型(左转、右转、直行)，是是否在路口内，id是元素的polyline的id，同一个Lane所有向量的id是相同的。

实现代码如下:

has_traffic_contrl = avm.lane_has_traffic_control_measure(lane_id, city_name)
turn = avm.get_lane_turn_direction(lane_id, city_name)
in_intersection = avm.lane_is_in_intersection(lane_id, city_name)
origin_center_line = avm.get_lane_segment_centerline(lane_id, city_name)

...

  lane_features = []
  for cl_indx in range(1, len(center_line)):
      lane_feature = []

      cl_0 = center_line[cl_indx - 1]
      cl_1 = center_line[cl_indx]

      # offset to the last observed point
      lane_feature.extend([cl_0[0], cl_0[1]])
      lane_feature.extend([cl_1[0], cl_1[1]])

      # object type: lane
      lane_feature.append(1)

      if has_traffic_contrl == True:
          lane_feature.append(1)
      else:
          lane_feature.append(-1)

      if turn == 'LEFT':
          lane_feature.append(1)
      elif turn == 'RIGHT':
          lane_feature.append(-1)
      else:
          lane_feature.append(0.5)

      if in_intersection == True:
          lane_feature.append(1)
      else:
          lane_feature.append(-1)

      lane_feature.append(lane_id)

      lane_features.append(lane_feature)

解决车道线长短不一的问题

由于车道中心线长短不一，为了解决向量长度差异比较大的问题，将长度超过100m的车道做了打断处理。

Polyline的数据组织

Polyline包含多个Vector，我们代码中设置最多包含100个Vector，不足100个用0补齐，多于100个直接截断；同时设置一个mask数组，记录哪些是真实数据，哪些是补齐的数据。

代码如下:

lane_features_mask = [True] * len(lane_features)

if len(lane_features) < args.max_features_in_elems:
    pad_mask = [False] * (args.max_features_in_elems - len(lane_features))
    lane_features_mask.extend(pad_mask)

    pad_feature = [[0] * args.max_feature_dim] * (args.max_features_in_elems - len(lane_features))
    lane_features.extend(pad_feature)
else:
    lane_features = lane_features[:args.max_features_in_elems]
    lane_features_mask = lane_features_mask[:args.max_features_in_elems]

sub_graph.append(lane_features)
sub_graph_mask.append(lane_features_mask)

Graph的数据组织

Graph包含多个Polyline，我们代码中设置最多包含200个Polyline，不足200个用0补齐，多于200个直接截断；同时也设置一个mask数组，记录哪些是真实数据，哪些是补齐的数据。

代码如下:

graph_mask = [True] * len(sub_graph)

if len(sub_graph) < args.max_elems_in_sub_graph:
    pad_mask = [[False] * args.max_features_in_elems]  * (args.max_elems_in_sub_graph - len(sub_graph))
    sub_graph_mask.extend(pad_mask)
        
    graph_mask.extend([False] * (args.max_elems_in_sub_graph - len(sub_graph)))
        
    pad_feature = [[[0] * args.max_feature_dim] * args.max_features_in_elems]  * (args.max_elems_in_sub_graph - len(sub_graph))
    sub_graph.extend(pad_feature)
else:
    sub_graph_mask = sub_graph_mask[:args.max_elems_in_sub_graph]
    graph_mask = graph_mask[:args.max_elems_in_sub_graph]
    sub_graph = sub_graph[:args.max_elems_in_sub_graph]

其它

除了地图数据之外，论文中还进行了Graph Completion和Position Encoding。

Graph Completion

随机Mask的代码如下：

node_mask = np.array(graph_mask)
lane_idxs = np.argwhere(node_mask == True)

random_cnt = 5 # (int)(lane_idxs.shape[0] * 0.3)
indexs = np.random.choice(np.arange(1, lane_idxs.shape[0]), size = random_cnt, replace = False)
lane_random_idxs = lane_idxs[indexs]
node_mask = np.full(node_mask.shape, False)
node_mask[lane_random_idxs] = True

Position Encoding

Position Encoding的实现很简单，取每个polyline的坐标最小数值即可。

sub_pids.append([np.array(lane_features)[:, [0, 2]].min(), np.array(lane_features)[:, [1, 3]].min()])

3Agent轨迹数据预处理

Agent分为Target Agent和Other Agent两个分类，Target Agent是待预测的车辆轨迹，Other Agent是其它车辆的轨迹，这两类轨迹数据的处理方法相同，其向量形式表达如下：

组织代码如下:

agent_features = []

# obs_traj
for agent_idx in range(1, len(offset_agent_traj)):
    agent_feature = []
    agent_0 = offset_agent_traj[agent_idx - 1]
    agent_1 = offset_agent_traj[agent_idx]

    agent_feature.extend([agent_0[0], agent_0[1]])
    agent_feature.extend([agent_1[0], agent_1[1]])

    # object type: agent_traj
    agent_feature.append(obj_type)

    #timestamp
    diff_timestamp = agent_timestamp[agent_idx] - agent_timestamp[agent_idx - 1]
    agent_feature.append(diff_timestamp)
        
    unit_vec = (agent_1 - agent_0)
    agent_feature.append(unit_vec[0])
    agent_feature.append(unit_vec[1])
        
    agent_feature.append(agent_id)

    agent_features.append(agent_feature)

在处理Other Agent的轨迹，过滤掉了静止的车辆。过滤的方式采用了Argoverse Baseline的代码，将速度低于1m/s的Agent过滤掉。

def compute_velocity(track_df: pd.DataFrame) -> List[float]:
    x_coord = track_df['X'].values
    y_coord = track_df['Y'].values
    timestamp = track_df['TIMESTAMP'].values
    vel_x, vel_y = zip(*[(
        x_coord[i] - x_coord[i - 1] /
        (float(timestamp[i]) - float(timestamp[i - 1])),
        y_coord[i] - y_coord[i - 1] /
        (float(timestamp[i]) - float(timestamp[i - 1])),
    ) for i in range(1, len(timestamp))])
    vel = [np.sqrt(x**2 + y**2) for x, y in zip(vel_x, vel_y)]

    return vel

def get_is_track_stationary(track_df: pd.DataFrame) -> bool:
    STATIONARY_THRESHOLD = 13
    
    vel = compute_velocity(track_df)
    sorted_vel = sorted(vel)
    threshold_vel = sorted_vel[STATIONARY_THRESHOLD]
    
    VELOCITY_THRESHOLD = 1.0
    
    return True if threshold_vel < VELOCITY_THRESHOLD else False

读取并处理Other Agent的数据。

EXIST_THRESHOLD = 15
other_agent_groups = seq_df.groupby('TRACK_ID')
for group_name, group_data in other_agent_groups:

# Check if the track is long enough
if len(group_data) < EXIST_THRESHOLD:
    continue
    
# Skip if agent track
if group_data['OBJECT_TYPE'].iloc[0] == 'AGENT':
   continue
                
if get_is_track_stationary(group_data):
  continue

x_coord = group_data['X'].values
y_coord = group_data['Y'].values
timestamp = group_data['TIMESTAMP'].values
            
other_agent_traj = np.column_stack((x_coord, y_coord))
            
offset_other_agent_traj = normalized_map(other_agent_traj, base_pt, angle)
            
create_agent_feature(args, offset_other_agent_traj, timestamp, 3, origin_center_lines, sub_pids, sub_graph, sub_graph_mask)

读取并处理Target Agent数据。

agent_x = seq_df[seq_df['OBJECT_TYPE'] == 'AGENT']['X']
agent_y = seq_df[seq_df['OBJECT_TYPE'] == 'AGENT']['Y']
agent_traj = np.column_stack((agent_x, agent_y))
agent_timestamp = seq_df[seq_df['OBJECT_TYPE'] == 'AGENT']['TIMESTAMP'].values

offset_agent_traj, angle = normalized_traj(args, agent_traj, base_pt)

base_pt = agent_traj[args.obs_len - 1]

offset_agent_traj, angle = normalized_traj(args, agent_traj, base_pt)

create_agent_feature(args, offset_agent_traj[:args.obs_len], agent_timestamp, 2, origin_center_lines, sub_pids, sub_graph, sub_graph_mask)

Part3训练效果

完整训练Argoverse数据耗时比较长，先用200个Sequence进行训练调试。

20/25, loss:55.78622817993164
------------------------------------------------
{'minADE': 4.3394534013715225, 'minFDE': 8.642288635816033, 'MR': 0.90625}
------------------------------------------------
21/25, loss:55.540924072265625
------------------------------------------------
{'minADE': 4.290202449056427, 'minFDE': 8.579150848481632, 'MR': 0.90625}
------------------------------------------------
22/25, loss:55.3294792175293
------------------------------------------------
{'minADE': 4.2439700423643965, 'minFDE': 8.519883228494539, 'MR': 0.9375}
------------------------------------------------
23/25, loss:55.14278030395508
------------------------------------------------
{'minADE': 4.200904121030685, 'minFDE': 8.466275328781865, 'MR': 0.9375}
------------------------------------------------
24/25, loss:54.96728515625
------------------------------------------------
{'minADE': 4.1585633970311235, 'minFDE': 8.41244784105069, 'MR': 0.9375}
------------------------------------------------

下图中黄色是Observe序列，紫色是Ground Truth，绿色是预测结果。

转弯车道的训练结果很差，完全没学会... 可能是训练数据太少吧，也可能网络有Bug，以后慢慢调吧，嘿嘿...

Part4参考资料

1.https://github.com/xk-huang/yet-another-vectornet

2.https://github.com/DQSSSSS/VectorNet

本站仅提供存储服务，所有内容均由用户发布，如发现有害或侵权内容，请点击举报。