https://blog.csdn.net/cdj0311/article/details/107246476
When training large-scale recommendation models, we typically work with two broad kinds of features: sparse and dense.
Sparse features are usually mapped through an embedding layer into dense vectors before being fed into the fully connected layers. However, when the sparse features include many high-cardinality ID features (a UserID vocabulary is almost always in the tens of millions or more), training such a huge embedding table is very slow. One remedy is to raise the learning rate, but a rate that is too large in turn hurts the training of the dense features (e.g. vector-valued features). A practical compromise is to use two optimizers with different learning rates: one for the sparse embeddings and one for the dense parameters.
Here is an implementation with tf.estimator + tf.feature_column (the snippet below is the training-op portion of a model_fn):
```python
def is_sparse(variable, fields):
    """Return True if the variable belongs to one of the sparse feature fields."""
    for field in fields:
        if field in variable.name:
            return True
    return False

# Global step (advanced only by the dense-parameter optimizer below)
global_step = tf.train.get_global_step()
# All trainable variables
trainable_variables = tf.trainable_variables()
# Names of the sparse columns: hashed categorical embeddings with large bucket sizes
sparse_list = [x.name for x in params["feature_configs"].all_columns.values()
               if "EmbeddingColumn" in str(type(x)) and
               "HashedCategoricalColumn" in str(type(x.categorical_column)) and
               x.categorical_column.hash_bucket_size > 200000]
# Embedding variables
embedding_variables = [v for v in trainable_variables if "embedding_weights" in v.name]
# Sparse embedding variables
embedding_sparse_variables = [v for v in embedding_variables if is_sparse(v, sparse_list)]
# Dense embedding variables
embedding_dense_variables = [v for v in embedding_variables
                             if v not in embedding_sparse_variables]
# Remaining network parameters
param_variables = [v for v in trainable_variables if v not in embedding_variables]
# Optimizer for the sparse embeddings (larger learning rate)
optimizer_sparse_emb = tf.train.AdagradOptimizer(learning_rate=0.01)
train_op_sparse_emb = optimizer_sparse_emb.minimize(loss, var_list=embedding_sparse_variables)
# Optimizer for the dense embeddings (smaller learning rate)
optimizer_dense_emb = tf.train.AdagradOptimizer(learning_rate=0.001)
train_op_dense_emb = optimizer_dense_emb.minimize(loss, var_list=embedding_dense_variables)
# Optimizer for the remaining parameters; only this one increments global_step
optimizer_param = tf.train.AdamOptimizer(learning_rate=params["learning_rate"])
train_op_param = optimizer_param.minimize(loss, var_list=param_variables,
                                          global_step=global_step)
# Group all update ops and the three training ops into a single train_op
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
train_op = tf.group(update_ops, train_op_sparse_emb, train_op_dense_emb, train_op_param)
return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)
```
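The core idea, stripped of the TensorFlow plumbing, is simply that each parameter group gets its own update rule with its own learning rate. The framework-agnostic toy below (a minimal sketch with hypothetical names; plain SGD on a quadratic loss stands in for Adagrad/Adam on the real model) shows how a larger learning rate lets the "sparse" group converge much faster than the "dense" group, which is exactly the effect the two-optimizer setup buys us:

```python
def sgd_step(params, grads, lr):
    """One plain SGD update over a parameter group (parallel lists of floats)."""
    return [p - lr * g for p, g in zip(params, grads)]

def train(steps=100):
    # Toy loss per weight: L(w) = (w - 3)^2, so dL/dw = 2 * (w - 3).
    sparse_w = [10.0]  # stands in for the huge sparse embedding table
    dense_w = [10.0]   # stands in for the dense-feature parameters
    lr_sparse, lr_dense = 0.1, 0.01  # sparse group gets the larger rate
    for _ in range(steps):
        sparse_g = [2.0 * (w - 3.0) for w in sparse_w]
        dense_g = [2.0 * (w - 3.0) for w in dense_w]
        # Each group is updated independently with its own learning rate,
        # mirroring the separate minimize(var_list=...) calls above.
        sparse_w = sgd_step(sparse_w, sparse_g, lr_sparse)
        dense_w = sgd_step(dense_w, dense_g, lr_dense)
    return sparse_w[0], dense_w[0]

sparse_final, dense_final = train(100)
# After 100 steps the high-LR "sparse" weight is essentially at the optimum (3.0),
# while the low-LR "dense" weight is still noticeably away from it.
```

In the real model_fn the same separation is achieved by passing disjoint `var_list` arguments to the three `minimize` calls, so each variable is touched by exactly one optimizer.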