Contents

1 Overview
2 Practice
- 2.1 Serving image
- 2.2 Model files and the s3cmd environment
- 2.3 Deployment
- 2.5 Updating the model
3 Testing
4 Summary
5 Reference

TensorFlow Serving is a flexible, high-performance serving system for machine learning models, designed for production environments.

1 Overview

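In the official TensorFlow example, Use TensorFlow Serving with Kubernetes, the model is copied into the image. This is somewhat inflexible, because updating the model means rebuilding the image and then updating the corresponding Pods.

TensorFlow Serving can already roll over to a new model version on its own, and it can read model files directly from S3. Keeping the model on S3 instead of in the image therefore lets you update the model without rebuilding the image.

For the demo, it helps to read Amazon S3 Tools Usage first to get familiar with s3cmd.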

2 Practice

To deploy a model stored on S3 and update it at any time, prepare the following in advance.

- Serving image
- Model files and an s3cmd environment

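2.1 Serving image

The Serving image is available from the official TensorFlow Serving image repository. When choosing an image, be sure to check whether it is the GPU version. Then tag it for the TenC image registry.

    # Pull the official image
    docker pull tensorflow/serving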

2.2 Model files and the s3cmd environment

This is the example provided by tensorflow/serving. The model files can be found here.

    .
    └── 00000123
        ├── assets
        │   └── foo.txt
        ├── saved_model.pb
        └── variables
            ├── variables.data-00000-of-00001
            └── variables.index
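Before uploading, it can be useful to confirm that the local directory has the layout shown above: numeric version folders that each contain a saved_model.pb. The following is a minimal sketch of such a check. The helper name find_model_versions and the temporary demo directory are illustrative assumptions and not part of the original setup.

```python
import tempfile
from pathlib import Path


def find_model_versions(model_dir):
    """Return the sorted version numbers of subfolders that look like SavedModels.

    A subfolder counts as a version when its name is purely numeric
    (for example 00000123) and it contains a saved_model.pb file.
    """
    versions = []
    for child in Path(model_dir).iterdir():
        is_numeric = child.name.isascii() and child.name.isdigit()
        if child.is_dir() and is_numeric and (child / "saved_model.pb").is_file():
            versions.append(int(child.name))
    return sorted(versions)


# Demo on a throwaway directory that mimics the tree above (hypothetical data).
with tempfile.TemporaryDirectory() as tmp:
    for name in ("00000123", "00000124"):
        version_dir = Path(tmp) / name
        (version_dir / "variables").mkdir(parents=True)
        (version_dir / "saved_model.pb").write_bytes(b"")
    (Path(tmp) / "notes").mkdir()  # ignored: not a numeric version folder
    found = find_model_versions(tmp)

print(found)
```

Running the check before the upload catches a misnamed or incomplete version folder locally, which is cheaper than discovering it from the Serving logs.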

Assuming you are already familiar with s3cmd: accessing the bucket requires an Access key and a Secret key.

s3cmd put --recursive --access_key=runzhliu-demo-xxx --secret_key=runzhliu-demo-xxx --no-ssl --host=xxx.db:7480 resnet_v2_fp32_savedmodel_NHWC_jpg s3://runzhliu__demo/Tensorflow_Serving/

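2.3 Deployment

When creating the Pod, you need to pass in several environment variables that TensorFlow uses to access S3; otherwise Serving cannot load the model from S3.

    # The keys can be found in the Ceph Key module of the Ceph storage section above
    export AWS_ACCESS_KEY_ID=runzhliu-demo-key
    export AWS_SECRET_ACCESS_KEY=runzhliu-demo-secret
    export S3_ENDPOINT=9.25.151.xxx:7480
    export S3_USE_HTTPS=0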
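As a rough illustration of how these variables could reach the container, the sketch below builds a Pod manifest as a Python dict and prints it as JSON, which kubectl also accepts. Everything beyond the four environment variables is an assumption made for illustration: the Pod name, the container name, the exact S3 model path, and the choice to start tensorflow_model_server directly with the --model_name, --model_base_path and --rest_api_port flags. The original deployment manifest is not shown in this article.

```python
import json


def serving_pod_manifest(model_name, model_base_path, s3_env):
    """Build a Pod manifest (as a dict) for a Serving container that reads from S3."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": "tensorflow-serving",  # hypothetical Pod name
            "labels": {"app": "tensorflow-serving"},
        },
        "spec": {
            "containers": [
                {
                    "name": "serving",  # hypothetical container name
                    "image": "tensorflow/serving",
                    "command": ["tensorflow_model_server"],
                    "args": [
                        f"--model_name={model_name}",
                        f"--model_base_path={model_base_path}",
                        "--rest_api_port=8501",
                    ],
                    "ports": [{"containerPort": 8501}],
                    # Kubernetes expects env values to be strings.
                    "env": [{"name": k, "value": v} for k, v in s3_env.items()],
                }
            ]
        },
    }


# Placeholder values copied from the export lines above.
s3_env = {
    "AWS_ACCESS_KEY_ID": "runzhliu-demo-key",
    "AWS_SECRET_ACCESS_KEY": "runzhliu-demo-secret",
    "S3_ENDPOINT": "9.25.151.xxx:7480",
    "S3_USE_HTTPS": "0",
}

manifest = serving_pod_manifest(
    model_name="saved_model_half_plus_two_cpu",
    # Assumed location of the model folder inside the bucket.
    model_base_path="s3://runzhliu__demo/Tensorflow_Serving/saved_model_half_plus_two_cpu",
    s3_env=s3_env,
)
print(json.dumps(manifest, indent=2))
```

In a real cluster the two keys would more likely come from a Kubernetes Secret than from literal values in the manifest; plain values are used here only to mirror the export lines.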

2.5 Updating the model

The update method tested here is to add a model folder with a larger version number and upload it to Ceph storage with s3cmd put.

    .
    |-- 00000123 // original model
    |   |-- assets
    |   |   `-- foo.txt
    |   |-- saved_model.pb
    |   `-- variables
    |       |-- variables.data-00000-of-00001
    |       `-- variables.index
    `-- 00000124 // updated model
        |-- assets
        |   `-- foo.txt
        |-- saved_model.pb
        `-- variables
            |-- variables.data-00000-of-00001
            `-- variables.index

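Then check the state of the model service with curl. Version 124 is now AVAILABLE, while version 123 has moved to END. This follows from Serving's default Version Policy, which automatically loads the model with the largest version number and unloads the older one.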

    # curl http://172.17.91.182:8501/v1/models/saved_model_half_plus_two_cpu
    {
     "model_version_status": [
      {
       "version": "124",
       "state": "AVAILABLE",
       "status": {
        "error_code": "OK",
        "error_message": ""
       }
      },
      {
       "version": "123",
       "state": "END",
       "status": {
        "error_code": "OK",
        "error_message": ""
       }
      }
     ]
    }
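The same check can be scripted. The sketch below parses a model status response of the shape shown above and returns the versions whose state is AVAILABLE. The function name available_versions is an illustrative assumption, and the sample string is a trimmed copy of the response above rather than a live call.

```python
import json


def available_versions(status_json):
    """Return the sorted model versions whose state is AVAILABLE."""
    status = json.loads(status_json)
    return sorted(
        int(entry["version"])
        for entry in status["model_version_status"]
        if entry["state"] == "AVAILABLE"
    )


# Trimmed copy of the status response shown above.
sample = """
{
 "model_version_status": [
  {"version": "124", "state": "AVAILABLE",
   "status": {"error_code": "OK", "error_message": ""}},
  {"version": "123", "state": "END",
   "status": {"error_code": "OK", "error_message": ""}}
 ]
}
"""

serving = available_versions(sample)
print(serving)
```

A script like this could poll the status endpoint after an upload and wait until the new version number appears in the result.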

3 Testing

A few simple curl requests are enough to test the deployed tensorflow-serving Workload. For the test environment, see Serving_Curl.

    / # curl http://6.16.240.189:8501/v1/models/saved_model_half_plus_two_cpu
    {
     "model_version_status": [
      {
       "version": "123",
       "state": "AVAILABLE",
       "status": {
        "error_code": "OK",
        "error_message": ""
       }
      }
     ]
    }
    # curl -d '{"instances": [1.0, 2.0, 5.0]}' -X POST http://6.16.240.189:8501/v1/models/saved_model_half_plus_two_cpu:predict
    {
        "predictions": [2.5, 3.0, 4.5]
    }
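The predict call above can also be issued from Python with only the standard library. The sketch below builds the same POST request and, separately, recomputes what the half_plus_two model should return (x / 2 + 2) so a response can be sanity-checked. The function names and the timeout are illustrative assumptions; the network call itself is left in a comment because it needs a reachable Serving endpoint.

```python
import json
import urllib.request


def half_plus_two(xs):
    """Reference computation of the demo model: y = x / 2 + 2."""
    return [x / 2 + 2 for x in xs]


def build_predict_request(base_url, model_name, instances):
    """Build the REST predict request for a served model."""
    url = f"{base_url}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(base_url, model_name, instances, timeout=5.0):
    """Send the request and return the 'predictions' field of the response."""
    request = build_predict_request(base_url, model_name, instances)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["predictions"]


request = build_predict_request(
    "http://6.16.240.189:8501", "saved_model_half_plus_two_cpu", [1.0, 2.0, 5.0]
)
expected = half_plus_two([1.0, 2.0, 5.0])
print(request.full_url)
print(expected)

# With a reachable endpoint, the live result could be compared like this:
# assert predict("http://6.16.240.189:8501",
#                "saved_model_half_plus_two_cpu", [1.0, 2.0, 5.0]) == expected
```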

Next, curl the Pod IP of the deployed Pod on the HTTP port 8501 to view the metadata of the deployed model. The output includes signature_def and related information.

    # curl http://6.16.240.189:8501/v1/models/saved_model_half_plus_two_cpu/versions/123/metadata
    {
     "model_spec": {
      "name": "saved_model_half_plus_two_cpu",
      "signature_name": "",
      "version": "123"
     },
     "metadata": {
      "signature_def": {
       "signature_def": {
        "regress_x_to_y2": {
         "inputs": {
          "inputs": {
           "dtype": "DT_STRING",
           "tensor_shape": {"dim": [], "unknown_rank": true},
           "name": "tf_example:0"
          }
         },
         "outputs": {
          "outputs": {
           "dtype": "DT_FLOAT",
           "tensor_shape": {
            "dim": [{"size": "-1", "name": ""}, {"size": "1", "name": ""}],
            "unknown_rank": false
           },
           "name": "y2:0"
          }
         },
         "method_name": "tensorflow/serving/regress"
        },
        "classify_x_to_y": {
         "inputs": {
          "inputs": {
           "dtype": "DT_STRING",
           "tensor_shape": {"dim": [], "unknown_rank": true},
           "name": "tf_example:0"
          }
         },
         "outputs": {
          "scores": {
           "dtype": "DT_FLOAT",
           "tensor_shape": {
            "dim": [{"size": "-1", "name": ""}, {"size": "1", "name": ""}],
            "unknown_rank": false
           },
           "name": "y:0"
          }
         },
         "method_name": "tensorflow/serving/classify"
        },
        "regress_x2_to_y3": {
         "inputs": {
          "inputs": {
           "dtype": "DT_FLOAT",
           "tensor_shape": {
            "dim": [{"size": "-1", "name": ""}, {"size": "1", "name": ""}],
            "unknown_rank": false
           },
           "name": "x2:0"
          }
         },
         "outputs": {
          "outputs": {
           "dtype": "DT_FLOAT",
           "tensor_shape": {
            "dim": [{"size": "-1", "name": ""}, {"size": "1", "name": ""}],
            "unknown_rank": false
           },
           "name": "y3:0"
          }
         },
         "method_name": "tensorflow/serving/regress"
        },
        "serving_default": {
         "inputs": {
          "x": {
           "dtype": "DT_FLOAT",
           "tensor_shape": {
            "dim": [{"size": "-1", "name": ""}, {"size": "1", "name": ""}],
            "unknown_rank": false
           },
           "name": "x:0"
          }
         },
         "outputs": {
          "y": {
           "dtype": "DT_FLOAT",
           "tensor_shape": {
            "dim": [{"size": "-1", "name": ""}, {"size": "1", "name": ""}],
            "unknown_rank": false
           },
           "name": "y:0"
          }
         },
         "method_name": "tensorflow/serving/predict"
        },
        "regress_x_to_y": {
         "inputs": {
          "inputs": {
           "dtype": "DT_STRING",
           "tensor_shape": {"dim": [], "unknown_rank": true},
           "name": "tf_example:0"
          }
         },
         "outputs": {
          "outputs": {
           "dtype": "DT_FLOAT",
           "tensor_shape": {
            "dim": [{"size": "-1", "name": ""}, {"size": "1", "name": ""}],
            "unknown_rank": false
           },
           "name": "y:0"
          }
         },
         "method_name": "tensorflow/serving/regress"
        }
       }
      }
     }
    }
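The metadata response is verbose, so a short summary per signature is often easier to read. The sketch below reduces a response of this shape to the method name and the input and output keys of each signature. The function name summarize_signatures is an illustrative assumption, and the sample dict is a trimmed excerpt that keeps only two of the signatures above.

```python
def summarize_signatures(metadata_response):
    """Map each signature name to its method and its input/output keys."""
    signatures = metadata_response["metadata"]["signature_def"]["signature_def"]
    summary = {}
    for name, signature in signatures.items():
        summary[name] = {
            "method": signature["method_name"],
            "inputs": sorted(signature["inputs"]),
            "outputs": sorted(signature["outputs"]),
        }
    return summary


# Trimmed excerpt of the metadata response above (tensor details omitted).
sample = {
    "model_spec": {"name": "saved_model_half_plus_two_cpu", "version": "123"},
    "metadata": {
        "signature_def": {
            "signature_def": {
                "serving_default": {
                    "inputs": {"x": {"dtype": "DT_FLOAT", "name": "x:0"}},
                    "outputs": {"y": {"dtype": "DT_FLOAT", "name": "y:0"}},
                    "method_name": "tensorflow/serving/predict",
                },
                "classify_x_to_y": {
                    "inputs": {"inputs": {"dtype": "DT_STRING", "name": "tf_example:0"}},
                    "outputs": {"scores": {"dtype": "DT_FLOAT", "name": "y:0"}},
                    "method_name": "tensorflow/serving/classify",
                },
            }
        }
    },
}

summary = summarize_signatures(sample)
for name, info in summary.items():
    print(name, info)
```

The summary makes it easy to see that the predict request in the earlier test goes through serving_default, which takes x and returns y.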

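Requests can also go through the name or the cluster IP of the Service in front of the Serving Pod, which are tensorflow-serving and 172.17.91.182 respectively. So even if the Pod is updated and its Pod IP changes, either of these two addresses still routes to the serving service.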

    # curl http://tensorflow-serving:8501/v1/models/saved_model_half_plus_two_cpu
    {
     "model_version_status": [
      {
       "version": "123",
       "state": "AVAILABLE",
       "status": {
        "error_code": "OK",
        "error_message": ""
       }
      }
     ]
    }
    # curl http://172.17.91.182:8501/v1/models/saved_model_half_plus_two_cpu
    {
     "model_version_status": [
      {
       "version": "123",
       "state": "AVAILABLE",
       "status": {
        "error_code": "OK",
        "error_message": ""
       }
      }
     ]
    }

4 Summary

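Deploying TensorFlow Serving on Kubernetes is quite convenient today, and managing the models through S3 also makes it possible to roll out model updates without rebuilding the image.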

5 Reference
