This post walks through testing gpushare, the GPU virtualization solution released by Alibaba.
https://developer.aliyun.com/article/690623
gpu share scheduler extender https://github.com/AliyunContainerService/gpushare-scheduler-extender
gpu share device plugin
https://github.com/AliyunContainerService/gpushare-scheduler-extender/blob/master/docs/install.md
Download scheduler-policy-config. This must be done on every master:
```bash
cd /etc/kubernetes/
curl -O https://raw.githubusercontent.com/AliyunContainerService/gpushare-scheduler-extender/master/config/scheduler-policy-config.json
```
Note: 127.0.0.1:32766 is reachable only on machines where gpushare-scheduler is deployed, so change it to the Service's IP and port.
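One way to make that change is a small shell helper that rewrites the address in the downloaded policy file. This is only a sketch: `set_extender_addr` is a hypothetical name, and the Service name `gpushare-schd-extender`, the `kube-system` namespace, and port 12345 in the commented usage are assumptions to check against the deployed gpushare-schd-extender.yaml.

```bash
#!/usr/bin/env bash
# Hypothetical helper: swap the loopback address in a scheduler policy file
# for an address that every master can reach.
# Usage: set_extender_addr <policy-file> <host> <port>
set_extender_addr() {
  local file="$1" host="$2" port="$3"
  local tmp
  tmp="$(mktemp)" || return 1
  # '#' is the sed delimiter so the slashes in the URL need no escaping.
  # Writing back with cat keeps the original file's owner and mode.
  sed "s#127\.0\.0\.1:32766#${host}:${port}#g" "$file" > "$tmp" \
    && cat "$tmp" > "$file"
  local rc=$?
  rm -f "$tmp"
  return "$rc"
}

# Example (assumed Service name, namespace and port; run once the Service exists):
# SVC_IP="$(kubectl -n kube-system get svc gpushare-schd-extender -o jsonpath='{.spec.clusterIP}')"
# set_extender_addr /etc/kubernetes/scheduler-policy-config.json "$SVC_IP" 12345
```

The helper only touches the literal `127.0.0.1:32766`, so the rest of the policy file, including the URL path after the port, is left as downloaded.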
Deploy the scheduler-extender:
```bash
cd /tmp/
curl -O https://raw.githubusercontent.com/AliyunContainerService/gpushare-scheduler-extender/master/config/gpushare-schd-extender.yaml
kubectl create -f gpushare-schd-extender.yaml
```
Do the following on every master:
```bash
cp /etc/kubernetes/manifests/kube-scheduler.yaml .
# edit kube-scheduler.yaml
cp kube-scheduler.yaml /etc/kubernetes/manifests/kube-scheduler.yaml
```
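The edit itself is not spelled out here. Going by the install guide linked at the top, it appears to add a `--policy-config-file` flag pointing at /etc/kubernetes/scheduler-policy-config.json and to mount that file into the pod through a hostPath volume. The sketch below checks a manifest for those three pieces before it is copied back; `check_scheduler_manifest` is a hypothetical helper, and the exact flag and mount lines are assumptions to verify against the guide and your Kubernetes version.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check for the edited kube-scheduler manifest.
# Assumption: the edit adds the flag, a volumeMount and a hostPath volume
# that all reference the policy file below.
check_scheduler_manifest() {
  local manifest="$1"
  local policy="/etc/kubernetes/scheduler-policy-config.json"
  local needle
  for needle in \
    "--policy-config-file=${policy}" \
    "mountPath: ${policy}" \
    "path: ${policy}"; do
    # -F matches the string literally; -- stops option parsing for the flag.
    if ! grep -qF -- "$needle" "$manifest"; then
      echo "missing in ${manifest}: ${needle}" >&2
      return 1
    fi
  done
}

# Example:
# check_scheduler_manifest kube-scheduler.yaml \
#   && cp kube-scheduler.yaml /etc/kubernetes/manifests/kube-scheduler.yaml
```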
Deploy the device plugin:
```bash
wget https://raw.githubusercontent.com/AliyunContainerService/gpushare-device-plugin/master/device-plugin-rbac.yaml
kubectl create -f device-plugin-rbac.yaml
wget https://raw.githubusercontent.com/AliyunContainerService/gpushare-device-plugin/master/device-plugin-ds.yaml
kubectl create -f device-plugin-ds.yaml
```
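Label the GPU node:
```bash
# The trailing dash removes the existing nodeType label from the node
kubectl label node k8s-container-group-87-89 nodeType-
# Mark the node as a gpushare node
kubectl label node k8s-container-group-87-89 gpushare=true
```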
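With more than one GPU node, the labeling step can be wrapped in a loop. A minimal sketch: `label_gpushare_nodes` and the `KUBECTL` override variable are hypothetical names introduced here, and `--overwrite` is added so that an existing gpushare label with a different value is replaced rather than rejected.

```bash
#!/usr/bin/env bash
# KUBECTL can be overridden (for example with "echo") to preview the commands.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: apply the gpushare=true label to every node name given.
# Usage: label_gpushare_nodes <node> [<node> ...]
label_gpushare_nodes() {
  local node
  for node in "$@"; do
    "$KUBECTL" label node "$node" gpushare=true --overwrite || return 1
  done
}

# Example:
# label_gpushare_nodes k8s-container-group-87-89
```

Setting `KUBECTL=echo` before calling the function prints the commands instead of running them, which is a cheap dry run.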
Install the kubectl extension:
```bash
cd /usr/bin/
wget https://github.com/AliyunContainerService/gpushare-device-plugin/releases/download/v0.3.0/kubectl-inspect-gpushare
# Make the plugin binary executable so kubectl can run it
chmod u+x /usr/bin/kubectl-inspect-gpushare
```
Reference: https://github.com/AliyunContainerService/gpushare-scheduler-extender/blob/master/docs/userguide.md
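Going by that user guide, a pod asks for a slice of GPU memory through the `aliyun.com/gpu-mem` resource, and `kubectl inspect gpushare` shows the resulting allocation. The sketch below prints such a manifest; `gpushare_pod_manifest`, the pod name, and the image are placeholders introduced for illustration, and the unit of the requested amount depends on how the device plugin was deployed.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print a pod manifest that requests shared GPU memory.
# Usage: gpushare_pod_manifest <pod-name> <gpu-mem-amount>
gpushare_pod_manifest() {
  local name="$1" mem="$2"
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ${name}
spec:
  containers:
  - name: ${name}
    image: example.invalid/cuda-workload:latest  # placeholder image
    resources:
      limits:
        aliyun.com/gpu-mem: ${mem}
EOF
}

# Example:
# gpushare_pod_manifest gpushare-demo 3 | kubectl create -f -
# kubectl inspect gpushare
```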
Author: renbear
Copyright notice: unless otherwise stated, all posts on this blog are licensed under CC BY-NC 2.0. Please credit the source when reposting!