When we first introduced Tubi Data Runtime (TDR) [1], it was just a Python library, and early users liked its features. However, two issues had to be addressed before it could be used in production: 1. maintaining a native Python environment is not easy, even for data scientists who use Python every day; 2. with purely local workflows we cannot guarantee that everyone runs the same library versions, nor can people share links with each other, which makes effective collaboration extremely difficult. Until we address these pain points, we will not achieve our vision of everyone at Tubi being a data analyst.

An overview of TDR

Given that we have done a lot of our data analysis and model building in Jupyter notebooks, JupyterHub was a natural choice for turning TDR into a unified, scalable platform. On paper it looks simple enough: users visit a JupyterHub URL, launch their own dedicated Jupyter service, and happily work from there.

Simplified TDR architecture

But this simplified architecture glosses over several key questions:

  1. How do we authenticate and authorize users?

  2. How do we store and share notebooks?

  3. How do we access cross-VPC AWS resources such as S3, Redshift, and RDS?

  4. How do we customize JupyterHub to include TDR core features and extensions?

This article details how we solved these problems. We will also share Tubi’s experiments with advanced features such as deep learning, node affinity, and Kubernetes cluster autoscaling.

= = =

Basic features

AWS (Amazon Web Services) has been Tubi’s main cloud provider since the company’s inception, so kops [2], kubectl [3], and Helm [4] were the natural early choices for deploying Kubernetes applications. JupyterHub officially provides a Helm chart [5], which makes deploying a cloud-native JupyterHub quite simple. However, a number of issues still had to be resolved to turn it into a production-grade Tubi data platform.

Authentication/single sign-on

Authentication is an integral part of any service. The single sign-on (SSO [6]) provider Tubi has used for many years is Okta [7]. It supports OAuth 2.0 [8] and is well documented for developers, making OAuth easy to configure and use. Okta SSO can be enabled by adding the following lines to JupyterHub’s values.yaml [9] file.

auth:
  type: custom
  custom:
    className: oauthenticator.generic.GenericOAuthenticator
    config:
      login_service: "Okta"
      client_id: "{{ okta_client_id }}"
      client_secret: "{{ okta_client_secret }}"
      token_url: https://{{ tubi_okta_domain }}/oauth2/v1/token
      userdata_url: https://{{ tubi_okta_domain }}/oauth2/v1/userinfo
      userdata_method: GET
      userdata_params: {'state': 'state'}
      username_key: preferred_username

Okta SSO configuration

With Okta SSO in place, we can simply click the TDR icon in Okta to log in to the TDR page.

TDR on Okta

Deep customization of Docker images

The single-user [10] image used in the official JupyterHub Helm chart is the standard JupyterLab image. To add our own TDR core functionality and JupyterLab extensions to the single-user image, we needed to build our own Docker image. We use Alpine Linux [11] as the base image, which makes our image noticeably smaller and leaner than the official one. Earlier versions were based on Ubuntu and had more layers than the current version; after cutting the number of layers from 48 to 21 and switching the base image to Alpine, the image size was reduced by 50%. It is worth noting that Alpine uses a different package manager, apk [12], whereas most people are probably more familiar with apt/apt-get or yum.

Shared storage

JupyterHub allows users to log in and out at any time, and a user may be assigned a different Pod each time. The Pods themselves are stateless, so notebooks have to be stored and loaded from somewhere persistent; for this we use AWS EFS [13] as central storage. In each user’s Jupyter Pod we mount two separate directories: 1) a personal directory named after the user’s SSO login; 2) a shared directory that everyone can use. In practice, we found that AWS EFS is not a perfect solution:

  1. Lack of version control. Notebooks have no history, so they cannot be rolled back to an earlier version.

  2. Lack of fine-grained permission control. A notebook cannot be shared read-only; everyone has the same read and write permissions.

As mentioned above, each Pod has two mount directories, private and shared. As shown in the official guide [14], mounting a single persistent volume (PV [15]) is easy. In our case, however, the same persistent volume must be mounted multiple times, with different paths mounted into different directories. A thread [16] on GitHub describes and discusses this problem in more detail. After some experimentation, we arrived at the following scheme for mounting the same persistent volume multiple times into the same Pod [17]:

singleuser:
  storage:
    homeMountPath: '/home/tubi/notebooks/{username}'
    type: "static"
    static:
      pvcName: "efs-persist"
      subPath: 'home/{username}'
    extraVolumeMounts:
      - name: home
        mountPath: /home/tubi/notebooks/shared
        subPath: shared/notebooks

Storage mount configuration

extraVolumeMounts in the configuration above is an array, so it can mount as many EFS directories as needed into the same Pod. Two points are worth noting: 1) there is no need to declare extraVolumes [18] again under storage; we simply reuse the persistent volume that is already defined; 2) the name of the extra volume is home [19], as can be seen from the Volumes section in the output of “kubectl describe pod”.

Volumes:
  home:
    Type:        PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:   efs-persist
    ReadOnly:    false

The Volumes section in the Pod description
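
For reference, the efs-persist claim used above can be backed by a statically provisioned, NFS-style persistent volume that points at the EFS filesystem. The manifest below is only a minimal sketch rather than our exact setup: the filesystem ID, region, namespace, and storage size are placeholders (EFS does not actually enforce the requested size).

apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-persist
spec:
  capacity:
    storage: 100Gi                    # required by the API; not enforced for NFS/EFS
  accessModes:
    - ReadWriteMany                   # EFS can be mounted by many Pods at once
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: <efs-filesystem-id>.efs.<region>.amazonaws.com   # placeholder EFS DNS name
    path: /
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-persist
  namespace: <jupyterhub-namespace>   # placeholder: namespace of the JupyterHub release
spec:
  storageClassName: ""                # skip dynamic provisioning
  volumeName: efs-persist             # bind to the static PV above
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Gi

Illustrative EFS PersistentVolume and PersistentVolumeClaim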

Running multiple Jupyter services for the same user

By default, only one default single-user service is started per user. Sometimes, however, we need to run multiple single-user services at the same time. Setting “c.JupyterHub.allow_named_servers = True” lets us manage both the default service and additional named services (see the configuration sketch below). Two things to note: 1) when multiple pages open the same service, later pages are automatically redirected to the newly generated namespace; 2) when several services are open and you share a link, make sure the recipient has a named service with the same name, otherwise they will hit a page that cannot be opened.

Named and default services
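
In the zero-to-jupyterhub Helm chart, this JupyterHub option can be set through hub.extraConfig, roughly as sketched below. The snippet key is arbitrary, and depending on the chart version extraConfig may be a single string rather than a map; recent chart versions also expose a dedicated hub.allowNamedServers option.

hub:
  extraConfig:
    # arbitrary key; the value is plain JupyterHub Python configuration
    namedServers: |
      c.JupyterHub.allow_named_servers = True

Enabling named servers (sketch)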

In addition to the features described above, we also had to solve VPC peering, access to resources in other AWS accounts (such as Redshift, S3, and RDS), and access to common internal services. This is because TDR runs in a sub-account of the main Tubi account and is deployed in a separate VPC that contains no data or services of its own.

By now, the TDR architecture has become more complete.

TDR architecture with basic features

= = =

Advanced features

Deep learning

GPU

In addition to the basic features described earlier, we need some dedicated tooling to support Tubi’s internal deep learning applications. Building TensorFlow and PyTorch is complicated and, more importantly, dramatically increases the image size, while most users do not need these libraries at all, so we build a separate image for deep learning. Both libraries depend on CUDA [20] and cuDNN [21], which in turn depend on glibc [22]. Because Alpine ships musl, a different libc implementation [23], we had to abandon Alpine for this image.

GPU devices are essential for training deep learning models. For a Kubernetes cluster to use GPUs, we need the NVIDIA device plugin [24]. To understand how the device plugin is implemented and works, especially how it exposes GPU devices for Pods to use, see the official Kubernetes documentation [25] and the plugin’s source code [26]. The images mentioned in the kops plugin guide [27] use older NVIDIA drivers and CUDA libraries (v9.1), but recent deep learning frameworks such as TensorFlow (2.0+) and PyTorch (1.4+) require CUDA v10.0+. To that end, we made a modest contribution to the kops community and published a device plugin image based on this change on Docker Hub [28] for anyone who needs it. At this point we have two Docker images, CPU and GPU, that users can choose between when logging in to TDR.

Login selection page

To achieve this, you need to add a little extra configuration.

singleuser:
  profileList:
    - display_name: "Default"
      description: |
        Tubi Data Runtime
      default: True
      kubespawner_override:
        image: <private-docker-registry>/tubi-data-runtime
        extra_resource_limits:
          nvidia.com/gpu: "0"
    - display_name: "GPU"
      description: |
        Tubi Data Runtime with GPU Support. 1 GPU ONLY.
      kubespawner_override:
        image: <private-docker-registry>/tubi-data-runtime-gpu
        extra_resource_limits:
          nvidia.com/gpu: "1"

Login selection configuration

TensorBoard

While developing deep learning models, we ran into a TensorBoard rendering problem: after executing “%tensorboard” in a notebook cell, the page went blank. The reason is that TensorBoard starts a local server listening on port 6006 by default, but that port cannot be reached directly through the Jupyter service at an address like https://tdr-domain:6006. Luckily, Jupyter Server Proxy [29] lets us run any process locally and access it through a link of the form “https://tdr-domain/hub/user-redirect/proxy/{{proxy}}/” (note: the trailing slash in this URL is required, even though the official documentation [30] omits it). Our workaround is a patch: by modifying a few lines [31] in tensorboard/notebook.py, the cell command “%tensorboard” renders TensorBoard into the current cell by default.

diff --git a/tensorboard/notebook.py b/tensorboard/notebook.py
index fe0e13aa..ab774377 100644
--- a/tensorboard/notebook.py
+++ b/tensorboard/notebook.py
@@ -378,8 +378,17 @@ def _display_ipython(port, height, display_handle):
   <script>
     (function() {
       const frame = document.getElementById(%JSON_ID%);
-      const url = new URL("/", window.location);
-      url.port = %PORT%;
+      var baseUrl = "/";
+      try {
+        // parse baseUrl in JupyterLab
+        baseUrl = JSON.parse(document.getElementById('jupyter-config-data').text || '').baseUrl;
+      } catch {
+        try {
+          // parse baseUrl in classic Jupyter
+          baseUrl = $('body').data('baseUrl');
+        } catch {}
+      }
+      const url = new URL(baseUrl, window.location) + "proxy/%PORT%/";
       frame.src = url;
     })();
   </script>

TensorBoard patch

The following figure shows the result of running the official TensorBoard example [32] after applying our patch.

TensorBoard in TDR

Node affinity

As described earlier, TDR offers both CPU and GPU versions of the Jupyter notebook service. In the deep learning section we did not set any node affinity rules, which led to messy scheduling: CPU Pods could run on either CPU or GPU nodes, while GPU Pods could only run on GPU nodes. Consider a typical everyday scenario at Tubi: most users use CPU Pods, and some of these Pods inevitably get scheduled onto the GPU node; a new GPU Pod may then fail to start because the GPU node’s resources are already occupied by CPU Pods. So we added node_affinity_preferred [33] and node_affinity_required configurations (which are essentially aliases for the corresponding Kubernetes concepts [34]); the two instance groups we use are named cpu and gpu, respectively.

singleuser:
  profileList:
    - display_name: "Default"
      description: |
        Tubi Data Runtime
      default: True
      kubespawner_override:
        image: <private-docker-registry>/tubi-data-runtime
        node_affinity_preferred:
          - weight: 1
            preference:
              matchExpressions:
              - key: kops.k8s.io/instancegroup
                operator: NotIn
                values:
                - gpu
        extra_resource_limits:
          nvidia.com/gpu: "0"
    - display_name: "GPU"
      description: |
        Tubi Data Runtime with GPU Support. 1 GPU ONLY.
      kubespawner_override:
        image: <private-docker-registry>/tubi-data-runtime-gpu
        node_affinity_required:
        - matchExpressions:
          - key: kops.k8s.io/instancegroup
            operator: In
            values:
            - gpu
        extra_resource_limits:
          nvidia.com/gpu: "1"

Node affinity configuration

The node affinity configured above means that CPU Pods prefer CPU nodes, while GPU Pods must be scheduled onto GPU nodes. We can also configure multiple rules for more precise control over node placement.
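
The kops.k8s.io/instancegroup label matched above is applied to nodes automatically by kops, based on the instance group name. For illustration, a GPU instance group could be declared roughly as follows; the machine type, AMI, and sizes are placeholders rather than our actual values.

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  name: gpu                           # surfaces on nodes as kops.k8s.io/instancegroup=gpu
  labels:
    kops.k8s.io/cluster: <cluster-name>
spec:
  role: Node
  machineType: p2.xlarge              # placeholder GPU instance type
  image: <ami-with-nvidia-drivers>    # placeholder AMI
  minSize: 1
  maxSize: 2

Illustrative GPU instance group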

Automatic cluster scaling

Before we introduced cluster autoscaling, we often ran into warnings about insufficient resources; we then had to increase minSize in the InstanceGroup [35] configuration, redeploy, and wait for the new nodes to become ready. This approach has two drawbacks:

  1. Higher cost. The extra nodes are usually added in response to temporary bursts of user demand, after which they sit underutilized most of the time.

  2. Manual scale-down may destroy running Pods. Once a node is selected for removal, the Pods running on it are forcibly terminated as well.

With manual scaling, the compromise is to get nodes ready before peak demand arrives, at the cost of idle machines and wasted resources during off-peak hours.

Pod usage pattern

Many cloud providers support the Kubernetes Cluster Autoscaler [36]; Tubi naturally uses the AWS version [37] with the auto-discovery setup [38]. To get cluster autoscaling working on AWS, we also had to do a few things:

  1. Grant all nodes the IAM permissions required for cluster autoscaling [39]

  2. Add the k8s.io/cluster-autoscaler/enabled and k8s.io/cluster-autoscaler/<cluster name> tags to all instance groups (see the sketch after this list)

  3. Change the SSL certs [40] path to /etc/ssl/certs/ca-certificates.crt; this issue is discussed here [41]
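
For step 2, with kops the tags can be attached to each instance group’s Auto Scaling group through cloudLabels in the InstanceGroup spec, roughly as below. Auto-discovery only inspects the tag keys, so the empty values and the cluster name placeholder are illustrative.

spec:
  cloudLabels:
    k8s.io/cluster-autoscaler/enabled: ""
    k8s.io/cluster-autoscaler/<cluster-name>: ""

Cluster Autoscaler auto-discovery tags (sketch)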

With cluster autoscaling and node affinity in place, our machine cost was reduced by 50%. The small Alpine-based images we built earlier also pay off here: when a scale-up is triggered, a new node can be launched, become ready, and pull the image within about two minutes.

Monitoring

Prometheus [42] is the de facto monitoring solution in the Kubernetes ecosystem. Using kube-prometheus [43], we quickly deployed a monitoring stack for TDR. For security reasons, the services included in it, such as the Kubernetes Dashboard [44], Grafana [45], Prometheus [46], and Alertmanager [47], are accessible only from internal company IP addresses. Grafana uses Prometheus as its data source and ships with a number of namespace-, node-, and Pod-related dashboards. We also built a TDR dashboard that monitors the Pods and nodes of JupyterHub.

TDR monitoring (showing how many Pods are running, when they were started, how long they have been running, and so on)
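
As an illustration of where such dashboard data can come from, the Pod counts and start times shown above are available from kube-state-metrics, which kube-prometheus deploys by default. The snippet below sketches how they could be captured as Prometheus recording rules; the rule names, namespaces, and selector labels are assumptions that would need to match the actual kube-prometheus setup, not our exact configuration.

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: tdr-pods
  namespace: monitoring               # assumed namespace of the kube-prometheus stack
  labels:
    prometheus: k8s                   # assumed labels matching the Prometheus ruleSelector
    role: alert-rules
spec:
  groups:
    - name: tdr.rules
      rules:
        # number of running Pods in the JupyterHub namespace (from kube-state-metrics)
        - record: tdr:running_pods:count
          expr: sum(kube_pod_status_phase{namespace="<jupyterhub-namespace>", phase="Running"})
        # how long each Pod has been running, in seconds
        - record: tdr:pod_uptime:seconds
          expr: time() - kube_pod_start_time{namespace="<jupyterhub-namespace>"}

Prometheus recording rules for the TDR dashboard (sketch)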

At this point, our TDR architecture becomes clearer and richer with the addition of basic and advanced features.

TDR architecture

= = =

Summary

TDR is only the first step in our long journey toward a unified data platform within the company. In the future, we plan to address some of the larger issues that remain today, such as fine-grained permission control, task scheduling, reproducibility, model tuning, and even model serving. Because we want people who are not technical experts to be able to use the platform, creating an easy-to-use and seamless user experience for them will be one of our priorities.

In the Tubi engineering team, we are always looking for and solving challenging problems. We advocate proactive thinking and value ideas and solutions that will have a profound impact on the company. If you are also interested in building a data platform or cloud native application, come and join us!

Author: Huihua Zhang, Tubi Data Engineer


  • [1] code.tubitv.com/introducing…

  • [2] github.com/kubernetes/…

  • [3] kubernetes.io/docs/refere…

  • [4] helm.sh/

  • [5] github.com/jupyterhub/…

  • [6] code.tubitv.com/securing-aw…

  • [7] www.okta.com/

  • [8] developer.okta.com/docs/refere…

  • [9] zero-to-jupyterhub.readthedocs.io/en/latest/s…

  • [10] github.com/jupyterhub/…

  • [11] alpinelinux.org/

  • [12] wiki.alpinelinux.org/wiki/Alpine…

  • [13] aws.amazon.com/efs/

  • [14] zero-to-jupyterhub.readthedocs.io/en/latest/a…

  • [15] kubernetes.io/docs/concep…

  • [16] github.com/jupyterhub/…

  • [17] github.com/jupyterhub/…

  • [18] github.com/jupyterhub/…

  • [19] github.com/jupyterhub/…

  • [20] developer.nvidia.com/cuda-downlo…

  • [21] developer.nvidia.com/cudnn

  • [22] www.gnu.org/software/li…

  • [23] pkgs.alpinelinux.org/package/edg…

  • [24] github.com/NVIDIA/k8s-…

  • [25] kubernetes.io/docs/concep…

  • [26] github.com/NVIDIA/k8s-…

  • [27] github.com/kubernetes/…

  • [28] hub.docker.com/layers/fifa…

  • [29] jupyter-server-proxy.readthedocs.io/en/latest/

  • [30] jupyter-server-proxy.readthedocs.io/en/latest/a…

  • [31] github.com/tensorflow/…

  • [32] www.tensorflow.org/tensorboard…

  • [33] jupyterhub-kubespawner.readthedocs.io/en/latest/o…

  • [34] kubernetes.io/docs/concep…

  • [35] github.com/kubernetes/…

  • [36] github.com/kubernetes/…

  • [37] github.com/kubernetes/…

  • [38] github.com/kubernetes/…

  • [39] github.com/kubernetes/…

  • [40] github.com/kubernetes/…

  • [41] github.com/kubernetes/…

  • [42] prometheus.io/

  • [43] github.com/coreos/kube…

  • [44] github.com/kubernetes/…

  • [45] grafana.com/

  • [46] github.com/prometheus/…

  • [47] github.com/prometheus/…