This document is a place for brainstorming the design of the Model and Dataset CRDs. The general goal is to design reusable CRDs that can be shared by various higher-level machine learning tasks and frameworks.

Goals

Metadata of dataset and model objects.
Used by the EdgeAI features.
What do the CRD controllers do? Define the exact responsibilities of the Model and Dataset CRDs and controllers.
How will higher-level tasks, e.g. federated learning and model serving, use the services provided by the Model and Dataset CRDs?

Non-goals

The actual format of the AI dataset, such as ImageNet, COCO or TFRecord.
The actual format of the AI model, such as ckpt or saved_model of TensorFlow.
The actual operations on the AI dataset, such as shuffle or crop.
The actual operations on the AI model, such as training or inference.

We propose using Kubernetes Custom Resource Definitions (CRDs) to describe the specification and status of datasets and models, and a controller to synchronize them between edge and cloud.

Use Cases

Users can create a dataset resource by providing the dataset URL, the format, and the nodeName of the node that owns the dataset.
Users can create a model resource by providing the model URL and format.
Users can view the information of a dataset or model.
Users can delete a dataset or model.

Design Details

CRD API Group and Version

The Dataset and Model CRDs will be namespace-scoped. The tables below summarize the group, kind and API version details for the CRDs.

Dataset

Field	Description
Group	edgeai.io
APIVersion	v1alpha1
Kind	Dataset

Model

Field	Description
Group	edgeai.io
APIVersion	v1alpha1
Kind	Model
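
As a small illustration of how the values in the two tables fit together, the sketch below derives the apiVersion string used by custom objects and the CRD object name (which Kubernetes requires to be of the form plural.group). The gvk type and its helper methods are hypothetical names introduced only for this example, and the plural names are taken from the CRD manifests in the next section.

```go
package main

import "fmt"

// gvk is a hypothetical helper type holding the group, version and kind from
// the tables above, plus the plural resource name used by the CRD manifests.
type gvk struct {
	Group   string
	Version string
	Kind    string
	Plural  string
}

// apiVersion returns the value for the apiVersion field of a custom object.
func (g gvk) apiVersion() string {
	return g.Group + "/" + g.Version
}

// crdName returns the metadata.name of the CRD itself: <plural>.<group>.
func (g gvk) crdName() string {
	return g.Plural + "." + g.Group
}

func main() {
	for _, g := range []gvk{
		{Group: "edgeai.io", Version: "v1alpha1", Kind: "Dataset", Plural: "datasets"},
		{Group: "edgeai.io", Version: "v1alpha1", Kind: "Model", Plural: "models"},
	} {
		fmt.Printf("%s: apiVersion=%s crd=%s\n", g.Kind, g.apiVersion(), g.crdName())
	}
}
```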

CRDs

Dataset CRD

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: datasets.edgeai.io
spec:
  group: edgeai.io
  names:
    kind: Dataset
    plural: datasets
  scope: Namespaced
  versions:
  - name: v1alpha1
    subresources:
      # status enables the status subresource.
      status: {}
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              dataUrl:
                type: string
              format:
                type: string
              nodeName:
                type: string
          status:
            type: object
            properties:
              numberOfSamples:
                type: integer
              updateTime:
                type: string
                format: date-time
    additionalPrinterColumns:
    - name: NumberOfSamples
      type: integer
      description: The number of samples in the dataset
      jsonPath: ".status.numberOfSamples"
    - name: Node
      type: string
      description: The node name of the dataset
      jsonPath: ".spec.nodeName"
    - name: spec
      type: string
      description: The spec of the dataset
      jsonPath: ".spec"

Model CRD

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: models.edgeai.io
spec:
  group: edgeai.io
  names:
    kind: Model
    plural: models
  scope: Namespaced
  versions:
    - name: v1alpha1
      subresources:
        # status enables the status subresource.
        status: {}
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                modelUrl:
                  type: string
                format:
                  type: string
            status:
              type: object
              properties:
                updateTime:
                  type: string
                  format: date-time
                metrics:
                  type: array
                  items:
                    type: object
                    properties:
                      key:
                        type: string
                      value:
                        type: string

      additionalPrinterColumns:
        - name: updateAGE
          type: date
          description: The update age
          jsonPath: ".status.updateTime"
        - name: metrics
          type: string
          description: The metrics
          jsonPath: ".status.metrics"

CRD type definition

Dataset

// +genclient
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object

type Dataset struct {
        metav1.TypeMeta `json:",inline"`

        metav1.ObjectMeta `json:"metadata,omitempty"`

        Spec   DatasetSpec   `json:"spec"`
        Status DatasetStatus `json:"status"`
}

type DatasetSpec struct {
        DataUrl  string `json:"dataUrl"`
        Format   string `json:"format"`
        NodeName string `json:"nodeName"`
}

type DatasetStatus struct {
        UpdateTime      *metav1.Time `json:"updateTime,omitempty"`
        NumberOfSamples int          `json:"numberOfSamples"`
}

// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object

type DatasetList struct {
        metav1.TypeMeta `json:",inline"`
        metav1.ListMeta `json:"metadata"`

        Items []Dataset `json:"items"`
}

Model

// +genclient
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object

type Model struct {
        metav1.TypeMeta `json:",inline"`

        metav1.ObjectMeta `json:"metadata,omitempty"`

        Spec   ModelSpec   `json:"spec"`
        Status ModelStatus `json:"status"`
}

type ModelSpec struct {
        ModelUrl string `json:"modelUrl"`
        Format   string `json:"format"`
}

type ModelStatus struct {
        UpdateTime *metav1.Time  `json:"updateTime,omitempty"`
        Metrics    []ModelMetric `json:"metrics,omitempty"`
}

type ModelMetric struct {
        Key   string `json:"key"`
        Value string `json:"value"`
}

// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object

type ModelList struct {
        metav1.TypeMeta `json:",inline"`
        metav1.ListMeta `json:"metadata"`

        Items []Model `json:"items"`
}
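
ModelMetric stores both the key and the value as strings, so numeric results have to be formatted before they are written to the status. The following is a minimal sketch of one way this could be done; toModelMetrics is a hypothetical helper, not part of the proposal, and the metric names and values in main are made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

// ModelMetric mirrors the type defined above.
type ModelMetric struct {
	Key   string `json:"key"`
	Value string `json:"value"`
}

// toModelMetrics is a hypothetical helper. It converts numeric results into
// the string key/value pairs used by ModelStatus.Metrics, sorted by key so
// that the result is deterministic.
func toModelMetrics(results map[string]float64) []ModelMetric {
	keys := make([]string, 0, len(results))
	for k := range results {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	metrics := make([]ModelMetric, 0, len(keys))
	for _, k := range keys {
		metrics = append(metrics, ModelMetric{
			Key: k,
			// 'f' with precision -1 uses the shortest exact decimal form.
			Value: strconv.FormatFloat(results[k], 'f', -1, 64),
		})
	}
	return metrics
}

func main() {
	// Illustrative values only.
	results := map[string]float64{"precision": 0.9, "recall": 0.75}
	for _, m := range toModelMetrics(results) {
		fmt.Printf("%s=%s\n", m.Key, m.Value)
	}
}
```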

CRD samples

Dataset

apiVersion: edgeai.io/v1alpha1
kind: Dataset
metadata:
  name: "dataset-examp"
spec:
  dataUrl: "/code/data"
  format: "txt"
  nodeName: "edge0"
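
To show how the JSON tags in the type definition map manifest fields onto Go fields, the sketch below decodes a JSON rendering of the Dataset sample above. It is an illustration under assumptions: the types are trimmed-down local copies in which metav1.TypeMeta and metav1.ObjectMeta are replaced by plain fields, so that only the standard library is needed, and decodeDataset is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// objectMeta is a simplified stand-in for metav1.ObjectMeta (assumption).
type objectMeta struct {
	Name string `json:"name"`
}

type datasetSpec struct {
	DataUrl  string `json:"dataUrl"`
	Format   string `json:"format"`
	NodeName string `json:"nodeName"`
}

type datasetStatus struct {
	NumberOfSamples int `json:"numberOfSamples"`
}

// dataset is a trimmed-down copy of the Dataset type for illustration.
type dataset struct {
	APIVersion string        `json:"apiVersion"`
	Kind       string        `json:"kind"`
	Metadata   objectMeta    `json:"metadata"`
	Spec       datasetSpec   `json:"spec"`
	Status     datasetStatus `json:"status"`
}

// decodeDataset parses a JSON rendering of a Dataset object.
func decodeDataset(raw []byte) (dataset, error) {
	var d dataset
	err := json.Unmarshal(raw, &d)
	return d, err
}

func main() {
	// The Dataset sample above, written as JSON.
	raw := []byte(`{
	  "apiVersion": "edgeai.io/v1alpha1",
	  "kind": "Dataset",
	  "metadata": {"name": "dataset-examp"},
	  "spec": {"dataUrl": "/code/data", "format": "txt", "nodeName": "edge0"}
	}`)

	d, err := decodeDataset(raw)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	// The status is empty until the edge node reports it.
	fmt.Printf("%s %q on node %s, samples=%d\n",
		d.Kind, d.Metadata.Name, d.Spec.NodeName, d.Status.NumberOfSamples)
}
```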

Model

apiVersion: edgeai.io/v1alpha1
kind: Model
metadata:
  name: model-examp
spec:
  modelUrl: "/model/frozen.pb"
  format: pb

Controller Design

The current design has a controller for the Dataset CRD but none for the Model CRD.

The dataset controller synchronizes dataset objects between the cloud and the edge:

downstream: synchronize the dataset info from the cloud to the edge node.
upstream: synchronize the dataset status from the edge node to the cloud, such as the number of samples in the dataset.
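
The two directions can be sketched roughly as follows. This is only an in-memory illustration under assumptions, not the controller implementation: the types are simplified stand-ins for the Dataset types defined earlier, statusReport is a hypothetical message format, datasets are matched by name only (namespaces are ignored), and the second dataset and the sample count in main are made-up values.

```go
package main

import "fmt"

// Simplified stand-ins for the Dataset types defined earlier.
type DatasetSpec struct {
	DataUrl  string
	Format   string
	NodeName string
}

type DatasetStatus struct {
	NumberOfSamples int
}

type Dataset struct {
	Name   string
	Spec   DatasetSpec
	Status DatasetStatus
}

// downstream returns the datasets that should be delivered to the given
// edge node, i.e. those whose spec.nodeName matches.
func downstream(cloud []Dataset, nodeName string) []Dataset {
	var out []Dataset
	for _, d := range cloud {
		if d.Spec.NodeName == nodeName {
			out = append(out, d)
		}
	}
	return out
}

// statusReport is a hypothetical message sent from an edge node to the cloud.
type statusReport struct {
	DatasetName     string
	NumberOfSamples int
}

// upstream applies a status report from the edge to the cloud-side copy and
// reports whether a matching dataset was found.
func upstream(cloud []Dataset, r statusReport) bool {
	for i := range cloud {
		if cloud[i].Name == r.DatasetName {
			cloud[i].Status.NumberOfSamples = r.NumberOfSamples
			return true
		}
	}
	return false
}

func main() {
	cloud := []Dataset{
		{Name: "dataset-examp", Spec: DatasetSpec{DataUrl: "/code/data", Format: "txt", NodeName: "edge0"}},
		// Made-up second entry to show filtering by node.
		{Name: "other", Spec: DatasetSpec{DataUrl: "/other", Format: "txt", NodeName: "edge1"}},
	}

	for _, d := range downstream(cloud, "edge0") {
		fmt.Println("send to edge0:", d.Name)
	}

	// The sample count is an illustrative value.
	if upstream(cloud, statusReport{DatasetName: "dataset-examp", NumberOfSamples: 120}) {
		fmt.Println("cloud status updated:", cloud[0].Status.NumberOfSamples)
	}
}
```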

Here is the flow of dataset creation:

...
