Commit e71727d

hbelmiro authored and mholder6 committed
chore(components): Added kfp_deploy_model_to_kserve_demo
Signed-off-by: Helber Belmiro <[email protected]>

docs(backend): improved backend README (kubeflow#11511)

* improved backend README

Signed-off-by: Daniel Dowler <[email protected]>

* Update backend/README.md

Co-authored-by: Helber Belmiro <[email protected]>
Signed-off-by: Daniel Dowler <[email protected]>

* Update backend/README.md

Co-authored-by: Helber Belmiro <[email protected]>
Signed-off-by: Daniel Dowler <[email protected]>

* Update backend/README.md

Co-authored-by: Helber Belmiro <[email protected]>
Signed-off-by: Daniel Dowler <[email protected]>

* Update backend/README.md

Co-authored-by: Helber Belmiro <[email protected]>
Signed-off-by: Daniel Dowler <[email protected]>

---------

Signed-off-by: Daniel Dowler <[email protected]>
Co-authored-by: Helber Belmiro <[email protected]>

fix(CI): Use the correct image registry for replacements in integration tests (kubeflow#11564)

* Use the correct image registry for replacements in integration tests

The image registry was changed to GitHub Container Registry in the 2.4 release.

Signed-off-by: mprahl <[email protected]>

* Print the pod logs when the pods fail to start in integration tests

Signed-off-by: mprahl <[email protected]>

* Fix the sample compilation in the API server container build

Signed-off-by: mprahl <[email protected]>

* Show the output when building the container images in CI

Signed-off-by: mprahl <[email protected]>

---------

Signed-off-by: mprahl <[email protected]>

feat(api): Add SemaphoreKey and MutexName fields to proto (kubeflow#11384)

Signed-off-by: ddalvi <[email protected]>
1 parent 1234c8d commit e71727d

21 files changed: +1193 -62 lines changed

.github/resources/manifests/argo/kustomization.yaml

Lines changed: 3 additions & 3 deletions
@@ -5,13 +5,13 @@ resources:
   - ../../../../manifests/kustomize/env/platform-agnostic

 images:
-  - name: gcr.io/ml-pipeline/api-server
+  - name: ghcr.io/kubeflow/kfp-api-server
     newName: kind-registry:5000/apiserver
     newTag: latest
-  - name: gcr.io/ml-pipeline/persistenceagent
+  - name: ghcr.io/kubeflow/kfp-persistence-agent
     newName: kind-registry:5000/persistenceagent
     newTag: latest
-  - name: gcr.io/ml-pipeline/scheduledworkflow
+  - name: ghcr.io/kubeflow/kfp-scheduled-workflow-controller
     newName: kind-registry:5000/scheduledworkflow
     newTag: latest

.github/resources/manifests/tekton/kustomization.yaml

Lines changed: 3 additions & 3 deletions
@@ -14,13 +14,13 @@ resources:
   # when application is deleted.

 images:
-  - name: gcr.io/ml-pipeline/api-server
+  - name: ghcr.io/kubeflow/kfp-api-server
     newName: kind-registry:5000/apiserver
     newTag: latest
-  - name: gcr.io/ml-pipeline/persistenceagent
+  - name: ghcr.io/kubeflow/kfp-persistence-agent
     newName: kind-registry:5000/persistenceagent
     newTag: latest
-  - name: gcr.io/ml-pipeline/scheduledworkflow
+  - name: ghcr.io/kubeflow/kfp-scheduled-workflow-controller
     newName: kind-registry:5000/scheduledworkflow
     newTag: latest
   - name: '*/aipipeline/tekton-exithandler-controller'
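
Review note: both kustomization changes rely on the `images` entries matching by `name`. The sketch below is a simplified Python model of that matching, not kustomize's implementation; it ignores digests, and the `2.4.0` tag is a made-up value for illustration. It shows why an override keyed on the old `gcr.io` name leaves a `ghcr.io` image reference untouched.

```python
def split_image(ref):
    """Split 'name:tag' into (name, tag); a colon in a registry port is not a tag."""
    name, sep, tag = ref.rpartition(":")
    if not sep or "/" in tag:
        return ref, ""
    return name, tag


def apply_overrides(ref, overrides):
    """Rewrite ref using the first override whose 'name' equals the image name."""
    name, tag = split_image(ref)
    for rule in overrides:
        if rule["name"] == name:
            name = rule.get("newName", name)
            tag = rule.get("newTag", tag)
            break
    return f"{name}:{tag}" if tag else name


# Override as it was before this commit, and as it is after.
OLD_RULES = [{"name": "gcr.io/ml-pipeline/api-server",
              "newName": "kind-registry:5000/apiserver", "newTag": "latest"}]
NEW_RULES = [{"name": "ghcr.io/kubeflow/kfp-api-server",
              "newName": "kind-registry:5000/apiserver", "newTag": "latest"}]

# Hypothetical image reference as a manifest might carry it after the registry move.
MANIFEST_IMAGE = "ghcr.io/kubeflow/kfp-api-server:2.4.0"

print(apply_overrides(MANIFEST_IMAGE, OLD_RULES))  # unchanged: no rule matches
print(apply_overrides(MANIFEST_IMAGE, NEW_RULES))  # rewritten to the kind registry
```

With the old rule the locally built image would never be substituted in, which is consistent with the commit message's "Use the correct image registry for replacements in integration tests".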

.github/resources/scripts/build-images.sh

Lines changed: 5 additions & 5 deletions
@@ -25,35 +25,35 @@ EXIT_CODE=0

 docker system prune -a -f

-docker build -q -t "${REGISTRY}/apiserver:${TAG}" -f backend/Dockerfile . && docker push "${REGISTRY}/apiserver:${TAG}" || EXIT_CODE=$?
+docker build --progress=plain -t "${REGISTRY}/apiserver:${TAG}" -f backend/Dockerfile . && docker push "${REGISTRY}/apiserver:${TAG}" || EXIT_CODE=$?
 if [[ $EXIT_CODE -ne 0 ]]
 then
   echo "Failed to build apiserver image."
   exit $EXIT_CODE
 fi

-docker build -q -t "${REGISTRY}/persistenceagent:${TAG}" -f backend/Dockerfile.persistenceagent . && docker push "${REGISTRY}/persistenceagent:${TAG}" || EXIT_CODE=$?
+docker build --progress=plain -t "${REGISTRY}/persistenceagent:${TAG}" -f backend/Dockerfile.persistenceagent . && docker push "${REGISTRY}/persistenceagent:${TAG}" || EXIT_CODE=$?
 if [[ $EXIT_CODE -ne 0 ]]
 then
   echo "Failed to build persistenceagent image."
   exit $EXIT_CODE
 fi

-docker build -q -t "${REGISTRY}/scheduledworkflow:${TAG}" -f backend/Dockerfile.scheduledworkflow . && docker push "${REGISTRY}/scheduledworkflow:${TAG}" || EXIT_CODE=$?
+docker build --progress=plain -t "${REGISTRY}/scheduledworkflow:${TAG}" -f backend/Dockerfile.scheduledworkflow . && docker push "${REGISTRY}/scheduledworkflow:${TAG}" || EXIT_CODE=$?
 if [[ $EXIT_CODE -ne 0 ]]
 then
   echo "Failed to build scheduledworkflow image."
   exit $EXIT_CODE
 fi

-docker build -q -t "${REGISTRY}/driver:${TAG}" -f backend/Dockerfile.driver . && docker push "${REGISTRY}/driver:${TAG}" || EXIT_CODE=$?
+docker build --progress=plain -t "${REGISTRY}/driver:${TAG}" -f backend/Dockerfile.driver . && docker push "${REGISTRY}/driver:${TAG}" || EXIT_CODE=$?
 if [[ $EXIT_CODE -ne 0 ]]
 then
   echo "Failed to build driver image."
   exit $EXIT_CODE
 fi

-docker build -q -t "${REGISTRY}/launcher:${TAG}" -f backend/Dockerfile.launcher . && docker push "${REGISTRY}/launcher:${TAG}" || EXIT_CODE=$?
+docker build --progress=plain -t "${REGISTRY}/launcher:${TAG}" -f backend/Dockerfile.launcher . && docker push "${REGISTRY}/launcher:${TAG}" || EXIT_CODE=$?
 if [[ $EXIT_CODE -ne 0 ]]
 then
   echo "Failed to build launcher image."
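
Review note: the five stanzas visible in this hunk share one build-then-push pattern. The Python sketch below models that pattern for illustration only; `IMAGES`, `build_and_push_commands`, and `run_all` are hypothetical names, and the script itself remains plain bash. The `run` parameter is injectable so the flow can be exercised without Docker.

```python
import subprocess

# Image name -> Dockerfile, as listed in the hunk above.
IMAGES = {
    "apiserver": "backend/Dockerfile",
    "persistenceagent": "backend/Dockerfile.persistenceagent",
    "scheduledworkflow": "backend/Dockerfile.scheduledworkflow",
    "driver": "backend/Dockerfile.driver",
    "launcher": "backend/Dockerfile.launcher",
}


def build_and_push_commands(registry, tag):
    """Yield (name, build_argv, push_argv) for each image, in the script's order."""
    for name, dockerfile in IMAGES.items():
        ref = f"{registry}/{name}:{tag}"
        # --progress=plain keeps the full build log, unlike the previous -q flag.
        build = ["docker", "build", "--progress=plain", "-t", ref, "-f", dockerfile, "."]
        push = ["docker", "push", ref]
        yield name, build, push


def run_all(registry, tag, run=subprocess.run):
    """Build then push each image; stop at the first non-zero exit code."""
    for name, build, push in build_and_push_commands(registry, tag):
        for argv in (build, push):
            code = run(argv).returncode
            if code != 0:
                print(f"Failed to build {name} image.")
                return code
    return 0
```

Calling `run_all("kind-registry:5000", "latest")` would invoke Docker for real; passing a stub as `run` lets the control flow be checked in isolation.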

.github/resources/scripts/kfp-readiness/wait_for_pods.py

Lines changed: 13 additions & 0 deletions
@@ -13,6 +13,17 @@
 config.load_kube_config()
 v1 = client.CoreV1Api()

+def log_pods():
+    pods = v1.list_namespaced_pod(namespace=namespace)
+
+    for pod in pods.items:
+        try:
+            logging.info(
+                f"---- Pod {namespace}/{pod.metadata.name} logs ----\n"
+                + v1.read_namespaced_pod_log(pod.metadata.name, namespace)
+            )
+        except client.exceptions.ApiException:
+            continue

 def get_pod_statuses():
     pods = v1.list_namespaced_pod(namespace=namespace)
@@ -74,6 +85,8 @@ def check_pods(calm_time=10, timeout=600, retries_after_ready=5):
         logging.info(f"Pods are still stabilizing. Retrying in {calm_time} seconds...")
         time.sleep(calm_time)
     else:
+        log_pods()
+
         raise Exception("Pods did not stabilize within the timeout period.")

     logging.info("Final pod statuses:")
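
Review note: `log_pods()` skips any pod whose logs cannot be read. A minimal, cluster-free sketch of the same idea follows. `collect_pod_logs` and `FakeApi` are hypothetical names introduced for illustration; the `api` argument is assumed to expose `list_namespaced_pod` and `read_namespaced_pod_log` like `kubernetes.client.CoreV1Api`, and `api_error` stands in for `client.exceptions.ApiException`.

```python
import logging
from types import SimpleNamespace


def collect_pod_logs(api, namespace, api_error=Exception):
    """Return {pod_name: log_text}, skipping pods whose logs cannot be read."""
    logs = {}
    for pod in api.list_namespaced_pod(namespace=namespace).items:
        name = pod.metadata.name
        try:
            logs[name] = api.read_namespaced_pod_log(name, namespace)
        except api_error as err:
            # Unlike the diff, record why a pod was skipped instead of passing silently.
            logging.warning("No logs for %s/%s: %s", namespace, name, err)
    return logs


class FakeApi:
    """Stub used only to exercise the helper without a cluster."""

    def list_namespaced_pod(self, namespace):
        pods = [SimpleNamespace(metadata=SimpleNamespace(name=n))
                for n in ("ok", "pending")]
        return SimpleNamespace(items=pods)

    def read_namespaced_pod_log(self, name, namespace):
        if name == "pending":
            raise RuntimeError("container not started")
        return f"log output of {name}"


print(collect_pod_logs(FakeApi(), "kubeflow"))
```

Passing the client in as a parameter, rather than using the module-level `v1`, is a design choice of this sketch that makes the helper testable; the committed function uses the globals already defined in the script.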

api/v2alpha1/go/pipelinespec/pipeline_spec.pb.go

Lines changed: 30 additions & 6 deletions
Some generated files are not rendered by default.

api/v2alpha1/pipeline_spec.proto

Lines changed: 5 additions & 1 deletion
@@ -1106,5 +1106,9 @@ message PlatformDeploymentConfig {

 // Spec for pipeline-level config options. See PipelineConfig DSL class.
 message PipelineConfig {
-  // TODO add pipeline-level configs
+  // Name of the semaphore key to control pipeline concurrency
+  string semaphore_key = 1;
+
+  // Name of the mutex to ensure mutual exclusion
+  string mutex_name = 2;
 }
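
Review note: the two new fields only carry names; this diff does not show how the backend resolves them. As a rough illustration of the difference between the two concepts, the toy Python model below maps a semaphore key to a counting semaphore with a configured limit and a mutex name to a lock that admits one holder. `PipelineConfig` here is a hand-written mirror of the proto message, and `SyncRegistry` is entirely hypothetical, not KFP code.

```python
import threading
from dataclasses import dataclass


@dataclass
class PipelineConfig:
    """Hand-written mirror of the two new proto fields (illustration only)."""
    semaphore_key: str = ""
    mutex_name: str = ""


class SyncRegistry:
    """Toy in-process registry mapping names to synchronization primitives."""

    def __init__(self, semaphore_limits):
        self._limits = semaphore_limits  # semaphore key -> max concurrent holders
        self._semaphores = {}
        self._mutexes = {}
        self._guard = threading.Lock()

    def primitives_for(self, config):
        """Return the primitives a run with this config would need to hold."""
        held = []
        with self._guard:
            if config.semaphore_key:
                key = config.semaphore_key
                if key not in self._semaphores:
                    self._semaphores[key] = threading.BoundedSemaphore(self._limits[key])
                held.append(self._semaphores[key])
            if config.mutex_name:
                name = config.mutex_name
                if name not in self._mutexes:
                    self._mutexes[name] = threading.Lock()
                held.append(self._mutexes[name])
        return held


registry = SyncRegistry({"gpu-pool": 2})
semaphore = registry.primitives_for(PipelineConfig(semaphore_key="gpu-pool"))[0]
mutex = registry.primitives_for(PipelineConfig(mutex_name="nightly-train"))[0]
```

In this model, two runs sharing `semaphore_key="gpu-pool"` can proceed together while a third waits, whereas runs sharing a `mutex_name` proceed one at a time. The names `gpu-pool` and `nightly-train` and the limit of 2 are invented for the example.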

backend/Dockerfile

Lines changed: 2 additions & 2 deletions
@@ -54,9 +54,9 @@ COPY backend/src/apiserver/config/sample_config.json /samples/
 # Compiling the preloaded samples.
 # The default image is replaced with the GCR-hosted python image.
 RUN set -e; \
-    < /samples/sample_config.json jq .[].file --raw-output | while read pipeline_yaml; do \
+    < /samples/sample_config.json jq ".pipelines[].file" --raw-output | while read pipeline_yaml; do \
     pipeline_py="${pipeline_yaml%.yaml}"; \
-    python3 "$pipeline_py"; \
+    echo "Compiling: \"$pipeline_py\"" && python3 "$pipeline_py" && echo -n "Output: " && ls "$pipeline_py.yaml"; \
     done

 # 3. Start api web server
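
Review note: the new `jq` filter reads sample entries from a top-level `pipelines` key instead of a top-level array. A small Python equivalent of the filter plus the `${pipeline_yaml%.yaml}` suffix strip is sketched below. `sample_sources` is a hypothetical helper, and the file paths in the example are invented; the real contents of `sample_config.json` are not shown in this diff.

```python
import json


def sample_sources(config_text):
    """Return the Python source path for each entry under the "pipelines" key."""
    config = json.loads(config_text)
    sources = []
    for entry in config["pipelines"]:
        yaml_path = entry["file"]
        # Mirror the shell expansion ${pipeline_yaml%.yaml}: drop one trailing ".yaml".
        if yaml_path.endswith(".yaml"):
            yaml_path = yaml_path[: -len(".yaml")]
        sources.append(yaml_path)
    return sources


# Invented example in the new object-with-"pipelines" shape.
EXAMPLE = '{"pipelines": [{"file": "/samples/demo/hello.py.yaml"}, {"file": "/samples/demo/other.py.yaml"}]}'
print(sample_sources(EXAMPLE))
```

Each returned path is what the `RUN` step passes to `python3`, and appending `.yaml` gives the output file that the new `ls` call checks for.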

backend/README.md

Lines changed: 52 additions & 42 deletions
@@ -1,6 +1,20 @@
+# Kubeflow Pipelines Backend
+
+## Overview
+
 This directory contains code for the components that comprise the Kubeflow
 Pipelines backend.

+This README will help you set up your coding environment in order to build and run the Kubeflow Pipelines backend. The KFP backend powers the core functionality of the KFP platform, handling API requests, workflow management, and data persistence.
+
+## Prerequisites
+Before you begin, ensure you have:
+- Go programming language installed
+- [go-licenses tool](../hack/install-go-licenses.sh)
+- Docker or Podman installed (for building container images)
+
+Note that you may need to restart your shell after installing these resources in order for the changes to take effect.
+
 ## Building & Testing

 To run all unittests for backend:
@@ -15,64 +29,46 @@ The API server itself can be built using:
 go build -o /tmp/apiserver backend/src/apiserver/*.go
 ```

-## Code Style
-
-Backend codebase follows the [Google's Go Style Guide](https://google.github.io/styleguide/go/). Please, take time to get familiar with the [best practices](https://google.github.io/styleguide/go/best-practices). It is not intended to be exhaustive, but it often helps minimizing guesswork among developers and keep codebase uniform and consistent.
-
-We use [golangci-lint](https://golangci-lint.run/) tool that can catch common mistakes locally (see detailed configuration [here](https://github.com/kubeflow/pipelines/blob/master/.golangci.yaml)). It can be [conveniently integrated](https://golangci-lint.run/usage/integrations/) with multiple popular IDEs such as VS Code or Vim.
-
-Finally, it is advised to install [pre-commit](https://pre-commit.com/) in order to automate linter checks (see configuration [here](https://github.com/kubeflow/pipelines/blob/master/.pre-commit-config.yaml))
-
-## Building APIServer image locally
-
 The API server image can be built from the root folder of the repo using:
 ```
 export API_SERVER_IMAGE=api_server
 docker build -f backend/Dockerfile . --tag $API_SERVER_IMAGE
 ```
-## Deploy APIServer with the image you own build
+### Deploying the APIServer (from the image you built) on Kubernetes

-Run
+First, push your image to a registry that is accessible from your Kubernetes cluster.
+
+Then, run:
 ```
 kubectl edit deployment.v1.apps/ml-pipeline -n kubeflow
 ```
-You'll see the field reference the api server docker image.
+You'll see the field reference the api server container image (`spec.containers[0].image: gcr.io/ml-pipeline/api-server:<image-version>`).
 Change it to point to your own build, after saving and closing the file, apiserver will restart with your change.

-## Building client library and swagger files
+### Building client library and swagger files

 After making changes to proto files, the Go client libraries, Python client libraries and swagger files
 need to be regenerated and checked-in. Refer to [backend/api](./api/README.md) for details.

-## Updating licenses info
-
-1. [Install go-licenses tool](../hack/install-go-licenses.sh) and refer to [its documentation](https://github.com/google/go-licenses) for how to use it.
+### Updating licenses info

+1. [Install go-licenses tool](../hack/install-go-licenses.sh) (if you haven't already) and refer to [its documentation](https://github.com/google/go-licenses) for how to use it.

 2. Run the tool to update all licenses:

 ```bash
-make all
+make -C backend all
 ```

-## Updating python dependencies
-
-[pip-tools](https://github.com/jazzband/pip-tools) is used to manage python
-dependencies. To update dependencies, edit [requirements.in](requirements.in)
-and run `./update_requirements.sh` to update and pin the transitive
-dependencies.
-
-# Visualization Server Instructions
-
-## Updating python dependencies
+### Updating python dependencies

 [pip-tools](https://github.com/jazzband/pip-tools) is used to manage python
 dependencies. To update dependencies, edit [requirements.in](requirements.in)
 and run `./update_requirements.sh` to update and pin the transitive
 dependencies.


-## Building conformance tests (WIP)
+### Building conformance tests (WIP)

 Run
 ```
@@ -81,7 +77,7 @@ docker build . -f backend/Dockerfile.conformance -t <tag>

 ## API Server Development

-### Run Locally With a Kind Cluster
+### Run the KFP Backend Locally With a Kind Cluster

 This deploys a local Kubernetes cluster leveraging [kind](https://kind.sigs.k8s.io/), with all the components required
 to run the Kubeflow Pipelines API server. Note that the `ml-pipeline` `Deployment` (API server) has its replicas set to
@@ -99,6 +95,7 @@ pods on the cluster using the `ml-pipeline` `Service`.
 network interface through Docker/Podman Desktop. See
 [kind #1200](https://github.com/kubernetes-sigs/kind/issues/1200#issuecomment-1304855791) for an example manifest.
 * Optional: VSCode is installed to leverage a sample `launch.json` file.
+* This relies on dlv: (go install -v github.com/go-delve/delve/cmd/dlv@latest)

 #### Provisioning the Cluster

@@ -111,15 +108,9 @@ make -C backend dev-kind-cluster
 This may take several minutes since there are many pods. Note that many pods will be in "CrashLoopBackOff" status until
 all the pods have started.

-#### Deleting the Cluster
-
-Run the following to delete the cluster:
+Also, note that the config in the `make` command above sets the `ml-pipeline` `Deployment` (api server) to have 0 replicas. The intent is to replace it with a locally running API server for debugging and faster development. See the following steps to run the API server locally, and connect it to the KFP backend on your Kind cluster. Note that other backend components (for example, the persistence agent) may show errors until the API server is brought up and connected to the cluster.

-```bash
-kind delete clusters dev-pipelines-api
-```
-
-#### Launch the API Server With VSCode
+#### Launching the API Server With VSCode

 After the cluster is provisioned, you may leverage the following sample `.vscode/launch.json` file to run the API
 server locally:
@@ -168,12 +159,12 @@ You can also directly connect to the MariaDB database server with:
 mysql -h 127.0.0.1 -u root
 ```

-## Remote Debug the Driver
+### Remote Debug the Driver

 These instructions assume you are leveraging the Kind cluster in the
 [Run Locally With a Kind Cluster](#run-locally-with-a-kind-cluster) section.

-### Build the Driver Image With Debug Prerequisites
+#### Build the Driver Image With Debug Prerequisites

 Run the following to create the `backend/Dockerfile.driver-debug` file and build the container image
 tagged as `kfp-driver:debug`. This container image is based on `backend/Dockerfile.driver` but installs
@@ -197,7 +188,7 @@ Alternatively, you can use this Make target that does both.
 make -C kind-build-and-load-driver-debug
 ```

-### Run the API Server With Debug Configuration
+#### Run the API Server With Debug Configuration

 You may use the following VS Code `launch.json` file to run the API server which overrides the Driver
 command to use Delve and the Driver image to use debug image built previously.
@@ -229,7 +220,7 @@ command to use Delve and the Driver image to use debug image built previously.
 }
 ```

-### Starting a Remote Debug Session
+#### Starting a Remote Debug Session

 Start by launching a pipeline. This will eventually create a Driver pod that is waiting for a remote debug connection.

@@ -273,3 +264,22 @@ For debugging a specific Driver pod, you'll need to continuously port forward an
 without a breakpoint so that Delve will continue execution until the Driver pod you are interested in starts up. At that
 point, you can set a break point, port forward, and connect to the remote debug session to debug that specific Driver
 pod.
+
+### Deleting the Kind Cluster
+
+Run the following to delete the cluster (once you are finished):
+
+```bash
+kind delete clusters dev-pipelines-api
+```
+
+## Contributing
+### Code Style
+
+Backend codebase follows the [Google's Go Style Guide](https://google.github.io/styleguide/go/). Please, take time to get familiar with the [best practices](https://google.github.io/styleguide/go/best-practices). It is not intended to be exhaustive, but it often helps minimizing guesswork among developers and keep codebase uniform and consistent.
+
+We use [golangci-lint](https://golangci-lint.run/) tool that can catch common mistakes locally (see detailed configuration [here](https://github.com/kubeflow/pipelines/blob/master/.golangci.yaml)). It can be [conveniently integrated](https://golangci-lint.run/usage/integrations/) with multiple popular IDEs such as VS Code or Vim.
+
+Finally, it is advised to install [pre-commit](https://pre-commit.com/) in order to automate linter checks (see configuration [here](https://github.com/kubeflow/pipelines/blob/master/.pre-commit-config.yaml))
+
+
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+FROM python:3.9-slim-bullseye
+RUN apt-get update && apt-get install -y gcc python3-dev
+
+COPY requirements.txt .
+RUN pip install --upgrade pip
+RUN python3 -m pip install --upgrade -r \
+    requirements.txt --quiet --no-cache-dir \
+    && rm -f requirements.txt
+
+ENV APP_HOME /app
+COPY kservedeployer.py $APP_HOME/kservedeployer.py
+WORKDIR $APP_HOME
+
+ENTRYPOINT ["python"]
+CMD ["kservedeployer.py"]
