Kubernetes Test and Production Deployment
Purpose
This runbook defines the Phase 4.5 GitOps deployment model for the dedicated k3s server.
It separates the continuously updated test instance from the manually promoted production instance while keeping both slices on the current single-node server.
Dedicated Server
| Field | Value |
|---|---|
| Public IP | 91.99.127.138 |
| Restricted SSH user | deploy-test |
| Kubernetes | k3s |
| Ingress class | traefik |
| Storage class | local-path |
| ACME email | sens.ops@sens.at |
| Argo CD hostname | argo.iot-sens.schlossers.at |
| Test docs hostname | docs.dev.iot-sens.schlossers.at |
| Production docs hostname | docs.iot-sens.schlossers.at |
| Test API hostname | api.dev.iot-sens.schlossers.at |
| Production API hostname | api.iot-sens.schlossers.at |
The SSH private key must stay outside Git history. The repository ignores
ssh-private, ssh-private.*, *.pem, and *.key.
Desired State Repositories
The product repository contains:
- service source code,
- Dockerfiles,
- the Helm chart under infra/helm/sens-platform,
- documentation and CI checks.
The private SENS-GmbH/sens-platform-infra repository contains:
- cluster bootstrap resources,
- Argo CD Applications and Projects,
- Argo CD local-user RBAC,
- environment-specific Helm values,
- release metadata for immutable deployments.
No secret values may be committed to either repository.
Namespaces
| Namespace | Purpose |
|---|---|
| argocd | Argo CD control plane |
| cert-manager | Certificate management |
| sens-test | SENS Platform test release |
| sens-production | SENS Platform production |
Test and production use separate Kubernetes Secrets, PVCs, TLS Secrets, Argo CD Applications, and Helm release names.
Release Selection
Deployments use immutable product revisions and immutable SemVer image tags.
Runtime deployments must not use latest, branch names, git-<short-sha>,
full-SHA tags, or mutable channel tags such as test.
Deployable images are published by the publish-images GitHub Actions
workflow. After ci succeeds on main, the workflow automatically publishes all
deployable images with a generated alpha SemVer tag and updates the test GitOps
desired state.
The automatic test tag is derived from the root package.json version:

    <package.json version>-alpha.<publish workflow run number>.<run attempt>

For example, package version 0.5.0 can produce
0.5.0-alpha.123.1. The package version must be a stable SemVer version because
the workflow appends the alpha pre-release suffix for test builds.
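
The tag derivation above can be sketched as a small shell function. This is an illustration only, not the workflow's actual code: the function name alpha_tag is hypothetical, and it assumes the run number and run attempt come from the standard GitHub Actions variables GITHUB_RUN_NUMBER and GITHUB_RUN_ATTEMPT.

```shell
# Hypothetical helper: build the automatic test tag described above.
# Arguments: <package.json version> <workflow run number> <run attempt>
alpha_tag() {
  version="$1"; run_number="$2"; run_attempt="$3"
  # Require a stable MAJOR.MINOR.PATCH version (no pre-release, no build metadata).
  if ! printf '%s\n' "$version" | grep -Eq '^(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)$'; then
    echo "error: '$version' is not a stable SemVer version" >&2
    return 1
  fi
  printf '%s-alpha.%s.%s\n' "$version" "$run_number" "$run_attempt"
}
```

Inside a workflow step this could be called as alpha_tag "$PACKAGE_VERSION" "$GITHUB_RUN_NUMBER" "$GITHUB_RUN_ATTEMPT", where PACKAGE_VERSION is an assumed variable holding the root package.json version.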
Manual publish-images runs require a release_version input and publish that
same SemVer tag for all deployable images.
Environment release channels:
| Environment | Required version shape | Example |
|---|---|---|
| Test | alpha pre-release | 0.5.0-alpha.123.1 |
| Future staging | rc pre-release | 0.5.0-rc.1 |
| Production | stable SemVer | 0.5.0 |
The workflow accepts an optional leading v, which is stripped before writing
the Docker image tag. SemVer build metadata with + is intentionally rejected
because Docker image tags cannot use +. The Git SHA is still recorded as
product_commit in the GitOps update and as the OCI image revision label.
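
A minimal sketch of the normalisation and channel rules described above, assuming plain POSIX shell and grep. The function names are hypothetical and the real workflow may implement these checks differently.

```shell
# Hypothetical helper: turn a release_version input into a Docker image tag.
normalize_release_version() {
  raw="$1"
  tag="${raw#v}"   # strip one optional leading "v"
  case "$tag" in
    *+*)
      echo "error: build metadata ('+') is not allowed in image tags" >&2
      return 1
      ;;
  esac
  printf '%s\n' "$tag"
}

# Hypothetical helper: map a normalised tag to the channel from the table above.
release_channel() {
  core='(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)'
  if printf '%s\n' "$1" | grep -Eq "^${core}\$"; then
    echo production
  elif printf '%s\n' "$1" | grep -Eq "^${core}-alpha(\.[0-9A-Za-z-]+)*\$"; then
    echo test
  elif printf '%s\n' "$1" | grep -Eq "^${core}-rc(\.[0-9A-Za-z-]+)*\$"; then
    echo staging
  else
    echo "error: '$1' matches no release channel" >&2
    return 1
  fi
}
```

A test deployment request could then be refused unless release_channel reports test for the normalised tag.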
Version Progression Check
CI runs pnpm version:check before build and image publication. The check
compares the root package.json version with the previous version from Git.
Allowed progression:
- unchanged version for repeated test builds on the same release line,
- patch increase within the same major and minor version, for example 1.5.2 to 1.5.3,
- next minor version only with patch reset to 0, for example 1.5.2 to 1.6.0,
- next major version only with minor and patch reset to 0, for example 1.5.2 to 2.0.0.
Rejected progression:
- downgrades, for example 0.4.0 to 0.3.1,
- skipped minor versions, for example 1.5.2 to 1.7.0,
- minor versions with non-zero patch, for example 1.5.2 to 1.6.1,
- skipped major versions or major versions with non-zero minor/patch.
The root package.json version must be stable SemVer without pre-release or
build metadata. Test, staging, and production channel suffixes are applied by
the release workflow, not by changing the root package version.
This check only blocks new product repository builds. Argo CD rollbacks to an already published older image tag remain possible because they are GitOps desired-state changes in the infra repository and do not run this product version progression check.
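
The progression rules can be illustrated with a short shell function. This is a hypothetical re-implementation for explanation, not the script behind pnpm version:check. It assumes both arguments are already stable MAJOR.MINOR.PATCH versions, and it assumes that any patch increase on the same release line is allowed, which the rules above do not state explicitly.

```shell
# Hypothetical sketch of the progression rules.
# Arguments: <previous version> <new version>; returns 0 if the step is allowed.
version_progression_ok() {
  prev="$1"; next="$2"
  # Split MAJOR.MINOR.PATCH using parameter expansion.
  pmaj="${prev%%.*}"; prest="${prev#*.}"; pmin="${prest%%.*}"; ppat="${prest#*.}"
  nmaj="${next%%.*}"; nrest="${next#*.}"; nmin="${nrest%%.*}"; npat="${nrest#*.}"

  if [ "$nmaj" -eq "$pmaj" ] && [ "$nmin" -eq "$pmin" ]; then
    # Same release line: unchanged or higher patch.
    [ "$npat" -ge "$ppat" ]
  elif [ "$nmaj" -eq "$pmaj" ] && [ "$nmin" -eq $((pmin + 1)) ]; then
    # Next minor version: patch must reset to 0.
    [ "$npat" -eq 0 ]
  elif [ "$nmaj" -eq $((pmaj + 1)) ]; then
    # Next major version: minor and patch must reset to 0.
    [ "$nmin" -eq 0 ] && [ "$npat" -eq 0 ]
  else
    # Downgrades and skipped minor or major versions.
    return 1
  fi
}
```

For example, version_progression_ok 1.5.2 1.6.0 succeeds, while version_progression_ok 1.5.2 1.7.0 fails.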
Normal test flow:
- Update the root package.json version when starting a new release line.
- Merge to main.
- Wait for ci to succeed.
- publish-images automatically publishes the generated alpha tag and triggers deploy-test-main.yml in the infra repository.
Manual alpha publish and test deploy for exceptional cases:

    gh workflow run publish-images.yml \
      --repo SENS-GmbH/sens-platform \
      -f ref=main \
      -f release_version=0.5.0-alpha.1 \
      -f update_test_environment=true

This publishes 0.5.0-alpha.1 for all deployable images, then triggers the
infra repository test deployment with image_tag=0.5.0-alpha.1.
The workflow rejects test deployment requests where release_version is not an
alpha pre-release.
Image tags must not be moved after publication. If another test build is needed,
merge another change to main or publish the next alpha pre-release manually.
After image publication, the workflow triggers
deploy-test-main.yml in SENS-GmbH/sens-platform-infra. That workflow updates
only the test Application and test values. It must not update production files.
Production promotion is a manual GitOps change in the infra repository:
- Run promote-production-from-test in GitHub Actions or run scripts/promote-production-from-test.sh from the infra repository.
- Review and merge the generated promotion pull request.
- Sync sens-platform-production manually in Argo CD as sebastian or break-glass admin.
Production must promote a stable SemVer tag that has already been validated through the earlier release channels. Pre-release tags must not be used for customer-facing production.
The infra repository branch protection must require CODEOWNERS review for
production paths. Without that GitHub setting, the CODEOWNERS file is only
documentation and does not enforce review. The GitHub CODEOWNERS entry uses the
repository collaborator @SensSebastianSchlosser; Argo CD uses the local user
name sebastian.
GitHub Actions Secret
The product repository requires one Actions secret:
| Secret | Purpose |
|---|---|
| SENS_PLATFORM_INFRA_WORKFLOW_TOKEN | Triggers the infra repository test deployment workflow |
The token must be restricted to the infra repository and must not have general contents-write access. Where possible, prefer a fine-grained token or GitHub App credential whose only permission is dispatching the target workflow.
GitHub does not allow a workflow to generate its own fine-grained personal access token. The token must be created by an authorized GitHub user or replaced by a GitHub App installation token. If the secret is absent or empty, image publishing still succeeds but the automatic test GitOps update is skipped.
Argo CD Git Webhook
The infra repository has a GitHub push webhook that notifies Argo CD when the GitOps desired state changes:
| Field | Value |
|---|---|
| Repository | SENS-GmbH/sens-platform-infra |
| Event | push |
| Payload URL | https://argo.iot-sens.schlossers.at/api/webhook |
| Content type | application/json |
The webhook removes the normal Git polling delay after the
deploy-test-main.yml workflow updates the test desired state. Argo CD still
owns sync decisions; the webhook only requests a repository refresh.
The current webhook is configured without a shared secret because the restricted
deploy-test access boundary does not allow patching argocd-secret. Argo CD
supports unauthenticated webhook refreshes, but a shared secret should be added
by a cluster administrator to reduce public webhook abuse risk.
Admin hardening command:

    WEBHOOK_SECRET="$(openssl rand -hex 32)"
    kubectl -n argocd patch secret argocd-secret --type merge \
      -p "{\"stringData\":{\"webhook.github.secret\":\"${WEBHOOK_SECRET}\"}}"

After patching Argo CD, set the same secret value on the GitHub webhook in
SENS-GmbH/sens-platform-infra.
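
For context on what the shared secret protects: when a webhook secret is set, GitHub signs each delivery with an HMAC of the request body and sends it in the X-Hub-Signature-256 header. The sketch below recomputes that value locally with openssl so a delivery can be compared by hand. The function name is hypothetical, and which signature header the installed Argo CD version verifies is not stated in this runbook.

```shell
# Hypothetical helper: compute the X-Hub-Signature-256 value for a payload.
# Arguments: <shared secret> <file containing the exact request body>
github_signature() {
  secret="$1"; payload_file="$2"
  # openssl prints "<label>= <hex>"; keep only the hex digest (last field).
  digest="$(openssl dgst -sha256 -hmac "$secret" < "$payload_file" | awk '{print $NF}')"
  printf 'sha256=%s\n' "$digest"
}
```

The payload file must contain the request body byte for byte, because any whitespace change alters the digest.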
Public Exposure
Public HTTPS ingress is limited to:
- Argo CD at argo.iot-sens.schlossers.at,
- test documentation at docs.dev.iot-sens.schlossers.at,
- production documentation at docs.iot-sens.schlossers.at,
- test platform-api operation endpoints at api.dev.iot-sens.schlossers.at,
- production platform-api operation endpoints at api.iot-sens.schlossers.at.
The following components stay cluster-internal:
- mqtt-ingestion-worker,
- telemetry-worker,
- embedded TimescaleDB,
- embedded NATS JetStream,
- embedded mock MQTT broker.
The mock MQTT broker is enabled only for test.
Argo CD Access
Local Argo CD users:
| User | Purpose |
|---|---|
| deploy | Test and standard non-production operations |
| sebastian | Production owner and manual production deployer |
| admin | Break-glass administrator, not automation identity |
Rotate the initial Argo CD admin password immediately after first login. The admin password must not be stored in the repository, SSH user environment, or automation logs.
Required Runtime Secrets
Create secrets directly in the target cluster or through a documented external secret process. Do not commit their values.
Required in sens-test:
| Secret | Type | Purpose |
|---|---|---|
| ghcr-pull | kubernetes.io/dockerconfigjson | Pull private GHCR images |
| sens-test-timescaledb-auth | Opaque | Embedded TimescaleDB auth |
Required in sens-production:
| Secret | Type | Purpose |
|---|---|---|
| ghcr-pull | kubernetes.io/dockerconfigjson | Pull private GHCR images |
| sens-production-timescaledb-auth | Opaque | Embedded TimescaleDB auth |
Required key in each TimescaleDB auth secret:
- postgres-password
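
One possible way to create these secrets directly in the cluster is sketched below. The function names and the environment variables POSTGRES_PASSWORD, GHCR_USERNAME, and GHCR_TOKEN are assumptions for illustration. The KUBECTL override is also hypothetical and exists only so the commands can be exercised with a stub instead of a real cluster.

```shell
# Hypothetical helper: create the TimescaleDB auth secret with its required key.
# Arguments: <namespace> <secret name>; the password is read from POSTGRES_PASSWORD.
create_timescaledb_auth() {
  namespace="$1"; secret_name="$2"
  "${KUBECTL:-kubectl}" -n "$namespace" create secret generic "$secret_name" \
    --from-literal=postgres-password="${POSTGRES_PASSWORD:?set POSTGRES_PASSWORD first}"
}

# Hypothetical helper: create the GHCR image pull secret.
# Argument: <namespace>; credentials are read from GHCR_USERNAME and GHCR_TOKEN.
create_ghcr_pull() {
  namespace="$1"
  "${KUBECTL:-kubectl}" -n "$namespace" create secret docker-registry ghcr-pull \
    --docker-server=ghcr.io \
    --docker-username="${GHCR_USERNAME:?set GHCR_USERNAME first}" \
    --docker-password="${GHCR_TOKEN:?set GHCR_TOKEN first}"
}
```

For test this would be called as create_timescaledb_auth sens-test sens-test-timescaledb-auth and create_ghcr_pull sens-test. Reading the values from environment variables keeps them out of files, but they should still be kept out of shell history and logs.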
SSH and Kubernetes Access Boundary
The restricted Linux user deploy-test must not be able to deploy production
by SSH.
Required controls:
- deploy-test has no sudo access.
- deploy-test cannot read /etc/rancher/k3s/k3s.yaml.
- no production-capable kubeconfig is stored in the deploy-test home directory or workspace.
- the deploy-test user’s $HOME/.kube/config uses the restricted sens-test/deploy-test ServiceAccount.
- Kubernetes RBAC for that ServiceAccount grants at most sens-test permissions and no mutation rights in sens-production.
- Argo CD RBAC grants non-production users no applications/sync permission for sens-production/sens-platform-production.
- GitHub automation from the product repository can trigger the test deployment workflow but cannot directly write production desired state.
If an SSH user has root access, sudo, or a cluster-admin kubeconfig, there is
no technical guarantee that production cannot be changed. Those credentials
must stay outside the AI/deploy context.
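
The Kubernetes RBAC part of this boundary can be spot-checked with kubectl auth can-i, run as deploy-test with its restricted kubeconfig. The helper below is a sketch. The function name and the KUBECTL override are hypothetical, and the verbs and resources to probe are examples rather than a complete audit.

```shell
# Hypothetical check: succeed only if the current kubeconfig is denied the action.
# Arguments: <verb> <resource> <namespace>
assert_denied() {
  verb="$1"; resource="$2"; namespace="$3"
  # "kubectl auth can-i" prints yes or no; it exits non-zero for "no",
  # so the exit status is ignored and the printed answer is inspected instead.
  answer="$("${KUBECTL:-kubectl}" auth can-i "$verb" "$resource" -n "$namespace" 2>/dev/null || true)"
  case "$answer" in
    no*)
      echo "ok: $verb $resource is denied in $namespace"
      ;;
    yes*)
      echo "FAIL: $verb $resource is allowed in $namespace" >&2
      return 1
      ;;
    *)
      echo "error: could not determine access for $verb $resource in $namespace" >&2
      return 2
      ;;
  esac
}
```

Example probes: assert_denied create deployments sens-production, assert_denied patch deployments sens-production, and assert_denied get secrets sens-production. An unreachable cluster is reported as an error rather than as a denial.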
DNS and ACME Checks
DNS records must resolve before HTTP-01 certificates can become ready:

    dig +short argo.iot-sens.schlossers.at A
    dig +short docs.dev.iot-sens.schlossers.at A
    dig +short docs.iot-sens.schlossers.at A
    dig +short api.dev.iot-sens.schlossers.at A
    dig +short api.iot-sens.schlossers.at A

Expected result for each host:

    91.99.127.138

Cluster DNS check:

    kubectl -n sens-test run dns-check --attach --rm --restart=Never \
      --image=busybox:1.36 -- nslookup api.dev.iot-sens.schlossers.at

If public DNS resolves but cert-manager still reports an HTTP-01 self-check DNS failure, restart CoreDNS and wait for cert-manager to retry:

    kubectl -n kube-system rollout restart deployment/coredns
    kubectl -n kube-system rollout status deployment/coredns
    kubectl get certificate -A

Smoke Checks
After Argo CD sync, check test:

    kubectl -n sens-test get pods
    kubectl -n sens-test rollout status deployment/sens-platform-test-platform-api
    kubectl -n sens-test rollout status deployment/sens-platform-test-docs
    curl -fsS https://api.dev.iot-sens.schlossers.at/healthz
    curl -fsS https://api.dev.iot-sens.schlossers.at/readyz
    curl -fsS https://api.dev.iot-sens.schlossers.at/version
    curl -fsS https://api.dev.iot-sens.schlossers.at/test
    curl -fsS https://api.dev.iot-sens.schlossers.at/metrics
    curl -fsS https://docs.dev.iot-sens.schlossers.at/healthz
    curl -fsS https://docs.dev.iot-sens.schlossers.at/readyz
    curl -fsS https://docs.dev.iot-sens.schlossers.at/

Check production only after an intentional manual production sync:

    kubectl -n sens-production get pods
    kubectl -n sens-production rollout status deployment/sens-platform-production-platform-api
    kubectl -n sens-production rollout status deployment/sens-platform-production-docs
    curl -fsS https://api.iot-sens.schlossers.at/healthz
    curl -fsS https://api.iot-sens.schlossers.at/readyz
    curl -fsS https://api.iot-sens.schlossers.at/version
    curl -fsS https://api.iot-sens.schlossers.at/test
    curl -fsS https://api.iot-sens.schlossers.at/metrics
    curl -fsS https://docs.iot-sens.schlossers.at/healthz
    curl -fsS https://docs.iot-sens.schlossers.at/readyz
    curl -fsS https://docs.iot-sens.schlossers.at/

View the platform-api logs through Argo CD’s pod log view or with:

    kubectl -n sens-test logs deployment/sens-platform-test-platform-api
    kubectl -n sens-production logs deployment/sens-platform-production-platform-api

Calling GET /test writes a structured log entry with the message
test endpoint called.
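
The public HTTPS checks for test and production differ only in hostnames, so they can be driven from one loop. The sketch below is a convenience wrapper, not part of the repository; the function names smoke_urls and run_smoke_checks are hypothetical. It covers only the curl checks, not the kubectl rollout checks.

```shell
# Hypothetical helper: print the public smoke-check URLs for one environment.
# Argument: "test" or "production"
smoke_urls() {
  case "$1" in
    test)
      api=api.dev.iot-sens.schlossers.at
      docs=docs.dev.iot-sens.schlossers.at
      ;;
    production)
      api=api.iot-sens.schlossers.at
      docs=docs.iot-sens.schlossers.at
      ;;
    *)
      echo "error: unknown environment '$1'" >&2
      return 1
      ;;
  esac
  for path in healthz readyz version test metrics; do
    printf 'https://%s/%s\n' "$api" "$path"
  done
  # The empty path checks the documentation root page.
  for path in healthz readyz ""; do
    printf 'https://%s/%s\n' "$docs" "$path"
  done
}

# Hypothetical helper: request every URL and continue after failures
# so that every endpoint is probed; returns non-zero if any request failed.
run_smoke_checks() {
  urls="$(smoke_urls "$1")" || return 1
  failed=0
  for url in $urls; do
    if curl -fsS -o /dev/null "$url"; then
      echo "ok   $url"
    else
      echo "FAIL $url"
      failed=1
    fi
  done
  return "$failed"
}
```

Run run_smoke_checks test after a test sync, and run_smoke_checks production only after an intentional manual production sync.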
Internal checks:

    kubectl -n sens-test exec statefulset/sens-platform-test-timescaledb -- \
      pg_isready -U sens_platform -d sens_platform
    kubectl -n sens-test exec statefulset/sens-platform-test-nats -- \
      wget -qO- http://127.0.0.1:8222/healthz

Validation
Local and CI validation commands:

    pnpm helm:lint
    pnpm helm:template
    pnpm helm:template:production

The chart must render without hard-coded cloud provider, DNS provider, certificate issuer, or storage class assumptions. Environment-specific values belong in the GitOps repository.
Limitations
This controlled production slice does not provide:
- production high availability,
- customer-data backup and restore readiness,
- Netmore ingestion,
- authentication,
- tenant data handling,
- telemetry persistence behavior.
Before real customer data is stored, backup, restore, monitoring, retention, and database operating procedures must be completed and tested.