Requirements
Functional Requirements
FR1 — Naming Convention
- Every Kubernetes namespace follows the pattern `{project}-{env}-{service}`
- Every namespace carries all 9 required labels: `project`, `env`, `service`, `team`, `backstage.io/domain`, `backstage.io/system`, `backstage.io/component`, `argocd/app`, `argocd/app-set`
- Backstage entity names are env-agnostic: `{system}-{service}` for Components, `{provider}-{project}-{env}-{type}-{name}` for Resources
- ArgoCD Application names follow `{system}-{service}-{env}`
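
The naming rules above can be checked mechanically. The following is a minimal sketch, not the platform's actual validator: the regular expression assumes each name segment is lowercase alphanumeric with no inner hyphens (the source does not say which characters are allowed), and the helper names `parse_namespace`, `missing_labels`, and `argocd_app_name` are hypothetical.

```python
import re

# The nine labels listed in FR1.
REQUIRED_LABELS = (
    "project", "env", "service", "team",
    "backstage.io/domain", "backstage.io/system", "backstage.io/component",
    "argocd/app", "argocd/app-set",
)

# Assumption: segments contain no hyphens, so a name splits into exactly
# three parts: {project}-{env}-{service}.
NAMESPACE_RE = re.compile(
    r"^(?P<project>[a-z0-9]+)-(?P<env>[a-z0-9]+)-(?P<service>[a-z0-9]+)$"
)


def parse_namespace(name):
    """Return the three name parts as a dict, or None if the pattern fails."""
    match = NAMESPACE_RE.match(name)
    return match.groupdict() if match else None


def missing_labels(labels):
    """Return the required labels that are absent or empty."""
    return [key for key in REQUIRED_LABELS if not labels.get(key)]


def argocd_app_name(system, service, env):
    """Build an ArgoCD Application name: {system}-{service}-{env}."""
    return f"{system}-{service}-{env}"
```

For example, `parse_namespace("shop-dev-cart")` yields the parts `shop`, `dev`, and `cart`, while `parse_namespace("shop-dev")` returns `None`.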
FR2 — GitOps Repositories
- One `platform-gitops` repository owned by the platform team containing AppProjects, ApplicationSets, Crossplane XRDs/Compositions, RBAC, and Backstage templates
- One `{domain}-gitops` repository per domain containing k8s manifests, Crossplane Claims, and Backstage catalog entities
- One `{app}-repo` application repository per service containing the source code, `Dockerfile`, TechDocs, and CI pipelines
- Domain repos are created automatically by the `create-domain` Backstage template, and application repos by `create-service`
FR3 — ArgoCD Delivery
- Every domain has an AppProject that restricts source repos and destination namespaces
- Services are deployed via ApplicationSet matrix generators (services × clusters)
- Prod Applications require manual sync, enforced via `syncWindows` and `templatePatch`
- Sync windows deny automated prod deploys on weekday evenings and weekends
- ArgoCD `ignoreDifferences` on `/spec/replicas` prevents overwriting HPA decisions
- Crossplane Claims are deployed via git-directory-generator ApplicationSets
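
To illustrate what the matrix generator (services × clusters) produces, here is a small Python model of the expansion. It is a sketch of the idea only, not ArgoCD code: the parameter keys (`system`, `service`, `env`, `cluster`) and the sample values are assumptions made for the example.

```python
from itertools import product


def matrix(services, clusters):
    """Combine every service with every cluster, merging their parameters."""
    return [{**svc, **cluster} for svc, cluster in product(services, clusters)]


def application_name(params):
    """Name each generated Application {system}-{service}-{env}, per FR1."""
    return f"{params['system']}-{params['service']}-{params['env']}"


# Hypothetical inputs for illustration.
services = [
    {"system": "checkout", "service": "cart"},
    {"system": "checkout", "service": "payment"},
]
clusters = [
    {"env": "dev", "cluster": "dev-cluster"},
    {"env": "prod", "cluster": "prod-cluster"},
]

apps = [application_name(p) for p in matrix(services, clusters)]
# apps == ["checkout-cart-dev", "checkout-cart-prod",
#          "checkout-payment-dev", "checkout-payment-prod"]
```

Two services and two clusters give four Applications; adding a service or a cluster extends the list without anyone creating an Application by hand.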
FR4 — Backstage Catalog
- Domain, System, Component, Resource, Group, and User entities for every platform asset
- `backstage.io/kubernetes-label-selector` on every Component surfaces health across all clusters with a single annotation
- `argocd/app-selector` on every Component surfaces sync status per environment
- `backstage.io/kubernetes-label-selector` on every Resource surfaces Crossplane Claim READY/SYNCED status
- Catalog registered via `catalog-info.yaml` Location entities per domain repo
FR5 — Cloud Resource Provisioning (Crossplane)
- Cloud resources declared as Crossplane Claims in `{domain}-gitops/crossplane/claims/{env}/`
- Infra namespaces `{domain}-{env}-infra` isolated from application namespaces
- `deletionPolicy: Orphan` default on all production Claims
- Connection secrets written to the infra namespace after provisioning
- Supported providers: GCP, AWS, Azure, IBM
- Supported resource types: clusters (GKE/EKS/AKS/IKS), databases (Cloud SQL/RDS/PostgreSQL/Cosmos), queues (Pub/Sub/SQS/Service Bus/Event Streams), storage (GCS/S3/Blob/COS), cache (Memorystore/ElastiCache/Redis), registries, secret stores
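
The path and namespace conventions for Claims can be expressed as small helpers. This is a sketch with hypothetical function names. The source only specifies `Orphan` as the default for production Claims, so the non-production case is deliberately left unspecified here, and the environment name `prod` is an assumption.

```python
from pathlib import PurePosixPath


def claims_dir(domain, env):
    """Directory holding a domain's Claims for one environment."""
    return PurePosixPath(f"{domain}-gitops") / "crossplane" / "claims" / env


def infra_namespace(domain, env):
    """Infra namespace that receives connection secrets: {domain}-{env}-infra."""
    return f"{domain}-{env}-infra"


def default_deletion_policy(env):
    """Return "Orphan" for production; None means the source sets no default."""
    # Assumption: the production environment is named "prod".
    return "Orphan" if env == "prod" else None
```

For a hypothetical `payments` domain, `claims_dir("payments", "dev")` points at `payments-gitops/crossplane/claims/dev` and the matching infra namespace is `payments-dev-infra`.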
FR6 — RBAC and Access Control
- Kubernetes RoleBindings use Group subjects only (never individual User subjects for routine access)
- Four platform roles: `viewer`, `developer`, `lead`, `platform-engineer`
- `developer` role: no RoleBinding created for prod namespaces
- `platform-engineer` role: ClusterRoleBinding for `platform-*` namespaces
- ArgoCD project roles: `readonly`, `developer` (dev/staging sync only), `domain-admin`
- RBAC applied exclusively via Backstage templates (`create-group`, `create-user`)
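
A template step that renders namespace-scoped bindings under these rules might look like the sketch below. It covers only the two rules stated above (Group subjects only, no `developer` binding in prod). The binding name, the use of a ClusterRole named after the platform role, and the `prod` environment name are assumptions; the `platform-engineer` ClusterRoleBinding is out of scope for this helper.

```python
RBAC_API_GROUP = "rbac.authorization.k8s.io"


def role_binding(group, role, namespace, env):
    """Build a RoleBinding manifest for a group, or None if none is allowed."""
    # FR6: the developer role gets no RoleBinding in prod namespaces.
    if role == "developer" and env == "prod":
        return None
    return {
        "apiVersion": f"{RBAC_API_GROUP}/v1",
        "kind": "RoleBinding",
        # Assumption: bindings are named {group}-{role}.
        "metadata": {"name": f"{group}-{role}", "namespace": namespace},
        # FR6: Group subjects only, never individual User subjects.
        "subjects": [
            {"kind": "Group", "name": group, "apiGroup": RBAC_API_GROUP}
        ],
        # Assumption: each platform role maps to a ClusterRole of the same name.
        "roleRef": {"kind": "ClusterRole", "name": role, "apiGroup": RBAC_API_GROUP},
    }
```

Calling `role_binding("team-cart", "developer", "shop-prod-cart", "prod")` returns `None`, so nothing is written to the GitOps repo for that combination.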
FR7 — Backstage Scaffolder Templates
| Template | Triggers | Output |
|---|---|---|
| `create-domain` | Onboarding new product area | Domain entity, `{domain}-gitops` repo, AppProject |
| `create-system` | Adding a new product grouping | System entity, ApplicationSet, TechDocs base |
| `create-service` | Adding a new workload | New `{app}-repo` (CI/CD, TechDocs), k8s manifests per env, Component entity, AppSet element |
| `create-resource` | Requesting cloud infrastructure | Crossplane Claims per env, Resource entities |
| `create-secret` | Encrypting sensitive data | SealedSecret manifests securely pushed to domain-gitops |
| `create-group` | Onboarding a new team | Group entity, RBAC bindings, ArgoCD roles |
| `create-user` | Onboarding a new engineer | User entity, RBAC bindings, ArgoCD user |
All templates use EntityPicker to enforce referential integrity — a child entity cannot be created without its parent already existing in the catalog.
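
The referential-integrity rule amounts to a parent lookup before creation. The sketch below models that check outside Backstage. The parent mapping (System under Domain, Component and Resource under System) and the `kind:namespace/name` reference strings are assumptions for illustration, and `can_create` is a hypothetical helper, not a Backstage API.

```python
# Assumed hierarchy: which kind of parent each child kind must reference.
PARENT_KIND = {"System": "domain", "Component": "system", "Resource": "system"}


def can_create(kind, parent_ref, catalog):
    """Allow creation only if the required parent already exists in the catalog."""
    expected = PARENT_KIND.get(kind)
    if expected is None:
        return True  # kinds without a parent in this sketch, e.g. Domain
    if parent_ref is None:
        return False
    # The reference must be of the right kind and already registered.
    return parent_ref.startswith(f"{expected}:") and parent_ref in catalog


# Hypothetical catalog contents.
catalog = {"domain:default/payments", "system:default/checkout"}
```

With this catalog, `can_create("Component", "system:default/checkout", catalog)` is allowed, while a Component pointing at an unregistered system is rejected.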
FR8 — Convention Validation CI
- Every domain repo contains `.github/workflows/validate-conventions.yaml` generated by `create-domain`
- Checks: kubeconform schema validation, namespace naming pattern via `validate-namespaces.sh`, all 9 required labels, resource requests/limits on all containers, ArgoCD dry-run diff against dev cluster
- Convention violations fail the PR; merge is blocked
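
As one example of these checks, the requests/limits rule can be tested against a parsed workload manifest. This is a sketch that assumes the manifest is already loaded as a dict and has a pod template at `spec.template.spec` (as a Deployment does). Treating `initContainers` as in scope is also an assumption, and the function name is hypothetical.

```python
def containers_missing_resources(manifest):
    """Return names of containers lacking resource requests or limits."""
    pod_spec = manifest.get("spec", {}).get("template", {}).get("spec", {})
    # Assumption: init containers count toward "all containers".
    containers = pod_spec.get("containers", []) + pod_spec.get("initContainers", [])
    offenders = []
    for container in containers:
        resources = container.get("resources") or {}
        if not resources.get("requests") or not resources.get("limits"):
            offenders.append(container.get("name", "<unnamed>"))
    return offenders


# Hypothetical Deployment fragment: "sidecar" has no limits.
deployment = {
    "kind": "Deployment",
    "spec": {"template": {"spec": {"containers": [
        {"name": "app", "resources": {
            "requests": {"cpu": "100m"}, "limits": {"cpu": "500m"}}},
        {"name": "sidecar", "resources": {"requests": {"cpu": "50m"}}},
    ]}}},
}
```

A CI step could fail the PR whenever the returned list is non-empty and print the offending container names, which keeps the error message actionable.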
Non-Functional Requirements
NFR1 — Reliability
- ArgoCD: 99.9% uptime on management cluster
- Backstage: 99.5% uptime; degraded functionality acceptable during outages
- Crossplane: continuous reconciliation loop — drift corrected within 5 minutes of detection
NFR2 — Performance
- Backstage catalog page load: < 2 seconds (p95)
- ArgoCD sync latency: < 60 seconds from push to namespace update
- `create-service` template run time: < 3 minutes end-to-end
NFR3 — Security
- No secrets in plain text in Git; self-service Sealed Secrets via the `create-secret` template
- Network isolation enforced by NetworkPolicy on every namespace (deny all by default)
- Pod security defaults: `runAsNonRoot`, `readOnlyRootFilesystem`, `allowPrivilegeEscalation: false`
- Prod sync windows: automated syncs denied outside business hours
- OIDC authentication for Backstage, ArgoCD, and Headlamp
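
The sync-window rule reduces to a time predicate. The sketch below assumes business hours of 09:00 to 17:00, Monday through Friday, in whatever time zone the caller passes in; the source does not define the exact hours, so both constants are placeholders.

```python
from datetime import datetime, time

BUSINESS_START = time(9, 0)   # assumed start of business hours
BUSINESS_END = time(17, 0)    # assumed end of business hours


def automated_prod_sync_allowed(now):
    """True only on weekdays inside the assumed business hours."""
    if now.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return False
    return BUSINESS_START <= now.time() < BUSINESS_END


# Monday morning is inside the window; Monday evening and Saturday are not.
monday_morning = automated_prod_sync_allowed(datetime(2024, 1, 8, 10, 0))
monday_evening = automated_prod_sync_allowed(datetime(2024, 1, 8, 19, 0))
saturday = automated_prod_sync_allowed(datetime(2024, 1, 6, 10, 0))
```

Keeping the hours in two constants makes the policy easy to adjust once the real window is agreed.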
NFR4 — Scalability
- Platform must support 20+ domains, 100+ systems, 500+ services without architectural changes
- ApplicationSet matrix generator scales linearly — no manual Application creation needed
- Crossplane git-directory generator auto-discovers new Claims without ApplicationSet changes
NFR5 — Observability
- All platform services instrumented with Prometheus metrics
- Logs shipped to central Loki via Alloy on every cluster
- Backstage k8s plugin surfaces pod health and recent events per component per env
- Crossplane Claim status (READY/SYNCED) surfaced on Backstage Resource pages
NFR6 — Developer Experience
- P1 (Product Engineer) can create a new service with zero platform team interaction
- All required knowledge is encoded in templates — no external documentation required during normal workflow
- Convention validation CI provides actionable error messages