An optimization that skips deleting the last Pod when preempting N Pods if the first N-1 Pod deletions have failed. Early termination when N-1 async preemptions fail was already implemented, so this was the remaining optimization in that area.
Tip
If the first N-1 deletions have already failed, there's no point in deleting just the last one. It's meaningless because even if we delete the Nth Pod, the node will still have insufficient resources, and the deletion only adds scheduling overhead.
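To make the idea concrete, here is a minimal Go sketch of that check. This is not the actual kube-scheduler code; `deleteVictims` and its overall shape are my own illustration.

```go
package preemption

import (
	"context"
	"fmt"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// deleteVictims deletes the victim Pods chosen for a preemption.
// The helper name and structure are assumptions for illustration.
func deleteVictims(ctx context.Context, cs kubernetes.Interface, victims []*v1.Pod) error {
	failures := 0
	for i, victim := range victims {
		// The optimization: if the first N-1 deletions have all failed,
		// skip the Nth. Deleting one victim alone still leaves the node
		// with insufficient resources, so the API call is pure overhead.
		if len(victims) > 1 && i == len(victims)-1 && failures == len(victims)-1 {
			return fmt.Errorf("skipping last victim %s/%s: all %d prior deletions failed",
				victim.Namespace, victim.Name, failures)
		}
		if err := cs.CoreV1().Pods(victim.Namespace).Delete(ctx, victim.Name, metav1.DeleteOptions{}); err != nil {
			failures++
		}
	}
	if failures > 0 {
		return fmt.Errorf("%d of %d victim deletions failed", failures, len(victims))
	}
	return nil
}
```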
While reading through this, I found some areas in the metrics that could be improved, so I created an issue: "scheduler_preemption_victims differs between sync and async preemption".
I also took the opportunity to read the Async Preemption KEP. Indeed, preemption is slow.
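For reference, the scheduler_perf test case for preemption looks like this (taken, I believe, from test/integration/scheduler_perf in the Kubernetes repo):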
```yaml
- name: PreemptionBasic
  workloadTemplate:
  - opcode: createNodes
    countParam: $initNodes
  - opcode: createPods
    countParam: $initPods
    podTemplatePath: config/templates/pod-low-priority.yaml
  - opcode: createPods
    countParam: $measurePods
    podTemplatePath: config/templates/pod-high-priority.yaml
    collectMetrics: true
  workloads:
  - name: 5Nodes
    labels: [integration-test, fast, short]
    params:
      initNodes: 5
      initPods: 20
      measurePods: 5
  - name: 500Nodes
    labels: [performance, fast]
    threshold: 18
    params:
      initNodes: 500
      initPods: 2000
      measurePods: 500
```

I felt that having a FeatureGate makes features easier to understand.
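On that last point: SchedulerAsyncPreemption is a real feature gate, and the gated branch is easy to spot when reading the code. A minimal sketch, assuming the standard utilfeature helper (the surrounding function is hypothetical):

```go
package preemption

import (
	utilfeature "k8s.io/apiserver/pkg/util/feature"
	"k8s.io/kubernetes/pkg/features"
)

// preemptionIsAsync is a hypothetical helper showing the gate check.
func preemptionIsAsync() bool {
	// Gate on: victim deletions run in a goroutine and the scheduling
	// cycle moves on. Gate off: the cycle blocks until deletions finish.
	return utilfeature.DefaultFeatureGate.Enabled(features.SchedulerAsyncPreemption)
}
```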