An optimization that skips deleting the last Pod when preempting N Pods if the first N-1 Pod deletions have failed. Early termination when N-1 async preemptions fail was already implemented, so this was the remaining optimization in that area.
Tip
If the first N-1 deletions have already failed, there's no point in deleting just the last one. It's meaningless because even if we delete the Nth Pod, the node will still have insufficient resources, and the deletion only adds scheduling overhead.
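To make the idea concrete, here is a minimal Go sketch of that check. This is not the actual kube-scheduler code; `deleteVictims` and its overall shape are my own illustration.

```go
package preemption

import (
	"context"
	"fmt"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// deleteVictims deletes the victim Pods chosen for a preemption.
// The helper name and structure are assumptions for illustration.
func deleteVictims(ctx context.Context, cs kubernetes.Interface, victims []*v1.Pod) error {
	failures := 0
	for i, victim := range victims {
		// The optimization: if the first N-1 deletions have all failed,
		// skip the Nth. Deleting one victim alone still leaves the node
		// with insufficient resources, so the API call is pure overhead.
		if len(victims) > 1 && i == len(victims)-1 && failures == len(victims)-1 {
			return fmt.Errorf("skipping last victim %s/%s: all %d prior deletions failed",
				victim.Namespace, victim.Name, failures)
		}
		if err := cs.CoreV1().Pods(victim.Namespace).Delete(ctx, victim.Name, metav1.DeleteOptions{}); err != nil {
			failures++
		}
	}
	if failures > 0 {
		return fmt.Errorf("%d of %d victim deletions failed", failures, len(victims))
	}
	return nil
}
```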
While reading through this, I found some areas in the metrics that could be improved, so I created an issue: "scheduler_preemption_victims differs between sync and async preemption".
I also took the opportunity to read the Async Preemption KEP. Indeed, preemption is slow.
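For reference, the scheduler_perf test case for preemption looks like this (taken, I believe, from test/integration/scheduler_perf in the Kubernetes repo):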
```yaml
- name: PreemptionBasic
  workloadTemplate:
  - opcode: createNodes
    countParam: $initNodes
  - opcode: createPods
    countParam: $initPods
    podTemplatePath: config/templates/pod-low-priority.yaml
  - opcode: createPods
    countParam: $measurePods
    podTemplatePath: config/templates/pod-high-priority.yaml
    collectMetrics: true
  workloads:
  - name: 5Nodes
    labels: [integration-test, fast, short]
    params:
      initNodes: 5
      initPods: 20
      measurePods: 5
  - name: 500Nodes
    labels: [performance, fast]
    threshold: 18
    params:
      initNodes: 500
      initPods: 2000
      measurePods: 500
```

I felt that having a FeatureGate makes features easier to understand.
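On that last point: SchedulerAsyncPreemption is a real feature gate, and the gated branch is easy to spot when reading the code. A minimal sketch, assuming the standard utilfeature helper (the surrounding function is hypothetical):

```go
package preemption

import (
	utilfeature "k8s.io/apiserver/pkg/util/feature"
	"k8s.io/kubernetes/pkg/features"
)

// preemptionIsAsync is a hypothetical helper showing the gate check.
func preemptionIsAsync() bool {
	// Gate on: victim deletions run in a goroutine and the scheduling
	// cycle moves on. Gate off: the cycle blocks until deletions finish.
	return utilfeature.DefaultFeatureGate.Enabled(features.SchedulerAsyncPreemption)
}
```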