
Note for Advent Calendar readers

This is Day 20 of the Homebrew OS Advent Calendar 2019. It is part of my xv6 explanation series; I picked scheduling for this event because the topic is self-contained. For why xv6, how the series works, or how to run it, please read the introduction.

This chapter covers:

How context switching works (this post)
How the scheduler switches processes (this post)
How sleep/wakeup switches processes (next post)

Scheduling sequence diagram

Here’s the rough flow from a timer interrupt to the scheduler picking the next process. The roles rotate: the process that is “new” in one switch is the “old” one in a later switch. A process therefore resumes right after the swtch() call it made when it was last switched out as the old process.

Code: Context switching

This section explains how xv6 switches between processes. Key points:

Each CPU has its own scheduler.
Each process keeps its own kernel stack.

Source: commentary/textbook

The figure shows, at a high level, two user processes (shell and cat) being switched by the scheduler.

The shell process enters its kernel thread via a system call or interrupt (e.g., timer interrupt).
Save the shell process context.
Context switch to the scheduler thread.
Run scheduling.
Context switch to the cat process’s kernel thread.
Restore its context and continue in cat.
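
The flow above can be imitated in user space. The following is only a rough sketch, not xv6 code: it assumes Linux with glibc and uses the `<ucontext.h>` functions `getcontext`, `makecontext`, and `swapcontext` in place of swtch(). The names `run_demo`, `task_body`, `log_event`, and `trace` are made up for this illustration. The point to observe is that each task continues right after the `swapcontext` call it made earlier, just as an xv6 kernel thread continues right after its earlier swtch().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <ucontext.h>

#define NTASK 2
#define STACK_SIZE (64 * 1024)

static ucontext_t sched_ctx;            /* stands in for the per-CPU scheduler thread */
static ucontext_t task_ctx[NTASK];      /* stand in for the kernel threads of shell and cat */
static char task_stack[NTASK][STACK_SIZE];
static const char *task_name[NTASK] = { "shell", "cat" };
static char trace[256];                 /* space-separated log of events */

static void log_event(const char *who, const char *what)
{
  size_t used = strlen(trace);
  snprintf(trace + used, sizeof trace - used, "%s%s:%s",
           used ? " " : "", who, what);
}

/* Each task runs a little, then hands the CPU back to the scheduler. */
static void task_body(int id)
{
  for (int round = 0; round < 2; round++) {
    log_event(task_name[id], "run");
    /* Roughly yield() -> sched() -> swtch(): save this task, resume the scheduler. */
    swapcontext(&task_ctx[id], &sched_ctx);
    /* When the scheduler picks this task again, execution continues here. */
  }
  log_event(task_name[id], "done");
  /* Returning resumes uc_link, which is the scheduler context. */
}

/* Hypothetical driver: a tiny round-robin "scheduler" over the two tasks. */
const char *run_demo(void)
{
  trace[0] = '\0';
  for (int id = 0; id < NTASK; id++) {
    int rc = getcontext(&task_ctx[id]);
    assert(rc == 0);
    (void)rc;
    task_ctx[id].uc_stack.ss_sp = task_stack[id];   /* each task has its own stack */
    task_ctx[id].uc_stack.ss_size = STACK_SIZE;
    task_ctx[id].uc_link = &sched_ctx;
    makecontext(&task_ctx[id], (void (*)(void))task_body, 1, id);
  }
  /* Three passes: two "run" slices per task, then the final return. */
  for (int pass = 0; pass < 3; pass++)
    for (int id = 0; id < NTASK; id++)
      swapcontext(&sched_ctx, &task_ctx[id]);
  return trace;
}
```

Calling `run_demo()` from a test or a `main()` returns the event log. Unlike xv6, nothing here is preemptive: the tasks give up the CPU voluntarily, whereas xv6 forces the switch from the timer interrupt.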

swtch()

This function saves the current thread's context and restores another thread's context. Switching threads essentially means changing %eip and %esp.

swtch.S

.globl swtch
swtch:
 movl 4(%esp), %eax
 movl 8(%esp), %edx

 # Save old callee-saved registers
 pushl %ebp
 pushl %ebx
 pushl %esi
 pushl %edi

 # Switch stacks
 movl %esp, (%eax)
 movl %edx, %esp

 # Load new callee-saved registers
 popl %edi
 popl %esi
 popl %ebx
 popl %ebp
 ret

L11-12: Load the two arguments, struct context **old and struct context *new, into %eax and %edx.

proc.h

struct context {
 uint edi;
 uint esi;
 uint ebx;
 uint ebp;
 uint eip;
};

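L14-18: Save the current thread's callee-saved registers on its own stack.

eip is not pushed here: the call instruction that entered swtch() already pushed the return address, which occupies the eip slot of struct context.

L21: Store the current stack pointer, which now points at the saved context, into *old.
L22: Set the stack pointer to new.
L24-29: Restore the new thread's context.

ret pops the saved eip, so the new thread continues from there, normally right after its own earlier call to swtch().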
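
To see why the pushes line up with struct context, here is a small portable model of the save half of swtch(). It is a sketch under assumptions, not xv6 code: the stack is a plain `uint32_t` array, and `struct regs`, `push32`, `save_like_swtch`, and `demo_saved_context` are names invented for this example. Only the field order of `struct context` is taken from proc.h.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same field order as xv6's struct context in proc.h. */
struct context {
  uint32_t edi;
  uint32_t esi;
  uint32_t ebx;
  uint32_t ebp;
  uint32_t eip;
};

/* Hypothetical snapshot of the callee-saved registers of the old thread. */
struct regs {
  uint32_t edi, esi, ebx, ebp;
};

/* Push one 32-bit value on a stack that grows toward lower addresses. */
static uint32_t *push32(uint32_t *sp, uint32_t value)
{
  *--sp = value;
  return sp;
}

/* Model of what the stack looks like once swtch() has finished saving. */
static uint32_t *save_like_swtch(uint32_t *sp, uint32_t return_addr, struct regs r)
{
  sp = push32(sp, return_addr);  /* pushed by the call instruction */
  sp = push32(sp, r.ebp);        /* pushl %ebp */
  sp = push32(sp, r.ebx);        /* pushl %ebx */
  sp = push32(sp, r.esi);        /* pushl %esi */
  sp = push32(sp, r.edi);        /* pushl %edi */
  return sp;                     /* this value is what gets stored into *old */
}

/* Build a fake stack, save onto it, and read the result as a struct context. */
struct context demo_saved_context(void)
{
  uint32_t stack[16];
  struct regs r = { .edi = 0xD1, .esi = 0x51, .ebx = 0xB0, .ebp = 0xB9 };
  uint32_t *sp = save_like_swtch(stack + 16, 0x80103000u, r);

  struct context ctx;
  memcpy(&ctx, sp, sizeof ctx);  /* lowest address holds edi, highest holds eip */
  return ctx;
}
```

Because the last register pushed (edi) ends up at the lowest address, the saved stack pointer can be treated as a pointer to a struct context whose first field is edi and whose last field is eip. The register values and the return address above are arbitrary placeholders.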

Code: Scheduling

trap()

On a timer interrupt, trap() calls yield() if a process is currently running on this CPU.

trap.c

void
trap(struct trapframe *tf)
~~~
 // Force process to give up CPU on clock tick.
 // If interrupts were on while locks held, would need to check nlock.
 if(myproc() && myproc()->state == RUNNING &&
    tf->trapno == T_IRQ0+IRQ_TIMER)
   yield();

yield()

proc.c

// Give up the CPU for one scheduling round.
void
yield(void)
{
 acquire(&ptable.lock);  //DOC: yieldlock
 myproc()->state = RUNNABLE;
 sched();
 release(&ptable.lock);
}

L388: Acquire ptable lock.
L389: Mark the current process RUNNABLE instead of RUNNING.
L390: Call sched().
L391: Release ptable lock (execution resumes here after switching back).

sched()

proc.c

void
sched(void)
{
 if(!holding(&ptable.lock))
   panic("sched ptable.lock");
~~~
 intena = mycpu()->intena;
 swtch(&p->context, mycpu()->scheduler);
 mycpu()->intena = intena;
}

L371: Ensure the lock is held so other CPUs can’t change process state.
L380: Context switch to the scheduler.

About the ptable lock:

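yield() acquires it before calling sched() and releases it after sched() returns.
sched() does not take the lock itself: it requires the caller to already hold it and to release it afterwards. Holding a lock across a context switch is unusual, but it is necessary here.
During swtch() the invariants on process state do not hold. Without the lock, another CPU could see process A as RUNNABLE and start running it while this CPU is still executing on A's kernel stack mid-switch, so two CPUs would use the same stack.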
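
The window described above can be modeled in a few lines. This is a single-threaded toy, not real concurrency and not xv6 code: `toy_proc`, `on_kstack`, `ptable_locked`, `other_cpu_would_run`, and `unsafe_picks_during_yield` are all hypothetical names, and the "other CPU" is just a function called at the critical moment.

```c
#include <assert.h>
#include <stdbool.h>

enum procstate { RUNNABLE, RUNNING };

struct toy_proc {
  enum procstate state;
  bool on_kstack;   /* true while some CPU still executes on this kernel stack */
};

static bool ptable_locked;   /* toy stand-in for ptable.lock */

/* What a scheduler on another CPU could do at this instant. */
static bool other_cpu_would_run(const struct toy_proc *p)
{
  if (ptable_locked)
    return false;               /* it would still be spinning in acquire() */
  return p->state == RUNNABLE;
}

/* Returns how many times another CPU could start p while its stack is in use. */
int unsafe_picks_during_yield(bool use_lock)
{
  struct toy_proc p = { RUNNING, true };
  int unsafe = 0;

  ptable_locked = use_lock;     /* yield(): acquire(&ptable.lock) */
  p.state = RUNNABLE;           /* yield(): state = RUNNABLE */

  /* Window inside sched()/swtch(): p looks runnable but is still on its stack. */
  if (other_cpu_would_run(&p) && p.on_kstack)
    unsafe++;

  p.on_kstack = false;          /* swtch() has moved this CPU to the scheduler stack */
  ptable_locked = false;        /* scheduler(): release(&ptable.lock) */

  /* From here on, picking p is safe. */
  assert(other_cpu_would_run(&p) && !p.on_kstack);
  return unsafe;
}
```

With the lock held across the switch the function reports no unsafe pick; without it, the other CPU gets one chance to run the process on a stack that is still in use.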

scheduler()

Now the main scheduling loop: find the next process and switch to it.

proc.c

void
scheduler(void)
{
 struct proc *p;
 struct cpu *c = mycpu();
 c->proc = 0;

 for(;;){
   // Enable interrupts on this processor.
   sti();

   // Loop over process table looking for process to run.
   acquire(&ptable.lock);
   for(p = ptable.proc; p < &ptable.proc[NPROC]; p++){
     if(p->state != RUNNABLE)
       continue;

     // Switch to chosen process.  It is the process's job
     // to release ptable.lock and then reacquire it
     // before jumping back to us.
     c->proc = p;
     switchuvm(p);
     p->state = RUNNING;

     swtch(&(c->scheduler), p->context);
     switchkvm();

     // Process is done running for now.
     // It should have changed its p->state before coming back.
     c->proc = 0;
   }
   release(&ptable.lock);

 }
}

L329: Infinite loop.
L335-337: Search for a RUNNABLE process p.
L343: switchuvm(p) points the TSS (Task State Segment), loaded into TR, at p's kernel stack so that interrupts use that stack, and switches to p's page table.
L342: Set c->proc so myproc() returns the running process.

proc.c

struct proc*
myproc(void) {
 struct cpu *c;
 struct proc *p;
 pushcli();
 c = mycpu();
 p = c->proc;
 popcli();
 return p;
}

L346: Context switch to p.
L347: Switch back to the kernel page table (switchkvm()).
L351: When back in the scheduler, clear c->proc.
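
The scan order of the loop can be tried out with a toy process table. This is a simplified sketch, not xv6 code: there is no real context switch, each chosen process "runs" for one timer tick, and `toy_proc`, `ticks_left`, and `scheduler_pass` are names assumed for the example.

```c
#include <assert.h>
#include <stddef.h>

#define NPROC 4

enum procstate { UNUSED, SLEEPING, RUNNABLE, RUNNING, ZOMBIE };

struct toy_proc {
  int pid;
  enum procstate state;
  int ticks_left;   /* hypothetical: timer ticks of work remaining */
};

/*
 * One pass of the inner for loop of scheduler(). Every RUNNABLE entry is
 * chosen in table order and runs for one tick. The pid of each chosen
 * process is appended to order[]; the new number of entries is returned.
 */
size_t scheduler_pass(struct toy_proc table[NPROC],
                      int order[], size_t cap, size_t n)
{
  for (struct toy_proc *p = table; p < &table[NPROC]; p++) {
    if (p->state != RUNNABLE)
      continue;                 /* skip UNUSED, SLEEPING, ZOMBIE entries */

    p->state = RUNNING;         /* scheduler(): p->state = RUNNING, then swtch() */
    if (n < cap)
      order[n++] = p->pid;

    /* The process runs until the next timer tick, then control comes back. */
    p->ticks_left--;
    p->state = (p->ticks_left > 0) ? RUNNABLE   /* yield() */
                                   : ZOMBIE;    /* finished its work */
  }
  return n;
}
```

For example, with a table holding pid 1 (RUNNABLE, 2 ticks), pid 2 (SLEEPING), and pid 3 (RUNNABLE, 1 tick), two passes choose pids 1, 3, and then 1 again. The sleeping process is never chosen because nothing in this toy wakes it up.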

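L334/353: Hold the ptable lock only while scanning, and release it after every pass, even when no RUNNABLE process was found.

If an idle scheduler kept looping with the lock held, other CPUs could not context-switch or run any system call that needs ptable.lock.
In particular, no CPU could mark a process RUNNABLE, so the idle CPU would never find work.
The same lock also protects the allocation of pids and process table slots, which prevents duplicate pids and doubly allocated entries.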

L331: Enable interrupts.

Processes such as the shell are often blocked waiting for I/O. If an idle CPU kept interrupts disabled, the device interrupt that signals completion would never be handled, and the wait would never end.
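
A very small model makes the effect of sti() visible. It is a sketch with made-up names (`idle_until_runnable`, `io_done_at`, `max_iters`), not xv6 code: the device "interrupt" is just a condition checked once per loop iteration, and it can only change the process state when interrupts are enabled.

```c
#include <assert.h>
#include <stdbool.h>

enum procstate { SLEEPING, RUNNABLE };

/*
 * Model of an idle scheduler loop with one process blocked on I/O.
 * The I/O completes at iteration io_done_at. Returns the iteration at
 * which the scheduler finds the process RUNNABLE, or -1 if it never
 * does within max_iters iterations.
 */
int idle_until_runnable(bool interrupts_enabled, int io_done_at, int max_iters)
{
  enum procstate shell = SLEEPING;

  for (int i = 0; i < max_iters; i++) {
    /* sti(): a pending device interrupt is delivered only if enabled. */
    if (interrupts_enabled && i >= io_done_at)
      shell = RUNNABLE;         /* the interrupt handler wakes the sleeper */

    /* Scan of the process table, reduced to a single entry. */
    if (shell == RUNNABLE)
      return i;
  }
  return -1;                    /* still idle: nothing ever became RUNNABLE */
}
```

With interrupts enabled the loop finds the process as soon as the I/O completes; with interrupts disabled it stays idle for all `max_iters` iterations.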

We traced xv6 from context switching through scheduling. Stepping through the code with gdb while reading it makes it more fun! Tomorrow’s Advent Calendar post is by SugarHigh_bin: “Concurrent symbolic fuzzing with memory model awareness.”

If you have comments or corrections, please contact me on Twitter or open an Issue.