[LinuxKernel][Spinlock] spin_lock(): in-depth code review
Now, let me look into the spinlock implementation in more detail.
After analyzing the assembly code, we can see that R2 is the key debugging signature: it holds the original lock value, i.e., the caller's ticket (tickets.next) before the increment.
Code Review: arch_spin_lock(part 1) -> If the spinlock is acquired successfully.
Let me assume tickets.next=0x0, tickets.owner=0x0 (the spinlock is not held) before _raw_spin_lock() is executed.
[1]: R2 is loaded with the lock value via ldrex; R0 holds the (raw_spinlock_t *)lock pointer.
[2]: Add 0x10000 to R2, which increments struct raw_spinlock_t.raw_lock.tickets.next (the upper halfword), and save the result into R3 (tickets.next becomes 0x1).
[3]: Store the incremented value in R3 back into the lock with strex.
If the "strex" instruction succeeds, R1=0x0; otherwise, R1=0x1.
[4]: If R1 is 0x0, fall through to 0xc0ee8c60; otherwise, retry from [1] at 0xc0ee8c4c.
[5]: R3 (0x0) receives struct raw_spinlock_t.raw_lock.tickets.owner, extracted from R2 (the original lock value) with uxth.
Please be reminded that R2 holds the original lock value read by ldrex at [1], not the pointer (the pointer stays in R0).
[6]: R2 (0x0) receives struct raw_spinlock_t.raw_lock.tickets.next, extracted from R2 with ubfx.
[7]: Compare R2 (tickets.next) with R3 (tickets.owner).
[8]: In this case, tickets.next == tickets.owner (the spinlock was not held before this function was called), so jump to 0xc0ee8c80 to terminate the function.
0xc0ee8c30 <_raw_spin_lock>:       mov r2, sp
0xc0ee8c34 <_raw_spin_lock+0x4>:   bic r3, r2, #8128   ; 0x1fc0
0xc0ee8c38 <_raw_spin_lock+0x8>:   bic r3, r3, #63     ; 0x3f
0xc0ee8c3c <_raw_spin_lock+0xc>:   ldr r2, [r3, #4]
0xc0ee8c40 <_raw_spin_lock+0x10>:  add r2, r2, #1
0xc0ee8c44 <_raw_spin_lock+0x14>:  str r2, [r3, #4]
0xc0ee8c48 <_raw_spin_lock+0x18>:  pldw [r0]
0xc0ee8c4c <_raw_spin_lock+0x1c>:  ldrex r2, [r0]              //<<--[1]
0xc0ee8c50 <_raw_spin_lock+0x20>:  add r3, r2, #65536 ; 0x10000 //<<--[2]
0xc0ee8c54 <_raw_spin_lock+0x24>:  strex r1, r3, [r0]          //<<--[3]
0xc0ee8c58 <_raw_spin_lock+0x28>:  teq r1, #0                  //<<--[4]
0xc0ee8c5c <_raw_spin_lock+0x2c>:  bne 0xc0ee8c4c <_raw_spin_lock+28>
0xc0ee8c60 <_raw_spin_lock+0x30>:  uxth r3, r2                 //<<--[5]
0xc0ee8c64 <_raw_spin_lock+0x34>:  ubfx r2, r2, #16, #16       //<<--[6]
0xc0ee8c68 <_raw_spin_lock+0x38>:  cmp r2, r3                  //<<--[7]
0xc0ee8c6c <_raw_spin_lock+0x3c>:  beq 0xc0ee8c80 <_raw_spin_lock+80> //<<--[8]
0xc0ee8c70 <_raw_spin_lock+0x40>:  wfe
0xc0ee8c74 <_raw_spin_lock+0x44>:  ldrh r3, [r0]
0xc0ee8c78 <_raw_spin_lock+0x48>:  uxth r3, r3
0xc0ee8c7c <_raw_spin_lock+0x4c>:  b 0xc0ee8c68 <_raw_spin_lock+56>
0xc0ee8c80 <_raw_spin_lock+0x50>:  dmb ish
0xc0ee8c84 <_raw_spin_lock+0x54>:  bx lr
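The ldrex/strex sequence at [1]-[4] is, in effect, an atomic fetch-and-add on tickets.next. Here is a minimal user-space sketch of that fast path using C11 atomics; the struct layout and helper names are my own assumptions for illustration, not the kernel's actual types:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical model of the ARM ticket lock word:
 * owner in bits 0-15, next in bits 16-31. */
struct ticket_lock {
    _Atomic uint32_t slock;
};

/* Model of [1]-[4]: the ldrex/strex retry loop behaves like one
 * atomic fetch-and-add of 0x10000 (i.e., next += 1).
 * Returns the value *before* the add, like R2 after ldrex. */
static uint32_t take_ticket(struct ticket_lock *lock)
{
    return atomic_fetch_add(&lock->slock, 1u << 16);
}

/* [6] ubfx: extract the upper halfword (tickets.next). */
static uint16_t ticket_next(uint32_t v)  { return (uint16_t)(v >> 16); }

/* [5] uxth: extract the lower halfword (tickets.owner). */
static uint16_t ticket_owner(uint32_t v) { return (uint16_t)v; }
```

If the ticket returned by take_ticket() has next == owner, the caller got the lock at once, which matches the beq at [8].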
Code Review: arch_spin_lock(part 2) -> If the spinlock is held by someone else, the caller waits for it to be released.
Let me assume tickets.next=0x1, tickets.owner=0x0(the spinlock is already held) before _raw_spin_lock() is executed.
[1]: R2 is loaded with the lock value via ldrex; R0 holds the (raw_spinlock_t *)lock pointer.
[2]: Add 0x10000 to R2, which increments struct raw_spinlock_t.raw_lock.tickets.next, and save the result into R3 (tickets.next becomes 0x2).
[3]: Store the incremented value back into the lock with strex (raw_spinlock_t.raw_lock.tickets.next is updated to 0x2).
If the "strex" instruction succeeds, R1=0x0; otherwise, R1=0x1.
[4]: If R1 is 0x0, fall through to 0xc0ee8c60; otherwise, retry from [1] at 0xc0ee8c4c.
[5]: R3 (0x0) receives struct raw_spinlock_t.raw_lock.tickets.owner, extracted from R2 (the original lock value) with uxth.
Please be reminded that R2 holds the original lock value read by ldrex at [1], not the pointer (the pointer stays in R0).
[6]: R2 (0x1) receives struct raw_spinlock_t.raw_lock.tickets.next, extracted from R2 with ubfx.
[7]: Compare R2 (tickets.next: 0x1) with R3 (tickets.owner: 0x0).
[8]: tickets.next > tickets.owner means the spinlock is already held, so this waiting path is executed: after wfe, R3 is reloaded (ldrh) from the (raw_spinlock_t *)lock **instance** in memory, which can be updated by other CPUs.
[9]: R3 is updated with the current struct raw_spinlock_t.raw_lock.tickets.owner, and then the code jumps back to [7] at 0xc0ee8c68.
0xc0ee8c30 <_raw_spin_lock>:       mov r2, sp
0xc0ee8c34 <_raw_spin_lock+0x4>:   bic r3, r2, #8128   ; 0x1fc0
0xc0ee8c38 <_raw_spin_lock+0x8>:   bic r3, r3, #63     ; 0x3f
0xc0ee8c3c <_raw_spin_lock+0xc>:   ldr r2, [r3, #4]
0xc0ee8c40 <_raw_spin_lock+0x10>:  add r2, r2, #1
0xc0ee8c44 <_raw_spin_lock+0x14>:  str r2, [r3, #4]
0xc0ee8c48 <_raw_spin_lock+0x18>:  pldw [r0]
0xc0ee8c4c <_raw_spin_lock+0x1c>:  ldrex r2, [r0]              //<<--[1]
0xc0ee8c50 <_raw_spin_lock+0x20>:  add r3, r2, #65536 ; 0x10000 //<<--[2]
0xc0ee8c54 <_raw_spin_lock+0x24>:  strex r1, r3, [r0]          //<<--[3]
0xc0ee8c58 <_raw_spin_lock+0x28>:  teq r1, #0                  //<<--[4]
0xc0ee8c5c <_raw_spin_lock+0x2c>:  bne 0xc0ee8c4c <_raw_spin_lock+28>
0xc0ee8c60 <_raw_spin_lock+0x30>:  uxth r3, r2                 //<<--[5]
0xc0ee8c64 <_raw_spin_lock+0x34>:  ubfx r2, r2, #16, #16       //<<--[6]
0xc0ee8c68 <_raw_spin_lock+0x38>:  cmp r2, r3                  //<<--[7]
0xc0ee8c6c <_raw_spin_lock+0x3c>:  beq 0xc0ee8c80 <_raw_spin_lock+80>
0xc0ee8c70 <_raw_spin_lock+0x40>:  wfe
0xc0ee8c74 <_raw_spin_lock+0x44>:  ldrh r3, [r0]               //<<--[8]
0xc0ee8c78 <_raw_spin_lock+0x48>:  uxth r3, r3                 //<<--[9]
0xc0ee8c7c <_raw_spin_lock+0x4c>:  b 0xc0ee8c68 <_raw_spin_lock+56>
0xc0ee8c80 <_raw_spin_lock+0x50>:  dmb ish
0xc0ee8c84 <_raw_spin_lock+0x54>:  bx lr
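Putting both paths together, the whole of arch_spin_lock() boils down to "take a ticket, then spin until owner catches up". A minimal self-contained sketch in C11 atomics (hypothetical type and function names; the real kernel code also uses wfe/sev to sleep and wake waiters, which is elided here):

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical model: owner in bits 0-15, next in bits 16-31. */
struct ticket_lock {
    _Atomic uint32_t slock;
};

/* Model of the full arch_spin_lock flow: take a ticket ([1]-[4]),
 * then keep reloading owner ([7]-[9]) until it equals our ticket. */
static void ticket_lock_acquire(struct ticket_lock *lock)
{
    uint32_t old = atomic_fetch_add(&lock->slock, 1u << 16);
    uint16_t my_ticket = (uint16_t)(old >> 16);  /* R2 after [6] */

    /* [7]-[9]: ldrh reloads owner from memory each iteration. */
    while ((uint16_t)atomic_load(&lock->slock) != my_ticket)
        ;  /* busy-wait (the real code sleeps in wfe here) */
}

/* Model of arch_spin_unlock: increment owner (low halfword). */
static void ticket_lock_release(struct ticket_lock *lock)
{
    atomic_fetch_add(&lock->slock, 1u);
}
```

Because each caller gets a unique ticket and owner is handed over in order, waiters acquire the lock strictly first-come, first-served, which is the point of the ticket design.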
Code Review: arch_spin_lock(part 2.1): while the process is waiting for the spinlock to be released.
Run the [7]-[8]-[9] loop until struct raw_spinlock_t.raw_lock.tickets.owner is incremented to 0x1 (which means the spinlock has been released).
0xc0ee8c68 <_raw_spin_lock+0x38>:  cmp r2, r3     //<<--[7]
0xc0ee8c6c <_raw_spin_lock+0x3c>:  beq 0xc0ee8c80 <_raw_spin_lock+80>
0xc0ee8c70 <_raw_spin_lock+0x40>:  wfe
0xc0ee8c74 <_raw_spin_lock+0x44>:  ldrh r3, [r0]  //<<--[8]
0xc0ee8c78 <_raw_spin_lock+0x48>:  uxth r3, r3    //<<--[9]
Please be reminded that R2 holds the caller's ticket, i.e., the original tickets.next value, so the loop exits once the reloaded tickets.owner catches up with it.
Code Review: arch_spin_lock(part 2.2): Exit this function after the spinlock is released.
After running [7]-[8]-[9] loop...
[8]: R3 is reloaded (ldrh) from the (raw_spinlock_t *)lock **instance** in memory, which can be updated by other CPUs.
[9]: At this point, if another process releases the spinlock, R3 is updated to 0x1 (since struct raw_spinlock_t.raw_lock.tickets.owner has been incremented).
[7]: Since R2 == R3 (the spinlock has been released), exit this function.
0xc0ee8c68 <_raw_spin_lock+0x38>:  cmp r2, r3     //<<--[7]
0xc0ee8c6c <_raw_spin_lock+0x3c>:  beq 0xc0ee8c80 <_raw_spin_lock+80>
0xc0ee8c70 <_raw_spin_lock+0x40>:  wfe
0xc0ee8c74 <_raw_spin_lock+0x44>:  ldrh r3, [r0]  //<<--[8]
0xc0ee8c78 <_raw_spin_lock+0x48>:  uxth r3, r3    //<<--[9]
To wrap up the spinlock operation in an easy way, let me picture the following scenarios.
1. Scenario: next: 0x0, owner: 0x0 (before spinlock is called)
[1]: Increment next by 1 (next: 0x1)
[2]: Exit spinlock immediately (the original next equals owner)
2. Scenario: next: 0x1, owner: 0x0 (before spinlock is called)
[1]: Increment next by 1 (next: 0x2)
[2]: The original next (0x1) is saved in R2
[3]: Loop until the owner of the spinlock instance is updated to 0x1 (by another CPU releasing the lock)
[4]: Exit spinlock (next: 0x2, owner: 0x1)
3. Scenario: next: 0x45, owner: 0x41 (before spinlock is called): four tickets are outstanding, i.e., the lock is held and three more CPUs are already waiting.
[1]: Increment next by 1 (next: 0x46)
[2]: The original next (0x45) is saved in R2
[3]: Loop until the owner of the spinlock instance is updated to 0x45 (by the other CPUs releasing the lock in turn)
[4]: Exit spinlock (next: 0x46, owner: 0x45)
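The scenario arithmetic above can be checked with a tiny non-atomic model (hypothetical names, just for the bookkeeping, not the real lock):

```c
#include <stdint.h>

/* Non-atomic model of the ticket pair, just to verify the
 * scenarios: taking a ticket returns the old next value, and
 * (next - owner) is the number of outstanding tickets
 * (the current holder plus everyone waiting). */
struct tickets {
    uint16_t owner;
    uint16_t next;
};

static uint16_t take_ticket(struct tickets *t)
{
    return t->next++;
}

static int outstanding(const struct tickets *t)
{
    /* halfword subtraction handles wraparound like the hardware */
    return (uint16_t)(t->next - t->owner);
}

/* The caller gets the lock immediately only if its ticket
 * already equals owner (scenario 1); otherwise it must spin. */
static int acquired_immediately(const struct tickets *t, uint16_t ticket)
{
    return ticket == t->owner;
}
```

Running scenario 1 through this model gives an immediate acquisition, while scenario 3's caller gets ticket 0x45 and must wait behind the four earlier ticket holders.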