Running the crash utility with the debug option (-d 6) prints a variety of debugging information while the vmcore is being loaded.
 
$ ./crash vmcore vmlinux -d 6
 
crash 7.3.0++
Copyright (C) 2002-2021  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011, 2020-2021  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.
 
NOTE: setting kimage_voffset to: 0xffffffbf70000000
 
NOTE: setting vabits_actual to: 39
 
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=x86_64-unknown-linux-gnu --target=aarch64-elf-linux"...
 
WARNING: could not find MAGIC_START!
raw_crash64: cannot determine base kernel version
raw_crash64: vmlinux and /var/tmp/ramdump_elf_74SLyJ do not match!
 
Usage:
 
  crash [OPTION]... NAMELIST MEMORY-IMAGE[@ADDRESS]     (dumpfile form)
  crash [OPTION]... [NAMELIST]                          (live system form)
 
...
 
creating temporary ELF header: /var/tmp/ramdump_elf_BGK8hx
 
vmcore_data:
                  flags: c0 (KDUMP_LOCAL|KDUMP_ELF64)
                   ndfd: 16
                    ofp: 7f2e25cad620
            header_size: 848
   num_pt_load_segments: 13
     pt_load_segment[0]:
            file_offset: 350
             phys_start: 80000000
               phys_end: c0000000
              zero_fill: 0
     pt_load_segment[1]:
            file_offset: 40000350
             phys_start: 880000000
               phys_end: 8c0000000
              zero_fill: 0
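
Each pt_load_segment above is what lets crash translate a physical address into an offset within the dump file: find the segment whose [phys_start, phys_end) range contains the address, then add the distance from phys_start to that segment's file_offset. Below is a minimal user-space sketch of that lookup; the struct and function names are mine for illustration, not the actual crash source.

#include <stdint.h>
#include <stdio.h>

/* Illustrative bookkeeping, mirroring the -d 6 output above */
struct pt_load_segment {
	uint64_t file_offset;
	uint64_t phys_start;
	uint64_t phys_end;
};

static int64_t paddr_to_file_offset(const struct pt_load_segment *seg,
				    int nseg, uint64_t paddr)
{
	for (int i = 0; i < nseg; i++)
		if (paddr >= seg[i].phys_start && paddr < seg[i].phys_end)
			return seg[i].file_offset + (paddr - seg[i].phys_start);
	return -1;	/* not backed by any PT_LOAD segment */
}

int main(void)
{
	/* The first two segments from the output above */
	struct pt_load_segment segs[] = {
		{ 0x350,      0x80000000ULL,  0xc0000000ULL },
		{ 0x40000350, 0x880000000ULL, 0x8c0000000ULL },
	};

	/* 0x80001000 falls in segment[0]: 0x350 + 0x1000 = 0x1350 */
	printf("0x80001000 -> file offset 0x%llx\n",
	       (unsigned long long)paddr_to_file_offset(segs, 2, 0x80001000ULL));
	return 0;
}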

 


 
 

Kernel panic log

[ 2107.232713 / 01-01 11:11:03.809][7] init: cannot find '/system/bin/qrngp' (No such file or directory), disabling 'qrngp'

[ 2107.239317 / 01-01 11:11:03.809][5] Unable to handle kernel NULL pointer dereference at virtual address 00000028

[ 2107.239351 / 01-01 11:11:03.809][5] pgd = e37ec000

[ 2107.239366 / 01-01 11:11:03.809][0] [00000028] *pgd=00000000

[ 2107.239388 / 01-01 11:11:03.809][5] Internal error: Oops: 5 [#1] PREEMPT SMP ARM

[ 2107.239405 / 01-01 11:11:03.809][0] Modules linked in: texfat(PO)

[ 2107.239433 / 01-01 11:11:03.809][5] CPU: 5 PID: 2803 Comm: sensorservice Tainted: P        W  O   3.18.31-perf-gd069b48-00001-g8a6d6e5 #1

[ 2107.239452 / 01-01 11:11:03.809][5] task: e3ffb700 ti: e37f4000 task.ti: e37f4000

[ 2107.239479 / 01-01 11:11:03.809][5] PC is at find_vma_links+0x1c/0x78

[ 2107.239499 / 01-01 11:11:03.809][5] LR is at vma_adjust+0x3a0/0x574


An equivalent callstack can be restored with the following command, where -I specifies the starting program counter, -S the starting stack pointer, and the final argument the task_struct address.

crash>  bt -I C01002D8 -S  E37F5DD0 0xE3FFB700

PID: 2803   TASK: e3ffb700  CPU: 5   COMMAND: "sensorservice"

bt: WARNING:  stack address:0xe37f5b98, program counter:0xc0ee5b60

 #0 [<c01002d8>] (do_DataAbort) from [<c010ad58>]

    pc : [<c01f980c>]    lr : [<c01fa708>]    psr: 200f0013

    sp : e37f5ec4  ip : 00000034  fp : e8236d9c

    r10: 00000000  r9 : 00000000  r8 : b2c99000

    r7 : c4616c80  r6 : e8236d68  r5 : 7f555034  r4 : 7f555034

    r3 : e37f5f04  r2 : b2c97000  r1 : b2c95000  r0 : 7f555038

    Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM

 #1 [<c010ad58>] (__dabt_svc) from [<c01fa708>]

 #2 [<c01f980c>] (find_vma_links) from [<c01fa708>]

 #3 [<c01fa708>] (vma_adjust) from [<c01fa9e4>]

 #4 [<c01fa9e4>] (__split_vma) from [<c01fb364>]

 #5 [<c01fb364>] (do_munmap) from [<c01fb558>]

 #6 [<c01fb558>] (vm_munmap) from [<c0106820>]


Execution profile at the moment of kernel panic

[1]: R12 is loaded with 0x34, where 0x34 = *0x7f555038 = *R0

[2]: The data abort is raised because the base address used by the LDR is 0x28 (R12 - 0xC = 0x34 - 0xC), exactly the NULL-pointer offset reported in the panic log

0xc01f97f0 <find_vma_links>:    push    {r4, r5, lr}

0xc01f97f4 <find_vma_links+0x4>:        add     r0, r0, #4

0xc01f97f8 <find_vma_links+0x8>:        mov     r4, #0

0xc01f97fc <find_vma_links+0xc>:        mov     r5, r4

0xc01f9800 <find_vma_links+0x10>:       ldr     r12, [r0]  //<<--[1]

0xc01f9804 <find_vma_links+0x14>:       cmp     r12, #0

0xc01f9808 <find_vma_links+0x18>:       beq     0xc01f983c <find_vma_links+76>

0xc01f980c <find_vma_links+0x1c>:       ldr     r0, [r12, #-12] //<<--[2]
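
For reference, here is the matching find_vma_links() from a 3.18-era mm/mmap.c. The instruction-to-source mapping in the comments is my reading of the disassembly above, assuming the 32-bit layouts where mm_rb sits at offset 4 of struct mm_struct and vm_rb at offset 16 of struct vm_area_struct (so vm_end is at rb_node - 12).

static int find_vma_links(struct mm_struct *mm, unsigned long addr,
		unsigned long end, struct vm_area_struct **pprev,
		struct rb_node ***rb_link, struct rb_node **rb_parent)
{
	struct rb_node **__rb_link, *__rb_parent, *rb_prev;

	__rb_link = &mm->mm_rb.rb_node;		/* add r0, r0, #4 */
	rb_prev = __rb_parent = NULL;

	while (*__rb_link) {			/* ldr r12, [r0]; cmp r12, #0  <<--[1] */
		struct vm_area_struct *vma_tmp;

		__rb_parent = *__rb_link;
		vma_tmp = rb_entry(__rb_parent, struct vm_area_struct, vm_rb);

		if (vma_tmp->vm_end > addr) {	/* ldr r0, [r12, #-12]  <<--[2] */
			/* Fail if an existing vma overlaps the area */
			if (vma_tmp->vm_start < end)
				return -ENOMEM;
			__rb_link = &__rb_parent->rb_left;
		} else {
			rb_prev = __rb_parent;
			__rb_link = &__rb_parent->rb_right;
		}
	}

	*pprev = NULL;
	if (rb_prev)
		*pprev = rb_entry(rb_prev, struct vm_area_struct, vm_rb);
	*rb_link = __rb_link;
	*rb_parent = __rb_parent;
	return 0;
}

With the bogus rb_node pointer 0x34 in R12, reading vma_tmp->vm_end dereferences 0x34 - 0xC = 0x28, which is exactly the faulting address in the panic log.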


Tracing back to the point inside vma_adjust() where find_vma_links() is called, we see that R0 is copied from R7 (0xC4616C80), whose type is struct mm_struct *.

0xc01fa6e8 <vma_adjust+896>:    mov     r0, r7

0xc01fa6ec <vma_adjust+900>:    str     r3, [sp]

0xc01fa6f0 <vma_adjust+904>:    add     r3, sp, #60     ; 0x3c

0xc01fa6f4 <vma_adjust+908>:    str     r3, [sp, #4]

0xc01fa6f8 <vma_adjust+912>:    add     r3, sp, #52     ; 0x34

0xc01fa6fc <vma_adjust+916>:    ldr     r1, [r2]

0xc01fa700 <vma_adjust+920>:    ldr     r2, [r2, #4]

0xc01fa704 <vma_adjust+924>:    bl      0xc01f97f0 <find_vma_links>  //<<--


Corroborating debug information

crash> struct task_struct.mm e3ffb700

  mm = 0xc4616d80


Logically, R0 should have held 0xC4616C80 (the mm_struct pointer) when find_vma_links() was called, not 0x7f555038. This signature cannot be explained from a software point of view.


This issue was ultimately resolved by replacing a defective PMIC with a good one in the problematic target device.


When I enter the command adb shell "cat /d/shrinker", the system crashes 100% of the time after dumping the following kernel message.

[  761.636711] Unable to handle kernel paging request at virtual address f38a9a84

[  761.645048] pgd = e8074000

[  761.649800] [f38a9a84] *pgd=a0721811, *pte=00000000, *ppte=00000000

[  761.658106] Internal error: Oops: 7 [#1] PREEMPT SMP ARM

[  761.665481] Modules linked in:

[  761.670605] CPU: 6 PID: 6978 Comm: sh Tainted: G        W      3.18.31-g5b0528e-dirty #12

[  761.680852] task: eb5b8000 ti: d228e000 task.ti: d228e000

[  761.688315] PC is at do_raw_spin_lock+0x8/0x17c

[  761.694912] LR is at list_lru_count_node+0x14/0x60

//--- skip ---

[  761.885719] [<c016a650>] (do_raw_spin_lock) from [<c02057f4>] (list_lru_count_node+0x14/0x60)

[  761.896307] [<c02057f4>] (list_lru_count_node) from [<c022dffc>] (super_cache_count+0x40/0x94)

[  761.906982] [<c022dffc>] (super_cache_count) from [<c01f2708>] (debug_shrinker_show+0x44/0x7c)

[  761.917658] [<c01f2708>] (debug_shrinker_show) from [<c024a56c>] (seq_read+0x1a4/0x3f0)

[  761.927726] [<c024a56c>] (seq_read) from [<c022c87c>] (vfs_read+0x88/0x138)

[  761.936755] [<c022c87c>] (vfs_read) from [<c022cda8>] (SyS_read+0x3c/0x7c)

[  761.945695] [<c022cda8>] (SyS_read) from [<c0106840>] (ret_fast_syscall+0x0/0x38)

[  761.955242] Code: e12fff1e deaf1eed e92d47f0 e1a08000 (e5902004) 


After loading the coredump into the T32 program, we can see that the data abort is triggered inside debug_spin_lock_before(), during the sanity check of the spinlock that list_lru_count_node() is attempting to acquire.


-000|do_DataAbort(addr = 170, fsr = 3247439880, regs = 0xEC4901C0)

-001|__dabt_svc(asm)

 -->|exception

-002|debug_spin_lock_before(inline)  //<<-- data abort

-002|do_raw_spin_lock(lock = 0xF38A9A80)

-003|list_lru_count_node(?, ?)

-004|super_cache_count(shrink = 0xEBC47DE0, sc = 0xD228FEF8)

-005|debug_shrinker_show(s = 0xE7CF63C0, ?)

-006|seq_read(file = 0xE805CA00, buf = 0xB3246000, size = 4096, ppos = 0xD228FF80)

-007|vfs_read(file = 0xFFFFFFFF, buf = 0x0, ?, pos = 0xD228FF80)

-008|SYSC_read(inline)

-008|sys_read(?, buf = -1289461760, count = 4096)

-009|ret_fast_syscall(asm)



At the point of acquiring the spinlock, every member of the raw_spinlock_t at 0xF38A9A80 has been scribbled; note that the values decode to ASCII text, which suggests this memory actually holds a string.


We therefore need to examine the variable that contains this spinlock as a member.


    (register raw_spinlock_t *) [R0] lock = 0xF38A9A80 -> (

      (arch_spinlock_t) [NSD:0xF38A9A80] raw_lock = (

        (u32) [NSD:0xF38A9A80] slock = 0x6F632F34,

        (struct __raw_tickets) [NSD:0xF38A9A80] tickets = (

          (u16) [NSD:0xF38A9A80] owner = 0x2F34, //<<--

          (u16) [NSD:0xF38A9A82] next = 0x6F63)), //<<--

      (unsigned int) [NSD:0xF38A9A84] magic = 0x6E6F6D6D //<<--

      (unsigned int) [NSD:0xF38A9A88] owner_cpu = 0x6F6F675F,

      (void *) [NSD:0xF38A9A8C] owner = 0x5F656C67))



The spinlock address is passed in as &nlru->lock, where nlru is &lru->node[nid]. With a garbage nid, &lru->node[nid] lands sizeof(struct list_lru_node) * nid bytes away from the array base, which is how a wild lock address like 0xF38A9A80 arises (see the sketch after the listing below).


51 unsigned long
52 list_lru_count_node(struct list_lru *lru, int nid)
53 {
54 	unsigned long count = 0;
55 	struct list_lru_node *nlru = &lru->node[nid];  // <<--
56
57 	spin_lock(&nlru->lock);


list_lru_count_node() is called from super_cache_count() at line 128.


108 static unsigned long super_cache_count(struct shrinker *shrink,
109 					struct shrink_control *sc)
110 {
111 	struct super_block *sb;
112 	long total_objects = 0;
113
114 	sb = container_of(shrink, struct super_block, s_shrink);
//snip
124 	if (sb->s_op && sb->s_op->nr_cached_objects)
125 		total_objects = sb->s_op->nr_cached_objects(sb,
126 						sc->nid);
127
128 	total_objects += list_lru_count_node(&sb->s_dentry_lru, //<<--
129 						sc->nid);


Tracing back to the caller of super_cache_count() with the T32 program, we can see that it is debug_shrinker_show() that calls super_cache_count().


Here, the second parameter passed to super_cache_count() is &sc, a local variable living on the stack, and its nid member is never initialized, which is very odd.

205 static int debug_shrinker_show(struct seq_file *s, void *unused)
206 {
207 	struct shrinker *shrinker;
208 	struct shrink_control sc;
209
210 	sc.gfp_mask = -1;
211 	sc.nr_to_scan = 0;
212
213 	down_read(&shrinker_rwsem);
214 	list_for_each_entry(shrinker, &shrinker_list, list) {
215 		int num_objs;
216
217 		num_objs = shrinker->count_objects(shrinker, &sc);  //<<-- shrinker->count_objects = super_cache_count


Let me wrap up the call flow as follows (a mitigation sketch follows this list).

1. shrinker->count_objects(shrinker, &sc); // <<-- super_cache_count

2. list_lru_count_node(&sb->s_dentry_lru, sc->nid); // the second parameter sc->nid contains a garbage value, since sc is a local variable and nid is never initialized

3. list_lru_count_node(struct list_lru *lru, int nid) // <<-- nid (sc->nid): garbage value

  3.1 struct list_lru_node *nlru = &lru->node[nid]; // lru->node indexed with the garbage nid

  3.2 spin_lock(&nlru->lock); // &nlru->lock: garbage address
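
For comparison, a one-line mitigation on the debug_shrinker_show() side would have been to zero-initialize the structure so that sc.nid deterministically indexes node 0. This is only a sketch, under the assumption that node 0 is always valid; the fix actually adopted, shown next, was to revert the debug file entirely.

	struct shrink_control sc = {
		.gfp_mask   = -1,
		.nr_to_scan = 0,
		/* .nid (and .memcg on later kernels) are now implicitly
		 * zero-initialized instead of holding stack garbage */
	};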


Solution

After searching the Linux kernel community and Google, we found that Google had already created a revert commit:

https://chromium-review.googlesource.com/c/345895/

The commit message below contains the same debugging signature we analyzed (shrink_control.nid is used but not initialized).


Commit log

ANDROID: Revert "mm: vmscan: Add a debug file for shrinkers"


Kernel panic when type "cat /sys/kernel/debug/shrinker"


Unable to handle kernel paging request at virtual address 0af37d40

pgd = d4dec000

[0af37d40] *pgd=00000000

Internal error: Oops: 5 [#1] PREEMPT SMP ARM

[<c0bb8f24>] (_raw_spin_lock) from [<c020aa08>] (list_lru_count_one+0x14/0x28)

[<c020aa08>] (list_lru_count_one) from [<c02309a8>] (super_cache_count+0x40/0xa0)

[<c02309a8>] (super_cache_count) from [<c01f6ab0>] (debug_shrinker_show+0x50/0x90)

[<c01f6ab0>] (debug_shrinker_show) from [<c024fa5c>] (seq_read+0x1ec/0x48c)

[<c024fa5c>] (seq_read) from [<c022e8f8>] (__vfs_read+0x20/0xd0)

[<c022e8f8>] (__vfs_read) from [<c022f0d0>] (vfs_read+0x7c/0x104)

[<c022f0d0>] (vfs_read) from [<c022f974>] (SyS_read+0x44/0x9c)

[<c022f974>] (SyS_read) from [<c0107580>] (ret_fast_syscall+0x0/0x3c)

Code: e1a04000 e3a00001 ebd66b39 f594f000 (e1943f9f)

---[ end trace 60c74014a63a9688 ]---

Kernel panic - not syncing: Fatal exception


shrink_control.nid is used but not initialzed, same for

shrink_control.memcg.


This reverts commit b0e7a582b2264cdf75874dcd8df915b6b4427755.


Reported-by: Xiaowen Liu <xiaowen.liu@freescale.com>

Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>

(cherry picked from https://android.googlesource.com/kernel/common android-4.4

 commit ad95c12f66df9efae04b15d5c4d0d0ba56ab2620)


BUG=chromium:612862

TEST=build and boot on a few devices

In the following case, the callstack shows the "ndroid.systemui" process (comm is truncated to 15 characters, hence the missing leading 'a') spinning inside arch_spin_lock(), after which a watchdog reset is triggered.

-000|aee_wdt_atf_entry()

-001|prefetchw(inline)

-001|arch_spin_lock(inline)

-001|do_raw_spin_lock_flags(inline)

-001|__raw_spin_lock_irqsave(inline)

-001|raw_spin_lock_irqsave(lock = 0xDD82725C)

-002|__wake_up(q = 0xDD82725C, mode = 1, nr_exclusive = 1, key = 0x0)

-003|mmc_wait_data_done

-004|msdc_irq_data_complete(host = 0xDD827280, ?, error = 0)

-005|msdc_irq(?, dev_id = 0xDD13B9C8)

-006|static_key_count(inline)

-006|static_key_false(inline)

-006|trace_irq_handler_exit(inline)

-006|handle_irq_event_percpu(?, action = 0xDD13B9C8)

-007|handle_irq_event(desc = 0xC0F38440)

-008|cond_unmask_eoi_irq(inline)

-008|handle_fasteoi_irq(?, desc = 0xC0F38440)

-009|static_key_count(inline)

-009|static_key_false(inline)

-009|trace_irq_exit(inline)

-009|__handle_domain_irq(domain = 0xDE806000, hwirq = 111, lookup = TRUE, ?)

-010|__raw_readl(inline)

-010|gic_handle_irq(regs = 0xDD13BFB0)

-011|__irq_usr(asm)


Now it is time to analyze the spinlock fields. Looking at the spinlock instance at 0xDD82725C, the owner and next fields are as follows.

[owner: 0x56AB, next: 0x56AD]

(raw_spinlock_t *)0xDD82725C = 0xDD82725C -> (

  raw_lock = (

    slock = 0x56AD56AB,

       tickets = (

         owner = 0x56AB, // <<--

         next = 0x56AD))) // <<--


The fact that next is larger than owner means another process already acquired the spinlock (next - owner = 2: one CPU holds the lock and one more is already queued behind it). The sketch below decodes the dumped word.
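
On 32-bit ARM kernels of this era, the ticket lock packs owner and next into a single 32-bit word. A small user-space sketch to decode the dumped value, assuming the little-endian __raw_tickets layout shown in the T32 output:

#include <stdint.h>
#include <stdio.h>

/* Ticket-lock layout of arch_spinlock_t (32-bit ARM, little-endian) */
typedef union {
	uint32_t slock;
	struct {
		uint16_t owner;	/* ticket currently being served */
		uint16_t next;	/* next ticket to be handed out */
	} tickets;
} arch_spinlock_t;

int main(void)
{
	arch_spinlock_t lock = { .slock = 0x56AD56AB };	/* value from the dump */

	/* The lock is free only when owner == next; the gap is the number
	 * of CPUs currently holding or queued for the lock */
	printf("owner=0x%04x next=0x%04x in-flight=%u\n",
	       lock.tickets.owner, lock.tickets.next,
	       (uint16_t)(lock.tickets.next - lock.tickets.owner));
	return 0;
}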


         

Scanning the memory dump for anyone referencing the spinlock instance 0xDD82725C turns up several candidate addresses:

(where)

(raw_spinlock_t *)0xDD82725C

c2d01b78: dd82725c

c2d01b7c: dd82725c

c44f0994: dd82725c

dd13bd60: dd82725c // "ndroid.systemui"

dd13bd9c: dd82725c // "ndroid.systemui"

dd13bdc8: dd82725c // "ndroid.systemui"

ddbefdb8: dd82725c // <<--"mmcqd/0"

ddbefdd0: dd82725c

e0abd994: dd82725c


From the candidate list above, "mmcqd/0" is indeed accessing the spinlock instance 0xDD82725C:

v.v % (struct thread_info*)(0xDDBEFDB8 & ~0x1fff)

  (long unsigned int) swap_in = 0 = 0x0 = '....',

  (long unsigned int) swap_out = 0 = 0x0 = '....',

  (struct task_cputime) cputime_expires = ((cputime_t) utime = 0 = 0x0 = '....', (cputime_t) stime = 0 = 0x0 = '....', (long lo

  (struct list_head [3]) cpu_timers = ([0] = ((struct list_head *) next = 0xDDB49A08 = -> ((struct list_head *) next = 0xDDB49

  (struct cred *) real_cred = 0xDDB71200 = -> ((atomic_t) usage = ((int) counter = 2 = 0x2 = '....'), (kuid_t) uid = ((uid_t)

  (struct cred *) cred = 0xDDB71200 = -> ((atomic_t) usage = ((int) counter = 2 = 0x2 = '....'), (kuid_t) uid = ((uid_t) val =

   (char [16]) comm = "mmcqd/0",


The callstack of the "mmcqd/0" process is as follows.

-000|schedule()

-001|mmc_wait_for_data_req_done(inline)

-001|mmc_start_req(host = 0xC0F75D0C, areq = 0xC0F644E0, error = 0xC102CB08)

-002|mmc_blk_issue_rw_rq(mq = 0xDDB64408, rqc = 0x0)

-003|mmc_blk_issue_rq(mq = 0xDDB64408, req = 0x0)

-004|mmc_queue_thread(d = 0xDDB64408)

-005|kthread(_create = 0xDDB8BDC0)

-006|ret_from_fork(asm)

-007|ret_fast_syscall(asm)


Looking into mmc_wait_for_data_req_done(), the spinlock is acquired with spin_lock_irqsave(&context_info->lock, flags):

static int mmc_wait_for_data_req_done(struct mmc_host *host,
				      struct mmc_request *mrq,
				      struct mmc_async_req *next_req)
{
	struct mmc_command *cmd;
	struct mmc_context_info *context_info = &host->context_info;
	int err;
	unsigned long flags;

	while (1) {
		wait_event_interruptible(context_info->wait,
					 (context_info->is_done_rcv ||
					  context_info->is_new_req));
		spin_lock_irqsave(&context_info->lock, flags);
		context_info->is_waiting_last_req = false;


Looking at the code a bit more, the following functions are executed in the callstack of the "ndroid.systemui" process:

-003|mmc_wait_data_done

-004|msdc_irq_data_complete(host = 0xDD827280, ?, error = 0)


Looking at mmc_wait_data_done(), it sets mrq->host->context_info.is_done_rcv to true and then wakes up the wait queue &mrq->host->context_info.wait:

static void mmc_wait_data_done(struct mmc_request *mrq)
{
	mrq->host->context_info.is_done_rcv = true;
	wake_up_interruptible(&mrq->host->context_info.wait);
}


However, the following statement in mmc_wait_for_data_req_done() waits until context_info->is_done_rcv or context_info->is_new_req becomes true:

wait_event_interruptible(context_info->wait,
	(context_info->is_done_rcv ||
	 context_info->is_new_req));


Putting this together, we can infer a race condition: before wake_up_interruptible(&mrq->host->context_info.wait) gets the chance to wake the process registered on the wait queue, the statement above has already fallen through because the flag is set, so the process has effectively woken up on its own.

It is like reaching out to wake someone who is sleeping, only to find they have already woken up. The user-space caricature below sketches this interleaving.
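
All names in this sketch are hypothetical stand-ins, not the kernel API, and the usleep() plays the role of the mdelay(1) used in the experiment below; the program crashes deliberately, mirroring the panic.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Stand-ins for mmc_context_info / mmc_request */
struct context_info {
	volatile int is_done_rcv;
	pthread_cond_t wait;
	pthread_mutex_t lock;
};

struct request {
	struct context_info *host_ctx;	/* plays mrq->host->context_info */
};

static struct request rq;

static void *irq_side(void *unused)	/* plays mmc_wait_data_done() */
{
	rq.host_ctx->is_done_rcv = 1;	/* the flag is set first... */
	usleep(1000);			/* ...the mdelay(1) of the experiment */
	/* by now the waiter has consumed the flag and recycled the request,
	 * so this dereferences a NULL host_ctx, just like the kernel panic */
	pthread_cond_signal(&rq.host_ctx->wait);
	return NULL;
}

int main(void)
{
	struct context_info *ctx = calloc(1, sizeof(*ctx));
	pthread_t t;

	pthread_mutex_init(&ctx->lock, NULL);
	pthread_cond_init(&ctx->wait, NULL);
	rq.host_ctx = ctx;

	pthread_create(&t, NULL, irq_side, NULL);

	while (!rq.host_ctx->is_done_rcv)	/* plays wait_event_interruptible() */
		usleep(100);

	memset(&rq, 0, sizeof(rq));	/* request recycled, as mmcqd/0 does below */

	pthread_join(t, NULL);		/* the "IRQ" thread crashes before this returns */
	return 0;
}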


In-depth Analysis

If my assumption is correct, inserting the following code should cause an immediate kernel panic.

(ramdump: force_mmc_wait_data_done_crash)

static void mmc_wait_data_done(struct mmc_request *mrq)
{
	mrq->host->context_info.is_done_rcv = true;
+	mdelay(1);
	wake_up_interruptible(&mrq->host->context_info.wait);
}


And indeed, a kernel panic occurred right away.

-000|do_DataAbort(addr = 0, fsr = 0, regs = 0x0)

-001|__dabt_svc(asm)

-->|exception

-002|debug_spin_lock_before(inline)

-002|do_raw_spin_lock(lock = 0x02B4)

-003|raw_spin_lock_irqsave(lock = 0x02B4)

-004|__wake_up(q = 0x02B4, mode = 1, nr_exclusive = 1, key = 0x0)

-005|mmc_wait_data_done(mrq = 0xDC9725F8)

-006|msdc_irq_data_complete(host = 0xDC919300, ?, error = 0)

-007|msdc_irq(?, dev_id = 0x1)

-008|handle_irq_event_percpu(?, action = 0xDC91E0C0)

-009|handle_irq_event(desc = 0xC121A440)

-010|cond_unmask_eoi_irq(inline)

-010|handle_fasteoi_irq(?, desc = 0xC121A440)

-011|static_key_count(inline)

-011|static_key_false(inline)

-011|trace_irq_exit(inline)

-011|__handle_domain_irq(domain = 0xDE806000, hwirq = 111, lookup = TRUE, ?)

-012|__raw_readl(inline)

-012|gic_handle_irq(regs = 0xCB773E10)

-013|__irq_svc(asm)

-->|exception

-014|current_thread_info(inline)

-014|check_preemption_disabled(what1 = 0xC0F115AC, what2 = 0xC0EE0B28)

-015|debug_smp_processor_id()

-016|MT_trace_preempt_on()

-017|preempt_count_sub

-018|current_thread_info(inline)

-018|preempt_count(inline)

-018|should_resched(inline)

-018|__percpu_counter_add(fbc = 0xD8AC7178, amount = 0, batch = 32)

-019|__sb_start_write(sb = 0xC0F115AC, level = -1058141400, wait = FALSE)

-020|file_start_write(inline)

-020|vfs_write(file = 0xD366C480, buf = 0xBEA83A4C, ?, pos = 0xCB773F78)

-021|SYSC_write(inline)

-021|sys_write(?, buf = -1096271284, count = 146)

-022|ret_fast_syscall(asm)


Checking mrq->host->context_info.wait, we find that mrq->host is already 0:

(struct mmc_request *)0xDC9725F8 = 0xDC9725F8 -> (

    sbc = 0xDC97262C,

    cmd = 0xDC972660,

    data = 0xDC9726C8,

    stop = 0xDC972694,

    completion = (done = 0x0, wait = (lock = (rlock = (raw_lock = (slock = 0x0,

      done = 0x0,

     host = 0x0)


And surprisingly, the mmcqd/0 process has already executed the mmc_wait_for_data_req_done() code:

-000|context_switch(inline)

-000|__schedule()

-001|schedule()

-002|mmc_wait_for_data_req_done(inline)

-002|mmc_start_req(host = 0xDC919000, areq = 0xDC972710, ?)

-003|mmc_blk_issue_rw_rq(mq = 0x0, rqc = 0xC64B18C0)

-004|mmc_blk_issue_rq(mq = 0xDC972494, req = 0xC64B18C0)

-005|mmc_queue_thread(d = 0xDC972494)

-006|kthread(_create = 0xDCA24E40)

-007|ret_from_fork(asm)

-008|ret_fast_syscall(asm)


Let's add some more ftrace logging:

diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index db74413..fcf6692 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -1034,6 +1034,9 @@ out:
 }
 EXPORT_SYMBOL(mmc_start_bkops);
 #endif

 /*
  * mmc_wait_data_done() - done callback for data request
  * @mrq: done data request
@@ -1042,10 +1045,25 @@ EXPORT_SYMBOL(mmc_start_bkops);
  */
 static void mmc_wait_data_done(struct mmc_request *mrq)
 {
 	mrq->host->context_info.is_done_rcv = true;
+	trace_printk("+msdc [F: %s, L:%d]: ? ? from(%pS)\n", __func__, __LINE__, (void *)__builtin_return_address(0));
 	mdelay(1);
+	trace_printk("++msdc [F: %s, L:%d]: ? ? from(%pS)\n", __func__, __LINE__, (void *)__builtin_return_address(0));
 	wake_up_interruptible(&mrq->host->context_info.wait);
+	trace_printk("--msdc [F: %s, L:%d]: ? ? from(%pS)\n", __func__, __LINE__, (void *)__builtin_return_address(0));

As expected, right after the GIC mtk-msdc(111) IRQ is triggered, the kernel panic occurs as follows.

(ramdump: neo_more_crash_ftrace)

-000|do_DataAbort(addr = 0, fsr = 0, regs = 0x0)

-001|__dabt_svc(asm)

-->|exception

-002|debug_spin_lock_before(inline)

-002|do_raw_spin_lock(lock = 0x02B4)

-003|raw_spin_lock_irqsave(lock = 0x02B4)

-004|__wake_up(q = 0x02B4, mode = 1, nr_exclusive = 1, key = 0x0)

-005|mmc_wait_data_done(mrq = 0xDC9235F8)

-006|msdc_irq_data_complete(host = 0xDC927300, ?, error = 0)

-007|msdc_irq(?, dev_id = 0x1)

-008|handle_irq_event_percpu(?, action = 0xDC92B180)

-009|handle_irq_event(desc = 0xC121A440)

-010|cond_unmask_eoi_irq(inline)

-010|handle_fasteoi_irq(?, desc = 0xC121A440)

-011|static_key_count(inline)

-011|static_key_false(inline)

-011|trace_irq_exit(inline)

-011|__handle_domain_irq(domain = 0xDE806000, hwirq = 111, lookup = TRUE, ?)

-012|__raw_readl(inline)

-012|gic_handle_irq(regs = 0xC2631E20)

-013|__irq_svc(asm)

-->|exception

-014|__fget(?, mask = 16384)

-015|__fget_light(?, mask = 16384)

-016|__fdget

-017|sockfd_lookup_light(?, err = 0xC2631EC8, fput_needed = 0xC2631ECC)

-018|SYSC_recvfrom(inline)

-018|sys_recvfrom(?, ubuf = -1212542344, ?, flags = 64, addr = 0, addr_len = 0)

-019|ret_fast_syscall(asm)

Let's analyze the ftrace log.

The situation right before the kernel panic:

[1]: irq=111 name=mtk-msdc is triggered

[2]: mmc_wait_data_done() is called in IRQ context, from msdc_irq_data_complete()

Between [2] and [3], the mmcqd/0 process is not context-switched in; InputDispatcher-1538 keeps running instead

[3]: the ftrace log printed after mdelay(1)

[4]: the wake_up_interruptible(&mrq->host->context_info.wait) call wakes up the mmcqd/0 process

CPU5: mmcqd/0, pid=244

sched_wakeup: comm=mmcqd/0 pid=244 prio=120 success=1 target_cpu=005 state=W

[5]: the log printed after wake_up_interruptible() returns (nothing significant)


kworker/4:0-25 [004] 149.827488: irq_handler_entry: irq=111 name=mtk-msdc // <<--[1]

kworker/4:0-25 [004] 149.827495: bprint: mmc_wait_data_done: +msdc [F: mmc_wait_data_done, L:1049]: ? ? from(msdc_irq_data_complete) // <<--[2]

InputDispatcher-1538 [005] 149.827496: sched_migrate_task: comm=m.lge.launcher3 pid=2135 prio=120 orig_cpu=6 dest_cpu=4 state=W

InputDispatcher-1538 [005] 149.827519: sched_wakeup: comm=m.lge.launcher3 pid=2135 prio=120 success=1 target_cpu=004 state=W

InputDispatcher-1538 [005] 149.827591: sched_migrate_task: comm=ndroid.systemui pid=1622 prio=120 orig_cpu=5 dest_cpu=6 state=W

InputDispatcher-1538 [005] 149.827608: sched_wakeup: comm=ndroid.systemui pid=1622 prio=120 success=1 target_cpu=006 state=W

InputDispatcher-1538 [005] 149.827675: sched_migrate_task: comm=android.ui pid=1287 prio=118 orig_cpu=4 dest_cpu=6 state=W

InputDispatcher-1538 [005] 149.827694: sched_wakeup: comm=android.ui pid=1287 prio=118 success=1 target_cpu=006 state=W

InputDispatcher-1538 [005] 149.827855: sched_switch: prev_comm=InputDispatcher prev_pid=1538 prev_prio=112 prev_state=S ==> next_comm=swapper/5 next_pid=0 next_prio=120 extra_prev_state=K|m|d 

<idle>-0 [005] 149.827912: sched_switch: prev_comm=swapper/5 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=InputDispatcher next_pid=1538 next_prio=112 extra_prev_state=K|m|d 

InputDispatcher-1538 [005] 149.828064: sched_switch: prev_comm=InputDispatcher prev_pid=1538 prev_prio=112 prev_state=S ==> next_comm=swapper/5 next_pid=0 next_prio=120 extra_prev_state=K|m|d 

<idle>-0 [005] 149.828209: sched_switch: prev_comm=swapper/5 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=InputDispatcher next_pid=1538 next_prio=112 extra_prev_state=K|m|d 

InputDispatcher-1538 [005] 149.828352: sched_switch: prev_comm=InputDispatcher prev_pid=1538 prev_prio=112 prev_state=S ==> next_comm=swapper/5 next_pid=0 next_prio=120 extra_prev_state=K|m|d 

kworker/4:0-25 [004] 149.828497: bprint: mmc_wait_data_done: ++msdc [F: mmc_wait_data_done, L:1058]: ? ? from(msdc_irq_data_complete) // <<--[3]

kworker/4:0-25 [004] 149.828521: sched_wakeup: comm=mmcqd/0 pid=244 prio=120 success=1 target_cpu=005 state=W // <<--[4]

kworker/4:0-25 [004] 149.828525: bprint: mmc_wait_data_done: -msdc [F: mmc_wait_data_done, L:1060]: ? ? from(msdc_irq_data_complete) <<-[5]

<idle>-0 [005] 149.828528: sched_switch: prev_comm=swapper/5 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=mmcqd/0 next_pid=244 next_prio=120 extra_prev_state=K|m|d <<-[6]

kworker/4:0-25 [004] 149.828530: irq_handler_exit: irq=111 ret=handled

1046 static void mmc_wait_data_done(struct mmc_request *mrq)
1047 {
1048 	mrq->host->context_info.is_done_rcv = true;
1049 	trace_printk("+msdc [F: %s, L:%d]: ? ? from(%pS)\n", __func__, __LINE__, (void *)__builtin_return_address(0)); // <<--[2]
// skipped
1051 	mdelay(1);
// skipped
1057
1058 	trace_printk("++msdc [F: %s, L:%d]: ? ? from(%pS)\n", __func__, __LINE__, (void *)__builtin_return_address(0)); // <<--[3]
1059 	wake_up_interruptible(&mrq->host->context_info.wait);
1060 	trace_printk("-msdc [F: %s, L:%d]: ? ? from(%pS)\n", __func__, __LINE__, (void *)__builtin_return_address(0)); // <<-[5]


The kernel panic situation:

[7]: irq=111 name=mtk-msdc has been handled

[8]: [mmcqd/0-244, CPU5] returns from wait_event_interruptible() inside mmc_wait_for_data_req_done() [from mmc_blk_issue_rw_rq]

[9][10]: [mmcqd/0-244, CPU5] exits mmc_wait_for_data_req_done()

[11]: irq=111 name=mtk-msdc is triggered again, this time handled by m.lge.launcher3-2135 on CPU4

[12]: [mmcqd/0-244, CPU5] the woken mmcqd/0-244 process starts running and zeroes &mqrq->brq (where: memset(brq, 0, sizeof(struct mmc_blk_request)))

[13]: [m.lge.launcher3-2135, CPU4] triggered by the irq=111 name=mtk-msdc IRQ, enters mmc_wait_data_done()

[14][15]: [mmcqd/0-244, CPU5] is about to leave mmc_blk_rw_rq_prep()

[16][17][18]: [mmcqd/0-244, CPU5] finishes mmc_wait_for_data_req_done() and is about to leave (note that the [18] trace line itself is already corrupted)

[19]: [m.lge.launcher3-2135, CPU4] wake_up_interruptible(&mrq->host->context_info.wait);

[20]: kernel panic

In step [19], wake_up_interruptible() is called through the &mrq->host pointer that was zeroed in step [12] by the memset (where: memset(brq, 0, sizeof(struct mmc_blk_request))).

kworker/4:0-25 [004] 149.828530: irq_handler_exit: irq=111 ret=handled // <<--[7]

mmcqd/0-244 [005] 149.828536: bprint: mmc_start_req: ++mmcq [F: mmc_wait_for_data_req_done, L:1236]: ? ? from(mmc_blk_issue_rw_rq) // <<--[8]

kworker/4:0-25 [004] 149.828537: irq_handler_entry: irq=30 name=arch_timer

kworker/4:0-25 [004] 149.828546: irq_handler_exit: irq=30 ret=handled

mmcqd/0-244 [005] 149.828559: bprint: mmc_start_req: mmcq [F: mmc_wait_for_data_req_done, L:1268]: ? ? from(mmc_blk_issue_rw_rq) // <<-[9]

mmcqd/0-244 [005] 149.828561: bprint: mmc_start_req: mmcq release [F: mmc_wait_for_data_req_done, L:1270]: ? ? from(mmc_blk_issue_rw_rq) // <<--[10]

kworker/4:0-25 [004] 149.828584: workqueue_execute_end: work struct 0xcb40cae4

kworker/4:0-25 [004] 149.828630: sched_switch: prev_comm=kworker/4:0 prev_pid=25 prev_prio=120 prev_state=S ==> next_comm=m.lge.launcher3 next_pid=2135 next_prio=120 extra_prev_state=K|m|d 

m.lge.launcher3-2135 [004] 149.828660: irq_handler_entry: irq=111 name=mtk-msdc // <<--[11]

mmcqd/0-244 [005] 149.828661: bprint: mmc_blk_rw_rq_prep: ++mmc_blk_rw_rq_prep [F: mmc_blk_rw_rq_prep, L:1725]: ? ? from(mmc_blk_issue_rw_rq) // <<--[12]

m.lge.launcher3-2135 [004] 149.828666: bprint: mmc_wait_data_done: +msdc [F: mmc_wait_data_done, L:1049]: ? ? from(msdc_irq_data_complete) // <<--[13]

mmcqd/0-244 [005] 149.828670: bprint: mmc_blk_rw_rq_prep: -mmc_blk_rw_rq_prep [F: mmc_blk_rw_rq_prep, L:1905]: ? ? from(mmc_blk_issue_rw_rq) // <<-[14]

mmcqd/0-244 [005] 149.828672: bprint: mmc_blk_rw_rq_prep: mmc_blk_rw_rq_prep [F: mmc_blk_rw_rq_prep, L:1907]: ? ? from(mmc_blk_issue_rw_rq) // <<-[15]

mmcqd/0-244 [005] 149.828681: bprint: mmc_start_req: +mmcq [F: mmc_wait_for_data_req_done, L:1231]: ? ? from(mmc_blk_issue_rw_rq) // <<--[16]

mmcqd/0-244 [005] 149.828683: bprint: mmc_start_req: ++mmcq [F: mmc_wait_for_data_req_done, L:1236]: ? ? from(mmc_blk_issue_rw_rq) // <<--[17]

mmcqd/0-244 [005] 149.828695: bprint: mmc_start_req: -mmcq [F: mmc_wait_for_dat뗥nQ, L:0]: ? ? from(0x726f776bS) // <<--[18]

m.lge.launcher3-2135 [004] 149.829669: bprint: mmc_wait_data_done: ++msdc [F: mmc_wait_data_done, L:1058]: ? ? from(msdc_irq_data_complete) // <<--[19]

m.lge.launcher3-2135 [004] 149.829748: tracing_on: ftrace is disabled caller=oops_enter // <<--[20]


1222 static int mmc_wait_for_data_req_done(struct mmc_host *host,
1223 		struct mmc_request *mrq,
1224 		struct mmc_async_req *next_req)
1225 {
1226 	struct mmc_command *cmd;
1227 	struct mmc_context_info *context_info = &host->context_info;
1228 	int err;
1229 	unsigned long flags;
1230
1231 	trace_printk("+mmcq [F: %s, L:%d]: ? ? from(%pS)\n", __func__, __LINE__, (void *)__builtin_return_address(0)); // <<--[16]
1232 	while (1) {
1233 		wait_event_interruptible(context_info->wait,
1234 				(context_info->is_done_rcv ||
1235 				 context_info->is_new_req));
1236 		trace_printk("++mmcq [F: %s, L:%d]: ? ? from(%pS)\n", __func__, __LINE__, (void *)__builtin_return_address(0)); // <<--[8], [17]
1237 		spin_lock_irqsave(&context_info->lock, flags);
//snip
1268 	trace_printk("mmcq [F: %s, L:%d]: ? ? from(%pS)\n", __func__, __LINE__, (void *)__builtin_return_address(0)); // <<-[9], [18]
1269 	mmc_retune_release(host);
1270 	trace_printk("mmcq release [F: %s, L:%d]: ? ? from(%pS)\n", __func__, __LINE__, (void *)__builtin_return_address(0)); // <<--[10]
1271 	return err;
1272 }


1701 static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
1702 		struct mmc_card *card,
1703 		int disable_multi,
1704 		struct mmc_queue *mq)
1705 {
1706 	u32 readcmd, writecmd;
1707 	struct mmc_blk_request *brq = &mqrq->brq;
// skipped
1724 	memset(brq, 0, sizeof(struct mmc_blk_request));
1725 	trace_printk("++mmc_blk_rw_rq_prep [F: %s, L:%d]: ? ? from(%pS)\n", __func__, __LINE__, (void *)__builtin_return_address(0)); // <<--[12]
1726 	brq->mrq.cmd = &brq->cmd;
// skipped
1905 	trace_printk("-mmc_blk_rw_rq_prep [F: %s, L:%d]: ? ? from(%pS)\n", __func__, __LINE__, (void *)__builtin_return_address(0)); // <<-[14]
1906 	mmc_queue_bounce_pre(mqrq);
1907 	trace_printk("-mmc_blk_rw_rq_prep [F: %s, L:%d]: ? ? from(%pS)\n", __func__, __LINE__, (void *)__builtin_return_address(0)); // <<-[15]
1908 }


1046 static void mmc_wait_data_done(struct mmc_request *mrq)
1047 {
1048 	mrq->host->context_info.is_done_rcv = true;
1049 	trace_printk("+msdc [F: %s, L:%d]: ? ? from(%pS)\n", __func__, __LINE__, (void *)__builtin_return_address(0)); // <<--[13]
// skipped
1051 	mdelay(1);
// skipped
1057
1058 	trace_printk("++msdc [F: %s, L:%d]: ? ? from(%pS)\n", __func__, __LINE__, (void *)__builtin_return_address(0)); // <<--[19]
1059 	wake_up_interruptible(&mrq->host->context_info.wait);
1060 	trace_printk("--msdc [F: %s, L:%d]: ? ? from(%pS)\n", __func__, __LINE__, (void *)__builtin_return_address(0));


To summarize, we must either hand wake_up_interruptible() a different parameter than &mrq->host->context_info.wait (a wait queue that can wake mmcqd without going through the request), guard the wake_up_interruptible() call with a condition, or protect the code with spin_lock_irqsave().


Solution


Applying the following patch, based on the analysis above, made the problem disappear.

diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index c3abe34..01e4bfc 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -403,8 +403,13 @@ EXPORT_SYMBOL(mmc_start_bkops);
  */
 static void mmc_wait_data_done(struct mmc_request *mrq)
 {
+	unsigned long flags;
+	struct mmc_context_info *context_info = &mrq->host->context_info;
+
+	spin_lock_irqsave(&context_info->lock, flags);
 	mrq->host->context_info.is_done_rcv = true;
 	wake_up_interruptible(&mrq->host->context_info.wait);
+	spin_unlock_irqrestore(&context_info->lock, flags);
 }

 static void mmc_wait_done(struct mmc_request *mrq)
@@ -465,6 +470,7 @@ static int mmc_wait_for_data_req_done(struct mmc_host *host,
 	struct mmc_command *cmd;
 	struct mmc_context_info *context_info = &host->context_info;
 	int err;
+	bool is_done_rcv = false;
 	unsigned long flags;

 	while (1) {
@@ -472,9 +478,10 @@ static int mmc_wait_for_data_req_done(struct mmc_host *host,
 			(context_info->is_done_rcv ||
 			 context_info->is_new_req));
 		spin_lock_irqsave(&context_info->lock, flags);
+		is_done_rcv = context_info->is_done_rcv;
 		context_info->is_waiting_last_req = false;
 		spin_unlock_irqrestore(&context_info->lock, flags);
-		if (context_info->is_done_rcv) {
+		if (is_done_rcv) {
 			context_info->is_done_rcv = false;
 			context_info->is_new_req = false;
 			cmd = mrq->cmd;
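
Note how the patch pairs writer and reader: mmc_wait_data_done() now sets is_done_rcv and issues the wake-up while holding context_info->lock, and the waiter snapshots is_done_rcv under the same lock before acting on it, so the waiter can no longer race ahead and recycle the request between the flag update and the wake-up.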


Incidentally, a related patch had already been released in the Linux kernel community:

https://patchwork.kernel.org/patch/7089351/

diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 347108a..db74413 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -1042,8 +1042,10 @@ EXPORT_SYMBOL(mmc_start_bkops);
  */
 static void mmc_wait_data_done(struct mmc_request *mrq)
 {
-	mrq->host->context_info.is_done_rcv = true;
-	wake_up_interruptible(&mrq->host->context_info.wait);
+	struct mmc_context_info *context_info = &mrq->host->context_info;
+
+	context_info->is_done_rcv = true;
+	wake_up_interruptible(&context_info->wait);
 }

