Kernel panic issue: tty_wakeup() 08/18/2014

The kernel left the following last words and passed away. It died in tty_wakeup(), and the ARM core was in supervisor (SVC) mode at the time.
[71653.259161 08-15 11:27:39.779] android_work: did not send uevent (0 0   (null))
[71653.290526 08-15 11:27:39.811] android_work: sent uevent USB_STATE=CONNECTED
[71653.292844 08-15 11:27:39.813] android_work: sent uevent USB_STATE=DISCONNECTED
[71653.365187 08-15 11:27:39.885] android_usb gadget: high speed config #1: android
[71653.365758 08-15 11:27:39.885] Unable to handle kernel NULL pointer dereference at virtual address 000000a8
[71653.365775 08-15 11:27:39.885] pgd = c0004000
[71653.365785 08-15 11:27:39.885] [000000a8] *pgd=00000000
[71653.365805 08-15 11:27:39.885] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
[71653.365841 08-15 11:27:39.885] Modules linked in: bcmdhd(O)
[71653.365864 08-15 11:27:39.885] CPU: 0    Tainted: G           O  (3.4.65-g000d394 #1)
[71653.365891 08-15 11:27:39.885] PC is at tty_wakeup+0x14/0x80
[71653.365910 08-15 11:27:39.885] LR is at gs_start_io+0x90/0xe8
[71653.365922 08-15 11:27:39.885] pc : [<c03b2fe8>]    lr : [<c04a2520>]    psr: 20030193
[71653.365928 08-15 11:27:39.885] sp : d2b2db10  ip : d2b2db28  fp : d2b2db24
[71653.365944 08-15 11:27:39.885] r10: e29ba318  r9 : e29ba324  r8 : 00000000
[71653.365955 08-15 11:27:39.885] r7 : e29ba2f4  r6 : e16290a8  r5 : e29ba2e8  r4 : e29ba2c0
[71653.365967 08-15 11:27:39.885] r3 : 0000000a  r2 : 00000000  r1 : 60030193  r0 : 00000000
[71653.365984 08-15 11:27:39.885] Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
[71653.365996 08-15 11:27:39.885] Control: 10c5387d  Table: 8567c04a  DAC: 00000015
...
[71653.369827 08-15 11:27:39.885] [<c03b2fe8>] (tty_wakeup+0x14/0x80) from [<c04a2520>] (gs_start_io+0x90/0xe8)
[71653.369848 08-15 11:27:39.885] [<c04a2520>] (gs_start_io+0x90/0xe8) from [<c04a2f24>] (gserial_connect+0xdc/0x164)
[71653.369865 08-15 11:27:39.885] [<c04a2f24>] (gserial_connect+0xdc/0x164) from [<c04a3440>] (acm_set_alt+0x90/0x1ac)
[71653.369883 08-15 11:27:39.885] [<c04a3440>] (acm_set_alt+0x90/0x1ac) from [<c04995b4>] (composite_setup+0x244/0xf80)
[71653.369899 08-15 11:27:39.885] [<c04995b4>] (composite_setup+0x244/0xf80) from [<c049a3b0>] (android_setup+0xc0/0x154)
[71653.369921 08-15 11:27:39.885] [<c049a3b0>] (android_setup+0xc0/0x154) from [<c0487d44>] (tegra_udc_irq+0xbd4/0x11d0)
[71653.369944 08-15 11:27:39.885] [<c0487d44>] (tegra_udc_irq+0xbd4/0x11d0) from [<c00e64a8>] (handle_irq_event_percpu+0x88/0x2ec)
[71653.369962 08-15 11:27:39.885] [<c00e64a8>] (handle_irq_event_percpu+0x88/0x2ec) from [<c00e6758>] (handle_irq_event+0x4c/0x6c)
[71653.369980 08-15 11:27:39.885] [<c00e6758>] (handle_irq_event+0x4c/0x6c) from [<c00e96ec>] (handle_fasteoi_irq+0xcc/0x174)
[71653.369998 08-15 11:27:39.885] [<c00e96ec>] (handle_fasteoi_irq+0xcc/0x174) from [<c00e5c28>] (generic_handle_irq+0x3c/0x50)
[71653.370022 08-15 11:27:39.885] [<c00e5c28>] (generic_handle_irq+0x3c/0x50) from [<c000f9c4>] (handle_IRQ+0x5c/0xbc)
[71653.370040 08-15 11:27:39.885] [<c000f9c4>] (handle_IRQ+0x5c/0xbc) from [<c0008504>] (gic_handle_irq+0x34/0x68)
[71653.370057 08-15 11:27:39.885] [<c0008504>] (gic_handle_irq+0x34/0x68) from [<c000ec00>] (__irq_svc+0x40/0x70)
[71653.370070 08-15 11:27:39.885] Exception stack(0xd2b2dd80 to 0xd2b2ddc8)
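
The "Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM" line in the log is just the decoded form of psr: 20030193. As a rough sketch of how that decoding works (the struct and function names here are made up for illustration and are not kernel code), the relevant bits can be pulled out like this:

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields of an ARM (AArch32) program status register that the oops prints.
 * Hypothetical helper type, not part of the kernel. */
struct psr_info {
    bool n, z, c, v;     /* condition flags, bits 31..28 */
    bool irq_masked;     /* I bit (bit 7): 1 means IRQs are off */
    bool fiq_masked;     /* F bit (bit 6): 1 means FIQs are off */
    bool thumb;          /* T bit (bit 5): 0 means ARM state */
    unsigned mode;       /* M[4:0]: 0x13 is Supervisor (SVC) */
};

struct psr_info decode_psr(uint32_t psr)
{
    struct psr_info p;

    p.n = (psr >> 31) & 1u;
    p.z = (psr >> 30) & 1u;
    p.c = (psr >> 29) & 1u;
    p.v = (psr >> 28) & 1u;
    p.irq_masked = (psr >> 7) & 1u;
    p.fiq_masked = (psr >> 6) & 1u;
    p.thumb = (psr >> 5) & 1u;
    p.mode = psr & 0x1fu;
    return p;
}
```

Calling decode_psr(0x20030193) gives mode 0x13 (SVC), IRQs masked, FIQs not masked, ARM state, and only the C flag set, which matches the Flags line above.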


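The kernel panic occurred at address 0xc03b2fe8. At that point r0, which carries the first argument, is 0, so the tty pointer passed to tty_wakeup() is NULL.
To be precise, it died while executing this instruction: e59030a8        ldr     r3, [r0, #168]  ; 0xa8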
c03b2fd4 <tty_wakeup>:
tty_wakeup():
c03b2fd4:       e1a0c00d        mov     ip, sp
c03b2fd8:       e92dd830        push    {r4, r5, fp, ip, lr, pc}
c03b2fdc:       e24cb004        sub     fp, ip, #4
c03b2fe0:       e92d4000        stmfd   sp!, {lr}
c03b2fe4:       ebf17057        bl      c000f148 <__gnu_mcount_nc>
test_bit():
android/kernel/include/asm-generic/bitops/non-atomic.h:105
c03b2fe8:       e59030a8        ldr     r3, [r0, #168]  ; 0xa8
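
To double-check which base register and offset the faulting instruction uses, the encoding e59030a8 can be decoded by hand. The following is a minimal sketch that handles only the simple "LDR Rt, [Rn, #imm12]" form of the ARM instruction set. The type and function names are invented for this note.

```c
#include <stdbool.h>
#include <stdint.h>

/* Result of decoding an ARM-state "LDR Rt, [Rn, #+/-imm12]" instruction.
 * Hypothetical helper type for illustration only. */
struct ldr_imm {
    bool     valid;   /* false if the word is not this instruction form */
    unsigned rn;      /* base register number */
    unsigned rt;      /* destination register number */
    int32_t  offset;  /* signed byte offset added to the base */
};

struct ldr_imm decode_ldr_imm(uint32_t insn)
{
    struct ldr_imm d = { false, 0, 0, 0 };

    /* bits 27..25 == 0b010: load/store word or byte with immediate offset */
    if (((insn >> 25) & 0x7u) != 0x2u)
        return d;
    /* L (bit 20) must be 1 for a load, B (bit 22) must be 0 for a word */
    if (!((insn >> 20) & 1u) || ((insn >> 22) & 1u))
        return d;
    /* P (bit 24) = 1 and W (bit 21) = 0: plain offset addressing */
    if (!((insn >> 24) & 1u) || ((insn >> 21) & 1u))
        return d;

    d.rn = (insn >> 16) & 0xfu;
    d.rt = (insn >> 12) & 0xfu;
    d.offset = (int32_t)(insn & 0xfffu);
    if (!((insn >> 23) & 1u))   /* U (bit 23) = 0 means subtract */
        d.offset = -d.offset;
    d.valid = true;
    return d;
}
```

For 0xe59030a8 this yields rn = 0, rt = 3 and offset = 0xa8, i.e. a load from r0 + 0xa8 into r3. With r0 equal to 0, the accessed address is 0xa8, the same address reported in the oops.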

I checked at which offset the flags member sits inside struct tty_struct.
struct tty_struct {
     [0x0] int magic;
     [0x4] struct kref kref;
...
    [0xa4] struct pid *session;
    [0xa8] unsigned long flags;
    

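So the NULL pointer access happened because tty is NULL: reading tty->flags turns into a load from address 0xa8, which is exactly the faulting address in the oops.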
void tty_wakeup(struct tty_struct *tty)
{
    struct tty_ldisc *ld;

    if (test_bit(TTY_DO_WRITE_WAKEUP, &tty->flags)) {
        ld = tty_ldisc_ref(tty);
        if (ld) {
            if (ld->ops->write_wakeup)
                ld->ops->write_wakeup(tty);
            tty_ldisc_deref(ld);
        }
    }
    wake_up_interruptible_poll(&tty->write_wait, POLLOUT);
}
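
The relation "faulting address = pointer value + member offset" can be sketched in plain C. The struct below is a hypothetical stand-in and not the real struct tty_struct. It only pads the layout so that flags lands at 0xa8, as in the dump above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct tty_struct. Only the position of
 * flags matters here, so everything before it is opaque padding. */
struct mock_tty {
    unsigned char before_flags[0xa8]; /* placeholder for magic ... session */
    unsigned long flags;
};

/* Address the CPU would try to load for tty->flags, given the raw
 * pointer value of tty. No memory is actually dereferenced. */
uintptr_t flags_address(uintptr_t tty)
{
    return tty + offsetof(struct mock_tty, flags);
}
```

With tty equal to 0, flags_address() returns 0xa8. This is a handy trick when reading oops logs in general: a small faulting address usually means a NULL struct pointer, and the address itself tells you which member was being accessed.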

Now let's walk back up to gs_start_io(), the function that called tty_wakeup(). As shown below, tty_wakeup() is called at line 703.
So let's take a closer look at the port variable.
 672 static int gs_start_io(struct gs_port *port)
 673 {
 674     struct list_head    *head = &port->read_pool;
 675     struct usb_ep       *ep = port->port_usb->out;
 676     int         status;
 677     unsigned        started;
 678
 700
 701     /* unblock any pending writes into our circular buffer */
 702     if (started) {
 703         tty_wakeup(port->port_tty);
 704     } else {
 705         gs_free_requests(ep, head, &port->read_allocated);
 706         gs_free_requests(port->port_usb->in, &port->write_pool,
 707             &port->write_allocated);
 708         status = -EIO;
 709     }
 710
 711     return status;
 712 }
 
Inspecting port in the debugger, as below, shows that port_tty is NULL. The next step is to find out why this pointer is NULL.
gs_start_io(
    (register struct gs_port *) port = 0xE29BA2C0 = __bss_stop+0x21AB0E10 -> (
      (spinlock_t) port_lock = (
        (struct raw_spinlock) rlock = (
          (arch_spinlock_t) raw_lock = (
            (unsigned int) lock = 1 = 0x1 = '....'),
          (unsigned int) break_lock = 0 = 0x0 = '....')),
      (struct gserial *) port_usb = 0xE17DB080 = __bss_stop+0x208D1BD0 -> ((struct usb_function) func = ((char *) name = 0
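
Until the root cause is known, one possible stopgap is to skip the wakeup when no tty is attached to the port. The sketch below uses made-up mock types (fake_tty, fake_port) rather than the real driver structures, and it is not the actual fix. It only illustrates the shape of such a guard. In the real driver the check would also have to be made under port_lock, and it would hide the symptom without explaining why port_tty is NULL.

```c
#include <stddef.h>

/* Hypothetical mock types, standing in for struct tty_struct and
 * struct gs_port. They are not the kernel definitions. */
struct fake_tty  { int wakeups; };
struct fake_port { struct fake_tty *port_tty; };

/* Stand-in for tty_wakeup(): like the real one, it assumes tty != NULL. */
void fake_tty_wakeup(struct fake_tty *tty)
{
    tty->wakeups++;
}

/* Wake the writer only if a tty is currently attached to the port.
 * Returns 1 if the wakeup was delivered, 0 if it was skipped. */
int wake_writer_if_open(struct fake_port *port)
{
    struct fake_tty *tty = port->port_tty;

    if (tty == NULL)
        return 0;   /* nothing to wake up, so avoid the NULL dereference */

    fake_tty_wakeup(tty);
    return 1;
}
```

With port_tty set to NULL the function returns 0 without touching the tty. With a valid tty it returns 1 and the wakeup counter is incremented.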

 