watchdog reset - race in ipv6_ifa_notify()

#커널 크래시 디버깅 및 TroubleShooting

예전 MTK SoC에서 일어났던 Watchdog Reset 이슈인데요. 디버깅 정보를 조금 공유하고자 해요.

CPU1에서 수행 중인 "migration/1" process에서 watchdog reset이 발생하여 시스템이 리셋된 흔적을 확인할 수 있어요.

-000|do_undefinstr()

-001|__und_svc_fault(asm)

-->|exception

-002|aee_wdt_atf_info()

-003|aee_wdt_atf_entry()

-004|debug_smp_processor_id()

-005|cpu_stopper_thread()

-006|smpboot_thread_fn()

-007|kthread()

-008|ret_from_fork(asm)

-009|ret_fast_syscall(asm)

MTK SoC에서는 각 CPU가 Hotplug 동작 전 후에서 Watchdog Kick 여부를 파악하는데요.

cpus_kick_bit 변수는 hotplugin을 수행 요청을 한 CPU의 비트마스크이고, kick_bit은 hotplugin 동작 완료 시 notifier call이 수행된 CPU 비트 필드를 보여줘요.

11110011 (where) cpus_kick_bit = 0xF3

00000000 (where)kick_bit = 0x0

그럼 위 정보로 보아, CPU0/1/4/5/6/7에서 무엇인가 Stuck이 되어 Watchdog Kick을 못했다고 추정할 수 있어요.

각 CPU의 콜스택을 점검해보니 CPU4에서 "kworker/4:3" 프로세스가 아래 콜스택으로 구동 중임을 알 수 있어요.

-000|__ipv6_dev_ac_inc()

-001|addrconf_join_anycast()

-002|__ipv6_ifa_notify()

-003|local_bh_enable(inline)

-003|rcu_read_unlock_bh(inline)

-003|ipv6_ifa_notify()

-004|addrconf_dad_begin(inline)

-004|addrconf_dad_work()

-005|static_key_count(inline)

-005|static_key_false(inline)

-005|trace_workqueue_execute_end(inline)

-005|process_one_work()

-006|list_empty(inline)

-006|worker_thread()

-007|kthread()

-008|ret_from_fork(asm)

-009|ret_fast_syscall(asm)

그럼 ftrace log로 실제 각 CPU별로 어떤 동작을 수행 중이었는 지 살펴보도록 할께요.

우선 CPU4에서 어떤 동작이 있었는지 살펴보면, "kworker/4:3-10711" 프로세스가 addrconf_dad_work work를 수행 중인 로그가 확인되요.

그런데, workqueue_execute_end 0xd8c3cb7c란 로그가 안 보이는 것으로 보아 위에 제가 잡은 콜스택 정보와 일치하는 정보이군요.

addrconf_dad_work -> ipv6_ifa_notify -> __ipv6_ifa_notify -> addrconf_join_anycast

그리고 마지막 ftrace로그의 타임 스탬프는 1595.644768로 확인되거든요. CPU4가 22초 동안 preemption되지 않고 stuck되었다는 정보를 확인할 수 있죠.

kworker/4:3-10711 [004] 1573.162047: workqueue_execute_start: work struct 0xd8c3cb7c: function addrconf_dad_work

kworker/4:3-10711 [004] 1573.162099: sched_migrate_task: comm=dnsmasq pid=11779 prio=120 orig_cpu=5 dest_cpu=6 state=W

kworker/4:3-10711 [004] 1573.162114: sched_wakeup: comm=dnsmasq pid=11779 prio=120 success=1 target_cpu=006 state=W

kworker/4:3-10711 [004] 1573.162128: sched_wakeup: comm=vpnclientd pid=2786 prio=120 success=1 target_cpu=004 state=W

kworker/4:3-10711 [004] 1573.162136: sched_migrate_task: comm=netd pid=1145 prio=120 orig_cpu=5 dest_cpu=4 state=W

kworker/4:3-10711 [004] 1573.162140: sched_wakeup: comm=netd pid=1145 prio=120 success=1 target_cpu=004 state=W

kworker/4:3-10711 [004] 1573.164771: irq_handler_entry: irq=30 name=arch_timer

kworker/4:3-10711 [004] 1573.164795: irq_handler_exit: irq=30 ret=handled

kworker/4:3-10711 [004] 1573.170639: irq_handler_entry: irq=216 name=ovl0

kworker/4:3-10711 [004] 1573.170651: irq_handler_exit: irq=216 ret=handled

kworker/4:3-10711 [004] 1573.170771: irq_handler_entry: irq=218 name=rdma0

kworker/4:3-10711 [004] 1573.170789: sched_wakeup: comm=VSyncThread_0 pid=455 prio=112 success=1 target_cpu=006 state=W

kworker/4:3-10711 [004] 1573.170792: irq_handler_exit: irq=218 ret=handled

kworker/4:3-10711 [004] 1573.174770: irq_handler_entry: irq=30 name=arch_timer

kworker/4:3-10711 [004] 1573.174786: irq_handler_exit: irq=30 ret=handled

kworker/4:3-10711 [004] 1573.175936: irq_handler_entry: irq=209 name=mutex

kworker/4:3-10711 [004] 1573.175944: irq_handler_exit: irq=209 ret=handled

kworker/4:3-10711 [004] 1573.176141: irq_handler_entry: irq=270 name=AHB_SLAVE_HIF

kworker/4:3-10711 [004] 1573.176150: sched_wakeup: comm=tx_thread pid=11483 prio=120 success=1 target_cpu=006 state=W

kworker/4:3-10711 [004] 1573.176152: irq_handler_exit: irq=270 ret=handled

// 생략

migration/1-11 [001] 1595.638701: irq_handler_entry: irq=232 name=ovl1_2l

migration/1-11 [001] 1595.638704: irq_handler_exit: irq=232 ret=handled

migration/1-11 [001] 1595.644768: irq_handler_entry: irq=30 name=arch_timer

// end of ftrace log

이제 더 watchdog reset의 원인을 더 좁히기 위해, 디버깅 정보를 좀 더 뒤져봐야 하는데요.

Uninterruptible 상태 프로세스 정보를 확인해볼께요. 보통 크래시 코어 덤프가 나오면 습관적으로 점검을 하곤 하거든요.

그런데 아래와 같이 평소에 보이지 않는 프로세스들이 보이네요.

crash> ps -l | grep UN

[1573221067182] [UN] PID: 1043 TASK: d8959000 CPU: 5 COMMAND: "android.io"

[1573169741644] [UN] PID: 273 TASK: dd903600 CPU: 5 COMMAND: "bat_update_thre"

[1573164748105] [UN] PID: 6531 TASK: c748a000 CPU: 5 COMMAND: "DnsConfigServic"

[1573163602413] [UN] PID: 4299 TASK: c1a8e600 CPU: 6 COMMAND: "DnsConfigServic"

[1573160763720] [UN] PID: 25 TASK: de900600 CPU: 4 COMMAND: "kworker/4:0"

[1573134929413] [UN] PID: 166 TASK: ddca8600 CPU: 4 COMMAND: "disp_idlemgr"

[1572444794180] [UN] PID: 172 TASK: dea1d000 CPU: 4 COMMAND: "disp_check"

[1571706203179] [UN] PID: 221 TASK: ddfc2600 CPU: 0 COMMAND: "touch_boost"

[ 67489120544] [UN] PID: 279 TASK: ddb81000 CPU: 5 COMMAND: "reclaim_task"

[ 61815993377] [UN] PID: 5447 TASK: c7464000 CPU: 5 COMMAND: "secwidevinemdw_"

[ 5302444704] [UN] PID: 483 TASK: da78d600 CPU: 4 COMMAND: "rpmb_Dci"

[ 1263689925] [UN] PID: 281 TASK: ddb83000 CPU: 1 COMMAND: "reset_task"

[ 695977770] [UN] PID: 168 TASK: de3d3000 CPU: 4 COMMAND: "decouple_trigge"

[ 680735924] [UN] PID: 165 TASK: ddc76000 CPU: 4 COMMAND: "ddp_irq_log_kth"

PID: 1043 TASK: d8959000 CPU: 5 COMMAND: "android.io" 프로세스 콜스택을 보니,

mutex lock을 획득하지 못해서 __schedule() 함수 호출로 sleep에 들어가네요.

-000|context_switch(inline)

-000|__schedule()

-001|schedule()

-002|schedule_preempt_disabled()

-003|spin_lock(inline)

-003|__mutex_lock_common(inline)

-003|__mutex_lock_slowpath(lock_count = 0xC0FBFB94)

-004|__mutex_fastpath_lock(inline)

-004|mutex_lock(lock = 0x0)

-005|rtnl_lock()

-006|devinet_ioctl(net = 0xC0FBEC40, ?, arg = 0x9BE1A20C)

-007|inet_ioctl(?, ?, ?)

-008|sock_do_ioctl(inline)

-008|sock_ioctl(?, cmd = 35093, ?)

-009|vfs_ioctl(inline)

-009|do_vfs_ioctl(filp = 0x0, ?, ?, ?)

-010|SYSC_ioctl(inline)

-010|sys_ioctl(fd = 267, cmd = 35093, arg = -1679711732)

-011|ret_fast_syscall(asm)

TASK: c748a000 CPU: 5 COMMAND: "DnsConfigServic" 프로세스도 마찬가지군요.

-000|context_switch(inline)

-000|__schedule()

-001|schedule()

-002|schedule_preempt_disabled()

-003|spin_lock(inline)

-003|__mutex_lock_common(inline)

-003|__mutex_lock_slowpath(lock_count = 0xC0FBFB94)

-004|__mutex_fastpath_lock(inline)

-004|mutex_lock(lock = 0x0)

-005|rtnl_lock()

-006|wireless_process_ioctl(inline)

-006|wext_ioctl_dispatch(inline)

-006|wext_handle_ioctl(net = 0xC0FBEC40, ifr = 0xC4EC5EA8, cmd = 35585, arg = 0x

-007|dev_ioctl(net = 0xC0FBEC40, cmd = 35585, arg = 0x9B73A4B4)

-008|sock_do_ioctl(inline)

-008|sock_ioctl(?, cmd = 35585, ?)

-009|vfs_ioctl(inline)

-009|do_vfs_ioctl(filp = 0x0, ?, ?, ?)

-010|SYSC_ioctl(inline)

-010|sys_ioctl(fd = 19, cmd = 35585, arg = -1686920012)

-011|ret_fast_syscall(asm)

위 두 프로세스에서 rtnl_lock() 함수 수행 후 __schedule() 함수 실행 중인 공통점을 확인할 수 있네요.

rtnl_lock() 함수 코드 리뷰 좀 해볼까요? rtnl_lock() 함수 body를 보니, rtnl_mutex란 mutex 변수로 락을 걸고 있음을 알 수 있어요.

void rtnl_lock(void)

{

mutex_lock(&rtnl_mutex);

// .. 생략..

rtnl_mutex 변수를 확인해보니 mutex lock owner가 "kworker/4:3"임을 알수 있어요.

mutex deadlock현상은 결국 문제의 원인이 아니라 결과임을 할 수 있어요.

(static struct mutex) rtnl_mutex = (

(atomic_t) count = ((int) counter = 0xFFFFFFFD),

(spinlock_t) wait_lock = ((struct raw_spinlock) rlock = ((arch_spinlock_t) r

(struct list_head) wait_list = (

(struct list_head *) next = 0xD91E5E04 -> (

(struct list_head *) next = 0xC4EC5E04 -> (

(struct list_head *) next = 0xD92E3E14 -> (

(struct list_head *) next = 0xC0FBFB9C -> (

(struct list_head *) next = 0xD91E5E04,

(struct list_head *) prev = 0xD92E3E14),

(struct list_head *) prev = 0xC4EC5E04),

(struct list_head *) prev = 0xD91E5E04),

(struct list_head *) prev = 0xC0FBFB9C),

(struct list_head *) prev = 0xD92E3E14),

(struct task_struct *) owner = 0xDA073600 -> (

(long int) state = 0x0,

(void *) stack = 0xC5728000,

// ..생략..

(struct cred *) cred = 0xD8FBD600,

(char [16]) comm = "kworker/4:3",

(int) link_count = 0x0,

그 이유는 "kworker/4:3" 프로세스가 rtnl_mutex 이란 mutex을 잠그고 __ipv6_dev_ac_inc() 함수에서 spinning으로 돌고 있었기 때문에, 다른 프로세스들이 rtnl_mutex란 mutex을 획득하지 못하는 상황이거든요.

-000|__ipv6_dev_ac_inc()

-001|addrconf_join_anycast()

-002|__ipv6_ifa_notify()

-003|local_bh_enable(inline)

-003|rcu_read_unlock_bh(inline)

-003|ipv6_ifa_notify()

-004|addrconf_dad_begin(inline)

-004|addrconf_dad_work()

-005|static_key_count(inline)

-005|static_key_false(inline)

-005|trace_workqueue_execute_end(inline)

-005|process_one_work()

-006|list_empty(inline)

-006|worker_thread()

-007|kthread()

-008|ret_from_fork(asm)

-009|ret_fast_syscall(asm)

그럼 "kworker/4:3" 프로세스가 rtnl_mutex란 mutex를 언제 거냐구요? 위 콜스택 정보를 참고하면 addrconf_dad_work 함수에서 rtnl_lock() 함수 호출로 rtnl_mutex란 mutex를 잠그고 있어요.

static void addrconf_dad_work(struct work_struct *w)

{

struct inet6_ifaddr *ifp = container_of(to_delayed_work(w),

struct inet6_ifaddr,

dad_work);

struct inet6_dev *idev = ifp->idev;

struct in6_addr mcaddr;

enum {

DAD_PROCESS,

DAD_BEGIN,

DAD_ABORT,

} action = DAD_PROCESS;

rtnl_lock();

그럼 __ipv6_dev_ac_inc() 함수로 되돌아갈 시간이네요.

__ipv6_dev_ac_inc() 함수를 리뷰해보면, 아래 for loop에서 함수가 계속 무한루프를 돌고 있었는데요.

int __ipv6_dev_ac_inc(struct inet6_dev *idev, const struct in6_addr *addr)

{

struct ifacaddr6 *aca;

struct rt6_info *rt;

int err;

// .. skip..

for (aca = idev->ac_list; aca; aca = aca->aca_next) {

if (ipv6_addr_equal(&aca->aca_addr, addr)) {

aca->aca_users++;

err = 0;

goto out;

}

idev->ac_list 변수가 어떤 값을 갖고 있는 지 점검을 해보았어요.

확인해보니 참 황당하게도 이 값이 0x0이네요.

(struct inet6_dev *) (struct inet6_dev *)0xC2000400 = 0xC2000400 -> (

(struct net_device *) dev = 0xDA5CA800,

(struct list_head) addr_list = ((struct list_head *) next = 0xD8C3CBC0, (str

// .. skip ..

(struct ifacaddr6 *) ac_list = 0x0,

kworker/4:3-10711란 프로세스가 __ipv6_dev_ac_inc() 함수 내 for loop를 돌면서 idev->ac_list 리스트를 iteration하는 동안에, 다른 모듈이 race condition으로 __ipv6_dev_ac_inc() 함수에 진입하여 idev->ac_list에 access한 것으로 보이네요.

아래와 같은 디버깅 패치를 작성해서 race condition이 발생하는지 점검하고 있었는데.

diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c

index baf2742..44b28c0 100644

--- a/net/ipv6/anycast.c

+++ b/net/ipv6/anycast.c

@@ -257,6 +257,7 @@ int __ipv6_dev_ac_inc(struct inet6_dev *idev, const struct in6_addr *addr)

}

for (aca = idev->ac_list; aca; aca = aca->aca_next) {

+ trace_printk("%p caller: %pS\n", idev->ac_list, (void *)__builtin_return_address(0));

if (ipv6_addr_equal(&aca->aca_addr, addr)) {

aca->aca_users++;

err = 0;

네트워크 드라이버를 보는 분이 잽싸게 리눅스 커뮤니티에서 ipv6_ifa_notify() 함수가 race condition을 유발시킬 수 있다는 정보를 캐치하셨어요. 아래 패치 적용 후 문제가 재현되지 않았어요.

https://patchwork.kernel.org/patch/8147541/

ipv6_ifa_notify() may need to acquire idev->lock, for example

when calling __ipv6_dev_ac_inc() from addrconf_join_anycast()

in __ipv6_ifa_notify().

So caller must release idev->lock before calling ipv6_ifa_notify() to

avoid a deadlock.

Fix a checkpatch warning present in the same area to make checkpatch

happy.

#패치 코드

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c

index 3d9afb7..7b92c50 100644

--- a/net/ipv6/addrconf.c

+++ b/net/ipv6/addrconf.c

@@ -2227,10 +2227,12 @@ static void manage_tempaddrs(struct inet6_dev *idev,

ift->flags &= ~IFA_F_DEPRECATED;

spin_unlock(&ift->lock);

- if (!(flags&IFA_F_TENTATIVE))

- ipv6_ifa_notify(0, ift);

- }

+ if (!(flags&IFA_F_TENTATIVE)) {

+ read_unlock_bh(&idev->lock);

+ ipv6_ifa_notify(0, ift);

+ read_lock_bh(&idev->lock);

+ }

if ((create || list_empty(&idev->tempaddr_list)) &&

idev->cnf.use_tempaddr > 0) {

/* When a new public address is created as described

@@ -3340,7 +3342,9 @@ static void addrconf_dad_begin(struct inet6_ifaddr *ifp)

/* Because optimistic nodes can use this address,

* notify listeners. If DAD fails, RTM_DELADDR is sent.

- ipv6_ifa_notify(RTM_NEWADDR, ifp);

+ read_unlock_bh(&idev->lock);

+ ipv6_ifa_notify(RTM_NEWADDR, ifp);

+ read_lock_bh(&idev->lock);

}

"이 포스팅이 유익하다고 생각되시면 댓글로 응원해주시면 감사하겠습니다.

그리고 혹시 궁금점이 있으면 댓글로 질문 남겨주세요. 상세한 답글 올려드리겠습니다!"

# Reference: For more information on 'Linux Kernel';

디버깅을 통해 배우는 리눅스 커널의 구조와 원리. 1

http://www.yes24.com/Product/Goods/90085976

디버깅을 통해 배우는 리눅스 커널의 구조와 원리. 2

http://www.yes24.com/Product/Goods/90087307

저작자표시

'Core BSP 분석 > 커널 트러블슈팅' 카테고리의 다른 글

[Linux][Kernel] data abort@irq_affinity_notify - 리눅스 커널 (0)	2023.05.07
[Linux][Kernel] data abort @tty_wakeup - 리눅스 커널 (0)	2023.05.07
tombstone - __strlen_chk (0)	2023.05.07
Kernel][Panic] 메모리 불량 커널 크래시@find_vma_links - 1 (0)	2023.05.07
[Kernel][Panic] crash due to "signature and/or required key missing" (0)	2023.05.07

RISC-V and Arm Linux Kernel Hacks

watchdog reset - race in ipv6_ifa_notify() - 리눅스 커널

'Core BSP 분석 > 커널 트러블슈팅' 카테고리의 다른 글

티스토리툴바

watchdog reset - race in ipv6_ifa_notify() - 리눅스 커널

'Core BSP 분석 > 커널 트러블슈팅' 카테고리의 다른 글

'Core BSP 분석/커널 트러블슈팅' Related Articles

티스토리툴바