互斥量、读写锁长占时分析的利器——valgrind的DRD
? ? ? ? 在進行多線程編程時,我們可能會存在同時操作(讀、寫)同一份內存的可能性。為了保證數據的正確性,我們往往會使用互斥量、讀寫鎖等同步方法。(轉載請指明出于breaksoftware的csdn博客)
? ? ? ? 互斥量的用法如下
pthread_mutex_lock(&mutex);// do somethingpthread_mutex_unlock(&mutex);
? ? ? ? 我們在第2行處填充業務代碼。這樣一個線程上鎖成功后,其他線程必須等待這個鎖被釋放(第3行)。這也就意味著其他線程必須等著第2行業務代碼執行完畢才能繼續執行。
? ? ? ? 如果業務代碼非常耗時,就會導致整個程序執行的效率大打折扣。因為大量的線程都處在等待狀態,沒有充分利用CPU資源。這與多線程編程的初衷是相違背的。于是控制鎖粒度是個非常重要的優化設計方案。
? ? ? ? 但是,對于一個龐大的項目,可能使用互斥量加鎖的地方很多,我們如何排查出是哪個鎖的效率低呢?這個使用valgrind就該出場了。
? ? ? ? 我們設計一個例子
/** Hold several types of synchronization objects locked as long as specified.*/#define _GNU_SOURCE 1#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>static void delay_ms(const int ms) {struct timespec ts;assert(ms >= 0);ts.tv_sec = ms / 1000;ts.tv_nsec = (ms % 1000) * 1000 * 1000;nanosleep(&ts, 0);
}void double_lock_mutex(const int ms) {pthread_mutex_t mutex;pthread_mutexattr_t mutexattr;fprintf(stderr, "Locking mutex ...\n");pthread_mutexattr_init(&mutexattr);pthread_mutexattr_settype(&mutexattr, PTHREAD_MUTEX_RECURSIVE);pthread_mutex_init(&mutex, &mutexattr);pthread_mutexattr_destroy(&mutexattr);pthread_mutex_lock(&mutex);delay_ms(ms);pthread_mutex_lock(&mutex);pthread_mutex_unlock(&mutex);pthread_mutex_unlock(&mutex);pthread_mutex_destroy(&mutex);
}int main(int argc, char** argv) {int interval = 0;int optchar;while ((optchar = getopt(argc, argv, "i:")) != EOF) {switch (optchar) {case 'i':interval = atoi(optarg);break;default:fprintf(stderr, "Usage: %s [-i <interval time in ms>].\n", argv[0]);break;}}double_lock_mutex(interval);fprintf(stderr, "Done.\n");return 0;
}
? ? ? ??delay_ms方法接受程序傳入的參數,然后休眠相應的毫秒數。這個操作用于模擬業務代碼,當我們希望業務代碼執行較快時,則把該時間調低;當我們希望業務代碼非常耗時時,則把該時間調大。
? ? ? ? 使用下面指令編譯
gcc hold_lock.c -g -lpthread -o hold_lock
? ? ? ? 對于產出,我們可以這么調用
./hold_lock -i 10000
? ? ? ? 程序將在10000ms(10s)后執行完畢。相當于一個復雜的業務代碼執行了10秒。
? ? ? ? 然后我們使用下面的valgrind指令來檢查鎖占用的時間
valgrind --tool=drd --exclusive-threshold=10 ./hold_lock -i 20
? ? ? ? 這次我們讓業務代碼只執行20ms,但是使用--exclusive-threshold=10參數的意思是:檢查所有獨占鎖占用10ms已上的場景。
==4000== drd, a thread error detector
==4000== Copyright (C) 2006-2017, and GNU GPL'd, by Bart Van Assche.
==4000== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==4000== Command: ./hold_lock -i 20
==4000==
Locking mutex ...
==4000== Acquired at:
==4000== at 0x4C39193: pthread_mutex_lock (in /usr/lib/valgrind/vgpreload_drd-amd64-linux.so)
==4000== by 0x108E1E: double_lock_mutex (hold_lock.c:31)
==4000== by 0x109029: main (hold_lock.c:80)
==4000== Lock on mutex 0x1ffefffe60 was held during 22 ms (threshold: 10 ms).
==4000== at 0x4C3A123: pthread_mutex_unlock (in /usr/lib/valgrind/vgpreload_drd-amd64-linux.so)
==4000== by 0x108E4C: double_lock_mutex (hold_lock.c:35)
==4000== by 0x109029: main (hold_lock.c:80)
==4000== mutex 0x1ffefffe60 was first observed at:
==4000== at 0x4C385F0: pthread_mutex_init (in /usr/lib/valgrind/vgpreload_drd-amd64-linux.so)
==4000== by 0x108E06: double_lock_mutex (hold_lock.c:29)
==4000== by 0x109029: main (hold_lock.c:80)
==4000==
Done.
==4000==
==4000== For counts of detected and suppressed errors, rerun with: -v
==4000== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
? ? ? ? 第11行顯示,這個互斥量占用了22ms。它是在hold_lock.c的第29行(第17行顯示)第一次被使用的,在第31行(第9行顯示)第一次被上鎖,在第35行(第13行顯示)最后一次被解鎖。如此我們便能找到耗時超過10ms的獨占鎖了。
? ? ? ? 我們再改下執行指令,讓業務代碼執行(休眠)9ms。這是處在比較靠近邊界10ms的時間,所以我們多執行幾次下面命令,可以看到有時候可能檢測到超過10ms的,有時候也沒有。
valgrind --tool=drd --exclusive-threshold=10 ./hold_lock -i 9
==4026== Command: ./hold_lock -i 9
==4026==
Locking mutex ...
Done.
==4026==
? ? ? ? 上面是不超過10ms的場景,下面是超過的場景。
==4027==
Locking mutex ...
==4027== Acquired at:
==4027== at 0x4C39193: pthread_mutex_lock (in /usr/lib/valgrind/vgpreload_drd-amd64-linux.so)
==4027== by 0x108E1E: double_lock_mutex (hold_lock.c:31)
==4027== by 0x109029: main (hold_lock.c:80)
==4027== Lock on mutex 0x1ffefffe60 was held during 11 ms (threshold: 10 ms).
==4027== at 0x4C3A123: pthread_mutex_unlock (in /usr/lib/valgrind/vgpreload_drd-amd64-linux.so)
==4027== by 0x108E4C: double_lock_mutex (hold_lock.c:35)
==4027== by 0x109029: main (hold_lock.c:80)
==4027== mutex 0x1ffefffe60 was first observed at:
==4027== at 0x4C385F0: pthread_mutex_init (in /usr/lib/valgrind/vgpreload_drd-amd64-linux.so)
==4027== by 0x108E06: double_lock_mutex (hold_lock.c:29)
==4027== by 0x109029: main (hold_lock.c:80)
==4027==
Done.
==4027==
? ? ? ? 除了互斥量,這個方式還可以檢測讀寫鎖。
? ? ? ? 讀寫鎖又稱共享-獨占鎖。當寫鎖被設置,其他獲取鎖的操作都會進入等待狀態(獨占);當讀鎖被設置,其他線程仍然可以獲取讀鎖(共享),但是寫鎖需要等待所有讀鎖釋放后才可以獲得。
? ? ? ? 我們看個寫鎖耗時長的例子
void write_lock(const int ms) {pthread_rwlock_t rwlock;fprintf(stderr, "Locking rwlock exclusively ...\n");pthread_rwlock_init(&rwlock, 0);pthread_rwlock_wrlock(&rwlock);delay_ms(ms);pthread_rwlock_unlock(&rwlock);pthread_rwlock_destroy(&rwlock);
}
? ? ? ? 仍然使用exclusive-threshold參數去檢測
valgrind --tool=drd --exclusive-threshold=10 ./hold_lock -i 20
? ? ? ? 可以得到如下結果。其解讀方式和之前一致(注意此處的代碼行號是我文件中的行號,而非csdn顯示的局部代碼行號)。
==4074==
Locking rwlock exclusively ...
==4074== Acquired at:
==4074== at 0x4C41404: pthread_rwlock_wrlock (in /usr/lib/valgrind/vgpreload_drd-amd64-linux.so)
==4074== by 0x108EC6: write_lock (hold_lock.c:45)
==4074== by 0x109033: main (hold_lock.c:81)
==4074== Lock on rwlock 0x1ffefffe50 was held during 22 ms (threshold: 10 ms).
==4074== at 0x4C428D5: pthread_rwlock_unlock (in /usr/lib/valgrind/vgpreload_drd-amd64-linux.so)
==4074== by 0x108EDC: write_lock (hold_lock.c:47)
==4074== by 0x109033: main (hold_lock.c:81)
==4074== rwlock 0x1ffefffe50 was first observed at:
==4074== at 0x4C40685: pthread_rwlock_init (in /usr/lib/valgrind/vgpreload_drd-amd64-linux.so)
==4074== by 0x108EBA: write_lock (hold_lock.c:44)
==4074== by 0x109033: main (hold_lock.c:81)
==4074==
Done.
? ? ? ? 最后看一個讀鎖耗時長的場景
void read_lock(const int ms) {pthread_rwlock_t rwlock;fprintf(stderr, "Locking rwlock shared ...\n");pthread_rwlock_init(&rwlock, 0);pthread_rwlock_rdlock(&rwlock);delay_ms(ms);pthread_rwlock_rdlock(&rwlock);pthread_rwlock_unlock(&rwlock);pthread_rwlock_unlock(&rwlock);pthread_rwlock_destroy(&rwlock);
}
? ? ? ? 由于讀鎖不是獨占鎖,所以我們不能使用exclusive-threshold去分析,而是要使用shared-threshold
valgrind --tool=drd --shared-threshold=10 ./hold_lock -i 20
? ? ? ? 其結果的解讀和前面一致
Locking rwlock shared ...
==4122== Acquired at:
==4122== at 0x4C40FB4: pthread_rwlock_rdlock (in /usr/lib/valgrind/vgpreload_drd-amd64-linux.so)
==4122== by 0x108F56: read_lock (hold_lock.c:57)
==4122== by 0x10903D: main (hold_lock.c:82)
==4122== Lock on rwlock 0x1ffefffe50 was held during 21 ms (threshold: 10 ms).
==4122== at 0x4C428D5: pthread_rwlock_unlock (in /usr/lib/valgrind/vgpreload_drd-amd64-linux.so)
==4122== by 0x108F84: read_lock (hold_lock.c:61)
==4122== by 0x10903D: main (hold_lock.c:82)
==4122== rwlock 0x1ffefffe50 was first observed at:
==4122== at 0x4C40685: pthread_rwlock_init (in /usr/lib/valgrind/vgpreload_drd-amd64-linux.so)
==4122== by 0x108F4A: read_lock (hold_lock.c:56)
==4122== by 0x10903D: main (hold_lock.c:82)
==4122==
Done.
?
總結
以上是生活随笔為你收集整理的互斥量、读写锁长占时分析的利器——valgrind的DRD的全部內容,希望文章能夠幫你解決所遇到的問題。
- 上一篇: 动态执行流程分析和性能瓶颈分析的利器——
- 下一篇: 死锁问题分析的利器——valgrind的