Using block_dump to See Which Files Linux IO Is Writing (mysqld)
- I. Usage
- II. How It Works
- III. Summary
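Developers debugging a program on Linux often need a detailed picture of its IO. The usual tools are vmstat and iostat; of the two, iostat is the more capable, with finer-grained, per-interval statistics. Both, however, only show IO system-wide, and aggregate numbers are no help in working out which file the IO is going to. This is where block_dump comes in.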
I. Usage
First stop syslog. block_dump emits its per-IO records through printk; if syslog is running, it copies them into the messages log file, which itself generates a lot of IO and distorts the results.

    suse:~ # service syslog stop
    Shutting down syslog services                              done

Then enable block_dump:

    suse:~ # echo 1 > /proc/sys/vm/block_dump

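Because block_dump floods the kernel log for as long as it is on, it can be convenient to enable it only around a measurement and put the previous value back afterwards. The following is a minimal sketch, assuming bash. The function name `with_block_dump` and its `-f` option are made up for illustration; `-f` only exists so the helper can be tried on a scratch file without root, and the default path is the real /proc/sys/vm/block_dump.

```shell
#!/usr/bin/env bash
# with_block_dump [-f CTL_FILE] CMD [ARGS...]
# Hypothetical helper: enable block_dump, run CMD, then restore the old value.
with_block_dump() {
    local ctl=/proc/sys/vm/block_dump
    if [ "$1" = "-f" ]; then
        ctl=$2
        shift 2
    fi
    local old rc
    old=$(cat "$ctl") || return 1      # remember the current setting
    echo 1 > "$ctl" || return 1        # any non-zero value turns logging on
    "$@"                               # the workload, or a plain `sleep N`
    rc=$?
    echo "$old" > "$ctl"               # restore even if CMD failed
    return "$rc"
}
```

Run as root, something like `with_block_dump sleep 10` followed by `dmesg | tail -n 50` would capture a ten-second window and leave the setting as it was found.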
Here is the result first:

    suse:~ # dmesg | tail
    dmesg(3414): dirtied inode 9594 (LC_MONETARY) on sda1
    dmesg(3414): dirtied inode 9238 (LC_COLLATE) on sda1
    dmesg(3414): dirtied inode 9241 (LC_TIME) on sda1
    dmesg(3414): dirtied inode 9606 (LC_NUMERIC) on sda1
    dmesg(3414): dirtied inode 9350 (LC_CTYPE) on sda1
    kjournald(506): WRITE block 3683672 on sda1
    kjournald(506): WRITE block 3683680 on sda1
    kjournald(506): WRITE block 3683688 on sda1
    kjournald(506): WRITE block 3683696 on sda1
    kjournald(506): WRITE block 3683704 on sda1
    kjournald(506): WRITE block 3683712 on sda1
    kjournald(506): WRITE block 3683720 on sda1
    kjournald(506): WRITE block 3683728 on sda1
    kjournald(506): WRITE block 3683736 on sda1
    kjournald(506): WRITE block 3683744 on sda1

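The dmesg output shows which files are being written: each line carries the process name and PID, the inode number, the file name, and the disk device. How much was written to each file cannot be read off the dirtied inode lines alone; for that you have to analyze the WRITE block lines. The number after WRITE block is not a filesystem block number but the sector number seen by the kernel block layer. Dividing it by 8 gives the filesystem block number (with 512-byte sectors and a 4 KiB filesystem block size), and the icheck and ncheck commands of debugfs then map that block to an inode and the inode to a file path. See the debugfs documentation for details.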
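The lookup described above can be sketched in shell. This is an illustration under stated assumptions, not a tested tool: the function names are hypothetical, the filesystem is assumed to be ext2/3/4 with 4 KiB blocks, and the parsing assumes that `debugfs -R "icheck N"` and `debugfs -R "ncheck N"` each print one header line followed by one row per result. The debugfs calls need root and only run if `file_for_sector` is invoked.

```shell
#!/usr/bin/env bash
# Sectors are 512 bytes, so a 4 KiB filesystem block spans 8 sectors.
# Usage: sector_to_fs_block SECTOR [FS_BLOCK_SIZE_BYTES]
sector_to_fs_block() {
    local sector=$1 block_size=${2:-4096}
    echo $(( sector / (block_size / 512) ))
}

# Hypothetical helper: print the path(s) that own the block behind SECTOR
# on DEVICE (for example /dev/sda1). Assumes 4 KiB filesystem blocks.
file_for_sector() {
    local sector=$1 device=$2 block inode
    block=$(sector_to_fs_block "$sector")
    # icheck: header line, then "<block> <inode>"
    inode=$(debugfs -R "icheck $block" "$device" 2>/dev/null |
            awk 'NR == 2 { print $2 }')
    # Give up when the block is unallocated or the output was unexpected
    case $inode in
        ''|*[!0-9]*) return 1 ;;
    esac
    # ncheck: header line, then "<inode> <path>"; strip the inode column
    debugfs -R "ncheck $inode" "$device" 2>/dev/null |
        awk 'NR > 1 { sub(/^[0-9]+[ \t]+/, ""); print }'
}

# First kjournald WRITE line from the dmesg output above
sector_to_fs_block 3683672
```

A call such as `file_for_sector 3683672 /dev/sda1` would then attempt the full sector-to-path lookup. Blocks that belong to the journal or to filesystem metadata may not resolve to a path.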
II. How It Works
The principle behind block_dump is simple: when the block_dump flag is set, the kernel intercepts every bio at the point where IO is submitted to the disk and prints its details:

    void submit_bio(int rw, struct bio *bio)
    {
        int count = bio_sectors(bio);

        bio->bi_rw |= rw;

        /*
         * If it's a regular read/write or a barrier with data attached,
         * go through the normal accounting stuff before submission.
         */
        if (bio_has_data(bio) && !(rw & REQ_DISCARD)) {
            if (rw & WRITE) {
                count_vm_events(PGPGOUT, count);
            } else {
                task_io_account_read(bio->bi_size);
                count_vm_events(PGPGIN, count);
            }

            /* block_dump set: log the submitting task, direction, start sector and device */
            if (unlikely(block_dump)) {
                char b[BDEVNAME_SIZE];
                printk(KERN_DEBUG "%s(%d): %s block %Lu on %s (%u sectors)\n",
                    current->comm, task_pid_nr(current),
                    (rw & WRITE) ? "WRITE" : "READ",
                    (unsigned long long)bio->bi_sector,
                    bdevname(bio->bi_bdev, b),
                    count);
            }
        }

        generic_make_request(bio);
    }

The relationship between the number printed after WRITE block and the filesystem block number is established in submit_bh:

    bio->bi_sector = bh->b_blocknr * (bh->b_size >> 9);

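Shifting right by 9 divides by 512, so `b_size >> 9` is the number of 512-byte sectors in one buffer. With a 4 KiB block size that is 8, which is where the "divide by 8" rule comes from and why consecutive WRITE lines in the logs step by 8. A small worked sketch of the same arithmetic follows; the function name is illustrative and the block sizes are just common values, not something stated in the original:

```shell
#!/usr/bin/env bash
# Mirrors: bio->bi_sector = bh->b_blocknr * (bh->b_size >> 9)
# Usage: fs_block_to_sector BLOCKNR [B_SIZE_BYTES]
fs_block_to_sector() {
    local blocknr=$1 b_size=${2:-4096}
    echo $(( blocknr * (b_size >> 9) ))
}

# Sectors per block for a few common filesystem block sizes
for b_size in 1024 2048 4096; do
    echo "b_size=$b_size -> $(( b_size >> 9 )) sectors per block"
done
```

On a filesystem with 1 KiB blocks the divisor would therefore be 2 rather than 8, so it is worth checking the block size before converting.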
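The inode side of block_dump is handled by block_dump___mark_inode_dirty. This time the checkpoint sits on the path where an inode is marked dirty, before its data is written back, and the details of every inode passing through are printed: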

    void __mark_inode_dirty(struct inode *inode, int flags)
    {
        /* ... excerpt: only the block_dump hook is shown ... */
        if (unlikely(block_dump))
            block_dump___mark_inode_dirty(inode);
    }

    static noinline void block_dump___mark_inode_dirty(struct inode *inode)
    {
        if (inode->i_ino || strcmp(inode->i_sb->s_id, "bdev")) {
            struct dentry *dentry;
            const char *name = "?";

            /* Use any dentry that points at this inode to get a file name */
            dentry = d_find_alias(inode);
            if (dentry) {
                spin_lock(&dentry->d_lock);
                name = (const char *) dentry->d_name.name;
            }
            printk(KERN_DEBUG
                   "%s(%d): dirtied inode %lu (%s) on %s\n",
                   current->comm, task_pid_nr(current), inode->i_ino,
                   name, inode->i_sb->s_id);
            if (dentry) {
                spin_unlock(&dentry->d_lock);
                dput(dentry);
            }
        }
    }

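III. Summary
1. The kernel has many suitable checkpoints for intercepting IO information. Without modifying the kernel, you can also hook these paths with jprobe and capture a great deal.
2. debugfs is too slow when converting large numbers of blocks to files; a custom tool written against the ext2fs library should be considerably faster.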
Original article: "Using block_dump to see which files Linux IO is writing", by OenHan
http://www.oenhan.com/block-dump-linux-io
A second run on a MySQL server, keeping only the mysqld lines:
[root@server-mysql log]# echo 5 > /proc/sys/vm/block_dump
[root@server-mysql log]# dmesg -c |grep mysqld
mysqld(11780): dirtied inode 1069049 (ib_logfile0) on sda3
mysqld(11780): dirtied inode 1069049 (ib_logfile0) on sda3
mysqld(11780): WRITE block 8236616 on sda3
mysqld(9674): dirtied inode 1069048 (ibdata1) on sda3
mysqld(9674): dirtied inode 1069048 (ibdata1) on sda3
mysqld(9674): WRITE block 8144896 on sda3
mysqld(9674): WRITE block 8144904 on sda3
mysqld(9674): WRITE block 8144912 on sda3
mysqld(9674): WRITE block 8144920 on sda3
mysqld(9674): WRITE block 8144928 on sda3
mysqld(9674): WRITE block 8144936 on sda3
mysqld(9674): WRITE block 8144944 on sda3
mysqld(9674): WRITE block 8144952 on sda3
mysqld(9674): dirtied inode 1071023 (kk.ibd) on sda3
mysqld(9674): dirtied inode 1071023 (kk.ibd) on sda3
mysqld(9663): WRITE block 32762104 on sda3
mysqld(9663): WRITE block 32762112 on sda3
mysqld(9663): WRITE block 32762120 on sda3
mysqld(9663): WRITE block 32762128 on sda3
mysqld(9663): WRITE block 16177376 on sda3
mysqld(9663): WRITE block 16177384 on sda3
mysqld(9663): WRITE block 16177392 on sda3
mysqld(9663): WRITE block 16177400 on sda3
mysqld(9658): WRITE block 8175616 on sda3
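A log like this is easier to read once it is grouped by process. The sketch below assumes the line layout shown above (no dmesg timestamp prefix, file names without spaces); the function name `summarize_block_dump` is made up for illustration. For each process it counts the WRITE lines and lists the distinct files that process dirtied.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read block_dump lines on stdin and print, per
# process, the number of WRITE lines and the distinct dirtied file names.
summarize_block_dump() {
    awk '
    {
        proc = $1
        sub(/:$/, "", proc)                  # "mysqld(9674):" -> "mysqld(9674)"
    }
    $2 == "WRITE" && $3 == "block" {
        writes[proc]++
        seen[proc] = 1
    }
    $2 == "dirtied" && $3 == "inode" {
        name = $5
        gsub(/[()]/, "", name)               # "(ibdata1)" -> "ibdata1"
        seen[proc] = 1
        if (!((proc, name) in listed)) {     # report each file once per process
            listed[proc, name] = 1
            files[proc] = (proc in files) ? (files[proc] "," name) : name
        }
    }
    END {
        for (proc in seen)
            printf "%s writes=%d files=%s\n", proc, writes[proc],
                   ((proc in files) ? files[proc] : "-")
    }' | sort
}
```

It could be fed directly from the kernel log, for example `dmesg -c | grep mysqld | summarize_block_dump`. The WRITE count is a number of log lines, not a byte count, since this log format does not include the size of each request.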
Reposted from: https://www.cnblogs.com/zengkefu/p/5639167.html