How to handle duplicated items in iobref during wb_fulfill_head? #4446

chen1195585098 (Contributor) commented Dec 31, 2024

Description of problem:
Are the duplicated items in an iobref during wb_fulfill_head expected behavior?

With performance.write-behind on, a writev request is WOUND in the wb_fulfill_head function after iobref_merge:

    /* wb_fulfill_head(): gather the vectors of every queued wind and
     * merge each request's iobref into the head request's iobref. */
    list_for_each_entry(req, &head->winds, winds)
    {
        WB_IOV_LOAD(vector, count, req, head);

        if (iobref_merge(head->stub->args.iobref, req->stub->args.iobref))
            goto err;
    }

Normally, such a request succeeds. However, if the request gets a failure from the brick, wb_fulfill_err is called, which eventually puts the failed request back on the wb_inode->todo list by calling __wb_add_request_for_retry, without clearing the merged iobrefs.

When these retried requests are picked from the wb_inode->todo list and are ready to WIND, iobref_merge is called again, and the same request iobufs from head->winds are repeatedly merged into head->stub->args.iobref. This leads to extra memory usage; a standalone model of the accumulation is sketched below.
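
The accumulation is easy to model outside GlusterFS. The following is a minimal standalone sketch, not the real libglusterfs structures: model_iobref, model_merge, and the retry loop are all hypothetical stand-ins, but they show how re-merging the same source iobref into the head on every retry grows the target without bound.

    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical stand-in for struct iobref: a growable array of
     * iobuf pointers, heavily simplified from libglusterfs. */
    struct model_iobref {
        void **bufs;
        int used;
        int allocated;
    };

    /* Append every buffer of 'from' to 'to' with no duplicate check,
     * mimicking the merge behavior described above. */
    static int
    model_merge(struct model_iobref *to, struct model_iobref *from)
    {
        for (int i = 0; i < from->used; i++) {
            if (to->used == to->allocated) {
                to->allocated = to->allocated ? to->allocated * 2 : 16;
                to->bufs = realloc(to->bufs, to->allocated * sizeof(void *));
                if (!to->bufs)
                    return -1;
            }
            to->bufs[to->used++] = from->bufs[i];
        }
        return 0;
    }

    int
    main(void)
    {
        void *buf = malloc(4096); /* one shared "iobuf" */
        struct model_iobref req = {&buf, 1, 1};
        struct model_iobref head = {0};

        /* Each failed fulfill requeues the request and re-merges the
         * same request iobref into the head iobref. */
        for (int retry = 1; retry <= 5; retry++) {
            model_merge(&head, &req);
            printf("after retry %d: head holds %d refs to one iobuf\n",
                   retry, head.used);
        }
        return 0;
    }
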
In production, the growth eventually exhausts memory. For instance,

[2024-12-28 05:01:52.897906] A [MSGID: 0] [mem-pool.c:201:__gf_realloc] : no memory available for size (18446744056529682504) current memory usage in kilobytes 19820560 [call stack follows]
/lib64/libglusterfs.so.0(+0x2a2b4)[0x7f2d2ae2a2b4]
/lib64/libglusterfs.so.0(_gf_msg_nomem+0x292)[0x7f2d2ae2a762]
/lib64/libglusterfs.so.0(__gf_realloc+0x1d8)[0x7f2d2ae52a18]
/lib64/libglusterfs.so.0(+0x57491)[0x7f2d2ae57491]

or

[root@localhost ~]# grep -rnE "cmdlinestr|running\ process" /home/mnt.log |tail -n 2
520590:[2024-12-31 07:49:43.212733 +0000] I [MSGID: 100030] [glusterfsd.c:2949:main] 0-/usr/local/sbin/glusterfs: Started running version [{arg=/usr/local/sbin/glusterfs}, {version=2024.12.31}, {cmdlinestr=/usr/local/sbin/glusterfs --log-file=/home/mnt.log --process-name fuse --volfile-server=localhost --volfile-id=issue /mnt}] 
520591:[2024-12-31 07:49:43.287638 +0000] I [glusterfsd.c:2638:daemonize] 0-glusterfs: Pid of current running process is 2443
[root@localhost ~]# grep -rnw 2443 /var/log/messages|grep oom
25131:Dec 31 17:05:34 localhost kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-0.slice/session-1.scope,task=glusterfs,pid=2443,uid=0
25132:Dec 31 17:05:34 localhost kernel: Out of memory: Killed process 2443 (glusterfs) total-vm:10298156kB, anon-rss:7152392kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:18480kB oom_score_adj:0

These additionally allocated iobrefs also appear to slow down write-behind request handling, which leads to high response latency for ls <mount_point> or cd <mount_point>.

Can we introduce a uniqueness check in iobref_merge when adding a new iobuf, or take some measure to clear the merged iobrefs in wb_fulfill_err?
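
For the first option, here is a minimal sketch of what a uniqueness check could look like, written against the simplified model above rather than the real iobref_merge() (the field names and the linear scan are assumptions; in libglusterfs the same scan would run over the target iobref's buffer array before appending):

    /* Sketch: skip iobufs the target already references before
     * appending. The O(n) scan should be acceptable for the small
     * arrays an iobref usually holds. Model code, not the real API. */
    static int
    model_merge_unique(struct model_iobref *to, struct model_iobref *from)
    {
        for (int i = 0; i < from->used; i++) {
            int dup = 0;
            for (int j = 0; j < to->used; j++) {
                if (to->bufs[j] == from->bufs[i]) {
                    dup = 1; /* already merged on an earlier retry */
                    break;
                }
            }
            if (dup)
                continue;
            if (to->used == to->allocated) {
                to->allocated = to->allocated ? to->allocated * 2 : 16;
                to->bufs = realloc(to->bufs, to->allocated * sizeof(void *));
                if (!to->bufs)
                    return -1;
            }
            to->bufs[to->used++] = from->bufs[i];
        }
        return 0;
    }

The alternative, dropping the already-merged refs in wb_fulfill_err before the request goes back to wb_inode->todo, would avoid the scan entirely, but would need care not to unref iobufs that other pending winds still hold.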

The exact command to reproduce the issue:

Additional info:

- The operating system / glusterfs version: CentOS, kernel 5.10, latest devel version.

