当前位置: 首页 > news >正文

Redis4.x新特性 -- 萌萌的MEMORY DOCTOR

Redis4.x版本去年发布以后,新增了许多新的功能特性。大致翻看下来,一个叫MEMORY DOCTOR的命令吸引了我的注意。MEMORY DOCTOR命令是Redis4.x版本新增MEMORY 命令下的一个子命令,它可以通过诊断给出关于redis内存使用方面的建议,在不同的状态下会有不同的分析结果。此时我的脑海里第一个闪过念头:最强AI?redis是不是通过什么复杂的人工智能算法,对其使用的内存状况做了全方位的分析进而给出了最合理的优化建议?它是不是和AlphaGo一样通过学习会变得越来越强?伴随着疑问我尝试的从github上找了找相关的源码,最后知道真相的我眼泪掉来,不得不说MEMORY DOCTOR的实现真的挺萌的,萌的可爱。什么最强AI,我上我也行呀。

使用展示

doctor

源码分析

MEMORY DOCTOR功能源码地址 https://github.com/antirez/redis/blob/unstable/src/object.c
实现逻辑全在1080行开始的getMemoryDoctorReport方法中

/* This implements MEMORY DOCTOR. An human readable analysis of the Redis
 * memory condition. */
sds getMemoryDoctorReport(void) {
    int empty = 0;          /* Instance is empty or almost empty. */
    int big_peak = 0;       /* Memory peak is much larger than used mem. */
    int high_frag = 0;      /* High fragmentation. */
    int high_alloc_frag = 0;/* High allocator fragmentation. */
    int high_proc_rss = 0;  /* High process rss overhead. */
    int high_alloc_rss = 0; /* High rss overhead. */
    int big_slave_buf = 0;  /* Slave buffers are too big. */
    int big_client_buf = 0; /* Client buffers are too big. */
    int many_scripts = 0;   /* Script cache has too many scripts. */
    int num_reports = 0;
    struct redisMemOverhead *mh = getMemoryOverheadData();

    if (mh->total_allocated < (1024*1024*5)) {
        empty = 1;
        num_reports++;
    } else {
        /* Peak is > 150% of current used memory? */
        if (((float)mh->peak_allocated / mh->total_allocated) > 1.5) {
            big_peak = 1;
            num_reports++;
        }

        /* Fragmentation is higher than 1.4 and 10MB ?*/
        if (mh->total_frag > 1.4 && mh->total_frag_bytes > 10<<20) {
            high_frag = 1;
            num_reports++;
        }

        /* External fragmentation is higher than 1.1 and 10MB? */
        if (mh->allocator_frag > 1.1 && mh->allocator_frag_bytes > 10<<20) {
            high_alloc_frag = 1;
            num_reports++;
        }

        /* Allocator fss is higher than 1.1 and 10MB ? */
        if (mh->allocator_rss > 1.1 && mh->allocator_rss_bytes > 10<<20) {
            high_alloc_rss = 1;
            num_reports++;
        }

        /* Non-Allocator fss is higher than 1.1 and 10MB ? */
        if (mh->rss_extra > 1.1 && mh->rss_extra_bytes > 10<<20) {
            high_proc_rss = 1;
            num_reports++;
        }

        /* Clients using more than 200k each average? */
        long numslaves = listLength(server.slaves);
        long numclients = listLength(server.clients)-numslaves;
        if (mh->clients_normal / numclients > (1024*200)) {
            big_client_buf = 1;
            num_reports++;
        }

        /* Slaves using more than 10 MB each? */
        if (numslaves > 0 && mh->clients_slaves / numslaves > (1024*1024*10)) {
            big_slave_buf = 1;
            num_reports++;
        }

        /* Too many scripts are cached? */
        if (dictSize(server.lua_scripts) > 1000) {
            many_scripts = 1;
            num_reports++;
        }
    }

    sds s;
    if (num_reports == 0) {
        s = sdsnew(
        "Hi Sam, I can't find any memory issue in your instance. "
        "I can only account for what occurs on this base.\n");
    } else if (empty == 1) {
        s = sdsnew(
        "Hi Sam, this instance is empty or is using very little memory, "
        "my issues detector can't be used in these conditions. "
        "Please, leave for your mission on Earth and fill it with some data. "
        "The new Sam and I will be back to our programming as soon as I "
        "finished rebooting.\n");
    } else {
        s = sdsnew("Sam, I detected a few issues in this Redis instance memory implants:\n\n");
        if (big_peak) {
            s = sdscat(s," * Peak memory: In the past this instance used more than 150% the memory that is currently using. The allocator is normally not able to release memory after a peak, so you can expect to see a big fragmentation ratio, however this is actually harmless and is only due to the memory peak, and if the Redis instance Resident Set Size (RSS) is currently bigger than expected, the memory will be used as soon as you fill the Redis instance with more data. If the memory peak was only occasional and you want to try to reclaim memory, please try the MEMORY PURGE command, otherwise the only other option is to shutdown and restart the instance.\n\n");
        }
        if (high_frag) {
            s = sdscatprintf(s," * High total RSS: This instance has a memory fragmentation and RSS overhead greater than 1.4 (this means that the Resident Set Size of the Redis process is much larger than the sum of the logical allocations Redis performed). This problem is usually due either to a large peak memory (check if there is a peak memory entry above in the report) or may result from a workload that causes the allocator to fragment memory a lot. If the problem is a large peak memory, then there is no issue. Otherwise, make sure you are using the Jemalloc allocator and not the default libc malloc. Note: The currently used allocator is \"%s\".\n\n", ZMALLOC_LIB);
        }
        if (high_alloc_frag) {
            s = sdscatprintf(s," * High allocator fragmentation: This instance has an allocator external fragmentation greater than 1.1. This problem is usually due either to a large peak memory (check if there is a peak memory entry above in the report) or may result from a workload that causes the allocator to fragment memory a lot. You can try enabling 'activedefrag' config option.\n\n");
        }
        if (high_alloc_rss) {
            s = sdscatprintf(s," * High allocator RSS overhead: This instance has an RSS memory overhead is greater than 1.1 (this means that the Resident Set Size of the allocator is much larger than the sum what the allocator actually holds). This problem is usually due to a large peak memory (check if there is a peak memory entry above in the report), you can try the MEMORY PURGE command to reclaim it.\n\n");
        }
        if (high_proc_rss) {
            s = sdscatprintf(s," * High process RSS overhead: This instance has non-allocator RSS memory overhead is greater than 1.1 (this means that the Resident Set Size of the Redis process is much larger than the RSS the allocator holds). This problem may be due to Lua scripts or Modules.\n\n");
        }
        if (big_slave_buf) {
            s = sdscat(s," * Big replica buffers: The replica output buffers in this instance are greater than 10MB for each replica (on average). This likely means that there is some replica instance that is struggling receiving data, either because it is too slow or because of networking issues. As a result, data piles on the master output buffers. Please try to identify what replica is not receiving data correctly and why. You can use the INFO output in order to check the replicas delays and the CLIENT LIST command to check the output buffers of each replica.\n\n");
        }
        if (big_client_buf) {
            s = sdscat(s," * Big client buffers: The clients output buffers in this instance are greater than 200K per client (on average). This may result from different causes, like Pub/Sub clients subscribed to channels bot not receiving data fast enough, so that data piles on the Redis instance output buffer, or clients sending commands with large replies or very large sequences of commands in the same pipeline. Please use the CLIENT LIST command in order to investigate the issue if it causes problems in your instance, or to understand better why certain clients are using a big amount of memory.\n\n");
        }
        if (many_scripts) {
            s = sdscat(s," * Many scripts: There seem to be many cached scripts in this instance (more than 1000). This may be because scripts are generated and `EVAL`ed, instead of being parameterized (with KEYS and ARGV), `SCRIPT LOAD`ed and `EVALSHA`ed. Unless `SCRIPT FLUSH` is called periodically, the scripts' caches may end up consuming most of your memory.\n\n");
        }
        s = sdscat(s,"I'm here to keep you safe, Sam. I want to help you.\n");
    }
    freeMemoryOverheadData(mh);
    return s;
}

通过阅读源码,我们不难发现redis会从Instance is empty or not,Memory used peak,High fragmentation,High allocator fragmentation,High process rss overhead,High rss overhead,Slave buffers are too big,Client buffers are too big,Script cache has too many 这八个维度来分析一个redis节点的内存使用状况,每个维度都有相应的触发条件,然后再分别给每个维度提出对应解决建议,具体见下表:

维度触发条件建议
Instance is empty or almost empty节点内存空间使用小于5M
Memory peak is much larger节点在过去使用内存的峰值超过150%的当前内存导致内存碎片率较大,但这它是无害的,可以手动执行MEMORY PURGE命令或者重启节点
High fragmentationmemory_fragmentation 和 RSS memory之和大于1.4通常是由于工作负载导致分配器大量分段内存造成的,请确保使用的内存分配器是Jemalloc
High allocator fragmentation节点的allocator external fragmentation大于1.1建议尝试启用"activedefrag"配置选项开启自动内存碎片整理
High rss overhead节点的non-allocator RSS memory开销大于1.1意味着Redis进程的驻留集大小远大于分配器保存的RSS,通常可能是由于Lua脚本或模块化功能引起的
High process rss overhead节点的RSS memory大于1.1此通常是由于峰值内存较大,可以尝试使用MEMORY PURGE命令来回收它
Slave buffers are too big节点的output buffers对于每个replica(平均)大于10MB一些replica正在努力接收数据,要么是因为它太慢,要么是因为网络问题.可以使用INFO输出来检查副本延迟.并使用CLIENT LIST命令检查每个副本的输出缓冲区
Client buffers are too big节点的 clients output buffers大于每个客户端200K(平均)可能是由于不同的原因造成的,使用CLIENT LIST命令调查问题
Script cache has too many节点中缓存脚超过1000个定期调用SCRIPT FLUSH命令

到此为止,我们应该彻底弄清楚了MEMORY DOCTOR的实现机制了,它到底在什么情况下会给相应的诊断意见。虽然他的实现有那么些许的萌萌哒,但又显得那么的合情合理。后续我们的RedisManager会基于MEMORY DOCTOR命令增加4.0以上集群的健康诊断功能。

些许想法

1.MEMORY DOCTOR这几个对于内存状况维度分析的指标和执行info命令得到的结果如此的吻合,对于3.0甚至2.8版本那些不支持MEMORY DOCTOR命令的redis版本,我们是不是自己也可以实现一套内存诊断机制,给出对应的节点解决方案?
2.诊断意见中的那个Sam到底是谁。。

相关文章:

  • 如何在 Intellij IDEA 更高效地将应用部署到容器服务 Kubernetes ...
  • vue css 模块化编程 CSS Modules Scoped
  • java动态代理
  • 【Zabbix-SNMPtrap】列举一个常见使用Zabbix的SNMP trap监控设备的例子
  • javascrip 对数组的操作方法
  • [人间也值得] - Mryang带你快速入门第一个 Python 程序
  • Python自定义异常及抛出异常
  • ES6 - 数组扩展
  • 戴尔服务器使用omreport(OMSA)查看监控硬件信息
  • 处理SQL Server中的重复行
  • 互联网视频直播技术(广电总局、优酷土豆、XX直播)
  • Zookeeper Clustered (Multi-Server) Deployment
  • 孤儿进程 僵尸进程 守护进程及wait函数详解
  • 利用Java反射机制实现对象相同字段的复制
  • Markdown基本语法
  • 时间复杂度分析经典问题——最大子序列和
  • 《Javascript数据结构和算法》笔记-「字典和散列表」
  • 【技术性】Search知识
  • 【面试系列】之二:关于js原型
  • 2017 年终总结 —— 在路上
  • axios 和 cookie 的那些事
  • canvas 五子棋游戏
  • JavaScript 基本功--面试宝典
  • JavaScript 一些 DOM 的知识点
  • mockjs让前端开发独立于后端
  • Nodejs和JavaWeb协助开发
  • python学习笔记 - ThreadLocal
  • spark本地环境的搭建到运行第一个spark程序
  • 成为一名优秀的Developer的书单
  • 检测对象或数组
  • 聊聊flink的TableFactory
  • 一些css基础学习笔记
  • LevelDB 入门 —— 全面了解 LevelDB 的功能特性
  • 回归生活:清理微信公众号
  • #【QT 5 调试软件后,发布相关:软件生成exe文件 + 文件打包】
  • #绘制圆心_R语言——绘制一个诚意满满的圆 祝你2021圆圆满满
  • #我与Java虚拟机的故事#连载17:我的Java技术水平有了一个本质的提升
  • %3cli%3e连接html页面,html+canvas实现屏幕截取
  • (C#)if (this == null)?你在逗我,this 怎么可能为 null!用 IL 编译和反编译看穿一切
  • (附源码)springboot炼糖厂地磅全自动控制系统 毕业设计 341357
  • (译)2019年前端性能优化清单 — 下篇
  • (转)关于多人操作数据的处理策略
  • (轉貼) 2008 Altera 亞洲創新大賽 台灣學生成果傲視全球 [照片花絮] (SOC) (News)
  • .NET/C# 使窗口永不获得焦点
  • .Net程序帮助文档制作
  • .NET开源项目介绍及资源推荐:数据持久层
  • .NET企业级应用架构设计系列之技术选型
  • .net下的富文本编辑器FCKeditor的配置方法
  • .net中应用SQL缓存(实例使用)
  • @WebService和@WebMethod注解的用法
  • [2018][note]用于超快偏振开关和动态光束分裂的all-optical有源THz超表——
  • [Android]创建TabBar
  • [Angular 基础] - 数据绑定(databinding)
  • [BZOJ4337][BJOI2015]树的同构(树的最小表示法)
  • [C#]DataTable常用操作总结【转】