Major Promotion Monitoring

Monitoring Scope

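  • Application servers (CPU, memory, network, IO, GC, JVM; for container cloud workloads, also watch for restarts)
  • Database servers (CPU, memory, QPS, QPM, primary-replica replication lag, plus analysis of other metrics during any period in which these are abnormal)
  • Database slow SQL statistics
  • Key business documents, counted by time period
  • API call statistics during peak periods (the new monitoring platform is the source of truth)
  • Traffic composition during peak periods (Dynatrace is the source of truth)
  • Issues found each day and their root-cause analysis; adjust and update the promotion plan promptly based on these issues so the changes carry over to the next promotion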

Application & Database (Server) Monitoring

Container cloud node monitoring:
http://mgr-prod.bard.midea.com:55558/d/-Da4MjF7k/jie-dian-zi-yuan-kan-ban?orgId=1&var-origin_prometheus=annto-prd%E9%9B%86%E7%BE%A4&var-Node=All&var-NameSpace=app-c-csp-admin-1345&var-PodIP=All&var-Pod=All&var-key=quota&from=now-12h&to=now
Database monitoring (link used for screenshots):
https://monitor.midea.com/monitoring/monitor/view/middleware?systemId=c54ae0e3efff40dc84a45122a60dc41f&nodeId=fa9cfd60bb33492f8cf7ac2441f72cfa&group=AnDe&ip=all&monitorItemId=2a43e0b848c24145bcaaf74052b6930d&timeRange=p_l_24_hours
Redis monitoring:
https://monitor.midea.com/monitoring/monitor/view/host?systemId=c54ae0e3efff40dc84a45122a60dc41f&nodeId=73cc41cde31544afa9d9450b5233a024&group=AnDe&ip=all&monitorItemId=d70509588fe247c785cd5262926e462b&viewType=cpu%25E8%25A7%2586%25E5%259B%25BE&timeRange=p_l_24_hours
Host CPU and other host metrics:
https://monitor.midea.com/monitoring/monitor/view/host?systemId=c54ae0e3efff40dc84a45122a60dc41f&nodeId=3fc6e2edf4764cd5b90c6efd542cccd2&group=AnDe&ip=all&monitorItemId=d70509588fe247c785cd5262926e462b&viewType=cpu%25E8%25A7%2586%25E5%259B%25BE&timeRange=p_l_24_hours

Database Slow SQL Monitoring

Unified MySQL and slow SQL monitoring across multiple systems (supports export):
https://grafana.midea.com/d/9-WuNGBVk/shi-li-shu-ju-ku-mysql?orgId=1&var-prometheu_env=Prometheus-%E7%94%9F%E4%BA%A7%28mideamonitor%29&var-midea_env=PRD&var-linux_count=&var-windows_count=&var-mysql_host=10.27.150.50&var-mysql_host=10.27.150.160&var-mysql_host=10.27.150.124&var-mysql_host=10.27.150.92&var-es_slowlog_env=Elasticsearch-mysql-slowlog&from=1716307200000&to=1716346799000&var-midea_system=C-CSP-ADMIN&var-midea_system=C-LSRM&var-midea_system=C-PMS&var-midea_system=C-NTP
Single-system MySQL and slow SQL statistics (supports drill-down links):
https://grafana.midea.com/d/dgtw2wU4k/an-de-ji-chu-jian-kong?orgId=1&var-ds=Prometheus-%E7%94%9F%E4%BA%A7%28mideamonitor%29&var-midea_system=C-NTP&var-midea_env=PRD&var-midea_system_short=NTP&var-node_total=49&var-pod_total=&var-mysql_host=All&var-access_amz_source=applognh-t-lc-prd-nginx-access-annto-amz&var-access_dmz_source=applognh-t-lc-prd-nginx-access-annto-dmz&var-dmz_nginx_hosts=All&var-amz_nginx_hosts=All&var-k8s_source=Prometheus-annto-prod&from=1716134400000&to=1716220799000&var-uri=All
Slow SQL combined query:
https://applognh.midea.com/app/kibana#/discover?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:'2024-05-21T16:00:00.000Z',to:now))&_a=(columns:!(db_user,rows_sent,slow_sql,query_time),filters:!(('$state':(store:appState),meta:(alias:!n,disabled:!f,index:'81d48bb0-adcb-11ed-b0a0-bda27278c9d0',key:ip,negate:!f,params:(query:'10.27.150.124'),type:phrase),query:(match_phrase:(ip:'10.27.150.124')))),index:'81d48bb0-adcb-11ed-b0a0-bda27278c9d0',interval:auto,query:(language:kuery,query:''),sort:!(!(collect_date,desc)))
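When reading slow SQL counts from the dashboards above, it can help to confirm which threshold an instance uses, since a count is only comparable across instances that share the same `long_query_time`. The statements below are standard MySQL. Running them directly against the instances is an assumption, as the original page relies on the dashboards alone.

```sql
-- Threshold, in seconds, above which a statement is written to the slow log.
show variables like 'long_query_time';

-- Whether the slow query log is enabled on this instance.
show variables like 'slow_query_log';

-- Number of statements that have exceeded long_query_time since server start.
show global status like 'Slow_queries';
```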

Traffic Monitoring

Ingress monitoring:
https://grafana.midea.com/d/uHH-KZw4z/an-de-ru-kou-jian-kong-by-ye-wu?orgId=1&from=now-12h&to=now
Dedicated line monitoring:
https://grafana.midea.com/d/CI7sstQVz/an-de-nginx-wu-liu-ru-kou-liu-liang-tong-ji?orgId=1&from=now-24h&to=now
API summary monitoring:
https://grafana.midea.com/d/wKYPsUQVk/an-de-nginx-wu-liu-ru-kou-liu-liang-tong-ji-v2?orgId=1
Traffic overview:
https://argus.midea.com/e/336a9550-ab5d-4f68-9fde-aafd0b41f33f/ui/diagnostictools/mda?gtf=-24h%20to%20now&gf=6377759893942933087&mdaId=topweb&metric=REQUEST_COUNT&dimension=%7BRequest:Name%7D&mergeServices=false&aggregation=COUNT&percentile=80&chart=COLUMN&servicefilter=0%1E26%112%1026%111

API Call and Concurrency Monitoring

Unified API call ranking:
https://monitor.midea.com/monitoring/monitor/view/overview?systemId=c17efc05bd024d28ae35ddd0ca3874c9&nodeId=c17efc05bd024d28ae35ddd0ca3874c9&group=AnDe&ip=all
Concurrency:
https://monitor.midea.com/monitoring/monitor/view/application?systemId=c17efc05bd024d28ae35ddd0ca3874c9&nodeId=7c885b0c0a1f4d3990479c8088623691&instance=all&monitorItemId=c1d1b69a57a34798821d8b1f5679ea83&viewType=url%25E8%25AF%25A6%25E6%2583%2585

Delivery-and-Installation Business Data Statistics Scripts

-- Total orders received (since 2024-05-20 20:00:00)
select count(1) from csp_accept where create_time > '2024-05-20 20:00:00';

-- Orders received during the daily peak window (0:00~1:00), per day
select DATE_FORMAT(create_time,'%Y-%m-%d') as `Promotion peak date`, count(1) as `Orders received in peak window (0:00~1:00)`
from csp_accept where create_time between '2024-05-17 00:00:00' and now()
and DATE_FORMAT(create_time,'%H%i%s') <= '010000'
group by DATE_FORMAT(create_time,'%Y-%m-%d')
order by 1 asc;

-- Orders received per day
select DATE_FORMAT(create_time,'%Y-%m-%d') as `Promotion peak date`, count(1) as `Orders received`
from csp_accept where create_time between '2024-05-17 00:00:00' and now()
group by DATE_FORMAT(create_time,'%Y-%m-%d')
order by 1 asc;

-- Orders received per minute
select DATE_FORMAT(s.create_time,'%Y-%m-%d %H:%i'), count(1) from csp_accept s
where s.create_time >= '2024-05-20 20:00:00' and s.create_time <= '2024-05-31 20:30:00'
group by DATE_FORMAT(s.create_time,'%Y-%m-%d %H:%i');

-- Site resolutions per minute
select DATE_FORMAT(s.create_time,'%Y-%m-%d %H:%i'), count(1) from csp_work_head s
where s.create_time >= '2024-05-20 20:00:00' and s.create_time <= '2024-05-31 20:30:00'
group by DATE_FORMAT(s.create_time,'%Y-%m-%d %H:%i');

-- Orders pending outlet assignment
select count(1) from csp_accept a where a.create_time >= '2024-05-20 20:00:00' and a.create_time <= '2024-05-20 23:00:00' and a.accept_state = 10;
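The per-minute query on csp_accept above returns one row for every minute in the window, which is a lot to scan by eye when only the peak matters. As a possible variation, the sketch below sorts the same per-minute counts in descending order and keeps the ten busiest minutes. The aliases `minute_bucket` and `orders_received` and the limit of 10 are illustrative choices, not part of the original scripts.

```sql
-- Ten busiest minutes for order intake, same table and time window
-- as the per-minute csp_accept query above.
select DATE_FORMAT(s.create_time,'%Y-%m-%d %H:%i') as minute_bucket,
       count(1) as orders_received
from csp_accept s
where s.create_time >= '2024-05-20 20:00:00' and s.create_time <= '2024-05-31 20:30:00'
group by DATE_FORMAT(s.create_time,'%Y-%m-%d %H:%i')
order by orders_received desc
limit 10;
```

Replacing csp_accept with csp_work_head gives the same ranking for site resolutions.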