CPU爆高分析1. 真的爆高吗dump的分析第一原则就是相信数据先使用!tp观察cpu使用率。0:031 !tp Using the Portable thread pool. CPU utilization: 100% Workers Total: 5 Workers Running: 4 Workers Idle: 1 Worker Min Limit: 2 Worker Max Limit: 32767从卦中可以看到CPU utilization100%说明确实被打满了接下来我们看下 cpu 的 strength使用!cpuid即可参考如下0:031 !cpuid CP F/M/S Manufacturer MHz 0 6,85,7 unavailable 2993 1 6,85,7 unavailable 2993这卦不看不知道一看吓一跳堂堂server端的程序居然只有2core 这程序稍微痉挛一下必定cpu爆高这上哪说理去让马儿跑又不让马儿吃草。。。接下来我们探究下到底是什么引发的cpu爆高按照行业惯例首先怀疑是不是gc导致的最steady的方式就是使用命令~* k观察各个线程栈。# 31 Id: 19e4.23e4 Suspend: 0 Teb: 000000636f12e000 Unfrozen .NET TP Worker # Child-SP RetAddr Call Site 00 00000063041bef18 00007ffe03a5d75e ntdll!NtWaitForSingleObject0x14 01 00000063041bef20 00007ffdb87330fd KERNELBASE!WaitForSingleObjectEx0x8e 02 (Inline Function) ---------------- coreclr!GCEvent::Impl::Wait0x13 [D:\a\_work\1\s\src\coreclr\gc\windows\gcenv.windows.cpp 1372] 03 (Inline Function) ---------------- coreclr!GCEvent::Wait0x13 [D:\a\_work\1\s\src\coreclr\gc\windows\gcenv.windows.cpp 1422] 04 00000063041befc0 00007ffdb8734bb2 coreclr!SVR::gc_heap::wait_for_gc_done0x5d [D:\a\_work\1\s\src\coreclr\gc\gc.cpp 14979] 05 00000063041beff0 00007ffdb873311d coreclr!SVR::GCHeap::GarbageCollectGeneration0xee [D:\a\_work\1\s\src\coreclr\gc\gc.cpp 50760] 06 00000063041bf040 00007ffdb882d6d8 coreclr!SVR::gc_heap::trigger_gc_for_alloc0x15 [D:\a\_work\1\s\src\coreclr\gc\gc.cpp 19134] 07 00000063041bf070 00007ffdb87e0b70 coreclr!SVR::gc_heap::try_allocate_more_space0x4ca20 [D:\a\_work\1\s\src\coreclr\gc\gc.cpp 19295] 08 00000063041bf0e0 00007ffdb87e0ae4 coreclr!SVR::gc_heap::allocate_more_space0x68 [D:\a\_work\1\s\src\coreclr\gc\gc.cpp 19779] 09 00000063041bf140 00007ffdb8724e65 coreclr!SVR::gc_heap::allocate_uoh_object0x58 [D:\a\_work\1\s\src\coreclr\gc\gc.cpp 45154] 0a 00000063041bf1b0 00007ffdb86eb239 coreclr!SVR::GCHeap::Alloc0x1c5 [D:\a\_work\1\s\src\coreclr\gc\gc.cpp 49673] 0b 00000063041bf200 00007ffdb86edd6a coreclr!Alloc0x99 [D:\a\_work\1\s\src\coreclr\vm\gchelpers.cpp 238] 0c (Inline Function) ---------------- coreclr!AllocateSzArray0x29b [D:\a\_work\1\s\src\coreclr\vm\gchelpers.cpp 428] 0d 00000063041bf250 00007ffd5be4394a coreclr!JIT_NewArr10x3ba [D:\a\_work\1\s\src\coreclr\vm\jithelpers.cpp 2800] 0e 00000063041bf470 00007ffd5bdbd7de Microsoft_Data_SqlClient!Microsoft.Data.SqlClient.TdsParser.TryReadSqlStringValue(Microsoft.Data.SqlClient.SqlBuffer, Byte, Int32, System.Text.Encoding, Boolean, Microsoft.Data.SqlClient.TdsParserStateObject)0x17aa9ca [/_/src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/SqlClient/TdsParser.cs 5703] ... 17 00000063041bfab0 00007ffdb86ca02a coreclr!CallDescrWorkerInternal0x83 [D:\a\_work\1\s\src\coreclr\vm\amd64\CallDescrWorkerAMD64.asm 100] 18 00000063041bfaf0 00007ffdb87eb116 coreclr!DispatchCallSimple0x66 [D:\a\_work\1\s\src\coreclr\vm\callhelpers.cpp 221] 19 00000063041bfb80 00007ffdb86e2602 coreclr!ThreadNative::KickOffThread_Worker0x66 [D:\a\_work\1\s\src\coreclr\vm\comsynchronizable.cpp 158] ... 21 00000063041bfd90 0000000000000000 ntdll!RtlUserThreadStart0x2b纵观卦中的各个调用栈并没有看到触发的GC线程只有一个coreclr!SVR::gc_heap::wait_for_gc_done根据gc的知识这是等待gc finish的操作所以这个程序果然触发了GC。2. 为什么会触发 GC要想回答这个问题需要知道GC触发的是哪一代这是我们继续下去的重要突破口接下来使用x coreclr!*settings*提取 settings 结构体参考如下0:031 x coreclr!*settings* 00007ffdb8b221b0 coreclr!SVR::gc_heap::settings class SVR::gc_mechanisms 0:031 dx -r1 (*((coreclr!SVR::gc_mechanisms *)0x7ffdb8b221b0)) (*((coreclr!SVR::gc_mechanisms *)0x7ffdb8b221b0)) [Type: SVR::gc_mechanisms] [0x000] gc_index : 0x2d3e [Type: unsigned __int64] [0x008] condemned_generation : 2 [Type: int] [0x00c] promotion : 1 [Type: int] [0x010] compaction : 0 [Type: int] [0x014] loh_compaction : 0 [Type: int] [0x018] heap_expansion : 0 [Type: int] [0x01c] concurrent : 0x0 [Type: unsigned int] [0x020] demotion : 0 [Type: int] [0x024] card_bundles : 1 [Type: int] [0x028] gen0_reduction_count : 0 [Type: int] [0x02c] should_lock_elevation : 0 [Type: int] [0x030] elevation_locked_count : 0 [Type: int] [0x034] elevation_reduced : 0 [Type: int] [0x038] minimal_gc : 0 [Type: int] [0x03c] reason : reason_alloc_loh (4) [Type: gc_reason] [0x040] pause_mode : pause_interactive (1) [Type: SVR::gc_pause_mode] [0x044] found_finalizers : 1 [Type: int] [0x048] background_p : 0 [Type: int] [0x04c] b_state : bgc_not_in_process (0) [Type: bgc_state] [0x050] stress_induced : 0 [Type: int] [0x054] entry_memory_load : 0x24 [Type: unsigned int] [0x058] entry_available_physical_mem : 0x3d3d73000 [Type: unsigned __int64] [0x060] exit_memory_load : 0x24 [Type: unsigned int]从卦中的condemned_generation : 2和reason : reason_alloc_loh来看一切都明白了然来这个 gc正在触发 fullgc并且是因为 loh 的代边界满了这个信息的潜台词就是告诉我们有人在大概率折腾 LOH接下来用!eeheap -gc切到大对象堆中去看看。0:031 !eeheap -gc DATAS Number of GC Heaps: 2 ---------------------------------------- Large object heap segment begin allocated committed allocated size committed size ... 022f3f4fe4d0 01efdb000028 01efdf9ded80 01efdf9df000 0x49ded58 (77458776) 0x49df000 (77459456) 022f3f500750 01efe7000028 01efea8c86b8 01efea8e9000 0x38c8690 (59541136) 0x38e9000 (59674624) 022f3f501890 01efed000028 01eff22a05c0 01eff22a1000 0x52a0598 (86640024) 0x52a1000 (86642688) 022f3f504c50 01efff000028 01f004e6c1b8 01f004e8d000 0x5e6c190 (99008912) 0x5e8d000 (99143680) 022f3f508010 01f011000028 01f016620f68 01f016641000 0x5620f40 (90312512) 0x5641000 (90443776) 022f3f50a290 01f01d000028 01f0219ded80 01f0219df000 0x49ded58 (77458776) 0x49df000 (77459456) 022f3f50b3d0 01f023000028 01f023f1be28 01f023f3c000 0xf1be00 (15842816) 0xf3c000 (15974400) 022f3f513250 01f04f000028 01f0539ded80 01f0539df000 0x49ded58 (77458776) 0x49df000 (77459456) 022f3f5154d0 01f05b000028 01f05b000028 01f05f9df000 0x49df000 (77459456) 022f3f518890 01f06d000028 01f0719ded80 01f0719df000 0x49ded58 (77458776) 0x49df000 (77459456) 022f3f524650 01f0af000028 01f0b39ded80 01f0b39df000 0x49ded58 (77458776) 0x49df000 (77459456) ... Pinned object heap segment begin allocated committed allocated size committed size 022f3f4df200 01ef2d800028 01ef2d816d08 01ef2d821000 0x16ce0 (93408) 0x21000 (135168) ------------------------------ GC Allocated Heap Size: Size: 0xce96da90 (3465992848) bytes. GC Committed Heap Size: Size: 0x119465000 (4719005696) bytes.从卦中看大对象堆确实块头不小接下来抽几个段出来看看。0:031 !dumpheap 01efdb000028 01efdf9ded80 Address MT Size 01efdb000028 01ef2af58580 32 Free 01efdb000048 7ffd58de2a10 77,458,744 Statistics: MT Count TotalSize Class Name 01ef2af58580 1 32 Free 7ffd58de2a10 1 77,458,744 System.Char[] Total 2 objects, 77,458,776 bytes 0:031 !dumpheap 01f04f000028 01f0539ded80 Address MT Size 01f04f000028 01ef2af58580 32 Free 01f04f000048 7ffd58de2a10 77,458,744 Statistics: MT Count TotalSize Class Name 01ef2af58580 1 32 Free 7ffd58de2a10 1 77,458,744 System.Char[] Total 2 objects, 77,458,776 bytes从卦中看确实也验证了我们的思路里面有不少77M的 char[]如果胆大心细你也能大概率猜到这个等待gc完成的线程应该也是在分配大对象使用!dso观察便知。0:031 !dso OS Thread Id: 0x23e4 (31) SP/REG Object Name 0063041bf198 01ef3607c610 Microsoft.Data.SqlClient.TdsParserStateObjectStateSnapshot 0063041bf200 01ef8fd2d128 System.Threading.ExecutionContext 0063041bf220 01ef2ff33390 Microsoft.Data.SqlClient.TdsParserStateObjectNative 0063041bf228 01ef8fd2d150 Microsoft.Data.SqlClient.TdsParserStateObjectStateSnapshotPacketData 0063041bf268 01ef3607c610 Microsoft.Data.SqlClient.TdsParserStateObjectStateSnapshot 0063041bf278 01f06d000048 System.Char[] 0063041bf290 022fc03f2228 System.String 0063041bf298 01ef8fd2d150 Microsoft.Data.SqlClient.TdsParserStateObjectStateSnapshotPacketData 0063041bf340 01ef8fd2cfe0 System.Threading.Tasks.TaskCompletionSourceSystem.Object 0063041bf3c0 01ef3607b8c8 Microsoft.Data.SqlClient._SqlMetaData 0063041bf400 01ef2ff38e38 Microsoft.Data.SqlClient.SNIHandle 0063041bf438 01ef2ff33390 Microsoft.Data.SqlClient.TdsParserStateObjectNative 0063041bf450 01ef3607b8c8 Microsoft.Data.SqlClient._SqlMetaData 0063041bf460 01ef3607c290 Microsoft.Data.SqlClient.SqlBuffer 0063041bf470 01ef2ff33390 Microsoft.Data.SqlClient.TdsParserStateObjectNative 0063041bf480 01ef2ff33390 Microsoft.Data.SqlClient.TdsParserStateObjectNative 0063041bf4a8 01ef2ff33390 Microsoft.Data.SqlClient.TdsParserStateObjectNative 0063041bf4b0 01ef8fd21370 Microsoft.Data.SqlClient.TdsParserStateObjectStateSnapshotPacketData 0063041bf528 01ef3607b8c8 Microsoft.Data.SqlClient._SqlMetaData到这里基本上就知道了是有人从数据库中捞取77M的大对象所致接下来我们看下神秘的77M是什么直接将内存导出到txt中这里涉及到客户隐私我们截张去敏的图参考如下原来这个 77M 是一个售后单将这个涉及到若干图片的html直接保存到数据库所致