1、English links:
1. https://www.intel.com/content/www/us/en/docs/vtune-profiler/cookbook/2023-0/top-down-microarchitecture-analysis-method.html
2. http://portal.nacad.ufrj.br/online/intel/vtune2017/help/GUID-02271361-CCD4-410C-8338-4B8158157EB6.html
VTune Profiler automatically highlights metric values in the GUI if they are outside a predefined threshold and occur in a hotspot. VTune Profiler classifies a function as a hotspot if greater than 5% of the total clockticks for an application accrued within it. Determining whether a given fraction of pipeline slots in a particular category constitutes a bottleneck can be workload-dependent, but some general guidelines are provided in the table below:
Expected range of pipeline slots in each category, for a hotspot in a well-tuned application of the given type:

Category | Client/Desktop Application | Server/Database/Distributed Application | High Performance Computing (HPC) Application |
---|---|---|---|
Retiring | 20-50% | 10-30% | 30-70% |
Back-End Bound | 20-40% | 20-60% | 20-40% |
Front-End Bound | 5-10% | 10-25% | 5-10% |
Bad Speculation | 5-10% | 5-10% | 1-5% |
These thresholds are based on analysis of some workloads in labs at Intel. If the fraction of time spent in a category (other than Retiring) for a hotspot is on the high end of the indicated range or above it, an investigation might be useful. If this is true for more than one category, the category with the highest fraction of time should be investigated first. Note that hotspots are expected to spend some fraction of time in each category, and that values within the normal range above may not indicate a problem.
The important thing to realize about the Top-Down Method is that you do not need to spend time optimizing issues in a category that is not identified as a bottleneck - doing so will likely not lead to a significant performance improvement.
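The guideline table and the prioritization rule above can be sketched as a small triage helper. This is only an illustration, not part of VTune Profiler: the names `EXPECTED_RANGES`, `is_hotspot`, and `categories_to_investigate` are hypothetical, and "on the high end" is simplified here to mean "at or above the upper bound of the range", which is an assumption.

```python
# Expected ranges (percent of pipeline slots) copied from the guideline table.
# The dictionary layout and the short application-type keys are assumptions.
EXPECTED_RANGES = {
    "client": {"Retiring": (20, 50), "Back-End Bound": (20, 40),
               "Front-End Bound": (5, 10), "Bad Speculation": (5, 10)},
    "server": {"Retiring": (10, 30), "Back-End Bound": (20, 60),
               "Front-End Bound": (10, 25), "Bad Speculation": (5, 10)},
    "hpc": {"Retiring": (30, 70), "Back-End Bound": (20, 40),
            "Front-End Bound": (5, 10), "Bad Speculation": (1, 5)},
}

# A function counts as a hotspot if it accrues more than 5% of total clockticks.
HOTSPOT_CLOCKTICK_SHARE = 0.05


def is_hotspot(function_clockticks, total_clockticks):
    """Return True if the function holds more than 5% of all clockticks."""
    return function_clockticks / total_clockticks > HOTSPOT_CLOCKTICK_SHARE


def categories_to_investigate(measured, app_type):
    """Return non-Retiring categories at or above their upper bound,
    ordered from the highest measured fraction to the lowest."""
    ranges = EXPECTED_RANGES[app_type]
    flagged = [
        (name, value)
        for name, value in measured.items()
        # Assumption: "high end or greater" is treated as value >= upper bound.
        if name != "Retiring" and value >= ranges[name][1]
    ]
    flagged.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in flagged]


if __name__ == "__main__":
    # Made-up measurements for one hotspot, in percent of pipeline slots.
    sample = {"Retiring": 25, "Back-End Bound": 45,
              "Front-End Bound": 18, "Bad Speculation": 12}
    print(is_hotspot(8_000_000, 100_000_000))
    print(categories_to_investigate(sample, "client"))
```

The first entry of the returned list is the category to look at first. An empty list means no category stands out against the guideline ranges, so further tuning of that hotspot is unlikely to pay off.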
2、Chinese link
The Methodology Behind C/C++ Performance Optimization: TMAM - vivo Internet Technology - Cnblogs (cnblogs.com)
3、Analyzing Open vSwitch* with DPDK Bottlenecks Using Intel® VTune™...