Files
twx-linux/tools/perf/pmu-events/arch/x86/amdzen2/cache.json
T
Smita Koralahalli b07520a55f perf vendor events amd: Fix broken L2 Cache Hits from L2 HWPF metric
[ Upstream commit 86c2bc3da7 ]

Commit 08ed77e414 ("perf vendor events amd: Add recommended events")
added the hits event "L2 Cache Hits from L2 HWPF" with the same metric
expression as the accesses event "L2 Cache Accesses from L2 HWPF":

$ perf list --details
...
  l2_cache_accesses_from_l2_hwpf
     [L2 Cache Accesses from L2 HWPF]
     [l2_pf_hit_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3]
  l2_cache_hits_from_l2_hwpf
     [L2 Cache Hits from L2 HWPF]
     [l2_pf_hit_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3]
...

This was wrong and led to counting hits the same as accesses. Section
2.1.15.2 "Performance Measurement" of "PPR for AMD Family 17h Model 31h
B0 - 55803 Rev 0.54 - Sep 12, 2019", documents the hits event with
EventCode 0x70 which is the same as l2_pf_hit_l2.

Fix this, and massage the description for l2_pf_hit_l2 as the hits event
is now the duplicate of l2_pf_hit_l2. AMD recommends using the recommended
event over other events if the duplicate exists and maintain both for
consistency. Hence, l2_cache_hits_from_l2_hwpf should override
l2_pf_hit_l2.

Before:

 # perf stat -M l2_cache_accesses_from_l2_hwpf,l2_cache_hits_from_l2_hwpf sleep 1

 Performance counter stats for 'sleep 1':

             1,436      l2_pf_miss_l2_l3          # 11114.00 l2_cache_accesses_from_l2_hwpf
                                                  # 11114.00 l2_cache_hits_from_l2_hwpf
             4,482      l2_pf_hit_l2
             5,196      l2_pf_miss_l2_hit_l3

       1.001765339 seconds time elapsed

After:

 # perf stat -M l2_cache_accesses_from_l2_hwpf sleep 1

 Performance counter stats for 'sleep 1':

             1,477      l2_pf_miss_l2_l3          # 10442.00 l2_cache_accesses_from_l2_hwpf
             3,978      l2_pf_hit_l2
             4,987      l2_pf_miss_l2_hit_l3

       1.001491186 seconds time elapsed

 # perf stat -e l2_cache_hits_from_l2_hwpf sleep 1

 Performance counter stats for 'sleep 1':

             3,983      l2_cache_hits_from_l2_hwpf

       1.001329970 seconds time elapsed

Note the difference in performance counter values for the accesses
versus the hits after the fix, and the hits event now counting the same
as l2_pf_hit_l2.

Fixes: 08ed77e414 ("perf vendor events amd: Add recommended events")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Reviewed-by: Robert Richter <rrichter@amd.com>
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org> # On a 3900X
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vijay Thakkar <vijaythakkar@me.com>
Cc: linux-perf-users@vger.kernel.org
Link: https://lore.kernel.org/r/20210406215944.113332-2-Smita.KoralahalliChannabasappa@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14 09:50:34 +02:00

362 lines
13 KiB
JSON

[
{
"EventName": "l2_request_g1.rd_blk_l",
"EventCode": "0x60",
"BriefDescription": "All L2 Cache Requests (Breakdown 1 - Common). Data cache reads (including hardware and software prefetch).",
"UMask": "0x80"
},
{
"EventName": "l2_request_g1.rd_blk_x",
"EventCode": "0x60",
"BriefDescription": "All L2 Cache Requests (Breakdown 1 - Common). Data cache stores.",
"UMask": "0x40"
},
{
"EventName": "l2_request_g1.ls_rd_blk_c_s",
"EventCode": "0x60",
"BriefDescription": "All L2 Cache Requests (Breakdown 1 - Common). Data cache shared reads.",
"UMask": "0x20"
},
{
"EventName": "l2_request_g1.cacheable_ic_read",
"EventCode": "0x60",
"BriefDescription": "All L2 Cache Requests (Breakdown 1 - Common). Instruction cache reads.",
"UMask": "0x10"
},
{
"EventName": "l2_request_g1.change_to_x",
"EventCode": "0x60",
"BriefDescription": "All L2 Cache Requests (Breakdown 1 - Common). Data cache state change requests. Request change to writable, check L2 for current state.",
"UMask": "0x8"
},
{
"EventName": "l2_request_g1.prefetch_l2_cmd",
"EventCode": "0x60",
"BriefDescription": "All L2 Cache Requests (Breakdown 1 - Common). PrefetchL2Cmd.",
"UMask": "0x4"
},
{
"EventName": "l2_request_g1.l2_hw_pf",
"EventCode": "0x60",
"BriefDescription": "All L2 Cache Requests (Breakdown 1 - Common). L2 Prefetcher. All prefetches accepted by L2 pipeline, hit or miss. Types of PF and L2 hit/miss broken out in a separate perfmon event.",
"UMask": "0x2"
},
{
"EventName": "l2_request_g1.group2",
"EventCode": "0x60",
"BriefDescription": "Miscellaneous events covered in more detail by l2_request_g2 (PMCx061).",
"UMask": "0x1"
},
{
"EventName": "l2_request_g1.all_no_prefetch",
"EventCode": "0x60",
"UMask": "0xf9"
},
{
"EventName": "l2_request_g2.group1",
"EventCode": "0x61",
"BriefDescription": "Miscellaneous events covered in more detail by l2_request_g1 (PMCx060).",
"UMask": "0x80"
},
{
"EventName": "l2_request_g2.ls_rd_sized",
"EventCode": "0x61",
"BriefDescription": "All L2 Cache Requests (Breakdown 2 - Rare). Data cache read sized.",
"UMask": "0x40"
},
{
"EventName": "l2_request_g2.ls_rd_sized_nc",
"EventCode": "0x61",
"BriefDescription": "All L2 Cache Requests (Breakdown 2 - Rare). Data cache read sized non-cacheable.",
"UMask": "0x20"
},
{
"EventName": "l2_request_g2.ic_rd_sized",
"EventCode": "0x61",
"BriefDescription": "All L2 Cache Requests (Breakdown 2 - Rare). Instruction cache read sized.",
"UMask": "0x10"
},
{
"EventName": "l2_request_g2.ic_rd_sized_nc",
"EventCode": "0x61",
"BriefDescription": "All L2 Cache Requests (Breakdown 2 - Rare). Instruction cache read sized non-cacheable.",
"UMask": "0x8"
},
{
"EventName": "l2_request_g2.smc_inval",
"EventCode": "0x61",
"BriefDescription": "All L2 Cache Requests (Breakdown 2 - Rare). Self-modifying code invalidates.",
"UMask": "0x4"
},
{
"EventName": "l2_request_g2.bus_locks_originator",
"EventCode": "0x61",
"BriefDescription": "All L2 Cache Requests (Breakdown 2 - Rare). Bus locks.",
"UMask": "0x2"
},
{
"EventName": "l2_request_g2.bus_locks_responses",
"EventCode": "0x61",
"BriefDescription": "All L2 Cache Requests (Breakdown 2 - Rare). Bus lock response.",
"UMask": "0x1"
},
{
"EventName": "l2_latency.l2_cycles_waiting_on_fills",
"EventCode": "0x62",
"BriefDescription": "Total cycles spent waiting for L2 fills to complete from L3 or memory, divided by four. Event counts are for both threads. To calculate average latency, the number of fills from both threads must be used.",
"UMask": "0x1"
},
{
"EventName": "l2_wcb_req.wcb_write",
"EventCode": "0x63",
"BriefDescription": "LS to L2 WCB write requests. LS (Load/Store unit) to L2 WCB (Write Combining Buffer) write requests.",
"UMask": "0x40"
},
{
"EventName": "l2_wcb_req.wcb_close",
"EventCode": "0x63",
"BriefDescription": "LS to L2 WCB close requests. LS (Load/Store unit) to L2 WCB (Write Combining Buffer) close requests.",
"UMask": "0x20"
},
{
"EventName": "l2_wcb_req.zero_byte_store",
"EventCode": "0x63",
"BriefDescription": "LS to L2 WCB zero byte store requests. LS (Load/Store unit) to L2 WCB (Write Combining Buffer) zero byte store requests.",
"UMask": "0x4"
},
{
"EventName": "l2_wcb_req.cl_zero",
"EventCode": "0x63",
"BriefDescription": "LS to L2 WCB cache line zeroing requests. LS (Load/Store unit) to L2 WCB (Write Combining Buffer) cache line zeroing requests.",
"UMask": "0x1"
},
{
"EventName": "l2_cache_req_stat.ls_rd_blk_cs",
"EventCode": "0x64",
"BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Data cache shared read hit in L2",
"UMask": "0x80"
},
{
"EventName": "l2_cache_req_stat.ls_rd_blk_l_hit_x",
"EventCode": "0x64",
"BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Data cache read hit in L2.",
"UMask": "0x40"
},
{
"EventName": "l2_cache_req_stat.ls_rd_blk_l_hit_s",
"EventCode": "0x64",
"BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Data cache read hit on shared line in L2.",
"UMask": "0x20"
},
{
"EventName": "l2_cache_req_stat.ls_rd_blk_x",
"EventCode": "0x64",
"BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Data cache store or state change hit in L2.",
"UMask": "0x10"
},
{
"EventName": "l2_cache_req_stat.ls_rd_blk_c",
"EventCode": "0x64",
"BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Data cache request miss in L2 (all types).",
"UMask": "0x8"
},
{
"EventName": "l2_cache_req_stat.ic_fill_hit_x",
"EventCode": "0x64",
"BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Instruction cache hit modifiable line in L2.",
"UMask": "0x4"
},
{
"EventName": "l2_cache_req_stat.ic_fill_hit_s",
"EventCode": "0x64",
"BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Instruction cache hit clean line in L2.",
"UMask": "0x2"
},
{
"EventName": "l2_cache_req_stat.ic_fill_miss",
"EventCode": "0x64",
"BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Instruction cache request miss in L2.",
"UMask": "0x1"
},
{
"EventName": "l2_cache_req_stat.ic_access_in_l2",
"EventCode": "0x64",
"BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Instruction cache requests in L2.",
"UMask": "0x7"
},
{
"EventName": "l2_cache_req_stat.ic_dc_miss_in_l2",
"EventCode": "0x64",
"BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Instruction cache request miss in L2 and Data cache request miss in L2 (all types).",
"UMask": "0x9"
},
{
"EventName": "l2_cache_req_stat.ic_dc_hit_in_l2",
"EventCode": "0x64",
"BriefDescription": "Core to L2 cacheable request access status (not including L2 Prefetch). Instruction cache request hit in L2 and Data cache request hit in L2 (all types).",
"UMask": "0xf6"
},
{
"EventName": "l2_fill_pending.l2_fill_busy",
"EventCode": "0x6d",
"BriefDescription": "Cycles with fill pending from L2. Total cycles spent with one or more fill requests in flight from L2.",
"UMask": "0x1"
},
{
"EventName": "l2_pf_hit_l2",
"EventCode": "0x70",
"BriefDescription": "L2 prefetch hit in L2. Use l2_cache_hits_from_l2_hwpf instead.",
"UMask": "0xff"
},
{
"EventName": "l2_pf_miss_l2_hit_l3",
"EventCode": "0x71",
"BriefDescription": "L2 prefetcher hits in L3. Counts all L2 prefetches accepted by the L2 pipeline which miss the L2 cache and hit the L3.",
"UMask": "0xff"
},
{
"EventName": "l2_pf_miss_l2_l3",
"EventCode": "0x72",
"BriefDescription": "L2 prefetcher misses in L3. All L2 prefetches accepted by the L2 pipeline which miss the L2 and the L3 caches.",
"UMask": "0xff"
},
{
"EventName": "ic_fw32",
"EventCode": "0x80",
"BriefDescription": "The number of 32B fetch windows transferred from IC pipe to DE instruction decoder (includes non-cacheable and cacheable fill responses)."
},
{
"EventName": "ic_fw32_miss",
"EventCode": "0x81",
"BriefDescription": "The number of 32B fetch windows tried to read the L1 IC and missed in the full tag."
},
{
"EventName": "ic_cache_fill_l2",
"EventCode": "0x82",
"BriefDescription": "The number of 64 byte instruction cache line was fulfilled from the L2 cache."
},
{
"EventName": "ic_cache_fill_sys",
"EventCode": "0x83",
"BriefDescription": "The number of 64 byte instruction cache line fulfilled from system memory or another cache."
},
{
"EventName": "bp_l1_tlb_miss_l2_hit",
"EventCode": "0x84",
"BriefDescription": "The number of instruction fetches that miss in the L1 ITLB but hit in the L2 ITLB."
},
{
"EventName": "bp_l1_tlb_miss_l2_tlb_miss",
"EventCode": "0x85",
"BriefDescription": "The number of instruction fetches that miss in both the L1 and L2 TLBs.",
"UMask": "0xff"
},
{
"EventName": "bp_l1_tlb_miss_l2_tlb_miss.if1g",
"EventCode": "0x85",
"BriefDescription": "The number of instruction fetches that miss in both the L1 and L2 TLBs. Instruction fetches to a 1GB page.",
"UMask": "0x4"
},
{
"EventName": "bp_l1_tlb_miss_l2_tlb_miss.if2m",
"EventCode": "0x85",
"BriefDescription": "The number of instruction fetches that miss in both the L1 and L2 TLBs. Instruction fetches to a 2MB page.",
"UMask": "0x2"
},
{
"EventName": "bp_l1_tlb_miss_l2_tlb_miss.if4k",
"EventCode": "0x85",
"BriefDescription": "The number of instruction fetches that miss in both the L1 and L2 TLBs. Instruction fetches to a 4KB page.",
"UMask": "0x1"
},
{
"EventName": "bp_snp_re_sync",
"EventCode": "0x86",
"BriefDescription": "The number of pipeline restarts caused by invalidating probes that hit on the instruction stream currently being executed. This would happen if the active instruction stream was being modified by another processor in an MP system - typically a highly unlikely event."
},
{
"EventName": "ic_fetch_stall.ic_stall_any",
"EventCode": "0x87",
"BriefDescription": "Instruction Pipe Stall. IC pipe was stalled during this clock cycle for any reason (nothing valid in pipe ICM1).",
"UMask": "0x4"
},
{
"EventName": "ic_fetch_stall.ic_stall_dq_empty",
"EventCode": "0x87",
"BriefDescription": "Instruction Pipe Stall. IC pipe was stalled during this clock cycle (including IC to OC fetches) due to DQ empty.",
"UMask": "0x2"
},
{
"EventName": "ic_fetch_stall.ic_stall_back_pressure",
"EventCode": "0x87",
"BriefDescription": "Instruction Pipe Stall. IC pipe was stalled during this clock cycle (including IC to OC fetches) due to back-pressure.",
"UMask": "0x1"
},
{
"EventName": "ic_cache_inval.l2_invalidating_probe",
"EventCode": "0x8c",
"BriefDescription": "IC line invalidated due to L2 invalidating probe (external or LS). The number of instruction cache lines invalidated. A non-SMC event is CMC (cross modifying code), either from the other thread of the core or another core.",
"UMask": "0x2"
},
{
"EventName": "ic_cache_inval.fill_invalidated",
"EventCode": "0x8c",
"BriefDescription": "IC line invalidated due to overwriting fill response. The number of instruction cache lines invalidated. A non-SMC event is CMC (cross modifying code), either from the other thread of the core or another core.",
"UMask": "0x1"
},
{
"EventName": "ic_oc_mode_switch.oc_ic_mode_switch",
"EventCode": "0x28a",
"BriefDescription": "OC Mode Switch. OC to IC mode switch.",
"UMask": "0x2"
},
{
"EventName": "ic_oc_mode_switch.ic_oc_mode_switch",
"EventCode": "0x28a",
"BriefDescription": "OC Mode Switch. IC to OC mode switch.",
"UMask": "0x1"
},
{
"EventName": "l3_request_g1.caching_l3_cache_accesses",
"EventCode": "0x01",
"BriefDescription": "Caching: L3 cache accesses",
"UMask": "0x80",
"Unit": "L3PMC"
},
{
"EventName": "l3_lookup_state.all_l3_req_typs",
"EventCode": "0x04",
"BriefDescription": "All L3 Request Types",
"UMask": "0xff",
"Unit": "L3PMC"
},
{
"EventName": "l3_comb_clstr_state.other_l3_miss_typs",
"EventCode": "0x06",
"BriefDescription": "Other L3 Miss Request Types",
"UMask": "0xfe",
"Unit": "L3PMC"
},
{
"EventName": "l3_comb_clstr_state.request_miss",
"EventCode": "0x06",
"BriefDescription": "L3 cache misses",
"UMask": "0x01",
"Unit": "L3PMC"
},
{
"EventName": "xi_sys_fill_latency",
"EventCode": "0x90",
"BriefDescription": "L3 Cache Miss Latency. Total cycles for all transactions divided by 16. Ignores SliceMask and ThreadMask.",
"UMask": "0x00",
"Unit": "L3PMC"
},
{
"EventName": "xi_ccx_sdp_req1.all_l3_miss_req_typs",
"EventCode": "0x9A",
"BriefDescription": "All L3 Miss Request Types. Ignores SliceMask and ThreadMask.",
"UMask": "0x3f",
"Unit": "L3PMC"
}
]