tools/power turbostat: Avoid probing the same perf counters
author: Zhang Rui <rui.zhang@intel.com>
Fri, 30 May 2025 00:09:28 +0000 (08:09 +0800)
committer: Len Brown <len.brown@intel.com>
Sun, 8 Jun 2025 18:10:16 +0000 (14:10 -0400)
For the RAPL package energy status counter, Intel and AMD share the same
perf_subsys and perf_name, but with different MSR addresses.

rapl_counter_arch_infos[0] and rapl_counter_arch_infos[1] both describe
this counter, one entry per vendor.

As a result, the perf counter is probed twice, which causes a failure in
get_rapl_counters() because expected_read_size and actual_read_size
don't match.

Fix the problem by skipping the already probed counter.
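
The guard can be illustrated with a minimal, self-contained sketch. It is a
simplified model, not the turbostat code: struct arch_info, struct
counter_info, fake_add_perf_counter() and probe_all() are hypothetical
stand-ins, and the table contents are illustrative only. In this
single-loop model, "continue" plays the role that "break" plays in the
actual patch.

```c
#include <stddef.h>

enum counter_source { COUNTER_SOURCE_NONE, COUNTER_SOURCE_PERF };

#define NUM_SLOTS 1

/* Hypothetical, simplified stand-in for a rapl_counter_arch_infos[] entry. */
struct arch_info {
	const char *perf_subsys;
	const char *perf_name;
	unsigned int msr;	/* differs per vendor */
	int rci_index;		/* slot this entry fills */
};

/* Hypothetical, simplified stand-in for struct rapl_counter_info_t. */
struct counter_info {
	enum counter_source source[NUM_SLOTS];
	int opened;		/* perf events opened so far */
};

/* Two entries describing the same perf event for different vendors. */
static const struct arch_info demo_infos[] = {
	{ "power", "energy-pkg", 0x611,      0 },	/* Intel-style entry */
	{ "power", "energy-pkg", 0xc001029b, 0 },	/* AMD-style entry */
};

/* Pretend the perf event always opens successfully. */
static int fake_add_perf_counter(struct counter_info *rci, const struct arch_info *cai)
{
	(void)cai;
	return ++rci->opened;
}

/* Probe every entry; returns the number of perf events opened. */
static int probe_all(struct counter_info *rci, const struct arch_info *infos,
		     size_t n, int skip_probed)
{
	for (size_t i = 0; i < n; i++) {
		const struct arch_info *cai = &infos[i];

		/* The guard: this slot already has a source, do not re-probe. */
		if (skip_probed && rci->source[cai->rci_index] != COUNTER_SOURCE_NONE)
			continue;

		if (fake_add_perf_counter(rci, cai) != -1)
			rci->source[cai->rci_index] = COUNTER_SOURCE_PERF;
	}
	return rci->opened;
}
```

Calling probe_all() on a zero-initialized struct counter_info with
skip_probed == 0 opens two events for a single slot, which models the
mismatch described above. With skip_probed == 1, only one event is opened.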

Note that this is not a perfect fix. For example, if different
vendors/platforms use the same MSR address for different purposes, the
code can be fooled when it probes a rapl_counter_arch_infos[] entry that
does not belong to the running Vendor/Platform.

In the long run, it would be better to move rapl_counter_arch_infos[] into
platform_features so that it becomes Vendor/Platform specific.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
tools/power/x86/turbostat/turbostat.c

index 6f91ec3..8deb6a2 100644
@@ -7991,6 +7991,21 @@ void rapl_perf_init(void)
 
                        struct rapl_counter_info_t *rci = &rapl_counter_info_perdomain[next_domain];
 
+                       /*
+                        * rapl_counter_arch_infos[] can have multiple entries describing the same
+                        * counter, due to the difference from different platforms/Vendors.
+                        * E.g. rapl_counter_arch_infos[0] and rapl_counter_arch_infos[1] share the
+                        * same perf_subsys and perf_name, but with different MSR address.
+                        * rapl_counter_arch_infos[0] is for Intel and rapl_counter_arch_infos[1]
+                        * is for AMD.
+                        * In this case, it is possible that multiple rapl_counter_arch_infos[]
+                        * entries are probed just because their perf/msr is duplicate and valid.
+                        *
+                        * Thus need a check to avoid re-probe the same counters.
+                        */
+                       if (rci->source[cai->rci_index] != COUNTER_SOURCE_NONE)
+                               break;
+
                        /* Use perf API for this counter */
                        if (add_rapl_perf_counter(cpu, rci, cai, &scale, &unit) != -1) {
                                rci->source[cai->rci_index] = COUNTER_SOURCE_PERF;