linux-2.6-microblaze.git
3 years agoperf tools: Copy metric events properly when expand cgroups
Namhyung Kim [Thu, 24 Sep 2020 12:44:53 +0000 (21:44 +0900)]
perf tools: Copy metric events properly when expand cgroups

The metricgroup__copy_metric_events() is to handle metrics events when
expanding event for cgroups.  As the metric events keep pointers to
evsel, it should be refreshed when events are cloned during the
operation.

The perf_stat__collect_metric_expr() is also called in case an event has
a metric directly.

During the copy, it references evsel by index as the evlist now has
cloned evsels for the given cgroup.

Also kernel test robot found an issue in the python module import so add
empty implementations of those two functions to fix it.

Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200924124455.336326-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf stat: Add --for-each-cgroup option
Namhyung Kim [Thu, 24 Sep 2020 12:44:52 +0000 (21:44 +0900)]
perf stat: Add --for-each-cgroup option

The --for-each-cgroup option is a syntax sugar to monitor large number
of cgroups easily.  Current command line requires to list all the events
and cgroups even if users want to monitor same events for each cgroup.
This patch addresses that usage by copying given events for each cgroup
on user's behalf.

For instance, if they want to monitor 6 events for 200 cgroups each they
should write 1200 event names (with -e) AND 1200 cgroup names (with -G)
on the command line.  But with this change, they can just specify 6
events and 200 cgroups with a new option.

A simpler example below: It wants to measure 3 events for 2 cgroups ('A'
and 'B').  The result is that total 6 events are counted like below.

  $ perf stat -a -e cpu-clock,cycles,instructions --for-each-cgroup A,B sleep 1

   Performance counter stats for 'system wide':

              988.18 msec cpu-clock                 A #    0.987 CPUs utilized
       3,153,761,702      cycles                    A #    3.200 GHz                      (100.00%)
       8,067,769,847      instructions              A #    2.57  insn per cycle           (100.00%)
              982.71 msec cpu-clock                 B #    0.982 CPUs utilized
       3,136,093,298      cycles                    B #    3.182 GHz                      (99.99%)
       8,109,619,327      instructions              B #    2.58  insn per cycle           (99.99%)

         1.001228054 seconds time elapsed

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200924124455.336326-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf evsel: Add evsel__clone() function
Namhyung Kim [Thu, 24 Sep 2020 12:44:51 +0000 (21:44 +0900)]
perf evsel: Add evsel__clone() function

The evsel__clone() is to create an exactly same evsel from same
attributes.  The function assumes the given evsel is not configured
yet so it cares fields set during event parsing.  Those fields are now
moved together as Jiri suggested.  Note that metric events will be
handled by later patch.

It will be used by perf stat to generate separate events for each
cgroup.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200924124455.336326-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf vendor events: Update SkylakeX events to v1.21
Jin Yao [Wed, 13 May 2020 08:13:33 +0000 (16:13 +0800)]
perf vendor events: Update SkylakeX events to v1.21

- Update SkylakeX events to v1.21.
- Update SkylakeX JSON metrics from TMAM 4.0.

Other fixes:

- Add NO_NMI_WATCHDOG metric constraint to Backend_Bound
- Fix misspelled error

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/20200922031918.3723-1-yao.jin@linux.intel.com/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf vendor events intel: Update CascadelakeX events to v1.08
Jin Yao [Tue, 22 Sep 2020 02:51:19 +0000 (10:51 +0800)]
perf vendor events intel: Update CascadelakeX events to v1.08

- Update CascadelakeX events to v1.08.
- Update CascadelakeX JSON metrics from TMAM 4.0.

Other fixes:

- Add NO_NMI_WATCHDOG metric constraint to Backend_Bound
- Change 'MB/sec' to 'MB' in UNC_M_PMM_BANDWIDTH.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Link: https://lore.kernel.org/lkml/20200922031918.3723-1-yao.jin@linux.intel.com/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf script: Add min, max to futex-contention output, in addition to avg
Hagen Paul Pfeifer [Tue, 22 Sep 2020 20:09:22 +0000 (22:09 +0200)]
perf script: Add min, max to futex-contention output, in addition to avg

Average is quite informative, but the outliners - especially max - are
also of interest.

Before:

  mutex-locker[793299] lock 5637ec61e080 contended 3400 times, 446 avg ns
  mutex-locker[793301] lock 5637ec61e080 contended 3563 times, 385 avg ns
  mutex-locker[793300] lock 5637ec61e080 contended 3110 times, 1855 avg ns

After:

  mutex-locker[795251] lock 55b14e6dd080 contended 3853 times, 1279 avg ns [max: 12270 ns, min 340 ns]
  mutex-locker[795253] lock 55b14e6dd080 contended 2911 times, 518 avg ns [max: 51660261 ns, min 347 ns]
  mutex-locker[795252] lock 55b14e6dd080 contended 3843 times, 385 avg ns [max: 24323998 ns, min 338 ns]

Committer testing:

  [root@five ~]# perf script record futex-contention -a
  ^C[ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 1.877 MB perf.data (923 samples) ]

  [root@five ~]# perf evlist
  syscalls:sys_enter_futex
  syscalls:sys_exit_futex
  dummy:HG
  # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
  #

Before:

  [root@five ~]# perf script report futex-contention
  JS Helper[2457] lock 55fe0cf82610 contended 4 times, 6657 avg ns
  ibus-daemon[2975] lock 56227f6d0210 contended 4 times, 1020 avg ns
  chromium-browse[1905801] lock 7ffe573f5088 contended 8 times, 108463 avg ns
  gnome-shell[2240] lock 55fe0cf82678 contended 1 times, 8616 avg ns
  gnome-shel:cs0[2292] lock 55fe0d0ab768 contended 3 times, 606016034 avg ns
  JS Helper[2458] lock 55fe0cf82690 contended 1 times, 1167840 avg ns
  chromium-browse[1905470] lock 7ffe573f5358 contended 1 times, 551504 avg ns
  chromium-browse[1905948] lock 7ffe573f5358 contended 1 times, 577422 avg ns
  gnome-shell[2240] lock 55fe0cf82660 contended 6 times, 202696 avg ns
  pool[2602] lock 7fd600008ef0 contended 1 times, 500046007 avg ns
  chromium-browse[1905801] lock 7ffe573f5128 contended 4 times, 285083 avg ns
  JS Helper[2460] lock 55fe0cf82690 contended 1 times, 680877 avg ns
  JS Helper[2459] lock 55fe0cf82610 contended 7 times, 4224 avg ns
  chromium-browse[1905434] lock 7ffe573f5358 contended 1 times, 697038 avg ns
  chromium-browse[212592] lock 7ffe573f53c8 contended 4 times, 460601 avg ns
  gnome-shel:cs0[2292] lock 55fe0d0ab76c contended 2 times, 601237648 avg ns
  JS Helper[2460] lock 55fe0cf82610 contended 4 times, 3340 avg ns
  JS Helper[2462] lock 55fe0cf82694 contended 1 times, 237275 avg ns
  chromium-browse[1905605] lock 7ffe573f5358 contended 2 times, 634555 avg ns
  chromium-browse[1905992] lock 7ffe573f5358 contended 1 times, 583965 avg ns
  chromium-browse[1905647] lock 7ffe573f5368 contended 8 times, 549800 avg ns
  JS Helper[2462] lock 55fe0cf82610 contended 2 times, 4694 avg ns
  JS Helper[2461] lock 55fe0cf82694 contended 1 times, 257793 avg ns
  JS Helper[2456] lock 55fe0cf82690 contended 1 times, 677771 avg ns
  JS Helper[2463] lock 55fe0cf82610 contended 3 times, 5139 avg ns
  gdbus[2980] lock 56227f6d0210 contended 2 times, 2465 avg ns
  gnome-shell[2240] lock 55fe0cf82664 contended 5 times, 8036 avg ns
  chromium-browse[1906308] lock 7ffe573f5358 contended 1 times, 210735 avg ns
  JS Helper[2463] lock 55fe0cf82694 contended 1 times, 251531 avg ns
  chromium-browse[1905801] lock 7ffe573f4f58 contended 4 times, 399927 avg ns
  [root@five ~]#

After:

  [root@five ~]# perf script report futex-contention
  JS Helper[2457] lock 55fe0cf82610 contended 4 times, 6657 avg ns [max: 11502 ns, min 792 ns]
  ibus-daemon[2975] lock 56227f6d0210 contended 4 times, 1020 avg ns [max: 1813 ns, min 581 ns]
  chromium-browse[1905801] lock 7ffe573f5088 contended 8 times, 108463 avg ns [max: 380103 ns, min 57989 ns]
  gnome-shell[2240] lock 55fe0cf82678 contended 1 times, 8616 avg ns [max: 8616 ns, min 8616 ns]
  gnome-shel:cs0[2292] lock 55fe0d0ab768 contended 3 times, 606016034 avg ns [max: 611295960 ns, min 600191357 ns]
  JS Helper[2458] lock 55fe0cf82690 contended 1 times, 1167840 avg ns [max: 1167840 ns, min 1167840 ns]
  chromium-browse[1905470] lock 7ffe573f5358 contended 1 times, 551504 avg ns [max: 551504 ns, min 551504 ns]
  chromium-browse[1905948] lock 7ffe573f5358 contended 1 times, 577422 avg ns [max: 577422 ns, min 577422 ns]
  gnome-shell[2240] lock 55fe0cf82660 contended 6 times, 202696 avg ns [max: 398998 ns, min 5050 ns]
  pool[2602] lock 7fd600008ef0 contended 1 times, 500046007 avg ns [max: 500046007 ns, min 500046007 ns]
  chromium-browse[1905801] lock 7ffe573f5128 contended 4 times, 285083 avg ns [max: 389531 ns, min 76183 ns]
  JS Helper[2460] lock 55fe0cf82690 contended 1 times, 680877 avg ns [max: 680877 ns, min 680877 ns]
  JS Helper[2459] lock 55fe0cf82610 contended 7 times, 4224 avg ns [max: 12724 ns, min 1012 ns]
  chromium-browse[1905434] lock 7ffe573f5358 contended 1 times, 697038 avg ns [max: 697038 ns, min 697038 ns]
  chromium-browse[212592] lock 7ffe573f53c8 contended 4 times, 460601 avg ns [max: 594956 ns, min 232996 ns]
  gnome-shel:cs0[2292] lock 55fe0d0ab76c contended 2 times, 601237648 avg ns [max: 601255863 ns, min 601219434 ns]
  JS Helper[2460] lock 55fe0cf82610 contended 4 times, 3340 avg ns [max: 9168 ns, min 962 ns]
  JS Helper[2462] lock 55fe0cf82694 contended 1 times, 237275 avg ns [max: 237275 ns, min 237275 ns]
  chromium-browse[1905605] lock 7ffe573f5358 contended 2 times, 634555 avg ns [max: 1024060 ns, min 245050 ns]
  chromium-browse[1905992] lock 7ffe573f5358 contended 1 times, 583965 avg ns [max: 583965 ns, min 583965 ns]
  chromium-browse[1905647] lock 7ffe573f5368 contended 8 times, 549800 avg ns [max: 775293 ns, min 258375 ns]
  JS Helper[2462] lock 55fe0cf82610 contended 2 times, 4694 avg ns [max: 8556 ns, min 832 ns]
  JS Helper[2461] lock 55fe0cf82694 contended 1 times, 257793 avg ns [max: 257793 ns, min 257793 ns]
  JS Helper[2456] lock 55fe0cf82690 contended 1 times, 677771 avg ns [max: 677771 ns, min 677771 ns]
  JS Helper[2463] lock 55fe0cf82610 contended 3 times, 5139 avg ns [max: 6873 ns, min 931 ns]
  gdbus[2980] lock 56227f6d0210 contended 2 times, 2465 avg ns [max: 4188 ns, min 742 ns]
  gnome-shell[2240] lock 55fe0cf82664 contended 5 times, 8036 avg ns [max: 13105 ns, min 401 ns]
  chromium-browse[1906308] lock 7ffe573f5358 contended 1 times, 210735 avg ns [max: 210735 ns, min 210735 ns]
  JS Helper[2463] lock 55fe0cf82694 contended 1 times, 251531 avg ns [max: 251531 ns, min 251531 ns]
  chromium-browse[1905801] lock 7ffe573f4f58 contended 4 times, 399927 avg ns [max: 476904 ns, min 178495 ns]
  [root@five ~]#

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lore.kernel.org/lkml/20200922200922.1306034-1-hagen@jauu.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf script: Autopep8 futex-contention
Hagen Paul Pfeifer [Mon, 21 Sep 2020 20:19:27 +0000 (22:19 +0200)]
perf script: Autopep8 futex-contention

10 years leaves its mark! Python has evolved and so has its style guide.
Even with vim it is getting hard to follow the no longer valid
guidelines (spaces vs. tabs).

Autopep8 this code to modernize it!

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Link: http://lore.kernel.org/lkml/20200921201928.799498-1-hagen@jauu.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf stat: Skip duration_time in setup_system_wide
Jin Yao [Tue, 22 Sep 2020 01:50:04 +0000 (09:50 +0800)]
perf stat: Skip duration_time in setup_system_wide

Some metrics (such as DRAM_BW_Use) consists of uncore events and
duration_time. For uncore events, counter->core.system_wide is true. But
for duration_time, counter->core.system_wide is false so
target.system_wide is set to false.

Then 'enable_on_exec' is set in perf_event_attr of uncore event.  Kernel
will return error when trying to open the uncore event.

This patch skips the duration_time in setup_system_wide then
target.system_wide will be set to true for the evlist of uncore events +
duration_time.

Before (tested on skylake desktop):

  # perf stat -M DRAM_BW_Use -- sleep 1
  Error:
  The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (arb/event=0x84,umask=0x1/).
  /bin/dmesg | grep -i perf may provide additional information.

After:

  # perf stat -M DRAM_BW_Use -- sleep 1

   Performance counter stats for 'system wide':

                169      arb/event=0x84,umask=0x1/ #     0.00 DRAM_BW_Use
             40,427      arb/event=0x81,umask=0x1/
      1,000,902,197 ns   duration_time

        1.000902197 seconds time elapsed

Fixes: e3ba76deef23064f ("perf tools: Force uncore events to system wide monitoring")
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200922015004.30114-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf tsc: Support cap_user_time_short for event TIME_CONV
Leo Yan [Mon, 14 Sep 2020 11:53:09 +0000 (19:53 +0800)]
perf tsc: Support cap_user_time_short for event TIME_CONV

The synthesized event TIME_CONV doesn't contain the complete parameters
for counters, this will lead to wrong conversion between counter cycles
and timestamp.

This patch extends event TIME_CONV to record flags 'cap_user_time_zero'
which is used to indicate the counter parameters are valid or not, if
not will directly return 0 for timestamp calculation.  And record the
flag 'cap_user_time_short' and its relevant fields 'time_cycles' and
'time_mask' for cycle calibration.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kemeng Shi <shikemeng@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Gasson <nick.gasson@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Remi Bernon <rbernon@codeweavers.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steve Maclean <steve.maclean@microsoft.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zou Wei <zou_wei@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200914115311.2201-5-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf tsc: Calculate timestamp with cap_user_time_short
Leo Yan [Mon, 14 Sep 2020 11:53:08 +0000 (19:53 +0800)]
perf tsc: Calculate timestamp with cap_user_time_short

The perf mmap'ed buffer contains the flag 'cap_user_time_short' and two
extra fields 'time_cycles' and 'time_mask', perf tool needs to know them
for handling the counter wrapping case.

This patch is to reads out the relevant parameters from the head of the
first mmap'ed page and stores into the structure 'perf_tsc_conversion',
if the flag 'cap_user_time_short' has been set, it will firstly
calibrate cycle value for timestamp calculation.

Committer testing:

Before/after:

  # perf test tsc
  70: Convert perf time to TSC                                        : Ok
  #
  # perf test -v tsc
  70: Convert perf time to TSC                                        :
  --- start ---
  test child forked, pid 11059
  mmap size 528384B
  1st event perf time 996384576521 tsc 3850532906613
  rdtsc          time 996384578455 tsc 3850532913950
  2nd event perf time 996384578845 tsc 3850532915428
  test child finished with 0
  ---- end ----
  Convert perf time to TSC: Ok
  #

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kemeng Shi <shikemeng@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Gasson <nick.gasson@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Remi Bernon <rbernon@codeweavers.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steve Maclean <steve.maclean@microsoft.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zou Wei <zou_wei@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200914115311.2201-4-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf tsc: Add rdtsc() for Arm64
Leo Yan [Mon, 14 Sep 2020 11:53:07 +0000 (19:53 +0800)]
perf tsc: Add rdtsc() for Arm64

The system register CNTVCT_EL0 can be used to retrieve the counter from
user space.  Add rdtsc() for Arm64.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kemeng Shi <shikemeng@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Gasson <nick.gasson@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Remi Bernon <rbernon@codeweavers.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steve Maclean <steve.maclean@microsoft.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zou Wei <zou_wei@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200914115311.2201-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf tsc: Move out common functions from x86
Leo Yan [Mon, 14 Sep 2020 11:53:06 +0000 (19:53 +0800)]
perf tsc: Move out common functions from x86

Functions perf_read_tsc_conversion() and perf_event__synth_time_conv()
should work as common functions rather than x86 specific, so move these
two functions out from arch/x86 folder and place them into util/tsc.c.

Since the function perf_event__synth_time_conv() will be linked in
util/tsc.c, remove its weak version.

Committer testing:

Before/after:

  # perf test tsc
  70: Convert perf time to TSC                                        : Ok
  #
  # perf test -v tsc
  70: Convert perf time to TSC                                        :
  --- start ---
  test child forked, pid 8520
  mmap size 528384B
  1st event perf time 592110439891 tsc 2317172044331
  rdtsc          time 592110441915 tsc 2317172052010
  2nd event perf time 592110442336 tsc 2317172053605
  test child finished with 0
  ---- end ----
  Convert perf time to TSC: Ok
  #

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kemeng Shi <shikemeng@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Gasson <nick.gasson@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Remi Bernon <rbernon@codeweavers.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steve Maclean <steve.maclean@microsoft.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zou Wei <zou_wei@huawei.com>
Link: http://lore.kernel.org/lkml/20200914115311.2201-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf probe: Fall back to debuginfod query if debuginfo and source not found locally
Masami Hiramatsu [Fri, 18 Sep 2020 08:01:41 +0000 (17:01 +0900)]
perf probe: Fall back to debuginfod query if debuginfo and source not found locally

Since 'perf probe' heavily depends on debuginfo, debuginfod gives us
many benefits on the 'perf probe' command on remote machine.

Especially, this will be helpful for the embedded devices which will not
have enough storage, or boot with a cross-build kernel whose source code
is in the host machine.

This will work as similar to commit c7a14fdcb3fa7736 ("perf build-ids:
Fall back to debuginfod query if debuginfo not found")

Tested with:

  (host) $ cd PATH/TO/KBUILD/DIR/
  (host) $ debuginfod -F .
  ...

  (remote) # perf probe -L vfs_read
  Failed to find the path for the kernel: No such file or directory
    Error: Failed to show lines.

  (remote) # export DEBUGINFOD_URLS="http://$HOST_IP:8002/"
  (remote) # perf probe -L vfs_read
  <vfs_read@...>
        0  ssize_t vfs_read(struct file *file, char __user *buf, size_t count, loff_t *pos)
           {
        2         ssize_t ret;

                  if (!(file->f_mode & FMODE_READ))
                          return -EBADF;
        6         if (!(file->f_mode & FMODE_CAN_READ))
                          return -EINVAL;
        8         if (unlikely(!access_ok(buf, count)))
                          return -EFAULT;

       11         ret = rw_verify_area(READ, file, pos, count);
       12         if (ret)
                          return ret;
                  if (count > MAX_RW_COUNT)
  ...

  (remote) # perf probe -a "vfs_read count"
  Added new event:
    probe:vfs_read       (on vfs_read with count)

  (remote) # perf probe -l
    probe:vfs_read       (on vfs_read@ksrc/linux/fs/read_write.c with count)

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Frank Ch. Eigler <fche@redhat.com>
Cc: Aaron Merey <amerey@redhat.com>
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Link: http://lore.kernel.org/lkml/160041610083.912668.13659563860278615846.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf probe: Fix to adjust symbol address with correct reloc_sym address
Masami Hiramatsu [Fri, 18 Sep 2020 08:01:30 +0000 (17:01 +0900)]
perf probe: Fix to adjust symbol address with correct reloc_sym address

'perf probe' uses ref_reloc_sym to adjust symbol offset address from
debuginfo address or ref_reloc_sym based address, but that is misusing
reloc_sym->addr and reloc_sym->unrelocated_addr.  If map is not
relocated (map->reloc == 0), we can use reloc_sym->addr as unrelocated
address instead of reloc_sym->unrelocated_addr.

This usually does not happen. If we have a non-stripped ELF binary, we
will use it for map and debuginfo, if not, we use only kallsyms without
debuginfo. Thus, the map is always relocated (ELF and DWARF binary) or
not relocated (kallsyms).

However, if we allow the combination of debuginfo and kallsyms based map
(like using debuginfod), we have to check the map->reloc and choose the
collect address of reloc_sym.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Frank Ch. Eigler <fche@redhat.com>
Cc: Aaron Merey <amerey@redhat.com>
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Link: http://lore.kernel.org/lkml/160041609047.912668.14314639291419159274.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf intel-pt: Fix "context_switch event has no tid" error
Adrian Hunter [Wed, 9 Sep 2020 08:49:23 +0000 (11:49 +0300)]
perf intel-pt: Fix "context_switch event has no tid" error

A context_switch event can have no tid because pids can be detached from
a task while the task is still running (in do_exit()). Note this won't
happen with per-task contexts because then tracing stops at
perf_event_exit_task()

If a task with no tid gets preempted, or a dying task gets preempted and
its parent releases it, when it subsequently gets switched back in,
Intel PT will not be able to determine what task is running and prints
an error "context_switch event has no tid". However, it is not really an
error because the task is in kernel space and the decoder can continue
to decode successfully. Fix by changing the error to be only a logged
message, and make allowance for tid == -1.

Example:

  Using 5.9-rc4 with Preemptible Kernel (Low-Latency Desktop) e.g.
  $ uname -r
  5.9.0-rc4
  $ grep PREEMPT .config
  # CONFIG_PREEMPT_NONE is not set
  # CONFIG_PREEMPT_VOLUNTARY is not set
  CONFIG_PREEMPT=y
  CONFIG_PREEMPT_COUNT=y
  CONFIG_PREEMPTION=y
  CONFIG_PREEMPT_RCU=y
  CONFIG_PREEMPT_NOTIFIERS=y
  CONFIG_DRM_I915_PREEMPT_TIMEOUT=640
  CONFIG_DEBUG_PREEMPT=y
  # CONFIG_PREEMPT_TRACER is not set
  # CONFIG_PREEMPTIRQ_DELAY_TEST is not set

Before:

  $ cat forkit.c

  #include <sys/types.h>
  #include <unistd.h>
  #include <sys/wait.h>

  int main()
  {
          pid_t child;
          int status = 0;

          child = fork();
          if (child == 0)
                  return 123;
          wait(&status);
          return 0;
  }

  $ gcc -o forkit forkit.c
  $ sudo ~/bin/perf record --kcore -a -m,64M -e intel_pt/cyc/k &
  [1] 11016
  $ taskset 2 ./forkit
  $ sudo pkill perf
  $ [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 17.262 MB perf.data ]

  [1]+  Terminated              sudo ~/bin/perf record --kcore -a -m,64M -e intel_pt/cyc/k
  $ sudo ~/bin/perf script --show-task-events --show-switch-events --itrace=iqqe-o -C 1 --ns | grep -C 2 forkit
  context_switch event has no tid
           taskset 11019 [001] 66663.270045029:          1 instructions:k:  ffffffffb1d9f844 strnlen_user+0xb4 ([kernel.kallsyms])
           taskset 11019 [001] 66663.270201816:          1 instructions:k:  ffffffffb1a83121 unmap_page_range+0x561 ([kernel.kallsyms])
            forkit 11019 [001] 66663.270327553: PERF_RECORD_COMM exec: forkit:11019/11019
            forkit 11019 [001] 66663.270420028:          1 instructions:k:  ffffffffb1db9537 __clear_user+0x27 ([kernel.kallsyms])
            forkit 11019 [001] 66663.270648704:          1 instructions:k:  ffffffffb18829e6 do_user_addr_fault+0xf6 ([kernel.kallsyms])
            forkit 11019 [001] 66663.270833163:          1 instructions:k:  ffffffffb230a825 irqentry_exit_to_user_mode+0x15 ([kernel.kallsyms])
            forkit 11019 [001] 66663.271092359:          1 instructions:k:  ffffffffb1aea3d9 lock_page_memcg+0x9 ([kernel.kallsyms])
            forkit 11019 [001] 66663.271207092: PERF_RECORD_FORK(11020:11020):(11019:11019)
            forkit 11019 [001] 66663.271234775: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid: 11020/11020
            forkit 11020 [001] 66663.271238407: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 11019/11019
            forkit 11020 [001] 66663.271312066:          1 instructions:k:  ffffffffb1a88140 handle_mm_fault+0x10 ([kernel.kallsyms])
            forkit 11020 [001] 66663.271476225: PERF_RECORD_EXIT(11020:11020):(11019:11019)
            forkit 11020 [001] 66663.271497488: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 11019/11019
            forkit 11019 [001] 66663.271500523: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 11020/11020
            forkit 11019 [001] 66663.271517241:          1 instructions:k:  ffffffffb24012cd error_entry+0x6d ([kernel.kallsyms])
            forkit 11019 [001] 66663.271664080: PERF_RECORD_EXIT(11019:11019):(1386:1386)

After:

  $ sudo ~/bin/perf script --show-task-events --show-switch-events --itrace=iqqe-o -C 1 --ns | grep -C 2 forkit
           taskset 11019 [001] 66663.270045029:          1 instructions:k:  ffffffffb1d9f844 strnlen_user+0xb4 ([kernel.kallsyms])
           taskset 11019 [001] 66663.270201816:          1 instructions:k:  ffffffffb1a83121 unmap_page_range+0x561 ([kernel.kallsyms])
            forkit 11019 [001] 66663.270327553: PERF_RECORD_COMM exec: forkit:11019/11019
            forkit 11019 [001] 66663.270420028:          1 instructions:k:  ffffffffb1db9537 __clear_user+0x27 ([kernel.kallsyms])
            forkit 11019 [001] 66663.270648704:          1 instructions:k:  ffffffffb18829e6 do_user_addr_fault+0xf6 ([kernel.kallsyms])
            forkit 11019 [001] 66663.270833163:          1 instructions:k:  ffffffffb230a825 irqentry_exit_to_user_mode+0x15 ([kernel.kallsyms])
            forkit 11019 [001] 66663.271092359:          1 instructions:k:  ffffffffb1aea3d9 lock_page_memcg+0x9 ([kernel.kallsyms])
            forkit 11019 [001] 66663.271207092: PERF_RECORD_FORK(11020:11020):(11019:11019)
            forkit 11019 [001] 66663.271234775: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid: 11020/11020
            forkit 11020 [001] 66663.271238407: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 11019/11019
            forkit 11020 [001] 66663.271312066:          1 instructions:k:  ffffffffb1a88140 handle_mm_fault+0x10 ([kernel.kallsyms])
            forkit 11020 [001] 66663.271476225: PERF_RECORD_EXIT(11020:11020):(11019:11019)
            forkit 11020 [001] 66663.271497488: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 11019/11019
            forkit 11019 [001] 66663.271500523: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 11020/11020
            forkit 11019 [001] 66663.271517241:          1 instructions:k:  ffffffffb24012cd error_entry+0x6d ([kernel.kallsyms])
            forkit 11019 [001] 66663.271664080: PERF_RECORD_EXIT(11019:11019):(1386:1386)
            forkit 11019 [001] 66663.271688752: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:    -1/-1
               :-1    -1 [001] 66663.271692086: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 11019/11019
                :-1    -1 [001] 66663.271707466:          1 instructions:k:  ffffffffb18eb096 update_load_avg+0x306 ([kernel.kallsyms])

Fixes: 86c2786994bd7c ("perf intel-pt: Add support for PERF_RECORD_SWITCH")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Link: http://lore.kernel.org/lkml/20200909084923.9096-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf script: Display negative tid in non-sample events
Adrian Hunter [Wed, 9 Sep 2020 08:49:22 +0000 (11:49 +0300)]
perf script: Display negative tid in non-sample events

The kernel can release tasks while they are still running. This can
result in a task having no tid, in which case perf records a tid of -1.
Improve the perf script output in that case.

Example:

Before:

  # cat ./autoreap.c

  #include <sys/types.h>
  #include <unistd.h>
  #include <sys/wait.h>
  #include <signal.h>

  struct sigaction act = {
          .sa_handler = SIG_IGN,
  };

  int main()
  {
          pid_t child;
          int status = 0;

          sigaction(SIGCHLD, &act, NULL);
          child = fork();
          if (child == 0)
                  return 123;
          wait(&status);
          return 0;
  }

  # gcc -o autoreap autoreap.c
  # ./perf record -a -e dummy --switch-events ./autoreap
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.948 MB perf.data ]
  # ./perf script --show-task-events --show-switch-events | grep -C2 'autoreap\|4294967295\|-1'
           swapper     0 [004] 18462.673613: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 25189/25189
              perf 25189 [004] 18462.673614: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:     0/0
          autoreap 25189 [004] 18462.673800: PERF_RECORD_COMM exec: autoreap:25189/25189
          autoreap 25189 [004] 18462.674042: PERF_RECORD_FORK(25191:25191):(25189:25189)
          autoreap 25189 [004] 18462.674050: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:     0/0
           swapper     0 [004] 18462.674051: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 25189/25189
           swapper     0 [005] 18462.674083: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 25191/25191
          autoreap 25191 [005] 18462.674084: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:     0/0
           swapper     0 [003] 18462.674121: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid:    11/11
       rcu_preempt    11 [003] 18462.674121: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:     0/0
       rcu_preempt    11 [003] 18462.674124: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:     0/0
           swapper     0 [003] 18462.674124: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:    11/11
          autoreap 25191 [005] 18462.674138: PERF_RECORD_EXIT(25191:25191):(25189:25189)
  PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:     0/0
           swapper     0 [005] 18462.674149: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 4294967295/4294967295
           swapper     0 [004] 18462.674182: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 25189/25189
          autoreap 25189 [004] 18462.674183: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:     0/0
          autoreap 25189 [004] 18462.674218: PERF_RECORD_EXIT(25189:25189):(25188:25188)
          autoreap 25189 [004] 18462.674225: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:     0/0
           swapper     0 [004] 18462.674226: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 25189/25189
           swapper     0 [007] 18462.674257: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 25188/25188

After:

  # ./perf script --show-task-events --show-switch-events | grep -C2 'autoreap\|4294967295\|-1'
           swapper     0 [004] 18462.673613: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 25189/25189
              perf 25189 [004] 18462.673614: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:     0/0
          autoreap 25189 [004] 18462.673800: PERF_RECORD_COMM exec: autoreap:25189/25189
          autoreap 25189 [004] 18462.674042: PERF_RECORD_FORK(25191:25191):(25189:25189)
          autoreap 25189 [004] 18462.674050: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:     0/0
           swapper     0 [004] 18462.674051: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 25189/25189
           swapper     0 [005] 18462.674083: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 25191/25191
          autoreap 25191 [005] 18462.674084: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:     0/0
           swapper     0 [003] 18462.674121: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid:    11/11
       rcu_preempt    11 [003] 18462.674121: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:     0/0
       rcu_preempt    11 [003] 18462.674124: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:     0/0
           swapper     0 [003] 18462.674124: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:    11/11
          autoreap 25191 [005] 18462.674138: PERF_RECORD_EXIT(25191:25191):(25189:25189)
               :-1    -1 [005] 18462.674149: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:     0/0
           swapper     0 [005] 18462.674149: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:    -1/-1
           swapper     0 [004] 18462.674182: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 25189/25189
          autoreap 25189 [004] 18462.674183: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:     0/0
          autoreap 25189 [004] 18462.674218: PERF_RECORD_EXIT(25189:25189):(25188:25188)
          autoreap 25189 [004] 18462.674225: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:     0/0
           swapper     0 [004] 18462.674226: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid: 25189/25189
           swapper     0 [007] 18462.674257: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid: 25188/25188

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Link: http://lore.kernel.org/lkml/20200909084923.9096-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf docs: Improve help information in perf.txt
Zejiang Tang [Wed, 9 Sep 2020 09:53:14 +0000 (17:53 +0800)]
perf docs: Improve help information in perf.txt

perf has many undocumented options, such as:-vv, --exec-path,
--html-path, -p, --paginate,--no-pager, --debugfs-dir, --buildid-dir,
--list-cmds, --list-opts.

Add entris for these options in perf.txt.

Signed-off-by: Zejiang Tang <tangzejiang@loongson.cn>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lore.kernel.org/lkml/1599645194-8438-1-git-send-email-tangzejiang@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf metric: Remove duplicate include
YueHaibing [Tue, 15 Sep 2020 08:15:41 +0000 (16:15 +0800)]
perf metric: Remove duplicate include

Remove duplicate header which is included twice.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200915081541.41004-1-yuehaibing@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf tools: Add documentation for topdown metrics
Andi Kleen [Fri, 11 Sep 2020 14:48:08 +0000 (07:48 -0700)]
perf tools: Add documentation for topdown metrics

Add some documentation how to use the topdown metrics in ring 3.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Co-developed-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200911144808.27603-5-kan.liang@linux.intel.com
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf stat: Support new per thread TopDown metrics
Andi Kleen [Fri, 11 Sep 2020 14:48:07 +0000 (07:48 -0700)]
perf stat: Support new per thread TopDown metrics

Icelake has support for reporting per thread TopDown metrics.

These are reported differently than the previous TopDown support,
each metric is standalone, but scaled to pipeline "slots".

We don't need to do anything special for HyperThreading anymore.
Teach perf stat --topdown to handle these new metrics and
print them in the same way as the previous TopDown metrics.

The restrictions of only being able to report information per core is
gone.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Co-developed-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200911144808.27603-4-kan.liang@linux.intel.com
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf record: Support sample-read topdown metric group
Kan Liang [Fri, 11 Sep 2020 14:48:06 +0000 (07:48 -0700)]
perf record: Support sample-read topdown metric group

With the hardware TopDown metrics feature, sample-read feature should be
supported for a topdown group, e.g., sample a non-topdown event and read
a topdown metric group. But the current perf record code errors out.

For a topdown metric group, the slots event must be the leader of the
group, but the leader slots event doesn't support sampling.

To support sample-read the topdown metric group, use the 2nd event of
the group as the "leader" for the purposes of sampling.

Only the platform with Topdown metic feature supports sample-read the
topdown group. Add arch_topdown_sample_read() to indicate whether the
topdown group supports sample-read.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200911144808.27603-3-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf tools: Rename group to topdown
Kan Liang [Fri, 11 Sep 2020 14:48:05 +0000 (07:48 -0700)]
perf tools: Rename group to topdown

The group.h/c only include TopDown group related functions. The name
"group" is too generic and inaccurate. Use the name "topdown" to replace
it.

Move topdown related functions to a dedicated file, topdown.c.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200911144808.27603-2-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf machine: Add machine__for_each_dso() function
Jiri Olsa [Sun, 13 Sep 2020 21:03:08 +0000 (23:03 +0200)]
perf machine: Add machine__for_each_dso() function

Add the machine__for_each_dso() to iterate over all dso objects defined
for the within a machine object. It will be used in the MMAP3 patch
series.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200913210313.1985612-22-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoMerge remote-tracking branch 'torvalds/master' into perf/core
Arnaldo Carvalho de Melo [Thu, 17 Sep 2020 18:45:05 +0000 (15:45 -0300)]
Merge remote-tracking branch 'torvalds/master' into perf/core

To pick up fixes.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoMerge tag 'perf-tools-fixes-for-v5.9-2020-09-16' of git://git.kernel.org/pub/scm...
Linus Torvalds [Wed, 16 Sep 2020 19:00:16 +0000 (12:00 -0700)]
Merge tag 'perf-tools-fixes-for-v5.9-2020-09-16' of git://git./linux/kernel/git/acme/linux

Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Set PERF_SAMPLE_PERIOD if attr->freq is set.

 - Remove trailing commas from AMD JSON vendor event files.

 - Don't clear event's period if set by a event definition term.

 - Leader sampling shouldn't clear sample period in 'perf test'.

 - Fix the "signal" test inline assembly when built with DEBUG=1.

 - Fix memory leaks detected by ASAN, some in normal paths, some in
   error paths.

 - Fix 2 memory sanitizer warnings in 'perf bench'.

 - Fix the ratio comments of miss-events in 'perf stat'.

 - Prevent override of attr->sample_period for libpfm4 events.

 - Sync kvm.h and in.h headers with the kernel sources.

* tag 'perf-tools-fixes-for-v5.9-2020-09-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf stat: Fix the ratio comments of miss-events
  perf test: Free formats for perf pmu parse test
  perf metric: Do not free metric when failed to resolve
  perf metric: Free metric when it failed to resolve
  perf metric: Release expr_parse_ctx after testing
  perf test: Fix memory leaks in parse-metric test
  perf parse-event: Fix memory leak in evsel->unit
  perf evlist: Fix cpu/thread map leak
  perf metric: Fix some memory leaks - part 2
  perf metric: Fix some memory leaks
  perf test: Free aliases for PMU event map aliases test
  perf vendor events amd: Remove trailing commas
  perf test: Leader sampling shouldn't clear sample period
  perf record: Don't clear event's period if set by a term
  tools headers UAPI: update linux/in.h copy
  tools headers UAPI: Sync kvm.h headers with the kernel sources
  perf record: Prevent override of attr->sample_period for libpfm4 events
  perf record: Set PERF_RECORD_PERIOD if attr->freq is set.
  perf bench: Fix 2 memory sanitizer warnings
  perf test: Fix the "signal" test inline assembly

3 years agoMerge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 16 Sep 2020 18:52:56 +0000 (11:52 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux

Pull clk driver fixes from Stephen Boyd:
 "A handful of clk driver fixes. Mostly they're for error paths or
  improper memory allocations sizes. Nothing as exciting as a wildfire"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: qcom: lpass: Correct goto target in lpass_core_sc7180_probe()
  clk: versatile: Add of_node_put() before return statement
  clk: bcm: dvp: Select the reset framework
  clk: rockchip: Fix initialization of mux_pll_src_4plls_p
  clk: davinci: Use the correct size when allocating memory

3 years agoMAINTAINERS: Fix Max's and Shravan's emails
Leon Romanovsky [Tue, 15 Sep 2020 08:27:41 +0000 (11:27 +0300)]
MAINTAINERS: Fix Max's and Shravan's emails

Max's and Shravan's usernames were changed while @mellanox.com emails
were transferred to be @nvidia.com.

Fixes: f6da70d99c96 ("MAINTAINERS: Update Mellanox and Cumulus Network addresses to new domain")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agoperf stat: Fix the ratio comments of miss-events
Qi Liu [Wed, 16 Sep 2020 10:48:51 +0000 (18:48 +0800)]
perf stat: Fix the ratio comments of miss-events

'perf stat' displays miss ratio of L1-dcache, L1-icache, dTLB cache,
iTLB cache and LL-cache. Take L1-dcache for example, miss ratio is
caculated as "L1-dcache-load-misses/L1-dcache-loads". So "of all
L1-dcache hits" is unsuitable to describe it, and "of all L1-dcache
accesses" seems better.

The comments of L1-icache, dTLB cache, iTLB cache and LL-cache are
fixed in the same way.

Signed-off-by: Qi Liu <liuqi115@huawei.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: linuxarm@huawei.com
Link: http://lore.kernel.org/lkml/1600253331-10535-1-git-send-email-liuqi115@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Tue, 15 Sep 2020 23:30:20 +0000 (16:30 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi

Pull SCSI fix from James Bottomley:
 "Just one fix in libsas for a resource leak in an error path"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: libsas: Fix error path in sas_notify_lldd_dev_found()

3 years agoMerge tag 'fixes-v5.9a' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris...
Linus Torvalds [Tue, 15 Sep 2020 23:26:57 +0000 (16:26 -0700)]
Merge tag 'fixes-v5.9a' of git://git./linux/kernel/git/jmorris/linux-security

Pull security layer fix from James  Morris:
 "A device_cgroup RCU warning fix from Amol Grover"

* tag 'fixes-v5.9a' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  device_cgroup: Fix RCU list debugging warning

3 years agoMerge tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 15 Sep 2020 23:20:43 +0000 (16:20 -0700)]
Merge tag 'hyperv-fixes-signed' of git://git./linux/kernel/git/hyperv/linux

Pull hyperv fixes from Wei Liu:
 "Two patches from Michael and Dexuan to fix vmbus hanging issues"

* tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  Drivers: hv: vmbus: Add timeout to vmbus_wait_for_unload
  Drivers: hv: vmbus: hibernation: do not hang forever in vmbus_bus_resume()

3 years agoperf test: Free formats for perf pmu parse test
Namhyung Kim [Tue, 15 Sep 2020 03:18:19 +0000 (12:18 +0900)]
perf test: Free formats for perf pmu parse test

The following leaks were detected by ASAN:

  Indirect leak of 360 byte(s) in 9 object(s) allocated from:
    #0 0x7fecc305180e in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10780e)
    #1 0x560578f6dce5 in perf_pmu__new_format util/pmu.c:1333
    #2 0x560578f752fc in perf_pmu_parse util/pmu.y:59
    #3 0x560578f6a8b7 in perf_pmu__format_parse util/pmu.c:73
    #4 0x560578e07045 in test__pmu tests/pmu.c:155
    #5 0x560578de109b in run_test tests/builtin-test.c:410
    #6 0x560578de109b in test_and_print tests/builtin-test.c:440
    #7 0x560578de401a in __cmd_test tests/builtin-test.c:661
    #8 0x560578de401a in cmd_test tests/builtin-test.c:807
    #9 0x560578e49354 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
    #10 0x560578ce71a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
    #11 0x560578ce71a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
    #12 0x560578ce71a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
    #13 0x7fecc2b7acc9 in __libc_start_main ../csu/libc-start.c:308

Fixes: cff7f956ec4a1 ("perf tests: Move pmu tests into separate object")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-12-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf metric: Do not free metric when failed to resolve
Namhyung Kim [Tue, 15 Sep 2020 03:18:17 +0000 (12:18 +0900)]
perf metric: Do not free metric when failed to resolve

It's dangerous to free the original metric when it's called from
resolve_metric() as it's already in the metric_list and might have other
resources too.  Instead, it'd better let them bail out and be released
properly at the later stage.

So add a check when it's called from metricgroup__add_metric() and
release it.  Also make sure that mp is set properly.

Fixes: 83de0b7d535de ("perf metric: Collect referenced metrics in struct metric_ref_node")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-10-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf metric: Free metric when it failed to resolve
Namhyung Kim [Tue, 15 Sep 2020 03:18:16 +0000 (12:18 +0900)]
perf metric: Free metric when it failed to resolve

The metricgroup__add_metric() can find multiple match for a metric group
and it's possible to fail.  Also it can fail in the middle like in
resolve_metric() even for single metric.

In those cases, the intermediate list and ids will be leaked like:

  Direct leak of 3 byte(s) in 1 object(s) allocated from:
    #0 0x7f4c938f40b5 in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x920b5)
    #1 0x55f7e71c1bef in __add_metric util/metricgroup.c:683
    #2 0x55f7e71c31d0 in add_metric util/metricgroup.c:906
    #3 0x55f7e71c3844 in metricgroup__add_metric util/metricgroup.c:940
    #4 0x55f7e71c488d in metricgroup__add_metric_list util/metricgroup.c:993
    #5 0x55f7e71c488d in parse_groups util/metricgroup.c:1045
    #6 0x55f7e71c60a4 in metricgroup__parse_groups_test util/metricgroup.c:1087
    #7 0x55f7e71235ae in __compute_metric tests/parse-metric.c:164
    #8 0x55f7e7124650 in compute_metric tests/parse-metric.c:196
    #9 0x55f7e7124650 in test_recursion_fail tests/parse-metric.c:318
    #10 0x55f7e7124650 in test__parse_metric tests/parse-metric.c:356
    #11 0x55f7e70be09b in run_test tests/builtin-test.c:410
    #12 0x55f7e70be09b in test_and_print tests/builtin-test.c:440
    #13 0x55f7e70c101a in __cmd_test tests/builtin-test.c:661
    #14 0x55f7e70c101a in cmd_test tests/builtin-test.c:807
    #15 0x55f7e7126214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
    #16 0x55f7e6fc41a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
    #17 0x55f7e6fc41a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
    #18 0x55f7e6fc41a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
    #19 0x7f4c93492cc9 in __libc_start_main ../csu/libc-start.c:308

Fixes: 83de0b7d535de ("perf metric: Collect referenced metrics in struct metric_ref_node")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-9-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf metric: Release expr_parse_ctx after testing
Namhyung Kim [Tue, 15 Sep 2020 03:18:15 +0000 (12:18 +0900)]
perf metric: Release expr_parse_ctx after testing

The test_generic_metric() missed to release entries in the pctx.  Asan
reported following leak (and more):

  Direct leak of 128 byte(s) in 1 object(s) allocated from:
    #0 0x7f4c9396980e in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10780e)
    #1 0x55f7e748cc14 in hashmap_grow (/home/namhyung/project/linux/tools/perf/perf+0x90cc14)
    #2 0x55f7e748d497 in hashmap__insert (/home/namhyung/project/linux/tools/perf/perf+0x90d497)
    #3 0x55f7e7341667 in hashmap__set /home/namhyung/project/linux/tools/perf/util/hashmap.h:111
    #4 0x55f7e7341667 in expr__add_ref util/expr.c:120
    #5 0x55f7e7292436 in prepare_metric util/stat-shadow.c:783
    #6 0x55f7e729556d in test_generic_metric util/stat-shadow.c:858
    #7 0x55f7e712390b in compute_single tests/parse-metric.c:128
    #8 0x55f7e712390b in __compute_metric tests/parse-metric.c:180
    #9 0x55f7e712446d in compute_metric tests/parse-metric.c:196
    #10 0x55f7e712446d in test_dcache_l2 tests/parse-metric.c:295
    #11 0x55f7e712446d in test__parse_metric tests/parse-metric.c:355
    #12 0x55f7e70be09b in run_test tests/builtin-test.c:410
    #13 0x55f7e70be09b in test_and_print tests/builtin-test.c:440
    #14 0x55f7e70c101a in __cmd_test tests/builtin-test.c:661
    #15 0x55f7e70c101a in cmd_test tests/builtin-test.c:807
    #16 0x55f7e7126214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
    #17 0x55f7e6fc41a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
    #18 0x55f7e6fc41a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
    #19 0x55f7e6fc41a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
    #20 0x7f4c93492cc9 in __libc_start_main ../csu/libc-start.c:308

Fixes: 6d432c4c8aa56 ("perf tools: Add test_generic_metric function")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-8-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf test: Fix memory leaks in parse-metric test
Namhyung Kim [Tue, 15 Sep 2020 03:18:14 +0000 (12:18 +0900)]
perf test: Fix memory leaks in parse-metric test

It didn't release resources when there's an error so the
test_recursion_fail() will leak some memory.

Fixes: 0a507af9c681a ("perf tests: Add parse metric test for ipc metric")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-7-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf parse-event: Fix memory leak in evsel->unit
Namhyung Kim [Tue, 15 Sep 2020 03:18:13 +0000 (12:18 +0900)]
perf parse-event: Fix memory leak in evsel->unit

The evsel->unit borrows a pointer of pmu event or alias instead of
owns a string.  But tool event (duration_time) passes a result of
strdup() caused a leak.

It was found by ASAN during metric test:

  Direct leak of 210 byte(s) in 70 object(s) allocated from:
    #0 0x7fe366fca0b5 in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x920b5)
    #1 0x559fbbcc6ea3 in add_event_tool util/parse-events.c:414
    #2 0x559fbbcc6ea3 in parse_events_add_tool util/parse-events.c:1414
    #3 0x559fbbd8474d in parse_events_parse util/parse-events.y:439
    #4 0x559fbbcc95da in parse_events__scanner util/parse-events.c:2096
    #5 0x559fbbcc95da in __parse_events util/parse-events.c:2141
    #6 0x559fbbc28555 in check_parse_id tests/pmu-events.c:406
    #7 0x559fbbc28555 in check_parse_id tests/pmu-events.c:393
    #8 0x559fbbc28555 in check_parse_cpu tests/pmu-events.c:415
    #9 0x559fbbc28555 in test_parsing tests/pmu-events.c:498
    #10 0x559fbbc0109b in run_test tests/builtin-test.c:410
    #11 0x559fbbc0109b in test_and_print tests/builtin-test.c:440
    #12 0x559fbbc03e69 in __cmd_test tests/builtin-test.c:695
    #13 0x559fbbc03e69 in cmd_test tests/builtin-test.c:807
    #14 0x559fbbc691f4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
    #15 0x559fbbb071a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
    #16 0x559fbbb071a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
    #17 0x559fbbb071a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
    #18 0x7fe366b68cc9 in __libc_start_main ../csu/libc-start.c:308

Fixes: f0fbb114e3025 ("perf stat: Implement duration_time as a proper event")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf evlist: Fix cpu/thread map leak
Namhyung Kim [Tue, 15 Sep 2020 03:18:11 +0000 (12:18 +0900)]
perf evlist: Fix cpu/thread map leak

Asan reported leak of cpu and thread maps as they have one more refcount
than released.  I found that after setting evlist maps it should release
it's refcount.

It seems to be broken from the beginning so I chose the original commit
as the culprit.  But not sure how it's applied to stable trees since
there are many changes in the code after that.

Fixes: 7e2ed097538c5 ("perf evlist: Store pointer to the cpu and thread maps")
Fixes: 4112eb1899c0e ("perf evlist: Default to syswide target when no thread/cpu maps set")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf metric: Fix some memory leaks - part 2
Namhyung Kim [Tue, 15 Sep 2020 03:18:10 +0000 (12:18 +0900)]
perf metric: Fix some memory leaks - part 2

The metric_event_delete() missed to free expr->metric_events and it
should free an expr when metric_refs allocation failed.

Fixes: 4ea2896715e67 ("perf metric: Collect referenced metrics in struct metric_expr")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf metric: Fix some memory leaks
Namhyung Kim [Tue, 15 Sep 2020 03:18:09 +0000 (12:18 +0900)]
perf metric: Fix some memory leaks

I found some memory leaks while reading the metric code.  Some are real
and others only occur in the error path.  When it failed during metric
or event parsing, it should release all resources properly.

Fixes: b18f3e365019d ("perf stat: Support JSON metrics in perf stat")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf test: Free aliases for PMU event map aliases test
Namhyung Kim [Tue, 15 Sep 2020 03:18:18 +0000 (12:18 +0900)]
perf test: Free aliases for PMU event map aliases test

The aliases were never released causing the following leaks:

  Indirect leak of 1224 byte(s) in 9 object(s) allocated from:
    #0 0x7feefb830628 in malloc (/lib/x86_64-linux-gnu/libasan.so.5+0x107628)
    #1 0x56332c8f1b62 in __perf_pmu__new_alias util/pmu.c:322
    #2 0x56332c8f401f in pmu_add_cpu_aliases_map util/pmu.c:778
    #3 0x56332c792ce9 in __test__pmu_event_aliases tests/pmu-events.c:295
    #4 0x56332c792ce9 in test_aliases tests/pmu-events.c:367
    #5 0x56332c76a09b in run_test tests/builtin-test.c:410
    #6 0x56332c76a09b in test_and_print tests/builtin-test.c:440
    #7 0x56332c76ce69 in __cmd_test tests/builtin-test.c:695
    #8 0x56332c76ce69 in cmd_test tests/builtin-test.c:807
    #9 0x56332c7d2214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
    #10 0x56332c6701a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
    #11 0x56332c6701a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
    #12 0x56332c6701a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
    #13 0x7feefb359cc9 in __libc_start_main ../csu/libc-start.c:308

Fixes: 956a78356c24c ("perf test: Test pmu-events aliases")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-11-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf vendor events amd: Remove trailing commas
Henry Burns [Tue, 15 Sep 2020 00:40:49 +0000 (20:40 -0400)]
perf vendor events amd: Remove trailing commas

The amdzen2/core.json and amdzen/core.json vendor events files have the
occasional trailing comma. Since that goes against the JSON standard,
lets remove it.

Signed-off-by: Henry Burns <henrywolfeburns@gmail.com>
Acked-by: Kim Phillips <kim.phillips@amd.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vijay Thakkar <vijaythakkar@me.com>
Link: http://lore.kernel.org/lkml/20200915004125.971-1-henrywolfeburns@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoMerge tag 'for-5.9-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Mon, 14 Sep 2020 22:41:58 +0000 (15:41 -0700)]
Merge tag 'for-5.9-rc5-tag' of git://git./linux/kernel/git/kdave/linux

Pull btrfs fix from David Sterba:
 "One of the recent lockdep fixes introduced a bug that breaks the
  search ioctl, which is used by some applications (bees, compsize). The
  patch made it to stable trees so we need this fixup to make it work
  again"

* tag 'for-5.9-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: fix wrong address when faulting in pages in the search ioctl

3 years agoperf test: Leader sampling shouldn't clear sample period
Ian Rogers [Sat, 12 Sep 2020 02:56:55 +0000 (19:56 -0700)]
perf test: Leader sampling shouldn't clear sample period

Add test that a sibling with leader sampling doesn't have its period
cleared.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200912025655.1337192-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf record: Don't clear event's period if set by a term
Ian Rogers [Sat, 12 Sep 2020 02:56:54 +0000 (19:56 -0700)]
perf record: Don't clear event's period if set by a term

If events in a group explicitly set a frequency or period with leader
sampling, don't disable the samples on those events.

Prior to 5.8:

  perf record -e '{cycles/period=12345000/,instructions/period=6789000/}:S'

would clear the attributes then apply the config terms. In commit
5f34278867b7 leader sampling configuration was moved to after applying the
config terms, in the example, making the instructions' event have its period
cleared.

This change makes it so that sampling is only disabled if configuration
terms aren't present.

Committer testing:

Before:

  # perf record -e '{cycles/period=1/,instructions/period=2/}:S' sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.051 MB perf.data (6 samples) ]
  #
  # perf evlist -v
  cycles/period=1/: size: 120, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|READ|ID, read_format: ID|GROUP, disabled: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
  instructions/period=2/: size: 120, config: 0x1, sample_type: IP|TID|TIME|READ|ID, read_format: ID|GROUP, sample_id_all: 1, exclude_guest: 1
  #

After:

  # perf record -e '{cycles/period=1/,instructions/period=2/}:S' sleep 0.0001
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.052 MB perf.data (4 samples) ]
  # perf evlist -v
  cycles/period=1/: size: 120, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|READ|ID, read_format: ID|GROUP, disabled: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
  instructions/period=2/: size: 120, config: 0x1, { sample_period, sample_freq }: 2, sample_type: IP|TID|TIME|READ|ID, read_format: ID|GROUP, sample_id_all: 1, exclude_guest: 1
  #

Fixes: 5f34278867b7 ("perf evlist: Move leader-sampling configuration")
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yonghong Song <yhs@fb.com>
Link: http://lore.kernel.org/lkml/20200912025655.1337192-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agotools headers UAPI: update linux/in.h copy
Arnaldo Carvalho de Melo [Mon, 14 Sep 2020 22:06:41 +0000 (19:06 -0300)]
tools headers UAPI: update linux/in.h copy

To get the changes from:

  645f08975f49441b ("net: Fix some comments")

That don't cause any changes in tooling, its just a typo fix.

This silences this tools/perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/in.h' differs from latest version at 'include/uapi/linux/in.h'
  diff -u tools/include/uapi/linux/in.h include/uapi/linux/in.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agotools headers UAPI: Sync kvm.h headers with the kernel sources
Arnaldo Carvalho de Melo [Mon, 14 Sep 2020 22:02:18 +0000 (19:02 -0300)]
tools headers UAPI: Sync kvm.h headers with the kernel sources

To pick the changes in:

  15e9e35cd1dec2bc ("KVM: MIPS: Change the definition of kvm type")
  004a01241c5a0d37 ("arm64/x86: KVM: Introduce steal-time cap")

That do not result in any change in tooling, as the additions are not
being used in any table generator.

This silences these perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
  diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andrew Jones <drjones@redhat.com>
Cc: Huacai Chen <chenhc@lemote.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf record: Prevent override of attr->sample_period for libpfm4 events
Stephane Eranian [Sat, 12 Sep 2020 02:56:53 +0000 (19:56 -0700)]
perf record: Prevent override of attr->sample_period for libpfm4 events

Before:

  $ perf record -c 10000 --pfm-events=cycles:period=77777

Would yield a cycles event with period=10000, instead of 77777.

the event string and perf record initializing the event.
This was due to an ordering issue between libpfm4 parsing

events with attr->sample_period != 0 by the time
intent of the author.
perf_evsel__config() is invoked. This seems to have been the
This patch fixes the problem by preventing override for

Signed-off-by: Stephane Eranian <eranian@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Link: http://lore.kernel.org/lkml/20200912025655.1337192-3-irogers@google.com
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf record: Set PERF_RECORD_PERIOD if attr->freq is set.
David Sharp [Sat, 12 Sep 2020 02:56:52 +0000 (19:56 -0700)]
perf record: Set PERF_RECORD_PERIOD if attr->freq is set.

evsel__config() would only set PERF_RECORD_PERIOD if it set attr->freq
from perf record options. When it is set by libpfm events, it would not
get set. This changes evsel__config to see if attr->freq is set outside
of whether or not it changes attr->freq itself.

Signed-off-by: David Sharp <dhsharp@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: david sharp <dhsharp@google.com>
Link: http://lore.kernel.org/lkml/20200912025655.1337192-2-irogers@google.com
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf bench: Fix 2 memory sanitizer warnings
Ian Rogers [Sat, 12 Sep 2020 05:37:25 +0000 (22:37 -0700)]
perf bench: Fix 2 memory sanitizer warnings

Memory sanitizer warns if a write is performed where the memory being
read for the write is uninitialized. Avoid this warning by initializing
the memory.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200912053725.1405857-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf test: Fix the "signal" test inline assembly
Jiri Olsa [Fri, 11 Sep 2020 13:00:05 +0000 (15:00 +0200)]
perf test: Fix the "signal" test inline assembly

When compiling with DEBUG=1 on Fedora 32 I'm getting crash for 'perf
test signal':

  Program received signal SIGSEGV, Segmentation fault.
  0x0000000000c68548 in __test_function ()
  (gdb) bt
  #0  0x0000000000c68548 in __test_function ()
  #1  0x00000000004d62e9 in test_function () at tests/bp_signal.c:61
  #2  0x00000000004d689a in test__bp_signal (test=0xa8e280 <generic_ ...
  #3  0x00000000004b7d49 in run_test (test=0xa8e280 <generic_tests+1 ...
  #4  0x00000000004b7e7f in test_and_print (t=0xa8e280 <generic_test ...
  #5  0x00000000004b8927 in __cmd_test (argc=1, argv=0x7fffffffdce0, ...
  ...

It's caused by the symbol __test_function being in the ".bss" section:

  $ readelf -a ./perf | less
    [Nr] Name              Type             Address           Offset
         Size              EntSize          Flags  Link  Info  Align
    ...
    [28] .bss              NOBITS           0000000000c356a0  008346a0
         00000000000511f8  0000000000000000  WA       0     0     32

  $ nm perf | grep __test_function
  0000000000c68548 B __test_function

I guess most of the time we're just lucky the inline asm ended up in the
".text" section, so making it specific explicit with push and pop
section clauses.

  $ readelf -a ./perf | less
    [Nr] Name              Type             Address           Offset
         Size              EntSize          Flags  Link  Info  Align
    ...
    [13] .text             PROGBITS         0000000000431240  00031240
         0000000000306faa  0000000000000000  AX       0     0     16

  $ nm perf | grep __test_function
  00000000004d62c8 T __test_function

Committer testing:

  $ readelf -wi ~/bin/perf | grep producer -m1
    <c>   DW_AT_producer    : (indirect string, offset: 0x254a): GNU C99 10.2.1 20200723 (Red Hat 10.2.1-1) -mtune=generic -march=x86-64 -ggdb3 -std=gnu99 -fno-omit-frame-pointer -funwind-tables -fstack-protector-all
                                                                                                                                         ^^^^^
                                                                                                                                         ^^^^^
                                                                                                                                         ^^^^^
  $

Before:

  $ perf test signal
  20: Breakpoint overflow signal handler                    : FAILED!
  $

After:

  $ perf test signal
  20: Breakpoint overflow signal handler                    : Ok
  $

Fixes: 8fd34e1cce18 ("perf test: Improve bp_signal")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lore.kernel.org/lkml/20200911130005.1842138-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agovgacon: remove software scrollback support
Linus Torvalds [Wed, 9 Sep 2020 21:53:50 +0000 (14:53 -0700)]
vgacon: remove software scrollback support

Yunhai Zhang recently fixed a VGA software scrollback bug in commit
ebfdfeeae8c0 ("vgacon: Fix for missing check in scrollback handling"),
but that then made people look more closely at some of this code, and
there were more problems on the vgacon side, but also the fbcon software
scrollback.

We don't really have anybody who maintains this code - probably because
nobody actually _uses_ it any more.  Sure, people still use both VGA and
the framebuffer consoles, but they are no longer the main user
interfaces to the kernel, and haven't been for decades, so these kinds
of extra features end up bitrotting and not really being used.

So rather than try to maintain a likely unused set of code, I'll just
aggressively remove it, and see if anybody even notices.  Maybe there
are people who haven't jumped on the whole GUI badnwagon yet, and think
it's just a fad.  And maybe those people use the scrollback code.

If that turns out to be the case, we can resurrect this again, once
we've found the sucker^Wmaintainer for it who actually uses it.

Reported-by: NopNop Nop <nopitydays@gmail.com>
Tested-by: Willy Tarreau <w@1wt.eu>
Cc: 张云海 <zhangyunhai@nsfocus.com>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Willy Tarreau <w@1wt.eu>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agofbcon: remove now unusued 'softback_lines' cursor() argument
Linus Torvalds [Tue, 8 Sep 2020 17:56:27 +0000 (10:56 -0700)]
fbcon: remove now unusued 'softback_lines' cursor() argument

Since the softscroll code got removed, this argument is always zero and
makes no sense any more.

Tested-by: Yuan Ming <yuanmingbuaa@gmail.com>
Tested-by: Willy Tarreau <w@1wt.eu>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agofbcon: remove soft scrollback code
Linus Torvalds [Mon, 7 Sep 2020 18:45:27 +0000 (11:45 -0700)]
fbcon: remove soft scrollback code

This (and the VGA soft scrollback) turns out to have various nasty small
special cases that nobody really is willing to fight.  The soft
scrollback code was really useful a few decades ago when you typically
used the console interactively as the main way to interact with the
machine, but that just isn't the case any more.

So it's not worth dragging along.

Tested-by: Yuan Ming <yuanmingbuaa@gmail.com>
Tested-by: Willy Tarreau <w@1wt.eu>
Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agobtrfs: fix wrong address when faulting in pages in the search ioctl
Filipe Manana [Mon, 14 Sep 2020 08:01:04 +0000 (09:01 +0100)]
btrfs: fix wrong address when faulting in pages in the search ioctl

When faulting in the pages for the user supplied buffer for the search
ioctl, we are passing only the base address of the buffer to the function
fault_in_pages_writeable(). This means that after the first iteration of
the while loop that searches for leaves, when we have a non-zero offset,
stored in 'sk_offset', we try to fault in a wrong page range.

So fix this by adding the offset in 'sk_offset' to the base address of the
user supplied buffer when calling fault_in_pages_writeable().

Several users have reported that the applications compsize and bees have
started to operate incorrectly since commit a48b73eca4ceb9 ("btrfs: fix
potential deadlock in the search ioctl") was added to stable trees, and
these applications make heavy use of the search ioctls. This fixes their
issues.

Link: https://lore.kernel.org/linux-btrfs/632b888d-a3c3-b085-cdf5-f9bb61017d92@lechevalier.se/
Link: https://github.com/kilobyte/compsize/issues/34
Fixes: a48b73eca4ceb9 ("btrfs: fix potential deadlock in the search ioctl")
CC: stable@vger.kernel.org # 4.4+
Tested-by: A L <mail@lechevalier.se>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
3 years agoDrivers: hv: vmbus: Add timeout to vmbus_wait_for_unload
Michael Kelley [Sun, 13 Sep 2020 19:47:29 +0000 (12:47 -0700)]
Drivers: hv: vmbus: Add timeout to vmbus_wait_for_unload

vmbus_wait_for_unload() looks for a CHANNELMSG_UNLOAD_RESPONSE message
coming from Hyper-V.  But if the message isn't found for some reason,
the panic path gets hung forever.  Add a timeout of 10 seconds to prevent
this.

Fixes: 415719160de3 ("Drivers: hv: vmbus: avoid scheduling in interrupt context in vmbus_initiate_unload()")
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/1600026449-23651-1-git-send-email-mikelley@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
3 years agoLinux 5.9-rc5
Linus Torvalds [Sun, 13 Sep 2020 23:06:00 +0000 (16:06 -0700)]
Linux 5.9-rc5

3 years agoMerge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Linus Torvalds [Sun, 13 Sep 2020 21:54:40 +0000 (14:54 -0700)]
Merge tag 'armsoc-fixes' of git://git./linux/kernel/git/soc/soc

Pull ARM SoC fixes from Olof Johansson:
 "A collection of fixes I've been accruing over the last few weeks, none
  of them have been severe enough to warrant flushing the queue but it's
  been long enough now that it's a good idea to send them in.

  A handful of them are fixups for QSPI DT/bindings/compatibles, some
  smaller fixes for system DMA clock control and TMU interrupts on i.MX,
  a handful of fixes for OMAP, including a fix for DSI (display) on
  omap5"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (27 commits)
  arm64: dts: ns2: Fixed QSPI compatible string
  ARM: dts: BCM5301X: Fixed QSPI compatible string
  ARM: dts: NSP: Fixed QSPI compatible string
  ARM: dts: bcm: HR2: Fixed QSPI compatible string
  dt-bindings: spi: Fix spi-bcm-qspi compatible ordering
  ARM: dts: imx6sx: fix the pad QSPI1B_SCLK mux mode for uart3
  arm64: dts: imx8mp: correct sdma1 clk setting
  arm64: dts: imx8mq: Fix TMU interrupt property
  ARM: dts: imx7d-zii-rmu2: fix rgmii phy-mode for ksz9031 phy
  ARM: dts: vfxxx: Add syscon compatible with OCOTP
  ARM: dts: imx6q-logicpd: Fix broken PWM
  arm64: dts: imx: Add missing imx8mm-beacon-kit.dtb to build
  ARM: dts: imx6q-prtwd2: Remove unneeded i2c unit name
  ARM: dts: imx6qdl-gw51xx: Remove unneeded #address-cells/#size-cells
  ARM: dts: imx7ulp: Correct gpio ranges
  ARM: dts: ls1021a: fix QuadSPI-memory reg range
  arm64: defconfig: Enable ptn5150 extcon driver
  arm64: defconfig: Enable USB gadget with configfs
  ARM: configs: Update Integrator defconfig
  ARM: dts: omap5: Fix DSI base address and clocks
  ...

3 years agoMerge tag 'usb-5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Sun, 13 Sep 2020 16:23:54 +0000 (09:23 -0700)]
Merge tag 'usb-5.9-rc5' of git://git./linux/kernel/git/gregkh/usb

Pull USB/Thunderbolt fixes from Greg KH:
 "Here are some small USB and Thunderbolt driver fixes for 5.9-rc5.

  Nothing huge, just a number of bugfixes and new device ids for
  problems reported:

   - new USB serial driver ids

   - bug fixes for syzbot reported problems

   - typec driver fixes

   - thunderbolt driver fixes

   - revert of reported broken commit

  All of these have been in linux-next with no reported issues"

* tag 'usb-5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: typec: intel_pmc_mux: Do not configure SBU and HSL Orientation in Alternate modes
  usb: typec: intel_pmc_mux: Do not configure Altmode HPD High
  usb: core: fix slab-out-of-bounds Read in read_descriptors
  Revert "usb: dwc3: meson-g12a: fix shared reset control use"
  usb: typec: ucsi: acpi: Check the _DEP dependencies
  usb: typec: intel_pmc_mux: Un-register the USB role switch
  usb: Fix out of sync data toggle if a configured device is reconfigured
  USB: serial: option: support dynamic Quectel USB compositions
  USB: serial: option: add support for SIM7070/SIM7080/SIM7090 modules
  thunderbolt: Use maximum USB3 link rate when reclaiming if link is not up
  thunderbolt: Disable ports that are not implemented
  USB: serial: ftdi_sio: add IDs for Xsens Mti USB converter

3 years agoMerge tag 'staging-5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Sun, 13 Sep 2020 16:15:20 +0000 (09:15 -0700)]
Merge tag 'staging-5.9-rc5' of git://git./linux/kernel/git/gregkh/staging

Pull staging/IIO driver fixes from Greg KH:
 "Here are a number of staging and IIO driver fixes for 5.9-rc5.

  The majority of these are IIO driver fixes, to resolve a timestamp
  issue that was recently found to affect a bunch of IIO drivers.

  The other fixes in here are:

   - small IIO driver fixes

   - greybus driver fix

   - counter driver fix (came in through the IIO fixes tree)

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'staging-5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (23 commits)
  iio: adc: mcp3422: fix locking on error path
  iio: adc: mcp3422: fix locking scope
  iio: adc: meson-saradc: Use the parent device to look up the calib data
  iio:adc:max1118 Fix alignment of timestamp and data leak issues
  iio:adc:ina2xx Fix timestamp alignment issue.
  iio:adc:ti-adc084s021 Fix alignment and data leak issues.
  iio:adc:ti-adc081c Fix alignment and data leak issues
  iio:magnetometer:ak8975 Fix alignment and data leak issues.
  iio:light:ltr501 Fix timestamp alignment issue.
  iio:light:max44000 Fix timestamp alignment and prevent data leak.
  iio:chemical:ccs811: Fix timestamp alignment and prevent data leak.
  iio:proximity:mb1232: Fix timestamp alignment and prevent data leak.
  iio:accel:mma7455: Fix timestamp alignment and prevent data leak.
  iio:accel:bmc150-accel: Fix timestamp alignment and prevent data leak.
  iio:accel:mma8452: Fix timestamp alignment and prevent data leak.
  iio: accel: kxsd9: Fix alignment of local buffer.
  iio: adc: rockchip_saradc: select IIO_TRIGGERED_BUFFER
  iio: adc: ti-ads1015: fix conversion when CONFIG_PM is not set
  counter: microchip-tcb-capture: check the correct variable
  iio: cros_ec: Set Gyroscope default frequency to 25Hz
  ...

3 years agoMerge tag 'driver-core-5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 13 Sep 2020 16:02:59 +0000 (09:02 -0700)]
Merge tag 'driver-core-5.9-rc5' of git://git./linux/kernel/git/gregkh/driver-core

Pull driver core fixes from Greg KH:
 "Here are some small driver core and debugfs fixes for 5.9-rc5

  Included in here are:

   - firmware loader memory leak fix

   - firmware loader testing fixes for non-EFI systems

   - device link locking fixes found by lockdep

   - kobject_del() bugfix that has been affecting some callers

   - debugfs minor fix

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'driver-core-5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  test_firmware: Test platform fw loading on non-EFI systems
  PM: <linux/device.h>: fix @em_pd kernel-doc warning
  kobject: Drop unneeded conditional in __kobject_del()
  driver core: Fix device_pm_lock() locking for device links
  MAINTAINERS: Add the security document to SECURITY CONTACT
  driver code: print symbolic error code
  debugfs: Fix module state check condition
  kobject: Restore old behaviour of kobject_del(NULL)
  firmware_loader: fix memory leak for paged buffer

3 years agoMerge tag 'arm-soc/for-5.9/devicetree-fixes' of https://github.com/Broadcom/stblinux...
Olof Johansson [Sun, 13 Sep 2020 15:57:37 +0000 (08:57 -0700)]
Merge tag 'arm-soc/for-5.9/devicetree-fixes' of https://github.com/Broadcom/stblinux into arm/fixes

This pull request contains Broadcom ARM-based SoCs Device Tree fixes for
5.9, please pull the following:

- Florian fixes the Broadcom QSPI controller binding such that the most
  specific compatible string is the left most one, and all existing
  in-tree users are updated as well.

* tag 'arm-soc/for-5.9/devicetree-fixes' of https://github.com/Broadcom/stblinux:
  arm64: dts: ns2: Fixed QSPI compatible string
  ARM: dts: BCM5301X: Fixed QSPI compatible string
  ARM: dts: NSP: Fixed QSPI compatible string
  ARM: dts: bcm: HR2: Fixed QSPI compatible string
  dt-bindings: spi: Fix spi-bcm-qspi compatible ordering

Link: https://lore.kernel.org/r/20200909211857.4144718-1-f.fainelli@gmail.com
Signed-off-by: Olof Johansson <olof@lixom.net>
3 years agoMerge tag 'imx-fixes-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo...
Olof Johansson [Sun, 13 Sep 2020 15:56:03 +0000 (08:56 -0700)]
Merge tag 'imx-fixes-5.9-2' of git://git./linux/kernel/git/shawnguo/linux into arm/fixes

i.MX fixes for 5.9, round 2:

- Fix the misspelling of 'interrupts' property in i.MX8MQ TMU DT node.
- Correct 'ahb' clock for i.MX8MP SDMA1 in device tree.
- Fix pad QSPI1B_SCLK mux mode for UART3 on i.MX6SX.

* tag 'imx-fixes-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
  ARM: dts: imx6sx: fix the pad QSPI1B_SCLK mux mode for uart3
  arm64: dts: imx8mp: correct sdma1 clk setting
  arm64: dts: imx8mq: Fix TMU interrupt property

Link: https://lore.kernel.org/r/20200909143844.GA25109@dragon
Signed-off-by: Olof Johansson <olof@lixom.net>
3 years agoMerge tag 'omap-for-v5.9/fixes-rc3' of git://git.kernel.org/pub/scm/linux/kernel...
Olof Johansson [Sun, 13 Sep 2020 15:54:01 +0000 (08:54 -0700)]
Merge tag 'omap-for-v5.9/fixes-rc3' of git://git./linux/kernel/git/tmlind/linux-omap into arm/fixes

Fixes for omaps for v5.9-rc cycle

Few fixes for omap based devices:

- Fix of_clk_get() error handling for omap-iommu

- Fix missing audio pinctrl entries for logicpd boards

- Fix video for logicpd-som-lv after switch to generic panels

- Fix omap5 DSI clocks base

* tag 'omap-for-v5.9/fixes-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  ARM: dts: omap5: Fix DSI base address and clocks
  ARM: dts: logicpd-som-lv-baseboard: Fix missing video
  ARM: dts: logicpd-som-lv-baseboard: Fix broken audio
  ARM: dts: logicpd-torpedo-baseboard: Fix broken audio
  ARM: OMAP2+: Fix an IS_ERR() vs NULL check in _get_pwrdm()

Link: https://lore.kernel.org/r/pull-1599132064-54898@atomide.com
Signed-off-by: Olof Johansson <olof@lixom.net>
3 years agoMerge tag 'char-misc-5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Sun, 13 Sep 2020 15:52:21 +0000 (08:52 -0700)]
Merge tag 'char-misc-5.9-rc5' of git://git./linux/kernel/git/gregkh/char-misc

Pull char / misc driver fixes from Greg KH:
 "Here are a number of small driver fixes for 5.9-rc5

  Included in here are:

   - habanalabs driver fixes

   - interconnect driver fixes

   - soundwire driver fixes

   - dyndbg fixes for reported issues, and then reverts to fix it all up
     to a sane state.

   - phy driver fixes

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  Revert "dyndbg: accept query terms like file=bar and module=foo"
  Revert "dyndbg: fix problem parsing format="foo bar""
  scripts/tags.sh: exclude tools directory from tags generation
  video: fbdev: fix OOB read in vga_8planes_imageblit()
  dyndbg: fix problem parsing format="foo bar"
  dyndbg: refine export, rename to dynamic_debug_exec_queries()
  dyndbg: give %3u width in pr-format, cosmetic only
  interconnect: qcom: Fix small BW votes being truncated to zero
  soundwire: fix double free of dangling pointer
  interconnect: Show bandwidth for disabled paths as zero in debugfs
  habanalabs: fix report of RAZWI initiator coordinates
  habanalabs: prevent user buff overflow
  phy: omap-usb2-phy: disable PHY charger detect
  phy: qcom-qmp: Use correct values for ipq8074 PCIe Gen2 PHY init
  soundwire: bus: fix typo in comment on INTSTAT registers
  phy: qualcomm: fix return value check in qcom_ipq806x_usb_phy_probe()
  phy: qualcomm: fix platform_no_drv_owner.cocci warnings

3 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Sun, 13 Sep 2020 15:34:47 +0000 (08:34 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "A bit on the bigger side, mostly due to me being on vacation, then
  busy, then on parental leave, but there's nothing worrisome.

  ARM:
   - Multiple stolen time fixes, with a new capability to match x86
   - Fix for hugetlbfs mappings when PUD and PMD are the same level
   - Fix for hugetlbfs mappings when PTE mappings are enforced (dirty
     logging, for example)
   - Fix tracing output of 64bit values

  x86:
   - nSVM state restore fixes
   - Async page fault fixes
   - Lots of small fixes everywhere"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (25 commits)
  KVM: emulator: more strict rsm checks.
  KVM: nSVM: more strict SMM checks when returning to nested guest
  SVM: nSVM: setup nested msr permission bitmap on nested state load
  SVM: nSVM: correctly restore GIF on vmexit from nesting after migration
  x86/kvm: don't forget to ACK async PF IRQ
  x86/kvm: properly use DEFINE_IDTENTRY_SYSVEC() macro
  KVM: VMX: Don't freeze guest when event delivery causes an APIC-access exit
  KVM: SVM: avoid emulation with stale next_rip
  KVM: x86: always allow writing '0' to MSR_KVM_ASYNC_PF_EN
  KVM: SVM: Periodically schedule when unregistering regions on destroy
  KVM: MIPS: Change the definition of kvm type
  kvm x86/mmu: use KVM_REQ_MMU_SYNC to sync when needed
  KVM: nVMX: Fix the update value of nested load IA32_PERF_GLOBAL_CTRL control
  KVM: fix memory leak in kvm_io_bus_unregister_dev()
  KVM: Check the allocation of pv cpu mask
  KVM: nVMX: Update VMCS02 when L2 PAE PDPTE updates detected
  KVM: arm64: Update page shift if stage 2 block mapping not supported
  KVM: arm64: Fix address truncation in traces
  KVM: arm64: Do not try to map PUDs when they are folded into PMD
  arm64/x86: KVM: Introduce steal-time cap
  ...

3 years agoMerge tag 'for-linus' of git://github.com/openrisc/linux
Linus Torvalds [Sat, 12 Sep 2020 20:03:49 +0000 (13:03 -0700)]
Merge tag 'for-linus' of git://github.com/openrisc/linux

Pull OpenRISC fixes from Stafford Horne:
 "Fixes for compile issues pointed out by kbuild and one bug I found in
  initrd with the 5.9 patches"

* tag 'for-linus' of git://github.com/openrisc/linux:
  openrisc: Fix issue with get_user for 64-bit values
  openrisc: Fix cache API compile issue when not inlining
  openrisc: Reserve memblock for initrd

3 years agoMerge tag 'seccomp-v5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees...
Linus Torvalds [Sat, 12 Sep 2020 19:58:01 +0000 (12:58 -0700)]
Merge tag 'seccomp-v5.9-rc5' of git://git./linux/kernel/git/kees/linux

Pull seccomp fixes from Kees Cook:
 "This fixes a rare race condition in seccomp when using TSYNC and
  USER_NOTIF together where a memory allocation would not get freed
  (found by syzkaller, fixed by Tycho).

  Additionally updates Tycho's MAINTAINERS and .mailmap entries for his
  new address"

* tag 'seccomp-v5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  seccomp: don't leave dangling ->notif if file allocation fails
  mailmap, MAINTAINERS: move to tycho.pizza
  seccomp: don't leak memory when filter install races

3 years agoMerge tag 'libnvdimm-fix-v5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 12 Sep 2020 19:43:58 +0000 (12:43 -0700)]
Merge tag 'libnvdimm-fix-v5.9-rc5' of git://git./linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm fix from Vishal Verma:
 "Fix detection of dax support for block devices.

  Previous fixes in this area, which only affected printing of debug
  messages, had an incorrect condition for detection of dax. This fix
  should finally do the right thing"

* tag 'libnvdimm-fix-v5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  dax: fix detection of dax support for non-persistent memory block devices

3 years agoMerge tag 'for-5.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Sat, 12 Sep 2020 19:28:39 +0000 (12:28 -0700)]
Merge tag 'for-5.9-rc4-tag' of git://git./linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:
 "A few more fixes:

   - regression fix for a crash after failed snapshot creation

   - one more lockep fix: use nofs allocation when allocating missing
     device

   - fix reloc tree leak on degraded mount

   - make some extent buffer alignment checks less strict to mount
     filesystems created by btrfs-convert"

* tag 'for-5.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: fix NULL pointer dereference after failure to create snapshot
  btrfs: free data reloc tree on failed mount
  btrfs: require only sector size alignment for parent eb bytenr
  btrfs: fix lockdep splat in add_missing_dev

3 years agoMerge tag '5.9-rc4-smb3-fix' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Sat, 12 Sep 2020 18:48:04 +0000 (11:48 -0700)]
Merge tag '5.9-rc4-smb3-fix' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fix from Steve French:
 "A fix for lookup on DFS link when cifsacl or modefromsid is used"

* tag '5.9-rc4-smb3-fix' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: fix DFS mount with cifsacl/modefromsid

3 years agoKVM: emulator: more strict rsm checks.
Maxim Levitsky [Thu, 27 Aug 2020 17:11:44 +0000 (20:11 +0300)]
KVM: emulator: more strict rsm checks.

Don't ignore return values in rsm_load_state_64/32 to avoid
loading invalid state from SMM state area if it was tampered with
by the guest.

This is primarly intended to avoid letting guest set bits in EFER
(like EFER.SVME when nesting is disabled) by manipulating SMM save area.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20200827171145.374620-8-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agoKVM: nSVM: more strict SMM checks when returning to nested guest
Maxim Levitsky [Thu, 27 Aug 2020 16:27:20 +0000 (19:27 +0300)]
KVM: nSVM: more strict SMM checks when returning to nested guest

* check that guest is 64 bit guest, otherwise the SVM related fields
  in the smm state area are not defined

* If the SMM area indicates that SMM interrupted a running guest,
  check that EFER.SVME which is also saved in this area is set, otherwise
  the guest might have tampered with SMM save area, and so indicate
  emulation failure which should triple fault the guest.

* Check that that guest CPUID supports SVM (due to the same issue as above)

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20200827162720.278690-4-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agoSVM: nSVM: setup nested msr permission bitmap on nested state load
Maxim Levitsky [Thu, 27 Aug 2020 16:27:19 +0000 (19:27 +0300)]
SVM: nSVM: setup nested msr permission bitmap on nested state load

This code was missing and was forcing the L2 run with L1's msr
permission bitmap

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20200827162720.278690-3-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agoSVM: nSVM: correctly restore GIF on vmexit from nesting after migration
Maxim Levitsky [Thu, 27 Aug 2020 16:27:18 +0000 (19:27 +0300)]
SVM: nSVM: correctly restore GIF on vmexit from nesting after migration

Currently code in svm_set_nested_state copies the current vmcb control
area to L1 control area (hsave->control), under assumption that
it mostly reflects the defaults that kvm choose, and later qemu
overrides  these defaults with L2 state using standard KVM interfaces,
like KVM_SET_REGS.

However nested GIF (which is AMD specific thing) is by default is true,
and it is copied to hsave area as such.

This alone is not a big deal since on VMexit, GIF is always set to false,
regardless of what it was on VM entry.  However in nested_svm_vmexit we
were first were setting GIF to false, but then we overwrite the control
fields with value from the hsave area.  (including the nested GIF field
itself if GIF virtualization is enabled).

Now on normal vm entry this is not a problem, since GIF is usually false
prior to normal vm entry, and this is the value that copied to hsave,
and then restored, but this is not always the case when the nested state
is loaded as explained above.

To fix this issue, move svm_set_gif after we restore the L1 control
state in nested_svm_vmexit, so that even with wrong GIF in the
saved L1 control area, we still clear GIF as the spec says.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20200827162720.278690-2-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agoopenrisc: Fix issue with get_user for 64-bit values
Stafford Horne [Wed, 2 Sep 2020 20:54:40 +0000 (05:54 +0900)]
openrisc: Fix issue with get_user for 64-bit values

A build failure was raised by kbuild with the following error.

    drivers/android/binder.c: Assembler messages:
    drivers/android/binder.c:3861: Error: unrecognized keyword/register name `l.lwz ?ap,4(r24)'
    drivers/android/binder.c:3866: Error: unrecognized keyword/register name `l.addi ?ap,r0,0'

The issue is with 64-bit get_user() calls on openrisc.  I traced this to
a problem where in the internally in the get_user macros there is a cast
to long __gu_val this causes GCC to think the get_user call is 32-bit.
This binder code is really long and GCC allocates register r30, which
triggers the issue. The 64-bit get_user asm tries to get the 64-bit pair
register, which for r30 overflows the general register names and returns
the dummy register ?ap.

The fix here is to move the temporary variables into the asm macros.  We
use a 32-bit __gu_tmp for 32-bit and smaller macro and a 64-bit tmp in
the 64-bit macro.  The cast in the 64-bit macro has a trick of casting
through __typeof__((x)-(x)) which avoids the below warning.  This was
barrowed from riscv.

    arch/openrisc/include/asm/uaccess.h:240:8: warning: cast to pointer from integer of different size

I tested this in a small unit test to check reading between 64-bit and
32-bit pointers to 64-bit and 32-bit values in all combinations.  Also I
ran make C=1 to confirm no new sparse warnings came up.  It all looks
clean to me.

Link: https://lore.kernel.org/lkml/202008200453.ohnhqkjQ%25lkp@intel.com/
Signed-off-by: Stafford Horne <shorne@gmail.com>
Reviewed-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
3 years agox86/kvm: don't forget to ACK async PF IRQ
Vitaly Kuznetsov [Tue, 8 Sep 2020 13:53:50 +0000 (15:53 +0200)]
x86/kvm: don't forget to ACK async PF IRQ

Merge commit 26d05b368a5c0 ("Merge branch 'kvm-async-pf-int' into HEAD")
tried to adapt the new interrupt based async PF mechanism to the newly
introduced IDTENTRY magic but unfortunately it missed the fact that
DEFINE_IDTENTRY_SYSVEC() doesn't call ack_APIC_irq() on its own and
all DEFINE_IDTENTRY_SYSVEC() users have to call it manually.

As the result all multi-CPU KVM guest hang on boot when
KVM_FEATURE_ASYNC_PF_INT is present. The breakage went unnoticed because no
KVM userspace (e.g. QEMU) currently set it (and thus async PF mechanism
is currently disabled) but we're about to change that.

Fixes: 26d05b368a5c0 ("Merge branch 'kvm-async-pf-int' into HEAD")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200908135350.355053-3-vkuznets@redhat.com>
Tested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agox86/kvm: properly use DEFINE_IDTENTRY_SYSVEC() macro
Vitaly Kuznetsov [Tue, 8 Sep 2020 13:53:49 +0000 (15:53 +0200)]
x86/kvm: properly use DEFINE_IDTENTRY_SYSVEC() macro

DEFINE_IDTENTRY_SYSVEC() already contains irqentry_enter()/
irqentry_exit().

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200908135350.355053-2-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agoKVM: VMX: Don't freeze guest when event delivery causes an APIC-access exit
Wanpeng Li [Wed, 19 Aug 2020 08:55:27 +0000 (16:55 +0800)]
KVM: VMX: Don't freeze guest when event delivery causes an APIC-access exit

According to SDM 27.2.4, Event delivery causes an APIC-access VM exit.
Don't report internal error and freeze guest when event delivery causes
an APIC-access exit, it is handleable and the event will be re-injected
during the next vmentry.

Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1597827327-25055-2-git-send-email-wanpengli@tencent.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agoKVM: SVM: avoid emulation with stale next_rip
Wanpeng Li [Sat, 12 Sep 2020 06:16:39 +0000 (02:16 -0400)]
KVM: SVM: avoid emulation with stale next_rip

svm->next_rip is reset in svm_vcpu_run() only after calling
svm_exit_handlers_fastpath(), which will cause SVM's
skip_emulated_instruction() to write a stale RIP.

We can move svm_exit_handlers_fastpath towards the end of
svm_vcpu_run().  To align VMX with SVM, keep svm_complete_interrupts()
close as well.

Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Paul K. <kronenpj@kronenpj.dyndns.org>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
[Also move vmcb_mark_all_clean before any possible write to the VMCB.
 - Paolo]

3 years agoMerge tag 'ceph-for-5.9-rc5' of git://github.com/ceph/ceph-client
Linus Torvalds [Fri, 11 Sep 2020 20:47:29 +0000 (13:47 -0700)]
Merge tag 'ceph-for-5.9-rc5' of git://github.com/ceph/ceph-client

Pull ceph fix from Ilya Dryomov:
 "Add missing capability checks in rbd, marked for stable"

* tag 'ceph-for-5.9-rc5' of git://github.com/ceph/ceph-client:
  rbd: require global CAP_SYS_ADMIN for mapping and unmapping

3 years agoMerge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Fri, 11 Sep 2020 20:43:05 +0000 (13:43 -0700)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux

Pull i2c updates from Wolfram Sang:
 "Usual driver bugfixes for the I2C subsystem"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: algo: pca: Reapply i2c bus settings after reset
  i2c: npcm7xx: Fix timeout calculation
  misc: eeprom: at24: register nvmem only after eeprom is ready to use

3 years agoMerge tag 'pm-5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Fri, 11 Sep 2020 18:59:14 +0000 (11:59 -0700)]
Merge tag 'pm-5.9-rc5' of git://git./linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These fix three pieces of documentation and add new CPU IDs to the
  Intel RAPL power capping driver.

  Specifics:

   - Add CPU IDs of the TigerLake Desktop, RocketLake and AlderLake
     chips to the Intel RAPL power capping driver (Zhang Rui).

   - Add the missing energy model performance domain item to the struct
     device kerneldoc comment (Randy Dunlap).

   - Fix the struct powercap_control_type kerneldoc comment to match the
     actual definition of that structure and add missing item to the
     struct powercap_zone_ops kerneldoc comment (Amit Kucheria)"

* tag 'pm-5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  powercap: make documentation reflect code
  PM: <linux/device.h>: fix @em_pd kernel-doc warning
  powercap/intel_rapl: add support for AlderLake
  powercap/intel_rapl: add support for RocketLake
  powercap/intel_rapl: add support for TigerLake Desktop

3 years agoMerge tag 'block-5.9-2020-09-11' of git://git.kernel.dk/linux-block
Linus Torvalds [Fri, 11 Sep 2020 18:55:28 +0000 (11:55 -0700)]
Merge tag 'block-5.9-2020-09-11' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:

 - Fix a regression in bdev partition locking (Christoph)

 - NVMe pull request from Christoph:
      - cancel async events before freeing them (David Milburn)
      - revert a broken race fix (James Smart)
      - fix command processing during resets (Sagi Grimberg)

 - Fix a kyber crash with requeued flushes (Omar)

 - Fix __bio_try_merge_page() same_page error for no merging (Ritesh)

* tag 'block-5.9-2020-09-11' of git://git.kernel.dk/linux-block:
  block: Set same_page to false in __bio_try_merge_page if ret is false
  nvme-fabrics: allow to queue requests for live queues
  block: only call sched requeue_request() for scheduled requests
  nvme-tcp: cancel async events before freeing event struct
  nvme-rdma: cancel async events before freeing event struct
  nvme-fc: cancel async events before freeing event struct
  nvme: Revert: Fix controller creation races with teardown flow
  block: restore a specific error code in bdev_del_partition

3 years agoMerge tag 'spi-fix-v5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Linus Torvalds [Fri, 11 Sep 2020 18:35:55 +0000 (11:35 -0700)]
Merge tag 'spi-fix-v5.9-rc4' of git://git./linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
 "There's some driver specific fixes here plus one core fix for memory
  leaks that could be triggered by a potential race condition when
  cleaning up after we have split transfers to fit into what the
  controller can support"

* tag 'spi-fix-v5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: stm32: fix pm_runtime_get_sync() error checking
  spi: Fix memory leak on splited transfers
  spi: spi-cadence-quadspi: Fix mapping of buffers for DMA reads
  spi: stm32: Rate-limit the 'Communication suspended' message
  spi: spi-loopback-test: Fix out-of-bounds read
  spi: spi-cadence-quadspi: Populate get_name() interface
  MAINTAINERS: add myself as maintainer for spi-fsl-dspi driver

3 years agoMerge tag 'regulator-fix-v5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 11 Sep 2020 18:25:55 +0000 (11:25 -0700)]
Merge tag 'regulator-fix-v5.9-rc4' of git://git./linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
 "The biggest set of fixes here is those from Michał Mirosław fixing
  some locking issues with coupled regulators that are triggered in
  cases where a coupled regulator is used by a device involved in
  fs_reclaim like eMMC storage.

  These are relatively serious for the affected systems, though the
  circumstances where they trigger are very rare"

* tag 'regulator-fix-v5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: pwm: Fix machine constraints application
  regulator: core: Fix slab-out-of-bounds in regulator_unlock_recursive()
  regulator: remove superfluous lock in regulator_resolve_coupling()
  regulator: cleanup regulator_ena_gpio_free()
  regulator: plug of_node leak in regulator_register()'s error path
  regulator: push allocation in set_consumer_device_supply() out of lock
  regulator: push allocations in create_regulator() outside of lock
  regulator: push allocation in regulator_ena_gpio_request() out of lock
  regulator: push allocation in regulator_init_coupling() outside of lock
  regulator: fix spelling mistake "Cant" -> "Can't"
  regulator: cros-ec-regulator: Add NULL test for devm_kmemdup call

3 years agoKVM: x86: always allow writing '0' to MSR_KVM_ASYNC_PF_EN
Vitaly Kuznetsov [Fri, 11 Sep 2020 09:31:47 +0000 (11:31 +0200)]
KVM: x86: always allow writing '0' to MSR_KVM_ASYNC_PF_EN

Even without in-kernel LAPIC we should allow writing '0' to
MSR_KVM_ASYNC_PF_EN as we're not enabling the mechanism. In
particular, QEMU with 'kernel-irqchip=off' fails to start
a guest with

qemu-system-x86_64: error: failed to set MSR 0x4b564d02 to 0x0

Fixes: 9d3c447c72fb2 ("KVM: X86: Fix async pf caused null-ptr-deref")
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200911093147.484565-1-vkuznets@redhat.com>
[Actually commit the version proposed by Sean Christopherson. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agoKVM: SVM: Periodically schedule when unregistering regions on destroy
David Rientjes [Tue, 25 Aug 2020 19:56:28 +0000 (12:56 -0700)]
KVM: SVM: Periodically schedule when unregistering regions on destroy

There may be many encrypted regions that need to be unregistered when a
SEV VM is destroyed.  This can lead to soft lockups.  For example, on a
host running 4.15:

watchdog: BUG: soft lockup - CPU#206 stuck for 11s! [t_virtual_machi:194348]
CPU: 206 PID: 194348 Comm: t_virtual_machi
RIP: 0010:free_unref_page_list+0x105/0x170
...
Call Trace:
 [<0>] release_pages+0x159/0x3d0
 [<0>] sev_unpin_memory+0x2c/0x50 [kvm_amd]
 [<0>] __unregister_enc_region_locked+0x2f/0x70 [kvm_amd]
 [<0>] svm_vm_destroy+0xa9/0x200 [kvm_amd]
 [<0>] kvm_arch_destroy_vm+0x47/0x200
 [<0>] kvm_put_kvm+0x1a8/0x2f0
 [<0>] kvm_vm_release+0x25/0x30
 [<0>] do_exit+0x335/0xc10
 [<0>] do_group_exit+0x3f/0xa0
 [<0>] get_signal+0x1bc/0x670
 [<0>] do_signal+0x31/0x130

Although the CLFLUSH is no longer issued on every encrypted region to be
unregistered, there are no other changes that can prevent soft lockups for
very large SEV VMs in the latest kernel.

Periodically schedule if necessary.  This still holds kvm->lock across the
resched, but since this only happens when the VM is destroyed this is
assumed to be acceptable.

Signed-off-by: David Rientjes <rientjes@google.com>
Message-Id: <alpine.DEB.2.23.453.2008251255240.2987727@chino.kir.corp.google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agoKVM: MIPS: Change the definition of kvm type
Huacai Chen [Thu, 10 Sep 2020 10:33:51 +0000 (18:33 +0800)]
KVM: MIPS: Change the definition of kvm type

MIPS defines two kvm types:

 #define KVM_VM_MIPS_TE          0
 #define KVM_VM_MIPS_VZ          1

In Documentation/virt/kvm/api.rst it is said that "You probably want to
use 0 as machine type", which implies that type 0 be the "automatic" or
"default" type. And, in user-space libvirt use the null-machine (with
type 0) to detect the kvm capability, which returns "KVM not supported"
on a VZ platform.

I try to fix it in QEMU but it is ugly:
https://lists.nongnu.org/archive/html/qemu-devel/2020-08/msg05629.html

And Thomas Huth suggests me to change the definition of kvm type:
https://lists.nongnu.org/archive/html/qemu-devel/2020-09/msg03281.html

So I define like this:

 #define KVM_VM_MIPS_AUTO        0
 #define KVM_VM_MIPS_VZ          1
 #define KVM_VM_MIPS_TE          2

Since VZ and TE cannot co-exists, using type 0 on a TE platform will
still return success (so old user-space tools have no problems on new
kernels); the advantage is that using type 0 on a VZ platform will not
return failure. So, the only problem is "new user-space tools use type
2 on old kernels", but if we treat this as a kernel bug, we can backport
this patch to old stable kernels.

Signed-off-by: Huacai Chen <chenhc@lemote.com>
Message-Id: <1599734031-28746-1-git-send-email-chenhc@lemote.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agoMerge tag 'mmc-v5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Linus Torvalds [Fri, 11 Sep 2020 17:19:27 +0000 (10:19 -0700)]
Merge tag 'mmc-v5.9-rc4' of git://git./linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:
 "MMC core:
   - sdio: Restore ~20% performance drop for SDHCI drivers, by using
     mmc_pre_req() and mmc_post_req() for SDIO requests.

  MMC host:
   - sdhci-of-esdhc: Fix support for erratum eSDHC7
   - mmc_spi: Allow the driver to be built when CONFIG_HAS_DMA is unset
   - sdhci-msm: Use retries to fix tuning
   - sdhci-acpi: Fix resume for eMMC HS400 mode"

* tag 'mmc-v5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: sdio: Use mmc_pre_req() / mmc_post_req()
  mmc: sdhci-of-esdhc: Don't walk device-tree on every interrupt
  mmc: mmc_spi: Allow the driver to be built when CONFIG_HAS_DMA is unset
  mmc: sdhci-msm: Add retries when all tuning phases are found valid
  mmc: sdhci-acpi: Clear amd_sdhci_host on reset

3 years agokvm x86/mmu: use KVM_REQ_MMU_SYNC to sync when needed
Lai Jiangshan [Wed, 2 Sep 2020 13:54:21 +0000 (21:54 +0800)]
kvm x86/mmu: use KVM_REQ_MMU_SYNC to sync when needed

When kvm_mmu_get_page() gets a page with unsynced children, the spt
pagetable is unsynchronized with the guest pagetable. But the
guest might not issue a "flush" operation on it when the pagetable
entry is changed from zero or other cases. The hypervisor has the
responsibility to synchronize the pagetables.

KVM behaved as above for many years, But commit 8c8560b83390
("KVM: x86/mmu: Use KVM_REQ_TLB_FLUSH_CURRENT for MMU specific flushes")
inadvertently included a line of code to change it without giving any
reason in the changelog. It is clear that the commit's intention was to
change KVM_REQ_TLB_FLUSH -> KVM_REQ_TLB_FLUSH_CURRENT, so we don't
needlessly flush other contexts; however, one of the hunks changed
a nearby KVM_REQ_MMU_SYNC instead.  This patch changes it back.

Link: https://lore.kernel.org/lkml/20200320212833.3507-26-sean.j.christopherson@intel.com/
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <20200902135421.31158-1-jiangshanlai@gmail.com>
fixes: 8c8560b83390 ("KVM: x86/mmu: Use KVM_REQ_TLB_FLUSH_CURRENT for MMU specific flushes")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agoKVM: nVMX: Fix the update value of nested load IA32_PERF_GLOBAL_CTRL control
Chenyi Qiang [Fri, 28 Aug 2020 08:56:21 +0000 (16:56 +0800)]
KVM: nVMX: Fix the update value of nested load IA32_PERF_GLOBAL_CTRL control

A minor fix for the update of VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL field
in exit_ctls_high.

Fixes: 03a8871add95 ("KVM: nVMX: Expose load IA32_PERF_GLOBAL_CTRL
VM-{Entry,Exit} control")
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20200828085622.8365-5-chenyi.qiang@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agoKVM: fix memory leak in kvm_io_bus_unregister_dev()
Rustam Kovhaev [Mon, 7 Sep 2020 18:55:35 +0000 (11:55 -0700)]
KVM: fix memory leak in kvm_io_bus_unregister_dev()

when kmalloc() fails in kvm_io_bus_unregister_dev(), before removing
the bus, we should iterate over all other devices linked to it and call
kvm_iodevice_destructor() for them

Fixes: 90db10434b16 ("KVM: kvm_io_bus_unregister_dev() should never fail")
Cc: stable@vger.kernel.org
Reported-and-tested-by: syzbot+f196caa45793d6374707@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=f196caa45793d6374707
Signed-off-by: Rustam Kovhaev <rkovhaev@gmail.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200907185535.233114-1-rkovhaev@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agoKVM: Check the allocation of pv cpu mask
Haiwei Li [Tue, 1 Sep 2020 11:41:37 +0000 (19:41 +0800)]
KVM: Check the allocation of pv cpu mask

check the allocation of per-cpu __pv_cpu_mask. Initialize ops only when
successful.

Signed-off-by: Haiwei Li <lihaiwei@tencent.com>
Message-Id: <d59f05df-e6d3-3d31-a036-cc25a2b2f33f@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agoKVM: nVMX: Update VMCS02 when L2 PAE PDPTE updates detected
Peter Shier [Thu, 20 Aug 2020 23:05:45 +0000 (16:05 -0700)]
KVM: nVMX: Update VMCS02 when L2 PAE PDPTE updates detected

When L2 uses PAE, L0 intercepts of L2 writes to CR0/CR3/CR4 call
load_pdptrs to read the possibly updated PDPTEs from the guest
physical address referenced by CR3.  It loads them into
vcpu->arch.walk_mmu->pdptrs and sets VCPU_EXREG_PDPTR in
vcpu->arch.regs_dirty.

At the subsequent assumed reentry into L2, the mmu will call
vmx_load_mmu_pgd which calls ept_load_pdptrs. ept_load_pdptrs sees
VCPU_EXREG_PDPTR set in vcpu->arch.regs_dirty and loads
VMCS02.GUEST_PDPTRn from vcpu->arch.walk_mmu->pdptrs[]. This all works
if the L2 CRn write intercept always resumes L2.

The resume path calls vmx_check_nested_events which checks for
exceptions, MTF, and expired VMX preemption timers. If
vmx_check_nested_events finds any of these conditions pending it will
reflect the corresponding exit into L1. Live migration at this point
would also cause a missed immediate reentry into L2.

After L1 exits, vmx_vcpu_run calls vmx_register_cache_reset which
clears VCPU_EXREG_PDPTR in vcpu->arch.regs_dirty.  When L2 next
resumes, ept_load_pdptrs finds VCPU_EXREG_PDPTR clear in
vcpu->arch.regs_dirty and does not load VMCS02.GUEST_PDPTRn from
vcpu->arch.walk_mmu->pdptrs[]. prepare_vmcs02 will then load
VMCS02.GUEST_PDPTRn from vmcs12->pdptr0/1/2/3 which contain the stale
values stored at last L2 exit. A repro of this bug showed L2 entering
triple fault immediately due to the bad VMCS02.GUEST_PDPTRn values.

When L2 is in PAE paging mode add a call to ept_load_pdptrs before
leaving L2. This will update VMCS02.GUEST_PDPTRn if they are dirty in
vcpu->arch.walk_mmu->pdptrs[].

Tested:
kvm-unit-tests with new directed test: vmx_mtf_pdpte_test.
Verified that test fails without the fix.

Also ran Google internal VMM with an Ubuntu 16.04 4.4.0-83 guest running a
custom hypervisor with a 32-bit Windows XP L2 guest using PAE. Prior to fix
would repro readily. Ran 14 simultaneous L2s for 140 iterations with no
failures.

Signed-off-by: Peter Shier <pshier@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Message-Id: <20200820230545.2411347-1-pshier@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agoMerge tag 'kvmarm-fixes-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmar...
Paolo Bonzini [Fri, 11 Sep 2020 17:12:11 +0000 (13:12 -0400)]
Merge tag 'kvmarm-fixes-5.9-1' of git://git./linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for Linux 5.9, take #1

- Multiple stolen time fixes, with a new capability to match x86
- Fix for hugetlbfs mappings when PUD and PMD are the same level
- Fix for hugetlbfs mappings when PTE mappings are enforced
  (dirty logging, for example)
- Fix tracing output of 64bit values

3 years agoMerge tag 'drm-fixes-2020-09-11' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 11 Sep 2020 17:10:27 +0000 (10:10 -0700)]
Merge tag 'drm-fixes-2020-09-11' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Regular fixes, not much a major amount. One thing though is Laurent
  fixed some Kconfig issues, and I'm carrying the rapidio kconfig change
  so the drm one for xlnx driver works. He hadn't got a response from
  rapidio maintainers.

  Otherwise, virtio, sun4i, tve200, ingenic have some fixes, one audio
  fix for i915 and a core docs fix.

  kconfig:
   - rapidio/xlnx kconfig fix

  core:
   - Documentation fix

  i915:
   - audio regression fix

  virtio:
   - Fix double free in virtio
   - Fix virtio unblank
   - Remove output->enabled from virtio, as it should use crtc_state

  sun4i:
   - Add missing put_device in sun4i, and other fixes
   - Handle sun4i alpha on lowest plane correctly

  tv200:
   - Fix tve200 enable/disable

  ingenic
   - Small ingenic fixes"

* tag 'drm-fixes-2020-09-11' of git://anongit.freedesktop.org/drm/drm:
  drm/i915: fix regression leading to display audio probe failure on GLK
  drm: xlnx: dpsub: Fix DMADEVICES Kconfig dependency
  rapidio: Replace 'select' DMAENGINES 'with depends on'
  drm/virtio: drop virtio_gpu_output->enabled
  drm/sun4i: backend: Disable alpha on the lowest plane on the A20
  drm/sun4i: backend: Support alpha property on lowest plane
  drm/sun4i: Fix DE2 YVU handling
  drm/tve200: Stabilize enable/disable
  dma-buf: fence-chain: Document missing dma_fence_chain_init() parameter in kerneldoc
  dma-buf: Fix kerneldoc of dma_buf_set_name()
  drm/virtio: fix unblank
  Documentation: fix dma-buf.rst underline length warning
  drm/sun4i: Fix dsi dcs long write function
  drm/ingenic: Fix driver not probing when IPU port is missing
  drm/ingenic: Fix leak of device_node pointer
  drm/sun4i: add missing put_device() call in sun8i_r40_tcon_tv_set_mux()
  drm/virtio: Revert "drm/virtio: Call the right shmem helpers"

3 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Linus Torvalds [Fri, 11 Sep 2020 17:02:36 +0000 (10:02 -0700)]
Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
 "A number of driver bug fixes and a few recent regressions:

   - Several bug fixes for bnxt_re. Crashing, incorrect data reported,
     and corruption on new HW

   - Memory leak and crash in rxe

   - Fix sysfs corruption in rxe if the netdev name is too long

   - Fix a crash on error unwind in the new cq_pool code

   - Fix kobject panics in rtrs by working device lifetime properly

   - Fix a data corruption bug in iser target related to misaligned
     buffers"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  IB/isert: Fix unaligned immediate-data handling
  RDMA/rtrs-srv: Set .release function for rtrs srv device during device init
  RDMA/bnxt_re: Remove set but not used variable 'qplib_ctx'
  RDMA/core: Fix reported speed and width
  RDMA/core: Fix unsafe linked list traversal after failing to allocate CQ
  RDMA/bnxt_re: Remove the qp from list only if the qp destroy succeeds
  RDMA/bnxt_re: Fix driver crash on unaligned PSN entry address
  RDMA/bnxt_re: Restrict the max_gids to 256
  RDMA/bnxt_re: Static NQ depth allocation
  RDMA/bnxt_re: Fix the qp table indexing
  RDMA/bnxt_re: Do not report transparent vlan from QP1
  RDMA/mlx4: Read pkey table length instead of hardcoded value
  RDMA/rxe: Fix panic when calling kmem_cache_create()
  RDMA/rxe: Fix memleak in rxe_mem_init_user
  RDMA/rxe: Fix the parent sysfs read when the interface has 15 chars
  RDMA/rtrs-srv: Replace device_register with device_initialize and device_add

3 years agogcov: add support for GCC 10.1
Peter Oberparleiter [Thu, 10 Sep 2020 12:52:01 +0000 (14:52 +0200)]
gcov: add support for GCC 10.1

Using gcov to collect coverage data for kernels compiled with GCC 10.1
causes random malfunctions and kernel crashes.  This is the result of a
changed GCOV_COUNTERS value in GCC 10.1 that causes a mismatch between
the layout of the gcov_info structure created by GCC profiling code and
the related structure used by the kernel.

Fix this by updating the in-kernel GCOV_COUNTERS value.  Also re-enable
config GCOV_KERNEL for use with GCC 10.

Reported-by: Colin Ian King <colin.king@canonical.com>
Reported-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Tested-by: Leon Romanovsky <leonro@nvidia.com>
Tested-and-Acked-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agoMerge branch 'powercap'
Rafael J. Wysocki [Fri, 11 Sep 2020 14:46:01 +0000 (16:46 +0200)]
Merge branch 'powercap'

* powercap:
  powercap: make documentation reflect code
  powercap/intel_rapl: add support for AlderLake
  powercap/intel_rapl: add support for RocketLake
  powercap/intel_rapl: add support for TigerLake Desktop