linux-2.6-microblaze.git
9 years ago x86: Pack loops tightly as well
Ingo Molnar [Sun, 17 May 2015 05:56:54 +0000 (07:56 +0200)]
x86: Pack loops tightly as well

Packing loops tightly (-falign-loops=1) is beneficial to code size:

     text        data    bss     dec              filename
 12566391        1617840 1089536 15273767         vmlinux.align.16-byte
 12224951        1617840 1089536 14932327         vmlinux.align.1-byte
 11976567        1617840 1089536 14683943         vmlinux.align.1-byte.funcs-1-byte
 11903735        1617840 1089536 14611111         vmlinux.align.1-byte.funcs-1-byte.loops-1-byte

Which reduces the size of the kernel by another 0.6%, so the
total combined size reduction of the alignment-packing
patches is ~5.5%.
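
The percentages can be re-derived from the table. The sketch below
is only an illustration: it assumes the 'text' column is the
relevant measure and that each step is taken relative to the
previous build. Computed this way the cumulative figure comes out
a little over 5%, in the same ballpark as the number quoted above.

```python
# Text-segment sizes in bytes, copied from the table above.
text_sizes = [
    ("vmlinux.align.16-byte", 12566391),
    ("vmlinux.align.1-byte", 12224951),
    ("vmlinux.align.1-byte.funcs-1-byte", 11976567),
    ("vmlinux.align.1-byte.funcs-1-byte.loops-1-byte", 11903735),
]


def reduction_pct(before, after):
    """Relative size reduction in percent."""
    return 100.0 * (before - after) / before


baseline = text_sizes[0][1]
steps = []  # (build name, step reduction %, cumulative reduction %)
for (_, prev), (name, cur) in zip(text_sizes, text_sizes[1:]):
    step = reduction_pct(prev, cur)
    total = reduction_pct(baseline, cur)
    steps.append((name, step, total))
    print(f"{name}: step {step:.2f}%, cumulative {total:.2f}%")
```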

The x86 decoder bandwidth and caching arguments laid out in:

  be6cb02779ca ("x86: Align jump targets to 1-byte boundaries")

apply to loop alignment as well.

Furthermore, modern CPU uarchs have a loop cache/buffer that
is an L0 cache before even any uop cache, covering a few
dozen of the most recently executed instructions.

This loop cache generally does not have the 16-byte alignment
restrictions of the uop cache.

Now loop alignment can still be beneficial if:

 - a loop is cache-hot and its surroundings are not.

 - if the loop is so cache hot that the instruction
   flow becomes x86 decoder bandwidth limited

But loop alignment is harmful if:

 - a loop is cache-cold

 - a loop's surroundings are cache-hot as well

 - two cache-hot loops are close to each other

 - if the loop fits into the loop cache

 - if the code flow is not decoder bandwidth limited

and I'd argue that the latter five scenarios are much
more common in the kernel, as our hottest loops are
typically:

 - pointer chasing: this should fit into the loop cache
   in most cases and is typically data cache and address
   generation limited

 - generic memory ops (memset, memcpy, etc.): these generally
   fit into the loop cache as well, and are likewise data
   cache limited.

So this patch packs loop addresses tightly as well.

Acked-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jason Low <jason.low2@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Link: http://lkml.kernel.org/r/20150410123017.GB19918@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years ago x86: Align jump targets to 1-byte boundaries
Ingo Molnar [Fri, 10 Apr 2015 12:08:46 +0000 (14:08 +0200)]
x86: Align jump targets to 1-byte boundaries

The following NOP in a hot function caught my attention:

  >   5a: 66 0f 1f 44 00 00     nopw   0x0(%rax,%rax,1)

That's a dead NOP that bloats the function a bit, added for the
default 16-byte alignment that GCC applies for jump targets.
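
The size of that NOP is consistent with padding up to the next
16-byte boundary. A small sketch of the arithmetic follows; the
helper name is made up for illustration, and the offset and bytes
are taken from the disassembly line above:

```python
def padding_to_align(offset, alignment):
    """Bytes of padding needed so that `offset` becomes a multiple of `alignment`."""
    return (-offset) % alignment


nop_offset = 0x5A                                # offset of the NOP in the listing
nop_bytes = bytes.fromhex("66 0f 1f 44 00 00")   # encoding of the NOP itself

pad16 = padding_to_align(nop_offset, 16)  # padding under 16-byte alignment
pad1 = padding_to_align(nop_offset, 1)    # padding under 1-byte "alignment"

print(f"16-byte alignment: {pad16} bytes of padding, NOP is {len(nop_bytes)} bytes")
print(f"next aligned address: {nop_offset + pad16:#x}")
print(f"1-byte alignment: {pad1} bytes of padding")
```

With 1-byte alignment the padding is always zero, which is why the
patch amounts to no alignment at all.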

I realize that x86 CPU manufacturers recommend 16-byte jump
target alignments (it's in the Intel optimization manual),
to help their relatively narrow decoder prefetch alignment
and uop cache constraints, but the cost of that is very
significant:

        text           data       bss         dec      filename
    12566391        1617840   1089536    15273767      vmlinux.align.16-byte
    12224951        1617840   1089536    14932327      vmlinux.align.1-byte

By using 1-byte jump target alignment (i.e. no alignment at all)
we get an almost 3% reduction in kernel size (!) - and a
probably similar reduction in I$ footprint.

Now, the usual justification for jump target alignment is the
following:

 - modern decoders tend to have 16-byte (effective) decoder
   prefetch windows. (AMD documents it higher but measurements
   suggest the effective prefetch window on current uarchs is
   still around 16 bytes)

 - on Intel there's also the uop-cache with cachelines that have
   16-byte granularity and limited associativity.

 - older x86 uarchs had a penalty for decoder fetches that crossed
   16-byte boundaries. These limits are mostly gone from recent
   uarchs.

So if a forward jump target is aligned to a cacheline boundary then
prefetches will start from a new prefetch-cacheline and there's
a higher chance for decoding in fewer steps and packing tightly.

But I think that argument is flawed for typical optimized kernel
code flows: forward jumps often go to 'cold' (uncommon) pieces
of code, and aligning cold code to cache lines does not bring a
lot of advantages (they are uncommon), while it causes
collateral damage:

 - their alignment 'spreads out' the cache footprint, it shifts
   followup hot code further out

 - plus it slows down even 'cold' code that immediately follows 'hot'
   code (like in the above case), which could have benefited from the
   partial cacheline that comes off the end of hot code.

But even in the cache-hot case the 16 byte alignment brings
disadvantages:

 - it spreads out the cache footprint, possibly making the code
   fall out of the L1 I$.

 - On Intel CPUs, recent microarchitectures have plenty of
   uop cache (typically doubling every 3 years) - while the
   size of the L1 cache grows much less aggressively. So
   workloads are rarely uop cache limited.

The only situation where alignment might matter is tight
loops that could fit into a single 16 byte chunk - but those
are pretty rare in the kernel: if they exist they tend
to be pointer chasing or generic memory ops, which both tend
to be cache miss (or cache allocation) intensive and are not
decoder bandwidth limited.

So the balance of arguments strongly favors packing kernel
instructions tightly versus maximizing for decoder bandwidth:
this patch changes the jump target alignment from 16 bytes
to 1 byte (tightly packed, unaligned).

Acked-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jason Low <jason.low2@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Link: http://lkml.kernel.org/r/20150410120846.GA17101@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years ago x86/asm/uaccess: Get rid of copy_user_nocache_64.S
Borislav Petkov [Wed, 13 May 2015 17:42:24 +0000 (19:42 +0200)]
x86/asm/uaccess: Get rid of copy_user_nocache_64.S

Move __copy_user_nocache() to arch/x86/lib/copy_user_64.S and
kill the containing file.

No functionality change.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1431538944-27724-4-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years ago x86/asm/uaccess: Unify the ALIGN_DESTINATION macro
Borislav Petkov [Wed, 13 May 2015 17:42:23 +0000 (19:42 +0200)]
x86/asm/uaccess: Unify the ALIGN_DESTINATION macro

Pull it up into the header and kill duplicate versions.
Separately, both macros are identical:

 35948b2bd3431aee7149e85cfe4becbc  /tmp/a
 35948b2bd3431aee7149e85cfe4becbc  /tmp/b
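
The same kind of check can be scripted. The sketch below is only
an illustration of comparing two extracted macro bodies by
checksum; the macro text is a hypothetical stand-in, not the real
contents of /tmp/a and /tmp/b:

```python
import hashlib


def md5_hex(text):
    """MD5 digest of a text blob, as printed by md5sum."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Hypothetical stand-ins for the two extracted macro bodies.
macro_a = ".macro ALIGN_DESTINATION\n\t/* body elided */\n.endm\n"
macro_b = ".macro ALIGN_DESTINATION\n\t/* body elided */\n.endm\n"

identical = md5_hex(macro_a) == md5_hex(macro_b)
print(md5_hex(macro_a), "a")
print(md5_hex(macro_b), "b")
print("identical" if identical else "different")
```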

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1431538944-27724-3-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years ago x86/asm/uaccess: Remove FIX_ALIGNMENT define from copy_user_nocache_64.S:
Borislav Petkov [Wed, 13 May 2015 17:42:22 +0000 (19:42 +0200)]
x86/asm/uaccess: Remove FIX_ALIGNMENT define from copy_user_nocache_64.S:

No code changed:

  # arch/x86/lib/copy_user_nocache_64.o:

   text    data     bss     dec     hex filename
    390       0       0     390     186 copy_user_nocache_64.o.before
    390       0       0     390     186 copy_user_nocache_64.o.after

md5:
   7fa0577b28700af89d3a67a8b590426e  copy_user_nocache_64.o.before.asm
   7fa0577b28700af89d3a67a8b590426e  copy_user_nocache_64.o.after.asm

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1431538944-27724-2-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years ago x86/alternatives: Switch AMD F15h and later to the P6 NOPs
Borislav Petkov [Mon, 11 May 2015 08:15:46 +0000 (10:15 +0200)]
x86/alternatives: Switch AMD F15h and later to the P6 NOPs

Software optimization guides for both F15h and F16h cite those
NOPs as the optimal ones. A microbenchmark confirms that
actually even older families are better with the single-insn
NOPs so switch to them for the alternatives.

The cycle counts below include the loop overhead of the measurement
but that overhead is the same across all runs.

F10h, revE:
-----------
Running NOP tests, 1000 NOPs x 1000000 repetitions

K8:
                      90     288.212282 cycles
                   66 90     288.220840 cycles
                66 66 90     288.219447 cycles
             66 66 66 90     288.223204 cycles
          66 66 90 66 90     571.393424 cycles
       66 66 90 66 66 90     571.374919 cycles
    66 66 66 90 66 66 90     572.249281 cycles
 66 66 66 90 66 66 66 90     571.388651 cycles

P6:
                      90     288.214193 cycles
                   66 90     288.225550 cycles
                0f 1f 00     288.224441 cycles
             0f 1f 40 00     288.225030 cycles
          0f 1f 44 00 00     288.233558 cycles
       66 0f 1f 44 00 00     324.792342 cycles
    0f 1f 80 00 00 00 00     325.657462 cycles
 0f 1f 84 00 00 00 00 00     430.246643 cycles

F14h:
----
Running NOP tests, 1000 NOPs x 1000000 repetitions

K8:
                      90     510.404890 cycles
                   66 90     510.432117 cycles
                66 66 90     510.561858 cycles
             66 66 66 90     510.541865 cycles
          66 66 90 66 90    1014.192782 cycles
       66 66 90 66 66 90    1014.226546 cycles
    66 66 66 90 66 66 90    1014.334299 cycles
 66 66 66 90 66 66 66 90    1014.381205 cycles

P6:
                      90     510.436710 cycles
                   66 90     510.448229 cycles
                0f 1f 00     510.545100 cycles
             0f 1f 40 00     510.502792 cycles
          0f 1f 44 00 00     510.589517 cycles
       66 0f 1f 44 00 00     510.611462 cycles
    0f 1f 80 00 00 00 00     511.166794 cycles
 0f 1f 84 00 00 00 00 00     511.651641 cycles

F15h:
-----
Running NOP tests, 1000 NOPs x 1000000 repetitions

K8:
                      90     243.128396 cycles
                   66 90     243.129883 cycles
                66 66 90     243.131631 cycles
             66 66 66 90     242.499324 cycles
          66 66 90 66 90     481.829083 cycles
       66 66 90 66 66 90     481.884413 cycles
    66 66 66 90 66 66 90     481.851446 cycles
 66 66 66 90 66 66 66 90     481.409220 cycles

P6:
                      90     243.127026 cycles
                   66 90     243.130711 cycles
                0f 1f 00     243.122747 cycles
             0f 1f 40 00     242.497617 cycles
          0f 1f 44 00 00     245.354461 cycles
       66 0f 1f 44 00 00     361.930417 cycles
    0f 1f 80 00 00 00 00     362.844944 cycles
 0f 1f 84 00 00 00 00 00     480.514948 cycles

F16h:
-----
Running NOP tests, 1000 NOPs x 1000000 repetitions

K8:
                      90     507.793298 cycles
                   66 90     507.789636 cycles
                66 66 90     507.826490 cycles
             66 66 66 90     507.859075 cycles
          66 66 90 66 90    1008.663129 cycles
       66 66 90 66 66 90    1008.696259 cycles
    66 66 66 90 66 66 90    1008.692517 cycles
 66 66 66 90 66 66 66 90    1008.755399 cycles

P6:
                      90     507.795232 cycles
                   66 90     507.794761 cycles
                0f 1f 00     507.834901 cycles
             0f 1f 40 00     507.822629 cycles
          0f 1f 44 00 00     507.838493 cycles
       66 0f 1f 44 00 00     507.908597 cycles
    0f 1f 80 00 00 00 00     507.946417 cycles
 0f 1f 84 00 00 00 00 00     507.954960 cycles
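
One way to read the K8 columns: the longer K8 paddings are built
from two instructions, and the cycle count roughly doubles with
them. The sketch below uses the F15h K8 numbers from above and
assumes that every instruction in a K8 padding is a run of 0x66
prefixes ending in the one-byte NOP opcode 0x90, so counting 0x90
bytes counts instructions. That reading is an interpretation of
the data, not something measured separately.

```python
# K8-style NOP encodings and F15h cycle counts, copied from the table above.
k8_f15h = [
    ("90", 243.128396),
    ("66 90", 243.129883),
    ("66 66 90", 243.131631),
    ("66 66 66 90", 242.499324),
    ("66 66 90 66 90", 481.829083),
    ("66 66 90 66 66 90", 481.884413),
    ("66 66 66 90 66 66 90", 481.851446),
    ("66 66 66 90 66 66 66 90", 481.409220),
]


def k8_insn_count(hex_bytes):
    """Number of instructions in a K8 padding (assumption: one per 0x90 byte)."""
    return bytes.fromhex(hex_bytes).count(0x90)


per_insn = []  # (padding length in bytes, instruction count, cycles per instruction)
for encoding, cycles in k8_f15h:
    n = k8_insn_count(encoding)
    per_insn.append((len(bytes.fromhex(encoding)), n, cycles / n))
    print(f"{encoding:>23}  {n} insn  {cycles / n:10.3f} cycles/insn")
```

The per-instruction cost stays nearly flat, which fits the
argument for the single-insn P6 NOPs.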

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1431332153-18566-2-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years ago x86/asm/entry: Fix remaining use of SYSCALL_VECTOR
Ingo Molnar [Mon, 11 May 2015 05:17:04 +0000 (07:17 +0200)]
x86/asm/entry: Fix remaining use of SYSCALL_VECTOR

Commit:

  51bb92843edc ("x86/asm/entry: Remove SYSCALL_VECTOR")

Converted most uses of SYSCALL_VECTOR to IA32_SYSCALL_VECTOR, but
forgot about lguest.

Cc: Brian Gerst <brgerst@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1431185813-15413-4-git-send-email-brgerst@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years ago x86/asm/entry/irq: Clean up IRQn_VECTOR macros
Brian Gerst [Sat, 9 May 2015 15:36:53 +0000 (11:36 -0400)]
x86/asm/entry/irq: Clean up IRQn_VECTOR macros

Since the ISA irqs are in a single block, use
ISA_IRQ_VECTOR(irq) instead of individual macros.
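
The idea of one parameterized macro for a contiguous block can be
modelled as below. This is a sketch under assumptions: the base
value and the rounding formula are believed to match
arch/x86/include/asm/irq_vectors.h but should be checked against
the tree, and the Python function only stands in for the C macro.

```python
# Assumed value of FIRST_EXTERNAL_VECTOR; verify against irq_vectors.h.
FIRST_EXTERNAL_VECTOR = 0x20


def isa_irq_vector(irq):
    """Vector number for ISA IRQ `irq` (0..15), laid out as one block."""
    if not 0 <= irq < 16:
        raise ValueError("ISA IRQs are numbered 0..15")
    # Assumed layout: the block starts at the next 16-aligned vector
    # after FIRST_EXTERNAL_VECTOR.
    base = (FIRST_EXTERNAL_VECTOR + 16) & ~15
    return base + irq


vectors = [isa_irq_vector(irq) for irq in range(16)]
print([hex(v) for v in vectors])
```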

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1431185813-15413-5-git-send-email-brgerst@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years ago x86/asm/entry: Remove SYSCALL_VECTOR
Brian Gerst [Sat, 9 May 2015 15:36:52 +0000 (11:36 -0400)]
x86/asm/entry: Remove SYSCALL_VECTOR

Use IA32_SYSCALL_VECTOR for both compat and native.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1431185813-15413-4-git-send-email-brgerst@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years ago x86/asm/entry/irq: Remove unused invalidate_interrupt prototypes
Brian Gerst [Sat, 9 May 2015 15:36:51 +0000 (11:36 -0400)]
x86/asm/entry/irq: Remove unused invalidate_interrupt prototypes

The invalidate_interrupt* functions no longer exist.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1431185813-15413-3-git-send-email-brgerst@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years ago x86/irq: Merge irq_regs & irq_stat
Brian Gerst [Sat, 9 May 2015 15:36:50 +0000 (11:36 -0400)]
x86/irq: Merge irq_regs & irq_stat

Move irq_regs and irq_stat definitions to irq.c.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1431185813-15413-2-git-send-email-brgerst@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years ago x86/entry: Define 'cpu_current_top_of_stack' for 64-bit code
Denys Vlasenko [Fri, 24 Apr 2015 15:31:35 +0000 (17:31 +0200)]
x86/entry: Define 'cpu_current_top_of_stack' for 64-bit code

32-bit code has PER_CPU_VAR(cpu_current_top_of_stack).
64-bit code uses the somewhat more obscure PER_CPU_VAR(cpu_tss + TSS_sp0).

Define the 'cpu_current_top_of_stack' macro on CONFIG_X86_64
as well so that the PER_CPU_VAR(cpu_current_top_of_stack)
expression can be used in both 32-bit and 64-bit code.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1429889495-27850-3-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years ago x86/entry: Remove unused 'kernel_stack' per-cpu variable
Denys Vlasenko [Fri, 24 Apr 2015 15:31:34 +0000 (17:31 +0200)]
x86/entry: Remove unused 'kernel_stack' per-cpu variable

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1429889495-27850-2-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years ago x86/entry: Stop using PER_CPU_VAR(kernel_stack)
Denys Vlasenko [Fri, 24 Apr 2015 15:31:33 +0000 (17:31 +0200)]
x86/entry: Stop using PER_CPU_VAR(kernel_stack)

PER_CPU_VAR(kernel_stack) is redundant:

  - On the 64-bit build, we can use PER_CPU_VAR(cpu_tss + TSS_sp0).
  - On the 32-bit build, we can use PER_CPU_VAR(cpu_current_top_of_stack).

PER_CPU_VAR(kernel_stack) will be deleted by a separate change.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1429889495-27850-1-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years ago x86, selftests: Add a test for the "sysret_ss_attrs" bug
Andy Lutomirski [Fri, 24 Apr 2015 22:09:19 +0000 (15:09 -0700)]
x86, selftests: Add a test for the "sysret_ss_attrs" bug

On AMD CPUs, SYSRET can return with a valid SS descriptor
with the hidden attributes set to an unusable state.  Make sure
the kernel doesn't let this happen.  This detects an
as-yet-unfixed regression.

Note that the 64-bit version of this test fails on AMD CPUs on
all kernel versions, although the issue in the 64-bit case is
much less severe than in the 32-bit case.

Reported-by: Brian Gerst <brgerst@gmail.com>
Tested-by: Denys Vlasenko <dvlasenk@redhat.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Tests: e7d6eefaaa44 ("x86/vdso32/syscall.S: Do not load __USER32_DS to %ss")
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/resend_4d740841bac383742949e2fefb03982736595087.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years ago Merge branch 'linus' into x86/asm, before applying dependent patch
Ingo Molnar [Fri, 8 May 2015 11:33:33 +0000 (13:33 +0200)]
Merge branch 'linus' into x86/asm, before applying dependent patch

Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years ago x86: Force inlining of atomic ops
Denys Vlasenko [Fri, 8 May 2015 10:26:02 +0000 (12:26 +0200)]
x86: Force inlining of atomic ops

With both gcc 4.7.2 and 4.9.2, sometimes gcc mysteriously
doesn't inline very small functions we expect to be inlined:

$ nm --size-sort vmlinux | grep -iF ' t ' | uniq -c | grep -v '^ *1 ' | sort -rn
    473 000000000000000b t spin_unlock_irqrestore
    449 000000000000005f t rcu_read_unlock
    355 0000000000000009 t atomic_inc                <== THIS
    353 000000000000006e t rcu_read_lock
    350 0000000000000075 t rcu_read_lock_sched_held
    291 000000000000000b t spin_unlock
    266 0000000000000019 t arch_local_irq_restore
    215 000000000000000b t spin_lock
    180 0000000000000011 t kzalloc
    165 0000000000000012 t list_add_tail
    161 0000000000000019 t arch_local_save_flags
    153 0000000000000016 t test_and_set_bit
    134 000000000000000b t spin_unlock_irq
    134 0000000000000009 t atomic_dec                <== THIS
    130 000000000000000b t spin_unlock_bh
    122 0000000000000010 t brelse
    120 0000000000000016 t test_and_clear_bit
    120 000000000000000b t spin_lock_irq
    119 000000000000001e t get_dma_ops
    117 0000000000000053 t cpumask_next
    116 0000000000000036 t kref_get
    114 000000000000001a t schedule_work
    106 000000000000000b t spin_lock_bh
    103 0000000000000019 t arch_local_irq_disable
...

Note the sizes of the marked functions. They are merely 9 bytes long!
Selecting functions with 'atomic' in their names:

    355 0000000000000009 t atomic_inc
    134 0000000000000009 t atomic_dec
     98 0000000000000014 t atomic_dec_and_test
     31 000000000000000e t atomic_add_return
     27 000000000000000a t atomic64_inc
     26 000000000000002f t kmap_atomic
     24 0000000000000009 t atomic_add
     12 0000000000000009 t atomic_sub
     10 0000000000000021 t __atomic_add_unless
     10 000000000000000a t atomic64_add
      5 000000000000001f t __atomic_add_unless.constprop.7
      5 000000000000000a t atomic64_dec
      4 000000000000001f t __atomic_add_unless.constprop.18
      4 000000000000001f t __atomic_add_unless.constprop.12
      4 000000000000001f t __atomic_add_unless.constprop.10
      3 000000000000001f t __atomic_add_unless.constprop.13
      3 0000000000000011 t atomic64_add_return
      2 000000000000001f t __atomic_add_unless.constprop.9
      2 000000000000001f t __atomic_add_unless.constprop.8
      2 000000000000001f t __atomic_add_unless.constprop.6
      2 000000000000001f t __atomic_add_unless.constprop.5
      2 000000000000001f t __atomic_add_unless.constprop.3
      2 000000000000001f t __atomic_add_unless.constprop.22
      2 000000000000001f t __atomic_add_unless.constprop.14
      2 000000000000001f t __atomic_add_unless.constprop.11
      2 000000000000001e t atomic_dec_if_positive
      2 0000000000000014 t atomic_inc_and_test
      2 0000000000000011 t atomic_add_return.constprop.4
      2 0000000000000011 t atomic_add_return.constprop.17
      2 0000000000000011 t atomic_add_return.constprop.16
      2 000000000000000d t atomic_inc.constprop.4
      2 000000000000000c t atomic_cmpxchg
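
For reference, the counting done by the shell pipeline above can
be sketched as follows. The input lines are hypothetical samples
in the "size type name" layout that nm --size-sort prints; the
function name is made up for illustration.

```python
from itertools import groupby

# Hypothetical `nm --size-sort vmlinux` output lines ("size type name").
nm_lines = [
    "0000000000000009 t atomic_dec",
    "0000000000000009 t atomic_dec",
    "0000000000000009 t atomic_inc",
    "0000000000000009 t atomic_inc",
    "0000000000000009 t atomic_inc",
    "0000000000000009 T exported_once",
    "000000000000000b t spin_lock",
]


def duplicated_text_symbols(lines):
    """Return (count, line) pairs for text symbols that occur more than once."""
    # grep -iF ' t ': keep text symbols (case-insensitive, so ' T ' matches too).
    text = [line for line in lines if " t " in line.lower()]
    # uniq -c: count runs of adjacent identical lines.
    counted = [(len(list(run)), line) for line, run in groupby(text)]
    # grep -v '^ *1 ': drop lines that occur only once.
    repeated = [(n, line) for n, line in counted if n != 1]
    # sort -rn: highest count first.
    return sorted(repeated, key=lambda item: item[0], reverse=True)


for count, line in duplicated_text_symbols(nm_lines):
    print(f"{count:7d} {line}")
```

A static function that shows up many times under the same name and
size is one that was emitted out of line in many object files.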

This patch fixes this for x86 atomic ops via
s/inline/__always_inline/. This decreases allyesconfig kernel by
about 25k:

    text     data      bss       dec     hex filename
82399481 22255416 20627456 125282353 777a831 vmlinux.before
82375570 22255544 20627456 125258570 7774b4a vmlinux

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1431080762-17797-1-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years ago x86/asm/entry/64: Clean up usage of TEST insns
Denys Vlasenko [Mon, 27 Apr 2015 13:21:52 +0000 (15:21 +0200)]
x86/asm/entry/64: Clean up usage of TEST insns

By the nature of TEST operation, it is often possible
to test a narrower part of the operand:

    "testl $3, mem"  -> "testb $3, mem"

This results in shorter insns, because TEST insn has no
sign-extending byte-immediate forms unlike other ALU ops.

   text    data     bss     dec     hex filename
  11674       0       0   11674    2d9a entry_64.o.before
  11658       0       0   11658    2d8a entry_64.o

Changes in object code:

- f7 84 24 88 00 00 00 03 00 00 00  testl  $0x3,0x88(%rsp)
+ f6 84 24 88 00 00 00 03           testb  $0x3,0x88(%rsp)
- f7 44 24 68 03 00 00 00           testl  $0x3,0x68(%rsp)
+ f6 44 24 68 03                    testb  $0x3,0x68(%rsp)
- f7 84 24 90 00 00 00 03 00 00 00  testl  $0x3,0x90(%rsp)
+ f6 84 24 90 00 00 00 03           testb  $0x3,0x90(%rsp)

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1430140912-7960-2-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years ago x86/asm/entry/64: Tidy up JZ insns after TESTs
Denys Vlasenko [Mon, 27 Apr 2015 13:21:51 +0000 (15:21 +0200)]
x86/asm/entry/64: Tidy up JZ insns after TESTs

After TESTs, use logically correct JZ/JNZ mnemonics instead of
JE/JNE. This doesn't change code.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1430140912-7960-1-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years ago Merge tag 'pm+acpi-4.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Thu, 7 May 2015 22:58:00 +0000 (15:58 -0700)]
Merge tag 'pm+acpi-4.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management and ACPI fixes from Rafael Wysocki:
 "These include three regression fixes (PCI resources management,
  ACPI/PNP device enumeration, ACPI SBS on MacBook) and two ACPI
  documentation fixes related to GPIO.

  Specifics:

   - Fix for a PCI resources management regression introduced during the
     4.0 cycle and related to the handling of ACPI resources'
     Producer/Consumer flags that turn out to be useless (Jiang Liu)

   - Fix for a MacBook regression related to the Smart Battery Subsystem
     (SBS) driver causing various problems (stalls on boot, failure to
     detect or report battery) to happen and introduced during the 3.18
     cycle (Chris Bainbridge)

   - Fix for an ACPI/PNP device enumeration regression introduced during
     the 3.16 cycle caused by failing to include two PNP device IDs into
     the list of IDs that PNP device objects need to be created for
     (Witold Szczeponik)

   - Fixes for two minor mistakes in the ACPI GPIO properties
     documentation (Antonio Ospite, Rafael J Wysocki)"

* tag 'pm+acpi-4.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / PNP: add two IDs to list for PNPACPI device enumeration
  ACPI / documentation: Fix ambiguity in the GPIO properties document
  ACPI / documentation: fix a sentence about GPIO resources
  ACPI / SBS: Add 5 us delay to fix SBS hangs on MacBook
  x86/PCI/ACPI: Make all resources except [io 0xcf8-0xcff] available on PCI bus

9 years ago Merge branches 'acpi-resources', 'acpi-battery', 'acpi-doc' and 'acpi-pnp'
Rafael J. Wysocki [Thu, 7 May 2015 19:24:34 +0000 (21:24 +0200)]
Merge branches 'acpi-resources', 'acpi-battery', 'acpi-doc' and 'acpi-pnp'

* acpi-resources:
  x86/PCI/ACPI: Make all resources except [io 0xcf8-0xcff] available on PCI bus

* acpi-battery:
  ACPI / SBS: Add 5 us delay to fix SBS hangs on MacBook

* acpi-doc:
  ACPI / documentation: Fix ambiguity in the GPIO properties document
  ACPI / documentation: fix a sentence about GPIO resources

* acpi-pnp:
  ACPI / PNP: add two IDs to list for PNPACPI device enumeration

9 years ago Merge tag 'for-f2fs-4.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk...
Linus Torvalds [Thu, 7 May 2015 18:18:34 +0000 (11:18 -0700)]
Merge tag 'for-f2fs-4.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs fixes from Jaegeuk Kim:
 "Fix a performance regression and a bug"

* tag 'for-f2fs-4.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs:
  f2fs: fix wrong error hanlder in f2fs_follow_link
  Revert "f2fs: enhance multi-threads performance"

9 years ago Merge tag 'pinctrl-v4.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw...
Linus Torvalds [Thu, 7 May 2015 15:27:38 +0000 (08:27 -0700)]
Merge tag 'pinctrl-v4.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control fixes from Linus Walleij:
 "Here is a smallish set of pin control fixes for the v4.1 cycle,
  collected the last two weeks:

   - fix a real nasty legacy bug that has screwed up the protection of
     adding pinctrl maps dynamically.  Normally this didn't happen so
     much but Doug Anderson ran into it and fixed it, kudos!

  - minor driver fixes for Qualcomm spmi, mediatek and Marvell drivers"

* tag 'pinctrl-v4.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: Don't just pretend to protect pinctrl_maps, do it for real
  pinctrl: mediatek: mtk-common: initialize unmask
  pinctrl: qcom-spmi-mpp: Fix input value report
  pinctrl: qcom-spmi: Fix pin direction configuration
  pinctrl: mvebu: Fix mapping of pin 63 (gpo -> gpio)

9 years agoMerge tag 'vfio-v4.1-rc3' of git://github.com/awilliam/linux-vfio
Linus Torvalds [Thu, 7 May 2015 15:18:01 +0000 (08:18 -0700)]
Merge tag 'vfio-v4.1-rc3' of git://github.com/awilliam/linux-vfio

Pull vfio fixes from Alex Williamson:
 "Fix some undesirable behavior with the vfio device request interface:

   - increase verbosity of device request channel (Alex Williamson)

   - fix runaway interruptible timeout (Alex Williamson)"

* tag 'vfio-v4.1-rc3' of git://github.com/awilliam/linux-vfio:
  vfio: Fix runaway interruptible timeout
  vfio-pci: Log device requests more verbosely

9 years agoMerge tag 'for-linus' of git://github.com/dledford/linux
Linus Torvalds [Thu, 7 May 2015 14:04:33 +0000 (07:04 -0700)]
Merge tag 'for-linus' of git://github.com/dledford/linux

Pull infiniband updates from Doug Ledford:
 "Minor updates for 4.1-rc

  Most of the changes are fairly small and well confined.  The iWARP
  address reporting changes are the only ones that are a medium size.  I
  had these queued up prior to rc1, but due to the shuffle in
  maintainers, they did not get submitted when I expected.  My apologies
  for that.  I feel comfortable with them however due to the testing
  they've received, so I left them in this submission"

* tag 'for-linus' of git://github.com/dledford/linux:
  MAINTAINERS: Update InfiniBand subsystem maintainer
  MAINTAINERS: add include/rdma/ to InfiniBand subsystem
  IPoIB/CM: Fix indentation level
  iw_cxgb4: Remove negative advice dmesg warnings
  IB/core: Fix unaligned accesses
  IB/core: change rdma_gid2ip into void function as it always return zero
  IB/qib: use arch_phys_wc_add()
  IB/qib: add acounting for MTRR
  IB/core: dma unmap optimizations
  IB/core: dma map/unmap locking optimizations
  RDMA/cxgb4: Report the actual address of the remote connecting peer
  RDMA/nes: Report the actual address of the remote connecting peer
  RDMA/core: Enable the iWarp Port Mapper to provide the actual address of the connecting peer to its clients
  iw_cxgb4: enforce qp/cq id requirements
  iw_cxgb4: use BAR2 GTS register for T5 kernel mode CQs
  iw_cxgb4: 32b platform fixes
  iw_cxgb4: Cleanup register defines/MACROS
  RDMA/CMA: Canonize IPv4 on IPV6 sockets properly

9 years agoMerge tag 'for-linus-4.1b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 6 May 2015 22:58:06 +0000 (15:58 -0700)]
Merge tag 'for-linus-4.1b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen bug fixes from David Vrabel:

 - fix blkback regression if using persistent grants

 - fix various event channel related suspend/resume bugs

 - fix AMD x86 regression with X86_BUG_SYSRET_SS_ATTRS

 - SWIOTLB on ARM now uses frames <4 GiB (if available) so devices only
   capable of 32-bit DMA work.

* tag 'for-linus-4.1b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages on ARM
  hypervisor/x86/xen: Unset X86_BUG_SYSRET_SS_ATTRS on Xen PV guests
  xen/events: Set irq_info->evtchn before binding the channel to CPU in __startup_pirq()
  xen/console: Update console event channel on resume
  xen/xenbus: Update xenbus event channel on resume
  xen/events: Clear cpu_evtchn_mask before resuming
  xen-pciback: Add name prefix to global 'permissive' variable
  xen: Suspend ticks on all CPUs during suspend
  xen/grant: introduce func gnttab_unmap_refs_sync()
  xen/blkback: safely unmap purge persistent grants

9 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 6 May 2015 17:57:37 +0000 (10:57 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
 "EFI fixes, an FPU fix, a ticket spinlock boundary condition fix and
  two build fixes"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/fpu: Always restore_xinit_state() when use_eager_cpu()
  x86: Make cpu_tss available to external modules
  efi: Fix error handling in add_sysfs_runtime_map_entry()
  x86/spinlocks: Fix regression in spinlock contention detection
  x86/mm: Clean up types in xlate_dev_mem_ptr()
  x86/efi: Store upper bits of command line buffer address in ext_cmd_line_ptr
  efivarfs: Ensure VariableName is NUL-terminated

9 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 6 May 2015 17:47:25 +0000 (10:47 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
 "Mostly tooling fixes, but also an uncore PMU driver fix and an uncore
  PMU driver hardware-enablement addition"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf probe: Fix segfault if passed with ''.
  perf report: Fix -T/--threads option to work again
  perf bench numa: Fix immediate meeting of convergence condition
  perf bench numa: Fixes of --quiet argument
  perf bench futex: Fix hung wakeup tasks after requeueing
  perf probe: Fix bug with global variables handling
  perf top: Fix a segfault when kernel map is restricted.
  tools lib traceevent: Fix build failure on 32-bit arch
  perf kmem: Fix compiles on RHEL6/OL6
  tools lib api: Undefine _FORTIFY_SOURCE before setting it
  perf kmem: Consistently use PRIu64 for printing u64 values
  perf trace: Disable events and drain events when forked workload ends
  perf trace: Enable events when doing system wide tracing and starting a workload
  perf/x86/intel/uncore: Move PCI IDs for IMC to uncore driver
  perf/x86/intel/uncore: Add support for Intel Haswell ULT (lower power Mobile Processor) IMC uncore PMUs
  perf/x86/intel: Add cpu_(prepare|starting|dying) for core_pmu

9 years agoMerge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 6 May 2015 17:26:37 +0000 (10:26 -0700)]
Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RCU fix from Ingo Molnar:
 "An RCU Kconfig fix that eliminates an annoying interactive kconfig
  question for CONFIG_RCU_TORTURE_TEST_SLOW_INIT"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  rcu: Control grace-period delays directly from value

9 years agopinctrl: Don't just pretend to protect pinctrl_maps, do it for real
Doug Anderson [Fri, 1 May 2015 16:01:27 +0000 (09:01 -0700)]
pinctrl: Don't just pretend to protect pinctrl_maps, do it for real

Way back, when the world was a simpler place and there was no war, no
evil, and no kernel bugs, there was just a single pinctrl lock.  That
was how the world was when (57291ce pinctrl: core device tree mapping
table parsing support) was written.  In that case, there were
instances where the pinctrl mutex was already held when
pinctrl_register_map() was called, hence a "locked" parameter was
passed to the function to indicate that the mutex was already locked
(so we shouldn't lock it again).

A few years ago in (42fed7b pinctrl: move subsystem mutex to
pinctrl_dev struct), we switched to a separate pinctrl_maps_mutex.
...but (oops) we forgot to re-think about the whole "locked" parameter
for pinctrl_register_map().  Basically the "locked" parameter appears
to still refer to whether the bigger pinctrl_dev mutex is locked, but
we're using it to skip locks of our (now separate) pinctrl_maps_mutex.

That's kind of a bad thing(TM).  Probably nobody noticed because most
of the calls to pinctrl_register_map happen at boot time and we've got
synchronous device probing.  ...and even cases where we're
asynchronous don't end up actually hitting the race too often.  ...but
after banging my head against the wall for a bug that reproduced 1 out
of 1000 reboots and lots of looking through kgdb, I finally noticed
this.

Anyway, we can now safely remove the "locked" parameter and go back to
a war-free, evil-free, and kernel-bug-free world.
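
The idea can be illustrated outside the kernel.  What follows is only a
minimal userspace sketch, not the pinctrl code: struct map_node,
register_map() and registered_maps() are hypothetical names, and a pthread
mutex stands in for pinctrl_maps_mutex.  The point it shows is that the
registration path takes the list's own lock unconditionally, with no
"locked" flag describing some other lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry on the global list of maps. */
struct map_node {
    struct map_node *next;
    const char *name;
};

static pthread_mutex_t maps_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct map_node *maps_head;   /* protected by maps_mutex */
static size_t maps_count;            /* protected by maps_mutex */

/*
 * Always take maps_mutex.  There is no "locked" parameter: a caller
 * holding some other, larger lock says nothing about this one.
 */
void register_map(struct map_node *node)
{
    assert(node != NULL);
    pthread_mutex_lock(&maps_mutex);
    node->next = maps_head;
    maps_head = node;
    maps_count++;
    pthread_mutex_unlock(&maps_mutex);
}

/* Number of registered maps, read under the same lock. */
size_t registered_maps(void)
{
    size_t n;

    pthread_mutex_lock(&maps_mutex);
    n = maps_count;
    pthread_mutex_unlock(&maps_mutex);
    return n;
}
```

With this shape a caller cannot skip the lock by accident, which is the
property the removed parameter had been undermining.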

Fixes: 42fed7ba44e4 ("pinctrl: move subsystem mutex to pinctrl_dev struct")
Signed-off-by: Doug Anderson <dianders@chromium.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
9 years agoxen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages on ARM
Stefano Stabellini [Fri, 24 Apr 2015 09:16:40 +0000 (10:16 +0100)]
xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages on ARM

Make sure that xen_swiotlb_init allocates buffers that are DMA capable
when at least one memblock is available below 4G. Otherwise we assume
that all devices on the SoC can cope with >4G addresses. We do this on
ARM and ARM64, where dom0 is mapped 1:1, so pfn == mfn in this case.

No functional changes on x86.

From: Chen Baozi <baozich@gmail.com>

Signed-off-by: Chen Baozi <baozich@gmail.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Tested-by: Chen Baozi <baozich@gmail.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
9 years agox86/alternatives: Document macros
Borislav Petkov [Sat, 4 Apr 2015 14:40:45 +0000 (16:40 +0200)]
x86/alternatives: Document macros

Add some text to the macro magic for future reference and against
failing human memory.

Requested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/fpu: Always restore_xinit_state() when use_eager_cpu()
Bobby Powers [Mon, 27 Apr 2015 15:10:41 +0000 (08:10 -0700)]
x86/fpu: Always restore_xinit_state() when use_eager_cpu()

The following commit:

  f893959b0898 ("x86/fpu: Don't abuse drop_init_fpu() in flush_thread()")

removed drop_init_fpu() usage from flush_thread(). This seems to break
things for me - the Go 1.4 test suite fails all over the place with
floating point comparison errors (offending commit found through
bisection).

The functional change was that flush_thread() after this commit
only calls restore_init_xstate() when both use_eager_fpu() and
!used_math() are true. drop_init_fpu() (now fpu_reset_state()) calls
restore_init_xstate() regardless of whether current used_math() - apply
the same logic here.

Switch used_math() -> tsk_used_math(tsk) to consistently use the grabbed
tsk instead of current, like in the rest of flush_thread().

Tested-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Bobby Powers <bobbypowers@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pekka Riikonen <priikone@iki.fi>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suresh Siddha <sbsiddha@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: f893959b ("x86/fpu: Don't abuse drop_init_fpu() in flush_thread()")
Link: http://lkml.kernel.org/r/1430147441-9820-1-git-send-email-bobbypowers@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agox86/asm: Use -mskip-rax-setup if supported
H.J. Lu [Thu, 18 Dec 2014 02:05:29 +0000 (18:05 -0800)]
x86/asm: Use -mskip-rax-setup if supported

GCC 5 added a compiler option, -mskip-rax-setup, for x86-64. It skips
setting up the RAX register when SSE is disabled and there are no
variable arguments passed in vector registers. (According to the x86_64
ABI, %al is used as a hidden register containing the number of vector
registers used).

Since the kernel doesn't pass vector registers to functions with
variable arguments, this option can be used to optimize the x86-64
kernel.

This GCC feature was suggested by Rasmus Villemoes <linux@rasmusvillemoes.dk>.
This is the corresponding kernel change using it.

For kernel v3.17:

      text   data    bss    dec       filename
  11455921 2204048 5853184 19513153   vmlinux #with -mskip-rax-setup
  11480079 2204048 5853184 19537311   vmlinux

For Kernel v4.0+ - custom config:

      text   data    bss    dec       filename
  10231778 3479800 16617472 30329050  vmlinux-gcc5+-mskip-rax-setup
  10268797 3547448 16621568 30437813  vmlinux

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agoMerge tag 'efi-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/mfleming...
Ingo Molnar [Wed, 6 May 2015 06:29:37 +0000 (08:29 +0200)]
Merge tag 'efi-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into x86/urgent

Pull EFI fixes from Matt Fleming:

 * Avoid garbage names in efivarfs due to buggy firmware by zeroing
   EFI variable name. (Ross Lagerwall)

 * Stop erroneously dropping upper 32 bits of boot command line pointer
   in EFI boot stub and stash them in ext_cmd_line_ptr. (Roy Franz)

 * Fix double-free bug in error handling code path of EFI runtime map
   code. (Dan Carpenter)

Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agoMerge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Wed, 6 May 2015 02:54:11 +0000 (04:54 +0200)]
Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

  - Fix 'perf probe -a' segfault if passed with '' (Wang Nan)

  - Fix report -T/--threads option (Namhyung Kim)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
9 years agoMerge tag 'for-linus-4.1-1' of git://git.code.sf.net/p/openipmi/linux-ipmi
Linus Torvalds [Wed, 6 May 2015 02:42:01 +0000 (19:42 -0700)]
Merge tag 'for-linus-4.1-1' of git://git.code.sf.net/p/openipmi/linux-ipmi

Pull IPMI fixes from Corey Minyard:
 "Lots of minor IPMI fixes, especially ones that have come up since
  the SSIF driver has been in the main kernel for a while"

* tag 'for-linus-4.1-1' of git://git.code.sf.net/p/openipmi/linux-ipmi:
  ipmi: Fix multi-part message handling
  ipmi: Add alert handling to SSIF
  ipmi: Fix a problem that messages are not issued in run_to_completion mode
  ipmi: Report an error if ACPI _IFT doesn't exist
  ipmi: Remove unused including <linux/version.h>
  ipmi: Don't report err in the SI driver for SSIF devices
  ipmi: Remove incorrect use of seq_has_overflowed
  ipmi:ssif: Ignore spaces when comparing I2C adapter names
  ipmi_ssif: Fix the logic on user-supplied addresses

9 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Wed, 6 May 2015 01:52:13 +0000 (18:52 -0700)]
Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
 "16 patches

  This includes a new rtc driver for the Abracon AB x80x and isn't very
  appropriate for -rc2.  It was still being fiddled with a bit during
  the merge window and I fell asleep during -rc1"

[ So I took the new driver, it seems small and won't regress anything.
  I'm a softy.   - Linus ]

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  rtc: armada38x: fix concurrency access in armada38x_rtc_set_time
  ocfs2: dlm: fix race between purge and get lock resource
  nilfs2: fix sanity check of btree level in nilfs_btree_root_broken()
  util_macros.h: have array pointer point to array of constants
  configfs: init configfs module earlier at boot time
  mm/hwpoison-inject: check PageLRU of hpage
  mm/hwpoison-inject: fix refcounting in no-injection case
  mm: soft-offline: fix num_poisoned_pages counting on concurrent events
  rtc: add rtc-abx80x, a driver for the Abracon AB x80x i2c rtc
  Documentation: bindings: add abracon,abx80x
  kasan: show gcc version requirements in Kconfig and Documentation
  mm/memory-failure: call shake_page() when error hits thp tail page
  lib: delete lib/find_last_bit.c
  MAINTAINERS: add co-maintainer for LED subsystem
  zram: add Designated Reviewer for zram in MAINTAINERS
  revert "zram: move compact_store() to sysfs functions area"

9 years agoMerge tag 'platform-drivers-x86-v4.1-2' of git://git.infradead.org/users/dvhart/linux...
Linus Torvalds [Wed, 6 May 2015 01:14:04 +0000 (18:14 -0700)]
Merge tag 'platform-drivers-x86-v4.1-2' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86

Pull x86 platform driver fixes from Darren Hart:
 "This includes a trivial warning and adding a Lenovo laptop to an
  existing quirk.

  I've held off on things like the latter in the past, but I didn't feel
  it was risky enough to push out to 4.2.

   - thinkpad_acpi:
        Fix warning for static not at beginning

   - ideapad_laptop:
        Add Lenovo G40-30 to devices without radio switch"

* tag 'platform-drivers-x86-v4.1-2' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86:
  thinkpad_acpi: Fix warning for static not at beginning
  ideapad_laptop: Add Lenovo G40-30 to devices without radio switch

9 years agoipmi: Fix multi-part message handling
Corey Minyard [Wed, 29 Apr 2015 22:59:21 +0000 (17:59 -0500)]
ipmi: Fix multi-part message handling

Lots of little fixes for multi-part messages:

The values were not being re-initialized, so if something went wrong
while handling a multi-part message and it got left in a bad state, it
might be an issue.

The commands were not correct when issuing multi-part reads, the
code was not passing in the proper value for commands.  Also clean
up some minor formatting issues.

Get the block number from the right location, limit the maximum send
message size to 63 bytes and explain why, and fix some minor stylistic
issues.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
9 years agoipmi: Add alert handling to SSIF
Corey Minyard [Fri, 24 Apr 2015 12:46:06 +0000 (07:46 -0500)]
ipmi: Add alert handling to SSIF

The SSIF interface can optionally have an SMBus alert come in when
data is ready.  Unfortunately, the IPMI spec gives wiggle room to
the implementer to allow them to always have the alert enabled,
even if the driver doesn't enable it.  So implement alerts.
If you don't in this situation, the SMBus alert handling will
constantly complain.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
9 years agoipmi: Fix a problem that messages are not issued in run_to_completion mode
Hidehiro Kawai [Thu, 23 Apr 2015 02:16:44 +0000 (11:16 +0900)]
ipmi: Fix a problem that messages are not issued in run_to_completion mode

start_next_msg() issues a message placed in smi_info->waiting_msg
if it is non-NULL.  However, sender() sets a message to
smi_info->curr_msg and NULL to smi_info->waiting_msg in the context
of run_to_completion mode.  As a result, it leads to an infinite
loop waiting for the completion of an unissued message when leaving
a dying message after a kernel panic.

sender() should set the message to smi_info->waiting_msg not
curr_msg.
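
The mismatch can be reproduced with a heavily reduced model.  This is only
a sketch under assumptions: struct smi, struct msg, sender_buggy() and
sender_fixed() are hypothetical stand-ins, and only the waiting_msg and
curr_msg field names and the start_next_msg() name come from the
description above.

```c
#include <assert.h>
#include <stddef.h>

struct msg { int id; };

/* Hypothetical, reduced stand-in for the driver state. */
struct smi {
    struct msg *waiting_msg;  /* queued, not yet issued */
    struct msg *curr_msg;     /* issued, awaiting completion */
    int issued;               /* how many messages were actually started */
};

/* Issue the waiting message, if any; returns 1 if something was started. */
int start_next_msg(struct smi *s)
{
    assert(s != NULL);
    if (!s->waiting_msg)
        return 0;
    s->curr_msg = s->waiting_msg;
    s->waiting_msg = NULL;
    s->issued++;
    return 1;
}

/* Behaviour described as buggy: the message lands where nothing issues it. */
void sender_buggy(struct smi *s, struct msg *m)
{
    s->curr_msg = m;
    s->waiting_msg = NULL;
}

/* Behaviour described as the fix: queue the message for start_next_msg(). */
void sender_fixed(struct smi *s, struct msg *m)
{
    s->waiting_msg = m;
}
```

In the buggy variant curr_msg is set but start_next_msg() finds nothing to
issue, so a caller polling for completion would wait on a message that was
never sent.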

Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
9 years agoipmi: Report an error if ACPI _IFT doesn't exist
Corey Minyard [Wed, 22 Apr 2015 18:25:40 +0000 (13:25 -0500)]
ipmi: Report an error if ACPI _IFT doesn't exist

When probing an ACPI table, report a specific error, instead of just
returning an error, if _IFT doesn't exist.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
9 years agoipmi: Remove unused including <linux/version.h>
Wei Yongjun [Thu, 16 Apr 2015 13:09:53 +0000 (21:09 +0800)]
ipmi: Remove unused including <linux/version.h>

Remove the inclusion of <linux/version.h> from files that don't need it.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
9 years agortc: armada38x: fix concurrency access in armada38x_rtc_set_time
Gregory CLEMENT [Tue, 5 May 2015 23:24:05 +0000 (16:24 -0700)]
rtc: armada38x: fix concurrency access in armada38x_rtc_set_time

While setting the time, the RTC TIME register should not be accessed.
However, due to hardware constraints, setting the RTC time involves
sleeping for 100ms.  This sleep was done outside the critical section
protected by the spinlock, so it was possible to read the RTC TIME
register and get an incorrect value.  This patch introduces a mutex for
protecting the RTC TIME access; unlike with a spinlock, sleeping is
allowed in a critical section protected by a mutex.

The RTC STATUS register can still be used from the interrupt handler but
it has no effect on setting the time.

Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Andrew Lunn <andrew@lunn.ch>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: <stable@vger.kernel.org> [4.0]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agoocfs2: dlm: fix race between purge and get lock resource
Junxiao Bi [Tue, 5 May 2015 23:24:02 +0000 (16:24 -0700)]
ocfs2: dlm: fix race between purge and get lock resource

There is a race window in dlm_get_lock_resource(), which may return a
lock resource which has been purged.  This will cause the process to
hang forever in dlmlock() as the ast msg can't be handled due to its
lock resource not existing.

    dlm_get_lock_resource {
        ...
        spin_lock(&dlm->spinlock);
        tmpres = __dlm_lookup_lockres_full(dlm, lockid, namelen, hash);
        if (tmpres) {
             spin_unlock(&dlm->spinlock);
             >>>>>>>> race window, dlm_run_purge_list() may run and purge
                              the lock resource
             spin_lock(&tmpres->spinlock);
             ...
             spin_unlock(&tmpres->spinlock);
        }
    }

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agonilfs2: fix sanity check of btree level in nilfs_btree_root_broken()
Ryusuke Konishi [Tue, 5 May 2015 23:24:00 +0000 (16:24 -0700)]
nilfs2: fix sanity check of btree level in nilfs_btree_root_broken()

The range check for b-tree level parameter in nilfs_btree_root_broken()
is wrong; it accepts the case of "level == NILFS_BTREE_LEVEL_MAX" even
though the level is limited to values in the range of 0 to
(NILFS_BTREE_LEVEL_MAX - 1).

Since the level parameter is read from the storage device and used to index
the nilfs_btree_path array, whose element count is NILFS_BTREE_LEVEL_MAX, it
can cause a memory overrun during btree operations if the boundary value
is set as the level parameter on the device.

This fixes the broken sanity check and adds a comment to clarify that
the upper bound NILFS_BTREE_LEVEL_MAX is exclusive.
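
The off-by-one can be shown in isolation.  The sketch below is not the
nilfs2 code: LEVEL_MAX, struct path_entry, level_ok_broken(), level_ok()
and path_at() are illustrative names, the value 14 is only a placeholder,
and the real check also covers conditions not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder value; what matters is that the bound is exclusive. */
#define LEVEL_MAX 14

struct path_entry { int index; };

/* Check as described before the fix: lets level == LEVEL_MAX through. */
bool level_ok_broken(int level)
{
    return level >= 0 && level <= LEVEL_MAX;
}

/* Corrected check: valid levels are 0 .. LEVEL_MAX - 1. */
bool level_ok(int level)
{
    return level >= 0 && level < LEVEL_MAX;
}

/* Index an array of LEVEL_MAX entries only after validating the level. */
struct path_entry *path_at(struct path_entry *path, int level)
{
    assert(path != NULL);
    if (!level_ok(level))
        return NULL;
    return &path[level];
}
```

With the broken check, a level of LEVEL_MAX read from disk would index one
element past the end of the array.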

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agoutil_macros.h: have array pointer point to array of constants
Guenter Roeck [Tue, 5 May 2015 23:23:57 +0000 (16:23 -0700)]
util_macros.h: have array pointer point to array of constants

Using the new find_closest() macro can result in the following sparse
warnings.

  drivers/hwmon/lm85.c:194:16: warning:
   incorrect type in initializer (different modifiers)
  drivers/hwmon/lm85.c:194:16:    expected int *__fc_a
  drivers/hwmon/lm85.c:194:16:    got int static const [toplevel] *<noident>
  drivers/hwmon/lm85.c:210:16: warning:
   incorrect type in initializer (different modifiers)
  drivers/hwmon/lm85.c:210:16:    expected int *__fc_a
  drivers/hwmon/lm85.c:210:16:    got int const *map

This is because the array passed to find_closest() will typically be
declared as array of constants, but the macro declares a non-constant
pointer to it.
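
The same qualifier issue appears with a plain function.  This is a sketch
rather than the macro itself: closest_index() is a hypothetical helper, it
assumes a non-empty ascending array of non-negative values, and it uses a
truncating midpoint.  Declaring the array parameter as a pointer to const
lets both const and non-const arrays be passed without discarding
qualifiers.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Index of the element of an ascending array that is closest to x.
 * The array is taken as a pointer to const int, so a caller may pass
 * either "static const int map[]" or a writable array.
 */
size_t closest_index(int x, const int *a, size_t n)
{
    size_t i;

    assert(a != NULL && n > 0);
    for (i = 0; i + 1 < n; i++) {
        /* Stop at the first midpoint that x does not exceed. */
        if (x <= (a[i] + a[i + 1]) / 2)
            break;
    }
    return i;
}
```

Had the parameter been declared as plain "int *a", passing a const array
would draw the same kind of "different modifiers" complaint shown above.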

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Cc: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agoconfigfs: init configfs module earlier at boot time
Daniel Baluta [Tue, 5 May 2015 23:23:54 +0000 (16:23 -0700)]
configfs: init configfs module earlier at boot time

We need this earlier in the boot process to allow various subsystems to
use configfs (e.g. Industrial I/O, IIO).

Also, debugfs is at core_initcall level and configfs should be on the same
level from an infrastructure point of view.

Signed-off-by: Daniel Baluta <daniel.baluta@intel.com>
Suggested-by: Lars-Peter Clausen <lars@metafoo.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agomm/hwpoison-inject: check PageLRU of hpage
Naoya Horiguchi [Tue, 5 May 2015 23:23:52 +0000 (16:23 -0700)]
mm/hwpoison-inject: check PageLRU of hpage

Hwpoison injector checks PageLRU of the raw target page to find out
whether the page is an appropriate target, but current code now filters
out thp tail pages, which prevents us from testing for such cases via this
interface.  So let's check hpage instead of p.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Dean Nelson <dnelson@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agomm/hwpoison-inject: fix refcounting in no-injection case
Naoya Horiguchi [Tue, 5 May 2015 23:23:49 +0000 (16:23 -0700)]
mm/hwpoison-inject: fix refcounting in no-injection case

Hwpoison injection via debugfs:hwpoison/corrupt-pfn takes a refcount of
the target page.  But the current code doesn't release it if the target page
is not supposed to be injected, which results in a memory leak.  This patch
simply adds the refcount releasing code.
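
The reference balance can be sketched in a few lines.  Everything here is
hypothetical (struct page as a toy type, page_get(), page_put(),
handle_failure(), inject()), and the handler is assumed to take over the
caller's reference and drop it when done.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a page; only the fields the sketch needs. */
struct page {
    int refcount;
    bool injectable;   /* whether the filter accepts this page as a target */
    bool poisoned;
};

void page_get(struct page *p)
{
    p->refcount++;
}

void page_put(struct page *p)
{
    assert(p->refcount > 0);
    p->refcount--;
}

/* Assumed to take over the caller's reference and release it when done. */
void handle_failure(struct page *p)
{
    p->poisoned = true;
    page_put(p);
}

/* Every exit path leaves the refcount where it started. */
int inject(struct page *p)
{
    page_get(p);
    if (!p->injectable) {
        page_put(p);   /* the release the leaking path was missing */
        return 0;
    }
    handle_failure(p);
    return 0;
}
```

Without the page_put() in the early-return branch, each rejected target
would keep one extra reference forever.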

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Dean Nelson <dnelson@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agomm: soft-offline: fix num_poisoned_pages counting on concurrent events
Naoya Horiguchi [Tue, 5 May 2015 23:23:46 +0000 (16:23 -0700)]
mm: soft-offline: fix num_poisoned_pages counting on concurrent events

If multiple soft offline events hit one free page/hugepage concurrently,
soft_offline_page() can handle the free page/hugepage multiple times,
which increments the num_poisoned_pages counter more than once.  This
patch fixes the wrong counting by checking TestSetPageHWPoison for normal
pages and by checking the return value of dequeue_hwpoisoned_huge_page()
for hugepages.
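
The test-and-set idea for normal pages can be sketched with C11 atomics.
This is an illustration, not the mm code: struct page as a toy type,
poison_once() and poisoned_count() are hypothetical, and an atomic_flag
stands in for the HWPoison page flag.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a page carrying a "poisoned" flag. */
struct page {
    atomic_flag hwpoison;
};

static atomic_long num_poisoned;   /* zero-initialized */

/*
 * Mark the page poisoned and bump the counter only if this caller is
 * the one that flipped the flag.  Returns true when the page was counted.
 */
bool poison_once(struct page *p)
{
    assert(p != NULL);
    if (atomic_flag_test_and_set(&p->hwpoison))
        return false;   /* already poisoned by an earlier event */
    atomic_fetch_add(&num_poisoned, 1);
    return true;
}

long poisoned_count(void)
{
    return atomic_load(&num_poisoned);
}
```

However many events hit the same page, only the one that wins the
test-and-set increments the counter.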

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Dean Nelson <dnelson@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: <stable@vger.kernel.org> [3.14+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agortc: add rtc-abx80x, a driver for the Abracon AB x80x i2c rtc
Philippe De Muyter [Tue, 5 May 2015 23:23:44 +0000 (16:23 -0700)]
rtc: add rtc-abx80x, a driver for the Abracon AB x80x i2c rtc

This is a basic driver for the ultra-low-power Abracon AB x80x series of RTC
chips. In particular, it supports the supersets AB0805 and AB1805.
It allows reading and writing the time, and enables the supercapacitor/
battery charger.

[arnd@arndb.de: abx805 depends on i2c]
[alexandre.belloni@free-electrons.com: rename buffer from date to buf in abx80x_rtc_read_time()]
Signed-off-by: Philippe De Muyter <phdm@macqel.be>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Paul Bolle <pebolle@tiscali.nl>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agoDocumentation: bindings: add abracon,abx80x
Alexandre Belloni [Tue, 5 May 2015 23:23:41 +0000 (16:23 -0700)]
Documentation: bindings: add abracon,abx80x

Document the bindings for abracon,abx80x and related compatibles.

Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Cc: Philippe De Muyter <phdm@macqel.be>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agokasan: show gcc version requirements in Kconfig and Documentation
Joe Perches [Tue, 5 May 2015 23:23:38 +0000 (16:23 -0700)]
kasan: show gcc version requirements in Kconfig and Documentation

The documentation shows a need for gcc > 4.9.2, but it's really >=.  The
Kconfig entries don't show the required versions, so add them.  Correct a
latter/later typo too.  Also mention that gcc 5 is required to catch out of
bounds accesses to global and stack variables.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agomm/memory-failure: call shake_page() when error hits thp tail page
Naoya Horiguchi [Tue, 5 May 2015 23:23:35 +0000 (16:23 -0700)]
mm/memory-failure: call shake_page() when error hits thp tail page

Currently memory_failure() calls shake_page() to sweep pages out from
pcplists only when the victim page is 4kB LRU page or thp head page.
But we should do this for a thp tail page too.

Consider that a memory error hits a thp tail page whose head page is on
a pcplist when memory_failure() runs.  Then, the current kernel skips
the shake_page() part, so hwpoison_user_mappings() returns without calling
split_huge_page() or try_to_unmap() because PageLRU of the thp head is
still cleared due to the skip of shake_page().

As a result, me_huge_page() runs for the thp, which is broken behavior.

One effect is a leak of the thp.  And another is to fail to isolate the
memory error, so later access to the error address causes another MCE,
which kills the processes which used the thp.

This patch fixes this problem by calling shake_page() for thp tail case.

Fixes: 385de35722c9 ("thp: allow a hwpoisoned head page to be put back to LRU")
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Dean Nelson <dnelson@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Jin Dongming <jin.dongming@np.css.fujitsu.com>
Cc: <stable@vger.kernel.org> [3.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agolib: delete lib/find_last_bit.c
Yury Norov [Tue, 5 May 2015 23:23:33 +0000 (16:23 -0700)]
lib: delete lib/find_last_bit.c

The file lib/find_last_bit.c is no longer used and was supposed to be
deleted by commit 8f6f19dd51 ("lib: move find_last_bit to
lib/find_next_bit.c"), but that delete didn't happen.  This gets rid of
it.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agoMAINTAINERS: add co-maintainer for LED subsystem
Andrew Morton [Tue, 5 May 2015 23:23:30 +0000 (16:23 -0700)]
MAINTAINERS: add co-maintainer for LED subsystem

Add myself (Jacek Anaszewski) as a co-maintainer for the LED subsystem.

Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
Acked-by: Bryan Wu <cooloney@gmail.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agozram: add Designated Reviewer for zram in MAINTAINERS
Minchan Kim [Tue, 5 May 2015 23:23:28 +0000 (16:23 -0700)]
zram: add Designated Reviewer for zram in MAINTAINERS

Sergey Senozhatsky has contributed to and reviewed zram for a long time.  He
is really helpful for maintaining zram so I want him to continue
helping me as Designated Reviewer unless he hates it.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agorevert "zram: move compact_store() to sysfs functions area"
Andrew Morton [Tue, 5 May 2015 23:23:25 +0000 (16:23 -0700)]
revert "zram: move compact_store() to sysfs functions area"

Revert commit c72c6160d967ed26a0b136dbab337f821d233509

It was intended to be a cosmetic change w/o any functional change
and was part of a bigger change:

  http://lkml.iu.edu/hypermail/linux/kernel/1503.1/01818.html

Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agothinkpad_acpi: Fix warning for static not at beginning
Jean Delvare [Mon, 27 Apr 2015 07:45:06 +0000 (09:45 +0200)]
thinkpad_acpi: Fix warning for static not at beginning

Fix the following warning:

warning: "static" is not at beginning of declaration
 void static hotkey_mask_warn_incomplete_mask(void)
 ^

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Henrique de Moraes Holschuh <ibm-acpi@hmh.eng.br>
Cc: Darren Hart <dvhart@infradead.org>
Signed-off-by: Darren Hart <dvhart@linux.intel.com>
9 years agoipmi: Don't report err in the SI driver for SSIF devices
Corey Minyard [Sat, 11 Apr 2015 01:19:18 +0000 (20:19 -0500)]
ipmi: Don't report err in the SI driver for SSIF devices

Really ignore them by returning -ENODEV from the probe, but not
doing anything.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
9 years agoipmi: Remove incorrect use of seq_has_overflowed
Joe Perches [Sun, 22 Feb 2015 18:21:07 +0000 (10:21 -0800)]
ipmi: Remove incorrect use of seq_has_overflowed

commit d6c5dc18d863 ("ipmi: Remove uses of return value of seq_printf")
incorrectly changed the return value of various proc_show functions
to use seq_has_overflowed().

These functions should return 0 on completion rather than 1/true
on overflow.  1 is the same as #define SEQ_SKIP which would cause
the output to not be emitted (skipped) instead.

This is a logical defect only, as the lengths of these outputs are
all smaller than the initial allocation done by the seq filesystem.
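
The return-value convention can be sketched in user-space C.  This is a
minimal illustration, not the IPMI driver code: struct fake_seq,
fake_seq_puts() and both show functions are hypothetical stand-ins for the
seq_file API.  Only the fact that SEQ_SKIP is 1 comes from the text above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define FAKE_SEQ_SKIP 1	/* mirrors SEQ_SKIP == 1 as described above */

/* Hypothetical stand-in for a seq_file buffer. */
struct fake_seq {
	char buf[16];
	size_t count;
	bool overflowed;
};

/* Append s to the buffer, or record an overflow and drop the text. */
static void fake_seq_puts(struct fake_seq *m, const char *s)
{
	size_t len = strlen(s);

	if (m->count + len > sizeof(m->buf)) {
		m->overflowed = true;
		return;
	}
	memcpy(m->buf + m->count, s, len);
	m->count += len;
}

/* Buggy pattern: returns 1 on overflow, which a caller reads as "skip". */
static int show_buggy(struct fake_seq *m, const char *s)
{
	fake_seq_puts(m, s);
	return m->overflowed;
}

/* Fixed pattern: always report completion with 0. */
static int show_fixed(struct fake_seq *m, const char *s)
{
	fake_seq_puts(m, s);
	return 0;
}
```

With short output both variants return 0, which matches the note above that
the defect is logical only.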

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
9 years agoipmi:ssif: Ignore spaces when comparing I2C adapter names
Corey Minyard [Tue, 31 Mar 2015 17:48:53 +0000 (12:48 -0500)]
ipmi:ssif: Ignore spaces when comparing I2C adapter names

Some of the adapters have spaces in their names, but that's really
hard to pass in as a module or kernel parameter.  So ignore the
spaces.
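
A space-insensitive comparison of this kind could look roughly like the
sketch below.  The helper name names_equal_ignoring_spaces() is
hypothetical and this is not the code from the patch.

```c
#include <stdbool.h>

/* Hypothetical helper: true when a and b are equal once spaces are ignored. */
static bool names_equal_ignoring_spaces(const char *a, const char *b)
{
	for (;;) {
		/* Skip any run of spaces on either side. */
		while (*a == ' ')
			a++;
		while (*b == ' ')
			b++;
		if (*a != *b)
			return false;
		if (*a == '\0')
			return true;
		a++;
		b++;
	}
}
```

A user could then pass a name with the spaces removed as a parameter and
still match an adapter whose registered name contains spaces.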

Signed-off-by: Corey Minyard <cminyard@mvista.com>
9 years agoipmi_ssif: Fix the logic on user-supplied addresses
Corey Minyard [Thu, 26 Mar 2015 18:35:18 +0000 (13:35 -0500)]
ipmi_ssif: Fix the logic on user-supplied addresses

Returning zero is success.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
9 years agox86: Make cpu_tss available to external modules
Marc Dionne [Mon, 4 May 2015 18:16:44 +0000 (15:16 -0300)]
x86: Make cpu_tss available to external modules

Commit 75182b1632 ("x86/asm/entry: Switch all C consumers of
kernel_stack to this_cpu_sp0()") changed current_thread_info
to use this_cpu_sp0, and indirectly made it rely on init_tss
which was exported with EXPORT_PER_CPU_SYMBOL_GPL.
As a result some macros and inline functions such as set/get_fs,
test_thread_flag and variants have been made unusable for
external modules.

Make cpu_tss exported with EXPORT_PER_CPU_SYMBOL so that these
functions are accessible again, as they were previously.

Signed-off-by: Marc Dionne <marc.dionne@your-file-system.com>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/1430763404-21221-1-git-send-email-marc.dionne@your-file-system.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
9 years agohypervisor/x86/xen: Unset X86_BUG_SYSRET_SS_ATTRS on Xen PV guests
Boris Ostrovsky [Mon, 4 May 2015 15:02:15 +0000 (11:02 -0400)]
hypervisor/x86/xen: Unset X86_BUG_SYSRET_SS_ATTRS on Xen PV guests

Commit 61f01dd941ba ("x86_64, asm: Work around AMD SYSRET SS descriptor
attribute issue") makes AMD processors set SS to __KERNEL_DS in
__switch_to() to deal with cases when SS is NULL.

This breaks Xen PV guests, which do not want to load SS with __KERNEL_DS.

Since the problem that the commit is trying to address would have to be
fixed in the hypervisor (if it in fact exists under Xen) there is no
reason to set the X86_BUG_SYSRET_SS_ATTRS flag for PV VCPUs here.

This can be easily achieved by adding x86_hyper_xen_hvm.set_cpu_features
op which will clear this flag. (And since this structure is no longer
HVM-specific we should do some renaming).

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
9 years agoxen/events: Set irq_info->evtchn before binding the channel to CPU in __startup_pirq()
Boris Ostrovsky [Wed, 29 Apr 2015 21:10:15 +0000 (17:10 -0400)]
xen/events: Set irq_info->evtchn before binding the channel to CPU in __startup_pirq()

.. because bind_evtchn_to_cpu(evtchn, cpu) will map evtchn to
'info' and pass 'info' down to xen_evtchn_port_bind_to_cpu().

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Tested-by: Annie Li <annie.li@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
9 years agoxen/console: Update console event channel on resume
Boris Ostrovsky [Wed, 29 Apr 2015 21:10:14 +0000 (17:10 -0400)]
xen/console: Update console event channel on resume

After a resume the hypervisor/tools may change the console event
channel number. We should re-query it.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
9 years agoxen/xenbus: Update xenbus event channel on resume
Boris Ostrovsky [Wed, 29 Apr 2015 21:10:13 +0000 (17:10 -0400)]
xen/xenbus: Update xenbus event channel on resume

After a resume the hypervisor/tools may change the xenbus event
channel number. We should re-query it.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
9 years agoxen/events: Clear cpu_evtchn_mask before resuming
Boris Ostrovsky [Wed, 29 Apr 2015 21:10:12 +0000 (17:10 -0400)]
xen/events: Clear cpu_evtchn_mask before resuming

When a guest is resumed, the hypervisor may change event channel
assignments. If this happens and the guest uses 2-level events it
is possible for the interrupt to be claimed by the wrong VCPU since
cpu_evtchn_mask bits may be stale. This can happen even though
evtchn_2l_bind_to_cpu() attempts to clear old bits: irq_info that
is passed in is not necessarily the original one (from pre-migration
times) but instead is freshly allocated during resume and so any
information about which CPU the channel was bound to is lost.

Thus we should clear the mask during resume.

We also need to make sure that bits for xenstore and console channels
are set when these two subsystems are resumed. While rebind_evtchn_irq()
(which is invoked for both of them on a resume) calls irq_set_affinity(),
the latter will in fact postpone setting affinity until handling the
interrupt. But because cpu_evtchn_mask will have bits for these two
cleared we won't be able to take the interrupt.

With that in mind, we need to bind those two channels explicitly in
rebind_evtchn_irq(). We will keep irq_set_affinity() so that we have a
pass through generic irq affinity code later, in case something needs
to be updated there as well.

(Also replace cpumask_of(0) with cpumask_of(info->cpu) in
rebind_evtchn_irq(): it should be set to zero in preceding
xen_irq_info_evtchn_setup().)

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reported-by: Annie Li <annie.li@oracle.com>
Cc: <stable@vger.kernel.org> # 3.14+
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
9 years agoMAINTAINERS: Update InfiniBand subsystem maintainer
Doug Ledford [Tue, 5 May 2015 16:57:09 +0000 (12:57 -0400)]
MAINTAINERS: Update InfiniBand subsystem maintainer

Since Roland stepped down, the community asked me to take his place, and
the nomination was followed by sufficient votes and no dissensions that
we can move forward with the change.

Signed-off-by: Doug Ledford <dledford@redhat.com>
9 years agoMAINTAINERS: add include/rdma/ to InfiniBand subsystem
Yann Droneaud [Mon, 4 May 2015 12:31:03 +0000 (14:31 +0200)]
MAINTAINERS: add include/rdma/ to InfiniBand subsystem

Most headers for InfiniBand/RDMA are located under
include/rdma/ and include/uapi/rdma.

Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
9 years agoIPoIB/CM: Fix indentation level
Bart Van Assche [Tue, 5 May 2015 11:01:39 +0000 (13:01 +0200)]
IPoIB/CM: Fix indentation level

See also patch "IPoIB/cm: Add connected mode support for devices
without SRQs" (commit ID 68e995a29572). Detected by smatch.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
9 years agoiw_cxgb4: Remove negative advice dmesg warnings
Hariprasad S [Mon, 4 May 2015 22:25:24 +0000 (03:55 +0530)]
iw_cxgb4: Remove negative advice dmesg warnings

Remove these log messages in favor of per-endpoint counters as well as
device-global counters that can be inspected via debugfs.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
9 years agoIB/core: Fix unaligned accesses
David Ahern [Sun, 3 May 2015 13:48:26 +0000 (09:48 -0400)]
IB/core: Fix unaligned accesses

Addresses the following kernel logs seen during boot of sparc systems:

Kernel unaligned access at TPC[103bce50] cm_find_listen+0x34/0xf8 [ib_cm]
Kernel unaligned access at TPC[103bce50] cm_find_listen+0x34/0xf8 [ib_cm]
Kernel unaligned access at TPC[103bce50] cm_find_listen+0x34/0xf8 [ib_cm]
Kernel unaligned access at TPC[103bce50] cm_find_listen+0x34/0xf8 [ib_cm]
Kernel unaligned access at TPC[103bce50] cm_find_listen+0x34/0xf8 [ib_cm]

Signed-off-by: David Ahern <david.ahern@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
9 years agoIB/core: change rdma_gid2ip into void function as it always return zero
Honggang LI [Wed, 29 Apr 2015 09:40:44 +0000 (17:40 +0800)]
IB/core: change rdma_gid2ip into void function as it always return zero

Signed-off-by: Honggang Li <honli@redhat.com>
Acked-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
9 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Tue, 5 May 2015 16:03:52 +0000 (09:03 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fixes from Herbert Xu:
 "This fixes a build problem with bcm63xx and yet another fix to the
  memzero_explicit function to ensure that the memset is not elided"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  hwrng: bcm63xx - Fix driver compilation
  lib: make memzero_explicit more robust against dead store elimination

9 years agoMerge tag 'media/v4.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab...
Linus Torvalds [Tue, 5 May 2015 15:42:06 +0000 (08:42 -0700)]
Merge tag 'media/v4.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:
 "Three driver fixes:

   - fix for omap4, fixing a regression due to a subsystem API that got
     removed for 4.1 (commit efde234674d9);

   - fix for one of the formats supported by the Marvell ccic driver;

   - fix for the rcar_vin driver, which could not return from
     wait_for_completion when stopping abnormally"

* tag 'media/v4.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] v4l: omap4iss: Replace outdated OMAP4 control pad API with syscon
  [media] media: soc_camera: rcar_vin: Fix wait_for_completion
  [media] marvell-ccic: fix Y'CbCr ordering

9 years agoperf probe: Fix segfault if passed with ''.
Wang Nan [Tue, 28 Apr 2015 08:46:09 +0000 (08:46 +0000)]
perf probe: Fix segfault if passed with ''.

Since parse_perf_probe_point() deals with a user passed argument, we
should not assume it to be a valid string.

Without this patch, if '' is passed to perf probe, a segfault occurs:

 $ perf probe -a ''
 Segmentation fault

This patch checks the argument of parse_perf_probe_point() before
string processing.

After this patch:

 $ perf probe -a ''

  usage: perf probe [<options>] 'PROBEDEF' ['PROBEDEF' ...]
     or: perf probe [<options>] --add 'PROBEDEF' [--add 'PROBEDEF' ...]
     ...

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/r/1430210769-94177-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
9 years agoefi: Fix error handling in add_sysfs_runtime_map_entry()
Dan Carpenter [Tue, 21 Apr 2015 13:46:28 +0000 (16:46 +0300)]
efi: Fix error handling in add_sysfs_runtime_map_entry()

I spotted two (difficult to hit) bugs while reviewing this.

1)  There is a double free bug because we unregister "map_kset" in
    add_sysfs_runtime_map_entry() and also efi_runtime_map_init().
2)  If we fail to allocate "entry" then we should return
    ERR_PTR(-ENOMEM) instead of NULL.
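
The error-pointer convention behind item 2 can be sketched in user space as
follows.  The helpers err_ptr(), is_err() and ptr_err() are modelled on the
kernel's ERR_PTR(), IS_ERR() and PTR_ERR(); struct entry and add_entry()
are hypothetical and are not the EFI runtime-map code.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095	/* the top 4095 addresses encode -errno values */

static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical object type for the example. */
struct entry {
	int id;
};

/* Hypothetical allocator: 'fail' simulates an allocation failure. */
static struct entry *add_entry(int id, int fail)
{
	struct entry *e = fail ? NULL : malloc(sizeof(*e));

	if (!e)
		return err_ptr(-ENOMEM);	/* not NULL: callers test is_err() */
	e->id = id;
	return e;
}
```

A caller that only checks is_err() would treat a NULL return as a valid
pointer, which is why the failure path has to return an error pointer.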

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Guangyu Sun <guangyu.sun@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
9 years agoIB/qib: use arch_phys_wc_add()
Luis R. Rodriguez [Wed, 22 Apr 2015 18:38:24 +0000 (11:38 -0700)]
IB/qib: use arch_phys_wc_add()

This driver already makes use of ioremap_wc() on PIO buffers,
so convert it to use arch_phys_wc_add().

The qib driver uses a mmap() special case for when PAT is
not used. This behaviour used to be determined by a module
parameter, but since we have been asked to just remove that
module parameter, this checks for the WC cookie instead: if it
is not set we can assume PAT was used. If it's set we do what
we used to do for the mmap when MTRR was enabled.

The removal of the module parameter is OK given that Andy
notes that even if users of the module parameter are still around
it will not prevent loading of the module on recent kernels.

Cc: Doug Ledford <dledford@redhat.com>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: Roland Dreier <roland@purestorage.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Suresh Siddha <sbsiddha@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Juergen Gross <jgross@suse.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Antonino Daplas <adaplas@gmail.com>
Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Stefan Bader <stefan.bader@canonical.com>
Cc: konrad.wilk@oracle.com
Cc: ville.syrjala@linux.intel.com
Cc: david.vrabel@citrix.com
Cc: jbeulich@suse.com
Cc: Roger Pau Monné <roger.pau@citrix.com>
Cc: infinipath@intel.com
Cc: linux-rdma@vger.kernel.org
Cc: linux-fbdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xensource.com
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
9 years agoIB/qib: add acounting for MTRR
Luis R. Rodriguez [Tue, 21 Apr 2015 21:50:34 +0000 (14:50 -0700)]
IB/qib: add acounting for MTRR

There is no good reason not to, we eventually delete it as well.

Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Suresh Siddha <sbsiddha@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Juergen Gross <jgross@suse.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Antonino Daplas <adaplas@gmail.com>
Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Cc: Mike Marciniszyn <infinipath@intel.com>
Cc: Roland Dreier <roland@kernel.org>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: linux-rdma@vger.kernel.org
Cc: linux-fbdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
9 years agoIB/core: dma unmap optimizations
Guy Shapiro [Wed, 15 Apr 2015 15:17:57 +0000 (18:17 +0300)]
IB/core: dma unmap optimizations

While unmapping an ODP writable page, the dirty bit of the page is set. In
order to do so, the head of the compound page is found.
Currently, the compound head is found even on non-writable pages, where it is
never used, leading to an unnecessary cpu barrier that impacts performance.

This patch moves the search for the compound head to be done only when needed.

Signed-off-by: Guy Shapiro <guysh@mellanox.com>
Acked-by: Shachar Raindel <raindel@mellanox.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
9 years agoIB/core: dma map/unmap locking optimizations
Guy Shapiro [Wed, 15 Apr 2015 15:17:56 +0000 (18:17 +0300)]
IB/core: dma map/unmap locking optimizations

Currently, while mapping or unmapping pages for ODP, the umem mutex is locked
and unlocked once for each page. Such lock/unlock operations take a few tens to
hundreds of nsecs. This has a significant impact when mapping or unmapping a few
MBs of memory.

To avoid this, the mutex should be locked only once per operation, and not per
page.
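
The difference between the two locking patterns can be sketched as below.
This is a user-space illustration only: struct counting_lock and the
map_pages_*() functions are hypothetical, and the "lock" merely counts
acquisitions instead of providing real mutual exclusion.

```c
#include <stddef.h>

/* Hypothetical lock that only counts acquisitions, standing in for a mutex. */
struct counting_lock {
	unsigned long acquisitions;
	int held;
};

static void cl_lock(struct counting_lock *l)
{
	l->acquisitions++;
	l->held = 1;
}

static void cl_unlock(struct counting_lock *l)
{
	l->held = 0;
}

/* Before: one lock/unlock round trip per page. */
static void map_pages_per_page(struct counting_lock *l, int *pages, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		cl_lock(l);
		pages[i] = 1;	/* mark the page as mapped */
		cl_unlock(l);
	}
}

/* After: one lock/unlock round trip per operation. */
static void map_pages_batched(struct counting_lock *l, int *pages, size_t n)
{
	cl_lock(l);
	for (size_t i = 0; i < n; i++)
		pages[i] = 1;
	cl_unlock(l);
}
```

For 512 pages the first variant takes the lock 512 times and the second
takes it once, at the price of holding it for the whole operation.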

Signed-off-by: Guy Shapiro <guysh@mellanox.com>
Acked-by: Shachar Raindel <raindel@mellanox.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
9 years agoRDMA/cxgb4: Report the actual address of the remote connecting peer
Steve Wise [Tue, 21 Apr 2015 20:28:41 +0000 (16:28 -0400)]
RDMA/cxgb4: Report the actual address of the remote connecting peer

Get the actual (non-mapped) ip/tcp address of the connecting peer from
the port mapper.

Also setup the passive side endpoint to correctly display the actual
and mapped addresses for the new connection.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
9 years agoRDMA/nes: Report the actual address of the remote connecting peer
Tatyana Nikolova [Tue, 21 Apr 2015 20:28:25 +0000 (16:28 -0400)]
RDMA/nes: Report the actual address of the remote connecting peer

Get the actual (non-mapped) ip/tcp address of the connecting peer from
the port mapper and report the address info to the user space application
at the time of connection establishment.

Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
9 years agoRDMA/core: Enable the iWarp Port Mapper to provide the actual address of the connecti...
Tatyana Nikolova [Tue, 21 Apr 2015 20:28:10 +0000 (16:28 -0400)]
RDMA/core: Enable the iWarp Port Mapper to provide the actual address of the connecting peer to its clients

Add functionality to enable the port mapper on the passive side to provide to its
clients the actual (non-mapped) ip/tcp address information of the connecting peer

1) Adding remote_info_cb() to process the address info of the connecting peer
   The address info is provided by the user space port mapper service when
   the connection is initiated by the peer
2) Adding a hash list to store the remote address info
3) Adding functionality to add/remove the remote address info
   After the info has been provided to the port mapper client,
   it is removed from the hash list

Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
9 years agoiw_cxgb4: enforce qp/cq id requirements
Hariprasad S [Tue, 21 Apr 2015 20:15:01 +0000 (01:45 +0530)]
iw_cxgb4: enforce qp/cq id requirements

Currently the iw_cxgb4 implementation requires the qp and cq qid densities
to match as well as the qp and cq id ranges.  So fail a device open if
the device configuration doesn't meet the requirements.

The reason for these restrictions has to do with the fact that IQ qid X
has a UGTS register in the same bar2 page as EQ qid X.  Thus both qids
need to be allocated to the same user process for security reasons.
The logic that does this (the qpid allocator in iw_cxgb4/resource.c)
handles this but requires the above restrictions.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
9 years agoiw_cxgb4: use BAR2 GTS register for T5 kernel mode CQs
Hariprasad S [Tue, 21 Apr 2015 20:15:00 +0000 (01:45 +0530)]
iw_cxgb4: use BAR2 GTS register for T5 kernel mode CQs

For T5, we must not use the kdb/kgts registers, in order to avoid db drops
under extreme loads.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
9 years agoiw_cxgb4: 32b platform fixes
Hariprasad S [Tue, 21 Apr 2015 20:14:59 +0000 (01:44 +0530)]
iw_cxgb4: 32b platform fixes

- get_dma_mr() was using ~0UL which should be ~0ULL.  This causes the
  DMA MR to get setup incorrectly in hardware.

- wr_log_show() needed a 64b divide function div64_u64() instead of
  doing division directly.

- fixed warnings about recasting a pointer to a u64
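
The ~0UL versus ~0ULL difference in the first item can be reproduced on any
host with the sketch below.  It assumes a 32-bit "unsigned long" on the
affected platforms and models that with uint32_t; ulong32_t and both
function names are hypothetical.

```c
#include <stdint.h>

/* Stand-in for 'unsigned long' on a 32-bit platform (assumed 32 bits wide). */
typedef uint32_t ulong32_t;

/* ~0UL on such a platform: only 32 bits are set before widening to 64. */
static uint64_t all_ones_ul_32bit(void)
{
	return (uint64_t)(ulong32_t)~(ulong32_t)0;
}

/* ~0ULL: all 64 bits are set regardless of the platform. */
static uint64_t all_ones_ull(void)
{
	return ~0ULL;
}
```

On a 64-bit build the two constants are identical, which is presumably why
the problem only showed up on 32b platforms.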

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
9 years agoiw_cxgb4: Cleanup register defines/MACROS
Hariprasad S [Tue, 21 Apr 2015 20:14:58 +0000 (01:44 +0530)]
iw_cxgb4: Cleanup register defines/MACROS

Cleanup macros and register defines for consistency

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
9 years agoRDMA/CMA: Canonize IPv4 on IPV6 sockets properly
Jason Gunthorpe [Mon, 20 Apr 2015 20:01:11 +0000 (14:01 -0600)]
RDMA/CMA: Canonize IPv4 on IPV6 sockets properly

When accepting a new IPv4 connection on an IPv6 socket, the CMA tries to
canonize the address family to IPv4, but does not properly process
the listening sockaddr to get the listening port, and does not properly
set the address family of the canonized sockaddr.
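
As a rough user-space illustration of what canonizing involves (this is not
the CMA code, and canonize_v4mapped() is a hypothetical helper), an
IPv4-mapped IPv6 sockaddr can be turned into a plain IPv4 one by setting
the family, carrying the port over and copying the last four address bytes:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: convert an IPv4-mapped IPv6 address (::ffff:a.b.c.d)
 * into a plain IPv4 sockaddr.  Returns 0 on success, -1 if src is not mapped.
 */
static int canonize_v4mapped(const struct sockaddr_in6 *src,
			     struct sockaddr_in *dst)
{
	if (!IN6_IS_ADDR_V4MAPPED(&src->sin6_addr))
		return -1;

	memset(dst, 0, sizeof(*dst));
	dst->sin_family = AF_INET;	/* set the canonical address family */
	dst->sin_port = src->sin6_port;	/* carry the port over unchanged */
	memcpy(&dst->sin_addr.s_addr, &src->sin6_addr.s6_addr[12], 4);
	return 0;
}
```

Forgetting either the family or the port leaves a sockaddr that looks like
IPv4 but cannot be matched correctly, which is the class of bug described
above.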

Fixes: e51060f08a61 ("IB: IP address based RDMA connection manager")

Cc: <stable@vger.kernel.org>
Reported-By: Yotam Kenneth <yotamke@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Tested-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
9 years agox86/spinlocks: Fix regression in spinlock contention detection
Tahsin Erdogan [Tue, 5 May 2015 04:15:31 +0000 (21:15 -0700)]
x86/spinlocks: Fix regression in spinlock contention detection

A spinlock is regarded as contended when there is at least one waiter.
Currently, the code that checks whether there are any waiters relies on
the tail value being greater than head. However, this is not true if tail
reaches the max value and wraps back to zero, so arch_spin_is_contended()
incorrectly returns 0 (not contended) when tail is smaller than head.

The original code (before regression) handled this case by casting the
(tail - head) to an unsigned value. This change simply restores that
behavior.
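
The wraparound problem can be reproduced with a small user-space sketch.
It assumes 8-bit ticket counters that advance by 1; ticket_t, TICKET_INC
and both function names are illustrative and are not the kernel's
definitions.

```c
#include <stdint.h>

typedef uint8_t ticket_t;	/* assumption: 8-bit ticket counters */
#define TICKET_INC 1		/* assumption: tickets advance by 1 */

/* Broken: the subtraction is done in int, so a wrapped tail goes negative. */
static int is_contended_broken(ticket_t head, ticket_t tail)
{
	return (tail - head) > TICKET_INC;
}

/* Fixed: casting back to the ticket type restores the modular distance. */
static int is_contended_fixed(ticket_t head, ticket_t tail)
{
	return (ticket_t)(tail - head) > TICKET_INC;
}
```

With head = 254 and tail = 1 (tail has wrapped past the maximum), the
broken check computes -253 and reports "not contended", while the cast
yields 3 and reports contention.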

Fixes: d6abfdb20223 ("x86/spinlocks/paravirt: Fix memory corruption on unlock")
Signed-off-by: Tahsin Erdogan <tahsin@google.com>
Cc: peterz@infradead.org
Cc: Waiman.Long@hp.com
Cc: borntraeger@de.ibm.com
Cc: oleg@redhat.com
Cc: raghavendra.kt@linux.vnet.ibm.com
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1430799331-20445-1-git-send-email-tahsin@google.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
9 years agof2fs: fix wrong error hanlder in f2fs_follow_link
Jaegeuk Kim [Wed, 22 Apr 2015 18:03:48 +0000 (11:03 -0700)]
f2fs: fix wrong error hanlder in f2fs_follow_link

page_follow_link_light() returns NULL and its error pointer remained
in nd->path.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
9 years agoRevert "f2fs: enhance multi-threads performance"
Jaegeuk Kim [Tue, 21 Apr 2015 17:40:54 +0000 (10:40 -0700)]
Revert "f2fs: enhance multi-threads performance"

Yuanhan Liu reported a performance regression caused by this change.
The basic idea was to reduce the single-point mutex, but it turns out this
causes other contention, such as context switches.

https://lkml.org/lkml/2015/4/21/11

Until finishing the analysis on this issue, I'd like to revert this for a while.

This reverts commit 78373b7319abdf15050af5b1632c4c8b8b398f33.

9 years agoACPI / PNP: add two IDs to list for PNPACPI device enumeration
Witold Szczeponik [Fri, 1 May 2015 17:05:20 +0000 (19:05 +0200)]
ACPI / PNP: add two IDs to list for PNPACPI device enumeration

Commit eec15edbb0e1 (ACPI / PNP: use device ID list for PNPACPI device
enumeration) changed the way ACPI devices are enumerated and when
they are added to the PNP bus.

However, it broke the sound card support on (at least) a vintage
IBM ThinkPad 600E: with said commit applied, two of the necessary
"CSC01xx" devices are not added to the PNP bus and hence can not be
found during the initialization of the "snd-cs4236" module.  As a
consequence, loading "snd-cs4236" causes null pointer exceptions.
The attached patch fixes the problem and re-enables sound on the
IBM ThinkPad 600E.

Fixes: eec15edbb0e1 (ACPI / PNP: use device ID list for PNPACPI device enumeration)
Signed-off-by: Witold Szczeponik <Witold.Szczeponik@gmx.net>
Cc: 3.16+ <stable@vger.kernel.org> # 3.16+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
9 years agopinctrl: mediatek: mtk-common: initialize unmask
Colin Ian King [Mon, 20 Apr 2015 15:59:17 +0000 (10:59 -0500)]
pinctrl: mediatek: mtk-common: initialize unmask

cppcheck detected an uninitialized variable:

[drivers/pinctrl/mediatek/pinctrl-mtk-common.c:897]:
  (error) Uninitialized variable: unmask

unmask should be initialized to zero to ensure unmasking
only occurs if a previous mask occurred. The current situation
is that the unmask variable could contain any random garbage
causing random unexpected unmasking.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
9 years agoACPI / documentation: Fix ambiguity in the GPIO properties document
Rafael J. Wysocki [Sun, 3 May 2015 23:58:27 +0000 (01:58 +0200)]
ACPI / documentation: Fix ambiguity in the GPIO properties document

The first paragraph in Documentation/acpi/gpio-properties.txt is
ambiguous, so make it more clear.

Reported-by: Antonio Ospite <ao2@ao2.it>
Acked-by: Antonio Ospite <ao2@ao2.it>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
9 years agohwrng: bcm63xx - Fix driver compilation
Álvaro Fernández Rojas [Sat, 2 May 2015 10:08:42 +0000 (12:08 +0200)]
hwrng: bcm63xx - Fix driver compilation

- s/clk_didsable_unprepare/clk_disable_unprepare
- s/prov/priv
- s/error/ret (bcm63xx_rng_probe)

Fixes: 6229c16060fe ("hwrng: bcm63xx - make use of devm_hwrng_register")
Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>