Peter Zijlstra [Wed, 13 Nov 2013 03:42:14 +0000 (19:42 -0800)]
 
block: Use u64_stats_init() to initialize seqcounts
Now that seqcounts are lockdep enabled objects, we need to explicitly
initialize runtime allocated seqcounts so that lockdep can track them.
Without this patch, Fengguang was seeing:
  [    4.127282] INFO: trying to register non-static key.
  [    4.128027] the code is fine but needs lockdep annotation.
  [    4.128027] turning off the locking correctness validator.
  [    4.128027] CPU: 0 PID: 96 Comm: kworker/u4:1 Not tainted 
3.12.0-next-20131108-10601-gbad570d #2
  [    4.128027] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  [    ...     ]
  [    4.128027] Call Trace:
  [    4.128027]  [<
7908e744>] ? console_unlock+0x353/0x380
  [    4.128027]  [<
79dc7cf2>] dump_stack+0x48/0x60
  [    4.128027]  [<
7908953e>] __lock_acquire.isra.26+0x7e3/0xceb
  [    4.128027]  [<
7908a1c5>] lock_acquire+0x71/0x9a
  [    4.128027]  [<
794079aa>] ? blk_throtl_bio+0x1c3/0x485
  [    4.128027]  [<
7940658b>] throtl_update_dispatch_stats+0x7c/0x153
  [    4.128027]  [<
794079aa>] ? blk_throtl_bio+0x1c3/0x485
  [    4.128027]  [<
794079aa>] blk_throtl_bio+0x1c3/0x485
  ...
Use u64_stats_init() for all affected data structures, which initializes
the seqcount.
Reported-and-Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
[ Folded in another fix from the mailing list as well as a fix to that fix. Tweaked commit message. ]
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1384314134-6895-1-git-send-email-john.stultz@linaro.org
[ So I actually think that the two SOBs from PeterZ are the right depiction of the patch route. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fengguang Wu [Fri, 8 Nov 2013 16:55:35 +0000 (00:55 +0800)]
 
locking/lockdep: Mark __lockdep_count_forward_deps() as static
There are new Sparse warnings:
  >> kernel/locking/lockdep.c:1235:15: sparse: symbol '__lockdep_count_forward_deps' was not declared. Should it be static?
  >> kernel/locking/lockdep.c:1261:15: sparse: symbol '__lockdep_count_backward_deps' was not declared. Should it be static?
Please consider folding the attached diff :-)
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/527d1787.ThzXGoUspZWehFDl\%fengguang.wu@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Wed, 6 Nov 2013 16:42:30 +0000 (17:42 +0100)]
 
lockdep/proc: Fix lock-time avg computation
>    kernel/locking/lockdep_proc.c: In function 'seq_lock_time':
> >> kernel/locking/lockdep_proc.c:424:23: warning: comparison of distinct pointer types lacks a cast [enabled by default]
>
>    418	static void seq_lock_time(struct seq_file *m, struct lock_time *lt)
>    419	{
>    420		seq_printf(m, "%14lu", lt->nr);
>    421		seq_time(m, lt->min);
>    422		seq_time(m, lt->max);
>    423		seq_time(m, lt->total);
>  > 424		seq_time(m, lt->nr ? do_div(lt->total, lt->nr) : 0);
>    425	}
My compiler refuses to actually say that; but it looks wrong in that
do_div() returns the remainder, not the divisor.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Link: http://lkml.kernel.org/r/20131106164230.GE16117@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Fri, 8 Nov 2013 07:26:39 +0000 (08:26 +0100)]
 
locking/doc: Update references to kernel/mutex.c
Fix this docbook error:
  >> docproc: kernel/mutex.c: No such file or directory
by updating the stale references to kernel/mutex.c.
Reported-by: fengguang.wu@intel.com
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-34pikw1tlsskj65rrt5iusrq@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
John Stultz [Mon, 7 Oct 2013 22:52:01 +0000 (15:52 -0700)]
 
ipv6: Fix possible ipv6 seqlock deadlock
While enabling lockdep on seqlocks, I ran across the warning below
caused by the ipv6 stats being updated in both irq and non-irq context.
This patch changes from IP6_INC_STATS_BH to IP6_INC_STATS (suggested
by Eric Dumazet) to resolve this problem.
[   11.120383] =================================
[   11.121024] [ INFO: inconsistent lock state ]
[   11.121663] 3.12.0-rc1+ #68 Not tainted
[   11.122229] ---------------------------------
[   11.122867] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[   11.123741] init/4483 [HC0[0]:SC1[3]:HE1:SE0] takes:
[   11.124505]  (&stats->syncp.seq#6){+.?...}, at: [<
c1ab80c2>] ndisc_send_ns+0xe2/0x130
[   11.125736] {SOFTIRQ-ON-W} state was registered at:
[   11.126447]   [<
c10e0eb7>] __lock_acquire+0x5c7/0x1af0
[   11.127222]   [<
c10e2996>] lock_acquire+0x96/0xd0
[   11.127925]   [<
c1a9a2c3>] write_seqcount_begin+0x33/0x40
[   11.128766]   [<
c1a9aa03>] ip6_dst_lookup_tail+0x3a3/0x460
[   11.129582]   [<
c1a9e0ce>] ip6_dst_lookup_flow+0x2e/0x80
[   11.130014]   [<
c1ad18e0>] ip6_datagram_connect+0x150/0x4e0
[   11.130014]   [<
c1a4d0b5>] inet_dgram_connect+0x25/0x70
[   11.130014]   [<
c198dd61>] SYSC_connect+0xa1/0xc0
[   11.130014]   [<
c198f571>] SyS_connect+0x11/0x20
[   11.130014]   [<
c198fe6b>] SyS_socketcall+0x12b/0x300
[   11.130014]   [<
c1bbf880>] syscall_call+0x7/0xb
[   11.130014] irq event stamp: 1184
[   11.130014] hardirqs last  enabled at (1184): [<
c1086901>] local_bh_enable+0x71/0x110
[   11.130014] hardirqs last disabled at (1183): [<
c10868cd>] local_bh_enable+0x3d/0x110
[   11.130014] softirqs last  enabled at (0): [<
c108014d>] copy_process.part.42+0x45d/0x11a0
[   11.130014] softirqs last disabled at (1147): [<
c1086e05>] irq_exit+0xa5/0xb0
[   11.130014]
[   11.130014] other info that might help us debug this:
[   11.130014]  Possible unsafe locking scenario:
[   11.130014]
[   11.130014]        CPU0
[   11.130014]        ----
[   11.130014]   lock(&stats->syncp.seq#6);
[   11.130014]   <Interrupt>
[   11.130014]     lock(&stats->syncp.seq#6);
[   11.130014]
[   11.130014]  *** DEADLOCK ***
[   11.130014]
[   11.130014] 3 locks held by init/4483:
[   11.130014]  #0:  (rcu_read_lock){.+.+..}, at: [<
c109363c>] SyS_setpriority+0x4c/0x620
[   11.130014]  #1:  (((&ifa->dad_timer))){+.-...}, at: [<
c108c1c0>] call_timer_fn+0x0/0xf0
[   11.130014]  #2:  (rcu_read_lock){.+.+..}, at: [<
c1ab6494>] ndisc_send_skb+0x54/0x5d0
[   11.130014]
[   11.130014] stack backtrace:
[   11.130014] CPU: 0 PID: 4483 Comm: init Not tainted 3.12.0-rc1+ #68
[   11.130014] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[   11.130014]  
00000000 00000000 c55e5c10 c1bb0e71 c57128b0 c55e5c4c c1badf79 c1ec1123
[   11.130014]  
c1ec1484 00001183 00000000 00000000 00000001 00000003 00000001 00000000
[   11.130014]  
c1ec1484 00000004 c5712dcc 00000000 c55e5c84 c10de492 00000004 c10755f2
[   11.130014] Call Trace:
[   11.130014]  [<
c1bb0e71>] dump_stack+0x4b/0x66
[   11.130014]  [<
c1badf79>] print_usage_bug+0x1d3/0x1dd
[   11.130014]  [<
c10de492>] mark_lock+0x282/0x2f0
[   11.130014]  [<
c10755f2>] ? kvm_clock_read+0x22/0x30
[   11.130014]  [<
c10dd8b0>] ? check_usage_backwards+0x150/0x150
[   11.130014]  [<
c10e0e74>] __lock_acquire+0x584/0x1af0
[   11.130014]  [<
c10b1baf>] ? sched_clock_cpu+0xef/0x190
[   11.130014]  [<
c10de58c>] ? mark_held_locks+0x8c/0xf0
[   11.130014]  [<
c10e2996>] lock_acquire+0x96/0xd0
[   11.130014]  [<
c1ab80c2>] ? ndisc_send_ns+0xe2/0x130
[   11.130014]  [<
c1ab66d3>] ndisc_send_skb+0x293/0x5d0
[   11.130014]  [<
c1ab80c2>] ? ndisc_send_ns+0xe2/0x130
[   11.130014]  [<
c1ab80c2>] ndisc_send_ns+0xe2/0x130
[   11.130014]  [<
c108cc32>] ? mod_timer+0xf2/0x160
[   11.130014]  [<
c1aa706e>] ? addrconf_dad_timer+0xce/0x150
[   11.130014]  [<
c1aa70aa>] addrconf_dad_timer+0x10a/0x150
[   11.130014]  [<
c1aa6fa0>] ? addrconf_dad_completed+0x1c0/0x1c0
[   11.130014]  [<
c108c233>] call_timer_fn+0x73/0xf0
[   11.130014]  [<
c108c1c0>] ? __internal_add_timer+0xb0/0xb0
[   11.130014]  [<
c1aa6fa0>] ? addrconf_dad_completed+0x1c0/0x1c0
[   11.130014]  [<
c108c5b1>] run_timer_softirq+0x141/0x1e0
[   11.130014]  [<
c1086b20>] ? __do_softirq+0x70/0x1b0
[   11.130014]  [<
c1086b70>] __do_softirq+0xc0/0x1b0
[   11.130014]  [<
c1086e05>] irq_exit+0xa5/0xb0
[   11.130014]  [<
c106cfd5>] smp_apic_timer_interrupt+0x35/0x50
[   11.130014]  [<
c1bbfbca>] apic_timer_interrupt+0x32/0x38
[   11.130014]  [<
c10936ed>] ? SyS_setpriority+0xfd/0x620
[   11.130014]  [<
c10e26c9>] ? lock_release+0x9/0x240
[   11.130014]  [<
c10936d7>] ? SyS_setpriority+0xe7/0x620
[   11.130014]  [<
c1bbee6d>] ? _raw_read_unlock+0x1d/0x30
[   11.130014]  [<
c1093701>] SyS_setpriority+0x111/0x620
[   11.130014]  [<
c109363c>] ? SyS_setpriority+0x4c/0x620
[   11.130014]  [<
c1bbf880>] syscall_call+0x7/0xb
Signed-off-by: John Stultz <john.stultz@linaro.org>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: James Morris <jmorris@namei.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/1381186321-4906-5-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
John Stultz [Mon, 7 Oct 2013 22:52:00 +0000 (15:52 -0700)]
 
cpuset: Fix potential deadlock w/ set_mems_allowed
After adding lockdep support to seqlock/seqcount structures,
I started seeing the following warning:
[    1.070907] ======================================================
[    1.072015] [ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ]
[    1.073181] 3.11.0+ #67 Not tainted
[    1.073801] ------------------------------------------------------
[    1.074882] kworker/u4:2/708 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
[    1.076088]  (&p->mems_allowed_seq){+.+...}, at: [<
ffffffff81187d7f>] new_slab+0x5f/0x280
[    1.077572]
[    1.077572] and this task is already holding:
[    1.078593]  (&(&q->__queue_lock)->rlock){..-...}, at: [<
ffffffff81339f03>] blk_execute_rq_nowait+0x53/0xf0
[    1.080042] which would create a new lock dependency:
[    1.080042]  (&(&q->__queue_lock)->rlock){..-...} -> (&p->mems_allowed_seq){+.+...}
[    1.080042]
[    1.080042] but this new dependency connects a SOFTIRQ-irq-safe lock:
[    1.080042]  (&(&q->__queue_lock)->rlock){..-...}
[    1.080042] ... which became SOFTIRQ-irq-safe at:
[    1.080042]   [<
ffffffff810ec179>] __lock_acquire+0x5b9/0x1db0
[    1.080042]   [<
ffffffff810edfe5>] lock_acquire+0x95/0x130
[    1.080042]   [<
ffffffff818968a1>] _raw_spin_lock+0x41/0x80
[    1.080042]   [<
ffffffff81560c9e>] scsi_device_unbusy+0x7e/0xd0
[    1.080042]   [<
ffffffff8155a612>] scsi_finish_command+0x32/0xf0
[    1.080042]   [<
ffffffff81560e91>] scsi_softirq_done+0xa1/0x130
[    1.080042]   [<
ffffffff8133b0f3>] blk_done_softirq+0x73/0x90
[    1.080042]   [<
ffffffff81095dc0>] __do_softirq+0x110/0x2f0
[    1.080042]   [<
ffffffff81095fcd>] run_ksoftirqd+0x2d/0x60
[    1.080042]   [<
ffffffff810bc506>] smpboot_thread_fn+0x156/0x1e0
[    1.080042]   [<
ffffffff810b3916>] kthread+0xd6/0xe0
[    1.080042]   [<
ffffffff818980ac>] ret_from_fork+0x7c/0xb0
[    1.080042]
[    1.080042] to a SOFTIRQ-irq-unsafe lock:
[    1.080042]  (&p->mems_allowed_seq){+.+...}
[    1.080042] ... which became SOFTIRQ-irq-unsafe at:
[    1.080042] ...  [<
ffffffff810ec1d3>] __lock_acquire+0x613/0x1db0
[    1.080042]   [<
ffffffff810edfe5>] lock_acquire+0x95/0x130
[    1.080042]   [<
ffffffff810b3df2>] kthreadd+0x82/0x180
[    1.080042]   [<
ffffffff818980ac>] ret_from_fork+0x7c/0xb0
[    1.080042]
[    1.080042] other info that might help us debug this:
[    1.080042]
[    1.080042]  Possible interrupt unsafe locking scenario:
[    1.080042]
[    1.080042]        CPU0                    CPU1
[    1.080042]        ----                    ----
[    1.080042]   lock(&p->mems_allowed_seq);
[    1.080042]                                local_irq_disable();
[    1.080042]                                lock(&(&q->__queue_lock)->rlock);
[    1.080042]                                lock(&p->mems_allowed_seq);
[    1.080042]   <Interrupt>
[    1.080042]     lock(&(&q->__queue_lock)->rlock);
[    1.080042]
[    1.080042]  *** DEADLOCK ***
The issue stems from the kthreadd() function calling set_mems_allowed
with irqs enabled. While its possibly unlikely for the actual deadlock
to trigger, a fix is fairly simple: disable irqs before taking the
mems_allowed_seq lock.
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/1381186321-4906-4-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
John Stultz [Mon, 7 Oct 2013 22:51:59 +0000 (15:51 -0700)]
 
seqcount: Add lockdep functionality to seqcount/seqlock structures
Currently seqlocks and seqcounts don't support lockdep.
After running across a seqcount related deadlock in the timekeeping
code, I used a less-refined and more focused variant of this patch
to narrow down the cause of the issue.
This is a first-pass attempt to properly enable lockdep functionality
on seqlocks and seqcounts.
Since seqcounts are used in the vdso gettimeofday code, I've provided
non-lockdep accessors for those needs.
I've also handled one case where there were nested seqlock writers
and there may be more edge cases.
Comments and feedback would be appreciated!
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/1381186321-4906-3-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
John Stultz [Mon, 7 Oct 2013 22:51:58 +0000 (15:51 -0700)]
 
net: Explicitly initialize u64_stats_sync structures for lockdep
In order to enable lockdep on seqcount/seqlock structures, we
must explicitly initialize any locks.
The u64_stats_sync structure, uses a seqcount, and thus we need
to introduce a u64_stats_init() function and use it to initialize
the structure.
This unfortunately adds a lot of fairly trivial initialization code
to a number of drivers. But the benefit of ensuring correctness makes
this worth while.
Because these changes are required for lockdep to be enabled, and the
changes are quite trivial, I've not yet split this patch out into 30-some
separate patches, as I figured it would be better to get the various
maintainers thoughts on how to best merge this change along with
the seqcount lockdep enablement.
Feedback would be appreciated!
Signed-off-by: John Stultz <john.stultz@linaro.org>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: James Morris <jmorris@namei.org>
Cc: Jesse Gross <jesse@nicira.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Mirko Lindner <mlindner@marvell.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Roger Luethi <rl@hellgate.ch>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Simon Horman <horms@verge.net.au>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Wensong Zhang <wensong@linux-vs.org>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/1381186321-4906-2-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Mon, 4 Nov 2013 17:05:09 +0000 (18:05 +0100)]
 
locking: Move the percpu-rwsem code to kernel/locking/
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-52bjmtty46we26hbfd9sc9iy@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Mon, 4 Nov 2013 10:51:33 +0000 (11:51 +0100)]
 
locking: Move the lglocks code to kernel/locking/
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-amd6pg1mif6tikbyktfvby3y@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 31 Oct 2013 17:19:28 +0000 (18:19 +0100)]
 
locking: Move the rwsem code to kernel/locking/
Notably: changed lib/rwsem* targets from lib- to obj-, no idea about
the ramifications of that.
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-g0kynfh5feriwc6p3h6kpbw6@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 31 Oct 2013 17:18:19 +0000 (18:18 +0100)]
 
locking: Move the rtmutex code to kernel/locking/
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-p9ijt8div0hwldexwfm4nlhj@git.kernel.org
[ Fixed build failure in kernel/rcu/tree_plugin.h. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 31 Oct 2013 17:16:43 +0000 (18:16 +0100)]
 
locking: Move the semaphore core to kernel/locking/
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-vmw5sf6vzmua1z6nx1cg69h2@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 31 Oct 2013 17:15:36 +0000 (18:15 +0100)]
 
locking: Move the spinlock code to kernel/locking/
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-b81ol0z3mon45m51o131yc9j@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 31 Oct 2013 17:14:17 +0000 (18:14 +0100)]
 
locking: Move the lockdep code to kernel/locking/
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-wl7s3tta5isufzfguc23et06@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 31 Oct 2013 17:11:53 +0000 (18:11 +0100)]
 
locking: Move the mutex code to kernel/locking/
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-1ditvncg30dgbpvrz2bxfmke@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Wed, 6 Nov 2013 06:50:37 +0000 (07:50 +0100)]
 
Merge branch 'sched/core' into core/locking, to prepare the kernel/locking/ file move
Conflicts:
	kernel/Makefile
There are conflicts in kernel/Makefile due to file moving in the
scheduler tree - resolve them.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Fri, 4 Oct 2013 20:06:53 +0000 (22:06 +0200)]
 
sched: Move completion code from core.c to completion.c
Completions already have their own header file: linux/completion.h
Move the implementation out of kernel/sched/core.c and into its own
file: kernel/sched/completion.c.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-x2y49rmxu5dljt66ai2lcfuw@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Fri, 4 Oct 2013 15:24:35 +0000 (17:24 +0200)]
 
sched: Move wait code from core.c to wait.c
For some reason only the wait part of the wait api lives in
kernel/sched/wait.c and the wake part still lives in kernel/sched/core.c;
ammend this.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-ftycee88naznulqk7ei5mbci@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 31 Oct 2013 17:07:08 +0000 (18:07 +0100)]
 
sched: Move wait.c into kernel/sched/
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-5q5yqvdaen0rmapwloeaotx3@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Wed, 6 Nov 2013 06:43:37 +0000 (07:43 +0100)]
 
Merge branch 'core/rcu' into core/locking, to prepare the kernel/locking/ file move
There are conflicts in lockdep.c due to RCU changes, and also the RCU
tree changes kernel/Makefile - so pre-merge it to ease the moving of
locking related .c files to kernel/locking/.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Wed, 6 Nov 2013 05:39:45 +0000 (06:39 +0100)]
 
Merge tag 'v3.12' into core/locking to pick up mutex upates
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Sun, 3 Nov 2013 23:41:51 +0000 (15:41 -0800)]
 
Linux 3.12
Linus Torvalds [Sun, 3 Nov 2013 19:36:41 +0000 (11:36 -0800)]
 
Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
 "Three fixes across arch/mips with the most complex one being the GIC
  interrupt fix - at nine lines still not monster.  I'm confident this
  are the final MIPS patches even if there should go for an rc8"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: ralink: fix return value check in rt_timer_probe()
  MIPS: malta: Fix GIC interrupt offsets
  MIPS: Perf: Fix 74K cache map
Mathias Krause [Sun, 3 Nov 2013 11:36:28 +0000 (12:36 +0100)]
 
ipc, msg: forbid negative values for "msg{max,mnb,mni}"
Negative message lengths make no sense -- so don't do negative queue
lenghts or identifier counts. Prevent them from getting negative.
Also change the underlying data types to be unsigned to avoid hairy
surprises with sign extensions in cases where those variables get
evaluated in unsigned expressions with bigger data types, e.g size_t.
In case a user still wants to have "unlimited" sizes she could just use
INT_MAX instead.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 2 Nov 2013 17:27:29 +0000 (10:27 -0700)]
 
Merge tag 'fixes-for-linus' of git://git./linux/kernel/git/rusty/linux
Pull ARM kallsyms fix from Rusty Russell:
 "Last minute perf unbreakage for ARM modules; spent a day in
  linux-next"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  scripts/kallsyms: filter symbols not in kernel address space
Vineet Gupta [Sat, 2 Nov 2013 12:17:49 +0000 (17:47 +0530)]
 
ARC: Incorrect mm reference used in vmalloc fault handler
A vmalloc fault needs to sync up PGD/PTE entry from init_mm to current
task's "active_mm".  ARC vmalloc fault handler however was using mm.
A vmalloc fault for non user task context (actually pre-userland, from
init thread's open for /dev/console) caused the handler to deref NULL mm
(for mm->pgd)
The reasons it worked so far is amazing:
1. By default (!SMP), vmalloc fault handler uses a cached value of PGD.
   In SMP that MMU register is repurposed hence need for mm pointer deref.
2. In pre-3.12 SMP kernel, the problem triggering vmalloc didn't exist in
   pre-userland code path - it was introduced with commit 
20bafb3d23d108bc
   "n_tty: Move buffers into n_tty_data"
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Gilad Ben-Yossef <gilad@benyossef.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: stable@vger.kernel.org    #3.10 and 3.11
Cc: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ming Lei [Fri, 1 Nov 2013 22:41:33 +0000 (09:11 +1030)]
 
scripts/kallsyms: filter symbols not in kernel address space
This patch uses CONFIG_PAGE_OFFSET to filter symbols which
are not in kernel address space because these symbols are
generally for generating code purpose and can't be run at
kernel mode, so we needn't keep them in /proc/kallsyms.
For example, on ARM there are some symbols which may be
linked in relocatable code section, then perf can't parse
symbols any more from /proc/kallsyms, this patch fixes the
problem (introduced 
b9b32bf70f2fb710b07c94e13afbc729afe221da)
Cc: Russell King <linux@arm.linux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@vger.kernel.org
Linus Torvalds [Fri, 1 Nov 2013 19:54:51 +0000 (12:54 -0700)]
 
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Two fixes:
   - Fix 'NMI handler took too long to run' false positives
     [ Genuine NMI overhead speedups will come for v3.13, this commit
       only fixes a measurement bug ]
   - Fix perf ring-buffer missed barrier causing (rare) ring-buffer data
     corruption on ppc64"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86: Fix NMI measurements
  perf: Fix perf ring buffer memory ordering
Linus Torvalds [Fri, 1 Nov 2013 19:23:56 +0000 (12:23 -0700)]
 
Merge tag 'usb-3.12-rc8' of git://git./linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
 "Here is a set of patches that revert all of the changes done to the
  pl2303 USB serial driver in the 3.12-rc timeframe, as it turns out
  they break some devices that work just fine on 3.11.  As it's not a
  good idea to break working systems, drop them all and they will be
  reworked for future kernel versions such that there is no breakage.
  I've also included a MAINTAINERS update for the USB serial subsystem
  and a new device id for the ftdi_sio driver as well"
* tag 'usb-3.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: serial: ftdi_sio: add id for Z3X Box device
  USB: Maintainers change for usb serial drivers
  Revert "USB: pl2303: restrict the divisor based baud rate encoding method to the "HX" chip type"
  Revert "usb: pl2303: fix+improve the divsor based baud rate encoding method"
  Revert "usb: pl2303: do not round to the next nearest standard baud rate for the divisor based baud rate encoding method"
  Revert "usb: pl2303: remove 500000 baud from the list of standard baud rates"
  Revert "usb: pl2303: move the two baud rate encoding methods to separate functions"
  Revert "usb: pl2303: increase the allowed baud rate range for the divisor based encoding method"
  Revert "usb: pl2303: also use the divisor based baud rate encoding method for baud rates < 115200 with HX chips"
  Revert "usb: pl2303: add two comments concerning the supported baud rates with HX chips"
  Revert "pl2303: simplify the else-if contruct for type_1 chips in pl2303_startup()"
  Revert "pl2303: improve the chip type information output on startup"
  Revert "pl2303: improve the chip type detection/distinction"
  Revert "USB: pl2303: distinguish between original and cloned HX chips"
Linus Torvalds [Fri, 1 Nov 2013 19:23:22 +0000 (12:23 -0700)]
 
Merge tag 'sound-3.12' of git://git./linux/kernel/git/tiwai/sound
Pull more sound fixes from Takashi Iwai:
 "The fixes for random bugs that have been reported lately in the game:
  a few fixes in ASoC dpam and wm_hubs bugs spotted by Coverity, a
  one-liner HD-audio fixup, and a fix for Oops with DPCM.
  They are not so critically urgent bugs, but all small and safe"
* tag 'sound-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: fix oops in snd_pcm_info() caused by ASoC DPCM
  ASoC: wm_hubs: Add missing break in hp_supply_event()
  ALSA: hda - Add a fixup for ASUS N76VZ
  ASoC: dapm: Return -ENOMEM in snd_soc_dapm_new_dai_widgets()
  ASoC: dapm: Fix source list debugfs outputs
Linus Torvalds [Fri, 1 Nov 2013 19:22:47 +0000 (12:22 -0700)]
 
Merge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux
Pull clock subsystem fixes from Mike Turquette.
* tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux:
  clk: fixup argument order when setting VCO parameters
  clk: socfpga: Fix incorrect sdmmc clock name
  clk: armada-370: fix tclk frequencies
  clk: nomadik: set all timers to use 2.4 MHz TIMCLK
Greg Thelen [Fri, 1 Nov 2013 19:16:59 +0000 (12:16 -0700)]
 
memcg: remove incorrect underflow check
When a memcg is deleted mem_cgroup_reparent_charges() moves charged
memory to the parent memcg.  As of 
v3.11-9444-g3ea67d0 "memcg: add per
cgroup writeback pages accounting" there's bad pointer read.  The goal
was to check for counter underflow.  The counter is a per cpu counter
and there are two problems with the code:
 (1) per cpu access function isn't used, instead a naked pointer is used
     which easily causes oops.
 (2) the check doesn't sum all cpus
Test:
  $ cd /sys/fs/cgroup/memory
  $ mkdir x
  $ echo 3 > /proc/sys/vm/drop_caches
  $ (echo $BASHPID >> x/tasks && exec cat) &
  [1] 7154
  $ grep ^mapped x/memory.stat
  mapped_file 53248
  $ echo 7154 > tasks
  $ rmdir x
  <OOPS>
The fix is to remove the check.  It's currently dangerous and isn't
worth fixing it to use something expensive, such as
percpu_counter_sum(), for each reparented page.  __this_cpu_read() isn't
enough to fix this because there's no guarantees of the current cpus
count.  The only guarantees is that the sum of all per-cpu counter is >=
nr_pages.
Fixes: 
3ea67d06e467 ("memcg: add per cgroup writeback pages accounting")
Reported-and-tested-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Greg Thelen <gthelen@google.com>
Reviewed-by: Sha Zhengju <handai.szj@taobao.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Алексей Крамаренко [Fri, 1 Nov 2013 13:26:38 +0000 (17:26 +0400)]
 
USB: serial: ftdi_sio: add id for Z3X Box device
Custom VID/PID for Z3X Box device, popular tool for cellphone flashing.
Signed-off-by: Alexey E. Kramarenko <alexeyk13@yandex.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg KH [Wed, 30 Oct 2013 18:07:31 +0000 (11:07 -0700)]
 
USB: Maintainers change for usb serial drivers
Johan has been conned^Wgracious in accepting the maintainership of the
USB serial drivers, especially as he's been doing all of the real work
for the past few years.
At the same time, remove a bunch of old entries for USB serial drivers
that don't make sense anymore, given that the developers are no longer
around, and individual driver maintainerships for tiny things like this
is pretty pointless.
Acked-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Fri, 1 Nov 2013 16:19:56 +0000 (09:19 -0700)]
 
Revert "USB: pl2303: restrict the divisor based baud rate encoding method to the "HX" chip type"
This reverts commit 
b8bdad608213caffa081a97d2e937e5fe08c4046.
Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as
they cause regressions on some versions of the chip.  This will all be
revisited for later kernel versions when we can figure out how to handle
this in a way that does not break working devices.
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Frank Schäfer <fschaefer.oss@googlemail.com>
Acked-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Fri, 1 Nov 2013 16:19:45 +0000 (09:19 -0700)]
 
Revert "usb: pl2303: fix+improve the divsor based baud rate encoding method"
This reverts commit 
57ce61aad748ceaa08c859da04043ad7dae7c15e.
Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as
they cause regressions on some versions of the chip.  This will all be
revisited for later kernel versions when we can figure out how to handle
this in a way that does not break working devices.
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Frank Schäfer <fschaefer.oss@googlemail.com>
Acked-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Fri, 1 Nov 2013 16:19:34 +0000 (09:19 -0700)]
 
Revert "usb: pl2303: do not round to the next nearest standard baud rate for the divisor based baud rate encoding method"
This reverts commit 
75417d9f99f89ab241de69d7db15af5842b488c4.
Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as
they cause regressions on some versions of the chip.  This will all be
revisited for later kernel versions when we can figure out how to handle
this in a way that does not break working devices.
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Frank Schäfer <fschaefer.oss@googlemail.com>
Acked-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Fri, 1 Nov 2013 16:19:24 +0000 (09:19 -0700)]
 
Revert "usb: pl2303: remove 500000 baud from the list of standard baud rates"
This reverts commit 
b9208c721ce736125fe58d398319513a27850fd8.
Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as
they cause regressions on some versions of the chip.  This will all be
revisited for later kernel versions when we can figure out how to handle
this in a way that does not break working devices.
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Frank Schäfer <fschaefer.oss@googlemail.com>
Acked-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Fri, 1 Nov 2013 16:19:03 +0000 (09:19 -0700)]
 
Revert "usb: pl2303: move the two baud rate encoding methods to separate functions"
This reverts commit 
e917ba01d69ad705a4cd6a6c77538f55d84f5907.
Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as
they cause regressions on some versions of the chip.  This will all be
revisited for later kernel versions when we can figure out how to handle
this in a way that does not break working devices.
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Frank Schäfer <fschaefer.oss@googlemail.com>
Acked-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Fri, 1 Nov 2013 16:18:47 +0000 (09:18 -0700)]
 
Revert "usb: pl2303: increase the allowed baud rate range for the divisor based encoding method"
This reverts commit 
b5c16c6a031c52cc4b7dda6c3de46462fbc92eab.
Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as
they cause regressions on some versions of the chip.  This will all be
revisited for later kernel versions when we can figure out how to handle
this in a way that does not break working devices.
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Frank Schäfer <fschaefer.oss@googlemail.com>
Acked-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Fri, 1 Nov 2013 16:18:38 +0000 (09:18 -0700)]
 
Revert "usb: pl2303: also use the divisor based baud rate encoding method for baud rates < 115200 with HX chips"
This reverts commit 
61fa8d694b8547894b57ea0d99d0120a58f6ebf8.
Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as
they cause regressions on some versions of the chip.  This will all be
revisited for later kernel versions when we can figure out how to handle
this in a way that does not break working devices.
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Frank Schäfer <fschaefer.oss@googlemail.com>
Acked-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Fri, 1 Nov 2013 16:18:25 +0000 (09:18 -0700)]
 
Revert "usb: pl2303: add two comments concerning the supported baud rates with HX chips"
This reverts commit 
c23bda365dfbf56aa4d6d4a97f83136c36050e01.
Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as
they cause regressions on some versions of the chip.  This will all be
revisited for later kernel versions when we can figure out how to handle
this in a way that does not break working devices.
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Frank Schäfer <fschaefer.oss@googlemail.com>
Acked-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Fri, 1 Nov 2013 16:18:10 +0000 (09:18 -0700)]
 
Revert "pl2303: simplify the else-if contruct for type_1 chips in pl2303_startup()"
This reverts commit 
73b583af597542329e6adae44524da6f27afed62.
Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as
they cause regressions on some versions of the chip.  This will all be
revisited for later kernel versions when we can figure out how to handle
this in a way that does not break working devices.
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Frank Schäfer <fschaefer.oss@googlemail.com>
Acked-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Fri, 1 Nov 2013 16:17:50 +0000 (09:17 -0700)]
 
Revert "pl2303: improve the chip type information output on startup"
This reverts commit 
a77a8c23e4db9fb1f776147eda0d85117359c700.
Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as
they cause regressions on some versions of the chip.  This will all be
revisited for later kernel versions when we can figure out how to handle
this in a way that does not break working devices.
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Frank Schäfer <fschaefer.oss@googlemail.com>
Acked-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Fri, 1 Nov 2013 16:16:09 +0000 (09:16 -0700)]
 
Revert "pl2303: improve the chip type detection/distinction"
This reverts commit 
034d1527adebd302115c87ef343497a889638275.
Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as
they cause regressions on some versions of the chip.  This will all be
revisited for later kernel versions when we can figure out how to handle
this in a way that does not break working devices.
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Frank Schäfer <fschaefer.oss@googlemail.com>
Acked-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Fri, 1 Nov 2013 16:12:52 +0000 (09:12 -0700)]
 
Revert "USB: pl2303: distinguish between original and cloned HX chips"
This reverts commit 
7d26a78f62ff4fb08bc5ba740a8af4aa7ac67da4.
Revert all of the pl2303 changes that went into 3.12-rc1 and -rc2 as
they cause regressions on some versions of the chip.  This will all be
revisited for later kernel versions when we can figure out how to handle
this in a way that does not break working devices.
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Frank Schäfer <fschaefer.oss@googlemail.com>
Acked-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Heiko Carstens [Thu, 31 Oct 2013 11:48:14 +0000 (12:48 +0100)]
 
sched/wait: Fix __wait_event_interruptible_lock_irq_timeout()
__wait_event_interruptible_lock_irq_timeout() needs the timeout
parameter passed instead of "ret".
This magically compiled since the only user has a local ret
variable. Luckily we got a build warning:
  CC      drivers/s390/scsi/zfcp_qdio.o
  drivers/s390/scsi/zfcp_qdio.c: In function 'zfcp_qdio_sbal_get':
  include/linux/wait.h:780:15: warning: 'ret' may be used uninitialized
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20131031114814.GB5551@osiris
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Fri, 1 Nov 2013 07:10:58 +0000 (08:10 +0100)]
 
Merge branch 'linus' into sched/core
Resolve cherry-picking conflicts:
Conflicts:
	mm/huge_memory.c
	mm/memory.c
	mm/mprotect.c
See this upstream merge commit for more details:
  
52469b4fcd4f Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Thu, 31 Oct 2013 23:58:23 +0000 (16:58 -0700)]
 
Merge branch 'akpm' (fixes from Andrew Morton)
Merge four more fixes from Andrew Morton.
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  lib/scatterlist.c: don't flush_kernel_dcache_page on slab page
  mm: memcg: fix test for child groups
  mm: memcg: lockdep annotation for memcg OOM lock
  mm: memcg: use proper memcg in limit bypass
Ming Lei [Thu, 31 Oct 2013 23:34:17 +0000 (16:34 -0700)]
 
lib/scatterlist.c: don't flush_kernel_dcache_page on slab page
Commit 
b1adaf65ba03 ("[SCSI] block: add sg buffer copy helper
functions") introduces two sg buffer copy helpers, and calls
flush_kernel_dcache_page() on pages in SG list after these pages are
written to.
Unfortunately, the commit may introduce a potential bug:
 - Before sending some SCSI commands, kmalloc() buffer may be passed to
   block layper, so flush_kernel_dcache_page() can see a slab page
   finally
 - According to cachetlb.txt, flush_kernel_dcache_page() is only called
   on "a user page", which surely can't be a slab page.
 - ARCH's implementation of flush_kernel_dcache_page() may use page
   mapping information to do optimization so page_mapping() will see the
   slab page, then VM_BUG_ON() is triggered.
Aaro Koskinen reported the bug on ARM/kirkwood when DEBUG_VM is enabled,
and this patch fixes the bug by adding test of '!PageSlab(miter->page)'
before calling flush_kernel_dcache_page().
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Tested-by: Simon Baatz <gmbnomis@gmail.com>
Cc: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Tejun Heo <tj@kernel.org>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: <stable@vger.kernel.org>	[3.2+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Weiner [Thu, 31 Oct 2013 23:34:15 +0000 (16:34 -0700)]
 
mm: memcg: fix test for child groups
When memcg code needs to know whether any given memcg has children, it
uses the cgroup child iteration primitives and returns true/false
depending on whether the iteration loop is executed at least once or
not.
Because a cgroup's list of children is RCU protected, these primitives
require the RCU read-lock to be held, which is not the case for all
memcg callers.  This results in the following splat when e.g.  enabling
hierarchy mode:
  WARNING: CPU: 3 PID: 1 at kernel/cgroup.c:3043 css_next_child+0xa3/0x160()
  CPU: 3 PID: 1 Comm: systemd Not tainted 
3.12.0-rc5-00117-g83f11a9-dirty #18
  Hardware name: LENOVO 
3680B56/
3680B56, BIOS 6QET69WW (1.39 ) 04/26/2012
  Call Trace:
    dump_stack+0x54/0x74
    warn_slowpath_common+0x78/0xa0
    warn_slowpath_null+0x1a/0x20
    css_next_child+0xa3/0x160
    mem_cgroup_hierarchy_write+0x5b/0xa0
    cgroup_file_write+0x108/0x2a0
    vfs_write+0xbd/0x1e0
    SyS_write+0x4c/0xa0
    system_call_fastpath+0x16/0x1b
In the memcg case, we only care about children when we are attempting to
modify inheritable attributes interactively.  Racing with deletion could
mean a spurious -EBUSY, no problem.  Racing with addition is handled
just fine as well through the memcg_create_mutex: if the child group is
not on the list after the mutex is acquired, it won't be initialized
from the parent's attributes until after the unlock.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Weiner [Thu, 31 Oct 2013 23:34:14 +0000 (16:34 -0700)]
 
mm: memcg: lockdep annotation for memcg OOM lock
The memcg OOM lock is a mutex-type lock that is open-coded due to
memcg's special needs.  Add annotations for lockdep coverage.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Weiner [Thu, 31 Oct 2013 23:34:13 +0000 (16:34 -0700)]
 
mm: memcg: use proper memcg in limit bypass
Commit 
84235de394d9 ("fs: buffer: move allocation failure loop into the
allocator") allowed __GFP_NOFAIL allocations to bypass the limit if they
fail to reclaim enough memory for the charge.  But because the main test
case was on a 3.2-based system, the patch missed the fact that on newer
kernels the charge function needs to return root_mem_cgroup when
bypassing the limit, and not NULL.  This will corrupt whatever memory is
at NULL + percpu pointer offset.  Fix this quickly before problems are
reported.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 31 Oct 2013 22:43:02 +0000 (15:43 -0700)]
 
vfs: decrapify dput(), fix cache behavior under normal load
We do not want to dirty the dentry->d_flags cacheline in dput() just to
set the DCACHE_REFERENCED flag when it is already set in the common case
anyway.  This way the first cacheline of the dentry (which contains the
RCU lookup information etc) can stay shared among multiple CPU's.
This finishes off some of the details of all the scalability patches
merged during the merge window.
Also don't mark dentry_kill() for inlining, since it's the uncommon path
and inlining it just makes the common path slower due to extra function
entry/exit overhead.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 31 Oct 2013 22:28:23 +0000 (15:28 -0700)]
 
i915: fix compiler warning
The last i915 drm update brought with it this annoying warning
  drivers/gpu/drm/i915/intel_crt.c: In function ‘intel_crt_get_config’:
  drivers/gpu/drm/i915/intel_crt.c:110:21: warning: unused variable ‘dev’ [-Wunused-variable]
    struct drm_device *dev = encoder->base.dev;
                       ^
introduced by commit 
7195a50b5c7e ("drm/i915: Add HSW CRT output readout
support").
Remove the offending pointless variable.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 31 Oct 2013 22:21:26 +0000 (15:21 -0700)]
 
Merge branch 'core-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull NUMA balancing memory corruption fixes from Ingo Molnar:
 "So these fixes are definitely not something I'd like to sit on, but as
  I said to Mel at the KS the timing is quite tight, with Linus planning
  v3.12-final within a week.
  Fedora-19 is affected:
   comet:~> grep NUMA_BALANCING /boot/config-3.11.3-201.fc19.x86_64
   CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
   CONFIG_NUMA_BALANCING_DEFAULT_ENABLED=y
   CONFIG_NUMA_BALANCING=y
  AFAICS Ubuntu will be affected as well, once it updates the kernel:
   hubble:~> grep NUMA_BALANCING /boot/config-3.8.0-32-generic
   CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
   CONFIG_NUMA_BALANCING_DEFAULT_ENABLED=y
   CONFIG_NUMA_BALANCING=y
  These 6 commits are a minimalized set of cherry-picks needed to fix
  the memory corruption bugs.  All commits are fixes, except "mm: numa:
  Sanitize task_numa_fault() callsites" which is a cleanup that made two
  followup fixes simpler.
  I've done targeted testing with just this SHA1 to try to make sure
  there are no cherry-picking artifacts.  The original non-cherry-picked
  set of fixes were exposed to linux-next for a couple of weeks"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  mm: Account for a THP NUMA hinting update as one PTE update
  mm: Close races between THP migration and PMD numa clearing
  mm: numa: Sanitize task_numa_fault() callsites
  mm: Prevent parallel splits during THP migration
  mm: Wait for THP migrations to complete during NUMA hinting faults
  mm: numa: Do not account for a hinting fault if we raced
Linus Torvalds [Thu, 31 Oct 2013 17:38:59 +0000 (10:38 -0700)]
 
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
 "A bit later than I would want, but the changes are very minor - a few
  new device IDs for new hardware in existing drivers, fix for battery
  in Wacom devices not be considered system battery and cause emergency
  hibernations, and a couple of other bug fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: ALPS - add support for model found on Dell XT2
  Input: wacom - add support for ISDv4 0x10E sensor
  Input: wacom - add support for ISDv4 0x10F sensor
  Input: wacom - export battery scope
  Input: cm109 - convert high volume dev_err() to dev_err_ratelimited()
  Input: move name/timer init to input_alloc_dev()
  Input: i8042 - i8042_flush fix for a full 8042 buffer
  Input: pxa27x_keypad - fix NULL pointer dereference
Linus Torvalds [Thu, 31 Oct 2013 17:13:28 +0000 (10:13 -0700)]
 
Merge tag 'pm+acpi-3.12-late' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael J Wysocki:
 "Last-minute ACPI and power management fixes for 3.12
   - Revert epoll and select commits related to the freezer, introduced
     during the 3.11 cycle, that cause mysterious user space breakage to
     occur during resume from suspend to RAM for multiple users of
     32-bit x86 systems.  Material for 3.11.y stable kernels.
   - Revert a recent ACPI-based PCI hotplug (ACPIPHP) commit that was
     part of boot problem fixes for one machine, but turns out to cause
     issues with hotplug on Thunderbolt chains with multiple devices.
     It also turns out to be unnecessary after another fix in the same
     area that went in later.  From Mika Westerberg"
* tag 'pm+acpi-3.12-late' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  Revert "ACPI / hotplug / PCI: Avoid doing too much for spurious notifies"
  Revert "select: use freezable blocking call"
  Revert "epoll: use freezable blocking call"
Russell King [Thu, 31 Oct 2013 15:01:37 +0000 (15:01 +0000)]
 
ALSA: fix oops in snd_pcm_info() caused by ASoC DPCM
Unable to handle kernel NULL pointer dereference at virtual address 
00000008
pgd = 
d5300000
[
00000008] *pgd=
0d265831, *pte=
00000000, *ppte=
00000000
Internal error: Oops: 17 [#1] PREEMPT ARM
CPU: 0 PID: 2295 Comm: vlc Not tainted 3.11.0+ #755
task: 
dee74800 ti: 
e213c000 task.ti: 
e213c000
PC is at snd_pcm_info+0xc8/0xd8
LR is at 0x30232065
pc : [<
c031b52c>]    lr : [<
30232065>]    psr: 
a0070013
sp : 
e213dea8  ip : 
d81cb0d0  fp : 
c05f7678
r10: 
c05f7770  r9 : 
fffffdfd  r8 : 
00000000
r7 : 
d8a968a8  r6 : 
d8a96800  r5 : 
d8a96200  r4 : 
d81cb000
r3 : 
00000000  r2 : 
d81cb000  r1 : 
00000001  r0 : 
d8a96200
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 
10c5387d  Table: 
15300019  DAC: 
00000015
Process vlc (pid: 2295, stack limit = 0xe213c248)
[<
c031b52c>] (snd_pcm_info) from [<
c031b570>] (snd_pcm_info_user+0x34/0x9c)
[<
c031b570>] (snd_pcm_info_user) from [<
c03164a4>] (snd_pcm_control_ioctl+0x274/0x280)
[<
c03164a4>] (snd_pcm_control_ioctl) from [<
c0311458>] (snd_ctl_ioctl+0xc0/0x55c)
[<
c0311458>] (snd_ctl_ioctl) from [<
c00eca84>] (do_vfs_ioctl+0x80/0x31c)
[<
c00eca84>] (do_vfs_ioctl) from [<
c00ecd5c>] (SyS_ioctl+0x3c/0x60)
[<
c00ecd5c>] (SyS_ioctl) from [<
c000e500>] (ret_fast_syscall+0x0/0x48)
Code: 
e1a00005 e59530dc e3a01001 e1a02004 (
e5933008)
---[ end trace 
cb3d9bdb8dfefb3c ]---
This is provoked when the ASoC front end is open along with its backend,
(which causes the backend to have a runtime assigned to it) and then the
SNDRV_CTL_IOCTL_PCM_INFO is requested for the (visible) backend device.
Resolve this by ensuring that ASoC internal backend devices are not
visible to userspace, just as the commentry for snd_pcm_new_internal()
says it should be.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Mark Brown <broonie@linaro.org>
Cc: <stable@vger.kernel.org> [v3.4+]
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Wei Yongjun [Thu, 31 Oct 2013 07:51:38 +0000 (15:51 +0800)]
 
MIPS: ralink: fix return value check in rt_timer_probe()
In case of error, the function devm_request_and_ioremap() returns NULL
pointer not ERR_PTR(). Fix it by using devm_ioremap_resource() instead
of devm_request_and_ioremap().
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: John Crispin <blogic@openwrt.org>
Cc: grant.likely@linaro.org
Cc: rob.herring@calxeda.com
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/6098/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Oleg Nesterov [Sat, 19 Oct 2013 16:18:28 +0000 (18:18 +0200)]
 
hung_task debugging: Add tracepoint to report the hang
Currently check_hung_task() prints a warning if it detects the
problem, but it is not convenient to watch the system logs if
user-space wants to be notified about the hang.
Add the new trace_sched_process_hang() into check_hung_task(),
this way a user-space monitor can easily wait for the hang and
potentially resolve a problem.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Dave Sullivan <dsulliva@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20131019161828.GA7439@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Yunkang Tang [Thu, 31 Oct 2013 07:55:58 +0000 (00:55 -0700)]
 
Input: ALPS - add support for model found on Dell XT2
This patch adds support for touchpad found on Dell XT2. It's a dual device
with device ID: 73, 00, 14, that comply with "ALPS_PROTO_V2".
Signed-off-by: Yunkang Tang <yunkang.tang@cn.alps.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Dave Airlie [Thu, 31 Oct 2013 05:29:10 +0000 (15:29 +1000)]
 
Merge branch 'drm-fixes-3.12' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
Just a few small fixes for radeon (audio regression fix,
stability fix, and an endian bug noticed by coverity).
* 'drm-fixes-3.12' of git://people.freedesktop.org/~agd5f/linux:
  drm/radeon/dpm: fix incompatible casting on big endian
  drm/radeon: disable bapm on KB
  drm/radeon: use sw CTS/N values for audio on DCE4+
Linus Torvalds [Wed, 30 Oct 2013 21:27:10 +0000 (14:27 -0700)]
 
Merge branch 'akpm' (fixes from Andrew Morton)
Merge three fixes from Andrew Morton.
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  memcg: use __this_cpu_sub() to dec stats to avoid incorrect subtrahend casting
  percpu: fix this_cpu_sub() subtrahend casting for unsigneds
  mm/pagewalk.c: fix walk_page_range() access of wrong PTEs
Greg Thelen [Wed, 30 Oct 2013 20:56:21 +0000 (13:56 -0700)]
 
memcg: use __this_cpu_sub() to dec stats to avoid incorrect subtrahend casting
As of commit 
3ea67d06e467 ("memcg: add per cgroup writeback pages
accounting") memcg counter errors are possible when moving charged
memory to a different memcg.  Charge movement occurs when processing
writes to memory.force_empty, moving tasks to a memcg with
memcg.move_charge_at_immigrate=1, or memcg deletion.
An example showing error after memory.force_empty:
  $ cd /sys/fs/cgroup/memory
  $ mkdir x
  $ rm /data/tmp/file
  $ (echo $BASHPID >> x/tasks && exec mmap_writer /data/tmp/file 1M) &
  [1] 13600
  $ grep ^mapped x/memory.stat
  mapped_file 
1048576
  $ echo 13600 > tasks
  $ echo 1 > x/memory.force_empty
  $ grep ^mapped x/memory.stat
  mapped_file 
4503599627370496
mapped_file should end with 0.
  
4503599627370496 == 0x10,0000,0000,0000 == 0x100,0000,0000 pages
  
1048576          == 0x10,0000           == 0x100 pages
This issue only affects the source memcg on 64 bit machines; the
destination memcg counters are correct.  So the rmdir case is not too
important because such counters are soon disappearing with the entire
memcg.  But the memcg.force_empty and memory.move_charge_at_immigrate=1
cases are larger problems as the bogus counters are visible for the
(possibly long) remaining life of the source memcg.
The problem is due to memcg use of __this_cpu_from(.., -nr_pages), which
is subtly wrong because it subtracts the unsigned int nr_pages (either
-1 or -512 for THP) from a signed long percpu counter.  When
nr_pages=-1, -nr_pages=0xffffffff.  On 64 bit machines stat->count[idx]
is signed 64 bit.  So memcg's attempt to simply decrement a count (e.g.
from 1 to 0) boils down to:
  long count = 1
  unsigned int nr_pages = 1
  count += -nr_pages  /* -nr_pages == 0xffff,ffff */
  count is now 0x1,0000,0000 instead of 0
The fix is to subtract the unsigned page count rather than adding its
negation.  This only works once "percpu: fix this_cpu_sub() subtrahend
casting for unsigneds" is applied to fix this_cpu_sub().
Signed-off-by: Greg Thelen <gthelen@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Greg Thelen [Wed, 30 Oct 2013 20:56:20 +0000 (13:56 -0700)]
 
percpu: fix this_cpu_sub() subtrahend casting for unsigneds
this_cpu_sub() is implemented as negation and addition.
This patch casts the adjustment to the counter type before negation to
sign extend the adjustment.  This helps in cases where the counter type
is wider than an unsigned adjustment.  An alternative to this patch is
to declare such operations unsupported, but it seemed useful to avoid
surprises.
This patch specifically helps the following example:
  unsigned int delta = 1
  preempt_disable()
  this_cpu_write(long_counter, 0)
  this_cpu_sub(long_counter, delta)
  preempt_enable()
Before this change long_counter on a 64 bit machine ends with value
0xffffffff, rather than 0xffffffffffffffff.  This is because
this_cpu_sub(pcp, delta) boils down to this_cpu_add(pcp, -delta),
which is basically:
  long_counter = 0 + 0xffffffff
Also apply the same cast to:
  __this_cpu_sub()
  __this_cpu_sub_return()
  this_cpu_sub_return()
All percpu_test.ko passes, especially the following cases which
previously failed:
  l -= ui_one;
  __this_cpu_sub(long_counter, ui_one);
  CHECK(l, long_counter, -1);
  l -= ui_one;
  this_cpu_sub(long_counter, ui_one);
  CHECK(l, long_counter, -1);
  CHECK(l, long_counter, 0xffffffffffffffff);
  ul -= ui_one;
  __this_cpu_sub(ulong_counter, ui_one);
  CHECK(ul, ulong_counter, -1);
  CHECK(ul, ulong_counter, 0xffffffffffffffff);
  ul = this_cpu_sub_return(ulong_counter, ui_one);
  CHECK(ul, ulong_counter, 2);
  ul = __this_cpu_sub_return(ulong_counter, ui_one);
  CHECK(ul, ulong_counter, 1);
Signed-off-by: Greg Thelen <gthelen@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chen LinX [Wed, 30 Oct 2013 20:56:18 +0000 (13:56 -0700)]
 
mm/pagewalk.c: fix walk_page_range() access of wrong PTEs
When walk_page_range walk a memory map's page tables, it'll skip
VM_PFNMAP area, then variable 'next' will to assign to vma->vm_end, it
maybe larger than 'end'.  In next loop, 'addr' will be larger than
'next'.  Then in /proc/XXXX/pagemap file reading procedure, the 'addr'
will growing forever in pagemap_pte_range, pte_to_pagemap_entry will
access the wrong pte.
  BUG: Bad page map in process procrank  pte:
8437526f pmd:
785de067
  addr:
9108d000 vm_flags:
00200073 anon_vma:
f0d99020 mapping:  (null) index:9108d
  CPU: 1 PID: 4974 Comm: procrank Tainted: G    B   W  O 3.10.1+ #1
  Call Trace:
    dump_stack+0x16/0x18
    print_bad_pte+0x114/0x1b0
    vm_normal_page+0x56/0x60
    pagemap_pte_range+0x17a/0x1d0
    walk_page_range+0x19e/0x2c0
    pagemap_read+0x16e/0x200
    vfs_read+0x84/0x150
    SyS_read+0x4a/0x80
    syscall_call+0x7/0xb
Signed-off-by: Liu ShuoX <shuox.liu@intel.com>
Signed-off-by: Chen LinX <linx.z.chen@intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: <stable@vger.kernel.org>	[3.10.x+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Russell King [Wed, 30 Oct 2013 14:16:16 +0000 (14:16 +0000)]
 
mm: list_lru: fix almost infinite loop causing effective livelock
I've seen a fair number of issues with kswapd and other processes
appearing to get stuck in v3.12-rc.  Using sysrq-p many times seems to
indicate that it gets stuck somewhere in list_lru_walk_node(), called
from prune_icache_sb() and super_cache_scan().
I never seem to be able to trigger a calltrace for functions above that
point.
So I decided to add the following to super_cache_scan():
    @@ -81,10 +81,14 @@ static unsigned long super_cache_scan(struct shrinker *shrink,
            inodes = list_lru_count_node(&sb->s_inode_lru, sc->nid);
            dentries = list_lru_count_node(&sb->s_dentry_lru, sc->nid);
            total_objects = dentries + inodes + fs_objects + 1;
    +printk("%s:%u: %s: dentries %lu inodes %lu total %lu\n", current->comm, current->pid, __func__, dentries, inodes, total_objects);
            /* proportion the scan between the caches */
            dentries = mult_frac(sc->nr_to_scan, dentries, total_objects);
            inodes = mult_frac(sc->nr_to_scan, inodes, total_objects);
    +printk("%s:%u: %s: dentries %lu inodes %lu\n", current->comm, current->pid, __func__, dentries, inodes);
    +BUG_ON(dentries == 0);
    +BUG_ON(inodes == 0);
            /*
             * prune the dcache first as the icache is pinned by it, then
    @@ -99,7 +103,7 @@ static unsigned long super_cache_scan(struct shrinker *shrink,
                    freed += sb->s_op->free_cached_objects(sb, fs_objects,
                                                           sc->nid);
            }
    -
    +printk("%s:%u: %s: dentries %lu inodes %lu freed %lu\n", current->comm, current->pid, __func__, dentries, inodes, freed);
            drop_super(sb);
            return freed;
     }
and shortly thereafter, having applied some pressure, I got this:
    update-apt-xapi:1616: super_cache_scan: dentries 25632 inodes 2 total 25635
    update-apt-xapi:1616: super_cache_scan: dentries 1023 inodes 0
    ------------[ cut here ]------------
    Kernel BUG at 
c0101994 [verbose debug info unavailable]
    Internal error: Oops - BUG: 0 [#3] SMP ARM
    Modules linked in: fuse rfcomm bnep bluetooth hid_cypress
    CPU: 0 PID: 1616 Comm: update-apt-xapi Tainted: G      D      3.12.0-rc7+ #154
    task: 
daea1200 ti: 
c3bf8000 task.ti: 
c3bf8000
    PC is at super_cache_scan+0x1c0/0x278
    LR is at trace_hardirqs_on+0x14/0x18
    Process update-apt-xapi (pid: 1616, stack limit = 0xc3bf8240)
    ...
    Backtrace:
      (super_cache_scan) from [<
c00cd69c>] (shrink_slab+0x254/0x4c8)
      (shrink_slab) from [<
c00d09a0>] (try_to_free_pages+0x3a0/0x5e0)
      (try_to_free_pages) from [<
c00c59cc>] (__alloc_pages_nodemask+0x5)
      (__alloc_pages_nodemask) from [<
c00e07c0>] (__pte_alloc+0x2c/0x13)
      (__pte_alloc) from [<
c00e3a70>] (handle_mm_fault+0x84c/0x914)
      (handle_mm_fault) from [<
c001a4cc>] (do_page_fault+0x1f0/0x3bc)
      (do_page_fault) from [<
c001a7b0>] (do_translation_fault+0xac/0xb8)
      (do_translation_fault) from [<
c000840c>] (do_DataAbort+0x38/0xa0)
      (do_DataAbort) from [<
c00133f8>] (__dabt_usr+0x38/0x40)
Notice that we had a very low number of inodes, which were reduced to
zero my mult_frac().
Now, prune_icache_sb() calls list_lru_walk_node() passing that number of
inodes (0) into that as the number of objects to scan:
    long prune_icache_sb(struct super_block *sb, unsigned long nr_to_scan,
                         int nid)
    {
            LIST_HEAD(freeable);
            long freed;
            freed = list_lru_walk_node(&sb->s_inode_lru, nid, inode_lru_isolate,
                                           &freeable, &nr_to_scan);
which does:
    unsigned long
    list_lru_walk_node(struct list_lru *lru, int nid, list_lru_walk_cb isolate,
                       void *cb_arg, unsigned long *nr_to_walk)
    {
            struct list_lru_node    *nlru = &lru->node[nid];
            struct list_head *item, *n;
            unsigned long isolated = 0;
            spin_lock(&nlru->lock);
    restart:
            list_for_each_safe(item, n, &nlru->list) {
                    enum lru_status ret;
                    /*
                     * decrement nr_to_walk first so that we don't livelock if we
                     * get stuck on large numbesr of LRU_RETRY items
                     */
                    if (--(*nr_to_walk) == 0)
                            break;
So, if *nr_to_walk was zero when this function was entered, that means
we're wanting to operate on (~0UL)+1 objects - which might as well be
infinite.
Clearly this is not correct behaviour.  If we think about the behaviour
of this function when *nr_to_walk is 1, then clearly it's wrong - we
decrement first and then test for zero - which results in us doing
nothing at all.  A post-decrement would give the desired behaviour -
we'd try to walk one object and one object only if *nr_to_walk were one.
It also gives the correct behaviour for zero - we exit at this point.
Fixes: 
5cedf721a7cd ("list_lru: fix broken LRU_RETRY behaviour")
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
[ Modified to make sure we never underflow the count: this function gets
  called in a loop, so the 0 -> ~0ul transition is dangerous  - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 30 Oct 2013 19:29:06 +0000 (12:29 -0700)]
 
Merge tag 'tty-3.12-rc8' of git://git./linux/kernel/git/gregkh/tty
Pull serial fixes from Greg KH:
 "Here are 3 tiny fixes that are needed for 3.12-final for some serial
  drivers.
  One of them is a revert of a broken patch, and two others are fixes
  for reported bugs.  All of these have been in linux-next for a while,
  I forgot I had not sent them to you yet, my fault"
(Actually, Greg, you _had_ sent two of the three, so this pulls in just
one actual new fix)
* tag 'tty-3.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  tty/serial: at91: fix uart/usart selection for older products
Linus Torvalds [Wed, 30 Oct 2013 19:27:12 +0000 (12:27 -0700)]
 
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
 "Mainly Intel regression fixes and quirks, along with a simple one
  liner to fix rendernodes ioctl access (off by default, but testers
  want to test it)"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm: allow DRM_IOCTL_VERSION on render-nodes
  drm/i915: Fix the PPT fdi lane bifurcate state handling on ivb
  drm/i915: No LVDS hardware on Intel D410PT and D425KT
  drm/i915/dp: workaround BIOS eDP bpp clamping issue
  drm/i915: Add HSW CRT output readout support
  drm/i915: Add support for pipe_bpp readout
Linus Torvalds [Wed, 30 Oct 2013 19:26:29 +0000 (12:26 -0700)]
 
Merge tag 'sound-3.12' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "A few small HD-audio regression fixes, mostly for stable kernels, too"
* tag 'sound-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda - Fix silent headphone on Thinkpads with 
AD1984A codec
  ALSA: hda - Add missing initial vmaster hook at build_controls callback
  ALSA: hda - Fix unbalanced runtime PM refcount after S3/S4
Linus Torvalds [Wed, 30 Oct 2013 19:25:15 +0000 (12:25 -0700)]
 
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
 "Fixes for the 3.12 debugfs problem - removing the duplicate directory
  name, and using a better the error code"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: use a more sensible error number when debugfs directory creation fails
  KVM: Fix modprobe failure for kvm_intel/kvm_amd
Dan Carpenter [Tue, 29 Oct 2013 20:01:43 +0000 (23:01 +0300)]
 
Staging: sb105x: info leak in mp_get_count()
The icount.reserved[] array isn't initialized so it leaks stack
information to userspace.
Reported-by: Nico Golde <nico@ngolde.de>
Reported-by: Fabian Yamaguchi <fabs@goesec.de>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dan Carpenter [Tue, 29 Oct 2013 20:01:11 +0000 (23:01 +0300)]
 
Staging: bcm: info leak in ioctl
The DevInfo.u32Reserved[] array isn't initialized so it leaks kernel
information to user space.
Reported-by: Nico Golde <nico@ngolde.de>
Reported-by: Fabian Yamaguchi <fabs@goesec.de>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dan Carpenter [Tue, 29 Oct 2013 20:00:15 +0000 (23:00 +0300)]
 
staging: wlags49_h2: buffer overflow setting station name
We need to check the length parameter before doing the memcpy().  I've
actually changed it to strlcpy() as well so that it's NUL terminated.
You need CAP_NET_ADMIN to trigger these so it's not the end of the
world.
Reported-by: Nico Golde <nico@ngolde.de>
Reported-by: Fabian Yamaguchi <fabs@goesec.de>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dan Carpenter [Tue, 29 Oct 2013 19:11:06 +0000 (22:11 +0300)]
 
aacraid: missing capable() check in compat ioctl
In commit 
d496f94d22d1 ('[SCSI] aacraid: fix security weakness') we
added a check on CAP_SYS_RAWIO to the ioctl.  The compat ioctls need the
check as well.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dan Carpenter [Tue, 29 Oct 2013 19:07:47 +0000 (22:07 +0300)]
 
staging: ozwpan: prevent overflow in oz_cdev_write()
We need to check "count" so we don't overflow the ei->data buffer.
Reported-by: Nico Golde <nico@ngolde.de>
Reported-by: Fabian Yamaguchi <fabs@goesec.de>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dan Carpenter [Tue, 29 Oct 2013 19:06:04 +0000 (22:06 +0300)]
 
uml: check length in exitcode_proc_write()
We don't cap the size of buffer from the user so we could write past the
end of the array here.  Only root can write to this file.
Reported-by: Nico Golde <nico@ngolde.de>
Reported-by: Fabian Yamaguchi <fabs@goesec.de>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Takashi Iwai [Wed, 30 Oct 2013 17:42:13 +0000 (18:42 +0100)]
 
Merge tag 'asoc-fix-v3.12-rc7' of git://git./linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v3.12
A few of the Coverity fixes from Takashi, one of which (the wm_hubs one)
is particularly noticable.
Mark Brown [Wed, 30 Oct 2013 17:11:55 +0000 (10:11 -0700)]
 
Merge remote-tracking branch 'asoc/fix/wm8994' into asoc-linus
Takashi Iwai [Wed, 30 Oct 2013 07:35:02 +0000 (08:35 +0100)]
 
ASoC: wm_hubs: Add missing break in hp_supply_event()
Spotted by coverity CID 115170.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Mark Brown <broonie@linaro.org>
Cc: stable@vger.kernel.org
Markos Chandras [Wed, 30 Oct 2013 14:27:48 +0000 (14:27 +0000)]
 
MIPS: malta: Fix GIC interrupt offsets
The GIC interrupt offsets are calculated based on the value of NR_CPUS.
However, this is wrong because NR_CPUS may or may not contain the real
number of the actual cpus present in the system. We fix that by using
the 'nr_cpu_ids' variable which contains the real number of cpus in
the system. Previously, an MT core (eg with 8 VPEs) will fail to boot if
NR_CPUS was > 8 with the following errors:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at kernel/irq/chip.c:670 __irq_set_handler+0x15c/0x164()
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W    
3.12.0-rc5-00087-gced5633 5
Stack : 
00000006 00000004 00000000 00000000 00000000 00000000 807a4f36 00000053
          807a0000 00000000 80173218 80565aa8 00000000 00000000 00000000 0000000
          00000000 00000000 00000000 00000000 00000000 00000000 00000000 0000000
          00000000 00000000 00000000 8054fd00 8054fd94 80500514 805657a7 8016eb4
          807a0000 80500514 00000000 00000000 80565aa8 8079a5d8 80565766 8054fd0
          ...
Call Trace:
[<
801098c0>] show_stack+0x64/0x7c
[<
8049c6b0>] dump_stack+0x64/0x84
[<
8012efc4>] warn_slowpath_common+0x84/0xb4
[<
8012f00c>] warn_slowpath_null+0x18/0x24
[<
80173218>] __irq_set_handler+0x15c/0x164
[<
80587cf4>] arch_init_ipiirq+0x2c/0x3c
[<
805880c8>] arch_init_irq+0x3c4/0x4bc
[<
80588e28>] init_IRQ+0x3c/0x50
[<
805847e8>] start_kernel+0x230/0x3d8
---[ end trace 
4eaa2a86a8e2da26 ]---
This is now fixed and the Malta board can boot with any NR_CPUS value
which also helps supporting more processors in a single kernel binary.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/6091/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Mika Westerberg [Wed, 30 Oct 2013 12:40:36 +0000 (14:40 +0200)]
 
Revert "ACPI / hotplug / PCI: Avoid doing too much for spurious notifies"
Commit 
2dc4128 (ACPI / hotplug / PCI: Avoid doing too much for
spurious notifies) changed the enable_slot() to check return value of
pci_scan_slot() and if it is zero return early from the function. It
means that there were no new devices in this particular slot.
However, if a device appeared deeper in the hierarchy the code now
ignores it causing things like Thunderbolt chaining fail to recognize
new devices.
The problem with Alex Williamson's machine was solved with commit
a47d8c8 (ACPI / hotplug / PCI: Avoid parent bus rescans on spurious
device checks) and hence we should be able to restore the original
functionality that we always rescan on bus check notification.
On a device check notification we still check what acpiphp_rescan_slot()
returns and on zero bail out early.
Fixes: 
2dc41281b1d1 (ACPI / hotplug / PCI: Avoid doing too much for spurious notifies)
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rafael J. Wysocki [Tue, 29 Oct 2013 22:43:08 +0000 (23:43 +0100)]
 
Revert "select: use freezable blocking call"
This reverts commit 
9745cdb36da8 (select: use freezable blocking call)
that triggers problems during resume from suspend to RAM on Paul Bolle's
32-bit x86 machines.  Paul says:
  Ever since I tried running (release candidates of) v3.11 on the two
  working i686s I still have lying around I ran into issues on resuming
  from suspend. Reverting 
9745cdb36da8 (select: use freezable blocking
  call) resolves those issues.
  Resuming from suspend on i686 on (release candidates of) v3.11 and
  later triggers issues like:
  traps: systemd[1] general protection ip:
b738e490 sp:
bf882fc0 error:0 in libc-2.16.so[
b731c000+1b0000]
  and
  traps: rtkit-daemon[552] general protection ip:
804d6e5 sp:
b6cb32f0 error:0 in rtkit-daemon[
8048000+d000]
  Once I hit the systemd error I can only get out of the mess that the
  system is at that point by power cycling it.
Since we are reverting another freezer-related change causing similar
problems to happen, this one should be reverted as well.
References: https://lkml.org/lkml/2013/10/29/583
Reported-by: Paul Bolle <pebolle@tiscali.nl>
Fixes: 
9745cdb36da8 (select: use freezable blocking call)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: 3.11+ <stable@vger.kernel.org> # 3.11+
Rafael J. Wysocki [Tue, 29 Oct 2013 12:12:56 +0000 (13:12 +0100)]
 
Revert "epoll: use freezable blocking call"
This reverts commit 
1c441e921201 (epoll: use freezable blocking call)
which is reported to cause user space memory corruption to happen
after suspend to RAM.
Since it appears to be extremely difficult to root cause this
problem, it is best to revert the offending commit and try to address
the original issue in a better way later.
References: https://bugzilla.kernel.org/show_bug.cgi?id=61781
Reported-by: Natrio <natrio@list.ru>
Reported-by: Jeff Pohlmeyer <yetanothergeek@gmail.com>
Bisected-by: Leo Wolf <jclw@ymail.com>
Fixes: 
1c441e921201 (epoll: use freezable blocking call)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: 3.11+ <stable@vger.kernel.org> # 3.11+
Takashi Iwai [Wed, 30 Oct 2013 11:29:40 +0000 (12:29 +0100)]
 
ALSA: hda - Add a fixup for ASUS N76VZ
ASUS N76VZ needs the same fixup as N56VZ for supporting the boost
speaker.
Bugzilla: https://bugzilla.novell.com/show_bug.cgi?id=846529
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Paolo Bonzini [Wed, 30 Oct 2013 11:12:13 +0000 (12:12 +0100)]
 
KVM: use a more sensible error number when debugfs directory creation fails
I don't know if this was due to cut and paste, or somebody was really
using a D20 to pick the error code for kvm_init_debugfs as suggested by
Linus (EFAULT is 14, so the possibility cannot be entirely ruled out).
In any case, this patch fixes it.
Reported-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tim Gardner [Tue, 29 Oct 2013 15:13:54 +0000 (09:13 -0600)]
 
KVM: Fix modprobe failure for kvm_intel/kvm_amd
The x86 specific kvm init creates a new conflicting
debugfs directory which causes modprobe issues
with kvm_intel and kvm_amd. For example,
sudo modprobe kvm_amd
modprobe: ERROR: could not insert 'kvm_amd': Bad address
The simplest fix is to just rename the directory. The following
KVM config options are set:
CONFIG_KVM_GUEST=y
CONFIG_KVM_DEBUG_FS=y
CONFIG_HAVE_KVM=y
CONFIG_HAVE_KVM_IRQCHIP=y
CONFIG_HAVE_KVM_IRQ_ROUTING=y
CONFIG_HAVE_KVM_EVENTFD=y
CONFIG_KVM_APIC_ARCHITECTURE=y
CONFIG_KVM_MMIO=y
CONFIG_KVM_ASYNC_PF=y
CONFIG_HAVE_KVM_MSI=y
CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT=y
CONFIG_KVM=m
CONFIG_KVM_INTEL=m
CONFIG_KVM_AMD=m
CONFIG_KVM_DEVICE_ASSIGNMENT=y
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
[Change debugfs directory name. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
David Herrmann [Mon, 28 Oct 2013 09:55:49 +0000 (10:55 +0100)]
 
drm: allow DRM_IOCTL_VERSION on render-nodes
DRM_IOCTL_VERSION is a reliable way to get the driver-name and version
information. It's not related to the interface-version (SET_VERSION ioctl)
so we can safely enable it on render-nodes.
Note that gbm uses udev-BUSID to load the correct mesa driver. However,
the VERSION ioctl should be the more reliable way to do this (in case we
add new DRM-bus drivers which have no BUSID or similar).
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Wed, 30 Oct 2013 02:30:12 +0000 (12:30 +1000)]
 
Merge tag 'drm-intel-fixes-2013-10-29' of git://people.freedesktop.org/~danvet/drm-intel into drm-fixes
Regression and warn fixes for i915.
* tag 'drm-intel-fixes-2013-10-29' of git://people.freedesktop.org/~danvet/drm-intel:
  drm/i915: Fix the PPT fdi lane bifurcate state handling on ivb
  drm/i915: No LVDS hardware on Intel D410PT and D425KT
  drm/i915/dp: workaround BIOS eDP bpp clamping issue
  drm/i915: Add HSW CRT output readout support
  drm/i915: Add support for pipe_bpp readout
Deng-Cheng Zhu [Tue, 8 Oct 2013 15:17:48 +0000 (16:17 +0100)]
 
MIPS: Perf: Fix 74K cache map
According to Software User's Manual, the event of last-level-cache
read/write misses is mapped to even counters. Odd counters of that
event number count miss cycles.
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/6036/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Linus Torvalds [Tue, 29 Oct 2013 17:21:34 +0000 (10:21 -0700)]
 
Fix a few incorrectly checked [io_]remap_pfn_range() calls
Nico Golde reports a few straggling uses of [io_]remap_pfn_range() that
really should use the vm_iomap_memory() helper.  This trivially converts
two of them to the helper, and comments about why the third one really
needs to continue to use remap_pfn_range(), and adds the missing size
check.
Reported-by: Nico Golde <nico@ngolde.de>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org.
Linus Torvalds [Tue, 29 Oct 2013 15:36:50 +0000 (08:36 -0700)]
 
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf tooling fixes from Ingo Molnar:
 "This contains five tooling fixes:
   - fix a remaining mmap2 assumption which resulted in perf top output
     breakage
   - fix mmap ring-buffer processing bug that corrupts data
   - fix for a severe python scripting memory leak
   - fix broken (and user-visible) -g option handling
   - fix stdio output
  The diffstat size is larger than what we'd like to see this late :-/"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf tools: Fixup mmap event consumption
  perf top: Split -G and --call-graph
  perf record: Split -g and --call-graph
  perf hists: Add color overhead for stdio output buffer
  perf tools: Fix up /proc/PID/maps parsing
  perf script python: Fix mem leak due to missing Py_DECREFs on dict entries
Linus Torvalds [Tue, 29 Oct 2013 15:33:36 +0000 (08:33 -0700)]
 
Kconfig: make KOBJECT_RELEASE debugging require timer debugging
Without the timer debugging, the delayed kobject release will just
result in undebuggable oopses if it triggers any latent bugs.  That
doesn't actually help debugging at all.
So make DEBUG_KOBJECT_RELEASE depend on DEBUG_OBJECTS_TIMERS to avoid
having people enable one without the other.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Daniel Vetter [Tue, 29 Oct 2013 11:04:08 +0000 (12:04 +0100)]
 
drm/i915: Fix the PPT fdi lane bifurcate state handling on ivb
Originally I've thought that this is leftover hw state dirt from the
BIOS. But after way too much helpless flailing around on my part I've
noticed that the actual bug is when we change the state of an already
active pipe.
For example when we change the fdi lines from 2 to 3 without switching
off outputs in-between we'll never see the crucial on->off transition
in the ->modeset_global_resources hook the current logic relies on.
Patch version 2 got this right by instead also checking whether the
pipe is indeed active. But that in turn broke things when pipes have
been turned off through dpms since the bifurcate enabling is done in
the ->crtc_mode_set callback.
To address this issues discussed with Ville in the patch review move
the setting of the bifurcate bit into the ->crtc_enable hook. That way
we won't wreak havoc with this state when userspace puts all other
outputs into dpms off state. This also moves us forward with our
overall goal to unify the modeset and dpms on paths (which we need to
have to allow runtime pm in the dpms off state).
Unfortunately this requires us to move the bifurcate helpers around a
bit.
Also update the commit message, I've misanalyzed the bug rather badly.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70507
Tested-by: Jan-Michael Brummer <jan.brummer@tabos.org>
Cc: stable@vger.kernel.org
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Ben Segall [Wed, 16 Oct 2013 18:16:32 +0000 (11:16 -0700)]
 
sched: Avoid throttle_cfs_rq() racing with period_timer stopping
throttle_cfs_rq() doesn't check to make sure that period_timer is running,
and while update_curr/assign_cfs_runtime does, a concurrently running
period_timer on another cpu could cancel itself between this cpu's
update_curr and throttle_cfs_rq(). If there are no other cfs_rqs running
in the tg to restart the timer, this causes the cfs_rq to be stranded
forever.
Fix this by calling __start_cfs_bandwidth() in throttle if the timer is
inactive.
(Also add some sched_debug lines for cfs_bandwidth.)
Tested: make a run/sleep task in a cgroup, loop switching the cgroup
between 1ms/100ms quota and unlimited, checking for timer_active=0 and
throttled=1 as a failure. With the throttle_cfs_rq() change commented out
this fails, with the full patch it passes.
Signed-off-by: Ben Segall <bsegall@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: pjt@google.com
Link: http://lkml.kernel.org/r/20131016181632.22647.84174.stgit@sword-of-the-dawn.mtv.corp.google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Paul Turner [Wed, 16 Oct 2013 18:16:27 +0000 (11:16 -0700)]
 
sched: Guarantee new group-entities always have weight
Currently, group entity load-weights are initialized to zero. This
admits some races with respect to the first time they are re-weighted in
earlty use. ( Let g[x] denote the se for "g" on cpu "x". )
Suppose that we have root->a and that a enters a throttled state,
immediately followed by a[0]->t1 (the only task running on cpu[0])
blocking:
  put_prev_task(group_cfs_rq(a[0]), t1)
  put_prev_entity(..., t1)
  check_cfs_rq_runtime(group_cfs_rq(a[0]))
  throttle_cfs_rq(group_cfs_rq(a[0]))
Then, before unthrottling occurs, let a[0]->b[0]->t2 wake for the first
time:
  enqueue_task_fair(rq[0], t2)
  enqueue_entity(group_cfs_rq(b[0]), t2)
  enqueue_entity_load_avg(group_cfs_rq(b[0]), t2)
  account_entity_enqueue(group_cfs_ra(b[0]), t2)
  update_cfs_shares(group_cfs_rq(b[0]))
  < skipped because b is part of a throttled hierarchy >
  enqueue_entity(group_cfs_rq(a[0]), b[0])
  ...
We now have b[0] enqueued, yet group_cfs_rq(a[0])->load.weight == 0
which violates invariants in several code-paths. Eliminate the
possibility of this by initializing group entity weight.
Signed-off-by: Paul Turner <pjt@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131016181627.22647.47543.stgit@sword-of-the-dawn.mtv.corp.google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ben Segall [Wed, 16 Oct 2013 18:16:22 +0000 (11:16 -0700)]
 
sched: Fix hrtimer_cancel()/rq->lock deadlock
__start_cfs_bandwidth calls hrtimer_cancel while holding rq->lock,
waiting for the hrtimer to finish. However, if sched_cfs_period_timer
runs for another loop iteration, the hrtimer can attempt to take
rq->lock, resulting in deadlock.
Fix this by ensuring that cfs_b->timer_active is cleared only if the
_latest_ call to do_sched_cfs_period_timer is returning as idle. Then
__start_cfs_bandwidth can just call hrtimer_try_to_cancel and wait for
that to succeed or timer_active == 1.
Signed-off-by: Ben Segall <bsegall@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: pjt@google.com
Link: http://lkml.kernel.org/r/20131016181622.22647.16643.stgit@sword-of-the-dawn.mtv.corp.google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ben Segall [Wed, 16 Oct 2013 18:16:17 +0000 (11:16 -0700)]
 
sched: Fix cfs_bandwidth misuse of hrtimer_expires_remaining
hrtimer_expires_remaining does not take internal hrtimer locks and thus
must be guarded against concurrent __hrtimer_start_range_ns (but
returning HRTIMER_RESTART is safe). Use cfs_b->lock to make it safe.
Signed-off-by: Ben Segall <bsegall@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: pjt@google.com
Link: http://lkml.kernel.org/r/20131016181617.22647.73829.stgit@sword-of-the-dawn.mtv.corp.google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>