Pablo Neira Ayuso [Sat, 25 Oct 2014 10:25:06 +0000 (12:25 +0200)]
netfilter: nf_tables_bridge: update hook_mask to allow {pre,post}routing
Fixes:
36d2af5 ("netfilter: nf_tables: allow to filter from prerouting and postrouting")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Alex Gartrell [Mon, 6 Oct 2014 15:46:19 +0000 (08:46 -0700)]
ipvs: Avoid null-pointer deref in debug code
Use daddr instead of reaching into dest.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Gartrell <agartrell@fb.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Arturo Borrero [Sun, 26 Oct 2014 11:22:40 +0000 (12:22 +0100)]
netfilter: nft_compat: fix wrong target lookup in nft_target_select_ops()
The code looks for an already loaded target, and the correct list to search
is nft_target_list, not nft_match_list.
Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Houcheng Lin [Thu, 23 Oct 2014 08:36:08 +0000 (10:36 +0200)]
netfilter: nf_log: release skbuff on nlmsg put failure
The kernel should reserve enough room in the skb so that the DONE
message can always be appended. However, in case of e.g. new attribute
erronously not being size-accounted for, __nfulnl_send() will still
try to put next nlmsg into this full skbuf, causing the skb to be stuck
forever and blocking delivery of further messages.
Fix issue by releasing skb immediately after nlmsg_put error and
WARN() so we can track down the cause of such size mismatch.
[ fw@strlen.de: add tailroom/len info to WARN ]
Signed-off-by: Houcheng Lin <houcheng@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Thu, 23 Oct 2014 08:36:07 +0000 (10:36 +0200)]
netfilter: nfnetlink_log: fix maximum packet length logged to userspace
don't try to queue payloads > 0xffff - NLA_HDRLEN, it does not work.
The nla length includes the size of the nla struct, so anything larger
results in u16 integer overflow.
This patch is similar to
9cefbbc9c8f9abe (netfilter: nfnetlink_queue: cleanup copy_range usage).
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Thu, 23 Oct 2014 08:36:06 +0000 (10:36 +0200)]
netfilter: nf_log: account for size of NLMSG_DONE attribute
We currently neither account for the nlattr size, nor do we consider
the size of the trailing NLMSG_DONE when allocating nlmsg skb.
This can result in nflog to stop working, as __nfulnl_send() re-tries
sending forever if it failed to append NLMSG_DONE (which will never
work if buffer is not large enough).
Reported-by: Houcheng Lin <houcheng@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Herbert Xu [Sat, 4 Oct 2014 14:18:02 +0000 (22:18 +0800)]
bridge: Do not compile options in br_parse_ip_options
Commit
462fb2af9788a82a534f8184abfde31574e1cfa0
bridge : Sanitize skb before it enters the IP stack
broke when IP options are actually used because it mangles the
skb as if it entered the IP stack which is wrong because the
bridge is supposed to operate below the IP stack.
Since nobody has actually requested for parsing of IP options
this patch fixes it by simply reverting to the previous approach
of ignoring all IP options, i.e., zeroing the IPCB.
If and when somebody who uses IP options and actually needs them
to be parsed by the bridge complains then we can revisit this.
Reported-by: David Newall <davidn@davidnewall.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Sabrina Dubroca [Tue, 21 Oct 2014 09:08:21 +0000 (11:08 +0200)]
netfilter: nf_tables: check for NULL in nf_tables_newchain pcpu stats allocation
alloc_percpu returns NULL on failure, not a negative error code.
Fixes:
ff3cd7b3c922 ("netfilter: nf_tables: refactor chain statistic routines")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Dan Carpenter [Tue, 21 Oct 2014 08:28:12 +0000 (11:28 +0300)]
netfilter: ipset: off by one in ip_set_nfnl_get_byindex()
The ->ip_set_list[] array is initialized in ip_set_net_init() and it
has ->ip_set_max elements so this check should be >= instead of >
otherwise we are off by one.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Marcelo Leitner [Mon, 13 Oct 2014 16:09:28 +0000 (13:09 -0300)]
netfilter: nf_conntrack: allow server to become a client in TW handling
When a port that was used to listen for inbound connections gets closed
and reused for outgoing connections (like rsh ends up doing for stderr
flow), current we may reject the SYN/ACK packet for the new connection
because tcp_conntracks states forbirds a port to become a client while
there is still a TIME_WAIT entry in there for it.
As TCP may expire the TIME_WAIT socket in 60s and conntrack's timeout
for it is 120s, there is a ~60s window that the application can end up
opening a port that conntrack will end up blocking.
This patch fixes this by simply allowing such state transition: if we
see a SYN, in TIME_WAIT state, on REPLY direction, move it to sSS. Note
that the rest of the code already handles this situation, more
specificly in tcp_packet(), first switch clause.
Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Sabrina Dubroca [Tue, 21 Oct 2014 09:23:30 +0000 (11:23 +0200)]
net: sched: initialize bstats syncp
Use netdev_alloc_pcpu_stats to allocate percpu stats and initialize syncp.
Fixes:
22e0f8b9322c "net: sched: make bstats per cpu and estimator RCU safe"
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov [Mon, 20 Oct 2014 21:54:57 +0000 (14:54 -0700)]
bpf: fix bug in eBPF verifier
while comparing for verifier state equivalency the comparison
was missing a check for uninitialized register.
Make sure it does so and add a testcase.
Fixes:
f1bca824dabb ("bpf: add search pruning optimization to verifier")
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Graf [Tue, 21 Oct 2014 20:05:38 +0000 (22:05 +0200)]
netlink: Re-add locking to netlink_lookup() and seq walker
The synchronize_rcu() in netlink_release() introduces unacceptable
latency. Reintroduce minimal lookup so we can drop the
synchronize_rcu() until socket destruction has been RCUfied.
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: Steinar H. Gunderson <sgunderson@bigfoot.com>
Reported-and-tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ying Xue [Mon, 20 Oct 2014 06:46:35 +0000 (14:46 +0800)]
tipc: fix lockdep warning when intra-node messages are delivered
When running tipcTC&tipcTS test suite, below lockdep unsafe locking
scenario is reported:
[ 1109.997854]
[ 1109.997988] =================================
[ 1109.998290] [ INFO: inconsistent lock state ]
[ 1109.998575] 3.17.0-rc1+ #113 Not tainted
[ 1109.998762] ---------------------------------
[ 1109.998762] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[ 1109.998762] swapper/7/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
[ 1109.998762] (slock-AF_TIPC){+.?...}, at: [<
ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc]
[ 1109.998762] {SOFTIRQ-ON-W} state was registered at:
[ 1109.998762] [<
ffffffff810a4770>] __lock_acquire+0x6a0/0x1d80
[ 1109.998762] [<
ffffffff810a6555>] lock_acquire+0x95/0x1e0
[ 1109.998762] [<
ffffffff81a2d1ce>] _raw_spin_lock+0x3e/0x80
[ 1109.998762] [<
ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc]
[ 1109.998762] [<
ffffffffa0004fe8>] tipc_link_xmit+0xa8/0xc0 [tipc]
[ 1109.998762] [<
ffffffffa000ec6f>] tipc_sendmsg+0x15f/0x550 [tipc]
[ 1109.998762] [<
ffffffffa000f165>] tipc_connect+0x105/0x140 [tipc]
[ 1109.998762] [<
ffffffff817676ee>] SYSC_connect+0xae/0xc0
[ 1109.998762] [<
ffffffff81767b7e>] SyS_connect+0xe/0x10
[ 1109.998762] [<
ffffffff817a9788>] compat_SyS_socketcall+0xb8/0x200
[ 1109.998762] [<
ffffffff81a306e5>] sysenter_dispatch+0x7/0x1f
[ 1109.998762] irq event stamp: 241060
[ 1109.998762] hardirqs last enabled at (241060): [<
ffffffff8105a4ad>] __local_bh_enable_ip+0x6d/0xd0
[ 1109.998762] hardirqs last disabled at (241059): [<
ffffffff8105a46f>] __local_bh_enable_ip+0x2f/0xd0
[ 1109.998762] softirqs last enabled at (241020): [<
ffffffff81059a52>] _local_bh_enable+0x22/0x50
[ 1109.998762] softirqs last disabled at (241021): [<
ffffffff8105a626>] irq_exit+0x96/0xc0
[ 1109.998762]
[ 1109.998762] other info that might help us debug this:
[ 1109.998762] Possible unsafe locking scenario:
[ 1109.998762]
[ 1109.998762] CPU0
[ 1109.998762] ----
[ 1109.998762] lock(slock-AF_TIPC);
[ 1109.998762] <Interrupt>
[ 1109.998762] lock(slock-AF_TIPC);
[ 1109.998762]
[ 1109.998762] *** DEADLOCK ***
[ 1109.998762]
[ 1109.998762] 2 locks held by swapper/7/0:
[ 1109.998762] #0: (rcu_read_lock){......}, at: [<
ffffffff81782dc9>] __netif_receive_skb_core+0x69/0xb70
[ 1109.998762] #1: (rcu_read_lock){......}, at: [<
ffffffffa0001c90>] tipc_l2_rcv_msg+0x40/0x260 [tipc]
[ 1109.998762]
[ 1109.998762] stack backtrace:
[ 1109.998762] CPU: 7 PID: 0 Comm: swapper/7 Not tainted 3.17.0-rc1+ #113
[ 1109.998762] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[ 1109.998762]
ffffffff82745830 ffff880016c03828 ffffffff81a209eb 0000000000000007
[ 1109.998762]
ffff880017b3cac0 ffff880016c03888 ffffffff81a1c5ef 0000000000000001
[ 1109.998762]
ffff880000000001 ffff880000000000 ffffffff81012d4f 0000000000000000
[ 1109.998762] Call Trace:
[ 1109.998762] <IRQ> [<
ffffffff81a209eb>] dump_stack+0x4e/0x68
[ 1109.998762] [<
ffffffff81a1c5ef>] print_usage_bug+0x1f1/0x202
[ 1109.998762] [<
ffffffff81012d4f>] ? save_stack_trace+0x2f/0x50
[ 1109.998762] [<
ffffffff810a406c>] mark_lock+0x28c/0x2f0
[ 1109.998762] [<
ffffffff810a3440>] ? print_irq_inversion_bug.part.46+0x1f0/0x1f0
[ 1109.998762] [<
ffffffff810a467d>] __lock_acquire+0x5ad/0x1d80
[ 1109.998762] [<
ffffffff810a70dd>] ? trace_hardirqs_on+0xd/0x10
[ 1109.998762] [<
ffffffff8108ace8>] ? sched_clock_cpu+0x98/0xc0
[ 1109.998762] [<
ffffffff8108ad2b>] ? local_clock+0x1b/0x30
[ 1109.998762] [<
ffffffff810a10dc>] ? lock_release_holdtime.part.29+0x1c/0x1a0
[ 1109.998762] [<
ffffffff8108aa05>] ? sched_clock_local+0x25/0x90
[ 1109.998762] [<
ffffffffa000dec0>] ? tipc_sk_get+0x60/0x80 [tipc]
[ 1109.998762] [<
ffffffff810a6555>] lock_acquire+0x95/0x1e0
[ 1109.998762] [<
ffffffffa0011969>] ? tipc_sk_rcv+0x49/0x2b0 [tipc]
[ 1109.998762] [<
ffffffff810a6fb6>] ? trace_hardirqs_on_caller+0xa6/0x1c0
[ 1109.998762] [<
ffffffff81a2d1ce>] _raw_spin_lock+0x3e/0x80
[ 1109.998762] [<
ffffffffa0011969>] ? tipc_sk_rcv+0x49/0x2b0 [tipc]
[ 1109.998762] [<
ffffffffa000dec0>] ? tipc_sk_get+0x60/0x80 [tipc]
[ 1109.998762] [<
ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc]
[ 1109.998762] [<
ffffffffa00076bd>] tipc_rcv+0x5ed/0x960 [tipc]
[ 1109.998762] [<
ffffffffa0001d1c>] tipc_l2_rcv_msg+0xcc/0x260 [tipc]
[ 1109.998762] [<
ffffffffa0001c90>] ? tipc_l2_rcv_msg+0x40/0x260 [tipc]
[ 1109.998762] [<
ffffffff81783345>] __netif_receive_skb_core+0x5e5/0xb70
[ 1109.998762] [<
ffffffff81782dc9>] ? __netif_receive_skb_core+0x69/0xb70
[ 1109.998762] [<
ffffffff81784eb9>] ? dev_gro_receive+0x259/0x4e0
[ 1109.998762] [<
ffffffff817838f6>] __netif_receive_skb+0x26/0x70
[ 1109.998762] [<
ffffffff81783acd>] netif_receive_skb_internal+0x2d/0x1f0
[ 1109.998762] [<
ffffffff81785518>] napi_gro_receive+0xd8/0x240
[ 1109.998762] [<
ffffffff815bf854>] e1000_clean_rx_irq+0x2c4/0x530
[ 1109.998762] [<
ffffffff815c1a46>] e1000_clean+0x266/0x9c0
[ 1109.998762] [<
ffffffff8108ad2b>] ? local_clock+0x1b/0x30
[ 1109.998762] [<
ffffffff8108aa05>] ? sched_clock_local+0x25/0x90
[ 1109.998762] [<
ffffffff817842b1>] net_rx_action+0x141/0x310
[ 1109.998762] [<
ffffffff810bd710>] ? handle_fasteoi_irq+0xe0/0x150
[ 1109.998762] [<
ffffffff81059fa6>] __do_softirq+0x116/0x4d0
[ 1109.998762] [<
ffffffff8105a626>] irq_exit+0x96/0xc0
[ 1109.998762] [<
ffffffff81a30d07>] do_IRQ+0x67/0x110
[ 1109.998762] [<
ffffffff81a2ee2f>] common_interrupt+0x6f/0x6f
[ 1109.998762] <EOI> [<
ffffffff8100d2b7>] ? default_idle+0x37/0x250
[ 1109.998762] [<
ffffffff8100d2b5>] ? default_idle+0x35/0x250
[ 1109.998762] [<
ffffffff8100dd1f>] arch_cpu_idle+0xf/0x20
[ 1109.998762] [<
ffffffff810999fd>] cpu_startup_entry+0x27d/0x4d0
[ 1109.998762] [<
ffffffff81034c78>] start_secondary+0x188/0x1f0
When intra-node messages are delivered from one process to another
process, tipc_link_xmit() doesn't disable BH before it directly calls
tipc_sk_rcv() on process context to forward messages to destination
socket. Meanwhile, if messages delivered by remote node arrive at the
node and their destinations are also the same socket, tipc_sk_rcv()
running on process context might be preempted by tipc_sk_rcv() running
BH context. As a result, the latter cannot obtain the socket lock as
the lock was obtained by the former, however, the former has no chance
to be run as the latter is owning the CPU now, so headlock happens. To
avoid it, BH should be always disabled in tipc_sk_rcv().
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ying Xue [Mon, 20 Oct 2014 06:44:25 +0000 (14:44 +0800)]
tipc: fix a potential deadlock
Locking dependency detected below possible unsafe locking scenario:
CPU0 CPU1
T0: tipc_named_rcv() tipc_rcv()
T1: [grab nametble write lock]* [grab node lock]*
T2: tipc_update_nametbl() tipc_node_link_up()
T3: tipc_nodesub_subscribe() tipc_nametbl_publish()
T4: [grab node lock]* [grab nametble write lock]*
The opposite order of holding nametbl write lock and node lock on
above two different paths may result in a deadlock. If we move the
the updating of the name table after link state named out of node
lock, the reverse order of holding locks will be eliminated, and
as a result, the deadlock risk.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 21 Oct 2014 19:24:30 +0000 (15:24 -0400)]
Merge branch 'enic'
Govindarajulu Varadarajan says:
====================
enic: Bug fixes
This series fixes the following problem.
Please apply this to net.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Govindarajulu Varadarajan [Sun, 19 Oct 2014 08:50:28 +0000 (14:20 +0530)]
enic: Do not call napi_disable when preemption is disabled.
In enic_stop, we disable preemption using local_bh_disable(). We disable
preemption to wait for busy_poll to finish.
napi_disable should not be called here as it might sleep.
Moving napi_disable() call out side of local_bh_disable.
BUG: sleeping function called from invalid context at include/linux/netdevice.h:477
in_atomic(): 1, irqs_disabled(): 0, pid: 443, name: ifconfig
INFO: lockdep is turned off.
Preemption disabled at:[<
ffffffffa029c5c4>] enic_rfs_flw_tbl_free+0x34/0xd0 [enic]
CPU: 31 PID: 443 Comm: ifconfig Not tainted
3.17.0-netnext-05504-g59f35b8 #268
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
ffff8800dac10000 ffff88020b8dfcb8 ffffffff8148a57c 0000000000000000
ffff88020b8dfcd0 ffffffff8107e253 ffff8800dac12a40 ffff88020b8dfd10
ffffffffa029305b ffff88020b8dfd48 ffff8800dac10000 ffff88020b8dfd48
Call Trace:
[<
ffffffff8148a57c>] dump_stack+0x4e/0x7a
[<
ffffffff8107e253>] __might_sleep+0x123/0x1a0
[<
ffffffffa029305b>] enic_stop+0xdb/0x4d0 [enic]
[<
ffffffff8138ed7d>] __dev_close_many+0x9d/0xf0
[<
ffffffff8138ef81>] __dev_close+0x31/0x50
[<
ffffffff813974a8>] __dev_change_flags+0x98/0x160
[<
ffffffff81397594>] dev_change_flags+0x24/0x60
[<
ffffffff814085fd>] devinet_ioctl+0x63d/0x710
[<
ffffffff81139c16>] ? might_fault+0x56/0xc0
[<
ffffffff81409ef5>] inet_ioctl+0x65/0x90
[<
ffffffff813768e0>] sock_do_ioctl+0x20/0x50
[<
ffffffff81376ebb>] sock_ioctl+0x20b/0x2e0
[<
ffffffff81197250>] do_vfs_ioctl+0x2e0/0x500
[<
ffffffff81492619>] ? sysret_check+0x22/0x5d
[<
ffffffff81285f23>] ? __this_cpu_preempt_check+0x13/0x20
[<
ffffffff8109fe19>] ? trace_hardirqs_on_caller+0x119/0x270
[<
ffffffff811974ac>] SyS_ioctl+0x3c/0x80
[<
ffffffff814925ed>] system_call_fastpath+0x1a/0x1f
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Govindarajulu Varadarajan [Sun, 19 Oct 2014 08:50:27 +0000 (14:20 +0530)]
enic: fix possible deadlock in enic_stop/ enic_rfs_flw_tbl_free
The following warning is shown when spinlock debug is enabled.
This occurs when enic_flow_may_expire timer function is running and
enic_stop is called on same CPU.
Fix this by using spink_lock_bh().
=================================
[ INFO: inconsistent lock state ]
3.17.0-netnext-05504-g59f35b8 #268 Not tainted
---------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
ifconfig/443 [HC0[0]:SC0[0]:HE1:SE1] takes:
(&(&enic->rfs_h.lock)->rlock){+.?...}, at:
enic_rfs_flw_tbl_free+0x34/0xd0 [enic]
{IN-SOFTIRQ-W} state was registered at:
[<
ffffffff810a25af>] __lock_acquire+0x83f/0x21c0
[<
ffffffff810a45f2>] lock_acquire+0xa2/0xd0
[<
ffffffff814913fc>] _raw_spin_lock+0x3c/0x80
[<
ffffffffa029c3d5>] enic_flow_may_expire+0x25/0x130[enic]
[<
ffffffff810bcd07>] call_timer_fn+0x77/0x100
[<
ffffffff810bd8e3>] run_timer_softirq+0x1e3/0x270
[<
ffffffff8105f9ae>] __do_softirq+0x14e/0x280
[<
ffffffff8105fdae>] irq_exit+0x8e/0xb0
[<
ffffffff8103da0f>] smp_apic_timer_interrupt+0x3f/0x50
[<
ffffffff81493742>] apic_timer_interrupt+0x72/0x80
[<
ffffffff81018143>] default_idle+0x13/0x20
[<
ffffffff81018a6a>] arch_cpu_idle+0xa/0x10
[<
ffffffff81097676>] cpu_startup_entry+0x2c6/0x330
[<
ffffffff8103b7ad>] start_secondary+0x21d/0x290
irq event stamp: 2997
hardirqs last enabled at (2997): [<
ffffffff81491865>] _raw_spin_unlock_irqrestore+0x65/0x90
hardirqs last disabled at (2996): [<
ffffffff814915e6>] _raw_spin_lock_irqsave+0x26/0x90
softirqs last enabled at (2968): [<
ffffffff813b57a3>] dev_deactivate_many+0x213/0x260
softirqs last disabled at (2966): [<
ffffffff813b5783>] dev_deactivate_many+0x1f3/0x260
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&(&enic->rfs_h.lock)->rlock);
<Interrupt>
lock(&(&enic->rfs_h.lock)->rlock);
*** DEADLOCK ***
Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 20 Oct 2014 16:38:19 +0000 (12:38 -0400)]
Merge branch 'gso_encap_fixes'
Florian Westphal says:
====================
net: minor gso encapsulation fixes
The following series fixes a minor bug in the gso segmentation handlers
when encapsulation offload is used.
Theoretically this could cause kernel panic when the stack tries
to software-segment such a GRE offload packet, but it looks like there
is only one affected call site (tbf scheduler) and it handles NULL
return value.
I've included a followup patch to add IS_ERR_OR_NULL checks where needed.
While looking into this, I also found that size computation of the individual
segments is incorrect if skb->encapsulation is set.
Please see individual patches for delta vs. v1.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Mon, 20 Oct 2014 11:49:18 +0000 (13:49 +0200)]
net: core: handle encapsulation offloads when computing segment lengths
if ->encapsulation is set we have to use inner_tcp_hdrlen and add the
size of the inner network headers too.
This is 'mostly harmless'; tbf might send skb that is slightly over
quota or drop skb even if it would have fit.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Mon, 20 Oct 2014 11:49:17 +0000 (13:49 +0200)]
net: make skb_gso_segment error handling more robust
skb_gso_segment has three possible return values:
1. a pointer to the first segmented skb
2. an errno value (IS_ERR())
3. NULL. This can happen when GSO is used for header verification.
However, several callers currently test IS_ERR instead of IS_ERR_OR_NULL
and would oops when NULL is returned.
Note that these call sites should never actually see such a NULL return
value; all callers mask out the GSO bits in the feature argument.
However, there have been issues with some protocol handlers erronously not
respecting the specified feature mask in some cases.
It is preferable to get 'have to turn off hw offloading, else slow' reports
rather than 'kernel crashes'.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Mon, 20 Oct 2014 11:49:16 +0000 (13:49 +0200)]
net: gso: use feature flag argument in all protocol gso handlers
skb_gso_segment() has a 'features' argument representing offload features
available to the output path.
A few handlers, e.g. GRE, instead re-fetch the features of skb->dev and use
those instead of the provided ones when handing encapsulation/tunnels.
Depending on dev->hw_enc_features of the output device skb_gso_segment() can
then return NULL even when the caller has disabled all GSO feature bits,
as segmentation of inner header thinks device will take care of segmentation.
This e.g. affects the tbf scheduler, which will silently drop GRE-encap GSO skbs
that did not fit the remaining token quota as the segmentation does not work
when device supports corresponding hw offload capabilities.
Cc: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 20 Oct 2014 15:57:47 +0000 (11:57 -0400)]
Merge git://git./pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:
====================
netfilter fixes for net
The following patchset contains netfilter fixes for your net tree,
they are:
1) Fix missing MODULE_LICENSE() in the new nf_reject_ipv{4,6} modules.
2) Restrict nat and masq expressions to the nat chain type. Otherwise,
users may crash their kernel if they attach a nat/masq rule to a non
nat chain.
3) Fix hook validation in nft_compat when non-base chains are used.
Basically, initialize hook_mask to zero.
4) Make sure you use match/targets in nft_compat from the right chain
type. The existing validation relies on the table name which can be
avoided by
5) Better netlink attribute validation in nft_nat. This expression has
to reject the configuration when no address and proto configurations
are specified.
6) Interpret NFTA_NAT_REG_*_MAX if only if NFTA_NAT_REG_*_MIN is set.
Yet another sanity check to reject incorrect configurations from
userspace.
7) Conditional NAT attribute dumping depending on the existing
configuration.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ian Morgan [Sun, 19 Oct 2014 12:05:13 +0000 (08:05 -0400)]
ax88179_178a: fix bonding failure
The following patch fixes a bug which causes the ax88179_178a driver to be
incapable of being added to a bond.
When I brought up the issue with the bonding maintainers, they indicated
that the real problem was with the NIC driver which must return zero for
success (of setting the MAC address). I see that several other NIC drivers
follow that pattern by either simply always returing zero, or by passing
through a negative (error) result while rewriting any positive return code
to zero. With that same philisophy applied to the ax88179_178a driver, it
allows it to work correctly with the bonding driver.
I believe this is suitable for queuing in -stable, as it's a small, simple,
and obvious fix that corrects a defect with no other known workaround.
This patch is against vanilla 3.17(.0).
Signed-off-by: Ian Morgan <imorgan@primordial.ca>
drivers/net/usb/ax88179_178a.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 19 Oct 2014 19:58:22 +0000 (12:58 -0700)]
Merge tag 'ntb-3.18' of git://github.com/jonmason/ntb
Pull ntb (non-transparent bridge) updates from Jon Mason:
"Add support for Haswell NTB split BARs, a debugfs entry for basic
debugging info, and some code clean-ups"
* tag 'ntb-3.18' of git://github.com/jonmason/ntb:
ntb: Adding split BAR support for Haswell platforms
ntb: use errata flag set via DID to implement workaround
ntb: conslidate reading of PPD to move platform detection earlier
ntb: move platform detection to separate function
NTB: debugfs device entry
Linus Torvalds [Sun, 19 Oct 2014 19:50:44 +0000 (12:50 -0700)]
Merge branch 'i2c/for-next' of git://git./linux/kernel/git/wsa/linux
Pull i2c updates from Wolfram Sang:
"Highlights from the I2C subsystem for 3.18:
- new drivers for Axxia AM55xx, and Hisilicon hix5hd2 SoC.
- designware driver gained AMD support, exynos gained exynos7 support
The rest is usual driver stuff. Hopefully no lowlights this time"
* 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: i801: Add Device IDs for Intel Sunrise Point PCH
i2c: hix5hd2: add i2c controller driver
i2c-imx: Disable the clock on probe failure
i2c: designware: Add support for AMD I2C controller
i2c: designware: Rework probe() to get clock a bit later
i2c: designware: Default to fast mode in case of ACPI
i2c: axxia: Add I2C driver for AXM55xx
i2c: exynos: add support for HSI2C module on Exynos7
i2c: mxs: detect No Slave Ack on SELECT in PIO mode
i2c: cros_ec: Remove EC_I2C_FLAG_10BIT
i2c: cros-ec-tunnel: Add of match table
i2c: rcar: remove sign-compare flaw
i2c: ismt: Use minimum descriptor size
i2c: imx: Add arbitration lost check
i2c: rk3x: Remove unlikely() annotations
i2c: rcar: check for no IRQ in rcar_i2c_irq()
i2c: rcar: make rcar_i2c_prepare_msg() *void*
i2c: rcar: simplify check for last message
i2c: designware: add support of platform data to set I2C mode
i2c: designware: add support of I2C standard mode
Linus Torvalds [Sun, 19 Oct 2014 19:45:36 +0000 (12:45 -0700)]
Merge tag 'sound-fix-3.18-rc1' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Here are a collection of small fixes after 3.18 merge.
The urgent one is the fix for kernel panics with linked PCM substream
triggered by the recent nonatomic PCM ops support. Other two fixes
(emu10k1 and bebob) are stable fixes, and one easy PCI ID addition for
a new Intel HD-audio controller"
* tag 'sound-fix-3.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda_intel: Add Device IDs for Intel Sunrise Point PCH
ALSA: emu10k1: Fix deadlock in synth voice lookup
ALSA: pcm: Fix referred substream in snd_pcm_action_group() unlock loop
ALSA: bebob: Fix failure to detect source of clock for Terratec Phase 88
Linus Torvalds [Sun, 19 Oct 2014 19:40:24 +0000 (12:40 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input
Pull second round of input updates from Dmitry Torokhov:
"Mostly simple bug fixes, although we do have one brand new driver for
Microchip AR1021 i2c touchscreen.
Also there is the change to stop trying to use i8042 active
multiplexing by default (it is still possible to activate it via
i8042.nomux=0 on boxes that implement it)"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: xpad - add Thrustmaster as Xbox 360 controller vendor
Input: xpad - add USB ID for Thrustmaster Ferrari 458 Racing Wheel
Input: max77693-haptic - fix state check in imax77693_haptic_disable()
Input: xen-kbdfront - free grant table entry in xenkbd_disconnect_backend
Input: alps - fix v4 button press recognition
Input: i8042 - disable active multiplexing by default
Input: i8042 - add noloop quirk for Asus X750LN
Input: synaptics - gate forcepad support by DMI check
Input: Add Microchip AR1021 i2c touchscreen
Input: cros_ec_keyb - add of match table
Input: serio - avoid negative serio device numbers
Input: avoid negative input device numbers
Input: automatically set EV_ABS bit in input_set_abs_params
Input: adp5588-keys - cancel workqueue in failure path
Input: opencores-kbd - switch to using managed resources
Input: evdev - fix EVIOCG{type} ioctl
Linus Torvalds [Sun, 19 Oct 2014 19:29:23 +0000 (12:29 -0700)]
Merge tag 'rdma-for-linus' of git://git./linux/kernel/git/roland/infiniband
Pull infiniband/RDMA updates from Roland Dreier:
- large set of iSER initiator improvements
- hardware driver fixes for cxgb4, mlx5 and ocrdma
- small fixes to core midlayer
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (47 commits)
RDMA/cxgb4: Fix ntuple calculation for ipv6 and remove duplicate line
RDMA/cxgb4: Add missing neigh_release in find_route
RDMA/cxgb4: Take IPv6 into account for best_mtu and set_emss
RDMA/cxgb4: Make c4iw_wr_log_size_order static
IB/core: Fix XRC race condition in ib_uverbs_open_qp
IB/core: Clear AH attr variable to prevent garbage data
RDMA/ocrdma: Save the bit environment, spare unncessary parenthesis
RDMA/ocrdma: The kernel has a perfectly good BIT() macro - use it
RDMA/ocrdma: Don't memset() buffers we just allocated with kzalloc()
RDMA/ocrdma: Remove a unused-label warning
RDMA/ocrdma: Convert kernel VA to PA for mmap in user
RDMA/ocrdma: Get vlan tag from ib_qp_attrs
RDMA/ocrdma: Add default GID at index 0
IB/mlx5, iser, isert: Add Signature API additions
Target/iser: Centralize ib_sig_domain setting
IB/iser: Centralize ib_sig_domain settings
IB/mlx5: Use extended internal signature layout
IB/iser: Set IP_CSUM as default guard type
IB/iser: Remove redundant assignment
IB/mlx5: Use enumerations for PI copy mask
...
Linus Torvalds [Sun, 19 Oct 2014 18:55:41 +0000 (11:55 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull more perf updates from Ingo Molnar:
"A second (and last) round of late coming fixes and changes, almost all
of them in perf tooling:
User visible tooling changes:
- Add period data column and make it default in 'perf script' (Jiri
Olsa)
- Add a visual cue for toggle zeroing of samples in 'perf top'
(Taeung Song)
- Improve callchains when using libunwind (Namhyung Kim)
Tooling fixes and infrastructure changes:
- Fix for double free in 'perf stat' when using some specific invalid
command line combo (Yasser Shalabi)
- Fix off-by-one bugs in map->end handling (Stephane Eranian)
- Fix off-by-one bug in maps__find(), also related to map->end
handling (Namhyung Kim)
- Make struct symbol->end be the first addr after the symbol range,
to make it match the convention used for struct map->end. (Arnaldo
Carvalho de Melo)
- Fix perf_evlist__add_pollfd() error handling in 'perf kvm stat
live' (Jiri Olsa)
- Fix python test build by moving callchain_param to an object linked
into the python binding (Jiri Olsa)
- Document sysfs events/ interfaces (Cody P Schafer)
- Fix typos in perf/Documentation (Masanari Iida)
- Add missing 'struct option' forward declaration (Arnaldo Carvalho
de Melo)
- Add option to copy events when queuing for sorting across cpu
buffers and enable it for 'perf kvm stat live', to avoid having
events left in the queue pointing to the ring buffer be rewritten
in high volume sessions. (Alexander Yarygin, improving work done
by David Ahern):
- Do not include a struct hists per perf_evsel, untangling the
histogram code from perf_evsel, to pave the way for exporting a
minimalistic tools/lib/api/perf/ library usable by tools/perf and
initially by the rasd daemon being developed by Borislav Petkov,
Robert Richter and Jean Pihet. (Arnaldo Carvalho de Melo)
- Make perf_evlist__open(evlist, NULL, NULL), i.e. without cpu and
thread maps mean syswide monitoring, reducing the boilerplate for
tools that only want system wide mode. (Arnaldo Carvalho de Melo)
- Move exit stuff from perf_evsel__delete to perf_evsel__exit, delete
should be just a front end for exit + free (Arnaldo Carvalho de
Melo)
- Add support to new style format of kernel PMU event. (Kan Liang)
and other misc fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits)
perf script: Add period as a default output column
perf script: Add period data column
perf evsel: No need to drag util/cgroup.h
perf evlist: Add missing 'struct option' forward declaration
perf evsel: Move exit stuff from __delete to __exit
kprobes/x86: Remove stale ARCH_SUPPORTS_KPROBES_ON_FTRACE define
perf kvm stat live: Enable events copying
perf session: Add option to copy events when queueing
perf Documentation: Fix typos in perf/Documentation
perf trace: Use thread_{,_set}_priv helpers
perf kvm: Use thread_{,_set}_priv helpers
perf callchain: Create an address space per thread
perf report: Set callchain_param.record_mode for future use
perf evlist: Fix for double free in tools/perf stat
perf test: Add test case for pmu event new style format
perf tools: Add support to new style format of kernel PMU event
perf tools: Parse the pmu event prefix and suffix
Revert "perf tools: Default to cpu// for events v5"
perf Documentation: Remove Ruplicated docs for powerpc cpu specific events
perf Documentation: sysfs events/ interfaces
...
Linus Torvalds [Sun, 19 Oct 2014 18:46:09 +0000 (11:46 -0700)]
Merge git://git./linux/kernel/git/davem/sparc
Pull sparc fixes from David Miller:
"Here we have two bug fixes:
1) The current thread's fault_code is not setup properly upon entry to
do_sparc64_fault() in some paths, leading to spurious SIGBUS.
2) Don't use a zero length array at the end of thread_info on sparc64,
otherwise end_of_stack() isn't right"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Do not define thread fpregs save area as zero-length array.
sparc64: Fix corrupted thread fault code.
Linus Torvalds [Sun, 19 Oct 2014 18:41:57 +0000 (11:41 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
"A quick batch of bug fixes:
1) Fix build with IPV6 disabled, from Eric Dumazet.
2) Several more cases of caching SKB data pointers across calls to
pskb_may_pull(), thus referencing potentially free'd memory. From
Li RongQing.
3) DSA phy code tests operation presence improperly, instead of going:
if (x->ops->foo)
r = x->ops->foo(args);
it was going:
if (x->ops->foo(args))
r = x->ops->foo(args);
Fix from Andew Lunn"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
Net: DSA: Fix checking for get_phy_flags function
ipv6: fix a potential use after free in sit.c
ipv6: fix a potential use after free in ip6_offload.c
ipv4: fix a potential use after free in gre_offload.c
tcp: fix build error if IPv6 is not enabled
Andrew Lunn [Sun, 19 Oct 2014 14:41:47 +0000 (16:41 +0200)]
Net: DSA: Fix checking for get_phy_flags function
The check for the presence or not of the optional switch function
get_phy_flags() called the function, rather than checked to see if it
is a NULL pointer. This causes a derefernce of a NULL pointer on all
switch chips except the sf2, the only switch to implement this call.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Fixes:
6819563e646a ("net: dsa: allow switch drivers to specify phy_device::dev_flags")
Cc: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 19 Oct 2014 03:12:33 +0000 (23:12 -0400)]
sparc64: Do not define thread fpregs save area as zero-length array.
This breaks the stack end corruption detection facility.
What that facility does it write a magic value to "end_of_stack()"
and checking to see if it gets overwritten.
"end_of_stack()" is "task_thread_info(p) + 1", which for sparc64 is
the beginning of the FPU register save area.
So once the user uses the FPU, the magic value is overwritten and the
debug checks trigger.
Fix this by making the size explicit.
Due to the size we use for the fpsaved[], gsr[], and xfsr[] arrays we
are limited to 7 levels of FPU state saves. So each FPU register set
is 256 bytes, allocate 256 * 7 for the fpregs area.
Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 19 Oct 2014 03:03:09 +0000 (23:03 -0400)]
sparc64: Fix corrupted thread fault code.
Every path that ends up at do_sparc64_fault() must install a valid
FAULT_CODE_* bitmask in the per-thread fault code byte.
Two paths leading to the label winfix_trampoline (which expects the
FAULT_CODE_* mask in register %g4) were not doing so:
1) For pre-hypervisor TLB protection violation traps, if we took
the 'winfix_trampoline' path we wouldn't have %g4 initialized
with the FAULT_CODE_* value yet. Resulting in using the
TLB_TAG_ACCESS register address value instead.
2) In the TSB miss path, when we notice that we are going to use a
hugepage mapping, but we haven't allocated the hugepage TSB yet, we
still have to take the window fixup case into consideration and
in that particular path we leave %g4 not setup properly.
Errors on this sort were largely invisible previously, but after
commit
4ccb9272892c33ef1c19a783cfa87103b30c2784 ("sparc64: sun4v TLB
error power off events") we now have a fault_code mask bit
(FAULT_CODE_BAD_RA) that triggers due to this bug.
FAULT_CODE_BAD_RA triggers because this bit is set in TLB_TAG_ACCESS
(see #1 above) and thus we get seemingly random bus errors triggered
for user processes.
Fixes:
4ccb9272892c ("sparc64: sun4v TLB error power off events")
Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 19 Oct 2014 01:11:04 +0000 (18:11 -0700)]
Merge branch 'for-linus' of git://git.infradead.org/users/vkoul/slave-dma
Pull slave-dmaengine updates from Vinod Koul:
"For dmaengine contributions we have:
- designware cleanup by Andy
- my series moving device_control users to dmanegine_xxx APIs for
later removal of device_control API
- minor fixes spread over drivers mainly mv_xor, pl330, mmp, imx-sdma
etc"
* 'for-linus' of git://git.infradead.org/users/vkoul/slave-dma: (60 commits)
serial: atmel: add missing dmaengine header
dmaengine: remove FSLDMA_EXTERNAL_START
dmaengine: freescale: remove FSLDMA_EXTERNAL_START control method
carma-fpga: move to fsl_dma_external_start()
carma-fpga: use dmaengine_xxx() API
dmaengine: freescale: add and export fsl_dma_external_start()
dmaengine: add dmaengine_prep_dma_sg() helper
video: mx3fb: use dmaengine_terminate_all() API
serial: sh-sci: use dmaengine_terminate_all() API
net: ks8842: use dmaengine_terminate_all() API
mtd: sh_flctl: use dmaengine_terminate_all() API
mtd: fsmc_nand: use dmaengine_terminate_all() API
V4L2: mx3_camer: use dmaengine_pause() API
dmaengine: coh901318: use dmaengine_terminate_all() API
pata_arasan_cf: use dmaengine_terminate_all() API
dmaengine: edma: check for echan->edesc => NULL in edma_dma_pause()
dmaengine: dw: export probe()/remove() and Co to users
dmaengine: dw: enable and disable controller when needed
dmaengine: dw: always export dw_dma_{en,dis}able
dmaengine: dw: introduce dw_dma_on() helper
...
Linus Torvalds [Sun, 19 Oct 2014 01:03:02 +0000 (18:03 -0700)]
Merge tag 'fbdev-3.18' of git://git./linux/kernel/git/tomba/linux
Pull fbdev updates from Tomi Valkeinen:
- new 6x10 font
- various small fixes and cleanups
* tag 'fbdev-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux: (30 commits)
fonts: Add 6x10 font
videomode: provide dummy inline functions for !CONFIG_OF
video/atmel_lcdfb: Introduce regulator support
fbdev: sh_mobile_hdmi: Re-init regs before irq re-enable on resume
framebuffer: fix screen corruption when copying
framebuffer: fix border color
arm, fbdev, omap2, LLVMLinux: Remove nested function from omapfb
arm, fbdev, omap2, LLVMLinux: Remove nested function from omap2 dss
video: fbdev: valkyriefb.c: use container_of to resolve fb_info_valkyrie from fb_info
video: fbdev: pxafb.c: use container_of to resolve pxafb_info/layer from fb_info
video: fbdev: cyber2000fb.c: use container_of to resolve cfb_info from fb_info
video: fbdev: controlfb.c: use container_of to resolve fb_info_control from fb_info
video: fbdev: sa1100fb.c: use container_of to resolve sa1100fb_info from fb_info
video: fbdev: stifb.c: use container_of to resolve stifb_info from fb_info
video: fbdev: sis: sis_main.c: Cleaning up missing null-terminate in conjunction with strncpy
video: valkyriefb: Fix unused variable warning in set_valkyrie_clock()
video: fbdev: use %*ph specifier to dump small buffers
video: mx3fb: always enable BACKLIGHT_LCD_SUPPORT
video: fbdev: au1200fb: delete double assignment
video: fbdev: sis: delete double assignment
...
Linus Torvalds [Sat, 18 Oct 2014 21:32:31 +0000 (14:32 -0700)]
Merge tag 'kvm-arm-for-3.18-take-2' of git://git./linux/kernel/git/kvmarm/kvmarm
Pull second batch of changes for KVM/{arm,arm64} from Marc Zyngier:
"The most obvious thing is the sizeable MMU changes to support 48bit
VAs on arm64.
Summary:
- support for 48bit IPA and VA (EL2)
- a number of fixes for devices mapped into guests
- yet another VGIC fix for BE
- a fix for CPU hotplug
- a few compile fixes (disabled VGIC, strict mm checks)"
[ I'm pulling directly from Marc at the request of Paolo Bonzini, whose
backpack was stolen at Düsseldorf airport and will do new keys and
rebuild his web of trust. - Linus ]
* tag 'kvm-arm-for-3.18-take-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm:
arm/arm64: KVM: Fix BE accesses to GICv2 EISR and ELRSR regs
arm: kvm: STRICT_MM_TYPECHECKS fix for user_mem_abort
arm/arm64: KVM: Ensure memslots are within KVM_PHYS_SIZE
arm64: KVM: Implement 48 VA support for KVM EL2 and Stage-2
arm/arm64: KVM: map MMIO regions at creation time
arm64: kvm: define PAGE_S2_DEVICE as read-only by default
ARM: kvm: define PAGE_S2_DEVICE as read-only by default
arm/arm64: KVM: add 'writable' parameter to kvm_phys_addr_ioremap
arm/arm64: KVM: fix potential NULL dereference in user_mem_abort()
arm/arm64: KVM: use __GFP_ZERO not memset() to get zeroed pages
ARM: KVM: fix vgic-disabled build
arm: kvm: fix CPU hotplug
Linus Torvalds [Sat, 18 Oct 2014 21:24:36 +0000 (14:24 -0700)]
Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus
Pull MIPS updates from Ralf Baechle:
"This is the MIPS pull request for the next kernel:
- Zubair's patch series adds CMA support for MIPS. Doing so it also
touches ARM64 and x86.
- remove the last instance of IRQF_DISABLED from arch/mips
- updates to two of the MIPS defconfig files.
- cleanup of how cache coherency bits are handled on MIPS and
implement support for write-combining.
- platform upgrades for Alchemy
- move MIPS DTS files to arch/mips/boot/dts/"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (24 commits)
MIPS: ralink: remove deprecated IRQF_DISABLED
MIPS: pgtable.h: Implement the pgprot_writecombine function for MIPS
MIPS: cpu-probe: Set the write-combine CCA value on per core basis
MIPS: pgtable-bits: Define the CCA bit for WC writes on Ingenic cores
MIPS: pgtable-bits: Move the CCA bits out of the core's ifdef blocks
MIPS: DMA: Add cma support
x86: use generic dma-contiguous.h
arm64: use generic dma-contiguous.h
asm-generic: Add dma-contiguous.h
MIPS: BPF: Add new emit_long_instr macro
MIPS: ralink: Move device-trees to arch/mips/boot/dts/
MIPS: Netlogic: Move device-trees to arch/mips/boot/dts/
MIPS: sead3: Move device-trees to arch/mips/boot/dts/
MIPS: Lantiq: Move device-trees to arch/mips/boot/dts/
MIPS: Octeon: Move device-trees to arch/mips/boot/dts/
MIPS: Add support for building device-tree binaries
MIPS: Create common infrastructure for building built-in device-trees
MIPS: SEAD3: Enable DEVTMPFS
MIPS: SEAD3: Regenerate defconfigs
MIPS: Alchemy: DB1300: Add touch penirq support
...
Linus Torvalds [Sat, 18 Oct 2014 21:22:32 +0000 (14:22 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mpe/linux
Pull powerpc fix from Michael Ellerman:
"There was a bit of a misunderstanding between us and the ARM guys in
the device tree PCI code, which is breaking virtio on powerpc.
This is the minimal fix until we can sort it out properly"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux:
powerpc/pci: Fix IO space breakage after of_pci_range_to_resource() change
Linus Torvalds [Sat, 18 Oct 2014 20:39:19 +0000 (13:39 -0700)]
Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs/smb3 updates from Steve French:
"Improved SMB3 support (symlink and device emulation, and remapping by
default the 7 reserved posix characters) and a workaround for cifs
mounts to Mac (working around a commonly encountered Mac server bug)"
* 'for-linus' of git://git.samba.org/sfrench/cifs-2.6:
[CIFS] Remove obsolete comment
Check minimum response length on query_network_interface
Workaround Mac server problem
Remap reserved posix characters by default (part 3/3)
Allow conversion of characters in Mac remap range (part 2)
Allow conversion of characters in Mac remap range. Part 1
mfsymlinks support for SMB2.1/SMB3. Part 2 query symlink
Add mfsymlinks support for SMB2.1/SMB3. Part 1 create symlink
Allow mknod and mkfifo on SMB2/SMB3 mounts
add defines for two new file attributes
Linus Torvalds [Sat, 18 Oct 2014 20:37:19 +0000 (13:37 -0700)]
Merge tag 'dlm-3.18' of git://git./linux/kernel/git/teigland/linux-dlm
Pull dlm fix from David Teigland:
"This includes a single commit fixing a missing endian conversion"
* tag 'dlm-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
dlm: fix missing endian conversion of rcom_status flags
Linus Torvalds [Sat, 18 Oct 2014 20:32:17 +0000 (13:32 -0700)]
Merge branch 'for-linus-update' of git://git./linux/kernel/git/mason/linux-btrfs
Pull btrfs data corruption fix from Chris Mason:
"I'm testing a pull with more fixes, but wanted to get this one out so
Greg can pick it up.
The corruption isn't easy to hit, you have to do a readonly snapshot
and have orphans in the snapshot. But my review and testing missed
the bug. Filipe has added a better xfstest to cover it"
* 'for-linus-update' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Revert "Btrfs: race free update of commit root for ro snapshots"
Linus Torvalds [Sat, 18 Oct 2014 20:25:03 +0000 (13:25 -0700)]
Merge tag 'please-pull-pstore' of git://git./linux/kernel/git/aegl/linux
Pull pstore fix from Tony Luck:
"Ensure unique filenames in pstore"
* tag 'please-pull-pstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
pstore: Fix duplicate {console,ftrace}-efi entries
Linus Torvalds [Sat, 18 Oct 2014 19:54:46 +0000 (12:54 -0700)]
Merge git://git./linux/kernel/git/aia21/ntfs
Pull NTFS update from Anton Altaparmakov:
"Here is a small NTFS update notably implementing FIBMAP ioctl for NTFS
by adding the bmap address space operation. People seem to still want
FIBMAP"
* git://git.kernel.org/pub/scm/linux/kernel/git/aia21/ntfs:
NTFS: Bump version to 2.1.31.
NTFS: Add bmap address space operation needed for FIBMAP ioctl.
NTFS: Remove changelog from Documentation/filesystems/ntfs.txt.
NTFS: Split ntfs_aops into ntfs_normal_aops and ntfs_compressed_aops in preparation for them diverging.
Linus Torvalds [Sat, 18 Oct 2014 19:52:08 +0000 (12:52 -0700)]
Merge tag 'nfs-for-3.18-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
"Highlights include:
Stable fixes:
- fix an uninitialised pointer Oops in the writeback error path
- fix a bogus warning (and early exit from the loop) in nfs_generic_pgio()
Features:
- Add NFSv4.2 SEEK feature and client support for lseek(SEEK_HOLE/SEEK_DATA)
Other fixes:
- pnfs: replace broken pnfs_put_lseg_async
- Remove dead prototype for nfs4_insert_deviceid_node"
* tag 'nfs-for-3.18-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS: Fix a bogus warning in nfs_generic_pgio
NFS: Fix an uninitialised pointer Oops in the writeback error path
NFSv4.1/pnfs: replace broken pnfs_put_lseg_async
NFSv4: Remove dead prototype for nfs4_insert_deviceid_node()
NFS: Implement SEEK
Linus Torvalds [Sat, 18 Oct 2014 19:25:30 +0000 (12:25 -0700)]
Merge tag 'dm-3.18' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull device-mapper updates from Mike Snitzer:
"I rebased the DM tree ontop of linux-block.git's 'for-3.18/core' at
the beginning of October because DM core now depends on the newly
introduced bioset_create_nobvec() interface.
Summary:
- fix DM's long-standing excessive use of memory by leveraging the
new bioset_create_nobvec() interface when creating the DM's bioset
- fix a few bugs in dm-bufio and dm-log-userspace
- add DM core support for a DM multipath use-case that requires
loading DM tables that contain devices that have failed (by
allowing active and inactive DM tables to share dm_devs)
- add discard support to the DM raid target; like MD raid456 the user
must opt-in to raid456 discard support be specifying the
devices_handle_discard_safely=Y module param"
* tag 'dm-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm log userspace: fix memory leak in dm_ulog_tfr_init failure path
dm bufio: when done scanning return from __scan immediately
dm bufio: update last_accessed when relinking a buffer
dm raid: add discard support for RAID levels 4, 5 and 6
dm raid: add discard support for RAID levels 1 and 10
dm: allow active and inactive tables to share dm_devs
dm mpath: stop queueing IO when no valid paths exist
dm: use bioset_create_nobvec()
dm: remove nr_iovecs parameter from alloc_tio()
Linus Torvalds [Sat, 18 Oct 2014 19:12:45 +0000 (12:12 -0700)]
Merge branch 'for-3.18/drivers' of git://git.kernel.dk/linux-block
Pull block layer driver update from Jens Axboe:
"This is the block driver pull request for 3.18. Not a lot in there
this round, and nothing earth shattering.
- A round of drbd fixes from the linbit team, and an improvement in
asender performance.
- Removal of deprecated (and unused) IRQF_DISABLED flag in rsxx and
hd from Michael Opdenacker.
- Disable entropy collection from flash devices by default, from Mike
Snitzer.
- A small collection of xen blkfront/back fixes from Roger Pau Monné
and Vitaly Kuznetsov"
* 'for-3.18/drivers' of git://git.kernel.dk/linux-block:
block: disable entropy contributions for nonrot devices
xen, blkfront: factor out flush-related checks from do_blkif_request()
xen-blkback: fix leak on grant map error path
xen/blkback: unmap all persistent grants when frontend gets disconnected
rsxx: Remove deprecated IRQF_DISABLED
block: hd: remove deprecated IRQF_DISABLED
drbd: use RB_DECLARE_CALLBACKS() to define augment callbacks
drbd: compute the end before rb_insert_augmented()
drbd: Add missing newline in resync progress display in /proc/drbd
drbd: reduce lock contention in drbd_worker
drbd: Improve asender performance
drbd: Get rid of the WORK_PENDING macro
drbd: Get rid of the __no_warn and __cond_lock macros
drbd: Avoid inconsistent locking warning
drbd: Remove superfluous newline from "resync_extents" debugfs entry.
drbd: Use consistent names for all the bi_end_io callbacks
drbd: Use better variable names
Linus Torvalds [Sat, 18 Oct 2014 18:53:51 +0000 (11:53 -0700)]
Merge branch 'for-3.18/core' of git://git.kernel.dk/linux-block
Pull core block layer changes from Jens Axboe:
"This is the core block IO pull request for 3.18. Apart from the new
and improved flush machinery for blk-mq, this is all mostly bug fixes
and cleanups.
- blk-mq timeout updates and fixes from Christoph.
- Removal of REQ_END, also from Christoph. We pass it through the
->queue_rq() hook for blk-mq instead, freeing up one of the request
bits. The space was overly tight on 32-bit, so Martin also killed
REQ_KERNEL since it's no longer used.
- blk integrity updates and fixes from Martin and Gu Zheng.
- Update to the flush machinery for blk-mq from Ming Lei. Now we
have a per hardware context flush request, which both cleans up the
code should scale better for flush intensive workloads on blk-mq.
- Improve the error printing, from Rob Elliott.
- Backing device improvements and cleanups from Tejun.
- Fixup of a misplaced rq_complete() tracepoint from Hannes.
- Make blk_get_request() return error pointers, fixing up issues
where we NULL deref when a device goes bad or missing. From Joe
Lawrence.
- Prep work for drastically reducing the memory consumption of dm
devices from Junichi Nomura. This allows creating clone bio sets
without preallocating a lot of memory.
- Fix a blk-mq hang on certain combinations of queue depths and
hardware queues from me.
- Limit memory consumption for blk-mq devices for crash dump
scenarios and drivers that use crazy high depths (certain SCSI
shared tag setups). We now just use a single queue and limited
depth for that"
* 'for-3.18/core' of git://git.kernel.dk/linux-block: (58 commits)
block: Remove REQ_KERNEL
blk-mq: allocate cpumask on the home node
bio-integrity: remove the needless fail handle of bip_slab creating
block: include func name in __get_request prints
block: make blk_update_request print prefix match ratelimited prefix
blk-merge: don't compute bi_phys_segments from bi_vcnt for cloned bio
block: fix alignment_offset math that assumes io_min is a power-of-2
blk-mq: Make bt_clear_tag() easier to read
blk-mq: fix potential hang if rolling wakeup depth is too high
block: add bioset_create_nobvec()
block: use bio_clone_fast() in blk_rq_prep_clone()
block: misplaced rq_complete tracepoint
sd: Honor block layer integrity handling flags
block: Replace strnicmp with strncasecmp
block: Add T10 Protection Information functions
block: Don't merge requests if integrity flags differ
block: Integrity checksum flag
block: Relocate bio integrity flags
block: Add a disk flag to block integrity profile
block: Add prefix to block integrity profile flags
...
Linus Torvalds [Sat, 18 Oct 2014 18:48:03 +0000 (11:48 -0700)]
Merge tag 'for-linus-
20141015' of git://git.infradead.org/linux-mtd
Pull MTD update from Brian Norris:
"Sorry for delaying this a bit later than usual. There's one mild
regression from 3.16 that was noticed during the 3.17 cycle, and I
meant to send a fix for it along with this pull request. I'll
probably try to queue it up for a later pull request once I've had a
better look at it, hopefully by -rc2 at the latest.
Summary for this pull:
NAND
- Cleanup for Denali driver
- Atmel: add support for new page sizes
- Atmel: fix up 'raw' mode support
- Atmel: miscellaneous cleanups
- New timing mode helpers for non-ONFI NAND
- OMAP: allow driver to be (properly) built as a module
- bcm47xx: RESET support and other cleanups
SPI NOR
- Miscellaneous cleanups, to prepare framework for wider use (some
further work still pending)
- Compile-time configuration to select 4K vs. 64K support for flash
that support both (necessary for using UBIFS on some SPI NOR)
A few scattered code quality fixes, detected by Coverity
See the changesets for more"
* tag 'for-linus-
20141015' of git://git.infradead.org/linux-mtd: (59 commits)
mtd: nand: omap: Correct CONFIG_MTD_NAND_OMAP_BCH help message
mtd: nand: Force omap_elm to be built as a module if omap2_nand is a module
mtd: move support for struct flash_platform_data into m25p80
mtd: spi-nor: add Kconfig option to disable 4K sectors
mtd: nand: Move ELM driver and rename as omap_elm
nand: omap2: Replace pr_err with dev_err
nand: omap2: Remove horrible ifdefs to fix module probe
mtd: nand: add Hynix's H27UCG8T2ATR-BC to nand_ids table
mtd: nand: support ONFI timing mode retrieval for non-ONFI NANDs
mtd: physmap_of: Add non-obsolete map_rom probe
mtd: physmap_of: Fix ROM support via OF
MAINTAINERS: add l2-mtd.git, 'next' tree for MTD
mtd: denali: fix indents and other trivial things
mtd: denali: remove unnecessary parentheses
mtd: denali: remove another set-but-unused variable
mtd: denali: fix include guard and license block of denali.h
mtd: nand: don't break long print messages
mtd: bcm47xxnflash: replace some magic numbers
mtd: bcm47xxnflash: NAND_CMD_RESET support
mtd: bcm47xxnflash: add cmd_ctrl handler
...
Linus Torvalds [Sat, 18 Oct 2014 18:39:52 +0000 (11:39 -0700)]
Merge tag 'md/3.18' of git://neil.brown.name/md
Pull md updates from Neil Brown:
- a few minor bug fixes
- quite a lot of code tidy-up and simplification
- remove PRINT_RAID_DEBUG ioctl. I'm fairly sure it is unused, and it
isn't particularly useful.
* tag 'md/3.18' of git://neil.brown.name/md: (21 commits)
lib/raid6: Add log level to printks
md: move EXPORT_SYMBOL to after function in md.c
md: discard PRINT_RAID_DEBUG ioctl
md: remove MD_BUG()
md: clean up 'exit' labels in md_ioctl().
md: remove unnecessary test for MD_MAJOR in md_ioctl()
md: don't allow "-sync" to be set for device in an active array.
md: remove unwanted white space from md.c
md: don't start resync thread directly from md thread.
md: Just use RCU when checking for overlap between arrays.
md: avoid potential long delay under pers_lock
md: simplify export_array()
md: discard find_rdev_nr in favour of find_rdev_nr_rcu
md: use wait_event() to simplify md_super_wait()
md: be more relaxed about stopping an array which isn't started.
md/raid1: process_checks doesn't use its return value.
md/raid5: fix init_stripe() inconsistencies
md/raid10: another memory leak due to reshape.
md: use set_bit/clear_bit instead of shift/mask for bi_flags changes.
md/raid1: minor typos and reformatting.
...
Linus Torvalds [Sat, 18 Oct 2014 17:26:10 +0000 (10:26 -0700)]
Merge branch 'for-linus2' of git://git./linux/kernel/git/jmorris/linux-security
Pull selinux fix from James Morris:
"Fix for a list corruption bug in the SELinux code"
* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
selinux: fix inode security list corruption
Linus Torvalds [Sat, 18 Oct 2014 17:25:09 +0000 (10:25 -0700)]
Merge tag 'virtio-next-for-linus' of git://git./linux/kernel/git/rusty/linux
Pull virtio updates from Rusty Russell:
"One cc: stable commit, the rest are a series of minor cleanups which
have been sitting in MST's tree during my vacation. I changed a
function name and made one trivial change, then they spent two days in
linux-next"
* tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (25 commits)
virtio-rng: refactor probe error handling
virtio_scsi: drop scan callback
virtio_balloon: enable VQs early on restore
virtio_scsi: fix race on device removal
virito_scsi: use freezable WQ for events
virtio_net: enable VQs early on restore
virtio_console: enable VQs early on restore
virtio_scsi: enable VQs early on restore
virtio_blk: enable VQs early on restore
virtio_scsi: move kick event out from virtscsi_init
virtio_net: fix use after free on allocation failure
9p/trans_virtio: enable VQs early
virtio_console: enable VQs early
virtio_blk: enable VQs early
virtio_net: enable VQs early
virtio: add API to enable VQs early
virtio_net: minor cleanup
virtio-net: drop config_mutex
virtio_net: drop config_enable
virtio-blk: drop config_mutex
...
Linus Torvalds [Sat, 18 Oct 2014 17:24:26 +0000 (10:24 -0700)]
Merge tag 'modules-next-for-linus' of git://git./linux/kernel/git/rusty/linux
Pull module fix from Rusty Russell:
"A single panic fix for a rare race, stable CC'd"
* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
modules, lock around setting of MODULE_STATE_UNFORMED
Jonathan Corbet [Fri, 17 Oct 2014 12:59:26 +0000 (08:59 -0400)]
MAINTAINERS: Become the docs maintainer
It seems it's my turn to be the documentation maintainer for a bit. My
plan is to work to ensure that docs patches don't fall through the cracks;
I assume most changes will continue to flow through subsystem-specific
trees.
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andy Lutomirski [Wed, 8 Oct 2014 16:02:13 +0000 (09:02 -0700)]
x86,kvm,vmx: Preserve CR4 across VM entry
CR4 isn't constant; at least the TSD and PCE bits can vary.
TBH, treating CR0 and CR3 as constant scares me a bit, too, but it looks
like it's correct.
This adds a branch and a read from cr4 to each vm entry. Because it is
extremely likely that consecutive entries into the same vcpu will have
the same host cr4 value, this fixes up the vmcs instead of restoring cr4
after the fact. A subsequent patch will add a kernel-wide cr4 shadow,
reducing the overhead in the common case to just two memory reads and a
branch.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Li RongQing [Sat, 18 Oct 2014 09:33:38 +0000 (17:33 +0800)]
ipv6: fix a potential use after free in sit.c
pskb_may_pull() maybe change skb->data and make iph pointer oboslete,
fix it by geting ip header length directly.
Fixes:
ca15a078 (sit: generate icmpv6 error when receiving icmpv4 error)
Cc: Oussama Ghorbel <ghorbel@pivasoftware.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Sat, 18 Oct 2014 09:27:42 +0000 (17:27 +0800)]
ipv6: fix a potential use after free in ip6_offload.c
pskb_may_pull() maybe change skb->data and make opth pointer oboslete,
so set the opth again
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Sat, 18 Oct 2014 09:26:04 +0000 (17:26 +0800)]
ipv4: fix a potential use after free in gre_offload.c
pskb_may_pull() may change skb->data and make greh pointer oboslete;
so need to reassign greh;
but since first calling pskb_may_pull already ensured that skb->data
has enough space for greh, so move the reference of greh before second
calling pskb_may_pull(), to avoid reassign greh.
Fixes:
7a7ffbabf9("ipv4: fix tunneled VM traffic over hw VXLAN/GRE GSO NIC")
Cc: Wei-Chun Chao <weichunc@plumgrid.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sat, 18 Oct 2014 15:34:37 +0000 (08:34 -0700)]
tcp: fix build error if IPv6 is not enabled
$ make M=net/ipv4
CC net/ipv4/route.o
In file included from net/ipv4/route.c:102:0:
include/net/tcp.h: In function ‘tcp_v6_iif’:
include/net/tcp.h:738:32: error: ‘union <anonymous>’ has no member named ‘h6’
return TCP_SKB_CB(skb)->header.h6.iif;
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes:
870c3151382c ("ipv6: introduce tcp_v6_iif()")
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sat, 18 Oct 2014 16:31:37 +0000 (09:31 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Include fixes for netrom and dsa (Fabian Frederick and Florian
Fainelli)
2) Fix FIXED_PHY support in stmmac, from Giuseppe CAVALLARO.
3) Several SKB use after free fixes (vxlan, openvswitch, vxlan,
ip_tunnel, fou), from Li ROngQing.
4) fec driver PTP support fixes from Luwei Zhou and Nimrod Andy.
5) Use after free in virtio_net, from Michael S Tsirkin.
6) Fix flow mask handling for megaflows in openvswitch, from Pravin B
Shelar.
7) ISDN gigaset and capi bug fixes from Tilman Schmidt.
8) Fix route leak in ip_send_unicast_reply(), from Vasily Averin.
9) Fix two eBPF JIT bugs on x86, from Alexei Starovoitov.
10) TCP_SKB_CB() reorganization caused a few regressions, fixed by Cong
Wang and Eric Dumazet.
11) Don't overwrite end of SKB when parsing malformed sctp ASCONF
chunks, from Daniel Borkmann.
12) Don't call sock_kfree_s() with NULL pointers, this function also has
the side effect of adjusting the socket memory usage. From Cong Wang.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (90 commits)
bna: fix skb->truesize underestimation
net: dsa: add includes for ethtool and phy_fixed definitions
openvswitch: Set flow-key members.
netrom: use linux/uaccess.h
dsa: Fix conversion from host device to mii bus
tipc: fix bug in bundled buffer reception
ipv6: introduce tcp_v6_iif()
sfc: add support for skb->xmit_more
r8152: return -EBUSY for runtime suspend
ipv4: fix a potential use after free in fou.c
ipv4: fix a potential use after free in ip_tunnel_core.c
hyperv: Add handling of IP header with option field in netvsc_set_hash()
openvswitch: Create right mask with disabled megaflows
vxlan: fix a free after use
openvswitch: fix a use after free
ipv4: dst_entry leak in ip_send_unicast_reply()
ipv4: clean up cookie_v4_check()
ipv4: share tcp_v4_save_options() with cookie_v4_check()
ipv4: call __ip_options_echo() in cookie_v4_check()
atm: simplify lanai.c by using module_pci_driver
...
Linus Torvalds [Sat, 18 Oct 2014 16:30:41 +0000 (09:30 -0700)]
Merge git://git./linux/kernel/git/davem/sparc
Pull Sparc bugfix from David Miller:
"Sparc64 AES ctr mode bug fix"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Fix FPU register corruption with AES crypto offload.
Linus Torvalds [Sat, 18 Oct 2014 16:29:59 +0000 (09:29 -0700)]
Merge git://git./linux/kernel/git/davem/ide
Pull IDE cleanup from David Miller:
"One IDE driver cleanup"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide:
Drivers: ide: Remove typedef atiixp_ide_timing
Catalin Marinas [Fri, 17 Oct 2014 16:38:49 +0000 (17:38 +0100)]
futex: Ensure get_futex_key_refs() always implies a barrier
Commit
b0c29f79ecea (futexes: Avoid taking the hb->lock if there's
nothing to wake up) changes the futex code to avoid taking a lock when
there are no waiters. This code has been subsequently fixed in commit
11d4616bd07f (futex: revert back to the explicit waiter counting code).
Both the original commit and the fix-up rely on get_futex_key_refs() to
always imply a barrier.
However, for private futexes, none of the cases in the switch statement
of get_futex_key_refs() would be hit and the function completes without
a memory barrier as required before checking the "waiters" in
futex_wake() -> hb_waiters_pending(). The consequence is a race with a
thread waiting on a futex on another CPU, allowing the waker thread to
read "waiters == 0" while the waiter thread to have read "futex_val ==
locked" (in kernel).
Without this fix, the problem (user space deadlocks) can be seen with
Android bionic's mutex implementation on an arm64 multi-cluster system.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Matteo Franchin <Matteo.Franchin@arm.com>
Fixes:
b0c29f79ecea (futexes: Avoid taking the hb->lock if there's nothing to wake up)
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Tested-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: <stable@vger.kernel.org>
Cc: Darren Hart <dvhart@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pablo Neira Ayuso [Wed, 15 Oct 2014 22:24:14 +0000 (00:24 +0200)]
netfilter: nft_nat: dump attributes if they are set
Dump NFTA_NAT_REG_ADDR_MIN if this is non-zero. Same thing with
NFTA_NAT_REG_PROTO_MIN.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Wed, 15 Oct 2014 22:19:35 +0000 (00:19 +0200)]
netfilter: nft_nat: NFTA_NAT_REG_ADDR_MAX depends on NFTA_NAT_REG_ADDR_MIN
Interpret NFTA_NAT_REG_ADDR_MAX if NFTA_NAT_REG_ADDR_MIN is present,
otherwise, skip it. Same thing with NFTA_NAT_REG_PROTO_MAX.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Wed, 15 Oct 2014 22:16:57 +0000 (00:16 +0200)]
netfilter: nft_nat: insufficient attribute validation
We have to validate that we at least get an NFTA_NAT_REG_ADDR_MIN or
NFTA_NFT_REG_PROTO_MIN attribute. Reject the configuration if none
of them are present.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Tue, 14 Oct 2014 08:13:48 +0000 (10:13 +0200)]
netfilter: nft_compat: validate chain type in match/target
We have to validate the real chain type to ensure that matches/targets
are not used out from their scope (eg. MASQUERADE in nat chain type).
The existing validation relies on the table name, but this is not
sufficient since userspace can fool us by using the appropriate table
name with a different chain type.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Ingo Molnar [Sat, 18 Oct 2014 07:04:02 +0000 (09:04 +0200)]
Merge tag 'perf-core-for-mingo' of git://git./linux/kernel/git/acme/linux into perf/urgent
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible changes:
* Add period data column and make it default in 'perf script' (Jiri Olsa)
Infrastructure changes:
* Move exit stuff from perf_evsel__delete to perf_evsel__exit, delete
should be just a front end for exit + free (Arnaldo Carvalho de Melo)
* Add missing 'struct option' forward declaration (Arnaldo Carvalho de Melo)
* No need to drag util/cgroup.h into evsel.h (Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Eric Dumazet [Fri, 17 Oct 2014 19:45:55 +0000 (12:45 -0700)]
bna: fix skb->truesize underestimation
skb->truesize is not meant to be tracking amount of used bytes
in an skb, but amount of reserved/consumed bytes in memory.
For instance, if we use a single byte in last page fragment,
we have to account the full size of the fragment.
skb->truesize can be very different from skb->len, that has
a very specific safety purpose.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Rasesh Mody <rasesh.mody@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 17 Oct 2014 23:02:13 +0000 (16:02 -0700)]
net: dsa: add includes for ethtool and phy_fixed definitions
net/dsa/slave.c uses functions and structures declared in phy_fixed.h
but does not explicitely include it, while dsa.h needs structure
declarations for 'struct ethtool_wolinfo' and 'struct ethtool_eee', fix
those by including the correct header files.
Fixes:
ec9436baedb6 ("net: dsa: allow drivers to do link adjustment")
Fixes:
ce31b31c68e7 ("net: dsa: allow updating fixed PHY link information")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pravin B Shelar [Fri, 17 Oct 2014 20:56:31 +0000 (13:56 -0700)]
openvswitch: Set flow-key members.
This patch adds missing memset which are required to initialize
flow key member. For example for IP flow we need to initialize
ip.frag for all cases.
Found by inspection.
This bug is introduced by commit
0714812134d7dcadeb7ecfbfeb18788aa7e1eaac
("openvswitch: Eliminate memset() from flow_extract").
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fabian Frederick [Fri, 17 Oct 2014 20:00:22 +0000 (22:00 +0200)]
netrom: use linux/uaccess.h
replace asm/uaccess.h by linux/uaccess.h
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guenter Roeck [Fri, 17 Oct 2014 19:30:58 +0000 (12:30 -0700)]
dsa: Fix conversion from host device to mii bus
Commit
b4d2394d01bc ("dsa: Replace mii_bus with a generic host device")
replaces mii_bus with a generic host_dev, and introduces
dsa_host_dev_to_mii_bus() to support conversion from host_dev to mii_bus.
However, in some cases it uses to_mii_bus to perform that conversion.
Since host_dev is not the phy bus device but typically a platform device,
this fails and results in a crash with the affected drivers.
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<
ffffffff81781d35>] __mutex_lock_slowpath+0x75/0x100
PGD
406783067 PUD
406784067 PMD 0
Oops: 0002 [#1] SMP
...
Call Trace:
[<
ffffffff810a538b>] ? pick_next_task_fair+0x61b/0x880
[<
ffffffff81781de3>] mutex_lock+0x23/0x37
[<
ffffffff81533244>] mdiobus_read+0x34/0x60
[<
ffffffff8153b95a>] __mv88e6xxx_reg_read+0x8a/0xa0
[<
ffffffff8153b9bc>] mv88e6xxx_reg_read+0x4c/0xa0
Fixes:
b4d2394d01bc ("dsa: Replace mii_bus with a generic host device")
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Paul Maloy [Fri, 17 Oct 2014 19:25:28 +0000 (15:25 -0400)]
tipc: fix bug in bundled buffer reception
In commit
ec8a2e5621db2da24badb3969eda7fd359e1869f ("tipc: same receive
code path for connection protocol and data messages") we omitted the
the possiblilty that an arriving message extracted from a bundle buffer
may be a multicast message. Such messages need to be to be delivered to
the socket via a separate function, tipc_sk_mcast_rcv(). As a result,
small multicast messages arriving as members of a bundle buffer will be
silently dropped.
This commit corrects the error by considering this case in the function
tipc_link_bundle_rcv().
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 17 Oct 2014 16:17:20 +0000 (09:17 -0700)]
ipv6: introduce tcp_v6_iif()
Commit
971f10eca186 ("tcp: better TCP_SKB_CB layout to reduce cache line
misses") added a regression for SO_BINDTODEVICE on IPv6.
This is because we still use inet6_iif() which expects that IP6 control
block is still at the beginning of skb->cb[]
This patch adds tcp_v6_iif() helper and uses it where necessary.
Because __inet6_lookup_skb() is used by TCP and DCCP, we add an iif
parameter to it.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes:
971f10eca186 ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
Acked-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Fri, 17 Oct 2014 14:32:25 +0000 (15:32 +0100)]
sfc: add support for skb->xmit_more
Don't ring the doorbell, and don't do PIO. This will also prevent
TX Push, because there will be more than one buffer waiting when
the doorbell is rung.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
hayeswang [Fri, 17 Oct 2014 08:55:08 +0000 (16:55 +0800)]
r8152: return -EBUSY for runtime suspend
Remove calling cancel_delayed_work_sync() for runtime suspend,
because it would cause dead lock. Instead, return -EBUSY to
avoid the device enters suspending if the net is running and
the delayed work is pending or running. The delayed work would
try to wake up the device later, so the suspending is not
necessary.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Fri, 17 Oct 2014 08:53:47 +0000 (16:53 +0800)]
ipv4: fix a potential use after free in fou.c
pskb_may_pull() maybe change skb->data and make uh pointer oboslete,
so reload uh and guehdr
Fixes:
37dd0247 ("gue: Receive side for Generic UDP Encapsulation")
Cc: Tom Herbert <therbert@google.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Fri, 17 Oct 2014 08:53:23 +0000 (16:53 +0800)]
ipv4: fix a potential use after free in ip_tunnel_core.c
pskb_may_pull() maybe change skb->data and make eth pointer oboslete,
so set eth after pskb_may_pull()
Fixes:
3d7b46cd("ip_tunnel: push generic protocol handling to ip_tunnel module")
Cc: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Haiyang Zhang [Thu, 16 Oct 2014 21:47:58 +0000 (14:47 -0700)]
hyperv: Add handling of IP header with option field in netvsc_set_hash()
In case that the IP header has optional field at the end, this patch will
get the port numbers after that field, and compute the hash. The general
parser skb_flow_dissect() is used here.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steve French [Fri, 17 Oct 2014 22:17:12 +0000 (17:17 -0500)]
[CIFS] Remove obsolete comment
Signed-off-by: Steven French <smfrench@gmail.com>
Pravin B Shelar [Fri, 17 Oct 2014 04:55:45 +0000 (21:55 -0700)]
openvswitch: Create right mask with disabled megaflows
If megaflows are disabled, the userspace does not send the netlink attribute
OVS_FLOW_ATTR_MASK, and the kernel must create an exact match mask.
sw_flow_mask_set() sets every bytes (in 'range') of the mask to 0xff, even the
bytes that represent padding for struct sw_flow, or the bytes that represent
fields that may not be set during ovs_flow_extract().
This is a problem, because when we extract a flow from a packet,
we do not memset() anymore the struct sw_flow to 0.
This commit gets rid of sw_flow_mask_set() and introduces mask_set_nlattr(),
which operates on the netlink attributes rather than on the mask key. Using
this approach we are sure that only the bytes that the user provided in the
flow are matched.
Also, if the parse_flow_mask_nlattrs() for the mask ENCAP attribute fails, we
now return with an error.
This bug is introduced by commit
0714812134d7dcadeb7ecfbfeb18788aa7e1eaac
("openvswitch: Eliminate memset() from flow_extract").
Reported-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Fri, 17 Oct 2014 06:06:16 +0000 (14:06 +0800)]
vxlan: fix a free after use
pskb_may_pull maybe change skb->data and make eth pointer oboslete,
so eth needs to reload
Fixes:
91269e390d062 ("vxlan: using pskb_may_pull as early as possible")
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Fri, 17 Oct 2014 06:03:08 +0000 (14:03 +0800)]
openvswitch: fix a use after free
pskb_may_pull() called by arphdr_ok can change skb->data, so put the arp
setting after arphdr_ok to avoid the use the freed memory
Fixes:
0714812134d7d ("openvswitch: Eliminate memset() from flow_extract.")
Cc: Jesse Gross <jesse@nicira.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasily Averin [Wed, 15 Oct 2014 12:24:02 +0000 (16:24 +0400)]
ipv4: dst_entry leak in ip_send_unicast_reply()
ip_setup_cork() called inside ip_append_data() steals dst entry from rt to cork
and in case errors in __ip_append_data() nobody frees stolen dst entry
Fixes:
2e77d89b2fa8 ("net: avoid a pair of dst_hold()/dst_release() in ip_append_data()")
Signed-off-by: Vasily Averin <vvs@parallels.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Olsa [Mon, 25 Aug 2014 14:45:43 +0000 (16:45 +0200)]
perf script: Add period as a default output column
Adding period as a default output column in script command fo hardware,
software and raw events.
If PERF_SAMPLE_PERIOD sample type is defined in perf.data, following
will be displayed in perf script output:
$ perf script
ls 8034 57477.887209: 250000 task-clock:
ffffffff81361d72 memset ([kernel.kallsyms])
ls 8034 57477.887464: 250000 task-clock:
ffffffff816f6d92 _raw_spin_unlock_irqrestore ([kernel.kallsyms])
ls 8034 57477.887708: 250000 task-clock:
ffffffff811a94f0 do_munmap ([kernel.kallsyms])
ls 8034 57477.887959: 250000 task-clock:
34080916c6 get_next_seq (/usr/lib64/libc-2.17.so)
ls 8034 57477.888208: 250000 task-clock:
3408079230 _IO_doallocbuf (/usr/lib64/libc-2.17.so)
ls 8034 57477.888717: 250000 task-clock:
ffffffff814242c8 n_tty_write ([kernel.kallsyms])
ls 8034 57477.889285: 250000 task-clock:
3408076402 fwrite_unlocked (/usr/lib64/libc-2.17.so)
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: "Jen-Cheng(Tommy) Huang" <tommy24@gatech.edu>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1408977943-16594-10-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Mon, 25 Aug 2014 14:45:42 +0000 (16:45 +0200)]
perf script: Add period data column
Adding period data column to be displayed in perf script. It's possible
to get period values using -f option, like:
$ perf script -f comm,tid,time,period,ip,sym,dso
:26019 26019 52414.329088: 3707
ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
:26019 26019 52414.329088: 44
ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
:26019 26019 52414.329093: 1987
ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
:26019 26019 52414.329093: 6
ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
ls 26019 52414.329442: 537558
3407c0639c _dl_map_object_from_fd (/usr/lib64/ld-2.17.so)
ls 26019 52414.329442: 2099
3407c0639c _dl_map_object_from_fd (/usr/lib64/ld-2.17.so)
ls 26019 52414.330181:
1242100 34080917bb get_next_seq (/usr/lib64/libc-2.17.so)
ls 26019 52414.330181: 3774
34080917bb get_next_seq (/usr/lib64/libc-2.17.so)
ls 26019 52414.331427:
1083662 ffffffff810c7dc2 update_curr ([kernel.kallsyms])
ls 26019 52414.331427: 360
ffffffff810c7dc2 update_curr ([kernel.kallsyms])
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: "Jen-Cheng(Tommy) Huang" <tommy24@gatech.edu>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1408977943-16594-9-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cong Wang [Wed, 15 Oct 2014 21:33:22 +0000 (14:33 -0700)]
ipv4: clean up cookie_v4_check()
We can retrieve opt from skb, no need to pass it as a parameter.
And opt should always be non-NULL, no need to check.
Cc: Krzysztof Kolasa <kkolasa@winsoft.pl>
Cc: Eric Dumazet <edumazet@google.com>
Tested-by: Krzysztof Kolasa <kkolasa@winsoft.pl>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Wed, 15 Oct 2014 21:33:21 +0000 (14:33 -0700)]
ipv4: share tcp_v4_save_options() with cookie_v4_check()
cookie_v4_check() allocates ip_options_rcu in the same way
with tcp_v4_save_options(), we can just make it a helper function.
Cc: Krzysztof Kolasa <kkolasa@winsoft.pl>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Wed, 15 Oct 2014 21:33:20 +0000 (14:33 -0700)]
ipv4: call __ip_options_echo() in cookie_v4_check()
commit
971f10eca186cab238c49da ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
missed that cookie_v4_check() still calls ip_options_echo() which uses
IPCB(). It should use TCPCB() at TCP layer, so call __ip_options_echo()
instead.
Fixes: commit
971f10eca186cab238c49da ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
Cc: Krzysztof Kolasa <kkolasa@winsoft.pl>
Cc: Eric Dumazet <edumazet@google.com>
Reported-by: Krzysztof Kolasa <kkolasa@winsoft.pl>
Tested-by: Krzysztof Kolasa <kkolasa@winsoft.pl>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Opdenacker [Wed, 15 Oct 2014 07:45:50 +0000 (09:45 +0200)]
atm: simplify lanai.c by using module_pci_driver
This simplifies the lanai.c driver by using
the module_pci_driver() macro, at the expense
of losing only debugging messages.
Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnaldo Carvalho de Melo [Fri, 17 Oct 2014 15:17:40 +0000 (12:17 -0300)]
perf evsel: No need to drag util/cgroup.h
The only thing we need is a forward declaration for 'struct cgroup_sel',
that is inside 'struct perf_evsel'.
Include cgroup.h instead on the tools that support cgroups.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-b7kuymbgf0zxi5viyjjtu5hk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Fri, 17 Oct 2014 15:16:00 +0000 (12:16 -0300)]
perf evlist: Add missing 'struct option' forward declaration
It was being found, by chance, because evsel.h needlessly includes
util/cgroup.h, which will be sorted out in a following patch.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-xsvxr747wkkpg1ay9dramorr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Thu, 16 Oct 2014 16:25:01 +0000 (13:25 -0300)]
perf evsel: Move exit stuff from __delete to __exit
So that when an evsel is embedded into other struct it can free up
resources calling perf_evsel__exit().
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-n1w68pfe9m2vkhm4sqs8y1en@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Dave Jiang [Thu, 28 Aug 2014 20:53:23 +0000 (13:53 -0700)]
ntb: Adding split BAR support for Haswell platforms
On the Haswell platform, a split BAR option to allow creation of 2
32bit BARs (4 and 5) from the 64bit BAR 4. Adding support for this
new option.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Dave Jiang [Thu, 28 Aug 2014 20:53:18 +0000 (13:53 -0700)]
ntb: use errata flag set via DID to implement workaround
Instead of using a module parameter, we should detect the errata via
PCI DID and then set an appropriate flag. This will be used for additional
errata later on.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Dave Jiang [Thu, 28 Aug 2014 20:53:13 +0000 (13:53 -0700)]
ntb: conslidate reading of PPD to move platform detection earlier
To simplify some of the platform detection code. Move the platform detection
to a function to be called earlier.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Dave Jiang [Thu, 28 Aug 2014 20:53:07 +0000 (13:53 -0700)]
ntb: move platform detection to separate function
Move the platform detection function to separate functions to allow
easier maintenence.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Jon Mason [Mon, 7 Apr 2014 17:55:47 +0000 (10:55 -0700)]
NTB: debugfs device entry
Create a debugfs entry for the NTB device to log the basic device info,
as well as display the error count on a number of registers.
Signed-off-by: Jon Mason <jon.mason@intel.com>