r8169: official fix for CVE-2009-4537 (overlength frame DMAs)

Official patch to fix the r8169 frame length check error.

Based on this initial thread:
http://marc.info/?l=linux-netdev&m=126202972828626&w=1
This is the official patch to fix the frame length problems in the r8169
driver.  As noted in the previous thread, while this patch incurs a performance
hit on the driver, it's possible to improve performance dynamically by updating
the mtu and rx_copybreak values at runtime to return performance to what it was
for those NICs which are unaffected by the idiosyncrasy (if there are any).

Summary:

    A while back Eric submitted a patch for r8169 in which the proper
allocated frame size was written to RXMaxSize to prevent the NIC from DMAing too
much data.  This was done in commit fdd7b4c3302c93f6833e338903ea77245eb510b4.  A
long time prior to that however, Francois posted
126fa4b9ca5d9d7cb7d46f779ad3bd3631ca387c, which explicitly disabled the MaxSize
setting due to the fact that the hardware behaved in odd ways when overlong
frames were received on NICs supported by this driver.  This was mentioned in a
security conference recently:
http://events.ccc.de/congress/2009/Fahrplan//events/3596.en.html

It seems that if we can't enable frame size filtering, then, as Eric correctly
noticed, we can find ourselves DMA-ing too much data to a buffer, causing
corruption.  As a result it seems that we are forced to allocate a frame which
is ready to handle a maximally sized receive.

This obviously has performance issues with it, so to mitigate that issue, this
patch does two things:

1) Raises the copybreak value to the frame allocation size, which should force
appropriately sized packets to get allocated on rx, rather than a full new 16k
buffer.

2) This patch only disables frame filtering initially (i.e., during the NIC
open).  Changing the MTU results in ring buffer allocation of a size in relation
to the new mtu (along with a warning indicating that this is dangerous).

Because of item (2), individuals who can't cope with the performance hit (or can
otherwise filter frames to prevent the bug), or who have hardware they are sure
is unaffected by this issue, can manually lower the copybreak and reset the mtu
such that performance is restored easily.

Signed-off-by: Neil Horman <nhorman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: Don't drop cache route entry unless timer actually expired.

This is ipv6 variant of the commit 5e016cbf6.. ("ipv4: Don't drop
redirected route cache entry unless PTMU actually expired")
by Guenter Roeck <guenter.roeck@ericsson.com>.

Remove cache route entry in ipv6_negative_advice() only if
the timer is expired.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

tulip: Add missing parens.

As reported by Stephen Rothwell.

drivers/net/tulip/uli526x.c: In function 'uli526x_rx_packet':
drivers/net/tulip/uli526x.c:861: warning: assignment makes pointer from integer without a cast

Signed-off-by: David S. Miller <davem@davemloft.net>
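
The warning quoted in the tulip message above is typical of an assignment
combined with a comparison where the inner parentheses are missing: "!="
binds tighter than "=", so the variable receives the result of the comparison
instead of the call's return value. The sketch below reproduces the pattern
with plain integers and a hypothetical fetch_value() helper; it is not the
uli526x code.

```c
/* Hypothetical stand-in for an allocator-style call; returns a nonzero value. */
static int fetch_value(void)
{
    return 42;
}

/* Without inner parentheses: parsed as v = (fetch_value() != 0). */
static int assign_without_parens(void)
{
    int v;

    v = fetch_value() != 0;
    return v;               /* the comparison result, not the fetched value */
}

/* With inner parentheses: v gets the fetched value, which is then tested. */
static int assign_with_parens(void)
{
    int v;

    if ((v = fetch_value()) != 0)
        return v;
    return 0;
}
```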

r8169: fix broken register writes

This is quite similar to b39fe41f481d20c201012e4483e76c203802dda7
though said registers are not even documented as 64-bit registers
- as opposed to the initial TxDescStartAddress ones - but as single
bytes which must be combined into 32 bits at the MMIO read/write
level before being merged into a 64 bit logical entity.

Credits go to Ben Hutchings <ben@decadent.org.uk> for the MAR
registers (aka "multicast is broken for ages on ARM") and to
Timo Teräs <timo.teras@iki.fi> for the MAC registers.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
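
As a rough userspace illustration of the r8169 register fix above, the sketch
below combines six single-byte values into two 32-bit quantities, so that
each one could be written with a single 32-bit MMIO access. The struct and
function names are hypothetical, and the byte order (byte 0 in the least
significant position) is an assumption made for the example.

```c
#include <stdint.h>

/* Two 32-bit words as they would be handed to 32-bit register writes. */
struct mac_words {
    uint32_t low;   /* bytes 0..3, byte 0 in the least significant position */
    uint32_t high;  /* bytes 4..5, upper 16 bits left at zero */
};

/* Combine six single-byte register values into two 32-bit quantities. */
static struct mac_words mac_to_words(const uint8_t addr[6])
{
    struct mac_words w;

    w.low = (uint32_t)addr[0] |
            ((uint32_t)addr[1] << 8) |
            ((uint32_t)addr[2] << 16) |
            ((uint32_t)addr[3] << 24);
    w.high = (uint32_t)addr[4] |
             ((uint32_t)addr[5] << 8);
    return w;
}
```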

pcnet_cs: add new id

pcnet_cs:
* add new id (Allied Telesis LM33-PCM-T Lan&Modem multifunction card)
* use PROD_ID for LA-PCM (because LA-PCM and LM33-PCM-T use the same MANF_ID).

Signed-off-by: Ken Kawasaki <ken_kawasaki@spring.nifty.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: fix broken multicast with round-robin mode

Round-robin (mode 0) does nothing to ensure that any multicast traffic
originally destined for the host will continue to arrive at the host when
the link that sent the IGMP join or membership report goes down.  One of
the benefits of absolute round-robin transmit.

Keeping track of subscribed multicast groups for each slave did not seem
like a good use of resources, so I decided to simply send on the
curr_active_slave of the bond (typically the first enslaved device that
is up).  This makes failover management simple as IGMP membership
reports only need to be sent when the curr_active_slave changes.  I
tested this patch and it appears to work as expected.

Originally reported by Lon Hohberger <lhh@redhat.com>.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
CC: Lon Hohberger <lhh@redhat.com>
CC: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers/net: Fix continuation lines

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e1000: do not modify tx_queue_len on link speed change

Previously the driver tweaked txqueuelen to avoid false Tx hang reports
seen at half duplex. This had the effect of overriding user set values
on link change/reset. Testing shows that adjusting only the timeout
factor is sufficient to prevent Tx hang reports at half duplex.

This patch removes all instances of tx_queue_len in the driver.

Based on e1000e patch by Franco Fichtner <franco@lastsummer.de>

CC: Franco Fichtner <franco@lastsummer.de>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipmr/ip6mr: prevent out-of-bounds vif_table access

When cache is unresolved, c->mf[6]c_parent is set to 65535 and
minvif, maxvif are not initialized, hence we must avoid
parsing IIF and OIF.
A second problem can happen when the user dumps a cache entry
where a VIF, that was referenced at creation time, has been
removed.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ixgbe: Do not run all Diagnostic offline tests when VFs are active

When running the offline diagnostic tests check to see if any VFs are
online.  If so then only run the link test.  This is necessary because
the VFs running in guest VMs aren't aware of when the PF is taken
offline for a diagnostic test.  Also put a message to the system log
telling the system administrator to take the VFs offline manually if
(s)he wants to run a full diagnostic.  Return 1 on each of the tests
not run to alert the user of the condition.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

igb: use correct bits to identify if manageability is enabled

igb was previously checking the wrong bits in the MANC register to determine
if manageability was enabled. As a result it was incorrectly powering down and
resetting the phy when it didn't need to.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

benet: Fix compile warnings in drivers/net/benet/be_ethtool.c

Fix the following warnings:

be_ethtool.c:493: warning: integer constant is too large for 'long' type
be_ethtool.c:493: warning: integer constant is too large for 'long' type

Signed-off-by: Zhitong Wang <zhitong.wangzt@alibaba-inc.com>
Acked-by: Ajit Khaparde <ajitk@serverengines.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Add MSG_WAITFORONE flag to recvmmsg

Add new flag MSG_WAITFORONE for the recvmmsg() syscall.
When this flag is specified for a blocking socket, recvmmsg()
will only block until at least 1 packet is available. The
default behavior is to block until all vlen packets are
available. This flag has no effect on non-blocking sockets
or when used in combination with MSG_DONTWAIT.

Signed-off-by: Brandon L Black <blblack@gmail.com>
Acked-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
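
A minimal userspace sketch of how the MSG_WAITFORONE flag described above
might be used on a blocking datagram socket. The helper name recv_batch and
the BATCH/BUFSZ sizes are assumptions for the example; recvmmsg() and
struct mmsghdr are Linux-specific and need _GNU_SOURCE with glibc.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

#define BATCH 4     /* vlen: maximum datagrams per call (example value) */
#define BUFSZ 256   /* per-datagram buffer size (example value) */

/*
 * Receive up to BATCH datagrams from a blocking socket in one call.
 * With MSG_WAITFORONE the call returns once at least one datagram has
 * arrived instead of waiting for the whole batch.
 * Stores each datagram length in lens[] and returns the number of
 * datagrams received, or -1 on error.
 */
static int recv_batch(int fd, unsigned int lens[BATCH])
{
    static char bufs[BATCH][BUFSZ];
    struct mmsghdr msgs[BATCH];
    struct iovec iov[BATCH];
    int i, n;

    memset(msgs, 0, sizeof(msgs));
    for (i = 0; i < BATCH; i++) {
        iov[i].iov_base = bufs[i];
        iov[i].iov_len = BUFSZ;
        msgs[i].msg_hdr.msg_iov = &iov[i];
        msgs[i].msg_hdr.msg_iovlen = 1;
    }

    n = recvmmsg(fd, msgs, BATCH, MSG_WAITFORONE, NULL);
    for (i = 0; i < n; i++)
        lens[i] = msgs[i].msg_len;
    return n;
}
```

If only one datagram is queued when the call is made, this should return
after that single datagram rather than blocking for the remaining three.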

e1000e: do not modify tx_queue_len on link speed change

Previously the driver tweaked txqueuelen to avoid false Tx hang reports seen at half duplex.
This had the effect of overriding user set values on link change/reset. Testing shows that
adjusting only the timeout factor is sufficient to prevent Tx hang reports at half duplex.

This patch removes all instances of tx_queue_len in the driver.

Originally reported and patched by Franco Fichtner
CC: Franco Fichtner <franco@lastsummer.de>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

igbvf: do not modify tx_queue_len on link speed change

Previously the driver tweaked txqueuelen to avoid false Tx hang reports seen at half duplex.
This had the effect of overriding user set values on link change/reset. Testing shows that
adjusting only the timeout factor is sufficient to prevent Tx hang reports at half duplex.

Based on e1000e patch by Franco Fichtner <franco@lastsummer.de>

CC: Franco Fichtner <franco@lastsummer.de>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: Restart rt_intern_hash after emergency rebuild (v2)

The rebuild changes the genid, which in turn is used in
the hash calculation. Thus if we don't restart and instead go on with
the insert, the rt will end up in the wrong chain.

(Fixed Neil's comment about the index passed into the rt_intern_hash)

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Reviewed-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
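
The reasoning in the rt_intern_hash message above can be illustrated with a
toy model: when the bucket index depends on a generation id, an index
computed before the generation changes is stale and has to be recomputed.
Everything below (the hash function, TABLE_BITS, the names) is hypothetical
and only mirrors the idea, not the routing cache code.

```c
#include <stdint.h>

#define TABLE_BITS 8   /* example table size: 256 buckets */

/* Toy hash: the bucket index mixes the key with the current generation id. */
static unsigned int bucket_index(uint32_t key, uint32_t genid)
{
    uint32_t h = (key ^ genid) * 2654435761u;

    return h >> (32 - TABLE_BITS);
}

/*
 * Insert sketch: if a rebuild bumps the generation while we are working,
 * the index computed earlier is stale, so start over with the new genid.
 */
static unsigned int pick_bucket(uint32_t key, uint32_t *genid, int rebuild_needed)
{
    unsigned int idx;

restart:
    idx = bucket_index(key, *genid);
    if (rebuild_needed) {
        (*genid)++;          /* stands in for the rebuild changing genid */
        rebuild_needed = 0;
        goto restart;
    }
    return idx;
}
```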

ipv4: Cleanup struct net dereference in rt_intern_hash

There's no need to get it 3 times, and gcc isn't smart enough
to figure this out by itself.

This is just a cleanup before the fix (next patch).

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: fix netlink address dumping in IPv4/IPv6

When a dump is interrupted at the last device in a hash chain and
then continued, "idx" won't get incremented past s_idx, so s_ip_idx
is not reset when moving on to the next device. This means that for all
following devices only the last n - s_ip_idx addresses are dumped.

Tested-by: Pawel Staszewski <pstaszewski@itcare.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>

tulip: Fix null dereference in uli526x_rx_packet()

Acked-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

gianfar: fix undo of reserve()

Fix undo of reserve() before RX recycle

gfar_new_skb reserve()s space in the SKB to align it. If an error occurs,
and the skb needs to be returned to the RX recycle queue, the current code
attempts to reset head, but did not reset tail. This patch remembers the
alignment amount, and reverses the reserve() when needed.

Signed-off-by: Ben Menchaca <ben@bigfootnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
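
The gianfar fix above relies on the fact that a reserve moves both the data
and tail pointers, so undoing it must move both back by the same amount. A
minimal sketch with a hypothetical struct buf standing in for the relevant
sk_buff fields:

```c
#include <stddef.h>

/* Minimal stand-in for the buffer pointers that a reserve touches. */
struct buf {
    unsigned char *head;  /* start of the allocation */
    unsigned char *data;  /* start of payload */
    unsigned char *tail;  /* end of payload */
};

/* Reserve headroom: shift data and tail forward together. */
static void buf_reserve(struct buf *b, size_t len)
{
    b->data += len;
    b->tail += len;
}

/* Undo a previous reserve of the same length before recycling the buffer. */
static void buf_unreserve(struct buf *b, size_t len)
{
    b->data -= len;
    b->tail -= len;
}
```

Remembering the alignment amount, as the patch describes, is what makes the
second call possible: the same length has to be passed to both helpers.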

ixgbe: filter FIP frames into the FCoE offload queues

During FCF solicitation, the switch is supposed to pad the
solicited advertisement out to the endpoint's specified
maximum FCoE frame size. That means that we need to receive
FIP frames that are larger than the standard MTU. To make
sure the receive queue is configured correctly, we should be
filtering FIP traffic into the FCoE queues.

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ixgbe: Priority tag FIP frames

Currently FIP (FCoE Initialization Protocol) frames
are going untagged. This causes various problems
with FCFs (switches) that have negotiated a priority
over dcbx. This patch tags FIP frames with the same
priority as the FCoE frames.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ixgbe: Don't allow user buffer count to exceed 256

If the user buffer count was 256 the shift would place a 1
in the offset region leading to errors. It also overwrites
the user's buffer list. This patch makes sure that at most
256 user buffers are allowed for DDP and the buffer count
is masked properly such that it doesn't overwrite the offset
when shifting the bits.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Frank Zhang <frank_1.zhang@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
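
The shift problem described in the buffer count message above can be shown
with a small bit-packing sketch. The field layout here (an 8-bit count at
bit 8, the offset starting at bit 16) is an assumption chosen so that a count
of 256 lands exactly on the lowest offset bit; the macro and function names
are hypothetical.

```c
#include <stdint.h>

#define BUFFCNT_SHIFT    8      /* assumed position of the count field */
#define BUFFCNT_MASK     0xffu  /* assumed 8-bit width of the count field */
#define OFFSET_SHIFT     16     /* assumed position of the offset field */
#define MAX_USER_BUFFERS 256u

/* Returns 0 and fills *reg on success, -1 if count exceeds the limit. */
static int pack_buffer_word(unsigned int count, unsigned int offset,
                            uint32_t *reg)
{
    if (count > MAX_USER_BUFFERS)
        return -1;

    /* Masking keeps a count of 256 (0x100) out of the offset bits. */
    *reg = ((uint32_t)(count & BUFFCNT_MASK) << BUFFCNT_SHIFT) |
           ((uint32_t)offset << OFFSET_SHIFT);
    return 0;
}
```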

ixgbe: cleanup maximum number of tx queues

In the last patch I missed an unnecessary min_t comparison.
This patch removes it. The path allocates at most
72 tx queues for 82599 and 24 for 82598, so there is no need
for this check.

Additionally this sets MAX_[TX|RX]_QUEUES to 72, which is
used as the size for the tx/rx_ring arrays. There is no
reason to have more tx_rings/rx_rings than num_tx_queues.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ixgbe: Change where clear_to_send_flag is reset to zero.

The clear_to_send flag is being cleared before the call to ping all
the VFs. It should be cleared after pinging all the VFs.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ixgbe: In SR-IOV mode insert delay before bringing the adapter up

VFs running in guest VMs do not respond in as timely a manner to
the PF's indication that it is going down as they do when running in the host
domain. If the adapter is in SR-IOV mode insert a two second delay
to guarantee that all VFs have had time to respond to the PF reset.
In any case resetting the PF while VFs are active should be
discouraged but if it must be done then there will be a two
second delay to help synchronize resets among the PF and all the
VFs.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ixgbevf: Fix signed/unsigned int error

In the Tx mapping function if a DMA error occurred then the unwind of
previously mapped sections would improperly check an unsigned int if
it was less than zero. Changed the index variable to signed to avoid
the error.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
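
The ixgbevf message above describes a reverse unwind loop whose index must be
signed for the "less than zero" exit test to work. A minimal sketch of that
loop shape, with a hypothetical unwind_mappings() helper operating on a plain
int array instead of DMA mappings:

```c
/*
 * Undo the first `mapped` entries in reverse order, as an error path would.
 * The index is signed so that the condition i >= 0 can become false; with
 * an unsigned index it would always be true and the index would wrap.
 * Returns the number of entries undone.
 */
static int unwind_mappings(int *entries, int mapped)
{
    int i;
    int undone = 0;

    for (i = mapped - 1; i >= 0; i--) {
        entries[i] = 0;
        undone++;
    }
    return undone;
}
```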

netxen: update version to 4.0.73

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netxen: added sanity check for pci map

Return value of ioremap is not checked, NULL check added.

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netxen: fix warning in ioaddr for NX3031 chip

crb_intr_mask/crb_sts_consumer is predefined for NX2031 not for
NX3031. For NX3031, these values get defined in rx context creation.

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netxen: fix bios version calculation

Bios sub version from unified fw image is calculated incorrectly.

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Revert "r8169: enable 64-bit DMA by default for PCI Express devices (v2)"

This reverts commit 353176888386d9025062a12dcec08d49af10cf2c.

People are reporting problems due to this change and there
is no anticipation that the cause will be tracked down
any time soon.

We can try next time to selectively re-enable this based upon chip
type, or have a black list of some sort.

Signed-off-by: David S. Miller <davem@davemloft.net>

isdn: Add netdev to lists in MAINTAINERS entry.

Signed-off-by: David S. Miller <davem@davemloft.net>

TIPC: Removed inactive maintainer

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

isdn: Cleanup Sections in PCMCIA driver elsa

Compiling this driver gave a section mismatch,
so I reviewed the init/exit paths of the driver
and made the correct changes.

WARNING: drivers/isdn/hisax/built-in.o(.text+0x55e37): Section mismatch
in reference from the function elsa_cs_config() to the function
.devinit.text:hisax_init_pcmcia()
The function elsa_cs_config() references
the function __devinit hisax_init_pcmcia().
This is often because elsa_cs_config lacks a __devinit
annotation or the annotation of hisax_init_pcmcia is wrong.

Signed-off-by: Henrik Kretzschmar <henne@nachtwindheim.de>
Acked-by: Karsten Keil <keil@b1-systems.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

isdn: Cleanup Sections in PCMCIA driver avma1

Compiling this driver gave a section mismatch,
so I reviewed the init/exit paths of the driver
and made the correct changes.

WARNING: drivers/isdn/hisax/built-in.o(.text+0x56512): Section mismatch
in reference from the function avma1cs_config() to the function
.devinit.text:hisax_init_pcmcia()
The function avma1cs_config() references
the function __devinit hisax_init_pcmcia().
This is often because avma1cs_config lacks a __devinit
annotation or the annotation of hisax_init_pcmcia is wrong.

Signed-off-by: Henrik Kretzschmar <henne@nachtwindheim.de>
Acked-by: Karsten Keil <keil@b1-systems.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

isdn: Cleanup Sections in PCMCIA driver teles

Compiling this driver gave a section mismatch,
so I reviewed the init/exit paths of the driver
and made the correct changes.

WARNING: drivers/isdn/hisax/built-in.o(.text+0x56bfb): Section mismatch
in reference from the function teles_cs_config() to the function
.devinit.text:hisax_init_pcmcia()
The function teles_cs_config() references
the function __devinit hisax_init_pcmcia().
This is often because teles_cs_config lacks a __devinit
annotation or the annotation of hisax_init_pcmcia is wrong.

Signed-off-by: Henrik Kretzschmar <henne@nachtwindheim.de>
Acked-by: Karsten Keil <keil@b1-systems.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

isdn: Cleanup Sections in PCMCIA driver sedlbauer

Compiling this driver gave a section mismatch,
so I reviewed the init/exit paths of the driver
and made the correct changes.

WARNING: drivers/isdn/hisax/built-in.o(.text+0x558d6): Section mismatch
in reference from the function sedlbauer_config() to the function
.devinit.text:hisax_init_pcmcia()
The function sedlbauer_config() references
the function __devinit hisax_init_pcmcia().
This is often because sedlbauer_config lacks a __devinit
annotation or the annotation of hisax_init_pcmcia is wrong.

Signed-off-by: Henrik Kretzschmar <henne@nachtwindheim.de>
Acked-by: Karsten Keil <keil@b1-systems.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

via-velocity: Fix FLOW_CNTL_TX_RX handling in set_mii_flow_control()

Clear, don't set, ANAR_ASMDIR in this case.

Noticed by Roel Kluin.

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of git://git./linux/kernel/git/kaber/nf-2.6

netfilter: xt_hashlimit: IPV6 bugfix

A missing break statement in hashlimit_ipv6_mask(), and masks
between /64 and /95 are not working at all...

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
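
To illustrate why a missing break matters in this kind of code, here is a
generic sketch of per-word IPv6 prefix masking in which every case ends in
break. It is not the xt_hashlimit implementation: the names are hypothetical
and the address is held as four host-order words for simplicity.

```c
#include <stdint.h>

/* Keep the top `bits` bits of a host-order word; bits must be in 0..31. */
static uint32_t mask_word(uint32_t w, unsigned int bits)
{
    return bits ? (w & (0xffffffffu << (32 - bits))) : 0;
}

/* Apply a prefix length p (0..128) to an address held as four words. */
static void ipv6_mask(uint32_t a[4], unsigned int p)
{
    switch (p / 32) {
    case 0:
        a[0] = mask_word(a[0], p);
        a[1] = a[2] = a[3] = 0;
        break;
    case 1:
        a[1] = mask_word(a[1], p - 32);
        a[2] = a[3] = 0;
        break;
    case 2:
        a[2] = mask_word(a[2], p - 64);
        a[3] = 0;
        break;  /* without this, case 3 would run with p - 96 wrapped around */
    case 3:
        a[3] = mask_word(a[3], p - 96);
        break;
    default:    /* p == 128: keep the full address */
        break;
    }
}
```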

netfilter: ip6table_raw: fix table priority

The order of the IPv6 raw table is currently reversed, which makes it impossible
to use the NOTRACK target in IPv6: for example if someone enters

ip6tables -t raw -A PREROUTING -p tcp --dport 80 -j NOTRACK

and if we receive fragmented packets then the first fragment will be
untracked and thus skip nf_ct_frag6_gather (and conntrack), while all
subsequent fragments enter nf_ct_frag6_gather and reassembly will never
successfully be finished.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>

netfilter: xt_hashlimit: dl_seq_stop() fix

If dl_seq_start() memory allocation fails, we crash later in
dl_seq_stop(), trying to kfree(ERR_PTR(-ENOMEM))

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
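
The dl_seq_stop() crash above comes from freeing an encoded error value as if
it were a real allocation. A userspace imitation of the error-pointer idea,
with a guard in the stop path, might look like the following. All names are
hypothetical; err_ptr() and is_err() only mimic the kernel's ERR_PTR() and
IS_ERR() helpers.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static void *err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* True if the pointer falls in the range reserved for encoded errors. */
static int is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* start(): returns a heap-allocated cursor, or an encoded -ENOMEM. */
static void *seq_start_sketch(int simulate_failure)
{
    int *cursor = simulate_failure ? NULL : malloc(sizeof(*cursor));

    if (!cursor)
        return err_ptr(-ENOMEM);
    *cursor = 0;
    return cursor;
}

/* stop(): must not free an encoded error. Returns 1 if memory was freed. */
static int seq_stop_sketch(void *v)
{
    if (is_err(v))
        return 0;
    free(v);
    return 1;
}
```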

af_key: return error if pfkey_xfrm_policy2msg_prep() fails

The original code saved the error value but just returned 0 in the end.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Jamal Hadi Salim <hadi@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

skbuff: remove unused dma_head & dma_maps fields

The dma map fields in the skb_shared_info structure no longer have any users
and can be dropped since they make skb_shared_info unnecessarily larger.

Running slabtop shows that we were using 4K slabs for the skb->head on x86_64 w/
an allocation size of 1522. It turns out that the dma_head and dma_maps array
made skb_shared_info large enough that we had crossed over the 2k boundary with
standard frames and as such we were using 4k blocks of memory for all skbs.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
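
The 2k boundary effect in the skbuff message above can be reproduced with a
simplified allocator model that rounds each request up to the next power of
two. The model and the trailer sizes used with it are assumptions for
illustration only; they are not the real structure sizes.

```c
#include <stddef.h>

/* Simplified model: round a request up to the next power-of-two bucket. */
static size_t bucket_size(size_t request)
{
    size_t bucket = 32;

    while (bucket < request)
        bucket <<= 1;
    return bucket;
}

/* Bucket used for a frame buffer plus trailing bookkeeping bytes. */
static size_t skb_head_bucket(size_t frame_len, size_t trailer_len)
{
    return bucket_size(frame_len + trailer_len);
}
```

For a 1522-byte frame, any trailer larger than 526 bytes pushes the total
past 2048 and doubles the bucket to 4096 in this model; shrinking the trailer
below that threshold brings the allocation back to 2048.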

vlan: updates vlan real_num_tx_queues

Updates real_num_tx_queues in case underlying real device
has changed real_num_tx_queues.

-v2
As per Eric Dumazet <eric.dumazet@gmail.com>'s comment:
-- adds BUG_ON to catch case of real_num_tx_queues exceeding num_tx_queues.
-- created this self contained patch to just update real_num_tx_queues.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vlan: adds vlan_dev_select_queue

This is required to correctly select the vlan tx queue for a driver
that supports multiple tx queues and implements ndo_select_queue, since
the currently selected vlan tx queue is not aligned with the queue
selected by the real net_device's ndo_select_queue.

Unaligned vlan tx queue selection causes thrashing, with higher vlan
tx lock contention at least for FCoE traffic, and a wrong socket tx
queue_mapping for ixgbe, which implements ndo_select_queue.

-v2

As per Eric Dumazet <eric.dumazet@gmail.com>'s comments, mirrored
vlan net_device_ops to have them with and without vlan_dev_select_queue
and then select according to real dev ndo_select_queue present or not
for a vlan net_device. This is to completely skip vlan_dev_select_queue
calling for real net_device not supporting ndo_select_queue.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

igb: only use vlan_gro_receive if vlans are registered

This change makes it so that vlan_gro_receive is only used if vlans have been
registered to the adapter structure. Previously we were just sending all vlan
tagged frames in via this function but this results in a null pointer
dereference when vlans are not registered.

[ This fixes bugzilla entry 15582 -Eric Dumazet]

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

igb: do not modify tx_queue_len on link speed change

Previously the driver tweaked txqueuelen to avoid false Tx hang reports seen at half duplex.
This had the effect of overriding user set values on link change/reset. Testing shows that
adjusting only the timeout factor is sufficient to prevent Tx hang reports at half duplex.

Based on e1000e patch by Franco Fichtner <franco@lastsummer.de>

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

igb: count Rx FIFO errors correctly

Don't aggregate rx_no_buffer_count into rx_fifo_errors. RNBC counts
packets that get queued temporarily in the adapter's FIFO. These
packets are not dropped and are not errors. The correct counter
is rx_missed_errors (MPC).

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2: Use proper handler during netpoll.

Netpoll needs to call the proper handler depending on the IRQ mode
and the vector.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2: Fix netpoll crash.

The bnx2 driver calls netif_napi_add() for all the NAPI structs during
->probe() time but not all of them will be used if we're not in MSI-X
mode. This creates a problem for netpoll since it will poll all the
NAPI structs in the dev_list whether or not they are scheduled, resulting
in a crash when we access structure fields not initialized for that vector.

We fix it by moving the netif_napi_add() call to ->open() after the number
of IRQ vectors has been determined.

Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ksz884x: fix return value of netdev_set_eeprom

netdev_set_eeprom() confused ethtool by just returning 1 on error
instead of a proper -EINVAL.

Signed-off-by: Jens Rottmann <JRottmann@LiPPERTEmbedded.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

cgroups: net_cls as module

Allows the net_cls cgroup subsystem to be compiled as a module

This patch modifies net/sched/cls_cgroup.c to allow the net_cls subsystem
to be optionally compiled as a module instead of builtin. The
cgroup_subsys struct is moved around a bit to allow the subsys_id to be
either declared as a compile-time constant by the cgroup_subsys.h include
in cgroup.h, or, if it's a module, initialized within the struct by
cgroup_load_subsys.

Signed-off-by: Ben Blum <bblum@andrew.cmu.edu>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

netxen: The driver doesn't work on NX_P3_B1 so cause probe to fail.

I haven't been able to get link up on a NX_P3_B1 since 2.6.31.  The
driver complains about a firmware hang instead.  When I asked I was
told rev 0x41 was a preproduction rev.  So disable support in the
driver so no one is surprised the code doesn't work.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netpoll: warn when there are spaces in parameters

v2: update according to Frans' comments.

Currently, if we leave spaces before dst port,
netconsole will silently accept it as 0. Warn about this.

Also, when spaces appear in other places, make them
visible in error messages.

Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: David Miller <davem@davemloft.net>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

can: bfin_can: switch to common Blackfin can header

The MMR bits are being moved to this header, so include it.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Acked-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of /home/davem/src/GIT/linux-2.6/

Fix up prototype for sys_ipc breakage

Commit 45575f5a426c ("ppc64 sys_ipc breakage in 2.6.34-rc2") fixed the
definition of the sys_ipc() helper, but didn't fix the prototype in
<linux/syscalls.h>

Reported-and-tested-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

netfilter: xt_recent: fix regression in rules using a zero hit_count

Commit 8ccb92ad (netfilter: xt_recent: fix false match) fixed supposedly
false matches in rules using a zero hit_count. As it turns out there is
nothing false about these matches and people are actually using entries
with a hit_count of zero to make rules dependent on addresses inserted
manually through /proc.

Since this slipped past the eyes of three reviewers, instead of
reverting the commit in question, this patch explicitly checks
for a hit_count of zero to make the intentions more clear.

Reported-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
Tested-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
Cc: stable@kernel.org
Signed-off-by: Patrick McHardy <kaber@trash.net>

rxrpc: Check allocation failure.

alloc_skb() can return NULL.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'for-linus' of git://git./linux/kernel/git/mst/vhost

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio: console: Check if port is valid in resize_console
virtio: console: Generate a kobject CHANGE event on adding 'name' attribute

Merge git://git./linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (38 commits)
  ip_gre: include route header_len in max_headroom calculation
  if_tunnel.h: add missing ams/byteorder.h include
  ipv4: Don't drop redirected route cache entry unless PTMU actually expired
  net: suppress lockdep-RCU false positive in FIB trie.
  Bluetooth: Fix kernel crash on L2CAP stress tests
  Bluetooth: Convert debug files to actually use debugfs instead of sysfs
  Bluetooth: Fix potential bad memory access with sysfs files
  netfilter: ctnetlink: fix reliable event delivery if message building fails
  netlink: fix NETLINK_RECV_NO_ENOBUFS in netlink_set_err()
  NET_DMA: free skbs periodically
  netlink: fix unaligned access in nla_get_be64()
  tcp: Fix tcp_mark_head_lost() with packets == 0
  net: ipmr/ip6mr: fix potential out-of-bounds vif_table access
  KS8695: update ksp->next_rx_desc_read at the end of rx loop
  igb: Add support for 82576 ET2 Quad Port Server Adapter
  ixgbevf: Message formatting cleanups
  ixgbevf: Shorten up delay timer for watchdog task
  ixgbevf: Fix VF Stats accounting after reset
  ixgbe: Set IXGBE_RSC_CB(skb)->DMA field to zero after unmapping the address
  ixgbe: fix for real_num_tx_queues update issue
  ...

Merge branch 'for-linus' of git://git./linux/kernel/git/bp/bp

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
edac, mce: Filter out invalid values

rxrpc: Check allocation failure.

alloc_skb() can return NULL.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

AFS: Potential null dereference

It seems clear from the surrounding code that xpermits is allowed to be
NULL here.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ppc64 sys_ipc breakage in 2.6.34-rc2

I chased down a fail on ppc64 on 2.6.34-rc2 where an application that
uses shared memory was getting a SEGV.

Commit baed7fc9b580bd3fb8252ff1d9b36eaf1f86b670 ("Add generic sys_ipc
wrapper") changed the second argument from an unsigned long to an int.
When we call shmget the system call wrappers for sys_ipc will sign
extend second (i.e. the size), which truncates it. It took a while to
track down because the call succeeds and strace shows the untruncated
size :)

The patch below changes second from an int to an unsigned long which
fixes shmget on ppc64 (and I assume s390, sparc64 and mips64).

Signed-off-by: Anton Blanchard <anton@samba.org>
--

I assume the function prototypes for the other IPC methods would cause us
to sign or zero extend second where appropriate (avoiding any security
issues). Come to think of it, the syscall wrappers for each method should do
that for us as well.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
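
The truncation described in the sys_ipc report above can be reproduced in
user space. This is only a sketch of the arithmetic, not the kernel wrapper
code: pass_as_int() and pass_as_ulong() are hypothetical helpers that model
a 64-bit argument being treated as a 32-bit int versus an unsigned long.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a wrapper that treats the argument as a 32-bit int:
 * keep the low 32 bits, then sign-extend them back to 64 bits. */
static int64_t pass_as_int(uint64_t arg)
{
    uint32_t low = (uint32_t)arg;   /* the upper 32 bits are lost here */

    /* Portable sign extension of a 32-bit value. */
    return (int64_t)(low ^ 0x80000000u) - (int64_t)0x80000000u;
}

/* Hypothetical model of the fixed prototype: the value passes unchanged. */
static uint64_t pass_as_ulong(uint64_t arg)
{
    return arg;
}
```

A size of 4 GB + 4 KB comes out as 4 KB under the int model, which matches
the symptom in the report: the call succeeds, but with a smaller segment
than the application asked for.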

x86 / perf: Fix suspend to RAM on HP nx6325

Commit 3f6da3905398826d85731247e7fbcf53400c18bd
(perf: Rework and fix the arch CPU-hotplug hooks) broke suspend to
RAM on my HP nx6325 (and most likely on other AMD-based boxes too)
by allowing amd_pmu_cpu_offline() to be executed for CPUs that are
going offline as part of the suspend process. The problem is that
cpuhw->amd_nb may be NULL already, so the function should make sure
it's not NULL before accessing the object pointed to by it.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

edac, mce: Filter out invalid values

Print the CPU associated with the error only when the field is valid.

Cc: <stable@kernel.org> # .32.x .33.x
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

virtio: console: Check if port is valid in resize_console

The console port could have been hot-unplugged. Check if it is valid
before working on it.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

virtio: console: Generate a kobject CHANGE event on adding 'name' attribute

When the host lets us know what 'name' a port is assigned, we create the
sysfs 'name' attribute. Generate a 'change' event after this so that
udev wakes up and acts on the rules for virtio-ports (currently there's
only one rule that creates a symlink from the 'name' to the actual char
device).

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

ip_gre: include route header_len in max_headroom calculation

Taking route's header_len into account, and updating gre device
needed_headroom will give better hints on upper bound of required
headroom. This is useful if the gre traffic is xfrm'ed.

Signed-off-by: Timo Teras <timo.teras@iki.fi>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

if_tunnel.h: add missing ams/byteorder.h include

When compiling a userspace application that includes
if_tunnel.h and uses the GRE_* defines, you will get an undefined
reference to __cpu_to_be16.

Fix this by adding the missing #include <asm/byteorder.h>.

Cc: stable@kernel.org
Signed-off-by: Paulius Zaleckas <paulius.zaleckas@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: Don't drop redirected route cache entry unless PTMU actually expired

TCP sessions over IPv4 can get stuck if routers between endpoints
do not fragment packets but implement PMTU instead, and we are using
those routers because of an ICMP redirect.

Setup is as follows:

   MTU1    MTU2   MTU1
A--------B------C------D

with MTU1 > MTU2. A and D are endpoints, B and C are routers. B and C
implement PMTU and drop packets larger than MTU2 (for example because
DF is set on all packets). TCP sessions are initiated between A and D.
There is packet loss between A and D, causing frequent TCP
retransmits.

After the number of retransmits on a TCP session reaches tcp_retries1,
tcp calls dst_negative_advice() prior to each retransmit. This results
in route cache entries for the peer being deleted in
ipv4_negative_advice() if the Path MTU is set.

If the outstanding data on an affected TCP session is larger than
MTU2, packets sent from the endpoints will be dropped by B or C, and
ICMP NEEDFRAG will be returned. A and D receive NEEDFRAG messages and
update PMTU.

Before the next retransmit, tcp will again call dst_negative_advice(),
causing the route cache entry (with correct PMTU) to be deleted. The
retransmitted packet will be larger than MTU2, causing it to be
dropped again.

This sequence repeats until the TCP session aborts or is terminated.

The problem is fixed by removing redirected route cache entries in
ipv4_negative_advice() only if the PMTU has expired.

Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
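
The retransmit loop described in the PMTU commit above can be modelled with
a small user-space simulation. This is a sketch based only on the commit
text, not on the kernel sources: struct route_entry, the two policy
functions and the MTU and timeout values are all assumptions, negative
advice is applied before every attempt rather than only after tcp_retries1,
and time is a simple counter that ignores jiffies wraparound.

```c
#include <assert.h>
#include <stdbool.h>

enum { MTU1 = 1500, MTU2 = 1400, PMTU_TIMEOUT = 600 };

struct route_entry {
    unsigned int  mtu;           /* MTU currently cached for the peer */
    unsigned long pmtu_expires;  /* 0 means no PMTU has been learned */
};

/* Old behaviour: drop the entry whenever a PMTU is set. */
static bool drop_if_pmtu_set(const struct route_entry *rt, unsigned long now)
{
    (void)now;
    return rt->pmtu_expires != 0;
}

/* Fixed behaviour: drop the entry only once the PMTU has expired. */
static bool drop_if_pmtu_expired(const struct route_entry *rt,
                                 unsigned long now)
{
    return rt->pmtu_expires != 0 && now >= rt->pmtu_expires;
}

typedef bool (*advice_fn)(const struct route_entry *, unsigned long);

/* Returns the attempt on which the packet gets through, or -1 if never. */
static int attempts_until_delivered(advice_fn should_drop, int max_attempts)
{
    struct route_entry rt = { MTU1, 0 };

    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        unsigned long now = (unsigned long)attempt;  /* one tick per try */

        if (should_drop(&rt, now)) {  /* negative advice before the send */
            rt.mtu = MTU1;            /* fresh entry, learned PMTU is gone */
            rt.pmtu_expires = 0;
        }
        if (rt.mtu <= MTU2)
            return attempt;           /* packet fits through B and C */

        rt.mtu = MTU2;                /* NEEDFRAG received, PMTU updated */
        rt.pmtu_expires = now + PMTU_TIMEOUT;
    }
    return -1;
}
```

With the old policy the learned PMTU is thrown away before every
retransmit, so the packet is never delivered; with the fixed policy it gets
through on the second attempt.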

Merge branch 'master' of git://git./linux/kernel/git/holtmann/bluetooth-2.6

net: suppress lockdep-RCU false positive in FIB trie.

Allow fib_find_node() to be called either under rcu_read_lock()
protection or with RTNL held.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Bluetooth: Fix kernel crash on L2CAP stress tests

Add a very simple check that the req buffer has enough space to
fit the configuration parameters. This should be enough to reject packets
whose configuration size exceeds the req buffer.

Crash trace below:

[ 6069.659393] Unable to handle kernel paging request at virtual address 02000205
[ 6069.673034] Internal error: Oops: 805 [#1] PREEMPT
...
[ 6069.727172] PC is at l2cap_add_conf_opt+0x70/0xf0 [l2cap]
[ 6069.732604] LR is at l2cap_recv_frame+0x1350/0x2e78 [l2cap]
...
[ 6070.030303] Backtrace:
[ 6070.032806] [<bf1c2880>] (l2cap_add_conf_opt+0x0/0xf0 [l2cap]) from
[<bf1c6624>] (l2cap_recv_frame+0x1350/0x2e78 [l2cap])
[ 6070.043823] r8:dc5d3100 r7:df2a91d6 r6:00000001 r5:df2a8000 r4:00000200
[ 6070.050659] [<bf1c52d4>] (l2cap_recv_frame+0x0/0x2e78 [l2cap]) from
[<bf1c8408>] (l2cap_recv_acldata+0x2bc/0x350 [l2cap])
[ 6070.061798] [<bf1c814c>] (l2cap_recv_acldata+0x0/0x350 [l2cap]) from
[<bf0037a4>] (hci_rx_task+0x244/0x478 [bluetooth])
[ 6070.072631] r6:dc647700 r5:00000001 r4:df2ab740
[ 6070.077362] [<bf003560>] (hci_rx_task+0x0/0x478 [bluetooth]) from
[<c006b9fc>] (tasklet_action+0x78/0xd8)
[ 6070.087005] [<c006b984>] (tasklet_action+0x0/0xd8) from [<c006c160>]

Signed-off-by: Andrei Emeltchenko <andrei.emeltchenko@nokia.com>
Acked-by: Gustavo F. Padovan <gustavo@padovan.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
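
The kind of space check described in the L2CAP fix above might look like
the following user-space sketch. It is not the driver code: struct conf_buf,
its 64-byte size and conf_append() are hypothetical names chosen for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct conf_buf {
    unsigned char data[64];  /* fixed-size request buffer (size assumed) */
    size_t used;             /* bytes already filled */
};

/* Append len bytes only if they fit. On failure the buffer is left
 * untouched so the caller can reject the request instead of overflowing. */
static bool conf_append(struct conf_buf *buf, const void *src, size_t len)
{
    if (len > sizeof(buf->data) - buf->used)
        return false;

    memcpy(buf->data + buf->used, src, len);
    buf->used += len;
    return true;
}
```

Comparing len against the remaining space, rather than adding len to used
first, keeps the check safe even for very large len values.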

Bluetooth: Convert debug files to actually use debugfs instead of sysfs

Some of the debug files ended up wrongly in sysfs, because at that point
in time, debugfs didn't exist. Convert these files to use debugfs and
also seq_file. This patch converts all of these files at once and then
removes the exported symbol for the Bluetooth sysfs class.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

Bluetooth: Fix potential bad memory access with sysfs files

When creating a high number of Bluetooth sockets (L2CAP, SCO
and RFCOMM) it is possible to scribble repeatedly on arbitrary
pages of memory. Ensure that the content of these sysfs files is
always less than one page. Even if this means truncating. The
files in question are scheduled to be moved over to debugfs in
the future anyway.

Based on initial patches from Neil Brown and Linus Torvalds

Reported-by: Neil Brown <neilb@suse.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
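
Keeping generated output within one page, truncating if necessary, can be
sketched as below. This is an illustration in user space, not the Bluetooth
sysfs code: format_entries() and the "socket N" line format are
hypothetical, and the caller supplies the page size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Write one line per entry into page, stopping before the first line that
 * would not fit. The result is always NUL-terminated; page_size must be at
 * least 1. Returns the number of bytes written, excluding the NUL. */
static size_t format_entries(char *page, size_t page_size, int count)
{
    size_t used = 0;

    for (int i = 0; i < count; i++) {
        char line[32];
        int n = snprintf(line, sizeof(line), "socket %d\n", i);

        /* Leave room for the terminating NUL; truncate the listing here. */
        if (n < 0 || (size_t)n >= page_size - used)
            break;

        memcpy(page + used, line, (size_t)n);
        used += (size_t)n;
    }
    page[used] = '\0';
    return used;
}
```

Formatting each line into a scratch buffer first means a line is either
emitted whole or not at all, so the output never ends in a partial entry.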

Merge branch 'vhost' of git://git./linux/kernel/git/mst/vhost

netfilter: ctnetlink: fix reliable event delivery if message building fails

This patch fixes a bug that allows events to be lost when reliable
event delivery mode is used, i.e. if the NETLINK_BROADCAST_SEND_ERROR
and NETLINK_RECV_NO_ENOBUFS socket options are set.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

netlink: fix NETLINK_RECV_NO_ENOBUFS in netlink_set_err()

Currently, ENOBUFS errors are reported to the socket via
netlink_set_err() even if NETLINK_RECV_NO_ENOBUFS is set. However,
that should not happen. This patch fixes the problem and changes the
prototype of netlink_set_err() to return the number of sockets that
have set the NETLINK_RECV_NO_ENOBUFS socket option. This return
value is used in the next patch in this bugfix series.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

NET_DMA: free skbs periodically

Under NET_DMA, data transfer can grind to a halt when userland issues a
large read on a socket with a high RCVLOWAT (e.g., 512 KB for both).
This appears to be because the NET_DMA design queues up lots of memcpy
operations, but doesn't issue or wait for them (and thus free the
associated skbs) until it is time for tcp_recvmsg() to return.
The socket hangs when its TCP window goes to zero before enough data is
available to satisfy the read.

Periodically issue asynchronous memcpy operations, and free skbs for ones
that have completed, to prevent sockets from going into zero-window mode.

Signed-off-by: Steven J. Magnani <steve@digidescorp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netlink: fix unaligned access in nla_get_be64()

This patch fixes an unaligned access in nla_get_be64() that I
introduced in a17c859849402315613a0015ac8fbf101acf0cc1.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
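
Netlink attribute payloads are only guaranteed 4-byte alignment, so a
64-bit value cannot safely be read through a plain pointer dereference on
architectures that require natural alignment. A common way to avoid this,
sketched here in user space, is to copy the bytes into an aligned local
variable. load_u64() is a hypothetical helper, not the kernel function.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 64-bit value from a possibly unaligned address by copying it
 * byte-wise into an aligned local variable. */
static uint64_t load_u64(const void *p)
{
    uint64_t v;

    memcpy(&v, p, sizeof(v));
    return v;
}
```

Compilers typically turn the memcpy() into a single load on architectures
that tolerate unaligned access, so the safe form usually costs nothing
there.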

tcp: Fix tcp_mark_head_lost() with packets == 0

A packet is marked as lost when packets == 0, although nothing should be
done in that case. In some cases this results in a packet being
retransmitted too early during recovery.
This small patch fixes the issue by returning immediately.

Signed-off-by: Lennart Schulte <lennart.schulte@nets.rwth-aachen.de>
Signed-off-by: Arnd Hannemann <hannemann@nets.rwth-aachen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipmr/ip6mr: fix potential out-of-bounds vif_table access

mfc_parent of cache entries is used to index into the vif_table and is
initialised from mfcctl->mfcc_parent. This can take values of up to 2^16-1,
while the vif_table has only MAXVIFS (32) entries. The same problem
affects ip6mr.

Refuse invalid values to fix a potential out-of-bounds access. Unlike
the other validity checks, this is checked in ipmr_mfc_add() instead of
the setsockopt handler since it's unused in the delete path and might be
uninitialized.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
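
The range check described in the ipmr commit above amounts to validating a
16-bit index against the table size before using it. A minimal user-space
sketch follows; struct vif, its in_use field and vif_lookup() are
hypothetical, and only the MAXVIFS value of 32 comes from the commit text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAXVIFS 32

struct vif {
    int in_use;  /* placeholder field for illustration */
};

static struct vif vif_table[MAXVIFS];

/* Return the table slot for parent, or NULL if the index is out of range.
 * A 16-bit parent can be as large as 65535, far beyond the table. */
static struct vif *vif_lookup(uint16_t parent)
{
    if (parent >= MAXVIFS)
        return NULL;
    return &vif_table[parent];
}
```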

KS8695: update ksp->next_rx_desc_read at the end of rx loop

There is no need to adjust the next rx descriptor after each packet,
so do it only once at the end of the routine.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com>

igb: Add support for 82576 ET2 Quad Port Server Adapter

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ixgbevf: Message formatting cleanups

Clean up some text output formatting.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ixgbevf: Shorten up delay timer for watchdog task

The recovery from PF reset works better when you shorten up the delay
until the watchdog task executes.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ixgbevf: Fix VF Stats accounting after reset

The counters in the 82599 Virtual Function are not clear on read.  They
accumulate to the maximum value and then roll over.  They are also not
cleared when the VF executes a soft reset, so it is possible they are
non-zero when the driver loads and starts.  This has all been accounted
for in the code that keeps the stats up to date but there is one case
that is not.  When the PF driver is reset the counters in the VF are
all reset to zero.  This adds an additional accounting overhead into
the VF driver when the PF is reset under its feet.  This patch adds
additional counters that are used by the VF driver to accumulate and
save stats after a PF reset has been detected.  Prior to this patch
displaying the stats in the VF after the PF has reset would show
bogus data.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
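
The accounting described in the VF stats commit above can be sketched as a
base value plus a saved total. This is a user-space model under stated
assumptions, not the ixgbevf code: struct vf_counter and its helpers are
hypothetical, the hardware counter is taken to be 32 bits wide, and at most
one rollover is assumed between reads.

```c
#include <assert.h>
#include <stdint.h>

struct vf_counter {
    uint32_t base;   /* hardware value when counting (re)started */
    uint32_t last;   /* most recent hardware value read */
    uint64_t saved;  /* total accumulated before the last PF reset */
};

/* The hardware counter may be non-zero when the driver loads. */
static void vf_counter_init(struct vf_counter *c, uint32_t hw)
{
    c->base = hw;
    c->last = hw;
    c->saved = 0;
}

static void vf_counter_read(struct vf_counter *c, uint32_t hw)
{
    c->last = hw;
}

/* A PF reset zeroes the hardware counter: bank what was counted so far. */
static void vf_counter_pf_reset(struct vf_counter *c)
{
    c->saved += (uint32_t)(c->last - c->base);
    c->base = 0;
    c->last = 0;
}

/* Unsigned 32-bit subtraction gives the right delta across one rollover. */
static uint64_t vf_counter_total(const struct vf_counter *c)
{
    return c->saved + (uint32_t)(c->last - c->base);
}
```

Without the saved total, the reported value after a PF reset would be
computed against a stale base and come out wrong, which is the bogus data
the commit message mentions.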

ixgbe: Set IXGBE_RSC_CB(skb)->DMA field to zero after unmapping the address

As per Simon Horman's feedback, set IXGBE_RSC_CB(skb)->dma to zero
after unmapping the HWRSC DMA address to avoid double freeing.

Signed-off-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ixgbe: fix for real_num_tx_queues update issue

Currently netdev_features_change() is called before the fcoe tx queue
setup is done, so this patch moves the call to netdev_features_change()
after the tx queue setup in ixgbe_init_interrupt_scheme(), so
that real_num_tx_queues is updated correctly on each fcoe enable
or disable.

This allows the additional fcoe queues to be updated correctly in the vlan
driver for correct queue selection.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

TCP: check min TTL on received ICMP packets

This adds RFC5082 checks for TTL on received ICMP packets.
It adds some security against spoofed ICMP packets
disrupting GTSM protected sessions.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
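
The RFC 5082 (GTSM) check mentioned above is a simple comparison: a
protected session expects packets to arrive with a TTL no lower than a
configured minimum, and anything lower is discarded. A minimal sketch, in
which ttl_acceptable() is a hypothetical helper rather than the kernel
code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Accept a packet only if its TTL is at least the configured minimum.
 * A minimum of 0 disables the check. */
static bool ttl_acceptable(uint8_t ttl, uint8_t min_ttl)
{
    return ttl >= min_ttl;
}
```

With a minimum of 255, only packets from directly connected peers pass,
because every router hop decrements the TTL.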

ipv6: Remove redundant dst NULL check in ip6_dst_check

As the only path leading to ip6_dst_check makes an indirect call
through dst->ops, dst cannot be NULL in ip6_dst_check.

This patch removes this check in case it misleads people who
come across this code.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: check rt_genid in dst_check

Xfrm_dst keeps a reference to ipv4 rtable entries on each
cached bundle. The only way to renew xfrm_dst when the underlying
route has changed is to implement dst_check for this. This is
what the ipv6 side does too.

The problems started after 87c1e12b5eeb7b30b4b41291bef8e0b41fc3dde9
("ipsec: Fix bogus bundle flowi"), which fixed a bug causing xfrm_dst
to not get reused. Until then, all lookups always generated a new
xfrm_dst with a new route reference, and path mtu worked. But after the
fix, the old routes started to get reused even after they had expired,
causing pmtu to break (well, it would occasionally work if the rtable
gc had run recently and marked the route obsolete, causing dst_check to
get called).

Signed-off-by: Timo Teras <timo.teras@iki.fi>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
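
Validating a cached route against a generation id, as the rt_genid commit
above describes, can be sketched as follows. This is a user-space model
with hypothetical names (struct cached_route, route_check(),
flush_routes()), not the kernel's dst_check implementation.

```c
#include <assert.h>
#include <stddef.h>

struct cached_route {
    int genid;  /* generation in which this entry was created */
};

static int current_genid;

/* Return the entry if it is still current, or NULL so that the caller
 * performs a fresh lookup. */
static struct cached_route *route_check(struct cached_route *rt)
{
    return rt->genid == current_genid ? rt : NULL;
}

/* Invalidate every cached entry at once by bumping the generation. */
static void flush_routes(void)
{
    current_genid++;
}
```

The holder of a cached reference, such as a cached bundle, notices the
change the next time it validates the entry, without the cache having to
track who holds references.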

Linux 2.6.34-rc2

Merge git://git./linux/kernel/git/lethal/sh-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
  serial: sh-sci: remove duplicated #include
  sh: Export uncached helper symbols.
  sh: Fix up NUMA build for 29-bit.
  serial: sh-sci: Fix build failure for non-sh architectures.
  sh: Fix up uncached offset for legacy 29-bit mode.
  sh: Support CPU affinity masks for INTC controllers.

Merge branch 'for-upstream' of git://git./linux/kernel/git/dvrabel/uwb

* 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/dvrabel/uwb:
  uwb: remove duplicate cpu_to_le16()
  uwb: declare MODULE_FIRMWARE() in i1480 DFU driver
  uwb: make USB device id table constant
  uwb: wlp: refactor wlp_get_<attribute>() macros

Merge branch 'for-linus' of git://git./linux/kernel/git/mattst88/alpha-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha-2.6:
  alpha: fix compile errors in dma-mapping-common.h
  alpha: remove trailing spaces in messages
  alpha: use __ratelimit