linux-2.6-microblaze.git
Jack Morgenstein [Tue, 15 May 2012 10:35:01 +0000 (10:35 +0000)]
net/mlx4_core: Do not reset module-parameter num_vfs when fail to enable sriov

Consider the following scenario: 2 HCAs, only one of which can run SRIOV.

If we reset the module parameter, all the VFs of the SRIOV HCA will be
claimed by the PPF host (the code relies on num_vfs being non-zero
to avoid this claiming, and num_vfs was reset when pci_enable_sriov failed
for the non-SRIOV HCA).

The solution is not to touch the num_vfs parameter.

Also, eliminate the unneeded check of num_vfs when disabling sriov
(the dev flag bit is sufficient).
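
The two-HCA scenario can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `model_num_vfs`, `struct model_hca`, `model_probe` and `model_needs_sriov_disable` are hypothetical names, not the driver's actual code.

```c
#include <stdbool.h>

static int model_num_vfs = 4;          /* hypothetical module-wide parameter */

struct model_hca {
    bool sriov_capable;                /* can this HCA enable SRIOV at all? */
    bool sriov_flag;                   /* per-device "SRIOV enabled" flag bit */
};

/* Probe one HCA. On failure only the per-device flag stays clear;
 * the shared parameter is left untouched for the next HCA. */
static void model_probe(struct model_hca *hca)
{
    hca->sriov_flag = false;
    if (model_num_vfs > 0 && hca->sriov_capable)
        hca->sriov_flag = true;
}

/* Teardown consults only the per-device flag, not model_num_vfs. */
static bool model_needs_sriov_disable(const struct model_hca *hca)
{
    return hca->sriov_flag;
}
```

Probing a non-SRIOV HCA first and an SRIOV HCA second leaves `model_num_vfs` at its configured value, so the second probe still sees a non-zero count.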

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jack Morgenstein [Tue, 15 May 2012 10:35:00 +0000 (10:35 +0000)]
net/mlx4_core: Remove unused *_str functions from the resource tracker

Removed unused *_str helper functions from resource_tracker.c

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jack Morgenstein [Tue, 15 May 2012 10:34:59 +0000 (10:34 +0000)]
net/mlx4_core: Change SYNC_TPT to be native (not wrapped)

The "wrapped" was incorrect, since no wrapper function was defined.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jack Morgenstein [Tue, 15 May 2012 10:34:58 +0000 (10:34 +0000)]
net/mlx4_core: Fix init_port mask state for slaves

In function mlx4_INIT_PORT_wrapper, the port state mask for the
slave is only set if we are invoking the INIT_PORT fw command.

However, the reference count for the (initialized) port is
incremented anyway.

This creates a problem in that when we have multiple slaves,
then the CLOSE_PORT command will never be invoked. The
reason is that in the CLOSE_PORT wrapper, if the port-state
mask is zero for the slave (which it is), the wrapper returns
without doing anything. The only slave which will not return
immediately in the CLOSE_PORT wrapper is that slave for which
INIT_PORT was invoked.

The fix is to not have the port-state mask setting depend
on the logic for calling the INIT_PORT fw command.
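
The interaction between the reference count and the per-slave port-state mask can be modelled in user space. The sketch below shows the fixed behaviour as described above; `struct port_model`, `model_init_port` and `model_close_port` are hypothetical names chosen for illustration and do not reproduce the driver's wrappers.

```c
#include <stdbool.h>

#define MAX_SLAVES 8

/* Hypothetical user-space model of one port shared by several slaves. */
struct port_model {
    unsigned int init_refcount;        /* slaves that initialized the port */
    bool slave_port_up[MAX_SLAVES];    /* per-slave port-state mask bit */
    unsigned int fw_init_calls;        /* times the INIT_PORT fw command ran */
    unsigned int fw_close_calls;       /* times the CLOSE_PORT fw command ran */
};

/* Fixed behaviour: the mask bit is set whether or not firmware is invoked. */
static void model_init_port(struct port_model *p, int slave)
{
    if (p->slave_port_up[slave])
        return;                        /* already initialized by this slave */
    if (p->init_refcount == 0)
        p->fw_init_calls++;            /* first user really initializes the port */
    p->slave_port_up[slave] = true;    /* no longer tied to the fw call */
    p->init_refcount++;
}

static void model_close_port(struct port_model *p, int slave)
{
    if (!p->slave_port_up[slave])
        return;                        /* this slave never initialized the port */
    if (p->init_refcount == 1)
        p->fw_close_calls++;           /* last user really closes the port */
    p->slave_port_up[slave] = false;
    p->init_refcount--;
}
```

With two slaves, both init calls set their mask bit, so both close calls decrement the count and the second one reaches the firmware close. If the mask bit were set only on the firmware path, the second slave's close would return early and the count would never drop to the point where the port is closed.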

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Or Gerlitz [Tue, 15 May 2012 10:34:57 +0000 (10:34 +0000)]
net/mlx4: Address build warnings on set but not used variables

Handle the compiler warnings on variables which are set but not used
by removing the relevant variable or casting a return value which is
ignored on purpose to void.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 15 May 2012 19:04:57 +0000 (15:04 -0400)]
xfrm: Convert several xfrm policy match functions to bool.

xfrm_selector_match
xfrm_sec_ctx_match
__xfrm4_selector_match
__xfrm6_selector_match

Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Sun, 13 May 2012 21:56:26 +0000 (21:56 +0000)]
net: Convert net_ratelimit uses to net_<level>_ratelimited

Standardize the net core ratelimited logging functions.

Coalesce formats, align arguments.
Change a printk then vprintk sequence to use printf extension %pV.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Sun, 13 May 2012 21:56:25 +0000 (21:56 +0000)]
net: Add net_ratelimited_function and net_<level>_ratelimited macros

__ratelimit() can be considered an inverted bool test because
it returns true when not ratelimited.  Several tests in the
kernel tree use this __ratelimit() function incorrectly.

No net_ratelimit uses are incorrect currently though.

Most uses of net_ratelimit are to log something via printk or
pr_<level>.

In order to minimize the uses of net_ratelimit, and to start
standardizing the code style used for __ratelimit() and net_ratelimit(),
add a net_ratelimited_function() macro and net_<level>_ratelimited()
logging macros similar to pr_<level>_ratelimited that use the global
net_ratelimit instead of a static per call site "struct ratelimit_state".
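
The shape of such macros can be sketched in user space. Everything prefixed `model_` below is a hypothetical stand-in (a fixed message budget instead of the real rate limiter, `fprintf` instead of `pr_err`); it only illustrates wrapping a logging call behind one global gate that returns true when the caller may proceed.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the global net_ratelimit(): a fixed budget of
 * messages. Like __ratelimit(), it returns true when NOT rate limited. */
static int model_budget = 3;
static int model_emitted;

static bool model_net_ratelimit(void)
{
    if (model_budget <= 0)
        return false;
    model_budget--;
    return true;
}

/* Stand-in for pr_err(): print to stderr and count emitted messages. */
static void model_pr_err(const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    model_emitted++;
}

/* Call 'function' with the given arguments only if the global gate allows it. */
#define model_net_ratelimited_function(function, ...) \
    do {                                              \
        if (model_net_ratelimit())                    \
            function(__VA_ARGS__);                    \
    } while (0)

/* Level-specific wrapper built on top of the generic macro. */
#define model_net_err_ratelimited(...) \
    model_net_ratelimited_function(model_pr_err, __VA_ARGS__)
```

Call sites then read `model_net_err_ratelimited("bad packet %d\n", id);` and no longer spell out the test themselves, which removes the opportunity to invert it by mistake.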

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alan Cox [Mon, 14 May 2012 03:57:31 +0000 (03:57 +0000)]
dummy: documentation is stale

dummy0/1/2 names are always used and there are options to set multiple
dummy devices. Remove the obsolete text

Resolves-bug: https://bugzilla.kernel.org/show_bug.cgi?id=42865
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Beulich [Tue, 15 May 2012 02:00:44 +0000 (02:00 +0000)]
xfrm_algo: drop an unnecessary inclusion

For several releases, this has not been needed anymore, as no helper
functions declared in net/ah.h get implemented by xfrm_algo.c anymore.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Beulich [Tue, 15 May 2012 01:57:44 +0000 (01:57 +0000)]
xfrm: make xfrm_algo.c a module

By making this a standalone config option (auto-selected as needed),
selecting CRYPTO from here rather than from XFRM (which is boolean)
allows the core crypto code to become a module again even when XFRM=y.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Manish chopra [Tue, 15 May 2012 01:13:39 +0000 (01:13 +0000)]
qlcnic-ethtool: set the ethtool_dump flag by ETH_FW_DUMP_DISABLE value that is zero, if firmware dump is disabled.

Signed-off-by: Manish chopra <manish.chopra@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Manish chopra [Tue, 15 May 2012 01:13:38 +0000 (01:13 +0000)]
linux/ethtool: Added macro ETH_FW_DUMP_DISABLE

The flag field of the ethtool_dump structure must be initialized to this macro
value, which is zero, if the firmware dump is disabled.
This lets us query the firmware dump capability [enable/disable] via ethtool.

Signed-off-by: Manish chopra <manish.chopra@qlogic.com>
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Giuseppe CAVALLARO [Sun, 13 May 2012 22:18:43 +0000 (22:18 +0000)]
stmmac: fix suspend/resume locking

Upon resume from standby, a possible interrupt-unsafe locking
scenario is reported when the kernel is configured with
CONFIG_PROVE_LOCKING. This patch fixes that in the PM driver
code by using the lock_irqsave/unlock_irqrestore variants.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Giuseppe CAVALLARO [Sun, 13 May 2012 22:18:42 +0000 (22:18 +0000)]
stmmac: add mixed burst for DMA

In mixed burst (MB) mode, the AHB master always initiates
fixed-size bursts when the DMA requests transfers
of size less than or equal to 16 beats.
This patch adds the MB support and the flag that can be
passed from the platform to select it.
MB mode can also improve performance on some platforms.

v2: fixed Coding Style

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Giuseppe CAVALLARO [Sun, 13 May 2012 22:18:41 +0000 (22:18 +0000)]
stmmac: extend mac addr reg and fix perfect filtering

This patch extends the number of MAC address registers
from 16 to 32: 16 additional registers are available in new
chips, which helps perfect filter mode for unicast.

This patch also fixes the perfect filtering mode by setting the
bit 31 in the MAC address registers.

v2: fixed Coding Style.

Signed-off-by: Gianni Antoniazzi <gianni.antoniazzi-ext@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steven King [Fri, 11 May 2012 06:49:46 +0000 (06:49 +0000)]
dm9000: some coldfire boards need this

Some coldfire boards (e.g. m5253demo) have a dm9000 onboard.

Signed-off-by: Steven King <sfking@fdwdc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sat, 12 May 2012 21:23:23 +0000 (21:23 +0000)]
codel: use u16 field instead of 31bits for rec_inv_sqrt

David pointed out gcc might generate poor code with 31bit fields.

Using u16 is more than enough and permits a better code output.

Also make the code intent more readable using constants, fixed point arithmetic
not being trivial for everybody.
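
As an illustration of the idea, a 32-bit fixed-point fraction can be kept in a 16-bit field by storing only its most significant bits and shifting on load. The macro and function names below are hypothetical and the sketch is not taken from the patch itself.

```c
#include <stdint.h>

/* Hypothetical constants: width of the stored field and the shift needed to
 * move between the stored 16-bit form and the 32-bit working form. */
#define MODEL_REC_INV_SQRT_BITS  (8 * sizeof(uint16_t))           /* 16 */
#define MODEL_REC_INV_SQRT_SHIFT (32 - MODEL_REC_INV_SQRT_BITS)   /* 16 */

/* Keep only the top 16 bits of a Q0.32 fraction. */
static uint16_t model_store_rec_inv_sqrt(uint32_t q32)
{
    return (uint16_t)(q32 >> MODEL_REC_INV_SQRT_SHIFT);
}

/* Expand the stored value back to Q0.32; the low 16 bits come back as zero. */
static uint32_t model_load_rec_inv_sqrt(uint16_t stored)
{
    return (uint32_t)stored << MODEL_REC_INV_SQRT_SHIFT;
}
```

The round trip loses at most the low 16 bits, i.e. an absolute error below 2^-16 on a value in [0, 1).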

Suggested-by: David Miller <davem@davemloft.net>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 14 May 2012 22:15:33 +0000 (18:15 -0400)]
Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge

Included changes:

* an improvement to avoid linearising the whole received packet when not needed
* an improvement for client traffic rerouting after roaming
* a fix for the local translation table state-machine
* minor cleanups and fixes

David S. Miller [Mon, 14 May 2012 22:00:48 +0000 (18:00 -0400)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next

Sasha Levin [Mon, 14 May 2012 11:57:06 +0000 (11:57 +0000)]
net: codel: fix build errors

Fix the following build error:

net/sched/sch_fq_codel.c: In function 'fq_codel_dump_stats':
net/sched/sch_fq_codel.c:464:3: error: unknown field 'qdisc_stats' specified in initializer
net/sched/sch_fq_codel.c:464:3: warning: missing braces around initializer
net/sched/sch_fq_codel.c:464:3: warning: (near initialization for 'st.<anonymous>')
net/sched/sch_fq_codel.c:465:3: error: unknown field 'qdisc_stats' specified in initializer
net/sched/sch_fq_codel.c:465:3: warning: excess elements in struct initializer
net/sched/sch_fq_codel.c:465:3: warning: (near initialization for 'st')
net/sched/sch_fq_codel.c:466:3: error: unknown field 'qdisc_stats' specified in initializer
net/sched/sch_fq_codel.c:466:3: warning: excess elements in struct initializer
net/sched/sch_fq_codel.c:466:3: warning: (near initialization for 'st')
net/sched/sch_fq_codel.c:467:3: error: unknown field 'qdisc_stats' specified in initializer
net/sched/sch_fq_codel.c:467:3: warning: excess elements in struct initializer
net/sched/sch_fq_codel.c:467:3: warning: (near initialization for 'st')
make[1]: *** [net/sched/sch_fq_codel.o] Error 1

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Geert Uytterhoeven [Mon, 14 May 2012 09:47:05 +0000 (09:47 +0000)]
net/codel: Add missing #include <linux/prefetch.h>

m68k allmodconfig:

net/sched/sch_codel.c: In function ‘dequeue’:
net/sched/sch_codel.c:70: error: implicit declaration of function ‘prefetch’
make[1]: *** [net/sched/sch_codel.o] Error 1

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Antonio Quartulli [Sun, 15 Jan 2012 23:36:58 +0000 (00:36 +0100)]
batman-adv: unset the TT_CLIENT_PENDING flag if the new local entry already exists

When trying to add a new tt_local_entry, if such entry already exists, we have
to ensure that the TT_CLIENT_PENDING flag is not set, otherwise the entry will
be deleted soon.

Reported-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Sven Eckelmann [Mon, 2 Apr 2012 17:31:26 +0000 (19:31 +0200)]
batman-adv: README cleanups

- Add routing_algo

- Remove date from README:
The date has to be updated when a patch touches the README. Therefore, nearly
every feature will modify this date. It often happens that more than one
feature is in development or waiting on the mailing list at the same time. This
creates merge conflicts when applying a patchset.

The date itself doesn't provide any additional information when this file is
only available in a release tarball or as part of a SCM repository.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Sven Eckelmann [Fri, 30 Mar 2012 16:44:09 +0000 (18:44 +0200)]
batman-adv: Start new development cycle

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Sven Eckelmann [Mon, 26 Mar 2012 14:22:45 +0000 (16:22 +0200)]
batman-adv: use shorter pr_warn instead of pr_warning

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Marek Lindner [Sat, 17 Mar 2012 07:28:33 +0000 (15:28 +0800)]
batman-adv: refactor window_protected to avoid unnecessary return statement

Reported-by: David Laight <David.Laight@aculab.com>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Marek Lindner [Sat, 17 Mar 2012 07:28:32 +0000 (15:28 +0800)]
batman-adv: prepare lq_update_lock to be shared among different protocols

Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Antonio Quartulli [Fri, 16 Mar 2012 17:03:28 +0000 (18:03 +0100)]
batman-adv: improve unicast packet (re)routing

In case of a client X roaming from a generic node A to another node B, it is
possible that a third node C gets A's OGM but not B's. At this point in time, if
C wants to send data to X it will send a unicast packet destined to A. The
packet header will contain A's last ttvn (C got A's OGM and so it knows it).

The packet will travel towards A without being intercepted because the ttvn
contained in its header is the newest for A.

Once A receives the packet, A's state will not report being in a "roaming
phase" (because, after a roaming, once A sends out its OGM, all the changes are
committed and the node is considered not to be in the roaming state anymore)
and it will match the ttvn carried by the packet. Therefore there is no reason
for A to try to alter the packet's route, thus dropping the packet because the
destination client is not there anymore.

However, C is well aware that its routing information towards the client X is
outdated as it received an OGM from A saying that the client roamed away.
Thanks to this detail, this patch introduces a small change in behaviour: as
long as C is in the state of not knowing the new location of client X it will
forward the traffic to its last known location using ttvn-1 of the destination.
By using an older ttvn node A will be forced to re-route the packet.
Intermediate nodes are also allowed to update the packet's destination as long
as they have the information about the client's new location.
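
The decision described above can be sketched as two small helpers. This is a simplified model under stated assumptions: `model_pick_packet_ttvn` and `model_must_reroute` are hypothetical names, ttvn is treated as an 8-bit wrapping counter, and the real code paths involve more state than shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sender side (node C): if the destination client is known to have roamed
 * away and its new location is still unknown, stamp the packet with ttvn-1. */
static uint8_t model_pick_packet_ttvn(uint8_t last_known_ttvn,
                                      bool client_roamed_away)
{
    if (client_roamed_away)
        return (uint8_t)(last_known_ttvn - 1);   /* wraps from 0 to 255 */
    return last_known_ttvn;
}

/* Receiver side (node A): re-route when in a roaming phase or when the
 * packet carries a ttvn older than the current one (wrap-aware compare). */
static bool model_must_reroute(uint8_t packet_ttvn, uint8_t current_ttvn,
                               bool in_roaming_phase)
{
    int8_t diff = (int8_t)(uint8_t)(packet_ttvn - current_ttvn);

    return in_roaming_phase || diff < 0;
}
```

With `last_known_ttvn == 7` and the client roamed away, C sends ttvn 6; A sees 6 as older than 7 and re-routes instead of dropping, even though A is no longer in its roaming phase.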

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Antonio Quartulli [Fri, 16 Mar 2012 10:52:31 +0000 (11:52 +0100)]
batman-adv: avoid skb_linearise() if not needed

Whenever we want to access headers only, we do not need to linearise the whole
packet. Instead we can use pskb_may_pull()

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Joe Perches [Fri, 11 May 2012 12:21:06 +0000 (12:21 +0000)]
etherdevice: Remove now unused compare_ether_addr_64bits

Move and invert the logic from the otherwise unused
compare_ether_addr_64bits to ether_addr_equal_64bits.

Neaten the logic in is_etherdev_addr.
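
For readers unfamiliar with the 64-bit comparison trick, a portable user-space sketch follows. It assumes both 6-byte addresses sit in buffers with at least 2 readable padding bytes, and `model_ether_addr_equal_64bits` is a hypothetical name; the in-kernel helper may differ in detail.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Compare two Ethernet addresses stored in 8-byte buffers (6 address bytes
 * followed by 2 padding bytes) with one 64-bit XOR. Returns true on a match,
 * i.e. the inverse of a compare-style "non-zero means different" result. */
static bool model_ether_addr_equal_64bits(const uint8_t a[8], const uint8_t b[8])
{
    const uint16_t probe = 1;
    uint64_t x, y, fold;

    memcpy(&x, a, sizeof(x));          /* memcpy avoids unaligned access */
    memcpy(&y, b, sizeof(y));
    fold = x ^ y;

    /* Discard the 2 padding bytes, which are the high-order bytes on little
     * endian hosts and the low-order bytes on big endian hosts. */
    if (*(const uint8_t *)&probe)      /* little endian host */
        return (fold << 16) == 0;
    return (fold >> 16) == 0;
}
```

Differences confined to the two padding bytes are ignored, so the caller only has to guarantee that those bytes are readable, not that they hold any particular value.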

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Carolyn Wyborny [Fri, 6 Apr 2012 23:25:19 +0000 (23:25 +0000)]
igb: Add Support for new i210/i211 devices.

This patch adds new initialization functions and device support
for i210 and i211 devices.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Carolyn Wyborny [Sun, 4 Mar 2012 03:26:26 +0000 (03:26 +0000)]
igb: Add function and pointers for 82580 low power state settings.

82580 and later parts did not have low power setting functions.  This patch
adds the specific functions, pointers and assignments for these low
power settings.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Eric Dumazet [Fri, 11 May 2012 09:30:50 +0000 (09:30 +0000)]
fq_codel: Fair Queue Codel AQM

Fair Queue Codel packet scheduler

Principles :

- Packets are classified (internal classifier or external) on flows.
- This is a Stochastic model (as we use a hash, several flows might
                              be hashed on same slot)
- Each flow has a CoDel managed queue.
- Flows are linked onto two (Round Robin) lists,
  so that new flows have priority on old ones.

- For a given flow, packets are not reordered (CoDel uses a FIFO)
- head drops only.
- ECN capability is on by default.
- Very low memory footprint (64 bytes per flow)

tc qdisc ... fq_codel [ limit PACKETS ] [ flows number ]
                      [ target TIME ] [ interval TIME ] [ noecn ]
                      [ quantum BYTES ]

defaults : 1024 flows, 10240 packets limit, quantum : device MTU
           target : 5ms (CoDel default)
           interval : 100ms (CoDel default)

Impressive results on load :

class htb 1:1 root leaf 10: prio 0 quantum 1514 rate 200000Kbit ceil 200000Kbit burst 1475b/8 mpu 0b overhead 0b cburst 1475b/8 mpu 0b overhead 0b level 0
 Sent 43304920109 bytes 33063109 pkt (dropped 0, overlimits 0 requeues 0)
 rate 201691Kbit 28595pps backlog 0b 312p requeues 0
 lended: 33063109 borrowed: 0 giants: 0
 tokens: -912 ctokens: -912

class fq_codel 10:1735 parent 10:
 (dropped 1292, overlimits 0 requeues 0)
 backlog 15140b 10p requeues 0
  deficit 1514 count 1 lastcount 1 ldelay 7.1ms
class fq_codel 10:4524 parent 10:
 (dropped 1291, overlimits 0 requeues 0)
 backlog 16654b 11p requeues 0
  deficit 1514 count 1 lastcount 1 ldelay 7.1ms
class fq_codel 10:4e74 parent 10:
 (dropped 1290, overlimits 0 requeues 0)
 backlog 6056b 4p requeues 0
  deficit 1514 count 1 lastcount 1 ldelay 6.4ms dropping drop_next 92.0ms
class fq_codel 10:628a parent 10:
 (dropped 1289, overlimits 0 requeues 0)
 backlog 7570b 5p requeues 0
  deficit 1514 count 1 lastcount 1 ldelay 5.4ms dropping drop_next 90.9ms
class fq_codel 10:a4b3 parent 10:
 (dropped 302, overlimits 0 requeues 0)
 backlog 16654b 11p requeues 0
  deficit 1514 count 1 lastcount 1 ldelay 7.1ms
class fq_codel 10:c3c2 parent 10:
 (dropped 1284, overlimits 0 requeues 0)
 backlog 13626b 9p requeues 0
  deficit 1514 count 1 lastcount 1 ldelay 5.9ms
class fq_codel 10:d331 parent 10:
 (dropped 299, overlimits 0 requeues 0)
 backlog 15140b 10p requeues 0
  deficit 1514 count 1 lastcount 1 ldelay 7.0ms
class fq_codel 10:d526 parent 10:
 (dropped 12160, overlimits 0 requeues 0)
 backlog 35870b 211p requeues 0
  deficit 1508 count 12160 lastcount 1 ldelay 15.3ms dropping drop_next 247us
class fq_codel 10:e2c6 parent 10:
 (dropped 1288, overlimits 0 requeues 0)
 backlog 15140b 10p requeues 0
  deficit 1514 count 1 lastcount 1 ldelay 7.1ms
class fq_codel 10:eab5 parent 10:
 (dropped 1285, overlimits 0 requeues 0)
 backlog 16654b 11p requeues 0
  deficit 1514 count 1 lastcount 1 ldelay 5.9ms
class fq_codel 10:f220 parent 10:
 (dropped 1289, overlimits 0 requeues 0)
 backlog 15140b 10p requeues 0
  deficit 1514 count 1 lastcount 1 ldelay 7.1ms

qdisc htb 1: root refcnt 6 r2q 10 default 1 direct_packets_stat 0 ver 3.17
 Sent 43331086547 bytes 33092812 pkt (dropped 0, overlimits 66063544 requeues 71)
 rate 201697Kbit 28602pps backlog 0b 260p requeues 71
qdisc fq_codel 10: parent 1:1 limit 10240p flows 65536 target 5.0ms interval 100.0ms ecn
 Sent 43331086547 bytes 33092812 pkt (dropped 949359, overlimits 0 requeues 0)
 rate 201697Kbit 28602pps backlog 189352b 260p requeues 0
  maxpacket 1514 drop_overlimit 0 new_flow_count 5582 ecn_mark 125593
  new_flows_len 0 old_flows_len 11

PING 172.30.42.18 (172.30.42.18) 56(84) bytes of data.
64 bytes from 172.30.42.18: icmp_req=1 ttl=64 time=0.227 ms
64 bytes from 172.30.42.18: icmp_req=2 ttl=64 time=0.165 ms
64 bytes from 172.30.42.18: icmp_req=3 ttl=64 time=0.166 ms
64 bytes from 172.30.42.18: icmp_req=4 ttl=64 time=0.151 ms
64 bytes from 172.30.42.18: icmp_req=5 ttl=64 time=0.164 ms
64 bytes from 172.30.42.18: icmp_req=6 ttl=64 time=0.172 ms
64 bytes from 172.30.42.18: icmp_req=7 ttl=64 time=0.175 ms
64 bytes from 172.30.42.18: icmp_req=8 ttl=64 time=0.183 ms
64 bytes from 172.30.42.18: icmp_req=9 ttl=64 time=0.158 ms
64 bytes from 172.30.42.18: icmp_req=10 ttl=64 time=0.200 ms

10 packets transmitted, 10 received, 0% packet loss, time 8999ms
rtt min/avg/max/mdev = 0.151/0.176/0.227/0.022 ms

Much better than SFQ because of priority given to new flows, and fast
path dirtying fewer cache lines.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sat, 12 May 2012 03:32:13 +0000 (03:32 +0000)]
codel: use Newton method instead of sqrt() and divides

As Van pointed out, interval/sqrt(count) can be implemented using
multiplies only.

http://en.wikipedia.org/wiki/Methods_of_computing_square_roots#Iterative_methods_for_reciprocal_square_roots

This patch implements the Newton method and reciprocal divide.

Total cost is 15 cycles instead of 120 on my Corei5 machine (64bit
kernel).

There is a small 'error' for count values < 5, but we don't really care.

I reuse a hole in struct codel_vars :
 - pack the dropping boolean into one bit
 - use 31bit to store the reciprocal value of sqrt(count).
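
The approach can be sketched in user space as follows. This is an illustrative model, not the patch: it assumes a full 32-bit Q0.32 fraction rather than the 31-bit field mentioned above, and `struct model_codel_vars` and the `model_*` functions are hypothetical names. One Newton step x' = x * (3 - count * x^2) / 2 is applied each time count grows by one, and the control law then needs only a multiply and a shift.

```c
#include <stdint.h>

/* Hypothetical model: rec_inv_sqrt holds 1/sqrt(count) as a Q0.32 fraction. */
struct model_codel_vars {
    uint32_t count;
    uint32_t rec_inv_sqrt;
};

static void model_codel_start(struct model_codel_vars *v)
{
    v->count = 1;
    v->rec_inv_sqrt = UINT32_MAX;      /* closest Q0.32 value to 1/sqrt(1) */
}

/* One Newton step using multiplies and shifts only. */
static void model_newton_step(struct model_codel_vars *v)
{
    uint64_t x = v->rec_inv_sqrt;
    uint64_t x2 = (x * x) >> 32;                            /* x^2 in Q0.32 */
    uint64_t val = (3ULL << 32) - (uint64_t)v->count * x2;  /* 3 - count*x^2 */

    val >>= 2;                         /* keep the next product within 64 bits */
    v->rec_inv_sqrt = (uint32_t)((val * x) >> 31);  /* 31 = 32 - 2 + 1 (the /2) */
}

/* Called whenever the drop count is incremented by one. */
static void model_codel_count_up(struct model_codel_vars *v)
{
    v->count++;
    model_newton_step(v);
}

/* interval / sqrt(count) as a reciprocal multiply: no divide, no sqrt(). */
static uint32_t model_control_law(uint32_t t, uint32_t interval,
                                  uint32_t rec_inv_sqrt)
{
    return t + (uint32_t)(((uint64_t)interval * rec_inv_sqrt) >> 32);
}
```

Because each step starts from the previous estimate, a single iteration per increment is enough once count is past the first few values, which matches the note about a small error for small counts.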

Suggested-by: Van Jacobson <van@pollere.net>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Dave Taht <dave.taht@bufferbloat.net>
Cc: Kathleen Nichols <nichols@pollere.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Matt Mathis <mattmathis@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jussi Kivilinna [Fri, 11 May 2012 22:17:57 +0000 (22:17 +0000)]
rndis_wlan: cleanup: change oid from __le32 to u32 in various places

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jussi Kivilinna [Fri, 11 May 2012 22:17:50 +0000 (22:17 +0000)]
rndis_host: cleanup: change oid from __le32 to u32 in rndis_query()

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jussi Kivilinna [Fri, 11 May 2012 22:17:42 +0000 (22:17 +0000)]
rndis_wlan: cleanup: byteswap data from device instead of RNDIS_* defines

All other values from device provided buffer are byteswapped, so it seems more
logical to do same for these.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jussi Kivilinna [Fri, 11 May 2012 22:17:34 +0000 (22:17 +0000)]
rndis_host: cleanup: byteswap data from device instead of RNDIS_* defines

All other values from device provided buffer are byteswapped, so it seems
more logical to do same for these.

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Walleij [Fri, 11 May 2012 22:17:26 +0000 (22:17 +0000)]
usb/net: rndis: move bus message definition

This moves the bus message definition to land together with the
other message types. This message is not used in the kernel but
I'm keeping it anyway.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Walleij [Fri, 11 May 2012 22:17:19 +0000 (22:17 +0000)]
usb/net: rndis: fixup a few name prefixes

This switches a horde of NDIS_*-prefixed variables to the RNDIS_*
prefix. Most of them aren't used much and cause no changes.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Walleij [Fri, 11 May 2012 22:17:07 +0000 (22:17 +0000)]
usb/net: rndis: merge command codes

Switch the hyperv filter and rndis gadget driver to use the same command
enumerators as the other drivers and delete the surplus command codes.

Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Walleij [Fri, 11 May 2012 22:16:54 +0000 (22:16 +0000)]
usb/net: rndis: move and namespace PnP defines

This moves the PnP OID definitions to the RNDIS_* namespace
and puts them in the next falling slot in the list. Oh, the comment
above the PnP defines was referring to some obsolete or out-of-tree
driver so removed it, and removed my own comments telling where each
header segment came from as well, we have moved everything around by
this point anyway.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Walleij [Fri, 11 May 2012 22:16:47 +0000 (22:16 +0000)]
usb/net: rndis: delete duplicate packet types

The NDIS_*-prefixed packet types have equivalent RNDIS_*-prefixed
types; besides, nothing in the kernel uses these defines.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Walleij [Fri, 11 May 2012 22:16:39 +0000 (22:16 +0000)]
usb/net: rndis: merge media type definitions

Let's have a unified table of RNDIS media. We used to have a similar
table with NDIS_* prefix from the gadget driver, but since we're only
using RNDIS in the kernel (IIRC NDIS, non-remote, is for the windows-
internal network drivers so what do we care) let's prefix everything
with RNDIS. Some of the definitions were conflicting, in one of the
defines 0x0B is bearer "CO WAN" and in two others "BPC". Well I took
the majority vote. Two definitions of medium 0x09 call it "wireless
WAN" and one votes for "wireless LAN", but in this case I am sticking
with the minority, "Wide Area Network" does not make much sense in
this case as far as I can tell.

NOTE: latin singular and plural is so screwed up in these defines
that it makes my eyes bleed. But I will not attempt to submit a
patch converting all use of _MEDIA_ to _MEDIUM_ while I can probably
tell from the semantics of the code that RNDIS_MEDIA_STATE_CONNECTED
is most probably (erroneously) referring to a singular, unless it
can return an array of connected media. I suspect these erroneous
plurals are used in documentation and such so I don't want to
mess around with things for no functional change.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Walleij [Fri, 11 May 2012 22:16:30 +0000 (22:16 +0000)]
usb/net: rndis: group all status codes together

Move all RNDIS status codes so they appear in rising order and
in one place of the header file.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Walleij [Fri, 11 May 2012 22:16:23 +0000 (22:16 +0000)]
usb/net: rndis: delete surplus defines

These defines are not used in the kernel, and they have duplicate
definitions under the RNDIS_* prefix.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Walleij [Fri, 11 May 2012 22:16:16 +0000 (22:16 +0000)]
usb/net: rndis: merge duplicate 802_* OIDs

The 802_* network OIDs were duplicated, so let's merge them and
use the RNDIS_* prefixed definitions from the hyperV driver.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Walleij [Fri, 11 May 2012 22:16:08 +0000 (22:16 +0000)]
usb/net: rndis: eliminate first set of duplicate OIDs

The RNDIS protocol contains a vast number of Object ID:s (OIDs).
The current definitions had multiple definitions of these ID:s,
let's use the nicely RNDIS_*-prefixed defines from the HyperV
implementation, rename everywhere they're used, and copy+rename
the few that were missing from this list of objects.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Walleij [Fri, 11 May 2012 22:15:59 +0000 (22:15 +0000)]
usb/net: rndis: remove ambiguous status codes

The RNDIS status codes are redefined with much strange ifdeffery
and only one of these codes was used in the hyperv driver, and
there it is very clearly referring to the RNDIS variant, not some
other status. So clarify this by explicitly using the RNDIS_*
prefixed status code in the hyperv driver and delete the
duplicate defines.

Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agousb/net: rndis: break out <linux/rndis.h> defines
Linus Walleij [Fri, 11 May 2012 22:15:50 +0000 (22:15 +0000)]
usb/net: rndis: break out <linux/rndis.h> defines

As a first step to consolidate the RNDIS implementations, break out
a common file with all the #defines and move it to <linux/rndis.h>.

This also deletes the immediately duplicated defines in the
<linux/rndis.h> file, which yield a lot of compilation warnings.

Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agousb/net: rndis: inline the cpu_to_le32() macro
Linus Walleij [Fri, 11 May 2012 22:15:39 +0000 (22:15 +0000)]
usb/net: rndis: inline the cpu_to_le32() macro

The header file <linux/usb/rndis_host.h> used a number of #defines
that included the cpu_to_le32() macro to assure the result will be
in LE endianness. Inlining this into the code instead of using it
in the code definitions yields consolidation opportunities later
on as you will see in the following patches. The individual
drivers also used local defines - all are switched over to the
pattern of doing the conversion at the call sites instead.
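
A minimal userspace sketch of the pattern described above, not the driver code itself. sketch_cpu_to_le32() is a hypothetical stand-in for the kernel's cpu_to_le32(), and the constant name and value are illustrative assumptions. The define stays a plain CPU-order number and the conversion happens where the value is stored:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's cpu_to_le32(): returns a value
 * whose in-memory byte order is little-endian on any host. */
static uint32_t sketch_cpu_to_le32(uint32_t v)
{
	uint8_t b[4] = {
		(uint8_t)v, (uint8_t)(v >> 8),
		(uint8_t)(v >> 16), (uint8_t)(v >> 24)
	};
	uint32_t out;

	memcpy(&out, b, sizeof(out));
	return out;
}

/* The define is a plain CPU-order constant (illustrative value) ... */
#define SKETCH_MSG_INIT 0x00000002u

struct sketch_msg {
	uint32_t msg_type;	/* little-endian on the wire */
};

/* ... and the conversion is done at the call site. */
static void sketch_fill(struct sketch_msg *m)
{
	m->msg_type = sketch_cpu_to_le32(SKETCH_MSG_INIT);
}
```

Because the constant itself carries no endianness, the same define can be shared by code that compares it against already-converted CPU-order values.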

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet/ipv6/af_inet6.c: checkpatch cleanup
Eldad Zack [Sat, 5 May 2012 10:13:53 +0000 (10:13 +0000)]
net/ipv6/af_inet6.c: checkpatch cleanup

af_inet6.c:80: ERROR: do not initialise statics to 0 or NULL
af_inet6.c:259: ERROR: spaces required around that '=' (ctx:VxV)
af_inet6.c:394: WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
af_inet6.c:412: WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
af_inet6.c:422: ERROR: do not use assignment in if condition
af_inet6.c:425: ERROR: do not use assignment in if condition
af_inet6.c:433: ERROR: do not use assignment in if condition
af_inet6.c:437: WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
af_inet6.c:446: ERROR: spaces required around that '=' (ctx:VxV)
af_inet6.c:478: WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
af_inet6.c:485: ERROR: that open brace { should be on the previous line
af_inet6.c:485: ERROR: space required before the open parenthesis '('
af_inet6.c:513: WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
af_inet6.c:629: WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
af_inet6.c:647: WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
af_inet6.c:687: WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
af_inet6.c:709: WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
af_inet6.c:1073: ERROR: space required before the open parenthesis '('

Signed-off-by: Eldad Zack <eldad@fogrefinery.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet: of/phy: fix build error when phylib is built as a module
Bjørn Mork [Fri, 11 May 2012 05:47:01 +0000 (05:47 +0000)]
net: of/phy: fix build error when phylib is built as a module

CONFIG_OF_MDIO is tristate and will be m if PHYLIB is m.  Use
IS_ENABLED macro to prevent build error:

 ERROR: "of_mdio_find_bus" [drivers/net/phy/mdio-mux.ko] undefined!
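
The difference between a plain #ifdef and an IS_ENABLED-style test can be sketched in userspace. The SK_* macros below are modeled on the preprocessor technique used by the kernel's helpers but are renamed stand-ins, and CONFIG_SKETCH_OPT is a hypothetical option. For a tristate option set to "m", only the _MODULE variant of the symbol is defined:

```c
/* Userspace sketch; SK_* names are stand-ins, not kernel macros. */
#define SK_PLACEHOLDER_1 0,
#define SK_SECOND(ignored, val, ...) val
#define SK_DEFINED(x) SK_DEFINED2(x)
#define SK_DEFINED2(val) SK_DEFINED3(SK_PLACEHOLDER_##val)
#define SK_DEFINED3(arg1_or_junk) SK_SECOND(arg1_or_junk 1, 0, 0)
#define SK_IS_ENABLED(option) \
	(SK_DEFINED(option) || SK_DEFINED(option##_MODULE))

/* Tristate "m": only the _MODULE variant is defined (hypothetical option). */
#define CONFIG_SKETCH_OPT_MODULE 1

#ifdef CONFIG_SKETCH_OPT
static const int seen_by_ifdef = 1;
#else
static const int seen_by_ifdef = 0;	/* plain #ifdef misses "m" */
#endif

/* Evaluates to 1 for both "y" and "m". */
static const int seen_by_is_enabled = SK_IS_ENABLED(CONFIG_SKETCH_OPT);
```

With the plain #ifdef, a header would fall back to its "not configured" branch even though the code is built as a module, which matches the kind of undefined-symbol error quoted above.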

Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: David Daney <david.daney@cavium.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Acked-by: David Daney <david.daney@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoMerge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge
David S. Miller [Fri, 11 May 2012 21:57:52 +0000 (17:57 -0400)]
Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge

Included changes:

* fix a little bug in the DHCP packet snooping introduced so far
* minor fixes and cleanups
* minor routing protocol API cleanups
* add a new contributor name to translation-table.{c,h}
* update copyright years in file headers
* minor improvement for the routing algorithm

13 years agobatman-adv: add contributor name
Antonio Quartulli [Wed, 14 Mar 2012 12:03:01 +0000 (13:03 +0100)]
batman-adv: add contributor name

translation_table.{c,h} have been heavily modified by another contributor and
for legal purposes it is better to include his name into the contributor list

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
13 years agobatman-adv: update copyright years
Antonio Quartulli [Wed, 14 Mar 2012 11:57:02 +0000 (12:57 +0100)]
batman-adv: update copyright years

update copyright years in order to include 2012

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
13 years agobatman-adv: fix checkpatch string complaint
Marek Lindner [Sat, 17 Mar 2012 07:28:34 +0000 (15:28 +0800)]
batman-adv: fix checkpatch string complaint

Regression introduced by: f76d019194e0a88c57371df169ecc979690a04c2

Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
13 years agobatman-adv: avoid temporary routing loops by being strict on forwarded OGMs
Marek Lindner [Sat, 10 Mar 2012 22:17:53 +0000 (06:17 +0800)]
batman-adv: avoid temporary routing loops by being strict on forwarded OGMs

batman-adv would forward OGMs from non-besthops while replacing the TQ
and TTL values with the values from the best hop. In certain corner cases
this leads to a temporary routing loop.
This patch changes this behavior: Only packets from best next hops are
forwarded - TQ and TTL values won't be replaced anymore. However, the protocol
needs to rebroadcast OGMs from single hop neighbors regardless of whether or
not they are the best hop. To handle this case a new flag is introduced to
alert neighboring nodes about the forwarded OGM that is not from my best
next hop. It is to be discarded by all nodes except for the one originating
the OGM.

Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Acked-by: Daniele Furlan <daniele.furlan@gmail.com>
Tested-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
13 years agobatman-adv: Adding hard_iface specific sysfs wrapper macros for UINT
Linus Luessing [Sat, 10 Mar 2012 22:17:52 +0000 (06:17 +0800)]
batman-adv: Adding hard_iface specific sysfs wrapper macros for UINT

This allows us to easily add a sysfs parameter for an unsigned int
later, which is not for a batman mesh interface (e.g. bat0), but for a
common interface instead. It allows reading and writing an atomic_t in
hard_iface (instead of bat_priv compared to the mesh variant).

Developed by Linus during a 6 months trainee study period in Ascom
(Switzerland) AG.

Signed-off-by: Linus Luessing <linus.luessing@web.de>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
13 years agobatman-adv: rename sysfs macros to reflect the soft-interface dependency
Marek Lindner [Sat, 10 Mar 2012 22:17:51 +0000 (06:17 +0800)]
batman-adv: rename sysfs macros to reflect the soft-interface dependency

Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
13 years agobatman-adv: refactoring API: find generalized name for bat_ogm_update_mac callback
Marek Lindner [Sat, 10 Mar 2012 22:17:50 +0000 (06:17 +0800)]
batman-adv: refactoring API: find generalized name for bat_ogm_update_mac callback

Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
13 years agobatman-adv: ignore protocol packets if the interface did not enable this protocol
Marek Lindner [Sat, 10 Mar 2012 22:17:49 +0000 (06:17 +0800)]
batman-adv: ignore protocol packets if the interface did not enable this protocol

Reported-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
13 years agobatman-adv: split neigh_new function into generic and batman iv specific parts
Marek Lindner [Thu, 1 Mar 2012 07:35:21 +0000 (15:35 +0800)]
batman-adv: split neigh_new function into generic and batman iv specific parts

Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Acked-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
13 years agobatman-adv: replace HZ calculations with jiffies_to_msecs()
Marek Lindner [Thu, 1 Mar 2012 07:35:20 +0000 (15:35 +0800)]
batman-adv: replace HZ calculations with jiffies_to_msecs()

Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Acked-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
13 years agobatman-adv: rename last_valid to last_seen
Marek Lindner [Thu, 1 Mar 2012 07:35:19 +0000 (15:35 +0800)]
batman-adv: rename last_valid to last_seen

Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Acked-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
13 years agobatman-adv: register batman ogm receive function during protocol init
Marek Lindner [Sun, 4 Mar 2012 08:56:25 +0000 (16:56 +0800)]
batman-adv: register batman ogm receive function during protocol init

The B.A.T.M.A.N. IV OGM receive function still was hard-coded although
it is a routing protocol specific function. This patch takes advantage
of the dynamic packet handler registration to remove the hard-coded
function calls.

Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Acked-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
13 years agobatman-adv: introduce packet type handler array for incoming packets
Marek Lindner [Thu, 1 Mar 2012 07:35:17 +0000 (15:35 +0800)]
batman-adv: introduce packet type handler array for incoming packets

The packet handler array replaces the growing switch statement, thus
dealing with incoming packets in a more efficient way. It also adds
the possibility to register packet handlers on the fly.
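
A minimal sketch of a handler table replacing a switch statement. All names are hypothetical, and the assumption that the packet type is the first byte of the packet is made only for illustration:

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_TYPES 256

typedef int (*sketch_rx_handler)(const uint8_t *pkt, size_t len);

/* One slot per packet type; NULL means "no handler registered". */
static sketch_rx_handler sketch_handlers[SKETCH_MAX_TYPES];

/* Register a handler at run time; refuses to overwrite an existing one. */
static int sketch_register(uint8_t type, sketch_rx_handler fn)
{
	if (sketch_handlers[type])
		return -1;
	sketch_handlers[type] = fn;
	return 0;
}

/* Dispatch on the type byte (assumed to be pkt[0]); unknown types fail. */
static int sketch_receive(const uint8_t *pkt, size_t len)
{
	if (len < 1 || !sketch_handlers[pkt[0]])
		return -1;
	return sketch_handlers[pkt[0]](pkt, len);
}

/* Example handler: reports the packet length as its result. */
static int sketch_ogm_rx(const uint8_t *pkt, size_t len)
{
	(void)pkt;
	return (int)len;
}
```

Dispatch is a single indexed load instead of a chain of comparisons, and a routing protocol can install its own receive function during init without touching the dispatcher.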

Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Acked-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
13 years agobatman-adv: introduce is_single_hop_neigh variable to increase readability
Marek Lindner [Thu, 1 Mar 2012 07:35:16 +0000 (15:35 +0800)]
batman-adv: introduce is_single_hop_neigh variable to increase readability

Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Acked-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
13 years agobatman-adv: fix wrong dhcp option list browsing
Antonio Quartulli [Mon, 27 Feb 2012 10:29:53 +0000 (11:29 +0100)]
batman-adv: fix wrong dhcp option list browsing

In is_type_dhcprequest(), while parsing a DHCP message, if the entry we found in
the option list is neither a padding nor the dhcp-type, we have to ignore it and
jump as many bytes as its length + 1. The "+ 1" byte is given by the subtype
field itself that has to be jumped too.
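
The option walk can be sketched as follows. This is an illustrative userspace function, not the code of is_type_dhcprequest(); the function name is hypothetical, and reading the "+ 1" as the byte that carries the option length is an assumption. The option codes (0 for pad, 53 for message type, 255 for end) are the standard DHCP values:

```c
#include <stddef.h>
#include <stdint.h>

#define OPT_PAD  0	/* padding, single byte without a length */
#define OPT_TYPE 53	/* DHCP message type */
#define OPT_END  255	/* end of the option list */

/* Returns the DHCP message type, or -1 if it is absent or malformed. */
static int sketch_dhcp_msg_type(const uint8_t *p, size_t len)
{
	size_t i = 0;

	while (i < len && p[i] != OPT_END) {
		if (p[i] == OPT_PAD) {
			i++;
			continue;
		}
		if (i + 1 >= len)	/* no room for the length byte */
			return -1;
		if (p[i] == OPT_TYPE) {
			if (p[i + 1] != 1 || i + 2 >= len)
				return -1;
			return p[i + 2];
		}
		/* Skip the code byte, then the length byte plus the payload. */
		i += 1;
		i += 1 + (size_t)p[i];
	}
	return -1;
}
```

Skipping only the payload length without the extra byte would leave the cursor one byte short and misread payload data as the next option code.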

Reported-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
13 years ago6lowpan: IPv6 link local address
alex.bluesman.smirnov@gmail.com [Thu, 10 May 2012 03:25:52 +0000 (03:25 +0000)]
6lowpan: IPv6 link local address

According to the RFC4944 (Transmission of IPv6 Packets over
IEEE 802.15.4 Networks), chapter 7:

The IPv6 link-local address [RFC4291] for an IEEE 802.15.4 interface
is formed by appending the Interface Identifier, as defined above, to
the prefix FE80::/64.

  10 bits            54 bits                  64 bits
+----------+-----------------------+----------------------------+
|1111111010|         (zeros)       |    Interface Identifier    |
+----------+-----------------------+----------------------------+

This patch adds IPv6 address generation support for the 6lowpan
interfaces.
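
The layout in the diagram above can be sketched in a few lines. The function name is hypothetical and the 64-bit Interface Identifier is taken as given; how it is derived from the 802.15.4 address is outside this sketch:

```c
#include <stdint.h>
#include <string.h>

/* Build FE80::/64 followed by a 64-bit Interface Identifier. */
static void sketch_link_local(uint8_t addr[16], const uint8_t iid[8])
{
	memset(addr, 0, 16);
	addr[0] = 0xfe;			/* 1111 1110 */
	addr[1] = 0x80;			/* 10, then the 54 zero bits follow */
	memcpy(addr + 8, iid, 8);	/* low 64 bits: Interface Identifier */
}
```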

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agocodel: Controlled Delay AQM
Eric Dumazet [Thu, 10 May 2012 07:51:25 +0000 (07:51 +0000)]
codel: Controlled Delay AQM

An implementation of CoDel AQM, from Kathleen Nichols and Van Jacobson.

http://queue.acm.org/detail.cfm?id=2209336

The main input of this AQM is no longer the queue size in bytes or packets, but the
time packets spend in the (FIFO) queue.

As we don't have infinite memory, we can still drop packets in enqueue()
under massive load, but the point of CoDel is to drop packets in
dequeue(), using a control law based on two simple parameters :

target : target sojourn time (default 5ms)
interval : width of moving time window (default 100ms)
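
As a rough illustration of how the interval parameter enters the control law: in the published CoDel description, successive drops are spaced at interval / sqrt(count), where count is the number of drops in the current dropping state. The sketch below uses floating point and hypothetical names; it is not the fixed-point code in include/net/codel.h:

```c
#include <stdint.h>

/* Newton iteration for sqrt, to keep the sketch free of libm. */
static double sketch_sqrt(double n)
{
	double s = n;
	int i;

	if (n <= 0.0)
		return 0.0;
	for (i = 0; i < 64; i++)
		s = 0.5 * (s + n / s);
	return s;
}

/* Next drop time in usec: t + interval / sqrt(count). */
static uint64_t sketch_control_law(uint64_t t_us, uint32_t interval_us,
				   uint32_t count)
{
	double step;

	if (count == 0)		/* treat "no drops yet" as the first drop */
		count = 1;
	step = interval_us / sketch_sqrt((double)count);
	return t_us + (uint64_t)(step + 0.5);	/* round to nearest usec */
}
```

With the default 100ms interval, the first drop is followed by the next one 100ms later, the fourth by one 50ms later, and so on, so the drop rate rises gradually while the sojourn time stays above target.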

Based on initial work from Dave Taht.

Refactored to help future codel inclusion as a plugin for other linux
qdisc (FQ_CODEL, ...), like RED.

include/net/codel.h contains the codel algorithm, kept as close as possible to
Kathleen's reference.

net/sched/sch_codel.c contains the linux qdisc specific glue.

Separate structures permit a memory efficient implementation of fq_codel
(to be sent as a separate work) : Each flow has its own struct
codel_vars.

timestamps are taken at enqueue() time with 1024 ns precision, allowing
a range of 2199 seconds in queue, and 100Gb links support. iproute2 uses
usec as base unit.
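
The arithmetic behind these figures can be checked with a short sketch. It assumes a 32-bit timestamp in units of 1024 ns (nanoseconds shifted right by 10) that is compared through a signed difference; the type and function names are hypothetical:

```c
#include <stdint.h>

typedef uint32_t sketch_time_t;	/* units of 1024 ns */
#define SKETCH_SHIFT 10

/* Convert nanoseconds to 1024 ns units. */
static sketch_time_t sketch_from_ns(uint64_t ns)
{
	return (sketch_time_t)(ns >> SKETCH_SHIFT);
}

/* Wrap-safe "a is after b", using a signed 32-bit difference. */
static int sketch_time_after(sketch_time_t a, sketch_time_t b)
{
	return (int32_t)(a - b) > 0;
}

/* Largest span the signed comparison can order, in whole seconds:
 * 2^31 units * 1024 ns = 2^41 ns. */
static uint64_t sketch_range_seconds(void)
{
	return ((uint64_t)1 << (31 + SKETCH_SHIFT)) / 1000000000u;
}
```

2^41 ns is about 2199.02 seconds, which is consistent with the range quoted above.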

Selected packets are dropped, unless ECN is enabled and packets can get
ECN mark instead.

Tested from 2Mb to 10Gb speeds with no particular problems, on ixgbe and
tg3 drivers (BQL enabled).

Usage: tc qdisc ... codel [ limit PACKETS ] [ target TIME ]
                          [ interval TIME ] [ ecn ]

qdisc codel 10: parent 1:1 limit 2000p target 3.0ms interval 60.0ms ecn
 Sent 13347099587 bytes 8815805 pkt (dropped 0, overlimits 0 requeues 0)
 rate 202365Kbit 16708pps backlog 113550b 75p requeues 0
  count 116 lastcount 98 ldelay 4.3ms dropping drop_next 816us
  maxpacket 1514 ecn_mark 84399 drop_overlimit 0

CoDel must be seen as a base module, and should be used keeping in mind
there is still a FIFO queue. So a typical setup will probably need a
hierarchy of several qdiscs and packet classifiers to be able to meet
whatever constraints a user might have.

One possible example would be to use fq_codel, which combines Fair
Queueing and CoDel, in replacement of sfq / sfq_red.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dave Taht <dave.taht@bufferbloat.net>
Cc: Kathleen Nichols <nichols@pollere.com>
Cc: Van Jacobson <van@pollere.net>
Cc: Tom Herbert <therbert@google.com>
Cc: Matt Mathis <mattmathis@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet_sched: update bstats in dequeue()
Eric Dumazet [Thu, 10 May 2012 05:36:34 +0000 (05:36 +0000)]
net_sched: update bstats in dequeue()

Class bytes/packets stats can be misleading because they are updated in
enqueue() while packet might be dropped later.

We already fixed all qdiscs but sch_atm.

This patch makes the final cleanup.

class rate estimators can now match qdisc ones.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonet, drivers/net: Convert compare_ether_addr_64bits to ether_addr_equal_64bits
Joe Perches [Wed, 9 May 2012 17:04:04 +0000 (17:04 +0000)]
net, drivers/net: Convert compare_ether_addr_64bits to ether_addr_equal_64bits

Use the new bool function ether_addr_equal_64bits to add
some clarity and reduce the likelihood for misuse of
compare_ether_addr_64bits for sorting.

Done via cocci script:

$ cat compare_ether_addr_64bits.cocci
@@
expression a,b;
@@
- !compare_ether_addr_64bits(a, b)
+ ether_addr_equal_64bits(a, b)

@@
expression a,b;
@@
- compare_ether_addr_64bits(a, b)
+ !ether_addr_equal_64bits(a, b)

@@
expression a,b;
@@
- !ether_addr_equal_64bits(a, b) == 0
+ ether_addr_equal_64bits(a, b)

@@
expression a,b;
@@
- !ether_addr_equal_64bits(a, b) != 0
+ !ether_addr_equal_64bits(a, b)

@@
expression a,b;
@@
- ether_addr_equal_64bits(a, b) == 0
+ !ether_addr_equal_64bits(a, b)

@@
expression a,b;
@@
- ether_addr_equal_64bits(a, b) != 0
+ ether_addr_equal_64bits(a, b)

@@
expression a,b;
@@
- !!ether_addr_equal_64bits(a, b)
+ ether_addr_equal_64bits(a, b)

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoetherdevice.h: Add ether_addr_equal_64bits
Joe Perches [Wed, 9 May 2012 17:04:03 +0000 (17:04 +0000)]
etherdevice.h: Add ether_addr_equal_64bits

Add an optimized boolean function to check if
2 ethernet addresses are the same.

This is to avoid any confusion about compare_ether_addr_64bits
returning an unsigned, and not being able to use the
compare_ether_addr_64bits function for sorting ala memcmp.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agodrivers/net: Convert compare_ether_addr to ether_addr_equal
Joe Perches [Wed, 9 May 2012 17:17:46 +0000 (17:17 +0000)]
drivers/net: Convert compare_ether_addr to ether_addr_equal

Use the new bool function ether_addr_equal to add
some clarity and reduce the likelihood for misuse
of compare_ether_addr for sorting.

Done via cocci script:

$ cat compare_ether_addr.cocci
@@
expression a,b;
@@
- !compare_ether_addr(a, b)
+ ether_addr_equal(a, b)

@@
expression a,b;
@@
- compare_ether_addr(a, b)
+ !ether_addr_equal(a, b)

@@
expression a,b;
@@
- !ether_addr_equal(a, b) == 0
+ ether_addr_equal(a, b)

@@
expression a,b;
@@
- !ether_addr_equal(a, b) != 0
+ !ether_addr_equal(a, b)

@@
expression a,b;
@@
- ether_addr_equal(a, b) == 0
+ !ether_addr_equal(a, b)

@@
expression a,b;
@@
- ether_addr_equal(a, b) != 0
+ ether_addr_equal(a, b)

@@
expression a,b;
@@
- !!ether_addr_equal(a, b)
+ ether_addr_equal(a, b)
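
Why a boolean helper is clearer can be sketched in userspace. The functions below are hypothetical stand-ins, not the etherdevice.h implementations. The compare-style result only says "different or not", so it cannot serve as a sort key the way a memcmp() result can:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Nonzero when the addresses differ; the value carries no ordering. */
static unsigned sketch_compare_ether_addr(const uint8_t *a, const uint8_t *b)
{
	uint16_t x[3], y[3];

	memcpy(x, a, 6);	/* copy to avoid unaligned 16-bit loads */
	memcpy(y, b, 6);
	return (unsigned)((x[0] ^ y[0]) | (x[1] ^ y[1]) | (x[2] ^ y[2]));
}

/* Boolean form: true when both 6-byte addresses are identical. */
static bool sketch_ether_addr_equal(const uint8_t *a, const uint8_t *b)
{
	return !sketch_compare_ether_addr(a, b);
}
```

Swapping the arguments of the compare-style function yields the same nonzero value, so it cannot tell which address sorts first; the bool return type makes that limitation explicit.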

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agobe2net: avoid disabling sriov while VFs are assigned
Sathya Perla [Tue, 8 May 2012 19:41:24 +0000 (19:41 +0000)]
be2net: avoid disabling sriov while VFs are assigned

Calling pci_disable_sriov() while VFs are assigned to VMs causes
kernel panic. This patch uses PCI_DEV_FLAGS_ASSIGNED bit state of the
VF's pci_dev to avoid this. Also, the unconditional function reset cmd
issued on a PF probe can delete the VF configuration for the
previously enabled VFs. A scratchpad register is now used to issue a
function reset only when needed (i.e., in a crash dump scenario.)

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agol2tp: fix data packet sequence number handling
James Chapman [Wed, 9 May 2012 23:43:09 +0000 (23:43 +0000)]
l2tp: fix data packet sequence number handling

If enabled, L2TP data packets have sequence numbers which a receiver
can use to drop out of sequence frames or try to reorder them. The
first frame has sequence number 0, but the L2TP code currently expects
it to be 1. This results in the first data frame being handled as out
of sequence.

This one-line patch fixes the problem.

Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agol2tp: fix reorder timeout recovery
James Chapman [Wed, 9 May 2012 23:43:08 +0000 (23:43 +0000)]
l2tp: fix reorder timeout recovery

When L2TP data packet reordering is enabled, packets are held in a
queue while waiting for out-of-sequence packets. If a packet gets
lost, packets will be held until the reorder timeout expires, when we
are supposed to then advance to the sequence number of the next packet
but we don't currently do so. As a result, the data channel is stuck
because we are waiting for a packet that will never arrive - all
packets age out and none are passed.

The fix is to add a flag to the session context, which is set when the
reorder timeout expires and tells the receive code to reset the next
expected sequence number to that of the next packet in the queue.

Tested in a production L2TP network with Starent and Nortel L2TP gear.

Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agotcp: Out-line tcp_try_rmem_schedule
Pavel Emelyanov [Thu, 10 May 2012 01:50:20 +0000 (01:50 +0000)]
tcp: Out-line tcp_try_rmem_schedule

As proposed by Eric, make the tcp_input.o thinner.

add/remove: 1/1 grow/shrink: 1/4 up/down: 868/-1329 (-461)
function                                     old     new   delta
tcp_try_rmem_schedule                          -     864    +864
tcp_ack                                     4811    4815      +4
tcp_validate_incoming                        817     815      -2
tcp_collapse                                 860     858      -2
tcp_send_rcvq                                555     353    -202
tcp_data_queue                              3435    3033    -402
tcp_prune_queue                              721       -    -721

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agotcp: Schedule rmem for rcvq repair send
Pavel Emelyanov [Thu, 10 May 2012 01:50:01 +0000 (01:50 +0000)]
tcp: Schedule rmem for rcvq repair send

As noted by Eric, no checks are performed on the data size we're
putting in the read queue during repair. Thus, validate the given
data size with the common rmem management routine.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agotcp: Move rcvq sending to tcp_input.c
Pavel Emelyanov [Thu, 10 May 2012 01:49:41 +0000 (01:49 +0000)]
tcp: Move rcvq sending to tcp_input.c

It actually works on the input queue and will use its read mem
routines, thus it's better to have it in the tcp_input.c file.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net...
David S. Miller [Fri, 11 May 2012 03:16:35 +0000 (23:16 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net-next

13 years agoixgbe: update version number
Don Skidmore [Sat, 28 Apr 2012 03:29:22 +0000 (03:29 +0000)]
ixgbe: update version number

Update version number to better match the version of the out of tree
driver with similar functionality.

Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years agoixgbe: cleanup the hwmon function calls
Don Skidmore [Fri, 4 May 2012 06:07:08 +0000 (06:07 +0000)]
ixgbe: cleanup the hwmon function calls

When the hwmon code was initially added it was with the assumption that a
sysfs patch would be also coming soon.  Since that isn't the case some
clean up needs to be done.  This patch does that.

Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years agoixgbe: support software timestamping
Jacob Keller [Fri, 4 May 2012 01:55:23 +0000 (01:55 +0000)]
ixgbe: support software timestamping

Kernel software timestamping requires that the driver calls skb_tx_timestamp
just before passing the skb to the MAC, in order to provide the best software
timestamps. This patch adds this call for that support.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years agoixgbe: add support for get_ts_info
Jacob Keller [Fri, 4 May 2012 02:56:12 +0000 (02:56 +0000)]
ixgbe: add support for get_ts_info

This patch adds support for the ethtool get_ts_info operation, which enables
access of available timestamp/timesync support for that device. It can query
which ptp clock device is associated with the particular port.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years agoixgbe: correct disable_rx_buff timeout
Jacob Keller [Thu, 3 May 2012 01:44:12 +0000 (01:44 +0000)]
ixgbe: correct disable_rx_buff timeout

The current value of the udelay timeout for ixgbe_disable_rx_buff is too
short. This causes the security path to not be properly disabled during
the section that is meant to have it turned off. The end result causes a race
condition that results in RX issues.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years agoixgbe: Enable timesync clock-out feature for PPS support on X540
Jacob E Keller [Tue, 1 May 2012 05:24:41 +0000 (05:24 +0000)]
ixgbe: Enable timesync clock-out feature for PPS support on X540

This patch enables the PPS system in the PHC framework, by enabling
the clock-out feature on the X540 device. Causes the SDP0 to be set as
a 1Hz clock. Also configures the timesync interrupt cause in order to
report each pulse to the PPS via the PHC framework, which can be used
for general system clock synchronization. (This allows a stable method
for tuning the general system time via the on-board SYSTIM register
based clock.)

Signed-off-by: Jacob E Keller <jacob.e.keller@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years agoixgbe: Hardware Timestamping + PTP Hardware Clock (PHC)
Jacob Keller [Tue, 1 May 2012 05:24:58 +0000 (05:24 +0000)]
ixgbe: Hardware Timestamping + PTP Hardware Clock (PHC)

This patch enables hardware timestamping for use with PTP software by
extracting a ns counter from an arbitrary fixed point cycles counter.
The hardware generates SYSTIME registers using the DMA tick which
changes based on the current link speed. These SYSTIME registers are
converted to ns using the cyclecounter and timecounter structures
provided by the kernel. Using the SO_TIMESTAMPING api, software can
enable and access timestamps for PTP packets.

The SO_TIMESTAMPING API has space for 3 different kinds of timestamps,
SYS, RAW, and SOF. SYS hardware timestamps are hardware ns values that
are then scaled to the software clock. RAW hardware timestamps are the
direct raw value of the ns counter. SOF software timestamps are the
software timestamp calculated as close as possible to the software
transmit, but are not offloaded to the hardware. This patch only
supports the RAW hardware timestamps due to inefficiency of the SYS
design.

This patch also enables the PHC subsystem features for atomically
adjusting the cycle register, and adjusting the clock frequency in
parts per billion. This frequency adjustment works by slightly
adjusting the value added to the cycle registers each DMA tick. This
causes the hardware registers to overflow rapidly (approximately once
every 34 seconds, when at 10gig link). To solve this, the timecounter
structure is used, along with a timer set for every 25 seconds. This
allows for detecting register overflow and converting the cycle
counter registers into ns values needed for providing useful
timestamps to the network stack.
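
The overflow handling can be illustrated with a simplified userspace model of the idea. This is a sketch with hypothetical names, not the kernel's cyclecounter and timecounter code: each read folds the cycles elapsed since the previous read into a running nanosecond total, and masking the difference makes a single wrap of the hardware counter harmless.

```c
#include <stdint.h>

struct sketch_timecounter {
	uint64_t mask;		/* width of the hardware counter */
	uint64_t last_cycles;	/* counter value at the previous read */
	uint64_t ns;		/* accumulated nanoseconds */
	uint32_t mult;		/* ns = (cycles * mult) >> shift */
	uint32_t shift;
};

/* Fold the cycles elapsed since the last read into the ns total. */
static uint64_t sketch_tc_read(struct sketch_timecounter *tc,
			       uint64_t hw_cycles)
{
	uint64_t delta = (hw_cycles - tc->last_cycles) & tc->mask;

	tc->last_cycles = hw_cycles;
	tc->ns += (delta * tc->mult) >> tc->shift;
	return tc->ns;
}
```

The masked difference is only correct if at most one wrap happened between two reads, which is why a periodic read shorter than the overflow period (25 seconds against roughly 34 seconds here) is needed.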

Only the basic required clock functions are supported at this time,
although the hardware supports some ancillary features and these could
easily be enabled in the future.

Note that use of this hardware timestamping requires modifying daemon
software to use the SO_TIMESTAMPING API for timestamps, and the
ptp_clock PHC framework for accessing the clock. The timestamps have
no relation to the system time at all, so software must use the posix
clock generated by the PHC framework instead.

Signed-off-by: Jacob E Keller <jacob.e.keller@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years agoixgbe: Fix bogus error message
Greg Rose [Sat, 21 Apr 2012 00:54:28 +0000 (00:54 +0000)]
ixgbe: Fix bogus error message

If the VF sends a MACVLAN request with index of zero then it is not
actually trying to add a filter.  Check the index value and only
indicate that operation is not allowed when the VF is actually trying
to add a filter.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years agoixgbe: Set Drop_EN bit when multiple Rx queues are present w/o flow control
Alexander Duyck [Wed, 25 Apr 2012 04:36:38 +0000 (04:36 +0000)]
ixgbe: Set Drop_EN bit when multiple Rx queues are present w/o flow control

The drop enable bit can be used to improve the performance of the adapter
in the case of multiple queues being present.  This performance gain is due
to the fact that some slower CPUs can cause the FIFO to backfill preventing
faster CPUs from receiving additional work.  By setting the drop enable bit
we prevent this and instead just drop the packets that would have been
bound for the slower CPU.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years agoixgbe: Clean up priority based flow control
Alexander Duyck [Thu, 10 May 2012 05:14:44 +0000 (22:14 -0700)]
ixgbe: Clean up priority based flow control

This change cleans up the logic in the priority based flow control
configuration routines.  Both the 82599 and 82598 based routines perform
similar functions however they are both arranged completely differently.
This patch goes over both of them to clean up the code.

In addition I am dropping the ixgbe_fc_pfc flow control mode and instead
just replacing it with checks for if priority flow control is enabled.
This allows us to maintain some of the link flow control information which
allows for an easier transition between link and priority flow control.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years agoixgbe: Exit on error case in VF message processing
Alexander Duyck [Wed, 28 Mar 2012 08:03:38 +0000 (08:03 +0000)]
ixgbe: Exit on error case in VF message processing

Previously we would get a mailbox error and still process the message.
Instead we should exit on error.

In addition we should also be flushing the ACK of the message so that we
can guarantee that the other end is aware we have received the message
while we are processing it.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years agoigb: output register's information related to RX/TX queue[4-15]
Koki Sanagi [Wed, 15 Feb 2012 14:45:39 +0000 (14:45 +0000)]
igb: output register's information related to RX/TX queue[4-15]

Current igb outputs registers related to TX/RX queues (e.g. RDT, RDH, TDT, TDH).
However, it assumes there are only 4 RX/TX queues, whereas the 82576 has 16.
This patch modifies igb to output the rest of the registers if the device is
82576.

Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
Acked-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
13 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net...
Jeff Kirsher [Thu, 10 May 2012 04:12:37 +0000 (21:12 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/davem/net-next

13 years agonetxen_nic: Fix estimation of recv MSS in case of LRO
Rajesh Borundia [Wed, 9 May 2012 05:55:30 +0000 (05:55 +0000)]
netxen_nic: Fix estimation of recv MSS in case of LRO

o The Linux stack estimates the MSS from skb->len or skb_shinfo(skb)->gso_size.
With LRO, skb->len is the aggregate length of several packets, so an MSS
derived from skb->len would be incorrect. An incorrect estimate of the receive
MSS leads to delayed ACKs in some traffic patterns (a sender that sends two or
three packets, waits for an ACK, and only then sends the remaining packets),
which reduces performance. Hence we need to set gso_size to the MSS obtained
from firmware.

o This was fixed only recently in firmware, so the MSS is obtained based on
capability: the driver sets gso_size only if the firmware is capable of sending
the MSS.
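
The effect of the aggregate length on the estimate can be shown with a small userspace sketch. The struct and function names here are hypothetical stand-ins, not the kernel's sk_buff or the stack's actual estimation code; the sketch only assumes that gso_size is preferred when it is non-zero.

```c
#include <stddef.h>

/* Hypothetical stand-in for the two fields discussed above. */
struct hyp_rx_pkt {
	size_t len;      /* total length; the aggregate of all segments under LRO */
	size_t gso_size; /* per-segment size, 0 when the driver did not set it */
};

/* Use gso_size when it is set, otherwise fall back to the total length. */
static size_t hyp_estimate_rcv_mss(const struct hyp_rx_pkt *pkt)
{
	return pkt->gso_size ? pkt->gso_size : pkt->len;
}
```

For example, if LRO merges four 1448-byte segments, len is 5792. With gso_size left at 0 the estimate is 5792; with gso_size set to 1448 the estimate is 1448.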

Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonetxen: added miniDIMM support in driver.
Sucheta Chakraborty [Wed, 9 May 2012 05:55:29 +0000 (05:55 +0000)]
netxen: added miniDIMM support in driver.

The driver queries DIMM information from firmware and sets the "presence"
field of the structure accordingly. A "presence" value of 0xff denotes an
invalid flag, and a value of 0x0 denotes that DIMM memory is not present.
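
A minimal sketch of how the two special values might be decoded. The macro, enum and function names are hypothetical, and treating every other value as "present" is an assumption that the commit message does not state.

```c
#include <stdint.h>

/* The two values come from the commit message; the names are hypothetical. */
#define HYP_DIMM_PRESENCE_INVALID 0xff
#define HYP_DIMM_PRESENCE_ABSENT  0x00

enum hyp_dimm_state {
	HYP_DIMM_ABSENT,
	HYP_DIMM_INVALID,
	HYP_DIMM_PRESENT,
};

static enum hyp_dimm_state hyp_dimm_state(uint8_t presence)
{
	if (presence == HYP_DIMM_PRESENCE_INVALID)
		return HYP_DIMM_INVALID;
	if (presence == HYP_DIMM_PRESENCE_ABSENT)
		return HYP_DIMM_ABSENT;
	/* Assumption: any other value means DIMM memory is present. */
	return HYP_DIMM_PRESENT;
}
```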

Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonetxen_nic: Allow only useful and recommended firmware dump capture mask values
Manish chopra [Wed, 9 May 2012 05:55:28 +0000 (05:55 +0000)]
netxen_nic: Allow only useful and recommended firmware dump capture mask values

o 0x3, 0x7, 0xF, 0x1F, 0x3F, 0x7F and 0xFF are the allowed capture masks.
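
The allowed values are exactly the masks with the low n bits set, for n from 2 to 8. A minimal sketch of a check with that property follows; the function name is hypothetical and this is not the driver's actual validation code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true only for
 * 0x3, 0x7, 0xF, 0x1F, 0x3F, 0x7F and 0xFF. */
static bool hyp_capture_mask_allowed(uint32_t mask)
{
	/* Reject 0x0, 0x1, 0x2 and anything wider than 8 bits. */
	if (mask < 0x3 || mask > 0xFF)
		return false;

	/* A value made only of contiguous low bits satisfies
	 * mask & (mask + 1) == 0, e.g. 0x1F & 0x20 == 0. */
	return (mask & (mask + 1)) == 0;
}
```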

Signed-off-by: Manish chopra <manish.chopra@qlogic.com>
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agonetxen_nic: disable minidump by default
Sritej Velaga [Wed, 9 May 2012 05:55:27 +0000 (05:55 +0000)]
netxen_nic: disable minidump by default

Disable firmware dump by default at startup.

Signed-off-by: Sritej Velaga <sritej.velaga@qlogic.com>
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>