git.monstr.eu Git - linux-2.6-microblaze.git/log

[PATCH] MM: Remove rogue readahead printk

For some reason it triggers always with NFS root and spams the kernel
logs of my nfs root boxes a lot.

Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] i386: Fix up backtrace fallback patch

I didn't test all compilation combinations. Shame on me.
And fix a missing option in the boot option following x86-64 (Jan Beulich)

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Fix swiotlb=force

It was broken before. But having it is important as possible hardware
bug workaround.

And previously there was no way to force swiotlb if there is another IOMMU.
Side effect is that iommu=force won't force swiotlb anymore even if there
isn't another IOMMU.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Revert k8-bus.c northbridge access change

As Travis Betak points out it accesses the wrong northbridge subfunction
now. Switch back to the old code.

Cc: "Travis Betak" <betak@mpdtxmail.amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Calgary IOMMU - Multi-Node NULL pointer dereference fix

Calgary hits a NULL pointer dereference when booting in a multi-chassis
NUMA system.  See Redhat bugzilla number 198498, found by Konrad
Rzeszutek (konradr@redhat.com).

There are many issues that had to be resolved to fix this problem.
Firstly when I originally wrote the code to handle NUMA systems, I
had a large misunderstanding that was not corrected until now.  That was
that I thought the "number of nodes online" referred to number of
physical systems connected.  So that if NUMA was disabled, there
would only be 1 node and it would only show that node's PCI bus.
In reality if NUMA is disabled, the system displays all of the
connected chassis as one node but is only ignorant of the delays
in accessing main memory.  Therefore, references to num_online_nodes()
and MAX_NUMNODES are incorrect and need to be set to the maximum
number of nodes that can be accessed (which are 8).  I created a
variable, MAX_NUM_CHASSIS, and set it to 8 to fix this.

Secondly, when walking the PCI in detect_calgary, the code only
checked the first "slot" when looking to see if a device is present.
This will work for most cases, but unfortunately it isn't always the
case.  In the NUMA MXE drawers, there are USB devices present on the
3rd slot (with slot 1 being empty).  So, to work around this, all
slots (up to 8) are scanned to see if there are any devices present.

Lastly, the bus is being enumerated on large systems in a different
way the we originally thought.  This throws the ugly logic we had
out the window.  To more elegantly handle this, I reorganized the
kva array to be sparse (which removed the need to have any bus number
to kva slot logic in tce.c) and created a secondary space array to
contain the bus number to phb mapping.

With these changes Calgary boots on an x460 with 4 nodes with and
without NUMA enabled.

Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Calgary IOMMU - fix off by one error

Fixed off-by-one error in detect_calgary and calgary_init which will
cause arrays to overflow. Also, removed impossible to hit BUG_ON.

Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: On Intel systems when CPU has C3 don't use TSC

On Intel systems generally the TSC stops in C3 or deeper,
so don't use it there. Follows similar logic on i386.

This should fix problems on Meroms.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Update defconfig

Update defconfig

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

Merge branch 'upstream-linus' of /linux/kernel/git/jgarzik/libata-dev

* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  [libata] sata_promise: comment out duplicate PCI ID
  [PATCH] libata: improve EH action and EHI flag handling
  [PATCH] libata: fix eh_skip_recovery condition
  [PATCH] libata: fix autopsy ehc->i.action and ehc->i.dev handling

Merge branch 'master' into upstream-fixes

Merge branch 'upstream' of git://electric-eye.fr.zoreil.com/home/romieu/linux-2.6 into upstream-fixes

[PATCH] skge: chip clock rate typo

Okay, Fix both typo's in one patch .The impact is that the incorrect value
was being computed for blinking LED and interrupt moderation values.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

[PATCH] myri10ge - Always do a dummy RDMA after loading the firmware

Always do a dummy RDMA after loading the firmware to work around
buggy PCIe chipsets which do not implement resending properly.
This is so cheap as to be almost free, and should never have been
conditional on the tx boundary != 4096.

Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

Merge branch 'upstream-fixes' of git://git./linux/kernel/git/linville/wireless-2.6 into upstream-fixes

[PATCH] pi-futex: robust-futex exit

Fix robust PI-futexes to be properly unlocked on unexpected exit.

For this to work the kernel has to know whether a futex is a PI or a
non-PI one, because the semantics are different. Since the space in
relevant glibc data structures is extremely scarce, the best solution is
to encode the 'PI' information in bit 0 of the robust list pointer.
Existing (non-PI) glibc robust futexes have this bit always zero, so the
ABI is kept. New glibc with PI-robust-futexes will set this bit.

Further fixes from Thomas Gleixner <tglx@linutronix.de>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] pi-futex: robust-futex exit crash fix

Fix pi_state->list handling bugs: list handling mishap, locking error.
Plus add more debug checks and fix a few style issues i noticed while
debugging this.

(reported by Ulrich Drepper and Jakub Jelinek.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] i386: Do backtrace fallback too

Similar patch to earlier x86-64 patch. When the dwarf2 unwinder fails
dump the left over stack with the old unwinder.

Also some clarifications in the headers.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Document backtracer selection options

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Dump leftover backtrace entries when dwarf2 unwinder got stuck

The dwarf2 unwinder currently often gets stuck because a lot
of assembly code doesn't have proper dwarf2 annotiation yet.

This currently often happens with __down. Should fix this by
adding proper dwarf2 annotation to all inline assembly. However
until that's done we need a quick fix for 2.6.18 to avoid
incomplete backtraces.

So when this happens dump the rest of the stack with the old unwinder
instead of silently not dumping it. There was already a optional
"both" mode that dumped both, but that was too ugly.

I also clarified the headers for the different backtraces a bit.

Also add a clear error message for missing dwarf2
annotation that people can work on.

And I removed a dead variable left over from Ingo's changes.

Cc: mingo@elte.hu
Cc: jbeulich@novell.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Enlarge debug stack for nested kprobes

In x86_64 platform, INT1 and INT3 trap stack is IST stack called DEBUG_STACK,
when INT1/INT3 trap happens, system will switch to DEBUG_STACK by hardware.
Current DEBUG_STACK size is 4K, when int1/int3 trap happens, kernel will
minus current DEBUG_STACK IST value by 4k. But if int3/int1 trap is nested,
it will destroy other vector's IST stack. This patch modifies this, it sets
DEBUG_STACK size as 8K and allows two level of nested int1/int3 trap.

Kprobe DEBUG_STACK may be nested, because kprobe handler may be probed
by other kprobes.

Thanks jbeulich for pointing out error in the first patch.

[AK: nested kprobes are pretty dubious. Hopefully one nest
will be enough. This will cost 8K per CPU (4K more than before)]

Signed-off-by: bibo, mao <bibo.mao@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] x86_64: Don't clobber r8-r11 in int 0x80 handler

When int 0x80 is called from long mode r8-r11 would leak out of the
kernel (or rather they would be filled with some values from
the kernel stack). I don't think it's a security issue because
the values come from the fixed stack frame which should be near
always user registers from a previous interrupt.

Still better fix it.

Longer term the register save macros need to be cleaned up
to avoid such mistakes in the future.

Original analysis from Richard Brunner, fix by me.

Cc: Richard.Brunner@amd.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] i386/x86-64: Add user_mode checks to profile_pc for oprofile

Fixes a obscure user space triggerable crash during oprofiling.

Oprofile calls profile_pc from NMIs even when user_mode(regs) is not true and
the program counter is inside the kernel lock section. This opens
a race - when a user program jumps to a kernel lock address and
a NMI happens before the illegal page fault exception is raised
and the program has a unmapped esp or ebp then the kernel could
oops. NMIs have a higher priority than exceptions so that could
happen.

Add user_mode checks to i386/x86-64 profile_pc to prevent that.

Cc: John Levon <levon@movementarian.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

Merge branch 'for-linus' of git://git390.osdl.marist.edu/linux-2.6

* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
  [S390] update default configuration
  [S390] duplicate ccw devices in ccwgroup.
  [S390] permanent subchannel busy conditions may cause I/O stall

Merge /pub/scm/linux/kernel/git/davem/sparc-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
  [SUNLANCE]: fix compilation on sparc-UP
  [SPARC]: Defer clock_probe to fs_initcall()
  [SPARC64]: Fix typo in pgprot_noncached().
  [SPARC64]: Fix quad-float multiply emulation.

Merge git://oss.sgi.com:8090/nathans/xfs-rc-2.6

* git://oss.sgi.com:8090/nathans/xfs-rc-2.6:
  [XFS] Ensure bulkstat from an invalid inode number gets caught always with
  [XFS] Fix a barrier related forced shutdown on mounts with quota enabled.
  [XFS] Fix remount vs no/barrier options by ensuring we clear unwanted
  [XFS] All xfs_disk_dquot_t values are (as the name says) disk endian.

Merge branch 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-block

* 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-block:
  [PATCH] scsi: kill overeager "not-ready" messages
  [PATCH] it821x: fix ide dma setup bug
  [PATCH] ide: if the id fields looks screwy, disable DMA
  [PATCH] ide: option to disable cache flushes for buggy drives

[PATCH] i386: switch_to(): misplaced parentheses

Recent changes in i386 __switch_to() have a misplaced closing
parenthesis causing an unlikely() to terminate early.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[SUNLANCE]: fix compilation on sparc-UP

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[XFS] Ensure bulkstat from an invalid inode number gets caught always with
EINVAL.

SGI-PV: 953819
SGI-Modid: xfs-linux-melb:xfs-kern:26629a

Signed-off-by: Nathan Scott <nathans@sgi.com>

[XFS] Fix a barrier related forced shutdown on mounts with quota enabled.

SGI-PV: 912426
SGI-Modid: xfs-linux-melb:xfs-kern:26622a

Signed-off-by: Nathan Scott <nathans@sgi.com>

[XFS] Fix remount vs no/barrier options by ensuring we clear unwanted
flags from iclog buffers before submitting them for writing.

SGI-PV: 954772
SGI-Modid: xfs-linux-melb:xfs-kern:26605a

Signed-off-by: Nathan Scott <nathans@sgi.com>

[XFS] All xfs_disk_dquot_t values are (as the name says) disk endian.
Before putting them into struct statfs they should be endian-swapped.

SGI-PV: 954580
SGI-Modid: xfs-linux-melb:xfs-kern:26550a

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nathan Scott <nathans@sgi.com>

[PATCH] scsi: kill overeager "not-ready" messages

HAL and friends have a tendency to trigger this one all the time.
It's not really interesting, so kill it. The vendor kernels all do
anyways.

Signed-off-by: Jens Axboe <axboe@suse.de>

[PATCH] it821x: fix ide dma setup bug

Only enable dma for a valid speed setting.

Signed-off-by: Jens Axboe <axboe@suse.de>

[PATCH] ide: if the id fields looks screwy, disable DMA

It's the safer choice. Originally due to a bug in itx821x, but a
generally sound thing to do.

Signed-off-by: Jens Axboe <axboe@suse.de>

[PATCH] ide: option to disable cache flushes for buggy drives

Some drives claim they support cache flushing, but get seriously
confused if you try. Add this option to be able to boot with
barriers enabled by default.

Signed-off-by: Jens Axboe <axboe@suse.de>

[SPARC]: Defer clock_probe to fs_initcall()

From: Bob Breuer <breuerr@mc.net>

That way all the of_driver bits will be ready.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC64]: Fix typo in pgprot_noncached().

The sun4v code sequence was or'ing in the sun4u pte bits by mistake.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC64]: Fix quad-float multiply emulation.

Something is wrong with the 3-multiply (vs. 4-multiply) optimized
version of _FP_MUL_MEAT_2_*(), so just use the slower version
which actually computes correct values.

Noticed by Rene Rebe

Signed-off-by: David S. Miller <davem@davemloft.net>

[PATCH] ieee80211: TKIP requires CRC32

ieee80211_crypt_tkip will not work without CRC32.

LD .tmp_vmlinux1
net/built-in.o: In function `ieee80211_tkip_encrypt':
net/ieee80211/ieee80211_crypt_tkip.c:349: undefined reference to `crc32_le'

Reported by Toralf Foerster <toralf.foerster@gmx.de>

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

[PATCH] orinoco: fix setting transmit key only

When determining whether there's a key to set or not, orinoco should be
looking at the key length, not the key data. Otherwise confusion reigns
when trying to set TX key only, passing in zero-length key, but non-NULL
pointer. Key length takes precedence over non-NULL key data.

Signed-off-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

[PATCH] softmac: do shared key auth in workqueue

Johann Uhrmann reported a bcm43xx crash and Michael Buesch tracked
it down to a problem with the new shared key auth code (recursive
calls into the driver)

This patch (effectively Michael's patch with a couple of small
modifications) solves the problem by sending the authentication
challenge response frame from a workqueue entry.

I also removed a lone \n from the bcm43xx messages relating to
authentication mode - this small change was previously discussed but
not patched in.

Signed-off-by: Daniel Drake <dsd@gentoo.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

[PATCH] airo: should select crypto_aes

The driver airo (for Cisco Wlan-Cards) complains about "failed to load
transform for AES", when it is loaded and CRYPTO_AES is not selected
in Kconfig.

Signed-off-by: John W. Linville <linville@tuxdriver.com>

[PATCH] zd1201: workaround interference problem

zd1201 has nasty tendency to emit magicall anti-wifi cloud when it is
inserted into slot, but not used.

Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

[S390] update default configuration

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

[S390] duplicate ccw devices in ccwgroup.

Fail to create a ccwgroup device if a ccw device is passed in twice.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

[S390] permanent subchannel busy conditions may cause I/O stall

In special conditions where a subchannel rejects the HALT I/O-
instruction with a busy indication (cc 2), I/O may stall.
I/O request termination logic retries HALT I/O indefinitely
because it expects HALT I/O to alter the subchannel status which
is not true when cc 2 is returned.
In case of a busy indication, try CLEAR I/O instruction immediately.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

[PATCH] fix compile regression for a few scsi drivers

This fixes three drivers to compile again after my patch that removes
the data_cmnd member from struct scsi_cmnd.

The fas216 change is trivial, it should have been using ->cmnd all the
time.

NCR53C9 (which seem to be mostly duplicate driver with esp.c!) is doing
something odd, it should only have looked at ->cmnd before not the saved
copy that is kept for the error handlers sake. Note that it really
should deal with the sync setting themselves but use the generic domain
validation code that get this right - but that's for later let's push
this simple compile fix for now.

And sorry for the late fix for this, I have been busy with OLS and
associated activities last week.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

Merge /pub/scm/linux/kernel/git/davem/sparc-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
  [SCSI] esp: Fix build.
  [SPARC]: Fix SA_STATIC_ALLOC value.
  [SPARC64]: Explicitly print return PC when the kernel fault PC is bogus.

Merge /pub/scm/linux/kernel/git/davem/net-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [IPV4/IPV6]: Setting 0 for unused port field in RAW IP recvmsg().
  [IPV4] ipmr: ip multicast route bug fix.
  [TG3]: Update version and reldate
  [TG3]: Handle tg3_init_rings() failures
  [TG3]: Add tg3_restart_hw()
  [IPV4]: Clear the whole IPCB, this clears also IPCB(skb)->flags.
  [IPV6]: Clean skb cb on IPv6 input.
  [NETFILTER]: Demote xt_sctp to EXPERIMENTAL
  [NETFILTER]: bridge netfilter: add deferred output hooks to feature-removal-schedule
  [NETFILTER]: xt_pkttype: fix mismatches on locally generated packets
  [NETFILTER]: SNMP NAT: fix byteorder confusion
  [NETFILTER]: conntrack: fix SYSCTL=n compile
  [NETFILTER]: nf_queue: handle NF_STOP and unknown verdicts in nf_reinject
  [NETFILTER]: H.323 helper: fix possible NULL-ptr dereference

[PATCH] Reorganize the cpufreq cpu hotplug locking to not be totally bizare

The patch below moves the cpu hotplugging higher up in the cpufreq
layering; this is needed to avoid recursive taking of the cpu hotplug
lock and to otherwise detangle the mess.

The new rules are:
1. you must do lock_cpu_hotplug() around the following functions:
   __cpufreq_driver_target
   __cpufreq_governor (for CPUFREQ_GOV_LIMITS operation only)
   __cpufreq_set_policy
2. governer methods (.governer) must NOT take the lock_cpu_hotplug()
   lock in any way; they are called with the lock taken already
3. if your governer spawns a thread that does things, like calling
   __cpufreq_driver_target, your thread must honor rule #1.
4. the policy lock and other cpufreq internal locks nest within
   the lock_cpu_hotplug() lock.

I'm not entirely happy about how the __cpufreq_governor rule ended up
(conditional locking rule depending on the argument) but basically all
callers pass this as a constant so it's not too horrible.

The patch also removes the cpufreq_governor() function since during the
locking audit it turned out to be entirely unused (so no need to fix it)

The patch works on my testbox, but it could use more testing
(otoh... it can't be much worse than the current code)

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[IPV4/IPV6]: Setting 0 for unused port field in RAW IP recvmsg().

From: Tetsuo Handa from-linux-kernel@i-love.sakura.ne.jp

The recvmsg() for raw socket seems to return random u16 value
from the kernel stack memory since port field is not initialized.
But I'm not sure this patch is correct.
Does raw socket return any information stored in port field?

[ BSD defines RAW IP recvmsg to return a sin_port value of zero.
This is described in Steven's TCP/IP Illustrated Volume 2 on
page 1055, which is discussing the BSD rip_input() implementation. ]

Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV4] ipmr: ip multicast route bug fix.

IP multicast route code was reusing an skb which causes use after free
and double free.

From: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>

Note, it is real skb_clone(), not alloc_skb(). Equeued skb contains
the whole half-prepared netlink message plus room for the rest.
It could be also skb_copy(), if we want to be puristic about mangling
cloned data, but original copy is really not going to be used.

Acked-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TG3]: Update version and reldate

Update version to 3.63.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TG3]: Handle tg3_init_rings() failures

Handle dev_alloc_skb() failures when initializing the RX rings.
Without proper handling, the driver will crash when using a partial
ring.

Thanks to Stephane Doyon <sdoyon@max-t.com> for reporting the bug and
providing the initial patch.

Howie Xu <howie@vmware.com> also reported the same issue.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TG3]: Add tg3_restart_hw()

Add tg3_restart_hw() to handle failures when re-initializing the
device.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[PATCH] cfq-iosched: don't use a hard jiffies value, translate from msecs

The CIC_SEEKY() test really wants to use the minimum of either:

- 2 msecs (not jiffies)

- or, the pending slice time

So code it like that.

Signed-off-by: Jens Axboe <axboe@suse.de>

[PATCH] blktrace: fix read-ahead bit

It should be toggling the same bit on and off, fix it up.

Signed-off-by: Jens Axboe <axboe@suse.de>

[PATCH] cciss: fix stall with softirq handling and CFQ

We need to postpone the queue startup until after the softirq
handler has actually finished some requests, otherwise we could
be racing with cciss_softirq_done() and not actually restart
the queue handling.

Signed-off-by: Jens Axboe <axboe@suse.de>

[IPV4]: Clear the whole IPCB, this clears also IPCB(skb)->flags.

Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV6]: Clean skb cb on IPv6 input.

Clear the accumulated junk in IP6CB when starting to handle an IPV6
packet.

Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETFILTER]: Demote xt_sctp to EXPERIMENTAL

After the recent problems with all the SCTP stuff it seems reasonable
to mark this as experimental.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETFILTER]: bridge netfilter: add deferred output hooks to feature-removal-schedule

Add bridge netfilter deferred output hooks to feature-removal-schedule
and disable them by default. Until their removal they will be
activated by the physdev match when needed.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETFILTER]: xt_pkttype: fix mismatches on locally generated packets

Locally generated broadcast and multicast packets have pkttype set to
PACKET_LOOPBACK instead of PACKET_BROADCAST or PACKET_MULTICAST. This
causes the pkttype match to fail to match packets of either type.

The below patch remedies this by using the daddr as a hint as to
broadcast|multicast. While not pretty, this seems like the only way
to solve the problem short of just noting this as a limitation of the
match.

This resolves netfilter bugzilla #484

Signed-off-by: Phil Oester <kernel@linuxace.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETFILTER]: SNMP NAT: fix byteorder confusion

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETFILTER]: conntrack: fix SYSCTL=n compile

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETFILTER]: nf_queue: handle NF_STOP and unknown verdicts in nf_reinject

In case of an unknown verdict or NF_STOP the packet leaks. Unknown verdicts
can happen when userspace is buggy. Reinject the packet in case of NF_STOP,
drop on unknown verdicts.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETFILTER]: H.323 helper: fix possible NULL-ptr dereference

An RCF message containing a timeout results in a NULL-ptr dereference if
no RRQ has been seen before.

Noticed by the "SATURN tool", reported by Thomas Dillig <tdillig@stanford.edu>
and Isil Dillig <isil@stanford.edu>.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[SCSI] esp: Fix build.

The data_cmd[] member got deleted, so do not use it any more. Scsi
commands do not have their ->cmd[] overwritten temporary to probe for
status after an error before retrying.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC]: Fix SA_STATIC_ALLOC value.

It alises IRQF_SHARED which causes all kinds of
problems.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC64]: Explicitly print return PC when the kernel fault PC is bogus.

That way we'll have at least some debugging info even if
the stack dump explodes.

Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Correct dev_alloc_skb kerneldoc

dev_alloc_skb is designated for RX descriptors, not TX. (Some drivers
use it for the latter anyway, but that's a different story)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Remove CONFIG_HAVE_ARCH_DEV_ALLOC_SKB

skbuff.h has an #ifndef CONFIG_HAVE_ARCH_DEV_ALLOC_SKB to allow
architectures to reimplement __dev_alloc_skb. It's not set on any
architecture and now that we have an architecture-overrideable
NET_SKB_PAD there is not point at all to have one either.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

[VLAN]: Fix link state propagation

When the queue of the underlying device is stopped at initialization time
or the device is marked "not present", the state will be propagated to the
vlan device and never change. Based on an analysis by Patrick McHardy.

Signed-off-by: Stefan Rompf <stefan@loplof.de>
ACKed-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV6] xfrm6_tunnel: Delete debugging code.

It doesn't compile, and it's dubious in several regards:

1) is enabled by non-Kconfig controlled CONFIG_* value
(noted by Randy Dunlap)
2) XFRM6_TUNNEL_SPI_MAGIC is defined after it's first use
3) the debugging messages print object pointer addresses
which have no meaning without context

So let's just get rid of it.

Signed-off-by: David S. Miller <davem@davemloft.net>

[Bluetooth] Enable SCO support for Broadcom HID proxy dongle

The Broadcom dongles with HID proxy support actually support SCO over
HCI if the SCO buffer size values are corrected. So instead of disabling
the SCO support, mark this dongle with the quirk for the Bluetooth core
to correct the wrong buffer size values.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

[Bluetooth] Add quirk for another broken RTX Telecom based dongle

This patch disables the ISOC transfers for another broken RTX Telecom
based USB dongle. Starting the USB ISOC transfers only ends in a burst
of error messages for invalid SCO packets on connection handle 0.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

[Bluetooth] Correct SCO buffer size for Belkin devices

The Belkin F8T012 and F8T013 devices are both based on a Bluetooth chip
from Broadcom and their SCO buffer size values are wrong. The Bluetooth
core should correct these values.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

[Bluetooth] Correct SCO buffer size for another Broadcom chip

The SCO buffer size values on IBM/Lenovo ThinkPad laptops with a
Bluetooth chip from Broadcom are wrong. The USB Bluetooth driver
has to set a quirk to correct the SCO buffer size values.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

[Bluetooth] Correct RFCOMM channel MTU for broken implementations

Some Bluetooth RFCOMM implementations try to negotiate a bigger channel
MTU than we can support for a particular session. The maximum MTU for
a RFCOMM session is limited through the L2CAP layer. So if the other
side proposes a channel MTU that is bigger than the underlying L2CAP
MTU, we should reduce it to the L2CAP MTU of the session minus five
bytes for the RFCOMM headers.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

[PKT_SCHED]: Fix regression in PSCHED_TADD{,2}.

In PSCHED_TADD and PSCHED_TADD2, if delta is less than tv.tv_usec (so,
less than USEC_PER_SEC too) then tv_res will be smaller than tv. The
affectation "(tv_res).tv_usec = __delta;" is wrong. The fix is to
revert to the original code before
4ee303dfeac6451b402e3d8512723d3a0f861857 and change the 'if' in
'while'.

[Shuya MAEDA: "while (__delta >= USEC_PER_SEC){ ... }" instead of
"while (__delta > USEC_PER_SEC){ ... }"]

Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

[DCCP]: Fix default sequence window size

When using the default sequence window size (100) I got the following in
my logs:

Jun 22 14:24:09 localhost kernel: [ 1492.114775] DCCP: Step 6 failed for
DATA packet, (LSWL(6279674225) <= P.seqno(6279674749) <=
S.SWH(6279674324)) and (P.ackno doesn't exist or LAWL(18798206530) <=
P.ackno(1125899906842620) <= S.AWH(18798206548), sending SYNC...
Jun 22 14:24:09 localhost kernel: [ 1492.115147] DCCP: Step 6 failed for
DATA packet, (LSWL(6279674225) <= P.seqno(6279674750) <=
S.SWH(6279674324)) and (P.ackno doesn't exist or LAWL(18798206530) <=
P.ackno(1125899906842620) <= S.AWH(18798206549), sending SYNC...

I went to alter the default sysctl and it didn't take for new sockets.
Below patch fixes this.

I think the default is too low but it is what the DCCP spec specifies.

As a side effect of this my rx speed using iperf goes from about 2.8 Mbits/sec
to 3.5. This is still far too slow but it is a step in the right direction.

Compile tested only for IPv6 but not particularly complex change.

Signed off by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

IB/mthca: Initialize max_cmds before debug code prints it

Read the max_cmds value from the response to the QUERY_FW command
before printing out the value, so that the real value goes into the
debug output.

Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipoib: Fix packet loss after hardware address update

The neighbour ha field may get updated without destroying the
neighbour. In this case, the ha field gets out of sync with the
address handle stored in ipoib_neigh->ah, with the result that
the ah field would point to an incorrect path, resulting in all
packets being lost.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipoib: Fix oops with ipoib_debug_mcast set

Need to set mcast->ah before debug code dereferences it.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/mad: Validate MADs for spec compliance

Validate MADs sent by userspace clients for spec compliance with
C13-18.1.1 (prevent duplicate requests and responses sent on the
same port). Without this, RMPP transactions get aborted because
of duplicate packets.

This patch is similar to that provided by Jack Morgenstein.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipath: ipath_skip_sge() can break if num_sge > 1

ipath_skip_sge() doesn't exactly duplicate the side effects of
ipath_copy_sge() if num_sge > 1 since it doesn't decrement ss->num_sge.
This could result in the sg_list being accessed out of bounds.
Since ipath_skip_sge() is almost always called with num_sge == 1,
the original "optimization" is almost never used.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipath: Fix ib_ipath driver to work with SRP

I am still working on a proposal to remove the phys_to_virt() calls
in the ib_ipath driver. In the mean time, this patch allows SRP
to work by fixing the R_Key check and conversion from IB address
to kernel virtual address. It also returns the correct page size
for FMRs.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipath: Fix a data corruption

This patch fixes a problem where certain error packets are passed
to the InfiniBand layer for processing even though the packet
actually was received with an error.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/mthca: Fix SRQ limit event range check

Mem-free HCAs always keep one spare SRQ WQE, so the SRQ limit cannot
be set beyond srq->max - 1.

Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

Merge branch 'master' into upstream-fixes

[libata] sata_promise: comment out duplicate PCI ID

This is just the for-RC fix. A 'TODO' command is added, describing
what's needed for the more-complete fix.

IB/uverbs: Fix lockdep warnings

Lockdep warns because uverbs is trying to take uobj->mutex when it
already holds that lock. This is because there are really multiple
types of uobjs even though all of their locks are initialized in
common code.

Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/uverbs: Fix unlocking in error paths

ib_uverbs_create_ah() and ib_uverbs_create_srq() did not release the
PD's read lock in their error paths, which lead to deadlock when
destroying the PD.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

[PATCH] Cpuset: fix ABBA deadlock with cpu hotplug lock

Fix ABBA deadlock between lock_cpu_hotplug() and the cpuset
callback_mutex lock.

It only happens on cpu_exclusive cpusets, due to the dynamic
sched domain code trying to take the cpu hotplug lock inside
the cpuset callback_mutex lock.

This bug has apparently been here for several months, but didn't
get hit until the right customer load on a large system.

This fix appears right from inspection, but it will take a few
more days running it on that customers workload to be confident
we nailed it.  We don't have any other reproducible test case.

The cpu_hotplug_lock() tends to cover large runs of code.
The other places that hold both that lock and the cpuset callback
mutex lock always nest the cpuset lock inside the hotplug lock.
This place tries to do the reverse, risking an ABBA deadlock.

This is in the cpuset_rmdir() code, where we:
  * take the callback_mutex lock
  * mark the cpuset CS_REMOVED
  * call update_cpu_domains for cpu_exclusive cpusets
  * in that call, take the cpu_hotplug lock if the
    cpuset is marked for removal.

Thanks to Jack Steiner for identifying this deadlock.

The fix is to tear down the dynamic sched domain before we grab
the cpuset callback_mutex lock.  This way, the two locks are
serialized, with the hotplug lock taken and released before
trying for the cpuset lock.

I suspect that this bug was introduced when I changed the
cpuset locking from one lock to two.  The dynamic sched domain
dependency on cpu_exclusive cpusets and its hotplug hooks were
added to this code earlier, when cpusets had only a single lock.
It may well have been fine then.

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

cpu hotplug: simplify and hopefully fix locking

The CPU hotplug locking was quite messy, with a recursive lock to
handle the fact that both the actual up/down sequence wanted to
protect itself from being re-entered, but the callbacks that it
called also tended to want to protect themselves from CPU events.

This splits the lock into two (one to serialize the whole hotplug
sequence, the other to protect against the CPU present bitmaps
changing). The latter still allows recursive usage because some
subsystems (ondemand policy for cpufreq at least) had already gotten
too used to the lax locking, but the locking mistakes are hopefully
now less fundamental, and we now warn about recursive lock usage
when we see it, in the hope that it can be fixed.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[cpufreq] ondemand: make shutdown sequence more robust

Shutting down the ondemand policy was fraught with potential
problems, causing issues for SMP suspend (which wants to hot-
unplug) all but the last CPU.

This should fix at least the worst problems (divide-by-zero
and infinite wait for the workqueue to shut down).

Signed-off-by: Linus Torvalds <torvalds@osdl.org>

Merge /pub/scm/linux/kernel/git/davem/net-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
  [TIPC]: Removing useless casts
  [IPV4]: Fix nexthop realm dumping for multipath routes
  [DUMMY]: Avoid an oops when dummy_init_one() failed
  [IFB] After ifb_init_one() failed, i is increased. Decrease
  [NET]: Fix reversed error test in netif_tx_trylock
  [MAINTAINERS]: Mark LAPB as Oprhan.
  [NET]: Conversions from kmalloc+memset to k(z|c)alloc.
  [NET]: sun happymeal, little pci cleanup
  [IrDA]: Use alloc_skb() in IrDA TX path
  [I/OAT]: Remove pci_module_init() from Intel I/OAT DMA engine
  [I/OAT]: net/core/user_dma.c should #include <net/netdma.h>
  [SCTP]: ADDIP: Don't use an address as source until it is ASCONF-ACKed
  [SCTP]: Set chunk->data_accepted only if we are going to accept it.
  [SCTP]: Verify all the paths to a peer via heartbeat before using them.
  [SCTP]: Unhash the endpoint in sctp_endpoint_free().
  [SCTP]: Check for NULL arg to sctp_bucket_destroy().
  [PKT_SCHED] netem: Fix slab corruption with netem (2nd try)
  [WAN]: Converted synclink drivers to use netif_carrier_*()
  [WAN]: Cosmetic changes to N2 and C101 drivers
  [WAN]: Added missing netif_dormant_off() to generic HDLC
  ...

[TIPC]: Removing useless casts

Removing useless casts

Signed-off-by: Panagiotis Issaris <takis@issaris.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV4]: Fix nexthop realm dumping for multipath routes

Routing realms exist per nexthop, but are only returned to userspace
for the first nexthop. This is due to the fact that iproute2 only
allows to set the realm for the first nexthop and the kernel refuses
multipath routes where only a single realm is present.

Dump all realms for multipath routes to enable iproute to correctly
display them.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>