git.monstr.eu Git - linux-2.6-microblaze.git/log

ext4: fix trimming starting with block 0 with small blocksize

When s_first_data_block is not zero (which happens e.g. when block size is 1KB)
and trim ioctl is called to start trimming from block 0, the math in
ext4_get_group_no_and_offset() overflows. The overall result is that ioctl
returns EINVAL which is kind of unexpected and we probably don't want
userspace tools to bother with internal details of filesystem structure.
So just silently increase starting offset (and shorten length) when starting
block is below s_first_data_block.

CC: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: revert buggy trim overflow patch

This reverts commit 4f531501e44: ext4: fix possible overflow in
ext4_trim_fs()

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: don't pass entire map to check_eofblocks_fl

Since check_eofblocks_fl() only uses the m_lblk portion of the map
structure, we may as well pass that directly, rather than passing the
entire map, which IMHO obfuscates what parameters check_eofblocks_fl()
cares about. Not a big deal, but seems tidier and less confusing, to
me.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: fix memory leak in ext4_free_branches

Commit 40389687 moved a call to ext4_forget() out of
ext4_free_branches and let ext4_free_blocks() handle calling
bforget(). But that change unfortunately did not replace the call to
ext4_forget() with brelse(), which was needed to drop the in-use count
of the indirect block's buffer head, which lead to a memory leak when
deleting files that used indirect blocks. Fix this.

Thanks to Hugh Dickins for pointing this out.

Cc: stable@kernel.org
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: remove ext4_mb_return_to_preallocation()

This function was never implemented, except for a BUG_ON which was
tripping when ext4 is run without a journal. The problem is that
although the comment asserts that "truncate (which is the only way to
free block) discards all preallocations", ext4_free_blocks() is also
called in various error recovery paths when blocks have been
allocated, but for various reasons, we were not able to use those data
blocks (for example, because we ran out of memory while trying to
manipulate the extent tree, or some other similar situation).

In addition to the fact that this function isn't implemented except
for the incorrect BUG_ON, the single caller of this function,
ext4_free_blocks(), doesn't use it all if the journal is enabled.

So remove the (stub) function entirely for now. If we decide it's
better to add it back, it's only going to be useful with a relatively
large number of code changes anyway.

Google-Bug-Id: 3236408

Cc: Jiaying Zhang <jiayingz@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: flush the i_completed_io_list during ext4_truncate

Ted first found the bug when running 2.6.36 kernel with dioread_nolock
mount option that xfstests #13 complained about wrong file size during fsck.
However, the bug exists in the older kernels as well although it is
somehow harder to trigger.

The problem is that ext4_end_io_work() can happen after we have truncated an
inode to a smaller size. Then when ext4_end_io_work() calls
ext4_convert_unwritten_extents(), we may reallocate some blocks that have
been truncated, so the inode size becomes inconsistent with the allocated
blocks.

The following patch flushes the i_completed_io_list during truncate to reduce
the risk that some pending end_io requests are executed later and convert
already truncated blocks to initialized.

Note that although the fix helps reduce the problem a lot there may still
be a race window between vmtruncate() and ext4_end_io_work(). The fundamental
problem is that if vmtruncate() is called without either i_mutex or i_alloc_sem
held, it can race with an ongoing write request so that the io_end request is
processed later when the corresponding blocks have been truncated.

Ted and I have discussed the problem offline and we saw a few ways to fix
the race completely:

a) We guarantee that i_mutex lock and i_alloc_sem write lock are both hold
whenever vmtruncate() is called. The i_mutex lock prevents any new write
requests from entering writeback and the i_alloc_sem prevents the race
from ext4_page_mkwrite(). Currently we hold both locks if vmtruncate()
is called from do_truncate(), which is probably the most common case.
However, there are places where we may call vmtruncate() without holding
either i_mutex or i_alloc_sem. I would like to ask for other people's
opinions on what locks are expected to be held before calling vmtruncate().
There seems a disagreement among the callers of that function.

b) We change the ext4 write path so that we change the extent tree to contain
the newly allocated blocks and update i_size both at the same time --- when
the write of the data blocks is completed.

c) We add some additional locking to synchronize vmtruncate() and
ext4_end_io_work(). This approach may have performance implications so we
need to be careful.

All of the above proposals may require more substantial changes, so
we may consider to take the following patch as a bandaid.

Signed-off-by: Jiaying Zhang <jiayingz@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: add error checking to calls to ext4_handle_dirty_metadata()

Call ext4_std_error() in various places when we can't bail out
cleanly, so the file system can be marked as in error.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: fix trimming of a single group

When ext4_trim_fs() is called to trim a part of a single group, the
logic will wrongly set last block of the interval to 'len' instead
of 'first_block + len'. Thus a shorter interval is possibly trimmed.
Fix it.

CC: Lukas Czerner <lczerner@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: fix uninitialized variable in ext4_register_li_request

fs/ext4/super.c: In function 'ext4_register_li_request':
fs/ext4/super.c:2936: warning: 'ret' may be used uninitialized in this function

It looks buggy to me, too.

Cc: Lukas Czerner <lczerner@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: dynamically allocate the jbd2_inode in ext4_inode_info as necessary

Replace the jbd2_inode structure (which is 48 bytes) with a pointer
and only allocate the jbd2_inode when it is needed --- that is, when
the file system has a journal present and the inode has been opened
for writing. This allows us to further slim down the ext4_inode_info
structure.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: drop i_state_flags on architectures with 64-bit longs

We can store the dynamic inode state flags in the high bits of
EXT4_I(inode)->i_flags, and eliminate i_state_flags. This saves 8
bytes from the size of ext4_inode_info structure, which when
multiplied by the number of the number of in the inode cache, can save
a lot of memory.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: reorder ext4_inode_info structure elements to remove unneeded padding

By reordering the elements in the ext4_inode_info structure, we can
reduce the padding needed on an x86_64 system by 16 bytes.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: drop ec_type from the ext4_ext_cache structure

We can encode the ec_type information by using ee_len == 0 to denote
EXT4_EXT_CACHE_NO, ee_start == 0 to denote EXT4_EXT_CACHE_GAP, and if
neither is true, then the cache type must be EXT4_EXT_CACHE_EXTENT.
This allows us to reduce the size of ext4_ext_inode by another 8
bytes. (ec_type is 4 bytes, plus another 4 bytes of padding)

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: use ext4_lblk_t instead of sector_t for logical blocks

This fixes a number of places where we used sector_t instead of
ext4_lblk_t for logical blocks, which for ext4 are still 32-bit data
types. No point wasting space in the ext4_inode_info structure, and
requiring 64-bit arithmetic on 32-bit systems, when it isn't
necessary.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: replace i_delalloc_reserved_flag with EXT4_STATE_DELALLOC_RESERVED

Remove the short element i_delalloc_reserved_flag from the
ext4_inode_info structure and replace it a new bit in i_state_flags.
Since we have an ext4_inode_info for every ext4 inode cached in the
inode cache, any savings we can produce here is a very good thing from
a memory utilization perspective.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: fix 32bit overflow in ext4_ext_find_goal()

ext4_ext_find_goal() returns an ideal physical block number that the block
allocator tries to allocate first. However, if a required file offset is
smaller than the existing extent's one, ext4_ext_find_goal() returns
a wrong block number because it may overflow at
"block - le32_to_cpu(ex->ee_block)". This patch fixes the problem.

ext4_ext_find_goal() will also return a wrong block number in case
a file offset of the existing extent is too big. In this case,
the ideal physical block number is fixed in ext4_mb_initialize_context(),
so it's no problem.

reproduce:
# dd if=/dev/zero of=/mnt/mp1/tmp bs=127M count=1 oflag=sync
# dd if=/dev/zero of=/mnt/mp1/file bs=512K count=1 seek=1 oflag=sync
# filefrag -v /mnt/mp1/file
Filesystem type is: ef53
File size of /mnt/mp1/file is 1048576 (256 blocks, blocksize 4096)
ext logical physical expected length flags
   0     128    67456             128 eof
/mnt/mp1/file: 2 extents found
# rm -rf /mnt/mp1/tmp
# echo $((512*4096)) > /sys/fs/ext4/loop0/mb_stream_req
# dd if=/dev/zero of=/mnt/mp1/file bs=512K count=1 oflag=sync conv=notrunc

result (linux-2.6.37-rc2 + ext4 patch queue):
# filefrag -v /mnt/mp1/file
Filesystem type is: ef53
File size of /mnt/mp1/file is 1048576 (256 blocks, blocksize 4096)
ext logical physical expected length flags
   0       0    33280             128
   1     128    67456    33407    128 eof
/mnt/mp1/file: 2 extents found

result(apply this patch):
# filefrag -v /mnt/mp1/file
Filesystem type is: ef53
File size of /mnt/mp1/file is 1048576 (256 blocks, blocksize 4096)
ext logical physical expected length flags
   0       0    66560             128
   1     128    67456    66687    128 eof
/mnt/mp1/file: 2 extents found

Signed-off-by: Kazuya Mio <k-mio@sx.jp.nec.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: add more error checks to ext4_mkdir()

Check return value of ext4_journal_get_write_access,
ext4_journal_dirty_metadata and ext4_mark_inode_dirty. Move brelse()
under 'out_stop' to release bh properly in case of journal error.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: ext4_ext_migrate should use NULL not 0

ext4_ext_migrate() calls ext4_new_inode() and passes 0 instead of a pointer
to a struct qstr. This patch uses NULL, to make it obvious to the caller
that this was a pointer.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Use ext4_error_file() to print the pathname to the corrupted inode

Where the file pointer is available, use ext4_error_file() instead of
ext4_error_inode().

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: use IS_ERR() to check for errors in ext4_error_file

d_path() returns an ERR_PTR and it doesn't return NULL. This is in
ext4_error_file() and no one actually calls ext4_error_file().

Signed-off-by: Dan Carpenter <error27@gmail.com>

ext4: test the correct variable in ext4_init_pageio()

This is a copy and paste error. The intent was to check
"io_page_cachep". We tested "io_page_cachep" earlier.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext2: remove dead code in ext2_xattr_get

Reviewed-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Wang Sheng-Hui <crosslonelyover@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext2,ext3,ext4: clarify comment for extN_xattr_set_handle

Signed-off-by: Wang Sheng-Hui <crosslonelyover@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: clean up ext4_xattr_list()'s error code checking and return strategy

Any time you see code that tries to add error codes together, you
should want to claw your eyes out...

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: remove warning message from ext4_issue_discard helper

ext4_issue_discard is supposed to be helper for calling discard, however
in case that underlying device does not support discard it prints out
the warning message and clears the DISCARD t_mount_opt flag. Since it
can be (and is) used by others, it should not do anything and let the
caller to handle the error case.

This commit removes warning message and flag setting from
ext4_issue_discard and use it just in place where it is really needed
(release_blocks_on_commit). FITRIM ioctl should not set any flags nor it
should print out warning messages, so get rid of the warning as well.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>

ext4: fix possible overflow in ext4_trim_fs()

When determining last group through ext4_get_group_no_and_offset() the
result may be wrong in cases when range->start and range-len are too
big, because it may overflow when summing up those two numbers.

Fix that by checking range->len and limit its value to
ext4_blocks_count(). This commit was tested by myself with expected
result.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>

ext4: Add error checking to kmem_cache_alloc() call in ext4_free_blocks()

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Use printf extension %pV

Using %pV reduces the number of printk calls and eliminates any
possible message interleaving from other printk calls.

In function __ext4_grp_locked_error also added KERN_CONT to some
printks.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Use vzalloc in ext4_fill_flex_info()

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: zero out nanosecond timestamps for small inodes

When nanosecond timestamp resolution isn't supported on an ext4
partition (inode size = 128), stat() appears to be returning
uninitialized garbage in the nanosecond component of timestamps.

EXT4_INODE_GET_XTIME should zero out tv_nsec when EXT4_FITS_IN_INODE
evaluates to false.

Reported-by: Jordan Russell <jr-list-2010@quo.to>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: optimize ext4_check_dir_entry() with unlikely() annotations

This function gets called a lot for large directories, and the answer
is almost always "no, no, there's no problem". This means using
unlikely() is a good thing.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: use kmem_cache_zalloc() in ext4_init_io_end()

Use advantage of kmem_cache_zalloc() to remove a memset() call in
ext4_init_io_end() and save a few bytes.

Before:
[jj@dragon linux-2.6]$ size fs/ext4/page-io.o
    text    data     bss     dec     hex filename
    3016       0     624    3640     e38 fs/ext4/page-io.o
After:
[jj@dragon linux-2.6]$ size fs/ext4/page-io.o
    text    data     bss     dec     hex filename
    3000       0     624    3624     e28 fs/ext4/page-io.o

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Remove redundant unlikely()

IS_ERR() already implies unlikely(), so it can be omitted here.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

jbd2: simplify return path of journal_init_common

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

jbd2: move debug message into debug #ifdef

This is a port to jbd2 of a patch which Namhyung Kim <namhyung@gmail.com>
originally made to fs/jbd.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

jbd2: remove unnecessary goto statement

This is a port to jbd2 of a patch which Namhyung Kim <namhyung@gmail.com>
originally made to fs/jbd.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

jbd2: use offset_in_page() instead of manual calculation

This is a port to jbd2 of a patch which Namhyung Kim <namhyung@gmail.com>
originally made to fs/jbd.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

jbd2: Fix a debug message in do_get_write_access()

'buffer_head' should be 'journal_head'

This is a port of a patch which Namhyung Kim <namhyung@gmail.com> made
to fs/jbd to jbd2.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

jbd2: Use pr_notice_ratelimited() in journal_alloc_journal_head()

We had an open-coded version of printk_ratelimited(); use the provided
abstraction to make the code cleaner and easier to understand.

Based on a similar patch for fs/jbd from Namhyung Kim <namhyung@gmail.com>

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Use pr_warning_ratelimited() instead of printk_ratelimit()

printk_ratelimit() is deprecated since it is a global instead of a
per-printk ratelimit.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Fix up comments in inode.c

This fixes up some broken argument descriptions that Namhyung Kim had
originally submitted for ext3. This fixes the comments that were
still applicable in ext4.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Add second mount options field since the s_mount_opt is full up

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Move struct ext4_mount_options from ext4.h to super.c

Move the ext4_mount_options structure definition from ext4.h, since it
is only used in super.c.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

ext4: Simplify the usage of clear_opt() and set_opt() macros

Change clear_opt() and set_opt() to take a superblock pointer instead
of a pointer to EXT4_SB(sb)->s_mount_opt. This makes it easier for us
to support a second mount option field.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

Linux 2.6.37-rc6

Merge git://git./linux/kernel/git/herbert/crypto-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: ghash-intel - ghash-clmulni-intel_glue needs err.h

Merge branch 'for_linus' of git://git./linux/kernel/git/tytso/ext4

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: fix typo which broke '..' detection in ext4_find_entry()
ext4: Turn off multiple page-io submission by default

xen: Provide a variant of __RING_SIZE() that is an integer constant expression

Without this, gcc 4.5 won't compile xen-netfront and xen-blkfront, where
this is being used to specify array sizes.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: David Miller <davem@davemloft.net>
Cc: Stable Kernel <stable@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MAINTAINERS: update MSM git tree

The MSM main git tree has changed over to this new address.

Signed-off-by: Daniel Walker <dwalker@codeaurora.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

install_special_mapping skips security_file_mmap check.

The install_special_mapping routine (used, for example, to setup the
vdso) skips the security check before insert_vm_struct, allowing a local
attacker to bypass the mmap_min_addr security restriction by limiting
the available pages for special mappings.

bprm_mm_init() also skips the check, and although I don't think this can
be used to bypass any restrictions, I don't see any reason not to have
the security check.

  $ uname -m
  x86_64
  $ cat /proc/sys/vm/mmap_min_addr
  65536
  $ cat install_special_mapping.s
  section .bss
      resb BSS_SIZE
  section .text
      global _start
      _start:
          mov     eax, __NR_pause
          int     0x80
  $ nasm -D__NR_pause=29 -DBSS_SIZE=0xfffed000 -f elf -o install_special_mapping.o install_special_mapping.s
  $ ld -m elf_i386 -Ttext=0x10000 -Tbss=0x11000 -o install_special_mapping install_special_mapping.o
  $ ./install_special_mapping &
  [1] 14303
  $ cat /proc/14303/maps
  0000f000-00010000 r-xp 00000000 00:00 0                                  [vdso]
  00010000-00011000 r-xp 00001000 00:19 2453665                            /home/taviso/install_special_mapping
  00011000-ffffe000 rwxp 00000000 00:00 0                                  [stack]

It's worth noting that Red Hat are shipping with mmap_min_addr set to
4096.

Signed-off-by: Tavis Ormandy <taviso@google.com>
Acked-by: Kees Cook <kees@ubuntu.com>
Acked-by: Robert Swiecki <swiecki@google.com>
[ Changed to not drop the error code - akpm ]
Reviewed-by: James Morris <jmorris@namei.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

crypto: ghash-intel - ghash-clmulni-intel_glue needs err.h

Add missing header file:

arch/x86/crypto/ghash-clmulni-intel_glue.c:256: error: implicit declaration of function 'IS_ERR'
arch/x86/crypto/ghash-clmulni-intel_glue.c:257: error: implicit declaration of function 'PTR_ERR'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Merge branch 'for-linus' of git://git./linux/kernel/git/tj/wq

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: It is likely that WORKER_NOT_RUNNING is true
  MAINTAINERS: Add workqueue entry
  workqueue: check the allocation of system_unbound_wq

Merge branch 'for-linus' of git://neil.brown.name/md

* 'for-linus' of git://neil.brown.name/md:
  md: protect against NULL reference when waiting to start a raid10.
  md: fix bug with re-adding of partially recovered device.
  md: fix possible deadlock in handling flush requests.
  md: move code in to submit_flushes.
  md: remove handling of flush_pending in md_submit_flush_data

dw_spi: Fix missing final read in some polling situations

There is a possibility that the last word of a transaction will be lost
if data is not ready. Re-read in poll_transfer() to solve this issue
when poll_mode is enabled.

Verified on SPI touch screen device.

Signed-off-by: Major Lee <major_lee@wistron.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

i2c_intel_mid: Fix slash in sysfs name

This gets caught by the new sanity check code. Instead of the slash use a
different symbol. This was originally found by Major Lee who proposed a
rather more complex patch which changed the name according to the chip
type.

On the basis that we are in a late -rc and making Linus grumpy isn't always
a good idea (however fun) this is a simple alternative.

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ext4: fix typo which broke '..' detection in ext4_find_entry()

There should be a check for the NUL character instead of '0'.

Fortunately the only thing that cares about this is NFS serving, which
is why we didn't notice this in the merge window testing.

Reported-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
Signed-off-by: Aaro Koskinen <aaro.koskinen@nokia.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

Merge branch 'sh-fixes-for-linus' of git://git./linux/kernel/git/lethal/sh-2.6

* 'sh-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
sh: wire up accept4 syscall (non-multiplexed path)
sh: Enable deprecated IRQ chip APIs for MFD and GPIOLIB drivers.

Merge branch 'omap-fixes-for-linus' of git://git./linux/kernel/git/tmlind/linux-omap-2.6

* 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6:
  OMAP2: PRCM: fix some SHIFT macros that were actually bitmasks
  OMAP2+: PM/serial: fix console semaphore acquire during suspend
  OMAP1: SRAM: fix size for OMAP1611 SoCs
  arm: omap2: io: fix clk_get() error check
  arm: plat-omap: counter_32k: use IS_ERR() instead of NULL check
  omap: nand: remove hardware ECC as default
  omap: zoom: wl1271 slot is MMC_CAP_POWER_OFF_CARD
  omap: PM debug: fix wake-on-timer debugfs dependency

Merge master.kernel.org:/home/rmk/linux-2.6-arm

* master.kernel.org:/home/rmk/linux-2.6-arm:
  ARM: 6535/1: V6 MPCore v6_dma_inv_range and v6_dma_flush_range RWFO fix
  ARM: 6534/1: Make CONFIG_FPE_NWFPE depend on !CONFIG_THUMB2_KERNEL
  ARM: 6533/1: Thumb-2: Make CONFIG_THUMB2_KERNEL depend on !CPU_V6
  Change bcmring Maintainer list.
  ARM: Update mach-types
  ARM: 6528/1: Use CTR for the I-cache line size on ARMv7
  ARM: 6527/1: Use CTR instead of CCSIDR for the D-cache line size on ARMv7
  ARM: pxa/palm: fix ifdef around gen_nand driver registration
  ARM: pxa: fix pxa2xx-flash section mismatch
  ARM: mmp2: remove not used clk_rtc

Merge git://git./linux/kernel/git/davem/sparc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  sparc: Write to prom console using indirect buffer.
  sparc: Delete prom_*getchar().
  sparc: Pass buffer pointer all the way down to prom_{get,put}char().
  sparc: Do not export prom_nb{get,put}char().
  sparc64: Delete prom_setcallback().
  sparc64: Unexport prom_service_exists().
  sparc: Kill prom devops_{32,64}.c
  sparc: Remove prom_pathtoinode()
  sparc64: Delete prom_puts() unused.
  SPARC/LEON: removed constant timer initialization as if HZ=100, now it reflects the value of HZ

Merge git://git./linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (75 commits)
  pppoe.c: Fix kernel panic caused by __pppoe_xmit
  WAN: Fix a TX IRQ causing BUG() in PC300 and PCI200SYN drivers.
  bnx2x: Advance a version number to 1.60.01-0
  bnx2x: Fixed a compilation warning
  bnx2x: LSO code was broken on BE platforms
  qlge: Fix deadlock when cancelling worker.
  net: fix skb_defer_rx_timestamp()
  cxgb4vf: Ingress Queue Entry Size needs to be 64 bytes
  phy: add the IC+ IP1001 driver
  atm: correct sysfs 'device' link creation and parent relationships
  MAINTAINERS: remove me from tulip
  SCTP: Fix SCTP_SET_PEER_PRIMARY_ADDR to accpet v4mapped address
  enic: Bug Fix: Pass napi reference to the isr that services receive queue
  ipv6: fix nl group when advertising a new link
  connector: add module alias
  net: Document the kernel_recvmsg() function
  r8169: Fix runtime power management
  hso: IP checksuming doesn't work on GE0301 option cards
  xfrm: Fix xfrm_state_migrate leak
  net: Convert netpoll blocking api in bonding driver to be a counter
  ...

Merge git://git./linux/kernel/git/jejb/scsi-rc-fixes-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
  [SCSI] hpsa: fix redefinition of PCI_DEVICE_ID_CISSF
  [SCSI] qla2xxx: Update version number to 8.03.05-k0.
  [SCSI] qla2xxx: Properly set the return value in qla2xxx_eh_abort function.
  [SCSI] qla2xxx: Correct issue where NPIV-config data was not being allocated for 82xx parts.
  [SCSI] qla2xxx: Change MSI initialization from using incorrect request_irq parameter.
  [SCSI] qla2xxx: Populate Command Type 6 LUN field properly.
  [SCSI] zfcp: Issue FCP command without holding SCSI host_lock
  [SCSI] zfcp: Prevent usage w/o holding a reference
  [SCSI] zfcp: No ERP escalation on gpn_ft eval
  [SCSI] zfcp: Correct false abort data assignment.
  [SCSI] zfcp: Fix common FCP request reception
  [SCSI] Eliminate error handler overload of the SCSI serial number
  [SCSI] pmcraid: disable msix and expand device config entry
  [SCSI] bsg: correct fault if queue object removed while dev_t open
  [SCSI] osd: checking NULL instead of ERR_PTR()

Merge branch 'for_linus' of git://git./linux/kernel/git/jwessel/linux-2.6-kgdb

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb:
kgdboc,input: Fix regression with keyboard release key and early debugging

Merge branch 'release' of git://git./linux/kernel/git/lenb/linux-acpi-2.6

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
  ACPI / PM: Do not save/restore NVS on Sony Vaio VGN-NW130D
  ACPI/HEST: adjust section selection
  ACPI: eliminate unused variable warning for !ACPI_SLEEP
  ACPI/PNP: avoid section mismatch warning
  ACPI thermal: remove two unused functions
  ACPI: fix a section mismatch
  ACPI, APEI, use raw spinlock in ERST
  ACPI: video: fix build for CONFIG_ACPI=n
  ACPI: video: fix build for VIDEO_OUTPUT_CONTROL=n
  ACPI: fix allowing to add/remove multiple _OSI strings
  acpi: fix _OSI string setup regression
  ACPI: EC: Add another dmi match entry for MSI hardware
  ACPI battery: update status upon sysfs query
  ACPI ac: update AC status upon sysfs query
  ACPI / PM: Do not refcount power resources that can't be turned on
  ACPI / PM: Check device state before refcounting power resources

Merge branch 'idle-release' of git://git./linux/kernel/git/lenb/linux-idle-2.6

* 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6:
intel_idle: recognize ARAT on WSM-EX

ARM: 6535/1: V6 MPCore v6_dma_inv_range and v6_dma_flush_range RWFO fix

Cache ownership must be acquired by reading/writing data from the
cache line to make cache operation have the desired effect on the
SMP MPCore CPU. However, the ownership is never acquired in the
v6_dma_inv_range function when cleaning the first line and
flushing the last one, in case the address is not aligned
to D_CACHE_LINE_SIZE boundary.
Fix this by reading/writing data if needed, before performing
cache operations.
While at it, fix v6_dma_flush_range to prevent RWFO outside
the buffer.

Cc: stable@kernel.org
Signed-off-by: Valentine Barshak <vbarshak@mvista.com>
Signed-off-by: George G. Davis <gdavis@mvista.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

ARM: 6534/1: Make CONFIG_FPE_NWFPE depend on !CONFIG_THUMB2_KERNEL

Because the nwfpe support is unlikely to be used on new platforms
and requires CONFIG_OABI_COMPAT, which is not generally used with
ARMv7+, we shouldn't expect to build nwfpe support into a Thumb-2
kernel.

At present, nwfpe contains assembly code which isn't Thumb-2
compatible, and for now it doesn't appear useful to port this
code.

All ARMv7-A/R platforms necessarily have VFPv3 hardware floating-
point natively, making emulation unnecessary.

Signed-off-by: Dave Martin <dave.martin@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

ARM: 6533/1: Thumb-2: Make CONFIG_THUMB2_KERNEL depend on !CPU_V6

This makes sense, because Thumb-2 code can't execute on plain
ARMv6 processors.

This will avoid accidentally configuring a broken kernel where the
config otherwise would allow multiple architecture versions to
coexist in the same kernel.

Not adding !CPU_V5 etc., because the chance of anyone trying to
put v5 and v7 in the same kernel is low, and I'm not aware of
any mach which can do this. These could be added later if it
matters.

Note that the rules may need to be refined if support for the
ARM1156J(F)-S processor is later added to the kernel, since this
processor supports the rare ARMv6T2 extensions, which add support
for Thumb-2 and a few other ARMv7 features.

Signed-off-by: Dave Martin <dave.martin@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

Change bcmring Maintainer list.

I am Jiandong Zheng working on BCMRING in Broadcom Canada Ltd. I am
replacing Leo Chen (leochen@broadcom.com) as "ARM/BCMRING ARM
ARCHITECTURE" and "ARM/BCMRING MTD NAND DRIVER" maintainer from
Broadcom as he is no longer the maintainer of these components.

Signed-off-by: Jiandong Zheng <jdzheng@broadcom.com>
Acked-by: Scott Branden <sbranden@broadcom.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

Merge branch 'fbdev-fixes-for-linus' of git://git./linux/kernel/git/lethal/fbdev-2.6

* 'fbdev-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-2.6:
fbdev: Fix fb_find_nearest_mode refresh comparison

Merge branch 'hwmon-for-linus' of git://git./linux/kernel/git/groeck/staging

* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/staging:
hwmon: (ltc4215) make sysfs file match the alarm cause

Merge branch 'fixes' of git://git./linux/kernel/git/djbw/async_tx

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
  dmaengine: at_hdmac: fix buffer transfer size specification
  fsldma: fix issue of slow dma
  dmaengine i.MX SDMA: initialize on module_init
  dma : EG20T PCH: Fix miss-setting DMA descriptor
  intel_mid_dma: fix section mismatch warnings
  dmaengine: imx-sdma: fix bug in buffer descriptor initialization
  drivers/dma/ppc4xx: Use printf extension %pR for struct resource
  drivers/dma/ioat: Use the ccflag-y instead of EXTRA_CFLAGS
  drivers/dma/: Use the ccflag-y instead of EXTRA_CFLAGS
  dma: intel_mid_dma: fix double free on mid_setup_dma error path
  dma: imx-dma: fix imxdma_probe error path

Merge branch 'for-linus' of git://git./linux/kernel/git/roland/infiniband

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/uverbs: Handle large number of entries in poll CQ

Merge branch 'fixes' of git://git./linux/kernel/git/ieee1394/linux1394-2.6

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
firewire: ohci: fix regression with Agere FW643 rev 06, disable MSI
firewire: ohci: fix regression with VIA VT6315, disable MSI

Merge branch 'for-linus' of git://git./linux/kernel/git/lrg/voltage-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6:
  regulator: tps6586x: correct register table
  regulator: tps6586x: Handle both enable reg/bits being the same
  regulator: tps6586x: Fix TPS6586X_DVM to store goreg/bit
  regulator: tps6586x: Add missing bit mask generation

Merge branch 'for-linus' of git://git./linux/kernel/git/tiwai/sound-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  ALSA: HDA: Quirk for Dell Vostro 320 to make microphone work
  ALSA: hda - Reset sample sizes and max bitrates when reading ELD
  ALSA: hda - Always allow basic audio irrespective of ELD info
  ALSA: hda - Do not wrongly restrict min_channels based on ELD
  ASoC: Correct WM8962 interrupt mask register read
  ASoC: WM8580: Debug BCLK and sample size
  ASoC: Fix resource leak if soc_register_ac97_dai_link failed
  ASoC: Hold client_mutex while calling snd_soc_instantiate_cards()
  ASoC: Fix swap of left and right channels for WM8993/4 speaker boost gain
  ASoC: Fix off by one error in WM8994 EQ register bank size
  ALSA: hda: Use position_fix=1 for Acer Aspire 5538 to enable capture on internal mic
  ALSA: hda - Enable jack sense for Thinkpad Edge 13
  ALSA: hda - Fix ThinkPad T410[s] docking station line-out
  ALSA: hda: Use model=lg quirk for LG P1 Express to enable playback and capture

Merge branch 'for-linus' of git://git./linux/kernel/git/bp/bp

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
  amd64_edac: Fix interleaving check
  EDAC: Correct MiB_TO_PAGES() macro
  EDAC: Fix workqueue-related crashes

Merge branch 'drm-fixes' of git://git./linux/kernel/git/airlied/drm-2.6

* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
  drm/radeon/kms: don't apply 7xx HDP flush workaround on AGP
  drm: use after free in drm_queue_vblank_event()
  drm/kms: remove spaces from connector names (v2)

Merge branch 'hwmon-for-linus' of git://git./linux/kernel/git/jdelvare/staging

* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  hwmon: (adm1026) Allow 1 as a valid divider value
  hwmon: (adm1026) Fix setting fan_div
  hwmon: (it87) Fix manual fan speed control on IT8721F

ext4: Turn off multiple page-io submission by default

Jon Nelson has found a test case which causes postgresql to fail with
the error:

psql:t.sql:4: ERROR: invalid page header in block 38269 of relation base/16384/16581

Under memory pressure, it looks like part of a file can end up getting
replaced by zero's. Until we can figure out the cause, we'll roll
back the change and use block_write_full_page() instead of
ext4_bio_write_page(). The new, more efficient writing function can
be used via the mount option mblk_io_submit, so we can test and fix
the new page I/O code.

To reproduce the problem, install postgres 8.4 or 9.0, and pin enough
memory such that the system just at the end of triggering writeback
before running the following sql script:

begin;
create temporary table foo as select x as a, ARRAY[x] as b FROM
generate_series(1, 10000000 ) AS x;
create index foo_a_idx on foo (a);
create index foo_b_idx on foo USING GIN (b);
rollback;

If the temporary table is created on a hard drive partition which is
encrypted using dm_crypt, then under memory pressure, approximately
30-40% of the time, pgsql will issue the above failure.

This patch should fix this problem, and the problem will come back if
the file system is mounted with the mblk_io_submit mount option.

Reported-by: Jon Nelson <jnelson@jamponi.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

Merge branch 'for-2.6.37' of git://linux-nfs.org/~bfields/linux

* 'for-2.6.37' of git://linux-nfs.org/~bfields/linux:
nfsd: Fix possible BUG_ON firing in set_change_info
sunrpc: prevent use-after-free on clearing XPT_BUSY

Merge git://git./linux/kernel/git/mason/btrfs-unstable

* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
  Btrfs: prevent RAID level downgrades when space is low
  Btrfs: account for missing devices in RAID allocation profiles
  Btrfs: EIO when we fail to read tree roots
  Btrfs: fix compiler warnings
  Btrfs: Make async snapshot ioctl more generic
  Btrfs: pwrite blocked when writing from the mmaped buffer of the same page
  Btrfs: Fix a crash when mounting a subvolume
  Btrfs: fix sync subvol/snapshot creation
  Btrfs: Fix page leak in compressed writeback path
  Btrfs: do not BUG if we fail to remove the orphan item for dead snapshots
  Btrfs: fixup return code for btrfs_del_orphan_item
  Btrfs: do not do fast caching if we are allocating blocks for tree_root
  Btrfs: deal with space cache errors better
  Btrfs: fix use after free in O_DIRECT

Merge branch 'for-linus' of git://git./linux/kernel/git/mszeredi/fuse

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: verify ioctl retries
fuse: fix ioctl when server is 32bit

Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs

* 'for-linus' of git://oss.sgi.com/xfs/xfs:
xfs: log timestamp changes to the source inode in rename

Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: fix ioctl magic
  ceph: Behave better when handling file lock replies.
  ceph: pass lock information by struct file_lock instead of as individual params.
  ceph: Handle file locks in replies from the MDS.
  ceph: avoid possible null deref in readdir after dir llseek

Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6

* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  NFS: Fix panic after nfs_umount()
  nfs: remove extraneous and problematic calls to nfs_clear_request
  nfs: kernel should return EPROTONOSUPPORT when not support NFSv4
  NFS: Fix fcntl F_GETLK not reporting some conflicts
  nfs: Discard ACL cache on mode update
  NFS: Readdir cleanups
  NFS: nfs_readdir_search_for_cookie() don't mark as eof if cookie not found
  NFS: Fix a memory leak in nfs_readdir
  Call the filesystem back whenever a page is removed from the page cache
  NFS: Ensure we use the correct cookie in nfs_readdir_xdr_filler

Merge git://git./linux/kernel/git/sfrench/cifs-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  cifs: remove bogus remapping of error in cifs_filldir()
  cifs: allow calling cifs_build_path_to_root on incomplete cifs_sb
  cifs: fix check of error return from is_path_accessable
  cifs: remove Local_System_Name
  cifs: fix use of CONFIG_CIFS_ACL
  cifs: add attribute cache timeout (actimeo) tunable

workqueue: It is likely that WORKER_NOT_RUNNING is true

Running the annotate branch profiler on three boxes, including my
main box that runs firefox, evolution, xchat, and is part of the distcc farm,
showed this with the likelys in the workqueue code:

correct incorrect  %        Function                  File              Line
------- ---------  -        --------                  ----              ----
      96   996253  99 wq_worker_sleeping             workqueue.c          703
      96   996247  99 wq_worker_waking_up            workqueue.c          677

The likely()s in this case were assuming that WORKER_NOT_RUNNING will
most likely be false. But this is not the case. The reason is
(and shown by adding trace_printks and testing it) that most of the time
WORKER_PREP is set.

In worker_thread() we have:

worker_clr_flags(worker, WORKER_PREP);

[ do work stuff ]

worker_set_flags(worker, WORKER_PREP, false);

(that 'false' means not to wake up an idle worker)

The wq_worker_sleeping() is called from schedule when a worker thread
is putting itself to sleep. Which happens most of the time outside
of that [ do work stuff ].

The wq_worker_waking_up is called by the wakeup worker code, which
is also callod outside that [ do work stuff ].

Thus, the likely and unlikely used by those two functions are actually
backwards.

Remove the annotation and let gcc figure it out.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Tejun Heo <tj@kernel.org>

MAINTAINERS: Add workqueue entry

Signed-off-by: Tejun Heo <tj@kernel.org>

fbdev: Fix fb_find_nearest_mode refresh comparison

Refresh rate nearness is not calculated or reset when nearest resolution
changes.

This patch resets the refresh rate differential measurement whenever a
new nearest resolution is discovered. This fixes two error cases;
first, wherein the first mode's refresh rate differential is never
calculated and second, when the closest refresh rate from a previous
nearest resolution is erroneously preserved.

Signed-off-by: Andrew Kephart <andrew.kephart@alereon.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>

sh: wire up accept4 syscall (non-multiplexed path)

Signed-off-by: Carmelo Amoroso <carmelo.amoroso@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>

dmaengine: at_hdmac: fix buffer transfer size specification

Buffer transfer size is the number of transfers to be performed in
relation with the width of the _source_ interface.
So in the DMA_FROM_DEVICE case, it should be the register width that
should be taken into account.

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>

kgdboc,input: Fix regression with keyboard release key and early debugging

The commit 111c182340cd22e238ab1cc6564df336c6ebd7cb (kgdboc: reset
input devices (keyboards) when exiting debugger) introduced a
regression in early debugging such that you get a kernel oops on
continue (with the go command) if you boot a kernel with:

earlyprintk=vga ekgdboc=kbd kgdbwait

The restore kgdboc_restore_input() routine schedules work for the
purpose of sending key release events for any keys that were in the
depressed state prior to entering the kernel debugger. A simple fix
to the crash is to not invoke the schedule_work() if the kernel
system_state is anything other than SYSTEM_RUNNING.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Reviewed-by: Sergei Shtylyov <sshtylyov@mvista.com>

Merge branch 'bugzilla-23002' into release

ACPI / PM: Do not save/restore NVS on Sony Vaio VGN-NW130D

The saving of the NVS memory area during suspend and restoring it
during resume causes problems to appear on Sony Vaio VGN-NW130D, so
blacklist that machine to avoid those problems.

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=23002

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-and-tested-by: Adriano <adriano.vilela@yahoo.com>
Signed-off-by: Len Brown <len.brown@intel.com>

Btrfs: prevent RAID level downgrades when space is low

The extent allocator has code that allows us to fill
allocations from any available block group, even if it doesn't
match the raid level we've requested.

This was put in because adding a new drive to a filesystem
made with the default mkfs options actually upgrades the metadata from
single spindle dup to full RAID1.

But, the code also allows us to allocate from a raid0 chunk when we
really want a raid1 or raid10 chunk. This can cause big trouble because
mkfs creates a small (4MB) raid0 chunk for data and metadata which then
goes unused for raid1/raid10 installs.

The allocator will happily wander in and allocate from that chunk when
things get tight, which is not correct.

The fix here is to make sure that we provide duplication when the
caller has asked for it. It does all the dups to be any raid level,
which preserves the dup->raid1 upgrade abilities.

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: account for missing devices in RAID allocation profiles

When we mount in RAID degraded mode without adding a new device to
replace the failed one, we can end up using the wrong RAID flags for
allocations.

This results in strange combinations of block groups (raid1 in a raid10
filesystem) and corruptions when we try to allocate blocks from single
spindle chunks on drives that are actually missing.

The first device has two small 4MB chunks in it that mkfs creates and
these are usually unused in a raid1 or raid10 setup. But, in -o degraded,
the allocator will fall back to these because the mask of desired raid groups
isn't correct.

The fix here is to count the missing devices as we build up the list
of devices in the system. This count is used when picking the
raid level to make sure we continue using the same levels that were
in place before we lost a drive.

Signed-off-by: Chris Mason <chris.mason@oracle.com>

fsldma: fix issue of slow dma

Fixed fsl dma slow issue by initializing dma mode register with
bandwidth control. It boosts dma performance and should works
with 85xx board.

Signed-off-by: Forrest Shi <b29237@freescale.com>
Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Btrfs: EIO when we fail to read tree roots

If we just get a plain IO error when we read tree roots, the code
wasn't properly sending that error up the chain. This allowed mounts to
continue when they should failed, and allowed operations
on partially setup root structs. The end result was usually oopsen
on spinlocks that hadn't been spun up correctly.

Signed-off-by: Chris Mason <chris.mason@oracle.com>

hwmon: (ltc4215) make sysfs file match the alarm cause

The ltc4215 driver used the chip's "power good" status bit to provide
the power1_alarm file. This is wrong: the chip is really reporting the
status of one of the monitored voltages.

Change the sysfs file from power1_alarm to in2_min_alarm instead. This
matches the voltage that the chip is raising an alarm for.

Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>