Commit Graph

26812 Commits

Author SHA1 Message Date
Al Viro
2570ebbd1f switch kern_ipc_perm to umode_t
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:55:17 -05:00
Al Viro
0583fcc96b consolidate umode_t declarations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:55:17 -05:00
Al Viro
1bc94226d5 switch spu_create(2) to use of SYSCALL_DEFINE4, make it use umode_t
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:55:16 -05:00
Al Viro
df0a42837b switch mq_open() to umode_t 2012-01-03 22:55:16 -05:00
Al Viro
a85cfdaec9 switch miscdevice to umode_t
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:55:15 -05:00
Al Viro
52ef0c042b switch securityfs_create_file() to umode_t
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:55:13 -05:00
Al Viro
910f4ecef3 switch security_path_chmod() to umode_t
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:55:13 -05:00
Al Viro
49f0a07672 switch sys_chmod()/sys_fchmod()/sys_fchmodat() to umode_t
SYSCALLx magic should take care of things, according to Linus...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:55:12 -05:00
Al Viro
36fcb589e7 sysctl: use umode_t for table permissions
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:55:12 -05:00
Al Viro
8d334acdd2 switch is_sxid() to umode_t
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:55:11 -05:00
Al Viro
62bb109170 switch inode_init_owner() to umode_t
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:55:11 -05:00
Al Viro
632861f05a pohmelfs: propagate umode_t
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:55:07 -05:00
Al Viro
09208d150b shmem, ramfs: propagate umode_t, open-coded S_ISREG
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:55:07 -05:00
Al Viro
64f1426f3c sunrpc: propagate umode_t
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:55:04 -05:00
Al Viro
a5e7ed3287 cgroup: propagate mode_t
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:55:03 -05:00
Al Viro
8e0718924e reiserfs: propagate umode_t
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:55:00 -05:00
Al Viro
69b34f3ab3 ext3: propagate umode_t
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:54:58 -05:00
Al Viro
439475140b configfs: convert to umode_t
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:54:57 -05:00
Al Viro
f4ae40a6a5 switch debugfs to umode_t
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:54:56 -05:00
Al Viro
48176a973d switch sysfs_chmod_file() to umode_t
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:54:56 -05:00
Al Viro
d161a13f97 switch procfs to umode_t use
both proc_dir_entry ->mode and populating functions

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:54:56 -05:00
Al Viro
587a1f1659 switch ->is_visible() to returning umode_t
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:54:55 -05:00
Al Viro
9104e427f3 switch sysfs attr->mode to umode_t
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:54:55 -05:00
Al Viro
2c9ede55ec switch device_get_devnode() and ->devnode() to umode_t *
both callers of device_get_devnode() are only interested in lower 16bits
and nobody tries to return anything wider than 16bit anyway.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:54:55 -05:00
Al Viro
1a67aafb5f switch ->mknod() to umode_t
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:54:54 -05:00
Al Viro
4acdaf27eb switch ->create() to umode_t
vfs_create() ignores everything outside of 16bit subset of its
mode argument; switching it to umode_t is obviously equivalent
and it's the only caller of the method

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:54:53 -05:00
Al Viro
18bb1db3e7 switch vfs_mkdir() and ->mkdir() to umode_t
vfs_mkdir() gets int, but immediately drops everything that might not
fit into umode_t and that's the only caller of ->mkdir()...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:54:53 -05:00
Al Viro
8208a22bb8 switch sys_mknodat(2) to umode_t
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:54:52 -05:00
Al Viro
cf31e70d6c vfs: new helper - vfs_ustat()
... and bury user_get_super()/statfs_by_dentry() - they are
purely internal now.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:53:07 -05:00
Al Viro
2a79f17e4a vfs: mnt_drop_write_file()
new helper (wrapper around mnt_drop_write()) to be used in pair with
mnt_want_write_file().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:52:40 -05:00
Al Viro
8c9379e972 constify seq_file stuff
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:52:40 -05:00
Al Viro
79e801a906 vfs: make do_kern_mount() static
the only user outside of fs/namespace.c has died

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:52:39 -05:00
Al Viro
a5166169f9 vfs: convert fs_supers to hlist
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:52:39 -05:00
Al Viro
6c449c8dfe unexport put_mnt_ns(), make create_mnt_ns() static outright
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:52:37 -05:00
Al Viro
5ede7b1cfa pull manipulations of rpc_cred inside alloc_nfs_open_context()
No need to duplicate them in both callers; make it return
ERR_PTR(-ENOMEM) on allocation failure instead of NULL and
it'll be able to report rpc_lookup_cred() failures just
fine.  Callers are much happier that way...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:52:34 -05:00
Jan Kara
30e053248d security: Fix security_old_inode_init_security() when CONFIG_SECURITY is not set
Commit 1e39f384bb01 ("evm: fix build problems") makes the stub version
of security_old_inode_init_security() return 0 when CONFIG_SECURITY is
not set.

But that makes callers such as reiserfs_security_init() assume that
security_old_inode_init_security() has set name, value, and len
arguments properly - but security_old_inode_init_security() left them
uninitialized which then results in interesting failures.

Revert security_old_inode_init_security() to the old behavior of
returning EOPNOTSUPP since both callers (reiserfs and ocfs2) handle this
just fine.

[ Also fixed the S_PRIVATE(inode) case of the actual non-stub
  security_old_inode_init_security() function to return EOPNOTSUPP
  for the same reason, as pointed out by Mimi Zohar.

  It got incorrectly changed to match the new function in commit
  fb88c2b6cbb1: "evm: fix security/security_old_init_security return
  code".   - Linus ]

Reported-by: Jorge Bastos <mysql.jorge@decimal.pt>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-03 16:12:19 -08:00
Jan Kiszka
4d25a066b6 KVM: Don't automatically expose the TSC deadline timer in cpuid
Unlike all of the other cpuid bits, the TSC deadline timer bit is set
unconditionally, regardless of what userspace wants.

This is broken in several ways:
 - if userspace doesn't use KVM_CREATE_IRQCHIP, and doesn't emulate the TSC
   deadline timer feature, a guest that uses the feature will break
 - live migration to older host kernels that don't support the TSC deadline
   timer will cause the feature to be pulled from under the guest's feet;
   breaking it
 - guests that are broken wrt the feature will fail.

Fix by not enabling the feature automatically; instead report it to userspace.
Because the feature depends on KVM_CREATE_IRQCHIP, which we cannot guarantee
will be called, we expose it via a KVM_CAP_TSC_DEADLINE_TIMER and not
KVM_GET_SUPPORTED_CPUID.

Fixes the Illumos guest kernel, which uses the TSC deadline timer feature.

[avi: add the KVM_CAP + documentation]

Reported-by: Alexey Zaytsev <alexey.zaytsev@gmail.com>
Tested-by: Alexey Zaytsev <alexey.zaytsev@gmail.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-12-26 13:27:44 +02:00
Srivatsa S. Bhat
e30e2fdfe5 VFS: Fix race between CPU hotplug and lglocks
Currently, the *_global_[un]lock_online() routines are not at all synchronized
with CPU hotplug. Soft-lockups detected as a consequence of this race was
reported earlier at https://lkml.org/lkml/2011/8/24/185. (Thanks to Cong Meng
for finding out that the root-cause of this issue is the race condition
between br_write_[un]lock() and CPU hotplug, which results in the lock states
getting messed up).

Fixing this race by just adding {get,put}_online_cpus() at appropriate places
in *_global_[un]lock_online() is not a good option, because, then suddenly
br_write_[un]lock() would become blocking, whereas they have been kept as
non-blocking all this time, and we would want to keep them that way.

So, overall, we want to ensure 3 things:
1. br_write_lock() and br_write_unlock() must remain as non-blocking.
2. The corresponding lock and unlock of the per-cpu spinlocks must not happen
   for different sets of CPUs.
3. Either prevent any new CPU online operation in between this lock-unlock, or
   ensure that the newly onlined CPU does not proceed with its corresponding
   per-cpu spinlock unlocked.

To achieve all this:
(a) We introduce a new spinlock that is taken by the *_global_lock_online()
    routine and released by the *_global_unlock_online() routine.
(b) We register a callback for CPU hotplug notifications, and this callback
    takes the same spinlock as above.
(c) We maintain a bitmap which is close to the cpu_online_mask, and once it is
    initialized in the lock_init() code, all future updates to it are done in
    the callback, under the above spinlock.
(d) The above bitmap is used (instead of cpu_online_mask) while locking and
    unlocking the per-cpu locks.

The callback takes the spinlock upon the CPU_UP_PREPARE event. So, if the
br_write_lock-unlock sequence is in progress, the callback keeps spinning,
thus preventing the CPU online operation till the lock-unlock sequence is
complete. This takes care of requirement (3).

The bitmap that we maintain remains unmodified throughout the lock-unlock
sequence, since all updates to it are managed by the callback, which takes
the same spinlock as the one taken by the lock code and released only by the
unlock routine. Combining this with (d) above, satisfies requirement (2).

Overall, since we use a spinlock (mentioned in (a)) to prevent CPU hotplug
operations from racing with br_write_lock-unlock, requirement (1) is also
taken care of.

By the way, it is to be noted that a CPU offline operation can actually run
in parallel with our lock-unlock sequence, because our callback doesn't react
to notifications earlier than CPU_DEAD (in order to maintain our bitmap
properly). And this means, since we use our own bitmap (which is stale, on
purpose) during the lock-unlock sequence, we could end up unlocking the
per-cpu lock of an offline CPU (because we had locked it earlier, when the
CPU was online), in order to satisfy requirement (2). But this is harmless,
though it looks a bit awkward.

Debugged-by: Cong Meng <mc@linux.vnet.ibm.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@vger.kernel.org
2011-12-22 02:02:20 -05:00
Linus Torvalds
5fbd305dd2 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  time/clocksource: Fix kernel-doc warnings
  rtc: m41t80: Workaround broken alarm functionality
  rtc: Expire alarms after the time is set.
2011-12-20 11:42:38 -08:00
Kusanagi Kouichi
b1b73d0950 time/clocksource: Fix kernel-doc warnings
Fix various KernelDoc build warnings.

Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp>
Cc: John Stultz <johnstul@us.ibm.com>
Link: http://lkml.kernel.org/r/20111219091320.0D5AF6FC03D@msa105.auone-net.jp
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-12-19 11:41:40 +01:00
Linus Torvalds
2cfab8d74e Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~keithp/linux
* 'drm-intel-fixes' of git://people.freedesktop.org/~keithp/linux:
  drm/i915/dp: Dither down to 6bpc if it makes the mode fit
  drm/i915: enable semaphores on per-device defaults
  drm/i915: don't set unpin_work if vblank_get fails
  drm/i915: By default, enable RC6 on IVB and SNB when reasonable
  iommu: Export intel_iommu_enabled to signal when iommu is in use
  drm/i915/sdvo: Include LVDS panels for the IS_DIGITAL check
  drm/i915: prevent division by zero when asking for chipset power
  drm/i915: add PCH info to i915_capabilities
  drm/i915: set the right SDVO transcoder for CPT
  drm/i915: no-lvds quirk for ASUS AT5NM10T-I
  drm/i915: Treat pre-gen4 backlight duty cycle value consistently
  drm/i915: Hook up Ivybridge eDP
  drm/i915: add multi-threaded forcewake support
2011-12-16 11:27:56 -08:00
Linus Torvalds
b0d78ee89c Merge branch 'for-linus' of git://git.kernel.dk/linux-block
* 'for-linus' of git://git.kernel.dk/linux-block:
  block: don't kick empty queue in blk_drain_queue()
  block/swim3: Locking fixes
  loop: Fix discard_alignment default setting
  cfq-iosched: fix cfq_cic_link() race confition
  cfq-iosched: free cic_index if blkio_alloc_blkg_stats fails
  cciss: fix flush cache transfer length
  cciss: Add IRQF_SHARED back in for the non-MSI(X) interrupt handler
  loop: fix loop block driver discard and encryption comment
  block: initialize request_queue's numa node during
2011-12-16 10:05:14 -08:00
Eugeni Dodonov
8bc1f85c02 iommu: Export intel_iommu_enabled to signal when iommu is in use
In i915 driver, we do not enable either rc6 or semaphores on SNB when dmar
is enabled. The new 'intel_iommu_enabled' variable signals when the
iommu code is in operation.

Cc: Ted Phelps <phelps@gnusto.com>
Cc: Peter <pab1612@gmail.com>
Cc: Lukas Hejtmanek <xhejtman@fi.muni.cz>
Cc: Andrew Lutomirski <luto@mit.edu>
CC: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-12-16 08:49:57 -08:00
Linus Torvalds
13c07b0286 linux/log2.h: Fix rounddown_pow_of_two(1)
Exactly like roundup_pow_of_two(1), the rounddown version was buggy for
the case of a compile-time constant '1' argument.  Probably because it
originated from the same code, sharing history with the roundup version
from before the bugfix (for that one, see commit 1a06a52ee1b0: "Fix
roundup_pow_of_two(1)").

However, unlike the roundup version, the fix for rounddown is to just
remove the broken special case entirely.  It's simply not needed - the
generic code

    1UL << ilog2(n)

does the right thing for the constant '1' argment too.  The only reason
roundup needed that special case was because rounding up does so by
subtracting one from the argument (and then adding one to the result)
causing the obvious problems with "ilog2(0)".

But rounddown doesn't do any of that, since ilog2() naturally truncates
(ie "rounds down") to the right rounded down value.  And without the
ilog2(0) case, there's no reason for the special case that had the wrong
value.

tl;dr: rounddown_pow_of_two(1) should be 1, not 0.

Acked-by: Dmitry Torokhov <dtor@vmware.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-12-12 22:06:55 -08:00
Linus Torvalds
71fe5ccac7 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
  mmc: core: Fix deadlock when the CONFIG_MMC_UNSAFE_RESUME is not defined
  mmc: sdhci-s3c: Remove old and misprototyped suspend operations
  mmc: tmio: fix clock gating on platforms with a .set_pwr() method
  mmc: sh_mmcif: fix clock gating on platforms with a .down_pwr() method
  mmc: core: Fix typo at mmc_card_sleep
  mmc: core: Fix power_off_notify during suspend
  mmc: core: Fix setting power notify state variable for non-eMMC
  mmc: core: Add quirk for long data read time
  mmc: Add module.h include to sdhci-cns3xxx.c
  mmc: mxcmmc: fix falling back to PIO
  mmc: omap_hsmmc: DMA unmap only once in case of MMC error
2011-12-12 20:06:13 -08:00
Stefan Nilsson XK
6de5fc9cf7 mmc: core: Add quirk for long data read time
Adds a quirk that sets the data read timeout to a fixed value instead
of relying on the information in the CSD. The timeout value chosen
is 300ms since that has proven enough for the problematic cards found,
but could be increased if other cards require this.

This patch also enables this quirk for certain Micron cards known to
have this problem.

Signed-off-by: Stefan Nilsson XK <stefan.xk.nilsson@stericsson.com>
Signed-off-by: Ulf Hansson <ulf.hansson@stericsson.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Cc: <stable@kernel.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-12-10 16:18:35 -05:00
Linus Torvalds
53523d5263 Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  arch/tile: use new generic {enable,disable}_percpu_irq() routines
  drivers/net/ethernet/tile: use skb_frag_page() API
  asm-generic/unistd.h: support new process_vm_{readv,write} syscalls
  arch/tile: fix double-free bug in homecache_free_pages()
  arch/tile: add a few #includes and an EXPORT to catch up with kernel changes.
2011-12-09 08:08:57 -08:00
Konstantin Khlebnikov
83aeeada7c vmscan: use atomic-long for shrinker batching
Use atomic-long operations instead of looping around cmpxchg().

[akpm@linux-foundation.org: massage atomic.h inclusions]
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-12-09 07:50:27 -08:00
Linus Torvalds
3172f8fe1c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fix apparmor dereferencing potentially freed dentry, sanitize __d_path() API
2011-12-07 08:14:42 -08:00
Al Viro
02125a8264 fix apparmor dereferencing potentially freed dentry, sanitize __d_path() API
__d_path() API is asking for trouble and in case of apparmor d_namespace_path()
getting just that.  The root cause is that when __d_path() misses the root
it had been told to look for, it stores the location of the most remote ancestor
in *root.  Without grabbing references.  Sure, at the moment of call it had
been pinned down by what we have in *path.  And if we raced with umount -l, we
could have very well stopped at vfsmount/dentry that got freed as soon as
prepend_path() dropped vfsmount_lock.

It is safe to compare these pointers with pre-existing (and known to be still
alive) vfsmount and dentry, as long as all we are asking is "is it the same
address?".  Dereferencing is not safe and apparmor ended up stepping into
that.  d_namespace_path() really wants to examine the place where we stopped,
even if it's not connected to our namespace.  As the result, it looked
at ->d_sb->s_magic of a dentry that might've been already freed by that point.
All other callers had been careful enough to avoid that, but it's really
a bad interface - it invites that kind of trouble.

The fix is fairly straightforward, even though it's bigger than I'd like:
	* prepend_path() root argument becomes const.
	* __d_path() is never called with NULL/NULL root.  It was a kludge
to start with.  Instead, we have an explicit function - d_absolute_root().
Same as __d_path(), except that it doesn't get root passed and stops where
it stops.  apparmor and tomoyo are using it.
	* __d_path() returns NULL on path outside of root.  The main
caller is show_mountinfo() and that's precisely what we pass root for - to
skip those outside chroot jail.  Those who don't want that can (and do)
use d_path().
	* __d_path() root argument becomes const.  Everyone agrees, I hope.
	* apparmor does *NOT* try to use __d_path() or any of its variants
when it sees that path->mnt is an internal vfsmount.  In that case it's
definitely not mounted anywhere and dentry_path() is exactly what we want
there.  Handling of sysctl()-triggered weirdness is moved to that place.
	* if apparmor is asked to do pathname relative to chroot jail
and __d_path() tells it we it's not in that jail, the sucker just calls
d_absolute_path() instead.  That's the other remaining caller of __d_path(),
BTW.
        * seq_path_root() does _NOT_ return -ENAMETOOLONG (it's stupid anyway -
the normal seq_file logics will take care of growing the buffer and redoing
the call of ->show() just fine).  However, if it gets path not reachable
from root, it returns SEQ_SKIP.  The only caller adjusted (i.e. stopped
ignoring the return value as it used to do).

Reviewed-by: John Johansen <john.johansen@canonical.com>
ACKed-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@vger.kernel.org
2011-12-06 23:57:18 -05:00