Commit Graph

4095 Commits

Author SHA1 Message Date
Christoph Hellwig
f5e54d6e53 [PATCH] mark address_space_operations const
Same as with already do with the file operations: keep them in .rodata and
prevents people from doing runtime patching.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Steven French <sfrench@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-28 14:59:04 -07:00
Trond Myklebust
607f31e80b Revert "Merge branch 'odirect'"
This reverts ccf01ef7aa9c6c293a1c64c27331a2ce227916ec commit.

No idea how git managed this one: when I asked it to merge the odirect
topic branch it actually generated a patch which reverted the change.

Reverting the 'merge' will once again reveal Chuck's recent NFS/O_DIRECT
work to the world.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-28 16:52:45 -04:00
Linus Torvalds
936813a880 Merge git://git.infradead.org/mtd-2.6
* git://git.infradead.org/mtd-2.6:
  [MTD] NAND: Select chip before checking write protect status
  [MTD] CORE mtdchar.c: fix off-by-one error in lseek()
  [MTD] NAND: Fix typo in mtd/nand/ts7250.c
  [JFFS2][XATTR] coexistence between xattr and write buffering support.
  [JFFS2][XATTR] Fix wrong copyright
  [JFFS2][XATTR] Re-define xd->refcnt as atomic_t
  [JFFS2][XATTR] Fix memory leak with jffs2_xattr_ref
  [JFFS2][XATTR] rid unnecessary writing of delete marker.
  [JFFS2][XATTR] Fix ACL bug when updating null xattr by null ACL.
  [JFFS2][XATTR] using 'delete marker' for xdatum/xref deletion
  [MTD] Fix off-by-one error in physmap.c
  [MTD] Remove unused 'nr_banks' variable from ixp2000 map driver
  [MTD NAND] s3c2412 support in s3c2410.c
  [MTD] Initialize 'writesize'
  [MTD] NAND: ndfc fix address offset thinko
  [MTD] NAND: S3C2410 convert prinks to dev_*()s
  [MTD] NAND: Missing fixups
2006-06-27 19:13:56 -07:00
Linus Torvalds
03529d9f66 Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  [PATCH] ata_piix: add ICH6/7/8 to Kconfig
  [PATCH] sata_sil: disable hotplug interrupts on two ATI IXPs
  [PATCH] libata: cosmetic updates
  [PATCH] ata: add some NVIDIA chipset IDs
  [PATCH] libata reduce timeouts
  [PATCH] libata: implement ata_port_max_devices()
  [PATCH] libata: make two functions global
  [PATCH] libata: update ata_do_simple_cmd()
  [PATCH] libata: move ata_do_simple_cmd() below ata_exec_internal()
  [PATCH] libata: clear EH action on device detach
  [PATCH] libata: implement and use ata_deh_dev_action()
  [PATCH] libata: move ata_eh_clear_action() upward
  [PATCH] libata.h needs scatterlist.h
  [libata] sata_vsc: partially revert a PCI ID-related commit
  [libata] Bump versions
2006-06-27 19:07:21 -07:00
Adrian Bunk
456229a91d [PATCH] drivers/char/ipmi/ipmi_msghandler.c: make proc_ipmi_root static
Make struct proc_ipmi_root static.

Besides this, tremove removes an unused #ifdef CONFIG_PROC_FS from
include/linux/ipmi.h.

Acked-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:48 -07:00
Thomas Gleixner
95e02ca9bb [PATCH] rtmutex: Propagate priority settings into PI lock chains
When the priority of a task, which is blocked on a lock, changes we must
propagate this change into the PI lock chain.  Therefor the chain walk code
is changed to get rid of the references to current to avoid false positives
in the deadlock detector, as setscheduler might be called by a task which
holds the lock on which the task whose priority is changed is blocked.

Also add some comments about the get/put_task_struct usage to avoid
confusion.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:48 -07:00
Ingo Molnar
c87e2837be [PATCH] pi-futex: futex_lock_pi/futex_unlock_pi support
This adds the actual pi-futex implementation, based on rt-mutexes.

[dino@in.ibm.com: fix an oops-causing race]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Dinakar Guniguntala <dino@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:47 -07:00
Thomas Gleixner
61a8712286 [PATCH] pi-futex: rt mutex tester
RT-mutex tester: scriptable tester for rt mutexes, which allows userspace
scripting of mutex unit-tests (and dynamic tests as well), using the actual
rt-mutex implementation of the kernel.

[akpm@osdl.org: fixlet]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:47 -07:00
Ingo Molnar
e7eebaf6a8 [PATCH] pi-futex: rt mutex debug
Runtime debugging functionality for rt-mutexes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:47 -07:00
Ingo Molnar
23f78d4a03 [PATCH] pi-futex: rt mutex core
Core functions for the rt-mutex subsystem.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:47 -07:00
Ingo Molnar
b29739f902 [PATCH] pi-futex: scheduler support for pi
Add framework to boost/unboost the priority of RT tasks.

This consists of:

 - caching the 'normal' priority in ->normal_prio
 - providing a functions to set/get the priority of the task
 - make sched_setscheduler() aware of boosting

The effective_prio() cleanups also fix a priority-calculation bug pointed out
by Andrey Gelman, in set_user_nice().

has_rt_policy() fix: Peter Williams <pwil3058@bigpond.net.au>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Andrey Gelman <agelman@012.net.il>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:46 -07:00
Ingo Molnar
77ba89c5cf [PATCH] pi-futex: add plist implementation
Add the priority-sorted list (plist) implementation.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:46 -07:00
Ingo Molnar
f9b8404cf8 [PATCH] pi-futex: introduce debug_check_no_locks_freed()
Add debug_check_no_locks_freed(), as a central inline to add
bad-lock-free-debugging functionality to.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:46 -07:00
Ingo Molnar
e2970f2fb6 [PATCH] pi-futex: futex code cleanups
We are pleased to announce "lightweight userspace priority inheritance" (PI)
support for futexes.  The following patchset and glibc patch implements it,
ontop of the robust-futexes patchset which is included in 2.6.16-mm1.

We are calling it lightweight for 3 reasons:

 - in the user-space fastpath a PI-enabled futex involves no kernel work
   (or any other PI complexity) at all.  No registration, no extra kernel
   calls - just pure fast atomic ops in userspace.

 - in the slowpath (in the lock-contention case), the system call and
   scheduling pattern is in fact better than that of normal futexes, due to
   the 'integrated' nature of FUTEX_LOCK_PI.  [more about that further down]

 - the in-kernel PI implementation is streamlined around the mutex
   abstraction, with strict rules that keep the implementation relatively
   simple: only a single owner may own a lock (i.e.  no read-write lock
   support), only the owner may unlock a lock, no recursive locking, etc.

  Priority Inheritance - why, oh why???
  -------------------------------------

Many of you heard the horror stories about the evil PI code circling Linux for
years, which makes no real sense at all and is only used by buggy applications
and which has horrible overhead.  Some of you have dreaded this very moment,
when someone actually submits working PI code ;-)

So why would we like to see PI support for futexes?

We'd like to see it done purely for technological reasons.  We dont think it's
a buggy concept, we think it's useful functionality to offer to applications,
which functionality cannot be achieved in other ways.  We also think it's the
right thing to do, and we think we've got the right arguments and the right
numbers to prove that.  We also believe that we can address all the
counter-arguments as well.  For these reasons (and the reasons outlined below)
we are submitting this patch-set for upstream kernel inclusion.

What are the benefits of PI?

  The short reply:
  ----------------

User-space PI helps achieving/improving determinism for user-space
applications.  In the best-case, it can help achieve determinism and
well-bound latencies.  Even in the worst-case, PI will improve the statistical
distribution of locking related application delays.

  The longer reply:
  -----------------

Firstly, sharing locks between multiple tasks is a common programming
technique that often cannot be replaced with lockless algorithms.  As we can
see it in the kernel [which is a quite complex program in itself], lockless
structures are rather the exception than the norm - the current ratio of
lockless vs.  locky code for shared data structures is somewhere between 1:10
and 1:100.  Lockless is hard, and the complexity of lockless algorithms often
endangers to ability to do robust reviews of said code.  I.e.  critical RT
apps often choose lock structures to protect critical data structures, instead
of lockless algorithms.  Furthermore, there are cases (like shared hardware,
or other resource limits) where lockless access is mathematically impossible.

Media players (such as Jack) are an example of reasonable application design
with multiple tasks (with multiple priority levels) sharing short-held locks:
for example, a highprio audio playback thread is combined with medium-prio
construct-audio-data threads and low-prio display-colory-stuff threads.  Add
video and decoding to the mix and we've got even more priority levels.

So once we accept that synchronization objects (locks) are an unavoidable fact
of life, and once we accept that multi-task userspace apps have a very fair
expectation of being able to use locks, we've got to think about how to offer
the option of a deterministic locking implementation to user-space.

Most of the technical counter-arguments against doing priority inheritance
only apply to kernel-space locks.  But user-space locks are different, there
we cannot disable interrupts or make the task non-preemptible in a critical
section, so the 'use spinlocks' argument does not apply (user-space spinlocks
have the same priority inversion problems as other user-space locking
constructs).  Fact is, pretty much the only technique that currently enables
good determinism for userspace locks (such as futex-based pthread mutexes) is
priority inheritance:

Currently (without PI), if a high-prio and a low-prio task shares a lock [this
is a quite common scenario for most non-trivial RT applications], even if all
critical sections are coded carefully to be deterministic (i.e.  all critical
sections are short in duration and only execute a limited number of
instructions), the kernel cannot guarantee any deterministic execution of the
high-prio task: any medium-priority task could preempt the low-prio task while
it holds the shared lock and executes the critical section, and could delay it
indefinitely.

  Implementation:
  ---------------

As mentioned before, the userspace fastpath of PI-enabled pthread mutexes
involves no kernel work at all - they behave quite similarly to normal
futex-based locks: a 0 value means unlocked, and a value==TID means locked.
(This is the same method as used by list-based robust futexes.) Userspace uses
atomic ops to lock/unlock these mutexes without entering the kernel.

To handle the slowpath, we have added two new futex ops:

  FUTEX_LOCK_PI
  FUTEX_UNLOCK_PI

If the lock-acquire fastpath fails, [i.e.  an atomic transition from 0 to TID
fails], then FUTEX_LOCK_PI is called.  The kernel does all the remaining work:
if there is no futex-queue attached to the futex address yet then the code
looks up the task that owns the futex [it has put its own TID into the futex
value], and attaches a 'PI state' structure to the futex-queue.  The pi_state
includes an rt-mutex, which is a PI-aware, kernel-based synchronization
object.  The 'other' task is made the owner of the rt-mutex, and the
FUTEX_WAITERS bit is atomically set in the futex value.  Then this task tries
to lock the rt-mutex, on which it blocks.  Once it returns, it has the mutex
acquired, and it sets the futex value to its own TID and returns.  Userspace
has no other work to perform - it now owns the lock, and futex value contains
FUTEX_WAITERS|TID.

If the unlock side fastpath succeeds, [i.e.  userspace manages to do a TID ->
0 atomic transition of the futex value], then no kernel work is triggered.

If the unlock fastpath fails (because the FUTEX_WAITERS bit is set), then
FUTEX_UNLOCK_PI is called, and the kernel unlocks the futex on the behalf of
userspace - and it also unlocks the attached pi_state->rt_mutex and thus wakes
up any potential waiters.

Note that under this approach, contrary to other PI-futex approaches, there is
no prior 'registration' of a PI-futex.  [which is not quite possible anyway,
due to existing ABI properties of pthread mutexes.]

Also, under this scheme, 'robustness' and 'PI' are two orthogonal properties
of futexes, and all four combinations are possible: futex, robust-futex,
PI-futex, robust+PI-futex.

  glibc support:
  --------------

Ulrich Drepper and Jakub Jelinek have written glibc support for PI-futexes
(and robust futexes), enabling robust and PI (PTHREAD_PRIO_INHERIT) POSIX
mutexes.  (PTHREAD_PRIO_PROTECT support will be added later on too, no
additional kernel changes are needed for that).  [NOTE: The glibc patch is
obviously inofficial and unsupported without matching upstream kernel
functionality.]

the patch-queue and the glibc patch can also be downloaded from:

  http://redhat.com/~mingo/PI-futex-patches/

Many thanks go to the people who helped us create this kernel feature: Steven
Rostedt, Esben Nielsen, Benedikt Spranger, Daniel Walker, John Cooper, Arjan
van de Ven, Oleg Nesterov and others.  Credits for related prior projects goes
to Dirk Grambow, Inaky Perez-Gonzalez, Bill Huey and many others.

Clean up the futex code, before adding more features to it:

 - use u32 as the futex field type - that's the ABI
 - use __user and pointers to u32 instead of unsigned long
 - code style / comment style cleanups
 - rename hash-bucket name from 'bh' to 'hb'.

I checked the pre and post futex.o object files to make sure this
patch has no code effects.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Jakub Jelinek <jakub@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:46 -07:00
Siddha, Suresh B
5c45bf279d [PATCH] sched: mc/smt power savings sched policy
sysfs entries 'sched_mc_power_savings' and 'sched_smt_power_savings' in
/sys/devices/system/cpu/ control the MC/SMT power savings policy for the
scheduler.

Based on the values (1-enable, 0-disable) for these controls, sched groups
cpu power will be determined for different domains.  When power savings
policy is enabled and under light load conditions, scheduler will minimize
the physical packages/cpu cores carrying the load and thus conserving
power(with a perf impact based on the workload characteristics...  see OLS
2005 CMP kernel scheduler paper for more details..)

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Con Kolivas <kernel@kolivas.org>
Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:45 -07:00
Srivatsa Vaddagiri
51888ca25a [PATCH] sched_domain: handle kmalloc failure
Try to handle mem allocation failures in build_sched_domains by bailing out
and cleaning up thus-far allocated memory.  The patch has a direct consequence
that we disable load balancing completely (even at sibling level) upon *any*
memory allocation failure.

[Lee.Schermerhorn@hp.com: bugfix]
Signed-off-by: Srivatsa Vaddagir <vatsa@in.ibm.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:45 -07:00
Peter Williams
2dd73a4f09 [PATCH] sched: implement smpnice
Problem:

The introduction of separate run queues per CPU has brought with it "nice"
enforcement problems that are best described by a simple example.

For the sake of argument suppose that on a single CPU machine with a
nice==19 hard spinner and a nice==0 hard spinner running that the nice==0
task gets 95% of the CPU and the nice==19 task gets 5% of the CPU.  Now
suppose that there is a system with 2 CPUs and 2 nice==19 hard spinners and
2 nice==0 hard spinners running.  The user of this system would be entitled
to expect that the nice==0 tasks each get 95% of a CPU and the nice==19
tasks only get 5% each.  However, whether this expectation is met is pretty
much down to luck as there are four equally likely distributions of the
tasks to the CPUs that the load balancing code will consider to be balanced
with loads of 2.0 for each CPU.  Two of these distributions involve one
nice==0 and one nice==19 task per CPU and in these circumstances the users
expectations will be met.  The other two distributions both involve both
nice==0 tasks being on one CPU and both nice==19 being on the other CPU and
each task will get 50% of a CPU and the user's expectations will not be
met.

Solution:

The solution to this problem that is implemented in the attached patch is
to use weighted loads when determining if the system is balanced and, when
an imbalance is detected, to move an amount of weighted load between run
queues (as opposed to a number of tasks) to restore the balance.  Once
again, the easiest way to explain why both of these measures are necessary
is to use a simple example.  Suppose that (in a slight variation of the
above example) that we have a two CPU system with 4 nice==0 and 4 nice=19
hard spinning tasks running and that the 4 nice==0 tasks are on one CPU and
the 4 nice==19 tasks are on the other CPU.  The weighted loads for the two
CPUs would be 4.0 and 0.2 respectively and the load balancing code would
move 2 tasks resulting in one CPU with a load of 2.0 and the other with
load of 2.2.  If this was considered to be a big enough imbalance to
justify moving a task and that task was moved using the current
move_tasks() then it would move the highest priority task that it found and
this would result in one CPU with a load of 3.0 and the other with a load
of 1.2 which would result in the movement of a task in the opposite
direction and so on -- infinite loop.  If, on the other hand, an amount of
load to be moved is calculated from the imbalance (in this case 0.1) and
move_tasks() skips tasks until it find ones whose contributions to the
weighted load are less than this amount it would move two of the nice==19
tasks resulting in a system with 2 nice==0 and 2 nice=19 on each CPU with
loads of 2.1 for each CPU.

One of the advantages of this mechanism is that on a system where all tasks
have nice==0 the load balancing calculations would be mathematically
identical to the current load balancing code.

Notes:

struct task_struct:

has a new field load_weight which (in a trade off of space for speed)
stores the contribution that this task makes to a CPU's weighted load when
it is runnable.

struct runqueue:

has a new field raw_weighted_load which is the sum of the load_weight
values for the currently runnable tasks on this run queue.  This field
always needs to be updated when nr_running is updated so two new inline
functions inc_nr_running() and dec_nr_running() have been created to make
sure that this happens.  This also offers a convenient way to optimize away
this part of the smpnice mechanism when CONFIG_SMP is not defined.

int try_to_wake_up():

in this function the value SCHED_LOAD_BALANCE is used to represent the load
contribution of a single task in various calculations in the code that
decides which CPU to put the waking task on.  While this would be a valid
on a system where the nice values for the runnable tasks were distributed
evenly around zero it will lead to anomalous load balancing if the
distribution is skewed in either direction.  To overcome this problem
SCHED_LOAD_SCALE has been replaced by the load_weight for the relevant task
or by the average load_weight per task for the queue in question (as
appropriate).

int move_tasks():

The modifications to this function were complicated by the fact that
active_load_balance() uses it to move exactly one task without checking
whether an imbalance actually exists.  This precluded the simple
overloading of max_nr_move with max_load_move and necessitated the addition
of the latter as an extra argument to the function.  The internal
implementation is then modified to move up to max_nr_move tasks and
max_load_move of weighted load.  This slightly complicates the code where
move_tasks() is called and if ever active_load_balance() is changed to not
use move_tasks() the implementation of move_tasks() should be simplified
accordingly.

struct sched_group *find_busiest_group():

Similar to try_to_wake_up(), there are places in this function where
SCHED_LOAD_SCALE is used to represent the load contribution of a single
task and the same issues are created.  A similar solution is adopted except
that it is now the average per task contribution to a group's load (as
opposed to a run queue) that is required.  As this value is not directly
available from the group it is calculated on the fly as the queues in the
groups are visited when determining the busiest group.

A key change to this function is that it is no longer to scale down
*imbalance on exit as move_tasks() uses the load in its scaled form.

void set_user_nice():

has been modified to update the task's load_weight field when it's nice
value and also to ensure that its run queue's raw_weighted_load field is
updated if it was runnable.

From: "Siddha, Suresh B" <suresh.b.siddha@intel.com>

With smpnice, sched groups with highest priority tasks can mask the imbalance
between the other sched groups with in the same domain.  This patch fixes some
of the listed down scenarios by not considering the sched groups which are
lightly loaded.

a) on a simple 4-way MP system, if we have one high priority and 4 normal
   priority tasks, with smpnice we would like to see the high priority task
   scheduled on one cpu, two other cpus getting one normal task each and the
   fourth cpu getting the remaining two normal tasks.  but with current
   smpnice extra normal priority task keeps jumping from one cpu to another
   cpu having the normal priority task.  This is because of the
   busiest_has_loaded_cpus, nr_loaded_cpus logic..  We are not including the
   cpu with high priority task in max_load calculations but including that in
   total and avg_load calcuations..  leading to max_load < avg_load and load
   balance between cpus running normal priority tasks(2 Vs 1) will always show
   imbalanace as one normal priority and the extra normal priority task will
   keep moving from one cpu to another cpu having normal priority task..

b) 4-way system with HT (8 logical processors).  Package-P0 T0 has a
   highest priority task, T1 is idle.  Package-P1 Both T0 and T1 have 1 normal
   priority task each..  P2 and P3 are idle.  With this patch, one of the
   normal priority tasks on P1 will be moved to P2 or P3..

c) With the current weighted smp nice calculations, it doesn't always make
   sense to look at the highest weighted runqueue in the busy group..
   Consider a load balance scenario on a DP with HT system, with Package-0
   containing one high priority and one low priority, Package-1 containing one
   low priority(with other thread being idle)..  Package-1 thinks that it need
   to take the low priority thread from Package-0.  And find_busiest_queue()
   returns the cpu thread with highest priority task..  And ultimately(with
   help of active load balance) we move high priority task to Package-1.  And
   same continues with Package-0 now, moving high priority task from package-1
   to package-0..  Even without the presence of active load balance, load
   balance will fail to balance the above scenario..  Fix find_busiest_queue
   to use "imbalance" when it is lightly loaded.

[kernel@kolivas.org: sched: store weighted load on up]
[kernel@kolivas.org: sched: add discrete weighted cpu load function]
[suresh.b.siddha@intel.com: sched: remove dead code]
Signed-off-by: Peter Williams <pwil3058@bigpond.com.au>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Con Kolivas <kernel@kolivas.org>
Cc: John Hawkes <hawkes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:44 -07:00
Jim Cromie
f31000e573 [PATCH] chardev: GPIO for SCx200 & PC-8736x: use dev_dbg in common module
Use of dev_dbg() and friends is considered good practice.  dev_dbg() needs a
struct device *devp, but nsc_gpio is only a helper module, so it doesnt
have/need its own.  To provide devp to the user-modules (scx200 & pc8736x
_gpio), we add it to the vtable, and set it during init.

Also squeeze nsc_gpio_dump()'s format a little.

[  199.259879]  pc8736x_gpio.0: io09: 0x0044 TS OD PUE  EDGE LO DEBOUNCE

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:43 -07:00
Jim Cromie
1a66fdf083 [PATCH] chardev: GPIO for SCx200 & PC-8736x: migrate file-ops to common module
Now that the read(), write() file-ops are dispatching gpio-ops via the vtable,
they are generic, and can be moved 'verbatim' to the nsc_gpio common-support
module.  After the move, various symbols are renamed to update 'scx200_' to
'nsc_', and headers are adjusted accordingly.

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:43 -07:00
Jim Cromie
fe3a168a2c [PATCH] chardev: GPIO for SCx200 & PC-8736x: add gpio-ops vtable
Abstract the gpio operations into a new nsc_gpio_ops vtable.

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:42 -07:00
Jim Cromie
55b8c0455b [PATCH] chardev: GPIO for SCx200 & PC-8736x: device minor numbers are unsigned ints
Per kernel headers, device minor numbers are unsigned ints.  Do the same in
this driver.

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:42 -07:00
Jim Cromie
62c83cde92 [PATCH] chardev: GPIO for SCx200 & PC-8736x: whitespace pre-clean
GPIO SUPPORT FOR SCx200 & PC8736x

The patch-set reworks the 2.4 vintage scx200_gpio driver for modern 2.6, and
refactors GPIO support to reuse it in a new driver for the GPIO on PC-8736x
chips.  Its handy for the Soekris.com net-4801, which has both chips.

These patches have been seen recently on Kernel-Mentors, and then
Kernel-Newbies ML, where Jesper Juhl kindly reviewed it.  His feedback has
been incorporated.  Thanks Jesper !

Its also gone to soekris-tech@soekris.com for possible testing by linux folks,
I've gotten 1 promise so far.  Theyre mostly BSD folk over there, but we'll
see..

Device-file & Sysfs

The driver preserves the existing device-file interface, including the
write/cmd set, but adds v to 'view' the pin-settings & configs by inducing,
via gpio_dump(), a dev_info() call.  Its a fairly crappy way to get status,
but it sticks to the syslog approach, conservatively.

Allowing users to voluntarily trigger logging is good, it gives them a
familiar way to confirm their app's control & use of the pins, and I've thus
reduced the pin-mode-updates from dev_info to dev_dbg.

I've recently bolted on a proto sysfs interface for both new drivers.  Im not
including those patches here; they (the patch + doc-pre-patch) are still quite
raw (and unreviewed on KNML), and since they 'invent' a convention for GPIO, a
proper vetting is needed.  Since this patchset is much bigger than my previous
ones, Id like to keep things simpler, and address it 1st, before bolting on
more stuff.

The driver-split

The Geode CPU and the PC-87366 Super-IO chip have GPIO units which share a
common pin-architecture (same pin features, with same bits controlling), but
with different addressing mechanics and port organizations.

The vintage driver expresses the pin capabilities with pin-mode commands
[OoPpTt],etc that change the pin configurations, and since the 2 chips share
pin-arch, we can reuse the read(), write() commands, once the implementation
is suitably adjusted.

The patchset adds a vtable: struct nsc_gpio_ops, to abstract the existing gpio
operations, then adjusts fileops.write() code to invoke operations via that
vtable.  Driver specific open()s set private_data to the vtable so its
available for use by write().

The vtable gets the gpio_dump() too, since its user-friendly, and (could be
construed as) part of the current device-file interface.  To support use of
dev_dbg() in write() & _dump(), the vtable gets a dev ptr too, set by both
scx200 & pc8736x _gpio drivers.

heres how the pins are presented in syslog:

[ 1890.176223]  scx200_gpio.0: io00: 0x0044 TS OD PUE  EDGE LO DEBOUNCE
[ 1890.287223]  scx200_gpio.0: io01: 0x0003 OE PP PUD  EDGE LO

nsc_gpio.c: new file is new home of several file-ops methods, which are
modified to get their vtable from filp->private_data, and use it where needed.

scx200_gpio.c: keeps some of its existing gpio routines, but now wires them up
via the vtable (they're invoked by nsc_gpio.c:nsc_gpio_write() thru this
vtable).  A driver-spcific open() initializes filp->private_data with the
vtable.

Once the split is clean, and the scx200_gpio driver is working, we copy and
modify the function and variable names, and rework the access-method bodies
for the different addressing scheme.

Heres a working overview of the patchset:

# series file for GPIO

# Spring Cleaning
gpio-scx/patch.preclean        # scripts/Lindent fixes, editor-ctrl comments

# API Modernization

gpio-scx/patch.api26        # what I learned from LDD3
gpio-scx/patch.platform-dev-2    # get pdev, support for dev_dbg()
gpio-scx/patch.unsigned-minor    # fix to match std practice

# Debuggability

gpio-scx/patch.dump-diet    # shrink gpio_dump()
gpio-scx/patch.viewpins        # add new 'command' to call dump()
gpio-scx/patch.init-refactor    # pull shadow-register init to sub

# Access-Abstraction (add vtable)

gpio-scx/patch.access-vtable    # introduce nsg_gpio_ops vtable, w dump
gpio-scx/patch.vtable-calls    # add & use the vtable in scx200_gpio
gpio-scx/patch.nscgpio-shell    # add empty driver for common-fops

# move code under abstraction
gpio-scx/patch.migrate-fops    # move file-ops methods from scx200_gpio
gpio-scx/patch.common-dump    # mv scx200.c:scx200_gpio_dump() to nsc_gpio.c
gpio-scx/patch.add-pc8736x-gpio    # add new driver, like old, w chip adapt
# gpio-scx/patch.add-DEBUG    # enable all dev_dbg()s

# Cleanups

# finish printk -> dev_dbg() etc
gpio-scx/patch.pdev-pc8736x    # new drvr needs pdev too,
gpio-scx/patch.devdbg-nscgpio    # add device to 'vtable', use in dev_dbg()

# gpio-scx/patch.pin-config-view    # another 'c' 'command'
# gpio-scx/quiet-getset        # take out excess dbg stuff (pretty quiet
now)
gpio-scx/patch.shadow-current    # imitate scx200_gpio's shadow regs in
pc87*

# post KMentors-post patches ..

gpio-scx/patch.mutexes        # use mutexes for config-locks
gpio-scx/patch.viewpins-values    # extend dump to obsolete separate 'c' cmd

gpio-scx/patch.kconfig        # add stuff for kbuild

# TBC
# combine api26 with pdev, which is just one step.
# merge c&v commands to single do-all-fn
# delay viewpins, dump-diet should also un-ifdef it too.

diff.sys-gpio-rollup-1

This patch:

Removed editor format-control comments, and used scripts/Lindent to clean up
whitespace, then deleted the bogus chunks :-(

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:42 -07:00
Chandra Seetharaman
39f4885c56 [PATCH] cpu hotplug: add hotplug versions of cpu_notifier
Define new macros register_hotcpu_notifier() and unregister_hotcpu_notifier()
that redefines register_cpu_notifier() and unregister_cpu_notifier() for use
only when HOTPLUG_CPU is defined.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:41 -07:00
Chandra Seetharaman
65edc68c34 [PATCH] cpu hotplug: make [un]register_cpu_notifier init time only
CPUs come online only at init time (unless CONFIG_HOTPLUG_CPU is defined).
So, cpu_notifier functionality need to be available only at init time.

This patch makes register_cpu_notifier() available only at init time, unless
CONFIG_HOTPLUG_CPU is defined.

This patch exports register_cpu_notifier() and unregister_cpu_notifier() only
if CONFIG_HOTPLUG_CPU is defined.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:41 -07:00
Paul E. McKenney
c32e066057 [PATCH] rcutorture: add call_rcu_bh() operations
Add operations for the call_rcu_bh() variant of RCU.  Also add an
rcu_batches_completed_bh() function, which is needed by rcutorture.

Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:40 -07:00
David Woodhouse
1c0f16e5cd [PATCH] Remove gratuitous inclusion of <linux/config.h> from <linux/dmaengine.h>
We include config.h on the compiler command line. There's no need for it
to be included again.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:39 -07:00
Adrian Bunk
b6cd0b772d [PATCH] fs/buffer.c: cleanups
- add a proper prototype for the following global function:
  - buffer_init()

- make the following needlessly global function static:
  - end_buffer_async_write()

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:38 -07:00
Randy Dunlap
a7807a32bb [PATCH] poison: add & use more constants
Add more poison values to include/linux/poison.h.  It's not clear to me
whether some others should be added or not, so I haven't added any of
these:

./include/linux/libata.h:#define ATA_TAG_POISON		0xfafbfcfdU
./arch/ppc/8260_io/fcc_enet.c:1918:	memset((char *)(&(immap->im_dprambase[(mem_addr+64)])), 0x88, 32);
./drivers/usb/mon/mon_text.c:429:	memset(mem, 0xe5, sizeof(struct mon_event_text));
./drivers/char/ftape/lowlevel/ftape-ctl.c:738:		memset(ft_buffer[i]->address, 0xAA, FT_BUFF_SIZE);
./drivers/block/sx8.c:/* 0xf is just arbitrary, non-zero noise; this is sorta like poisoning */

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:38 -07:00
Randy Dunlap
b3c681e091 [PATCH] update two drivers for poison.h
Update two drivers to use poison.h.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:38 -07:00
Randy Dunlap
c9cf55285e [PATCH] add poison.h and patch primary users
Localize poison values into one header file for better documentation and
easier/quicker debugging and so that the same values won't be used for
multiple purposes.

Use these constants in core arch., mm, driver, and fs code.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Acked-by: Matt Mackall <mpm@selenic.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:38 -07:00
Ingo Molnar
e6e5494cb2 [PATCH] vdso: randomize the i386 vDSO by moving it into a vma
Move the i386 VDSO down into a vma and thus randomize it.

Besides the security implications, this feature also helps debuggers, which
can COW a vma-backed VDSO just like a normal DSO and can thus do
single-stepping and other debugging features.

It's good for hypervisors (Xen, VMWare) too, which typically live in the same
high-mapped address space as the VDSO, hence whenever the VDSO is used, they
get lots of guest pagefaults and have to fix such guest accesses up - which
slows things down instead of speeding things up (the primary purpose of the
VDSO).

There's a new CONFIG_COMPAT_VDSO (default=y) option, which provides support
for older glibcs that still rely on a prelinked high-mapped VDSO.  Newer
distributions (using glibc 2.3.3 or later) can turn this option off.  Turning
it off is also recommended for security reasons: attackers cannot use the
predictable high-mapped VDSO page as syscall trampoline anymore.

There is a new vdso=[0|1] boot option as well, and a runtime
/proc/sys/vm/vdso_enabled sysctl switch, that allows the VDSO to be turned
on/off.

(This version of the VDSO-randomization patch also has working ELF
coredumping, the previous patch crashed in the coredumping code.)

This code is a combined work of the exec-shield VDSO randomization
code and Gerd Hoffmann's hypervisor-centric VDSO patch. Rusty Russell
started this patch and i completed it.

[akpm@osdl.org: cleanups]
[akpm@osdl.org: compile fix]
[akpm@osdl.org: compile fix 2]
[akpm@osdl.org: compile fix 3]
[akpm@osdl.org: revernt MAXMEM change]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Cc: Gerd Hoffmann <kraxel@suse.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:38 -07:00
KAMEZAWA Hiroyuki
76b67ed9dc [PATCH] node hotplug: register cpu: remove node struct
With Goto-san's patch, we can add new pgdat/node at runtime.  I'm now
considering node-hot-add with cpu + memory on ACPI.

I found acpi container, which describes node, could evaluate cpu before
memory. This means cpu-hot-add occurs before memory hot add.

In most part, cpu-hot-add doesn't depend on node hot add.  But register_cpu(),
which creates symbolic link from node to cpu, requires that node should be
onlined before register_cpu().  When a node is onlined, its pgdat should be
there.

This patch-set holds off creating symbolic link from node to cpu
until node is onlined.

This removes node arguments from register_cpu().

Now, register_cpu() requires 'struct node' as its argument.  But the array of
struct node is now unified in driver/base/node.c now (By Goto's node hotplug
patch).  We can get struct node in generic way.  So, this argument is not
necessary now.

This patch also guarantees add cpu under node only when node is onlined.  It
is necessary for node-hot-add vs.  cpu-hot-add patch following this.

Moreover, register_cpu calculates cpu->node_id by cpu_to_node() without regard
to its 'struct node *root' argument.  This patch removes it.

Also modify callers of register_cpu()/unregister_cpu, whose args are changed
by register-cpu-remove-node-struct patch.

[Brice.Goglin@ens-lyon.org: fix it]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Brice Goglin <Brice.Goglin@ens-lyon.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:37 -07:00
Yasunori Goto
dd0932d9d4 [PATCH] pgdat allocation and update for ia64 of memory hotplug: allocate pgdat and per node data
This is a patch to allocate pgdat and per node data area for ia64.  The size
for them can be calculated by compute_pernodesize().

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:37 -07:00
Yasunori Goto
7049027c6f [PATCH] pgdat allocation and update for ia64 of memory hotplug: update pgdat address array
This is to refresh node_data[] array for ia64.  As I mentioned previous
patches, ia64 has copies of information of pgdat address array on each node as
per node data.

At v2 of node_add, this function used stop_machine_run() to update them.  (I
wished that they were copied safety as much as possible.) But, in this patch,
this arrays are just copied simply, and set node_online_map bit after
completion of pgdat initialization.

So, kernel must touch NODE_DATA() macro after checking node_online_map().
(Current code has already done it.) This is more simple way for just
hot-add.....

Note : It will be problem when hot-remove will occur,
       because, even if online_map bit is set, kernel may
       touch NODE_DATA() due to race condition. :-(

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:37 -07:00
Yasunori Goto
0fc44159bf [PATCH] Register sysfs file for hotplugged new node
When new node becomes enable by hot-add, new sysfs file must be created for
new node.  So, if new node is enabled by add_memory(), register_one_node() is
called to create it.  In addition, I386's arch_register_node() and a part of
register_nodes() of powerpc are consolidated to register_one_node() as a
generic_code().

This is tested by Tiger4(IPF) with node hot-plug emulation.

Signed-off-by: Keiichiro Tokunaga <tokuanga.keiich@jp.fujitsu.com>
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:36 -07:00
KAMEZAWA Hiroyuki
2842f11419 [PATCH] catch valid mem range at onlining memory
This patch allows hot-add memory which is not aligned to section.

Now, hot-added memory has to be aligned to section size.  Considering big
section sized archs, this is not useful.

When hot-added memory is registerd as iomem resoruce by iomem resource
patch, we can make use of that information to detect valid memory range.

Note: With this, not-aligned memory can be registerd. To allow hot-add
      memory with holes, we have to do more work around add_memory().
      (It doesn't allows add memory to already existing mem section.)

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:36 -07:00
Yasunori Goto
3218ae14b1 [PATCH] pgdat allocation for new node add (export kswapd start func)
When node is hot-added, kswapd for the node should start.  This export kswapd
start function as kswapd_run() to use at add_memory().

[akpm@osdl.org: daemonize() isn't needed when using the kthread API]
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:36 -07:00
Yasunori Goto
10ad400b49 [PATCH] pgdat allocation for new node add (refresh node_data[])
Refresh NODE_DATA() for generic archs.  In this case, NODE_DATA(nid) ==
node_data[nid].  node_data[] is array of address of pgdat.  So, refresh is
quite simple.

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:36 -07:00
Yasunori Goto
306d6cbe86 [PATCH] pgdat allocation for new node add (generic alloc node_data)
For node hotplug, basically we have to allocate new pgdat.  But, there are
several types of implementations of pgdat.

1. Allocate only pgdat.
   This style allocate only pgdat area.
   And its address is recorded in node_data[].
   It is most popular style.

2. Static array of pgdat
   In this case, all of pgdats are static array.
   Some archs use this style.

3. Allocate not only pgdat, but also per node data.
   To increase performance, each node has copy of some data as
   a per node data. So, this area must be allocated too.

   Ia64 is this style. Ia64 has the copies of node_data[] array
   on each per node data to increase performance.

In this series of patches, treat (1) as generic arch.

generic archs can use generic function. (2) and (3) should have
its own if necessary.

This patch defines pgdat allocator.
Updating NODE_DATA() macro function is in other patch.

Signed-off-by: Yasonori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:36 -07:00
Yasunori Goto
1e3590e2e4 [PATCH] pgdat allocation for new node add (get node id by acpi)
This is to find node id from acpi's handle of memory_device in DSDT.  _PXM for
the new node can be found by acpi_get_pxm() by using new memory's handle.  So,
node id can be found by pxm_to_nid_map[].

  This patch becomes simpler than v2 of node hot-add patch.
  Because old add_memory() function doesn't have node id parameter.
  So, kernel must find its handle by physical address via DSDT again.
  But, v3 just give node id to add_memory() now.

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:36 -07:00
Yasunori Goto
bc02af93dd [PATCH] pgdat allocation for new node add (specify node id)
Change the name of old add_memory() to arch_add_memory.  And use node id to
get pgdat for the node at NODE_DATA().

Note: Powerpc's old add_memory() is defined as __devinit. However,
      add_memory() is usually called only after bootup.
      I suppose it may be redundant. But, I'm not well known about powerpc.
      So, I keep it. (But, __meminit is better at least.)

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:35 -07:00
Greg Kroah-Hartman
6550e07f41 [PATCH] 64bit Resource: finally enable 64bit resource sizes
Introduce the Kconfig entry and actually switch to a 64bit value, if
wanted, for resource_size_t.

Based on a patch series originally from Vivek Goyal <vgoyal@in.ibm.com>

Cc: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-27 09:24:00 -07:00
Greg Kroah-Hartman
b60ba8343b [PATCH] 64bit resource: change pnp core to use resource_size_t
Based on a patch series originally from Vivek Goyal <vgoyal@in.ibm.com>

Cc: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-27 09:24:00 -07:00
Greg Kroah-Hartman
e31dd6e452 [PATCH] 64bit resource: change pci core and arch code to use resource_size_t
Based on a patch series originally from Vivek Goyal <vgoyal@in.ibm.com>

Cc: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-27 09:24:00 -07:00
Greg Kroah-Hartman
d75fc8bbcc [PATCH] 64bit resource: change resource core to use resource_size_t
Based on a patch series originally from Vivek Goyal <vgoyal@in.ibm.com>

Cc: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-27 09:23:59 -07:00
Greg Kroah-Hartman
cf7c712c11 [PATCH] 64bit resource: introduce resource_size_t for the start and end of struct resource
But do not change it from what it currently is (unsigned long)

Based on a patch series originally from Vivek Goyal <vgoyal@in.ibm.com>

Cc: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-27 09:23:59 -07:00
KaiGai Kohei
c9f700f840 [JFFS2][XATTR] using 'delete marker' for xdatum/xref deletion
- When xdatum is removed, a new xdatum with 'delete marker' is
  written. (version==0xffffffff means 'delete marker')
- When xref is removed, a new xref with 'delete marker' is written.
  (odd-numbered xseqno means 'delete marker')

- delete_xattr_(datum/xref)_delay() are new deletion functions
  are added. We can only use them if we can detect the target
  obsolete xdatum/xref as a orphan or errir one.
  (e.g when inode deletion, or detecting crc error)

[1/3] jffs2-xattr-v6-01-delete_marker.patch

Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-06-27 16:16:26 +01:00
Kristen Accardi
a6a888b3c2 KEVENT: add new uevent for dock
so that userspace can be notified of dock and undock events.

Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-27 01:24:15 -04:00
Thomas Renninger
d7fa2589bb Pull bugzilla-5737 into release branch 2006-06-27 00:06:37 -04:00
Randy Dunlap
353dcf7c89 [PATCH] ata: add some NVIDIA chipset IDs
From: Randy Dunlap <randy.dunlap@oracle.com>

Add some nVidia chipset ID's support.

http://www.kernel.org/git/?p=linux/kernel/git/bcollins/ubuntu-dapper.git;a=commitdiff;h=b407680553280f9999a20706d5ab2a3be65312c1;hp=ce4cb48010ab2cca537432b5ccb47d4b1fb489e5

Snagged from lkml.

Cc: Jeff Garzik <jeff@garzik.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Tejun Heo <htejun@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-06-26 20:59:28 -04:00
Tejun Heo
5806db22cf [PATCH] libata: implement ata_port_max_devices()
Implement ata_port_max_devices().  This function returns the number of
possible devices on a port.  This will be used by new PM
implementation.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-06-26 20:59:27 -04:00
Andrew Morton
41542dbe12 [PATCH] libata.h needs scatterlist.h
From: Andrew Morton <akpm@osdl.org>

s390:

In file included from drivers/scsi/libata-bmdma.c:39:                           include/linux/libata.h:391: error: field 'sgent' has incomplete type
include/linux/libata.h:392: error: field 'pad_sgent' has incomplete type
include/linux/libata.h: In function 'ata_sg_is_last':                           include/linux/libata.h:849: error: arithmetic on pointer to an incomplete type
include/linux/libata.h:849: error: arithmetic on pointer to an incomplete type
include/linux/libata.h: In function 'ata_qc_next_sg':
include/linux/libata.h:869: error: increment of pointer to unknown structure
include/linux/libata.h:869: error: arithmetic on pointer to an incomplete type
include/linux/libata.h:869: error: arithmetic on pointer to an incomplete type
include/linux/libata.h:869: error: arithmetic on pointer to an incomplete type

Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-06-26 20:59:27 -04:00
Jeff Garzik
438bc9c3de [libata] sata_vsc: partially revert a PCI ID-related commit
Partially revert 74d0a988d3aa359b6b8a8536c8cb92cce02ca5d5:

	[PATCH] PCI: Move various PCI IDs to header file

libata policy is to avoid use of named PCI device ID constants.
These are often single-use constants, which have little value over
direct numeric constants save for constant include/linux/pci_ids.h
patching/merging headaches.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-06-26 20:52:17 -04:00
Linus Torvalds
da206c9e68 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial:
  typo fixes
  Clean up 'inline is not at beginning' warnings for usb storage
  Storage class should be first
  i386: Trivial typo fixes
  ixj: make ixj_set_tone_off() static
  spelling fixes
  fix paniced->panicked typos
  Spelling fixes for Documentation/atomic_ops.txt
  move acknowledgment for Mark Adler to CREDITS
  remove the bouncing email address of David Campbell
2006-06-26 13:33:14 -07:00
Greg Kroah-Hartman
331b831983 [PATCH] devfs: Rename TTY_DRIVER_NO_DEVFS to TTY_DRIVER_DYNAMIC_DEV
I've always found this flag confusing.  Now that devfs is no longer around, it
has been renamed, and the documentation for when this flag should be used has
been updated.

Also fixes all drivers that use this flag.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-26 12:25:09 -07:00
Greg Kroah-Hartman
f4eaa37017 [PATCH] devfs: Remove the tty_driver devfs_name field as it's no longer needed
Also fixes all drivers that set this field.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-26 12:25:09 -07:00
Greg Kroah-Hartman
ce7b0f46bb [PATCH] devfs: Remove the gendisk devfs_name field as it's no longer needed
And remove the now unneeded number field.
Also fixes all drivers that set these fields.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-26 12:25:08 -07:00
Greg Kroah-Hartman
96192ff1a9 [PATCH] devfs: Remove the miscdevice devfs_name field as it's no longer needed
Also fixes all drivers that set this field.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-26 12:25:08 -07:00
Greg Kroah-Hartman
ff23eca3e8 [PATCH] devfs: Remove the devfs_fs_kernel.h file from the tree
Also fixes up all files that #include it.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-26 12:25:08 -07:00
Greg Kroah-Hartman
8ab5e4c15b [PATCH] devfs: Remove devfs_remove() function from the kernel tree
Removes the devfs_remove() function and all callers of it.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-26 12:25:07 -07:00
Greg Kroah-Hartman
7c69ef7974 [PATCH] devfs: Remove devfs_mk_cdev() function from the kernel tree
Removes the devfs_mk_cdev() function and all callers of it.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-26 12:25:07 -07:00
Greg Kroah-Hartman
1a715c5cf9 [PATCH] devfs: Remove devfs_mk_bdev() function from the kernel tree
Removes the devfs_mk_bdev() function and all callers of it.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-26 12:25:06 -07:00
Greg Kroah-Hartman
79021a625c [PATCH] devfs: Remove devfs_mk_symlink() function from the kernel tree
Removes the devfs_mk_symlink() function and all callers of it.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-26 12:25:06 -07:00
Greg Kroah-Hartman
95dc112a57 [PATCH] devfs: Remove devfs_mk_dir() function from the kernel tree
Removes the devfs_mk_dir() function and all callers of it.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-26 12:25:06 -07:00
Greg Kroah-Hartman
0e6c62da7c [PATCH] devfs: Remove devfs_*_tape() functions from the kernel tree
Removes the devfs_register_tape() and devfs_unregister_tape() functions
and all callers of them.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-26 12:25:06 -07:00
Greg Kroah-Hartman
94f6c59dcf [PATCH] devfs: Remove devfs support from the ide subsystem.
Also removes the ide drive devfs_name field as it's no longer needed

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-26 12:25:06 -07:00
Greg Kroah-Hartman
aa4148cfc7 [PATCH] devfs: Remove devfs support from the serial subsystem
Also fixes all serial drivers.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-26 12:25:05 -07:00
Greg Kroah-Hartman
bdaf852938 [PATCH] devfs: Remove devfs from the init code
This patch removes the devfs code from the init/ directory.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-26 12:25:05 -07:00
Greg Kroah-Hartman
d8deac5094 [PATCH] devfs: Remove devfs from the kernel tree
This is the first patch in a series of patches that removes devfs
support from the kernel.  This patch removes the core devfs code, and
its private header file.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-26 12:25:05 -07:00
Linus Torvalds
09c0dc6862 Revert "[PATCH] kthread: update loop.c to use kthread"
This reverts commit c7b2eff059fcc2d1b7085ee3d84b79fd657a537b.

Hugh Dickins explains:

 "It seems too little tested: "losetup -d /dev/loop0" fails with
  EINVAL because nothing sets lo_thread; but even when you patch
  loop_thread() to set lo->lo_thread = current, it can't survive
  more than a few dozen iterations of the loop below (with a tmpfs
  mounted on /tst):

	j=0
	cp /dev/zero /tst
	while :
	do
	    let j=j+1
	    echo "Doing pass $j"
	    losetup /dev/loop0 /tst/zero
	    mkfs -t ext2 -b 1024 /dev/loop0 >/dev/null 2>&1
	    mount -t ext2 /dev/loop0 /mnt
	    umount /mnt
	    losetup -d /dev/loop0
	done

  it collapses with failed ioctl then BUG_ON(!bio).

  I think the original lo_done completion was more subtle and safe
  than the kthread conversion has allowed for."

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 11:55:42 -07:00
Linus Torvalds
2a2ed2db35 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild: (40 commits)
  kbuild: trivial fixes in Makefile
  kbuild: adding symbols in Kconfig and defconfig to TAGS
  kbuild: replace abort() with exit(1)
  kbuild: support for %.symtypes files
  kbuild: fix silentoldconfig recursion
  kbuild: add option for stripping modules while installing them
  kbuild: kill some false positives from modpost
  kbuild: export-symbol usage report generator
  kbuild: fix make -rR breakage
  kbuild: append -dirty for updated but uncommited changes
  kbuild: append git revision for all untagged commits
  kbuild: fix module.symvers parsing in modpost
  kbuild: ignore make's built-in rules & variables
  kbuild: bugfix with initramfs
  kbuild: modpost build fix
  kbuild: check license compatibility when building modules
  kbuild: export-type enhancement to modpost.c
  kbuild: add dependency on kernel.release to the package targets
  kbuild: `make kernelrelease' speedup
  kconfig: KCONFIG_OVERWRITECONFIG
  ...
2006-06-26 11:05:15 -07:00
Linus Torvalds
972d19e837 Merge master.kernel.org:/pub/scm/linux/kernel/git/herbert/crypto-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  [CRYPTO] tcrypt: Forbid tcrypt from being built-in
  [CRYPTO] aes: Add wrappers for assembly routines
  [CRYPTO] tcrypt: Speed benchmark support for digest algorithms
  [CRYPTO] tcrypt: Return -EAGAIN from module_init()
  [CRYPTO] api: Allow replacement when registering new algorithms
  [CRYPTO] api: Removed const from cra_name/cra_driver_name
  [CRYPTO] api: Added cra_init/cra_exit
  [CRYPTO] api: Fixed incorrect passing of context instead of tfm
  [CRYPTO] padlock: Rearrange context structure to reduce code size
  [CRYPTO] all: Pass tfm instead of ctx to algorithms
  [CRYPTO] digest: Remove unnecessary zeroing during init
  [CRYPTO] aes-i586: Get rid of useless function wrappers
  [CRYPTO] digest: Add alignment handling
  [CRYPTO] khazad: Use 32-bit reads on key
2006-06-26 11:03:29 -07:00
Linus Torvalds
cdf4f383a4 Merge master.kernel.org:/pub/scm/linux/kernel/git/dtor/input
* master.kernel.org:/pub/scm/linux/kernel/git/dtor/input:
  Input: iforce - remove some pointless casts
  Input: psmouse - add support for Intellimouse 4.0
  Input: atkbd - fix HANGEUL/HANJA keys
  Input: fix misspelling of Hangeul key
  Input: via-pmu - add input device support
  Input: rearrange exports
  Input: fix formatting to better follow CodingStyle
  Input: reset name, phys and uniq when unregistering
  Input: return correct size when reading modalias attribute
  Input: change my e-mail address in MAINTAINERS file
  Input: fix potential overflows in driver/input/keyboard
  Input: fix potential overflows in driver/input/touchscreen
  Input: fix potential overflows in driver/input/joystick
  Input: fix potential overflows in driver/input/mouse
  Input: fix accuracy of fixp-arith.h
  Input: iforce - use ENOSPC instead of ENOMEM
  Input: constify drivers/char/keyboard.c
2006-06-26 11:01:58 -07:00
Linus Torvalds
5f2f444136 Merge master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb
* master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb:
  V4L/DVB (4227): Update this driver for recent header file movement.
  V4L/DVB (4223): Add V4L2_CID_MPEG_STREAM_VBI_FMT control
  V4L/DVB (4222): Always switch tuner mode when calling VIDIOC_S_FREQUENCY.
  V4L/DVB (4221): Add HM12 YUV format define.
  V4L/DVB (4219): Av7110: analog sound output of DVB-C rev 2.3
  V4L/DVB (4217): Fix a misplaced closing bracket/else, which caused swzigzag not to be called
  V4L/DVB (4215): Make VIDEO_CX88_BLACKBIRD a separate build option
  V4L/DVB (4214): Make VIDEO_CX2341X a selectable build option
  V4L/DVB (4213): Cx88: cleanups
  V4L/DVB (4211): Fix an Oops for all fe that have get_frontend_algo == NULL
2006-06-26 10:54:02 -07:00
Linus Torvalds
81a07d7588 Merge branch 'x86-64'
* x86-64: (83 commits)
  [PATCH] x86_64: x86_64 stack usage debugging
  [PATCH] x86_64: (resend) x86_64 stack overflow debugging
  [PATCH] x86_64: msi_apic.c build fix
  [PATCH] x86_64: i386/x86-64 Add nmi watchdog support for new Intel CPUs
  [PATCH] x86_64: Avoid broadcasting NMI IPIs
  [PATCH] x86_64: fix apic error on bootup
  [PATCH] x86_64: enlarge window for stack growth
  [PATCH] x86_64: Minor string functions optimizations
  [PATCH] x86_64: Move export symbols to their C functions
  [PATCH] x86_64: Standardize i386/x86_64 handling of NMI_VECTOR
  [PATCH] x86_64: Fix modular pc speaker
  [PATCH] x86_64: remove sys32_ni_syscall()
  [PATCH] x86_64: Do not use -ffunction-sections for modules
  [PATCH] x86_64: Add cpu_relax to apic_wait_icr_idle
  [PATCH] x86_64: adjust kstack_depth_to_print default
  [PATCH] i386/x86-64: adjust /proc/interrupts column headings
  [PATCH] x86_64: Fix race in cpu_local_* on preemptible kernels
  [PATCH] x86_64: Fix fast check in safe_smp_processor_id
  [PATCH] x86_64: x86_64 setup.c - printing cmp related boottime information
  [PATCH] i386/x86-64/ia64: Move polling flag into thread_info_status
  ...

Manual resolve of trivial conflict in arch/i386/kernel/Makefile
2006-06-26 10:51:09 -07:00
Vojtech Pavlik
05ebb76109 [PATCH] x86_64: Add useful constants to time.h
In timekeeping code, one often does need to use conversion constants. Naming
these leads to code that's easier to understand, showing the reader between
which units the conversion is made.

Signed-off-by: Vojtech Pavlik <vojtech@suse.cz>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:19 -07:00
Jan Beulich
83f4fcce7f [PATCH] x86_64: allow unwinder to build without module support
Add proper conditionals to be able to build with CONFIG_MODULES=n.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:18 -07:00
Jan Beulich
c33bd9aac0 [PATCH] i386/x86-64: fall back to old-style call trace if no unwinding
If no unwinding is possible at all for a certain exception instance,
fall back to the old style call trace instead of not showing any trace
at all.

Also, allow setting the stack trace mode at the command line.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:18 -07:00
Jan Beulich
4552d5dc08 [PATCH] x86_64: reliable stack trace support
These are the generic bits needed to enable reliable stack traces based
on Dwarf2-like (.eh_frame) unwind information. Subsequent patches will
enable x86-64 and i386 to make use of this.

Thanks to Andi Kleen and Ingo Molnar, who pointed out several possibilities
for improvement.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:17 -07:00
Andi Kleen
08cd36570e [PATCH] x86_64: Optimize bitmap_weight for small bitmaps
Use inline code bitmaps <= BITS_PER_LONG in bitmap_weight. This
gives _much_ better code.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:16 -07:00
Andi Kleen
bebfa1013e [PATCH] x86_64: Add compat_printk and sysctl to turn off compat layer warnings
Sometimes e.g. with crashme the compat layer warnings can be noisy.
Add a way to turn them off by gating all output through compat_printk
that checks a global sysctl. The default is not changed.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:16 -07:00
Linus Torvalds
61a46dc9d1 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (42 commits)
  [IOAT]: Do not dereference THIS_MODULE directly to set unsafe.
  [NETROM]: Fix possible null pointer dereference.
  [NET] netpoll: break recursive loop in netpoll rx path
  [NET] netpoll: don't spin forever sending to stopped queues
  [IRDA]: add some IBM think pads
  [ATM]: atm/mpc.c warning fix
  [NET]: skb_find_text ignores to argument
  [NET]: make net/core/dev.c:netdev_nit static
  [NET]: Fix GSO problems in dev_hard_start_xmit()
  [NET]: Fix CHECKSUM_HW GSO problems.
  [TIPC]: Fix incorrect correction to discovery timer frequency computation.
  [TIPC]: Get rid of dynamically allocated arrays in broadcast code.
  [TIPC]: Fixed link switchover bugs
  [TIPC]: Enhanced & cleaned up system messages; fixed 2 obscure memory leaks.
  [TIPC]: First phase of assert() cleanup
  [TIPC]: Disallow config operations that aren't supported in certain modes.
  [TIPC]: Fixed memory leak in tipc_link_send() when destination is unreachable
  [TIPC]: Added missing warning for out-of-memory condition
  [TIPC]: Withdrawing all names from nameless port now returns success, not error
  [TIPC]: Optimized argument validation done by connect().
  ...
2006-06-26 10:08:13 -07:00
NeilBrown
4254376914 [PATCH] md: Don't write dirty/clean update to spares - leave them alone
- record the 'event' count on each individual device (they
  might sometimes be slightly different now)
- add a new value for 'sb_dirty': '3' means that the super
  block only needs to be updated to record a clean<->dirty
  transition.
- Prefer odd event numbers for dirty states and even numbers
  for clean states
- Using all the above, don't update the superblock on
  a spare device if the update is just doing a clean-dirty
  transition.  To accomodate this, a transition from
  dirty back to clean might now decrement the events counter
  if nothing else has changed.

The net effect of this is that spare drives will not see any IO requests
during normal running of the array, so they can go to sleep if that is what
they want to do.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:39 -07:00
NeilBrown
d785a06a0b [PATCH] md/bitmap: change md/bitmap file handling to use bmap to file blocks
If md is asked to store a bitmap in a file, it tries to hold onto the page
cache pages for that file, manipulate them directly, and call a cocktail of
operations to write the file out.  I don't believe this is a supportable
approach.

This patch changes the approach to use the same approach as swap files.  i.e.
bmap is used to enumerate all the block address of parts of the file and we
write directly to those blocks of the device.

swapfile only uses parts of the file that provide a full pages at contiguous
addresses.  We don't have that luxury so we have to cope with pages that are
non-contiguous in storage.  To handle this we attach buffers to each page, and
store the addresses in those buffers.

With this approach the pagecache may contain data which is inconsistent with
what is on disk.  To alleviate the problems this can cause, md invalidates the
pagecache when releasing the file.  If the file is to be examined while the
array is active (a non-critical but occasionally useful function), O_DIRECT io
must be used.  And new version of mdadm will have support for this.

This approach simplifies a lot of code:
 - we no longer need to keep a list of pages which we need to wait for,
   as the b_endio function can keep track of how many outstanding
   writes there are.  This saves a mempool.
 - -EAGAIN returns from write_page are no longer possible (not sure if
    they ever were actually).

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:38 -07:00
NeilBrown
0b79ccf0cd [PATCH] md/bitmap: remove bitmap writeback daemon
md/bitmap currently has a separate thread to wait for writes to the bitmap
file to complete (as we cannot get a callback on that action).

However this isn't needed as bitmap_unplug is called from process context and
waits for the writeback thread to do it's work.  The same result can be
achieved by doing the waiting directly in bitmap_unplug.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:38 -07:00
Adrian Bunk
5e56341d02 [PATCH] md: make md_print_devices() static
This patch makes the needlessly global md_print_devices() static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:37 -07:00
NeilBrown
c93983bf51 [PATCH] md: support stripe/offset mode in raid10
The "industry standard" DDF format allows for a stripe/offset layout where
data is duplicated on different stripes.  e.g.

  A  B  C  D
  D  A  B  C
  E  F  G  H
  H  E  F  G

(columns are drives, rows are stripes, LETTERS are chunks of data).

This is similar to raid10's 'far' mode, but not quite the same.  So enhance
'far' mode with a 'far/offset' option which follows the layout of DDFs
stripe/offset.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:37 -07:00
NeilBrown
7c7546ccf6 [PATCH] md: allow a linear array to have drives added while active
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:37 -07:00
NeilBrown
5fd6c1dce0 [PATCH] md: allow checkpoint of recovery with version-1 superblock
For a while we have had checkpointing of resync.  The version-1 superblock
allows recovery to be checkpointed as well, and this patch implements that.

Due to early carelessness we need to add a feature flag to signal that the
recovery_offset field is in use, otherwise older kernels would assume that a
partially recovered array is in fact fully recovered.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:37 -07:00
NeilBrown
16a53ecc35 [PATCH] md: merge raid5 and raid6 code
There is a lot of commonality between raid5.c and raid6main.c.  This patches
merges both into one module called raid456.  This saves a lot of code, and
paves the way for online raid5->raid6 migrations.

There is still duplication, e.g.  between handle_stripe5 and handle_stripe6.
This will probably be cleaned up later.

Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:37 -07:00
NeilBrown
8932c2e0dc [PATCH] md: remove arbitrary limit on chunk size
The largest chunk size the code can support without substantial surgery is
2^30 bytes, so make that the limit instead of an arbitrary 4Meg.  Some day,
the 'chunksize' should change to a sector-shift instead of a byte-count.  Then
no limit would be needed.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:36 -07:00
Alasdair G Kergon
72d9486169 [PATCH] dm: improve error message consistency
Tidy device-mapper error messages to include context information
automatically.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:36 -07:00
Alasdair G Kergon
5c6bd75d06 [PATCH] dm: prevent removal if open
If you misuse the device-mapper interface (or there's a bug in your userspace
tools) it's possible to end up with 'unlinked' mapped devices that cannot be
removed until you reboot (along with uninterruptible processes).

This patch prevents you from removing a device that is still open.

It introduces dm_lock_for_deletion() which is called when a device is about to
be removed to ensure that nothing has it open and nothing further can open it.
 It uses a private open_count for this which also lets us remove one of the
problematic bdget_disk() calls elsewhere.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:36 -07:00
David Teigland
c2ade42dd3 [PATCH] dm: create error table
Add a library function dm_create_error_table() to create a table that rejects
any I/O sent to a device with EIO.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:36 -07:00
Alasdair G Kergon
17b2f66f2a [PATCH] dm: add exports
Move definitions of core device-mapper functions for manipulating mapped
devices and their tables to <linux/device-mapper.h> advertising their
availability for use elsewhere in the kernel.

Protect the contents of device-mapper.h with ifdef __KERNEL__.  And throw
in a few formatting clean-ups and extra comments.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:36 -07:00
Jeff Mahoney
5806f07cd2 [PATCH] lib: add idr_replace
This patch adds idr_replace() to replace an existing pointer in a single
operation.

Device-mapper will use this to update the pointer it stored against a given
id.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:34 -07:00
Antonino A. Daplas
e614b18dce [PATCH] VT binding: Update fbcon to support binding
The control for binding/unbinding is moved from fbcon to the console layer.
Thus the fbcon sysfs attributes, attach and detach, are also gone.

    1. Add a notifier event that tells fbcon if a framebuffer driver has been
       unregistered.  If no registered driver remains, fbcon will unregister
       itself from the console layer.

    2. Replaced calls to give_up_console() with unregister_con_driver().

    3. Still use take_over_console() instead of register_con_driver() to
       maintain compatibility

    4. Respect the parameter first_fb_vc and last_fb_vc instead of using 0 and
       MAX_NR_CONSOLES - 1. These parameters are settable by the user.

    5. When fbcon is completely unbound from the console layer, fbcon will
       also release (iow, decrement module reference counts to zero) all fbdev
       drivers. In other words, a bind or unbind request from the console layer
       will propagate down to the framebuffer drivers.

    6. If fbcon is not bound to the console, it will ignore all notifier
       events (except driver registration and unregistration) and all sysfs
       requests.

Signed-off-by: Antonino Daplas <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:33 -07:00
Antonino A. Daplas
3e795de763 [PATCH] VT binding: Add binding/unbinding support for the VT console
The framebuffer console is now able to dynamically bind and unbind from the VT
console layer.  Due to the way the VT console layer works, the drivers
themselves decide when to bind or unbind.  However, it was decided that
binding must be controlled, not by the drivers themselves, but by the VT
console layer.  With this, dynamic binding is possible for all VT console
drivers, not just fbcon.

Thus, the VT console layer will impose the following to all VT console
drivers:

- all registered VT console drivers will be entered in a private list
- drivers can register themselves to the VT console layer, but they cannot
  decide when to bind or unbind. (Exception: To maintain backwards
  compatibility, take_over_console() will automatically bind the driver after
  registration.)
- drivers can remove themselves from the list by unregistering from the VT
  console layer. A prerequisite for unregistration is that the driver must not
  be bound.

The following functions are new in the vt.c:

register_con_driver() - public function, this function adds the VT console
driver to an internal list maintained by the VT console

bind_con_driver() - private function, it binds the driver to the console

take_over_console() is changed to call register_con_driver() followed by a
bind_con_driver().  This is the only time drivers can decide when to bind to
the VT layer.  This is to maintain backwards compatibility.

unbind_con_driver() - private function, it unbinds the driver from its
console.  The vacated consoles will be taken over by the default boot console
driver.

unregister_con_driver() - public function, removes the driver from the
internal list maintained by the VT console.  It will only succeed if the
driver is currently unbound.

con_is_bound() checks if the driver is currently bound or not

give_up_console() is just a wrapper to unregister_con_driver().

There are also 3 additional functions meant to be called only by the tty layer
for sysfs control:

	vt_bind() - calls bind_con_driver()
	vt_unbind() - calls unbind_con_driver()
	vt_show_drivers() - shows the list of registered drivers

Most VT console drivers will continue to work as is, but might have problems
when unbinding or binding which should be fixable with minimal changes.

Signed-off-by: Antonino Daplas <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:33 -07:00
Antonino A. Daplas
9a17917671 [PATCH] Detaching fbcon: sdd sysfs class device entry for fbcon
In order for this feature to work, an interface will be needed.  The most
appropriate is sysfs.  However, the framebuffer console has no sysfs entry
yet.  This will create a sysfs class device entry for fbcon under
/sys/class/graphics.

Add a class_device entry 'fbcon' under class 'graphics'.  Console-specific
attributes which where previously under class/graphics/fb[x] are moved to
class/graphics/fbcon.  These attributes, 'con_rotate' and 'con_rotate_all',
are also renamed to 'rotate' and 'rotate_all' respectively.

Signed-off-by: Antonino Daplas <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:32 -07:00
Oleg Nesterov
d5f70c00ad [PATCH] coredump: kill ptrace related stuff
With this patch zap_process() sets SIGNAL_GROUP_EXIT while sending SIGKILL to
the thread group.  This means that a TASK_TRACED task

	1. Will be awakened by signal_wake_up(1)

	2. Can't sleep again via ptrace_notify()

	3. Can't go to do_signal_stop() after return
	   from ptrace_stop() in get_signal_to_deliver()

So we can remove all ptrace related stuff from coredump path.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:27 -07:00
Eric W. Biederman
13b41b0949 [PATCH] proc: Use struct pid not struct task_ref
Incrementally update my proc-dont-lock-task_structs-indefinitely patches so
that they work with struct pid instead of struct task_ref.

Mostly this is a straight 1-1 substitution.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:26 -07:00
Eric W. Biederman
99f8955183 [PATCH] proc: don't lock task_structs indefinitely
Every inode in /proc holds a reference to a struct task_struct.  If a
directory or file is opened and remains open after the the task exits this
pinning continues.  With 8K stacks on a 32bit machine the amount pinned per
file descriptor is about 10K.

Normally I would figure a reasonable per user process limit is about 100
processes.  With 80 processes, with a 1000 file descriptors each I can trigger
the 00M killer on a 32bit kernel, because I have pinned about 800MB of useless
data.

This patch replaces the struct task_struct pointer with a pointer to a struct
task_ref which has a struct task_struct pointer.  The so the pinning of dead
tasks does not happen.

The code now has to contend with the fact that the task may now exit at any
time.  Which is a little but not muh more complicated.

With this change it takes about 1000 processes each opening up 1000 file
descriptors before I can trigger the OOM killer.  Much better.

[mlp@google.com: task_mmu small fixes]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Paul Jackson <pj@sgi.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Albert Cahalan <acahalan@gmail.com>
Signed-off-by: Prasanna Meda <mlp@google.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:25 -07:00
Eric W. Biederman
48e6484d49 [PATCH] proc: Rewrite the proc dentry flush on exit optimization
To keep the dcache from filling up with dead /proc entries we flush them on
process exit.  However over the years that code has gotten hairy with a
dentry_pointer and a lock in task_struct and misdocumented as a correctness
feature.

I have rewritten this code to look and see if we have a corresponding entry in
the dcache and if so flush it on process exit.  This removes the extra fields
in the task_struct and allows me to trivially handle the case of a
/proc/<tgid>/task/<pid> entry as well as the current /proc/<pid> entries.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:24 -07:00
Eric W. Biederman
aed7a6c476 [PATCH] proc: Replace proc_inode.type with proc_inode.fd
The sole renaming use of proc_inode.type is to discover the file descriptor
number, so just store the file descriptor number and don't wory about
processing this field.  This removes any /proc limits on the maximum number of
file descriptors, and clears the path to make the hard coded /proc inode
numbers go away.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:24 -07:00
Hansjoerg Lipp
5024ad4af6 [PATCH] i4l: Gigaset drivers: add IOCTLs to compat_ioctl.h
Add the IOCTLs of the Gigaset drivers to compat_ioctl.h in order to make
them available for 32 bit programs on 64 bit platforms.  Please merge.

Signed-off-by: Hansjoerg Lipp <hjlipp@web.de>
Acked-by: Tilman Schmidt <tilman@imap.cc>
Cc: Karsten Keil <kkeil@suse.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:23 -07:00
Roman Zippel
19923c190e [PATCH] fix and optimize clock source update
This fixes the clock source updates in update_wall_time() to correctly
track the time coming in via current_tick_length().  Optimize the fast
paths to be as short as possible to keep the overhead low.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Acked-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:21 -07:00
Jim Cromie
7f9f303aa3 [PATCH] generic-time: add macro to simplify/hide mask constants
Add a CLOCKSOURCE_MASK macro to simplify initializing the mask for a struct
clocksource, and use it to replace literal mask constants in the various
clocksource drivers.

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Acked-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:21 -07:00
john stultz
a275254975 [PATCH] time: rename clocksource functions
As suggested by Roman Zippel, change clocksource functions to use
clocksource_xyz rather then xyz_clocksource to avoid polluting the
namespace.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:21 -07:00
john stultz
cf3c769b4b [PATCH] Time: Introduce arch generic time accessors
Introduces clocksource switching code and the arch generic time accessor
functions that use the clocksource infrastructure.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:20 -07:00
john stultz
5eb6d20533 [PATCH] Time: Use clocksource abstraction for NTP adjustments
Instead of incrementing xtime by tick_nsec + ntp adjustments, use the
clocksource abstraction to increment and scale time.  Using the clocksource
abstraction allows other clocksources to be used consistently in the face of
late or lost ticks, while preserving the existing behavior via the jiffies
clocksource.

This removes the need to keep time_phase adjustments as we just use the
current_tick_length() function as the NTP interface and accumulate time using
shifted nanoseconds.

The basics of this design was by Roman Zippel, however it is my own
interpretation and implementation, so the credit should go to him and the
blame to me.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:20 -07:00
john stultz
260a42309b [PATCH] Time: Let user request precision from current_tick_length()
Change the current_tick_length() function so it takes an argument which
specifies how much precision to return in shifted nanoseconds.  This provides
a simple way to convert between NTPs internal nanoseconds shifted by
(SHIFT_SCALE - 10) to other shifted nanosecond units that are used by the
clocksource abstraction.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:20 -07:00
john stultz
ad596171ed [PATCH] Time: Use clocksource infrastructure for update_wall_time
Modify the update_wall_time function so it increments time using the
clocksource abstraction instead of jiffies.  Since the only clocksource driver
currently provided is the jiffies clocksource, this should result in no
functional change.  Additionally, a timekeeping_init and timekeeping_resume
function has been added to initialize and maintain some of the new timekeping
state.

[hirofumi@mail.parknet.co.jp: fixlet]
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:20 -07:00
john stultz
734efb467b [PATCH] Time: Clocksource Infrastructure
This introduces the clocksource management infrastructure.  A clocksource is a
driver-like architecture generic abstraction of a free-running counter.  This
code defines the clocksource structure, and provides management code for
registering, selecting, accessing and scaling clocksources.

Additionally, this includes the trivial jiffies clocksource, a lowest common
denominator clocksource, provided mainly for use as an example.

[hirofumi@mail.parknet.co.jp: Don't enable IRQ too early]
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:20 -07:00
Michael Buesch
844dd05fec [PATCH] Add new generic HW RNG core
Signed-off-by: Michael Buesch <mb@bu3sch.de>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:19 -07:00
David Howells
7e047ef5fe [PATCH] keys: sort out key quota system
Add the ability for key creation to overrun the user's quota in some
circumstances - notably when a session keyring is created and assigned to a
process that didn't previously have one.

This means it's still possible to log in, should PAM require the creation of a
new session keyring, and fix an overburdened key quota.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:18 -07:00
Andreas Mohr
d6e05edc59 spelling fixes
acquired (aquired)
contiguous (contigious)
successful (succesful, succesfull)
surprise (suprise)
whether (weather)
some other misspellings

Signed-off-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-26 18:35:02 +02:00
Hans Verkuil
8cbde94be3 V4L/DVB (4223): Add V4L2_CID_MPEG_STREAM_VBI_FMT control
V4L2_CID_MPEG_STREAM_VBI_FMT controls if and how VBI data is embedded in
an MPEG stream. Currently only one format is supported: the format designed
for the ivtv driver. This should be extended with new standard formats
(such as defined for DVB) in the future.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-06-26 09:21:45 -03:00
Hans Verkuil
91a972910d V4L/DVB (4221): Add HM12 YUV format define.
HM12 is a YUV 4:1:1 format used by the cx2341x MPEG encoder/decoder for
the raw YUV input/output. The Y and UV planes are broken up in 16x16
macroblocks and each macroblock is transmitted in turn (row by row).

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-06-26 09:21:32 -03:00
Herbert Xu
d913ea0d6b [CRYPTO] api: Removed const from cra_name/cra_driver_name
We do need to change these names now and even more so in future with
instantiated algorithms.  So let's stop lying to the compiler and get
rid of the const modifiers.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-06-26 17:34:40 +10:00
Herbert Xu
c7fc05992a [CRYPTO] api: Added cra_init/cra_exit
This patch adds the hooks cra_init/cra_exit which are called during a tfm's
construction and destruction respectively.  This will be used by the instances
to allocate child tfm's.

For now this lets us get rid of the coa_init/coa_exit functions which are
used for exactly that purpose (unlike the dia_init function which is called
for each transaction).

In fact the coa_exit path is currently buggy as it may get called twice
when an error is encountered during initialisation.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-06-26 17:34:40 +10:00
Herbert Xu
6c2bb98bc3 [CRYPTO] all: Pass tfm instead of ctx to algorithms
Up until now algorithms have been happy to get a context pointer since
they know everything that's in the tfm already (e.g., alignment, block
size).

However, once we have parameterised algorithms, such information will
be specific to each tfm.  So the algorithm API needs to be changed to
pass the tfm structure instead of the context pointer.

This patch is basically a text substitution.  The only tricky bit is
the assembly routines that need to get the context pointer offset
through asm-offsets.h.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-06-26 17:34:39 +10:00
Neil Horman
068c6e98bc [NET] netpoll: break recursive loop in netpoll rx path
The netpoll system currently has a rx to tx path via:

netpoll_rx
 __netpoll_rx
  arp_reply
   netpoll_send_skb
    dev->hard_start_tx

This rx->tx loop places network drivers at risk of inadvertently causing a
deadlock or BUG halt by recursively trying to acquire a spinlock that is
used in both their rx and tx paths (this problem was origionally reported
to me in the 3c59x driver, which shares a spinlock between the
boomerang_interrupt and boomerang_start_xmit routines).

This patch breaks this loop, by queueing arp frames, so that they can be
responded to after all receive operations have been completed.  Tested by
myself and the reported with successful results.

Specifically it was tested with netdump.  Heres the BZ with details:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=194055

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-26 00:04:27 -07:00
Adrian Bunk
6048126440 [NET]: make net/core/dev.c:netdev_nit static
netdev_nit can now become static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:58:10 -07:00
Jerome Pinot
b9ab58dd8e Input: fix misspelling of Hangeul key
Fix a mispelling of the korean alphabet name in the input subsystem.
See http://en.wikipedia.org/wiki/Hangeul#Names for more details.

KEY_HANGUEL left to not break people

Signed-off-by: Jerome Pinot <ngc891@gmail.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2006-06-26 01:51:23 -04:00
Dmitry Torokhov
f60d2b111c Input: reset name, phys and uniq when unregistering
Name, phys and uniq are quite often constant strings in moules implementing
particular input device. If a module unregisters input device and then gets
unloaded, the device could still be present in memory (pinned via sysfs),
but aforementioned members would point to some random memory. Set them all
to NULL when unregistering so sysfs handlers won't try dereferencing them.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2006-06-26 01:48:36 -04:00
Venkatesh Pallipadi
46f18e3a28 ACPI: HW P-state coordination support
Treat HW coordination as independent CPUs.
This enables per-cpu monintoring of P-states

http://bugzilla.kernel.org/show_bug.cgi?id=5737

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-06-26 00:34:43 -04:00
Linus Torvalds
3448097fcc Revert "swsusp special saveable pages support" commits
This reverts commits

  3e3318dee0878d42ed62a19c292a2ac284135db3 [PATCH] swsusp: x86_64 mark special saveable/unsaveable pages
  b6370d96e09944c6e3ae8d5743ca8a8ab1f79f6c [PATCH] swsusp: i386 mark special saveable/unsaveable pages
  ce4ab0012b32c1a4a1d6e934aeb73bf3151c48d9 [PATCH] swsusp: add architecture special saveable pages support

because not only do they apparently cause page faults on x86, the
infrastructure doesn't compile on powerpc.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 18:41:00 -07:00
Linus Torvalds
1d77062b14 Merge git://git.linux-nfs.org/pub/linux/nfs-2.6
* git://git.linux-nfs.org/pub/linux/nfs-2.6: (51 commits)
  nfs: remove nfs_put_link()
  nfs-build-fix-99
  git-nfs-build-fixes
  Merge branch 'odirect'
  NFS: alloc nfs_read/write_data as direct I/O is scheduled
  NFS: Eliminate nfs_get_user_pages()
  NFS: refactor nfs_direct_free_user_pages
  NFS: remove user_addr, user_count, and pos from nfs_direct_req
  NFS: "open code" the NFS direct write rescheduler
  NFS: Separate functions for counting outstanding NFS direct I/Os
  NLM: Fix reclaim races
  NLM: sem to mutex conversion
  locks.c: add the fl_owner to nlm_compare_locks
  NFS: Display the chosen RPCSEC_GSS security flavour in /proc/mounts
  NFS: Split fs/nfs/inode.c
  NFS: Fix typo in nfs_do_clone_mount()
  NFS: Fix compile errors introduced by referrals patches
  NFSv4: Ensure that referral mounts bind to a reserved port
  NFSv4: A root pathname is sent as a zero component4
  NFSv4: Follow a referral
  ...
2006-06-25 10:54:14 -07:00
Linus Torvalds
25581ad107 Merge master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb
* master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (244 commits)
  V4L/DVB (4210b): git-dvb: tea575x-tuner build fix
  V4L/DVB (4210a): git-dvb versus matroxfb
  V4L/DVB (4209): Added some BTTV PCI IDs for newer boards
  Fixes some sync issues between V4L/DVB development and GIT
  V4L/DVB (4206): Cx88-blackbird: always set encoder height based on tvnorm->id
  V4L/DVB (4205): Merge tda9887 module into tuner.
  V4L/DVB (4203): Explicitly set the enum values.
  V4L/DVB (4202): allow selecting CX2341x port mode
  V4L/DVB (4200): Disable bitrate_mode when encoding mpeg-1.
  V4L/DVB (4199): Add cx2341x-specific control array to cx2341x.c
  V4L/DVB (4198): Avoid newer usages of obsoleted experimental MPEGCOMP API
  V4L/DVB (4197): Port new MPEG API to saa7134-empress with saa6752hs
  V4L/DVB (4196): Port cx88-blackbird to the new MPEG API.
  V4L/DVB (4193): Update cx2341x fw encoding API doc.
  V4L/DVB (4192): Use control helpers for saa7115, cx25840, msp3400.
  V4L/DVB (4191): Add CX2341X MPEG encoder module.
  V4L/DVB (4190): Add helper functions for control processing to v4l2-common.
  V4L/DVB (4189): Add videodev support for VIDIOC_S/G/TRY_EXT_CTRLS.
  V4L/DVB (4188): Add new MPEG control/ioctl definitions to videodev2.h
  V4L/DVB (4186): Add support for the DNTV Live! mini DVB-T card.
  ...
2006-06-25 10:09:31 -07:00
Alexey Dobriyan
d84a84775b [PATCH] Fix "biovec-(256)" in /proc/slabinfo
Stringify does what it was told to do.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:26 -07:00
KaiGai Kohei
77787bfb44 [PATCH] pacct: none-delayed process accounting accumulation
In current 2.6.17 implementation, signal_struct refered from task_struct is
used for per-process data structure.  The pacct facility also uses it as a
per-process data structure to store stime, utime, minflt, majflt.  But those
members are saved in __exit_signal().  It's too late.

For example, if some threads exits at same time, pacct facility has a
possibility to drop accountings for a part of those threads.  (see, the
following 'The results of original 2.6.17 kernel') I think accounting
information should be completely collected into the per-process data structure
before writing out an accounting record.

This patch fixes this matter.  Accumulation of stime, utime, minflt and majflt
are done before generating accounting record.

[mingo@elte.hu: fix acct_collect() siglock bug found by lockdep]
Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:25 -07:00
KaiGai Kohei
f6ec29a42d [PATCH] pacct: avoidance to refer the last thread as a representation of the process
When pacct facility generate an 'ac_flag' field in accounting record, it
refers a task_struct of the thread which died last in the process.  But any
other task_structs are ignored.

Therefore, pacct facility drops ASU flag even if root-privilege operations are
used by any other threads except the last one.  In addition, AFORK flag is
always set when the thread of group-leader didn't die last, although this
process has called execve() after fork().

We have a same matter in ac_exitcode.  The recorded ac_exitcode is an exit
code of the last thread in the process.  There is a possibility this exitcode
is not the group leader's one.
2006-06-25 10:01:25 -07:00
KaiGai Kohei
0e4648141a [PATCH] pacct: add pacct_struct to fix some pacct bugs.
The pacct facility need an i/o operation when an accounting record is
generated.  There is a possibility to wake OOM killer up.  If OOM killer is
activated, it kills some processes to make them release process memory
regions.

But acct_process() is called in the killed processes context before calling
exit_mm(), so those processes cannot release own memory.  In the results, any
processes stop in this point and it finally cause a system stall.
2006-06-25 10:01:25 -07:00
Paul Fulghum
6f84be84b4 [PATCH] synclink_gt: add GT2 adapter support
Add support for SyncLink GT2 adapter to driver.

Signed-off-by: Paul Fulghum <paulkf@microgate.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:24 -07:00
Paul Fulghum
643f3319b9 [PATCH] add synclink_gt custom hdlc idle
Add custom HDLC idle pattern feature.

It allows the user to specify an arbitrary 8 or 16 bit repeating pattern on
the transmit data pin between HDLC frames.

In most cases the idle pattern is continuous ones or flags as supported by off
the shelf synchronous controllers and defined in the ISO3309 standard.  Some
applications (radio/satellite modems, connections to legacy military hardware)
require non-standard patterns.

Signed-off-by: Paul Fulghum <paulkf@microgate.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:24 -07:00
Randy Dunlap
9e37bd301e [PATCH] kthread: move kernel-doc and put it into DocBook
Move kthread API kernel-doc from kthread.h to kthread.c & fix it.
Add kthread API to kernel-api DocBook.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:24 -07:00
Jeremy Fitzhardinge
e905914f96 [PATCH] Implement kasprintf
Implement kasprintf, a kernel version of asprintf.  This allocates the
memory required for the formatted string, including the trailing '\0'.
Returns NULL on allocation failure.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:23 -07:00
Randy Dunlap
fa9799e33d [PATCH] ktime/hrtimer: fix kernel-doc comments
Fix kernel-doc formatting in ktime.h and hrtimer.[ch] files.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:23 -07:00
Ulrich Drepper
45c9b11a1d [PATCH] Implement AT_SYMLINK_FOLLOW flag for linkat
When the linkat() syscall was added the flag parameter was added in the
last minute but it wasn't used so far.  The following patch should change
that.  My tests show that this is all that's needed.

If OLDNAME is a symlink setting the flag causes linkat to follow the
symlink and create a hardlink with the target.  This is actually the
behavior POSIX demands for link() as well but Linux wisely does not do
this.  With this flag (which will most likely be in the next POSIX
revision) the programmer can choose the behavior, defaulting to the safe
variant.  As a side effect it is now possible to implement a
POSIX-compliant link(2) function for those who are interested.

  touch file
  ln -s file symlink

  linkat(fd, "symlink", fd, "newlink", 0)
    -> newlink is hardlink of symlink

  linkat(fd, "symlink", fd, "newlink", AT_SYMLINK_FOLLOW)
    -> newlink is hardlink of file

The value of AT_SYMLINK_FOLLOW is determined by the definition we already
use in glibc.

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:22 -07:00
Serge E. Hallyn
c7b2eff059 [PATCH] kthread: update loop.c to use kthread
Update loop.c to use a kthread instead of a deprecated kernel_thread for
loop devices.

[akpm@osdl.org: don't change the thread's name]
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:20 -07:00
Miklos Szeredi
a4d27e75ff [PATCH] fuse: add request interruption
Add synchronous request interruption.  This is needed for file locking
operations which have to be interruptible.  However filesystem may implement
interruptibility of other operations (e.g.  like NFS 'intr' mount option).

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:19 -07:00
Miklos Szeredi
7142125937 [PATCH] fuse: add POSIX file locking support
This patch adds POSIX file locking support to the fuse interface.

This implementation doesn't keep any locking state in kernel.  Unlocking on
close() is handled by the FLUSH message, which now contains the lock owner id.

Mandatory locking is not supported.  The filesystem may enfoce mandatory
locking in userspace if needed.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:19 -07:00
Jan Engelhardt
3e8c54fad8 [PATCH] fuse: use MISC_MAJOR
The following patches add POSIX file locking to the fuse interface.

Additional changes ralated to this are:

  - asynchronous interrupt of requests by SIGKILL no longer supported

  - separate control filesystem, instead of using sysfs objects

  - add support for synchronously interrupting requests

Details are documented in Documentation/filesystems/fuse.txt throughout the
patches.

This patch:

Have fuse.h use MISC_MAJOR rather than a hardcoded '10'.

Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:19 -07:00
Stephen Hemminger
eab03ac7bd [PATCH] Get rid of /proc/sys/proc
The table is empty, why does it still exist?

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:15 -07:00
Andrew Victor
8232212e0b [PATCH] RTC: Add rtc_year_days() to calculate tm_yday
RTC: Add exported function rtc_year_days() to calculate the tm_yday value.

Signed-off-by: Andrew Victor <andrew@sanpeople.com>
Signed-off-by: Alessandro Zummo <a.zummo@towertech.it>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:14 -07:00
Raphael Assenat
362600fe60 [PATCH] Add v3020 RTC support
This patch adds support for the v3020 RTC from EM Microelectronic.

The v3020 RTC is designed to be connected on a bus using only one data bit.
 Since any data bit may be used, it is necessary to specify this to the
driver by passing a struct v3020_platform_data pointer (see
include/linux/rtc-v3020.h) to the driver.

Part of the following code comes from the kernel patchs produced by
Compulab for their products.  The original file (available here:
http://raph.people.8d.com/misc/emv3020.c) was released under the terms of
the GPL license.

[akpm@osdl.org: cleanups]
Signed-off-by: Raphael Assenat <raph@raphnet.net>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:14 -07:00
Alessandro Zummo
110d693d58 [PATCH] rtc subsystem: add capability checks
Centralize CAP_SYS_XXX checks to avoid duplicate code and missing checks in
the drivers.

Signed-off-by: Alessandro Zummo <a.zummo@towertech.it>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:14 -07:00
Atsushi Nemoto
655066c383 [PATCH] RTC: rtc-dev UIE emulation
Import genrtc's RTC UIE emulation (CONFIG_GEN_RTC_X) to rtc-dev driver with
slight adjustments/refinements.  This makes UIE-less rtc drivers work
better with programs doing read/poll on /dev/rtc, such as hwclock.  This
emulation should not harm rtc drivers with UIE support, since
rtc_dev_ioctl() calls underlaying rtc driver's ioctl() first.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Acked-by: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:13 -07:00
Davide Libenzi
3419b23a91 [PATCH] epoll: use unlocked wqueue operations
A few days ago Arjan signaled a lockdep red flag on epoll locks, and
precisely between the epoll's device structure lock (->lock) and the wait
queue head lock (->lock).

Like I explained in another email, and directly to Arjan, this can't happen
in reality because of the explicit check at eventpoll.c:592, that does not
allow to drop an epoll fd inside the same epoll fd.  Since lockdep is
working on per-structure locks, it will never be able to know of policies
enforced in other parts of the code.

It was decided time ago of having the ability to drop epoll fds inside
other epoll fds, that triggers a very trick wakeup operations (due to
possibly reentrant callback-driven wakeups) handled by the
ep_poll_safewake() function.  While looking again at the code though, I
noticed that all the operations done on the epoll's main structure wait
queue head (->wq) are already protected by the epoll lock (->lock), so that
locked-style functions can be used to manipulate the ->wq member.  This
makes both a lock-acquire save, and lockdep happy.

Running totalmess on my dual opteron for a while did not reveal any problem
so far:

http://www.xmailserver.org/totalmess.c

Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:13 -07:00
Alexey Dobriyan
4ad3bcf314 [PATCH] nbd: endian annotations
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Paul Clements <Paul.Clements@steeleye.com>
Cc: Jens Axboe <axboe@suse.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:13 -07:00
Andrew Morton
9de9adb615 [PATCH] for_each_cpu_mask() warning fix
On UP, this:

       cpumask_t mask = node_to_cpumask(numa_node_id());

       for_each_cpu_mask(cpu, mask)

does this:

mm/readahead.c: In function `node_readahead_aging':
mm/readahead.c:850: warning: unused variable `mask'

which is unpleasantly fixed by this:

Acked-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:12 -07:00
Mingming Cao
43d23f9039 [PATCH] ext3_fsblk_t: the rest of in-kernel filesystem blocks conversion
Convert the ext3 in-kernel filesystem blocks to ext3_fsblk_t.  Convert the
rest of all unsigned long type in-kernel filesystem blocks to ext3_fsblk_t,
and replace the printk format string respondingly.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:10 -07:00
Mingming Cao
1c2bf374a4 [PATCH] ext3_fsblk_t: filesystem, group blocks and bug fixes
Some of the in-kernel ext3 block variable type are treated as signed 4 bytes
int type, thus limited ext3 filesystem to 8TB (4kblock size based).  While
trying to fix them, it seems quite confusing in the ext3 code where some
blocks are filesystem-wide blocks, some are group relative offsets that need
to be signed value (as -1 has special meaning).  So it seem saner to define
two types of physical blocks: one is filesystem wide blocks, another is
group-relative blocks.  The following patches clarify these two types of
blocks in the ext3 code, and fix the type bugs which limit current 32 bit ext3
filesystem limit to 8TB.

With this series of patches and the percpu counter data type changes in the mm
tree, we are able to extend exts filesystem limit to 16TB.

This work is also a pre-request for the recent >32 bit ext3 work, and makes
the kernel to able to address 48 bit ext3 block a lot easier: Simply redefine
ext3_fsblk_t from unsigned long to sector_t and redefine the format string for
ext3 filesystem block corresponding.

Two RFC with a series patches have been posted to ext2-devel list and have
been reviewed and discussed:
http://marc.theaimsgroup.com/?l=ext2-devel&m=114722190816690&w=2

http://marc.theaimsgroup.com/?l=ext2-devel&m=114784919525942&w=2

Patches are tested on both 32 bit machine and 64 bit machine, <8TB ext3 and
>8TB ext3 filesystem(with the latest to be released e2fsprogs-1.39).  Tests
includes overnight fsx, tiobench, dbench and fsstress.

This patch:

Defines ext3_fsblk_t and ext3_grpblk_t, and the printk format string for
filesystem wide blocks.

This patch classifies all block group relative blocks, and ext3_fsblk_t blocks
occurs in the same function where used to be confusing before.  Also include
kernel bug fixes for filesystem wide in-kernel block variables.  There are
some fileystem wide blocks are treated as int/unsigned int type in the kernel
currently, especially in ext3 block allocation and reservation code.  This
patch fixed those bugs by converting those variables to ext3_fsblk_t(unsigned
long) type.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:10 -07:00
Ben Dooks
ad4063b0b2 [PATCH] AX88796 parallel port driver
Driver for the simple parallel port interface on the Asix AX88796 chip on
an platform_bus.

[akpm@osdl.org: x86_64 build fix]
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:09 -07:00
Alan Cox
dbe217af3b [PATCH] IDE CD end-of media error fix
This is a patch from Alan that fixes a real ide-cd.c regression causing
bogus "Media Check" failures for perfectly valid Fedora install ISOs, on
certain CD-ROM drives.

This is a forward port to 2.6.16 (from RHEL) of the minimal changes for the
end of media problem.  It may not be sufficient for some controllers
(promise notably) and it does not touch the locking so the error path
locking is as horked as in mainstream.

From: Ingo Molnar <mingo@elte.hu>

I have ported the patch to 2.6.17-rc4 and tested it by provoking
end-of-media IO errors with an unaligned ISO image.  Unlike the vanilla
kernel, the patched kernel interpreted the error condition correctly with
512 byte granularity:

 hdc: command error: status=0x51 { DriveReady SeekComplete Error }
 hdc: command error: error=0x54 { AbortedCommand LastFailedSense=0x05 }
 ide: failed opcode was: unknown
 ATAPI device hdc:
   Error: Illegal request -- (Sense key=0x05)
   Illegal mode for this track or incompatible medium -- (asc=0x64, ascq=0x00)
   The failed "Read 10" packet command was:
   "28 00 00 04 fb 78 00 00 06 00 00 00 00 00 00 00 "
 end_request: I/O error, dev hdc, sector 1306080
 Buffer I/O error on device hdc, logical block 163260
 Buffer I/O error on device hdc, logical block 163261
 Buffer I/O error on device hdc, logical block 163262

the unpatched kernel produces an incorrect error dump:

 hdc: command error: status=0x51 { DriveReady SeekComplete Error }
 hdc: command error: error=0x54 { AbortedCommand LastFailedSense=0x05 }
 ide: failed opcode was: unknown
 end_request: I/O error, dev hdc, sector 1306080
 Buffer I/O error on device hdc, logical block 163260
 hdc: command error: status=0x51 { DriveReady SeekComplete Error }
 hdc: command error: error=0x54 { AbortedCommand LastFailedSense=0x05 }
 ide: failed opcode was: unknown
 end_request: I/O error, dev hdc, sector 1306088
 Buffer I/O error on device hdc, logical block 163261
 hdc: command error: status=0x51 { DriveReady SeekComplete Error }
 hdc: command error: error=0x54 { AbortedCommand LastFailedSense=0x05 }
 ide: failed opcode was: unknown
 end_request: I/O error, dev hdc, sector 1306096
 Buffer I/O error on device hdc, logical block 163262

I do not have the right type of CD-ROM drive to reproduce the end-of-media
data corruption bug myself, but this same patch in RHEL solved it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl>
Cc: Jens Axboe <axboe@suse.de>
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:06 -07:00
Randy Dunlap
8e3a67a992 [PATCH] list.h doc: change "counter" to "cursor"
Use loop "cursor" instead of loop "counter" for list iterator descriptions.
They are not counters, they are pointers or positions.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:06 -07:00
Randy Dunlap
fe96e57d77 [PATCH] fix list.h kernel-doc
kernel-doc:

Put all short function descriptions on one line or if they are too long,
omit the short description & add a Description: section for them.

Change some list iterator descriptions to use "current" point instead of
"existing" point.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:06 -07:00
Adrian Bunk
83cc5ed3c4 [PATCH] kernel/sys.c: cleanups
- proper prototypes for the following functions:
  - ctrl_alt_del()  (in include/linux/reboot.h)
  - getrusage()     (in include/linux/resource.h)
- make the following needlessly global functions static:
  - kernel_restart_prepare()
  - kernel_kexec()

[akpm@osdl.org: compile fix]
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:06 -07:00
Michael Ellerman
76a8ad2939 [PATCH] Make printk work for really early debugging
Currently printk is no use for early debugging because it refuses to
actually print anything to the console unless
cpu_online(smp_processor_id()) is true.

The stated explanation is that console drivers may require per-cpu
resources, or otherwise barf, because the system is not yet setup
correctly.  Fair enough.

However some console drivers might be quite happy running early during
boot, in fact we have one, and so it'd be nice if printk understood that.

So I added a flag (which I would have called CON_BOOT, but that's taken)
called CON_ANYTIME, which indicates that a console is happy to be called
anytime, even if the cpu is not yet online.

Tested on a Power 5 machine, with both a CON_ANYTIME driver and a bogus
console driver that BUG()s if called while offline.  No problems AFAICT.
Built for i386 UP & SMP.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:05 -07:00
Adrian Bunk
138bb68ac9 [PATCH] fs/ufs/inode.c: make 2 functions static
Make two needlessly global functions static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:04 -07:00
Evgeniy Dushistov
ee3ffd6c12 [PATCH] ufs: make fsck -f happy
ufs super block contains some statistic about file systems, like amount of
directories, free blocks, inodes and so on.

UFS1 hold this information in one location and uses 32bit integers for such
information, UFS2 hold statistic in another location and uses 64bit integers.

There is transition variant, if UFS1 has type 44BSD and flags field in super
block has some special value this mean that we work with statistic like UFS2
does.  and this also means that nobody care about old(UFS1) statistic.

So if start fsck against such file system, after usage linux ufs driver, it
found error: at now only UFS1 like statistic is updated.

This patch should fix this.  Also it contains some minor cleanup: CodingSytle
and remove unused variables.

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:04 -07:00
Evgeniy Dushistov
647b7e87b5 [PATCH] ufs: one way to access super block
Super block of UFS usually has size >512, because of fragment size may be 512,
this cause some problems.

Currently, there are two methods to work with ufs super block:

1) split structure which describes ufs super blocks into structures with
   size <=512

2) use one structure which describes ufs super block, and hope that array
   of "buffer_head" which holds "super block", has such construction:

	bh[n]->b_data + bh[n]->b_size == bh[n + 1]->b_data

The second variant may cause some problems in the future, and usage of two
variants cause unnecessary code duplication.

This patch remove the second variant.  Also patch contains some CodingStyle
fixes.

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:04 -07:00
Evgeniy Dushistov
dd187a2603 [PATCH] ufs: little directory lookup optimization
This patch make little optimization of ufs_find_entry like "ext2" does.  Save
number of page and reuse it again in the next call.

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:03 -07:00
Evgeniy Dushistov
abf5d15fd2 [PATCH] ufs: easy debug
Currently to turn on debug mode "user" has to edit ~10 files, to turn off he
has to do it again.

This patch introduce such changes:
1)turn on(off) debug messages via ".config"
2)remove unnecessary duplication of code
3)make "UFSD" macros more similar to function
4)fix some compiler warnings

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:03 -07:00
Evgeniy Dushistov
9695ef16ed [PATCH] ufs: wrong type cast
There are two ugly macros in ufs code:
#define UCPI_UBH ((struct ufs_buffer_head *)ucpi)
#define USPI_UBH ((struct ufs_buffer_head *)uspi)
when uspi looks like
struct {
struct ufs_buffer_head ;
}
and USPI_UBH has some sence,
ucpi looks like
struct {
struct not_ufs_buffer_head;
}

To prevent bugs in future, this patch convert macros to inline function and
fix "ucpi" structure.

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:02 -07:00
Evgeniy Dushistov
b71034e5e6 [PATCH] ufs: directory and page cache: from blocks to pages
Change function in fs/ufs/dir.c and fs/ufs/namei.c to work with pages
instead of straight work with blocks.  It fixed such bugs:

* for i in `seq 1 1000`; do touch $i; done - crash system
* mkdir create directory without "." and ".." entries

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:02 -07:00
Evgeniy Dushistov
6ef4d6bf86 [PATCH] ufs: change block number on the fly
First of all some necessary notes about UFS by it self: To avoid waste of disk
space the tail of file consists not from blocks (which is ordinary big enough,
16K usually), it consists from fragments(which is ordinary 2K).  When file is
growing its tail occupy 1 fragment, 2 fragments...  At some stage decision to
allocate whole block is made and all fragments are moved to one block.

How this situation was handled before:

  ufs_prepare_write
  ->block_prepare_write
    ->ufs_getfrag_block
      ->...
        ->ufs_new_fragments:

	bh = sb_bread
	bh->b_blocknr = result + i;
	mark_buffer_dirty (bh);

This is wrong solution, because:

- it didn't take into consideration that there is another cache: "inode page
  cache"

- because of sb_getblk uses not b_blocknr, (it uses page->index) to find
  certain block, this breaks sb_getblk.

How this situation is handled now: we go though all "page inode cache", if
there are no such page in cache we load it into cache, and change b_blocknr.

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:01 -07:00
Michael Hanselmann
5474c120aa [PATCH] Rewritten backlight infrastructure for portable Apple computers
This patch contains a total rewrite of the backlight infrastructure for
portable Apple computers.  Backward compatibility is retained.  A sysfs
interface allows userland to control the brightness with more steps than
before.  Userland is allowed to upload a brightness curve for different
monitors, similar to Mac OS X.

[akpm@osdl.org: add needed exports]
Signed-off-by: Michael Hanselmann <linux-kernel@hansmi.ch>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:00:59 -07:00
Christoph Lameter
7b2259b3e5 [PATCH] page migration: Support a vma migration function
Hooks for calling vma specific migration functions

With this patch a vma may define a vma->vm_ops->migrate function.  That
function may perform page migration on its own (some vmas may not contain page
structs and therefore cannot be handled by regular page migration.  Pages in a
vma may require special preparatory treatment before migration is possible
etc) .  Only mmap_sem is held when the migration function is called.  The
migrate() function gets passed two sets of nodemasks describing the source and
the target of the migration.  The flags parameter either contains

MPOL_MF_MOVE	which means that only pages used exclusively by
		the specified mm should be moved

or

MPOL_MF_MOVE_ALL which means that pages shared with other processes
		should also be moved.

The migration function returns 0 on success or an error condition.  An error
condition will prevent regular page migration from occurring.

On its own this patch cannot be included since there are no users for this
functionality.  But it seems that the uncached allocator will need this
functionality at some point.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:00:55 -07:00
Christoph Lameter
68402ddc67 [PATCH] mm: remove VM_LOCKED before remap_pfn_range and drop VM_SHM
Remove VM_LOCKED before remap_pfn range from device drivers and get rid of
VM_SHM.

remap_pfn_range() already sets VM_IO.  There is no need to set VM_SHM since
it does nothing.  VM_LOCKED is of no use since the remap_pfn_range does not
place pages on the LRU.  The pages are therefore never subject to swap
anyways.  Remove all the vm_flags settings before calling remap_pfn_range.

After removing all the vm_flag settings no use of VM_SHM is left.  Drop it.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Acked-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:00:55 -07:00
Andrew Morton
fb1bb34d45 [PATCH] remove for_each_cpu()
Convert a few stragglers over to for_each_possible_cpu(), remove
for_each_cpu().

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:00:54 -07:00
Trond Myklebust
76a9f26c9e Merge branch 'master' of /home/trondmy/kernel/linux-2.6/ 2006-06-25 06:44:44 -04:00
Andrew Morton
d75d54147d git-nfs-build-fixes
Fix various problems with nfs4 disabled.  And various other things.

In file included from fs/nfs/inode.c:50:
fs/nfs/internal.h:24: error: static declaration of 'nfs_do_refmount' follows non-static declaration
include/linux/nfs_fs.h:320: error: previous declaration of 'nfs_do_refmount' was here
fs/nfs/internal.h:65: warning: 'struct nfs4_fs_locations' declared inside parameter list
fs/nfs/internal.h:65: warning: its scope is only this definition or declaration, which is probably not what you want
fs/nfs/internal.h: In function 'nfs4_path':
fs/nfs/internal.h:97: error: 'struct nfs_server' has no member named 'mnt_path'
fs/nfs/inode.c: In function 'init_once':
fs/nfs/inode.c:1116: error: 'struct nfs_inode' has no member named 'open_states'
fs/nfs/inode.c:1116: error: 'struct nfs_inode' has no member named 'delegation'
fs/nfs/inode.c:1116: error: 'struct nfs_inode' has no member named 'delegation_state'
fs/nfs/inode.c:1116: error: 'struct nfs_inode' has no member named 'rwsem'
distcc[26452] ERROR: compile fs/nfs/inode.c on g5/64 failed
make[1]: *** [fs/nfs/inode.o] Error 1
make: *** [fs/nfs/inode.o] Error 2
make: *** Waiting for unfinished jobs....
In file included from fs/nfs/nfs3xdr.c:26:
fs/nfs/internal.h:24: error: static declaration of 'nfs_do_refmount' follows non-static declaration
include/linux/nfs_fs.h:320: error: previous declaration of 'nfs_do_refmount' was here
fs/nfs/internal.h:65: warning: 'struct nfs4_fs_locations' declared inside parameter list
fs/nfs/internal.h:65: warning: its scope is only this definition or declaration, which is probably not what you want
fs/nfs/internal.h: In function 'nfs4_path':
fs/nfs/internal.h:97: error: 'struct nfs_server' has no member named 'mnt_path'
distcc[26486] ERROR: compile fs/nfs/nfs3xdr.c on g5/64 failed
make[1]: *** [fs/nfs/nfs3xdr.o] Error 1
make: *** [fs/nfs/nfs3xdr.o] Error 2
In file included from fs/nfs/nfs3proc.c:24:
fs/nfs/internal.h:24: error: static declaration of 'nfs_do_refmount' follows non-static declaration
include/linux/nfs_fs.h:320: error: previous declaration of 'nfs_do_refmount' was here
fs/nfs/internal.h:65: warning: 'struct nfs4_fs_locations' declared inside parameter list
fs/nfs/internal.h:65: warning: its scope is only this definition or declaration, which is probably not what you want
fs/nfs/internal.h: In function 'nfs4_path':
fs/nfs/internal.h:97: error: 'struct nfs_server' has no member named 'mnt_path'
distcc[26469] ERROR: compile fs/nfs/nfs3proc.c on bix/32 failed
make[1]: *** [fs/nfs/nfs3proc.o] Error 1
make: *** [fs/nfs/nfs3proc.o] Error 2
**FAILED**

Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andreas Gruenbacher <agruen@suse.de>
Cc: Andy Adamson <andros@citi.umich.edu>
Cc: Chuck Lever <cel@netapp.com>
Cc: David Howells <dhowells@redhat.com>
Cc: J. Bruce Fields <bfields@fieldses.org>
Cc: Manoj Naik <manoj@almaden.ibm.com>
Cc: Marc Eshel <eshel@almaden.ibm.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-25 06:38:11 -04:00
Trond Myklebust
ccf01ef7aa Merge branch 'odirect' 2006-06-25 06:27:31 -04:00
Hans Verkuil
0ccac4af1a V4L/DVB (4203): Explicitly set the enum values.
It's better to use explicit enums. It reduces the chance of someone
inserting new enums in the middle which would break things.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-06-25 02:05:24 -03:00
Hans Verkuil
f81cf7533b V4L/DVB (4198): Avoid newer usages of obsoleted experimental MPEGCOMP API
Put old MPEGCOMP API under #if __KERNEL__ and issue warnings when used.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-06-25 02:05:23 -03:00
Hans Verkuil
4f34171212 V4L/DVB (4188): Add new MPEG control/ioctl definitions to videodev2.h
The old, experimental, VIDIOC_S/G_CODEC API to pass MPEG parameters is now
obsolete and is replaced by 'extended controls' which offer more flexibility
and are hopefully more future proof.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-06-25 02:05:21 -03:00
Mauro Carvalho Chehab
89a58c83f8 V4L/DVB (4108): Fixes some userspace dependencies at V4L2 public api header
Make life easier for distro guys, by removing the need of including
at the userspace header.
Also, linux/compiler.h is not needed at userspace.

Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-06-25 02:05:09 -03:00
Eric Sesterhenn
845f16abad V4L/DVB (4070): Zoran strncpy() fix
The zoran driver uses strncpy() in an unsafe way.  This patch uses the proper
sizeof()-1 size parameter.  Since all strncpy() targets are initialised with
memset() the trailing '\0' is already set.  Where std->name was the target for
the strncpy() we overwrote 8 Bytes of the std structure with zeros.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-06-25 02:05:05 -03:00
Mauro Carvalho Chehab
5e87efa3b2 V4L/DVB (4068): Removed all references to kernel stuff from videodev.h and videodev2.h
The videodev.h and videodev2.h describe the public API for V4L and V4L2.
It shouldn't have there any kernel-specific stuff. Those were moved to
v4l2-dev.h.
This patch removes some uneeded headers and include v4l2-common.h on all
V4L driver. This header includes device implementation of V4L2 API provided
on v4l2-dev.h as well as V4L2 internal ioctls that provides connections
between master driver and its i2c devices.

Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-06-25 02:05:05 -03:00
Mauro Carvalho Chehab
401998fa96 V4L/DVB (4065): Several improvements at videodev.c
Videodev now is capable of better handling V4L2 api, by
processing V4L2 ioctls and using callbacks to the driver.
The drivers should be migrated to the newer way and the older
one will be obsoleted soon.

Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-06-25 02:05:04 -03:00
Scott Alfter
88ca8ed0b7 V4L/DVB (4048): Add support for the Texas Instruments TLV320AIC23B audio codec
Signed-off-by: Scott Alfter <salfter@ssai.us>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-06-25 02:05:00 -03:00
David Mosberger-Tang
c003d467bd V4L/DVB (4046): Trivial videodev2.h patch
linux/videodev2.h uses types such as __u8 but it fails to include
<linux/types.h>.  Within the kernel, that's not a problem because
<linux/time.h> already includes <linux/types.h>.  However, there are
user apps that try to include videodev2.h (e.g., ekiga) and at least
on ia64, it causes compilation failures since <linux/types.h> doesn't
get included for any other reason, leaving __u8 etc. undefined.  The
attached patch fixes the problem for me.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-06-25 02:05:00 -03:00
Martin Samuelsson
fbe60daac4 V4L/DVB (3916): AverMedia 6 Eyes AVS6EYES support
Add support for the AverMedia 6 Eyes MJPEG card.
- Updated drivers/media/video/Kconfig with AVS6EYES
  options.
- Added CONFIG_VIDEO_ZORAN_AVS6EYES to
  drivers/media/video/Makefile.
- Added I2C_DRIVERID_BT866 and I2C_DRIVERID_KS0127 to
  include/linux/i2c-id.h
- Added drivers/media/video/ks0127.c, imported and modified from
  the Marvel project.
- Added drivers/media/video/ks0127.h, imported and modified from
  the Marvel project.
- Added drivers/media/video/bt866.c, ported from a 2.4 version
  by Christer Weinigel.
- Added AVS6EYES to drivers/media/video/zoran_card.c
- Added input_mux to all cards in drivers/media/video/zoran_card.c
- Added input mux module parameter to drivers/media/video/zoran_card.c
- Added AVS6EYES to card_type in drivers/media/video/zoran.h
- Added input_mux to card_info in drivers/media/video/zoran.h
- Upped BUZ_MAX_INPUT in drivers/media/video/zoran.h from 8 to 16,
  as the AVS6EYES has 10.
- Updated Documentation/video4linux/Zoran with information about AVS6EYES.

Signed-off-by: Martin Samuelsson <sam@home.se>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-06-25 01:59:28 -03:00
Andreas Oberritter
62838084b4 V4L/DVB (3727): Remove DMX_GET_EVENT and associated data structures
The ioctl DMX_GET_EVENT has never been implemented.
I guess no software is using it because of its lack of implementation.
Future software won't use it, too, because this API doesn't make much
sense the way it is: Frontend events have their own different API.
Scrambling events can't be generated in a useful way by the hardware I
know of.

Signed-off-by: Andreas Oberritter <obi@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-06-25 01:57:47 -03:00
Linus Torvalds
dfd8317d33 Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm: (25 commits)
  [ARM] 3648/1: Update struct ucontext layout for coprocessor registers
  [ARM] Add identifying number for non-rt sigframe
  [ARM] Gather common sigframe saving code into setup_sigframe()
  [ARM] Gather common sigframe restoration code into restore_sigframe()
  [ARM] Re-use sigframe within rt_sigframe
  [ARM] Merge sigcontext and sigmask members of sigframe
  [ARM] Replace extramask with a full copy of the sigmask
  [ARM] Remove rt_sigframe puc and pinfo pointers
  [ARM] 3647/1: S3C24XX: add Osiris to the list of simtec pm machines
  [ARM] 3645/1: S3C2412: irq support for external interrupts
  [ARM] 3643/1: S3C2410: Add new usb clocks
  [ARM] 3642/1: S3C24XX: Add machine SMDK2413
  [ARM] 3641/1: S3C2412: Fixup gpio register naming
  [ARM] 3640/1: S3C2412: Use S3C24XX_DCLKCON instead of S3C2410_DCLKCON
  [ARM] 3639/1: S3C2412: serial port support
  [ARM] 3638/1: S3C2412: core clocks
  [ARM] 3637/1: S3C24XX: Add mpll clock, and set as fclk parent
  [ARM] 3636/1: S3C2412: Add selection of CPU_ARM926
  [ARM] 3635/1: S3C24XX: Add S3C2412 core cpu support
  [ARM] 3633/1: S3C24XX: s3c2410 gpio bugfix - wrong pin nos
  ...
2006-06-24 17:48:14 -07:00
Linus Torvalds
eb71c87a49 Add some basic resume trace facilities
Considering that there isn't a lot of hw we can depend on during resume,
this is about as good as it gets.

This is x86-only for now, although the basic concept (and most of the
code) will certainly work on almost any platform.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-24 14:44:01 -07:00
Ben Dooks
73e55cb3b3 [ARM] 3639/1: S3C2412: serial port support
Patch from Ben Dooks

Serial port support for the on-board UART blocks
on the Samsung S3C2412 and S3C2413 UARTs.

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-06-24 21:21:32 +01:00
Chuck Lever
06cf6f2ed0 NFS: Eliminate nfs_get_user_pages()
Neil Brown observed that the kmalloc() in nfs_get_user_pages() is more
likely to fail if the I/O is large enough to require the allocation of more
than a single page to keep track of all the pinned pages in the user's
buffer.

Instead of tracking one large page array per dreq/iocb, track pages per
nfs_read/write_data, just like the cached I/O path does.  An array for
pages is already allocated for us by nfs_readdata_alloc() (and the write
and commit equivalents).

This is also required for adding support for vectored I/O to the NFS direct
I/O path.

The original reason to pin the user buffer and allocate all the NFS data
structures before trying to schedule I/O was to ensure all needed resources
are allocated on the client before starting to send requests.  This reduces
the chance that resource exhaustion on the client will cause a short read
or write.

On the other hand, for an application making very large application I/O
requests, this means that it will be nearly impossible for the application
to make forward progress on a resource-limited client.

Thus, moving the buffer pinning functionality into the I/O scheduling
loops should be good for scalability.  The next patch will do the same for
NFS data structure allocation.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-24 13:11:39 -04:00
Trond Myklebust
816724e65c Merge branch 'master' of /home/trondmy/kernel/linux-2.6/
Conflicts:

	fs/nfs/inode.c
	fs/super.c

Fix conflicts between patch 'NFS: Split fs/nfs/inode.c' and patch
'VFS: Permit filesystem to override root dentry on mount'
2006-06-24 13:07:53 -04:00
Linus Torvalds
6edad161cd Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev: (258 commits)
  [libata] conversion to new debug scheme, part 1 of $N
  [PATCH] libata: Add ata_scsi_dev_disabled
  [libata] Add host lock to struct ata_port
  [PATCH] libata: implement per-dev EH action mask eh_info->dev_action[]
  [PATCH] libata-dev: move the CDB-intr DMA blacklisting
  [PATCH] ahci: disable NCQ support on vt8251
  [libata] ahci: add JMicron PCI IDs
  [libata] sata_nv: add PCI IDs
  [libata] ahci: Add NVIDIA PCI IDs.
  [PATCH] libata: convert several bmdma-style controllers to new EH, take #3
  [PATCH] sata_via: convert to new EH, take #3
  [libata] sata_nv: s/spin_lock_irqsave/spin_lock/ in irq handler
  [PATCH] sata_nv: add hotplug support
  [PATCH] sata_nv: convert to new EH
  [PATCH] sata_nv: better irq handlers
  [PATCH] sata_nv: simplify constants
  [PATCH] sata_nv: kill struct nv_host_desc and nv_host
  [PATCH] sata_nv: kill not-working hotplug code
  [libata] Update docs to reflect current driver API
  [PATCH] libata: add host_set->next for legacy two host_sets case, take #3
  ...
2006-06-23 15:58:44 -07:00
Tony Luck
8cf60e04a1 Auto-update from upstream 2006-06-23 13:46:23 -07:00
Jens Axboe
dd67d05152 [PATCH] rbtree: support functions used by the io schedulers
They all duplicate macros to check for empty root and/or node, and
clearing a node. So put those in rbtree.h.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-06-23 17:10:39 +02:00
Jens Axboe
8f34ee75de [PATCH] Rearrange a few struct request members
This saves 8 bytes of data in 64-bit archs.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-06-23 17:10:39 +02:00
Jens Axboe
ad3caddaa1 [PATCH] Get rid of struct request request_pm_state member
The IDE power management can just use the ->end_io_data member to store
it's data.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-06-23 17:10:39 +02:00
Jens Axboe
b31dc66a54 [PATCH] Kill PF_SYNCWRITE flag
A process flag to indicate whether we are doing sync io is incredibly
ugly. It also causes performance problems when one does a lot of async
io and then proceeds to sync it. Part of the io will go out as async,
and the other part as sync. This causes a disconnect between the
previously submitted io and the synced io. For io schedulers such as CFQ,
this will cause us lost merges and suboptimal behaviour in scheduling.

Remove PF_SYNCWRITE completely from the fsync/msync paths, and let
the O_DIRECT path just directly indicate that the writes are sync
by using WRITE_SYNC instead.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-06-23 17:10:39 +02:00
Alexey Dobriyan
fda151d9fe [PATCH] blktrace_api.h: endian annotations
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-06-23 17:10:38 +02:00
Linus Torvalds
199f4c9f76 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [NET]: Require CAP_NET_ADMIN to create tuntap devices.
  [NET]: fix net-core kernel-doc
  [TCP]: Move inclusion of <linux/dmaengine.h> to correct place in <linux/tcp.h>
  [IPSEC]: Handle GSO packets
  [NET]: Added GSO toggle
  [NET]: Add software TSOv4
  [NET]: Add generic segmentation offload
  [NET]: Merge TSO/UFO fields in sk_buff
  [NET]: Prevent transmission after dev_deactivate
  [IPV6] ADDRCONF: Fix default source address selection without CONFIG_IPV6_PRIVACY
  [IPV6]: Fix source address selection.
  [NET]: Avoid allocating skb in skb_pad
2006-06-23 08:00:01 -07:00
Linus Torvalds
37224470c8 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (65 commits)
  ACPI: suppress power button event on S3 resume
  ACPI: resolve merge conflict between sem2mutex and processor_perflib.c
  ACPI: use for_each_possible_cpu() instead of for_each_cpu()
  ACPI: delete newly added debugging macros in processor_perflib.c
  ACPI: UP build fix for bugzilla-5737
  Enable P-state software coordination via _PDC
  P-state software coordination for speedstep-centrino
  P-state software coordination for acpi-cpufreq
  P-state software coordination for ACPI core
  ACPI: create acpi_thermal_resume()
  ACPI: create acpi_fan_suspend()/acpi_fan_resume()
  ACPI: pass pm_message_t from acpi_device_suspend() to root_suspend()
  ACPI: create acpi_device_suspend()/acpi_device_resume()
  ACPI: replace spin_lock_irq with mutex for ec poll mode
  ACPI: Allow a WAN module enable/disable on a Thinkpad X60.
  sem2mutex: acpi, acpi_link_lock
  ACPI: delete unused acpi_bus_drivers_lock
  sem2mutex: drivers/acpi/processor_perflib.c
  ACPI add ia64 exports to build acpi_memhotplug as a module
  ACPI: asus_acpi_init(): propagate correct return value
  ...

Manual resolve of conflicts in:

	arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.c
	arch/i386/kernel/cpu/cpufreq/speedstep-centrino.c
	include/acpi/processor.h
2006-06-23 07:52:36 -07:00
Jan Kara
78ce89c92b [PATCH] JBD: split checkpoint lists
Split the checkpoint list of the transaction into two lists.  In the first
list we keep the buffers that need to be submitted for IO.  In the second
list are kept buffers that were already submitted and we just have to wait
for the IO to complete.  This should simplify a handling of checkpoint
lists a bit and can eventually be also a performance gain.

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Cc: "Stephen C. Tweedie" <sct@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:08 -07:00
Jan Beulich
908dcecda1 [PATCH] adjust handle_IRR_event() return type
Correct the return type of handle_IRQ_event() (inconsistency noticed during
Xen development), and remove redundant declarations.  The return type
adjustment required breaking out the definition of irqreturn_t into a
separate header, in order to satisfy current include order dependencies.

Signed-off-by: Jan Beulich <jbeulich@novell.com>

Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ian Molton <spyro@f2s.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Hirokazu Takata <takata.hirokazu@renesas.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:08 -07:00
Oleg Nesterov
54e7377035 [PATCH] list: introduce list_replace() helper
list_replace() is similar to list_replace_rcu(), but unlike
list_replace_rcu() it

	could be used when list_empty(old) == 1

	doesn't use barriers

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:07 -07:00
Brent Casavant
f5befceb5c [PATCH] SGI IOC4: Detect IO card variant
There are three different IO cards which an SGI IOC4 controller may find
itself on.  One of these variants does not bring out the IDE and serial
signals, so we need to disable attaching the corresponding IOC4 subdrivers
to such cards.

Cleans up message clutter emitted during device probing.

Signed-off-by: Brent Casavant <bcasavan@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:07 -07:00
Paul E. McKenney
d83015b8f6 [PATCH] Make RCU API inaccessible to non-GPL Linux kernel modules
Remove synchronize_kernel() (deprecated 2-APR-2005 in
http://lkml.org/lkml/2005/4/3/11) and makes the RCU API inaccessible to
non-GPL Linux kernel modules (as was announced more than one year ago in
http://lkml.org/lkml/2005/4/3/8).  Tested on x86 and ppc64.

Signed-off-by: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:07 -07:00
Pekka Enberg
481fad4834 [PATCH] strstrip() API
Add a new strstrip() function to lib/string.c for removing leading and
trailing whitespace from a string.

Cc: Michael Holzheu <holzheu@de.ibm.com>
Acked-by: Ingo Oeser <ioe-lkml@rameria.de>
Acked-by: Joern Engel <joern@wohnheim.fh-wedel.de>
Cc: Corey Minyard <minyard@acm.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Michael Holzheu <HOLZHEU@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:06 -07:00
Matt Helsley
3fa2164d03 [PATCH] Process Events: License Change
Change the license on the process event structure passed between kernel and
userspace.

Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
Acked-by: Guillaume Thouvenin <guillaume.thouvenin@bull.net>
Acked-by: Nguyen Anh Quynh <aquynh@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:06 -07:00
Matt Helsley
1d31a4ea8c [PATCH] Process Events - Header Cleanup
Move connector header include to precisely where it's needed.

Remove unused time.h header file as well.  This was leftover from previous
iterations of the process events patches.

Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net>
Cc: Nguyen Anh Quynh <aquynh@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:06 -07:00
Hua Zhong
368a5fa1f2 [PATCH] remove unlikely() in might_sleep_if()
The likely() profiling tools show that __alloc_page() causes a lot of misses:

!       132    119193 __alloc_pages():mm/page_alloc.c@937

Because most __alloc_page() calls are not atomic.

Signed-off-by: Hua Zhong <hzhong@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:06 -07:00
Mingming Cao
0216bfcffe [PATCH] percpu counter data type changes to suppport more than 2**31 ext3 free blocks counter
The percpu counter data type are changed in this set of patches to support
more users like ext3 who need more than 32 bit to store the free blocks
total in the filesystem.

- Generic perpcu counters data type changes.  The size of the global counter
  and local counter were explictly specified using s64 and s32.  The global
  counter is changed from long to s64, while the local counter is changed from
  long to s32, so we could avoid doing 64 bit update in most cases.

- Users of the percpu counters are updated to make use of the new
  percpu_counter_init() routine now taking an additional parameter to allow
  users to pass the initial value of the global counter.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:06 -07:00
Eric W. Biederman
260ea10132 [PATCH] ptrace: document the locking rules
After a lot of reading the code and thinking about how it behaves I have
managed to figure out what the current ptrace locking rules are.  The
current code is in much better that it appears at first glance.  The
troublesome code paths are actually the code paths that violate the current
rules.

ptrace uses simple exclusive access as it's locking.  You can only touch
task->ptrace if the task is stopped and you are the ptracer, or if the task
is running and are the task itself.

Very simple, very easy to maintain.  It just needs to be documented so
people know not to touch ptrace from elsewhere.

Currently we do have a few pieces of code that are in violation of this
rule.  Particularly the core dump code, and ptrace_attach.  But so far the
code looks fixable.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:03 -07:00
Xose Vazquez Perez
8d27e9084b [PATCH] module.h: updated comments with a new license
"Dual MIT/GPL" is also accepted (kernel/module.c), so updated comments.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:03 -07:00
Adrian Bunk
b0904e147f [PATCH] fs/locks.c: make posix_locks_deadlock() static
We can now make posix_locks_deadlock() static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:03 -07:00
Miklos Szeredi
75e1fcc0b1 [PATCH] vfs: add lock owner argument to flush operation
Pass the POSIX lock owner ID to the flush operation.

This is useful for filesystems which don't want to store any locking state
in inode->i_flock but want to handle locking/unlocking POSIX locks
internally.  FUSE is one such filesystem but I think it possible that some
network filesystems would need this also.

Also add a flag to indicate that a POSIX locking request was generated by
close(), so filesystems using the above feature won't send an extra locking
request in this case.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:02 -07:00
Pekka Enberg
090d2b185d [PATCH] read_mapping_page for address space
Add read_mapping_page() which is used for callers that pass
mapping->a_ops->readpage as the filler for read_cache_page.  This removes
some duplication from filesystem code.

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:02 -07:00
Jeff Moyer
c330dda908 [PATCH] Add a sysfs file to determine if a kexec kernel is loaded
Create two files in /sys/kernel, kexec_loaded and kexec_crash_loaded.  Each
file contains a simple boolean value indicating whether the relevant kernel
has been loaded into memory.  The motivation for this is geared around
support.

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:02 -07:00
Roman Zippel
98317f1271 [PATCH] m68k: Remove some unused definitions in zorro.h
These definitions have long been superseded by asm-offsets.h

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:00 -07:00
Shaohua Li
ce4ab0012b [PATCH] swsusp: add architecture special saveable pages support
1. Add architecture specific pages save/restore support.  Next two patches
   will use this to save/restore 'ACPI NVS' pages.

2. Allow reserved pages 'nosave'.  This could avoid save/restore BIOS
   reserved pages.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Nigel Cunningham <nigel@suspend2.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:59 -07:00
Zhang Yanmin
1b61b910e9 [PATCH] x86: kernel irq balance doesn't work
On i386, kernel irq balance doesn't work.

1) In function do_irq_balance, after kernel finds the min_loaded cpu but
   before calling set_pending_irq to really pin the selected_irq to the
   target cpu, kernel does a cpus_and with irq_affinity[selected_irq].
   Later on, when the irq is acked, kernel would calls
   move_native_irq=>desc->handler->set_affinity to change the irq affinity.
    However, every function pointed by
   hw_interrupt_type->set_affinity(unsigned int irq, cpumask_t cpumask)
   always changes irq_affinity[irq] to cpumask.  Next time when recalling
   do_irq_balance, it has to do cpu_ands again with
   irq_affinity[selected_irq], but irq_affinity[selected_irq] already
   becomes one cpu selected by the first irq balance.

2) Function balance_irq in file arch/i386/kernel/io_apic.c has the same
   issue.

[akpm@osdl.org: cleanups]
Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:57 -07:00
Hiro Yoshioka
c22ce143d1 [PATCH] x86: cache pollution aware __copy_from_user_ll()
Use the x86 cache-bypassing copy instructions for copy_from_user().

Some performance data are

Total of GLOBAL_POWER_EVENTS (CPU cycle samples)

2.6.12.4.orig    1921587
2.6.12.4.nt      1599424
1599424/1921587=83.23% (16.77% reduction)

BSQ_CACHE_REFERENCE (L3 cache miss)
2.6.12.4.orig      57427
2.6.12.4.nt        20858
20858/57427=36.32% (63.7% reduction)

L3 cache miss reduction of __copy_from_user_ll
samples  %
37408    65.1412  vmlinux                  __copy_from_user_ll
23        0.1103  vmlinux                  __copy_user_zeroing_intel_nocache
23/37408=0.061% (99.94% reduction)

Top 5 of 2.6.12.4.nt
Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 100000
samples  %        app name                 symbol name
128392    8.0274  vmlinux                  __copy_user_zeroing_intel_nocache
64206     4.0143  vmlinux                  journal_add_journal_head
59746     3.7355  vmlinux                  do_get_write_access
47674     2.9807  vmlinux                  journal_put_journal_head
46021     2.8774  vmlinux                  journal_dirty_metadata
pattern9-0-cpu4-0-09011728/summary.out

Counted BSQ_CACHE_REFERENCE events (cache references seen by the bus unit) with a unit mask of 0x3f (multiple flags) count 3000
samples  %        app name                 symbol name
69755     4.2861  vmlinux                  __copy_user_zeroing_intel_nocache
55685     3.4215  vmlinux                  journal_add_journal_head
52371     3.2179  vmlinux                  __find_get_block
45504     2.7960  vmlinux                  journal_put_journal_head
36005     2.2123  vmlinux                  journal_stop
pattern9-0-cpu4-0-09011744/summary.out

Counted BSQ_CACHE_REFERENCE events (cache references seen by the bus unit) with a unit mask of 0x200 (read 3rd level cache miss) count 3000
samples  %        app name                 symbol name
1147      5.4994  vmlinux                  journal_add_journal_head
881       4.2240  vmlinux                  journal_dirty_data
872       4.1809  vmlinux                  blk_rq_map_sg
734       3.5192  vmlinux                  journal_commit_transaction
617       2.9582  vmlinux                  radix_tree_delete
pattern9-0-cpu4-0-09011731/summary.out

iozone results are

original 2.6.12.4 CPU time = 207.768 sec
cache aware       CPU time = 184.783 sec
(three times run)
184.783/207.768=88.94% (11.06% reduction)

original:
pattern9-0-cpu4-0-08191720/iozone.out:  CPU Utilization: Wall time   45.997    CPU time   64.527    CPU utilization 140.28 %
pattern9-0-cpu4-0-08191741/iozone.out:  CPU Utilization: Wall time   46.878    CPU time   71.933    CPU utilization 153.45 %
pattern9-0-cpu4-0-08191743/iozone.out:  CPU Utilization: Wall time   45.152    CPU time   71.308    CPU utilization 157.93 %

cache awre:
pattern9-0-cpu4-0-09011728/iozone.out:  CPU Utilization: Wall time   44.842    CPU time   62.465    CPU utilization 139.30 %
pattern9-0-cpu4-0-09011731/iozone.out:  CPU Utilization: Wall time   44.718    CPU time   59.273    CPU utilization 132.55 %
pattern9-0-cpu4-0-09011744/iozone.out:  CPU Utilization: Wall time   44.367    CPU time   63.045    CPU utilization 142.10 %

Signed-off-by: Hiro Yoshioka <hyoshiok@miraclelinux.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:56 -07:00
David Quigley
35601547ba [PATCH] SELinux: add task_movememory hook
This patch adds new security hook, task_movememory, to be called when memory
owened by a task is to be moved (e.g.  when migrating pages to a this hook is
identical to the setscheduler implementation, but a separate hook introduced
to allow this check to be specialized in the future if necessary.

Since the last posting, the hook has been renamed following feedback from
Christoph Lameter.

Signed-off-by: David Quigley <dpquigl@tycho.nsa.gov>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@muc.de>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:54 -07:00
James Morris
03e6806063 [PATCH] lsm: add task_setioprio hook
Implement an LSM hook for setting a task's IO priority, similar to the hook
for setting a tasks's nice value.

A previous version of this LSM hook was included in an older version of
multiadm by Jan Engelhardt, although I don't recall it being submitted
upstream.

Also included is the corresponding SELinux hook, which re-uses the setsched
permission in the proccess class.

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:53 -07:00
Christoph Lameter
9216dfad4f [PATCH] move_pages: fix 32 -> 64 bit compat function
The definition of the third parameter is a pointer to an array of virtual
addresses which give us some trouble.  The existing code calculated the
wrong address in the array since I used void to avoid having to specify a
type.

I now use the correct type "compat_uptr_t __user *" in the definition of
the function in kernel/compat.c.

However, I used __u32 in syscalls.h.  Would have to include compat.h there
in order to provide the same definition which would generate an ugly
include situation.

On both ia64 and x86_64 compat_uptr_t is u32. So this works although
parameter declarations differ.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:53 -07:00
Christoph Lameter
1b2db9fb7a [PATCH] sys_move_pages: 32bit support (i386, x86_64)
sys_move_pages() support for 32bit (i386 plus x86_64 compat layer)

Add support for move_pages() on i386 and also add the compat functions
necessary to run 32 bit binaries on x86_64.

Add compat_sys_move_pages to the x86_64 32bit binary layer.  Note that it is
not up to date so I added the missing pieces.  Not sure if this is done the
right way.

[akpm@osdl.org: compile fix]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:53 -07:00
Christoph Lameter
742755a1d8 [PATCH] page migration: sys_move_pages(): support moving of individual pages
move_pages() is used to move individual pages of a process. The function can
be used to determine the location of pages and to move them onto the desired
node. move_pages() returns status information for each page.

long move_pages(pid, number_of_pages_to_move,
		addresses_of_pages[],
		nodes[] or NULL,
		status[],
		flags);

The addresses of pages is an array of void * pointing to the
pages to be moved.

The nodes array contains the node numbers that the pages should be moved
to. If a NULL is passed instead of an array then no pages are moved but
the status array is updated. The status request may be used to determine
the page state before issuing another move_pages() to move pages.

The status array will contain the state of all individual page migration
attempts when the function terminates. The status array is only valid if
move_pages() completed successfullly.

Possible page states in status[]:

0..MAX_NUMNODES	The page is now on the indicated node.

-ENOENT		Page is not present

-EACCES		Page is mapped by multiple processes and can only
		be moved if MPOL_MF_MOVE_ALL is specified.

-EPERM		The page has been mlocked by a process/driver and
		cannot be moved.

-EBUSY		Page is busy and cannot be moved. Try again later.

-EFAULT		Invalid address (no VMA or zero page).

-ENOMEM		Unable to allocate memory on target node.

-EIO		Unable to write back page. The page must be written
		back in order to move it since the page is dirty and the
		filesystem does not provide a migration function that
		would allow the moving of dirty pages.

-EINVAL		A dirty page cannot be moved. The filesystem does not provide
		a migration function and has no ability to write back pages.

The flags parameter indicates what types of pages to move:

MPOL_MF_MOVE	Move pages that are only mapped by the process.

MPOL_MF_MOVE_ALL Also move pages that are mapped by multiple processes.
		Requires sufficient capabilities.

Possible return codes from move_pages()

-ENOENT		No pages found that would require moving. All pages
		are either already on the target node, not present, had an
		invalid address or could not be moved because they were
		mapped by multiple processes.

-EINVAL		Flags other than MPOL_MF_MOVE(_ALL) specified or an attempt
		to migrate pages in a kernel thread.

-EPERM		MPOL_MF_MOVE_ALL specified without sufficient priviledges.
		or an attempt to move a process belonging to another user.

-EACCES		One of the target nodes is not allowed by the current cpuset.

-ENODEV		One of the target nodes is not online.

-ESRCH		Process does not exist.

-E2BIG		Too many pages to move.

-ENOMEM		Not enough memory to allocate control array.

-EFAULT		Parameters could not be accessed.

A test program for move_pages() may be found with the patches
on ftp.kernel.org:/pub/linux/kernel/people/christoph/pmig/patches-2.6.17-rc4-mm3

From: Christoph Lameter <clameter@sgi.com>

  Detailed results for sys_move_pages()

  Pass a pointer to an integer to get_new_page() that may be used to
  indicate where the completion status of a migration operation should be
  placed.  This allows sys_move_pags() to report back exactly what happened to
  each page.

  Wish there would be a better way to do this. Looks a bit hacky.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Jes Sorensen <jes@trained-monkey.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:53 -07:00
Christoph Lameter
95a402c384 [PATCH] page migration: use allocator function for migrate_pages()
Instead of passing a list of new pages, pass a function to allocate a new
page.  This allows the correct placement of MPOL_INTERLEAVE pages during page
migration.  It also further simplifies the callers of migrate pages.
migrate_pages() becomes similar to migrate_pages_to() so drop
migrate_pages_to().  The batching of new page allocations becomes unnecessary.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Jes Sorensen <jes@trained-monkey.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:53 -07:00
Christoph Lameter
aaa994b300 [PATCH] page migration: handle freeing of pages in migrate_pages()
Do not leave pages on the lists passed to migrate_pages().  Seems that we will
not need any postprocessing of pages.  This will simplify the handling of
pages by the callers of migrate_pages().

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Jes Sorensen <jes@trained-monkey.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:53 -07:00
Paul Drynoff
800590f523 [PATCH] slab: kmalloc, kzalloc comments cleanup and fix
- Move comments for kmalloc to right place, currently it near __do_kmalloc

- Comments for kzalloc

- More detailed comments for kmalloc

- Appearance of "kmalloc" and "kzalloc" man pages after "make mandocs"

[rdunlap@xenotime.net: simplification]
Signed-off-by: Paul Drynoff <pauldrynoff@gmail.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:52 -07:00
Andrew Morton
bd1e22b8e0 [PATCH] initialise total_memory() earlier
Initialise total_memory earlier in boot.  Because if for some reason we run
page reclaim early in boot, we don't want total_memory to be zero when we use
it as a divisor.

And rename total_memory to vm_total_pages to avoid naming clashes with
architectures.

Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Martin Bligh <mbligh@google.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:52 -07:00
David Howells
9637a5efd4 [PATCH] add page_mkwrite() vm_operations method
Add a new VMA operation to notify a filesystem or other driver about the
MMU generating a fault because userspace attempted to write to a page
mapped through a read-only PTE.

This facility permits the filesystem or driver to:

 (*) Implement storage allocation/reservation on attempted write, and so to
     deal with problems such as ENOSPC more gracefully (perhaps by generating
     SIGBUS).

 (*) Delay making the page writable until the contents have been written to a
     backing cache. This is useful for NFS/AFS when using FS-Cache/CacheFS.
     It permits the filesystem to have some guarantee about the state of the
     cache.

 (*) Account and limit number of dirty pages. This is one piece of the puzzle
     needed to make shared writable mapping work safely in FUSE.

Needed by cachefs (Or is it cachefiles?  Or fscache? <head spins>).

At least four other groups have stated an interest in it or a desire to use
the functionality it provides: FUSE, OCFS2, NTFS and JFFS2.  Also, things like
EXT3 really ought to use it to deal with the case of shared-writable mmap
encountering ENOSPC before we permit the page to be dirtied.

From: Peter Zijlstra <a.p.zijlstra@chello.nl>

  get_user_pages(.write=1, .force=1) can generate COW hits on read-only
  shared mappings, this patch traps those as mkpage_write candidates and fails
  to handle them the old way.

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Joel Becker <Joel.Becker@oracle.com>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Cc: Anton Altaparmakov <aia21@cantab.net>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:51 -07:00
Con Kolivas
bd96b9eb7c [PATCH] mm: fix swap unused warning
If CONFIG_SWAP is not defined we get:

mm/vmscan.c: In function ‘remove_mapping’:
mm/vmscan.c:387: warning: unused variable ‘swap’

Convert defines in swap.h into blank inline functions to fix this warning
and be consistent.

Signed-off-by: Con Kolivas <kernel@kolivas.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:51 -07:00
Andy Whitcroft
30c253e6da [PATCH] sparsemem: record nid during memory present
Record the node id as we mark sections for instantiation.  Use this nid
during instantiation to direct allocations.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Cc: Mike Kravetz <kravetz@us.ibm.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Bob Picco <bob.picco@hp.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Martin Bligh <mbligh@google.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:51 -07:00
Christoph Lameter
04e62a29bf [PATCH] More page migration: use migration entries for file pages
This implements the use of migration entries to preserve ptes of file backed
pages during migration.  Processes can therefore be migrated back and forth
without loosing their connection to pagecache pages.

Note that we implement the migration entries only for linear mappings.
Nonlinear mappings still require the unmapping of the ptes for migration.

And another writepage() ugliness shows up.  writepage() can drop the page
lock.  Therefore we have to remove migration ptes before calling writepages()
in order to avoid having migration entries point to unlocked pages.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:51 -07:00
Christoph Lameter
d75a0fcda2 [PATCH] Swapless page migration: rip out swap based logic
Rip the page migration logic out.

Remove all code that has to do with swapping during page migration.

This also guts the ability to migrate pages to swap.  No one used that so lets
let it go for good.

Page migration should be a bit broken after this patch.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:50 -07:00
Christoph Lameter
0697212a41 [PATCH] Swapless page migration: add R/W migration entries
Implement read/write migration ptes

We take the upper two swapfiles for the two types of migration ptes and define
a series of macros in swapops.h.

The VM is modified to handle the migration entries.  migration entries can
only be encountered when the page they are pointing to is locked.  This limits
the number of places one has to fix.  We also check in copy_pte_range and in
mprotect_pte_range() for migration ptes.

We check for migration ptes in do_swap_cache and call a function that will
then wait on the page lock.  This allows us to effectively stop all accesses
to apge.

Migration entries are created by try_to_unmap if called for migration and
removed by local functions in migrate.c

From: Hugh Dickins <hugh@veritas.com>

  Several times while testing swapless page migration (I've no NUMA, just
  hacking it up to migrate recklessly while running load), I've hit the
  BUG_ON(!PageLocked(p)) in migration_entry_to_page.

  This comes from an orphaned migration entry, unrelated to the current
  correctly locked migration, but hit by remove_anon_migration_ptes as it
  checks an address in each vma of the anon_vma list.

  Such an orphan may be left behind if an earlier migration raced with fork:
  copy_one_pte can duplicate a migration entry from parent to child, after
  remove_anon_migration_ptes has checked the child vma, but before it has
  removed it from the parent vma.  (If the process were later to fault on this
  orphaned entry, it would hit the same BUG from migration_entry_wait.)

  This could be fixed by locking anon_vma in copy_one_pte, but we'd rather
  not.  There's no such problem with file pages, because vma_prio_tree_add
  adds child vma after parent vma, and the page table locking at each end is
  enough to serialize.  Follow that example with anon_vma: add new vmas to the
  tail instead of the head.

  (There's no corresponding problem when inserting migration entries,
  because a missed pte will leave the page count and mapcount high, which is
  allowed for.  And there's no corresponding problem when migrating via swap,
  because a leftover swap entry will be correctly faulted.  But the swapless
  method has no refcounting of its entries.)

From: Ingo Molnar <mingo@elte.hu>

  pte_unmap_unlock() takes the pte pointer as an argument.

From: Hugh Dickins <hugh@veritas.com>

  Several times while testing swapless page migration, gcc has tried to exec
  a pointer instead of a string: smells like COW mappings are not being
  properly write-protected on fork.

  The protection in copy_one_pte looks very convincing, until at last you
  realize that the second arg to make_migration_entry is a boolean "write",
  and SWP_MIGRATION_READ is 30.

  Anyway, it's better done like in change_pte_range, using
  is_write_migration_entry and make_migration_entry_read.

From: Hugh Dickins <hugh@veritas.com>

  Remove unnecessary obfuscation from sys_swapon's range check on swap type,
  which blew up causing memory corruption once swapless migration made
  MAX_SWAPFILES no longer 2 ^ MAX_SWAPFILES_SHIFT.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Christoph Lameter <clameter@engr.sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
From: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:50 -07:00
Christoph Lameter
2d1db3b117 [PATCH] page migration cleanup: pass "mapping" to migration functions
Change handling of address spaces.

Pass a pointer to the address space in which the page is migrated to all
migration function.  This avoids repeatedly having to retrieve the address
space pointer from the page and checking it for validity.  The old page
mapping will change once migration has gone to a certain step, so it is less
confusing to have the pointer always available.

Move the setting of the mapping and index for the new page into
migrate_pages().

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:50 -07:00
Christoph Lameter
e7340f7330 [PATCH] page migration cleanup: remove useless definitions
Remove the export for migrate_page_remove_references() and migrate_page_copy()
that are unlikely to be used directly by filesystems implementing migration.
The export was useful when buffer_migrate_page() lived in fs/buffer.c but it
has now been moved to migrate.c in the migration reorg.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:50 -07:00
OGAWA Hirofumi
111ebb6e6f [PATCH] writeback: fix range handling
When a writeback_control's `start' and `end' fields are used to
indicate a one-byte-range starting at file offset zero, the required
values of .start=0,.end=0 mean that the ->writepages() implementation
has no way of telling that it is being asked to perform a range
request.  Because we're currently overloading (start == 0 && end == 0)
to mean "this is not a write-a-range request".

To make all this sane, the patch changes range of writeback_control.

So caller does: If it is calling ->writepages() to write pages, it
sets range (range_start/end or range_cyclic) always.

And if range_cyclic is true, ->writepages() thinks the range is
cyclic, otherwise it just uses range_start and range_end.

This patch does,

    - Add LLONG_MAX, LLONG_MIN, ULLONG_MAX to include/linux/kernel.h
      -1 is usually ok for range_end (type is long long). But, if someone did,

		range_end += val;		range_end is "val - 1"
		u64val = range_end >> bits;	u64val is "~(0ULL)"

      or something, they are wrong. So, this adds LLONG_MAX to avoid nasty
      things, and uses LLONG_MAX for range_end.

    - All callers of ->writepages() sets range_start/end or range_cyclic.

    - Fix updates of ->writeback_index. It seems already bit strange.
      If it starts at 0 and ended by check of nr_to_write, this last
      index may reduce chance to scan end of file.  So, this updates
      ->writeback_index only if range_cyclic is true or whole-file is
      scanned.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Nathan Scott <nathans@sgi.com>
Cc: Anton Altaparmakov <aia21@cantab.net>
Cc: Steven French <sfrench@us.ibm.com>
Cc: "Vladimir V. Saveliev" <vs@namesys.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:49 -07:00
Nick Piggin
612d6c19db [PATCH] radix-tree: direct data
The ability to have height 0 radix trees (a direct pointer to the data item
rather than going through a full node->slot) quietly disappeared with
old-2.6-bkcvs commit ffee171812d51652f9ba284302d9e5c5cc14bdfd.  On 64-bit
machines this causes nearly 600 bytes to be used for every <= 4K file in
pagecache.

Re-introduce this feature, root tags stored in spare ->gfp_mask bits.

Simplify radix_tree_delete's complex tag clearing arrangement (which would
become even more complex) by just falling back to tag clearing functions
(the pagecache radix-tree never uses this path anyway, so the icache
savings will mean it's actually a speedup).

On my 4GB G5, this saves 8MB RAM per kernel kernel source+object tree in
pagecache.

Pagecache lookup, insertion, and removal speed for small files will also be
improved.

This makes RCU radix tree harder, but it's worth it.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:49 -07:00
Dean Nelson
929f97276b [PATCH] change gen_pool allocator to not touch managed memory
Modify the gen_pool allocator (lib/genalloc.c) to utilize a bitmap scheme
instead of the buddy scheme.  The purpose of this change is to eliminate
the touching of the actual memory being allocated.

Since the change modifies the interface, a change to the uncached allocator
(arch/ia64/kernel/uncached.c) is also required.

Both Andrey Volkov and Jes Sorenson have expressed a desire that the
gen_pool allocator not write to the memory being managed. See the
following:

  http://marc.theaimsgroup.com/?l=linux-kernel&m=113518602713125&w=2
  http://marc.theaimsgroup.com/?l=linux-kernel&m=113533568827916&w=2

Signed-off-by: Dean Nelson <dcn@sgi.com>
Cc: Andrey Volkov <avolkov@varma-el.com>
Acked-by: Jes Sorensen <jes@trained-monkey.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:49 -07:00
Nick Piggin
833423143c [PATCH] mm: introduce remap_vmalloc_range()
Add remap_vmalloc_range, vmalloc_user, and vmalloc_32_user so that drivers
can have a nice interface for remapping vmalloc memory.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:49 -07:00
Yasunori Goto
762834e8bf [PATCH] Unify pxm_to_node() and node_to_pxm()
Consolidate the various arch-specific implementations of pxm_to_node() and
node_to_pxm() into a single generic version.

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:48 -07:00
Chen, Kenneth W
a43a8c39bb [PATCH] tightening hugetlb strict accounting
Current hugetlb strict accounting for shared mapping always assume mapping
starts at zero file offset and reserves pages between zero and size of the
file.  This assumption often reserves (or lock down) a lot more pages then
necessary if application maps at none zero file offset.  libhugetlbfs is
one example that requires proper reservation on shared mapping starts at
none zero offset.

This patch extends the reservation and hugetlb strict accounting to support
any arbitrary pair of (offset, len), resulting a much more robust and
accurate scheme.  More importantly, it won't lock down any hugetlb pages
outside file mapping.

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Acked-by: Adam Litke <agl@us.ibm.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:48 -07:00
Andreas Dilger
e8f03d0208 [PATCH] reserve space for swap label
Reserve space in the swap disk header for a LABEL and UUID to be specified.
 This has been possible with util-linux-2.12b (via e2fsprogs 1.36
libblkid), and is used by at least FC3 and later.  The kernel doesn't
really care about this, but the space shouldn't accidentally be used by
something else either.

Also make the on-disk structures be fixed-size types, instead of "int",
though I don't know of any architecture in use where an "int" isn't the
same size as a "__u32" (all current kernel arches have it as "unsigned
int").

Signed-off-by: Andreas Dilger <adilger@shaw.ca>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:47 -07:00
KAMEZAWA Hiroyuki
fadd8fbd15 [PATCH] support for panic at OOM
This patch adds panic_on_oom sysctl under sys.vm.

When sysctl vm.panic_on_oom = 1, the kernel panics intead of killing rogue
processes.  And if vm.panic_on_oom is 0 the kernel will do oom_kill() in
the same way as it does today.  Of course, the default value is 0 and only
root can modifies it.

In general, oom_killer works well and kill rogue processes.  So the whole
system can survive.  But there are environments where panic is preferable
rather than kill some processes.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:47 -07:00
Yasunori Goto
718127cc31 [PATCH] wait_table and zonelist initializing for memory hotadd: add return code for init_current_empty_zone
When add_zone() is called against empty zone (not populated zone), we have to
initialize the zone which didn't initialize at boot time.  But,
init_currently_empty_zone() may fail due to allocation of wait table.  So,
this patch is to catch its error code.

Changes against wait_table is in the next patch.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:46 -07:00
Yasunori Goto
86356ab147 [PATCH] wait_table and zonelist initializing for memory hotadd: change to meminit for build_zonelist
Change definitions of some functions and data from __init to __meminit.

These functions and data can be used after bootup by this patch to be used for
hot-add codes.

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:46 -07:00
Yasunori Goto
02b694dea4 [PATCH] wait_table and zonelist initializing for memory hotadd: change name of wait_table_size()
This is just to rename from wait_table_size() to wait_table_hash_nr_entries().

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:46 -07:00
Andrew Morton
f886ed443f [PATCH] PG_uncached is ia64 only
As Nick points out, only ia64 uses PG_uncached.  So we can push it up into the
higher bits of the lower half of page->flags and make room for another flag on
32-bit machines.

Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Jesse Barnes <jbarnes@sgi.com>
Cc: Jes Sorensen <jes@trained-monkey.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:46 -07:00
Andy Whitcroft
cb2b95e1c6 [PATCH] zone handle unaligned zone boundaries
The buddy allocator has a requirement that boundaries between contigious
zones occur aligned with the the MAX_ORDER ranges.  Where they do not we
will incorrectly merge pages cross zone boundaries.  This can lead to pages
from the wrong zone being handed out.

Originally the buddy allocator would check that buddies were in the same
zone by referencing the zone start and end page frame numbers.  This was
removed as it became very expensive and the buddy allocator already made
the assumption that zones boundaries were aligned.

It is clear that not all configurations and architectures are honouring
this alignment requirement.  Therefore it seems safest to reintroduce
support for non-aligned zone boundaries.  This patch introduces a new check
when considering a page a buddy it compares the zone_table index for the
two pages and refuses to merge the pages where they do not match.  The
zone_table index is unique for each node/zone combination when
FLATMEM/DISCONTIGMEM is enabled and for each section/zone combination when
SPARSEMEM is enabled (a SPARSEMEM section is at least a MAX_ORDER size).

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:45 -07:00
David Howells
726c334223 [PATCH] VFS: Permit filesystem to perform statfs with a known root dentry
Give the statfs superblock operation a dentry pointer rather than a superblock
pointer.

This complements the get_sb() patch.  That reduced the significance of
sb->s_root, allowing NFS to place a fake root there.  However, NFS does
require a dentry to use as a target for the statfs operation.  This permits
the root in the vfsmount to be used instead.

linux/mount.h has been added where necessary to make allyesconfig build
successfully.

Interest has also been expressed for use with the FUSE and XFS filesystems.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Nathan Scott <nathans@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:45 -07:00
David Howells
454e2398be [PATCH] VFS: Permit filesystem to override root dentry on mount
Extend the get_sb() filesystem operation to take an extra argument that
permits the VFS to pass in the target vfsmount that defines the mountpoint.

The filesystem is then required to manually set the superblock and root dentry
pointers.  For most filesystems, this should be done with simple_set_mnt()
which will set the superblock pointer and then set the root dentry to the
superblock's s_root (as per the old default behaviour).

The get_sb() op now returns an integer as there's now no need to return the
superblock pointer.

This patch permits a superblock to be implicitly shared amongst several mount
points, such as can be done with NFS to avoid potential inode aliasing.  In
such a case, simple_set_mnt() would not be called, and instead the mnt_root
and mnt_sb would be set directly.

The patch also makes the following changes:

 (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount
     pointer argument and return an integer, so most filesystems have to change
     very little.

 (*) If one of the convenience function is not used, then get_sb() should
     normally call simple_set_mnt() to instantiate the vfsmount. This will
     always return 0, and so can be tail-called from get_sb().

 (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the
     dcache upon superblock destruction rather than shrink_dcache_anon().

     This is required because the superblock may now have multiple trees that
     aren't actually bound to s_root, but that still need to be cleaned up. The
     currently called functions assume that the whole tree is rooted at s_root,
     and that anonymous dentries are not the roots of trees which results in
     dentries being left unculled.

     However, with the way NFS superblock sharing are currently set to be
     implemented, these assumptions are violated: the root of the filesystem is
     simply a dummy dentry and inode (the real inode for '/' may well be
     inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries
     with child trees.

     [*] Anonymous until discovered from another tree.

 (*) The documentation has been adjusted, including the additional bit of
     changing ext2_* into foo_* in the documentation.

[akpm@osdl.org: convert ipath_fs, do other stuff]
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Nathan Scott <nathans@sgi.com>
Cc: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:45 -07:00
Randy Dunlap
f4b8ea7849 [NET]: fix net-core kernel-doc
Warning(/var/linsrc/linux-2617-g4//include/linux/skbuff.h:304): No description found for parameter 'dma_cookie'
Warning(/var/linsrc/linux-2617-g4//include/net/sock.h:1274): No description found for parameter 'copied_early'
Warning(/var/linsrc/linux-2617-g4//net/core/dev.c:3309): No description found for parameter 'chan'
Warning(/var/linsrc/linux-2617-g4//net/core/dev.c:3309): No description found for parameter 'event'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 02:07:42 -07:00
David Woodhouse
c8a553ad7f [TCP]: Move inclusion of <linux/dmaengine.h> to correct place in <linux/tcp.h>
The new <linux/dmaengine.h> header shouldn't be included from
the !__KERNEL__ portion of tcp.h

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 02:07:40 -07:00
Herbert Xu
37c3185a02 [NET]: Added GSO toggle
This patch adds a generic segmentation offload toggle that can be turned
on/off for each net device.  For now it only supports in TCPv4.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 02:07:36 -07:00
Herbert Xu
f4c50d990d [NET]: Add software TSOv4
This patch adds the GSO implementation for IPv4 TCP.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 02:07:33 -07:00
Herbert Xu
f6a78bfcb1 [NET]: Add generic segmentation offload
This patch adds the infrastructure for generic segmentation offload.
The idea is to tap into the potential savings of TSO without hardware
support by postponing the allocation of segmented skb's until just
before the entry point into the NIC driver.

The same structure can be used to support software IPv6 TSO, as well as
UFO and segmentation offload for other relevant protocols, e.g., DCCP.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 02:07:31 -07:00
Herbert Xu
7967168cef [NET]: Merge TSO/UFO fields in sk_buff
Having separate fields in sk_buff for TSO/UFO (tso_size/ufo_size) is not
going to scale if we add any more segmentation methods (e.g., DCCP).  So
let's merge them.

They were used to tell the protocol of a packet.  This function has been
subsumed by the new gso_type field.  This is essentially a set of netdev
feature bits (shifted by 16 bits) that are required to process a specific
skb.  As such it's easy to tell whether a given device can process a GSO
skb: you just have to and the gso_type field and the netdev's features
field.

I've made gso_type a conjunction.  The idea is that you have a base type
(e.g., SKB_GSO_TCPV4) that can be modified further to support new features.
For example, if we add a hardware TSO type that supports ECN, they would
declare NETIF_F_TSO | NETIF_F_TSO_ECN.  All TSO packets with CWR set would
have a gso_type of SKB_GSO_TCPV4 | SKB_GSO_TCPV4_ECN while all other TSO
packets would be SKB_GSO_TCPV4.  This means that only the CWR packets need
to be emulated in software.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 02:07:29 -07:00
Herbert Xu
5b057c6b1a [NET]: Avoid allocating skb in skb_pad
First of all it is unnecessary to allocate a new skb in skb_pad since
the existing one is not shared.  More importantly, our hard_start_xmit
interface does not allow a new skb to be allocated since that breaks
requeueing.

This patch uses pskb_expand_head to expand the existing skb and linearize
it if needed.  Actually, someone should sift through every instance of
skb_pad on a non-linear skb as they do not fit the reasons why this was
originally created.

Incidentally, this fixes a minor bug when the skb is cloned (tcpdump,
TCP, etc.).  As it is skb_pad will simply write over a cloned skb.  Because
of the position of the write it is unlikely to cause problems but still
it's best if we don't do it.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 02:06:41 -07:00
Linus Torvalds
065a3e17ba Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (33 commits)
  [PATCH] myri10ge - drop workaround pci_save_state() disabling MSI
  [PATCH] myri10ge - drop workaround for the missing AER ext cap on nVidia CK804
  via-velocity: the link is not correctly detected when the device starts
  [PATCH] add b44 to maintainers
  [PATCH] WAN: ioremap() failure checks in drivers
  [PATCH] WAN: register_hdlc_device() doesn't need dev_alloc_name()
  [PATCH] skb_padto()-area fixes in 8390, wavelan
  [PATCH] make drivers/net/forcedeth.c:nv_update_pause() static
  [PATCH] network driver for Hilscher netx
  [PATCH] Dereference in tokenring/olympic.c
  [PATCH] Array overrun in drivers/net/wireless/wavelan.c
  [PATCH] Remove useless check in drivers/net/pcmcia/xirc2ps_cs.c
  [PATCH] 8139cp: add ethtool eeprom support
  [PATCH] 8139cp: fix eeprom read command length
  [PATCH] b44: update b44 Kconfig entry
  [PATCH] b44: update version to 1.01
  [PATCH] b44: add wol for old nic
  [PATCH] b44: add parameter
  [PATCH] b44: add wol
  [PATCH] b44: fix manual speed/duplex/autoneg settings
  ...
2006-06-22 22:15:09 -07:00
Linus Torvalds
45c091bb2d Merge git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (139 commits)
  [POWERPC] re-enable OProfile for iSeries, using timer interrupt
  [POWERPC] support ibm,extended-*-frequency properties
  [POWERPC] Extra sanity check in EEH code
  [POWERPC] Dont look for class-code in pci children
  [POWERPC] Fix mdelay badness on shared processor partitions
  [POWERPC] disable floating point exceptions for init
  [POWERPC] Unify ppc syscall tables
  [POWERPC] mpic: add support for serial mode interrupts
  [POWERPC] pseries: Print PCI slot location code on failure
  [POWERPC] spufs: one more fix for 64k pages
  [POWERPC] spufs: fail spu_create with invalid flags
  [POWERPC] spufs: clear class2 interrupt status before wakeup
  [POWERPC] spufs: fix Makefile for "make clean"
  [POWERPC] spufs: remove stop_code from struct spu
  [POWERPC] spufs: fix spu irq affinity setting
  [POWERPC] spufs: further abstract priv1 register access
  [POWERPC] spufs: split the Cell BE support into generic and platform dependant parts
  [POWERPC] spufs: dont try to access SPE channel 1 count
  [POWERPC] spufs: use kzalloc in create_spu
  [POWERPC] spufs: fix initial state of wbox file
  ...

Manually resolved conflicts in:
	drivers/net/phy/Makefile
	include/asm-powerpc/spu.h
2006-06-22 22:11:30 -07:00
Jeff Garzik
ba6a13083c [libata] Add host lock to struct ata_port
Prepare for changes required to support SATA devices
attached to SAS HBAs. For these devices we don't want to
use host_set at all, since libata will not be the owner
of struct scsi_host.

Signed-off-by: Brian King <brking@us.ibm.com>

(with slight merge modifications made by...)
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-06-22 23:46:10 -04:00
Tejun Heo
47005f255e [PATCH] libata: implement per-dev EH action mask eh_info->dev_action[]
Currently, the only per-dev EH action is REVALIDATE.  EH used to
exploit ehi->dev to do selective revalidation on a ATA bus.  However,
this is a bit hacky and makes it impossible to request selective
revalidation from outside of EH or add another per-dev EH action.

This patch adds per-dev EH action mask eh_info->dev_action[] and
update EH to use this field for REVALIDATE.  Note that per-dev actions
can still be specified at port-level and it has the same effect of
specifying the action for all devices on the port.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-06-22 23:36:58 -04:00
Krzysztof Halasa
4a31e348e3 [PATCH] WAN: register_hdlc_device() doesn't need dev_alloc_name()
David Boggs noticed that register_hdlc_device() no longer needs
to call dev_alloc_name() as it's called by register_netdev().
register_hdlc_device() is currently equivalent to register_netdev().

hdlc_setup() is now EXPORTed as per David's request.

Signed-off-by: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-06-22 23:32:03 -04:00
Jeff Garzik
71d530cd1b Merge branch 'master' into upstream
Conflicts:

	drivers/scsi/libata-core.c
	drivers/scsi/libata-scsi.c
	include/linux/pci_ids.h
2006-06-22 22:11:56 -04:00
Linus Torvalds
d588fcbe5a Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/i2c-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/i2c-2.6: (44 commits)
  [PATCH] I2C: I2C controllers go into right place on sysfs
  [PATCH] hwmon-vid: Add support for Intel Core and Conroe
  [PATCH] lm70: New hardware monitoring driver
  [PATCH] hwmon: Fix the Kconfig header
  [PATCH] i2c-i801: Merge setup function
  [PATCH] i2c-i801: Better pci subsystem integration
  [PATCH] i2c-i801: Cleanups
  [PATCH] i2c-i801: Remove PCI function check
  [PATCH] i2c-i801: Remove force_addr parameter
  [PATCH] i2c-i801: Fix block transaction poll loops
  [PATCH] scx200_acb: Documentation update
  [PATCH] scx200_acb: Mark scx200_acb_probe __init
  [PATCH] scx200_acb: Use PCI I/O resource when appropriate
  [PATCH] i2c: Mark block write buffers as const
  [PATCH] i2c-ocores: Minor cleanups
  [PATCH] abituguru: Fix fan detection
  [PATCH] abituguru: Review fixes
  [PATCH] abituguru: New hardware monitoring driver
  [PATCH] w83792d: Add missing data access locks
  [PATCH] w83792d: Fix setting the PWM value
  ...
2006-06-22 15:08:56 -07:00
Linus Torvalds
eaa8568901 Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/w1-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/w1-2.6:
  [PATCH] w1: warning fix
  [PATCH] w1: clean up W1_CON dependency.
  [PATCH] drivers/w1/w1.c: fix a compile error
  [PATCH] W1: fix dependencies of W1_SLAVE_DS2433_CRC
  [PATCH] W1: possible cleanups
  [PATCH] W1: cleanups
  [PATCH] w1 exports
  [PATCH] w1: Use mutexes instead of semaphores.
  [PATCH] w1: Make w1 connector notifications depend on connector.
  [PATCH] w1: netlink: Mark netlink group 1 as unused.
  [PATCH] w1: Move w1-connector definitions into linux/include/connector.h
  [PATCH] w1: Userspace communication protocol over connector.
  [PATCH] w1: Replace dscore and ds_w1_bridge with ds2490 driver.
  [PATCH] w1: Added default generic read/write operations.
2006-06-22 15:08:34 -07:00
Linus Torvalds
6c763eb9ea Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (27 commits)
  [PATCH] PCI: nVidia quirk to make AER PCI-E extended capability visible
  [PATCH] PCI: fix issues with extended conf space when MMCONFIG disabled because of e820
  [PATCH] PCI: Bus Parity Status sysfs interface
  [PATCH] PCI: fix memory leak in MMCONFIG error path
  [PATCH] PCI: fix error with pci_get_device() call in the mpc85xx driver
  [PATCH] PCI: MSI-K8T-Neo2-Fir: run only where needed
  [PATCH] PCI: fix race with pci_walk_bus and pci_destroy_dev
  [PATCH] PCI: clean up pci documentation to be more specific
  [PATCH] PCI: remove unneeded msi code
  [PATCH] PCI: don't move ioapics below PCI bridge
  [PATCH] PCI: cleanup unused variable about msi driver
  [PATCH] PCI: disable msi mode in pci_disable_device
  [PATCH] PCI: Allow MSI to work on kexec kernel
  [PATCH] PCI: AMD 8131 MSI quirk called too late, bus_flags not inherited ?
  [PATCH] PCI: Move various PCI IDs to header file
  [PATCH] PCI Bus Parity Status-broken hardware attribute, EDAC foundation
  [PATCH] PCI: i386/x86_84: disable PCI resource decode on device disable
  [PATCH] PCI ACPI: Rename the functions to avoid multiple instances.
  [PATCH] PCI: don't enable device if already enabled
  [PATCH] PCI: Add a "enable" sysfs attribute to the pci devices to allow userspace (Xorg) to enable devices without doing foul direct access
  ...
2006-06-22 15:07:59 -07:00
Richard Purdie
4f3865fb57 [PATCH] zlib_inflate: Upgrade library code to a recent version
Upgrade the zlib_inflate implementation in the kernel from a patched
version 1.1.3/4 to a patched 1.2.3.

The code in the kernel is about seven years old and I noticed that the
external zlib library's inflate performance was significantly faster (~50%)
than the code in the kernel on ARM (and faster again on x86_32).

For comparison the newer deflate code is 20% slower on ARM and 50% slower
on x86_32 but gives an approx 1% compression ratio improvement.  I don't
consider this to be an improvement for kernel use so have no plans to
change the zlib_deflate code.

Various changes have been made to the zlib code in the kernel, the most
significant being the extra functions/flush option used by ppp_deflate.
This update reimplements the features PPP needs to ensure it continues to
work.

This code has been tested on ARM under both JFFS2 (with zlib compression
enabled) and ppp_deflate and on x86_32.  JFFS2 sees an approx.  10% real
world file read speed improvement.

This patch also removes ZLIB_VERSION as it no longer has a correct value.
We don't need version checks anyway as the kernel's module handling will
take care of that for us.  This removal is also more in keeping with the
zlib author's wishes (http://www.zlib.net/zlib_faq.html#faq24) and I've
added something to the zlib.h header to note its a modified version.

Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Acked-by: Joern Engel <joern@wh.fh-wedel.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22 15:05:58 -07:00
NeilBrown
0feae5c47a [PATCH] Fix dcache race during umount
The race is that the shrink_dcache_memory shrinker could get called while a
filesystem is being unmounted, and could try to prune a dentry belonging to
that filesystem.

If it does, then it will call in to iput on the inode while the dentry is
no longer able to be found by the umounting process.  If iput takes a
while, generic_shutdown_super could get all the way though
shrink_dcache_parent and shrink_dcache_anon and invalidate_inodes without
ever waiting on this particular inode.

Eventually the superblock gets freed anyway and if the iput tried to touch
it (which some filesystems certainly do), it will lose.  The promised
"Self-destruct in 5 seconds" doesn't lead to a nice day.

The race is closed by holding s_umount while calling prune_one_dentry on
someone else's dentry.  As a down_read_trylock is used,
shrink_dcache_memory will no longer try to prune the dentry of a filesystem
that is being unmounted, and unmount will not be able to start until any
such active prune_one_dentry completes.

This requires that prune_dcache *knows* which filesystem (if any) it is
doing the prune on behalf of so that it can be careful of other
filesystems.  shrink_dcache_memory isn't called it on behalf of any
filesystem, and so is careful of everything.

shrink_dcache_anon is now passed a super_block rather than the s_anon list
out of the superblock, so it can get the s_anon list itself, and can pass
the superblock down to prune_dcache.

If prune_dcache finds a dentry that it cannot free, it leaves it where it
is (at the tail of the list) and exits, on the assumption that some other
thread will be removing that dentry soon.  To try to make sure that some
work gets done, a limited number of dnetries which are untouchable are
skipped over while choosing the dentry to work on.

I believe this race was first found by Kirill Korotaev.

Cc: Jan Blunck <jblunck@suse.de>
Acked-by: Kirill Korotaev <dev@openvz.org>
Cc: Olaf Hering <olh@suse.de>
Acked-by: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Balbir Singh <balbir@in.ibm.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22 15:05:57 -07:00
Miklos Szeredi
c89681ed7d [PATCH] remove steal_locks()
This patch removes the steal_locks() function.

steal_locks() doesn't work correctly with any filesystem that does it's own
lock management, including NFS, CIFS, etc.

In addition it has weird semantics on local filesystems in case tasks
sharing file-descriptor tables are doing POSIX locking operations in
parallel to execve().

The steal_locks() function has an effect on applications doing:

clone(CLONE_FILES)
  /* in child */
  lock
  execve
  lock

POSIX locks acquired before execve (by "child", "parent" or any further
task sharing files_struct) will after the execve be owned exclusively by
"child".

According to Chris Wright some LSB/LTP kind of suite triggers without the
stealing behavior, but there's no known real-world application that would
also fail.

Apps using NPTL are not affected, since all other threads are killed before
execve.

Apps using LinuxThreads are only affected if they

  - have multiple threads during exec (LinuxThreads doesn't kill other
    threads, the app may do it with pthread_kill_other_threads_np())
  - rely on POSIX locks being inherited across exec

Both conditions are documented, but not their interaction.

Apps using clone() natively are affected if they

  - use clone(CLONE_FILES)
  - rely on POSIX locks being inherited across exec

The above scenarios are unlikely, but possible.

If the patch is vetoed, there's a plan B, that involves mostly keeping the
weird stealing semantics, but changing the way lock ownership is handled so
that network and local filesystems work consistently.

That would add more complexity though, so this solution seems to be
preferred by most people.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Matthew Wilcox <willy@debian.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Steven French <sfrench@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22 15:05:57 -07:00
Brice Goglin
0e5b378159 [PATCH] PCI: Add PCI_CAP_ID_VNDR
Add the vendor-specific extended capability PCI_CAP_ID_VNDR.  It is required
by the Myri-10G Ethernet driver.

Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22 15:05:56 -07:00
David Howells
04c567d931 [PATCH] Keys: Fix race between two instantiators of a key
Add a revocation notification method to the key type and calls it whilst
the key's semaphore is still write-locked after setting the revocation
flag.

The patch then uses this to maintain a reference on the task_struct of the
process that calls request_key() for as long as the authorisation key
remains unrevoked.

This fixes a potential race between two processes both of which have
assumed the authority to instantiate a key (one may have forked the other
for example).  The problem is that there's no locking around the check for
revocation of the auth key and the use of the task_struct it points to, nor
does the auth key keep a reference on the task_struct.

Access to the "context" pointer in the auth key must thenceforth be done
with the auth key semaphore held.  The revocation method is called with the
target key semaphore held write-locked and the search of the context
process's keyrings is done with the auth key semaphore read-locked.

The check for the revocation state of the auth key just prior to searching
it is done after the auth key is read-locked for the search.  This ensures
that the auth key can't be revoked between the check and the search.

The revocation notification method is added so that the context task_struct
can be released as soon as instantiation happens rather than waiting for
the auth key to be destroyed, thus avoiding the unnecessary pinning of the
requesting process.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22 15:05:56 -07:00
Michael LeMay
d720024e94 [PATCH] selinux: add hooks for key subsystem
Introduce SELinux hooks to support the access key retention subsystem
within the kernel.  Incorporate new flask headers from a modified version
of the SELinux reference policy, with support for the new security class
representing retained keys.  Extend the "key_alloc" security hook with a
task parameter representing the intended ownership context for the key
being allocated.  Attach security information to root's default keyrings
within the SELinux initialization routine.

Has passed David's testsuite.

Signed-off-by: Michael LeMay <mdlemay@epoch.ncsc.mil>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-22 15:05:55 -07:00
Evgeniy Polyakov
bb5427b546 [PATCH] w1: netlink: Mark netlink group 1 as unused.
netlink_w1 was moved to connector.

Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-22 11:22:50 -07:00
Evgeniy Polyakov
b6043fcab4 [PATCH] w1: Move w1-connector definitions into linux/include/connector.h
Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-22 11:22:50 -07:00
Krzysztof Halasa
46f5ed753f [PATCH] i2c: Mark block write buffers as const
The attached patch marks i2c_smbus_write_block_data() and
i2c_smbus_write_i2c_block_data() buffers as const.

Signed-off-by: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-22 11:10:34 -07:00
Peter Korsgaard
18f98b1e31 [PATCH] i2c: New bus driver for the OpenCores I2C controller
The following patch adds support for the OpenCores I2C controller IP
core (See http://www.opencores.org/projects.cgi/web/i2c/overview).

Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-22 11:10:33 -07:00
Jean Delvare
5c7ae65899 [PATCH] I2C: i2c-nforce2: Add support for the nForce4 MCP51 and MCP55
Add support for the new nForce4 MCP51 (also known as nForce 410 or
430) and nForce4 MCP55 to the i2c-nforce2 driver. Some code changes
were required because the base I/O address registers have changed in
these versions. Standard BARs are now being used, while the original
nForce2 chips used non-standard ones.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-22 11:10:33 -07:00
Mark A. Greer
5e9f4f2e5a [PATCH] I2C: m41t00: Add support for the ST M41T81 and M41T85
This patch adds support for the ST m41t81 and m41t85 i2c rtc chips
to the existing m41t00 driver.

Since there is no way to reliably determine what type of rtc chip
is in use, the chip type is passed in via platform_data.  The i2c
address and square wave frequency are passed in via platform_data
as well.  To accommodate the use of platform_data, a new header
file include/linux/m41t00.h has been added.

The m41t81 and m41t85 chips halt the updating of their time registers
while they are being accessed.  They resume when a stop condition
exists on the i2c bus or when non-time related regs are accessed.
To make the best use of that facility and to make more efficient
use of the i2c bus, this patch replaces multiple i2c_smbus_xxx calls
with a single i2c_transfer call.

Signed-off-by: Mark A. Greer <mgreer@mvista.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-22 11:10:32 -07:00
Rudolf Marek
02e0c5d5c2 [PATCH] i2c-piix4: Add ATI IXP200/300/400 support
This patch adds the ATI IXP southbridges support to i2c-piix4,
as it turned out those chips are compatible with it.

Signed-off-by: Rudolf Marek <r.marek@sh.cvut.cz>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-22 11:10:32 -07:00
Jean Delvare
1bca9f2e5b Remove <linux/i2c-id.h> and <linux/i2c-algo-ite.h> from userspace export
Do not export the following i2c header files to user space:

* i2c-id.h: These IDs are internal to the kernel and user space
  does not need them.
* i2c-algo-ite.h: The driver is broken and planned for removal, so
  let's not export it for now. The extra ioctl defined here can
  be standardized and moved to i2c-core/i2c-dev if really needed.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-06-22 14:24:31 +01:00
Greg Kroah-Hartman
bd00949647 [PATCH] USB: convert usb class devices to real devices
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21 15:04:19 -07:00
Greg Kroah-Hartman
c182274ffe [PATCH] USB: move usb_device_class class devices to be real devices
This moves the usb class devices that control the usbfs nodes to show up
in the proper place in the larger device tree.

No userspace changes is needed, this is compatible due to the symlinks
generated by the driver core.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21 15:04:19 -07:00
Greg Kroah-Hartman
9bde7497e0 [PATCH] USB: make endpoints real struct devices
This will allow for us to give endpoints a major/minor to create a
"usbfs2-like" way to access endpoints directly from userspace in an
easier manner than the current usbfs provides us.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21 15:04:19 -07:00
David Brownell
ae0dadcf0f [PATCH] USB: move <linux/usb_input.h> to <linux/usb/input.h>
Move <linux/usb_input.h> to <linux/usb/input.h> and remove some
redundant includes.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21 15:04:18 -07:00
David Brownell
325a4af60d [PATCH] USB: move hardware-specific <linux/usb_*.h> to <linux/usb/*.h>
This moves header files for controller-specific platform data
from <linux/usb_XXX.h> to <linux/usb/XXX.h> to start reducing
some clutter.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21 15:04:18 -07:00
David Brownell
a8c28f2389 [PATCH] USB: move <linux/usb_cdc.h> to <linux/usb/cdc.h>
This moves <linux/usb_cdc.h> to <linux/usb/cdc.h> to reduce some of the
clutter of usb header files.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21 15:04:18 -07:00
Alan Stern
79efa097e7 [PATCH] usbcore: port reset for composite devices
This patch (as699) adds usb_reset_composite_device(), a routine for
sending a USB port reset to a device with multiple interfaces owned by
different drivers.  Drivers are notified about impending and completed
resets through two new methods in the usb_driver structure.

The patch modifieds the usbfs ioctl code to make it use the new routine
instead of usb_reset_device().  Follow-up patches will modify the hub,
usb-storage, and usbhid drivers so they can utilize this new API.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21 15:04:15 -07:00
Greg Kroah-Hartman
782a7a632e [PATCH] USB: add usb_interrupt_msg() function for api completeness.
Really just a wrapper around usb_bulk_msg() but now it's documented
much better.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21 15:04:12 -07:00
Tony Luck
1323523f50 Pull rework-memory-attribute-aliasing into release branch 2006-06-21 14:50:10 -07:00
Rene Herman
a5117ba7da [PATCH] Driver model: add ISA bus
During the recent "isa drivers using platform devices" discussion it was
pointed out that (ALSA) ISA drivers ran into the problem of not having
the option to fail driver load (device registration rather) upon not
finding their hardware due to a probe() error not being passed up
through the driver model. In the course of that, I suggested a seperate
ISA bus might be best; Russell King agreed and suggested this bus could
use the .match() method for the actual device discovery.

The attached does this. For this old non (generically) discoverable ISA
hardware only the driver itself can do discovery so as a difference with
the platform_bus, this isa_bus also distributes match() up to the driver.

As another difference: these devices only exist in the driver model due
to the driver creating them because it might want to drive them, meaning
that all device creation has been made internal as well.

The usage model this provides is nice, and has been acked from the ALSA
side by Takashi Iwai and Jaroslav Kysela. The ALSA driver module_init's
now (for oldisa-only drivers) become:

static int __init alsa_card_foo_init(void)
{
	return isa_register_driver(&snd_foo_isa_driver, SNDRV_CARDS);
}

static void __exit alsa_card_foo_exit(void)
{
	isa_unregister_driver(&snd_foo_isa_driver);
}

Quite like the other bus models therefore. This removes a lot of
duplicated init code from the ALSA ISA drivers.

The passed in isa_driver struct is the regular driver struct embedding a
struct device_driver, the normal probe/remove/shutdown/suspend/resume
callbacks, and as indicated that .match callback.

The "SNDRV_CARDS" you see being passed in is a "unsigned int ndev"
parameter, indicating how many devices to create and call our methods with.

The platform_driver callbacks are called with a platform_device param;
the isa_driver callbacks are being called with a "struct device *dev,
unsigned int id" pair directly -- with the device creation completely
internal to the bus it's much cleaner to not leak isa_dev's by passing
them in at all. The id is the only thing we ever want other then the
struct device * anyways, and it makes for nicer code in the callbacks as
well.

With this additional .match() callback ISA drivers have all options. If
ALSA would want to keep the old non-load behaviour, it could stick all
of the old .probe in .match, which would only keep them registered after
everything was found to be present and accounted for. If it wanted the
behaviour of always loading as it inadvertently did for a bit after the
changeover to platform devices, it could just not provide a .match() and
do everything in .probe() as before.

If it, as Takashi Iwai already suggested earlier as a way of following
the model from saner buses more closely, wants to load when a later bind
could conceivably succeed, it could use .match() for the prerequisites
(such as checking the user wants the card enabled and that port/irq/dma
values have been passed in) and .probe() for everything else. This is
the nicest model.

To the code...

This exports only two functions; isa_{,un}register_driver().

isa_register_driver() register's the struct device_driver, and then
loops over the passed in ndev creating devices and registering them.
This causes the bus match method to be called for them, which is:

int isa_bus_match(struct device *dev, struct device_driver *driver)
{
          struct isa_driver *isa_driver = to_isa_driver(driver);

          if (dev->platform_data == isa_driver) {
                  if (!isa_driver->match ||
                          isa_driver->match(dev, to_isa_dev(dev)->id))
                          return 1;
                  dev->platform_data = NULL;
          }
          return 0;
}

The first thing this does is check if this device is in fact one of this
driver's devices by seeing if the device's platform_data pointer is set
to this driver. Platform devices compare strings, but we don't need to
do that with everything being internal, so isa_register_driver() abuses
dev->platform_data as a isa_driver pointer which we can then check here.
I believe platform_data is available for this, but if rather not, moving
the isa_driver pointer to the private struct isa_dev is ofcourse fine as
well.

Then, if the the driver did not provide a .match, it matches. If it did,
the driver match() method is called to determine a match.

If it did _not_ match, dev->platform_data is reset to indicate this to
isa_register_driver which can then unregister the device again.

If during all this, there's any error, or no devices matched at all
everything is backed out again and the error, or -ENODEV, is returned.

isa_unregister_driver() just unregisters the matched devices and the
driver itself.

More global points/questions...

- I'm introducing include/linux/isa.h. It was available but is ofcourse
a somewhat generic name. Moving more isa stuff over to it in time is
ofcourse fine, so can I have it please? :)

- I'm using device_initcall() and added the isa.o (dependent on
CONFIG_ISA) after the base driver model things in the Makefile. Will
this do, or I really need to stick it in drivers/base/init.c, inside
#ifdef CONFIG_ISA? It's working fine.

Lastly -- I also looked, a bit, into integrating with PnP. "Old ISA"
could be another pnp_protocol, but this does not seem to be a good
match, largely due to the same reason platform_devices weren't -- the
devices do not have a life of their own outside the driver, meaning the
pnp_protocol {get,set}_resources callbacks would need to callback into
driver -- which again means you first need to _have_ that driver. Even
if there's clean way around that, you only end up inventing fake but
valid-form PnP IDs and generally catering to the PnP layer without any
practical advantages over this very simple isa_bus. The thing I also
suggested earlier about the user echoing values into /sys to set up the
hardware from userspace first is... well, cute, but a horrible idea from
a user standpoint.

Comments ofcourse appreciated. Hope it's okay. As said, the usage model
is nice at least.

Signed-off-by: Rene Herman <rene.herman@keyaccess.nl>
2006-06-21 12:40:49 -07:00
Alan Stern
3e95637a48 [PATCH] Driver Core: Make dev_info and friends print the bus name if there is no driver
This patch (as721) makes dev_info and related macros print the device's
bus name if the device doesn't have a driver, instead of printing just a
blank.  If the device isn't on a bus either... well, then it does leave
a blank space.  But it will be easier for someone else to change if they
want.

Cc: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21 12:40:49 -07:00
Greg Kroah-Hartman
23681e4791 [PATCH] Driver core: allow struct device to have a dev_t
This is the first step in moving class_device to being replaced by
struct device.  It allows struct device to export a dev_t and makes it
easy to dynamically create and destroy struct device as long as they are
associated with a specific class.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21 12:40:49 -07:00
Michael Holzheu
4039483fd3 [PATCH] Driver Core: Add /sys/hypervisor when needed
To have a home for all hypervisors, this patch creates /sys/hypervisor.
A new config option SYS_HYPERVISOR is introduced, which should to be set
by architecture dependent hypervisors (e.g. s390 or Xen).

Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21 12:40:48 -07:00
Shaohua Li
670dd90d81 [PATCH] Driver Core: Allow sysdev_class have attributes
allow sysdev_class adding attribute. Next patch will use the new API to
add an attribute under /sys/device/system/cpu/.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21 12:40:48 -07:00
Greg Kroah-Hartman
1740757e8f [PATCH] Driver Core: remove unused exports
Cc: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21 12:40:48 -07:00
Hansjoerg Lipp
1cdcb6b43f [PATCH] TTY: return class device pointer from tty_register_device()
Let tty_register_device() return a pointer to the class device it creates.
This allows registrants to add their own sysfs files under the class
device node.

Signed-off-by: Hansjoerg Lipp <hjlipp@web.de>
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21 12:40:47 -07:00
Brice Goglin
cf34a8e07f [PATCH] PCI: nVidia quirk to make AER PCI-E extended capability visible
The nVidia CK804 PCI-E chipset supports the AER extended capability
but sometimes fails to link it (with some BIOS or after a warm reboot).
It makes the AER cap invisible to pci_find_ext_capability().

The patch adds a quirk to set the missing bit that controls the
linking of the capability.
By the way, it removes the corresponding code in the myri10ge driver.

Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Loic Prylli <loic@myri.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21 12:00:01 -07:00
Shaohua Li
99dc804d9b [PATCH] PCI: disable msi mode in pci_disable_device
Brice said the pci_save_msi_state breaks his driver in his special usage
(not in suspend/resume), as pci_save_msi_state will disable msi mode. In
his usage, pci_save_state will be called at runtime, and later (after
the device operates for some time and has an error) pci_restore_state
will be called.
In another hand, suspend/resume needs disable msi mode, as device should
stop working completely. This patch try to workaround this issue.
Drivers are expected call pci_disable_device in suspend time after
pci_save_state.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21 12:00:00 -07:00
Brent Casavant
74d0a988d3 [PATCH] PCI: Move various PCI IDs to header file
Move various QLogic, Vitesse, and Intel storage controller PCI IDs to the
main header file.

Signed-off-by: Brent Casavant <bcasavan@sgi.com>
Acked-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21 12:00:00 -07:00
Doug Thompson
bd8481e164 [PATCH] PCI Bus Parity Status-broken hardware attribute, EDAC foundation
Currently, the EDAC (error detection and correction) modules that are in
the kernel contain some features that need to be moved. After some good
feedback on the PCI Parity detection code and interface
(http://www.ussg.iu.edu/hypermail/linux/kernel/0603.1/0897.html) this
patch ADDs an new attribute to the pci_dev structure: Namely the
'broken_parity_status' bit.

When set this indicates that the respective hardware generates false
positives of Parity errors.

The EDAC "blacklist" solution was inferior and will be removed in a
future patch.

Also in this patch is a PCI quirk.c entry for an Infiniband PCI-X card
which generates false positive parity errors.

I am requesting comments on this AND on the possibility of a exposing
this 'broken_parity_status' bit to userland via the PCI device sysfs
directory for devices. This access would allow for enabling of this
feature on new devices and for old devices that have their drivers
updated. (SLES 9 SP3 did this on an ATI motherboard video device). There
is a need to update such a PCI attribute between kernel releases.

This patch just adds a storage place for the attribute and a quirk entry
for a known bad PCI device. PCI Parity reaper/harvestor operations are
in EDAC itself and will be refactored to use this PCI attribute instead
of its own mechanisms (which are currently disabled) in the future.

Signed-off-by: Doug Thompson <norsk5@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21 11:59:59 -07:00
Kumar Gala
75acfecaa0 [PATCH] PCI: Add pci_assign_resource_fixed -- allow fixed address assignments
PCI: Add pci_assign_resource_fixed -- allow fixed address assignments

On some embedded systems the PCI address for hotplug devices are not only
known a priori but are required to be at a given PCI address for other
master in the system to be able to access.

An example of such a system would be an FPGA which is setup from user space
after the system has booted.  The FPGA may be access by DSPs in the system
and those DSPs expect the FPGA at a fixed PCI address.

Added pci_assign_resource_fixed() as a way to allow assignment of the PCI
devices's BARs at fixed PCI addresses.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21 11:59:59 -07:00
Brice Goglin
c34b4c7344 [PATCH] PCI: Add PCI_CAP_ID_VNDR
Add the vendor-specific extended capability PCI_CAP_ID_VNDR.  It will be
used by the Myri-10G Ethernet driver (will be submitted soon).

Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-21 11:59:58 -07:00
Linus Torvalds
28e4b22495 Merge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (85 commits)
  [SCSI] 53c700: remove reliance on deprecated cmnd fields
  [SCSI] hptiop: don't use cmnd->bufflen
  [SCSI] hptiop: HighPoint RocketRAID 3xxx controller driver
  [SCSI] aacraid: small misc. cleanups
  [SCSI] aacraid: Update supported product information
  [SCSI] aacraid: Fix return code interpretation
  [SCSI] scsi_transport_sas: fix panic in sas_free_rphy
  [SCSI] remove RQ_SCSI_* flags
  [SCSI] remove scsi_request infrastructure
  [SCSI] mptfusion: change driver revision to 3.03.10
  [SCSI] mptfc: abort of board reset leaves port dead requiring reboot
  [SCSI] mptfc: fix fibre channel infinite request/response loop
  [SCSI] mptfc: set fibre channel fw target missing timers to one second
  [SCSI] mptfusion: move fc event/reset handling to mptfc
  [SCSI] spi transport: don't allow dt to be set on SE or HVD buses
  [SCSI] aic7xxx: expose the bus setting to sysfs
  [SCSI] scsi: remove Documentation/scsi/cpqfc.txt
  [SCSI] drivers/scsi: Use ARRAY_SIZE macro
  [SCSI] Remove last page_address from dc395x.c
  [SCSI] hptiop: HighPoint RocketRAID 3xxx controller driver
  ...

Fixed up conflicts in drivers/message/fusion/mptbase.c manually (due to
the sparc interrupt cleanups)
2006-06-21 11:18:25 -07:00
Anton Blanchard
1e92a550e8 [POWERPC] Fix mdelay badness on shared processor partitions
On partitioned PPC64 systems where a partition is given 1/10 of a
processor, we have seen mdelay() delaying for 10 times longer than it
should.  The reason is that the generic mdelay(n) does n delays of 1
millisecond each.  However, with 1/10 of a processor, we only get a
one-millisecond timeslice every 10ms.  Thus each 1 millisecond delay
loop ends up taking 10ms elapsed time.

The solution is just to use the PPC64 udelay function, which uses the
timebase to ensure that the delay is based on elapsed time rather than
how much processing time the partition has been given.  (Yes, the
generic mdelay uses the PPC64 udelay, but the problem is that the
start time gets reset every millisecond, and each time it gets reset
we lose another 9ms.)

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Andrew Morton <akpm@osdl.org>
2006-06-21 15:01:33 +10:00
Brice Goglin
22ae813b85 [PATCH] add __iowrite64_copy
Introduce __iowrite64_copy.  It will be used by the Myri-10G Ethernet
driver to post requests to the NIC.  This driver will be submitted soon.

__iowrite64_copy copies to I/O memory in units of 64 bits when possible (on
64 bit architectures).  It reverts to __iowrite32_copy on 32 bit
architectures.

Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-20 20:24:58 -07:00
Linus Torvalds
050335db2a Merge branch 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm
* 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: (42 commits)
  [ARM] Fix tosa build error
  [ARM] 3610/1: Make reboot work on Versatile
  [ARM] 3609/1: S3C24XX: defconfig update for s3c2410_defconfig
  [ARM] 3591/1: Anubis: IDE device definitions
  [ARM] Include asm/hardware.h not asm/arch/hardware.h
  [ARM] 3594/1: Poodle: Add touchscreen support + other updates
  [ARM] 3564/1: sharpsl_pm: Abstract some machine specific parameters
  [ARM] 3561/1: Poodle: Correct the MMC/SD power control
  [ARM] 3593/1: Add reboot and shutdown handlers for Zaurus handhelds
  [ARM] 3599/1: AT91RM9200 remove global variables
  [ARM] 3607/1: AT91RM9200 misc fixes
  [ARM] 3605/1: AT91RM9200 Power Management
  [ARM] 3604/1: AT91RM9200 New boards
  [ARM] 3603/1: AT91RM9200 remove old files
  [ARM] 3592/1: AT91RM9200 Serial driver update
  [ARM] 3590/1: AT91RM9200 Platform devices support
  [ARM] 3589/1: AT91RM9200 DK/EK board update
  [ARM] 3588/1: AT91RM9200 CSB337/637 board update
  [ARM] 3587/1: AT91RM9200 hardware headers
  [ARM] 3586/1: AT91RM9200 header update
  ...
2006-06-20 17:52:36 -07:00
Trond Myklebust
70ac4385a1 Merge branch 'master' of /home/trondmy/kernel/linux-2.6/
Conflicts:

	include/linux/nfs_fs.h

Fixed up conflict with kernel header updates.
2006-06-20 20:46:21 -04:00
Linus Torvalds
a4cfae13ce Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [ATM]: fix broken uses of NIPQUAD in net/atm
  [SCTP]: sctp_unpack_cookie() fix
  [SCTP]: Fix unintentional change to SCTP_ASSERT when !SCTP_DEBUG
  [NET]: Prevent multiple qdisc runs
  [CONNECTOR]: Initialize subsystem earlier.
  [NETFILTER]: xt_sctp: fix endless loop caused by 0 chunk length
2006-06-20 17:39:53 -07:00
Linus Torvalds
d9eaec9e29 Merge branch 'audit.b21' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current
* 'audit.b21' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: (25 commits)
  [PATCH] make set_loginuid obey audit_enabled
  [PATCH] log more info for directory entry change events
  [PATCH] fix AUDIT_FILTER_PREPEND handling
  [PATCH] validate rule fields' types
  [PATCH] audit: path-based rules
  [PATCH] Audit of POSIX Message Queue Syscalls v.2
  [PATCH] fix se_sen audit filter
  [PATCH] deprecate AUDIT_POSSBILE
  [PATCH] inline more audit helpers
  [PATCH] proc_loginuid_write() uses simple_strtoul() on non-terminated array
  [PATCH] update of IPC audit record cleanup
  [PATCH] minor audit updates
  [PATCH] fix audit_krule_to_{rule,data} return values
  [PATCH] add filtering by ppid
  [PATCH] log ppid
  [PATCH] collect sid of those who send signals to auditd
  [PATCH] execve argument logging
  [PATCH] fix deadlocks in AUDIT_LIST/AUDIT_LIST_RULES
  [PATCH] audit_panic() is audit-internal
  [PATCH] inotify (5/5): update kernel documentation
  ...

Manual fixup of conflict in unclude/linux/inotify.h
2006-06-20 15:37:56 -07:00
Linus Torvalds
cee4cca740 Merge git://git.infradead.org/hdrcleanup-2.6
* git://git.infradead.org/hdrcleanup-2.6: (63 commits)
  [S390] __FD_foo definitions.
  Switch to __s32 types in joystick.h instead of C99 types for consistency.
  Add <sys/types.h> to headers included for userspace in <linux/input.h>
  Move inclusion of <linux/compat.h> out of user scope in asm-x86_64/mtrr.h
  Remove struct fddi_statistics from user view in <linux/if_fddi.h>
  Move user-visible parts of drivers/s390/crypto/z90crypt.h to include/asm-s390
  Revert include/media changes: Mauro says those ioctls are only used in-kernel(!)
  Include <linux/types.h> and use __uXX types in <linux/cramfs_fs.h>
  Use __uXX types in <linux/i2o_dev.h>, include <linux/ioctl.h> too
  Remove private struct dx_hash_info from public view in <linux/ext3_fs.h>
  Include <linux/types.h> and use __uXX types in <linux/affs_hardblocks.h>
  Use __uXX types in <linux/divert.h> for struct divert_blk et al.
  Use __u32 for elf_addr_t in <asm-powerpc/elf.h>, not u32. It's user-visible.
  Remove PPP_FCS from user view in <linux/ppp_defs.h>, remove __P mess entirely
  Use __uXX types in user-visible structures in <linux/nbd.h>
  Don't use 'u32' in user-visible struct ip_conntrack_old_tuple.
  Use __uXX types for S390 DASD volume label definitions which are user-visible
  S390 BIODASDREADCMB ioctl should use __u64 not u64 type.
  Remove unneeded inclusion of <linux/time.h> from <linux/ufs_fs.h>
  Fix private integer types used in V4L2 ioctls.
  ...

Manually resolve conflict in include/linux/mtd/physmap.h
2006-06-20 15:10:08 -07:00
Linus Torvalds
2edc322d42 Merge git://git.infradead.org/~dwmw2/rbtree-2.6
* git://git.infradead.org/~dwmw2/rbtree-2.6:
  [RBTREE] Switch rb_colour() et al to en_US spelling of 'color' for consistency
  Update UML kernel/physmem.c to use rb_parent() accessor macro
  [RBTREE] Update hrtimers to use rb_parent() accessor macro.
  [RBTREE] Add explicit alignment to sizeof(long) for struct rb_node.
  [RBTREE] Merge colour and parent fields of struct rb_node.
  [RBTREE] Remove dead code in rb_erase()
  [RBTREE] Update JFFS2 to use rb_parent() accessor macro.
  [RBTREE] Update eventpoll.c to use rb_parent() accessor macro.
  [RBTREE] Update key.c to use rb_parent() accessor macro.
  [RBTREE] Update ext3 to use rb_parent() accessor macro.
  [RBTREE] Change rbtree off-tree marking in I/O schedulers.
  [RBTREE] Add accessor macros for colour and parent fields of rb_node
2006-06-20 14:51:22 -07:00
Linus Torvalds
be967b7e2f Merge git://git.infradead.org/mtd-2.6
* git://git.infradead.org/mtd-2.6: (199 commits)
  [MTD] NAND: Fix breakage all over the place
  [PATCH] NAND: fix remaining OOB length calculation
  [MTD] NAND Fixup NDFC merge brokeness
  [MTD NAND] S3C2410 driver cleanup
  [MTD NAND] s3c24x0 board: Fix clock handling, ensure proper initialisation.
  [JFFS2] Check CRC32 on dirent and data nodes each time they're read
  [JFFS2] When retiring nextblock, allocate a node_ref for the wasted space
  [JFFS2] Mark XATTR support as experimental, for now
  [JFFS2] Don't trust node headers before the CRC is checked.
  [MTD] Restore MTD_ROM and MTD_RAM types
  [MTD] assume mtd->writesize is 1 for NOR flashes
  [MTD NAND] Fix s3c2410 NAND driver so it at least _looks_ like it compiles
  [MTD] Prepare physmap for 64-bit-resources
  [JFFS2] Fix more breakage caused by janitorial meddling.
  [JFFS2] Remove stray __exit from jffs2_compressors_exit()
  [MTD] Allow alternate JFFS2 mount variant for root filesystem.
  [MTD] Disconnect struct mtd_info from ABI
  [MTD] replace MTD_RAM with MTD_GENERIC_TYPE
  [MTD] replace MTD_ROM with MTD_GENERIC_TYPE
  [MTD] remove a forgotten MTD_XIP
  ...
2006-06-20 14:50:31 -07:00
Thomas Gleixner
7bc3312bef [MTD] NAND: Fix breakage all over the place
Following problems are addressed:

- wrong status caused early break out of nand_wait()
- removed the bogus status check in nand_wait() which
  is a relict of the abandoned support for interrupted
  erase.
- status check moved to the correct place in read_oob
- oob support for syndrom based ecc with strange layouts
- use given offset in the AUTOOOB based oob operations

Partially based on a patch from Vitaly Vool <vwool@ru.mvista.com>
Thanks to Savin Zlobec <savin@epico.si> for tracking down the
status problem.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-06-20 20:31:24 +01:00
Wim Van Sebroeck
58b519f3e5 [WATCHDOG] add WDIOC_GETTIMELEFT ioctl
Some watchdog drivers have the ability to report the remaining time
before the system will reboot. With the WDIOC_GETTIMELEFT ioctl
you can now read the time left before the watchdog would reboot
your system.

The following drivers support this new IOCTL:
i8xx_tco.c, pcwd_pci.c and pcwd_usb.c .

Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-06-20 19:00:30 +02:00
Corey Minyard
e05b59fe79 [WATCHDOG] Pre-Timeout flags
Some watchdog timers support the concept of a "pretimeout" which
occurs some time before the real timeout.  The pretimeout can
be delivered via an interrupt or NMI and can be used to panic
the system when it occurs (so you get useful information instead
of a blind reboot).

Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-06-20 18:50:42 +02:00
Trond Myklebust
d59bf96cdd Merge branch 'master' of /home/trondmy/kernel/linux-2.6/ 2006-06-20 08:59:45 -04:00
Amy Griffis
9c937dcc71 [PATCH] log more info for directory entry change events
When an audit event involves changes to a directory entry, include
a PATH record for the directory itself.  A few other notable changes:

    - fixed audit_inode_child() hooks in fsnotify_move()
    - removed unused flags arg from audit_inode()
    - added audit log routines for logging a portion of a string

Here's some sample output.

before patch:
type=SYSCALL msg=audit(1149821605.320:26): arch=40000003 syscall=39 success=yes exit=0 a0=bf8d3c7c a1=1ff a2=804e1b8 a3=bf8d3c7c items=1 ppid=739 pid=800 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=ttyS0 comm="mkdir" exe="/bin/mkdir" subj=root:system_r:unconfined_t:s0-s0:c0.c255
type=CWD msg=audit(1149821605.320:26):  cwd="/root"
type=PATH msg=audit(1149821605.320:26): item=0 name="foo" parent=164068 inode=164010 dev=03:00 mode=040755 ouid=0 ogid=0 rdev=00:00 obj=root:object_r:user_home_t:s0

after patch:
type=SYSCALL msg=audit(1149822032.332:24): arch=40000003 syscall=39 success=yes exit=0 a0=bfdd9c7c a1=1ff a2=804e1b8 a3=bfdd9c7c items=2 ppid=714 pid=777 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=ttyS0 comm="mkdir" exe="/bin/mkdir" subj=root:system_r:unconfined_t:s0-s0:c0.c255
type=CWD msg=audit(1149822032.332:24):  cwd="/root"
type=PATH msg=audit(1149822032.332:24): item=0 name="/root" inode=164068 dev=03:00 mode=040750 ouid=0 ogid=0 rdev=00:00 obj=root:object_r:user_home_dir_t:s0
type=PATH msg=audit(1149822032.332:24): item=1 name="foo" inode=164010 dev=03:00 mode=040755 ouid=0 ogid=0 rdev=00:00 obj=root:object_r:user_home_t:s0

Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20 05:25:28 -04:00
Amy Griffis
f368c07d72 [PATCH] audit: path-based rules
In this implementation, audit registers inotify watches on the parent
directories of paths specified in audit rules.  When audit's inotify
event handler is called, it updates any affected rules based on the
filesystem event.  If the parent directory is renamed, removed, or its
filesystem is unmounted, audit removes all rules referencing that
inotify watch.

To keep things simple, this implementation limits location-based
auditing to the directory entries in an existing directory.  Given
a path-based rule for /foo/bar/passwd, the following table applies:

    passwd modified -- audit event logged
    passwd replaced -- audit event logged, rules list updated
    bar renamed     -- rule removed
    foo renamed     -- untracked, meaning that the rule now applies to
		       the new location

Audit users typically want to have many rules referencing filesystem
objects, which can significantly impact filtering performance.  This
patch also adds an inode-number-based rule hash to mitigate this
situation.

The patch is relative to the audit git tree:
http://kernel.org/git/?p=linux/kernel/git/viro/audit-current.git;a=summary
and uses the inotify kernel API:
http://lkml.org/lkml/2006/6/1/145

Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20 05:25:27 -04:00
George C. Wilson
20ca73bc79 [PATCH] Audit of POSIX Message Queue Syscalls v.2
This patch adds audit support to POSIX message queues.  It applies cleanly to
the lspp.b15 branch of Al Viro's git tree.  There are new auxiliary data
structures, and collection and emission routines in kernel/auditsc.c.  New hooks
in ipc/mqueue.c collect arguments from the syscalls.

I tested the patch by building the examples from the POSIX MQ library tarball.
Build them -lrt, not against the old MQ library in the tarball.  Here's the URL:
http://www.geocities.com/wronski12/posix_ipc/libmqueue-4.41.tar.gz
Do auditctl -a exit,always -S for mq_open, mq_timedsend, mq_timedreceive,
mq_notify, mq_getsetattr.  mq_unlink has no new hooks.  Please see the
corresponding userspace patch to get correct output from auditd for the new
record types.

[fixes folded]

Signed-off-by: George Wilson <ltcgcw@us.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20 05:25:26 -04:00
Al Viro
d8945bb51a [PATCH] inline more audit helpers
pull checks for ->audit_context into inlined wrappers

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20 05:25:25 -04:00
Linda Knippers
ac03221a4f [PATCH] update of IPC audit record cleanup
The following patch addresses most of the issues with the IPC_SET_PERM
records as described in:
https://www.redhat.com/archives/linux-audit/2006-May/msg00010.html
and addresses the comments I received on the record field names.

To summarize, I made the following changes:

1. Changed sys_msgctl() and semctl_down() so that an IPC_SET_PERM
   record is emitted in the failure case as well as the success case.
   This matches the behavior in sys_shmctl().  I could simplify the
   code in sys_msgctl() and semctl_down() slightly but it would mean
   that in some error cases we could get an IPC_SET_PERM record
   without an IPC record and that seemed odd.

2. No change to the IPC record type, given no feedback on the backward
   compatibility question.

3. Removed the qbytes field from the IPC record.  It wasn't being
   set and when audit_ipc_obj() is called from ipcperms(), the
   information isn't available.  If we want the information in the IPC
   record, more extensive changes will be necessary.  Since it only
   applies to message queues and it isn't really permission related, it
   doesn't seem worth it.

4. Removed the obj field from the IPC_SET_PERM record.  This means that
   the kern_ipc_perm argument is no longer needed.

5. Removed the spaces and renamed the IPC_SET_PERM field names.  Replaced iuid and
   igid fields with ouid and ogid in the IPC record.

I tested this with the lspp.22 kernel on an x86_64 box.  I believe it
applies cleanly on the latest kernel.

-- ljk

Signed-off-by: Linda Knippers <linda.knippers@hp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20 05:25:24 -04:00
Al Viro
3c66251e57 [PATCH] add filtering by ppid
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20 05:25:22 -04:00
Al Viro
e1396065e0 [PATCH] collect sid of those who send signals to auditd
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20 05:25:21 -04:00
Al Viro
473ae30bc7 [PATCH] execve argument logging
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20 05:25:21 -04:00
Al Viro
bc0f3b8ebb [PATCH] audit_panic() is audit-internal
... no need to provide a stub; note that extern is already gone from
include/linux/audit.h

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20 05:25:20 -04:00
Amy Griffis
3ca10067f7 [PATCH] inotify (4/5): allow watch removal from event handler
Allow callers to remove watches from their event handler via
inotify_remove_watch_locked().  This functionality can be used to
achieve IN_ONESHOT-like functionality for a subset of events in the
mask.

Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Acked-by: Robert Love <rml@novell.com>
Acked-by: John McCutchan <john@johnmccutchan.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20 05:25:19 -04:00
Amy Griffis
a9dc971d3f [PATCH] inotify (3/5): add interfaces to kernel API
Add inotify_init_watch() so caller can use inotify_watch refcounts
before calling inotify_add_watch().

Add inotify_find_watch() to find an existing watch for an (ih,inode)
pair.  This is similar to inotify_find_update_watch(), but does not
update the watch's mask if one is found.

Add inotify_rm_watch() to remove a watch via the watch pointer instead
of the watch descriptor.

Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Acked-by: Robert Love <rml@novell.com>
Acked-by: John McCutchan <john@johnmccutchan.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20 05:25:18 -04:00
Amy Griffis
7c29772288 [PATCH] inotify (2/5): add name's inode to event handler
When an inotify event includes a dentry name, also include the inode
associated with that name.

Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Acked-by: Robert Love <rml@novell.com>
Acked-by: John McCutchan <john@johnmccutchan.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20 05:25:18 -04:00
Amy Griffis
2d9048e201 [PATCH] inotify (1/5): split kernel API from userspace support
The following series of patches introduces a kernel API for inotify,
making it possible for kernel modules to benefit from inotify's
mechanism for watching inodes.  With these patches, inotify will
maintain for each caller a list of watches (via an embedded struct
inotify_watch), where each inotify_watch is associated with a
corresponding struct inode.  The caller registers an event handler and
specifies for which filesystem events their event handler should be
called per inotify_watch.

Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Acked-by: Robert Love <rml@novell.com>
Acked-by: John McCutchan <john@johnmccutchan.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20 05:25:17 -04:00
Al Viro
90204e0b7b [PATCH] remove config.h from inotify.h
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-06-20 05:25:17 -04:00
Herbert Xu
48d83325b6 [NET]: Prevent multiple qdisc runs
Having two or more qdisc_run's contend against each other is bad because
it can induce packet reordering if the packets have to be requeued.  It
appears that this is an unintended consequence of relinquinshing the queue
lock while transmitting.  That in turn is needed for devices that spend a
lot of time in their transmit routine.

There are no advantages to be had as devices with queues are inherently
single-threaded (the loopback device is not but then it doesn't have a
queue).

Even if you were to add a queue to a parallel virtual device (e.g., bolt
a tbf filter in front of an ipip tunnel device), you would still want to
process the queue in sequence to ensure that the packets are ordered
correctly.

The solution here is to steal a bit from net_device to prevent this.

BTW, as qdisc_restart is no longer used by anyone as a module inside the
kernel (IIRC it used to with netif_wake_queue), I have not exported the
new __qdisc_run function.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-19 23:57:59 -07:00
Linus Torvalds
d0b952a983 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (109 commits)
  [ETHTOOL]: Fix UFO typo
  [SCTP]: Fix persistent slowdown in sctp when a gap ack consumes rx buffer.
  [SCTP]: Send only 1 window update SACK per message.
  [SCTP]: Don't do CRC32C checksum over loopback.
  [SCTP] Reset rtt_in_progress for the chunk when processing its sack.
  [SCTP]: Reject sctp packets with broadcast addresses.
  [SCTP]: Limit association max_retrans setting in setsockopt.
  [PFKEYV2]: Fix inconsistent typing in struct sadb_x_kmprivate.
  [IPV6]: Sum real space for RTAs.
  [IRDA]: Use put_unaligned() in irlmp_do_discovery().
  [BRIDGE]: Add support for NETIF_F_HW_CSUM devices
  [NET]: Add NETIF_F_GEN_CSUM and NETIF_F_ALL_CSUM
  [TG3]: Convert to non-LLTX
  [TG3]: Remove unnecessary tx_lock
  [TCP]: Add tcp_slow_start_after_idle sysctl.
  [BNX2]: Update version and reldate
  [BNX2]: Use CPU native page size
  [BNX2]: Use compressed firmware
  [BNX2]: Add firmware decompression
  [BNX2]: Allow WoL settings on new 5708 chips
  ...

Manual fixup for conflict in drivers/net/tulip/winbond-840.c
2006-06-19 18:55:56 -07:00
Linus Torvalds
2090af7180 Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (166 commits)
  [PATCH] net: au1000_eth: PHY framework conversion
  [PATCH] 3c5zz ethernet: fix section warnings
  [PATCH] smc ethernet: fix section mismatch warnings
  [PATCH] hp ethernet: fix section mismatches
  [PATCH] Section mismatch in drivers/net/ne.o during modpost
  [PATCH] e1000: prevent statistics from getting garbled during reset
  [PATCH] smc911x Kconfig fix
  [PATCH] forcedeth: new device ids
  [PATCH] forcedeth config: version
  [PATCH] forcedeth config: module parameters
  [PATCH] forcedeth config: diagnostics
  [PATCH] forcedeth config: move functions
  [PATCH] forcedeth config: statistics
  [PATCH] forcedeth config: csum
  [PATCH] forcedeth config: wol
  [PATCH] forcedeth config: phy
  [PATCH] forcedeth config: flow control
  [PATCH] forcedeth config: ring sizes
  [PATCH] forcedeth config: tso cleanup
  [DOC] Update bonding documentation with sysfs info
  ...
2006-06-19 18:50:43 -07:00
Linus Torvalds
557240b48e Add support for suspending and resuming the whole console subsystem
Trying to suspend/resume with console messages flying all around is
doomed to failure, when the devices that the messages are trying to
go to are being shut down.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-19 18:16:01 -07:00
Andrew Victor
afefc4158f [ARM] 3592/1: AT91RM9200 Serial driver update
Patch from Andrew Victor

This patch includes a number of updates to the AT91RM9200 serial driver.

Changes include:
1. Conversion to a platform_driver.  [Ivan Kokshaysky]
2. Replaced all references to AT91RM9200 with AT91.  This driver can now
also be used for the AT91SAM9216.
3. Allow TIOCM_LOOP to configure local loopback mode.
4. Cleaned up the 'read_status_mask' usage and interrupt handler code.
[Chip Coldwell]
5. Suspend/resume support.  [David Brownell]

There are a few 'unused variable' warning when compiling this - I
removed the new DMA support to keep this first patch simpler.

Signed-off-by: Andrew Victor <andrew@sanpeople.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-06-19 19:53:19 +01:00
David Woodhouse
8555255f0b Add generic Kbuild files for 'make headers_install'
This adds the Kbuild files listing the files which are to be installed by
the 'headers_install' make target, in generic directories.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-06-18 12:14:01 +01:00
Tushar Gohad
c7ce1ae212 [PFKEYV2]: Fix inconsistent typing in struct sadb_x_kmprivate.
Signed-off-by: Tushar Gohad <tgohad@mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 22:54:03 -07:00
Herbert Xu
8648b3053b [NET]: Add NETIF_F_GEN_CSUM and NETIF_F_ALL_CSUM
The current stack treats NETIF_F_HW_CSUM and NETIF_F_NO_CSUM
identically so we test for them in quite a few places.  For the sake
of brevity, I'm adding the macro NETIF_F_GEN_CSUM for these two.  We
also test the disjunct of NETIF_F_IP_CSUM and the other two in various
places, for that purpose I've added NETIF_F_ALL_CSUM.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 22:06:05 -07:00
David S. Miller
35089bb203 [TCP]: Add tcp_slow_start_after_idle sysctl.
A lot of people have asked for a way to disable tcp_cwnd_restart(),
and it seems reasonable to add a sysctl to do that.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:53 -07:00
Herbert Xu
3cc0e87398 [NET]: Warn in __skb_trim if skb is paged
It's better to warn and fail rather than rarely triggering BUG on paths
that incorrectly call skb_trim/__skb_trim on a non-linear skb.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:22 -07:00
Herbert Xu
364c6badde [NET]: Clean up skb_linearize
The linearisation operation doesn't need to be super-optimised.  So we can
replace __skb_linearize with __pskb_pull_tail which does the same thing but
is more general.

Also, most users of skb_linearize end up testing whether the skb is linear
or not so it helps to make skb_linearize do just that.

Some callers of skb_linearize also use it to copy cloned data, so it's
useful to have a new function skb_linearize_cow to copy the data if it's
either non-linear or cloned.

Last but not least, I've removed the gfp argument since nobody uses it
anymore.  If it's ever needed we can easily add it back.

Misc bugs fixed by this patch:

* via-velocity error handling (also, no SG => no frags)

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:16 -07:00
Herbert Xu
932ff279a4 [NET]: Add netif_tx_lock
Various drivers use xmit_lock internally to synchronise with their
transmission routines.  They do so without setting xmit_lock_owner.
This is fine as long as netpoll is not in use.

With netpoll it is possible for deadlocks to occur if xmit_lock_owner
isn't set.  This is because if a printk occurs while xmit_lock is held
and xmit_lock_owner is not set can cause netpoll to attempt to take
xmit_lock recursively.

While it is possible to resolve this by getting netpoll to use
trylock, it is suboptimal because netpoll's sole objective is to
maximise the chance of getting the printk out on the wire.  So
delaying or dropping the message is to be avoided as much as possible.

So the only alternative is to always set xmit_lock_owner.  The
following patch does this by introducing the netif_tx_lock family of
functions that take care of setting/unsetting xmit_lock_owner.

I renamed xmit_lock to _xmit_lock to indicate that it should not be
used directly.  I didn't provide irq versions of the netif_tx_lock
functions since xmit_lock is meant to be a BH-disabling lock.

This is pretty much a straight text substitution except for a small
bug fix in winbond.  It currently uses
netif_stop_queue/spin_unlock_wait to stop transmission.  This is
unsafe as an IRQ can potentially wake up the queue.  So it is safer to
use netif_tx_disable.

The hamradio bits used spin_lock_irq but it is unnecessary as
xmit_lock must never be taken in an IRQ handler.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:14 -07:00
James Morris
100468e9c0 [SECMARK]: Add CONNSECMARK xtables target
Add a new xtables target, CONNSECMARK, which is used to specify rules
for copying security marks from packets to connections, and for
copyying security marks back from connections to packets.  This is
similar to the CONNMARK target, but is more limited in scope in that
it only allows copying of security marks to and from packets, as this
is all it needs to do.

A typical scenario would be to apply a security mark to a 'new' packet
with SECMARK, then copy that to its conntrack via CONNMARK, and then
restore the security mark from the connection to established and
related packets on that connection.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:03 -07:00
James Morris
7c9728c393 [SECMARK]: Add secmark support to conntrack
Add a secmark field to IP and NF conntracks, so that security markings
on packets can be copied to their associated connections, and also
copied back to packets as required.  This is similar to the network
mark field currently used with conntrack, although it is intended for
enforcement of security policy rather than network policy.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:30:01 -07:00
James Morris
5e6874cdb8 [SECMARK]: Add xtables SECMARK target
Add a SECMARK target to xtables, allowing the admin to apply security
marks to packets via both iptables and ip6tables.

The target currently handles SELinux security marking, but can be
extended for other purposes as needed.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:59 -07:00
James Morris
984bc16cc9 [SECMARK]: Add secmark support to core networking.
Add a secmark field to the skbuff structure, to allow security subsystems to
place security markings on network packets.  This is similar to the nfmark
field, except is intended for implementing security policy, rather than than
networking policy.

This patch was already acked in principle by Dave Miller.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:57 -07:00
James Morris
c749b29fae [SECMARK]: Add SELinux exports
Add and export new functions to the in-kernel SELinux API in support of the
new secmark-based packet controls.

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:55 -07:00
David S. Miller
6f68dc3775 [NET]: Fix warnings after LSM-IPSEC changes.
Assignment used as truth value in xfrm_del_sa()
and xfrm_get_policy().

Wrong argument type declared for security_xfrm_state_delete()
when SELINUX is disabled.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:49 -07:00
Catherine Zhang
c8c05a8eec [LSM-IPsec]: SELinux Authorize
This patch contains a fix for the previous patch that adds security
contexts to IPsec policies and security associations.  In the previous
patch, no authorization (besides the check for write permissions to
SAD and SPD) is required to delete IPsec policies and security
assocations with security contexts.  Thus a user authorized to change
SAD and SPD can bypass the IPsec policy authorization by simply
deleteing policies with security contexts.  To fix this security hole,
an additional authorization check is added for removing security
policies and security associations with security contexts.

Note that if no security context is supplied on add or present on
policy to be deleted, the SELinux module allows the change
unconditionally.  The hook is called on deletion when no context is
present, which we may want to change.  At present, I left it up to the
module.

LSM changes:

The patch adds two new LSM hooks: xfrm_policy_delete and
xfrm_state_delete.  The new hooks are necessary to authorize deletion
of IPsec policies that have security contexts.  The existing hooks
xfrm_policy_free and xfrm_state_free lack the context to do the
authorization, so I decided to split authorization of deletion and
memory management of security data, as is typical in the LSM
interface.

Use:

The new delete hooks are checked when xfrm_policy or xfrm_state are
deleted by either the xfrm_user interface (xfrm_get_policy,
xfrm_del_sa) or the pfkey interface (pfkey_spddelete, pfkey_delete).

SELinux changes:

The new policy_delete and state_delete functions are added.

Signed-off-by: Catherine Zhang <cxzhang@watson.ibm.com>
Signed-off-by: Trent Jaeger <tjaeger@cse.psu.edu>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:45 -07:00
Alexey Dobriyan
338fcf9886 [IPV4] igmp: Fixup struct ip_mc_list::multiaddr type
All users except two expect 32-bit big-endian value. One is of

	->multiaddr = ->multiaddr

variety. And last one is "%08lX".

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:37 -07:00
Patrick McHardy
ae5b7d8ba2 [NETFILTER]: Add SIP connection tracking helper
Add SIP connection tracking helper. Originally written by
Christian Hentschel <chentschel@arnet.com.ar>, some cleanup, minor
fixes and bidirectional SIP support added by myself.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:15 -07:00
Jing Min Zhao
c0d4cfd96d [NETFILTER]: H.323 helper: Add support for Call Forwarding
Signed-off-by: Jing Min Zhao <zhaojingmin@users.sourceforge.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:11 -07:00
Patrick McHardy
3726add766 [NETFILTER]: ctnetlink: fix NAT configuration
The current configuration only allows to configure one manip and overloads
conntrack status flags with netlink semantic.

Signed-off-by: Patrick Mchardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:29:01 -07:00
Eric Leblond
997ae831ad [NETFILTER]: conntrack: add fixed timeout flag in connection tracking
Add a flag in a connection status to have a non updated timeout.
This permits to have connection that automatically die at a given
time.

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:59 -07:00
Patrick McHardy
39a27a35c5 [NETFILTER]: conntrack: add sysctl to disable checksumming
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:57 -07:00
Patrick McHardy
f3389805e5 [NETFILTER]: x_tables: add statistic match
Add statistic match which is a combination of the nth and random matches.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:51 -07:00
Patrick McHardy
62b7743483 [NETFILTER]: x_tables: add quota match
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:49 -07:00
Herbert Xu
b59f45d0b2 [IPSEC] xfrm: Abstract out encapsulation modes
This patch adds the structure xfrm_mode.  It is meant to represent
the operations carried out by transport/tunnel modes.

By doing this we allow additional encapsulation modes to be added
without clogging up the xfrm_input/xfrm_output paths.

Candidate modes include 4-to-6 tunnel mode, 6-to-4 tunnel mode, and
BEET modes.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:28:39 -07:00
Michael Chan
30b6c28d2a [TG3]: Add 5786 PCI ID
Add PCI ID for BCM5786 which is a variant of 5787.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:26:24 -07:00
Chris Leech
9593782585 [I/OAT]: Add a sysctl for tuning the I/OAT offloaded I/O threshold
Any socket recv of less than this ammount will not be offloaded

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:25:54 -07:00
Chris Leech
97fc2f0848 [I/OAT]: Structure changes for TCP recv offload to I/OAT
Adds an async_wait_queue and some additional fields to tcp_sock, and a
dma_cookie_t to sk_buff.

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:25:48 -07:00
Chris Leech
de5506e155 [I/OAT]: Utility functions for offloading sk_buff to iovec copies
Provides for pinning user space pages in memory, copying to iovecs,
and copying from sk_buffs including fragmented and chained sk_buffs.

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:25:46 -07:00
Chris Leech
db21733488 [I/OAT]: Setup the networking subsystem as a DMA client
Attempts to allocate per-CPU DMA channels

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:24:58 -07:00
David S. Miller
57c651f74c [I/OAT]: Move PCI_DEVICE_ID_INTEL_IOAT to linux/pci_ids.h
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:18:50 -07:00
Chris Leech
c13c8260da [I/OAT]: DMA memcpy subsystem
Provides an API for offloading memory copies to DMA devices

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-17 21:18:43 -07:00
Len Brown
d42510a0f5 Pull bugzilla-5737 into release branch
Conflicts:

	arch/x86_64/kernel/acpi/processor.c
2006-06-15 21:39:25 -04:00
Artem B. Bityutskiy
783ed81ff3 [MTD] assume mtd->writesize is 1 for NOR flashes
Signed-off-by: Artem B. Bityitskiy
2006-06-14 19:53:44 +04:00
Jeff Garzik
b5ed7639c9 Merge branch 'master' into upstream 2006-06-13 20:29:04 -04:00
Jeff Garzik
db9ca58035 Merge branch 'master' into upstream 2006-06-13 20:28:05 -04:00
Tejun Heo
f0eb62b81d [PATCH] libata: add host_set->next for legacy two host_sets case, take #3
For a legacy ATA controller, libata registers two separate host sets.
There was no connection between the two hosts making it impossible to
traverse all ports related to the controller.  This patch adds
host_set->next which points to the second host_set and makes
ata_pci_remove_one() remove all associated host_sets.

* On device removal, all ports hanging off the device are properly
  detached.  Prior to this patch, ports on the first host_set weren't
  detached casuing oops on driver unloading.

* On device removal, both host_sets are properly freed

This will also be used by new power management code to suspend and
resume all ports of a controller.  host_set/port representation will
be improved to handle legacy controllers better and this host_set
linking will go away with it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-06-12 10:23:21 -04:00
Paul Mackerras
7a0c58d051 Merge branch 'merge' 2006-06-12 17:53:34 +10:00
Jeff Garzik
3b01b8af24 libata: fix build, by adding required workqueue member to port struct 2006-06-12 00:22:04 -04:00
zhao, forrest
3057ac3c1a [PATCH] Snoop SET FEATURES - WRITE CACHE ENABLE/DISABLE command(v5)
This patch makes libata snoop 'SET FEATURES - WRITE CACHE
ENABLE/DISABLE' command, executing requisite revalidation processes
to update cached data.

Signed-off-by: Forrest Zhao <forrest.zhao@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-06-12 00:18:35 -04:00
Jeff Garzik
fec69a9748 Merge branch 'upstream-fixes' into upstream
Conflicts:

	drivers/scsi/sata_sil24.c
2006-06-11 23:04:37 -04:00
akpm@osdl.org
0ce030395b [PATCH] PCI: fix pciehp compile issue when CONFIG_ACPI is not enabled
Fix build error when CONFIG_ACPI not defined

Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-11 14:02:27 -07:00
Tejun Heo
9a9c77dc4c [PATCH] libata: cosmetic change in struct ata_port
Cosmetic change in struct ata_port.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-06-11 11:19:00 +09:00
Christoph Hellwig
8d7feac3c7 [SCSI] remove RQ_SCSI_* flags
The RQ_SCSI_* flags are a vestiage of a long past history.  The EH code
still sets them but we never make use of that information.  The other
users is pluto.c which never had a chance to work but needs to be kept
compiling to keep Davem happy, so copy over the definition there.

We could probably get rid of RQ_ACTIVE/RQ_INACTIVE aswell with some
work, there's only two more or less bogus looking uses in ubd and scsi.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-06-10 16:25:21 -05:00
Markus Lidel
57a62fed87 [PATCH] I2O: Bugfixes to get I2O working again
From: Markus Lidel <Markus.Lidel@shadowconnect.com>

- Fixed locking of struct i2o_exec_wait in Executive-OSM

- Removed LCT Notify in i2o_exec_probe() which caused freeing memory and
  accessing freed memory during first enumeration of I2O devices

- Added missing locking in i2o_exec_lct_notify()

- removed put_device() of I2O controller in i2o_iop_remove() which caused
  the controller structure get freed to early

- Fixed size of mempool in i2o_iop_alloc()

- Fixed access to freed memory in i2o_msg_get()

See http://bugzilla.kernel.org/show_bug.cgi?id=6561

Signed-off-by: Markus Lidel <Markus.Lidel@shadowconnect.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-10 11:02:05 -07:00
Sam Ravnborg
b817f6feff kbuild: check license compatibility when building modules
Modules that uses GPL symbols can no longer be build with kbuild,
the build will fail during the modpost step.
When a GPL-incompatible module uses a EXPORT_SYMBOL_GPL_FUTURE symbol
then warn during modpost so author are actually notified.

The actual license compatibility check is shared with the kernel
to make sure it is in sync.

Patch originally from: Andreas Gruenbacher <agruen@suse.de> and
Ram Pai <linuxram@us.ibm.com>

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2006-06-09 21:53:55 +02:00
Trond Myklebust
28df955a2a NLM: Fix reclaim races
Currently it is possible for a task to remove its locks at the same time as
the NLM recovery thread is trying to recover them. This quickly leads to an
Oops.
Protect the locks using an rw semaphore while they are being recovered.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:40:27 -04:00
Trond Myklebust
5046791417 NLM: sem to mutex conversion
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:40:24 -04:00
Marc Eshel
3134cbec5e locks.c: add the fl_owner to nlm_compare_locks
Add the fl_owner to NLM compare locks. Since two different client can
present the same pid to the server it is not enough to distinguish locks
from different clients. The fl_owner field is a pointer to the struct
nlm_host which is unique for each client.

Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:40:20 -04:00
Manoj Naik
6b97fd3da1 NFSv4: Follow a referral
Respond to a moved error on NFS lookup by setting up the referral.
Note: We don't actually follow the referral during lookup/getattr, but
later when we detect fsid mismatch in inode revalidation (similar to the
processing done for cloning submounts). Referrals will have fake attributes
until they are actually followed or traversed.

Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:29 -04:00
Manoj Naik
9cdb3883c3 NFSv4: Ensure client submounts when following a referral
Set up mountpoint when hitting a referral on moved error by getting
fs_locations.

Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:28 -04:00
Manoj Naik
7aaa0b3bd4 NFSv4: convert fs-locations-components to conform to RFC3530
Use component4-style formats for decoding list of servers and pathnames in
fs_locations.

Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:23 -04:00
Trond Myklebust
683b57b435 NFSv4: Implement the fs_locations function call
NFSv4 allows for the fact that filesystems may be replicated across
several servers or that they may be migrated to a backup server in case of
failure of the primary server.
fs_locations is an NFSv4 operation for retrieving information about the
location of migrated and/or replicated filesystems.

Based on an initial implementation by Jiaying Zhang <jiayingz@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:22 -04:00
Trond Myklebust
8b23ea7bed RPC: Allow struc xdr_stream to read the page section of an xdr_buf
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:21 -04:00
Trond Myklebust
51d8fa6a10 NFS: Add timeout to submounts
Make automounted partitions expire using the mark_mounts_for_expiry()
function. The timeout is controlled via a sysctl.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:20 -04:00
Trond Myklebust
55a975937d NFS: Ensure the client submounts, when it crosses a server mountpoint.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:19 -04:00
Trond Myklebust
8b4bdcf899 NFS: Store the file system "fsid" value in the NFS super block.
This should enable us to detect if we are crossing a mountpoint in the
case where the server is exporting "nohide" mounts.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:19 -04:00
Trond Myklebust
8b512d9a88 VFS: Remove dependency of ->umount_begin() call on MNT_FORCE
Allow filesystems to decide to perform pre-umount processing whether or not
MNT_FORCE is set.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:18 -04:00
Trond Myklebust
5528f911b4 VFS: Add shrink_submounts()
Allow a submount to be marked as being 'shrinkable' by means of the
vfsmount->mnt_flags, and then add a function 'shrink_submounts()' which
attempts to recursively unmount these submounts.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:17 -04:00
Trond Myklebust
1f5ce9e93a VFS: Unexport do_kern_mount() and clean up simple_pin_fs()
Replace all module uses with the new vfs_kern_mount() interface, and fix up
simple_pin_fs().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:16 -04:00
Trond Myklebust
bb4a58bf46 VFS: Add GPL_EXPORTED function vfs_kern_mount()
do_kern_mount() does not allow the kernel to use private mount interfaces
without exposing the same interfaces to userland. The problem is that the
filesystem is referenced by name, thus meaning that it and its mount
interface must be registered in the global filesystem list.

vfs_kern_mount() passes the struct file_system_type as an explicit
parameter in order to overcome this limitation.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:15 -04:00
Trond Myklebust
d2ccddf042 NFS: Flesh out nfs_invalidate_page()
In the case of a call to truncate_inode_pages(), we should really try to
cancel any pending writes on the page.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:14 -04:00
Trond Myklebust
44b11874ff NFS: Separate metadata and page cache revalidation mechanisms
Separate out the function of revalidating the inode metadata, and
revalidating the mapping. The former may be called by lookup(),
and only really needs to check that permissions, ctime, etc haven't changed
whereas the latter needs only done when we want to read data from the page
cache, and may need to sync and then invalidate the mapping.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:09 -04:00
Trond Myklebust
38478b24e3 NFS: More page cache revalidation fixups
Whenever the directory changes, we want to make sure that we always
invalidate its page cache. Fix up update_changeattr() and
nfs_mark_for_revalidate() so that they do so.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:09 -04:00
Chuck Lever
0d0b5cb36f NFS: Optimize allocation of nfs_read/write_data structures
Clean up use of page_array, and fix an off-by-one error noticed by Tom
Talpey which causes kmalloc calls in cases where using the page_array
is sufficient.

Test plan:
Normal client functional testing with r/wsize=32768.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:07 -04:00
Trond Myklebust
73a3d07c10 NFS: Clean up inode metadata updates
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:04 -04:00
Anton Blanchard
651d765d0b [PATCH] Add a prctl to change the endianness of a process.
This new prctl is intended for changing the execution mode of the
processor, on processors that support both a little-endian mode and a
big-endian mode.  It is intended for use by programs such as
instruction set emulators (for example an x86 emulator on PowerPC),
which may find it convenient to use the processor in an alternate
endianness mode when executing translated instructions.

Note that this does not imply the existence of a fully-fledged ABI for
both endiannesses, or of compatibility code for converting system
calls done in the non-native endianness mode.  The program is expected
to arrange for all of its system call arguments to be presented in the
native endianness.

Switching between big and little-endian mode will require some care in
constructing the instruction sequence for the switch.  Generally the
instructions up to the instruction that invokes the prctl system call
will have to be in the old endianness, and subsequent instructions
will have to be in the new endianness.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-06-09 21:24:13 +10:00
Jens Axboe
bc1c116974 [PATCH] elevator switching race
There's a race between shutting down one io scheduler and firing up the
next, in which a new io could enter and cause the io scheduler to be
invoked with bad or NULL data.

To fix this, we need to maintain the queue lock for a bit longer.
Unfortunately we cannot do that, since the elevator init requires to be
run without the lock held.  This isn't easily fixable, without also
changing the mempool API.  So split the initialization into two parts,
and alloc-init operation and an attach operation.  Then we can
preallocate the io scheduler and related structures, and run the attach
inside the lock after we detach the old one.

This patch has survived 30 minutes of 1 second io scheduler switching
with a very busy io load.

Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-08 15:14:23 -07:00
Ralf Baechle
45b35a5ced [PATCH] Fix mempolicy.h build error
From: Ralf Baechle <ralf@linux-mips.org>

<linux/mempolicy.h> uses struct mm_struct and relies on a definition or
declaration somehow magically being dragged in which may result in a
build:

[...]
  CC      mm/mempolicy.o
In file included from mm/mempolicy.c:69:
include/linux/mempolicy.h:150: warning: ‘struct mm_struct’ declared inside parameter list
include/linux/mempolicy.h:150: warning: its scope is only this definition or declaration, which is probably not what you want
include/linux/mempolicy.h:175: warning: ‘struct mm_struct’ declared inside parameter list
mm/mempolicy.c:622: error: conflicting types for ‘do_migrate_pages’
include/linux/mempolicy.h:175: error: previous declaration of ‘do_migrate_pages’ was here
mm/mempolicy.c:1661: error: conflicting types for ‘mpol_rebind_mm’
include/linux/mempolicy.h:150: error: previous declaration of ‘mpol_rebind_mm’ was here
make[1]: *** [mm/mempolicy.o] Error 1
make: *** [mm] Error 2
[ralf@denk linux-ip35]$

Including <linux/sched.h> is a step into direction of include hell so
fixed by adding a forward declaration of struct mm_struct instead.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-08 15:12:21 -07:00
Jeff Garzik
d15a88fc21 Merge branch 'master' into upstream 2006-06-08 15:24:46 -04:00
Andrew Morton
2d7b20c188 [PATCH] m48t86: ia64 build fix
From: Andrew Morton <akpm@osdl.org>

drivers/rtc/rtc-m48t86.c: In function `m48t86_rtc_read_time':
drivers/rtc/rtc-m48t86.c:51: error: structure has no member named `ia64_mv'
drivers/rtc/rtc-m48t86.c:55: error: structure has no member named `ia64_mv'
drivers/rtc/rtc-m48t86.c:56: error: structure has no member named `ia64_mv'
drivers/rtc/rtc-m48t86.c:57: error: structure has no member named `ia64_mv'
drivers/rtc/rtc-m48t86.c:58: error: structure has no member named `ia64_mv'
drivers/rtc/rtc-m48t86.c:60: error: structure has no member named `ia64_mv'

readb() and writeb() are macros on ia64.

Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-05 12:29:17 -07:00
Ralf Baechle
93ff66bf1e [PATCH] Sparsemem build fix
From: Ralf Baechle <ralf@linux-mips.org>

<linux/mmzone.h> uses PAGE_SIZE, PAGE_SHIFT from <asm/page.h> without
including that header itself.  For some sparsemem configurations this may
result in build errors like:

  CC      init/initramfs.o
In file included from include/linux/gfp.h:4,
                 from include/linux/slab.h:15,
                 from include/linux/percpu.h:4,
                 from include/linux/rcupdate.h:41,
                 from include/linux/dcache.h:10,
                 from include/linux/fs.h:226,
                 from init/initramfs.c:2:
include/linux/mmzone.h:498:22: warning: "PAGE_SHIFT" is not defined
In file included from include/linux/gfp.h:4,
                 from include/linux/slab.h:15,
                 from include/linux/percpu.h:4,
                 from include/linux/rcupdate.h:41,
                 from include/linux/dcache.h:10,
                 from include/linux/fs.h:226,
                 from init/initramfs.c:2:
include/linux/mmzone.h:526: error: `PAGE_SIZE' undeclared here (not in a function)
include/linux/mmzone.h: In function `__pfn_to_section':
include/linux/mmzone.h:573: error: `PAGE_SHIFT' undeclared (first use in this function)
include/linux/mmzone.h:573: error: (Each undeclared identifier is reported only once
include/linux/mmzone.h:573: error: for each function it appears in.)
include/linux/mmzone.h: In function `pfn_valid':
include/linux/mmzone.h:578: error: `PAGE_SHIFT' undeclared (first use in this function)
make[1]: *** [init/initramfs.o] Error 1
make: *** [init] Error 2

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Seems-reasonable-to: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-05 12:29:16 -07:00
David Woodhouse
2f3243aebd [RBTREE] Switch rb_colour() et al to en_US spelling of 'color' for consistency
Since rb_insert_color() is part of the _public_ API, while the others are
purely internal, switch to be consistent with that.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-06-05 20:19:05 +01:00
David Woodhouse
d27317657a Switch to __s32 types in joystick.h instead of C99 types for consistency.
The rest of the file uses these types instead of C99 types.

Acked-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-06-03 00:27:53 +01:00
David Woodhouse
7b1c6ca73a Add <sys/types.h> to headers included for userspace in <linux/input.h>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Vojtech Pavlik <vojtech@suse.cz>
2006-06-01 12:49:30 +01:00
Andrew Morton
760f1fce03 [PATCH] revert "swsusp add check for suspension of X controlled devices"
From: Andrew Morton <akpm@osdl.org>

Revert commit ff4da2e262d2509fe1bacff70dd00934be569c66.

It broke APM suspend, probably because APM doesn't switch back to a VT
when suspending.

Tracked down by Matt Mackall <mpm@selenic.com>

Rafael sayeth:
  "It only fixed the theoretical issue that a quick-handed user could
   switch to X after processes have been frozen and before the devices
   are suspended.

   With the current userland suspend tools it shouldn't be necessary."

Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-31 16:27:11 -07:00
Tejun Heo
52783c5dcc [PATCH] libata-hp: killl ops->probe_reset
Now that all drivers implementing new EH are converted to new probing
mechanism, ops->probe_reset doesn't have any user.  Kill it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-31 18:28:22 +09:00
Tejun Heo
720ba12620 [PATCH] libata-hp: update unload-unplug
Update unload unplug - driver unloading / PCI removal.  This is done
by ata_port_detach() which short-circuits EH, disables all devices and
freezes the port.  With this patch, EH and unloading/unplugging are
properly synchronized.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-31 18:28:13 +09:00
Tejun Heo
83c47bcb3c [PATCH] libata-hp: implement warmplug
Implement warmplug.  User-initiated unplug can be detected by
hostt->slave_destroy() and plug by transportt->user_scan().  This
patch only implements the two callbacks.  The next function will hook
them.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-31 18:28:07 +09:00
Tejun Heo
580b210232 [PATCH] libata-hp: implement SCSI part of hotplug
Implement SCSI part of hotplug.

This must be done in a separate context as SCSI makes use of EH during
probing.  SCSI scan fails silently if EH is in progress.  In such
cases, libata pauses briefly and retries until every device is
attached.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-31 18:28:05 +09:00
Tejun Heo
084fe639b8 [PATCH] libata-hp: implement hotplug
Implement ATA part of hotplug.  To avoid probing broken devices over
and over again, disabled devices are not automatically detached.  They
are detached only if probing is requested for the device or the
associated port is offline.  Also, to avoid infinite probing loop,
Each device is probed only once per EH run.

As SATA PHY status is fragile, devices are detached only after it has
used up its recovery chances unless explicitly requested by LLDD or
user (LLDD may request direct detach if, for example, it supports cold
presence detection).

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-31 18:28:03 +09:00
Tejun Heo
9a1004d0c1 [PATCH] libata: export ata_hsm_move()
ata_hsm_move() will be used by LLDDs which depend on standard PIO HSM
but implement their own interrupt handlers.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-31 18:27:52 +09:00
Tejun Heo
f5914a461e [PATCH] libata-hp-prep: add prereset() method and implement ata_std_prereset()
With hotplug, every reset might be a probing reset and thus something
similar to probe_init() is needed.  prereset() method is called before
a series of resets to a port and is the counterpart of postreset().
prereset() can tell EH to use different type of reset or skip reset by
modifying ehc->i.action.

This patch also implements ata_std_prereset().  Most controllers
should be able to use this function directly or with some wrapping.
After hotplug, different controllers need different actions to resume
the PHY and detect the newly attached device.  Controllers can be
categorized as follows.

* Controllers which can wait for the first D2H FIS after hotplug.
  Note that if the waiting is implemented by polling TF status, there
  needs to be a way to set BSY on PHY status change.  It can be
  implemented by hardware or with the help of the driver.

* Controllers which can wait for the first D2H FIS after sending
  COMRESET.  These controllers need to issue COMRESET to wait for the
  first FIS.  Note that the received D2H FIS could be the first D2H
  FIS after POR (power-on-reset) or D2H FIS in response to the
  COMRESET.  Some controllers use COMRESET as TF status
  synchronization point and clear TF automatically (sata_sil).

* Controllers which cannot wait for the first D2H FIS reliably.
  Blindly issuing SRST to spinning-up device often results in command
  issue failure or timeout, causing extended delay.  For these
  controllers, ata_std_prereset() explicitly waits ATA_SPINUP_WAIT
  (currently 8s) to give newly attached device time to spin up, then
  issues reset.  Note that failing to getting ready in ATA_SPINUP_WAIT
  is not critical.  libata will retry.  So, the timeout needs to be
  long enough to spin up most devices.

LLDDs can tell ata_std_prereset() which of above action is needed with
ATA_FLAG_HRST_TO_RESUME and ATA_FLAG_SKIP_D2H_BSY flags.  These flags
are PHY-specific property and will be moved to ata_link later.

While at it, this patch unifies function typedef's such that they all
have named arguments.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-31 18:27:48 +09:00
Tejun Heo
d7bb4cc757 [PATCH] libata-hp-prep: implement sata_phy_debounce()
With hotplug, PHY always needs to be debounced before a reset as any
reset might find new devices.  Extract PHY waiting code from
sata_phy_resume() and extend it to include SStatus debouncing.  Note
that sata_phy_debounce() is superset of what used to be done inside
sata_phy_resume().

Three default debounce timing parameters are defined to be used by
hot/boot plug.  As resume failure during probing will be properly
handled as errors, timeout doesn't have to be long as before.
probeinit() uses the same timeout to retain the original behavior.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-31 18:27:46 +09:00
Tejun Heo
3edebac41b [PATCH] libata-hp-prep: store attached SCSI device
Add device persistent field dev->sdev and store the attached SCSI
device.  With hotplug, libata needs to know the attached SCSI device
to offline and detach it, but scsi_device_lookup() cannot be used
because libata will reuse SCSI ID numbers - dead but not gone devices
(due to zombie opens, etc...) interfere with the lookup.

dev->sdev doesn't hold reference to the SCSI device.  It's cleared
when the SCSI device goes away.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-31 18:27:40 +09:00
Tejun Heo
5a04bf4bef [PATCH] libata-hp-prep: implement ap->hw_sata_spd_limit
Add ap->hw_sata_spd_limit and initialize it once during the boot
initialization (or driver load initialization).  ap->sata_spd_limit is
reset to ap->hw_sata_spd_limit on hotplug.  This prevents spd limits
introduced by earlier devices from affecting new devices.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-31 18:27:38 +09:00
Tejun Heo
72fa4b742b [PATCH] libata-hp-prep: make some ata_device fields persistent
Lifetimes of some fields span over device plugging/unplugging.  This
patch moves such persistent fields to the top of ata_device and
separate them with ATA_DEVICE_CLEAR_OFFSET.  Fields above the offset
are initialized once during host initializatino while all other fields
are cleared before hotplugging.  Currently ->ap, devno and part of
flags are persistent.

Note that flags is partially cleared while holding host_set lock.
This is to synchronize with later warm plug implementation which will
record hotplug request in dev->flags.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-31 18:27:32 +09:00
Tejun Heo
abdda7331d [PATCH] libata-hp-prep: add flags and eh_info/context fields for hotplug
Add hotplug related flags and eh_info/context fields.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-31 18:27:29 +09:00
Tejun Heo
c6cf9e99d1 [PATCH] libata: implement ata_eh_wait()
Implement ata_eh_wait().  On return from this function, it's
guaranteed that the EH which was pending or in progress when the
function was called is complete - including the tailing part of SCSI
EH.  This will be used by hotplug and others to synchronize with EH.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-31 18:27:27 +09:00
Tejun Heo
7395acb2c8 [PATCH] libata: shift host flag constants
Nudge host flag constants to make a room after ATA_FLAG_EH_PENDING.
New EH flag will be added.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-31 18:27:25 +09:00
Linus Torvalds
e60a48f5ab Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/agpgart
* master.kernel.org:/pub/scm/linux/kernel/git/davej/agpgart:
  [AGPGART] VIA PT880 Ultra support.
  [AGPGART] Fix Nforce3 suspend on amd64.
  [AGPGART] Enable SIS AGP driver on x86-64 for EM64T systems
2006-05-30 11:54:32 -07:00
Richard Purdie
ed8f9e2f04 Input: change from numbered to named switches
Remove the numbered SW_* entries from the input system and assign names
to the existing users.

Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2006-05-29 23:31:03 -04:00
Matthew Garrett
f39b25bed3 Input: add KEY_BATTERY keycode
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2006-05-29 23:27:39 -04:00
Thomas Gleixner
f1a28c0284 [MTD] NAND Expose the new raw mode function and status info to userspace
The raw read/write access to NAND (without ECC) has been changed in the
NAND rework. Expose the new way - setting the file mode via ioctl - to
userspace. Also allow to read out the ecc statistics information so userspace
tools can see that bitflips happened and whether errors where correctable
or not. Also expose the number of bad blocks for the partition, so nandwrite
can check if the data fits into the parition before writing to it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-30 00:37:34 +02:00
Thomas Gleixner
8593fbc68b [MTD] Rework the out of band handling completely
Hopefully the last iteration on this!

The handling of out of band data on NAND was accompanied by tons of fruitless
discussions and halfarsed patches to make it work for a particular
problem. Sufficiently annoyed by I all those "I know it better" mails and the
resonable amount of discarded "it solves my problem" patches, I finally decided
to go for the big rework. After removing the _ecc variants of mtd read/write
functions the solution to satisfy the various requirements was to refactor the
read/write _oob functions in mtd.

The major change is that read/write_oob now takes a pointer to an operation
descriptor structure "struct mtd_oob_ops".instead of having a function with at
least seven arguments.

read/write_oob which should probably renamed to a more descriptive name, can do
the following tasks:

- read/write out of band data
- read/write data content and out of band data
- read/write raw data content and out of band data (ecc disabled)

struct mtd_oob_ops has a mode field, which determines the oob handling mode.

Aside of the MTD_OOB_RAW mode, which is intended to be especially for
diagnostic purposes and some internal functions e.g. bad block table creation,
the other two modes are for mtd clients:

MTD_OOB_PLACE puts/gets the given oob data exactly to/from the place which is
described by the ooboffs and ooblen fields of the mtd_oob_ops strcuture. It's
up to the caller to make sure that the byte positions are not used by the ECC
placement algorithms.

MTD_OOB_AUTO puts/gets the given oob data automaticaly to/from the places in
the out of band area which are described by the oobfree tuples in the ecclayout
data structre which is associated to the devicee.

The decision whether data plus oob or oob only handling is done depends on the
setting of the datbuf member of the data structure. When datbuf == NULL then
the internal read/write_oob functions are selected, otherwise the read/write
data routines are invoked.

Tested on a few platforms with all variants. Please be aware of possible
regressions for your particular device / application scenario

Disclaimer: Any whining will be ignored from those who just contributed "hot
air blurb" and never sat down to tackle the underlying problem of the mess in
the NAND driver grown over time and the big chunk of work to fix up the
existing users. The problem was not the holiness of the existing MTD
interfaces. The problems was the lack of time to go for the big overhaul. It's
easy to add more mess to the existing one, but it takes alot of effort to go
for a real solution.

Improvements and bugfixes are welcome!

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-29 15:06:51 +02:00
Thomas Gleixner
f4a43cfcec [MTD] Remove silly MTD_WRITE/READ macros
Most of those macros are unused and the used ones just obfuscate
the code. Remove them and fixup all users.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-29 15:06:50 +02:00
Thomas Gleixner
5bd34c091a [MTD] NAND Replace oobinfo by ecclayout
The nand_oobinfo structure is not fitting the newer error correction
demands anymore. Replace it by struct nand_ecclayout and fixup the users
all over the place. Keep the nand_oobinfo based ioctl for user space
compability reasons.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-29 15:06:50 +02:00
Thomas Gleixner
ff268fb879 [MTD] NAND Consolidate oobinfo handling
The info structure for out of band data was copied into
the mtd structure. Make it a pointer and remove the ability
to set it from userspace. The position of ecc bytes is
defined by the hardware and should not be changed by software.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-29 15:06:49 +02:00
Thomas Gleixner
8be834f762 [MTD] NAND Fix platform structure and NDFC driver
The platform structure was lacking an oobinfo field.
The NDFC driver had some remains from another tree.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-29 15:06:49 +02:00
Alan Cox
75e995855f [PATCH] libata: add pio_data_xfer_noirq
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-05-26 22:02:23 -04:00
Alan Cox
622b20fcb8 [PATCH] PCI identifiers for the pata_via update
These IDs are also used by the drivers/ide/pci changes submitted by
VIA.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-05-26 22:02:23 -04:00
Ayaz Abdulla
4c0c2fd486 [PATCH] pci_ids: add new device ids
This patch adds new device ids for MCP61 and MCP65 chips.

Signed-Off-By: Ayaz Abdulla <aabdulla@nvidia.com>

Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-05-26 21:37:54 -04:00
Thomas Gleixner
f75e5097ef [MTD] NAND modularize write function
Modularize the write function and reorganaize the internal buffer
management. Remove obsolete chip options and fixup all affected
users.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-26 18:52:08 +02:00
Thomas Gleixner
f5bbdacc41 [MTD] NAND Modularize read function
Split the core of the read function out and implement
seperate handling functions for software and hardware
ECC.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-25 12:45:27 +01:00
Thomas Gleixner
9577f44a89 [MTD] NAND Add read/write function pointers to struct nand_ecc_ctrl
Add read/write function pointers to struct nand_ecc_ctrl to
prepare the modulaization of nand_read/write functions. The
current implementation handles every type of ecc mode
software/hardware and all kinds of strange ecc placement
schemes in one switch/if construct. Thats too complex to
maintain and too inflexible to expand. Modularization will
also shorten the code pathes of the read/write functions.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-25 12:45:27 +01:00
Thomas Gleixner
7fac464868 [MTD] Add ECC statistics to struct mtd_info
FLASH - especially NAND FLASH - will become less reliable
and bit flips more likely. Add an ECC statistics struct
to struct mtd_info to keep track of this.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-25 12:45:27 +01:00
Thomas Gleixner
7a30601b3a [MTD] NAND Introduce NAND_NO_READRDY option
The nand driver has a superflous read ready / command
delay in the read functions. This was added to handle
chips which have an automatic read forward. Newer
chips do not have this functionality anymore. Add this
option to avoid the delay / I/O operation. Mark all
large page chips with the new option flag.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-25 12:45:26 +01:00
David Woodhouse
66643de455 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:

	include/asm-powerpc/unistd.h
	include/asm-sparc/unistd.h
	include/asm-sparc64/unistd.h

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-24 09:22:21 +01:00
Alan Cox
957d2df180 [PATCH] libata: Remove obsolete flag
ATA_FLAG_IRQ_MASK was added when I did the original data transfer with
IRQ masked bits for PIO. It has since been replaced by ->pio_data_xfer
methods so should be removed so nobody uses it by mistake thinking it
still works.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-05-24 02:09:56 -04:00
Alan Cox
a6b2c5d475 [PATCH] PATCH: libata. Add ->data_xfer method
We need to pass the device in order to do per device checks such as
32bit I/O enables. With the changes to include dev->ap we now don't have
to add parameters however just clean them up. Also add data_xfer methods
to the existing drivers except ata_piix (which is in the other block of
patches). If you reject the piix one just add a data_xfer to it...

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-05-24 01:58:54 -04:00
Andrew Chew
4c5c81613b [PATCH] sata_nv: Add MCP61 support
Added MCP61 SATA support to sata_nv.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-05-24 01:34:10 -04:00
Jeff Garzik
26e27cd424 Merge branch 'master' into upstream 2006-05-24 01:32:42 -04:00
Brice Goglin
0da34b6dfe [PATCH] Add Myri-10G Ethernet driver
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Andrew J. Gallatin <gallatin@myri.com>

 drivers/net/Kconfig                            |   17
 drivers/net/Makefile                           |    1
 drivers/net/myri10ge/Makefile                  |    5
 drivers/net/myri10ge/myri10ge.c                | 2851 +++++++++++++++
 drivers/net/myri10ge/myri10ge_mcp.h            |  205 +
 drivers/net/myri10ge/myri10ge_mcp_gen_header.h |   58
 include/linux/pci_ids.h                        |    1
 7 files changed, 3138 insertions(+)
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-05-24 00:27:31 -04:00
Brice Goglin
3a720d726a [PATCH] Revive pci_find_ext_capability
This patch revives pci_find_ext_capability (has been disabled a couple month
ago since it was not used anywhere. See http://lkml.org/lkml/2006/1/20/247).
It will now be used by the myri10ge driver.

Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Andrew J. Gallatin <gallatin@myri.com>

 drivers/pci/pci.c   |    3 +--
 include/linux/pci.h |    2 ++
 2 files changed, 3 insertions(+), 2 deletions(-)
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-05-24 00:27:31 -04:00
Thomas Gleixner
cad74f2c38 [MTD] NAND remove write_byte/word function from nand_chip
The previous change of the command / hardware control allows to
remove the write_byte/word functions completely, as their only
user were nand_command and nand_command_lp.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-23 23:28:48 +02:00
Thomas Gleixner
7abd3ef987 [MTD] Refactor NAND hwcontrol to cmd_ctrl
The hwcontrol function enforced a step by step state machine
for any kind of hardware chip access. Let the hardware driver
know which control bits are set and inform it about a change
of the control lines. Let the hardware driver write out the
command and address bytes directly. This gives a peformance
advantage for address bus controlled chips and simplifies the
quirks in the hardware drivers.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-23 23:25:53 +02:00
Jeff Garzik
9528454f9c Merge branch 'master' into upstream 2006-05-23 17:20:58 -04:00
David Woodhouse
0f04108237 [PATCH] powerpc: wire up sys_[gs]et_robust_list
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Paul Mackerras <paulus@samba.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-23 10:35:32 -07:00
Andrew Morton
e46e490368 [PATCH] sys_sync_file_range(): move exported flags outside __KERNEL__
These flags are needed by userspace - move them outside __KERNEL__

(Pointed out by dwmw2)

Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-23 10:35:31 -07:00
Thomas Gleixner
9223a456da [MTD] Remove read/write _ecc variants
MTD clients are agnostic of FLASH which needs ECC suppport.
Remove the functions and fixup the callers.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-23 17:21:03 +02:00
Thomas Gleixner
2528e8cdf3 [MTD] Remove readv/readv_ecc
These functions were never implemented and added only bloat to
partition and concat code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-23 16:10:00 +02:00
Thomas Gleixner
9d8522df37 [MTD] Remove nand writev support
NAND writev(_ecc) support is not longer necessary. Remove it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-23 16:06:03 +02:00
Thomas Gleixner
9a57d470fd [MTD] NAND ECC hwctl function has no return value
Fix the broken prototype

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-23 15:58:23 +02:00
Thomas Gleixner
4cbb9b80e1 Merge branch 'master' of /home/tglx/work/kernel/git/mtd-2.6/
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-23 12:37:31 +02:00
Thomas Gleixner
6dfc6d250d [MTD] NAND modularize ECC
First step of modularizing ECC support.
- Move ECC related functionality into a seperate embedded data structure
- Get rid of the hardware dependend constants to simplify new ECC models

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-23 12:00:46 +02:00
Thomas Gleixner
58dd8f2bfd [MTD] NAND consolidate data types
The NAND driver used a mix of unsigned char, u_char amd uint8_t
data types. Consolidate to uint8_t usage

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-23 11:52:35 +02:00
Thomas Gleixner
2c0a2bed92 [MTD] NAND whitespace and formatting cleanup
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-23 11:50:56 +02:00
Thomas Gleixner
ce4c61f184 [MTD] Add support for NDFC NAND controller
NDFC NAND Flash controller is embedded in PPC EP44x SoCs.
Add platform driver based support.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-23 11:43:28 +02:00
Thomas Gleixner
41796c2ea9 [MTD] Add platform support for NAND
Add the data structures necessary to provide platform device support
for NAND

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-23 11:38:59 +02:00
Thomas Gleixner
a36ed2995c [MTD] Simplify NAND locking
Replace the chip lock by a the controller lock. For simple drivers a
dummy controller structure is created by the scan code.
This simplifies the locking algorithm in nand_get/release_chip().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2006-05-23 11:37:03 +02:00
Joern Engel
28318776a8 [MTD] Introduce writesize
At least two flashes exists that have the concept of a minimum write unit,
similar to NAND pages, but no other NAND characteristics.  Therefore, rename
the minimum write unit to "writesize" for all flashes, including NAND.

Signed-off-by: Joern Engel <joern@wh.fh-wedel.de>
2006-05-22 23:18:05 +02:00
Magnus Kessler
7dd1d9b85c [AGPGART] VIA PT880 Ultra support.
This patch enables agpgart on a Via "PT880 Ultra" based motherboard
(Asus P4V800D-X). The PCI ID of the PT880 Ultra is 0x0308 instead of
0x0258 of the PT880.

The patched via-agp passes testgart.

Signed-off-by: Magnus Kessler <Magnus.Kessler@gmx.net>
Signed-off-by: Dave Jones <davej@redhat.com>
2006-05-22 13:56:02 -04:00
Linus Torvalds
c9d20af62c Merge master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb
* master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (33 commits)
  V4L/DVB (3965): Fix CONFIG_VIDEO_VIVI=y build bug
  V4L/DVB (3964): Bt8xx/bttv-cards.c: fix off-by-one errors
  V4L/DVB (3914): Vivi build fix
  V4L/DVB (3912): Sparc32 vivi fix
  V4L/DVB (3832): Get_dvb_firmware: download nxt2002 firmware from new driver location
  V4L/DVB (3829): Fix frequency values in the ranges structures of the LG TDVS H06xF tuners
  V4L/DVB (3826): Saa7134: Missing 'break' in Terratec Cinergy 400 TV initialization
  V4L/DVB (3825): Remove broken 'fast firmware load' from cx25840.
  V4L/DVB (3819): Cxusb-bluebird: bug-fix: power down corrupts frontend
  V4L/DVB (3813): Add support for TCL M2523_5N_E tuner.
  V4L/DVB (3804): Tweak bandselect setup fox cx24123
  V4L/DVB (3803): Various correctness fixes to tuning.
  V4L/DVB (3797): Always wait for diseqc queue to become ready before transmitting a diseqc message
  V4L/DVB (3796): Add several debug messages to cx24123 code
  V4L/DVB (3795): Fix for CX24123 & low symbol rates
  V4L/DVB (3792): Kbuild: DVB_BT8XX must select DVB_ZL10353
  V4L/DVB (3790): Use after free in drivers/media/video/em28xx/em28xx-video.c
  V4L/DVB (3788): Fix compilation with V4L1_COMPAT
  V4L/DVB (3782): Removed uneeded stuff from pwc Makefile
  V4L/DVB (3775): Add VIVI Kconfig stuff
  ...
2006-05-21 18:31:53 -07:00
Bob Picco
e984bb43f7 [PATCH] Align the node_mem_map endpoints to a MAX_ORDER boundary
Andy added code to buddy allocator which does not require the zone's
endpoints to be aligned to MAX_ORDER.  An issue is that the buddy allocator
requires the node_mem_map's endpoints to be MAX_ORDER aligned.  Otherwise
__page_find_buddy could compute a buddy not in node_mem_map for partial
MAX_ORDER regions at zone's endpoints.  page_is_buddy will detect that
these pages at endpoints are not PG_buddy (they were zeroed out by bootmem
allocator and not part of zone).  Of course the negative here is we could
waste a little memory but the positive is eliminating all the old checks
for zone boundary conditions.

SPARSEMEM won't encounter this issue because of MAX_ORDER size constraint
when SPARSEMEM is configured.  ia64 VIRTUAL_MEM_MAP doesn't need the logic
either because the holes and endpoints are handled differently.  This
leaves checking alloc_remap and other arches which privately allocate for
node_mem_map.

Signed-off-by: Bob Picco <bob.picco@hp.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-21 12:59:22 -07:00
Adrian Bunk
1b81d6637d [PATCH] drivers/base/firmware_class.c: cleanups
- remove the following global function that is both unused and
  unimplemented:
  - register_firmware()

- make the following needlessly global function static:
  - firmware_class_uevent()

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-21 12:59:19 -07:00
Kumar Gala
ccf06998fe [PATCH] spi: add spi master driver for Freescale MPC83xx SPI controller
This driver supports the SPI controller on the MPC83xx SoC devices from
Freescale.  Note, this driver supports only the simple shift register SPI
controller and not the descriptor based CPM or QUICCEngine SPI controller.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-21 12:59:19 -07:00
David Woodhouse
0cfc7da3ff Merge git://git.infradead.org/jffs2-xattr-2.6
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-20 17:27:32 +01:00
Albert Lee
3655d1d323 [PATCH] libata: Fix the HSM error_mask mapping (was: Re: libata-tj and SMART)
Fix the HSM error_mask mapping.

Changes:
- Better mapping in ac_err_mask()
- In HSM_ST_FIRST ans HSM_ST state, check ATA_ERR|ATA_DF and map it to AC_ERR_DEV instead of AC_ERR_HSM.
- In HSM_ST_FIRST and HSM_ST state, map DRQ=1 ERR=1 to AC_ERR_HSM.
- For PIO data in and DRQ=1 ERR=1, add check after the junk data block is read.

Signed-off-by: Albert Lee <albertcc@tw.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-05-20 00:37:01 -04:00
Jeff Garzik
3d71b3b0b6 Merge branch 'upstream-fixes' into upstream
Conflicts:

	drivers/scsi/libata-core.c
2006-05-20 00:36:08 -04:00
Jeff Garzik
badc48e660 Merge branch 'master' into upstream 2006-05-20 00:03:38 -04:00
Pavel Pisa
2c171bf134 [ARM] 3531/1: i.MX/MX1 SD/MMC ensure, that clock are stopped before new command and cleanups
Patch from Pavel Pisa

There has been problems that for some paths that clock are not stopped
during new command programming and initiation. Result is issuing
of incorrect command to the card. Some other problems are cleaned too.
Noisy report of known ERRATUM #4 has been suppressed.

Signed-off-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-05-19 21:48:03 +01:00
David Woodhouse
aef9ab4784 [JFFS2] Support new device nodes
Device node major/minor numbers are just stored in the payload of a single
data node. Just extend that to 4 bytes and use new_encode_dev() for it.

We only use the 4-byte format if we _need_ to, if !old_valid_dev(foo).
This preserves backwards compatibility with older code as much as
possible. If we do make devices with major or minor numbers above 255, and
then mount the file system with the old code, it'll just read the first
two bytes and get the numbers wrong. If it comes to garbage-collect it,
it'll then write back those wrong numbers. But that's about the best we
can expect.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-19 00:28:49 +01:00
KaiGai Kohei
20a92fc74c Merge git://git.infradead.org/mtd-2.6 2006-05-19 00:43:53 +09:00
David Woodhouse
3ac6c7b445 Remove struct fddi_statistics from user view in <linux/if_fddi.h>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-16 23:25:37 +01:00
David Woodhouse
ba9627b85f [JFFS2] Repack some on-medium structures. ARM is weirder than I thought.
We have to pack at least the jint16_t structure, because otherwise it'll
be four bytes in size. Thankfully, we can do that and _not_ pack the
actual node structures, and the compiler still doesn't emit stupid code.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-16 23:03:08 +01:00
David Brownell
a020ed7521 [PATCH] SPI: busnum == 0 needs to work
We need to be able to have a "SPI bus 0" matching chip numbering; but
that number was wrongly used to flag dynamic allocation of a bus number.

This patch resolves that issue; now negative numbers trigger dynamic alloc.

It also updates the how-to-write-a-controller-driver overview to mention
this stuff.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-05-16 14:33:57 -07:00
David Brownell
ccf77cc4af [PATCH] SPI: devices can require LSB-first encodings
Add spi_device hook for LSB-first word encoding, and update all the
(in-tree) controller drivers to reject such devices.  Eventually,
some controller drivers will be updated to support lsb-first encodings
on the wire; no current drivers need this.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-05-16 14:33:57 -07:00
Kumar Gala
ff9f4771b5 [PATCH] SPI: Renamed bitbang_transfer_setup to spi_bitbang_setup_transfer and export it
Renamed bitbang_transfer_setup to follow convention of other exported symbols
from spi-bitbang.  Exported spi_bitbang_setup_transfer to allow users of
spi-bitbang to use the function in their own setup_transfer.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Cc: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-05-16 14:33:57 -07:00
David Brownell
747d844ee9 [PATCH] SPI: spi whitespace fixes
This removes superfluous whitespace in the <linux/spi/spi.h> header.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-05-16 14:33:56 -07:00
Imre Deak
4cff33f94f [PATCH] SPI: per-transfer overrides for wordsize and clocking
Some protocols (like one for some bitmap displays) require different clock
speed or word size settings for each transfer in an SPI message. This adds
those parameters to struct spi_transfer.  They are to be used when they are
nonzero; otherwise the defaults from spi_device are to be used.

The patch also adds a setup_transfer callback to spi_bitbang, uses it for
messages that use those overrides, and implements it so that the pure
bitbanging code can help resolve any questions about how it should work.

Signed-off-by: Imre Deak <imre.deak@nokia.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-05-16 14:33:56 -07:00
David Woodhouse
18594822fc Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-16 01:19:52 +01:00
Hua Zhong
e6333fd4dd [PATCH] fix can_share_swap_page() when !CONFIG_SWAP
can_share_swap_page() is used to check if the page has the last reference.
This avoids allocating a new page for COW if it's the last page.

However, if CONFIG_SWAP is not set, can_share_swap_page() is defined as 0,
thus always causes a copy for the last COW page.  The below simple patch
fixes it.

Signed-off-by: Hua Zhong <hzhong@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-15 11:20:56 -07:00
Mike Kravetz
39d24e6426 [PATCH] add slab_is_available() routine for boot code
slab_is_available() indicates slab based allocators are available for use.
SPARSEMEM code needs to know this as it can be called at various times
during the boot process.

Signed-off-by: Mike Kravetz <kravetz@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-15 11:20:56 -07:00
Trent Piepho
5e37661389 [PATCH] symbol_put_addr() locks kernel
Even since a previous patch:

Fix race between CONFIG_DEBUG_SLABALLOC and modules
Sun, 27 Jun 2004 17:55:19 +0000 (17:55 +0000)
http://www.kernel.org/git/?p=linux/kernel/git/torvalds/old-2.6-bkcvs.git;a=commit;h=92b3db26d31cf21b70e3c1eadc56c179506d8fbe

The function symbol_put_addr() will deadlock the kernel.

symbol_put_addr() would acquire modlist_lock, then while holding the lock call
two functions kernel_text_address() and module_text_address() which also try
to acquire the same lock.  This deadlocks the kernel of course.

This patch changes symbol_put_addr() to not acquire the modlist_lock, it
doesn't need it since it never looks at the module list directly.  Also, it
now uses core_kernel_text() instead of kernel_text_address().  The latter has
an additional check for addr inside a module, but we don't need to do that
since we call module_text_address() (the same function kernel_text_address
uses) ourselves.

Signed-off-by: Trent Piepho <xyzzy@speakeasy.org>
Cc: Zwane Mwaikambo <zwane@fsmlabs.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Johannes Stezenbach <js@linuxtv.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-15 11:20:55 -07:00
Heiko Carstens
986733e01d [PATCH] RCU: introduce rcu_needs_cpu() interface
With "Paul E. McKenney" <paulmck@us.ibm.com>

Introduce rcu_needs_cpu() interface.  This can be used to tell if there
will be a new rcu batch on a cpu soon by looking at the curlist pointer.
This can be used to avoid to enter a tickless idle state where the cpu
would miss that a new batch is ready when rcu_start_batch would be called
on a different cpu.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-15 11:20:55 -07:00
Jeff Garzik
8d4ee71ff6 Merge branch 'max-sect' into upstream 2006-05-15 11:27:47 -04:00
Jeff Garzik
efa6e7e9d4 Merge branch 'for-jeff' of git://htj.dyndns.org/libata-tj into tejun-merge 2006-05-15 11:26:53 -04:00
Jeff Garzik
5006ecc2d5 Merge branch 'master' into upstream 2006-05-15 11:26:03 -04:00
Tejun Heo
a6e6ce8e8d [PATCH] libata-ncq: implement NCQ device configuration
Now that all NCQ related stuff are in place, implement NCQ device
configuration and bump ATA_MAX_QUEUE to 32 thus activating NCQ
support.

Original implementation is from Jens Axboe.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 21:03:48 +09:00
Tejun Heo
dedaf2b036 [PATCH] libata-ncq: implement ap->qc_active, ap->sactive and complete helper
Add ap->qc_active and ap->sactive, mask of all active qcs and libata's
view of the SActive register, respectively.  Also, implement
ata_qc_complete_multiple() which takes new qc_active mask and complete
multiple qcs according to the mask.

These will be used to track NCQ commands and complete them.  The
distinction between ap->qc_active and ap->sactive is also useful for
later PM implementation.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 21:03:43 +09:00
Tejun Heo
6cec4a3943 [PATCH] libata-ncq: rename ap->qactive to ap->qc_allocated
Rename ap->qactive to ap->qc_allocated.  This is to accomodate
addition of ap->qc_active, mask of active qcs.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 21:03:41 +09:00
Tejun Heo
88e490340e [PATCH] libata-ncq: add NCQ related ATA/libata constants and macros
Add NCQ related ATA/libata constants and macros.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 21:03:38 +09:00
Tejun Heo
12436c30f4 Merge branch 'irq-pio'
Conflicts:

	drivers/scsi/libata-core.c
	include/linux/libata.h
2006-05-15 20:59:15 +09:00
Tejun Heo
6d97dbd72d [PATCH] libata-eh: implement BMDMA EH
Implement stock BMDMA error handling methods.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:24 +09:00
Tejun Heo
022bdb075b [PATCH] libata-eh: implement new EH
Implement new EH.  The exported interface is ata_do_eh() which is to
be called from ->error_handler and performs the following steps to
recover the failed port.

ata_eh_autopsy() : analyze SError/TF, determine the cause of failure
		   and required recovery actions and record it in
		   ap->eh_context
ata_eh_report()	 : report the failure to user
ata_eh_recover() : perform recovery actions described in ap->eh_context
ata_eh_finish()	 : finish failed qcs

LLDDs can customize error handling by modifying eh_context before
calling ata_do_eh() or, if necessary, doing so inbetween each major
steps by calling each step explicitly.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:22 +09:00
Tejun Heo
f3e81b19aa [PATCH] libata-eh: implement ata_eh_info and ata_eh_context
struct ata_eh_info serves as the communication channel between
execution path and EH.  Execution path describes detected error
condition in ap->eh_info and EH recovers the port using it.  To avoid
missing error conditions detected during EH, EH makes its own copy of
eh_info and clears it on entry allowing error info to accumulate
during EH.

Most EH states including EH's copy of eh_info are stored in
ap->eh_context (struct ata_eh_context) which is owned by EH and thus
doesn't require any synchronization to access and alter.  This
standardized context makes it easy to integrate various parts of EH
and extend EH to handle multiple links (for PM).

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:21 +09:00
Tejun Heo
0c247c559c [PATCH] libata-eh: implement dev->ering
This patch implements ata_ering and uses it to define dev->ering.

ata_ering is a ring buffer which records libata errors - whether a
command was for normar IO request, err_mask and timestamp.  Errors are
recorded per-device in dev->ering.  This will be used by EH to
determine recovery actions.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:19 +09:00
Tejun Heo
9be1e979f2 [PATCH] libata-eh: add ATA and libata flags for new EH
Add ATA and libata flags to be used by new EH.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:17 +09:00
Tejun Heo
ad9e276244 [PATCH] libata-eh-fw: update ata_scsi_error() for new EH
Update ata_scsi_error() for new EH.  ata_scsi_error() is responsible
for claiming timed out qcs and invoking ->error_handler in safe and
synchronized manner.  As the state of the controller is unknown if a
qc has timed out, the port is frozen in such cases.

Note that ata_scsi_timed_out() isn't used for new EH.  This is because
a timed out qc cannot be claimed by EH without freezing the port and
freezing the port in ata_scsi_timed_out() results in unnecessary
abortion of other active qcs.  ata_scsi_timed_out() can be removed
once all drivers are converted to new EH.

While at it, add 'TODO: kill' comments to old EH functions.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:12 +09:00
Tejun Heo
e318049949 [PATCH] libata-eh-fw: implement freeze/thaw
Freezing is performed atomic w.r.t. host_set->lock and once frozen
LLDD is not allowed to access the port or any qc on it.  Also, libata
makes sure that no new qc gets issued to a frozen port.

A frozen port is thawed after a reset operation completes
successfully, so reset methods must do its job while the port is
frozen.  During initialization all ports get frozen before requesting
IRQ, so reset methods are always invoked on a frozen port.

Optional ->freeze and ->thaw operations notify LLDD that the port is
being frozen and thawed, respectively.  LLDD can disable/enable
hardware interrupt in these callbacks if the controller's IRQ mask can
be changed dynamically.  If the controller doesn't allow such
operation, LLDD can check for frozen state in the interrupt handler
and ack/clear interrupts unconditionally while frozen.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:09 +09:00
Tejun Heo
7b70fc0398 [PATCH] libata-eh-fw: implement ata_port_schedule_eh() and ata_port_abort()
ata_port_schedule_eh() directly schedules EH for @ap without
associated qc.  Once EH scheduled, no further qc is allowed and EH
kicks in as soon as all currently active qc's are drained.

ata_port_abort() schedules all currently active commands for EH by
qc_completing them with ATA_QCFLAG_FAILED set.  If ata_port_abort()
doesn't find any qc to abort, it directly schedule EH using
ata_port_schedule_eh().

These two functions provide ways to invoke EH for conditions which
aren't directly related to any specfic qc.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:07 +09:00
Tejun Heo
f686bcb807 [PATCH] libata-eh-fw: implement new EH scheduling via error completion
There are several ways a qc can get schedule for EH in new EH.  This
patch implements one of them - completing a qc with ATA_QCFLAG_FAILED
set or with non-zero qc->err_mask.  ALL such qc's are examined by EH.

New EH schedules a qc for EH from completion iff ->error_handler is
implemented, qc is marked as failed or qc->err_mask is non-zero and
the command is not an internal command (internal cmd is handled via
->post_internal_cmd).  The EH scheduling itself is performed by asking
SCSI midlayer to schedule EH for the specified scmd.

For drivers implementing old-EH, nothing changes.  As this change
makes ata_qc_complete() rather large, it's not inlined anymore and
__ata_qc_complete() is exported to other parts of libata for later
use.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:05 +09:00
Tejun Heo
f69499f42c [PATCH] libata-eh-fw: update ata_qc_from_tag() to enforce normal/EH qc ownership
New EH framework has clear distinction about who owns a qc.  Every qc
starts owned by normal execution path - PIO, interrupt or whatever.
When an exception condition occurs which affects the qc, the qc gets
scheduled for EH.  Note that some events (say, link lost and regained,
command timeout) may schedule qc's which are not directly related but
could have been affected for EH too.  Scheduling for EH is atomic
w.r.t. ap->host_set->lock and once schedule for EH, normal execution
path is not allowed to access the qc in whatever way.  (PIO
synchronization acts a bit different and will be dealt with later)

This patch make ata_qc_from_tag() check whether a qc is active and
owned by normal path before returning it.  If conditions don't match,
NULL is returned and thus access to the qc is denied.
__ata_qc_from_tag() is the original ata_qc_from_tag() and is used by
libata core/EH layers to access inactive/failed qc's.

This change is applied only if the associated LLDD implements new EH
as indicated by non-NULL ->error_handler

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:03 +09:00
Tejun Heo
2ab7db1ff1 [PATCH] libata-eh-fw: use special reserved tag and qc for internal commands
New EH may issue internal commands to recover from error while failed
qc's are still hanging around.  To allow such usage, reserve tag
ATA_MAX_QUEUE-1 for internal command.  This also makes it easy to tell
whether a qc is for internal command or not.  ata_tag_internal() test
implements this test.

To avoid breaking existing drivers, ata_exec_internal() uses
ATA_TAG_INTERNAL only for drivers which implement ->error_handler.
For drivers using old EH, tag 0 is used.  Note that this makes
ata_tag_internal() test valid only when ->error_handler is
implemented.  This is okay as drivers on old EH should not and does
not have any reason to use ata_tag_internal().

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:02 +09:00
Tejun Heo
9ec957f200 [PATCH] libata-eh-fw: add flags and operations for new EH
Add ATA_FLAG_EH_{PENDING|FROZEN}, ATA_ATA_QCFLAG_{FAILED|SENSE_VALID}
and ops->freeze, thaw, error_handler, post_internal_cmd() for new EH.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:58 +09:00
Tejun Heo
61440db61f [PATCH] libata: implement ATA printk helpers
Implement ata_{port|dev}_printk() which prefixes the message with
proper identification string.  This change is necessary for later PM
support because devices and links should be identified differently
depending on how they are attached.

This also helps unifying device id strings.  Currently, there are two
forms in use (P is the port number D device number) - 'ataP(D):', and
'ataP: dev D '.  These macros also make it harder to forget proper ID
string (e.g. printing only port number when a device is in question).

Debug message handling can be integrated into these printk macros by
passing debug type and level via @lv.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:55 +09:00
Tejun Heo
3373efd89d [PATCH] libata: use dev->ap
Use dev->ap where possible and eliminate superflous @ap from functions
and structures.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:53 +09:00
Tejun Heo
38d87234d6 [PATCH] libata: add dev->ap
Add dev->ap which points back to the port the device belongs to.  This
makes it unnecessary to pass @ap for silly reasons (e.g. printks).
Also, this change is necessary to accomodate later PM support which
will introduce ATA link inbetween port and device.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:51 +09:00
Tejun Heo
a0ab51cefc [PATCH] libata: kill old SCR functions and sata_dev_present()
Kill now unused scr_{read|write|write_flush}() and sata_dev_present().

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:49 +09:00
Tejun Heo
34bf21704c [PATCH] libata: implement new SCR handling and port on/offline functions
Implement ata_scr_{valid|read|write|write_flush}() and
ata_port_{online|offline}().  These functions replace
scr_{read|write}() and sata_dev_present().

Major difference between between the new SCR functions and the old
ones is that the new ones have a way to signal error to the caller.
This makes handling SCR-available and SCR-unavailable cases in the
same path easier.  Also, it eases later PM implementation where SCR
access can fail due to various reasons.

ata_port_{online|offline}() functions return 1 only when they are
affirmitive of the condition.  e.g.  if SCR is unaccessible or
presence cannot be determined for other reasons, these functions
return 0.  So, ata_port_online() != !ata_port_offline().  This
distinction is useful in many exception handling cases.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:46 +09:00
Tejun Heo
e61e067227 [PATCH] libata: implement qc->result_tf
Add qc->result_tf and ATA_QCFLAG_RESULT_TF.  This moves the
responsibility of loading result TF from post-compltion path to qc
execution path.  qc->result_tf is loaded if explicitly requested or
the qc failsa.  This allows more efficient completion implementation
and correct handling of result TF for controllers which don't have
global TF representation such as sil3124/32.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:40 +09:00
Tejun Heo
fe635c7e91 [PATCH] libata: use preallocated buffers
It's not a very good idea to allocate memory during EH.  Use
statically allocated buffer for dev->id[] and add 512byte buffer
ap->sector_buf.  This buffer is owned by EH (or probing) and to be
used as temporary buffer for various purposes (IDENTIFY, NCQ log page
10h, PM GSCR block).

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:35 +09:00
Tejun Heo
6cd727b14f [PATCH] libata: kill duplicate prototypes
Kill duplicate prototypes for ata_eh_qc_complete/retry() in libata.h.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:28 +09:00
Tejun Heo
3c567b7d11 [PATCH] libata: rename ata_down_sata_spd_limit() and friends
Rename ata_down_sata_spd_limit() and friends to sata_down_spd_limit()
and likewise for simplicity & consistency.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:23 +09:00
David Woodhouse
3e68fbb59b [JFFS2] Don't pack on-medium structures, because GCC emits crappy code
If we use __attribute__((packed)), GCC will _also_ assume that the
structures aren't sensibly aligned, and it'll emit code to cope with
that instead of straight word load/save. This can be _very_ suboptimal
on architectures like ARM.

Ideally, we want an attribute which just tells GCC not to do any
padding, without the alignment side-effects. In the absense of that,
we'll just drop the 'packed' attribute and hope that everything stays as
it was (which to be fair is fairly much what we expect). And add some
paranoia checks in the initialisation code, which should be optimised
away completely in the normal case.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-15 00:49:43 +01:00
David Woodhouse
0d4e30d26a [MTD] Clean up <linux/mtd/physmap.h> to fix modular build
... and also fix the multiple inclusion guard so it actually _works_

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-14 12:25:19 +01:00
David Woodhouse
151e76590f [MTD] Fix legacy character sets throughout drivers/mtd, include/linux/mtd
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-14 01:51:54 +01:00
KaiGai Kohei
aa98d7cf59 [JFFS2][XATTR] XATTR support on JFFS2 (version. 5)
This attached patches provide xattr support including POSIX-ACL and
SELinux support on JFFS2 (version.5).

There are some significant differences from previous version posted
at last December.
The biggest change is addition of EBS(Erase Block Summary) support.
Currently, both kernel and usermode utility (sumtool) can recognize
xattr nodes which have JFFS2_NODETYPE_XATTR/_XREF nodetype.

In addition, some bugs are fixed.
- A potential race condition was fixed.
- Unexpected fail when updating a xattr by same name/value pair was fixed.
- A bug when removing xattr name/value pair was fixed.

The fundamental structures (such as using two new nodetypes and exclusion
mechanism by rwsem) are unchanged. But most of implementation were reviewed
and updated if necessary.
Espacially, we had to change several internal implementations related to
load_xattr_datum() to avoid a potential race condition.

[1/2] xattr_on_jffs2.kernel.version-5.patch
[2/2] xattr_on_jffs2.utils.version-5.patch

Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-13 15:09:47 +09:00
Mauro Carvalho Chehab
cd41e28e2d V4L/DVB (3774): Create V4L1 config options
V4L1 API is depreciated and should be removed soon from kernel. This patch
adds two new options, one to disable V4L1 drivers, and another to disable
V4L1 compat module. This way, it would be easy to check what still depends
on V4L1 stuff, allowing also to test if app works fine with V4L2 only support.

Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-05-12 19:54:53 -03:00
Kyungmin Park
493c646077 OneNAND: One-Time Programmable (OTP) support
One Block of the NAND Flash Array memory is reserved as
a One-Time Programmable Block memory area.
Also, 1st Block of NAND Flash Array can be used as OTP.

The OTP block can be read, programmed and locked using the same
operations as any other NAND Flash Array memory block.
OTP block cannot be erased.

OTP block is fully-guaranteed to be a valid block.

Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
2006-05-12 15:35:50 +01:00
Kyungmin Park
9c01f87db1 OneNAND: handle byte access on BufferRAM
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
2006-05-12 15:35:45 +01:00
Linus Torvalds
2bf9d6d0f2 Merge master.kernel.org:/home/rmk/linux-2.6-serial
* master.kernel.org:/home/rmk/linux-2.6-serial:
  [SERIAL] 8250: add locking to console write function
  [SERIAL] Remove unconditional enable of TX irq for console
  [SERIAL] 8250: set divisor register correctly for AMD Alchemy SoC uart
  [SERIAL] AMD Alchemy UART: claim memory range
  [SERIAL] Clean up serial locking when obtaining a reference to a port
2006-05-11 15:46:42 -07:00
Linus Torvalds
6572b2064a Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [NET_SCHED]: HFSC: fix thinko in hfsc_adjust_levels()
  [IPV6]: skb leakage in inet6_csk_xmit
  [BRIDGE]: Do sysfs registration inside rtnl.
  [NET]: Do sysfs registration as part of register_netdevice.
  [TG3]: Fix possible NULL deref in tg3_run_loopback().
  [NET] linkwatch: Handle jiffies wrap-around
  [IRDA]: Switching to a workqueue for the SIR work
  [IRDA]: smsc-ircc: Minimal hotplug support.
  [IRDA]: Removing unused EXPORT_SYMBOLs
  [IRDA]: New maintainer.
  [NET]: Make netdev_chain a raw notifier.
  [IPV4]: ip_options_fragment() has no effect on fragmentation
  [NET]: Add missing operstates documentation.
2006-05-11 15:35:54 -07:00
Linus Torvalds
6314410dd1 Merge branch 'upstream' of master.kernel.org:/pub/scm/linux/kernel/git/shemminger/netdev-2.6
* 'upstream' of master.kernel.org:/pub/scm/linux/kernel/git/shemminger/netdev-2.6:
  sis900: phy for FoxCon motherboard
  dl2k: use DMA_48BIT_MASK constant
  phy: mdiobus_register(): initialize all phy_map entries
  sky2: ifdown kills irq mask
2006-05-10 14:59:29 -07:00
Francois Romieu
4c1b46226c dl2k: use DMA_48BIT_MASK constant
Typo will be harder with this one.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
2006-05-10 14:04:22 -07:00
Stephen Hemminger
b17a7c179d [NET]: Do sysfs registration as part of register_netdevice.
The last step of netdevice registration was being done by a delayed
call, but because it was delayed, it was impossible to return any error
code if the class_device registration failed.

Side effects:
 * one state in registration process is unnecessary.
 * register_netdevice can sleep inside class_device registration/hotplug
 * code in netdev_run_todo only does unregistration so it is simpler.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-10 13:21:17 -07:00
Bjorn Helgaas
32e62c636a [IA64] rework memory attribute aliasing
This closes a couple holes in our attribute aliasing avoidance scheme:

  - The current kernel fails mmaps of some /dev/mem MMIO regions because
    they don't appear in the EFI memory map.  This keeps X from working
    on the Intel Tiger box.

  - The current kernel allows UC mmap of the 0-1MB region of
    /sys/.../legacy_mem even when the chipset doesn't support UC
    access.  This causes an MCA when starting X on HP rx7620 and rx8620
    boxes in the default configuration.

There's more detail in the Documentation/ia64/aliasing.txt file this
adds, but the general idea is that if a region might be covered by
a granule-sized kernel identity mapping, any access via /dev/mem or
mmap must use the same attribute as the identity mapping.

Otherwise, we fall back to using an attribute that is supported
according to the EFI memory map, or to using UC if the EFI memory
map doesn't mention the region.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2006-05-08 16:32:05 -07:00
Stephen Hemminger
d324031245 sky2: backout NAPI reschedule
This is a backout of earlier patch.

The whole rescheduling hack was a bad idea. It doesn't really solve
the problem and it makes the code more complicated for no good reason.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
2006-05-08 16:00:23 -07:00
David Woodhouse
6f18a022fb Finally remove the obnoxious inter_module_xxx()
This was already a bad plan when I argued against adding it in the first
place. Good riddance.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-08 22:40:05 +01:00
Lennert Buytenhek
73566edf9b [MTD] Convert physmap to platform driver
After dwmw2 let me know it ought to be done, I rewrote the physmap map
driver to be a platform driver.  I know zilch about the driver model,
so I probably botched it in some way, but I've done some tests on an
ixp23xx board which uses physmap, and it all seems to work.

In order to not break existing physmap users, I've added some compat
code that will instantiate a platform device iff CONFIG_MTD_PHYSMAP_LEN
is defined and != 0.  Also, I've changed the default value for
CONFIG_MTD_PHYSMAP_LEN to zero, so that people who inadvertently
compile in physmap (or new, platform-style, users of physmap) don't get
burned.

This works pretty well -- the new physmap driver is a drop-in replacement
for the old one, and works on said ixp23xx board without any code changes
needed.  (This should hold as long as users don't touch 'physmap_map'
directly.)

Once all physmap users have been converted to instantiate their own
platform devices, the compat code can go.  (Or we decide that we can
change all the in-tree users at the same time, and never merge the
compat code.)

Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-07 17:16:36 +01:00
Stephen Hemminger
fe9925b551 [NET]: Create netdev attribute_groups with class_device_add
Atomically create attributes when class device is added. This avoids
the race between registering class_device (which generates hotplug
event), and the creation of attribute groups.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-06 17:56:03 -07:00
Stephen Hemminger
1498221d51 [CLASS DEVICE]: add attribute_group creation
Extend the support of attribute groups in class_device's to allow
groups to be created as part of the registration process. This allows
network device's to avoid race between registration and creating
groups.

Note that unlike attributes that are a property of the class object,
the groups are a property of the class_device object. This is done
because there are different types of network devices (wireless for
example).

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-06 17:55:11 -07:00
David Woodhouse
5047f09b56 Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-06 19:59:18 +01:00
Linus Torvalds
d98550e334 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
  [PATCH] powerpc: Use the ibm,pa-features property if available
  powerpc: Fix incorrect might_sleep in __get_user/__put_user on kernel addresses
  [PATCH] ppc32 CPM_UART: fixes and improvements
  [PATCH] ppc32 CPM_UART: Fixed break send on SCC
  [PATCH] powerpc/kprobes: fix singlestep out-of-line
  [PATCH] powerpc/pseries: avoid crash in PCI code if mem system not up
2006-05-04 15:09:52 -07:00
Linus Torvalds
6fc56ccfe4 Merge master.kernel.org:/home/rmk/linux-2.6-mmc
* master.kernel.org:/home/rmk/linux-2.6-mmc:
  [MMC] Move set_ios debugging into mmc.c
  [MMC] Correct mmc_request_done comments
  [MMC] PXA: reduce the number of lines PXAMCI debug uses
  [MMC] PXA and i.MX: don't avoid sending stop command on error
  [MMC] extend data timeout for writes
  [ARM] 3485/1: i.MX: MX1 SD/MMC fix of unintentional double start possibility
2006-05-04 14:52:27 -07:00
Linus Torvalds
cbdf811c77 Merge branch 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block
* 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block:
  [PATCH] compat_sys_vmsplice: one-off in UIO_MAXIOV check
  [PATCH] splice: redo page lookup if add_to_page_cache() returns -EEXIST
  [PATCH] splice: rename remaining info variables to pipe
  [PATCH] splice: LRU fixups
  [PATCH] splice: fix unlocking of page on error ->prepare_write()
2006-05-04 13:25:40 -04:00
David Woodhouse
05f75fd3bb Include <linux/types.h> and use __uXX types in <linux/cramfs_fs.h>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-04 17:50:04 +01:00
David Woodhouse
cb8c1fdc0c Use __uXX types in <linux/i2o_dev.h>, include <linux/ioctl.h> too
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-04 17:32:44 +01:00
David Woodhouse
de654c9786 Remove private struct dx_hash_info from public view in <linux/ext3_fs.h>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-04 17:28:26 +01:00
David Woodhouse
7ee7d0e318 Include <linux/types.h> and use __uXX types in <linux/affs_hardblocks.h>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-04 15:49:24 +01:00
David Woodhouse
5da0458900 Use __uXX types in <linux/divert.h> for struct divert_blk et al.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-04 15:07:59 +01:00
David Woodhouse
2c88f4a8bc Remove PPP_FCS from user view in <linux/ppp_defs.h>, remove __P mess entirely
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-04 12:07:37 +01:00
Jing Min Zhao
7582e9d17e [NETFILTER]: H.323 helper: Change author's email address
Signed-off-by: Jing Min Zhao <zhaojingmin@users.sourceforge.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 23:19:59 -07:00
Jens Axboe
1432873af7 [PATCH] splice: LRU fixups
Nick says that the current construct isn't safe. This goes back to the
original, but sets PIPE_BUF_FLAG_LRU on user pages as well as they all
seem to be on the LRU in the first place.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-05-04 06:55:12 +02:00
David Woodhouse
90abbae2d3 Use __uXX types in user-visible structures in <linux/nbd.h>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-04 02:55:50 +01:00
David Woodhouse
8e1515df57 Don't use 'u32' in user-visible struct ip_conntrack_old_tuple.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2006-05-04 01:42:36 +01:00