twx-linux

Author	SHA1	Message	Date
Jan Engelhardt	13f7d63c29	[NETFILTER]: nf_{conntrack,nat}_sip: annotate SIP helper with const Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-31 19:28:08 -08:00
Alexey Dobriyan	3cb609d57c	[NETFILTER]: x_tables: create per-netns /proc/net/_tables_ Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-31 19:28:06 -08:00
Jan Engelhardt	09e410def6	[NETFILTER]: xt_hashlimit match, revision 1 Introduces the xt_hashlimit match revision 1. It adds support for kernel-level inversion and grouping source and/or destination IP addresses, allowing to limit on a per-subnet basis. While this would technically obsolete xt_limit, xt_hashlimit is a more expensive due to the hashbucketing. Kernel-level inversion: Previously you had to do user-level inversion: iptables -N foo iptables -A foo -m hashlimit --hashlimit(-upto) 5/s -j RETURN iptables -A foo -j DROP iptables -A INPUT -j foo now it is simpler: iptables -A INPUT -m hashlimit --hashlimit-over 5/s -j DROP Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-31 19:28:04 -08:00
Patrick McHardy	b0a6363c24	[NETFILTER]: {ip,arp,ip6}_tables: fix sparse warnings in compat code CHECK net/ipv4/netfilter/ip_tables.c net/ipv4/netfilter/ip_tables.c:1453:8: warning: incorrect type in argument 3 (different signedness) net/ipv4/netfilter/ip_tables.c:1453:8: expected int size net/ipv4/netfilter/ip_tables.c:1453:8: got unsigned int [usertype] size net/ipv4/netfilter/ip_tables.c:1458:44: warning: incorrect type in argument 3 (different signedness) net/ipv4/netfilter/ip_tables.c:1458:44: expected int size net/ipv4/netfilter/ip_tables.c:1458:44: got unsigned int [usertype] size net/ipv4/netfilter/ip_tables.c:1603:2: warning: incorrect type in argument 2 (different signedness) net/ipv4/netfilter/ip_tables.c:1603:2: expected unsigned int i net/ipv4/netfilter/ip_tables.c:1603:2: got int <noident> net/ipv4/netfilter/ip_tables.c:1627:8: warning: incorrect type in argument 3 (different signedness) net/ipv4/netfilter/ip_tables.c:1627:8: expected int size net/ipv4/netfilter/ip_tables.c:1627:8: got unsigned int size net/ipv4/netfilter/ip_tables.c:1634:40: warning: incorrect type in argument 3 (different signedness) net/ipv4/netfilter/ip_tables.c:1634:40: expected int size net/ipv4/netfilter/ip_tables.c:1634:40: got unsigned int size net/ipv4/netfilter/ip_tables.c:1653:8: warning: incorrect type in argument 5 (different signedness) net/ipv4/netfilter/ip_tables.c:1653:8: expected unsigned int i net/ipv4/netfilter/ip_tables.c:1653:8: got int <noident> net/ipv4/netfilter/ip_tables.c:1666:2: warning: incorrect type in argument 2 (different signedness) net/ipv4/netfilter/ip_tables.c:1666:2: expected unsigned int i net/ipv4/netfilter/ip_tables.c:1666:2: got int <noident> CHECK net/ipv4/netfilter/arp_tables.c net/ipv4/netfilter/arp_tables.c:1285:40: warning: incorrect type in argument 3 (different signedness) net/ipv4/netfilter/arp_tables.c:1285:40: expected int size net/ipv4/netfilter/arp_tables.c:1285:40: got unsigned int size net/ipv4/netfilter/arp_tables.c:1543:44: warning: incorrect type in argument 3 (different signedness) net/ipv4/netfilter/arp_tables.c:1543:44: expected int size net/ipv4/netfilter/arp_tables.c:1543:44: got unsigned int [usertype] size CHECK net/ipv6/netfilter/ip6_tables.c net/ipv6/netfilter/ip6_tables.c:1481:8: warning: incorrect type in argument 3 (different signedness) net/ipv6/netfilter/ip6_tables.c:1481:8: expected int size net/ipv6/netfilter/ip6_tables.c:1481:8: got unsigned int [usertype] size net/ipv6/netfilter/ip6_tables.c:1486:44: warning: incorrect type in argument 3 (different signedness) net/ipv6/netfilter/ip6_tables.c:1486:44: expected int size net/ipv6/netfilter/ip6_tables.c:1486:44: got unsigned int [usertype] size net/ipv6/netfilter/ip6_tables.c:1631:2: warning: incorrect type in argument 2 (different signedness) net/ipv6/netfilter/ip6_tables.c:1631:2: expected unsigned int i net/ipv6/netfilter/ip6_tables.c:1631:2: got int <noident> net/ipv6/netfilter/ip6_tables.c:1655:8: warning: incorrect type in argument 3 (different signedness) net/ipv6/netfilter/ip6_tables.c:1655:8: expected int size net/ipv6/netfilter/ip6_tables.c:1655:8: got unsigned int size net/ipv6/netfilter/ip6_tables.c:1662:40: warning: incorrect type in argument 3 (different signedness) net/ipv6/netfilter/ip6_tables.c:1662:40: expected int size net/ipv6/netfilter/ip6_tables.c:1662:40: got unsigned int size net/ipv6/netfilter/ip6_tables.c:1680:8: warning: incorrect type in argument 5 (different signedness) net/ipv6/netfilter/ip6_tables.c:1680:8: expected unsigned int i net/ipv6/netfilter/ip6_tables.c:1680:8: got int <noident> net/ipv6/netfilter/ip6_tables.c:1693:2: warning: incorrect type in argument 2 (different signedness) net/ipv6/netfilter/ip6_tables.c:1693:2: expected unsigned int i net/ipv6/netfilter/ip6_tables.c:1693:2: got int <noident> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-31 19:27:49 -08:00
Jan Engelhardt	edc26f7aaa	[NETFILTER]: xt_owner: allow matching UID/GID ranges Add support for ranges to the new revision. This doesn't affect compatibility since the new revision was not released yet. Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-31 19:27:43 -08:00
Alexey Dobriyan	79df341ab6	[NETFILTER]: arp_tables: netns preparation * Propagate netns from userspace. * arpt_register_table() registers table in supplied netns. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-31 19:27:40 -08:00
Alexey Dobriyan	336b517fdc	[NETFILTER]: ip6_tables: netns preparation * Propagate netns from userspace down to xt_find_table_lock() * Register ip6 tables in netns (modules still use init_net) Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-31 19:27:39 -08:00
Alexey Dobriyan	44d34e721e	[NETFILTER]: x_tables: return new table from {arp,ip,ip6}t_register_table() Typical table module registers xt_table structure (i.e. packet_filter) and link it to list during it. We can't use one template for it because corresponding list_head will become corrupted. We also can't unregister with template because it wasn't changed at all and thus doesn't know in which list it is. So, we duplicate template at the very first step of table registration. Table modules will save it for use during unregistration time and actual filtering. Do it at once to not screw bisection. P.S.: renaming i.e. packet_filter => __packet_filter is temporary until full netnsization of table modules is done. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-31 19:27:36 -08:00
Alexey Dobriyan	8d87005207	[NETFILTER]: x_tables: per-netns xt_tables In fact all we want is per-netns set of rules, however doing that will unnecessary complicate routines such as ipt_hook()/ipt_do_table, so make full xt_table array per-netns. Every user stubbed with init_net for a while. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-31 19:27:35 -08:00
Alexey Dobriyan	a98da11d88	[NETFILTER]: x_tables: change xt_table_register() return value convention Switch from 0/-E to ptr/PTR_ERR convention. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-31 19:27:35 -08:00
Jan Engelhardt	b41649989c	[NETFILTER]: xt_conntrack: add port and direction matching Extend the xt_conntrack match revision 1 by port matching (all four {orig,repl}{src,dst}) and by packet direction matching. Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-31 19:27:31 -08:00
Jan Engelhardt	c82a5cb8b2	linux/types.h: Use __u64 for aligned_u64 Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-31 19:27:30 -08:00
Patrick McHardy	2fd8e526f4	[NETFILTER]: bridge netfilter: remove nf_bridge_info read-only netoutdev member Before the removal of the deferred output hooks, netoutdev was used in case of VLANs on top of a bridge to store the VLAN device, so the deferred hooks would see the correct output device. This isn't necessary anymore since we're calling the output hooks for the correct device directly in the IP stack. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-31 19:27:29 -08:00
Jan Engelhardt	ecb6f85e11	[NETFILTER]: Use const in struct xt_match, xt_target, xt_table Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-31 19:27:28 -08:00
Herbert Xu	1a6509d991	[IPSEC]: Add support for combined mode algorithms This patch adds support for combined mode algorithms with GCM being the first algorithm supported. Combined mode algorithms can be added through the xfrm_user interface using the new algorithm payload type XFRMA_ALG_AEAD. Each algorithms is identified by its name and the ICV length. For the purposes of matching algorithms in xfrm_tmpl structures, combined mode algorithms occupy the same name space as encryption algorithms. This is in line with how they are negotiated using IKE. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-31 19:27:03 -08:00
Jussi Kivilinna	3692e94f15	Move usbnet.h and rndis_host.h to include/linux/usb Move headers usbnet.h and rndis_host.h to include/linux/usb and fix includes for drivers/net/usb modules. Headers are moved because rndis_wlan will be outside drivers/net/usb in drivers/net/wireless and yet need these headers. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Acked-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-31 19:27:00 -08:00
Iñaky Pérez-González	303d9bf6bb	rfkill: add the WiMAX radio type Teach rfkill about wimax radios. Had to define a KEY_WIMAX as a 'key for disabling only wimax radios', as other radio technologies have. This makes sense as hardware has specific keys for disabling specific radios. The RFKILL enabling part is, otherwise, a copy and paste of any other radio technology. Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com> Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-31 19:26:46 -08:00
Linus Torvalds	75659ca0c1	Merge branch 'task_killable' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc * 'task_killable' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc: (22 commits) Remove commented-out code copied from NFS NFS: Switch from intr mount option to TASK_KILLABLE Add wait_for_completion_killable Add wait_event_killable Add schedule_timeout_killable Use mutex_lock_killable in vfs_readdir Add mutex_lock_killable Use lock_page_killable Add lock_page_killable Add fatal_signal_pending Add TASK_WAKEKILL exit: Use task_is_* signal: Use task_is_* sched: Use task_contributes_to_load, TASK_ALL and TASK_NORMAL ptrace: Use task_is_* power: Use task_is_* wait: Use TASK_NORMAL proc/base.c: Use task_is_* proc/array.c: Use TASK_REPORT perfmon: Use task_is_* ... Fixed up conflicts in NFS/sunrpc manually..	2008-02-01 11:45:47 +11:00
Ingo Molnar	62152d0ea7	asm-generic/tlb.h: build fix bring back the avr32, blackfin, sh, sparc architectures into working order, by reverting the effects of this change that came in via the x86 tree: commit a5a19c63f4e55e32dc0bc3d936d7f94793d8b380 Author: Jeremy Fitzhardinge <jeremy@goop.org> Date: Wed Jan 30 13:33:39 2008 +0100 x86: demacro asm-x86/pgalloc_32.h Sorry about that! Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-31 22:05:48 +01:00
Linus Torvalds	8af03e782c	Merge branch 'for-2.6.25' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'for-2.6.25' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (454 commits) [POWERPC] Cell IOMMU fixed mapping support [POWERPC] Split out the ioid fetching/checking logic [POWERPC] Add support to cell_iommu_setup_page_tables() for multiple windows [POWERPC] Split out the IOMMU logic from cell_dma_dev_setup() [POWERPC] Split cell_iommu_setup_hardware() into two parts [POWERPC] Split out the logic that allocates struct iommus [POWERPC] Allocate the hash table under 1G on cell [POWERPC] Add set_dma_ops() to match get_dma_ops() [POWERPC] 83xx: Clean up / convert mpc83xx board DTS files to v1 format. [POWERPC] 85xx: Only invalidate TLB0 and TLB1 [POWERPC] 83xx: Fix typo in mpc837x compatible entries [POWERPC] 85xx: convert sbc85* boards to use machine_device_initcall [POWERPC] 83xx: rework platform Kconfig [POWERPC] 85xx: rework platform Kconfig [POWERPC] 86xx: Remove unused IRQ defines [POWERPC] QE: Explicitly set address-cells and size cells for muram [POWERPC] Convert StorCenter DTS file to /dts-v1/ format. [POWERPC] 86xx: Convert all 86xx DTS files to /dts-v1/ format. [PPC] Remove 85xx from arch/ppc [PPC] Remove 83xx from arch/ppc ...	2008-01-31 13:37:27 +11:00
Linus Torvalds	6232665040	Merge git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86 * git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86: alpha: fix x86.git merge build error ia64: on UP percpu variables are not small memory model x86: fix arch/x86/kernel/test_nx.c modular build bug s390: use generic percpu linux-2.6.git POWERPC: use generic per cpu ia64: use generic percpu SPARC64: use generic percpu percpu: change Kconfig to HAVE_SETUP_PER_CPU_AREA modules: fold percpu_modcopy into module.c x86: export copy_from_user_ll_nocache[_nozero] x86: fix duplicated TIF on 64-bit	2008-01-31 11:48:53 +11:00
Paul Mackerras	bd45ac0c5d	Merge branch 'linux-2.6'	2008-01-31 11:25:51 +11:00
Linus Torvalds	44c3b59102	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6: security: compile capabilities by default selinux: make selinux_set_mnt_opts() static SELinux: Add warning messages on network denial due to error SELinux: Add network ingress and egress control permission checks NetLabel: Add auditing to the static labeling mechanism NetLabel: Introduce static network labels for unlabeled connections SELinux: Allow NetLabel to directly cache SIDs SELinux: Enable dynamic enable/disable of the network access checks SELinux: Better integration between peer labeling subsystems SELinux: Add a new peer class and permissions to the Flask definitions SELinux: Add a capabilities bitmap to SELinux policy version 22 SELinux: Add a network node caching mechanism similar to the sel_netif_*() functions SELinux: Only store the network interface's ifindex SELinux: Convert the netif code to use ifindex values NetLabel: Add IP address family information to the netlbl_skbuff_getattr() function NetLabel: Add secid token support to the NetLabel secattr struct NetLabel: Consolidate the LSM domain mapping/hashing locks NetLabel: Cleanup the LSM domain hash functions NetLabel: Remove unneeded RCU read locks	2008-01-31 09:32:24 +11:00
Linus Torvalds	3b470ac43f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6: PPC: Fix powerpc vio_find_name to not use devices_subsys Driver core: add bus_find_device_by_name function Module: check to see if we have a built in module with the same name x86: fix runtime error in arch/x86/kernel/cpu/mcheck/mce_amd_64.c Driver core: Fix up build when CONFIG_BLOCK=N	2008-01-31 09:31:37 +11:00
travis@sgi.com	05991bef10	ia64: use generic percpu ia64 has a special processor specific mapping that can be used to locate the offset for the current per cpu area. Cc: linux-ia64@vger.kernel.org Signed-off-by: Mike Travis <travis@sgi.com> Acked-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 23:27:58 +01:00
Avi Kivity	2f52d58c92	KVM: Move apic timer migration away from critical section Migrating the apic timer in the critical section is not very nice, and is absolutely horrible with the real-time port. Move migration to the regular vcpu execution path, triggered by a new bitflag. Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-01-30 18:01:22 +02:00
Glauber de Oliveira Costa	a03d7f4b54	KVM: Put kvm_para.h include outside __KERNEL__ kvm_para.h potentially contains definitions that are to be used by userspace, so it should not be included inside the __KERNEL__ block. To protect its own data structures, kvm_para.h already includes its own __KERNEL__ block. Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Acked-by: Amit Shah <amit.shah@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-01-30 18:01:22 +02:00
Christian Ehrhardt	6f723c7911	KVM: Portability: Move kvm_fpu to asm-x86/kvm.h This patch moves kvm_fpu asm-x86/kvm.h to allow every architecture to define an own representation used for KVM_GET_FPU/KVM_SET_FPU. Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Acked-by: Carsten Otte <cotte@de.ibm.com> Acked-by: Zhang Xiantao <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-01-30 18:01:22 +02:00
Marcelo Tosatti	aaee2c94f7	KVM: MMU: Switch to mmu spinlock Convert the synchronization of the shadow handling to a separate mmu_lock spinlock. Also guard fetch() by mmap_sem in read-mode to protect against alias and memslot changes. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-01-30 18:01:21 +02:00
Marcelo Tosatti	7ec5458821	KVM: Add kvm_read_guest_atomic() In preparation for a mmu spinlock, add kvm_read_guest_atomic() and use it in fetch() and prefetch_page(). Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-01-30 18:01:20 +02:00
Avi Kivity	b93463aa59	KVM: Accelerated apic support This adds a mechanism for exposing the virtual apic tpr to the guest, and a protocol for letting the guest update the tpr without causing a vmexit if conditions allow (e.g. there is no interrupt pending with a higher priority than the new tpr). Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-01-30 18:01:20 +02:00
Avi Kivity	b209749f52	KVM: local APIC TPR access reporting facility Add a facility to report on accesses to the local apic tpr even if the local apic is emulated in the kernel. This is basically a hack that allows userspace to patch Windows which tends to bang on the tpr a lot. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-01-30 18:01:20 +02:00
Zhang Xiantao	ec10f4750d	KVM: Expose ioapic to ia64 save/restore APIs IA64 also needs to see ioapic structure in irqchip. Signed-off-by: xiantao.zhang@intel.com <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-01-30 18:01:19 +02:00
Zhang Xiantao	5736199afb	KVM: Move kvm_vcpu_kick() to x86.c Moving kvm_vcpu_kick() to x86.c. Since it should be common for all archs, put its declarations in <linux/kvm_host.h> Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-01-30 18:01:19 +02:00
Avi Kivity	edf884172e	KVM: Move arch dependent files to new directory arch/x86/kvm/ This paves the way for multiple architecture support. Note that while ioapic.c could potentially be shared with ia64, it is also moved. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-01-30 18:01:18 +02:00
Avi Kivity	fb56dbb31c	KVM: Export include/linux/kvm.h only if $ARCH actually supports KVM Currently, make headers_check barfs due to <asm/kvm.h>, which <linux/kvm.h> includes, not existing. Rather than add a zillion <asm/kvm.h>s, export kvm.h only if the arch actually supports it. Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-01-30 17:53:16 +02:00
Jerone Young	51e296258c	KVM: Add ifdef in irqchip struct for x86 only structures This patch fixes a small issue where sturctures: kvm_pic_state kvm_ioapic_state are defined inside x86 specific code and may or may not be defined in anyway for other architectures. The problem caused is one cannot compile userspace apps (ex. libkvm) for other archs since a size cannot be determined for these structures. Signed-off-by: Jerone Young <jyoung5@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-01-30 17:53:15 +02:00
Dan Kenigsberg	0771671749	KVM: Enhance guest cpuid management The current cpuid management suffers from several problems, which inhibit passing through the host feature set to the guest: - No way to tell which features the host supports While some features can be supported with no changes to kvm, others need explicit support. That means kvm needs to vet the feature set before it is passed to the guest. - No support for indexed or stateful cpuid entries Some cpuid entries depend on ecx as well as on eax, or on internal state in the processor (running cpuid multiple times with the same input returns different output). The current cpuid machinery only supports keying on eax. - No support for save/restore/migrate The internal state above needs to be exposed to userspace so it can be saved or migrated. This patch adds extended cpuid support by means of three new ioctls: - KVM_GET_SUPPORTED_CPUID: get all cpuid entries the host (and kvm) supports - KVM_SET_CPUID2: sets the vcpu's cpuid table - KVM_GET_CPUID2: gets the vcpu's cpuid table, including hidden state [avi: fix original KVM_SET_CPUID not removing nx on non-nx hosts as it did before] Signed-off-by: Dan Kenigsberg <danken@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-01-30 17:53:13 +02:00
Jerone Young	a162dd5873	KVM: Portability: Move cpuid structures to <asm/kvm.h> This patch moves structures: kvm_cpuid_entry kvm_cpuid from include/linux/kvm.h to include/asm-x86/kvm.h Signed-off-by: Jerone Young <jyoung5@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-01-30 17:53:08 +02:00
Jerone Young	244d57ece9	KVM: Portability: Move kvm_sregs and msr structures to <asm/kvm.h> Move structures: kvm_sregs kvm_msr_entry kvm_msrs kvm_msr_list from include/linux/kvm.h to include/asm-x86/kvm.h Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-01-30 17:53:08 +02:00
Jerone Young	3a56b20104	KVM: Portability: Move kvm_segment & kvm_dtable structure to <asm/kvm.h> This patch moves structures: kvm_segment kvm_dtable from include/linux/kvm.h to include/asm-x86/kvm.h Signed-off-by: Jerone Young <jyoung5@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-01-30 17:53:08 +02:00
Jerone Young	d9ecf92810	KVM: Portability: Move structure lapic_state to <asm/kvm.h> This patch moves structure lapic_state from include/linux/kvm.h to include/asm-x86/kvm.h Signed-off-by: Jerone Young <jyoung5@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-01-30 17:53:08 +02:00
Jerone Young	19d30b1644	KVM: Portability: Move kvm_regs to <asm/kvm.h> This patch moves structure kvm_regs to include/asm-x86/kvm.h. Each architecture will need to create there own version of this structure. Signed-off-by: Jerone Young <jyoung5@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-01-30 17:53:07 +02:00
Jerone Young	da1386a5bc	KVM: Portability: Move x86 pic strutctures This patch moves structures: kvm_pic_state kvm_ioapic_state to inclue/asm-x86/kvm.h. Signed-off-by: Jerone Young <jyoung5@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-01-30 17:53:07 +02:00
Jerone Young	f6a40e3bdf	KVM: Portability: Move kvm_memory_alias to asm/kvm.h This patch moves sturct kvm_memory_alias from include/linux/kvm.h to include/asm-x86/kvm.h. Also have include/linux/kvm.h include include/asm/kvm.h. Signed-off-by: Jerone Young <jyoung5@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-01-30 17:53:07 +02:00
Izik Eidus	cbc9402297	KVM: Add ioctl to tss address from userspace, Currently kvm has a wart in that it requires three extra pages for use as a tss when emulating real mode on Intel. This patch moves the allocation internally, only requiring userspace to tell us where in the physical address space we can place the tss. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-01-30 17:52:56 +02:00
Christian Borntraeger	5f43238d03	KVM: Per-architecture hypercall definitions Currently kvm provides hypercalls only for x86* architectures. To provide hypercall infrastructure for other kvm architectures I split kvm_para.h into a generic header file and architecture specific definitions. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-01-30 17:52:55 +02:00
Izik Eidus	6fc138d227	KVM: Support assigning userspace memory to the guest Instead of having the kernel allocate memory to the guest, let userspace allocate it and pass the address to the kernel. This is required for s390 support, but also enables features like memory sharing and using hugetlbfs backed memory. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-01-30 17:52:51 +02:00
Izik Eidus	82ce2c9683	KVM: Allow dynamic allocation of the mmu shadow cache size The user is now able to set how many mmu pages will be allocated to the guest. Signed-off-by: Izik Eidus <izike@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-01-30 17:52:50 +02:00
Anthony Liguori	7aa81cc047	KVM: Refactor hypercall infrastructure (v3) This patch refactors the current hypercall infrastructure to better support live migration and SMP. It eliminates the hypercall page by trapping the UD exception that would occur if you used the wrong hypercall instruction for the underlying architecture and replacing it with the right one lazily. A fall-out of this patch is that the unhandled hypercalls no longer trap to userspace. There is very little reason though to use a hypercall to communicate with userspace as PIO or MMIO can be used. There is no code in tree that uses userspace hypercalls. [avi: fix #ud injection on vmx] Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-01-30 17:52:46 +02:00
Bernhard Kaindl	f212ec4b7b	x86: early boot debugging via FireWire (ohci1394_dma=early) This patch adds a new configuration option, which adds support for a new early_param which gets checked in arch/x86/kernel/setup_{32,64}.c:setup_arch() to decide wether OHCI-1394 FireWire controllers should be initialized and enabled for physical DMA access to allow remote debugging of early problems like issues ACPI or other subsystems which are executed very early. If the config option is not enabled, no code is changed, and if the boot paramenter is not given, no new code is executed, and independent of that, all new code is freed after boot, so the config option can be even enabled in standard, non-debug kernels. With specialized tools, it is then possible to get debugging information from machines which have no serial ports (notebooks) such as the printk buffer contents, or any data which can be referenced from global pointers, if it is stored below the 4GB limit and even memory dumps of of the physical RAM region below the 4GB limit can be taken without any cooperation from the CPU of the host, so the machine can be crashed early, it does not matter. In the extreme, even kernel debuggers can be accessed in this way. I wrote a small kgdb module and an accompanying gdb stub for FireWire which allows to gdb to talk to kgdb using remote remory reads and writes over FireWire. An version of the gdb stub fore FireWire is able to read all global data from a system which is running a a normal kernel without any kernel debugger, without any interruption or support of the system's CPU. That way, e.g. the task struct and so on can be read and even manipulated when the physical DMA access is granted. A HOWTO is included in this patch, in Documentation/debugging-via-ohci1394.txt and I've put a copy online at ftp://ftp.suse.de/private/bk/firewire/docs/debugging-via-ohci1394.txt It also has links to all the tools which are available to make use of it another copy of it is online at: ftp://ftp.suse.de/private/bk/firewire/kernel/ohci1394_dma_early-v2.diff Signed-Off-By: Bernhard Kaindl <bk@suse.de> Tested-By: Thomas Renninger <trenn@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:11 +01:00
Ingo Molnar	12d6f21eac	x86: do not PSE on CONFIG_DEBUG_PAGEALLOC=y get more testing of the c_p_a() code done by not turning off PSE on DEBUG_PAGEALLOC. this simplifies the early pagetable setup code, and tests the largepage-splitup code quite heavily. In the end, all the largepages will be split up pretty quickly, so there's no difference to how DEBUG_PAGEALLOC worked before. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:58 +01:00
Jeremy Fitzhardinge	a5a19c63f4	x86: demacro asm-x86/pgalloc_32.h Convert macros into inline functions, for better type-checking. This patch required a little bit of fiddling with headers in order to make __(pte\|pmd)_free_tlb inline rather than macros. asm-generic/tlb.h includes asm/pgalloc.h, though it doesn't directly use any pgalloc definitions. I removed this include to avoid an include cycle, but it may cause secondary compile failures by things depending on the indirect inclusion; arch/x86/mm/hugetlbpage.c was one such place; there may be others. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:39 +01:00
Florian Fainelli	0acf8e3447	pci: add PCI identifiers for the RDC devices This patch defines the PCI identifiers found in the RDC R-321x System-on-Chip. Signed-off-by: Florian Fainelli <florian.fainelli@telecomint.eu> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:36 +01:00
Andi Kleen	03252919b7	x86: print which shared library/executable faulted in segfault etc. messages v3 They now look like: hal-resmgr[13791]: segfault at 3c rip 2b9c8caec182 rsp 7fff1e825d30 error 4 in libacl.so.1.1.0[2b9c8caea000+6000] This makes it easier to pinpoint bugs to specific libraries. And printing the offset into a mapping also always allows to find the correct fault point in a library even with randomized mappings. Previously there was no way to actually find the correct code address inside the randomized mapping. Relies on earlier patch to shorten the printk formats. They are often now longer than 80 characters, but I think that's worth it. [includes fix from Eric Dumazet to check d_path error value] Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:18 +01:00
Andi Kleen	ca74a6f84e	x86: optimize lock prefix switching to run less frequently On VMs implemented using JITs that cache translated code changing the lock prefixes is a quite costly operation that forces the JIT to throw away and retranslate a lot of code. Previously a SMP kernel would rewrite the locks once for each CPU which is quite unnecessary. This patch changes the code to never switch at boot in the normal case (SMP kernel booting with >1 CPU) or only once for SMP kernel on UP. This makes a significant difference in boot up performance on AMD SimNow! Also I expect it to be a little faster on native systems too because a smp switch does a lot of text_poke()s which each synchronize the pipeline. v1->v2: Rename max_cpus v1->v2: Fix off by one in UP check (Thomas Gleixner) Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:17 +01:00
John Reiser	6b8be6df7f	x86: add ENDPROC() markers The ENDPROCs() were not used everywhere. Some code used just END() instead, while other code used nothing. um/sys-i386/checksum.S didn't #include <linux/linkage.h> . I also got confused because gcc puts the .type near the ENTRY, while ENDPROC puts it on the opposite end. Signed off by: John Reiser <jreiser@BitWagon.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:13 +01:00
Ingo Molnar	076f9776f5	x86: make early printk selectable on 64-bit as well Enable CONFIG_EMBEDDED to select CONFIG_EARLY_PRINTK on 64-bit as well. saves ~2K: text data bss dec hex filename 7290283 3672091 1907848 12870222 c4624e vmlinux.before 7288373 3671795 1907848 12868016 c459b0 vmlinux.after Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:06 +01:00
Ingo Molnar	d50efc6c40	x86: fix UML and -regparm=3 introduce the "asmregparm" calling convention: for functions implemented in assembly with a fixed regparm input parameters calling convention. mark the semaphore and rwsem slowpath functions with that. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:00 +01:00
Ananth N Mavinakayanahalli	8c1c935642	x86: kprobes: add kprobes smoke tests that run on boot Here is a quick and naive smoke test for kprobes. This is intended to just verify if some unrelated change broke the *probes subsystem. It is self contained, architecture agnostic and isn't of any great use by itself. This needs to be built in the kernel and runs a basic set of tests to verify if kprobes, jprobes and kretprobes run fine on the kernel. In case of an error, it'll print out a message with a "BUG" prefix. This is a start; we intend to add more tests to this bucket over time. Thanks to Jim Keniston and Masami Hiramatsu for comments and suggestions. Tested on x86 (32/64) and powerpc. Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Acked-by: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:32:53 +01:00
travis@sgi.com	5280e004fc	percpu: move arch XX_PER_CPU_XX definitions into linux/percpu.h - Special consideration for IA64: Add the ability to specify arch specific per cpu flags - remove .data.percpu attribute from DEFINE_PER_CPU for non-smp case. The arch definitions are all the same. So move them into linux/percpu.h. We cannot move DECLARE_PER_CPU since some include files just include asm/percpu.h to avoid include recursion problems. Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:32:52 +01:00
Jeremy Fitzhardinge	74ef649fe8	x86: add _AT() macro to conditionally cast # HG changeset patch # User Jeremy Fitzhardinge <jeremy@xensource.com> # Date 1199317452 28800 # Node ID f7e7db3facd9406545103164f9be8f9ba1a2b549 # Parent 4d9a413a0f4c1d98dbea704f0366457b5117045d x86: add _AT() macro to conditionally cast Define _AT(type, value) to conditionally cast a value when compiling C code, but not when used in assembler. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:32:42 +01:00
Roland McGrath	bb61682b3f	x86: x86 core dump TLS This makes ELF core dumps of 32-bit processes include a new note type NT_386_TLS (0x200) giving the contents of the TLS slots in struct user_desc format. This lets post mortem examination figure out what the segment registers mean like the debugger does with get_thread_area on a live process. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:31:56 +01:00
Roland McGrath	c269f19617	x86: compat_sys_ptrace This adds a generic definition of compat_sys_ptrace that calls compat_arch_ptrace, parallel to sys_ptrace/arch_ptrace. Some machines needing this already define a function by that name. The new generic function is defined only on machines that put #define __ARCH_WANT_COMPAT_SYS_PTRACE into asm/ptrace.h. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:31:48 +01:00
Roland McGrath	032d82d906	x86: compat_ptrace_request This adds a compat_ptrace_request that is the analogue of ptrace_request for the things that 32-on-64 ptrace implementations can share in common. So far there are just a couple of requests handled generically. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:31:47 +01:00
Roland McGrath	5bde4d1817	x86: user_regset user-copy helpers This defines two new inlines in linux/regset.h, for use in arch_ptrace implementations and the like. These provide simplified wrappers for using the user_regset interfaces to copy thread regset data into the caller's user-space memory. The inlines are trivial, but make the common uses in places such as ptrace implementation much more concise, easier to read, and less prone to code-copying errors. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:31:47 +01:00
Roland McGrath	bae3f7c39d	x86: user_regset helpers This adds some inlines to linux/regset.h intended for arch code to use in its user_regset get and set functions. These make it pretty easy to deal with the interface's optional kernel-space or user-space pointers and its generalized access to a part of the register data at a time. In simple cases where the internal data structure matches the exported layout (core dump format), a get function can be nothing but a call to user_regset_copyout, and a set function a call to user_regset_copyin. In other cases the exported layout is usually made up of a few pieces each stored contiguously in a different internal data structure. These helpers make it straightforward to write a get or set function by processing each contiguous chunk of the data in order. The start_pos and end_pos arguments are always constants, so these inlines collapse to a small amount of code. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:31:45 +01:00
Roland McGrath	bdf88217b7	x86: user_regset header The new header <linux/regset.h> defines the types struct user_regset and struct user_regset_view, with some associated declarations. This new set of interfaces will become the standard way for arch code to expose user-mode machine-specific state. A single set of entry points into arch code can do all the low-level work in one place to fill the needs of core dumps, ptrace, and any other user-mode debugging facilities that might come along in the future. For existing arch code to adapt to the user_regset interfaces, each arch can work from the code it already has to support core files and ptrace. The formats you want for user_regset are the core file formats. The only wrinkle in adapting old ptrace implementation code as user_regset get and set functions is that these functions can be called on current as well as on another task_struct that is stopped and switched out as for ptrace. For some kinds of machine state, you may have to load it directly from CPU registers or otherwise differently for current than for another thread. (Your core dump support already handles this in elf_core_copy_regs for current and elf_core_copy_task_regs for other tasks, so just check there.) The set function should also be made to work on current in case that entails some special cases, though this was never required before for ptrace. Adding this flexibility covers the arch needs to open the door to more sophisticated new debugging facilities that don't always need to context-switch to do every little thing. The copyin/copyout helper functions (in a later patch) relieve the arch code of most of the cumbersome details of the flexible get/set interfaces. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:31:44 +01:00
Jan Beulich	cae4595764	x86: make __{save,restore}_processor_state static .. allowing to remove their declarations from a global include file (the symbols don't exist for anything but x86). Likewise for 64-bits' fix_processor_context(), just that that one was properly declared in an arch-specific header. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:31:23 +01:00
Nick Piggin	95c354fe9f	spinlock: lockbreak cleanup The break_lock data structure and code for spinlocks is quite nasty. Not only does it double the size of a spinlock but it changes locking to a potentially less optimal trylock. Put all of that under CONFIG_GENERIC_LOCKBREAK, and introduce a __raw_spin_is_contended that uses the lock data itself to determine whether there are waiters on the lock, to be used if CONFIG_GENERIC_LOCKBREAK is not set. Rename need_lockbreak to spin_needbreak, make it use spin_is_contended to decouple it from the spinlock implementation, and make it typesafe (rwlocks do not have any need_lockbreak sites -- why do they even get bloated up with that break_lock then?). Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:31:20 +01:00
Ingo Molnar	2355188570	x86: avoid build warning fix this build warning: include/asm/topology_32.h: In function 'node_to_first_cpu': include/asm/topology_32.h:66: warning: unused variable 'mask' Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:31:10 +01:00
Jeremy Fitzhardinge	5548fecdff	x86: clean up bitops-related warnings Add casts to appropriate places to silence spurious bitops warnings. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:30:55 +01:00
Roland McGrath	5b88abbf77	ptrace: generic PTRACE_SINGLEBLOCK This makes ptrace_request handle PTRACE_SINGLEBLOCK along with PTRACE_CONT et al. The new generic code makes use of the arch_has_block_step macro and generic entry points on machines that define them. [ mingo@elte.hu: bugfix ] Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:30:53 +01:00
Roland McGrath	dc802c2d2e	ptrace: arch_has_block_step This defines the new macro arch_has_block_step() in linux/ptrace.h, a default for when asm/ptrace.h does not define it. This is the analog of arch_has_single_step() for step-until-branch on machines that have it. It declares the new user_enable_block_step function, which goes with the existing user_enable_single_step and user_disable_single_step. This is not used yet, but paves the way to harmonize on this interface for the arch-specific calls on all machines. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:30:53 +01:00
Roland McGrath	fb7fa8f174	ptrace: arch_has_single_step This defines the new macro arch_has_single_step() in linux/ptrace.h, a default for when asm/ptrace.h does not define it. It declares the new user_enable_single_step and user_disable_single_step functions. This is not used yet, but paves the way to harmonize on this interface for the arch-specific calls on all machines. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:30:47 +01:00
Bernhard Walle	c9cce83dd1	x86: remove extern declarations for code, data, bss resources This patch removes the extern struct resource declarations for data_resource, code_resource and bss_resource on x86 and declares that three structures as static as done on other architectures like IA64. On i386, these structures are moved to setup_32.c (from e820_32.c) because that's code that is not specific to e820 and also required on EFI systems. That makes the "extern" reference superfluous. On x86_64, data_resource, code_resource and bss_resource are passed to e820_reserve_resources() as arguments just as done on i386 and IA64. That also avoids the "extern" reference and it's possible to make it static. Signed-off-by: Bernhard Walle <bwalle@suse.de> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:30:32 +01:00
Thomas Gleixner	02456c708e	x86: nuke a ton of dead hpet code No users, just ballast Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:30:27 +01:00
Thomas Gleixner	70a2002563	x86: move pmtmr related declarations Move more stuff out of proto.h Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:30:18 +01:00
Thomas Gleixner	c202f298de	x86: clean up arch/x86/ia32/sys_ia32.c White space and coding style clenaup. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:30:08 +01:00
Venki Pallipadi	6378ddb592	time: track accurate idle time with tick_sched.idle_sleeptime Current idle time in kstat is based on jiffies and is coarse grained. tick_sched.idle_sleeptime is making some attempt to keep track of idle time in a fine grained manner. But, it is not handling the time spent in interrupts fully. Make tick_sched.idle_sleeptime accurate with respect to time spent on handling interrupts and also add tick_sched.idle_lastupdate, which keeps track of last time when idle_sleeptime was updated. This statistics will be crucial for cpufreq-ondemand governor, which can shed some conservative gaurd band that is uses today while setting the frequency. The ondemand changes that uses the exact idle time is coming soon. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:30:04 +01:00
Balaji Rao	e3f37a54f6	x86: assign IRQs to HPET timers The userspace API for the HPET (see Documentation/hpet.txt) did not work. The HPET_IE_ON ioctl was failing as there was no IRQ assigned to the timer device. This patch fixes it by allocating IRQs to timer blocks in the HPET. arch/x86/kernel/hpet.c \| 13 +++++-------- drivers/char/hpet.c \| 45 ++++++++++++++++++++++++++++++++++++++------- include/linux/hpet.h \| 2 +- 3 files changed, 44 insertions(+), 16 deletions(-) Signed-off-by: Balaji Rao <balajirrao@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:30:03 +01:00
Thomas Gleixner	4713e22ce8	clocksource: add unregister function to disable unusable clocksources On x86 the PIT might become an unusable clocksource. Add an unregister function to provide a possibilty to remove the PIT from the list of available clock sources. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:30:02 +01:00
Atsushi Nemoto	1d76c26228	clocksource: make CLOCKSOURCE_MASK bullet-proof Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:30:01 +01:00
Pavel Machek	a6fa8e5a61	time: clean hungarian notation from timers Clean up hungarian notation from timer code. Signed-off-by: Pavel Machek <pavel@suse.cz> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:30:00 +01:00
Benny Halevy	99fadcd764	nfs: convert NFS_*(inode) helpers to static inline Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:06:11 -05:00
Benny Halevy	3a10c30acc	nfs: obliterate NFS_FLAGS macro use NFS_I(inode)->flags instead Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:06:11 -05:00
Chuck Lever	f1ec08cb94	SUNRPC: Use appropriate argument types in rpcb client Clean up: Follow recommendations of Chapter 5 of Documentation/CodingStyle and use "u32" instead of "__u32" for types in definitions that are not shared with user space. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:06:09 -05:00
Chuck Lever	c0e07cb68d	NFS: NFS version number is unsigned RPC protocol version numbers are unsigned. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:06:08 -05:00
Chuck Lever	883bb163f8	NLM: Introduce an arguments structure for nlmclnt_init() Clean up: pass 5 arguments to nlmclnt_init() in a structure similar to the new nfs_client_initdata structure. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2008-01-30 02:06:07 -05:00
Chuck Lever	1093a60ef3	NLM/NFS: Use cached nlm_host when calling nlmclnt_proc() Now that each NFS mount point caches its own nlm_host structure, it can be passed to nlmclnt_proc() for each lock request. By pinning an nlm_host for each mount point, we trade the overhead of looking up or creating a fresh nlm_host struct during every NLM procedure call for a little extra memory. We also restrict the nlmclnt_proc symbol to limit the use of this call to in-tree modules. Note that nlm_lookup_host() (just removed from the client's per-request NLM processing) could also trigger an nlm_host garbage collection. Now client-side nlm_host garbage collection occurs only during NFS mount processing. Since the NFS client now holds a reference on these nlm_host structures, they wouldn't have been affected by garbage collection anyway. Given that nlm_lookup_host() reorders the global nlm_host chain after every successful lookup, and that a garbage collection could be triggered during the call, we've removed a significant amount of per-NLM-request CPU processing overhead. Sidebar: there are only a few remaining references to the internals of NFS inodes in the client-side NLM code. The only references I found are related to extracting or comparing the inode's file handle via NFS_FH(). One is in nlmclnt_grant(); the other is in nlmclnt_setlockargs(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:06:07 -05:00
Chuck Lever	9289e7f91a	NFS: Invoke nlmclnt_init during NFS mount processing Cache an appropriate nlm_host structure in the NFS client's mount point metadata for later use. Note that there is no need to set NFS_MOUNT_NONLM in the error case -- if nfs_start_lockd() returns a non-zero value, its callers ensure that the mount request fails outright. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:06:07 -05:00
Chuck Lever	52c4044d00	NLM: Introduce external nlm_host set-up and tear-down functions We would like to remove the per-lock-operation nlm_lookup_host() call from nlmclnt_proc(). The new architecture pins an nlm_host structure to each NFS client superblock that has the "lock" mount option set. The NFS client passes in the pinned nlm_host structure during each call to nlmclnt_proc(). NFS client unmount processing "puts" the nlm_host so it can be garbage- collected later. This patch introduces externally callable NLM functions that handle mount-time nlm_host set up and tear-down. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:06:06 -05:00
Chuck Lever	b454ae9060	SUNRPC: fewer conditionals in the format_ip_address routines Clean up: have the set up routines explicitly pass the strings to be used for the transport name and NETID. This removes a number of conditionals and dependencies on rpc_xprt.prot, which is overloaded. Tighten up type checking on the address_strings array while we're at it. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:06:04 -05:00
Trond Myklebust	59dca3b28c	NFS: Fix the 'proto=' mount option Currently, if you have a server mounted using networking protocol, you cannot specify a different value using the 'proto=' option on another mountpoint. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:06:00 -05:00
Trond Myklebust	331702337f	NFS: Support per-mountpoint timeout parameters. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:59 -05:00
Trond Myklebust	ba7392bb37	SUNRPC: Add support for per-client timeout values In order to be able to support setting the timeo and retrans parameters on a per-mountpoint basis, we move the rpc_timeout structure into the rpc_clnt. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:59 -05:00
Trond Myklebust	2881ae74e6	SUNRPC: Clean up the transport timeout initialisation Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:58 -05:00
Trond Myklebust	69dd716c5f	NFSv4: Add socket proto argument to setclientid Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:58 -05:00
Chuck Lever	6e4cffd7b2	NFS: Expand server address storage in nfs_client struct Prepare for managing larger addresses in the NFS client by widening the nfs_client struct's cl_addr field. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net> (Modified to work with the new parameters for nfs_alloc_client) Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:54 -05:00
Chuck Lever	4392f25922	NFS: Increase size of cl_ipaddr field to hold IPv6 addresses The nfs_client's cl_ipaddr field needs to be larger to hold strings that represent IPv6 addresses. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:51 -05:00
Chuck Lever	cc38bac3a0	NFS: Ensure NFSv4 SETCLIENTID send buffer is large enough Ensure that the RPC buffer size specified for NFSv4 SETCLIENTID procedures matches what we are encoding into the buffer. See the definition of struct nfs4_setclientid {} and the encode_setclientid() function. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:51 -05:00
Chuck Lever	0fb2b7e945	SUNRPC: Move universal address definitions to global header Universal addresses are defined in RFC 1833 and clarified in RFC 3530. We need to use them in several places in the NFS and RPC clients, so move the relevant definition and block comment to an appropriate global include file. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:50 -05:00
Trond Myklebust	40c553193d	NFS: Remove the redundant nfs_client->cl_nfsversion We can get the same information from the rpc_ops structure instead. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:49 -05:00
Trond Myklebust	bfc69a4566	NFS: define a function to update nfsi->cache_change_attribute Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:47 -05:00
Trond Myklebust	c087567d3f	SUNRPC: Remove the obsolete RPC_WAITQ macro Now that we've killed off all the users. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:41 -05:00
Trond Myklebust	47fe064831	SUNRPC: Unexport rpc_init_task() and rpc_execute() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:40 -05:00
Trond Myklebust	e8f5d77c80	SUNRPC: allow the caller of rpc_run_task to preallocate the struct rpc_task Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:38 -05:00
Trond Myklebust	b5627943ab	SUNRPC: Remove the now unused function rpc_call_setup() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:35 -05:00
Trond Myklebust	bdc7f021f3	NFS: Clean up the (commit\|read\|write)_setup() callback routines Move the common code for setting up the nfs_write_data and nfs_read_data structures into fs/nfs/read.c, fs/nfs/write.c and fs/nfs/direct.c. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:32 -05:00
Trond Myklebust	77de2c590e	SUNRPC: Add a helper rpc_call_start() that initialises task->tk_action Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:31 -05:00
Trond Myklebust	3ff7576dda	SUNRPC: Clean up the initialisation of priority queue scheduling info. We want the default scheduling priority (priority == 0) to remain RPC_PRIORITY_NORMAL. Also ensure that the priority wait queue scheduling is per process id instead of sometimes being per thread, and sometimes being per inode. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:30 -05:00
Trond Myklebust	c970aa85e7	SUNRPC: Clean up rpc_run_task Make it use the new task initialiser structure instead of acting as a wrapper. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:30 -05:00
Trond Myklebust	84115e1cd4	SUNRPC: Cleanup of rpc_task initialisation Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:30 -05:00
Trond Myklebust	62da3b2488	SUNRPC: Rename xprt_disconnect() xprt_disconnect() should really only be called when the transport shutdown is completed, and it is time to wake up any pending tasks. Rename it to xprt_disconnect_done() in order to reflect the semantical change. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:27 -05:00
Trond Myklebust	3b948ae5be	SUNRPC: Allow the client to detect if the TCP connection is closed Add an xprt->state bit to enable the TCP ->state_change() method to signal whether or not the TCP connection is in the process of closing down. This will to be used by the reconnection logic in a separate patch. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:25 -05:00
Trond Myklebust	66af1e5585	SUNRPC: Fix a race in xs_tcp_state_change() When scheduling the autoclose RPC call, we want to ensure that we don't race against the test_bit() call in xprt_clear_locked(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:24 -05:00
Steve Dickson	ef818a28fa	NFS: Stop sillyname renames and unmounts from racing Added an active/deactive mechanism to the nfs_server structure allowing async operations to hold off umount until the operations are done. Signed-off-by: Steve Dickson <steved@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:24 -05:00
Trond Myklebust	acee478afc	NFS: Clean up the write request locking. Ensure that we set/clear NFS_PAGE_TAG_LOCKED when the nfs_page is hashed. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:24 -05:00
Paul Moore	13541b3ada	NetLabel: Add auditing to the static labeling mechanism This patch adds auditing support to the NetLabel static labeling mechanism. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-01-30 08:17:29 +11:00
Paul Moore	d621d35e57	SELinux: Enable dynamic enable/disable of the network access checks This patch introduces a mechanism for checking when labeled IPsec or SECMARK are in use by keeping introducing a configuration reference counter for each subsystem. In the case of labeled IPsec, whenever a labeled SA or SPD entry is created the labeled IPsec/XFRM reference count is increased and when the entry is removed it is decreased. In the case of SECMARK, when a SECMARK target is created the reference count is increased and later decreased when the target is removed. These reference counters allow SELinux to quickly determine if either of these subsystems are enabled. NetLabel already has a similar mechanism which provides the netlbl_enabled() function. This patch also renames the selinux_relabel_packet_permission() function to selinux_secmark_relabel_packet_permission() as the original name and description were misleading in that they referenced a single packet label which is not the case. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-01-30 08:17:26 +11:00
Martin K. Petersen	7da975a231	Fix blktrace compile warning request_queue_t is deprecated Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-01-29 21:55:15 +01:00
Jens Axboe	023ccde109	block: fix warning on compile with CONFIG_BLOCK struct io_context was not defined, just add an empty forward decl. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-01-29 21:55:14 +01:00
Linus Torvalds	0ba6c33bcd	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.25 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.25: (1470 commits) [IPV6] ADDRLABEL: Fix double free on label deletion. [PPP]: Sparse warning fixes. [IPV4] fib_trie: remove unneeded NULL check [IPV4] fib_trie: More whitespace cleanup. [NET_SCHED]: Use nla_policy for attribute validation in ematches [NET_SCHED]: Use nla_policy for attribute validation in actions [NET_SCHED]: Use nla_policy for attribute validation in classifiers [NET_SCHED]: Use nla_policy for attribute validation in packet schedulers [NET_SCHED]: sch_api: introduce constant for rate table size [NET_SCHED]: Use typeful attribute parsing helpers [NET_SCHED]: Use typeful attribute construction helpers [NET_SCHED]: Use NLA_PUT_STRING for string dumping [NET_SCHED]: Use nla_nest_start/nla_nest_end [NET_SCHED]: Propagate nla_parse return value [NET_SCHED]: act_api: use PTR_ERR in tcf_action_init/tcf_action_get [NET_SCHED]: act_api: use nlmsg_parse [NET_SCHED]: act_api: fix netlink API conversion bug [NET_SCHED]: sch_netem: use nla_parse_nested_compat [NET_SCHED]: sch_atm: fix format string warning [NETNS]: Add namespace for ICMP replying code. ...	2008-01-29 22:54:01 +11:00
Linus Torvalds	5ea293a904	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild * git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild: (79 commits) Remove references to "make dep" kconfig: document use of HAVE_* Introduce new section reference annotations tags: __ref, __refdata, __refconst kbuild: warn about ld added unique sections kbuild: add verbose option to Section mismatch reporting in modpost kconfig: tristate choices with mixed tristate and boolean values asm-generic/vmlix.lds.h: simplify __mem{init,exit}* dependencies remove __attribute_used__ kbuild: support ARCH=x86 in buildtar kconfig: remove "enable" kbuild: simplified warning report in modpost kbuild: introduce a few helpers in modpost kbuild: use simpler section mismatch warnings in modpost kbuild: link vmlinux.o before kallsyms passes kbuild: introduce new option to enhance section mismatch analysis Use separate sections for __dev/__cpu/__mem code/data compiler.h: introduce __section() all archs: consolidate init and exit sections in vmlinux.lds.h kbuild: check section names consistently in modpost kbuild: introduce blacklisting in modpost ...	2008-01-29 22:46:14 +11:00
Linus Torvalds	03bc26cfef	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: Module: check to see if we have a built in module with the same name module: add module taint on ndiswrapper module: fix the module name length in param_sysfs_builtin module: make module_address_lookup safe module: better OOPS and lockdep coverage for loading modules module: Fix gratuitous sprintf in module.c module: wait for dependent modules doing init. module: Don't report discarded init pages as kernel text.	2008-01-29 22:45:39 +11:00
Kyungmin Park	e71f04fc92	[MTD] [OneNAND] Get correct density from device ID Use the higher bits for other purpose. Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>	2008-01-29 17:11:38 +09:00
Rusty Russell	6dd06c9fbe	module: make module_address_lookup safe module_address_lookup releases preemption then returns a pointer into the module space. The only user (kallsyms) copies the result, so just do that under the preempt disable. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-01-29 17:13:23 +11:00
Mingming Cao	7b7510662f	jbd2: add lockdep support Ported from similar patch for the jbd layer. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-01-28 23:58:27 -05:00
Alex Tomas	c9de560ded	ext4: Add multi block allocator for ext4 Signed-off-by: Alex Tomas <alex@clusterfs.com> Signed-off-by: Andreas Dilger <adilger@clusterfs.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-01-29 00:19:52 -05:00
Alex Tomas	1988b51e47	ext4: Add new functions for searching extent tree Add the functions ext4_ext_search_left() and ext4_ext_search_right(), which are used by mballoc during ext4_ext_get_blocks to decided whether to merge extent information. Signed-off-by: Alex Tomas <alex@clusterfs.com> Signed-off-by: Andreas Dilger <adilger@clusterfs.com> Signed-off-by: Johann Lombardi <johann@clusterfs.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-01-28 23:58:27 -05:00
Aneesh Kumar K.V	aa02ad67d9	ext4: Add ext4_find_next_bit() This function is used by the ext4 multi block allocator patches. Also add generic_find_next_le_bit Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2008-01-28 23:58:27 -05:00
Aneesh Kumar K.V	c14c6fd5c5	ext4: Add EXT4_IOC_MIGRATE ioctl The below patch add ioctl for migrating ext3 indirect block mapped inode to ext4 extent mapped inode. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2008-01-28 23:58:26 -05:00
Jean Noel Cordenner	25ec56b518	ext4: Add inode version support in ext4 This patch adds 64-bit inode version support to ext4. The lower 32 bits are stored in the osd1.linux1.l_i_version field while the high 32 bits are stored in the i_version_hi field newly created in the ext4_inode. This field is incremented in case the ext4_inode is large enough. A i_version mount option has been added to enable the feature. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Andreas Dilger <adilger@clusterfs.com> Signed-off-by: Kalpak Shah <kalpak@clusterfs.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Jean Noel Cordenner <jean-noel.cordenner@bull.net>	2008-01-28 23:58:27 -05:00
Jean Noel Cordenner	7a224228ed	vfs: Add 64 bit i_version support The i_version field of the inode is changed to be a 64-bit counter that is set on every inode creation and that is incremented every time the inode data is modified (similarly to the "ctime" time-stamp). The aim is to fulfill a NFSv4 requirement for rfc3530. This first part concerns the vfs, it converts the 32-bit i_version in the generic inode to a 64-bit, a flag is added in the super block in order to check if the feature is enabled and the i_version is incremented in the vfs. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Jean Noel Cordenner <jean-noel.cordenner@bull.net> Signed-off-by: Kalpak Shah <kalpak@clusterfs.com>	2008-01-28 23:58:27 -05:00
Girish Shilamkar	818d276ceb	ext4: Add the journal checksum feature The journal checksum feature adds two new flags i.e JBD2_FEATURE_INCOMPAT_ASYNC_COMMIT and JBD2_FEATURE_COMPAT_CHECKSUM. JBD2_FEATURE_CHECKSUM flag indicates that the commit block contains the checksum for the blocks described by the descriptor blocks. Due to checksums, writing of the commit record no longer needs to be synchronous. Now commit record can be sent to disk without waiting for descriptor blocks to be written to disk. This behavior is controlled using JBD2_FEATURE_ASYNC_COMMIT flag. Older kernels/e2fsck should not be able to recover the journal with _ASYNC_COMMIT hence it is made incompat. The commit header has been extended to hold the checksum along with the type of the checksum. For recovery in pass scan checksums are verified to ensure the sanity and completeness(in case of _ASYNC_COMMIT) of every transaction. Signed-off-by: Andreas Dilger <adilger@clusterfs.com> Signed-off-by: Girish Shilamkar <girish@clusterfs.com> Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com>	2008-01-28 23:58:27 -05:00
Johann Lombardi	8e85fb3f30	jbd2: jbd2 stats through procfs The patch below updates the jbd stats patch to 2.6.20/jbd2. The initial patch was posted by Alex Tomas in December 2005 (http://marc.info/?l=linux-ext4&m=113538565128617&w=2). It provides statistics via procfs such as transaction lifetime and size. Sometimes, investigating performance problems, i find useful to have stats from jbd about transaction's lifetime, size, etc. here is a patch for review and inclusion probably. for example, stats after creation of 3M files in htree directory: [root@bob ~]# cat /proc/fs/jbd/sda/history R/C tid wait run lock flush log hndls block inlog ctime write drop close R 261 8260 2720 0 0 750 9892 8170 8187 C 259 750 0 4885 1 R 262 20 2200 10 0 770 9836 8170 8187 R 263 30 2200 10 0 3070 9812 8170 8187 R 264 0 5000 10 0 1340 0 0 0 C 261 8240 3212 4957 0 R 265 8260 1470 0 0 4640 9854 8170 8187 R 266 0 5000 10 0 1460 0 0 0 C 262 8210 2989 4868 0 R 267 8230 1490 10 0 4440 9875 8171 8188 R 268 0 5000 10 0 1260 0 0 0 C 263 7710 2937 4908 0 R 269 7730 1470 10 0 3330 9841 8170 8187 R 270 0 5000 10 0 830 0 0 0 C 265 8140 3234 4898 0 C 267 720 0 4849 1 R 271 8630 2740 20 0 740 9819 8170 8187 C 269 800 0 4214 1 R 272 40 2170 10 0 830 9716 8170 8187 R 273 40 2280 0 0 3530 9799 8170 8187 R 274 0 5000 10 0 990 0 0 0 where, R - line for transaction's life from T_RUNNING to T_FINISHED C - line for transaction's checkpointing tid - transaction's id wait - for how long we were waiting for new transaction to start (the longest period journal_start() took in this transaction) run - real transaction's lifetime (from T_RUNNING to T_LOCKED lock - how long we were waiting for all handles to close (time the transaction was in T_LOCKED) flush - how long it took to flush all data (data=ordered) log - how long it took to write the transaction to the log hndls - how many handles got to the transaction block - how many blocks got to the transaction inlog - how many blocks are written to the log (block + descriptors) ctime - how long it took to checkpoint the transaction write - how many blocks have been written during checkpointing drop - how many blocks have been dropped during checkpointing close - how many running transactions have been closed to checkpoint this one all times are in msec. [root@bob ~]# cat /proc/fs/jbd/sda/info 280 transaction, each upto 8192 blocks average: 1633ms waiting for transaction 3616ms running transaction 5ms transaction was being locked 1ms flushing data (in ordered mode) 1799ms logging transaction 11781 handles per transaction 5629 blocks per transaction 5641 logged blocks per transaction Signed-off-by: Johann Lombardi <johann.lombardi@bull.net> Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com>	2008-01-28 23:58:27 -05:00
Aneesh Kumar K.V	0e855ac8b1	ext4: Convert truncate_mutex to read write semaphore. We are currently taking the truncate_mutex for every read. This would have performance impact on large CPU configuration. Convert the lock to read write semaphore and take read lock when we are trying to read the file. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2008-01-28 23:58:26 -05:00
Aneesh Kumar K.V	c278bfeceb	ext4: Make ext4_get_blocks_wrap take the truncate_mutex early. When doing a migrate from ext3 to ext4 inode we need to make sure the test for inode type and walking inode data happens inside lock. To make this happen move truncate_mutex early before checking the i_flags. This actually should enable us to remove the verify_chain(). Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2008-01-28 23:58:27 -05:00
Jan Kara	f5a7a6b0d9	jbd2: Fix assertion failure in fs/jbd2/checkpoint.c Before we start committing a transaction, we call __journal_clean_checkpoint_list() to cleanup transaction's written-back buffers. If this call happens to remove all of them (and there were already some buffers), __journal_remove_checkpoint() will decide to free the transaction because it isn't (yet) a committing transaction and soon we fail some assertion - the transaction really isn't ready to be freed :). We change the check in __journal_remove_checkpoint() to free only a transaction in T_FINISHED state. The locking there is subtle though (as everywhere in JBD ;(). We use j_list_lock to protect the check and a subsequent call to __journal_drop_transaction() and do the same in the end of journal_commit_transaction() which is the only place where a transaction can get to T_FINISHED state. Probably I'm too paranoid here and such locking is not really necessary - checkpoint lists are processed only from log_do_checkpoint() where a transaction must be already committed to be processed or from __journal_clean_checkpoint_list() where kjournald itself calls it and thus transaction cannot change state either. Better be safe if something changes in future... Signed-off-by: Jan Kara <jack@suse.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2008-01-28 23:58:27 -05:00
Chris Snook	36df53f4a3	jbd2: Remove printk from J_ASSERT to preserve registers during BUG Signed-off-by: Chris Snook <csnook@redhat.com> Cc: "Stephen C. Tweedie" <sct@redhat.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-01-28 23:58:27 -05:00
Aneesh Kumar K.V	389d1b083c	Add buffer head related helper functions Add buffer head related helper function bh_uptodate_or_lock and bh_submit_read which can be used by file system Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2008-01-28 23:58:26 -05:00
Coly Li	91b51a018d	ext4: sync up block group descriptor with e2fsprogs. This patch extends bg_itable_unused of ext4 group descriptor from 16bit into 32bit. In order to add bg_itable_unused_hi into struct ext4_group_desc, some extra fields which are already introduced into e2fsprogs are also added in for consistency. Signed-off-by: Coly Li <coyli@suse.de> Cc: Andreas Dilger <adilger@clusterfs.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com>	2008-01-28 23:58:27 -05:00
Eric Sandeen	e2b4657453	ext4: store maxbytes for bitmapped files and return EFBIG as appropriate Calculate & store the max offset for bitmapped files, and catch too-large seeks, truncates, and writes in ext4, shortening or rejecting as appropriate. Signed-off-by: Eric Sandeen <sandeen@redhat.com>	2008-01-28 23:58:27 -05:00
Aneesh Kumar K.V	8180a5627d	ext4: Support large files This patch converts ext4_inode i_blocks to represent total blocks occupied by the inode in file system block size. Earlier the variable used to represent this in 512 byte block size. This actually limited the total size of the file. The feature is enabled transparently when we write an inode whose i_blocks cannot be represnted as 512 byte units in a 48 bit variable. inode flag EXT4_HUGE_FILE_FL Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2008-01-28 23:58:27 -05:00
Aneesh Kumar K.V	0fc1b45147	ext4: Add support for 48 bit inode i_blocks. Use the __le16 l_i_reserved1 field of the linux2 struct of ext4_inode to represet the higher 16 bits for i_blocks. With this change max_file size becomes (2*48 -1 ) 512 bytes. We add a RO_COMPAT feature to the super block to indicate that inode have i_blocks represented as a split 48 bits. Super block with this feature set cannot be mounted read write on a kernel with CONFIG_LSF disabled. Super block flag EXT4_FEATURE_RO_COMPAT_HUGE_FILE Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2008-01-28 23:58:26 -05:00
Aneesh Kumar K.V	a48380f769	ext4: Rename i_dir_acl to i_size_high Rename ext4_inode.i_dir_acl to i_size_high drop ext4_inode_info.i_dir_acl as it is not used Rename ext4_inode.i_size to ext4_inode.i_size_lo Add helper function for accessing the ext4_inode combined i_size. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2008-01-28 23:58:27 -05:00
Aneesh Kumar K.V	7973c0c19e	ext4: Rename i_file_acl to i_file_acl_lo Rename i_file_acl to i_file_acl_lo. This helps in finding bugs where we use i_file_acl instead of the combined i_file_acl_lo and i_file_acl_high Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2008-01-28 23:58:27 -05:00
Aneesh Kumar K.V	1d03ec984c	ext4: Fix sparse warnings. Fix sparse warnings related to static functions and local variables. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2008-01-28 23:58:27 -05:00
Aneesh Kumar K.V	99e6f829a8	ext4: Introduce ext4_update__feature Introduce ext4_update__feature and use them instead of opencoding. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2008-01-28 23:58:27 -05:00
Avantika Mathur	fd2d42912f	ext4: add ext4_group_t, and change all group variables to this type. In many places variables for block group are of type int, which limits the maximum number of block groups to 2^31. Each block group can have up to 2^15 blocks, with a 4K block size, and the max filesystem size is limited to 2^31 * (2^15 * 2^12) = 2^58 -- or 256 PB This patch introduces a new type ext4_group_t, of type unsigned long, to represent block group numbers in ext4. All occurrences of block group variables are converted to type ext4_group_t. Signed-off-by: Avantika Mathur <mathur@us.ibm.com>	2008-01-28 23:58:27 -05:00
Aneesh Kumar K.V	725d26d3f0	ext4: Introduce ext4_lblk_t This patch adds a new data type ext4_lblk_t to represent the logical file blocks. This is the preparatory patch to support large files in ext4 The follow up patch with convert the ext4_inode i_blocks to represent the number of blocks in file system block size. This changes makes it possible to have a block number 2**32 -1 which will result in overflow if the block number is represented by signed long. This patch convert all the block number to type ext4_lblk_t which is typedef to __u32 Also remove dead code ext4_ext_walk_space Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com>	2008-01-28 23:58:27 -05:00
Jan Kara	a72d7f834e	ext4: Avoid rec_len overflow with 64KB block size With 64KB blocksize, a directory entry can have size 64KB which does not fit into 16 bits we have for entry lenght. So we store 0xffff instead and convert value when read from / written to disk. The patch also converts some places to use ext4_next_entry() when we are changing them anyway. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mingming Cao <cmm@us.ibm.com>	2008-01-28 23:58:27 -05:00
Takashi Sato	afc7cbca5b	ext4: Support large blocksize up to PAGESIZE This patch set supports large block size(>4k, <=64k) in ext4, just enlarging the block size limit. But it is NOT possible to have 64kB blocksize on ext4 without some changes to the directory handling code. The reason is that an empty 64kB directory block would have a rec_len == (__u16)2^16 == 0, and this would cause an error to be hit in the filesystem. The proposed solution is treat 64k rec_len with a an impossible value like rec_len = 0xffff to handle this. The Patch-set consists of the following 2 patches. [1/2] ext4: enlarge blocksize - Allow blocksize up to pagesize [2/2] ext4: fix rec_len overflow - prevent rec_len from overflow with 64KB blocksize Now on 64k page ppc64 box runs with this patch set we could create a 64k block size ext4dev, and able to handle empty directory block. Signed-off-by: Takashi Sato <sho@tnes.nec.co.jp> Signed-off-by: Mingming Cao <cmm@us.ibm.com>	2008-01-28 23:58:27 -05:00
Patrick McHardy	5feb5e1aaa	[NET_SCHED]: sch_api: introduce constant for rate table size Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:11:21 -08:00
Denis V. Lunev	1ab352768f	[NETNS]: Add namespace parameter to ip_dev_find. in_dev_find() need a namespace to pass it to fib_get_table(), so add an argument. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:11:04 -08:00
Ron Rindjunsky	a8b47ea3c5	mac80211: fixing ieee80211_bar types This patch changes ieee80211_bar control and start_seq_num to match the proper bitwise attribute expected from ieee 802.11 frame Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-01-28 15:10:45 -08:00
Denis V. Lunev	7fee0ca237	[NETNS]: Add netns parameter to inetdev_by_index. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:10:20 -08:00
Michael Buesch	af4b745078	ssb: Add boardflags_hi field to the sprom data structure Add boardflags-high. Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-01-28 15:09:52 -08:00
Michael Wu	f653211197	Add rtl8180 wireless driver This patch adds a mac80211 based wireless driver for the rtl8180 and rtl8185 PCI wireless cards. Also included are some rtl8187 changes required due to the relationship between that driver and this one. Michael Wu is primarily responsible for the initial driver and rtl8185 support. Andreas Merello provided the additional rtl8180 support. Thanks to Jukka Ruohonen for the donating a rtl8185 card! It was very helpful for the rtl8225z2 code. The Signed-off-by information below is collected from the individual patches submitted to wireless-2.6 before merging this driver upstream. Signed-off-by: Andrea Merello <andreamrl@tiscali.it> Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Pavel Roskin <proski@gnu.org> Signed-off-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-01-28 15:09:35 -08:00
Miguel Botón	961d57c883	ssb: add 'ssb_pcihost_set_power_state' function This patch adds the 'ssb_pcihost_set_power_state' function. This function allows us to set the power state of a PCI device (for example b44 ethernet device). Signed-off-by: Miguel Botón <mboton@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-01-28 15:09:18 -08:00
Michael Buesch	993e1c780b	ssb: Fix PCMCIA lowlevel register access This fixes lowlevel register access for PCMCIA based devices. The patch also adds a temporary workaround for the device mac address. It simply adds generation of a random address. The real SPROM extraction will follow in another patch. The temporary workaround will be removed then, but for now it's OK. Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-01-28 15:09:16 -08:00
Michael Buesch	e861b98d5e	ssb: Fix extraction of values from SPROM This fixes extraction of some values from the SPROM. It mainly fixes extraction of antenna related values, which is needed for another b43 fix sent later. Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-01-28 15:09:14 -08:00
Jan Engelhardt	1e637c74b0	[IPV4]: Enable use of 240/4 address space. This short patch modifies the IPv4 networking to enable use of the 240.0.0.0/4 (aka "class-E") address space as propsed in the internet draft draft-fuller-240space-00.txt. Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:08:44 -08:00
Patrick McHardy	57d3ae847d	[VLAN]: Turn __constant_htons into htons where possible Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:08:32 -08:00
Patrick McHardy	9dfebcc647	[VLAN]: Turn VLAN_DEV_INFO into inline function Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:08:32 -08:00
Patrick McHardy	af30151709	[VLAN]: Simplify vlan unregistration Keep track of the number of VLAN devices in a vlan group. This allows to have the caller sense when the group is going to be destroyed and stop using it, which in turn allows to remove the wrapper around unregister_vlan_dev for the NETDEV_UNREGISTER notifier and avoid iterating over all possible VLAN ids whenever a device in unregistered. Also fix what looks like a use-after-free (but is actually safe since we're holding the RTNL), the real_dev reference should not be dropped while we still use it. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:08:31 -08:00
Patrick McHardy	a5250a3695	[ETHER]: Bring back MAC_FMT The print_mac function is not very suitable for debugging printks in performance critical paths since without ifdefs it will always get called. MAC_FMT can be used with pr_debug without any overhead when debugging is disabled. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:08:27 -08:00
Patrick McHardy	7bd38d778e	[VLAN]: Use dev->stats Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:08:25 -08:00
Patrick McHardy	b7a4a83629	[VLAN]: Kill useless VLAN_NAME define The only user already includes __FUNCTION__ (vlan_proto_init) in the output, which is enough to identify what the message is about. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:08:25 -08:00
Patrick McHardy	740c15d0dd	[VLAN]: Clean up vlan_hdr/vlan_ethhdr structs Fix 3 space indentation and some overly long lines by moving the comments to a kdoc structure description. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:08:24 -08:00
Patrick McHardy	476bcea67f	[VLAN]: Remove unnecessary structure declarations - struct packet_type is not used - struct vlan_group is declared later in the file before the first use - struct net_device is not needed since netdevice.h is included - struct vlan_collection does not exist - struct vlan_dev_info is declared later in the file before the first use Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:08:23 -08:00
Denis V. Lunev	b7c6ba6eb1	[NETNS]: Consolidate kernel netlink socket destruction. Create a specific helper for netlink kernel socket disposal. This just let the code look better and provides a ground for proper disposal inside a namespace. Signed-off-by: Denis V. Lunev <den@openvz.org> Tested-by: Alexey Dobriyan <adobriyan@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:08:07 -08:00
Al Viro	904584018e	annotate the rest of drivers/net/wan Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-28 15:07:58 -08:00
Al Viro	a3edb08311	annotate tun Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-28 15:07:57 -08:00
Larry Finger	d3c319f9c8	ssb: Remove the old, now unused, data structures The old, now unused, data structures and SPROM extraction routines are removed. Signed-off-by: Larry Finger<Larry.Finger@lwfinger.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-01-28 15:04:44 -08:00
Larry Finger	c272ef4403	ssb: Convert to use of the new SPROM structure In disagreement with the SPROM specs, revision 3 devices appear to have moved the MAC address. Change ssb to handle the revision 4 SPROM, which is a different size. This change in size is handled by adding a new variable to the ssb_sprom struct and using it whenever possible. For those routines that do not have access to this structure, a 'u16 size' argument is added. The new PCI_ID for the BCM4328 is also added. Testing of the Revision 4 SPROM, which is used on the BCM4328, was done by Michael Gerdau <mgerdau@tiscali.de>. Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-01-28 15:04:41 -08:00
Larry Finger	ac82fab44f	ssb: Add new SPROM structure while keeping the old The SPROM's for various devices utilizing the Sonics Silicon Backplane come with various revisions. The Revision 2 SPROM inherited the data layout of 1, and Revision 3 inherited the layout of 2. The first instance of Revision 4 has now been found in a BCM4328 wireless LAN card. This device does not inherit any layout from previous versions. Although it was possible to create a data structure that kept all the old layouts, we decided to start fresh, keep only those SPROM variables that are used by the drivers that utilize ssb, and to do the conversion in such a manner that neither compilation or execution will be affected if a bisection lands in the middle of these changes, while keeping the patches as small as possible. In this patch, the sprom structures are changed while maintaining the old ones. Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2008-01-28 15:04:40 -08:00
Francois Romieu	5ac5d61632	r6040: cleanups - whitespaces vs tabs - use 80 cols - use if_mii - use netdev_priv - remove useless cast to void * - PCI device id does not need to be globally available Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>	2008-01-28 15:04:16 -08:00
Jeff Garzik	092427be8c	drivers/net/r6040: fix obvious problems (but more remain) - checkpatch fixes - fix bogus and uninitialized return codes in r6040_start_xmit() - netdev_get_settings() fix obvious locking bug flagged by compiler warning - set DMA consistent mask - remove unnecessary setting of dev->base_addr Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-01-28 15:04:03 -08:00
Eliezer Tamir	a2fbb9ea23	add bnx2x driver for BCM57710 Signed-off-by: Eliezer Tamir <eliezert@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:03:53 -08:00
Patrick McHardy	e37b386c95	[NETFILTER]: nf_conntrack_sctp: remove unused ttag field from conntrack data Spotted by Pablo Neira Ayuso <pablo@netfilter.org>. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:02:38 -08:00
Jan Engelhardt	f72e25a897	[NETFILTER]: Rename ipt_iprange to xt_iprange This patch moves ipt_iprange to xt_iprange, in preparation for adding IPv6 support to xt_iprange. Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:02:27 -08:00
Jan Engelhardt	917b6fbd6e	[NETFILTER]: xt_policy: use the new union nf_inet_addr Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:02:25 -08:00
Jan Engelhardt	17b0d7ef65	[NETFILTER]: xt_mark match, revision 1 Introduces the xt_mark match revision 1. It uses fixed types, eventually obsoleting revision 0 some day (uses nonfixed types). Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:02:23 -08:00
Jan Engelhardt	64eb12f997	[NETFILTER]: xt_conntrack match, revision 1 Introduces the xt_conntrack match revision 1. It uses fixed types, the new nf_inet_addr and comes with IPv6 support, thereby completely superseding xt_state. Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:02:23 -08:00
Jan Engelhardt	2e3075a2c4	[NETFILTER]: Extend nf_inet_addr with in{,6}_addr Extend union nf_inet_addr with struct in_addr and in6_addr. Useful because a lot of in-kernel IPv4 and IPv6 functions use in_addr/in6_addr. Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:02:22 -08:00
Jan Engelhardt	96e3227265	[NETFILTER]: xt_connmark match, revision 1 Introduces the xt_connmark match revision 1. It uses fixed types, eventually obsoleting revision 0 some day (uses nonfixed types). (Unfixed types like "unsigned long" do not play well with mixed user-/kernelspace "bitness", e.g. 32/64, as is common on SPARC64, and need extra compat code.) Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:02:21 -08:00
Jan Engelhardt	e0a812aea5	[NETFILTER]: xt_MARK target, revision 2 Introduces the xt_MARK target revision 2. It uses fixed types, and also uses the more expressive XOR logic. Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:02:21 -08:00
Jan Engelhardt	0dc8c76029	[NETFILTER]: xt_CONNMARK target, revision 1 Introduces the xt_CONNMARK target revision 1. It uses fixed types, and also uses the more expressive XOR logic. Futhermore, it allows to selectively pick bits from both the ctmark and the nfmark in the SAVE and RESTORE operations. Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:02:20 -08:00
Jan Engelhardt	8b6f3f62fe	[NETFILTER]: Annotate start of kernel fields in NF headers Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:02:19 -08:00
Denis V. Lunev	9bd85e3264	[IPV4]: Remove extra argument from arp_ignore. arp_ignore has two arguments: dev & in_dev. dev is used for inet_confirm_addr calling only. inet_confirm_addr, in turn, either gets in_dev from the device passed or iterates over all network devices if the device passed is NULL. It seems logical to directly pass in_dev into inet_confirm_addr. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:02:12 -08:00
David S. Miller	3f4afb6443	[XFRM]: Fix struct xfrm_algo code formatting. Realign struct members. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:02:01 -08:00
Eric Dumazet	ba749ae98d	[XFRM]: alg_key_len should be unsigned to avoid integer divides alg_key_len is currently defined as 'signed int'. This unfortunatly leads to integer divides in several paths. Converting it to unsigned is safe and saves 208 bytes of text on i386. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:02:00 -08:00
Denis V. Lunev	e5d69b9f4a	[ATM]: Oops reading net/atm/arp cat /proc/net/atm/arp causes the NULL pointer dereference in the get_proc_net+0xc/0x3a. This happens as proc_get_net believes that the parent proc dir entry contains struct net. Fix this assumption for "net/atm" case. The problem is introduced by the commit c0097b07abf5f92ab135d024dd41bd2aada1512f from Eric W. Biederman/Daniel Lezcano. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:01:36 -08:00
Pavel Emelyanov	b3fd3ffe39	[NETFILTER]: Use the ctl paths instead of hand-made analogue The conntracks subsystem has a similar infrastructure to maintain ctl_paths, but since we already have it on the generic level, I think it's OK to switch to using it. So, basically, this patch just replaces the ctl_table-s with ctl_path-s, nf_register_sysctl_table with register_sysctl_paths() and removes no longer needed code. After this the net/netfilter/nf_sysctl.c file contains the paths only. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:01:11 -08:00
Eric Dumazet	f0b5a0dcf1	[VLAN]: Avoid expensive divides We can avoid divides (as seen with CONFIG_CC_OPTIMIZE_FOR_SIZE=y on x86) changing vlan_group_get_device()/vlan_group_set_device() id parameter from signed to unsigned. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:01:06 -08:00
Ron Rindjunsky	07db218396	mac80211: A-MPDU Rx adding basic functionality This patch adds the basic needed abilities and functions for A-MPDU Rx session changed functions: - ieee80211_sta_process_addba_request - Rx A-MPDU initialization enabled - ieee80211_stop - stops all A-MPDU Rx in case interface goes down added functions: - ieee80211_send_delba - used for sending out Del BA in A-MPDU sessions - ieee80211_sta_stop_rx_BA_session - stopping Rx A-MPDU session - sta_rx_agg_session_timer_expired - stops A-MPDU Rx use if load is too low Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:00:59 -08:00
Li Zefan	6e32814bc8	[CONNECTOR]: Cleanup struct cn_callback_entry - 'cb' is a fake struct member. In a previous patch struct cn_callback was renamed to cn_callback_id, so 'cb' should have been deleted at that time. - 'nls' isn't used and is redundant, we can retrieve this data through cn_callback_entry.pdev->nls. - 'seq' and 'group' should be u32, as they are declared to be u32 in other places. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:00:40 -08:00
Li Zefan	96a899655e	[CONNECTOR]: Cleanup struct cn_queue_dev Struct member netlink_groups is never used, and I don't see how it can be useful. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:00:39 -08:00
Eric Dumazet	571e768202	[LIB] pcounter : unline too big functions Before pushing pcounter to Linus tree, I would like to make some adjustments. Goal is to reduce kernel text size, by unlining too big functions. When a pcounter is bound to a statically defined per_cpu variable, we define two small helpers functions. (No more folding function using the fat for_each_possible_cpu(cpu) ... ) static DEFINE_PER_CPU(int, NAME##_pcounter_values); static void NAME##_pcounter_add(struct pcounter self, int val) { __get_cpu_var(NAME##_pcounter_values) += val; } static int NAME##_pcounter_getval(const struct pcounter self, int cpu) { return per_cpu(NAME##_pcounter_values, cpu); } Fast path is therefore unchanged, while folding/alloc/free is now unlined. This saves 228 bytes on i386 Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:00:35 -08:00
Adrian Bunk	f1862b0ae2	[SHAPER]: The scheduled shaper removal. This patch contains the scheduled removal of the shaper driver. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Alan Cox <alan@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:00:29 -08:00
Chas Williams	fb64c735a5	[ATM]: [br2864] whitespace cleanup Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:00:14 -08:00
Eric Kinzie	097b19a998	[ATM]: [br2864] routed support Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:00:13 -08:00
Kay Sievers	ef39592f78	[ATM]: Convert struct class_device to struct device Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>	2008-01-28 15:00:12 -08:00
Michael Chan	7ffc49a6ee	[ETH]: Combine format_addr() with print_mac(). print_mac() used many most net drivers and format_addr() used by net-sysfs.c are very similar and they can be intergrated. format_addr() is also identically redefined in the qla4xxx iscsi driver. Export a new function sysfs_format_mac() to be used by net-sysfs, qla4xxx and others in the future. Both print_mac() and sysfs_format_mac() call _format_mac_addr() to do the formatting. Changed print_mac() to use unsigned char * to be consistent with net_device struct's dev_addr. Added buffer length overrun checking as suggested by Joe Perches. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:00:05 -08:00
Johannes Berg	fd5b74dcb8	cfg80211/nl80211: implement station attribute retrieval After a station is added to the kernel's structures, userspace has to be able to retrieve statistics about that station, especially whether the station was idle and how much bytes were transferred to and from it. This adds the necessary code to nl80211. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:59:52 -08:00
Johannes Berg	5727ef1b2e	cfg80211/nl80211: station handling This patch adds station handling to cfg80211/nl80211. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:59:51 -08:00
Johannes Berg	ed1b6cc7f8	cfg80211/nl80211: add beacon settings This adds the necessary API to cfg80211/nl80211 to allow changing beaconing settings. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:59:50 -08:00
Johannes Berg	41ade00f21	cfg80211/nl80211: introduce key handling This introduces key handling to cfg80211/nl80211. Default and group keys can be added, changed and removed; sequence counters for each key can be retrieved. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:59:48 -08:00
Masahide NAKAMURA	558f82ef6e	[XFRM]: Define packet dropping statistics. This statistics is shown factor dropped by transformation at /proc/net/xfrm_stat for developer. It is a counter designed from current transformation source code and defined as linux private MIB. See Documentation/networking/xfrm_proc.txt for the detail. Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:59:38 -08:00
Jan Engelhardt	22c2d8bca2	[NETFILTER]: xt_connlimit: use the new union nf_inet_addr Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:59:09 -08:00
Jan Engelhardt	643a2c15a4	[NETFILTER]: Introduce nf_inet_address A few netfilter modules provide their own union of IPv4 and IPv6 address storage. Will unify that in this patch series. (1/4): Rename union nf_conntrack_address to union nf_inet_addr and move it to x_tables.h. Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:59:07 -08:00
Patrick McHardy	051578ccbc	[NETFILTER]: nf_nat: properly use RCU for ip_nat_decode_session We need to use rcu_assign_pointer/rcu_dereference to avoid races. Also remove an obsolete CONFIG_IP_NAT_NEEDED ifdef. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:59:06 -08:00
Patrick McHardy	1e796fda00	[NETFILTER]: constify nf_afinfo Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:59:05 -08:00
Patrick McHardy	90a9ba8dd9	[NETFILTER]: Kill function prototype for non-existing function Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:59:05 -08:00
Patrick McHardy	76aa1ce139	[NETFILTER]: nfnetlink_log: include GID in netlink message Similar to Maciej Soltysiak's ipt_LOG patch, include GID in addition to UID in netlink message. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:59:04 -08:00
Patrick McHardy	f01ffbd6e7	[NETFILTER]: nf_log: move logging stuff to seperate header Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:58:58 -08:00
Pablo Neira Ayuso	37fccd8577	[NETFILTER]: ctnetlink: add support for secmark This patch adds support for James Morris' connsecmark. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:58:52 -08:00
Pablo Neira Ayuso	13eae15a24	[NETFILTER]: ctnetlink: add support for NAT sequence adjustments The combination of NAT and helpers may produce TCP sequence adjustments. In failover setups, this information needs to be replicated in order to achieve a successful recovery of mangled, related connections. This patch is particularly useful for conntrackd, see: http://people.netfilter.org/pablo/conntrack-tools/ Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:58:50 -08:00
Patrick McHardy	d6a2ba07c3	[NETFILTER]: arp_tables: add compat support Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:58:49 -08:00
Patrick McHardy	0495cf957b	[NETFILTER]: arp_tables: use XT_ALIGN Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:58:44 -08:00
Patrick McHardy	06e1374a7e	[NETFILTER]: ip6_tables: use XT_ALIGN Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:58:43 -08:00
Patrick McHardy	3bc3fe5eed	[NETFILTER]: ip6_tables: add compat support Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:58:36 -08:00
Patrick McHardy	b386d9f596	[NETFILTER]: ip_tables: move compat offset calculation to x_tables Its needed by ip6_tables and arp_tables as well. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:58:31 -08:00
Patrick McHardy	73cd598df4	[NETFILTER]: ip_tables: fix compat types Use compat types and compat iterators when dealing with compat entries for clarity. This doesn't actually make a difference for ip_tables, but is needed for ip6_tables and arp_tables. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:58:30 -08:00
Patrick McHardy	89c002d66a	[NETFILTER]: {ip,ip6,arp}_tables: consolidate iterator macros Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:58:29 -08:00
Patrick McHardy	8956695131	[NETFILTER]: x_tables: make xt_compat_match_from_user usable in iterator macros Make xt_compat_match_from_user return an int to make it usable in the *tables iterator macros and kill a now unnecessary wrapper function. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:58:28 -08:00
Dan Williams	374fdfbc67	introduce WEXT scan capabilities Introduce scan capabilities to WEXT so that userspace can do intelligent things with scan behavior such as handling hidden SSIDs more gracefully. If the driver reports a specific scan capability, the driver must respect the options specified in the iw_scan_req structure when handling the SIOCSIWSCAN call, unless it's mode or state does not allow it to do so, in which case it must return an error. This version switches to Dave Kilroy's suggestion of claiming unused padding space for the scan_capa field. Signed-off-by: Dan Williams <dcbw@redhat.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:58:25 -08:00
Joe Perches	2d4d29802f	[IPV4]: Remove unused IPV4TYPE macros Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:58:18 -08:00
Joe Perches	2658fa8031	[IPV4]: Create ipv4_is_<type>(__be32 addr) functions Change IPV4 specific macros LOOPBACK MULTICAST LOCAL_MCAST BADCLASS and ZERONET macros to inline functions ipv4_is_<type>(__be32 addr) Adds type safety and arguably some readability. Changes since last submission: Removed ipv4_addr_octets function Used hex constants Converted recently added rfc3330 macros Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:58:13 -08:00
Pavel Emelyanov	586f121152	[IPV4]: Switch users of ipv4_devconf(_all) to use the pernet one These are scattered over the code, but almost all the "critical" places already have the proper struct net at hand except for snmp proc showing function and routing rtnl handler. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:58:12 -08:00
Gerrit Renker	b4d4f7c70f	[DCCP]: Handle timestamps on Request/Response exchange separately In DCCP, timestamps can occur on packets anytime, CCID3 uses a timestamp(/echo) on the Request/Response exchange. This patch addresses the following situation: * timestamps are recorded on the listening socket; * Responses are sent from dccp_request_sockets; * suppose two connections reach the listening socket with very small time in between: * the first timestamp value gets overwritten by the second connection request. This is not really good, so this patch separates timestamps into * those which are received by the server during the initial handshake (on dccp_request_sock); * those which are received by the client or the client after connection establishment. As before, a timestamp of 0 is regarded as indicating that no (meaningful) timestamp has been received (in addition, a warning message is printed if hosts send 0-valued timestamps). The timestamp-echoing now works as follows: * when a timestamp is present on the initial Request, it is placed into dreq, due to the call to dccp_parse_options in dccp_v{4,6}_conn_request; * when a timestamp is present on the Ack leading from RESPOND => OPEN, it is copied over from the request_sock into the child cocket in dccp_create_openreq_child; * timestamps received on an (established) dccp_sock are treated as before. Since Elapsed Time is measured in hundredths of milliseconds (13.2), the new dccp_timestamp() function is used, as it is expected that the time between receiving the timestamp and sending the timestamp echo will be very small against the wrap-around time. As a byproduct, this allows smaller timestamping-time fields. Furthermore, inserting the Timestamp Echo option has been taken out of the block starting with '!dccp_packet_without_ack()', since Timestamp Echo can be carried on any packet (5.8 and 13.3). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:57:51 -08:00
Gerrit Renker	8b81941248	[DCCP]: Allow to parse options on Request Sockets The option parsing code currently only parses on full sk's. This causes a problem for options sent during the initial handshake (in particular timestamps and feature-negotiation options). Therefore, this patch extends the option parsing code with an additional argument for request_socks: if it is non-NULL, options are parsed on the request socket, otherwise the normal path (parsing on the sk) is used. Subsequent patches, which implement feature negotiation during connection setup, make use of this facility. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:57:50 -08:00
Gerrit Renker	b8599d2070	[DCCP]: Support for server holding timewait state This adds a socket option and signalling support for the case where the server holds timewait state on closing the connection, as described in RFC 4340, 8.3. Since holding timewait state at the server is the non-usual case, it is enabled via a socket option. Documentation for this socket option has been added. The setsockopt statement has been made resilient against different possible cases of expressing boolean `true' values using a suggestion by Ian McDonald. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:57:48 -08:00
Herbert Xu	8b7817f3a9	[IPSEC]: Add ICMP host relookup support RFC 4301 requires us to relookup ICMP traffic that does not match any policies using the reverse of its payload. This patch implements this for ICMP traffic that originates from or terminates on localhost. This is activated on outbound with the new policy flag XFRM_POLICY_ICMP, and on inbound by the new state flag XFRM_STATE_ICMP. On inbound the policy check is now performed by the ICMP protocol so that it can repeat the policy check where necessary. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:57:23 -08:00
Herbert Xu	d5422efe68	[IPSEC]: Added xfrm_decode_session_reverse and xfrmX_policy_check_reverse RFC 4301 requires us to relookup ICMP traffic that does not match any policies using the reverse of its payload. This patch adds the functions xfrm_decode_session_reverse and xfrmX_policy_check_reverse so we can get the reverse flow to perform such a lookup. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:57:22 -08:00
Pavel Emelyanov	01ecfe9ba6	[IPV4]: Cleanup IN_DEV_MFORWARD macro This is essentially IN_DEV_ANDCONF with proper arguments. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:57:08 -08:00
Pavel Emelyanov	b8e1f9b5c3	[NET] sysctl: make sysctl_somaxconn per-namespace Just move the variable on the struct net and adjust its usage. Others sysctls from sys.net.core table are more difficult to virtualize (i.e. make them per-namespace), but I'll look at them as well a bit later. Signed-off-by: Pavel Emelyanov <xemul@oenvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:56:57 -08:00
Patrick McHardy	f4d900a2ca	[NETLINK]: Mark attribute construction exception unlikely Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:56:34 -08:00
Herbert Xu	a59322be07	[UDP]: Only increment counter on first peek/recv The previous move of the the UDP inDatagrams counter caused each peek of the same packet to be counted separately. This may be undesirable. This patch fixes this by adding a bit to sk_buff to record whether this packet has already been seen through skb_recv_datagram. We then only increment the counter when the packet is seen for the first time. The only dodgy part is the fact that skb_recv_datagram doesn't have a good way of returning this new bit of information. So I've added a new function __skb_recv_datagram that does return this and made skb_recv_datagram a wrapper around it. The plan is to eventually replace all uses of skb_recv_datagram with this new function at which time it can be renamed its proper name. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:56:34 -08:00
Herbert Xu	27ab256864	[UDP]: Avoid repeated counting of checksum errors due to peeking Currently it is possible for two processes to peek on the same socket and end up incrementing the error counter twice for the same packet. This patch fixes it by making skb_kill_datagram return whether it succeeded in unlinking the packet and only incrementing the counter if it did. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:56:32 -08:00
Pavel Emelyanov	68dd299bc8	[INET]: Merge sys.net.ipv4.ip_forward and sys.net.ipv4.conf.all.forwarding AFAIS these two entries should do the same thing - change the forwarding state on ipv4_devconf and on all the devices. I propose to merge the handlers together using ctl paths. The inet_forward_change() is static after this and I move it higher to be closer to other "propagation" helpers and to avoid diff making patches based on { and } matching :) i.e. - make them easier to read. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:56:31 -08:00
Pavel Emelyanov	08913681e4	[NET]: Remove the empty net_table I have removed all the entries from this table (core_table, ipv4_table and tr_table), so now we can safely drop it. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:56:29 -08:00
Pavel Emelyanov	36f0bebd98	[TR]: Use ctl paths to register net/token-ring/ table The same thing for token-ring - use ctl paths and get rid of external references on the tr_table. Unfortunately, I couldn't split this patch into cleanup and use-the-paths parts. As a lame excuse I can say, that the cleanup is just moving the tr_table from one file to another - closet to a single variable, that this ctl table tunes. Since the source file becomes empty after the move, I remove it. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:56:28 -08:00
Patrick McHardy	02f014d888	[NETFILTER]: nf_queue: move list_head/skb/id to struct nf_info Move common fields for queue management to struct nf_info and rename it to struct nf_queue_entry. The avoids one allocation/free per packet and simplifies the code a bit. Alternatively we could add some private room at the tail, but since all current users use identical structs this seems easier. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:56:14 -08:00
Patrick McHardy	c01cd429fc	[NETFILTER]: nf_queue: move queueing related functions/struct to seperate header Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:56:10 -08:00
Patrick McHardy	f9d8928f83	[NETFILTER]: nf_queue: remove unused data pointer Remove the data pointer from struct nf_queue_handler. It has never been used and is useless for the only handler that really matters, nfnetlink_queue, since the handler is shared between all instances. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:56:10 -08:00
Patrick McHardy	e3ac529815	[NETFILTER]: nf_queue: make queue_handler const Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:56:09 -08:00
Patrick McHardy	1841a4c7ae	[NETFILTER]: nf_ct_h323: remove ipv6 module dependency nf_conntrack_h323 needs ip6_route_output for the call forwarding filter. Add a ->route function to nf_afinfo and use that to avoid pulling in the ipv6 module. Fix the #ifdef for the IPv6 code while I'm at it - the IPv6 support is only needed when IPv6 conntrack is enabled. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:56:05 -08:00
Patrick McHardy	50c164a81f	[NETFILTER]: x_tables: add rateest match Add rate estimator match. The rate estimator match can match on estimated rates by the RATEEST target. It supports matching on absolute bps/pps values, comparing two rate estimators and matching on the difference between two rate estimators. This is what I use to route outgoing data connections from a FTP server over two lines based on the available bandwidth: # estimate outgoing rates iptables -t mangle -A POSTROUTING -o eth0 -j RATEEST --rateest-name eth0 \ --rateest-interval 250ms \ --rateest-ewma 0.5s iptables -t mangle -A POSTROUTING -o ppp0 -j RATEEST --rateest-name ppp0 \ --rateest-interval 250ms \ --rateest-ewma 0.5s # mark based on available bandwidth iptables -t mangle -A BALANCE -m state --state NEW \ -m helper --helper ftp \ -m rateest --rateest-delta \ --rateest1 eth0 \ --rateest-bps1 2.5mbit \ --rateest-gt \ --rateest2 ppp0 \ --rateest-bps2 2mbit \ -j CONNMARK --set-mark 0x1 iptables -t mangle -A BALANCE -m state --state NEW \ -m helper --helper ftp \ -m rateest --rateest-delta \ --rateest1 ppp0 \ --rateest-bps1 2mbit \ --rateest-gt \ --rateest2 eth0 \ --rateest-bps2 2.5mbit \ -j CONNMARK --set-mark 0x2 iptables -t mangle -A BALANCE -j CONNMARK --restore-mark Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:56:03 -08:00
Patrick McHardy	5859034d7e	[NETFILTER]: x_tables: add RATEEST target Add new rate estimator target (using gen_estimator). In combination with the rateest match (next patch) this can be used for load-based multipath routing. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:56:02 -08:00
Jan Engelhardt	5c350e5a38	[NETFILTER]: IPv6 capable xt_TOS v1 target Extends the xt_DSCP target by xt_TOS v1 to add support for selectively setting and flipping any bit in the IPv4 TOS and IPv6 Priority fields. (ipt_TOS and xt_DSCP only accepted a limited range of possible values.) Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:56:00 -08:00
Jan Engelhardt	f1095ab51d	[NETFILTER]: IPv6 capable xt_tos v1 match Extends the xt_dscp match by xt_tos v1 to add support for selectively matching any bit in the IPv4 TOS and IPv6 Priority fields. (ipt_tos and xt_dscp only accepted a limited range of possible values.) Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:56:00 -08:00
Laszlo Attila Toth	e2cf5ecbea	[NETFILTER]: ipt_addrtype: limit address type checking to an interface Addrtype match has a new revision (1), which lets address type checking limited to the interface the current packet belongs to. Either incoming or outgoing interface can be used depending on the current hook. In the FORWARD hook two maches should be used if both interfaces have to be checked. The new structure is ipt_addrtype_info_v1. Revision 0 lets older userspace programs use the match as earlier. ipt_addrtype_info is used. Signed-off-by: Laszlo Attila Toth <panther@balabit.hu> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:55:56 -08:00
Jan Engelhardt	0265ab44ba	[NETFILTER]: merge ipt_owner/ip6t_owner in xt_owner xt_owner merges ipt_owner and ip6t_owner, and adds a flag to match on socket (non-)existence. Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:55:55 -08:00
Eric Dumazet	259d4e41f3	[NETFILTER]: x_tables: struct xt_table_info diet Instead of using a big array of NR_CPUS entries, we can compute the size needed at runtime, using nr_cpu_ids This should save some ram (especially on David's machines where NR_CPUS=4096 : 32 KB can be saved per table, and 64KB for dynamically allocated ones (because of slab/slub alignements) ) In particular, the 'bootstrap' tables are not any more static (in data section) but on stack as their size is now very small. This also should reduce the size used on stack in compat functions (get_info() declares an automatic variable, that could be bigger than kernel stack size for big NR_CPUS) Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:55:54 -08:00
Sven Schnelle	338e8a7926	[NETFILTER]: x_tables: add TCPOPTSTRIP target Signed-off-by: Sven Schnelle <svens@bitebene.org> Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:55:51 -08:00
Eric W. Biederman	e51b6ba077	sysctl: Infrastructure for per namespace sysctls This patch implements the basic infrastructure for per namespace sysctls. A list of lists of sysctl headers is added, allowing each namespace to have it's own list of sysctl headers. Each list of sysctl headers has a lookup function to find the first sysctl header in the list, allowing the lists to have a per namespace instance. register_sysct_root is added to tell sysctl.c about additional lists of sysctl_headers. As all of the users are expected to be in kernel no unregister function is provided. sysctl_head_next is updated to walk through the list of lists. __register_sysctl_paths is added to add a new sysctl table on a non-default sysctl list. The only intrusive part of this patch is propagating the information to decided which list of sysctls to use for sysctl_check_table. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Daniel Lezcano <dlezcano@fr.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:55:17 -08:00
Eric W. Biederman	23eb06de7d	sysctl: Remember the ctl_table we passed to register_sysctl_paths By doing this we allow users of register_sysctl_paths that build and dynamically allocate their ctl_table to be simpler. This allows them to just remember the ctl_table_header returned from register_sysctl_paths from which they can now find the ctl_table array they need to free. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Daniel Lezcano <dlezcano@fr.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:55:17 -08:00
Eric W. Biederman	29e796fd4d	sysctl: Add register_sysctl_paths function There are a number of modules that register a sysctl table somewhere deeply nested in the sysctl hierarchy, such as fs/nfs, fs/xfs, dev/cdrom, etc. They all specify several dummy ctl_tables for the path name. This patch implements register_sysctl_path that takes an additional path name, and makes up dummy sysctl nodes for each component. This patch was originally written by Olaf Kirch and brought to my attention and reworked some by Olaf Hering. I have changed a few additional things so the bugs are mine. After converting all of the easy callers Olaf Hering observed allyesconfig ARCH=i386, the patch reduces the final binary size by 9369 bytes. .text +897 .data -7008 text data bss dec hex filename 26959310 4045899 4718592 35723801 2211a19 ../vmlinux-vanilla 26960207 4038891 4718592 35717690 221023a ../O-allyesconfig/vmlinux So this change is both a space savings and a code simplification. CC: Olaf Kirch <okir@suse.de> CC: Olaf Hering <olaf@aepfle.de> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Daniel Lezcano <dlezcano@fr.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:55:16 -08:00
Patrick McHardy	be0ea7d5da	[NETFILTER]: Convert old checksum helper names Kill the defines again, convert to the new checksum helper names and remove the dependency of NET_ACT_NAT on NETFILTER. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:55:15 -08:00
Patrick McHardy	a99a00cf1a	[NET]: Move netfilter checksum helpers to net/core/utils.c This allows to get rid of the CONFIG_NETFILTER dependency of NET_ACT_NAT. This patch redefines the old names to keep the noise low, the next patch converts all users. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:55:14 -08:00
Gerrit Renker	0c86962076	[DCCP]: Integrate state transitions for passive-close This adds the necessary state transitions for the two forms of passive-close * PASSIVE_CLOSE - which is entered when a host receives a Close; * PASSIVE_CLOSEREQ - which is entered when a client receives a CloseReq. Here is a detailed account of what the patch does in each state. 1) Receiving CloseReq The pseudo-code in 8.5 says: Step 13: Process CloseReq If P.type == CloseReq and S.state < CLOSEREQ, Generate Close S.state := CLOSING Set CLOSING timer. This means we need to address what to do in CLOSED, LISTEN, REQUEST, RESPOND, PARTOPEN, and OPEN. * CLOSED: silently ignore - it may be a late or duplicate CloseReq; * LISTEN/RESPOND: will not appear, since Step 7 is performed first (we know we are the client); * REQUEST: perform Step 13 directly (no need to enqueue packet); * OPEN/PARTOPEN: enter PASSIVE_CLOSEREQ so that the application has a chance to process unread data. When already in PASSIVE_CLOSEREQ, no second CloseReq is enqueued. In any other state, the CloseReq is ignored. I think that this offers some robustness against rare and pathological cases: e.g. a simultaneous close where the client sends a Close and the server a CloseReq. The client will then be retransmitting its Close until it gets the Reset, so ignoring the CloseReq while in state CLOSING is sane. 2) Receiving Close The code below from 8.5 is unconditional. Step 14: Process Close If P.type == Close, Generate Reset(Closed) Tear down connection Drop packet and return Thus we need to consider all states: * CLOSED: silently ignore, since this can happen when a retransmitted or late Close arrives; * LISTEN: dccp_rcv_state_process() will generate a Reset ("No Connection"); * REQUEST: perform Step 14 directly (no need to enqueue packet); * RESPOND: dccp_check_req() will generate a Reset ("Packet Error") -- left it at that; * OPEN/PARTOPEN: enter PASSIVE_CLOSE so that application has a chance to process unread data; * CLOSEREQ: server performed active-close -- perform Step 14; * CLOSING: simultaneous-close: use a tie-breaker to avoid message ping-pong (see comment); * PASSIVE_CLOSEREQ: ignore - the peer has a bug (sending first a CloseReq and now a Close); * TIMEWAIT: packet is ignored. Note that the condition of receiving a packet in state CLOSED here is different from the condition "there is no socket for such a connection": the socket still exists, but its state indicates it is unusable. Last, dccp_finish_passive_close sets either DCCP_CLOSED or DCCP_CLOSING = TCP_CLOSING, so that sk_stream_wait_close() will wait for the final Reset (which will trigger CLOSING => CLOSED). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:55:13 -08:00
Gerrit Renker	f11135a344	[DCCP]: Dedicated auxiliary states to support passive-close This adds two auxiliary states to deal with passive closes: * PASSIVE_CLOSE (reached from OPEN via reception of Close) and * PASSIVE_CLOSEREQ (reached from OPEN via reception of CloseReq) as internal intermediate states. These states are used to allow a receiver to process unread data before acknowledging the received connection-termination-request (the Close/CloseReq). Without such support, it will happen that passively-closed sockets enter CLOSED state while there is still unprocessed data in the queue; leading to unexpected and erratic API behaviour. PASSIVE_CLOSE has been mapped into TCPF_CLOSE_WAIT, so that the code will seamlessly work with inet_accept() (which tests for this state). The state names are thanks to Arnaldo, who suggested this naming scheme following an earlier revision of this patch. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:55:12 -08:00
Fred L. Templin	c7dc89c0ac	[IPV6]: Add RFC4214 support This patch includes support for the Intra-Site Automatic Tunnel Addressing Protocol (ISATAP) per RFC4214. It uses the SIT module, and is configured using extensions to the "iproute2" utility. The diffs are specific to the Linux 2.6.24-rc2 kernel distribution. This version includes the diff for ./include/linux/if.h which was missing in the v2.4 submission and is needed to make the patch compile. The patch has been installed, compiled and tested in a clean 2.6.24-rc2 kernel build area. Signed-off-by: Fred L. Templin <fred.l.templin@boeing.com> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:55:09 -08:00
Pavel Emelyanov	8d8ad9d7c4	[NET]: Name magic constants in sock_wake_async() The sock_wake_async() performs a bit different actions depending on "how" argument. Unfortunately this argument ony has numerical magic values. I propose to give names to their constants to help people reading this function callers understand what's going on without looking into this function all the time. I suppose this is 2.6.25 material, but if it's not (or the naming seems poor/bad/awful), I can rework it against the current net-2.6 tree. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:55:03 -08:00
Ilpo Järvinen	e7d0362dd4	[PCOUNTER] Fix build error without CONFIG_SMP I keep getting this build error and couldn't find anyone fixing it in archives. ...Maybe all net developers except me build just SMP kernels :-). In file included from include/net/sock.h:50, from ipc/mqueue.c:35: include/linux/pcounter.h: In function 'pcounter_add': include/linux/pcounter.h:87: error: 'struct pcounter' has no member named 'value' make[1]: * [ipc/mqueue.o] Error 1 make: * [ipc] Error 2 Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:54:50 -08:00
Gerrit Renker	9b91ad2747	[DCCP]: Make PARTOPEN an autonomous state This decouples PARTOPEN from TCP-specific stream-states. It thus addresses the FIXME. The code has been checked with regard to dependency on PARTOPEN and FIN_WAIT1 states (to which PARTOPEN previously was mapped): there is no difference, as PARTOPEN is always referred to directly (i.e. not via the mapping to TCP state). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:54:44 -08:00
Arnaldo Carvalho de Melo	de4d1db369	[LIB]: Introduce struct pcounter This just generalises what was introduced by Eric Dumazet for the struct proto inuse field in 286ab3d46058840d68e5d7d52e316c1f7e98c59f: [NET]: Define infrastructure to keep 'inuse' changes in an efficent SMP/NUMA way. Please look at the comment in there to see the rationale. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:54:39 -08:00
Ron Rindjunsky	6b4e324164	mac80211: adding 802.11n definitions in ieee80211.h This patch adds several structs and definitions to ieee80211.h to support 802.11n draft specifications. As 802.11n depends on and extends the 802.11e standard in several issues, there are also several definitions that belong to 802.11e. Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:54:38 -08:00
Denis V. Lunev	e372c41401	[NET]: Consolidate net namespace related proc files creation. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:54:28 -08:00
Denis V. Lunev	97c53cacf0	[NET]: Make rtnetlink infrastructure network namespace aware (v3) After this patch none of the netlink callback support anything except the initial network namespace but the rtnetlink infrastructure now handles multiple network namespaces. Changes from v2: - IPv6 addrlabel processing Changes from v1: - no need for special rtnl_unlock handling - fixed IPv6 ndisc Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:54:25 -08:00
Michael Wu	c237899d1f	ieee80211: Add IEEE80211_MAX_FRAME_LEN to linux/ieee80211.h This patch adds IEEE80211_MAX_FRAME_LEN which is useful for drivers trying to determine how much to allocate for their RX buffers. It also updates the comment on IEEE80211_MAX_DATA_LEN based on revisions in 802.11e. IEEE80211_MAX_FRAG_THRESHOLD and IEEE80211_MAX_RTS_THRESHOLD are also revised due to the new maximum frame size. Signed-off-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:54:20 -08:00
Stephen Hemminger	c7b6ea24b4	[NETPOLL]: Don't need rx_flags. The rx_flags variable is redundant. Turning rx on/off is done via setting the rx_np pointer. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:54:18 -08:00
Stephen Hemminger	0953864160	[NETPOLL]: no need to store local_mac The local_mac is managed by the network device, no need to keep a spare copy and all the management problems that could cause. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:54:17 -08:00
Oliver Hartkopp	1f98eefae8	[CAN]: Add missing Kbuild entries This patch adds the missing Kbuild entries and the missing Kbuild file in include/linux/can for the CAN subsystem. Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:54:13 -08:00
Oliver Hartkopp	4195e31780	[CAN]: Fix plain integer definitions in userspace header. This patch fixes the use of plain integers instead of __u32 in a struct that is visible from kernel space and user space. Thanks to Sam Ravnborg for pointing out the wrong plain int usage. Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:54:12 -08:00
Oliver Hartkopp	ffd980f976	[CAN]: Add broadcast manager (bcm) protocol This patch adds the CAN broadcast manager (bcm) protocol. Signed-off-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de> Signed-off-by: Urs Thuermann <urs.thuermann@volkswagen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:54:11 -08:00
Oliver Hartkopp	c18ce101f2	[CAN]: Add raw protocol This patch adds the CAN raw protocol. Signed-off-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de> Signed-off-by: Urs Thuermann <urs.thuermann@volkswagen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:54:10 -08:00
Oliver Hartkopp	0d66548a10	[CAN]: Add PF_CAN core module This patch adds the CAN core functionality but no protocols or drivers. No protocol implementations are included here. They come as separate patches. Protocol numbers are already in include/linux/can.h. Signed-off-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de> Signed-off-by: Urs Thuermann <urs.thuermann@volkswagen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:54:10 -08:00
Oliver Hartkopp	cd05acfe65	[CAN]: Allocate protocol numbers for PF_CAN This patch adds a protocol/address family number, ARP hardware type, ethernet packet type, and a line discipline number for the SocketCAN implementation. Signed-off-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de> Signed-off-by: Urs Thuermann <urs.thuermann@volkswagen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:54:09 -08:00
Ilpo Järvinen	68f8353b48	[TCP]: Rewrite SACK block processing & sack_recv_cache use Key points of this patch are: - In case new SACK information is advance only type, no skb processing below previously discovered highest point is done - Optimize cases below highest point too since there's no need to always go up to highest point (which is very likely still present in that SACK), this is not entirely true though because I'm dropping the fastpath_skb_hint which could previously optimize those cases even better. Whether that's significant, I'm not too sure. Currently it will provide skipping by walking. Combined with RB-tree, all skipping would become fast too regardless of window size (can be done incrementally later). Previously a number of cases in TCP SACK processing fails to take advantage of costly stored information in sack_recv_cache, most importantly, expected events such as cumulative ACK and new hole ACKs. Processing on such ACKs result in rather long walks building up latencies (which easily gets nasty when window is huge). Those latencies are often completely unnecessary compared with the amount of _new_ information received, usually for cumulative ACK there's no new information at all, yet TCP walks whole queue unnecessary potentially taking a number of costly cache misses on the way, etc.! Since the inclusion of highest_sack, there's a lot information that is very likely redundant (SACK fastpath hint stuff, fackets_out, highest_sack), though there's no ultimate guarantee that they'll remain the same whole the time (in all unearthly scenarios). Take advantage of this knowledge here and drop fastpath hint and use direct access to highest SACKed skb as a replacement. Effectively "special cased" fastpath is dropped. This change adds some complexity to introduce better coveraged "fastpath", though the added complexity should make TCP behave more cache friendly. The current ACK's SACK blocks are compared against each cached block individially and only ranges that are new are then scanned by the high constant walk. For other parts of write queue, even when in previously known part of the SACK blocks, a faster skip function is used (if necessary at all). In addition, whenever possible, TCP fast-forwards to highest_sack skb that was made available by an earlier patch. In typical case, no other things but this fast-forward and mandatory markings after that occur making the access pattern quite similar to the former fastpath "special case". DSACKs are special case that must always be walked. The local to recv_sack_cache copying could be more intelligent w.r.t DSACKs which are likely to be there only once but that is left to a separate patch. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:54:07 -08:00
Ilpo Järvinen	fd6dad616d	[TCP]: Earlier SACK block verification & simplify access to them Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:54:07 -08:00
Ilpo Järvinen	a47e5a988a	[TCP]: Convert highest_sack to sk_buff to allow direct access It is going to replace the sack fastpath hint quite soon... :-) Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:54:03 -08:00
YOSHIFUJI Hideaki	2a8cc6c890	[IPV6] ADDRCONF: Support RFC3484 configurable address selection policy table. Policy table is implemented as an RCU linear list since we do not expect large list nor frequent updates. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:58 -08:00
Patrick McHardy	6e23ae2a48	[NETFILTER]: Introduce NF_INET_ hook values The IPv4 and IPv6 hook values are identical, yet some code tries to figure out the "correct" value by looking at the address family. Introduce NF_INET_* values for both IPv4 and IPv6. The old values are kept in a #ifndef __KERNEL__ section for userspace compatibility. Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:55 -08:00
Jens Axboe	9c55e01c0c	[TCP]: Splice receive support. Support for network splice receive. Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:31 -08:00
Jens Axboe	bbdfc2f706	[SPLICE]: Don't assume regular pages in splice_to_pipe() Allow caller to pass in a release function, there might be other resources that need releasing as well. Needed for network receive. Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:30 -08:00
Sam Ravnborg	312b1485fb	Introduce new section reference annotations tags: __ref, __refdata, __refconst Today we have the following annotations for functions/data referencing __init/__exit functions / data: __init_refok => for init functions __initdata_refok => for init data __exit_refok => for exit functions There is really no difference between the __init and __exit versions and simplify it and to introduce a shorter annotation the following new annotations are introduced: __ref => for functions (code) that references __init / __exit __refdata => for variables __refconst => for const variables Whit this annotation is it more obvious what the annotation is for and there is no longer the arbitary division between __init and __exit code. The mechanishm is the same as before - a special section is created which is made part of the usual sections in the linker script. We will start to see annotations like this: -static struct pci_serial_quirk pci_serial_quirks[] = { +static const struct pci_serial_quirk pci_serial_quirks[] __refconst = { ----------------- -static struct notifier_block __cpuinitdata cpuid_class_cpu_notifier = +static struct notifier_block cpuid_class_cpu_notifier __refdata = ---------------- -static int threshold_cpu_callback(struct notifier_block nfb, +static int __ref threshold_cpu_callback(struct notifier_block nfb, [The above is just random samples]. Note: No modifications were needed in modpost to support the new sections due to the newly introduced blacklisting. Signed-off-by: Sam Ravnborg <sam@ravnborg.org>	2008-01-28 23:21:19 +01:00
Adrian Bunk	3ff6eecca4	remove __attribute_used__ Remove the deprecated __attribute_used__. [Introduce __section in a few places to silence checkpatch /sam] Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>	2008-01-28 23:21:18 +01:00
Sam Ravnborg	eb8f689046	Use separate sections for __dev/__cpu/__mem code/data Introducing separate sections for __dev* (HOTPLUG), __cpu* (HOTPLUG_CPU) and __mem* (MEMORY_HOTPLUG) allows us to do a much more reliable Section mismatch check in modpost. We are no longer dependent on the actual configuration of for example HOTPLUG. This has the effect that all users see much more Section mismatch warnings than before because they were almost all hidden when HOTPLUG was enabled. The advantage of this is that when building a piece of code then it is much more likely that the Section mismatch errors are spotted and the warnings will be felt less random of nature. Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Cc: Greg KH <greg@kroah.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Adrian Bunk <bunk@kernel.org>	2008-01-28 23:21:17 +01:00
Sam Ravnborg	f3fe866d59	compiler.h: introduce __section() Add a new helper: __section() that makes a section definition much shorter and more readable. Signed-off-by: Sam Ravnborg <sam@ravnborg.org>	2008-01-28 23:21:17 +01:00
Robert P. J. Day	7998a73166	A few corrections to include/linux/Kbuild auxvec.h, i2c-dev.h and vt.h should be unifdef'ed i2o-dev.h does not need unifdef'ing Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>	2008-01-28 23:14:36 +01:00
Linus Torvalds	f4798748de	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (24 commits) HID: ADS/Tech Radio si470x needs blacklist entry HID: Logitech Extreme 3D needs NOGET quirk HID: Refactor MS Presenter 8K key mapping HID: MS Presenter mapping for PID 0x0701 HID: Support Samsung IR remote HID: fix compilation of hidbp drivers without usbhid HID: Blacklist the Gretag-Macbeth Huey display colorimeter HID: the `bit' in hidinput_mapping_quirks() is an out parameter HID: remove redundant WARN_ON()s in order not to scare users HID: force hiddev creation for SONY PS3 controller HID: Use hid blacklist in usbmouse/usbkbd HID: proper handling of MS 4k and 6k devices HID: remove unused variable in quirk event handler HID: hid-input quirk for BTC 8193 HID: separate hid-input event quirks from generic code HID: refactor mapping to input subsystem for quirky devices HID: Microsoft Wireless Optical Desktop 3.0 quirk HID: Add support for Logitech Elite keyboards HID: add full support for Genius KB-29E HID: fix a potential bug in pointer casting ...	2008-01-29 08:52:20 +11:00
Linus Torvalds	8d01eddf29	Merge branch 'for-2.6.25' of git://git.kernel.dk/linux-2.6-block * 'for-2.6.25' of git://git.kernel.dk/linux-2.6-block: block: implement drain buffers __bio_clone: don't calculate hw/phys segment counts block: allow queue dma_alignment of zero blktrace: Add blktrace ioctls to SCSI generic devices	2008-01-29 08:51:56 +11:00
Linus Torvalds	f0f0052069	Merge branch 'blk-end-request' of git://git.kernel.dk/linux-2.6-block * 'blk-end-request' of git://git.kernel.dk/linux-2.6-block: (30 commits) blk_end_request: changing xsysace (take 4) blk_end_request: changing ub (take 4) blk_end_request: cleanup of request completion (take 4) blk_end_request: cleanup 'uptodate' related code (take 4) blk_end_request: remove/unexport end_that_request_* (take 4) blk_end_request: changing scsi (take 4) blk_end_request: add bidi completion interface (take 4) blk_end_request: changing ide-cd (take 4) blk_end_request: add callback feature (take 4) blk_end_request: changing ide normal caller (take 4) blk_end_request: changing cpqarray (take 4) blk_end_request: changing cciss (take 4) blk_end_request: changing ide-scsi (take 4) blk_end_request: changing s390 (take 4) blk_end_request: changing mmc (take 4) blk_end_request: changing i2o_block (take 4) blk_end_request: changing viocd (take 4) blk_end_request: changing xen-blkfront (take 4) blk_end_request: changing viodasd (take 4) blk_end_request: changing sx8 (take 4) ...	2008-01-29 08:51:32 +11:00
Linus Torvalds	68fbda7de0	Merge branch 'sg' of git://git.kernel.dk/linux-2.6-block * 'sg' of git://git.kernel.dk/linux-2.6-block: SG: work with the SCSI fixed maximum allocations. SG: Convert SCSI to use scatterlist helpers for sg chaining SG: Move functions to lib/scatterlist.c and add sg chaining allocator helpers	2008-01-29 08:51:05 +11:00
Linus Torvalds	d4928196fe	Merge branch 'cfq-ioc-share' of git://git.kernel.dk/linux-2.6-block * 'cfq-ioc-share' of git://git.kernel.dk/linux-2.6-block: cfq-iosched: kill some big inlines cfq-iosched: relax IOPRIO_CLASS_IDLE restrictions kernel: add CLONE_IO to specifically request sharing of IO contexts io_context sharing - anticipatory changes block: cfq: make the io contect sharing lockless io_context sharing - cfq changes io context sharing: preliminary support ioprio: move io priority from task_struct to io_context	2008-01-29 08:50:42 +11:00
Robert Schedel	fe56caa97e	HID: Support Samsung IR remote Samsung USB remotes (0419:0001) are rejected by kernel 2.6.23, because the report descriptor from the remote contains a 48 bit HID report field. HID 1.11 states: Fields may span at most 4 bytes. This patch, based on 2.6.23, fixes this by modifying the internal report descriptor in hid-quirks.c. Additional user space support (e.g. LIRC) is required to fetch the information from the hiddev interface. The burden to reconstruct the data is moved into userspace (lirc through hiddev). There is no need to set HID_QUIRK_HIDDEV quirk, as the device has also output applications, which trigger the creation of hiddev device automatically. Signed-off-by: Robert Schedel <r.schedel@yahoo.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2008-01-28 14:51:22 +01:00
Fengguang Wu	70d215c4a7	HID: the `bit' in hidinput_mapping_quirks() is an out parameter Fix a panic, by changing hidinput_mapping_quirks(,, unsigned long bit,) to hidinput_mapping_quirks(,, unsigned long *bit,) The `bit' in this function is an out parameter. Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2008-01-28 14:51:22 +01:00
Jiri Kosina	628edcde87	HID: proper handling of MS 4k and 6k devices This removes ugly macros IS_* to distinguish devices that need special handling in hid-input, and establish proper quirks for them. Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2008-01-28 14:51:21 +01:00
Jiri Kosina	36ccaad640	HID: hid-input quirk for BTC 8193 BTC 8193 keyboard handles its scrollwheel in very non-standard way. It produces two non-standard usages for scrolling up and down, in both cases with postive value equaling to 1. We handle this by temporary mapping, which we then catch in quirk event handler, and remap to negative HWHEEL even in order to introduce correct behavior. Also the button requires special mapping, as it triggers standard-violating usage code. Reported in kernel.org bugzilla #9385 Reported-by: Kir Kolyshkin <kir@sacred.ru> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2008-01-28 14:51:21 +01:00
Jiri Kosina	87bc2aa993	HID: separate hid-input event quirks from generic code This patch separates also the hid-input quirks that have to be applied at the time the event occurs, so that the generic code handling HUT-compliant devices is not messed up by them too much. Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2008-01-28 14:51:20 +01:00
Jiri Kosina	10bd065fac	HID: refactor mapping to input subsystem for quirky devices Currently, the handling of mapping between hid and input for devices that don't conform to HUT 1.12 specification is very messy -- no per-device handling, no blacklists, conditions on idVendor and idProduct placed all over the code. This patch moves all the device-specific input mapping to a separate file, and introduces a blacklist-style handling for non-standard device-specific mappings. Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2008-01-28 14:51:20 +01:00
Jiri Kosina	af9e0eacdc	HID: add full support for Genius KB-29E Genius KB-29E has broken report descriptor, which causes some of the Consumer usages to appear incorrectly as Button usages. We fix it by fixing the report descriptor before it is being parsed. Also a few of the keys violate the HUT standard, so they need a special handling. They currently fall into "Reserved" range as per HUT 1.12. Reported-by: Szekeres Istvan <szekeres@iii.hu> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2008-01-28 14:51:20 +01:00
Pavel Troller	c80e5ffac0	HID: Implement horizontal wheel handling for A4 Tech X5-005D This mouse distinguishes horizontal wheel from vertical by a special "pseudo event" GenericDesktop.00b8, with values of 0 for vertical and 8 for horizontal wheel. Because this event is supplied by the parser too late, we need to delay a wheel event, wait for this one and send either REL_WHEEL or REL_HWHEEL to input depending on the event value. Signed-off-by: Pavel Troller <patrol@sinus.cz> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2008-01-28 14:51:19 +01:00
Michel Daenzer	81e1a87550	HID: Rename some code identifiers from PowerBook specific to Apple generic Preserve identifiers exposed in build and run time configuration though in order not to break existing configurations. This is in preparation for adding support for Apple aluminum USB keyboards. Signed-off-by: Michel Daenzer <michel@tungstengraphics.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2008-01-28 14:51:19 +01:00
Russell King	c00d4ffdba	Merge branch 'orion' into devel * orion: (26 commits) [ARM] Orion: implement power-off method for QNAP TS-109/209 [ARM] Orion: add support for QNAP TS-109/TS-209 [ARM] Orion: I2C support [I2C] i2c-mv64xxx: Don't set i2c_adapter.retries [I2C] Split mv643xx I2C platform support [ARM] Orion: enable CONFIG_RTC_DRV_M41T80 for D-Link DNS-323 [ARM] Orion defconfig [ARM] Orion: add support for Orion/MV88F5181 based D-Link DNS-323 [ARM] Orion: MV88F5181 support bits [ARM] Orion: Buffalo/Revogear Kurobox Pro support [ARM] OrionNAS RD board support [ARM] Orion: support for Marvell Orion-2 (88F5281) Development Board [ARM] Orion: common platform setup for Gigabit Ethernet port [ARM] Orion: platform device registration for UART, USB and NAND [ARM] Orion: system timer support [ARM] Orion edge GPIO IRQ support [ARM] Orion: IRQ support [ARM] Orion: provide GPIO method for enabling hardware assisted blinking [ARM] Orion: GPIO support [ARM] Orion: programable address map support ... Conflicts: arch/arm/Kconfig arch/arm/Makefile Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2008-01-28 13:21:30 +00:00
James Bottomley	7cedb1f17f	SG: work with the SCSI fixed maximum allocations. SCSI sg table allocation has a maximum size (of SCSI_MAX_SG_SEGMENTS, currently 128) and this will cause a BUG_ON() in SCSI if something tries an allocation over it. This patch adds a size limit to the chaining allocator to allow the specification of the maximum allocation size for chaining, so we always chain in units of the maximum SCSI allocation size. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-01-28 10:54:49 +01:00
James Bottomley	fa0ccd837e	block: implement drain buffers These DMA drain buffer implementations in drivers are pretty horrible to do in terms of manipulating the scatterlist. Plus they're being done at least in drivers/ide and drivers/ata, so we now have code duplication. The one use case for this, as I understand it is AHCI controllers doing PIO mode to mmc devices but translating this to DMA at the controller level. So, what about adding a callback to the block layer that permits the adding of the drain buffer for the problem devices. The idea is that you'd do this in slave_configure after you find one of these devices. The beauty of doing it in the block layer is that it quietly adds the drain buffer to the end of the sg list, so it automatically gets mapped (and unmapped) without anything unusual having to be done to the scatterlist in driver/scsi or drivers/ata and without any alteration to the transfer length. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-01-28 10:54:11 +01:00
Jens Axboe	fadad878cc	kernel: add CLONE_IO to specifically request sharing of IO contexts syslets (or other threads/processes that want io context sharing) can set this to enforce sharing of io context. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-01-28 10:50:36 +01:00
Jens Axboe	4ac845a2e9	block: cfq: make the io contect sharing lockless The io context sharing introduced a per-ioc spinlock, that would protect the cfq io context lookup. That is a regression from the original, since we never needed any locking there because the ioc/cic were process private. The cic lookup is changed from an rbtree construct to a radix tree, which we can then use RCU to make the reader side lockless. That is the performance critical path, modifying the radix tree is only done on process creation (when that process first does IO, actually) and on process exit (if that process has done IO). As it so happens, radix trees are also much faster for this type of lookup where the key is a pointer. It's a very sparse tree. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-01-28 10:50:33 +01:00
Jens Axboe	d38ecf935f	io context sharing: preliminary support Detach task state from ioc, instead keep track of how many processes are accessing the ioc. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-01-28 10:50:31 +01:00
Jens Axboe	fd0928df98	ioprio: move io priority from task_struct to io_context This is where it belongs and then it doesn't take up space for a process that doesn't do IO. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-01-28 10:50:29 +01:00
Kiyoshi Ueda	5450d3e1d6	blk_end_request: cleanup 'uptodate' related code (take 4) This patch converts 'uptodate' arguments of no longer exported interfaces, end_that_request_first/last, to 'error', and removes internal conversions for it in blk_end_request interfaces. Also, this patch removes no longer needed end_io_error(). Cc: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-01-28 10:37:13 +01:00
Kiyoshi Ueda	3bcddeac1c	blk_end_request: remove/unexport end_that_request_* (take 4) This patch removes the following functions: o end_that_request_first() o end_that_request_chunk() and stops exporting the functions below: o end_that_request_last() Cc: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-01-28 10:37:12 +01:00
Kiyoshi Ueda	e3a04fe34a	blk_end_request: add bidi completion interface (take 4) This patch adds a variant of the interface, blk_end_bidi_request(), which completes a bidi request. Bidi request must be completed as a whole, both rq and rq->next_rq at once. So the interface has 2 arguments for completion size. As for ->end_io, only rq->end_io is called (rq->next_rq->end_io is not called). So if special completion handling is needed, the handler must be set to rq->end_io. And the handler must take care of freeing next_rq too, since the interface doesn't care of it if rq->end_io is not NULL. Cc: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-01-28 10:37:08 +01:00
Kiyoshi Ueda	e19a3ab058	blk_end_request: add callback feature (take 4) This patch adds a variant of the interface, blk_end_request_callback(), which has driver callback feature. Drivers may need to do special works between end_that_request_first() and end_that_request_last(). For such drivers, blk_end_request_callback() allows it to pass a callback function which is called between end_that_request_first() and end_that_request_last(). This interface is only for fallback of other blk_end_request interfaces. Drivers should avoid their tricky behaviors and use other interfaces as much as possible. Currently, only one driver, ide-cd, needs this interface. So this interface should/will be removed, after the driver removes such tricky behaviors. o ide-cd (cdrom_newpc_intr()) In PIO mode, cdrom_newpc_intr() needs to defer end_that_request_last() until the device clears DRQ_STAT and raises an interrupt after end_that_request_first(). So end_that_request_first() and end_that_request_last() are called separately in cdrom_newpc_intr(). This means blk_end_request_callback() has to return without completing request even if no leftover in the request. To satisfy the requirement, callback function has return value so that drivers can tell blk_end_request_callback() to return without completing request. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-01-28 10:37:04 +01:00
Kiyoshi Ueda	3b11313a6c	blk_end_request: add/export functions to get request size (take 4) This patch adds/exports functions to get the size of request in bytes. They are useful because blk_end_request interfaces take bytes as a completed I/O size instead of sectors. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-01-28 10:35:56 +01:00
Kiyoshi Ueda	336cdb4003	blk_end_request: add new request completion interface (take 4) This patch adds 2 new interfaces for request completion: o blk_end_request() : called without queue lock o __blk_end_request() : called with queue lock held blk_end_request takes 'error' as an argument instead of 'uptodate', which current end_that_request_* take. The meanings of values are below and the value is used when bio is completed. 0 : success < 0 : error Some device drivers call some generic functions below between end_that_request_{first/chunk} and end_that_request_last(). o add_disk_randomness() o blk_queue_end_tag() o blkdev_dequeue_request() These are called in the blk_end_request interfaces as a part of generic request completion. So all device drivers become to call above functions. To decide whether to call blkdev_dequeue_request(), blk_end_request uses list_empty(&rq->queuelist) (blk_queued_rq() macro is added for it). So drivers must re-initialize it using list_init() or so before calling blk_end_request if drivers use it for its specific purpose. (Currently, there is no driver which completes request without re-initializing the queuelist after used it. So rq->queuelist can be used for the purpose above.) "Normal" drivers can be converted to use blk_end_request() in a standard way shown below. a) end_that_request_{chunk/first} spin_lock_irqsave() (add_disk_randomness(), blk_queue_end_tag(), blkdev_dequeue_request()) end_that_request_last() spin_unlock_irqrestore() => blk_end_request() b) spin_lock_irqsave() end_that_request_{chunk/first} (add_disk_randomness(), blk_queue_end_tag(), blkdev_dequeue_request()) end_that_request_last() spin_unlock_irqrestore() => spin_lock_irqsave() __blk_end_request() spin_unlock_irqsave() c) spin_lock_irqsave() (add_disk_randomness(), blk_queue_end_tag(), blkdev_dequeue_request()) end_that_request_last() spin_unlock_irqrestore() => blk_end_request() or spin_lock_irqsave() __blk_end_request() spin_unlock_irqrestore() Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-01-28 10:35:53 +01:00
Jens Axboe	0db9299f48	SG: Move functions to lib/scatterlist.c and add sg chaining allocator helpers Manually doing chained sg lists is not trivial, so add some helpers to make sure that drivers get it right. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-01-28 10:05:27 +01:00
Pete Wyckoff	482eb68916	block: allow queue dma_alignment of zero Let queue_dma_alignment return 0 if it was specifically set to 0. This permits devices with no particular alignment restrictions to use arbitrary user space buffers without copying. Signed-off-by: Pete Wyckoff <pw@osc.edu> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-01-28 10:04:46 +01:00
Christof Schmitt	6da127ad09	blktrace: Add blktrace ioctls to SCSI generic devices Since the SCSI layer uses the request queues from the block layer, blktrace can also be used to trace the requests to all SCSI devices (like SCSI tape drives), not only disks. The only missing part is the ioctl interface to start and stop tracing. This patch adds the SETUP, START, STOP and TEARDOWN ioctls from blktrace to the sg device files. With this change, blktrace can be used for SCSI devices like for disks, e.g.: blktrace -d /dev/sg1 -o - \| blkparse -i - Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-01-28 10:04:46 +01:00
Greg Kroah-Hartman	1f9ffc049d	Driver core: add bus_find_device_by_name function The driver core, and some other parts of the kernel just want to find a device based on a name for a specific bus. Give them a simple wrapper to prevent them from having to always roll their own. This will be used in the PPC patch later in this series. Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-27 15:01:39 -08:00
David Brownell	e9f1373b64	i2c: Add i2c_new_dummy() utility This adds a i2c_new_dummy() primitive to help work with devices that consume multiple addresses, which include many I2C eeproms and at least one RTC. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2008-01-27 18:14:52 +01:00
David Brownell	9b766b814d	i2c: Stop using the redundant client list The i2c_adapter.clients list of i2c_client nodes duplicates driver model state. This patch starts removing that list, letting us remove most existing users of those i2c-core lists. * The core I2C code now iterates over the driver model's list instead of the i2c-internal one in some places where it's safe: - Passing a command/ioctl to each client, a mechanims used almost exclusively by DVB adapters; - Device address checking, in both i2c-core and i2c-dev. * Provide i2c_verify_client() to use with driver model iterators. * Flag the relevant i2c_adapter and i2c_client fields as deprecated, to help prevent new users from appearing. For the moment the list needs to stick around, since some issues show up when deleting devices created by legacy I2C drivers. (They don't follow standard driver model rules. Removing those devices can cause self-deadlocks.) Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2008-01-27 18:14:51 +01:00
Jean Delvare	7bca0871ca	i2c: Discard unused driver IDs Discard all I2C driver IDs that aren't used anywhere. That's not just a couple of them, but more like 49 or one quarter of all defined IDs! And this is just a first pass, next will come all IDs that are set but never used, or used but never set. Signed-off-by: Jean Delvare <khali@linux-fr.org>	2008-01-27 18:14:50 +01:00
David Brownell	6d16bfb5e8	i2c/tps65010: move header to <linux/i2c/...> Move the tps65010 header file from the OMAP arch directory to the more generic <linux/i2c/...> directory, and remove the spurious dependency of this driver on OMAP. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2008-01-27 18:14:49 +01:00
Jean Delvare	026526f5af	i2c: Drop redundant i2c_driver.list i2c_driver.list is superfluous, this list duplicates the one maintained by the driver core. Drop it. Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: David Brownell <dbrownell@users.sourceforge.net>	2008-01-27 18:14:49 +01:00
Jean Delvare	87c6c22945	i2c: Drop redundant i2c_adapter.list i2c_adapter.list is superfluous, this list duplicates the one maintained by the driver core. Drop it. Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: David Brownell <dbrownell@users.sourceforge.net>	2008-01-27 18:14:48 +01:00
Jean Delvare	e48d33193d	i2c: Change prototypes of refcounting functions Use more standard prototypes for i2c_use_client() and i2c_release_client(). The former now returns a pointer to the client, and the latter no longer returns anything. This matches what all other subsystems do. Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: David Brownell <david-b@pacbell.net>	2008-01-27 18:14:48 +01:00
Jean Delvare	bdc511f438	i2c: Use the driver model reference counting Don't implement our own reference counting mechanism for i2c clients when the driver model already has one. Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: David Brownell <david-b@pacbell.net>	2008-01-27 18:14:48 +01:00
Mark M. Hoffman	bfb6df24fa	i2c: Constify client address data This patch allows much of the I2C client address data to move from initdata into text. Signed-off-by: Mark M. Hoffman <mhoffman@lightlink.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2008-01-27 18:14:46 +01:00
Adrian Bunk	7e8b99251b	i2c: some overdue driver removal This patch contains the overdue removal of three I2C drivers. [JD: In fact only i2c-ixp4xx can be removed at the moment, the other two platforms don't implement the generic GPIO layer yet.] Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2008-01-27 18:14:46 +01:00
Adrian Bunk	eee87d3196	i2c: the scheduled I2C RTC driver removal This patch contains the scheduled removal of legacy I2C RTC drivers with replacement drivers. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jean Delvare <khali@linux-fr.org>	2008-01-27 18:14:45 +01:00
Bartlomiej Zolnierkiewicz	7267c33774	ide: remove REQ_TYPE_ATA_CMD Based on the earlier work by Tejun Heo. All users are gone so we can finally remove it. Cc: Tejun Heo <htejun@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-26 20:13:13 +01:00
Bartlomiej Zolnierkiewicz	34f5d5ae35	ide: switch set_xfer_rate() to use REQ_TYPE_ATA_TASKFILE requests Based on the earlier work by Tejun Heo. Switch set_xfer_rate() to use REQ_TYPE_ATA_TASKFILE requests and make ide_wait_cmd() static. There should be no functionality changes caused by this patch. Cc: Tejun Heo <htejun@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-26 20:13:12 +01:00
Bartlomiej Zolnierkiewicz	2624565caa	ide: use wait_drive_not_busy() in drive_cmd_intr() (take 2) Use wait_drive_not_busy() in drive_cmd_intr(). v2: * Fix wait_drive_not_busy() comment (noticed by Sergei). Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-26 20:13:11 +01:00
Bartlomiej Zolnierkiewicz	4906f3b4cd	ide: kill DATA_READY define Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-26 20:13:11 +01:00
Tejun Heo	4d7a984bdc	ide: task_end_request() fix task_end_request() modified to always call ide_end_drive_cmd() for taskfile requests. Previously, ide_end_drive_cmd() was called only when IDE_TFLAG_FLAGGED was set. Also, ide_dma_intr() is modified to use task_end_request(). Enables TASKFILE ioctls to get valid register outputs on successful completion. Bart: - ported it over recent IDE changes Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-26 20:13:11 +01:00
Bartlomiej Zolnierkiewicz	657cc1a8f6	ide: set IDE_TFLAG_IN_* flags before queuing/executing command * Add IDE_TFLAG_{HOB,TF,DEVICE} defines. * Set IDE_TFLAG_IN_* flags in {do_rw,ide_no_data,ide_raw}_taskfile() users. * Remove no longer needed ->tf_flags setup from ide_end_drive_cmd(). There should be no functionality changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-26 20:13:10 +01:00
Tejun Heo	35cf2b94d0	ide: fix ->io_32bit race in ide_taskfile_ioctl() In ide_taskfile_ioctl(), there was a race condition involving drive->io_32bit. It was cleared and restored during ioctl requests but there was no synchronization with other requests. So, other requests could execute with the altered ->io_32bit setting or updated drive->io_32bit could be overwritten by ide_taskfile_ioctl(). This patch adds IDE_TFLAG_IO_16BIT flag to indicate to ide_pio_datablock() that 16-bit I/O is needed regardless of drive->io_32bit settting. Bart: - ported it over recent IDE changes Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-26 20:13:10 +01:00
Bartlomiej Zolnierkiewicz	9e47be0c97	ide: remove broken disk byte-swapping support Remove broken disk byte-swapping support: - it can cause a data corruption on SMP (or if using PREEMPT on UP) - all data coming from disk are byte-swapped by taskfile_*_data() which results in incorrect identify data being reported by /proc/ide/ and IOCTLs - "hdx=bswap/byteswap" kernel parameter has been broken on m68k host drivers (including Atari/Q40 ones) since 2.5.x days (because of 'hwif' zero-ing) - byte-swapping is limited to PIO transfers (for working with TiVo disks on x86 machines using user-space solutions or dm-byteswap should result in much better performance because DMA can be used) For previous discussions please see: http://www.ussg.iu.edu/hypermail/linux/kernel/0201.0/0768.html http://lkml.org/lkml/2004/2/28/111 [ I have dm-byteswap device mapper target if somebody is interested (patch is for 2.6.4 though but I'll dust it off if needed). ] Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-26 20:13:09 +01:00
Bartlomiej Zolnierkiewicz	81ca691981	ide: add ide_set_irq() inline helper There should be no functionality changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-26 20:13:08 +01:00
Bartlomiej Zolnierkiewicz	ade2daf9c6	ide: make remaining built-in only IDE host drivers modular (take 2) * Make remaining built-in only IDE host drivers modular, add ide-scan-pci.c file for probing PCI host drivers registered with IDE core (special case for built-in IDE and CONFIG_IDEPCI_PCIBUS_ORDER=y) and then take care of the ordering in which all IDE host drivers are probed when IDE is built-in during link time. * Move probing of gayle, falconide, macide, q40ide and buddha (m68k arch specific) host drivers, before PCI ones (no PCI on m68k), ide-cris (cris arch specific), cmd640 (x86 arch specific) and pmac (ppc arch specific). * Move probing of ide-cris (cris arch specific) host driver before cmd640 (x86 arch specific). * Move probing of mpc8xx (ppc specific) host driver before ide-pnp (depends on ISA and none of ppc platform that use mpc8xx supports ISA) and ide-h8300 (h8300 arch specific). * Add "probe_vlb" kernel parameter to cmd640 host driver and update Documentation/ide.txt accordingly. * Make IDE_ARM config option visible so it can also be disabled if needed. * Remove bogus comment from ide.c while at it. v2: * Fix two issues spotted by Sergei: - replace ENOMEM error value by ENOENT in ide-h8300 host driver - fix MODULE_PARM_DESC() in cmd640 host driver Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-26 20:13:07 +01:00
Bartlomiej Zolnierkiewicz	cbb010c180	ide: drop 'initializing' argument from ide_register_hw() * Rename init_hwif_data() to ide_init_port_data() and export it. * For all users of ide_register_hw() with 'initializing' argument set hwif->present and hwif->hold are always zero so convert these host drivers to use ide_find_port()+ide_init_port_data()+ide_init_port_hw() instead (also no need for init_hwif_default() call since the setup done by it gets over-ridden by ide_init_port_hw() call). * Drop 'initializing' argument from ide_register_hw(). Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-26 20:13:06 +01:00
Bartlomiej Zolnierkiewicz	57c802e84f	ide: add ide_init_port_hw() helper * Add ide_init_port_hw() helper. * rapide.c: convert rapide_locate_hwif() to rapide_setup_ports() and use ide_init_port_hw(). * ide_platform.c: convert plat_ide_locate_hwif() to plat_ide_setup_ports() and use ide_init_port_hw(). * sgiioc4.c: use ide_init_port_hw(). * pmac.c: add 'hw_regs_t *hw' argument to pmac_ide_setup_device(), setup 'hw' in pmac_ide_{macio,pci}_attach() and use ide_init_port_hw() in pmac_ide_setup_device(). This patch is a preparation for the future changes in the IDE probing code. There should be no functionality changes caused by this patch. Cc: Russell King <rmk@arm.linux.org.uk> Cc: Anton Vorontsov <avorontsov@ru.mvista.com> Cc: Jeremy Higdon <jeremy@sgi.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-26 20:13:05 +01:00
Olof Johansson	b0d5bc27ce	ide: Fix build break caused by "ide: remove ideprobe_init()" Fix build break of powerpc holly_defconfig: In file included from arch/powerpc/platforms/embedded6xx/holly.c:24: include/linux/ide.h:1206: error: 'CONFIG_IDE_MAX_HWIFS' undeclared here (not in a function) There's no need to have a sized array in the prototype, might as well turn it into a pointer. It could probably be argued that large parts of the include file can be covered under #ifdef CONFIG_IDE, but that's a larger undertaking. Signed-off-by: Olof Johansson <olof@lixom.net> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-26 20:13:05 +01:00
Bartlomiej Zolnierkiewicz	151575e464	ide: remove ideprobe_init() * Rename ide_device_add() to ide_device_add_all() and make it accept 'u8 idx[MAX_HWIFS]' instead of 'u8 idx[4]' as an argument. * Add ide_device_add() wrapper for ide_device_add_all(). * Convert ide_generic_init() to use ide_device_add_all(). * Remove no longer needed ideprobe_init(). There should be no functionality changes caused by this patch. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-26 20:13:05 +01:00
Bartlomiej Zolnierkiewicz	f01393e48c	ide: merge ->fixup and ->quirkproc methods * Assign drive->quirk_list in ->quirkproc implementations: - hpt366.c::hpt3xx_quirkproc() - pdc202xx_new.c::pdcnew_quirkproc() - pdc202xx_old.c::pdc202xx_quirkproc() * Make ->quirkproc void. * Move calling ->quirkproc from do_identify() to probe_hwif(). * Convert it821x_fixups() to it821x_quirkproc() in it821x.c. * Convert siimage_fixup() to sil_quirkproc() in siimage.c, also remove no longer needed drive->present check from is_dev_seagate_sata(). * Convert ide_undecoded_slave() to accept 'drive' instead of 'hwif' as an argument. Then convert ide_register_hw() to accept 'quirkproc' argument instead of 'fixup' one. * Remove no longer needed ->fixup method. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-26 20:13:03 +01:00
Bartlomiej Zolnierkiewicz	15ce926ada	ide: merge ->dma_host_{on,off} methods into ->dma_host_set method Merge ->dma_host_{on,off} methods into ->dma_host_set method which takes 'int on' argument. There should be no functionality changes caused by this patch. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-26 20:13:03 +01:00
Bartlomiej Zolnierkiewicz	4a546e046d	ide: remove ->ide_dma_on and ->dma_off_quietly methods from ide_hwif_t * Make ide_dma_off_quietly() and __ide_dma_on() always available. * Drop "__" prefix from __ide_dma_on(). * Check for presence of ->dma_host_on instead of ->ide_dma_on. * Convert all users of ->ide_dma_on and ->dma_off_quietly methods to use ide_dma_on() and ide_dma_off_quietly() instead. * Remove no longer needed ->ide_dma_on and ->dma_off_quietly methods from ide_hwif_t. * Make ide_dma_on() void. There should be no functionality changes caused by this patch. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-26 20:13:01 +01:00
Bartlomiej Zolnierkiewicz	8704de8f29	cy82c693: add ->set_dma_mode method * Fix SWDMA/MWDMA masks in cy82c693_chipset. * Add IDE_HFLAG_CY82C693 host flag and use it in ide_tune_dma() to check whether the DMA should be enabled even if ide_max_dma_mode() fails. * Convert cy82c693_dma_enable() to become cy82c693_set_dma_mode() and remove no longer needed cy82c693_ide_dma_on(). Then set IDE_HFLAG_CY82C693 instead of IDE_HFLAG_TRUST_BIOS_FOR_DMA in cy82c693_chipset. * Bump driver version. As a result of this patch cy82c693 driver will configure and use DMA on all SWDMA0-2 and MWDMA0-2 capable ATA devices instead of relying on BIOS. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-26 20:13:00 +01:00
Jean Delvare	2f0a8df40f	[I2C] i2c-mv64xxx: Don't set i2c_adapter.retries I2C adapter drivers are supposed to handle retries on nack by themselves if they do, so there's no point in setting .retries if they don't. As this retry mechanism is going away (at least in its current form), clean this up now so that we don't get build failures later. Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: Mark A. Greer <mgreer@mvista.com>	2008-01-26 15:04:01 +00:00
Tzachi Perelstein	a0832798c0	[I2C] Split mv643xx I2C platform support The motivation for this change is to allow other chips, like the Marvell Orion ARM SoC family, to use the existing i2c-mv64xxx driver. Signed-off-by: Tzachi Perelstein <tzachi@marvell.com> Acked-by: Nicolas Pitre <nico@marvell.com> Acked-by: Dale Farnsworth <dale@farnsworth.org> Acked-by: Mark A. Greer <mgreer@mvista.com> Acked-by: Jean Delvare <khali@linux-fr.org>	2008-01-26 15:03:59 +00:00
Linus Torvalds	9b73e76f3c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (200 commits) [SCSI] usbstorage: use last_sector_bug flag universally [SCSI] libsas: abstract STP task status into a function [SCSI] ultrastor: clean up inline asm warnings [SCSI] aic7xxx: fix firmware build [SCSI] aacraid: fib context lock for management ioctls [SCSI] ch: remove forward declarations [SCSI] ch: fix device minor number management bug [SCSI] ch: handle class_device_create failure properly [SCSI] NCR5380: fix section mismatch [SCSI] sg: fix /proc/scsi/sg/devices when no SCSI devices [SCSI] IB/iSER: add logical unit reset support [SCSI] don't use __GFP_DMA for sense buffers if not required [SCSI] use dynamically allocated sense buffer [SCSI] scsi.h: add macro for enclosure bit of inquiry data [SCSI] sd: add fix for devices with last sector access problems [SCSI] fix pcmcia compile problem [SCSI] aacraid: add Voodoo Lite class of cards. [SCSI] aacraid: add new driver features flags [SCSI] qla2xxx: Update version number to 8.02.00-k7. [SCSI] qla2xxx: Issue correct MBC_INITIALIZE_FIRMWARE command. ...	2008-01-25 17:19:08 -08:00
Linus Torvalds	29bd17af7d	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: (31 commits) ocfs2: clean up bh null checks ocfs2: document access rules for blocked_lock_list configfs: file.c fix possible recursive locking configfs: dir.c fix possible recursive locking configfs: Remove EXPERIMENTAL ocfs2: bump version number ocfs2/dlm: Clear joining_node on hearbeat node down ocfs2: convert byte order of constant instead of variable ocfs2: Update default cluster timeouts ocfs2: printf fixes ocfs2: Use generic_file_llseek ocfs2: Safer read_inline_data() ocfs2: Silence false lockdep warnings [PATCH 2/2] ocfs2: cluster aware flock() [PATCH 1/2] ocfs2: add flock lock type ocfs2: Local alloc window size changeable via mount option ocfs2: Support commit= mount option ocfs2: Add missing permission checks [PATCH 2/2] ocfs2: Implement group add for online resize [PATCH 1/2] ocfs2: Add group extend for online resize ...	2008-01-25 17:11:13 -08:00
Linus Torvalds	2ba14a017a	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: (67 commits) fix drivers/ata/sata_fsl.c double-decl [libata] Prefer SCSI_SENSE_BUFFERSIZE to sizeof() pata_legacy: Merge winbond support ata_generic: Cenatek support pata_winbond: error return pata_serverworks: Fix cable types and cosmetics pata_mpc52xx: remove un-needed assignment libata: fix off-by-one in error categorization ahci: factor out AHCI enabling and enable AHCI before reading CAP ata_piix: implement SIDPR SCR access ata_piix: convert to prepare - activate initialization libata: factor out ata_pci_activate_sff_host() from ata_pci_one() [libata] Prefer SCSI_SENSE_BUFFERSIZE to sizeof() pata_legacy: resychronize with upstream changes and resubmit [libata] pata_legacy: typo fix [libata] pata_winbond: update for new ->data_xfer hook pata_pcmcia: convert to new data_xfer prototype libata annotations and fixes libata: use dev_driver_string() instead of "libata" in libata-sff.c ata_piix: kill unused constants and flags ...	2008-01-25 17:08:28 -08:00
Joel Becker	d69a3ad6a0	dlm: Split lock mode and flag constants into a sharable header. This allows others to use the DLM constants without being tied to the function API of fs/dlm. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2008-01-25 14:46:04 -08:00
Linus Torvalds	b31fde6db2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb * git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (509 commits) V4L/DVB (7078): radio: fix sf16fmi section mismatch V4L/DVB (7077): bt878: remove handcrafted PCI subsystem ID check V4L/DVB (7075): Make a local function static V4L/DVB (7074): DiB7000P: correct tuning problem for 7MHz channel V4L/DVB (7073): DiB7070: Reception quality improved V4L/DVB (7072): sets the MT2060 IF1 frequency according to EEPROM V4L/DVB (7071): DiB0700: Start streaming the right way V4L/DVB (7070): Fix some tuning problems V4L/DVB (7069): Support for myTV.t V4L/DVB (7068): Add support for WinTV Nova-T-CE driver V4L/DVB (7067): fix autoserach in the Hauppauge NOVA-T 500 V4L/DVB (7066): ASUS My Cinema U3000 Mini DVBT Tuner V4L/DVB (7065): Artec T14BR patches V4L/DVB (7063): xc5000: Fix OOPS caused by missing firmware V4L/DVB (7062): radio-si570x: Some fixes and new USB ID addition V4L/DVB (7061): radio-si470x: Some cleanups V4L/DVB (7060): em28xx: remove has_tuner V4L/DVB (7059): cx88: Ensure the tuner is reset correctly V4L/DVB (7058): IR corrections for the Pinnacle 800i V4L/DVB (7056): tuner: suppress obsolete tuner i2c address warning for XC5000 tuners ...	2008-01-25 13:59:51 -08:00
Linus Torvalds	f31c338675	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (67 commits) ide: remove redundant DMA blacklist check from __ide_dma_on() ide: cleanup ide_set_dma() ide: remove redundant ->ide_dma_on call from set_using_dma() sc1200: move DMA timings to timing tables ide: add IDE_HFLAG_ABUSE_SET_DMA_MODE host flag sis5513: factor out UDMA programming code pdc202xx_new: move PIO programming code to pdcnew_set_pio_mode() ide: make 'extra' field in struct ide_port_info u8 ide: kill duplicate code in ide_dump_{ata,atapi}_status() ide-disk: use ide_get_lba_addr() ide: printk fix ide: add ide_tf_read() helper ide: fix registers loading order in ide_dump_ata_status() ide-disk: use do_rw_taskfile() (take 2) ide-disk: add ide_tf_set_cmd() helper ide-disk: extend timeout for PIO-in commands ide: remove 'handler' field from ide_task_t (take 2) ide: use ->data_phase to set ->handler in do_rw_taskfile() ide: convert do_rw_taskfile() to use ->data_phase ide: merge flagged_taskfile() into do_rw_taskfile() ...	2008-01-25 13:57:26 -08:00
Bartlomiej Zolnierkiewicz	4db90a1452	ide: add IDE_HFLAG_ABUSE_SET_DMA_MODE host flag * Add IDE_HFLAG_ABUSE_SET_DMA_MODE host flag and use it to decide what to do with transfer modes < XFER_PIO_0 in ide_set_xfer_rate(). * Set IDE_HFLAG_ABUSE_SET_DMA_MODE in host drivers that need it (aec62xx, amd74xx, cs5520, cs5535, hpt34x, hpt366, pdc202xx_old, serverworks, tc86c001 and via82cxxx) and cleanup ->set_dma_mode methods in host drivers that don't (IDE core code guarantees that ->set_dma_mode will be called only for modes which are present in SWDMA/MWDMA/UDMA masks). While at it: * Add IDE_HFLAGS_HPT34X/HPT3XX/PDC202XX/SVWKS define in hpt34x/hpt366/pdc202xx_old/serverworks host driver. There should be no functionality changes caused by this patch. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:18 +01:00
Bartlomiej Zolnierkiewicz	3071a9d00b	ide: make 'extra' field in struct ide_port_info u8 The maximum value used currently for 'extra' field in struct ide_port_info is 240. Make 'extra' u8 so it packs nicely together with enablebits[] and 'chipset' fields (ide_pci_enablebit_t is 3 bytes and hwif_chipset_t is 1 byte). Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:17 +01:00
Bartlomiej Zolnierkiewicz	a501633c7d	ide-disk: use ide_get_lba_addr() * Export ide_get_lba_addr(). * Convert idedisk_{read_native,set}_max_address() to use ide_get_lba_addr(). * Remove incorrect comment from idedisk_read_native_max_address() (noticed by Sergei). There should be no functionality changes caused by this patch. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:17 +01:00
Bartlomiej Zolnierkiewicz	c2b57cdc1d	ide: add ide_tf_read() helper * Factor out code reading taskfile registers from ide_end_drive_cmd() to the new ide_tf_read() helper. * Add IDE_TFLAG_IN_* taskfile flags to indicate the need to load particular IDE taskfile register in ide_tf_read(). * Update ide_end_drive_cmd() to set respective IDE_TFLAG_IN_* taksfile flags. * Add ide_get_lba_addr() for getting LBA sector address from taskfile struct. * Factor out code getting sector address from ide_dump_ata_status() to the new ide_dump_sector() function. * Convert ide_dump_sector() to use ide_tf_read() and ide_get_lba_addr(). * Remove no longer needed ide_read_24(). The only change in functionality caused by this patch is that ide_dump_ata_status() no longer prints "high"/"low" parts of LBA48 sector address (of course LBA48 sector address is still printed). Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:17 +01:00
Bartlomiej Zolnierkiewicz	f6e29e35cc	ide-disk: use do_rw_taskfile() (take 2) * Add IDE_TFLAG_DMA_PIO_FALLBACK taskfile flag to indicate the need to skip loading taskfile registers in do_rw_taskfile(). * Export do_rw_taskfile(). * Convert __ide_do_rw_disk() to use do_rw_taskfile(). * Unexport ide_tf_load(). * Unexport {pre_task_out,task_in}_intr() and make it static. * Remove incorrect comment about do_rw_taskfile() from <linux/ide.h>. There should be no functionality changes caused by this patch. v2: * Add missing blk_fs_request() check to task_dma_ok() (for VDMA). Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:16 +01:00
Bartlomiej Zolnierkiewicz	57d7366b78	ide: remove 'handler' field from ide_task_t (take 2) * Add IDE_TFLAG_CUSTOM_HANDLER taskfile flag and use it for internal requests which require custom handlers. Check the flag in do_rw_taskfile() and set handler accordingly. * Cleanup ide_init_{specify,restore,setmult}_cmd() and rename it to ide_tf_set_{specify,restore,setmult}_cmd(). * Make {set_geometry,recal,set_multmode}_intr() static. * Remove no longer needed 'handler' field from ide_task_t. v2: * 'handler' in do_rw_taskfile() must be set to NULL initially. There should be no functionality changes caused by this patch. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:16 +01:00
Bartlomiej Zolnierkiewicz	1192e528e0	ide: use ->data_phase to set ->handler in do_rw_taskfile() * Use ->data_phase to set ->handler in do_rw_taskfile() instead of setting ->handler in callers of ide_raw_taskfile()/do_rw_taskfile(). * Unexport task_no_data_intr() and make it static. There should be no functionality changes caused by this patch. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:16 +01:00
Bartlomiej Zolnierkiewicz	10d90157c8	ide: convert do_rw_taskfile() to use ->data_phase * Use task->data_phase in do_rw_taskfile() to decide what to do. * task->prehandler is only used by TASKFILE[_MULTI]_OUT so just use pre_task_out_intr() directly and remove no longer needed 'prehandler' field from ide_task_t. * Remove no longer needed ide_pre_handler_t type. There should be no functionality changes caused by this patch. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:16 +01:00
Bartlomiej Zolnierkiewicz	1edee60e9d	ide: merge flagged_taskfile() into do_rw_taskfile() Based on the earlier work by Tejun Heo. task->data_phase == TASKFILE_MULTI_{IN,OUT} vs drive->mult_count == 0 check is needed also for ide_taskfile_ioctl() requests that don't have IDE_TFLAG_FLAGGED taskfile flag set. Cc: Tejun Heo <htejun@gmail.com> Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:15 +01:00
Bartlomiej Zolnierkiewicz	866e2ec9ce	ide: remove 'tf_in_flags' field from ide_task_t * Add IDE_TFLAG_IN_DATA taskfile flag to indicate the need of reading IDE_DATA_REG in ide_end_drive_cmd(). Set the new flag in ide_taskfile_ioctl() if ->in_flags.b.data is set. * Add IDE_TFLAG_FLAGGED_SET_IN_FLAGS taskfile flag to indicate the need of modifying ->in_flags in ide_taskfile_ioctl(). Set the new flag in flagged_taskfile() and move the code modifying ->tf_in_flags to ide_taskfile_ioctl(). While at it remove the bogus comment: ->tf_in_flags (except .b.data) have no effect on selection of registers to read. * Remove no longer needed 'tf_in_flags' field from ide_task_t. As the result we finally have the internals of HDIO_DRIVE_TASKFILE ioctl separated from the core IDE code. There should be no functionality changes caused by this patch. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:14 +01:00
Bartlomiej Zolnierkiewicz	ac026ff254	ide: remove 'command_type' field from ide_task_t * Add 'data_buf' and 'nsect' variables in ide_taskfile_ioctl() to cache data buffer pointer and number of sectors to transfer (this allows us to have only one ide_diag_taskfile() call). * Add IDE_TFLAG_WRITE taskfile flag and use it to check whether the REQ_RW request flag should be set. * Move ->command_type handling from ide_diag_taskfile() to ide_taskfile_ioctl() and use ->req_cmd instead of ->command_type. * Add 'nsect' parameter to ide_raw_taskfile(). * Merge ide_diag_taskfile() into ide_raw_taskfile(). * Initialize ->data_phase explicitly in idedisk_prepare_flush(), ide_start_power_step() and ide_disk_special(). * Remove no longer needed 'command_type' field from ide_task_t. * Add #ifndef/#endif __KERNEL__ to <linux/hdreg.h> around no longer used by kernel IDE_DRIVE_TASK_* and TASKFILE_* defines. There should be no functionality changes caused by this patch. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:14 +01:00
Bartlomiej Zolnierkiewicz	7299a39184	ide: remove hwif->intrproc Given that: * hpt366.c::hpt3xx_intrproc() is the only user of hwif->intrproc * hpt366.c::hpt3xx_quirkproc() sets drive->quirk_list to 1 for quirky drives which is a value unique to hpt366 host driver we can remove hwif->intproc and just check for drive->quirk_list == 1 in ide_do_request(). Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:14 +01:00
Bartlomiej Zolnierkiewicz	f919790f8c	ide: remove SELECT_INTERRUPT() Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:13 +01:00
Bartlomiej Zolnierkiewicz	cd3dbc99da	ide: remove QUIRK_LIST() Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:13 +01:00
Bartlomiej Zolnierkiewicz	2fc5738819	ide: add ide_pktcmd_tf_load() helper Add ide_pktcmd_tf_load() helper and convert ATAPI device drivers to use it. There should be no functionality changes caused by this patch. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:13 +01:00
Bartlomiej Zolnierkiewicz	8e7657ae0f	ide: remove atapi_ireason_t (take 3) Remove atapi_ireason_t. While at it: * replace 'HWIF(drive)' by 'drive->hwif' (or just 'hwif' where possible) v2: * v1 had CD and IO bits reversed in many places. * Use CD and IO defines from <linux/hdreg.h>. v3: * Fix incorrect "(ireason & IO) == test_bit()". (Noticed by Sergei) Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:12 +01:00
Bartlomiej Zolnierkiewicz	790d123989	ide: remove ata_nsector_t, ata_data_t and atapi_bcount_t Remove ata_nsector_t, ata_data_t (unused) and atapi_bcount_t. While at it: * replace 'HWIF(drive)' by 'hwif' Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:12 +01:00
Bartlomiej Zolnierkiewicz	e5f9f5a89a	ide: remove atapi_feature_t Remove atapi_feature_t. While at it: * replace 'HWIF(drive)' by 'hwif' Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:12 +01:00
Bartlomiej Zolnierkiewicz	0e38a66a1e	ide: remove atapi_error_t (take 2) Remove atapi_error_t. While at it: * replace 'HWIF(drive)' by 'drive->hwif' v2: * Add {ILI,EOM,LFS}_ERR defines to <linux/hdreg.h>. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:12 +01:00
Bartlomiej Zolnierkiewicz	22c525b976	ide: remove ata_status_t and atapi_status_t Remove ata_status_t (unused) and atapi_status_t. While at it: * replace 'HWIF(drive)' by 'drive->hwif' (or just 'hwif' where possible) Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:11 +01:00
Bartlomiej Zolnierkiewicz	6a2144146a	ide: CPU endianness doesn't matter for special_t special_t is used only internally by the IDE subsystem (it isn't related to hardware registers and isn't exported to the user-space). Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:11 +01:00
Bartlomiej Zolnierkiewicz	29ed2a5f8c	ide: remove REQ_TYPE_ATA_TASK Based on the earlier work by Tejun Heo. All users are gone so we can finally remove it. Cc: Tejun Heo <htejun@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:11 +01:00
Bartlomiej Zolnierkiewicz	807e35d695	ide: use ide_tf_load() in execute_drive_cmd() * Add IDE_TFLAG_OUT_DEVICE taskfile flag to indicate the need of writing the Device register and handle it in ide_tf_load(). Update ide_tf_load() and {do_rw,flagged}_taskfile() users accordingly. * Use struct ide_taskfile and ide_tf_load() in execute_drive_cmd(). * Make the debugging code dump all taskfile registers for both REQ_ATA_TYPE_{CMD,TASK} requests and move it to ide_tf_load() so it also covers REQ_ATA_TYPE_TASKFILE requests. There should be no functionality changes caused by this patch (unless DEBUG is defined). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:10 +01:00
Bartlomiej Zolnierkiewicz	4ee06b7e67	ide: remove stale ide.h "configuration options" Remove stale ide.h "configuration options": * INITIAL_MULT_COUNT - always defined to 0 * SUPPORT_SLOW_DATA_PORTS - unused * OK_TO_RESET_CONTROLLER - always defined to 1 * DISABLE_IRQ_NOSYNC - always defined to 0 Leave SUPPORT_VLB_SYNC (defined to 0 for CRIS and FRV, otherwise to 1) for now but disallow overriding it by <asm/ide.h>. There should be no functionality changes caused by this patch. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:08 +01:00
Bartlomiej Zolnierkiewicz	74095a91ed	ide: use do_rw_taskfile() in flagged_taskfile() Based on the earlier work by Tejun Heo. * Move setting IDE_TFLAG_LBA48 taskfile flag from do_rw_taskfile() function to the callers. * Add IDE_TFLAG_FLAGGED taskfile flag for flagged taskfiles coming from ide_taskfile_ioctl(). Check it instead of ->tf_out_flags.all. * Add IDE_TFLAG_OUT_DATA taskfile flag to indicate the need to load IDE data register in ide_tf_load(). * Add IDE_TFLAG_OUT_* taskfile flags to indicate the need to load particular IDE taskfile registers in ide_tf_load(). * Update do_rw_taskfile() and ide_tf_load() users to set respective IDE_TFLAG_OUT_* taksfile flags. * Add task_dma_ok() helper. * Use IDE_TFLAG_FLAGGED taskfile flag to select HIHI mask in ide_tf_load(). * Use do_rw_taskfile() in flagged_taskfile(). * Remove no longer needed 'tf_out_flags' field from ide_task_t. There should be no functionality changes caused by this patch. Cc: Tejun Heo <htejun@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:07 +01:00
Bartlomiej Zolnierkiewicz	9a3c49be5c	ide: add ide_no_data_taskfile() helper * Add ide_no_data_taskfile() helper and convert ide_raw_taskfile() w/ NO DATA protocol users to use it instead. * Set ->data_phase explicitly in ide_no_data_taskfile() (TASKFILE_NO_DATA is defined as 0x0000). * Unexport task_no_data_intr(). Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:07 +01:00
Bartlomiej Zolnierkiewicz	9e42237f26	ide: add ide_tf_load() helper Based on the earlier work by Tejun Heo. * Add 'tf_flags' field (for taskfile flags) to ide_task_t. * Add IDE_TFLAG_LBA48 taskfile flag for LBA48 taskfiles. * Add IDE_TFLAG_NO_SELECT_MASK taskfile flag for __ide_do_rw_disk() which doesn't use SELECT_MASK() (looks like a bug but it requires some more investigation). * Split off ide_tf_load() helper from do_rw_taskfile(). * Convert __ide_do_rw_disk() to use ide_tf_load(). There should be no functionality changes caused by this patch. Cc: Tejun Heo <htejun@gmail.com> Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:07 +01:00
Bartlomiej Zolnierkiewicz	650d841d9e	ide: add struct ide_taskfile (take 2) * Don't set write-only ide_task_t.hobRegister[6] and ide_task_t.hobRegister[7] in idedisk_set_max_address_ext(). * Add struct ide_taskfile and use it in ide_task_t instead of tfRegister[] and hobRegister[]. * Remove no longer needed IDE_CONTROL_OFFSET_HOB define. * Add #ifndef/#endif __KERNEL__ around definitions of {task,hob}_struct_t. While at it: * Use ATA_LBA define for LBA bit (0x40) as suggested by Tejun Heo. v2: * Add missing newlines. (Noticed by Sergei) * Use ~ATA_LBA instead of 0xBF. (Noticed by Sergei) * Use unnamed unions for error/feature and status/command. (Suggested by Sergei). There should be no functionality changes caused by this patch. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Tejun Heo <htejun@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:06 +01:00
Bartlomiej Zolnierkiewicz	cd2a2d9697	ide: remove task_ioreg_t typedef (take 2) Remove task_ioreg_t typedef from the kernel code (but leave it in <linux/hdreg.h> for #ifndef/#endif __KERNEL__ case). While at it also move sata_ioreg_t typedef under #ifndef/#endif __KERNEL__. v2: Remove name of the second parameter from ide_execute_command() declaration. (Noticed by Sergei). Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:06 +01:00
Bartlomiej Zolnierkiewicz	1c029fd658	ide: remove ->dma_master field from ide_hwif_t (take 5) * Convert cmd64x, hpt366 and pdc202xx_old host drivers to use pci_resource_start(hwif->pci_dev, 4) instead of hwif->dma_master. * Remove no longer needed ->dma_master field from ide_hwif_t. v2: * Use the more readable 'hwif->dma_base - (hwif->channel * 8)' instead of pci_resource_start(hwif->pci_dev, 4). v3: * Use hwif->extra_base in hpt366/pdc20xx_old + some cosmetic fixups over v2 (suggested by Sergei). v4: * Correct offsets in hpt3xxn_set_clock(). v5: * Use hwif->extra_base in hpt366 for _real_ this time. (Noticed by Sergei) Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Jeff Garzik <jeff@garzik.org> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:05 +01:00
Hans Verkuil	0b394def21	V4L/DVB (6868): i2c-id.h: add I2C_DRIVERID_CS5345 Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2008-01-25 19:04:07 -02:00
Hans Verkuil	05e997197e	V4L/DVB (6487): i2c-id: add M52790 driver ID Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2008-01-25 19:01:47 -02:00
Arjan van de Ven	6d082592b6	sched: keep total / count stats in addition to the max for Right now, the linux kernel (with scheduler statistics enabled) keeps track of the maximum time a process is waiting to be scheduled. While the maximum is a very useful metric, tracking average and total is equally useful (at least for latencytop) to figure out the accumulated effect of scheduler delays. The accumulated effect is important to judge the performance impact of scheduler tuning/behavior. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:35 +01:00
Alexey Dobriyan	286100a6cf	sched, futex: detach sched.h and futex.h Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:34 +01:00
Ingo Molnar	90739081ef	softlockup: fix signedness fix softlockup tunables signedness. mark tunables read-mostly. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:34 +01:00
Arjan van de Ven	9745512ce7	sched: latencytop support LatencyTOP kernel infrastructure; it measures latencies in the scheduler and tracks it system wide and per process. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:34 +01:00
Pavel Machek	e118adef23	timers: don't #error on higher HZ values For some crazy reason (trying to work around hw problem in i810) I wanted to use HZ around 4000. Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:34 +01:00
Ingo Molnar	6478d8800b	sched: remove the !PREEMPT_BKL code remove the !PREEMPT_BKL code. this removes 160 lines of legacy code. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:33 +01:00
Peter Zijlstra	d3d74453c3	hrtimer: fixup the HRTIMER_CB_IRQSAFE_NO_SOFTIRQ fallback Currently all highres=off timers are run from softirq context, but HRTIMER_CB_IRQSAFE_NO_SOFTIRQ timers expect to run from irq context. Fix this up by splitting it similar to the highres=on case. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:31 +01:00
Peter Zijlstra	48d5e25821	sched: rt throttling vs no_hz We need to teach no_hz about the rt throttling because its tick driven. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:31 +01:00
Peter Zijlstra	6f505b1642	sched: rt group scheduling Extend group scheduling to also cover the realtime classes. It uses the time limiting introduced by the previous patch to allow multiple realtime groups. The hard time limit is required to keep behaviour deterministic. The algorithms used make the realtime scheduler O(tg), linear scaling wrt the number of task groups. This is the worst case behaviour I can't seem to get out of, the avg. case of the algorithms can be improved, I focused on correctness and worst case. [ akpm@linux-foundation.org: move side-effects out of BUG_ON(). ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:30 +01:00
Peter Zijlstra	fa85ae2418	sched: rt time limit Very simple time limit on the realtime scheduling classes. Allow the rq's realtime class to consume sched_rt_ratio of every sched_rt_period slice. If the class exceeds this quota the fair class will preempt the realtime class. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:29 +01:00
Peter Zijlstra	8f4d37ec07	sched: high-res preemption tick Use HR-timers (when available) to deliver an accurate preemption tick. The regular scheduler tick that runs at 1/HZ can be too coarse when nice level are used. The fairness system will still keep the cpu utilisation 'fair' by then delaying the task that got an excessive amount of CPU time but try to minimize this by delivering preemption points spot-on. The average frequency of this extra interrupt is sched_latency / nr_latency. Which need not be higher than 1/HZ, its just that the distribution within the sched_latency period is important. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:29 +01:00
Herbert Xu	02b67cc3ba	sched: do not do cond_resched() when CONFIG_PREEMPT Why do we even have cond_resched when real preemption is on? It seems to be a waste of space and time. remove cond_resched with CONFIG_PREEMPT on. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:28 +01:00
Peter Zijlstra	78f2c7db60	sched: SCHED_FIFO/SCHED_RR watchdog timer Introduce a new rlimit that allows the user to set a runtime timeout on real-time tasks their slice. Once this limit is exceeded the task will receive SIGXCPU. So it measures runtime since the last sleep. Input and ideas by Thomas Gleixner and Lennart Poettering. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> CC: Lennart Poettering <mzxreary@0pointer.de> CC: Michael Kerrisk <mtk.manpages@googlemail.com> CC: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:27 +01:00
Peter Zijlstra	fa717060f1	sched: sched_rt_entity Move the task_struct members specific to rt scheduling together. A future optimization could be to put sched_entity and sched_rt_entity into a union. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> CC: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:27 +01:00
Paul E. McKenney	e260be673a	Preempt-RCU: implementation This patch implements a new version of RCU which allows its read-side critical sections to be preempted. It uses a set of counter pairs to keep track of the read-side critical sections and flips them when all tasks exit read-side critical section. The details of this implementation can be found in this paper - http://www.rdrop.com/users/paulmck/RCU/OLSrtRCU.2006.08.11a.pdf and the article- http://lwn.net/Articles/253651/ This patch was developed as a part of the -rt kernel development and meant to provide better latencies when read-side critical sections of RCU don't disable preemption. As a consequence of keeping track of RCU readers, the readers have a slight overhead (optimizations in the paper). This implementation co-exists with the "classic" RCU implementations and can be switched to at compiler. Also includes RCU tracing summarized in debugfs. [ akpm@linux-foundation.org: build fixes on non-preempt architectures ] Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com> Reviewed-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:24 +01:00
Paul E. McKenney	01c1c660f4	Preempt-RCU: reorganize RCU code into rcuclassic.c and rcupdate.c This patch re-organizes the RCU code to enable multiple implementations of RCU. Users of RCU continues to include rcupdate.h and the RCU interfaces remain the same. This is in preparation for subsequently merging the preemptible RCU implementation. Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:24 +01:00
Dipankar Sarma	c2d727aa2f	Preempt-RCU: Use softirq instead of tasklets for This patch makes RCU use softirq instead of tasklets. It also adds a memory barrier after raising the softirq inorder to ensure that the cpu sees the most recently updated value of rcu->cur while processing callbacks. The discussion of the related theoretical race pointed out by James Huang can be found here --> http://lkml.org/lkml/2007/11/20/603 Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com> Reviewed-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:23 +01:00
Steven Rostedt	cb46984504	sched: RT-balance, add new methods to sched_class Dmitry Adamushko found that the current implementation of the RT balancing code left out changes to the sched_setscheduler and rt_mutex_setprio. This patch addresses this issue by adding methods to the schedule classes to handle being switched out of (switched_from) and being switched into (switched_to) a sched_class. Also a method for changing of priorities is also added (prio_changed). This patch also removes some duplicate logic between rt_mutex_setprio and sched_setscheduler. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:22 +01:00
Steven Rostedt	9a897c5a67	sched: RT-balance, replace hooks with pre/post schedule and wakeup methods To make the main sched.c code more agnostic to the schedule classes. Instead of having specific hooks in the schedule code for the RT class balancing. They are replaced with a pre_schedule, post_schedule and task_wake_up methods. These methods may be used by any of the classes but currently, only the sched_rt class implements them. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:22 +01:00
Ingo Molnar	32525d022a	sched: whitespace cleanups in topology.h whitespace cleanups in topology.h. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:20 +01:00
Ingo Molnar	52d853431e	sched: reactivate fork balancing reactivate fork balancing. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:20 +01:00
Gregory Haskins	57d885fea0	sched: add sched-domain roots We add the notion of a root-domain which will be used later to rescope global variables to per-domain variables. Each exclusive cpuset essentially defines an island domain by fully partitioning the member cpus from any other cpuset. However, we currently still maintain some policy/state as global variables which transcend all cpusets. Consider, for instance, rt-overload state. Whenever a new exclusive cpuset is created, we also create a new root-domain object and move each cpu member to the root-domain's span. By default the system creates a single root-domain with all cpus as members (mimicking the global state we have today). We add some plumbing for storing class specific data in our root-domain. Whenever a RQ is switching root-domains (because of repartitioning) we give each sched_class the opportunity to remove any state from its old domain and add state to the new one. This logic doesn't have any clients yet but it will later in the series. Signed-off-by: Gregory Haskins <ghaskins@novell.com> CC: Christoph Lameter <clameter@sgi.com> CC: Paul Jackson <pj@sgi.com> CC: Simon Derr <simon.derr@bull.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:18 +01:00
Gregory Haskins	e7693a362e	sched: de-SCHED_OTHER-ize the RT path The current wake-up code path tries to determine if it can optimize the wake-up to "this_cpu" by computing load calculations. The problem is that these calculations are only relevant to SCHED_OTHER tasks where load is king. For RT tasks, priority is king. So the load calculation is completely wasted bandwidth. Therefore, we create a new sched_class interface to help with pre-wakeup routing decisions and move the load calculation as a function of CFS task's class. Signed-off-by: Gregory Haskins <ghaskins@novell.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:09 +01:00
Gregory Haskins	73fe6aae84	sched: add RT-balance cpu-weight Some RT tasks (particularly kthreads) are bound to one specific CPU. It is fairly common for two or more bound tasks to get queued up at the same time. Consider, for instance, softirq_timer and softirq_sched. A timer goes off in an ISR which schedules softirq_thread to run at RT50. Then the timer handler determines that it's time to smp-rebalance the system so it schedules softirq_sched to run. So we are in a situation where we have two RT50 tasks queued, and the system will go into rt-overload condition to request other CPUs for help. This causes two problems in the current code: 1) If a high-priority bound task and a low-priority unbounded task queue up behind the running task, we will fail to ever relocate the unbounded task because we terminate the search on the first unmovable task. 2) We spend precious futile cycles in the fast-path trying to pull overloaded tasks over. It is therefore optimial to strive to avoid the overhead all together if we can cheaply detect the condition before overload even occurs. This patch tries to achieve this optimization by utilizing the hamming weight of the task->cpus_allowed mask. A weight of 1 indicates that the task cannot be migrated. We will then utilize this information to skip non-migratable tasks and to eliminate uncessary rebalance attempts. We introduce a per-rq variable to count the number of migratable tasks that are currently running. We only go into overload if we have more than one rt task, AND at least one of them is migratable. In addition, we introduce a per-task variable to cache the cpus_allowed weight, since the hamming calculation is probably relatively expensive. We only update the cached value when the mask is updated which should be relatively infrequent, especially compared to scheduling frequency in the fast path. Signed-off-by: Gregory Haskins <ghaskins@novell.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:07 +01:00
Ingo Molnar	82a1fcb902	softlockup: automatically detect hung TASK_UNINTERRUPTIBLE tasks this patch extends the soft-lockup detector to automatically detect hung TASK_UNINTERRUPTIBLE tasks. Such hung tasks are printed the following way: ------------------> INFO: task prctl:3042 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message prctl D fd5e3793 0 3042 2997 f6050f38 00000046 00000001 fd5e3793 00000009 c06d8264 c06dae80 00000286 f6050f40 f6050f00 f7d34d90 f7d34fc8 c1e1be80 00000001 f6050000 00000000 f7e92d00 00000286 f6050f18 c0489d1a f6050f40 00006605 00000000 c0133a5b Call Trace: [<c04883a5>] schedule_timeout+0x6d/0x8b [<c04883d8>] schedule_timeout_uninterruptible+0x15/0x17 [<c0133a76>] msleep+0x10/0x16 [<c0138974>] sys_prctl+0x30/0x1e2 [<c0104c52>] sysenter_past_esp+0x5f/0xa5 ======================= 2 locks held by prctl/3042: #0: (&sb->s_type->i_mutex_key#5){--..}, at: [<c0197d11>] do_fsync+0x38/0x7a #1: (jbd_handle){--..}, at: [<c01ca3d2>] journal_start+0xc7/0xe9 <------------------ the current default timeout is 120 seconds. Such messages are printed up to 10 times per bootup. If the system has crashed already then the messages are not printed. if lockdep is enabled then all held locks are printed as well. this feature is a natural extension to the softlockup-detector (kernel locked up without scheduling) and to the NMI watchdog (kernel locked up with IRQs disabled). [ Gautham R Shenoy <ego@in.ibm.com>: CPU hotplug fixes. ] [ Andrew Morton <akpm@linux-foundation.org>: build warning fix. ] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>	2008-01-25 21:08:02 +01:00
Ingo Molnar	d0d23b5432	cpu-hotplug: fix build on !CONFIG_SMP fix build on !CONFIG_SMP. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:02 +01:00
Gautham R Shenoy	95402b3829	cpu-hotplug: replace per-subsystem mutexes with get_online_cpus() This patch converts the known per-subsystem mutexes to get_online_cpus put_online_cpus. It also eliminates the CPU_LOCK_ACQUIRE and CPU_LOCK_RELEASE hotplug notification events. Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:02 +01:00
Gautham R Shenoy	86ef5c9a8e	cpu-hotplug: replace lock_cpu_hotplug() with get_online_cpus() Replace all lock_cpu_hotplug/unlock_cpu_hotplug from the kernel and use get_online_cpus and put_online_cpus instead as it highlights the refcount semantics in these operations. The new API guarantees protection against the cpu-hotplug operation, but it doesn't guarantee serialized access to any of the local data structures. Hence the changes needs to be reviewed. In case of pseries_add_processor/pseries_remove_processor, use cpu_maps_update_begin()/cpu_maps_update_done() as we're modifying the cpu_present_map there. Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:02 +01:00
Gautham R Shenoy	d221938c04	cpu-hotplug: refcount based cpu hotplug This patch implements a Refcount + Waitqueue based model for cpu-hotplug. Now, a thread which wants to prevent cpu-hotplug, will bump up a global refcount and the thread which wants to perform a cpu-hotplug operation will block till the global refcount goes to zero. The readers, if any, during an ongoing cpu-hotplug operation are blocked until the cpu-hotplug operation is over. Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Paul Jackson <pj@sgi.com> [For !CONFIG_HOTPLUG_CPU ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:01 +01:00
Srivatsa Vaddagiri	6b2d770026	sched: group scheduler, fix fairness of cpu bandwidth allocation for task groups The current load balancing scheme isn't good enough for precise group fairness. For example: on a 8-cpu system, I created 3 groups as under: a = 8 tasks (cpu.shares = 1024) b = 4 tasks (cpu.shares = 1024) c = 3 tasks (cpu.shares = 1024) a, b and c are task groups that have equal weight. We would expect each of the groups to receive 33.33% of cpu bandwidth under a fair scheduler. This is what I get with the latest scheduler git tree: Signed-off-by: Ingo Molnar <mingo@elte.hu> -------------------------------------------------------------------------------- Col1 \| Col2 \| Col3 \| Col4 ------\|---------\|-------\|------------------------------------------------------- a \| 277.676 \| 57.8% \| 54.1% 54.1% 54.1% 54.2% 56.7% 62.2% 62.8% 64.5% b \| 116.108 \| 24.2% \| 47.4% 48.1% 48.7% 49.3% c \| 86.326 \| 18.0% \| 47.5% 47.9% 48.5% -------------------------------------------------------------------------------- Explanation of o/p: Col1 -> Group name Col2 -> Cumulative execution time (in seconds) received by all tasks of that group in a 60sec window across 8 cpus Col3 -> CPU bandwidth received by the group in the 60sec window, expressed in percentage. Col3 data is derived as: Col3 = 100 * Col2 / (NR_CPUS * 60) Col4 -> CPU bandwidth received by each individual task of the group. Col4 = 100 * cpu_time_recd_by_task / 60 [I can share the test case that produces a similar o/p if reqd] The deviation from desired group fairness is as below: a = +24.47% b = -9.13% c = -15.33% which is quite high. After the patch below is applied, here are the results: -------------------------------------------------------------------------------- Col1 \| Col2 \| Col3 \| Col4 ------\|---------\|-------\|------------------------------------------------------- a \| 163.112 \| 34.0% \| 33.2% 33.4% 33.5% 33.5% 33.7% 34.4% 34.8% 35.3% b \| 156.220 \| 32.5% \| 63.3% 64.5% 66.1% 66.5% c \| 160.653 \| 33.5% \| 85.8% 90.6% 91.4% -------------------------------------------------------------------------------- Deviation from desired group fairness is as below: a = +0.67% b = -0.83% c = +0.17% which is far better IMO. Most of other runs have yielded a deviation within +-2% at the most, which is good. Why do we see bad (group) fairness with current scheuler? ========================================================= Currently cpu's weight is just the summation of individual task weights. This can yield incorrect results. For ex: consider three groups as below on a 2-cpu system: CPU0 CPU1 --------------------------- A (10) B(5) C(5) --------------------------- Group A has 10 tasks, all on CPU0, Group B and C have 5 tasks each all of which are on CPU1. Each task has the same weight (NICE_0_LOAD = 1024). The current scheme would yield a cpu weight of 10240 (101024) for each cpu and the load balancer will think both CPUs are perfectly balanced and won't move around any tasks. This, however, would yield this bandwidth: A = 50% B = 25% C = 25% which is not the desired result. What's changing in the patch? ============================= - How cpu weights are calculated when CONFIF_FAIR_GROUP_SCHED is defined (see below) - API Change - Two tunables introduced in sysfs (under SCHED_DEBUG) to control the frequency at which the load balance monitor thread runs. The basic change made in this patch is how cpu weight (rq->load.weight) is calculated. Its now calculated as the summation of group weights on a cpu, rather than summation of task weights. Weight exerted by a group on a cpu is dependent on the shares allocated to it and also the number of tasks the group has on that cpu compared to the total number of (runnable) tasks the group has in the system. Let, W(K,i) = Weight of group K on cpu i T(K,i) = Task load present in group K's cfs_rq on cpu i T(K) = Total task load of group K across various cpus S(K) = Shares allocated to group K NRCPUS = Number of online cpus in the scheduler domain to which group K is assigned. Then, W(K,i) = S(K) NRCPUS * T(K,i) / T(K) A load balance monitor thread is created at bootup, which periodically runs and adjusts group's weight on each cpu. To avoid its overhead, two min/max tunables are introduced (under SCHED_DEBUG) to control the rate at which it runs. Fixes from: Peter Zijlstra <a.p.zijlstra@chello.nl> - don't start the load_balance_monitor when there is only a single cpu. - rename the kthread because its currently longer than TASK_COMM_LEN Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:00 +01:00
Linus Torvalds	b47711bfbc	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6: selinux: make mls_compute_sid always polyinstantiate security/selinux: constify function pointer tables and fields security: add a secctx_to_secid() hook security: call security_file_permission from rw_verify_area security: remove security_sb_post_mountroot hook Security: remove security.h include from mm.h Security: remove security_file_mmap hook sparse-warnings (NULL as 0). Security: add get, set, and cloning of superblock security information security/selinux: Add missing "space"	2008-01-25 08:44:29 -08:00
Linus Torvalds	eba0e319c1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (125 commits) [CRYPTO] twofish: Merge common glue code [CRYPTO] hifn_795x: Fixup container_of() usage [CRYPTO] cast6: inline bloat-- [CRYPTO] api: Set default CRYPTO_MINALIGN to unsigned long long [CRYPTO] tcrypt: Make xcbc available as a standalone test [CRYPTO] xcbc: Remove bogus hash/cipher test [CRYPTO] xcbc: Fix algorithm leak when block size check fails [CRYPTO] tcrypt: Zero axbuf in the right function [CRYPTO] padlock: Only reset the key once for each CBC and ECB operation [CRYPTO] api: Include sched.h for cond_resched in scatterwalk.h [CRYPTO] salsa20-asm: Remove unnecessary dependency on CRYPTO_SALSA20 [CRYPTO] tcrypt: Add select of AEAD [CRYPTO] salsa20: Add x86-64 assembly version [CRYPTO] salsa20_i586: Salsa20 stream cipher algorithm (i586 version) [CRYPTO] gcm: Introduce rfc4106 [CRYPTO] api: Show async type [CRYPTO] chainiv: Avoid lock spinning where possible [CRYPTO] seqiv: Add select AEAD in Kconfig [CRYPTO] scatterwalk: Handle zero nbytes in scatterwalk_map_and_copy [CRYPTO] null: Allow setkey on digest_null ...	2008-01-25 08:38:25 -08:00
Artem Bityutskiy	866136827b	UBI: introduce atomic LEB change ioctl We have to be able to change individual LEBs for utilities like ubifsck, ubifstune. For example, ubifsck has to be able to fix errors on the media, ubifstune has to be able to change the the superblock, hence this ioctl. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-01-25 16:41:25 +02:00
Greg Kroah-Hartman	79a6ee42fd	Kobject: fix coding style issues in kobject.h Finally clean up the odd spaces and other mess in kobject.h Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 21:27:06 -08:00
Greg Kroah-Hartman	d462943afe	Driver core: fix coding style issues in device.h Finally clean up the odd spaces and other mess in device.h Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 21:04:46 -08:00
Dave Young	fd04897bb2	Driver Core: add class iteration api Add the following class iteration functions for driver use: class_for_each_device class_find_device class_for_each_child class_find_child Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:44 -08:00
Stephen Rothwell	ae72cddb23	Driver Core: constify the name passed to platform_device_register_simple This name is just passed to platform_device_alloc which has its parameter declared const. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:43 -08:00
Kay Sievers	af5ca3f4ec	Driver core: change sysdev classes to use dynamic kobject names All kobjects require a dynamically allocated name now. We no longer need to keep track if the name is statically assigned, we can just unconditionally free() all kobject names on cleanup. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:40 -08:00
Greg Kroah-Hartman	528a4bf1d5	Kobject: remove kobject_unregister() as no one uses it anymore There are no in-kernel users of kobject_unregister() so it should be removed. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:40 -08:00
Kay Sievers	0f4dafc056	Kobject: auto-cleanup on final unref We save the current state in the object itself, so we can do proper cleanup when the last reference is dropped. If the initial reference is dropped, the object will be removed from sysfs if needed, if an "add" event was sent, "remove" will be send, and the allocated resources are released. This allows us to clean up some driver core usage as well as allowing us to do other such changes to the rest of the kernel. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:39 -08:00
Greg Kroah-Hartman	12e339ac6e	Kset: remove kset_add function No one is calling this anymore, so just remove it and hard-code the one internal-use of it. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:39 -08:00
Greg Kroah-Hartman	6d06adfaf8	Kobject: remove kobject_register() The function is no longer used by anyone in the kernel, and it prevents the proper sending of the kobject uevent after the needed files are set up by the caller. kobject_init_and_add() can be used in its place. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:39 -08:00
Greg Kroah-Hartman	f9cb074bff	Kobject: rename kobject_init_ng() to kobject_init() Now that the old kobject_init() function is gone, rename kobject_init_ng() to kobject_init() to clean up the namespace. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:38 -08:00
Greg Kroah-Hartman	e1543ddf73	Kobject: remove kobject_init() as no one uses it anymore The old kobject_init() function is on longer in use, so let us remove it from the public scope (kset mess in the kobject.c file still uses it, but that can be cleaned up later very simply.) Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:38 -08:00
Greg Kroah-Hartman	b2d6db5878	Kobject: rename kobject_add_ng() to kobject_add() Now that the old kobject_add() function is gone, rename kobject_add_ng() to kobject_add() to clean up the namespace. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:38 -08:00
Greg Kroah-Hartman	9e7bbccd02	Kobject: remove kobject_add() as no one uses it anymore The old kobject_add() function is on longer in use, so let us remove it from the public scope (kset mess in the kobject.c file still uses it, but that can be cleaned up later very simply.) Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:38 -08:00
Kay Sievers	edfaa7c365	Driver core: convert block from raw kobjects to core devices This moves the block devices to /sys/class/block. It will create a flat list of all block devices, with the disks and partitions in one directory. For compatibility /sys/block is created and contains symlinks to the disks. /sys/class/block \|-- sda -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda \|-- sda1 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda1 \|-- sda10 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda10 \|-- sda5 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda5 \|-- sda6 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda6 \|-- sda7 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda7 \|-- sda8 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda8 \|-- sda9 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda9 `-- sr0 -> ../../devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0 /sys/block/ \|-- sda -> ../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda `-- sr0 -> ../devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0 Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:36 -08:00
Greg Kroah-Hartman	e5dd127846	Driver core: move the static kobject out of struct driver This patch removes the kobject, and a few other driver-core-only fields out of struct driver and into the driver core only. Now drivers can be safely create on the stack or statically (like they currently are.) Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:35 -08:00
Greg Kroah-Hartman	c63469a398	Driver core: move the driver specific module code into the driver core The module driver specific code should belong in the driver core, not in the kernel/ directory. So move this code. This is done in preparation for some struct device_driver rework that should be confined to the driver core code only. This also lets us keep from exporting these functions, as no external code should ever be calling it. Thanks to Andrew Morton for the !CONFIG_MODULES fix. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:35 -08:00
Greg Kroah-Hartman	cbe9c595f1	Driver: add driver_add_kobj for looney iseries_veth driver The iseries driver wants to hang kobjects off of its driver, so, to preserve backwards compatibility, we need to add a call to the driver core to allow future changes to work properly. Hopefully no one uses this function in the future and the iseries_veth driver authors come to their senses so I can remove this hack... Cc: Dave Larson <larson1@us.ibm.com> Cc: Santiago Leon <santil@us.ibm.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:34 -08:00
Cornelia Huck	57c745340a	driver core: Introduce default attribute groups. This is lot like default attributes for devices (and indeed, a lot of the code is lifted from there). Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:34 -08:00
Greg Kroah-Hartman	c6f7e72a3f	driver core: remove fields from struct bus_type struct bus_type is static everywhere in the kernel. This moves the kobject in the structure out of it, and a bunch of other private only to the driver core fields are now moved to a private structure. This lets us dynamically create the backing kobject properly and gives us the chance to be able to document to users exactly how to use the struct bus_type as there are no fields they can improperly access. Thanks to Kay for the build fixes on this patch. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:33 -08:00
Greg Kroah-Hartman	b249072ee6	driver core: add way to get to bus device klist This allows an easier way to get to the device klist associated with a struct bus_type (you have three to choose from...) This will make it easier to move these fields to be dynamic in a future patch. The only user of this is the PCI core which horribly abuses this interface to rearrange the order of the pci devices. This should be done using the existing bus device walking functions, but that's left for future patches. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:33 -08:00
Greg Kroah-Hartman	0fed80f7a6	driver core: add way to get to bus kset This allows an easier way to get to the kset associated with a struct bus_type (you have three to choose from...) This will make it easier to move these fields to be dynamic in a future patch. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:33 -08:00
Greg Kroah-Hartman	cc972e896b	driver core: remove owner field from struct bus_type This isn't used by anything in the driver core, and by no one in the 204 different usages of it in the kernel tree. Remove this field so no one gets any idea that it is needed to be used. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:31 -08:00
Greg Kroah-Hartman	81e7c6a636	UIO: fix kobject usage The uio kobject code is "wierd". This patch should hopefully fix it up to be sane and not leak memory anymore. Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benedikt Spranger <b.spranger@linutronix.de> Signed-off-by: Hans J. Koch <hjk@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:26 -08:00
Greg Kroah-Hartman	d76e15fb20	driver core: make /sys/power a kobject /sys/power should not be a kset, that's overkill. This patch renames it to power_kset and fixes up all usages of it in the tree. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:25 -08:00
Greg Kroah-Hartman	2fb9113b97	kobject: remove subsystem_(un)register functions These functions are no longer used and are the last remants of the old subsystem crap. So delete them for good. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:24 -08:00
Greg Kroah-Hartman	0ff21e4663	kobject: convert kernel_kset to be a kobject kernel_kset does not need to be a kset, but a much simpler kobject now that we have kobj_attributes. We also rename kernel_kset to kernel_kobj to catch all users of this symbol with a build error instead of an easy-to-ignore build warning. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:24 -08:00
Greg Kroah-Hartman	5c03c7ab88	kset: remove decl_subsys macro This macro is no longer used. ksets should be created dynamically with a call to kset_create_and_add() not declared statically. Yes, there are 5 remaining static struct kset usages in the kernel tree, but they will be fixed up soon. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:24 -08:00
Greg Kroah-Hartman	f62ed9e33b	firmware: change firmware_kset to firmware_kobj There is no firmware "subsystem" it's just a directory in /sys that other portions of the kernel want to hook into. So make it a kobject not a kset to help alivate anyone who tries to do some odd kset-like things with this. Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:23 -08:00
Greg Kroah-Hartman	15f2f9b3a9	firmware: remove firmware_(un)register() These functions are no longer called or needed, so we can remove them. As I rewrote the whole firmware.c file, add my copyright. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:23 -08:00
Kay Sievers	000f2a4d8c	Driver Core: kill subsys_attribute and default sysfs ops Remove the no longer needed subsys_attributes, they are all converted to the more sensical kobj_attributes. There is no longer a magic fallback in sysfs attribute operations, all kobjects which create simple attributes need explicitely a ktype assigned, which tells the core what was intended here. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:22 -08:00
Greg Kroah-Hartman	9e5f7f9abe	firmware: export firmware_kset so that people can use that instead of the braindead firmware_register interface Needed for future firmware subsystem cleanups. In the end, the firmware_register/unregister functions will be deleted entirely, but we need this symbol so that subsystems can migrate over. Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Matt Domsch <Matt_Domsch@dell.com> Cc: Matt Tolentino <matthew.e.tolentino@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:19 -08:00
Kay Sievers	eb41d9465c	fix struct user_info export's sysfs interaction Clean up the use of ksets and kobjects. Kobjects are instances of objects (like struct user_info), ksets are collections of objects of a similar type (like the uids directory containing the user_info directories). So, use kobjects for the user_info directories, and a kset for the "uids" directory. On object cleanup, the final kobject_put() was missing. Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com> Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:18 -08:00
Kay Sievers	23b5212cc7	Driver Core: add kobj_attribute handling Add kobj_sysfs_ops to replace subsys_sysfs_ops. There is no need for special kset operations, we want to be able to use simple attribute operations at any kobject, not only ksets. The whole concept of any default sysfs attribute operations will go away with the upcoming removal of subsys_sysfs_ops. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:18 -08:00
Greg Kroah-Hartman	6dcec2511f	kset: convert struct bus_device->drivers to use kset_create Dynamically create the kset instead of declaring it statically. Having 3 static kobjects in one structure is not only foolish, but ripe for nasty race conditions if handled improperly. We also rename the field to catch any potential users of it (not that there should be outside of the driver core...) Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:16 -08:00
Greg Kroah-Hartman	3d8995963d	kset: convert struct bus_device->devices to use kset_create Dynamically create the kset instead of declaring it statically. Having 3 static kobjects in one structure is not only foolish, but ripe for nasty race conditions if handled improperly. We also rename the field to catch any potential users of it (not that there should be outside of the driver core...) Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:16 -08:00
Greg Kroah-Hartman	039a5dcd2f	kset: convert /sys/power to use kset_create Dynamically create the kset instead of declaring it statically. We also rename power_subsys to power_kset to catch all users of the variable and we properly export it so that people don't have to guess that it really is present in the system. The pseries code is wierd, why is it createing /sys/power if CONFIG_PM is disabled? Oh well, stupid big boxes ignoring config options... Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:16 -08:00
Greg Kroah-Hartman	7405c1e15e	kset: convert /sys/module to use kset_create Dynamically create the kset instead of declaring it statically. We also rename module_subsys to module_kset to catch all users of the variable. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:16 -08:00
Greg Kroah-Hartman	2d72fc00a1	kobject: convert /sys/hypervisor to use kobject_create We don't need a kset here, a simple kobject will do just fine, so dynamically create the kobject and use it. We also rename hypervisor_subsys to hypervisor_kset to catch all users of the variable. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:15 -08:00
Greg Kroah-Hartman	bd35b93d80	kset: convert kernel_subsys to use kset_create Dynamically create the kset instead of declaring it statically. We also rename kernel_subsys to kernel_kset to catch all users of this symbol with a build error instead of an easy-to-ignore build warning. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:14 -08:00
Greg Kroah-Hartman	e5e38a86c0	kset: remove decl_subsys_name The last user of this macro (pci hotplug core) is now switched over to using a dynamic kset, so this macro is no longer needed at all. Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:14 -08:00
Greg Kroah-Hartman	81ace5cd8f	kset: convert pci hotplug to use kset_create_and_add This also renames pci_hotplug_slots_subsys to pcis_hotplug_slots_kset catch all current users with a build error instead of a build warning which can easily be missed. Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:14 -08:00
Greg Kroah-Hartman	00d2666623	kobject: convert main fs kobject to use kobject_create This also renames fs_subsys to fs_kobj to catch all current users with a build error instead of a build warning which can easily be missed. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:13 -08:00
Greg Kroah-Hartman	43968d2f16	kobject: get rid of kobject_kset_add_dir kobject_kset_add_dir is only called in one place so remove it and use kobject_create() instead. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:11 -08:00
Greg Kroah-Hartman	4ff6abff83	kobject: get rid of kobject_add_dir kobject_create_and_add is the same as kobject_add_dir, so drop kobject_add_dir. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:11 -08:00
Greg Kroah-Hartman	3f9e3ee9dc	kobject: add kobject_create_and_add function This lets users create dynamic kobjects much easier. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:10 -08:00
Greg Kroah-Hartman	b727c70289	kset: add kset_create_and_add function Now ksets can be dynamically created on the fly, no static definitions are required. Thanks to Miklos for hints on how to make this work better for the callers. And thanks to Kay for finding some stupid bugs in my original version and pointing out that we need to handle the fact that kobject's can have a kset as a parent and to handle that properly in kobject_add(). Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:10 -08:00
Greg Kroah-Hartman	12d03da7c1	kobject: remove kobj_set_kset_s as no one is using it anymore What a confusing name for a macro... Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:10 -08:00
Greg Kroah-Hartman	3514faca19	kobject: remove struct kobj_type from struct kset We don't need a "default" ktype for a kset. We should set this explicitly every time for each kset. This change is needed so that we can make ksets dynamic, and cleans up one of the odd, undocumented assumption that the kset/kobject/ktype model has. This patch is based on a lot of help from Kay Sievers. Nasty bug in the block code was found by Dave Young <hidave.darkstar@gmail.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:10 -08:00
Greg Kroah-Hartman	c11c4154e7	kobject: add kobject_init_and_add function Also add a kobject_init_and_add function which bundles up what a lot of the current callers want to do all at once, and it properly handles the memory usages, unlike kobject_register(); Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:10 -08:00
Greg Kroah-Hartman	244f6cee9a	kobject: add kobject_add_ng function This is what the kobject_add function is going to become. Add this to the kernel and then we can convert the tree over to use it. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:09 -08:00
Greg Kroah-Hartman	e86000d042	kobject: add kobject_init_ng function This is what the kobject_init function is going to become. Add this to the kernel and then we can convert the tree over to use it. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:09 -08:00
Greg Kroah-Hartman	18041f4775	kobject: make kobject_cleanup be static No one except the kobject core calls it so make the function static. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:09 -08:00
Emil Medve	7b8712e563	driver core: Make the dev_*() family of macros in device.h complete Removed duplicates defined elsewhere Signed-off-by: Emil Medve <Emilian.Medve@Freescale.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:08 -08:00
Tony Jones	7dd817d083	tifm: Convert from class_device to device for TI flash media Signed-off-by: Tony Jones <tonyj@suse.de> Cc: Alex Dubov <oakad@yahoo.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:06 -08:00
Tony Jones	6013c12be8	pktcdvd: Convert from class_device to device for block/pktcdvd struct class_device is going away, this converts the code to use struct device instead. Signed-off-by: Tony Jones <tonyj@suse.de> Cc: Peter Osterlund <petero2@telia.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:06 -08:00
Tony Jones	891f78ea83	DMA: Convert from class_device to device for DMA engine Signed-off-by: Tony Jones <tonyj@suse.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Cc: Shannon Nelson <shannon.nelson@intel.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:05 -08:00
Evgeniy Polyakov	41ca28ab2a	kref: add kref_set() This adds kref_set() to the kref api for future use by people who really know what they are doing with krefs... From: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:05 -08:00
Rafael J. Wysocki	775b64d2b6	PM: Acquire device locks on suspend This patch reorganizes the way suspend and resume notifications are sent to drivers. The major changes are that now the PM core acquires every device semaphore before calling the methods, and calls to device_add() during suspends will fail, while calls to device_del() during suspends will block. It also provides a way to safely remove a suspended device with the help of the PM core, by using the device_pm_schedule_removal() callback introduced specifically for this purpose, and updates two drivers (msr and cpuid) that need to use it. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:04 -08:00
Jan Engelhardt	1996a10948	security/selinux: constify function pointer tables and fields Constify function pointer tables and fields. Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: James Morris <jmorris@namei.org>	2008-01-25 11:29:54 +11:00
David Howells	63cb344923	security: add a secctx_to_secid() hook Add a secctx_to_secid() LSM hook to go along with the existing secid_to_secctx() LSM hook. This patch also includes the SELinux implementation for this hook. Signed-off-by: Paul Moore <paul.moore@hp.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>	2008-01-25 11:29:53 +11:00
H. Peter Anvin	bced95283e	security: remove security_sb_post_mountroot hook The security_sb_post_mountroot() hook is long-since obsolete, and is fundamentally broken: it is never invoked if someone uses initramfs. This is particularly damaging, because the existence of this hook has been used as motivation for not using initramfs. Stephen Smalley confirmed on 2007-07-19 that this hook was originally used by SELinux but can now be safely removed: http://marc.info/?l=linux-kernel&m=118485683612916&w=2 Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: James Morris <jmorris@namei.org> Cc: Eric Paris <eparis@parisplace.org> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-01-25 11:29:50 +11:00
James Morris	42d7896ebc	Security: remove security.h include from mm.h Remove security.h include from mm.h, as it is only needed for a single extern declaration, and pulls in all kinds of crud. Fine-by-me: David Chinner <dgc@sgi.com> Acked-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-01-25 11:29:49 +11:00
Eric Paris	c9180a57a9	Security: add get, set, and cloning of superblock security information Adds security_get_sb_mnt_opts, security_set_sb_mnt_opts, and security_clont_sb_mnt_opts to the LSM and to SELinux. This will allow filesystems to directly own and control all of their mount options if they so choose. This interface deals only with option identifiers and strings so it should generic enough for any LSM which may come in the future. Filesystems which pass text mount data around in the kernel (almost all of them) need not currently make use of this interface when dealing with SELinux since it will still parse those strings as it always has. I assume future LSM's would do the same. NFS is the primary FS which does not use text mount data and thus must make use of this interface. An LSM would need to implement these functions only if they had mount time options, such as selinux has context= or fscontext=. If the LSM has no mount time options they could simply not implement and let the dummy ops take care of things. An LSM other than SELinux would need to define new option numbers in security.h and any FS which decides to own there own security options would need to be patched to use this new interface for every possible LSM. This is because it was stated to me very clearly that LSM's should not attempt to understand FS mount data and the burdon to understand security should be in the FS which owns the options. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>	2008-01-25 11:29:46 +11:00
Mattia Dongili	3eb8749a37	sony-laptop: add Type4 model Recent Vaio models (UX, SZ and presumably TZ and others) add more events and a slightly different handling of Fn key events for additional hotkeys (s1, s2, zoom-in/out, etc.). Signed-off-by: Mattia Dongili <malattia@linux.it> Signed-off-by: Len Brown <len.brown@intel.com>	2008-01-24 00:47:27 -05:00
Paul Mackerras	dcb571be20	Merge branch 'for-2.6.25' of master.kernel.org:/pub/scm/linux/kernel/git/galak/powerpc into for-2.6.25	2008-01-24 15:29:14 +11:00
Len Brown	d4b7dc499d	ACPI: make _OSI(Linux) console messages smarter If BIOS invokes _OSI(Linux), the kernel response depends on what the ACPI DMI list knows about the system, and that is reflectd in dmesg: 1) System unknown to DMI: ACPI: BIOS _OSI(Linux) query ignored ACPI: DMI System Vendor: LENOVO ACPI: DMI Product Name: 7661W1P ACPI: DMI Product Version: ThinkPad T61 ACPI: DMI Board Name: 7661W1P ACPI: DMI BIOS Vendor: LENOVO ACPI: DMI BIOS Date: 10/18/2007 ACPI: Please send DMI info above to linux-acpi@vger.kernel.org ACPI: If "acpi_osi=Linux" works better, please notify linux-acpi@vger.kernel.org 2) System known to DMI, but effect of OSI(Linux) unknown: ACPI: DMI detected: Lenovo ThinkPad T61 ... ACPI: BIOS _OSI(Linux) query ignored via DMI ACPI: If "acpi_osi=Linux" works better, please notify linux-acpi@vger.kernel.org 3) System known to DMI, which disables _OSI(Linux): ACPI: DMI detected: Lenovo ThinkPad T61 ... ACPI: BIOS _OSI(Linux) query ignored via DMI 4) System known to DMI, which enable _OSI(Linux): ACPI: DMI detected: Lenovo ThinkPad T61 ACPI: Added _OSI(Linux) ... ACPI: BIOS _OSI(Linux) query honored via DMI cmdline overrides take precidence over the built-in default and the DMI prescribed default. cmdline "acpi_osi=Linux" results in: ACPI: BIOS _OSI(Linux) query honored via cmdline Signed-off-by: Len Brown <len.brown@intel.com>	2008-01-23 21:26:15 -05:00
Len Brown	f89e3b0620	DMI: create dmi_get_slot() This simply allows other sub-systems (such as ACPI) to access and print out slots in static dmi_ident[]. Signed-off-by: Len Brown <len.brown@intel.com>	2008-01-23 21:23:13 -05:00
Len Brown	81b4e1f626	DMI: move dmi_available declaration to linux/dmi.h Signed-off-by: Len Brown <len.brown@intel.com>	2008-01-23 21:22:21 -05:00
Vitaly Bordug	a79d8e93d3	phy/fixed.c: rework to not duplicate PHY layer functionality With that patch fixed.c now fully emulates MDIO bus, thus no need to duplicate PHY layer functionality. That, in turn, drastically simplifies the code, and drops down line count. As an additional bonus, now there is no need to register MDIO bus for each PHY, all emulated PHYs placed on the platform fixed MDIO bus. There is also no more need to pre-allocate PHYs via .config option, this is all now handled dynamically. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Vitaly Bordug <vitb@kernel.crashing.org> Acked-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2008-01-23 19:33:58 -06:00
Paul Mackerras	9156ad4833	Merge branch 'linux-2.6'	2008-01-24 10:07:21 +11:00
James Bottomley	d4acd722b7	[SCSI] sysfs: add filter function to groups This patch allows the various users of attribute_groups to selectively allow the appearance of group attributes. The primary consumer of this will be the transport classes in which we currently have elaborate attribute selection algorithms to do this same thing. Acked-by: Greg KH <greg@kroah.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-01-23 11:29:18 -06:00
James Bottomley	fd1109711d	[SCSI] attribute_container: update to use the group interface This patch is the beginning of moving the attribute_containers to use attribute groups exclusively. The attr element is now deprecated and will eventually be removed (along with all the hand rolled code for doing exactly what attribute groups do) when all the consumers are converted to attribute groups. Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-01-23 11:29:17 -06:00
Alan Cox	5e8f757cb2	ata_generic: Cenatek support Not much to do here. It's an ata memory as disk. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:17 -05:00
Tejun Heo	4e6b79fa61	libata: factor out ata_pci_activate_sff_host() from ata_pci_one() Factor out ata_pci_activate_sff_host() from ata_pci_one(). This does about the same thing as ata_host_activate() but needs to be separate because SFF controllers use different and multiple IRQs in legacy mode. This will be used to make SFF LLD initialization more flexible. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:16 -05:00
Al Viro	4ca4e43964	libata annotations and fixes Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:15 -05:00
Jeff Garzik	442eacc362	libata: make ata_port_queue_task() an internal function ata_port_queue_task() served a single user: ata_pio_task() Rename to ata_pio_queue_task() and un-export it, as nobody outside of libata-core.c uses it. Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-01-23 05:24:15 -05:00
Tejun Heo	0bcc65ad78	libata: make qc->nbytes include extra buffers qc->nbytes didn't use to include extra buffers setup by libata core layer and my be odd. This patch makes qc->nbytes include any extra buffers setup by libata core layer and guaranteed to be aligned on 4 byte boundary. This value is to be used to program the host controller. As this represents the actual length of buffer available to the controller and the controller must be able to deal with short transfers for ATAPI commands which can transfer variable length, this shouldn't break any controllers while making problems like rounding-down and controllers choking up on odd transfer bytes much less likely. The unmodified value is stored in new field qc->raw_nbytes. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:15 -05:00
Tejun Heo	ff2aeb1eb6	libata: convert to chained sg libata used private sg iterator to handle padding sg. Now that sg can be chained, padding can be handled using standard sg ops. Convert to chained sg. * s/qc->__sg/qc->sg/ * s/qc->pad_sgent/qc->extra_sg[]/. Because chaining consumes one sg entry. There need to be two extra sg entries. The renaming is also for future addition of other extra sg entries. * Padding setup is moved into ata_sg_setup_extra() which is organized in a way that future addition of other extra sg entries is easy. * qc->orig_n_elem is unused and removed. * qc->n_elem now contains the number of sg entries that LLDs should map. qc->mapped_n_elem is added to carry the original number of mapped sgs for unmapping. * The last sg of the original sg list is used to chain to extra sg list. The original last sg is pointed to by qc->last_sg and the content is stored in qc->saved_last_sg. It's restored during ata_sg_clean(). * All sg walking code has been updated. Unnecessary assertions and checks for conditions the core layer already guarantees are removed. Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:14 -05:00
Tejun Heo	001102d785	libata: kill non-sg DMA interface With atapi_request_sense() converted to use sg, there's no user of non-sg interface. Kill non-sg interface. * ATA_QCFLAG_SINGLE and ATA_QCFLAG_SG are removed. ATA_QCFLAG_DMAMAP is used instead. (this way no LLD change is necessary) * qc->buf_virt is removed. * ata_sg_init_one() and ata_sg_setup_one() are removed. Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Rusty Russel <rusty@rustcorp.com.au> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:14 -05:00
Tejun Heo	55dba3120f	libata: update ->data_xfer hook for ATAPI Depending on how many bytes are transferred as a unit, PIO data transfer may consume more bytes than requested. Knowing how much data is consumed is necessary to determine how much is left for draining. This patch update ->data_xfer such that it returns the number of consumed bytes. While at it, it also makes the following changes. * s/adev/dev/ * use READ/WRITE constants for rw indication * misc clean ups Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:14 -05:00
Tejun Heo	ceb0c64262	libata: add ATAPI_* cmd types and implement atapi_cmd_type() Add ATAPI command types - ATAPI_READ, WRITE, RW_BUF, READ_CD and MISC, and implement atapi_cmd_type() which takes SCSI opcode and returns to which class the opcode belongs. This will be used later to improve ATAPI handling. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:14 -05:00
Tejun Heo	0dc36888d4	libata: rename ATA_PROT_ATAPI_* to ATAPI_PROT_* ATA_PROT_ATAPI_* are ugly and naming schemes between ATA_PROT_* and ATA_PROT_ATAPI_* are inconsistent causing confusion. Rename them to ATAPI_PROT_* and make them consistent with ATA counterpart. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-01-23 05:24:14 -05:00
Tejun Heo	537b53c169	cdrom: add more GPCMD_* constants Add GPCMD_* constants for READ_BUFFER, WRITE_12 and WRITE_BUFFER for completeness. These will be used libata. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:14 -05:00
Tejun Heo	021ee9a6da	libata: reimplement ata_acpi_cbl_80wire() using ata_acpi_gtm_xfermask() Reimplement ata_acpi_cbl_80wire() using ata_acpi_gtm_xfermask() and while at it relocate the function below ata_acpi_gtm_xfermask(). New ata_acpi_cbl_80wire() implementation takes @gtm, in both pata_via and pata_amd, use the initial GTM value. Both are trying to peek initial BIOS configuration, so using initial caching value makes sense. This fixes ACPI part of cable detection in pata_amd which previously always returned 0 because configuring PIO0 during reset clears DMA configuration. Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:12 -05:00
Tejun Heo	a0f79b929a	libata: implement ata_timing_cycle2mode() and use it in libata-acpi and pata_acpi libata-acpi is using separate timing tables for transfer modes although libata-core has the complete ata_timing table. Implement ata_timing_cycle2mode() to look for matching mode given transfer type and cycle duration and use it in libata-acpi and pata_acpi to replace private timing tables. Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:12 -05:00
Tejun Heo	7c77fa4d51	libata: separate out ata_acpi_gtm_xfermask() from pacpi_discover_modes() Finding out matching transfer mode from ACPI GTM values is useful for other purposes too. Separate out the function and timing tables from pata_acpi::pacpi_discover_modes(). Other than checking shared-configuration bit after doing ata_acpi_gtm() in pacpi_discover_modes() which should be safe, this patch doesn't introduce any behavior change. Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:12 -05:00
Tejun Heo	c88f90c377	libata: add ATA_CBL_PATA_IGN ATA_CBL_PATA_UNK indicates that the cable type can't be determined from the host side and might be either 80c or 40c. libata applies drive or other generic limit in this case. However, there are controllers where both host and drive side detections are misimplemented and the driver has to rely solely on private method - peeking BIOS or ACPI configuration or using some other private mechanism. This patch adds ATA_CBL_PATA_IGN which tells libata to ignore the cable type completely and just let the LLD determine the transfer mode via host transfer mode masks and ->mode_filter(). Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:12 -05:00
Tejun Heo	7dc951aefd	libata: xfer_mask is unsigned long not unsigned int Jeff says xfer_mask is unsigned long not unsigned int. Convert all xfermask fields and handling functions to deal with unsigned longs. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:12 -05:00
Tejun Heo	9d3501ab96	libata: kill ata_id_to_dma_mode() ata_id_to_dma_mode() isn't quite generic. The function is basically privately implemented ata_id_xfermask() combined with hardcoded mode printing and configuration which are specific to ata_generic. Kill the function and open code it in generic_set_mode() using generic xfermode handling functions. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:11 -05:00
Tejun Heo	70cd071e4e	libata: clean up xfermode / PATA timing related stuff * s/ATA_BITS_(PIO\|MWDMA\|UDMA)/ATA_NR_\1_MODES/g * Consistently use 0xff to indicate invalid transfer mode (0x00 is valid for PIO_SLOW). * Make ata_xfer_mode2mask() return proper mode mask instead of just the highest bit. * Sort ata_timing table in increasing xfermode order and update ata_timing_find_mode() accordingly. This patch doesn't introduce any behavior change. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:11 -05:00
Tejun Heo	6357357cae	libata: export xfermode / PATA timing related functions Export the following xfermode related functions. * ata_pack_xfermask() * ata_unpack_xfermask() * ata_xfer_mask2mode() * ata_xfer_mode2mask() * ata_xfer_mode2shift() * ata_mode_string() * ata_id_xfermask() * ata_timing_find_mode() These functions will be used later by LLD updates. While at it, change unsigned short @speed to u8 @xfer_mode in ata_timing_find_mode() for consistency. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:11 -05:00
Tejun Heo	00115e0f5b	libata: implement ATA_DFLAG_DUBIOUS_XFER ATA_DFLAG_DUBIOUS_XFER is set whenever data transfer speed or method changes and gets cleared when data transfer command succeeds in the newly configured transfer mode. This will be used to improve speed down logic. Signed-off-by: Tejun Heo <htejun@gmail.com< Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:11 -05:00
Tejun Heo	3884f7b0a8	libata: clean up EH speed down implementation Clean up EH speed down implementation. * is_io boolean variable is replaced eflags. is_io is ATA_EFLAG_IS_IO. * Error categories now have names. * Better comments. * Reorder 5min and 10min rules in ata_eh_speed_down_verdict() * Use local variable @link to cache @dev->link in ata_eh_speed_down() These changes are to improve readability and ease further changes. This patch doesn't introduce any behavior change. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:10 -05:00
Tejun Heo	405e66b387	libata: implement protocol tests Implement protocol tests - ata_is_atapi(), ata_is_nodata(), ata_is_pio(), ata_is_dma(), ata_is_ncq() and ata_is_data() and use them to replace is_atapi_taskfile() and hard coded protocol tests. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:10 -05:00
Tejun Heo	f20ded38aa	libata: rearrange ATA_DFLAG_* Area for DFLAGs which are cleared on INIT is full. Extend it by 8 bits. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:10 -05:00
Alan Cox	ae8d4ee7ff	libata: Disable ATA8-ACS proposed Trusted Computing features by default Historically word 48 in the identify data was used to mean 32bit I/O was supported for VLB IDE etc. ATA8 reassigns this word to the Trusted Computing Group, where it is used for TCG features. This means that an ATA8 TCG drive is going to trigger 32bit I/O on some systems which will be funny. Anyway we need to sort this out ready for ATA8 so: - Reorder the ata.h header a bit so the ata_version function occurs early in it - Make dword_io check the ATA version - Add an ATA8 version checking TCG presence test While we are at it the current drafts have a flaw where it may not be possible to disable TCG features at boot (and opt out of the trusted model) as TCG intends because it relies on presence of a different optional feature (DCS). Handle this in software by refusing the TCG commands if libata.allow_tpm is not set. (We must make it possible as some environments such as proprietary VDR devices will doubtless want to use it to lock up content) Finally as with CPRM print a warning so that the user knows they may not be able to full access and use the device. Signed-off-by: Alan Cox <alan@redhat.com>	2008-01-23 05:24:09 -05:00
Dmitry Torokhov	0c1efd3653	Input: remove cdev from input_dev structure Cdev field was obsolete and provided only for backward compatibility since conversion of input core from class devices to regular devices. It is time to remove it. Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2008-01-21 01:11:08 -05:00
Dmitry Torokhov	f4f37c8ec7	Input: Add proper locking when changing device's keymap Take dev->event_lock to make sure that we don't race with input_event() and also force key up event when removing a key from keymap table. Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2008-01-21 01:11:06 -05:00
Paul Mackerras	52920df4aa	Merge branch 'for-2.6.25' of master.kernel.org:/pub/scm/linux/kernel/git/olof/pasemi into for-2.6.25	2008-01-17 16:17:58 +11:00
Grant Likely	283029d16a	[POWERPC] Add of_find_matching_node() helper function Similar to of_find_compatible_node(), of_find_matching_node() and for_each_matching_node() allow you to iterate over the device tree looking for specific nodes, except that they take of_device_id tables instead of strings. This also moves of_match_node() from driver/of/device.c to driver/of/base.c to colocate it with the of_find_matching_node which depends on it. Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-01-17 14:53:22 +11:00
Johannes Berg	eb13ba8738	lockdep: fix workqueue creation API lockdep interaction Dave Young reported warnings from lockdep that the workqueue API can sometimes try to register lockdep classes with the same key but different names. This is not permitted in lockdep. Unfortunately, I was unaware of that restriction when I wrote the code to debug workqueue problems with lockdep and used the workqueue name as the lockdep class name. This can obviously lead to the problem if the workqueue name is dynamic. This patch solves the problem by always using a constant name for the workqueue's lockdep class, namely either the constant name that was passed in or a string consisting of the variable name. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>	2008-01-16 09:51:58 +01:00
Alan Cox	121a09e590	libata: correct handling of TSS DVD Devices that misreport the validity bit for word 93 look like SATA. If they are on the blacklist then we must not test for SATA but assume 40 wire in the 40 wire case (The TSSCorp reports 80 wire on SATA it seems!) Signed-off-by: Alan Cox <alan@redhat.com> Cc: Tejun Heo <htejun@gmail.com> Cc: Jeff Garzik <jeff@garzik.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-15 16:35:21 -05:00
Anton Vorontsov	cf03613e96	libata: pata_platform: make probe and remove functions device type neutral Split pata_platform_{probe,remove} into two pieces: 1. pata_platform_{probe,remove} -- platform_device-dependant bits; 2. __ptata_platform_{probe,remove} -- device type neutral bits. This is done to not duplicate code for the OF-platform driver. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Olof Johansson <olof@lixom.net>	2008-01-15 10:23:41 -06:00
Linus Torvalds	c23f72cae9	Revert "writeback: introduce writeback_control.more_io to indicate more io" This reverts commit 2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b, as requested by Fengguang Wu. It's not quite fully baked yet, and while there are patches around to fix the problems it caused, they should get more testing. Says Fengguang: "I'll resend them both for -mm later on, in a more complete patchset". See http://bugzilla.kernel.org/show_bug.cgi?id=9738 for some of this discussion. Requested-by: Fengguang Wu <wfg@mail.ustc.edu.cn> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-01-14 21:21:29 -08:00
Jean Delvare	f9dd0194ff	i2c: Driver IDs are optional Document the fact that I2C driver IDs are optional. Signed-off-by: Jean Delvare <khali@linux-fr.org>	2008-01-14 21:53:31 +01:00
Linus Torvalds	417009f64f	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: pnpacpi: print resource shortage message only once PM: ACPI and APM must not be enabled at the same time ACPI: apply quirk_ich6_lpc_acpi to more ICH8 and ICH9 ACPICA: fix acpi_serialize hang regression ACPI : Not register gsi for PCI IDE controller in legacy mode ACPI: Reintroduce run time configurable max_cstate for !CPU_IDLE case ACPI: Make sysfs interface in ACPI power optional. ACPI: EC: Enable boot EC before bus_scan increase PNP_MAX_PORT to 40 from 24	2008-01-13 09:58:22 -08:00
Roland McGrath	84427eaef1	remove task_ppid_nr_ns task_ppid_nr_ns is called in three places. One of these should never have called it. In the other two, using it broke the existing semantics. This was presumably accidental. If the function had not been there, it would have been much more obvious to the eye that those patches were changing the behavior. We don't need this function. In task_state, the pid of the ptracer is not the ppid of the ptracer. In do_task_stat, ppid is the tgid of the real_parent, not its pid. I also moved the call outside of lock_task_sighand, since it doesn't need it. In sys_getppid, ppid is the tgid of the real_parent, not its pid. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-01-13 09:56:43 -08:00
James Bottomley	11c3e689f1	[SCSI] block: Introduce new blk_queue_update_dma_alignment interface The purpose of this is to allow stacked alignment settings, with the ultimate queue alignment being set to the largest alignment requirement in the stack. The reason for this is so that the SCSI mid-layer can relax the default alignment requirements (which are basically causing a lot of superfluous copying to go on in the SG_IO interface) while allowing transports, devices or HBAs to add stricter limits if they need them. Acked-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-01-11 18:29:20 -06:00
Len Brown	6b74c92521	Pull bugzilla-9535 into release branch	2008-01-11 12:27:50 -05:00
Len Brown	02d5bccf8e	Pull bugzilla-9194 into release branch	2008-01-11 12:27:13 -05:00
Len Brown	9f9adecd2d	PM: ACPI and APM must not be enabled at the same time ACPI and APM used "pm_active" to guarantee that they would not be simultaneously active. But pm_active was recently moved under CONFIG_PM_LEGACY, so that without CONFIG_PM_LEGACY, pm_active became a NOP -- allowing ACPI and APM to both be simultaneously enabled. This caused unpredictable results, including boot hangs. Further, the code under CONFIG_PM_LEGACY is scheduled for removal. So replace pm_active with pm_flags. pm_flags depends only on CONFIG_PM, which is present for both CONFIG_APM and CONFIG_ACPI. http://bugzilla.kernel.org/show_bug.cgi?id=9194 Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>	2008-01-11 12:26:47 -05:00
Rusty Russell	b801a1e7db	Don't blatt first element of prv in sg_chain() I realize that sg chaining is a ploy to make the rest of the kernel devs feel the pain of the SCSI subsystem. But this was a little unsubtle. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-01-11 10:12:55 +01:00
Zhao Yakui	d1ec7298fc	ACPI: apply quirk_ich6_lpc_acpi to more ICH8 and ICH9 It is important that these resources be reserved to avoid conflicts with well known ACPI registers. Signed-off-by: Zhao Yakui <yakui.zhao@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2008-01-11 00:24:55 -05:00
Bartlomiej Sieka	de7921f01a	[MTD] [NOR] Fix incorrect interface code for x16/x32 chips According to "Common Flash Memory Interface Publication 100" dated December 1, 2001, the interface code for x16/x32 chips is 0x0005, and not 0x0004 used so far. Signed-off-by: Bartlomiej Sieka <tur@semihalf.com> Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2008-01-10 22:07:12 +00:00
Herbert Xu	6eb7228421	[CRYPTO] api: Set default CRYPTO_MINALIGN to unsigned long long Thanks to David Miller for pointing out that the SLAB (or SLOB/SLUB) cache uses the alignment of unsigned long long if the architecture kmalloc/slab alignment macros are not defined. This patch changes the CRYPTO_MINALIGN so that it uses the same default value. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-01-11 08:17:01 +11:00
Herbert Xu	d29ce988ae	[CRYPTO] aead: Create default givcipher instances This patch makes crypto_alloc_aead always return algorithms that is capable of generating their own IVs through givencrypt and givdecrypt. All existing AEAD algorithms already do. New ones must either supply their own or specify a generic IV generator with the geniv field. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-01-11 08:16:52 +11:00
Herbert Xu	5b6d2d7fdf	[CRYPTO] aead: Add aead_geniv_alloc/aead_geniv_free This patch creates the infrastructure to help the construction of IV generator templates that wrap around AEAD algorithms by adding an IV generator to them. This is useful for AEAD algorithms with no built-in IV generator or to replace their built-in generator. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-01-11 08:16:51 +11:00
Herbert Xu	743edf5727	[CRYPTO] aead: Add givcrypt operations This patch adds the underlying givcrypt operations for aead and associated support elements. The rationale is identical to that of the skcipher givcrypt operations, i.e., sometimes only the algorithm knows how the IV should be generated. A new request type aead_givcrypt_request is added which contains an embedded aead_request structure with two new elements to support this operation. The new elements are seq and giv. The seq field should contain a strictly increasing 64-bit integer which may be used by certain IV generators as an input value. The giv field will be used to store the generated IV. It does not need to obey the alignment requirements of the algorithm because it's not used during the operation. The existing iv field must still be available as it will be used to store intermediate IVs and the output IV if chaining is desired. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-01-11 08:16:49 +11:00
Herbert Xu	b9c55aa475	[CRYPTO] skcipher: Create default givcipher instances This patch makes crypto_alloc_ablkcipher/crypto_grab_skcipher always return algorithms that are capable of generating their own IVs through givencrypt and givdecrypt. Each algorithm may specify its default IV generator through the geniv field. For algorithms that do not set the geniv field, the blkcipher layer will pick a default. Currently it's chainiv for synchronous algorithms and eseqiv for asynchronous algorithms. Note that if these wrappers do not work on an algorithm then that algorithm must specify its own geniv or it can't be used at all. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-01-11 08:16:46 +11:00
Herbert Xu	ecfc43292f	[CRYPTO] skcipher: Add skcipher_geniv_alloc/skcipher_geniv_free This patch creates the infrastructure to help the construction of givcipher templates that wrap around existing blkcipher/ablkcipher algorithms by adding an IV generator to them. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-01-11 08:16:44 +11:00
Herbert Xu	23508e11ab	[CRYPTO] skcipher: Added geniv field This patch introduces the geniv field which indicates the default IV generator for each algorithm. It should point to a string that is not freed as long as the algorithm is registered. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-01-11 08:16:43 +11:00
Herbert Xu	61da88e2b8	[CRYPTO] skcipher: Add givcrypt operations and givcipher type Different block cipher modes have different requirements for intialisation vectors. For example, CBC can use a simple randomly generated IV while modes such as CTR must use an IV generation mechanisms that give a stronger guarantee on the lack of collisions. Furthermore, disk encryption modes have their own IV generation algorithms. Up until now IV generation has been left to the users of the symmetric key cipher API. This is inconvenient as the number of block cipher modes increase because the user needs to be aware of which mode is supposed to be paired with which IV generation algorithm. Therefore it makes sense to integrate the IV generation into the crypto API. This patch takes the first step in that direction by creating two new ablkcipher operations, givencrypt and givdecrypt that generates an IV before performing the actual encryption or decryption. The operations are currently not exposed to the user. That will be done once the underlying functionality has actually been implemented. It also creates the underlying givcipher type. Algorithms that directly generate IVs would use it instead of ablkcipher. All other algorithms (including all existing ones) would generate a givcipher algorithm upon registration. This givcipher algorithm will be constructed from the geniv string that's stored in every algorithm. That string will locate a template which is instantiated by the blkcipher/ablkcipher algorithm in question to give a givcipher algorithm. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-01-11 08:16:43 +11:00
Herbert Xu	378f4f51f9	[CRYPTO] skcipher: Add crypto_grab_skcipher interface Note: From now on the collective of ablkcipher/blkcipher/givcipher will be known as skcipher, i.e., symmetric key cipher. The name blkcipher has always been much of a misnomer since it supports stream ciphers too. This patch adds the function crypto_grab_skcipher as a new way of getting an ablkcipher spawn. The problem is that previously we did this in two steps, first getting the algorithm and then calling crypto_init_spawn. This meant that each spawn user had to be aware of what type and mask to use for these two steps. This is difficult and also presents a problem when the type/mask changes as they're about to be for IV generators. The new interface does both steps together just like crypto_alloc_ablkcipher. As a side-effect this also allows us to be stronger on type enforcement for spawns. For now this is only done for ablkcipher but it's trivial to extend for other types. This patch also moves the type/mask logic for skcipher into the helpers crypto_skcipher_type and crypto_skcipher_mask. Finally this patch introduces the function crypto_require_sync to determine whether the user is specifically requesting a sync algorithm. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-01-11 08:16:42 +11:00
Herbert Xu	551a09a7a9	[CRYPTO] api: Sanitise mask when allocating ablkcipher/hash When allocating ablkcipher/hash objects, we use a mask that's wider than the usual type mask. This patch sanitises the mask supplied by the user so we don't end up using a narrower mask which may lead to unintended results. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-01-11 08:16:39 +11:00
Herbert Xu	7ba683a6de	[CRYPTO] aead: Make authsize a run-time parameter As it is authsize is an algorithm paramter which cannot be changed at run-time. This is inconvenient because hardware that implements such algorithms would have to register each authsize that they support separately. Since authsize is a property common to all AEAD algorithms, we can add a function setauthsize that sets it at run-time, just like setkey. This patch does exactly that and also changes authenc so that authsize is no longer a parameter of its template. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-01-11 08:16:29 +11:00
Patrick McHardy	984e976f53	[HWRNG]: move status polling loop to data_present callbacks Handle waiting for new random within the drivers themselves, this allows to use better suited timeouts for the individual rngs. Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-01-11 08:16:16 +11:00
Herbert Xu	332f8840f7	[CRYPTO] ablkcipher: Add distinct ABLKCIPHER type Up until now we have ablkcipher algorithms have been identified as type BLKCIPHER with the ASYNC bit set. This is suboptimal because ablkcipher refers to two things. On the one hand it refers to the top-level ablkcipher interface with requests. On the other hand it refers to and algorithm type underneath. As it is you cannot request a synchronous block cipher algorithm with the ablkcipher interface on top. This is a problem because we want to be able to eventually phase out the blkcipher top-level interface. This patch fixes this by making ABLKCIPHER its own type, just as we have distinct types for HASH and DIGEST. The type it associated with the algorithm implementation only. Which top-level interface is used for synchronous block ciphers is then determined by the mask that's used. If it's a specific mask then the old blkcipher interface is given, otherwise we go with the new ablkcipher interface. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-01-11 08:16:15 +11:00
David S. Miller	a0a46196cd	[NET]: Add NAPI_STATE_DISABLE. Create a bit to signal that a napi_disable() is in progress. This sets up infrastructure such that net_rx_action() can generically break out of the ->poll() loop on a NAPI context that has a pending napi_disable() yet is being bombed with packets (and thus would otherwise poll endlessly and not allow the napi_disable() to finish). Now, what napi_disable() does is first set the NAPI_STATE_DISABLE bit (to indicate that a disable is pending), then it polls for the NAPI_STATE_SCHED bit, and once the NAPI_STATE_SCHED bit is acquired the NAPI_STATE_DISABLE bit is cleared. Here, the test_and_set_bit() provides the necessary memory barrier between the various bitops. napi_schedule_prep() now tests for a pending disable as it's first action and won't try to obtain the NAPI_STATE_SCHED bit if a disable is pending. As a result, we can remove the netif_running() check in netif_rx_schedule_prep() because the NAPI disable pending state serves this purpose. And, it does so in a NAPI centric manner which is what we really want. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-08 23:30:07 -08:00
David S. Miller	bdb95b1792	[NET]: Do not grab device reference when scheduling a NAPI poll. It is pointless, because everything that can make a device go away will do a napi_disable() first. The main impetus behind this is that now we can legally do a NAPI completion in generic code like net_rx_action() which a following changeset needs to do. net_rx_action() can only perform actions in NAPI centric ways, because there may be a one to many mapping between NAPI contexts and network devices (SKY2 is one example). We also want to get rid of this because it's an extra atomic in the NAPI paths, and also because it is one of the last instances where the NAPI interfaces care about net devices. The one remaining netdev detail the NAPI stuff cares about is the netif_running() check which will be killed off in a subsequent changeset. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-08 23:30:07 -08:00
Alan Cox	bf5e5834bf	pl2303: Fix mode switching regression Cleaning out all the incorrect 'no change made' checks for termios settings showed up a problem with the PL2303. The hardware here seems to lose sync and bits if you tell it to make no changes. This shows up with a real world application. To fix this the driver check for meaningful hardware changes is restored but doing the tests correctly and as a tty layer function so it doesn't get duplicated wrongly everywhere if other drivers turn out to need it. Signed-off-by: Alan Cox <alan@redhat.com> Tested-by: Mirko Parthey <mirko.parthey@informatik.tu-chemnitz.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-01-08 16:16:34 -08:00
Sebastian Siewior	5b7741b332	KEYS: fix macro Commit 664cceb0093b755739e56572b836a99104ee8a75 changed the parameters of the function make_key_ref(). The macros that are used in case CONFIG_KEY is not defined did not change. Cc: David Howells <dhowells@redhat.com> Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-01-08 16:10:35 -08:00
Ingo Molnar	a263898f62	CPU hotplug: fix cpu_is_offline() on !CONFIG_HOTPLUG_CPU make randconfig bootup testing found that the cpufreq code crashes on bootup, if the powernow-k8 driver is enabled and if maxcpus=1 passed on the boot line to a !CONFIG_HOTPLUG_CPU kernel. First lockdep found out that there's an inconsistent unlock sequence: ===================================== [ BUG: bad unlock balance detected! ] ------------------------------------- swapper/1 is trying to release lock (&per_cpu(cpu_policy_rwsem, cpu)) at: [<ffffffff806ffd8e>] unlock_policy_rwsem_write+0x3c/0x42 but there are no more locks to release! Call Trace: [<ffffffff806ffd8e>] unlock_policy_rwsem_write+0x3c/0x42 [<ffffffff80251c29>] print_unlock_inbalance_bug+0x104/0x12c [<ffffffff80252f3a>] mark_held_locks+0x56/0x94 [<ffffffff806ffd8e>] unlock_policy_rwsem_write+0x3c/0x42 [<ffffffff807008b6>] cpufreq_add_dev+0x2a8/0x5c4 ... then shortly afterwards the cpufreq code crashed on an assert: ------------[ cut here ]------------ kernel BUG at drivers/cpufreq/cpufreq.c:1068! invalid opcode: 0000 [1] SMP [...] Call Trace: [<ffffffff805145d6>] sysdev_driver_unregister+0x5b/0x91 [<ffffffff806ff520>] cpufreq_register_driver+0x15d/0x1a2 [<ffffffff80cc0596>] powernowk8_init+0x86/0x94 [...] ---[ end trace 1e9219be2b4431de ]--- the bug was caused by maxcpus=1 bootup, which brought up the secondary core as !cpu_online() but !cpu_is_offline() either, which on on !CONFIG_HOTPLUG_CPU is always 0 (include/linux/cpu.h): /* CPUs don't go offline once they're online w/o CONFIG_HOTPLUG_CPU */ static inline int cpu_is_offline(int cpu) { return 0; } but the cpufreq code uses cpu_online() and cpu_is_offline() in a mixed way - the low-level drivers use cpu_online(), while the cpufreq core uses cpu_is_offline(). This opened up the possibility to add the non-initialized sysdev device of the secondary core: cpufreq-core: trying to register driver powernow-k8 cpufreq-core: adding CPU 0 powernow-k8: BIOS error - no PSB or ACPI _PSS objects cpufreq-core: initialization failed cpufreq-core: adding CPU 1 cpufreq-core: initialization failed which then blew up. The fix is to make cpu_is_offline() always the negation of cpu_online(). With that fix applied the kernel boots up fine without crashing: Calling initcall 0xffffffff80cc0510: powernowk8_init+0x0/0x94() powernow-k8: Found 1 AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ processors (1 cpu cores) (version 2.20.00) powernow-k8: BIOS error - no PSB or ACPI _PSS objects initcall 0xffffffff80cc0510: powernowk8_init+0x0/0x94() returned -19. initcall 0xffffffff80cc0510 ran for 19 msecs: powernowk8_init+0x0/0x94() Calling initcall 0xffffffff80cc328f: init_lapic_nmi_sysfs+0x0/0x39() We could fix this by making CPU enumeration aware of max_cpus, but that would be more fragile IMO, and the cpu_online(cpu) != cpu_is_offline(cpu) possibility was quite confusing and a continuous source of bugs too. Most distributions have kernels with CPU hotplug enabled, so this bug remained hidden for a long time. Bug forensics: The broken cpu_is_offline() API variant was introduced via: commit a59d2e4e6977e7b94e003c96a41f07e96cddc340 Author: Rusty Russell <rusty@rustcorp.com.au> Date: Mon Mar 8 06:06:03 2004 -0800 [PATCH] minor cleanups for hotplug CPUs ( this predates linux-2.6.git, this commit is available from Thomas's historic git tree. ) Then 1.5 years later the cpufreq code made use of it: commit c32b6b8e524d2c337767d312814484d9289550cf Author: Ashok Raj <ashok.raj@intel.com> Date: Sun Oct 30 14:59:54 2005 -0800 [PATCH] create and destroy cpufreq sysfs entries based on cpu notifiers + if (cpu_is_offline(cpu)) + return 0; which is a correct use of the subtly broken new API. v2.6.15 then shipped with this bug included. then it took two more years for random-kernel qa to hit it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-01-06 12:39:42 -08:00
Al Viro	831830b5a2	restrict reading from /proc/<pid>/maps to those who share ->mm or can ptrace pid Contents of /proc/*/maps is sensitive and may become sensitive after open() (e.g. if target originally shares our ->mm and later does exec on suid-root binary). Check at read() (actually, ->start() of iterator) time that mm_struct we'd grabbed and locked is - still the ->mm of target - equal to reader's ->mm or the target is ptracable by reader. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-01-02 13:13:27 -08:00
Linus Torvalds	158a962422	Unify /proc/slabinfo configuration Both SLUB and SLAB really did almost exactly the same thing for /proc/slabinfo setup, using duplicate code and per-allocator #ifdef's. This just creates a common CONFIG_SLABINFO that is enabled by both SLUB and SLAB, and shares all the setup code. Maybe SLOB will want this some day too. Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-01-02 13:04:48 -08:00
Pekka J Enberg	57ed3eda97	slub: provide /proc/slabinfo This adds a read-only /proc/slabinfo file on SLUB, that makes slabtop work. [ mingo@elte.hu: build fix. ] Cc: Andi Kleen <andi@firstfloor.org> Cc: Christoph Lameter <clameter@sgi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-01-01 11:32:02 -08:00
Len Brown	2c83819775	increase PNP_MAX_PORT to 40 from 24 a7839e960675b549f06209d18283d5cee2ce9261 (PNP: increase the maximum number of resources) increased PNP_MAX_PORT to 24 from 8. It also added a test and a complaint when a machine exceeded the limit, causing: pnpacpi: exceeded the max number of IO resources: 24 http://bugzilla.kernel.org/show_bug.cgi?id=9535 We should have been squawking about this all along, as this is a potentially serious issue. For now, simply burn some dynamic bytes and increase the limit by another 16 to 40. There is no guarantee that this will satisfy every system on Earth. It probably will not, but it should be an improvement. In the future, PNPACPI should allocate resource structures as needed, rather than max-sized arrays. Signed-off-by: Len Brown <len.brown@intel.com>	2007-12-27 23:55:13 -05:00
Stephen Hemminger	ecef969e5b	[VETH]: move veth.h to include/linux Move veth.h from net/ to linux/ since it is a user api, and add it to user header processing Kbuild. [ Use header-y as suggested by Sam Ravnborg. -DaveM ] Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-12-26 19:36:35 -08:00
Stephen Hemminger	75ec533ec3	[NET] tc_nat: header install iproute2 build needs tc_nat.h header from kernel make install_headers. Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-12-26 19:36:35 -08:00
Artem Bityutskiy	393852ecfe	UBI: add ubi_leb_map interface The idea of this interface belongs to Adrian Hunter. The interface is extremely useful when one has to have a guarantee that an LEB will contain all 0xFFs even in case of an unclean reboot. UBI does have an 'ubi_leb_erase()' call which may do this, but it is stupid and ineffecient, because it flushes whole queue. I should be re-worked to just be a pair of unmap, map calls. The user of the interfaci is UBIFS at the moment. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2007-12-26 19:15:14 +02:00
Christoph Lameter	ed367fc3a7	quicklists: do not release off node pages early quicklists must keep even off node pages on the quicklists until the TLB flush has been completed. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-23 12:54:36 -08:00
Paul Mackerras	c2a7dcad9f	Merge branch 'linux-2.6'	2007-12-21 22:21:08 +11:00
Benjamin Herrenschmidt	0094f2cdcf	[POWERPC] Fix for via-pmu based backlight control This fixes a few issues with via-pmu based backlight control. First, it fixes a sign problem with the setup of the backlight curve since the `range' value there -can- (and will) go negative. Then, it reworks the interaction between this and the via-pmu sleep code to properly restore backlight on wakeup from sleep. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-12-21 22:14:07 +11:00
Neil Brown	91212507f9	dm: merge max_hw_sector Make sure dm honours max_hw_sectors of underlying devices We still have no firm testing evidence in support of this patch but believe it may help to resolve some bug reports. - agk Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2007-12-20 17:32:12 +00:00
Johannes Berg	b819a9bfc7	[POWERPC] via-pmu: Kill sleep notifiers completely This kills off the remnants of the old sleep notifiers now that they are no longer used. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-12-19 15:00:59 +11:00
Linus Torvalds	3e3b3916a9	Merge git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86 * git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86: x86: fix "Kernel panic - not syncing: IO-APIC + timer doesn't work!" genirq: revert lazy irq disable for simple irqs x86: also define AT_VECTOR_SIZE_ARCH x86: kprobes bugfix x86: jprobe bugfix timer: kernel/timer.c section fixes genirq: add unlocked version of set_irq_handler() clockevents: fix reprogramming decision in oneshot broadcast oprofile: op_model_athlon.c support for AMD family 10h barcelona performance counters	2007-12-18 09:42:44 -08:00
Kevin Hilman	b019e57321	genirq: add unlocked version of set_irq_handler() Add unlocked version for use by irq_chip.set_type handlers which may wish to change handler to level or edge handler when IRQ type is changed. The normal set_irq_handler() call cannot be used because it tries to take irq_desc.lock which is already held when the irq_chip.set_type hook is called. Signed-off-by: Kevin Hilman <khilman@mvista.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-12-18 18:05:58 +01:00
Linus Torvalds	3c615e19a4	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block * 'for-linus' of git://git.kernel.dk/linux-2.6-block: Cleanup umem driver: fix most checkpatch warnings, conform to kernel block: let elv_register() return void as-iosched: fix write batch start point as-iosched: fix incorrect comments block: use jiffies conversion functions in scsi_ioctl.c	2007-12-18 08:04:24 -08:00
Linus Torvalds	d55653377d	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc: mmc: remove unused 'mode' from the mmc_host structure sdhci: support JMicron JMB38x chips sdhci: use PIO when DMA can't satisfy the request sdhci: don't warn about sdhci 2.0 controllers sdhci: describe quirks	2007-12-18 08:03:01 -08:00
Adrian Bunk	2fdd82bd88	block: let elv_register() return void elv_register() always returns 0, and there isn't anything it does where it should return an error (the only error condition is so grave that it's handled with a BUG_ON). Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-12-18 08:29:28 +01:00
Linus Torvalds	ededa4d396	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: libata: fix ATAPI draining libata: update atapi_eh_request_sense() such that lbam/lbah contains buffer size libata-acpi: implement _GTF command filtering libata-acpi: improve _GTF execution error handling and reporting libata-acpi: improve ACPI disabling libata-acpi: implement dev->gtf_cache and evaluate _GTF right after _STM during resume libata-acpi: implement and use ata_acpi_init_gtm() libata-acpi: add new hooks ata_acpi_dissociate() and ata_acpi_on_disable() libata: ata_dev_disable() should be called from EH context libata: add more opcodes to ata.h libata: update ata_*_printk() macros such that level can be a variable libata-acpi: adjust constness in ata_acpi_gtm/stm() parameters sata_mv: improve warnings about Highpoint RocketRAID 23xx cards libata: add ST3160023AS / 3.42 to NCQ blacklist libata: clear link->eh_info.serror from ata_std_postreset() sata_sil: fix spurious IRQ handling	2007-12-17 19:29:32 -08:00
Nishanth Aravamudan	368d2c6358	Revert "hugetlb: Add hugetlb_dynamic_pool sysctl" This reverts commit 54f9f80d6543fb7b157d3b11e2e7911dc1379790 ("hugetlb: Add hugetlb_dynamic_pool sysctl") Given the new sysctl nr_overcommit_hugepages, the boolean dynamic pool sysctl is not needed, as its semantics can be expressed by 0 in the overcommit sysctl (no dynamic pool) and non-0 in the overcommit sysctl (pool enabled). (Needed in 2.6.24 since it reverts a post-2.6.23 userspace-visible change) Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Acked-by: Adam Litke <agl@us.ibm.com> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-17 19:28:17 -08:00
Nishanth Aravamudan	d1c3fb1f8f	hugetlb: introduce nr_overcommit_hugepages sysctl hugetlb: introduce nr_overcommit_hugepages sysctl While examining the code to support /proc/sys/vm/hugetlb_dynamic_pool, I became convinced that having a boolean sysctl was insufficient: 1) To support per-node control of hugepages, I have previously submitted patches to add a sysfs attribute related to nr_hugepages. However, with a boolean global value and per-mount quota enforcement constraining the dynamic pool, adding corresponding control of the dynamic pool on a per-node basis seems inconsistent to me. 2) Administration of the hugetlb dynamic pool with multiple hugetlbfs mount points is, arguably, more arduous than it needs to be. Each quota would need to be set separately, and the sum would need to be monitored. To ease the administration, and to help make the way for per-node control of the static & dynamic hugepage pool, I added a separate sysctl, nr_overcommit_hugepages. This value serves as a high watermark for the overall hugepage pool, while nr_hugepages serves as a low watermark. The boolean sysctl can then be removed, as the condition nr_overcommit_hugepages > 0 indicates the same administrative setting as hugetlb_dynamic_pool == 1 Quotas still serve as local enforcement of the size of the pool on a per-mount basis. A few caveats: 1) There is a race whereby the global surplus huge page counter is incremented before a hugepage has allocated. Another process could then try grow the pool, and fail to convert a surplus huge page to a normal huge page and instead allocate a fresh huge page. I believe this is benign, as no memory is leaked (the actual pages are still tracked correctly) and the counters won't go out of sync. 2) Shrinking the static pool while a surplus is in effect will allow the number of surplus huge pages to exceed the overcommit value. As long as this condition holds, however, no more surplus huge pages will be allowed on the system until one of the two sysctls are increased sufficiently, or the surplus huge pages go out of use and are freed. Successfully tested on x86_64 with the current libhugetlbfs snapshot, modified to use the new sysctl. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Acked-by: Adam Litke <agl@us.ibm.com> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-17 19:28:17 -08:00
Adam Jackson	8d936626dd	apm_event{,info}_t are userspace types These types define the size of data read from /dev/apm_bios. They should not be hidden behind #ifdef __KERNEL__. This is killing my xserver compile, apm_event_t is used in the xserver source. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-17 19:28:16 -08:00
Andrew Morton	755271358c	fix headers_install make[3]: *** No rule to make target `/usr/src/devel/include/linux/ticable.h', needed by `/usr/src/devel/usr/include/linux/ticable.h'. Stop. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-17 19:28:15 -08:00
Tejun Heo	140b5e5911	libata: fix ATAPI draining With ATAPI transfer chunk size properly programmed, libata PIO HSM should be able to handle full spurious data chunks. Also, it's a good idea to suppress trailing data warning for misc ATAPI commands as there can be many of them per command - for example, if the chunk size is 16 and the drive tries to transfer 510 bytes, there can be 31 trailing data messages. This patch makes the following updates to libata ATAPI PIO HSM implementation. * Make it drain full spurious chunks. * Suppress trailing data warning message for misc commands. * Put limit on how many bytes can be drained. * If odd, round up consumed bytes and the number of bytes to be drained. This gets the number of bytes to drain right for drivers which do 16bit PIO. This patch is partial backport of improve-ATAPI-data-xfer patchset pending for #upstream. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-12-17 20:43:28 -05:00
Tejun Heo	398e07826b	libata-acpi: implement dev->gtf_cache and evaluate _GTF right after _STM during resume On certain implementations, _GTF evaluation depends on preceding _STM and both can be pretty picky about the configuration. Using _GTM result cached during controller initialization satisfies the most neurotic _STM implementation. However, libata evaluates _GTF after reset during device configuration and the hardware state can be different from what _GTF expects and can cause evaluation failure. This patch adds dev->gtf_cache and updates ata_dev_get_GTF() such that it uses the cached value if available. Cache is cleared with a call to ata_acpi_clear_gtf(). Because for SATA ACPI nodes _GTF must be evaluated after _SDD which can't be done till IDENTIFY is complete, _GTF caching from ata_acpi_on_resume() is used only for IDE ACPI nodes. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-12-17 20:33:14 -05:00
Tejun Heo	c05e6ff035	libata-acpi: implement and use ata_acpi_init_gtm() _GTM fetches currently configured transfer mode while _STM configures controller according to _GTM parameter and prepares transfer mode configuration TFs for _GTF. In many cases _GTM and _STM implementations are quite brittle and can't cope with configuration changed by libata. libata does not depend on ATA ACPI to configure devices. The only reason libata performs _GTM and _STM are to make _GTF evaluation succeed and libata also doesn't care about how _GTF TFs configure transfer mode. It overrides that configuration anyway, so from libata's POV, it doesn't matter what value is feeded to _STM as long as evaluation succeeds for _STM and following _GTF. This patch adds dev->__acpi_init_gtm and store initial _GTM values on host initialization before modified by reset and mode configuration. If the field is valid, ata_acpi_init_gtm() returns pointer to the saved _GTM structure; otherwise, NULL. This saved value is used for _STM during resume and peek at BIOS/firmware programmed initial timing for later use. The accessor is there to make building w/o ACPI easy as dev->__acpi_init doesn't exist if ACPI is not enabled. On driver detach, the initial BIOS configuration is restored by executing _STM with the initial _GTM values such that the next driver can also use the initial BIOS configured values. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-12-17 20:33:14 -05:00
Tejun Heo	ce2e0abbd3	libata: add more opcodes to ata.h Add constants for DEVICE CONFIGURATION OVERLAY and SET_MAX to include/linux/ata.h. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-12-17 20:33:12 -05:00
Tejun Heo	c2e366a107	libata: update ata_*_printk() macros such that level can be a variable Make prink helpers format @lv together rather than prepending to the format string as constant. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-12-17 20:33:12 -05:00
Tejun Heo	0d02f0b22b	libata-acpi: adjust constness in ata_acpi_gtm/stm() parameters * No internal function uses const ata_port. Drop const from @ap. * Make ata_acpi_stm() copy @stm before using it and change @stm to const. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-12-17 20:33:12 -05:00
Linus Torvalds	87d5df6bde	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6: HOWTO: update misspelling and word incorrected add stable_api_nonsense.txt in korean HOWTO: change addresses of maintainer and lxr url for Korean HOWTO Add Documentation for FAIR_USER_SCHED sysfs files HOWTO: Change man-page maintainer address for Japanese HOWTO tipar: remove obsolete module kobject: fix the documentation of how kobject_set_name works	2007-12-17 13:33:47 -08:00
Linus Torvalds	4942093e9d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: USB: revert portions of "UNUSUAL_DEV: Sync up some reported devices from Ubuntu" usb: Remove broken optimisation in OHCI IRQ handler USB: at91_udc: correct hanging while disconnecting usb cable USB: use IRQF_DISABLED for HCD interrupt handlers USB: fix locking loop by avoiding flush_scheduled_work usb.h: fix kernel-doc warning USB: option: Bind to the correct interface of the Huawei E220 USB: cp2101: new device id usb-storage: Fix devices that cannot handle 32k transfers USB: sierra: fix product id	2007-12-17 13:33:30 -08:00
Linus Torvalds	07232b9715	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: ide: fix ->io_32bit race in set_io_32bit() ide: remove stale changelog from ide-probe.c ide: remove stale changelog from ide-disk.c ide: remove dead code from __ide_dma_test_irq() hpt366: fix HPT37x PIO mode timings (take 2) pdc202xx_new: fix Promise TX4 support ide-cd: remove dead post_transform_command() ide: DMA reporting and validity checking fixes (take 3) ide: add /sys/bus/ide/devices/*/{model,firmware,serial} sysfs entries ide: coding style fixes for drivers/ide/setup-pci.c ide: fix ide_scan_pcibus() error message ide: deprecate CONFIG_BLK_DEV_OFFBOARD ide: add missing checks for control register existence ide-scsi: add ide_scsi_hex_dump() helper	2007-12-17 13:32:49 -08:00
Randy Dunlap	f88ed90d86	usb.h: fix kernel-doc warning Fix kernel-doc warning in usb.h: Warning(linux-2.6.24-rc3-git7//include/linux/usb.h:166): No description found for parameter 'sysfs_files_created' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-12-17 10:47:15 -08:00
Doug Maxey	33abc04f04	usb-storage: Fix devices that cannot handle 32k transfers When a device cannot handle the smallest previously limited transfer size (64 blocks) without stalling, limit the device to the amount of packets that fit in a platform native page. The lowest possible limit is PAGE_CACHE_SIZE, so if the device is ever used on a platform that has larger than 8K pages, you lose unless you can convince the device firmware folks to fix the issue. Cc: Mathew Dharm <mdharm-scsi@one-eyed-alien.net> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Pete Zaitcev <zaitcev@redhat.com> Signed-off-by: Doug Maxey <dwm@austin.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-12-17 10:47:14 -08:00
Romain Liévin	cb8c9b6de0	tipar: remove obsolete module tipar: remove obsolete module The tipar character driver was used to implement bit-banging access to Texas Instruments parallel link cable. A user-land method now exists thru PPDEV & PARPORT. Signed-off-by: Romain Liévin <roms@lpg.ticalc.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-12-17 10:33:18 -08:00
Patrick McHardy	4a9ecd5960	[NETFILTER]: bridge: fix missing link layer headers on outgoing routed packets As reported by Damien Thebault, the double POSTROUTING hook invocation fix caused outgoing packets routed between two bridges to appear without a link-layer header. The reason for this is that we're skipping the br_nf_post_routing hook for routed packets now and don't save the original link layer header, but nevertheless tries to restore it on output, causing corruption. The root cause for this is that skb->nf_bridge has no clearly defined lifetime and is used to indicate all kind of things, but that is quite complicated to fix. For now simply don't touch these packets and handle them like packets from any other device. Tested-by: Damien Thebault <damien.thebault@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-12-14 13:54:39 -08:00
Len Brown	239665a3bb	ACPI: tables: complete searching upon RSDP w/ bad checksum. ACPI tables follow a tree structure in memory. The root of the tree is the RSDP (Root System Description Pointer). To find the RSDP, the OS searches for the signature "RSD PTR " in well known physical memory locations. Then the OS computes a table checksum to verify that the signature is really part of a valid table header. Some systems have a proper signature but an invalid checksum; followed elsewhere by a proper signature with valid checksum. http://bugzilla.kernel.org/show_bug.cgi?id=9444 The Linux RSDP scanning code bailed out on those systems and as a result they booted with ACPI disabled. Fix this by deleting the Linux RSDP scanning code and plugging in the ACPICA RSDP scanning code. Signed-off-by: Len Brown <len.brown@intel.com>	2007-12-14 02:36:24 -05:00
Bartlomiej Zolnierkiewicz	3ab7efe8e2	ide: DMA reporting and validity checking fixes (take 3) * ide_xfer_verbose() fixups: - beautify returned mode names - fix PIO5 reporting - make it return 'const char ' Change printk() level from KERN_DEBUG to KERN_INFO in ide_find_dma_mode(). * Add ide_id_dma_bug() helper based on ide_dma_verbose() to check for invalid DMA info in identify block. * Use ide_id_dma_bug() in ide_tune_dma() and ide_driveid_update(). As a result DMA won't be tuned or will be disabled after tuning if device reports inconsistent info about enabled DMA mode (ide_dma_verbose() does the same checks while the IDE device is probed by ide-{cd,disk} device driver). * Remove no longer needed ide_dma_verbose(). This patch should fix the following problem with out-of-sync IDE messages reported by Nick Warne: hdd: ATAPI 48X DVD-ROM DVD-R-RAM CD-R/RW drive, 2048kB Cache<7>hdd: skipping word 93 validity check , UDMA(66) and later debugged by Mark Lord to be caused by: ide_dma_verbose() printk( ... "2048kB Cache"); eighty_ninty_three() printk(KERN_DEBUG "%s: skipping word 93 validity check\n"); ide_dma_verbose() printk(", UDMA(66)" Please note that as a result ide-{cd,disk} device drivers won't report the DMA speed used but this is intended since now DMA mode being used is always reported by IDE core code. v2: * fixes suggested by Randy: - use KERN_CONT for printk()-s in ide-{cd,disk}.c - don't remove argument name from ide_xfer_verbose() declaration v3: * Remove incorrect check for (id->field_valid & 1) from ide_id_dma_bug() (spotted by Sergei). * "XFER SLOW" -> "PIO SLOW" in ide_xfer_verbose() (suggested by Sergei). * Fix ide_find_dma_mode() to report the correct mode ('mode' after being limited by 'req_mode'). Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Nick Warne <nick@ukfsn.org> Cc: Mark Lord <lkml@rtr.ca> Cc: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-12-12 23:31:58 +01:00
Nicolas Pitre	cc3000e4ef	mmc: remove unused 'mode' from the mmc_host structure This field and corresponding defines are simply never used anywhere in the code. But its mere presence is enough to confuse some host driver authors who attempt to rely on it. Let's eliminate the possibility for confusion and remove it entirely. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2007-12-12 20:01:01 +01:00
Pierre Ossman	84c46a53fc	sdhci: support JMicron JMB38x chips The JMicron JMB38x chip doesn't support transfers that aren't 32-bit aligned (both size and start address). It also doesn't like switching between PIO and DMA mode, so it needs to be reset after each request. Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2007-12-12 20:01:00 +01:00
Michael Ellerman	aabc08dc66	[POWERPC] Add for_each_child_of_node() helper for iterating over child nodes Add for_each_child_of_node() to encapsulate the common idiom of iterating over the children of a device_node. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-12-11 13:34:39 +11:00
Jay Vosburgh	6f6652be18	bonding: Add new layer2+3 hash for xor/802.3ad modes Add new hash for balance-xor and 802.3ad modes. Originally submitted by "Glenn Griffin" <ggriffin.kernel@gmail.com>; modified by Jay Vosburgh to move setting of hash policy out of line, tweak the documentation update and add version update to 3.2.2. Glenn's original comment follows: Included is a patch for a new xmit_hash_policy for the bonding driver that selects slaves based on MAC and IP information. This is a middle ground between what currently exists in the layer2 only policy and the layer3+4 policy. This policy strives to be fully 802.3ad compliant by transmitting every packet of any particular flow over the same link. As documented the layer3+4 policy is not fully compliant for extreme cases such as ip fragmentation, so this policy is a nice compromise for environments that require full compliance but desire more than the layer2 only policy. Signed-off-by: "Glenn Griffin" <ggriffin.kernel@gmail.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-12-07 15:00:32 -05:00
Richard Purdie	dc47206e55	leds: Fix led trigger locking bugs Convert part of the led trigger core from rw spinlocks to rw semaphores. We're calling functions which can sleep from invalid contexts otherwise. Fixes bug #9264. Signed-off-by: Richard Purdie <rpurdie@rpsys.net>	2007-12-07 09:06:53 +00:00
Matthew Wilcox	150030b78a	NFS: Switch from intr mount option to TASK_KILLABLE By using the TASK_KILLABLE infrastructure, we can get rid of the 'intr' mount option. We have to use _killable everywhere instead of _interruptible as we get rid of rpc_clnt_sigmask/sigunmask. Signed-off-by: Liam R. Howlett <howlett@gmail.com> Signed-off-by: Matthew Wilcox <willy@linux.intel.com>	2007-12-06 17:40:25 -05:00
Matthew Wilcox	009e577e07	Add wait_for_completion_killable Signed-off-by: Matthew Wilcox <willy@linux.intel.com>	2007-12-06 17:40:19 -05:00
Matthew Wilcox	1411d5a7fb	Add wait_event_killable Signed-off-by: Matthew Wilcox <willy@linux.intel.com>	2007-12-06 17:40:14 -05:00
Matthew Wilcox	294d5cc233	Add schedule_timeout_killable Signed-off-by: Matthew Wilcox <willy@linux.intel.com>	2007-12-06 17:40:07 -05:00
Liam R. Howlett	ad776537cc	Add mutex_lock_killable Similar to mutex_lock_interruptible, it can be interrupted by a fatal signal only. Signed-off-by: Liam R. Howlett <howlett@gmail.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Matthew Wilcox <willy@linux.intel.com>	2007-12-06 17:37:59 -05:00
Matthew Wilcox	2687a3569e	Add lock_page_killable This routine is like lock_page, but can be interrupted by a fatal signal Signed-off-by: Matthew Wilcox <willy@linux.intel.com>	2007-12-06 17:35:41 -05:00
Matthew Wilcox	f776d12dd1	Add fatal_signal_pending Like signal_pending, but it's only true for signals which are fatal to this process Signed-off-by: Matthew Wilcox <willy@linux.intel.com>	2007-12-06 17:35:35 -05:00
Matthew Wilcox	f021a3c2b1	Add TASK_WAKEKILL Set TASK_WAKEKILL for TASK_STOPPED and TASK_TRACED, add TASK_KILLABLE and use TASK_WAKEKILL in signal_wake_up() Signed-off-by: Matthew Wilcox <willy@linux.intel.com>	2007-12-06 17:35:28 -05:00
Matthew Wilcox	e64d66c8ed	wait: Use TASK_NORMAL Also move wake_up_locked() to be with the related functions Signed-off-by: Matthew Wilcox <willy@linux.intel.com>	2007-12-06 17:34:36 -05:00
Matthew Wilcox	92a1f4bc7a	Add macros to replace direct uses of TASK_ flags With the changes to support TASK_KILLABLE, ->state becomes a bitmask, and moving these tests to convenience macros will fix all the users. Signed-off-by: Matthew Wilcox <willy@linux.intel.com>	2007-12-06 17:07:29 -05:00
Linus Torvalds	7e1fb765c6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched * git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched: futex: correctly return -EFAULT not -EINVAL lockdep: in_range() fix lockdep: fix debug_show_all_locks() sched: style cleanups futex: fix for futex_wait signal stack corruption	2007-12-05 09:27:46 -08:00
Linus Torvalds	ad658cec23	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6: VM/Security: add security hook to do_brk Security: round mmap hint address above mmap_min_addr security: protect from stack expantion into low vm addresses Security: allow capable check to permit mmap or low vm space SELinux: detect dead booleans SELinux: do not clear f_op when removing entries	2007-12-05 09:26:52 -08:00
Linus Torvalds	2a1292b36b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: [LRO]: fix lro_gen_skb() alignment [TCP]: NAGLE_PUSH seems to be a wrong way around [TCP]: Move prior_in_flight collect to more robust place [TCP] FRTO: Use of existing funcs make code more obvious & robust [IRDA]: Move ircomm_tty_line_info() under #ifdef CONFIG_PROC_FS [ROSE]: Trivial compilation CONFIG_INET=n case [IPVS]: Fix sched registration race when checking for name collision. [IPVS]: Don't leak sysctl tables if the scheduler registration fails.	2007-12-05 09:26:13 -08:00
Alexey Dobriyan	5a622f2d0f	proc: fix proc_dir_entry refcounting Creating PDEs with refcount 0 and "deleted" flag has problems (see below). Switch to usual scheme: * PDE is created with refcount 1 * every de_get does +1 * every de_put() and remove_proc_entry() do -1 * once refcount reaches 0, PDE is freed. This elegantly fixes at least two following races (both observed) without introducing new locks, without abusing old locks, without spreading lock_kernel(): 1) PDE leak remove_proc_entry de_put ----------------- ------ [refcnt = 1] if (atomic_read(&de->count) == 0) if (atomic_dec_and_test(&de->count)) if (de->deleted) /* also not taken! / free_proc_entry(de); else de->deleted = 1; [refcount=0, deleted=1] 2) use after free remove_proc_entry de_put ----------------- ------ [refcnt = 1] if (atomic_dec_and_test(&de->count)) if (atomic_read(&de->count) == 0) free_proc_entry(de); / boom! / if (de->deleted) free_proc_entry(de); BUG: unable to handle kernel paging request at virtual address 6b6b6b6b printing eip: c10acdda pdpt = 00000000338f8001 *pde = 0000000000000000 Oops: 0000 [#1] PREEMPT SMP Modules linked in: af_packet ipv6 cpufreq_ondemand loop serio_raw psmouse k8temp hwmon sr_mod cdrom Pid: 23161, comm: cat Not tainted (2.6.24-rc2-8c0863403f109a43d7000b4646da4818220d501f #4) EIP: 0060:[<c10acdda>] EFLAGS: 00210097 CPU: 1 EIP is at strnlen+0x6/0x18 EAX: 6b6b6b6b EBX: 6b6b6b6b ECX: 6b6b6b6b EDX: fffffffe ESI: c128fa3b EDI: f380bf34 EBP: ffffffff ESP: f380be44 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 Process cat (pid: 23161, ti=f380b000 task=f38f2570 task.ti=f380b000) Stack: c10ac4f0 00000278 c12ce000 f43cd2a8 00000163 00000000 7da86067 00000400 c128fa20 00896b18 f38325a8 c128fe20 ffffffff 00000000 c11f291e 00000400 f75be300 c128fa20 f769c9a0 c10ac779 f380bf34 f7bfee70 c1018e6b f380bf34 Call Trace: [<c10ac4f0>] vsnprintf+0x2ad/0x49b [<c10ac779>] vscnprintf+0x14/0x1f [<c1018e6b>] vprintk+0xc5/0x2f9 [<c10379f1>] handle_fasteoi_irq+0x0/0xab [<c1004f44>] do_IRQ+0x9f/0xb7 [<c117db3b>] preempt_schedule_irq+0x3f/0x5b [<c100264e>] need_resched+0x1f/0x21 [<c10190ba>] printk+0x1b/0x1f [<c107c8ad>] de_put+0x3d/0x50 [<c107c8f8>] proc_delete_inode+0x38/0x41 [<c107c8c0>] proc_delete_inode+0x0/0x41 [<c1066298>] generic_delete_inode+0x5e/0xc6 [<c1065aa9>] iput+0x60/0x62 [<c1063c8e>] d_kill+0x2d/0x46 [<c1063fa9>] dput+0xdc/0xe4 [<c10571a1>] __fput+0xb0/0xcd [<c1054e49>] filp_close+0x48/0x4f [<c1055ee9>] sys_close+0x67/0xa5 [<c10026b6>] sysenter_past_esp+0x5f/0x85 ======================= Code: c9 74 0c f2 ae 74 05 bf 01 00 00 00 4f 89 fa 5f 89 d0 c3 85 c9 57 89 c7 89 d0 74 05 f2 ae 75 01 4f 89 f8 5f c3 89 c1 89 c8 eb 06 <80> 38 00 74 07 40 4a 83 fa ff 75 f4 29 c8 c3 90 90 90 57 83 c9 EIP: [<c10acdda>] strnlen+0x6/0x18 SS:ESP 0068:f380be44 Also, remove broken usage of ->deleted from reiserfs: if sget() succeeds, module is already pinned and remove_proc_entry() can't happen => nobody can mark PDE deleted. Dummy proc root in netns code is not marked with refcount 1. AFAICS, we never get it, it's just for proper /proc/net removal. I double checked CLONE_NETNS continues to work. Patch survives many hours of modprobe/rmmod/cat loops without new bugs which can be attributed to refcounting. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-05 09:21:20 -08:00
Jan Kara	d4beaf4ab5	jbd: Fix assertion failure in fs/jbd/checkpoint.c Before we start committing a transaction, we call __journal_clean_checkpoint_list() to cleanup transaction's written-back buffers. If this call happens to remove all of them (and there were already some buffers), __journal_remove_checkpoint() will decide to free the transaction because it isn't (yet) a committing transaction and soon we fail some assertion - the transaction really isn't ready to be freed :). We change the check in __journal_remove_checkpoint() to free only a transaction in T_FINISHED state. The locking there is subtle though (as everywhere in JBD ;(). We use j_list_lock to protect the check and a subsequent call to __journal_drop_transaction() and do the same in the end of journal_commit_transaction() which is the only place where a transaction can get to T_FINISHED state. Probably I'm too paranoid here and such locking is not really necessary - checkpoint lists are processed only from log_do_checkpoint() where a transaction must be already committed to be processed or from __journal_clean_checkpoint_list() where kjournald itself calls it and thus transaction cannot change state either. Better be safe if something changes in future... Signed-off-by: Jan Kara <jack@suse.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-05 09:21:20 -08:00
Steven Rostedt	ce6bd420f4	futex: fix for futex_wait signal stack corruption David Holmes found a bug in the -rt tree with respect to pthread_cond_timedwait. After trying his test program on the latest git from mainline, I found the bug was there too. The bug he was seeing that his test program showed, was that if one were to do a "Ctrl-Z" on a process that was in the pthread_cond_timedwait, and then did a "bg" on that process, it would return with a "-ETIMEDOUT" but early. That is, the timer would go off early. Looking into this, I found the source of the problem. And it is a rather nasty bug at that. Here's the relevant code from kernel/futex.c: (not in order in the file) [...] smlinkage long sys_futex(u32 __user uaddr, int op, u32 val, struct timespec __user utime, u32 __user uaddr2, u32 val3) { struct timespec ts; ktime_t t, tp = NULL; u32 val2 = 0; int cmd = op & FUTEX_CMD_MASK; if (utime && (cmd == FUTEX_WAIT \|\| cmd == FUTEX_LOCK_PI)) { if (copy_from_user(&ts, utime, sizeof(ts)) != 0) return -EFAULT; if (!timespec_valid(&ts)) return -EINVAL; t = timespec_to_ktime(ts); if (cmd == FUTEX_WAIT) t = ktime_add(ktime_get(), t); tp = &t; } [...] return do_futex(uaddr, op, val, tp, uaddr2, val2, val3); } [...] long do_futex(u32 __user uaddr, int op, u32 val, ktime_t timeout, u32 __user uaddr2, u32 val2, u32 val3) { int ret; int cmd = op & FUTEX_CMD_MASK; struct rw_semaphore fshared = NULL; if (!(op & FUTEX_PRIVATE_FLAG)) fshared = &current->mm->mmap_sem; switch (cmd) { case FUTEX_WAIT: ret = futex_wait(uaddr, fshared, val, timeout); [...] static int futex_wait(u32 __user uaddr, struct rw_semaphore fshared, u32 val, ktime_t abs_time) { [...] struct restart_block restart; restart = &current_thread_info()->restart_block; restart->fn = futex_wait_restart; restart->arg0 = (unsigned long)uaddr; restart->arg1 = (unsigned long)val; restart->arg2 = (unsigned long)abs_time; restart->arg3 = 0; if (fshared) restart->arg3 \|= ARG3_SHARED; return -ERESTART_RESTARTBLOCK; [...] static long futex_wait_restart(struct restart_block restart) { u32 __user uaddr = (u32 __user )restart->arg0; u32 val = (u32)restart->arg1; ktime_t abs_time = (ktime_t )restart->arg2; struct rw_semaphore fshared = NULL; restart->fn = do_no_restart_syscall; if (restart->arg3 & ARG3_SHARED) fshared = &current->mm->mmap_sem; return (long)futex_wait(uaddr, fshared, val, abs_time); } So when the futex_wait is interrupt by a signal we break out of the hrtimer code and set up or return from signal. This code does not return back to userspace, so we set up a RESTARTBLOCK. The bug here is that we save the "abs_time" which is a pointer to the stack variable "ktime_t t" from sys_futex. This returns and unwinds the stack before we get to call our signal. On return from the signal we go to futex_wait_restart, where we update all the parameters for futex_wait and call it. But here we have a problem where abs_time is no longer valid. I verified this with print statements, and sure enough, what abs_time was set to ends up being garbage when we get to futex_wait_restart. The solution I did to solve this (with input from Linus Torvalds) was to add unions to the restart_block to allow system calls to use the restart with specific parameters. This way the futex code now saves the time in a 64bit value in the restart block instead of storing it on the stack. Note: I'm a bit nervious to add "linux/types.h" and use u32 and u64 in thread_info.h, when there's a #ifdef __KERNEL__ just below that. Not sure what that is there for. If this turns out to be a problem, I've tested this with using "unsigned int" for u32 and "unsigned long long" for u64 and it worked just the same. I'm using u32 and u64 just to be consistent with what the futex code uses. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-05 15:46:09 +01:00
Andrew Gallatin	621544eb8c	[LRO]: fix lro_gen_skb() alignment Add a field to the lro_mgr struct so that drivers can specify how much padding is required to align layer 3 headers when a packet is copied into a freshly allocated skb by inet_lro.c:lro_gen_skb(). Without padding, skbs generated by LRO will cause alignment warnings on architectures which require strict alignment (seen on sparc64). Myri10GE is updated to use this field. Signed-off-by: Andrew Gallatin <gallatin@myri.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-12-05 05:37:32 -08:00
Eric Paris	7cd94146cd	Security: round mmap hint address above mmap_min_addr If mmap_min_addr is set and a process attempts to mmap (not fixed) with a non-null hint address less than mmap_min_addr the mapping will fail the security checks. Since this is just a hint address this patch will round such a hint address above mmap_min_addr. gcj was found to try to be very frugal with vm usage and give hint addresses in the 8k-32k range. Without this patch all such programs failed and with the patch they happily get a higher address. This patch is wrappad in CONFIG_SECURITY since mmap_min_addr doesn't exist without it and there would be no security check possible no matter what. So we should not bother compiling in this rounding if it is just a waste of time. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2007-12-06 00:25:10 +11:00
Anton Vorontsov	6f4a7f4183	PHY: Add the phy_device_release device method. Lately I've got this nice badness on mdio bus removal: Device 'e0103120:06' does not have a release() function, it is broken and must be fixed. ------------[ cut here ]------------ Badness at drivers/base/core.c:107 NIP: c015c1a8 LR: c015c1a8 CTR: c0157488 REGS: c34bdcf0 TRAP: 0700 Not tainted (2.6.23-rc5-g9ebadfbb-dirty) MSR: 00029032 <EE,ME,IR,DR> CR: 24088422 XER: 00000000 ... [c34bdda0] [c015c1a8] device_release+0x78/0x80 (unreliable) [c34bddb0] [c01354cc] kobject_cleanup+0x80/0xbc [c34bddd0] [c01365f0] kref_put+0x54/0x6c [c34bdde0] [c013543c] kobject_put+0x24/0x34 [c34bddf0] [c015c384] put_device+0x1c/0x2c [c34bde00] [c0180e84] mdiobus_unregister+0x2c/0x58 ... Though actually there is nothing broken, it just device subsystem core expects another "pattern" of resource managment. This patch implement phy device's release function, thus we're getting rid of this badness. Also small hidden bug fixed, hope none other introduced. ;-) Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Acked-by: Andy Fleming <afleming@freescale.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-12-04 15:06:33 -05:00
Linus Torvalds	ca6435f188	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched * git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched: sched: cpu accounting controller (V2)	2007-12-03 08:21:06 -08:00
Linus Torvalds	8002cedc1a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/net-2.6: (27 commits) [INET]: Fix inet_diag dead-lock regression [NETNS]: Fix /proc/net breakage [TEXTSEARCH]: Do not allow zero length patterns in the textsearch infrastructure [NETFILTER]: fix forgotten module release in xt_CONNMARK and xt_CONNSECMARK [NETFILTER]: xt_TCPMSS: remove network triggerable WARN_ON [DECNET]: dn_nl_deladdr() almost always returns no error [IPV6]: Restore IPv6 when MTU is big enough [RXRPC]: Add missing select on CRYPTO mac80211: rate limit wep decrypt failed messages rfkill: fix double-mutex-locking mac80211: drop unencrypted frames if encryption is expected mac80211: Fix behavior of ieee80211_open and ieee80211_close ieee80211: fix unaligned access in ieee80211_copy_snap mac80211: free ifsta->extra_ie and clear IEEE80211_STA_PRIVACY_INVOKED SCTP: Fix build issues with SCTP AUTH. SCTP: Fix chunk acceptance when no authenticated chunks were listed. SCTP: Fix the supported extensions paramter SCTP: Fix SCTP-AUTH to correctly add HMACS paramter. SCTP: Fix the number of HB transmissions. [TCP] illinois: Incorrect beta usage ...	2007-12-03 08:15:36 -08:00
Srivatsa Vaddagiri	d842de871c	sched: cpu accounting controller (V2) Commit cfb5285660aad4931b2ebbfa902ea48a37dfffa1 removed a useful feature for us, which provided a cpu accounting resource controller. This feature would be useful if someone wants to group tasks only for accounting purpose and doesnt really want to exercise any control over their cpu consumption. The patch below reintroduces the feature. It is based on Paul Menage's original patch (Commit 62d0df64065e7c135d0002f069444fbdfc64768f), with these differences: - Removed load average information. I felt it needs more thought (esp to deal with SMP and virtualized platforms) and can be added for 2.6.25 after more discussions. - Convert group cpu usage to be nanosecond accurate (as rest of the cfs stats are) and invoke cpuacct_charge() from the respective scheduler classes - Make accounting scalable on SMP systems by splitting the usage counter to be per-cpu - Move the code from kernel/cpu_acct.c to kernel/sched.c (since the code is not big enough to warrant a new file and also this rightly needs to live inside the scheduler. Also things like accessing rq->lock while reading cpu usage becomes easier if the code lived in kernel/sched.c) The patch also modifies the cpu controller not to provide the same accounting information. Tested-by: Balbir Singh <balbir@linux.vnet.ibm.com> Tested the patches on top of 2.6.24-rc3. The patches work fine. Ran some simple tests like cpuspin (spin on the cpu), ran several tasks in the same group and timed them. Compared their time stamps with cpuacct.usage. Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-12-02 20:04:49 +01:00
Kim Phillips	7d400a4c58	phylib: add PHY interface modes for internal delay for tx and rx only Allow phylib specification of cases where hardware needs to configure PHYs for Internal Delay only on either RX or TX (not both). Signed-off-by: Kim Phillips <kim.phillips@freescale.com> Tested-by: Anton Vorontsov <avorontsov@ru.mvista.com> Acked-by: Li Yang <leoli@freescale.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-12-01 16:32:30 -05:00
Jeff Garzik	c99da91e7a	Merge branch 'master' into upstream-fixes	2007-12-01 16:18:56 -05:00
Eric W. Biederman	2b1e300a9d	[NETNS]: Fix /proc/net breakage Well I clearly goofed when I added the initial network namespace support for /proc/net. Currently things work but there are odd details visible to user space, even when we have a single network namespace. Since we do not cache proc_dir_entry dentries at the moment we can just modify ->lookup to return a different directory inode depending on the network namespace of the process looking at /proc/net, replacing the current technique of using a magic and fragile follow_link method. To accomplish that this patch: - introduces a shadow_proc method to allow different dentries to be returned from proc_lookup. - Removes the old /proc/net follow_link magic - Fixes a weakness in our not caching of proc generic dentries. As shadow_proc uses a task struct to decided which dentry to return we can go back later and fix the proc generic caching without modifying any code that uses the shadow_proc method. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Pavel Machek <pavel@ucw.cz> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2007-12-02 00:33:17 +11:00
Jiri Kosina	8853c202b4	RTC: convert mutex to bitfield RTC code is using mutex to assure exclusive access to /dev/rtc. This is however wrong usage, as it leaves the mutex locked when returning into userspace, which is unacceptable. Convert rtc->char_lock into bit operation. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Acked-by: Alessandro Zummo <a.zummo@towertech.it> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:54 -08:00
Miklos Szeredi	a6643094e7	fuse: pass open flags to read and write Some open flags (O_APPEND, O_DIRECT) can be changed with fcntl(F_SETFL, ...) after open, but fuse currently only sends the flags to userspace in open. To make it possible to correcly handle changing flags, send the current value to userspace in each read and write. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:54 -08:00
Huang, Ying	7c83172b98	x86_64 EFI boot support: EFI frame buffer driver This patch adds Graphics Output Protocol support to the kernel. UEFI2.0 spec deprecates Universal Graphics Adapter (UGA) protocol and only Graphics Output Protocol (GOP) is produced. Therefore, the boot loader needs to query the UEFI firmware with appropriate Output Protocol and pass the video information to the kernel. As a result of GOP protocol, an EFI framebuffer driver is needed for displaying console messages. The patch adds a EFI framebuffer driver. The EFI frame buffer driver in this patch is based on the Intel Mac framebuffer driver. The ELILO bootloader takes care of passing the video information as appropriate for EFI firmware. The framebuffer driver has been tested in i386 kernel and x86_64 kernel on EFI platform. Signed-off-by: Chandramouli Narayanan <mouli@linux.intel.com> Signed-off-by: Huang Ying <ying.huang@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Andi Kleen <ak@suse.de> Cc: "Antonino A. Daplas" <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:54 -08:00
Tobias Poschwatta	6454d1f903	fix up ext2_fs.h for userspace after reservations backport In commit a686cd898bd999fd026a51e90fb0a3410d258ddb: "Val's cross-port of the ext3 reservations code into ext2." include/linux/ext2_fs.h got a new function whose return value is only defined if __KERNEL__ is defined. Putting #ifdef __KERNEL__ around the function seems to help, patch below. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:53 -08:00
Thomas Bogendoerfer	68576cf122	IP22ZILOG: fix lockup and sysrq - fix lockup when switching from early console to real console - make sysrq reliable - fix panic, if sysrq is issued before console is opened Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:53 -08:00
David Woodhouse	79288f5e93	Fix <linux/kd.h> usage in userspace For reasons unclear to me, glibc's <sys/kd.h> deliberately defeats the attempt we make in <linux/kd.h> to include <linux/types.h> For now, change the one instance of __u32 to 'unsigned int' instead because it's breaking userspace. We should probably also remove our inclusion of <linux/types.h>, since we don't use it -- but that's not a change to make in -rc. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: David Woodhouse <dwmw2@infradead.org> Cc: Samuel Thibault <samuel.thibault@ens-lyon.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:52 -08:00
Zhao Yakui	a7839e9606	PNP: increase the maximum number of resources On some systems the number of resources(IO,MEM) returnedy by PNP device is greater than the PNP constant, for example motherboard devices. It brings that some resources can't be reserved and resource confilicts. This will cause PCI resources are assigned wrongly in some systems, and cause hang. This is a regression since we deleted ACPI motherboard driver and use PNP system driver. [akpm@linux-foundation.org: fix text and coding-style a bit] Signed-off-by: Li Shaohua <shaohua.li@intel.com> Signed-off-by: Zhao Yakui <yakui.zhao@intel.com> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Thomas Renninger <trenn@suse.de> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:52 -08:00
Linus Torvalds	bb0851ff9d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (25 commits) USB: s3c2410 gadget: ensure vbus pin in input mode during read USB: s3c2410 gadget: allow sharing of vbus irq USB: s3c2410 gadget: Header move fixups USB: usb-storage: unusual_devs entry for JetFlash TS1GJF2A USB: fix up EHCI startup synchronization USB: make the microtek driver and HAL cooperate USB: uevent environment key fix USB: keep track of whether interface sysfs files exist USB: sierra: new product id USB HCD: avoid duplicate local_irq_disable() USB: mailing lists have changed USB: remove USB HUB entry from MAINTAINERS USB: fix directory references in usb/README USB: add support for an older firmware revision for the Nikon D200 USB: FIx locks and urb->status in adutux (updated) USB: power-management documenation update USB: Fix signr comment in usbdevice_fs.h usbserial: fix inconsistent lock state USB: fix usbled disconnect read race #2 USB: free memory when writing fails in usb/serial/mos7840.c ...	2007-11-28 16:03:09 -08:00
Alan Stern	7e61559f61	USB: keep track of whether interface sysfs files exist This patch (as1009) solves the problem of multiple registrations for USB sysfs files in a more satisfying way than the existing code. It simply adds a flag to keep track of whether or not the files have been created; that way the files can be created or removed as needed. Signed-off-by: Alan Stern <stern@rowland.harvard.edu>	2007-11-28 13:58:35 -08:00
Phil Endecott	bc59462b80	USB: Fix signr comment in usbdevice_fs.h This trivial documentation patch corrects a comment in usbdevice_fs.h; it previously suggested that the signal would only be sent on error, but I am told that it is sent on both successful and unsuccessful completion, and that zero indicates that no signal should be sent. Signed-off-by: Phil Endecott <spam_from_usb_devel@chezphil.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-11-28 13:58:34 -08:00
Ingo Molnar	deaf2227dd	sched: clean up, move __sched_text_start/end to sched.h move __sched_text_start/end to sched.h. No code changed: text data bss dec hex filename 26582 2310 28 28920 70f8 sched.o.before 26582 2310 28 28920 70f8 sched.o.after Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-11-28 15:52:56 +01:00
Linus Torvalds	9c8ff4f4da	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block * 'for-linus' of git://git.kernel.dk/linux-2.6-block: scatterlist: add more safeguards Revert "ll_rw_blk: temporarily enable max_segments tweaking" mmc: Add missing sg_init_table() call block: Fix memory leak in alloc_disk_node() alpha: fix sg_page breakage blktrace: Make sure BLKTRACETEARDOWN does the full cleanup.	2007-11-27 14:21:19 -08:00
Linus Torvalds	febb187761	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: adds the context menu key (HUT GenDesc 0x84) Input: add definitions for frame forward and frame back keys Input: bf54x-keys - keypad does not exist on BF544 parts Input: gpio-keys - request and configure GPIOs Input: i8042 - add i8042.noloop quirk for MS Virtual Machine Sonypi: use synchronize_irq instead of sycnronize_sched sonypi: fit input devices into sysfs tree sony-laptop: fit input devices into sysfs tree	2007-11-27 14:20:35 -08:00
Tejun Heo	645a8d9462	scatterlist: add more safeguards Add more safeguards to protect against misinterpreting a chain entry as a normal scatterlist and vice-versa. * Make sure the entry isn't a chain when assigning and reading a normal sg. * Clear offset and length when chaining. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-11-27 09:30:39 +01:00
Aristeu Rozanski	35baef2afb	Input: adds the context menu key (HUT GenDesc 0x84) Signed-off-by: Aristeu Rozanski <aris@ruivo.org> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2007-11-27 00:47:04 -05:00
Aristeu Rozanski	c23f1f9c40	Input: add definitions for frame forward and frame back keys Signed-off-by: Aristeu Rozanski <aris@ruivo.org> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2007-11-27 00:46:57 -05:00
Linus Torvalds	8c27eba549	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/net-2.6: (41 commits) [XFRM]: Fix leak of expired xfrm_states [ATM]: [he] initialize lock and tasklet earlier [IPV4]: Remove bogus ifdef mess in arp_process [SKBUFF]: Free old skb properly in skb_morph [IPV4]: Fix memory leak in inet_hashtables.h when NUMA is on [IPSEC]: Temporarily remove locks around copying of non-atomic fields [TCP] MTUprobe: Cleanup send queue check (no need to loop) [TCP]: MTUprobe: receiver window & data available checks fixed [MAINTAINERS]: tlan list is subscribers-only [SUNRPC]: Remove SPIN_LOCK_UNLOCKED [SUNRPC]: Make xprtsock.c:xs_setup_{udp,tcp}() static [PFKEY]: Sending an SADB_GET responds with an SADB_GET [IRDA]: Compilation for CONFIG_INET=n case [IPVS]: Fix compiler warning about unused register_ip_vs_protocol [ARP]: Fix arp reply when sender ip 0 [IPV6] TCPMD5: Fix deleting key operation. [IPV6] TCPMD5: Check return value of tcp_alloc_md5sig_pool(). [IPV4] TCPMD5: Use memmove() instead of memcpy() because we have overlaps. [IPV4] TCPMD5: Omit redundant NULL check for kfree() argument. ieee80211: Stop net_ratelimit/IEEE80211_DEBUG_DROP log pollution ...	2007-11-26 20:09:07 -08:00
Linus Torvalds	423eaf8f00	Merge git://git.linux-nfs.org/pub/linux/nfs-2.6 * git://git.linux-nfs.org/pub/linux/nfs-2.6: NFS: Clean up new multi-segment direct I/O changes NFS: Ensure we return zero if applications attempt to write zero bytes NFS: Support multiple segment iovecs in the NFS direct I/O path NFS: Introduce iovec I/O helpers to fs/nfs/direct.c SUNRPC: Add missing "space" to net/sunrpc/auth_gss.c SUNRPC: make sunrpc/xprtsock.c:xs_setup_{udp,tcp}() static NFS: fs/nfs/dir.c should #include "internal.h" NFS: make nfs_wb_page_priority() static NFS: mount failure causes bad page state SUNRPC: remove NFS/RDMA client's binary sysctls kernel BUG at fs/nfs/namespace.c:108! - can be triggered by bad server sunrpc: rpc_pipe_poll may miss available data in some cases sunrpc: return error if unsupported enctype or cksumtype is encountered sunrpc: gss_pipe_downcall(), don't assume all errors are transient NFS: Fix the ustat() regression	2007-11-26 19:42:59 -08:00
Linus Torvalds	ff1ea52fa3	Merge git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86 * git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86: x86: fix APIC related bootup crash on Athlon XP CPUs time: add ADJ_OFFSET_SS_READ x86: export the symbol empty_zero_page on the 32-bit x86 architecture x86: fix kprobes_64.c inlining borkage pci: use pci=bfsort for HP DL385 G2, DL585 G2 x86: correctly set UTS_MACHINE for "make ARCH=x86" lockdep: annotate do_debug() trap handler x86: turn off iommu merge by default x86: fix ACPI compile for LOCAL_APIC=n x86: printk kernel version in WARN_ON and other dump_stack users ACPI: Set max_cstate to 1 for early Opterons. x86: fix NMI watchdog & 'stopped time' problem	2007-11-26 19:41:28 -08:00
Linus Torvalds	6d27294053	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: (21 commits) libata: bump transfer chunk size if it's odd libata: Return proper ATA INT status in pata_bf54x driver pata_ali: trim trailing whitespace (fix checkpatch complaints) pata_isapnp: Polled devices pata_hpt37x: Fix cable detect bug spotted by Sergei pata_ali: Lots of problems still showing up with small ATAPI DMA pata_ali: Add Mitac 8317 and derivatives libata-core: List more documentation sources for reference ata_piix: Invalid use of writel/readl with iomap sata_sil24: fix sg table sizing pata_jmicron: fix disabled port handling in jmicron_pre_reset() pata_sil680: kill bogus reset code (take 2) ata_piix: port enable for the first SATA controller of ICH8 is 0xf not 0x3 ata_piix: only enable the first port on apple macbook pro ata_piix: reorganize controller IDs pata_sis.c: Add Packard Bell EasyNote K5305 to laptops libata-scsi: be tolerant of 12-byte ATAPI commands in 16-byte CDBs libata: use ATA_HORKAGE_STUCK_ERR for ATAPI tape drives libata: workaround DRQ=1 ERR=1 for ATAPI tape drives libata: remove unused functions ...	2007-11-26 19:18:22 -08:00
Linus Torvalds	6b41016032	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (39 commits) ACPI: EC: Workaround for optimized controllers (version 3) ACPI: EC: use printk_ratelimit(), add some DEBUG mode messages Revert "ACPI: EC: Workaround for optimized controllers" ACPI: fix two IRQ8 issues in IOAPIC mode ACPI: Add missing spaces to printk format cpuidle: fix HP nx6125 regression cpuidle: add sched_clock_idle_[sleep\|wakeup]_event() hooks cpuidle: fix C3 for no bus-master control case ACPI: thinkpad-acpi: fix oops when a module parameter has no value Revert "Fix very high interrupt rate for IRQ8 (rtc) unless pnpacpi=off" ACPI: EC: Don't init EC early if it has no _INI Revert "acpi: make ACPI_PROCFS default to y" Revert "ACPI: add documentation for deprecated /proc/acpi/battery in ACPI_PROCFS" ACPI: Split out control for /proc/acpi entries from battery, ac, and sbs. ACPI: Video: Increase buffer size for writes to brightness proc file. ACPI: EC: Workaround for optimized controllers ACPI: SBS: Fix retval warning ACPI: Enable MSR (FixedHW) support for T-States ACPI: Get throttling info from BIOS only after evaluating _PDC ACPI: Use _TSS for throttling control, when present. Add error checks. ...	2007-11-26 19:09:58 -08:00
Adrian Bunk	483066d62e	SUNRPC: make sunrpc/xprtsock.c:xs_setup_{udp,tcp}() static xs_setup_{udp,tcp}() can now become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-11-26 16:24:50 -05:00
Adrian Bunk	5334eb13d4	NFS: make nfs_wb_page_priority() static nfs_wb_page_priority() can now become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-11-26 16:24:48 -05:00
James Lentini	cfcb43ff7c	SUNRPC: remove NFS/RDMA client's binary sysctls Support for binary sysctls is being deprecated in 2.6.24. Since there are no applications using the NFS/RDMA client's binary sysctls, it makes sense to remove them. The patch below does this while leaving the /proc/sys interface unchanged. Please consider this for 2.6.24. Signed-off-by: James Lentini <jlentini@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-11-26 16:21:19 -05:00
John Stultz	52bfb36050	time: add ADJ_OFFSET_SS_READ Michael Kerrisk reported that a long standing bug in the adjtimex() system call causes glibc's adjtime(3) function to deliver the wrong results if 'delta' is NULL. add the ADJ_OFFSET_SS_READ API detail, which will be used by glibc to fix this API compatibility bug. Also see: http://bugzilla.kernel.org/show_bug.cgi?id=6761 [ mingo@elte.hu: added patch description and made it backwards compatible ] NOTE: the new flag is defined 0xa001 so that it returns -EINVAL on older kernels - this way glibc can use it safely. Suggested by Ulrich Drepper. Acked-by: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-11-26 20:42:19 +01:00
Herbert Xu	2d4baff8da	[SKBUFF]: Free old skb properly in skb_morph The skb_morph function only freed the data part of the dst skb, but leaked the auxiliary data such as the netfilter fields. This patch fixes this by moving the relevant parts from __kfree_skb to skb_release_all and calling it in skb_morph. It also makes kfree_skbmem static since it's no longer called anywhere else and it now no longer does skb_release_data. Thanks to Yasuyuki KOZAKAI for finding this problem and posting a patch for it. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2007-11-26 23:11:19 +08:00
Ayaz Abdulla	490dde8990	forcedeth: new mcp79 pci ids This patch adds new device ids and features for mcp79 devices into the forcedeth driver. Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2007-11-23 20:54:01 -05:00
Adrian Bunk	5fe4a33430	[SUNRPC]: Make xprtsock.c:xs_setup_{udp,tcp}() static xs_setup_{udp,tcp}() can now become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2007-11-22 19:38:25 +08:00
Heiko Carstens	37e3a6ac5a	[S390] appldata: remove unused binary sysctls. Remove binary sysctls that never worked due to missing strategy functions. Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Gerald Schaefer <geraldsc@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-11-20 11:13:45 +01:00
Heiko Carstens	43ebbf119a	[S390] cmm: remove unused binary sysctls. Remove binary sysctls that never worked due to missing strategy functions. Cc: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-11-20 11:13:45 +01:00
Len Brown	95b00786f3	Pull cpuidle into release branch	2007-11-20 01:18:37 -05:00
Shaohua Li	61fd47e0c8	ACPI: fix two IRQ8 issues in IOAPIC mode Use mp_irqs[] to get PNP device's interrupt polarity and trigger. There are two reasons to do this: 1. BIOS bug for PNP interrupt 2. BIOS explictly does override mp_irqs[] should cover all the cases. http://bugzilla.kernel.org/show_bug.cgi?id=5243 http://bugzilla.kernel.org/show_bug.cgi?id=7679 http://bugzilla.kernel.org/show_bug.cgi?id=9153 [lenb: fixed !IOAPIC and 64-bit !SMP builds] Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2007-11-20 01:16:29 -05:00
Venkatesh Pallipadi	ddc081a195	cpuidle: fix HP nx6125 regression Fix for http://bugzilla.kernel.org/show_bug.cgi?id=9355 cpuidle always used to fallback to C2 if there is some bm activity while entering C3. But, presence of C2 is not always guaranteed. Change cpuidle algorithm to detect a safe_state to fallback in case of bm_activity and use that state instead of C2. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2007-11-19 21:43:22 -05:00
Albert Lee	2d3b8eea7f	libata: workaround DRQ=1 ERR=1 for ATAPI tape drives After an error condition, some ATAPI tape drives set DRQ=1 together with ERR=1 when asking the host to transfer the CDB of the next packet command (i.e. request sense). This patch, a revised version of Alan/Mark's previous patch, adds ATA_HORKAGE_STUCK_ERR to workaround the problem by ignoring the ERR bit and proceed sending the CDB. Signed-off-by: Albert Lee <albertcc@tw.ibm.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Mark Lord <liml@rtr.ca> Signed-off-by: Tejun Heo <htejun@gmail.com>	2007-11-19 12:28:11 +09:00
Adrian Bunk	21bef6dd2b	libata: remove unused functions This patch removes the following obsolete functions: - libata-core.c: __sata_phy_reset() - libata-core.c: sata_phy_reset() - libata-eh.c: ata_qc_timeout() - libata-eh.c: ata_eng_timeout() Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Tejun Heo <htejun@gmail.com>	2007-11-19 12:28:09 +09:00
Eric Paris	ec41878170	SELinux: return EOPNOTSUPP not ENOTSUPP ENOTSUPP is not a valid error code in the kernel (it is defined in some NFS internal error codes and has been improperly used other places). In the !CONFIG_SECURITY_SELINUX case though it is possible that we could return this from selinux_audit_rule_init(). This patch just returns the userspace valid EOPNOTSUPP. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2007-11-17 10:38:16 +11:00
Jean Delvare	5e31c2bd3c	i2c: Make i2c_check_addr static i2c_check_addr is only used inside i2c-core now, so we can make it static and stop exporting it. Thanks to David Brownell for noticing. Signed-off-by: Jean Delvare <khali@linux-fr.org>	2007-11-15 19:24:02 +01:00
Jan Kara	7c06a8dc64	Fix 64KB blocksize in ext3 directories With 64KB blocksize, a directory entry can have size 64KB which does not fit into 16 bits we have for entry lenght. So we store 0xffff instead and convert value when read from / written to disk. The patch also converts some places to use ext3_next_entry() when we are changing them anyway. [akpm@linux-foundation.org: coding-style cleanups] Signed-off-by: Jan Kara <jack@suse.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-14 18:45:43 -08:00
Eric W. Biederman	57d5f66b86	pidns: Place under CONFIG_EXPERIMENTAL This is my trivial patch to swat innumerable little bugs with a single blow. After some intensive review (my apologies for not having gotten to this sooner) what we have looks like a good base to build on with the current pid namespace code but it is not complete, and it is still much to simple to find issues where the kernel does the wrong thing outside of the initial pid namespace. Until the dust settles and we are certain we have the ABI and the implementation is as correct as humanly possible let's keep process ID namespaces behind CONFIG_EXPERIMENTAL. Allowing us the option of fixing any ABI or other bugs we find as long as they are minor. Allowing users of the kernel to avoid those bugs simply by ensuring their kernel does not have support for multiple pid namespaces. [akpm@linux-foundation.org: coding-style cleanups] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Adrian Bunk <bunk@kernel.org> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Kir Kolyshkin <kir@swsoft.com> Cc: Kirill Korotaev <dev@sw.ru> Cc: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-14 18:45:43 -08:00
Bjorn Helgaas	9626f1f117	rtc: fall back to requesting only the ports we actually use Firmware like PNPBIOS or ACPI can report the address space consumed by the RTC. The actual space consumed may be less than the size (RTC_IO_EXTENT) assumed by the RTC driver. The PNP core doesn't request resources yet, but I'd like to make it do so. If/when it does, the RTC_IO_EXTENT request may fail, which prevents the RTC driver from loading. Since we only use the RTC index and data registers at RTC_PORT(0) and RTC_PORT(1), we can fall back to requesting just enough space for those. If the PNP core requests resources, this results in typical I/O port usage like this: 0070-0073 : 00:06 <-- PNP device 00:06 responds to 70-73 0070-0071 : rtc <-- RTC driver uses only 70-71 instead of the current: 0070-0077 : rtc <-- RTC_IO_EXTENT == 8 Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: David Brownell <david-b@pacbell.net> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-14 18:45:41 -08:00
Shannon Nelson	7bb67c14fd	I/OAT: Add support for version 2 of ioatdma device Add support for version 2 of the ioatdma device. This device handles the descriptor chain and DCA services slightly differently: - Instead of moving the dma descriptors between a busy and an idle chain, this new version uses a single circular chain so that we don't have rewrite the next_descriptor pointers as we add new requests, and the device doesn't need to re-read the last descriptor. - The new device has the DCA tags defined internally instead of needing them defined statically. Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Cc: "Williams, Dan J" <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-14 18:45:41 -08:00
Andrew Morton	cfb5285660	revert "Task Control Groups: example CPU accounting subsystem" Revert 62d0df64065e7c135d0002f069444fbdfc64768f. This was originally intended as a simple initial example of how to create a control groups subsystem; it wasn't intended for mainline, but I didn't make this clear enough to Andrew. The CFS cgroup subsystem now has better functionality for the per-cgroup usage accounting (based directly on CFS stats) than the "usage" status file in this patch, and the "load" status file is rather simplistic - although having a per-cgroup load average report would be a useful feature, I don't believe this patch actually provides it. If it gets into the final 2.6.24 we'd probably have to support this interface for ever. Cc: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-14 18:45:40 -08:00
Ken Chen	45c682a68a	hugetlb: fix i_blocks accounting For administrative purpose, we want to query actual block usage for hugetlbfs file via fstat. Currently, hugetlbfs always return 0. Fix that up since kernel already has all the information to track it properly. Signed-off-by: Ken Chen <kenchen@google.com> Acked-by: Adam Litke <agl@us.ibm.com> Cc: Badari Pulavarty <pbadari@us.ibm.com> Cc: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-14 18:45:40 -08:00
Adam Litke	9a119c056d	hugetlb: allow bulk updating in hugetlb_*_quota() Add a second parameter 'delta' to hugetlb_get_quota and hugetlb_put_quota to allow bulk updating of the sbinfo->free_blocks counter. This will be used by the next patch in the series. Signed-off-by: Adam Litke <agl@us.ibm.com> Cc: Ken Chen <kenchen@google.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: David Gibson <hermes@gibson.dropbear.id.au> Cc: William Lee Irwin III <wli@holomorphy.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-14 18:45:40 -08:00
Adam Litke	5b23dbe817	hugetlb: follow_hugetlb_page() for write access When calling get_user_pages(), a write flag is passed in by the caller to indicate if write access is required on the faulted-in pages. Currently, follow_hugetlb_page() ignores this flag and always faults pages for read-only access. This can cause data corruption because a device driver that calls get_user_pages() with write set will not expect COW faults to occur on the returned pages. This patch passes the write flag down to follow_hugetlb_page() and makes sure hugetlb_fault() is called with the right write_access parameter. [ezk@cs.sunysb.edu: build fix] Signed-off-by: Adam Litke <agl@us.ibm.com> Reviewed-by: Ken Chen <kenchen@google.com> Cc: David Gibson <hermes@gibson.dropbear.id.au> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-14 18:45:39 -08:00
Linus Torvalds	9418d5dc9b	Merge branch 'release' of git://lm-sensors.org/kernel/mhoffman/hwmon-2.6 * 'release' of git://lm-sensors.org/kernel/mhoffman/hwmon-2.6: hwmon: (i5k_amb) Convert macros to C functions hwmon: (w83781d) Add missing curly braces hwmon: (abituguru3) Identify ABit IP35 Pro as such hwmon: (f75375s) pwmX_mode sysfs files writable for f75375 variant hwmon: (f75375s) On n2100 systems, set fans to full speed on boot hwmon: (f75375s) Allow setting up fans with platform_data hwmon: (f75375s) Add new style bindings hwmon: (lm70) Convert semaphore to mutex hwmon: (applesmc) Add support for Mac Pro 2 x Quad-Core hwmon: (abituguru3) Add support for 2 new motherboards hwmon: (ibmpex) Change printk to dev_{info,err} macros hwmon: (i5k_amb) New memory temperature sensor driver hwmon: (f75375s) fix pwm mode setting hwmon: (ibmpex.c) fix NULL dereference hwmon: (sis5595) Split sis5595_attributes_opt hwmon: (sis5595) Add individual alarm files hwmon: (w83627hf) push nr+1 offset into *_REG_FAN macros and simplify hwmon: (w83627hf) hoist nr-1 offset out of show-store-temp-X hwmon: Add power meter spec to Documentation/hwmon/sysfs-interface	2007-11-13 09:09:36 -08:00
Trond Myklebust	91cf45f02a	[NET]: Add the helper kernel_sock_shutdown() ...and fix a couple of bugs in the NBD, CIFS and OCFS2 socket handlers. Looking at the sock->op->shutdown() handlers, it looks as if all of them take a SHUT_RD/SHUT_WR/SHUT_RDWR argument instead of the RCV_SHUTDOWN/SEND_SHUTDOWN arguments. Add a helper, and then define the SHUT_* enum to ensure that kernel users of shutdown() don't get confused. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Mark Fasheh <mark.fasheh@oracle.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-12 18:10:39 -08:00
Pierre Ynard	dbb2ed2485	[IPV6]: Add ifindex field to ND user option messages. Userland neighbor discovery options are typically heavily involved with the interface on which thay are received: add a missing ifindex field to the original struct. Thanks to R�mi Denis-Courmont. Signed-off-by: Pierre Ynard <linkfanel@yahoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-12 17:58:35 -08:00
Linus Torvalds	05f3f41589	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-virtio * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-virtio: virtio: Force use of power-of-two for descriptor ring sizes lguest: Fix lguest virtio-blk backend size computation virtio: Fix used_idx wrap-around virtio: more fallout from scatterlist changes. virtio: fix vring_init for 64 bits	2007-11-12 11:13:31 -08:00
Rusty Russell	42b36cc0ce	virtio: Force use of power-of-two for descriptor ring sizes The virtio descriptor rings of size N-1 were nicely set up to be aligned to an N-byte boundary. But as Anthony Liguori points out, the free-running indices used by virtio require that the sizes be a power of 2, otherwise we get problems on wrap (demonstrated with lguest). So we replace the clever "2^n-1" scheme with a simple "align to page boundary" scheme: this means that all virtio rings take at least two pages, but it's safer than guessing cache alignment. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2007-11-12 13:59:40 +11:00
Anthony Liguori	44332f7167	virtio: fix vring_init for 64 bits This patch fixes a typo in vring_init(). This happens to work today in lguest because the sizeof(struct vring_desc) is 16 and struct vring contains 3 pointers and an unsigned int so on 32-bit sizeof(struct vring_desc) == sizeof(struct vring). However, this is no longer true on 64-bit where the bug is exposed. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2007-11-12 13:55:12 +11:00
Chuck Lever	78608ba032	[NET]: Fix skb_truesize_check() assertion The intent of the assertion in skb_truesize_check() is to check for skb->truesize being decremented too much by other code, resulting in a wraparound below zero. The type of the right side of the comparison causes the compiler to promote the left side to an unsigned type, despite the presence of an explicit type cast. This defeats the check for negativity. Ensure both sides of the comparison are a signed type to prevent the implicit type conversion. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-10 21:53:30 -08:00
Linus Torvalds	a70a932299	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched * git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched: sched: proper prototype for kernel/sched.c:migration_init() sched: avoid large irq-latencies in smp-balancing sched: fix copy_namespace() <-> sched_fork() dependency in do_fork sched: clean up the wakeup preempt check, #2 sched: clean up the wakeup preempt check sched: wakeup preemption fix sched: remove PREEMPT_RESTRICT sched: turn off PREEMPT_RESTRICT KVM: fix !SMP build error x86: make nmi_cpu_busy() always defined x86: make ipi_handler() always defined sched: cleanup, use NSEC_PER_MSEC and NSEC_PER_SEC sched: reintroduce SMP tunings again sched: restore deterministic CPU accounting on powerpc sched: fix delay accounting regression sched: reintroduce the sched_min_granularity tunable sched: documentation: place_entity() comments sched: fix vslice	2007-11-09 15:27:54 -08:00
Linus Torvalds	4c31c30302	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block * 'for-linus' of git://git.kernel.dk/linux-2.6-block: Add UNPLUG traces to all appropriate places block: fix requeue handling in blk_queue_invalidate_tags() mmc: Fix sg helper copy-and-paste error pktcdvd: fix BUG caused by sysfs module reference semantics change ioprio: allow sys_ioprio_set() value of 0 to reset ioprio setting cfq_idle_class_timer: add paranoid checks for jiffies overflow cfq: fix IOPRIO_CLASS_IDLE delays cfq: fix IOPRIO_CLASS_IDLE accounting	2007-11-09 15:17:49 -08:00
Adrian Bunk	e6fe6649b4	sched: proper prototype for kernel/sched.c:migration_init() This patch adds a proper prototype for migration_init() in include/linux/sched.h Since there's no point in always returning 0 to a caller that doesn't check the return value it also changes the function to return void. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-11-09 22:39:39 +01:00
Peter Zijlstra	b82d9fdd84	sched: avoid large irq-latencies in smp-balancing SMP balancing is done with IRQs disabled and can iterate the full rq. When rqs are large this can cause large irq-latencies. Limit the nr of iterations on each run. This fixes a scheduling latency regression reported by the -rt folks. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Steven Rostedt <rostedt@goodmis.org> Tested-by: Gregory Haskins <ghaskins@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-11-09 22:39:39 +01:00
Ingo Molnar	3e3e13f399	sched: remove PREEMPT_RESTRICT remove PREEMPT_RESTRICT. (this is a separate commit so that any regression related to the removal itself is bisectable) Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-11-09 22:39:39 +01:00
Ingo Molnar	a5fbb6d106	KVM: fix !SMP build error fix a !SMP build error: drivers/kvm/kvm_main.c: In function 'kvm_flush_remote_tlbs': drivers/kvm/kvm_main.c:220: error: implicit declaration of function 'smp_call_function_mask' (and also avoid unused function warning related to up_smp_call_function() not making use of the 'func' parameter.) Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-11-09 22:39:38 +01:00
Paul Mackerras	fa13a5a1f2	sched: restore deterministic CPU accounting on powerpc Since powerpc started using CONFIG_GENERIC_CLOCKEVENTS, the deterministic CPU accounting (CONFIG_VIRT_CPU_ACCOUNTING) has been broken on powerpc, because we end up counting user time twice: once in timer_interrupt() and once in update_process_times(). This fixes the problem by pulling the code in update_process_times that updates utime and stime into a separate function called account_process_tick. If CONFIG_VIRT_CPU_ACCOUNTING is not defined, there is a version of account_process_tick in kernel/timer.c that simply accounts a whole tick to either utime or stime as before. If CONFIG_VIRT_CPU_ACCOUNTING is defined, then arch code gets to implement account_process_tick. This also lets us simplify the s390 code a bit; it means that the s390 timer interrupt can now call update_process_times even when CONFIG_VIRT_CPU_ACCOUNTING is turned on, and can just implement a suitable account_process_tick(). account_process_tick() now takes the task_struct * as an argument. Tested both with and without CONFIG_VIRT_CPU_ACCOUNTING. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-11-09 22:39:38 +01:00
Peter Zijlstra	b2be5e96dc	sched: reintroduce the sched_min_granularity tunable we lost the sched_min_granularity tunable to a clever optimization that uses the sched_latency/min_granularity ratio - but the ratio is quite unintuitive to users and can also crash the kernel if the ratio is set to 0. So reintroduce the min_granularity tunable, while keeping the ratio maintained internally. no functionality changed. [ mingo@elte.hu: some fixlets. ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-11-09 22:39:37 +01:00
Alan D. Brunelle	2ad8b1ef11	Add UNPLUG traces to all appropriate places Added blk_unplug interface, allowing all invocations of unplugs to result in a generated blktrace UNPLUG. Signed-off-by: Alan D. Brunelle <Alan.Brunelle@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-11-09 13:41:32 +01:00
Riku Voipio	ff312d07c2	hwmon: (f75375s) Allow setting up fans with platform_data Allow initializing fans on systems where BIOS does not do that by default. - define f75375s_platform_data in new file f75375s.h - if platform_data was provided, set fans accordingly in f75375_init() - split set_pwm_enable() to a sysfs callback and directly usable set_pwm_enable_direct() Signed-off-by: Riku Voipio <riku.voipio@movial.fi> Signed-off-by: Mark M. Hoffman <mhoffman@lightlink.com>	2007-11-08 08:42:46 -05:00
Darrick J. Wong	298c752491	hwmon: (i5k_amb) New memory temperature sensor driver New driver to read FB-DIMM temperature sensors on systems with the Intel 5000 series chipsets. Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: Mark M. Hoffman <mhoffman@lightlink.com>	2007-11-08 08:42:46 -05:00
Patrick McHardy	c3d8d1e30c	[NETLINK]: Fix unicast timeouts Commit ed6dcf4a in the history.git tree broke netlink_unicast timeouts by moving the schedule_timeout() call to a new function that doesn't propagate the remaining timeout back to the caller. This means on each retry we start with the full timeout again. ipc/mqueue.c seems to actually want to wait indefinitely so this behaviour is retained. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-07 04:15:12 -08:00
Alan Cox	0fc00e2440	[TTY]: Fix network driver interactions with TCGET/SET calls. Dave Miller noted various cases where line disciplines for things like ppp go poking around in termios themselves in ways that broke with the new termios code. Rather than have them all learning about termios internals provide proper methods for this - tty_mode_ioctl() This handles all the terminal mode handling for speed/carrier etc and none of the methods are ldisc dependant so they can be called by any user - tty_perform_flush() This extracts the flush functionality and enables pppd the ppp layer to share it cleanly. The existing n_tty_ioctl code is refactored in this patch to provide the new functions and to call them itself appropriately. This patch has no (intended) behaviour changes and simply prepares for the other fixes. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-07 04:14:19 -08:00
David S. Miller	44656ba128	[NET]: Kill proc_net_create() There are no more users. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-07 04:10:52 -08:00
Pavel Emelyanov	6a9fb9479f	[IPV4]: Clean the ip_sockglue.c from some ugly ifdefs The #idfed CONFIG_IP_MROUTE is sometimes places inside the if-s, which looks completely bad. Similar ifdefs inside the functions looks a bit better, but they are also not recommended to be used. Provide an ifdef-ed ip_mroute_opt() helper to cleanup the code. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-07 04:08:55 -08:00
Jan Engelhardt	b98e1747ee	[NETFILTER]: Sort matches/targets in Kbuild file Sort matches and targets in the Kbuild file. Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-07 04:08:21 -08:00
Jesper Nilsson	a66f66c44d	[MTD] Provide mtdram.h with mtdram_init_device() prototype This is used by axisflashmap.c to boot from ram. Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com> Acked-by: Mikael Starvik <starvik@axis.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2007-11-06 08:40:24 +00:00
Linus Torvalds	f8a9efb528	Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev: libata: handle broken cable reporting pata_hpt37x: Fix outstanding bug reports on the HPT374 and 37x cable detect ata_piix: Add additional PCI identifier for 40 wire short cable pata_serverworks: Fix problem with some drive combinations libata: Don't disable dipm with SET FEATURES libata and bogus LBA48 drives	2007-11-05 17:43:04 -08:00
Kamalesh Babulal	5a75983eef	Missing include file in kallsyms.h The Build with randconfig fails with following error with the 2.6.24-rc4-git9 include/linux/kallsyms.h:56: error: `NULL' undeclared (first use in this function) include/linux/kallsyms.h:56: error: (Each undeclared identifier is reported only once include/linux/kallsyms.h:56: error: for each function it appears in.) make[2]: * [arch/powerpc/platforms/cell/spu_callbacks.o] Error 1 make[1]: * [arch/powerpc/platforms/cell] Error 2 make: *** [arch/powerpc/platforms] Error 2 Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-05 15:12:32 -08:00
Alan Cox	6bbfd53d47	libata: handle broken cable reporting One or two ancient drives predated the cable spec and didn't sent the valid bits for the field. I had hoped to leave this out of libata as a piece of historical annoyance but a recent CD drive shows the same bug so we have to import support for it. Same concept as Bartlomiej's changes old IDE except that as we have centralised blacklists we can avoid keeping another private table of stuff Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-11-05 18:10:28 -05:00
Linus Torvalds	5d66f151ac	Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: PCI: Add Kconfig option to disable deprecated pci_find_* API PCI: pciserial_resume_one ignored return value of pci_enable_device PCI Hotplug: cpqhp_pushbutton_thread(): remove a pointless if() check PCI: make pci_match_device() static PCI: Remove 3 incorrect MSI quirks. PCI: Add MSI INTX_DISABLE quirks for ATI SB700/800 SATA and IXP SB400 USB PCI: Add quirk for devices which disable MSI when INTX_DISABLE is set. PCI: Add MSI quirk for ServerWorks HT1000 PCIX bridge. PCI: Revert "PCI: disable MSI by default on systems with Serverworks HT1000 chips"	2007-11-05 14:08:00 -08:00
Jeff Garzik	bd3989e006	PCI: Add Kconfig option to disable deprecated pci_find_* API Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-11-05 13:35:17 -08:00
Adrian Bunk	d73460d79b	PCI: make pci_match_device() static pci_match_device() no longer has any other users. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-11-05 13:35:17 -08:00
David Miller	5257dca0bd	PCI: Remove 3 incorrect MSI quirks. Now that we have dealt with the real issue, in that some ATI SATA and USB controllers needed the INTX_DISABLE quirk, we can remove these AMD chipset global MSI disabling quirks. This reverts three changesets: 4be8f906435a6af241821ab5b94b2b12cb7d57d8 (PCI: disable MSI on RS690) aea6a433f50cd89b9cbd10850fd0b32f961f9883 (PCI: disable MSI on RD580) f122392f679ebed39db08074f935d770504623eb (PCI: disable MSI on RX790) This is based upon testing and feedback from Shane Huang <Shane.Huang@amd.com>. Cc: Shane Huang <Shane.Huang@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-11-05 13:35:17 -08:00
David Miller	ba698ad4b7	PCI: Add quirk for devices which disable MSI when INTX_DISABLE is set. A reasonably common problem with some devices is that they will disable MSI generation when the INTX_DISABLE bit is set in the PCI_COMMAND register. Quirk this explicitly, guarding the pci_intx() calls in msi.c with this quirk indication. The first entries for this quirk are for 5714 and 5780 Tigon3 chips, and thus we can remove the workaround code from the tg3.c driver. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Michael Chan <mchan@broadcom.com> Acked-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-11-05 13:35:16 -08:00
David Miller	1d84b5424e	PCI: Add MSI quirk for ServerWorks HT1000 PCIX bridge. This is the fix for the following problem: https://bugzilla.redhat.com/show_bug.cgi?id=227657 The bnx2 device 5706 complains about MSI not working behind a ServerWorks HT1000 PCIX bridge. An earlier commit to fix the problem: e3008dedff4bdc96a5f67224cd3d8d12237082a0: "PCI: disable MSI by default on systems with Serverworks HT1000 chips" was not entirely correct, and has been reverted. MSI does not work on the PCIX bus because the BIOS did not set the HT_MSI_FLAGS_ENABLE bit in the HyperTransport MSI capability on the bridge. We use the existing quirk_msi_ht_cap() to detect the problem and disable MSI in all buses behind it. Signed-off-by: Michael Chan <mchan@broadcom.com> Cc: Anantha Subramanyam <ananth@broadcom.com> Cc: Naren Sankar <nsankar@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-11-05 13:35:16 -08:00
David Miller	2cc31879f8	PCI: Revert "PCI: disable MSI by default on systems with Serverworks HT1000 chips" This reverts commit e3008dedff4bdc96a5f67224cd3d8d12237082a0. The real bug was an INTX issue in the tg3 ethernet chip, and cured by commit c129d962a66c76964954a98b38586ada82cf9381 Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-11-05 13:35:16 -08:00
Bartlomiej Zolnierkiewicz	01745112de	ide: move ide_fixstring() documentation to ide-iops.c from ide.h Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-11-05 21:42:29 +01:00
Adrian Bunk	fad23fc78b	kernel/futex.c: make 3 functions static The following functions can now become static again: - get_futex_key() - get_futex_key_refs() - drop_futex_key_refs() Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2007-11-05 21:53:46 +11:00
Geert Uytterhoeven	17bd9a2f4c	libata and bogus LBA48 drives A colleague noticed recent versions of Ubuntu no longer detect his 80 GB ST380020ACE drive. This drive is special in that it advertises LBA48 support, but has the lba_capacity_2 field set to zero (cfr. http://lkml.org/lkml/2004/3/30/163). Upon closer look, libata indeed doesn't seem to handle this case yet. Below is an (untested) fix. Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-11-04 22:53:15 -05:00
Linus Torvalds	b4f555081f	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block * 'for-linus' of git://git.kernel.dk/linux-2.6-block: [BLOCK] Don't allow empty barriers to be passed down to queues that don't grok them dm: bounce_pfn limit added Deadline iosched: Fix batching fairness Deadline iosched: Reset batch for ordered requests Deadline iosched: Factor out finding latter reques	2007-11-03 12:43:36 -07:00
Linus Torvalds	160acc2e89	Merge branch 'sg' of git://git.kernel.dk/linux-2.6-block * 'sg' of git://git.kernel.dk/linux-2.6-block: [SG] Get rid of __sg_mark_end() cleanup asm/scatterlist.h includes SG: Make sg_init_one() use general table init functions	2007-11-03 12:43:21 -07:00
Tony Battersby	f8d8e5799b	libata: increase 128 KB / cmd limit for ATAPI tape drives Commands sent to ATAPI tape drives via the SCSI generic (sg) driver are limited in the amount of data that they can transfer by the max_sectors value. The max_sectors value is currently calculated according to the command set for disk drives, which doesn't apply to tape drives. The default max_sectors value of 256 limits ATAPI tape drive commands to 128 KB. This patch against 2.6.24-rc1 increases the max_sectors value for tape drives to 65535, which permits tape drive commands to transfer just under 32 MB. Tested with a SuperMicro PDSME motherboard, AHCI, and a Sony SDX-570V SATA tape drive. Note that some of the chipset drivers also set their own max_sectors value, which may override the value set in libata-core. I don't have any of these chipsets to test, so I didn't go messing with them. Also, ATAPI devices other than tape drives may benefit from similar changes, but I have only tape drives and disk drives to test. Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-11-03 08:46:54 -04:00
Linus Torvalds	a89b7717a8	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: linux-input mailing list moved to vger.kernel.org Input: inport, logibm - use KERN_INFO when reporting missing mouse Input: appletouch - idle reset logic broke older Fountains Input: hp_sdc.c - fix section mismatch Input: appletouch - add Johannes Berg as maintainer Input: Add Euro and Dollar key codes Input: xpad - add more USB IDs	2007-11-02 19:37:41 -07:00
Vasily Averin	5ec140e600	dm: bounce_pfn limit added Device mapper uses its own bounce_pfn that may differ from one on underlying device. In that way dm can build incorrect requests that contain sg elements greater than underlying device is able to handle. This is the cause of slab corruption in i2o layer, occurred on i386 arch when very long direct IO requests are addressed to dm-over-i2o device. Signed-off-by: Vasily Averin <vvs@sw.ru> Cc: <stable@kernel.org> Cc: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-11-02 08:47:25 +01:00
Jens Axboe	c46f2334c8	[SG] Get rid of __sg_mark_end() sg_mark_end() overwrites the page_link information, but all users want __sg_mark_end() behaviour where we just set the end bit. That is the most natural way to use the sg list, since you'll fill it in and then mark the end point. So change sg_mark_end() to only set the termination bit. Add a sg_magic debug check as well, and clear a chain pointer if it is set. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-11-02 08:47:06 +01:00
Jens Axboe	013fb33972	SG: Make sg_init_one() use general table init functions Don't open code sg_init_one(), make it reuse sg_init_table(). Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-11-02 08:47:06 +01:00
Stephen Hemminger	3b582cc14c	[NET]: docbook fixes for netif_ functions Documentation updates for network interfaces. 1. Add doc for netif_napi_add 2. Remove doc for unused returns from netif_rx 3. Add doc for netif_receive_skb [ Incorporated minor mods from Randy Dunlap -DaveM ] Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-01 02:21:47 -07:00
Greg Kroah-Hartman	d919fd433b	Revert "Driver core: remove class_device_*_bin_file" This reverts commit fcd239d3d5575e5cc63aab5c33cf6dc66904f6d6. I messed up, ia64 still uses these files in the current tree, and now can not build the pci code, which all ia64 boxes seem to require :) This fixes that mistake. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-10-31 12:51:29 -07:00
Linus Torvalds	dd13810b42	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: [AF_KEY]: suppress a warning for 64k pages. [TIPC]: Fix headercheck wrt. tipc_config.h [COMPAT]: Fix build on COMPAT platforms when CONFIG_NET is disabled. [CONNECTOR]: Fix a spurious kfree_skb() call [COMPAT]: Fix new dev_ifname32 returning -EFAULT [NET]: Fix incorrect sg_mark_end() calls. [IPVS]: Remove /proc/net/ip_vs_lblcr [IPV6]: remove duplicate call to proc_net_remove [NETNS]: fix net released by rcu callback [NET]: Fix free_netdev on register_netdev failure. [WAN]: fix drivers/net/wan/lmc/ compilation	2007-10-31 07:46:51 -07:00
Greg Kroah-Hartman	fcd239d3d5	Driver core: remove class_device_*_bin_file These functions are not used by anyone, so remove them from the tree. The class_device code will be removed soon anyway, so no future users will ever be possible. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-10-30 21:52:33 -07:00
David S. Miller	97ef1bb0c8	[TIPC]: Fix headercheck wrt. tipc_config.h It wants string functions like memcpy() for inline routines, and these define userland interfaces. The only clean way to deal with this is to simply put linux/string.h into unifdef-y and have it include <string.h> when not-__KERNEL__. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-30 21:44:00 -07:00
Linus Torvalds	71d00feca2	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: ixgb: fix TX hangs under heavy load e1000e: Fix typo ! & ixgbe: minor sparse fixes e1000: sparse warnings fixes ixgb: fix sparse warnings e1000e: fix sparse warnings mv643xx_eth: Fix MV643XX_ETH offsets used by Pegasos 2 Blackfin EMAC driver: Fix Ethernet communication bug (dupliated and lost packets) DM9601: Support for ADMtek ADM8515 NIC	2007-10-30 12:04:29 -07:00
Linus Torvalds	8c1ee54cb3	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: libata: implement and use ATA_QCFLAG_QUIET libata: stop being overjealous about non-IO commands libata: flush is an IO command sata_promise: cleanups sata_promise: ASIC PRD table bug workaround, take 2	2007-10-30 11:49:13 -07:00
Dale Farnsworth	3077d78a74	mv643xx_eth: Fix MV643XX_ETH offsets used by Pegasos 2 In the mv643xx_eth driver, we now use offsets from the ethernet register block within the chip, but the pegasos 2 platform still needs offsets from the full chip's register base address. Signed-off-by: Dale Farnsworth <dale@farnsworth.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-10-30 14:32:16 -04:00
Linus Torvalds	2d175d438f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: [TIPC]: Add tipc_config.h to include/linux/Kbuild. [WAN]: lmc_ioctl: don't return with locks held [SUNRPC]: fix rpc debugging [TCP]: Saner thash_entries default with much memory. [SUNRPC] rpc_rdma: we need to cast u64 to unsigned long long for printing [IPv4] SNMP: Refer correct memory location to display ICMP out-going statistics [NET]: Fix error reporting in sys_socketpair(). [NETFILTER]: nf_ct_alloc_hashtable(): use __GFP_NOWARN [NET]: Fix race between poll_napi() and net_rx_action() [TCP] MD5: Remove some more unnecessary casting. [TCP] vegas: Fix a bug in disabling slow start by gamma parameter. [IPVS]: use proper timeout instead of fixed value [IPV6] NDISC: Fix setting base_reachable_time_ms variable.	2007-10-30 08:08:40 -07:00
Corey Minyard	64e862a579	IPMI: fix comparison in demangle_device_id Coverity spotted some incorrect code in a recent change to the IPMI driver; this patch make sure the data is really long enough to pull the manufacturer id and product id out of a get device id message. Signed-off-by: Corey Minyard <cminyard@mvista.com> Cc: Adrian Bunk <bunk@kernel.org> Cc: Stian Jordet <liste@jordet.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-30 08:06:55 -07:00
Tejun Heo	e027bd36c1	libata: implement and use ATA_QCFLAG_QUIET Implement ATA_QCFLAG_QUIET which indicates that there's no need to report if the command fails with AC_ERR_DEV and set it for passthrough commands. Combined with previous changes, this now makes device errors for all direct commands reported directly to the issuer without going through EH actions and reporting. Note that EH is still invoked after non-IO device errors to determine the nature of the error and resume command execution (some controller requires special care after error to continue). It just performs default maintenance after error, examines what's going on, realizes that it's none of its business and reports the command failure without logging any error messages. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-10-30 09:59:43 -04:00
David S. Miller	502ef38da1	[TIPC]: Add tipc_config.h to include/linux/Kbuild. Needed, as reported in: http://bugzilla.kernel.org/show_bug.cgi?id=9260 Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-30 01:19:19 -07:00
Balbir Singh	9301899be7	sched: fix /proc/<PID>/stat stime/utime monotonicity, part 2 Extend Peter's patch to fix accounting issues, by keeping stime monotonic too. Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Tested-by: Frans Pop <elendil@planet.nl>	2007-10-30 00:26:32 +01:00
Linus Torvalds	db8185360d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched * git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched: sched: fix style in kernel/sched.c sched: fix style of swap() macro in kernel/sched_fair.c sched: report CPU usage in CFS cgroup directories sched: move rcu_head to task_group struct sched: fix incorrect assumption that cpu 0 exists sched: keep utime/stime monotonic sched: make kernel/sched.c:account_guest_time() static	2007-10-29 14:06:19 -07:00
Linus Torvalds	6a22c57b8d	Revert "x86_64: allocate sparsemem memmap above 4G" This reverts commit 2e1c49db4c640b35df13889b86b9d62215ade4b6. First off, testing in Fedora has shown it to cause boot failures, bisected down by Martin Ebourne, and reported by Dave Jobes. So the commit will likely be reverted in the 2.6.23 stable kernels. Secondly, in the 2.6.24 model, x86-64 has now grown support for SPARSEMEM_VMEMMAP, which disables the relevant code anyway, so while the bug is not visible any more, it's become invisible due to the code just being irrelevant and no longer enabled on the only architecture that this ever affected. Reported-by: Dave Jones <davej@redhat.com> Tested-by: Martin Ebourne <fedora@ebourne.me.uk> Cc: Zou Nan hai <nanhai.zou@intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Acked-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-29 14:05:37 -07:00
Peter Zijlstra	73a2bcb0ed	sched: keep utime/stime monotonic keep utime/stime monotonic. cpustats use utime/stime as a ratio against sum_exec_runtime, as a consequence it can happen - when the ratio changes faster than time accumulates - that either can be appear to go backwards. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-29 21:18:11 +01:00
Linus Torvalds	3529a23342	Merge branch 'alpm' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'alpm' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev: [libata] AHCI: add hw link power management support [libata] Link power management infrastructure	2007-10-29 12:12:34 -07:00
Linus Torvalds	00cda56d39	Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev: [libata] AHCI: fix newly introduced host-reset bug [libata] sata_nv: fix SWNCQ enabling libata: add MAXTOR 7V300F0/VA111900 to NCQ blacklist libata: no need to speed down if already at PIO0 libata: relocate forcing PIO0 on reset pata_ns87415: define SUPERIO_IDE_MAX_RETRIES [libata] Address some checkpatch-spotted issues [libata] fix 'if(' and similar areas that lack whitespace libata: implement ata_wait_after_reset() libata: track SLEEP state and issue SRST to wake it up libata: relocate and fix post-command processing	2007-10-29 12:11:54 -07:00
Kristen Carlson Accardi	ca77329fb7	[libata] Link power management infrastructure Device Initiated Power Management, which is defined in SATA 2.5 can be enabled for disks which support it. This patch enables DIPM when the user sets the link power management policy to "min_power". Additionally, libata drivers can define a function (enable_pm) that will perform hardware specific actions to enable whatever power management policy the user set up for Host Initiated Power management (HIPM). This power management policy will be activated after all disks have been enumerated and intialized. Drivers should also define disable_pm, which will turn off link power management, but not change link power management policy. Documentation/scsi/link_power_management_policy.txt has additional information. Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2007-10-29 11:00:35 -04:00
Linus Torvalds	cbf67812b2	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block * 'for-linus' of git://git.kernel.dk/linux-2.6-block: compat_ioctl: fix block device compat ioctl regression [BLOCK] Fix bad sharing of tag busy list on queues with shared tag maps Fix a build error when BLOCK=n block: use lock bitops for the tag map. cciss: update copyright notices cfq_get_queue: fix possible NULL pointer access blk_sync_queue() should cancel request_queue->unplug_work cfq_exit_queue() should cancel cfq_data->unplug_work block layer: remove a unused argument of drive_stat_acct()	2007-10-29 07:49:28 -07:00
Linus Torvalds	20dc9f01a8	Merge branch 'sg' of git://git.kernel.dk/linux-2.6-block * 'sg' of git://git.kernel.dk/linux-2.6-block: Correction of "Update drivers to use sg helpers" patch for IMXMMC driver sg_init_table() should use unsigned loop index variable sg_last() should use unsigned loop index variable Initialise scatter/gather list in sg driver Initialise scatter/gather list in ata_sg_setup x86: fix pci-gart failure handling SG: s390-scsi: missing size parameter in zfcp_address_to_sg() SG: clear termination bit in sg_chain()	2007-10-29 07:49:10 -07:00
Al Viro	142956af52	fix abuses of ptrdiff_t Use of ptrdiff_t in places like - if (!access_ok(VERIFY_WRITE, u_tmp->rx_buf, u_tmp->len)) + if (!access_ok(VERIFY_WRITE, (u8 __user *) + (ptrdiff_t) u_tmp->rx_buf, + u_tmp->len)) is wrong; for one thing, it's a bad C (it's what uintptr_t is for; in general we are not even promised that ptrdiff_t is large enough to hold a pointer, just enough to hold a difference between two pointers within the same object). For another, it confuses the fsck out of sparse. Use unsigned long or uintptr_t instead. There are several places misusing ptrdiff_t; fixed. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-29 07:41:33 -07:00
Al Viro	2d8a972661	SUNRPC endianness annotations rpcrdma stuff lacks endianness annotations for on-the-wire data. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-29 07:41:32 -07:00
Al Viro	ca5cd877ae	x86 merge fallout: uml Don't undef __i386__/__x86_64__ in uml anymore, make sure that (few) places that required adjusting the ifdefs got those. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-29 07:41:32 -07:00
Jens Axboe	6eca9004df	[BLOCK] Fix bad sharing of tag busy list on queues with shared tag maps For the locking to work, only the tag map and tag bit map may be shared (incidentally, I was just explaining this to Nick yesterday, but I apparently didn't review the code well enough myself). But we also share the busy list! The busy_list must be queue private, or we need a block_queue_tag covering lock as well. So we have to move the busy_list to the queue. This'll work fine, and it'll actually also fix a problem with blk_queue_invalidate_tags() which will invalidate tags across all shared queues. This is a bit confusing, the low level driver should call it for each queue seperately since otherwise you cannot kill tags on just a single queue for eg a hard drive that stops responding. Since the function has no callers currently, it's not an issue. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-29 11:33:06 +01:00
Tejun Heo	88ff6eafbb	libata: implement ata_wait_after_reset() On certain device/controller combination, 0xff status is asserted after reset and doesn't get cleared during 150ms post-reset wait. As 0xff status is interpreted as no device (for good reasons), this can lead to misdetection on such cases. This patch implements ata_wait_after_reset() which replaces the 150ms sleep and waits upto ATA_TMOUT_FF_WAIT if status is 0xff. ATA_TMOUT_FF_WAIT is currently 800ms which is enough for HHD424020F7SV00 to get detected but not enough for Quantum GoVault drive which is known to take upto 2s. Without parallel probing, spending 2s on 0xff port would incur too much delay on ata_piix's which use 0xff to indicate empty port and doesn't have SCR register, so GoVault needs to wait till parallel probing. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-10-29 06:15:27 -04:00
Tejun Heo	054a5fbace	libata: track SLEEP state and issue SRST to wake it up ATA devices in SLEEP mode don't respond to any commands. SRST is necessary to wake it up. Till now, when a command is issued to a device in SLEEP mode, the command times out, which makes EH reset the device and retry the command after that, causing a long delay. This patch makes libata track SLEEP state and issue SRST automatically if a command is about to be issued to a device in SLEEP. Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Bruce Allen <ballen@gravity.phys.uwm.edu> Cc: Andrew Paprocki <andrew@ishiboo.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-10-29 06:15:25 -04:00
Chuck Lever	513f54b78f	sg_init_table() should use unsigned loop index variable Clean up: fix a mixed sign comparison in sg_init_table() accidentally introduced by commit d6ec0842. The sign of the loop index variable should match the sign of the "nents" argument. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>	2007-10-29 09:18:04 +01:00
Chuck Lever	74eb94f7b8	sg_last() should use unsigned loop index variable Clean up: fix a mixed sign comparison in sg_last() accidentally introduced by commit 70eb8040. The sign of the loop index variable should match the sign of the "nents" argument. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Jens Axboe <axboe@carl.home.kernel.dk>	2007-10-29 09:18:04 +01:00
Jens Axboe	73fd546aa7	SG: clear termination bit in sg_chain() Since we are using the last entry in the list, clear any possible termination bit that may have already been set. Pointed out by Rusty. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-29 09:18:03 +01:00
Al Viro	30e69bf4cc	fix breakage in pegasos_eth Fix fallout from commit b45d9147f1582333e180e1023624c003874b7312 ("mv643xx_eth: Remove unused register defines") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-27 22:23:06 -07:00
Carlos Corbacho	f7852be649	Input: Add Euro and Dollar key codes Most newer Acer laptops (from 2005 onwards) now ship with an extra Dollar and Euro key either side of the 'Up' arrow. These cannot be mapped in the traditional way, since they are not combination keys. Signed-off-by: Carlos Corbacho <cathectic@gmail.com> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2007-10-27 23:42:32 -04:00
Linus Torvalds	ec3b67c11d	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (32 commits) [NetLabel]: correct usage of RCU locking [TCP]: fix D-SACK cwnd handling [NET] napi: use non-interruptible sleep in napi_disable [SCTP] net/sctp/auth.c: make 3 functions static [TCP]: Add missing I/O AT code to ipv6 side. [SCTP]: #if 0 sctp_update_copy_cksum() [INET]: Unexport icmpmsg_statistics [NET]: Unexport sock_enable_timestamp(). [TCP]: Make tcp_match_skb_to_sack() static. [IRDA]: Make ircomm_tty static. [NET] fs/proc/proc_net.c: make a struct static [NET] dev_change_name: ignore changes to same name [NET]: Document some simple rules for actions [NET_CLS_ACT]: Use skb_act_clone [NET_CLS_ACT]: Introduce skb_act_clone [TCP]: Fix scatterlist handling in MD5 signature support. [IPSEC]: Fix scatterlist handling in skb_icv_walk(). [IPSEC]: Add missing sg_init_table() calls to ESP. [CRYPTO]: Initialize TCRYPT on-stack scatterlist objects correctly. [CRYPTO]: HMAC needs some more scatterlist fixups. ...	2007-10-26 08:43:05 -07:00
Alexey Dobriyan	e868171a94	De-constify sched.h [PATCH] De-constify sched.h This reverts commit a8972ccf00b7184a743eb6cd9bc7f3443357910c ("sched: constify sched.h") 1) Patch doesn't change any code here, so gcc is already smart enough to "feel" constness in such simple functions. 2) There is no such thing as const task_struct. Anyone who think otherwise deserves compiler warning. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-26 08:42:24 -07:00
Benjamin Herrenschmidt	43cc7380ec	[NET] napi: use non-interruptible sleep in napi_disable The current napi_disable() uses msleep_interruptible() but doesn't (and can't) exit in case there's a signal, thus ending up doing a hot spin without a cpu_relax. Use uninterruptible sleep instead. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-26 04:23:22 -07:00
David S. Miller	8c56a347c1	Merge master.kernel.org:/pub/scm/linux/kernel/git/acme/net-2.6	2007-10-26 03:50:02 -07:00
Linus Torvalds	06dbbfef82	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: [IPV4]: Explicitly call fib_get_table() in fib_frontend.c [NET]: Use BUILD_BUG_ON in net/core/flowi.c [NET]: Remove in-code externs for some functions from net/core/dev.c [NET]: Don't declare extern variables in net/core/sysctl_net_core.c [TCP]: Remove unneeded implicit type cast when calling tcp_minshall_update() [NET]: Treat the sign of the result of skb_headroom() consistently [9P]: Fix missing unlock before return in p9_mux_poll_start [PKT_SCHED]: Fix sch_prio.c build with CONFIG_NETDEVICES_MULTIQUEUE [IPV4] ip_gre: sendto/recvfrom NBMA address [SCTP]: Consolidate sctp_ulpq_renege_xxx functions [NETLINK]: Fix ACK processing after netlink_dump_start [VLAN]: MAINTAINERS update [DCCP]: Implement SIOCINQ/FIONREAD [NET]: Validate device addr prior to interface-up	2007-10-25 15:50:32 -07:00
Linus Torvalds	7f14957453	Merge branch 'sg' of git://git.kernel.dk/linux-2.6-block * 'sg' of git://git.kernel.dk/linux-2.6-block: fix sg_phys to use dma_addr_t ub: add sg_init_table for sense and read capacity commands x86: pci-gart fix blackfin: fix sg fallout xtensa: dma-mapping.h is using linux/scatterlist.h functions, so include it SG: audit of drivers that use blk_rq_map_sg() arch/um/drivers/ubd_kern.c: fix a building error SG: Change sg_set_page() to take length and offset argument AVR32: Fix sg_page breakage mmc: sg fallout m68k: sg fallout More SG build fixes sg: add missing sg_init_table calls to zfcp SG build fix	2007-10-25 15:44:54 -07:00
Linus Torvalds	2c75055703	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-lguest * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-lguest: lguest: documentation update lguest: Add to maintainers file. lguest: build fix lguest: clean up lguest_launcher.h lguest: remove unused "wake" element from struct lguest lguest: use defines from x86 headers instead of magic numbers lguest: example launcher header cleanup.	2007-10-25 15:38:19 -07:00
Linus Torvalds	2304c3ac36	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: [netdrvr] forcedeth: add MCP77 device IDs rndis_host: reduce MTU instead of refusing to talk to devices with low max packet size cpmac: update to new fixed phy driver interface cpmac: convert to napi_struct interface cpmac: use print_mac() instead of MAC_FMT natsemi: fix oops, link back netdevice from private-struct ehea: fix port_napi_disable/enable bonding/bond_main.c: fix cut'n'paste error make bonding/bond_main.c:bond_deinit() static drivers/net/ipg.c: cleanups remove Documentation/networking/net-modules.txt	2007-10-25 15:19:59 -07:00
Linus Torvalds	fcd05809e1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched * git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched: sched: mark CONFIG_FAIR_GROUP_SCHED as !EXPERIMENTAL sched: isolate SMP balancing code a bit more sched: reduce balance-tasks overhead sched: make cpu_shares_{show,store}() static sched: clean up some control group code sched: constify sched.h sched: document profile=sleep requiring CONFIG_SCHEDSTATS sched: use show_regs() to improve __schedule_bug() output sched: clean up sched_domain_debug() sched: fix fastcall mismatch in completion APIs sched: fix sched_domain sysctl registration again	2007-10-25 15:19:03 -07:00
Jeff Garzik	de48844398	Permit silencing of __deprecated warnings. The __deprecated marker is quite useful in highlighting the remnants of old APIs that want removing. However, it is quite normal for one or more years to pass, before the (usually ancient, bitrotten) code in question is either updated or deleted. Thus, like __must_check, add a Kconfig option that permits the silencing of this compiler warning. This change mimics the ifdef-ery and Kconfig defaults of MUST_CHECK as closely as possible. Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-25 15:10:17 -07:00
Hugh Dickins	85cdffcde0	fix sg_phys to use dma_addr_t x86_32 CONFIG_HIGHMEM64G with 5GB RAM hung when booting, after issuing some "request_module: runaway loop modprobe binfmt-0000" messages in trying to exec /sbin/init. The binprm buf doesn't see the right ".ELF" header because sg_phys() is providing the wrong physical addresses for high pages: a 32-bit unsigned long is too small in this case, we need to use dma_addr_t. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-25 09:55:05 +02:00
Ayaz Abdulla	96fd4cd3e4	[netdrvr] forcedeth: add MCP77 device IDs Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2007-10-25 03:36:42 -04:00
Rusty Russell	e1e72965ec	lguest: documentation update Went through the documentation doing typo and content fixes. This patch contains only comment and whitespace changes. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2007-10-25 15:02:50 +10:00
Rusty Russell	7334492b53	lguest: clean up lguest_launcher.h Remove now-unused defines. Fix old idempotent #ifndef _ASM_LGUEST_USER name. Fix comment on use of lguest_req. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2007-10-25 14:12:20 +10:00
Peter Williams	681f3e6854	sched: isolate SMP balancing code a bit more At the moment, a lot of load balancing code that is irrelevant to non SMP systems gets included during non SMP builds. This patch addresses this issue and reduces the binary size on non SMP systems: text data bss dec hex filename 10983 28 1192 12203 2fab sched.o.before 10739 28 1192 11959 2eb7 sched.o.after Signed-off-by: Peter Williams <pwil3058@bigpond.net.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-24 18:23:51 +02:00
Peter Williams	e1d1484f72	sched: reduce balance-tasks overhead At the moment, balance_tasks() provides low level functionality for both move_tasks() and move_one_task() (indirectly) via the load_balance() function (in the sched_class interface) which also provides dual functionality. This dual functionality complicates the interfaces and internal mechanisms and makes the run time overhead of operations that are called with two run queue locks held. This patch addresses this issue and reduces the overhead of these operations. Signed-off-by: Peter Williams <pwil3058@bigpond.net.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-24 18:23:51 +02:00
Joe Perches	a8972ccf00	sched: constify sched.h Add const to some struct task_struct * uses Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-24 18:23:50 +02:00
Ingo Molnar	b15136e949	sched: fix fastcall mismatch in completion APIs Jeff Dike noticed that wait_for_completion_interruptible()'s prototype had a mismatched fastcall. Fix this by removing the fastcall attributes from all the completion APIs. Found-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-24 18:23:48 +02:00
Gerrit Renker	d8ef2c29a0	[DCCP]: Convert Reset code into socket error number This adds support for converting the 11 currently defined Reset codes into system error numbers, which are stored in sk_err for further interpretation. This makes the externally visible API behaviour similar to TCP, since a client connecting to a non-existing port will experience ECONNREFUSED. * Code 0, Unspecified, is interpreted as non-error (0); * Code 1, Closed (normal termination), also maps into 0; * Code 2, Aborted, maps into "Connection reset by peer" (ECONNRESET); * Code 3, No Connection and Code 7, Connection Refused, map into "Connection refused" (ECONNREFUSED); * Code 4, Packet Error, maps into "No message of desired type" (ENOMSG); * Code 5, Option Error, maps into "Illegal byte sequence" (EILSEQ); * Code 6, Mandatory Error, maps into "Operation not supported on transport endpoint" (EOPNOTSUPP); * Code 8, Bad Service Code, maps into "Invalid request code" (EBADRQC); * Code 9, Too Busy, maps into "Too many users" (EUSERS); * Code 10, Bad Init Cookie, maps into "Invalid request descriptor" (EBADR); * Code 11, Aggression Penalty, maps into "Quota exceeded" (EDQUOT) which makes sense in terms of using more than the `fair share' of bandwidth. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2007-10-24 10:27:48 -02:00
Gerrit Renker	fde20105f3	[DCCP]: Retrieve packet sequence number for error reporting This fixes a problem when analysing erroneous packets in dccp_v{4,6}_err: * dccp_hdr_seq currently takes an skb * however, the transport headers in the skb are shifted, due to the preceding IPv4/v6 header. Fixed for v4 and v6 by changing dccp_hdr_seq to take a struct dccp_hdr as argument. Verified that the correct sequence number is now reported in the error handler. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2007-10-24 10:12:09 -02:00
Jens Axboe	642f149031	SG: Change sg_set_page() to take length and offset argument Most drivers need to set length and offset as well, so may as well fold those three lines into one. Add sg_assign_page() for those two locations that only needed to set the page, where the offset/length is set outside of the function context. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-24 11:20:47 +02:00
Chuck Lever	c2636b4d9e	[NET]: Treat the sign of the result of skb_headroom() consistently In some places, the result of skb_headroom() is compared to an unsigned integer, and in others, the result is compared to a signed integer. Make the comparisons consistent and correct. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-23 21:27:55 -07:00
Jeff Garzik	bada339ba2	[NET]: Validate device addr prior to interface-up Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-23 21:27:50 -07:00
Greg Ungerer	f0c15f48bb	add port definition for mcf UART driver Add a port type definition for the Freescale UART driver ports (mcf.c). Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-23 20:45:44 -07:00
Linus Torvalds	25c263542d	Merge branch 'irq-upstream' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6 * 'irq-upstream' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6: [SPARC, XEN, NET/CXGB3] use irq_handler_t where appropriate drivers/char/riscom8: clean up irq handling isdn/sc: irq handler clean isdn/act2000: fix major bug. clean irq handler. char/pcmcia/synclink_cs: trim trailing whitespace drivers/char/ip2: separate polling and irq-driven work entry points drivers/char/ip2: split out irq core logic into separate function [NETDRVR] lib82596, netxen: delete pointless tests from irq handler Eliminate pointless casts from void* in a few driver irq handlers. [PARPORT] Remove unused 'irq' argument from parport irq functions [PARPORT] Kill useful 'irq' arg from parport_{generic_irq,ieee1284_interrupt} [PARPORT] Consolidate code copies into a single generic irq handler	2007-10-23 18:57:39 -07:00
Linus Torvalds	5a0e554b62	Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 * 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (39 commits) Remove Andrew Morton from list of net driver maintainers. bonding: Acquire correct locks in alb for promisc change bonding: Convert more locks to _bh, acquire rtnl, for new locking bonding: Convert locks to _bh, rework alb locking for new locking bonding: Convert miimon to new locking bonding: Convert balance-rr transmit to new locking Convert bonding timers to workqueues Update MAINTAINERS to reflect my (jgarzik's) current efforts. pasemi_mac: fix typo defxx.c: dfx_bus_init() is __devexit not __devinit s390 MAINTAINERS remove header_ops bug in qeth driver sky2: crash on remove MIPSnet: Delete all the useless debugging printks. AR7 ethernet: small post-merge cleanups and fixes mv643xx_eth: Hook up mv643xx_get_sset_count mv643xx_eth: Remove obsolete checksum offload comment mv643xx_eth: Merge drivers/net/mv643xx_eth.h into mv643xx_eth.c mv643xx_eth: Remove unused register defines mv643xx_eth: Clean up mv643xx_eth.h ...	2007-10-23 18:56:54 -07:00
Jeff Garzik	2dcb407e61	[libata] checkpatch-inspired cleanups Tackle the relatively sane complaints of checkpatch --file. The vast majority is indentation and whitespace changes, the rest are * #include fixes * printk KERN_xxx prefix addition * BSS/initializer cleanups Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2007-10-23 20:59:42 -04:00
Ursula Braun	f1ecfd5d3b	remove header_ops bug in qeth driver Remove qeth bug caused by commit: [NET]: Move hardware header operations out of netdevice. This is the second part of the qeth header_ops patch, since first patch sent 10/19 has been insufficient. Nevertheless first patch is still valid and should be kept. Signed-off-by: Ursula Braun <braunu@de.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-10-23 20:18:13 -04:00
Jeff Garzik	02bae21297	Merge branch 'features' of git://farnsworth.org/dale/linux-2.6-mv643xx_eth into upstream	2007-10-23 20:15:54 -04:00
Jeff Garzik	5712cb3d81	[PARPORT] Remove unused 'irq' argument from parport irq functions None of the drivers with a struct pardevice's ->irq_func() hook ever used the 'irq' argument passed to it, so remove it. Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2007-10-23 19:53:16 -04:00
Jeff Garzik	f230d1010a	[PARPORT] Kill useful 'irq' arg from parport_{generic_irq,ieee1284_interrupt} parport_ieee1284_interrupt() was not using its first arg at all. Delete. parport_generic_irq()'s second arg makes its first arg completely redundant. Delete, and use port->irq in the one place where we actually need it. Also, s/__inline__/inline/ to make the code look nicer. Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2007-10-23 19:53:15 -04:00
Jeff Garzik	3f2e40df0e	[PARPORT] Consolidate code copies into a single generic irq handler Several arches used the exact same code for their parport irq handling. Make that code generic, in parport_irq_handler(). Also, s/__inline__/inline/ in include/linux/parport.h. Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2007-10-23 19:53:15 -04:00
Linus Torvalds	01e7ae8c13	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: 9p: v9fs_vfs_rename incorrect clunk order 9p: fix memleak in fs/9p/v9fs.c 9p: add virtio transport	2007-10-23 12:04:01 -07:00
Eric Van Hensbergen	b530cc7940	9p: add virtio transport This adds a transport to 9p for communicating between guests and a host using a virtio based transport. Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2007-10-23 13:47:31 -05:00
Jens Axboe	de26103de5	[SG] Add debug check for page alignment Suggested by Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-23 20:35:58 +02:00
Linus Torvalds	0b776eb542	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: mlx4_core: Increase command timeout for INIT_HCA to 10 seconds IPoIB/cm: Use common CQ for CM send completions IB/uverbs: Fix checking of userspace object ownership IB/mlx4: Sanity check userspace send queue sizes IPoIB: Rewrite "if (!likely(...))" as "if (unlikely(!(...)))" IB/ehca: Enable large page MRs by default IB/ehca: Change meaning of hca_cap_mr_pgsize IB/ehca: Fix ehca_encode_hwpage_size() and alloc_fmr() IB/ehca: Fix masking error in {,re}reg_phys_mr() IB/ehca: Supply QP token for SRQ base QPs IPoIB: Use round_jiffies() for ah_reap_task RDMA/cma: Fix deadlock destroying listen requests RDMA/cma: Add locking around QP accesses IB/mthca: Avoid alignment traps when writing doorbells mlx4_core: Kill mlx4_write64_raw()	2007-10-23 09:56:11 -07:00
Lennert Buytenhek	e2734d6c61	mv643xx_eth: Move ethernet register definitions into private header Move the mv643xx's ethernet-related register definitions from include/linux/mv643xx.h into drivers/net/mv643xx_eth.h, since they aren't of any use outside the ethernet driver. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Acked-by: Tzachi Perelstein <tzachi@marvell.com> Signed-off-by: Dale Farnsworth <dale@farnsworth.org>	2007-10-23 08:23:00 -07:00
Lennert Buytenhek	c4a6a2ab5e	mv643xx_eth: Split off mv643xx_eth platform device data The mv643xx ethernet silicon block is also found in a couple of other Marvell chips. As a first step towards splitting off the mv643xx_eth bits from the rest of the mv643xx bits, this patch splits the mv643xx ethernet platform device data struct in linux/mv643xx.h off into linux/mv643xx_eth.h, and includes the latter from the former. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Acked-by: Tzachi Perelstein <tzachi@marvell.com> Signed-off-by: Dale Farnsworth <dale@farnsworth.org>	2007-10-23 08:22:58 -07:00
Rusty Russell	19f1537b7b	Lguest support for Virtio This makes lguest able to use the virtio devices. We change the device descriptor page from a simple array to a variable length "type, config_len, status, config data..." format, and implement virtio_config_ops to read from that config data. We use the virtio ring implementation for an efficient Guest <-> Host virtqueue mechanism, and the new LHCALL_NOTIFY hypercall to kick the host when it changes. We also use LHCALL_NOTIFY on kernel addresses for very very early console output. We could have another hypercall, but this hack works quite well. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2007-10-23 15:49:56 +10:00
Rusty Russell	15045275c3	Remove old lguest I/O infrrasructure. This patch gets rid of the old lguest host I/O infrastructure and replaces it with a single hypercall "LHCALL_NOTIFY" which takes an address. The main change is the removal of io.c: that mainly did inter-guest I/O, which virtio doesn't yet support. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2007-10-23 15:49:55 +10:00
Rusty Russell	0ca49ca946	Remove old lguest bus and drivers. This gets rid of the lguest bus, drivers and DMA mechanism, to make way for a generic virtio mechanism. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2007-10-23 15:49:55 +10:00
Rusty Russell	0a8a69dd77	Virtio helper routines for a descriptor ringbuffer implementation These helper routines supply most of the virtqueue_ops for hypervisors which want to use a ring for virtio. Unlike the previous lguest implementation: 1) The rings are variable sized (2^n-1 elements). 2) They have an unfortunate limit of 65535 bytes per sg element. 3) The page numbers are always 64 bit (PAE anyone?) 4) They no longer place used[] on a separate page, just a separate cacheline. 5) We do a modulo on a variable. We could be tricky if we cared. 6) Interrupts and notifies are suppressed using flags within the rings. Users need only get the ring pages and provide a notify hook (KVM wants the guest to allocate the rings, lguest does it sanely). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Dor Laor <dor.laor@qumranet.com>	2007-10-23 15:49:55 +10:00
Rusty Russell	31610434bc	Virtio console driver This is an hvc-based virtio console driver. It's suboptimal becuase hvc expects to have raw access to interrupts and virtio doesn't assume that, so it currently polls. There are two solutions: expose hvc's "kick" interface, or wean off hvc. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2007-10-23 15:49:55 +10:00
Rusty Russell	e467cde238	Block driver using virtio. The block driver uses scatter-gather lists with sg[0] being the request information (struct virtio_blk_outhdr) with the type, sector and inbuf id. The next N sg entries are the bio itself, then the last sg is the status byte. Whether the N entries are in or out depends on whether it's a read or a write. We accept the normal (SCSI) ioctls: they get handed through to the other side which can then handle it or reply that it's unsupported. It's not clear that this actually works in general, since I don't know if blk_pc_request() requests have an accurate rq_data_dir(). Although we try to reply -ENOTTY on unsupported commands, ioctl(fd, CDROMEJECT) returns success to userspace. This needs a separate patch. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Jens Axboe <jens.axboe@oracle.com>	2007-10-23 15:49:54 +10:00
Rusty Russell	296f96fcfc	Net driver using virtio The network driver uses two virtqueues: one for input packets and one for output packets. This has nice locking properties (ie. we don't do any for recv vs send). TODO: 1) Big packets. 2) Multi-client devices (maybe separate driver?). 3) Resolve freeing of old xmit skbs (Christian Borntraeger) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: netdev@vger.kernel.org	2007-10-23 15:49:54 +10:00
Rusty Russell	ec3d41c4db	Virtio interface This attempts to implement a "virtual I/O" layer which should allow common drivers to be efficiently used across most virtual I/O mechanisms. It will no-doubt need further enhancement. The virtio drivers add buffers to virtio queues; as the buffers are consumed the driver "interrupt" callbacks are invoked. There is also a generic implementation of config space which drivers can query to get setup information from the host. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Dor Laor <dor.laor@qumranet.com> Cc: Arnd Bergmann <arnd@arndb.de>	2007-10-23 15:49:54 +10:00
Rusty Russell	47436aa4ad	Boot with virtual == physical to get closer to native Linux. 1) This allows us to get alot closer to booting bzImages. 2) It means we don't have to know page_offset. 3) The Guest needs to modify the boot pagetables to create the PAGE_OFFSET mapping before jumping to C code. 4) guest_pa() walks the page tables rather than using page_offset. 5) We don't use page_offset to figure out whether to emulate: it was always kinda quesationable, and won't work for instructions done before remapping (bzImage unpacking in particular). 6) We still want the kernel address for tlb flushing: have the initial hypercall give us that, too. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2007-10-23 15:49:54 +10:00
Rusty Russell	c18acd73ff	Allow guest to specify syscall vector to use. (Based on Ron Minnich's LGUEST_PLAN9_SYSCALL patch). This patch allows Guests to specify what system call vector they want, and we try to reserve it. We only allow one non-Linux system call vector, to try to avoid DoS on the Host. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2007-10-23 15:49:53 +10:00
Jes Sorensen	47aee45ae3	lguest.h declares a struct timespec, make it include linux/time.h Signed-off-by: Jes Sorensen <jes@sgi.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2007-10-23 15:49:52 +10:00
Jes Sorensen	b410e7b149	Make hypercalls arch-independent. Clean up the hypercall code to make the code in hypercalls.c architecture independent. First process the common hypercalls and then call lguest_arch_do_hcall() if the call hasn't been handled. Rename struct hcall_ring to hcall_args. This patch requires the previous patch which reorganize the layout of struct lguest_regs on i386 so they match the layout of struct hcall_args. Signed-off-by: Jes Sorensen <jes@sgi.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2007-10-23 15:49:52 +10:00
Rusty Russell	48245cc070	Remove fixed limit on number of guests, and lguests array. Back when we had all the Guest state in the switcher, we had a fixed array of them. This is no longer necessary. If we switch the network code to using random_ether_addr (46 bits is enough to avoid clashes), we can get rid of the concept of "guest id" altogether. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2007-10-23 15:49:51 +10:00
Jes Sorensen	c37ae93d59	Move lguest hcalls to arch-specific header Move architecture specific portion of lg_hcall code to asm-i386/lg_hcall.h and have it included from linux/lguest.h. [Changed to asm-i386/lguest_hcall.h so documentation finds it -RR] Signed-off-by: Jes Sorensen <jes@sgi.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Jes Sorensen <jes@sgi.com>	2007-10-23 15:49:49 +10:00
Rusty Russell	b45d8cb054	Make lguest_launcher.h types userspace-friendly lguest_launcher.h uses "u32" not "__u32", which sets a bad example. Fix that, and include <linux/types.h>. This means we need to use -I on the Launcher build line so types.h is found. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2007-10-23 15:49:49 +10:00
Rusty Russell	ee8e7cfe9d	Make asm-x86/bootparam.h includable from userspace. To actually write a bootloader (or, say, the lguest launcher) currently requires duplication of these structures. Making them includable from userspace is much nicer. We merge the common userspace-required definitions of e820_32/64.h into e820.h for export. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2007-10-23 15:49:47 +10:00
David Miller	e95d9c6b04	Expand hwif->host_flags so that it fits new flags. Commit 238e4f142c33bb34440cc64029dde7b9fbc4e65f ("ide: add IDE_HFLAG_NO_LBA48 and IDE_HFLAG_NO_LBA48_DMA host flags") caused a regression because the host_flags in struct hwif_s wasn't expanded to cope with the fact that the host flags no longer fit in 16 bits. Signed-off-by: David S. Miller <davem@davemloft.net> [ I hate having to add good commit descriptions. - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 19:35:14 -07:00
Linus Torvalds	81f8320f62	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: appletouch - apply idle reset logic to all touchpads Input: usbtouchscreen - add support for GoTop tablet devices Input: bf54x-keys - return real error when request_irq() fails Input: i8042 - export i8042_command()	2007-10-22 19:29:58 -07:00
Linus Torvalds	f09cc910fe	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (30 commits) [IPSEC] IPV6: Fix to add tunnel mode SA correctly. [NET]: Cut off the queue_mapping field from sk_buff [NET]: Hide the queue_mapping field inside netif_subqueue_stopped [NET]: Make and use skb_get_queue_mapping [NET]: Use the skb_set_queue_mapping where appropriate [INET]: Use MODULE_ALIAS_NET_PF_PROTO_TYPE where possible. [INET]: Let inet_diag and friends autoload [NIU]: Cleanup PAGE_SIZE checks a bit [NET]: Fix SKB_WITH_OVERHEAD calculation [ATM]: Fix clip module reload crash. [TG3]: Update version to 3.85 [TG3]: PCI command adjustment [TG3]: Add management FW version to ethtool report [TG3]: Add 5723 support [Bluetooth] Convert RFCOMM to use kthread API [Bluetooth] Add constant for Bluetooth socket options level [Bluetooth] Add support for handling simple eSCO links [Bluetooth] Add address and channel attribute to RFCOMM TTY device [Bluetooth] Fix wrong argument in debug code of HIDP [Bluetooth] Add generic driver for Bluetooth USB devices ...	2007-10-22 19:22:33 -07:00
Linus Torvalds	ad792f4f46	Merge branch 'master' of ssh://master.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb * 'master' of ssh://master.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (37 commits) V4L/DVB (6382): saa7134: fix NULL dereference at suspend time for cards without IR receiver V4L/DVB (6380): ivtvfb: Removal of the 'osd_compat' module option V4L/DVB (6379): patch which improves GotView Saa7135 remote control V4L/DVB (6378b): Updates info about the removal of V4L1 at feature-removal-schedule.txt V4L/DVB (6378a): Removal of VIDIOC_[G\|S]_MPEGCOMP from feature-removal-schedule.txt V4L/DVB (6378): DiB0700-device: Using 1.10 firmware V4L/DVB (6357): pvrusb2: Improve encoder chip health tracking V4L/DVB (6356): "while (!ca->wakeup)" breaks the CAM initialisation V4L/DVB (6352): ir-kbd-i2c: Missing break statement V4L/DVB (6350): V4L: possible leak in em28xx_init_isoc V4L/DVB (6348): ivtv: undo video mute when closing the radio V4L/DVB (6347): ivtv: fix video mute when radio is used V4L/DVB (6346): ivtvfb: YUV output size fix when ivtvfb is not loaded V4L/DVB (6345): ivtvfb: YUV handling of an image which is not visible in the display area V4L/DVB (6343): ivtvfb: check return value of unregister_framebuffer V4L/DVB (6342): ivtv: fix circular locking (bug 9037) V4L/DVB (6341): ivtv: fix resizing MPEG1 streams V4L/DVB (6340): ivtvfb: screen mode change sometimes goes wrong V4L/DVB (6339): ivtv: set the video color to black instead of green when capturing from the radio V4L/DVB (6338): ivtv: fix incorrect EBUSY return ...	2007-10-22 19:20:22 -07:00
Linus Torvalds	69450bb5eb	Merge branch 'sg' of git://git.kernel.dk/linux-2.6-block * 'sg' of git://git.kernel.dk/linux-2.6-block: Add CONFIG_DEBUG_SG sg validation Change table chaining layout Update arch/ to use sg helpers Update swiotlb to use sg helpers Update net/ to use sg helpers Update fs/ to use sg helpers [SG] Update drivers to use sg helpers [SG] Update crypto/ to sg helpers [SG] Update block layer to use sg helpers [SG] Add helpers for manipulating SG entries	2007-10-22 19:11:06 -07:00
Jens Axboe	d6ec084200	Add CONFIG_DEBUG_SG sg validation Add a Kconfig entry which will toggle some sanity checks on the sg entry and tables. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-22 21:20:03 +02:00
Jens Axboe	18dabf473e	Change table chaining layout Change the page member of the scatterlist structure to be an unsigned long, and encode more stuff in the lower bits: - Bits 0 and 1 zero: this is a normal sg entry. Next sg entry is located at sg + 1. - Bit 0 set: this is a chain entry, the next real entry is at ->page_link with the two low bits masked off. - Bit 1 set: this is the final entry in the sg entry. sg_next() will return NULL when passed such an entry. It's thus important that sg table users use the proper accessors to get and set the page member. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-22 21:20:01 +02:00
Christoph Hellwig	e38f981758	exportfs: update documentation Update documentation to the current state of affairs. Remove duplicated method descruptions in exportfs.h and point to Documentation/filesystems/ Exporting instead. Add a little file header comment in expfs.c describing what's going on and mentioning Neils and my copyright [1]. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: <linux-ext4@vger.kernel.org> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: David Chinner <dgc@sgi.com> Cc: Timothy Shimmin <tes@sgi.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Hugh Dickins <hugh@veritas.com> Cc: Chris Mason <mason@suse.com> Cc: Jeff Mahoney <jeffm@suse.com> Cc: "Vladimir V. Saveliev" <vs@namesys.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:21 -07:00
Christoph Hellwig	3965516440	exportfs: make struct export_operations const Now that nfsd has stopped writing to the find_exported_dentry member we an mark the export_operations const Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: <linux-ext4@vger.kernel.org> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: David Chinner <dgc@sgi.com> Cc: Timothy Shimmin <tes@sgi.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Hugh Dickins <hugh@veritas.com> Cc: Chris Mason <mason@suse.com> Cc: Jeff Mahoney <jeffm@suse.com> Cc: "Vladimir V. Saveliev" <vs@namesys.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:21 -07:00
Christoph Hellwig	cfaea787c0	exportfs: remove old methods Now that all filesystems are converted remove support for the old methods. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: <linux-ext4@vger.kernel.org> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: David Chinner <dgc@sgi.com> Cc: Timothy Shimmin <tes@sgi.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Hugh Dickins <hugh@veritas.com> Cc: Chris Mason <mason@suse.com> Cc: Jeff Mahoney <jeffm@suse.com> Cc: "Vladimir V. Saveliev" <vs@namesys.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:21 -07:00
Christoph Hellwig	be55caf177	reiserfs: new export ops Another nice little cleanup by using the new methods. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Chris Mason <mason@suse.com> Cc: Jeff Mahoney <jeffm@suse.com> Cc: "Vladimir V. Saveliev" <vs@namesys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:20 -07:00
Christoph Hellwig	05da080482	efs: new export ops Trivial switch over to the new generic helpers. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:20 -07:00
Christoph Hellwig	2596110a39	exportfs: add new methods Add the guts for the new filesystem API to exportfs. There's now a fh_to_dentry method that returns a dentry for the object looked for given a filehandle fragment, and a fh_to_parent operation that returns the dentry for the encoded parent directory in case the file handle contains it. There are default implementations for these methods that only take a callback for an nfs-enhanced iget variant and implement the rest of the semantics. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: <linux-ext4@vger.kernel.org> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: David Chinner <dgc@sgi.com> Cc: Timothy Shimmin <tes@sgi.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Hugh Dickins <hugh@veritas.com> Cc: Chris Mason <mason@suse.com> Cc: Jeff Mahoney <jeffm@suse.com> Cc: "Vladimir V. Saveliev" <vs@namesys.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:19 -07:00
Christoph Hellwig	6e91ea2bb0	exportfs: add fid type This patchset is a medium scale rewrite of the export operations interface. The goal is to make the interface less complex, and easier to understand from the filesystem side, aswell as preparing generic support for exporting of 64bit inode numbers. This touches all nfs exporting filesystems, and I've done testing on all of the filesystems I have here locally (xfs, ext2, ext3, reiserfs, jfs) This patch: Add a structured fid type so that we don't have to pass an array of u32 values around everywhere. It's a union of possible layouts. As a start there's only the u32 array and the traditional 32bit inode format, but there will be more in one of my next patchset when I start to document the various filehandle formats we have in lowlevel filesystems better. Also add an enum that gives the various filehandle types human- readable names. Note: Some people might think the struct containing an anonymous union is ugly, but I didn't want to pass around a raw union type. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: <linux-ext4@vger.kernel.org> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: David Chinner <dgc@sgi.com> Cc: Timothy Shimmin <tes@sgi.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Hugh Dickins <hugh@veritas.com> Cc: Chris Mason <mason@suse.com> Cc: Jeff Mahoney <jeffm@suse.com> Cc: "Vladimir V. Saveliev" <vs@namesys.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:19 -07:00
Bernhard Walle	00bf4098be	kexec: add BSS to resource tree Add the BSS to the resource tree just as kernel text and kernel data are in the resource tree. The main reason behind this is to avoid crashkernel reservation in that area. While it's not strictly necessary to have the BSS in the resource tree (the actual collision detection is done in the reserve_bootmem() function before), the usage of the BSS resource should be presented to the user in /proc/iomem just as Kernel data and Kernel code. Note: The patch currently is only implemented for x86 and ia64 (because efi_initialize_iomem_resources() has the same signature on i386 and ia64). [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Bernhard Walle <bwalle@suse.de> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Vivek Goyal <vgoyal@in.ibm.com> Cc: <linux-arch@vger.kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:19 -07:00
Keshavamurthy, Anil S	3460a6d9ce	Intel IOMMU: DMAR fault handling support MSI interrupt handler registrations and fault handling support for Intel-IOMMU hadrware. This patch enables the MSI interrupts for the DMA remapping units and in the interrupt handler read the fault cause and outputs the same on to the console. Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Andi Kleen <ak@suse.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Muli Ben-Yehuda <muli@il.ibm.com> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Ashok Raj <ashok.raj@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Christoph Lameter <clameter@sgi.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:19 -07:00
Keshavamurthy, Anil S	ba39592764	Intel IOMMU: Intel IOMMU driver Actual intel IOMMU driver. Hardware spec can be found at: http://www.intel.com/technology/virtualization This driver sets X86_64 'dma_ops', so hook into standard DMA APIs. In this way, PCI driver will get virtual DMA address. This change is transparent to PCI drivers. [akpm@linux-foundation.org: remove unneeded cast] [akpm@linux-foundation.org: build fix] [bunk@stusta.de: fix duplicate CONFIG_DMAR Makefile line] Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Andi Kleen <ak@suse.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Muli Ben-Yehuda <muli@il.ibm.com> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Ashok Raj <ashok.raj@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Christoph Lameter <clameter@sgi.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:18 -07:00
Keshavamurthy, Anil S	994a65e25d	Intel IOMMU: PCI generic helper function When devices are under a p2p bridge, upstream transactions get replaced by the device id of the bridge as it owns the PCIE transaction. Hence its necessary to setup translations on behalf of the bridge as well. Due to this limitation all devices under a p2p share the same domain in a DMAR. We just cache the type of device, if its a native PCIe device or not for later use. [akpm@linux-foundation.org: BUG_ON -> WARN_ON+recover] Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Andi Kleen <ak@suse.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Muli Ben-Yehuda <muli@il.ibm.com> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Ashok Raj <ashok.raj@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Christoph Lameter <clameter@sgi.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:18 -07:00
Keshavamurthy, Anil S	10e5247f40	Intel IOMMU: DMAR detection and parsing logic This patch supports the upcomming Intel IOMMU hardware a.k.a. Intel(R) Virtualization Technology for Directed I/O Architecture and the hardware spec for the same can be found here http://www.intel.com/technology/virtualization/index.htm FAQ! (questions from akpm, answers from ak) > So... what's all this code for? > > I assume that the intent here is to speed things up under Xen, etc? Yes in some cases, but not this code. That would be the Xen version of this code that could potentially assign whole devices to guests. I expect this to be only useful in some special cases though because most hardware is not virtualizable and you typically want an own instance for each guest. Ok at some point KVM might implement this too; i likely would use this code for this. > Do we > have any benchmark results to help us to decide whether a merge would be > justified? The main advantage for doing it in the normal kernel is not performance, but more safety. Broken devices won't be able to corrupt memory by doing random DMA. Unfortunately that doesn't work for graphics yet, for that need user space interfaces for the X server are needed. There are some potential performance benefits too: - When you have a device that cannot address the complete address range an IOMMU can remap its memory instead of bounce buffering. Remapping is likely cheaper than copying. - The IOMMU can merge sg lists into a single virtual block. This could potentially speed up SG IO when the device is slow walking SG lists. [I long ago benchmarked 5% on some block benchmark with an old MPT Fusion; but it probably depends a lot on the HBA] And you get better driver debugging because unexpected memory accesses from the devices will cause a trappable event. > > Does it slow anything down? It adds more overhead to each IO so yes. This patch: Add support for early detection and parsing of DMAR's (DMA Remapping) reported to OS via ACPI tables. DMA remapping(DMAR) devices support enables independent address translations for Direct Memory Access(DMA) from Devices. These DMA remapping devices are reported via ACPI tables and includes pci device scope covered by these DMA remapping device. For detailed info on the specification of "Intel(R) Virtualization Technology for Directed I/O Architecture" please see http://www.intel.com/technology/virtualization/index.htm Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Andi Kleen <ak@suse.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Muli Ben-Yehuda <muli@il.ibm.com> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Ashok Raj <ashok.raj@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Christoph Lameter <clameter@sgi.com> Cc: Greg KH <greg@kroah.com> Cc: Len Brown <lenb@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:18 -07:00
Jan Kara	89910cccb8	ext2: avoid rec_len overflow with 64KB block size With 64KB blocksize, a directory entry can have size 64KB which does not fit into 16 bits we have for entry length. So we store 0xffff instead and convert the value when read from / written to disk. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Jan Kara <jack@suse.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:18 -07:00
Serge E. Hallyn	b68680e473	capabilities: clean up file capability reading Simplify the vfs_cap_data structure. Also fix get_file_caps which was declaring __le32 v1caps[XATTR_CAPS_SZ] on the stack, but XATTR_CAPS_SZ is already * sizeof(__le32). [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Cc: Andrew Morgan <morgan@kernel.org> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:18 -07:00
Yasunori Goto	b9049e2344	memory hotplug: make kmem_cache_node for SLUB on memory online avoid panic Fix a panic due to access NULL pointer of kmem_cache_node at discard_slab() after memory online. When memory online is called, kmem_cache_nodes are created for all SLUBs for new node whose memory are available. slab_mem_going_online_callback() is called to make kmem_cache_node() in callback of memory online event. If it (or other callbacks) fails, then slab_mem_offline_callback() is called for rollback. In memory offline, slab_mem_going_offline_callback() is called to shrink all slub cache, then slab_mem_offline_callback() is called later. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: locking fix] [akpm@linux-foundation.org: build fix] Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:17 -07:00
Yasunori Goto	7b78d335ac	memory hotplug: rearrange memory hotplug notifier Current memory notifier has some defects yet. (Fortunately, nothing uses it.) This patch is to fix and rearrange for them. - Add information of start_pfn, nr_pages, and node id if node status is changes from/to memoryless node for callback functions. Callbacks can't do anything without those information. - Add notification going-online status. It is necessary for creating per node structure before the node's pages are available. - Move GOING_OFFLINE status notification after page isolation. It is good place for return memory like cache for callback, because returned page is not used again. - Make CANCEL events for rollingback when error occurs. - Delete MEM_MAPPING_INVALID notification. It will be not used. - Fix compile error of (un)register_memory_notifier(). Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:17 -07:00
Rusty Russell	214541d1f3	add WEAK() for creating weak asm labels Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:17 -07:00
Jens Axboe	82f66fbef5	[SG] Add helpers for manipulating SG entries We can then transition drivers without changing the generated code. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-22 17:07:37 +02:00
Hans Verkuil	3bcc95760c	V4L/DVB (6321): Remove obsolete VIDIOC_S/G_MPEGCOMP ioctls Remove the obsolete VIDIOC_G_MPEGCOMP and VIDIOC_S_MPEGCOMP ioctls from the V4L2 API as per the removal schedule (October 2007). Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2007-10-22 12:01:30 -02:00
Mauro Carvalho Chehab	22c4a4e98e	V4L/DVB (6320): v4l core: remove the unused .hardware V4L1 field struct video_device used to define a .hardware field. While initialized on severl drivers, this field is never used inside V4L. However, drivers using it need to include the old V4L1 header. This seems to cause compilation troubles with some random configs. Better just to remove it from all drivers. Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2007-10-22 12:01:24 -02:00
Pavel Emelyanov	e3fa259bcb	[NET]: Cut off the queue_mapping field from sk_buff Just hide it behind the #ifdef, because nobody wants it now. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-22 02:59:57 -07:00
Pavel Emelyanov	668f895a85	[NET]: Hide the queue_mapping field inside netif_subqueue_stopped Many places get the queue_mapping field from skb to pass it to the netif_subqueue_stopped() which will be 0 in any case. Make the helper that works with sk_buff Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-22 02:59:56 -07:00
Pavel Emelyanov	4e3ab47a54	[NET]: Make and use skb_get_queue_mapping Make the helper for getting the field, symmetrical to the "set" one. Return 0 if CONFIG_NETDEVICES_MULTIQUEUE=n Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-22 02:59:56 -07:00
Jean Delvare	305e1e9691	[INET]: Let inet_diag and friends autoload By adding module aliases to inet_diag, tcp_diag and dccp_diag, we let them load automatically as needed. This makes tools like "ss" run faster. Signed-off-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-22 02:59:54 -07:00
Herbert Xu	deea84b0ae	[NET]: Fix SKB_WITH_OVERHEAD calculation The calculation in SKB_WITH_OVERHEAD is incorrect in that it can cause an overflow across a page boundary which is what it's meant to prevent. In particular, the header length (X) should not be lumped together with skb_shared_info. The latter needs to be aligned properly while the header has no choice but to sit in front of wherever the payload is. Therefore the correct calculation is to take away the aligned size of skb_shared_info, and then subtract the header length. The resulting quantity L satisfies the following inequality: SKB_DATA_ALIGN(L + X) + sizeof(struct skb_shared_info) <= PAGE_SIZE This is the quantity used by alloc_skb to do the actual allocation. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-22 02:59:53 -07:00
Matt Carlson	6c7af27c8a	[TG3]: Add 5723 support This patch adds support for upcoming 5723 devices. Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-22 02:59:49 -07:00
Marcel Holtmann	2cb3377a29	[Bluetooth] Add constant for Bluetooth socket options level Assign the next free socket options level to be used by the Bluetooth protocol and address family. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2007-10-22 02:59:48 -07:00
Márton Németh	553a05b882	Input: i8042 - export i8042_command() Export the i8042_command() function which manages the mutual exclusion with the help of the i8042_lock spinlock. This allows to access i8042 safely from other parts of the kernel. Signed-off-by: Márton Németh <nm127@freemail.hu> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2007-10-22 00:56:52 -04:00
Al Viro	74c3cbe33b	[PATCH] audit: watching subtrees New kind of audit rule predicates: "object is visible in given subtree". The part that can be sanely implemented, that is. Limitations: * if you have hardlink from outside of tree, you'd better watch it too (or just watch the object itself, obviously) * if you mount something under a watched tree, tell audit that new chunk should be added to watched subtrees * if you umount something in a watched tree and it's still mounted elsewhere, you will get matches on events happening there. New command tells audit to recalculate the trees, trimming such sources of false positives. Note that it's _not_ about path - if something mounted in several places (multiple mount, bindings, different namespaces, etc.), the match does _not_ depend on which one we are using for access. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2007-10-21 02:37:45 -04:00
Al Viro	455434d450	[PATCH] new helper - inotify_evict_watch() Kicks the watch out without dropping it. Called under ->inotify_mutex Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2007-10-21 02:37:38 -04:00
Al Viro	b9efe8a234	[PATCH] new helper - inotify_clone_watch() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2007-10-21 02:37:32 -04:00
Al Viro	8aec080945	[PATCH] new helpers - collect_mounts() and release_collected_mounts() Get a snapshot of a subtree, creating private clones of vfsmounts for all its components and release such snapshot resp. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2007-10-21 02:37:25 -04:00
Al Viro	5a190ae697	[PATCH] pass dentry to audit_inode()/audit_inode_child() makes caller simpler and allows to scan ancestors Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2007-10-21 02:37:18 -04:00
Linus Torvalds	c00046c279	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial * git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (74 commits) fix do_sys_open() prototype sysfs: trivial: fix sysfs_create_file kerneldoc spelling mistake Documentation: Fix typo in SubmitChecklist. Typo: depricated -> deprecated Add missing profile=kvm option to Documentation/kernel-parameters.txt fix typo about TBI in e1000 comment proc.txt: Add /proc/stat field small documentation fixes Fix compiler warning in smount example program from sharedsubtree.txt docs/sysfs: add missing word to sysfs attribute explanation documentation/ext3: grammar fixes Documentation/java.txt: typo and grammar fixes Documentation/filesystems/vfs.txt: typo fix include/asm-*/system.h: remove unused set_rmb(), set_wmb() macros trivial copy_data_pages() tidy up Fix typo in arch/x86/kernel/tsc_32.c file link fix for Pegasus USB net driver help remove unused return within void return function Typo fixes retrun -> return x86 hpet.h: remove broken links ...	2007-10-19 20:36:17 -07:00
Linus Torvalds	5f737085be	Merge master.kernel.org:/pub/scm/linux/kernel/git/bart/ide-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/bart/ide-2.6: (50 commits) ide: remove inclusion of non-existent io_trace.h ide-disk: add get_smart_data() helper ide: fix ->data_phase in taskfile_load_raw() ide: check drive->using_dma in flagged_taskfile() ide: check ->dma_setup() return value in flagged_taskfile() dtc2278: note on docs qd65xx: remove pointless qd_{read,write}_reg() (take 2) ide: PCI BMDMA initialization fixes (take 2) ide: remove stale comments from ide-taskfile.c ide: remove dead code from ide_driveid_update() ide: use __ide_end_request() in ide_end_dequeued_request() ide: enhance ide_setup_pci_noise() cs5530: remove needless ide_lock taking ide: take ide_lock for prefetch disable/enable in do_special() ht6560b: fix deadlock on error handling cmd640: fix deadlock on error handling slc90e66: fix deadlock on error handling opti621: fix deadlock on error handling qd65xx: fix deadlock on error handling dtc2278: fix deadlock on error handling ...	2007-10-19 19:36:05 -07:00
Jason Uhlenkott	8e8a1407ac	fix do_sys_open() prototype Fix an argument name in do_sys_open()'s prototype. Signed-off-by: Jason Uhlenkott <juhlenko@akamai.com> Signed-off-by: Adrian Bunk <bunk@kernel.org>	2007-10-20 03:16:18 +02:00
Rolf Eike Beer	405bbe9fa3	Typo: depricated -> deprecated Typo: depricated -> deprecated Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de> Signed-off-by: Adrian Bunk <bunk@kernel.org>	2007-10-20 03:10:57 +02:00
Mike Anderson	7a8c3d3b92	dm: uevent generate events This patch adds support for the dm_path_event dm_send_event functions which create and send udev events. Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2007-10-20 02:01:26 +01:00
Mike Anderson	96a1f7dba6	dm: export name and uuid This patch adds a function to obtain a copy of a mapped device's name and uuid. Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2007-10-20 02:01:23 +01:00
Milan Broz	027d50f92e	dm io:ctl use constant struct size Make size of dm_ioctl struct always 312 bytes on all supported architectures. This change retains compatibility with already-compiled code because it uses an embedded offset to locate the payload that follows the structure. On 64-bit architectures there is no change at all; on 32-bit we are increasing the size of dm-ioctl from 308 to 312 bytes. Currently with 32-bit userspace / 64-bit kernel on x86_64 some ioctls (including rename, message) are incorrectly rejected by the comparison against 'param + 1'. This breaks userspace lvrename and multipath 'fail_if_no_path' changes, for example. (BTW Device-mapper uses its own versioning and ignores the ioctl size bits. Only the generic ioctl compat code on mixed arches checks them, and that will continue to accept both sizes for now, but we intend to list 308 as deprecated and eventually remove it.) Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Cc: Guido Guenther <agx@sigxcpu.org> Cc: Kevin Corry <kevcorry@us.ibm.com> Cc: stable@kernel.org	2007-10-20 02:00:58 +01:00
Serge Hallyn	6da34bae29	fix up security_socket_getpeersec_* documentation Update the security_socket_peersec documentation in include/linux/security.h. security_socket_peersec has been split into two functions - _stream and _dgram, with new capabilities. Signed-off-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Adrian Bunk <bunk@kernel.org>	2007-10-20 00:53:30 +02:00
Bartlomiej Zolnierkiewicz	8562043606	ide: constify struct ide_port_info Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-10-20 00:32:34 +02:00
Bartlomiej Zolnierkiewicz	039788e153	ide: replace ide_pci_device_t by struct ide_port_info * Rename struct ide_pci_device_s to struct ide_port_info. * Remove ide_pci_device_t typedef. While at it: * Fix __ide_pci_register_driver() comment. * Fix aec62xx_init_one() comment. * Remove unused 'cds' field from ide_hwgroup_t. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-10-20 00:32:34 +02:00
Bartlomiej Zolnierkiewicz	9239b33393	ide: remove write-only hwif->hw Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-10-20 00:32:33 +02:00
Bartlomiej Zolnierkiewicz	18e181fe13	ide: add hwif->ack_intr hook * Add hwif->ack_intr hook and use it instead of hwif->hw.ack_intr. * Add missing brackets to cris-v32 and powerpc ide_ack_intr() macros. Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-10-20 00:32:32 +02:00
Bartlomiej Zolnierkiewicz	86f3a492bb	icside: use ec->dma directly * hwif->hwif_data contains pointer to struct expansion_card so use ec->dma directly instead of caching it in hwif->hw.dma. * Remove no longer needed hw_regs_t.dma and NO_DMA define. Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-10-20 00:32:32 +02:00
Bartlomiej Zolnierkiewicz	847ddd2bbe	ide: add CONFIG_IDE_ARCH_OBSOLETE_INIT Add CONFIG_IDE_ARCH_OBSOLETE_INIT to drivers/ide/Kconfig and use it instead of defining IDE_ARCH_OBSOLETE_INIT in <arch/ide.h>. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-10-20 00:32:32 +02:00
Bartlomiej Zolnierkiewicz	f9b9309737	ide: remove redundant comments from ide.h There is better documentation for these functions in drivers/ide/. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-10-20 00:32:31 +02:00
Bartlomiej Zolnierkiewicz	baa8f3e94b	ide: add ide_find_port() helper * Add ide_find_port() helper. * Convert icside, rapide and ide_platform host drivers to use it. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-10-20 00:32:31 +02:00
Bartlomiej Zolnierkiewicz	8447d9d52a	ide: add ide_device_add() * Add ide_device_add() helper and convert host drivers to use it instead of open-coded variants. * Make ide_pci_setup_ports() and do_ide_setup_pci_device() take 'u8 idx' argument instead of 'ata_index_t index'. * Remove no longer needed ata_index_t. * Unexport probe_hwif_init() and make it static. * Unexport ide_proc_register_port(). There should be no functionality changes caused by this patch (sgiioc4.c: ide_proc_register_port() requires hwif->present to be set and it won't be set if probe_hwif_init() fails). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-10-20 00:32:31 +02:00
Bartlomiej Zolnierkiewicz	fd9bb53942	ide: add ->fixup method to ide_hwif_t * Add ->fixup method to ide_hwif_t. * Set hwif->fixup in ide_pci_setup_ports() to d->fixup. * Use hwif->fixup in probe_hwif(). * Use probe_hwif_init() instead of probe_hwif_init_with_fixup() in ide_setup_pci_device(). * Add 'fixup' argument to ide_register_hw() and use it to set hwif->fixup, update all ide_register_hw() users accordingly. * Convert ide-cs/delkin_cb host drivers to use ide_register_hw(). * Restore hwif->fixup in ide_hwif_restore(). * Remove ide_register_hw_with_fixup(), probe_hwif_init_with_fixup() and 'fixup' argument from probe_hwif(). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-10-20 00:32:31 +02:00
Bartlomiej Zolnierkiewicz	caea7602f3	ide: add IDE_HFLAG_{IO_32BIT,UNMASK_IRQS} host flags Add IDE_HFLAG_{IO_32BIT,UNMASK_IRQS} host flag to tell ide_pci_setup_ports() to set drive->{io_32bit,unmask} for both drives on the interface. Convert amd74xx, sl82c105 and via82cxxx host drivers to use these new host flags. While at it: * Add IDE_HFLAGS_AMD define (amd74xx host driver). * Add IDE_HFLAGS_VIA define (via82cxxx host driver). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-10-20 00:32:30 +02:00
Bartlomiej Zolnierkiewicz	272a370900	ide: add IDE_HFLAG_RQSIZE_256 host flag Add IDE_HFLAG_RQSIZE_256 host flag to tell ide_pci_setup_ports() to set hwif->rqsize to 256 sectors. Convert pdc202xx_old host driver to use it. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-10-20 00:32:30 +02:00
Bartlomiej Zolnierkiewicz	8acf28c090	ide: add IDE_HFLAG_FORCE_LEGACY_IRQS host flag Add IDE_HFLAG_FORCE_LEGACY_IRQS host flag to tell ide_pci_setup_ports() to always set hwif->irq to legacy IRQ 14/15 and convert generic IDE PCI and via82cxxx host drivers to use it. While at it: * Add IDE_HFLAGS_UMC define (generic IDE PCI host driver). * Remove no longer needed init_hwif_generic() (generic IDE PCI host driver). * Set d->udma_mask instead of hwif->ultra_mask (via82cxxx host driver). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-10-20 00:32:30 +02:00
Bartlomiej Zolnierkiewicz	528a572dae	ide: add ->chipset field to ide_pci_device_t Add ->chipset field to ide_pci_device_t and use it in ide_hwif_configure() to set hwif->chipset. Convert cmd64x, cy82c693, rz1000 and trm290 host drivers to use this new ability. While at it define hwif_chipset_t as u8 to save some space in hw_regs_t, ide_hwif_t and ide_pci_device_t instances. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-10-20 00:32:30 +02:00
Bartlomiej Zolnierkiewicz	44a59ad59f	ide: remove unused ->next field from ide_pci_device_t Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-10-20 00:32:29 +02:00
Linus Torvalds	60812a4a99	Merge ssh://master.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-x86 * ssh://master.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-x86: (33 commits) x86: convert cpuinfo_x86 array to a per_cpu array x86: introduce frame_pointer() and stack_pointer() x86 & generic: change to __builtin_prefetch() i386: do not BUG_ON() when MSR is unknown x86: acpi use cpu_physical_id x86: convert cpu_llc_id to be a per cpu variable x86: convert cpu_to_apicid to be a per cpu variable i386: introduce "used_vectors" bitmap which can be used to reserve vectors. x86: use raw locks during oopses x86: honor _PAGE_PSE bit on page walks i386: do cpuid_device_create() in CPU_UP_PREPARE instead of CPU_ONLINE. x86: implement missing x86_64 function smp_call_function_mask() x86: use descriptor's functions instead of inline assembly i386: consolidate show_regs and show_registers for i386 i386: make callgraph use dump_trace() on i386/x86_64 x86: enable iommu_merge by default i386: i386 add AMD64 Barcelona PMU MSR definitions to msr.h x86: Unify i386 and x86-64 early quirks x86: enable HPET on ICH3 and ICH4 x86: force enable HPET on VT8235/8237 chipsets ... Manually fix trivial conflict with task pid container helper changes in arch/x86/kernel/process_32.c	2007-10-19 15:06:00 -07:00
Linus Torvalds	b04cde34cf	Merge git://git.linux-nfs.org/pub/linux/nfs-2.6 * git://git.linux-nfs.org/pub/linux/nfs-2.6: NFSv4: Fix an rpc_cred reference leakage in fs/nfs/delegation.c NFSv4: Ensure that we wait for the CLOSE request to complete NFS: Fix a race in sillyrename NFS: Fix a writeback race...	2007-10-19 14:31:42 -07:00
Jan Engelhardt	96de0e252c	Convert files to UTF-8 and some cleanups * Convert files to UTF-8. * Also correct some people's names (one example is Eißfeldt, which was found in a source file. Given that the author used an ß at all in a source file indicates that the real name has in fact a 'ß' and not an 'ss', which is commonly used as a substitute for 'ß' when limited to 7bit.) * Correct town names (Goettingen -> Göttingen) * Update Eberhard Mönkeberg's address (http://lkml.org/lkml/2007/1/8/313) Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Adrian Bunk <bunk@kernel.org>	2007-10-19 23:21:04 +02:00
Trond Myklebust	565277f63c	NFS: Fix a race in sillyrename lookup() and sillyrename() can race one another because the sillyrename() completion cannot take the parent directory's inode->i_mutex since the latter may be held by whoever is calling dput(). We therefore have little option but to add extra locking to ensure that nfs_lookup() and nfs_atomic_open() do not race with the sillyrename completion. If somebody has looked up the sillyrenamed file in the meantime, we just transfer the sillydelete information to the new dentry. Please refer to the bug-report at http://bugzilla.linux-nfs.org/show_bug.cgi?id=150 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-19 17:19:16 -04:00
Robert P. J. Day	3a4fa0a25d	Fix misspellings of "system", "controller", "interrupt" and "necessary". Fix the various misspellings of "system", controller", "interrupt" and "[un]necessary". Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: Adrian Bunk <bunk@kernel.org>	2007-10-19 23:10:43 +02:00
John Anthony Kazos Jr	18735dd8d2	crypto: convert crypto.h to UTF-8 Convert the encoding of <include/linux/crypto.h> from ISO-8859-1 to UTF-8. Signed-off-by: John Anthony Kazos Jr. <jakj@j-a-k-j.com> Signed-off-by: Adrian Bunk <bunk@kernel.org>	2007-10-19 23:07:36 +02:00
Linus Torvalds	c4ec207173	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (41 commits) ACPICA: hw: Don't carry spinlock over suspend ACPICA: hw: remove use_lock flag from acpi_hw_register_{read, write} ACPI: cpuidle: port idle timer suspend/resume workaround to cpuidle ACPI: clean up acpi_enter_sleep_state_prep Hibernation: Make sure that ACPI is enabled in acpi_hibernation_finish ACPI: suppress uninitialized var warning cpuidle: consolidate 2.6.22 cpuidle branch into one patch ACPI: thinkpad-acpi: skip blanks before the data when parsing sysfs ACPI: AC: Add sysfs interface ACPI: SBS: Add sysfs alarm ACPI: SBS: Add ACPI_PROCFS around procfs handling code. ACPI: SBS: Add support for power_supply class (and sysfs) ACPI: SBS: Make SBS reads table-driven. ACPI: SBS: Simplify data structures in SBS ACPI: SBS: Split host controller (ACPI0001) from SBS driver (ACPI0002) ACPI: EC: Add new query handler to list head. ACPI: Add acpi_bus_generate_event4() function ACPI: Battery: add sysfs alarm ACPI: Battery: Add sysfs support ACPI: Battery: Misc clean-ups, no functional changes ... Fix up conflicts in drivers/misc/thinkpad_acpi.[ch] manually	2007-10-19 13:12:46 -07:00
Mathieu Desnoyers	31155bc03e	Linux Kernel Markers - Samples Module example showing how to use the Linux Kernel Markers. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:55 -07:00
Mathieu Desnoyers	8256e47cdc	Linux Kernel Markers The marker activation functions sits in kernel/marker.c. A hash table is used to keep track of the registered probes and armed markers, so the markers within a newly loaded module that should be active can be activated at module load time. marker_query has been removed. marker_get_first, marker_get_next and marker_release should be used as iterators on the markers. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Acked-by: "Frank Ch. Eigler" <fche@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Mike Mason <mmlnx@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:54 -07:00
Srivatsa Vaddagiri	68318b8e0b	Hook up group scheduler with control groups Enable "cgroup" (formerly containers) based fair group scheduling. This will let administrator create arbitrary groups of tasks (using "cgroup" pseudo filesystem) and control their cpu bandwidth usage. [akpm@linux-foundation.org: fix cpp condition] Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Paul Menage <menage@google.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:51 -07:00
Bernhard Walle	cba63c3089	Extended crashkernel command line This patch adds a extended crashkernel syntax that makes the value of reserved system RAM dependent on the system RAM itself: crashkernel=<range1>:<size1>[,<range2>:<size2>,...][@offset] range=start-[end] For example: crashkernel=512M-2G:64M,2G-:128M The motivation comes from distributors that configure their crashkernel command line automatically with some configuration tool (YaST, you know ;)). Of course that tool knows the value of System RAM, but if the user removes RAM, then the system becomes unbootable or at least unusable and error handling is very difficult. This series implements this change for i386, x86_64, ia64, ppc64 and sh. That should be all platforms that support kdump in current mainline. I tested all platforms except sh due to the lack of a sh processor. This patch: This is the generic part of the patch. It adds a parse_crashkernel() function in kernel/kexec.c that is called by the architecture specific code that actually reserves the memory. That function takes the whole command line and looks itself for "crashkernel=" in it. If there are multiple occurrences, then the last one is taken. The advantage is that if you have a bootloader like lilo or elilo which allows you to append a command line parameter but not to remove one (like in GRUB), then you can add another crashkernel value for testing at the boot command line and this one overwrites the command line in the configuration then. Signed-off-by: Bernhard Walle <bwalle@suse.de> Cc: Andi Kleen <ak@suse.de> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Vivek Goyal <vgoyal@in.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:49 -07:00
Kirill Korotaev	3ac88a41ff	virtualization of sysv msg queues is incomplete Virtualization of sysv msg queues is incomplete: msg_hdrs and msg_bytes variables visible from userspace are global. Let's make them per-namespace. Signed-off-by: Alexey Kuznetsov <alexey@openvz.org> Signed-off-by: Kirill Korotaev <dev@openvz.org> Cc: Pierre Peiffer <pierre.peiffer@bull.net> Cc: Nadia Derbey <Nadia.Derbey@bull.net> Cc: Serge Hallyn <serue@us.ibm.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:48 -07:00
Nadia Derbey	7ca7e564e0	ipc: store ipcs into IDRs This patch introduces ipcs storage into IDRs. The main changes are: . This ipc_ids structure is changed: the entries array is changed into a root idr structure. . The grow_ary() routine is removed: it is not needed anymore when adding an ipc structure, since we are now using the IDR facility. . The ipc_rmid() routine interface is changed: . there is no need for this routine to return the pointer passed in as argument: it is now declared as a void . since the id is now part of the kern_ipc_perm structure, no need to have it as an argument to the routine Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:44 -07:00
Cliff Wickman	470fd64644	hotplug cpu: migrate a task within its cpuset When a cpu is disabled, move_task_off_dead_cpu() is called for tasks that have been running on that cpu. Currently, such a task is migrated: 1) to any cpu on the same node as the disabled cpu, which is both online and among that task's cpus_allowed 2) to any cpu which is both online and among that task's cpus_allowed It is typical of a multithreaded application running on a large NUMA system to have its tasks confined to a cpuset so as to cluster them near the memory that they share. Furthermore, it is typical to explicitly place such a task on a specific cpu in that cpuset. And in that case the task's cpus_allowed includes only a single cpu. This patch would insert a preference to migrate such a task to some cpu within its cpuset (and set its cpus_allowed to its entire cpuset). With this patch, migrate the task to: 1) to any cpu on the same node as the disabled cpu, which is both online and among that task's cpus_allowed 2) to any online cpu within the task's cpuset 3) to any cpu which is both online and among that task's cpus_allowed In order to do this, move_task_off_dead_cpu() must make a call to cpuset_cpus_allowed_locked(), a new subset of cpuset_cpus_allowed(), that will not block. (name change - per Oleg's suggestion) Calls are made to cpuset_lock() and cpuset_unlock() in migration_call() to set the cpuset mutex during the whole migrate_live_tasks() and migrate_dead_tasks() procedure. [akpm@linux-foundation.org: build fix] [pj@sgi.com: Fix indentation and spacing] Signed-off-by: Cliff Wickman <cpw@sgi.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Christoph Lameter <clameter@sgi.com> Cc: Paul Jackson <pj@sgi.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:44 -07:00
Pavel Emelyanov	ba25f9dcc4	Use helpers to obtain task pid in printks The task_struct->pid member is going to be deprecated, so start using the helpers (task_pid_nr/task_pid_vnr/task_pid_nr_ns) in the kernel. The first thing to start with is the pid, printed to dmesg - in this case we may safely use task_pid_nr(). Besides, printks produce more (much more) than a half of all the explicit pid usage. [akpm@linux-foundation.org: git-drm went and changed lots of stuff] Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Dave Airlie <airlied@linux.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:43 -07:00
Pavel Emelyanov	9a2e70572e	Isolate the explicit usage of signal->pgrp The pgrp field is not used widely around the kernel so it is now marked as deprecated with appropriate comment. The initialization of INIT_SIGNALS is trimmed because a) they are set to 0 automatically; b) gcc cannot properly initialize two anonymous (the second one is the one with the session) unions. In this particular case to make it compile we'd have to add some field initialized right before the .pgrp. This is the same patch as the 1ec320afdc9552c92191d5f89fcd1ebe588334ca one (from Cedric), but for the pgrp field. Some progress report: We have to deprecate the pid, tgid, session and pgrp fields on struct task_struct and struct signal_struct. The session and pgrp are already deprecated. The tgid value is close to being such - the worst known usage in in fs/locks.c and audit code. The pid field deprecation is mainly blocked by numerous printk-s around the kernel that print the tsk->pid to log. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:43 -07:00
Jiri Slaby	14ed9d23aa	remove BITS_TO_TYPE macro remove BITS_TO_TYPE macro I realized, that it is actually the same as DIV_ROUND_UP, use it instead. [akpm@linux-foundation.org: build fix] Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:42 -07:00
Jiri Slaby	93043ece03	define global BIT macro define global BIT macro move all local BIT defines to the new globally define macro. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Kumar Gala <galak@gate.crashing.org> Cc: Dmitry Torokhov <dtor@mail.ru> Cc: Jeff Garzik <jeff@garzik.org> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: "Antonino A. Daplas" <adaplas@pol.net> Cc: Russell King <rmk@arm.linux.org.uk> Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: "John W. Linville" <linville@tuxdriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:42 -07:00
Jiri Slaby	7b19ada2ed	get rid of input BIT* duplicate defines get rid of input BIT* duplicate defines use newly global defined macros for input layer. Also remove includes of input.h from non-input sources only for BIT macro definiton. Define the macro temporarily in local manner, all those local definitons will be removed further in this patchset (to not break bisecting). BIT macro will be globally defined (1<<x) Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: <dtor@mail.ru> Acked-by: Jiri Kosina <jkosina@suse.cz> Cc: <lenb@kernel.org> Acked-by: Marcel Holtmann <marcel@holtmann.org> Cc: <perex@suse.cz> Acked-by: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: <vernux@us.ibm.com> Cc: <malattia@linux.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:42 -07:00
Jiri Slaby	d05be13bcc	define first set of BIT* macros define first set of BIT* macros - move BITOP_MASK and BITOP_WORD from asm-generic/bitops/atomic.h to include/linux/bitops.h and rename it to BIT_MASK and BIT_WORD - move BITS_TO_LONGS and BITS_PER_BYTE to bitops.h too and allow easily define another BITS_TO_something (e.g. in event.c) by BITS_TO_TYPE macro Remaining (and common) BIT macro will be defined after all occurences and conflicts will be sorted out in the patches. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:42 -07:00
Jiri Slaby	1977f03272	remove asm/bitops.h includes remove asm/bitops.h includes including asm/bitops directly may cause compile errors. don't include it and include linux/bitops instead. next patch will deny including asm header directly. Cc: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:41 -07:00
Jiri Slaby	bc552f7715	Misc: phantom, improved data passing This new version guarantees amb_bit switch in small enough intervals, so that the device won't stop working in the middle of a movement anymore. However it preserves old (openhaptics) functionality. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:41 -07:00
Paul Menage	8707d8b8c0	Fix cpusets update_cpumask Cause writes to cpuset "cpus" file to update cpus_allowed for member tasks: - collect batches of tasks under tasklist_lock and then call set_cpus_allowed() on them outside the lock (since this can sleep). - add a simple generic priority heap type to allow efficient collection of batches of tasks to be processed without duplicating or missing any tasks in subsequent batches. - make "cpus" file update a no-op if the mask hasn't changed - fix race between update_cpumask() and sched_setaffinity() by making sched_setaffinity() post-check that it's not running on any cpus outside cpuset_cpus_allowed(). [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Paul Menage <menage@google.com> Cc: Paul Jackson <pj@sgi.com> Cc: David Rientjes <rientjes@google.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:41 -07:00
Paul Jackson	029190c515	cpuset sched_load_balance flag Add a new per-cpuset flag called 'sched_load_balance'. When enabled in a cpuset (the default value) it tells the kernel scheduler that the scheduler should provide the normal load balancing on the CPUs in that cpuset, sometimes moving tasks from one CPU to a second CPU if the second CPU is less loaded and if that task is allowed to run there. When disabled (write "0" to the file) then it tells the kernel scheduler that load balancing is not required for the CPUs in that cpuset. Now even if this flag is disabled for some cpuset, the kernel may still have to load balance some or all the CPUs in that cpuset, if some overlapping cpuset has its sched_load_balance flag enabled. If there are some CPUs that are not in any cpuset whose sched_load_balance flag is enabled, the kernel scheduler will not load balance tasks to those CPUs. Moreover the kernel will partition the 'sched domains' (non-overlapping sets of CPUs over which load balancing is attempted) into the finest granularity partition that it can find, while still keeping any two CPUs that are in the same shed_load_balance enabled cpuset in the same element of the partition. This serves two purposes: 1) It provides a mechanism for real time isolation of some CPUs, and 2) it can be used to improve performance on systems with many CPUs by supporting configurations in which load balancing is not done across all CPUs at once, but rather only done in several smaller disjoint sets of CPUs. This mechanism replaces the earlier overloading of the per-cpuset flag 'cpu_exclusive', which overloading was removed in an earlier patch: cpuset-remove-sched-domain-hooks-from-cpusets See further the Documentation and comments in the code itself. [akpm@linux-foundation.org: don't be weird] Signed-off-by: Paul Jackson <pj@sgi.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:41 -07:00
Pavel Emelyanov	2f2a3a46fc	Uninline the task_xid_nr_ns() calls Since these are expanded into call to pid_nr_ns() anyway, it's OK to move the whole routine out-of-line. This is a cheap way to save ~100 bytes from vmlinux. Together with the previous two patches, it saves half-a-kilo from the vmlinux. Un-inline other (currently inlined) functions must be done with additional performance testing. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Paul Menage <menage@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:41 -07:00
Pavel Emelyanov	8990571eb5	Uninline find_pid etc set of functions The find_pid/_vpid/_pid_ns functions are used to find the struct pid by its id, depending on whic id - global or virtual - is used. The find_vpid() is a macro that pushes the current->nsproxy->pid_ns on the stack to call another function - find_pid_ns(). It turned out, that this dereference together with the push itself cause the kernel text size to grow too much. Move all these out-of-line. Together with the previous patch this saves a bit less that 400 bytes from .text section. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Paul Menage <menage@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:41 -07:00
Pavel Emelyanov	bac0abd617	Isolate some explicit usage of task->tgid With pid namespaces this field is now dangerous to use explicitly, so hide it behind the helpers. Also the pid and pgrp fields o task_struct and signal_struct are to be deprecated. Unfortunately this patch cannot be sent right now as this leads to tons of warnings, so start isolating them, and deprecate later. Actually the p->tgid == pid has to be changed to has_group_leader_pid(), but Oleg pointed out that in case of posix cpu timers this is the same, and thread_group_leader() is more preferable. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:40 -07:00
Pavel Emelyanov	19b9b9b54e	pid namespaces: remove the struct pid unneeded fields Since we've switched from using pid->nr to pid->upids->nr some fields on struct pid are no longer needed Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Paul Menage <menage@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:40 -07:00
Pavel Emelyanov	228ebcbe63	Uninline find_task_by_xxx set of functions The find_task_by_something is a set of macros are used to find task by pid depending on what kind of pid is proposed - global or virtual one. All of them are wrappers above the most generic one - find_task_by_pid_type_ns() - and just substitute some args for it. It turned out, that dereferencing the current->nsproxy->pid_ns construction and pushing one more argument on the stack inline cause kernel text size to grow. This patch moves all this stuff out-of-line into kernel/pid.c. Together with the next patch it saves a bit less than 400 bytes from the .text section. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Paul Menage <menage@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:40 -07:00
Sukadev Bhattiprolu	3eb07c8c8a	pid namespaces: destroy pid namespace on init's death Terminate all processes in a namespace when the reaper of the namespace is exiting. We do this by walking the pidmap of the namespace and sending SIGKILL to all processes. Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Acked-by: Pavel Emelyanov <xemul@openvz.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Paul Menage <menage@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:40 -07:00
Pavel Emelyanov	6f4e643353	pid namespaces: initialize the namespace's proc_mnt The namespace's proc_mnt must be kern_mount-ed to make this pointer always valid, independently of whether the user space mounted the proc or not. This solves raced in proc_flush_task, etc. with the proc_mnt switching from NULL to not-NULL. The initialization is done after the init's pid is created and hashed to make proc_get_sb() finr it and get for root inode. Sice the namespace holds the vfsmnt, vfsmnt holds the superblock and the superblock holds the namespace we must explicitly break this circle to destroy all the stuff. This is done after the init of the namespace dies. Running a few steps forward - when init exits it will kill all its children, so no proc_mnt will be needed after its death. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Paul Menage <menage@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:40 -07:00
Pavel Emelyanov	30e49c263e	pid namespaces: allow cloning of new namespace When clone() is invoked with CLONE_NEWPID, create a new pid namespace and then create a new struct pid for the new process. Allocate pid_t's for the new process in the new pid namespace and all ancestor pid namespaces. Make the newly cloned process the session and process group leader. Since the active pid namespace is special and expected to be the first entry in pid->upid_list, preserve the order of pid namespaces. The size of 'struct pid' is dependent on the the number of pid namespaces the process exists in, so we use multiple pid-caches'. Only one pid cache is created during system startup and this used by processes that exist only in init_pid_ns. When a process clones its pid namespace, we create additional pid caches as necessary and use the pid cache to allocate 'struct pids' for that depth. Note, that with this patch the newly created namespace won't work, since the rest of the kernel still uses global pids, but this is to be fixed soon. Init pid namespace still works. [oleg@tv-sign.ru: merge fix] Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Paul Menage <menage@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:39 -07:00
Pavel Emelyanov	b461cc0382	pid namespaces: miscellaneous preparations for pid namespaces * remove pid.h from pid_namespaces.h; * rework is_(cgroup\|global)_init; * optimize (get\|put)_pid_ns for init_pid_ns; * declare task_child_reaper to return actual reaper. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Paul Menage <menage@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:39 -07:00
Pavel Emelyanov	07543f5c75	pid namespaces: make proc have multiple superblocks - one for each namespace Each pid namespace have to be visible through its own proc mount. Thus we need to have per-namespace proc trees with their own superblocks. We cannot easily show different pid namespace via one global proc tree, since each pid refers to different tasks in different namespaces. E.g. pid 1 refers to the init task in the initial namespace and to some other task when seeing from another namespace. Moreover - pid, exisintg in one namespace may not exist in the other. This approach has one move advantage is that the tasks from the init namespace can see what tasks live in another namespace by reading entries from another proc tree. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Paul Menage <menage@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:39 -07:00
Pavel Emelyanov	198fe21b0a	pid namespaces: helpers to find the task by its numerical ids When searching the task by numerical id on may need to find it using global pid (as it is done now in kernel) or by its virtual id, e.g. when sending a signal to a task from one namespace the sender will specify the task's virtual id and we should find the task by this value. [akpm@linux-foundation.org: fix gfs2 linkage] Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Paul Menage <menage@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:39 -07:00
Pavel Emelyanov	7af5729474	pid namespaces: helpers to obtain pid numbers When showing pid to user or getting the pid numerical id for in-kernel use the value of this id may differ depending on the namespace. This set of helpers is used to get the global pid nr, the virtual (i.e. seen by task in its namespace) nr and the nr as it is seen from the specified namespace. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Paul Menage <menage@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:39 -07:00
Pavel Emelyanov	8ef047aaae	pid namespaces: make alloc_pid(), free_pid() and put_pid() work with struct upid Each struct upid element of struct pid has to be initialized properly, i.e. its nr mst be allocated from appropriate pidmap and ns set to appropriate namespace. When allocating a new pid, we need to know the namespace this pid will live in, so the additional argument is added to alloc_pid(). On the other hand, the rest of the kernel still uses the pid->nr and pid->pid_chain fields, so these ones are still initialized, but this will be removed soon. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Paul Menage <menage@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:39 -07:00
Pavel Emelyanov	faacbfd3a6	pid namespaces: add support for pid namespaces hierarchy Each namespace has a parent and is characterized by its "level". Level is the number of the namespace generation. E.g. init namespace has level 0, after cloning new one it will have level 1, the next one - 2 and so on and so forth. This level is not explicitly limited. True hierarchy must have some way to find each namespace's children, but it is not used in the patches, so this ability is not added (yet). Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Paul Menage <menage@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:38 -07:00
Sukadev Bhattiprolu	4c3f2ead5a	pid namespaces: introduce struct upid Since task will be visible from different pid namespaces each of them have to be addressed by multiple pids. struct upid is to store the information about which id refers to which namespace. The constuciton looks like this. Each struct pid carried the reference counter and the list of tasks attached to this pid. At its end it has a variable length array of struct upid-s. Each struct upid has a numerical id (pid itself), pointer to the namespace, this ID is valid in and is hashed into a pid_hash for searching the pids. The nr and pid_chain fields are kept in struct pid for a while to make kernel still work (no patch initialize the upids yet), but it will be removed at the end of this series when we switch to upids completely. Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Paul Menage <menage@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:38 -07:00
Pavel Emelyanov	60347f6716	pid namespaces: prepare proc_flust_task() to flush entries from multiple proc trees The first part is trivial - we just make the proc_flush_task() to operate on arbitrary vfsmount with arbitrary ids and pass the pid and global proc_mnt to it. The other change is more tricky: I moved the proc_flush_task() call in release_task() higher to address the following problem. When flushing task from many proc trees we need to know the set of ids (not just one pid) to find the dentries' names to flush. Thus we need to pass the task's pid to proc_flush_task() as struct pid is the only object that can provide all the pid numbers. But after __exit_signal() task has detached all his pids and this information is lost. This creates a tiny gap for proc_pid_lookup() to bring some dentries back to tree and keep them in hash (since pids are still alive before __exit_signal()) till the next shrink, but since proc_flush_task() does not provide a 100% guarantee that the dentries will be flushed, this is OK to do so. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Paul Menage <menage@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:38 -07:00
Pavel Emelyanov	8bf9725c29	pid namespaces: introduce MS_KERNMOUNT flag This flag tells the .get_sb callback that this is a kern_mount() call so that it can trust data pointer to be valid in-kernel one. If this flag is passed from the user process, it is cleared since the data pointer is not a valid kernel object. Running a few steps forward - this will be needed for proc to create the superblock and store a valid pid namespace on it during the namespace creation. The reason, why the namespace cannot live without proc mount is described in the appropriate patch. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Paul Menage <menage@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:38 -07:00
Johannes Berg	4e6045f134	workqueue: debug flushing deadlocks with lockdep In the following scenario: code path 1: my_function() -> lock(L1); ...; flush_workqueue(); ... code path 2: run_workqueue() -> my_work() -> ...; lock(L1); ... you can get a deadlock when my_work() is queued or running but my_function() has acquired L1 already. This patch adds a pseudo-lock to each workqueue to make lockdep warn about this scenario. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:38 -07:00
Pavel Emelyanov	cf7b708c8d	Make access to task's nsproxy lighter When someone wants to deal with some other taks's namespaces it has to lock the task and then to get the desired namespace if the one exists. This is slow on read-only paths and may be impossible in some cases. E.g. Oleg recently noticed a race between unshare() and the (sent for review in cgroups) pid namespaces - when the task notifies the parent it has to know the parent's namespace, but taking the task_lock() is impossible there - the code is under write locked tasklist lock. On the other hand switching the namespace on task (daemonize) and releasing the namespace (after the last task exit) is rather rare operation and we can sacrifice its speed to solve the issues above. The access to other task namespaces is proposed to be performed like this: rcu_read_lock(); nsproxy = task_nsproxy(tsk); if (nsproxy != NULL) { / * * work with the namespaces here * e.g. get the reference on one of them * / } / * * NULL task_nsproxy() means that this task is * almost dead (zombie) * / rcu_read_unlock(); This patch has passed the review by Eric and Oleg :) and, of course, tested. [clg@fr.ibm.com: fix unshare()] [ebiederm@xmission.com: Update get_net_ns_by_pid] Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:37 -07:00
Serge E. Hallyn	b460cbc581	pid namespaces: define is_global_init() and is_container_init() is_init() is an ambiguous name for the pid==1 check. Split it into is_global_init() and is_container_init(). A cgroup init has it's tsk->pid == 1. A global init also has it's tsk->pid == 1 and it's active pid namespace is the init_pid_ns. But rather than check the active pid namespace, compare the task structure with 'init_pid_ns.child_reaper', which is initialized during boot to the /sbin/init process and never changes. Changelog: 2.6.22-rc4-mm2-pidns1: - Use 'init_pid_ns.child_reaper' to determine if a given task is the global init (/sbin/init) process. This would improve performance and remove dependence on the task_pid(). 2.6.21-mm2-pidns2: - [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc, ppc,avr32}/traps.c for the _exception() call to is_global_init(). This way, we kill only the cgroup if the cgroup's init has a bug rather than force a kernel panic. [akpm@linux-foundation.org: fix comment] [sukadev@us.ibm.com: Use is_global_init() in arch/m32r/mm/fault.c] [bunk@stusta.de: kernel/pid.c: remove unused exports] [sukadev@us.ibm.com: Fix capability.c to work with threaded init] Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Acked-by: Pavel Emelianov <xemul@openvz.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Herbert Poetzel <herbert@13thfloor.at> Cc: Kirill Korotaev <dev@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:37 -07:00
Sukadev Bhattiprolu	88f21d8182	pid namespaces: rename child_reaper() function Rename the child_reaper() function to task_child_reaper() to be similar to other task_* functions and to distinguish the function from 'struct pid_namspace.child_reaper'. Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Pavel Emelianov <xemul@openvz.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Herbert Poetzel <herbert@13thfloor.at> Cc: Kirill Korotaev <dev@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:37 -07:00
Sukadev Bhattiprolu	2894d650cd	pid namespaces: define and use task_active_pid_ns() wrapper With multiple pid namespaces, a process is known by some pid_t in every ancestor pid namespace. Every time the process forks, the child process also gets a pid_t in every ancestor pid namespace. While a process is visible in >=1 pid namespaces, it can see pid_t's in only one pid namespace. We call this pid namespace it's "active pid namespace", and it is always the youngest pid namespace in which the process is known. This patch defines and uses a wrapper to find the active pid namespace of a process. The implementation of the wrapper will be changed in when support for multiple pid namespaces are added. Changelog: 2.6.22-rc4-mm2-pidns1: - [Pavel Emelianov, Alexey Dobriyan] Back out the change to use task_active_pid_ns() in child_reaper() since task->nsproxy can be NULL during task exit (so child_reaper() continues to use init_pid_ns). to implement child_reaper() since init_pid_ns.child_reaper to implement child_reaper() since tsk->nsproxy can be NULL during exit. 2.6.21-rc6-mm1: - Rename task_pid_ns() to task_active_pid_ns() to reflect that a process can have multiple pid namespaces. Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Acked-by: Pavel Emelianov <xemul@openvz.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Herbert Poetzel <herbert@13thfloor.at> Cc: Kirill Korotaev <dev@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:37 -07:00
Pavel Emelianov	baf8f0f82d	pid namespaces: dynamic kmem cache allocator for pid namespaces Add kmem_cache to pid_namespace to allocate pids from. Since both implementations expand the struct pid to carry more numerical values each namespace should have separate cache to store pids of different sizes. Each kmem cache is name "pid_<NR>", where <NR> is the number of numerical ids on the pid. Different namespaces with same level of nesting will have same caches. This patch has two FIXMEs that are to be fixed after we reach the consensus about the struct pid itself. The first one is that the namespace to free the pid from in free_pid() must be taken from pid. Now the init_pid_ns is used. The second FIXME is about the cache allocation. When we do know how long the object will be then we'll have to calculate this size in create_pid_cachep. Right now the sizeof(struct pid) value is used. [akpm@linux-foundation.org: coding-style repair] Signed-off-by: Pavel Emelianov <xemul@openvz.org> Acked-by: Cedric Le Goater <clg@fr.ibm.com> Acked-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:37 -07:00
Pavel Emelianov	a05f7b15de	pid namespaces: make get_pid_ns() return the namespace itself Make get_pid_ns() return the namespace itself to look like the other getters and make the code using it look nicer. Signed-off-by: Pavel Emelianov <xemul@openvz.org> Acked-by: Cedric Le Goater <clg@fr.ibm.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:37 -07:00
Pavel Emelianov	a47afb0f9d	pid namespaces: round up the API The set of functions process_session, task_session, process_group and task_pgrp is confusing, as the names can be mixed with each other when looking at the code for a long time. The proposals are to * equip the functions that return the integer with _nr suffix to represent that fact, * and to make all functions work with task (not process) by making the common prefix of the same name. For monotony the routines signal_session() and set_signal_session() are replaced with task_session_nr() and set_task_session(), especially since they are only used with the explicit task->signal dereference. Signed-off-by: Pavel Emelianov <xemul@openvz.org> Acked-by: Serge E. Hallyn <serue@us.ibm.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:37 -07:00
Serge E. Hallyn	858d72ead4	cgroups: implement namespace tracking subsystem When a task enters a new namespace via a clone() or unshare(), a new cgroup is created and the task moves into it. This version names cgroups which are automatically created using cgroup_clone() as "node_<pid>" where pid is the pid of the unsharing or cloned process. (Thanks Pavel for the idea) This is safe because if the process unshares again, it will create /cgroups/(...)/node_<pid>/node_<pid> The only possibilities (AFAICT) for a -EEXIST on unshare are 1. pid wraparound 2. a process fails an unshare, then tries again. Case 1 is unlikely enough that I ignore it (at least for now). In case 2, the node_<pid> will be empty and can be rmdir'ed to make the subsequent unshare() succeed. Changelog: Name cloned cgroups as "node_<pid>". [clg@fr.ibm.com: fix order of cgroup subsystems in init/Kconfig] Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Cc: Paul Menage <menage@google.com> Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:37 -07:00
Balbir Singh	846c7bb055	Add cgroupstats This patch is inspired by the discussion at http://lkml.org/lkml/2007/4/11/187 and implements per cgroup statistics as suggested by Andrew Morton in http://lkml.org/lkml/2007/4/11/263. The patch is on top of 2.6.21-mm1 with Paul's cgroups v9 patches (forward ported) This patch implements per cgroup statistics infrastructure and re-uses code from the taskstats interface. A new set of cgroup operations are registered with commands and attributes. It should be very easy to extend per cgroup statistics, by adding members to the cgroupstats structure. The current model for cgroupstats is a pull, a push model (to post statistics on interesting events), should be very easy to add. Currently user space requests for statistics by passing the cgroup file descriptor. Statistics about the state of all the tasks in the cgroup is returned to user space. TODO's/NOTE: This patch provides an infrastructure for implementing cgroup statistics. Based on the needs of each controller, we can incrementally add more statistics, event based support for notification of statistics, accumulation of taskstats into cgroup statistics in the future. Sample output # ./cgroupstats -C /cgroup/a sleeping 2, blocked 0, running 1, stopped 0, uninterruptible 0 # ./cgroupstats -C /cgroup/ sleeping 154, blocked 0, running 0, stopped 0, uninterruptible 0 If the approach looks good, I'll enhance and post the user space utility for the same Feedback, comments, test results are always welcome! [akpm@linux-foundation.org: build fix] Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Paul Menage <menage@google.com> Cc: Jay Lan <jlan@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:36 -07:00
Paul Menage	006cb99200	Task Control Groups: simple task cgroup debug info subsystem This example subsystem exports debugging information as an aid to diagnosing refcount leaks, etc, in the cgroup framework. Signed-off-by: Paul Menage <menage@google.com> Cc: Serge E. Hallyn <serue@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Paul Jackson <pj@sgi.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:36 -07:00
Paul Menage	62d0df6406	Task Control Groups: example CPU accounting subsystem This example demonstrates how to use the generic cgroup subsystem for a simple resource tracker that counts, for the processes in a cgroup, the total CPU time used and the %CPU used in the last complete 10 second interval. Portions contributed by Balbir Singh <balbir@in.ibm.com> Signed-off-by: Paul Menage <menage@google.com> Cc: Serge E. Hallyn <serue@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Paul Jackson <pj@sgi.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:36 -07:00
Paul Menage	8793d854ed	Task Control Groups: make cpusets a client of cgroups Remove the filesystem support logic from the cpusets system and makes cpusets a cgroup subsystem The "cpuset" filesystem becomes a dummy filesystem; attempts to mount it get passed through to the cgroup filesystem with the appropriate options to emulate the old cpuset filesystem behaviour. Signed-off-by: Paul Menage <menage@google.com> Cc: Serge E. Hallyn <serue@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Paul Jackson <pj@sgi.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:36 -07:00
Paul Menage	81a6a5cdd2	Task Control Groups: automatic userspace notification of idle cgroups Add the following files to the cgroup filesystem: notify_on_release - configures/reports whether the cgroup subsystem should attempt to run a release script when this cgroup becomes unused release_agent - configures/reports the release agent to be used for this hierarchy (top level in each hierarchy only) releasable - reports whether this cgroup would have been auto-released if notify_on_release was true and a release agent was configured (mainly useful for debugging) To avoid locking issues, invoking the userspace release agent is done via a workqueue task; cgroups that need to have their release agents invoked by the workqueue task are linked on to a list. [pj@sgi.com: Need to include kmod.h] Signed-off-by: Paul Menage <menage@google.com> Cc: Serge E. Hallyn <serue@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Paul Jackson <pj@sgi.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:36 -07:00
Paul Menage	817929ec27	Task Control Groups: shared cgroup subsystem group arrays Replace the struct css_set embedded in task_struct with a pointer; all tasks that have the same set of memberships across all hierarchies will share a css_set object, and will be linked via their css_sets field to the "tasks" list_head in the css_set. Assuming that many tasks share the same cgroup assignments, this reduces overall space usage and keeps the size of the task_struct down (three pointers added to task_struct compared to a non-cgroups kernel, no matter how many subsystems are registered). [akpm@linux-foundation.org: fix a printk] [akpm@linux-foundation.org: build fix] Signed-off-by: Paul Menage <menage@google.com> Cc: Serge E. Hallyn <serue@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Paul Jackson <pj@sgi.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Serge E. Hallyn <serue@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Paul Jackson <pj@sgi.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:36 -07:00
Paul Menage	a424316ca1	Task Control Groups: add procfs interface Add: /proc/cgroups - general system info /proc/*/cgroup - per-task cgroup membership info [a.p.zijlstra@chello.nl: cgroups: bdi init hooks] Signed-off-by: Paul Menage <menage@google.com> Cc: Serge E. Hallyn <serue@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Paul Jackson <pj@sgi.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:36 -07:00
Paul Menage	697f416108	Task Control Groups: add cgroup_clone() interface Add support for cgroup_clone(), a way to create new cgroups intended to be used for systems such as namespace unsharing. A new subsystem callback, post_clone(), is added to allow subsystems to automatically configure cloned cgroups. Signed-off-by: Paul Menage <menage@google.com> Cc: Serge E. Hallyn <serue@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Paul Jackson <pj@sgi.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:36 -07:00
Paul Menage	b4f48b6363	Task Control Groups: add fork()/exit() hooks This adds the necessary hooks to the fork() and exit() paths to ensure that new children inherit their parent's cgroup assignments, and that exiting processes release reference counts on their cgroups. Signed-off-by: Paul Menage <menage@google.com> Cc: Serge E. Hallyn <serue@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Paul Jackson <pj@sgi.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:36 -07:00
Paul Menage	355e0c48b7	Add cgroup write_uint() helper method Add write_uint() helper method for cgroup subsystems This helper is analagous to the read_uint() helper method for reporting u64 values to userspace. It's designed to reduce the amount of boilerplate requierd for creating new cgroup subsystems. Signed-off-by: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:36 -07:00
Paul Menage	bbcb81d091	Task Control Groups: add tasks file interface Add the per-directory "tasks" file for cgroupfs mounts; this allows the user to determine which tasks are members of a cgroup by reading a cgroup's "tasks", and to move a task into a cgroup by writing its pid to its "tasks". Signed-off-by: Paul Menage <menage@google.com> Cc: Serge E. Hallyn <serue@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Paul Jackson <pj@sgi.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:36 -07:00
Paul Menage	ddbcc7e8e5	Task Control Groups: basic task cgroup framework Generic Process Control Groups -------------------------- There have recently been various proposals floating around for resource management/accounting and other task grouping subsystems in the kernel, including ResGroups, User BeanCounters, NSProxy cgroups, and others. These all need the basic abstraction of being able to group together multiple processes in an aggregate, in order to track/limit the resources permitted to those processes, or control other behaviour of the processes, and all implement this grouping in different ways. This patchset provides a framework for tracking and grouping processes into arbitrary "cgroups" and assigning arbitrary state to those groupings, in order to control the behaviour of the cgroup as an aggregate. The intention is that the various resource management and virtualization/cgroup efforts can also become task cgroup clients, with the result that: - the userspace APIs are (somewhat) normalised - it's easier to test e.g. the ResGroups CPU controller in conjunction with the BeanCounters memory controller, or use either of them as the resource-control portion of a virtual server system. - the additional kernel footprint of any of the competing resource management systems is substantially reduced, since it doesn't need to provide process grouping/containment, hence improving their chances of getting into the kernel This patch: Add the main task cgroups framework - the cgroup filesystem, and the basic structures for tracking membership and associating subsystem state objects to tasks. Signed-off-by: Paul Menage <menage@google.com> Cc: Serge E. Hallyn <serue@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Paul Jackson <pj@sgi.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:36 -07:00
Randy Dunlap	8f731f7d83	kernel-api docbook: fix content problems Fix kernel-api docbook contents problems. docproc: linux-2.6.23-git13/include/asm-x86/unaligned_32.h: No such file or directory Warning(linux-2.6.23-git13//include/linux/list.h:482): bad line: of list entry Warning(linux-2.6.23-git13//mm/filemap.c:864): No description found for parameter 'ra' Warning(linux-2.6.23-git13//block/ll_rw_blk.c:3760): No description found for parameter 'req' Warning(linux-2.6.23-git13//include/linux/input.h:1077): No description found for parameter 'private' Warning(linux-2.6.23-git13//include/linux/input.h:1077): No description found for parameter 'cdev' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: WU Fengguang <wfg@mail.ustc.edu.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:35 -07:00
Jeff Mahoney	cb680c1be6	reiserfs: ignore on disk s_bmap_nr value Implement support for file systems larger than 8 TiB. The reiserfs superblock contains a 16 bit value for counting the number of bitmap blocks. The rest of the disk format supports file systems up to 2^32 blocks, but the bitmap block limitation artificially limits this to 8 TiB with a 4KiB block size. Rather than trust the superblock's 16-bit bitmap block count, we calculate it dynamically based on the number of blocks in the file system. When an incorrect value is observed in the superblock, it is zeroed out, ensuring that older kernels will not be able to mount the file system. Userspace support has already been implemented and shipped in reiserfsprogs 3.6.20. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:35 -07:00
Jeff Mahoney	4d20851d37	reiserfs: remove first_zero_hint The first_zero_hint metadata caching was never actually used, and it's of dubious optimization quality. This patch removes it. It doesn't actually shrink the size of the reiserfs_bitmap_info struct, since that doesn't work with block sizes larger than 8K. There was a big fixme in there, and with all the work lately in allowing block size > page size, I might as well kill the fixme as well. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:35 -07:00
Jeff Mahoney	3ee1667042	reiserfs: fix usage of signed ints for block numbers Do a quick signedness check for block numbers. There are a number of places where signed integers are used for block numbers, which limits the usable file system size to 8 TiB. The disk format, excepting a problem which will be fixed in the following patch, supports file systems up to 16 TiB in size. This patch cleans up those sites so that we can enable the full usable size. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:35 -07:00
Jose R. Santos	c2a9159cdd	jbd: config_jbd_debug cannot create /proc entry The jbd-debug file used to be located in /proc/sys/fs/jbd-debug, but create_proc_entry() does not do lookups on file names that are more that one directory deep. This causes the entry creation to fail and hence, no proc file is created. Instead of fixing this on procfs might as well move the jbd2-debug file to debugfs which would be the preferred location for this kind of tunable. The new location is now /sys/kernel/debug/jbd/jbd-debug. [akpm@linux-foundation.org: zillions of cleanups] Signed-off-by: Jose R. Santos <jrs@us.ibm.com> Acked-by: Jan Kara <jack@suse.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:35 -07:00
Chris Snook	1c09924448	jbd: remove printk() from J_ASSERT macros Remove printk from J_ASSERT to preserve registers during BUG. Signed-off-by: Chris Snook <csnook@redhat.com> Cc: "Stephen C. Tweedie" <sct@redhat.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:34 -07:00
Samuel Thibault	b293d75847	Console events and accessibility Some external modules like Speakup need to monitor console output. This adds a VT notifier that such modules can use to get console output events: allocation, deallocation, writes, other updates (cursor position, switch, etc.) [akpm@linux-foundation.org: fix headers_check] Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Cc: Dmitry Torokhov <dtor@mail.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:34 -07:00
Alexey Dobriyan	fe9d4f5763	Add kernel/notifier.c There is separate notifier header, but no separate notifier .c file. Extract notifier code out of kernel/sys.c which will remain for misc syscalls I hope. Merge kernel/die_notifier.c into kernel/notifier.c. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:34 -07:00
Samuel Thibault	41ab4396e1	Console keyboard events and accessibility Some blind people use a kernel engine called Speakup which uses hardware synthesis to speak what gets displayed on the screen. They use the PC keyboard to control this engine (start/stop, accelerate, ...) and also need to get keyboard feedback (to make sure to know what they are typing, the caps lock status, etc.) Up to now, the way it was done was very ugly. Below is a patch to add a notifier list for permitting a far better implementation, see ChangeLog above for details. You may wonder why this can't be done at the input layer. The problem is that what people want to monitor is the console keyboard, i.e. all input keyboards that got attached to the console, and with the currently active keymap (i.e. keysyms, not only keycodes). This adds a keyboard notifier that such modules can use to get the keyboard events and possibly eat them, at several stages: - keycodes: even before translation into keysym. - unbound keycodes: when no keysym is bound. - unicode: when the keycode would get translated into a unicode character. - keysym: when the keycode would get translated into a keysym. - post_keysym: after the keysym got interpreted, so as to see the result (caps lock, etc.) This also provides access to k_handler so as to permit simulation of keypresses. [akpm@linux-foundation.org: various fixes] Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Dmitry Torokhov <dtor@mail.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:33 -07:00
Miklos Szeredi	c18479fe01	put declaration of put_filesystem() in fs.h Declarations go into headers. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Ram Pai <linuxram@us.ibm.com> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:33 -07:00
Andi Kleen	ab483570a1	x86 & generic: change to __builtin_prefetch() gcc 3.2+ supports __builtin_prefetch, so it's possible to use it on all architectures. Change the generic fallback in linux/prefetch.h to use it instead of noping it out. gcc should do the right thing when the architecture doesn't support prefetching Undefine the x86-64 inline assembler version and use the fallback. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-19 20:35:04 +02:00
Linus Torvalds	4fa4d23fa2	Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 * 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: pcnet32: remove private net_device_stats structure vortex_up should initialize "err" pcnet32: remove compile warnings in non-napi mode pcnet32: fix non-napi packet reception fix EMAC driver for proper napi_synchronize API sky2: shutdown cleanup napi_synchronize: waiting for NAPI forcedeth msi bugfix gianfar: fix obviously wrong #ifdef CONFIG_GFAR_NAPI placement fs_enet: Update for API changes gianfar: remove orphan struct. forcedeth: fix rx-work condition in nv_rx_process_optimized() too	2007-10-18 19:31:54 -07:00
Linus Torvalds	a9e82d3a02	Merge master.kernel.org:/pub/scm/linux/kernel/git/bart/ide-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/bart/ide-2.6: (37 commits) ide: set drive->autotune in ide_pci_setup_ports() triflex: always tune PIO opti621: always tune PIO cy82c693: always tune PIO cs5520: always tune PIO alim15x3: always tune PIO ide: add IDE_HFLAG_LEGACY_IRQS host flag ide: add IDE_HFLAG_SERIALIZE host flag ide: add IDE_HFLAG_ERROR_STOPS_FIFO host flag piix: add DECLARE_ICH_DEV() macro pdc202xx_old: add DECLARE_PDC2026X_DEV() macro pdc202xx_new: add DECLARE_PDCNEW_DEV() macro aec62xx: no need to disable UDMA in ->init_hwif method for ATP850UF ide: remove .init_setup from ide_pci_device_t serverworks: remove ->init_setup scc_pata: remove ->init_setup pdc202xx_old: remove ->init_setup pdc202xx_new: remove ->init_setup hpt366: remove ->init_setup cmd64x: remove ->init_setup ...	2007-10-18 16:00:02 -07:00
Bartlomiej Zolnierkiewicz	3985ee3b4c	ide: add IDE_HFLAG_LEGACY_IRQS host flag Add IDE_HFLAG_LEGACY_IRQS host flag to tell ide_pci_setup_ports() to set hwif->irq to legacy IRQ 14/15 (iff hwif->irq is not already set) and convert atiixp, piix, serverworks, sis5513 and slc90e66 host drivers to use it. While at it: * In piix.c add IDE_HFLAGS_PIIX define and don't use ->init_hwif for MPIIX. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-10-19 00:30:11 +02:00
Bartlomiej Zolnierkiewicz	1c51361a98	ide: add IDE_HFLAG_SERIALIZE host flag Add IDE_HFLAG_SERIALIZE host flag to tell ide_pci_setup_ports() to set hwif/mate->serialized and convert aec62xx, cs5530 and sc1200 host drivers to use it. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-10-19 00:30:10 +02:00
Bartlomiej Zolnierkiewicz	ed67b92385	ide: add IDE_HFLAG_ERROR_STOPS_FIFO host flag Add IDE_HFLAG_ERROR_STOPS_FIFO host flag and use it instead of hwif->err_stops_fifo. As a side-effect this change fixes hwif->err_stops_fifo not being restored by ide_hwif_restore(). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-10-19 00:30:10 +02:00
Bartlomiej Zolnierkiewicz	942278ef64	ide: remove .init_setup from ide_pci_device_t Now that all users were fixed we can safely remove it. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-10-19 00:30:09 +02:00
Bartlomiej Zolnierkiewicz	5f8b6c3485	ide: add ->mwdma_mask and ->swdma_mask to ide_pci_device_t (take 2) * Add ->mwdma_mask and ->swdma_mask to ide_pci_device_t. * Set ide_hwif_t DMA masks using DMA masks from ide_pci_device_t in setup-pci.c::ide_pci_setup_ports() (iff DMA base is valid and ->init_hwif method may still override them). * Convert IDE PCI host drivers to use ide_pci_device_t DMA masks. While at it: * Use ATA_{UDMA,MWDMA,SWDMA}* defines. * hpt34x.c: add separate ide_pci_device_t instances for HPT343 and HPT345. * serverworks.c: fix DMA masks being set before checking DMA base. v2: * Add missing masks to DECLARE_GENERIC_PCI_DEV() macro. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-10-19 00:30:07 +02:00
Bartlomiej Zolnierkiewicz	238e4f142c	ide: add IDE_HFLAG_NO_LBA48 and IDE_HFLAG_NO_LBA48_DMA host flags Add IDE_HFLAG_NO_LBA48[_DMA] host flags, use it instead of hwif->no_lba48[_dma] and then remove no longer needed hwif->no_lba48[_dma]. As a side-effect this change fixes hwif->no_lba48_dma not being restored by ide_hwif_restore(). Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-10-19 00:30:07 +02:00
Bartlomiej Zolnierkiewicz	9ffcf364f9	ide: remove ->init_setup_dma from ide_pci_device_t (take 2) * Make ide_pci_device_t.host_flags u32 and add IDE_HFLAG_CS5520 host flag. * Pass ide_pci_device_t d to setup-pci.c::ide_get_or_set_dma_base() and use d->name instead of hwif->cds->name. Set IDE_HFLAG_CS5520 host flag in cs5520 host driver and use it in ide_get_or_set_dma_base() to find out which PCI BAR to use, remove no longer needed cs5520.c::cs5520_init_setup_dma() and ide_pci_device_t.init_setup_dma. This fixes PCI bus-mastering not being checked for CS5510/CS5520 hosts. v2: * It is wrong to check simplex bits on CS5510/CS5520 as v1 did. (Noticed by Alan). Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-10-19 00:30:07 +02:00
Bartlomiej Zolnierkiewicz	47b687882c	ide: add IDE_HFLAG_NO_{DMA,AUTODMA} host flags Add IDE_HFLAG_NO_{DMA,AUTODMA} host flags. Convert all host drivers using ide_pci_device_t to use these flags instead of d->autodma and then remove no longer needed d->autodma. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-10-19 00:30:06 +02:00
Bartlomiej Zolnierkiewicz	7cab14a799	ide: add IDE_HFLAG_BOOTABLE host flag Add IDE_HFLAG_BOOTABLE host flag and IDE_HFLAG_OFF_BOARD define. Convert all host drivers using ide_pci_device_t to use IDE_HFLAG_{BOOTABLE,OFF_BOARD} instead of d->bootable and then remove no longer needed d->bootable. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-10-19 00:30:06 +02:00
Bartlomiej Zolnierkiewicz	33c1002ed9	ide: add IDE_HFLAG_NO_ATAPI_DMA host flag Add IDE_HFLAG_NO_ATAPI_DMA host flag and set it in host drivers which don't support ATAPI DMA. Then remove no longer needed hwif->atapi_dma. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-10-19 00:30:06 +02:00
Benjamin Herrenschmidt	1b67834712	ide: Add ide_get_paired_drive() helper This adds a helper to get to the "other" drive on a pair connected to a given hwif. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Andrew Morton <akpm@osdl.org> Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-10-19 00:30:05 +02:00
Linus Torvalds	58f9b52ee8	Merge ssh://master.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-hrt * ssh://master.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-hrt: hrtimer: hook compat_sys_nanosleep up to high res timer code hrtimer: Rework hrtimer_nanosleep to make sys_compat_nanosleep easier	2007-10-18 15:12:41 -07:00
Linus Torvalds	2af170dd24	Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev: [libata] kill ata_sg_is_last() Update libata driver for bf548 atapi controller against the 2.6.24 tree. libata-sff: Correct use of check_status() drivers/ata: add support to Freescale 3.0Gbps SATA Controller pata_acpi: fix build breakage if !CONFIG_PM	2007-10-18 15:08:35 -07:00
Linus Torvalds	54e840dd50	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched * git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched: sched: reduce schedstat variable overhead a bit sched: add KERN_CONT annotation sched: cleanup, make struct rq comments more consistent sched: cleanup, fix spacing sched: fix return value of wait_for_completion_interruptible()	2007-10-18 14:54:03 -07:00
Linus Torvalds	a57793651f	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (51 commits) [IPV6]: Fix again the fl6_sock_lookup() fixed locking [NETFILTER]: nf_conntrack_tcp: fix connection reopening fix [IPV6]: Fix race in ipv6_flowlabel_opt() when inserting two labels [IPV6]: Lost locking in fl6_sock_lookup [IPV6]: Lost locking when inserting a flowlabel in ipv6_fl_list [NETFILTER]: xt_sctp: fix mistake to pass a pointer where array is required [NET]: Fix OOPS due to missing check in dev_parse_header(). [TCP]: Remove lost_retrans zero seqno special cases [NET]: fix carrier-on bug? [NET]: Fix uninitialised variable in ip_frag_reasm() [IPSEC]: Rename mode to outer_mode and add inner_mode [IPSEC]: Disallow combinations of RO and AH/ESP/IPCOMP [IPSEC]: Use the top IPv4 route's peer instead of the bottom [IPSEC]: Store afinfo pointer in xfrm_mode [IPSEC]: Add missing BEET checks [IPSEC]: Move type and mode map into xfrm_state.c [IPSEC]: Fix length check in xfrm_parse_spi [IPSEC]: Move ip_summed zapping out of xfrm6_rcv_spi [IPSEC]: Get nexthdr from caller in xfrm6_rcv_spi [IPSEC]: Move tunnel parsing for IPv4 out of xfrm4_input ...	2007-10-18 14:40:30 -07:00
Linus Torvalds	9cf52b2921	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6 * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6: [SPARC/64]: Consolidate of_register_driver [SPARC] Videopix Frame Grabber: Convert device_lock_sem to mutex [SPARC]: Support for new termios. [SPARC64]: Check of_get_property() return in pci_determine_mem_io_space(). [SPARC64]: Fix boot failures due to bootmem. [SPARC64]: Implement atomic backoff.	2007-10-18 14:39:44 -07:00
Corey Minyard	d8c98618f4	IPMI: add 0.9 support Add support for IPMI 0.9 systems to the IPMI driver. Just handle a shorter get device ID command with less information. Signed-off-by: Corey Minyard <cminyard@mvista.com> Cc: Stian Jordet <liste@jordet.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:32 -07:00
Corey Minyard	fcfa472411	IPMI: add polled interface Currently the IPMI watchdog timer sets the watchdog timeout on a panic, but it doesn't actually poll the interface to make sure the message goes out. Add an interface for polling the IPMI driver, and add code to the IPMI watchdog timer to poll the interface when the timer is set from a panic. Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:32 -07:00
Ralf Baechle	e8c44319c6	Replace __attribute_pure__ with __pure To be consistent with the use of attributes in the rest of the kernel replace all use of __attribute_pure__ with __pure and delete the definition of __attribute_pure__. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Cc: Russell King <rmk@arm.linux.org.uk> Acked-by: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: Bryan Wu <bryan.wu@analog.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:32 -07:00
Miklos Szeredi	0e9663ee45	fuse: add blksize field to fuse_attr There are cases when the filesystem will be passed the buffer from a single read or write call, namely: 1) in 'direct-io' mode (not O_DIRECT), read/write requests don't go through the page cache, but go directly to the userspace fs 2) currently buffered writes are done with single page requests, but if Nick's ->perform_write() patch goes it, it will be possible to do larger write requests. But only if the original write() was also bigger than a page. In these cases the filesystem might want to give a hint to the app about the optimal I/O size. Allow the userspace filesystem to supply a blksize value to be returned by stat() and friends. If the field is zero, it defaults to the old PAGE_CACHE_SIZE value. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:31 -07:00
Miklos Szeredi	f33321141b	fuse: add support for mandatory locking For mandatory locking the userspace filesystem needs to know the lock ownership for read, write and truncate operations. This patch adds the necessary fields to the protocol. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:31 -07:00
Miklos Szeredi	b25e82e567	fuse: add helper for asynchronous writes This patch adds a new helper function fuse_write_fill() which makes it possible to send WRITE requests asynchronously. A new flag for WRITE requests is also added which indicates that this a write from the page cache, and not a "normal" file write. This patch is in preparation for writable mmap support. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:31 -07:00
Miklos Szeredi	a9ff4f8705	fuse: support BSD locking semantics It is trivial to add support for flock(2) semantics to the existing protocol, by setting the lock owner field to the file pointer, and passing a new FUSE_LK_FLOCK flag with the locking request. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:31 -07:00
Miklos Szeredi	6ff958edbf	fuse: add atomic open+truncate support This patch allows fuse filesystems to implement open(..., O_TRUNC) as a single request, instead of separate truncate and open requests. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:31 -07:00
Miklos Szeredi	17637cbaba	fuse: improve utimes support Add two new flags for setattr: FATTR_ATIME_NOW and FATTR_MTIME_NOW. These mean, that atime or mtime should be changed to the current time. Also it is now possible to update atime or mtime individually, not just together. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:30 -07:00
Miklos Szeredi	d139d7ffd0	VFS: allow filesystems to implement atomic open+truncate Add a new attribute flag ATTR_OPEN, with the meaning: "truncation was initiated by open() due to the O_TRUNC flag". This way filesystems wanting to implement truncation within their ->open() method can ignore such truncate requests. This is a quick & dirty hack, but it comes for free. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andreas Dilger <adilger@clusterfs.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:30 -07:00
Miklos Szeredi	c79e322f63	fuse: add file handle to getattr operation Add necessary protocol changes for supplying a file handle with the getattr operation. Step the API version to 7.9. This patch doesn't actually supply the file handle, because that needs some kind of VFS support, which we haven't yet been able to agree upon. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:30 -07:00
Takashi Sato	0f0a89ebe1	ext3: support large blocksize up to PAGESIZE This patch set supports large block size(>4k, <=64k) in ext3 just enlarging the block size limit. But it is NOT possible to have 64kB blocksize on ext3 without some changes to the directory handling code. The reason is that an empty 64kB directory block would have a rec_len == (__u16)2^16 == 0, and this would cause an error to be hit in the filesystem. The proposed solution is treat 64k rec_len with a an impossible value like rec_len = 0xffff to handle this. The Patch-set consists of the following 2 patches. [1/2] ext3: enlarge blocksize - Allow blocksize up to pagesize [2/2] ext3: fix rec_len overflow - prevent rec_len from overflow with 64KB blocksize Now on 64k page ppc64 box runs with this patch set we could create a 64k block size ext3, and able to handle empty directory block. Signed-off-by: Takashi Sato <sho@tnes.nec.co.jp> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Cc: <linux-ext4@vger.kernel.org> Acked-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:29 -07:00
Nick Piggin	b8dc93cbe9	bit_spin_lock: use lock bitops Convert bit_spin_lock to new locking bitops. Slub can use the non-atomic store version to clear (Christoph?) Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:29 -07:00
Satyam Sharma	761bb43190	Redefine {un}register_hotcpu_notifier() !HOTPLUG_CPU stubs The return of the present "do {} while" based stub definition of register_hotcpu_notifier() cannot be checked. This makes the stub asymmetric w.r.t. the real HOTPLUG_CPU=y implementation that is int-returning. So let us redefine this to be consistent with the full version. Also do the same for unregister_hotcpu_notifier(). We cannot define these as static inline functions due to an existing GCC bug (#33172). So define as macros that return appropriately instead (int '0' for the register_hotcpu_notifier case and void for unregister_hotcpu_notifier). Signed-off-by: Satyam Sharma <satyam@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:28 -07:00
Michael Neuling	f494f8fcb1	add-scaled-time-to-taskstats-based-process-accounting fix This moves the new items to the end of the taskstats struct as requested by Balbir and yourself. Cc: Balbir Singh <balbir@in.ibm.com> Cc: Jay Lan <jlan@engr.sgi.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:28 -07:00
Michael Neuling	c66f08be7e	Add scaled time to taskstats based process accounting This adds items to the taststats struct to account for user and system time based on scaling the CPU frequency and instruction issue rates. Adds account_(user\|system)_time_scaled callbacks which architectures can use to account for time using this mechanism. Signed-off-by: Michael Neuling <mikey@neuling.org> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Jay Lan <jlan@engr.sgi.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:28 -07:00
Jiri Slaby	65f76a82ec	Char: cyclades, fix some -W warnings Most of them are signedness, the rest unused function parameters. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:26 -07:00
Jiri Slaby	ebafeeff0f	Char: cyclades, remove bottom half processing The work done in bottom half doesn't cost much cpu time (e.g. tty_hangup itself schedules its own bottom half), it's possible to do the work in isr directly and save hence some .text. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Paul Fulghum <paulkf@microgate.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:26 -07:00
Andrew Morgan	72c2d5823f	V3 file capabilities: alter behavior of cap_setpcap The non-filesystem capability meaning of CAP_SETPCAP is that a process, p1, can change the capabilities of another process, p2. This is not the meaning that was intended for this capability at all, and this implementation came about purely because, without filesystem capabilities, there was no way to use capabilities without one process bestowing them on another. Since we now have a filesystem support for capabilities we can fix the implementation of CAP_SETPCAP. The most significant thing about this change is that, with it in effect, no process can set the capabilities of another process. The capabilities of a program are set via the capability convolution rules: pI(post-exec) = pI(pre-exec) pP(post-exec) = (X(aka cap_bset) & fP) \| (pI(post-exec) & fI) pE(post-exec) = fE ? pP(post-exec) : 0 at exec() time. As such, the only influence the pre-exec() program can have on the post-exec() program's capabilities are through the pI capability set. The correct implementation for CAP_SETPCAP (and that enabled by this patch) is that it can be used to add extra pI capabilities to the current process - to be picked up by subsequent exec()s when the above convolution rules are applied. Here is how it works: Let's say we have a process, p. It has capability sets, pE, pP and pI. Generally, p, can change the value of its own pI to pI' where (pI' & ~pI) & ~pP = 0. That is, the only new things in pI' that were not present in pI need to be present in pP. The role of CAP_SETPCAP is basically to permit changes to pI beyond the above: if (pE & CAP_SETPCAP) { pI' = anything; /* ie., even (pI' & ~pI) & ~pP != 0 */ } This capability is useful for things like login, which (say, via pam_cap) might want to raise certain inheritable capabilities for use by the children of the logged-in user's shell, but those capabilities are not useful to or needed by the login program itself. One such use might be to limit who can run ping. You set the capabilities of the 'ping' program to be "= cap_net_raw+i", and then only shells that have (pI & CAP_NET_RAW) will be able to run it. Without CAP_SETPCAP implemented as described above, login(pam_cap) would have to also have (pP & CAP_NET_RAW) in order to raise this capability and pass it on through the inheritable set. Signed-off-by: Andrew Morgan <morgan@kernel.org> Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: James Morris <jmorris@namei.org> Cc: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:24 -07:00
Eric W. Biederman	fc6cd25b73	sysctl: Error on bad sysctl tables After going through the kernels sysctl tables several times it has become clear that code review and testing is just not effective in prevent problematic sysctl tables from being used in the stable kernel. I certainly can't seem to fix the problems as fast as they are introduced. Therefore this patch adds sysctl_check_table which is called when a sysctl table is registered and checks to see if we have a problematic sysctl table. The biggest part of the code is the table of valid binary sysctl entries, but since we have frozen our set of binary sysctls this table should not need to change, and it makes it much easier to detect when someone unintentionally adds a new binary sysctl value. As best as I can determine all of the several hundred errors spewed on boot up now are legitimate. [bunk@kernel.org: kernel/sysctl_check.c must #include <linux/string.h>] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:23 -07:00
Eric W. Biederman	f429cd37a2	sysctl: properly register the irda binary sysctl numbers Grumble. These numbers should have been in sysctl.h from the beginning if we ever expected anyone to use them. Oh well put them there now so we can find them and make maintenance easier. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:23 -07:00
Eric W. Biederman	25398a158d	sysctl: parport remove binary paths The sysctl binary paths don't look as if they even code work, .data is not filled in, and all of the proc_handlers look at extra1 and there is not strategy routine. So just kill the binary paths. In addition this patch removes the setting of extra1 on directories. It doesn't look like the parport code ever examines it, and it's bad sysctl form. [bunk@kernel.org: remove parport_device_num()] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:23 -07:00
Eric W. Biederman	49a0c45833	sysctl: Factor out sysctl_data. There as been no easy way to wrap the default sysctl strategy routine except for returning 0. Which is not always what we want. The few instances I have seen that want different behaviour have written their own version of sysctl_data. While not too hard it is unnecessary code and has the potential for extra bugs. So to make these situations easier and make that part of sysctl more symetric I have factord sysctl_data out of do_sysctl_strategy and exported as a function everyone can use. Further having sysctl_data be an explicit function makes checking for badly formed sysctl tables much easier. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:22 -07:00
Eric W. Biederman	d8217f076b	sysctl core: Stop using the unnecessary ctl_table typedef In sysctl.h the typedef struct ctl_table ctl_table violates coding style isn't needed and is a bit of a nuisance because it makes it harder to recognize ctl_table is a type name. So this patch removes it from the generic sysctl code. Hopefully I will have enough energy to send the rest of my patches will follow and to remove it from the rest of the kernel. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:22 -07:00
Andrew Morton	8f286c33f1	stop using DMA_xxBIT_MASK Now that we have DMA_BIT_MASK(), these macros are pointless. Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:21 -07:00
Borislav Petkov	34c6538413	unify DMA_..BIT_MASK definitions: v3.1 Remove redundant DMA_..BIT_MASK definitions across two drivers. The computation of the majority of the bitmasks is done by the compiler. The initial split of the patch touching each a different file got removed due to possible git bisect breakage. Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Muli Ben-Yehuda <muli@il.ibm.com> Cc: Jeff Garzik <jeff@garzik.org> Cc: James Bottomley <James.Bottomley@steeleye.com> Reviewed-by: Satyam Sharma <satyam@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:21 -07:00
Tony Breeds	2c62214831	Fix discrepancy between VDSO based gettimeofday() and sys_gettimeofday(). On platforms that copy sys_tz into the vdso (currently only x86_64, soon to include powerpc), it is possible for the vdso to get out of sync if a user calls (admittedly unusual) settimeofday(NULL, ptr). This patch adds a hook for architectures that set CONFIG_GENERIC_TIME_VSYSCALL to ensure when sys_tz is updated they can also updatee their copy in the vdso. Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Cc: Andi Kleen <ak@suse.de> Cc: Tony Luck <tony.luck@intel.com> Acked-by: John Stultz <johnstul@us.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:20 -07:00
Alexey Dobriyan	6212e3a388	Remove struct task_struct::io_wait Hell knows what happened in commit 63b05203af57e7de4f3bb63b8b81d43bc196d32b during 2.6.9 development. Commit introduced io_wait field which remained write-only than and still remains write-only. Also garbage collect macros which "use" io_wait. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:20 -07:00
Rafael J. Wysocki	c7e0831d38	Hibernation: Check if ACPI is enabled during restore in the right place The following scenario leads to total confusion of the platform firmware on some boxes (eg. HPC nx6325): * Hibernate with ACPI enabled * Resume passing "acpi=off" to the boot kernel To prevent this from happening it's necessary to check if ACPI is enabled (and enable it if that's not the case) _right_ _after_ control has been transfered from the boot kernel to the image kernel, before device_power_up() is called (ie. with interrupts disabled). Enabling ACPI after calling device_power_up() turns out to be insufficient. For this reason, introduce new hibernation callback ->leave() that will be executed before device_power_up() by the restored image kernel. To make it work, it also is necessary to move swsusp_suspend() from swsusp.c to disk.c (it's name is changed to "create_image", which is more up to the point). Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:20 -07:00
Andres Salomon	8f4ce8c32f	serial: turn serial console suspend a boot rather than compile time option Currently, there's a CONFIG_DISABLE_CONSOLE_SUSPEND that allows one to stop the serial console from being suspended when the rest of the machine goes to sleep. This is incredibly useful for debugging power management-related things; however, having it as a compile-time option has proved to be incredibly inconvenient for us (OLPC). There are plenty of times that we want serial console to not suspend, but for the most part we'd like serial console to be suspended. This drops CONFIG_DISABLE_CONSOLE_SUSPEND, and replaces it with a kernel boot parameter (no_console_suspend). By default, the serial console will be suspended along with the rest of the system; by passing 'no_console_suspend' to the kernel during boot, serial console will remain alive during suspend. For now, this is pretty serial console specific; further fixes could be applied to make this work for things like netconsole. Signed-off-by: Andres Salomon <dilinger@debian.org> Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Nigel Cunningham <nigel@suspend2.net> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:19 -07:00
Rafael J. Wysocki	e42837bcd3	freezer: introduce freezer-friendly waiting macros Introduce freezer-friendly wrappers around wait_event_interruptible() and wait_event_interruptible_timeout(), originally defined in <linux/wait.h>, to be used in freezable kernel threads. Make some of the freezable kernel threads use them. This is necessary for the freezer to stop sending signals to kernel threads, which is implemented in the next patch. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Nigel Cunningham <nigel@nigel.suspend2.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:19 -07:00
Rafael J. Wysocki	b3dac3b304	PM: Rename hibernation_ops to platform_hibernation_ops Rename 'struct hibernation_ops' to 'struct platform_hibernation_ops' in analogy with 'struct platform_suspend_ops'. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Len Brown <lenb@kernel.org> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:18 -07:00
Rafael J. Wysocki	74f270af0c	PM: Rework struct hibernation_ops During hibernation we also need to tell the ACPI core that we're going to put the system into the S4 sleep state. For this reason, an additional method in 'struct hibernation_ops' is needed, playing the role of set_target() in 'struct platform_suspend_operations'. Moreover, the role of the .prepare() method is now different, so it's better to introduce another method, that in general may be different from .prepare(), that will be used to prepare the platform for creating the hibernation image (.prepare() is used anyway to notify the platform that we're going to enter the low power state after the image has been saved). Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:18 -07:00
Rafael J. Wysocki	f242d9196f	PM: Make suspend_ops static The variable suspend_ops representing the set of global platform-specific suspend-related operations, used by the PM core, need not be exported outside of kernel/power/main.c . Make it static. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:18 -07:00
Rafael J. Wysocki	e6c5eb9541	PM: Rework struct platform_suspend_ops There is no reason why the .prepare() and .finish() methods in 'struct platform_suspend_ops' should take any arguments, since architectures don't use these methods' argument in any practically meaningful way (ie. either the target system sleep state is conveyed to the platform by .set_target(), or there is only one suspend state supported and it is indicated to the PM core by .valid(), or .prepare() and .finish() aren't defined at all). There also is no reason why .finish() should return any result. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Len Brown <lenb@kernel.org> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:18 -07:00
Rafael J. Wysocki	26398a70ea	PM: Rename struct pm_ops and related things The name of 'struct pm_ops' suggests that it is related to the power management in general, but in fact it is only related to suspend. Moreover, its name should indicate what this structure is used for, so it seems reasonable to change it to 'struct platform_suspend_ops'. In that case, the name of the global variable of this type used by the PM core and the names of related functions should be changed accordingly. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Len Brown <lenb@kernel.org> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:18 -07:00
Rafael J. Wysocki	95d9ffbe01	PM: Move definition of struct pm_ops to suspend.h Move the definition of 'struct pm_ops' and related functions from <linux/pm.h> to <linux/suspend.h> . There are, at least, the following reasons to do that: * 'struct pm_ops' is specifically related to suspend and not to the power management in general. * As long as 'struct pm_ops' is defined in <linux/pm.h>, any modification of it causes the entire kernel to be recompiled, which is unnecessary and annoying. * Some suspend-related features are already defined in <linux/suspend.h>, so it is logical to move the definition of 'struct pm_ops' into there. * 'struct hibernation_ops', being the hibernation-related counterpart of 'struct pm_ops', is defined in <linux/suspend.h> . Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Len Brown <lenb@kernel.org> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:18 -07:00
Anton Blanchard	04c227140f	hrtimer: Rework hrtimer_nanosleep to make sys_compat_nanosleep easier Pull the copy_to_user out of hrtimer_nanosleep and into the callers (common_nsleep, sys_nanosleep) in preparation for converting compat_sys_nanosleep to use hrtimers. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-10-18 22:54:18 +02:00
Jeff Garzik	3be6cbd73f	[libata] kill ata_sg_is_last() Short term, this works around a bug introduced by early sg-chaining work. Long term, removing this function eliminates a branch from a hot path loop in each scatter/gather table build. Also, as this code demonstrates, we don't need to _track_ the end of the s/g list, as long as we mark it in some way. And doing so programatically is nice. So its a useful cleanup, regardless of its short term effects. Based conceptually on a quick patch by Jens Axboe. Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2007-10-18 16:21:18 -04:00
Ken Chen	480b9434c5	sched: reduce schedstat variable overhead a bit schedstat is useful in investigating CPU scheduler behavior. Ideally, I think it is beneficial to have it on all the time. However, the cost of turning it on in production system is quite high, largely due to number of events it collects and also due to its large memory footprint. Most of the fields probably don't need to be full 64-bit on 64-bit arch. Rolling over 4 billion events will most like take a long time and user space tool can be made to accommodate that. I'm proposing kernel to cut back most of variable width on 64-bit system. (note, the following patch doesn't affect 32-bit system). Signed-off-by: Ken Chen <kenchen@google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-18 21:32:56 +02:00
Li Zefan	009e8c965f	[NETFILTER]: xt_sctp: fix mistake to pass a pointer where array is required Macros like SCTP_CHUNKMAP_XXX(chukmap) require chukmap to be an array, but match_packet() passes a pointer to these macros. Also remove the ELEMCOUNT macro and fix a bug in SCTP_CHUNKMAP_COPY. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-18 05:12:21 -07:00
Patrick McHardy	1b83336bb9	[NET]: Fix OOPS due to missing check in dev_parse_header(). [ This is kernel bugzilla 9174 "linux-2.6.23-git11 kernel panic" ] The device in question is an IPv6-over-IPv4 tunnel, which doesn't have any header_ops, so the crash happens in dev_parse_header when dereferencing them. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-18 05:09:28 -07:00
Pavel Emelyanov	55b333253d	[NET]: Introduce the sk_detach_filter() call Filter is attached in a separate function, so do the same for filter detaching. This also removes one variable sock_setsockopt(). Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:21:26 -07:00
Stephen Rothwell	5c45708352	[SPARC/64]: Consolidate of_register_driver Also of_unregister_driver. These will be shortly also used by the PowerPC code. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-17 21:17:42 -07:00
Stephen Hemminger	c264c3dee9	napi_synchronize: waiting for NAPI Some drivers with shared NAPI need a synchronization barrier. Also suggested by Benjamin Herrenschmidt for EMAC. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-10-17 20:17:34 -04:00
Aneesh Kumar K.V	d8dd0b4543	ext4: Convert ext4_extent_idx.ei_leaf to ext4_extent_idx.ei_leaf_lo Convert ext4_extent_idx.ei_leaf ext4_extent_idx.ei_leaf_lo This helps in finding BUGs due to direct partial access of these split 48 bit values. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2007-10-17 18:50:03 -04:00
Aneesh Kumar K.V	b377611d11	ext4: Convert ext4_extent.ee_start to ext4_extent.ee_start_lo Convert ext4_extent.ee_start to ext4_extent.ee_start_lo This helps in finding BUGs due to direct partial access of these split 48 bit values Also fix direct partial access in ext4 code Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2007-10-17 18:50:03 -04:00
Aneesh Kumar K.V	308ba3ece7	ext4: Convert s_r_blocks_count and s_free_blocks_count Convert s_r_blocks_count and s_free_blocks_count to s_r_blocks_count_lo and s_free_blocks_count_lo This helps in finding BUGs due to direct partial access of these split 64 bit values Also fix direct partial access in ext4 code Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2007-10-17 18:50:02 -04:00
Aneesh Kumar K.V	6bc9feff14	ext4: Convert s_blocks_count to s_blocks_count_lo Convert s_blocks_count to s_blocks_count_lo This helps in finding BUGs due to direct partial access of these split 64 bit values Also fix direct partial access in ext4 code Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2007-10-17 18:50:02 -04:00
Aneesh Kumar K.V	5272f83727	ext4: Convert bg_inode_bitmap and bg_inode_table Convert bg_inode_bitmap and bg_inode_table to bg_inode_bitmap_lo and bg_inode_table_lo. This helps in finding BUGs due to direct partial access of these split 64 bit values Also fix one direct partial access Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2007-10-17 18:50:02 -04:00
Aneesh Kumar K.V	3a14589cce	ext4: Convert bg_block_bitmap to bg_block_bitmap_lo Convert bg_block_bitmap to bg_block_bitmap_lo This helps in catching some BUGS due to direct partial access of these split fields. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2007-10-17 18:50:01 -04:00
Jose R. Santos	ce42158179	ext4: FLEX_BG Kernel support v2. This feature relaxes check restrictions on where each block groups meta data is located within the storage media. This allows for the allocation of bitmaps or inode tables outside the block group boundaries in cases where bad blocks forces us to look for new blocks which the owning block group can not satisfy. This will also allow for new meta-data allocation schemes to improve performance and scalability. Signed-off-by: Jose R. Santos <jrs@us.ibm.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2007-10-17 18:50:01 -04:00
Aneesh Kumar K.V	c1bddad949	ext4: Fix sparse warnings Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2007-10-17 18:50:01 -04:00
Andreas Dilger	717d50e497	Ext4: Uninitialized Block Groups In pass1 of e2fsck, every inode table in the fileystem is scanned and checked, regardless of whether it is in use. This is this the most time consuming part of the filesystem check. The unintialized block group feature can greatly reduce e2fsck time by eliminating checking of uninitialized inodes. With this feature, there is a a high water mark of used inodes for each block group. Block and inode bitmaps can be uninitialized on disk via a flag in the group descriptor to avoid reading or scanning them at e2fsck time. A checksum of each group descriptor is used to ensure that corruption in the group descriptor's bit flags does not cause incorrect operation. The feature is enabled through a mkfs option mke2fs /dev/ -O uninit_groups A patch adding support for uninitialized block groups to e2fsprogs tools has been posted to the linux-ext4 mailing list. The patches have been stress tested with fsstress and fsx. In performance tests testing e2fsck time, we have seen that e2fsck time on ext3 grows linearly with the total number of inodes in the filesytem. In ext4 with the uninitialized block groups feature, the e2fsck time is constant, based solely on the number of used inodes rather than the total inode count. Since typical ext4 filesystems only use 1-10% of their inodes, this feature can greatly reduce e2fsck time for users. With performance improvement of 2-20 times, depending on how full the filesystem is. The attached graph shows the major improvements in e2fsck times in filesystems with a large total inode count, but few inodes in use. In each group descriptor if we have EXT4_BG_INODE_UNINIT set in bg_flags: Inode table is not initialized/used in this group. So we can skip the consistency check during fsck. EXT4_BG_BLOCK_UNINIT set in bg_flags: No block in the group is used. So we can skip the block bitmap verification for this group. We also add two new fields to group descriptor as a part of uninitialized group patch. __le16 bg_itable_unused; /* Unused inodes count / __le16 bg_checksum; / crc16(sb_uuid+group+desc) */ bg_itable_unused: If we have EXT4_BG_INODE_UNINIT not set in bg_flags then bg_itable_unused will give the offset within the inode table till the inodes are used. This can be used by fsck to skip list of inodes that are marked unused. bg_checksum: Now that we depend on bg_flags and bg_itable_unused to determine the block and inode usage, we need to make sure group descriptor is not corrupt. We add checksum to group descriptor to detect corruption. If the descriptor is found to be corrupt, we mark all the blocks and inodes in the group used. Signed-off-by: Avantika Mathur <mathur@us.ibm.com> Signed-off-by: Andreas Dilger <adilger@clusterfs.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2007-10-17 18:50:00 -04:00
Eric Sandeen	4074fe3736	ext4: remove #ifdef CONFIG_EXT4_INDEX CONFIG_EXT4_INDEX is not an exposed config option in the kernel, and it is unconditionally defined in ext4_fs.h. tune2fs is already able to turn off dir indexing, so at this point it's just cluttering up the code. Remove it. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2007-10-17 18:50:00 -04:00
Coly Li	f077d0d7ea	ext4: Remove (partial, never completed) fragment support Fragment support in ext2/3/4 was never implemented, and it probably will never be implemented. So remove it from ext4. Signed-off-by: Coly Li <coyli@suse.de> Acked-by: Andreas Dilger <adilger@clusterfs.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2007-10-17 18:49:59 -04:00
Mingming Cao	cd02ff0b14	jbd2: JBD_XXX to JBD2_XXX naming cleanup change JBD_XXX macros to JBD2_XXX in JBD2/Ext4 Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2007-10-17 18:49:58 -04:00
Mingming Cao	2d917969bc	JBD2: replace jbd_kmalloc with kmalloc directly. This patch cleans up jbd_kmalloc and replace it with kmalloc directly Signed-off-by: Mingming Cao <cmm@us.ibm.com>	2007-10-17 18:49:57 -04:00
Mingming Cao	a5005da204	JBD: replace jbd_kmalloc with kmalloc directly This patch cleans up jbd_kmalloc and replace it with kmalloc directly Signed-off-by: Mingming Cao <cmm@us.ibm.com>	2007-10-17 18:49:57 -04:00
Mingming Cao	af1e76d6b3	JBD2: jbd2 slab allocation cleanups JBD2: Replace slab allocations with page allocations JBD2 allocate memory for committed_data and frozen_data from slab. However JBD2 should not pass slab pages down to the block layer. Use page allocator pages instead. This will also prepare JBD for the large blocksize patchset. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com>	2007-10-17 18:49:56 -04:00
Mingming Cao	c089d490df	JBD: JBD slab allocation cleanups JBD: Replace slab allocations with page allocations JBD allocate memory for committed_data and frozen_data from slab. However JBD should not pass slab pages down to the block layer. Use page allocator pages instead. This will also prepare JBD for the large blocksize patchset. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com>	2007-10-17 18:49:56 -04:00
Pierre Ossman	727c26ed78	net: libertas sdio driver Add driver for Marvell's Libertas 8385 and 8686 wifi chips. Signed-off-by: Pierre Ossman <drzeus@drzeus.cx> Acked-by: Dan Williams <dcbw@redhat.com>	2007-10-17 22:51:13 +02:00
Linus Torvalds	e6d5a11dad	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched * git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched: sched: fix new task startup crash sched: fix !SYSFS build breakage sched: fix improper load balance across sched domain sched: more robust sd-sysctl entry freeing	2007-10-17 09:11:18 -07:00
Linus Torvalds	c548f08a4f	Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (24 commits) [POWERPC] Fix vmemmap warning in init_64.c [POWERPC] Fix 64 bits vDSO DWARF info for CR register [POWERPC] Add 1TB workaround for PA6T [POWERPC] Enable NO_HZ and high res timers for pseries and ppc64 configs [POWERPC] Quieten cache information at boot [POWERPC] Quieten clockevent printk [POWERPC] Enable SLUB in *_defconfig [POWERPC] Fix 1TB segment detection [POWERPC] Fix iSeries_hpte_insert prototype [POWERPC] Fix copyright symbol [POWERPC] ibmebus: Move to of_device and of_platform_driver, match eHCA and eHEA drivers [POWERPC] ibmebus: Add device creation and bus probing based on of_device [POWERPC] ibmebus: Remove bus match/probe/remove functions [POWERPC] Move of_device allocation into of_device.[ch] [POWERPC] mpc52xx: device tree changes for FEC and MDIO [POWERPC] bestcomm: GenBD task support [POWERPC] bestcomm: FEC task support [POWERPC] bestcomm: ATA task support [POWERPC] bestcomm: core bestcomm support for Freescale MPC5200 [POWERPC] mpc52xx: Update mpc52xx_psc structure with B revision changes ...	2007-10-17 09:05:55 -07:00
Linus Torvalds	5c8e191e84	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup: Remove magic macros for screen_info structure members [x86] remove uses of magic macros for boot_params access	2007-10-17 09:00:30 -07:00
Adrian Bunk	cbfee34520	security/ cleanups This patch contains the following cleanups that are now possible: - remove the unused security_operations->inode_xattr_getsuffix - remove the no longer used security_operations->unregister_security - remove some no longer required exit code - remove a bunch of no longer used exports Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: James Morris <jmorris@namei.org> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:07 -07:00
Serge E. Hallyn	b53767719b	Implement file posix capabilities Implement file posix capabilities. This allows programs to be given a subset of root's powers regardless of who runs them, without having to use setuid and giving the binary all of root's powers. This version works with Kaigai Kohei's userspace tools, found at http://www.kaigai.gr.jp/index.php. For more information on how to use this patch, Chris Friedhoff has posted a nice page at http://www.friedhoff.org/fscaps.html. Changelog: Nov 27: Incorporate fixes from Andrew Morton (security-introduce-file-caps-tweaks and security-introduce-file-caps-warning-fix) Fix Kconfig dependency. Fix change signaling behavior when file caps are not compiled in. Nov 13: Integrate comments from Alexey: Remove CONFIG_ ifdef from capability.h, and use %zd for printing a size_t. Nov 13: Fix endianness warnings by sparse as suggested by Alexey Dobriyan. Nov 09: Address warnings of unused variables at cap_bprm_set_security when file capabilities are disabled, and simultaneously clean up the code a little, by pulling the new code into a helper function. Nov 08: For pointers to required userspace tools and how to use them, see http://www.friedhoff.org/fscaps.html. Nov 07: Fix the calculation of the highest bit checked in check_cap_sanity(). Nov 07: Allow file caps to be enabled without CONFIG_SECURITY, since capabilities are the default. Hook cap_task_setscheduler when !CONFIG_SECURITY. Move capable(TASK_KILL) to end of cap_task_kill to reduce audit messages. Nov 05: Add secondary calls in selinux/hooks.c to task_setioprio and task_setscheduler so that selinux and capabilities with file cap support can be stacked. Sep 05: As Seth Arnold points out, uid checks are out of place for capability code. Sep 01: Define task_setscheduler, task_setioprio, cap_task_kill, and task_setnice to make sure a user cannot affect a process in which they called a program with some fscaps. One remaining question is the note under task_setscheduler: are we ok with CAP_SYS_NICE being sufficient to confine a process to a cpuset? It is a semantic change, as without fsccaps, attach_task doesn't allow CAP_SYS_NICE to override the uid equivalence check. But since it uses security_task_setscheduler, which elsewhere is used where CAP_SYS_NICE can be used to override the uid equivalence check, fixing it might be tough. task_setscheduler note: this also controls cpuset:attach_task. Are we ok with CAP_SYS_NICE being used to confine to a cpuset? task_setioprio task_setnice sys_setpriority uses this (through set_one_prio) for another process. Need same checks as setrlimit Aug 21: Updated secureexec implementation to reflect the fact that euid and uid might be the same and nonzero, but the process might still have elevated caps. Aug 15: Handle endianness of xattrs. Enforce capability version match between kernel and disk. Enforce that no bits beyond the known max capability are set, else return -EPERM. With this extra processing, it may be worth reconsidering doing all the work at bprm_set_security rather than d_instantiate. Aug 10: Always call getxattr at bprm_set_security, rather than caching it at d_instantiate. [morgan@kernel.org: file-caps clean up for linux/capability.h] [bunk@kernel.org: unexport cap_inode_killpriv] Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: James Morris <jmorris@namei.org> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Andrew Morgan <morgan@kernel.org> Signed-off-by: Andrew Morgan <morgan@kernel.org> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:07 -07:00
Alexey Dobriyan	57c521ce61	ifdef struct task_struct::security For those who don't care about CONFIG_SECURITY. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: Casey Schaufler <casey@schaufler-ca.com> Cc: James Morris <jmorris@namei.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:07 -07:00
James Morris	20510f2f4e	security: Convert LSM into a static interface Convert LSM into a static interface, as the ability to unload a security module is not required by in-tree users and potentially complicates the overall security architecture. Needlessly exported LSM symbols have been unexported, to help reduce API abuse. Parameters for the capability and root_plug modules are now specified at boot. The SECURITY_FRAMEWORK_VERSION macro has also been removed. In a nutshell, there is no safe way to unload an LSM. The modular interface is thus unecessary and broken infrastructure. It is used only by out-of-tree modules, which are often binary-only, illegal, abusive of the API and dangerous, e.g. silently re-vectoring SELinux. [akpm@linux-foundation.org: cleanups] [akpm@linux-foundation.org: USB Kconfig fix] [randy.dunlap@oracle.com: fix LSM kernel-doc] Signed-off-by: James Morris <jmorris@namei.org> Acked-by: Chris Wright <chrisw@sous-sol.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: "Serge E. Hallyn" <serue@us.ibm.com> Acked-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:07 -07:00
Dave Hansen	ce8d2cdf3d	r/o bind mounts: filesystem helpers for custom 'struct file's Why do we need r/o bind mounts? This feature allows a read-only view into a read-write filesystem. In the process of doing that, it also provides infrastructure for keeping track of the number of writers to any given mount. This has a number of uses. It allows chroots to have parts of filesystems writable. It will be useful for containers in the future because users may have root inside a container, but should not be allowed to write to somefilesystems. This also replaces patches that vserver has had out of the tree for several years. It allows security enhancement by making sure that parts of your filesystem read-only (such as when you don't trust your FTP server), when you don't want to have entire new filesystems mounted, or when you want atime selectively updated. I've been using the following script to test that the feature is working as desired. It takes a directory and makes a regular bind and a r/o bind mount of it. It then performs some normal filesystem operations on the three directories, including ones that are expected to fail, like creating a file on the r/o mount. This patch: Some filesystems forego the vfs and may_open() and create their own 'struct file's. This patch creates a couple of helper functions which can be used by these filesystems, and will provide a unified place which the r/o bind mount code may patch. Also, rename an existing, static-scope init_file() to a less generic name. Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:04 -07:00
Bjorn Helgaas	402b310cb6	PNP: remove null pointer checks Remove some null pointer checks. Null pointers in these areas indicate programming errors, and I think it's better to oops immediately rather than return an error that is easily ignored. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Adam Belay <ambx1@neo.rr.com> Cc: Len Brown <lenb@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:04 -07:00
Adrian Bunk	5ebf2c1260	bitmap.h: remove dead artifacts bitmap_active() no longer exists and BITMAP_ACTIVE is no longer used. Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:03 -07:00
Martin J. Bligh	a686cd898b	ext2 reservations Val's cross-port of the ext3 reservations code into ext2. [mbligh@mbligh.org: Small type error for printk [akpm@linux-foundation.org: fix types, sync with ext3] [mbligh@mbligh.org: Bring ext2 reservations code in line with latest ext3] [akpm@linux-foundation.org: kill noisy printk] [akpm@linux-foundation.org: remember to dirty the gdp's block] [akpm@linux-foundation.org: cross-port the missed 5dea5176e5c32ef9f0d1a41d28427b3bf6881b3a] [akpm@linux-foundation.org: cross-port e6022603b9aa7d61d20b392e69edcdbbc1789969] [akpm@linux-foundation.org: Port the omitted 08fb306fe63d98eb86e3b16f4cc21816fa47f18e] [akpm@linux-foundation.org: Backport the missed 20acaa18d0c002fec180956f87adeb3f11f635a6] [akpm@linux-foundation.org: fixes] [cmm@us.ibm.com: fix reservation extension] [bunk@stusta.de: make ext2_get_blocks() static] [hugh@veritas.com: fix hang] [hugh@veritas.com: ext2_new_blocks should reset the reservation window size] [hugh@veritas.com: ext2 balloc: fix off-by-one against rsv_end] [hugh@veritas.com: grp_goal 0 is a genuine goal (unlike -1), so ext2_try_to_allocate_with_rsv should treat it as such] [hugh@veritas.com: rbtree usage cleanup] [pbadari@us.ibm.com: Fix for ext2 reservation] [bunk@kernel.org: remove fs/ext2/balloc.c:reserve_blocks()] [hugh@veritas.com: ext2 balloc: use io_error label] Cc: "Martin J. Bligh" <mbligh@mbligh.org> Cc: Valerie Henson <val_henson@linux.intel.com> Cc: Mingming Cao <cmm@us.ibm.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:02 -07:00
Joern Engel	1c0eeaf569	introduce I_SYNC I_LOCK was used for several unrelated purposes, which caused deadlock situations in certain filesystems as a side effect. One of the purposes now uses the new I_SYNC bit. Also document the various bits and change their order from historical to logical. [bunk@stusta.de: make fs/inode.c:wake_up_inode() static] Signed-off-by: Joern Engel <joern@wohnheim.fh-wedel.de> Cc: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Cc: David Chinner <dgc@sgi.com> Cc: Anton Altaparmakov <aia21@cam.ac.uk> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:02 -07:00
Fengguang Wu	2e6883bdf4	writeback: introduce writeback_control.more_io to indicate more io After making dirty a 100M file, the normal behavior is to start the writeback for all data after 30s delays. But sometimes the following happens instead: - after 30s: ~4M - after 5s: ~4M - after 5s: all remaining 92M Some analyze shows that the internal io dispatch queues goes like this: s_io s_more_io ------------------------- 1) 100M,1K 0 2) 1K 96M 3) 0 96M 1) initial state with a 100M file and a 1K file 2) 4M written, nr_to_write <= 0, so write more 3) 1K written, nr_to_write > 0, no more writes(BUG) nr_to_write > 0 in (3) fools the upper layer to think that data have all been written out. The big dirty file is actually still sitting in s_more_io. We cannot simply splice s_more_io back to s_io as soon as s_io becomes empty, and let the loop in generic_sync_sb_inodes() continue: this may starve newly expired inodes in s_dirty. It is also not an option to draw inodes from both s_more_io and s_dirty, an let the loop go on: this might lead to live locks, and might also starve other superblocks in sync time(well kupdate may still starve some superblocks, that's another bug). We have to return when a full scan of s_io completes. So nr_to_write > 0 does not necessarily mean that "all data are written". This patch introduces a flag writeback_control.more_io to indicate this situation. With it the big dirty file no longer has to wait for the next kupdate invocation 5s later. Cc: David Chinner <dgc@sgi.com> Cc: Ken Chen <kenchen@google.com> Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:02 -07:00
Fengguang Wu	08d8e9749e	writeback: fix ntfs with sb_has_dirty_inodes() NTFS's if-condition on dirty inodes is not complete. Fix it with sb_has_dirty_inodes(). Cc: Anton Altaparmakov <aia21@cantab.net> Cc: Ken Chen <kenchen@google.com> Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:02 -07:00
Ken Chen	0e0f4fc22e	writeback: fix periodic superblock dirty inode flushing Current -mm tree has bucketful of bug fixes in periodic writeback path. However, we still hit a glitch where dirty pages on a given inode aren't completely flushed to the disk, and system will accumulate large amount of dirty pages beyond what dirty_expire_interval is designed for. The problem is __sync_single_inode() will move an inode to sb->s_dirty list even when there are more pending dirty pages on that inode. If there is another inode with a small number of dirty pages, we hit a case where the loop iteration in wb_kupdate() terminates prematurely because wbc.nr_to_write > 0. Thus leaving the inode that has large amount of dirty pages behind and it has to wait for another dirty_writeback_interval before we flush it again. We effectively only write out MAX_WRITEBACK_PAGES every dirty_writeback_interval. If the rate of dirtying is sufficiently high, the system will start accumulate a large number of dirty pages. So fix it by having another sb->s_more_io list on which to park the inode while we iterate through sb->s_io and to allow each dirty inode which resides on that sb to have an equal chance of flushing some amount of dirty pages. Signed-off-by: Ken Chen <kenchen@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:02 -07:00
Ingo Molnar	4749252776	printk: add KERN_CONT annotation printk: add the KERN_CONT annotation (which is empty string but via which checkpatch.pl can notice that the lacking KERN_ level is fine). This useful for multiple calls of hand-crafted printk output done by early debug code or similar. Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:01 -07:00
Ulrich Drepper	22d2b35b20	F_DUPFD_CLOEXEC implementation One more small change to extend the availability of creation of file descriptors with FD_CLOEXEC set. Adding a new command to fcntl() requires no new system call and the overall impact on code size if minimal. If this patch gets accepted we will also add this change to the next revision of the POSIX spec. To test the patch, use the following little program. Adjust the value of F_DUPFD_CLOEXEC appropriately. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ #include <errno.h> #include <fcntl.h> #include <stdio.h> #include <stdlib.h> #include <unistd.h> #ifndef F_DUPFD_CLOEXEC # define F_DUPFD_CLOEXEC 12 #endif int main (int argc, char *argv[]) { if (argc > 1) { if (fcntl (3, F_GETFD) == 0) { puts ("descriptor not closed"); exit (1); } if (errno != EBADF) { puts ("error not EBADF"); exit (1); } exit (0); } int fd = fcntl (STDOUT_FILENO, F_DUPFD_CLOEXEC, 0); if (fd == -1 && errno == EINVAL) { puts ("F_DUPFD_CLOEXEC not supported"); return 0; } if (fd != 3) { puts ("program called with descriptors other than 0,1,2"); return 1; } execl ("/proc/self/exe", "/proc/self/exe", "1", NULL); puts ("execl failed"); return 1; } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Ulrich Drepper <drepper@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: <linux-arch@vger.kernel.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:01 -07:00
Alexey Dobriyan	18796aa002	task_struct: move ->fpu_counter and ->oomkilladj There is nice 2 byte hole after struct task_struct::ioprio field into which we can put two 1-byte fields: ->fpu_counter and ->oomkilladj. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Acked-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:01 -07:00
Davide Libenzi	96358de6bc	rename signalfd_siginfo fields For Michael Kerrisk request, the following patch renames signalfd_siginfo fields in order to keep them consistent with the siginfo_t ones. Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:01 -07:00
Eric Sandeen	059590f495	ext3: remove #ifdef CONFIG_EXT3_INDEX CONFIG_EXT3_INDEX is not an exposed config option in the kernel, and it is unconditionally defined in ext3_fs.h. tune2fs is already able to turn off dir indexing, so at this point it's just cluttering up the code. Remove it. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:01 -07:00
Ahmed S. Darwish	b4471cbb09	Completely remove deprecated IRQ flags (SA_) Only very little files use the deprecated SA_ IRQ flags in latest pull. This patch series removes such macros from the tree and transfrom old code to the new IRQF_* flags. I've grepped the whole tree to make sure that no more files than the patched ones use such deprecated macros. I hope this series won't introduce build errors. Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Matthew Wilcox <willy@debian.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:00 -07:00
Andrey Mirkin	fd5eea4214	change inotifyfs magic as the same magic is used for futexfs Right now futexfs and inotifyfs have one magic 0xBAD1DEA, that looks a little bit confusing. Use 0xBAD1DEA as magic for futexfs and 0x2BAD1DEA as magic for inotifyfs. Signed-off-by: Andrey Mirkin <major@openvz.org> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:00 -07:00
Olaf Hering	4f9a58d75b	increase AT_VECTOR_SIZE to terminate saved_auxv properly include/asm-powerpc/elf.h has 6 entries in ARCH_DLINFO. fs/binfmt_elf.c has 14 unconditional NEW_AUX_ENT entries and 2 conditional NEW_AUX_ENT entries. So in the worst case, saved_auxv does not get an AT_NULL entry at the end. The saved_auxv array must be terminated with an AT_NULL entry. Make the size of mm_struct->saved_auxv arch dependend, based on the number of ARCH_DLINFO entries. Signed-off-by: Olaf Hering <olh@suse.de> Cc: Roland McGrath <roland@redhat.com> Cc: Jakub Jelinek <jakub@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:00 -07:00
Pavel Emelyanov	1efd24fa05	Remove unused member from nsproxy The nslock spinlock is not used in the kernel at all. Remove it. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:59 -07:00
Alexey Dobriyan	970a8645ca	user.c: #ifdef ->mq_bytes For those who deselect POSIX message queues. Reduces SLAB size of user_struct from 64 to 32 bytes here, SLUB size -- from 40 bytes to 32 bytes. [akpm@linux-foundation.org: fix build] Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:59 -07:00
Alan Cox	5f519d7281	tty: expose new methods needed for drivers to get termios right This adds three new functions (or in one case to be more exact makes it always available) tty_termios_copy_hw Copies all the hardware settings from one termios structure to the other. This is intended for drivers that support little or no hardware setting tty_termios_encode_baud_rate Allows you to set the input and output baud rate in a termios structure. A driver is supposed to set the resulting baud rate from a request so most will want to use this function to set the resulting input and output rates to match the hardware values. Internally it knows about keeping Bxxx encoding when possible to maximise compatibility. tty_encode_baud_rate As above but for the tty's own current termios structure I suspect this will initially need some tweaking as it gets enabled by driver patches over the next few mm cycles so consider this lot -mm only for the moment so it can stabilize and end up neat before it goes to base. I've tried not to break any obscure architectures - if you get a speed you can't represent the code will print warnings on non updated termios systems but not break. Once this is merged and seems sane I've got a growing pile of driver updates to use it - notably for USB serial drivers. [akpm@linux-foundation.org: cleanups] Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:58 -07:00
Denys Vlasenko	b9ec0339d8	add consts where appropriate in fs/nls/* Add const modifiers to a few struct nls_table's member pointers in include/linux/nls.h and adds a lot of const's in fs/nls/*.c files. Resulting changes as visible by size: text data bss dec hex filename 113612 481216 2368 597196 91ccc nls.org/built-in.o 593548 3296 288 597132 91c8c nls/built-in.o Apparently compiler managed to optimize code a bit better because of const-ness. No other changes are made. Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:58 -07:00
Denis V. Lunev	37c42524d6	shrink_dcache_sb speedup This patch makes shrink_dcache_sb consistent with dentry pruning policy. On the first pass we iterate over dentry unused list and prepare some dentries for removal. However, since the existing code moves evicted dentries to the beginning of the LRU it can happen that fresh dentries from other superblocks will be inserted before our dentries. This can result in significant slowdown of shrink_dcache_sb(). Moreover, for virtual filesystems like unionfs which can call dput() during dentries kill existing code results in O(n^2) complexity. We observed 2 minutes shrink_dcache_sb() with only 35000 dentries. To avoid this effects we propose to isolate sb dentries at the end of LRU list. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Andrey Mirkin <amirkin@openvz.org> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:57 -07:00
Emil Medve	1f7c8234c7	Make the pr_() family of macros in kernel.h complete Other/Some pr_() macros are already defined in kernel.h, but pr_err() was defined multiple times in several other places Signed-off-by: Emil Medve <Emilian.Medve@Freescale.com> Cc: Jean Delvare <khali@linux-fr.org> Cc: Jeff Garzik <jeff@garzik.org> Cc: "Antonino A. Daplas" <adaplas@pol.net> Cc: Tony Lindgren <tony@atomide.com> Reviewed-by: Satyam Sharma <satyam@infradead.org> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:57 -07:00
David Howells	76181c134f	KEYS: Make request_key() and co fundamentally asynchronous Make request_key() and co fundamentally asynchronous to make it easier for NFS to make use of them. There are now accessor functions that do asynchronous constructions, a wait function to wait for construction to complete, and a completion function for the key type to indicate completion of construction. Note that the construction queue is now gone. Instead, keys under construction are linked in to the appropriate keyring in advance, and that anyone encountering one must wait for it to be complete before they can use it. This is done automatically for userspace. The following auxiliary changes are also made: (1) Key type implementation stuff is split from linux/key.h into linux/key-type.h. (2) AF_RXRPC provides a way to allocate null rxrpc-type keys so that AFS does not need to call key_instantiate_and_link() directly. (3) Adjust the debugging macros so that they're -Wformat checked even if they are disabled, and make it so they can be enabled simply by defining __KDEBUG to be consistent with other code of mine. (3) Documentation. [alan@lxorguk.ukuu.org.uk: keys: missing word in documentation] Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:57 -07:00
Matti Linnanvuori	f20fda4861	Mutex documentation is unclear about software interrupts, tasklets and timers Acked-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:57 -07:00
Bill Nottingham	2e8ecb9db0	add CONFIG_VT_UNICODE As of now, the kernel defaults to non-unicode and XLATE for the keyboard. We've been changing this in Fedora, but that requires patching the defaults in the kernel. The attached introduces CONFIG_VT_UNICODE, which sets the console in unicode mode by default on boot, including both the virtual terminal and the keyboard driver. Signed-off-by: Bill Nottingham <notting@redhat.com> Cc: Samuel Thibault <samuel.thibault@ens-lyon.org> Cc: Dmitry Torokhov <dtor@mail.ru> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:56 -07:00
Jan Beulich	22e48eaf58	constify string/array kparam tracking structures .. in an effort to make read-only whatever can be made, so that CONFIG_DEBUG_RODATA can catch as many issues as possible. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:56 -07:00
Jan Beulich	d5aa0daf6d	store __setup_str_* in a more compact way __setup_str_* are referenced only during boot, hence there's no need to waste image space for aligning these strings (with the aim of improving performance). Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:56 -07:00

... 19 20 21 22 23 ...

10016 Commits