Commit graph

787 commits

Author SHA1 Message Date
Matt Helsley
dc52ddc0e6 container freezer: implement freezer cgroup subsystem
This patch implements a new freezer subsystem in the control groups
framework.  It provides a way to stop and resume execution of all tasks in
a cgroup by writing in the cgroup filesystem.

The freezer subsystem in the container filesystem defines a file named
freezer.state.  Writing "FROZEN" to the state file will freeze all tasks
in the cgroup.  Subsequently writing "RUNNING" will unfreeze the tasks in
the cgroup.  Reading will return the current state.

* Examples of usage :

   # mkdir /containers/freezer
   # mount -t cgroup -ofreezer freezer  /containers
   # mkdir /containers/0
   # echo $some_pid > /containers/0/tasks

to get status of the freezer subsystem :

   # cat /containers/0/freezer.state
   RUNNING

to freeze all tasks in the container :

   # echo FROZEN > /containers/0/freezer.state
   # cat /containers/0/freezer.state
   FREEZING
   # cat /containers/0/freezer.state
   FROZEN

to unfreeze all tasks in the container :

   # echo RUNNING > /containers/0/freezer.state
   # cat /containers/0/freezer.state
   RUNNING

This is the basic mechanism, which should do the right thing for user space
tasks in a simple scenario.

It's important to note that freezing can be incomplete.  In that case we
return EBUSY.  This means that some tasks in the cgroup are busy doing
something that prevents us from completely freezing the cgroup at this
time.  After EBUSY, the cgroup will remain partially frozen -- reflected
by freezer.state reporting "FREEZING" when read.  The state will remain
"FREEZING" until one of these things happens:

	1) Userspace cancels the freezing operation by writing "RUNNING" to
		the freezer.state file
	2) Userspace retries the freezing operation by writing "FROZEN" to
		the freezer.state file (writing "FREEZING" is not legal
		and returns EIO)
	3) The tasks that blocked the cgroup from entering the "FROZEN"
		state disappear from the cgroup's set of tasks.
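
The state transitions described above can be sketched as a small userspace
model. This is only an illustration under stated assumptions, not the kernel
implementation: every toy_* name is hypothetical, the "unfreezable" counter
stands in for tasks that currently block the freeze, and returning -EIO for
strings other than "FREEZING" is an assumption of the model.

```c
#include <errno.h>
#include <string.h>

/* Toy userspace model of the freezer.state file; not kernel code. */
enum toy_state { TOY_RUNNING, TOY_FREEZING, TOY_FROZEN };

struct toy_cgroup {
	enum toy_state state;
	int unfreezable;	/* hypothetical: tasks that cannot freeze right now */
};

/* Re-evaluate a pending freeze: it completes once nothing blocks it. */
static void toy_update(struct toy_cgroup *cg)
{
	if (cg->state == TOY_FREEZING && cg->unfreezable == 0)
		cg->state = TOY_FROZEN;
}

/* Model of writing a string to freezer.state. */
int toy_freezer_write(struct toy_cgroup *cg, const char *buf)
{
	if (strcmp(buf, "RUNNING") == 0) {
		cg->state = TOY_RUNNING;	/* cancel a freeze, or thaw */
		return 0;
	}
	if (strcmp(buf, "FROZEN") == 0) {
		cg->state = TOY_FREEZING;	/* start or retry the freeze */
		toy_update(cg);
		return cg->state == TOY_FROZEN ? 0 : -EBUSY;
	}
	/* "FREEZING" is not a legal write; other strings: model assumption. */
	return -EIO;
}

/* Model of reading freezer.state. */
const char *toy_freezer_read(struct toy_cgroup *cg)
{
	static const char *const names[] = { "RUNNING", "FREEZING", "FROZEN" };

	toy_update(cg);
	return names[cg->state];
}
```

In this model a caller that gets -EBUSY can either write "RUNNING" to cancel
or write "FROZEN" again to retry, which mirrors cases 1) and 2) above; case 3)
corresponds to the counter dropping to zero before the next read.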

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: export thaw_process]
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
Acked-by: Serge E. Hallyn <serue@us.ibm.com>
Tested-by: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20 08:52:34 -07:00
Matt Helsley
83224b0837 container freezer: add TIF_FREEZE flag to all architectures
This patch series introduces a cgroup subsystem that utilizes the swsusp
freezer to freeze a group of tasks.  It's immediately useful for batch job
management scripts.  It should also be useful in the future for
implementing container checkpoint/restart.

The freezer subsystem in the container filesystem defines a cgroup file
named freezer.state.  Reading freezer.state will return the current state
of the cgroup.  Writing "FROZEN" to the state file will freeze all tasks
in the cgroup.  Subsequently writing "RUNNING" will unfreeze the tasks in
the cgroup.

* Examples of usage :

   # mkdir /containers/freezer
   # mount -t cgroup -ofreezer freezer  /containers
   # mkdir /containers/0
   # echo $some_pid > /containers/0/tasks

to get status of the freezer subsystem :

   # cat /containers/0/freezer.state
   RUNNING

to freeze all tasks in the container :

   # echo FROZEN > /containers/0/freezer.state
   # cat /containers/0/freezer.state
   FREEZING
   # cat /containers/0/freezer.state
   FROZEN

to unfreeze all tasks in the container :

   # echo RUNNING > /containers/0/freezer.state
   # cat /containers/0/freezer.state
   RUNNING

This patch:

This patch is the first step in making refrigerator() available to all
architectures, even those without power management.

The purpose of such a change is to be able to use the refrigerator() in a
new control group subsystem which will implement a control group freezer.

[akpm@linux-foundation.org: fix sparc]
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
Acked-by: Pavel Machek <pavel@suse.cz>
Acked-by: Serge E. Hallyn <serue@us.ibm.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Nigel Cunningham <nigel@tuxonice.net>
Tested-by: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20 08:52:33 -07:00
Badari Pulavarty
71088785c6 mm: cleanup to make remove_memory() arch-neutral
There is nothing architecture specific about remove_memory().
The remove_memory() function is common to all architectures which support
hotplug memory remove.  Instead of duplicating it in every architecture,
collapse the copies into one arch-neutral function.

[akpm@linux-foundation.org: fix the export]
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Gary Hade <garyhade@us.ibm.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-20 08:50:25 -07:00
Linus Torvalds
08d19f51f0 Merge branch 'kvm-updates/2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
* 'kvm-updates/2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: (134 commits)
  KVM: ia64: Add intel iommu support for guests.
  KVM: ia64: add directed mmio range support for kvm guests
  KVM: ia64: Make pmt table be able to hold physical mmio entries.
  KVM: Move irqchip_in_kernel() from ioapic.h to irq.h
  KVM: Separate irq ack notification out of arch/x86/kvm/irq.c
  KVM: Change is_mmio_pfn to kvm_is_mmio_pfn, and make it common for all archs
  KVM: Move device assignment logic to common code
  KVM: Device Assignment: Move vtd.c from arch/x86/kvm/ to virt/kvm/
  KVM: VMX: enable invlpg exiting if EPT is disabled
  KVM: x86: Silence various LAPIC-related host kernel messages
  KVM: Device Assignment: Map mmio pages into VT-d page table
  KVM: PIC: enhance IPI avoidance
  KVM: MMU: add "oos_shadow" parameter to disable oos
  KVM: MMU: speed up mmu_unsync_walk
  KVM: MMU: out of sync shadow core
  KVM: MMU: mmu_convert_notrap helper
  KVM: MMU: awareness of new kvm_mmu_zap_page behaviour
  KVM: MMU: mmu_parent_walk
  KVM: x86: trap invlpg
  KVM: MMU: sync roots on mmu reload
  ...
2008-10-16 15:36:00 -07:00
Linus Torvalds
e4856a70cf Merge branch 'personality' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
* 'personality' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
  [PATCH] remove unused ibcs2/PER_SVR4 in SET_PERSONALITY
2008-10-16 12:32:52 -07:00
Christoph Hellwig
b418da16dd compat: generic compat get/settimeofday
Nothing arch specific in get/settimeofday.  The details of the timeval
conversion varied a little from arch to arch, but all with the same
results.

Also add an extern declaration for sys_tz to linux/time.h because externs
in .c files are frowned upon.  I'll kill the externs in various other files
in a separate patch.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: David S. Miller <davem@davemloft.net> [ sparc bits ]
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Grant Grundler <grundler@parisc-linux.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:33 -07:00
Christoph Hellwig
f7a5000f7a compat: move cp_compat_stat to common code
struct stat / compat_stat is the same on all architectures, so
cp_compat_stat should be, too.

Turns out it is, except that various architectures differ slightly, and some
use high2lowuid/high2lowgid or direct assignment instead of the
SET_UID/SET_GID that expands to the correct one anyway.

This patch replaces the arch-specific cp_compat_stat implementations with
a common one based on the x86-64 one.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: David S. Miller <davem@davemloft.net> [ sparc bits ]
Acked-by: Kyle McMartin <kyle@mcmartin.ca> [ parisc bits ]
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16 11:21:33 -07:00
Martin Schwidefsky
0b59268285 [PATCH] remove unused ibcs2/PER_SVR4 in SET_PERSONALITY
The SET_PERSONALITY macro is always called with a second argument of 0.
Remove the ibcs argument and the various tests to set the PER_SVR4
personality.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-16 15:40:05 +02:00
Christian Borntraeger
20766c083e KVM: s390: change help text of guest Kconfig
The current help text for CONFIG_S390_GUEST is not very helpful.
Let's add more text.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15 10:15:25 +02:00
Christian Borntraeger
a0046b6db1 KVM: s390: Make facility bits future-proof
Heiko Carstens pointed out that it's safer to activate working facilities
instead of disabling problematic facilities. The new code uses the host
facility bits and masks them with known good ones.
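
A minimal sketch of the difference between the two approaches, assuming
made-up bit assignments (the TOY_* masks and function names are hypothetical
and do not reflect the real facility list):

```c
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only. */
#define TOY_KNOWN_GOOD	UINT32_C(0x0000000f)
#define TOY_KNOWN_BAD	UINT32_C(0x00000030)

/* Old style: start from everything the host has, remove known problems. */
uint32_t toy_facilities_denylist(uint32_t host)
{
	return host & ~TOY_KNOWN_BAD;
}

/* New style: only pass on host bits that are known to work. */
uint32_t toy_facilities_allowlist(uint32_t host)
{
	return host & TOY_KNOWN_GOOD;
}
```

With a host value of 0xff, the deny-list variant still lets the unknown bits
0xc0 through to the guest, while the allow-list variant passes only 0x0f.
That is why masking with known good bits is the future-proof choice when
newer hosts gain facilities.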

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-10-15 10:15:24 +02:00
Steven Whitehouse
a447c09324 vfs: Use const for kernel parser table
This is a much better version of a previous patch to make the parser
tables constant. Rather than changing the typedef, we put the "const" in
all the various places where it's required, allowing the __initconst
exception for nfsroot which was the cause of the previous trouble.

This was posted for review some time ago and I believe it's been in -mm
since then.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Alexander Viro <aviro@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-13 10:10:37 -07:00
David Woodhouse
e758936e02 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:

	include/asm-x86/statfs.h
2008-10-13 17:13:56 +01:00
Linus Torvalds
37d9869ed9 Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: (27 commits)
  [S390] Fix checkstack for s390
  [S390] fix initialization of stp
  [S390] 3215: Remove tasklet.
  [S390] console flush on panic / reboot
  [S390] introduce dirty bit for kvm live migration
  [S390] Add ioctl support for EMC Symmetrix Subsystem Control I/O
  [S390] xpram: per device block request queues.
  [S390] dasd: fix message flood for unsolicited interrupts
  [S390] Move private simple udelay function to arch/s390/lib/delay.c.
  [S390] dcssblk: add >2G DCSSs support and stacked contiguous DCSSs support.
  [S390] ptrace changes
  [S390] s390: use sys_pause for 31bit pause entry point
  [S390] qdio enhanced SIGA (iqdio) support.
  [S390] cio: fix cio_tpi.
  [S390] cio: Correct use of ! and &
  [S390] cio: inline assembly cleanup
  [S390] bus_id -> dev_set_name() for css and ccw busses
  [S390] bus_id ->dev_name() conversions in qdio
  [S390] Use s390_root_dev_* in kvm_virtio.
  [S390] more bus_id -> dev_name conversions
  ...
2008-10-11 08:50:01 -07:00
Martin Schwidefsky
4a672cfa3a [S390] fix initialization of stp
chsc_sstpc returns -EIO on error and 0 on success, but stp_reset checks
against 1 instead of 0. chsc_sstpc used to return 1 on success; one
call location has not been updated.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 21:34:02 +02:00
Florian Funke
15e86b0c75 [S390] introduce dirty bit for kvm live migration
This patch defines a dirty bit in the PGSTE that can be used to implement
dirty pages logging for KVM's live migration. The bit is set in the
ptep_rcp_copy function, which is called to save dirty and referenced information
from the storage key in the PGSTE. The bit can be tested and reset by KVM using
the kvm_s390_test_and_clear_page_dirty function that is introduced by this patch.
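
The test-and-reset pattern can be sketched as follows. The bit position and
the toy_* names are assumptions made for illustration; the real PGSTE layout
and the atomicity guarantees of the kernel helper are not modelled here.

```c
#include <stdint.h>

#define TOY_PGSTE_DIRTY	UINT64_C(0x1)	/* hypothetical bit position */

/* Save side: record in the PGSTE that the page was dirtied. */
void toy_mark_dirty(uint64_t *pgste)
{
	*pgste |= TOY_PGSTE_DIRTY;
}

/* Consumer side: report the bit and reset it (not atomic in this sketch). */
int toy_test_and_clear_dirty(uint64_t *pgste)
{
	int was_dirty = (*pgste & TOY_PGSTE_DIRTY) != 0;

	*pgste &= ~TOY_PGSTE_DIRTY;
	return was_dirty;
}
```

A migration loop would call the consumer once per page and per pass: a
return value of 1 means the page must be sent again, and the cleared bit
lets the next pass detect new writes.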

Acked-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Florian Funke <ffunke@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 21:34:00 +02:00
Nigel Hislop
ab1d848fd6 [S390] Add ioctl support for EMC Symmetrix Subsystem Control I/O
EMC Symmetrix Subsystem Control I/O through CKD dasd requires a
specific parameter list sent to the array via a Perform Subsystem
Function CCW. The Symmetrix response is retrieved from the array
via a Read Subsystem Data CCW.

Signed-off-by: Nigel Hislop <hislop_nigel@emc.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 21:34:00 +02:00
Heiko Carstens
5a0d0e6537 [S390] Move private simple udelay function to arch/s390/lib/delay.c.
Move cio's private simple udelay function to lib/delay.c and turn it
into something much more readable. So we have all implementations
in one place.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 21:33:58 +02:00
Hongjie Yang
b2300b9efe [S390] dcssblk: add >2G DCSSs support and stacked contiguous DCSSs support.
The DCSS block device driver is modified to add >2G DCSSs support and
allow a DCSS block device to map to a set of contiguous DCSSs.  The
extmem code is also modified to use new Diagnose x'64' subcodes for
>2G DCSSs.

Signed-off-by: Hongjie Yang <hongjie@us.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 21:33:57 +02:00
Martin Schwidefsky
753c4dd6a2 [S390] ptrace changes
* System call parameter and result access functions
* Add tracehook calls
* Split syscall_trace into two functions do_syscall_trace_enter and
  do_syscall_trace_exit

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 21:33:57 +02:00
Christoph Hellwig
d86730bb95 [S390] s390: use sys_pause for 31bit pause entry point
sys32_pause is a useless copy of the generic sys_pause.
(and it's certainly not there for old sparc32 binaries..)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 21:33:56 +02:00
Klaus-Dieter Wacker
7a0f475513 [S390] qdio enhanced SIGA (iqdio) support.
Add support for z10 HiperSockets multiwrite SBALs on output
queues. This is used on LPAR with EDDP enabled devices.

Signed-off-by: Klaus-Dieter Wacker <kdwacker@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 21:33:55 +02:00
Ingo Molnar
990d0f2ced Merge branches 'sched/devel', 'sched/cpu-hotplug', 'sched/cpusets' and 'sched/urgent' into sched/core
2008-10-08 11:31:02 +02:00
Heiko Carstens
d3d238c774 [S390] nohz: Fix __udelay.
This fixes a regression that came with 934b2857cc
("[S390] nohz/sclp: disable timer on synchronous waits.").
If udelay() gets called from a disabled context it sets the clock comparator
to a value where it expects the next interrupt. When the interrupt happens
the clock comparator is not reset and therefore the interrupt condition
doesn't get cleared. The result is an endless timer interrupt loop.

In addition this patch also fixes the following:

rcutorture reveals that our __udelay implementation is still buggy,
since it might schedule tasklets, but prevents their execution:

NOHZ: local_softirq_pending 42
NOHZ: local_softirq_pending 02
NOHZ: local_softirq_pending 142
NOHZ: local_softirq_pending 02

To fix this we make sure that only the clock comparator interrupt
is enabled when the enabled wait psw is loaded.
Also no code gets called anymore which might schedule tasklets.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-03 21:55:54 +02:00
Jarod Wilson
3d6e48f433 [S390] CVE-2008-1514: prevent ptrace padding area read/write in 31-bit mode
When running a 31-bit ptrace, on either an s390 or s390x kernel,
reads and writes into a padding area in struct user_regs_struct32
will result in a kernel panic.

This is also known as CVE-2008-1514.

Test case available here:
http://sources.redhat.com/cgi-bin/cvsweb.cgi/~checkout~/tests/ptrace-tests/tests/user-area-padding.c?cvsroot=systemtap

Steps to reproduce:
1) wget the above
2) gcc -o user-area-padding-31bit user-area-padding.c -Wall -ggdb2 -D_GNU_SOURCE -m31
3) ./user-area-padding-31bit
<panic>

Test status
-----------
Without patch, both s390 and s390x kernels panic. With patch, the test case,
as well as the gdb testsuite, pass without incident, padding area reads
returning zero, writes ignored.

Nb: the original version returned -EINVAL on write attempts, which broke the
gdb test and made the test case slightly unhappy. Jan Kratochvil suggested
the change to return 0 on write attempts.
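
The resulting behaviour (padding reads return zero, padding writes are
ignored and report success) can be sketched with a toy register area. The
struct layout, the toy_* names and the -1 return for invalid offsets are
assumptions of this sketch and do not match struct user_regs_struct32.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout with a hole between two register groups. */
struct toy_user32 {
	uint32_t regs[4];
	uint32_t pad[2];	/* padding: no backing register state */
	uint32_t more[2];
};

static int toy_in_padding(size_t off)
{
	return off >= offsetof(struct toy_user32, pad) &&
	       off < offsetof(struct toy_user32, more);
}

/* Read one word; padding reads as zero. */
int toy_peek(const struct toy_user32 *u, size_t off, uint32_t *val)
{
	if (off % sizeof(uint32_t) || off >= sizeof(*u))
		return -1;	/* sketch-only error convention */
	if (toy_in_padding(off)) {
		*val = 0;
		return 0;
	}
	memcpy(val, (const char *)u + off, sizeof(*val));
	return 0;
}

/* Write one word; padding writes are silently ignored and return 0. */
int toy_poke(struct toy_user32 *u, size_t off, uint32_t val)
{
	if (off % sizeof(uint32_t) || off >= sizeof(*u))
		return -1;	/* sketch-only error convention */
	if (toy_in_padding(off))
		return 0;
	memcpy((char *)u + off, &val, sizeof(val));
	return 0;
}
```

Returning 0 rather than an error for padding writes is the design choice
that keeps a debugger which writes the whole area back from failing.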

Signed-off-by: Jarod Wilson <jarod@redhat.com>
Tested-by: Jan Kratochvil <jan.kratochvil@redhat.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-09-09 12:39:06 +02:00
Manfred Spraul
e545a6140b kernel/cpu.c: create a CPU_STARTING cpu_chain notifier
Right now, there is no notifier that is called on a new cpu, before the new
cpu begins processing interrupts/softirqs.
Various kernel functions need that notification; e.g. kvm works around its
absence by calling smp_call_function_single(), and rcu polls cpu_online_map.

The patch adds a CPU_STARTING notification. It also adds a helper function
that sends the message to all cpu_chain handlers.
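
The idea of a notifier chain with a helper that sends one event to all
handlers can be sketched in plain C. This is a simplified userspace model:
the toy_* names, the fixed-size array and the callback signature are
assumptions and differ from the kernel's notifier_block API.

```c
#include <stddef.h>

enum toy_cpu_event { TOY_CPU_STARTING, TOY_CPU_ONLINE };

typedef void (*toy_notifier_fn)(enum toy_cpu_event ev, unsigned int cpu,
				void *data);

struct toy_notifier {
	toy_notifier_fn fn;
	void *data;
};

#define TOY_MAX_NOTIFIERS 8
static struct toy_notifier toy_chain[TOY_MAX_NOTIFIERS];
static size_t toy_chain_len;

/* Add a handler to the chain; returns 0, or -1 if the chain is full. */
int toy_register_notifier(toy_notifier_fn fn, void *data)
{
	if (toy_chain_len == TOY_MAX_NOTIFIERS)
		return -1;
	toy_chain[toy_chain_len].fn = fn;
	toy_chain[toy_chain_len].data = data;
	toy_chain_len++;
	return 0;
}

/* Helper that sends one event to every registered handler, in order. */
void toy_notify(enum toy_cpu_event ev, unsigned int cpu)
{
	size_t i;

	for (i = 0; i < toy_chain_len; i++)
		toy_chain[i].fn(ev, cpu, toy_chain[i].data);
}

/* Example handler: counts TOY_CPU_STARTING events in *data. */
void toy_count_starting(enum toy_cpu_event ev, unsigned int cpu, void *data)
{
	(void)cpu;
	if (ev == TOY_CPU_STARTING)
		++*(int *)data;
}
```

A subsystem that today polls or uses a cross-call would instead register a
handler once and react to the starting event when it arrives.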

Tested on x86-64.
All other archs are untested. Especially on sparc, I'm not sure if I got
it right.

Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-09-08 19:25:24 +02:00
David Woodhouse
5cfba5df8c S390: Update comments about why we don't use <asm-generic/statfs.h>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2008-09-06 19:30:19 +01:00
Heiko Carstens
5453c1a575 [S390] Fix linker script.
6360b1fbb4 ("move BUG_TABLE into RODATA")
causes this build bug (binutils 2.18.50.0.8.20080709, gcc 4.3.1):

  AS      .tmp_kallsyms1.o
  LD      .tmp_vmlinux2
  KSYM    .tmp_kallsyms2.S
s390x-4.3.1-nm: .tmp_vmlinux2: File truncated
No valid symbol.
make: *** [.tmp_kallsyms2.S] Error 1

So fix this.

Cc: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-08-25 18:15:01 +02:00
Martin Schwidefsky
cce7496d3d [S390] Update default configuration.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-08-21 19:46:42 +02:00
Eric Sandeen
152382af40 [S390] fix ext2_find_next_bit
ext4 does not work on s390 because ext2_find_next_bit is broken. Fortunately
this function is only used by ext4. The function uses ffs, which does not work
analogously to ffz. The result of ffs has an offset of 1 which is not taken into
account. To fix this use the low level __ffs_word function directly instead
of the ill-defined ffs.

In addition the patch improves find_next_zero_bit and ext2_find_next_zero_bit
by passing the bit offset into __ffz_word instead of adding it after the
function call returned.
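
The off-by-one can be reproduced in userspace with the POSIX ffs() from
<strings.h>, which numbers bits starting at 1, next to a 0-based helper.
This is a sketch under assumptions: the toy_* functions are hypothetical,
__builtin_ctz is a GCC/Clang builtin used as a stand-in for a low level
0-based primitive, and the s390 bit ordering of the real code is ignored.

```c
#include <strings.h>	/* ffs(): 1-based index of lowest set bit, 0 if none */

/* 0-based index of the lowest set bit; word must be nonzero. */
static unsigned int toy_ffs0(unsigned int word)
{
	return (unsigned int)__builtin_ctz(word);	/* GCC/Clang builtin */
}

/* Buggy: treats the 1-based ffs() result as a 0-based bit number. */
unsigned int toy_first_bit_buggy(unsigned int word)
{
	return (unsigned int)ffs((int)word);
}

/* Fixed: uses the 0-based primitive directly. */
unsigned int toy_first_bit_fixed(unsigned int word)
{
	return toy_ffs0(word);
}
```

For the word 0x8 the buggy variant reports bit 4 while the set bit is bit 3,
so every lookup built on it lands one position too high.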

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-08-21 19:46:41 +02:00
Heiko Carstens
8853e505a1 [S390] Remove unneeded spinlock initialization.
Remove the now unneeded s390_idle.lock spinlock initialization after
Josef Sipek did it the right way in arch/s390/kernel/process.c.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-08-21 19:46:39 +02:00
Josef 'Jeff' Sipek
3e972394f9 [S390] Fix uninitialized spinlock use
Ever since commit 43ca5c3a1c ([S390] Convert
monitor calls to function calls.), the kernel refused to IPL with spinlock
debugging enabled.

BUG: spinlock bad magic on CPU#0, swapper/0
 lock: 00000000003a4668, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
CPU: 0 Not tainted 2.6.25 #1
Process swapper (pid: 0, task: 000000000034f958, ksp: 0000000000377d60)
0000000000377ab8 0000000000352628 0000000000377d60 0000000000377d60
       0000000000016af4 00000000fffff7b5 0000000000377d60 0000000000000000
       0000000000000000 0000000000377a18 0000000000000009 0000000000377a18
       0000000000377a78 000000000023c920 0000000000016af4 0000000000377a18
       0000000000000005 0000000000000000 0000000000377b58 0000000000377ab8
Call Trace:
([<0000000000016a60>] show_trace+0xdc/0x108)
 [<0000000000016b4e>] show_stack+0xc2/0xfc
 [<0000000000016c9a>] dump_stack+0xb2/0xc0
 [<0000000000172dd4>]

Signed-off-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-08-21 19:46:39 +02:00
Huang Ying
163f6876f5 kexec jump: rename KEXEC_CONTROL_CODE_SIZE to KEXEC_CONTROL_PAGE_SIZE
Rename KEXEC_CONTROL_CODE_SIZE to KEXEC_CONTROL_PAGE_SIZE, because the control
page is used for more than code on some platforms.  For example, in kexec
jump it is used for data and stack too.

[akpm@linux-foundation.org: unbreak powerpc and arm, finish conversion]
Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-15 08:35:42 -07:00
Linus Torvalds
5941de8ead Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
  [S390] move include/asm-s390 to arch/s390/include/asm
2008-08-04 17:15:07 -07:00
Linus Torvalds
84ff7a0012 Merge branch 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
* 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
  KVM: s390: Fix kvm on IBM System z10
  KVM: Advertise synchronized mmu support to userspace
  KVM: Synchronize guest physical memory map to host virtual memory map
  KVM: Allow browsing memslots with mmu_lock
  KVM: Allow reading aliases with mmu_lock
2008-08-01 12:48:16 -07:00
Martin Schwidefsky
c6557e7f2b [S390] move include/asm-s390 to arch/s390/include/asm
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-08-01 20:42:05 +02:00
Heiko Carstens
8f84700325 [S390] stp: fix section mismatch warning.
Fix these two (false positive) warnings by adding an __init annotation:

WARNING: vmlinux.o(.text+0x7e6a): Section mismatch in reference from the function stp_reset() to the function .init.text:__alloc_bootmem()
The function stp_reset() references
the function __init __alloc_bootmem().
This is often because stp_reset lacks a __init
annotation or the annotation of __alloc_bootmem is wrong.

WARNING: vmlinux.o(.text+0x7ece): Section mismatch in reference from the function stp_reset() to the function .init.text:free_bootmem()
The function stp_reset() references
the function __init free_bootmem().
This is often because stp_reset lacks a __init
annotation or the annotation of free_bootmem is wrong.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-08-01 16:39:34 +02:00
Heiko Carstens
d918fe2bd7 [S390] Remove diag 0x260 call from memory detection.
The result of the diag 0x260 call is not always what one would expect.
So just remove it.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-08-01 16:39:34 +02:00
Gerald Schaefer
7e9238fbc1 [S390] Add support for memory hot-remove.
This patch enables memory hot-remove on s390.

Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-08-01 16:39:33 +02:00
Heiko Carstens
519620cc3d [S390] Wire up new syscalls.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-08-01 16:39:32 +02:00
Heiko Carstens
934b2857cc [S390] nohz/sclp: disable timer on synchronous waits.
sclp_sync_wait waits synchronously for an sclp interrupt and disables
timer interrupts. However, on the irq enter paths there is an extra
check whether a timer interrupt is due, which then calls the timer callback.
This would schedule softirqs in the wrong context.
So introduce local_tick_enable/disable, which prevents this.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-08-01 16:39:30 +02:00
Michael Holzheu
3a95e8eb34 [S390] ipl: Reboot from alternate device does not work when booting from file
During startup we check if diag308 works using diag 308 subcode 6,
which stores the actual ipl information. This fails with rc = 0x102, if
the system has been ipled from the HMC using load from CD or load from file.
In the case of rc = 0x102 we have to assume that diag 308 is working,
since it still can be used to ipl from an alternative device.
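
The resulting startup check can be sketched as below. Only the value 0x102
comes from the description above; the success code 0x0001 and all TOY_* and
toy_* names are assumptions made for this illustration.

```c
/* 0x0102 is the return code named above; the success value is assumed. */
#define TOY_DIAG308_RC_OK	0x0001
#define TOY_DIAG308_RC_NOCONFIG	0x0102

/* Treat diag 308 as working if the probe succeeded or returned 0x102. */
int toy_diag308_usable(int rc)
{
	return rc == TOY_DIAG308_RC_OK || rc == TOY_DIAG308_RC_NOCONFIG;
}
```

Any other return code still marks diag 308 as unusable in this sketch, so
only the "ipled from CD or file" case changes behaviour.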

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-08-01 16:39:30 +02:00
Christian Borntraeger
1f4170e12d KVM: s390: Fix kvm on IBM System z10
The z10 system supports large pages, kvm-s390 doesn't.
Make sure that we don't advertise large pages to avoid the guest crashing as
soon as the guest kernel activates DAT.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-31 11:57:18 +03:00
Rusty Russell
9b1a4d3837 stop_machine: Wean existing callers off stop_machine_run()
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-07-28 12:16:31 +10:00
Christian Borntraeger
5a00a5e7a3 KVM: s390: Fix possible host kernel bug on lctl(g) handling
The lctl(g) instructions require a specific alignment for the parameters.
The architecture requires a specification program check if these alignments
are not used. Enforcing this alignment also removes a possible host BUG,
since the get_guest functions check for proper alignment and emit a BUG.
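
A minimal sketch of such an alignment check, done before any guest memory
is touched. The toy_* names are hypothetical, and the alignment values used
in the usage note (4 bytes for lctl, 8 bytes for lctlg) are an assumption of
this sketch rather than something stated above.

```c
#include <stdint.h>

enum toy_result { TOY_HANDLED, TOY_INJECT_SPECIFICATION };

/*
 * Decide whether an operand address is acceptable.
 * align must be a power of two.
 */
enum toy_result toy_check_operand(uint64_t addr, uint64_t align)
{
	if (addr & (align - 1))
		return TOY_INJECT_SPECIFICATION;	/* guest gets a program check */
	return TOY_HANDLED;				/* safe to access guest memory */
}
```

Calling toy_check_operand(addr, 8) for the lctlg path and
toy_check_operand(addr, 4) for the lctl path turns a misaligned guest
operand into a guest-visible program check instead of a host BUG.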

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-27 11:36:20 +03:00
Christian Borntraeger
f5e10b09a5 KVM: s390: Fix instruction naming for lctlg
Let's fix the name for the lctlg instruction...

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-27 11:36:12 +03:00
Christian Borntraeger
3cd612998f KVM: s390: Fix program check on interrupt delivery handling
The current interrupt handling on s390 misbehaves on an error case. On s390
each cpu has the prefix area (lowcore) for interrupt delivery. This memory
must always be available. If we fail to access the prefix area for a guest
on interrupt delivery the configuration is completely unusable. There is no
point in sending another program interrupt to an inaccessible lowcore.
Furthermore, we should not bug the host kernel, because this can be triggered
by userspace. I think the guest kernel itself can not trigger the problem, as
SET PREFIX and SIGNAL PROCESSOR SET PREFIX both check that the memory is
available and sane. As this is a userspace bug (e.g. setting the wrong guest
offset, unmapping guest memory) we should kill the userspace process instead
of BUGing the host kernel.
In the long term we probably should notify the userspace process about this
problem.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-27 11:36:05 +03:00
Martin Schwidefsky
0096369daa KVM: s390: Change guestaddr type in gaccess
All registers are unsigned long types. This patch changes all occurrences
of guestaddr in gaccess from u64 to unsigned long.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-27 11:35:57 +03:00
Carsten Otte
2bd0ac4eb4 KVM: s390: Advertise KVM_CAP_USER_MEMORY
KVM_CAP_USER_MEMORY is used by s390, therefore, we should advertise it.

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-27 11:35:40 +03:00
Johannes Weiner
c55281dee0 s390: use generic show_mem()
Remove arch-specific show_mem() in favor of the generic version.

This also removes the following redundant information display:

	- pages in swapcache, printed by show_swap_cache_info()

where show_mem() calls show_free_areas(), which calls
show_swap_cache_info().

Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:11 -07:00
Oleg Nesterov
69b895fd13 S390 topology: don't use kthread() for arch_reinit_sched_domains()
Now that it is safe to use get_online_cpus() we can revert

	[S390] cpu topology: Fix possible deadlock.
	commit: fd781fa25c

and call arch_reinit_sched_domains() directly from topology_work_fn().

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Max Krasnyansky <maxk@qualcomm.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Paul Menage <menage@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:40 -07:00