Commit graph

1111 commits

Author SHA1 Message Date
Heiko Carstens
83a24e3290 [S390] topology: get rid of ifdefs
Remove all ifdefs from topology code and also only compile it for the
CONFIG_SCHED_BOOK case. The new code selects SCHED_MC if SCHED_BOOK is
selected. SCHED_MC without SCHED_BOOK is not possible anymore.
Furthermore various sysfs attributes are not available anymore for the
!SCHED_BOOK case. In particular all attributes that correspond to
CPU polarization.
But since all real world kernels have SCHED_BOOK selected anyway this
doesn't matter too much.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-12-27 11:27:10 +01:00
Michael Holzheu
3931723f36 [S390] kernel: Fix smp_switch_to_ipl_cpu() stack frame setup
Currently, when smp_switch_to_ipl_cpu() is done, the backchain in the dump
analysis tool crash looks like the following:

 #0 [1f746e70] __machine_kexec at 11dd92
 #1 [1f746eb8] smp_restart_cpu at 11820e
 #0 [00907eb0] cpu_idle at 10602e
 #1 [00907ef8] start_kernel at 979a08

It would be good to see the registers of the interrupted function.
To achieve this, the backchain on the new stack has to be set to zero.
This looks then like the following:

 #0 [1f746e70] __machine_kexec at 11dd8e
 #1 [1f746eb8] smp_restart_cpu at 11820a
 PSW:  0706000180000000 00000000005c6fe6 (vtime_stop_cpu+134)
 GPRS: 0000000000000000 00000000005c6fe6 0000000001ad0228 0000000001ad0248
       0000000000907f08 0000000001ad0b40 0000000000979344 0000000000000000
       00000000009c0000 00000000009c0010 00000000009ab024 0000000001ad0200
       0000000001ad0238 00000000005cc9d8 000000000010602e 0000000000907e68
 #0 [00907eb0] cpu_idle at 10602e
 #1 [00907ef8] start_kernel at 979a08

In addition to this, now also the correct PSW is stored in the pt_regs
structure that is located at the start of the panic stack.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-12-27 11:27:10 +01:00
Martin Schwidefsky
14045ebf1e [S390] add support for physical memory > 4TB
The kernel address space of a 64 bit kernel currently uses a three level
page table and the vmemmap array has a fixed address and a fixed maximum
size. A three level page table is good enough for systems with less than
3.8TB of memory, for bigger systems four page table levels need to be
used. Each page table level costs a bit of performance, use 3 levels for
normal systems and 4 levels only for the really big systems.
To avoid bloating sparse.o too much set MAX_PHYSMEM_BITS to 46 for a
maximum of 64TB of memory.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-12-27 11:27:10 +01:00
Michael Holzheu
4999023aa9 [S390] Remove useless newline in reserve_kdump_bootmem()
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-12-27 11:27:09 +01:00
Michael Holzheu
44e5ddc4e9 [S390] Rework create_mem_hole() function
This patch makes the create_mem_hole() function more readable and
fixes some minor bugs (e.g. off-by-one problems).

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-12-27 11:27:09 +01:00
Michael Holzheu
1fb810576f [S390] Check for NULL termination in command line setup
The current code in setup_boot_command_line() uses a heuristic to
detect an EBCDIC command line. It checks if any of the bytes in
the command line has bit one (0x80) set. In that case it is assumed
that we have an EBCDIC string and the complete command line is
converted.

On s390 there are cases where the boot loader provides a kernel
command line that is NULL terminated, but has random data after
the NULL termination. In that case, setup_boot_command_line()
might misinterpret an ASCII string for an EBCDIC string. A
subsequent string conversion can then damage the ASCII string.

This patch solves the problem by checking for NULL termination.
If no EBCDIC character has been found until the the NULL
termination has been found, we now assume that we have an ASCII
string.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-12-27 11:25:48 +01:00
Heiko Carstens
272f01bf9b [S390] irq: fix accounting of external call/emergency signal
Mask the extint_code parameter of the smp external interrupt handler
to get the interruption code. Otherwise emergency call interrupts
erroneously might be accounted as emergency signal interrupts.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-12-27 11:25:48 +01:00
Kay Sievers
3fbacffbe9 s390: time - convert sysdev_class to a regular subsystem
After all sysdev classes are ported to regular driver core entities, the
sysdev implementation will be entirely removed from the kernel.

Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-12-21 15:09:50 -08:00
Kay Sievers
8a25a2fd12 cpu: convert 'cpu' and 'machinecheck' sysdev_class to a regular subsystem
This moves the 'cpu sysdev_class' over to a regular 'cpu' subsystem
and converts the devices to regular devices. The sysdev drivers are
implemented as subsystem interfaces now.

After all sysdev classes are ported to regular driver core entities, the
sysdev implementation will be entirely removed from the kernel.

Userspace relies on events and generic sysfs subsystem infrastructure
from sysdev devices, which are made available with this conversion.

Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@amd64.org>
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Cc: Len Brown <lenb@kernel.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-12-21 14:29:42 -08:00
Ingo Molnar
45aa0663cc Merge branch 'memblock-kill-early_node_map' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into core/memblock 2011-12-20 12:14:26 +01:00
Frederic Weisbecker
1268fbc746 nohz: Remove tick_nohz_idle_enter_norcu() / tick_nohz_idle_exit_norcu()
Those two APIs were provided to optimize the calls of
tick_nohz_idle_enter() and rcu_idle_enter() into a single
irq disabled section. This way no interrupt happening in-between would
needlessly process any RCU job.

Now we are talking about an optimization for which benefits
have yet to be measured. Let's start simple and completely decouple
idle rcu and dyntick idle logics to simplify.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-12-11 10:31:57 -08:00
Frederic Weisbecker
2bbb6817c0 nohz: Allow rcu extended quiescent state handling seperately from tick stop
It is assumed that rcu won't be used once we switch to tickless
mode and until we restart the tick. However this is not always
true, as in x86-64 where we dereference the idle notifiers after
the tick is stopped.

To prepare for fixing this, add two new APIs:
tick_nohz_idle_enter_norcu() and tick_nohz_idle_exit_norcu().

If no use of RCU is made in the idle loop between
tick_nohz_enter_idle() and tick_nohz_exit_idle() calls, the arch
must instead call the new *_norcu() version such that the arch doesn't
need to call rcu_idle_enter() and rcu_idle_exit().

Otherwise the arch must call tick_nohz_enter_idle() and
tick_nohz_exit_idle() and also call explicitly:

- rcu_idle_enter() after its last use of RCU before the CPU is put
to sleep.
- rcu_idle_exit() before the first use of RCU after the CPU is woken
up.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: David Miller <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-12-11 10:31:36 -08:00
Frederic Weisbecker
280f06774a nohz: Separate out irq exit and idle loop dyntick logic
The tick_nohz_stop_sched_tick() function, which tries to delay
the next timer tick as long as possible, can be called from two
places:

- From the idle loop to start the dytick idle mode
- From interrupt exit if we have interrupted the dyntick
idle mode, so that we reprogram the next tick event in
case the irq changed some internal state that requires this
action.

There are only few minor differences between both that
are handled by that function, driven by the ts->inidle
cpu variable and the inidle parameter. The whole guarantees
that we only update the dyntick mode on irq exit if we actually
interrupted the dyntick idle mode, and that we enter in RCU extended
quiescent state from idle loop entry only.

Split this function into:

- tick_nohz_idle_enter(), which sets ts->inidle to 1, enters
dynticks idle mode unconditionally if it can, and enters into RCU
extended quiescent state.

- tick_nohz_irq_exit() which only updates the dynticks idle mode
when ts->inidle is set (ie: if tick_nohz_idle_enter() has been called).

To maintain symmetry, tick_nohz_restart_sched_tick() has been renamed
into tick_nohz_idle_exit().

This simplifies the code and micro-optimize the irq exit path (no need
for local_irq_save there). This also prepares for the split between
dynticks and rcu extended quiescent state logics. We'll need this split to
further fix illegal uses of RCU in extended quiescent states in the idle
loop.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: David Miller <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-12-11 10:31:35 -08:00
Tejun Heo
ff38df377c s390: Use HAVE_MEMBLOCK_NODE_MAP
s390 used early_node_map[] just to prime free_area_init_nodes().  Now
memblock can be used for the same purpose and early_node_map[] is
scheduled to be dropped.  Use memblock instead.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux-s390@vger.kernel.org
2011-12-08 10:22:09 -08:00
Martin Schwidefsky
cfc9066bcd [S390] remove reset of system call restart on psw changes
git commit 20b40a794b "signal race with restarting system calls"
added code to the poke_user/poke_user_compat to reset the system call
restart information in the thread-info if the PSW address is changed.
The purpose of that change has been to workaround old gdbs that do
not know about the REGSET_SYSTEM_CALL. It turned out that this is not
a good idea, it makes the behaviour of the debuggee dependent on the
order of specific ptrace call, e.g. the REGSET_SYSTEM_CALL register
set needs to be written last. And the workaround does not really fix
old gdbs, inferior calls on interrupted restarting system calls do not
work either way.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-12-01 13:32:17 +01:00
Martin Schwidefsky
b934069c99 [S390] add missing .set function for NT_S390_LAST_BREAK regset
The last breaking event address is a read-only value, the regset misses the
.set function. If a PTRACE_SETREGSET is done for NT_S390_LAST_BREAK we
get an oops due to a branch to zero:

Kernel BUG at 0000000000000002 verbose debug info unavailable
illegal operation: 0001 #1 SMP
...
Call Trace:
(<0000000000158294> ptrace_regset+0x184/0x188)
 <00000000001595b6> ptrace_request+0x37a/0x4fc
 <0000000000109a78> arch_ptrace+0x108/0x1fc
 <00000000001590d6> SyS_ptrace+0xaa/0x12c
 <00000000005c7a42> sysc_noemu+0x16/0x1c
 <000003fffd5ec10c> 0x3fffd5ec10c
Last Breaking-Event-Address:
 <0000000000158242> ptrace_regset+0x132/0x188

Add a nop .set function to prevent the branch to zero.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: stable@kernel.org
2011-12-01 13:32:17 +01:00
Martin Schwidefsky
d9ae6772d3 [S390] ptrace inferior call interactions with TIF_SYSCALL
The TIF_SYSCALL bit needs to be cleared if the debugger changes the state
of the ptraced process in regard to the presence of a system call.
Otherwise the system call will be restarted although the debugger set up
an inferior call.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-12-01 13:32:17 +01:00
Michael Holzheu
5f894cbb68 [S390] kdump: Replace is_kdump_kernel() with OLDMEM_BASE check
In order to have the same behavior for kdump based stand-alone dump
as for the kexec method, the is_kdump_kernel() check (only true for
the kexec method) has to be replaced by the OLDMEM_BASE check (true
for both methods).

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-12-01 13:32:17 +01:00
Heiko Carstens
f6bf1a8acd [S390] topology: fix topology on z10 machines
Make sure that all cpus in a book on a z10 appear as book siblings
and not as core siblings. This fixes some performance regressions that
appeared after the book scheduling domain got introduced.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-11-14 11:19:09 +01:00
Jan Glauber
cfa1e7e1d4 [S390] avoid STCKF if running in ESA mode
In ESA mode STCKF is not defined even if the facility bit is enabled.
To prevent an illegal operation we must also check if we run a 64 bit kernel.
To make the check perform well add the STCKF bit to the machine flags.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-11-14 11:19:09 +01:00
Michael Holzheu
3f25dc4fcb [S390] zfcpdump: Do not initialize zfcpdump in kdump mode
When the kernel is started in kdump mode, zfcpdump should not be
initialized because both dump methods can't be used at the same time.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-11-14 11:19:09 +01:00
Martin Schwidefsky
7a2512b744 [S390] incorrect note program header
'readelf -n' on the s390 vmlinux file generates lots of warnings about
corrupt notes. The reason is that the 'NOTE' program header has incorrect
file and memory sizes. The problem is that the section following the
NOTES section do not switch to a different phdr and they get added to
the NOTE program section. Add a dummy entry to the linker script that
switches to the data phdr before the start of the RODATA section.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-11-14 11:19:08 +01:00
Heiko Carstens
800252976b [S390] wire up process_vm syscalls
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-11-14 11:19:08 +01:00
Linus Torvalds
b32fc0a062 Merge branch 'upstream/jump-label-noearly' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen
* 'upstream/jump-label-noearly' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen:
  jump-label: initialize jump-label subsystem much earlier
  x86/jump_label: add arch_jump_label_transform_static()
  s390/jump-label: add arch_jump_label_transform_static()
  jump_label: add arch_jump_label_transform_static() to optimise non-live code updates
  sparc/jump_label: drop arch_jump_label_text_poke_early()
  x86/jump_label: drop arch_jump_label_text_poke_early()
  jump_label: if a key has already been initialized, don't nop it out
  stop_machine: make stop_machine safe and efficient to call early
  jump_label: use proper atomic_t initializer

Conflicts:
 - arch/x86/kernel/jump_label.c
	Added __init_or_module to arch_jump_label_text_poke_early vs
	removal of that function entirely
 - kernel/stop_machine.c
	same patch ("stop_machine: make stop_machine safe and efficient
	to call early") merged twice, with whitespace fix in one version
2011-11-06 20:20:46 -08:00
Michael Holzheu
07ea815b22 [S390] Remove error checking from copy_oldmem_page()
Currently it can happen that the pre-allocated ELF header contains a wrong
memory map which would result in errors when copying /proc/vmcore.
In order to still get a valid vmcore, we (temporarily) disable the error
checking in copy_oldmem_page(). This will then produce zero pages for those
memory regions.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30 15:16:47 +01:00
Heiko Carstens
2a3a2d66aa [S390] irqstats: split IPI interrupt accounting
We use both the external call and emergency call IPIs to signal remote
cpus. Therefore it makes sense to account them differently withing
/proc/irqstats so we actually know what happened.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30 15:16:47 +01:00
Martin Schwidefsky
3c52e49d7c [S390] sparse: fix sparse warnings with __user pointers
Use __force to quiet sparse warnings about user address space.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30 15:16:46 +01:00
Martin Schwidefsky
5b479a79bf [S390] sparse: fix sparse warnings in math-emu
Fix three sparse warnings in math-emu / sysinfo:

arch/s390/kernel/sysinfo.c:448:17: error: return expression in void function
arch/s390/kernel/sysinfo.c:445:25: warning: shift too big (32) for type unsigned int
arch/s390/kernel/sysinfo.c:445:25: warning: shift too big (32) for type unsigned int

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30 15:16:46 +01:00
Martin Schwidefsky
638ad34a88 [S390] sparse: fix sparse warnings about missing prototypes
Add prototypes and includes for functions used in different modules.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30 15:16:46 +01:00
Martin Schwidefsky
c4736d9682 [S390] sparse: fix sparse static warnings
Make functions and data static to avoid sparse warnings.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30 15:16:46 +01:00
Martin Schwidefsky
399c1d8dbf [S390] sparse: fix access past end of array warnings
Remove unnecessary code to avoid false positives from sparse, e.g.

arch/s390/kernel/compat_signal.c:221:61: warning: invalid access past the end of 'set32' (8 8)

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30 15:16:46 +01:00
Carsten Otte
69ba974366 [S390] load user asce on sie_fault
On sie_fault we need to switch back to user ASCE. Otherwise we get
interresting effects when exiting to "userspace" while the guest
space is still active.

Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30 15:16:44 +01:00
Martin Schwidefsky
d98e19ccef [S390] smp: external call vs. emergency signal
Use a sigp sense running to decide which signal processor order to use
for an ipi. If the target cpu is running use external call, if the target
cpu is not running use emergency signal.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30 15:16:44 +01:00
Sebastian Ott
65b4e403ac [S390] chsc_sch: add support for irq statistics
Add support for CHSC I/O interrupt statistics in /proc/interrupts.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30 15:16:44 +01:00
Martin Schwidefsky
d4e81b35b8 [S390] allow all addressing modes
The user space program can change its addressing mode between the
24-bit, 31-bit and the 64-bit mode if the kernel is 64 bit. Currently
the kernel always forces the standard amode on signal delivery and
signal return and on ptrace: 64-bit for a 64-bit process, 31-bit for
a compat process and 31-bit kernels. Change the signal and ptrace code
to allow the full range of addressing modes. Signal handlers are
run in the standard addressing mode for the process.

One caveat is that even an 31-bit compat process can switch to the
64-bit mode. The next signal will switch back into the 31-bit mode
and there is no room in the 31-bit compat signal frame to store the
information that the program came from the 64-bit mode.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30 15:16:43 +01:00
Martin Schwidefsky
b50511e41a [S390] cleanup psw related bits and pieces
Split out addressing mode bits from PSW_BASE_BITS, rename PSW_BASE_BITS
to PSW_MASK_BASE, get rid of psw_user32_bits, remove unused function
enabled_wait(), introduce PSW_MASK_USER, and drop PSW_MASK_MERGE macros.
Change psw_kernel_bits / psw_user_bits to contain only the bits that
are always set in the respective mode.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30 15:16:43 +01:00
Martin Schwidefsky
b6ef5bb3d9 [S390] add TIF_SYSCALL thread flag
Add an explicit TIF_SYSCALL bit that indicates if a task is inside
a system call. The svc_code in the pt_regs structure is now only
valid if TIF_SYSCALL is set. With this definition TIF_RESTART_SVC
can be replaced with TIF_SYSCALL. Overall do_signal is a bit more
readable and it saves a few lines of code.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30 15:16:43 +01:00
Martin Schwidefsky
ccf45cafb0 [S390] addressing mode limits and psw address wrapping
An instruction with an address right below the adress limit for the
current addressing mode will wrap. The instruction restart logic in
the protection fault handler and the signal code need to follow the
wrapping rules to find the correct instruction address.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30 15:16:43 +01:00
Martin Schwidefsky
20b40a794b [S390] signal race with restarting system calls
For a ERESTARTNOHAND/ERESTARTSYS/ERESTARTNOINTR restarting system call
do_signal will prepare the restart of the system call with a rewind of
the PSW before calling get_signal_to_deliver (where the debugger might
take control). For A ERESTART_RESTARTBLOCK restarting system call
do_signal will set -EINTR as return code.
There are two issues with this approach:
1) strace never sees ERESTARTNOHAND, ERESTARTSYS, ERESTARTNOINTR or
   ERESTART_RESTARTBLOCK as the rewinding already took place or the
   return code has been changed to -EINTR
2) if get_signal_to_deliver does not return with a signal to deliver
   the restart via the repeat of the svc instruction is left in place.
   This opens a race if another signal is made pending before the
   system call instruction can be reexecuted. The original system call
   will be restarted even if the second signal would have ended the
   system call with -EINTR.

These two issues can be solved by dropping the early rewind of the
system call before get_signal_to_deliver has been called and by using
the TIF_RESTART_SVC magic to do the restart if no signal has to be
delivered. The only situation where the system call restart via the
repeat of the svc instruction is appropriate is when a SA_RESTART
signal is delivered to user space.

Unfortunately this breaks inferior calls by the debugger again. The
system call number and the length of the system call instruction is
lost over the inferior call and user space will see ERESTARTNOHAND/
ERESTARTSYS/ERESTARTNOINTR/ERESTART_RESTARTBLOCK. To correct this a
new ptrace interface is added to save/restore the system call number
and system call instruction length.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30 15:16:43 +01:00
Martin Schwidefsky
0edc8faa76 [S390] lowcore cleanup
Remove the save_area_64 field from the 0xe00 - 0xf00 area in the lowcore.
Use a free slot in the save_area array instead.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30 15:16:42 +01:00
Michael Holzheu
dab7a7b153 [S390] Add architecture code for unmapping crashkernel memory
This patch implements the crash_map_pages() function for s390.
KEXEC_CRASH_MEM_ALIGN is set to HPAGE_SIZE, in order to support
kernel mappings that use large pages. We also use HPAGE_SIZE alignment
for CONFIG_HUGETLB_PAGE=n in order to have the same 1 MiB alignment on
all s390 systems.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30 15:16:42 +01:00
Michael Holzheu
d38593f938 [S390] Export vmcoreinfo note
This patch defines for s390 an ABI defined pointer to the vmcoreinfo note at
a well known address. With this patch tools are able to find this information
in dumps created by stand-alone or hypervisor dump tools.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30 15:16:42 +01:00
Michael Holzheu
60a0c68df2 [S390] kdump backend code
This patch provides the architecture specific part of the s390 kdump
support.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30 15:16:42 +01:00
Michael Holzheu
1943f53c9c [S390] Force PSW restart on online CPU
PSW restart can be triggered on offline CPUs. If this happens, currently
the PSW restart code fails, because functions like smp_processor_id()
do not work on offline CPUs. This patch fixes this as follows:

If PSW restart is triggered on an offline CPU, the PSW restart (sigp restart)
is done a second time on another CPU that is online and the old CPU is
stopped afterwards.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30 15:16:41 +01:00
Jan Glauber
017ec18360 [S390] use ENTRY macro for sys_setns_wrapper
Use the ENTRY macro for the system call wrapper sys_setns_wrapper
similarly to the other wrappers.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30 15:16:16 +01:00
Martin Schwidefsky
a45aff5285 [S390] user per registers vs. ptrace single stepping
git commit 5e9a2692 "[S390] ptrace cleanup" introduced a regression
for the case when both a user PER set (e.g. a storage alteration trace) and
PTRACE_SINGLESTEP are active. The new code will overrule the user PER set
with a instruction-fetch PER set over the whole address space for ptrace
single stepping. The inferior process will be stopped after each instruction
with an instruction fetch event. Any other events that may have occurred
concurrently are not reported (e.g. storage alteration event) because the
control bits for them are not set. The solution is to merge the PER control
bits of the user PER set with the PER_EVENT_IFETCH control bit for
PTRACE_SINGLESTEP.

Cc: stable@kernel.org
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30 15:16:15 +01:00
Sebastian Ott
caa04f69df [S390] topology: fix alloc_masks annotation
Fix this warning:
WARNING: vmlinux.o(.text+0x199b6): Section mismatch in reference from
the function alloc_masks() to the function .init.text:__alloc_bootmem()

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30 15:16:15 +01:00
Martin Schwidefsky
dd4a5a31fc [S390] avoid warning in show_cpuinfo
The .start function and indirectly the .next function of the show_cpuinfo
sequential operation uses NR_CPUS as limit instead of nr_cpu_ids.
This can cause warnings like this:

WARNING: at /usr/src/linux/include/linux/cpumask.h:107
Process lscpu (pid: 575, task: 000000007deb4338, ksp: 000000007794f588)
Krnl PSW : 0704000180000000 0000000000106db4 (show_cpuinfo+0x108/0x234)
           R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0 EA:3
Krnl GPRS: 0000000000000003 0000000000791988 000000000071b478 0000000000000004
           0000000000000001 0000000000000000 000000007d139500 0000000000000400
           0000000000000000 000000000070e24c 000000007d48d600 0000000000000005
           000000007d48d600 00000000004dfa10 0000000000106cf8 000000007794fcc0
Krnl Code: 0000000000106da8: 95001000           cli     0(%r1),0
           0000000000106dac: a774ffac           brc     7,106d04
           0000000000106db0: a7f40001           brc     15,106db2
          >0000000000106db4: 92011000           mvi     0(%r1),1
           0000000000106db8: a7f4ffa6           brc     15,106d04
           0000000000106dbc: c0e5000065b4       brasl   %r14,113924
           0000000000106dc2: c09000303a45       larl    %r9,70e24c
           0000000000106dc8: c020001eefd4       larl    %r2,4e4d70

Replacing NR_CPUS with nr_cpu_ids fixes it.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30 15:16:15 +01:00
Peter Oberparleiter
de400d6b78 [S390] fix mismatch in summation of I/O IRQ statistics
Current IRQ statistics support does not show detail counts for I/O
interrupts which are processed internally only. The result is a
summation count which is way off such as this one:

           CPU0       CPU1       CPU2
I/O:       1331        710        442
[...]
QAI:         15         16         16   [I/O] QDIO Adapter Interrupt
QDI:          1          0          0   [I/O] QDIO Interrupt
DAS:        706        645        381   [I/O] DASD
C15:         26         10          0   [I/O] 3215
C70:          0          0          0   [I/O] 3270
TAP:          0          0          0   [I/O] Tape
VMR:          0          0          0   [I/O] Unit Record Devices
LCS:          0          0          0   [I/O] LCS
CLW:          0          0          0   [I/O] CLAW
CTC:          0          0          0   [I/O] CTC
APB:          0          0          0   [I/O] AP Bus

Fix this by moving I/O interrupt accounting into the common I/O layer.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30 15:16:15 +01:00
Linus Torvalds
39adff5f69 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
  time, s390: Get rid of compile warning
  dw_apb_timer: constify clocksource name
  time: Cleanup old CONFIG_GENERIC_TIME references that snuck in
  time: Change jiffies_to_clock_t() argument type to unsigned long
  alarmtimers: Fix error handling
  clocksource: Make watchdog reset lockless
  posix-cpu-timers: Cure SMP accounting oddities
  s390: Use direct ktime path for s390 clockevent device
  clockevents: Add direct ktime programming function
  clockevents: Make minimum delay adjustments configurable
  nohz: Remove "Switched to NOHz mode" debugging messages
  proc: Consider NO_HZ when printing idle and iowait times
  nohz: Make idle/iowait counter update conditional
  nohz: Fix update_ts_time_stat idle accounting
  cputime: Clean up cputime_to_usecs and usecs_to_cputime macros
  alarmtimers: Rework RTC device selection using class interface
  alarmtimers: Add try_to_cancel functionality
  alarmtimers: Add more refined alarm state tracking
  alarmtimers: Remove period from alarm structure
  alarmtimers: Remove interval cap limit hack
  ...
2011-10-26 17:15:03 +02:00
Jeremy Fitzhardinge
61f42183fd s390/jump-label: add arch_jump_label_transform_static()
This allows jump-label entries to be cheaply updated on code which is
not yet live.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Jason Baron <jbaron@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Jan Glauber <jang@linux.vnet.ibm.com>
2011-10-25 11:54:37 -07:00
Martin Schwidefsky
85055dd805 PM / Hibernate: Include storage keys in hibernation image on s390
For s390 there is one additional byte associated with each page,
the storage key. This byte contains the referenced and changed
bits and needs to be included into the hibernation image.
If the storage keys are not restored to their previous state all
original pages would appear to be dirty. This can cause
inconsistencies e.g. with read-only filesystems.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-10-16 23:27:46 +02:00
Heiko Carstens
e35f95b36e time, s390: Get rid of compile warning
"s390: Use direct ktime path for s390 clockevent device" in linux-next
introduces this compile warning:

arch/s390/kernel/time.c: In function 's390_next_ktime':
arch/s390/kernel/time.c:118:2: warning:
  comparison of distinct pointer types lacks a cast [enabled by default]

Just use a u64 instead of an s64 variable. This is not a problem since it
will always contain a positive value.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/1316675957-5538-1-git-send-email-heiko.carstens@de.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-10-12 10:24:10 +02:00
Christian Borntraeger
480e5926ce [S390] kvm: fix address mode switching
598841ca99 ([S390] use gmap address
spaces for kvm guest images) changed kvm to use a separate address
space for kvm guests. This address space was switched in __vcpu_run
In some cases (preemption, page fault) there is the possibility that
this address space switch is lost.
The typical symptom was a huge amount of validity intercepts or
random guest addressing exceptions.
Fix this by doing the switch in sie_loop and sie_exit and saving the
address space in the gmap structure itself. Also use the preempt
notifier.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2011-09-20 17:07:34 +02:00
Martin Schwidefsky
4f37a68cda s390: Use direct ktime path for s390 clockevent device
The clock comparator on s390 uses the same format as the TOD clock.
If the value in the clock comparator is smaller than the current TOD
value an interrupt is pending. Use the CLOCK_EVT_FEAT_KTIME feature
to get the unmodified ktime of the next clockevent expiration and
use it to program the clock comparator without querying the TOD clock.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: john stultz <johnstul@us.ibm.com>
Link: http://lkml.kernel.org/r/20110823133143.153017933@de.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-09-08 11:10:56 +02:00
NeilBrown
f5b9409973 All Arch: remove linkage for sys_nfsservctl system call
The nfsservctl system call is now gone, so we should remove all
linkage for it.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-26 15:09:58 -07:00
Michael Holzheu
e1202edadb [S390] Change default action from reipl to stop for on_restart
The main purpose for PSW restart will be kdump. Therefore customers will
issue "system restart" for creating a dump. If kdump is not enabled,
currently "PSW restart" will reboot the system and then no dump can
be created any more. In order to still allow a manual stand-alone dump in
the case a user issues "PSW restart" on a system that has not enabled
kdump we now stop the system.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-08-24 17:15:24 +02:00
Julia Lawall
798620fb1d [S390] arch/s390/kernel/ipl.c: correct error detection check
reipl_fcp_kset was just initialized, so it appears that it should be tested
instead of reipl_kset.

Signed-off-by: Julia Lawall <julia@diku.dk>
Reported-by: Suman Saha <sumsaha@gmail.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-08-24 17:15:24 +02:00
Heiko Carstens
27e7318c3e [S390] nss,initrd: kernel image and initrd must be in different segments
When IPL'ing from a block device and an NSS should be created we must
make sure that the kernel image and the initrd are in different 1MB
segments. Otherwise creating the NSS will fail.
So we make sure the initrd is 4MB behind the end of the kernel image
like we do already when IPL via the VM reader is performed.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-08-24 17:15:23 +02:00
Heiko Carstens
9e8ed3ae92 [S390] signal: use set_restore_sigmask() helper
We should call set_restore_sigmask() instead of directly setting
TIF_RESTORE_SIGMASK. This change should have been done three years
earlier... see 4e4c22 "signals: add set_restore_sigmask".

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2011-08-03 16:44:21 +02:00
Heiko Carstens
b7f275042f [S390] smp: remove pointless comments in startup_secondary()
Remove pointless comments in startup_secondary(). There is not too much
value in having comments like e.g. "call cpu notifiers" just before a
call to notify_cpu*().

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2011-08-03 16:44:21 +02:00
Heiko Carstens
cc34321d58 [S390] cpu hotplug: on cpu start wait until being marked active
This is the same as fd8a7de1 "x86: cpu-hotplug: Prevent softirq wakeup
on wrong CPU".
Unlike on x86 this doesn't fix a bug on s390 since we do not have
threaded interrupt handlers. However we want to keep the same
initialization order like on x86. This should prevent bugs caused by
code which assumes (and relies on) the init order is the same on each
architecture.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2011-08-03 16:44:20 +02:00
Heiko Carstens
391c62feb1 [S390] signal: convert to use set_current_blocked()
Convert to use set_current_blocked() like x86.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2011-08-03 16:44:20 +02:00
Heiko Carstens
7a0e42f168 [S390] asm offsets: fix coding style
Because of readability reasons we ignore the 80 character line limit
in asm offsets. Just one line per define, nothing else.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2011-08-03 16:44:20 +02:00
Heiko Carstens
3a81b17142 [S390] Add support for IBM zEnterprise 114
Just fix up the Kconfig description and the elf platform.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2011-08-03 16:44:20 +02:00
Michael Holzheu
9dc7356ee1 [S390] Use diagnose 308 for system reset
The diagnose 308 call is the prefered method for clearing all ongoing I/O.
Therefore if it is available we use it instead of doing a manual reset.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2011-08-03 16:44:19 +02:00
Michael Holzheu
ef1daec8da [S390] Export store_status() function
For kdump we need a store status function to save the registers for the
current CPU. Therefore this patch exports a function "store_status()".
In addition to that now also floating point registers are saved correctly.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2011-08-03 16:44:19 +02:00
Michael Holzheu
7dd6b3343f [S390] Add PSW restart shutdown trigger
With this patch a new S390 shutdown trigger "restart" is added. If under
z/VM "systerm restart" is entered or under the HMC the "PSW restart" button
is pressed, the PSW located at 0 (31 bit) or 0x1a0 (64 bit) bit is loaded.
Now we execute do_restart() that processes the restart action that is
defined under /sys/firmware/shutdown_actions/on_restart. Currently the
following actions are possible: reipl (default), stop, vmcmd, dump, and
dump_reipl.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2011-08-03 16:44:19 +02:00
Arun Sharma
60063497a9 atomic: use <linux/atomic.h>
This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>

Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-26 16:49:47 -07:00
Linus Torvalds
21c7075fa5 Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6: (21 commits)
  [S390] use siginfo for sigtrap signals
  [S390] dasd: add enhanced DASD statistics interface
  [S390] kvm: make sigp emerg smp capable
  [S390] disable cpu measurement alerts on a dying cpu
  [S390] initial cr0 bits
  [S390] iucv cr0 enablement bit
  [S390] race safe external interrupt registration
  [S390] remove tape block docu
  [S390] ap: toleration support for ap device type 10
  [S390] cleanup program check handler prototypes
  [S390] remove kvm mmu reload on s390
  [S390] Use gmap translation for accessing guest memory
  [S390] use gmap address spaces for kvm guest images
  [S390] kvm guest address space mapping
  [S390] fix s390 assembler code alignments
  [S390] move sie code to entry.S
  [S390] kvm: handle tprot intercepts
  [S390] qdio: clear shared DSCI before scheduling the queue handler
  [S390] reference bit testing for unmapped pages
  [S390] irqs: Do not trace arch_local_{*,irq_*} functions
  ...
2011-07-24 09:55:45 -07:00
Jonas Bonn
66574cc054 modules: make arch's use default loader hooks
This patch removes all the module loader hook implementations in the
architecture specific code where the functionality is the same as that
now provided by the recently added default hooks.

Signed-off-by: Jonas Bonn <jonas@southpole.se>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Michal Simek <monstr@monstr.eu>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-07-24 22:06:04 +09:30
Martin Schwidefsky
73b7d40ff1 [S390] use siginfo for sigtrap signals
Provide additional information on SIGTRAP by using a sig_info signal.
Use TRAP_BRKPT for breakpoints via illegal operation and TRAP_HWBKPT
for breakpoints via program event recording. Provide the address of
the instruction that caused the breakpoint via si_addr.
While we are at it get rid of tracehook_consider_fatal_signal.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-07-24 10:48:23 +02:00
Jan Glauber
cadfce7277 [S390] disable cpu measurement alerts on a dying cpu
The cpu measurement alerts that are used for instance by oprofile
for hardware sampling are not turned off on a cpu that is going
offline. Add the appropriate control register bit that should be
disabled to the list.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-07-24 10:48:22 +02:00
Martin Schwidefsky
c76e70d378 [S390] initial cr0 bits
Remove outdated bits from the initial cr0 register.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-07-24 10:48:22 +02:00
Martin Schwidefsky
5beab99100 [S390] iucv cr0 enablement bit
Do not set the cr0 enablement bit for iucv by default in head[31|64].S,
move the enablement to iucv_init in the iucv base layer.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-07-24 10:48:22 +02:00
Jan Glauber
89c9b66b10 [S390] race safe external interrupt registration
The (un-)register_external_interrupt functions are not race safe if
more than one interrupt handler is added or deleted for an external
interrupt concurrently.

Make the registration / unregistration of external interrupts race safe
by using RCU and a spinlock. RCU is used to avoid a performance penalty
in the external interrupt handler, the register and unregister functions
are protected by the spinlock and are not performance critical.
call_rcu must be used since the SCLP driver uses the interface with
IRQs disabled. Also use the generic list implementation rather than
homebrewn list code.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-07-24 10:48:22 +02:00
Martin Schwidefsky
fdb204d1a7 [S390] cleanup program check handler prototypes
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-07-24 10:48:21 +02:00
Martin Schwidefsky
e5992f2e6c [S390] kvm guest address space mapping
Add code that allows KVM to control the virtual memory layout that
is seen by a guest. The guest address space uses a second page table
that shares the last level pte-tables with the process page table.
If a page is unmapped from the process page table it is automatically
unmapped from the guest page table as well.

The guest address space mapping starts out empty, KVM can map any
individual 1MB segments from the process virtual memory to any 1MB
aligned location in the guest virtual memory. If a target segment in
the process virtual memory does not exist or is unmapped while a
guest mapping exists the desired target address is stored as an
invalid segment table entry in the guest page table.
The population of the guest page table is fault driven.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-07-24 10:48:21 +02:00
Jan Glauber
144d634a21 [S390] fix s390 assembler code alignments
The alignment is missing for various global symbols in s390 assembly code.
With a recent gcc and an instruction like stgrl this can lead to a
specification exception if the instruction uses such a mis-aligned address.

Specify the alignment explicitely and while add it define __ALIGN for s390
and use the ENTRY define to save some lines of code.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-07-24 10:48:21 +02:00
Martin Schwidefsky
603d1a50ac [S390] move sie code to entry.S
The entry to / exit from sie has subtle dependencies to the first level
interrupt handler. Move the sie assembler code to entry64.S and replace
the SIE_HOOK callback with a test and the new _TIF_SIE bit.
In addition this patch fixes several problems in regard to the check for
the_TIF_EXIT_SIE bits. The old code checked the TIF bits before executing
the interrupt handler and it only modified the instruction address if it
pointed directly to the sie instruction. In both cases it could miss
a TIF bit that normally would cause an exit from the guest and would
reenter the guest context.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-07-24 10:48:21 +02:00
Linus Torvalds
8209f53d79 Merge branch 'ptrace' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc
* 'ptrace' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc: (39 commits)
  ptrace: do_wait(traced_leader_killed_by_mt_exec) can block forever
  ptrace: fix ptrace_signal() && STOP_DEQUEUED interaction
  connector: add an event for monitoring process tracers
  ptrace: dont send SIGSTOP on auto-attach if PT_SEIZED
  ptrace: mv send-SIGSTOP from do_fork() to ptrace_init_task()
  ptrace_init_task: initialize child->jobctl explicitly
  has_stopped_jobs: s/task_is_stopped/SIGNAL_STOP_STOPPED/
  ptrace: make former thread ID available via PTRACE_GETEVENTMSG after PTRACE_EVENT_EXEC stop
  ptrace: wait_consider_task: s/same_thread_group/ptrace_reparented/
  ptrace: kill real_parent_is_ptracer() in in favor of ptrace_reparented()
  ptrace: ptrace_reparented() should check same_thread_group()
  redefine thread_group_leader() as exit_signal >= 0
  do not change dead_task->exit_signal
  kill task_detached()
  reparent_leader: check EXIT_DEAD instead of task_detached()
  make do_notify_parent() __must_check, update the callers
  __ptrace_detach: avoid task_detached(), check do_notify_parent()
  kill tracehook_notify_death()
  make do_notify_parent() return bool
  ptrace: s/tracehook_tracer_task()/ptrace_parent()/
  ...
2011-07-22 15:06:50 -07:00
Tejun Heo
a288eecce5 ptrace: kill trivial tracehooks
At this point, tracehooks aren't useful to mainline kernel and mostly
just add an extra layer of obfuscation.  Although they have comments,
without actual in-kernel users, it is difficult to tell what are their
assumptions and they're actually trying to achieve.  To mainline
kernel, they just aren't worth keeping around.

This patch kills the following trivial tracehooks.

* Ones testing whether task is ptraced.  Replace with ->ptrace test.

	tracehook_expect_breakpoints()
	tracehook_consider_ignored_signal()
	tracehook_consider_fatal_signal()

* ptrace_event() wrappers.  Call directly.

	tracehook_report_exec()
	tracehook_report_exit()
	tracehook_report_vfork_done()

* ptrace_release_task() wrapper.  Call directly.

	tracehook_finish_release_task()

* noop

	tracehook_prepare_release_task()
	tracehook_report_death()

This doesn't introduce any behavior change.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-22 19:26:28 +02:00
Jan Glauber
859c965149 [S390] allow setting of upper 32 bit in smp_ctl_set_bit
The bit shift operation in smp_ctl_set_bit does not specify the type
of the shifted bit so integer is used as default. Therefore it is not
possible to set bits in the upper 32 bit of the control register if
the kernel runs in 64 bit mode. Fix this by specifying the type as
unsigned long.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-06-22 16:24:20 +02:00
Linus Torvalds
571503e100 Merge branch 'setns'
* setns:
  ns: Wire up the setns system call

Done as a merge to make it easier to fix up conflicts in arm due to
addition of sendmmsg system call
2011-05-28 10:51:01 -07:00
Eric W. Biederman
7b21fddd08 ns: Wire up the setns system call
32bit and 64bit on x86 are tested and working.  The rest I have looked
at closely and I can't find any problems.

setns is an easy system call to wire up.  It just takes two ints so I
don't expect any weird architecture porting problems.

While doing this I have noticed that we have some architectures that are
very slow to get new system calls.  cris seems to be the slowest where
the last system calls wired up were preadv and pwritev.  avr32 is weird
in that recvmmsg was wired up but never declared in unistd.h.  frv is
behind with perf_event_open being the last syscall wired up.  On h8300
the last system call wired up was epoll_wait.  On m32r the last system
call wired up was fallocate.  mn10300 has recvmmsg as the last system
call wired up.  The rest seem to at least have syncfs wired up which was
new in the 2.6.39.

v2: Most of the architecture support added by Daniel Lezcano <dlezcano@fr.ibm.com>
v3: ported to v2.6.36-rc4 by: Eric W. Biederman <ebiederm@xmission.com>
v4: Moved wiring up of the system call to another patch
v5: ported to v2.6.39-rc6
v6: rebased onto parisc-next and net-next to avoid syscall  conflicts.
v7: ported to Linus's latest post 2.6.39 tree.

>  arch/blackfin/include/asm/unistd.h     |    3 ++-
>  arch/blackfin/mach-common/entry.S      |    1 +
Acked-by: Mike Frysinger <vapier@gentoo.org>

Oh - ia64 wiring looks good.
Acked-by: Tony Luck <tony.luck@intel.com>

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-28 10:48:39 -07:00
Heiko Carstens
d7b250e2a2 [S390] irq: merge irq.c and s390_ext.c
Merge irq.c and s390_ext.c into irq.c. That way all external interrupt
related functions are together.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-05-26 09:48:24 +02:00
Heiko Carstens
df7997ab1c [S390] irq: fix service signal external interrupt handling
Interrupt sources like pfault, sclp, dasd_diag and virtio all use the
service signal external interrupt subclass mask in control register 0
to enable and disable the corresponding interrupt.
Because no reference counting is implemented each subsystem thinks it
is the only user of subclass and sets and clears the bit like it wants.
This leads to case that unloading the dasd diag module under z/VM
causes both sclp and pfault interrupts to be masked. The result will
be locked up system sooner or later.
Fix this by introducing a new way to set (register) and clear
(unregister) the service signal subclass mask bit in cr0.
Also convert all drivers.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-05-26 09:48:24 +02:00
Linus Torvalds
0d66cba1ac Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6: (29 commits)
  [S390] cpu hotplug: fix external interrupt subclass mask handling
  [S390] oprofile: dont access lowcore
  [S390] oprofile: add missing irq stats counter
  [S390] Ignore sendmmsg system call note wired up warning
  [S390] s390,oprofile: fix compile error for !CONFIG_SMP
  [S390] s390,oprofile: fix alert counter increment
  [S390] Remove unused includes in process.c
  [S390] get CPC image name
  [S390] sclp: event buffer dissection
  [S390] chsc: process channel-path-availability information
  [S390] refactor page table functions for better pgste support
  [S390] merge page_test_dirty and page_clear_dirty
  [S390] qdio: prevent compile warning
  [S390] sclp: remove unnecessary sendmask check
  [S390] convert old cpumask API into new one
  [S390] pfault: cleanup code
  [S390] pfault: cpu hotplug vs missing completion interrupts
  [S390] smp: add __noreturn attribute to cpu_die()
  [S390] percpu: implement arch specific irqsafe_cpu_ops
  [S390] vdso: disable gcov profiling
  ...
2011-05-24 12:06:02 -07:00
Linus Torvalds
5129df03d0 Merge branch 'for-2.6.40' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-2.6.40' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu: Unify input section names
  percpu: Avoid extra NOP in percpu_cmpxchg16b_double
  percpu: Cast away printk format warning
  percpu: Always align percpu output section to PAGE_SIZE

Fix up fairly trivial conflict in arch/x86/include/asm/percpu.h as per Tejun
2011-05-24 11:53:42 -07:00
Tejun Heo
6988f20fe0 Merge branch 'fixes-2.6.39' into for-2.6.40 2011-05-24 09:59:36 +02:00
Heiko Carstens
5bd418784a [S390] cpu hotplug: fix external interrupt subclass mask handling
When disabling a cpu all external interrupt subclass masks in control
register 0 get cleared. However instead of the service signal subclass
mask bit an unused bit got cleared.
Accidently (or luckily) the service subclass mask gets cleared with the
pfault_fini() call that happens just before the rest of the subclass
mask bits get cleared.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2011-05-23 10:24:33 +02:00
Heiko Carstens
fcdd65b0e7 [S390] oprofile: add missing irq stats counter
Count CPU measurement external interrupts as well.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2011-05-23 10:24:32 +02:00
Jan Glauber
3af6fb687b [S390] Remove unused includes in process.c
Remove unsused includes from arch/s390/kernel/process.c.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-05-23 10:24:32 +02:00
KOSAKI Motohiro
0f1959f506 [S390] convert old cpumask API into new one
Adapt new API.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-05-23 10:24:31 +02:00
Heiko Carstens
f2db2e6cb3 [S390] pfault: cpu hotplug vs missing completion interrupts
On cpu hot remove a PFAULT CANCEL command is sent to the hypervisor
which in turn will cancel all outstanding pfault requests that have
been issued on that cpu (the same happens with a SIGP cpu reset).

The result is that we end up with uninterruptible processes where
the interrupt that would wake up these processes never arrives.

In order to solve this all processes which wait for a pfault
completion interrupt get woken up after a cpu hot remove. The worst
case that could happen is that they fault again and in turn need to
wait again.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-05-23 10:24:29 +02:00
Heiko Carstens
b456d94a97 [S390] smp: add __noreturn attribute to cpu_die()
Add missing __noreturn attribute to cpu_die():

arch/s390/kernel/smp.c:691:6: error: symbol 'cpu_die' redeclared with different type

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-05-23 10:24:29 +02:00
Peter Oberparleiter
add7490c27 [S390] vdso: disable gcov profiling
The concepts of VDSO and gcov-based profiling don't mix: the former
includes kernel-provided code running in userspace, the latter adds
instructions that modify counters in kernel data segments. On s390
this has not been a problem so far due to VDSO code being written in
all-assembler which is exempt from gcov-based profiling. This could
change in the future, so disable profiling excplicitly for VDSO code.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-05-23 10:24:29 +02:00
Martin Schwidefsky
043d07084b [S390] Remove data execution protection
The noexec support on s390 does not rely on a bit in the page table
entry but utilizes the secondary space mode to distinguish between
memory accesses for instructions vs. data. The noexec code relies
on the assumption that the cpu will always use the secondary space
page table for data accesses while it is running in the secondary
space mode. Up to the z9-109 class machines this has been the case.
Unfortunately this is not true anymore with z10 and later machines.
The load-relative-long instructions lrl, lgrl and lgfrl access the
memory operand using the same addressing-space mode that has been
used to fetch the instruction.
This breaks the noexec mode for all user space binaries compiled
with march=z10 or later. The only option is to remove the current
noexec support.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-05-23 10:24:28 +02:00
Linus Torvalds
80fe02b5da Merge branches 'sched-core-for-linus' and 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (60 commits)
  sched: Fix and optimise calculation of the weight-inverse
  sched: Avoid going ahead if ->cpus_allowed is not changed
  sched, rt: Update rq clock when unthrottling of an otherwise idle CPU
  sched: Remove unused parameters from sched_fork() and wake_up_new_task()
  sched: Shorten the construction of the span cpu mask of sched domain
  sched: Wrap the 'cfs_rq->nr_spread_over' field with CONFIG_SCHED_DEBUG
  sched: Remove unused 'this_best_prio arg' from balance_tasks()
  sched: Remove noop in alloc_rt_sched_group()
  sched: Get rid of lock_depth
  sched: Remove obsolete comment from scheduler_tick()
  sched: Fix sched_domain iterations vs. RCU
  sched: Next buddy hint on sleep and preempt path
  sched: Make set_*_buddy() work on non-task entities
  sched: Remove need_migrate_task()
  sched: Move the second half of ttwu() to the remote cpu
  sched: Restructure ttwu() some more
  sched: Rename ttwu_post_activation() to ttwu_do_wakeup()
  sched: Remove rq argument from ttwu_stat()
  sched: Remove rq->lock from the first half of ttwu()
  sched: Drop rq->lock from sched_exec()
  ...

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: Fix rt_rq runtime leakage bug
2011-05-19 17:41:22 -07:00
Linus Torvalds
df48d8716e Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (107 commits)
  perf stat: Add more cache-miss percentage printouts
  perf stat: Add -d -d and -d -d -d options to show more CPU events
  ftrace/kbuild: Add recordmcount files to force full build
  ftrace: Add self-tests for multiple function trace users
  ftrace: Modify ftrace_set_filter/notrace to take ops
  ftrace: Allow dynamically allocated function tracers
  ftrace: Implement separate user function filtering
  ftrace: Free hash with call_rcu_sched()
  ftrace: Have global_ops store the functions that are to be traced
  ftrace: Add ops parameter to ftrace_startup/shutdown functions
  ftrace: Add enabled_functions file
  ftrace: Use counters to enable functions to trace
  ftrace: Separate hash allocation and assignment
  ftrace: Create a global_ops to hold the filter and notrace hashes
  ftrace: Use hash instead for FTRACE_FL_FILTER
  ftrace: Replace FTRACE_FL_NOTRACE flag with a hash of ignored functions
  perf bench, x86: Add alternatives-asm.h wrapper
  x86, 64-bit: Fix copy_[to/from]_user() checks for the userspace address limit
  x86, mem: memset_64.S: Optimize memset by enhanced REP MOVSB/STOSB
  x86, mem: memmove_64.S: Optimize memmove by enhanced REP MOVSB/STOSB
  ...
2011-05-19 17:36:08 -07:00
Michael Holzheu
83ace2701b [S390] replace diag10() with diag10_range() function
Currently the diag10() function can only release one page. For exploiters
that have to call diag10 on a contiguous memory region this is suboptimal.
This patch replaces the diag10() function with diag10_range() that is
able to release multiple pages. In addition to that the new function now
allows to release memory with addresses higher than 2047 MiB. This was
due to a restriction of the diagnose implementation under z/VM prior to
release 5.2.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-05-10 17:13:43 +02:00
Christian Borntraeger
91d378088b [S390] disassembler: handle b280/spp instruction
arch/s390/kvm/sie64a.S uses the b280 instruction. Tell the builtin
disassembler to handle that code.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-05-10 17:13:42 +02:00
Michael Holzheu
8eb4bd666f [S390] kernel: Initialize register 14 when starting new CPU
When starting a new CPU we currently jump to start_secondary() without
setting register 14 (the return address) correctly. Therefore on the stack
frame for start_secondary an invalid return address is stored. This leads
to wrong stack back traces in kernel dumps.

Example:

 #00 [1f33fe48] cpu_idle at 10614a
 #01 [1f33fe90] start_secondary at 54fa88
 #02 [1f33feb8] (null) at 0                 <--- invalid

To fix this start_secondary() is called now with basr/brasl that sets
register 14 correctly. The output of the stack backtrace looks then
like the following:

 #00 [1f33fe48] cpu_idle at 10614a
 #01 [1f33fe90] start_secondary at 54fa88
 #02 [1f33feb8] restart_base at 54f41e      <--- correct

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-05-10 17:13:42 +02:00
Ingo Molnar
32673822e4 Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core
Conflicts:
	include/linux/perf_event.h

Merge reason: pick up the latest jump-label enhancements, they are cooked ready.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-27 10:40:21 +02:00
Peter Zijlstra
184748cc50 sched: Provide scheduler_ipi() callback in response to smp_send_reschedule()
For future rework of try_to_wake_up() we'd like to push part of that
function onto the CPU the task is actually going to run on.

In order to do so we need a generic callback from the existing scheduler IPI.

This patch introduces such a generic callback: scheduler_ipi() and
implements it as a NOP.

BenH notes: PowerPC might use this IPI on offline CPUs under rare conditions!

Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110405152728.744338123@chello.nl
2011-04-14 08:52:32 +02:00
Linus Torvalds
bb3c90f0de Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
  [S390] compile fix for latest binutils
  [S390] cio: prevent purging of CCW devices in the online state
  [S390] qdio: fix init sequence
  [S390] Fix parameter passing for smp_switch_to_cpu()
  [S390] oprofile s390: prevent stack corruption
2011-04-08 07:36:14 -07:00
Jan Glauber
5373db886b jump label: Add s390 support
Implement the architecture backend for jump label support on s390.

For a shared kernel booted from a NSS silently disable jump labels
because the NSS is read-only. Therefore jump labels will be disabled
in a shared kernel and can't be activated.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
LKML-Reference: <6935d2c41ce111e1719176ed4bbd3dbe4de80855.1300299760.git.jbaron@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-04-04 13:43:16 -04:00
Martin Schwidefsky
8838101183 [S390] compile fix for latest binutils
The latest binutils won't accept the stfl instruction with march=g5
which is the correct behaviour. Unfortunately head.S is assembled
with -march=g5 even if the target cpu is z900 or later. To get
31-bit kernels compiled again the easiest fix is to use the .insn
notation for the stfl instruction in head.S.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-04-04 09:43:33 +02:00
Michael Holzheu
034e9e966c [S390] Fix parameter passing for smp_switch_to_cpu()
After the execution has been switched to the destination CPU, the target
function is called with the wrong parameter. According to the C calling
convention on s390, the first parameter should be loaded into register 2.
Currently in smp_restart_cpu() it is stored in register 3. To fix this, we
load the parameter into the correct register 2.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-04-04 09:43:32 +02:00
Lucas De Marchi
25985edced Fix common misspellings
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-31 11:26:23 -03:00
Linus Torvalds
7c8d891c2c Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
  [S390] cmpxchg: implement cmpxchg64()
  [S390] xchg/cmpxchg: move to own header file
  [S390] ccwgroup_driver: remove duplicate members
  [S390] ccw_bus_type: make it static
  [S390] ccw_driver: remove duplicate members
  [S390] qdio: prevent handling of buffers if count is zero
  [S390] setup: register bss section as resource
  [S390] setup: simplify setup_resources()
  [S390] wire up sys_syncfs
  [S390] wire up sys_clock_adjtime
  [S390] wire up sys_open_by_handle_at
  [S390] wire up sys_name_to_handle_at
  [S390] oprofile: disable hw sampling for CONFIG_32BIT
  [S390] early: limit savesys cmd string handling
  [S390] early: Fix possible overlapping data buffer
2011-03-25 17:47:04 -07:00
Tejun Heo
0415b00d17 percpu: Always align percpu output section to PAGE_SIZE
Percpu allocator honors alignment request upto PAGE_SIZE and both the
percpu addresses in the percpu address space and the translated kernel
addresses should be aligned accordingly.  The calculation of the
former depends on the alignment of percpu output section in the kernel
image.

The linker script macros PERCPU_VADDR() and PERCPU() are used to
define this output section and the latter takes @align parameter.
Several architectures are using @align smaller than PAGE_SIZE breaking
percpu memory alignment.

This patch removes @align parameter from PERCPU(), renames it to
PERCPU_SECTION() and makes it always align to PAGE_SIZE.  While at it,
add PCPU_SETUP_BUG_ON() checks such that alignment problems are
reliably detected and remove percpu alignment comment recently added
in workqueue.c as the condition would trigger BUG way before reaching
there.

For um, this patch raises the alignment of percpu area.  As the area
is in .init, there shouldn't be any noticeable difference.

This problem was discovered by David Howells while debugging boot
failure on mn10300.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Cc: uclinux-dist-devel@blackfin.uclinux.org
Cc: David Howells <dhowells@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: user-mode-linux-devel@lists.sourceforge.net
2011-03-24 18:50:09 +01:00
Stephen Wilson
cae5d39032 mm: arch: rename in_gate_area_no_task to in_gate_area_no_mm
Now that gate vma's are referenced with respect to a particular mm and not a
particular task it only makes sense to propagate the change to this predicate as
well.

Signed-off-by: Stephen Wilson <wilsons@start.ca>
Reviewed-by: Michel Lespinasse <walken@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-23 16:36:55 -04:00
Stephen Wilson
83b964bbf8 mm: arch: make in_gate_area take an mm_struct instead of a task_struct
Morally, the question of whether an address lies in a gate vma should be asked
with respect to an mm, not a particular task.  Moreover, dropping the dependency
on task_struct will help make existing and future operations on mm's more
flexible and convenient.

Signed-off-by: Stephen Wilson <wilsons@start.ca>
Reviewed-by: Michel Lespinasse <walken@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-23 16:36:54 -04:00
Stephen Wilson
31db58b3ab mm: arch: make get_gate_vma take an mm_struct instead of a task_struct
Morally, the presence of a gate vma is more an attribute of a particular mm than
a particular task.  Moreover, dropping the dependency on task_struct will help
make both existing and future operations on mm's more flexible and convenient.

Signed-off-by: Stephen Wilson <wilsons@start.ca>
Reviewed-by: Michel Lespinasse <walken@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-23 16:36:54 -04:00
Heiko Carstens
4cc69531f9 [S390] setup: register bss section as resource
Make kernel bss section visible via /proc/iomem like on other
architectures.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-03-23 10:15:59 +01:00
Heiko Carstens
71189284e6 [S390] setup: simplify setup_resources()
Simplify setup_resources() and make it more generic. That way it is
easier to add additional resources.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-03-23 10:15:59 +01:00
Heiko Carstens
d0d2e31af6 [S390] wire up sys_syncfs
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-03-23 10:15:58 +01:00
Heiko Carstens
26e8a33989 [S390] wire up sys_clock_adjtime
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-03-23 10:15:58 +01:00
Heiko Carstens
5069496ec4 [S390] wire up sys_open_by_handle_at
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-03-23 10:15:58 +01:00
Heiko Carstens
737fd5f1f6 [S390] wire up sys_name_to_handle_at
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-03-23 10:15:58 +01:00
Heiko Carstens
894e491e42 [S390] early: limit savesys cmd string handling
Use snprintf() here as well so we won't have to deal with this again.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-03-23 10:15:14 +01:00
Chen Liu
69ac43b05e [S390] early: Fix possible overlapping data buffer
This patch fixed bugzilla #12965:
https://bugzilla.kernel.org/show_bug.cgi?id=12965

The original code contains some inproper use of sprintf
function where a buffer is used both as input string
as well as output string. It should remember the written
bytes in the previous and use that as the offset for
later writing. Also replace sprintf with snprintf.

Signed-off-by: Chen Liu <chenliu@asset.uwaterloo.ca>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-03-23 10:15:14 +01:00
Linus Torvalds
31598e8713 Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
  [S390] kexec: Disable ftrace during kexec
  [S390] support XZ compressed kernel
  [S390] css_bus_type: make it static
  [S390] css_driver: remove duplicate members
  [S390] css: remove subchannel private
  [S390] css: move chsc_private to drv_data
  [S390] css: move io_private to drv_data
  [S390] cio: move cdev pointer to io_subchannel_private
  [S390] cio: move options to io_sch_private
  [S390] cio: move asms to generic header
  [S390] cio: move orb definitions to separate header
  [S390] Write protect module text and RO data
  [S390] dasd: get rid of compile warning
  [S390] remove superfluous check from do_IRQ
  [S390] remove redundant stack check option
2011-03-17 10:10:49 -07:00
Linus Torvalds
79d8a8f736 Merge branch 'for-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu, x86: Add arch-specific this_cpu_cmpxchg_double() support
  percpu: Generic support for this_cpu_cmpxchg_double()
  alpha: use L1_CACHE_BYTES for cacheline size in the linker script
  percpu: align percpu readmostly subsection to cacheline

Fix up trivial conflict in arch/x86/kernel/vmlinux.lds.S due to the
percpu alignment having changed ("x86: Reduce back the alignment of the
per-CPU data section")
2011-03-16 08:22:41 -07:00
Heiko Carstens
6966727db1 [S390] kexec: Disable ftrace during kexec
Disable ftrace during kexec. Same as on x86/powerpc.
ac4414e "powerpc/kdump: Disable ftrace during kexec".

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-03-15 17:08:24 +01:00
Martin Schwidefsky
261cd298a8 s390: remove task_show_regs
task_show_regs used to be a debugging aid in the early bringup days
of Linux on s390. /proc/<pid>/status is a world readable file, it
is not a good idea to show the registers of a process. The only
correct fix is to remove task_show_regs.

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-15 07:34:16 -08:00
Tejun Heo
19df0c2fef percpu: align percpu readmostly subsection to cacheline
Currently percpu readmostly subsection may share cachelines with other
percpu subsections which may result in unnecessary cacheline bounce
and performance degradation.

This patch adds @cacheline parameter to PERCPU() and PERCPU_VADDR()
linker macros, makes each arch linker scripts specify its cacheline
size and use it to align percpu subsections.

This is based on Shaohua's x86 only patch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Shaohua Li <shaohua.li@intel.com>
2011-01-25 14:26:50 +01:00
Heiko Carstens
d2c9dfccbc [S390] Randomize PIEs
Randomize ELF_ET_DYN_BASE, which is used when loading position
independent executables.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-12 09:55:25 +01:00
Heiko Carstens
3351918282 [S390] Randomise the brk region
Randomize heap address like other architectures do already.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-12 09:55:25 +01:00
Heiko Carstens
9887a1fcdd [S390] Randomize lower bits of stack address
Randomize the lower bits of the stack address like x86 and powerpc.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-12 09:55:25 +01:00
Heiko Carstens
e7828bbd5e [S390] vdso: dont map at mmap_base
The vdso object is currently always mapped with mm->mmap_base used as
requested address. In case of flexible mmap layout this means it gets
mapped above mmap_base and therefore potentially stealing a bit of
address space that is reserved for the stack.
In case of flexible mmap layout the object should be mapped below
mmap base. For legacy mmap layout above.
To fix this just don't request any specific address and let the mmap
code figure out an address that fits.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-12 09:55:24 +01:00
Heiko Carstens
8e1023016c [S390] prevent unneccesary loops_per_jiffy recalculation
When the seqfile /proc/cpuinfo gets accesses for each possible cpu
loops_per_jiffy gets recalculated. However its value is only needed
on first access.
In addition loops_per_jiffy should be recalculated when the machine
reports a capability change.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-05 12:47:32 +01:00
Heiko Carstens
19726cec63 [S390] cpuinfo: use get_online_cpus() instead of preempt_disable()
Use get_online_cpus() instead of preempt_disable() to make sure cpus
don't go offline while accessing their per cpu data.
The preempt_disable() stuff is old code which was used before
get_online_cpus() was available.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-05 12:47:31 +01:00
Heiko Carstens
974de4d7e7 [S390] smp: remove cpu hotplug messages
Get rid of messages that indicate if a cpu went online or offline.
There is nothing special about this anymore and these messages might
flood the kernel log buffer which makes debugging harder since more
important messages might be overwritten.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-05 12:47:31 +01:00
Martin Schwidefsky
4cc9bed034 [S390] cleanup ftrace backend functions
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-05 12:47:31 +01:00
Martin Schwidefsky
5e9a26928f [S390] ptrace cleanup
Overhaul program event recording and the code dealing with the ptrace
user space interface.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-05 12:47:31 +01:00
Heiko Carstens
da7f51c11d [S390] smp/idle: call init_idle() before starting a new cpu
Call init_idle() which (re-)initializes the idle task structure before
it gets used on a new cpu.
That way we can also get rid of the odd preempt_enable_no_resched()
call we have in the cpu offline path within cpu_idle(). That call
prevented preempt count imbalances between cpu hotplug operations.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-05 12:47:30 +01:00
Heiko Carstens
f230886b0b [S390] smp: delay idle task creation
Delay idle task creation until a cpu gets set online instead of
creating them for all possible cpus at system startup.
For one cpu system this should safe more than 1 MB.
On my debug system with lots of debug stuff enabled this saves 2 MB.

Same as on x86.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-05 12:47:30 +01:00
Heiko Carstens
f3e1a27359 [S390] nmi: enable machine checks early
Until now machine checks for the swapper process of the IPL cpu are just
implicitly (and more or less accidently) enabled when the first time the
idle process goes into idle state and loads an enabled wait psw.
Before that machine checks are disabled.
So let's enable them explicitly in trap_init() so we have a well defined
time when machine checks are enabled.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-05 12:47:29 +01:00
Martin Schwidefsky
1de3447a41 [S390] 31 bit entry.S update.
Make the code in the 31 bit entry.S code as similar as possible to the
64 bit version in entry64.S. That makes it easier to add new code to
the first level interrupt handler that affects both 31 and 64 bit kernels.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-05 12:47:29 +01:00
Heiko Carstens
b1b7509185 [S390] extint: get rid of early code plus cleanup
Get rid of register/unregister_early_external_interrupt() and clean up
the code while at it.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-05 12:47:26 +01:00
Heiko Carstens
fb0a9d7e86 [S390] pfault: delay register of pfault interrupt
Use an early init call to initialize pfault. That way it is possible to
use the register_external_interrupt() instead of the early variant.
No need to enable pfault any earlier since it has only effect if user
space processes are running.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-05 12:47:26 +01:00
Holger Dengler
62d146ffe3 [S390] ap bus: add support for irq statistics
Add support for AP Bus I/O interrupt statistics in /proc/interrupts.

Signed-off-by: Holger Dengler <hd@linux.vnet.ibm.com>
Signed-off-by: Felix Beck <felix.beck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-05 12:47:26 +01:00
Heiko Carstens
85b81cdd0b [S390] ctc: add support for irq statistics
Add support for CTC I/O interrupt statistics in /proc/interrupts.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-05 12:47:26 +01:00
Heiko Carstens
355eb4022b [S390] claw: add support for irq statistics
Add support for CLAW I/O interrupt statistics in /proc/interrupts.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-05 12:47:26 +01:00
Heiko Carstens
096a61682e [S390] lcs: add support for irq statistics
Add support for LCS I/O interrupt statistics in /proc/interrupts.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-05 12:47:26 +01:00
Heiko Carstens
f48198d592 [S390] vmur: add support for irq statistics
Add support for VMUR I/O interrupt statistics in /proc/interrupts.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-05 12:47:26 +01:00
Heiko Carstens
b86651721f [S390] tape: add support for irq statistics
Add support for ccw based tape I/O interrupt statistics in /proc/interrupts.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-05 12:47:25 +01:00
Heiko Carstens
3fe22f6bfd [S390] 3270: add support for irq statistics
Add support for 3270 I/O interrupt statistics in /proc/interrupts.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-01-05 12:47:25 +01:00