Impact: cleanup, avoid sparse warnings, reduce kernel size a bit
Fixes these sparse warnings:
arch/x86/kernel/cpu/perf_counter.c:44:11: warning: symbol 'intel_perfmon_event_map' was not declared. Should it be static?
arch/x86/kernel/cpu/perf_counter.c:54:11: warning: symbol 'max_intel_perfmon_events' was not declared. Should it be static?
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: restructure code
Change counter math from absolute values to clear delta logic.
We try to extract elapsed deltas from the raw hw counter - and put
that into the generic counter.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: new API
Implement the atomic64_t APIs on 32-bit as well. Will be used by
the performance counters code.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
Introduce a proper enum for the 3 states of a counter:
PERF_COUNTER_STATE_OFF = -1
PERF_COUNTER_STATE_INACTIVE = 0
PERF_COUNTER_STATE_ACTIVE = 1
and rename counter->active to counter->state and propagate the
changes everywhere.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
Rename them to better match up the usual IRQ disable/enable APIs:
hw_perf_disable_all() => hw_perf_save_disable()
hw_perf_restore_ctrl() => hw_perf_restore()
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: add new perf-counter type
The 'CPU clock' counter counts the amount of CPU clock time that is
elapsing, in nanoseconds. (regardless of how much of it the task is
spending on a CPU executing)
This counter type is a Linux kernel based abstraction, it is available
even if the hardware does not support native hardware performance counters.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: restructure code, introduce hw_ops driver abstraction
Introduce this abstraction to handle counter details:
struct hw_perf_counter_ops {
void (*hw_perf_counter_enable) (struct perf_counter *counter);
void (*hw_perf_counter_disable) (struct perf_counter *counter);
void (*hw_perf_counter_read) (struct perf_counter *counter);
};
This will be useful to support assymetric hw details, and it will also
be useful to implement "software counters". (Counters that count kernel
managed sw events such as pagefaults, context-switches, wall-clock time
or task-local time.)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: add group counters
This patch adds the "counter groups" abstraction.
Groups of counters behave much like normal 'single' counters, with a
few semantic and behavioral extensions on top of that.
A counter group is created by creating a new counter with the open()
syscall's group-leader group_fd file descriptor parameter pointing
to another, already existing counter.
Groups of counters are scheduled in and out in one atomic group, and
they are also roundrobin-scheduled atomically.
Counters that are member of a group can also record events with an
(atomic) extended timestamp that extends to all members of the group,
if the record type is set to PERF_RECORD_GROUP.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: clean up new API
Thorough cleanup of the new perf counters API, we now get clean separation
of the various concepts:
- introduce perf_counter_hw_event to separate out the event source details
- move special type flags into separate attributes: PERF_COUNT_NMI,
PERF_COUNT_RAW
- extend the type to u64 and reserve it fully to the architecture in the
raw type case.
And make use of all these changes in the core and x86 perfcounters code.
Also change the syscall signature to:
asmlinkage int sys_perf_counter_open(
struct perf_counter_hw_event *hw_event_uptr __user,
pid_t pid,
int cpu,
int group_fd);
( Note that group_fd is unused for now - it's reserved for the counter
groups abstraction. )
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: change syscall, cleanup
Make use of the new perf_counters event type.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix rare lost events problem
There are CPUs whose performance counters misbehave on CSTATE transitions,
so provide a way to just disable/enable them around deep idle methods.
(hw_perf_enable_all() is cheap on x86.)
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix spurious missed counter wakeups
In the case of NMI events, close a race window that can occur if an NMI
hits counter code that temporarily disables+enables a counter, and the NMI
leaks into the disabled section.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
these warnings:
arch/x86/kernel/paravirt-spinlocks.c: In function ‘default_spin_lock_flags’:
arch/x86/kernel/paravirt-spinlocks.c:12: warning: passing argument 1 of ‘__raw_spin_lock’ from incompatible pointer type
arch/x86/kernel/paravirt-spinlocks.c: At top level:
arch/x86/kernel/paravirt-spinlocks.c:11: warning: ‘default_spin_lock_flags’ defined but not used
showed that the prototype of default_spin_lock_flags() was confused about
what type spinlocks have.
the proper type on UP is raw_spinlock_t.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: make perfcounter NMI and IRQ sequence more robust
Make __smp_perf_counter_interrupt() a bit more conservative: first disable
all counters, then read out the status. Most invocations are because there
are real events, so there's no performance impact.
Code flow gets a bit simpler as well this way.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Implement performance counters for x86 Intel CPUs.
It's simplified right now: the PERFMON CPU feature is assumed,
which is available in Core2 and later Intel CPUs.
The design is flexible to be extended to more CPU types as well.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup on 32-bit
Peter pointed this parameter can be changed.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: fix early panic with boot option "nosmp"
x86/oprofile: fix Intel cpu family 6 detection
oprofile: fix CPU unplug panic in ppro_stop()
AMD IOMMU: fix possible race while accessing iommu->need_sync
AMD IOMMU: set device table entry for aliased devices
AMD IOMMU: struct amd_iommu remove padding on 64 bit
x86: fix broken flushing in GART nofullflush path
x86: fix dma_mapping_error for 32bit x86
Impact: fix boot crash with numcpus=0 on certain systems
Fix early exception in __get_smp_config with nosmp.
Bail out early when there is no MP table.
Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Alan Jenkins wrote:
> This is on an EeePC 701, /proc/cpuinfo as attached.
>
> Is this expected? Will the next release work?
>
> Thanks, Alan
>
> # opcontrol --setup --no-vmlinux
> cpu_type 'unset' is not valid
> you should upgrade oprofile or force the use of timer mode
>
> # opcontrol -v
> opcontrol: oprofile 0.9.4 compiled on Nov 29 2008 22:44:10
>
> # cat /dev/oprofile/cpu_type
> i386/p6
> # uname -r
> 2.6.28-rc6eeepc
Hi Alan,
Looking at the kernel driver code for oprofile it can return the "i386/p6" for
the cpu_type. However, looking at the user-space oprofile code there isn't the
matching entry in libop/op_cpu_type.c or the events/unit_mask files in
events/i386 directory.
The Intel AP-485 says this is a "Intel Pentium M processor model D". Seems like
the oprofile kernel driver should be identifying the processor as "i386/p6_mobile"
The driver identification code doesn't look quite right in nmi_init.c
http://git.kernel.org/?p=linux/kernel/git/sfr/linux-next.git;a=blob;f=arch/x86/oprofile/nmi_int.c;h=022cd41ea9b4106e5884277096e80e9088a7c7a9;hb=HEAD
has:
409 case 10 ... 13:
410 *cpu_type = "i386/p6";
411 break;
Referring to the Intel AP-485:
case 10 and 11 should produce "i386/piii"
case 13 should produce "i386/p6_mobile"
I didn't see anything for case 12.
Something like the attached patch. I don't have a celeron machine to verify that
changes in this area of the kernel fix thing.
-Will
Signed-off-by: William Cohen <wcohen@redhat.com>
Tested-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Acked-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
If oprofile statically compiled in kernel, a cpu unplug triggers
a panic in ppro_stop(), because a NULL pointer is dereferenced.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
The access to the iommu->need_sync member needs to be protected by the
iommu->lock. Otherwise this is a possible race condition. Fix it with
this patch.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
In some rare cases a request can arrive an IOMMU with its originial
requestor id even it is aliased. Handle this by setting the device table
entry to the same protection domain for the original and the aliased
requestor id.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Remove 16 bytes of padding from struct amd_iommu on 64bit builds
reducing its size to 120 bytes, allowing it to span one fewer
cachelines.
Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Impact: remove stale IOTLB entries
In the non-default nofullflush case the GART is only flushed when
next_bit wraps around. But it can happen that an unmap operation unmaps
memory which is behind the current next_bit location. If these addresses
are reused it may result in stale GART IO/TLB entries. Fix this by
setting the GART next_bit always behind an unmapped location.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'kvm-updates/2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
KVM: MMU: avoid creation of unreachable pages in the shadow
KVM: ppc: stop leaking host memory on VM exit
KVM: MMU: fix sync of ptes addressed at owner pagetable
KVM: ia64: Fix: Use correct calling convention for PAL_VPS_RESUME_HANDLER
KVM: ia64: Fix incorrect kbuild CFLAGS override
KVM: VMX: Fix interrupt loss during race with NMI
KVM: s390: Fix problem state handling in guest sigp handler
Impact: do not expose a control that has no effect
Fix to prevent sched_mc_power_saving from being exported through sysfs
on single-socket systems. (Say multicore single socket (Laptop))
CPU core map of the boot cpu should be equal to possible number
of cpus for single socket system.
This fix has been developed at FOSS.in kernel workout.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
irq.h: fix missing/extra kernel-doc
genirq: __irq_set_trigger: change pr_warning to pr_debug
irq: fix typo
x86: apic honour irq affinity which was set in early boot
genirq: fix the affinity setting in setup_irq
genirq: keep affinities set from userspace across free/request_irq()
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: always define DECLARE_PCI_UNMAP* macros
x86: fixup config space size of CPU functions for AMD family 11h
x86, bts: fix wrmsr and spinlock over kmalloc
x86, pebs: fix PEBS record size configuration
x86, bts: turn macro into static inline function
x86, bts: exclude ds.c from build when disabled
arch/x86/kernel/pci-calgary_64.c: change simple_strtol to simple_strtoul
x86: use limited register constraint for setnz
xen: pin correct PGD on suspend
x86: revert irq number limitation
x86: fixing __cpuinit/__init tangle, xsave_cntxt_init()
x86: fix __cpuinit/__init tangle in init_thread_xstate()
oprofile: fix an overflow in ppro code
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
[CPUFREQ] powernow-k8: ignore out-of-range PstateStatus value
[CPUFREQ] Documentation: Add Blackfin to list of supported processors
All architectures now use the generic compat_sys_ptrace, as should every
new architecture that needs 32bit compat (if we'll ever get another).
Remove the now superflous __ARCH_WANT_COMPAT_SYS_PTRACE define, and also
kill a comment about __ARCH_SYS_PTRACE that was added after
__ARCH_SYS_PTRACE was already gone.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
... so get xen-ops.h in agreement with xen/smp.c
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>