Age | Commit message (Collapse) | Author |
|
|
|
wait_queue is a swiss army knife and in most of the cases the
complexity is not needed. For RT waitqueues are a constant source of
trouble as we can't convert the head lock to a raw spinlock due to
fancy and long lasting callbacks.
Provide a slim version, which allows RT to replace wait queues. This
should go mainline as well, as it lowers memory consumption and
runtime overhead.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Add a CONFIG option to allow the output from Magic SysRq to be output
immediately, even if this causes large latencies.
If PREEMPT_RT_FULL, printk() will not try to acquire the console lock
when interrupts or preemption are disabled. If the console lock is
not acquired the printk() output will be buffered, but will not be
output immediately. Some drivers call into the Magic SysRq code
with interrupts or preemption disabled, so the output of Magic SysRq
will be buffered instead of printing immediately if this option is
not selected.
Even with this option selected, Magic SysRq output will be delayed
if the attempt to acquire the console lock fails.
Signed-off-by: Frank Rowand <frank.rowand@am.sony.com>
Link: http://lkml.kernel.org/r/4E7CEF60.5020508@am.sony.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
In fact, with migrate_disable() existing one could play games with
kmap_atomic. You could save/restore the kmap_atomic slots on context
switch (if there are any in use of course), this should be esp easy now
that we have a kmap_atomic stack.
Something like the below.. it wants replacing all the preempt_disable()
stuff with pagefault_disable() && migrate_disable() of course, but then
you can flip kmaps around like below.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
[dvhart@linux.intel.com: build fix]
Link: http://lkml.kernel.org/r/1311842631.5890.208.camel@twins
|
|
On 07/27/2011 04:37 PM, Thomas Gleixner wrote:
> - KGDB (not yet disabled) is reportedly unusable on -rt right now due
> to missing hacks in the console locking which I dropped on purpose.
>
To work around this in the short term you can use this patch, in
addition to the clocksource watchdog patch that Thomas brewed up.
Comments are welcome of course. Ultimately the right solution is to
change separation between the console and the HW to have a polled mode
+ work queue so as not to introduce any kind of latency.
Thanks,
Jason.
|
|
There are (probably rare) situations when a system crashed and the system
console becomes unresponsive but the network icmp layer still is alive.
Wouldn't it be wonderful, if we then could submit a sysreq command via ping?
This patch provides this facility. Please consult the updated documentation
Documentation/sysrq.txt for details.
Signed-off-by: Carsten Emde <C.Emde@osadl.org>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The current workqueue code does crazy stuff on cpu unplug, it relies on
forced affine breakage, thereby violating per-cpu expectations. Worse,
it tries to re-attach to a cpu if the thing comes up again before all
previously queued works are finished. This breaks (admittedly bonkers)
cpu-hotplug use that relies on a down-up cycle to push all usage away.
Introduce a new WQ_NON_AFFINE flag that indicates a per-cpu workqueue
will not respect cpu affinity and use this to migrate all its pending
works to whatever cpu is doing cpu-down.
This also adds a warning for queue_on_cpu() users which warns when its
used on WQ_NON_AFFINE workqueues for the API implies you care about
what cpu things are ran on when such workqueues cannot guarantee this.
For the rest, simply flush all per-cpu works and don't mess about.
This also means that currently all workqueues that are manually
flushing things on cpu-down in order to provide the per-cpu guarantee
no longer need to do so.
In short, we tell the WQ what we want it to do, provide validation for
this and loose ~250 lines of code.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Implementing RCU-bh in terms of RCU-preempt makes the system vulnerable
to network-based denial-of-service attacks. This patch therefore
makes __do_softirq() invoke rcu_bh_qs(), but only when __do_softirq()
is running in ksoftirqd context. A wrapper layer in interposed so that
other calls to __do_softirq() avoid invoking rcu_bh_qs(). The underlying
function __do_softirq_common() does the actual work.
The reason that rcu_bh_qs() is bad in these non-ksoftirqd contexts is
that there might be a local_bh_enable() inside an RCU-preempt read-side
critical section. This local_bh_enable() can invoke __do_softirq()
directly, so if __do_softirq() were to invoke rcu_bh_qs() (which just
calls rcu_preempt_qs() in the PREEMPT_RT_FULL case), there would be
an illegal RCU-preempt quiescent state in the middle of an RCU-preempt
read-side critical section. Therefore, quiescent states can only happen
in cases where __do_softirq() is invoked directly from ksoftirqd.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20111005184518.GA21601@linux.vnet.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
kernel/rcutorture.c:492: error: ‘synchronize_rcu_bh’ undeclared here (not in a function)
synchronize_rcu_bh() is not just called as a normal function, but can
also be referenced as a function pointer. When CONFIG_PREEMPT_RT_FULL
is enabled, synchronize_rcu_bh() is defined as synchronize_rcu(), but
needs to be defined without the parenthesis because the compiler will
complain when synchronize_rcu_bh is referenced as a function pointer
and not a function.
Signed-off-by: John Kacur <jkacur@redhat.com>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: stable-rt@vger.kernel.org
Link: http://lkml.kernel.org/r/1321235083-21756-1-git-send-email-jkacur@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The Linux kernel has long RCU-bh read-side critical sections that
intolerably increase scheduling latency under mainline's RCU-bh rules,
which include RCU-bh read-side critical sections being non-preemptible.
This patch therefore arranges for RCU-bh to be implemented in terms of
RCU-preempt for CONFIG_PREEMPT_RT_FULL=y.
This has the downside of defeating the purpose of RCU-bh, namely,
handling the case where the system is subjected to a network-based
denial-of-service attack that keeps at least one CPU doing full-time
softirq processing. This issue will be fixed by a later commit.
The current commit will need some work to make it appropriate for
mainline use, for example, it needs to be extended to cover Tiny RCU.
[ paulmck: Added a useful changelog ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20111005185938.GA20403@linux.vnet.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
This fixes the following build error for the preempt-rt kernel.
make kernel/fork.o
CC kernel/fork.o
kernel/fork.c:90: error: section of ¡tasklist_lock¢ conflicts with previous declaration
make[2]: *** [kernel/fork.o] Error 1
make[1]: *** [kernel/fork.o] Error 2
The rt kernel cache aligns the RWLOCK in DEFINE_RWLOCK by default.
The non-rt kernels explicitly cache align only the tasklist_lock in
kernel/fork.c
That can create a build conflict. This fixes the build problem by making the
non-rt kernels cache align RWLOCKs by default. The side effect is that
the other RWLOCKs are also cache aligned for non-rt.
This is a short term solution for rt only.
The longer term solution would be to push the cache aligned DEFINE_RWLOCK
to mainline. If there are objections, then we could create a
DEFINE_RWLOCK_CACHE_ALIGNED or something of that nature.
Comments? Objections?
Signed-off-by: John Kacur <jkacur@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/alpine.LFD.2.00.1109191104010.23118@localhost6.localdomain6
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Map spinlocks, rwlocks, rw_semaphores and semaphores to the rt_mutex
based locking functions for preempt-rt.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
When CONFIG_PREEMPT_RT_FULL is enabled, tasklets run as threads,
and spinlocks turn are mutexes. But this can cause issues with
tasks disabling tasklets. A tasklet runs under ksoftirqd, and
if a tasklets are disabled with tasklet_disable(), the tasklet
count is increased. When a tasklet runs, it checks this counter
and if it is set, it adds itself back on the softirq queue and
returns.
The problem arises in RT because ksoftirq will see that a softirq
is ready to run (the tasklet softirq just re-armed itself), and will
not sleep, but instead run the softirqs again. The tasklet softirq
will still see that the count is non-zero and will not execute
the tasklet and requeue itself on the softirq again, which will
cause ksoftirqd to run it again and again and again.
It gets worse because ksoftirqd runs as a real-time thread.
If it preempted the task that disabled tasklets, and that task
has migration disabled, or can't run for other reasons, the tasklet
softirq will never run because the count will never be zero, and
ksoftirqd will go into an infinite loop. As an RT task, it this
becomes a big problem.
This is a hack solution to have tasklet_disable stop tasklets, and
when a tasklet runs, instead of requeueing the tasklet softirqd
it delays it. When tasklet_enable() is called, and tasklets are
waiting, then the tasklet_enable() will kick the tasklets to continue.
This prevents the lock up from ksoftirq going into an infinite loop.
[ rostedt@goodmis.org: ported to 3.0-rt ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
kernel/lockdep.c: In function ‘print_bad_irq_dependency’:
kernel/lockdep.c:1476:3: warning: format ‘%lu’ expects type ‘long unsigned int’, but argument 7 has type ‘unsigned int’
kernel/lockdep.c: In function ‘print_usage_bug’:
kernel/lockdep.c:2193:3: warning: format ‘%lu’ expects type ‘long unsigned int’, but argument 7 has type ‘unsigned int’
kernel/lockdep.i show this:
printk("%s/%d [HC%u[%lu]:SC%u[%lu]:HE%u:SE%u] is trying to acquire:\n",
curr->comm, task_pid_nr(curr),
curr->hardirq_context, ((current_thread_info()->preempt_count) & (((1UL << (10))-1) << ((0 + 8) + 8))) >> ((0 + 8) + 8),
curr->softirq_context, (0U) >> (0 + 8),
curr->hardirqs_enabled,
curr->softirqs_enabled);
Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Link: http://lkml.kernel.org/r/20111013091909.GA32739@zhy
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
<NMI> [<ffffffff812dafd8>] spin_bug+0x94/0xa8
[<ffffffff812db07f>] do_raw_spin_lock+0x43/0xea
[<ffffffff814fa9be>] _raw_spin_lock_irqsave+0x6b/0x85
[<ffffffff8106ff9e>] ? migrate_disable+0x75/0x12d
[<ffffffff81078aaf>] ? pin_current_cpu+0x36/0xb0
[<ffffffff8106ff9e>] migrate_disable+0x75/0x12d
[<ffffffff81115b9d>] pagefault_disable+0xe/0x1f
[<ffffffff81047027>] copy_from_user_nmi+0x74/0xe6
[<ffffffff810489d7>] perf_callchain_user+0xf3/0x135
Now clearly we can't go around taking locks from NMI context, cure
this by short-circuiting migrate_disable() when we're in an atomic
context already.
Add some extra debugging to avoid things like:
preempt_disable()
migrate_disable();
preempt_enable();
migrate_enable();
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1314967297.1301.14.camel@twins
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/n/tip-wbot4vsmwhi8vmbf83hsclk6@git.kernel.org
|
|
Make migrate_disable() be a preempt_disable() for !rt kernels. This
allows generic code to use it but still enforces that these code
sections stay relatively small.
A preemptible migrate_disable() accessible for general use would allow
people growing arbitrary per-cpu crap instead of clean these things
up.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-275i87sl8e1jcamtchmehonm@git.kernel.org
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
get_online_cpus() is a heavy weight function which involves a global
mutex. migrate_disable() wants a simpler construct which prevents only
a CPU from going doing while a task is in a migrate disabled section.
Implement a per cpu lockless mechanism, which serializes only in the
real unplug case on a global mutex. That serialization affects only
tasks on the cpu which should be brought down.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Needs thread context (pgd_lock) -> ifdeffed. workqueues wont work with
RT
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
posix-cpu-timer code takes non -rt safe locks in hard irq
context. Move it to a thread.
[ 3.0 fixes from Peter Zijlstra <peterz@infradead.org> ]
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
In preempt-rt we can not call the callbacks which take sleeping locks
from the timer interrupt context.
Bring back the softirq split for now, until we fixed the signal
delivery problem for real.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Make cancellation of a running callback in softirq context safe
against preemption.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
When softirqs can be preempted we need to make sure that cancelling
the timer from the active thread can not deadlock vs. a running timer
callback. Add a waitqueue to resolve that.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
He below is a boot-tested hack to shrink the page frame size back to
normal.
Should be a net win since there should be many less PTE-pages than
page-frames.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|