path: root/arch/arc
Age | Commit message | Author
2014-06-05 | Merge branch 'linux-3.10.40' into rel-21 | Ishan Mittal
Bug 200004122 Conflicts: drivers/cpufreq/cpufreq.c drivers/regulator/core.c sound/soc/codecs/max98090.c Change-Id: I9418a05ad5c56b2e902249218bac2fa594d99f56 Signed-off-by: Ishan Mittal <imittal@nvidia.com>
2014-05-13 | ARC: !PREEMPT: Ensure Return to kernel mode is IRQ safe | Vineet Gupta
commit 8aa9e85adac609588eeec356e5a85059b3b819ba upstream.

There was a very small race window where resume to kernel mode from an Exception Path (or pure kernel mode, which is true for most ARC exceptions anyway) was not disabling interrupts in restore_regs, clobbering the exception regs.

Anton found the culprit call flow (after many sleepless nights):

| 1. we got a Trap from user land
| 2. started to service it.
| 3. While doing some stuff on user-land memory (I think it is padzero()),
|    we got a DataTlbMiss
| 4. On return from it we are taking "resume_kernel_mode" path
| 5. NEED_RESCHED is not set, so we go to "return from exception" path in
|    restore regs.
| 6. there seems to be IRQ happening

Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Cc: Anton Kolesov <Anton.Kolesov@synopsys.com> Cc: Francois Bedard <Francois.Bedard@synopsys.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-13 | ARC: Entry Handler tweaks: Optimize away redundant IRQ_DISABLE_SAVE | Vineet Gupta
commit fce16bc35ae4a45634f3dc348d8d297a25c277cf upstream. In the exception return path, for both U/K cases, intr are already disabled (for various existing reasons). So when we drop down to @restore_regs, we need not redo that. There was subtle issue - when intr were NOT being disabled for ret-to-kernel-but-no-preemption case - now fixed by moving the IRQ_DISABLE further up in @resume_kernel_mode. So what do we gain: * Shaves off a few insn in return path. * Eliminates the need for IRQ_DISABLE_SAVE assembler macro for ARCv2 hence allows for entry code sharing. Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-13 | ARC: Entry Handler tweaks: Simplify branch for in-kernel preemption | Vineet Gupta
commit 147aece29b15051173eb1e767018135361cdba89 upstream. Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-30 | lib: Add missing arch generic-y entries for asm-generic/hash.h | David S. Miller
Conflicts: arch/avr32/include/asm/Kbuild arch/c6x/include/asm/Kbuild arch/cris/include/asm/Kbuild arch/ia64/include/asm/Kbuild arch/mips/include/asm/Kbuild arch/openrisc/include/asm/Kbuild arch/powerpc/include/asm/Kbuild arch/score/include/asm/Kbuild Change-Id: I6800e3f03dbc40e94de1495459dec4b29df4474e Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit e3fec2f74f7f90d2149a24243a4d040caabe6f30) Signed-off-by: Ishan Mittal <imittal@nvidia.com>
2014-04-30 | kernel: remove CONFIG_USE_GENERIC_SMP_HELPERS | Christoph Hellwig
We've switched over every architecture that supports SMP to it, so remove the new useless config variable. Conflicts: arch/arm/Kconfig block/blk-mq.c Change-Id: Ic19c3ac07a38a1636d6aa2fed5e55a58833f9b2c Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 0a06ff068f1255bcd7965ab07bc0f4adc3eb639a) Signed-off-by: Ishan Mittal <imittal@nvidia.com>
2014-04-30 | of: remove early_init_dt_setup_initrd_arch | Rob Herring
All arches do essentially the same thing now for early_init_dt_setup_initrd_arch, so it can now be removed. Conflicts: arch/arm/mm/init.c arch/c6x/kernel/devicetree.c arch/powerpc/kernel/prom.c (cherry picked from commit 29eb45a9ab4839a1e9cef2bcf369b918c9c4fcad) Acked-by: Vineet Gupta <vgupta@synopsys.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Mark Salter <msalter@redhat.com> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: Chris Zankel <chris@zankel.net> Cc: Max Filippov <jcmvbkbc@gmail.com> Acked-by: Grant Likely <grant.likely@linaro.org> Change-Id: I84b59cec16fa7e96fa8ff3de04aa959f7039a7a9 Signed-off-by: Rob Herring <rob.herring@calxeda.com> Signed-off-by: Ishan Mittal <imittal@nvidia.com>
2014-04-30 | sched, arch: Create asm/preempt.h | Peter Zijlstra
In order to prepare to per-arch implementations of preempt_count move the required bits into an asm-generic header and use this for all archs. (cherry picked from commit a787870924dbd6f321661e06d4ec1c7a408c9ccf) Conflicts: arch/c6x/include/asm/Kbuild arch/cris/include/asm/Kbuild arch/h8300/include/asm/Kbuild arch/ia64/include/asm/Kbuild arch/mips/include/asm/Kbuild arch/openrisc/include/asm/Kbuild arch/powerpc/include/asm/Kbuild arch/score/include/asm/Kbuild include/linux/preempt.h Change-Id: I544914d3c23cc50da658296a34f9f2796854e259 Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-h5j0c1r3e3fk015m30h8f1zx@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ishan Mittal <imittal@nvidia.com>
2014-04-30 | arch: mm: pass userspace fault flag to generic fault handler | Johannes Weiner
Unlike global OOM handling, memory cgroup code will invoke the OOM killer in any OOM situation because it has no way of telling faults occurring in kernel context - which could be handled more gracefully - from user-triggered faults. Pass a flag that identifies faults originating in user space from the architecture-specific fault handlers to generic code so that memcg OOM handling can be improved. (cherry picked from commit 759496ba6407c6994d6a5ce3a5e74937d7816208) Conflicts: arch/arc/mm/fault.c Change-Id: I6ddf37c0feae69fcda0c2db76d2b10ca2a11c619 Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Michal Hocko <mhocko@suse.cz> Cc: David Rientjes <rientjes@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: azurIt <azurit@pobox.sk> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-30 | of: consolidate definition of early_init_dt_alloc_memory_arch() | Grant Likely
Most architectures use the same implementation. Collapse the common ones into a single weak function that can be overridden. (cherry picked from commit a1727da599ad030ccaf4073473fd235c8ee28219) Change-Id: I5cfcac67b98407f8e4c4dcb89829f2a8e0d1b88b Signed-off-by: Grant Likely <grant.likely@linaro.org>
2014-04-30 | of: Specify initrd location using 64-bit | Santosh Shilimkar
On some PAE architectures, the entire range of physical memory could reside outside the 32-bit limit. These systems need the ability to specify the initrd location using 64-bit numbers. This patch globally modifies the early_init_dt_setup_initrd_arch() function to use 64-bit numbers instead of the current unsigned long. There has been quite a bit of debate about whether to use u64 or phys_addr_t. It was concluded to stick to u64 to be consistent with rest of the device tree code. As summarized by Geert, "The address to load the initrd is decided by the bootloader/user and set at that point later in time. The dtb should not be tied to the kernel you are booting" More details on the discussion can be found here: https://lkml.org/lkml/2013/6/20/690 https://lkml.org/lkml/2012/9/13/544 (cherry picked from commit 374d5c9964c10373ba39bbe934f4262eb87d7114) Change-Id: Iab36378e1de4e6c2cb07a3b88aeb5ff4afbe535b Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Acked-by: Rob Herring <rob.herring@calxeda.com> Acked-by: Vineet Gupta <vgupta@synopsys.com> Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com> Signed-off-by: Grant Likely <grant.likely@linaro.org>
2014-04-14 | ARC: [nsimosci] Unbork console | Vineet Gupta
commit 61fb4bfc010b0d2940f7fd87acbce6a0f03217cb upstream. Despite the switch to the right UART driver (prev patch), the serial console still doesn't work due to missing CONFIG_SERIAL_OF_PLATFORM. Also fix the default cmdline in DT to not refer to the out-of-tree ARC framebuffer driver for console. Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Cc: Francois Bedard <Francois.Bedard@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-14 | ARC: [nsimosci] Change .dts to use generic 8250 UART | Mischa Jonker
commit 6eda477b3c54b8236868c8784e5e042ff14244f0 upstream. The Synopsys APB DW UART has a couple of special features that are not in the System C model. In 3.8, the 8250_dw driver didn't really use these features, but from 3.9 onwards, the 8250_dw driver has become incompatible with our model. Signed-off-by: Mischa Jonker <mjonker@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Cc: Francois Bedard <Francois.Bedard@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-12-16 | Merge tag 'v3.10.24' into HEAD | Ajay Nandakumar
This is the 3.10.24 stable release Change-Id: Ibd2734f93d44385ab86867272a1359158635133b
2013-11-13 | ARC: Incorrect mm reference used in vmalloc fault handler | Vineet Gupta
commit 9c41f4eeb9d51f3ece20428d35a3ea32cf3b5622 upstream. A vmalloc fault needs to sync up PGD/PTE entry from init_mm to current task's "active_mm". ARC vmalloc fault handler however was using mm. A vmalloc fault for non user task context (actually pre-userland, from init thread's open for /dev/console) caused the handler to deref NULL mm (for mm->pgd). The reasons it worked so far are amazing: 1. By default (!SMP), vmalloc fault handler uses a cached value of PGD. In SMP that MMU register is repurposed, hence the need for mm pointer deref. 2. In pre-3.12 SMP kernel, the problem triggering vmalloc didn't exist in pre-userland code path - it was introduced with commit 20bafb3d23d108bc "n_tty: Move buffers into n_tty_data". Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Cc: Gilad Ben-Yossef <gilad@benyossef.com> Cc: Noam Camus <noamc@ezchip.com> Cc: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-31 | Merge tag 'v3.10.17' into dev-kernel-3.10 | Ajay Nandakumar
This is the 3.10.17 stable release Conflicts: drivers/usb/host/xhci.c Change-Id: I6bd3b15ff92a0b94568b9d02e9bb1036becfca20
2013-10-18 | ARC: Ignore ptrace SETREGSET request for synthetic register "stop_pc" | Vineet Gupta
commit 5b24282846c064ee90d40fcb3a8f63b8e754fd28 upstream. ARCompact TRAP_S insn used for breakpoints, commits before exception is taken (updating architectural PC). So ptregs->ret contains next-PC and not the breakpoint PC itself. This is different from other restartable exceptions such as TLB Miss where ptregs->ret has exact faulting PC. gdb needs to know exact-PC hence ARC ptrace GETREGSET provides for @stop_pc which returns ptregs->ret vs. EFA depending on the situation. However, writing stop_pc (SETREGSET request), which updates ptregs->ret, doesn't make sense since stop_pc doesn't always correspond to that reg as described above. This was not an issue so far since user_regs->ret / user_regs->stop_pc had the same value, and writing both to ptregs->ret was OK: needless, but NOT broken, hence not observed. With gdb "jump", they diverge, and user_regs->ret updating ptregs is overwritten immediately with stop_pc, which this patch fixes. Reported-by: Anton Kolesov <akolesov@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-18 | ARC: Fix signal frame management for SA_SIGINFO | Christian Ruppert
commit 10469350e345599dfef3fa78a7c19fb230e674c1 upstream. Previously, when a signal was registered with SA_SIGINFO, parameters 2 and 3 of the signal handler were written to registers r1 and r2 before the register set was saved. This led to corruption of these two registers after returning from the signal handler (the wrong values were restored). With this patch, registers are now saved before any parameters are passed, thus maintaining the processor state from before signal entry. Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-18 | ARC: Workaround spinlock livelock in SMP SystemC simulation | Vineet Gupta
commit 6c00350b573c0bd3635436e43e8696951dd6e1b6 upstream.

Some ARC SMP systems lack native atomic R-M-W (LLOCK/SCOND) insns and can only use atomic EX insn (reg with mem) to build higher level R-M-W primitives. This includes a SystemC based SMP simulation model.

So rwlocks need to use a protecting spinlock for atomic cmp-n-exchange operation to update reader(s)/writer count.

The spinlock operation itself looks as follows:

        mov reg, 1              ; 1=locked, 0=unlocked
    retry:
        EX reg, [lock]          ; load existing, store 1, atomically
        BREQ reg, 1, retry      ; if already locked, retry

In single-threaded simulation, SystemC alternates between the 2 cores with "N" insn each based scheduling. Additionally for insn with global side effect, such as EX writing to shared mem, a core switch is enforced too.

Given that, with 2 cores doing a repeated EX on the same location, Linux often got into a livelock, e.g. when both cores were fiddling with tasklist lock (gdbserver / hackbench) for read/write respectively, as the sequence diagram below shows:

        core1                                   core2
        --------                                --------
    1.  spin lock [EX r=0, w=1] - LOCKED
    2.  rwlock(Read)            - LOCKED
    3.  spin unlock [ST 0]      - UNLOCKED
                                                spin lock [EX r=0,w=1] - LOCKED
                        -- resched core 1----
    5.  spin lock [EX r=1] - ALREADY-LOCKED
                        -- resched core 2----
    6.                                          rwlock(Write) - READER-LOCKED
    7.                                          spin unlock [ST 0]
    8.                                          rwlock failed, retry again
    9.                                          spin lock [EX r=0, w=1]
                        -- resched core 1----
    10  spinlock locked in #9, retry #5
    11. spin lock [EX gets 1]
                        -- resched core 2----
    ...
    ...

The fix was to unlock using the EX insn too (step 7), to trigger another SystemC scheduling pass which would let core1 proceed, eliding the livelock.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
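The lock and unlock pairing described in this commit can be modelled in portable C11 as a rough sketch. This is not the ARC kernel code: `demo_spinlock_t` and the `demo_*` functions are hypothetical names, and `atomic_exchange` merely stands in for the EX instruction.

```c
#include <stdatomic.h>

/* Hypothetical lock type for illustration: 1 = locked, 0 = unlocked. */
typedef struct {
    atomic_uint slock;
} demo_spinlock_t;

/* Acquire: swap 1 into the lock word; retry while the old value was 1. */
static void demo_spin_lock(demo_spinlock_t *lock)
{
    while (atomic_exchange(&lock->slock, 1u) == 1u)
        ; /* already locked by someone else, keep spinning */
}

/* Release with a plain store (the "ST 0" steps in the diagram). */
static void demo_spin_unlock_store(demo_spinlock_t *lock)
{
    atomic_store(&lock->slock, 0u);
}

/* Release with an exchange, as the fix describes; returns the old value. */
static unsigned int demo_spin_unlock_exchange(demo_spinlock_t *lock)
{
    return atomic_exchange(&lock->slock, 0u);
}
```

Both unlock variants leave the lock word at 0. According to the commit message, the difference only matters to the SystemC scheduler, which treats the exchange as an instruction with a global side effect and switches cores.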
2013-10-18 | ARC: Fix 32-bit wrap around in access_ok() | Vineet Gupta
commit 0752adfda15f0eca9859a76da3db1800e129ad43 upstream. Anton reported | LTP tests syscalls/process_vm_readv01 and process_vm_writev01 fail | similarly in one testcase test_iov_invalid -> lvec->iov_base. | Testcase expects errno EFAULT and return code -1, | but it gets return code 1 and ERRNO is 0 what means success. Essentially test case was passing a pointer of -1 which access_ok() was not catching. It was doing [@addr + @sz <= TASK_SIZE] which would pass for @addr == -1 Fixed that by rewriting as [@addr <= TASK_SIZE - @sz] Reported-by: Anton Kolesov <Anton.Kolesov@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
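The wrap-around is easy to reproduce outside the kernel. The following is a minimal sketch, assuming 32-bit addresses; `DEMO_TASK_SIZE` and both function names are made up for illustration, and the extra `sz <= DEMO_TASK_SIZE` guard is an addition of this sketch so the subtraction itself cannot wrap.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space limit, chosen only for this illustration. */
#define DEMO_TASK_SIZE 0x60000000u

/* Old form: addr + sz is computed modulo 2^32 and can land below the limit. */
static bool demo_range_ok_wrapping(uint32_t addr, uint32_t sz)
{
    return (uint32_t)(addr + sz) <= DEMO_TASK_SIZE;
}

/* Rewritten form: nothing is added to the user-supplied address. */
static bool demo_range_ok_fixed(uint32_t addr, uint32_t sz)
{
    return sz <= DEMO_TASK_SIZE && addr <= DEMO_TASK_SIZE - sz;
}
```

With a pointer of -1 (0xFFFFFFFF) and a size of 1, the first check accepts the range because the sum wraps to 0, while the second rejects it.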
2013-10-18 | ARC: Handle zero-overhead-loop in unaligned access handler | Mischa Jonker
commit c11eb222fd7d4db91196121dbf854178505d2751 upstream. If a load or store is the last instruction in a zero-overhead-loop, and it's misaligned, the loop would execute only once. This fixes that problem. Signed-off-by: Mischa Jonker <mjonker@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
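One way to picture the problem: after emulating the misaligned access, the handler has to advance the return address itself, so it also has to do the loop-back that hardware would normally do at the loop end. The sketch below is a simplified model, not the kernel's handler; `struct demo_regs`, its field names, and the assumption that `lp_count` holds the number of iterations still to run are all illustrative.

```c
/* Hypothetical register snapshot; field names are illustrative only. */
struct demo_regs {
    unsigned long ret;       /* address execution resumes at */
    unsigned long lp_start;  /* first instruction of the loop body */
    unsigned long lp_end;    /* address just past the loop body */
    unsigned long lp_count;  /* assumed: iterations still to run */
};

/* Step over an emulated instruction, looping back if it ended the loop body. */
static void demo_skip_emulated_insn(struct demo_regs *regs, unsigned long instr_len)
{
    regs->ret += instr_len;

    if (regs->ret == regs->lp_end && regs->lp_count > 0) {
        regs->lp_count--;
        regs->ret = regs->lp_start;
    }
}
```

Without the second step, execution would simply fall through past the loop end, which matches the "loop would execute only once" symptom in the commit message.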
2013-10-18 | ARC: Fix __udelay calculation | Mischa Jonker
commit 7efd0da2d17360e1cef91507dbe619db0ee2c691 upstream. Cast usecs to u64, to ensure that the (usecs * 4295 * HZ) multiplication is 64 bit. Initially, the (usecs * 4295 * HZ) part was done as a 32 bit multiplication, with the result cast to 64 bit. This led to some bits falling off, causing a "DMA initialization error" in the stmmac Ethernet driver, due to a premature timeout. Signed-off-by: Mischa Jonker <mjonker@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
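The effect of where the cast sits can be shown in isolation. This is a small sketch rather than the kernel's `__udelay()`: `DEMO_HZ` is a hypothetical tick rate and only the scaling product from the commit message is reproduced.

```c
#include <stdint.h>

/* Hypothetical tick rate, chosen only for this illustration. */
#define DEMO_HZ 100u

/* Product computed in 32 bits and widened afterwards: high bits are already lost. */
static uint64_t demo_scaled_32bit(uint32_t usecs)
{
    return (uint64_t)(uint32_t)(usecs * 4295u * DEMO_HZ);
}

/* Operand widened first, so the whole multiplication is done in 64 bits. */
static uint64_t demo_scaled_64bit(uint32_t usecs)
{
    return (uint64_t)usecs * 4295u * DEMO_HZ;
}
```

For a 20000 us delay the true product is 8590000000, which does not fit in 32 bits; the truncated version yields 65408, i.e. a far shorter delay than requested.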
2013-10-18 | ARC: SMP failed to boot due to missing IVT setup | Noam Camus
commit c3567f8a359b7917dcffa442301f88ed0a75211f upstream. Commit 05b016ecf5e7a "ARC: Setup Vector Table Base in early boot" moved the Interrupt vector Table setup out of arc_init_IRQ() which is called for all CPUs, to entry point of boot cpu only, breaking booting of others. Fix by adding the same to entry point of non-boot CPUs too. read_arc_build_cfg_regs() printing IVT Base Register didn't help the cause since it prints a synthetic value if zero which is totally bogus, so fix that to print the exact Register. [vgupta: Remove the now stale comment from header of arc_init_IRQ and also added the commentary for halt-on-reset] Cc: Gilad Ben-Yossef <gilad@benyossef.com> Signed-off-by: Noam Camus <noamc@ezchip.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-18 | ARC: Setup Vector Table Base in early boot | Vineet Gupta
commit 05b016ecf5e7a8c24409d8e9effb5d2ec9107708 upstream. Otherwise early boot exceptions such as instruction errors due to configuration mismatch between kernel and hardware go off to la-la land, as opposed to hitting the handler and panic()'ing properly. Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-09-14 | UPSTREAM arch: mm: pass userspace fault flag to generic fault handler | Prashant Gaikwad
Unlike global OOM handling, memory cgroup code will invoke the OOM killer in any OOM situation because it has no way of telling faults occurring in kernel context - which could be handled more gracefully - from user-triggered faults. Pass a flag that identifies faults originating in user space from the architecture-specific fault handlers to generic code so that memcg OOM handling can be improved. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Michal Hocko <mhocko@suse.cz> Cc: David Rientjes <rientjes@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: azurIt <azurit@pobox.sk> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from commit 407c454cb0ac6e68ca66974da787a71118cfef84) Conflicts: arch/arc/mm/fault.c arch/arm64/mm/fault.c arch/metag/mm/fault.c arch/parisc/mm/fault.c Change-Id: Iee53942737627be8dd8e2e325b5ba87fe85d6814 Signed-off-by: Prashant Gaikwad <pgaikwad@nvidia.com> Reviewed-on: http://git-master/r/266410 GVS: Gerrit_Virtual_Submit Reviewed-by: Sachin Nikam <snikam@nvidia.com>
2013-08-29 | ARC: [lib] strchr breakage in Big-endian configuration | Joern Rennecke
commit b0f55f2a1a295c364be012e82dbab079a2454006 upstream.

For a search buffer that is 2 byte aligned, strchr() was returning a pointer outside of the buffer (buf - 1):

------------->8----------------
// Input buffer (default 4 byte aligned)
char *buffer = "1AA_";
// actual search start (to mimic 2 byte alignment)
char *current_line = &(buffer[2]);
// Character to search for
char c = 'A';
char *c_pos = strchr(current_line, c);
printf("%s\n", c_pos) --> 'AA_' as opposed to 'A_'
------------->8----------------

Reported-by: Anton Kolesov <Anton.Kolesov@synopsys.com> Debugged-by: Anton Kolesov <Anton.Kolesov@synopsys.com> Cc: Noam Camus <noamc@ezchip.com> Signed-off-by: Joern Rennecke <joern.rennecke@embecosm.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
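The reproducer from the commit message can be turned into a small regression check along these lines. It is only a sketch: `demo_buffer` and `demo_search_from_offset2` are illustrative names, and C does not guarantee that the array is 4 byte aligned, so this approximates the original scenario rather than forcing it.

```c
#include <string.h>

/* Same contents as the buffer in the commit message. */
static const char demo_buffer[] = "1AA_";

/* Start the search at offset 2, like current_line in the original snippet. */
static const char *demo_search_from_offset2(char c)
{
    return strchr(&demo_buffer[2], c);
}
```

A correct strchr() returns a pointer to offset 2 (the string "A_"); the broken big-endian version returned offset 1 ("AA_"), one byte before the start of the search.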
2013-08-29 | ARC: gdbserver breakage in Big-Endian configuration #2 | Vineet Gupta
[Based on mainline commit 352c1d95e3220d0: "ARC: stop using pt_regs->orig_r8"] Stop using orig_r8 as it could get clobbered by ST in trap_with_param, and further it is semantically not needed either. Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-08-29 | ARC: gdbserver breakage in Big-Endian configuration #1 | Vineet Gupta
[Based on mainline commit 502a0c775c7f0a: "ARC: pt_regs update #5"] gdbserver needs @stop_pc, served by ptrace, but fetched from pt_regs differently, based on in_brkpt_traps(), which in turn relies on additional machine state in pt_regs->event bitfield. unsigned long orig_r8:16, event:16; For big endian config, this macro was returning false, despite being in breakpoint Trap exception, causing wrong @stop_pc to be returned to gdb. Issue #1: In BE, @event above is at offset 2 in word, while a STW insn at offset 0 was used to update it. Resort to using ST insn which updates the half-word at right location. Issue #2: The union involving bitfields causes all the members to be laid out at offset 0. So with fix #1 above, ASM was now updating at offset 2, "C" code was still referencing at offset 0. Fixed by wrapping bitfield in a struct. Reported-by: Noam Camus <noamc@ezchip.com> Tested-by: Anton Kolesov <akolesov@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
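Issue #2 follows from how C lays out union members: each direct member starts at offset 0, bitfields included. A minimal sketch of the difference is below; the union and function names are hypothetical, `unsigned int` is used instead of `unsigned long` for portability, and the surrounding pt_regs layout is not reproduced.

```c
/* Both bitfields are direct union members, so they share the same bits. */
union demo_overlapped {
    unsigned int orig_r8:16, event:16;
    unsigned int word;
};

/* Wrapped in a struct, the bitfields become consecutive fields of one word. */
union demo_wrapped {
    struct {
        unsigned int orig_r8:16, event:16;
    } f;
    unsigned int word;
};

/* Clear orig_r8, write event, then read orig_r8 back (unwrapped layout). */
static unsigned int demo_overlapped_orig_after_event(unsigned int ev)
{
    union demo_overlapped u;

    u.word = 0;
    u.orig_r8 = 0;
    u.event = ev;       /* lands on the same bits as orig_r8 */
    return u.orig_r8;
}

/* Same sequence with the struct-wrapped layout. */
static unsigned int demo_wrapped_orig_after_event(unsigned int ev)
{
    union demo_wrapped u;

    u.word = 0;
    u.f.orig_r8 = 0;
    u.f.event = ev;     /* separate 16 bits, orig_r8 is untouched */
    return u.f.orig_r8;
}
```

In the unwrapped union a write to `event` shows up in `orig_r8`, which is why the "C" side and the assembly side disagreed about where `event` lived until the bitfields were wrapped in a struct.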
2013-05-25 | ARC: lazy dcache flush broke gdb in non-aliasing configs | Vineet Gupta
gdbserver inserting a breakpoint ends up calling copy_user_page() for a code page. The generic version of which (non-aliasing config) didn't set the PG_arch_1 bit hence update_mmu_cache() didn't sync dcache/icache for corresponding dynamic loader code page - causing garbage to be executed. So now aliasing versions of copy_user_highpage()/clear_page() are made default. There is no significant overhead since all of special alias handling code is compiled out for non-aliasing build. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-05-23 | ARC: Use enough bits for determining page's cache color | Vineet Gupta
The current code uses 2 bits for determining page's dcache color, thus sorting pages into 4 bins, whereas the aliasing dcache really has 2 bins (8k page, 64k dcache - 4 way-set-assoc). This can cause extraneous flushes - e.g. color 0 and 2. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
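The "2 bins" figure follows from the cache geometry quoted above: a 64k, 4-way cache has 16k per way, and 16k divided by an 8k page gives 2 colors. A small sketch of that arithmetic, with hypothetical `DEMO_*` constants and function names:

```c
/* Geometry from the commit message: 8k pages, 64k dcache, 4 ways. */
#define DEMO_PAGE_SHIFT   13u
#define DEMO_DCACHE_SIZE  (64u * 1024u)
#define DEMO_DCACHE_WAYS  4u

/* Number of page colors = bytes per way / page size. */
static unsigned int demo_num_colors(void)
{
    return (DEMO_DCACHE_SIZE / DEMO_DCACHE_WAYS) >> DEMO_PAGE_SHIFT;
}

/* Color using exactly as many bits as the geometry needs (1 bit here). */
static unsigned int demo_page_color(unsigned long addr)
{
    return (unsigned int)(addr >> DEMO_PAGE_SHIFT) & (demo_num_colors() - 1u);
}

/* Color using 2 bits, i.e. 4 bins, as the old code did. */
static unsigned int demo_page_color_2bit(unsigned long addr)
{
    return (unsigned int)(addr >> DEMO_PAGE_SHIFT) & 3u;
}
```

Pages 0 and 2 (addresses 0x0000 and 0x4000) index the same cache lines, yet the 2-bit version reports colors 0 and 2 for them, which is the source of the extraneous flushes.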
2013-05-23 | ARC: Brown paper bag bug in macro for checking cache color | Vineet Gupta
The VM_EXEC check in update_mmu_cache() was getting optimized away because of a stupid error in the definition of macro addr_not_cache_congruent().

The intention was to have the equivalent of the following:

    if (a || (1 ? b : 0))

but we ended up with the following:

    if (a || 1 ? b : 0)

And because the precedence of '||' is higher than that of '?', gcc was optimizing away the evaluation of <a>.

Nasty Repercussions:
1. For non-aliasing configs it would mean some extraneous dcache flushes for non-code pages if U/K mappings were not congruent.
2. For aliasing config, some needed dcache flush for code pages might be missed if U/K mappings were congruent.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
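The two parses can be compared directly. In this sketch the function names are hypothetical and a run-time `aliasing` parameter takes the place of the compile-time constant 1 in the macro, so that the compiler does not fold either expression away.

```c
/* Intended grouping: a || (aliasing ? b : 0) */
static int demo_check_intended(int a, int aliasing, int b)
{
    return a || (aliasing ? b : 0);
}

/* What the macro expanded to: parsed as (a || aliasing) ? b : 0 */
static int demo_check_buggy(int a, int aliasing, int b)
{
    return a || aliasing ? b : 0;
}
```

With `aliasing` set to 1, the buggy form always reduces to `b`, so `a` no longer influences the result: for `a = 1, b = 0` the intended form is true and the buggy form is false.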
2013-05-23 | ARC: copy_(to|from)_user() to honor usermode-access permissions | Vineet Gupta
This manifested as grep failing pseudo-randomly:

-------------->8---------------------
[ARCLinux]$ ip address show lo | grep inet
[ARCLinux]$ ip address show lo | grep inet
[ARCLinux]$ ip address show lo | grep inet
[ARCLinux]$
[ARCLinux]$ ip address show lo | grep inet
inet 127.0.0.1/8 scope host lo
-------------->8---------------------

ARC700 MMU provides fully orthogonal permission bits per page: Ur, Uw, Ux, Kr, Kw, Kx

The user mode page permission templates used to have all Kernel mode access bits enabled. This caused a tricky race condition observed with uClibc buffered file read and UNIX pipes.

1. Read access to an anon mapped page in libc .bss: write-protected zero_page mapped: TLB Entry installed with Ur + K[rwx]

2. grep calls libc:getc() -> buffered read layer calls read(2) with the internal read buffer in same .bss page. The read() call is on STDIN which has been redirected to a pipe. read(2) => sys_read() => pipe_read() => copy_to_user()

3. Since page has Kernel-write permission (despite being user-mode write-protected), copy_to_user() succeeds w/o taking a MMU TLB-Miss Exception (page-fault for ARC). core-MM is unaware that kernel erroneously wrote to the reserved read-only zero-page (BUG #1)

4. Control returns to userspace which now does a write to same .bss page. Since Linux MM is not aware that page has been modified by kernel, it simply reassigns a new writable zero-init page to mapping, losing the prior write by kernel - effectively zero'ing out the libc read buffer under the hood - hence grep doesn't see right data (BUG #2)

The fix is to make all kernel-mode access permissions mirror the user-mode ones. Note that the kernel still has full access to pages, when accessed directly (w/o MMU) - this fix ensures that kernel-mode access in copy_(to|from)_user() path uses the same faulting access model as for pure user accesses to keep MM fully aware of page state.

The issue is pseudo-random because it only shows up if the TLB entry installed in #1 is present at the time of #3. If it is evicted out, due to TLB pressure or some-such, then copy_to_user() does take a TLB Miss Exception, with a routine write-to-anon COW processing installing a fresh page for kernel writes and also usable as it is in userspace.

Further the issue was dormant for so long as it depends on where the libc internal read buffer (in .bss) is mapped at runtime. If it happens to reside in file-backed data mapping of libc (in the page-aligned slack space trailing the file backed data), loader zero padding the slack space, does the early cow page replacement, setting things up at the very beginning itself. With gcc 4.8 based builds, the libc buffer got pushed out to a real anon mapping which triggers the issue.

Reported-by: Anton Kolesov <akolesov@synopsys.com> Cc: <stable@vger.kernel.org> # 3.9 Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
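The difference between the old and new permission templates can be sketched with plain bit flags. Everything here is illustrative: the `DEMO_*` bit positions and the function names are invented for this example and do not correspond to the real ARC700 PTE layout.

```c
/* Hypothetical bit positions for the six orthogonal permission bits. */
enum {
    DEMO_U_R = 1 << 0, DEMO_U_W = 1 << 1, DEMO_U_X = 1 << 2,
    DEMO_K_R = 1 << 3, DEMO_K_W = 1 << 4, DEMO_K_X = 1 << 5
};

/* Old template: kernel gets r/w/x regardless of the user bits. */
static unsigned int demo_prot_old(unsigned int user_bits)
{
    return user_bits | DEMO_K_R | DEMO_K_W | DEMO_K_X;
}

/* Fixed template: each kernel bit mirrors the matching user bit. */
static unsigned int demo_prot_mirrored(unsigned int user_bits)
{
    user_bits &= DEMO_U_R | DEMO_U_W | DEMO_U_X;
    return user_bits | (user_bits << 3);
}

/* Would a kernel-mode write through this mapping go ahead without a fault? */
static int demo_kernel_write_allowed(unsigned int prot)
{
    return (prot & DEMO_K_W) != 0;
}
```

For a user-read-only mapping such as the zero page in step 1, the old template lets the kernel write silently, while the mirrored template makes the same write fault so that the MM can do its usual copy-on-write handling.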
2013-05-23 | ARC: [mm] Prevent stray dcache lines after__sync_icache_dcach() | Vineet Gupta
Flush and INVALIDATE the dcache page. This helper is only used for writeback of CODE pages to memory. So there's no value in keeping the dcache lines around. In fact it is risky, as a writeback on natural eviction under pressure can cause un-needed writeback with weird issues on aliasing dcache configurations. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-05-15 | ARC: [TB10x] Remove redundant abilis,simple-pinctrl mechanism | Christian Ruppert
The TB10x platform port includes a custom mechanism to set up default pin controller configurations using abilis,simple-default pin configurations of nodes compatible with abilis,simple-pinctrl. This mechanism is redundant with the Linux standard "default" pin configuration, see commit ab78029ecc347debbd737f06688d788bd9d60c1d "drivers/pinctrl: grab default handles from device core". This patch removes the TB10x custom mechanism in favour of the Linux standard. Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com> Reviewed-by: Stephen Warren <swarren@nvidia.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-05-10 | Merge tag 'arc-v3.10-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc | Linus Torvalds
Pull second set of arc arch updates from Vineet Gupta:

"Aliasing VIPT dcache support for ARC

I'm satisfied with testing, specially with fuse which has historically given grief to VIPT arches (ARM/PARISC...)"

* tag 'arc-v3.10-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: [TB10x] Remove GENERIC_GPIO
  ARC: [mm] Aliasing VIPT dcache support 4/4
  ARC: [mm] Aliasing VIPT dcache support 3/4
  ARC: [mm] Aliasing VIPT dcache support 2/4
  ARC: [mm] Aliasing VIPT dcache support 1/4
  ARC: [mm] refactor the core (i|d)cache line ops loops
  ARC: [mm] serious bug in vaddr based icache flush
2013-05-10 | ARC: [TB10x] Remove GENERIC_GPIO | Vineet Gupta
This tracks Alexandre Courbot's mainline GPIO rework Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Acked-by: Alexandre Courbot <acourbot@nvidia.com>
2013-05-09 | Merge tag 'arc-v3.10-rc1-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc | Linus Torvalds
Pull ARC port updates from Vineet Gupta:

"Support for two new platforms based on ARC700:
 - Abilis TB10x SoC [Christian/Pierrick]
 - Simulator only System-C Model [Mischa]

ARC specific MM improvements:
 - Avoid full TLB flush (ASID increment) on munmap (even single page)
 - VIPT Cache Flushing improvements
   + Delayed dcache flush for non-aliasing dcache (big performance boost)
   + icache flush aliasing agnostic (no need to kill all possible aliases)

Others:
 - Avoid needless rebuild of DTB files for every kernel build
 - Remove builtin cmdline as that is already provided by DeviceTree/bootargs
 - Fixing unaligned access emulation corner case
 - checkpatch fixes [Sachin]
 - Various fixlets [Noam]
 - Minor build failures/cleanups"

* tag 'arc-v3.10-rc1-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: (35 commits)
  ARC: [mm] Lazy D-cache flush (non aliasing VIPT)
  ARC: [mm] micro-optimize page size icache invalidate
  ARC: [mm] remove the pessimistic all-alias-invalidate icache helpers
  ARC: [mm] consolidate icache/dcache sync code
  ARC: [mm] optimise icache flush for kernel mappings
  ARC: [mm] optimise icache flush for user mappings
  ARC: [mm] optimize needless full mm TLB flush on munmap
  ARC: Add support for nSIM OSCI System C model
  ARC: [TB10x] Adapt device tree to new compatible string
  ARC: [TB10x] Add support for TB10x platform
  ARC: [TB10x] Device tree of TB100 and TB101 Development Kits
  ARC: Prepare interrupt code for external controllers
  ARC: Allow embedded arc-intc to be properly placed in DT intc hierarchy
  ARC: [cmdline] Don't overwrite u-boot provided bootargs
  ARC: [cmdline] Remove CONFIG_CMDLINE
  ARC: [plat-arcfpga] defconfig update
  ARC: unaligned access emulation broken if callee-reg dest of LD/ST
  ARC: unaligned access emulation error handling consolidation
  ARC: Debug/crash-printing Improvements
  ARC: fix typo with clock speed
  ...
2013-05-09 | ARC: [mm] Aliasing VIPT dcache support 4/4 | Vineet Gupta
Enforce congruency of userspace shared mappings Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-05-09 | ARC: [mm] Aliasing VIPT dcache support 3/4 | Vineet Gupta
Fix the one zillion warnings Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-05-09 | ARC: [mm] Aliasing VIPT dcache support 2/4 | Vineet Gupta
This is the meat of the series which prevents any dcache alias creation by always keeping the U and K mapping of a page congruent. If a mapping already exists, and another tries to access the page, the previous one is flushed to physical page (wback+inv). Essentially flush_dcache_page()/copy_user_highpage() create K-mapping of a page, but try to defer flushing, unless U-mapping exists. When page is actually mapped to userspace, update_mmu_cache() flushes the K-mapping (in certain cases this can be optimised out). Additionally flush_cache_mm(), flush_cache_range(), flush_cache_page() handle the purging of stale userspace mappings on exit/munmap... flush_anon_page() handles the existing U-mapping for anon page before kernel reads it via the GUP path. Note that while not complete, this is enough to boot a simple dynamically linked Busybox based rootfs. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-05-09 | ARC: [mm] Aliasing VIPT dcache support 1/4 | Vineet Gupta
This preps the low level dcache flush helpers to take vaddr argument in addition to the existing paddr to properly flush the VIPT dcache Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-05-09 | ARC: [mm] refactor the core (i|d)cache line ops loops | Vineet Gupta
Nothing semantical
* simplify the alignment code by using & operation only
* rename variables clearly as paddr
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-05-09ARC: [mm] serious bug in vaddr based icache flushVineet Gupta
vaddr used to index the cache was clipped from the wrong end, and thus would potentially fail to flush the correct lines. The problem was dormant for so long because up until the recent optimizations it was used only for ptrace breakpoint flushes. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-05-07ARC: [mm] Lazy D-cache flush (non aliasing VIPT)Vineet Gupta
flush_dcache_page() is the MM hook to ensure that a page has consistent views between kernel and userspace. Thus it is called when
* kernel writes to a page which at some later point could get mapped to userspace (so kernel mapping needs to be flushed-n-inv)
* kernel is about to read from a page with possible userspace mappings (so userspace mappings need to be made coherent with kernel ones)

However for a non-aliasing VIPT dcache, any userspace mapping will always be congruent to the kernel mapping. Thus the d-cache need not be flushed at all (or the flush can be delayed indefinitely).

The only reason it does need to be flushed is when mapping code pages. Since the icache doesn't snoop the dcache, those dirty dcache lines need to be written back to memory and the icache lines invalidated so that icache line fetches will get the right data.

Decent gains on LMBench fork/exec/sh and File I/O micro-benchmarks.

(1) FPGA @ 80 MHz

Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host                 OS  Mhz null null      open slct sig  sig  fork exec sh
                             call  I/O stat clos TCP  inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
3.9-rc6-a Linux 3.9.0-r   80 4.79 8.72 66.7 116. 239. 8.39 30.4 4798 14.K 34.K
3.9-rc6-b Linux 3.9.0-r   80 4.79 8.62 65.4 111. 239. 8.35 29.0 3995 12.K 30.K
3.9-rc7-c Linux 3.9.0-r   80 4.79 9.00 66.1 106. 239. 8.61 30.4 2858 10.K 24.K
                                                                ^^^^ ^^^^ ^^^

File & VM system latencies in microseconds - smaller is better
-------------------------------------------------------------------------------
Host                 OS   0K File      10K File     Mmap    Prot   Page   100fd
                        Create Delete Create Delete Latency Fault  Fault  selct
--------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
3.9-rc6-a Linux 3.9.0-r  317.8  204.2 1122.3  375.1  3522.0 4.288    20.7 126.8
3.9-rc6-b Linux 3.9.0-r  298.7  223.0 1141.6  367.8  3531.0 4.866    20.9 126.4
3.9-rc7-c Linux 3.9.0-r  278.4  179.2  862.1  339.3  3705.0 3.223    20.3 126.6
                         ^^^^^  ^^^^^  ^^^^^  ^^^^

(2) Customer Silicon @ 500 MHz (166 MHz mem)

------------------------------------------------------------------------------
Host                 OS  Mhz null null      open slct sig  sig  fork exec sh
                             call  I/O stat clos TCP  inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
abilis-ba Linux 3.9.0-r  497 0.71 1.38 4.58 12.0 35.5 1.40 3.89 2070 5525 13.K
abilis-ca Linux 3.9.0-r  497 0.71 1.40 4.61 11.8 35.6 1.37 3.92 1411 4317 10.K
                                                                ^^^^ ^^^^ ^^^

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
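The deferral described in this commit message can be modelled roughly as follows. This is a user-space sketch, not the kernel implementation: `struct model_page`, its fields and the `model_*` functions are hypothetical stand-ins, and the flush itself is reduced to a counter. The hook that runs after a kernel write only records that the page may hold dirty dcache lines; the real work happens later, and only if the page is mapped executable.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only what the sketch needs. */
struct model_page {
    bool dcache_dirty;  /* kernel wrote to the page, write-back deferred */
    int flush_count;    /* number of real flushes performed so far */
};

/* Kernel wrote to the page: remember it, do not touch the cache yet. */
void model_flush_dcache_page(struct model_page *pg)
{
    pg->dcache_dirty = true;
}

/*
 * Page is being mapped into userspace. On a non-aliasing dcache, data
 * mappings need no flush; only executable mappings force the deferred
 * dcache write-back plus icache invalidate (counted here, not performed).
 */
void model_update_mmu_cache(struct model_page *pg, bool vm_exec)
{
    if (vm_exec && pg->dcache_dirty) {
        pg->flush_count++;
        pg->dcache_dirty = false;
    }
}
```

In this model a page written by the kernel and then mapped as data never pays for a flush, which is where the fork/exec and file I/O gains in the tables would come from.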
2013-05-07ARC: [mm] micro-optimize page size icache invalidateVineet Gupta
start address is already page aligned and size is const PAGE_SIZE, thus fixups for alignment not needed in generated code.

bloat-o-meter vmlinux-mm5 vmlinux
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-32 (-32)
function                                     old     new   delta
__inv_icache_page                             82      50     -32

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-05-07ARC: [mm] remove the pessimistic all-alias-invalidate icache helpersVineet Gupta
No users of this code anymore - so RIP ! Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-05-07ARC: [mm] consolidate icache/dcache sync codeVineet Gupta
Now that we have the same helper used for all icache invalidates (i.e. vaddr+paddr based exact line invalidate), consolidate the open coded calls into one place. Also rename flush_icache_range_vaddr => __sync_icache_dcache Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-05-07ARC: [mm] optimise icache flush for kernel mappingsVineet Gupta
This change continues the theme from the previous commit - this time icache handling for the kernel's own code modification (vmalloc: loadable modules, breakpoints for kprobes/kgdb...)

flush_icache_range() calls the CDU icache helper with vaddr to enable exact line invalidate.

For a true kernel-virtual mapping, the vaddr is actually virtual hence valid as index into cache. For a kprobes breakpoint however, the vaddr arg is actually paddr - since that's how the normal kernel is mapped in the ARC memory map. This implies that the CDU will use the same addr for indexing as for tag match - which is fine since kernel code would only have that "implicit" mapping and none other.

This should speed up module loading significantly - especially on default ARC700 icache configurations (32k) which alias.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-05-07ARC: [mm] optimise icache flush for user mappingsVineet Gupta
ARC icache doesn't snoop dcache thus executable pages need to be made coherent before mapping into userspace in flush_icache_page().

However ARC700 CDU (hardware cache flush module) requires both vaddr (index in cache) as well as paddr (tag match) to correctly identify a line in the VIPT cache. A typical ARC700 SoC has aliasing icache, thus the paddr only based flush_icache_page() API couldn't be implemented efficiently. It had to loop thru all possible alias indexes and perform the invalidate operation (of course the cache op would only succeed at the index(es) where tag matches - typically only 1, but the cost of visiting all the cache-bins needs to be paid nevertheless).

Turns out however that the vaddr (along with paddr) is available in update_mmu_cache() hence better suits ARC icache flush semantics. With both vaddr+paddr, exactly one flush operation per line is done.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
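To put rough numbers on the cost difference described above, here is a small sketch that counts invalidate operations for one page under both schemes. The `struct model_icache` type, the helper names and the example geometry are assumptions for illustration; they are not ARC700 register-level details.

```c
#include <stdint.h>

/* Hypothetical cache geometry; all sizes in bytes. */
struct model_icache {
    uint32_t way_size;   /* bytes indexed by one way */
    uint32_t page_size;
    uint32_t line_len;
};

/* Number of possible alias positions (colours) for one page. */
uint32_t model_num_aliases(const struct model_icache *c)
{
    uint32_t n = c->way_size / c->page_size;

    return n ? n : 1u;   /* a way no larger than a page cannot alias */
}

/* paddr only: every line has to be tried at every possible alias index. */
uint32_t model_ops_paddr_only(const struct model_icache *c)
{
    return (c->page_size / c->line_len) * model_num_aliases(c);
}

/* vaddr + paddr: the index is known, so one operation per line. */
uint32_t model_ops_vaddr_paddr(const struct model_icache *c)
{
    return c->page_size / c->line_len;
}
```

For an assumed 16 KiB way, 8 KiB pages and 32-byte lines, the paddr-only loop issues 512 operations per page against 256 with vaddr+paddr; the gap widens as the number of aliases grows.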
2013-05-07ARC: [mm] optimize needless full mm TLB flush on munmapVineet Gupta
munmap ends up calling tlb_flush() which for ARC was flushing the entire TLB unconditionally (by moving the MMU to a new ASID)

do_munmap
  unmap_region
    unmap_vmas
      unmap_single_vma
        unmap_page_range
          tlb_start_vma
          zap_pud_range
          tlb_end_vma()
    tlb_finish_mmu
      tlb_flush() ---> unconditional flush_tlb_mm()

So even a single page munmap, a frequent operation when the uClibc dynamic linker (ldso) is loading the dependent shared libraries, would move the ASID multiple times - needlessly invalidating the pre-faulted TLB entries (and increasing the rate of ASID wraparound + full TLB flush).

This is now optimised so that flush_tlb_mm() is called only if tlb->full_mm is set (i.e. the exit/execve cases). And for those cases, flush_tlb_mm() is already optimised to be a no-op for mm->mm_users == 0. So essentially there are no more full mm flushes - except for fork which anyhow needs it for properly COW'ing parent address space.

munmap now needs to do TLB range flush, which is implemented with tlb_end_vma()

Results
-------
1. ASID now consistently moves by 4 during a simple ls (as opposed to 5 or 7 before).

2. LMBench microbenchmark also shows improvements

Basic system parameters
------------------------------------------------------------------------------
Host                 OS Description              Mhz  tlb  cache  mem   scal
                                                     pages line   par   load
                                                           bytes
--------- ------------- ----------------------- ---- ----- ----- ------ ----
3.9-rc5-0 Linux 3.9.0-r 3.9-rc5-0404-gcc-4.4-ba   80     8    64 1.1000    1
3.9-rc5-0 Linux 3.9.0-r 3.9-rc5-0405-avoid-full   80     8    64 1.1200    1

Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host                 OS  Mhz null null      open slct sig  sig  fork exec sh
                             call  I/O stat clos TCP  inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
3.9-rc5-0 Linux 3.9.0-r   80 4.81 8.69 68.6 118. 239. 8.53 31.6 4839 13.K 34.K
3.9-rc5-0 Linux 3.9.0-r   80 4.46 8.36 53.8 91.3 223. 8.12 24.2 4725 13.K 33.K

File & VM system latencies in microseconds - smaller is better
-------------------------------------------------------------------------------
Host                 OS   0K File      10K File     Mmap    Prot   Page   100fd
                        Create Delete Create Delete Latency Fault  Fault  selct
--------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
3.9-rc5-0 Linux 3.9.0-r  314.7  223.2 1054.9  390.2  3615.0 1.590    20.1 126.6
3.9-rc5-0 Linux 3.9.0-r  265.8  183.8 1014.2  314.1  3193.0 6.910    18.8 110.4

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
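The behavioural change can be sketched with a toy model in which a full flush simply means moving the address space to a new ASID. Everything here is a simplification under stated assumptions: the `model_*` types and functions are hypothetical, and the per-mm counter stands in for the real ASID allocation, which this sketch does not attempt to reproduce.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for mm_struct and mmu_gather. */
struct model_mm {
    unsigned int asid;  /* simplified: bumped on every full flush */
    int mm_users;
};

struct model_gather {
    struct model_mm *mm;
    bool full_mm;       /* true only when the whole address space goes away */
};

/* Full flush, modelled as moving the mm to a fresh ASID. */
void model_flush_tlb_mm(struct model_mm *mm)
{
    if (mm->mm_users == 0)
        return;         /* address space is dying, nothing to preserve */
    mm->asid++;
}

/* Behaviour before the change: every tlb_flush() is a full flush. */
void model_tlb_flush_old(struct model_gather *tlb)
{
    model_flush_tlb_mm(tlb->mm);
}

/* Behaviour after the change: only full_mm teardown takes this path. */
void model_tlb_flush_new(struct model_gather *tlb)
{
    if (tlb->full_mm)
        model_flush_tlb_mm(tlb->mm);
}
```

In this model a plain munmap (full_mm false) bumps the ASID under the old scheme and leaves it alone under the new one, which matches the lower ASID movement reported in the results.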