summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2013-01-21Linux 3.4.27v3.4.27Greg Kroah-Hartman
2013-01-21staging: vt6656: Fix inconsistent structure packingBen Hutchings
commit 1ee4c55fc9620451b2a825d793042a7e0775391b upstream. vt6656 has several headers that use the #pragma pack(1) directive to enable structure packing, but never disable it. The layout of structures defined in other headers can then depend on which order the various headers are included in, breaking the One Definition Rule. In practice this resulted in crashes on x86_64 until the order of header inclusion was changed for some files in commit 11d404cb56ecd ('staging: vt6656: fix headers and add cfg80211.'). But we need a proper fix that won't be affected by future changes to the order of inclusion. This removes the #pragma pack(1) directives and adds __packed to the structure definitions for which packing appears to have been intended. Reported-and-tested-by: Malcolm Priestley <tvboxspy@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-21staging: wlan-ng: Fix clamping of returned SSID lengthTormod Volden
commit 811a37effdb11e54e1ff1ddaa944286c88f58487 upstream. Commit 2e254212 broke listing of available network names, since it clamped the length of the returned SSID to WLAN_BSSID_LEN (6) instead of WLAN_SSID_MAXLEN (32). https://bugzilla.kernel.org/show_bug.cgi?id=52501 Signed-off-by: Tormod Volden <debian.tormod@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-21tty: 8250_dw: Fix inverted arguments to serial_out in IRQ handlerMaxime Ripard
commit 68e56cb3a068f9c30971c6117fbbd1e32918e49e upstream. Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-21serial:ifx6x60:Delete SPI timer when shut down portchao bi
commit 014b9b4ce84281ccb3d723c792bed19815f3571a upstream. When shut down SPI port, it's possible that MRDY has been asserted and a SPI timer was activated waiting for SRDY assert, in the case, it needs to delete this timer. Signed-off-by: Chen Jun <jun.d.chen@intel.com> Signed-off-by: channing <chao.bi@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-21USB: option: blacklist network interface on ONDA MT8205 4G LTEBjørn Mork
Signed-off-by: Bjørn Mork <bjorn@mork.no> commit 2291dff02e5f8c708a46a7c4c888f2c467e26642 upstream. The driver description files gives these names to the vendor specific functions on this modem: Diag VID_19D2&PID_0265&MI_00 NMEA VID_19D2&PID_0265&MI_01 AT cmd VID_19D2&PID_0265&MI_02 Modem VID_19D2&PID_0265&MI_03 Net VID_19D2&PID_0265&MI_04 Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-21USB: option: add TP-LINK HSUPA Modem MA180Bjørn Mork
commit 99beb2e9687ffd61c92a9875141eabe6f57a71b9 upstream. The driver description files gives these names to the vendor specific functions on this modem: Diagnostics VID_2357&PID_0201&MI_00 NMEA VID_2357&PID_0201&MI_01 Modem VID_2357&PID_0201&MI_03 Networkcard VID_2357&PID_0201&MI_04 Reported-by: Thomas Schäfer <tschaefer@t-online.de> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-21xen: Fix stack corruption in xen_failsafe_callback for 32bit PVOPS guests.Frediano Ziglio
commit 9174adbee4a9a49d0139f5d71969852b36720809 upstream. This fixes CVE-2013-0190 / XSA-40 There has been an error on the xen_failsafe_callback path for failed iret, which causes the stack pointer to be wrong when entering the iret_exc error path. This can result in the kernel crashing. In the classic kernel case, the relevant code looked a little like: popl %eax # Error code from hypervisor jz 5f addl $16,%esp jmp iret_exc # Hypervisor said iret fault 5: addl $16,%esp # Hypervisor said segment selector fault Here, there are two identical addls on either option of a branch which appears to have been optimised by hoisting it above the jz, and converting it to an lea, which leaves the flags register unaffected. In the PVOPS case, the code looks like: popl_cfi %eax # Error from the hypervisor lea 16(%esp),%esp # Add $16 before choosing fault path CFI_ADJUST_CFA_OFFSET -16 jz 5f addl $16,%esp # Incorrectly adjust %esp again jmp iret_exc It is possible unprivileged userspace applications to cause this behaviour, for example by loading an LDT code selector, then changing the code selector to be not-present. At this point, there is a race condition where it is possible for the hypervisor to return back to userspace from an interrupt, fault on its own iret, and inject a failsafe_callback into the kernel. This bug has been present since the introduction of Xen PVOPS support in commit 5ead97c84 (xen: Core Xen implementation), in 2.6.23. Signed-off-by: Frediano Ziglio <frediano.ziglio@citrix.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-21xen/grant-table: correctly initialize grant table version 1Matt Wilson
commit d0b4d64aadb9f4a90669848de9ef3819050a98cd upstream. Commit 85ff6acb075a484780b3d763fdf41596d8fc0970 (xen/granttable: Grant tables V2 implementation) changed the GREFS_PER_GRANT_FRAME macro from a constant to a conditional expression. The expression depends on grant_table_version being appropriately set. Unfortunately, at init time grant_table_version will be 0. The GREFS_PER_GRANT_FRAME conditional expression checks for "grant_table_version == 1", and therefore returns the number of grant references per frame for v2. This causes gnttab_init() to allocate fewer pages for gnttab_list, as a frame can old half the number of v2 entries than v1 entries. After gnttab_resume() is called, grant_table_version is appropriately set. nr_init_grefs will then be miscalculated and gnttab_free_count will hold a value larger than the actual number of free gref entries. If a guest is heavily utilizing improperly initialized v1 grant tables, memory corruption can occur. One common manifestation is corruption of the vmalloc list, resulting in a poisoned pointer derefrence when accessing /proc/meminfo or /proc/vmallocinfo: [ 40.770064] BUG: unable to handle kernel paging request at 0000200200001407 [ 40.770083] IP: [<ffffffff811a6fb0>] get_vmalloc_info+0x70/0x110 [ 40.770102] PGD 0 [ 40.770107] Oops: 0000 [#1] SMP [ 40.770114] CPU 10 This patch introduces a static variable, grefs_per_grant_frame, to cache the calculated value. gnttab_init() now calls gnttab_request_version() early so that grant_table_version and grefs_per_grant_frame can be appropriately set. A few BUG_ON()s have been added to prevent this type of bug from reoccurring in the future. Signed-off-by: Matt Wilson <msw@amazon.com> Reviewed-and-Tested-by: Steven Noonan <snoonan@amazon.com> Acked-by: Ian Campbell <Ian.Campbell@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Annie Li <annie.li@oracle.com> Cc: xen-devel@lists.xen.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-21drbd: add missing part_round_stats to _drbd_start_io_acctPhilipp Reisner
commit 72585d2428fa3a0daab02ebad1f41e5ef517dbaa upstream. Without this, iostat frequently sees bogus svctime and >= 100% "utilization". Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Cc: Raoul Bhatia <raoul@bhatia.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-21igb: release already assigned MSI-X interrupts if setup failsStefan Assmann
commit 52285b762b3681669215bf1d17ca6143448ab7d3 upstream. During MSI-X setup the system might run out of vectors. If this happens the already assigned vectors for this NIC should be freed before trying the disable MSI-X. Failing to do so results in the following oops. kernel BUG at drivers/pci/msi.c:341! [...] Call Trace: [<ffffffff8128f39d>] pci_disable_msix+0x3d/0x60 [<ffffffffa037d1ce>] igb_reset_interrupt_capability+0x27/0x5c [igb] [<ffffffffa037d229>] igb_clear_interrupt_scheme+0x26/0x2d [igb] [<ffffffffa0384268>] igb_request_irq+0x73/0x297 [igb] [<ffffffffa0384554>] __igb_open+0xc8/0x223 [igb] [<ffffffffa0384815>] igb_open+0x13/0x15 [igb] [<ffffffff8144592f>] __dev_open+0xbf/0x120 [<ffffffff81443e51>] __dev_change_flags+0xa1/0x180 [<ffffffff81445828>] dev_change_flags+0x28/0x70 [<ffffffff814af537>] devinet_ioctl+0x5b7/0x620 [<ffffffff814b01c8>] inet_ioctl+0x88/0xa0 [<ffffffff8142e8a0>] sock_do_ioctl+0x30/0x70 [<ffffffff8142ecf2>] sock_ioctl+0x72/0x270 [<ffffffff8118062c>] do_vfs_ioctl+0x8c/0x340 [<ffffffff81180981>] sys_ioctl+0xa1/0xb0 [<ffffffff815161a9>] system_call_fastpath+0x16/0x1b Code: 48 89 df e8 1f 40 ed ff 4d 39 e6 49 8b 45 10 75 b6 48 83 c4 18 5b 41 5c 41 5d 41 5e 41 5f c9 c3 48 8b 7b 20 e8 3e 91 db ff eb ae <0f> 0b eb fe 0f 1f 84 00 00 00 00 00 55 48 89 e5 0f 1f 44 00 00 RIP [<ffffffff8128e144>] free_msi_irqs+0x124/0x130 RSP <ffff880037503bd8> Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Abdallah Chatila <abdallah.chatila@ericsson.com>
2013-01-21intel-iommu: Prevent devices with RMRRs from being placed into SI DomainTom Mingarelli
commit ea2447f700cab264019b52e2b417d689e052dcfd upstream. This patch is to prevent non-USB devices that have RMRRs associated with them from being placed into the SI Domain during init. This fixes the issue where the RMRR info for devices being placed in and out of the SI Domain gets lost. Signed-off-by: Thomas Mingarelli <thomas.mingarelli@hp.com> Tested-by: Shuah Khan <shuah.khan@hp.com> Reviewed-by: Donald Dutile <ddutile@redhat.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Joerg Roedel <joro@8bytes.org> Cc: CAI Qian <caiqian@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-21target: Add link_magic for fabric allow_link destination target_itemsNicholas Bellinger
commit 0ff8754981261a80f4b77db2536dfea92c2d4539 upstream. This patch adds [dev,lun]_link_magic value assignment + checks within generic target_fabric_port_link() and target_fabric_mappedlun_link() code to ensure destination config_item *target_item sent from configfs_symlink() -> config_item_operations->allow_link() is the underlying se_device->dev_group and se_lun->lun_group that we expect to symlink. Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: CAI Qian <caiqian@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-21r8169: avoid NAPI scheduling delay.Francois Romieu
commit 7dbb491878a2c51d372a8890fa45a8ff80358af1 upstream. While reworking the r8169 driver a few months ago to perform the smallest amount of work in the irq handler, I took care of avoiding any irq mask register operation in the slow work dedicated user context thread. The slow work thread scheduled an extra round of NAPI work which would ultimately set the irq mask register as required, thus keeping such irq mask operations in the NAPI handler. It would eventually race with the irq handler and delay NAPI execution for - assuming no further irq - a whole ksoftirqd period. Mildly a problem for rare link changes or corner case PCI events. The race was always lost after the last bh disabling lock had been removed from the work thread and people started wondering where those pesky "NOHZ: local_softirq_pending 08" messages came from. Actually the irq mask register _can_ be set up directly in the slow work thread. Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Reported-by: Dave Jones <davej@redhat.com> Tested-by: Marc Dionne <marc.c.dionne@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-01-21ext4: init pagevec in ext4_da_block_invalidatepagesEric Sandeen
commit 66bea92c69477a75a5d37b9bfed5773c92a3c4b4 upstream. ext4_da_block_invalidatepages is missing a pagevec_init(), which means that pvec->cold contains random garbage. This affects whether the page goes to the front or back of the LRU when ->cold makes it to free_hot_cold_page() Reviewed-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: CAI Qian <caiqian@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-21ALSA: usb - fix race in creation of M-Audio Fast track pro driverDavid Henningsson
commit b98ae2729dea161edc96c9d177459b6c28bcbba5 upstream. A patch in the 3.2 kernel caused regression with hotplugging the M-Audio Fast track pro, or sound after suspend. I don't have the device so I haven't done a full analysis, but it seems userspace (both udev and pulseaudio) got confused when a card was created, immediately destroyed, and then created again. However, at least one person in the bug report (martin djfun) reports that this patch resolves the issue for him. It also leaves a message in the log: "snd-usb-audio: probe of 1-1.1:1.1 failed with error -5" which is a bit misleading. It is better than non-working audio, but maybe there's a more elegant solution? BugLink: https://bugs.launchpad.net/bugs/1095315 Signed-off-by: David Henningsson <david.henningsson@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Cc: CAI Qian <caiqian@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-21x86/Sandy Bridge: reserve pages when integrated graphics is presentJesse Barnes
commit a9acc5365dbda29f7be2884efb63771dc24bd815 upstream. SNB graphics devices have a bug that prevent them from accessing certain memory ranges, namely anything below 1M and in the pages listed in the table. So reserve those at boot if set detect a SNB gfx device on the CPU to avoid GPU hangs. Stephane Marchesin had a similar patch to the page allocator awhile back, but rather than reserving pages up front, it leaked them at allocation time. [ hpa: made a number of stylistic changes, marked arrays as static const, and made less verbose; use "memblock=debug" for full verbosity. ] Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: CAI Qian <caiqian@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-21s390/time: fix sched_clock() overflowHeiko Carstens
commit ed4f20943cd4c7b55105c04daedf8d63ab6d499c upstream. Converting a 64 Bit TOD format value to nanoseconds means that the value must be divided by 4.096. In order to achieve that we multiply with 125 and divide by 512. When used within sched_clock() this triggers an overflow after appr. 417 days. Resulting in a sched_clock() return value that is much smaller than previously and therefore may cause all sort of weird things in subsystems that rely on a monotonic sched_clock() behaviour. To fix this implement a tod_to_ns() helper function which converts TOD values without overflow and call this function from both places that open coded the conversion: sched_clock() and kvm_s390_handle_wait(). Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-21target: Release se_cmd when LUN lookup fails for TMRRoland Dreier
commit 5a3b6fc0092c5f8dee7820064ee54d2631d48573 upstream. When transport_lookup_tmr_lun() fails and we return a task management response from target_complete_tmr_failure(), we need to call transport_cmd_check_stop_to_fabric() to release the last ref to the cmd after calling se_tfo->queue_tm_rsp(), or else we will never remove the failed TMR from the session command list (and we'll end up waiting forever when trying to tear down the session). (nab: Fix minor compile breakage) Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-21tcm_fc: Do not report target role when target is not definedMark Rustad
commit edec8dfefa1f372b2dd8197da555352e76a10c03 upstream. Clear the target role when no target is provided for the node performing a PRLI. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Reviewed-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com> Acked by Robert Love <robert.w.love@intel.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-21tcm_fc: Do not indicate retry capability to initiatorsMark Rustad
commit f2eeba214bcd0215b7f558cab6420e5fd153042b upstream. When generating a PRLI response to an initiator, clear the FCP_SPPF_RETRY bit in the response. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Reviewed-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com> Acked by Robert Love <robert.w.love@intel.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-21sh: Fix FDPIC binary loaderThomas Schwinge
commit 4a71997a3279a339e7336ea5d0cd27282e2dea44 upstream. Ensure that the aux table is properly initialized, even when optional features are missing. Without this, the FDPIC loader did not work. This was meant to be included in commit d5ab780305bb6d60a7b5a74f18cf84eb6ad153b1. Signed-off-by: Thomas Schwinge <thomas@codesourcery.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17Linux 3.4.26v3.4.26Greg Kroah-Hartman
2013-01-17USB: fix endpoint-disabling for failed config changesAlan Stern
commit 36caff5d795429c572443894e8789c2150dd796b upstream. This patch (as1631) fixes a bug that shows up when a config change fails for a device under an xHCI controller. The controller needs to be told to disable the endpoints that have been enabled for the new config. The existing code does this, but before storing the information about which endpoints were enabled! As a result, any second attempt to install the new config is doomed to fail because xhci-hcd will refuse to enable an endpoint that is already enabled. The patch optimistically initializes the new endpoints' device structures before asking the device to switch to the new config. If the request fails then the endpoint information is already stored, so we can use usb_hcd_alloc_bandwidth() to disable the endpoints with no trouble. The rest of the error path is slightly more complex now; we have to disable the new interfaces and call put_device() rather than simply deallocating them. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-and-tested-by: Matthias Schniedermeyer <ms@citd.de> CC: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17staging: comedi: Kconfig: COMEDI_NI_AT_A2150 should select COMEDI_FCIan Abbott
commit 34ffb33e09132401872fe79e95c30824ce194d23 upstream. The 'ni_at_a2150' module links to `cfc_write_to_buffer` in the 'comedi_fc' module, so selecting 'COMEDI_NI_AT_A2150' in the kernel config needs to also select 'COMEDI_FC'. Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17staging: comedi: don't hijack hardware device private dataIan Abbott
commit c43435d7722134ed1fda58ce1025f41029bd58ad upstream. comedi_auto_config() associates a Comedi minor device number with an auto-configured hardware device and comedi_auto_unconfig() disassociates it. Currently, these use the hardware device's private data pointer to point to some allocated storage holding the minor device number. This is a bit of a waste of the hardware device's private data pointer, preventing it from being used for something more useful by the low-level comedi device drivers. For example, it would make more sense if comedi_usb_auto_config() was passed a pointer to the struct usb_interface instead of the struct usb_device, but this cannot be done currently because the low-level comedi drivers already use the private data pointer in the struct usb_interface for something more useful. This patch stops the comedi core hijacking the hardware device's private data pointer. Instead, comedi_auto_config() stores a pointer to the hardware device's struct device in the struct comedi_device_file_info associated with the minor device number, and comedi_auto_unconfig() calls new function comedi_find_board_minor() to recover the minor device number associated with the hardware device. Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17libceph: Unlock unprocessed pages in start_read() error pathDavid Zafman
Function start_read() can get an error before processing all pages. It must not only release the remaining pages, but unlock them too. This fixes http://tracker.newdream.net/issues/3370 Signed-off-by: David Zafman <david.zafman@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com> (cherry picked from commit 8884d53dd63b1d9315b343564fcbe1ede004a99e) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17ceph: call handle_cap_grant() for cap import messageYan, Zheng
If client sends cap message that requests new max size during exporting caps, the exporting MDS will drop the message quietly. So the client may wait for the reply that updates the max size forever. call handle_cap_grant() for cap import message can avoid this issue. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit 0e5e1774a92e6fe9c511585de8f078b4c4c68dbb) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17ceph: Fix __ceph_do_pending_vmtruncateYan, Zheng
we should set i_truncate_pending to 0 after page cache is truncated to i_truncate_size Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit a85f50b6ef93fbbb2ae932ce9b2376509d172796) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17ceph: Don't add dirty inode to dirty list if caps is in migrationYan, Zheng
Add dirty inode to cap_dirty_migrating list instead, this can avoid ceph_flush_dirty_caps() entering infinite loop. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit 0685235ffd9dbdb9ccbda587f8a3c83ad1d5a921) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17ceph: Fix infinite loop in __wake_requestsYan, Zheng
__wake_requests() will enter infinite loop if we use it to wake requests in the session->s_waiting list. __wake_requests() deletes requests from the list and __do_request() adds requests back to the list. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit ed75ec2cd19b47efcd292b6e23f58e56f4c5bc34) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17ceph: Don't update i_max_size when handling non-auth capYan, Zheng
The cap from non-auth mds doesn't have a meaningful max_size value. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit 5e62ad30157d0da04cf40c6d1a2f4bc840948b9c) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17rbd: do not allow remove of mounted-on imageAlex Elder
There is no check in rbd_remove() to see if anybody holds open the image being removed. That's not cool. Add a simple open count that goes up and down with opens and closes (releases) of the device, and don't allow an rbd image to be removed if the count is non-zero. Protect the updates of the open count value with ctl_mutex to ensure the underlying rbd device doesn't get removed while concurrently being opened. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> (based on commit 42382b709bd1d143b9f0fa93e0a3a1f2f4210707)
2013-01-17rbd: fix bug in rbd_dev_id_put()Alex Elder
In rbd_dev_id_put(), there's a loop that's intended to determine the maximum device id in use. But it isn't doing that at all, the effect of how it's written is to simply use the just-put id number, which ignores whole purpose of this function. Fix the bug. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com> (cherry picked from commit b213e0b1a62637b2a9395a34349b13d73ca2b90a) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17rbd: BUG on invalid layoutSage Weil
This shouldn't actually be possible because the layout struct is constructed from the RBD header and validated then. [elder@inktank.com: converted BUG() call to equivalent rbd_assert()] Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com> (based on commit 6cae3717cddaf8e5e96e304733dca66e40d56f89)
2013-01-17rbd: remove linger unconditionallyAlex Elder
In __unregister_linger_request(), the request is being removed from the osd client's req_linger list only when the request has a non-null osd pointer. It should be done whether or not the request currently has an osd. This is most likely a non-issue because I believe the request will always have an osd when this function is called. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> (cherry picked from commit 61c74035626beb25a39b0273ccf7d75510bc36a1) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17ceph: don't reference req after putAlex Elder
In __unregister_request(), there is a call to list_del_init() referencing a request that was the subject of a call to ceph_osdc_put_request() on the previous line. This is not safe, because the request structure could have been freed by the time we reach the list_del_init(). Fix this by reversing the order of these lines. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit 7d5f24812bd182a2471cb69c1c2baf0648332e1f) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17libceph: remove 'osdtimeout' optionSage Weil
This would reset a connection with any OSD that had an outstanding request that was taking more than N seconds. The idea was that if the OSD was buggy, the client could compensate by resending the request. In reality, this only served to hide server bugs, and we haven't actually seen such a bug in quite a while. Moreover, the userspace client code never did this. More importantly, often the request is taking a long time because the OSD is trying to recover, or overloaded, and killing the connection and retrying would only make the situation worse by giving the OSD more work to do. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com> (cherry picked from commit 83aff95eb9d60aff5497e9f44a2ae906b86d8e88) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17rbd: kill notify_timeout optionAlex Elder
The "notify_timeout" rbd device option is never used, so get rid of it. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Yehuda Sadeh <yehuda@inktank.com> (cherry picked from commit 84d34dcc116e117a41c6fc8be13430529fc2d9e7) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17rbd: add read_only rbd map optionAlex Elder
Add the ability to map an rbd image read-only, by specifying either "read_only" or "ro" as an option on the rbd "command line." Also allow the inverse to be explicitly specified using "read_write" or "rw". Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Yehuda Sadeh <yehuda@inktank.com> (based on commit cc0538b62c839c2df7b9f8378bb37e3b35faa608)
2013-01-17rbd: kill create_snap sysfs entryAlex Elder
Josh proposed the following change, and I don't think I could explain it any better than he did: From: Josh Durgin <josh.durgin@inktank.com> Date: Tue, 24 Jul 2012 14:22:11 -0700 To: ceph-devel <ceph-devel@vger.kernel.org> Message-ID: <500F1203.9050605@inktank.com> From: Josh Durgin <josh.durgin@inktank.com> Right now the kernel still has one piece of rbd management duplicated from the rbd command line tool: snapshot creation. There's nothing special about snapshot creation that makes it advantageous to do from the kernel, so I'd like to remove the create_snap sysfs interface. That is, /sys/bus/rbd/devices/<id>/create_snap would be removed. Does anyone rely on the sysfs interface for creating rbd snapshots? If so, how hard would it be to replace with: rbd snap create pool/image@snap Is there any benefit to the sysfs interface that I'm missing? Josh This patch implements this proposal, removing the code that implements the "snap_create" sysfs interface for rbd images. As a result, quite a lot of other supporting code goes away. [elder@inktank.com: commented out rbd_req_sync_exec() to avoid warning] Suggested-by: Josh Durgin <josh.durgin@inktank.com> Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com> (based on commit 02cdb02ceab1f3dd9ac2bc899fc51f0e0e744782)
2013-01-17libceph: avoid using freed osd in __kick_osd_requests()Alex Elder
If an osd has no requests and no linger requests, __reset_osd() will just remove it with a call to __remove_osd(). That drops a reference to the osd, and therefore the osd may have been free by the time __reset_osd() returns. That function offers no indication this may have occurred, and as a result the osd will continue to be used even when it's no longer valid. Change__reset_osd() so it returns an error (ENODEV) when it deletes the osd being reset. And change __kick_osd_requests() so it returns immediately (before referencing osd again) if __reset_osd() returns *any* error. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> (cherry picked from commit 685a7555ca69030739ddb57a47f0ea8ea80196a4) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17libceph: fix osdmap decode error pathsSage Weil
Ensure that we set the err value correctly so that we do not pass a 0 value to ERR_PTR and confuse the calling code. (In particular, osd_client.c handle_map() will BUG(!newmap)). Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com> (cherry picked from commit 0ed7285e0001b960c888e5455ae982025210ed3d) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17libceph: fix protocol feature mismatch failure pathSage Weil
We should not set con->state to CLOSED here; that happens in ceph_fault() in the caller, where it first asserts that the state is not yet CLOSED. Avoids a BUG when the features don't match. Since the fail_protocol() has become a trivial wrapper, replace calls to it with direct calls to reset_connection(). Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com> (cherry picked from commit 0fa6ebc600bc8e830551aee47a0e929e818a1868) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17libceph: WARN, don't BUG on unexpected connection statesAlex Elder
A number of assertions in the ceph messenger are implemented with BUG_ON(), killing the system if connection's state doesn't match what's expected. At this point our state model is (evidently) not well understood enough for these assertions to trigger a BUG(). Convert all BUG_ON(con->state...) calls to be WARN_ON(con->state...) so we learn about these issues without killing the machine. We now recognize that a connection fault can occur due to a socket closure at any time, regardless of the state of the connection. So there is really nothing we can assert about the state of the connection at that point so eliminate that assertion. Reported-by: Ugis <ugis22@gmail.com> Tested-by: Ugis <ugis22@gmail.com> Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> (cherry picked from commit 122070a2ffc91f87fe8e8493eb0ac61986c5557c) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17libceph: always reset osds when kickingAlex Elder
When ceph_osdc_handle_map() is called to process a new osd map, kick_requests() is called to ensure all affected requests are updated if necessary to reflect changes in the osd map. This happens in two cases: whenever an incremental map update is processed; and when a full map update (or the last one if there is more than one) gets processed. In the former case, the kick_requests() call is followed immediately by a call to reset_changed_osds() to ensure any connections to osds affected by the map change are reset. But for full map updates this isn't done. Both cases should be doing this osd reset. Rather than duplicating the reset_changed_osds() call, move it into the end of kick_requests(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> (cherry picked from commit e6d50f67a6b1a6252a616e6e629473b5c4277218) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17libceph: move linger requests sooner in kick_requests()Alex Elder
The kick_requests() function is called by ceph_osdc_handle_map() when an osd map change has been indicated. Its purpose is to re-queue any request whose target osd is different from what it was when it was originally sent. It is structured as two loops, one for incomplete but registered requests, and a second for handling completed linger requests. As a special case, in the first loop if a request marked to linger has not yet completed, it is moved from the request list to the linger list. This is as a quick and dirty way to have the second loop handle sending the request along with all the other linger requests. Because of the way it's done now, however, this quick and dirty solution can result in these incomplete linger requests never getting re-sent as desired. The problem lies in the fact that the second loop only arranges for a linger request to be sent if it appears its target osd has changed. This is the proper handling for *completed* linger requests (it avoids issuing the same linger request twice to the same osd). But although the linger requests added to the list in the first loop may have been sent, they have not yet completed, so they need to be re-sent regardless of whether their target osd has changed. The first required fix is we need to avoid calling __map_request() on any incomplete linger request. Otherwise the subsequent __map_request() call in the second loop will find the target osd has not changed and will therefore not re-send the request. Second, we need to be sure that a sent but incomplete linger request gets re-sent. If the target osd is the same with the new osd map as it was when the request was originally sent, this won't happen. This can be fixed through careful handling when we move these requests from the request list to the linger list, by unregistering the request *before* it is registered as a linger request. This works because a side-effect of unregistering the request is to make the request's r_osd pointer be NULL, and *that* will ensure the second loop actually re-sends the linger request. Processing of such a request is done at that point, so continue with the next one once it's been moved. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> (cherry picked from commit ab60b16d3c31b9bd9fd5b39f97dc42c52a50b67d) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17libceph: register request before unregister lingerAlex Elder
In kick_requests(), we need to register the request before we unregister the linger request. Otherwise the unregister will reset the request's osd pointer to NULL. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> (cherry picked from commit c89ce05e0c5a01a256100ac6a6019f276bdd1ca6) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17libceph: don't use rb_init_node() in ceph_osdc_alloc_request()Alex Elder
The red-black node in the ceph osd request structure is initialized in ceph_osdc_alloc_request() using rbd_init_node(). We do need to initialize this, because in __unregister_request() we call RB_EMPTY_NODE(), which expects the node it's checking to have been initialized. But rb_init_node() is apparently overkill, and may in fact be on its way out. So use RB_CLEAR_NODE() instead. For a little more background, see this commit: 4c199a93 rbtree: empty nodes have no color" Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> (cherry picked from commit a978fa20fb657548561dddbfb605fe43654f0825) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17libceph: init event->node in ceph_osdc_create_event()Alex Elder
The red-black node node in the ceph osd event structure is not initialized in create_osdc_create_event(). Because this node can be the subject of a RB_EMPTY_NODE() call later on, we should ensure the node is initialized properly for that. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> (cherry picked from commit 3ee5234df68d253c415ba4f2db72ad250d9c21a9) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>