From 84209e02de48d72289650cc5a7ae8dd18223620f Mon Sep 17 00:00:00 2001
From: Miklos Szeredi
Date: Fri, 1 Aug 2008 20:28:47 +0200
Subject: mm: dont clear PG_uptodate on truncate/invalidate

Brian Wang reported that a FUSE filesystem exported through NFS could
return I/O errors on read.  This was traced to splice_direct_to_actor()
returning a short or zero count when racing with page invalidation.

However this is not FUSE or NFSD specific, other filesystems (notably
NFS) also call invalidate_inode_pages2() to purge stale data from the
cache.

If this happens while such pages are sitting in a pipe buffer, then
splice(2) from the pipe can return zero, and read(2) from the pipe can
return ENODATA.

The zero return is especially bad, since it implies end-of-file or
disconnected pipe/socket, and is documented as such for splice.  But
returning an error for read() is also nasty, when in fact there was no
error (data becoming stale is not an error).

The same problems can be triggered by "hole punching" with
madvise(MADV_REMOVE).

Fix this by not clearing the PG_uptodate flag on truncation and
invalidation.

Signed-off-by: Miklos Szeredi
Acked-by: Nick Piggin
Cc: Andrew Morton
Cc: Jens Axboe
Signed-off-by: Linus Torvalds
---
 mm/truncate.c | 2 --
 1 file changed, 2 deletions(-)

(limited to 'mm/truncate.c')

diff --git a/mm/truncate.c b/mm/truncate.c
index e68443d74567..894e9a70699f 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -104,7 +104,6 @@ truncate_complete_page(struct address_space *mapping, struct page *page)
 	cancel_dirty_page(page, PAGE_CACHE_SIZE);
 
 	remove_from_page_cache(page);
-	ClearPageUptodate(page);
 	ClearPageMappedToDisk(page);
 	page_cache_release(page);	/* pagecache ref */
 }
@@ -356,7 +355,6 @@ invalidate_complete_page2(struct address_space *mapping, struct page *page)
 	BUG_ON(PagePrivate(page));
 	__remove_from_page_cache(page);
 	spin_unlock_irq(&mapping->tree_lock);
-	ClearPageUptodate(page);
 	page_cache_release(page);	/* pagecache ref */
 	return 1;
 failed:
--
cgit v1.2.3

From 529ae9aaa08378cfe2a4350bded76f32cc8ff0ce Mon Sep 17 00:00:00 2001
From: Nick Piggin
Date: Sat, 2 Aug 2008 12:01:03 +0200
Subject: mm: rename page trylock

Converting page lock to new locking bitops requires a change of page
flag operation naming, so we might as well convert it to something nicer
(!TestSetPageLocked_Lock => trylock_page, SetPageLocked =>
set_page_locked).

This also facilitates lockdeping of page lock.

Signed-off-by: Nick Piggin
Acked-by: KOSAKI Motohiro
Acked-by: Peter Zijlstra
Acked-by: Andrew Morton
Acked-by: Benjamin Herrenschmidt
Signed-off-by: Linus Torvalds
---
 mm/truncate.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

(limited to 'mm/truncate.c')

diff --git a/mm/truncate.c b/mm/truncate.c
index 894e9a70699f..250505091d37 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -187,7 +187,7 @@ void truncate_inode_pages_range(struct address_space *mapping,
 			if (page_index > next)
 				next = page_index;
 			next++;
-			if (TestSetPageLocked(page))
+			if (!trylock_page(page))
 				continue;
 			if (PageWriteback(page)) {
 				unlock_page(page);
@@ -280,7 +280,7 @@ unsigned long __invalidate_mapping_pages(struct address_space *mapping,
 			pgoff_t index;
 			int lock_failed;
 
-			lock_failed = TestSetPageLocked(page);
+			lock_failed = !trylock_page(page);
 
 			/*
 			 * We really shouldn't be looking at the ->index of an
--
cgit v1.2.3