summaryrefslogtreecommitdiff
path: root/fs/btrfs/inode.c
AgeCommit message (Collapse)Author
2009-10-01Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: Btrfs: fix data space leak fix Btrfs: remove duplicates of filemap_ helpers Btrfs: take i_mutex before generic_write_checks Btrfs: fix arguments to btrfs_wait_on_page_writeback_range Btrfs: fix deadlock with free space handling and user transactions Btrfs: fix error cases for ioctl transactions Btrfs: Use CONFIG_BTRFS_POSIX_ACL to enable ACL code Btrfs: introduce missing kfree Btrfs: Fix setting umask when POSIX ACLs are not enabled Btrfs: proper -ENOSPC handling
2009-10-01const: constify remaining file_operationsAlexey Dobriyan
[akpm@linux-foundation.org: fix KVM] Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-01Merge branch 'master' of ↵Chris Mason
git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable into for-linus
2009-10-01Btrfs: fix data space leak fixJosef Bacik
There is a problem where page_mkwrite can be called on a dirtied page that already has a delalloc range associated with it. The fix is to clear any delalloc bits for the range we are dirtying so the space accounting gets handled properly. This is the same thing we do in the normal write case, so we are consistent across the board. With this patch we no longer leak reserved space. Signed-off-by: Josef Bacik <jbacik@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-10-01Merge branch 'master' of ↵Chris Mason
git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable into for-linus
2009-09-28Btrfs: proper -ENOSPC handlingJosef Bacik
At the start of a transaction we do a btrfs_reserve_metadata_space() and specify how many items we plan on modifying. Then once we've done our modifications and such, just call btrfs_unreserve_metadata_space() for the same number of items we reserved. For keeping track of metadata needed for data I've had to add an extent_io op for when we merge extents. This lets us track space properly when we are doing sequential writes, so we don't end up reserving way more metadata space than what we need. The only place where the metadata space accounting is not done is in the relocation code. This is because Yan is going to be reworking that code in the near future, so running btrfs-vol -b could still possibly result in a ENOSPC related panic. This patch also turns off the metadata_ratio stuff in order to allow users to more efficiently use their disk space. This patch makes it so we track how much metadata we need for an inode's delayed allocation extents by tracking how many extents are currently waiting for allocation. It introduces two new callbacks for the extent_io tree's, merge_extent_hook and split_extent_hook. These help us keep track of when we merge delalloc extents together and split them up. Reservations are handled prior to any actually dirty'ing occurs, and then we unreserve after we dirty. btrfs_unreserve_metadata_for_delalloc() will make the appropriate unreservations as needed based on the number of reservations we currently have and the number of extents we currently have. Doing the reservation outside of doing any of the actual dirty'ing lets us do things like filemap_flush() the inode to try and force delalloc to happen, or as a last resort actually start allocation on all delalloc inodes in the fs. This has survived dbench, fs_mark and an fsx torture test. Signed-off-by: Josef Bacik <jbacik@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-09-24Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: (42 commits) Btrfs: hash the btree inode during fill_super Btrfs: relocate file extents in clusters Btrfs: don't rename file into dummy directory Btrfs: check size of inode backref before adding hardlink Btrfs: fix releasepage to avoid unlocking extents we haven't locked Btrfs: Fix test_range_bit for whole file extents Btrfs: fix errors handling cached state in set/clear_extent_bit Btrfs: fix early enospc during balancing Btrfs: deal with NULL space info Btrfs: account for space used by the super mirrors Btrfs: fix extent entry threshold calculation Btrfs: remove dead code Btrfs: fix bitmap size tracking Btrfs: don't keep retrying a block group if we fail to allocate a cluster Btrfs: make balance code choose more wisely when relocating Btrfs: fix arithmetic error in clone ioctl Btrfs: add snapshot/subvolume destroy ioctl Btrfs: change how subvolumes are organized Btrfs: do not reuse objectid of deleted snapshot/subvol Btrfs: speed up snapshot dropping ...
2009-09-24Merge branch 'hwpoison' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6 * 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6: (21 commits) HWPOISON: Enable error_remove_page on btrfs HWPOISON: Add simple debugfs interface to inject hwpoison on arbitary PFNs HWPOISON: Add madvise() based injector for hardware poisoned pages v4 HWPOISON: Enable error_remove_page for NFS HWPOISON: Enable .remove_error_page for migration aware file systems HWPOISON: The high level memory error handler in the VM v7 HWPOISON: Add PR_MCE_KILL prctl to control early kill behaviour per process HWPOISON: shmem: call set_page_dirty() with locked page HWPOISON: Define a new error_remove_page address space op for async truncation HWPOISON: Add invalidate_inode_page HWPOISON: Refactor truncate to allow direct truncating of page v2 HWPOISON: check and isolate corrupted free pages v2 HWPOISON: Handle hardware poisoned pages in try_to_unmap HWPOISON: Use bitmask/action code for try_to_unmap behaviour HWPOISON: x86: Add VM_FAULT_HWPOISON handling to x86 page fault handler v2 HWPOISON: Add poison check to page fault handling HWPOISON: Add basic support for poisoned pages in fault handler v3 HWPOISON: Add new SIGBUS error codes for hardware poison signals HWPOISON: Add support for poison swap entries v2 HWPOISON: Export some rmap vma locking to outside world ...
2009-09-24Merge branch 'master' of ↵Chris Mason
git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable into for-linus Conflicts: fs/btrfs/super.c
2009-09-24Btrfs: don't rename file into dummy directoryYan, Zheng
A recent change enforces only one access point to each subvolume. The first directory entry (the one added when the subvolume/snapshot was created) is treated as valid access point, all other subvolume links are linked to dummy empty directories. The dummy directories are temporary inodes that only in memory, so we can not rename file into them. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-09-24Btrfs: check size of inode backref before adding hardlinkYan, Zheng
For every hardlink in btrfs, there is a corresponding inode back reference. All inode back references for hardlinks in a given directory are stored in single b-tree item. The size of b-tree item is limited by the size of b-tree leaf, so we can only create limited number of hardlinks to a given file in a directory. The original code lacks of the check, it oops if the number of hardlinks goes over the limit. This patch fixes the issue by adding check to btrfs_link and btrfs_rename. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-09-22const: mark remaining inode_operations as constAlexey Dobriyan
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-22const: mark remaining address_space_operations constAlexey Dobriyan
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-21Btrfs: add snapshot/subvolume destroy ioctlYan, Zheng
This patch adds snapshot/subvolume destroy ioctl. A subvolume that isn't being used and doesn't contains links to other subvolumes can be destroyed. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-09-21Btrfs: change how subvolumes are organizedYan, Zheng
btrfs allows subvolumes and snapshots anywhere in the directory tree. If we snapshot a subvolume that contains a link to other subvolume called subvolA, subvolA can be accessed through both the original subvolume and the snapshot. This is similar to creating hard link to directory, and has the very similar problems. The aim of this patch is enforcing there is only one access point to each subvolume. Only the first directory entry (the one added when the subvolume/snapshot was created) is treated as valid access point. The first directory entry is distinguished by checking root forward reference. If the corresponding root forward reference is missing, we know the entry is not the first one. This patch also adds snapshot/subvolume rename support, the code allows rename subvolume link across subvolumes. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-09-21Btrfs: do not reuse objectid of deleted snapshot/subvolYan, Zheng
The new back reference format does not allow reusing objectid of deleted snapshot/subvol. So we use ++highest_objectid to allocate objectid for new snapshot/subvol. Now we use ++highest_objectid to allocate objectid for both new inode and new snapshot/subvolume, so this patch removes 'find hole' code in btrfs_find_free_objectid. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-09-18Btrfs: search for an allocation hint while filling file COWChris Mason
The allocator has some nice knobs for sending hints about where to try and allocate new blocks, but when we're doing file allocations we're not sending any hint at all. This commit adds a simple extent map search to see if we can quickly and easily find a hint for the allocator. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-09-16HWPOISON: Enable error_remove_page on btrfsAndi Kleen
Cc: chris.mason@oracle.com Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-09-11Merge branch 'master' of ↵Chris Mason
git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable
2009-09-11Btrfs: zero page past end of inline file itemsChris Mason
When btrfs_get_extent is reading inline file items for readpage, it needs to copy the inline extent into the page. If the inline extent doesn't cover all of the page, that means there is a hole in the file, or that our file is smaller than one page. readpage does zeroing for the case where the file is smaller than one page, but nobody is currently zeroing for the case where there is a hole after the inline item. This commit changes btrfs_get_extent to zero fill the page past the end of the inline item. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-09-11Btrfs: fix btrfs page_mkwrite to return locked pageChris Mason
This closes a whole where the page may be written before the page_mkwrite caller has a chance to dirty it (thanks to Nick Piggin) Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-09-11Btrfs: Fix extent replacment raceChris Mason
Data COW means that whenever we write to a file, we replace any old extent pointers with new ones. There was a window where a readpage might find the old extent pointers on disk and cache them in the extent_map tree in ram in the middle of a given write replacing them. Even though both the readpage and the write had their respective bytes in the file locked, the extent readpage inserts may cover more bytes than it had locked down. This commit closes the race by keeping the new extent pinned in the extent map tree until after the on-disk btree is properly setup with the new extent pointers. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-09-11Btrfs: Use PagePrivate2 to track pages in the data=ordered code.Chris Mason
Btrfs writes go through delalloc to the data=ordered code. This makes sure that all of the data is on disk before the metadata that references it. The tracking means that we have to make sure each page in an extent is fully written before we add that extent into the on-disk btree. This was done in the past by setting the EXTENT_ORDERED bit for the range of an extent when it was added to the data=ordered code, and then clearing the EXTENT_ORDERED bit in the extent state tree as each page finished IO. One of the reasons we had to do this was because sometimes pages are magically dirtied without page_mkwrite being called. The EXTENT_ORDERED bit is checked at writepage time, and if it isn't there, our page become dirty without going through the proper path. These bit operations make for a number of rbtree searches for each page, and can cause considerable lock contention. This commit switches from the EXTENT_ORDERED bit to use PagePrivate2. As pages go into the ordered code, PagePrivate2 is set on each one. This is a cheap operation because we already have all the pages locked and ready to go. As IO finishes, the PagePrivate2 bit is cleared and the ordered accoutning is updated for each page. At writepage time, if the PagePrivate2 bit is missing, we go into the writepage fixup code to handle improperly dirtied pages. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-09-11Btrfs: use a cached state for extent state operations during delallocChris Mason
This changes the btrfs code to find delalloc ranges in the extent state tree to use the new state caching code from set/test bit. It reduces one of the biggest causes of rbtree searches in the writeback path. test_range_bit is also modified to take the cached state as a starting point while searching. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-09-11Btrfs: cache values for locking extentsChris Mason
Many of the btrfs extent state tree users follow the same pattern. They lock an extent range in the tree, do some operation and then unlock. This translates to at least 2 rbtree searches, and maybe more if they are doing operations on the extent state tree. A locked extent in the tree isn't going to be merged or changed, and so we can safely return the extent state structure as a cached handle. This changes set_extent_bit to give back a cached handle, and also changes both set_extent_bit and clear_extent_bit to use the cached handle if it is available. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-09-11Btrfs: switch extent_map to a rw lockChris Mason
There are two main users of the extent_map tree. The first is regular file inodes, where it is evenly spread between readers and writers. The second is the chunk allocation tree, which maps blocks from logical addresses to phyiscal ones, and it is 99.99% reads. The mapping tree is a point of lock contention during heavy IO workloads, so this commit switches things to a rw lock. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-08-21btrfs: fix inode rbtree corruptionFrom: Nick Piggin
Node may not be inserted over existing node. This causes inode tree corruption and I was seeing crashes in inode_tree_del which I can not reproduce after this patch. The other way to fix this would be to tie inode lifetime in the rbtree with inode while not in freeing state. I had a look at this but it is not so trivial at this point. At least this patch gets things working again. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Chris Mason <chris.mason@oracle.com> Acked-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-08-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstableLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: Btrfs: fix balancing oops when invalidate_inode_pages2 returns EBUSY Btrfs: correct error-handling zlib error handling Btrfs: remove superfluous NULL pointer check in btrfs_rename() Btrfs: make sure the async caching thread advances the key Btrfs: fix btrfs_remove_from_free_space corner case
2009-08-07Btrfs: remove superfluous NULL pointer check in btrfs_rename()Bartlomiej Zolnierkiewicz
This takes care of the following entry from Dan's list: fs/btrfs/inode.c +4788 btrfs_rename(36) warning: variable derefenced before check 'old_inode' Reported-by: Dan Carpenter <error27@gmail.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Eugene Teo <eteo@redhat.com> Cc: Julia Lawall <julia@diku.dk> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-07-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstableLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: (22 commits) Btrfs: Fix async caching interaction with unmount Btrfs: change how we unpin extents Btrfs: Correct redundant test in add_inode_ref Btrfs: find smallest available device extent during chunk allocation Btrfs: clear all space_info->full after removing a block group Btrfs: make flushoncommit mount option correctly wait on ordered_extents Btrfs: Avoid delayed reference update looping Btrfs: Fix ordering of key field checks in btrfs_previous_item Btrfs: find_free_dev_extent doesn't handle holes at the start of the device Btrfs: Remove code duplication in comp_keys Btrfs: async block group caching Btrfs: use hybrid extents+bitmap rb tree for free space Btrfs: Fix crash on read failures at mount Btrfs: remove of redundant btrfs_header_level Btrfs: adjust NULL test Btrfs: Remove broken sanity check from btrfs_rmap_block() Btrfs: convert nested spin_lock_irqsave to spin_lock Btrfs: make sure all dirty blocks are written at commit time Btrfs: fix locking issue in btrfs_find_next_key Btrfs: fix double increment of path->slots[0] in btrfs_next_leaf ...
2009-07-22Btrfs: adjust NULL testJulia Lawall
Move the call to BUG_ON to before the dereference of the tested value. Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-07-12headers: smp_lock.h reduxAlexey Dobriyan
* Remove smp_lock.h from files which don't need it (including some headers!) * Add smp_lock.h to files which do need it * Make smp_lock.h include conditional in hardirq.h It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT This will make hardirq.h inclusion cheaper for every PREEMPT=n config (which includes allmodconfig/allyesconfig, BTW) Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstableLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: Btrfs: fix error message formatting Btrfs: fix use after free in btrfs_start_workers fail path Btrfs: honor nodatacow/sum mount options for new files Btrfs: update backrefs while dropping snapshot Btrfs: account for space we may use in fallocate Btrfs: fix the file clone ioctl for preallocated extents Btrfs: don't log the inode in file_write while growing the file
2009-07-02Btrfs: honor nodatacow/sum mount options for new filesChris Mason
The btrfs attr patches unconditionally inherited the inode flags field without honoring nodatacow and nodatasum. This fix makes sure we properly record the nodatacow/sum mount options in new inodes. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-07-02Btrfs: account for space we may use in fallocateJosef Bacik
Using Eric Sandeen's xfstest for fallocate, you can easily trigger a ENOSPC panic on btrfs. This is because we do not account for data we may use when doing the fallocate. This patch fixes the problem by properly reserving space, and then just freeing it when we are done. The reservation stuff was made with delalloc in mind, so its a little crude for this case, but it keeps the box from panicing. Signed-off-by: Josef Bacik <jbacik@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-06-24Get "no acls for this inode" right, fix shmem breakageAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-06-24switch btrfs to inode->i_aclAl Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-06-11btrfs: remove ->write_super and stop maintaining ->s_dirtChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-06-10Btrfs: implement FS_IOC_GETFLAGS/SETFLAGS/GETVERSIONChristoph Hellwig
Add support for the standard attributes set via chattr and read via lsattr. Currently we store the attributes in the flags value in the btrfs inode, but I wonder whether we should split it into two so that we don't have to keep converting between the two formats. Remove the btrfs_clear_flag/btrfs_set_flag/btrfs_test_flag macros as they were confusing the existing code and got in the way of the new additions. Also add the FS_IOC_GETVERSION ioctl for getting i_generation as it's trivial. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-06-10Btrfs: Mixed back reference (FORWARD ROLLING FORMAT CHANGE)Yan Zheng
This commit introduces a new kind of back reference for btrfs metadata. Once a filesystem has been mounted with this commit, IT WILL NO LONGER BE MOUNTABLE BY OLDER KERNELS. When a tree block in subvolume tree is cow'd, the reference counts of all extents it points to are increased by one. At transaction commit time, the old root of the subvolume is recorded in a "dead root" data structure, and the btree it points to is later walked, dropping reference counts and freeing any blocks where the reference count goes to 0. The increments done during cow and decrements done after commit cancel out, and the walk is a very expensive way to go about freeing the blocks that are no longer referenced by the new btree root. This commit reduces the transaction overhead by avoiding the need for dead root records. When a non-shared tree block is cow'd, we free the old block at once, and the new block inherits old block's references. When a tree block with reference count > 1 is cow'd, we increase the reference counts of all extents the new block points to by one, and decrease the old block's reference count by one. This dead tree avoidance code removes the need to modify the reference counts of lower level extents when a non-shared tree block is cow'd. But we still need to update back ref for all pointers in the block. This is because the location of the block is recorded in the back ref item. We can solve this by introducing a new type of back ref. The new back ref provides information about pointer's key, level and in which tree the pointer lives. This information allow us to find the pointer by searching the tree. The shortcoming of the new back ref is that it only works for pointers in tree blocks referenced by their owner trees. This is mostly a problem for snapshots, where resolving one of these fuzzy back references would be O(number_of_snapshots) and quite slow. The solution used here is to use the fuzzy back references in the common case where a given tree block is only referenced by one root, and use the full back references when multiple roots have a reference on a given block. This commit adds per subvolume red-black tree to keep trace of cached inodes. The red-black tree helps the balancing code to find cached inodes whose inode numbers within a given range. This commit improves the balancing code by introducing several data structures to keep the state of balancing. The most important one is the back ref cache. It caches how the upper level tree blocks are referenced. This greatly reduce the overhead of checking back ref. The improved balancing code scales significantly better with a large number of snapshots. This is a very large commit and was written in a number of pieces. But, they depend heavily on the disk format change and were squashed together to make sure git bisect didn't end up in a bad state wrt space balancing or the format change. Signed-off-by: Yan Zheng <zheng.yan@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-05-14Btrfs: remove some WARN_ONs in the IO failure pathChris Mason
These debugging WARN_ONs make too much console noise during regular IO failures. An IO failure will still generate a number of messages as we verify checksums etc, but these two are not needed. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-05-14Btrfs: init inode ordered_data_close flag properlyChris Mason
This flag is used to decide when we need to send a given file through the ordered code to make sure it is fully written before a transaction commits. It was not being properly set to zero when the inode was being setup. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-04-27Btrfs: look for acls during btrfs_read_locked_inodeChris Mason
This changes btrfs_read_locked_inode() to peek ahead in the btree for acl items. If it is certain a given inode has no acls, it will set the in memory acl fields to null to avoid acl lookups completely. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-04-27Btrfs: fix acl cachingChris Mason
Linus noticed the btrfs code to cache acls wasn't properly caching a NULL acl when the inode didn't have any acls. This meant the common case of no acls resulted in expensive btree searches every time the kernel checked permissions (which is quite often). This is a modified version of Linus' original patch: Properly set initial acl fields to BTRFS_ACL_NOT_CACHED in the inode. This forces an acl lookup when permission checks are done. Fix btrfs_get_acl to avoid lookups and locking when the inode acls fields are set to null. Fix btrfs_get_acl to use the right return value from __btrfs_getxattr when deciding to cache a NULL acl. It was storing a NULL acl when __btrfs_getxattr return -ENOENT, but __btrfs_getxattr was actually returning -ENODATA for this case. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-04-27Btrfs: remove unused btrfs_bit_radix slabChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-04-27Btrfs: ratelimit IO error printksChris Mason
Btrfs has printks for various IO errors, including bad checksums and mismatches between what we expect the block headers to contain and what we actually find on the disk. Longer term we need a real reporting mechanism for this, but for now printk is going to have to do. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-04-24Btrfs: fix fallocate deadlock on inode extent lockChris Mason
The btrfs fallocate call takes an extent lock on the entire range being fallocated, and then runs through insert_reserved_extent on each extent as they are allocated. The problem with this is that btrfs_drop_extents may decide to try and take the same extent lock fallocate was already holding. The solution used here is to push down knowledge of the range that is already locked going into btrfs_drop_extents. It turns out that at least one other caller had the same bug. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-04-24Btrfs: kill btrfs_cache_createChristoph Hellwig
Just use kmem_cache_create directly. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-04-21Btrfs: fix btrfs fallocate oops and deadlockChris Mason
Btrfs fallocate was incorrectly starting a transaction with a lock held on the extent_io tree for the file, which could deadlock. Strictly speaking it was using join_transaction which would be safe, but it is better to move the transaction outside of the lock. When preallocated extents are overwritten, btrfs_mark_buffer_dirty was being called on an unlocked buffer. This was triggering an assertion and oops because the lock is supposed to be held. The bug was calling btrfs_mark_buffer_dirty on a leaf after btrfs_del_item had been run. btrfs_del_item takes care of dirtying things, so the solution is a to skip the btrfs_mark_buffer_dirty call in this case. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2009-04-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstableLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: Btrfs: BUG to BUG_ON changes Btrfs: remove dead code Btrfs: remove dead code Btrfs: fix typos in comments Btrfs: remove unused ftrace include Btrfs: fix __ucmpdi2 compile bug on 32 bit builds Btrfs: free inode struct when btrfs_new_inode fails Btrfs: fix race in worker_loop Btrfs: add flushoncommit mount option Btrfs: notreelog mount option Btrfs: introduce btrfs_show_options Btrfs: rework allocation clustering Btrfs: Optimize locking in btrfs_next_leaf() Btrfs: break up btrfs_search_slot into smaller pieces Btrfs: kill the pinned_mutex Btrfs: kill the block group alloc mutex Btrfs: clean up find_free_extent Btrfs: free space cache cleanups Btrfs: unplug in the async bio submission threads Btrfs: keep processing bios for a given bdev if our proc is batching