twx-linux/drivers/md
Vallish Vaidyeshwara 00a0ea33b4 dm thin: do not queue freed thin mapping for next stage processing
process_prepared_discard_passdown_pt1() should clean up the
dm_thin_new_mapping when an error occurs.

dm_pool_inc_data_range() can fail while trying to take a block reference
(-61 is -ENODATA):

metadata operation 'dm_pool_inc_data_range' failed: error = -61

When dm_pool_inc_data_range() fails, dm thin aborts the current metadata
transaction and marks the pool as PM_READ_ONLY. The memory for the thin
mapping is released as well. However, the thin mapping is still queued
onto the next stage via queue_passdown_pt2() or passdown_endio().
When the next stage processes this dangling thin mapping, it accesses
freed memory and device mapper crashes.

Code flow without fix:
-> process_prepared_discard_passdown_pt1(m)
   -> dm_thin_remove_range()
   -> discard passdown
      --> passdown_endio(m) queues m onto next stage
   -> dm_pool_inc_data_range() fails, frees memory m
            but does not remove it from next stage queue

-> process_prepared_discard_passdown_pt2(m)
   -> processes freed memory m and crashes

One such stack:

Call Trace:
[<ffffffffa037a46f>] dm_cell_release_no_holder+0x2f/0x70 [dm_bio_prison]
[<ffffffffa039b6dc>] cell_defer_no_holder+0x3c/0x80 [dm_thin_pool]
[<ffffffffa039b88b>] process_prepared_discard_passdown_pt2+0x4b/0x90 [dm_thin_pool]
[<ffffffffa0399611>] process_prepared+0x81/0xa0 [dm_thin_pool]
[<ffffffffa039e735>] do_worker+0xc5/0x820 [dm_thin_pool]
[<ffffffff8152bf54>] ? __schedule+0x244/0x680
[<ffffffff81087e72>] ? pwq_activate_delayed_work+0x42/0xb0
[<ffffffff81089f53>] process_one_work+0x153/0x3f0
[<ffffffff8108a71b>] worker_thread+0x12b/0x4b0
[<ffffffff8108a5f0>] ? rescuer_thread+0x350/0x350
[<ffffffff8108fd6a>] kthread+0xca/0xe0
[<ffffffff8108fca0>] ? kthread_park+0x60/0x60
[<ffffffff81530b45>] ret_from_fork+0x25/0x30

The fix is to take the block ref count for the discarded block first and
only then pass the discard down for that block. If taking the ref count
fails, bail out: abort the current metadata transaction, mark the pool as
PM_READ_ONLY and free the current thin mapping memory (existing error
handling code) without queueing the thin mapping onto the next stage of
processing. If taking the ref count succeeds, pass the discard down for
the block. The passdown_endio() discard callback then queues the thin
mapping onto the next stage of processing.

Code flow with fix:
-> process_prepared_discard_passdown_pt1(m)
   -> dm_thin_remove_range()
   -> dm_pool_inc_data_range()
      --> if fails, free memory m and bail out
   -> discard passdown
      --> passdown_endio(m) queues m onto next stage
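
The two orderings can be illustrated with a small userspace model. This is
only a sketch and not the dm-thin code: struct mapping, next_stage,
inc_data_range(), passdown(), release(), pt1_unfixed(), pt1_fixed() and
next_stage_is_dangling() are hypothetical stand-ins, and release() sets a
flag instead of freeing memory so that the dangling state can be inspected
safely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_ENODATA 61	/* the errno value seen in the log above */

/* Hypothetical stand-in for struct dm_thin_new_mapping. */
struct mapping {
	bool freed;	/* set by release() in place of a real free */
};

/* One-slot model of the queue consumed by the pt2 stage. */
static struct mapping *next_stage;

/* Models dm_pool_inc_data_range(); fails on demand. */
static int inc_data_range(bool fail)
{
	return fail ? -DEMO_ENODATA : 0;
}

/* Models the discard passdown whose completion queues m for pt2. */
static void passdown(struct mapping *m)
{
	next_stage = m;
}

/* Models the existing error path that frees the mapping. */
static void release(struct mapping *m)
{
	m->freed = true;
}

/* Ordering without the fix: passdown first, ref count second. */
static int pt1_unfixed(struct mapping *m, bool fail_inc)
{
	int r;

	passdown(m);
	r = inc_data_range(fail_inc);
	if (r)
		release(m);	/* m is still reachable via next_stage */
	return r;
}

/* Ordering with the fix: ref count first, passdown only on success. */
static int pt1_fixed(struct mapping *m, bool fail_inc)
{
	int r = inc_data_range(fail_inc);

	if (r) {
		release(m);	/* m was never queued */
		return r;
	}
	passdown(m);
	return 0;
}

/* True if the next stage would touch a released mapping. */
static bool next_stage_is_dangling(void)
{
	return next_stage != NULL && next_stage->freed;
}
```

In this model, calling pt1_unfixed() with a failing ref count leaves
next_stage pointing at a released mapping, while pt1_fixed() with the same
failure leaves next_stage untouched.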

Cc: stable <stable@vger.kernel.org> # v4.9+
Reviewed-by: Eduardo Valentin <eduval@amazon.com>
Reviewed-by: Cristian Gafton <gafton@amazon.com>
Reviewed-by: Anchal Agarwal <anchalag@amazon.com>
Signed-off-by: Vallish Vaidyeshwara <vallish@amazon.com>
Reviewed-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2017-06-27 15:14:34 -04:00
..
bcache drivers/md/bcache/super.c: use kvmalloc 2017-05-08 17:15:13 -07:00
persistent-data dm space map disk: fix some book keeping in the disk space map 2017-05-15 15:09:50 -04:00
bitmap.c md: uuid debug statement now in processor byte order. 2017-05-24 15:58:43 -07:00
bitmap.h
dm-bio-prison-v1.c
dm-bio-prison-v1.h
dm-bio-prison-v2.c
dm-bio-prison-v2.h
dm-bio-record.h
dm-bufio.c dm: make flush bios explicitly sync 2017-05-31 10:50:23 -04:00
dm-bufio.h
dm-builtin.c
dm-cache-background-tracker.c dm cache: handle kmalloc failure allocating background_tracker struct 2017-05-17 09:44:53 -04:00
dm-cache-background-tracker.h
dm-cache-block-types.h
dm-cache-metadata.c
dm-cache-metadata.h
dm-cache-policy-internal.h
dm-cache-policy-smq.c dm cache policy smq: don't do any writebacks unless IDLE 2017-05-14 21:54:33 -04:00
dm-cache-policy.c
dm-cache-policy.h
dm-cache-target.c dm cache: simplify the IDLE vs BUSY state calculation 2017-05-14 21:54:33 -04:00
dm-core.h libnvdimm for 4.12 2017-05-05 18:49:20 -07:00
dm-crypt.c
dm-delay.c
dm-era-target.c
dm-exception-store.c
dm-exception-store.h
dm-flakey.c
dm-integrity.c dm integrity: fix to not disable/enable interrupts from interrupt context 2017-06-21 11:45:02 -04:00
dm-io.c dm io: fix duplicate bio completion due to missing ref count 2017-06-21 12:04:50 -04:00
dm-ioctl.c dm ioctl: restore __GFP_HIGH in copy_params() 2017-05-22 19:30:03 -04:00
dm-kcopyd.c
dm-linear.c libnvdimm for 4.12 2017-05-05 18:49:20 -07:00
dm-log-userspace-base.c
dm-log-userspace-transfer.c
dm-log-userspace-transfer.h
dm-log-writes.c
dm-log.c
dm-mpath.c dm mpath: multipath_clone_and_map must not return -EIO 2017-05-15 15:09:53 -04:00
dm-mpath.h
dm-path-selector.c
dm-path-selector.h
dm-queue-length.c
dm-raid1.c Revert "dm mirror: use all available legs on multiple failures" 2017-06-15 08:39:15 -04:00
dm-raid.c dm raid: fix oops on upgrading to extended superblock format 2017-06-23 12:16:15 -04:00
dm-region-hash.c
dm-round-robin.c
dm-rq.c dm rq: add a missing break to map_request 2017-05-15 15:09:51 -04:00
dm-rq.h
dm-service-time.c
dm-snap-persistent.c dm: make flush bios explicitly sync 2017-05-31 10:50:23 -04:00
dm-snap-transient.c
dm-snap.c
dm-stats.c mm: introduce kv[mz]alloc helpers 2017-05-08 17:15:12 -07:00
dm-stats.h
dm-stripe.c libnvdimm for 4.12 2017-05-05 18:49:20 -07:00
dm-switch.c
dm-sysfs.c
dm-table.c
dm-target.c libnvdimm for 4.12 2017-05-05 18:49:20 -07:00
dm-thin-metadata.c dm thin metadata: call precommit before saving the roots 2017-05-15 15:09:49 -04:00
dm-thin-metadata.h
dm-thin.c dm thin: do not queue freed thin mapping for next stage processing 2017-06-27 15:14:34 -04:00
dm-uevent.c
dm-uevent.h
dm-verity-fec.c
dm-verity-fec.h
dm-verity-target.c dm verity: fix no salt use case 2017-05-22 13:49:03 -04:00
dm-verity.h
dm-zero.c
dm.c dm: make flush bios explicitly sync 2017-05-31 10:50:23 -04:00
dm.h
faulty.c
Kconfig - DM cache metadata fixes to short-circuit operations that require the 2017-05-05 19:31:06 -07:00
linear.c
linear.h
Makefile
md-cluster.c md-cluster: fix potential lock issue in add_new_disk 2017-05-21 20:37:09 -07:00
md-cluster.h
md.c md: initialise ->writes_pending in personality modules. 2017-06-05 16:04:35 -07:00
md.h md: initialise ->writes_pending in personality modules. 2017-06-05 16:04:35 -07:00
multipath.c
multipath.h
raid0.c md/md0: optimize raid0 discard handling 2017-05-08 21:18:03 -07:00
raid0.h
raid1.c md: initialise ->writes_pending in personality modules. 2017-06-05 16:04:35 -07:00
raid1.h
raid5-cache.c md: Make flush bios explicitely sync 2017-05-31 09:25:53 -07:00
raid5-log.h md/r5cache: gracefully handle journal device errors for writeback mode 2017-05-11 22:11:11 -07:00
raid5-ppl.c md: Make flush bios explicitely sync 2017-05-31 09:25:53 -07:00
raid5.c md: initialise ->writes_pending in personality modules. 2017-06-05 16:04:35 -07:00
raid5.h
raid10.c md: initialise ->writes_pending in personality modules. 2017-06-05 16:04:35 -07:00
raid10.h