Zpool hangs during rollback of a ZFS snapshot
Starting with Oracle kernel patch 150401-09, we experienced hangs of the whole zpool whenever we rolled back a snapshot.
As of 2015/12/05 there is no fix available for this problem. Our latest tests with kernel patch 150401-28 were not successful.
The hangs occur when we have to roll back a snapshot of a cloned filesystem.
Here is the setup of our filesystems:

NAME                                     USED   AVAIL  REFER  MOUNTPOINT
testsystem_data                          11.5T  2.22T  31K    legacy
testsystem_data/u01                      9.65T  2.22T  46K    /zones/testsystem/root/u01
testsystem_data/u01/DB1                  8.77T  2.22T  2.19T  /zones/testsystem/root/u01/DB1
testsystem_data/u01/DB1@DB5              2.18T  -      2.18T  -
testsystem_data/u01/DB1@DB8              2.19T  -      2.19T  -
testsystem_data/u01/DB1@DB4              1.72G  -      2.19T  -
testsystem_data/u01/DB1@DB7              223M   -      2.19T  -
testsystem_data/u01/DB1@DB6              216M   -      2.19T  -
testsystem_data/u01/DB1@DB2              242M   -      2.19T  -
testsystem_data/u01/DB1@DB3              268M   -      2.19T  -
testsystem_data/u01/DB1@DB9              1.72G  -      2.19T  -
testsystem_data/u01/DB2                  95.3G  2.22T  2.26T  /zones/testsystem/root/u01/DB2
testsystem_data/u01/DB2@db2_after_clone  8.81G  -      2.19T  -
testsystem_data/u01/DB3                  95.5G  2.22T  2.26T  /zones/testsystem/root/u01/DB3
testsystem_data/u01/DB3@db3_after_clone  8.87G  -      2.19T  -
testsystem_data/u01/DB4                  209G   2.22T  2.28T  /zones/testsystem/root/u01/DB4
testsystem_data/u01/DB5                  30.8G  2.22T  2.18T  /zones/testsystem/root/u01/DB5
testsystem_data/u01/DB5@db5_after_clone  7.53G  -      2.18T  -
testsystem_data/u01/DB6                  91.9G  2.22T  2.26T  /zones/testsystem/root/u01/DB6
testsystem_data/u01/DB6@db6_after_clone  8.75G  -      2.19T  -
testsystem_data/u01/DB7                  128G   2.22T  2.26T  /zones/testsystem/root/u01/DB7
testsystem_data/u01/DB7@db7_after_clone  9.14G  -      2.19T  -
testsystem_data/u01/DB8                  163G   2.22T  2.26T  /zones/testsystem/root/u01/DB8
testsystem_data/u01/DB8@db8_after_clone  9.27G  -      2.19T  -
testsystem_data/u01/DB9                  92.0G  2.22T  2.26T  /zones/testsystem/root/u01/DB9
testsystem_data/u01/DB9@db9_after_clone  8.78G  -      2.19T  -

Whenever we had to roll back a snapshot such as testsystem_data/u01/DB7@db7_after_clone, we experienced long hangs. Sometimes the whole pool was blocked for several minutes.
The filesystem testsystem_data/u01/DB7 is a clone of the snapshot testsystem_data/u01/DB1@DB7.
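
To check whether a filesystem is a clone before rolling it back, you can query its "origin" property. The following is only a minimal sketch, assuming a POSIX-style shell such as ksh or bash; the function names clone_origin and is_clone are made up for illustration. For a filesystem that is not a clone, "zfs get" reports "-" as the origin.

```shell
# Print the origin snapshot of a dataset ("-" if it is not a clone).
# clone_origin and is_clone are hypothetical helper names.
clone_origin() {
    # -H: no header line, -o value: print only the property value
    zfs get -H -o value origin "$1"
}

# Succeed (exit status 0) only if the dataset is a clone.
is_clone() {
    [ "$(clone_origin "$1")" != "-" ]
}
```

With the setup shown above, "clone_origin testsystem_data/u01/DB7" should print testsystem_data/u01/DB1@DB7.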

After one year of testing all available IDR patches and kernel patches, we found a simple workaround:

First delete all files in the filesystem that you want to roll back. For example, if you want to roll back to testsystem_data/u01/DB7@db7_after_clone, first delete all files in /zones/testsystem/root/u01/DB7 and then run the rollback:

rm -r /zones/testsystem/root/u01/DB7
zfs rollback testsystem_data/u01/DB7@db7_after_clone

The "rm -r" command takes a while, depending on the size of the filesystem (about 40 minutes for 2 TB), but the "zfs rollback" then takes only a few seconds, and the zpool never hangs during the whole procedure.
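
If you have to do this regularly, the two steps can be wrapped in a small shell function. This is only a sketch, not something we run in exactly this form: it assumes a POSIX-style shell such as ksh or bash, the function name rollback_after_delete is made up, and it deletes the contents of the mount point (including dot files) instead of calling a plain "rm -r" on the mount point itself. It refuses to run if the mountpoint property is not an absolute path (for example "legacy" or "none").

```shell
# Sketch: empty a filesystem, then roll it back to a snapshot.
# Usage: rollback_after_delete <dataset@snapshot>
# rollback_after_delete is a hypothetical name, not a zfs subcommand.
rollback_after_delete() {
    snap=$1

    # The argument must contain an "@".
    case "$snap" in
        *@*) ;;
        *) echo "expected dataset@snapshot, got: $snap" >&2; return 1 ;;
    esac

    # Strip "@snapshot" to get the dataset, then look up its mount point.
    dataset=${snap%%@*}
    mnt=$(zfs get -H -o value mountpoint "$dataset") || return 1

    # Safety check: only continue for an absolute path other than "/".
    case "$mnt" in
        /?*) ;;
        *) echo "refusing to delete under mountpoint: $mnt" >&2; return 1 ;;
    esac

    # Step 1: delete everything below the mount point (slow on large
    # filesystems). The three patterns cover normal and dot files.
    ( cd "$mnt" && rm -rf ./* ./.[!.]* ./..?* ) || return 1

    # Step 2: the rollback itself.
    zfs rollback "$snap"
}
```

For the example above, the call would be "rollback_after_delete testsystem_data/u01/DB7@db7_after_clone". Test it on an unimportant filesystem first, because step 1 is destructive.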
