76 points by yamrzou 6 days ago | 13 comments
_carbyau_ 3 days ago
Disk2vhd [0] can be run from the online OS itself, and it can store the resulting VHD on the same disk it is imaging (with space and disk-performance constraints).
I find it handy for turning my freshly superseded gaming machine into a VM on my new machine for easy access to files, before doing whatever with my old hardware.
[0] https://learn.microsoft.com/en-us/sysinternals/downloads/dis...
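For reference, Disk2vhd also has a simple command line you can script. Roughly (from memory, so check "disk2vhd /?"; the output path is just an example):

    :: cmd on the machine being imaged; "*" means image every volume
    disk2vhd * D:\backups\oldbox.vhd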
accrual 2 days ago
Did you uninstall any games or other large files before converting to .vhd to keep the image size down?
_carbyau_ 2 days ago
If I wanted to be more careful I could probably just do a full registry export and keep C:\Users\[username]\AppData. But rather than dig around trying to recall and export MORE stuff (when I want to be playing on the new machine...) I'll just keep a copy of the whole thing for reference.
And it'll get deleted down the track when I'm happily bedded into the new machine.
Other tips: if you moved your license for Windows to the new machine, run the VM without networking...
If you are wondering how to get stuff off it with no networking - because you are using Hyper-V (built into Windows Pro) instead of VMware - you can mount the VHD disks directly on your new machine while the VM is off.
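e.g. something like this on the new machine (assuming the Hyper-V PowerShell module is installed; the path is a placeholder):

    # PowerShell on the host, with the Hyper-V module available
    Mount-VHD -Path 'D:\VMs\oldbox.vhd' -ReadOnly   # attach without risking writes
    # ...copy files off the drive letter Windows assigns...
    Dismount-VHD -Path 'D:\VMs\oldbox.vhd'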
honestSysAdmin 2 days ago
Or Ceph BlueStore, which does checksums on physically redundant media. We do N+3 replication because we're lazy.
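If anyone wants the flavour of it, a rough sketch with a replicated pool (pool name made up; size 4 is the object plus three more copies):

    # keep 4 copies of every object (original + 3 replicas)
    ceph osd pool set mypool size 4
    # stop serving I/O if fewer than 2 copies remain
    ceph osd pool set mypool min_size 2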
nyrikki 3 days ago
Duplicating the OS/FS behavior runs into an undecidability problem, so with block-level copies you just hope for the best; often you won't even notice corruption.
groby_b 3 days ago
2) No, most of us don't "hope for the best" with imaging, but would like to actually achieve a reasonable level of confidence. If your approach to data integrity is "you probably won't notice corruption", you don't have an approach to data integrity.
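For example, at its simplest (a minimal sketch, assuming the source volume is unmounted or snapshotted while you image it, since a live source will never hash-match; device and file names are examples):

    dd if=/dev/sdb of=/backup/sdb.img bs=1M conv=fsync
    # hash source and image; matching digests mean a faithful block-level copy
    sha256sum /dev/sdb /backup/sdb.img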
nyrikki 2 days ago
Pretty common problem for someone to image their boot drive, reboot, and have the OS mount the backup instead, because the cloned filesystem GUID is no longer unique.
If you are using iSCSI or anything with multipathing, this can happen even without a reboot.
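On Linux, a quick way to catch and fix that failure mode, assuming ext4 or XFS, unmounted filesystems, and example device names:

    # compare filesystem UUIDs after the clone
    blkid /dev/sda1 /dev/sdb1
    # give the clone a fresh UUID so mount-by-UUID can't grab the wrong disk
    tune2fs -U random /dev/sdb1        # ext2/3/4
    xfs_admin -U generate /dev/sdb1    # XFS equivalent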
I know that block-level copies seem like a good, easy solution, but several decades in storage admin and architect roles during the height of the SAN era showed me that they are more problematic than you would expect.
To be honest, a full block-level backup of a boot volume is something you do when the data isn't that critical anyway.
But if your use case requires this and the data is critical, I highly encourage you to dig into how even device files are emitted.
If you are like most people who don't have the pager scars that forced you to dig into udev etc., you probably won't realize that what appears to be a concrete implementation detail is really just a facade pattern.
thesnide 3 days ago
The rootfs can be mkfs'd and rsynced nicely.
That said, the article is awesome and the idea very clever.
But it is more suited to streaming replication than dd catch-up.
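For the mkfs-and-rsync route, a minimal sketch (device names and paths are examples):

    mkfs.ext4 /dev/sdb1
    mount /dev/sdb1 /mnt/target
    # -a archive, -H hardlinks, -A ACLs, -X xattrs, -x don't cross filesystems
    rsync -aHAXx --numeric-ids / /mnt/target/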
Joel_Mckay 3 days ago
Yet btrfs, CephFS, and ZFS all have snapshot-syncing tricks that make state mirrors far more practical and safe to pull off. =3
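e.g. with ZFS, where the incremental catch-up is nearly free (dataset, pool, and host names are placeholders):

    zfs snapshot tank/root@base
    zfs send tank/root@base | ssh backuphost zfs receive pool/root
    # later: ship only the delta since @base
    zfs snapshot tank/root@today
    zfs send -i tank/root@base tank/root@today | ssh backuphost zfs receive pool/root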
dambi0 3 days ago
2) Arguing that it might be better to avoid such methods because of possible problems with data integrity isn’t a lack of an approach to data integrity.
nyrikki 2 days ago
The number of concurrent writers on a typical OS means that you can't rule out a pathological case.
I suppose you could reduce it to Rice's theorem and how EQ_TM is undecidable, but remember it generalizes even to total functions.
It just goes back to the fact that establishing the equivalence of two static programs requires running them, and there is too much entropy in file operations to practically cover much of the behavior.
When forced to, it can save you, but a block-level copy of a live filesystem is opportunistic.
Crash consistency is obviously the best you can hope for here, so that, plus holes in classic NFS writes, may be a more accessible lens on the non-happy path than my preferred computability one.
The GUID-being-copied-and-no-longer-unique problem I mentioned in another comment is where I have seen people lose data the most.
The undecidable part is really a caution that it doesn't matter how smart the programmer is: there is simply no general algorithm that can remove the cost of losing the semantic meaning of those writes.
So it is not a good default strategy, only one to use when context forces it.
TL;DR: it is horses for courses.