Advice on data arrangement/backup strategies

Posted:
in Genius Bar edited January 2014
Hey guys, can any of you provide any advice on data arrangement/backup strategies?



I have a server that contains all of the current data that I am working on, plus all of the data that I might need to access in order to complete the current work. Basically we're talking about graphic design projects and all the related linked files, which comes to 400GB+ of data.



I also have other data that I probably won't need to access any time soon, but it's handy to have it on-line so that if I do need to access it it's there waiting for me.



The data is currently sitting on a 500GB RAID 0 set made up of two 250GB drives (which I know is risky). I don't fancy attempting to back that lot up on to DVD again. The last time I did that the backup kept failing and I had to start again from the very beginning. I also don't have the time to sit and feed blank DVDs in to the machine.



The obvious choice is to back up onto another drive (which I have been doing). I bought two LaCie 500GB Big Disks for this task, but some worrying reports about similar drives dying have me wondering whether this is such a smart way of doing it.



Is there anybody out there with data management training who might be able to point me in the right direction?



Many thanks in advance!

Comments

  • Reply 1 of 12
    lundy Posts: 4,466 member
    I don't have industry experience, but it seems that your stuff is 100% irreplaceable and very valuable. So given that, the standard advice is to make daily incremental backups and separate weekly full backups. With the low cost of hard drives, tape is really an unnecessarily expensive solution nowadays.



    I assume the 400 gigs of data is the same data you speak about being on the RAID 0? And that is all that needs to be backed up?



    If so, the conventional wisdom is to make two backups and keep at least one of them off-site in case the building burns down. Now if you don't want to buy another 500GB drive for that purpose, then some software options are:



    - Retrospect (makes incrementals, but keeps all old versions - to me, that wastes space)



    - SuperDuper! or Carbon Copy Cloner - both make exact image copies; not ideal for incremental backups



    - Apple's Backup utility - slow, saves all previous versions (wastes space), and you have to have a dot-Mac account to get it



    - ChronoSync (http://www.versiontracker.com/dyn/moreinfo/macosx/13652) - can be set to NOT save every version of a file. This saves space and guarantees that your backup volume never gets bigger than your original. But it also does not clone the whole drive and waste time like the cloners listed above.



    So give it a try with ChronoSync if, like me, you think that keeping old versions of files other than the most recent backup is not necessary.
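
The "latest version only" approach described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how ChronoSync itself works: the function names `mirror` and `needs_copy` are made up for this example, and the 2-second timestamp tolerance is an assumption to cope with coarse filesystem clocks.

```python
import os
import shutil


def needs_copy(src: str, dst: str) -> bool:
    """True if dst is missing or differs from src in size or modification time."""
    if not os.path.exists(dst):
        return True
    s, d = os.stat(src), os.stat(dst)
    # Assumed tolerance: some filesystems only store timestamps to 2 seconds.
    return s.st_size != d.st_size or abs(s.st_mtime - d.st_mtime) > 2


def mirror(source: str, backup: str) -> list[str]:
    """Make backup match source; return the relative paths that were copied.

    Changed files overwrite the old copy and files deleted from source are
    removed from backup, so the backup never grows larger than the original.
    """
    copied = []
    for root, _dirs, files in os.walk(source):
        rel_root = os.path.relpath(root, source)
        dest_root = os.path.normpath(os.path.join(backup, rel_root))
        os.makedirs(dest_root, exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(dest_root, name)
            if needs_copy(src, dst):
                shutil.copy2(src, dst)  # copy2 also copies the timestamp
                copied.append(os.path.normpath(os.path.join(rel_root, name)))
    # Second pass, bottom-up: delete whatever no longer exists in source.
    for root, dirs, files in os.walk(backup, topdown=False):
        src_root = os.path.join(source, os.path.relpath(root, backup))
        for name in files:
            if not os.path.exists(os.path.join(src_root, name)):
                os.remove(os.path.join(root, name))
        for name in dirs:
            if not os.path.isdir(os.path.join(src_root, name)):
                os.rmdir(os.path.join(root, name))  # emptied earlier in the walk
    return copied
```

Note the trade-off: because nothing old is kept, a file that is accidentally deleted or damaged on the source disappears from the backup on the next run.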



    Three caveats:



    - Always test the Restore function from the backup the first few times it runs. It is very common for people to spend a lot of effort backing up and then find out that the Restore doesn't restore. This happened to me recently with Apple's Backup, although it was my own fault for screwing around deleting what I thought were "outdated" incremental backup files.



    - Make two backups. That basically removes almost all of the risk of a hard drive going south, as you will always have the other one.



    - The offsite storage, if done, should also be two backups, one a daily one that goes offsite each night, and another one that stays offsite except for the time once a week that you bring it in to be refreshed. "Offsite" backups that consist of a single volume that stays at the main site all day and only goes offsite at night don't really qualify as "offsite", since if there is a fire during the day, you don't want to be screwing around trying to get that disk drive out of there.
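
One way to act on the "test the Restore" caveat is to restore into a scratch folder and compare it against the original, file by file. The sketch below does that with SHA-256 checksums; `tree_digests` and `compare_restore` are hypothetical names for this example, and it assumes both folders are reachable as ordinary paths.

```python
import hashlib
import os


def tree_digests(top: str) -> dict[str, str]:
    """Map each file's path (relative to top) to its SHA-256 digest."""
    digests = {}
    for root, _dirs, files in os.walk(top):
        for name in files:
            path = os.path.join(root, name)
            h = hashlib.sha256()
            with open(path, "rb") as f:
                # Read in 1 MB chunks so large design files don't fill memory.
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            digests[os.path.relpath(path, top)] = h.hexdigest()
    return digests


def compare_restore(original: str, restored: str) -> dict[str, list[str]]:
    """Report files that are missing, unexpected, or different after a restore."""
    a, b = tree_digests(original), tree_digests(restored)
    return {
        "missing": sorted(set(a) - set(b)),
        "extra": sorted(set(b) - set(a)),
        "corrupt": sorted(p for p in a.keys() & b.keys() if a[p] != b[p]),
    }
```

A restore is only trustworthy when all three lists come back empty.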
  • Reply 2 of 12
    messiah Posts: 1,689 member
    It's amazing all the things you have to consider when you actually sit down and think it through properly!



    ChronoSync looks interesting.



    Thanks for the advice!
  • Reply 3 of 12
    kuku Posts: 254 member
    Depends on the cost and flexibility.



    I find it easier and more cost-effective to use RAID 1, though you need the HDD space.



    After all, it's what RAID was originally meant for: protecting against hardware failure by spreading data across drives.



    Of course, it doesn't protect you from a computer that goes up in flames or gets soaked with sugar water, etc.



    It depends on the data, but if the files are in constant use across the board, a daily backup of that size is a pain, so you need fast drives.



    On the other hand, if you're mostly working with "reference" files that don't get changed, you only need to back up the whole lot once or twice per year (or whatever the turnaround time is) and back up only the smaller, constantly saved files often.



    Heh, as the old saying goes, the cost of backup is always proportional to how much the data is worth.



    Maybe you can try to sneak in a copy of Leopard and use Time Machine early? Haha.
  • Reply 4 of 12
    ebby Posts: 3,110 member
    Well, I'll have to give you a piece of advice on backing up RAIDs.



    I have a RAID 5 system and just backed up 2 days ago. I added more hard drives and needed to reformat. I used Disk Utility to create a whopping disk image and the command line app hdiutil to segment it into 2GB chunks. I had 158 of these little buggers and after a quick check to make sure it worked, I filled up all the other drives in my house. A quick reformat of the RAID and I had a 1.4TB drive on my desktop. Sweet! Bring all the disk image segments back to the same folder and open the image. No dice.



    That's right. The 400GB compressed disk image won't freaking mount anymore. It says about 1/3 of the files are missing but they are not. You can't believe how &(^%$ I am. Bye, bye, data. *Waves hand and blows tissue*



    So, even though it should have worked, don't follow my example. I'm still trying to find out what went wrong.
  • Reply 5 of 12
    kuku Posts: 254 member
    Heh, RAID 5 is a partial data distribution type: each drive only holds part of the data.



    Yeah, you've got to be careful that you're making a disk image of the whole array and not of one drive.
  • Reply 6 of 12
    messiah Posts: 1,689 member
    Am I right in saying that RAID 1 protects against hardware failure, but not data corruption?
  • Reply 7 of 12
    Quote:
    Originally Posted by Messiah


    Am I right in saying that RAID 1 protects against hardware failure, but not data corruption?



    RAID-1 is only half the solution: if one hard drive fails, all the data is still on the other drive. That's what the theory says. If, though, one drive fails because the power goes off exactly at the moment the drive was writing data onto the disk, chances are very high that both drives are affected equally, so in the end you have no data at all. Ditto when deleting things: they're gone on both drives.



    The key is: backup, backup, backup!!! No RAID can be a substitute for multiple backup copies of your data. And as Lundy said: for your own safety, test the restore function thoroughly before deleting any old volumes or data...



    One more thing... NEVER, NEVER create RAID backup volumes!!! Spend a few bucks on Retrospect, its restore function alone is worth every penny! In Retrospect, YOU have the control over what is backed up, when, to where and how often. Automatically or manually or both. Connect one external drive to your Mac, let Retrospect fill up that drive, then it asks for a second one (much like Stuffit worked in the old times when we used floppies...). Or split your files up into two portions, one goes onto backup drive #1, the other half onto #2.



    EDIT: never create RAID-0 or RAID-1 backup volumes!!!
  • Reply 8 of 12
    ebby Posts: 3,110 member
    Quote:

    One more thing... NEVER, NEVER create RAID backup volumes!!!



    Wait wait wait. A RAID is perfect for backups. I started out using external drives and Retrospect, and that method is simply a waste of a hard drive IMO. Building a RAID5 system offered the redundancy protection I sought and the drive capacity to back up whenever I want without worrying about free space. Having a redundant, larger disk (than the drive you are backing up) is far easier to manage than multiple external drives, especially for backups.



    Please elaborate on why you dis RAID backup volumes so much.



    EDIT: Unless you mean RAID 1. Yea, I call that a waste of a drive. But RAID 5 is freak'n sweet. 8)



    Quote:

    No RAID can be a substitute for multiple backup copies of your data.



    While that is true, the redundant nature of a good RAID system clocks in at a close second place. It still doesn't protect against software glitches, but you are pretty secure from hardware failures.
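
For comparing the RAID levels mentioned in this thread, the capacity and redundancy arithmetic can be written down as below. This is a simplified sketch: it assumes identical drives and no hot spare, treats RAID 1 as every drive mirroring the same data, and the function names are invented for the example.

```python
def raid_usable_gb(level: int, drives: int, drive_gb: float) -> float:
    """Usable capacity of an array of identical drives (levels 0, 1, 5 only)."""
    if level == 0:
        if drives < 2:
            raise ValueError("RAID 0 needs at least 2 drives")
        return drives * drive_gb  # striping: all space is usable
    if level == 1:
        if drives < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        return drive_gb  # mirroring: every drive holds the same data
    if level == 5:
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (drives - 1) * drive_gb  # one drive's worth goes to parity
    raise ValueError("only RAID 0, 1 and 5 are covered in this sketch")


def failures_tolerated(level: int, drives: int) -> int:
    """Number of drives that can fail before data is lost."""
    if level == 0:
        return 0  # any single failure loses the whole set
    if level == 1:
        return drives - 1
    if level == 5:
        return 1
    raise ValueError("only RAID 0, 1 and 5 are covered in this sketch")
```

For the original poster's two 250GB drives, `raid_usable_gb(0, 2, 250)` gives the full 500GB, but `failures_tolerated(0, 2)` is 0. None of these numbers say anything about deleted or corrupted files, which every level copies or loses faithfully.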
  • Reply 9 of 12
    RAID5 is a different story than RAID 0 or 1, you're absolutely right. BUT: creating a RAID5 or buying an external RAID5 solution that is only used for backups AND contains a spare disk (!) - which is a must IMHO -, is very, very expensive compared to buying some additional external drives of let's say 500 GB capacity.



    Backup needs to be simple and fool-proof, otherwise people won't do it. The simplest backup routine, though, is - as Lundy described it appropriately - taking two external drives (larger than the storage capacity of the server) and connecting them whenever you want to back up. Start the backup script and then take the drive home whenever you're done. Very simple and everybody can do that. Plus the drives alone are relatively small and lightweight, and the data is stored off-site, too. You get all in "one package".



    Of course you could take a Windows/Linux PC, put a hardware RAID card into it (also quite expensive), add 6-8 drives to it and create a RAID5. Works beautifully, I also have a RAID5 server for my data, but in terms of control over what is backed up where, external drives, in my opinion, are better and easier to understand for most people.



    One more point: external drives will last almost forever when they are only connected a few hours per month instead of having harddrives spinning 24/7 in a server created only for backup purposes (unless you buy the even more expensive server drives). And RAIDs still tend to be susceptible to harddrive failures, that's a fact. Especially when using cheap (S)ATA drives in an el-cheapo backup server solution.



    For starting a backup scheme, it must be really simple, yet effective and reliable. I don't know how much knowledge the thread starter has, but building a RAID5 server isn't exactly the thing I would recommend to anybody at first. Having to buy an external FireWire 800 RAID5 solution for 1000 USD just for backups probably isn't what Messiah wanted to know, either... For storing data, yes, but for backups, there are other methods of protecting your data as much as possible (such as splitting the server data into (i) active and (ii) archive data, which is saved less often than the active data).
  • Reply 10 of 12
    Messiah, I see three scenarios for your backup:



    Good setup:

    Server + HD1 (500 GB) + HD2 (500 GB), each saves all of your data. Change every week to keep one HD off-site. Pros: relatively cheap, simple to maintain. Cons: capacity probably too small in a few months.



    Better setup:

    Server + HD1 (500 GB) for "active" data (on-site backup) + HD2 (500 GB) for "active" data (off-site backup) + HD3 (500 GB) for "archive" data backup (off-site only). Pros: still relatively cheap, more capacity because "archive" data is stored separately. Cons: ?



    Best setup:

    Server plus two external FireWire RAID5 enclosures (one stays in the office, the other at home, then you switch every week). Pros: large capacity, high reliability. Cons: very expensive and bulky.
  • Reply 11 of 12
    kuku Posts: 254 member
    That's why the cost of protecting your data is almost always proportional to its value.



    A RAID 5 system will suck up massive CPU (in software mode) and be plenty expensive, both in heat management and equipment cost.



    I recommended RAID 1 because it is the most cost-effective solution. It protects against the 98% data-loss scenario: an HDD dying or semi-dying.



    It won't prevent user stupidity such as overwriting (luckily, software recovery is somewhat easier on a media disk), or an "act of God" scenario.



    Power cutting off to a drive/computer during a write is not that bad. While it does cause data corruption, most modern OSes and programs protect against this by writing to a $temp file first. That is:



    Read saving file

    Write to $temp file

    Write to original file

    Check original end-of-file flag

    delete $temp file



    In the event of the power going out, your data is recoverable (if not the current work, which depends on the program's auto-saving techniques).
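
A common variant of the temp-file idea above, sketched in Python: write the new contents to a temporary file in the same folder, force them to disk, then swap the temporary file into place with a single rename instead of writing the original a second time. This is not the exact step list given above, and `safe_write` is a hypothetical helper name.

```python
import os
import tempfile


def safe_write(path: str, data: bytes) -> None:
    """Replace the file at path with data without ever leaving it half-written."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same volume for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to the disk before renaming
        os.replace(tmp_path, path)  # old file stays intact until this instant
    except BaseException:
        # Something failed before the swap: drop the temp file, keep the old data.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the power fails before the rename, the old file is untouched; if it fails after, the new file is complete. Either way only the unsaved work since the last save is at risk.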



    But in the end, yes, it's correct: nothing beats multiple stored backups.



    But it's reward vs. effort. For media files, backing up once a month can be a much better trade-off than backing up once a day or once an hour.



    RAID (1-6) just gives you options for layers of protection. As they say in the insurance business, nothing really insures against an "Act of God".
  • Reply 12 of 12
    drboar Posts: 477 member
    I have Retrospect backing up about 15 Macs (users) and 50 XP boxes (Documents and Settings). Backups run six days a week. I use two sets of disks (alternating every other week), external FW disks of 250-400 GB each.



    The server is located in a fireproof, locked room used for nothing else, so I actually keep both hard disks in the same place until they are full; they are then moved to storage. Retrospect might waste space, but only if no user ever accidentally deletes a file that they later need.



    Even if you buy the personal edition, it has 2 additional licenses, so just physically separating your computer from the backup computer is a good thing. In your case, any old beige G3 or Pentium II running XP/Win 2000 will work as a backup computer for some external USB2 drives; this reduces the need to move the other HD off site.