RPS METOCEAN
intro; experiences with the new system; DMF 4; IT involved; I implemented it

To give a bit of background, RPS MetOcean is a consultancy which provides oceanographic and meteorological services in support of coastal and ocean engineering and environmental protection. Our major business is around physical oceanography, along with some marine and local, land-based meteorology. For over 25 years we've been collecting, analysing and interpreting metocean data.

Management know our data is one of our biggest assets, so when we come to make strategic decisions about data storage, we're confident we'll have support to choose the right technology.

OUR DATA
types: office documents, metocean NetCDF, source & config, etc.; mostly on one filesystem
growth; staff are told: permanent retention, fast retrieval

PREVIOUS SYSTEM
combined server; separate backup, archive & storage
I/O & CPU contention; load; scaling
EMC SAN: 1 TB in 2004 -> 16 TB in 2009
performance still good, but scaling further was cost-prohibitive

CHOOSE
what did we want to achieve? requirements were incomplete before we approached vendors, so the process took longer
some requirements should have been settled up front: large capacity but not all of it fast, i.e. HSM; retention; DR & "oops" recovery; sideways scaling
DMF was the best fit. Still not perfect; is it us?
The other vendors were either offering solutions comprised of several disparate systems, incomplete solutions, or solutions that didn't match our requirements.
we wanted to deal directly with people who know the gear

BUY
the purchase nearly eclipsed everything else: frustrating, like a tsunami crashing
staff okay, but poor service; the ping-pong cleared ...

SITE INSTALL
getting the rack up two flights of stairs: challenges; DIY, with Dad & ropes
tight room
Susheel & Paul were efficient and effective: as neat as we want the rest to be
learnt lots; more to come

POST-INSTALL
SATA PSU; Paul; ISSP
learning: didn't treat DMF as 3-tiered capacity while getting familiar; kept all data online, relaxed that later
RAID volumes: performance over flexibility; no defrag, so tear down and rebuild
migrating the data: Bacula; CXFS bug, tsunami repeat, 3-week fix

CONFIGURE
felt hard, but was good
DMF thresholds: 2.8M of our 3.2M files are under 256 KB; suggestion: inode quotas (sketches under EXAMPLES below)
NetCDF is a good fit: self-describing headers (example under EXAMPLES below)
offsite copies are important; dmsilo was considered, but deleting & extending are hard; tape juggling: inelegant
our solution: 3 tape pools: 1 onsite, 1 offsite, and 1 more offsite for when the 1st offsite pool needs to be sparsed
Susheel's alternative: 2 tape pools, 1 onsite & 1 offsite; when sparsing, query the catalogue DB for the BFIDs of files with data blocks on the sparse tape, dmmove to delete those files from the offsite tapes, then dmmove them back onto new offsite tapes (sketch under EXAMPLES below)
that leaves a short period with only 1 copy, mitigated with a dmmove to a temporary location first
tier choices: match the current setup, tweak later
backup: tina incrementals; uncertainty; bapi_fs tunable; backup window; similar to Lachlan (?); tina & openvault: qc_stinit

CHANGES
all space is available, so we need tighter project management: quotas
CXFS: people can no longer walk the entire fs tree with their cron jobs
expansion is an order of magnitude lower in cost, which nicely offsets increasingly difficult budgetary arrangements
growth is less flexible on the top tier, but trivial on the tape tier
any growth is now possible; we no longer scrabble about for creative ideas
backups are still a concern

FUTURE
HPC growth
more tape capacity: harness DMF's strengths
train the other sysadmins in administering it
invent a new archive system: probably move data to a different fs and change permissions
SGI to keep providing good support (as I wrote this preso, Susheel emailed me with another issue & fix)
keep track of stats (thanks Rob M): justifying and marketing to colleagues is valuable
get input from you guys; mailing list; archive backup
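EXAMPLES

The small-file threshold could be expressed as a DMF migration policy along these lines. This is only a sketch written from memory of the dmf.conf(5) policy examples; the stanza name and the MSP name tp_onsite are invented, and the exact "when" syntax should be checked against your DMF 4 release.

    define small_file_policy
            TYPE            policy
            # 2.8M of our 3.2M files are under 256 KB (262144 bytes);
            # leave those on disk rather than migrating them to tape
            SELECT_MSP      none       when space < 262144
            # larger files are candidates for the tape MSP (name invented)
            SELECT_MSP      tp_onsite  when space >= 262144
    enddefine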
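The inode-quota suggestion could look like this on XFS (and, with the usual caveats, CXFS). A sketch only: the mount point, the project name jobs2010 and the limits are made up, the filesystem must be mounted with project quotas enabled, and /etc/projects & /etc/projid must already map the project; xfs_quota(8) is the real tool.

    # cap how many inodes (i.e. files) the hypothetical project
    # "jobs2010" may create on the data filesystem
    xfs_quota -x -c 'project -s jobs2010' /mnt/data
    xfs_quota -x -c 'limit -p isoft=900000 ihard=1000000 jobs2010' /mnt/data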
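Why NetCDF plays nicely: the header is self-describing, so a file can be identified without touching the data section. Whether that also avoids a tape recall for a migrated file depends on how DMF keeps headers online, which is an assumption about our setup; the filename below is invented.

    # ncdump -h prints dimensions, variables and attributes from the
    # file header only; the data section is never read
    ncdump -h cyclone_hindcast.nc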
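Susheel's two-pool sparsing procedure, as a shell outline. dmmove(8) is the real DMF utility for re-homing migrated copies, but the "-v <volume group>" option syntax shown is an assumption (check dmmove(8) for your release), the volume-group names are invented, and bfids_on_vsn / paths_for_bfid are hypothetical site wrappers around catalogue queries (e.g. dmcatadm/dmfind output), whose options also vary by release.

    #!/bin/sh
    # Sketch: re-home every file copy that has chunks on sparse tape $VSN.
    VSN=$1

    # 1. hypothetical helper: list the BFIDs of files with data blocks
    #    on this tape, from the LS catalogue (CAT) database
    bfids_on_vsn "$VSN" > /tmp/bfids.$VSN

    # 2. hypothetical helper: map each BFID back to a pathname
    paths_for_bfid < /tmp/bfids.$VSN > /tmp/paths.$VSN

    # 3. park an extra copy in a temporary volume group first, so there is
    #    never a moment with only one copy (option syntax assumed)
    xargs dmmove -v vg_temp < /tmp/paths.$VSN

    # 4. rewrite the offsite copy onto fresh offsite tapes; the stale
    #    chunks on $VSN are then dead and the tape can be recycled
    xargs dmmove -v vg_offsite2 < /tmp/paths.$VSN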