Reliable Data Delivery at the Institute for Genome Sciences of the University of Maryland

IGS manages the collection, storage, and distribution of terabytes of genomic research data worldwide. As average file sizes grew from 5 GB to 20 GB, data movement over FTP became increasingly unreliable, especially on long-distance international transfers. FTP was achieving under 20% network utilization, and frequent connection failures were causing excessive retransmission.

IGS then switched to physically shipping hard drives, an approach that was quickly constrained by time and personnel. Many data sets exceeded disk capacity, which further increased the manual workload. It often took two people over two hours to prepare and pack the disks, another 24 hours for overnight delivery, and another four hours to unpack and mount the disks. When data was being distributed to multiple researchers at different sites for collaboration, additional personnel and time were required.

The researchers were losing valuable research time because of the slow data delivery and poor reliability of data transfers.