Indiana University’s Media Digitization and Preservation Initiative (MDPI) is a seven-year mass digitization project that uses HPSS as working storage. MDPI can generate up to 40TB of digital content per day and is projected to generate more than 20PB of content by the time the project ends. With this volume of data and the short time available to process it, the workflow tools and storage system must be as reliable and efficient as possible.
This presentation will address the challenges MDPI faced in the post-digitization workflow and the solutions it used to build a workflow that has processed more than 330,000 digital objects and 11.5PB of content in four years.