Abstract
Background: Academic institutions risk losing vital digital information if urgent measures are not taken to safeguard digital resources and to allow for long-term access. New systems and technologies are thus needed to deal with the digital preservation challenges.
Objectives: The purpose of this article was to investigate the systems and technologies being used to support digital preservation within academic libraries in South Africa with a view to providing solutions for effective digital preservation. The study thus looked into theories, models, systems and technologies used in preserving digital resources in order to enhance the success of the implementation of digital preservation systems in academic libraries in South Africa.
Method: This study adopted a quantitative research approach by using a survey research design. In South Africa, there are 27 academic institutions, and all these institutions constituted the target population and all were included in the sample of the study. An online questionnaire was used as the data collection instrument and it was emailed to all 27 academic institutions in South Africa.
Results: The findings revealed that most academic libraries were adopting new technologies for preserving their digital resources. DSpace, E-print, ETD, Digital Commons, LOCKSS, DigiTool, CONTENTdm and Archive-It were noted as technologies that were commonly used by many academic institutions in preserving their digital resources.
Conclusion: Although different technologies are being implemented to support digital preservation in academic libraries, these institutions should make sure these systems are compatible with archival standards and should also account for technological changes so that the entities may continue to be migrated to newer platforms as needed to avoid technological obsolescence.
Keywords: digital preservation; digital resources; digital preservation system; academic libraries; digital technologies.
Introduction
Rapidly evolving digital technologies have led to the creation of digital resources within academic institutions the world over. With the advent of these technologies, digital preservation is becoming a necessity to ensure that digital resources are accessible over the long term. Digital preservation involves the application of preservation methods, technologies and strategies to ensure access to reformatted and born digital content regardless of the challenges of technological changes (Day 2006). Kenney and McGovern’s (2003) three leg stool model also combines hardware and software, file formats and storage media as a means to ensure continued access to digital resources. Digital resources are thus dependent on hardware and software to render them intelligible. As a result, new technologies are being implemented in academic libraries to support digital preservation programmes. These include DSpace, Fedora, E-prints, Greenstone, Innovative, I-T, Archivematica, Rosetta and Tessella, to name just a few. These software systems and technologies are intended to provide academic institutions with the capability to create, capture, store, preserve, track and retrieve digital resources, regardless of the format (Rosa, Craveiro & Dominques 2017).
Rosenthal et al. (2005) described the goal of any digital preservation system as ensuring that the information it contains remains accessible to users over a long period of time. However, for the past few years, academic institutions have been grappling with how to preserve the digital resources they produce. The literature indicates that, because of technological obsolescence, academic institutions face the challenge of making sure that users can still access content that was ingested into institutional repositories (IRs) and other archives in the past. The Council of Canadian Academies (2015) regards technological obsolescence as a major issue as it poses challenges for maintaining and safeguarding digital resources over the long term. Hedstrom and Lee (2002) concurred that digital resources present more complex problems than conventional analogue materials as newer digital technologies rapidly appear and older ones become outdated, and information that relies on obsolete technologies soon becomes inaccessible. As a result, some digitised materials have been lost and to date remain inaccessible because the original software is outdated or incompatible with modern operating systems (Sigauke & Nengomasha 2011).
Asogwa (2011) also investigated the challenges of preserving archives and records in the electronic age at Nigerian universities. The key findings cited several challenges, including copyright issues, technological obsolescence, lack of technical expertise in preserving digital resources, inadequate funding, the increasing cost of subscriptions to electronic databases and inadequate information and communication technology (ICT) infrastructure. However, Rahman and Mohammed-ul-Islam (2012) pointed out that any library in this digital environment has to cope with new technology for preserving digital information for its users and to sustain itself. The best digital preservation practices thus involve developing a proper technological infrastructure, which includes communication network platforms, hardware and software to support workflows for efficient and effective digital preservation and to provide easy access to digital content. All these systems must conform to ISO 17799, an information security code of practice. It is therefore crucial in this study to identify and examine the systems and technologies being used to support digital preservation within academic libraries.
Contextual setting
South Africa has 27 public universities, all of which were included in this study, namely Tshwane University of Technology, Stellenbosch University, University of North West, Central University of Technology, Durban University of Technology, University of Cape Town (UCT), University of Fort Hare, University of KwaZulu-Natal, University of Limpopo, University of Pretoria, University of South Africa, University of Western Cape, University of Zululand, Walter Sisulu University, University of Johannesburg, University of Free State, Rhodes University, Nelson Mandela Metropolitan University, Walter Sisulu University, University of Durban Westville, University of Witwatersrand, Sol Plaatje University, Mpumalanga University, Mangosuthu University, Vaal University of Technology, Cape Peninsula University of Technology and Sefako Makgatho Health Sciences University. The study was conducted in all these academic institutions in South Africa. The majority of these institutions recognised the changing library environment and the global reach of digital assets and created electronic resources in their libraries. As a result, a huge amount of information is now available in electronic format, including institutions’ books, journal articles, manuscripts, theses, dissertations and other library materials. An increasing number of academic institutions are making efforts to manage, safeguard and disseminate their scholarly materials and research outputs through digitisation and the implementation of IRs. It was established that academic libraries in South Africa are actively leading the way in implementing digital preservation programmes, as compared with other African countries. Only a few of these institutions do not have a formal digital preservation programme in place; however, they expressed their desire and interest in developing a formal programme for preserving their digital materials.
Most of these institutions perform digitisation and preservation activities within a framework that matches international standards and they comply with the Promotion of Access to Information Act (No. 2 of 2000) (PAIA) and the broad principles of records management that are required by the National Archives and Records Service Act (No. 43 of 1996), the International Standard for Records Management (ISO 15489) and the South African National Standard for Records Management (SANS 15489). The majority of academic institutions have also expressed their commitment to openness and have signed the Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities, a mechanism to commit institutions to promote an open access approach to institutional research outputs and knowledge. This declaration asserted that scholarly research outputs and cultural heritage should be freely accessible and usable for scientists and the public. Institutional repositories thus provide access in compliance with legal requirements and standards and follow the principles of the Berlin Declaration.
Problem statement
Academic libraries currently suffer from an inability to provide permanent access to the electronic materials they produce. Digital preservation aims at ensuring that digital content remains accessible to user communities for a long period of time and for future generations. However, digital resources are stored on fragile magnetic media that deteriorate rapidly and are vulnerable to loss and destruction; they therefore require proper systems and technologies to keep them accessible (Corrado & Moulaison 2014). Chowdhury (2009) identified a number of preservation challenges ranging from increasingly large volumes of data to the underlying hardware, data formats, metadata and the various management practices used by these systems. As noted by Styblinska (2006), ensuring ongoing access requires keeping pace with technological changes and moving digital objects from obsolete to current file formats, storage media and operating systems. The Research Libraries Group (RLG) (OCLC & RLG) highlighted that the majority of institutions designing digital content management systems take little interest in the long-term preservation of their digital resources. It is, therefore, clear that many academic institutions still do not take long-term preservation into account. Technological obsolescence and other factors such as lack of commitment by relevant stakeholders make long-term preservation of digital resources a highly complex and diverse matter. There is, therefore, a need for a proper technology infrastructure that conforms to metadata and other international standards required to measure and validate the trustworthiness of IRs in respect of the authenticity, integrity and reliability of the digital materials. In this regard, this article examines the current systems and technologies being used for digital preservation within all academic libraries in South Africa, with a view to providing the best solutions for effective preservation of digital resources.
The objectives formulated to guide the study were to:
- determine the factors that motivate academic libraries to preserve their digital resources
- establish the extent to which IRs are implemented within academic libraries
- determine the systems and technologies being used to support digital preservation practices within academic libraries
- determine the measures that are in place to protect digital resources from unauthorised access
- determine the training needs in digital preservation.
Theoretical framework
The study was guided by Kenney and McGovern’s (2003) three leg stool model and Corrado and Moulaison’s (2014) preservation triad model.
Three leg stool
The three leg stool model developed by Kenney and McGovern (2003) comprises three elements, namely the organisational leg, the technological leg and the resources leg, which are perceived in this study as some of the factors that can be used to sustain digital preservation in academic libraries. According to Kenney and McGovern (2003), for a programme to be viable and sustainable, the three legs must be equally strong and balanced to sustain data over time. The best indicators of the development of the organisational leg are the implementation of policies and procedures, the objectives, the appropriate strategies and the staffing of an organisation for engaging in digital preservation. The technological leg entails preservation planning to provide ongoing support for a robust, flexible and cost-effective technological platform. Components of the requisite technological infrastructure for digital preservation include hardware and software, file formats and storage media, tools and workflows, a secure environment, platforms and networks as well as the skills to establish and maintain the digital programme (Kenney & McGovern 2003). The resources leg addresses the requisite start-up, ongoing and contingency funding, staffing and skills to enable and sustain digital preservation programmes. A sustainable resources framework covers staffing, technological, operational and other costs that are necessary to undergird the organisational and technology infrastructures.
Preservation triad
Corrado and Moulaison (2014) also identified three components of a sustainable digital preservation programme, namely management-related activities, technology-related activities and content-centred activities. Management activities include the creation of policies, documentation and human factors related to staff ability to perform roles in digitisation. Management is at the top of the triad model and is vital to preservation because, without management in the form of resources and policies, there is no impetus to preserve digital objects (Corrado & Moulaison 2014). Technological activities entail the implementation of digital preservation systems, such as a trustworthy digital repository (TDR) and metadata systems, that need to be in place in order to make use of resources and to implement the policies described in the management section (Corrado & Moulaison 2014). According to Corrado and Moulaison (2014), digital preservation is not all about technology; however, it is not possible to undertake digital preservation without the use of complex technology. Therefore, digital preservation systems such as TDR and metadata have to be in place in order to support digital preservation practices. A major purpose of a TDR is to facilitate access to digital information over the long term. Metadata has been regarded as the best way of minimising the risk of digital resources becoming inaccessible and it thus needs to be consistently maintained throughout the process (National Information Standards Organization [NISO] 2004). Content management activities border on management functions such as organising, categorising and structuring information resources. Content is core to digital preservation as without content there is nothing to preserve: no matter how well thought out the management and how good the policy and plans, the best preservation systems are worth nothing without content (Corrado & Moulaison 2014).
Literature review
This section presents the factors that motivate academic libraries to preserve their digital resources, implementation of IRs, preservation systems and technologies being used to ensure the perpetuity of digital records, protection of digital records from unauthorised access and the training needs in digital preservation.
Motivation to preserve digital materials in academic libraries
The shift to the digital era is compelling academic libraries to rethink their structures, operations and services in order to remain relevant. As noted by Raju (2014), the explosive growth of digital devices and related applications has altered the traditional academic library beyond recognition. The main rationale behind digital preservation is to ensure access to information by present and future generations (Das, Sharma & Gurey 2009). Therefore, managing digital content, the desire to promote library services, increased application of and interest in digital technologies, pressure from other institutions in the developed countries, competition amongst academic institutions, pressure from researchers and pressure from other libraries are regarded as the major reasons for preserving digital materials in academic libraries (Masenya 2018).
Implementation of institutional repositories
The evolution of digital technologies and the shift from print to digital collections have resulted in more innovations such as digital IRs and digital libraries. Digital repositories are information systems that ingest, store, manage, preserve and provide access to digital content (Xie & Matusiak 2016). Digital preservation has been regarded as an important motivation for building IRs to ensure that digital research materials are available and accessible in the long term. Memory institutions such as libraries, archives and museums are actively building IRs in an attempt to preserve their digital resources for future access (Corrado & Moulaison 2014). Ngulube (2012) also suggested that developing IRs in academic libraries will preserve and sustain digital information for the present and future generations. Institutional repositories may range from a simple system that involves a low-cost file server and software that provides preservation services to complex systems comprising data centres and communication networks that are interoperable (Dollar & Ashley 2014). Repositories have existed ever since human beings began collecting and storing important information for safekeeping and long-term use (Ashikuzzaman 2018), and libraries, museums and archives provide the foundation for any type of repository programme. According to Lynch and Lippincott (2005), IRs have emerged in North America and Western Europe primarily because they are regarded by the university communities as a means of preserving, maintaining and ensuring long-term access to digital resources. Therefore, the mission of every IR is to preserve its materials for as long as they are needed.
The majority of academic institutions in South Africa also developed IRs in an attempt to manage and preserve scholarly outputs in their libraries (Pienaar & Van de Venter 2008). For example, UCT implemented IRs in four different departments, namely UCT Law Space (Department of Law), UCT Computer Science Research Document Archive, Department of Manuscripts and Archives in the library and open educational resources (Macha & De Jager 2011). In 2005, the Carnegie Corporation of New York awarded the UCT library, together with the libraries at the University of Witwatersrand and University of KwaZulu-Natal, a grant of $2.5 million for a 3-year project directed at supporting research and library staff development at these institutions (Macha & De Jager 2011). As a result, various types of materials have been digitised and made publicly accessible. Digital scholarly outputs such as scholarly publications, pre-prints, post-prints and digital versions of theses and dissertations are now managed and preserved in the IRs with the use of open-source software such as DSpace, ETD-db and Eprints. DSpace is the most used software that implements both the Open Archival Information System (OAIS) reference model and the Open Archives Initiative’s (OAI) Protocol for Metadata Harvesting (OAI-PMH). DSpace is being used by academic institutions to capture, store, index, preserve and redistribute an organisation’s research material in digital formats (Tansley, Bass & Smith 2003). OpenDOAR: Directory of Open Access Repositories shows that 1544 of the 3519 registered repositories use DSpace (Asadi et al. 2018).
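To illustrate what metadata harvesting over OAI-PMH involves, the following minimal Python sketch builds request URLs for two of the protocol's verbs. The base URL and the helper function `build_oai_request` are hypothetical and are not taken from any repository named in this article; the verbs `Identify` and `ListRecords` and the `oai_dc` (unqualified Dublin Core) metadata prefix are defined by the OAI-PMH specification. The sketch only constructs the requests and does not contact any server.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used for illustration only.
BASE_URL = "https://repository.example.ac.za/oai/request"


def build_oai_request(base_url, verb, **arguments):
    """Return an OAI-PMH request URL for the given verb and arguments."""
    params = {"verb": verb}
    params.update(arguments)
    return base_url + "?" + urlencode(params)


# Ask the repository to describe itself.
identify_url = build_oai_request(BASE_URL, "Identify")

# Ask for all records, expressed as unqualified Dublin Core.
list_records_url = build_oai_request(
    BASE_URL, "ListRecords", metadataPrefix="oai_dc"
)
```

A harvester would send such requests over HTTP and parse the XML responses; that step is omitted here because it depends on the repository being harvested.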
However, as noted by Becker et al. (2009), IRs need to be trusted, a fundamental principle that is indispensable in the quest for long-term delivery of authentic information. This requirement underscored the need for a Trusted Digital Repository (TDR), one whose mission is to provide reliable, long-term access to digital resources to its designated community, now and into the future (RLG-OCLC 2002) by:
- designing its systems in accordance with commonly accepted conventions and standards to ensure the ongoing management, access and security of materials deposited within it
- establishing methodologies for system evaluation that meet community expectations of trustworthiness
- ensuring that policies, practices and performances can be audited and measured.
Carnegie Mellon University’s (1990) Digital Preservation Capability Maturity (DPCM) model also identified TDR as one of the elements for enabling effective digital preservation. However, in determining trustworthiness, one must look at the entire system in which the digital information is managed, including the organisation running the repository, its governance, organisational structure and staffing, policies and procedures, financial fitness and sustainability, the contracts, licences and liabilities under which it must operate and the trusted inheritors of data (TDR 2002). The RLG-CPA (1996) report also made a clear statement about trust in digital archives: for assuring the longevity of information, the most important role in the operation of a digital archive is managing the identity, integrity and quality of the archive itself as a trusted source of the digital resources.
Digital preservation systems and technologies used in academic libraries
Digital preservation in practice means provisioning secure storage systems, refreshing ageing media, performing fixity checks, replicating content in multiple systems or locations, migrating formats, emulation and other techniques to keep information safe and accessible over time (Ruusalepp & Dobreva 2013). Digital preservation can be seen as a specific case of systems engineering, which is all about the integration or federation of multiple systems that must interoperate in order to achieve a common goal (Valerdi et al. 2008). A digital preservation system thus requires the integration or interoperability of information entities, processes and technological infrastructure, as summarised by Barateiro, Antunes and Borbinha (2009) as follows:
- Information entities: This implies that a future system must be able to interpret the representation of the preserved information entities so that this information can be rendered as the original creator intended
- Processes: This means that the alignment and traceability of processes manipulating digital objects during their entire life cycle are crucial to being able to make assertions about provenance, integrity and authenticity
- Technological infrastructure: This entails the addition of new components into the preservation environment to support the growth of dynamic collections (incrementing the storage space) or to reduce the costs of digital preservation.
Digital preservation systems should therefore be interoperable, which refers to the capability of a computer hardware or software system to communicate and work effectively with another system in the exchange of data, usually a system of a different type, designed and produced by a different vendor (Reitz 2006). Interoperability can be achieved by being OAIS compatible, which means that digital preservation systems must conform to the standards set out in the OAIS model (Maemura, Moles & Becker 2017). Nordland (2007) also proposed the digital preservation architecture as the system for managing the digital materials that are submitted for long-term digital preservation. This system is determined by several factors including the diversity of content, format types and media, storage and cost. The core of this architecture is an established repository that is reliable and has sufficient storage (Nordland 2007) and it supports the digital collection from acquisition to storage and finally to dissemination, which includes integrated access to digital and print resources. In order to do so, persistent and unique identifiers, a metadata registry and a hierarchical data storage system or model are necessary for filing, organising and retrieving digital materials. These components should be interoperable and based upon standards such as ISO 15489. As noted by Nabe (2010), metadata plays a role in system interoperability: for digital repositories to achieve interoperability and to exchange digital objects, they first need to provide metadata to their partners.
As suggested by Nordland (2007), the system must also incorporate a means of ensuring that the digital records are searchable; metadata should thus at least be properly indexed for use by public search engines such as Google. Corrado and Moulaison (2014) also pointed out that every record stored in the repository should have its own persistent and unique identifier so that the database application can locate, retrieve and disseminate the requested record. For example, the National Library of Australia created an architecture called Metadata repository and search system, which allows for increased searchability and retrieval through several interoperable systems. The National Library of New Zealand (NLNZ) also worked with the ExLibris Group to develop a digital preservation system called Rosetta that comprises key features such as producers, validation stack, audit trail, workflow process automation, staff management, user management, permanent repository, delivery and reports (Knight 2010).
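The role of a persistent and unique identifier can be illustrated with a minimal Python sketch. It assumes a simple in-memory registry; the class `IdentifierRegistry`, its methods and the storage paths are hypothetical and do not describe how any of the systems named above work. The point it shows is that the identifier stays the same when the record is moved to new storage, so the record can still be located.

```python
import uuid


class IdentifierRegistry:
    """Hypothetical registry mapping persistent identifiers to locations."""

    def __init__(self):
        self._locations = {}

    def register(self, location: str) -> str:
        """Assign a new unique identifier to a record stored at location."""
        identifier = str(uuid.uuid4())
        self._locations[identifier] = location
        return identifier

    def relocate(self, identifier: str, new_location: str) -> None:
        """Record that an existing record has moved; the identifier is kept."""
        if identifier not in self._locations:
            raise KeyError(identifier)
        self._locations[identifier] = new_location

    def resolve(self, identifier: str) -> str:
        """Return the current location of the record."""
        return self._locations[identifier]


registry = IdentifierRegistry()
record_id = registry.register("/storage/2010/thesis_0001.pdf")

# Simulate a migration to a newer storage platform.
registry.relocate(record_id, "/archive/migrated/thesis_0001.pdf")
current_location = registry.resolve(record_id)
```

In practice, repositories usually rely on an external identifier scheme and a resolver service rather than a local dictionary; the sketch only shows the separation between identifier and location.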
An increasing number of academic libraries around the world are creating various systems and technologies in an attempt to ensure long-term preservation of their digital resources. The majority of academic institutions in South Africa have also developed their own digital preservation systems and technologies whilst others have chosen to implement open source software or proprietary systems, in an attempt to address the digital preservation conundrum (Masenya 2018). Examples of repository software systems mostly used by IRs within academic libraries are Archivematica, Preservica, Repository of Authentic Digital Records (RODA), DSpace, Eprints, ETD-db, Greenstone, Fedora, Bepress, Archeevo, Rosetta, Tessella and others (Masenya 2018). Archivematica is free and open-source software that supports the Metadata Encoding and Transmission Standard (METS), Preservation Metadata: Implementation Strategies (PREMIS), Dublin Core, the BagIt specification and other recognised standards to ensure trustworthy, authentic, reliable and system-independent Archival Information Packages (AIPs) for storage in a preferred repository (Bountouri 2017).
Preservica is a well-known digital preservation system that has contributed to the development of many important standards such as the PRONOM persistent unique identifier and OAIS (Preservica 2016). The Preservica system is widely implemented by businesses, archives, libraries, museums and governments and it implements the Digital Preservation Capability Maturity (DPCM) model, which focuses on the processes and systems needed to keep valuable digital assets accessible and readable for the long term (Preservica 2016). The DPCM model encompasses a range of defined components that enable organisations to measure the maturity of their digital preservation processes and supporting technical environments and to judge how safe their digital objects are (Bountouri 2017). Preservica also supports migration between many different file formats, which can be carried out not only during the initial ingest but also at any point in future (Preservica 2016).
Repository of Authentic Digital Records (RODA) is also fully open-source software that is freely available to download, is built on Fedora and can support existing extensible mark-up language (XML) metadata schemas such as Encoded Archival Description (EAD), METS and PREMIS. In terms of digital preservation actions, RODA supports normalisation, ingestion of digital data, format conversion and checksum verification (Bountouri 2017). Archeevo is archival management software produced by Keep Solutions, a company that provides diverse services related to the management and preservation of digital information (Keep Solutions 2016). It is compatible with archival standards, supports the import and export of data and can manage a large number of digital resources. Archeevo also has many features, such as text indexing of digitised and born digital materials and technical metadata, which are important for long-term preservation (Keep Solutions 2016).
Access to Memory (AtoM) is another open source software application for standards-based archival description and access in a multilingual and multirepository environment (Artefactual Systems 2017) and it makes it possible for archival repositories to disseminate their collections online with minimal cost and effort. DSpace is another open-source software package that provides tools for managing digital assets; as of May 2008, there had been 324 installations of DSpace in 54 countries (DSpace 2013). According to Kari and Barro (2016), DSpace creates, indexes and retrieves various forms of digital content and is adaptable to various community needs. They further highlighted some of the reasons to choose this software: it is an open source platform that can be customised, it offers a service model for open access or digital archiving for perpetual access, it is a platform for an institutional repository, and its collections are interoperable, searchable and retrievable using the Web (Kari & Barro 2016). The JSTOR/Harvard Object Validation Environment (JHOVE) is another open source software tool that performs format-specific identification, validation and characterisation of digital objects.
Other examples of systems and technologies used by NLNZ (2011) for digital preservation include tools that identify and evaluate file formats, such as Digital Record Object Identification (DROID); tools that normalise files to preservable formats, such as XML Electronic Normalising for Archives (XENA); tools that generate and capture metadata, such as the metadata extractor; and tools that produce a unique identifier and aid in detecting changes to files. DROID was developed by the National Archives to meet the fundamental requirement of any digital repository to identify the precise format of all stored digital objects and to link that format identification to a central registry of technical information about that format and its dependencies (Corrado & Moulaison 2014). Colorado State University Libraries (CSUL) also uses two digital asset management tools, namely CONTENTdm and DigiTool. CONTENTdm is the legacy system that hosts around 5000 digital objects and provides online access to a number of digital collections, whilst DigiTool supports the submission, ingestion, management and delivery of digital content including images, documents, video and audio in various formats (Oehlerts & Liu 2013). DigiTool incorporates available open-source standards and utilities such as the JSTOR/Harvard Object Validation Environment (JHOVE) and records the important checksum information. CSUL has also begun exploring collaborative opportunities for digital preservation, such as participation in the MetaArchive, LOCKSS and DuraCloud systems, as noted by Oehlerts and Liu (2013).
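Format identification of the kind described above can be illustrated with a minimal Python sketch that recognises a file format from its leading bytes. This is not DROID's code or interface, and the four-entry signature table is an illustrative assumption rather than data taken from any registry; real tools consult a far larger, centrally maintained set of signatures.

```python
# Illustrative signature table: leading bytes of a file -> format name.
# The entries are well-known file signatures, chosen only as examples.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"PK\x03\x04", "ZIP container"),
]


def identify_format(data: bytes) -> str:
    """Return the name of the first signature matching the start of data."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return "unknown"


pdf_result = identify_format(b"%PDF-1.7\n% example content")
text_result = identify_format(b"plain text with no signature")
```

Identifying a format from its content rather than its file extension matters for preservation because extensions can be missing or wrong, whereas the identified format determines which migration or rendering path applies.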
California Digital Library and Stanford University also developed a tool for digital preservation called BagIt, a specification for the packaging of digital content for the purpose of automating the content’s receipt, storage and retrieval (Oehlerts & Liu 2013). However, even with these examples of available repository systems and software, organisations need to decide how to select an appropriate repository system or software by considering the capabilities and limitations of each system and the extent to which the repository software meets archival requirements and suits the digital content to be preserved. For example, University of Stellenbosch, University of Pretoria and Durban University of Technology use open source software called DSpace for the preservation of their digital resources whilst Rhodes University and UCT use the E-Prints open source software system (Masenya 2018). Although many institutions are using open source software, institutions may opt to build their own repository system or to subscribe to a digital preservation service provider; for example, the National Library of the Netherlands developed its own system called Bpress, an Online Computer Library Centre (OCLC) Digital Archive (Masenya 2018).
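The BagIt packaging mentioned above can be illustrated with a minimal Python sketch. It writes the three elements that the BagIt specification requires of a bag: a `bagit.txt` declaration, a `data/` payload directory and a manifest listing a checksum for each payload file. The function `make_bag`, the bag name and the file content are hypothetical, and the sketch omits optional tag files and validation, so it should be read as an outline under those assumptions rather than a complete implementation of the specification.

```python
import hashlib
import tempfile
from pathlib import Path


def make_bag(bag_dir: Path, payload: dict) -> Path:
    """Write payload files under data/ and record their SHA-256 checksums."""
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True, exist_ok=True)

    manifest_lines = []
    for name in sorted(payload):
        content = payload[name]
        (data_dir / name).write_bytes(content)
        digest = hashlib.sha256(content).hexdigest()
        # Each manifest line pairs a checksum with a path relative to the bag.
        manifest_lines.append(f"{digest}  data/{name}")

    # Bag declaration required by the specification.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    (bag_dir / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8"
    )
    return bag_dir


with tempfile.TemporaryDirectory() as tmp:
    bag = make_bag(Path(tmp) / "thesis_bag", {"thesis.txt": b"example payload"})
    manifest_text = (bag / "manifest-sha256.txt").read_text(encoding="utf-8")
    declaration_text = (bag / "bagit.txt").read_text(encoding="utf-8")
```

Because the manifest travels with the content, a receiving repository can recompute the checksums on arrival and confirm that nothing was lost or altered in transfer.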
Academic libraries are also using various archival file formats for long-term preservation of their digital resources. For example, CSUL’s archival file formats are given the greatest level of preservation support, including assigning persistent identifiers and preservation metadata to support files’ access and management over time, providing secure storage and backup, periodic refreshment to new media, performing regular fixity checks using the proven checksum method, strategic monitoring of format changes and developments using automated services such as listserv and migrating to succeeding formats upon format obsolescence (Oehlerts & Liu 2013). Corrado and Moulaison (2014) identified PDF or PDF/A as the preferred file format for most digital repositories, with other formats commonly accepted for long-term preservation including rich text format (RTF), XML and hypertext mark-up language (HTML). As Li (2011) pointed out, assuring the quality of content and storing content in formats that can more easily be preserved is another area of consideration. Libraries and digital repositories should have a format support policy that is readily available to staff and end users to address this concern, as suggested by Oehlerts and Liu (2013). However, the systems and technologies used for the preservation of digital resources each have advantages and disadvantages. Digital preservationists should thus evaluate these systems to determine whether they meet their needs, what resources (human and financial) will be necessary to implement them and what their limitations might be.
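The regular fixity checks mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not a description of the procedure used at CSUL: it assumes that a SHA-256 digest was recorded when the file was ingested, and the function names sha256_of and fixity_ok are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Reading in chunks avoids loading large files into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path: Path, recorded_digest: str) -> bool:
    """Return True when the file still matches the digest recorded at ingest."""
    return sha256_of(path) == recorded_digest.lower()
```

Running such a check on a schedule and logging any mismatch gives an early warning that a stored file has been corrupted or altered, so that it can be restored from a backup copy.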
Protection of digital resources from unauthorised access
Whilst this study concedes that digital repositories and archives containing digital materials are useful to institutions and user communities, they can pose a threat if proper security protections are not put in place. The purpose of preserving digital materials is to ensure that they remain accessible to the public, and therefore access to digital materials should be free of unreasonable restrictions whilst, at the same time, sensitive and personal information should be protected from any form of intrusion (UNESCO 2003). As also observed by Dollar and Ashley (2014), digital preservation requires processes that restrict access to the physical repository where digital content is stored, ensure the security of electronic records through techniques that block unauthorised access and protect the confidentiality and privacy of records and intellectual property rights. A TDR should thus understand threats to and risks within its systems and apply access control to ensure that the integrity of records is not compromised (TDR 2002). The DPCM model also outlines digital preservation services needed for continuous monitoring of external and internal environments in order to plan and take actions to sustain the integrity, security, usability and accessibility of electronic records stored in trustworthy preservation repositories (Carnegie Mellon University 1990). As stated by Williams (2006), the international standards also require digital resources to have the qualities of authenticity, integrity, reliability and usability. Authenticity helps to prevent unauthorised addition, alteration, deletion, use and concealment of records by unknown people (ISO 15489-1:2001). Integrity is the condition where a record is said to be whole, complete, consistent, correct, accurate and unaltered (Kiltz, Lang & Dittman 2007). Reliability, on the other hand, refers to resources that can be trusted as a full and accurate representation of the business transactions in hand (Williams 2006). Usability refers to the extent to which future end users can view and interact with the preserved data by way of retrieving, presenting and interpreting it correctly (Mason 2007). All these characteristics need to be considered when protecting digital resources over a long period of time, and access control thus needs to be applied to ensure that they are not compromised.
Training needs in digital preservation
The change from the traditional to the digital world has posed many challenges to information professionals in academic libraries, and these institutions are faced with managing hybrid resources that require the necessary skills (Masenya 2018). Information professionals at various levels thus need to strive to implement and apply the latest ICT advancements in their libraries and to handle electronic or digital documents in order to bring about change in line with the goals of the parent organisation. However, Library and Information Science (LIS) professionals are exposed to stress because of the changing library environment and the advent of digital technologies (Maemura et al. 2017). Halder (2009) observed that LIS professionals are stressed because they lack information, clarity and knowledge, as handling the acquisition of electronic or digitised resources, data entry, data coordination and administration requires specialised skills, experience, attitude, training and utmost attention. A viable digital preservation capability thus requires organisations to have sufficient staff with the technical expertise to support all of the technology infrastructure and requisite key processes for digital preservation, and this will facilitate the preservation of digital content and the long-term storage of digital materials. Staff training and education are therefore essential when digitising or preserving materials. In order to survive in this digital world, academic libraries thus need more blended librarians to offer the best combination of skills and services, with the ability to use digital technologies (Masenya & Ngulube 2020).
The Society of American Archivists (SAA) (2013) has also created a list of core competencies that a digital archivist should have, which include the ability to communicate the requirements related to digital archives, to formulate the strategies needed to best organise and preserve them and to integrate technologies, tools, software and media within existing functions for appraising, capturing, preserving and providing access to digital collections. However, as suggested by Dollar and Ashley (2014), technical expertise may be provided by internal staff, a centralised service bureau or by external service providers.
Methodology
The quantitative research methodology based on the positivist epistemology was found suitable for the study because its survey design is well suited to a geographically dispersed population. Creswell and Creswell (2018) and Saunders, Lewis and Thornhill (2016) suggested this research approach when a snapshot of a research phenomenon is required. It facilitates the generalisation of the results of a representative sample to the population from which it is drawn. The sample of the study was drawn from 27 academic institutions, which constituted the target population. The digitisation and preservation directorate, office or section in each of the 27 academic institutions facilitated the completion of the online questionnaire by individuals who dealt with digitisation matters, including library directors or managers, librarians, archivists, ICT managers, institutional repository managers and digital preservation administrators, practitioners and experts. Their knowledge and experience in the field were therefore critical to this study. Out of the 27 questionnaires emailed to these institutions, 22 completed questionnaires were returned, representing a response rate of 81.5%. Pre-testing was used to ensure the reliability and validity of the study. The Statistical Package for Social Sciences (SPSS) was used for data analysis.
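The response rate reported above follows from simple arithmetic, which can be reproduced with the short Python sketch below. The helper name percentage is hypothetical and is not part of the SPSS analysis used in the study.

```python
def percentage(count: int, total: int, digits: int = 1) -> float:
    """Express count as a percentage of total, rounded to the given digits."""
    if total <= 0:
        raise ValueError("total must be a positive number")
    return round(100 * count / total, digits)


# 22 of the 27 emailed questionnaires were returned.
response_rate = percentage(22, 27)
print(response_rate)  # 81.5
```

The same helper can be used to check any frequency reported against the 22 returned questionnaires; for example, percentage(21, 22) gives 95.5.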
Presentation of results
This section presents the key findings, which are discussed in line with the objectives of the study under the following themes: factors that motivate digital preservation in academic libraries, implementation of IRs, preservation systems, technologies and tools being used in academic libraries, protection of digital resources from unauthorised access and training needs in digital preservation. The data presented in this study are not linked to any particular institution because, in order to encourage full participation, respondents were assured that their institutional data would not be identifiable.
Motivation to preserve digital materials
The respondents were asked to state what motivated them to preserve digital materials in their institutions. All 22 respondents (100%) stated that they preserve digital materials to ensure long-term access to digital resources. The majority of respondents also cited the desire to promote library services, with a combined score of 21 (95.5%); managing digital content, with a combined score of 21 (95.5%); interest in digital technologies, with a combined score of 18 (81.8%); and the increased application of digital technologies, with a combined score of 17 (77.3%), as shown in Table 1.
TABLE 1: Motivation for digital preservation.
Other motivating factors for the adoption of digital preservation were pressure from other institutions in the developed countries with a combined score of 15 (68.2%), competition amongst academic institutions with a combined score of 14 (63.6%), pressure from researchers 14 (63.6%) and pressure from library users with a combined score of 13 (59.1%). Respondents generally responded positively and the overall responses show that academic libraries acknowledged the need for digital preservation practices.
Development of institutional repositories
The literature reveals that in South Africa, the majority of academic institutions developed IRs in an attempt to manage and preserve scholarly outputs in their libraries (Pienaar & Van Deventer 2007). In this regard, the respondents were asked whether they had developed IRs in their institutions, and all 22 respondents (100%) reported that their institutions had implemented IRs. Furthermore, those who indicated that they have IRs were also asked to state the purpose of the institutional repository in their institutions. The findings revealed increasing the dissemination of research output by researchers and ensuring long-term accessibility of digital resources, both with a score of 22 (100%); providing a central storage space for the intellectual output of an institution and improving the visibility of research output, both with a score of 21 (95.5%); and enhancing academic communication by allowing global users to comment on the pre-prints stored, with a score of 19 (86.4%). In a nutshell, almost all the participants acknowledged the benefits of IRs.
Types of systems and software used for digital preservation
The role of memory institutions such as libraries, archives and museums is to provide access to and safeguard documentary cultural heritage, scientific and research data, and institutional records (IFLA-UNESCO 2015). These institutions have a long history of collecting, storing and organising both analogue and digital materials and have always been at the forefront of efforts to ensure long-term preservation of and continued access to these materials. Therefore, in order to ensure long-term preservation, academic libraries should look at implementing systems and software to support various digital preservation strategies. Using a multi-response list, the respondents were asked to state the types of systems and software used in their institutions. The study’s findings revealed DSpace as the dominant software used in most academic libraries, with a total score of 15 (68.2%), and it is the only software used for digital preservation in academic libraries that recorded a score above 60%. Similarly, the study by Biswas and Paul (2010) looked at open source software used for the preservation of digital resources around the world, and their findings revealed that out of the 72 institutions studied with various open source software, DSpace had 42 installations. However, it is important to note that Fedora 5 (22.7%), Innovative 4 (18.2%) and E-prints 3 (13.6%) were used to a lesser extent in academic institutions. Only 2 (9.1%) of the respondents acknowledged that their institutions are using Greenstone, whilst Archive-IT ranked very low in adoption and usage with just one user (4.5%), as depicted in Table 2.
TABLE 2: Types of systems and software used for digital preservation practices.
Most responses on the types of software or technologies used for digital preservation in academic libraries fell below 20%. This may be because most of the identified software for digital preservation has not been used in academic libraries or because these institutions are not aware of the software or technologies.
Motivation to use software
Using a multi-response list, the respondents were also asked to state what motivated them to use a particular system or software. DSpace and E-prints, both with a score of 6 (27.3%), and I-T, with a score of 5 (22.7%), were chosen for their ease of use. Five respondents (22.7%) also noted that DSpace is affordable. The respondents reported that they are using Tesella 7 (31.8%), Greenstone 7 (31.8%), Fedora 6 (27.3%) and E-prints 6 (27.3%) as per library policy. On the other hand, the respondents indicated that they are using E-prints 4 (18.2%) and I-T 4 (18.2%) because they have the relevant knowledge (Table 3).
It was evident that most of the academic libraries were adopting new types of software and selecting the most effective technologies for the preservation of their digital resources. DSpace was noted as the dominant software used by many institutions in preserving the digital content kept in their IRs. This software complies with the Open Archives Initiative (OAI) standards, thus allowing items in IRs to be easily discovered by web search engines, services and indexing tools. E-print, ETD and Digital Commons software were also commonly used to support IRs in academic libraries, whilst DuraCloud, Fedora, Bepress, BagIt and Greenstone were only slowly being adopted in these institutions. This may be because these institutions are not aware of this software or because they lack technical knowledge and skills. The efficacy of the software is thus dependent on the calibre and commitment of the library leadership and management and on the careful selection, training and development of its staff.
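The discoverability that OAI compliance provides can be illustrated with a short Python sketch that builds a harvesting request of the kind an indexing service might send under the OAI Protocol for Metadata Harvesting (OAI-PMH). The base URL is hypothetical, the function name oai_list_records_url is an assumption made for illustration, and no network request is sent.

```python
from typing import Optional
from urllib.parse import urlencode


def oai_list_records_url(
    base_url: str,
    metadata_prefix: str = "oai_dc",
    set_spec: Optional[str] = None,
) -> str:
    """Build an OAI-PMH ListRecords request URL for a repository endpoint."""
    # "verb" and "metadataPrefix" are standard OAI-PMH request arguments;
    # "oai_dc" is the Dublin Core format every OAI-PMH repository must support.
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # An optional set restricts harvesting to one collection.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint, used only to show the shape of the URL.
print(oai_list_records_url("https://repository.example.org/oai/request"))
```

A harvester that requests such a URL receives the repository's item metadata as XML, which is how search services and indexing tools can discover items held in an IR.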
Other technologies and tools used for digital preservation
Academic libraries should also look at implementing technologies and tools in order to ensure long-term preservation of their digital resources. Such technologies and tools should also support metadata standards, selection and appraisal policies and format identification for digital preservation. Using a multi-response list, the respondents were asked to indicate the types of other technologies and tools used for digital preservation. The results revealed that LOCKSS 7 (31.8%), DigiTool 6 (27.3%), CONTENTdm 6 (27.3%) and Archive-IT 5 (22.7%) were the dominant tools used in academic institutions in South Africa. DuraCloud 3 (13.6%), Archivematica 3 (13.6%) and ExifTool 2 (9.1%) were the least used tools in academic libraries, as shown in Table 4.
TABLE 4: Types of technologies and tools used for digital preservation.
Protection of digital resources from unauthorised access
Whilst this study concedes that preservation systems and databases containing digital materials are useful to institutions and individuals, they can pose a threat if proper security protections are not put in place. In this regard, the respondents were asked, using a multi-response list, whether they had any measures to protect their digital materials. The majority of institutions, 19 (86.4%), indicated that they have measures to protect their digital resources, whilst only 3 (13.6%) respondents indicated that they do not have any such measures. The respondents who stated that they have measures to protect digital materials against unauthorised access were further asked to indicate the measures taken in their institutions. The findings reveal that an access and use policy 18 (81.8%) was the most widely implemented protection against unauthorised access to digital materials, followed by network security 15 (68.2%), security password authentication 13 (59.1%), request for access approval 13 (59.1%) and data security 11 (50.0%). Only a few institutions were using an audit trail 3 (13.6%), as depicted in Table 5.
TABLE 5: Protection of digital resources from unauthorised users.
Training needs in digital preservation
Given that digital preservation is an extremely complex and evolving field that requires a great deal of knowledge to understand, respondents were asked about their training needs. The majority of respondents, 20 (90.9%), preferred digitisation and digital preservation programmes, followed by preserving electronic resources during their entire life-cycle and the application of digital technologies in preservation practices, both with a score of 19 (86.4%). The study also agrees with the empirical literature that training programmes are needed to help information professionals manage the anticipated problems of digital records (IRMT 2008, 2009). When asked about the training programmes that their institutions would prefer to be delivered to their membership to enhance their digital preservation skills, the findings revealed seminars and workshops 5 (22.7%) and on-the-job training 5 (22.7%) as the most preferred programmes to meet their training needs, followed by online training with a score of 4 (18.2%) and internships and training in digital preservation in schools and colleges, both with a score of 3 (13.6%).
Conclusion and recommendations
New systems and technologies are being designed to provide academic libraries with the capability to create, capture, classify, store, preserve, track and retrieve digital resources, regardless of the format. These technologies include DSpace, Fedora, E-prints, Greenstone, Innovative, I-T, Archivematica, Rosetta and Tesella, to name just a few. In line with these findings, the study observed that academic institutions are using systems and technologies that suit their budget and the digital content to be preserved. However, as noted by Manaf (2007), a well-established infrastructure plays a role in ensuring that essential digital records are archived and stored so that they remain accessible over a long period of time. The system designers in the digital preservation unit should therefore design or develop systems that support long-term preservation. The development of a reliable digital preservation system should also adhere to archival standards and best digital preservation practices and be guided by the principles of TDR to ensure that IRs are able to prove reliability and trustworthiness over time. The academic institutions should also account for technological changes so that the entities may continue to be migrated to newer platforms as needed to avoid technological obsolescence. These institutions must also create mechanisms that allow for the determination of authenticity based on the trustworthiness of the source of the digital entities and adopt the necessary strategies and technologies to preserve them in a sustainable way. Security policies should also be developed to support the protection and security of digital materials from unauthorised access. The literature review showed that digital preservation is an extremely complex field that requires a great deal of knowledge and technical expertise. The academic institutions should, therefore, employ competent personnel to facilitate digital preservation projects and also offer training to library staff members responsible for digital preservation practices.
Acknowledgements
Competing interests
The authors declare that they have no financial or personal relationships that may have inappropriately influenced them in writing this research article.
Authors’ contributions
T.M.M. and P.N. contributed equally to this research article.
Ethical considerations
Ethical approval to conduct the study was obtained from the University of South Africa on 31 October 2016 (Ethical clearance number: 2016 RPSC 66).
Funding information
This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.
Data availability
Data sharing is not applicable to this article as no new data were created or analysed in this study.
Disclaimer
The views and opinions expressed in this article are those of the authors and do not necessarily reflect the official policy or position of any affiliated agency of the authors.
References
Artefactual Systems, 2017, Archivematica: Open-source digital preservation system, viewed 26 February 2017, from https://www.artefactual.com/wp-content/uploads/2019/07/Archivematica-information-sheet-2019.pdf.
Asadi, S., Abdulla, R., Yah, Y. & Nazir, S., 2018, ‘Understanding institutional repository in higher learning institutions: A systematic literature review and directions for future research’, IEEE Access 7, 35242–35263. https://doi.org/10.1109/ACCESS.2017
Ashikuzzaman, M.D., 2018, Brief information about institutional repository, North South University Library, Bangladesh.
Asogwa, B.E., 2011, ‘The challenges of preservation of archives and records in the electronic age’, PNLA Quarterly 76(2), 115–125.
Barateiro, J., Antunes, G. & Borbinha, J., 2009, Addressing digital preservation: Proposals for new perspectives, International workshop on innovation in digital preservation, Lisboa.
Becker, C.H., Kulovits, M., Guttenbrunner, M., Strodl, S., Rauber, A. & Hofman, H., 2009, ‘Systematic planning for digital preservation: Evaluating potential strategies and building preservation plans’, International Journal on Digital Libraries 10(4), 133–155. https://doi.org/10.1007/s00799-009-0057-1
Biswas, G. & Paul, D., 2010, ‘An evaluative study on the open source digital library softwares for institutional repository: Special reference to Dspace and Greenstone digital library’, International Journal of Library and Information Science 2(1), 001–010.
Bountouri, L., 2017, Archives in the digital age: Standards, policies and tools, Chandos Publishing.
Carnegie Mellon University, 1990, The System Security Engineering: Capability Maturity Model (SSE-CMM), Version 2.
Chowdhury, G., 2009, ‘From digital libraries to digital preservation research: The importance of users and context’, Journal of Documentation 66(2), 207–223.
Corrado, E.M. & Moulaison, H.L., 2014, Digital preservation for libraries, archives, and museums, Rowman & Littlefield Publishers, Lanham, MD.
Council of Canadian Academies, 2015, Leading in the digital world: Opportunities for Canada’s memory institutions, The expert panel on memory institutions in the digital revolution, Council of Canadian Academies, Ottawa.
Creswell, J.W. & Creswell, J.D., 2018, Research design: Qualitative, quantitative and mixed methods approaches, 5th edn., Sage, Los Angeles, CA.
Das, T.K., Sharma, A.K. & Gurey, P., 2009, ‘Digitization, strategies and issues of digital preservation: An insight view to Visva-Bharati Library’, in Proceedings of the 7th International Convention on Automation of Libraries in Education and Research (CALIBER), Pondicherry University, 25th–27th February 2009.
Day, M., 2006, ‘The long-term preservation of web content’, Web Archiving, pp. 177–199, Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-46332-0_8
Dollar, C.M. & Ashley, L.J., 2014, Assessing digital preservation capability using a maturity model process improvement approach, Carnegie Mellon University, Pittsburgh, PA.
DSpace, 2013, Homepage, viewed n.d., from https://duraspace.org/.
Halder, S.N., 2009, ‘Multimodal roles of library and information science professionals in present era’, International Journal of Library and Information Science 1(6), 92–99.
Hedstrom, M. & Lee, C.A., 2002, ‘Significant properties of digital objects: Definitions, applications, implications’, Proceedings of the DLM-Forum 2002 Parallel session 3, viewed 28 April 2016, from http://www.ils.unc.edu/callee/sigprops_dlm2002.pdf.
IFLA & UNESCO, 2010, ‘IFLA/UNESCO Manifesto for Digital Libraries’, viewed 18 September 2018, from https://www.ifla.org/publications/iflaunesco-manifesto-for-digital-libraries.
International Records Management Trust (IRMT), 2008, Integrating records management in ICT system: Good practice indicator, IRMT, London.
Kari, K.H. & Baro, E.E., 2016, ‘Digital preservation practices in university libraries’, DE Gruyter 45(3), 134–144. https://doi.org/10.1515/pdtc-2016-0006
Keep Solutions, 2016, Keep Solutions, viewed 24 November 2016, from http://www.keep.pt/.
Kenney, A.R. & McGovern, N.Y., 2003, ‘The five organizational stages of digital preservation’, in P. Hodges, M. Bonn, M. Sandler & J.P. Wilkin (eds.), Digital libraries: A vision for the 21st century: A festschrift in honor of Wendy Lougee on the occasion of her departure from the University of Michigan, University of Michigan Library: Scholarly Publishing Office, Ann Arbor, MI.
Kiltz, S., Lang, A. & Dittman, J., 2007, ‘Taxonomy for computer security incidents’, in L.J. Janczewski & A.M. Colarik (eds.), Cyber warfare and cyber terrorism, IGI Global.
Knight, S., 2010, ‘Early learnings from the National Library of New Zealand’s National Digital Heritage Archive Project’, Electronic Library and Information Systems 44(2), 85–97. https://doi.org/10.1108/00330331011039454
Li, Y., 2011, Institutional repositories and digital preservation: Assessing current practices at research libraries, University of Massachusetts, Amherst, MA.
Lynch, C.A. & Lippincott, J.K., 2005, ‘Institutional repository deployment in the United States as of Early 2005’, D-Lib Magazine, 11(9). http://dx.doi.org/10.1045/september2005-lynch
Macha, A. & De Jager, K., 2011, ‘A comparative overview of the development of the institutional repositories at the University of Cape Town and at the University of Pretoria’, Proceedings of the 14th International Symposium on Electronic Theses and Dissertations, University of Cape Town, South Africa, 13th–17th September 2011.
Maemura, E., Moles, N. & Becker, C., 2017, ‘Organizational assessment frameworks for digital preservation: A literature review and mapping’, Journal of Association for Information Science and Technology 68(7), 1619–1637. https://doi.org/10.1002/asi.23807
Manaf, Z.A., 2007, ‘The state of digitization initiatives by cultural institutions in Malaysia’, Library Review 56(1), 45–60. https://doi.org/10.1108/00242530710722014
Masenya, T.M., 2018, ‘A Framework for preservation of digital resources in academic libraries in South Africa’, PhD thesis, University of South Africa.
Masenya, T.M. & Ngulube, P., 2020, ‘Factors that influence digital preservation sustainability in academic libraries in South Africa’, South African Journal of Libraries and Information Science 86(1), 52–63.
Mason, S., 2007, ‘Authentic digital records: Laying the foundation for evidence’, Information Management Journal 5, 32–40.
Nabe, J.A., 2010, Starting, strengthening, and managing institutional repositories (How-to-do-it manually), Neal-Schuman Publishers, New York, NY.
National Information Standards Organization (NISO), 2004, Understanding metadata: What is metadata, and what is it for?: A primer, National Information Standards Organization Press, Bethesda, MD, viewed n.d., from http://www.niso.org/publications/press/UnderstandingMetadata.pdf.
National Library of New Zealand (NLNZ), 2011, National Library of New Zealand Te Puna Mātauranga o Aotearoa National Digital Heritage Archive, s.n., s.l.
Ngulube, P., 2012, ‘Ghosts in our machines: Preserving public digital information for the sustenance of electronic government in sub-Saharan Africa’, Mousaion: South African Journal of Information Studies 30(2), 127–135.
Nordland, L.P., 2007, ‘The long and short of IT: The International Development Research Centre as a case study for a long-term digital preservation strategy’, PhD thesis, University of Manitoba.
OCLC & RLG Working Group on Preservation Metadata, 2002, ‘Preservation metadata for digital objects: A review of the state of the art’, A White Paper, viewed n.d., from www.oclc.org/research/pmwg/presmeta_wp.pdf.
Oehlerts, B. & Liu, S., 2013, ‘Digital preservation strategies at Colorado State University Libraries’, Library Management 34(1/4), 1–12. https://doi.org/10.1108/01435121311298298
Pienaar, H. & Van Deventer, M., 2007, ‘Capturing knowledge in institutional repositories … playing leapfrog with giraffes’, paper presented at the Knowledge Management preconference workshop at the World Library and Information Congress (WLIC): 73rd IFLA General Conference and Council.
Preservica, 2014, How Preservica works, viewed n.d., from http://preservica.com/preservica-works.
Rahman, M.D. & Muhammed-ul-Islam, M., 2012, ‘Issues and challenges for sustainable digital preservation practices in Bangladesh’, International Seminar on ‘Digital Libraries for Digital Nation’ organized by Library Association of Bangladesh, Dhaka, Bangladesh, October 17–18, 2012, pp. 79–93.
Raju, J., 2014, ‘Knowledge and skills for the digital era academic library’, The Journal of Academic Librarianship 40, 63–170. https://doi.org/10.1016/j.acalib.2014.02.007
Reitz, J.M., 2006, ODLIS: Online Dictionary of Library and Information Science, Western Connecticut State University, Danbury, CT.
Rosa, C.A., Craveiro, A. & Dominques, P., 2017, ‘Open source software for digital preservation repositories: A survey’, International Journal of Computer Science & Engineering Survey (IJCSES) 8(3), 21–39. https://doi.org/10.5121/ijcses.2017.8302
Rosenthal, D.S.H., Robertson, T., Lipkis, T., Reich, V. & Morabito, S., 2005, ‘Requirements for digital preservation systems: A bottom-up approach’, D-Lib Magazine 11(11), 1–14. https://doi.org/10.1045/november2005-rosenthal
Ruusalepp, K.I. & Dobreva, M., 2013, Innovative digital preservation using social search in agent environments: State of art in digital preservation and multi-agent systems, viewed 02 December 2013, from http://www.durafile.eu/.
Saunders, M., Lewis, P. & Thornhill, A., 2016, Research methods for business students, 7th edn. Pearson Education Limited, Harlow.
Sigauke, D.T. & Nengomasha, C.T., 2011, ‘Challenges and prospects facing the digitization of historical records for their preservation with national archives of Zimbabwe’, Paper presented at the 2nd International Conference in African Digital Libraries and Archives, Johannesburg, South Africa, 17th–18th November 2011.
Styblinska, M., 2006, ‘Long-term preservation of digital assets: Some specific aspects’, in Proceedings of the International Multiconference on Computer Science and Information Technology, viewed 10 November 2015, from https://www.yumpu.com/en/document/view/33742324/long-term-preservation-of-digital-assets-some-specific-aspects.
Tansley, R., Bass, M. & Smith, M., 2003, ‘DSpace as an open archival information system: Current status and future directions’, in T. Koch & I.T. Sølvberg (eds.), ECDL 2003, LNCS 2769, pp. 446–460, Springer-Verlag, Berlin, Heidelberg.
Trusted Digital Repositories (TDR), 2002, Attributes and Responsibilities, An RLG-OCLC Report, RLG, Mountain View, CA.
The Society of American Archivists (SAA), 2013, Council of State Archivists (CoSA), New Orleans, LA.
United Nations Educational, Scientific and Cultural Organization (UNESCO), 2003, Convention for the safeguarding of the intangible cultural heritage, UNESCO, France, viewed 12 October 2016, from https://ich.unesco.org/en/1com.
Valerdi, R., Axelband, T., Baehren, B., Boehm, D., Dorenbos, S., Jackson, A. et al., 2008, ‘A research agenda for systems of systems architecting’, International Journal of System of Systems Engineering 1(1/2), 171–188. https://doi.org/10.1504/IJSSE.2008.018137
Williams, C., 2006, Managing archives: Foundations, principles and practice, Chandos Publishing, Oxford.
Xie, I. & Matusiak, K.K., 2016, Discover digital libraries: Theory and practice, Elsevier, Amsterdam.