The scholarly record, which includes published research articles, needs to be preserved for the long term. For printed materials, this task falls to institutions such as libraries. In the digital era, the digital object identifier (DOI) system is meant to provide a persistent way to find objects such as articles published online. Research articles and other digital resources with an assigned DOI can, in theory, always be accessed using this unique identifier, even if the underlying link changes. For example, if a publisher were to cease operations, the DOI could be redirected to a new location such as a digital archive. Such archives can be publicly accessible, or they can store materials in the background and release them only if other sources fail (“dark” archives).
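The indirection described above can be illustrated with a minimal Python sketch. It assumes the public resolver at https://doi.org/, which redirects a DOI to wherever the object currently lives. The helper names `doi_to_url` and `resolve` are hypothetical and chosen only for this example, and `resolve` needs network access.

```python
from urllib.parse import quote
import urllib.request

# Public DOI resolver; the landing page behind a DOI may change, this prefix does not.
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Build the resolver URL for a DOI, dropping a leading resolver prefix or 'doi:'."""
    doi = doi.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Percent-encode unusual characters but keep the slash between prefix and suffix.
    return DOI_RESOLVER + quote(doi, safe="/")


def resolve(doi: str, timeout: float = 10.0) -> str:
    """Follow the resolver's redirects and return the final URL (requires network access)."""
    request = urllib.request.Request(doi_to_url(doi), method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.geturl()


print(doi_to_url("doi:10.31274/jlsc.16288"))
```

Some publisher sites reject `HEAD` requests or default client headers, so `resolve` should be read as a sketch rather than a robust link checker.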
Martin Paul Eve, Crossref and Birkbeck, University of London, UK, has investigated whether research articles with an assigned DOI are adequately preserved. He used archives such as those of national libraries, Portico, Gallica, or HathiTrust to build a database of over 7.4 million articles with DOIs. He found that there were ca. 5.9 million copies of these articles in total in the studied archives, and that about 4.3 million articles had at least one archived copy. This means that ca. 58 % of the articles in the database were preserved in some way, while ca. 28 % appeared to be unpreserved. The remaining ca. 14 % of the articles were excluded from the analysis (e.g., due to a recent date of publication or insufficient data on the publication date).
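The percentages follow directly from the rounded counts quoted above, as the short Python check below shows. Because the inputs are rounded, the derived values (the approximate counts of unpreserved and excluded articles and the average number of copies per preserved article) are rough estimates and not figures taken from the study.

```python
# Rounded figures as quoted in the text.
total_articles = 7.4e6    # articles with DOIs in the database
preserved = 4.3e6         # articles with at least one archived copy
archived_copies = 5.9e6   # copies found across all studied archives

unpreserved_share = 0.28  # quoted share with no archived copy
excluded_share = 0.14     # quoted share excluded from the analysis

preserved_share = preserved / total_articles
copies_per_preserved = archived_copies / preserved

print(f"preserved share: {preserved_share:.1%}")
print(f"unpreserved articles: about {unpreserved_share * total_articles / 1e6:.1f} million")
print(f"excluded articles: about {excluded_share * total_articles / 1e6:.1f} million")
print(f"average copies per preserved article: {copies_per_preserved:.2f}")
```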
Eve also evaluated how well members of Crossref, the largest DOI registration agency, preserve their content. He found that fewer than 1 % of members could be confirmed to preserve over 75 % of their content in three or more of the studied archives. About 8.5 % of members were found to preserve over 50 % of their content in two or more archives, and about 33 % met the threshold of preserving at least 25 % of their content in at least one archive. Among other measures, Eve suggests that DOI registration agencies should define a minimum required standard of preservation and enforce it.
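The three thresholds can be read as a simple classification rule. The Python sketch below is one possible interpretation and not the study's actual method: it assumes each member is described by a list giving, for every article, the number of archives that hold a copy. The function names and the level labels are hypothetical.

```python
def share_in_at_least(archive_counts: list[int], k: int) -> float:
    """Fraction of a member's articles that are held in at least k archives."""
    if not archive_counts:
        return 0.0
    return sum(1 for count in archive_counts if count >= k) / len(archive_counts)


def preservation_level(archive_counts: list[int]) -> str:
    """Apply the thresholds from the text, strictest first (labels are illustrative)."""
    if share_in_at_least(archive_counts, 3) > 0.75:
        return "high"    # over 75 % of content in three or more archives
    if share_in_at_least(archive_counts, 2) > 0.50:
        return "medium"  # over 50 % of content in two or more archives
    if share_in_at_least(archive_counts, 1) >= 0.25:
        return "basic"   # at least 25 % of content in at least one archive
    return "below threshold"


# A made-up member with four articles, three of them held in two archives each.
print(preservation_level([2, 2, 2, 0]))
```

Checking the strictest threshold first means each member is assigned only the highest level it reaches.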
- Digital Scholarly Journals Are Poorly Preserved: A Study of 7 Million Articles,
Martin Paul Eve,
J. Libr. Scholarly Commun. 2024.
https://doi.org/10.31274/jlsc.16288