Archive [draft] [#digitalkeywords]

“In the digital age, we attempt to create archives of a particular moment, the entirety of a medium, the mutability of language, all knowledge. … The archive as a fractured, incalculable moment is an attempt to hold close all that happens at once in the world. But this concept has become incredibly problematic with the rush of information around us.”

The following is a draft of an essay, eventually for publication as part of the Digital Keywords project (Ben Peters, ed). This and other drafts will be circulated on Culture Digitally, and we invite anyone to provide comment, criticism, or suggestion in the comment space below. We ask that you please do honor that it is being offered in draft form — both in your comments, which we hope will be constructive in tone, and in any use of the document: you may share the link to this essay as widely as you like, but please do not quote from this draft without the author’s permission. (TLG)


Archive — Katherine D. Harris, San Jose State University

In Archive Fever, Derrida suggests that the moments of archivization are infinite throughout the life of the artifact: “The archivization produces as much as it records the event” (17). Archiving occurs at the moment that the previous representation is overwritten by a new “saved” document. Traces of the old document exist, but cannot be differentiated from the new. At the moment an archivist sits down to actively preserve and store and catalogue the objects, the archiving is once again contaminated with a process. This, according to Derrida, “produces more archive, and that is why the archive is never closed. It opens out of the future” (Derrida 68). Literary works become archives not only in their bibliographic and linguistic codes,[1] but also in their social interactions yet to occur. It is the re-engagement with the work that adds to an archive and that continues the archiving itself beyond the physical object.

I crafted my keyword perambulations around this burning desire to return to origins intermixed with the desire to hold everything at once in the mind’s eye. In literature, this of course causes the protagonist to faint, go mad, isolate herself, create alternate realities, all in the name of either escaping or explaining what cannot be known. My Gothic Novel students pointed out just this week that the narrator of a short story, Lovecraft’s in particular, attempts to focus on a few actions in the busy-ness of the world, to focus the reader on what is calculable and knowable but ultimately unheimlich.

In the digital age, we attempt to create archives of a particular moment (The September 11 Digital Archive), the entirety of a medium (The Internet Archive), the mutability of language (The Oxford English Dictionary), all knowledge (Wikipedia). More than any others, the crowd-sourced information of Wikipedia attempts to capture knowledge as well as the creation of that knowledge: the history or Talk page of each Wikipedia entry unveils an evolving community of supposedly disinterested[2] users who argue, contribute, and create each entry. Wikipedia entries represent the digital version of an archive in the twenty-first century. The archive as a fractured, incalculable moment is an attempt to hold close all that happens at once in the world. But this concept has become incredibly problematic with the rush of information around us, a topic that I broach in my entry for “Archive” in The Johns Hopkins Guide to Digital Media.

Kenneth Price begins my discussion about “archive” by offering a traditional definition of the term:

Traditionally, an archive has referred to a repository holding material artifacts rather than digital surrogates. An archive in this traditional sense may well be described in finding aids but its materials are rarely, if ever, meticulously edited and annotated as a whole. In an electronic environment, archive has gradually come to mean a purposeful collection of digital surrogates. (para. 3)

Later in this article, Price veers into discussing the role of the archivist in shaping the archive, similar to what Derrida proposes above but with less dramatic flair. Price’s article is in response to the authority of a digital scholarly edition and its editors in the face of traditional print editions. Always, for Price, there is an organizing principle to archiving and, subsequently, editing. However, what we’re concerned with for this particular gathering is the inherent messiness of the archive as it pertains to cultural records, both physical and digital. What gets placed into the archive and by whom becomes part of that record. What’s missing, then, becomes equally important. Martha Nell Smith proposes that digital archives are free from the constraints of a traditional print critical edition; more importantly, the contents and architecture of a digital archive can be developed in full view of the public with the intention of incorporating the messiness of humanity.

In “Googling the Victorians,” Patrick Leary writes that all sorts of digital archives about Victorian literature are springing up, archives that are not peer-reviewed per se but offer an intriguing and sanguine view of the wealth of nineteenth-century materials. Leary concludes his essay by asserting that whatever does not end up in a digital archive, represented as cyber/hypertext, will not, in the future, be studied, remembered, valorized, and canonized. Though this statement reflects some hysteria about the loss of the print book, it is also revealing in its recognition that digital representations have become common and widespread, regardless of professional standards. Whatever is not on the Web will not be remembered, says Leary. Does this mean that the literary canon will shift to accommodate all of those wild archives and editions? Or does it mean that those mega projects of canonical authors will survive while the disenfranchised and non-canonical literary materials will fall further into obscurity?

Raymond Williams posits that “vulgar misuse” allows for entry into the cultural record (Williams, Keywords 21), though those in library science object to the normalization of “archive,” a usage that moves away from their professional standards for a vault of record of humanity. But the construction of a digital archive in literary studies conflates literature, digital humanities, history, computer programming, social sciences, and a host of other cross-pollinated disciplines. The archive, more than anything right now in literary studies, demonstrates what Williams calls “networks of usage” (23) with “an emphasis on historical origins [as well as] on the present – present meanings, implications, relationships – as history” (23). Community, radical change, discontinuity, and conflict are all part of the continuum in the creation of meaning according to Williams, seemingly similar to Borges’ “Library of Babel” and Derrida’s “archive fever.” While archivists insist on a conscious choice in the use of “archive” (noun or verb), perhaps as part of a professional tradition, I seek to look at the messiness of the word as a representation of the messiness of past, present, and future.

The issue with formal digital archives is where to stop collecting to account for scope, duration, and shelf space. In digital archives, sustainability is key; but the digital archive is vastly more capable of accumulating everything and of allowing its liberal and even promiscuous remixing by users with whatever tools are available. The primary argument seems to be over who controls the inventorying, organization, tagging, and coding of the data within an archive (user, curator, editor, architect?). And what digital tools are best employed in sorting the information? Even a tool offers a preliminary critical perspective.

Kathleen Burnett, borrowing from Deleuze and Guattari’s concept of the rhizome, notes in “Toward a Theory of Hypertextual Design” that the archive is less about the artifact and more about the user:

[e]ach user’s path of connection through a database is as valid as any other. New paths can be grafted onto the old, providing fresh alternatives. The map orients the user within the context of the database as a whole, but always from the perspective of the user. In hierarchical systems, the user map generally shows the user’s progress, but it does so out of context. A typical search history displays only the user’s queries and the system’s responses. It does not show the system’s path through the database. It does not display rejected terms, only matches. It does not record the user’s psychological responses to what the system presents. . . . The map does not reproduce an unconscious closed in upon itself; it constructs the unconscious. (25)

The digital archive, some argue, is the culmination of Don McKenzie’s “social text,” and the database, and to some extent hyperlinks, allow users to chase down any reference. In essence, the users become ergodic and radial readers. McGann, in The Textual Condition, defines radial reading as reading that regularly transcends its own ocular, physical bases, which means that readers leave the book in order to acquire more information about the book (e.g., looking up a word in a dictionary or a footnote at the back). This allows the reader to interact with the book, text, story, etc., through this acquisition of knowledge. The reader makes and re-makes the knowledge produced by the text through this continual knowledge acquisition, yet the reader never actually leaves the text. It stays with her even while consulting other knowledge. This creates a plasticity to the text that is unique to each reader (119).

But the archive, a metaphor once again, is always and forever contaminated according to David Greetham in Pleasures of Contamination. An archive is less about the text of a printed word and can be about all facets of materiality, form, and its subsequent encoding – even the reader herself.  Scott Rettberg notes that the act of reading prioritizes the experience over the object itself with this idea of ergodic reading:

The process of reading any configurative or “ergodic” form of literature invites the reader to first explore the ludic challenges and pleasures of operating and traversing the text in a hyperattentive and experimental fashion before reading more deeply. The reader of Julio Cortazar’s Hopscotch must decide which of the two recommended reading orders to pursue, and whether or not to consider the chapters which the author labels “expendable.” The reader of Milorad Pavic’s Dictionary of the Khazars must devise a strategy for moving through the cross-referenced web of encyclopedic fragments. The reader of David Markson’s Wittgenstein’s Mistress or Reader’s Block must straddle between competing desires to attend to the nuggets of trivia of which those two books are largely composed or to concentrate on the leitmotifs which weave them into a tapestry of coherent psychological narrative. In each of these print novels, the reader must first puzzle over the rules of operation of the text itself, negotiate the formal “novelty” of the novel, play with the various pieces, and fiddle with the switches, before arriving at an impression of how the jigsaw puzzle might fit together, how the text-machine may run. Only after this exploratory stage is the type of contemplative or interpretive reading we associate with deep attention possible. (para. 13 – emphasis added)

As our understanding of digital interruptions into an otherwise humanistic world expands and becomes both resistant and welcoming, the definition of an “archive” expands as well.


Bornstein, George. “How to Read a Page: Modernism and Material Textuality.” Studies in the Literary Imagination 32:1 (Spring 1999): 29-58.

Burnett, Kathleen. “Toward a Theory of Hypertextual Design.” Postmodern Culture 3:2 (January 1993): 1-28.

Greetham, David. The Pleasures of Contamination: Evidence, Text, and Voice in Textual Studies. Indiana UP, 2010.

Harris, Katherine. “Archive.” The Johns Hopkins Guide to Digital Media. Eds. Marie-Laure Ryan, Lori Emerson, and Benjamin J. Robertson. Baltimore: Johns Hopkins UP, 2014.

Leary, Patrick. “Googling the Victorians.” Journal of Victorian Culture 10:1 (Spring 2005): 72-86.

McGann, Jerome. “How to Read a Book.” The Textual Condition. Princeton: Princeton UP, 1991. 119.

Price, Kenneth. “Electronic Scholarly Editions.” A Companion to Digital Literary Studies. Eds. Susan Schreibman and Ray Siemens. Oxford: Blackwell, 2008.

Rettberg, Scott. “Communitizing Electronic Literature.” Digital Humanities Quarterly 3:2 (Spring 2009).

Smith, Martha Nell. “The Human Touch: Software of the Highest Order.” Textual Cultures 2:1 (Spring 2007): 1-15.

Williams, Raymond. Keywords: A Vocabulary of Culture and Society. NY: Oxford UP, 1983.


[1] The bibliographic code is distinguished from the content or the semantic construction of language within a text (linguistic code) by the following elements, as George Bornstein describes: “[F]eatures of a page layout, book design, ink and paper, and typeface . . . publisher, print run, price or audience. . . . [Bibliographic codes] might also include the other contents of the book or periodical in which the work appears, as well as prefaces, notes, or dedications that affect the reception and interpretation of the work” (30, 31). Linguistic codes are specifically the words. Also within the book are paratextual elements that do not necessarily fall under the bibliographic or linguistic codes. 

[2] See Matthew Arnold on disinterestedness in Essays in Criticism.