James Joyce
David Letzler
REDUNDANCY, MODERNISM, AND READERS' EXPECTATIONS:
AN EXPERIMENT IN JOYCE PREDICTION

Introduction

As Leonard Diepeveen notes in The Difficulties of Modernism, the most common rhetorical move throughout the century-long discourse on modernist literature has been to invoke its readers' “unfulfilled expectations.”(1) Though many readers respond to these unfulfilled expectations with frustration, modernist polemics have long celebrated these texts' evasion of expectations as central to their value. Inspired by Marxist critiques of conventional language - such as Theodor Adorno's association of “everyday speech” with the “shoddiness” of “received opinion,” claiming that “Only what [most people] do not need first to understand, they consider understandable; only the word copied by commerce, and really alienated, touches them as familiar”(2) - this view depicts modernism's refusal to fulfill expectations as renewing and redeeming the written word. Exemplifying that attitude in her Modernist Literature: Challenging Fictions, Vicki Mahaffey celebrates modernist texts for working “according to different laws than the ones that most contemporary readers unconsciously expect to govern literary works,” criticizing “traditional stories” for being “too easy” by following those laws, whereas modernist works “force us laboriously to rethink our most comfortable assumptions and expectations.”(3)

Unsurprisingly, James Joyce is frequently positioned as a modernist whose work defies expectations in this way - or at least the late Joyce, author of Ulysses and Finnegans Wake, often is. Though modernist critics rarely criticize Dubliners and A Portrait of the Artist as a Young Man directly, many imply that these earlier works retain a certain residual conventionality that Joyce gradually had to overcome: introducing his alternative history of the novel, for instance, Steven Moore pointedly describes Joyce as “progressing from Dubliners to Finnegans Wake,” with “progressing” not meant merely in the sense of continuation but improvement in aesthetic value, as the Beatles' catalog “progresses” from “I Want to Hold Your Hand” to Sgt. Pepper's.(4) The Wake, in fact, is often thought to be Western literature's furthest point of “progress” in this direction, maximally freeing readers of the limiting expectations of conventional language: as Derek Attridge puts it, “Instead of saying that in learning a language we learn to ascribe meaning to a few of the many patterns of sound we perceive, it may be as true to say that we learn not to ascribe meaning to most of those connections [...] - until we are allowed to do so again to a certain degree in rhetoric and poetry, and with almost complete abandon in Finnegans Wake.”(5)

As is characteristic of literary studies, of course, such claims about what readers expect from a text are usually articulated hypothetically rather than with reference to any actual readers, as it is infinitely easier to speculate on this issue than to tackle it empirically. After all, how would one even attempt to study such a complex subjective experience?

We might be able to investigate this matter, though, by adapting an early experiment from information theory. At the 1950 edition of the Macy Conferences, an annual series of landmark interdisciplinary seminars themed around the emerging field of cybernetics, Bell Labs' Claude Shannon presented the results of a study designed to estimate the “redundancy” of written English. “Redundancy,” in Shannon's terminology, is a measure of a text's predictability: for instance, a string of forty digits that strictly alternates 1s and 0s is more redundant than a string of forty random digits, because while each digit of the second string provides a new, unpredictable bit of information, the first is largely predictable. Put another way, the first string adheres more closely to a set of expectations, allowing its receiver to reconstruct it even if it were significantly compressed, while the second string follows no pattern at all, requiring all forty digits to be received individually for it to be reconstructed. Curious about ordinary written English's redundancy, Shannon calculated from computational linguists' letter-frequency tables that English text must have a baseline redundancy of at least 54%. Furthermore, he realized that even that figure likely underestimated the redundancy of English, because frequency tables do not account for how, as he put it, “every one of us who speaks a language has implicitly an enormous statistical knowledge of the structure of the language if we could only get it out. That is to say, we know what words follow other words; we know the standard cliches, grammar, and syntax of English.”(6) To test this sort of redundancy, his lab conducted a prediction experiment, in which test subjects were given the first n letters of a randomly-chosen sentence (for several values of n) and asked to guess the next letter until they got it right - and then the letter after that, and so on, until the sentence's end. 
Shannon found that subjects guessed over 70% of a sentence's letters on the first try, suggesting that typical written English is over 70% redundant.(7)
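
The contrast between the two forty-digit strings can be sketched in a few lines of Python. This is not Shannon's procedure: it assumes a deliberately crude estimator (the entropy of each digit given only the digit before it, computed from pair counts), and the function names are hypothetical.

```python
import math
import random
from collections import Counter

def conditional_entropy(s):
    """Estimate, in bits, the entropy of each symbol given the one before it."""
    pair_counts = Counter(zip(s, s[1:]))   # counts of (previous, next) pairs
    prev_counts = Counter(s[:-1])          # counts of symbols that have a successor
    total = len(s) - 1
    h = 0.0
    for (prev, _nxt), n in pair_counts.items():
        h -= (n / total) * math.log2(n / prev_counts[prev])
    return h

def redundancy(s, alphabet_size=2):
    """Redundancy = 1 - (estimated entropy / maximum possible entropy)."""
    return 1 - conditional_entropy(s) / math.log2(alphabet_size)

alternating = "10" * 20                    # forty strictly alternating digits
rng = random.Random(0)                     # fixed seed so the run is repeatable
random_digits = "".join(rng.choice("01") for _ in range(40))

print(redundancy(alternating))
print(redundancy(random_digits))
```

Under this estimator the alternating string comes out fully redundant, since each digit is determined by its predecessor, while the random string scores far lower. Because forty digits is a small sample, the random string will usually show a little spurious redundancy rather than exactly zero.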

Interestingly, Shannon and his fellow conferees named Joyce's work, alongside newspaper prose, as an example of text that was somewhat less redundant than average.(8) This seems like a sensible assessment, because as Hugh Kenner notes, Joyce “continuously evades the normal English patterns to which structural linguistics have devoted so much study.”(9) Furthermore, Joyce's prose does so progressively more throughout his career, even to the point that his later novels might render the normative frequencies upon which Shannon based his calculations irrelevant. What would happen, then, if we ran Shannon's experiment using Joyce's prose instead of ordinary text? By replicating this experiment with representative sentences from Joyce's three major novels - A Portrait of the Artist as a Young Man, Ulysses, and Finnegans Wake - we might measure and evaluate how different kinds of sentences defy or confirm readers' expectations for them. How low does the later Joyce's redundancy get - how much does it defy conventional expectation?

Methods and Materials

I will withhold the full text of the test sentences I selected until the “Results” section, as an invitation for interested readers to try the experiment with a partner: for such readers, look for the sentences beginning “It” on page 78 of Portrait, “Your” on line 672 of Chapter 6 of Ulysses, and “In” on line 29 of page 15 of the Wake. However, I will describe here their salient qualities with respect to the experiment. Though they were chosen to exemplify different novels, they each share several features. They are of similar length - respectively 66, 77, and 77 characters - and have similar word-length distribution. Each sentence begins a new paragraph and commences with a brief functional word, two of which are pronouns and one of which is a preposition. None includes any punctuation prior to the sentence's full stop, nor does any contain explicit references to the books' characters and events. Though no single sentence could be representative of such heterogeneous novels, the three selected represent characteristic Joycean styles, particularly in the way they increasingly violate the standard rules of English grammar. The first sentence, from Portrait, is a well-crafted literary sentence, one typical of a turn-of-the-century English-language novel. However, the second, taken from a stream-of-consciousness passage in Ulysses' “Hades” chapter, frequently violates syntactic rules and employs its vocabulary in unusual ways. The third, from the first chapter of the Wake, does not even keep to the standard written English lexicon, frequently inventing words or importing them from other languages, all the while further blurring the boundaries between its syntactic units.

For reasons that will be described in the “Discussion” section, I used two distinct groups of test subjects for this experiment, each of whom participated between December 2011 and May 2012. Keeping in mind both Shannon's invocation of how “every one of us who speaks a language has implicitly an enormous statistical knowledge of the structure of the language” and, conversely, modernist critiques of the mass audience, I intended the first group to approximate “ordinary readers.” It comprised 31 volunteer undergraduate students from Queens College (CUNY), mostly (though not entirely) English majors. Obviously, a sample this small, drawn from so limited a pool, cannot really represent “ordinary readers” generally, but given that the Queens College student body is incredibly diverse, representing a variety of backgrounds and reading abilities, it seemed as good a subpopulation to work with as any.(10) For the second group, I recruited more highly skilled readers, 14 volunteer graduate students from a variety of fields (mostly in the humanities) from several major universities. Somewhat more eclectic in its makeup and perhaps less representative of any particular demographic, this group was consistently distinguished from the former in its members' education level and reading experience.

My experiment recreates Shannon's in its basic outlines. Each subject was given the first word of a sentence, instructed to write it on scrap paper, then prompted to guess the following letter. I recorded whether or not the guess was correct and told the subject what the correct letter was. The subject proceeded to write down this new letter, then guessed the letter following that one, etc., until the sentence's end, at which point we moved to the next sentence. This process made a few minor, simplifying changes to Shannon's procedure. First, Shannon included separate trials where subjects guessed from 26-character and 27-character alphabets, one excluding and one including the space character, but since Shannon found that the spaces ended up being almost totally redundant, I used the 26-character alphabet exclusively. Second, for the sake of time and sanity, I did not make test subjects guess letters until they came up with the correct one, but simply told them the correct letter after their first guess, whether correct or not. Third, since my purpose was to gauge readers' expectations rather than abstract properties of the text, I did not supply the subjects with letter frequency charts as Shannon did. Fourth, while Shannon's experiment examined the impact that a variable number of introductory characters might have on the predictability of ensuing characters, I thought it would be easiest for our more modest purposes to begin with the first word of each sentence. Otherwise, the experiment was identical.
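
For readers who want to administer the procedure themselves, the single-guess protocol described above might be sketched as follows. The function names and the `guesser` callable are hypothetical stand-ins for a human subject, and the sample sentence is a placeholder rather than one of the test sentences.

```python
def run_trial(sentence, guesser):
    """Score one subject on one sentence under the single-guess protocol.

    `guesser` is a hypothetical callable: it receives the letters revealed so
    far (spaces and punctuation removed) and returns one guess for the next.
    """
    words = sentence.lower().split()
    first_word = "".join(ch for ch in words[0] if ch.isalpha())
    rest = [ch for word in words[1:] for ch in word if ch.isalpha()]

    revealed = first_word          # the subject is given the first word
    results = []                   # one True/False data point per guessed letter
    for actual in rest:
        results.append(guesser(revealed) == actual)
        revealed += actual         # the correct letter is revealed either way
    return results

def score(results):
    """Fraction of letters guessed correctly on the first (and only) try."""
    return sum(results) / len(results)

def always_t(revealed):
    """A deliberately naive guesser that ignores the context."""
    return "t"

outcome = run_trial("The cat sat on the mat.", always_t)
print(len(outcome), score(outcome))
```

A human trial would replace `always_t` with prompts to the subject; the bookkeeping (one binary data point per letter after the first word) stays the same.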

Results

This will be a technical, statistical presentation of some key results. Those who dislike such things may skip to “Discussion” for an interpretation of the data.

The Portrait sentence was “It was towards the close of his first term in the college when he was in number six” (P 78). The Ulysses sentence was “Your heart perhaps but what price the fellow in the six feet by two with his toes to the daisies?” (U 6.672-673). The Wake sentence was “In the name of Anem this carl on the kopje in pelted thongs a parth a lone who the joebiggar be he?” (FW 15.29-30). The sample of thirty-one undergraduate and fourteen graduate subjects, each producing over two hundred binary data points, yielded the following results:

Table 1
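
As a quick check on the sentence lengths given in “Methods and Materials” and on the figure of over two hundred data points per subject, the letters can simply be counted. The sketch below assumes that spaces and punctuation are never guessed and that the first word of each sentence is supplied to the subject; the names are illustrative only.

```python
SENTENCES = {
    "Portrait": "It was towards the close of his first term in the college "
                "when he was in number six",
    "Ulysses": "Your heart perhaps but what price the fellow in the six feet "
               "by two with his toes to the daisies?",
    "Wake": "In the name of Anem this carl on the kopje in pelted thongs "
            "a parth a lone who the joebiggar be he?",
}

def letter_count(text):
    """Count alphabetic characters only (the 26-character alphabet)."""
    return sum(ch.isalpha() for ch in text)

def guess_count(sentence):
    """Letters a subject must guess: everything after the supplied first word."""
    first_word = sentence.split()[0]
    return letter_count(sentence) - letter_count(first_word)

lengths = {name: letter_count(s) for name, s in SENTENCES.items()}
guesses = {name: guess_count(s) for name, s in SENTENCES.items()}
print(lengths)
print(guesses, sum(guesses.values()))
```

The counts come to 66, 77, and 77 letters, of which 64, 73, and 75 are guessed, for 212 binary data points per subject. On that basis, a Portrait score such as 26.6% would correspond to 17 correct guesses out of 64.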

As suggested by the data's distribution within one and two standard deviations, the results adhere reasonably closely (given the limited sample) to a normal distribution. This may be loosely observed in the score distributions shown in the graphic below, with the x-axis representing several ranges of scores and the y-axis representing the number of subjects scoring in that range:

Chart 1

Since our sample size is limited, I used a t-test to evaluate confidence intervals regarding the mean results. We will calibrate the confidence interval to 98% confidence:

Table 2
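
A t-based confidence interval of this kind can be sketched as follows. The critical values are copied from a standard t-table for the two group sizes (thirteen and thirty degrees of freedom), and the scores are invented for illustration; they are not the study's data.

```python
import math
import statistics

# Two-sided 98% critical values of Student's t from a standard table,
# keyed by degrees of freedom (sample size minus one).
T_CRIT_98 = {13: 2.650, 30: 2.457}

def confidence_interval_98(scores):
    """Return (low, high) bounds of a 98% confidence interval for the mean."""
    n = len(scores)
    mean = statistics.mean(scores)
    std_err = statistics.stdev(scores) / math.sqrt(n)   # uses the sample std. dev.
    margin = T_CRIT_98[n - 1] * std_err
    return mean - margin, mean + margin

# Made-up scores for fourteen hypothetical subjects, NOT the experimental data.
example_scores = [0.50, 0.52, 0.55, 0.56, 0.57, 0.58, 0.58,
                  0.59, 0.60, 0.60, 0.61, 0.62, 0.63, 0.66]
low, high = confidence_interval_98(example_scores)
print(round(low, 3), round(high, 3))
```

The smaller the sample, the larger the critical value and hence the wider the interval, which is why the t-distribution rather than the normal distribution is the cautious choice here.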

I also calculated a separate-variance t-test to examine the statistical significance of the experimental differences between the two subject populations' mean performance on the three texts at various levels of confidence, taking as our null hypothesis (H0) that the mean score for each is the same. (If t' is greater than tv at a given level of confidence, then that null hypothesis is rejected.)

Table 3
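
The separate-variance statistic can be sketched as below. This is a generic textbook form (the Welch-Satterthwaite approximation for the degrees of freedom), not necessarily the exact computation behind Table 3; the `hypothesized_diff` parameter is an assumption about how a claim such as "a difference of at least six percentage points" would be tested.

```python
import math
import statistics

def welch_t(sample_a, sample_b, hypothesized_diff=0.0):
    """Return (t', degrees of freedom) for a separate-variance t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each group's mean
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    diff = statistics.mean(sample_a) - statistics.mean(sample_b)
    t_prime = (diff - hypothesized_diff) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom
    dof = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t_prime, dof

# Toy samples for illustration only, NOT the experimental data.
t_prime, dof = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(round(t_prime, 3), round(dof, 2))
```

The resulting t' is then compared with the critical value for the computed degrees of freedom at the chosen confidence level, and the null hypothesis is rejected if t' exceeds it in magnitude.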

Within each sentence, we may also examine whether subject scores improved as the text proceeded. Given word-by-word variability, we probably cannot calculate a significant result for how the subjects' accrued knowledge of the sentence's text impacted their ability to guess the remaining letters, but we may observe general trends in the chart below:

Chart 2

We may also calculate how many characters were guessed by relatively large or small numbers of test subjects. The following graph's x-axis denotes how many of the forty-five subjects guessed a given letter correctly, while the y-axis represents how many letters in each of the three sentences were correctly guessed by that number of subjects:

Chart 3

Lastly, we may perform a linear regression to examine the correlative relationship between how well a given subject performed on one sentence versus how well that subject performed on the others. In each of the scatter plots below, the x-axis represents the subjects' scores on the first named text, and the y-axis the second. The best-fit line appears in red.

Chart 4.1

The best-fit line is y = 0.2x + .22, which has a correlation coefficient of .370 and an r² of .137.

Chart 4.2

The best-fit line is y = .18x + .15, which has a correlation coefficient of .361 and an r² of .130.

Chart 4.3

The best-fit line is y = .15x + .19, which has a correlation coefficient of .158 and an r² of .024.
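
For reference, a best-fit line and correlation coefficient of this kind can be computed with ordinary least squares, as in the sketch below. The function name and the toy data are illustrative assumptions, not the study's scores.

```python
import math

def best_fit(xs, ys):
    """Least-squares slope, intercept, and correlation coefficient r."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Toy data for illustration only, NOT the experimental scores.
slope, intercept, r = best_fit([1, 2, 3, 4], [2, 1, 4, 3])
print(slope, intercept, r, r ** 2)
```

The r² value is simply the square of the correlation coefficient, so a coefficient of .370 corresponds to an r² of about .137, and the slope gives the expected change in the second score per unit change in the first: with a slope of 0.2, a ten-point gain on one text goes with a two-point gain on the other.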

Discussion

A. The Undergraduate Scores Against the Graduate Scores

Though initially I conducted this experiment to quantify the difference in redundancy between Joyce's three texts, the most eye-opening results were the different scores attained by the two subject populations. At first, I had assumed the correctness of Shannon's comment that “every one of us who speaks a language has implicitly an enormous statistical knowledge of the structure of the language if we could only get it out,” and intended to estimate the sentences' redundancy using only undergraduate test subjects. Even without frequency tables, I expected them to achieve, as did Shannon's subjects, “close enough to ideal prediction”(11) on their own, yielding something like accurate values for the sentences' overall redundancy. In doing so, I had forgotten that Shannon's comment was made in another era, one in which linguists more readily made broad generalizations about the structure of the langue and paid less attention to how language is shaped by its individual speakers.

Consequently, I found the undergraduate results, especially on the sentence from Portrait, rather shocking. Expecting successful prediction rates on the first sentence somewhere above sixty percent - i.e., consistent with Shannon's assertion that Joyce is somewhat less redundant than average while also remaining above Shannon's 54% lower-bound figure for redundancy - I did not know how to respond when the undergraduates averaged only 45.6% (see Table 1). It appeared mathematically impossible, and it wasn't as if the mean were being dragged down by outliers - though one subject did score as low as 26.6% - because the median was even lower, at 43.8%. In fact, out of thirty-one trials, only three subjects broke the 54% lower bound at all. What had gone wrong? I wondered if I had somehow chosen a “bad” sentence, yet the Portrait passage I selected appeared sufficiently long and normative to adhere to Shannon's rules.

If the text was not the problem, I wondered whether the sample population was responsible. Perhaps belatedly recalling C. P. Snow's comment in “The Two Cultures” that, outside of the humanities, even educated individuals tended to regard Charles Dickens - i.e., literary studies' exemplar of conventional, populist realism - as immensely difficult to read,(12) I expanded the experiment to include a sample of humanistically-inclined graduate subjects. The difference in the scores was immediate and obvious. Even over a small sample, the t-test confirms as statistically significant a different mean score between the two populations of at least six percentage points(13) (see Table 3). The graduate mean was 57.8%, exactly equal to the highest value any undergraduate scored, and the median was even higher, at 59.3%. The lowest graduate score, meanwhile, was still comfortably above the undergraduate mean. In other words, the graduate subjects possessed many more correct expectations for the Portrait sentence - eight to ten correct guesses' worth - than the undergraduates.

This enormous difference requires examination, especially since the gap between the two populations substantially decreased in the other two sentences, shrinking to 4.8 percentage points in the mean for Ulysses and 3.1 for Finnegans Wake (see Table 1). The Ulysses difference is statistically significant, but even hypothesizing that difference as high as one percentage point is beyond the confidence justified by these limited trials,(14) while the latter is barely significant at all (see Table 3). How did the graduates achieve higher scores on the Portrait sentence than the undergraduates, especially in comparison to their relative equality on the other two? We cannot analyze this on a letter-by-letter basis as confidently as we may the overall averages, but there are some general trends we might pick out from the raw data. For instance, the four characters exhibiting the largest differences between undergraduate and graduate accuracy came within a sequence of five characters, namely the o, f, i, and s in the middle of the phrase “towards the close of his first term”: the s gap was nearly forty percentage points, and the other three were in excess of fifty.
I would argue that this likely reflects the graduates' greater command of idioms: after constructing “towards the close,” their internalized knowledge of the language allowed them to anticipate the “close of” expression; subsequently, upon receiving the h (after most incorrectly chose d for “close of day”) they were able to predict that his was the most likely word to continue that phrase.(15) The undergraduates, on the other hand, did not anticipate this expression nearly as well, causing them to muddle through with incorrect guesses for a few letters until re-grounding themselves at “first.” Generally speaking, the undergraduates, apparently possessing a less thorough feel for English word combinations, syntax, and spelling, required more information to put together several parts of the sentence: to take two other examples, they needed more letters before figuring out longer words like “towards” and “college”(16) and had greater trouble in guessing the first few letters of “when.”

To an extent, this advantage is still visible in the Ulysses sentence, as the graduates similarly gained on the undergraduates at the ends of longer words and expressions like “fellow” and “toes to the daisies.”(17) However, there were fewer such clusters, muting the advantage exhibited on Portrait. Overall, in fact, high performance on one text generally did not predict high performance on another, as may be seen in the scatter plots in Charts 4.1-3, showing that a set of expectations that was accurate for one text often was not for the others. Given the relatively flat slope of the best-fit line and the low correlation coefficient for the Portrait-Ulysses regression, the association between the two appears weak and inconsistent: on average, performing ten percentage points better on Portrait only led to a two-point increase on Ulysses, with only 13.7% of the variation in the Ulysses scores explicable by the Portrait scores.(18) This correlation is slightly weaker when comparing Portrait and Wake performance, as the slope flattens further and the correlation drops. If we look to the raw data on the Wake, we see no letter clusters where the graduates consistently outperformed the undergraduates: their only consistent advantage seemed to come from guessing h after t more reliably. (The correlation between Ulysses and Wake performance, for that matter, is so small as to be totally insignificant.) As a result, the differences in median scores between Portrait and the other two texts are much smaller for the undergraduates than for the graduates; while the undergraduates experienced a 13.7-percentage-point gap in predictability between Portrait and Ulysses and a 19.8-point gap between Portrait and the Wake, these numbers balloon to 24.5 and 33.4 percentage points for the graduates, more than two-thirds again larger.
We will discuss this more in the “Conclusion,” but these numbers show that while readers' expectations do seem to be thwarted by Ulysses and the Wake, it is the highly-skilled readers' expectations that are most affected.

B. The Three Texts Against Each Other

The subjects' results on the three texts relative to each other were much less surprising. Portrait was, by far, the easiest to predict of the three samples, with Ulysses the next easiest and the Wake the most difficult; at 98% confidence, the graduate difference in mean from Portrait to Ulysses was significant to at least eighteen percentage points, and from Portrait to the Wake, at least twenty-seven (see Table 3).(19) Since the graduates' 98% confidence interval for Portrait stretches to 61.3% (see Table 2) - a number roughly consistent with Shannon's findings, and one which might have been more consistently achieved by the subjects had they been supplied with Shannon's frequency tables - we might similarly estimate redundancy values for the Ulysses and Wake sentences at the high end of the graduate interval, placing the former somewhere in the high thirties and the latter in the high twenties. In other words, Ulysses contains only about two-thirds as much redundancy due to standard English conventions as does Portrait, while the Wake possesses only half. In particular, easily guessable characters vanish as we move from Portrait to Ulysses and the Wake, and the entirely unexpected ones dramatically increase (see Chart 3). Twenty-four of Portrait's characters were guessed correctly by at least two-thirds of the subjects, while only ten of Ulysses' and eight of the Wake's were guessed so often. Conversely, thirty-one of the Wake's characters (nearly half) were guessed by four or fewer subjects, with a further twenty-eight guessed by fewer than a third of the subjects, as compared to twenty-four and nineteen for Ulysses and twelve and ten for Portrait. The Wake further distinguished itself by lacking the “50/50” guesses found throughout Portrait and Ulysses: eighteen and twenty characters in Portrait and Ulysses, respectively, were guessed correctly by between one- and two-thirds of the subjects, while only seven in the Wake were.

The greater length of the Ulysses and Wake sentences did not seem to bias their results upward as the sentence progressed. As we see in Chart 2, while subjects tended to do a little better on the Portrait sentence as it progressed (possibly by using the information they had received earlier in the sentence to better predict later characters), accuracy did not improve on the Ulysses sentence and actually got worse for the Wake. This is probably due to random variation within the sentence, but there is some evidence to suggest that the Wake's text defies expectations to the point of actively worsening subjects' prediction abilities, reshaping their expectations so that even the conventional appears unfamiliar. For instance, each of the three test sentences contained at least one two-word sequence in which a preposition was followed by the. In English, since a word beginning with t that follows a preposition is usually the, one should generally guess the next two letters to be h-e. For the Portrait and Ulysses sentences, subjects generally did so. However, the Wake's phrase “carl on the kopje” produced somewhat lower scores on its h-e, especially for the undergraduates. The gap is not large enough to support firm conclusions, but it is possible that the Wake sentence confused the subjects so much that they began to abandon their predictive intuitions and started guessing more haphazardly.

I will make one final observation. Though I did not run a variation of this experiment with the 27-letter alphabet, I can report from watching the subjects copy down the Wake sentence that its spaces were not as redundant as Shannon claimed they usually are. (The other two sentences' spaces did seem to be mostly redundant.) Subjects frequently misplaced the Wake's spaces, and moreover were unable to correct these misplacements as the sentence progressed, something that happened only rarely with the other two texts. Often, subjects constructed whole word sequences that, despite having the same letters, were different from Joyce's: for instance, the phrase “a parth a lone who the” was frequently transcribed as “apart halo new hot he.” Though this phenomenon might suggest to those who believe that the Wake destroys all linguistic boundaries that the book renders spaces - which are, after all, English's most basic lexical boundary - irrelevant, I would argue, conversely, that the Wake's spaces are even more important than those of standard written English: since their placement is much less predictable, each one carries relatively high information.

C. Caveats

I want to make several caveats regarding these results. First, though we may make some statistically significant conclusions from this data, we cannot claim that they characterize the books as a whole, because these results are derived only from single sentences. Conceivably, an experiment treating the whole texts could be run: a savvy computer programmer might program a database containing the novels' entire corpus to administer this test over a larger sample of subjects, randomizing the text given to each. Since I have not done this, however, it might be a hair more defensible to claim these results broadly characterize differences between Joyce's styles: his well-crafted conventional prose is about 60% redundant, his stream-of-consciousness writing drops to around 40%, and Wakese falls to under 30%.

Second, there are obviously many factors influencing individual subjects' performance on this experiment for which I have not accounted. A scholar repeating this experiment with a test population differing in age, education level, nationality, or profession - to name only a few possible variables - might achieve substantially different results, which would underline how readers' expectations may vary dramatically based on their reading history and experience. I do not pretend to have come close to isolating what factors most strongly affected my subjects' scores. Furthermore, given the possibility that Ulysses and the Wake may tend to alter or impair readers' prediction ability, some subjects (human or programmed) might find a way to systematically predict more optimally than my graduate population, which would alter my hypothesized values regarding the texts' redundancy.

Third, though I believe the impact to be minimal, it is possible that the samples may be somewhat biased by the fact that graduate subjects were likely to have already read Portrait or Ulysses at some point prior to the trials, though most had not cracked either in some years. (These books had been read, respectively, rarely and not at all by the undergraduates.) Ideally, one would run the trials with subjects who had never read any Joyce, but the difficulty in recruiting highly educated, experienced readers who fit this criterion was not surmountable for this experiment. However, subjects' previous experiences with the texts should not have had much impact on the data. As I noted, the sentences were deliberately selected to lack specific reference to the novels: many graduate subjects explicitly noted that having read the books provided them no help in guessing letters. The few occasions when the sentences referred to potentially memorable aspects of the text - most notably, term and college in the Portrait sentence might have been more guessable by someone aware that Portrait's early chapters partly take place at boarding schools - seem to have impacted the average scores by little enough that my basic conclusions would be unaffected. In fact, over the eleven letters in term and college, the undergraduates scored less than two correct characters worse than the graduates, little different than their underperformance throughout the rest of the sentence.

Fourth, it has been suggested to me that the undergraduates' results might have been biased by greater performance anxiety than the graduates felt, which is certainly possible. I did emphasize to each subject that I was testing the sentences, not them, so pressure to perform should have been limited. However, as test-taking research has shown, such effects may be unpredictable and irrational, so we cannot say whether this had an impact.

Conclusion and Summary

It will probably surprise no one that Ulysses and Finnegans Wake do, indeed, defy readers' expectations, just as modernist criticism claims. In fact, they do so even more than I had expected. An information theorist with whom I discussed this project prior to conducting the experiment suggested that any text approaching Shannon's 54% lower bound would be nigh-on unreadable, and Ulysses and the Wake appear to go well below that. Redundancy is, after all, a crucial element of pattern, which, as Norbert Wiener pointed out long ago, is the cornerstone of complex human communications.(20) Consequently, texts with as little redundancy as Ulysses and the Wake verge dangerously close to illegibility. Of course, Ulysses and Finnegans Wake are actually more redundant than the experiment's numbers indicate, because those numbers only measure the texts' redundancy with respect to written English conventions. The two late books derive more redundancy from their own peculiar internal sets of motifs and repetitions, which have been explicated at great length in Joyce scholarship: to pick two examples here, an experienced Wake reader will likely hear in “a parth a lone” the approaching echo of “A way a lone a last a loved a long the,” while a Ulysses veteran might anticipate “daisies” all the more quickly because of the text's many other flower references. In Joyce, when conventional expectations are refused in one place, other patterns are created elsewhere.

More importantly, though, this experiment also demonstrates something that modernist discourse often does not acknowledge: conventional expectations are often only held by highly-experienced readers. At the risk of conflating readers' letter-by-letter expectations with their broader expectations for narrative, this data suggests that the modernist framework in which readers are divided into those who respectively are and are not willing to have their conventional expectations refused is inadequate, because it ignores that many readers lack these expectations in the first place. Because average readers' textual expectations are less developed than those of experienced readers, what the latter consider to be conventional text may appear quite unusual to the former. Put another way, what our students see when they read Portrait is not what we see: if these numbers are representative, Portrait looks to them as much like what Ulysses looks to us as what Portrait looks to us. What separates readers is often less the ability to adjust one's expectations to unconventional text than having expectations for conventional text at all.

Perhaps the rhetoric of modernism does not handle this point adequately because, as Diepeveen observes, it is vague about whether modernist literature aims to alter readers' expectations or to eliminate them entirely. (This vagueness, Diepeveen adds, is closely tied to modernism's ambivalence about how the moderately educated are supposed to experience modernist text.)(21) If the latter is modernism's goal, then perhaps we all should strive to read, write, and think language more like our students, who, as this experiment shows, already bring far fewer expectations to texts than do we. If that conclusion seems unacceptable, though, then appreciating modernist literature must require that one experience a difference between how our expectations interact with it and with conventional text - a difference which, consequently, cannot be fully appreciated by average readers, because they often lack highly developed expectations for conventional text. Paradoxically, then, instead of denouncing conventional expectations, modernists should encourage readers to cultivate them, because without them, modernist difference cannot be perceived.

1 Leonard Diepeveen, The Difficulties of Modernism (New York: Routledge, 2003), 73. Throughout this paper, incidentally, I use “modernism” less in its historical sense than in Umberto Eco's trans-historical one, where modernism is a tendency to “destroy the past” by eliminating its conventional elements (“Postscript,” in The Name of the Rose, trans. William Weaver (New York: Harcourt & Brace, 1984), 530). I use it this way because “modernism” appears to be the most common term invoked when discussing the problem of difficult literature. Had I the space to do so, I might advocate for modernism to be used largely in a historical sense, and give Eco's definition to a different term, but that argument would go rather beyond this paper's scope.
2 Theodor Adorno, Minima Moralia: Reflections from Damaged Life, trans. E. F. N. Jephcott (London: Verso, 1978), 101.
3 Vicki Mahaffey, Modernist Literature: Challenging Fictions (Malden, MA: Blackwell, 2007), vii, 4, 8.
4 Steven Moore, The Novel: An Alternative History (New York: Continuum, 2010), 1: 23-24.
5 Derek Attridge, Peculiar Language: Literature as Difference from the Renaissance to James Joyce (Ithaca: Cornell University Press, 1988), 215.
6 Claude Shannon, “The Redundancy of English,” in Cybernetics: The Macy Conferences 1946-1953, ed. Claus Pias (Zürich: Diaphanes, 2003), 1:248-250. A related Internet meme regarding this phenomenon has circulated over the past decade involving our ability to “read” text even when its vowels have been removed and its consonants switched. For a summary of this meme and links to some cognitive linguistic studies relevant to how we process language, see Matt Davis, “Cmabrigde.” MRC Cognition and Brain Sciences Unit. http://www.mrc-cbu.cam.ac.uk/people/matt.davis/Cmabrigde/.
7 Shannon, 253.
8 Ibid., 254-255. Actually, things are a little more complicated than that. Shannon and his colleagues refer to the frequency studies of G. K. Zipf, who famously used Ulysses to test his theory regarding the relation of a corpus's words' statistical frequencies to their frequency ranks within the work. Zipf chose Joyce's novel because he figured its size, vast vocabulary, and abnormal prose style would push his theory to its limit, and he was gratified to find that it still held. Accordingly, the Macy conferees appear to have invoked Joyce because they believed, following Zipf, that the size of his vocabulary would make any given word less predictable, much as a newspaper article's use of many disparate proper nouns and minimal filler makes it less redundant. From my (very limited) understanding of Zipf's work, though, it appears he did not consider syntax, which is where Ulysses (and, to an extent, the Wake) is at its most abnormal and most defies predictability. Quantitative linguistic analysis beyond this, sadly, is outside my ken.
9 Hugh Kenner, The Stoic Comedians: Flaubert, Joyce, and Beckett (Berkeley: University of California Press, 1961), 31.
10 Obviously, elite readers (of the kind who either have accrued more education than my undergraduates or who possess the ability and privilege to attend more highly selective universities) are not well represented in this group; on the other hand, neither are those whom we might generally expect to have very limited reading skills, such as those who have not attended university at all or who have avoided the kinds of English classes from which I drew my sample.
11 Shannon, 253-254.
12 C. P. Snow, The Two Cultures (New York: Cambridge UP, 1998), 12.
13 For the sake of space, this calculation is not shown in Table 3, though the data presented suggests how it may be calculated. In particular, if one subtracts .06 from the numerator of t' on the “Portrait-G v. Portrait-U” line, t' will still be higher than tv at α=.02.
14 Similar to the calculation suggested in note 13, if one subtracts .01 from the t' numerator on the “Ulysses-G v. Ulysses-U” line in Table 3, t' will no longer be higher than tv at α=.02.
15 Obviously, his is more likely than her in the context of Portrait. While greater knowledge of the book's contents (specifically, its constant focus on Stephen) might have helped the graduate subjects guess this correctly (see “Caveats”), the i's higher probability could also have been intuited from the book's title, about which all subjects were informed before the experimental trial. Similarly, lack of knowledge of the book would not explain why, after being given the h-i, the undergraduates had such trouble producing the final s.
16 The former may be partly caused by my (American) undergraduates' lesser familiarity with the British usage “towards,” rather than the more common American “toward,” which gave the graduates less trouble. Knowledge of a variety of conventional systems is, after all, part of one's linguistic knowledge and expectations.
17 Again, in both of these cases, the graduates' linguistic knowledge includes a greater familiarity with Britishisms and idiomatic expressions.
18 Further confusing matters, two subjects (both undergraduates) actually scored higher on Ulysses than Portrait.
19 Again, the calculation is not shown in Table 3, but may be confirmed by subtracting .18 and .27 from the respective numerators.
20 Norbert Wiener, The Human Use of Human Beings: Cybernetics and Society (Boston: Houghton Mifflin, 1950), 3-4.
21 Diepeveen, 237-239.