As speech researchers, we are often interested in how the context in which a word appears influences its processing. For example, in a noisy environment, the word "cat" might be confused with other, similar-sounding words ("cap", "can", "hat", etc.). However, in that same noisy environment, if it appeared in a sentence such as "The girl did not like dogs but she loved her pet cat", it would be much easier to recognize: the preceding sentence context limits the number of sensible ways to finish the sentence.

One way to measure how predictable a word is at the end of a sentence is to ask people to guess what it is. So, for example, people might see the sentence "The girl did not like dogs but she loved her pet _______" and be asked to write down the first word that comes to mind in finishing the sentence. The proportion of people giving a particular word is then taken to indicate the probability of that word. If 99 out of 100 people (99%) think the sentence ends with "cat", we might assume a 0.99 probability for "cat" being the final word.
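The arithmetic behind that estimate is simple enough to sketch in a few lines of Python. This is only an illustration of the idea, not the code we used for the norms: the function name `cloze_probabilities` and the example responses (including the lone "hamster") are made up for the example.

```python
from collections import Counter


def cloze_probabilities(responses):
    """Return each distinct response's share of all responses."""
    counts = Counter(responses)
    total = sum(counts.values())
    if total == 0:
        # No responses collected, so there is nothing to estimate.
        return {}
    return {word: n / total for word, n in counts.items()}


# Hypothetical data: 99 of 100 people answer "cat", one answers "hamster".
responses = ["cat"] * 99 + ["hamster"]
probs = cloze_probabilities(responses)
print(probs["cat"])  # 0.99
```

Because every response is counted exactly once, the probabilities for a given sentence always sum to 1.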

Unfortunately, this approach means that as researchers we are generally limited to the existing lists of sentences for which this sort of data is available. A few years ago we discovered we needed a greater variety of sentences for an experiment, and thus was born our new set of norms. Over the course of a summer, undergraduate students in the lab created 3085 sentences. We broke these up into lists of 50 sentences and recruited participants online to fill in sentence-final words. We collected at least 100 responses for each sentence, from a total of 309 participants (many of whom completed more than one list), for over 325,000 responses in all.

We then wrote some Python code to tally the responses, and manually checked all of the responses to correct typos and similar errors. Our hope is that with this large number of sentences and target words, researchers will be able to select stimuli that meet their needs for a variety of experiments.
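For readers curious what this kind of tallying can look like, here is a minimal sketch. It is not our actual script: the `(sentence_id, response)` row format, the `CORRECTIONS` table, and the function names are all assumptions made for illustration. The idea is that the manual checking produces a table mapping typos to intended words, and that table is applied before counting.

```python
from collections import Counter, defaultdict

# Hypothetical corrections table, built by hand while reviewing responses.
CORRECTIONS = {"catt": "cat", "kat": "cat"}


def normalize(raw):
    """Trim whitespace, lowercase, and apply any manual correction."""
    word = raw.strip().lower()
    return CORRECTIONS.get(word, word)


def tally(rows):
    """Count normalized responses separately for each sentence."""
    counts = defaultdict(Counter)
    for sentence_id, raw in rows:
        word = normalize(raw)
        if word:  # skip blank responses
            counts[sentence_id][word] += 1
    return counts


# Hypothetical raw responses as (sentence_id, response) pairs.
rows = [
    (1, "cat"), (1, " Cat "), (1, "catt"), (1, "hamster"),
    (2, "Ball"), (2, ""),
]
tallies = tally(rows)
print(tallies[1].most_common())  # [('cat', 3), ('hamster', 1)]
```

Keeping the corrections in a separate table, rather than editing the raw responses, leaves the original data untouched and makes each manual decision easy to review later.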

(Although the norms are available on OSF, we are working on a more user-friendly search interface, hopefully coming soon.)

Peelle, J. E., Miller, R., Rogers, C. S., Spehar, B., Sommers, M., & Van Engen, K. J. (2019, September 4). Completion norms for 3085 English sentence contexts. https://doi.org/10.31234/osf.io/r8gsy

The Speech, Hearing, And Communication (SHAC) Lab