The science of language is the study of how humans communicate and understand meaning. It does this by examining the ways in which words influence and reflect internal and external processes and behavior, as well as social interaction and connectivity (Krieger & Gallois, 2017; Mehl & Pennebaker, 2003). The average person speaks 150-160 English words per minute (Yuan, Liberman, & Cieri, 2006) and is exposed to 14 million words per year (Moore, 2003). There is a large body of scientific literature that reveals how an understanding of the content (i.e., what is said) and style and structure (i.e., how it is said) of language can inform our understanding of how people think, feel, process information, connect with others, and cope with difficulties.
These insights are collected by capturing an individual’s natural use of language in a non-intrusive way that does not rely on self-report and is therefore not vulnerable to the response biases (e.g., social desirability, acquiescence, test bias) generally recognized in self-report assessment measures (Arnold & Feldman, 1981; Hill et al., 2018; Hu & Rahnev, 2019; Schriesheim & Hill, 1981). Natural language analysis is valuable because it can predict social status (Kacewicz, Pennebaker, Davis, Jeon & Graesser, 2014), personality traits like overconfidence and narcissism (Holtzman et al., 2019), need states (Pennebaker & King, 1999), acts of deception (Newman, Pennebaker, Berry & Richards, 2003), overall health status (Ziemer & Korkmaz, 2017), mental distress (Lyons, Aksayli & Brewer, 2018), and depression (Tackman et al., 2019). Natural language use analysis can also identify linguistic indicators (e.g., use and number of specific words and utterances) that signal distinct psychological processes such as psychological distancing (Nook, Schleider & Somerville, 2017) and psychological change (Cohn, Mehl & Pennebaker, 2004).
History of Language Use Analysis
One of the most prolific researchers in the area of language analysis is Dr. James Pennebaker from the University of Texas, Austin. For over 30 years, Pennebaker has used word counting strategies to examine how the use of content words (i.e., words that label an object or an action, such as nouns and verbs) and function words (i.e., words that connect, shape, and organize language, such as pronouns, articles, prepositions, adverbs, conjunctions, and quantifiers) correlates with personality and emotional states and traits, as well as many other aspects of human experience. Content words are critical to conveying information or an idea to others (e.g., table, to love, yesterday). Function or style words, meanwhile, are most often structuring words that link interpersonal concepts and content words together but have no inherent meaning of their own (e.g., I, with, some, really). For example, the word “I” describes whoever is using it rather than representing a constant person or object. Although they are less apparent, function words are used at very high rates, are shorter and harder to detect consciously, are social in nature, and are processed by different parts of the brain than content words (Pennebaker, 2011).
Pennebaker and his research team (1999) first began to examine word use by categorizing words into dictionaries containing dimensions (e.g., words like love, nice, and sweet would be categorized as positive emotion words) in order to examine differences in narrative text samples before and after a series of expressive writing exercises. This initial work yielded surprising findings suggesting that the act of expressive writing helped to resolve trauma, as measured by changes in writing style and measures of trauma-related symptoms (Pennebaker & Chung, 2011). Furthermore, word use reliably predicted improved overall health in volunteer undergraduate participants, signaling that Pennebaker had potentially identified a meaningful method of measuring involuntary cognitive, emotional, psychological, and even physical functioning.
These initial experiments led to the development of the Linguistic Inquiry and Word Count (LIWC) software (Tausczik & Pennebaker, 2010), which automated the word counting and categorization process. The LIWC software generates quantitative linguistic profile scores across 80 language dimensions. These dimensions include 21 standard linguistic dimensions that are computed as the percentage of the total words in the narrative sample that are pronouns, articles, auxiliary verbs, etc. These language dimensions (i.e., function words, time orientation, drives) allow for an objective measure of the structure and function of the words contained in an individual’s language pattern. In addition, among these dimensions there are 41 word categories tapping psychological constructs (e.g., affect, cognition, biological processes), six personal concern categories (e.g., work, home, leisure activities), five informal language markers (e.g., assents, fillers, swear words, netspeak), and 12 punctuation categories (e.g., periods, commas). The software has also been translated into Spanish, German, Chinese, and many other languages (for a full review of the software details, see Pennebaker, Boyd, Jordan & Blackburn, 2015).
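To make the counting step concrete, the percentage calculation described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the LIWC software: the category names and word lists in TOY_DICTIONARY are hypothetical stand-ins, and words are matched exactly, which is a simplification of how a full dictionary would work.

```python
import re
from collections import Counter

# Hypothetical toy dictionary for illustration only; it is NOT the LIWC
# dictionary, which is far larger and distributed with the software.
TOY_DICTIONARY = {
    "pronoun": {"i", "me", "my", "you", "he", "she", "they", "it", "we"},
    "article": {"a", "an", "the"},
    "posemo": {"love", "nice", "sweet", "happy"},
}


def tokenize(text):
    """Lowercase the text and split it into words, keeping contractions whole."""
    normalized = text.lower().replace("\u2019", "'")
    return re.findall(r"[a-z]+(?:'[a-z]+)*", normalized)


def category_percentages(text, dictionary=TOY_DICTIONARY):
    """Return, for each category, the percentage of all words that fall in it."""
    tokens = tokenize(text)
    if not tokens:
        return {name: 0.0 for name in dictionary}
    counts = Counter()
    for token in tokens:
        for name, words in dictionary.items():
            if token in words:  # exact match only (a simplification)
                counts[name] += 1
    return {name: 100.0 * counts[name] / len(tokens) for name in dictionary}


profile = category_percentages("I love the nice garden and she loves it too")
for name, percent in profile.items():
    print(f"{name}: {percent:.1f}% of words")
```

Each score is a share of the total word count rather than a raw count, which is what allows samples of different lengths to be compared on the same scale.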
What Can Word Use Tell Us About People’s Feelings That They May Not Tell Us Themselves?
In order to demonstrate how language analysis is conducted, below is a brief sample narrative:
“I can’t even believe I have to come to therapy. I’m way too strong to be here. I have never needed anyone in my entire life! A month ago, everything just fell apart... and now I am a mess. I mean I was fine before. I had a job. I was going to school and I was having a happy life and all of a sudden I had the accident, and everything fell apart”.
In this narrative, we can see that the individual used several first-person singular pronouns (i.e., I, me, my), and these function words comprised 16.4% of the total words used in the narrative. If we compare this figure to the grand mean of 8.7% for this category (Pennebaker, Boyd, Jordan & Blackburn, 2015), we see that this individual uses first-person singular pronouns at nearly twice the average rate found in the general population. With this point of comparison, we can begin to understand the linguistic markers in this narrative. Importantly, most research in this area also collects additional data, including objective measures of the clinical disorders, personality traits, or processes and behaviors being studied (e.g., depression, suicidal behavior, PTSD). As a result, these data can be tested for significant correlations with the linguistic dimensions mentioned earlier to uncover any underlying associations with the use of specific words or word categories.
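The comparison in this example can be reproduced roughly as follows. This is a sketch under stated assumptions rather than the LIWC procedure: the FIRST_PERSON_SINGULAR list is hand-picked for illustration, and the simple tokenizer may split words differently than LIWC does, so the percentage it produces for the sample narrative can differ slightly from the 16.4% reported above. The 8.7% grand mean is the figure cited in the text.

```python
import re

# Assumption: a hand-picked list of first-person singular forms,
# not the official LIWC category.
FIRST_PERSON_SINGULAR = {
    "i", "me", "my", "mine", "myself", "i'm", "i've", "i'd", "i'll",
}


def tokenize(text):
    """Lowercase the text and split it into words, keeping contractions whole."""
    normalized = text.lower().replace("\u2019", "'")
    return re.findall(r"[a-z]+(?:'[a-z]+)*", normalized)


def first_person_singular_rate(text):
    """Percentage of all words that are first-person singular pronouns."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in FIRST_PERSON_SINGULAR)
    return 100.0 * hits / len(tokens)


def ratio_to_norm(sample_rate, norm_rate):
    """How many times larger the sample's rate is than the reference rate."""
    return sample_rate / norm_rate


narrative = (
    "I can’t even believe I have to come to therapy. I’m way too strong to "
    "be here. I have never needed anyone in my entire life! A month ago, "
    "everything just fell apart... and now I am a mess. I mean I was fine "
    "before. I had a job. I was going to school and I was having a happy "
    "life and all of a sudden I had the accident, and everything fell apart"
)

GRAND_MEAN = 8.7  # percentage cited in the text for this category

rate = first_person_singular_rate(narrative)
print(f"First-person singular pronouns: {rate:.1f}% of words")
print(f"Ratio to the grand mean: {ratio_to_norm(rate, GRAND_MEAN):.2f}")
```

Using the article’s own figures, ratio_to_norm(16.4, 8.7) comes to roughly 1.9, which is the sense in which the individual’s rate is “nearly twice” the average.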
What can word use tell us about depression?
One of the best documented correlations in the literature is the relationship between increased use of first-person singular pronouns and depression. A meta-analysis of 21 studies representing 3,758 participants determined that a robust association exists between the increased use of first-person singular pronouns and measures of depression, such as the Beck Depression Inventory I & II, even after controlling for known covariates, such as age and gender (Edwards & Holtzman, 2017). Researchers interpreted the results to suggest that the increased use of these pronouns signals attentional bias toward one’s self and internal rumination. Interestingly, in the studies examined, the results were the same when natural language use was captured from a variety of sources and in response to a range of prompts, including narrative text from a journal about loss, Facebook posts, and a five-minute interview about one’s personality. These results seem to suggest that function word use, or language style, remains constant despite differences in context and contextual prompts (i.e., if an individual is self-focused, this state generalizes to all interactions and language use).
Can word use detect if patients are thinking about hurting themselves or someone else?
Other studies help to elucidate the link between language use and risk assessment. Handelman and Lester (2007) analyzed and compared the narrative text of suicide notes from “attempters” (i.e., those who attempted but did not die by suicide) and “completers” (i.e., those who died by suicide). These researchers discovered that, even after controlling for age and gender, individuals who completed suicide used more second person pronouns (e.g., you), hearing words (e.g., listen, hear), pronouns referring to other people (e.g., them, they, her), future tense verbs, and metaphysical language (e.g., god, heaven), and fewer references to inclusive space (e.g., with or include), than those who did not. They also reported that narratives from attempters signaled greater distress than those from completers, as evidenced by fewer positive emotion words (e.g., love, nice, sweet), social references (e.g., mate, talk, together), and future tense words (e.g., may, will, soon). These results seem to be consistent with empirical studies documenting that an estimated 75% of individuals who have died by suicide actually denied suicidal ideation immediately preceding completion (Berman, 2017). This may shed light on why accurate risk assessment is so difficult, as these results are counterintuitive.
Similarly, a novel study identified distinct language patterns in individuals who committed homicidal acts defined as “spree killings”, such as school shooters (Egnoto & Griffin, 2016). Researchers collected narrative data from school papers, manifestos, essays, transcriptions from YouTube videos, social media posts, and journal entries from seven school shooters and individuals who completed suicide, and used narrative data from a high school student data set as a control group. The materials were analyzed with the LIWC software to determine significant differences in linguistic markers between groups. Results revealed that school shooters used twice as many anger words (e.g., hate, kill, annoyed), three times as many pronouns (e.g., I, they, it), and six times as many future tense verbs (e.g., may, will, soon) as suicide completers and controls. Consistent differences in word use reliably identified individuals who engaged in violent acts, as suicidal and control narratives were not significantly different from each other on these language dimensions. However, suicidal narratives differed significantly from those of both other groups in the use of personal pronouns and future tense words, consistent with Handelman’s work.
Are certain words indicative of reexperiencing trauma?
Finally, understanding language use can also be instrumental in detecting indicators of difficulties related to trauma. According to recent work completed by Jaeger and colleagues at the University of Washington (2014), the use of positive (e.g., love, nice, sweet) and negative (e.g., hurt, nasty, ugly) emotion words was associated with lower levels of Posttraumatic Stress Disorder (PTSD) symptoms. In their study, a small sample of female survivors of interpersonal abuse completed a battery of clinical assessments of mood and PTSD symptoms and provided a narrative summary of their individual trauma experiences. Results revealed that reexperiencing symptoms were associated with higher use of cognitive mechanism words (e.g., cause, know, ought), pronouns, and negative emotion words, whereas positive emotion words showed the opposite pattern.
Perhaps more interestingly, increased pronoun use predicted an increased level of dissociation and trauma-related guilt. Excessive pronoun use, unlike the use of positive and negative emotion words, is not obvious, nor is the clinical implication of this finding immediately clear. According to Pennebaker (2011), pronouns can help us understand whether an individual is acknowledging the emotional processing of trauma or avoiding it. Unfortunately, the authors did not report the specific pronouns that survivors used, only that they used more than the general population. An investigation into whether more first-person singular pronouns (e.g., I, me, my) or third-person pronouns (e.g., he, she, they) were used would help clarify whether a self-focused coping style that potentially acknowledges the trauma or an other-focused coping style that potentially avoids it is present.
In another study examining the language use of individuals diagnosed with PTSD, researchers deemphasized the narrative of the trauma and instead sought to uncover any underlying language patterns that might surface during natural language use and that differ between participants with trauma and controls (Papini et al., 2015). Participants were administered the Thematic Apperception Test and a measure of PTSD symptom severity, and provided information about the type of trauma they experienced. Interestingly, the group with PTSD used more third person singular pronouns (e.g., he, she, him), fewer third person plural pronouns (e.g., they, them), and more death-related words (e.g., bury, coffin, kill), which may provide more insight than the previous study into which pronouns are more prevalent in individuals suffering from difficulties related to exposure to trauma.
In addition, Papini et al. (2015) found that specific word use patterns were significantly correlated with PTSD symptom severity. For example, the greater the use of third person pronouns, which may signal an avoidance coping style, the more reexperiencing symptoms were reported. Conversely, the greater the use of words signaling cognitive flexibility (e.g., may, might) and acknowledgement of trauma, the fewer reexperiencing symptoms were reported. Furthermore, higher use of death-related words was associated with fewer avoidance and numbing symptoms. The authors interpreted this association to imply that individuals who had successfully faced their mortality and were willing to use language reflective of that experience may have a more adaptive coping style overall and therefore less severe symptoms. Finally, the more anxiety words an individual used, the lower that individual’s hyperarousal symptoms were, supporting the notion that expression of anxiety may help alleviate somatic expression of distress.
While these examples of correlations between specific language use and clinical symptoms (i.e., depression, suicidality, homicidality, and posttraumatic stress) do not provide causal or conclusive evidence of the underlying cognitive and emotional processes that they reflect, they do seem to provide preliminary insight into how patterns of content and function word use differ depending on the presence or absence of psychopathology. They appear to offer an additional tool in our clinical toolbox to help in detecting and understanding patients who may be unable or unwilling to disclose the true nature of their distress. Future research is needed to help develop a more rigorous and methodical approach to the clinical application of these findings. In the meantime, there are a few ways this information can be useful in our daily clinical practice.
What We Can Do
While language use analysis is most often used in research settings, it has considerable potential for clinical application. Language use analysis can be viewed as an innovative clinical assessment approach, a progress monitoring technique, and an intervention strategy, as it is well documented that writing about traumatic events improves mental and physical health (Pennebaker & Chung, 2011). However, it should be noted that writing specifically about trauma is not always indicated (Niles et al., 2013) and its benefits may be disorder dependent (Reinhold et al., 2018).
1. Read about it and know what to listen for
The first and most basic opportunity for clinicians is to become more familiar with the scientific literature surrounding what is known about the word use and language style of the populations you treat. We provide several examples above including basic word categories to be aware of and pay attention to when assessing for depression, risk and trauma (See Table 1.). Yet, there have been hundreds of peer-reviewed studies published examining similar links that offer insight into the language of many clinical disorders, behavioral patterns, and psychological processes. Such a simple strategy will naturally begin to help you become more attuned to the word use of your clients and sensitive to the way in which language is used in session.
2. Analyze narrative text as part of your practice
Once familiar with the literature and the common research protocol explained in the work of Pennebaker and his colleagues, clinicians have an opportunity to use the LIWC word counting strategy in an informal or formal way in private practice or other mental health settings. This can be done manually, by counting the number of words in each category in the narrative text provided by your patients, or with word counting software such as the LIWC. The LIWC software is available for purchase through an academic or commercial license (see http://liwc.wpengine.com/ for details). The software is user-friendly and accompanied by a language dictionary and population information for each word category and dimension.
3. Measure psychological change through changes in language style and use
Finally, evidence-based interventions such as cognitive processing therapy (CPT) and more expressive writing-oriented informal techniques such as journaling involve writing a narrative as a primary intervention strategy. Using language use analysis in combination with these interventions may allow for an increased level of clinical utility as word use is able to detect unconscious relational patterns such as the propensity for psychological distancing (Nook et al., 2017).
In addition, natural language use analysis has proven to be a useful, quantitative progress monitoring technique that allows for pre-, during-, and post-treatment comparisons as a unique treatment outcome measure. In fact, as stated earlier, language change over time provides an indicator of treatment response that is not subject to the inherent bias of self-report measures and may allow for a more accurate view of enduring change. Such strategies have been used to monitor progress in survivors of childhood sexual abuse (Pulverman et al., 2015) and individuals with borderline personality disorder (McMain et al., 2013).
Language is the primary vehicle by which human beings convey their experiences, wishes, thoughts, and emotions. It seems that as mental health professionals, there is much to be gained by studying and understanding how language use reflects the complex narratives that account for the moods, behaviors, and personality patterns we observe in our patients. Clinicians can use this deeper understanding as an additional pathway to help those with whom we work.
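As a rough illustration of this kind of progress monitoring, the sketch below tracks the percentage of words in one category across writing samples collected at different points in treatment and reports the change from the first sample. The NEGATIVE_EMOTION word list, the session labels, and the sample sentences are all hypothetical and serve only to show the arithmetic; they are not clinical data or LIWC categories.

```python
import re

# Hypothetical toy word list for illustration; real category dictionaries
# are much larger.
NEGATIVE_EMOTION = {"hurt", "nasty", "ugly", "sad", "afraid"}


def category_rate(text, words):
    """Percentage of all words in the text that belong to the given word set."""
    normalized = text.lower().replace("\u2019", "'")
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", normalized)
    if not tokens:
        return 0.0
    return 100.0 * sum(1 for token in tokens if token in words) / len(tokens)


def change_from_baseline(samples, words):
    """Given (label, text) pairs in chronological order, return
    (label, rate, change versus the first sample) for each one."""
    if not samples:
        return []
    rates = [(label, category_rate(text, words)) for label, text in samples]
    baseline = rates[0][1]
    return [(label, rate, rate - baseline) for label, rate in rates]


# Invented example sentences, one per (hypothetical) time point.
sessions = [
    ("pre", "I feel hurt and sad"),
    ("during", "I was sad but I talked about it"),
    ("post", "I feel calm and rested today"),
]

for label, rate, change in change_from_baseline(sessions, NEGATIVE_EMOTION):
    print(f"{label}: {rate:.1f}% negative emotion words ({change:+.1f} vs. baseline)")
```

In practice, the samples would need to be long enough and collected under comparable conditions for such differences to be meaningful; the sketch only shows how the comparison is computed.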
Cite This Article
Maccarrone, J. (2020, September). Harnessing insights from language use research in counseling and psychotherapy. [Web article]. Retrieved from http://www.societyforpsychotherapy.org/harnessing-insights-from-language-use-research-in-counseling-and-psychotherapy
Arnold, H. J., & Feldman, D. C. (1981). Social desirability response bias in self-report choice situations. Academy of Management Journal, 24(2), 377–385. https://doi.org/10.5465/255848
Berman, A. L. (2017). Risk factors proximate to suicide and suicide risk assessment in the context of denied suicide ideation. Suicide and Life-Threatening Behavior, 48(3), 340–352. https://doi.org/10.1111/sltb.12351
Cohn, M. A., Mehl, M. R., & Pennebaker, J. W. (2004). Linguistic markers of psychological change surrounding September 11, 2001. Psychological Science, 15. https://doi.org/10.1111/j.0956-7976.2004.00741.x
Cohen, S. J. (2012). Construction and preliminary validation of a dictionary for cognitive rigidity: Linguistic markers of overconfidence and overgeneralization and their concomitant psychological stress. Journal of Psycholinguistic Research, 41(5), 347–370. https://doi.org/10.1007/s10936-011-9196-9
Edwards, T., & Holtzman, N. S. (2017). A meta-analysis of correlations between depression and first person singular pronoun use. Journal of Research in Personality, 68, 63–68. https://doi.org/10.1016/j.jrp.2017.02.005
Egnoto, M. J., & Griffin, D. J. (2016). Analyzing language in suicide notes and legacy tokens. Crisis, 37(2), 140–147. https://doi.org/10.1027/0227-5910/a000363
Handelman, L. D., & Lester, D. (2007). The content of suicide notes from attempters and completers. Crisis, 28(2), 102–104. https://doi.org/10.1027/0227-5910.28.2.102
Hill, N., Mogle, J., Whitaker, E., Gilmore-Bykovskyi, A., Bhargava, S., Bhang, I., Sweeder, L., & Van Haitsma, K. (2018). Sources of response bias in cognitive self-report items: “which memory are you talking about?”. Innovation in Aging, 2(suppl_1), 777–777. https://doi.org/10.1093/geroni/igy023.2877
Holtzman, N. S., Tackman, A. M., Carey, A. L., Brucks, M. S., Küfner, A. C., Deters, F., Back, M. D., Donnellan, M. B., Pennebaker, J. W., Sherman, R. A., & Mehl, M. R. (2019). Linguistic markers of grandiose narcissism: A LIWC analysis of 15 samples. Journal of Language and Social Psychology, 38(5-6), 773–786. https://doi.org/10.1177/0261927X19871084
Hu, M., & Rahnev, D. (2019). Predictive cues reduce but do not eliminate intrinsic response bias. Cognition, 192, 104004. https://doi.org/10.1016/j.cognition.2019.06.016
Jaeger, J., Lindblom, K. M., Parker-Guilbert, K., & Zoellner, L. A. (2014). Trauma narratives: It’s what you say, not how you say it. Psychological Trauma: Theory, Research, Practice, and Policy, 6(5), 473–481. https://doi.org/10.1037/a0035239
Kacewicz, E., Pennebaker, J. W., Davis, M., Jeon, M., & Graesser, A. C. (2014). Pronoun use reflects standings in social hierarchies. Journal of Language and Social Psychology, 33(2), 125–143. https://doi.org/10.1177/0261927X13502654
Krieger, J. L., & Gallois, C. (2017). Translating science: Using the science of language to explicate the language of science. Journal of Language and Social Psychology, 36(1), 3–13. https://doi.org/10.1177/0261927X16663256
Logan, D. E., Claar, R. L., & Scharff, L. (2008). Social desirability response bias and self-report of psychological distress in pediatric chronic pain patients. Pain, 136(3), 366–372. https://doi.org/10.1016/j.pain.2007.07.015
Lyons, M., Aksayli, N., & Brewer, G. (2018). Mental distress and language use: Linguistic analysis of discussion forum posts. Computers in Human Behavior, 87, 207–211. https://doi.org/10.1016/j.chb.2018.05.035
McMain, S., Links, P. S., Guimond, T., Wnuk, S., Eynan, R., Bergmans, Y., & Warwar, S. (2013). An exploratory study of the relationship between changes in emotion and cognitive processes and treatment outcome in borderline personality disorder. Psychotherapy Research, 23(6), 658–673. https://doi.org/10.1080/10503307.2013.838653
Mehl, M. R., & Pennebaker, J. W. (2003). The sounds of social life: A psychometric analysis of students’ daily social environments and natural conversations. Journal of Personality and Social Psychology, 84(4), 857–870. https://doi.org/10.1037/0022-3514.84.4.857
Moore, R. (2003). A Comparison of the Data Requirements of Automatic Speech Recognition Systems and Human Listeners. Eurospeech, Geneva, Switzerland. https://www.isca-speech.org/archive/eurospeech_2003/e03_2581.html
Newman, M. L., Pennebaker, J. W., Berry, D. S., & Richards, J. M. (2003). Lying words: Predicting deception from linguistic styles. Personality and Social Psychology Bulletin. https://doi.org/10.1177/0146167203029005010
Niles, A. N., Haltom, K., Mulvenna, C. M., Lieberman, M. D., & Stanton, A. L. (2013). Randomized controlled trial of expressive writing for psychological and physical health: The moderating role of emotional expressivity. Anxiety, Stress, & Coping, 27(1), 1–17. https://doi.org/10.1080/10615806.2013.802308
Nook, E. C., Schleider, J. L., & Somerville, L. H. (2017). A linguistic signature of psychological distancing in emotion regulation. Journal of Experimental Psychology: General, 146(3), 337–346. https://doi.org/10.1037/xge0000263
Papini, S., Yoon, P., Rubin, M., Lopez-Castro, T., & Hien, D. A. (2015). Linguistic characteristics in a non-trauma-related narrative task are associated with PTSD diagnosis and symptom severity. Psychological Trauma: Theory, Research, Practice, and Policy, 7(3), 295–302. https://doi.org/10.1037/tra0000019
Pennebaker, J. W. (2011). The Secret Life of Pronouns. Bloomsbury Press.
Pennebaker, J. W., & Chung, C. K. (2011). Expressive writing: Connections to physical and mental health. Oxford University Press. https://doi.org/10.1093/oxfordhb/9780195342819.013.0018
Pennebaker, J. W., & King, L. A. (1999). Linguistic styles: Language use as an individual difference. Journal of Personality and Social Psychology, 77(6), 1296–1312. https://doi.org/10.1037/0022-3514.77.6.1296
Pennebaker, J. W., Boyd, R. L., Jordan, K., & Blackburn, K. (2015). The development and psychometric properties of LIWC2015. http://liwc.wpengine.com/wp-content/uploads/2015/11/LIWC2015_LanguageManual.pdf
Pennebaker, J. W., Mayne, T., & Francis, M. (1997). Linguistic predictors of adaptive bereavement. Journal of Personality and Social Psychology, 72(4), 863–871. https://doi.org/10.1037/0022-3514.72.4.863
Pennebaker, J. W., Mehl, M. R., & Niederhoffer, K. G. (2003). Psychological aspects of natural language use: Our words, our selves. Annual Review of Psychology, 54, 547–577. https://doi.org/10.1146/annurev.psych.54.101601.145041
Pulverman, C. S., Lorenz, T. A., & Meston, C. M. (2015). Linguistic changes in expressive writing predict psychological outcomes in women with history of childhood sexual abuse and adult sexual dysfunction. Psychological Trauma: Theory, Research, Practice, and Policy, 7(1), 50–57. https://doi.org/10.1037/a0036462
Reinhold, M., Bürkner, P. C., & Holling, H. (2018). Effects of expressive writing on depressive symptoms-a meta-analysis. Clinical Psychology: Science and Practice, 25(1), e12224. https://doi.org/10.1111/cpsp.12224
Schriesheim, C. A., & Hill, K. D. (1981). Controlling acquiescence response bias by item reversals: The effect on questionnaire validity. Educational and Psychological Measurement, 41(4), 1101–1114. https://doi.org/10.1177/001316448104100420
Stone, L. D., & Pennebaker, J. W. (2010). Trauma in real time: Talking and avoiding online conversations about the death of Princess Diana. Basic and Applied Social Psychology, 24(3), 173–183. https://doi.org/10.1207/S15324834BASP2403_1
Tackman, A. M., Sbarra, D. A., Carey, A. L., Donnellan, M. B., Horn, A. B., Holtzman, N. S., Edwards, T. S., Pennebaker, J. W., & Mehl, M. R. (2019). Depression, negative emotionality, and self-referential language: A multi-lab, multi-measure, and multi-language-task research synthesis. Journal of Personality and Social Psychology, 116(5), 817–834. https://doi.org/10.1037/pspp0000187
Tausczik, Y. R., & Pennebaker, J. W. (2010). The psychological meaning of words: LIWC and computerized text analysis methods. Journal of Language and Social Psychology, 29(1), 24–54. https://doi.org/10.1177/0261927X09351676
Yuan, J., Liberman, M., & Cieri, C. (2006). Towards an Integrated Understanding of Speaking Rate in Conversation [Paper presentation]. Interspeech, Pittsburgh, PA, United States. https://www.isca-speech.org/archive/interspeech_2006/i06_1795.html
Ziemer, K. S., & Korkmaz, G. (2017). Using text to predict psychological and physical health: A comparison of human raters and computerized text analysis. Computers in Human Behavior, 76, 122–127. https://doi.org/10.1016/j.chb.2017.06.038