
Writing Psychotherapy Research With Generative Artificial Intelligence (This Article Was Mostly Written by a Human)

Wilson T.: Hey ChatGPT, write an introductory paragraph for a paper on using artificial intelligence to help write psychotherapy research articles. Write it in the style of the Society for Psychotherapy Research’s newsletter, “Psychotherapy Bulletin.”

ChatGPT: In the dynamic realm of psychotherapy research, the emergence of artificial intelligence (AI) as a tool for assisting in the writing process marks a significant juncture in scholarly endeavors. While AI offers unprecedented opportunities for streamlining and enhancing various facets of scholarly writing, it also presents nuanced considerations that warrant careful attention. In this paper, we embark on an examination of the considerations essential for leveraging AI in the composition of psychotherapy research articles. From the preservation of scholarly integrity to the ethical implications of automated content generation, we delve into the multifaceted dimensions that shape this evolving landscape. Through this inquiry, we endeavor to illuminate pathways for researchers to harness the potential of AI while upholding the rigors of academic inquiry and maintaining the integrity of psychotherapy scholarship (OpenAI, 2024).

As ChatGPT’s introduction to this article states, generative AI tools (e.g., OpenAI’s ChatGPT, Google’s Gemini, Microsoft’s Copilot) are changing the landscape of academic writing. As generative AI continues to evolve, authors are energetically discussing how these tools can help and hinder the writing process (Balel, 2023; Golan et al., 2023). Researchers are also discussing ways for clinicians, authors, and academic journals to use AI responsibly (Misra & Chandwar, 2023; Salvagno et al., 2023).

In response to these issues, the American Psychological Association (APA) recently released a policy statement on the use of generative AI in scholarly work (American Psychological Association, 2023, APA policy on the use of generative artificial intelligence (AI) in scholarly materials section). While these brief guidelines prohibit researchers from naming generative AI as an author on APA publications, they do allow for its use in the writing process as long as two criteria are met: (1) the use is disclosed, described, and cited transparently in the method section of the publication, and (2) the full output of the generative AI tool is included as supplementary material. This opens the possibility that psychotherapy researchers may not just be reading and writing about AI (e.g., Goldberg et al., 2020), but will also be reading from and writing with AI in the near future.

What Might Generative AI Look Like in a Psychotherapy Manuscript?

After reading APA’s guidelines on generative AI use, I decided to try an informal experiment to see how well one AI platform, ChatGPT, would perform in completing a routine research writing task. I work with a wonderful team at the Center for Collegiate Mental Health (CCMH), and nearly all of our publications use the Counseling Center Assessment of Psychological Symptoms (CCAPS; Locke et al., 2012). Although writing a description of the CCAPS for the method sections of manuscripts is fairly routine at this point, it can become tedious and repetitive, making it a potentially straightforward task to delegate to an AI tool.

Scenario One: Asking ChatGPT to Write Without Providing Additional Information

For my informal experiment, I first passed this prompt to ChatGPT 3.5: Write a one-paragraph description of the Counseling Center Assessment of Psychological Symptoms for the Method section of a peer-reviewed paper. (Click here for the full input and output with ChatGPT).

Because the initial output did not go into much detail on psychometrics, I then asked ChatGPT to: Describe the psychometric properties of the CCAPS

It wrote a paragraph that included psychometrics, but it did not include any citations. I then asked: Please rewrite the description of the psychometric properties of the CCAPS using at least five different peer-reviewed citations.

The paragraph gave a plausible narrative description of the CCAPS’ psychometrics with in-text citations. The following instructions were then given: Provide the bibliographic entries for the in-text citations in the CCAPS description above.

I then took the three paragraphs it wrote in response to my first three prompts and merged them. I rearranged the order of some sentences and deleted repetitive ones but otherwise did not make any additional edits. Afterward, I pasted the bibliographic citations it provided at the end of the composite paragraph. This is the final output with references (OpenAI, 2024):

The Counseling Center Assessment of Psychological Symptoms (CCAPS) is a comprehensive self-report measure designed to evaluate various psychological symptoms prevalent among college students seeking counseling services. Developed by Locke et al. (2012), the CCAPS consists of multiple subscales assessing depression, anxiety, social anxiety, academic distress, eating concerns, hostility, substance abuse, and family distress. Locke et al. (2012) reported high internal consistency for the CCAPS subscales, with Cronbach’s alpha coefficients ranging from .81 to .92. The CCAPS has exhibited favorable convergent validity, as evidenced by its significant correlations with established measures of psychological distress (Locke et al., 2012; Mallett et al., 2019; Hill et al., 2018). Additionally, Hill et al. (2018) highlighted the instrument’s discriminant validity, showing that it effectively differentiated between clinical and non-clinical samples. Furthermore, the sensitivity of the CCAPS to changes in symptomatology over time has been demonstrated, indicating its utility in monitoring mental health issues among college students (Mallett et al., 2019; Hill et al., 2018).


Locke, B. D., Bieschke, K. J., Castonguay, L. G., & Hayes, J. A. (2012). The utility of the Counseling Center Assessment of Psychological Symptoms-62 (CCAPS-62) in identifying depression and anxiety among college counseling center clients. Journal of Counseling Psychology, 59(4), 593–603.


Mallett, R. K., Bacharach, S. B., & Turiano, N. A. (2019). A longitudinal examination of the Counseling Center Assessment of Psychological Symptoms-62 (CCAPS-62): Improving the detection of psychological distress in college students. Psychological Assessment, 31(12), 1503–1512.


Hill, C. E., Smith, D. W., & Atkinson, B. J. (2018). A counseling center assessment tool: The development, reliability, and validity of the Outcome Questionnaire-45.3. Journal of Counseling Psychology, 65(1), 27–37.

After compiling this output, I shared it with the CCMH staff (who are all intimately familiar with research on the CCAPS) and asked them if they could spot any errors. Within about 10 minutes, the team found the main problem: all three of ChatGPT’s references, which were generally well-formatted aside from failing to italicize journal names, were fake. What was even more notable was that the references were not only smoothly integrated into the narrative but also seemed plausible. For example, it would not be surprising to see “C. E. Hill” in the reference list of a psychotherapy article (presumably referring to Clara E. Hill, a highly published researcher and past president of APA’s Division 29). However, the citation with her name is completely fabricated. Regarding the Locke et al. citation from 2012, Ben D. Locke, the founder of CCMH, did publish a paper on the CCAPS in 2012. However, the legitimate paper had a different title than the one listed, and it found slightly different internal consistencies than the ones ChatGPT provided. I was unable to find any connection between the Mallett et al. (2019) citation and real research on the CCAPS. In spite of these fabricated references and reliability coefficients, the paragraph was fairly well constructed, and the overall claims about the reliability and validity of the CCAPS were generally consistent with real research on the instrument (Locke et al., 2012).
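For readers who would rather script this kind of multi-turn exchange than retype prompts in a browser, below is a minimal sketch of how the Scenario One prompts could be sent through the OpenAI Python API. To be clear, this is an illustration I am adding rather than what I actually did: the experiment above used the free ChatGPT web interface, the model name below is an assumed stand-in for “ChatGPT 3.5,” and outputs will vary from run to run.

```python
# A minimal sketch (not the original procedure): replaying the Scenario One
# prompts as one multi-turn conversation via the OpenAI Python API.
# Assumptions: the openai package (v1+) is installed, OPENAI_API_KEY is set,
# and "gpt-3.5-turbo" stands in for the ChatGPT 3.5 web interface.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompts = [
    "Write a one-paragraph description of the Counseling Center Assessment "
    "of Psychological Symptoms for the Method section of a peer-reviewed paper.",
    "Describe the psychometric properties of the CCAPS",
    "Please rewrite the description of the psychometric properties of the "
    "CCAPS using at least five different peer-reviewed citations.",
    "Provide the bibliographic entries for the in-text citations in the "
    "CCAPS description above.",
]

messages = []  # running conversation history; each turn sees all prior turns
for prompt in prompts:
    messages.append({"role": "user", "content": prompt})
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=messages,
    )
    reply = response.choices[0].message.content
    messages.append({"role": "assistant", "content": reply})
    print(reply, "\n---")
```

One incidental benefit of scripting the exchange is that the full transcript is trivial to save, which is exactly what APA’s guidelines require authors to include as supplementary material.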

Scenario Two: Asking ChatGPT to Write in the Context of Additional Information

After the first trial, I asked ChatGPT to write a description of the CCAPS again. This time, though, I gave it a list of real citations to reference. I also included bullet points under each citation with information from the articles that I wanted included in the output. The specific prompt I gave was: Write a one paragraph description of the Counseling Center Assessment of Psychological Symptoms for the Method section of a peer-reviewed paper. Use the following annotated reference list to support your description.

Click here for the full reference list and ChatGPT inputs and outputs.

An example of one of the reference list entries is below. Note that the bullet points under the bibliographic citation are informal, similar to what I would do when taking personal notes (i.e., incomplete sentences, unexpanded abbreviations).

Locke, B. D., McAleavey, A. A., Zhao, Y., Lei, P. W., Hayes, J. A., Castonguay, L. G., … & Lin, Y. C. (2012). Development and initial validation of the Counseling Center Assessment of Psychological Symptoms–34. Measurement and Evaluation in Counseling and Development, 45(3), 151-169. https://doi.org/10.1177/0748175611432642

  • Validation study of the CCAPS-34
  • EFA and CFA supported the discriminant validity of each subscale
  • Internal consistency of the subscales ranged from .76 (Academic Distress) to .89 (Depression)
  • Subscales were positively correlated with related measures of psychological symptoms, supporting their convergent validity
  • 2-week test-retest reliability ranged from .74 (Academic Distress) to .86 (Depression)
  • Replicated past findings supporting the internal consistency and convergent validity of the subscales
  • They found that the CCAPS differentiated between psychotherapy clients and non-treatment-seeking individuals

ChatGPT wrote a paragraph that accurately reported the information in the references I provided, but it did not include in-text citations. I provided this feedback: Write that paragraph again but include in-text citations based on the annotated references I provided originally. This is ChatGPT’s unedited output (OpenAI, 2024):

The Counseling Center Assessment of Psychological Symptoms (CCAPS) is a comprehensive measure designed to assess various psychological symptoms commonly encountered in counseling settings. Developed through rigorous validation studies (Locke et al., 2011), the CCAPS-62 consists of 62 items across multiple subscales, demonstrating strong internal consistency ranging from .80 to .93 and supported by both exploratory and confirmatory factor analyses. Additionally, the CCAPS-34, a shorter version, maintains robust psychometric properties with subscales demonstrating internal consistency ranging from .76 to .89 (Locke et al., 2012), as well as good test-retest reliability ranging from .74 to .86 over a two-week period (Locke et al., 2012). These findings are further bolstered by evidence showing positive correlations with related measures of psychological symptoms, indicating convergent validity (Locke et al., 2011; Locke et al., 2012). Moreover, the CCAPS subscales have been shown to be sensitive to change during psychotherapy (Youn et al., 2020), making them valuable tools for monitoring treatment progress. Further clinical applications reveal the CCAPS’ ability to differentiate between psychotherapy clients and non-treatment-seeking individuals (McAleavey et al., 2012), highlighting its utility in clinical practice for both assessment and treatment evaluation.

The quality of this write-up was not as high as when ChatGPT wrote without additional information (e.g., it did not name the CCAPS subscales). However, all of the in-text citations and psychometrics it reported were accurate, and the writing seemed fairly easy to refine with human input.
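The grounding strategy from Scenario Two can be scripted in the same way. The sketch below, under the same assumptions as before, pastes an abbreviated version of the annotated reference list into a single prompt and asks for in-text citations up front, which would avoid the follow-up prompt I needed above.

```python
# A sketch of the Scenario Two approach: ground the model in real, annotated
# references supplied inside the prompt. The annotations are abbreviated from
# the list described above; the model name remains an assumed stand-in.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

annotated_references = """
Locke, B. D., McAleavey, A. A., ... & Lin, Y. C. (2012). Development and
initial validation of the Counseling Center Assessment of Psychological
Symptoms-34. Measurement and Evaluation in Counseling and Development, 45(3),
151-169.
- Validation study of the CCAPS-34
- Internal consistency ranged from .76 (Academic Distress) to .89 (Depression)
- 2-week test-retest reliability ranged from .74 to .86
"""

prompt = (
    "Write a one paragraph description of the Counseling Center Assessment "
    "of Psychological Symptoms for the Method section of a peer-reviewed "
    "paper. Use the following annotated reference list to support your "
    "description, and include in-text citations.\n" + annotated_references
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```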

Implications

After scenario one of this informal experiment, the CCMH staff jokingly concluded that generative AI is “good at sounding good without knowing what it’s talking about.” On a more serious level though, it illustrated what current iterations of freely available generative AI are: tools trained to produce text that closely replicates their training data while incorporating human feedback (OpenAI, n.d.). This highlights the “artificial” component of AI. That is, generative AI is designed to create language in human-like ways, not necessarily to find and report objectively accurate information.

Scenario two of the experiment led to a related consideration. Specifically, researchers need to know enough about their topic to write on their own even if they use an AI tool. This can allow them to thoroughly fact-check AI outputs or, perhaps even better, provide the program with information to include ahead of time and then proofread and double-check for accuracy. Using AI in this way could be helpful for researchers who know their topic areas well but want to automate the initial drafting of certain manuscript sections.
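As one illustration of what partly automated fact-checking might look like (my own sketch, not a step from the experiment above), the code below checks a reported DOI against Crossref’s free public REST API and compares the registered title with the title the AI supplied. This only catches bad or mismatched DOIs; fabricated references without DOIs, like those in Scenario One, still require manual verification by someone who knows the literature.

```python
# A minimal sketch: verify that an AI-cited DOI exists on Crossref and that
# its registered title roughly matches the title the AI reported.
# Assumes the requests package; Crossref's public API needs no key.
import requests

def check_doi(doi: str, expected_title: str) -> bool:
    """Return True if the DOI is registered and its title contains the
    expected title (case-insensitive substring match)."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
    if resp.status_code != 200:
        return False  # unregistered DOI: likely fabricated
    titles = resp.json()["message"].get("title", [])
    return expected_title.lower() in (titles[0] if titles else "").lower()

# Example with the real CCAPS-34 validation paper cited in this article:
print(check_doi(
    "10.1177/0748175611432642",
    "Counseling Center Assessment of Psychological Symptoms",
))  # True; a made-up DOI, like those a model might invent, returns False
```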

Because of how quickly generative AI is being developed, it could easily become a common tool among psychotherapy researchers in the future. When using this tool, it is important to remember that it is not (yet) able to replace the content knowledge, creativity, and nuance that humans bring to the writing process. In spite of its current limitations, it is possible to use it as a writing aid if it is provided with the proper inputs, edited by subject matter experts, and acknowledged transparently. Regardless of how generative AI is used, it is critical that we as psychotherapy researchers remain committed to the ethical principle of promoting integrity and accuracy in our written work (American Psychological Association, 2017), including the use of content created with the help of AI.

 

Wilson Trusty is a postdoctoral fellow at the Center for Collegiate Mental Health, which is an international practice-research network of university counseling centers housed at Penn State University. His research focuses on psychotherapy process and outcome and college student mental health, and his clinical interests center around assessment and treatment of trauma. Wilson is a licensed psychologist in Pennsylvania.

Cite This Article

Trusty, W. T. (2024, May). Writing psychotherapy research with generative artificial intelligence (this article was mostly written by a human). Psychotherapy Bulletin, 59(3).

References

American Psychological Association. (2017). Ethical principles of psychologists and code of conduct (2002, amended effective June 1, 2010, and January 1, 2017). https://www.apa.org/ethics/code/

American Psychological Association. (2023, August). APA publishing policies. https://www.apa.org/pubs/journals/resources/publishing-policies

Balel, Y. (2023). The role of artificial intelligence in academic paper writing and its potential as a co-author. European Journal of Therapeutics, 29(4), 984-985. https://doi.org/10.58600/eurjther1691

Golan, R., Reddy, R., Muthigi, A., & Ramasamy, R. (2023). Artificial intelligence in academic writing: A paradigm-shifting technological advance. Nature Reviews Urology, 20(6), 327-328. https://doi.org/10.1038/s41585-023-00746-x

Locke, B. D., McAleavey, A. A., Zhao, Y., Lei, P.-W., Hayes, J. A., Castonguay, L. G., Li, H., Tate, R., & Lin, Y. C. (2012). Development and initial validation of the Counseling Center Assessment of Psychological Symptoms–34. Measurement and Evaluation in Counseling and Development, 45(3), 151-169. https://doi.org/10.1177/0748175611432642

Misra, D. P., & Chandwar, K. (2023). ChatGPT, artificial intelligence and scientific writing: What authors, peer reviewers and editors should know. Journal of the Royal College of Physicians of Edinburgh, 53(2), 90-93. https://doi.org/10.1177/14782715231181023

OpenAI. (2024). ChatGPT (version 3.5) [Large language model]. https://chat.openai.com/chat

OpenAI. (n.d.). What is ChatGPT? https://help.openai.com/en/articles/6783457-what-is-chatgpt

Salvagno, M., Taccone, F. S., & Gerli, A. G. (2023). Can artificial intelligence help for scientific writing? Critical Care, 27(1), 1-5. https://doi.org/10.1186/s13054-023-04380-2
