Web-only Feature

Effectiveness For Online Cognitive Behavioral Therapy Versus Outpatient Treatment

A Session by Session Analysis

There is growing evidence that online self-management tools based on psychotherapy models are effective with various forms of psychic distress, according to recent reviews of the literature (Andersson, 2018; Davies et al., 2014; Lattie et al., 2019). Many of these online resources are based on the application of principles of Cognitive Behavioral Therapy (CBT). CBT is particularly amenable to structured learning modules and homework assignments.

Internet-delivered CBT (called iCBT) has been shown to successfully address depression (Clarke et al., 2005; Cuijpers et al., 2010; Melling & Houguet-Pincham, 2011), anxiety problems (Dryman et al., 2017; Jakobsen et al., 2017; Nordgren et al., 2014), and social anxiety (Kampmann et al., 2016; Klein et al., 2010). Cuijpers et al. (2010) performed a meta-analysis of studies comparing iCBT to in-person counseling for depression and anxiety disorders and found no meaningful differences in effectiveness between the two modes of treatment delivery. However, their total sample size (810 participants from 21 studies) is relatively small.

It should be noted that while CBT has a large body of research support, numerous meta-analyses over the past several decades have failed to provide evidence that one approach to therapy is overall superior to another (Wampold & Imel, 2015). It has become evident that the variance due to the therapist is much greater than the variance due to the treatment method, even when the treatment involves medications (Wampold & Brown, 2005). Furthermore, the quality of the working alliance between therapist and client is a strong predictor of treatment outcome (Minami et al., 2013). This line of evidence suggests that contact with a therapist or coach might augment the effects of CBT.

Purpose of Study

This study makes use of two distinct data sets involving session-by-session ratings based on client self-report outcome questionnaires. One dataset was generated by clients enrolled in one of Learn to Live’s online cognitive behavioral therapy programs for depression, social anxiety, or general stress and worry. The second dataset was generated by participants in the ACORN collaboration, who likewise employ questionnaires measuring depression, anxiety, and global distress. Combined, these datasets provide evidence from sample sizes substantially larger than have been reported in the past.

The purpose of this study is to take advantage of the large sample sizes to examine differences between in-person and online interventions. While other studies have determined the relative effectiveness of online services in comparison with in-person services, this study will provide a closer examination of clinical change for clients choosing to end their services after different numbers of sessions or online lessons.

Learn to Live Program Description

Learn to Live offers customized online CBT programs targeting depression, social anxiety, or stress, anxiety & worry. Each program consists of 7 modules, or lessons, and an 8th lesson that provides a final wrap-up and assessment after the others are completed. Outcome questionnaires focused on the problems targeted by each online program are completed at each lesson.


Outcomes (client self-reported improvement) are evaluated and compared employing the benchmarking methodology developed by various researchers participating in the ACORN collaboration (Minami et al., 2007; Minami et al., 2008a; Minami et al., 2008b; Minami et al., 2012; Minami et al., 2014).

The ACORN Collaboration (https://acorncollaboration.org/) spans more than a decade, involving thousands of clinicians treating clients in a wide variety of settings. Consistent with the best practices of so-called “feedback informed treatment,” outcome and therapeutic alliance questionnaires are administered at every session. Clinicians are encouraged to log into a secure web site to view scores and other information designed to inform clinical decision making. Clinician use of this resource has been demonstrated to improve treatment outcomes (Brown et al., 2015; Brown & Cazauvieilh, 2019; Brown & Minami, 2019). These studies suggest that results for clinicians in the ACORN collaboration exceed the likely outcomes for clinicians not employing feedback informed treatment. The ACORN data repository contains outcome data for over 1.2 million episodes of care.

A one-minute video of Takuya Minami, PhD, explaining the purpose of benchmarking is available to all clinicians in the ACORN collaboration.


A critical component of the benchmarking methodology is the use of effect size, a standardized measure of improvement generally reported in outcome studies. While there are various approaches to calculating effect size (e.g., Cohen’s d), the basic idea is to divide the pre-post change score by the standard deviation of the outcome questionnaire. An effect size of 1 means that the client improved by one standard deviation. Based on the effect size statistic, ACORN is able to aggregate large heterogeneous samples using various questionnaires. The goal is to permit comparison of various forms of psychotherapy.

In order to permit comparisons to subjects in clinical trials, effect size is only calculated for those clients in a so-called clinical range. ACORN defines this as clients having intake scores in the upper 75% for symptom severity. The standard deviation utilized is obtained from the scores of those clients in the clinical range at the first session/lesson.
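The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not ACORN's actual implementation: the function name, the reading of "upper 75%" as intake scores at or above the first quartile, the assumption that higher scores mean more severe symptoms, and the toy scores are all choices made for the example.

```python
import statistics


def clinical_effect_size(intake, final, cutoff=None):
    """Pre-post effect size for clients whose intake score is in the clinical range.

    intake, final: equal-length lists of first and last scores per client,
    with higher scores meaning more severe symptoms (an assumption).
    cutoff: lowest intake score counted as clinical; if omitted, the first
    quartile of the intake scores is used (our reading of "upper 75%").
    """
    if cutoff is None:
        cutoff = statistics.quantiles(intake, n=4, method="inclusive")[0]

    # Keep only clients who start in the clinical range.
    pairs = [(pre, post) for pre, post in zip(intake, final) if pre >= cutoff]

    # Improvement is a drop in score, so change = first score minus last score.
    mean_change = statistics.mean([pre - post for pre, post in pairs])

    # Standardize by the spread of clinical-range scores at the first assessment.
    sd_intake = statistics.stdev([pre for pre, _ in pairs])
    return mean_change / sd_intake


# Toy data, invented for illustration only.
intake_scores = [4, 8, 12, 16, 20]
final_scores = [4, 6, 8, 10, 12]
print(round(clinical_effect_size(intake_scores, final_scores), 2))
```

Dividing by the intake standard deviation rather than a pooled or change-score standard deviation follows the description in the text; other effect size variants would give somewhat different values.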

Benchmarking was initially derived from meta-analyses of clinical trials of evidence-based treatments for depression (Minami et al., 2007) and subsequently from analyses of naturalistic data for treatment as usual in outpatient clinical settings (Minami et al., 2008a; Minami et al., 2008b). Interestingly, both the data from controlled studies and from the ACORN collaboration yielded similar effect sizes of close to 0.8. Based on these analyses, ACORN designated a mean effect size of 0.8 or greater as meeting criteria for Highly Effective Services, while an effect size between 0.5 and 0.8 is considered as meeting criteria for Effective Services.
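As a rough sketch, the two thresholds can be expressed as a small lookup function. The function name is hypothetical, treating exactly 0.5 as Effective is an assumption, and the label for values under 0.5 is a placeholder, since the text names no category for that range.

```python
def benchmark_category(effect_size):
    """Map a mean effect size onto the benchmark categories described in the text."""
    if effect_size >= 0.8:
        return "Highly Effective"
    if effect_size >= 0.5:  # assumption: the lower bound is inclusive
        return "Effective"
    return "Below benchmark"  # placeholder label, not an ACORN term


for value in (0.9, 0.6, 0.3):
    print(value, benchmark_category(value))
```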

Description of Samples

The Learn to Live sample was drawn from clients completing assessments for at least two of the online modules, or so-called lessons. It should be noted that the eighth module is an exit assessment, after the other lessons presenting the intervention have been completed.

The ACORN sample was drawn from adult clients in a wide variety of outpatient clinical settings who had begun treatment on or after January 1, 2019, up to the time of this article's preparation (March 15, 2020). All clients received treatment as usual, meaning that the clinician had broad discretion to determine the methods employed and the number of sessions provided.

In order to achieve the best structural equivalence, the same selection criteria were used for both samples.

  1. The first assessment score must be in a clinical range (defined as clients scoring in the upper 75% of symptom severity at the first assessment).
  2. The clients must have completed at least two lessons or sessions, permitting the calculation of change scores and effect size.
  3. In the case of the ACORN data, clients must have completed treatment within 8 sessions, comparable to the maximum of 8 lessons in the Learn to Live CBT programs.

This remains a representative sample of ACORN data, since only approximately 20% of ACORN clients exceed this 8-session limit.
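The three selection criteria can be sketched as a simple filter. The record layout (a dict with a `scores` list ordered by session), the function name, and the cutoff value are hypothetical and chosen only for illustration. Passing `max_sessions=None` skips the third criterion, which the text applies to the ACORN data only.

```python
def select_episodes(episodes, clinical_cutoff, max_sessions=8):
    """Return the episodes of care that satisfy the sample selection criteria."""
    selected = []
    for episode in episodes:
        scores = episode["scores"]
        # Criterion 2: at least two assessments, so a change score exists.
        if len(scores) < 2:
            continue
        # Criterion 1: first assessment in the clinical range.
        if scores[0] < clinical_cutoff:
            continue
        # Criterion 3: treatment completed within the session limit.
        if max_sessions is not None and len(scores) > max_sessions:
            continue
        selected.append(episode)
    return selected


# Toy records, invented for illustration only.
episodes = [
    {"client_id": "a", "scores": [15, 11, 9]},   # kept
    {"client_id": "b", "scores": [3, 2]},        # below the clinical cutoff
    {"client_id": "c", "scores": [18]},          # only one assessment
    {"client_id": "d", "scores": [20] * 9},      # more than 8 sessions
]
print([e["client_id"] for e in select_episodes(episodes, clinical_cutoff=8)])
```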

Based on these criteria, the Learn to Live sample numbered 2,462 clients while the ACORN sample numbered 120,671 clients. While the Learn to Live clients complete the modules and assessments on their own without therapist involvement, the ACORN clients were treated in over 150 various outpatient settings with a total of 2,739 clinicians contributing data.

Description of Questionnaires

Learn to Live uses three different questionnaires depending on the lesson modules: depression (PHQ-9; Kroenke et al., 2001), social anxiety (SPIN-17; Connor et al., 2000), and general stress and worry (GAD-7; Spitzer et al., 2006). At the time of the initial assessment, before the client is exposed to the clinical material, all questionnaire items from the three scales are administered concurrently as a comprehensive assessment. This allows for a factor analysis of the items from all three questionnaires to determine whether they are distinct measures or represent a common psychological factor.

The analysis of these three questionnaires reveals that all the items load on a common factor, with factor loadings ranging from .47 to .77, with a mean loading of .63.  The determination of a common factor is a statistical process that does not connect the factor with any specific clinical construct.  Yet some designation for this factor might offer clarity, and so the common factor can be regarded as psychological distress in some general sense.

Only a single item for the PHQ-9 (“Thoughts that you would be better off dead or of hurting yourself in some way.”) displayed a factor loading of less than .5 (.46). This lower loading is typical of ACORN items inquiring about self-harm. Estimates of reliability using Cronbach’s coefficient alpha ranged from .88 (PHQ-9) to .94 (SPIN-17).
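For readers unfamiliar with the reliability statistic mentioned here, coefficient alpha can be computed from item-level responses as shown below. This is a generic textbook formulation, not code from the study; the function name and the toy responses are invented for the example.

```python
import statistics


def cronbach_alpha(responses):
    """Coefficient alpha for a questionnaire.

    responses: list of rows, one per respondent, each a list of k item scores.
    """
    k = len(responses[0])
    # Transpose so that each tuple holds every respondent's score on one item.
    items = list(zip(*responses))
    sum_item_variances = sum(statistics.variance(item) for item in items)
    total_variance = statistics.variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


# Toy responses from four respondents on a two-item scale.
toy = [[1, 2], [2, 1], [3, 3], [4, 4]]
print(round(cronbach_alpha(toy), 2))
```

Alpha rises as items covary more strongly relative to their individual variances, which is why a set of items loading on one common factor tends to produce high values.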

As would be expected, the various questionnaires displayed moderate to strong correlations with one another, as shown in Table 1.

Given the psychometric properties of these questionnaires, it is possible to combine the data from each for an overall analysis. The aggregated data can be merged in a meta-analysis to estimate an overall effect size for the combined programs.

Similarly, all the ACORN questionnaires were intentionally developed using items that loaded on the same common factor. The reason for this is connected with the creation of the ACORN collaboration in the first place. Various clinicians and clinics approach questionnaire development with specific needs and interests. ACORN maintains a database of hundreds of items used by clinicians, and has constructed questionnaires with strong psychometric properties focused on a wide range of clinical symptoms. However, knowledge that these items loaded on a common factor has carried through the evolution of the collaboration and its collection of data during more than one million episodes of care.

One ACORN questionnaire in this study utilized the PHQ-9 and GAD-7 to create a single measure. This accounted for 29% of the ACORN sample. The remaining 71% used a variety of ACORN measures with items assessing symptoms of depression and anxiety, social conflict/isolation, and daily functioning. All items load on the common factor, so that all questionnaires display a high degree of construct validity, with reliability, as measured by coefficient alpha, of approximately 0.9 or higher.


Table 2 presents overall results for the Learn to Live sample for each questionnaire and for all questionnaires combined. The mean number of days is the number of days from the initial assessment to the last completed assessment. Table 3 displays the same information for the ACORN data.

As is apparent from the tables, the length of treatment (days) and mean number of sessions/lessons is comparable between the two data sets. In terms of clinical results, the mean effect size for the Learn to Live clients is significantly lower than for the ACORN sample (p<.01).  The effect size for all Learn to Live cases is 0.46, while the ACORN effect size is 0.74 for all cases. While this would suggest the outpatient psychotherapy group had better outcomes in aggregate than the Learn to Live group, this finding obscures the results for each cohort completing a specific number of lessons/sessions. In other words, one is left to wonder how the groups compare when analyzed at the point where an individual’s lessons or sessions have ended.

Further analyses were conducted to explore the magnitude of change as a function of the number of lessons/sessions completed. Tables 4 to 6 display effect size as a function of the number of Learn to Live lessons completed for each of the questionnaires.
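The grouping behind this kind of analysis can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: each client is represented by a list of scores ordered by lesson, all clients are assumed to be in the clinical range already, and a single intake standard deviation from the whole sample is used for every group.

```python
import statistics
from collections import defaultdict


def effect_size_by_last_lesson(episodes):
    """Effect size for each cohort, keyed by the number of lessons completed.

    episodes: list of score lists, one per client, ordered by lesson.
    """
    # One standard deviation for all cohorts, taken from first-lesson scores.
    sd_intake = statistics.stdev([scores[0] for scores in episodes])

    # Collect first-minus-last change scores per cohort.
    changes = defaultdict(list)
    for scores in episodes:
        changes[len(scores)].append(scores[0] - scores[-1])

    return {
        n_lessons: statistics.mean(cohort) / sd_intake
        for n_lessons, cohort in sorted(changes.items())
    }


# Toy data, invented for illustration only.
toy = [[10, 8], [14, 10], [18, 12, 8], [22, 16, 10]]
for n_lessons, es in effect_size_by_last_lesson(toy).items():
    print(n_lessons, round(es, 2))
```

Because each cohort contains only the clients who stopped at that point, differences between cohorts reflect both the amount of material completed and whatever led clients to stop when they did.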

Table 7 presents the combined results for all three of the Learn to Live questionnaires.

Note that the effect sizes increase as a function of the number of lessons completed, and furthermore, by the 5th lesson the effect sizes exceed the benchmark for clinical trials (.8 effect size). Graph 1 displays the effect size as a function of the last lesson completed.

Graph 1. Learn to Live effect sizes by program and last lesson completed.

Note that the rate of improvement is faster for both the PHQ-9 and GAD-7 questionnaires as compared to the SPIN-17. It is unclear whether this is a function of the questionnaire (less sensitive to change) or the CBT program itself. However, in all cases the effect size for those who completed 5 lessons or more exceeds the cutoff for the Highly Effective range (effect size ≥ .8).

Table 8 presents similar session by session combined results for the ACORN PHQ-9/GAD-7 questionnaires and for other combined ACORN questionnaires.

Graph 2 displays combined results for lessons/sessions completed for the Learn to Live data and ACORN data.

Graph 2. Comparison of Learn to Live to ACORN effect sizes as a function of last lesson/session completed.

As is apparent, both Learn to Live and ACORN clients achieve similar results by session/lesson 5. However, while the ACORN effect sizes tend to plateau after session 4, the Learn to Live effect sizes continue to increase steadily up to session 8. Clients completing 8 lessons of the Learn to Live programs appear to achieve significantly better results than those completing 8 sessions of outpatient psychotherapy.

While the finding of significant therapist effects in the research literature raises the question of how clients completing online lessons may differ based on exposure to the additional element of personal coaching, insufficient data preclude reporting this analysis.  Fewer than one third of Learn to Live clients had any personal contact with coaches, and so this issue will be taken up in a later study when an adequate sample size is available.

Discussion and Implications for Program Development

The first observation in these two data sets is that both samples have a significant degree of attrition in the early lessons/sessions. Both samples have over 50% of the clients choosing to terminate by the third lesson/session or earlier (59% for ACORN; 56% for Learn to Live). This would argue for efforts to increase engagement in both traditional outpatient and Learn to Live iCBT programs, especially as those persisting with services receive greater benefit.

It appears that outpatient face to face psychotherapy yields a somewhat larger dose benefit during early sessions. Perhaps this is due to the additional non-specific benefit of a face to face encounter with a therapist. However, by session/lesson 5, the advantage has shifted to Learn to Live programs. Both samples have an effect size greater than .8 at this point. However, the ACORN outpatient psychotherapy dose benefit appears to plateau at session 4, while the Learn to Live clients continue to show consistent gains from later lessons. At lesson 7, Learn to Live has an effect size of 1.37, compared to .94 for the ACORN sample (p<.01; one tailed t-test).

As with all naturalistic data, results must be interpreted with caution. It is difficult to account for the additional benefit (if any) of face to face human contact. It is likely that many clients benefit adequately from iCBT programs, while some would benefit from additional therapist contact. Also, while the two samples may appear roughly equivalent in terms of test scores and length of treatment, other client variables such as client motivation might differentiate the samples. Also, it is likely that different questionnaires will yield differing effect sizes, as a property of item selection. Some items are more sensitive to change than others.

While this study provides strong evidence that self-guided iCBT programs such as those offered by Learn to Live can produce results similar to traditional in-person therapy, more research is needed to identify which clients are likely to benefit from iCBT and which may require more direct contact with a therapist. Likewise, iCBT may provide a highly effective adjunct to more traditional treatment, while possibly facilitating longer-term follow-up.

Jeb Brown has been at the forefront of research into so-called "feedback informed treatment". He is the founder and coordinator of the ACORN Collaboration. Prior to this he held positions of leadership within United Behavioral Health and Aetna Health Plans.

Cite This Article

Brown, J. S., Jones, E., & Cazauvieilh, C. (2020, May). Effectiveness for online cognitive behavioral therapy versus outpatient treatment: A session by session analysis. [Web article]. Retrieved from http://www.societyforpsychotherapy.org/effectiveness-for-online-cognitive-behavioral-therapy-versus-outpatient-treatment

References


Andersson, G. (2018). Internet interventions: Past, present and future. Internet Interventions, 12(2018), 181-188. https://doi.org/10.1016/j.invent.2018.03.008

Attridge, M. D. (2020). Digital cognitive behavioral therapy tools for employees with anxiety, depression, social anxiety, or insomnia: A real-world archival study of clinical and work outcomes. Unpublished manuscript. ORCID = orcid.org/0000-0003-1852-2168

Attridge, M. (2019). A global perspective on promoting workplace mental health and the role of employee assistance programs. American Journal of Health Promotion: The Art of Health Promotion, 33(4), 622-629. doi:10.1177/0890117119838101c

Attridge, M. D., Morfitt, R. C., Roseborough, D. J., & Jones, E. R. (2019). Impact of Internet-delivered cognitive behavioral therapy on clinical and academic outcomes for college students with anxiety, depression, social anxiety, or insomnia: Four longitudinal studies of archival operational data and follow-up surveys. JMIR Preprints. 06/01/2020:17712. doi:10.2196/preprints.17712

Brown, G. S., Simon, A., Cameron, J., & Minami, T. (2015). A collaborative outcome resource network (ACORN): Tools for increasing the value of psychotherapy. Psychotherapy, 52, 412–421. doi:10.1037/pst0000033

Brown, J. & Cazauvieilh, C. (2019). Clinician Engagement in Feedback Informed Treatment (FIT) and Patient Outcomes. https://acorncollaboration.org/blog/2019/2/12/therapist-engagement

Brown, J. & Minami, T. (2019). Brief Report on Improved Treatment Outcomes for Medicaid Funded Mental Health in Maryland. https://acorncollaboration.org/blog/2019/2/12/therapist-engagement-bbkzz

Clarke, G., Eubanks, D., Reid, E., Kelleher, C., DeBar, L., Lynch, F., & Nunley, S. (2005). Overcoming depression on the Internet (ODIN) (2): A randomized trial of a self help depression skills program. Journal of Medical Internet Research, 7(2), e16. doi:10.2196/jmir.7.2.e16

Cuijpers, P., Donker, T., van Straten, A., Li, J., & Andersson, G. (2010). Is guided self-help as effective as face-to-face psychotherapy for depression and anxiety disorders? A systematic review and meta-analysis of comparative outcome studies. Psychological Medicine, 40(12), 1943-1957. doi:10.1017/S0033291710000772

Davies, E. B., Morriss, R., & Glazebrook, C. (2014). Computer-delivered and web-based interventions to improve depression, anxiety, and psychological well-being of university students: A systematic review and meta-analysis. Journal of Medical Internet Research, 16(5), e130. doi:10.2196/jmir.3142

Jakobsen, H., Andersson, G., Havik, O. E., & Nordgreen, T. (2017). Guided Internet-based cognitive behavioral therapy for mild and moderate depression: A benchmarking study. Internet Interventions, 7, 1-8. doi:10.1016/j.invent.2016.11.002

Kroenke, K., & Spitzer, R.L. (2002). The PHQ-9: A new depression diagnostic and severity measure. Psychiatric Annals, 32(9), 509-515. doi:10.1155/2012/309094

Lattie, E. G., Adkins, E. C., Winquist, N., Stiles-Shields, C., Wafford, Q. E., & Graham, A. K. (2019). Digital mental health interventions for depression, anxiety, and enhancement of psychological well-being among college students: Systematic review. Journal of Medical Internet Research, 21(7), e12869. doi:10.2196/12869

Melling, B., & Houguet-Pincham, T. (2011). Online peer support for individuals with depression: A summary of current research and future considerations. Psychiatric Rehabilitation Journal, 34(3), 252–254. doi:10.2975/34.3.2011.252.254

Minami, T., Brown, G. S., McCulloch, J., & Bolstrom, B. J. (2012). Benchmarking therapists: Furthering the benchmarking method in its application to clinical practice. Quality and Quantity, 46, 699-708. doi:10.1007/s11135-011-9548-4

Minami, T., Davies, D. R., Tierney, S. C., Bettmann, J. E., McAward, S. M., Averill, L. A., … & Wampold, B. E. (2009). Preliminary evidence on the effectiveness of psychological treatments delivered at a university counseling center. Journal of Counseling Psychology, 56, 309-320. doi:10.1037/a0015398

Minami, T., Serlin, R. C., Wampold, B. E., Kircher, J. C., & Brown, G. S. (2008a). Using clinical trials to benchmark effects produced in clinical practice. Quality and Quantity, 42, 513-525. doi:10.1007/s11135-006-9057-z

Minami, T., Wampold, B. E., Serlin, R. C., Kircher, J. C., & Brown, G. S. (2007). Benchmarks for psychotherapy efficacy in adult major depression. Journal of Consulting and Clinical Psychology, 75, 232-243. doi:10.1037/0022-006X.75.2.232

Minami, T., Wampold, B. E., Serlin, R. C., Hamilton, E. G., Brown, G. S., & Kircher, J. C. (2008b). Benchmarking the effectiveness of psychotherapy treatment for adult depression in a managed care environment: A preliminary study. Journal of Consulting and Clinical Psychology, 76, 116-124. doi:10.1037/0022-006X.76.1.116

Minami, T., Wislocki, A. P., Brown, G. S., & Wampold, B. E. (2013, July). Modeling psychotherapy treatment progress and therapeutic alliance in natural clinical settings. In Z. E. Imel (Chair), Interpersonal influence processes in psychotherapy. Symposium presented at the 121st Annual Convention of the American Psychological Association, Honolulu, HI.

Spitzer, R. L., Kroenke, K., Williams, J. B., & Löwe, B. (2006). A brief measure for assessing generalized anxiety disorder: The GAD-7. Archives of Internal Medicine, 166(10), 1092-1097. doi:10.1001/archinte.166.10.1092

Wampold, B. E., & Brown, G. S. (2005). Estimating therapist variability: A naturalistic study of outcomes in private practice. Journal of Consulting and Clinical Psychology, 73, 914-923. doi:10.1037/0022-006X.73.5.914

Wampold, B. E., & Imel, Z. E. (2015). The great psychotherapy debate: The evidence for what makes psychotherapy work (2nd ed.). New York, NY: Routledge.

1 Comment

  1. Dr. Mark Attridge

    Interesting research study design and good to see positive findings for the in-person and the iCBT technology tool approaches.

    There is a mistake, I believe, that Table 7 is repeated and Table 8 is missing. Hope this can be updated as I would like to see those findings.

    Two other published research studies on the clinical effectiveness of Learn to Live programs for employees and for college students:

    1 – https://journals.sagepub.com/doi/full/10.1177/2158244020914398
    2 – https://formative.jmir.org/2020/7/e17712/

