Background: Physicians worldwide face an increasing administrative burden that diverts time from direct patient care. Among inpatient documentation tasks, authoring hospital course summaries is particularly time-consuming and is critical for safe care transitions. Large language models (LLMs) have shown promise for clinical text generation; however, robust evidence from randomized, evaluator-blinded trials conducted in routine hospital practice remains limited.
Objectives: The CLEAN study aims to evaluate whether LLM-assisted, specialist-edited generation of hospital course summaries is non-inferior in safety to standard clinician-written documentation in routine inpatient care. Secondary objectives include non-inferiority assessments of resident-edited and unedited LLM-generated summaries. Additional objectives are to evaluate summary quality across predefined domains, quantify physician documentation time, assess LLM generation stability, measure clinician adoption following the randomized phase, and examine inter-observer, intra-observer, and test-retest reliability of expert assessments.
Methods: This is a single-centre, double-campus, exploratory randomized controlled non-inferiority trial conducted at a tertiary university hospital. Consecutive hospital discharges across multiple clinical departments are randomized 1:1 to either an LLM-assisted documentation workflow or standard manual authorship. The intervention integrates an on-premise LLM into a parallel hospital information system and generates draft hospital course summaries from complete, uncurated clinical documentation, which physicians may review and edit before finalization. The primary outcome, safety, is defined as the presence of all important information and the absence of incorrect or hallucinated information; it is assessed by an adjudication committee blinded to the documentation workflow. Secondary outcomes include content validity, workflow efficiency, generation stability, post-trial clinician adoption, and reliability metrics. A total of 786 discharge episodes are required to assess non-inferiority using a predefined margin of 5 percentage points.
Ethics and Dissemination: The study will be conducted in accordance with the Declaration of Helsinki, Good Clinical Practice, and the General Data Protection Regulation. A waiver of informed consent is sought because the study poses minimal risk and uses routine clinical data exclusively. Results will be disseminated through peer-reviewed publication and engagement with healthcare stakeholders.
Safety, assessed by the outcome evaluator on an ordinal scale (1, 2, 3)
Timeframe: Assessment at one time point, hospital discharge (up to 5 days)
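
As an illustration of how a non-inferiority comparison with a margin of 5 percentage points might be evaluated, the following is a minimal sketch and not the trial's statistical analysis plan. It assumes that the ordinal safety rating is reduced to a binary "safe" classification per discharge episode, that a one-sided Wald confidence bound is used, and that the one-sided significance level is 0.025; none of these choices is stated in the record. The function name and the example counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def noninferiority_check(safe_llm, n_llm, safe_std, n_std,
                         margin=0.05, alpha=0.025):
    """One-sided Wald non-inferiority check for a difference in proportions.

    margin: non-inferiority margin as a proportion (0.05 = 5 percentage points).
    alpha:  one-sided significance level (an assumption, not from the record).
    Returns (difference, lower confidence bound, non-inferiority shown?).
    """
    p_llm = safe_llm / n_llm          # proportion rated safe, LLM-assisted arm
    p_std = safe_std / n_std          # proportion rated safe, standard arm
    diff = p_llm - p_std
    # Standard error of the difference between two independent proportions
    se = sqrt(p_llm * (1 - p_llm) / n_llm + p_std * (1 - p_std) / n_std)
    z = NormalDist().inv_cdf(1 - alpha)
    lower = diff - z * se
    # Non-inferiority is shown when the lower bound stays above -margin
    return diff, lower, lower > -margin


# Hypothetical counts for illustration only: 786 episodes split 1:1 gives 393 per arm.
diff, lower, non_inferior = noninferiority_check(370, 393, 372, 393)
print(f"difference={diff:.4f}, lower bound={lower:.4f}, non-inferior={non_inferior}")
```

The decision rule compares the lower confidence bound of the difference with the negative margin, so a small observed deficit in the LLM-assisted arm can still be compatible with non-inferiority if the bound stays above -0.05.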