Grouping setup
The triage methods comprised assessments by outpatient rehabilitation doctors based on clinical experience, assessments using the DRP tool, and assessments using ChatGPT. Three groups were defined: (1) Patients assessed by outpatient doctors based on their clinical experience were assigned to the “Doctor Group”; (2) Patients assessed using the DRP tool were assigned to the “Tool Group”; (3) Patients assessed using ChatGPT were assigned to the “ChatGPT Group.” The study flowchart is shown in Fig. 2.

Flowchart of the procedure in the study.
Triage assessment
In this study, the triage outcomes were coded as categorical variables and assigned scores as follows: 1 = outpatient rehabilitation, 2 = inpatient rehabilitation at primary healthcare institutions, 3 = inpatient rehabilitation at secondary hospitals, 4 = inpatient rehabilitation at tertiary hospitals, 5 = inpatient rehabilitation at nursing homes or long-term care institutions.
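The coding scheme above can be written down as a small lookup structure. The following is a minimal sketch in Python; the names `TriageOutcome`, `LABELS`, and `label_for` are illustrative assumptions and are not part of the study's software.

```python
from enum import IntEnum


class TriageOutcome(IntEnum):
    """Triage outcome codes as listed in the text (names are hypothetical)."""
    OUTPATIENT = 1
    INPATIENT_PRIMARY = 2
    INPATIENT_SECONDARY = 3
    INPATIENT_TERTIARY = 4
    INPATIENT_LONG_TERM_CARE = 5


# Human-readable labels, copied from the coding scheme in the text.
LABELS = {
    TriageOutcome.OUTPATIENT: "outpatient rehabilitation",
    TriageOutcome.INPATIENT_PRIMARY:
        "inpatient rehabilitation at primary healthcare institutions",
    TriageOutcome.INPATIENT_SECONDARY:
        "inpatient rehabilitation at secondary hospitals",
    TriageOutcome.INPATIENT_TERTIARY:
        "inpatient rehabilitation at tertiary hospitals",
    TriageOutcome.INPATIENT_LONG_TERM_CARE:
        "inpatient rehabilitation at nursing homes or long-term care institutions",
}


def label_for(score: int) -> str:
    """Return the label for a score; raises ValueError for codes outside 1-5."""
    return LABELS[TriageOutcome(score)]
```

Using an integer-valued enumeration keeps the scores usable both as categories and as the numeric values compared in the statistical analysis.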
To facilitate data collection and the assessment using the DRP tool, we developed an online version of the questionnaire through the WeChat application. Prior to data collection, all doctors participating in the DRP assessments underwent standardized training to learn how to determine each indicator and use the tool, thereby minimizing bias. All questionnaire information, clinical disease information, and triage results were recorded in the Longhai Zhikang WeChat mini-program. When information was missing, doctors followed up with the patient to obtain it.
Outpatient rehabilitation doctors
In the doctor assessment, patients attended outpatient clinics, where doctors first collected baseline information and clinical disease data through face-to-face consultations. Baseline information included demographic data such as age and gender. Clinical disease information included diagnosis, functional status, disease course, complications, vital signs, and past medical history. Based on this information, doctors then used their clinical experience to assess whether the patient should receive rehabilitation services and to determine the triage outcome.
After completing the consultation, the outpatient doctor recorded the triage results based on clinical experience. In addition, the patient’s baseline information and clinical disease data were recorded in the online questionnaire. Clinical diagnoses were classified into six common categories seen in rehabilitation outpatient clinics: pediatric-related diseases, orthopedic diseases, geriatric diseases, neurological diseases, cardiopulmonary diseases, and oncological diseases. Diagnoses such as “intellectual disability,” “cerebral palsy,” or “autism” were classified under “Pediatric-related diseases”; diagnoses such as “fractures,” “osteoarthritis,” “lumbar disc herniation,” or “frozen shoulder” were classified under “Orthopedic diseases”; diagnoses such as “stroke,” “traumatic brain injury,” “spinal cord injury,” or “peripheral neuropathy” were classified under “Neurological diseases”; diagnoses such as “hypertension” or “diabetes” were classified under “Geriatric diseases”; diagnoses such as “pneumonia,” “atrial fibrillation,” or “coronary artery disease” were classified under “Cardiopulmonary diseases”; and diagnoses such as “tumor” were classified under “Oncological diseases.”
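The diagnosis grouping described above amounts to a lookup from diagnosis to category. The sketch below uses only the example diagnoses named in the text; the dictionary `DIAGNOSIS_CATEGORY` and the function `categorize` are hypothetical names, and the exact-match lookup is an assumption, since the text does not describe how diagnoses outside these examples were handled.

```python
from typing import Optional

# Example diagnoses from the text, mapped to their stated categories.
DIAGNOSIS_CATEGORY = {
    "intellectual disability": "Pediatric-related diseases",
    "cerebral palsy": "Pediatric-related diseases",
    "autism": "Pediatric-related diseases",
    "fractures": "Orthopedic diseases",
    "osteoarthritis": "Orthopedic diseases",
    "lumbar disc herniation": "Orthopedic diseases",
    "frozen shoulder": "Orthopedic diseases",
    "stroke": "Neurological diseases",
    "traumatic brain injury": "Neurological diseases",
    "spinal cord injury": "Neurological diseases",
    "peripheral neuropathy": "Neurological diseases",
    "hypertension": "Geriatric diseases",
    "diabetes": "Geriatric diseases",
    "pneumonia": "Cardiopulmonary diseases",
    "atrial fibrillation": "Cardiopulmonary diseases",
    "coronary artery disease": "Cardiopulmonary diseases",
    "tumor": "Oncological diseases",
}


def categorize(diagnosis: str) -> Optional[str]:
    """Return the category for a listed diagnosis, or None if it is not listed.

    Matching is case-insensitive and ignores surrounding whitespace
    (an assumption made for this sketch).
    """
    return DIAGNOSIS_CATEGORY.get(diagnosis.strip().lower())
```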
DRP tool
The DRP tool demonstrated good reliability and validity4. Specifically, the internal consistency and inter-rater reliability of the DRP tool were demonstrated by Cronbach’s alpha (0.66), the Kendall coefficient (0.86, P < 0.001), and the Kappa test (Kappa = 0.83, agreement = 90.48%, P < 0.001). In addition, the construct validity of the DRP tool was demonstrated by a Kaiser-Meyer-Olkin (KMO) value greater than 0.5, with an overall KMO of 0.56 (Bartlett’s test χ² = 549.35; P < 0.001). The modified confirmatory factor analysis showed that the indicators of the DRP tool had significant factor loadings on the rehabilitation triage outcomes (root mean square error of approximation = 0.03; comparative fit index = 0.99; Tucker-Lewis index = 0.95; standardized root mean square residual = 0.01).
As shown in Fig. 3, the DRP tool consists of five clinical indicators: the level of dysfunction, ADL, vital signs, disease status, and disease course. Assessors evaluate each patient on all five indicators.

The novel clinical tool for distributing rehabilitation patients (DRP).
(1) The level of dysfunction: The types of dysfunction include “cognitive disorders,” “speech disorders,” “swallowing disorders,” “cardiopulmonary disorders,” “motor disorders,” and “others.” In the DRP tool, assessors determine whether the patient has a dysfunction and whether multiple dysfunctions are present. “Multiple dysfunctions” is defined as the presence of a disorder of consciousness, or the presence of one or more additional disorders beyond motor dysfunction, such as cognitive, speech, swallowing, or cardiopulmonary disorders. If none of these conditions are met, the patient is classified as having “No dysfunction.”
(2) ADL: In the DRP tool, ADL is assessed using the Longshi Scale (LS). The LS is a novel, image-based tool developed in 2013, aimed at evaluating functional independence and disability. It has been recognized as one of the national standards in China for evaluating functional independence and disability (license code: GB/T 37103-2018)24. Specifically, the LS categorizes patients based on their ADL into three groups: the “Community Group,” the “Domestic Group,” and the “Bedridden Group.”
(3) Vital signs: These include temperature, pulse, heart rate, and respiration. The stability of vital signs is determined by assessing whether these parameters fall within normal ranges.
(4) Disease status: This includes the patient’s primary disease, underlying conditions, complications, and comorbidities. If the disease does not meet the ideal clinical indicators, new clinical manifestations appear, or new medications are required, the patient is classified as having “Uncontrolled disease.” Conversely, if none of these conditions are present, the disease is considered “Controlled.”
(5) Disease course: Due to issues related to medical insurance reimbursement, patients whose disease course exceeds a certain limit are not eligible for hospitalization cost coverage by medical insurance4. In most regions of China, the disease course limit is set at 12 months. Thus, the disease course is categorized as either longer than 12 months or 12 months or less.
The specific assessment process for the DRP tool involves the following steps: (1) Assessing the presence and type of “Dysfunction.” Patients without any dysfunction are classified as non-rehabilitation patients and should be redirected to other clinical departments; (2) Assessing ADL using the LS. Patients classified as “Community Group” are directed to outpatient rehabilitation, while those with other types of disability are directed to inpatient rehabilitation; (3) Determining the appropriate level of medical institution for inpatient rehabilitation based on “Disease Status,” “Disease Course,” and “Vital Signs.” Patients with a disease course longer than 12 months are referred to nursing homes or long-term care institutions, while those with a disease course of 12 months or less are referred to hospitals for inpatient rehabilitation. Further triage is based on “Disease Status.” Patients with “Uncontrolled disease” are referred to inpatient rehabilitation at tertiary hospitals. Lastly, patients with “Multiple Dysfunctions” are directed to inpatient rehabilitation at secondary hospitals, while those with only motor function impairments are referred to inpatient rehabilitation at primary healthcare institutions.
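The decision steps above can be sketched as a short function. This is an illustrative reading of the text, not the DRP tool's actual implementation, and it rests on several assumptions: the function and parameter names are hypothetical; any non-motor disorder (or a disorder of consciousness) is treated as “Multiple Dysfunctions”; and the “Vital Signs” indicator is left out because the text does not state the rule by which it changes the triage level.

```python
from typing import Optional, Set

MOTOR = "motor"
CONSCIOUSNESS = "consciousness"


def has_multiple_dysfunctions(dysfunctions: Set[str]) -> bool:
    """Assumed reading: a disorder of consciousness, or any disorder
    other than motor dysfunction, counts as 'Multiple Dysfunctions'."""
    return CONSCIOUSNESS in dysfunctions or any(d != MOTOR for d in dysfunctions)


def drp_triage(dysfunctions: Set[str], ls_group: str,
               course_months: float, disease_controlled: bool) -> Optional[int]:
    """Return a triage score (1-5), or None for a non-rehabilitation patient.

    ls_group is one of 'community', 'domestic', 'bedridden' (Longshi Scale).
    Vital signs are NOT modelled here (rule not given in the text).
    """
    # Step 1: no dysfunction -> redirect to other clinical departments.
    if not dysfunctions:
        return None
    # Step 2: Community Group -> outpatient rehabilitation.
    if ls_group == "community":
        return 1
    # Step 3a: disease course longer than 12 months -> long-term care.
    if course_months > 12:
        return 5
    # Step 3b: uncontrolled disease -> tertiary hospital.
    if not disease_controlled:
        return 4
    # Step 3c: multiple dysfunctions -> secondary hospital;
    # motor dysfunction only -> primary healthcare institution.
    if has_multiple_dysfunctions(dysfunctions):
        return 3
    return 2
```

For example, under these assumptions a patient in the “Domestic Group” with motor and speech disorders, a 6-month disease course, and controlled disease would be scored 3 (inpatient rehabilitation at a secondary hospital).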
ChatGPT
We utilized the ChatGPT-4 model, accessed through a Chinese network portal, for the triage assessment. The triage process using ChatGPT is shown in Fig. 4A and B. The specific procedure was as follows: (1) ChatGPT was configured as a rehabilitation doctor with extensive experience practicing in China; (2) ChatGPT was instructed to first learn about the Chinese hierarchical medical system and then, based on patient information, determine whether the patient required rehabilitation services, and if so, whether outpatient or inpatient services were appropriate, and at which level of healthcare institution; (3) Patient information was sequentially entered into the dialog box, including age, gender, primary clinical diagnosis, dysfunction status, ADL, vital signs, control status of the primary disease, and duration of the disease course.
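The three-step procedure above can be illustrated by assembling the text entered into the dialog box. The sketch below is hypothetical: the prompt wording, the `PatientRecord` fields, and the function `build_prompts` are assumptions made for illustration, and the prompts actually used in the study are those shown in Fig. 4. No call to any ChatGPT interface is made.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PatientRecord:
    """Fields follow the items listed in step (3); names are hypothetical."""
    age: int
    gender: str
    diagnosis: str
    dysfunction: str
    adl: str
    vital_signs: str
    disease_control: str
    disease_course: str


# Step (1): role configuration (illustrative wording, not the study's prompt).
ROLE_PROMPT = ("You are a rehabilitation doctor with extensive experience "
               "practicing in China.")

# Step (2): task instruction (illustrative wording, not the study's prompt).
TASK_PROMPT = ("First review the Chinese hierarchical medical system. Then, "
               "based on the patient information, decide whether the patient "
               "requires rehabilitation services and, if so, whether outpatient "
               "or inpatient services are appropriate and at which level of "
               "healthcare institution.")


def build_prompts(p: PatientRecord) -> List[str]:
    """Return the three messages in the order they would be entered."""
    patient_block = "\n".join([
        f"Age: {p.age}",
        f"Gender: {p.gender}",
        f"Primary clinical diagnosis: {p.diagnosis}",
        f"Dysfunction status: {p.dysfunction}",
        f"ADL: {p.adl}",
        f"Vital signs: {p.vital_signs}",
        f"Control status of primary disease: {p.disease_control}",
        f"Disease course: {p.disease_course}",
    ])
    return [ROLE_PROMPT, TASK_PROMPT, patient_block]
```

Building every case from the same template is one simple way to keep the prompt structure identical across patients.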

ChatGPT Assessment Schematic. Note: (A) and (B) are graphical representations of the ChatGPT assessment.
We implemented rigorous quality control in the evaluation of ChatGPT. First, given that ChatGPT demonstrates a strong ability to learn within the same conversation window, there is a risk of repetition or error propagation. To minimize systematic errors, a new conversation window was created for each triage assessment. Second, ChatGPT did not utilize any auxiliary plugins, and the “chat history and training” feature was disabled to maintain objectivity in each response. Third, we ensured consistency in the prompt structure for all cases. In most responses, ChatGPT directly generated the triage results for the patient. If multiple triage options were suggested, we prompted ChatGPT to select the most suitable one, ensuring a precise and definitive response.
Statistical analysis
Statistical analysis was conducted using EmpowerStats software and SPSS 26.0. P < 0.05 was considered statistically significant. The statistical analysis steps were as follows: (1) The Kolmogorov-Smirnov test was used to assess normality. Data following a normal distribution were presented as mean ± standard deviation, whereas non-normally distributed data were presented as median and interquartile range; (2) Differences in the triage scores among the three groups were compared using analysis of variance or the Kruskal-Wallis (KW) test. Bonferroni post hoc tests were used to identify between-group differences; (3) The Bland-Altman test, linear weighted Kappa, and intraclass correlation coefficient (ICC) were employed to assess the consistency of triage results among the three groups. Kappa reliability and ICC thresholds were defined as follows: poor (0–0.2), fair (0.2–0.4), moderate (0.4–0.6), good (0.6–0.8), and very good (0.8–1)24,25; (4) The percentage consistency test was used to assess consistency between different triage methods.
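To make the agreement measures concrete, the following sketch computes percentage agreement and a linear weighted Kappa for two sets of triage scores, and maps a coefficient to the interpretation bands listed above. It is a standard-library Python illustration only; the study itself used EmpowerStats and SPSS. The function names are hypothetical, and assigning boundary values (e.g. exactly 0.2) and negative values to the lower band is an assumption, since the stated ranges overlap at their endpoints.

```python
from collections import Counter
from typing import Sequence


def percent_agreement(a: Sequence[int], b: Sequence[int]) -> float:
    """Percentage of cases in which the two methods give the same score."""
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def linear_weighted_kappa(a: Sequence[int], b: Sequence[int],
                          categories: Sequence[int] = (1, 2, 3, 4, 5)) -> float:
    """Cohen's Kappa with linear disagreement weights |i - j| / (k - 1)."""
    n, k = len(a), len(categories)
    idx = {c: i for i, c in enumerate(categories)}
    count_a, count_b = Counter(a), Counter(b)
    # Observed weighted disagreement, averaged over the paired ratings.
    observed = sum(abs(idx[x] - idx[y]) for x, y in zip(a, b)) / (n * (k - 1))
    # Weighted disagreement expected by chance from the marginal counts.
    expected = sum(count_a[ci] * count_b[cj] * abs(i - j)
                   for ci, i in idx.items()
                   for cj, j in idx.items()) / (n * n * (k - 1))
    if expected == 0:
        raise ValueError("Kappa is undefined when all ratings share one category")
    return 1.0 - observed / expected


def interpret(value: float) -> str:
    """Map a Kappa or ICC value to the bands given in the text
    (upper bounds treated as inclusive; an assumption of this sketch)."""
    for upper, label in ((0.2, "poor"), (0.4, "fair"),
                         (0.6, "moderate"), (0.8, "good")):
        if value <= upper:
            return label
    return "very good"
```

The linear weights penalize a disagreement in proportion to how many triage levels apart the two scores are, which is why a weighted rather than unweighted Kappa suits ordered scores.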