Sample
Longitudinal studies (LS)
Data were drawn from 10 UK LS that had conducted surveys before and during the COVID-19 pandemic. Five were age-homogeneous cohorts: the Millennium Cohort Study (MCS)28; the Avon Longitudinal Study of Parents and Children (ALSPAC, generation 1, “G1”)29; Next Steps (NS)30; the 1970 British Cohort Study (BCS70)31; and the National Child Development Study (NCDS)32. Five were age-heterogeneous samples: the Born in Bradford study (BiB)33; Understanding Society (USOC)34; Generation Scotland: the Scottish Family Health Study (GS)35; the parents of the ALSPAC-G1 cohort, which we refer to as ALSPAC-G036; and the UK Adult Twin Registry (TwinsUK)37. Study details and references are shown in Supplementary Table 1. The minimum inclusion criteria were pre-pandemic health measures, age, sex, ethnicity, self-reported COVID-19 and self-reported duration of COVID-19 symptoms. Ethics statements are presented in Supplementary Table 2.
Electronic Health Records (EHR)
Working on behalf of NHS England, we conducted a population-based cohort study to measure the recording of long COVID in EHR data from primary care practices using TPP SystmOne software, linked to secondary care data (SUS files) through OpenSAFELY. OpenSAFELY is a data analytics platform created on behalf of NHS England during the COVID-19 pandemic to allow near-real-time analysis of pseudonymized primary care records within a highly secure EHR data environment. Details of the information governance for the OpenSAFELY platform can be found in Supplementary Note 1. From a population of all people alive and registered with a general practice on 1 December 2020, we selected all patients with evidence of COVID-19 from any of the following: a positive SARS-CoV-2 test, hospitalization with an associated COVID-19 diagnostic code, or a recorded COVID-19 diagnostic code in primary care.
Measures
Outcomes: COVID-19 and long COVID definitions
LS: COVID-19 cases were defined by self-report, including confirmation by testing and diagnosis by a healthcare professional (see Supplementary Data 1 for full details of the questions and coding used in each study). Long COVID was defined according to NICE categories using self-reported symptom duration1. Based on these categories, we defined two main outcomes: (i) symptoms lasting 4+ weeks (with symptoms lasting 0–4 weeks as the reference) and (ii) symptoms lasting 12+ weeks (with symptoms lasting 0–12 weeks as the reference). Some studies recorded the duration of symptoms of any severity, while others recorded only symptoms that affected daily functioning (Table 2). In addition, two studies (BiB, TwinsUK) derived alternative estimates of long COVID based on counts of individual symptoms lasting more than 4 or 12 weeks over a period of at least six months (Supplementary Note 2). All data used to derive these outcomes were collected between April and November 2020. EHR: Any record of long COVID in the primary care record was coded as a binary variable. This was defined using a list of 15 UK SNOMED codes, categorized as diagnostic (2 codes), referral (3 codes) and assessment (10 codes). SNOMED is a structured international clinical coding system for use in EHR38. These clinical codes were designed based on the guidelines issued for long COVID by NICE1. The outcome was measured between the study start date (1 February 2020) and the end date (9 May 2021).
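As an illustration only, the two LS outcomes described above can be sketched in Python. The function name, the numeric input in weeks and the treatment of exactly 4 or 12 weeks as meeting the threshold are assumptions made for this sketch; the studies themselves collected duration through the questionnaire items listed in Supplementary Data 1.

```python
from typing import Dict, Optional


def long_covid_outcomes(duration_weeks: Optional[float]) -> Dict[str, Optional[int]]:
    """Derive two binary outcomes from self-reported symptom duration in weeks.

    Hypothetical helper: boundary handling (>= 4 and >= 12) is an assumption.
    """
    if duration_weeks is None:
        # Missing duration: both outcomes are left undefined.
        return {"symptoms_4plus_weeks": None, "symptoms_12plus_weeks": None}
    return {
        # Outcome (i): 4+ weeks versus the 0-4 week reference group.
        "symptoms_4plus_weeks": int(duration_weeks >= 4),
        # Outcome (ii): 12+ weeks versus the 0-12 week reference group.
        "symptoms_12plus_weeks": int(duration_weeks >= 12),
    }
```

Note that a participant with symptoms lasting, say, 6 weeks is a case for outcome (i) but falls in the reference group for outcome (ii).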
Exposures
Socio-demographic factors
All studies included age, sex, ethnicity (white or non-white ethnic minority group, where available) and the Index of Multiple Deprivation (IMD, divided into quintiles with 1 representing the most deprived and 5 the least deprived). Area-level SES was measured using IMD 2019, a composite of different domains including area-level income, employment, access to education and crime, for the postcode where a participant lived at the time of sample collection39. The LS included additional measures of socioeconomic status: education (degree, no degree) and current or most recent occupation (Supplementary Data 1). The EHR data also included geographical region40.
Mental health
LS: Pre-pandemic measures used validated continuous scales of anxiety and depression symptoms, dichotomized using established cut-offs to indicate distress (see Supplementary Data 1). EHR: Evidence of a pre-existing mental health condition was defined using prior codes for one of the following: psychosis; schizophrenia; bipolar disorder; or depression.
Self-assessed general health
LS: Pre-pandemic self-assessment on a 5-point scale, dichotomized to compare excellent to good health (categories 1–3) with fair or poor health (categories 4 and 5).
Overweight and obesity
LS: Body mass index (BMI, kg/m²) measured before the pandemic, dichotomized to compare a BMI of 0–24.9 (underweight/normal weight) with a BMI ≥25 (overweight/obesity). EHR: Categorized as obese or not obese using the most recent BMI measurement, with those with obesity further classified as Obese I (BMI 30–34.9), Obese II (BMI 35–39.9) or Obese III (BMI 40+). BMI ≥25 was used in the LS because the proportion of participants in the obese category (i.e., BMI ≥30) was relatively small, e.g. 8.9% for TwinsUK, while obesity codes were used in the EHR because these are the most reliable and valid indicators of obesity in general practice.
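The two BMI codings can be written out as a minimal sketch. The function names and category labels are hypothetical, and the printed bands (e.g. 30–34.9) are treated here as half-open intervals so that values such as 34.95 are not left unclassified; that boundary handling is an assumption of the sketch.

```python
def ls_bmi_category(bmi: float) -> str:
    """LS coding: binary split at a BMI of 25 kg/m^2."""
    return "overweight/obesity" if bmi >= 25 else "underweight/normal weight"


def ehr_obesity_class(bmi: float) -> str:
    """EHR coding: obesity classes from the most recent BMI measurement."""
    if bmi >= 40:
        return "Obese III"
    if bmi >= 35:
        return "Obese II"
    if bmi >= 30:
        return "Obese I"
    return "not obese"
```

The same measurement can therefore count as exposed in the LS coding (for example a BMI of 27) while being "not obese" in the EHR coding, which is worth keeping in mind when comparing estimates across the two data sources.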
Health conditions
LS: Pre-pandemic self-report of asthma, diabetes, hypertension and high cholesterol. EHR: A prior code 6 months to 5 years before March 2020 for one or more of the following: diabetes; cancer; hematological cancer; asthma; chronic respiratory disease; chronic heart disease; chronic liver disease; stroke or dementia; another neurological condition; organ transplant; dysplasia; rheumatoid arthritis, systemic lupus erythematosus or psoriasis; or other immunosuppressive conditions. Those with no relevant code for a condition were assumed not to have that condition. The number of conditions was categorized into “0”, “1” and “2 or more”.
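A small sketch of the EHR condition count follows. The function name and the dictionary input are hypothetical; the only rules taken from the text are that a condition without a relevant code counts as absent and that the count is grouped as "0", "1" or "2 or more".

```python
from typing import Mapping


def condition_count_category(has_code: Mapping[str, bool]) -> str:
    """Group the number of coded conditions into "0", "1" or "2 or more".

    Conditions missing from the mapping, or mapped to False, count as absent.
    """
    n_conditions = sum(1 for present in has_code.values() if present)
    return "2 or more" if n_conditions >= 2 else str(n_conditions)
```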
Health behaviors
LS: Current smoking status (coded as “0” = no, “1” = yes).
Statistical analysis: LS
The main analyses were performed in studies with a direct self-reported measure of COVID-19 symptom duration. Associations between each factor and both long COVID outcomes (symptoms for 4+ weeks and symptoms for 12+ weeks) were assessed in separate logistic regression models within each study. We adjusted for a minimal set of covariates in all studies, where relevant: age (as a continuous variable when included as a covariate), sex and ethnicity. We report odds ratios (ORs) and 95% confidence intervals (CIs). To synthesize effect sizes across studies, a fixed-effect meta-analysis with restricted maximum likelihood was carried out and repeated with a random-effects model for comparison. The I² statistic was used to report heterogeneity between estimates. Meta-analyses were performed using the metafor41 package for R (version 4). Because of the different age structures of the LS, the direct relationship between age and long COVID risk was examined separately from the other risk factors, and we modeled the relationship in two ways. First, in the age-heterogeneous samples we compared long COVID risk across predetermined age groups, comparing 45–69 and 70+ with 18–44 in three cohorts (USOC, TwinsUK and GS) and 55–59 and 60–76 with 45–54 in one cohort (ALSPAC-G0). Second, in a subset of LS birth cohorts with participants of near-identical ages who were given fully harmonized long COVID questionnaires (MCS, NS, BCS70 and NCDS), we analyzed the trend in absolute risk of long COVID with increasing age using meta-regression. Attrition and survey design were addressed by weighting estimates to be representative of the target population of each LS (weights were not available for BiB and TwinsUK).
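To make the pooling step concrete, the following is a minimal Python sketch of an inverse-variance fixed-effect meta-analysis of study-level ORs, with Cochran's Q and a Q-based I². It is not the analysis code used here, which relied on metafor in R. The function name is hypothetical, standard errors are back-calculated from the 95% CIs, and the Q-based I² formula is an assumption that may differ from the I² that metafor reports for a model fitted with restricted maximum likelihood.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% CI


def pool_fixed_effect(odds_ratios, ci_lowers, ci_uppers):
    """Inverse-variance fixed-effect pooling of study-level odds ratios."""
    log_ors = [math.log(o) for o in odds_ratios]
    # Recover each standard error from the width of the 95% CI on the log scale.
    ses = [(math.log(u) - math.log(l)) / (2 * Z_95)
           for l, u in zip(ci_lowers, ci_uppers)]
    weights = [1.0 / se ** 2 for se in ses]
    total_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / total_w
    pooled_se = math.sqrt(1.0 / total_w)
    # Cochran's Q and the Q-based I^2 (share of variation beyond chance, in %).
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_ors))
    df = len(log_ors) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {
        "or": math.exp(pooled),
        "ci_lower": math.exp(pooled - Z_95 * pooled_se),
        "ci_upper": math.exp(pooled + Z_95 * pooled_se),
        "q": q,
        "i2": i2,
    }
```

Pooling is done on the log-odds scale because the sampling distribution of log(OR) is approximately normal, and each study is weighted by the inverse of its variance so that more precise studies contribute more to the pooled estimate.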
Sensitivity analyses
To mitigate index event bias27, inverse probability weights (IPWs) for the risk of COVID-19 were derived. These were derived in each LS separately but following a common approach used previously (see Supplementary Note 3 for details)42. The resulting weights were then applied to all analysis models as a sensitivity check. For studies in which we were able to verify SARS-CoV-2 infection (TwinsUK and ALSPAC-G0 and -G1), the analyses were repeated in the subsample of those with a positive polymerase chain reaction (PCR) test obtained through data linkage and/or lateral flow antibody tests (ALSPAC) and enzyme-linked immunosorbent assay (ELISA) (TwinsUK)43 confirming exposure to the virus. These results are presented in Supplementary Figs. 11–14.
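The general idea of inverse probability weighting can be illustrated with a short sketch. The model used to estimate each participant's probability of COVID-19 is described in Supplementary Note 3 and is not reproduced here; the probabilities are simply taken as given inputs, and both function names are hypothetical. Plain unstabilized weights without truncation are an assumption of this sketch.

```python
from typing import List, Sequence


def inverse_probability_weights(p_covid: Sequence[float]) -> List[float]:
    """Weight each COVID-19 case by 1 / (estimated probability of being a case)."""
    if any(not (0.0 < p <= 1.0) for p in p_covid):
        raise ValueError("probabilities must lie in (0, 1]")
    return [1.0 / p for p in p_covid]


def weighted_proportion(outcomes: Sequence[int], weights: Sequence[float]) -> float:
    """Weighted proportion of a binary outcome (e.g. symptoms for 4+ weeks)."""
    return sum(w * y for w, y in zip(weights, outcomes)) / sum(weights)
```

Cases who were unlikely to have had COVID-19 given their characteristics receive larger weights, so the weighted sample of cases more closely resembles the full cohort from which they were selected.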
Statistical analysis: EHR
We performed logistic regression to assess whether long COVID recorded by GPs was associated with any socio-demographic or pre-pandemic health characteristics. We adjusted for the same set of confounders used in the LS analyses: age (as a categorical variable), sex and ethnicity. In further analyses of age as a risk factor for long COVID in the EHR data, we assigned individuals in 10-year age bands an age at the midpoint of each band and then assessed the trend in long COVID frequency with age using linear and non-linear meta-regression.
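The midpoint assignment and the linear trend can be sketched as follows. This is a simplified illustration, assuming age bands are supplied as half-open intervals [lower, upper) and that the linear meta-regression is approximated by inverse-variance weighted least squares on band-level estimates; the function names are hypothetical and the non-linear model is not shown.

```python
from typing import Sequence, Tuple


def band_midpoint(lower: float, upper: float) -> float:
    """Midpoint of an age band given as the half-open interval [lower, upper)."""
    return (lower + upper) / 2.0


def weighted_linear_trend(
    x: Sequence[float], y: Sequence[float], variances: Sequence[float]
) -> Tuple[float, float]:
    """Inverse-variance weighted least-squares fit of y on x.

    Returns (slope, intercept). Needs at least two distinct x values.
    """
    w = [1.0 / v for v in variances]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar
```

For example, bands [20, 30), [30, 40) and [40, 50) map to ages 25, 35 and 45, and the fitted slope is then the change in the band-level estimate per year of age.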
Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.