Study design
ENVIRonmental influence ON early AGEing (ENVIRONAGE) is a Belgian birth cohort that started in 2010, with ongoing recruitment for mother-newborn pairs at birth in the East-Limburg Hospital (Genk, Belgium). Complete information on the eligibility and recruitment process is presented elsewhere [26]. A follow-up examination is conducted when the child is 4–6 years, where parents fill out questionnaires to provide lifestyle and socio-demographic information, and the Strengths and Difficulties Questionnaire (SDQ) to assess the child’s behavior. In addition, during this visit, the child performs cognitive testing using the Cambridge Neuropsychological Test Automated Battery (CANTAB) [27]. The study protocol was approved by the ethical committee of Hasselt University and complied with the Helsinki Declaration. Parents gave written informed consent and children verbal permission [26, 28]. This study followed the Strengthening Reporting of Observational Studies in Epidemiology (STROBE) reporting guideline.
In 2017 and 2018, a subset of the ENVIRONAGE participants was asked to participate in home visits. More specifically, we selected households that had already participated in the follow-up examination or had the examination planned close to the home visits, had not moved in between, and had no indoor renovations planned. In total, 233 of the 284 eligible households were contacted, of which 189 agreed to participate, resulting in a participation rate of 81% (Supplemental Fig. 1). Due to logistical constraints, eight samples were not collected. Additionally, we excluded two samples because of sampling irregularities, two samples with fewer than 1000 sequences due to insufficient dust, and one sample because the sampling period exceeded the predetermined maximum of 9 weeks. Of the 176 children with appropriate dust samples, 171 and 172 children completed the SDQ and CANTAB, respectively.
Indoor microbial assessment during home visit
Settled dust was collected using two sterile, open-faced Petri dishes (92 × 16 mm) over a period of at least four and at most nine weeks (mean 43.4 days), in spring 2017 and spring 2018, in the household’s living room. The dishes were placed approximately 2 m above floor level, at a safe distance from major air flows and heating sources [29]. Upon collection, the Petri dishes were sealed and stored at −20°C to be processed in the summer of 2018, as described previously [30]. After processing, samples were shipped frozen on dry ice to the Finnish Institute for Health and Welfare (Kuopio, Finland), where DNA was extracted as described earlier [30] and stored at −20°C until sequencing.
Extracted DNA from dust and control samples was shipped frozen to the sequencing service partner LGC Genomics (Germany) for library preparation and sequencing. For bacteria, the V4 region of the bacterial 16S rRNA gene was amplified using 515F/806R primers [31]. For fungi, the Internal Transcribed Spacer region 1 (ITS1) was amplified using ITS1F/ITS2 primers [32].
The PCR procedure, sequencing, sequence processing, and bioinformatics analyses are described in more depth in the supplement. In brief, 16S and ITS amplicon data were processed with the standard DADA2 pipeline version 1.8 [30]. Taxonomy was assigned using the SILVA database version 132 [33] for bacteria and the UNITE database version 7.2 [34] for fungi. Downstream processing included removal of chimeras, chloroplast, and mitochondria sequences, as well as of potential contaminants, identified using negative controls and the Decontam package version 1.2 [35].
QIIME software version 1.9.1 [36] was used to calculate Chao1 richness and Shannon diversity index using rarefaction values of 1495 and 3956 sequences for bacteria and fungi, respectively. The Chao1 index is an abundance-based estimator of species richness. The Shannon diversity index incorporates species evenness, i.e., the homogeneity of species abundance, with species richness.
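As an illustration of the two indices, the following is a minimal sketch of how Chao1 richness and Shannon diversity can be computed from a vector of per-taxon sequence counts that has already been rarefied. It is not the QIIME implementation: the function names are hypothetical, the bias-corrected form of Chao1 and a base-2 logarithm for the Shannon index are assumptions, and the rarefaction step itself is not shown.

```python
import math


def chao1(counts):
    """Bias-corrected Chao1 richness estimate (assumed form).

    counts: per-taxon sequence counts for one sample.
    """
    observed = [c for c in counts if c > 0]
    s_obs = len(observed)                      # observed number of taxa
    f1 = sum(1 for c in observed if c == 1)    # singletons
    f2 = sum(1 for c in observed if c == 2)    # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def shannon(counts, base=2.0):
    """Shannon diversity index; the log base is an assumption."""
    observed = [c for c in counts if c > 0]
    total = sum(observed)
    proportions = [c / total for c in observed]
    return -sum(p * math.log(p, base) for p in proportions)


# Hypothetical sample: 7 observed taxa, 3 singletons, 2 doubletons.
example_counts = [10, 5, 2, 2, 1, 1, 1]
example_chao1 = chao1(example_counts)
example_shannon = shannon(example_counts)
```

In this sketch, a sample with many singletons relative to doubletons receives a Chao1 estimate well above its observed richness, whereas the Shannon index is highest when sequences are spread evenly over taxa.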
We used quantitative PCR to calculate the total Gram-negative and Gram-positive bacterial and fungal loads, as described previously [30]. We determined the number of microbial cell equivalents (CE) in the samples using relative quantification, utilizing an internal standard to adjust for the presence of DNA inhibitors and/or variability in DNA extraction efficiency [37]. Results were normalized for sampling surface area and accumulation duration and expressed as CE per m² settling surface area per day, referred to hereafter as microbial load.
Neuropsychological assessment during follow-up visit
Behavioral outcomes
To assess the child’s behavior, parents filled out the SDQ [38], a validated screening method for psychiatric disorders in children [39,40,41]. From this questionnaire, four scales (range 0–10) are each calculated from five statements: peer relationship problems, emotional problems, conduct problems, and hyperactivity. These four scales were additionally combined into a Total Difficulties Score (range 0–40). We used British cutoff guidelines and categorized SDQ outcomes into discrete variables by grouping borderline and abnormal scores together to define a “not normal” category. A score was classified as not normal when it was equal to or above 3 for peer relationship, 4 for emotional, 3 for conduct, 6 for hyperactivity, and 14 for the Total Difficulties Score [42].
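The categorization described above can be sketched as follows. This is only an illustration under stated assumptions: the dictionary keys and function names are hypothetical, and the input is assumed to be the four already-scored scales (each 0–10) rather than the raw questionnaire items.

```python
# Cutoffs at or above which a score is classed as "not normal",
# taken from the thresholds stated in the text.
SDQ_CUTOFFS = {
    "peer": 3,
    "emotional": 4,
    "conduct": 3,
    "hyperactivity": 6,
    "total": 14,
}


def total_difficulties(scales):
    """Sum of the four problem scales (range 0-40)."""
    return sum(scales[name] for name in ("peer", "emotional", "conduct", "hyperactivity"))


def categorize_sdq(scales):
    """Return True ("not normal") or False ("normal") per outcome."""
    scores = dict(scales)
    scores["total"] = total_difficulties(scales)
    return {name: scores[name] >= cutoff for name, cutoff in SDQ_CUTOFFS.items()}


# Hypothetical child: only the emotional scale reaches its cutoff.
example = categorize_sdq({"peer": 2, "emotional": 4, "conduct": 0, "hyperactivity": 5})
```

The resulting binary indicators correspond to the outcome coding that a logistic regression model would require.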
Cognitive function outcomes
Cognitive function was assessed with the CANTAB software [27] on a touchscreen tablet, which is reliable for measuring executive functions in young children [43]. In total, each child had to complete four tasks (Fig. 1). Two tasks assessed attention and psychomotor speed (Motor Screening Task (MOT) and Big/Little Circle (BLC)), and two assessed visual working memory (Spatial Span (SSP) and Delayed Matching to Sample (DMS)). Detailed information on the cognitive assessment is provided in the supplement.
The MOT outcome variables included the response time and error. The response time is the average time in milliseconds to select the cross successfully. The error is the average distance in pixel units between the child’s press and the cross’s center on all successful trials. Low values in response time and in error indicate better performance. The BLC outcome variable was response time, which in this test is the median time in milliseconds to select the right circle successfully; low values indicate better performance. The SSP measured the maximum sequence length the child could correctly recall, further referred to as span length; higher values are considered better. In the fourth and last test, i.e., DMS, the first outcome variable was the average time, in milliseconds, it took to answer correctly on the first try, further referred to as response time; lower values are considered better. We excluded response times calculated on fewer than 25% of the trials. The last outcome variables were the probability of error if the previous trial was correct, and the total proportion of correct answers on the first try, further referred to as percentage correct, both expressed in percentages. A lower probability of error and a higher percentage correct are considered better.
Covariables
During the follow-up examination, questionnaires were used to collect lifestyle and clinical information. Maternal education was used to represent socioeconomic status, coded as “low” (no diploma or primary school), “middle” (high school), or “high” (college or university degree). In addition, we obtained information on average daily screen time, defined as watching television, playing computer games, and tablet use, categorized as “<1 h per day,” “1–2 h per day,” and “>2 h per day.” The time of examination was used as a continuous variable in the main analyses and categorized in the sensitivity analysis into morning (before 12pm), early afternoon (from 12pm up to 4pm), and late afternoon (after 4pm). Additionally, we obtained information on parental smoking, coded dichotomously as non-smoking parents versus one or both parents smoking.
The residential addresses of the households were geocoded and categorized into rural, and suburban or urban, based on population density, employment, location, and spatial planning of statistical sectors (Flemish Government-Department Environment). In addition, we calculated average black carbon (BC) exposure by averaging daily BC concentrations at the residential address over the sampling period, using a spatiotemporal interpolation method [44] as described elsewhere [30]. Additionally, average outdoor temperature (°C), provided by the Belgian Royal Meteorological Institute, was calculated as the daily mean temperatures measured at a representative measuring station (Diepenbeek, Belgium) averaged over the sampling period.
Upon sample collection, we obtained additional household information: number of household members, pet ownership, ventilation type, and sampling duration. Pet ownership was dichotomized into the presence of a furry pet (cat, dog, rabbit, hamster, or guinea pig) or not and ventilation type into the use of passive ventilation or not.
Statistical analysis
For the statistical processing, we used the R environment version 3.6.0 [45]. We screened for outliers, defined as values more than three standard deviations above or below the mean. We removed two outliers for the bacterial Shannon diversity index. For the Gram-negative, Gram-positive, and fungal load, we removed two, five, and three outliers, respectively, and log-transformed values (base 10) to better comply with linear model assumptions. We identified certain core variables to be included in all of our models, namely child’s age, sex, maternal education, urbanicity, and sampling duration, which reflect important clinical, socioeconomic, and technical information regarding the microbial and cognitive assessment.
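The outlier rule and the transformation can be illustrated with a short sketch. The analysis itself was run in R; the Python below is only a hedged illustration with hypothetical function names. It assumes the sample standard deviation (as R's `sd` computes) and assumes that screening is applied before the log transformation, an order the text does not state explicitly.

```python
import math
import statistics


def screen_outliers(values, k=3.0):
    """Keep values within k sample standard deviations of the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator)
    return [v for v in values if abs(v - mean) <= k * sd]


def log10_transform(values):
    """Base-10 log transform; values must be strictly positive."""
    return [math.log10(v) for v in values]


# Hypothetical microbial loads with one extreme value.
loads = [10.0] * 20 + [1000.0]
kept = screen_outliers(loads)
log_loads = log10_transform(kept)
```

Note that a three-standard-deviation rule can only flag a value when the sample is large enough, since a single extreme value also inflates the standard deviation it is compared against.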
To examine the associations between microbial exposure and the child’s behavior, we used logistic regression models for the SDQ outcomes, adjusting for the aforementioned core variables (child’s age, sex, maternal education, urbanicity, and sampling duration), as well as the number of household members, which was identified as an additional potential confounder for this analysis. Results are expressed as odds ratios (OR) per interquartile range (IQR) increase in microbial diversity indices or per 2-fold increase in microbial load.
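How a logistic regression coefficient translates into an OR per IQR or per 2-fold increase can be sketched as below. This is an illustration rather than the authors' code: the function names are hypothetical, a Wald-type 95% CI is assumed, and the 2-fold scaling assumes the load entered the model on a log10 scale, so that a doubling equals an increase of log10(2) units.

```python
import math
import statistics

# A doubling of the load on the log10 scale (assumes a log10-transformed predictor).
DOUBLING_ON_LOG10 = math.log10(2)


def iqr(values):
    """Interquartile range using linear interpolation between order statistics."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


def scaled_odds_ratio(beta, se, scale, z=1.959964):
    """OR and Wald 95% CI for a `scale`-unit increase in the predictor.

    beta, se: coefficient and standard error on the log-odds scale
    per one unit of the predictor.
    """
    estimate = math.exp(beta * scale)
    lower = math.exp((beta - z * se) * scale)
    upper = math.exp((beta + z * se) * scale)
    return estimate, lower, upper


# Hypothetical coefficient for a diversity index with hypothetical values.
diversity = [1.0, 2.0, 3.0, 4.0, 5.0]
or_per_iqr = scaled_odds_ratio(beta=0.2, se=0.1, scale=iqr(diversity))
or_per_doubling = scaled_odds_ratio(beta=0.5, se=0.2, scale=DOUBLING_ON_LOG10)
```

Rescaling the coefficient in this way changes only the unit in which the association is reported, not the underlying model fit.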
To investigate microbial exposure in association with cognitive CANTAB outcomes, we log-transformed (base 10) all response times to better comply with linear model assumptions. Furthermore, we fitted multivariable linear regression models adjusting for the aforementioned core variables (child’s age, sex, maternal education, urbanicity, and sampling duration), as well as the time of examination, reflecting relevant technical information regarding the CANTAB outcomes. Results are expressed as a unit change, or as a percentage change for the log-transformed response time outcomes (with 95% CI), per IQR increment in microbial diversity or per 2-fold increase in microbial load.
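For a log10-transformed outcome, a regression coefficient can be converted into a percentage change as sketched below. The function name is hypothetical and a Wald-type 95% CI is assumed; `scale` stands for the exposure contrast of interest, for example the IQR of a diversity index, or log10(2) for a 2-fold increase in a log10-transformed load.

```python
import math


def percent_change(beta, se, scale, z=1.959964):
    """Percentage change in a log10-transformed outcome.

    beta, se: coefficient and standard error per one unit of exposure,
    estimated with the outcome on the log10 scale.
    scale: size of the exposure contrast (e.g. IQR or log10(2)).
    """
    def to_pct(b):
        # Back-transform from the log10 scale to a relative change.
        return (10 ** (b * scale) - 1) * 100

    return to_pct(beta), to_pct(beta - z * se), to_pct(beta + z * se)


# Hypothetical example: a coefficient of log10(1.1) per unit corresponds
# to a 10% longer response time per one-unit increase in exposure.
estimate, lower, upper = percent_change(math.log10(1.1), 0.01, scale=1.0)
```

A positive percentage change for a response time indicates slower, and therefore poorer, performance.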
To assess the robustness of our findings, we additionally adjusted for potential confounders: smoking, determinants of indoor microbiota (average outdoor temperature, BC exposure, pet ownership, and ventilation), screen time (as a proxy for prior screen familiarization), and number of household members. We also performed two sensitivity analyses for the CANTAB models. First, we excluded children who showed any signs of possible disinterest during cognitive testing, based on behavioral remarks and irregular touch patterns (n = 1, 4, 7, and 5 for MOT, BLC, SSP, and DMS, respectively). Second, because a child’s performance may depend on tiredness, we restricted our analysis to children who performed the cognitive tests before 4pm.