SAPRIN Mental Health Datasets 2024

South Africa, 1993 - 2023
Reference ID
SAPRIN.SMHD2024V1
Producer(s)
Dr Molulaqhooa Linda Maoyi, Tinofa Mutevedzi, Prof Mark Collinson, Prof Steve Tollman, Dr Joseph Tlouyamma, Prof Collins Iwuji, Dr Kobus Herbst
Created on
Nov 29, 2024
Last modified
Apr 15, 2025
    Identification

    Survey ID number

    SAPRIN.SMHD2024V1

    Title

    SAPRIN Mental Health Datasets 2024

    Country
    Name            Country code
    South Africa    RSA
    Study type

    Demographic Surveillance

    Series Information

    This dataset contains demographic surveillance data covering the period from 1 Jan 1993 to 31 Dec 2023.

    Abstract

    SAPRIN (South African Population Research Infrastructure Network) is a network of seven Health and Demographic Surveillance System (HDSS) nodes in South Africa.
    Between them, the nodes follow more than 75 000 households (320 000 individuals) longitudinally through regular surveillance visits. Shortly after the start of the Covid-19 pandemic, SAPRIN implemented a shared Covid-19 surveillance programme in three of the nodes: the MRC/Wits University Agincourt HDSS in Bushbuckridge District, Mpumalanga, established in 1993; the University of Limpopo DIMAMO HDSS in the Capricorn District of Limpopo, established in 1996; and the Africa Health Research Institute (AHRI) HDSS in uMkhanyakude District, KwaZulu-Natal, established in 2000. This Covid-19 surveillance is still being conducted in these three SAPRIN nodes.

    As part of the Covid-19 surveillance, the PHQ-2 and GAD-2 screening questions were administered to household respondents in Agincourt, DIMAMO and AHRI. By the end of 2021, a total of 90 000 such interviews had been conducted.
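    As an illustration of how such screening items are usually scored, the sketch below sums two items coded 0 to 3 and flags totals of 3 or more, which is the conventional cut-off for both the PHQ-2 and the GAD-2. The function name, the 0 to 3 coding and the handling of missing answers are assumptions made for the example; the coding used in the SAPRIN files should be checked in the Data Dictionary before applying anything like this.

```python
from typing import Optional, Tuple


def screen_score(
    item1: Optional[int], item2: Optional[int], cutoff: int = 3
) -> Tuple[Optional[int], Optional[bool]]:
    """Score a two-item screener (hypothetical helper, not part of SAPRIN).

    Each item is assumed to be coded 0-3. Returns (total, screens_positive),
    or (None, None) when either answer is missing.
    """
    items = (item1, item2)
    if any(value is None for value in items):
        return None, None
    if any(value not in (0, 1, 2, 3) for value in items):
        raise ValueError("items must be coded 0-3")
    total = item1 + item2
    return total, total >= cutoff


# Example answers (invented for illustration).
phq2_result = screen_score(2, 1)     # depression items
gad2_result = screen_score(0, 1)     # anxiety items
missing_result = screen_score(None, 2)
```

    A missing answer is returned as (None, None) instead of being treated as zero, so that incomplete interviews are not silently counted as negative screens.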

    The interviews can be linked directly to the detailed longitudinal surveillance data in the three nodes. This provides context for the observations on depression and anxiety across the Covid-19 epidemic in South Africa, from its start through several waves of infection.

    These contextual data include:

    • Individual-level data on partnership status, educational attainment, and employment, including self-reported health status.
    • Household data of the households the respondents are members of, including household composition and whether the respondents' parents are co-resident with them.
    • Household socio-economic status.
    • Household asset status.
    • A large set of Covid-19 specific data, including vaccine acceptance and hesitancy data and the impact of Covid-19 measures on the household.
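    One way to attach this context to an interview is to find the surveillance record whose observation period contains the interview date. The sketch below does this with plain Python dictionaries. All field names (individual_id, start, end, interview_date, employment, phq2) and the example rows are hypothetical and chosen only for illustration; the actual variable names are listed in the Data Dictionary.

```python
from datetime import date

# Invented example rows; field names are assumptions, not SAPRIN variables.
episodes = [
    {"individual_id": 1, "start": date(2020, 1, 1), "end": date(2020, 12, 31),
     "employment": "employed"},
    {"individual_id": 1, "start": date(2021, 1, 1), "end": date(2021, 12, 31),
     "employment": "unemployed"},
]
interviews = [
    {"individual_id": 1, "interview_date": date(2021, 3, 15), "phq2": 3},
    {"individual_id": 2, "interview_date": date(2021, 3, 15), "phq2": 0},
]


def link_interviews(interviews, episodes):
    """Attach the employment status in force on the interview date."""
    by_person = {}
    for episode in episodes:
        by_person.setdefault(episode["individual_id"], []).append(episode)

    linked = []
    for interview in interviews:
        candidates = by_person.get(interview["individual_id"], [])
        # First episode whose inclusive [start, end] covers the interview date.
        match = next(
            (ep for ep in candidates
             if ep["start"] <= interview["interview_date"] <= ep["end"]),
            None,
        )
        linked.append(
            {**interview, "employment": match["employment"] if match else None}
        )
    return linked


linked = link_interviews(interviews, episodes)
```

    Interviews with no covering episode keep a None value, which makes unmatched records visible rather than dropping them.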

    The Covid-19 specific interviews were conducted from May 2020 until July 2023. Some participants were interviewed more than once at different points in time, allowing for the analysis of temporal effects.

    The SAPRIN Mental Health Datasets 2024 consist of six types of demographic surveillance dataset:

    1. SAPRIN Individual exposure episodes. This dataset splits the basic surveillance episodes at calendar year-end and at the date when the age in years of an individual changes (the birthday). For women who have given birth, episodes are also split at the time of delivery.

    2. SAPRIN Individual status observations. This dataset consists of status observations, such as the education, employment and partnership status of an individual, that recur at more or less regular intervals per individual over the study period.

    3. SAPRIN household status observations. This dataset consists of socio-economic status observations for a household. These data are collected from a household proxy respondent, preferably the head of household or otherwise the next available senior adult resident household member, at more or less regular intervals over the duration of the study.

    4. SAPRIN household asset status observations. This dataset consists of asset status observations for a household. These data are collected from a household proxy respondent, preferably the head of household or otherwise the next available senior adult resident household member, at more or less regular intervals over the duration of the study.

    5. SAPRIN individual COVID-19. This dataset consists of Covid-19 related status observations pertaining to Covid-19 diagnosis, vaccination status, attitudes to vaccination, and the PHQ-2 and GAD-2 mental health questions.

    6. SAPRIN household COVID-19. This dataset consists of Covid-19 related household-level status observations, household awareness, and the impact of Covid-19 control measures on the household.

    Kind of Data

    Event history data

    Unit of Analysis

    Individual and household interviews

    Version

    Version Description

    v1: Dataset for public distribution.

    Version Date

    2024-12-02

    Version Notes

    v1: Dataset for public distribution.

    Scope

    Notes

    Each record in the exposure dataset represents a period of observation for an individual during which all the recorded characteristics of the individual stay constant. For example, on the birthday of the individual a new episode will start, because the age of the individual has changed. Any change in one of the status values, such as education or marital status, will likewise result in a new episode on the date of the change. For the COVID-19 data, the questionnaire included both household-level and individual-specific questions, the latter of which could be directly addressed by other household members if they were present. The primary respondent acted as a proxy in all other cases. COVID-19 symptom screening was included in the questionnaire.
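    The splitting rule described above can be sketched as follows for the two time-driven boundaries, the start of a new calendar year and the individual's birthday. This is a minimal illustration, not the SAPRIN production code: the function name and the inclusive start and end dates are assumptions, and splits caused by status changes or deliveries are left out.

```python
from datetime import date, timedelta


def split_episode(start: date, end: date, birth_date: date):
    """Split the inclusive period [start, end] at each 1 January and birthday.

    Within every returned (piece_start, piece_end) pair, the calendar year
    and the age in completed years are constant.
    """
    cut_dates = set()
    for year in range(start.year, end.year + 1):
        cut_dates.add(date(year, 1, 1))
        try:
            cut_dates.add(date(year, birth_date.month, birth_date.day))
        except ValueError:
            # Born on 29 February: assume the birthday falls on 1 March
            # in non-leap years (an assumption made for this sketch).
            cut_dates.add(date(year, 3, 1))

    # A new piece starts on each cut date strictly after the episode start.
    boundaries = sorted(cut for cut in cut_dates if start < cut <= end)

    pieces = []
    piece_start = start
    for cut in boundaries:
        pieces.append((piece_start, cut - timedelta(days=1)))
        piece_start = cut
    pieces.append((piece_start, end))
    return pieces


pieces = split_episode(date(2020, 6, 1), date(2021, 8, 31), date(1990, 7, 15))
```

    For the example call, the single period from 1 June 2020 to 31 August 2021 is split at 15 July 2020, 1 January 2021 and 15 July 2021, giving four pieces.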

    Topics
    Topic
    Mental Health, Covid-19
    Keywords
    Mental Health, Covid-19

    Coverage

    Geographic Coverage

    SAPRIN (South African Population Research Infrastructure Network) is a network of seven Health and Demographic Surveillance System (HDSS) nodes located in South Africa, namely:
    1) The MRC/Wits University Agincourt HDSS in Bushbuckridge District, Mpumalanga, which has collected data since 1993. The nodal website is http://www.agincourt.co.za.
    2) The University of Limpopo DIMAMO HDSS in the Capricorn District of Limpopo, which has collected data since 1996. The nodal website is: N/A.
    3) The Africa Health Research Institute (AHRI) HDSS in uMkhanyakude District, KwaZulu-Natal, which has collected data since 2000. The nodal website is http://www.ahri.org.
    4) The Gauteng Research Triangle Initiative for the Study of Population, Infrastructure and Regional Economic Development (GRT-INSPIRED) in Hillbrow, Johannesburg, and Atteridgeville and Melusi, Tshwane, Gauteng. The nodal website is: N/A.
    5) The Cape Town Surveillance through Healthcare Action Research Project (C-SHARP), Nomzamo and Bishop Lavis, Cape Town, Western Cape. The nodal website is: N/A.
    6) The Umlazi Surveillance Initiative to Nurture Grassroots Action (USINGA) HDSS in Umlazi Ward 79 and 82, KwaZulu-Natal. The nodal website is https://usinga.ukzn.ac.za.
    7) The Bafokeng Health & Demographic Surveillance System (BAMMISHO) HDSS in Bojanala District, North West. The nodal website is: N/A.

    Universe

    Households resident in dwellings within the study area are eligible for inclusion in the household component of SAPRIN. All individuals identified by the household proxy informant as members of the household are enumerated. A resident household member is an individual who intends to sleep at the dwelling occupied by the household for the majority of the time over a four-month period. Households include resident and non-resident members. An individual is a non-resident member if they have close ties to the household but do not physically reside with the household most of the time. Such individuals can also be called temporary migrants, and they are enumerated within the household list. Because household membership is not tied to physical residency, an individual may be a member of more than one household.

    Producers and sponsors

    Primary investigators
    Name                          Affiliation
    Dr Molulaqhooa Linda Maoyi    SAPRIN
    Tinofa Mutevedzi              SAPRIN
    Prof Mark Collinson           SAPRIN
    Prof Steve Tollman            Agincourt
    Dr Joseph Tlouyamma           DIMAMO
    Prof Collins Iwuji            AHRI
    Dr Kobus Herbst               SAPRIN
    Producers
    Name                          Affiliation    Role
    Dr Molulaqhooa Linda Maoyi    SAPRIN         Technical Assistance
    Augustine Khumalo             SAPRIN         Technical Assistance
    Tinofa Mutevedzi              SAPRIN         Technical Assistance
    Prof Mark Collinson           SAPRIN         Technical Assistance
    Dr Kobus Herbst               SAPRIN         Technical Assistance
    Funding Agency/Sponsor
    Name                                                Abbreviation    Role
    Department of Science, Technology and Innovation    DSTI            Current Funder
    Other Identifications/Acknowledgments
    Name                                     Affiliation                              Role
    Agincourt Data Team                      Agincourt                                Providing Data
    DIMAMO Data Team                         DIMAMO                                   Providing Data
    AHRI Data Team                           AHRI                                     Providing Data
    Centre for High Performance Computing    Centre for High Performance Computing    Providing IT Infrastructure for Data Processing

    Sampling

    Sampling Procedure

    This dataset is not based on a sample but contains information from the complete demographic surveillance areas.

    Survey instrument

    Questionnaires

    The data in this repository are not the result of a single questionnaire. They are harmonised data from three different sites, collected longitudinally over more than twenty years using questionnaires that varied over time and between sites.

    Data collection

    Dates of Data Collection
    Start         End           Cycle
    1993-01-01    2023-12-31    Agincourt
    1996-01-01    2023-12-31    DIMAMO
    2000-01-01    2023-12-31    AHRI
    Time Method

    2020 - 2023

    Time periods
    Start date    End date      Cycle
    1993-01-01    2023-12-31    Agincourt
    1996-01-01    2023-12-31    DIMAMO
    2000-01-01    2023-12-31    AHRI
    Mode of data collection
    • Monthly data collection.
    Data Collection Notes

    In all the HDSS nodes, data are collected from a household proxy respondent, preferably the head of household or otherwise the next available senior adult resident household member, after trained fieldworkers have obtained informed consent. Respondents are informed of the purpose and confidentiality of the interview, their right to refuse participation or withdraw from the study, and that scientists will be given access to anonymised data to analyse and publish information. Informed consent was verbal in all HDSS nodes until 2016. Written informed consent started in 2017 in AHRI, 2018 in DIMAMO and 2019 in Agincourt. Until 2016 for Agincourt and AHRI, and 2017 for DIMAMO, data were collected through field-based 'paper and pen' personal interviews (PAPI), before changing to field-based computer-assisted personal interviews (CAPI). Since 2019, all SAPRIN HDSS nodes have collected data in 3 annual rounds over a 45-week data collection schedule: one field-based CAPI round, with a call-centre-based computer-assisted telephonic interview (CATI) round on either side, creating 3 data points at intervals of approximately 4 months in each calendar year. In the past, HDSS nodes had different data collection frequencies. AHRI collected data in 2 PAPI rounds per year from inception to 2011, changing to 3 PAPI rounds per year between 2012 and 2016, before moving to 1 PAPI round and 2 CATI rounds from 2017. Agincourt and DIMAMO collected data once annually in a census-type format, over a 4-5-month period, until 2018.

    Data processing

    Data Editing

    The first step in the data preparation process is quality assurance. The SAPRIN Management hub team assesses the data submitted by the nodes to ensure that they are in the correct format and fall within expected value ranges. Other potential issues checked include missing data, incorrect data types, and unexpected duplicate or orphan records. Each node converts its operational database into a SAPRIN database, and the SAPRIN Management hub assesses this conversion by running both the original operational database and the SAPRIN database created from it through the SAPRIN QA data quality assessment and indicator process. The data quality checking process is conducted using the SAPRIN QA Julia code. The Julia code provides the Extract, Transform, and Load (ETL) capabilities that facilitate capturing, cleansing, and storing data in a uniform and consistent format that is accessible and relevant to end users. The principle of the data quality checks is that if the data conversion conducted by the nodes was complete and accurate, there should be little or no difference in the data quality and demographic indicators between the base and SAPRIN versions of the nodal data. If the data submitted by a node meet the criteria for inclusion in the consolidated dataset, the data move to the second step of the data production process. If the data fail the inclusion checks, further iterations of data submission and quality control follow until the SAPRIN Management hub is satisfied that the data are of high quality. To produce the final standard dataset, the data are processed using Julia code on the Centre for High Performance Computing cluster.
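    The kinds of checks listed above can be illustrated with a short sketch. This is not the SAPRIN QA Julia code, which is not reproduced in this record; it is a hypothetical Python example. The field names, the plausible birth-year range and the 1% tolerance are all assumptions made for illustration.

```python
def check_records(individuals, memberships, min_year=1880, max_year=2023):
    """Return (issue, individual_id) pairs for a few basic quality checks."""
    issues = []
    seen_ids = set()
    for row in individuals:
        person_id = row.get("individual_id")
        birth_year = row.get("birth_year")
        if person_id is None or birth_year is None:
            issues.append(("missing", person_id))        # missing data
            continue
        if not isinstance(birth_year, int):
            issues.append(("bad_type", person_id))       # incorrect data type
        elif not min_year <= birth_year <= max_year:
            issues.append(("out_of_range", person_id))   # outside expected range
        if person_id in seen_ids:
            issues.append(("duplicate", person_id))      # unexpected duplicate
        seen_ids.add(person_id)
    for membership in memberships:
        if membership["individual_id"] not in seen_ids:
            issues.append(("orphan", membership["individual_id"]))  # no parent row
    return issues


def indicators_agree(base, converted, tolerance=0.01):
    """Compare indicators from the base and converted databases.

    Returns, per indicator, whether the relative difference is within
    the tolerance. Both dictionaries must contain the same keys.
    """
    return {
        name: abs(converted[name] - base[name]) <= tolerance * abs(base[name])
        for name in base
    }


# Invented example data.
individuals = [
    {"individual_id": 1, "birth_year": 1990},
    {"individual_id": 1, "birth_year": 1990},
    {"individual_id": 2, "birth_year": "1985"},
    {"individual_id": 3, "birth_year": None},
]
memberships = [
    {"individual_id": 1, "household_id": 10},
    {"individual_id": 9, "household_id": 10},
]
issues = check_records(individuals, memberships)
agreement = indicators_agree(
    {"population": 1000, "deaths": 12},
    {"population": 1000, "deaths": 13},
)
```

    The second function mirrors the stated principle that a complete and accurate conversion should leave demographic indicators essentially unchanged between the two versions of the data.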

    Data appraisal

    Estimates of Sampling Error

    Not Applicable

    Data Access

    Access authority
    Name                       Affiliation    URL                         Email
    Molulaqhooa Linda Maoyi    SAPRIN         http://saprin.mrc.ac.za/    linda.maoyi@mrc.ac.za
    Kobus Herbst               SAPRIN         http://saprin.mrc.ac.za/    kobus.herbst@mrc.ac.za
    Access conditions

    This data is made available for access under the following conditions:

    1) The data and other materials provided by SAPRIN will not be redistributed or sold to other individuals, institutions, or organizations without the written agreement of SAPRIN.
    2) The data will be used for statistical and scientific research purposes only. They will be used solely for reporting of aggregated information, and not for investigation of specific individuals or organisations. The Data User will neither use nor permit others to use the data in any way other than listed in the original application (Analysis Plan) for access to the dataset.
    3) No attempt will be made to re-identify respondents, and no use will be made of the identity of any person or establishment discovered inadvertently. Any such discovery should immediately be reported to SAPRIN.
    4) No attempt will be made to produce links among datasets provided by SAPRIN, or among data from SAPRIN and other datasets that could identify individuals or organizations.
    5) The Data User will ensure that the data are kept in a secured environment and that only authorized users have access to the data.
    6) Any books, articles, conference papers, theses, dissertations, reports, or other publications that employ data obtained from SAPRIN will cite the source of data in accordance with the Citation Requirement provided with each dataset.
    7) An electronic copy of all reports and publications based on the requested data will be sent to SAPRIN.
    8) The original collector of the data, SAPRIN, and relevant funding agencies bear no responsibility for use of the data or for interpretations or inferences based upon such uses.
    9) Once the data set has served its indicated purpose it must be destroyed. If the dataset needs to be lodged for publication purposes, a reference (a digital object identifier will be maintained by SAPRIN for this purpose) to the original dataset on the SAPRIN data repository should be used. Derived or aggregated datasets produced from the original dataset do not fall within this provision and may be lodged as publication datasets. If the same dataset is needed for a different purpose, the dataset should be re-requested and the new purposes indicated.

    Citation requirements

    Maoyi, ML; Mutevedzi, T; Collinson, M; Herbst, K (2024): SAPRIN Mental Health Datasets 2024: Individual exposure episodes dataset. South African Population Research Infrastructure Network. https://doi.org/10.23667/SAPRIN.SMHDIEE2024

    Maoyi, ML; Mutevedzi, T; Collinson, M; Herbst, K (2024): SAPRIN Mental Health Datasets 2024: Individual status observations dataset. South African Population Research Infrastructure Network. https://doi.org/10.23667/SAPRIN.SMHDISO2024

    Maoyi, ML; Mutevedzi, T; Collinson, M; Herbst, K (2024): SAPRIN Mental Health Datasets 2024: Individual COVID-19 dataset. South African Population Research Infrastructure Network. https://doi.org/10.23667/SAPRIN.SMHDICS2024

    Maoyi, ML; Mutevedzi, T; Collinson, M; Herbst, K (2024): SAPRIN Mental Health Datasets 2024: Household status observations dataset. South African Population Research Infrastructure Network. https://doi.org/10.23667/SAPRIN.SMHDHHS2024

    Maoyi, ML; Mutevedzi, T; Collinson, M; Herbst, K (2024): SAPRIN Mental Health Datasets 2024: Household asset status observations dataset. South African Population Research Infrastructure Network. https://doi.org/10.23667/SAPRIN.SMHDHHAS2024

    Maoyi, ML; Mutevedzi, T; Collinson, M; Herbst, K (2024): SAPRIN Mental Health Datasets 2024: Household COVID-19 dataset. South African Population Research Infrastructure Network. https://doi.org/10.23667/SAPRIN.SMHDHHCS2024

    Disclaimer and copyrights

    Disclaimer

    The user of the data acknowledges that the original collector of the data and the relevant funding agencies bear no responsibility for the data's use or interpretation or inferences based upon it.

    Copyright

    This dataset documentation is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. The dataset is shared in terms of the data-use agreement accepted at the time of data download.

    Contacts

    Contacts
    Name                       Affiliation    Email                     URL
    Molulaqhooa Linda Maoyi    SAPRIN         linda.maoyi@mrc.ac.za     http://saprin.mrc.ac.za/
    Kobus Herbst               SAPRIN         kobus.herbst@mrc.ac.za    http://saprin.mrc.ac.za/

    Metadata production

    DDI Document ID

    DDI.SAPRIN.SMHD2024V1

    Producers
    Name                       Abbreviation    Affiliation    Role
    Molulaqhooa Linda Maoyi    MLM             SAPRIN         Documentation of Study and Review of the metadata
    Augustine Khumalo          AK              SAPRIN         Documentation of Study and Review of the metadata
    Tinofa Mutevedzi           TM              SAPRIN         Documentation of Study and Review of the metadata
    Mark Collinson             MC              SAPRIN         Documentation of Study and Review of the metadata
    Kobus Herbst               KH              SAPRIN         Documentation of Study and Review of the metadata
    Date of Metadata Production

    2024-12-02

    Metadata version

    DDI Document version

    Version 1 (December 2024)

    Hosted by the South African Medical Research Council

    © SAPRIN Data Repository, All Rights Reserved.