Save HILDA Stata data files as fst files

This function looks in a directory for HILDA data files with the .dta extension (the Stata binary data format) and saves them as fst files. The fst files are then used by hil_fetch() to load HILDA data.

Usage

hil_setup(read_dir, save_dir)

make_dict(read_dir, save_dir = NULL)

hil_dict()

Format

hil_dict() returns a data table with many rows and 3 variables:

var: (character()) variable names
wave: (list(integer())) waves in which the variable was recorded
label: (character()) short description of the variable
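
Because wave is a list column, filtering by wave needs an element-wise test rather than a plain comparison. The following is a minimal sketch on a toy table that only mimics the three documented columns; the variable names, waves, and labels are invented for illustration and are not real HILDA entries.

```r
library(data.table)

# Toy stand-in for the dictionary; every value here is made up.
toy_dict <- data.table(
  var   = c("var_a", "var_b"),
  wave  = list(1:3, c(2L, 5L)),
  label = c("First made-up variable", "Second made-up variable")
)

# Keep only the variables that were recorded in wave 5.
# `wave` is a list column, so test each element with vapply().
in_wave_5 <- toy_dict[vapply(wave, function(w) 5L %in% w, logical(1))]
print(in_wave_5$var)
```

The same pattern should apply to the table returned by hil_dict(), assuming its columns match the format described above.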

Arguments

read_dir: directory containing the HILDA files whose names match the regex pattern Combined_.*.dta.
save_dir: directory in which to save the HILDA files in fst format. This directory will be added to .Rprofile as hildar.vault.
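
To check which files in a directory the pattern would pick up, it can be tried with base R's list.files(). This is only a sketch: the file names below are hypothetical, and how the package applies the pattern internally is an assumption.

```r
# Create a throwaway directory containing empty files with made-up names.
demo_dir <- file.path(tempdir(), "hilda_regex_demo")
dir.create(demo_dir, showWarnings = FALSE)
demo_files <- c("Combined_a190c.dta", "Combined_b190c.dta",
                "Master_s190c.dta", "notes.txt")
invisible(file.create(file.path(demo_dir, demo_files)))

# List the files whose names match the documented regex pattern.
# Note that the unescaped "." in the pattern matches any character.
matched <- list.files(demo_dir, pattern = "Combined_.*.dta")
print(matched)
```

Only the two file names starting with Combined_ are returned; the other two are ignored.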

Value

make_dict() returns a data.table containing three columns: var, label, and wave. If save_dir is not NULL, the dictionary is also saved to that location.

Note

This function can take a long time to finish since each HILDA file is quite large. One option is to use the future package to choose a parallel backend before running hil_setup(). The following code chunk uses multisession, which creates as many background R sessions as there are workers.

library(future)
plan(multisession, workers = 2)

# `hil_setup()` can take several minutes to finish.
# To monitor its progress, you can wrap the function in
# `progressr::with_progress({...})` like below.
progressr::with_progress({
   hil_setup(read_dir = "...", save_dir = "...")
})

Examples

# HILDA data dictionary
if (FALSE) {
hil_dict()
}