This function looks in a directory for HILDA data files in .dta format,
the Stata binary data format, and saves them as fst files.
The fst files will be used by hil_fetch() for loading HILDA data.
Format
hil_dict() returns a data table with many rows and 3 variables:
- var
(character()) variable names
- wave
(list(integer())) waves in which the variable was recorded
- label
(character()) short description of the variable
Arguments
- read_dir
the directory containing the HILDA files whose names match the
Combined_.*.dta regex pattern.
- save_dir
a directory to save HILDA files in 'fst' format. This directory will be added to .Rprofile as
hildar.vault.
Value
make_dict() returns a data.table containing three
columns: var, label, and wave. If save_dir is not
NULL, the dict will be saved to that location.
Note
This function can take a long time to finish since each HILDA file
is quite large. One option is to use the future package to choose
your parallel backend before running hil_fetch(). The following
code chunk uses multisession, which creates as many background
R sessions as there are workers.
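
A minimal sketch of that backend setup, assuming the future package is installed. The worker count of 2 is an illustrative choice rather than a value taken from this documentation, and the call that performs the conversion is left as a placeholder comment.

```r
library(future)

# Select background R sessions as the parallel backend.
# workers = 2 is an assumed value; adjust it to your machine.
plan(multisession, workers = 2)

# ... run the conversion function documented on this page here ...

# Optionally switch back to sequential processing afterwards.
plan(sequential)
```

More workers can shorten the run but each background session holds its own copy of the data it reads, so memory use grows with the worker count.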