@wbg-mde/r-factory

# Metadata Editor Import (r-factory)

R integration module of the Metadata Editor application. This module contains R scripts, Node-based R utility methods, and test cases. It passes data from Node.js to R using the [PanApps](http://services.panapps.co/) customized version of [r-script](https://www.npmjs.com/package/@wbg-mde/r-script).

The module includes data import and export features as well as data analysis features such as resequencing variables and spreading metadata. Features:

- import dataset from file formats `SPSS / STATA / CSV`
- export dataset to different file formats
- destring string variables
- resequence variables
- spread metadata
- export to dictionary
- update variable status
- calculate variable statistics

### Prerequisites

Install Node and R if they are not already installed. Set the environment variable on Windows.

- Node 10.15.1
- R version 3.3.3

Check that the R packages below are installed and at the listed versions. If a package is missing, install it using the command `install.packages("package_name")`.

##### R packages

- jsonlite (version: 1.3)
- haven (version: 1.1.0)
- plyr (version: 1.8.4)
- stringr (version: 1.2.0)
- labelled (version 1.0.0)
- readr (version 1.1.1)

### Installation

Install the dependencies and devDependencies.

```sh
npm install
```

Build the application

```sh
npm run build
```

Test the application

```sh
npm run test
```

Publish the application to npm

```sh
npm publish --access public
```

### Running the tests

Unit tests are written for each feature. You can copy input files to the `test-data/input` directory. The commands to run the unit tests are listed below.

Note: Start the editor before running the tests. The editor starts the OpenCPU API server, which the unit tests use.

#### `npm run test`

Unit test to check the dataset import/export functionality. Keep only the dataset files to be tested in the `test-data/input/dataset` folder and remove other files.
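Before running `npm run test`, it can help to see which files in the input folder are not datasets and would need to be removed. The sketch below is not part of this module. The function name `partitionDatasetFiles`, the sample listing (apart from the two `.dta` file names mentioned in this README), and the extension list `.sav` / `.dta` / `.csv` for SPSS / STATA / CSV are assumptions made for illustration; the module itself may accept other extensions.

```javascript
// Sketch: split a folder listing into dataset files and everything else.
const path = require('path');

// Assumed extensions for the SPSS / STATA / CSV formats named in this README.
const DATASET_EXTENSIONS = new Set(['.sav', '.dta', '.csv']);

// Hypothetical helper: takes an array of file names and returns
// { datasets, others }, comparing extensions case-insensitively.
function partitionDatasetFiles(fileNames) {
  const datasets = [];
  const others = [];
  for (const name of fileNames) {
    const ext = path.extname(name).toLowerCase();
    (DATASET_EXTENSIONS.has(ext) ? datasets : others).push(name);
  }
  return { datasets, others };
}

// Example with a hypothetical listing of the input folder.
const result = partitionDatasetFiles([
  'ghs_2015_person_v1.1_20160608.dta',
  'cs1_pupil.dta',
  'survey.SAV',
  'notes.txt',
  '.DS_Store',
]);

console.log(result.others); // → [ 'notes.txt', '.DS_Store' ]
```

To check a real folder, pass it the result of `fs.readdirSync(...)` for the input directory; anything reported under `others` is a candidate for removal before the test run.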
Known issues:

- Some datasets may fail the unit tests due to labelled integer validation while exporting to the STATA dataset format (e.g. `cs1_pupil.dta`).

Flow of test execution:

- import datasets from `test-data/input/datasets` directory
- export the imported files to `test-data/output/datasets`
- import the exported datasets

#### `npm run test:resequence`

Unit test to import datasets and perform resequence on the imported datasets. Drop the dataset files to be tested in the `test-data/input/dataset` folder and run the command.

Flow of test execution:

- import datasets from `test-data/input/datasets` directory
- perform resequence and write the updated variable JSON file to `test-data/output/json` directory

#### `npm run test:destring`

Unit test to check the destring functionality on the imported file. Because the variables to be destringed must be specified, the test is limited to a particular dataset, "ghs_2015_person_v1.1_20160608.dta". Keep this file in the input folder and remove the others while running the test.

Flow of test execution:

- import datasets from `test-data/input/datasets` directory
- perform destring on the selected variables and write the updated CSV file to `test_data/output/csv/` directory

#### `npm run test:dictionary`
#### `npm run test:dictionary:stata`
#### `npm run test:dictionary:spss`

Unit test for export to dictionary format.

Flow of test execution:

- import datasets from `test-data/input/datasets` directory
- export dataset to `test_data/data-dictionary/` directory

#### `npm run test:validateKey`

Unit test to check the unique key constraint for the given key variable of a dataset.

Steps:

- Copy the data file to `test-data/input/datasets` directory.
- Set `datasetname` and `keyVariables` in the constructor method in `dist/test/validation.unit.test.js`.
- Run the command.

Flow of test execution:

- import dataset from `test-data/input/datasets` directory
- validate the key variables

### Contributors

- Navin VI (navin.v.i@panapps.co)
- Anoop Xaviour (anoopx@panapps.co)
- Libin Thomas (libint@panapps.co)

License
----

MIT