Text file I/O

You can load the contents of a data set from a text file by calling DSL_dataset::ReadFile. For illustration purposes, let us assume that we have a comma-separated text file with the following data:

VarA,VarB,VarC

44.225,3,StateZ

26.913,0,StateY

24.379,2,*

*,3,StateX

76.681,*,StateZ

44.702,1,StateX

After the DSL_dataset::ReadFile call, the data set will be structured as below:

Variable VarA is continuous. The value missing in the fourth record was replaced by sqrt(-1), a not-a-number (NaN) value.

Variable VarB is discrete. The value missing in the fifth record was replaced by -1.

Variable VarC is also discrete and has string labels associated with its integer values. The value missing in the third record was replaced by -1.
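The substitution rules described above can be imitated in a few lines of standalone C++. The sketch below does not call SMILE and is not how the library is implemented. The helper names (parseContinuous, parseDiscrete, isMissingContinuous, LabelColumn) are hypothetical, and assigning integer indices to labels in order of first appearance is an assumption made only for illustration.

```cpp
#include <cmath>
#include <limits>
#include <string>
#include <vector>

// Hypothetical helpers for illustration only; none of this is SMILE API.
// Missing-value marker used in the example file.
const std::string kMissingToken = "*";

// Continuous field: a missing value becomes NaN (what sqrt(-1) yields).
float parseContinuous(const std::string &field) {
    if (field == kMissingToken)
        return std::numeric_limits<float>::quiet_NaN();
    return std::stof(field);
}

// NaN never compares equal to anything, itself included, so a missing
// continuous value has to be detected with std::isnan rather than ==.
bool isMissingContinuous(float value) {
    return std::isnan(value);
}

// Discrete integer field: a missing value becomes -1.
int parseDiscrete(const std::string &field) {
    if (field == kMissingToken)
        return -1;
    return std::stoi(field);
}

// Discrete field with string labels. Each distinct label receives an
// integer index in order of first appearance (an assumption of this sketch).
struct LabelColumn {
    std::vector<std::string> labels; // index -> label
    std::vector<int> values;         // one integer per record

    void add(const std::string &field) {
        if (field == kMissingToken) {
            values.push_back(-1);
            return;
        }
        for (std::size_t i = 0; i < labels.size(); ++i) {
            if (labels[i] == field) {
                values.push_back(static_cast<int>(i));
                return;
            }
        }
        labels.push_back(field);
        values.push_back(static_cast<int>(labels.size()) - 1);
    }
};
```

Feeding the VarC column of the example file to LabelColumn::add one field at a time gives the values 0, 1, -1, 2, 0, 2 under this assumed ordering, with three labels stored.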

You can fine-tune the parsing by passing a DSL_datasetParseParams structure to ReadFile. Use the following code if the data file has no header line with column names, missing values are marked by the string "N/A", and missing values in discrete columns should be replaced by 999:

DSL_dataset ds;

DSL_datasetParseParams params;

params.columnIdsPresent = false; // the file has no header line

params.missingValueToken = "N/A";

params.missingInt = 999;

int res = ds.ReadFile("datafile.txt", &params);

To write the contents of the data set to a text file, use DSL_dataset::WriteFile. You can customize the field separator, the missing value marker, etc. by passing a DSL_datasetWriteParams structure to WriteFile. The code snippet below writes a comma-separated file, which includes a header line with column names and uses "(none)" as a marker for missing values:

DSL_dataset ds;

// create or load dataset here

DSL_datasetWriteParams params;

params.columnIdsPresent = true; // write a header line with column names

params.missingValueToken = "(none)";

params.separator = ',';

int res = ds.WriteFile("datafile.csv", &params);
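To make the effect of the separator and the missing-value marker concrete, here is a minimal standalone sketch of how one record of continuous values could be turned into a line of text. It does not use SMILE, and the function formatRecord is a hypothetical helper. It also assumes that missing continuous values are stored as NaN and that numbers are printed with the default stream formatting.

```cpp
#include <cmath>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper, not part of SMILE: joins one record of continuous
// values into a single line, writing missingToken in place of NaN entries.
std::string formatRecord(const std::vector<float> &record,
                         char separator,
                         const std::string &missingToken) {
    std::ostringstream out;
    for (std::size_t i = 0; i < record.size(); ++i) {
        if (i > 0)
            out << separator;      // separator goes between fields only
        if (std::isnan(record[i]))
            out << missingToken;   // missing value marker
        else
            out << record[i];      // default stream formatting
    }
    return out.str();
}
```

Under these assumptions, a record holding 44.225, a missing value, and 1, formatted with ',' as the separator and "(none)" as the marker, comes out as 44.225,(none),1.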