Hello,
How does SMILE's algorithm recognize whether a column's type is continuous or discrete?
Thanks,
Boris
Continuous/Discrete
-
- Site Admin
- Posts: 1460
- Joined: Mon Nov 26, 2007 5:51 pm
Re: Continuous/Discrete
The variables (data columns) in the DSL_dataset can be created with AddIntVar or AddFloatVar - this automatically marks the column as discrete or continuous, respectively. The parser used by DSL_dataset::ReadFile treats the column as continuous when at least one value is non-integer.
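To make the rule concrete, here is a minimal standalone sketch in Python. It is not SMILE code and does not call the SMILE API; the function name `infer_column_type` is hypothetical, and treating a value such as "1.0" as an integer is an assumption of this sketch, not a documented behavior of the ReadFile parser.

```python
def infer_column_type(raw_values):
    """Classify a column of text values as 'discrete' or 'continuous'.

    Illustrative rule only: the column is continuous as soon as one
    value is non-integer; otherwise it is discrete.
    Assumption: every value parses as a number (no missing values).
    """
    for text in raw_values:
        value = float(text)
        if not value.is_integer():
            # A single non-integer value is enough to decide.
            return "continuous"
    return "discrete"


if __name__ == "__main__":
    print(infer_column_type(["1", "2", "3"]))
    print(infer_column_type(["1", "2.5", "3"]))
```

Under this rule the number of distinct values does not matter: a column holding only integers stays discrete no matter how many different integers it contains.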
-
- Posts: 24
- Joined: Thu Sep 30, 2010 7:48 pm
Re: Continuous/Discrete
shooltz wrote: The parser used by DSL_dataset::ReadFile treats the column as continuous when at least one value is non-integer.
Suppose the column contains only integers: 1, 2, 3, 4, 5, ..., 50, ..., 1000. Is this column considered continuous or discrete?
Thanks,
Boris
-
- Site Admin
- Posts: 1460
- Joined: Mon Nov 26, 2007 5:51 pm
Re: Continuous/Discrete
borisrabin wrote: Suppose the column contains only integers: 1, 2, 3, 4, 5, ..., 50, ..., 1000. Is this column considered continuous or discrete?
If you're asking about the type of the column as inferred by DSL_dataset::ReadFile, then the answer is 'discrete'.
-
- Posts: 24
- Joined: Thu Sep 30, 2010 7:48 pm
Re: Continuous/Discrete
shooltz wrote: If you're asking about the type of the column as inferred by DSL_dataset::ReadFile, then the answer is 'discrete'.
What is the "Discrete threshold" feature in GeNIe?
Is this feature enabled in SMILE with some default value?
Thanks,
Boris
-
- Site Admin
- Posts: 1460
- Joined: Mon Nov 26, 2007 5:51 pm
Re: Continuous/Discrete
borisrabin wrote: What is the "Discrete threshold" feature in GeNIe?
If the number of unique integer values in a data column is above the 'discrete threshold', the column is considered continuous. For example, the 'salar' column in retention.txt will be considered continuous despite containing only integer values.
borisrabin wrote: Is this feature enabled in SMILE with some default value?
No, this is a GeNIe feature; however, GeNIe uses the publicly available SMILearn API (DSL_dataset) to implement it. After the data is loaded into the dataset, but before learning starts, the discrete columns are checked against the 'discrete threshold' and, if required, converted to continuous.
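The threshold check described above can be sketched roughly as follows. This is a standalone Python illustration, not GeNIe's implementation and not the SMILearn API: the function name `apply_discrete_threshold`, the dictionary layout, and the threshold value of 20 in the usage lines are all hypothetical.

```python
def apply_discrete_threshold(columns, types, threshold):
    """Return a copy of `types` with wide discrete columns made continuous.

    columns:   dict mapping column name -> list of integer values
    types:     dict mapping column name -> 'discrete' or 'continuous'
    threshold: maximum number of unique values a discrete column may have
    """
    result = dict(types)
    for name, values in columns.items():
        # Only discrete columns are checked; 'above' is read as strictly greater.
        if types[name] == "discrete" and len(set(values)) > threshold:
            result[name] = "continuous"
    return result


if __name__ == "__main__":
    columns = {"grade": [1, 2, 3, 2, 1], "amount": list(range(1, 1001))}
    types = {"grade": "discrete", "amount": "discrete"}
    # The value 20 is an arbitrary example, not a known GeNIe default.
    print(apply_discrete_threshold(columns, types, threshold=20))
```

To get similar behavior when using SMILE directly, you would have to run an equivalent check on the dataset yourself after loading and before learning.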