Parameters learning with jSmile

dtodor
Posts: 13
Joined: Tue Feb 03, 2009 8:30 am

Post by dtodor »

Hi,

I'm using Visual C++ Express Edition (Version 9.0.30729.1 SP):


Microsoft Visual Studio 2008
Version 9.0.30729.1 SP
Microsoft .NET Framework
Version 3.5 SP1

Installed Edition: VC Express

Microsoft Visual C++ 2008 91909-152-0000052-60665
Microsoft Visual C++ 2008

Hotfixes for Microsoft Visual C++ 2008 Express Edition with SP1 - ENU:
KB945282 (http://support.microsoft.com/kb/945282)
KB946040 (http://support.microsoft.com/kb/946040)
KB946308 (http://support.microsoft.com/kb/946308)
KB947540 (http://support.microsoft.com/kb/947540)
KB947789 (http://support.microsoft.com/kb/947789)
KB948127 (http://support.microsoft.com/kb/948127)



Here is the code of the test application:


#include <cstdio>
#include <iostream>
#include <string>
#include <vector>

#include "smile.h"
#include "smilearn.h"

int main (int argc, char * const argv[]) {
	
	DSL_network network;
	if (network.ReadFile("abc-h-def-network-randomized.xdsl") != DSL_OKAY) {
		std::cout << "Unable to read network\n";
		return -1;
	}
	std::cout << "1. Successfully read network\n";
	
	DSL_dataset dataset;
	if (dataset.ReadFile("abc-h-def-network-original.txt") != DSL_OKAY) {
		std::cout << "Unable to read data set\n";
		return -2;
	}
	std::cout << "2. Successfully read data set\n";
	
	std::string errMsg;
	std::vector<DSL_datasetMatch> matches;
	if (dataset.MatchNetwork(network, matches, errMsg) != DSL_OKAY) {
		std::cout << "Unable to match data set to network: " << errMsg << "\n";
		return -3;
	}
	std::cout << "3. Successfully calculated matches\n";
	
	for (unsigned int i = 0; i < matches.size(); i++) {
		const DSL_datasetMatch &m = matches[i];
		printf("%u col=%d slice=%d h=%d %s\n", i, m.column, m.slice, m.node, network.GetNode(m.node)->GetId());
	}
	
	std::vector<int> fixedNodes;
	
	DSL_em em;
	//em.SetEquivalentSampleSize(2);
	em.SetRandomizeParameters(true);
	if (em.Learn(dataset, network, matches) != DSL_OKAY) {
		// errMsg is only filled in by MatchNetwork, so report a fixed message here
		std::cout << "EM learning failed\n";
		return -4;
	}
	std::cout << "4. Successfully learnt network\n";
	
	if (network.WriteFile("abc-h-def-network-learnt.xdsl") != DSL_OKAY) {
		return -5;
	}
	std::cout << "5. Successfully written learnt network\n";
	
	for (int h = network.GetFirstNode(); h >= 0; h = network.GetNextNode(h)) 
	{ 
		printf("%d %s\n", h, network.GetNode(h)->GetId()); 
		const DSL_Dmatrix *mtx = network.GetNode(h)->Definition()->GetMatrix(); 
		for (int i = 0; i < mtx->GetSize(); i ++) 
		{ 
			printf("%g ", (*mtx)[i]); 
		} 
		printf("\n"); 
	} 
    
    return 0;
}

Under Mac OS X 10.5 I get the following results:


	//em.SetEquivalentSampleSize(2);
	em.SetRandomizeParameters(true);
1. Successfully read network
2. Successfully read data set
3. Successfully calculated matches
0 col=0 slice=0 h=0 A
1 col=1 slice=0 h=1 B
2 col=2 slice=0 h=2 C
3 col=3 slice=0 h=3 H
4 col=4 slice=0 h=4 D
5 col=5 slice=0 h=5 E
6 col=6 slice=0 h=6 F
4. Successfully learnt network
5. Successfully written learnt network
0 A
0.5005 0.4995
1 B
0.4925 0.5075
2 C
0.5275 0.4725
3 H
0.195312 0.804688 0.518448 0.481552 0.584179 0.415821 0.86999 0.13001 0.0174197 0.98258 0.808501 0.191499 0.162883 0.837117 0.782767 0.217233
4 D
0.201708 0.798292 0.677407 0.322593
5 E
0.588047 0.411953 0.643097 0.356903
6 F
0.323372 0.676628 0.386902 0.613098


	em.SetEquivalentSampleSize(2);
	em.SetRandomizeParameters(false);
1. Successfully read network
2. Successfully read data set
3. Successfully calculated matches
0 col=0 slice=0 h=0 A
1 col=1 slice=0 h=1 B
2 col=2 slice=0 h=2 C
3 col=3 slice=0 h=3 H
4 col=4 slice=0 h=4 D
5 col=5 slice=0 h=5 E
6 col=6 slice=0 h=6 F
4. Successfully learnt network
5. Successfully written learnt network
0 A
0.500759 0.499241
1 B
0.492636 0.507364
2 C
0.527896 0.472104
3 H
0.195727 0.804273 0.518448 0.481552 0.584179 0.415821 0.86999 0.13001 0.0174197 0.98258 0.808501 0.191499 0.162883 0.837117 0.782767 0.217233
4 D
0.20216 0.79784 0.677407 0.322593
5 E
0.588571 0.411429 0.643097 0.356903
6 F
0.324728 0.675272 0.386902 0.613098



The results for Windows XP SP3 are as follows:


	//em.SetEquivalentSampleSize(2);
	em.SetRandomizeParameters(true);
1. Successfully read network
2. Successfully read data set
3. Successfully calculated matches
0 col=0 slice=0 h=0 A
1 col=1 slice=0 h=1 B
2 col=2 slice=0 h=2 C
3 col=3 slice=0 h=3 H
4 col=4 slice=0 h=4 D
5 col=5 slice=0 h=5 E
6 col=6 slice=0 h=6 F
4. Successfully learnt network
5. Successfully written learnt network
0 A
0.5005 0.4995
1 B
0.4925 0.5075
2 C
0.5275 0.4725
3 H
0.195313 0.804688 0.725738 0.274262 0.361217 0.638783 0.289796 0.710204 0.172932 0.827068 0.884956 0.115044 0.451852 0.548148 0.763713 0.236287
4 D
0.201708 0.798292 0.703669 0.296331
5 E
0.588047 0.411953 0.478833 0.521167
6 F
0.323372 0.676628 0.0903104 0.90969


	em.SetEquivalentSampleSize(2);
	em.SetRandomizeParameters(false);
1. Successfully read network
2. Successfully read data set
3. Successfully calculated matches
0 col=0 slice=0 h=0 A
1 col=1 slice=0 h=1 B
2 col=2 slice=0 h=2 C
3 col=3 slice=0 h=3 H
4 col=4 slice=0 h=4 D
5 col=5 slice=0 h=5 E
6 col=6 slice=0 h=6 F
4. Successfully learnt network
5. Successfully written learnt network
0 A
0.500759 0.499241
1 B
0.492636 0.507364
2 C
0.527896 0.472104
3 H
0.195727 0.804273 0.724004 0.275996 0.362899 0.637101 0.294494 0.705506 0.171772 0.828228 0.884285 0.115715 0.449727 0.550273 0.763873 0.236127
4 D
0.20216 0.79784 0.70362 0.29638
5 E
0.588571 0.411429 0.479142 0.520858
6 F
0.324728 0.675272 0.0908674 0.909133


Thanks for the support!!!
Attachments
abc-h-def-network-randomized.xdsl (network, 3.11 KiB)
abc-h-def-network-original.txt (training data, 49.21 KiB)
shooltz[BayesFusion]
Site Admin
Posts: 1417
Joined: Mon Nov 26, 2007 5:51 pm

Post by shooltz[BayesFusion] »

Due to issues with our internal network, I'm unable to connect to the machine that hosts the VM with Darwin, so I can't directly verify your results. However, I checked the numbers you reported on XP; they match those I'm getting on my Windows box with Visual C++ and cygwin. Output from the program running on FreeBSD is also the same.

While I'm waiting for the admins to complete their job, can you check the gcc version on your OSX computer with gcc --version? Also, what command line are you using to build the program?
Post by dtodor »

I'm actually using XCode with gcc 4:

i686-apple-darwin9-g++-4.0.1 (GCC) 4.0.1 (Apple Inc. build 5493)

I've also tried building from the command line:

g++ -O3 -DNDEBUG -ffast-math main.cpp -o test -Lsmile -Ismile -lsmilearn -lsmile

I'm getting exactly the same results as when using XCode, and these are different from what I get under Windows.
Post by shooltz[BayesFusion] »

I have just confirmed your findings on Darwin running in VM (so it's certainly not an issue of OS/compiler version mismatch). We'll be trying to identify the reason for the output discrepancy and will post here as soon as any information is available.
Post by shooltz[BayesFusion] »

At this point it looks like an issue with gcc's optimizer. The output is incorrect only if SMILE/SMILearn is built with the -O1/-O2/-O3 options. However, these options do not cause problems when gcc is used on other platforms (we don't have a system with gcc 4.0.x, though).
Post by dtodor »

Could you provide a build without any optimizations?
Post by shooltz[BayesFusion] »

dtodor wrote: Could you provide a build without any optimizations?
We'll do that as a last resort. Currently we're trying to determine where the code paths diverge on different platforms.
Post by shooltz[BayesFusion] »

dtodor wrote: Could you provide a build without any optimizations?
We've found the loop which is executed incorrectly when compiled with optimizations enabled on Darwin. It's quite easy to work around this gcc 4.0.0 bug, and we'll post a new OSX build, compiled with -O3, later this week.
Post by dtodor »

Thanks for the support! Would it be possible to make an x86_64 build as well?
Post by shooltz[BayesFusion] »

dtodor wrote: Thanks for the support! Would it be possible to make an x86_64 build as well?
I believe gcc 4.0.0 on Darwin 8.0.1 only supports the i386 and ppc architectures. If you can point me to a downloadable Darwin installer CD with gcc support for x86-64, I can try to set it up in a VM later this week.
Post by shooltz[BayesFusion] »

SMILE binaries for OSX have been refreshed. This is a build with -O3.