Genie Sampling and fixed nodes

techmec
Posts: 21
Joined: Sun Feb 05, 2012 2:53 pm

Genie Sampling and fixed nodes

Post by techmec »

Hi all,

First of all: I'm a newbie.
My problem:
I built a Bayes Net and want to evaluate the sampling algorithms.

I have some nodes whose distributions I'm sure about, so I set soft evidence on them; the data is known to me. Now I want to sample the other nodes, but I can't find a way to fix the nodes I do not want to sample... the algorithms always sample all of them...

Maybe I'm understanding it the wrong way; I would be glad to get some input...


EDIT: Is something wrong in my thinking?
Thanks,
techmec
techmec
Posts: 21
Joined: Sun Feb 05, 2012 2:53 pm

Re: Genie Sampling and fixed nodes

Post by techmec »

Hi,

it's me again...

Maybe I will understand it better if I ask this:

What is the benefit of the "mandatory" nodes?

Thank you

techmec
Martijn
Posts: 76
Joined: Sun May 29, 2011 12:23 am

Re: Genie Sampling and fixed nodes

Post by Martijn »

Hi techmec,

I'm not quite sure what you are trying to accomplish.
The term mandatory nodes doesn't really mean anything to me.

Could you describe step by step what you are doing at the moment?

Thanks,

Martijn
techmec
Posts: 21
Joined: Sun Feb 05, 2012 2:53 pm

Re: Genie Sampling and fixed nodes

Post by techmec »

Hihi,

thanks for the reply!
Unfortunately, I don't think I understand the sampling approach.

I modeled a Bayes net. My aim is (is this possible?) to sample the joint distribution of this net, but the local distributions of some nodes are fixed, because I know them (they are outputs of a Gaussian mixture model). I am trying to find a way in GeNIe or SMILE to fix node values with respect to the sampling procedures...
Maybe I have to structure the net in another way... I don't know. All nodes are chance nodes.

The only thing I found was the "mandatory" option in the node properties, which I thought meant that the values are given beforehand (as priors) and therefore cannot change during the inference process. But that's not how they behave: the values are changeable, so I don't understand that parameter.

Any input on any of these points will help me improve... Thanks!


techmec
Martijn
Posts: 76
Joined: Sun May 29, 2011 12:23 am

Re: Genie Sampling and fixed nodes

Post by Martijn »

Hi techmec,

No, it's not possible to sample the joint distribution.
The sampling algorithms are used for inference.

You can set a node to a value by setting evidence, but this means setting it to a state; I'm not sure if that's what you mean.

The mandatory option you found is used for something else.
GeNIe has a diagnosis mode, and marking a node as mandatory means it is automatically added to the list of nodes used for diagnosis.

I hope this clears up a few things. If you need more help, please let me know.

Best,

Martijn
techmec
Posts: 21
Joined: Sun Feb 05, 2012 2:53 pm

Re: Genie Sampling and fixed nodes

Post by techmec »

Hi,

Mandatory nodes -> I understand now. Thanks.

But the second point is not clear to me... The aim of inference is to get the joint probability in order to estimate the posterior probability. One way is exact inference; the other is to approximate the distribution, e.g. by sampling...
This is my understanding.

No? Please explain it to me. I have to get it.

Greetz
techmec
Martijn
Posts: 76
Joined: Sun May 29, 2011 12:23 am

Re: Genie Sampling and fixed nodes

Post by Martijn »

Well, yes, if you have the joint probability distribution you can do all the computations you would ever need, but its size grows exponentially, and for anything beyond the usual toy problems it grows beyond what can be stored on one computer, or even on multiple computers altogether. With just 32 binary nodes you already need 2^32 entries for the joint distribution, so it's simply not a viable option to rely on it for the calculations.
This is one of the reasons Bayesian networks were used in the first place: you exploit conditional independences in the data so you no longer need to assign probabilities to all combinations of variables.
A product of conditional probabilities now represents an entry of the joint probability distribution. So now you have a lot of small tables that can be used to do all the calculations without needing the enormous amount of space required to represent the whole joint.
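To make the size argument concrete, here is a rough illustrative sketch in plain Python (not the SMILE/GeNIe API); the chain-structured network is just a hypothetical example:

# Illustrative sketch: size of a full joint distribution over n binary
# variables vs. a factored, chain-structured BN (X1 -> X2 -> ... -> Xn).
n = 32                              # number of binary nodes
joint_entries = 2 ** n              # one probability per joint configuration
factored_entries = 2 + (n - 1) * 4  # root CPT (2) + each child CPT (2 states x 2 parent states)
print(f"full joint:     {joint_entries:,} entries")   # 4,294,967,296
print(f"factored chain: {factored_entries} entries")  # 126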

Exact inference calculates the posterior exactly, and with sampling you indeed approximate it. From a time complexity point of view both are equally "bad"; sometimes, depending on the network structure, you may prefer one over the other in terms of speed.

Are you familiar with how some of the BN inference algorithms work? Like Clustering (i.e. Lauritzen), or perhaps Variable Elimination?

Martijn
techmec
Posts: 21
Joined: Sun Feb 05, 2012 2:53 pm

Re: Genie Sampling and fixed nodes

Post by techmec »

Hi,

well, I thought I was. :)

But I really have problems modeling or using my BN.
I have measured evidence for two of my twenty nodes, so I'm really sure about them.
When I then use the sampling algorithms, they vary the values of my evidence nodes.
This is what I don't understand, and that's why I asked about the ability to fix nodes.
I don't know how to explain it better.

In my opinion it should be possible to restrict the sampling algorithms to certain nodes, or to exclude nodes from the sampling.
There must be an error in my thinking somewhere.

Greetz,
techmec
Martijn
Posts: 76
Joined: Sun May 29, 2011 12:23 am

Re: Genie Sampling and fixed nodes

Post by Martijn »

Just wondering something:

You always seem to be using soft evidence. Why is that?
techmec
Posts: 21
Joined: Sun Feb 05, 2012 2:53 pm

Re: Genie Sampling and fixed nodes

Post by techmec »

My evidence is measured by sensors. I am not really sure about the values; so to speak, they are really "beliefs".
E.g. some nodes are fed by image processing. If my detectors say they found some green regions in an image, then due to many effects the number of regions and the actual color can vary a little bit.
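For illustration only, this kind of detector output can be thought of as soft evidence, i.e. a belief over a node's states instead of a single observed state. A plain Python sketch with made-up node and state names (not the SMILE/GeNIe API):

# Illustrative sketch only (hypothetical node and state names, not the SMILE/GeNIe API):
# soft (virtual) evidence is a belief over a node's states rather than one observed state.
hard_evidence = {"RegionColor": "green"}             # a certain observation
soft_evidence = {"RegionColor": {"green": 0.80,      # detector is fairly sure it saw green
                                 "yellow": 0.15,
                                 "other": 0.05}}     # belief weights over the states
assert abs(sum(soft_evidence["RegionColor"].values()) - 1.0) < 1e-9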
Martijn
Posts: 76
Joined: Sun May 29, 2011 12:23 am

Re: Genie Sampling and fixed nodes

Post by Martijn »

Ok, let's summarize a few things; please correct me if I'm wrong.

- You have a network with discrete nodes (i.e. a discrete number of states for each node).
- Some of the data for these nodes comes from sensors and the observed states are soft evidence.
- You want to use an inference algorithm based on sampling, but you only want a few nodes to be sampled; the others should be fixed?
techmec
Posts: 21
Joined: Sun Feb 05, 2012 2:53 pm

Re: Genie Sampling and fixed nodes

Post by techmec »

yes :)
Martijn
Posts: 76
Joined: Sun May 29, 2011 12:23 am

Re: Genie Sampling and fixed nodes

Post by Martijn »

Ok, a question: why do you want these specific nodes sampled?

Why would an exact algorithm not be an option for these nodes?

Are there specific requirements you need or other special circumstances?
techmec
Posts: 21
Joined: Sun Feb 05, 2012 2:53 pm

Re: Genie Sampling and fixed nodes

Post by techmec »

I want to test the algorithms because I have to decide which algorithm I need for my application.
If the sampling algorithms (or one of them) are not as exact but are faster than the exact solution, then I will probably use them.
I have to prove it for a publication (a paper); I cannot just say: believe me, the exact solution is the best.


Or is it normal behavior that all nodes are taken into consideration?
I would have thought my use case is not that crazy.
Maybe I simply have to live with it.
Martijn
Posts: 76
Joined: Sun May 29, 2011 12:23 am

Re: Genie Sampling and fixed nodes

Post by Martijn »

The basic idea of sampling algorithms for BN inference is that you generate samples from the network, i.e. for every node you randomly pick a value depending on its (conditional) probability distribution and any evidence set (hard or soft).
You do this for a while and generate, say, 10000 samples.
From these samples you can then get counts that you can use to calculate the posterior distribution of interest.

So, let's say there's a node A that has values true and false and after 10000 samples we find that there are 6800 true and 3200 false.
We can then calculate P(A=true) = 6800 / 10000 = 0.68.

For a conditional probability, you can calculate P(A=true|B=true) as Counts(A=true AND B=true) / Counts(B=true).

So you always sample all the nodes, because you don't do any exact calculations at all; you just generate samples.
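As an illustration of this generate-and-count idea, here is a minimal plain-Python sketch for a hypothetical two-node network A -> B (it is not the SMILE implementation of these algorithms):

import random

# Minimal forward-sampling sketch for a hypothetical two-node network A -> B.
# Each sample draws A from its prior and then B from P(B | A); posteriors are
# estimated from the resulting counts.
P_A_TRUE = 0.7                               # P(A = true)
P_B_TRUE_GIVEN_A = {True: 0.9, False: 0.2}   # P(B = true | A)

def sample_once():
    a = random.random() < P_A_TRUE
    b = random.random() < P_B_TRUE_GIVEN_A[a]
    return a, b

samples = [sample_once() for _ in range(10000)]
count_a = sum(1 for a, b in samples if a)
count_b = sum(1 for a, b in samples if b)
count_ab = sum(1 for a, b in samples if a and b)

print("P(A=true)        ~", count_a / len(samples))  # close to 0.7
print("P(A=true|B=true) ~", count_ab / count_b)      # ratio of counts, as described above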

This is the gist of it; you should look up papers for details on the algorithms.

Does this fit with your idea of sampling?