The effect of fixed nodes in EM

TimoSamuel
Posts: 12
Joined: Tue Dec 20, 2016 12:49 pm

The effect of fixed nodes in EM

Post by TimoSamuel »

Dear reader,

I had a question that came up while learning my network with the EM algorithm.
First a small description of my case study:

I have three different datasets, of 312, 2184 and 1248 observations. I want each dataset to carry equal weight per observation in the network, so I set the confidence parameter to match each dataset's size: 312 with confidence parameter 312, 2184 with 2184, etc.

The observations consist of monthly time series covering 25 years, each representing a certain land use (forest, maize or pasture). These are divided into a short-term period (12.5 years) and a long-term period (12.5 years). Each time series stands for a change in land management actions, such as maize - zero tillage, maize - biomass application, etc.

Throughout the dataset, some variables remain constant (e.g. forest, maize, pasture), some vary between the 12.5-year periods (such as tillage practices and biomass application), and some vary within the time series and are therefore discretized (such as DOC_store).

Thus, the number of observations in each dataset is: number of land management actions x number of months x number of years.
Thus for maize this is 14 x 12 x 13 = 2184.

However, to keep the EM learning from producing relative differences between states that are unrealistic in physical terms, I set some nodes to fixed in the network (soil type, water table class, but also land use), and I have two questions about this.
First, I was wondering what the general effect of fixing nodes is in the context of EM learning. This is important to know, since I would like to explore scenarios by learning both the short-term and long-term datasets in the same network structure.
Second, if I fix all the nodes that I learn with EM so that they keep their prior values, would the probability distribution still be affected, in the sense that the relationships between the nodes are altered according to the dataset? That is, would the state distributions of the nodes stay the same while the relations between the different nodes change, because new evidence has been entered into the network?

Since I am having my final thesis presentation this week, some help would be greatly appreciated.
Looking forward to hearing from you.

Regards,

Timo

p.s. I have added the network and two example datasets as attachments to this message. I couldn't add the maize dataset with 2184 observations, since the text file exceeded the 256 KiB limit.
Attachments
FinalBlearnPastureShorttermDiscretized.txt
(159.13 KiB) Downloaded 307 times
FinalBlearnForestShortTermDiscretized.csv
(33.52 KiB) Downloaded 269 times
20-12-2015 - Prior Model InclWaterBalance.xdsl
(22.31 KiB) Downloaded 484 times
marek [BayesFusion]
Site Admin
Posts: 430
Joined: Tue Dec 11, 2007 4:24 pm

Re: The effect of fixed nodes in EM

Post by marek [BayesFusion] »

Timo,

You may be reading too much into fixing nodes. What fixing nodes does is exclude them from the learning process. In other words, their CPTs are not affected by the learning. Try fixing some nodes and not others: the nodes you have fixed will preserve their original conditional probability distributions (or priors, if they have no parents). If you fix all the nodes, none of the parameters will be affected, so the whole process of learning parameters will have no effect on the network. "Select all" and "Select none" in the dialog are just for the user's convenience, as starting points for fixing most or few nodes; neither makes sense on its own.
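A minimal sketch of this behaviour, assuming complete data so that EM's M-step reduces to relative-frequency counting (the two-node network and all numbers here are invented for illustration, not taken from your model):

```python
import numpy as np

# Toy network: A -> B, both binary. With complete data, the EM "M-step"
# reduces to relative-frequency counting.
data = np.array([[1, 1], [1, 1], [1, 0], [0, 1]])  # columns: A, B

prior_A = np.array([0.5, 0.5])       # P(A), set by the modeller

def learn(fix_A):
    # P(A): update from data unless the node is fixed
    if fix_A:
        p_A = prior_A.copy()         # fixed: excluded from learning
    else:
        p_A = np.bincount(data[:, 0], minlength=2) / len(data)
    # P(B | A): not fixed in this sketch, so always learned
    p_B = np.zeros((2, 2))
    for a in (0, 1):
        rows = data[data[:, 0] == a]
        p_B[a] = np.bincount(rows[:, 1], minlength=2) / len(rows)
    return p_A, p_B
```

With `fix_A=True` the prior P(A) survives learning untouched; with `fix_A=False` it is replaced by the empirical frequencies, while P(B|A) is learned either way.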

In the "Learn parameters" functionality, we are talking about learning parameters, not the structure. The structure will be unaffected. Existing evidence in the network is ignored and has no effect on the learned parameters.
I hope this helps.

Marek
TimoSamuel
Posts: 12
Joined: Tue Dec 20, 2016 12:49 pm

Re: The effect of fixed nodes in EM

Post by TimoSamuel »

Dear Marek,

Thanks for your reply. Indeed, I might have been overthinking the effect of fixed nodes.
I do have one more minor question on the subject: what is the difference in effect when these nodes are not present in the dataset?
To put it simply: if I do not fix them, their values change. Does this mean that the overall probability distribution
of the node shifts, or is it only an indication of how their relationship to the learned nodes changes?
(That is, the BN represents a region in which a change in these values is physically impossible, hence the importance of telling the difference between the two.)
Looking forward to your reply.

Regards,

Timo
marek [BayesFusion]
Site Admin
Posts: 430
Joined: Tue Dec 11, 2007 4:24 pm

Re: The effect of fixed nodes in EM

Post by marek [BayesFusion] »

When you fix them, whether they are present or absent in the data file does not matter: their CPTs will remain unchanged. If you do not fix them and they are absent from the data file, EM will recalculate their CPTs in every iteration based on the values of the other variables in the data file; the CPTs from the very last iteration will be their final CPTs. Effectively, their CPTs will change. Does this help?
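A toy illustration of that loop, with a single latent binary node H absent from the data and an observed child X (all numbers invented): EM recomputes P(H) and P(X|H) from the posterior over H in every iteration, and the last iteration's values become the learned CPTs.

```python
import numpy as np

x = np.array([1, 1, 1, 0, 1, 0, 1, 1])   # observed X; H is not in the data

p_h = np.array([0.6, 0.4])                # P(H), initial guess
p_x = np.array([[0.7, 0.3],               # P(X | H=0)
                [0.2, 0.8]])              # P(X | H=1)

for _ in range(50):
    # E-step: posterior P(H | X=x_i) for every data row
    post = p_h * p_x[:, x].T              # shape (n, 2), unnormalised
    post /= post.sum(axis=1, keepdims=True)
    # M-step: expected counts overwrite the CPTs for the next iteration
    p_h = post.mean(axis=0)
    for h in (0, 1):
        p_x[h, 1] = post[:, h] @ x / post[:, h].sum()
        p_x[h, 0] = 1.0 - p_x[h, 1]
```

After convergence the implied marginal `p_h @ p_x[:, 1]` matches the empirical frequency of X=1 in the data, even though H itself was never observed.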

Marek
TimoSamuel
Posts: 12
Joined: Tue Dec 20, 2016 12:49 pm

Re: The effect of fixed nodes in EM

Post by TimoSamuel »

Hi Marek,

That is clear, thank you very much for spelling it out.
Have a good day.

Regards,

Timo
TimoSamuel
Posts: 12
Joined: Tue Dec 20, 2016 12:49 pm

Re: The effect of fixed nodes in EM

Post by TimoSamuel »

Actually, one more question came up. I have noticed that, after learning the network with the three previous data files, the node distributions were not very sensitive to changes when multiple nodes were set as evidence and a certain scenario was thus simulated.
I figured this was because many of the nodes' parameters were clustered in a single data file (many data rows had similar values, with only one node changing). I therefore decoupled them and learned them separately (around 25 data files of 156 rows each).
I used a random seed of 1 and confidence 156, to ensure that all data files were learned with equal weight.
What I am wondering is whether these settings allow for learning the network incrementally with these datasets.
To be clear, each dataset involves about 8 nodes of the original network.
Another question is whether you can learn conditional probabilities from a dataset while excluding the effect whereby the frequency of occurrence within the dataset raises the probability of a node. (Thus: learn the relations between nodes without increasing their probability of occurrence. Here I assume that, if no evidence is given and the network is updated, the values indicate the probability of occurrence of a variable with the network in a neutral state.)
Looking forward to your reply.
TimoSamuel
Posts: 12
Joined: Tue Dec 20, 2016 12:49 pm

Re: The effect of fixed nodes in EM

Post by TimoSamuel »

To make it clearer what I am referring to, here is an excerpt from an article (whose original source I unfortunately did not note down):
"In spite that it is tempting to think that the probability of certain variables is changing according to new data/information, however, it is not the probability of a certain variable that changes but the inherent conditional probabilities between them as a result of training the network. Charniak (1991) describes accurately the reasoning of a Bayesian updating, i.e. constituting a posterior probability based on new evidence. Such updating leads to a new joint probability between different variables throughout the network."

I feel that this is important to understand when the network is used as a decision-support model, yet the problem is that I think I do not.

Suppose, for instance, that a node has three states: A, B, C.
In the prior state of the network, the distribution is 80, 15, 5. This distribution is not likely to change over the period for which data is simulated (a time series of 12 years). However, within the data, state C occurs with a higher frequency than in the prior distribution.
Although I want to learn the conditional relations between the nodes under state C, I do not want this to change its 'likelihood of occurrence'. Does this make sense?

It is important for me to understand this, since the network's response varies greatly depending on which nodes I fix or do not fix (both when they are included in the Bayesian-updating data files and when they are not).
TimoSamuel
Posts: 12
Joined: Tue Dec 20, 2016 12:49 pm

Re: The effect of fixed nodes in EM

Post by TimoSamuel »

Also, half of these particular nodes do not have parent nodes, in case that makes a difference.
marek [BayesFusion]
Site Admin
Posts: 430
Joined: Tue Dec 11, 2007 4:24 pm

Re: The effect of fixed nodes in EM

Post by marek [BayesFusion] »

Hi Timo,

I'm having a hard time understanding your questions but let me try. Please let me know in case I have misunderstood you.

I'm not sure what you mean by setting nodes as evidence and simulating scenarios during learning parameters.

Have you kept refining the same set of parameters with successive data sets of 156 rows? If so, shouldn't you update your ESS as well?

Generally, your feeling is right; ESS allows you to learn incrementally.

If by "probability of occurrence" you mean the distributions of the nodes without parents, then sure, you can do that: just fix them. You can also change these distributions manually after learning.
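For the parentless nodes, incremental learning with ESS can be pictured as Dirichlet-style counting, where the confidence (equivalent sample size) acts as the weight of the current distribution. This is a sketch with invented numbers, not the actual GeNIe implementation:

```python
import numpy as np

def update(p, ess, batch):
    # Treat the current distribution p as ess "virtual" observations,
    # add the real counts from the new batch, and renormalise.
    counts = ess * p + np.bincount(batch, minlength=2)
    n = ess + len(batch)
    return counts / n, n      # new distribution and new ESS

p, ess = np.array([0.5, 0.5]), 2   # uniform start, ESS = 2
batch1 = np.array([1, 1, 0, 1])
batch2 = np.array([0, 0, 1, 1])

p, ess = update(p, ess, batch1)    # ESS carried forward: 2 -> 6
p, ess = update(p, ess, batch2)    # 6 -> 10
```

Two incremental updates with a properly carried-forward ESS give the same result as one update on the pooled data, which is why updating the ESS between batches matters.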
"In spite that it is tempting to think that the probability of certain variables is changing according to new data/information, however, it is not the probability of a certain variable that changes but the inherent conditional probabilities between them as a result of training the network. Charniak (1991) describes accurately the reasoning of a Bayesian updating, i.e. constituting a posterior probability based on new evidence. Such updating leads to a new joint probability between different variables throughout the network."
This citation refers to the difference between training a BN (which learns the joint probability distribution, jpd) and inference using a BN (which is based on the jpd but the jpd does not change).
I hope this helps.

Marek