Hello,
Thank you for answering my questions. They may sometimes seem trivial, but I'm a beginner.
I want to use a dynamic Bayesian network (DBN) for learning and inference in natural language processing.
The dataset is a set of sentences, and each sentence is a sequence of words. The sentences are the experiments, and the word positions are the time steps in the DBN.
The data is discrete, as in the following sample:
sentence  word  pattern  toolword  propernoun  exceptionalwork  priority
1         0     857      0         0           0                22111
1         1     2808     0         0           0                1073211
1         2     689      0         0           0                1111131
1         3     115      0         0           0                1161131
Questions:
1. According to the tutorials, the system learns a Bayesian network first. Should we omit the sentence and word columns, since they are not variables?
2. These variables are already discrete, so why must we discretize them?
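The layout described above (each sentence is an independent sequence, and the word position is the time step) can be sketched in Python as follows. This is only an illustration of how the sentence and word columns act as indices rather than model variables. The function name group_into_sequences is made up for this sketch, and the file format GeNIe expects for temporal data is not shown here.

```python
from collections import defaultdict

# Sample rows copied from the post; column names as in the header line.
HEADER = ["sentence", "word", "pattern", "toolword",
          "propernoun", "exceptionalwork", "priority"]
ROWS = [
    [1, 0, 857, 0, 0, 0, 22111],
    [1, 1, 2808, 0, 0, 0, 1073211],
    [1, 2, 689, 0, 0, 0, 1111131],
    [1, 3, 115, 0, 0, 0, 1161131],
]


def group_into_sequences(header, rows):
    """Group rows by sentence and order them by word position.

    Returns {sentence_id: [feature dict for each time step]}.
    The sentence and word columns are used only as indices and are
    left out of the per-step feature dicts.
    """
    s_idx = header.index("sentence")
    w_idx = header.index("word")
    feature_cols = [(i, name) for i, name in enumerate(header)
                    if name not in ("sentence", "word")]
    grouped = defaultdict(list)
    for row in rows:
        features = {name: row[i] for i, name in feature_cols}
        grouped[row[s_idx]].append((row[w_idx], features))
    return {
        sid: [features for _, features in sorted(steps, key=lambda p: p[0])]
        for sid, steps in grouped.items()
    }


sequences = group_into_sequences(HEADER, ROWS)
```

Here sequences[1] is the list of four time steps for sentence 1, and each step holds only the five feature variables.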
DBN discretize and time direction
-
- Site Admin
- Posts: 1417
- Joined: Mon Nov 26, 2007 5:51 pm
Re: DBN discretize and time direction
Quote: 1. According to the tutorials, the system learns a Bayesian network first.
For DBNs we only support parameter learning (EM). You'll have to create the structure of your DBN manually, then run EM.
Quote: So the sentence col. and the word col., do we omit them or not? Because they are not variables?
That is up to you, depending on the model structure you choose.
Quote: 2. These are discrete variables, why must we discretize?
Sorry, but I don't understand this question.
Re: DBN discretize and time direction
The variables I'm using for network learning are discrete random variables. When a dataset is loaded into GeNIe, we first have to discretize it and choose the quantiles (2q, 4q, etc.), but the data is already discrete.
So what is the benefit?
-
- Site Admin
- Posts: 430
- Joined: Tue Dec 11, 2007 4:24 pm
Re: DBN discretize and time direction
safaa wrote: The variables I'm using for network learning are discrete random variables. When a dataset is loaded into GeNIe, we first have to discretize it and choose the quantiles (2q, 4q, etc.), but the data is already discrete. So what is the benefit?
I'm not sure I understand the problem. Learning a static network is possible when all variables are discrete, or when all are continuous and follow a multivariate Gaussian distribution. If you have a mixture of discrete and continuous variables, you will need to discretize the continuous variables. I am wildly guessing now, but is your problem perhaps that you have discrete but numerical variables with outcomes like 0, 1, 2, etc.? In that case, you will need to change them into state labels, like s0, s2, etc. This has to do with restrictions on state names. You can do it fairly automatically outside of GeNIe, or inside GeNIe by discretizing and setting the interval boundaries wisely.
I hope this helps.
Cheers,
Marek
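If relabeling outside of GeNIe turns out to be needed, one possible way is a small Python script that prefixes the integer outcomes with a letter. This is a minimal sketch, not a GeNIe feature: the function relabel_columns, the "s" prefix, and the two-row sample data are assumptions made for illustration.

```python
import csv
import io


def relabel_columns(csv_text, columns, prefix="s"):
    """Return CSV text where the given columns hold labels such as s0, s1.

    Only the listed columns are changed; all other columns are copied as-is.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        for col in columns:
            row[col] = f"{prefix}{row[col]}"
        writer.writerow(row)
    return out.getvalue()


# Hypothetical two-row sample using column names from the first post.
sample = "toolword,propernoun\n0,1\n1,0\n"
relabeled = relabel_columns(sample, ["toolword", "propernoun"])
print(relabeled)
```

For real data, the same function could be fed the contents of a file read with open(...).read() and the result written back out to a new file.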
-
- Site Admin
- Posts: 1417
- Joined: Mon Nov 26, 2007 5:51 pm
Re: DBN discretize and time direction
marek wrote: you have discrete but numerical variables with outcomes like 0, 1, 2, etc? In that case, you will need to change them into state labels, like s0, s2, etc.
Just to clarify: if the numerical variable is discrete and contains values from the [0..N-1] range, there's no need to convert the integers into state names. GeNIe does this automatically in the background before invoking the structure learning algorithm.
Re: DBN discretize and time direction
marek wrote: I am wildly guessing now but is your problem perhaps that you have discrete but numerical variables with outcomes like 0, 1, 2, etc?
Yes, they are discrete numerical values.
To make things clear: in my research we are working with sentences. Each sentence has words, and each word has features such as prefix, suffix, pattern, and part of speech. Those features are the variables I'm going to work with. Of course, the values of these variables are text, like prefix="xdsdf". I put all the data in a database with indices, and for each value I just take the corresponding index.
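The indexing step described here can be sketched in Python. The sketch assigns each distinct text value a contiguous index starting at 0, so the encoded column falls in the [0..N-1] range mentioned in the earlier reply. The function build_index and all sample values other than "xdsdf" are made up for illustration; a database-generated ID would only match this if it is also contiguous and zero-based.

```python
def build_index(values):
    """Map each distinct value to 0, 1, 2, ... in order of first appearance."""
    index = {}
    for value in values:
        if value not in index:
            index[value] = len(index)
    return index


# Hypothetical prefix column; only "xdsdf" comes from the post.
prefixes = ["xdsdf", "ab", "xdsdf", "", "ab"]

prefix_index = build_index(prefixes)
encoded = [prefix_index[p] for p in prefixes]
print(encoded)  # [0, 1, 0, 2, 1]
```

Keeping one such dictionary per feature column also makes it possible to translate inferred states back to the original text values.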