DBN discretize and time direction

safaa
Posts: 31
Joined: Sat May 24, 2014 7:00 am

DBN discretize and time direction

Post by safaa »

Hello,
Thank you for answering my questions, although they sometimes seem trivial; I'm a beginner.
I want to use a dynamic Bayesian network (DBN) for learning and inference in natural language processing.
The dataset is a set of sentences, and each sentence is a sequence of words. Sentences are the experiments and words are the time steps in the DBN.
The data is discrete, as in the following sample:
sentence  word  pattern  toolword  propernoun  exceptionalwork  priority
1         0     857      0         0           0                22111
1         1     2808     0         0           0                1073211
1         2     689      0         0           0                1111131
1         3     115      0         0           0                1161131
Questions:
1. According to the tutorials, the system learns a Bayesian network first. Should we omit the sentence column and the word column, since they are not variables?
2. These are discrete variables, so why must we discretize them?
shooltz[BayesFusion]
Site Admin
Posts: 1417
Joined: Mon Nov 26, 2007 5:51 pm

Re: DBN discretize and time direction

Post by shooltz[BayesFusion] »

safaa wrote:1. According to the tutorials, the system learns a Bayesian network first.
For DBNs we support only parameter learning (EM). You'll have to create the structure of your DBN manually, then run EM.
safaa wrote:Should we omit the sentence column and the word column, since they are not variables?
That is up to you; it depends on the model structure you choose.
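One possible reading, offered only as a sketch: if the sentence and word columns are treated as indices rather than model variables, the sentence number identifies a sequence and the word number gives the position (time step) within it. The rows below are copied from the sample in the first post; the function name group_by_sentence is hypothetical, and nothing here reflects the file format GeNIe itself expects.

```python
from collections import defaultdict

# Rows copied from the sample in the first post:
# (sentence, word, pattern, toolword, propernoun, exceptionalwork, priority)
rows = [
    (1, 0, 857, 0, 0, 0, 22111),
    (1, 1, 2808, 0, 0, 0, 1073211),
    (1, 2, 689, 0, 0, 0, 1111131),
    (1, 3, 115, 0, 0, 0, 1161131),
]

def group_by_sentence(rows):
    """Group feature tuples by sentence, ordered by word position.

    The sentence and word columns are used only as indices here;
    they are not kept as features (an assumption, not a requirement).
    """
    sequences = defaultdict(list)
    for sentence, word, *features in sorted(rows, key=lambda r: (r[0], r[1])):
        sequences[sentence].append(tuple(features))
    return dict(sequences)

sequences = group_by_sentence(rows)
for sentence, steps in sequences.items():
    print(sentence, steps)
```

Each sentence then maps to an ordered list of per-word feature tuples, which is the shape a time-sliced model works with.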
safaa wrote:2. These are discrete variables, so why must we discretize them?
Sorry, but I don't understand this question.
safaa
Posts: 31
Joined: Sat May 24, 2014 7:00 am

Re: DBN discretize and time direction

Post by safaa »

The variables I'm using for network learning are discrete random variables. When the dataset is loaded into GeNIe, we first have to discretize it and choose the quantiles, like 2q, 4q, etc. But the data is already discrete.
So what is the benefit?
marek [BayesFusion]
Site Admin
Posts: 430
Joined: Tue Dec 11, 2007 4:24 pm

Re: DBN discretize and time direction

Post by marek [BayesFusion] »

safaa wrote:The variables I'm using for network learning are discrete random variables. When the dataset is loaded into GeNIe, we first have to discretize it and choose the quantiles, like 2q, 4q, etc. But the data is already discrete.
So what is the benefit?
I'm not sure I understand the problem. A static network can be learned when all variables are discrete, or when all are continuous and follow a multivariate Gaussian distribution. If you have a mixture of discrete and continuous variables, you will need to discretize the continuous variables. I am wildly guessing now, but is your problem perhaps that you have discrete but numerical variables with outcomes like 0, 1, 2, etc.? In that case, you will need to change them into state labels, like s0, s1, s2, etc. This has to do with restrictions on state names. You can do it fairly automatically outside of GeNIe, or inside GeNIe by discretizing and setting the interval boundaries wisely. I hope this helps.
Cheers,

Marek
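
For the "outside of GeNIe" route, a minimal sketch of turning integer outcomes into state labels might look like this. The prefix "s" follows the example in the post; the function name to_state_label and the sample column are made up for illustration.

```python
def to_state_label(value, prefix="s"):
    """Turn an integer outcome such as 2 into a label such as 's2'."""
    return f"{prefix}{value}"

# Hypothetical column of discrete numerical outcomes.
column = [0, 1, 2, 1, 0]
labels = [to_state_label(v) for v in column]
print(labels)  # ['s0', 's1', 's2', 's1', 's0']
```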
shooltz[BayesFusion]
Site Admin
Posts: 1417
Joined: Mon Nov 26, 2007 5:51 pm

Re: DBN discretize and time direction

Post by shooltz[BayesFusion] »

marek wrote:you have discrete but numerical variables with outcomes like 0, 1, 2, etc.? In that case, you will need to change them into state labels, like s0, s1, s2, etc.
Just to clarify: if the numerical variable is discrete and contains values in the range [0..N-1], there's no need to convert the integers into state names. GeNIe will do this automatically in the background before invoking the structure learning algorithm.
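If the integer codes are sparse (for example, the pattern values 857, 2808, 689, 115 in the sample from the first post), one way to bring them into a [0..N-1] range is to remap them before loading the data. This is only a sketch: the function name to_contiguous_codes is hypothetical, and ordering the codes by sorted value is an arbitrary choice.

```python
def to_contiguous_codes(values):
    """Map arbitrary integer codes onto 0..N-1, ordered by sorted value.

    Returns the recoded list and the mapping, so the original codes
    can be recovered later.
    """
    mapping = {v: i for i, v in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping

# "pattern" column from the sample in the first post.
pattern = [857, 2808, 689, 115]
codes, mapping = to_contiguous_codes(pattern)
print(codes)    # [2, 3, 1, 0]
print(mapping)  # {115: 0, 689: 1, 857: 2, 2808: 3}
```

Keeping the mapping around makes it possible to translate learned states back to the original indices.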
safaa
Posts: 31
Joined: Sat May 24, 2014 7:00 am

Re: DBN discretize and time direction

Post by safaa »

marek wrote:I am wildly guessing now but is your problem perhaps that you have discrete but numerical variables with outcomes like 0, 1, 2, etc?
Yes, they are discrete numerical values.
To make things clear: in my research we are working with sentences. Each sentence has words, and each word has features such as prefix, suffix, pattern, and part of speech. Those features are the variables I'm going to deal with. Of course, the values of these variables are text, e.g. prefix="xdsdf". I put all the data in a database with indices, and for each value I just take the corresponding index.
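
The indexing step described above can be sketched roughly as follows, with an in-memory dictionary standing in for the database. Only "xdsdf" comes from the post; the other prefix values and the name build_index are made up for illustration.

```python
def build_index(values):
    """Assign each distinct text value an integer index in first-seen order."""
    index = {}
    for v in values:
        if v not in index:
            index[v] = len(index)
    return index

# "xdsdf" is the example from the post; "al" and "wa" are hypothetical.
prefixes = ["xdsdf", "al", "xdsdf", "wa"]
prefix_index = build_index(prefixes)
encoded = [prefix_index[p] for p in prefixes]
print(encoded)  # [0, 1, 0, 2]
```

Built this way, the indices for each feature already fall in a 0..N-1 range.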