Can learning algorithms be parallelized?

Post by **maghnie** » Mon Jul 08, 2024 9:03 am

Specifically for PySmile, is it possible to parallelize a function call like pysmile.learning.BayesianSearch.learn(...)?

For example, could we tell SMILE somewhere to use a specific number of threads? Or is there a valid way to divide-and-conquer the learning problem before the "...learn()" call?

When it comes to the later data fitting part, I could easily parallelize testing different mappings to the network nodes using app-level threading, just as described here: viewtopic.php?p=393#p393
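As a rough illustration of the app-level threading mentioned above, here is a minimal sketch using Python's standard `concurrent.futures`. The `evaluate_mapping` function is a hypothetical placeholder, not part of PySmile; in a real script it would create its own network and data set objects and run the fitting step for one mapping. Whether threads give an actual speedup depends on whether the underlying native call releases the GIL, which is an assumption here and not something confirmed in this thread.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate_mapping(mapping):
    """Hypothetical placeholder for fitting one column-to-node mapping.

    A real version would build its own network/data set and return a
    fit score. Here we only count columns whose name equals the node
    name, so the sketch runs without PySmile installed.
    """
    return sum(1 for column, node in mapping.items() if column == node)


def evaluate_all(mappings, max_workers=4):
    """Evaluate every candidate mapping in a thread pool.

    pool.map returns results in the same order as the input, so
    scores[i] belongs to mappings[i].
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(evaluate_mapping, mappings))
```

Each worker should use its own objects rather than sharing one network across threads, since nothing in this thread states that shared objects are thread-safe.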

For training on a data set with 30 nodes and around 35k records, it took my PC about 5 hours to learn the network (which is actually pretty nice).

So, to be a bit greedy, it would be even better if there were options to speed up the process and make it more scalable.

Mon Jul 08, 2024 12:10 pm

At this point SMILE does not directly support parallelism. I 100% agree that running multiple instances of learning algorithms like Bayesian Search could be very useful.

We have completed the implementation of equations for dynamic Bayesian networks. Our next high-priority item is structure learning with missing data. While it is not directly related to parallelism, it may open up the code for modifications that finally allow for multiple threads.

Post by **maghnie** » Mon Jul 08, 2024 4:38 pm

Thank you for the update!

Reducing data requirements for structure learning also sounds like a great way to support more use cases.

In the meantime, I could experiment with filtering those 35k records into a more compact and representative set.
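One possible way to do that filtering is a stratified random sample, sketched below with the standard library only. The function name, the choice of a single stratification column, and the fraction are all assumptions for illustration; nothing here is a SMILE feature, and a smaller sample is not guaranteed to yield the same learned structure as the full data.

```python
import random
from collections import defaultdict


def stratified_sample(records, key_index, fraction, seed=0):
    """Return a subset that keeps roughly `fraction` of each group.

    records   : list of tuples/lists, one per data row
    key_index : position of the column used to form the groups
    fraction  : share of each group to keep (at least one row per group)
    seed      : fixed seed so the subset is reproducible
    """
    rng = random.Random(seed)

    # Group rows by the value in the stratification column.
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key_index]].append(rec)

    sample = []
    # Sorting the keys keeps the output order deterministic
    # (this requires the key values to be mutually comparable).
    for key in sorted(groups):
        rows = groups[key]
        k = max(1, round(len(rows) * fraction))
        sample.extend(rng.sample(rows, k))
    return sample
```

Keeping at least one row per group means rare states of the chosen column do not disappear from the reduced data set.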

BayesFusion Support Forum
