Can learning algorithms be parallelized?

maghnie
Posts: 14
Joined: Fri May 03, 2024 10:45 am

Post by maghnie »

Specifically for PySmile, is it possible to parallelize a function call like the following?

pysmile.learning.BayesianSearch.learn(...)

For example, could we tell SMILE somewhere to use a specific number of threads? Or is there a valid way to divide-and-conquer the learning problem before the "...learn()" call?

When it comes to the later data fitting part, I could easily parallelize testing different mappings to the network nodes using app-level threading, just as described here: viewtopic.php?p=393#p393

Learning the network from a data set with 30 nodes and around 35k records took my PC about 5 hours (which is actually pretty nice).

So, to be a bit greedy, it would be even better if there were options to speed up the process and make it more scalable.
shooltz[BayesFusion]
Site Admin
Posts: 1437
Joined: Mon Nov 26, 2007 5:51 pm

Re: Can learning algorithms be parallelized?

Post by shooltz[BayesFusion] »

At this point SMILE does not directly support parallelism. I 100% agree that running multiple instances of learning algorithms like Bayesian Search could be very useful.

We have completed the implementation of equations for dynamic Bayesian networks. Our next high-priority item is structure learning with missing data. While it is not directly related to parallelism, it may open up the code for modifications that finally allow for multiple threads.
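
Until SMILE supports parallelism itself, running multiple independent learning instances can be approximated at the application level. The following is only a sketch, not SMILE functionality: `run_independent_searches` and `learn_one` are hypothetical names, and `learn_one` is assumed to be a user-written function that performs one complete learning run for a given seed and returns a `(score, model)` pair. In a real setup it would create its own `pysmile.learning.BayesianSearch` object, configure it, call `learn(...)`, and score the result. Whether threads give an actual speed-up depends on whether the native call releases Python's GIL, which is not confirmed here; the executor class is therefore a parameter, so a process pool can be tried instead.

```python
import random
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_independent_searches(learn_one, seeds, max_workers=4,
                             executor_cls=ThreadPoolExecutor):
    """Run learn_one(seed) concurrently for every seed and keep the best run.

    learn_one is a hypothetical user-supplied callable that returns
    (score, model), where a higher score means a better model.
    Returns a tuple (best_seed, best_score, best_model).
    """
    results = []
    with executor_cls(max_workers=max_workers) as pool:
        # Map each future back to the seed that produced it.
        futures = {pool.submit(learn_one, seed): seed for seed in seeds}
        for future in as_completed(futures):
            score, model = future.result()
            results.append((futures[future], score, model))
    # Pick the run with the highest score.
    return max(results, key=lambda item: item[1])


def fake_learn(seed):
    """Stand-in for a real learning run, used only to exercise the sketch."""
    rng = random.Random(seed)
    return rng.random(), f"model-{seed}"


best_seed, best_score, best_model = run_independent_searches(fake_learn, range(8))
```

This does not make a single learning run faster; it only lets several restarts (for example with different seeds or settings) run side by side. If a process pool is used, `learn_one` has to be a picklable top-level function, and on platforms that spawn worker processes the call should sit under an `if __name__ == "__main__":` guard.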
maghnie
Posts: 14
Joined: Fri May 03, 2024 10:45 am

Re: Can learning algorithms be parallelized?

Post by maghnie »

Thank you for the update!

Reducing data requirements for structure learning also sounds like a great way to support more use cases.

In the meantime, I could experiment with filtering those 35k records into a more compact and representative set.
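
One simple way to try this is stratified subsampling, sketched below with the standard library only. This is an illustration under assumptions, not a recommendation from the thread: `stratified_sample` is a hypothetical helper, the records are assumed to be dictionaries (for example as produced by `csv.DictReader`), and it is assumed that a single column is a reasonable key for keeping the subset representative. A smaller data set may lead to a different learned structure, so the result should be compared against a run on the full data.

```python
import random
from collections import defaultdict


def stratified_sample(rows, key, fraction, seed=0):
    """Return roughly `fraction` of the rows, sampled separately per value of `key`.

    Every distinct value of rows[i][key] keeps at least one record, so rare
    categories are not lost entirely.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)

    # Group the records by the value of the stratification column.
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    sample = []
    # Iterate in a fixed order so the result is reproducible for a given seed.
    for value in sorted(groups, key=str):
        members = groups[value]
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Toy records standing in for the real data set.
records = [{"cls": "a", "x": i} for i in range(90)]
records += [{"cls": "b", "x": i} for i in range(10)]
subset = stratified_sample(records, key="cls", fraction=0.1)
```

The subset can then be written back to a file and loaded for learning in the usual way.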