System.AccessViolationException with SMILE.net
Hello Forum!
I am using the Smile.net wrapper to access SMILE from an ASP.NET web application.
To avoid the expensive loading of a Network from disk with the ReadFile method on every request, I keep a "master copy" in memory and use the Clone() method to hand out "fresh" networks that handle the requests.
Occasionally I get AccessViolationExceptions from within the Clone() call and, less often, in the Finalize() method of the Network object (apparently when garbage collection runs).
What's wrong here?
-
- Site Admin
- Posts: 1457
- Joined: Mon Nov 26, 2007 5:51 pm
Re: System.AccessViolationException with SMILE.net
svenr wrote: Occasionally, I am getting AccessViolationExceptions from within the .Clone()-call and, more seldomly, during the Finalize()-method of the Network-object (apparently, when Garbage Collection runs).
Hard to tell without looking at the specifics of your application. Can you estimate how much memory is used by CPTs in your network? Do you have multiple threads accessing single Smile.Network objects?
Re: System.AccessViolationException with SMILE.net
shooltz wrote: Hard to tell without looking at the specifics of your application.
Thanks for answering! Let's see whether I can get specific enough without having to post the entire application.

shooltz wrote: Can you estimate how much memory is used by CPTs in your network?
The memory difference between an "empty" Network and a "loaded" one is about 100 MB in Windows Task Manager.

shooltz wrote: Do you have multiple threads accessing single Smile.Network objects?
I don't think so. Since this is a web application, we obviously have concurrent requests, but the network cloning is synchronized, and afterwards every thread should be using its own separate instance of Smile.Network.
Attached is a minimized C# code file that does the cloning. It is implemented as a singleton so that it is available during the entire lifetime of the web server's worker process.
Callers would do something like:
Code:
Network myNetwork = BayesContainer.Instance.CloneNetwork();
- Attachments
-
- BayesContainer.cs.txt
- Class to hold the SMILE network
- (853 Bytes) Downloaded 1239 times
Re: System.AccessViolationException with SMILE.net
svenr wrote: Thanks for answering! Let's see whether I can get specific enough without having to post the entire application
If there's a chance of posting the entire application, it would be very helpful.

svenr wrote: The memory difference between an "empty" Network and a "loaded" one is about 100MB in Windows' task manager.
Please note that the .NET garbage collector is not aware of this difference unless you explicitly call GC.AddMemoryPressure: the 100 MB is almost exclusively allocated by unmanaged code (the C++ SMILE library). There's a non-zero chance that garbage collection is delayed because so little is allocated on the .NET-managed heap, while at the same time the native allocations are hitting the address space limit. You can ensure that the native code deallocates its memory by calling Network.Dispose directly or by wrapping the code that uses a given Smile.Network object in a 'using' statement.

svenr wrote: See attached a minimized C#-codefile that does the Cloning. It is implemented as a singleton to be available during the entire lifetime of the web server's worker process.
Looks OK. The network returned by Clone() is subsequently used only on a single thread, right?
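The advice about GC.AddMemoryPressure and the 'using' statement can be illustrated in isolation. The following is a minimal sketch and not Smile.NET code: NativeBackedNetwork is a hypothetical stand-in for any small managed wrapper that owns a large unmanaged allocation, and the thread does not say whether Smile.Network registers memory pressure itself.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical stand-in for a wrapper such as Smile.Network: a tiny managed
// object that owns a large block of unmanaged memory.
public sealed class NativeBackedNetwork : IDisposable
{
    private IntPtr _buffer;
    private readonly long _size;

    public NativeBackedNetwork(int sizeInBytes)
    {
        if (sizeInBytes <= 0) throw new ArgumentOutOfRangeException("sizeInBytes");
        _size = sizeInBytes;
        _buffer = Marshal.AllocHGlobal(sizeInBytes);
        // Tell the GC that this object is heavier than its managed size suggests.
        GC.AddMemoryPressure(_size);
    }

    public bool IsDisposed { get { return _buffer == IntPtr.Zero; } }

    // Deterministic cleanup: frees the native block immediately.
    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this);
    }

    // Safety net if the caller forgets Dispose(); runs at an unpredictable time.
    ~NativeBackedNetwork() { Release(); }

    private void Release()
    {
        if (_buffer == IntPtr.Zero) return;
        Marshal.FreeHGlobal(_buffer);
        _buffer = IntPtr.Zero;
        GC.RemoveMemoryPressure(_size);
    }
}

public static class UsingDemo
{
    // Returns true if the object was disposed when the using block ended.
    public static bool DisposedAfterUsing()
    {
        NativeBackedNetwork seen;
        using (var net = new NativeBackedNetwork(1024 * 1024))
        {
            seen = net; // real code would run inference here
        }
        return seen.IsDisposed;
    }
}
```

With the real wrapper, the same shape would presumably be a 'using' block around the Network returned by CloneNetwork(), so that the native memory is released at the end of each request instead of whenever the finalizer happens to run.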
Thanks for the hints with AddMemoryPressure and Network.Dispose(). 
I was unable to reproduce the exception with the code that uses Dispose() so it appears that your suggestions did the trick (although the bug was not reliably reproducible even before - will report back if it re-surfaces).
For completeness, I have attached a more comprehensive example to this post. It contains an updated BayesContainer.cs that uses both suggestions.
The attachment also contains an example of how I use SMILE in a WCF webservice. The project should compile in VS2008 once you put a smilenet.dll into its directory.
It should show that accesses to a SMILE.Network instance are single threaded, with the exception of the cloning itself (which is synchronized for this reason). The cloned SMILE.Network is only used as a local member within one method.

- Attachments
-
- BayesService.zip
- Exemplary Webservice project
- (6.96 KiB) Downloaded 1388 times
Stupid me, why am I posting so bold things... 
Here is the relevant part of two stacktraces that may be of help:
Sorry, no line numbers in the debug output... It never happens on my local machine.

Code:
ERROR 0 - AccessViolationException - Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
StackTrace: at DSL_network.__ctor(DSL_network* )
at Smile.Network..ctor()
Code:
System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
at Smile.Network.Finalize()
Code:
System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
at DSL_network.=(DSL_network* , DSL_network* )
at Smile.Network.Clone()
svenr wrote: Thanks for the hints with AddMemoryPressure and Network.Dispose().
I don't think there's a need to use both approaches. Since your cloned Networks are only used within a single method, the 'using' statement is the best choice: it will call Network.Dispose() at the end of the scope, even if an exception is thrown.
With a large network, it can actually be cheaper to run synchronized inference on a single Network object than to do synchronized copying followed by multithreaded updates. This of course depends on the structure of the network, the actual hardware and the workload.
I have prepared a smilenet.dll compiled without optimizations and with extra diagnostic checks. This version may give you better stack frames, possibly with line numbers. I'm sending the download link as a private message.
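The "synchronized inference on a single object" alternative mentioned above could look roughly like this. It is a sketch under assumptions: SerializedAccess is a hypothetical helper name, and it is written generically so that it compiles without smilenet.dll.

```csharp
using System;

// Hypothetical helper: owns one shared instance and lets only one thread
// at a time run work against it.
public sealed class SerializedAccess<T> where T : class
{
    private readonly object _gate = new object();
    private readonly T _instance;

    public SerializedAccess(T instance)
    {
        if (instance == null) throw new ArgumentNullException("instance");
        _instance = instance;
    }

    // Runs the delegate while holding the lock and returns its result.
    // Callers should copy out everything they need inside the delegate,
    // because the shared instance may change once the lock is released.
    public TResult Run<TResult>(Func<T, TResult> work)
    {
        if (work == null) throw new ArgumentNullException("work");
        lock (_gate)
        {
            return work(_instance);
        }
    }
}
```

With Smile.NET, T would be Smile.Network, and the delegate would set the request's evidence, call UpdateBeliefs() and copy the results into a plain array before returning. Whether this beats cloning per request depends, as noted, on the network, the hardware and the workload.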
Back again...
Hi! It's me again - with bad (or, let's say, ambivalent) news...
The problem didn't surface again for quite some time, mostly because of excessive synchronisation, which effectively meant that the whole application ran single-threaded...
In the meantime, thanks to your suggestion, I tried to avoid the expensive per-request cloning by implementing a pool from which threads may acquire SMILE.Networks.
The pool is built by reading one .xdsl file from disk several times (i.e. there should not be any connections between the Network-objects in the pool). Threads return Networks to the pool after they have used them.
Using the pool with a size of 1 (i.e., serializing all threads onto one single SMILE.Network) is fine. Setting the pool size to 2 or more causes an AccessViolationException under the following circumstances: when UpdateBeliefs() has been executed on two or more Networks in parallel, the next execution of UpdateBeliefs() on one of these Networks (done by the next thread to get it from the pool) throws the exception. In fact, calling UpdateBeliefs() twice within the same thread would also throw the exception if another thread completed a call to UpdateBeliefs() between the two calls.
However, the exception is thrown only for some BayesianAlgorithmTypes:
Henrion, HeuristicImportance and LSampling fail, while AisSampling, BackSampling, EpisSampling, Lauritzen and SelfImportance are ok.
The exception even occurs when the pool contains wholly different networks (i.e. read from separate files).
It appears to me that the failing algorithms use some variable that is shared across the library and not private to the Network object (something 'static', perhaps?), which leads to interference.
I have attached a sample project which implements the described behaviour and reproduces the exception reliably (increase NUMBER_OF_THREADS in case it does not).
It is for VS2008 but I guess the .cs files can easily be compiled in other versions of Visual Studio. In any case, make sure to have a smilenet.dll referenced by the project to build it.
If you need any further information, let me know.
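For readers without the attachment, a pool of the kind described in this post might be sketched as follows. This is not the code from the attached project: SimplePool and its members are hypothetical names, and the pool is generic so that it compiles without smilenet.dll.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical fixed-size pool: threads borrow an instance, use it
// exclusively, and hand it back when they are done.
public sealed class SimplePool<T> where T : class
{
    private readonly Stack<T> _idle = new Stack<T>();
    private readonly object _gate = new object();

    // The factory is called once per slot, so pooled instances are
    // independent objects rather than references to one shared instance.
    public SimplePool(int size, Func<T> factory)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException("size");
        if (factory == null) throw new ArgumentNullException("factory");
        for (int i = 0; i < size; i++) _idle.Push(factory());
    }

    // Blocks until an instance is free, then removes it from the pool.
    public T Acquire()
    {
        lock (_gate)
        {
            while (_idle.Count == 0) Monitor.Wait(_gate);
            return _idle.Pop();
        }
    }

    // Returns an instance to the pool and wakes one waiting thread.
    public void Release(T item)
    {
        if (item == null) throw new ArgumentNullException("item");
        lock (_gate)
        {
            _idle.Push(item);
            Monitor.Pulse(_gate);
        }
    }
}
```

Callers would pair Acquire() with Release() in a try/finally block. With Smile.NET the factory would create a Network and load the .xdsl file with ReadFile. A pool like this only guarantees that no two threads use the same object at once; it cannot protect against state that the native library shares between different Network objects, which is the behaviour suspected in this post.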
- Attachments
-
- AccessViolationExample.zip
- Small VS2008 project to reproduce the AccessViolationException
- (7.68 KiB) Downloaded 1438 times
Re: Back again...
svenr wrote: However, the exception is thrown only for some BayesianAlgorithmTypes: Henrion, HeuristicImportance and LSampling fail, while AisSampling, BackSampling, EpisSampling, Lauritzen and SelfImportance are ok.
Bug confirmed & fixed. The code for the three failing sampling algorithms is vintage 1996.

The fix will be included in the upcoming release.
Re: Back again...
shooltz wrote: Bug confirmed & fixed.
Thanks very much, that's good news!
I'm looking forward to the new release.

Re: Back again...
shooltz wrote: Bug confirmed & fixed. ... The fix will be included in the upcoming release.
I have run the smilenet.dll released on November 4 through my test suite and all algorithms passed without throwing the exception.
Thanks again!
