OutOfMemoryException during TeamBuild of an SHFB project

Jul 7, 2009 at 11:53 AM

Log:

Info: BuildAssembler: Building topic M:xxx.xxx.TransferServices.Services.XTsDummyWfPersistenceService.LoadCompletedContextActivity(System.Guid,System.Workflow.ComponentModel.Activity)
 
  Unhandled Exception: OutOfMemoryException.
  Info: CachedCopyFromIndexComponent: Used "reflection" cache entries: 10
  Info: CachedCopyFromIndexComponent: Used "comments" cache entries: 29
  Info: CachedResolveReferenceLinksComponent: MSDN URL cache updated.  Saving new information to C:\Documents and Settings\XFW_Build\Local Settings\Application Data\EWSoftware\Sandcastle Help File Builder\Cache\MsdnUrl.cache
  Info: CachedResolveReferenceLinksComponent: New cache size: 9 entries
C:\Temp\24\Sources\Sources\Help\ref_doc\Help\Working\BuildReferenceTopics.proj(27,5): error MSB6006: "BuildAssembler.exe" exited with code -532459699.
    Last step completed in 00:10:34.5019

My project is:

Info: Wrote information on 170 namespaces, 2238 types, and 12751 members
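
A side note on the log above: the exit code MSBuild reports, -532459699, is a signed 32-bit value. Reinterpreted as unsigned it is 0xE0434F4D, which, as far as I know, is the generic code older versions of the .NET runtime use when a process dies from an unhandled managed exception. That is consistent with the OutOfMemoryException shown in the log. A minimal Python sketch of the conversion (the helper name is made up for illustration):

```python
def exit_code_to_hex(exit_code: int) -> str:
    """Reinterpret a signed 32-bit process exit code as unsigned hex."""
    # Masking with 0xFFFFFFFF maps negative values to their
    # two's-complement unsigned equivalent.
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


print(exit_code_to_hex(-532459699))  # prints 0xE0434F4D
```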

 

Coordinator
Jul 7, 2009 at 4:11 PM

BuildAssembler can consume a vast amount of memory. Unfortunately, there isn't anything I or SHFB can do to reduce BuildAssembler's memory usage. Be sure that you haven't added unnecessary assemblies to the list to be documented. If it's just a referenced assembly (one your project uses but that you don't want to document), be sure you add it to the project's References node rather than as a documentation source. Another option is to use the API Filter to remove unnecessary items or, as a last resort, to split the documented assemblies into separate projects to bring down the amount of information.
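
If splitting does become necessary, one way to decide which assemblies go into which project is to group them under a rough size budget. The sketch below uses a simple first-fit-decreasing grouping in Python. The assembly names, member counts, and the budget are all hypothetical, and member count is only a stand-in for memory use, not something SHFB or BuildAssembler measures this way.

```python
from typing import Dict, List


def split_by_budget(member_counts: Dict[str, int], budget: int) -> List[List[str]]:
    """Group assemblies so each group's total member count stays within budget.

    An assembly that alone exceeds the budget still gets a group of its own.
    """
    groups: List[List[str]] = []
    totals: List[int] = []
    # Place the largest assemblies first (first-fit decreasing).
    ordered = sorted(member_counts.items(), key=lambda kv: kv[1], reverse=True)
    for name, count in ordered:
        for i, total in enumerate(totals):
            if total + count <= budget:
                groups[i].append(name)
                totals[i] += count
                break
        else:
            # No existing group has room, so start a new one.
            groups.append([name])
            totals.append(count)
    return groups


# Hypothetical member counts per assembly.
counts = {"A.dll": 6000, "B.dll": 5000, "C.dll": 3000, "D.dll": 1000}
print(split_by_budget(counts, budget=8000))
# prints [['A.dll', 'D.dll'], ['B.dll', 'C.dll']]
```

Each resulting group would then become its own help project. Links between the groups are a separate problem that this does not address.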

Eric

Feb 8, 2013 at 5:47 PM
Eric,

Can you clarify what you mean when you say "there isn't anything I or SHFB can do to reduce BuildAssembler's memory usage"?

We're also running into this problem, and unfortunately the suggested workaround is insufficient for our use case. We are developing conceptual and tutorial documents for our SDK, which is very large. It's common for a single tutorial document to need to link to API documentation from several of our assemblies, but each assembly is so large that SHFB runs out of memory if we include more than one assembly as a documentation source in a SHFBPROJ file.

Is it true that Microsoft uses SHFB internally to build MSDN? If so, how do they work around this problem?

We are very motivated to resolve this issue, so any additional information would be immensely helpful.

Coordinator
Feb 8, 2013 at 8:36 PM
Edited Feb 8, 2013 at 8:40 PM
Just to clarify, Microsoft used the Sandcastle tools to produce their MSDN documentation, not SHFB. SHFB is a front end for the Sandcastle tools. Even then, they didn't use Sandcastle exactly like we do since they have a number of differences in the way they store their comments, examples, etc. Since they used the Help 2 and MS Help Viewer formats, it was also much easier for them to break up the build into different parts if necessary, because those formats use index and ID links rather than links to actual topics. As I recall, their actual MSDN output was converted from the Help 2 output using a separate process that was never made public.

This is a very old thread so a lot has changed since then. I've taken over support and development of the Sandcastle tools. To that end, I have recently been cleaning up and revising the build components. I've added namespace filtering support to the default components so that they don't have to load all of the framework information, just the relevant parts. I've also created a set of cached build components that store the framework reflection and comment data in ESent databases. This frees up a significant amount of memory which may help. These are in source control but have not been officially released yet. I don't have a firm date for when the next release will be published.

How much memory is ultimately used still depends on how much space your assembly's reflection data and comments take up. For example, without the new cache components, BuildAssembler can easily consume a gigabyte or more of memory when building the SHFB help file. If you're running on a 32-bit system, you're constrained by the 4GB limit less whatever is in use by all the other apps. On a 64-bit system, you've got room to use all the available memory in the system if you've got more than 4GB so a build may be successful on a 64-bit system with more memory whereas it fails on a 32-bit system with less.

I should add too that if you are building multiple help formats at once, the memory usage is currently multiplied by the number of formats produced since the components don't share data in the current release. If that is the case, building one format at a time may be a workaround. This problem is addressed in the next release by the changes described above, and building multiple formats typically uses no more memory than building a single format.
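
The one-format-at-a-time workaround could be scripted along these lines. This Python sketch only assembles the MSBuild command lines. It assumes the help format can be overridden with a `HelpFileFormat` property on the command line, and it uses a made-up project file name and format values, so check all of these against your SHFB version before relying on it.

```python
from typing import List


def per_format_commands(project: str, formats: List[str]) -> List[List[str]]:
    """Build one MSBuild command line per help format.

    Assumption: the SHFB project accepts a HelpFileFormat property override.
    """
    return [["msbuild", project, f"/p:HelpFileFormat={fmt}"] for fmt in formats]


# Hypothetical project file and format names.
for cmd in per_format_commands("Help.shfbproj", ["HtmlHelp1", "Website"]):
    print(" ".join(cmd))
```

Each command can then be run in sequence, for example with `subprocess.run(cmd, check=True)`, so that only one format's data is in memory at a time.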

It would help to know what you define as "very large". How many namespaces, types, and members are in one assembly, or in the combined set you are attempting to document? The MRefBuilder step reports that info. If you're packing hundreds of namespaces and thousands of types into one assembly, that could still be a problem. While the .NET Framework is large, it is spread across many different assemblies, so while documenting them all at once might not work, breaking the build down into smaller chunks would.
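
To pull those numbers out of a build log automatically, something like the following Python sketch could work. It assumes the summary line has the same wording as the one quoted in the first post ("Wrote information on 170 namespaces, 2238 types, and 12751 members"). Other versions of the tools may word it differently.

```python
import re
from typing import Optional, Tuple

# Pattern based on the summary line quoted in the first post of this thread.
SUMMARY = re.compile(
    r"Wrote information on (\d+) namespaces, (\d+) types, and (\d+) members"
)


def parse_summary(log_text: str) -> Optional[Tuple[int, int, int]]:
    """Return (namespaces, types, members) from a build log, or None if absent."""
    match = SUMMARY.search(log_text)
    if match is None:
        return None
    return (int(match.group(1)), int(match.group(2)), int(match.group(3)))


line = "Info: Wrote information on 170 namespaces, 2238 types, and 12751 members"
print(parse_summary(line))  # prints (170, 2238, 12751)
```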

Eric

Feb 8, 2013 at 9:03 PM
Eric,

Thanks for the quick response.
  • We're only trying to build the website format. However, we do need the help build to create links from a single concept/tutorial document to types in multiple different assemblies, so breaking the help build up by assembly doesn't seem to be a viable work-around for our use case.
  • Our builds run on 64-bit machines with 12+ GB of RAM.
  • The combined size of the assemblies we're trying to document is currently around 15k types and 50k members spread across 170 assemblies. Even with nothing else of any significance running in the background, the help build chews through 12 GB pretty quickly.
Hopefully, the caching you described will help. However, I'm afraid it might take more than that to solve our use case. Most likely, BuildAssembler will need to be refactored to support a "memory efficient" mode that keeps in memory only what is absolutely necessary at any given point during the build process. This mode would likely take longer than the standard mode, but would enable large-scale use cases such as ours. Can you estimate the feasibility and work required to implement such a mode? If necessary, we might be able to devote some of our developer resources to helping create such a mode...

Coordinator
Feb 12, 2013 at 2:06 AM
The new components would help, as they operate in the "memory efficient" mode you describe, but they are currently set to cache only the base framework data and comments, so they wouldn't help with the amount of project data you are producing. I'm considering letting the SQL version cache project data since it is so much faster than the ESent version. However, that will require a little more work. The ESent version could do the same thing and would be simple to allow since it stores each cache in a separate database. However, it has proved to be rather slow in the initial indexing phase, so for a large project it may take 30-60 minutes to index all of the information. I suppose it could be set up to let you specify that the cached project items be saved outside the working folder so that they don't need to be recreated on each build; you'd then be responsible for clearing them to rebuild them if the information changed. Perhaps the extra time overhead, even if it were incurred in each build, would be acceptable if it meant being able to actually produce documentation for such a large project.
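
To illustrate the general idea of a disk-backed cache (not the actual ESent or SQL components, whose code is not shown in this thread), here is a minimal Python sketch that uses SQLite as a stand-in. Entries are indexed once into a database file and looked up on demand, so only the requested values are held in memory. The class name, table layout, and member ID are invented for the example.

```python
import sqlite3
from typing import Iterable, Optional, Tuple


class DiskBackedIndex:
    """Key/value index stored in SQLite instead of an in-memory dictionary."""

    def __init__(self, path: str = ":memory:") -> None:
        # Pass a file path outside the working folder to reuse the
        # cache across builds; ":memory:" is used here only for the demo.
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS entries "
            "(key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )

    def add_many(self, items: Iterable[Tuple[str, str]]) -> None:
        # One transaction for the whole batch keeps the indexing phase faster.
        with self._db:
            self._db.executemany(
                "INSERT OR REPLACE INTO entries (key, value) VALUES (?, ?)", items
            )

    def get(self, key: str) -> Optional[str]:
        row = self._db.execute(
            "SELECT value FROM entries WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else None

    def close(self) -> None:
        self._db.close()


index = DiskBackedIndex()
index.add_many([("M:Example.Type.Method", "<member>...</member>")])
print(index.get("M:Example.Type.Method"))  # prints <member>...</member>
print(index.get("M:Missing"))              # prints None
index.close()
```

The trade-off is the one described above: the initial indexing and each lookup cost more time than an in-memory dictionary, in exchange for a much smaller memory footprint.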

I haven't done any work on the configuration dialogs for any of the new components but the code for the components themselves is checked in. Adding support for caching project data is possible but will require some changes to the build engine to allow writing out the options on the necessary configuration elements.

Eric