Document Cache components

Document caching enables you to temporarily hold and index Boomi documents within the scope of an integration process.

note

Document caching and the Document Cache component are part of the Advanced Workflow and are available only in the Professional and Enterprise Editions of Integration. For more information, contact your account representative.

Document caching lets you add documents to a cache and then reference them later in a process or subprocess. You can look up and hold data in memory, and reference that data when you need it in the process. This helps you to avoid making multiple connector calls to an application within a single process to look up different types of information. You can get the documents from various sources, index and store the documents you need, and then retrieve the cached documents for use in a process execution. Document caching also allows you to persist data across branches.

Understanding document caching

Each document becomes an entry in the document cache. If a document contains multiple records that you want to retrieve separately (as in a batch file), you must split the document before adding it to the cache.
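The split-then-cache rule above can be illustrated with a small conceptual model. This is a sketch in plain Python, not Boomi's API: the function names `split_batch` and `add_to_cache`, the CSV layout, and the `customer_id` key are all hypothetical, chosen only to show why a batch has to be split before each record can be retrieved on its own.

```python
import csv
import io


def split_batch(batch_text):
    """Split a CSV batch (header plus rows) into one single-record document per row."""
    reader = csv.DictReader(io.StringIO(batch_text))
    return [dict(row) for row in reader]


def add_to_cache(cache, documents, key_field):
    """Store each document as its own cache entry, indexed by the value of key_field."""
    for doc in documents:
        cache.setdefault(doc[key_field], []).append(doc)
    return cache


# Hypothetical batch file with two customer records.
batch = "customer_id,name\nC001,Acme\nC002,Globex\n"

# Splitting first means each record can be looked up separately by its key.
cache = add_to_cache({}, split_batch(batch), "customer_id")
print(cache["C002"][0]["name"])  # Globex
```

Had the batch been added unsplit, the model would hold one entry containing both records, and neither could be retrieved by its own key.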

A document cache can be shared among parent and child processes. You can add documents to the cache in the parent process, and those documents are available in any child processes. However, the document cache is temporary. Documents remain in the cache only for a single execution of a process, either in Test mode or production. You can also remove some or all of the documents from the cache during process execution, if you want to reuse the same document cache.

Document caching simplifies data synchronization and integration processes, streamlines repetitive tasks, reduces the risk of errors and inconsistencies, lowers costs in data migration projects, and saves time.

Use cases

Document caching and Document Cache components are useful if you:

Need to combine documents from different sources, including flat file systems and cloud-based applications
Work with documents that are missing data or contain inaccurate data, so you have to query other documents or web sites to find the information
Need to find default values for an application to use when there is no corresponding data from the source system
Need to find industry IDs and codes from outside data sources
Have integration processes that repeatedly query a source
Are using complex integration processes with multiple Set Properties steps
Need to send or receive SOAP messages with MIME attachments. The attachments are stored in a document cache.
Need to send MIME messages with attachments. The attachments are stored in a document cache.

However, remember the following when using a document cache:

A document cache can only be accessed in the scope of the current process execution.
You cannot persist data in a cache across multiple executions or processes. The contents of a cache are deleted upon the completion of each process execution.
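The two limitations above can be pictured with a short conceptual model. This is a sketch under assumptions, not how the Runtime is implemented: `process_execution` and `child_process` are hypothetical names, and a plain dictionary stands in for the document cache.

```python
from contextlib import contextmanager


@contextmanager
def process_execution():
    """Model one process execution: the cache exists only inside the with-block."""
    cache = {}
    try:
        yield cache
    finally:
        # Contents are deleted when the execution completes.
        cache.clear()


def child_process(cache):
    """A child process called within the same execution sees the parent's entries."""
    return cache.get("C001")


# First execution: the parent adds a document, the child reads it.
with process_execution() as cache:
    cache["C001"] = {"name": "Acme"}
    seen_by_child = child_process(cache)

# Second execution: it starts with an empty cache of its own.
with process_execution() as next_cache:
    seen_by_next_run = next_cache.get("C001")

print(seen_by_child)     # {'name': 'Acme'}
print(seen_by_next_run)  # None
```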

note

The following resources provide more in-depth information about the capabilities of document caching:

Watch the Document Caching video to learn about document caching functionality and to see an integration scenario.
Learn about common errors and behaviors in Document Cache Best Practices and Common Scenarios.

Document cache workflow

To use document caching, create a Document Cache component. Document Cache components are reusable throughout all of your processes.

General steps for using document caching:

Create a Document Cache component, which determines which documents are cached and how they are indexed.
Add documents to the document cache to use them in a process or subprocess. There are two ways to add documents to a cache:
1. You can bring data into a process and then use the Add to Cache step to add documents to the document cache. The Add to Cache step references a Document Cache component.
2. You can specify on a connector operation that the documents returned by the connector call are added to the selected document cache. This option is available on the Caching tab of the connector operation. This method, in effect, combines a connector call and an Add to Cache step.

After adding documents to the cache, you can retrieve data from those documents in the following ways:

You can join multiple sources in a map by adding documents from document caches. After adding the document caches to the source profile, you can map elements from the document caches and the source profile to the destination profile's elements.
You can query a Document Cache component in every place that you can use a parameter value. You can use parameter values in the following steps: Connector (on the Parameters tab), Decision, Exception, Message, Notify, Program Command, and Set Properties. You can output a single field from a single document.
You can use the Retrieve From Cache step in your process, which retrieves documents from the selected Document Cache component. You can output multiple documents from a single document.
You can select a document cache lookup function in maps to retrieve particular fields from a document.
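The retrieval options above differ mainly in what comes back: a parameter-value lookup yields a single field from a single document, while the Retrieve From Cache step returns documents. The following sketch models that difference in plain Python. It is not Boomi's API: `CacheModel`, `lookup_field`, and `retrieve` are hypothetical names, and taking the first match in `lookup_field` is a simplification of the sketch, not documented product behavior.

```python
from collections import defaultdict


class CacheModel:
    """Toy stand-in for a document cache indexed on one key field."""

    def __init__(self, key_field):
        self.key_field = key_field
        self._index = defaultdict(list)

    def add(self, document):
        self._index[document[self.key_field]].append(document)

    def lookup_field(self, key, field):
        """Parameter-value style: one field from one matching document.

        Sketch simplification: the first match is used; how the product
        treats multiple matches is not modeled here.
        """
        matches = self._index.get(key, [])
        return matches[0][field] if matches else None

    def retrieve(self, key):
        """Retrieve From Cache style: every document stored under the key."""
        return list(self._index.get(key, []))


cache = CacheModel("order_id")
cache.add({"order_id": "A1", "sku": "X"})
cache.add({"order_id": "A2", "sku": "Y"})
cache.add({"order_id": "A2", "sku": "Z"})

print(cache.lookup_field("A1", "sku"))  # X
print(len(cache.retrieve("A2")))        # 2
```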

note

When documents are retrieved from a document cache for use in a process, they replace the current document data and its document properties with those from the cache.

Document Cache storage

A document cache's documents and indexes are stored to disk within the Runtime directory and loaded into memory when needed. Storing the documents and indexes to disk during process execution reduces the amount of memory the process uses and allows the handling of large volumes of data.

There is no fixed size limit on a document cache. However, because document caches are temporarily stored on disk, their size is limited by the available disk space.

If you use document caching and run your processes on a Runtime cluster or Runtime cloud, the document cache is distributed within the Runtime cluster or Runtime cloud. Therefore, parallel processing can take advantage of it.

After a process finishes running, whether in Test mode or as a deployed process, the cached documents are purged from the Runtime. The cache is temporary and cannot be used by another process.

Low Latency processes and Document Caches

Writing data to disk greatly increases processing time, so low latency processes that use document caches are handled differently. To allow low latency processes to execute quickly, their cached documents and indexes are stored to disk only under certain conditions.

Cloud owners can set the Runtime Working Overflow Size quota (on the Attachment Quotas tab in Cloud Management). For low latency processes, this quota limits the number of bytes per working datastore maintained in memory before overflowing to disk. If this quota is set and the low latency process’ index exceeds the quota, the index is stored to disk. If the process’ cached documents exceed the quota, they are also stored to disk.
If the Runtime Working Overflow Size quota is not set, then there is a 1 MB limit for indexes and a 1 MB limit for cached documents. If the low latency process’ index is greater than 1 MB, it is stored to disk. If the low latency process’ cached documents are greater than 1 MB, then they are stored to disk.
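The overflow rules above reduce to a simple comparison that is applied separately to the index and to the cached documents. The sketch below restates them in Python as an aid to reading. It is not Boomi code: the function names are hypothetical, and it assumes that "1 MB" means 1,048,576 bytes, which the text does not specify.

```python
# Assumption: "1 MB" is taken as 1,048,576 bytes; the source does not say.
DEFAULT_LIMIT_BYTES = 1 * 1024 * 1024


def stored_to_disk(size_bytes, overflow_quota_bytes=None):
    """Return True if data of this size would overflow from memory to disk.

    overflow_quota_bytes models the Runtime Working Overflow Size quota;
    None means the quota is not set, so the 1 MB default applies.
    """
    if overflow_quota_bytes is not None:
        limit = overflow_quota_bytes
    else:
        limit = DEFAULT_LIMIT_BYTES
    return size_bytes > limit


def placement(index_bytes, documents_bytes, overflow_quota_bytes=None):
    """Evaluate the index and the cached documents independently."""
    return {
        "index": "disk" if stored_to_disk(index_bytes, overflow_quota_bytes) else "memory",
        "documents": "disk" if stored_to_disk(documents_bytes, overflow_quota_bytes) else "memory",
    }


# No quota set: a 200 KB index stays in memory, 5 MB of documents go to disk.
print(placement(200_000, 5_000_000))
# {'index': 'memory', 'documents': 'disk'}
```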