Real-Time Write Performance Tips

DEPRECATED

Apperate and its features (including this feature) are deprecated.

Whether you’re streaming data via a connector or writing real-time data from your application, you’ll want to do it the best, most performant way for your use cases. Here we’ll contrast the different options, describing their pros and cons and highlighting their performance.

Key Takeaways

Here are the performance findings and associated best practices:

  • Asynchronous writes are faster than synchronous writes - Unless the data you’re writing must be immediately ready to query, write the data asynchronously (e.g., use wait=false).

  • Writes are much faster for datasets opted out of Financial Identifier Resolution - Unless your data contains a financial symbol field (e.g., stock symbol field) or you are planning to query using other equivalent supported identifiers (other symbologies), opt out your dataset from Financial Identifier Resolution.

Read on to learn more about the performance analysis and how to configure for optimum real-time writes.

Asynchronous versus Synchronous Writes

When you write data directly to Apperate using the POST /record endpoint or the apperate.write() iexjs method, you can execute requests asynchronously (with wait=false) or synchronously (default). Async requests return immediately; sync requests return after the data can be queried.
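As a rough sketch of the difference, the helper below assembles a POST /record request with and without wait=false. Only the POST /record endpoint and the wait parameter come from the text above. The base URL, the path layout, the token parameter, and the function name are hypothetical placeholders, so check them against the API reference before relying on them. The helper builds the request without sending it.

```python
import json
from urllib.parse import urlencode

# Hypothetical placeholder; substitute the base URL from the API reference.
BASE_URL = "https://apperate.example.invalid/v1"


def build_record_request(workspace, dataset, records, token, wait=True):
    """Return (url, body) for a POST /record call.

    wait=True  -> synchronous (the default): returns once the data can be queried.
    wait=False -> asynchronous: returns immediately.
    The path layout and the `token` parameter are assumptions for illustration.
    """
    params = {"token": token}
    if not wait:
        # Sync is the default, so the flag is only added for async writes.
        params["wait"] = "false"
    url = f"{BASE_URL}/record/{workspace}/{dataset}?{urlencode(params)}"
    body = json.dumps(records)
    return url, body


records = [{"symbol": "AAPL", "price": 101.5}]
async_url, async_body = build_record_request(
    "my_workspace", "MY_DATASET", records, "TOKEN", wait=False
)
sync_url, _ = build_record_request("my_workspace", "MY_DATASET", records, "TOKEN")
```

Keeping the flag in one helper makes it easy to switch a writer between async and sync without touching the rest of the request code.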

The following chart contrasts the average POST /record execution latency for requests sent asynchronously and synchronously.

Async vs. Sync Record Posts - Average Latency

Async/sync requests    100 records           1000 records
Async                  1.00 ms per record    1.48 ms per record
Sync                   1.96 ms per record    1.99 ms per record

The asynchronous requests completed much faster than the synchronous ones, especially when request volume was lower (e.g., 100 records). Async requests were nearly 2x as fast with 100 simultaneous record writes and over 24% faster with 1,000 simultaneous record writes.
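The comparison can be checked with a few lines of arithmetic. This sketch uses only the latency figures reported above; the function names are illustrative.

```python
# Average latencies in ms per record, copied from the figures above.
latency_ms = {
    "async": {100: 1.00, 1000: 1.48},
    "sync": {100: 1.96, 1000: 1.99},
}


def speedup(batch_size):
    """How many times faster async is than sync for a batch size."""
    return latency_ms["sync"][batch_size] / latency_ms["async"][batch_size]


def latency_reduction_pct(batch_size):
    """Percentage of the sync latency saved by writing asynchronously."""
    sync = latency_ms["sync"][batch_size]
    return 100 * (sync - latency_ms["async"][batch_size]) / sync


for size in (100, 1000):
    print(f"{size} records: {speedup(size):.2f}x, "
          f"{latency_reduction_pct(size):.1f}% lower latency")
```

For these figures the speedup is about 1.96x at 100 records and about 1.34x at 1,000 records (roughly 25.6% lower latency), which is consistent with "nearly 2x" and "over 24%".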

Async/Sync Pros and Cons

You must decide whether async or sync real-time write requests best fit your use cases. The pros and cons of each are as follows.

Async

  • Pros: Data is stored faster. Each write request returns immediately.

  • Cons: The data may not be available to query as soon as you need it, because Apperate saves the data, then queues it, and shortly thereafter makes it available to query.

Sync

  • Pros: The data is available to query immediately after the write request returns.

  • Cons: Data takes longer to store. Write requests return more slowly; they return only after the data is available to query.

See also

Please see Write Data in Real Time with POST /record for details on writing data with POST /record.

Financial Identifier Resolution: Opt Out versus Opt In

When you generate datasets on-the-fly as part of your connector run, write request, or data upload, Apperate scans the initial record fields for financial symbols (e.g., stock tickers). On finding such a field, Apperate applies a metadata graph to the resulting dataset field, enabling Financial Identifier Resolution: the ability to query on the mapped field using equivalent supported financial identifiers. For example, you can use Apple’s INET symbol AAPL to query for records that use Apple’s ISIN US0378331005.

The drawback of opting a dataset in to Financial Identifier Resolution, however, is that real-time writes to it are much slower than if it were opted out.

The following chart contrasts writes for opted out datasets with writes for opted in datasets.

Financial Identifier Resolution Opt Out vs. Opt In - Average Record Post Latency

Opted in/out    100 records            1000 records
Opted out       1.96 ms per record     1.99 ms per record
Opted in        14.18 ms per record    126.08 ms per record

Writes to opted out datasets are much faster. For batches of 100 records, writes were 7x faster for opted out datasets; for batches of 1000 records, they were 63x faster.
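The 7x and 63x factors follow directly from the reported latencies. The sketch below recomputes them and also shows how per-record latency changes between the two batch sizes. It uses only the figures above; the variable names are illustrative.

```python
# Average POST /record latency in ms per record, from the figures above.
opted_out = {100: 1.96, 1000: 1.99}
opted_in = {100: 14.18, 1000: 126.08}

# How many times slower opted-in writes are at each batch size.
slowdown = {n: opted_in[n] / opted_out[n] for n in opted_out}

# How per-record latency changes when the batch grows from 100 to 1000 records.
growth_out = opted_out[1000] / opted_out[100]
growth_in = opted_in[1000] / opted_in[100]

for n, factor in slowdown.items():
    print(f"{n} records: opted-in writes are {factor:.1f}x slower")
print(f"per-record latency growth, 100 -> 1000 records: "
      f"opted out {growth_out:.2f}x, opted in {growth_in:.1f}x")
```

In these measurements, per-record latency for opted-out datasets stays nearly flat as the batch grows (about 1.02x), while for opted-in datasets it rises by about 8.9x, so the penalty widens with volume.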

Warning

If you don’t plan to use Financial Identifier Resolution with your dataset, opt your dataset out of the metadata graph.

Tip

A good way to opt out a data source’s target dataset is to stop the data source after writing some initial records. Then use the Schema Editor or Dataset API to check the dataset’s Opt-in status. If it is opted in, deselect that option and reingest the existing data per the following instructions.

How to Opt Out of Financial Identifier Resolution

Here are steps for opting out a dataset in the dataset schema editor:

  1. In the console, go to your workspace datasets. Your workspace datasets list appears.

  2. Click on your dataset. Your dataset’s Database page appears.

  3. Open the schema editor by clicking Edit. The schema editor appears.

  4. In the schema editor, scroll down to the Opt in section below the properties table. If Opt In is selected, unselect it; otherwise cancel editing the schema and skip the remaining steps.

  5. In the Select action for existing data field above the properties table, select Reingest data using a new schema.

  6. Submit your changes.

Apperate reingests the data according to your modified schema.

Additional Resources

Write Data in Real Time with POST /record

Write Data in Real Time with the apperate.write() JS Method

Normalized Financial Symbols

Understanding Datasets