High-Frequency Automated Trading (HFAT); part 2

Interactive Brokers’ Real-Time Volume Data

Just like with price data, volume data are subject to delays and BF (Backfill) corrections. Moreover, IB (Interactive Brokers) reports volume data in a manner that could cause major performance differences between backtesting and actual trading.

This post outlines simple procedures to collect RT and BF data for comparison. No effort is made to explain the differences or to perform statistical analysis. The views expressed here are based on personal experiences and/or may be anecdotal; not everything that happens in real-time trading is easy to explain. As always, if you have technical insight and/or see inaccuracies, please comment for the benefit of future readers.

As expected, IB RT volume data contain the usual bad ticks and delays that are corrected during backfill. However, and this is very important to the RT trader, IB adjusts live volumes at about 30-second intervals. This means that the volumes IB reports during RT trading do not accurately reflect market activity. It also means that volume data may be delayed by up to 30 seconds, rather than by the typical snapshot delay of about 300 milliseconds for price data. Comparing backfilled with real-time volume, it appears that the real-time periodic volume adjustments are redistributed across individual snapshots during backfill. The methods outlined below are intended to get you started with your own data analysis.

To collect and save real-time data:

  1. Create a new database with a 5-second base time interval.
  2. Embed “RD”, for Raw Data, when naming the database.
  3. In Database Settings select the Interactive Brokers plugin.
  4. Pick a high volume stock, for example, AAPL (used in this post).
  5. Connect to the TWS (Trader Workstation), signing in to your Paper Trading account. Do not use the eDemo account.
  6. Collect about an hour’s worth of real-time data.

The first thing that happens when you connect to the TWS is that AmiBroker backfills approximately 2000 bars of 5-second data. This cannot be prevented, so be careful to note the time when backfill ends and raw data collection starts. The simplest way is to place a vertical line on your chart and label it “Start of real-time data”.

To save the database:

  1. Disconnect the IB plugin (see the Plugin menu at the bottom right of the chart).
  2. Open Database Settings and set the database to Local.
  3. Place another vertical line to indicate where data collection stopped.
  4. Go to the File menu and save the database.

Be sure to set the Database Settings -> Data Source -> Local before saving. If you do not do this the database will backfill on the next startup and this may corrupt your RT data sample.

The next step is to collect a sample of BF data that overlaps the previously collected real-time sample. To do this, you need to create another database. Since IB backfills only about 2000 bars of 5-second data, you should do this as soon as possible after collecting raw data; otherwise the collection periods may not overlap and you will not be able to compare the two types of data. The procedure is the same as above except that you embed “BF” (for backfilled data) instead of “RD” in the database name.

To visually compare the two databases you can open two instances of AmiBroker and load the RT database in one and the BF database in the other. You can then display the two databases at the same time and visually compare the respective charts. You may want to display both a price chart and a volume chart in separate panes, as shown in the captures below.

You can use the code below to inspect your price chart:

    Plot( C, "Close", colorBlack, styleBar );
    TN = TimeNum();
    CursorTime = SelectedValue( TN );
    // Sum the high-low range from the cursor-selected bar onward
    CumHL = Cum( IIf( TN >= CursorTime, H - L, 0 ) );
    Plot( CumHL, "", 4, styleArea | styleOwnScale );
    Title = Name() + " Interactive Brokers BackFilled price data - " + Interval( 2 );

And this code to inspect your Volume chart:

    Plot( Volume, "", 2, styleOwnScale | styleHistogram | styleThick );
    TN = TimeNum();
    CursorTime = SelectedValue( TN );
    // Sum volume from the cursor-selected bar onward
    CV = Cum( IIf( TN >= CursorTime, V, 0 ) );
    Plot( CV, "", 4, styleArea );
    Title = "Backfilled Volume data - " + Interval( 2 );

The above formulas will display basic charts plus a cumulative value (red area) for any parameter you would like to test. In the price chart, high-low range (H-L) is summed while in the Volume chart plain Volume is summed. Summation starts with the cursor-selected bar. This feature is only provided to visually reveal data differences; it has no other significance.

The charts below were created using the above methods, which quickly reveal the differences between the two types of data. Explaining why these differences occur is left to the expert reader (because I don’t have a clue!!).


Figure 1 – Backfilled data


Figure 2 – Real-Time Collected data

The following volume indicator can be used to display the RT volume periodicity more clearly:

    Filename = StrLeft( _DEFAULT_NAME(), StrLen( _DEFAULT_NAME() ) - 2 );
    // Spike filter: volume exceeds the highest volume of the prior 4 bars
    Vref = Ref( HHV( V, 4 ), -1 );
    VSpike = V > Vref AND V > Ref( Vref, -1 ) / 2;
    // Number of bars since the previous spike
    BS = ValueWhen( VSpike, BarsSince( Ref( VSpike, -1 ) ) + 1 );
    Plot( V, "", 2, styleHistogram );
    Plot( IIf( VSpike, V, Null ), "", 1, styleArea );
    FirstVisibleBar = Status( "FirstVisibleBar" );
    LastVisibleBar = Status( "LastVisibleBar" );
    TN = DateTime();
    S = Second();
    // Label each visible spike bar
    for( b = FirstVisibleBar; b <= LastVisibleBar AND b < BarCount; b++ )
    {
        if( VSpike[b] ) PlotText(
            "\n" + NumToStr( V[b] / 100 * Interval(), 1.0, False ) +
            "\n" + NumToStr( BS[b], 1.0, False ) +
            "\n" + NumToStr( S[b], 1.0, False ), b, V[b], 2 );
    }
    Title = "\nInteractive Brokers " + Filename + " - Display Raw data in 5-Second time frame\n" +
        "Histogram labeling:\n" +
        " Volume/100\n Barssince last Volume update\n Second Timestamp";

This code produced the next two charts below. A simple spike filter (see the VSpike definition in the code) is used to identify Volume spikes and make them stand out with a Black background. Since these volume spikes do not appear in backfilled data, we can assume that they do not reflect true market activity. The three numbers at the top of the histogram bars, from the top down, show the Volume/100, number of bars since the last volume spike, and the Second count derived from the data time stamp.
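To put a number on the periodicity, the spike spacing can be averaged over the whole sample. The following is only a minimal sketch, not part of the original post: it reuses the spike filter from the indicator above, and the names Gap, GapCount, AvgGapBars and AvgGapSeconds are introduced here purely for illustration. It assumes BarsSince() returns Null before the first spike, so that the first spike is left out of the average.

```afl
// Spike filter copied from the indicator above; Nz() turns Null results into 0
Vref   = Ref( HHV( V, 4 ), -1 );
VSpike = Nz( V > Vref AND V > Ref( Vref, -1 ) / 2 );

// Bars since the previous spike, measured on spike bars only (0 elsewhere)
Gap = IIf( VSpike, Nz( BarsSince( Ref( VSpike, -1 ) ) + 1 ), 0 );

// Running average gap; spikes without a measurable predecessor are excluded
GapCount      = Cum( Gap > 0 );
AvgGapBars    = Cum( Gap ) / Max( GapCount, 1 );
AvgGapSeconds = AvgGapBars * Interval();   // Interval() returns the bar length in seconds

Plot( AvgGapSeconds, "Average seconds between volume spikes", colorRed, styleLine );
```

If the roughly 30-second adjustment interval described earlier holds for your data, the plotted value should settle near 30 on the RT sample. On backfilled data the filter will still trigger on ordinary volume bursts, so the figure is only meaningful as a comparison between the two databases.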


Figure 3 – Real-time collected volume data

Applying the code on backfilled data produces the chart below. Note that many of the low volume periods between the spikes have been filled in (it appears that the volume spikes have been retroactively distributed) and that there is no longer any visible volume periodicity.


Figure 4 – Backfilled volume data

Comparing Data from different Databases

You can compare data from different databases in a single chart. Overlaying two data arrays will immediately reveal differences and will also suggest more sophisticated analysis to be performed. The code below can be executed by itself, or it can be appended to any other program. In this case it is coded for Volume comparison. However, you can easily modify it to compare price, indicators, or any other array. The SetBarsRequired() statement is necessary for data alignment. You must use the same timeframe for both RT and BF charts and for composite creation. All tests in this post were performed in the 5-second timeframe.

    // Store an array in a composite ticker named "~SA_" + VarName
    function StaticVarArraySet( VarName, array )
    {
        AddToComposite( array, "~SA_" + VarName, "C", atcFlagDefaults | atcFlagEnableInBacktest | atcFlagEnableInExplore | atcFlagEnableInIndicator | atcFlagEnableInPortfolio );
    }

    // Read the stored array back from the composite ticker
    function StaticVarArrayGet( VarName )
    {
        return Foreign( "~SA_" + VarName, "C" );
    }

    SetBarsRequired( 1000000, 1000 );
    GraphZOrder = 1;
    StaticArrayName = ParamList( "Static Array Name", "RawDataSample|BackfillDataSample", 0 );
    if( ParamTrigger( "Create Volume Composite", "CREATE" ) )
    {
        StaticVarArraySet( StaticArrayName, V );
    }
    if( ParamToggle( "Overlay Composite", "NO|YES", 0 ) )
    {
        Plot( StaticVarArrayGet( StaticArrayName ), "", colorYellow, styleStaircase );
    }

To compare BF with RT volume arrays, you first create the composite for the BF volume and copy this to your RT database for comparison. The procedure is as follows:

  1. Load up the database containing your BF data sample.
  2. Display the data and open the Param window:


  3. Select BackfillDataSample for Static Array Name.
  4. Click CREATE.
  5. In the AmiBroker menu bar, click View -> Refresh All.
  6. In the Param window, set Overlay Composite to YES. The composite data should display as a Yellow staircase superimposed on your volume chart.
  7. Close AmiBroker.
  8. Use Windows Explorer to find your BF database, copy the BF volume composite (~SA_BackfillDataSample) from the “_” folder, and paste it into the “_” folder of the RT database.
  9. Delete the Broker.Master file from the RT database. This file will be recreated at next startup. This step is needed to include the new composite file in the database index.
  10. Start up AmiBroker and load up the RT database.
  11. Display the RT volume chart you were working with. If the Parameters are set as shown in the capture above you should now see the Yellow staircase for BF Volumes superimposed on the RT volume histogram.

At this point you can scroll back and forth in time to see how BF volume differs from RT collected volume. Do not click CREATE, or you will overwrite the BF composite. The charts below show what your charts should look like.


Figure 5 – BF composite (Yellow) on BF Volume Histogram

Figure 5 above shows a period where the composite covered backfilled volume (for example the backfill period before RT collection). Because the composite copied this BF data, they match perfectly.


Figure 6 – BF Composite (Yellow) on RT collected Volume Histogram

Figure 6 above is for a period where the composite (backfilled volume) is superimposed on the real-time collected volume (histogram). Note the difference between the two types of data.
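Beyond the visual overlay, the difference between the two volume arrays can be plotted directly. The sketch below is an illustration only, not part of the original procedure. It assumes the BF composite was created under the name BackfillDataSample and has been copied into the RT database, and it is meant to be applied to the RT volume chart. The names BFVol, Diff and CumDiff are introduced here for illustration.

```afl
SetBarsRequired( 1000000, 1000 );   // same alignment requirement as the composite code

// Backfilled volume read from the composite (assumed name: BackfillDataSample)
BFVol = Nz( Foreign( "~SA_BackfillDataSample", "C" ) );

// Per-bar difference: positive where RT volume exceeds backfilled volume
Diff = V - BFVol;
Plot( Diff, "RT minus BF volume", colorRed, styleHistogram );

// Running difference from the cursor-selected bar onward
TN = TimeNum();
CursorTime = SelectedValue( TN );
CumDiff = Cum( IIf( TN >= CursorTime, Diff, 0 ) );
Plot( CumDiff, "Cumulative difference", colorBlue, styleLine | styleOwnScale );
```

If backfill merely redistributes the periodic RT volume adjustments across neighboring bars, one would expect the cumulative difference to stay small over stretches much longer than 30 seconds. That is a hypothesis to check against your own samples, not a result. The comparison is only meaningful where the composite and the RT collection period overlap.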

Developing a trading system should start with learning the basics; delays and bad data quality can kill any HFAT system no matter how much time you spend developing it. The best way to understand what you are working with is to write a few small programs, like those included in this series.


In the previous discussions, it became clear that developing an HFAT system might not be as easy as you think. Googling for information will reveal very few links to practical information; you’ll be mostly on your own to discover the pitfalls. Developing with live data from your paper-trading account may be better than using backfilled data. However, since it is highly likely that IB executes paper trades subject to the reported price and volume you see, paper-trading results may not match actual trading results. Unless you are acutely aware of the various problems and can develop your system to work around them, it would appear futile to try to develop an HFAT system with 5-second IB data. The unique real-time volume patterns also occurred in data collected from the real-trading account.

Data from all sources will have their own unique problems, and it is prudent to perform some basic testing to get to know your RT data before spending considerable time on development.


IB Snapshots and data compression methods are relevant to the above discussion; even though there isn’t much detail available, you may want to read the following threads to learn more about these topics.

AmiBroker user group: Interactive Brokers Plug-in dropping volume data
IB’s Discussion Board: Globex Ticks snapshot or reality?
AmiBroker User Group: AB Tick Bar Analysis

Edited by Al Venosa.
