



How to filter quotes data from high and low price spikes

Updated on 2009-08-05





Trading and backtesting stocks, futures and options with rules that rely on high and low prices or on limit orders can produce misleading results when the historical quotes database has not been cleaned. During an intraday session, several bad ticks can appear. These tick prices are often far from the actual transaction prices and are easy to detect in a real-time stream. This is not the case for longer-term data such as EOD data. As a consequence, the high and low prices in end-of-day quotes are sometimes incorrect, and the historical chart shows large spikes (long shadows) that would not exist if the bad ticks had been removed.

With an EOD database, the only solution to clean up these spikes is to guess whether the reported high and low prices are correct or not.

There are two ways to clean up a historical database using QuantShare. The first is to create a script that loops through the quotes data of each symbol, runs a cleaning routine and then saves the modified data. The second, which we explain in more detail below, is to create a Post-Script formula directly in the downloader so that the quotes are cleaned automatically before they are stored in the database.

Several methods can be used to clean bad ticks from quotes. We will detail a simple method that uses a standard deviation threshold to detect potential errors in the high and low prices.

First, open the Post-Script form of your downloader: open your end-of-day quotes downloader (for example, the Yahoo downloader, "Yahoo EOD historical quotes"), click on "Settings", then on "Next", and finally click on the 'Post Script' button.

This filtering method is rather simple. It consists of calculating the standard deviation of the bar range (high - low) over all the available quotes, then updating the high and low prices of every bar whose range exceeds 3 standard deviations. Any other level could also be used. For a normal distribution, there is only about a 5% chance that a value falls outside 2 standard deviations and about a 0.3% chance that it falls outside 3 standard deviations. Bar ranges are not exactly normally distributed, so these figures are only a rough guide.
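The two percentages quoted for the normal distribution can be checked with a few lines of Python. This is only an illustrative sketch; the function name `two_sided_tail` is our own and is not part of QuantShare.

```python
import math

def two_sided_tail(k: float) -> float:
    """Probability that a normally distributed value lies more than
    k standard deviations away from its mean (both tails combined)."""
    return math.erfc(k / math.sqrt(2.0))

for k in (2, 3):
    # Prints the tail probability as a percentage for each threshold.
    print(f"outside {k} standard deviations: {two_sided_tail(k):.2%}")
```

The result for 2 standard deviations is slightly under 5%, and the result for 3 standard deviations is slightly under 0.3%, which matches the rounded figures above.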

If a range falls outside our boundaries, we must update the high and low prices. A simple and conservative way to do this is to remove the lower and upper shadows completely, so that the new bar consists only of a body: the high becomes the greater of the open and close, and the low becomes the lesser of the two.
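The logic described above could be sketched as follows in Python. This is not the QuantShare Post-Script itself (which is available from the download link below), only a minimal illustration. It assumes that "exceeds 3 standard deviations" means the range is greater than the mean range plus 3 standard deviations; the function name `filter_spikes`, the `n_std` parameter and the dictionary layout of a bar are hypothetical.

```python
from statistics import mean, pstdev

def filter_spikes(bars, n_std=3.0):
    """Return a cleaned copy of `bars`, a list of dicts with the keys
    'open', 'high', 'low' and 'close'.

    Assumption: a bar is suspicious when its range (high - low) is
    greater than mean(range) + n_std * stdev(range)."""
    ranges = [b["high"] - b["low"] for b in bars]
    if len(ranges) < 2:
        # Not enough data to estimate a standard deviation.
        return [dict(b) for b in bars]

    threshold = mean(ranges) + n_std * pstdev(ranges)

    cleaned = []
    for bar, bar_range in zip(bars, ranges):
        bar = dict(bar)  # copy so the input list is left untouched
        if bar_range > threshold:
            # Remove both shadows: keep only the body of the bar.
            bar["high"] = max(bar["open"], bar["close"])
            bar["low"] = min(bar["open"], bar["close"])
        cleaned.append(bar)
    return cleaned

# Twenty ordinary bars followed by one bar with a suspicious range.
normal = {"open": 10.0, "high": 10.5, "low": 9.5, "close": 10.2}
spike = {"open": 10.0, "high": 40.0, "low": 5.0, "close": 10.2}
quotes = [dict(normal) for _ in range(20)] + [spike]

result = filter_spikes(quotes)
print(result[-1])  # the spike bar, reduced to its body
```

Note that with very short histories a single spike inflates the standard deviation so much that it may not cross the threshold, so a filter of this kind works best on a long series of quotes.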

The filter script can be downloaded here: Filter high and low spikes. You can find it in the Post-Script form.

Note: This script provides a way to deal with the high and low price spikes problem; several other methods could be used to filter quotes data.










