Aidyia Limited - CONFIDENTIAL

Aidyia Methodology Overview
Cassio Pennachin & Ben Goertzel - October 2014

This document gives a very brief, nontechnical summary of the process that went into generating the US Equities and Global backtest results that Aidyia produced in October 2014.

EFTA01201796

1. Overview of Aidyia's Predictive Methodology

The novel predictive methodology we have developed involves a series of stages:

• Input features, including
  o Price/volume based features, including standard financial indicators and more advanced mathematical indicators, produced based on daily financial data (and in some cases more granular data)
  o News based features, based on English and Chinese language newsfeeds
  o Fundamental and macroeconomic features
• Predictive algorithms, including 5 different algorithms used in the backtests and 2 in development.
  o Each of these algorithms, before being applied to a certain universe of financial instruments, is first trained on historical data regarding those instruments. Based on this training, it learns a set of "predictive models" for each of the instruments in the universe on that day.
  o A predictive model for a certain instrument, applied on a certain day, views some subset of the input features generated for that instrument up to that day, and then makes a prediction of the direction and magnitude of the instrument's price movement N days in the future (where N is specified at the start of the training process).
  o Some of our algorithms are entirely proprietary to Aidyia; others are heavily customized versions of open-source machine learning tools.
  o We use the term "model class" to refer to the set of models coming out of a particular algorithm and predicting movements in a particular direction with a particular lookahead N.
• Signal weighting: Methods for assigning numerical weights to predictions, based on diverse data regarding each prediction.
• Model class weighting: Methods for assigning numerical weights to the different model classes on a given day, based on the historical performance characteristics (and other aspects) of each model class. These are among the inputs used by our methods to assign weights to individual predictions (as one important property of a prediction is which model class it came from).

The predictive algorithms are the core of our proprietary trading methodology; they are our "secret sauce." However, our predictive algorithms alone, applied in a naive way without embedding in an appropriate domain-specific framework, would not yield adequate trading performance. Alongside our research on predictive algorithms, we have put very significant R&D effort into what comes before (the input features) and after (signal weighting, model class weighting and signal combination).

What comes from this series of stages is a set of predictive signals, for each instrument in the universe in question, on each day. Then, in Aidyia's overall trading framework, these predictions are fed into a portfolio management system which handles capital allocation, risk management, and trading system integration.

2. High Level Development and Testing Narrative

Now we will review the overall high level development and testing process of which this backtesting has been the most recent part. The overall process is important, because careful handling of data is extremely critical in this kind of work to avoid falling prey to the various overfitting-related errors that may otherwise occur.

Aidyia Limited was founded in late 2011, and since that time most of the firm's R&D efforts have gone into creating and refining algorithms for predicting the prices of financial instruments.
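As a concrete illustration of the signal weighting and model class weighting stages described in Section 1, here is a minimal sketch. All names, the data schema, and the simple additive combination rule are hypothetical assumptions for illustration; the document does not describe Aidyia's actual weighting methods.

```python
from dataclasses import dataclass

@dataclass
class Prediction:
    """One model's forecast for one instrument on one day (hypothetical schema)."""
    instrument: str
    direction: int      # +1 = predicted up-move, -1 = predicted down-move
    magnitude: float    # predicted size of the move
    model_class: str    # which algorithm / direction / lookahead N produced it

def weight_predictions(predictions, class_weights):
    """Combine raw predictions into one signal per instrument, scaling each
    prediction by the weight assigned to the model class that produced it
    (e.g. a weight derived from that class's historical performance)."""
    signals = {}
    for p in predictions:
        w = class_weights.get(p.model_class, 0.0)
        signals[p.instrument] = signals.get(p.instrument, 0.0) \
            + w * p.direction * p.magnitude
    return signals
```

For example, two model classes disagreeing on the same stock would yield a net signal whose sign follows the more heavily weighted class.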
Until Summer 2014, this work was focused more specifically on creating algorithms aimed at:

• Forecasting the prices of relatively high-liquidity stocks on the Hong Kong stock exchange (liquidity criteria were defined to restrict the stock universe used for experimentation, resulting in around 200 adequately liquid stocks at any point in time).
• Forecasting these stocks' prices 1, 5, 10 or 20 days in the future.
• Achieving adequate predictive performance as validated by experimentation on historical data from 2007-2011.

Data from 2012-2013 was held out until October 2014, for use as true out-of-sample validation data. This data was not used at all in the experimentation or testing process until the "true out of sample validation" test was done in late October 2014. All the testing and development was done, until this point, as if the world ended on Dec. 31, 2011.

After a period of experimentation, we decided to restrict focus to forecasting 5 or 20 days in the future. 1 day was eliminated because the results were generally not good enough (which may be largely an artifact of the large transaction costs on the Hong Kong market, on which we were doing all our testing at that point). 10 days was eliminated purely for simplicity: running our model learning algorithms is computationally costly, so by restricting attention to 5 and 20 days we could explore more variations of our algorithms using the limited compute time at our disposal.

The Hong Kong stock market has relatively high transaction costs, due both to fees imposed on each trade by law (stamp duty tax) and to relatively high friction. This posed difficulty in creating a system with adequate returns on the 2007-2011 Hong Kong equity data we were experimenting with. Therefore, in Summer 2014, we decided to shift focus and aim our software at the tasks of predicting US equity and global macro asset prices instead.
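The "as if the world ended on Dec. 31, 2011" discipline described above can be sketched as a simple data guard, so that held-out observations never reach experimentation code at all. This is a hypothetical illustration (the row schema and function names are assumptions; the document does not describe Aidyia's actual data pipeline):

```python
import datetime as dt

# Development code only ever sees data up to this cutoff; the 2012-2013
# hold-out stays "virgin" until the final validation step.
DEVELOPMENT_CUTOFF = dt.date(2011, 12, 31)

def development_view(series, cutoff=DEVELOPMENT_CUTOFF):
    """Return only the observations visible during development.
    Each row is assumed to be a dict with a 'date' key."""
    return [row for row in series if row["date"] <= cutoff]
```

Under a guard like this, the out-of-sample years are excluded mechanically rather than by relying on researchers to avert their eyes.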
Our US equity universe was defined as the members of the S&P 500, and the global portfolio is a collection of about 50 instruments covering commodities, interest rates, equity indexes, and currency pairs.

Some of our software's input features were not appropriate for these other markets, e.g. news-based features and most fundamental features. We simply disabled these features for our tests on these other tradable universes. (Obviously this is not optimal, and we suspect better results could be obtained by modifying these features appropriately for these other universes, rather than simply disabling them. However, that would have been time-consuming, so we chose a more expedient route for our first pass of experimentation.)

On the US equity and global macro universes, we ran our predictive modeling algorithms on data from 2002-2011. Results were good, with basic predictive accuracy around the same level as we had been finding on the Hong Kong equity data, but with profitability looking much easier to come by due to the much lower transaction costs in these other markets.

We then designed a simple backtesting framework, including code accounting for signal weighting and combination (combining the signals from multiple predictive models produced by our multiple learning algorithms), capital allocation, portfolio management and trade exit management. Our assumptions in these regards have been relatively simplistic, but customized in some regards to the particulars of our overall prediction framework (particularly to the fact that we are combining predictions produced by multiple heterogeneous prediction algorithms, possessing quite different properties).

We then ran backtests on the US equity and global macro data from 2007-2011. This exposed some things in our backtesting framework in need of adjustment. Once these adjustments were done, we ran backtests on these universes running all the way from 2003-2011.
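To illustrate the kind of "relatively simplistic" capital allocation assumptions mentioned above, here is a hypothetical sketch that normalizes per-instrument signals into dollar positions with a per-name cap. The cap value, normalization rule, and names are assumptions for illustration, not Aidyia's actual rules:

```python
def allocate(signals, capital, max_weight=0.05):
    """Turn per-instrument signals into dollar positions: normalize each
    signal by gross signal size, then cap every name at max_weight of
    total capital (long or short)."""
    gross = sum(abs(s) for s in signals.values())
    if gross == 0.0:
        return {inst: 0.0 for inst in signals}   # no conviction: stay flat
    positions = {}
    for inst, s in signals.items():
        w = s / gross                              # signed portfolio weight
        w = max(-max_weight, min(max_weight, w))   # per-name cap
        positions[inst] = w * capital
    return positions
```

A backtest loop would call something like this each day with the combined signals, then apply transaction costs and trade exit rules on top.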
These results looked promising, and so we finally (in October 2014) decided the time had come to run our code on data from the long-held-out "true out of sample" time period of 2012-2013. These results were also reasonably good, thus providing significant validation that we had not overfit our methodology to the time period before 2011. (Of course, the applicability of our HK-equity-tuned system to the US equity and global macro universes also serves as significant validation that we had not overfit our methodology to the HK equity universe either.)

Based on these results, it seems fair to provisionally conclude that we have found a trading methodology with reasonable effectiveness across different asset classes and time periods.

To recap the high level experimentation and validation steps that have just been reviewed above in narrative form, what we did was, in order:

1. Nov 2011-June 2014: Develop predictive algorithms via testing on HK equity data from 2007-2011.
2. July-Sep 2014:
   a. Validate these predictive algorithms via testing on US equity and global macro data from 2002-2011.
   b. Develop a trading methodology (model combination, capital allocation, portfolio and trade management) via testing on US equity and global macro data from 2007-2011.
   c. Validate the trading methodology and predictive algorithms via testing on US equity and global macro data from 2003-2006.
3. Oct. 2014: Validate the predictive algorithms and trading methodology via testing on US equity and global macro data from 2012-2013 (data that had been kept totally "virgin" with respect to Aidyia's work, across all asset classes, until we reached this final step).
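The timeline above can be encoded as data, with a check that the final hold-out years never overlap any earlier development or validation phase. This is a hypothetical sketch; phase labels are paraphrased from the steps above and year ranges are inclusive:

```python
# Phases in order; the last entry is the true out-of-sample validation.
PHASES = [
    ("develop predictive algorithms (HK equities)",     (2007, 2011)),
    ("validate predictive algorithms (US/global)",      (2002, 2011)),
    ("develop trading methodology (US/global)",         (2007, 2011)),
    ("validate trading methodology (US/global)",        (2003, 2006)),
    ("final true out-of-sample validation (US/global)", (2012, 2013)),
]

def holdout_is_virgin(phases):
    """True if the final phase's year range overlaps no earlier phase."""
    lo, hi = phases[-1][1]
    return all(end < lo or start > hi for _, (start, end) in phases[:-1])
```

Every earlier phase ends by 2011, so the 2012-2013 hold-out passes this check.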