Tuesday, August 9, 2011

Out of Sample Strategy Performance Update

A little while ago, we published a blog post on a trading signal we've developed internally based on media analytic data. In May, we launched a live version of the components of that signal, as a feature of the Recorded Future API. Our customers can pull this data directly from the API at 3:30pm, giving them time to trade before the equity markets close at 4.

Taking the same strategy we presented earlier, and using the live data as it was available to our customers at 3:30, we have rolled our backtest forward and looked at the performance of this strategy over the last few tumultuous months. Between May 13 and August 5, this strategy returned 10.4%, while the market lost 9.9% of its value. These returns have been fairly consistent, and turnover has been similar to what we saw in our original backtests. The results are plotted above.

Of course, this is a short time window, encompassing just 59 trading sessions, and we haven't taken trading costs into account in this analysis. Still, we find these results encouraging and will continue to look for other sources of long-term signal in our studies going forward.

If you'd like to learn more about the Recorded Future media analytics API, contact our team.

Friday, March 4, 2011

Factor Modeling Media Analytic Data

At Recorded Future, we’re scouring the web for predictive signals in online content. Previously, we’ve covered our efforts at complex event modeling, and liquidity modeling using news flow information. Publicly, we’ve also touched briefly on some of our returns modeling - we’ve seen instances of particular blogs that seem to have superior predictive power in terms of their ability to write about stocks that will outperform.

Recently, we’ve expanded this approach to build a whole-market factor model that uses media analytic data to predict excess returns. Using aggregate data for the S&P 500, which is available to our API customers, we’ve built a number of factors that are derived from online sentiment and momentum of S&P 500 constituents that show statistically robust predictive signals of market-relative returns over a 1-day to 1-week investment horizon in a time-series cross-sectional modeling environment.

Factor Examination
Let’s take a look at one such factor, which is based on sentiment and momentum. If we take this factor, and break it into deciles by day and then construct portfolios for each decile, we see the following cumulative continuous returns in these portfolios. We’ve included dividend-adjusted returns to the SPDR S&P 500 ETF (SPY) as a benchmark in bright orange.


You can see quite clearly that over the last two years, our top decile (in orange) has outperformed all other deciles in a fairly consistent manner. Meanwhile, the bottom three deciles (the three darkest shades of blue) have underperformed all other deciles, as well as the market. One thing to note is that this relationship is not strictly linear. For instance, our 2nd, 3rd, and 4th place deciles actually fall near the middle of the returns distribution, which may have something to do with the construction of this particular factor.
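As a rough illustration of the decile construction described above, here is a minimal Python sketch. It is not the R code we published. It assumes equal-weighted portfolios, a rank-based bucketing rule, and an input layout of (date, ticker, factor score, simple return) tuples, all of which are illustrative choices, and the function name is hypothetical.

```python
import math
from collections import defaultdict

def decile_cumulative_log_returns(rows, n_buckets=10):
    """rows: iterable of (date, ticker, factor_score, simple_return).

    Returns {bucket: cumulative log return}; bucket 1 holds the lowest scores.
    Equal weighting and rank-based buckets are assumptions of this sketch.
    """
    by_date = defaultdict(list)
    for date, ticker, score, ret in rows:
        by_date[date].append((score, ret))

    cumulative = {b: 0.0 for b in range(1, n_buckets + 1)}
    for date in sorted(by_date):
        ranked = sorted(by_date[date], key=lambda pair: pair[0])
        n = len(ranked)
        buckets = defaultdict(list)
        for i, (score, ret) in enumerate(ranked):
            # Position i of n in the daily ranking maps to bucket 1..n_buckets.
            bucket = i * n_buckets // n + 1
            buckets[bucket].append(ret)
        for bucket, rets in buckets.items():
            # Equal-weighted portfolio return for the day, accumulated in logs.
            daily = sum(rets) / len(rets)
            cumulative[bucket] += math.log1p(daily)
    return cumulative
```

Working in log returns means each day's contribution simply adds, which is what "cumulative continuous returns" refers to. Whether the score on a given day is paired with that day's return or the next day's is left to whoever prepares the input rows.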

If we compare the portfolios to the performance of the S&P 500 over this period, we find that the portfolio in the top decile has a Beta of 1.08, assuming a risk-free rate of return roughly equivalent to that of T-bills over the period. It has a statistically significant annualized (continuous) Jensen’s alpha of +16% over the period. When we examine the bottom two deciles under the same assumption, we see that they are high Beta portfolios (1.37 and 1.34, respectively), but with statistically significant negative alphas of -42% and -26% annually, respectively. As you might imagine, constructing hedged portfolios out of the securities in these deciles provides some possibly compelling trading strategies.
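For readers who want to see the mechanics, Beta and Jensen's alpha can be estimated with a single-factor regression of excess portfolio returns on excess market returns. The sketch below is a plain ordinary least squares fit in Python. The function name, the 252-day annualization factor, and the constant per-period risk-free rate are assumptions made for illustration, and it does not compute the significance tests mentioned above.

```python
def capm_alpha_beta(portfolio_returns, market_returns,
                    risk_free=0.0, periods_per_year=252):
    """OLS fit of (r_p - r_f) = alpha + beta * (r_m - r_f).

    Inputs are per-period continuous returns on the same dates;
    risk_free is a constant per-period rate (an assumption of this sketch).
    Returns (annualized_alpha, beta).
    """
    y = [rp - risk_free for rp in portfolio_returns]
    x = [rm - risk_free for rm in market_returns]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    beta = cov_xy / var_x
    # The intercept is the per-period alpha; continuous returns scale linearly.
    alpha = mean_y - beta * mean_x
    return alpha * periods_per_year, beta
```

A Beta above 1 with a negative alpha, as in the bottom deciles, means the portfolio moves more than the market while still lagging what its market exposure alone would predict.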

If you’d like to experiment with this approach yourself, we’ve made some R code available on our Google Code site that will pull in market data and Recorded Future data, and perform this sort of decile analysis on a factor of your choosing. You’ll need a Recorded Future API token to pull that data.

Soon, we’ll discuss the inclusion of a factor like this into a portfolio built using other factors based on Recorded Future media analytic data, and find out whether a portfolio like this can stand up to trading costs, and evaluate its performance in an out-of-sample context.

Thursday, March 3, 2011

Turning Online Media into Big Data for Quants

We recently hosted a webcast discussing applications of the Recorded Future news analytics API for quantitative finance, and a big thanks goes out to everyone who joined us. The original presentation can be viewed here, and slides from the session detailing how we turn online media into actionable data, as well as several case studies, are below:

Thursday, February 17, 2011

Live Webinar: Recorded Future for Quantitative Trading

When: Tuesday, March 1 at 10am EST
Where: Web conference (register here)

Join us on Tuesday, March 1 at 10am Eastern time for a webcast introducing how you can apply our news analytics API data to quantitative investment and trading strategies.

Recorded Future converts the real-time stream of news, niche, and other online channels into the only source of past, planned and speculative events on the web. These events range from corporate and government announcements to discussion of what might happen in the future — speculative events. This temporal analysis is the focus of Recorded Future, the analytical tool-kit we’ve developed, and the API that’s available.

The live session led by our Chief Analytic Officer Dr. Bill Ladd will feature modeling experiments showing how Recorded Future data can be used to formulate predictive models of liquidity and volatility as well as returns around “future” events.

We’ll also provide an in-depth introduction to Recorded Future’s temporal data including how we use computational linguistics to extract and index events, entities and related statistical measures from online media to create a robust data set ripe for generating innovative trading strategies.

Register for the March 1 event!

Sunday, January 30, 2011

Detecting and Profiting from the Future

Recorded Future identifies and collects discussion of events scheduled/speculated to happen in the future, and we want to find ways to incorporate this information into investment strategies. We’ve already seen some predictive power with these “future” events and are in the process of "pulling it apart." Specifically, we’ve investigated the market impact of our “future” events and observed some interesting behaviors.

When we look at market returns in the 5 days before and after a forecast event, we see a slight rise in returns before the event followed by a drop after the actual occurrence of the event.





Since these events are known ahead of time, it is initially surprising that there is any price movement at the event.

Drilling into the data a little further, we looked at just the most predictably scheduled events: earnings calls. We see that when the future event is an earnings event, on average there is a ~25bp rise before the event followed by a ~25bp drop after the event.



This was a surprisingly large movement and may fit the crowded investment thesis noted in an earlier post. When the earnings call is coming, there is a lot of attention on it, and then that attention dissipates. Our analysis suggests that cumulative returns on average follow the same pattern.

In contrast, for all other forecast events in our system there is essentially no movement before the event and a ~10bp drop after.



These events seem to drive no activity beforehand but are predictive of a slight drop afterwards. Now, this is averaged over 16000 trades and indicates a significant relationship. We will continue to drill down into subsets of events to find additional investment opportunities.

In this analysis, we are building on two earlier blog posts where we looked at forecast events and event studies. Our forecast events occur when an event is reported to occur after the publication date. For example, “The Verizon iPhone will be available to all on February 10th.” was published at mobilemarketingwatch.com on January 11.

We collected about 20,000 of these events for S&P 500 companies over roughly the last two years and found the above patterns by looking at the average market-adjusted returns for these companies in the days leading up to and after reported events. Additionally, we saw volume increases in all three analyses, ranging from 3 standard deviations above normal for the earnings call events to a tenth of a standard deviation above average for the data without earnings calls.
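The averaging step in an event study like this one can be sketched in a few lines of Python. This is an illustration, not our production analysis. It assumes "market-adjusted" means the stock's return minus the market's return on the same day, that all series share one trading calendar, and that events are given as (ticker, day index) pairs. The function name is hypothetical.

```python
from collections import defaultdict

def average_event_window_returns(stock_returns, market_returns, events, window=5):
    """Average market-adjusted return at each day offset around an event.

    stock_returns: {ticker: [daily returns]} on a shared trading calendar.
    market_returns: [daily returns] on the same calendar.
    events: iterable of (ticker, day_index) pairs.
    Returns {offset: mean adjusted return} for offsets -window..+window.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    n_days = len(market_returns)
    for ticker, day in events:
        # Skip events whose window runs off either end of the sample.
        if day - window < 0 or day + window >= n_days:
            continue
        for offset in range(-window, window + 1):
            adjusted = stock_returns[ticker][day + offset] - market_returns[day + offset]
            sums[offset] += adjusted
            counts[offset] += 1
    return {offset: sums[offset] / counts[offset] for offset in sorted(sums)}
```

Running the same function on subsets of events (earnings calls only, everything else) gives the kind of side-by-side comparison discussed in this post.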

Why is there an average drop after the event for non-earnings-related events? Is negative information being withheld at announcement? Is speculative and forecast-related news typically negative? Watch this space for further investigation of data from our news analytics API.

Monday, January 10, 2011

Measuring Crowded Investment Strategies with Online Media

Recently at Recorded Future, we have been experimenting with applying our sentiment scoring methodology to measuring the level of other concepts communicated in web content. Some ideas we have been playing with include deceit, fear, and uncertainty. Outside of emotive language, we have also looked at capturing the level of chatter around a particular technology or business construct.

In particular, we recently developed a score to monitor the level of chatter around the concept of “momentum investing,” an investment style that has been in and out of favor with the media and the market over the years. We then applied this score to our content and plotted the results over time. As a comparison, we look at the performance of the Monetta Fund (MONTX), a mutual fund that follows a momentum investment strategy.



Our theory, before seeing these results, was that we would see a positive correlation between the performance of a momentum investing strategy and discussion about it online. However, as you can see in the chart above, the two metrics are generally inversely correlated for the time period in question. For 2010, the correlation of monthly changes in the metrics was -0.56. As chatter around momentum investing declines, the NAV per share of the fund rises, and vice versa. This finding makes economic sense if you look at the market within an ecological framework. To quote David Merkel of alephblog.com, “Many strategies are competing for scarce returns. Often the best strategy is the one that has few following it, and the worst one is the crowded trade.” Is there a suitable proxy for the “crowded trade” based on online chatter? Stay tuned for more research in this space.
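To make the "correlation of monthly changes" concrete, here is a small Python sketch that differences two aligned monthly series and computes the Pearson correlation of those differences. Using first differences rather than percentage changes is an assumption of this sketch, and the function name is hypothetical.

```python
import math

def correlation_of_changes(series_a, series_b):
    """Pearson correlation of period-over-period changes in two aligned series."""
    # First differences: the change from each period to the next.
    da = [later - earlier for earlier, later in zip(series_a, series_a[1:])]
    db = [later - earlier for earlier, later in zip(series_b, series_b[1:])]
    n = len(da)
    mean_a = sum(da) / n
    mean_b = sum(db) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(da, db))
    var_a = sum((x - mean_a) ** 2 for x in da)
    var_b = sum((y - mean_b) ** 2 for y in db)
    return cov / math.sqrt(var_a * var_b)
```

Correlating changes rather than levels avoids a spurious result when both series simply trend over the year.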

Monday, December 6, 2010

Seconds Away From News Analytics

A little while ago we gave an example of how to get access to our News Analytic content using R. While this was pretty straightforward, we wanted to find an even easier way to use the Recorded Future API. I’ve put together an example spreadsheet that loads requested Recorded Future data live into a Google Spreadsheet and then combines that data with historical finance data from the Google Finance API. The result: a spreadsheet that will populate itself with media analytic data and stock market data straight from the web. Just enter in a list of stock tickers, a date range, and a Recorded Future API token - and within a few seconds you should have plenty of data, ripe for analysis.

What can we do with it? I’ve linked a “motion chart” to the spreadsheet. After switching around the chart type, I’ve set stock price on the y-axis, time on the x-axis, and color-coded my stock prices according to momentum. I see some interesting days of high momentum, particularly for Intel. One of these seems to be focused on August 5, 2010 - the day the FTC won an anti-trust settlement against Intel.


I’ve taken advantage of Google Apps Script to write a script that picks up spreadsheet data, runs queries against our API and Google Finance’s API, and does some processing to merge the results. Data is then put in a separate spreadsheet, “Stock Data”. The motion chart updates when the contents of that sheet change.
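The merge step is conceptually a join on ticker and date. The spreadsheet itself uses Google Apps Script; the Python sketch below only illustrates the idea and is not the script behind the “Run!” button. The field names ("ticker", "date", and whatever else the rows carry) and the function name are hypothetical.

```python
def merge_by_ticker_and_date(media_rows, price_rows):
    """Inner-join two lists of dicts on their ("ticker", "date") keys."""
    # Index the price rows so each media row can be matched in one lookup.
    prices = {(row["ticker"], row["date"]): row for row in price_rows}
    merged = []
    for row in media_rows:
        key = (row["ticker"], row["date"])
        if key in prices:
            # Later keys win, so price fields overwrite same-named media fields.
            merged.append({**row, **prices[key]})
    return merged
```

An inner join drops days that appear in only one source, such as weekends with news flow but no market data. A left join would keep them instead, at the cost of empty price columns.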

If you’d like to put your own token in the spreadsheet, go to File -> Make a Copy, and the sheet will be editable in your Google docs storage area. To see the code that runs when you click the “Run!” button, you can then go to Tools -> Scripts -> Script Editor.

By the way, if this kind of data is interesting to you, we have it in bulk for API customers for the S&P500 and the Russell 3000 - so you don’t have to manually enter hundreds of tickers!