Being open (source) in the traditionally secretive field of quant finance.


The field of quantitative finance is, as a rule, intensely competitive and maniacally secretive. The tendency toward secrecy is perhaps unsurprising given that the smallest of competitive advantages can translate to substantial profits. Indeed, over the past decade a growing list of legal prosecutions for alleged code theft or misuse has underscored how high the stakes can be for developers looking to leverage and contribute to open source projects. Notable exceptions to this approach include Wes McKinney and Travis Oliphant, whose open source projects, pandas and NumPy, have gained widespread adoption. In this talk we will review some of the costs and benefits of engaging with open source as a “two way street” and frame the modern quant workflow as a mosaic of open sourced, third party, and proprietary components.

Published in: Software

  1. A beginner’s guide to being open (source) in the traditionally secretive field of quantitative finance. PyData NYC, October 2018. Jess Stauth, PhD, Portfolio Management and Research, Quantopian
  2. Disclaimer: Quantopian provides this presentation to help people write trading algorithms - it is not intended to provide investment advice. More specifically, the material is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory or other services by Quantopian. In addition, the content neither constitutes investment advice nor offers any opinion with respect to the suitability of any security or any specific investment. Quantopian makes no guarantees as to accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.
  3. Quants and open source – let’s start with a scary story.
  4. Excerpted from: “…A one-way relationship with open source.”
  5. What followed was a decade of legal battles…
     • Aleynikov was convicted in 2010 of violating the Economic Espionage Act and the Interstate Transportation of Stolen Property Act.
     • In 2012, the U.S. Court of Appeals in New York reversed the conviction.
     • Aleynikov was rearrested in August 2012 on state charges.
     • In 2015, he was convicted of one count of unlawfully using secret scientific material, but the judge threw out the verdict and acquitted him.
     • The district attorney’s office appealed and the conviction was reinstated by an intermediate appellate court.
     • In May 2018, the appeals court upheld the conviction, but no additional jail time was sought.
     All told, Aleynikov served one year in prison.
  6. Scared yet?
  7. Can open source be a two-way street, even in traditionally secretive quant finance land? Short answer: YES! Longer answer: Yes, but…
     • Effort and some expertise are required to separate source code that is truly proprietary from that which is effectively commoditized.
     • Some companies just don’t see the benefits as large enough to outweigh the costs and risks.
     • The first real example of this in the quant finance space was, of course, Wes’ open sourcing of pandas while at AQR, built on NumPy (Travis’ work)!
  8. Costs and risks – both real and perceived
     • Costs:
       • Engaging users on mailing lists
       • Reviewing pull requests
       • Making proper releases
       • Documentation
       • Navigating interdependencies between open and closed source
     • Risks:
       • May fail to get engagement if response times are slow
       • IP leakage, if you misjudge the line between commodity and proprietary code
       • Perception of IP leakage: non-technical stakeholders need to trust that what is open sourced is not a ‘trade secret’
       • Embarrassment: “What if everyone thinks my code is crap!?!”
  9. Why it’s worth it to be open, and not just with code.
     • Transparency builds trusting relationships that extend far beyond the walls of your company headquarters.
     • Robustness scales with use.
     • Linus’ Law: given enough eyes, all bugs are shallow.
     • Talent is globally distributed.
     • Some of our best hires have come from the OSS community.
     • Our asset management business model relies on contributions from the community in the form of (closed source) algorithms.
     • Life comes at you fast (h/t Thomas Wiecki’s open source talk)
  10. OK, so it’s worth it, but there remain costs and risks – so how does this all come together?
  11. The modern quant finance workflow – a mosaic of open and closed source components
     • Research
       • Ingest both publicly available data, such as company financial statements, and proprietary data sources
       • Open source tools (e.g. Jupyter, pandas, NumPy, SciPy, Alphalens, Qgrid, Qdb)
       • Product is proprietary signals/factors
     • Back testing
       • Open source event simulation tools (e.g. Zipline)
       • May exploit vendor models/tools for cost modeling
       • Product is a proprietary alpha model (or trading algo)
     • Production
       • Vendor tools for risk management and portfolio construction (e.g. Axioma, Barra, Northfield), or Quantopian’s free Risk Model
       • Product is a *very* proprietary trade list and possibly an execution strategy
  12. Same mosaic, fewer words… [Diagram with stages: Research, Back testing, Production, Risk Model]
     • Individual quant developer / researcher: proprietary signals, factors, insights from data; proprietary logic combining individual factors into a predictive model or algo; proprietary execution strategy if HFT, else rebalance frequency choice.
     • 3rd party vendors: Factset, Estimize, FRED, ITG Market Impact Model, Axioma, Northfield, Barra, Executing Brokers
  13. Some (more) open source projects we maintain
     • qdb - A debugger for Python that allows users to debug code executing on a remote machine.
     • empyrical - A Python library for computing common financial risk and performance metrics. Used by zipline and pyfolio.
     • warp_prism
     • pgcontents
     • DockORM
     • coal-mine
     • PenguinDome
     • trading-calendars
     • serializable-traitlets
  14. We have committers on, or have made significant contributions to:
     • jupyter
     • PyMC3
     • blaze/odo/datashape
     Smaller contributions to:
     • CPython
     • Airflow
     • pandas
  15. Acknowledgement to our amazing OSS contributors:
     • Q Current: Jean Bredeche, Andrew Daniels, John Fawcett, Rich Frank, Kathryn Glowinski, Gus Gordon, Eddie Hebert, George Ho, Joe Jevnik, Jonathan Kamens, Abhijeet Kalyan, Samantha Klonaris, Max Margenot, David Michalowicsz, Vikram Narayan, Jacob Nazarenko, Ernesto Perez, Scott Sanderson, Adrian Seybolt, Tim Shawver, Paul Sutherland, Freddie Vargus, Thomas Wiecki, Rene Zhang
     • Q Alums: Andrew Campbell, James Christopher, Stewart Douglas, James Kirk, Andrew Liang, Ana Ruelas, Maya Tydykov
     … and counting!
  16. Contribute $ as well as code! new-numfocus-corporate-partner
  17. Thank you and Happy Halloween! Questions? Email: jess at quantopian dot com @jstauth
  18. Zipline is a Python library for algorithmic backtesting and trading. Zipline is the backtesting and live-trading engine powering Quantopian – a free, community-centered, hosted platform for researching quantitative trading strategies. The best strategies are eligible for performance-based royalty licenses.
  19. Pyfolio is a Python library for performance and risk analysis of financial portfolios. It works well with the Zipline open source backtesting library. At the core of pyfolio is a tearsheet consisting of individual plots that provide a comprehensive overview of the performance of a trading algorithm.
  20. Alphalens is a Python library for performance analysis of predictive (alpha) stock factors. Alphalens is compatible with the Zipline open source backtesting library, and Pyfolio which provides performance and risk analysis of financial portfolios.
  21. qgrid - An interactive grid for sorting, filtering, and editing DataFrames in Jupyter notebooks.
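
The slides describe empyrical and pyfolio as libraries for "common financial risk and performance metrics". As a rough illustration of what such metrics look like, here is a minimal standard-library sketch of two of them, an annualized Sharpe ratio and a maximum drawdown. The function names, the sample returns, and the 252-trading-day annualization factor are assumptions made for this example; this is not the API or implementation of empyrical, pyfolio, or any other library named above.

```python
import math
import statistics

# Assumed annualization factor for daily data (a common convention, not from the slides).
TRADING_DAYS_PER_YEAR = 252


def annualized_sharpe(daily_returns, daily_risk_free=0.0):
    """Annualized Sharpe ratio from simple daily returns (hypothetical helper)."""
    excess = [r - daily_risk_free for r in daily_returns]
    volatility = statistics.stdev(excess)  # sample standard deviation
    if volatility == 0:
        return float("nan")  # ratio is undefined when returns never vary
    return statistics.mean(excess) / volatility * math.sqrt(TRADING_DAYS_PER_YEAR)


def max_drawdown(daily_returns):
    """Largest peak-to-trough decline of the cumulative return curve (always <= 0)."""
    wealth = 1.0  # value of one unit invested at the start
    peak = 1.0    # highest value seen so far
    worst = 0.0   # most negative drawdown seen so far
    for r in daily_returns:
        wealth *= 1.0 + r
        peak = max(peak, wealth)
        worst = min(worst, wealth / peak - 1.0)
    return worst


if __name__ == "__main__":
    # Made-up daily returns, for illustration only.
    sample_returns = [0.01, -0.02, 0.015, -0.005, 0.02]
    print("Sharpe:", annualized_sharpe(sample_returns))
    print("Max drawdown:", max_drawdown(sample_returns))
```

In the workflow mosaic from the talk, metrics like these sit on the commoditized, open side of the line; the signals that generate the returns are the proprietary part.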