Hands-On Machine Learning for Algorithmic Trading

1.
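The book's full title is 'Hands-On Machine Learning for Algorithmic Trading: Design and implement investment strategies based on smart algorithms that learn from data using Python'. I came across it briefly on Google Books while searching for material on 'Policy Gradient' in reinforcement learning, and was curious enough to look into it further. The author is Stefan Jansen, the founder of Applied AI and a data scientist. When I looked into why he wrote the book and how it differs from others, I found that he explains this in detail on GitHub. First, his reason for writing the book: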

This book aims to equip you with the strategic perspective, conceptual understanding, and practical tools to add value from applying ML to the trading and investment process. To this end, it covers ML as an important element in a process rather than a standalone exercise.

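Two words stand out to me. The first is "standalone": the point is that understanding ML for its own sake is not what matters. The second is "process": ML can play a large part in algorithmic trading, but it is not the whole of it. The idea is to treat ML as one step in a process that runs from the strategic idea through to execution. This intent shows directly in the table of contents. Below is the outline of the first five chapters. They cover what anyone who has traded quantitatively, algorithmically or otherwise, needs to know and think through.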

Chapter 01: Machine Learning for Trading

This chapter summarizes how and why ML became central to investment, describes the trading process and outlines how ML can add value. It covers:

  • How to read this book
  • The rise of ML in the Investment Industry
  • Design and execution of a trading strategy
  • ML and algorithmic trading strategies: use cases

Chapter 02: Market & Fundamental Data

This chapter introduces market and fundamental data sources and the environment in which they are created. Familiarity with various types of orders and the trading infrastructure matters because they affect backtest simulations of a trading strategy. We also illustrate how to use Python to access and work with trading and financial statement data.

In particular, this chapter will cover the following topics:

  • How market microstructure shapes market data
  • How to reconstruct the order book from tick data using Nasdaq ITCH
  • How to summarize tick data using various time, volume and dollar bars
  • How to work with eXtensible Business Reporting Language (XBRL)-encoded electronic filings
  • How to parse and combine market and fundamental data to create a P/E series
  • How to access various market and fundamental data sources using Python
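
To give a feel for the "volume bars" item above, here is a minimal sketch of my own, not code from the book. It assumes a tick table with hypothetical `price` and `volume` columns and groups ticks so that each bar holds roughly a fixed number of traded shares.

```python
import pandas as pd


def volume_bars(ticks: pd.DataFrame, bar_size: int) -> pd.DataFrame:
    """Aggregate ticks into bars of roughly `bar_size` traded shares each.

    Assumes `ticks` is in time order with `price` and `volume` columns
    (hypothetical column names chosen for this illustration).
    """
    # Assign each tick to a bar using the cumulative volume traded before it.
    cum_before = ticks["volume"].cumsum() - ticks["volume"]
    bar_id = (cum_before // bar_size).astype(int)

    grouped = ticks.groupby(bar_id)
    bars = pd.DataFrame({
        "open": grouped["price"].first(),
        "high": grouped["price"].max(),
        "low": grouped["price"].min(),
        "close": grouped["price"].last(),
        "volume": grouped["volume"].sum(),
    })
    bars.index.name = "bar"
    return bars


# Toy tick data, invented for the example.
ticks = pd.DataFrame({
    "price": [10.0, 10.1, 10.2, 10.1, 10.3, 10.4],
    "volume": [100, 200, 300, 100, 200, 100],
})
bars = volume_bars(ticks, bar_size=300)
print(bars)
```

A dollar bar would follow the same pattern with `price * volume` in place of `volume`. A single large trade is not split across bars here, so bar sizes are only approximately equal.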

Chapter 03: Alternative Data for Finance

This chapter outlines categories and describes criteria to assess the exploding number of alternative data sources and providers. It also demonstrates how to create alternative data sets by scraping websites, for example to collect earnings call transcripts for use with natural language processing (NLP) and sentiment analysis algorithms in the second part of the book. More specifically, this chapter covers:

  • How the alternative data revolution has unleashed new sources of information
  • How individuals, business processes, and sensors generate alternative data
  • How to evaluate the proliferating supply of alternative data used for algorithmic trading
  • How to work with alternative data in Python, such as by scraping the internet
  • Important categories and providers of alternative data

Chapter 04: Research & Evaluation of Alpha Factors

Chapter 4 provides a framework for understanding how factors work and how to measure their performance, for example using the information coefficient (IC). It demonstrates how to engineer alpha factors from data using Python libraries offline and on the Quantopian platform. It also introduces the zipline library to backtest factors and the alphalens library to evaluate their predictive power. More specifically, this chapter covers:

  • How to characterize, justify and measure key types of alpha factors
  • How to create alpha factors using financial feature engineering
  • How to use zipline offline to test individual alpha factors
  • How to use zipline on Quantopian to combine alpha factors and identify more sophisticated signals
  • How the information coefficient (IC) measures an alpha factor’s predictive performance
  • How to use alphalens to evaluate predictive performance and turnover
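
As a rough illustration of the information coefficient mentioned above, the sketch below computes it as the Spearman rank correlation between factor values and subsequent returns for one cross-section of assets. This is my own minimal example with invented numbers and a hypothetical function name; it does not use alphalens itself.

```python
import pandas as pd


def information_coefficient(factor: pd.Series, forward_returns: pd.Series) -> float:
    """Spearman rank correlation between factor values and forward returns."""
    # Align the two series on their index and drop assets missing either value.
    aligned = pd.concat(
        [factor, forward_returns], axis=1, keys=["factor", "ret"]
    ).dropna()
    # Pearson correlation of the ranks is the Spearman rank correlation.
    return aligned["factor"].rank().corr(aligned["ret"].rank())


# Toy cross-section: factor scores today, returns over the next period.
factor = pd.Series({"A": 0.9, "B": 0.1, "C": 0.5, "D": -0.3})
forward_returns = pd.Series({"A": 0.04, "B": 0.03, "C": 0.02, "D": -0.02})

ic = information_coefficient(factor, forward_returns)
print(round(ic, 2))
```

An IC near 1 means the factor ranked the assets almost exactly as their later returns did, an IC near 0 means no rank relationship, and in practice one would average the IC over many dates rather than read a single cross-section.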

Chapter 05: Strategy Evaluation & Portfolio Management

Testing a strategy requires simulating the portfolios generated by an algorithm to verify its performance under market conditions. Strategy evaluation includes backtesting against historical data to optimize the strategy’s parameters, and forward-testing to validate the in-sample performance against new, out-of-sample data and avoid false discoveries from tailoring a strategy to specific past circumstances. This chapter introduces several approaches to optimizing portfolios that include the application of machine learning (ML) to learn hierarchical relationships among assets.

More specifically, in this chapter, we cover:

  • How to build and test a portfolio based on alpha factors using zipline
  • How to measure portfolio risk and return
  • How to evaluate portfolio performance using pyfolio
  • How to manage portfolio weights using mean-variance optimization and alternatives
  • How to use machine learning to optimize asset allocation in a portfolio context
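
For the mean-variance item above, a very small sketch of the textbook idea may help. It is my own illustration with invented inputs, not the book's implementation: weights proportional to the inverse covariance matrix times expected returns, rescaled to sum to one. It ignores the risk-free rate, short-sale limits, and estimation error.

```python
import numpy as np


def mean_variance_weights(mu: np.ndarray, cov: np.ndarray) -> np.ndarray:
    """Unconstrained mean-variance weights, normalized to sum to 1.

    Assumes `cov` is invertible and the raw weights do not sum to zero.
    """
    # Solve cov @ raw = mu instead of forming the inverse explicitly.
    raw = np.linalg.solve(cov, mu)
    return raw / raw.sum()


# Two uncorrelated assets: expected returns and covariance matrix (toy numbers).
mu = np.array([0.08, 0.05])
cov = np.array([[0.04, 0.00],
                [0.00, 0.01]])

weights = mean_variance_weights(mu, cov)
print(weights)
```

With these inputs the lower-return asset still receives the larger weight because its variance is four times smaller, which is the trade-off mean-variance optimization is meant to capture.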

2.
Stefan Jansen has published the materials and code referenced in the book on GitHub.

machine-learning-for-trading

In addition, the topics listed below are provided separately as PDFs.

  • Chapter 16: Deep Learning
  • Chapter 17: Convolutional Neural Networks
  • Chapter 18: Recurrent Neural Networks
  • Chapter 19: Autoencoders & GANs
  • Chapter 20: Reinforcement Learning

I have attached the PDFs for the topics I am interested in.

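To buy the book, see Hands-On Machine Learning for Algorithmic Trading. It would also be good if someone translated it. There are many training courses on algorithmic trading, and I think a curriculum built around this book's table of contents would make an excellent one.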

    2 Comments

    1. 이명환

      While looking into this book, I found a similar book by the same author. Is the one marked "2nd" the next edition?
      machine learning for algorithmic trading – second edition

      1. smallake (Post author)

        https://github.com/stefan-jansen/machine-learning-for-trading
        Judging from that repository, yes, it is the next edition.
