1.
What does the RL in the title stand for? I used the abbreviation because the full term is long, but you will find the answer as you read on. (^^)
In the summer of 2018, Morgan Stanley delivered some interesting news. According to Morgan Stanley Hires Ex-SAC Capital Artificial Intelligence Expert, the firm hired Michael Kearns, a professor at the University of Pennsylvania, as a Senior Advisor on Machine Learning/AI.
Kearns is a computer science professor at the University of Pennsylvania and has years of experience at Steve Cohen’s former hedge fund and other Wall Street firms. He will lead Morgan Stanley’s AI research and offer advice on deploying the technology for projects across the company, the New York-based firm said in a memo to employees Tuesday.
Back in 2014, when HFT was still a major topic of interest, I introduced one of Professor Kearns's papers in a post titled Machine Learning for Market Microstructure and High Frequency Trading; the paper itself bears the same title. As the Bloomberg article above notes, Professor Kearns's specialty is Reinforcement Learning.
Kearns’s expertise includes the branch of AI called reinforcement learning that can be used to improve trading execution and reduce associated costs. While standard machine learning models make predictions on prices, they don’t specify the best time to act, the optimal size of a trade or its impact on the market.
“With reinforcement learning, you are learning to make predictions that account for what effects your actions have on the state of the market,” Kearns said in an interview in early June.
The paper Professor Kearns wrote on this topic is Reinforcement learning for optimized trade execution. It was published in 2006, thirteen years ago.
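To make the paper's formulation concrete, here is a minimal tabular Q-learning sketch in the spirit of that work: liquidate an inventory within a fixed horizon, with state = (time remaining, inventory remaining) and actions = limit-price offsets from the mid. The discretization, the reward, and the `sim.execute` simulator API are my own illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

H, I, A = 8, 8, 5                  # time steps, inventory levels, price offsets
Q = np.zeros((H + 1, I + 1, A))    # tabular action-value function
alpha, gamma, eps = 0.1, 1.0, 0.1  # learning rate, discount, exploration rate

def train_episode(sim, arrival_price):
    """One episode: sell I units within H steps against a market simulator.
    `sim.execute(offset, qty)` is a hypothetical API returning
    (units_filled, cash_received) for a limit order `offset` ticks from the mid."""
    t, i = H, I
    while t > 0 and i > 0:
        # epsilon-greedy choice of price offset
        a = np.random.randint(A) if np.random.rand() < eps else int(Q[t, i].argmax())
        filled, cash = sim.execute(offset=a, qty=i)
        i2 = max(i - filled, 0)
        # reward: cash received minus what the filled units were worth at arrival
        # (i.e., per-step implementation shortfall)
        r = cash - arrival_price * filled
        Q[t, i, a] += alpha * (r + gamma * Q[t - 1, i2].max() - Q[t, i, a])
        t, i = t - 1, i2
```

In the paper's setting, any inventory left at the horizon is forcibly liquidated; a production version would add that terminal penalty and richer market-state variables to the state.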
Now to JPMorgan. JPMorgan's new guide to machine learning in algorithmic trading mentions two reports. The first is J.P.Morgan's massive guide to machine learning and big data jobs in finance, which I covered in an earlier post on JP Morgan's machine learning and big data guide. The second, and the subject of today's post, is Idiosyncrasies and challenges of data driven learning in electronic trading.
Here is an excerpt from the paper. The authors say they are already running a second-generation RL-based order placement engine in production. The RL system JP Morgan describes is presumably LOXM, which I briefly introduced in the post JPMorgan and Deep Reinforcement Learning.
3.4 A nano-description of our approach
We are now running the second generation of our RL-based limit order placement engine. We successfully train a policy with a bounded action space. To tackle the issues we have just described we use hierarchical learning and multi-agent training which leverage the domain knowledge. We train local policies (e.g. how to place aggressive orders vs how to place a passive order) on local short term objectives which differ in their rewards, step and time horizon characteristics. These local policies are then combined, and longer term policies then learn how to combine the local policies. We also believe that inverse reinforcement learning is very promising: leveraging the massive history of rollouts of human and algo policies on financial markets in order to build local rewards is an active field of research.
The paper goes on to describe how the authors applied gradient-based training, hyper-parameter optimization techniques, policy learning algorithms, and hierarchical reinforcement learning to address the problems that arose while operating this system, and explains the usefulness of Certainty Equivalent Reinforcement Learning (CERL). Frankly, for someone like me without formal AI training, this material takes a long time to digest.
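As a reader's aid, here is an illustrative sketch (in no way JP Morgan's actual code) of the hierarchical idea in the quoted passage: local policies for passive and aggressive order placement, combined by a longer-term meta-policy that decides which one to invoke. In the paper's setting that choice would itself be learned; here it is a hand-coded stand-in.

```python
from typing import Callable, Dict

Policy = Callable[[dict], dict]   # maps an observation to an order decision

def passive_policy(obs: dict) -> dict:
    """Local policy: rest at the best bid and wait to be filled."""
    return {"price": obs["best_bid"], "qty": obs["child_qty"]}

def aggressive_policy(obs: dict) -> dict:
    """Local policy: cross the spread to fill immediately."""
    return {"price": obs["best_ask"], "qty": obs["child_qty"]}

LOCAL_POLICIES: Dict[str, Policy] = {
    "passive": passive_policy,
    "aggressive": aggressive_policy,
}

def meta_policy(obs: dict) -> str:
    """Longer-term policy: choose which local policy to run this step.
    A hand-coded stand-in for what would be a learned policy."""
    behind_schedule = obs["filled_frac"] < obs["time_frac"]
    return "aggressive" if behind_schedule else "passive"

def act(obs: dict) -> dict:
    return LOCAL_POLICIES[meta_policy(obs)](obs)
```

The appeal of this decomposition is that each local policy can be trained on its own short-term reward, step size, and horizon, exactly as the excerpt describes.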
2.
When AI comes up in capital markets, Reinforcement Learning seems to attract the most attention. The introduction to Reinforcement learning in financial markets – a survey explains why: RL is a methodology that can combine the prediction and portfolio management tasks that finance demands.
The advent of reinforcement learning (RL) in financial markets is driven by several advantages inherent to this field of artificial intelligence. In particular, RL allows to combine the "prediction" and the "portfolio construction" task in one integrated step, thereby closely aligning the machine learning problem with the objectives of the investor. At the same time, important constraints, such as transaction costs, market liquidity, and the investor's degree of risk-aversion, can be conveniently taken into account. Over the past two decades, and albeit most attention still being devoted to supervised learning methods, the RL research community has made considerable advances in the finance domain.
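One way to see the survey's "one integrated step" point is that the agent's one-step reward can directly encode the investor's objective and constraints, so prediction and portfolio construction never need to be separated. A minimal sketch, where the functional form and parameter values are illustrative assumptions of mine:

```python
import numpy as np

def reward(w_prev: np.ndarray, w_new: np.ndarray, returns: np.ndarray,
           cost_rate: float = 0.001, risk_aversion: float = 0.1) -> float:
    """One-step reward for a portfolio RL agent.

    w_prev, w_new : previous and newly chosen portfolio weights
    returns       : realized per-asset returns over the step
    """
    pnl = float(w_new @ returns)                      # allocation PnL (prediction implicit)
    costs = cost_rate * np.abs(w_new - w_prev).sum()  # transaction costs on turnover
    risk = risk_aversion * pnl ** 2                   # crude risk-aversion penalty
    return pnl - costs - risk
```

Because costs and risk aversion sit inside the reward, the learned policy trades them off against expected return automatically, which is the alignment with investor objectives the survey highlights.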
The following two articles give a good summary of the fields where RL is being used:
Reinforcement Learning: The Business Use Case, Part 1
Reinforcement Learning: The Business Use Case, Part 2
If you want to get the basic concepts straight before applying machine learning such as RL to trading, Introduction to Learning to Trade with Reinforcement Learning is worth reading. It is short and simple, which is both its strength and its weakness. If you want to study the subject in depth, Deep Reinforcement Learning came out in 2018; its abstract follows, and after it I sketch how the basic pieces map onto trading.
We discuss deep reinforcement learning in an overview style. We draw a big picture, filled with details. We discuss six core elements, six important mechanisms, and twelve applications, focusing on contemporary work, and in historical contexts. We start with background of artificial intelligence, machine learning, deep learning, and reinforcement learning (RL), with resources. Next we discuss RL core elements, including value function, policy, reward, model, exploration vs. exploitation, and representation. Then we discuss important mechanisms for RL, including attention and memory, unsupervised learning, hierarchical RL, multi-agent RL, relational RL, and learning to learn. After that, we discuss RL applications, including games, robotics, natural language processing (NLP), computer vision, finance, business management, healthcare, education, energy, transportation, computer systems, and, science, engineering, and art. Finally we summarize briefly, discuss challenges and opportunities, and close with an epilogue.
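As promised above, here is a sketch of how RL's core pieces (observation, action, reward) map onto a trading problem, along the lines of the introductory post: observe recent returns plus the current position, choose a target position, get paid the position's PnL minus a cost for changing it. Everything here is a toy assumption for exposition only.

```python
import numpy as np

class ToyTradingEnv:
    """Minimal, illustrative trading environment with a gym-style interface."""

    def __init__(self, prices: np.ndarray, window: int = 10, cost: float = 1e-3):
        self.r = np.diff(np.log(prices))   # log returns
        self.window, self.cost = window, cost

    def reset(self):
        self.t, self.pos = self.window, 0
        return self._obs()

    def _obs(self):
        # observation: last `window` returns plus the current position
        return np.append(self.r[self.t - self.window:self.t], self.pos)

    def step(self, action: int):           # action: target position in {-1, 0, +1}
        # reward: PnL of the held position minus a cost on position changes
        reward = self.pos * self.r[self.t] - self.cost * abs(action - self.pos)
        self.pos = action
        self.t += 1
        done = self.t >= len(self.r)
        return (self._obs() if not done else None), reward, done, {}
```

Any standard RL algorithm (DQN over the three discrete actions, or PPO) can then be trained against this loop; the hard part in practice is the realism of the environment, not the agent.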
In addition, here are some papers on RL published in 2018. A search turns up an enormous amount of material; the papers below caught my eye because of their titles, so they are not necessarily representative RL papers. Just as papers on high-frequency trading strategies and market microstructure flooded in when HFT was in fashion, papers on RL are truly overflowing now. For reference:
Financial Trading as a Game:A Deep Reinforcement Learning Approach
Optimized Trade Execution with Reinforcement Learning
Market Making via Reinforcement Learning
Reinforcement Learning for High-Frequency Market Making
Finally, the RL frameworks used by the JP Morgan authors above are OpenAI baselines, dopamine, deepmind/trfl, and Ray RLlib.
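For orientation, here is roughly what a training loop looks like in Ray RLlib, one of the frameworks cited; the paper does not publish its training code, so this is only a generic sketch. It targets the Ray 1.x-era API (`ray.rllib.agents`); newer releases moved to `ray.rllib.algorithms`, so treat the exact imports as assumptions, and the CartPole environment is a placeholder for a real trading environment.

```python
import ray
from ray.rllib.agents.ppo import PPOTrainer

ray.init()
# PPO on a placeholder environment; a trading application would register
# its own environment class and pass it via the `env` argument.
trainer = PPOTrainer(env="CartPole-v1", config={"num_workers": 2})
for _ in range(5):
    result = trainer.train()                # one PPO training iteration
    print(result["episode_reward_mean"])    # mean episode return so far
```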
http://www.datanet.co.kr/news/articleView.html?idxno=130064
Qraft Technologies, a Korean deep-learning startup, built a DRL-based execution system and won a competition against professional dealers from securities firms' dealing rooms, with 100 million won in prize money at stake. Since they use PPO, the architecture seems more advanced than JP Morgan's LOXM, which uses DQN.
Thank you.
Looking around, I also found an article comparing LOXM and AXE:
https://www.linkedin.com/pulse/loxm-axe-hyungsik-kim/