{ "metadata": { "name": "" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "code", "collapsed": false, "input": [ "%autosave 10" ], "language": "python", "metadata": {}, "outputs": [ { "javascript": [ "IPython.notebook.set_autosave_interval(10000)" ], "metadata": {}, "output_type": "display_data" }, { "output_type": "stream", "stream": "stdout", "text": [ "Autosaving every 10 seconds\n" ] } ], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Source material\n", "\n", "- [http://www.hilpisch.com/YH_PyData_Eurex_Tutorial.html](http://www.hilpisch.com/YH_PyData_Eurex_Tutorial.html)\n", "- [http://www.hilpisch.com/YH_PyData_Eurex_Tutorial.ipynb](http://www.hilpisch.com/YH_PyData_Eurex_Tutorial.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Background\n", "\n", "- Data munging is not the only primary problem, but it is a big one.\n", " - Sources, formats, cleaning missing data.\n", "- Performance is a problem too.\n", "- Organisational problems\n", " - Teams are silos. People who need answers can't ask questions. People who can give answers can't express them.\n", "- Continuum Analytics vision\n", " - Simple, interactive, collaborative, but still scalable performance" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Notes\n", "\n", "- pandas offers access to free financial sources, but beware! Not clean, not reliable, but good for playing around.\n", "- *Volatility clustering*: if you plot the log difference of closing prices (log returns), you notice that volatility clusters; it isn't randomly distributed.\n", "- Investors want volatility; it offers chances for short-term trading profits.\n", "- Do you have to shift Returns by 1 (back 1) before multiplying? No.\n", "- This model doesn't take portfolio rebalancing (?) into account.\n", " - Rebalancing means restoring e.g. 
70% in X, 30% in Y balance of your portfolio\n", "- Should use discounting rather than a simple sum of Earnings.\n", "- Err on the side of readability. Don't put too many operations, particularly in pandas, onto one line.\n", " - Unless performance, when measured, is an issue.\n", "- VSTOXX vs EUROSTOXX\n", " - EUROSTOXX is mean reverting, standard theory of stocks applies.\n", " - VSTOXX is kind of like an interest rate. It is quoted in percentage points and aggregates the implied volatility of puts and calls.\n", "- Log returns help compare two different time series in a mathematical way. This seems to be a common pattern.\n", "- Good link: http://scipy-lectures.github.io/advanced/mathematical_optimization/\n", "\n", "### High frequency trading data\n", "\n", "- High frequency data is not well covered by textbooks; even the data sizes alone change the game.\n", "- Worse, heterogeneous time intervals! Tick data comes when it comes, not at fixed intervals.\n", "- !!AI Can you use numexpr to df.apply(...) some optimized function?\n", "\n", "## Why Python?\n", "\n", "- Nothing compares to Python's sheer breadth.\n", " - What, in Ruby, comes close to NumPy, SciPy, and pandas?\n", "- R?\n", " - Systems development, actual production code, web development, ..., Python can do it.\n", "- Performance?\n", " - Python has overcome this stigma.\n", " - Python is less a glue between system components or libraries, and more a glue between high performance methods.\n", " - LLVM, multi-core, GPUs, clusters." ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] } ], "metadata": {} } ] }