Data Analysis 08
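Contents
Financial Data Analysis Examples
  1. Moving Average
  3. Bollinger Bands
    1) What are Bollinger Bands
    2) Bollinger Band trading signals and their meaning
    3) Plotting Bollinger Bands
A Stock Backtesting Model Based on Function Vectorization
Code Summary
  Common Quantitative Trading Indicators
    Moving Average
    Bollinger Bands
  SPD Bank (Shanghai Pudong Development Bank) Quantitative Backtesting Model
    Loading the pfyh file
    Designing the investment strategy
    Backtesting model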
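Financial Data Analysis Examples
1. Moving Average
A moving average (MA) is a technical indicator built with simple statistics: the prices of a security (or an index) over a fixed period are averaged, and the averages for successive dates are joined into a line that is used to observe the trend of the price. The moving average was proposed in the mid-20th century by the well-known American investment expert Joseph E. Granville. Moving-average theory is one of the most widely used technical indicators today. It helps traders confirm the current trend, anticipate an emerging trend, and spot an overextended trend that is about to reverse.
Commonly used moving averages cover 5, 10, 30, 60, 120, and 240 days. The 5-day and 10-day lines are short-term moving averages, used as a reference for short-term trading, and are called daily moving-average indicators. The 30-day and 60-day lines are medium-term indicators, called quarterly moving-average indicators. The 120-day and 240-day lines are long-term indicators, called yearly moving-average indicators.
5-day moving average of the closing price: starting from the fifth day, compute each day the mean of the closing prices of the most recent five days; the resulting values form the line.
Moving average algorithm: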
Price series of one stock: a b c d e f g h i j ...
(a+b+c+d+e)/5
(b+c+d+e+f)/5
(c+d+e+f+g)/5
...
(f+g+h+i+j)/5
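The sliding-window arithmetic above can be sketched with NumPy. This is only a minimal illustration and not the code used later in this post: the helper name `moving_average` and the ten sample prices are made up for the example. It relies on `np.convolve` in "valid" mode, which produces one value per position where the window fits completely.

```python
import numpy as np


def moving_average(prices, window=5):
    """Simple moving average of every full window (hypothetical helper)."""
    prices = np.asarray(prices, dtype=float)
    # Equal weights that sum to 1; "valid" keeps only full windows, so the
    # result has len(prices) - window + 1 elements.
    weights = np.ones(window) / window
    return np.convolve(prices, weights, mode="valid")


# Ten made-up prices standing in for a, b, c, ..., j.
prices = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ma_5 = moving_average(prices)
print(ma_5)  # one average per full 5-day window, six values in total
```

Because the weights are symmetric, it does not matter that convolution reverses them; with unequal weights (for example a weighted moving average) the order would have to be checked.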
Drawing the 5-day moving average on the K-line chart:
import datetime as dt

import numpy as np
import matplotlib.pyplot as mp
import matplotlib.dates as md


def dmy2ymd(dmy):
    # Convert a day-month-year string (e.g. 28-01-2011) to year-month-day.
    # Older NumPy versions pass bytes to converters, newer ones pass str.
    if isinstance(dmy, bytes):
        dmy = dmy.decode("utf-8")
    dat = dt.datetime.strptime(dmy, "%d-%m-%Y").date()
    tm = dat.strftime("%Y-%m-%d")
    return tm


# Load the date, open, high, low and close columns.
dates, open_prices, highest_prices, lowest_prices, close_prices = \
    np.loadtxt("../da_data/aapl.csv",
               delimiter=",",
               usecols=(1, 3, 4, 5, 6),
               unpack=True,
               dtype="M8[D], f8, f8, f8, f8",
               converters={1: dmy2ymd})

# Set up the figure and the date axis.
mp.figure("AAPL K-Line", facecolor="lightgray")
mp.title("AAPL K-Line")
mp.xlabel("Day", fontsize=12)
mp.ylabel("Price", fontsize=12)
ax = mp.gca()
ax.xaxis.set_major_locator(md.WeekdayLocator(byweekday=md.MO))
ax.xaxis.set_major_formatter(md.DateFormatter("%d %b %Y"))
ax.xaxis.set_minor_locator(md.DayLocator())
mp.tick_params(labelsize=8)
dates = dates.astype(dt.datetime)
mp.plot(dates, open_prices, color="dodgerblue", linestyle="--")
mp.gcf().autofmt_xdate()

# 5-day moving average of the closing price: one value per full window.
ma_5 = np.zeros(close_prices.size - 4)
for i in range(ma_5.size):
    ma_5[i] = close_prices[i: i + 5].mean()
# The first average belongs to the fifth day, hence dates[4:].
mp.plot(dates[4:], ma_5, color="orangered", label="MA_5")
mp.legend()
mp.show()
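Output: a chart showing the dashed opening-price line with the 5-day moving average (MA_5) drawn over it.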
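3. Bollinger Bands
1) What are Bollinger Bands
Bollinger Bands are a very practical technical indicator designed by the American stock-market analyst John Bollinger on the basis of the statistical concept of standard deviation. They consist of three lines:
Middle band: the moving average (the white line in the figure)
Upper band: middle band + 2 x the standard deviation of the 5-day closing prices (the yellow line in the figure; resistance at the top)
Lower band: middle band - 2 x the standard deviation of the 5-day closing prices (the purple line in the figure; support at the bottom)
Narrowing bands indicate a stable trend; widening bands indicate a trend with room for larger swings.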
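To make the three definitions concrete, here is a small sketch that computes the bands for a short series. The helper name `bollinger_bands`, its parameters, and the sample prices are assumptions for illustration only; the window of 5 and the multiplier of 2 follow the definitions above, and the standard deviation is the population form (`ddof=0`), which is what `ndarray.std()` computes by default.

```python
import numpy as np


def bollinger_bands(close, window=5, k=2.0):
    """Return (middle, upper, lower) for every full window (hypothetical helper)."""
    close = np.asarray(close, dtype=float)
    n = close.size - window + 1
    middle = np.empty(n)
    std = np.empty(n)
    for i in range(n):
        win = close[i:i + window]
        middle[i] = win.mean()
        std[i] = win.std()  # population standard deviation (ddof=0)
    return middle, middle + k * std, middle - k * std


# Made-up prices: flat at first, then moving.
close = [10.0, 10.0, 10.0, 10.0, 10.0, 12.0, 8.0]
middle, upper, lower = bollinger_bands(close)
print(upper - lower)  # band width: zero while prices are flat, wider once they move
```

The band width is 4 times the window's standard deviation, which is why a quiet market gives narrow bands and a volatile one gives wide bands.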
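2) Bollinger Band trading signals and their meaning
Comparing the share price with the upper and lower bands, together with the direction of the trend, helps judge when to buy and when to sell a stock.
(1) When the price crosses the lower band (LOWER) from below, it can be treated as a buy signal.
(2) When the price crosses the middle band from below, the price is expected to rise faster; this is a signal to add to the position.
(3) When the price fluctuates between the middle band and the upper band (UPPER), the market is bullish; hold the shares and wait.
(4) After the price has run between the middle band and the upper band (UPPER) for a long time, a fall through the middle band from above is a sell signal.
(5) When the price drifts downward between the middle band and the lower band (LOWER), the market is bearish; investors should stay in cash and wait.
(6) After a long, steep decline the middle band flattens and turns upward, and the price stays above the middle band for 2 to 3 days. If the price then pulls back, the low of that pullback is often a suitable short-to-medium-term entry point for moderate buying.
(7) For a strong stock running between the middle band and the upper band, a pullback to the middle band can serve as a buy point, with the middle band as the key take-profit and stop-loss line.
(8) A surging stock often breaks above the upper band for a short time. Once it is far above the upper band and trading volume cannot keep expanding, consider taking short-term profits; if the price falls back from outside the upper band and breaks below it, that is also a sell point.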
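The "crossing" in rules (1), (2) and (4) can be tested mechanically by comparing two consecutive days. The sketch below is an assumption-laden illustration, not a trading system: the helper names `crosses_up` and `crosses_down`, the flat band values, and the prices are all made up, and the "for a long time" condition in rule (4) is deliberately ignored.

```python
import numpy as np


def crosses_up(price, line):
    """True at index t when price is below the line at t-1 and on or above it at t."""
    price = np.asarray(price, dtype=float)
    line = np.asarray(line, dtype=float)
    crossed = np.zeros(price.size, dtype=bool)
    crossed[1:] = (price[:-1] < line[:-1]) & (price[1:] >= line[1:])
    return crossed


def crosses_down(price, line):
    """True at index t when price is above the line at t-1 and on or below it at t."""
    price = np.asarray(price, dtype=float)
    line = np.asarray(line, dtype=float)
    crossed = np.zeros(price.size, dtype=bool)
    crossed[1:] = (price[:-1] > line[:-1]) & (price[1:] <= line[1:])
    return crossed


# Made-up prices and flat bands, purely for illustration.
price = np.array([9.0, 8.0, 10.0, 12.0, 11.0])
lower = np.full(5, 8.5)
middle = np.full(5, 11.5)

buy_days = np.flatnonzero(crosses_up(price, lower))      # rule (1)
sell_days = np.flatnonzero(crosses_down(price, middle))  # crossing part of rule (4)
print(buy_days, sell_days)
```

With real data, `lower` and `middle` would be the band arrays, trimmed so that they have the same length as the price array they are compared with.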
3) Plotting Bollinger Bands
The following example plots Bollinger Bands.
import datetime as dt

import numpy as np
import matplotlib.pyplot as mp
import matplotlib.dates as md


def dmy2ymd(dmy):
    # Convert a day-month-year string (e.g. 28-01-2011) to year-month-day.
    # Older NumPy versions pass bytes to converters, newer ones pass str.
    if isinstance(dmy, bytes):
        dmy = dmy.decode("utf-8")
    dat = dt.datetime.strptime(dmy, "%d-%m-%Y").date()
    tm = dat.strftime("%Y-%m-%d")
    return tm


# Load the date, open, high, low and close columns.
dates, open_prices, highest_prices, lowest_prices, close_prices = \
    np.loadtxt("../da_data/aapl.csv",
               delimiter=",",
               usecols=(1, 3, 4, 5, 6),
               unpack=True,
               dtype="M8[D], f8, f8, f8, f8",
               converters={1: dmy2ymd})

# Set up the figure and the date axis.
mp.figure("AAPL K-Line", facecolor="lightgray")
mp.title("AAPL K-Line")
mp.xlabel("Day", fontsize=12)
mp.ylabel("Price", fontsize=12)
ax = mp.gca()
ax.xaxis.set_major_locator(md.WeekdayLocator(byweekday=md.MO))
ax.xaxis.set_major_formatter(md.DateFormatter("%d %b %Y"))
ax.xaxis.set_minor_locator(md.DayLocator())
mp.tick_params(labelsize=8)
dates = dates.astype(dt.datetime)
mp.plot(dates, open_prices, color="dodgerblue", linestyle="--")
mp.gcf().autofmt_xdate()

# Middle band: 5-day moving average of the closing price.
ma_5 = np.zeros(close_prices.size - 4)
for i in range(ma_5.size):
    ma_5[i] = close_prices[i: i + 5].mean()
mp.plot(dates[4:], ma_5, color="orangered", label="MA_5")

# Standard deviation of the same 5-day windows.
stds = np.zeros(ma_5.size)
for i in range(stds.size):
    stds[i] = close_prices[i: i + 5].std()

# Upper and lower bands: middle band plus or minus two standard deviations.
upper = ma_5 + 2 * stds
lower = ma_5 - 2 * stds
mp.plot(dates[4:], upper, color="green", label="UPPER")
mp.plot(dates[4:], lower, color="blue", label="LOWER")
# Shade the area between the two bands.
mp.fill_between(dates[4:], upper, lower, lower < upper,
                color="orangered", alpha=0.05)
mp.grid(linestyle="--")
mp.legend()
mp.show()
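Output: a chart showing the opening-price line, the MA_5 middle band, and the UPPER and LOWER bands with the area between them shaded.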
A Stock Backtesting Model Based on Function Vectorization
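def model(data):
    # Pseudocode: analyze business indicators such as moving averages and
    # Bollinger Bands to reach a result. xxx stands for unspecified conditions.
    if xxx and/or xxx:
        return 1
    elif xxxx:
        return -1
    if xx:
        return 1
    x = linalg.lstsq(xxx)
    if xx xxx:
        return 0
    ...
    return 1 / -1 / 0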
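Function vectorization means deriving, from a function that can only handle scalar data, a function that can handle vector data, so that large amounts of data can be processed with array-style calls.
NumPy provides the vectorize function, which turns a scalar function into a vectorized one; the returned function accepts ndarray arguments directly.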
import math

import numpy as np


def func(x, y):
    return math.sqrt(x ** 2 + y ** 2)


x, y = 3, 4
print(func(x, y))
X = np.array([3, 6, 9])
Y = np.array([4, 8, 12])
vect_func = np.vectorize(func)
print(vect_func(X, Y))
Output:
5.0
[ 5. 10. 15.]
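A note on when `np.vectorize` is worth using: the NumPy documentation describes it as a convenience that is implemented essentially as a loop, not as a performance tool. For a formula like the one above, plain array arithmetic already works element by element. The short sketch below reuses the same `X` and `Y` values to show two equivalent alternatives; it is a side illustration and not part of the backtesting model.

```python
import numpy as np

X = np.array([3, 6, 9])
Y = np.array([4, 8, 12])

# Array arithmetic is applied element by element, so no wrapper is needed.
direct = np.sqrt(X ** 2 + Y ** 2)
# np.hypot is NumPy's built-in function for the same formula.
builtin = np.hypot(X, Y)
print(direct, builtin)
```

`np.vectorize` remains useful when the scalar function contains branching logic that is awkward to express with array operations, which is the situation in the strategy function that follows.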
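Case study: define an investment strategy that takes a date and, based on moving-average theory, returns whether to buy at the closing price; then use historical data to judge whether the strategy is worth adopting.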
import datetime as dt

import numpy as np
import matplotlib.pyplot as mp


def dmy2ymd(dmy):
    # Convert a year/month/day string to ISO year-month-day.
    # Older NumPy versions pass bytes to converters, newer ones pass str.
    if isinstance(dmy, bytes):
        dmy = dmy.decode("utf-8")
    dat = dt.datetime.strptime(dmy, "%Y/%m/%d").date()
    tm = dat.strftime("%Y-%m-%d")
    return tm


# Assumption: the date column is stored as YYYY/MM/DD, as the format string in
# dmy2ymd suggests, so the converter is applied to column 0. Remove the
# converters argument if the file already uses ISO dates.
dates, opening_prices, highest_prices, lowest_prices, closing_prices, ma5, ma10 = \
    np.loadtxt("../data/pfyh.csv",
               delimiter=",",
               usecols=(0, 1, 2, 3, 4, 5, 6),
               unpack=True,
               dtype="M8[D], f8, f8, f8, f8, f8, f8",
               converters={0: dmy2ymd})

mp.plot(dates, closing_prices)
mp.show()


def profit(m8date):
    # Use only the averages from before the given date.
    mma5 = ma5[dates < m8date]
    mma10 = ma10[dates < m8date]
    if mma5.size < 2:
        return 0
    # MA5 crosses above MA10: buy.
    if (mma5[-2] <= mma10[-2]) and (mma5[-1] >= mma10[-1]):
        return 1
    # MA5 crosses below MA10: sell.
    if (mma5[-2] >= mma10[-2]) and (mma5[-1] <= mma10[-1]):
        return -1
    return 0


vec_func = np.vectorize(profit)
profits = vec_func(dates)
print(profits)

assets = 1000000   # cash
stocks = 0         # shares held
payment_price = 0  # price of the most recent trade
status = 0         # 1 after a buy, -1 after a sell
# The loop variable is called signal so that it does not shadow profit().
for index, signal in enumerate(profits):
    current_price = closing_prices[index]
    # Stop-loss: sell when the position has lost more than 5% of its cost.
    if status == 1:
        payment_assets = payment_price * stocks
        current_assets = current_price * stocks
        if (payment_assets > current_assets) and \
                ((payment_assets - current_assets) > payment_assets * 0.05):
            payment_price = current_price
            assets = assets + stocks * payment_price
            stocks = 0
            status = -1
            print('Stop-loss: dates:{}, curr price:{:.2f}, assets:{:.2f}, stocks:{:d}'.format(
                dates[index], current_price, assets, stocks))
    # Buy with all available cash.
    if (signal == 1) and (status != 1):
        payment_price = current_price
        stocks = int(assets / payment_price)
        assets = assets - stocks * payment_price
        status = 1
        print('Buy: dates:{}, curr price:{:.2f}, assets:{:.2f}, stocks:{:d}'.format(
            dates[index], current_price, assets, stocks))
    # Sell the whole position.
    if (signal == -1) and (status != -1):
        payment_price = current_price
        assets = assets + stocks * payment_price
        stocks = 0
        status = -1
        print('Sell: dates:{}, curr price:{:.2f}, assets:{:.2f}, stocks:{:d}'.format(
            dates[index], current_price, assets, stocks))
# Final position after the last trading day.
print('Holding: dates:{}, curr price:{:.2f}, assets:{:.2f}, stocks:{:d}'.format(
    dates[index], current_price, assets, stocks))
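The strategy function above is called once per date and re-filters the arrays each time. As a sketch of an alternative, the same crossover rule can be written with array slices, assuming the dates are unique and sorted in ascending order so that "the two days before day t" are simply indices t-2 and t-1. The helper name `crossover_signals` and the sample averages are made up for illustration.

```python
import numpy as np


def crossover_signals(ma_short, ma_long):
    """Signal per day (1 buy, -1 sell, 0 wait) from the two preceding days."""
    short = np.asarray(ma_short, dtype=float)
    long_ = np.asarray(ma_long, dtype=float)
    signals = np.zeros(short.size, dtype=int)
    # For day t >= 2, compare day t-2 with day t-1.
    prev2_s, prev1_s = short[:-2], short[1:-1]
    prev2_l, prev1_l = long_[:-2], long_[1:-1]
    buy = (prev2_s <= prev2_l) & (prev1_s >= prev1_l)
    sell = (prev2_s >= prev2_l) & (prev1_s <= prev1_l)
    # The buy test comes first in the scalar function, so it wins on ties.
    signals[2:] = np.where(buy, 1, np.where(sell, -1, 0))
    return signals


# Made-up averages: the short line rises through the long one, then falls back.
ma_short = [1, 2, 4, 3, 1, 1]
ma_long = [2, 2, 2, 2, 2, 2]
print(crossover_signals(ma_short, ma_long))
```

The first two days always get 0 because fewer than two earlier values exist, which mirrors the size check in the scalar version.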
Code Summary
Common Quantitative Trading Indicators
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
Moving Average
data = pd.read_csv('../../data/aapl.csv', header=None, usecols=[1, 6])
data.columns = ['date', 'close']


def todate(item):
    # The file stores dates as day-month-year; reverse the parts to
    # year-month-day before parsing.
    datestr = '-'.join(item.split('-')[::-1])
    return pd.to_datetime(datestr)


data['date'] = data['date'].apply(todate)
data.plot(x='date', y='close')
data
          date   close
0   2011-01-28  336.10
1   2011-01-31  339.32
2   2011-02-01  345.03
3   2011-02-02  344.32
4   2011-02-03  343.44
5   2011-02-04  346.50
6   2011-02-07  351.88
7   2011-02-08  355.20
8   2011-02-09  358.16
9   2011-02-10  354.54
10  2011-02-11  356.85
11  2011-02-14  359.18
12  2011-02-15  359.90
13  2011-02-16  363.13
14  2011-02-17  358.30
15  2011-02-18  350.56
16  2011-02-22  338.61
17  2011-02-23  342.62
18  2011-02-24  342.88
19  2011-02-25  348.16
20  2011-02-28  353.21
21  2011-03-01  349.31
22  2011-03-02  352.12
23  2011-03-03  359.56
24  2011-03-04  360.00
25  2011-03-07  355.36
26  2011-03-08  355.76
27  2011-03-09  352.47
28  2011-03-10  346.67
29  2011-03-11  351.99
ma5 = np.zeros(len(data) - 4)
for i in range(ma5.size):
    ma5[i] = np.mean(data['close'][i:i + 5])
ma5
0.0
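# Index the averages 4..len(data)-1 so that each one lines up with the last
# day of its window; the first four rows become NaN.
ma5 = pd.Series(ma5, index=np.arange(4, len(data)))
data['ma5'] = ma5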
data
          date   close      ma5
0   2011-01-28  336.10      NaN
1   2011-01-31  339.32      NaN
2   2011-02-01  345.03      NaN
3   2011-02-02  344.32      NaN
4   2011-02-03  343.44  341.642
5   2011-02-04  346.50  343.722
6   2011-02-07  351.88  346.234
7   2011-02-08  355.20  348.268
8   2011-02-09  358.16  351.036
9   2011-02-10  354.54  353.256
10  2011-02-11  356.85  355.326
11  2011-02-14  359.18  356.786
12  2011-02-15  359.90  357.726
13  2011-02-16  363.13  358.720
14  2011-02-17  358.30  359.472
15  2011-02-18  350.56  358.214
16  2011-02-22  338.61  354.100
17  2011-02-23  342.62  350.644
18  2011-02-24  342.88  346.594
19  2011-02-25  348.16  344.566
20  2011-02-28  353.21  345.096
21  2011-03-01  349.31  347.236
22  2011-03-02  352.12  349.136
23  2011-03-03  359.56  352.472
24  2011-03-04  360.00  354.840
25  2011-03-07  355.36  355.270
26  2011-03-08  355.76  356.560
27  2011-03-09  352.47  356.630
28  2011-03-10  346.67  354.052
29  2011-03-11  351.99  352.450
data.plot(x='date', y=['close', 'ma5'])
ma10 = np.zeros(len(data) - 9)
for i in range(ma10.size):
    ma10[i] = np.mean(data['close'][i:i + 10])
ma10 = pd.Series(ma10, index=np.arange(9, len(data)))
data['ma10'] = ma10
data.plot(x='date', y=['close', 'ma5', 'ma10'])
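The loop-and-reindex pattern used for the 5-day and 10-day averages can also be expressed with pandas' rolling windows. The sketch below is an optional alternative, not the summary's own code; it uses the first seven closing prices from the table shown earlier and checks that both approaches agree.

```python
import numpy as np
import pandas as pd

# First seven closing prices from the table above.
close = pd.Series([336.10, 339.32, 345.03, 344.32, 343.44, 346.50, 351.88])

# Loop version, as in the summary code.
ma5_loop = np.zeros(len(close) - 4)
for i in range(ma5_loop.size):
    ma5_loop[i] = np.mean(close[i:i + 5])

# rolling(5).mean() yields NaN for the first four rows and keeps the index
# aligned with the last day of each window, so no manual reindexing is needed.
ma5_roll = close.rolling(5).mean()
print(ma5_roll)
```

Assigning `ma5_roll` to a DataFrame column would give the same NaN-padded layout as the `ma5` column in the table above.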
Bollinger Bands
stds = np.zeros(len(data) - 4)
for i in range(stds.size):
    stds[i] = np.std(data['close'][i:i + 5])
stds = pd.Series(stds, index=np.arange(4, len(data)))
data['stds'] = stds
data['upper'] = data['ma5'] + 2 * data['stds']
data['lower'] = data['ma5'] - 2 * data['stds']
data.plot(x='date', y=['close', 'ma5', 'upper', 'lower'])
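If the bands are computed with pandas' rolling windows instead of an explicit loop, one detail needs care: `np.std` defaults to the population standard deviation (`ddof=0`), while pandas' rolling `std` defaults to the sample standard deviation (`ddof=1`). The sketch below, using the first seven closing prices from the earlier table, passes `ddof=0` so the result matches the loop version; the DataFrame name `bands` is made up for the example.

```python
import numpy as np
import pandas as pd

# First seven closing prices from the table above.
close = pd.Series([336.10, 339.32, 345.03, 344.32, 343.44, 346.50, 351.88])

ma5 = close.rolling(5).mean()
# ddof=0 matches np.std's default; omitting it would give slightly wider bands.
stds = close.rolling(5).std(ddof=0)

bands = pd.DataFrame({
    "close": close,
    "ma5": ma5,
    "upper": ma5 + 2 * stds,
    "lower": ma5 - 2 * stds,
})
print(bands)
```

Which definition to use is a modelling choice; the point is only that the two defaults differ, so results from the two approaches should not be mixed without checking.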
SPD Bank (Shanghai Pudong Development Bank) Quantitative Backtesting Model
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Loading the pfyh file
data = pd.read_csv('../data/pfyh.csv', header=None,
                   names=['date', 'open', 'high', 'low', 'close', 'ma5',
                          'ma10', 'volume'])
data['date'] = pd.to_datetime(data['date'])
data.describe()
             open        high         low       close         ma5        ma10        volume
count  487.000000  487.000000  487.000000  487.000000  487.000000  487.000000  4.870000e+02
mean    11.302238   11.406242   11.198768   11.296715   11.297713   11.299458  3.742897e+05
std      0.964352    0.981289    0.954958    0.968787    0.960244    0.952560  3.517131e+05
min      9.260000    9.350000    9.170000    9.240000    9.312000    9.402000  8.995716e+04
25%     10.565000   10.660000   10.465000   10.555000   10.555000   10.549500  1.980381e+05
50%     11.390000   11.460000   11.300000   11.370000   11.396000   11.382000  2.732542e+05
75%     11.950000   12.040000   11.840000   11.920000   11.899000   11.881000  4.008432e+05
max     13.650000   14.000000   13.360000   13.660000   13.450000   13.383000  3.796536e+06
data.plot(x='date', y='close')
Designing the investment strategy
def profit(mdate):
    """Take a date and analyze it together with the prices of recent days.

    Returns 1 to suggest buying, -1 to suggest selling, 0 to wait and see.
    """
    mask = data['date'] <= mdate
    if len(data[mask]) < 2:
        return 0
    today_data = data[mask].iloc[-1]
    yest_data = data[mask].iloc[-2]
    tma5, tma10 = today_data['ma5'], today_data['ma10']
    yma5, yma10 = yest_data['ma5'], yest_data['ma10']
    # MA5 crosses above MA10: buy.
    if (yma5 <= yma10) and (tma5 >= tma10):
        return 1
    # MA5 crosses below MA10: sell.
    if (yma5 >= yma10) and (tma5 <= tma10):
        return -1
    return 0


profit(pd.to_datetime('2018-11-12'))
-1
Backtesting model
profits = data['date'].apply(profit)

assets = 1000000  # cash
stocks = 0        # shares held
status = 0        # 1 after a buy, -1 after a sell
for k, v in profits.items():
    payment_price = data.loc[k]['close']
    # Buy with all available cash.
    if v == 1 and status != 1:
        stocks = int(assets / payment_price)
        assets = assets - stocks * payment_price
        status = 1
        print('Buy {}----cash: {}, shares held: {}'.format(
            data.loc[k]['date'], assets, stocks))
    # Sell the whole position.
    if v == -1 and status != -1:
        assets = assets + stocks * payment_price
        stocks = 0
        status = -1
        print('Sell {}----cash: {}, shares held: {}'.format(
            data.loc[k]['date'], assets, stocks))
Buy 2018-01-04 00:00:00----cash: 11.92000000004191, shares held: 78988
Sell 2018-02-02 00:00:00----cash: 1037914.2400000001, shares held: 0
Buy 2018-02-07 00:00:00----cash: 4.640000000130385, shares held: 77168
Sell 2018-02-12 00:00:00----cash: 964604.6400000001, shares held: 0
Buy 2018-03-01 00:00:00----cash: 1.2800000000279397, shares held: 77416
Sell 2018-03-02 00:00:00----cash: 960733.8400000001, shares held: 0
Buy 2018-03-12 00:00:00----cash: 3.3700000001117587, shares held: 76797
Sell 2018-03-15 00:00:00----cash: 950750.2300000002, shares held: 0
Buy 2018-04-10 00:00:00----cash: 1.270000000251457, shares held: 80846
Sell 2018-04-19 00:00:00----cash: 952367.1500000003, shares held: 0
Buy 2018-04-24 00:00:00----cash: 1.270000000251457, shares held: 80846
Sell 2018-04-26 00:00:00----cash: 933772.5700000003, shares held: 0
Buy 2018-05-15 00:00:00----cash: 4.570000000298023, shares held: 84888
Sell 2018-05-18 00:00:00----cash: 924434.8900000004, shares held: 0
Buy 2018-06-07 00:00:00----cash: 6.810000000405125, shares held: 87128
Sell 2018-06-08 00:00:00----cash: 907009.2900000004, shares held: 0
Buy 2018-07-10 00:00:00----cash: 2.9700000003213063, shares held: 94776
Sell 2018-07-19 00:00:00----cash: 907009.2900000004, shares held: 0
Buy 2018-07-20 00:00:00----cash: 5.640000000479631, shares held: 91895
Sell 2018-08-03 00:00:00----cash: 903333.4900000005, shares held: 0
Buy 2018-08-09 00:00:00----cash: 1.4500000004190952, shares held: 89086
Sell 2018-08-17 00:00:00----cash: 888188.8700000005, shares held: 0
Buy 2018-08-22 00:00:00----cash: 6.530000000493601, shares held: 88026
Sell 2018-09-05 00:00:00----cash: 890829.6500000004, shares held: 0
Buy 2018-09-18 00:00:00----cash: 5.150000000372529, shares held: 86825
Sell 2018-10-10 00:00:00----cash: 883015.4000000004, shares held: 0
Buy 2018-10-19 00:00:00----cash: 9.56000000028871, shares held: 85068
Sell 2018-11-12 00:00:00----cash: 918743.9600000003, shares held: 0
Buy 2018-12-03 00:00:00----cash: 6.560000000405125, shares held: 83370
Sell 2018-12-12 00:00:00----cash: 894566.6600000005, shares held: 0
Buy 2019-01-07 00:00:00----cash: 9.360000000451691, shares held: 89635
Sell 2019-03-11 00:00:00----cash: 1028122.8100000005, shares held: 0
Buy 2019-03-21 00:00:00----cash: 0.3700000004610047, shares held: 89714
Sell 2019-03-25 00:00:00----cash: 989545.7900000004, shares held: 0
Buy 2019-04-03 00:00:00----cash: 5.290000000386499, shares held: 86047
Sell 2019-04-15 00:00:00----cash: 986964.3800000005, shares held: 0
Buy 2019-04-17 00:00:00----cash: 6.500000000465661, shares held: 82868
Sell 2019-04-25 00:00:00----cash: 956303.2200000004, shares held: 0
Buy 2019-05-08 00:00:00----cash: 6.380000000470318, shares held: 83084
Sell 2019-05-10 00:00:00----cash: 940517.2600000005, shares held: 0
Buy 2019-05-21 00:00:00----cash: 6.380000000470318, shares held: 83084
Sell 2019-05-23 00:00:00----cash: 922238.7800000005, shares held: 0
Buy 2019-06-03 00:00:00----cash: 8.540000000502914, shares held: 81758
Sell 2019-06-27 00:00:00----cash: 951671.6600000005, shares held: 0
Buy 2019-07-17 00:00:00----cash: 2.6200000004610047, shares held: 82898
Sell 2019-08-05 00:00:00----cash: 931776.1400000005, shares held: 0
Buy 2019-08-14 00:00:00----cash: 3.0200000004842877, shares held: 82604
Sell 2019-08-21 00:00:00----cash: 942514.6600000005, shares held: 0
Buy 2019-08-22 00:00:00----cash: 8.290000000502914, shares held: 82459
Sell 2019-08-29 00:00:00----cash: 926022.8600000006, shares held: 0
Buy 2019-09-05 00:00:00----cash: 1.8200000006472692, shares held: 79692
Sell 2019-09-23 00:00:00----cash: 936382.8200000006, shares held: 0
Buy 2019-10-08 00:00:00----cash: 7.520000000600703, shares held: 78687
Sell 2019-10-24 00:00:00----cash: 1030020.3500000006, shares held: 0
Buy 2019-11-07 00:00:00----cash: 7.630000000586733, shares held: 80722
Sell 2019-11-12 00:00:00----cash: 988044.9100000006, shares held: 0
Buy 2019-12-11 00:00:00----cash: 6.3900000005960464, shares held: 82474
Sell 2019-12-26 00:00:00----cash: 1013611.8500000006, shares held: 0
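The log above ends with about 1,013,611.85 in cash after starting from 1,000,000. To judge whether a strategy is worth adopting, it helps to compare its final value with a simple baseline such as buying on the first day and holding. The sketch below packages the backtest loop as a function and adds that baseline; the names `backtest` and `buy_and_hold` and the sample prices and signals are assumptions for illustration, and transaction costs are ignored.

```python
def backtest(signals, prices, cash=1_000_000):
    """Replay buy (1) and sell (-1) signals; return the final total value."""
    stocks = 0
    status = 0
    for signal, price in zip(signals, prices):
        if signal == 1 and status != 1:
            # Buy as many whole shares as the cash allows.
            stocks = int(cash / price)
            cash -= stocks * price
            status = 1
        elif signal == -1 and status != -1:
            # Sell the whole position.
            cash += stocks * price
            stocks = 0
            status = -1
    # Value any shares still held at the last price.
    return cash + stocks * prices[-1]


def buy_and_hold(prices, cash=1_000_000):
    """Baseline: buy on the first day and hold until the last day."""
    stocks = int(cash / prices[0])
    return cash - stocks * prices[0] + stocks * prices[-1]


# Made-up prices and signals, purely for illustration.
prices = [10.0, 11.0, 12.0, 9.0, 10.0]
signals = [1, 0, -1, 0, 1]
print(backtest(signals, prices), buy_and_hold(prices))
```

With the real data, `signals` would be the `profits` series and `prices` the closing prices; a strategy that does not beat the baseline over the test period is hard to justify.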