3. Fraud and its distribution over time
- # Descriptive statistics of Time for each class
- print('Normal')
- print(crecreditcard_data.Time[crecreditcard_data.Class == 0].describe())
- print('-' * 25)
- print('Fraud')
- print(crecreditcard_data.Time[crecreditcard_data.Class == 1].describe())
- Normal
- count 284315.000000
- mean 94838.202258
- std 47484.015786
- min 0.000000
- 25% 54230.000000
- 50% 84711.000000
- 75% 139333.000000
- max 172792.000000
- Name: Time, dtype: float64
- -------------------------
- Fraud
- count 492.000000
- mean 80746.806911
- std 47835.365138
- min 406.000000
- 25% 41241.500000
- 50% 75568.500000
- 75% 128483.000000
- max 170348.000000
- Name: Time, dtype: float64
- # Two stacked histograms of Time sharing the x-axis: fraud on top, normal below
- f, (ax1, ax2) = plt.subplots(2, 1, sharex=True, figsize=(12, 6))
- bins = 50
- ax1.hist(crecreditcard_data.Time[crecreditcard_data.Class == 1], bins=bins)
- ax1.set_title('Fraud', fontsize=22)
- ax1.set_ylabel('Number of transactions', fontsize=15)
- ax2.hist(crecreditcard_data.Time[crecreditcard_data.Class == 0], bins=bins)
- ax2.set_title('Normal', fontsize=22)
- plt.xlabel('Time (seconds)', fontsize=15)
- plt.xticks(fontsize=15)
- plt.ylabel('Number of transactions', fontsize=15)
- # plt.yticks(fontsize=22)
- plt.show()
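Fraudulent transactions show no clear relationship with time and no periodic pattern.
Normal transactions are clearly periodic: the histogram has two broad peaks over the recorded span of roughly 48 hours (the maximum Time is 172,792 seconds).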
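One way to check the periodicity claim more directly is to fold Time onto a 24-hour clock and count transactions per hour. The sketch below assumes Time is the number of seconds elapsed since the first record, so "hour 0" is relative to that record rather than to midnight. The helper `hourly_counts` and the sample values are hypothetical and not part of the original analysis.

```python
from collections import Counter

SECONDS_PER_HOUR = 3600
HOURS_PER_DAY = 24


def hourly_counts(times_in_seconds):
    """Count transactions per hour of day (0-23) from elapsed-second offsets.

    Assumption: offsets are seconds since the first record, so hour 0 is
    relative to that record, not necessarily midnight.
    """
    counts = Counter(
        int(t // SECONDS_PER_HOUR) % HOURS_PER_DAY for t in times_in_seconds
    )
    # Return a dense list so hours without transactions show up as 0.
    return [counts.get(hour, 0) for hour in range(HOURS_PER_DAY)]


# Tiny made-up sample (not taken from the dataset), offsets in seconds.
sample = [0, 1800, 3600, 86400, 90000, 172792]
result = hourly_counts(sample)
print(result)
```

The function accepts any iterable of seconds, so it could be called as `hourly_counts(crecreditcard_data.Time[crecreditcard_data.Class == 1])` and the result drawn with `plt.bar(range(24), result)`. If fraud really has no daily rhythm, its bars should look comparatively flat next to those of the normal class.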
4. Fraud and the distribution of transaction amounts
- # Descriptive statistics of Amount for each class
- print('Fraud')
- print(crecreditcard_data.Amount[crecreditcard_data.Class == 1].describe())
- print('-' * 25)
- print('Normal')
- print(crecreditcard_data.Amount[crecreditcard_data.Class == 0].describe())
- Fraud
- count 492.000000
- mean 122.211321
- std 256.683288
- min 0.000000
- 25% 1.000000
- 50% 9.250000
- 75% 105.890000
- max 2125.870000
- Name: Amount, dtype: float64
- -------------------------
- Normal
- count 284315.000000
- mean 88.291022
- std 250.105092
- min 0.000000
- 25% 5.650000
- 50% 22.000000
- 75% 77.050000
- max 25691.160000
- Name: Amount, dtype: float64
- # Two stacked histograms of Amount sharing the x-axis: fraud on top, normal below
- f, (ax1, ax2) = plt.subplots(2, 1, sharex=True, figsize=(12, 6))
- bins = 30
- ax1.hist(crecreditcard_data.Amount[crecreditcard_data.Class == 1], bins=bins)
- ax1.set_title('Fraud', fontsize=22)
- ax1.set_ylabel('Number of transactions', fontsize=15)
- ax2.hist(crecreditcard_data.Amount[crecreditcard_data.Class == 0], bins=bins)
- ax2.set_title('Normal', fontsize=22)
- plt.xlabel('Amount ($)', fontsize=15)
- plt.xticks(fontsize=15)
- plt.ylabel('Number of transactions', fontsize=15)
- # Log scale on the y-axis of the current (lower) axes only
- plt.yscale('log')
- plt.show()
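Amounts are generally low in both classes (the median is 9.25 for fraud and 22.00 for normal transactions), so the Amount column on its own is of limited value for the analysis.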
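To put a number on "generally low", one option is to compute the share of transactions at or below a chosen threshold for each class. This is a minimal sketch, not the author's method: `share_at_or_below` is a hypothetical helper, the threshold of 100 is an arbitrary illustration, and the two sample lists simply reuse the five-number summaries printed above rather than the real data.

```python
def share_at_or_below(amounts, threshold):
    """Return the fraction of amounts that are <= threshold."""
    amounts = list(amounts)
    if not amounts:
        raise ValueError("amounts must not be empty")
    return sum(1 for a in amounts if a <= threshold) / len(amounts)


# Illustrative samples only: min, quartiles and max from describe() above.
sample_fraud = [0.0, 1.0, 9.25, 105.89, 2125.87]
sample_normal = [0.0, 5.65, 22.0, 77.05, 25691.16]

fraud_share = share_at_or_below(sample_fraud, 100)
normal_share = share_at_or_below(sample_normal, 100)
print(fraud_share, normal_share)
```

On the real data the same helper could be called with `crecreditcard_data.Amount[crecreditcard_data.Class == 1]` and `crecreditcard_data.Amount[crecreditcard_data.Class == 0]`. Similar shares in both classes would support the reading that Amount alone separates them poorly.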
5. Relationship between each independent variable (V1-V29) and the dependent variable