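NumPy is a Python library that provides large multi-dimensional arrays and matrices, along with a collection of mathematical functions that operate on them.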
How to use Kaggle
1. Go to www.kaggle.com.
2. Sign up or log in to your account.
3. Click on the notebook entry in the menu on the left side.
4. Click the New Notebook button to create a notebook.

NumPy

    import numpy as np
    import sys

Common NumPy operations compared with Python lists
NumPy arrays are more compact than Python lists. Data that occupies about 20 MB as a Python list fits in about 4 MB as a NumPy array with single-precision floats in the cells. For example:
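    import numpy as np
    import sys

    py_array = [1, 2, 3, 4, 5, 6]
    numpy_array = np.array([1, 2, 3, 4, 5, 6])

    # Size of one Python int object times the number of elements
    sizeof_py_arr = sys.getsizeof(1) * len(py_array)
    # Bytes per array element times the number of elements
    sizeof_numpy_arr = numpy_array.itemsize * numpy_array.size

    print(sizeof_py_arr)
    print(sizeof_numpy_arr)

Output (on a typical 64-bit build):

    168
    48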
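The calculation with sys.getsizeof(1) counts only the integer objects, not the list's own pointer storage. A rough sketch of a fuller comparison follows, assuming one million floats on a 64-bit CPython build. The element count and variable names are illustrative and not taken from the original text.

```python
import sys
import numpy as np

n = 1_000_000  # illustrative element count (assumption)

py_list = [float(i) for i in range(n)]
np_arr = np.arange(n, dtype=np.float32)  # single precision: 4 bytes per cell

# A list stores pointers to separate float objects, so count both parts.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# nbytes is itemsize * size: the size of the array's raw data buffer.
array_bytes = np_arr.nbytes

print(list_bytes)   # platform dependent
print(array_bytes)  # 4000000
```

The exact list figure varies by platform and Python version, but the array figure is fixed by the dtype: 4 bytes times the number of elements.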
Reshaping:

    import numpy as np

    n_md_arry = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
    np_modmd_arr = n_md_arry.reshape(5, 2)  # 2x5 becomes 5x2
    print(np_modmd_arr)

    n_md_arry2 = np.array([1, 2, 3, 4])
    np_modmd_arr2 = n_md_arry2.reshape(2, 2)  # 1-D becomes 2x2
    print(np_modmd_arr2)

Output:

    [[ 1  2]
     [ 3  4]
     [ 5  6]
     [ 7  8]
     [ 9 10]]
    [[1 2]
     [3 4]]
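reshape only works when the new shape holds exactly the same number of elements. A small sketch of that rule, and of letting NumPy infer one dimension with -1 (the array and variable names here are illustrative):

```python
import numpy as np

arr = np.arange(1, 11)  # [1, 2, ..., 10]

# -1 asks NumPy to infer that dimension from the total size (10 / 2 = 5).
inferred = arr.reshape(-1, 2)
print(inferred.shape)  # (5, 2)

# A shape whose product differs from arr.size is rejected with ValueError.
try:
    arr.reshape(3, 4)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)  # True
```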
NumPy arange

NumPy's arange() is a common way to create an array over a numeric range. It returns an ndarray whose values are evenly spaced within the given interval.

    numpy.arange([start, ]stop, [step, ]dtype=None) -> numpy.ndarray
    import numpy as np

    np_arr = np.arange(0, 20).reshape(5, 4)  # 0..19 laid out as 5 rows of 4
    print(np_arr)

    f_arr = np_arr.ravel()  # flatten back to one dimension
    print(f_arr)

Output:

    [[ 0  1  2  3]
     [ 4  5  6  7]
     [ 8  9 10 11]
     [12 13 14 15]
     [16 17 18 19]]
    [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]
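The example above uses only start and stop. A short sketch of the optional step and dtype parameters from the signature, with illustrative values:

```python
import numpy as np

# start=0, stop=10 (excluded), step=2
evens = np.arange(0, 10, 2)
print(evens)  # [0 2 4 6 8]

# A negative step counts downwards; stop is still excluded.
countdown = np.arange(5, 0, -1)
print(countdown)  # [5 4 3 2 1]

# dtype sets the element type of the result.
as_float = np.arange(3, dtype=np.float32)
print(as_float.dtype)  # float32
```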
Adding two arrays:

    import numpy as np

    in_arr1 = np.array([[1, 2, 3], [4, 5, 6]])
    in_arr2 = np.array([[7, 8, 9], [10, 11, 12]])

    out_arr = np.add(in_arr1, in_arr2)  # element-wise sum
    print(out_arr)

Multiplying two arrays:
    import numpy as np

    in_arr1 = np.array([[1, 2, 3], [4, 5, 6]])
    in_arr2 = np.array([[7, 8, 9], [10, 11, 12]])

    out_arr = np.multiply(in_arr1, in_arr2)  # element-wise product
    print(out_arr)

Output:

    [[ 7 16 27]
     [40 55 72]]
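np.add and np.multiply both work element by element, and the + and * operators do the same on arrays. A minimal sketch, reusing the same input values under shorter illustrative names:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([[7, 8, 9], [10, 11, 12]])

total = a + b    # same result as np.add(a, b)
product = a * b  # same result as np.multiply(a, b)

print(total)
print(product)
```

Both operands need the same shape here (or shapes that NumPy can broadcast to a common one).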
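Matrix multiplication in NumPy:

True matrix multiplication (rows times columns), as opposed to the element-wise product:

    import numpy as np

    in_arr1 = np.array([[1, 2, 3], [4, 5, 6]])       # shape (2, 3)
    in_arr2 = np.array([[7, 8], [9, 10], [11, 12]])  # shape (3, 2)

    out_arr = in_arr1.dot(in_arr2)  # result has shape (2, 2)
    print(out_arr)

Output:

    [[ 58  64]
     [139 154]]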
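For 2-D arrays, dot gives the same result as the @ operator and np.matmul. A brief sketch with the same inputs under illustrative names:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])       # shape (2, 3)
b = np.array([[7, 8], [9, 10], [11, 12]])  # shape (3, 2)

via_dot = a.dot(b)
via_operator = a @ b          # matrix-multiplication operator
via_matmul = np.matmul(a, b)

# The inner dimensions (3 and 3) must match; the result is (2, 2).
print(via_operator.shape)  # (2, 2)
```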
Finding elements in a NumPy array

    import numpy as np

    np_arr = np.array([1, 2, 0, 4, 5])
    find = np.where(np_arr > 2)  # indices of the elements greater than 2
    print(find)

Output:

    (array([3, 4]),)
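np.where with a single argument returns positions, not values. A small sketch of two related patterns, boolean-mask indexing to get the values and the three-argument form of np.where (variable names are illustrative):

```python
import numpy as np

np_arr = np.array([1, 2, 0, 4, 5])

mask = np_arr > 2      # boolean array, one flag per element
values = np_arr[mask]  # the matching elements themselves
print(values)  # [4 5]

# Three-argument form: take np_arr where the condition holds, else 0.
clipped = np.where(np_arr > 2, np_arr, 0)
print(clipped)  # [0 0 0 4 5]
```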
Positions of non-zero values in a NumPy array

    import numpy as np

    np_arr = np.array([1, 2, 0, 4, 5])
    find = np.nonzero(np_arr)
    print(find)

Output:

    (array([0, 1, 3, 4]),)  # indices of the non-zero elements
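The result is a tuple because np.nonzero returns one index array per dimension. A sketch with a 2-D array makes that visible (the grid values are made up for illustration):

```python
import numpy as np

grid = np.array([[0, 7, 0],
                 [3, 0, 5]])

rows, cols = np.nonzero(grid)  # one index array per dimension
print(rows)  # [0 1 1]
print(cols)  # [1 0 2]

# np.argwhere reports the same positions as (row, col) pairs.
pairs = np.argwhere(grid)
print(pairs.tolist())  # [[0, 1], [1, 0], [1, 2]]
```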
Pandas

Importing NumPy and pandas:

    import numpy as np
    import pandas as pd

How to import a CSV file from Kaggle
1. Go to www.kaggle.com.
2. Go to your Kaggle notebook.
3. Click the >| button in the top right corner.
4. Click the + Add data button.
5. Search for a dataset in the search bar.
6. Click the add button to add the dataset to your notebook.
7. Click on the .csv file to find the path of the CSV file.

Reading the dataset with the read_csv function
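    import numpy as np
    import pandas as pd

    # Path of the CSV file as shown in the notebook's data panel
    df = pd.read_csv('../input/pima-indians-diabetes-database/diabetes.csv')
    df  # in a notebook, a bare expression displays the DataFrame

Extract the first 10 rows:

    df.head(10)

Extract the last 10 rows:

    df.tail(10)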
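To try read_csv, head, and tail outside Kaggle, a minimal sketch can read from an in-memory string instead of a file. The sample data below is hypothetical and merely stands in for a real CSV file; the column names are made up.

```python
import io
import pandas as pd

# Hypothetical sample data: 25 rows with made-up columns "id" and "score".
csv_text = "id,score\n" + "\n".join(f"{i},{i * 10}" for i in range(1, 26))

# read_csv accepts file-like objects as well as paths.
df = pd.read_csv(io.StringIO(csv_text))

first_rows = df.head(10)  # rows with id 1 to 10
last_rows = df.tail(10)   # rows with id 16 to 25

print(df.shape)  # (25, 2)
print(first_rows["id"].tolist())
print(last_rows["id"].tolist())
```

Called without an argument, head() and tail() return 5 rows.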
Conclusion: we covered the basics of NumPy and pandas, and how to load real-world datasets hosted on Kaggle.