05. Viewing a DataFrame: Titanic data, row and column counts, per-column counts, showing all columns, first and last n rows

    Tech 2024-01-07

    1. Data: introduction to the Titanic dataset

    Columns:

        Column       Meaning
        PassengerId  Passenger number
        Survived     Survived or not (1 = survived, 0 = died)
        Pclass       Cabin class
        Name         Name
        Sex          Sex
        Age          Age
        SibSp        Number of same-generation relatives (spouse, siblings)
        Parch        Number of different-generation relatives (children, parents)
        Ticket       Ticket number
        Fare         Fare
        Cabin        Cabin number
        Embarked     Port of embarkation

    What the data looks like:

        PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
        1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,,S
        2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Thayer)",female,38,1,0,PC 17599,71.2833,C85,C
        3,1,3,"Heikkinen, Miss. Laina",female,26,0,0,STON/O2. 3101282,7.925,,S
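
To try the calls in this post without the full file, the three sample rows above can be parsed from an in-memory string. This is only a sketch: the `SAMPLE_CSV` and `sample` names are made up for illustration, and the real `titanic_train.csv` has 891 rows rather than 3.

```python
import io

import pandas as pd

# The three sample rows shown above, as raw CSV text (not the full dataset).
SAMPLE_CSV = """PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,,S
2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Thayer)",female,38,1,0,PC 17599,71.2833,C85,C
3,1,3,"Heikkinen, Miss. Laina",female,26,0,0,STON/O2. 3101282,7.925,,S
"""

# read_csv accepts any file-like object, so StringIO stands in for the file.
sample = pd.read_csv(io.StringIO(SAMPLE_CSV))

# The quoted name keeps its comma; the empty Cabin fields become NaN.
print(sample.loc[0, "Name"])
print(sample["Cabin"].isna().tolist())
```

The quoting matters here: without the double quotes around the name, the comma inside it would be read as a field separator.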

    2. Rows and columns: data.shape

    Code:

        import pandas as pd

        if __name__ == '__main__':
            # Read the CSV file
            data = pd.read_csv("titanic_train.csv")
            # Number of rows and columns
            res = data.shape
            print(res)

    Output:

        (891, 12)
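
Since `shape` is an ordinary tuple, it can be unpacked directly. The sketch below uses a small hypothetical frame (4 rows, 3 columns) instead of the Titanic file so that it runs on its own.

```python
import pandas as pd

# Hypothetical stand-in for the Titanic data: 4 rows, 3 columns.
data = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4],
    "Sex": ["male", "female", "female", "female"],
    "Age": [22.0, 38.0, 26.0, 35.0],
})

n_rows, n_cols = data.shape  # shape is a tuple: (rows, columns)
print(n_rows, n_cols)
print(len(data))             # len() of a DataFrame is its row count
print(data.size)             # total number of cells: rows * columns
```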

    3. Count per column: data.count()

    Meaning: how many non-null entries each column has.

    Code:

        import pandas as pd

        if __name__ == '__main__':
            # Read the CSV file
            data = pd.read_csv("titanic_train.csv")
            # Non-null count for each column
            res = data.count()
            print(res)

    Output:

        PassengerId    891
        Survived       891
        Age            714
        Cabin          204
        Embarked       889
        ...
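
Because `count()` skips missing values, subtracting it from the row count gives the number of missing entries per column. A minimal sketch on a made-up frame (the values are illustrative, not taken from the real file):

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a few gaps, standing in for the Titanic data.
data = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4],
    "Age": [22.0, np.nan, 26.0, 35.0],
    "Cabin": [np.nan, "C85", np.nan, "C123"],
})

non_null = data.count()           # non-null entries per column
missing = len(data) - non_null    # rows minus non-null = missing per column
missing_direct = data.isna().sum()  # the same numbers, computed directly

print(missing)
```

Applied to the output above, the same arithmetic would give 891 - 714 = 177 missing ages.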

    4. Row data: one row, data.loc[1]

    Purpose: take the 2nd row (the row labelled 1 under the default 0-based index).
    Result: a Series.

    Code:

        import pandas as pd

        if __name__ == '__main__':
            # Read the CSV file
            data = pd.read_csv("titanic_train.csv")
            res = data.loc[1]
            print(res)

    Output:

        PassengerId    2
        Survived       1
        Pclass         1
        Name           Cumings, Mrs. John Bradley (Florence Briggs Th...
        Sex            female
        Age            38
        SibSp          1
        Parch          0
        Ticket         PC 17599
        Fare           71.2833
        Cabin          C85
        Embarked       C
        Name: 1, dtype: object
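
`loc` looks rows up by index label, not by position. With the default index the two coincide, which is why `data.loc[1]` is the 2nd row; once the rows are reordered they no longer do. The sketch below uses a small hypothetical frame to show the difference from the position-based `iloc`.

```python
import pandas as pd

# Hypothetical 3-row frame with the default index 0, 1, 2.
data = pd.DataFrame({
    "Name": ["Braund", "Cumings", "Heikkinen"],
    "Age": [22, 38, 26],
})

row = data.loc[1]                 # a Series; its name is the row label
print(type(row).__name__, row.name)

# Sorting keeps the original labels, so the index order becomes 1, 2, 0.
shuffled = data.sort_values("Age", ascending=False)

by_label = shuffled.loc[1, "Name"]      # row whose label is 1
by_position = shuffled.iloc[1]["Name"]  # second row in the current order
print(by_label, by_position)
```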

    5. Row data: multiple rows, data.loc[2:4]

    Purpose: take rows 3, 4 and 5. A .loc slice includes both endpoints, so 2:4 covers labels 2, 3 and 4.
    Result: a DataFrame.
    Note: selecting several rows still returns a DataFrame, not a Series.

    Code:

        import pandas as pd

        if __name__ == '__main__':
            # Read the CSV file
            data = pd.read_csv("titanic_train.csv")
            res = data.loc[2:4]
            print(res)

    Output:

           PassengerId  Survived  Pclass  ...    Fare  Cabin  Embarked
        2            3         1       3  ...   7.925    NaN         S
        3            4         1       1  ...  53.100   C123         S
        4            5         0       3  ...   8.050    NaN         S
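
The slice rules differ between `loc` and `iloc`, which is an easy source of off-by-one mistakes. A small sketch on a hypothetical six-row frame; it also shows that passing a one-element list to `loc` returns a one-row DataFrame rather than a Series.

```python
import pandas as pd

# Hypothetical frame with labels 0..5.
data = pd.DataFrame({"PassengerId": [1, 2, 3, 4, 5, 6]})

by_label = data.loc[2:4]      # label slice: both ends included
by_position = data.iloc[2:4]  # position slice: end excluded, like Python lists
one_row_df = data.loc[[2]]    # list of labels: stays a DataFrame
one_row_sr = data.loc[2]      # single label: becomes a Series

print(by_label.index.tolist())
print(by_position.index.tolist())
print(type(one_row_df).__name__, type(one_row_sr).__name__)
```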

    6. Show everything: pd.set_option('display.max_columns', None)

    Show all columns: pd.set_option('display.max_columns', None)
    Show all rows: pd.set_option('display.max_rows', None)

    Example: view the first 5 rows.

        import pandas as pd

        if __name__ == '__main__':
            # Show all columns
            pd.set_option('display.max_columns', None)
            # Read the CSV file
            data = pd.read_csv("titanic_train.csv")
            res = data.head(5)
            print(res)

    Column legend, left to right: index, passenger number, survived or not, cabin class, name, sex, age, same-generation count, different-generation count, ticket number, fare, cabin number, port of embarkation.

    Output:

           PassengerId  Survived  Pclass  Name                                               Sex     Age   SibSp  Parch  Ticket            Fare     Cabin  Embarked
        0  1            0         3       Braund, Mr. Owen Harris                            male    22.0  1      0      A/5 21171         7.2500   NaN    S
        1  2            1         1       Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0  1      0      PC 17599          71.2833  C85    C
        2  3            1         3       Heikkinen, Miss. Laina                             female  26.0  0      0      STON/O2. 3101282  7.9250   NaN    S
        3  4            1         1       Futrelle, Mrs. Jacques Heath (Lily May Peel)       female  35.0  1      0      113803            53.1000  C123   S
        4  5            0         3       Allen, Mr. William Henry                           male    35.0  0      0      373450            8.0500   NaN    S
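
`pd.set_option` changes the setting for the rest of the session. If the wider display is only wanted for one print, `pd.option_context` is one alternative that restores the previous value afterwards. A sketch on a hypothetical wide frame (30 made-up columns):

```python
import pandas as pd

# Hypothetical wide frame: 30 columns named c0..c29, 2 rows.
wide = pd.DataFrame({f"c{i}": [i, i + 1] for i in range(30)})

before = pd.get_option("display.max_columns")

# Inside the block every column is shown; the old value returns on exit.
with pd.option_context("display.max_columns", None):
    inside = pd.get_option("display.max_columns")
    print(wide)

after = pd.get_option("display.max_columns")
```

Whether this is preferable to a global `set_option` depends on the script; for a short exploratory session the global setting used in this post is usually fine.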

    7. First 2 rows: data.head(2)

    Code:

        import pandas as pd

        if __name__ == '__main__':
            # Show all columns
            pd.set_option('display.max_columns', None)
            # Read the CSV file
            data = pd.read_csv("titanic_train.csv")
            res = data.head(2)
            print(res)

    8. Last 2 rows: data.tail(2)

    Code:

        import pandas as pd

        if __name__ == '__main__':
            # Show all columns
            pd.set_option('display.max_columns', None)
            # Read the CSV file
            data = pd.read_csv("titanic_train.csv")
            res = data.tail(2)
            print(res)
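
Two details of `head` and `tail` are worth knowing: called without an argument they return 5 rows, and a negative `n` means "everything except that many rows". A sketch on a hypothetical eight-row frame:

```python
import pandas as pd

# Hypothetical frame with PassengerId 1..8.
data = pd.DataFrame({"PassengerId": [1, 2, 3, 4, 5, 6, 7, 8]})

first_default = data.head()        # no argument: first 5 rows
last_two = data.tail(2)            # last 2 rows
all_but_last_two = data.head(-2)   # negative n: drop the last 2 rows
all_but_first_two = data.tail(-2)  # negative n: drop the first 2 rows

print(last_two["PassengerId"].tolist())
print(all_but_last_two["PassengerId"].tolist())
```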

    9. Selecting several columns: data[cols]

    Purpose: select several columns.
    Result: a DataFrame.

    Code:

        import pandas as pd

        if __name__ == '__main__':
            # Show all columns
            pd.set_option('display.max_columns', None)
            # Read the CSV file
            data = pd.read_csv("titanic_train.csv")
            # The columns I want
            cols = ["PassengerId", "Sex", "Age", "Survived"]
            # Select the data
            res = data[cols]
            print(res)

    Output:

           PassengerId     Sex   Age  Survived
        0            1    male  22.0         0
        1            2  female  38.0         1
        2            3  female  26.0         1
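
The result type depends on whether a list or a single name is passed inside the brackets. The sketch below rebuilds the three rows from the output above by hand (so it runs without the CSV file) and compares the two forms.

```python
import pandas as pd

# The three rows shown above, typed in by hand.
data = pd.DataFrame({
    "PassengerId": [1, 2, 3],
    "Sex": ["male", "female", "female"],
    "Age": [22.0, 38.0, 26.0],
    "Survived": [0, 1, 1],
})

one_col_sr = data["Sex"]          # single name: a Series
one_col_df = data[["Sex"]]        # one-element list: a one-column DataFrame
subset = data[["Survived", "Age"]]  # columns come back in the order listed

print(type(one_col_sr).__name__, type(one_col_df).__name__)
print(subset.columns.tolist())
```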

    10. Selecting a single element: res.loc[3, "Sex"]

    Code:

        import pandas as pd

        if __name__ == '__main__':
            # Show all columns
            pd.set_option('display.max_columns', None)
            # Read the CSV file
            data = pd.read_csv("titanic_train.csv")
            # Select four columns, then the rows labelled 3 to 6
            res = data[['PassengerId', 'Sex', 'Age', 'Survived']].loc[3:6]
            print(res)
            r = res.loc[3, "Sex"]
            print(r)
            print(type(r))

    Output:

           PassengerId     Sex   Age  Survived
        3            4  female  35.0         1
        4            5    male  35.0         0
        5            6    male   NaN         0
        6            7    male  54.0         0
        female
        <class 'str'>
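
Note that `res` keeps the original row labels 3 to 6, so `res.loc[3, "Sex"]` refers to its first row and label 0 no longer exists in it. The sketch below rebuilds the first seven rows by hand from the outputs in sections 9 and 10 and shows two related accessors, `at` (label-based scalar access) and `iat` (position-based), as alternatives to `loc` for a single cell.

```python
import pandas as pd

# First seven rows, typed in from the outputs shown in this post.
data = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4, 5, 6, 7],
    "Sex": ["male", "female", "female", "female", "male", "male", "male"],
    "Age": [22.0, 38.0, 26.0, 35.0, 35.0, float("nan"), 54.0],
    "Survived": [0, 1, 1, 1, 0, 0, 0],
})

res = data[["PassengerId", "Sex", "Age", "Survived"]].loc[3:6]

r_loc = res.loc[3, "Sex"]  # label-based, as in the post
r_at = res.at[3, "Sex"]    # label-based, scalar only
r_iat = res.iat[0, 1]      # position-based: first row, second column

print(r_loc, r_at, r_iat)
print(0 in res.index)            # label 0 was not selected
print(pd.isna(res.loc[5, "Age"]))  # the missing age is NaN
```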