2. append (DataFrame.append was deprecated in pandas 1.4 and removed in 2.0)
- 1. result = df1.append(df2)
- 2. result = df1.append(df4)
- 3. result = df1.append([df2, df3])
- 4. result = df1.append(df4, ignore_index=True)
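On pandas 2.0 and later the same row-wise stacking can be done with pd.concat. The following is a minimal sketch, assuming small hypothetical frames named df1, df2, and df4 (the original frames are not shown in this section):

```python
import pandas as pd

# Hypothetical sample frames; the original df1/df2/df4 are not shown.
df1 = pd.DataFrame({'A': ['A0', 'A1'], 'B': ['B0', 'B1']}, index=[0, 1])
df2 = pd.DataFrame({'A': ['A2', 'A3'], 'B': ['B2', 'B3']}, index=[2, 3])
df4 = pd.DataFrame({'B': ['B4', 'B5'], 'D': ['D4', 'D5']}, index=[1, 2])

# Counterpart of df1.append(df2): stack rows and keep the original index labels.
stacked = pd.concat([df1, df2])

# Counterpart of df1.append(df4, ignore_index=True): the columns are unioned,
# cells with no source value become NaN, and the index is renumbered from 0.
renumbered = pd.concat([df1, df4], ignore_index=True)

print(stacked)
print(renumbered)
```

Passing a list of frames, as in pd.concat([df1, df2, df3]), covers the case where append received a list.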
4. join
left.join(right, on=key_or_keys)
- 1. result = left.join(right, on='key')
- 2. result = left.join(right, on=['key1', 'key2'])
- 3. result = left.join(right, on=['key1', 'key2'], how='inner')
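With on=, join matches a column of the left frame against the index of the right frame. A minimal sketch of the difference between the default how='left' and how='inner', assuming hypothetical frames named left and right:

```python
import pandas as pd

# Hypothetical frames: 'left' carries a 'key' column,
# 'right' is indexed by the same kind of key.
left = pd.DataFrame({'A': ['A0', 'A1', 'A2'], 'key': ['K0', 'K1', 'K3']})
right = pd.DataFrame({'C': ['C0', 'C1']}, index=['K0', 'K1'])

# Default how='left': every row of 'left' is kept; unmatched keys get NaN in 'C'.
left_joined = left.join(right, on='key')

# how='inner': only rows whose key is found in right's index are kept.
inner_joined = left.join(right, on='key', how='inner')

print(left_joined)
print(inner_joined)
```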
5. concat
- 1. result = pd.concat([df1, df4], axis=1)
- 2. result = pd.concat([df1, df4], axis=1, join='inner')
- 3. result = pd.concat([df1, df4], axis=1).reindex(df1.index)  # join_axes=[df1.index] was removed in pandas 1.0
- 4. result = pd.concat([df1, df4], ignore_index=True)
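A minimal sketch of how the join argument changes the row labels when concatenating along axis=1, assuming two small hypothetical frames with partly overlapping indexes. The last call uses reindex, which is the approach pandas recommended as the substitute for the join_axes argument removed in pandas 1.0:

```python
import pandas as pd

# Hypothetical frames whose indexes overlap on labels 1 and 2.
df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2']}, index=[0, 1, 2])
df4 = pd.DataFrame({'D': ['D1', 'D2', 'D3']}, index=[1, 2, 3])

# Default join='outer': union of the row labels, gaps filled with NaN.
outer = pd.concat([df1, df4], axis=1)

# join='inner': only the row labels present in both frames.
inner = pd.concat([df1, df4], axis=1, join='inner')

# Keep exactly df1's row labels.
like_df1 = pd.concat([df1, df4], axis=1).reindex(df1.index)

print(outer)
print(inner)
print(like_df1)
```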
Text processing:
1. lower() function example
- s = pd.Series(['Tom', 'William Rick', 'John', 'Alber@t', np.nan, '1234','SteveMinsu'])
- s.str.lower()
2. upper() function example
- s = pd.Series(['Tom', 'William Rick', 'John', 'Alber@t', np.nan, '1234','SteveMinsu'])
- s.str.upper()
3. len() counts the characters in each string
- s = pd.Series(['Tom', 'William Rick', 'John', 'Alber@t', np.nan, '1234','SteveMinsu'])
- s.str.len()
4. strip() removes leading and trailing whitespace
- s = pd.Series(['Tom ', ' William Rick', 'John', 'Alber@t'])
- s.str.strip()
5. split(pattern) splits each string on the pattern
- s = pd.Series(['Tom ', ' William Rick', 'John', 'Alber@t'])
- s.str.split(' ')
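Because two of the sample strings carry a leading or trailing space, splitting on ' ' directly leaves empty strings in the result. A minimal sketch of combining strip() with split(), plus the expand=True option that spreads the pieces over columns:

```python
import pandas as pd

s = pd.Series(['Tom ', ' William Rick', 'John', 'Alber@t'])

# Splitting on a literal space keeps empty strings where the
# original value started or ended with a space.
raw_parts = s.str.split(' ')

# Stripping first yields clean tokens.
clean_parts = s.str.strip().str.split(' ')

# expand=True returns a DataFrame with one column per token position;
# rows with fewer tokens are padded with missing values.
table = s.str.strip().str.split(' ', expand=True)

print(raw_parts)
print(clean_parts)
print(table)
```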
6. cat(sep=pattern) concatenates the strings, separated by sep
- s = pd.Series(['Tom ', ' William Rick', 'John', 'Alber@t'])
- s.str.cat(sep=' <=> ')
- Running the example above gives the following result:
- Tom  <=>  William Rick <=> John <=> Alber@t
7. get_dummies() splits each string by sep and returns a DataFrame of dummy/indicator variables
- s = pd.Series(['Tom ', ' William Rick', 'John', 'Alber@t'])
- s.str.get_dummies()
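The default separator of get_dummies() is '|', so for strings without a '|', as in the call above, each distinct string simply becomes its own indicator column. A minimal sketch of the splitting behaviour, assuming a hypothetical Series of '|'-separated tags:

```python
import pandas as pd

# Hypothetical data: each entry holds one or more tags separated by '|'.
tags = pd.Series(['a|b', 'b', 'a|c'])

# One column per distinct tag; 1 marks that the row contains the tag.
dummies = tags.str.get_dummies(sep='|')

print(dummies)
```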
8. contains(pat) returns True where the string contains the substring; pat is a string or a regular expression
- s = pd.Series(['Tom ', ' William Rick', 'John', 'Alber@t'])
- s.str.contains(' ')
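The result of contains() is commonly used as a boolean mask. A minimal sketch, assuming a variant of the sample Series that includes a missing value; na=False makes the missing entry count as False so the mask is safe to use for filtering:

```python
import numpy as np
import pandas as pd

# Variant of the sample data with one missing value (an assumption for illustration).
s = pd.Series(['Tom ', ' William Rick', 'John', np.nan])

# Plain substring test; the missing value is reported as False.
has_space = s.str.contains(' ', na=False)

# The pattern is treated as a regular expression:
# optional leading whitespace followed by 'W'.
starts_with_w = s.str.contains(r'^\s*W', regex=True, na=False)

# Use the mask to keep only the matching rows.
with_space = s[has_space]

print(has_space)
print(starts_with_w)
print(with_space)
```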
9. replace(a, b) replaces each occurrence of a with b
- s = pd.Series(['Tom ', ' William Rick', 'John', 'Alber@t'])
- s.str.replace('@', '$')
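The default of the regex argument of str.replace changed from True to False in pandas 2.0, so it may be safer to pass it explicitly. A minimal sketch showing one literal and one regular-expression replacement on the same sample data:

```python
import pandas as pd

s = pd.Series(['Tom ', ' William Rick', 'John', 'Alber@t'])

# Literal replacement: '@' is swapped for '$' character by character.
literal = s.str.replace('@', '$', regex=False)

# Regular-expression replacement: remove every run of whitespace.
no_spaces = s.str.replace(r'\s+', '', regex=True)

print(literal)
print(no_spaces)
```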
10. repeat(value) repeats each element the specified number of times
- s = pd.Series(['Tom ', ' William Rick', 'John', 'Alber@t'])
- s.str.repeat(2)
Running the example above gives the following result:
- 0 Tom Tom
- 1 William Rick William Rick
- 2 JohnJohn
- 3 Alber@tAlber@t
11. count(pattern) counts the occurrences of the pattern in each string
- s = pd.Series(['Tom ', ' William Rick', 'John', 'Alber@t'])
- print("The number of 'm's in each string:")
- print(s.str.count('m'))
Running the example above gives the following result:
The number of 'm's in each string:
- 0 1
- 1 1
- 2 0
- 3 0
12. startswith(pattern) returns True where the string starts with the given substring
- s = pd.Series(['Tom ', ' William Rick', 'John', 'Alber@t'])
- print("Strings that start with 'T':")
- print(s.str.startswith('T'))
Running the example above gives the following result:
Strings that start with 'T':
- 0 True
- 1 False
- 2 False
- 3 False