Using Pandas (8) - str Methods

import pandas as pd

df = pd.DataFrame({'Employee ID':[111, 222, 333, 444],
                   'Employee Name':['Chanel', 'Steve', 'Mitch', 'Bird'],
                   'Salary [$/h]':[35, 29, 38, 20],
                   'Years of Experience':[3, 4, 9, 1]})
df

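In the previous post, we used apply() to count the number of characters in each name in the Employee Name column and stored the result in a new column.

The same result can also be obtained with .str.len().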

# Count the number of characters in each Employee Name
# and store the result in a new column, length.
df['Employee Name'].str.len()
>>> 0    6
    1    5
    2    5
    3    4
    Name: Employee Name, dtype: int64

df['length'] = df['Employee Name'].str.len()
df

As shown here, string methods can be applied to a DataFrame column (a Series) in the form .str.method().
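
As a rough check of the comparison above, the sketch below confirms that apply(len) and .str.len() give the same lengths on this column. The with_missing Series is a hypothetical example that is not part of the original data; it is included only to illustrate one practical difference, namely that .str methods pass missing values through while apply(len) raises an error on them.

```python
import pandas as pd

df = pd.DataFrame({'Employee Name': ['Chanel', 'Steve', 'Mitch', 'Bird']})

# Both approaches compute the length of each name.
by_apply = df['Employee Name'].apply(len)
by_str = df['Employee Name'].str.len()
same = by_apply.tolist() == by_str.tolist()
print(same)

# Hypothetical column containing a missing value (assumption, not in the post).
with_missing = pd.Series(['Chanel', None])

# .str.len() leaves the missing entry as a missing value.
lengths = with_missing.str.len()
print(lengths)

# apply(len) calls len() on the missing value itself, which raises TypeError.
try:
    with_missing.apply(len)
    apply_failed = False
except TypeError:
    apply_failed = True
print(apply_failed)
```

For columns that may contain missing values, the .str form is usually the more convenient of the two.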

# Convert every Employee Name to uppercase
# and store the result in a new column, upper_name.

# 1. Using the apply function
df['upper_name'] = df['Employee Name'].apply(str.upper)

# 2. Using the pandas str method
df['upper_name2'] = df['Employee Name'].str.upper()

df

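In particular, str.contains() is used often, so it is worth memorizing.

str.contains() returns True for each value that contains the given pattern and False otherwise. By default, the pattern is treated as a regular expression.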

df['Employee Name'].str.contains('e')
>>> 0     True
    1     True
    2    False
    3    False
    Name: Employee Name, dtype: bool
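
A common use of the boolean result above is to filter rows. The following is a minimal sketch of that pattern, together with the case and regex parameters of str.contains(). The variable names are chosen here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({'Employee ID': [111, 222, 333, 444],
                   'Employee Name': ['Chanel', 'Steve', 'Mitch', 'Bird']})

# Use the boolean Series as a mask to keep only the matching rows.
has_e = df[df['Employee Name'].str.contains('e')]
print(has_e)

# Matching is case-sensitive by default: only 'Mitch' has a lowercase 'c'.
lower_c = df[df['Employee Name'].str.contains('c')]

# case=False ignores case, so 'Chanel' matches as well.
any_c = df[df['Employee Name'].str.contains('c', case=False)]

# The pattern is a regular expression by default, so '.' matches any character.
# regex=False treats it as a literal dot, which none of the names contain.
regex_dot = df['Employee Name'].str.contains('.')
literal_dot = df['Employee Name'].str.contains('.', regex=False)
print(regex_dot.tolist(), literal_dot.tolist())
```

When searching for text that contains special characters such as '.' or '$', passing regex=False avoids unintended regular expression matches.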

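.str[ ] : string indexing
.str.split() : split strings
.str.startswith() : check the starting characters
.str.endswith() : check the ending characters
.str.contains() : check for contained characters
.str.find() : find the position of a substring
.str.replace() : replace characters
.str.extract() : extract substrings
.str.pad() : pad strings
.str.strip() : strip whitespace
.str.lower() : convert to lowercase
.str.upper() : convert to uppercase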
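
The methods listed above can be tried out with a short sketch like the one below. The names Series reuses the employee names from this post, while the raw Series holds hypothetical code strings invented here to demonstrate strip, split, and extract.

```python
import pandas as pd

names = pd.Series(['Chanel', 'Steve', 'Mitch', 'Bird'])

first_two = names.str[:2]              # slice the first two characters
starts_s = names.str.startswith('S')   # True where the name starts with 'S'
ends_h = names.str.endswith('h')       # True where the name ends with 'h'
pos_i = names.str.find('i')            # index of the first 'i', -1 if absent
replaced = names.str.replace('e', 'E', regex=False)   # literal replacement
padded = names.str.pad(6, side='left', fillchar='*')  # left-pad to width 6
lowered = names.str.lower()            # convert to lowercase

# Hypothetical raw values with surrounding spaces (assumption, not in the post).
raw = pd.Series(['  A-101 ', 'B-202'])
stripped = raw.str.strip()               # remove leading and trailing whitespace
parts = stripped.str.split('-')          # split each value into a list on '-'
digits = stripped.str.extract(r'(\d+)')  # capture the first run of digits

print(first_two.tolist())
print(pos_i.tolist())
print(padded.tolist())
print(parts.tolist())
print(digits)
```

Note that .str.extract() with a single capture group returns a DataFrame by default, with the captured text in column 0.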
