pandas


Filtering columns in dataframe that begin with a specific string


I have the following df, and I would like to filter on the column names, keeping only those that begin with a certain string.
This is my current df:
ruta2:
   Current SAN  Prev.1m SAN  Prev.2m SAN  Prev.3m SAN  Current TRE  \
A            5            6            7            6            3
B            6            5            7            6            6
C           12           11           11           11            8
Basically, I would like to filter the dataframe and keep only the columns that begin with Current.
Then the desired output would be:
ruta2:
   Current SAN  Current TRE
A            5            3
B            6            6
C           12            8
To do this I tried the following filter, but it raises a ValueError:
ruta2=ruta2[~(ruta2.columns.str[:4].str.startswith('Prev'))]
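Your mask has one entry per column, so it has to be applied to the column axis with .loc. It seems you only need:
ruta2 = ruta2.loc[:, ~(ruta2.columns.str[:4].str.startswith('Prev'))]
# same as (the [:4] slice is redundant, startswith already checks the prefix):
# ruta2 = ruta2.loc[:, ~ruta2.columns.str.startswith('Prev')]
print(ruta2)
   Current SAN  Current TRE
A            5            3
B            6            6
C           12            8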
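To see why the original attempt fails, here is a minimal sketch. It assumes the frame holds only the five columns visible in the question's printout (the real frame may be wider). A boolean array passed to plain [] is interpreted as a row filter, so a mask with one entry per column does not match the number of rows.

```python
import pandas as pd

# Sample frame rebuilt from the question's printed output (assumed to be complete).
ruta2 = pd.DataFrame(
    {
        "Current SAN": [5, 6, 12],
        "Prev.1m SAN": [6, 5, 11],
        "Prev.2m SAN": [7, 7, 11],
        "Prev.3m SAN": [6, 6, 11],
        "Current TRE": [3, 6, 8],
    },
    index=["A", "B", "C"],
)

# One boolean per column: True for columns that do NOT start with 'Prev'.
keep = ~ruta2.columns.str.startswith("Prev")

# ruta2[keep] treats the mask as a row filter; 5 booleans against 3 rows fails.
try:
    ruta2[keep]
    failed = False
except (ValueError, IndexError) as exc:
    failed = True
    print(type(exc).__name__, exc)

# .loc[:, keep] applies the same mask along the column axis instead.
result = ruta2.loc[:, keep]
print(result)
```

The same mask works once it is placed in the column position of .loc, which is the only change the fix makes.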
Or:
cols = ruta2.columns[~(ruta2.columns.str[:4].str.startswith('Prev'))]
ruta2 = ruta2[cols]
print(ruta2)
   Current SAN  Current TRE
A            5            3
B            6            6
C           12            8
But if you need only the Current columns, use filter (^ anchors the match to the start of the string in a regex):
ruta2 = ruta2.filter(regex='^Current')
print(ruta2)
   Current SAN  Current TRE
A            5            3
B            6            6
C           12            8
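The ^ anchor matters because filter also accepts a like= argument that matches a substring anywhere in the label. The sketch below uses a hypothetical extra column, "Old Current", which is not in the question, purely to show the difference.

```python
import pandas as pd

# Hypothetical frame: the last column contains 'Current' but does not start with it.
df = pd.DataFrame(
    [[5, 6, 3, 0]],
    index=["A"],
    columns=["Current SAN", "Prev.1m SAN", "Current TRE", "Old Current"],
)

anchored = df.filter(regex="^Current")  # label must start with 'Current'
substring = df.filter(like="Current")   # 'Current' may appear anywhere in the label

print(list(anchored.columns))
print(list(substring.columns))
```

With the question's column names both calls would select the same columns, so like= only becomes a problem when the string can also appear mid-label.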
# keep the column names starting with 'Current'
ruta2[[e for e in ruta2.columns if e.startswith('Current')]]
Out[383]:
   Current SAN  Current TRE
A            5            3
B            6            6
C           12            8
Or you can use a boolean mask to select columns:
ruta2.loc[:, ruta2.columns.str.startswith('Current')]
Out[385]:
   Current SAN  Current TRE
A            5            3
B            6            6
C           12            8
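As a quick sanity check, the three "keep Current" approaches from the answers can be compared on the sample data. This is a sketch that assumes the frame has only the five columns shown in the question and that every column label is a string (the list comprehension calls startswith on each label directly).

```python
import pandas as pd

# Sample frame rebuilt from the question's printed output (assumed to be complete).
ruta2 = pd.DataFrame(
    {
        "Current SAN": [5, 6, 12],
        "Prev.1m SAN": [6, 5, 11],
        "Prev.2m SAN": [7, 7, 11],
        "Prev.3m SAN": [6, 6, 11],
        "Current TRE": [3, 6, 8],
    },
    index=["A", "B", "C"],
)

# Regex filter on the column labels.
by_filter = ruta2.filter(regex="^Current")

# Plain Python list comprehension over the labels.
by_list = ruta2[[c for c in ruta2.columns if c.startswith("Current")]]

# Boolean mask applied along the column axis.
by_mask = ruta2.loc[:, ruta2.columns.str.startswith("Current")]

# DataFrame.equals compares labels and values.
same = by_filter.equals(by_list) and by_list.equals(by_mask)
print(same)
```

Which one to use is mostly a matter of taste: filter is the shortest, while the mask form combines easily with other boolean conditions on the labels.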
