pandas


Filtering columns in dataframe that begin with a specific string


I have the following df and would like to filter it by column name, keeping only the columns that begin with a certain string.
This is my current df:
ruta2:
   Current SAN  Prev.1m SAN  Prev.2m SAN  Prev.3m SAN  Current TRE
A            5            6            7            6            3
B            6            5            7            6            6
C           12           11           11           11            8
Basically, I would like to filter the dataframe and keep only the columns that begin with Current.
Then the desired output would be:
ruta2:
   Current SAN  Current TRE
A            5            3
B            6            6
C           12            8
To do this I tried the following filter, but it raises a ValueError:
ruta2=ruta2[~(ruta2.columns.str[:4].str.startswith('Prev'))]
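A boolean mask inside plain brackets is applied to the rows, so a mask built from the column names has to go in the column position of loc. It seems you only need: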
ruta2=ruta2.loc[:, ~(ruta2.columns.str[:4].str.startswith('Prev'))]
#same as
#ruta2=ruta2.loc[:, ~ruta2.columns.str.startswith('Prev')]
print (ruta2)
   Current SAN  Current TRE
A            5            3
B            6            6
C           12            8
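To make the difference concrete, here is a minimal sketch that rebuilds the sample frame from the question (the values are copied from the printed output above) and compares the two indexing forms. The variable names `mask`, `row_filter_failed` and `kept` are only for illustration.

```python
import pandas as pd

# Sample data reconstructed from the question's printed frame.
ruta2 = pd.DataFrame(
    {
        "Current SAN": [5, 6, 12],
        "Prev.1m SAN": [6, 5, 11],
        "Prev.2m SAN": [7, 7, 11],
        "Prev.3m SAN": [6, 6, 11],
        "Current TRE": [3, 6, 8],
    },
    index=["A", "B", "C"],
)

# One boolean per column (5 values), True where the name does not start with 'Prev'.
mask = ~ruta2.columns.str.startswith("Prev")

# Plain brackets treat a boolean mask as a row filter: 5 flags against 3 rows.
try:
    ruta2[mask]
    row_filter_failed = False
except ValueError:
    row_filter_failed = True

# .loc[:, mask] applies the same mask along the column axis instead.
kept = ruta2.loc[:, mask]
print(kept)
```

The mask has five entries while the frame has three rows, which is why the bracket form fails here; with a square frame it would silently filter rows instead, so `.loc[:, mask]` is the safer habit.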
Or select the column labels first and index with them:
cols = ruta2.columns[ ~(ruta2.columns.str[:4].str.startswith('Prev'))]
ruta2=ruta2[cols]
print (ruta2)
   Current SAN  Current TRE
A            5            3
B            6            6
C           12            8
But if you only need the Current columns, use filter (^ matches the start of the string in a regex):
ruta2=ruta2.filter(regex='^Current')
print (ruta2)
   Current SAN  Current TRE
A            5            3
B            6            6
C           12            8
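Two details of `filter` are easy to trip over, so here is a small sketch of both. The columns `PrevX1m SAN` and `Non-Current XYZ` are made up for the demonstration and are not in the original frame. `like=` matches the substring anywhere in the name, whereas `regex='^...'` anchors it at the start; and because the prefix is treated as a regex, characters such as `.` should probably be escaped with `re.escape` when the prefix comes from data.

```python
import re
import pandas as pd

# Hypothetical one-row frame; two of the column names are invented for the demo.
df = pd.DataFrame(
    [[5, 6, 7, 3, 9]],
    columns=["Current SAN", "Prev.1m SAN", "PrevX1m SAN", "Current TRE", "Non-Current XYZ"],
)

anchored = df.filter(regex="^Current")   # name must start with 'Current'
anywhere = df.filter(like="Current")     # 'Current' may appear anywhere in the name

# '.' is a regex wildcard, so '^Prev.1' also matches 'PrevX1m SAN'.
unescaped = df.filter(regex="^Prev.1")
escaped = df.filter(regex="^" + re.escape("Prev.1"))

print(list(anchored.columns))
print(list(anywhere.columns))
print(list(unescaped.columns))
print(list(escaped.columns))
```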
# keep the columns whose names start with 'Current'
ruta2[[e for e in ruta2.columns if e.startswith('Current')]]
Out[383]:
   Current SAN  Current TRE
A            5            3
B            6            6
C           12            8
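If you need this more than once, the list comprehension can be wrapped in a small helper. This is just a sketch: `select_by_prefix` is a hypothetical name, not a pandas function, and the integer column label `2024` is invented to show why the `isinstance` check is there. Python's own `str.startswith` accepts a tuple, so several prefixes can be tested in one call.

```python
import pandas as pd

def select_by_prefix(frame, prefixes):
    """Return the columns of `frame` whose string names start with any of `prefixes`."""
    prefixes = tuple(prefixes)  # str.startswith accepts a tuple of candidates
    cols = [
        c for c in frame.columns
        # Skip non-string labels; calling .startswith on an int would raise AttributeError.
        if isinstance(c, str) and c.startswith(prefixes)
    ]
    return frame[cols]

# Hypothetical frame with one non-string column label.
ruta2 = pd.DataFrame(
    [[5, 6, 3, 0], [6, 5, 6, 1]],
    index=["A", "B"],
    columns=["Current SAN", "Prev.1m SAN", "Current TRE", 2024],
)

current_only = select_by_prefix(ruta2, ["Current"])
current_and_prev = select_by_prefix(ruta2, ["Current", "Prev.1m"])
no_match = select_by_prefix(ruta2, ["Zzz"])
print(current_only)
```

Column order follows the original frame, and a prefix that matches nothing gives back an empty frame with the same index rather than an error.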
Or you can use a mask array to filter columns:
ruta2.loc[:,ruta2.columns.str.startswith('Current')]
Out[385]:
   Current SAN  Current TRE
A            5            3
B            6            6
C           12            8
