pandas


Filtering columns in dataframe that begin with a specific string


I have the following DataFrame and would like to filter its columns, keeping only those whose names begin with a certain string.
This is my current df:
ruta2:
   Current SAN  Prev.1m SAN  Prev.2m SAN  Prev.3m SAN  Current TRE
A            5            6            7            6            3
B            6            5            7            6            6
C           12           11           11           11            8
Basically, I would like to filter the DataFrame and keep only the columns that begin with Current.
The desired output would be:
ruta2:
   Current SAN  Current TRE
A            5            3
B            6            6
C           12            8
To do this I tried the following filter, but it raises a ValueError:
ruta2=ruta2[~(ruta2.columns.str[:4].str.startswith('Prev'))]
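A boolean mask passed to plain [] filters rows, not columns, which is why a mask built from the column names raises a ValueError. It seems you only need to select along the column axis with .loc: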
ruta2=ruta2.loc[:, ~(ruta2.columns.str[:4].str.startswith('Prev'))]
#same as
#ruta2=ruta2.loc[:, ~ruta2.columns.str.startswith('Prev')]
print (ruta2)
   Current SAN  Current TRE
A            5            3
B            6            6
C           12            8
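To make the difference between the two indexing forms concrete, here is a minimal, self-contained sketch. The frame is rebuilt by hand from the values shown in the question (the construction itself is an assumption, since the original post does not show how ruta2 was created), and the names mask, raised and kept are only illustrative.

```python
import pandas as pd

# Sample frame rebuilt by hand from the values shown in the question.
ruta2 = pd.DataFrame(
    {
        "Current SAN": [5, 6, 12],
        "Prev.1m SAN": [6, 5, 11],
        "Prev.2m SAN": [7, 7, 11],
        "Prev.3m SAN": [6, 6, 11],
        "Current TRE": [3, 6, 8],
    },
    index=["A", "B", "C"],
)

# One boolean per column: True where the name does NOT start with 'Prev'.
mask = ~ruta2.columns.str.startswith("Prev")

# Plain [] treats a boolean array as a row filter, so a 5-item mask
# cannot be applied to a 3-row frame.
try:
    ruta2[mask]
    raised = False
except ValueError:
    raised = True

# .loc[:, mask] applies the same mask along the column axis instead.
kept = ruta2.loc[:, mask]
print(raised)
print(kept)
```

The mask has one entry per column, so it only lines up with the frame when it is used in the column position of .loc.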
Or:
cols = ruta2.columns[ ~(ruta2.columns.str[:4].str.startswith('Prev'))]
ruta2=ruta2[cols]
print (ruta2)
   Current SAN  Current TRE
A            5            3
B            6            6
C           12            8
But if you need only the Current columns, use filter (^ means start of string in regex):
ruta2=ruta2.filter(regex='^Current')
print (ruta2)
   Current SAN  Current TRE
A            5            3
B            6            6
C           12            8
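The ^ anchor matters because filter can also match a substring anywhere in the name. The sketch below contrasts the two; the column "Not Current" is a hypothetical name added only to show the difference and is not part of the original data.

```python
import pandas as pd

# Small frame; "Not Current" is a made-up column used for illustration.
df = pd.DataFrame(
    [[5, 6, 3, 9]],
    columns=["Current SAN", "Prev.1m SAN", "Current TRE", "Not Current"],
)

# regex='^Current' keeps only names that START with 'Current'.
anchored = df.filter(regex="^Current")

# like='Current' keeps any name that CONTAINS 'Current'.
anywhere = df.filter(like="Current")

print(list(anchored.columns))
print(list(anywhere.columns))
```

If the prefix itself contains regex metacharacters (for example the dot in Prev.1m), passing it through re.escape before building the pattern is a reasonable precaution.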
#filter the column names starting with 'Current'
ruta2[[e for e in ruta2.columns if e.startswith('Current')]]
Out[383]:
   Current SAN  Current TRE
A            5            3
B            6            6
C           12            8
Or you can use a boolean mask to filter columns:
ruta2.loc[:,ruta2.columns.str.startswith('Current')]
Out[385]:
   Current SAN  Current TRE
A            5            3
B            6            6
C           12            8
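If this selection is needed in several places, the list-comprehension approach can be wrapped in a small helper. This is only a sketch: the function name columns_with_prefix and the integer column label 2020 are hypothetical, and the isinstance check is an added assumption so that non-string labels are skipped rather than raising an AttributeError.

```python
import pandas as pd


def columns_with_prefix(df, prefix):
    """Return a frame with only the columns whose label starts with prefix."""
    # Skip labels that are not strings, since they have no startswith method.
    keep = [c for c in df.columns if isinstance(c, str) and c.startswith(prefix)]
    return df[keep]


# Example frame; the integer label 2020 is made up to exercise the guard.
df = pd.DataFrame(
    [[5, 6, 3, 1]],
    columns=["Current SAN", "Prev.1m SAN", "Current TRE", 2020],
)

current_only = columns_with_prefix(df, "Current")
prev_only = columns_with_prefix(df, "Prev")
print(list(current_only.columns))
print(list(prev_only.columns))
```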
