pandas


Filtering columns in dataframe that begin with a specific string


I have the following df and would like to filter its columns, keeping only those whose names begin with a certain string.
This is my current df:
ruta2:
   Current SAN  Prev.1m SAN  Prev.2m SAN  Prev.3m SAN  Current TRE
A            5            6            7            6            3
B            6            5            7            6            6
C           12           11           11           11            8
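For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the frame. It assumes A, B and C are the index labels and that the values are plain integers; neither is stated explicitly in the printout.

```python
import pandas as pd

# Rebuild the example frame; the values are copied from the printed table.
# Assumption: A/B/C are index labels and all cells are integers.
ruta2 = pd.DataFrame(
    {
        "Current SAN": [5, 6, 12],
        "Prev.1m SAN": [6, 5, 11],
        "Prev.2m SAN": [7, 7, 11],
        "Prev.3m SAN": [6, 6, 11],
        "Current TRE": [3, 6, 8],
    },
    index=["A", "B", "C"],
)

print(ruta2.columns.tolist())
```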
Basically, I would like to filter the dataframe and keep only the columns whose names begin with Current.
Then the desired output would be:
ruta2:
   Current SAN  Current TRE
A            5            3
B            6            6
C           12            8
To do this I tried the following filter, but it raises a ValueError:
ruta2=ruta2[~(ruta2.columns.str[:4].str.startswith('Prev'))]
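It seems you only need .loc with the mask in the column position. Plain ruta2[mask] applies a boolean mask to the rows, and this mask has one entry per column (5) rather than one per row (3), hence the ValueError:
ruta2 = ruta2.loc[:, ~(ruta2.columns.str[:4].str.startswith('Prev'))]
# same as (the [:4] slice is redundant, startswith already checks the prefix)
# ruta2 = ruta2.loc[:, ~ruta2.columns.str.startswith('Prev')]
print(ruta2)
   Current SAN  Current TRE
A            5            3
B            6            6
C           12            8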
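The difference between indexing with the mask directly and passing it to .loc can be checked with a short sketch. The frame is rebuilt from the question's printout (A, B, C assumed to be index labels), and the names mask, row_error and kept are only illustrative.

```python
import pandas as pd

ruta2 = pd.DataFrame(
    [[5, 6, 7, 6, 3], [6, 5, 7, 6, 6], [12, 11, 11, 11, 8]],
    index=["A", "B", "C"],
    columns=["Current SAN", "Prev.1m SAN", "Prev.2m SAN", "Prev.3m SAN", "Current TRE"],
)

# One boolean per column: True where the name does NOT start with 'Prev'.
mask = ~ruta2.columns.str.startswith("Prev")

# ruta2[mask] treats the mask as a row filter; 5 booleans against 3 rows fails.
try:
    ruta2[mask]
    row_error = None
except ValueError as exc:
    row_error = exc

# Putting the mask in the column position of .loc selects columns instead.
kept = ruta2.loc[:, mask]

print(type(row_error).__name__)
print(kept.columns.tolist())
```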
Or select the column labels first, then index with them:
cols = ruta2.columns[~(ruta2.columns.str[:4].str.startswith('Prev'))]
ruta2 = ruta2[cols]
print(ruta2)
   Current SAN  Current TRE
A            5            3
B            6            6
C           12            8
But if you need only the Current columns, use filter (^ matches the start of the string in a regex):
ruta2 = ruta2.filter(regex='^Current')
print(ruta2)
   Current SAN  Current TRE
A            5            3
B            6            6
C           12            8
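To see why the ^ anchor matters, here is a small sketch comparing filter(regex=...) with filter(like=...). The frame and the 'Not Current' column are made up for illustration and are not part of the original data.

```python
import pandas as pd

# Hypothetical one-row frame; 'Not Current' is invented to show the difference.
df = pd.DataFrame(
    [[5, 6, 3, 0]],
    columns=["Current SAN", "Prev.1m SAN", "Current TRE", "Not Current"],
)

# regex='^Current' anchors the match at the start of the column name.
anchored = df.filter(regex="^Current")

# like='Current' keeps any column whose name contains the substring anywhere.
contains = df.filter(like="Current")

print(anchored.columns.tolist())
print(contains.columns.tolist())
```

Without the anchor, a name such as 'Not Current' would also be kept, so the regex form is the safer choice when the requirement is "begins with".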
# keep the columns whose names start with 'Current'
ruta2[[e for e in ruta2.columns if e.startswith('Current')]]
Out[383]:
   Current SAN  Current TRE
A            5            3
B            6            6
C           12            8
Or you can use a boolean mask to filter the columns:
ruta2.loc[:, ruta2.columns.str.startswith('Current')]
Out[385]:
   Current SAN  Current TRE
A            5            3
B            6            6
C           12            8
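If this is needed in several places, the list-comprehension approach could be wrapped in a small helper. This is only a sketch: columns_starting_with is a hypothetical name, and the isinstance check is an added assumption so that non-string labels (an integer column here) are skipped instead of raising AttributeError.

```python
import pandas as pd


def columns_starting_with(df, prefix):
    """Return the columns of df whose string labels begin with prefix (hypothetical helper)."""
    # Non-string labels (e.g. integers) are skipped rather than raising AttributeError.
    keep = [c for c in df.columns if isinstance(c, str) and c.startswith(prefix)]
    return df[keep]


# Hypothetical frame with one integer column label to exercise the isinstance check.
df = pd.DataFrame(
    [[5, 6, 3, 9]],
    columns=["Current SAN", "Prev.1m SAN", "Current TRE", 2017],
)

result = columns_starting_with(df, "Current")
print(result.columns.tolist())
```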
