pandas


KeyError after renaming a column


I have this DataFrame:
import pandas as pd

df = pd.DataFrame({'a':[7,8,9],
                   'b':[1,3,5],
                   'c':[5,3,6]})
print (df)
   a  b  c
0  7  1  5
1  8  3  3
2  9  5  6
Then I rename the first column label like this:
df.columns.values[0] = 'f'
Everything seems fine:
print (df)
   f  b  c
0  7  1  5
1  8  3  3
2  9  5  6
print (df.columns)
Index(['f', 'b', 'c'], dtype='object')
print (df.columns.values)
['f' 'b' 'c']
Selecting b works as expected:
print (df['b'])
0    1
1    3
2    5
Name: b, dtype: int64
But selecting a still works, and it returns the column now named f:
print (df['a'])
0    7
1    8
2    9
Name: f, dtype: int64
And selecting f raises a KeyError:
print (df['f'])
#KeyError: 'f'
print (df.info())
#KeyError: 'f'
What is the problem? Can somebody explain it? Or is it a bug?
You aren't expected to alter the values attribute.
Try df.columns.values = ['a', 'b', 'c'] and you get:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-61-e7e440adc404> in <module>()
----> 1 df.columns.values = ['a', 'b', 'c']
AttributeError: can't set attribute
That's because values is exposed as a read-only attribute, so pandas stops you when you try to assign to it.
However, it can't stop you from mutating the underlying values array in place.
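A minimal sketch of the first half of that point: assigning to `values` is rejected with an AttributeError, and the labels are left untouched. The wording of the error message depends on the Python and pandas versions, so the sketch only checks the exception type.

```python
import pandas as pd

df = pd.DataFrame({'a': [7, 8, 9],
                   'b': [1, 3, 5],
                   'c': [5, 3, 6]})

# Rebinding the attribute is rejected because `values` has no setter.
try:
    df.columns.values = ['x', 'y', 'z']
    assignment_blocked = False
except AttributeError:
    assignment_blocked = True

print(assignment_blocked)
print(list(df.columns))  # the failed assignment did not change the labels
```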
When you use rename, pandas follows up with a bunch of cleanup work. I've pasted the source below.
Ultimately, what you've done is alter the values without triggering that cleanup. You can trigger it yourself with a follow-up call to _data.rename_axis (an example can be seen in the source below). This forces the cleanup to run, after which you can access df['f']. Note that _data is a private attribute, so this may not work across pandas versions.
df._data = df._data.rename_axis(lambda x: x, 0, True)
df['f']
0    7
1    8
2    9
Name: f, dtype: int64
Moral of the story: probably not a great idea to rename a column this way.
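For contrast, here is a short sketch of two supported ways to get the same result through the public API: `DataFrame.rename` with a `columns` mapping, and assigning a full list of labels to `df.columns`. The variable names `renamed` and `relabeled` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [7, 8, 9],
                   'b': [1, 3, 5],
                   'c': [5, 3, 6]})

# Option 1: rename returns a new DataFrame with the mapped labels.
renamed = df.rename(columns={'a': 'f'})

# Option 2: assign a complete new list of labels to df.columns.
relabeled = df.copy()
relabeled.columns = ['f', 'b', 'c']

print(renamed['f'].tolist())
print(relabeled['f'].tolist())
```

Both routes go through pandas itself rather than around it, so label lookups such as `renamed['f']` work afterwards.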
But this story gets weirder.
This is fine:
df = pd.DataFrame({'a':[7,8,9],
                   'b':[1,3,5],
                   'c':[5,3,6]})
df.columns.values[0] = 'f'
df['f']
0    7
1    8
2    9
Name: f, dtype: int64
This is not fine:
df = pd.DataFrame({'a':[7,8,9],
                   'b':[1,3,5],
                   'c':[5,3,6]})
print(df)
df.columns.values[0] = 'f'
df['f']
KeyError:
It turns out that if we modify the values array before displaying df, pandas apparently runs all of its initialization on the first display and picks up the new label. If df is displayed before the values array is changed, the later lookup of 'f' fails.
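One way to picture the "fine" and "not fine" cases is a label lookup that is built lazily and then cached. The sketch below is a toy model under that assumption, not pandas code: `ToyIndex`, `labels`, and `_lookup` are hypothetical names invented for illustration.

```python
class ToyIndex:
    """Hypothetical stand-in for an index with a lazily built label lookup."""

    def __init__(self, labels):
        self.labels = list(labels)  # raw label storage
        self._lookup = None         # label -> position, built on first use

    def get_loc(self, label):
        if self._lookup is None:
            # Built once, from whatever labels exist at that moment.
            self._lookup = {lab: pos for pos, lab in enumerate(self.labels)}
        return self._lookup[label]  # raises KeyError for unknown labels


# Case 1: the lookup is built first, then the labels are edited in place.
stale = ToyIndex(['a', 'b', 'c'])
stale.get_loc('a')        # first use builds and caches the lookup
stale.labels[0] = 'f'     # bypasses the cache, like editing the raw array
try:
    stale.get_loc('f')
    new_label_found = True
except KeyError:
    new_label_found = False
old_label_position = stale.get_loc('a')  # the old label still resolves

# Case 2: the labels are edited before the lookup is ever built.
fresh = ToyIndex(['a', 'b', 'c'])
fresh.labels[0] = 'f'
fresh_position = fresh.get_loc('f')

print(new_label_found, old_label_position, fresh_position)
```

In this toy model the stale case matches the question: the old label still resolves to the first position and the new label raises KeyError, while editing before the first lookup works.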
Weirder still:
df = pd.DataFrame({'a':[7,8,9],
                   'b':[1,3,5],
                   'c':[5,3,6]})
print(df)
df.columns.values[0] = 'f'
df['f'] = 1
df['f']
   f  f
0  7  1
1  8  1
2  9  1
As if we didn't already know that this was a bad idea...
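The two-column result above is what selection by label looks like when a frame ends up with duplicate column labels. A small sketch that builds such a frame directly, without the in-place edit, and checks for it with `Index.is_unique` (the name `dup` is only illustrative):

```python
import pandas as pd

# Build a frame whose two columns deliberately share the label 'f'.
dup = pd.DataFrame([[7, 1], [8, 1], [9, 1]], columns=['f', 'f'])

selection = dup['f']                       # every column labelled 'f' comes back
has_unique_labels = dup.columns.is_unique  # quick check for duplicate labels

print(type(selection).__name__)
print(selection.shape)
print(has_unique_labels)
```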
Source for rename:
def rename(self, *args, **kwargs):
    axes, kwargs = self._construct_axes_from_arguments(args, kwargs)
    copy = kwargs.pop('copy', True)
    inplace = kwargs.pop('inplace', False)
    if kwargs:
        raise TypeError('rename() got an unexpected keyword '
                        'argument "{0}"'.format(list(kwargs.keys())[0]))
    if com._count_not_none(*axes.values()) == 0:
        raise TypeError('must pass an index to rename')
    # renamer function if passed a dict
    def _get_rename_function(mapper):
        if isinstance(mapper, (dict, ABCSeries)):
            def f(x):
                if x in mapper:
                    return mapper[x]
                else:
                    return x
        else:
            f = mapper
        return f
    self._consolidate_inplace()
    result = self if inplace else self.copy(deep=copy)
    # start in the axis order to eliminate too many copies
    for axis in lrange(self._AXIS_LEN):
        v = axes.get(self._AXIS_NAMES[axis])
        if v is None:
            continue
        f = _get_rename_function(v)
        baxis = self._get_block_manager_axis(axis)
        result._data = result._data.rename_axis(f, axis=baxis, copy=copy)
        result._clear_item_cache()
    if inplace:
        self._update_inplace(result._data)
    else:
        return result.__finalize__(self)
