pandas


KeyError after renaming a column


I have a DataFrame:
import pandas as pd

df = pd.DataFrame({'a':[7,8,9],
                   'b':[1,3,5],
                   'c':[5,3,6]})
print (df)
   a  b  c
0  7  1  5
1  8  3  3
2  9  5  6
Then I rename the first column like this:
df.columns.values[0] = 'f'
Everything looks fine:
print (df)
   f  b  c
0  7  1  5
1  8  3  3
2  9  5  6
print (df.columns)
Index(['f', 'b', 'c'], dtype='object')
print (df.columns.values)
['f' 'b' 'c']
Selecting b works fine:
print (df['b'])
0    1
1    3
2    5
Name: b, dtype: int64
But selecting a returns column f:
print (df['a'])
0    7
1    8
2    9
Name: f, dtype: int64
And selecting f raises a KeyError:
print (df['f'])
#KeyError: 'f'
print (df.info())
#KeyError: 'f'
What is the problem? Can somebody explain it, or is it a bug?
You aren't expected to alter the values attribute.
Try df.columns.values = ['a', 'b', 'c'] and you get:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-61-e7e440adc404> in <module>()
----> 1 df.columns.values = ['a', 'b', 'c']
AttributeError: can't set attribute
That's because pandas detects that you are trying to set the attribute and stops you.
However, it can't stop you from changing the underlying values object itself.
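To illustrate the point about pandas stopping you, here is a minimal sketch of the guarded path: assigning into the Index itself (rather than into its values array) is rejected. The variable name assignment_error is just for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [7, 8, 9], 'b': [1, 3, 5], 'c': [5, 3, 6]})

# An Index is meant to be immutable, so item assignment on it is refused.
try:
    df.columns[0] = 'f'
    assignment_error = None
except TypeError as exc:
    assignment_error = exc

print(type(assignment_error).__name__)
print(list(df.columns))  # the labels are unchanged
```

The in-place edit from the question slips past this check because it writes to the array behind the Index, not to the Index.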
When you use rename, pandas follows up with some cleanup (it rebuilds the axis and clears the item cache). I've pasted the source below.
Ultimately, what you've done is alter the values without initiating that cleanup. You can initiate it yourself with a follow-up call to _data.rename_axis (an example can be seen in the source below). This forces the cleanup to run, and then you can access ['f']. Note that _data is a private attribute, so this may not work across pandas versions.
df._data = df._data.rename_axis(lambda x: x, 0, True)
df['f']
0    7
1    8
2    9
Name: f, dtype: int64
Moral of the story: probably not a great idea to rename a column this way.
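For comparison, a short sketch of the supported route, using the public DataFrame.rename with a columns mapping. The name renamed is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [7, 8, 9], 'b': [1, 3, 5], 'c': [5, 3, 6]})

# rename returns a new DataFrame by default and leaves df untouched.
renamed = df.rename(columns={'a': 'f'})

print(list(renamed.columns))
print(renamed['f'].tolist())
```

Because rename goes through the cleanup described above, looking up 'f' afterwards behaves as expected.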
But this story gets weirder.
This is fine:
df = pd.DataFrame({'a':[7,8,9],
                   'b':[1,3,5],
                   'c':[5,3,6]})
df.columns.values[0] = 'f'
df['f']
0    7
1    8
2    9
Name: f, dtype: int64
This is not fine:
df = pd.DataFrame({'a':[7,8,9],
                   'b':[1,3,5],
                   'c':[5,3,6]})
print(df)
df.columns.values[0] = 'f'
df['f']
KeyError:
It turns out that we can modify the values attribute before displaying df, and pandas apparently runs all the initialization on the first display. If you display df before changing the values attribute, the lookup raises a KeyError.
Weirder still:
df = pd.DataFrame({'a':[7,8,9],
                   'b':[1,3,5],
                   'c':[5,3,6]})
print(df)
df.columns.values[0] = 'f'
df['f'] = 1
df['f']
   f  f
0  7  1
1  8  1
2  9  1
As if we didn't already know that this was a bad idea...
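If the goal is to rename a column by position, one way to avoid all of this is to build a new list of labels and assign it to df.columns. This is a sketch rather than the only option, and new_columns is just an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({'a': [7, 8, 9], 'b': [1, 3, 5], 'c': [5, 3, 6]})
print(df)  # displaying df first makes no difference with this approach

# Copy the labels into a plain list, change one, and assign the list back.
new_columns = df.columns.tolist()
new_columns[0] = 'f'
df.columns = new_columns

print(df['f'].tolist())
```

Assigning to df.columns replaces the whole Index, so pandas never ends up with labels that disagree with what it has cached.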
Source for rename:
def rename(self, *args, **kwargs):
    axes, kwargs = self._construct_axes_from_arguments(args, kwargs)
    copy = kwargs.pop('copy', True)
    inplace = kwargs.pop('inplace', False)
    if kwargs:
        raise TypeError('rename() got an unexpected keyword '
                        'argument "{0}"'.format(list(kwargs.keys())[0]))
    if com._count_not_none(*axes.values()) == 0:
        raise TypeError('must pass an index to rename')
    # renamer function if passed a dict
    def _get_rename_function(mapper):
        if isinstance(mapper, (dict, ABCSeries)):
            def f(x):
                if x in mapper:
                    return mapper[x]
                else:
                    return x
        else:
            f = mapper
        return f
    self._consolidate_inplace()
    result = self if inplace else self.copy(deep=copy)
    # start in the axis order to eliminate too many copies
    for axis in lrange(self._AXIS_LEN):
        v = axes.get(self._AXIS_NAMES[axis])
        if v is None:
            continue
        f = _get_rename_function(v)
        baxis = self._get_block_manager_axis(axis)
        result._data = result._data.rename_axis(f, axis=baxis, copy=copy)
        result._clear_item_cache()
    if inplace:
        self._update_inplace(result._data)
    else:
        return result.__finalize__(self)
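The _get_rename_function helper in the source above turns a dict into a function that maps each label and passes unknown labels through unchanged. Seen from the public API, that suggests a dict mapper and an equivalent callable should give the same result. A small sketch (by_dict and by_func are illustrative names):

```python
import pandas as pd

df = pd.DataFrame({'a': [7, 8, 9], 'b': [1, 3, 5], 'c': [5, 3, 6]})

# A dict mapper: labels missing from the dict are left as they are.
by_dict = df.rename(columns={'a': 'f'})

# A callable mapper that spells out the same rule by hand.
by_func = df.rename(columns=lambda name: 'f' if name == 'a' else name)

print(list(by_dict.columns))
print(list(by_func.columns))
```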
