pandas


VLOOKUP equivalent function to look up value in pandas DataFrame


I have a pandas dataframe with the following structure:
DF_Cell, DF_Site
C1,A
C2,A
C3,B
C4,B
C5,B
I also have a very long loop (100 million iterations) in which I process, one by one, strings that correspond to the "DF_Cell" column in the DataFrame (the first iteration produces C1, the second produces C2, and so on).
I would like to look up in the DataFrame the DF_Site corresponding to the cell (DF_Cell) being processed in the loop.
One way I could think of was to put the current cell in a one-row DataFrame and then do a left merge on it, but this is far too inefficient for data this large.
Is there a better way?
Perhaps you want to set DF_Cell as the index*:
In [11]: df = pd.read_csv('foo.csv', index_col='DF_Cell')
# or df.set_index('DF_Cell', inplace=True)
In [12]: df
Out[12]:
        DF_Site
DF_Cell
C1            A
C2            A
C3            B
C4            B
C5            B
You can then refer to the row, or specific entry, using loc:
In [13]: df.loc['C1']
Out[13]:
DF_Site    A
Name: C1, dtype: object
In [14]: df.loc['C1', 'DF_Site']
Out[14]: 'A'
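*Assuming the file has only two columns, you could also pass squeeze=True to read_csv to get a Series instead of a one-column DataFrame. (That parameter was deprecated in pandas 1.4 and removed in 2.0; on newer versions, call .squeeze("columns") on the result instead.)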
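For reference, here is a self-contained sketch of the index-based lookup. It assumes the sample data from the question and reads it from an in-memory string rather than 'foo.csv'; the variable names are illustrative only. It also shows .at, which pandas provides for single-value access by label and which may be a better fit than .loc inside a tight loop.

```python
import io

import pandas as pd

# Sample data from the question, held in memory instead of 'foo.csv'.
csv_text = """DF_Cell,DF_Site
C1,A
C2,A
C3,B
C4,B
C5,B
"""

# Use DF_Cell as the index so rows can be addressed by cell name.
df = pd.read_csv(io.StringIO(csv_text), index_col="DF_Cell")

# Label-based lookup of a single entry with .loc (row label, column label).
site_loc = df.loc["C1", "DF_Site"]

# .at is the scalar-only accessor; it takes the same labels as .loc.
site_at = df.at["C3", "DF_Site"]

# Selecting the single column gives a Series indexed by DF_Cell.
sites = df["DF_Site"]
site_series = sites["C5"]
```

Note that a lookup for a label that is not in the index raises a KeyError, so the loop would need to handle unknown cells explicitly.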
I don't really understand what you mean in your first paragraph, but for looking up a field value by the corresponding key in a different column, I agree that Alexis' example is the most idiomatic and efficient way to do it in pandas. However, if this is really representative of your data structure, you can just use a dict.
data = {'a': 1, 'b': 2, 'c': 3}
data['a']
# 1
list(map(lambda y: data[y] + 1, ['c', 'b', 'a']))
# [4, 3, 2]
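Applied to the question's data, the dict approach might look like the sketch below. It assumes DF_Cell values are unique; the names site_by_cell and lookup_site, and the generator standing in for the 100-million-iteration loop, are hypothetical.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "DF_Cell": ["C1", "C2", "C3", "C4", "C5"],
    "DF_Site": ["A", "A", "B", "B", "B"],
})

# Build the mapping once, outside the loop (assumes DF_Cell is unique;
# with duplicates, the last occurrence would win).
site_by_cell = dict(zip(df["DF_Cell"], df["DF_Site"]))


def lookup_site(cell, default=None):
    """Return the site for a cell, or `default` if the cell is unknown."""
    return site_by_cell.get(cell, default)


# Stand-in for the long loop: each iteration produces one cell name.
cells = ("C%d" % i for i in range(1, 6))
sites = [lookup_site(cell) for cell in cells]
```

Building the dict is a one-time cost; each lookup inside the loop is then a plain hash-table access with no pandas overhead.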
