### pandas

#### VLOOKUP equivalent function to look up value in pandas DataFrame

I have a pandas DataFrame with the following structure:

    DF_Cell, DF_Site
    C1,A
    C2,A
    C3,B
    C4,B
    C5,B

I also have a very long loop (100 million iterations) that processes, one at a time, strings corresponding to the "DF_Cell" column of the DataFrame (the first iteration produces C1, the second produces C2, and so on). For each cell processed in the loop, I would like to look up the corresponding DF_Site in the DataFrame. One approach I thought of was to put the cell in a one-cell DataFrame and do a left merge on it, but that is far too inefficient for data this large. Is there a better way?

Perhaps you want to set DF_Cell as the index*:

    In [11]: df = pd.read_csv('foo.csv', index_col='DF_Cell')
    # or df.set_index('DF_Cell', inplace=True)

    In [12]: df
    Out[12]:
            DF_Site
    DF_Cell
    C1            A
    C2            A
    C3            B
    C4            B
    C5            B

You can then refer to the row, or a specific entry, using loc:

    In [13]: df.loc['C1']
    Out[13]:
    DF_Site    A
    Name: C1, dtype: object

    In [14]: df.loc['C1', 'DF_Site']
    Out[14]: 'A'

*Assuming this has two columns, you could use squeeze=True to get a Series instead. (The squeeze argument of read_csv was removed in pandas 2.0; there, call .squeeze("columns") on the result instead.)
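For a self-contained version of the index-based lookup, here is a minimal sketch. It assumes the sample data from the question is available as CSV text (the `csv_text` variable stands in for `foo.csv`), and the helper name `lookup_site` is hypothetical. It uses `Series.get` so that a cell missing from the index returns a default instead of raising `KeyError`.

```python
import io

import pandas as pd

# Stand-in for foo.csv, using the sample rows from the question.
csv_text = """DF_Cell,DF_Site
C1,A
C2,A
C3,B
C4,B
C5,B
"""

# Use DF_Cell as the index so lookups go through the index, not a scan or merge.
df = pd.read_csv(io.StringIO(csv_text), index_col="DF_Cell")

# A Series indexed by DF_Cell; each value is the matching DF_Site.
site_by_cell = df["DF_Site"]


def lookup_site(cell, default=None):
    """Return the DF_Site for `cell`, or `default` if the cell is not in the index."""
    return site_by_cell.get(cell, default)


print(lookup_site("C1"))
print(lookup_site("C9"))  # not in the sample data, so the default is returned
```

This assumes the DF_Cell values are unique; with duplicate index labels, a label lookup returns several rows rather than a single value.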

I don't really understand what you mean in your first paragraph, but for looking up a field value by the corresponding key in a different column, I agree that Alexis' example is the most idiomatic and efficient way to do it in pandas. However, if this is really representative of your data structure, you can just use a dict:

    data = {'a': 1, 'b': 2, 'c': 3}
    data['a']  # 1
    list(map(lambda y: data[y] + 1, ['c', 'b', 'a']))  # [4, 3, 2]
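To connect the dict suggestion back to the DataFrame in the question, one possible sketch is to build the dict once from the two columns and then use plain dict lookups inside the long loop. The generator `cells` below is a hypothetical stand-in for the loop that produces cell names, and `site_lookup` is just an illustrative name.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame(
    {
        "DF_Cell": ["C1", "C2", "C3", "C4", "C5"],
        "DF_Site": ["A", "A", "B", "B", "B"],
    }
)

# Build the mapping once, outside the loop: DF_Cell -> DF_Site.
site_lookup = dict(zip(df["DF_Cell"], df["DF_Site"]))


def cells():
    """Hypothetical stand-in for the loop that generates cell names one by one."""
    for i in range(1, 7):
        yield f"C{i}"


# Inside the loop, each lookup is a plain dict access; .get avoids KeyError
# for cells that are not in the DataFrame (C6 here).
sites = [site_lookup.get(cell) for cell in cells()]
print(sites)
```

If the cell names can be collected up front instead of being handled one per iteration, `pd.Series(names).map(site_lookup)` performs the same lookup in a single vectorized call.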
