### pandas

#### VLOOKUP equivalent function to look up value in pandas DataFrame

I have a pandas DataFrame with the following structure:

    DF_Cell,DF_Site
    C1,A
    C2,A
    C3,B
    C4,B
    C5,B

I also have a very long loop (100 million iterations) that processes, one at a time, strings corresponding to the "DF_Cell" column of the DataFrame (the first iteration produces C1, the second produces C2, and so on).

For each cell (DF_Cell) handled in the loop, I would like to look up the corresponding DF_Site in the DataFrame.

One approach I thought of was to put the current cell into a one-cell DataFrame and then do a left merge on it, but that is far too inefficient for data this large.

Is there a better way?

Perhaps you want to set DF_Cell as the index*:

    In [11]: df = pd.read_csv('foo.csv', index_col='DF_Cell')
    # or df.set_index('DF_Cell', inplace=True)

    In [12]: df
    Out[12]:
            DF_Site
    DF_Cell
    C1            A
    C2            A
    C3            B
    C4            B
    C5            B

You can then refer to the row, or to a specific entry, using loc:

    In [13]: df.loc['C1']
    Out[13]:
    DF_Site    A
    Name: C1, dtype: object

    In [14]: df.loc['C1', 'DF_Site']
    Out[14]: 'A'

*Assuming this has two columns, you could use squeeze=True to get a Series instead of a one-column DataFrame. (The `squeeze` argument of `read_csv` was removed in pandas 2.0; in newer versions, call `.squeeze("columns")` on the result instead.)
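The index-based lookup above can be tried without a file on disk. The following is only a minimal sketch: it assumes the five rows from the question, reads them from an in-memory string instead of `foo.csv`, and uses `.at` as a scalar-only alternative to `.loc` (that choice is an addition here, not part of the answer above).

```python
import io

import pandas as pd

# Sample data taken from the question; the in-memory CSV stands in for 'foo.csv'.
csv_text = """DF_Cell,DF_Site
C1,A
C2,A
C3,B
C4,B
C5,B
"""

# Use DF_Cell as the index so rows can be addressed by cell name.
df = pd.read_csv(io.StringIO(csv_text), index_col="DF_Cell")

# Label-based lookup of a single entry with .loc (row label, column label).
site = df.loc["C1", "DF_Site"]

# .at accepts only a single row/column label pair and returns a scalar.
site_at = df.at["C3", "DF_Site"]

print(site, site_at)
```

Both accessors expect the cell name to exist in the index; a missing label raises a `KeyError`.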

I don't really understand what you mean in your first paragraph, but for looking up a field value by reference to the corresponding value in a different column, I agree that Alexis' example is the most idiomatic and efficient way to do it in pandas.

However, if this is really representative of your data structure, you can just use a dict:

    data = {'a': 1, 'b': 2, 'c': 3}
    data['a']  # 1
    list(map(lambda y: data[y] + 1, ['c', 'b', 'a']))  # [4, 3, 2]
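To connect the dict suggestion back to the question's data, here is a rough sketch of building the lookup table once from the DataFrame and then querying it inside the loop. The `cells()` generator is a hypothetical stand-in for the asker's 100-million-iteration loop, and the name `site_by_cell` is likewise made up for illustration.

```python
import pandas as pd

# The question's data, built inline for the sake of a self-contained example.
df = pd.DataFrame(
    {
        "DF_Cell": ["C1", "C2", "C3", "C4", "C5"],
        "DF_Site": ["A", "A", "B", "B", "B"],
    }
)

# Build the mapping once, outside the loop: cell name -> site.
site_by_cell = dict(zip(df["DF_Cell"], df["DF_Site"]))


def cells():
    """Hypothetical stand-in for the long loop that produces cell names."""
    for i in range(1, 6):
        yield f"C{i}"


# Each lookup is a plain dict access, with no pandas overhead per iteration.
sites = [site_by_cell[cell] for cell in cells()]

# dict.get returns None (or a supplied default) for cells missing from the table.
missing = site_by_cell.get("C99")

print(sites, missing)
```

This assumes the values in DF_Cell are unique; if a cell appears more than once, the dict keeps the last site seen for it.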

### Related Links

Using pandas.ols on multiple dependent variables at once

Insert 0-values for missing dates within MultiIndex

Reindexing dataframes

pandas access axis by user-defined name

Trouble with groupss and aggregation

Replace MultiIndex's contents with DataFrame columns

What's the `DataFrameGroupBy`-equivalent of `dict.keys`?

How to split a dataframe according to a boolean criterion?

Pandas Rolling Computations on Sliding Windows (Unevenly spaced)

Resampling Minute data

How to get the last n row of pandas dataframe?

Resample time series in pandas to a weekly interval

Suppress output of object when plotting in ipython

qtconsole not rendering pandas dataframes as html notebook_repr_html option

How to get the sub_level index value of Pandas dataframe?

Pandas groupby size “count” intermittent under-count