### pandas

#### VLOOKUP equivalent function to look up value in pandas DataFrame

I have a pandas DataFrame with the following structure:

    DF_Cell,DF_Site
    C1,A
    C2,A
    C3,B
    C4,B
    C5,B

I also have a very long loop (100 million iterations) that processes, one by one, strings corresponding to the `DF_Cell` column of the DataFrame (the first iteration produces C1, the second produces C2, and so on).

For each cell processed in the loop, I would like to look up the corresponding `DF_Site` in the DataFrame. One way I could think of was to put the cell in a one-row DataFrame and do a left merge on it, but that is far too inefficient for data this large. Is there a better way?

Perhaps you want to set `DF_Cell` as the index*:

    In [11]: df = pd.read_csv('foo.csv', index_col='DF_Cell')  # or df.set_index('DF_Cell', inplace=True)

    In [12]: df
    Out[12]:
            DF_Site
    DF_Cell
    C1            A
    C2            A
    C3            B
    C4            B
    C5            B

You can then refer to the row, or to a specific entry, using `loc`:

    In [13]: df.loc['C1']
    Out[13]:
    DF_Site    A
    Name: C1, dtype: object

    In [14]: df.loc['C1', 'DF_Site']
    Out[14]: 'A'

*Assuming the file has only these two columns, you could also pass `squeeze=True` to `read_csv` to get a Series instead of a DataFrame. That option was deprecated in pandas 1.4 and removed in 2.0; on newer versions, call `.squeeze("columns")` on the result instead.
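As a self-contained illustration of the index-based lookup above, here is a minimal sketch. It assumes the sample data from the question and builds it inline rather than reading `foo.csv`. The use of `.at` for scalar access is an addition, not part of the original answer.

```python
import pandas as pd

# Sample data mirroring the question (built inline; the original reads foo.csv).
df = pd.DataFrame({
    "DF_Cell": ["C1", "C2", "C3", "C4", "C5"],
    "DF_Site": ["A", "A", "B", "B", "B"],
})

# Make DF_Cell the index so rows can be addressed by label.
indexed = df.set_index("DF_Cell")

# Label-based lookup of a single entry with .loc, as in the answer.
site_loc = indexed.loc["C1", "DF_Site"]

# .at is the accessor pandas provides specifically for single scalar values.
site_at = indexed.at["C4", "DF_Site"]

print(site_loc, site_at)
```

Both accessors raise `KeyError` if the label is not in the index, so a loop that may see unknown cells needs to handle that case.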

I don't really understand what you mean in your first paragraph, but for looking up a field value by the corresponding key in a different column, I agree that Alexis' example is the most idiomatic and efficient way to do it in pandas. However, if this is really representative of your data structure, you can just use a dict:

    data = {'a': 1, 'b': 2, 'c': 3}
    data['a']  # 1
    list(map(lambda y: data[y] + 1, ['c', 'b', 'a']))  # [4, 3, 2]
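To connect the dict suggestion to the question's data, here is a rough sketch that builds the mapping once from the DataFrame and then uses plain dict lookups inside the loop. The names `site_by_cell` and `lookup_site` are hypothetical, and the short `cells` list stands in for the strings produced by the real 100-million-iteration loop.

```python
import pandas as pd

# Sample data mirroring the question.
df = pd.DataFrame({
    "DF_Cell": ["C1", "C2", "C3", "C4", "C5"],
    "DF_Site": ["A", "A", "B", "B", "B"],
})

# Build the mapping once, outside the loop (hypothetical name).
site_by_cell = dict(zip(df["DF_Cell"], df["DF_Site"]))


def lookup_site(cell, default=None):
    """Return the site for a cell, or `default` if the cell is unknown."""
    return site_by_cell.get(cell, default)


# Stand-in for the long loop: each iteration yields one cell string.
cells = ["C1", "C3", "C5", "C9"]
sites = [lookup_site(cell) for cell in cells]

print(sites)
```

Using `dict.get` with a default avoids a `KeyError` for cells that are missing from the DataFrame. Whether a missing cell should be an error instead depends on the data.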
