### pandas

#### Assigning one column to another column between pandas DataFrames (like vector to vector assignment)

I have a very strange problem that I spent the last hour trying to solve, with no success. It is even stranger because I can't replicate it on a small scale.

I have a large DataFrame (150,000 entries). I took out a subset of it and did some manipulation. The subset was saved as a different variable, `x`. `x` is smaller than the df, but its index is in the same range as the df. I'm now trying to assign `x` back to the DataFrame, replacing values in the same column:

    rep_Callers['true_vpID'] = x.true_vpID

This inserts all the different values in `x` at the right place in df, but instead of keeping the `df.true_vpID` values that are not in `x`, it fills them with NaNs. So I tried a different approach:

    df.ix[x.index,'true_vpID'] = x.true_vpID

But instead of putting the `x` values at the right place in df, `df.true_vpID` gets filled with the first value of `x` and only that value! I changed the first value of `x` several times to make sure this is indeed what is happening, and it is.

I tried to replicate it on a small scale, but could not reproduce the problem:

    df = DataFrame({'a':ones(5),'b':range(5)})

       a  b
    0  1  0
    1  1  1
    2  1  2
    3  1  3
    4  1  4

    z = Series([random() for i in range(5)],index = range(5))

    0    0.812561
    1    0.862109
    2    0.031268
    3    0.575634
    4    0.760752

    df.ix[z.index[[1,3]],'b'] = z[[1,3]]

       a         b
    0  1  0.000000
    1  1  0.812561
    2  1  2.000000
    3  1  0.575634
    4  1  4.000000
    5  1  5.000000

I have tried everything I can think of and need some new suggestions.

Try using `df.update(updated_df_or_series)`.

Also, using a simple example, you can modify a DataFrame by doing an index query and modifying the resulting object:

    df_1

       a  b
    0  1  0
    1  1  1
    2  1  2
    3  1  3
    4  1  4

    df_2 = df_1.ix[3:5]
    df_2.b = df_2.b + 2
    df_2

       a  b
    3  1  5
    4  1  6

    df_1

       a  b
    0  1  0
    1  1  1
    2  1  2
    3  1  5
    4  1  6
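For anyone reading this on a recent pandas, here is a minimal sketch of the `update` suggestion next to a label-based assignment. Note that `.ix` was removed in pandas 1.0, so the sketch uses `.loc` instead. The names `df` and `x` mirror the question, but the data is made up for illustration, and the variables `updated`, `assigned`, and `naive` are hypothetical names that exist only to keep the three results apart.

```python
import numpy as np
import pandas as pd

# Small stand-in for the large DataFrame in the question (illustrative data).
df = pd.DataFrame({"a": np.ones(5), "b": np.arange(5, dtype=float)})

# A modified subset whose index is a subset of df's index.
x = df.loc[[1, 3], ["b"]].copy()
x["b"] = x["b"] + 10

# Option 1: DataFrame.update modifies in place, aligning on index and
# column labels. Rows that are absent from x keep their current values.
updated = df.copy()
updated.update(x)

# Option 2: label-based assignment with .loc. The right-hand Series is
# aligned on its index, so only rows 1 and 3 change.
assigned = df.copy()
assigned.loc[x.index, "b"] = x["b"]

# For contrast: plain column assignment reindexes x["b"] to df's index,
# which is why rows missing from x turn into NaN.
naive = df.copy()
naive["b"] = x["b"]

print(updated)
print(assigned)
print(naive)
```

Both options leave the untouched rows as they were, while the plain column assignment reproduces the NaN behaviour described in the question. Writing back explicitly with `update` or `.loc` is also likely safer than relying on a slice being a view of the original, since whether a modified slice propagates to its parent depends on the pandas version and settings.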
