### pandas

#### Assigning one column to another column between pandas DataFrames (like vector to vector assignment)

I have a strange problem that I spent the last hour trying to solve, with no success. It is even stranger because I can't replicate it on a small scale.

I have a large DataFrame (150,000 entries). I took a subset of it, did some manipulation, and saved the result as a separate variable, `x`. `x` is smaller than `df`, but its index is in the same range as the index of `df`. I'm now trying to assign `x` back to the DataFrame, replacing the values in the same column:

    rep_Callers['true_vpID'] = x.true_vpID

This inserts all the values from `x` in the right places in `df`, but instead of keeping the `df.true_vpID` values that are not in `x`, it fills them with NaNs. So I tried a different approach:

    df.ix[x.index,'true_vpID'] = x.true_vpID

But instead of putting the `x` values in the right places in `df`, `df.true_vpID` gets filled with the first value of `x` and nothing else! I changed the first value of `x` several times to make sure this is really what is happening, and it is.

I tried to replicate it on a small scale, but there it works as expected:

    df = DataFrame({'a':ones(5),'b':range(5)})

       a  b
    0  1  0
    1  1  1
    2  1  2
    3  1  3
    4  1  4

    z = Series([random() for i in range(5)],index = range(5))

    0    0.812561
    1    0.862109
    2    0.031268
    3    0.575634
    4    0.760752

    df.ix[z.index[[1,3]],'b'] = z[[1,3]]

       a         b
    0  1  0.000000
    1  1  0.812561
    2  1  2.000000
    3  1  0.575634
    4  1  4.000000
    5  1  5.000000

I have tried everything I can think of and need some new suggestions.
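The NaN behaviour from the first attempt can be reproduced on a small frame. The following is a minimal sketch with made-up values (the frame, the values, and the name `overwritten` are assumptions for illustration, not the asker's data): assigning a Series to a whole column aligns it on the DataFrame's index, so every label missing from the Series ends up as NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the large frame; the values are made up.
df = pd.DataFrame({"a": np.ones(5), "true_vpID": [10.0, 11.0, 12.0, 13.0, 14.0]})

# x is a modified subset whose index labels are a subset of df's labels.
x = df.loc[[1, 3]].copy()
x["true_vpID"] = [111.0, 333.0]

# Whole-column assignment reindexes x.true_vpID to df's index:
# labels 0, 2 and 4 are not in x, so they become NaN.
overwritten = df.copy()
overwritten["true_vpID"] = x["true_vpID"]
print(overwritten)
```

Rows 1 and 3 receive the new values, while rows 0, 2 and 4 lose their original values, which matches the symptom described for `rep_Callers['true_vpID'] = x.true_vpID`.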

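Try using `df.update(updated_df_or_series)`. It modifies `df` in place, matching on index, and leaves the rows that are not in the other object untouched.

Also, using a simple example, you can modify a DataFrame by doing an index query and modifying the resulting object:

    df_1

       a  b
    0  1  0
    1  1  1
    2  1  2
    3  1  3
    4  1  4

    df_2 = df_1.ix[3:5]
    df_2.b = df_2.b + 2
    df_2

       a  b
    3  1  5
    4  1  6

    df_1

       a  b
    0  1  0
    1  1  1
    2  1  2
    3  1  5
    4  1  6

Note that this second approach only works when the index query returns a view rather than a copy, which pandas does not guarantee, so `update` is the safer choice.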
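As a rough illustration of the `update` suggestion, here is a minimal sketch on made-up data (the frame, the values, and the names `updated` and `assigned` are assumptions for illustration). It also shows label-based assignment with `.loc`, which replaces `.ix` in current pandas (`.ix` was deprecated in 0.20 and removed in 1.0).

```python
import numpy as np
import pandas as pd

# Hypothetical data; x holds new values for a subset of df's index labels.
df = pd.DataFrame({"a": np.ones(5), "true_vpID": [10.0, 11.0, 12.0, 13.0, 14.0]})
x = pd.DataFrame({"true_vpID": [111.0, 333.0]}, index=[1, 3])

# Option 1: DataFrame.update works in place and returns None. It aligns on
# index and column labels and only overwrites where x has non-NaN values.
updated = df.copy()
updated.update(x)

# Option 2: assign only to the rows present in x, selected by label.
assigned = df.copy()
assigned.loc[x.index, "true_vpID"] = x["true_vpID"]

print(updated)
print(assigned)
```

In both results rows 1 and 3 carry the new values and rows 0, 2 and 4 keep their original ones. Working on explicit copies here keeps the sketch independent of whether a slice is a view or a copy.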
