I have data as follows. The score column is the score of x vs y (which is equivalent to y vs x).

from collections import Counter
import pandas as pd

d = pd.DataFrame([('a','b',1), ('a','c', 2), ('b','a',3), ('b','a',3)], 
                 columns=['x', 'y', 'score'])

    x   y   score
0   a   b   1
1   a   c   2
2   b   a   3
3   b   a   3

I want to count the scores for each combination, so ('a' vs 'b') and ('b' vs 'a') should be grouped together, i.e.

        score
x   y   
a   b   {1: 1, 3: 2}
    c   {2: 1}

However, if I do d.groupby(['x', 'y']).agg(Counter), ('a', 'b') and ('b', 'a') are not combined. Is there a way to solve this? Thanks!

        score
x   y   
a   b   {1: 1}
    c   {2: 1}
b   a   {3: 2}

3 Answers

If you do not care about the order within each pair, you can sort the two columns row-wise and then group:

import pandas as pd
from collections import Counter

d = pd.DataFrame([('a','b',1), ('a','c', 2), ('b','a',3), ('b','a',3)], 
                 columns=['x', 'y', 'score'])
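# Note: work on a copy (d.copy()) if you do not want to change the original
# Sort each row's (x, y) pair. On recent pandas versions apply(sorted, axis=1)
# returns a Series of lists, so expand it to two columns before assigning back.
d[['x', 'y']] = d[['x', 'y']].apply(sorted, axis=1, result_type='expand').to_numpy()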
x = d.groupby(['x', 'y']).agg(Counter)
print(x)
# Result:
#             score
# x y              
# a b  {1: 1, 3: 2}
#   c        {2: 1}

You can also group by a frozenset built from x and y for each row, then aggregate with Counter:

from collections import Counter
d.groupby(d[['x', 'y']].agg(frozenset, axis=1)).score.agg(Counter)

(b, a)    {1: 1, 3: 2}
(a, c)          {2: 1}

If you want a DataFrame, append .to_frame() to the expression:

        score
(b, a)  {1: 1, 3: 2}
(a, c)  {2: 1}
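
For reference, here is a self-contained sketch of the same grouping idea that leaves the original frame untouched. It assumes the frame d from the question and swaps the frozenset for a sorted tuple (my substitution, not part of the answer above) so that each key has a fixed element order. The names pair and counts are only illustrative.

```python
from collections import Counter
import pandas as pd

d = pd.DataFrame([('a', 'b', 1), ('a', 'c', 2), ('b', 'a', 3), ('b', 'a', 3)],
                 columns=['x', 'y', 'score'])

# Build an order-independent key per row: ('b', 'a') becomes ('a', 'b').
# Assumption: a sorted tuple is used here instead of a frozenset.
pair = pd.Series([tuple(sorted(p)) for p in zip(d['x'], d['y'])],
                 index=d.index, name='pair')

# Group the scores by that key; d itself is not modified.
counts = d.groupby(pair)['score'].agg(Counter)
print(counts)
```

Because the key is computed as a separate Series, there is no need to overwrite the x and y columns.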

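If I understand correctly, you can sort each pair with NumPy and then cross-tabulate:

import numpy as np

d[['x','y']] = np.sort(d[['x','y']], axis=1)
pd.crosstab([d.x, d.y], d.score)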
Out[94]: 
score  1  2  3
x y           
a b    1  0  2
  c    0  1  0
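
A non-mutating variant of this approach might look like the sketch below. It assumes the frame d from the question; the names pairs and table are hypothetical, and the sorted pairs are kept in a separate array instead of being written back into d.

```python
import numpy as np
import pandas as pd

d = pd.DataFrame([('a', 'b', 1), ('a', 'c', 2), ('b', 'a', 3), ('b', 'a', 3)],
                 columns=['x', 'y', 'score'])

# Sort each (x, y) pair row-wise in a separate array; d stays unchanged.
pairs = np.sort(d[['x', 'y']].to_numpy(), axis=1)

# Cross-tabulate the sorted pairs against the score values.
table = pd.crosstab([pairs[:, 0], pairs[:, 1]], d['score'],
                    rownames=['x', 'y'], colnames=['score'])
print(table)
```

Each row of the table is one unordered pair and each column is a score value, so a cell holds how often that score occurred for that pair.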
