Pandas: How to group by a value in column when there is list in one of the columns


I am trying to group by the values in my "value_1" column, but my last column is made up of lists. When I group by "value_1", the column made up of lists disappears.

Dataframe:

value_1     value_2            value_3           list
american    california, nyc    walmart, kmart    [supermarket, connivence]
canadian    toronto            dunkinDonuts      [coffee]
american    texas                                [state]
canadian                       walmart           [supermarket]
...         ...                ...               ...

My expected output is:

value_1     value_2                   value_3                  list
american    california, nyc, texas    walmart, kmart           [supermarket, connivence, state]
canadian    toronto                   dunkinDonuts, walmart    [coffee, supermarket]

Thanks!

asked 1 hour ago

johnJones901

634

New contributor

Are all the columns strings, apart from the one list column?

– jezrael
1 hour ago

Super. And what does print (df.iloc[0].apply(type)) show?

– jezrael
45 mins ago

OK, so both solutions work.

– jezrael
40 mins ago


python pandas


2 Answers

Build the aggregation dictionary dynamically: every column except value_1 and list gets a string-joining function, and the list column gets a lambda that flattens the nested lists with a list comprehension:

# join the non-missing strings of each group with ', '
f1 = lambda x: ', '.join(x.dropna())
# alternative: join only the string values
#f1 = lambda x: ', '.join([y for y in x if isinstance(y, str)])
# flatten the lists of each group into one list
f2 = lambda x: [z for y in x for z in y]

# map every column except the key and the list column to f1
d = dict.fromkeys(df.columns.difference(['value_1','list']), f1)
d['list'] = f2

df = df.groupby('value_1', as_index=False).agg(d)
print (df)
     value_1                 value_2                value_3  
0   american  california, nyc, texas         walmart, kmart   
1   canadian                 toronto  dunkinDonuts, walmart   

                               list  
0  [supermarket, connivence, state]  
1             [coffee, supermarket]

Explanation:

f1 and f2 are lambda functions.

f1 first removes missing values (if there are any) and then joins the strings with a separator:

f1 = lambda x: ', '.join(x.dropna())

Alternatively, keep only the string values (this omits missing values, because NaN is a float, not a string) and join them with a separator:

f1 = lambda x: ', '.join([y for y in x if isinstance(y, str)])

Or keep every value except empty strings and join them with a separator:

f1 = lambda x: ', '.join([y for y in x if y != ''])

f2 flattens the lists, because collecting the lists of one group gives a nested list like [['a','b'], ['c']]:

f2 = lambda x: [z for y in x for z in y]
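For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The DataFrame is a hypothetical reconstruction of the four sample rows from the question, and it assumes the blank cells are NaN; the names join_strings, flatten and agg_map are only illustrative and stand in for f1, f2 and d.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the question's sample rows.
# Assumption: the blank cells are NaN (they could also be empty strings).
df = pd.DataFrame({
    "value_1": ["american", "canadian", "american", "canadian"],
    "value_2": ["california, nyc", "toronto", "texas", np.nan],
    "value_3": ["walmart, kmart", "dunkinDonuts", np.nan, "walmart"],
    "list": [["supermarket", "connivence"], ["coffee"], ["state"], ["supermarket"]],
})

def join_strings(values):
    """Join the string entries of one group, skipping NaN and empty strings."""
    return ", ".join(v for v in values if isinstance(v, str) and v != "")

def flatten(lists):
    """Concatenate the per-row lists of one group into a single flat list."""
    return [item for sub in lists for item in sub]

# Every column except the grouping key and the list column is joined as text.
agg_map = {col: join_strings for col in df.columns.difference(["value_1", "list"])}
agg_map["list"] = flatten

result = df.groupby("value_1", as_index=False).agg(agg_map)
print(result)
```

Because join_strings filters on both isinstance(y, str) and y != '', the sketch should behave the same whether the blanks turn out to be NaN or empty strings.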

edited 8 mins ago

answered 1 hour ago

jezrael

342k25296368

@johnJones901 - Can you try changing f1 to f1 = lambda x: ', '.join([y for y in x if y != '']) ?

– jezrael
38 mins ago

Can you explain what f1, f2 and d are doing please? Thank you!

– johnJones901
14 mins ago

@johnJones901 - The answer was edited.

– jezrael
8 mins ago

Thanks for the help!

– johnJones901
6 mins ago

@johnJones901 - You are welcome!

– jezrael
5 mins ago

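You could group by value_1 and aggregate the string columns with the following function:

import numpy as np

def fun(x):
    # str.cat skips missing values when joining
    return x.str.cat(sep=', ')

And use GroupBy.sum to concatenate the lists. Replace the empty strings with NaN first so that str.cat skips them. Use np.nan rather than None here: in older pandas versions replace('', None) falls back to the default method='pad' and fills each blank with the value from the previous row, which puts values into the wrong group.

df.replace('', np.nan).groupby('value_1').agg({'list':'sum', 'value_2': fun, 'value_3':fun})

                                      list                 value_2                value_3
value_1
american  [supermarket, connivence, state]  california, nyc, texas         walmart, kmart
canadian             [coffee, supermarket]                 toronto  dunkinDonuts, walmart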
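A runnable sketch of this second approach, under a few assumptions: the DataFrame is a hypothetical reconstruction of the question's sample rows with the blanks stored as empty strings, the name join_strings is illustrative, and the empty strings are replaced only in the two text columns (a small deviation from the one-liner, so the list column is left untouched).

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the question's sample rows.
# Assumption: the blank cells are empty strings.
df = pd.DataFrame({
    "value_1": ["american", "canadian", "american", "canadian"],
    "value_2": ["california, nyc", "toronto", "texas", ""],
    "value_3": ["walmart, kmart", "dunkinDonuts", "", "walmart"],
    "list": [["supermarket", "connivence"], ["coffee"], ["state"], ["supermarket"]],
})

def join_strings(col):
    """Join one group's strings with ', '; Series.str.cat skips missing values."""
    return col.str.cat(sep=", ")

# Turn empty strings into NaN in the text columns only, so str.cat ignores them.
text_cols = ["value_2", "value_3"]
cleaned = df.copy()
cleaned[text_cols] = cleaned[text_cols].replace("", np.nan)

# 'sum' on a column of lists concatenates the lists within each group.
result = cleaned.groupby("value_1").agg(
    {"list": "sum", "value_2": join_strings, "value_3": join_strings}
)
print(result)
```

Here value_1 ends up as the index of the result; call result.reset_index() if you want it back as a regular column.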

edited 13 mins ago

answered 1 hour ago

yatu

11.6k31137
