Pandas: How to group by a value in column when there is list in one of the columns

I am trying to group by the values in my "value_1" column, but my last column is made up of lists. When I group by "value_1", the column made up of lists disappears.

DataFrame:

    value_1    value_2           value_3          list
    american   california, nyc   walmart, kmart   [supermarket, connivence]
    canadian   toronto           dunkinDonuts     [coffee]
    american   texas                              [state]
    canadian                     walmart          [supermarket]
    ...        ...               ...              ...

My expected output is:

    value_1    value_2                  value_3                 list
    american   california, nyc, texas   walmart, kmart          [supermarket, connivence, state]
    canadian   toronto                  dunkinDonuts, walmart   [coffee, supermarket]

Thanks!

Tags: python, pandas
asked 1 hour ago by johnJones901
Are all the columns strings, except for the one list column?
– jezrael
1 hour ago
Super, and what do you get if you use print(df.iloc[0].apply(type))?
– jezrael
45 mins ago
OK, so both solutions work.
– jezrael
40 mins ago
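For reference, here is a minimal sketch of what the type check suggested in the comment reports. The one-row frame is an assumption, rebuilt from the first row of the question's data for illustration; it is not the asker's actual session.

```python
import pandas as pd

# Hypothetical one-row frame shaped like the first row of the question's data.
df = pd.DataFrame({
    "value_1": ["american"],
    "value_2": ["california, nyc"],
    "value_3": ["walmart, kmart"],
    "list": [["supermarket", "connivence"]],
})

# Take the first row and map every cell to its Python type.
cell_types = df.iloc[0].apply(type)
print(cell_types)
```

If the last column really holds lists (and not strings that merely look like lists), its entry is the list type while the other entries are str.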
2 Answers
Build the aggregation dictionary dynamically: every column except value_1 and list is joined with f1, and the list column is aggregated with f2, a lambda function that flattens the lists with a list comprehension:

    f1 = lambda x: ', '.join(x.dropna())
    # alternative: join only the string values
    # f1 = lambda x: ', '.join([y for y in x if isinstance(y, str)])
    f2 = lambda x: [z for y in x for z in y]

    d = dict.fromkeys(df.columns.difference(['value_1', 'list']), f1)
    d['list'] = f2

    df = df.groupby('value_1', as_index=False).agg(d)
    print(df)

        value_1                 value_2                value_3                              list
    0  american  california, nyc, texas         walmart, kmart  [supermarket, connivence, state]
    1  canadian                 toronto  dunkinDonuts, walmart             [coffee, supermarket]

Explanation:

f1 and f2 are lambda functions, and d is a dictionary that maps each column to the function used to aggregate it.

Remove missing values (if any exist), then join the strings with a separator:

    f1 = lambda x: ', '.join(x.dropna())

Keep only the string values (this omits missing values, because NaNs are not strings), then join the strings with a separator:

    f1 = lambda x: ', '.join([y for y in x if isinstance(y, str)])

Keep all values except empty strings, then join the strings with a separator:

    f1 = lambda x: ', '.join([y for y in x if y != ''])

Function f2 flattens the lists, because aggregation would otherwise return nested lists like [['a','b'], ['c']]:

    f2 = lambda x: [z for y in x for z in y]
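As a self-contained sketch of the dictionary approach, the snippet below rebuilds the question's four sample rows and runs the aggregation end to end. Two things are assumptions here: the empty cells are taken to be empty strings (they could be NaN in the real data), and the helper names join_strings, flatten and agg_map are made up for readability. The joiner combines the isinstance and empty-string filters so it copes with either kind of blank.

```python
import pandas as pd

# Sample rows rebuilt from the question; blanks are ASSUMED to be empty strings.
df = pd.DataFrame({
    "value_1": ["american", "canadian", "american", "canadian"],
    "value_2": ["california, nyc", "toronto", "texas", ""],
    "value_3": ["walmart, kmart", "dunkinDonuts", "", "walmart"],
    "list": [["supermarket", "connivence"], ["coffee"], ["state"], ["supermarket"]],
})

def join_strings(values):
    """Join the non-empty strings of one group, skipping NaN and ''."""
    return ", ".join(v for v in values if isinstance(v, str) and v != "")

def flatten(lists):
    """Concatenate the per-row lists of one group into a single flat list."""
    return [item for sub in lists for item in sub]

# Every column except the key and the list column gets the string joiner.
agg_map = dict.fromkeys(df.columns.difference(["value_1", "list"]), join_strings)
agg_map["list"] = flatten

result = df.groupby("value_1", as_index=False).agg(agg_map)
print(result)
```

Named functions are used instead of lambdas only so each step can carry a docstring; the behaviour is meant to match f1 and f2.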
answered 1 hour ago by jezrael (edited 8 mins ago)
@johnJones901 - Can you check whether changing f1 to f1 = lambda x: ', '.join([y for y in x if y != '']) works?
– jezrael
38 mins ago
Can you explain what f1, f2 and d are doing please? Thank you!
– johnJones901
14 mins ago
@johnJones901 - Answer was edited.
– jezrael
8 mins ago
Thanks for the help!
– johnJones901
6 mins ago
@johnJones901 - You are welcome!
– jezrael
5 mins ago
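You could group by value_1 and aggregate the string columns with the following function:

    def fun(x):
        return x.str.cat(sep=', ')

and use GroupBy.sum to concatenate the lists. Replace the empty strings with NaN first so that str.cat skips them. Use np.nan rather than None for this: in older pandas versions, replace('', None) fills each empty string with the value from the previous row instead of a missing value, so values such as texas can leak into the canadian group.

    import numpy as np

    df.replace('', np.nan).groupby('value_1').agg({'list': 'sum', 'value_2': fun, 'value_3': fun})

                                          list                 value_2                value_3
    value_1
    american  [supermarket, connivence, state]  california, nyc, texas         walmart, kmart
    canadian             [coffee, supermarket]                 toronto  dunkinDonuts, walmart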
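A runnable sketch of this str.cat approach on the question's sample rows follows. The assumptions: blanks are empty strings, they are converted to np.nan (not None) so that str.cat omits them, and the lists are concatenated with Python's built-in sum and an empty-list start value as a stand-in for the 'sum' shortcut. The names cleaned, join_strings and concat_lists are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample rows rebuilt from the question; blanks are ASSUMED to be empty strings.
df = pd.DataFrame({
    "value_1": ["american", "canadian", "american", "canadian"],
    "value_2": ["california, nyc", "toronto", "texas", ""],
    "value_3": ["walmart, kmart", "dunkinDonuts", "", "walmart"],
    "list": [["supermarket", "connivence"], ["coffee"], ["state"], ["supermarket"]],
})

# Turn empty strings into real missing values, only in the string columns.
str_cols = ["value_2", "value_3"]
cleaned = df.copy()
cleaned[str_cols] = cleaned[str_cols].replace("", np.nan)

def join_strings(s):
    """Concatenate a group's strings; str.cat omits NaN when called like this."""
    return s.str.cat(sep=", ")

def concat_lists(s):
    """Concatenate a group's lists, starting from an empty list."""
    return sum(s, [])

result = cleaned.groupby("value_1").agg(
    {"list": concat_lists, "value_2": join_strings, "value_3": join_strings}
)
print(result)
```

With the blanks stored as NaN, no value from a neighbouring row ends up in the wrong group, and the result is indexed by value_1.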
answered 1 hour ago by yatu (edited 13 mins ago)