Pandas: How to group by a value in column when there is list in one of the columns


I am trying to group by the values in my "value_1" column, but my last column is made up of lists. When I group by "value_1", the column of lists disappears from the result.



DataFrame:

value_1   value_2          value_3         list
american  california, nyc  walmart, kmart  [supermarket, connivence]
canadian  toronto          dunkinDonuts    [coffee]
american  texas                            [state]
canadian                   walmart         [supermarket]
...       ...              ...             ...
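
For anyone who wants to reproduce this, the sample could be built roughly as follows. This is only a sketch: it assumes the blank cells are empty strings (in the real data they might be NaN instead) and that the last column holds genuine Python lists rather than their string representation.

```python
import pandas as pd

# Assumption: blank cells are empty strings; the real data may use NaN instead.
df = pd.DataFrame({
    'value_1': ['american', 'canadian', 'american', 'canadian'],
    'value_2': ['california, nyc', 'toronto', 'texas', ''],
    'value_3': ['walmart, kmart', 'dunkinDonuts', '', 'walmart'],
    'list': [['supermarket', 'connivence'], ['coffee'], ['state'], ['supermarket']],
})

print(df)
```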


My expected output is:

value_1   value_2                 value_3                list
american  california, nyc, texas  walmart, kmart         [supermarket, connivence, state]
canadian  toronto                 dunkinDonuts, walmart  [coffee, supermarket]


Thanks!










  • Are all the columns strings, with just one list column?

    – jezrael
    1 hour ago











  • Super. And what do you get with print(df.iloc[0].apply(type))?

    – jezrael
    45 mins ago











  • OK, so both solutions work.

    – jezrael
    40 mins ago
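
The type check suggested in the comments can be tried on a small frame first. The sketch below uses a hypothetical two-row frame, not the asker's real data; the second check, which scans the whole list column, is an addition that goes beyond what the comment asks for.

```python
import pandas as pd

# Hypothetical two-row frame standing in for the real data.
df = pd.DataFrame({
    'value_1': ['american', 'canadian'],
    'value_2': ['california, nyc', 'toronto'],
    'list': [['supermarket', 'connivence'], ['coffee']],
})

# Python type of each cell in the first row.
first_row_types = df.iloc[0].apply(type)
print(first_row_types)

# True only if every cell in the column is a real list (not a string like "[coffee]").
all_lists = df['list'].map(lambda v: isinstance(v, list)).all()
print(all_lists)
```

If the list column reports str instead of list, the values are string representations and would need to be parsed before they can be flattened.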
















python pandas






asked 1 hour ago by johnJones901

2 Answers

Dynamically build a dictionary that maps every column except value_1 and list to a string-joining function, and for list use a lambda with a list comprehension that flattens the lists:



f1 = lambda x: ', '.join(x.dropna())
# alternative: join only the string values
# f1 = lambda x: ', '.join([y for y in x if isinstance(y, str)])
f2 = lambda x: [z for y in x for z in y]
# map every column except value_1 and list to f1, then map list to f2
d = dict.fromkeys(df.columns.difference(['value_1', 'list']), f1)
d['list'] = f2

df = df.groupby('value_1', as_index=False).agg(d)
print(df)
    value_1                 value_2                value_3                              list
0  american  california, nyc, texas         walmart, kmart  [supermarket, connivence, state]
1  canadian                 toronto  dunkinDonuts, walmart             [coffee, supermarket]
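
To try this end to end, something like the following might help. It is a sketch under assumptions: the sample frame is rebuilt with empty strings for the blank cells, and f1 here combines the isinstance and empty-string filters, which is a guess about the data rather than the code exactly as posted.

```python
import pandas as pd

# Assumed sample data; blank cells are taken to be empty strings.
df = pd.DataFrame({
    'value_1': ['american', 'canadian', 'american', 'canadian'],
    'value_2': ['california, nyc', 'toronto', 'texas', ''],
    'value_3': ['walmart, kmart', 'dunkinDonuts', '', 'walmart'],
    'list': [['supermarket', 'connivence'], ['coffee'], ['state'], ['supermarket']],
})

# Join a group's strings, skipping missing values and empty strings.
f1 = lambda x: ', '.join([y for y in x if isinstance(y, str) and y != ''])
# Flatten a group's lists into a single list.
f2 = lambda x: [z for y in x for z in y]

# d maps column name -> aggregation function.
d = dict.fromkeys(df.columns.difference(['value_1', 'list']), f1)
d['list'] = f2
print(list(d))  # the columns that will be aggregated

out = df.groupby('value_1', as_index=False).agg(d)
print(out)
```

Building d from df.columns.difference means extra string columns are picked up automatically, at the cost of the result columns following the sorted order that difference returns.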


Explanation:



f1 and f2 are lambda functions, and d is a dictionary that maps each column name to the function used to aggregate it.



The first version of f1 removes missing values (if there are any) and joins the strings with a separator:



f1 = lambda x: ', '.join(x.dropna())


The second version keeps only the string values (which omits missing values, because NaN is not a string) and joins them with a separator:



f1 = lambda x: ', '.join([y for y in x if isinstance(y, str)])


The third version filters out empty strings and joins the remaining values with a separator:



f1 = lambda x: ', '.join([y for y in x if y != '']) 
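
The difference between these variants shows up on a group that contains both a missing value and an empty string. The Series below is a made-up example, and the names drop_nan, only_str and non_empty are only labels for this sketch.

```python
import numpy as np
import pandas as pd

# Hypothetical group containing a missing value and an empty string.
s = pd.Series(['walmart', np.nan, '', 'kmart'], dtype=object)

drop_nan = lambda x: ', '.join(x.dropna())
only_str = lambda x: ', '.join([y for y in x if isinstance(y, str)])

print(repr(drop_nan(s)))  # NaN is dropped, the empty string is kept
print(repr(only_str(s)))  # same result: '' is still a str

# Filtering on y != '' alone would let NaN through and join() would raise
# a TypeError, so this sketch combines both checks.
non_empty = lambda x: ', '.join([y for y in x if isinstance(y, str) and y != ''])
print(repr(non_empty(s)))
```

Which variant fits depends on whether the blank cells in the real data are NaN, empty strings, or a mix of both.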


The function f2 flattens the lists, because each group otherwise holds nested lists such as [['a','b'], ['c']]:



f2 = lambda x: [z for y in x for z in y]
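
A quick way to see what f2 does is to call it on a small Series of lists, which is roughly what agg passes in for each group. The itertools variant is a standard-library alternative shown for comparison; it is not part of the answer above.

```python
from itertools import chain

import pandas as pd

# Hypothetical group of list cells, similar to what agg hands to f2.
group = pd.Series([['a', 'b'], ['c']])

f2 = lambda x: [z for y in x for z in y]
print(f2(group))

# Standard-library equivalent of the nested comprehension.
flat = list(chain.from_iterable(group))
print(flat)
```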





answered 1 hour ago by jezrael (edited 8 mins ago)


  • @johnJones901 - Can you try changing f1 to f1 = lambda x: ', '.join([y for y in x if y != ''])?

    – jezrael
    38 mins ago






  • Can you explain what f1, f2 and d are doing, please? Thank you!

    – johnJones901
    14 mins ago






  • @johnJones901 - The answer was edited.

    – jezrael
    8 mins ago






  • Thanks for the help!

    – johnJones901
    6 mins ago











  • @johnJones901 - You are welcome!

    – jezrael
    5 mins ago



















You could group by value_1 and aggregate the string columns with the following function:



def fun(x):
    # join the group's strings with ', '; missing values are skipped
    return x.str.cat(sep=', ')
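
As a small illustration of how this function treats blanks, the sketch below calls it on two made-up Series: one with a NaN and one with an empty string.

```python
import numpy as np
import pandas as pd

def fun(x):
    return x.str.cat(sep=', ')

# Hypothetical group with one missing value: str.cat leaves it out.
s = pd.Series(['toronto', np.nan, 'texas'], dtype=object)
print(repr(fun(s)))

# An empty string is not missing, so it still takes part in the join.
t = pd.Series(['toronto', '', 'texas'], dtype=object)
print(repr(fun(t)))
```

This is why empty strings need to be turned into missing values before aggregating if they should not leave stray separators in the result.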


Then replace the empty strings with NaN so that str.cat skips them, and aggregate the list column with 'sum', which concatenates the lists:



import numpy as np

df.replace('', np.nan).groupby('value_1').agg({'list': 'sum', 'value_2': fun, 'value_3': fun})

                                      list                 value_2                value_3
value_1
american  [supermarket, connivence, state]  california, nyc, texas         walmart, kmart
canadian             [coffee, supermarket]                 toronto  dunkinDonuts, walmart
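
A self-contained version of this approach might look like the sketch below. It assumes the blank cells are empty strings and converts them to NaN explicitly with np.nan; the sample frame is rebuilt here only so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

# Assumed sample data; blank cells are taken to be empty strings.
df = pd.DataFrame({
    'value_1': ['american', 'canadian', 'american', 'canadian'],
    'value_2': ['california, nyc', 'toronto', 'texas', ''],
    'value_3': ['walmart, kmart', 'dunkinDonuts', '', 'walmart'],
    'list': [['supermarket', 'connivence'], ['coffee'], ['state'], ['supermarket']],
})

def fun(x):
    # str.cat skips missing values when joining.
    return x.str.cat(sep=', ')

out = (
    df.replace('', np.nan)          # empty strings become missing values
      .groupby('value_1')
      .agg({'list': 'sum',          # adding lists concatenates them
            'value_2': fun,
            'value_3': fun})
)
print(out)
```

Here value_1 ends up as the index; passing as_index=False to groupby, or calling reset_index() on the result, would keep it as a regular column.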





answered 1 hour ago by yatu (edited 13 mins ago)