Python Pandas - difference between 'loc' and 'where'?


Just curious on the behavior of 'where' and why you would use it over 'loc'.



If I create a dataframe:



import pandas as pd

df = pd.DataFrame({'ID': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                   'Run Distance': [234, 35, 77, 787, 243, 5435, 775, 123, 355, 123],
                   'Goals': [12, 23, 56, 7, 8, 0, 4, 2, 1, 34],
                   'Gender': ['m', 'm', 'm', 'f', 'f', 'm', 'f', 'm', 'f', 'm']})


And then apply the 'where' function:



df2 = df.where(df['Goals']>10)


I get the following, which keeps the rows where Goals > 10 but replaces everything else with NaN:



  Gender  Goals    ID  Run Distance
0      m   12.0   1.0         234.0
1      m   23.0   2.0          35.0
2      m   56.0   3.0          77.0
3    NaN    NaN   NaN           NaN
4    NaN    NaN   NaN           NaN
5    NaN    NaN   NaN           NaN
6    NaN    NaN   NaN           NaN
7    NaN    NaN   NaN           NaN
8    NaN    NaN   NaN           NaN
9      m   34.0  10.0         123.0


If, however, I use the 'loc' indexer:



df2 = df.loc[df['Goals']>10]


It returns only the matching subset of the dataframe, without any NaN rows:



  Gender  Goals  ID  Run Distance
0      m     12   1           234
1      m     23   2            35
2      m     56   3            77
9      m     34  10           123


So essentially I am curious why you would use 'where' over 'loc/iloc' and why it returns NaN values?










Tags: python, pandas

asked 6 hours ago by ScoutEU, edited 6 hours ago
























3 Answers














Think of loc as a filter: it gives you back only the parts of the DataFrame that satisfy a condition.



where originally comes from NumPy. It checks the condition for every element and gives you back an object of the same shape as the original: values are kept where the condition is True and replaced, with NaN by default, where it is False. A nice feature of where is that you can also supply a different replacement, e.g. df2 = df.where(df['Goals']>10, other=0), to replace values that don't meet the condition with 0.



   ID  Run Distance  Goals Gender
0   1           234     12      m
1   2            35     23      m
2   3            77     56      m
3   0             0      0      0
4   0             0      0      0
5   0             0      0      0
6   0             0      0      0
7   0             0      0      0
8   0             0      0      0
9  10           123     34      m
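As a rough illustration of the replacement behaviour described above, the sketch below applies where to the numeric columns of the question's DataFrame, once with the default NaN and once with other=0. Restricting to numeric columns is my own choice here (so that 0 is a sensible replacement), not something the answer prescribes. It also shows why the question's output printed 12.0 rather than 12: NaN is a floating-point value, so integer columns are upcast to float when NaN is inserted.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Run Distance': [234, 35, 77, 787, 243, 5435, 775, 123, 355, 123],
    'Goals': [12, 23, 56, 7, 8, 0, 4, 2, 1, 34],
    'Gender': ['m', 'm', 'm', 'f', 'f', 'm', 'f', 'm', 'f', 'm'],
})

cond = df['Goals'] > 10
# Illustrative choice: work on the numeric columns only.
numeric = df[['ID', 'Run Distance', 'Goals']]

with_nan = numeric.where(cond)            # default replacement is NaN
with_zero = numeric.where(cond, other=0)  # explicit integer replacement

print(with_nan.dtypes)   # integer columns become float to hold NaN
print(with_zero.dtypes)  # integer columns stay integer
```

Both results have the same shape and index as the input; only the values in the non-matching rows differ.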


Also, while where is only for conditional masking, loc is the standard way of selecting in pandas, along with iloc. loc selects by row and column labels, while iloc selects by integer position. So with loc you could choose to return, say, df.loc[0:1, ['Gender', 'Goals']] (note that a loc slice includes both endpoints):



  Gender  Goals
0      m     12
1      m     23
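To make the label-versus-position distinction concrete, here is a small sketch. The index labels 10, 20, 30, 40 are hypothetical, chosen only so that labels and positions differ; the column values are a trimmed copy of the question's data.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'Goals': [12, 23, 56, 7],
        'Gender': ['m', 'm', 'm', 'f'],
    },
    index=[10, 20, 30, 40],  # hypothetical non-default row labels
)

# loc: label-based, and a label slice includes both endpoints.
by_label = df.loc[10:20, ['Gender', 'Goals']]

# iloc: position-based, and a position slice excludes the end.
by_position = df.iloc[0:2, [1, 0]]

print(by_label.equals(by_position))

# With the default 0..n-1 index the same difference shows up as a row count.
default = df.reset_index(drop=True)
print(len(default.loc[0:1]), len(default.iloc[0:1]))
```

The two selections above pick the same two rows and the same column order; the last line shows that the slice 0:1 yields two rows with loc but one row with iloc.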






answered 6 hours ago by Josh Friedlander, edited 5 hours ago








Comment: That is super helpful, thank you. So 'loc' filters, and 'where' is more for where you want to change values that do not fit the condition to something else. Perfect, thank you! – ScoutEU, 6 hours ago














If you check the docs for DataFrame.where, it replaces the rows that do not satisfy the condition, with NaN by default, but it is possible to specify another value:



# Default: rows that fail the condition are filled with NaN
df2 = df.where(df['Goals']>10)
print(df2)
     ID  Run Distance  Goals Gender
0   1.0         234.0   12.0      m
1   2.0          35.0   23.0      m
2   3.0          77.0   56.0      m
3   NaN           NaN    NaN    NaN
4   NaN           NaN    NaN    NaN
5   NaN           NaN    NaN    NaN
6   NaN           NaN    NaN    NaN
7   NaN           NaN    NaN    NaN
8   NaN           NaN    NaN    NaN
9  10.0         123.0   34.0      m

# Second argument: rows that fail the condition are filled with 100
df2 = df.where(df['Goals']>10, 100)
print(df2)
    ID  Run Distance  Goals Gender
0    1           234     12      m
1    2            35     23      m
2    3            77     56      m
3  100           100    100    100
4  100           100    100    100
5  100           100    100    100
6  100           100    100    100
7  100           100    100    100
8  100           100    100    100
9   10           123     34      m
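The same replacement can be applied to a single column rather than the whole frame, which is often closer to what one wants in practice. The sketch below is my own illustration (not part of the answer above): it uses Series.where on the Goals values from the question, together with Series.mask, which is the mirror image of where and replaces values where the condition is True.

```python
import pandas as pd

goals = pd.Series([12, 23, 56, 7, 8, 0, 4, 2, 1, 34], name='Goals')

# where: keep values where the condition is True, otherwise use 0
kept = goals.where(goals > 10, 0)

# mask: replace values where the condition is True with 0
same = goals.mask(goals <= 10, 0)

print(kept.tolist())
print(kept.equals(same))
```

Because the two conditions are exact complements, both calls produce the same Series.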


The other syntax is called boolean indexing and is used to filter rows: it keeps only the rows that match the condition and removes the rest.



df2 = df.loc[df['Goals']>10]
# alternative
df2 = df[df['Goals']>10]

print(df2)
   ID  Run Distance  Goals Gender
0   1           234     12      m
1   2            35     23      m
2   3            77     56      m
9  10           123     34      m
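A quick way to convince yourself that the two spellings above are interchangeable for row filtering is to build the boolean mask once and compare the results. This is only a sanity-check sketch using a trimmed copy of the question's data; the variable names are mine.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Goals': [12, 23, 56, 7, 8, 0, 4, 2, 1, 34],
})

# The condition is just a boolean Series aligned with df's index.
cond = df['Goals'] > 10

via_brackets = df[cond]
via_loc = df.loc[cond]

print(cond.tolist())
print(via_brackets.equals(via_loc))
print(list(via_loc.index))
```

Note that the surviving rows keep their original index labels (0, 1, 2, 9); the index is not renumbered.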


If you use loc, it is also possible to filter rows by a condition and select columns by name(s) at the same time:



# A single column name returns a Series
s = df.loc[df['Goals']>10, 'ID']
print(s)
0     1
1     2
2     3
9    10
Name: ID, dtype: int64

# A list of column names returns a DataFrame
df2 = df.loc[df['Goals']>10, ['ID','Gender']]
print(df2)
   ID Gender
0   1      m
1   2      m
2   3      m
9  10      m
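The row condition passed to loc can itself be a combination of several conditions. As a hedged sketch (the second condition on Run Distance is an arbitrary threshold I picked for illustration), boolean Series are combined with & and each comparison needs its own parentheses:

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Run Distance': [234, 35, 77, 787, 243, 5435, 775, 123, 355, 123],
    'Goals': [12, 23, 56, 7, 8, 0, 4, 2, 1, 34],
})

# Both conditions must hold; parentheses are required around each comparison.
both = (df['Goals'] > 10) & (df['Run Distance'] > 100)

result = df.loc[both, ['ID', 'Goals']]
print(result)
```

Only the rows with ID 1 and ID 10 satisfy both conditions in this data.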






answered 6 hours ago by jezrael, edited 6 hours ago













Comment: That makes a lot of sense, thank you very much. Also thanks for the tip on the alternative! – ScoutEU, 6 hours ago



















• loc retrieves only the rows that match the condition.

• where returns the whole DataFrame, replacing the values in the rows that don't match the condition (with NaN by default).
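The two bullet points can be checked directly by comparing shapes and indexes. The sketch below uses the question's data; the variable names are mine, and the dropna step is only there to show how the two results relate, not a recommended way to filter.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Run Distance': [234, 35, 77, 787, 243, 5435, 775, 123, 355, 123],
    'Goals': [12, 23, 56, 7, 8, 0, 4, 2, 1, 34],
    'Gender': ['m', 'm', 'm', 'f', 'f', 'm', 'f', 'm', 'f', 'm'],
})

cond = df['Goals'] > 10

filtered = df.loc[cond]  # only the matching rows
masked = df.where(cond)  # every row, non-matching ones blanked out

print(filtered.shape, masked.shape)

# Dropping the blanked-out rows leaves the same row labels that loc selected.
recovered = masked.dropna()
print(list(recovered.index) == list(filtered.index))
```

Keeping the original shape is what makes where useful when the result has to stay aligned with the source frame, for example when it is combined with other columns of the same length.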







answered 6 hours ago by Casti








Comment: Great, thank you. 'Where' is a lot more useful than originally thought! – ScoutEU, 6 hours ago













