How to get frequency counts using column breaks by row?
























I have a data frame that tracks service involvement (srvc_inv, coded 1 or 0) for an individual (Bob) over a timeframe of interest (the years 1900-1999).

    library(tidyverse)

    dat <- data.frame(name = rep("Bob", 100),
                      day = seq(as.Date("1900/1/1"), as.Date("1999/1/1"), "years"),
                      srvc_inv = c(rep(0, 25), rep(1, 25), rep(0, 25), rep(1, 25)))

As we can see, Bob has two service episodes: one in rows 26:50 and the other in rows 76:100.

To determine whether Bob had any service involvement during the timeframe, we can use a simple max() call, as shown below.

    dat %>%
      group_by(name) %>%
      summarise(ever_inv = max(srvc_inv))

However, I would like to determine the number of service episodes that Bob had during the timeframe of interest (in this case, 2). A distinct service episode is identified by a break in service involvement over consecutive dates. Does anybody have an idea how to program this? Thanks!

















      r
















      asked 9 hours ago









DJC

























          3 Answers













Another solution, based on base R's rle():

    library(dplyr)

    # Count the runs of 1s within each person's series
    dat %>%
      group_by(name) %>%
      summarise(ever_inv = length(with(rle(srvc_inv), lengths[values == 1])))

    # A tibble: 1 x 2
    #   name  ever_inv
    #   <fct>    <int>
    # 1 Bob          2
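To see why counting the runs whose value is 1 gives the number of episodes, it may help to look at what rle() returns on a small vector. This is only a sketch: the vector `x` and the name `n_episodes` are made up for illustration and are not part of the question's data.

```r
# Toy 0/1 vector (hypothetical): two separate runs of 1s.
x <- c(0, 0, 1, 1, 1, 0, 1, 1)

r <- rle(x)
r$lengths  # length of each run:  2 3 1 2
r$values   # value of each run:   0 1 0 1

# Keep only the runs whose value is 1 and count them.
n_episodes <- length(r$lengths[r$values == 1])
n_episodes  # 2
```

Each element of `values` stands for one uninterrupted run, so the number of 1s in `values` is the number of episodes, regardless of how long each run is.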
















              answered 9 hours ago









A. Suliman















• Thanks so much! Quick question: does rle() work when querying an external database? The 'dat' input will come from an Oracle database, and I'm planning to write my query with dplyr/dbplyr and have R translate it to SQL. I ask because this operation will run on a dataset with a huge number of records.
  – DJC, 9 hours ago

• @DJC Sorry, I don't think that is possible: rle() is a base R function, and dplyr will fail to translate it into valid SQL.
  – A. Suliman, 9 hours ago
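One possible way around this is to avoid rle() and use only verbs that have window-function counterparts in SQL. The sketch below is run on the local data frame from the question; it uses dplyr's lag() to compare each row with the previous one and counts the 0-to-1 transitions. The names `prev_inv`, `n_episodes` and `episodes` are made up here. Whether dbplyr turns this into valid Oracle SQL has not been tested and is an assumption; on a remote table the ordering may also need to be supplied with dbplyr's window_order() instead of arrange().

```r
library(dplyr)

# Same example data as in the question.
dat <- data.frame(
  name = rep("Bob", 100),
  day = seq(as.Date("1900/1/1"), as.Date("1999/1/1"), "years"),
  srvc_inv = c(rep(0, 25), rep(1, 25), rep(0, 25), rep(1, 25))
)

episodes <- dat %>%
  group_by(name) %>%
  # Make sure rows are in date order within each person
  arrange(day, .by_group = TRUE) %>%
  # Previous row's value; treat "before the first row" as 0
  mutate(prev_inv = lag(srvc_inv, default = 0)) %>%
  # An episode starts wherever the value goes from 0 to 1
  summarise(n_episodes = sum(srvc_inv == 1 & prev_inv == 0))

episodes$n_episodes  # 2
```

Using `default = 0` means that a series which already starts with a 1 still has its first episode counted.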































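One possibility could be:

    dat %>%
      group_by(name) %>%
      # Label each run of identical values with a running id
      mutate(rleid = with(rle(srvc_inv), rep(seq_along(lengths), lengths))) %>%
      # Count the distinct run ids among the rows with service involvement
      summarise(ever_inv = n_distinct(rleid[srvc_inv == 1]))

    #   name  ever_inv
    #   <fct>    <int>
    # 1 Bob          2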
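The run-id step can be hard to read at first, so here is a small base R sketch of what it produces. The vector `x` and the names `run_id` and `n_episodes` are hypothetical, and length(unique()) is used in place of dplyr's n_distinct() so the example needs no packages.

```r
# Toy 0/1 vector (hypothetical): two separate runs of 1s.
x <- c(0, 0, 1, 1, 1, 0, 1, 1)

# Repeat each run's index as many times as the run is long,
# so every element carries the id of the run it belongs to.
run_id <- with(rle(x), rep(seq_along(lengths), lengths))
run_id  # 1 1 2 2 2 3 4 4

# Distinct run ids among the elements equal to 1.
n_episodes <- length(unique(run_id[x == 1]))
n_episodes  # 2
```

Keeping the run id as a column can be handy if you later need per-episode summaries, such as the start and end date of each episode.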
















                  answered 9 hours ago









tmfmnk





































As an alternative to rle(), you can use diff():

    dat %>%
      group_by(name) %>%
      summarise(ever_inv = sum(diff(c(0, srvc_inv)) > 0))

    # A tibble: 1 x 2
    #   name  ever_inv
    #   <fct>    <int>
    # 1 Bob          2

Assuming that srvc_inv is either 0 or 1, an element of diff(srvc_inv) equals 1 only where x[i] is 1 and x[i-1] is 0; otherwise it is 0 or -1. I prepended a 0 to srvc_inv to handle the case where the series starts with a run of 1s.
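A quick sketch of why the leading 0 matters, using two made-up vectors (`x_start0` and `x_start1` are hypothetical names): one starts with 0, the other starts inside a run of 1s.

```r
x_start0 <- c(0, 1, 1, 0, 1)  # two runs of 1s, starts with 0
x_start1 <- c(1, 1, 0, 0, 1)  # two runs of 1s, starts with 1

diff(x_start0)                 # 1 0 -1 1
sum(diff(x_start0) > 0)        # 2

# Without the padding, the run at the very start is missed.
sum(diff(x_start1) > 0)        # 1

# With a 0 prepended, the first run shows up as a 0 -> 1 step.
sum(diff(c(0, x_start1)) > 0)  # 2
```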



And with rle() there is, in my opinion, an even simpler solution:

    dat %>%
      group_by(name) %>%
      summarise(ever_inv = sum(rle(srvc_inv)$values))

    # A tibble: 1 x 2
    #   name  ever_inv
    #   <fct>    <int>
    # 1 Bob          2

Assuming that srvc_inv is either 0 or 1, it is enough to sum the values component of the rle object, which gives the number of runs of 1s.
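The 0/1 assumption is what makes the sum work, as this small sketch on made-up vectors suggests (`x`, `y` and the 0/2 coding are hypothetical). If the indicator were coded differently, comparing the run values with 0 before summing would be a safer variant.

```r
x <- c(0, 0, 1, 1, 1, 0, 1, 1)  # coded 0/1, two runs of 1s
sum(rle(x)$values)              # 2

y <- c(0, 2, 2, 0, 2)           # coded 0/2, two runs of 2s
sum(rle(y)$values)              # 4, not the number of runs

# Counting the non-zero runs works for either coding.
sum(rle(y)$values != 0)         # 2
```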

















                          answered 5 hours ago









utubun





























