How to get frequency counts using column breaks by row?
I have a data frame that tracks service involvement (srvc_inv, coded 1 or 0) for an individual (Bob) over a timeframe of interest (the years 1900-1999).
library(tidyverse)
dat <- data.frame(name = rep("Bob", 100),
                  day = seq(as.Date("1900/1/1"), as.Date("1999/1/1"), "years"),
                  srvc_inv = c(rep(0, 25), rep(1, 25), rep(0, 25), rep(1, 25)))
As we can see, Bob has two service episodes: one episode between rows 26:50, and the other between rows 76:100.
If we want to determine any service involvement for Bob during the timeframe, we can use a simple max statement as shown below.
dat %>%
  group_by(name) %>%
  summarise(ever_inv = max(srvc_inv))
However, I would like to determine the number of service episodes that Bob had during the timeframe of interest (in this case, 2). A distinct service episode would be identified by a break in service involvement over consecutive dates. Anybody have any idea how to program this? Thanks!
asked 9 hours ago by DJC
3 Answers
One more solution, based on base R's rle():
library(dplyr)
dat %>%
  group_by(name) %>%
  summarise(ever_inv = length(with(rle(srvc_inv), lengths[values == 1])))
# A tibble: 1 x 2
#   name  ever_inv
#   <fct>    <int>
# 1 Bob          2
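To see why counting the runs whose value is 1 gives the number of episodes, it may help to look at what rle() returns for a small vector. This is only an illustrative sketch: the toy vector `x` and the name `n_episodes` are made up here and are not part of the original answer.

```r
# Toy involvement vector (hypothetical): two separate runs of 1s.
x <- c(0, 0, 1, 1, 1, 0, 1, 1)

runs <- rle(x)
runs$values   # the value of each run, in order
runs$lengths  # how many consecutive rows each run covers

# An episode is a run whose value is 1, so count those runs.
n_episodes <- length(runs$lengths[runs$values == 1])
n_episodes
```

The length of each run is not needed for the count itself; it is only used here as something to subset.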
answered 9 hours ago by A. Suliman
Thanks so much! Quick question: does rle() work when querying an external database? The 'dat' input will come from an Oracle database, and I'm planning to write my query with dplyr/dbplyr and have R translate it to SQL. Just wondering, as this operation will be performed on a dataset with a huge number of records.
– DJC, 9 hours ago
@DJC Sorry, I don't think that is possible, as rle() is a base R function and dplyr will fail to translate it into valid SQL.
– A. Suliman, 9 hours ago
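One way around the rle() limitation might be to phrase the count as a comparison between each row and the previous one, since that is the kind of logic a SQL window function can express. The sketch below uses base R so it runs without extra packages. The helper name `count_episodes_lag` is hypothetical, and it assumes the rows are sorted by date within each person.

```r
# Hypothetical helper: an episode starts wherever the current row is 1
# and the previous row is 0. Assumes x is already in date order.
count_episodes_lag <- function(x) {
  prev <- c(0, x[-length(x)])  # previous row's value; 0 before the first row
  sum(x == 1 & prev == 0)
}

# Same example data as in the question.
dat <- data.frame(
  name = rep("Bob", 100),
  day = seq(as.Date("1900/1/1"), as.Date("1999/1/1"), "years"),
  srvc_inv = c(rep(0, 25), rep(1, 25), rep(0, 25), rep(1, 25))
)
dat <- dat[order(dat$name, dat$day), ]  # enforce the ordering assumption

# One count per person.
episodes <- tapply(dat$srvc_inv, dat$name, count_episodes_lag)
episodes
```

In a dplyr pipeline the previous-row value would come from lag(), which dbplyr's documentation lists among the window functions it translates. Whether that translation works and performs acceptably on Oracle is an assumption that has not been checked here.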
One possibility could be:
dat %>%
  group_by(name) %>%
  mutate(rleid = with(rle(srvc_inv), rep(seq_along(lengths), lengths))) %>%
  summarise(ever_inv = n_distinct(rleid[srvc_inv == 1]))
#   name  ever_inv
#   <fct>    <int>
# 1 Bob          2
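The rleid column above gives every row the index of the run it belongs to, so counting distinct ids among the rows with srvc_inv == 1 counts the episodes. A minimal base R sketch on a made-up vector (the names `x`, `run_id` and `n_episodes` are illustrative only):

```r
# Toy vector (hypothetical): runs are 0 | 1 1 | 0 0 | 1
x <- c(0, 1, 1, 0, 0, 1)

runs <- rle(x)
# Repeat each run's index once per row in that run.
run_id <- rep(seq_along(runs$lengths), runs$lengths)
run_id

# Distinct run ids among the rows where involvement is 1.
n_episodes <- length(unique(run_id[x == 1]))
n_episodes
```

Keeping the run id as a column can also be useful beyond counting, for example to compute the start and end date of each episode.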
answered 9 hours ago by tmfmnk
As an alternative to rle(), you can use diff():
dat %>%
  group_by(name) %>%
  summarise(ever_inv = sum(diff(c(0, srvc_inv)) > 0))
# A tibble: 1 x 2
#   name  ever_inv
#   <fct>    <int>
# 1 Bob          2
Assuming that srvc_inv is either 0 or 1, diff(srvc_inv) equals 1 only where x[i] is 1 and x[i-1] is 0; otherwise it is 0 or -1. I prepended 0 to srvc_inv to handle the case where it starts with a run of 1s.
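The effect of the prepended 0 is easiest to see on a vector that begins with a run of 1s. The vectors and the helper name `count_starts` below are made up for illustration.

```r
# Two toy vectors (hypothetical), each containing two runs of 1s.
starts_with_zero <- c(0, 1, 1, 0, 1)
starts_with_one  <- c(1, 1, 0, 0, 1)

diff(starts_with_one)        # 0 -1  0  1: the leading run of 1s is not seen
diff(c(0, starts_with_one))  # 1  0 -1  0  1: padding exposes its start

# Count 0 -> 1 transitions after padding with a leading 0.
count_starts <- function(x) sum(diff(c(0, x)) > 0)

count_starts(starts_with_zero)
count_starts(starts_with_one)

# Without the padding, the second vector is undercounted.
unpadded <- sum(diff(starts_with_one) > 0)
unpadded
```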
And with rle() there is, in my opinion, an even simpler solution:
dat %>%
  group_by(name) %>%
  summarise(ever_inv = sum(rle(srvc_inv)$values))
# A tibble: 1 x 2
#   name  ever_inv
#   <fct>    <int>
# 1 Bob          2
Assuming that srvc_inv is either 0 or 1, it is enough to sum the values component of the rle object, which gives the number of runs of 1s.
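Both shortcuts rely on srvc_inv containing only 0 and 1. The sketch below checks, on random 0/1 vectors, that the counting approaches shown on this page agree, and then shows how they diverge once another value appears. The function names and test vectors are hypothetical.

```r
# Three ways of counting runs of 1s, written as plain functions.
by_rle_length <- function(x) {
  r <- rle(x)
  length(r$lengths[r$values == 1])
}
by_diff    <- function(x) sum(diff(c(0, x)) > 0)
by_rle_sum <- function(x) sum(rle(x)$values)

# On strictly 0/1 input, all three should give the same count.
set.seed(1)
agree <- replicate(200, {
  x <- sample(c(0, 1), size = 30, replace = TRUE)
  by_rle_length(x) == by_diff(x) && by_diff(x) == by_rle_sum(x)
})
all(agree)

# With a value outside 0/1, the three counts no longer match.
y <- c(0, 2, 2, 0, 1)
c(by_rle_length(y), by_diff(y), by_rle_sum(y))
```

If the column can hold other codes or missing values, it is probably safer to recode it to 0/1 first, for example with as.integer(srvc_inv == 1), before applying any of these.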
answered 5 hours ago by utubun