How to count the number of occurrences before a particular value in a dataframe in Python?

I have a dataframe with columns A, B and C (the input is shown as the first three columns of the expected output below).

I want the number of occurrences of zeros in df['B'] under the following condition:

if(df['B']<df['C']):
 #count number of zeroes in df['B'] until it sees 1.

Expected output:

A  B  C  output
1  1  1  NaN
2  0  1  1
3  0  0  NaN
4  1  0  NaN
5  0  1  1
6  0  1  0
7  1  0  NaN

I don't know how to formulate the count part. Any help is really appreciated.

python pandas dataframe

edited 8 hours ago

Massifox

asked 9 hours ago

hakuna_code

Me too, what does "until it sees 1" mean?

– Joe
9 hours ago

Until the first occurrence of '1' in B.

– hakuna_code
9 hours ago
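One reading of the clarified requirement that is consistent with the expected output: for every row where B < C, count the zeros in B on the rows that follow, stopping at the next 1. The sketch below rebuilds the input from the expected-output table and spells that reading out as a plain loop. The helper name `count_zeros_until_one` is hypothetical, and counting to the end of the column when no later 1 exists is an assumption the question does not cover.

```python
import numpy as np
import pandas as pd

# Input columns taken from the expected-output table in the question.
df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5, 6, 7],
    "B": [1, 0, 0, 1, 0, 0, 1],
    "C": [1, 1, 0, 0, 1, 1, 0],
})

def count_zeros_until_one(b, c):
    """Plain-loop reference (hypothetical helper, not from the thread)."""
    b = np.asarray(b)
    c = np.asarray(c)
    out = np.full(len(b), np.nan)
    for i in range(len(b)):
        if b[i] < c[i]:
            count = 0
            j = i + 1
            # Walk down B from the next row until a 1 (or the end) is reached.
            # Assumption: with no later 1, zeros are counted to the end.
            while j < len(b) and b[j] != 1:
                if b[j] == 0:
                    count += 1
                j += 1
            out[i] = count
    return out

df["output"] = count_zeros_until_one(df["B"], df["C"])
print(df)
```

A loop like this is slow on large frames, but it is a handy reference for checking the vectorised answers.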

3 Answers

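If I understand correctly, one approach is to use a custom grouper and aggregate with groupby.cumcount:

# rows where the condition B < C holds
c1 = df.B.lt(df.C)
# start a new group at every 1 in B
g = df.B.eq(1).cumsum()
# rows remaining in each group, shifted down one row, kept only where c1 holds
df['out'] = c1.groupby(g).cumcount(ascending=False).shift().where(c1).sub(1)

print(df)

   A  B  C  out
0  1  1  1  NaN
1  2  0  1  1.0
2  3  0  0  NaN
3  4  1  0  NaN
4  5  0  1  1.0
5  6  0  1  0.0
6  7  1  0  NaN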
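To see what the grouper does, it can help to print the intermediate columns side by side. This is only a sketch that rebuilds the sample frame from the question; the column names `g`, `remaining` and `out` in the `steps` frame are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5, 6, 7],
    "B": [1, 0, 0, 1, 0, 0, 1],
    "C": [1, 1, 0, 0, 1, 1, 0],
})

c1 = df.B.lt(df.C)                  # condition B < C
g = df.B.eq(1).cumsum()             # group id increases at every 1 in B
# Number of rows still to come inside the same group.
remaining = c1.groupby(g).cumcount(ascending=False)
out = remaining.shift().where(c1).sub(1)

steps = pd.DataFrame({"B": df.B, "c1": c1, "g": g,
                      "remaining": remaining, "out": out})
print(steps)
```

One thing this makes visible: `shift()` always leaves the first row as NaN, so if the very first row satisfied B < C it would get NaN instead of a count. That does not happen in the sample data.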

answered 9 hours ago

yatu


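You can use some masking and a groupby on the reversed series. This assumes binary data (only 0 and 1):

# True where B is 0, walking the column from bottom to top
m = df['B'][::-1].eq(0)
# count of zeros so far within each run (from the bottom), minus one
d = m.groupby(m.ne(m.shift()).cumsum()).cumsum().sub(1)
# restore the original order and keep only rows where B < C
d[::-1].where(df['B'] < df['C'])

0    NaN
1    1.0
2    NaN
3    NaN
4    1.0
5    0.0
6    NaN
Name: B, dtype: float64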

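And a fast numpy-based approach:

import numpy as np

def zero_until_one(a, b):
    n = a.shape[0]
    x = np.flatnonzero(a < b)    # positions where B < C
    y = np.flatnonzero(a == 1)   # positions of the 1s in B
    d = np.searchsorted(y, x)    # index in y of the first 1 at or after each x
    r = y[d] - x - 1             # zeros strictly between x and that 1
    out = np.full(n, np.nan)
    out[x] = r
    return out

zero_until_one(df['B'], df['C'])

array([nan, 1., nan, nan, 1., 0., nan])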
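One edge case worth knowing about: if a row satisfies B < C but no 1 follows it, `np.searchsorted` returns `len(y)` and `y[d]` raises an IndexError. A possible guard is sketched below. The name `zero_until_one_safe` is made up here, and returning NaN for such rows is an assumption, since the question does not say what should happen in that case.

```python
import numpy as np

def zero_until_one_safe(a, b):
    """Variant of the searchsorted idea that tolerates a missing trailing 1.

    Assumes binary data; rows with no later 1 are left as NaN (assumption).
    """
    a = np.asarray(a)
    b = np.asarray(b)
    out = np.full(a.shape[0], np.nan)
    x = np.flatnonzero(a < b)      # positions where B < C
    y = np.flatnonzero(a == 1)     # positions of the 1s in B
    d = np.searchsorted(y, x)      # candidate "next 1" for each x
    ok = d < y.size                # only these x really have a later 1
    out[x[ok]] = y[d[ok]] - x[ok] - 1
    return out

# Sample data from the question with the final row (the last 1) dropped.
print(zero_until_one_safe([1, 0, 0, 1, 0, 0], [1, 1, 0, 0, 1, 1]))
```

On the original seven-row sample it gives the same result as `zero_until_one`.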

Performance

df = pd.concat([df]*10_000)

%timeit chris1(df)
19.3 ms ± 348 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit yatu(df)
12.8 ms ± 54.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit zero_until_one(df['B'], df['C'])
2.32 ms ± 31.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
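`%timeit` only works inside IPython, and the wrappers `chris1` and `yatu` are not defined in the thread. To rerun a comparison as a plain script, something along these lines should work. The function names `pandas_cumcount` and `numpy_searchsorted` are placeholders chosen here, `ignore_index=True` is added to avoid a duplicated index, and timings will differ by machine and library version, so none are quoted.

```python
import timeit

import numpy as np
import pandas as pd

base = pd.DataFrame({
    "A": [1, 2, 3, 4, 5, 6, 7],
    "B": [1, 0, 0, 1, 0, 0, 1],
    "C": [1, 1, 0, 0, 1, 1, 0],
})
big = pd.concat([base] * 10_000, ignore_index=True)

def pandas_cumcount(df):
    # Placeholder wrapper around the groupby/cumcount answer.
    c1 = df.B.lt(df.C)
    g = df.B.eq(1).cumsum()
    return c1.groupby(g).cumcount(ascending=False).shift().where(c1).sub(1)

def numpy_searchsorted(df):
    # Placeholder wrapper around the searchsorted answer.
    a = df["B"].to_numpy()
    b = df["C"].to_numpy()
    x = np.flatnonzero(a < b)
    y = np.flatnonzero(a == 1)
    out = np.full(a.shape[0], np.nan)
    out[x] = y[np.searchsorted(y, x)] - x - 1
    return out

# Check that both give the same answer before comparing speed.
same = np.array_equal(pandas_cumcount(big).to_numpy(),
                      numpy_searchsorted(big), equal_nan=True)
print("results agree:", same)

for fn in (pandas_cumcount, numpy_searchsorted):
    best = min(timeit.repeat(lambda: fn(big), number=10, repeat=3)) / 10
    print(f"{fn.__name__}: {best * 1e3:.2f} ms per call")
```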

edited 8 hours ago

answered 9 hours ago

user3483203


Great idea for the numpy function. My guess is that numba may be faster.

– WeNYoBen
8 hours ago
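For what it's worth, the numba idea could look roughly like the single backward pass below. This is an untimed sketch, so no speed claim is made. The function name `zero_until_one_loop` is made up, numba is treated as optional (the code falls back to plain Python if it is not installed), and rows with no later 1 are left as NaN by assumption.

```python
import numpy as np

try:
    from numba import njit  # optional dependency
except ImportError:
    def njit(func):
        # Fallback: run as plain Python when numba is not installed.
        return func

@njit
def zero_until_one_loop(a, b):
    n = a.shape[0]
    out = np.full(n, np.nan)
    run = 0           # zeros seen below the current row since the nearest 1
    seen_one = False  # becomes True once a 1 has been seen below
    for i in range(n - 1, -1, -1):
        if a[i] < b[i] and seen_one:
            out[i] = run
        if a[i] == 1:
            run = 0
            seen_one = True
        elif a[i] == 0:
            run += 1
    return out

B = np.array([1, 0, 0, 1, 0, 0, 1])
C = np.array([1, 1, 0, 0, 1, 1, 0])
print(zero_until_one_loop(B, C))
```

Pass numpy arrays rather than Series, for example `df['B'].to_numpy()`.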


Let us push it into one line:

df.groupby(df.B.iloc[::-1].cumsum()).cumcount(ascending=False).shift(-1).where(df.B<df.C)
Out[80]: 
0 NaN
1 1.0
2 NaN
3 NaN
4 1.0
5 0.0
6 NaN
dtype: float64
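Unpacked into named steps, the one-liner reads as follows. This is a sketch on the sample frame from the question. The names `key`, `remaining` and `result` are chosen here, and the `sort_index()` call is added only so the steps line up when printed (groupby aligns the key on the index by itself).

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5, 6, 7],
    "B": [1, 0, 0, 1, 0, 0, 1],
    "C": [1, 1, 0, 0, 1, 1, 0],
})

# Running total of 1s counted from the bottom: each 1 shares a key
# with the zeros directly above it.
key = df.B.iloc[::-1].cumsum().sort_index()
# Rows still to come inside each of those groups.
remaining = df.groupby(key).cumcount(ascending=False)
# Take the value from the next row and keep it only where B < C.
result = remaining.shift(-1).where(df.B < df.C)

print(pd.DataFrame({"B": df.B, "key": key,
                    "remaining": remaining, "result": result}))
```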

answered 8 hours ago

WeNYoBen
