How to create new column in dataframe from existing column using conditions


I have a single column containing all the data, which looks something like this (the values that need to be separated out are marked with (c)):

UK (c)
London
Wales
Liverpool
US (c)
Chicago
New York
San Francisco
Seattle
Australia (c)
Sydney
Perth

And I want it split into two columns looking like this:

London UK
Wales UK
Liverpool UK
Chicago US
New York US
San Francisco US
Seattle US
Sydney Australia
Perth Australia

Q2. What if the countries did not have a pattern like (c)?

edited 7 hours ago

asked 8 hours ago

Tsatsa

python pandas dataframe series


5 Answers

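Step by step with endswith and ffill + str.replace

# Copy the marker rows ('UK (c)', ...) into a new column; all other rows become NaN
df['country'] = df.loc[df['city'].str.endswith('(c)'), 'city']
# Carry each country down to the cities listed beneath it
df['country'] = df['country'].ffill()
# Drop the marker rows, where city and country are still identical
df = df[df['city'].ne(df['country'])].copy()
# Remove the trailing ' (c)'; str.strip('(c)') strips the characters '(', 'c', ')' and leaves a space
df['country'] = df['country'].str.replace(r'\s*\(c\)$', '', regex=True)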
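As a side note on the last step: `str.strip` takes a set of characters rather than a suffix, so it can remove more than the marker. The following is a minimal sketch of that behaviour, assuming made-up values (the lowercase name is hypothetical and chosen only to trigger the problem), compared with an anchored `str.replace`:

```python
import pandas as pd

# Hypothetical values; the lowercase name is chosen to expose the pitfall.
s = pd.Series(['UK (c)', 'china (c)'])

# str.strip('(c)') removes any of the characters '(', 'c', ')' from both ends,
# so it stops at the space on the right and eats a leading 'c' on the left.
stripped = s.str.strip('(c)')

# An anchored regex removes only a trailing '(c)' and the whitespace before it.
cleaned = s.str.replace(r'\s*\(c\)$', '', regex=True)

print(stripped.tolist())
print(cleaned.tolist())
```

With the sample data in the question the difference is only a trailing space, but the anchored pattern is the safer choice if the names are not controlled.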

edited 8 hours ago

answered 8 hours ago

WeNYoBen


What if the countries did not have a pattern like (c)?

– Tsatsa
7 hours ago

@Tsatsa In that case you may need to build a list of countries and use isin.

– WeNYoBen
7 hours ago
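The suggestion in this comment could look roughly like the sketch below. It assumes the set of country names is known in advance (the `countries` list here is hypothetical) and that the column is called `city`, as in the answer above:

```python
import pandas as pd

# Same shape as the question's data, but without the '(c)' marker.
df = pd.DataFrame({'city': ['UK', 'London', 'Wales', 'Liverpool',
                            'US', 'Chicago', 'New York',
                            'Australia', 'Sydney', 'Perth']})

# Assumption: every country that can appear is listed here.
countries = ['UK', 'US', 'Australia']

# True on the rows that hold a country name.
is_country = df['city'].isin(countries)

# Keep the value on country rows, NaN elsewhere, then forward-fill downwards.
df['country'] = df['city'].where(is_country).ffill()

# Drop the country rows themselves.
result = df[~is_country].reset_index(drop=True)
print(result)
```

One caveat of this approach is that a city whose name also appears in the list would be treated as a country row.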


`extract` and `ffill`

Start with extract and ffill, then remove the redundant rows.

df['country'] = (
    df['data'].str.extract(r'(.*)\s+\(c\)', expand=False).ffill())
df[~df['data'].str.contains('(c)', regex=False)].reset_index(drop=True)

 data country
0 London UK
1 Wales UK
2 Liverpool UK
3 Chicago US
4 New York US
5 San Francisco US
6 Seattle US
7 Sydney Australia
8 Perth Australia

Where,

df['data'].str.extract(r'(.*)\s+\(c\)', expand=False).ffill()

0 UK
1 UK
2 UK
3 UK
4 US
5 US
6 US
7 US
8 US
9 Australia
10 Australia
11 Australia
Name: country, dtype: object

The pattern r'(.*)\s+\(c\)' matches strings of the form "country (c)" and captures the country name. The parentheses around c are escaped so that they match literally. Rows that do not match the pattern become NaN, so a forward fill conveniently propagates each country down to the rows below it.
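To see the intermediate result before the forward fill, here is a small sketch on a shortened, made-up version of the data (the column name `data` follows this answer):

```python
import pandas as pd

df = pd.DataFrame({'data': ['UK (c)', 'London', 'Wales', 'US (c)', 'Chicago']})

# One capture group for the name; \( and \) match the literal parentheses.
extracted = df['data'].str.extract(r'(.*)\s+\(c\)', expand=False)

# Only the marker rows produce a value; every other row is NaN.
print(extracted.isna().tolist())

# Forward-filling copies each country onto the rows beneath it.
filled = extracted.ffill()
print(filled.tolist())
```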

`split` with `np.where` and `ffill`

This splits on " (c)", so only the country rows produce two parts.

u = df['data'].str.split(r'\s+\(c\)')
df['country'] = pd.Series(np.where(u.str.len() == 2, u.str[0], np.nan), index=df.index).ffill()

df[~df['data'].str.contains('(c)', regex=False)].reset_index(drop=True)

 data country
0 London UK
1 Wales UK
2 Liverpool UK
3 Chicago US
4 New York US
5 San Francisco US
6 Seattle US
7 Sydney Australia
8 Perth Australia

edited 8 hours ago

answered 8 hours ago

cs95


extract(r'(.*)\s+\(c\)') saves you from .str.strip().

– Quang Hoang
8 hours ago

@QuangHoang Yes, it works. Thanks!

– cs95
8 hours ago


You can first use str.extract to locate the entries ending in (c) and extract the country name, then ffill to populate a new country column.

The same extracted matches can be used to locate the rows to drop, i.e. the rows where the match is notna:

m = df['city'].str.extract(r'^(.*?)\s*(?=\(c\)$)', expand=False)
ix = m[m.notna()].index
df['country'] = m.ffill()
df.drop(ix)

 city country
1 London UK 
2 Wales UK 
3 Liverpool UK 
5 Chicago US 
6 New York US 
7 San Francisco US 
8 Seattle US 
10 Sydney Australia 
11 Perth Australia

edited 8 hours ago

answered 8 hours ago

yatu


You can use np.where with str.contains too:

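mask = df['places'].str.contains('(c)', regex=False)
df['country'] = np.where(mask, df['places'], np.nan)
# regex=False removes the literal '(c)'; str.strip drops the leftover trailing space
df['country'] = df['country'].str.replace('(c)', '', regex=False).str.strip().ffill()
df = df[~mask]
df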
 places country
1 London UK 
2 Wales UK 
3 Liverpool UK 
5 Chicago US 
6 New York US 
7 San Francisco US 
8 Seattle US 
10 Sydney Australia 
11 Perth Australia

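str.contains looks for the literal (c) and returns True for the rows that contain it. Where this condition is True, np.where copies the value into the country column; everywhere else it puts NaN, which ffill then replaces with the country from the nearest row above.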
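The intermediate values can be inspected with a short sketch like the one below. It uses a shortened, made-up sample and builds the country values as a standalone Series (`country` is a name introduced here for illustration) instead of assigning straight into the frame:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'places': ['UK (c)', 'London', 'US (c)', 'Chicago', 'Seattle']})

# True only on the rows carrying the literal '(c)' marker.
mask = df['places'].str.contains('(c)', regex=False)
print(mask.tolist())

# Marker rows keep their text, all other rows become NaN.
country = pd.Series(np.where(mask, df['places'], np.nan), index=df.index)

# Remove the marker, trim the leftover space, then forward-fill the gaps.
country = country.str.replace('(c)', '', regex=False).str.strip().ffill()
print(country.tolist())
```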

edited 8 hours ago

answered 8 hours ago

Mohit Motwani


You could do the following:

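import pandas as pd

data = ['UK (c)','London','Wales','Liverpool','US (c)','Chicago','New York','San Francisco','Seattle','Australia (c)','Sydney','Perth']
df = pd.DataFrame(data, columns=['city'])
# Marker rows yield the country name; every other row yields None
df['country'] = df['city'].apply(lambda x: x.replace('(c)', '').strip() if '(c)' in x else None)
df['country'] = df['country'].ffill()
# regex=False matches the literal '(c)'; as a regex, '(c)' would match any 'c' (e.g. in 'Chicago')
df = df[~df['city'].str.contains('(c)', regex=False)]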

Output

+-----+----------------+-----------+
| | city | country |
+-----+----------------+-----------+
| 1 | London | UK |
| 2 | Wales | UK |
| 3 | Liverpool | UK |
| 5 | Chicago | US |
| 6 | New York | US |
| 7 | San Francisco | US |
| 8 | Seattle | US |
| 10 | Sydney | Australia |
| 11 | Perth | Australia |
+-----+----------------+-----------+

edited 8 hours ago

answered 8 hours ago

Sebastien D
