Split a six-digit number column into separate columns with one digit each


How can I use pandas or numpy to split one column of six-digit integers into six columns with one digit each?

import pandas as pd
import numpy as np

df = pd.Series(range(123456,123465))

df = pd.DataFrame(df)

df.head()

What I have looks like this:

Number
654321
223344

The desired outcome should look like this:

Number | x1 | x2 | x3 | x4 | x5 | x6 |
654321 | 6 | 5 | 4 | 3 | 2 | 1 |
223344 | 2 | 2 | 3 | 3 | 4 | 4 |

edited 7 hours ago

asked 8 hours ago

msalem85


If you don't have to use numpy or pandas - for num in str(my_number): print(num)

– wcarhart
8 hours ago

What is the source of your data? Are you given a numpy.array or a pandas.DataFrame, or do you just get text with numbers separated by newlines?

– Daweo
7 hours ago

python pandas numpy


8 Answers

MCVE

Here is a simple suggestion:

import pandas as pd

# MCVE dataframe:
df = pd.DataFrame([123456, 456789, 135797, 123, 123456789], columns=['number'])

def digit(x, n):
    """Return the n-th digit (0 = least significant) of an integer in base 10"""
    return (x // 10**n) % 10

def digitize(df, key, n):
    """Extract the n least significant digits from an integer in base 10"""
    for i in range(n):
        df['x%d' % i] = digit(df[key], n-i-1)

# Apply function on dataframe (inplace):
digitize(df, 'number', 6)

For the trial dataframe, it returns:

 number x0 x1 x2 x3 x4 x5
0 123456 1 2 3 4 5 6
1 456789 4 5 6 7 8 9
2 135797 1 3 5 7 9 7
3 123 0 0 0 1 2 3
4 123456789 4 5 6 7 8 9

Observations

This method avoids the need to cast into string and then cast again to int.

It relies on modular integer arithmetic; the operations are detailed below:

10**3 # int: 1000 (integer power)
54321 // 10**3 # int: 54 (quotient of integer division)
(54321 // 10**3) % 10 # int: 4 (remainder of integer division, modulo)

Last but not least, it is fail-safe and exact for numbers with fewer than n digits (the missing leading digits come back as 0) and for numbers with more than n digits (in that case it returns the n least significant digits).
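To make the padding and truncation behaviour concrete, here is a minimal sketch in plain Python (no pandas). The digits helper is a hypothetical name introduced only for this illustration; the two inputs are the last two rows of the trial dataframe above.

```python
def digit(x, n):
    """Return the n-th base-10 digit of x, counting from 0 at the least significant end."""
    return (x // 10**n) % 10

def digits(x, width):
    """Hypothetical helper: the `width` least significant digits of x, most significant first."""
    return [digit(x, width - i - 1) for i in range(width)]

short = digits(123, 6)        # fewer than 6 digits: padded with leading zeros
long_ = digits(123456789, 6)  # more than 6 digits: only the last 6 are kept
print(short)
print(long_)
```

Both results agree with the x0..x5 columns shown for 123 and 123456789 in the table above.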

edited 7 hours ago

answered 8 hours ago

jlandercy


Get rid of apply; you can simply do digit(df['Number'], i).

– Quang Hoang
8 hours ago

@QuangHoang Thank you for pointing this out. Is there any performance benefit in addition to code compactness and readability?

– jlandercy
8 hours ago

Without apply, it's vectorized, so you would see a big improvement in terms of speed.

– Quang Hoang
8 hours ago

@QuangHoang updated thank you

– jlandercy
8 hours ago
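For readers following this exchange, a small sketch of the two call styles being compared. The apply version is an assumption about what the answer looked like before the edit (it is not shown in the thread); the dataframe is a cut-down copy of the one in the answer.

```python
import pandas as pd

def digit(x, n):
    """Return the n-th base-10 digit of x (0 = least significant)."""
    return (x // 10**n) % 10

df = pd.DataFrame({'number': [123456, 456789, 135797]})

# Element by element: digit() is called from Python once per row
slow = df['number'].apply(lambda v: digit(v, 5))

# Vectorized: digit() is called once and the arithmetic runs on the whole Series
fast = digit(df['number'], 5)

print(slow.tolist(), fast.tolist())
```

Both give the same leading digits; only the second avoids the per-row Python call.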


Some fun with views, assuming that each number has 6 digits:

u = df[['Number']].to_numpy().astype('U6').view('U1').astype(int)

df.join(pd.DataFrame(u).rename(columns=lambda c: f'x{c+1}'))

 Number x1 x2 x3 x4 x5 x6
0 654321 6 5 4 3 2 1
1 223344 2 2 3 3 4 4
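A step-by-step sketch of what the view trick does, using a bare numpy array instead of the dataframe (the variable names are mine, not the answer's). The idea is that astype('U6') stores every number as a fixed-width 6-character string, and view('U1') reinterprets that same memory as single characters.

```python
import numpy as np

a = np.array([[654321], [223344]])  # shape (2, 1), one number per row
s = a.astype('U6')                  # shape (2, 1), each item a 6-character string
chars = s.view('U1')                # same buffer read as 1-character strings: shape (2, 6)
digits = chars.astype(int)          # back to integers, one digit per column

print(chars.shape)
print(digits)
```

This only works when every number really has 6 digits: the fixed width is what lets the reinterpretation line up into 6 columns.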

answered 8 hours ago

user3483203


Impressive one-liner, although it breaks if there are numbers with different number of digits.

– jdehesa
8 hours ago

Yea, that assumption has to be made, definitely more of a trick than something to use.

– user3483203
8 hours ago


Turn it into a string first!

Also, I included a zfill in case not all numbers have 6 digits.

dat = [list(map(int, str(x).zfill(6))) for x in df.Number]
d = pd.DataFrame(dat, df.index).rename(columns=lambda x: f'x{x + 1}')
df.join(d)

 Number x1 x2 x3 x4 x5 x6
0 654321 6 5 4 3 2 1
1 223344 2 2 3 3 4 4

Details

This gets the digits

dat = [list(map(int, str(x).zfill(6))) for x in df.Number]
dat

[[6, 5, 4, 3, 2, 1], [2, 2, 3, 3, 4, 4]]

This creates a new dataframe with the same index as df AND renames the columns to have an 'x' in front and begin with 'x1' and not 'x0'

d = pd.DataFrame(dat, df.index).rename(columns=lambda x: f'x{x + 1}')
d

 x1 x2 x3 x4 x5 x6
0 6 5 4 3 2 1
1 2 2 3 3 4 4
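A short sketch of what zfill contributes here, on single values rather than the dataframe (the sample values 123 and 1234567 are my own). It pads on the left with zeros up to the requested width, but it never shortens a longer string.

```python
padded = str(123).zfill(6)             # '000123': left-padded with zeros to width 6
padded_digits = list(map(int, padded))

too_long = str(1234567).zfill(6)       # already longer than 6, returned unchanged
too_long_digits = list(map(int, too_long))

print(padded_digits)
print(too_long_digits)
```

So shorter numbers are handled, while a number with more than 6 digits would produce a longer row than the others.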

answered 8 hours ago

piRSquared


While string-based solutions are simpler and probably good enough in most cases, you can do this with math which, if you have a big data set, can make a significant difference in speed.

import numpy as np
import pandas as pd

df = pd.DataFrame({'Number': [654321, 223344]})
# Caveat: because of the "- 1", this gives one column too few when the maximum is an exact power of ten (e.g. 1000)
num_cols = int(np.log10(df['Number'].max() - 1)) + 1
vals = (df['Number'].values[:, np.newaxis] // (10 ** np.arange(num_cols - 1, -1, -1))) % 10
df_digits = pd.DataFrame(vals, columns=[f'x{i + 1}' for i in range(num_cols)])
df2 = pd.concat([df, df_digits], axis=1)
print(df2)
# Number x1 x2 x3 x4 x5 x6
# 0 654321 6 5 4 3 2 1
# 1 223344 2 2 3 3 4 4
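The core of this approach is the broadcast division, which may be easier to follow when split into named steps. This is a sketch with the number of columns fixed at 6 and intermediate names (powers, quotients) that are mine, not the answer's.

```python
import numpy as np

numbers = np.array([654321, 223344])
powers = 10 ** np.arange(5, -1, -1)   # [100000, 10000, 1000, 100, 10, 1]

# A (2, 1) column divided by a (6,) row broadcasts to a (2, 6) grid of quotients
quotients = numbers[:, np.newaxis] // powers
digits = quotients % 10               # keep only the last digit of each quotient

print(quotients)
print(digits)
```

Each column of quotients is the number with its trailing digits chopped off, and the modulo keeps the one digit that is left at the end.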

answered 8 hours ago

jdehesa


I definitely like this approach. I'm trying to make this prettier (-:

– piRSquared
7 hours ago


vals = (df.to_numpy() // 10 ** np.arange(6) % 10)[:, ::-1] Obviously, assumptions have to be made. I basically made some golf improvements at the expense of generalization.

– piRSquared
6 hours ago


You could use np.unravel_index

import numpy as np
import pandas as pd
from timeit import timeit

df = pd.DataFrame({'Number': [654321,223344]})

def split_digits(df):
    # get data as numpy array
    numbers = df['Number'].to_numpy()
    # extract digits
    digits = np.unravel_index(numbers, 6*(10,))
    # create column headers
    columns = ['Number', *(f'x{i}' for i in "123456")]
    # build and return new data frame
    return pd.DataFrame(np.stack([numbers, *digits], axis=1), columns=columns, index=df.index)


split_digits(df)
# Number x1 x2 x3 x4 x5 x6
# 0 654321 6 5 4 3 2 1
# 1 223344 2 2 3 3 4 4

timeit(lambda:split_digits(df),number=1000)
# 0.3550272472202778

Thanks @GZ0 for some pandas tips.
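Why unravel_index yields digits may not be obvious, so here is a sketch of the idea on a bare array. A number below 10**6 can be read as a flat index into a 10x10x10x10x10x10 array, and its coordinates in that array are exactly its six decimal digits. The round trip through np.ravel_multi_index is my addition, included only as a sanity check.

```python
import numpy as np

numbers = np.array([654321, 223344])
shape = 6 * (10,)                          # (10, 10, 10, 10, 10, 10)

# Tuple of 6 arrays, one per digit position, most significant first
digits = np.unravel_index(numbers, shape)
as_rows = np.stack(digits, axis=1)

# Going the other way rebuilds the original numbers
rebuilt = np.ravel_multi_index(digits, shape)

print(as_rows)
print(rebuilt)
```

As with the answer itself, this assumes every number fits in 6 digits; larger values are outside the index range of the shape.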

edited 56 mins ago

answered 6 hours ago

Paul Panzer


This is an excellent trick and one-liner, @Paul, +1. What does ** do in assign? Would you mind explaining the code?

– Karn Kumar
5 hours ago

@KarnKumar ** "unrolls" the dictionary, so each key-value pair becomes a keyword argument to the function (assign in this case). Btw. I don't know much about pandas, so this part of the code may be far from optimal.

– Paul Panzer
5 hours ago

@KarnKumar I've made an annotated version in case you are interested.

– Paul Panzer
5 hours ago

One alternative way to return a new data frame using digits is df.assign(**dict(zip((f'x{i}' for i in range(1,7)), digits))). Also, df['Number'] can be used as a numpy array directly without explicitly accessing the .values attribute.

– GZ0
4 hours ago

@PaulPanzer Your solution is indeed a lot more performant. df.assign makes a copy of the original dataframe and then adds columns one by one. The df.copy() call actually takes a lot more time than adding the columns, for some unknown reason. IMO there are two things that could be improved in your solution though: (1) in pandas version >= 0.24.0, df.to_numpy() is recommended in favor of df.values; (2) the index of the original data frame should be preserved by passing index=df.index into the constructor function.

– GZ0
2 hours ago

Assuming that all numbers have the same length (an equal number of digits), I would do it the following way using numpy:

import numpy as np
a = np.array([[654321],[223344]])
str_a = a.astype(str)
out = np.apply_along_axis(lambda x:list(x[0]),1,str_a)
print(out)

Output:

[['6' '5' '4' '3' '2' '1']
 ['2' '2' '3' '3' '4' '4']]

Note that out is currently an np.array of strs; you can convert it to int if the need arises.
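A minimal sketch of that conversion, repeating the code above and adding one astype call (the name out_int is mine):

```python
import numpy as np

a = np.array([[654321], [223344]])
str_a = a.astype(str)
out = np.apply_along_axis(lambda x: list(x[0]), 1, str_a)

# Each element is a one-character string, so a plain cast is enough
out_int = out.astype(int)
print(out_int)
```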

answered 8 hours ago

Daweo


I really liked @user3483203's answer. I think .str.findall could work with any number of digits:

df = pd.DataFrame({
    'Number' : [65432178888, 22334474343]
})

u = df['Number'].astype(str).str.findall(r'(\w)')
df.join(pd.DataFrame(list(u)).rename(columns=lambda c: f'x{c+1}')).apply(pd.to_numeric)

 Number x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11
0 65432178888 6 5 4 3 2 1 7 8 8 8 8
1 22334474343 2 2 3 3 4 4 7 4 3 4 3

edited 7 hours ago

answered 8 hours ago

political scientist


Simple way around:

>>> df
 number
0 123456
1 456789
2 135797

First, convert the column to strings:

>>> df['number'] = df['number'].astype(str)

Create the new columns using string indexing

>>> df['x1'] = df['number'].str[0]
>>> df['x2'] = df['number'].str[1]
>>> df['x3'] = df['number'].str[2]
>>> df['x4'] = df['number'].str[3]
>>> df['x5'] = df['number'].str[4]
>>> df['x6'] = df['number'].str[5]

>>> df
 number x1 x2 x3 x4 x5 x6
0 123456 1 2 3 4 5 6
1 456789 4 5 6 7 8 9
2 135797 1 3 5 7 9 7

>>> df.drop('number', axis=1, inplace=True)
>>> df
 x1 x2 x3 x4 x5 x6
0 1 2 3 4 5 6
1 4 5 6 7 8 9
2 1 3 5 7 9 7

Another trick, with str.split():

>>> df = df['number'].str.split(r'(\d{1})', expand=True).add_prefix('x').drop(columns=['x0', 'x2', 'x4', 'x6', 'x8', 'x10', 'x12'])
>>> df
 x1 x3 x5 x7 x9 x11
0 1 2 3 4 5 6
1 4 5 6 7 8 9
2 1 3 5 7 9 7

>>> df.rename(columns={'x3':'x2', 'x5':'x3', 'x7':'x4', 'x9':'x5', 'x11':'x6'})
 x1 x2 x3 x4 x5 x6
0 1 2 3 4 5 6
1 4 5 6 7 8 9
2 1 3 5 7 9 7

OR

>>> df = df['number'].str.split(r'(\d{1})', expand=True).T.replace('', np.nan).dropna().T

>>> df
 1 3 5 7 9 11
0 1 2 3 4 5 6
1 4 5 6 7 8 9
2 1 3 5 7 9 7

>>> df.rename(columns={1:'x1', 3:'x2', 5:'x3', 7:'x4', 9:'x5', 11:'x6'})
 x1 x2 x3 x4 x5 x6
0 1 2 3 4 5 6
1 4 5 6 7 8 9
2 1 3 5 7 9 7
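The six string-indexing assignments from the first variant can also be written as a loop. This is only a sketch under the same assumption that every number has 6 digits; the astype(int) at the end is my addition so that the new columns hold integers rather than strings.

```python
import pandas as pd

df = pd.DataFrame({'number': [123456, 456789, 135797]})
as_text = df['number'].astype(str)

for i in range(6):
    # .str[i] picks the i-th character of every string in the Series
    df[f'x{i + 1}'] = as_text.str[i].astype(int)

print(df)
```

Unlike the session above, this keeps the original number column as integers, because the string conversion is done on a separate Series.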

edited 5 hours ago

answered 7 hours ago

Karn Kumar

3,7081 gold badge7 silver badges22 bronze badges

add a comment |

Your Answer

StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f57792952%2fsplit-a-six-digits-number-column-into-separated-columns-with-one-digit%23new-answer', 'question_page');

);

Post as a guest

Name

Required, but never shown

8 Answers
8

active

oldest

votes

8 Answers
8

active

oldest

votes

MCVE

Here is a simple suggestion:

import pandas as pd

# MCVE dataframe:
df = pd.DataFrame([123456, 456789, 135797, 123, 123456789], columns=['number'])

def digit(x, n):
 """Return the n-th digit of integer in base 10"""
 return (x // 10**n) % 10

def digitize(df, key, n):
 """Extract n less significant digits from an integer in base 10"""
 for i in range(n):
 df['x%d' % i] = digit(df[key], n-i-1)

# Apply function on dataframe (inplace):
digitize(df, 'number', 6)

For the trial dataframe, it returns:

 number x0 x1 x2 x3 x4 x5
0 123456 1 2 3 4 5 6
1 456789 4 5 6 7 8 9
2 135797 1 3 5 7 9 7
3 123 0 0 0 1 2 3
4 123456789 4 5 6 7 8 9

Observations

This method avoids the need to cast into string and then cast again to int.

It relies on modular integer arithmetic, bellow details of operations:

10**3 # int: 1000 (integer power)
54321 // 10**3 # int: 54 (quotient of integer division)
(54321 // 10**3) % 10 # int: 4 (remainder of integer division, modulo)

Last but not least, it is fail safe and exact for number shorter than n digits or greater than (notice it returns the n less significant digits in latter case).

edited 7 hours ago

answered 8 hours ago

jlandercy

1,9761 gold badge17 silver badges31 bronze badges

1

get rid off apply, you can simply do digit(df['Number'], i).

– Quang Hoang
8 hours ago

@QuangHoang Thank you for pointing this out, is there any benefit (performance) alongside with code compactness and readability?

– jlandercy
8 hours ago

Without apply, it's vectorized, so you would see big improvement in terms of speed.

– Quang Hoang
8 hours ago

@QuangHoang updated thank you

– jlandercy
8 hours ago

add a comment |

MCVE

Here is a simple suggestion:

import pandas as pd

# MCVE dataframe:
df = pd.DataFrame([123456, 456789, 135797, 123, 123456789], columns=['number'])

def digit(x, n):
 """Return the n-th digit of integer in base 10"""
 return (x // 10**n) % 10

def digitize(df, key, n):
 """Extract n less significant digits from an integer in base 10"""
 for i in range(n):
 df['x%d' % i] = digit(df[key], n-i-1)

# Apply function on dataframe (inplace):
digitize(df, 'number', 6)

For the trial dataframe, it returns:

 number x0 x1 x2 x3 x4 x5
0 123456 1 2 3 4 5 6
1 456789 4 5 6 7 8 9
2 135797 1 3 5 7 9 7
3 123 0 0 0 1 2 3
4 123456789 4 5 6 7 8 9

Observations

This method avoids the need to cast into string and then cast again to int.

It relies on modular integer arithmetic, bellow details of operations:

10**3 # int: 1000 (integer power)
54321 // 10**3 # int: 54 (quotient of integer division)
(54321 // 10**3) % 10 # int: 4 (remainder of integer division, modulo)

Last but not least, it is fail safe and exact for number shorter than n digits or greater than (notice it returns the n less significant digits in latter case).

edited 7 hours ago

answered 8 hours ago

jlandercy

1,9761 gold badge17 silver badges31 bronze badges

1

get rid off apply, you can simply do digit(df['Number'], i).

– Quang Hoang
8 hours ago

@QuangHoang Thank you for pointing this out, is there any benefit (performance) alongside with code compactness and readability?

– jlandercy
8 hours ago

Without apply, it's vectorized, so you would see big improvement in terms of speed.

– Quang Hoang
8 hours ago

@QuangHoang updated thank you

– jlandercy
8 hours ago

add a comment |

MCVE

Here is a simple suggestion:

import pandas as pd

# MCVE dataframe:
df = pd.DataFrame([123456, 456789, 135797, 123, 123456789], columns=['number'])

def digit(x, n):
 """Return the n-th digit of integer in base 10"""
 return (x // 10**n) % 10

def digitize(df, key, n):
 """Extract n less significant digits from an integer in base 10"""
 for i in range(n):
 df['x%d' % i] = digit(df[key], n-i-1)

# Apply function on dataframe (inplace):
digitize(df, 'number', 6)

For the trial dataframe, it returns:

 number x0 x1 x2 x3 x4 x5
0 123456 1 2 3 4 5 6
1 456789 4 5 6 7 8 9
2 135797 1 3 5 7 9 7
3 123 0 0 0 1 2 3
4 123456789 4 5 6 7 8 9

Observations

This method avoids the need to cast into string and then cast again to int.

It relies on modular integer arithmetic, bellow details of operations:

10**3 # int: 1000 (integer power)
54321 // 10**3 # int: 54 (quotient of integer division)
(54321 // 10**3) % 10 # int: 4 (remainder of integer division, modulo)

Last but not least, it is fail safe and exact for number shorter than n digits or greater than (notice it returns the n less significant digits in latter case).

edited 7 hours ago

answered 8 hours ago

jlandercy

1,9761 gold badge17 silver badges31 bronze badges

MCVE

Here is a simple suggestion:

import pandas as pd

# MCVE dataframe:
df = pd.DataFrame([123456, 456789, 135797, 123, 123456789], columns=['number'])

def digit(x, n):
 """Return the n-th digit of integer in base 10"""
 return (x // 10**n) % 10

def digitize(df, key, n):
 """Extract n less significant digits from an integer in base 10"""
 for i in range(n):
 df['x%d' % i] = digit(df[key], n-i-1)

# Apply function on dataframe (inplace):
digitize(df, 'number', 6)

For the trial dataframe, it returns:

 number x0 x1 x2 x3 x4 x5
0 123456 1 2 3 4 5 6
1 456789 4 5 6 7 8 9
2 135797 1 3 5 7 9 7
3 123 0 0 0 1 2 3
4 123456789 4 5 6 7 8 9

Observations

This method avoids the need to cast into string and then cast again to int.

It relies on modular integer arithmetic, bellow details of operations:

10**3 # int: 1000 (integer power)
54321 // 10**3 # int: 54 (quotient of integer division)
(54321 // 10**3) % 10 # int: 4 (remainder of integer division, modulo)

Last but not least, it is fail safe and exact for number shorter than n digits or greater than (notice it returns the n less significant digits in latter case).

edited 7 hours ago

answered 8 hours ago

jlandercy

1,9761 gold badge17 silver badges31 bronze badges

edited 7 hours ago

answered 8 hours ago

jlandercy

1,9761 gold badge17 silver badges31 bronze badges

answered 8 hours ago

jlandercy

1,9761 gold badge17 silver badges31 bronze badges

answered 8 hours ago

jlandercy

1,9761 gold badge17 silver badges31 bronze badges

1

get rid off apply, you can simply do digit(df['Number'], i).

– Quang Hoang
8 hours ago

@QuangHoang Thank you for pointing this out, is there any benefit (performance) alongside with code compactness and readability?

– jlandercy
8 hours ago

Without apply, it's vectorized, so you would see big improvement in terms of speed.

– Quang Hoang
8 hours ago

@QuangHoang updated thank you

– jlandercy
8 hours ago

add a comment |

1

get rid off apply, you can simply do digit(df['Number'], i).

– Quang Hoang
8 hours ago

@QuangHoang Thank you for pointing this out, is there any benefit (performance) alongside with code compactness and readability?

– jlandercy
8 hours ago

Without apply, it's vectorized, so you would see big improvement in terms of speed.

– Quang Hoang
8 hours ago

@QuangHoang updated thank you

– jlandercy
8 hours ago

get rid off apply, you can simply do digit(df['Number'], i).

– Quang Hoang
8 hours ago

@QuangHoang Thank you for pointing this out, is there any benefit (performance) alongside with code compactness and readability?

– jlandercy
8 hours ago

Without apply, it's vectorized, so you would see big improvement in terms of speed.

– Quang Hoang
8 hours ago

@QuangHoang updated thank you

– jlandercy
8 hours ago

add a comment |

Some fun with views, assuming that each number has 6 digits:

u = df[['Number']].to_numpy().astype('U6').view('U1').astype(int)

df.join(pd.DataFrame(u).rename(columns=lambda c: f'xc+1'))

 Number x1 x2 x3 x4 x5 x6
0 654321 6 5 4 3 2 1
1 223344 2 2 3 3 4 4

answered 8 hours ago

user3483203

38.5k8 gold badges32 silver badges62 bronze badges

Impressive one-liner, although it breaks if there are numbers with different number of digits.

– jdehesa
8 hours ago

Yea, that assumption has to be made, definitely more of a trick than something to use.

– user3483203
8 hours ago

add a comment |

Some fun with views, assuming that each number has 6 digits:

u = df[['Number']].to_numpy().astype('U6').view('U1').astype(int)

df.join(pd.DataFrame(u).rename(columns=lambda c: f'xc+1'))

 Number x1 x2 x3 x4 x5 x6
0 654321 6 5 4 3 2 1
1 223344 2 2 3 3 4 4

answered 8 hours ago

user3483203

38.5k8 gold badges32 silver badges62 bronze badges

Impressive one-liner, although it breaks if there are numbers with different number of digits.

– jdehesa
8 hours ago

Yea, that assumption has to be made, definitely more of a trick than something to use.

– user3483203
8 hours ago

add a comment |

Some fun with views, assuming that each number has 6 digits:

u = df[['Number']].to_numpy().astype('U6').view('U1').astype(int)

df.join(pd.DataFrame(u).rename(columns=lambda c: f'xc+1'))

 Number x1 x2 x3 x4 x5 x6
0 654321 6 5 4 3 2 1
1 223344 2 2 3 3 4 4

answered 8 hours ago

user3483203

38.5k8 gold badges32 silver badges62 bronze badges

Some fun with views, assuming that each number has 6 digits:

u = df[['Number']].to_numpy().astype('U6').view('U1').astype(int)

df.join(pd.DataFrame(u).rename(columns=lambda c: f'xc+1'))

 Number x1 x2 x3 x4 x5 x6
0 654321 6 5 4 3 2 1
1 223344 2 2 3 3 4 4

answered 8 hours ago

user3483203

38.5k8 gold badges32 silver badges62 bronze badges

answered 8 hours ago

user3483203

38.5k8 gold badges32 silver badges62 bronze badges

answered 8 hours ago

user3483203

38.5k8 gold badges32 silver badges62 bronze badges

answered 8 hours ago

user3483203

38.5k8 gold badges32 silver badges62 bronze badges

Impressive one-liner, although it breaks if there are numbers with different number of digits.

– jdehesa
8 hours ago

Yea, that assumption has to be made, definitely more of a trick than something to use.

– user3483203
8 hours ago

add a comment |

Impressive one-liner, although it breaks if there are numbers with different number of digits.

– jdehesa
8 hours ago

Yea, that assumption has to be made, definitely more of a trick than something to use.

– user3483203
8 hours ago

Impressive one-liner, although it breaks if there are numbers with different number of digits.

– jdehesa
8 hours ago

Yea, that assumption has to be made, definitely more of a trick than something to use.

– user3483203
8 hours ago

add a comment |

Turn it into a string first!

Also, included a zfill just in case not all numbers are 6 digits

dat = [list(map(int, str(x).zfill(6))) for x in df.Number]
d = pd.DataFrame(dat, df.index).rename(columns=lambda x: f'xx + 1')
df.join(d)

 Number x1 x2 x3 x4 x5 x6
0 654321 6 5 4 3 2 1
1 223344 2 2 3 3 4 4

Details

This gets the digits

dat = [list(map(int, str(x).zfill(6))) for x in df.Number]
dat

[[6, 5, 4, 3, 2, 1], [2, 2, 3, 3, 4, 4]]

This creates a new dataframe with the same index as df AND renames the columns to have an 'x' in front and begin with 'x1' and not 'x0'

d = pd.DataFrame(dat, df.index).rename(columns=lambda x: f'xx + 1')
d

 x1 x2 x3 x4 x5 x6
0 6 5 4 3 2 1
1 2 2 3 3 4 4

answered 8 hours ago

piRSquared

178k26 gold badges195 silver badges352 bronze badges

add a comment |

Turn it into a string first!

Also, included a zfill just in case not all numbers are 6 digits

dat = [list(map(int, str(x).zfill(6))) for x in df.Number]
d = pd.DataFrame(dat, df.index).rename(columns=lambda x: f'xx + 1')
df.join(d)

 Number x1 x2 x3 x4 x5 x6
0 654321 6 5 4 3 2 1
1 223344 2 2 3 3 4 4

Details

This gets the digits

dat = [list(map(int, str(x).zfill(6))) for x in df.Number]
dat

[[6, 5, 4, 3, 2, 1], [2, 2, 3, 3, 4, 4]]

This creates a new dataframe with the same index as df AND renames the columns to have an 'x' in front and begin with 'x1' and not 'x0'

d = pd.DataFrame(dat, df.index).rename(columns=lambda x: f'xx + 1')
d

 x1 x2 x3 x4 x5 x6
0 6 5 4 3 2 1
1 2 2 3 3 4 4

answered 8 hours ago

piRSquared

178k26 gold badges195 silver badges352 bronze badges

add a comment |

Turn it into a string first!

Also, included a zfill just in case not all numbers are 6 digits

dat = [list(map(int, str(x).zfill(6))) for x in df.Number]
d = pd.DataFrame(dat, df.index).rename(columns=lambda x: f'xx + 1')
df.join(d)

 Number x1 x2 x3 x4 x5 x6
0 654321 6 5 4 3 2 1
1 223344 2 2 3 3 4 4

Details

This gets the digits

dat = [list(map(int, str(x).zfill(6))) for x in df.Number]
dat

[[6, 5, 4, 3, 2, 1], [2, 2, 3, 3, 4, 4]]

This creates a new dataframe with the same index as df AND renames the columns to have an 'x' in front and begin with 'x1' and not 'x0'

d = pd.DataFrame(dat, df.index).rename(columns=lambda x: f'xx + 1')
d

 x1 x2 x3 x4 x5 x6
0 6 5 4 3 2 1
1 2 2 3 3 4 4

answered 8 hours ago

piRSquared

178k26 gold badges195 silver badges352 bronze badges

Turn it into a string first!

Also, included a zfill just in case not all numbers are 6 digits

dat = [list(map(int, str(x).zfill(6))) for x in df.Number]
d = pd.DataFrame(dat, df.index).rename(columns=lambda x: f'xx + 1')
df.join(d)

 Number x1 x2 x3 x4 x5 x6
0 654321 6 5 4 3 2 1
1 223344 2 2 3 3 4 4

Details

This gets the digits

dat = [list(map(int, str(x).zfill(6))) for x in df.Number]
dat

[[6, 5, 4, 3, 2, 1], [2, 2, 3, 3, 4, 4]]

This creates a new dataframe with the same index as df AND renames the columns to have an 'x' in front and begin with 'x1' and not 'x0'

d = pd.DataFrame(dat, df.index).rename(columns=lambda x: f'xx + 1')
d

 x1 x2 x3 x4 x5 x6
0 6 5 4 3 2 1
1 2 2 3 3 4 4

answered 8 hours ago

piRSquared

178k26 gold badges195 silver badges352 bronze badges

answered 8 hours ago

piRSquared

178k26 gold badges195 silver badges352 bronze badges

answered 8 hours ago

piRSquared

178k26 gold badges195 silver badges352 bronze badges

answered 8 hours ago

piRSquared

178k26 gold badges195 silver badges352 bronze badges

add a comment |

While string-based solutions are simpler and probably good enough in most cases, you can do this with math which, if you have a big data set, can make a significant difference in speed.

import numpy as np
import pandas as pd

df = pd.DataFrame('Number': [654321, 223344])
num_cols = int(np.log10(df['Number'].max() - 1)) + 1
vals = (df['Number'].values[:, np.newaxis] // (10 ** np.arange(num_cols - 1, -1, -1))) % 10
df_digits = pd.DataFrame(vals, columns=[f'xi + 1' for i in range(num_cols)
df2 = pd.concat([df, df_digits])], axis=1)
print(df2)
# Number x1 x2 x3 x4 x5 x6
# 0 654321 6 5 4 3 2 1
# 1 223344 2 2 3 3 4 4

answered 8 hours ago

jdehesa

35k4 gold badges42 silver badges66 bronze badges

I definitely like this approach. I'm trying to make this prettier (-:

– piRSquared
7 hours ago

1

vals = (df.to_numpy() // 10 ** np.arange(6) % 10)[:, ::-1] Obviously, assumptions have to be made. I basically made some golf improvements at the expense of generalization.

– piRSquared
6 hours ago

You could use np.unravel_index

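import numpy as np
import pandas as pd
from timeit import timeit

df = pd.DataFrame({'Number': [654321, 223344]})

def split_digits(df):
    # get data as numpy array
    numbers = df['Number'].to_numpy()
    # extract digits: treat each number as a flat index into a 10x10x10x10x10x10
    # array; the components of its multi-index are its decimal digits
    digits = np.unravel_index(numbers, 6*(10,))
    # create column headers
    columns = ['Number', *(f'x{i}' for i in "123456")]
    # build and return new data frame
    return pd.DataFrame(np.stack([numbers, *digits], axis=1), columns=columns, index=df.index)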


split_digits(df)
# Number x1 x2 x3 x4 x5 x6
# 0 654321 6 5 4 3 2 1
# 1 223344 2 2 3 3 4 4

timeit(lambda:split_digits(df),number=1000)
# 0.3550272472202778

Thanks @GZ0 for some pandas tips.

edited 56 mins ago

answered 6 hours ago

Paul Panzer

35.1k2 gold badges23 silver badges53 bronze badges

This is an excellent trick and a one-liner, @Paul, +1. What does ** do in assign? Would you mind explaining the code?

– Karn Kumar
5 hours ago

@KarnKumar ** "unrolls" the dictionary, so each key-value pair becomes a keyword argument to the function (assign in this case). Btw. I don't know much about pandas, so this part of the code may be far from being optimal.

– Paul Panzer
5 hours ago

@KarnKumar I've made an annotated version in case you are interested.

– Paul Panzer
5 hours ago

One alternative way to return a new data frame using digits is df.assign(**dict(zip((f'x{i}' for i in range(1,7)), digits))). Also, df['Number'] can be used as a numpy array directly without explicitly accessing the .values attribute.

– GZ0
4 hours ago

@PaulPanzer Your solution is indeed a lot more performant. df.assign makes a copy of the original dataframe and then adds columns one by one. The df.copy() call actually takes a lot more time than adding the columns, for some unknown reason. IMO there are two things that could be improved in your solution though: (1) in pandas >= 0.24.0, df.to_numpy() is recommended over df.values; (2) the index of the original data frame should be preserved by passing index=df.index to the constructor.

– GZ0
2 hours ago
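
The `**` unpacking and the `df.assign` alternative discussed in the comments can be sketched as follows. The custom index `['a', 'b']` and the name `new_columns` are assumptions added for illustration, to make it visible that the original index is kept.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Number': [654321, 223344]}, index=['a', 'b'])

# One index array per digit position, most significant digit first.
digits = np.unravel_index(df['Number'].to_numpy(), 6 * (10,))

# Map column names to digit arrays: {'x1': array([6, 2]), 'x2': ..., ...}
new_columns = dict(zip((f'x{i}' for i in range(1, 7)), digits))

# ** turns each key-value pair into a keyword argument, so this call is
# equivalent to df.assign(x1=..., x2=..., ..., x6=...).
result = df.assign(**new_columns)
print(result)
```

As noted in the comments, `assign` copies the frame first, so this form favours brevity over speed.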

Assuming that all numbers have the same length (an equal number of digits), I would do it the following way using numpy:

import numpy as np
a = np.array([[654321],[223344]])
str_a = a.astype(str)
out = np.apply_along_axis(lambda x:list(x[0]),1,str_a)
print(out)

Output:

[['6' '5' '4' '3' '2' '1']
 ['2' '2' '3' '3' '4' '4']]

Note that out is currently an np.array of strs; you can convert it to int if the need arises.

answered 8 hours ago

Daweo

2,0651 gold badge2 silver badges6 bronze badges
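
The conversion to integers mentioned at the end of this answer could look like the sketch below. The name `out_int` is introduced here for illustration, and the same equal-length assumption applies.

```python
import numpy as np

a = np.array([[654321], [223344]])
str_a = a.astype(str)
# Each row holds one string; list() splits it into single characters.
out = np.apply_along_axis(lambda x: list(x[0]), 1, str_a)

# Cast the single-character strings to integers.
out_int = out.astype(int)
print(out_int)
```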

I really liked @user3483203's answer. I think .str.findall could work with any number of digits:

df = pd.DataFrame({
    'Number': [65432178888, 22334474343]
})

u = df['Number'].astype(str).str.findall(r'(\w)')
df.join(pd.DataFrame(list(u)).rename(columns=lambda c: f'x{c+1}')).apply(pd.to_numeric)

 Number x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11
0 65432178888 6 5 4 3 2 1 7 8 8 8 8
1 22334474343 2 2 3 3 4 4 7 4 3 4 3

edited 7 hours ago

answered 8 hours ago

political scientist

1,8121 gold badge8 silver badges18 bronze badges
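
To see what "any number of digits" means when the rows differ in length, here is a small sketch with a hypothetical input that is not taken from the question. It uses `\d` instead of `\w` so that only digits match. Shorter numbers should end up with missing values in the trailing columns.

```python
import pandas as pd

# Hypothetical input with numbers of different lengths.
df = pd.DataFrame({'Number': [654321, 2233]})

u = df['Number'].astype(str).str.findall(r'\d')
digits = (
    pd.DataFrame(list(u), index=df.index)   # ragged rows are padded
    .rename(columns=lambda c: f'x{c + 1}')  # 0, 1, ... -> x1, x2, ...
    .apply(pd.to_numeric)                   # strings -> numbers
)
result = df.join(digits)
print(result)
```

Columns that contain a missing value can no longer be plain integer columns, so check the resulting dtypes if that matters downstream.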

Simple way around:

>>> df
 number
0 123456
1 456789
2 135797

First, convert the column to strings:

>>> df['number'] = df['number'].astype(str)

Create the new columns using string indexing

>>> df['x1'] = df['number'].str[0]
>>> df['x2'] = df['number'].str[1]
>>> df['x3'] = df['number'].str[2]
>>> df['x4'] = df['number'].str[3]
>>> df['x5'] = df['number'].str[4]
>>> df['x6'] = df['number'].str[5]

>>> df
 number x1 x2 x3 x4 x5 x6
0 123456 1 2 3 4 5 6
1 456789 4 5 6 7 8 9
2 135797 1 3 5 7 9 7

>>> df.drop('number', axis=1, inplace=True)
>>> df
 x1 x2 x3 x4 x5 x6
0 1 2 3 4 5 6
1 4 5 6 7 8 9
2 1 3 5 7 9 7

Another trick, using str.split() (starting again from the frame that still has the string-typed number column):

>>> df = df['number'].str.split(r'(\d{1})', expand=True).add_prefix('x').drop(columns=['x0', 'x2', 'x4', 'x6', 'x8', 'x10', 'x12'])
>>> df
 x1 x3 x5 x7 x9 x11
0 1 2 3 4 5 6
1 4 5 6 7 8 9
2 1 3 5 7 9 7

>>> df.rename(columns={'x3':'x2', 'x5':'x3', 'x7':'x4', 'x9':'x5', 'x11':'x6'})
 x1 x2 x3 x4 x5 x6
0 1 2 3 4 5 6
1 4 5 6 7 8 9
2 1 3 5 7 9 7

OR

>>> df = df['number'].str.split(r'(\d{1})', expand=True).T.replace('', np.nan).dropna().T

>>> df
 1 3 5 7 9 11
0 1 2 3 4 5 6
1 4 5 6 7 8 9
2 1 3 5 7 9 7

>>> df.rename(columns={1:'x1', 3:'x2', 5:'x3', 7:'x4', 9:'x5', 11:'x6'})
 x1 x2 x3 x4 x5 x6
0 1 2 3 4 5 6
1 4 5 6 7 8 9
2 1 3 5 7 9 7

edited 5 hours ago

answered 7 hours ago

Karn Kumar

3,7081 gold badge7 silver badges22 bronze badges
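
The six near-identical string-indexing assignments in this answer can be folded into a loop. This is a sketch under the same assumption that every number has the same number of digits. The names `as_text`, `width` and `pos` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'number': [123456, 456789, 135797]})

# Work on a string copy so the original integer column is left untouched.
as_text = df['number'].astype(str)

# One column per character position; the width is taken from the first value,
# which assumes every number has the same number of digits.
width = len(as_text.iloc[0])
for pos in range(width):
    df[f'x{pos + 1}'] = as_text.str[pos].astype(int)

print(df)
```

Unlike the original steps, this version also converts the digits back to integers and does not turn the source column into strings.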
