split a six digits number column into separated columns with one digit

How can I use pandas or numpy to separate one column of six-digit integers into six columns with one digit each?

import pandas as pd
import numpy as np

df = pd.Series(range(123456, 123465))
df = pd.DataFrame(df)
df.head()

What I have looks like this:

Number
654321
223344

The desired outcome should look like this:

Number | x1 | x2 | x3 | x4 | x5 | x6 |
654321 | 6 | 5 | 4 | 3 | 2 | 1 |
223344 | 2 | 2 | 3 | 3 | 4 | 4 |









  • If you don't have to use numpy or pandas - for num in str(my_number): print(num)

    – wcarhart
    8 hours ago












  • What is the source of your data? Are numpy.array or pandas.DataFrame objects delivered to you, or are you getting just text with numbers separated by newlines?

    – Daweo
    7 hours ago
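The plain-Python suggestion in the first comment can be sketched as follows. `my_number` is the commenter's placeholder name; the sample value and the list comprehension are illustrative additions, not part of the comment.

```python
my_number = 654321  # placeholder value, taken from the question's sample data

# Iterating over the decimal string yields one character per digit.
digits = [int(ch) for ch in str(my_number)]
print(digits)
```

This handles a single integer only; for a whole column, the pandas and numpy answers below are the better fit.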


















python pandas numpy














asked 8 hours ago, edited 7 hours ago – msalem85




























8 Answers


















5
















MCVE



Here is a simple suggestion:



import pandas as pd

# MCVE dataframe:
df = pd.DataFrame([123456, 456789, 135797, 123, 123456789], columns=['number'])

def digit(x, n):
    """Return the n-th digit (counting from 0 at the units place) of an integer in base 10"""
    return (x // 10**n) % 10

def digitize(df, key, n):
    """Extract the n least significant base-10 digits of df[key] into columns x0..x(n-1)"""
    for i in range(n):
        df['x%d' % i] = digit(df[key], n-i-1)

# Apply the function to the dataframe (in place):
digitize(df, 'number', 6)


For the trial dataframe, it returns:



 number x0 x1 x2 x3 x4 x5
0 123456 1 2 3 4 5 6
1 456789 4 5 6 7 8 9
2 135797 1 3 5 7 9 7
3 123 0 0 0 1 2 3
4 123456789 4 5 6 7 8 9


Observations



This method avoids the need to cast to string and then cast back to int.



It relies on modular integer arithmetic; the individual operations are detailed below:



10**3 # int: 1000 (integer power)
54321 // 10**3 # int: 54 (quotient of integer division)
(54321 // 10**3) % 10 # int: 4 (remainder of integer division, modulo)


Last but not least, it is fail-safe and exact for numbers with fewer or more than n digits (note that it returns the n least significant digits in the latter case).
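A minimal sketch of the same arithmetic that returns a new frame instead of modifying df in place, and names the columns x1..x6 as in the question. The helper name `split_digits` and the sample values are illustrative assumptions; non-negative integers are assumed.

```python
import pandas as pd

def split_digits(numbers: pd.Series, n: int) -> pd.DataFrame:
    """Return the n least significant base-10 digits, most significant first."""
    columns = {}
    for i in range(n):
        # Digit at power 10**(n - i - 1), computed on the whole Series at once.
        columns[f"x{i + 1}"] = (numbers // 10 ** (n - i - 1)) % 10
    return pd.DataFrame(columns, index=numbers.index)

df = pd.DataFrame({"number": [123456, 123, 123456789]})
digits = split_digits(df["number"], 6)
result = df.join(digits)
print(result)
```

Short values are padded with leading zeros and long values keep only their last six digits, which matches the behaviour described above.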




























  • Get rid of apply; you can simply do digit(df['Number'], i).

    – Quang Hoang
    8 hours ago

  • @QuangHoang Thank you for pointing this out. Is there any performance benefit in addition to code compactness and readability?

    – jlandercy
    8 hours ago

  • Without apply, it's vectorized, so you would see a big improvement in terms of speed.

    – Quang Hoang
    8 hours ago

  • @QuangHoang Updated, thank you.

    – jlandercy
    8 hours ago


















4
















Some fun with views, assuming that each number has 6 digits:




u = df[['Number']].to_numpy().astype('U6').view('U1').astype(int)

df.join(pd.DataFrame(u).rename(columns=lambda c: f'x{c+1}'))




 Number x1 x2 x3 x4 x5 x6
0 654321 6 5 4 3 2 1
1 223344 2 2 3 3 4 4
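The view trick depends on every string having exactly the same width. A hedged sketch of one way to relax that for shorter numbers is to zero-pad first; the `width` variable and the extra sample value 123 are assumptions for illustration. Values with more than `width` digits would be silently truncated by the fixed-width cast, so this is not a general solution.

```python
import pandas as pd

df = pd.DataFrame({"Number": [654321, 223344, 123]})
width = 6  # assumed fixed number of digit columns

# Zero-pad so that every string occupies exactly `width` characters.
text = df["Number"].astype(str).str.zfill(width)
fixed = text.to_numpy().astype(f"U{width}")

# Reinterpret each fixed-width string as `width` one-character strings.
chars = fixed.reshape(-1, 1).view("U1")
digits = pd.DataFrame(
    chars.astype(int),
    index=df.index,
    columns=[f"x{i + 1}" for i in range(width)],
)
print(df.join(digits))
```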






























  • Impressive one-liner, although it breaks if there are numbers with different number of digits.

    – jdehesa
    8 hours ago











  • Yea, that assumption has to be made, definitely more of a trick than something to use.

    – user3483203
    8 hours ago


















3
















Turn it into a string first!



I also included a zfill in case not all numbers have 6 digits.



dat = [list(map(int, str(x).zfill(6))) for x in df.Number]
d = pd.DataFrame(dat, df.index).rename(columns=lambda x: f'x{x + 1}')
df.join(d)

Number x1 x2 x3 x4 x5 x6
0 654321 6 5 4 3 2 1
1 223344 2 2 3 3 4 4



Details



This gets the digits



dat = [list(map(int, str(x).zfill(6))) for x in df.Number]
dat

[[6, 5, 4, 3, 2, 1], [2, 2, 3, 3, 4, 4]]


This creates a new dataframe with the same index as df and renames the columns so that they are prefixed with 'x' and start at 'x1' rather than 'x0'.



d = pd.DataFrame(dat, df.index).rename(columns=lambda x: f'x{x + 1}')
d

x1 x2 x3 x4 x5 x6
0 6 5 4 3 2 1
1 2 2 3 3 4 4
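To see what the zfill step does for values that are not exactly six digits long, here is a small stand-alone sketch. The helper name `digits_of` is hypothetical and only wraps the expression used above.

```python
def digits_of(value, width=6):
    """Return the decimal digits of a non-negative integer, zero-padded to `width`."""
    return [int(ch) for ch in str(value).zfill(width)]

print(digits_of(654321))   # exactly six digits
print(digits_of(123))      # shorter: padded on the left with zeros
print(digits_of(1234567))  # longer: zfill never truncates, so seven digits come back
```

Because longer values yield longer rows, a column containing such values would produce extra columns (with missing entries in the other rows) when passed to pd.DataFrame.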





































    3
















    While string-based solutions are simpler and probably good enough in most cases, you can do this with math, which can make a significant difference in speed if you have a big data set.



    import numpy as np
    import pandas as pd

    df = pd.DataFrame({'Number': [654321, 223344]})
    # Number of digits in the largest value (assumes positive integers)
    num_cols = int(np.log10(df['Number'].max())) + 1
    # Broadcast the (n, 1) column against the powers 10**(num_cols - 1) ... 10**0
    vals = (df['Number'].values[:, np.newaxis] // (10 ** np.arange(num_cols - 1, -1, -1))) % 10
    df_digits = pd.DataFrame(vals, columns=[f'x{i + 1}' for i in range(num_cols)])
    df2 = pd.concat([df, df_digits], axis=1)
    print(df2)
    # Number x1 x2 x3 x4 x5 x6
    # 0 654321 6 5 4 3 2 1
    # 1 223344 2 2 3 3 4 4
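The key step in this answer is NumPy broadcasting: a column of shape (n, 1) divided by a row of six powers of ten gives an (n, 6) result. A minimal sketch with the intermediate arrays spelled out; the variable names are illustrative and six digits are assumed.

```python
import numpy as np

numbers = np.array([654321, 223344])

# Powers of ten from most to least significant digit: 100000, 10000, ..., 1
powers = 10 ** np.arange(5, -1, -1)

# (2, 1) column against (6,) row broadcasts to a (2, 6) array of quotients
column = numbers[:, np.newaxis]
digits = (column // powers) % 10

print(column.shape, powers.shape, digits.shape)
print(digits)
```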






























    • I definitely like this approach. I'm trying to make this prettier (-:

      – piRSquared
      7 hours ago






    • Alternatively: vals = (df.to_numpy() // 10 ** np.arange(6) % 10)[:, ::-1]. Obviously, assumptions have to be made. I basically made some golf improvements at the expense of generalization.

      – piRSquared
      6 hours ago


















    3
















    You could use np.unravel_index



    import numpy as np
    import pandas as pd
    from timeit import timeit

    df = pd.DataFrame({'Number': [654321, 223344]})

    def split_digits(df):
        # get data as numpy array
        numbers = df['Number'].to_numpy()
        # extract digits
        digits = np.unravel_index(numbers, 6*(10,))
        # create column headers
        columns = ['Number', *(f'x{i}' for i in "123456")]
        # build and return new data frame
        return pd.DataFrame(np.stack([numbers, *digits], axis=1), columns=columns, index=df.index)


    split_digits(df)
    # Number x1 x2 x3 x4 x5 x6
    # 0 654321 6 5 4 3 2 1
    # 1 223344 2 2 3 3 4 4

    timeit(lambda:split_digits(df),number=1000)
    # 0.3550272472202778
    # 0.3550272472202778


    Thanks @GZ0 for some pandas tips.
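Why this works: np.unravel_index converts a flat index into coordinates of an array with the given shape, and for a shape of six axes of length 10 those coordinates are exactly the six decimal digits. A small sketch under that assumption; the variable names are illustrative.

```python
import numpy as np

shape = (10,) * 6  # six axes of length 10, one per decimal digit

# A flat index into a 10x10x10x10x10x10 array decomposes into its base-10 digits.
single = np.unravel_index(654321, shape)
print(tuple(int(d) for d in single))

# With an array input, the result is a tuple of six arrays (one per digit position).
numbers = np.array([654321, 223344])
digits = np.stack(np.unravel_index(numbers, shape), axis=1)
print(digits)

# Values with more than six digits are out of range for this shape.
try:
    np.unravel_index(1_000_000, shape)
except ValueError as exc:
    print("rejected:", exc)
```

So, unlike the modular-arithmetic answers, this approach raises an error rather than dropping leading digits when a number is too long.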




























    • This is an excellent trick and one-liner, @Paul, +1. What does ** do in assign? Would you mind explaining the code?

      – Karn Kumar
      5 hours ago












    • @KarnKumar ** "unrolls" the dictionary, so each key-value pair becomes a keyword argument to the function (assign in this case). Btw. I don't know much about pandas, so this part of the code may be far from being optimal.

      – Paul Panzer
      5 hours ago











    • @KarnKumar I've made an annotated version in case you are interested.

      – Paul Panzer
      5 hours ago






    • One alternative way to return a new data frame using digits is df.assign(**dict(zip((f'x{i}' for i in range(1,7)), digits))). Also, df['Number'] can be used as a numpy array directly without explicitly accessing the .values attribute.

      – GZ0
      4 hours ago






    • @PaulPanzer Your solution is indeed a lot more performant. df.assign makes a copy of the original dataframe and then adds columns one by one. The df.copy() call actually takes a lot more time than adding the columns, for reasons unknown to me. IMO there are two things that could be improved in your solution though: (1) in pandas version >= 0.24.0, df.to_numpy() is recommended in favor of df.values; (2) the index of the original data frame should be preserved by passing index=df.index into the constructor function.

      – GZ0
      2 hours ago
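The `**` unpacking discussed in these comments can be illustrated with a short sketch. The digit computation here uses plain integer arithmetic purely for illustration (an assumption, not the code from the answer); the point is how a dict of new columns is passed to df.assign.

```python
import pandas as pd

df = pd.DataFrame({"Number": [654321, 223344]})

# One entry per new column: name -> Series of digits
new_columns = {
    f"x{i + 1}": (df["Number"] // 10 ** (5 - i)) % 10
    for i in range(6)
}

# ** turns each key-value pair into a keyword argument,
# i.e. df.assign(x1=..., x2=..., ..., x6=...)
result = df.assign(**new_columns)
print(result)
```

df.assign returns a new frame and leaves df unchanged, which is the copy behaviour mentioned in the last comment.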



















    0
















    Assuming that all numbers are of the same length (have an equal number of digits), I would do it the following way using numpy:



    import numpy as np
    a = np.array([[654321],[223344]])
    str_a = a.astype(str)
    out = np.apply_along_axis(lambda x:list(x[0]),1,str_a)
    print(out)


    Output:



    [['6' '5' '4' '3' '2' '1']
    ['2' '2' '3' '3' '4' '4']]


    Note that out is currently an np.array of strs; you can convert it to int if the need arises.
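A minimal sketch of that last conversion step, reusing the sample array from this answer. The name `digits` is illustrative.

```python
import numpy as np

a = np.array([[654321], [223344]])

# Each row holds one string; list() splits it into single characters.
chars = np.apply_along_axis(lambda x: list(x[0]), 1, a.astype(str))

# Cast the one-character strings to integers.
digits = chars.astype(int)
print(digits)
```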






































      0
















      I really liked @user3483203's answer. I think .str.findall could work with any number of digits:



      df = pd.DataFrame({
          'Number': [65432178888, 22334474343]
      })

      u = df['Number'].astype(str).str.findall(r'(\w)')
      df.join(pd.DataFrame(list(u)).rename(columns=lambda c: f'x{c+1}')).apply(pd.to_numeric)


       Number x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11
      0 65432178888 6 5 4 3 2 1 7 8 8 8 8
      1 22334474343 2 2 3 3 4 4 7 4 3 4 3







































        0
















        Simple way around:



        >>> df
        number
        0 123456
        1 456789
        2 135797


        First convert the column into string



        >>> df['number'] = df['number'].astype(str)


        Create the new columns using string indexing



        >>> df['x1'] = df['number'].str[0]
        >>> df['x2'] = df['number'].str[1]
        >>> df['x3'] = df['number'].str[2]
        >>> df['x4'] = df['number'].str[3]
        >>> df['x5'] = df['number'].str[4]
        >>> df['x6'] = df['number'].str[5]

        >>> df
        number x1 x2 x3 x4 x5 x6
        0 123456 1 2 3 4 5 6
        1 456789 4 5 6 7 8 9
        2 135797 1 3 5 7 9 7

        >>> df.drop('number', axis=1, inplace=True)
        >>> df
        x1 x2 x3 x4 x5 x6
        0 1 2 3 4 5 6
        1 4 5 6 7 8 9
        2 1 3 5 7 9 7


        Another trick with str.split()



        >>> df = df['number'].str.split(r'(\d{1})', expand=True).add_prefix('x').drop(columns=['x0', 'x2', 'x4', 'x6', 'x8', 'x10', 'x12'])
        >>> df
        x1 x3 x5 x7 x9 x11
        0 1 2 3 4 5 6
        1 4 5 6 7 8 9
        2 1 3 5 7 9 7

        >>> df.rename(columns={'x3':'x2', 'x5':'x3', 'x7':'x4', 'x9':'x5', 'x11':'x6'})
        x1 x2 x3 x4 x5 x6
        0 1 2 3 4 5 6
        1 4 5 6 7 8 9
        2 1 3 5 7 9 7


        OR



        >>> df = df['number'].str.split(r'(\d{1})', expand=True).T.replace('', np.nan).dropna().T

        >>> df
        1 3 5 7 9 11
        0 1 2 3 4 5 6
        1 4 5 6 7 8 9
        2 1 3 5 7 9 7

        >>> df.rename(columns={1:'x1', 3:'x2', 5:'x3', 7:'x4', 9:'x5', 11:'x6'})
        x1 x2 x3 x4 x5 x6
        0 1 2 3 4 5 6
        1 4 5 6 7 8 9
        2 1 3 5 7 9 7
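The six near-identical string-indexing assignments from the first part of this answer can also be written as a loop. This is a sketch under the same assumption that every number has exactly six digits; the name `text` and the cast back to int are illustrative additions.

```python
import pandas as pd

df = pd.DataFrame({"number": [123456, 456789, 135797]})

# Keep the original integer column and work on a string copy.
text = df["number"].astype(str)

for position in range(6):
    # .str[position] picks the character at that position in every row.
    df[f"x{position + 1}"] = text.str[position].astype(int)

print(df)
```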





MCVE answer above (integer arithmetic): answered 8 hours ago, edited 7 hours ago – jlandercy




















"Some fun with views" answer above: answered 8 hours ago – user3483203















          • Impressive one-liner, although it breaks if there are numbers with different number of digits.

            – jdehesa
            8 hours ago











          • Yea, that assumption has to be made, definitely more of a trick than something to use.

            – user3483203
            8 hours ago




























          3
















          Turn it into a string first!



          Also included a zfill, just in case not all numbers have 6 digits:



          dat = [list(map(int, str(x).zfill(6))) for x in df.Number]
          d = pd.DataFrame(dat, df.index).rename(columns=lambda x: f'x{x + 1}')
          df.join(d)

          Number x1 x2 x3 x4 x5 x6
          0 654321 6 5 4 3 2 1
          1 223344 2 2 3 3 4 4



          Details



          This gets the digits



          dat = [list(map(int, str(x).zfill(6))) for x in df.Number]
          dat

          [[6, 5, 4, 3, 2, 1], [2, 2, 3, 3, 4, 4]]


          This creates a new dataframe with the same index as df AND renames the columns to have an 'x' in front and begin with 'x1' and not 'x0'



          d = pd.DataFrame(dat, df.index).rename(columns=lambda x: f'x{x + 1}')
          d

          x1 x2 x3 x4 x5 x6
          0 6 5 4 3 2 1
          1 2 2 3 3 4 4
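To illustrate what the zfill buys you, here is the same approach on made-up data where the second number has only four digits (the sample values are an assumption, not from the question):

```python
import pandas as pd

# Hypothetical data: the second number has only four digits
df = pd.DataFrame({"Number": [654321, 1234]})

# str.zfill left-pads with zeros, so every row yields exactly six digits
dat = [list(map(int, str(x).zfill(6))) for x in df["Number"]]
d = pd.DataFrame(dat, df.index).rename(columns=lambda x: f"x{x + 1}")
out = df.join(d)
print(out)
```

Without the padding, the rows would have different lengths and the shorter one would be filled from the left instead of from the right.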















              answered 8 hours ago









              piRSquared

              178k reputation, 26 gold badges, 195 silver badges, 352 bronze badges
























                  3
















                  While string-based solutions are simpler and probably good enough in most cases, you can also do this with arithmetic, which can make a significant difference in speed on a big data set.



                  import numpy as np
                  import pandas as pd

                  df = pd.DataFrame({'Number': [654321, 223344]})
                  # Number of digits of the largest value (assumes positive integers)
                  num_cols = int(np.log10(df['Number'].max())) + 1
                  # Divide by descending powers of ten, then keep the last digit of each quotient
                  vals = (df['Number'].values[:, np.newaxis] // (10 ** np.arange(num_cols - 1, -1, -1))) % 10
                  df_digits = pd.DataFrame(vals, columns=[f'x{i + 1}' for i in range(num_cols)], index=df.index)
                  df2 = pd.concat([df, df_digits], axis=1)
                  print(df2)
                  # Number x1 x2 x3 x4 x5 x6
                  # 0 654321 6 5 4 3 2 1
                  # 1 223344 2 2 3 3 4 4
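The arithmetic behind this is easier to see on a single number. A minimal pure-Python sketch (`digits_of` is a hypothetical helper, not part of the answer):

```python
def digits_of(n, width):
    """Return the decimal digits of a non-negative integer, most significant first."""
    # The digit at position p (counted from the right) is (n // 10**p) % 10
    return [(n // 10 ** p) % 10 for p in range(width - 1, -1, -1)]

print(digits_of(654321, 6))  # [6, 5, 4, 3, 2, 1]
```

The NumPy version applies the same formula to a whole column at once by broadcasting the column against the row of powers of ten.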















                  answered 8 hours ago









                  jdehesa

                  35k reputation, 4 gold badges, 42 silver badges, 66 bronze badges















                  • I definitely like this approach. I'm trying to make this prettier (-:

                    – piRSquared
                    7 hours ago






                  • vals = (df.to_numpy() // 10 ** np.arange(6) % 10)[:, ::-1]

                    Obviously, assumptions have to be made. I basically made some golf improvements at the expense of generalization.

                    – piRSquared
                    6 hours ago
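As a quick sanity check, the shortened expression from the comment can be compared against the general form. This sketch assumes a frame with a single six-digit 'Number' column, as in the question:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Number": [654321, 223344]})

# General form: divide by descending powers of ten
desc = (df["Number"].to_numpy()[:, np.newaxis] // 10 ** np.arange(5, -1, -1)) % 10

# Shortened form from the comment: ascending powers, then reverse the columns
golfed = (df.to_numpy() // 10 ** np.arange(6) % 10)[:, ::-1]

print(np.array_equal(desc, golfed))
```

The shortened form relies on df having exactly one column, since df.to_numpy() is already two-dimensional.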




























                  3
















                  You could use np.unravel_index



                  import numpy as np
                  import pandas as pd
                  from timeit import timeit

                  df = pd.DataFrame({'Number': [654321, 223344]})

                  def split_digits(df):
                      # get data as numpy array
                      numbers = df['Number'].to_numpy()
                      # extract digits
                      digits = np.unravel_index(numbers, 6*(10,))
                      # create column headers
                      columns = ['Number', *(f'x{i}' for i in "123456")]
                      # build and return new data frame
                      return pd.DataFrame(np.stack([numbers, *digits], axis=1), columns=columns, index=df.index)


                  split_digits(df)
                  # Number x1 x2 x3 x4 x5 x6
                  # 0 654321 6 5 4 3 2 1
                  # 1 223344 2 2 3 3 4 4

                  timeit(lambda:split_digits(df),number=1000)
                  # 0.3550272472202778


                  Thanks @GZ0 for some pandas tips.
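In case the role of np.unravel_index is not obvious: it converts a flat index into coordinates for an array of a given shape. With a shape of six tens, the coordinates in C order are exactly the six decimal digits. A minimal sketch on a single number:

```python
import numpy as np

# Shape of a hypothetical 10 x 10 x 10 x 10 x 10 x 10 array
shape = (10,) * 6

# The C-order coordinates of flat index 654321 are its decimal digits
coords = np.unravel_index(654321, shape)
digits = [int(c) for c in coords]
print(digits)  # [6, 5, 4, 3, 2, 1]
```

Numbers with fewer digits come out with leading zeros, while a value of 10**6 or more is out of bounds for this shape and raises a ValueError, so the six-digit limit is built in.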














                  edited 56 mins ago

























                  answered 6 hours ago









                  Paul Panzer

                  35.1k reputation, 2 gold badges, 23 silver badges, 53 bronze badges










                  • This is an excellent trick and a one-liner, @Paul, +1. What does ** do in assign? Would you mind explaining the code?

                    – Karn Kumar
                    5 hours ago












                  • @KarnKumar ** "unrolls" the dictionary, so each key-value pair becomes a keyword argument to the function (assign in this case). Btw. I don't know much about pandas, so this part of the code may be far from being optimal.

                    – Paul Panzer
                    5 hours ago
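A small standalone sketch of that "unrolling" (`describe` is a hypothetical function used only to show which keyword arguments arrive):

```python
def describe(**columns):
    """Hypothetical function that just reports the keyword arguments it received."""
    return sorted(columns)

new_columns = {"x1": [6, 2], "x2": [5, 2]}

# These two calls are equivalent: ** spreads the dict into keyword arguments
explicit = describe(x1=[6, 2], x2=[5, 2])
unpacked = describe(**new_columns)
print(unpacked)  # ['x1', 'x2']
```

df.assign works the same way: every keyword name becomes a column name and every value becomes that column's data.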











                  • @KarnKumar I've made an annotated version in case you are interested.

                    – Paul Panzer
                    5 hours ago






                  • One alternative way to return a new data frame using digits is df.assign(**dict(zip((f'x{i}' for i in range(1,7)), digits))). Also, df['Number'] can be used as a numpy array directly without explicitly accessing the .values attribute.

                    – GZ0
                    4 hours ago
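Spelled out as a runnable sketch, the df.assign alternative could look like this (the sample frame is the one from the question; `names` and `out` are illustrative names):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Number": [654321, 223344]})

# One array per digit position, as in the answer
digits = np.unravel_index(df["Number"].to_numpy(), (10,) * 6)

# Pair each generated name with its digit array and pass them as keyword arguments
names = (f"x{i}" for i in range(1, 7))
out = df.assign(**dict(zip(names, digits)))
print(out)
```

Since assign returns a new frame, df itself is left unchanged.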






                  • @PaulPanzer Your solution is indeed a lot more performant. df.assign makes a copy of the original dataframe and then adds columns one by one. The df.copy() call actually takes a lot more time than adding columns, for some unknown reason. IMO there are two things that could be improved in your solution though: (1) in pandas version >= 0.24.0, df.to_numpy() is recommended in favor of df.values; (2) the index of the original data frame should be preserved by passing index=df.index into the constructor function.

                    – GZ0
                    2 hours ago
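Point (2) matters as soon as the frame does not have the default index. A sketch with a made-up index of 10 and 20 shows the difference when the digit columns are joined back:

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose index is not the default 0, 1, ...
df = pd.DataFrame({"Number": [654321, 223344]}, index=[10, 20])
digits = np.stack(np.unravel_index(df["Number"].to_numpy(), (10,) * 6), axis=1)
names = [f"x{i}" for i in range(1, 7)]

# Without index=..., the digit frame is labelled 0, 1 and no longer lines up with df
misaligned = df.join(pd.DataFrame(digits, columns=names))

# Passing index=df.index keeps the rows aligned
aligned = df.join(pd.DataFrame(digits, columns=names, index=df.index))
print(aligned)
```

In the misaligned case the join finds no matching labels, so the digit columns are filled with NaN.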

























                  0
















                  Assuming that all numbers are of the same length (have an equal number of digits), I would do it the following way using numpy:



                  import numpy as np
                  a = np.array([[654321],[223344]])
                  str_a = a.astype(str)
                  out = np.apply_along_axis(lambda x:list(x[0]),1,str_a)
                  print(out)


                  Output:



                  [['6' '5' '4' '3' '2' '1']
                  ['2' '2' '3' '3' '4' '4']]


                  Note that out is currently an np.array of strs; you can convert it to int if the need arises.
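For completeness, the conversion to integers can be done with astype. A minimal sketch that repeats the steps above and adds the cast (`digits` is an illustrative name):

```python
import numpy as np

a = np.array([[654321], [223344]])
# Split each number's string form into single characters, as in the answer
out = np.apply_along_axis(lambda x: list(x[0]), 1, a.astype(str))

# astype(int) parses every one-character string into an integer
digits = out.astype(int)
print(digits)
```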
















                      answered 8 hours ago









                      Daweo

                      2,065 reputation, 1 gold badge, 2 silver badges, 6 bronze badges
























                          0
















                          I really liked @user3483203's answer. I think .str.findall could work with any number of digits:



                          import pandas as pd

                          df = pd.DataFrame({
                              'Number': [65432178888, 22334474343]
                          })

                          u = df['Number'].astype(str).str.findall(r'(\w)')
                          df.join(pd.DataFrame(list(u)).rename(columns=lambda c: f'x{c+1}')).apply(pd.to_numeric)


                           Number x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11
                          0 65432178888 6 5 4 3 2 1 7 8 8 8 8
                          1 22334474343 2 2 3 3 4 4 7 4 3 4 3
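When the numbers have different lengths, this approach still runs, but the shorter rows leave trailing cells empty. A sketch on made-up data (it uses \d instead of \w, which is my substitution, and the variable names are illustrative):

```python
import pandas as pd

# Hypothetical data with different digit counts
df = pd.DataFrame({"Number": [654321, 1234]})

u = df["Number"].astype(str).str.findall(r"\d")
wide = pd.DataFrame(list(u), index=df.index).rename(columns=lambda c: f"x{c + 1}")

# Shorter numbers leave trailing cells empty, so those columns end up with NaN
out = df.join(wide.apply(pd.to_numeric))
print(out)
```

Note that the digits of the shorter number are left-aligned here (1234 fills x1 to x4), unlike the zero-padded variants, where they are right-aligned.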



















answered 8 hours ago (edited 7 hours ago)
political scientist








































Simple way around:

    >>> df
       number
    0  123456
    1  456789
    2  135797


First, convert the column to strings:

    >>> df['number'] = df['number'].astype(str)


Then create the new columns using string indexing:

    >>> df['x1'] = df['number'].str[0]
    >>> df['x2'] = df['number'].str[1]
    >>> df['x3'] = df['number'].str[2]
    >>> df['x4'] = df['number'].str[3]
    >>> df['x5'] = df['number'].str[4]
    >>> df['x6'] = df['number'].str[5]

    >>> df
       number x1 x2 x3 x4 x5 x6
    0  123456  1  2  3  4  5  6
    1  456789  4  5  6  7  8  9
    2  135797  1  3  5  7  9  7

    >>> df.drop('number', axis=1, inplace=True)
    >>> df
      x1 x2 x3 x4 x5 x6
    0  1  2  3  4  5  6
    1  4  5  6  7  8  9
    2  1  3  5  7  9  7

The new columns hold one-character strings; add .astype(int) if you need integers.
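The six assignments above all follow one pattern, so they could be written as a loop. A minimal sketch, assuming every number has exactly six digits; `n_digits` and `as_text` are illustrative names that do not appear in the answer, and unlike the answer this version casts each digit back to an integer.

```python
import pandas as pd

df = pd.DataFrame({"number": [123456, 456789, 135797]})

# Work on a string copy so the original integer column is left untouched.
as_text = df["number"].astype(str)

# Assumption: every value has exactly n_digits digits.
n_digits = 6
for i in range(n_digits):
    # .str[i] takes the i-th character of every string; cast it back to int.
    df[f"x{i + 1}"] = as_text.str[i].astype(int)

digits = df.drop(columns="number")
print(digits)
```

If a value were shorter than `n_digits`, `.str[i]` would yield a missing value and the `astype(int)` cast would fail, so this only suits fixed-width numbers.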


Another trick with str.split(), starting again from the string column (before number is dropped):

    >>> df = df['number'].str.split(r'(\d{1})', expand=True).add_prefix('x').drop(columns=['x0', 'x2', 'x4', 'x6', 'x8', 'x10', 'x12'])
    >>> df
      x1 x3 x5 x7 x9 x11
    0  1  2  3  4  5   6
    1  4  5  6  7  8   9
    2  1  3  5  7  9   7

    >>> df.rename(columns={'x3':'x2', 'x5':'x3', 'x7':'x4', 'x9':'x5', 'x11':'x6'})
      x1 x2 x3 x4 x5 x6
    0  1  2  3  4  5  6
    1  4  5  6  7  8  9
    2  1  3  5  7  9  7
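Why the even-numbered columns have to be dropped is easier to see on a single string. The sketch below uses Python's own `re.split` rather than pandas, on the assumption that the regex split behaves the same way here (the `x0` to `x12` column names above suggest it does): splitting on a captured digit keeps the digit itself and leaves an empty string on either side of it.

```python
import re

# Splitting on a capturing group keeps the separators (the digits) in the result.
parts = re.split(r"(\d)", "123456")
print(parts)
# ['', '1', '', '2', '', '3', '', '4', '', '5', '', '6', '']

# The digits are the captured separators, which land at the odd positions.
digits = parts[1::2]
print(digits)
# ['1', '2', '3', '4', '5', '6']
```

A six-digit number therefore produces thirteen pieces, of which only positions 1, 3, 5, 7, 9 and 11 carry data.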


Or, drop the empty columns instead of listing them (this needs import numpy as np):

    >>> df = df['number'].str.split(r'(\d{1})', expand=True).T.replace('', np.nan).dropna().T

    >>> df
       1  3  5  7  9 11
    0  1  2  3  4  5  6
    1  4  5  6  7  8  9
    2  1  3  5  7  9  7

    >>> df.rename(columns={1:'x1', 3:'x2', 5:'x3', 7:'x4', 9:'x5', 11:'x6'})
      x1 x2 x3 x4 x5 x6
    0  1  2  3  4  5  6
    1  4  5  6  7  8  9
    2  1  3  5  7  9  7
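Since the digits always sit in the odd-positioned columns of the split result, both the drop and the rename steps could be replaced by a positional slice. This is a sketch of that variation, not the answer's own code; `pieces` and `digits` are illustrative names, and it assumes all numbers have the same number of digits.

```python
import pandas as pd

df = pd.DataFrame({"number": [123456, 456789, 135797]})

# Same split as above: digits land in columns 1, 3, 5, ..., empty strings in between.
pieces = df["number"].astype(str).str.split(r"(\d)", expand=True)

# Keep only the odd-positioned columns, convert to int, then renumber them x1..xN.
digits = pieces.iloc[:, 1::2].astype(int)
digits.columns = [f"x{i + 1}" for i in range(digits.shape[1])]
print(digits)
```

Because the column names are generated from the slice width, the same lines work for any fixed number of digits without editing a list of names.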



















answered 7 hours ago (edited 5 hours ago)
Karn Kumar




































