Generating sequential alphanumeric values that match a certain patternGenerating the sequence of the digits that appear in the sequence of natural numbersLeetcode number of atoms solution using stackAdd one to integer represented as an arrayGenerating combinations of specific values of a nested dictWrite a function that adds two numbers stored in a linked list and returns the sumCode to find if a number fulfills certain requirements (e.g. sequential increasing, palindrome, repeating, etc)Count Even Digits In NumberChallenge: phone keypad to letters - return all possible stringsInterpret a date from a string of digits
Encountering former, abusive advisor at a conference
Vergil Book XII, Line 756 | Meter Question
I run daily 5kms but I cant seem to improve stamina when playing soccer
Can a Pokemon that I tried to capture from field research run away?
How to deal with people whose priority is to not get blamed?
Did Terry Pratchett ever explain the inspiration behind the Luggage?
What could possibly power an Alcubierre drive?
Demod as a neologism
Christmas party at employers home
Advisor asked for my entire slide presentation so she could give the presentation at an international conference
How to typeset maths above the curve of a function with TikZ, with the curve masked (or locally opacified) to improve readibility?
What fantasy book has twins (except one's blue) and a cloaked ice bear on the cover?
Why do Computer Science degrees contain a high proportion of mathematics?
Are dead worlds a good galactic barrier?
Why would oxygen be stored as a super critical fluid?
Why is my paper "under review" if it contains no results?
Why are second inversion triads considered less consonant than first inversion triads?
How can I communicate feelings to players without impacting their agency?
In the twin paradox does the returning twin also come back permanently length contracted flatter than the twin on earth?
How does a ball bearing door hinge work?
Milk instead of water in bread
Is is possible to externally power my DSLR with the original battery that is connected to the DSLR by means of wires?
Could an American state survive nuclear war?
I am often given, occasionally stolen, rarely sold, and never borrowed
Generating sequential alphanumeric values that match a certain pattern
Generating the sequence of the digits that appear in the sequence of natural numbersLeetcode number of atoms solution using stackAdd one to integer represented as an arrayGenerating combinations of specific values of a nested dictWrite a function that adds two numbers stored in a linked list and returns the sumCode to find if a number fulfills certain requirements (e.g. sequential increasing, palindrome, repeating, etc)Count Even Digits In NumberChallenge: phone keypad to letters - return all possible stringsInterpret a date from a string of digits
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty
margin-bottom:0;
$begingroup$
I'm working on generating large volumes of test data for some performance testing. Part of this is going to involve generating a lot of values that represent an "MBI", which is a certain kind of alpha-numeric identifier.
This identifier has the following rules:
- Is alpha-numeric (uppercase only letters)
- Excludes letters that are commonly confused for numbers or other letters (
SLOIBZ) - Is 11 digits long
- Has specific rules on which values can go in which digit
- The first digit can be a number 1-9
- The second, fifth, eighth, and ninth digits can be any upper case letter besides
SLOIBZ; I don't actually validate the upper-case requirement, and I just callmbi.upper()as needed. - The fourth, seventh, tenth, and eleventh digits can be any number 0-9
- The remaining digits (third and sixth) can be any of the valid letters or numbers (including 0)
The following function validates that an MBI is valid (you can assume this is correct for your own review)
Assuming that the numbers are "before" the letters, the "first" MBI is then "1A00A00AA00". My task is to generate a whole bunch of valid and unique MBIs by "incrementing" them. For example, I want to be able to do this:
for _, mbi in zip(range(count_mbis_to_generate), mbi_generator("1A00A00AA99")):
print(mbi)
This should print count_mbis_to_generate distinct MBIs, starting with "1A00A00AA99" (assuming there are that many distinct MBIs that are greater than or equal to "1A00A00AA99". If I wanted 10000 of them, then the last one should be "1A00A00GA98".
To accomplish this, I started by writing a function to validate that a given MBI matches the expected format:
import re
def validate_mbi(mbi):
flags = re.IGNORECASE ^ re.VERBOSE
pattern = re.compile(
r"""
[1-9] # No leading zeros
[ac-hjkmnp-rt-z] # No letters that can be confused for numbers or each other
[ac-hjkmnp-rt-zd] # Letter or number
d # Any number
[ac-hjkmnp-rt-z]
[ac-hjkmnp-rt-zd]
d
[ac-hjkmnp-rt-z]2 # Then two letters
d2 # Then two numbers
""",
flags=flags,
)
return mbi and len(mbi) == 11 and pattern.match(mbi)
I then created a generator that will start spitting out MBIs:
def mbi_generator(start=None):
if not start:
start = "1A00A00AA00"
while validate_mbi(start):
yield start
start = increment_mbi(start)
Now I actually had to do some thinking. To do this arithmatic by hand, I came up with the following algorithm:
- Starting with the final digit (e.g.
mbi[10]), attempt to increment it by 1 (following all normal rules) - If that results in a "carrying" operation, then set this digit to the minimum allowed value for this digit, and move on to the next digit
- Repeat until I get to the first digit (e.g.
mbi[0]). - If the first digit is the largest allowed value as well (
9), then we cannot increment this MBI, so just returnNone
To accomplish this, I created dictionaries that I'm calling "pseudo-linked lists" (they are nowhere close to a true linked list, but it was the name that popped in my head when I made it). They would look like this:
- If my current value is
'A', then the next value is'B'and we don't have to carry. - If my current value is
'Z', then that is the largest possible value, so the "next" value is either0or'A', depending on digit, and we do have to carry.
This would end up looking like this kind of dictionary (for the 0-9 digits):
'0': ('1', False),
'1': ('2', False),
'2': ('3', False),
'3': ('4', False),
'4': ('5', False),
'5': ('6', False),
'6': ('7', False),
'7': ('8', False),
'8': ('9', False),
'9': ('0', True)
The key is the current value, and then the value's first index is the "next" value, and the second index is the "carry" flag. Note that the values are strings, not numbers, because we're always operating on strings and its easier to store it this way than do conversions all over the place.
The function to spit out one of these dictionaries is like so:
def build_pseudo_linked_list(value_list):
end_point = len(value_list) - 1
return
current_value: (value_list[0] # Get the first value
if i == end_point # If we're at the end of the list
else value_list[i + 1], # Otherwise get the next value
i == end_point) # Whether or not we were at the end of the list
for i, current_value in enumerate(value_list)
From there, I just need to build up the different kinds of dictionaries, and then use them to increment the MBI.
def increment_mbi(mbi):
# Get the list of possible values for each digit
string_list = [char for char in string.ascii_uppercase if char not in "SLOIBZ"] # exclude easily confused letters
digit_list = [str(i) for i in range(0, 10)] # any digit
partial_digit_list = [val for val in digit_list if val != "0"] # Any non-zero digit
combined_list = digit_list + string_list # any value from any list above
# The values a given digit can have as a pseudo-linked list
# current_value : (next_value, should_advance_digit)
first_digit_range = build_pseudo_linked_list(partial_digit_list)
letter_range = build_pseudo_linked_list(string_list)
full_digit_range = build_pseudo_linked_list(digit_list)
combined_range = build_pseudo_linked_list(combined_list)
# Each digit's ranges
ranges = [
first_digit_range,
letter_range,
combined_range,
full_digit_range,
letter_range,
combined_range,
full_digit_range,
letter_range,
letter_range,
full_digit_range,
full_digit_range,
]
index = 10 # Start from the back
mbi = mbi.upper() # If they gave us lower-case values, just roll with it
# Starting from the back, try to increment digits until we don't have to carry
for i in range(index, 0, -1):
next_value, should_advance = ranges[i][mbi[i]]
# If we get to the front and we need to reset, then we can't go any further
if i == 0 and should_advance:
return None
# Otherwise, slide our value in
mbi = mbi[:i] + next_value + mbi[i + 1 :]
# If we need to advance to the next digit then we continue, otherwise return our current state
if not should_advance:
return mbi
At this point, the logic was pretty straightforward, and mostly just requires building up our dictionaries and lists.
I'm looking for a review on a few things:
- Does this make sense? I need to add docstrings that explain all of this still, but hopefully the reasoning makes sense
- Would making these classes actually help at all? I played around with it, but it didn't really seem to be any clearer than before
- Are there any obvious ways to make this faster? This is working well enough to generate test data, which can be a "run it over the weekend on rare occasions" kind of thing, but I'd love for it to be faster.
Bonus points if you come up with a really clever itertools implementation; I played with it for a while but it eventually got a little too obtuse for me
python python-3.x strings iteration generator
$endgroup$
add a comment
|
$begingroup$
I'm working on generating large volumes of test data for some performance testing. Part of this is going to involve generating a lot of values that represent an "MBI", which is a certain kind of alpha-numeric identifier.
This identifier has the following rules:
- Is alpha-numeric (uppercase only letters)
- Excludes letters that are commonly confused for numbers or other letters (
SLOIBZ) - Is 11 digits long
- Has specific rules on which values can go in which digit
- The first digit can be a number 1-9
- The second, fifth, eighth, and ninth digits can be any upper case letter besides
SLOIBZ; I don't actually validate the upper-case requirement, and I just callmbi.upper()as needed. - The fourth, seventh, tenth, and eleventh digits can be any number 0-9
- The remaining digits (third and sixth) can be any of the valid letters or numbers (including 0)
The following function validates that an MBI is valid (you can assume this is correct for your own review)
Assuming that the numbers are "before" the letters, the "first" MBI is then "1A00A00AA00". My task is to generate a whole bunch of valid and unique MBIs by "incrementing" them. For example, I want to be able to do this:
for _, mbi in zip(range(count_mbis_to_generate), mbi_generator("1A00A00AA99")):
print(mbi)
This should print count_mbis_to_generate distinct MBIs, starting with "1A00A00AA99" (assuming there are that many distinct MBIs that are greater than or equal to "1A00A00AA99". If I wanted 10000 of them, then the last one should be "1A00A00GA98".
To accomplish this, I started by writing a function to validate that a given MBI matches the expected format:
import re
def validate_mbi(mbi):
flags = re.IGNORECASE ^ re.VERBOSE
pattern = re.compile(
r"""
[1-9] # No leading zeros
[ac-hjkmnp-rt-z] # No letters that can be confused for numbers or each other
[ac-hjkmnp-rt-zd] # Letter or number
d # Any number
[ac-hjkmnp-rt-z]
[ac-hjkmnp-rt-zd]
d
[ac-hjkmnp-rt-z]2 # Then two letters
d2 # Then two numbers
""",
flags=flags,
)
return mbi and len(mbi) == 11 and pattern.match(mbi)
I then created a generator that will start spitting out MBIs:
def mbi_generator(start=None):
if not start:
start = "1A00A00AA00"
while validate_mbi(start):
yield start
start = increment_mbi(start)
Now I actually had to do some thinking. To do this arithmatic by hand, I came up with the following algorithm:
- Starting with the final digit (e.g.
mbi[10]), attempt to increment it by 1 (following all normal rules) - If that results in a "carrying" operation, then set this digit to the minimum allowed value for this digit, and move on to the next digit
- Repeat until I get to the first digit (e.g.
mbi[0]). - If the first digit is the largest allowed value as well (
9), then we cannot increment this MBI, so just returnNone
To accomplish this, I created dictionaries that I'm calling "pseudo-linked lists" (they are nowhere close to a true linked list, but it was the name that popped in my head when I made it). They would look like this:
- If my current value is
'A', then the next value is'B'and we don't have to carry. - If my current value is
'Z', then that is the largest possible value, so the "next" value is either0or'A', depending on digit, and we do have to carry.
This would end up looking like this kind of dictionary (for the 0-9 digits):
'0': ('1', False),
'1': ('2', False),
'2': ('3', False),
'3': ('4', False),
'4': ('5', False),
'5': ('6', False),
'6': ('7', False),
'7': ('8', False),
'8': ('9', False),
'9': ('0', True)
The key is the current value, and then the value's first index is the "next" value, and the second index is the "carry" flag. Note that the values are strings, not numbers, because we're always operating on strings and its easier to store it this way than do conversions all over the place.
The function to spit out one of these dictionaries is like so:
def build_pseudo_linked_list(value_list):
end_point = len(value_list) - 1
return
current_value: (value_list[0] # Get the first value
if i == end_point # If we're at the end of the list
else value_list[i + 1], # Otherwise get the next value
i == end_point) # Whether or not we were at the end of the list
for i, current_value in enumerate(value_list)
From there, I just need to build up the different kinds of dictionaries, and then use them to increment the MBI.
def increment_mbi(mbi):
# Get the list of possible values for each digit
string_list = [char for char in string.ascii_uppercase if char not in "SLOIBZ"] # exclude easily confused letters
digit_list = [str(i) for i in range(0, 10)] # any digit
partial_digit_list = [val for val in digit_list if val != "0"] # Any non-zero digit
combined_list = digit_list + string_list # any value from any list above
# The values a given digit can have as a pseudo-linked list
# current_value : (next_value, should_advance_digit)
first_digit_range = build_pseudo_linked_list(partial_digit_list)
letter_range = build_pseudo_linked_list(string_list)
full_digit_range = build_pseudo_linked_list(digit_list)
combined_range = build_pseudo_linked_list(combined_list)
# Each digit's ranges
ranges = [
first_digit_range,
letter_range,
combined_range,
full_digit_range,
letter_range,
combined_range,
full_digit_range,
letter_range,
letter_range,
full_digit_range,
full_digit_range,
]
index = 10 # Start from the back
mbi = mbi.upper() # If they gave us lower-case values, just roll with it
# Starting from the back, try to increment digits until we don't have to carry
for i in range(index, 0, -1):
next_value, should_advance = ranges[i][mbi[i]]
# If we get to the front and we need to reset, then we can't go any further
if i == 0 and should_advance:
return None
# Otherwise, slide our value in
mbi = mbi[:i] + next_value + mbi[i + 1 :]
# If we need to advance to the next digit then we continue, otherwise return our current state
if not should_advance:
return mbi
At this point, the logic was pretty straightforward, and mostly just requires building up our dictionaries and lists.
I'm looking for a review on a few things:
- Does this make sense? I need to add docstrings that explain all of this still, but hopefully the reasoning makes sense
- Would making these classes actually help at all? I played around with it, but it didn't really seem to be any clearer than before
- Are there any obvious ways to make this faster? This is working well enough to generate test data, which can be a "run it over the weekend on rare occasions" kind of thing, but I'd love for it to be faster.
Bonus points if you come up with a really clever itertools implementation; I played with it for a while but it eventually got a little too obtuse for me
python python-3.x strings iteration generator
$endgroup$
1
$begingroup$
How about "lookup table" instead of "pseudo-linked lists"?
$endgroup$
– AlexV
8 hours ago
add a comment
|
$begingroup$
I'm working on generating large volumes of test data for some performance testing. Part of this is going to involve generating a lot of values that represent an "MBI", which is a certain kind of alpha-numeric identifier.
This identifier has the following rules:
- Is alpha-numeric (uppercase only letters)
- Excludes letters that are commonly confused for numbers or other letters (
SLOIBZ) - Is 11 digits long
- Has specific rules on which values can go in which digit
- The first digit can be a number 1-9
- The second, fifth, eighth, and ninth digits can be any upper case letter besides
SLOIBZ; I don't actually validate the upper-case requirement, and I just callmbi.upper()as needed. - The fourth, seventh, tenth, and eleventh digits can be any number 0-9
- The remaining digits (third and sixth) can be any of the valid letters or numbers (including 0)
The following function validates that an MBI is valid (you can assume this is correct for your own review)
Assuming that the numbers are "before" the letters, the "first" MBI is then "1A00A00AA00". My task is to generate a whole bunch of valid and unique MBIs by "incrementing" them. For example, I want to be able to do this:
for _, mbi in zip(range(count_mbis_to_generate), mbi_generator("1A00A00AA99")):
print(mbi)
This should print count_mbis_to_generate distinct MBIs, starting with "1A00A00AA99" (assuming there are that many distinct MBIs that are greater than or equal to "1A00A00AA99". If I wanted 10000 of them, then the last one should be "1A00A00GA98".
To accomplish this, I started by writing a function to validate that a given MBI matches the expected format:
import re
def validate_mbi(mbi):
flags = re.IGNORECASE ^ re.VERBOSE
pattern = re.compile(
r"""
[1-9] # No leading zeros
[ac-hjkmnp-rt-z] # No letters that can be confused for numbers or each other
[ac-hjkmnp-rt-zd] # Letter or number
d # Any number
[ac-hjkmnp-rt-z]
[ac-hjkmnp-rt-zd]
d
[ac-hjkmnp-rt-z]2 # Then two letters
d2 # Then two numbers
""",
flags=flags,
)
return mbi and len(mbi) == 11 and pattern.match(mbi)
I then created a generator that will start spitting out MBIs:
def mbi_generator(start=None):
if not start:
start = "1A00A00AA00"
while validate_mbi(start):
yield start
start = increment_mbi(start)
Now I actually had to do some thinking. To do this arithmatic by hand, I came up with the following algorithm:
- Starting with the final digit (e.g.
mbi[10]), attempt to increment it by 1 (following all normal rules) - If that results in a "carrying" operation, then set this digit to the minimum allowed value for this digit, and move on to the next digit
- Repeat until I get to the first digit (e.g.
mbi[0]). - If the first digit is the largest allowed value as well (
9), then we cannot increment this MBI, so just returnNone
To accomplish this, I created dictionaries that I'm calling "pseudo-linked lists" (they are nowhere close to a true linked list, but it was the name that popped in my head when I made it). They would look like this:
- If my current value is
'A', then the next value is'B'and we don't have to carry. - If my current value is
'Z', then that is the largest possible value, so the "next" value is either0or'A', depending on digit, and we do have to carry.
This would end up looking like this kind of dictionary (for the 0-9 digits):
'0': ('1', False),
'1': ('2', False),
'2': ('3', False),
'3': ('4', False),
'4': ('5', False),
'5': ('6', False),
'6': ('7', False),
'7': ('8', False),
'8': ('9', False),
'9': ('0', True)
The key is the current value, and then the value's first index is the "next" value, and the second index is the "carry" flag. Note that the values are strings, not numbers, because we're always operating on strings and its easier to store it this way than do conversions all over the place.
The function to spit out one of these dictionaries is like so:
def build_pseudo_linked_list(value_list):
end_point = len(value_list) - 1
return
current_value: (value_list[0] # Get the first value
if i == end_point # If we're at the end of the list
else value_list[i + 1], # Otherwise get the next value
i == end_point) # Whether or not we were at the end of the list
for i, current_value in enumerate(value_list)
From there, I just need to build up the different kinds of dictionaries, and then use them to increment the MBI.
def increment_mbi(mbi):
# Get the list of possible values for each digit
string_list = [char for char in string.ascii_uppercase if char not in "SLOIBZ"] # exclude easily confused letters
digit_list = [str(i) for i in range(0, 10)] # any digit
partial_digit_list = [val for val in digit_list if val != "0"] # Any non-zero digit
combined_list = digit_list + string_list # any value from any list above
# The values a given digit can have as a pseudo-linked list
# current_value : (next_value, should_advance_digit)
first_digit_range = build_pseudo_linked_list(partial_digit_list)
letter_range = build_pseudo_linked_list(string_list)
full_digit_range = build_pseudo_linked_list(digit_list)
combined_range = build_pseudo_linked_list(combined_list)
# Each digit's ranges
ranges = [
first_digit_range,
letter_range,
combined_range,
full_digit_range,
letter_range,
combined_range,
full_digit_range,
letter_range,
letter_range,
full_digit_range,
full_digit_range,
]
index = 10 # Start from the back
mbi = mbi.upper() # If they gave us lower-case values, just roll with it
# Starting from the back, try to increment digits until we don't have to carry
for i in range(index, 0, -1):
next_value, should_advance = ranges[i][mbi[i]]
# If we get to the front and we need to reset, then we can't go any further
if i == 0 and should_advance:
return None
# Otherwise, slide our value in
mbi = mbi[:i] + next_value + mbi[i + 1 :]
# If we need to advance to the next digit then we continue, otherwise return our current state
if not should_advance:
return mbi
At this point, the logic was pretty straightforward, and mostly just requires building up our dictionaries and lists.
I'm looking for a review on a few things:
- Does this make sense? I need to add docstrings that explain all of this still, but hopefully the reasoning makes sense
- Would making these classes actually help at all? I played around with it, but it didn't really seem to be any clearer than before
- Are there any obvious ways to make this faster? This is working well enough to generate test data, which can be a "run it over the weekend on rare occasions" kind of thing, but I'd love for it to be faster.
Bonus points if you come up with a really clever itertools implementation; I played with it for a while but it eventually got a little too obtuse for me
python python-3.x strings iteration generator
$endgroup$
I'm working on generating large volumes of test data for some performance testing. Part of this is going to involve generating a lot of values that represent an "MBI", which is a certain kind of alpha-numeric identifier.
This identifier has the following rules:
- Is alpha-numeric (uppercase only letters)
- Excludes letters that are commonly confused for numbers or other letters (
SLOIBZ) - Is 11 digits long
- Has specific rules on which values can go in which digit
- The first digit can be a number 1-9
- The second, fifth, eighth, and ninth digits can be any upper case letter besides
SLOIBZ; I don't actually validate the upper-case requirement, and I just callmbi.upper()as needed. - The fourth, seventh, tenth, and eleventh digits can be any number 0-9
- The remaining digits (third and sixth) can be any of the valid letters or numbers (including 0)
The following function validates that an MBI is valid (you can assume this is correct for your own review)
Assuming that the numbers are "before" the letters, the "first" MBI is then "1A00A00AA00". My task is to generate a whole bunch of valid and unique MBIs by "incrementing" them. For example, I want to be able to do this:
for _, mbi in zip(range(count_mbis_to_generate), mbi_generator("1A00A00AA99")):
print(mbi)
This should print count_mbis_to_generate distinct MBIs, starting with "1A00A00AA99" (assuming there are that many distinct MBIs that are greater than or equal to "1A00A00AA99". If I wanted 10000 of them, then the last one should be "1A00A00GA98".
To accomplish this, I started by writing a function to validate that a given MBI matches the expected format:
import re
def validate_mbi(mbi):
flags = re.IGNORECASE ^ re.VERBOSE
pattern = re.compile(
r"""
[1-9] # No leading zeros
[ac-hjkmnp-rt-z] # No letters that can be confused for numbers or each other
[ac-hjkmnp-rt-zd] # Letter or number
d # Any number
[ac-hjkmnp-rt-z]
[ac-hjkmnp-rt-zd]
d
[ac-hjkmnp-rt-z]2 # Then two letters
d2 # Then two numbers
""",
flags=flags,
)
return mbi and len(mbi) == 11 and pattern.match(mbi)
I then created a generator that will start spitting out MBIs:
def mbi_generator(start=None):
if not start:
start = "1A00A00AA00"
while validate_mbi(start):
yield start
start = increment_mbi(start)
Now I actually had to do some thinking. To do this arithmatic by hand, I came up with the following algorithm:
- Starting with the final digit (e.g.
mbi[10]), attempt to increment it by 1 (following all normal rules) - If that results in a "carrying" operation, then set this digit to the minimum allowed value for this digit, and move on to the next digit
- Repeat until I get to the first digit (e.g.
mbi[0]). - If the first digit is the largest allowed value as well (
9), then we cannot increment this MBI, so just returnNone
To accomplish this, I created dictionaries that I'm calling "pseudo-linked lists" (they are nowhere close to a true linked list, but it was the name that popped in my head when I made it). They would look like this:
- If my current value is
'A', then the next value is'B'and we don't have to carry. - If my current value is
'Z', then that is the largest possible value, so the "next" value is either0or'A', depending on digit, and we do have to carry.
This would end up looking like this kind of dictionary (for the 0-9 digits):
'0': ('1', False),
'1': ('2', False),
'2': ('3', False),
'3': ('4', False),
'4': ('5', False),
'5': ('6', False),
'6': ('7', False),
'7': ('8', False),
'8': ('9', False),
'9': ('0', True)
The key is the current value, and then the value's first index is the "next" value, and the second index is the "carry" flag. Note that the values are strings, not numbers, because we're always operating on strings and its easier to store it this way than do conversions all over the place.
The function to spit out one of these dictionaries is like so:
def build_pseudo_linked_list(value_list):
end_point = len(value_list) - 1
return
current_value: (value_list[0] # Get the first value
if i == end_point # If we're at the end of the list
else value_list[i + 1], # Otherwise get the next value
i == end_point) # Whether or not we were at the end of the list
for i, current_value in enumerate(value_list)
From there, I just need to build up the different kinds of dictionaries, and then use them to increment the MBI.
def increment_mbi(mbi):
# Get the list of possible values for each digit
string_list = [char for char in string.ascii_uppercase if char not in "SLOIBZ"] # exclude easily confused letters
digit_list = [str(i) for i in range(0, 10)] # any digit
partial_digit_list = [val for val in digit_list if val != "0"] # Any non-zero digit
combined_list = digit_list + string_list # any value from any list above
# The values a given digit can have as a pseudo-linked list
# current_value : (next_value, should_advance_digit)
first_digit_range = build_pseudo_linked_list(partial_digit_list)
letter_range = build_pseudo_linked_list(string_list)
full_digit_range = build_pseudo_linked_list(digit_list)
combined_range = build_pseudo_linked_list(combined_list)
# Each digit's ranges
ranges = [
first_digit_range,
letter_range,
combined_range,
full_digit_range,
letter_range,
combined_range,
full_digit_range,
letter_range,
letter_range,
full_digit_range,
full_digit_range,
]
index = 10 # Start from the back
mbi = mbi.upper() # If they gave us lower-case values, just roll with it
# Starting from the back, try to increment digits until we don't have to carry
for i in range(index, 0, -1):
next_value, should_advance = ranges[i][mbi[i]]
# If we get to the front and we need to reset, then we can't go any further
if i == 0 and should_advance:
return None
# Otherwise, slide our value in
mbi = mbi[:i] + next_value + mbi[i + 1 :]
# If we need to advance to the next digit then we continue, otherwise return our current state
if not should_advance:
return mbi
At this point, the logic was pretty straightforward, and mostly just requires building up our dictionaries and lists.
I'm looking for a review on a few things:
- Does this make sense? I need to add docstrings that explain all of this still, but hopefully the reasoning makes sense
- Would making these classes actually help at all? I played around with it, but it didn't really seem to be any clearer than before
- Are there any obvious ways to make this faster? This is working well enough to generate test data, which can be a "run it over the weekend on rare occasions" kind of thing, but I'd love for it to be faster.
Bonus points if you come up with a really clever itertools implementation; I played with it for a while but it eventually got a little too obtuse for me
python python-3.x strings iteration generator
python python-3.x strings iteration generator
asked 8 hours ago
DannnnoDannnno
6,45421 silver badges61 bronze badges
6,45421 silver badges61 bronze badges
1
$begingroup$
How about "lookup table" instead of "pseudo-linked lists"?
$endgroup$
– AlexV
8 hours ago
add a comment
|
1
$begingroup$
How about "lookup table" instead of "pseudo-linked lists"?
$endgroup$
– AlexV
8 hours ago
1
1
$begingroup$
How about "lookup table" instead of "pseudo-linked lists"?
$endgroup$
– AlexV
8 hours ago
$begingroup$
How about "lookup table" instead of "pseudo-linked lists"?
$endgroup$
– AlexV
8 hours ago
add a comment
|
3 Answers
3
active
oldest
votes
$begingroup$
Explicit is better than implicit.
Simple is better than complex.
Readability counts.
The easiest way to make it faster would be to simply convert it into code:
def generate_mbis() -> Iterator[str]:
""" Generate MBIs, starting with 1A00A00AA00.
An MBI is a string with the following rules:
[1-9] # No leading zeros
[ac-hjkmnp-rt-z] # No letters that can be confused for numbers or each other
[ac-hjkmnp-rt-zd] # Letter or number
d # Any number
[ac-hjkmnp-rt-z]
[ac-hjkmnp-rt-zd]
d
[ac-hjkmnp-rt-z]2 # Then two letters
d2 # Then two numbers
"""
LETTERS = "acdefghjkmnpqrtuvwxyz".upper()
D0_9 = "0123456789"
# Using 2-space indents because so many
for p0 in range(1, 9):
for p1 in LETTERS:
for p2 in LETTERS + D0_9:
for p3 in D0_9:
for p4 in LETTERS:
for p5 in LETTERS + D0_9:
for p6 in D0_9:
for p7 in LETTERS:
for p8 in LETTERS:
for p9 in D0_9:
for p10 in D0_9:
yield p1+p2+p3+p4+p5+p6+p7+p8+p9+p10
You could probably improve speed by caching the partial sums in the inner loops. You might see some gain by using purely numeric indexing, using range instead of for ... in .... (I don't know this, I just suspect it.)
Implementing a start-string would be straightforward, but would require an if/else at each level. You might make a helper function for that.
Edit:
It's worth noting that yes, you could use itertools.product to get the same effect:
for tpl in itertools.product(range(1, 9),
LETTERS,
LETTERS + D0_9,
D0_9,
LETTERS,
LETTERS + D0_9,
D0_9,
LETTERS,
LETTERS,
D0_9,
D0_9):
yield ''.join(tpl)
But you won't have any opportunity to tweak the performance. It's worth trying, but I expect you'll be able to use a bytearray or partial sum or something to get better speed than you can from this approach.
$endgroup$
3
$begingroup$
These are - by far - the most nested for loops I have ever seen in Python code (well, basically any code) ;-) I'm gladitertools.productexists.
$endgroup$
– AlexV
7 hours ago
2
$begingroup$
I guess you could also take advantage of the fact that there is a repeating pattern: LETTERS, LETTERS + D0_9, D0_9, to reduce the numbers of nested loops
$endgroup$
– Daniel Mesejo
6 hours ago
$begingroup$
Your nested-loop implementation fails to initialize the digit ranges to a starting value, as the OP shows.
$endgroup$
– Reinderien
4 hours ago
$begingroup$
Implementing a start-string would be straightforward, but would require an if/else - right; and any meaningful analysis of this code's performance - which is important, since that's your primary focus - would require that that feature be present.
$endgroup$
– Reinderien
4 hours ago
$begingroup$
@Reinderien That is untrue. The if/else doesn't affect performance, since it would be changing the iterables outside the loop(s). My point mainly is that this is a clean, performant structure that makes obvious what is happening. The original code is not particularly simple, nor clean, nor performant.
$endgroup$
– Austin Hastings
3 hours ago
|
show 2 more comments
$begingroup$
This is a fun problem. I've taken the liberty of writing an alternate implementation, which I will describe below. Note that I've not focused too much on performance at all, only on simplicity and good structure.
Suggested
from typing import Iterable
N = 11
class MBI:
letters = tuple('ACDEFGHJKMNPQRTUVWXYZ')
numbers = tuple(str(i) for i in range(10))
digits = (
numbers[1:],
letters,
numbers + letters,
numbers,
letters,
numbers + letters,
numbers,
letters,
letters,
numbers,
numbers,
)
assert len(digits) == N
sizes = tuple(len(d) for d in digits)
def __init__(self, indices: Iterable[int] = (0,)*N):
indices = list(indices)
assert len(indices) == N
self.indices = indices
def copy(self):
return MBI(self.indices)
@classmethod
def parse(cls, as_str: str):
assert len(as_str) == N
return MBI((
digs.index(s)
for digs, s in zip(cls.digits, as_str)
))
@classmethod
def is_valid(cls, as_str: str) -> bool:
try:
cls.parse(as_str)
return True
except Exception as e:
return False
def __str__(self):
return ''.join(digs[i] for digs, i in zip(self.digits, self.indices))
def add(self, addend=1):
for pos in range(N-1, -1, -1):
carry, digit = divmod(self.indices[pos] + addend, self.sizes[pos])
self.indices[pos] = digit
if carry:
if pos == 0:
raise ValueError()
addend = carry
else:
break
def increasing_same(mbi: MBI, n: int, addend: int = 1) -> Iterable[MBI]:
for _ in range(n):
yield mbi
mbi.add(addend)
def increasing_copy(mbi: MBI, n: int, addend: int = 1) -> Iterable[MBI]:
for _ in range(n):
yield mbi
mbi = mbi.copy()
mbi.add(addend)
def test():
assert MBI.is_valid('1A00A00GA98')
assert not MBI.is_valid('2')
assert not MBI.is_valid('00000000000')
assert MBI.parse('1A00A00GA98').indices == [
0, 0, 0, 0, 0, 0, 0, 5, 0, 9, 8
]
assert MBI.parse('1A00A00AA00').indices == [0]*N
assert str(MBI()) == '1A00A00AA00'
def demo():
for mbi in increasing_same(MBI.parse('4M44M44MN98'), 6):
print(mbi)
print()
for mbi in increasing_same(MBI.parse('8ZZ9ZZ9ZZ95'), 10, 2):
print(mbi)
print()
if __name__ == '__main__':
test()
demo()
Rationale
This is class-based. You asked
Would making these classes actually help at all?
The answer is, in general, yes. Modelling a class like I've shown allows you to encapsulate parsing, formatting, validation and increment logic all in one tidy package. The class instance is mutable, so you can choose whether to iterate-and-mutate or iterate-and-copy.
As to your validation strategy: You do not need regexes; just attempt to parse and catch a failure. Parsing into a machine-usable sequence of integers can serve both purposes: preparation for processing, as well as validation that the thing is processible at all.
Another sensible modification is generalization of the addend in your incrementing logic. You can add by any number, not just incrementing by one, and the change is easy enough that it's better to make a generic function that can add anything instead of something that can only increment.
Does this make sense?
It's more complicated than it needs to be, and it isn't really tested. The suggested implementation includes some rudimentary tests that give you more confidence that this thing is doing what it should.
$endgroup$
1
$begingroup$
@dfhwze Fair enough; I've expanded a rationale section
$endgroup$
– Reinderien
5 hours ago
add a comment
|
$begingroup$
itertools.product()
This is a task made for itertools.product()
import itertools as it
DIGIT = '0123456789'
LETTER = 'ACDEFGHJKMNPQRTUVWXYZ'
DIGLET = DIGIT + LETTER
def mbi_gen(start=None):
pattern = [DIGIT[1:], # leading non-zero digit
LETTER,
DIGLET,
DIGIT,
LETTER,
DIGLET,
DIGIT,
LETTER,
LETTER,
DIGIT,
DIGIT
]
if start:
for pos, (char, seq) in enumerate(zip(start, pattern)):
print(f"pos, char, seq", end=" -> ")
index = seq.index(char)
pattern[pos] = seq[index:] + seq[:index]
print(f"char: pattern[pos]")
yield from (''.join(seq) for seq in it.product(*pattern))
mbi_gen() is a generator and can be used like so:
MBI = mbi_gen()
for n in range(10000):
print(next(MBI))
or to generate a list of 10000 MBIs:
MBIs = list(it.islice(mbi_gen('3C12D34EF56'), 10000)
$endgroup$
add a comment
|
Your Answer
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "196"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/4.0/"u003ecc by-sa 4.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f230430%2fgenerating-sequential-alphanumeric-values-that-match-a-certain-pattern%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
$begingroup$
Explicit is better than implicit.
Simple is better than complex.
Readability counts.
The easiest way to make it faster would be to simply convert it into code:
def generate_mbis() -> Iterator[str]:
""" Generate MBIs, starting with 1A00A00AA00.
An MBI is a string with the following rules:
[1-9] # No leading zeros
[ac-hjkmnp-rt-z] # No letters that can be confused for numbers or each other
[ac-hjkmnp-rt-zd] # Letter or number
d # Any number
[ac-hjkmnp-rt-z]
[ac-hjkmnp-rt-zd]
d
[ac-hjkmnp-rt-z]2 # Then two letters
d2 # Then two numbers
"""
LETTERS = "acdefghjkmnpqrtuvwxyz".upper()
D0_9 = "0123456789"
# Using 2-space indents because so many
for p0 in range(1, 9):
for p1 in LETTERS:
for p2 in LETTERS + D0_9:
for p3 in D0_9:
for p4 in LETTERS:
for p5 in LETTERS + D0_9:
for p6 in D0_9:
for p7 in LETTERS:
for p8 in LETTERS:
for p9 in D0_9:
for p10 in D0_9:
yield p1+p2+p3+p4+p5+p6+p7+p8+p9+p10
You could probably improve speed by caching the partial sums in the inner loops. You might see some gain by using purely numeric indexing, using range instead of for ... in .... (I don't know this, I just suspect it.)
Implementing a start-string would be straightforward, but would require an if/else at each level. You might make a helper function for that.
Edit:
It's worth noting that yes, you could use itertools.product to get the same effect:
for tpl in itertools.product(range(1, 9),
LETTERS,
LETTERS + D0_9,
D0_9,
LETTERS,
LETTERS + D0_9,
D0_9,
LETTERS,
LETTERS,
D0_9,
D0_9):
yield ''.join(tpl)
But you won't have any opportunity to tweak the performance. It's worth trying, but I expect you'll be able to use a bytearray or partial sum or something to get better speed than you can from this approach.
$endgroup$
3
$begingroup$
These are - by far - the most nested for loops I have ever seen in Python code (well, basically any code) ;-) I'm gladitertools.productexists.
$endgroup$
– AlexV
7 hours ago
2
$begingroup$
I guess you could also take advantage of the fact that there is a repeating pattern: LETTERS, LETTERS + D0_9, D0_9, to reduce the numbers of nested loops
$endgroup$
– Daniel Mesejo
6 hours ago
$begingroup$
Your nested-loop implementation fails to initialize the digit ranges to a starting value, as the OP shows.
$endgroup$
– Reinderien
4 hours ago
$begingroup$
Implementing a start-string would be straightforward, but would require an if/else - right; and any meaningful analysis of this code's performance - which is important, since that's your primary focus - would require that that feature be present.
$endgroup$
– Reinderien
4 hours ago
$begingroup$
@Reinderien That is untrue. The if/else doesn't affect performance, since it would be changing the iterables outside the loop(s). My point mainly is that this is a clean, performant structure that makes obvious what is happening. The original code is not particularly simple, nor clean, nor performant.
$endgroup$
– Austin Hastings
3 hours ago
|
show 2 more comments
$begingroup$
Explicit is better than implicit.
Simple is better than complex.
Readability counts.
The easiest way to make it faster would be to simply convert it into code:
def generate_mbis() -> Iterator[str]:
""" Generate MBIs, starting with 1A00A00AA00.
An MBI is a string with the following rules:
[1-9] # No leading zeros
[ac-hjkmnp-rt-z] # No letters that can be confused for numbers or each other
[ac-hjkmnp-rt-zd] # Letter or number
d # Any number
[ac-hjkmnp-rt-z]
[ac-hjkmnp-rt-zd]
d
[ac-hjkmnp-rt-z]2 # Then two letters
d2 # Then two numbers
"""
LETTERS = "acdefghjkmnpqrtuvwxyz".upper()
D0_9 = "0123456789"
# Using 2-space indents because so many
for p0 in range(1, 9):
for p1 in LETTERS:
for p2 in LETTERS + D0_9:
for p3 in D0_9:
for p4 in LETTERS:
for p5 in LETTERS + D0_9:
for p6 in D0_9:
for p7 in LETTERS:
for p8 in LETTERS:
for p9 in D0_9:
for p10 in D0_9:
yield p1+p2+p3+p4+p5+p6+p7+p8+p9+p10
You could probably improve speed by caching the partial sums in the inner loops. You might see some gain by using purely numeric indexing, using range instead of for ... in .... (I don't know this, I just suspect it.)
Implementing a start-string would be straightforward, but would require an if/else at each level. You might make a helper function for that.
Edit:
It's worth noting that yes, you could use itertools.product to get the same effect:
for tpl in itertools.product(range(1, 9),
LETTERS,
LETTERS + D0_9,
D0_9,
LETTERS,
LETTERS + D0_9,
D0_9,
LETTERS,
LETTERS,
D0_9,
D0_9):
yield ''.join(tpl)
But you won't have any opportunity to tweak the performance. It's worth trying, but I expect you'll be able to use a bytearray or partial sum or something to get better speed than you can from this approach.
$endgroup$
3
$begingroup$
These are - by far - the most nested for loops I have ever seen in Python code (well, basically any code) ;-) I'm gladitertools.productexists.
$endgroup$
– AlexV
7 hours ago
2
$begingroup$
I guess you could also take advantage of the fact that there is a repeating pattern: LETTERS, LETTERS + D0_9, D0_9, to reduce the numbers of nested loops
$endgroup$
– Daniel Mesejo
6 hours ago
$begingroup$
Your nested-loop implementation fails to initialize the digit ranges to a starting value, as the OP shows.
$endgroup$
– Reinderien
4 hours ago
$begingroup$
Implementing a start-string would be straightforward, but would require an if/else - right; and any meaningful analysis of this code's performance - which is important, since that's your primary focus - would require that that feature be present.
$endgroup$
– Reinderien
4 hours ago
$begingroup$
@Reinderien That is untrue. The if/else doesn't affect performance, since it would be changing the iterables outside the loop(s). My point mainly is that this is a clean, performant structure that makes obvious what is happening. The original code is not particularly simple, nor clean, nor performant.
$endgroup$
– Austin Hastings
3 hours ago
|
show 2 more comments
$begingroup$
Explicit is better than implicit.
Simple is better than complex.
Readability counts.
The easiest way to make it faster would be to simply convert it into code:
def generate_mbis() -> Iterator[str]:
""" Generate MBIs, starting with 1A00A00AA00.
An MBI is a string with the following rules:
[1-9] # No leading zeros
[ac-hjkmnp-rt-z] # No letters that can be confused for numbers or each other
[ac-hjkmnp-rt-zd] # Letter or number
d # Any number
[ac-hjkmnp-rt-z]
[ac-hjkmnp-rt-zd]
d
[ac-hjkmnp-rt-z]2 # Then two letters
d2 # Then two numbers
"""
LETTERS = "acdefghjkmnpqrtuvwxyz".upper()
D0_9 = "0123456789"
# Using 2-space indents because so many
for p0 in range(1, 9):
for p1 in LETTERS:
for p2 in LETTERS + D0_9:
for p3 in D0_9:
for p4 in LETTERS:
for p5 in LETTERS + D0_9:
for p6 in D0_9:
for p7 in LETTERS:
for p8 in LETTERS:
for p9 in D0_9:
for p10 in D0_9:
yield p1+p2+p3+p4+p5+p6+p7+p8+p9+p10
You could probably improve speed by caching the partial sums in the inner loops. You might see some gain by using purely numeric indexing, using range instead of for ... in .... (I don't know this, I just suspect it.)
Implementing a start-string would be straightforward, but would require an if/else at each level. You might make a helper function for that.
Edit:
It's worth noting that yes, you could use itertools.product to get the same effect:
for tpl in itertools.product(range(1, 9),
LETTERS,
LETTERS + D0_9,
D0_9,
LETTERS,
LETTERS + D0_9,
D0_9,
LETTERS,
LETTERS,
D0_9,
D0_9):
yield ''.join(tpl)
But you won't have any opportunity to tweak the performance. It's worth trying, but I expect you'll be able to use a bytearray or partial sum or something to get better speed than you can from this approach.
$endgroup$
Explicit is better than implicit.
Simple is better than complex.
Readability counts.
The easiest way to make it faster would be to simply convert it into code:
def generate_mbis() -> Iterator[str]:
""" Generate MBIs, starting with 1A00A00AA00.
An MBI is a string with the following rules:
[1-9] # No leading zeros
[ac-hjkmnp-rt-z] # No letters that can be confused for numbers or each other
[ac-hjkmnp-rt-zd] # Letter or number
d # Any number
[ac-hjkmnp-rt-z]
[ac-hjkmnp-rt-zd]
d
[ac-hjkmnp-rt-z]2 # Then two letters
d2 # Then two numbers
"""
LETTERS = "acdefghjkmnpqrtuvwxyz".upper()
D0_9 = "0123456789"
# Using 2-space indents because so many
for p0 in range(1, 9):
for p1 in LETTERS:
for p2 in LETTERS + D0_9:
for p3 in D0_9:
for p4 in LETTERS:
for p5 in LETTERS + D0_9:
for p6 in D0_9:
for p7 in LETTERS:
for p8 in LETTERS:
for p9 in D0_9:
for p10 in D0_9:
yield p1+p2+p3+p4+p5+p6+p7+p8+p9+p10
You could probably improve speed by caching the partial sums in the inner loops. You might see some gain by using purely numeric indexing, using range instead of for ... in .... (I don't know this, I just suspect it.)
Implementing a start-string would be straightforward, but would require an if/else at each level. You might make a helper function for that.
Edit:
It's worth noting that yes, you could use itertools.product to get the same effect:
for tpl in itertools.product(range(1, 9),
LETTERS,
LETTERS + D0_9,
D0_9,
LETTERS,
LETTERS + D0_9,
D0_9,
LETTERS,
LETTERS,
D0_9,
D0_9):
yield ''.join(tpl)
But you won't have any opportunity to tweak the performance. It's worth trying, but I expect you'll be able to use a bytearray or partial sum or something to get better speed than you can from this approach.
edited 7 hours ago
answered 7 hours ago
Austin HastingsAustin Hastings
9,76515 silver badges40 bronze badges
9,76515 silver badges40 bronze badges
3
$begingroup$
These are - by far - the most nested for loops I have ever seen in Python code (well, basically any code) ;-) I'm gladitertools.productexists.
$endgroup$
– AlexV
7 hours ago
2
$begingroup$
I guess you could also take advantage of the fact that there is a repeating pattern: LETTERS, LETTERS + D0_9, D0_9, to reduce the numbers of nested loops
$endgroup$
– Daniel Mesejo
6 hours ago
$begingroup$
Your nested-loop implementation fails to initialize the digit ranges to a starting value, as the OP shows.
$endgroup$
– Reinderien
4 hours ago
$begingroup$
Implementing a start-string would be straightforward, but would require an if/else - right; and any meaningful analysis of this code's performance - which is important, since that's your primary focus - would require that that feature be present.
$endgroup$
– Reinderien
4 hours ago
$begingroup$
@Reinderien That is untrue. The if/else doesn't affect performance, since it would be changing the iterables outside the loop(s). My point mainly is that this is a clean, performant structure that makes obvious what is happening. The original code is not particularly simple, nor clean, nor performant.
$endgroup$
– Austin Hastings
3 hours ago
|
show 2 more comments
3
$begingroup$
These are - by far - the most nested for loops I have ever seen in Python code (well, basically any code) ;-) I'm gladitertools.productexists.
$endgroup$
– AlexV
7 hours ago
2
$begingroup$
I guess you could also take advantage of the fact that there is a repeating pattern: LETTERS, LETTERS + D0_9, D0_9, to reduce the numbers of nested loops
$endgroup$
– Daniel Mesejo
6 hours ago
$begingroup$
Your nested-loop implementation fails to initialize the digit ranges to a starting value, as the OP shows.
$endgroup$
– Reinderien
4 hours ago
$begingroup$
Implementing a start-string would be straightforward, but would require an if/else - right; and any meaningful analysis of this code's performance - which is important, since that's your primary focus - would require that that feature be present.
$endgroup$
– Reinderien
4 hours ago
$begingroup$
@Reinderien That is untrue. The if/else doesn't affect performance, since it would be changing the iterables outside the loop(s). My point mainly is that this is a clean, performant structure that makes obvious what is happening. The original code is not particularly simple, nor clean, nor performant.
$endgroup$
– Austin Hastings
3 hours ago
3
3
$begingroup$
These are - by far - the most nested for loops I have ever seen in Python code (well, basically any code) ;-) I'm glad
itertools.product exists.$endgroup$
– AlexV
7 hours ago
$begingroup$
These are - by far - the most nested for loops I have ever seen in Python code (well, basically any code) ;-) I'm glad
itertools.product exists.$endgroup$
– AlexV
7 hours ago
2
2
$begingroup$
I guess you could also take advantage of the fact that there is a repeating pattern: LETTERS, LETTERS + D0_9, D0_9, to reduce the numbers of nested loops
$endgroup$
– Daniel Mesejo
6 hours ago
$begingroup$
I guess you could also take advantage of the fact that there is a repeating pattern: LETTERS, LETTERS + D0_9, D0_9, to reduce the numbers of nested loops
$endgroup$
– Daniel Mesejo
6 hours ago
$begingroup$
Your nested-loop implementation fails to initialize the digit ranges to a starting value, as the OP shows.
$endgroup$
– Reinderien
4 hours ago
$begingroup$
Your nested-loop implementation fails to initialize the digit ranges to a starting value, as the OP shows.
$endgroup$
– Reinderien
4 hours ago
$begingroup$
Implementing a start-string would be straightforward, but would require an if/else - right; and any meaningful analysis of this code's performance - which is important, since that's your primary focus - would require that that feature be present.
$endgroup$
– Reinderien
4 hours ago
$begingroup$
Implementing a start-string would be straightforward, but would require an if/else - right; and any meaningful analysis of this code's performance - which is important, since that's your primary focus - would require that that feature be present.
$endgroup$
– Reinderien
4 hours ago
$begingroup$
@Reinderien That is untrue. The if/else doesn't affect performance, since it would be changing the iterables outside the loop(s). My point mainly is that this is a clean, performant structure that makes obvious what is happening. The original code is not particularly simple, nor clean, nor performant.
$endgroup$
– Austin Hastings
3 hours ago
$begingroup$
@Reinderien That is untrue. The if/else doesn't affect performance, since it would be changing the iterables outside the loop(s). My point mainly is that this is a clean, performant structure that makes obvious what is happening. The original code is not particularly simple, nor clean, nor performant.
$endgroup$
– Austin Hastings
3 hours ago
|
show 2 more comments
$begingroup$
This is a fun problem. I've taken the liberty of writing an alternate implementation, which I will describe below. Note that I've not focused too much on performance at all, only on simplicity and good structure.
Suggested
from typing import Iterable
N = 11
class MBI:
letters = tuple('ACDEFGHJKMNPQRTUVWXYZ')
numbers = tuple(str(i) for i in range(10))
digits = (
numbers[1:],
letters,
numbers + letters,
numbers,
letters,
numbers + letters,
numbers,
letters,
letters,
numbers,
numbers,
)
assert len(digits) == N
sizes = tuple(len(d) for d in digits)
def __init__(self, indices: Iterable[int] = (0,)*N):
indices = list(indices)
assert len(indices) == N
self.indices = indices
def copy(self):
return MBI(self.indices)
@classmethod
def parse(cls, as_str: str):
assert len(as_str) == N
return MBI((
digs.index(s)
for digs, s in zip(cls.digits, as_str)
))
@classmethod
def is_valid(cls, as_str: str) -> bool:
try:
cls.parse(as_str)
return True
except Exception as e:
return False
def __str__(self):
return ''.join(digs[i] for digs, i in zip(self.digits, self.indices))
def add(self, addend=1):
for pos in range(N-1, -1, -1):
carry, digit = divmod(self.indices[pos] + addend, self.sizes[pos])
self.indices[pos] = digit
if carry:
if pos == 0:
raise ValueError()
addend = carry
else:
break
def increasing_same(mbi: MBI, n: int, addend: int = 1) -> Iterable[MBI]:
for _ in range(n):
yield mbi
mbi.add(addend)
def increasing_copy(mbi: MBI, n: int, addend: int = 1) -> Iterable[MBI]:
for _ in range(n):
yield mbi
mbi = mbi.copy()
mbi.add(addend)
def test():
assert MBI.is_valid('1A00A00GA98')
assert not MBI.is_valid('2')
assert not MBI.is_valid('00000000000')
assert MBI.parse('1A00A00GA98').indices == [
0, 0, 0, 0, 0, 0, 0, 5, 0, 9, 8
]
assert MBI.parse('1A00A00AA00').indices == [0]*N
assert str(MBI()) == '1A00A00AA00'
def demo():
for mbi in increasing_same(MBI.parse('4M44M44MN98'), 6):
print(mbi)
print()
for mbi in increasing_same(MBI.parse('8ZZ9ZZ9ZZ95'), 10, 2):
print(mbi)
print()
if __name__ == '__main__':
test()
demo()
Rationale
This is class-based. You asked
Would making these classes actually help at all?
The answer is, in general, yes. Modelling a class like I've shown allows you to encapsulate parsing, formatting, validation and increment logic all in one tidy package. The class instance is mutable, so you can choose whether to iterate-and-mutate or iterate-and-copy.
As to your validation strategy: You do not need regexes; just attempt to parse and catch a failure. Parsing into a machine-usable sequence of integers can serve both purposes: preparation for processing, as well as validation that the thing is processible at all.
Another sensible modification is generalization of the addend in your incrementing logic. You can add by any number, not just incrementing by one, and the change is easy enough that it's better to make a generic function that can add anything instead of something that can only increment.
Does this make sense?
It's more complicated than it needs to be, and it isn't really tested. The suggested implementation includes some rudimentary tests that give you more confidence that this thing is doing what it should.
$endgroup$
1
$begingroup$
@dfhwze Fair enough; I've expanded a rationale section
$endgroup$
– Reinderien
5 hours ago
add a comment
|
$begingroup$
This is a fun problem. I've taken the liberty of writing an alternate implementation, which I will describe below. Note that I've not focused too much on performance at all, only on simplicity and good structure.
Suggested
from typing import Iterable
N = 11
class MBI:
letters = tuple('ACDEFGHJKMNPQRTUVWXYZ')
numbers = tuple(str(i) for i in range(10))
digits = (
numbers[1:],
letters,
numbers + letters,
numbers,
letters,
numbers + letters,
numbers,
letters,
letters,
numbers,
numbers,
)
assert len(digits) == N
sizes = tuple(len(d) for d in digits)
def __init__(self, indices: Iterable[int] = (0,)*N):
indices = list(indices)
assert len(indices) == N
self.indices = indices
def copy(self):
return MBI(self.indices)
@classmethod
def parse(cls, as_str: str):
assert len(as_str) == N
return MBI((
digs.index(s)
for digs, s in zip(cls.digits, as_str)
))
@classmethod
def is_valid(cls, as_str: str) -> bool:
try:
cls.parse(as_str)
return True
except Exception as e:
return False
def __str__(self):
return ''.join(digs[i] for digs, i in zip(self.digits, self.indices))
def add(self, addend=1):
for pos in range(N-1, -1, -1):
carry, digit = divmod(self.indices[pos] + addend, self.sizes[pos])
self.indices[pos] = digit
if carry:
if pos == 0:
raise ValueError()
addend = carry
else:
break
def increasing_same(mbi: MBI, n: int, addend: int = 1) -> Iterable[MBI]:
for _ in range(n):
yield mbi
mbi.add(addend)
def increasing_copy(mbi: MBI, n: int, addend: int = 1) -> Iterable[MBI]:
for _ in range(n):
yield mbi
mbi = mbi.copy()
mbi.add(addend)
def test():
assert MBI.is_valid('1A00A00GA98')
assert not MBI.is_valid('2')
assert not MBI.is_valid('00000000000')
assert MBI.parse('1A00A00GA98').indices == [
0, 0, 0, 0, 0, 0, 0, 5, 0, 9, 8
]
assert MBI.parse('1A00A00AA00').indices == [0]*N
assert str(MBI()) == '1A00A00AA00'
def demo():
for mbi in increasing_same(MBI.parse('4M44M44MN98'), 6):
print(mbi)
print()
for mbi in increasing_same(MBI.parse('8ZZ9ZZ9ZZ95'), 10, 2):
print(mbi)
print()
if __name__ == '__main__':
test()
demo()
Rationale
This is class-based. You asked
Would making these classes actually help at all?
The answer is, in general, yes. Modelling a class like I've shown allows you to encapsulate parsing, formatting, validation and increment logic all in one tidy package. The class instance is mutable, so you can choose whether to iterate-and-mutate or iterate-and-copy.
As to your validation strategy: You do not need regexes; just attempt to parse and catch a failure. Parsing into a machine-usable sequence of integers can serve both purposes: preparation for processing, as well as validation that the thing is processible at all.
Another sensible modification is generalization of the addend in your incrementing logic. You can add by any number, not just incrementing by one, and the change is easy enough that it's better to make a generic function that can add anything instead of something that can only increment.
Does this make sense?
It's more complicated than it needs to be, and it isn't really tested. The suggested implementation includes some rudimentary tests that give you more confidence that this thing is doing what it should.
$endgroup$
1
$begingroup$
@dfhwze Fair enough; I've expanded a rationale section
$endgroup$
– Reinderien
5 hours ago
add a comment
|
$begingroup$
This is a fun problem. I've taken the liberty of writing an alternate implementation, which I will describe below. Note that I've not focused too much on performance at all, only on simplicity and good structure.
Suggested
from typing import Iterable
N = 11
class MBI:
letters = tuple('ACDEFGHJKMNPQRTUVWXYZ')
numbers = tuple(str(i) for i in range(10))
digits = (
numbers[1:],
letters,
numbers + letters,
numbers,
letters,
numbers + letters,
numbers,
letters,
letters,
numbers,
numbers,
)
assert len(digits) == N
sizes = tuple(len(d) for d in digits)
def __init__(self, indices: Iterable[int] = (0,)*N):
indices = list(indices)
assert len(indices) == N
self.indices = indices
def copy(self):
return MBI(self.indices)
@classmethod
def parse(cls, as_str: str):
assert len(as_str) == N
return MBI((
digs.index(s)
for digs, s in zip(cls.digits, as_str)
))
@classmethod
def is_valid(cls, as_str: str) -> bool:
try:
cls.parse(as_str)
return True
except Exception as e:
return False
def __str__(self):
return ''.join(digs[i] for digs, i in zip(self.digits, self.indices))
def add(self, addend=1):
for pos in range(N-1, -1, -1):
carry, digit = divmod(self.indices[pos] + addend, self.sizes[pos])
self.indices[pos] = digit
if carry:
if pos == 0:
raise ValueError()
addend = carry
else:
break
def increasing_same(mbi: MBI, n: int, addend: int = 1) -> Iterable[MBI]:
for _ in range(n):
yield mbi
mbi.add(addend)
def increasing_copy(mbi: MBI, n: int, addend: int = 1) -> Iterable[MBI]:
for _ in range(n):
yield mbi
mbi = mbi.copy()
mbi.add(addend)
def test():
assert MBI.is_valid('1A00A00GA98')
assert not MBI.is_valid('2')
assert not MBI.is_valid('00000000000')
assert MBI.parse('1A00A00GA98').indices == [
0, 0, 0, 0, 0, 0, 0, 5, 0, 9, 8
]
assert MBI.parse('1A00A00AA00').indices == [0]*N
assert str(MBI()) == '1A00A00AA00'
def demo():
for mbi in increasing_same(MBI.parse('4M44M44MN98'), 6):
print(mbi)
print()
for mbi in increasing_same(MBI.parse('8ZZ9ZZ9ZZ95'), 10, 2):
print(mbi)
print()
if __name__ == '__main__':
test()
demo()
Rationale
This is class-based. You asked
Would making these classes actually help at all?
The answer is, in general, yes. Modelling a class like I've shown allows you to encapsulate parsing, formatting, validation and increment logic all in one tidy package. The class instance is mutable, so you can choose whether to iterate-and-mutate or iterate-and-copy.
As to your validation strategy: You do not need regexes; just attempt to parse and catch a failure. Parsing into a machine-usable sequence of integers can serve both purposes: preparation for processing, as well as validation that the thing is processible at all.
Another sensible modification is generalization of the addend in your incrementing logic. You can add by any number, not just incrementing by one, and the change is easy enough that it's better to make a generic function that can add anything instead of something that can only increment.
Does this make sense?
It's more complicated than it needs to be, and it isn't really tested. The suggested implementation includes some rudimentary tests that give you more confidence that this thing is doing what it should.
$endgroup$
This is a fun problem. I've taken the liberty of writing an alternate implementation, which I will describe below. Note that I've not focused too much on performance at all, only on simplicity and good structure.
Suggested
from typing import Iterable
N = 11
class MBI:
letters = tuple('ACDEFGHJKMNPQRTUVWXYZ')
numbers = tuple(str(i) for i in range(10))
digits = (
numbers[1:],
letters,
numbers + letters,
numbers,
letters,
numbers + letters,
numbers,
letters,
letters,
numbers,
numbers,
)
assert len(digits) == N
sizes = tuple(len(d) for d in digits)
def __init__(self, indices: Iterable[int] = (0,)*N):
indices = list(indices)
assert len(indices) == N
self.indices = indices
def copy(self):
return MBI(self.indices)
@classmethod
def parse(cls, as_str: str):
assert len(as_str) == N
return MBI((
digs.index(s)
for digs, s in zip(cls.digits, as_str)
))
@classmethod
def is_valid(cls, as_str: str) -> bool:
try:
cls.parse(as_str)
return True
except Exception as e:
return False
def __str__(self):
return ''.join(digs[i] for digs, i in zip(self.digits, self.indices))
def add(self, addend=1):
for pos in range(N-1, -1, -1):
carry, digit = divmod(self.indices[pos] + addend, self.sizes[pos])
self.indices[pos] = digit
if carry:
if pos == 0:
raise ValueError()
addend = carry
else:
break
def increasing_same(mbi: MBI, n: int, addend: int = 1) -> Iterable[MBI]:
for _ in range(n):
yield mbi
mbi.add(addend)
def increasing_copy(mbi: MBI, n: int, addend: int = 1) -> Iterable[MBI]:
for _ in range(n):
yield mbi
mbi = mbi.copy()
mbi.add(addend)
def test():
assert MBI.is_valid('1A00A00GA98')
assert not MBI.is_valid('2')
assert not MBI.is_valid('00000000000')
assert MBI.parse('1A00A00GA98').indices == [
0, 0, 0, 0, 0, 0, 0, 5, 0, 9, 8
]
assert MBI.parse('1A00A00AA00').indices == [0]*N
assert str(MBI()) == '1A00A00AA00'
def demo():
for mbi in increasing_same(MBI.parse('4M44M44MN98'), 6):
print(mbi)
print()
for mbi in increasing_same(MBI.parse('8ZZ9ZZ9ZZ95'), 10, 2):
print(mbi)
print()
if __name__ == '__main__':
test()
demo()
Rationale
This is class-based. You asked
Would making these classes actually help at all?
The answer is, in general, yes. Modelling a class like I've shown allows you to encapsulate parsing, formatting, validation and increment logic all in one tidy package. The class instance is mutable, so you can choose whether to iterate-and-mutate or iterate-and-copy.
As to your validation strategy: You do not need regexes; just attempt to parse and catch a failure. Parsing into a machine-usable sequence of integers can serve both purposes: preparation for processing, as well as validation that the thing is processible at all.
Another sensible modification is generalization of the addend in your incrementing logic. You can add by any number, not just incrementing by one, and the change is easy enough that it's better to make a generic function that can add anything instead of something that can only increment.
Does this make sense?
It's more complicated than it needs to be, and it isn't really tested. The suggested implementation includes some rudimentary tests that give you more confidence that this thing is doing what it should.
edited 5 hours ago
answered 5 hours ago
ReinderienReinderien
11.3k21 silver badges48 bronze badges
11.3k21 silver badges48 bronze badges
1
$begingroup$
@dfhwze Fair enough; I've expanded a rationale section
$endgroup$
– Reinderien
5 hours ago
add a comment
|
1
$begingroup$
@dfhwze Fair enough; I've expanded a rationale section
$endgroup$
– Reinderien
5 hours ago
1
1
$begingroup$
@dfhwze Fair enough; I've expanded a rationale section
$endgroup$
– Reinderien
5 hours ago
$begingroup$
@dfhwze Fair enough; I've expanded a rationale section
$endgroup$
– Reinderien
5 hours ago
add a comment
|
$begingroup$
itertools.product()
This is a task made for itertools.product()
import itertools as it
DIGIT = '0123456789'
LETTER = 'ACDEFGHJKMNPQRTUVWXYZ'
DIGLET = DIGIT + LETTER
def mbi_gen(start=None):
pattern = [DIGIT[1:], # leading non-zero digit
LETTER,
DIGLET,
DIGIT,
LETTER,
DIGLET,
DIGIT,
LETTER,
LETTER,
DIGIT,
DIGIT
]
if start:
for pos, (char, seq) in enumerate(zip(start, pattern)):
print(f"pos, char, seq", end=" -> ")
index = seq.index(char)
pattern[pos] = seq[index:] + seq[:index]
print(f"char: pattern[pos]")
yield from (''.join(seq) for seq in it.product(*pattern))
mbi_gen() is a generator and can be used like so:
MBI = mbi_gen()
for n in range(10000):
print(next(MBI))
or to generate a list of 10000 MBIs:
MBIs = list(it.islice(mbi_gen('3C12D34EF56'), 10000)
$endgroup$
add a comment
|
$begingroup$
itertools.product()
This is a task made for itertools.product()
import itertools as it
DIGIT = '0123456789'
LETTER = 'ACDEFGHJKMNPQRTUVWXYZ'
DIGLET = DIGIT + LETTER
def mbi_gen(start=None):
pattern = [DIGIT[1:], # leading non-zero digit
LETTER,
DIGLET,
DIGIT,
LETTER,
DIGLET,
DIGIT,
LETTER,
LETTER,
DIGIT,
DIGIT
]
if start:
for pos, (char, seq) in enumerate(zip(start, pattern)):
print(f"pos, char, seq", end=" -> ")
index = seq.index(char)
pattern[pos] = seq[index:] + seq[:index]
print(f"char: pattern[pos]")
yield from (''.join(seq) for seq in it.product(*pattern))
mbi_gen() is a generator and can be used like so:
MBI = mbi_gen()
for n in range(10000):
print(next(MBI))
or to generate a list of 10000 MBIs:
MBIs = list(it.islice(mbi_gen('3C12D34EF56'), 10000)
$endgroup$
add a comment
|
$begingroup$
itertools.product()
This is a task made for itertools.product()
import itertools as it
DIGIT = '0123456789'
LETTER = 'ACDEFGHJKMNPQRTUVWXYZ'
DIGLET = DIGIT + LETTER
def mbi_gen(start=None):
pattern = [DIGIT[1:], # leading non-zero digit
LETTER,
DIGLET,
DIGIT,
LETTER,
DIGLET,
DIGIT,
LETTER,
LETTER,
DIGIT,
DIGIT
]
if start:
for pos, (char, seq) in enumerate(zip(start, pattern)):
print(f"pos, char, seq", end=" -> ")
index = seq.index(char)
pattern[pos] = seq[index:] + seq[:index]
print(f"char: pattern[pos]")
yield from (''.join(seq) for seq in it.product(*pattern))
mbi_gen() is a generator and can be used like so:
MBI = mbi_gen()
for n in range(10000):
print(next(MBI))
or to generate a list of 10000 MBIs:
MBIs = list(it.islice(mbi_gen('3C12D34EF56'), 10000)
$endgroup$
itertools.product()
This is a task made for itertools.product()
import itertools as it
DIGIT = '0123456789'
LETTER = 'ACDEFGHJKMNPQRTUVWXYZ'
DIGLET = DIGIT + LETTER
def mbi_gen(start=None):
pattern = [DIGIT[1:], # leading non-zero digit
LETTER,
DIGLET,
DIGIT,
LETTER,
DIGLET,
DIGIT,
LETTER,
LETTER,
DIGIT,
DIGIT
]
if start:
for pos, (char, seq) in enumerate(zip(start, pattern)):
print(f"pos, char, seq", end=" -> ")
index = seq.index(char)
pattern[pos] = seq[index:] + seq[:index]
print(f"char: pattern[pos]")
yield from (''.join(seq) for seq in it.product(*pattern))
mbi_gen() is a generator and can be used like so:
MBI = mbi_gen()
for n in range(10000):
print(next(MBI))
or to generate a list of 10000 MBIs:
MBIs = list(it.islice(mbi_gen('3C12D34EF56'), 10000)
answered 17 mins ago
RootTwoRootTwo
2,1243 silver badges10 bronze badges
2,1243 silver badges10 bronze badges
add a comment
|
add a comment
|
Thanks for contributing an answer to Code Review Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f230430%2fgenerating-sequential-alphanumeric-values-that-match-a-certain-pattern%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
1
$begingroup$
How about "lookup table" instead of "pseudo-linked lists"?
$endgroup$
– AlexV
8 hours ago