Finding collisions of the first few bits of a SHA-1 hashUsing Levenstein distance to compare stringsReview of reservoir samplingKarp-Rabin with bitwise hashMinimum window substring which includes all the characters from substringOptimizing Java SHA-512 String-to-Hash GeneratorHackerEarth - SimpleFunctionLeetcode 49: Group Anagrams - Hash function design talkFollow-up 1: Compare 2 unordered, rooted trees for shape-isomorphismChecking whether one string contains all the characters of another stringFinding the lowest possible number of changes to turn one string into another

Find this cartoon

Why did Jon Snow do this immoral act if he is so honorable?

Drums and punctuation

Which European Languages are not Indo-European?

Natural Armour and Weapons

My players want to grind XP but we're using landmark advancement

Why did the person in charge of a principality not just declare themself king?

What did the 'turbo' button actually do?

Determine this limit

Why isn't 'chemically-strengthened glass' made with potassium carbonate to begin with?

SFDX: where can set Field-level security and accessibility?

Why did Theresa May offer a vote on a second Brexit referendum?

Finding collisions of the first few bits of a SHA-1 hash

How do I superimpose two math symbols?

Can a wizard copy a spell without first identifying it?

Make 24 using exactly three 3s

USPS Back Room - Trespassing?

Dad jokes are fun

How to patch glass cuts in a bicycle tire?

I know that there is a preselected candidate for a position to be filled at my department. What should I do?

Making a electromagnet

ESTA validity after a visa denial

Take elements from a list based on two criteria

How did NASA Langley end up with the first 737?



Finding collisions of the first few bits of a SHA-1 hash


Using Levenstein distance to compare stringsReview of reservoir samplingKarp-Rabin with bitwise hashMinimum window substring which includes all the characters from substringOptimizing Java SHA-512 String-to-Hash GeneratorHackerEarth - SimpleFunctionLeetcode 49: Group Anagrams - Hash function design talkFollow-up 1: Compare 2 unordered, rooted trees for shape-isomorphismChecking whether one string contains all the characters of another stringFinding the lowest possible number of changes to turn one string into another






.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty margin-bottom:0;








5












$begingroup$


My objective is to find a hash collision of my modified hash function. Assuming my modified hash only outputs the first 36 bits of SHA-1. As we know, SHA-1 is a 160-bit hash value, hence, we only need to output 9 characters out of 40 characters for comparison.



How i begin my program is by hashing a string (I have a SHA-1 algorithm running and i will name it as sha1. I have also ensured that the output of the SHA-1 algorithm is correct.)



Firstly, i would hardcode 2 string and extract out 9 characters out of 40 characters since i require only the first 36 bits of SHA-1. The function below basically return a true if a collision is found and false if a collision is not found



public static boolean findCollision(int x1, int x2) 

String message1 = "I tried " + x1 + " iteration to find a collision";
String message2 = "I tried " + x2 + " iteration to find a collision";


//hashing my string and extracting 9 characters
String message_hash1 = sha1(message1);
String message_hash2 = sha1(message2);

String modified_hash1 = message_hash1.substring(0, 9);
String modified_hash2 = message_hash2.substring(0, 9);

if (modified_hash1.equals(modified_hash2))
return true;
else
return false;



Lastly, i will have a main function that will random Integers up to MAX_VALUE in a infinite loop and will break out if and only if a hash is found.



public static void main(String[] args) 

Random random = new Random();
int x1 = 0;
int x2 = 0;
int counter = 0;

while (true)

while(true)

x1 = random.nextInt(Integer.MAX_VALUE);
x2 = random.nextInt(Integer.MAX_VALUE);

if (x1 != x2)
break;


if (findCollision(x1, x2) == true)
break;

counter++;


System.out.println("nNumber of trials: " + counter);



If i tried taking only the first 24 bits of SHA-1, i could easily find a collision. However, i'm unable to find a collision for 36 bits instead despite running it for hours. Hence, I'm wondering what is other alternative way for me to find a collision with just 36 bits of SHA-1.










share|improve this question









New contributor



Astral Zhang is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.






$endgroup$











  • $begingroup$
    Out of curiosity, why are you trying to do this? Is it because you want to see if your modified algorithm is good?
    $endgroup$
    – IEatBagels
    3 hours ago










  • $begingroup$
    of course not. this modified algorithm is never good because by decreasing the hash length is never a good thing. It's an assignment for my class work and i'm ran out of ideas on how to achieve a hash collision. that's the main objective of this assignment.
    $endgroup$
    – Astral Zhang
    3 hours ago










  • $begingroup$
    You should mention this in your question, I believe it's important :)
    $endgroup$
    – IEatBagels
    3 hours ago










  • $begingroup$
    i have done so. thanks!
    $endgroup$
    – Astral Zhang
    3 hours ago

















5












$begingroup$


My objective is to find a hash collision of my modified hash function. Assuming my modified hash only outputs the first 36 bits of SHA-1. As we know, SHA-1 is a 160-bit hash value, hence, we only need to output 9 characters out of 40 characters for comparison.



How i begin my program is by hashing a string (I have a SHA-1 algorithm running and i will name it as sha1. I have also ensured that the output of the SHA-1 algorithm is correct.)



Firstly, i would hardcode 2 string and extract out 9 characters out of 40 characters since i require only the first 36 bits of SHA-1. The function below basically return a true if a collision is found and false if a collision is not found



public static boolean findCollision(int x1, int x2) 

String message1 = "I tried " + x1 + " iteration to find a collision";
String message2 = "I tried " + x2 + " iteration to find a collision";


//hashing my string and extracting 9 characters
String message_hash1 = sha1(message1);
String message_hash2 = sha1(message2);

String modified_hash1 = message_hash1.substring(0, 9);
String modified_hash2 = message_hash2.substring(0, 9);

if (modified_hash1.equals(modified_hash2))
return true;
else
return false;



Lastly, i will have a main function that will random Integers up to MAX_VALUE in a infinite loop and will break out if and only if a hash is found.



public static void main(String[] args) 

Random random = new Random();
int x1 = 0;
int x2 = 0;
int counter = 0;

while (true)

while(true)

x1 = random.nextInt(Integer.MAX_VALUE);
x2 = random.nextInt(Integer.MAX_VALUE);

if (x1 != x2)
break;


if (findCollision(x1, x2) == true)
break;

counter++;


System.out.println("nNumber of trials: " + counter);



If i tried taking only the first 24 bits of SHA-1, i could easily find a collision. However, i'm unable to find a collision for 36 bits instead despite running it for hours. Hence, I'm wondering what is other alternative way for me to find a collision with just 36 bits of SHA-1.










share|improve this question









New contributor



Astral Zhang is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.






$endgroup$











  • $begingroup$
    Out of curiosity, why are you trying to do this? Is it because you want to see if your modified algorithm is good?
    $endgroup$
    – IEatBagels
    3 hours ago










  • $begingroup$
    of course not. this modified algorithm is never good because by decreasing the hash length is never a good thing. It's an assignment for my class work and i'm ran out of ideas on how to achieve a hash collision. that's the main objective of this assignment.
    $endgroup$
    – Astral Zhang
    3 hours ago










  • $begingroup$
    You should mention this in your question, I believe it's important :)
    $endgroup$
    – IEatBagels
    3 hours ago










  • $begingroup$
    i have done so. thanks!
    $endgroup$
    – Astral Zhang
    3 hours ago













5












5








5





$begingroup$


My objective is to find a hash collision of my modified hash function. Assuming my modified hash only outputs the first 36 bits of SHA-1. As we know, SHA-1 is a 160-bit hash value, hence, we only need to output 9 characters out of 40 characters for comparison.



How i begin my program is by hashing a string (I have a SHA-1 algorithm running and i will name it as sha1. I have also ensured that the output of the SHA-1 algorithm is correct.)



Firstly, i would hardcode 2 string and extract out 9 characters out of 40 characters since i require only the first 36 bits of SHA-1. The function below basically return a true if a collision is found and false if a collision is not found



public static boolean findCollision(int x1, int x2) 

String message1 = "I tried " + x1 + " iteration to find a collision";
String message2 = "I tried " + x2 + " iteration to find a collision";


//hashing my string and extracting 9 characters
String message_hash1 = sha1(message1);
String message_hash2 = sha1(message2);

String modified_hash1 = message_hash1.substring(0, 9);
String modified_hash2 = message_hash2.substring(0, 9);

if (modified_hash1.equals(modified_hash2))
return true;
else
return false;



Lastly, i will have a main function that will random Integers up to MAX_VALUE in a infinite loop and will break out if and only if a hash is found.



public static void main(String[] args) 

Random random = new Random();
int x1 = 0;
int x2 = 0;
int counter = 0;

while (true)

while(true)

x1 = random.nextInt(Integer.MAX_VALUE);
x2 = random.nextInt(Integer.MAX_VALUE);

if (x1 != x2)
break;


if (findCollision(x1, x2) == true)
break;

counter++;


System.out.println("nNumber of trials: " + counter);



If i tried taking only the first 24 bits of SHA-1, i could easily find a collision. However, i'm unable to find a collision for 36 bits instead despite running it for hours. Hence, I'm wondering what is other alternative way for me to find a collision with just 36 bits of SHA-1.










share|improve this question









New contributor



Astral Zhang is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.






$endgroup$




My objective is to find a hash collision of my modified hash function. Assuming my modified hash only outputs the first 36 bits of SHA-1. As we know, SHA-1 is a 160-bit hash value, hence, we only need to output 9 characters out of 40 characters for comparison.



How i begin my program is by hashing a string (I have a SHA-1 algorithm running and i will name it as sha1. I have also ensured that the output of the SHA-1 algorithm is correct.)



Firstly, i would hardcode 2 string and extract out 9 characters out of 40 characters since i require only the first 36 bits of SHA-1. The function below basically return a true if a collision is found and false if a collision is not found



public static boolean findCollision(int x1, int x2) 

String message1 = "I tried " + x1 + " iteration to find a collision";
String message2 = "I tried " + x2 + " iteration to find a collision";


//hashing my string and extracting 9 characters
String message_hash1 = sha1(message1);
String message_hash2 = sha1(message2);

String modified_hash1 = message_hash1.substring(0, 9);
String modified_hash2 = message_hash2.substring(0, 9);

if (modified_hash1.equals(modified_hash2))
return true;
else
return false;



Lastly, i will have a main function that will random Integers up to MAX_VALUE in a infinite loop and will break out if and only if a hash is found.



public static void main(String[] args) 

Random random = new Random();
int x1 = 0;
int x2 = 0;
int counter = 0;

while (true)

while(true)

x1 = random.nextInt(Integer.MAX_VALUE);
x2 = random.nextInt(Integer.MAX_VALUE);

if (x1 != x2)
break;


if (findCollision(x1, x2) == true)
break;

counter++;


System.out.println("nNumber of trials: " + counter);



If i tried taking only the first 24 bits of SHA-1, i could easily find a collision. However, i'm unable to find a collision for 36 bits instead despite running it for hours. Hence, I'm wondering what is other alternative way for me to find a collision with just 36 bits of SHA-1.







java time-limit-exceeded cryptography hashcode






share|improve this question









New contributor



Astral Zhang is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.










share|improve this question









New contributor



Astral Zhang is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.








share|improve this question




share|improve this question








edited 2 hours ago









200_success

132k20160427




132k20160427






New contributor



Astral Zhang is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.








asked 4 hours ago









Astral ZhangAstral Zhang

262




262




New contributor



Astral Zhang is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.




New contributor




Astral Zhang is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.













  • $begingroup$
    Out of curiosity, why are you trying to do this? Is it because you want to see if your modified algorithm is good?
    $endgroup$
    – IEatBagels
    3 hours ago










  • $begingroup$
    of course not. this modified algorithm is never good because by decreasing the hash length is never a good thing. It's an assignment for my class work and i'm ran out of ideas on how to achieve a hash collision. that's the main objective of this assignment.
    $endgroup$
    – Astral Zhang
    3 hours ago










  • $begingroup$
    You should mention this in your question, I believe it's important :)
    $endgroup$
    – IEatBagels
    3 hours ago










  • $begingroup$
    i have done so. thanks!
    $endgroup$
    – Astral Zhang
    3 hours ago
















  • $begingroup$
    Out of curiosity, why are you trying to do this? Is it because you want to see if your modified algorithm is good?
    $endgroup$
    – IEatBagels
    3 hours ago










  • $begingroup$
    of course not. this modified algorithm is never good because by decreasing the hash length is never a good thing. It's an assignment for my class work and i'm ran out of ideas on how to achieve a hash collision. that's the main objective of this assignment.
    $endgroup$
    – Astral Zhang
    3 hours ago










  • $begingroup$
    You should mention this in your question, I believe it's important :)
    $endgroup$
    – IEatBagels
    3 hours ago










  • $begingroup$
    i have done so. thanks!
    $endgroup$
    – Astral Zhang
    3 hours ago















$begingroup$
Out of curiosity, why are you trying to do this? Is it because you want to see if your modified algorithm is good?
$endgroup$
– IEatBagels
3 hours ago




$begingroup$
Out of curiosity, why are you trying to do this? Is it because you want to see if your modified algorithm is good?
$endgroup$
– IEatBagels
3 hours ago












$begingroup$
of course not. this modified algorithm is never good because by decreasing the hash length is never a good thing. It's an assignment for my class work and i'm ran out of ideas on how to achieve a hash collision. that's the main objective of this assignment.
$endgroup$
– Astral Zhang
3 hours ago




$begingroup$
of course not. this modified algorithm is never good because by decreasing the hash length is never a good thing. It's an assignment for my class work and i'm ran out of ideas on how to achieve a hash collision. that's the main objective of this assignment.
$endgroup$
– Astral Zhang
3 hours ago












$begingroup$
You should mention this in your question, I believe it's important :)
$endgroup$
– IEatBagels
3 hours ago




$begingroup$
You should mention this in your question, I believe it's important :)
$endgroup$
– IEatBagels
3 hours ago












$begingroup$
i have done so. thanks!
$endgroup$
– Astral Zhang
3 hours ago




$begingroup$
i have done so. thanks!
$endgroup$
– Astral Zhang
3 hours ago










2 Answers
2






active

oldest

votes


















2












$begingroup$

With 24 bits, there are approximately 16.8 million possible hashes, so on average, you'd have to try 8.4 million pairs until you find a collision. With 36 bits, the numbers have to be multiplied by 4096, which yields 68 respectively 34 billion. You can cut this in half by not computing two hashes during each iteration, but instead computing one before the loop and keeping it constant.



However, that is probably still more time than you want to spend. One way to reduce that time is to utilize the Birthday Paradox (https://en.m.wikipedia.org/wiki/Birthday_problem). By computing a list of hashes and comparing each one with each other your chances of finding a collision are much higher. I won't try to type a complete algorithm on this phone screen, though :-)






share|improve this answer








New contributor



Hans-Martin Mosner is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.





$endgroup$




















    1












    $begingroup$

    I realized that while typing this, you should look into the Birthday Attack, which will probably much more elaborated than my answer, lol.



    Considering your have 36 bits of data, it means you have a total number of possibilities of $2^36 = 68719476736$. Based on the Pigeonhole principle, if you compared the hashed of 68719476736 different strings, you'd get a collision.



    So that's easy! All you need to do is run this (pseudocode):



    hashset = (new hashset that uses your hashing algorithm)
    for(i = 0; i < 68719476736 + 1; i++)

    if hashset.contains(string(i)) break;

    hashset.add(string(i));


    print("This took " + i + " tried!")


    With this, you're guaranteed to get a collision. But there's a problem. If every iteration took 100 milliseconds, you'd need about 217 years to get a solution. Tell your teacher your great-great-grandchildren will get back to you with a solution. You could also buy 2000 computers to run this in ~40 days.



    My algorithm is pretty much guaranteed to finish (assuming none of the computers crash in any way), yours isn't. This is more of a lesson on "how to test things", which is valuable for developers. When you want to test something, try not to be random. The thing with your algorithm is that maybe you wouldn't ever get collisions. I understand that you kind of need randomness if you want this to finish running this one day.



    So the question is, what could we do to make this better?



    We could check about how many different strings we'd need to get a good enough chance there's a collision. It's kind of the Birthday Problem which gives us an easy way to calculate what are the chances n different persons have the same birthday in a group of m persons.



    We can use the formula under Approximation of number of people and adapt it to our problem to understand that if we had 68719476736 + 1 strings in our possession (for example, every number between 1 and 68719476736 + 1), you'd need to pick 308652 of those to have about 50% chances of having a collision.



    What you could try is randomly take 308652 numbers between 1 and 68719476736 + 1 and hash them to find a collision. Repeat this as long as you don't have a collision.



    The pseudocode would look like this :



    generator = RandomBigIntGenerator()
    numbers = []

    for(i = 0; i < 308652; i++)
    numbers.add(generator.random(1,68719476736+1))


    hashset = (With your hashing function)
    for n in numbers
    if hashset.contains(n)
    print("yay done");
    break;

    hashset.add(m);



    All in all, you can hope to have collisions, but you need computing power.



    In a code review point of view :



    • You need to keep track of the strings you already tried, they give you valuable information that can actually help you find collision.

    • Don't be too random when you create your strings.

    • Understand that 68719476736 is a biiggg number, but still so much smaller than the 1461501637330902918203684832716283019655932542976 possible values the SHA1 have :)





    share|improve this answer









    $endgroup$












    • $begingroup$
      I will try this birthday paradox method right now. Hopefully it will work.
      $endgroup$
      – Astral Zhang
      2 hours ago










    • $begingroup$
      i now have a problem while implementing this solution. If my final output of my program has to display the two original string and it's hash values, then using hashset doesn't work because hashset doesn't allow us to store 2 datatype. From your recommended algorithm, it mentioned to store the hash value directly and compare. If it's done that way, then there's no way i can identify the number used to generate that particular collision hash.
      $endgroup$
      – Astral Zhang
      1 hour ago











    • $begingroup$
      how do you obtain this value 308652?
      $endgroup$
      – Astral Zhang
      1 hour ago











    Your Answer






    StackExchange.ifUsing("editor", function ()
    StackExchange.using("externalEditor", function ()
    StackExchange.using("snippets", function ()
    StackExchange.snippets.init();
    );
    );
    , "code-snippets");

    StackExchange.ready(function()
    var channelOptions =
    tags: "".split(" "),
    id: "196"
    ;
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function()
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled)
    StackExchange.using("snippets", function()
    createEditor();
    );

    else
    createEditor();

    );

    function createEditor()
    StackExchange.prepareEditor(
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: false,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    imageUploader:
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    ,
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    );



    );






    Astral Zhang is a new contributor. Be nice, and check out our Code of Conduct.









    draft saved

    draft discarded


















    StackExchange.ready(
    function ()
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f220756%2ffinding-collisions-of-the-first-few-bits-of-a-sha-1-hash%23new-answer', 'question_page');

    );

    Post as a guest















    Required, but never shown

























    2 Answers
    2






    active

    oldest

    votes








    2 Answers
    2






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    2












    $begingroup$

    With 24 bits, there are approximately 16.8 million possible hashes, so on average, you'd have to try 8.4 million pairs until you find a collision. With 36 bits, the numbers have to be multiplied by 4096, which yields 68 respectively 34 billion. You can cut this in half by not computing two hashes during each iteration, but instead computing one before the loop and keeping it constant.



    However, that is probably still more time than you want to spend. One way to reduce that time is to utilize the Birthday Paradox (https://en.m.wikipedia.org/wiki/Birthday_problem). By computing a list of hashes and comparing each one with each other your chances of finding a collision are much higher. I won't try to type a complete algorithm on this phone screen, though :-)






    share|improve this answer








    New contributor



    Hans-Martin Mosner is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
    Check out our Code of Conduct.





    $endgroup$

















      2












      $begingroup$

      With 24 bits, there are approximately 16.8 million possible hashes, so on average, you'd have to try 8.4 million pairs until you find a collision. With 36 bits, the numbers have to be multiplied by 4096, which yields 68 respectively 34 billion. You can cut this in half by not computing two hashes during each iteration, but instead computing one before the loop and keeping it constant.



      However, that is probably still more time than you want to spend. One way to reduce that time is to utilize the Birthday Paradox (https://en.m.wikipedia.org/wiki/Birthday_problem). By computing a list of hashes and comparing each one with each other your chances of finding a collision are much higher. I won't try to type a complete algorithm on this phone screen, though :-)






      share|improve this answer








      New contributor



      Hans-Martin Mosner is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.





      $endgroup$















        2












        2








        2





        $begingroup$

        With 24 bits, there are approximately 16.8 million possible hashes, so on average, you'd have to try 8.4 million pairs until you find a collision. With 36 bits, the numbers have to be multiplied by 4096, which yields 68 respectively 34 billion. You can cut this in half by not computing two hashes during each iteration, but instead computing one before the loop and keeping it constant.



        However, that is probably still more time than you want to spend. One way to reduce that time is to utilize the Birthday Paradox (https://en.m.wikipedia.org/wiki/Birthday_problem). By computing a list of hashes and comparing each one with each other your chances of finding a collision are much higher. I won't try to type a complete algorithm on this phone screen, though :-)






        share|improve this answer








        New contributor



        Hans-Martin Mosner is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
        Check out our Code of Conduct.





        $endgroup$



        With 24 bits, there are approximately 16.8 million possible hashes, so on average, you'd have to try 8.4 million pairs until you find a collision. With 36 bits, the numbers have to be multiplied by 4096, which yields 68 respectively 34 billion. You can cut this in half by not computing two hashes during each iteration, but instead computing one before the loop and keeping it constant.



        However, that is probably still more time than you want to spend. One way to reduce that time is to utilize the Birthday Paradox (https://en.m.wikipedia.org/wiki/Birthday_problem). By computing a list of hashes and comparing each one with each other your chances of finding a collision are much higher. I won't try to type a complete algorithm on this phone screen, though :-)







        share|improve this answer








        New contributor



        Hans-Martin Mosner is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
        Check out our Code of Conduct.








        share|improve this answer



        share|improve this answer






        New contributor



        Hans-Martin Mosner is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
        Check out our Code of Conduct.








        answered 2 hours ago









        Hans-Martin MosnerHans-Martin Mosner

        1212




        1212




        New contributor



        Hans-Martin Mosner is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
        Check out our Code of Conduct.




        New contributor




        Hans-Martin Mosner is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
        Check out our Code of Conduct.

























            1












            $begingroup$

            I realized that while typing this, you should look into the Birthday Attack, which will probably much more elaborated than my answer, lol.



            Considering your have 36 bits of data, it means you have a total number of possibilities of $2^36 = 68719476736$. Based on the Pigeonhole principle, if you compared the hashed of 68719476736 different strings, you'd get a collision.



            So that's easy! All you need to do is run this (pseudocode):



            hashset = (new hashset that uses your hashing algorithm)
            for(i = 0; i < 68719476736 + 1; i++)

            if hashset.contains(string(i)) break;

            hashset.add(string(i));


            print("This took " + i + " tried!")


            With this, you're guaranteed to get a collision. But there's a problem. If every iteration took 100 milliseconds, you'd need about 217 years to get a solution. Tell your teacher your great-great-grandchildren will get back to you with a solution. You could also buy 2000 computers to run this in ~40 days.



            My algorithm is pretty much guaranteed to finish (assuming none of the computers crash in any way), yours isn't. This is more of a lesson on "how to test things", which is valuable for developers. When you want to test something, try not to be random. The thing with your algorithm is that maybe you wouldn't ever get collisions. I understand that you kind of need randomness if you want this to finish running this one day.



            So the question is, what could we do to make this better?



            We could check about how many different strings we'd need to get a good enough chance there's a collision. It's kind of the Birthday Problem which gives us an easy way to calculate what are the chances n different persons have the same birthday in a group of m persons.



            We can use the formula under Approximation of number of people and adapt it to our problem to understand that if we had 68719476736 + 1 strings in our possession (for example, every number between 1 and 68719476736 + 1), you'd need to pick 308652 of those to have about 50% chances of having a collision.



            What you could try is randomly take 308652 numbers between 1 and 68719476736 + 1 and hash them to find a collision. Repeat this as long as you don't have a collision.



            The pseudocode would look like this :



            generator = RandomBigIntGenerator()
            numbers = []

            for(i = 0; i < 308652; i++)
            numbers.add(generator.random(1,68719476736+1))


            hashset = (With your hashing function)
            for n in numbers
            if hashset.contains(n)
            print("yay done");
            break;

            hashset.add(m);



            All in all, you can hope to have collisions, but you need computing power.



            In a code review point of view :



            • You need to keep track of the strings you already tried, they give you valuable information that can actually help you find collision.

            • Don't be too random when you create your strings.

            • Understand that 68719476736 is a biiggg number, but still so much smaller than the 1461501637330902918203684832716283019655932542976 possible values the SHA1 have :)





            share|improve this answer









            $endgroup$












            • $begingroup$
              I will try this birthday paradox method right now. Hopefully it will work.
              $endgroup$
              – Astral Zhang
              2 hours ago










            • $begingroup$
              i now have a problem while implementing this solution. If my final output of my program has to display the two original string and it's hash values, then using hashset doesn't work because hashset doesn't allow us to store 2 datatype. From your recommended algorithm, it mentioned to store the hash value directly and compare. If it's done that way, then there's no way i can identify the number used to generate that particular collision hash.
              $endgroup$
              – Astral Zhang
              1 hour ago











            • $begingroup$
              how do you obtain this value 308652?
              $endgroup$
              – Astral Zhang
              1 hour ago















            1












            $begingroup$

            I realized that while typing this, you should look into the Birthday Attack, which will probably much more elaborated than my answer, lol.



            Considering your have 36 bits of data, it means you have a total number of possibilities of $2^36 = 68719476736$. Based on the Pigeonhole principle, if you compared the hashed of 68719476736 different strings, you'd get a collision.



            So that's easy! All you need to do is run this (pseudocode):



            hashset = (new hashset that uses your hashing algorithm)
            for(i = 0; i < 68719476736 + 1; i++)

            if hashset.contains(string(i)) break;

            hashset.add(string(i));


            print("This took " + i + " tried!")


            With this, you're guaranteed to get a collision. But there's a problem. If every iteration took 100 milliseconds, you'd need about 217 years to get a solution. Tell your teacher your great-great-grandchildren will get back to you with a solution. You could also buy 2000 computers to run this in ~40 days.



            My algorithm is pretty much guaranteed to finish (assuming none of the computers crash in any way), yours isn't. This is more of a lesson on "how to test things", which is valuable for developers. When you want to test something, try not to be random. The thing with your algorithm is that maybe you wouldn't ever get collisions. I understand that you kind of need randomness if you want this to finish running this one day.



            So the question is, what could we do to make this better?



            We could check about how many different strings we'd need to get a good enough chance there's a collision. It's kind of the Birthday Problem which gives us an easy way to calculate what are the chances n different persons have the same birthday in a group of m persons.



            We can use the formula under Approximation of number of people and adapt it to our problem to understand that if we had 68719476736 + 1 strings in our possession (for example, every number between 1 and 68719476736 + 1), you'd need to pick 308652 of those to have about 50% chances of having a collision.



            What you could try is randomly take 308652 numbers between 1 and 68719476736 + 1 and hash them to find a collision. Repeat this as long as you don't have a collision.



            The pseudocode would look like this :



            generator = RandomBigIntGenerator()
            numbers = []

            for(i = 0; i < 308652; i++)
            numbers.add(generator.random(1,68719476736+1))


            hashset = (With your hashing function)
            for n in numbers
            if hashset.contains(n)
            print("yay done");
            break;

            hashset.add(m);



            All in all, you can hope to have collisions, but you need computing power.



            In a code review point of view :



            • You need to keep track of the strings you already tried, they give you valuable information that can actually help you find collision.

            • Don't be too random when you create your strings.

            • Understand that 68719476736 is a biiggg number, but still so much smaller than the 1461501637330902918203684832716283019655932542976 possible values the SHA1 have :)





            share|improve this answer









            $endgroup$












            • $begingroup$
              I will try this birthday paradox method right now. Hopefully it will work.
              $endgroup$
              – Astral Zhang
              2 hours ago










            • $begingroup$
              i now have a problem while implementing this solution. If my final output of my program has to display the two original string and it's hash values, then using hashset doesn't work because hashset doesn't allow us to store 2 datatype. From your recommended algorithm, it mentioned to store the hash value directly and compare. If it's done that way, then there's no way i can identify the number used to generate that particular collision hash.
              $endgroup$
              – Astral Zhang
              1 hour ago











            • $begingroup$
              how do you obtain this value 308652?
              $endgroup$
              – Astral Zhang
              1 hour ago













            1












            1








            1





            $begingroup$

            I realized that while typing this, you should look into the Birthday Attack, which will probably much more elaborated than my answer, lol.



            Considering your have 36 bits of data, it means you have a total number of possibilities of $2^36 = 68719476736$. Based on the Pigeonhole principle, if you compared the hashed of 68719476736 different strings, you'd get a collision.



            So that's easy! All you need to do is run this (pseudocode):



            hashset = (new hashset that uses your hashing algorithm)
            for(i = 0; i < 68719476736 + 1; i++)

            if hashset.contains(string(i)) break;

            hashset.add(string(i));


            print("This took " + i + " tried!")


            With this, you're guaranteed to get a collision. But there's a problem. If every iteration took 100 milliseconds, you'd need about 217 years to get a solution. Tell your teacher your great-great-grandchildren will get back to you with a solution. You could also buy 2000 computers to run this in ~40 days.



            My algorithm is pretty much guaranteed to finish (assuming none of the computers crash in any way), yours isn't. This is more of a lesson on "how to test things", which is valuable for developers. When you want to test something, try not to be random. The thing with your algorithm is that maybe you wouldn't ever get collisions. I understand that you kind of need randomness if you want this to finish running this one day.



            So the question is, what could we do to make this better?



            We could check about how many different strings we'd need to get a good enough chance there's a collision. It's kind of the Birthday Problem which gives us an easy way to calculate what are the chances n different persons have the same birthday in a group of m persons.



            We can use the formula under Approximation of number of people and adapt it to our problem to understand that if we had 68719476736 + 1 strings in our possession (for example, every number between 1 and 68719476736 + 1), you'd need to pick 308652 of those to have about 50% chances of having a collision.



            What you could try is randomly take 308652 numbers between 1 and 68719476736 + 1 and hash them to find a collision. Repeat this as long as you don't have a collision.



            The pseudocode would look like this :



            generator = RandomBigIntGenerator()
            numbers = []

            for(i = 0; i < 308652; i++)
            numbers.add(generator.random(1,68719476736+1))


            hashset = (With your hashing function)
            for n in numbers
            if hashset.contains(n)
            print("yay done");
            break;

            hashset.add(m);



            All in all, you can hope to have collisions, but you need computing power.



            In a code review point of view :



            • You need to keep track of the strings you already tried, they give you valuable information that can actually help you find collision.

            • Don't be too random when you create your strings.

            • Understand that 68719476736 is a biiggg number, but still so much smaller than the 1461501637330902918203684832716283019655932542976 possible values the SHA1 have :)





            share|improve this answer









            $endgroup$



            I realized that while typing this, you should look into the Birthday Attack, which will probably much more elaborated than my answer, lol.



            Considering your have 36 bits of data, it means you have a total number of possibilities of $2^36 = 68719476736$. Based on the Pigeonhole principle, if you compared the hashed of 68719476736 different strings, you'd get a collision.



            So that's easy! All you need to do is run this (pseudocode):



            hashset = (new hashset that uses your hashing algorithm)
            for(i = 0; i < 68719476736 + 1; i++)

            if hashset.contains(string(i)) break;

            hashset.add(string(i));


            print("This took " + i + " tried!")


            With this, you're guaranteed to get a collision. But there's a problem. If every iteration took 100 milliseconds, you'd need about 217 years to get a solution. Tell your teacher your great-great-grandchildren will get back to you with a solution. You could also buy 2000 computers to run this in ~40 days.



            My algorithm is pretty much guaranteed to finish (assuming none of the computers crash in any way), yours isn't. This is more of a lesson on "how to test things", which is valuable for developers. When you want to test something, try not to be random. The thing with your algorithm is that maybe you wouldn't ever get collisions. I understand that you kind of need randomness if you want this to finish running this one day.



            So the question is, what could we do to make this better?



            We could check about how many different strings we'd need to get a good enough chance there's a collision. It's kind of the Birthday Problem which gives us an easy way to calculate what are the chances n different persons have the same birthday in a group of m persons.



            We can use the formula under Approximation of number of people and adapt it to our problem to understand that if we had 68719476736 + 1 strings in our possession (for example, every number between 1 and 68719476736 + 1), you'd need to pick 308652 of those to have about 50% chances of having a collision.



            What you could try is randomly take 308652 numbers between 1 and 68719476736 + 1 and hash them to find a collision. Repeat this as long as you don't have a collision.



            The pseudocode would look like this :



            generator = RandomBigIntGenerator()
            numbers = []

            for(i = 0; i < 308652; i++)
            numbers.add(generator.random(1,68719476736+1))


            hashset = (With your hashing function)
            for n in numbers
            if hashset.contains(n)
            print("yay done");
            break;

            hashset.add(m);



            All in all, you can hope to have collisions, but you need computing power.



            In a code review point of view :



            • You need to keep track of the strings you already tried, they give you valuable information that can actually help you find collision.

            • Don't be too random when you create your strings.

            • Understand that 68719476736 is a biiggg number, but still so much smaller than the 1461501637330902918203684832716283019655932542976 possible values the SHA1 have :)






            share|improve this answer












            share|improve this answer



            share|improve this answer










            answered 2 hours ago









            IEatBagelsIEatBagels

            9,08223579




            9,08223579











            • $begingroup$
              I will try this birthday paradox method right now. Hopefully it will work.
              $endgroup$
              – Astral Zhang
              2 hours ago










            • $begingroup$
              i now have a problem while implementing this solution. If my final output of my program has to display the two original string and it's hash values, then using hashset doesn't work because hashset doesn't allow us to store 2 datatype. From your recommended algorithm, it mentioned to store the hash value directly and compare. If it's done that way, then there's no way i can identify the number used to generate that particular collision hash.
              $endgroup$
              – Astral Zhang
              1 hour ago











            • $begingroup$
              how do you obtain this value 308652?
              $endgroup$
              – Astral Zhang
              1 hour ago
















            • $begingroup$
              I will try this birthday paradox method right now. Hopefully it will work.
              $endgroup$
              – Astral Zhang
              2 hours ago










            • $begingroup$
              i now have a problem while implementing this solution. If my final output of my program has to display the two original string and it's hash values, then using hashset doesn't work because hashset doesn't allow us to store 2 datatype. From your recommended algorithm, it mentioned to store the hash value directly and compare. If it's done that way, then there's no way i can identify the number used to generate that particular collision hash.
              $endgroup$
              – Astral Zhang
              1 hour ago











            • $begingroup$
              how do you obtain this value 308652?
              $endgroup$
              – Astral Zhang
              1 hour ago















            $begingroup$
            I will try this birthday paradox method right now. Hopefully it will work.
            $endgroup$
            – Astral Zhang
            2 hours ago




            $begingroup$
            I will try this birthday paradox method right now. Hopefully it will work.
            $endgroup$
            – Astral Zhang
            2 hours ago












            $begingroup$
            i now have a problem while implementing this solution. If my final output of my program has to display the two original string and it's hash values, then using hashset doesn't work because hashset doesn't allow us to store 2 datatype. From your recommended algorithm, it mentioned to store the hash value directly and compare. If it's done that way, then there's no way i can identify the number used to generate that particular collision hash.
            $endgroup$
            – Astral Zhang
            1 hour ago





            $begingroup$
            i now have a problem while implementing this solution. If my final output of my program has to display the two original string and it's hash values, then using hashset doesn't work because hashset doesn't allow us to store 2 datatype. From your recommended algorithm, it mentioned to store the hash value directly and compare. If it's done that way, then there's no way i can identify the number used to generate that particular collision hash.
            $endgroup$
            – Astral Zhang
            1 hour ago













            $begingroup$
            how do you obtain this value 308652?
            $endgroup$
            – Astral Zhang
            1 hour ago




            $begingroup$
            how do you obtain this value 308652?
            $endgroup$
            – Astral Zhang
            1 hour ago










            Astral Zhang is a new contributor. Be nice, and check out our Code of Conduct.









            draft saved

            draft discarded


















            Astral Zhang is a new contributor. Be nice, and check out our Code of Conduct.












            Astral Zhang is a new contributor. Be nice, and check out our Code of Conduct.











            Astral Zhang is a new contributor. Be nice, and check out our Code of Conduct.














            Thanks for contributing an answer to Code Review Stack Exchange!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid


            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.

            Use MathJax to format equations. MathJax reference.


            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f220756%2ffinding-collisions-of-the-first-few-bits-of-a-sha-1-hash%23new-answer', 'question_page');

            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            Canceling a color specificationRandomly assigning color to Graphics3D objects?Default color for Filling in Mathematica 9Coloring specific elements of sets with a prime modified order in an array plotHow to pick a color differing significantly from the colors already in a given color list?Detection of the text colorColor numbers based on their valueCan color schemes for use with ColorData include opacity specification?My dynamic color schemes

            Invision Community Contents History See also References External links Navigation menuProprietaryinvisioncommunity.comIPS Community ForumsIPS Community Forumsthis blog entry"License Changes, IP.Board 3.4, and the Future""Interview -- Matt Mecham of Ibforums""CEO Invision Power Board, Matt Mecham Is a Liar, Thief!"IPB License Explanation 1.3, 1.3.1, 2.0, and 2.1ArchivedSecurity Fixes, Updates And Enhancements For IPB 1.3.1Archived"New Demo Accounts - Invision Power Services"the original"New Default Skin"the original"Invision Power Board 3.0.0 and Applications Released"the original"Archived copy"the original"Perpetual licenses being done away with""Release Notes - Invision Power Services""Introducing: IPS Community Suite 4!"Invision Community Release Notes

            François Viète Contents Biography Work and thought Bibliography See also Notes Further reading External links Navigation menup. 21Google Bookspp. 75–77Google BooksDe thou (from University of Saint Andrews)ArchivedGoogle BooksGoogle BooksGoogle BooksGoogle booksGoogle Bookscc-parthenay.frL'histoire universelle (fr)Universal History (en)ArchivedAdsabs.harvard.eduPagesperso-orange.frArchive.orgChikara Sasaki. Descartes' mathematical thought p.259Google BooksGoogle BooksGoogle Bookspp. 152 and onwardGoogle BooksGoogle BooksScribd.comGoogle Books1257-7979Google BooksGoogle BooksGoogle BooksGoogle BooksGoogle BooksGoogle BooksGallica.bnf.frGoogle BooksGoogle Books"François Viète"Francois Viète: Father of Modern Algebraic NotationThe Lawyer and the GamblerAbout TarporleySite de Jean-Paul GuichardL'algèbre nouvelle"About the Harmonicon"cb120511976(data)1188044800000 0001 0913 5903n82164680ola2013766880073431702w6vt1sb70287374827140948071409480