Machine Learning Golf: MultiplicationBuild an adding machine using NAND logic gatesBuild a multiplying machine using NAND logic gatesBuild a minifloat adding machine using NAND logic gatesHow to golf string output in ChefMeta-atomic code golfCollatz sequence on a two counter machineZipper multiplicationRegular expression parserDirichlet ConvolutionCan a neural network recognize primes?

Milky way is orbiting around?

Does the Defensive Duelist feat stack with the Warforged integrated protection?

Does Evolution Sage proliferate Blast Zone when played?

Should I cheat if the majority does it?

What is the difference between a historical drama and a period drama?

How to travel between two stationary worlds in the least amount of time? (time dilation)

Do intermediate subdomains need to exist?

Was Wolfgang Unzicker the last Amateur GM?

Explain how 'Sharing the burden' puzzle from Professor Layton and the Miracle Mask should be solved

How do both sides know the MTU

Olive oil in Japanese cooking

Contributing to a candidate as a Foreign National US Resident?

Why would a propellor have blades of different lengths?

Why do Klingons use cloaking devices?

How can I effectively map a multi-level dungeon?

What can a novel do that film and TV cannot?

What is the addition in the re-released version of Avengers: Endgame?

Can 4 Joy cons connect to the same Switch?

Data normalization before or after train-test split?

How to deal with administrative duties killing the research spirit?

Why do we need a bootloader separate than our application program in MCU's?

Has chattel slavery ever been used as a criminal punishment in the USA since the passage of the Thirteenth Amendment?

Magento 2 Professional Developer certification study guidelines

Do I need to be legally qualified to install a Hive smart thermostat?



Machine Learning Golf: Multiplication


Build an adding machine using NAND logic gatesBuild a multiplying machine using NAND logic gatesBuild a minifloat adding machine using NAND logic gatesHow to golf string output in ChefMeta-atomic code golfCollatz sequence on a two counter machineZipper multiplicationRegular expression parserDirichlet ConvolutionCan a neural network recognize primes?






.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty margin-bottom:0;








12












$begingroup$


I'd like to propose a different kind of golfing challenge to this community:



In the language and framework of your choice, design and train a neural network that, given $(x_1, x_2)$ calculates their product $x_1 cdot x_2$ for all integers $x_1, x_2$ between (and including) $-10$ and $10$.



Performance Goal



To qualify, your model may not deviate by more than $0.5$ from the correct result on any of those entries.



Rules



Your model



  • must be a 'traditional' neural network (a node's value is calculated as a weighted linear combination of some of the nodes in a previous layer followed by an activation function),

  • may only use the following standard activation functions:


    1. $mathrmlinear(x) = x$,


    2. $mathrmsoftmax(vecx)_i = frace^x_isum_j e^x_j$,


    3. $mathrmselu_alpha, beta(x) = begincases
      beta cdot x & text, if x > 0 \
      alpha cdot beta (e^x -1 ) & text, otherwise
      endcases$
      ,


    4. $mathrmsoftplus(x) = ln(e^x+1)$,


    5. $mathrmleaky-relu_alpha(x) = begincases
      x & text, if x < 0 \
      alpha cdot x & text, otherwise
      endcases$
      ,


    6. $tanh(x)$,


    7. $mathrmsigmoid(x) = frace^xe^x+1$,


    8. $mathrmhard-sigmoid(x) = begincases
      0 & text, if x < -2.5 \
      1 & text, if x > -2.5 \
      0.2 cdot x + 0.5 & text, otherwise
      endcases$
      ,

    9. $e^x$


  • must take $(x_1, x_2)$ either as a tupel/vector/list/... of integers or floats as its only input,

  • return the answer as an integer, float (or a suitable container, e.g. a vector or list, that contains this answer).

Your answer must include (or link to) all code necessary to check your results -- including the trained weights of your model.



Scoring



The neural network with the smallest number of weights (including bias weights) wins.



Enjoy!










share|improve this question









New contributor



Stefan Mesken is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.






$endgroup$







  • 3




    $begingroup$
    Welcome to the site! I think this challenge could benefit a good deal from a more robust definition of a neural network. There are a couple of things here 1) It would be very nice for you to state it in language that does not already imply knowledge of NNs 2) You really should list the activation functions in your post rather than link out to an external source (outside links can change or disappear).
    $endgroup$
    – Sriotchilism O'Zaic
    9 hours ago






  • 1




    $begingroup$
    Can we have constant offsets? e.g., for a node with a single input x, compute its value as ax+b? And if yes, does that count as 1 or 2 weights?
    $endgroup$
    – Grimy
    7 hours ago






  • 3




    $begingroup$
    Can we reuse weights/use convolutional layers? (I recommend removing the bonus, as it doesn't add anything to the challenge and just distracts from the main goal.) Are the weights supposed to be real or can they be complex?
    $endgroup$
    – flawr
    7 hours ago







  • 2




    $begingroup$
    Your wording implies nodes from layer 3 can't use inputs from layer 1. Does it cost a weight to have a layer 2 node simply doing f(x) = x to forward its input?
    $endgroup$
    – Grimy
    7 hours ago






  • 1




    $begingroup$
    @Grimy You're correct. My wording doesn't correctly reflect my intentions. I'll edit that. Skipping layers is allowed and if you do so with one of the listed activation functions, it doesn't cost you any weights.
    $endgroup$
    – Stefan Mesken
    7 hours ago

















12












$begingroup$


I'd like to propose a different kind of golfing challenge to this community:



In the language and framework of your choice, design and train a neural network that, given $(x_1, x_2)$ calculates their product $x_1 cdot x_2$ for all integers $x_1, x_2$ between (and including) $-10$ and $10$.



Performance Goal



To qualify, your model may not deviate by more than $0.5$ from the correct result on any of those entries.



Rules



Your model



  • must be a 'traditional' neural network (a node's value is calculated as a weighted linear combination of some of the nodes in a previous layer followed by an activation function),

  • may only use the following standard activation functions:


    1. $mathrmlinear(x) = x$,


    2. $mathrmsoftmax(vecx)_i = frace^x_isum_j e^x_j$,


    3. $mathrmselu_alpha, beta(x) = begincases
      beta cdot x & text, if x > 0 \
      alpha cdot beta (e^x -1 ) & text, otherwise
      endcases$
      ,


    4. $mathrmsoftplus(x) = ln(e^x+1)$,


    5. $mathrmleaky-relu_alpha(x) = begincases
      x & text, if x < 0 \
      alpha cdot x & text, otherwise
      endcases$
      ,


    6. $tanh(x)$,


    7. $mathrmsigmoid(x) = frace^xe^x+1$,


    8. $mathrmhard-sigmoid(x) = begincases
      0 & text, if x < -2.5 \
      1 & text, if x > -2.5 \
      0.2 cdot x + 0.5 & text, otherwise
      endcases$
      ,

    9. $e^x$


  • must take $(x_1, x_2)$ either as a tupel/vector/list/... of integers or floats as its only input,

  • return the answer as an integer, float (or a suitable container, e.g. a vector or list, that contains this answer).

Your answer must include (or link to) all code necessary to check your results -- including the trained weights of your model.



Scoring



The neural network with the smallest number of weights (including bias weights) wins.



Enjoy!










share|improve this question









New contributor



Stefan Mesken is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.






$endgroup$







  • 3




    $begingroup$
    Welcome to the site! I think this challenge could benefit a good deal from a more robust definition of a neural network. There are a couple of things here 1) It would be very nice for you to state it in language that does not already imply knowledge of NNs 2) You really should list the activation functions in your post rather than link out to an external source (outside links can change or disappear).
    $endgroup$
    – Sriotchilism O'Zaic
    9 hours ago






  • 1




    $begingroup$
    Can we have constant offsets? e.g., for a node with a single input x, compute its value as ax+b? And if yes, does that count as 1 or 2 weights?
    $endgroup$
    – Grimy
    7 hours ago






  • 3




    $begingroup$
    Can we reuse weights/use convolutional layers? (I recommend removing the bonus, as it doesn't add anything to the challenge and just distracts from the main goal.) Are the weights supposed to be real or can they be complex?
    $endgroup$
    – flawr
    7 hours ago







  • 2




    $begingroup$
    Your wording implies nodes from layer 3 can't use inputs from layer 1. Does it cost a weight to have a layer 2 node simply doing f(x) = x to forward its input?
    $endgroup$
    – Grimy
    7 hours ago






  • 1




    $begingroup$
    @Grimy You're correct. My wording doesn't correctly reflect my intentions. I'll edit that. Skipping layers is allowed and if you do so with one of the listed activation functions, it doesn't cost you any weights.
    $endgroup$
    – Stefan Mesken
    7 hours ago













12












12








12


1



$begingroup$


I'd like to propose a different kind of golfing challenge to this community:



In the language and framework of your choice, design and train a neural network that, given $(x_1, x_2)$ calculates their product $x_1 cdot x_2$ for all integers $x_1, x_2$ between (and including) $-10$ and $10$.



Performance Goal



To qualify, your model may not deviate by more than $0.5$ from the correct result on any of those entries.



Rules



Your model



  • must be a 'traditional' neural network (a node's value is calculated as a weighted linear combination of some of the nodes in a previous layer followed by an activation function),

  • may only use the following standard activation functions:


    1. $mathrmlinear(x) = x$,


    2. $mathrmsoftmax(vecx)_i = frace^x_isum_j e^x_j$,


    3. $mathrmselu_alpha, beta(x) = begincases
      beta cdot x & text, if x > 0 \
      alpha cdot beta (e^x -1 ) & text, otherwise
      endcases$
      ,


    4. $mathrmsoftplus(x) = ln(e^x+1)$,


    5. $mathrmleaky-relu_alpha(x) = begincases
      x & text, if x < 0 \
      alpha cdot x & text, otherwise
      endcases$
      ,


    6. $tanh(x)$,


    7. $mathrmsigmoid(x) = frace^xe^x+1$,


    8. $mathrmhard-sigmoid(x) = begincases
      0 & text, if x < -2.5 \
      1 & text, if x > -2.5 \
      0.2 cdot x + 0.5 & text, otherwise
      endcases$
      ,

    9. $e^x$


  • must take $(x_1, x_2)$ either as a tupel/vector/list/... of integers or floats as its only input,

  • return the answer as an integer, float (or a suitable container, e.g. a vector or list, that contains this answer).

Your answer must include (or link to) all code necessary to check your results -- including the trained weights of your model.



Scoring



The neural network with the smallest number of weights (including bias weights) wins.



Enjoy!










share|improve this question









New contributor



Stefan Mesken is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.






$endgroup$




I'd like to propose a different kind of golfing challenge to this community:



In the language and framework of your choice, design and train a neural network that, given $(x_1, x_2)$ calculates their product $x_1 cdot x_2$ for all integers $x_1, x_2$ between (and including) $-10$ and $10$.



Performance Goal



To qualify, your model may not deviate by more than $0.5$ from the correct result on any of those entries.



Rules



Your model



  • must be a 'traditional' neural network (a node's value is calculated as a weighted linear combination of some of the nodes in a previous layer followed by an activation function),

  • may only use the following standard activation functions:


    1. $mathrmlinear(x) = x$,


    2. $mathrmsoftmax(vecx)_i = frace^x_isum_j e^x_j$,


    3. $mathrmselu_alpha, beta(x) = begincases
      beta cdot x & text, if x > 0 \
      alpha cdot beta (e^x -1 ) & text, otherwise
      endcases$
      ,


    4. $mathrmsoftplus(x) = ln(e^x+1)$,


    5. $mathrmleaky-relu_alpha(x) = begincases
      x & text, if x < 0 \
      alpha cdot x & text, otherwise
      endcases$
      ,


    6. $tanh(x)$,


    7. $mathrmsigmoid(x) = frace^xe^x+1$,


    8. $mathrmhard-sigmoid(x) = begincases
      0 & text, if x < -2.5 \
      1 & text, if x > -2.5 \
      0.2 cdot x + 0.5 & text, otherwise
      endcases$
      ,

    9. $e^x$


  • must take $(x_1, x_2)$ either as a tupel/vector/list/... of integers or floats as its only input,

  • return the answer as an integer, float (or a suitable container, e.g. a vector or list, that contains this answer).

Your answer must include (or link to) all code necessary to check your results -- including the trained weights of your model.



Scoring



The neural network with the smallest number of weights (including bias weights) wins.



Enjoy!







arithmetic atomic-code-golf neural-networks






share|improve this question









New contributor



Stefan Mesken is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.










share|improve this question









New contributor



Stefan Mesken is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.








share|improve this question




share|improve this question








edited 4 hours ago









Expired Data

1,5454 silver badges21 bronze badges




1,5454 silver badges21 bronze badges






New contributor



Stefan Mesken is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.








asked 10 hours ago









Stefan MeskenStefan Mesken

1616 bronze badges




1616 bronze badges




New contributor



Stefan Mesken is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.




New contributor




Stefan Mesken is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.









  • 3




    $begingroup$
    Welcome to the site! I think this challenge could benefit a good deal from a more robust definition of a neural network. There are a couple of things here 1) It would be very nice for you to state it in language that does not already imply knowledge of NNs 2) You really should list the activation functions in your post rather than link out to an external source (outside links can change or disappear).
    $endgroup$
    – Sriotchilism O'Zaic
    9 hours ago






  • 1




    $begingroup$
    Can we have constant offsets? e.g., for a node with a single input x, compute its value as ax+b? And if yes, does that count as 1 or 2 weights?
    $endgroup$
    – Grimy
    7 hours ago






  • 3




    $begingroup$
    Can we reuse weights/use convolutional layers? (I recommend removing the bonus, as it doesn't add anything to the challenge and just distracts from the main goal.) Are the weights supposed to be real or can they be complex?
    $endgroup$
    – flawr
    7 hours ago







  • 2




    $begingroup$
    Your wording implies nodes from layer 3 can't use inputs from layer 1. Does it cost a weight to have a layer 2 node simply doing f(x) = x to forward its input?
    $endgroup$
    – Grimy
    7 hours ago






  • 1




    $begingroup$
    @Grimy You're correct. My wording doesn't correctly reflect my intentions. I'll edit that. Skipping layers is allowed and if you do so with one of the listed activation functions, it doesn't cost you any weights.
    $endgroup$
    – Stefan Mesken
    7 hours ago












  • 3




    $begingroup$
    Welcome to the site! I think this challenge could benefit a good deal from a more robust definition of a neural network. There are a couple of things here 1) It would be very nice for you to state it in language that does not already imply knowledge of NNs 2) You really should list the activation functions in your post rather than link out to an external source (outside links can change or disappear).
    $endgroup$
    – Sriotchilism O'Zaic
    9 hours ago






  • 1




    $begingroup$
    Can we have constant offsets? e.g., for a node with a single input x, compute its value as ax+b? And if yes, does that count as 1 or 2 weights?
    $endgroup$
    – Grimy
    7 hours ago






  • 3




    $begingroup$
    Can we reuse weights/use convolutional layers? (I recommend removing the bonus, as it doesn't add anything to the challenge and just distracts from the main goal.) Are the weights supposed to be real or can they be complex?
    $endgroup$
    – flawr
    7 hours ago







  • 2




    $begingroup$
    Your wording implies nodes from layer 3 can't use inputs from layer 1. Does it cost a weight to have a layer 2 node simply doing f(x) = x to forward its input?
    $endgroup$
    – Grimy
    7 hours ago






  • 1




    $begingroup$
    @Grimy You're correct. My wording doesn't correctly reflect my intentions. I'll edit that. Skipping layers is allowed and if you do so with one of the listed activation functions, it doesn't cost you any weights.
    $endgroup$
    – Stefan Mesken
    7 hours ago







3




3




$begingroup$
Welcome to the site! I think this challenge could benefit a good deal from a more robust definition of a neural network. There are a couple of things here 1) It would be very nice for you to state it in language that does not already imply knowledge of NNs 2) You really should list the activation functions in your post rather than link out to an external source (outside links can change or disappear).
$endgroup$
– Sriotchilism O'Zaic
9 hours ago




$begingroup$
Welcome to the site! I think this challenge could benefit a good deal from a more robust definition of a neural network. There are a couple of things here 1) It would be very nice for you to state it in language that does not already imply knowledge of NNs 2) You really should list the activation functions in your post rather than link out to an external source (outside links can change or disappear).
$endgroup$
– Sriotchilism O'Zaic
9 hours ago




1




1




$begingroup$
Can we have constant offsets? e.g., for a node with a single input x, compute its value as ax+b? And if yes, does that count as 1 or 2 weights?
$endgroup$
– Grimy
7 hours ago




$begingroup$
Can we have constant offsets? e.g., for a node with a single input x, compute its value as ax+b? And if yes, does that count as 1 or 2 weights?
$endgroup$
– Grimy
7 hours ago




3




3




$begingroup$
Can we reuse weights/use convolutional layers? (I recommend removing the bonus, as it doesn't add anything to the challenge and just distracts from the main goal.) Are the weights supposed to be real or can they be complex?
$endgroup$
– flawr
7 hours ago





$begingroup$
Can we reuse weights/use convolutional layers? (I recommend removing the bonus, as it doesn't add anything to the challenge and just distracts from the main goal.) Are the weights supposed to be real or can they be complex?
$endgroup$
– flawr
7 hours ago





2




2




$begingroup$
Your wording implies nodes from layer 3 can't use inputs from layer 1. Does it cost a weight to have a layer 2 node simply doing f(x) = x to forward its input?
$endgroup$
– Grimy
7 hours ago




$begingroup$
Your wording implies nodes from layer 3 can't use inputs from layer 1. Does it cost a weight to have a layer 2 node simply doing f(x) = x to forward its input?
$endgroup$
– Grimy
7 hours ago




1




1




$begingroup$
@Grimy You're correct. My wording doesn't correctly reflect my intentions. I'll edit that. Skipping layers is allowed and if you do so with one of the listed activation functions, it doesn't cost you any weights.
$endgroup$
– Stefan Mesken
7 hours ago




$begingroup$
@Grimy You're correct. My wording doesn't correctly reflect my intentions. I'll edit that. Skipping layers is allowed and if you do so with one of the listed activation functions, it doesn't cost you any weights.
$endgroup$
– Stefan Mesken
7 hours ago










2 Answers
2






active

oldest

votes


















7












$begingroup$


33 31 weights





# Activation functions
sub hard $_[0] < -2.5 ? 0 : $_[0] > 2.5 ? 1 : 0.2 * $_[0] + 0.5
sub linear $_[0]

# Layer 0
sub inputA() $a
sub inputB() $b

# Layer 1
sub a15() hard(5*inputA)

# Layer 2
sub a8() hard(-5*inputA + 75*a15 - 37.5)

# Layer 3
sub aa() linear(-5*inputA + 75*a15 - 40*a8)

# Layer 4
sub a4() hard(aa - 17.5)

# Layer 5
sub a2() hard(aa - 20*a4 - 7.5)

# Layer 6
sub a1() linear(0.2*aa - 4*a4 - 2*a2)

# Layer 7
sub b15() hard(0.25*inputB - 5*a15)
sub b8() hard(0.25*inputB - 5*a8)
sub b4() hard(0.25*inputB - 5*a4)
sub b2() hard(0.25*inputB - 5*a2)
sub b1() hard(0.25*inputB - 5*a1)

# Layer 8
sub output() linear(-300*b15 + 160*b8 + 80*b4 + 40*b2 + 20*b1 - 10*inputA)

# Test
for $a (-10..10)
for $b (-10..10)
die if abs($a * $b - output) >= 0.5;



print "All OK";


Try it online!



This does long multiplication in (sorta) binary, and thus returns the exact result. It should be possible to take advantage of the 0.5 error window to golf this some more, but I'm not sure how.



Layers 1 to 6 decompose the first input in 5 "bits". For golfing reasons, we don't use actual binary. The most significant "bit" has weight -15 instead of 16, and when the input is 0, all the "bits" are 0.5 (which still works out fine, since it preserves the identity inputA = -15*a15 + 8*a8 + 4*a4 + 2*a2 + 1*a1).






share|improve this answer











$endgroup$








  • 1




    $begingroup$
    I did expect someone to come up with a hard-coded, ANN-ified multiplication algorithm. But I didn't think it would be the first response. Well done! (I'm also eager to see whether you'll be able to pull something like this off with the MNIST dataset or some other, more relastic ML problem :D.)
    $endgroup$
    – Stefan Mesken
    6 hours ago


















4












$begingroup$


21 13 weights



This is based on the polarization identity of bilinear forms which in the one dimensional real case reduces to the polynomial identity:



$$ xcdot y = frac(x+y)^2 - (x-y)^24$$



So y1 just computes [x+y, x-y] using a linear transformation, and y3 is just the absolute value of y1 as a preprocessing step for the next one: Then the "hard" part is computing the squares which I will explain below, and after that just computing a difference and scaling which is again a linear operation.



To compute the squares I use an exponential series $s$ which should be accurate for all integers $0,1,2,ldots,20$ within around $0.5$. This series is of the form



$$ textapprox_square(x) = sum_i=0^2 w_i exp(0.0001 cdot i cdot x)$$



where I just optimized for the weights W2 ($=(w_i)_i$). This whole approximation comprises again just two linear transformations with an exponential activation sandwiched in between. This approach results in a maximal deviation of about 0.02.





function p = net(x)
relu = @(x)max(0,x);

% Linear: 4 weights
y1 = [1,1;1,-1]*x;

% Linear + ReLu + Sum: 1 weight (the -1):
y2 = -1 * y1;
y3 = relu(y1) + relu(y2);

% Linear + exp: 3 weights:
W1 = (0:2)*0.0001;
y4 = exp(y3 * W1);

% Linear: 3 weights:
W2 = [99700114.4299316;-199400468.100687;99700353.6313757];
y5 = y4 * W2;

% Linear : 2 weights:
p = [0.25,-0.25]*y5;

end


Try it online!






share|improve this answer











$endgroup$















    Your Answer






    StackExchange.ifUsing("editor", function ()
    StackExchange.using("externalEditor", function ()
    StackExchange.using("snippets", function ()
    StackExchange.snippets.init();
    );
    );
    , "code-snippets");

    StackExchange.ready(function()
    var channelOptions =
    tags: "".split(" "),
    id: "200"
    ;
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function()
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled)
    StackExchange.using("snippets", function()
    createEditor();
    );

    else
    createEditor();

    );

    function createEditor()
    StackExchange.prepareEditor(
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: false,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    imageUploader:
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    ,
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    );



    );






    Stefan Mesken is a new contributor. Be nice, and check out our Code of Conduct.









    draft saved

    draft discarded


















    StackExchange.ready(
    function ()
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodegolf.stackexchange.com%2fquestions%2f187562%2fmachine-learning-golf-multiplication%23new-answer', 'question_page');

    );

    Post as a guest















    Required, but never shown

























    2 Answers
    2






    active

    oldest

    votes








    2 Answers
    2






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    7












    $begingroup$


    33 31 weights





    # Activation functions
    sub hard $_[0] < -2.5 ? 0 : $_[0] > 2.5 ? 1 : 0.2 * $_[0] + 0.5
    sub linear $_[0]

    # Layer 0
    sub inputA() $a
    sub inputB() $b

    # Layer 1
    sub a15() hard(5*inputA)

    # Layer 2
    sub a8() hard(-5*inputA + 75*a15 - 37.5)

    # Layer 3
    sub aa() linear(-5*inputA + 75*a15 - 40*a8)

    # Layer 4
    sub a4() hard(aa - 17.5)

    # Layer 5
    sub a2() hard(aa - 20*a4 - 7.5)

    # Layer 6
    sub a1() linear(0.2*aa - 4*a4 - 2*a2)

    # Layer 7
    sub b15() hard(0.25*inputB - 5*a15)
    sub b8() hard(0.25*inputB - 5*a8)
    sub b4() hard(0.25*inputB - 5*a4)
    sub b2() hard(0.25*inputB - 5*a2)
    sub b1() hard(0.25*inputB - 5*a1)

    # Layer 8
    sub output() linear(-300*b15 + 160*b8 + 80*b4 + 40*b2 + 20*b1 - 10*inputA)

    # Test
    for $a (-10..10)
    for $b (-10..10)
    die if abs($a * $b - output) >= 0.5;



    print "All OK";


    Try it online!



    This does long multiplication in (sorta) binary, and thus returns the exact result. It should be possible to take advantage of the 0.5 error window to golf this some more, but I'm not sure how.



    Layers 1 to 6 decompose the first input in 5 "bits". For golfing reasons, we don't use actual binary. The most significant "bit" has weight -15 instead of 16, and when the input is 0, all the "bits" are 0.5 (which still works out fine, since it preserves the identity inputA = -15*a15 + 8*a8 + 4*a4 + 2*a2 + 1*a1).






    share|improve this answer











    $endgroup$








    • 1




      $begingroup$
      I did expect someone to come up with a hard-coded, ANN-ified multiplication algorithm. But I didn't think it would be the first response. Well done! (I'm also eager to see whether you'll be able to pull something like this off with the MNIST dataset or some other, more relastic ML problem :D.)
      $endgroup$
      – Stefan Mesken
      6 hours ago















    7












    $begingroup$


    33 31 weights





    # Activation functions
    sub hard $_[0] < -2.5 ? 0 : $_[0] > 2.5 ? 1 : 0.2 * $_[0] + 0.5
    sub linear $_[0]

    # Layer 0
    sub inputA() $a
    sub inputB() $b

    # Layer 1
    sub a15() hard(5*inputA)

    # Layer 2
    sub a8() hard(-5*inputA + 75*a15 - 37.5)

    # Layer 3
    sub aa() linear(-5*inputA + 75*a15 - 40*a8)

    # Layer 4
    sub a4() hard(aa - 17.5)

    # Layer 5
    sub a2() hard(aa - 20*a4 - 7.5)

    # Layer 6
    sub a1() linear(0.2*aa - 4*a4 - 2*a2)

    # Layer 7
    sub b15() hard(0.25*inputB - 5*a15)
    sub b8() hard(0.25*inputB - 5*a8)
    sub b4() hard(0.25*inputB - 5*a4)
    sub b2() hard(0.25*inputB - 5*a2)
    sub b1() hard(0.25*inputB - 5*a1)

    # Layer 8
    sub output() linear(-300*b15 + 160*b8 + 80*b4 + 40*b2 + 20*b1 - 10*inputA)

    # Test
    for $a (-10..10)
    for $b (-10..10)
    die if abs($a * $b - output) >= 0.5;



    print "All OK";


    Try it online!



    This does long multiplication in (sorta) binary, and thus returns the exact result. It should be possible to take advantage of the 0.5 error window to golf this some more, but I'm not sure how.



    Layers 1 to 6 decompose the first input in 5 "bits". For golfing reasons, we don't use actual binary. The most significant "bit" has weight -15 instead of 16, and when the input is 0, all the "bits" are 0.5 (which still works out fine, since it preserves the identity inputA = -15*a15 + 8*a8 + 4*a4 + 2*a2 + 1*a1).






    share|improve this answer











    $endgroup$








    • 1




      $begingroup$
      I did expect someone to come up with a hard-coded, ANN-ified multiplication algorithm. But I didn't think it would be the first response. Well done! (I'm also eager to see whether you'll be able to pull something like this off with the MNIST dataset or some other, more relastic ML problem :D.)
      $endgroup$
      – Stefan Mesken
      6 hours ago













    7












    7








    7





    $begingroup$


    33 31 weights





    # Activation functions
    sub hard $_[0] < -2.5 ? 0 : $_[0] > 2.5 ? 1 : 0.2 * $_[0] + 0.5
    sub linear $_[0]

    # Layer 0
    sub inputA() $a
    sub inputB() $b

    # Layer 1
    sub a15() hard(5*inputA)

    # Layer 2
    sub a8() hard(-5*inputA + 75*a15 - 37.5)

    # Layer 3
    sub aa() linear(-5*inputA + 75*a15 - 40*a8)

    # Layer 4
    sub a4() hard(aa - 17.5)

    # Layer 5
    sub a2() hard(aa - 20*a4 - 7.5)

    # Layer 6
    sub a1() linear(0.2*aa - 4*a4 - 2*a2)

    # Layer 7
    sub b15() hard(0.25*inputB - 5*a15)
    sub b8() hard(0.25*inputB - 5*a8)
    sub b4() hard(0.25*inputB - 5*a4)
    sub b2() hard(0.25*inputB - 5*a2)
    sub b1() hard(0.25*inputB - 5*a1)

    # Layer 8
    sub output() linear(-300*b15 + 160*b8 + 80*b4 + 40*b2 + 20*b1 - 10*inputA)

    # Test
    for $a (-10..10)
    for $b (-10..10)
    die if abs($a * $b - output) >= 0.5;



    print "All OK";


    Try it online!



    This does long multiplication in (sorta) binary, and thus returns the exact result. It should be possible to take advantage of the 0.5 error window to golf this some more, but I'm not sure how.



    Layers 1 to 6 decompose the first input in 5 "bits". For golfing reasons, we don't use actual binary. The most significant "bit" has weight -15 instead of 16, and when the input is 0, all the "bits" are 0.5 (which still works out fine, since it preserves the identity inputA = -15*a15 + 8*a8 + 4*a4 + 2*a2 + 1*a1).






    share|improve this answer











    $endgroup$




    33 31 weights





    # Activation functions
    sub hard $_[0] < -2.5 ? 0 : $_[0] > 2.5 ? 1 : 0.2 * $_[0] + 0.5
    sub linear $_[0]

    # Layer 0
    sub inputA() $a
    sub inputB() $b

    # Layer 1
    sub a15() hard(5*inputA)

    # Layer 2
    sub a8() hard(-5*inputA + 75*a15 - 37.5)

    # Layer 3
    sub aa() linear(-5*inputA + 75*a15 - 40*a8)

    # Layer 4
    sub a4() hard(aa - 17.5)

    # Layer 5
    sub a2() hard(aa - 20*a4 - 7.5)

    # Layer 6
    sub a1() linear(0.2*aa - 4*a4 - 2*a2)

    # Layer 7
    sub b15() hard(0.25*inputB - 5*a15)
    sub b8() hard(0.25*inputB - 5*a8)
    sub b4() hard(0.25*inputB - 5*a4)
    sub b2() hard(0.25*inputB - 5*a2)
    sub b1() hard(0.25*inputB - 5*a1)

    # Layer 8
    sub output() linear(-300*b15 + 160*b8 + 80*b4 + 40*b2 + 20*b1 - 10*inputA)

    # Test
    for $a (-10..10)
    for $b (-10..10)
    die if abs($a * $b - output) >= 0.5;



    print "All OK";


    Try it online!



    This does long multiplication in (sorta) binary, and thus returns the exact result. It should be possible to take advantage of the 0.5 error window to golf this some more, but I'm not sure how.



    Layers 1 to 6 decompose the first input in 5 "bits". For golfing reasons, we don't use actual binary. The most significant "bit" has weight -15 instead of 16, and when the input is 0, all the "bits" are 0.5 (which still works out fine, since it preserves the identity inputA = -15*a15 + 8*a8 + 4*a4 + 2*a2 + 1*a1).







    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited 5 hours ago

























    answered 6 hours ago









    GrimyGrimy

    4,71112 silver badges26 bronze badges




    4,71112 silver badges26 bronze badges







    • 1




      $begingroup$
      I did expect someone to come up with a hard-coded, ANN-ified multiplication algorithm. But I didn't think it would be the first response. Well done! (I'm also eager to see whether you'll be able to pull something like this off with the MNIST dataset or some other, more relastic ML problem :D.)
      $endgroup$
      – Stefan Mesken
      6 hours ago












    • 1




      $begingroup$
      I did expect someone to come up with a hard-coded, ANN-ified multiplication algorithm. But I didn't think it would be the first response. Well done! (I'm also eager to see whether you'll be able to pull something like this off with the MNIST dataset or some other, more relastic ML problem :D.)
      $endgroup$
      – Stefan Mesken
      6 hours ago







    1




    1




    $begingroup$
    I did expect someone to come up with a hard-coded, ANN-ified multiplication algorithm. But I didn't think it would be the first response. Well done! (I'm also eager to see whether you'll be able to pull something like this off with the MNIST dataset or some other, more relastic ML problem :D.)
    $endgroup$
    – Stefan Mesken
    6 hours ago




    $begingroup$
    I did expect someone to come up with a hard-coded, ANN-ified multiplication algorithm. But I didn't think it would be the first response. Well done! (I'm also eager to see whether you'll be able to pull something like this off with the MNIST dataset or some other, more relastic ML problem :D.)
    $endgroup$
    – Stefan Mesken
    6 hours ago













    4












    $begingroup$


    21 13 weights



    This is based on the polarization identity of bilinear forms which in the one dimensional real case reduces to the polynomial identity:



    $$ xcdot y = frac(x+y)^2 - (x-y)^24$$



    So y1 just computes [x+y, x-y] using a linear transformation, and y3 is just the absolute value of y1 as a preprocessing step for the next one: Then the "hard" part is computing the squares which I will explain below, and after that just computing a difference and scaling which is again a linear operation.



    To compute the squares I use an exponential series $s$ which should be accurate for all integers $0,1,2,ldots,20$ within around $0.5$. This series is of the form



    $$ textapprox_square(x) = sum_i=0^2 w_i exp(0.0001 cdot i cdot x)$$



    where I just optimized for the weights W2 ($=(w_i)_i$). This whole approximation comprises again just two linear transformations with an exponential activation sandwiched in between. This approach results in a maximal deviation of about 0.02.





    function p = net(x)
    relu = @(x)max(0,x);

    % Linear: 4 weights
    y1 = [1,1;1,-1]*x;

    % Linear + ReLu + Sum: 1 weight (the -1):
    y2 = -1 * y1;
    y3 = relu(y1) + relu(y2);

    % Linear + exp: 3 weights:
    W1 = (0:2)*0.0001;
    y4 = exp(y3 * W1);

    % Linear: 3 weights:
    W2 = [99700114.4299316;-199400468.100687;99700353.6313757];
    y5 = y4 * W2;

    % Linear : 2 weights:
    p = [0.25,-0.25]*y5;

    end


    Try it online!






    share|improve this answer











    $endgroup$

















      4












      $begingroup$


      21 13 weights



      This is based on the polarization identity of bilinear forms which in the one dimensional real case reduces to the polynomial identity:



      $$ xcdot y = frac(x+y)^2 - (x-y)^24$$



      So y1 just computes [x+y, x-y] using a linear transformation, and y3 is just the absolute value of y1 as a preprocessing step for the next one: Then the "hard" part is computing the squares which I will explain below, and after that just computing a difference and scaling which is again a linear operation.



      To compute the squares I use an exponential series $s$ which should be accurate for all integers $0,1,2,ldots,20$ within around $0.5$. This series is of the form



      $$ textapprox_square(x) = sum_i=0^2 w_i exp(0.0001 cdot i cdot x)$$



      where I just optimized for the weights W2 ($=(w_i)_i$). This whole approximation comprises again just two linear transformations with an exponential activation sandwiched in between. This approach results in a maximal deviation of about 0.02.





      function p = net(x)
      relu = @(x)max(0,x);

      % Linear: 4 weights
      y1 = [1,1;1,-1]*x;

      % Linear + ReLu + Sum: 1 weight (the -1):
      y2 = -1 * y1;
      y3 = relu(y1) + relu(y2);

      % Linear + exp: 3 weights:
      W1 = (0:2)*0.0001;
      y4 = exp(y3 * W1);

      % Linear: 3 weights:
      W2 = [99700114.4299316;-199400468.100687;99700353.6313757];
      y5 = y4 * W2;

      % Linear : 2 weights:
      p = [0.25,-0.25]*y5;

      end


      Try it online!






      share|improve this answer











      $endgroup$















        4












        4








        4





        $begingroup$


        21 13 weights



        This is based on the polarization identity of bilinear forms which in the one dimensional real case reduces to the polynomial identity:



        $$ xcdot y = frac(x+y)^2 - (x-y)^24$$



        So y1 just computes [x+y, x-y] using a linear transformation, and y3 is just the absolute value of y1 as a preprocessing step for the next one: Then the "hard" part is computing the squares which I will explain below, and after that just computing a difference and scaling which is again a linear operation.



        To compute the squares I use an exponential series $s$ which should be accurate for all integers $0,1,2,ldots,20$ within around $0.5$. This series is of the form



        $$ textapprox_square(x) = sum_i=0^2 w_i exp(0.0001 cdot i cdot x)$$



        where I just optimized for the weights W2 ($=(w_i)_i$). This whole approximation comprises again just two linear transformations with an exponential activation sandwiched in between. This approach results in a maximal deviation of about 0.02.





        function p = net(x)
        relu = @(x)max(0,x);

        % Linear: 4 weights
        y1 = [1,1;1,-1]*x;

        % Linear + ReLu + Sum: 1 weight (the -1):
        y2 = -1 * y1;
        y3 = relu(y1) + relu(y2);

        % Linear + exp: 3 weights:
        W1 = (0:2)*0.0001;
        y4 = exp(y3 * W1);

        % Linear: 3 weights:
        W2 = [99700114.4299316;-199400468.100687;99700353.6313757];
        y5 = y4 * W2;

        % Linear : 2 weights:
        p = [0.25,-0.25]*y5;

        end


        Try it online!






        share|improve this answer











        $endgroup$




        21 13 weights



        This is based on the polarization identity of bilinear forms which in the one dimensional real case reduces to the polynomial identity:



        $$ xcdot y = frac(x+y)^2 - (x-y)^24$$



        So y1 just computes [x+y, x-y] using a linear transformation, and y3 is just the absolute value of y1 as a preprocessing step for the next one: Then the "hard" part is computing the squares which I will explain below, and after that just computing a difference and scaling which is again a linear operation.



        To compute the squares I use an exponential series $s$ which should be accurate for all integers $0,1,2,ldots,20$ within around $0.5$. This series is of the form



        $$ textapprox_square(x) = sum_i=0^2 w_i exp(0.0001 cdot i cdot x)$$



        where I just optimized for the weights W2 ($=(w_i)_i$). This whole approximation comprises again just two linear transformations with an exponential activation sandwiched in between. This approach results in a maximal deviation of about 0.02.





        function p = net(x)
        relu = @(x)max(0,x);

        % Linear: 4 weights
        y1 = [1,1;1,-1]*x;

        % Linear + ReLu + Sum: 1 weight (the -1):
        y2 = -1 * y1;
        y3 = relu(y1) + relu(y2);

        % Linear + exp: 3 weights:
        W1 = (0:2)*0.0001;
        y4 = exp(y3 * W1);

        % Linear: 3 weights:
        W2 = [99700114.4299316;-199400468.100687;99700353.6313757];
        y5 = y4 * W2;

        % Linear : 2 weights:
        p = [0.25,-0.25]*y5;

        end


        Try it online!







        share|improve this answer














        share|improve this answer



        share|improve this answer








        edited 55 mins ago

























        answered 1 hour ago









        flawrflawr

        28.1k6 gold badges73 silver badges198 bronze badges




        28.1k6 gold badges73 silver badges198 bronze badges




















            Stefan Mesken is a new contributor. Be nice, and check out our Code of Conduct.









            draft saved

            draft discarded


















            Stefan Mesken is a new contributor. Be nice, and check out our Code of Conduct.












            Stefan Mesken is a new contributor. Be nice, and check out our Code of Conduct.











            Stefan Mesken is a new contributor. Be nice, and check out our Code of Conduct.














            If this is an answer to a challenge…



            • …Be sure to follow the challenge specification. However, please refrain from exploiting obvious loopholes. Answers abusing any of the standard loopholes are considered invalid. If you think a specification is unclear or underspecified, comment on the question instead.


            • …Try to optimize your score. For instance, answers to code-golf challenges should attempt to be as short as possible. You can always include a readable version of the code in addition to the competitive one.
              Explanations of your answer make it more interesting to read and are very much encouraged.


            • …Include a short header which indicates the language(s) of your code and its score, as defined by the challenge.


            More generally…



            • …Please make sure to answer the question and provide sufficient detail.


            • …Avoid asking for help, clarification or responding to other answers (use comments instead).




            draft saved


            draft discarded














            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodegolf.stackexchange.com%2fquestions%2f187562%2fmachine-learning-golf-multiplication%23new-answer', 'question_page');

            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            Canceling a color specificationRandomly assigning color to Graphics3D objects?Default color for Filling in Mathematica 9Coloring specific elements of sets with a prime modified order in an array plotHow to pick a color differing significantly from the colors already in a given color list?Detection of the text colorColor numbers based on their valueCan color schemes for use with ColorData include opacity specification?My dynamic color schemes

            Invision Community Contents History See also References External links Navigation menuProprietaryinvisioncommunity.comIPS Community ForumsIPS Community Forumsthis blog entry"License Changes, IP.Board 3.4, and the Future""Interview -- Matt Mecham of Ibforums""CEO Invision Power Board, Matt Mecham Is a Liar, Thief!"IPB License Explanation 1.3, 1.3.1, 2.0, and 2.1ArchivedSecurity Fixes, Updates And Enhancements For IPB 1.3.1Archived"New Demo Accounts - Invision Power Services"the original"New Default Skin"the original"Invision Power Board 3.0.0 and Applications Released"the original"Archived copy"the original"Perpetual licenses being done away with""Release Notes - Invision Power Services""Introducing: IPS Community Suite 4!"Invision Community Release Notes

            François Viète Contents Biography Work and thought Bibliography See also Notes Further reading External links Navigation menup. 21Google Bookspp. 75–77Google BooksDe thou (from University of Saint Andrews)ArchivedGoogle BooksGoogle BooksGoogle BooksGoogle booksGoogle Bookscc-parthenay.frL'histoire universelle (fr)Universal History (en)ArchivedAdsabs.harvard.eduPagesperso-orange.frArchive.orgChikara Sasaki. Descartes' mathematical thought p.259Google BooksGoogle BooksGoogle Bookspp. 152 and onwardGoogle BooksGoogle BooksScribd.comGoogle Books1257-7979Google BooksGoogle BooksGoogle BooksGoogle BooksGoogle BooksGoogle BooksGallica.bnf.frGoogle BooksGoogle Books"François Viète"Francois Viète: Father of Modern Algebraic NotationThe Lawyer and the GamblerAbout TarporleySite de Jean-Paul GuichardL'algèbre nouvelle"About the Harmonicon"cb120511976(data)1188044800000 0001 0913 5903n82164680ola2013766880073431702w6vt1sb70287374827140948071409480