
The No-Free-Lunch Theorem and K-NN consistency


In computational learning, the No-Free-Lunch (NFL) theorem states that there is no universal learner: for every learning algorithm there is a distribution on which the learner outputs, with high probability, a hypothesis with large error (even though a low-error hypothesis exists). The conclusion is that in order to learn, either the hypothesis class or the set of distributions must be restricted.

In their book "A Probabilistic Theory of Pattern Recognition", Devroye et al. prove the following theorem for the k-nearest-neighbors learner:

$$\text{Assume } \mu \text{ has a density. If } k \to \infty \text{ and } k/n \to 0, \\ \text{then for every } \epsilon > 0 \text{ there exists } N \text{ such that} \\ \text{for all } n > N: \quad P(R_n - R^* > \epsilon) < 2e^{-C_d n \epsilon^2}$$

where $R^*$ is the error of the Bayes-optimal rule, $R_n$ is the true error of the k-NN output (the probability is over the training set of size $n$), $\mu$ is the probability measure on the instance space $\mathbb{R}^d$, and $C_d$ is a constant that depends only on the Euclidean dimension $d$.

Therefore we can get as close as we want to the best hypothesis there is (not just the best in some restricted class), without making any assumption on the distribution. So how does this result not contradict the NFL theorem?
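To make the theorem's regime concrete, here is a quick numerical sanity check on a toy 1-D distribution, with $k = \sqrt{n}$ so that $k \to \infty$ and $k/n \to 0$. The distribution and the choice of $k$ are my own illustrative stand-ins, not taken from the book:

```python
import numpy as np

def knn_error(n, k, rng, n_test=500):
    # Toy 1-D problem: X ~ Uniform(0,1) and P(Y=1 | X=x) = x,
    # so the Bayes rule is 1{x > 1/2} with Bayes risk R* = 0.25.
    X = rng.uniform(0, 1, n)
    y = (rng.uniform(0, 1, n) < X).astype(int)
    Xt = rng.uniform(0, 1, n_test)
    yt = (rng.uniform(0, 1, n_test) < Xt).astype(int)
    # k-NN majority vote for every test point
    d = np.abs(Xt[:, None] - X[None, :])          # (n_test, n) distances
    nearest = np.argpartition(d, k - 1, axis=1)[:, :k]
    pred = (y[nearest].mean(axis=1) > 0.5).astype(int)
    return float((pred != yt).mean())

rng = np.random.default_rng(0)
# k = sqrt(n) satisfies k -> inf and k/n -> 0 as n grows
for n in (100, 1000, 10000):
    k = int(np.sqrt(n))
    print(f"n={n:6d} k={k:3d} test error={knn_error(n, k, rng):.3f}")
```

The test error settles near the Bayes risk of 0.25 as $n$ grows. Note, though, that the experiment fixes one benign distribution in advance; nothing here rules out a distribution that makes any *fixed* $n$ look bad.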























k-nearest-neighbour consistency

asked 9 hours ago, edited 8 hours ago
michael J




















          1 Answer

The way I understand the NFL theorem is that there is no learning algorithm that is better than the rest on every task. In the loose form it is usually invoked in practice ("no model dominates in general"), this is closer to an empirical observation than to the formal statement of the theorem.

Similar to what you said for k-NN, there is also the Universal Approximation Theorem for neural networks, which states that a 2-layer neural network can approximate any continuous function to arbitrary accuracy.

Now, how does this not break NFL? Taken at face value, it says a simple 2-layer NN can solve any conceivable problem. The catch is that while NNs can theoretically approximate anything, in practice it is very hard to train them to approximate a given target. That is why, for some tasks, other algorithms are preferable. The same catch applies to the k-NN result: the guarantee is asymptotic, and how large $n$ must be before the bound takes hold depends on the distribution, so for any fixed sample size there is still a distribution on which k-NN does badly, which is exactly what NFL says.

A more practical way to interpret NFL is the following:

There is no way to determine a priori which algorithm will do best on a given task.
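That "no a-priori best algorithm" reading is easy to demonstrate: below, a 1-NN rule and a least-squares linear rule trade places on two synthetic tasks. Both tasks and both learners are my own minimal stand-ins, chosen only to make the point:

```python
import numpy as np

def one_nn_err(Xtr, ytr, Xte, yte):
    # 1-nearest-neighbour: predict the label of the closest training point
    d = ((Xte[:, None, :] - Xtr[None, :, :]) ** 2).sum(-1)
    pred = ytr[d.argmin(axis=1)]
    return float((pred != yte).mean())

def linear_err(Xtr, ytr, Xte, yte):
    # least-squares linear classifier: sign of a fitted affine score
    A = np.c_[np.ones(len(Xtr)), Xtr]
    w, *_ = np.linalg.lstsq(A, 2 * ytr - 1, rcond=None)
    pred = (np.c_[np.ones(len(Xte)), Xte] @ w > 0).astype(int)
    return float((pred != yte).mean())

rng = np.random.default_rng(0)

def noisy_linear(n):
    # Bayes rule is sign(x1), but labels carry 30% noise:
    # 1-NN chases the noise, a linear fit does not
    X = rng.uniform(-1, 1, (n, 2))
    y = ((X[:, 0] > 0) ^ (rng.uniform(0, 1, n) < 0.3)).astype(int)
    return X, y

def xor(n):
    # noiseless checkerboard: no linear rule beats chance here
    X = rng.uniform(-1, 1, (n, 2))
    y = ((X[:, 0] > 0) ^ (X[:, 1] > 0)).astype(int)
    return X, y

for make, name in ((noisy_linear, "noisy linear"), (xor, "XOR")):
    Xtr, ytr = make(2000)
    Xte, yte = make(1000)
    print(f"{name:12s} 1-NN: {one_nn_err(Xtr, ytr, Xte, yte):.3f}"
          f"  linear: {linear_err(Xtr, ytr, Xte, yte):.3f}")
```

On the noisy linear task the linear rule wins; on XOR it collapses to chance while 1-NN is nearly perfect. Which learner is "better" depends entirely on which distribution you happen to face.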







answered 3 hours ago
CaucM