Jargon request: “Canonical Form” of a word


I have zero experience with linguistics. Some friends told me that my question is a linguistic one, so I decided to give it a shot here.



Question



When designing a dictionary, people collect words into it. Naively, each word would occupy its own entry, but that is not efficient. Instead, we should combine words like 'book' and 'books' into one entry, and should probably name that entry 'book', because the word 'book' is more "reduced" than the word 'books'.



In this case, I would like to say that the 'canonical form' of the word 'books' is 'book'. What is the term in linguistics (if any) that corresponds to my 'canonical form'? And how do people deal with this dictionary-design problem? Any related terms would also be appreciated. Thank you in advance!










asked 8 hours ago by Student

1 Answer






The term you are looking for is lemma.

And yes, most linguists will treat book and books as the same "word" (in the sense of word = lemma) in a dictionary, which is why the lemma is sometimes also called the dictionary form of a word. This applies not only when writing a literal dictionary; it is also the common understanding of the term "word" in theoretical linguistics: "books" and "book" are generally regarded as belonging to the same lexicon entry, where "lexicon" means not a physical book you keep on your bookshelf, but an abstract idea of a language's word inventory that speakers have intuitions about.

I'm not sure exactly what more you'd like to know about how this problem is dealt with. Basically, forms of a word which are mere grammatical variants (such as singular vs. plural for nouns, present vs. past tense for verbs, ...) but carry the same core meaning are treated as the same "word" (in the sense of lemma), while two words of the same form but with different meanings are treated as belonging to different lemmas. An example is "bow": to bend down vs. something to shoot an arrow with. Both words happen to have the same form, but it makes sense to treat them as different words. More difficult cases include words like "parliament", which can refer to a building or an institution: is this still in some way the same meaning, or should the two senses be treated as different entries in a lexicon? But that is more a question of where to draw the line between different "meanings", while "lemma" in the sense of "an abstraction over all grammatical variants of a word" is relatively clear-cut.

BTW: The process of computing the lemma for a given word form (e.g. books -> book), often used in computational linguistics, is called lemmatization.
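To make the idea concrete, here is a minimal sketch of a lexicon keyed by lemma, plus a lookup-based lemmatizer. The entries, glosses, and the names `LEXICON`, `build_index`, and `lemmatize` are all hypothetical and chosen for illustration only. It simply looks word forms up in a hand-written table, which is an assumption of this sketch and not how production lemmatizers work.

```python
from collections import defaultdict

# Hypothetical toy lexicon: one entry per lemma, listing its grammatical variants.
# The two "bow" entries share a spelling but are kept as separate lemmas.
LEXICON = [
    {"lemma": "book", "gloss": "bound set of pages",
     "forms": ["book", "books"]},
    {"lemma": "walk", "gloss": "move on foot",
     "forms": ["walk", "walks", "walked", "walking"]},
    {"lemma": "bow", "gloss": "bend down",
     "forms": ["bow", "bows", "bowed", "bowing"]},
    {"lemma": "bow", "gloss": "weapon for shooting arrows",
     "forms": ["bow", "bows"]},
]


def build_index(lexicon):
    """Map every word form to the list of entries it can belong to."""
    index = defaultdict(list)
    for entry in lexicon:
        for form in entry["forms"]:
            index[form].append(entry)
    return index


INDEX = build_index(LEXICON)


def lemmatize(word):
    """Return the (lemma, gloss) candidates for a word form, or [] if unknown."""
    return [(e["lemma"], e["gloss"]) for e in INDEX.get(word.lower(), [])]


if __name__ == "__main__":
    print(lemmatize("books"))  # one candidate: the lemma "book"
    print(lemmatize("bows"))   # two candidates: both "bow" lemmas
```

Note that `lemmatize("bows")` returns two candidates: the form alone cannot decide between the two "bow" lemmas, so a real system needs context (for example the part of speech) to choose. Libraries such as NLTK and spaCy ship lemmatizers that combine rules, word lists, and context instead of a fixed table like this one.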






answered 7 hours ago by lemontree, edited 6 hours ago



















