How do I decide if I need to go for Normalization and not Standardization or vice-versa?



While designing an ML model, how do I decide whether to use normalization rather than standardization, or vice versa? What factors drive this decision?







      machine-learning python data-science-model






asked 4 hours ago by Ajith Madhav

2 Answers

Before we start, keep in mind that in most cases it doesn't make much difference which of the two you choose.

Now, to answer your question: generally speaking, the choice should be based on the model you want to use:

• If you use a distance-based estimator (e.g. k-NN, k-means), it's better to normalize your features so that they all occupy exactly the same range of values (i.e. $[0,1]$). This forces your estimator to treat each feature with equal importance.

• If you're using neural networks, it's better to standardize your features, because gradient descent tends to behave better when your data is centered around $0$ with unit variance.

• Tree-based algorithms don't require any form of scaling, so it's irrelevant whether you standardize or normalize your features.

As a rule of thumb, I usually standardize the data (unless I'm going to work strictly with distance-based algorithms).
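
To make the distance-based point concrete, here is a minimal sketch in plain Python. The helper names (`min_max_normalize`, `standardize`, `euclidean`) and the toy age/income data are my own assumptions for illustration, not part of any library. It shows how a large-scale feature (income) dominates Euclidean distance until the features are rescaled to $[0,1]$.

```python
import math
from statistics import mean, pstdev


def min_max_normalize(column):
    """Normalization: rescale values linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:  # constant feature: avoid division by zero
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def standardize(column):
    """Standardization: zero mean and unit (population) standard deviation."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:  # constant feature: avoid division by zero
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical toy data: age in years and income in dollars for three people.
ages = [25, 30, 60]
incomes = [50_000, 52_000, 51_000]

raw_rows = list(zip(ages, incomes))
norm_rows = list(zip(min_max_normalize(ages), min_max_normalize(incomes)))

# Distances from person 0 to persons 1 and 2, before and after scaling.
d_raw_01 = euclidean(raw_rows[0], raw_rows[1])
d_raw_02 = euclidean(raw_rows[0], raw_rows[2])
d_norm_01 = euclidean(norm_rows[0], norm_rows[1])
d_norm_02 = euclidean(norm_rows[0], norm_rows[2])

print("raw:       ", d_raw_01, d_raw_02)
print("normalized:", d_norm_01, d_norm_02)
print("standardized ages:", standardize(ages))
```

On the raw data, person 2 looks closer to person 0 than person 1 does, purely because the income gap is numerically larger than any age gap. After min-max scaling, the 35-year age difference counts as much as the income difference and the ordering flips.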







answered 2 hours ago by Djib2011

I think it depends purely on the model. For instance, some Naive Bayes variants (e.g. multinomial Naive Bayes) expect non-negative feature values, so you can't feed them the negative values that standardization produces. In this case normalization works!
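
A quick way to see the sign issue, as a rough sketch in plain Python (the helper functions and the word-count data are hypothetical, written only for this illustration): min-max normalization never produces negative values, while standardization turns every below-mean value negative.

```python
from statistics import mean, pstdev


def min_max_normalize(column):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:  # constant feature: avoid division by zero
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def standardize(column):
    """Shift to zero mean and rescale to unit (population) standard deviation."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:  # constant feature: avoid division by zero
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


# Hypothetical word counts for one feature across five documents.
counts = [0, 3, 1, 7, 2]

normalized = min_max_normalize(counts)
standardized = standardize(counts)

print(normalized)
print(standardized)
print(all(v >= 0 for v in normalized))   # min-max output is never negative
print(any(v < 0 for v in standardized))  # below-mean values become negative
```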



When you deal with geometry-based algorithms such as SVM or logistic regression, it's better to standardize the data, because standardized features are centered around zero with comparable spread (note that they are not confined to $(-1,1)$). Training typically converges faster (because of this symmetry around zero) than with normalization.
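
The convergence effect can be sketched with a toy experiment. Note the assumptions: this uses plain linear regression trained by batch gradient descent rather than an SVM or logistic regression, the data is made up, and `fit_line_gd` is a hypothetical helper. The two learning rates are chosen so that the step size relative to the largest curvature of the loss is roughly the same in both runs.

```python
from statistics import mean, pstdev


def fit_line_gd(xs, ys, lr, steps):
    """Fit y = w*x + b by batch gradient descent; return the final mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n


# Hypothetical data: a feature far from zero and an exactly linear target.
xs_raw = [1000.0 + i for i in range(10)]
ys = [2.0 * i + 1.0 for i in range(10)]

# Standardize the feature: zero mean, unit (population) standard deviation.
mu, sigma = mean(xs_raw), pstdev(xs_raw)
xs_std = [(x - mu) / sigma for x in xs_raw]

# The raw feature needs a tiny learning rate to stay stable; the
# standardized feature tolerates a much larger one.
loss_raw = fit_line_gd(xs_raw, ys, lr=1e-7, steps=100)
loss_std = fit_line_gd(xs_std, ys, lr=0.1, steps=100)

print("final MSE, raw feature:         ", loss_raw)
print("final MSE, standardized feature:", loss_std)
```

With the raw feature the loss surface is extremely elongated, so after the same number of steps the fit is still far from the exact line, while the standardized run is essentially converged.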



I believe standardization works well for most algorithms. However, I suggest you check the requirements of the algorithm and its loss function.







edited 26 mins ago by Stephen Rauch
answered 1 hour ago by Kalyan Prasad