How do I decide if I need to go for Normalization and not Standardization or vice-versa?
While designing an ML model, how do I decide if I need to go for normalization and not standardization, or vice-versa? On what factors is this decision based?
machine-learning python data-science-model
asked 4 hours ago by Ajith Madhav (new contributor)
2 Answers
Before we start, keep in mind that in most cases it doesn't make much of a difference which of the two you choose.
Now, to answer your question: generally speaking, the choice should be made based on the model you want to employ.
If you use a distance-based estimator (e.g. k-NN, k-means), it's better to normalize your features so that they all occupy exactly the same range of values (i.e. $[0,1]$). This forces your estimator to treat each feature with equal importance.
If you're using neural networks, it's better to standardize your features, because gradient descent has some useful properties when your data is centered around $0$ with unit variance.
Tree-based algorithms don't require any form of scaling, so it's irrelevant whether you standardize or normalize your features.
As a rule of thumb, I usually standardize the data (unless I'm going to work strictly with distance-based algorithms).
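To make the two options concrete: normalization here means min-max scaling, $x' = (x - x_{\min}) / (x_{\max} - x_{\min})$, while standardization means $z = (x - \mu) / \sigma$. A minimal sketch, assuming scikit-learn is available, using made-up values for illustration:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Toy feature matrix (illustrative values only): two features on very
# different scales, which is exactly when scaling matters.
X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [3.0, 500.0]])

# Normalization: rescale each feature into the [0, 1] range.
X_norm = MinMaxScaler().fit_transform(X)

# Standardization: center each feature at 0 with unit variance.
X_std = StandardScaler().fit_transform(X)

print(X_norm)  # each column now lies in [0, 1]
print(X_std)   # each column now has mean 0 and std 1
```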
answered 2 hours ago by Djib2011
I think it depends purely on the model. For instance, Naive Bayes deals only with probabilities, so it can't work with negative values; in that case normalization works.
When you deal with geometry-based algorithms such as SVM or logistic regression, it's better to standardize the data: the resulting (-1, 1)-style symmetry around zero tends to make training converge faster than with normalization.
I believe standardization works for most algorithms. However, I suggest you check the context of the algorithm and its loss function before deciding.
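As a minimal sketch of the Naive Bayes point, assuming scikit-learn: MultinomialNB is the variant that rejects negative inputs, so normalizing into $[0,1]$ (rather than standardizing, which produces negative values) is what makes the pipeline run. The toy data below is made up for illustration:

```python
from sklearn.datasets import make_classification
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic data whose features include negative values.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# MultinomialNB raises an error on negative inputs, so we normalize
# the features into [0, 1] first instead of standardizing them.
clf = make_pipeline(MinMaxScaler(), MultinomialNB())
clf.fit(X, y)
print(clf.score(X, y))
```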
answered 1 hour ago by Kalyan Prasad (new contributor); edited 26 mins ago by Stephen Rauch♦