Python Feature Selection: Remove Multicollinearity from Machine Learning Model in Python

Stats Wire

2 years ago

7,544 views


Comments:

@anishdeshpande395 - 17.10.2023 00:19

Is this method better than variance inflation factor?


@baburamchaudhary159 - 21.09.2023 22:17

I have been following your feature selection videos, which covered forward, backward, and exhaustive selection, variance threshold, chi2, etc.
You have not shared the dataset in any of them.
Why don't you share the dataset so that we can follow along?


@naveedullah390 - 28.04.2023 14:46

When I run the line:  corrmatrix = X_train.corr()
I get the error:  AttributeError: 'numpy.ndarray' object has no attribute 'corr'
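
The error suggests that X_train is a NumPy array at that point (for instance after scaling or after calling .values), whereas .corr() is a pandas DataFrame method. A minimal sketch of one way around it follows; the random data and the column names f0, f1, f2 are placeholders, not taken from the video.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for an X_train that has become a NumPy array,
# e.g. after scaling or after calling .values on a DataFrame.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 3))

# Placeholder column names; use the names from your original DataFrame.
feature_names = ["f0", "f1", "f2"]

# .corr() is a pandas DataFrame method, so wrap the array first.
X_train_df = pd.DataFrame(X_train, columns=feature_names)
corrmatrix = X_train_df.corr()

# NumPy-only alternative: rowvar=False treats columns as variables.
corr_np = np.corrcoef(X_train, rowvar=False)
```

Both routes give the same Pearson correlations; the DataFrame version keeps the column labels, which the later feature-dropping step usually relies on.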


@user-ep5mg7kx5n - 14.03.2023 18:04

What about a scenario where the order of the columns changes? Since we check adjacent columns for a correlation above the threshold and then remove the first of the two when the threshold is met or exceeded, changing the order of the columns will change the result. Is that still going to be a correct list of features?
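
The order dependence can be reproduced with a small sketch. The video's code is not reproduced in this thread, so the function body below is an assumption: it follows a commonly used lower-triangle loop that flags the later column of each highly correlated pair. The toy data is also made up for illustration.

```python
import numpy as np
import pandas as pd

def correlation(df, threshold):
    """Assumed form: flag the later column of each pair with |corr| > threshold."""
    flagged = set()
    corr_matrix = df.corr()
    for i in range(len(corr_matrix.columns)):
        for j in range(i):  # only pairs with j < i
            if abs(corr_matrix.iloc[i, j]) > threshold:
                flagged.add(corr_matrix.columns[i])
    return flagged

# Toy data: "b" is nearly a copy of "a", "c" is unrelated.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
toy = pd.DataFrame({
    "a": a,
    "b": a + 0.1 * rng.normal(size=500),
    "c": rng.normal(size=500),
})

# Same data, two column orders.
drop_abc = correlation(toy[["a", "b", "c"]], 0.9)
drop_bac = correlation(toy[["b", "a", "c"]], 0.9)
print(drop_abc, drop_bac)
```

Under these assumptions the first call flags "b" and the second flags "a". Either result removes the redundant pair, so both lists are usable; which one is preferable depends on criteria beyond pairwise correlation.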


@d1pranjal - 26.01.2023 11:40

How are the diagonal elements handled in the user-defined function correlation(df, threshold)?
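
The body of correlation(df, threshold) is not shown in this thread, so how it treats the diagonal cannot be confirmed here. One way to make the handling explicit, sketched below on made-up data, is to mask the diagonal and the lower triangle before thresholding, so that the self-correlations of 1.0 never take part in the comparison.

```python
import numpy as np
import pandas as pd

# Toy data (an assumption for illustration): "b" is nearly a copy of "a".
rng = np.random.default_rng(0)
a = rng.normal(size=500)
df = pd.DataFrame({
    "a": a,
    "b": a + 0.1 * rng.normal(size=500),
    "c": rng.normal(size=500),
})
threshold = 0.9

corr = df.corr().abs()

# k=1 keeps only entries strictly above the diagonal; the diagonal
# and the lower triangle become NaN after .where().
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
upper = corr.where(mask)

# NaN > threshold evaluates to False, so masked cells are ignored.
to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
print(to_drop)
```

Each pair is inspected once, and a column is listed only when it is highly correlated with a column that appears before it.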


@akiwhitesoyo918 - 15.11.2022 16:54

Nice! Would it be the same if we used PCA to avoid multicollinearity?


@gisflow406 - 30.10.2022 11:28

This wasn't helpful at all. You just picked one of the correlated variables arbitrarily, without any additional criteria. Anyway, a correlation matrix can't do much. It's much more reliable to use VIF or hierarchical clustering for feature selection.
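
For readers who want to compare against VIF, here is a minimal sketch on made-up data. It relies on the identity that the VIF of each feature equals the corresponding diagonal entry of the inverse correlation matrix; the helper name vif_from_corr is hypothetical. statsmodels also provides variance_inflation_factor in statsmodels.stats.outliers_influence, which is not used here so that the example needs only NumPy and pandas.

```python
import numpy as np
import pandas as pd

def vif_from_corr(df):
    """Hypothetical helper: VIF_j is the j-th diagonal entry of inv(corr)."""
    corr = df.corr().to_numpy()
    inv = np.linalg.inv(corr)
    return pd.Series(np.diag(inv), index=df.columns, name="VIF")

# Toy data: "b" is nearly a copy of "a", "c" is unrelated.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
toy = pd.DataFrame({
    "a": a,
    "b": a + 0.1 * rng.normal(size=500),
    "c": rng.normal(size=500),
})

vif = vif_from_corr(toy)
print(vif)
```

Unlike a pairwise threshold, VIF measures how well each feature is explained by all the others together, so it can flag a feature that depends on a combination of columns even when no single pairwise correlation is high.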


@michaelsagols8295 - 14.10.2022 12:24

Thank you for the video! Very well explained! Keep it up!


@maskman9630 - 27.09.2022 11:27

How do you find collinearity for categorical features?


@imontesee - 13.09.2022 08:18

Nice video. What about checking which pair the high correlation is coming from, comparing each feature's correlation with the target column, and dropping only the one with the lower correlation?
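
The suggestion above can be sketched as follows. This is an illustration on made-up data, not the method from the video; the function name drop_by_target_corr and the toy columns are hypothetical.

```python
import numpy as np
import pandas as pd

def drop_by_target_corr(X, y, threshold):
    """For each feature pair with |corr| > threshold, flag the feature
    that is less correlated (in absolute value) with the target."""
    corr = X.corr().abs()
    target_corr = X.corrwith(y).abs()
    to_drop = set()
    cols = list(X.columns)
    for i in range(len(cols)):
        for j in range(i):
            first, second = cols[j], cols[i]
            if first in to_drop or second in to_drop:
                continue  # one of the pair is already removed
            if corr.loc[first, second] > threshold:
                weaker = first if target_corr[first] < target_corr[second] else second
                to_drop.add(weaker)
    return to_drop

# Toy data: the target depends on "a"; "b" is a noisy copy of "a".
rng = np.random.default_rng(0)
a = rng.normal(size=2000)
X = pd.DataFrame({
    "a": a,
    "b": a + 0.5 * rng.normal(size=2000),
    "c": rng.normal(size=2000),
})
y = pd.Series(a + 0.5 * rng.normal(size=2000))

print(drop_by_target_corr(X, y, 0.8))
print(drop_by_target_corr(X[["b", "a", "c"]], y, 0.8))
```

In this toy case "b" is flagged in both column orders, because the choice within a pair is driven by the target correlation and not by position. With several overlapping pairs the outcome can still depend on the order in which pairs are visited.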
