EDA tools in Python

https://builtin.com/data-science/EDA-python

Four EDA reporting tools with similar functionality:

DataPrep

pip install dataprep

import pandas as pd
from dataprep.eda import create_report
df = pd.read_csv("parking_violations.csv")
create_report(df)

Ydata Profiling

pip install ydata-profiling

from ydata_profiling import ProfileReport
profile = ProfileReport(df, title="Report")
profile

SweetViz

pip install sweetviz

import sweetviz as sv
analyze_report = sv.analyze(df)
analyze_report.show_html('report.html', open_browser=False)

AutoViz

pip install autoviz

from autoviz.AutoViz_Class import AutoViz_Class
AV = AutoViz_Class()
df_av = AV.AutoViz('parking.csv')
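
For context, the reports these four tools generate are built on per-column statistics such as data type, missing values and distinct values. A minimal sketch of that kind of overview in plain pandas follows; the DataFrame contents and the overview helper are hypothetical and only illustrate the idea, they are not taken from any of the libraries above.

```python
import pandas as pd

# Hypothetical sample data standing in for a real CSV file.
df = pd.DataFrame({
    "state": ["NY", "NJ", "NY", None],
    "fine": [65.0, 115.0, None, 65.0],
})

def overview(frame: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column: dtype, missing count, distinct count."""
    return pd.DataFrame({
        "dtype": frame.dtypes.astype(str),
        "missing": frame.isna().sum(),   # NaN/None values per column
        "unique": frame.nunique(),       # distinct non-missing values
    })

summary = overview(df)
print(summary)
```

The dedicated tools add plots, correlations and warnings on top of this, which is where they save the most time.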

See also https://towardsdatascience.com/comparing-five-most-popular-eda-tools-dccdef05aa4c

Lux

pip install lux-api

jupyter nbextension install --py luxwidget
jupyter nbextension enable --py luxwidget

import pandas as pd
import lux
df = pd.read_csv("https://raw.githubusercontent.com/plotly/datasets/master/gapminderDataFiveYear.csv", parse_dates=["year"])
df
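
The parse_dates argument in the read_csv call above tells pandas to convert the year column to datetime values instead of leaving it as plain integers. A small sketch of that behaviour, using a made-up in-memory CSV (the rows below are hypothetical, not the gapminder data):

```python
import io
import pandas as pd

# Hypothetical three-column CSV with a year-only date column.
csv_text = "country,year,pop\nA,1952,100\nA,1957,120\n"

# parse_dates converts the listed columns to datetimes while reading.
sample = pd.read_csv(io.StringIO(csv_text), parse_dates=["year"])

print(sample.dtypes)
print(sample["year"].dt.year.tolist())
```

A year-only value is read as January 1 of that year, so the original year remains available through the .dt.year accessor.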

D-Tale

pip install dtale

import dtale
import pandas as pd
df = pd.read_csv('data.csv')
d = dtale.show(df)
d.open_browser()

And also https://www.kaggle.com/code/mozattt/automated-eda-tools-part-1

Dabl

pip install dabl

import dabl
import pandas as pd
Titanic_data = pd.read_csv("../input/titanic/train.csv")
TitanicClean = dabl.clean(Titanic_data, verbose=1)
dabl.plot(TitanicClean, target_col="Survived")

Datatile

pip install datatile

import pandas as pd
from datatile.summary.df import DataFrameSummary 

df = pd.read_csv('../input/titanic/train.csv') 
dfs = DataFrameSummary(df) 
dfs.summary()
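
As a baseline for comparison, pandas' built-in describe() already gives a bare-bones summary table without any extra install. A minimal sketch, assuming a small hand-made DataFrame (the values are hypothetical and only loosely modelled on Titanic-style columns):

```python
import pandas as pd

# Hypothetical data with one numeric and one categorical column.
df = pd.DataFrame({
    "Age": [22.0, 38.0, None, 35.0],
    "Sex": ["male", "female", "female", "female"],
})

# include="all" reports numeric stats (mean, std, quartiles)
# and categorical stats (unique, top, freq) in one table.
desc = df.describe(include="all")
print(desc)
```

Cells that do not apply to a column, such as the mean of Sex, are shown as NaN.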
