https://builtin.com/data-science/EDA-python
Four EDA reporting tools with similar features:
DataPrep
pip install dataprep
import pandas as pd
from dataprep.eda import create_report
df = pd.read_csv("parking_violations.csv")
create_report(df)
YData Profiling
pip install ydata-profiling
from ydata_profiling import ProfileReport
profile = ProfileReport(df, title="Report")
profile
SweetViz
pip install sweetviz
import sweetviz as sv
analyze_report = sv.analyze(df)
analyze_report.show_html('report.html', open_browser=False)
AutoViz
pip install autoviz
from autoviz.AutoViz_Class import AutoViz_Class
AV = AutoViz_Class()
df_av = AV.AutoViz('parking.csv')
See also https://towardsdatascience.com/comparing-five-most-popular-eda-tools-dccdef05aa4c
Lux
pip install lux-api
jupyter nbextension install --py luxwidget
jupyter nbextension enable --py luxwidget
import pandas as pd
import lux
df = pd.read_csv("https://raw.githubusercontent.com/plotly/datasets/master/gapminderDataFiveYear.csv", parse_dates=["year"])
df
D-Tale
pip install dtale
import dtale
import pandas as pd
df = pd.read_csv('data.csv')
d = dtale.show(df)
d.open_browser()
And also https://www.kaggle.com/code/mozattt/automated-eda-tools-part-1
Dabl
!pip install dabl
import pandas as pd
import dabl
Titanic_data = pd.read_csv("../input/titanic/train.csv")
TitanicClean = dabl.clean(Titanic_data, verbose=1)
dabl.plot(TitanicClean, target_col="Survived")
Datatile
import pandas as pd
from datatile.summary.df import DataFrameSummary
df = pd.read_csv('../input/titanic/train.csv')
dfs = DataFrameSummary(df)
dfs.summary()
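For comparison, here is a rough pandas-only sketch of the kind of per-column table these summaries produce (types, missing values, distinct counts). The helper name quick_overview and the tiny sample frame are made up for illustration; this does not reproduce the output of any of the tools above.

```python
import pandas as pd


def quick_overview(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column: dtype, missing count, missing share, distinct values."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "n_missing": df.isna().sum(),
        "pct_missing": df.isna().mean().round(3),
        "n_unique": df.nunique(),  # NaN/None are not counted as a distinct value
    })


# Tiny made-up frame, only for illustration (hypothetical column names)
sample = pd.DataFrame({
    "plate": ["A1", "B2", "B2", None],
    "fine": [50.0, 75.0, None, 115.0],
})

overview = quick_overview(sample)
print(overview)
```

A table like this is handy as a sanity check before opening a full report: if a column is mostly missing or has a single distinct value, the generated charts for it will not say much.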