https://builtin.com/data-science/EDA-python
Four EDA reporting tools with similar features:
DataPrep
pip install dataprep

import pandas as pd
from dataprep.eda import create_report

df = pd.read_csv("parking_violations.csv")
create_report(df)
Ydata Profiling
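pip install ydata-profiling

# The package installs as ydata-profiling but is imported as ydata_profiling
from ydata_profiling import ProfileReport

# df is the DataFrame loaded with pd.read_csv
profile = ProfileReport(df, title="Report")
profile  # renders the report when it is the last expression in a notebook cell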
SweetViz
pip install sweetviz

import sweetviz as sv

# df is the DataFrame loaded with pd.read_csv
analyze_report = sv.analyze(df)
analyze_report.show_html('report.html', open_browser=False)
AutoViz
pip install autoviz

from autoviz.AutoViz_Class import AutoViz_Class

AV = AutoViz_Class()
# AutoViz reads the CSV file itself and returns the loaded DataFrame
df_av = AV.AutoViz('parking.csv')
See also https://towardsdatascience.com/comparing-five-most-popular-eda-tools-dccdef05aa4c
Lux
pip install lux-api
jupyter nbextension install --py luxwidget
jupyter nbextension enable --py luxwidget

import pandas as pd
import lux

df = pd.read_csv("https://raw.githubusercontent.com/plotly/datasets/master/gapminderDataFiveYear.csv", parse_dates=["year"])
df  # displaying the DataFrame in a notebook shows the Lux widget
D-Tale
pip install dtale

import dtale
import pandas as pd

df = pd.read_csv('data.csv')
d = dtale.show(df)
d.open_browser()
And also https://www.kaggle.com/code/mozattt/automated-eda-tools-part-1
Dabl
!pip install dabl

import pandas as pd
import dabl

Titanic_data = pd.read_csv("../input/titanic/train.csv")
# Detect column types and clean the data before plotting
TitanicClean = dabl.clean(Titanic_data, verbose=1)
dabl.plot(TitanicClean, target_col="Survived")
Datatile
import pandas as pd
from datatile.summary.df import DataFrameSummary

df = pd.read_csv('../input/titanic/train.csv')
dfs = DataFrameSummary(df)
dfs.summary()
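
The pip package name and the import name are not always the same in the snippets above (ydata-profiling is imported as ydata_profiling, lux-api as lux). A minimal sketch, using only the standard library, for checking which of these tools can be imported in the current environment. The helper name installed_eda_tools and the mapping are illustrative, and the pip name for Datatile is an assumption because these notes give no install command for it.

```python
from importlib.util import find_spec

# pip package name -> top-level import name, as used in the notes above.
# Assumption: the pip name for Datatile is "datatile" (not stated in the notes).
EDA_TOOLS = {
    "dataprep": "dataprep",
    "ydata-profiling": "ydata_profiling",
    "sweetviz": "sweetviz",
    "autoviz": "autoviz",
    "lux-api": "lux",
    "dtale": "dtale",
    "dabl": "dabl",
    "datatile": "datatile",
}


def installed_eda_tools(tools=EDA_TOOLS):
    """Return {pip_name: bool}, True when the import name can be located."""
    return {
        pip_name: find_spec(module) is not None
        for pip_name, module in tools.items()
    }


if __name__ == "__main__":
    # Print one line per tool without importing any of them.
    for pip_name, available in installed_eda_tools().items():
        status = "installed" if available else "missing"
        print(f"{pip_name:16} {status}")
```

find_spec only locates the module, so the check stays fast even for heavy packages and does not trigger their import side effects.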