Real-World Datasets
The Titanic dataset from the Titanic Kaggle competition.
The Adult Census Income dataset from Kaggle.
The Forest Cover Type dataset from Kaggle.
The Dota2 Game Results dataset.
The Mushroom classification Kaggle competition dataset.
The Poker Hand dataset.
The Bank Marketing dataset.
A collection of tabular benchmark datasets introduced in "Why do tree-based models still outperform deep learning on tabular data?".
The Yandex dataset collections used by "Revisiting Deep Learning Models for Tabular Data".
The KDD Census Income dataset.
The benchmark datasets for tabular data with text columns, used by "Benchmarking Multimodal AutoML for Tabular Data with Text Fields".
A collection of standardized datasets for tabular learning, covering categorical and numerical features.
A collection of datasets for tabular learning with text columns, covering categorical, numerical, multi-categorical and timestamp features.
The Mercari Price Suggestion Challenge dataset from Kaggle.
The MovieLens 1M rating dataset, assembled by GroupLens Research from the MovieLens web site, consisting of movies (3,883 nodes) and users (6,040 nodes) with approximately 1 million ratings between them.
The Amazon Fine Food Reviews dataset.
The Diamond Images dataset from Kaggle.
Synthetic Datasets
A fake dataset for testing purposes.
Other Datasets
Load a Hugging Face