Fabien Torre's site, Lille university, France


Datasets

The datasets I used: original data from the UCI Machine Learning Repository, together with my own cross-validation files.
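
The "majority class" figure quoted for each dataset is the accuracy of a classifier that always predicts the most frequent class, so it serves as a baseline for reading the "best observed accuracy". A minimal Python sketch of that computation follows. The function name and the sample labels are illustrative assumptions, not taken from the files on this site.

```python
from collections import Counter


def majority_class_rate(labels):
    """Return the share (in percent) of the most frequent class label."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    # most_common(1) returns a one-element list: [(label, count)]
    _, top_count = counts.most_common(1)[0]
    return 100.0 * top_count / len(labels)


# Illustrative labels only: three balanced classes of 50 instances each,
# the same shape as the iris entry (3 classes, 150 instances).
labels = ["a"] * 50 + ["b"] * 50 + ["c"] * 50
print(f"{majority_class_rate(labels):.1f} %")
```

With three balanced classes the sketch prints 33.3 %, which matches the majority-class figure listed for iris.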

Mushroom Database (agaricus-lepiota)

Description

  • original description: [ agaricus-lepiota.txt ]
  • 2 classes, 8124 instances, 22 attributes (all nominal)
  • 1.4 % missing values
  • best observed accuracy: 100.0 % (majority class: 51.8 %)

Downloads

Standardized Audiology Database (audiology)

Description

  • original description: [ audiology.txt ]
  • 24 classes, 226 instances, 70 attributes (all nominal)
  • 2.0 % missing values
  • best observed accuracy: 88.0 % (majority class: 25.2 %)

Downloads

ML94/COLT94 Badge Problem (badges)

Description

  • original description: [ badges.txt ]
  • 2 classes, 294 instances, 9 attributes (all nominal)
  • no missing values
  • best observed accuracy: 98.7 % (majority class: 71.4 %)

Downloads

Blood Transfusion Service Center (blood-transfusion)

Description

  • original description: [ blood-transfusion.txt ]
  • 2 classes, 748 instances, 4 attributes (all numeric)
  • no missing values
  • percent of instances in the majority class: 76.2 %

Downloads

Wisconsin Breast Cancer (breast-cancer)

Description

  • original description: [ breast-cancer.txt ]
  • 2 classes, 699 instances, 10 attributes (all numeric)
  • 0.2 % missing values
  • best observed accuracy: 97.0 % (majority class: 65.5 %)

Downloads

Car Evaluation Database (car)

Description

  • original description: [ car.txt ]
  • 4 classes, 1728 instances, 6 attributes (all nominal)
  • no missing values
  • best observed accuracy: 99.5 % (majority class: 70.0 %)

Downloads

Contraceptive Method Choice (cmc)

Description

  • original description: [ cmc.txt ]
  • 3 classes, 1473 instances, 9 attributes (2 numeric and 7 nominal)
  • no missing values
  • best observed accuracy: 55.2 % (majority class: 42.7 %)

Downloads

Credit Approval (crx)

Description

  • original description: [ crx.txt ]
  • 2 classes, 690 instances, 15 attributes (6 numeric and 9 nominal)
  • 0.6 % missing values
  • best observed accuracy: 86.6 % (majority class: 55.5 %)

Downloads

Dermatology Database (dermatology)

Description

  • original description: [ dermatology.txt ]
  • 6 classes, 366 instances, 34 attributes (1 numeric and 33 nominal)
  • 0.1 % missing values
  • best observed accuracy: 96.9 % (majority class: 30.6 %)

Downloads

Protein Localization Sites (ecoli)

Description

  • original description: [ ecoli.txt ]
  • 8 classes, 336 instances, 8 attributes (all numeric)
  • no missing values
  • best observed accuracy: 85.4 % (majority class: 42.6 %)

Downloads

Glass Identification (glass)

Description

  • original description: [ glass.txt ]
  • 6 classes, 214 instances, 10 attributes (all numeric)
  • no missing values
  • best observed accuracy: 95.5 % (majority class: 35.5 %)

Downloads

Hepatitis Domain (hepatitis)

Description

  • original description: [ hepatitis.txt ]
  • 2 classes, 155 instances, 19 attributes (6 numeric and 13 nominal)
  • 5.7 % missing values
  • best observed accuracy: 85.2 % (majority class: 79.4 %)

Downloads

Horse Colic Database (horse-colic)

Description

  • original description: [ horse-colic.txt ]
  • 2 classes, 368 instances, 23 attributes (7 numeric and 16 nominal)
  • 22.8 % missing values
  • best observed accuracy: 86.4 % (majority class: 63.0 %)

Downloads

1984 United States Congressional Voting Records Database (house-votes-84)

Description

  • original description: [ house-votes-84.txt ]
  • 2 classes, 435 instances, 16 attributes (all nominal)
  • 5.6 % missing values
  • best observed accuracy: 96.8 % (majority class: 61.4 %)

Downloads

Ionosphere (ionosphere)

Description

  • original description: [ ionosphere.txt ]
  • 2 classes, 351 instances, 34 attributes (all numeric)
  • no missing values
  • best observed accuracy: 93.8 % (majority class: 64.1 %)

Downloads

Iris Plant (iris)

Description

  • original description: [ iris.txt ]
  • 3 classes, 150 instances, 4 attributes (all numeric)
  • no missing values
  • best observed accuracy: 96.7 % (majority class: 33.3 %)

Downloads

MAGIC gamma telescope data 2004 (magic04)

Description

  • original description: [ magic04.txt ]
  • 2 classes, 19020 instances, 10 attributes (all numeric)
  • no missing values
  • percent of instances in the majority class: 64.8 %

Downloads

Ozone Level Detection (ozone)

Description

  • original description: [ ozone.txt ]
  • 2 classes, 2536 instances, 73 attributes (all numeric)
  • 8.1 % missing values
  • percent of instances in the majority class: 97.1 %

Downloads

Parkinsons Data Set (parkinsons)

Description

  • original description: [ parkinsons.txt ]
  • 2 classes, 195 instances, 23 attributes (all numeric)
  • no missing values
  • percent of instances in the majority class: 75.4 %

Downloads

Pima Indians Diabetes (pima)

Description

  • original description: [ pima.txt ]
  • 2 classes, 768 instances, 8 attributes (all numeric)
  • no missing values
  • best observed accuracy: 75.4 % (majority class: 65.1 %)

Downloads

Promoter Gene Sequences Database (promoters)

Description

  • original description: [ promoters.txt ]
  • 2 classes, 106 instances, 57 attributes (all nominal)
  • no missing values
  • best observed accuracy: 96.2 % (majority class: 50.0 %)

Downloads

Sonar: Mines vs. Rocks (sonar)

Description

  • original description: [ sonar.txt ]
  • 2 classes, 208 instances, 60 attributes (all numeric)
  • no missing values
  • best observed accuracy: 85.5 % (majority class: 53.4 %)

Downloads

Spambase Data Set (spambase)

Description

  • original description: [ spambase.txt ]
  • 2 classes, 4601 instances, 57 attributes (all numeric)
  • no missing values
  • percent of instances in the majority class: 60.6 %

Downloads

Tic-Tac-Toe Endgame (tic-tac-toe)

Description

  • original description: [ tic-tac-toe.txt ]
  • 2 classes, 958 instances, 9 attributes (all nominal)
  • no missing values
  • best observed accuracy: 100.0 % (majority class: 65.3 %)

Downloads

Vowel Recognition (vowel)

Description

  • original description: [ vowel.txt ]
  • 11 classes, 990 instances, 10 attributes (all numeric)
  • no missing values
  • best observed accuracy: 93.7 % (majority class: 9.1 %)

Downloads

Wine Recognition (wine)

Description

  • original description: [ wine.txt ]
  • 3 classes, 178 instances, 13 attributes (all numeric)
  • no missing values
  • best observed accuracy: 97.7 % (majority class: 39.9 %)

Downloads

Zoo database (zoo)

Description

  • original description: [ zoo.txt ]
  • 7 classes, 101 instances, 17 attributes (all nominal)
  • no missing values
  • best observed accuracy: 97.3 % (majority class: 40.6 %)

Downloads
