Fabien Torre's site, Lille university, France


Datasets

This page lists the datasets I used in my experiments: the original data come from the UCI Machine Learning Repository, and the cross-validation files are my own.
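Each entry on this page reports a majority-class percentage and, where applicable, a missing-value percentage. The sketch below shows one way such figures can be computed. It is only an illustration: the function names, the toy data, and the use of "?" as the missing-value marker are assumptions, not taken from the files distributed here.

```python
from collections import Counter


def majority_class_percent(labels):
    """Percentage of instances that belong to the most frequent class."""
    if not labels:
        raise ValueError("labels must not be empty")
    _, count = Counter(labels).most_common(1)[0]
    return 100.0 * count / len(labels)


def missing_value_percent(rows, missing_marker="?"):
    """Percentage of attribute values equal to the missing marker.

    The "?" default is an assumption about how missing values are encoded.
    """
    total = sum(len(row) for row in rows)
    if total == 0:
        raise ValueError("rows must contain at least one value")
    missing = sum(value == missing_marker for row in rows for value in row)
    return 100.0 * missing / total


# Hypothetical toy data: 4 instances, 3 nominal attributes, 2 classes.
toy_rows = [["x", "?", "n"], ["b", "y", "n"], ["x", "y", "?"], ["b", "?", "w"]]
toy_labels = ["e", "p", "e", "e"]

print(round(majority_class_percent(toy_labels), 1))  # 75.0
print(round(missing_value_percent(toy_rows), 1))     # 25.0
```

A classifier that always predicts the most frequent class reaches exactly the majority-class percentage on the same data, so that figure is the natural baseline against which to read each "best observed accuracy".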

Mushroom Database (agaricus-lepiota)

Description

  • original description : [ agaricus-lepiota.txt ]
  • 2 classes, 8124 instances, 22 attributes (all nominal)
  • 1.4 % missing values
  • best observed accuracy: 100.0 % (majority class: 51.8 %)

Downloads

Standardized Audiology Database (audiology)

Description

  • original description : [ audiology.txt ]
  • 24 classes, 226 instances, 70 attributes (all nominal)
  • 2.0 % missing values
  • best observed accuracy: 88.0 % (majority class: 25.2 %)

Downloads

ML94/COLT94 Badge Problem (badges)

Description

  • original description : [ badges.txt ]
  • 2 classes, 294 instances, 9 attributes (all nominal)
  • no missing values
  • best observed accuracy: 98.7 % (majority class: 71.4 %)

Downloads

Blood Transfusion Service Center (blood-transfusion)

Description

  • original description : [ blood-transfusion.txt ]
  • 2 classes, 748 instances, 4 attributes (all numeric)
  • no missing values
  • percent of instances in the majority class: 76.2 %

Downloads

Wisconsin Breast Cancer (breast-cancer)

Description

  • original description : [ breast-cancer.txt ]
  • 2 classes, 699 instances, 10 attributes (all numeric)
  • 0.2 % missing values
  • best observed accuracy: 97.0 % (majority class: 65.5 %)

Downloads

Car Evaluation Database (car)

Description

  • original description : [ car.txt ]
  • 4 classes, 1728 instances, 6 attributes (all nominal)
  • no missing values
  • best observed accuracy: 99.5 % (majority class: 70.0 %)

Downloads

Contraceptive Method Choice (cmc)

Description

  • original description : [ cmc.txt ]
  • 3 classes, 1473 instances, 9 attributes (2 numeric and 7 nominal)
  • no missing values
  • best observed accuracy: 55.2 % (majority class: 42.7 %)

Downloads

Credit Approval (crx)

Description

  • original description : [ crx.txt ]
  • 2 classes, 690 instances, 15 attributes (6 numeric and 9 nominal)
  • 0.6 % missing values
  • best observed accuracy: 86.6 % (majority class: 55.5 %)

Downloads

Dermatology Database (dermatology)

Description

  • original description : [ dermatology.txt ]
  • 6 classes, 366 instances, 34 attributes (1 numeric and 33 nominal)
  • 0.1 % missing values
  • best observed accuracy: 96.9 % (majority class: 30.6 %)

Downloads

Protein Localization Sites (ecoli)

Description

  • original description : [ ecoli.txt ]
  • 8 classes, 336 instances, 8 attributes (all numeric)
  • no missing values
  • best observed accuracy: 85.4 % (majority class: 42.6 %)

Downloads

Glass Identification (glass)

Description

  • original description : [ glass.txt ]
  • 6 classes, 214 instances, 10 attributes (all numeric)
  • no missing values
  • best observed accuracy: 95.5 % (majority class: 35.5 %)

Downloads

Hepatitis Domain (hepatitis)

Description

  • original description : [ hepatitis.txt ]
  • 2 classes, 155 instances, 19 attributes (6 numeric and 13 nominal)
  • 5.7 % missing values
  • best observed accuracy: 85.2 % (majority class: 79.4 %)

Downloads

Horse Colic Database (horse-colic)

Description

  • original description : [ horse-colic.txt ]
  • 2 classes, 368 instances, 23 attributes (7 numeric and 16 nominal)
  • 22.8 % missing values
  • best observed accuracy: 86.4 % (majority class: 63.0 %)

Downloads

1984 United States Congressional Voting Records Database (house-votes-84)

Description

  • original description : [ house-votes-84.txt ]
  • 2 classes, 435 instances, 16 attributes (all nominal)
  • 5.6 % missing values
  • best observed accuracy: 96.8 % (majority class: 61.4 %)

Downloads

Ionosphere (ionosphere)

Description

  • original description : [ ionosphere.txt ]
  • 2 classes, 351 instances, 34 attributes (all numeric)
  • no missing values
  • best observed accuracy: 93.8 % (majority class: 64.1 %)

Downloads

Iris Plant (iris)

Description

  • original description : [ iris.txt ]
  • 3 classes, 150 instances, 4 attributes (all numeric)
  • no missing values
  • best observed accuracy: 96.7 % (majority class: 33.3 %)

Downloads

MAGIC gamma telescope data 2004 (magic04)

Description

  • original description : [ magic04.txt ]
  • 2 classes, 19020 instances, 10 attributes (all numeric)
  • no missing values
  • percent of instances in the majority class: 64.8 %

Downloads

Ozone Level Detection (ozone)

Description

  • original description : [ ozone.txt ]
  • 2 classes, 2536 instances, 73 attributes (all numeric)
  • 8.1 % missing values
  • percent of instances in the majority class: 97.1 %

Downloads

Parkinsons Data Set (parkinsons)

Description

  • original description : [ parkinsons.txt ]
  • 2 classes, 195 instances, 23 attributes (all numeric)
  • no missing values
  • percent of instances in the majority class: 75.4 %

Downloads

Pima Indians Diabetes (pima)

Description

  • original description : [ pima.txt ]
  • 2 classes, 768 instances, 8 attributes (all numeric)
  • no missing values
  • best observed accuracy: 75.4 % (majority class: 65.1 %)

Downloads

Promoter Gene Sequences Database (promoters)

Description

  • original description : [ promoters.txt ]
  • 2 classes, 106 instances, 57 attributes (all nominal)
  • no missing values
  • best observed accuracy: 96.2 % (majority class: 50.0 %)

Downloads

Sonar: Mines vs. Rocks (sonar)

Description

  • original description : [ sonar.txt ]
  • 2 classes, 208 instances, 60 attributes (all numeric)
  • no missing values
  • best observed accuracy: 85.5 % (majority class: 53.4 %)

Downloads

Spambase Data Set (spambase)

Description

  • original description : [ spambase.txt ]
  • 2 classes, 4601 instances, 57 attributes (all numeric)
  • no missing values
  • percent of instances in the majority class: 60.6 %

Downloads

Tic-Tac-Toe Endgame (tic-tac-toe)

Description

  • original description : [ tic-tac-toe.txt ]
  • 2 classes, 958 instances, 9 attributes (all nominal)
  • no missing values
  • best observed accuracy: 100.0 % (majority class: 65.3 %)

Downloads

Vowel Recognition (vowel)

Description

  • original description : [ vowel.txt ]
  • 11 classes, 990 instances, 10 attributes (all numeric)
  • no missing values
  • best observed accuracy: 93.7 % (majority class: 9.1 %)

Downloads

Wine Recognition (wine)

Description

  • original description : [ wine.txt ]
  • 3 classes, 178 instances, 13 attributes (all numeric)
  • no missing values
  • best observed accuracy: 97.7 % (majority class: 39.9 %)

Downloads

Zoo Database (zoo)

Description

  • original description : [ zoo.txt ]
  • 7 classes, 101 instances, 17 attributes (all nominal)
  • no missing values
  • best observed accuracy: 97.3 % (majority class: 40.6 %)

Downloads
