Browse datasets
Public datasets
22
Title | Description | Cues | Cases | | |
---|---|---|---|---|---|
1_data_CCU_89 | The reconstructed dataset of Green and Mehr (1997). It contains 89 cases and 4 attributes: a binary criterion (0 means “no infarction”, 1 means “infarction”) and three binary cues: “ST” refers to an altered segment in the electrocardiogram profile, “CP” is chest pain, and “OC” is a compound cue that notes the presence or absence of one or more other cues. | 4 | 89 | Details | Get |
2_data_CMC_1473 | The data used in this exercise was obtained from the UCI Machine Learning Repository (http://archive.ics.uci.edu/ml/) and is a subset of the 1987 National Indonesia Contraceptive Prevalence Survey. The N = 1473 cases are samples of married women who were either not pregnant or did not know if they were pregnant at the time of interview. | 9 | 1473 | Details | Get |
3_data_CMC_800 | The dataset contains the data of two villages, each with 400 cases. The cases are randomly sampled from the full CMC dataset 02_data_CMC_1473. | 9 | 800 | Details | Get |
Attractivity of men (ATM) | predict average (inter-subject) attractiveness ratings based on the subjects' average likeability ratings of each person, the percent of subjects who recognized each name (subjects saw only the name, no photos), and whether the person was American. (Based on data from a study by Henss, 1996, using 115 male and 131 female Germans, aged 17-66 years.) | 6 | 32 | Details | Get |
Attractivity of women (ATW) | predict average (inter-subject) attractiveness ratings based on the subjects' average likeability ratings of each person, the percent of subjects who recognized each name (subjects saw only the name, no photos), and whether the person was American. (Based on data from a study by Henss, 1996, using 115 male and 131 female Germans, aged 17-66 years.) | 6 | 30 | Details | Get |
Biodiversity (GLP) | Predict the number of species on a Galapagos island, given its area, elevation, distance to the nearest island, area of the nearest island, distance from the coast, etc. (Based on Johnson and Raven, 1973, reported in Weisberg, 1985.) | 9 | 26 | Details | Get |
Body fat (FAT) | predict percentage of body fat determined by underwater weighing (a more accurate measure of body fat) using various body circumference measurements (which are more convenient measures than underwater weighing) for 252 men. (Data supplied by A. Garth Fisher from the study of Penrose, Nelson, and Fisher, 1985; reported in StatLib.) | 17 | 218 | Details | Get |
Car accidents (CAR) | predict accident rate per million vehicle miles for a given segment of highway, using the segment's length, average traffic count, percent of truck volume, speed limit, number of lanes, lane width, shoulder width, number of intersections, etc. for Minnesota in 1973. (Based on an unpublished master's paper in civil engineering by Carl Hoffstedt, reported in Weisberg, 1985.) | 16 | 37 | Details | Get |
City population (CIT) | predict population of the biggest German cities based on whether each city has a soccer team, university, intercity train line, etc. (From Fischer Welt Almanach.) | 12 | 83 | Details | Get |
Cow manure (O2U) | predict amount of oxygen absorbed by dairy wastes (yum!) given the biological oxygen demand, chemical oxygen demand, total Kjeldahl nitrogen, total solids, and total volatile solids. (No, I don't understand the variables or the point of the study, either.) (Reported in Weisberg or Rice; I must still find it.) | 9 | 14 | Details | Get |
Fuel consumption (FUL) | predict average motor fuel consumption per person for each of the 48 contiguous United States using population of the state, number of licensed drivers, fuel tax, per capita income, miles of primary highways, etc. (Based on data collected by Christopher Bingham for the American Almanac for 1974, except fuel consumption, which was given in the 1974 World Almanac, reported in Weisberg, 1985.) | 9 | 48 | Details | Get |
Highschool drop-out rates (SHL) | predict drop-out rate of a Chicago public high school given % low-income students, % non-White students, average SAT scores, etc. (Based on Morton, 1995, and Rodkin, 1995.) | 21 | 57 | Details | Get |
Homelessness (HMF) | predict rate of homelessness in U.S. cities given average temperature, unemployment rate, percent of inhabitants below the poverty line, vacancy rate, whether the city has rent control, and percent public housing. (From Tucker, 1987.) | 9 | 50 | Details | Get |
House price (HSE) | predict selling price of a house in Erie, PA, based on current property taxes, number of bathrooms, number of bedrooms, lot size, total living space, garage space, age of house, etc. (Based on Narula and Wellington, 1977, reported in Weisberg, 1985.) | 13 | 22 | Details | Get |
Land rent (LND) | predict rent per acre paid in different counties in Minnesota (in 1977 for agricultural land planted to alfalfa) based on average rent for all tillable land, density of dairy cows, proportion of pasture land, and whether liming is required to grow alfalfa. (Alfalfa is often fed to dairy cows.) | 7 | 58 | Details | Get |
Mammals sleep (SLP) | predict the average amount of sleep for a given species of mammal based on brain and body weight, life span, gestation time, and predation and danger indices. (From Allison and Cicchetti, 1976; reported in StatLib.) | 12 | 35 | Details | Get |
Mortality (POL) | predict mortality rate of U.S. cities given average January temperature, HC pollution, % non-White, etc. (Based on McDonald and Schwing, 1973; reported in StatLib.) | 18 | 20 | Details | Get |
Obesity at age 18 (KID) | predict somatotype (fatness) at age 18 based on body measurements from age 2 to age 18. The body measurements included height, weight, leg circumference and strength. (Based on the longitudinal monitoring of the Berkeley Guidance Study, Tuddenham and Snyder, 1954; reported in Weisberg, 1985.) | 13 | 46 | Details | Get |
Oxidants in Los Angeles (OXD) | predict amount of oxidant in Los Angeles given windspeed, temperature, humidity, and insolation (a measure of the amount of sunlight). (Data provided by the Los Angeles Pollution Control District, reported in Rice, 1995.) | 7 | 17 | Details | Get |
Ozone in San Francisco (SOZ) | predict amount of ozone in San Francisco based on the year, average winter precipitation for the last two years, and ozone level in San Jose, at the southern end of the Bay. (From Sandberg, Basso, and Okin, 1978, reported in Weisberg, 1985.) | 6 | 11 | Details | Get |
Professor's salaries (PRF) | predict professor's salaries at a Midwestern college given rank, number of years in current rank, highest degree earned, and number of years since highest degree earned. (Reported in Rice, 1995.) | 8 | 51 | Details | Get |
Rainfall from cloud seeding (CLD) | predict amount of rainfall on a given day in Coral Gables, FL, given the types of clouds, the percent of cloud cover, whether the clouds were seeded, days since the first day of the experiment, etc. (From Woodley et al., 1977; reported in Weisberg, 1985.) | 9 | 24 | Details | Get |