ISIC 2017: Skin Lesion Analysis Towards Melanoma Detection
Part 3: Lesion Classification

Phase 1: Details and Training Data

Overview

Goal

In this task, participants are asked to complete two independent binary image classification tasks that involve three unique diagnoses of skin lesions (melanoma, nevus, and seborrheic keratosis). In the first binary classification task, participants are asked to distinguish between (a) melanoma and (b) nevus and seborrheic keratosis. In the second binary classification task, participants are asked to distinguish between (a) seborrheic keratosis and (b) nevus and melanoma.

Definitions:

Melanoma – malignant skin tumor, derived from melanocytes (melanocytic)
Nevus – benign skin tumor, derived from melanocytes (melanocytic)
Seborrheic keratosis – benign skin tumor, derived from keratinocytes (non-melanocytic)
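
As a minimal sketch of how the three diagnoses map onto the two binary tasks described in the Goal, the table below pairs each diagnosis with its (task 1, task 2) labels. The names DIAGNOSIS_TO_LABELS and binary_labels are hypothetical and chosen only for illustration.

```python
# Hypothetical helper: maps a diagnosis to its labels for the two binary tasks.
# Task 1: melanoma (1) vs. nevus and seborrheic keratosis (0).
# Task 2: seborrheic keratosis (1) vs. nevus and melanoma (0).
DIAGNOSIS_TO_LABELS = {
    "melanoma": (1, 0),
    "nevus": (0, 0),
    "seborrheic keratosis": (0, 1),
}


def binary_labels(diagnosis):
    """Return the (task 1, task 2) labels for a diagnosis name."""
    return DIAGNOSIS_TO_LABELS[diagnosis.strip().lower()]
```

Note that no diagnosis is positive in both tasks, so the label pair (1, 1) never occurs.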

Data

Lesion classification data includes the original image, paired with a gold standard (definitive) diagnosis, referred to as "Ground Truth".

Training Image Data

2000 images are provided as training data: 374 "melanoma", 254 "seborrheic keratosis", and the remaining 1372 benign nevi. The training data is provided as a ZIP file containing dermoscopic lesion images in JPEG format and a CSV file with some clinical metadata for each image.

All images are named using the scheme ISIC_<image_id>.jpg, where <image_id> is a 7-digit unique identifier. EXIF tags in the images have been removed; any remaining EXIF tags should not be relied upon to provide accurate metadata.
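
The naming scheme can be checked with a short regular expression. This is a sketch only; the function name image_id_from_filename is hypothetical, and the pattern assumes the lowercase ".jpg" extension exactly as written above.

```python
import re

# ISIC_<image_id>.jpg, where <image_id> is exactly 7 digits.
IMAGE_NAME = re.compile(r"ISIC_(\d{7})\.jpg")


def image_id_from_filename(filename):
    """Return the 7-digit identifier, or None if the name does not match."""
    match = IMAGE_NAME.fullmatch(filename)
    return match.group(1) if match else None
```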

The CSV file contains three columns:

image_id, identifying the image that the row corresponds to
age_approximate, containing the patient's approximate age, rounded to 5-year intervals, or "unknown"
sex, containing the patient's sex, or "unknown"
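
A minimal sketch of reading this metadata, treating "unknown" as a missing value. It assumes the file has a header row with exactly the three column names listed above and that known ages are written as plain numbers; neither is confirmed here. The function name parse_metadata and the sample rows are made up for illustration.

```python
import csv
import io


def parse_metadata(csv_text):
    """Parse metadata CSV text into {image_id: {"age_approximate", "sex"}}."""
    records = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        age = row["age_approximate"]
        sex = row["sex"]
        records[row["image_id"]] = {
            # "unknown" becomes None; known ages are assumed to be numeric.
            "age_approximate": None if age == "unknown" else int(float(age)),
            "sex": None if sex == "unknown" else sex,
        }
    return records


# Illustrative rows only, not taken from the real file.
sample = (
    "image_id,age_approximate,sex\n"
    "ISIC_0000000,55,female\n"
    "ISIC_0000001,unknown,unknown\n"
)
metadata = parse_metadata(sample)
```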

Ground Truth Data

The Training Ground Truth file is a single CSV (comma-separated values) file, containing 3 columns:

The first column of each row contains a string of the form ISIC_<image_id>, where <image_id> matches the corresponding Training Data image.
The second column of each row pertains to the first binary classification task (melanoma vs. nevus and seborrheic keratosis) and contains the value 0 or 1.
- The number 1 = lesion is melanoma
- The number 0 = lesion is nevus or seborrheic keratosis
The third column of each row pertains to the second classification task (seborrheic keratosis vs. melanoma and nevus) and contains the value 0 or 1.
- The number 1 = lesion is seborrheic keratosis
- The number 0 = lesion is melanoma or nevus
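
The three-column layout above can be read back into a single diagnosis per image, as in the following sketch. It reads columns by position, skips any row whose first cell does not start with "ISIC_" (an assumption that covers a possible header row), and accepts labels written as either 0/1 or 0.0/1.0. The function name parse_ground_truth and the sample rows are hypothetical.

```python
import csv
import io


def parse_ground_truth(csv_text):
    """Return {image_id: diagnosis} from ground-truth CSV text."""
    diagnoses = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if not row or not row[0].startswith("ISIC_"):
            continue  # blank line or assumed header row
        is_melanoma = int(float(row[1]))  # column 2: task 1 label
        is_keratosis = int(float(row[2]))  # column 3: task 2 label
        if is_melanoma and is_keratosis:
            raise ValueError(f"{row[0]}: both labels are 1")
        if is_melanoma:
            diagnoses[row[0]] = "melanoma"
        elif is_keratosis:
            diagnoses[row[0]] = "seborrheic keratosis"
        else:
            diagnoses[row[0]] = "nevus"
    return diagnoses


# Illustrative rows only, not taken from the real file.
sample_truth = (
    "ISIC_0000000,0,0\n"
    "ISIC_0000001,1,0\n"
    "ISIC_0000002,0,1\n"
)
truth = parse_ground_truth(sample_truth)
```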

Malignancy diagnosis data were obtained from expert consensus and pathology report information. Participants are not strictly required to limit development to the training data, and are free to train their algorithms using external data sources. However, any other data sources used in system development must be properly cited in the abstract.

Submission Instructions

This year, there are two phases for result submission:

An optional Validation Phase, with 150 images. Submissions to the Validation Phase are immediately evaluated and made public, allowing participants to test their submission systems and get some feedback on the performance of their submitted algorithm.
An official Test Phase, with 600 images. Submissions to the Test Phase are made against a blind held-out dataset and are immediately evaluated, but not made public until after the final submission date, as they constitute the final evaluation of participants' algorithms.

Participants may make unlimited and independent submissions to each phase, but only the most recent submission to the Test Phase will be used for official judging.

Evaluation

Participants will be ranked according to each category individually, as well as the average performance across both categories (giving rise to the possibility of 3 distinct "winners"). Ranks and awards will be assigned based only on area under the receiver operating characteristic curve (AUC). However, submissions will also be evaluated using a variety of common binary classification metrics, reported for scientific completeness, including:

sensitivity at 0.5 confidence threshold
specificity at 0.5 confidence threshold
accuracy at 0.5 confidence threshold
average precision evaluated at sensitivity of 100%
specificity evaluated at a sensitivity of 82%
specificity evaluated at a sensitivity of 89%
specificity evaluated at a sensitivity of 95%
area under the receiver operating characteristic curve (AUC)
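
Several of the metrics above can be sketched in plain Python for one binary task. This is not the official evaluation code: treating a score greater than or equal to the threshold as positive, computing AUC by pairwise ranking with ties counted as one half, and picking the highest score threshold that reaches the target sensitivity are all assumptions made for illustration. All function names are hypothetical, and each task is assumed to contain at least one positive and one negative example.

```python
def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity and specificity, predicting positive when score >= threshold."""
    tp = fp = tn = fn = 0
    for label, score in zip(labels, scores):
        predicted = score >= threshold
        if label == 1:
            tp += predicted
            fn += not predicted
        else:
            fp += predicted
            tn += not predicted
    return tp / (tp + fn), tn / (tn + fp)


def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples whose thresholded prediction matches the label."""
    correct = sum((s >= threshold) == (y == 1) for y, s in zip(labels, scores))
    return correct / len(labels)


def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def specificity_at_sensitivity(labels, scores, target):
    """Specificity at the highest threshold whose sensitivity reaches target."""
    for threshold in sorted(set(scores), reverse=True):
        sens, spec = sensitivity_specificity(labels, scores, threshold)
        if sens >= target:
            return spec
    return 0.0


# Toy labels and confidences, invented for illustration.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
sens, spec = sensitivity_specificity(labels, scores)
```

For the averaged ranking across both categories, the same auc function would be applied to each task's labels and scores and the two values averaged.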

Some useful resources for metrics computation include: