Instrumentation
The portable imaging system used in this study consisted of a modified commercial headlamp system; details of the device have been described previously [13]. Briefly, the multi-modal imaging system uses light-emitting diodes (LEDs) to illuminate the oral mucosa. For fluorescence imaging, the system has a blue LED with an excitation peak at 455 nm; for reflectance imaging, it has a white LED with an illumination range of 400 to 700 nm. Images can either be observed visually or captured digitally through a set of optical filters using an integrated, miniature charge-coupled device (CCD) camera. The system is connected to a laptop via a FireWire interface to record and store the images. The portable system weighs only 3 pounds and can be powered by a lithium-ion battery.
Protocol and Image Acquisition
The study was conducted at Tata Memorial Hospital (TMH) in Mumbai, India. Patients who were referred to the Cancer Prevention Clinic at TMH because of suspicious oral lesions, or who were awaiting head and neck surgery in the hospital ward, were recruited to participate in the study. In addition, healthy volunteers with and without a history of tobacco use were recruited. The clinical study was reviewed and approved by the Hospital Ethics Committee (HEC) at TMH, and the image analysis study was reviewed and approved by the Institutional Review Board at Rice University. Written informed consent was obtained from each subject enrolled in the study.
In vivo imaging measurements from subjects were obtained in the Cancer Prevention Clinic. All measurements were taken in a darkened room to avoid room light interference. The imaging system was positioned approximately 20 cm away from the subjects. Reflectance image exposure was a few milliseconds while fluorescence image exposure was approximately 500 milliseconds.
A head and neck specialist at the clinic assessed each participating patient by conducting a conventional examination of the oral cavity. The initial clinical impression of each site as normal or abnormal was noted. In addition, the presence of either melanosis (darkly pigmented lesions) or oral submucous fibrosis (rigid, fibrotic white lesions) was noted if visible. After clinical examination, digital reflectance and fluorescence images were obtained from clinically abnormal sites and from contralateral clinically normal sites. Images were also obtained from the lateral border of the tongue, the buccal mucosa, and the lip of each subject whenever these sites were accessible. A quality control check was performed on all images before further analysis. Sites with poor image quality (e.g., out-of-focus images) were excluded from analysis.
For sites with an initial clinical impression of abnormal, the white light reflectance images were reviewed by three expert observers (NI, AG, PC) who were blinded to the fluorescence images. For each site, each observer assigned a single clinical impression of 'Normal', 'Low Risk for Neoplasia', 'High Risk for Neoplasia', or 'Cancer'. The consensus clinical impression was used to determine the final diagnostic category for each site imaged. In cases where the impression of one expert observer differed from that of the other two, the clinical impression assigned by two of the three observers was used as the consensus. Sites for which all three observers assigned different clinical impressions were excluded from the analysis; a total of four sites were excluded for this reason. Sites with an initial clinical impression of normal were assigned a diagnosis of 'Normal'. The performance of algorithms based on features from digital optical image analysis is reported relative to this consensus clinical impression.
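The two-of-three consensus rule described above can be sketched as follows. This is a minimal illustration only; the function name `consensus_impression` and the use of `None` to flag an excluded site are assumptions made for this example and are not taken from the study software.

```python
from collections import Counter

# The four clinical impressions an observer could assign to a site.
CATEGORIES = (
    "Normal",
    "Low Risk for Neoplasia",
    "High Risk for Neoplasia",
    "Cancer",
)


def consensus_impression(impressions):
    """Return the impression shared by at least two of three observers.

    Returns None when all three observers disagree, which corresponds
    to a site that would be excluded from the analysis.
    (Hypothetical helper, for illustration only.)
    """
    if len(impressions) != 3:
        raise ValueError("expected exactly three observer impressions")
    for label in impressions:
        if label not in CATEGORIES:
            raise ValueError(f"unknown clinical impression: {label!r}")
    label, count = Counter(impressions).most_common(1)[0]
    return label if count >= 2 else None
```

A site rated 'Cancer' by two observers and 'High Risk for Neoplasia' by the third would be labeled 'Cancer' under this rule, while a site with three different ratings would return `None`.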
Image Analysis
White light reflectance images of each site were first examined; if the area was clinically abnormal, a region of interest (ROI) corresponding to the lesion was defined. If the area was clinically normal, a representative ROI was selected from the white light reflectance image. The same ROI was identified in each color fluorescence image of that site, and quantitative image features were calculated for each ROI.
Reflectance and fluorescence images were analyzed to yield possible features for use in classification algorithms. The following metrics were generated for ROIs corresponding to lesions and contralateral normal measurements: the average intensity in the red, green and blue (RGB) channels, average values of the ratios of the R/G, R/B and B/G intensities, the average intensity following grayscale conversion, and the standard deviation of the RGB and grayscale intensity values. In addition, for each ROI corresponding to a clinically abnormal site, we calculated the ratio of the metric for the lesion relative to that measured from the contralateral normal ROI in the same patient. We refer to these metrics as 'normalized ratios'. For measurements from clinically normal sites, normalized ratios were obtained by dividing each ROI in two and calculating the ratio of the metrics from the two resulting regions.
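As an illustration of the metrics listed above, the sketch below computes them for an ROI given as a list of (R, G, B) pixel tuples. Several details are assumptions, since the text does not specify them: the grayscale weights (the common 0.299/0.587/0.114 luma weights), averaging per-pixel ratios rather than taking the ratio of channel averages, the use of the population standard deviation, and splitting a normal ROI by halving its pixel list. All function names are hypothetical.

```python
from statistics import mean, pstdev

# Assumed grayscale weights; the conversion used in the study is not stated.
GRAY_WEIGHTS = (0.299, 0.587, 0.114)


def _mean_ratio(numerator, denominator):
    """Average of per-pixel ratios, skipping pixels with a zero denominator."""
    ratios = [n / d for n, d in zip(numerator, denominator) if d > 0]
    return mean(ratios) if ratios else float("nan")


def roi_features(pixels):
    """Compute intensity metrics for an ROI given as (R, G, B) tuples."""
    r = [p[0] for p in pixels]
    g = [p[1] for p in pixels]
    b = [p[2] for p in pixels]
    wr, wg, wb = GRAY_WEIGHTS
    gray = [wr * p[0] + wg * p[1] + wb * p[2] for p in pixels]
    return {
        "mean_r": mean(r), "mean_g": mean(g), "mean_b": mean(b),
        "mean_rg": _mean_ratio(r, g),
        "mean_rb": _mean_ratio(r, b),
        "mean_bg": _mean_ratio(b, g),
        "mean_gray": mean(gray),
        # Population standard deviation is an assumption.
        "std_r": pstdev(r), "std_g": pstdev(g), "std_b": pstdev(b),
        "std_gray": pstdev(gray),
    }


def normalized_ratios(site, reference):
    """Ratio of each metric at a site to the same metric at a reference ROI.

    Metrics whose reference value is zero are omitted.
    """
    return {k: site[k] / reference[k] for k in site if reference[k] != 0}


def normal_site_ratios(pixels):
    """Normalized ratios for a normal site: split the ROI in two and compare.

    Halving the pixel list is an assumed way of dividing the ROI.
    """
    half = len(pixels) // 2
    return normalized_ratios(roi_features(pixels[:half]),
                             roi_features(pixels[half:]))
```

For a clinically abnormal site, `normalized_ratios(roi_features(lesion), roi_features(contralateral))` would give the lesion-to-normal ratios for that patient.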
We explored which of these features provided the best separation between non-neoplastic and neoplastic oral mucosa. For calculation of sensitivity and specificity, sites with a diagnosis of 'Cancer' or 'High Risk' were considered to be neoplastic, while sites with a clinical diagnosis of 'Normal' or 'Low Risk' were considered to be non-neoplastic. Prior to feature selection, sites with a clinical descriptor of melanosis were excluded from the data set. Binary classification algorithms were developed using linear discriminant analysis with a single image feature as input. The same clinical dataset was used both to develop the algorithm and to assess classification accuracy. For each image feature, diagnostic performance was assessed as the classification threshold was varied from the minimum to the maximum value of that feature to generate a receiver operating characteristic (ROC) curve. Classification performance measures, including the area under the ROC curve (AUC) and the sensitivity and specificity at the Q-point, were calculated for each input metric using the consensus clinical impression as the gold standard.
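The threshold sweep for a single feature can be sketched as below. This is an illustrative approximation rather than the study's implementation: it thresholds the feature directly instead of fitting a linear discriminant, it assumes that higher feature values indicate neoplasia (negate the feature otherwise), and it assumes that the Q-point is the ROC point closest to the ideal corner of zero false positive rate and full sensitivity. The function names are hypothetical.

```python
import math


def roc_points(values, labels):
    """Sweep a threshold over one feature; return (fpr, tpr, threshold) tuples.

    labels are True for neoplastic sites. A site is called neoplastic
    when its feature value is >= the threshold (assumed direction).
    """
    n_pos = sum(1 for lab in labels if lab)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    points = [(0.0, 0.0, None)]  # threshold above every value
    for t in sorted(set(values), reverse=True):
        tp = sum(1 for v, lab in zip(values, labels) if lab and v >= t)
        fp = sum(1 for v, lab in zip(values, labels) if not lab and v >= t)
        points.append((fp / n_neg, tp / n_pos, t))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0, _), (x1, y1, _) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def q_point(points):
    """ROC point closest to (fpr=0, tpr=1); assumed Q-point definition.

    Returns (sensitivity, specificity, threshold).
    """
    fpr, tpr, t = min(points, key=lambda p: math.hypot(p[0], 1.0 - p[1]))
    return tpr, 1.0 - fpr, t
```

Because the same data are used to choose the threshold and to evaluate it, as stated above, the resulting sensitivity and specificity describe performance on the development set only.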