Medical Imaging

Narrow Band Active Contour Attention Model

Abnormal Pap Cervix Detection and Classification 

Brain Tumor Segmentation 

The proposed method incorporates variational level sets (VLS) into deep learning through a novel end-to-end trainable model called Deep Recurrent Level Set (DRLS). DRLS consists of three types of layers: convolutional layers, deconvolutional layers with skip connections, and level-set layers. Brain tumor segmentation is taken as an instance to illustrate the performance of the proposed DRLS. The convolutional layers learn visual representations of the brain tumor at different scales. Since brain tumors occupy only a small portion of the image, the deconvolutional layers are designed with skip connections to obtain a high-quality feature map. The level-set layer drives the contour towards the brain tumor. In each step, the convolutional layers are fed with the level-set map to obtain a brain tumor feature map, which in turn serves as input for the level-set layer in the next step.
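The recurrence alternates feature extraction with a level-set update. As a point of reference, the classic variational update that DRLS makes learnable can be sketched as below; this is a minimal Chan-Vese-style region-fitting step (curvature/length term omitted for brevity), with the learned feature map replaced by the raw image. Function names and constants are illustrative, not the paper's.

```python
import numpy as np

def heaviside(phi, eps=1.0):
    # Smoothed Heaviside used to softly select inside/outside regions.
    return 0.5 * (1 + (2 / np.pi) * np.arctan(phi / eps))

def level_set_step(phi, image, dt=0.5, lam=1.0):
    """One classic region-fitting update; DRLS replaces `image` with
    learned convolutional feature maps at each recurrent step."""
    h = heaviside(phi)
    c_in = (image * h).sum() / (h.sum() + 1e-8)               # mean inside the curve
    c_out = (image * (1 - h)).sum() / ((1 - h).sum() + 1e-8)  # mean outside
    # Region fitting force drives phi toward the object boundary.
    force = lam * ((image - c_out) ** 2 - (image - c_in) ** 2)
    return phi + dt * force

# Toy example: a bright square (the "tumor") on a dark background.
img = np.zeros((32, 32)); img[8:24, 8:24] = 1.0
phi = np.full((32, 32), -1.0); phi[12:20, 12:20] = 1.0  # small initial contour
for _ in range(50):                                     # contour evolution steps
    phi = level_set_step(phi, img)
seg = phi > 0
```

In DRLS the per-step force is not hand-crafted as above but produced by the network, and the level-set map is fed back into the convolutional layers at every step.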

Deep learning-based approaches have achieved impressive performance in semantic segmentation, but they are limited by pixel-wise settings with imbalanced-class data and weak object boundaries. In this work, we tackle these limitations by developing a Narrow Band Active Contour (NB-AC) attention model, which treats the object contour as a hyperplane and all data inside a narrow band as support information that influences the position and orientation of the hyperplane. Our proposed NB-AC attention model incorporates the contour length with a region energy involving a fixed-width band around the curve or surface. The proposed network loss contains two fitting terms: (i) a high-level (region) feature fitting term from the first branch; and (ii) a lower-level (contour) feature fitting term from the second branch, comprising (ii-1) the length of the object contour and (ii-2) a regional energy functional formed by the homogeneity criterion of both the inner and outer bands neighboring the evolving curve or surface. The proposed NB-AC loss can be incorporated into both 2D and 3D deep network architectures. Our 3D network, built upon the proposed NB-AC loss and the 3DUnet framework, achieves state-of-the-art results on multiple volumetric datasets.
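The band-based energy can be sketched as follows. This is our own minimal reading of the description above, assuming the contour is carried by a signed distance map `phi` (negative inside) and the network outputs a soft foreground probability `pred`; the band width, variable names, and the gradient-based length approximation are illustrative assumptions, not the published loss.

```python
import numpy as np

def nb_ac_loss(pred, phi, width=3.0):
    """Sketch of a narrow-band active-contour energy: homogeneity of the
    inner and outer bands around the curve, plus an approximate contour
    length. `phi` is a signed distance map (negative inside)."""
    inner = (phi < 0) & (phi > -width)   # band just inside the curve
    outer = (phi >= 0) & (phi < width)   # band just outside the curve
    c_in = pred[inner].mean() if inner.any() else 0.0
    c_out = pred[outer].mean() if outer.any() else 0.0
    # Regional energy: homogeneity criterion of both bands.
    region = ((pred[inner] - c_in) ** 2).sum() + ((pred[outer] - c_out) ** 2).sum()
    # Contour length approximated by the total gradient magnitude of the
    # inside indicator (nonzero only at the zero level set).
    gy, gx = np.gradient((phi < 0).astype(float))
    length = np.hypot(gx, gy).sum()
    return region + length

# A square contour (Chebyshev signed distance) and two candidate predictions.
yy, xx = np.meshgrid(np.arange(16), np.arange(16), indexing="ij")
phi = np.maximum(np.abs(yy - 7.5), np.abs(xx - 7.5)) - 4.0
pred_perfect = (phi < 0).astype(float)
pred_noisy = pred_perfect.copy(); pred_noisy[5, 5] = 0.0
```

A prediction that is inhomogeneous inside the band (here, one flipped pixel) incurs a larger energy than a clean one, which is the signal the attention model exploits.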

A Pap (Papanicolaou) smear is a test that checks for signs of cancer of the cervix, the lower part of the uterus (womb). During a Pap smear, a sample of cells is taken from the cervix and examined by a doctor. The cells on the slide are checked for signs that they are changing from normal to abnormal, since cells go through a series of changes before they turn into cancer. Automating the Pap smear test has attracted great effort and is one of the critical fields of medical image processing. This work proposes a method for automatic cervical cancer detection based on cervical cell detection and classification.

Abnormal Blood Cell Detection and Segmentation 

This is an end-to-end framework for automatically detecting and segmenting blood cells, including normal red blood cells (RBCs), connected RBCs, abnormal RBCs, and white blood cells (WBCs). We first design a novel blood cell color representation which emphasizes the RBCs and WBCs in separate channels. A template matching technique is then employed to individually detect RBCs and WBCs in our proposed representation. In order to automatically segment the RBCs and the nuclei of WBCs, we develop an adaptive level set-based segmentation method which makes use of both local and global information. A detected and segmented RBC, however, can be a single RBC, a connected RBC, or an abnormal RBC. Therefore, we first separate and reconstruct RBCs from the connected RBCs using our modified template matching. Shape matching by inner distance is then used to distinguish abnormal RBCs from normal RBCs.
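The detection step relies on template matching. A minimal sketch of the standard normalized cross-correlation variant is shown below, written with a plain loop for clarity (a real pipeline would use an FFT-based implementation, and the cell template and image here are synthetic stand-ins):

```python
import numpy as np

def match_template(image, template):
    """Normalized cross-correlation of a template over an image.
    Returns a score map; the peak locates the best match's top-left corner."""
    th, tw = template.shape
    t = template - template.mean()
    out = np.zeros((image.shape[0] - th + 1, image.shape[1] - tw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            w = image[i:i + th, j:j + tw]
            wz = w - w.mean()
            denom = np.sqrt((wz ** 2).sum() * (t ** 2).sum()) + 1e-8
            out[i, j] = (wz * t).sum() / denom
    return out

# A cell-like bright disk placed at a known offset.
yy, xx = np.meshgrid(np.arange(9), np.arange(9), indexing="ij")
disk = ((yy - 4) ** 2 + (xx - 4) ** 2 <= 9).astype(float)
img = np.zeros((40, 40)); img[10:19, 22:31] = disk
score = match_template(img, disk)
peak = np.unravel_index(score.argmax(), score.shape)
```

In the framework above, matching is run per channel of the proposed color representation so that RBC and WBC templates respond in separate channels.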

Enhance Portable Radiographs

This work aims to help physicians improve their speed and diagnostic accuracy when using portable chest radiographs, which are in especially high demand in the setting of the ongoing COVID-19 pandemic. Building on recent artificial intelligence (AI) developments, we introduce new deep learning frameworks that align and enhance the appearance of portable chest radiographs to be consistent with higher-quality conventional chest radiographs. These enhanced portable chest radiographs can then help doctors provide a faster and more accurate diagnosis and treatment strategy.

This work has been undertaken in collaboration with the Department of Radiology at the University of Arkansas for Medical Sciences (UAMS) to enhance portable/mobile COVID-19 chest radiographs, improving the speed and accuracy of their interpretation and aiding urgent COVID-19 diagnosis and treatment.

A Multi-task Contextual Network for Brain Tumor Detection & Segmentation

Brain tumor segmentation faces an imbalanced data problem: the number of pixels belonging to the background class (non-tumor pixels) is much larger than the number belonging to the foreground class (tumor pixels). To address this problem, we propose a multi-task network formed as a cascaded structure whose components share their feature maps. Our model has two targets, namely, effectively localizing brain tumor regions and estimating brain tumor masks. The first task is performed by our proposed contextual detection network, which aims at reducing redundant background pixels and focusing on the region around the brain tumor. Instead of processing every pixel, our contextual detection network only processes contextual regions around ground-truth instances, a strategy that helps to produce meaningful region proposals. The second task is built upon a 3D atrous residual network within an encoder-decoder architecture in order to effectively segment both large and small objects (brain tumors). Our 3D atrous residual network is designed with skip connections that enable gradients from deep layers to be directly propagated to shallow layers, so features at different depths are preserved and used to refine each other. In order to incorporate larger contextual information in volumetric MRI data, our network uses 3D atrous convolutions with various kernel sizes, which enlarge the field of view of the filters.
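The field-of-view claim for atrous (dilated) convolution follows from a simple recurrence: a stride-1 layer with kernel size k and dilation d spans (k-1)*d + 1 input positions, and stacked layers grow the receptive field additively. A small sketch (the layer configuration is an illustrative example, not the paper's exact architecture):

```python
def atrous_receptive_field(kernel_sizes, dilations):
    """Effective receptive field of stacked stride-1 atrous convolutions:
    start at 1 and add (k - 1) * d per layer."""
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        rf += (k - 1) * d
    return rf

# Three 3x3(x3) layers: plain convolutions vs. dilation rates 1, 2, 4.
plain = atrous_receptive_field([3, 3, 3], [1, 1, 1])   # 7-voxel field of view
atrous = atrous_receptive_field([3, 3, 3], [1, 2, 4])  # 15-voxel field of view
```

Doubling the dilation per layer thus roughly doubles the context seen at the same parameter cost, which is what lets the 3D network capture larger tumors without deeper stacks.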

Unpaired Multi-Contrast MRI Image-to-Image Translation

We introduce a novel approach to unpaired image-to-image translation based on an invertible architecture. The invertible property of the flow-based architecture ensures cycle-consistency of the translation without additional loss functions. We utilize the temporal information between consecutive slices to provide more constraints on the optimization when transforming one domain to another in unpaired volumetric medical images. To capture temporal structure in the medical images, we model the displacement between consecutive slices using a deformation field. In our approach, the deformation field serves as guidance to keep the translated slices realistic and consistent across the translation.
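The deformation-field guidance amounts to warping one slice toward its neighbor by per-pixel displacements. A minimal sketch of such a warp is below, using nearest-neighbor sampling for brevity (the actual model would use differentiable bilinear sampling, and the field layout here is our own convention):

```python
import numpy as np

def warp_slice(img, flow):
    """Warp a 2D slice by a displacement field.
    flow[..., 0] and flow[..., 1] hold per-pixel row and column
    displacements; each output pixel samples the source at (y+dy, x+dx)."""
    h, w = img.shape
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_y = np.clip(np.round(yy + flow[..., 0]).astype(int), 0, h - 1)
    src_x = np.clip(np.round(xx + flow[..., 1]).astype(int), 0, w - 1)
    return img[src_y, src_x]

# A constant +2 column displacement samples 2 pixels to the right,
# shifting the bright line from column 4 to column 2.
img = np.zeros((8, 8)); img[:, 4] = 1.0
flow = np.zeros((8, 8, 2)); flow[..., 1] = 2.0
out = warp_slice(img, flow)
```

Consistency across translation is then enforced by requiring that translating a slice and warping it to the next position agrees with translating the next slice directly.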

Qualitative comparison: (a) source image, (b) target image, (c) CycleGAN, (d) RecycleGAN, (e) CycleFlow, (f) Flow2Flow, and (g) our method. Our method provides a better boundary on the tumor regions (red arrows in the fifth row) compared with the existing methods.

Scene Understanding

Contextual Residual Recurrent Network

Designed as extremely deep architectures, deep residual networks, which provide rich visual representations and offer robust convergence behavior, have recently achieved exceptional performance in numerous computer vision problems. When directly applied to a scene labeling problem, however, they are limited in capturing long-range contextual dependence, which is a critical aspect. To address this issue, we propose a novel approach, Contextual Recurrent Residual Networks (CRRN), which is able to simultaneously handle rich visual representation learning and long-range context modeling within a fully end-to-end deep network. Furthermore, our proposed end-to-end CRRN is trained completely from scratch, without using any pre-trained models, in contrast to most existing methods, which are usually fine-tuned from state-of-the-art pre-trained models, e.g., VGG-16, ResNet, etc.

Semantic Instance Segmentation

Contextual Recurrent Level Sets Network

Variational Level Set (LS) methods have been widely used in medical segmentation. However, they are limited when dealing with multi-instance objects in the real world. In addition, their segmentation results are quite sensitive to initial settings and highly depend on the number of iterations. To address these issues and lift the classic variational LS methods to the level of learnable deep learning approaches, we propose a novel definition of contour evolution named Recurrent Level Set (RLS), which employs a Gated Recurrent Unit under the energy minimization of a variational LS functional. The curve deformation process in RLS is formulated as a hidden state evolution procedure and updated by minimizing an energy functional composed of fitting forces and contour length. By sharing convolutional features in a fully end-to-end trainable framework, we extend RLS to Contextual RLS (CRLS) to address semantic segmentation in the wild. Experimental results show that our proposed RLS improves both computational time and segmentation accuracy over the classic variational LS-based method, while the fully end-to-end CRLS achieves competitive performance compared to state-of-the-art semantic segmentation approaches.
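The recurrence RLS builds on is the standard GRU update, with the hidden state playing the role of the evolving level-set function. A minimal sketch of that update is below; the weights here are random stand-ins, not trained curve-evolution parameters:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(h, x, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU update: gated interpolation between the previous hidden
    state h and a candidate state computed from the input x."""
    z = sigmoid(x @ Wz + h @ Uz)              # update gate
    r = sigmoid(x @ Wr + h @ Ur)              # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)  # candidate state
    return (1 - z) * h + z * h_tilde

# A few "evolution" steps from a zero state with fixed input features.
rng = np.random.default_rng(0)
d = 4
Ws = [rng.standard_normal((d, d)) * 0.1 for _ in range(6)]
h = np.zeros(d)
x = rng.standard_normal(d)
for _ in range(3):
    h = gru_step(h, x, *Ws)
```

In RLS the analogous gates decide, per step, how much of the previous contour state to keep versus how far to move it under the fitting-force and length terms of the energy.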

Safety Driving

Hand On Steering Wheel Detection and Classification

This paper presents an advanced Convolutional Neural Network (ConvNet) based approach, named Multiple Scale Region-based Fully Convolutional Networks (MS-RFCN), for hand detection and classification. In order to robustly deal with challenging factors, we propose to span the receptive fields of the ConvNet across multiple deep feature maps. In this way, both global and local context information can be efficiently synchronized and simultaneously contribute to the human hand feature representation.

The experiments are presented on challenging hand databases, i.e., the Vision for Intelligent Vehicles and Applications (VIVA) Challenge and the Oxford Hand Detection database. Our proposed method achieves state-of-the-art results.

A Grammar-aware Driver Parsing Approach


Twin Identification

Intrinsic and extrinsic facial asymmetry are common in humans and have been used in many biometric applications. Intrinsic facial asymmetry is caused by changes to the structure of the face resulting from aging, growth, injuries, birthmarks, splotches, and sunburn. Extrinsic facial asymmetry is caused by external factors such as viewing orientation, illumination variation, etc. The asymmetry of a face is an individualized characteristic, differing in perceptible ways even between identical twins. We describe two techniques of asymmetry decomposition used to identify twins. The first technique projects the difference between two symmetric images, obtained by reflecting the right side and the left side of a face respectively, onto an SVD subspace. The second technique uses Procrustes analysis and makes use of the angle between the two left-side images and the two right-side images.
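The first technique can be sketched as follows: mirror each half of the face into a full symmetric image, difference the two, and summarize the difference via SVD. This is a minimal illustration on a synthetic "face" array; the use of leading singular values as the signature is our simplification of projecting onto an SVD subspace.

```python
import numpy as np

def asymmetry_signature(face, k=3):
    """Build two symmetric faces by mirroring the left and right halves,
    take their difference, and keep the top-k singular values of that
    difference as a compact asymmetry signature."""
    h, w = face.shape
    left = face[:, : w // 2]
    right = face[:, w // 2:]
    left_sym = np.hstack([left, left[:, ::-1]])     # face built from the left half
    right_sym = np.hstack([right[:, ::-1], right])  # face built from the right half
    diff = left_sym - right_sym
    U, s, Vt = np.linalg.svd(diff, full_matrices=False)
    return s[:k]

# A perfectly symmetric face yields a zero signature; a one-sided
# blemish makes the signature strictly larger.
sym = np.tile(np.arange(8.0), (8, 1)); sym = sym + sym[:, ::-1]
asym = sym.copy(); asym[2, 1] += 5.0
```

Because the signature responds only to left-right differences, identical global appearance between twins does not mask their individual asymmetries.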

Document Enhancement

The proposed methods are able to deal with image degradations that can occur due to bleed-through ink, large black borders, fading ink, uneven illumination, contrast variation, smearing, and various patterned backgrounds. Given an input image, the contrast of intensity is first estimated by a grayscale morphological closing operator. A double threshold is generated by our Shannon entropy-based thresholding methods, corresponding to a 1-D histogram and a 2-D histogram, to classify pixels into text, near-text, and non-text categories. The pixels in the second group are relabeled using the local mean and standard deviation values. Our proposed methods classify noise into two classes, which are dealt with by binary morphological operators, shrink and swell filters, and a graph searching strategy.
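The 1-D Shannon entropy-based thresholding step can be sketched with the classic Kapur formulation: choose the threshold that maximizes the sum of the entropies of the two resulting classes. The histogram below is a synthetic bimodal example (dark text pixels vs. bright background); the paper's 2-D variant additionally histograms a local-mean axis.

```python
import numpy as np

def kapur_threshold(hist):
    """Entropy-based threshold on a 1-D intensity histogram: pick t that
    maximizes the summed Shannon entropies of the below- and above-t classes."""
    p = hist / hist.sum()
    best_t, best_h = 0, -np.inf
    for t in range(1, len(p)):
        p0, p1 = p[:t].sum(), p[t:].sum()
        if p0 <= 0 or p1 <= 0:
            continue  # one class empty: skip
        q0, q1 = p[:t] / p0, p[t:] / p1  # within-class distributions
        h = -(q0[q0 > 0] * np.log(q0[q0 > 0])).sum() \
            - (q1[q1 > 0] * np.log(q1[q1 > 0])).sum()
        if h > best_h:
            best_t, best_h = t, h
    return best_t

# Bimodal histogram: text mode around bin 2, background mode around bin 12.
hist = np.zeros(16)
hist[[1, 2, 3]] = [30, 60, 30]
hist[[11, 12, 13]] = [80, 160, 80]
t = kapur_threshold(hist)
```

Running the two entropy thresholds (1-D and 2-D) yields the double threshold used to split pixels into text, near-text, and non-text groups.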

Carnegie Mellon University

University of Arkansas
