On Incorporating the Stochasticity of Quantum Machine Learning into Classical Models   

Wednesday, July 6, 2022 - 03:00 pm

THESIS DEFENSE

Author : Joseph Lindsay

Advisor : Dr. Ramtin Zand

Date : July 6, 2022

Time : 3:00 pm

Place : Virtual (Teams link below)

 

Meeting Link: click here

 

Abstract

While many of the most exciting quantum computing algorithms cannot be implemented until fault-tolerant quantum error correction is achieved, noisy intermediate-scale quantum (NISQ) devices allow smaller-scale applications that leverage the paradigm's speed-ups to be researched and realized. A currently popular application for these devices is quantum machine learning (QML). Recent works indicate that QML algorithms can function just as well as their classical counterparts, and even outperform them in some cases. Many current QML models take advantage of variational quantum algorithm (VQA) circuits, given that their scale is typically small enough to be compatible with NISQ devices and that optimizing circuit parameters by automatic differentiation is familiar from machine learning. As with quantum computing in general, there is some skepticism as to whether machine learning is the "best" use case for the advantages that NISQ devices make possible. To this end, this work investigates stochastic methods inspired by QML in an attempt to approach its reported successes in performance. Using the long short-term memory (LSTM) model as a case study and analyzing the performance of classical, stochastic, and QML methods, this work aims to determine whether QML's benefits can be achieved on classical machines by incorporating aspects of its stochasticity.
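As a concrete illustration (not taken from the defense itself), one simple way to mimic the measurement noise of a variational quantum circuit classically is to perturb the LSTM gate pre-activations with Gaussian noise; the noise placement and scale `sigma` below are illustrative assumptions, sketched in NumPy.

```python
import numpy as np

def stochastic_lstm_step(x, h_prev, c_prev, W, U, b, sigma=0.05, rng=None):
    """One LSTM step with Gaussian noise on the gate pre-activations,
    loosely mimicking the shot noise of measuring a variational quantum circuit.
    W: (4H, D), U: (4H, H), b: (4H,), stacked as [input, forget, cell, output] gates.
    """
    rng = rng or np.random.default_rng()
    H = h_prev.shape[0]
    pre = W @ x + U @ h_prev + b
    pre = pre + rng.normal(0.0, sigma, size=pre.shape)  # injected stochasticity
    i = 1 / (1 + np.exp(-pre[:H]))        # input gate
    f = 1 / (1 + np.exp(-pre[H:2*H]))     # forget gate
    g = np.tanh(pre[2*H:3*H])             # candidate cell state
    o = 1 / (1 + np.exp(-pre[3*H:]))      # output gate
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    return h, c

# toy usage with random parameters
D, H = 3, 4
rng = np.random.default_rng(0)
W = rng.normal(size=(4*H, D)); U = rng.normal(size=(4*H, H)); b = np.zeros(4*H)
h, c = stochastic_lstm_step(rng.normal(size=D), np.zeros(H), np.zeros(H), W, U, b)
```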

Cross Domain Semantic Segmentation

Tuesday, June 7, 2022 - 09:00 am

DISSERTATION DEFENSE

Author : Xinyi Wu

Advisor : Dr. Song Wang

Date : June 7, 2022

Time : 9:00 am

Place : Virtual (Zoom link below)

 

Meeting Link: https://zoom.us/j/98175586174?pwd=bEs2a1g0azBmUlhCNWluT2R2bDhidz09

 

Abstract

As a long-standing computer vision task, semantic segmentation is still extensively researched because of its importance to visual understanding and analysis. The goal of semantic segmentation is to classify each pixel of an image into pre-defined classes. In the era of deep learning, convolutional neural networks have largely improved the accuracy and efficiency of semantic segmentation. However, this success comes with two limitations: 1) a large-scale labeled dataset is required for training, while the labeling process for this task is quite labor-intensive and tedious; 2) the trained deep networks can achieve promising results when tested on the same domain (i.e., intra-domain test) but may suffer a large performance drop when tested on different domains (i.e., cross-domain test). Therefore, developing algorithms that can transfer knowledge from labeled source domains to unlabeled target domains is highly desirable to address these two limitations.

In this research, we explore three settings of cross-domain semantic segmentation, conditioned on the training data available in the target domain: 1) a single unlabeled target image, 2) multiple unlabeled target images, and 3) unlabeled target videos.

In the first part, we tackle the problem of one-shot unsupervised domain adaptation (OSUDA) for semantic segmentation, where the segmentor uses only one unlabeled target image during training. In this case, traditional unsupervised domain adaptation models usually fail, since they over-fit to the one (or few) unlabeled target samples and cannot adapt to the target domain. To address this problem, existing OSUDA methods usually integrate a style-transfer module to perform domain randomization based on the unlabeled target sample, so that multiple domains around the target sample can be explored during training. However, such a style-transfer module relies on an additional set of images as style references for pre-training and also increases the memory demand of domain adaptation. Here we propose a new OSUDA method that effectively relieves this computational burden by making full use of the sole target image in two aspects: (1) implicitly stylizing the source domain at both the image and feature levels; (2) softly selecting the source training pixels. Experimental results on two commonly used synthetic-to-real scenarios demonstrate the effectiveness and efficiency of the proposed method.
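The abstract does not spell out the stylization operator; as a hedged sketch, an AdaIN-style re-normalization of source features toward the sole target image's channel statistics is one plausible form of feature-level implicit stylization (the interpolation weight `alpha` is an assumption, not the paper's setting):

```python
import numpy as np

def stylize_toward_target(src_feat, tgt_feat, alpha=0.5, eps=1e-5):
    """Re-normalize source feature maps (C, H, W) so their per-channel
    statistics move toward those of the single target image's features.
    alpha interpolates between the original and fully target-stylized stats."""
    src_mu = src_feat.mean(axis=(1, 2), keepdims=True)
    src_sigma = src_feat.std(axis=(1, 2), keepdims=True) + eps
    tgt_mu = tgt_feat.mean(axis=(1, 2), keepdims=True)
    tgt_sigma = tgt_feat.std(axis=(1, 2), keepdims=True) + eps
    mu = (1 - alpha) * src_mu + alpha * tgt_mu
    sigma = (1 - alpha) * src_sigma + alpha * tgt_sigma
    return (src_feat - src_mu) / src_sigma * sigma + mu
```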

Secondly, we work on the problem of nighttime semantic segmentation, which plays an equally important role as daytime segmentation in autonomous driving but is much more challenging and less studied due to poor illumination and arduous human annotation. Our proposed solution employs adversarial training with a labeled daytime dataset and an unlabeled dataset that contains coarsely aligned day-night image pairs. The unlabeled daytime images from the target dataset serve as an intermediate domain to mitigate the difficulty of day-to-night adaptation, since they share similarities with the source in illumination pattern and contain the same static-category objects as their nighttime counterparts. Extensive experiments on the Dark Zurich and Nighttime Driving datasets show that our method achieves state-of-the-art performance for nighttime semantic segmentation.

Finally, we propose a domain adaptation method for video semantic segmentation, i.e., where the target is in video format. Prior works achieved this goal by transferring knowledge from a source domain of self-labeled simulated videos to a target domain of unlabeled real-world videos. In our work, we argue that it is not necessary to use a labeled video dataset as the source, since the temporal continuity of video segmentation in the target domain can be estimated and enforced without reference to videos in the source domain. This motivates a new framework of Image-to-Video Domain Adaptive Semantic Segmentation (I2VDA), where the source domain is a set of images without temporal information. Under this setting, we bridge the domain gap via adversarial training based only on spatial knowledge, and develop a novel temporal augmentation strategy through which the temporal consistency in the target domain is well exploited and learned. In addition, we introduce a new training scheme that leverages a proxy network to produce pseudo-labels on the fly, which is very effective in improving the stability of adversarial training. Experimental results on two synthetic-to-real scenarios show that the proposed I2VDA method can achieve even better performance on video semantic segmentation than existing state-of-the-art video-to-video domain adaptation approaches.
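As a rough sketch of what enforcing target-domain temporal continuity could look like (the actual temporal augmentation strategy in I2VDA is not described here), one could penalize disagreement between soft predictions on consecutive target frames; the flow-based warping a real system would likely use is omitted for brevity:

```python
import torch.nn.functional as F

def temporal_consistency_loss(logits_t, logits_t1):
    """Penalize disagreement between soft segmentation predictions on
    consecutive target frames (N, C, H, W). A practical system would first
    warp frame t toward t+1 with estimated optical flow; omitted here."""
    p_t = F.softmax(logits_t, dim=1)
    p_t1 = F.softmax(logits_t1, dim=1)
    return F.kl_div(p_t1.log(), p_t, reduction="batchmean")
```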

Image Restoration Under Adverse Illumination for Various Applications 

Monday, May 30, 2022 - 09:00 am

DISSERTATION DEFENSE

Author : Lan Fu

Advisor : Dr. Song Wang

Date : May 30, 2022

Time : 9:00 am

Place : Virtual (Zoom link below)

Zoom link is : https://us05web.zoom.us/j/9860655563?pwd=Qld4ZUozUkFBSGFoa3lRZjNBN3ZVUT09 

 

Abstract

 

Many images are captured in sub-optimal environments, which introduces various kinds of degradation, such as noise, blur, and shadow. Adverse illumination is one of the most important factors behind image degradation, causing color and illumination distortion or even unidentifiable image content. Degradation caused by adverse illumination worsens an image's visual quality and can also harm high-level perception tasks, e.g., object detection.

 

Image restoration under adverse illumination is an effective way to remove such degradation and obtain visually pleasing images. Existing state-of-the-art deep neural network (DNN)-based image restoration methods have achieved impressive performance in improving image visual quality. However, different real-world applications require image restoration under adverse illumination to achieve different goals. For example, in the computational photography field, visually pleasing images are desired in smartphone photography. Meanwhile, for traffic surveillance and autonomous driving in low-light or nighttime scenarios, high-level perception tasks, e.g., object detection, become more important to ensure safe and robust driving performance. Therefore, in this dissertation, we explore DNN-based image restoration solutions for images captured under adverse illumination in various applications: 1) image visual quality enhancement, 2) object detection improvement, and 3) simultaneous enhancement of image visual quality and detection performance.

 

First, in the computational photography field, a visually pleasing image is desired. We take the shadow removal task as an example to fully explore image visual quality enhancement. Shadow removal is still a challenging task due to its inherent background-dependent and spatially variant properties, which lead to unknown and diverse shadow patterns. We propose a novel solution that formulates this task as an exposure fusion problem to address these challenges. We propose a shadow-aware FusionNet to 'smartly' fuse multiple over-exposure images with pixel-wise fusion weight maps, and a boundary-aware RefineNet to further eliminate the remaining shadow traces. Experimental results show that our method outperforms other CNN-based methods on three datasets.
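A minimal sketch of the exposure-fusion view of shadow removal: given several over-exposed candidate images and per-pixel weight maps (produced by a network such as FusionNet), the output is their pixel-wise weighted combination. The function name and array shapes below are illustrative, not the actual implementation:

```python
import numpy as np

def fuse_exposures(images, weights, eps=1e-6):
    """Fuse K over-exposed candidates with pixel-wise weight maps.
    images:  (K, H, W, 3) float array of candidate relit images
    weights: (K, H, W) non-negative fusion weights (e.g., a network output)
    Returns the per-pixel weighted combination."""
    w = weights / (weights.sum(axis=0, keepdims=True) + eps)  # normalize per pixel
    return (w[..., None] * images).sum(axis=0)

# toy usage with random data
K, H, W = 3, 8, 8
imgs = np.random.rand(K, H, W, 3)
wts = np.random.rand(K, H, W)
out = fuse_exposures(imgs, wts)  # (H, W, 3)
```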

 

Second, we explore the application of CNN-based night-to-day image translation to improve vehicle detection in the traffic surveillance field for safe and robust driving performance. We propose a detail-preserving method to implement nighttime-to-daytime image translation and thus adapt a daytime-trained detection model to nighttime vehicle detection. We first utilize the StyleMix method to acquire paired daytime and nighttime images for the subsequent nighttime-to-daytime image translation training. The translation is implemented with a kernel prediction network to avoid texture corruption. Experimental results show that the proposed method adapts daytime domain knowledge to the nighttime vehicle detection task.

 

Third, we explore improving image visual quality and facial landmark detection simultaneously. For portrait images captured in the wild, facial landmark detection can be affected by foreign shadows. We construct a novel benchmark, SHAREL, covering diverse face shadow patterns with different intensities, sizes, shapes, and locations to study the effects of shadow removal on facial landmark detection. Moreover, we propose a novel adversarial shadow attack to mine hard shadow patterns. We conduct extensive analysis on three shadow removal methods and three landmark detectors. We then design a novel landmark detection-aware shadow removal framework, which empowers shadow removal to achieve higher restoration quality and enhances the shadow robustness of deployed facial landmark detectors.

Identifying and Discovering Curve Pattern Designs from Fragments of Pottery 

Tuesday, May 24, 2022 - 09:00 am

DISSERTATION DEFENSE

Author : Jun Zhou

Advisor : Dr. Song Wang

Date : May 24, 2022

Time : 9:00 am

Place : Virtual (Teams link below)

The Teams invite link is here

Abstract

A challenging problem in modern archaeology is to identify and reconstruct full decorative curve pattern designs from fragmented heritage objects, such as the pottery sherds from southeastern North America. The difficulties of this problem lie in the facts that 1) these pottery sherds are usually fragmented, so each sherd covers only a small portion of its underlying full design; 2) these sherds can be so degraded that curves contain missing segments or become very shallow; and 3) curve patterns on sherd surfaces may overlap, resulting in composite patterns. Abstracted from this archaeological problem, two computer vision problems are studied: design identification, which identifies the underlying full design of a sherd by curve pattern matching, and sherd identification, which groups unidentified sherds for new design discovery by curve pattern clustering. For design identification, two new curve pattern matching methods are proposed: a Chamfer matching based method for composite pattern matching, and a patch-based matching method for noisy curve patterns and composite patterns using deep metric learning and region growing. For sherd identification, a new curve pattern clustering method is proposed that builds a curve pattern similarity matrix by deep feature learning and combines graph partitioning with iterative cluster refinement. An archaeological computer-aided system, called Snowvision, is developed in this research. The proposed algorithms form the core of Snowvision.
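For intuition, a one-directional Chamfer score between a sherd's curve mask and a candidate full design can be computed with a distance transform, as in the hedged sketch below; the real matcher additionally has to search over translations, rotations, and composite patterns:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def chamfer_score(sherd_curve, design_curve):
    """One-directional Chamfer distance: average distance from each curve
    pixel on the sherd to the nearest curve pixel of the candidate full
    design. Lower is a better match. Both inputs are binary (H, W) masks;
    in practice the sherd pattern is swept over the design at many poses."""
    # distance to the nearest design-curve pixel at every location
    dist_to_design = distance_transform_edt(~design_curve.astype(bool))
    pts = sherd_curve.astype(bool)
    return dist_to_design[pts].mean() if pts.any() else np.inf
```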

CNN-Based Semantic Segmentation with Shape Prior Knowledge 

Monday, May 23, 2022 - 09:00 am

DISSERTATION DEFENSE 

Author : Yuhang Lu

Advisor : Dr. Song Wang

Date : May 23, 2022

Time : 9:00 am

Place : Virtual (Zoom link below)

Meeting Link: https://zoom.us/j/7182193712

 

Abstract

Semantic segmentation, which aims to group discrete pixels into connected regions, is a fundamental step in many high-level computer vision tasks. In recent years, Convolutional Neural Networks (CNNs) have made breakthrough progress on public semantic segmentation benchmarks. The ability to learn from large-scale labeled datasets empowers them to generalize to unseen images better than traditional non-learning-based methods. Nevertheless, the heavy dependency on labeled data also limits their application in tasks where high-quality ground truth segmentation masks are scarce or difficult to acquire. In this dissertation, we study the problem of alleviating the data dependency of CNN-based segmentation, with a focus on leveraging the shape prior knowledge of objects.

Shape prior knowledge can provide rich, learning-free information about object boundaries if properly utilized. However, this is not trivial for CNN-based segmentation because of its pixel-wise classification nature. To address this problem, we propose novel methods to integrate three types of shape priors into CNN training: implicit, explicit, and class-agnostic priors. They range from specific objects with strong priors to general objects with weak priors. To demonstrate the practical value of our methods, we present each of them within a challenging real-world image segmentation task. 1) We propose a weakly supervised segmentation method to extract curve structures stamped on cultural heritage objects, which implicitly takes advantage of the prior knowledge of their thin and elongated shape to relax the training label from a pixel-wise curve mask to a single-pixel curve skeleton, and which outperforms fully supervised alternatives by at least 7.7% in F1 score. 2) We propose a one-shot segmentation method that learns to segment anatomical structures from X-ray images with only one labeled image, realized by explicitly modeling the shape and appearance prior knowledge of objects in the objective function of CNNs. It performs competitively against state-of-the-art fully supervised methods when using a single label, and can outperform them when a human-in-the-loop mechanism is incorporated. 3) Finally, we model shape priors in a universal form that is agnostic to object classes, where the knowledge is distilled from a few labeled samples through a meta-learning strategy. Given a base model pretrained on an existing large-scale dataset, our method can adapt it to unseen domains with the help of a few labeled images and masks. Experimental results show that our method significantly improves the performance of base models in a variety of cross-domain segmentation tasks.

Learning Depth from Images

Wednesday, May 18, 2022 - 09:00 am

DISSERTATION DEFENSE 

Author : Zhenyao Wu

Advisor : Dr. Song Wang

Date : May 18, 2022

Time : 9:00 am

Place : Virtual (Zoom link below)

 

Meeting Link: https://zoom.us/j/91863722659?pwd=KytCSmc3NGRRbHhPSmczM2EyUnpuQT09

 

Abstract

Estimating depth from images has become a very popular task in computer vision, aiming to restore the 3D scene from 2D images and identify important geometric knowledge of the scene. Its performance has been significantly improved by convolutional neural networks in recent years, which surpass traditional methods by a large margin. However, natural scenes are usually complicated, making it hard to build pixel correspondences across frames in regions containing moving objects, illumination changes, occlusions, and reflections. This research explores rich and comprehensive spatial correspondence across images and designs three new network architectures for depth estimation whose inputs can be a single image, a stereo pair, or a monocular video.

First, we propose a novel semantic stereo network named SSPCV-Net, which includes newly designed pyramid cost volumes for describing semantic and spatial correspondence at multiple levels. The semantic features are inferred from a semantic segmentation subnetwork, while the spatial features are constructed by hierarchical spatial pooling. Finally, we design a 3D multi-cost aggregation module to integrate the extracted multilevel correspondence and perform regression for accurate disparity maps. We conduct comprehensive experiments and comparisons with recent stereo matching networks on the Scene Flow, KITTI 2015 and 2012, and Cityscapes benchmark datasets, and the results show that the proposed SSPCV-Net significantly advances state-of-the-art stereo matching performance.
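For readers unfamiliar with cost volumes, the sketch below shows a standard concatenation-style stereo cost volume over disparities; SSPCV-Net's pyramid variant builds such volumes at several semantic and spatial feature levels, so this is illustrative rather than the exact architecture:

```python
import torch

def build_cost_volume(feat_left, feat_right, max_disp):
    """Concatenation-style cost volume for stereo matching.
    feat_left/right: (N, C, H, W) features; returns (N, 2C, max_disp, H, W).
    At disparity d, the right features are shifted by d pixels before being
    paired with the left features."""
    N, C, H, W = feat_left.shape
    volume = feat_left.new_zeros(N, 2 * C, max_disp, H, W)
    for d in range(max_disp):
        if d == 0:
            volume[:, :C, d] = feat_left
            volume[:, C:, d] = feat_right
        else:
            volume[:, :C, d, :, d:] = feat_left[:, :, :, d:]
            volume[:, C:, d, :, d:] = feat_right[:, :, :, :-d]
    return volume
```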

Second, we present a novel SC-GAN network with end-to-end adversarial training for depth estimation from monocular videos without estimating the camera pose or its change over time. To exploit cross-frame relations, SC-GAN includes a spatial correspondence module that uses Smolyak sparse grids to efficiently match features across adjacent frames, and an attention mechanism to learn the importance of features in different directions. Furthermore, the generator in SC-GAN learns to estimate depth from the input frames, while the discriminator learns to distinguish between the ground-truth and estimated depth maps for the reference frame. Experiments on the KITTI and Cityscapes datasets show that the proposed SC-GAN achieves much more accurate depth maps than many existing state-of-the-art methods on monocular videos.

Finally, we propose a new method for single-image depth estimation that utilizes the spatial correspondence from stereo matching. To achieve this goal, we incorporate a pre-trained stereo network as a teacher that provides depth cues for the features and output generated by the student network, a monocular depth estimation network. To further leverage the depth cues, we develop a new depth-aware convolution operation that can adaptively choose subsets of relevant features for convolution at each location. Specifically, we compute hierarchical depth features as guidance and then estimate the depth map using this depth-aware convolution, which leverages the guidance to adapt the filters. Experimental results on the KITTI online benchmark and the Eigen split show that the proposed method achieves state-of-the-art performance for single-image depth estimation.
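A hedged sketch of the depth-aware convolution idea: neighbor contributions in a 3x3 window are scaled by their depth similarity to the center pixel, so the effective filter adapts at depth discontinuities. The similarity kernel and `alpha` below are assumptions, not the dissertation's exact operator:

```python
import torch
import torch.nn.functional as F

def depth_aware_conv(x, depth, weight, bias=None, alpha=8.0):
    """Depth-guided 3x3 convolution. x: (N, C, H, W), depth: (N, 1, H, W),
    weight: (O, C, 3, 3). Each neighbor is reweighted by exp(-alpha * |dz|),
    its depth difference to the center pixel."""
    N, C, H, W = x.shape
    unfold_x = F.unfold(x, 3, padding=1).view(N, C, 9, H, W)
    unfold_d = F.unfold(depth, 3, padding=1).view(N, 1, 9, H, W)
    sim = torch.exp(-alpha * (unfold_d - depth.unsqueeze(2)).abs())  # depth similarity
    out = torch.einsum('ock,nckhw->nohw', weight.view(-1, C, 9), unfold_x * sim)
    return out if bias is None else out + bias.view(1, -1, 1, 1)
```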

The Automatic Computer Scientist

Friday, April 15, 2022 - 02:30 pm
Swearingen Engineering Center in Room 2A31

Abstract

Building machines that automatically write computer programs is a grand challenge in AI. Such a development would offer the potential to automatically build bug-free programs and to discover novel efficient algorithms. In this talk, I will describe progress towards this grand challenge, i.e. progress towards building an `Automatic Computer Scientist (AutoCS)`. I will focus on major recent breakthroughs in inductive logic programming (ILP), a form of machine learning based on mathematical logic, with wider applications in drug design, game playing, and visual reasoning.

Bio

I am a research fellow at the University of Oxford. I work on logic and machine learning, i.e. inductive logic programming. I run the Logic and Learning (LoL) group and the `Automatic Computer Scientist' project.

Location

In person: Swearingen Engineering Center in Room 2A31

 Virtual MS Teams

 

Event-Driven Approximate Dynamic Programming for Feedback Control

Friday, April 8, 2022 - 02:20 pm
Swearingen Engineering Center in Room 2A31

Abstract

Adaptive controllers employ online observations of system performance to determine control policies for driving a system toward a desired state. For example, the adaptive cruise control module in a car utilizes data from various sensors to steer the vehicle such that it maintains a safe following distance and stays within the speed limit. In this talk, I will introduce a set of learning algorithms to synthesize feedback control policies for dynamic systems. Specifically, I will discuss topics including event-triggered control, approximate dynamic programming, and the limits of learning-based controllers for real-time control.
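To make the event-triggered idea concrete, the sketch below simulates linear state feedback where the control is recomputed only when the state drifts beyond a threshold from its value at the last update; the system matrices and gain are toy assumptions, not from the talk:

```python
import numpy as np

def simulate_event_triggered(A, B, K, x0, steps=50, threshold=0.1):
    """Event-triggered state feedback for x_{t+1} = A x_t + B u_t.
    The control u = -K x is recomputed only when the state has drifted more
    than `threshold` from the value used at the last update; otherwise the
    previous control is held, saving computation and communication.
    Returns the trajectory and the number of triggering events."""
    x, x_event = x0.copy(), x0.copy()
    u = -K @ x_event
    traj, events = [x.copy()], 1
    for _ in range(steps):
        if np.linalg.norm(x - x_event) > threshold:   # triggering condition
            x_event = x.copy()
            u = -K @ x_event                          # recompute control
            events += 1
        x = A @ x + B @ u
        traj.append(x.copy())
    return np.array(traj), events

# toy double-integrator example with a stabilizing gain (assumed)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
K = np.array([[1.0, 1.5]])
traj, events = simulate_event_triggered(A, B, K, np.array([1.0, 0.0]))
```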

 

Bio

Vignesh Narayanan (Member, IEEE) received the B.Tech. degree in Electrical and Electronics Engineering from SASTRA University, Thanjavur, India, and the M.Tech. degree with a specialization in Control Systems from the National Institute of Technology Kurukshetra, Haryana, India, in 2012 and 2014, respectively, and the Ph.D. degree from the Missouri University of Science and Technology, Rolla, MO, USA, in 2017. He then joined the Applied Mathematics Lab and the Brain Dynamics and Control Research Group in the Dept. of Electrical and Systems Engineering at Washington University in St. Louis as a postdoctoral research associate. He is currently with the Dept. of Computer Science and Engineering and the AI Institute of the University of South Carolina. He is also affiliated with CAN (Center for Autism and Neurodevelopmental disorders). His current research interests include learning and adaptation in dynamic population systems, complex dynamic networks, reinforcement learning, and computational neuroscience.

 

Location

In person: Swearingen Engineering Center in Room 2A31

Virtual MS Teams

Time

2:20-3:10pm

CNN-based Dendrite Core Detection from Microscopic Images 

Thursday, March 31, 2022 - 03:00 pm

DISSERTATION DEFENSE 

Author : Xiaoguang Li

Advisor : Dr. Song Wang

Date : March 31, 2022

Time : 3:00 pm

Place : Virtual (Zoom link below)

Zoom link: https://zoom.us/j/3845952539?pwd=WkVxVmdETU4zcy9FcDNnOVNDdzE4UT09

 

Abstract

The dendrite core is the center point of a dendrite, and information about dendrite cores is very helpful for material scientists in analyzing the properties of materials. Therefore, detecting the dendrite core is an important task in the material science field. Meanwhile, because of some special properties of dendrites, this task is also very challenging. Different from typical detection problems in the computer vision field, detecting the dendrite core aims to locate a single point rather than a bounding box. As a result, existing regression-based bounding-box detection methods do not work well on this task, because the center point computed from the upper-left and lower-right corners of the bounding box is usually not precise. In this work, we formulate dendrite core detection as a segmentation task and propose a novel method to detect the dendrite core directly. Our pipeline contains three steps: Easy Sample Detection (ESD), Hard Sample Detection (HSD), and Hard Sample Refinement (HSR). Specifically, ESD and HSD focus on the easy and hard samples of dendrite cores, respectively. Both employ the same Central Point Detection Network (CPDN) but do not share parameters. To make HSD focus only on the features of hard samples of dendrite cores, we destroy the structure of the easy samples detected by ESD and force HSD to learn the features of the hard samples. HSR is a binary classifier used to filter out the false positive predictions of HSD. We evaluate our method on the dendrite dataset. Our method outperforms the state-of-the-art baselines on three metrics, i.e., Recall, Precision, and F-score.
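As a generic illustration of decoding point locations from a segmentation-style prediction (not necessarily the exact CPDN post-processing), local maxima of the predicted heatmap above a threshold can be reported as dendrite-core candidates:

```python
from scipy.ndimage import maximum_filter, label, center_of_mass

def heatmap_to_points(heatmap, thresh=0.5, window=5):
    """Extract dendrite-core candidates from a predicted per-pixel heatmap
    (H, W) in [0, 1]: keep local maxima above `thresh` and return their
    (row, col) coordinates."""
    peaks = (heatmap == maximum_filter(heatmap, size=window)) & (heatmap > thresh)
    labeled, n = label(peaks)
    if n == 0:
        return []
    return [tuple(map(float, c)) for c in center_of_mass(heatmap, labeled, range(1, n + 1))]
```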

Python Basics

Monday, March 28, 2022 - 06:00 pm
Innovation Center Room 2277

Women in Computing is hosting its first ever programming workshop! We will be learning the basics of Python! If you have an interest in learning to code, come on out! We hope to do more workshops in other languages in the future, so come by and show your interest tonight, March 28th, at 6 pm in the Innovation Center, Room 2277!

If you want to join virtually, we will try to simultaneously share the screen via Zoom, so be sure to join our GroupMe to get access to the Zoom link.

GroupMe: https://groupme.com/join_group/34681325/pIJInQ

Everyone, of all genders and majors, is welcome!