Overview of Intersection of Physics & AI with a Deep Dive on Incorporating Physics into Data-Driven Computer Vision.

Friday, February 23, 2024 - 02:20 pm
SWGN 2A27 or Zoom

This talk on the intersection of Physics & AI will give a high-level overview of physics-based neural networks and applications of physics in NLP, healthcare, data generation, autonomous vehicles, and more. The talk will also include a deep dive on "Incorporating Physics into Data-Driven Computer Vision," covering approaches such as modifications to datasets, network design, loss functions, optimization, and regularization schemes.
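
As a concrete illustration of the "loss function" route to incorporating physics, a physics-informed network can be trained with an extra penalty on the residual of a known governing equation in addition to the usual data-fit term. The sketch below is a minimal, hypothetical PyTorch example (not material from the talk), assuming a toy ODE du/dx = -u; the names model and physics_residual are illustrative only.

    import torch

    model = torch.nn.Sequential(   # hypothetical network u_theta(x)
        torch.nn.Linear(1, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1))

    def physics_residual(x):
        # illustrative physics term: residual of the toy ODE du/dx = -u
        x = x.requires_grad_(True)
        u = model(x)
        du_dx = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
        return du_dx + u

    x_data = torch.rand(32, 1); y_data = torch.exp(-x_data)   # toy supervised data
    x_phys = torch.rand(128, 1)                               # collocation points

    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(1000):
        opt.zero_grad()
        data_loss = torch.mean((model(x_data) - y_data) ** 2)   # fit the data
        phys_loss = torch.mean(physics_residual(x_phys) ** 2)   # respect the physics
        (data_loss + 0.1 * phys_loss).backward()                # weighted combination
        opt.step()

The weight on the physics term (0.1 here) trades off fidelity to the observed data against consistency with the governing equation.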
 

Bio: Parth Patwa is a data scientist at Amazon, working on generative AI. He graduated from UCLA. Previously, he worked as an ML scientist at MIT Pathcheck. His publications in NLP and CV have over 1,000 citations.

Zoom
 

GenAI for creative design of novel materials

Friday, February 16, 2024 - 02:15 pm

About
Human society of this century is facing several fundamental challenges, including global climate change, the energy crisis, and public health crises such as cancer and COVID-19. Common to their solutions is the discovery of novel materials, molecules, proteins, and drugs. Designing these functional atomic structures is challenging due to the astonishing complexity of interatomic interactions, the sophisticated physical/chemical/geometric constraints and patterns required to form stable structures, and the intricate ways structures relate to their functions. Like most other engineering design activities, the current mainstream paradigm of materials design is the rational design approach, which emphasizes a causal understanding of the structure-function relationship and depends on heuristic expert knowledge and explicit design rules. However, this traditional paradigm faces increasing challenges in designing extraordinary functional materials that can effectively meet our needs: it usually leads to sub-optimal solutions in the huge chemical design space due to its limited search capability; it struggles to handle the huge amount of implicit knowledge and constraints and cannot exploit such rules for efficient design-space exploration; it needs too many explicit design rules; and it is difficult to use for designing highly constrained structures such as periodic inorganic crystals.

In this talk, I will introduce the transformative shift from rational materials design to the data-driven deep generative materials design paradigm, in which known materials data are fed to deep generative models to learn explicit and implicit knowledge of atomic structures, which is then exploited for efficient structure generation. This is inspired by the deep-learning-based Artificial Intelligence Generated Content (AIGC) technologies that have made rapid progress in generating authentic images, videos, texts, music, and human voices. Our work shows that designing images and texts shares many characteristics with the task of designing proteins, molecules, and materials, in which building blocks at different levels are assembled to form specific stable or meaningful structures that satisfy diverse grammatical, physical, chemical, or geometric constraints. While Nature has used the physical apparatus of DNA as the information carrier of synthesis rules for protein synthesis and biochemistry through evolution, deep neural networks can be exploited similarly to achieve Nature's way of material design by learning the design rules from known materials or from computational simulations. Just as a female frog can give birth to a frog without knowing how a frog grows from a zygote through a developmental process, we show that our deep generative materials design works in a similar design-without-understanding process.

-----------------------
Bio: Dr. Jianjun Hu is a professor of computer science at the University of South Carolina, where he directs the Machine Learning and Evolution Laboratory (http://mleg.cse.sc.edu). Dr. Hu received his Ph.D. in computer science in 2004 from Michigan State University in the areas of machine learning and evolutionary computation and then conducted postdoctoral studies in bioinformatics at Purdue University and the University of Southern California from 2004 to 2007. His current research interests include AI for science, machine learning, deep learning, evolutionary algorithms, and their application in materials informatics, bioinformatics, health informatics, and automated design synthesis, with a total of more than 200 papers. Dr. Hu is a winner of the National Science Foundation CAREER Award. His research has been funded by NSF, DOE, and NIH. He can be reached at jianjunh@cse.sc.edu.

Details: https://www.linkedin.com/events/7162579233444192256/about/

Civilizing AI: Examining Emerging Capabilities & Mitigation

Friday, February 9, 2024 - 02:15 pm
online

Abstract - The emergence of very large foundation models such as GPT, Stable Diffusion, DALL-E and Midjourney has dramatically altered the trajectory of progress in AI and its applications. Enthusiasm for AI has expanded beyond the realm of AI researchers to the general population; indeed, we are living in an exciting time of scientific proliferation. Present-day AI exhibits promising forms of intelligence with wide usability, but it also possesses unexpected limitations and is susceptible to potential misuse. AI has reached a level where discerning AI-generated content, be it text, images or videos, has become notably challenging, which we term its "eloquence" characteristics. Conversely, the worrisome rise of hallucinations in AI models raises credibility issues, which we refer to as its "adversity" characteristics. Recently, the governments of both the United States and the European Union have put forth preliminary proposals for regulatory frameworks governing the safety of AI-powered systems. AI systems that adhere to these regulations in the future will be referred to by a recently coined term, "Constitutional AI". The primary objective of regulatory frameworks is to establish safeguards against the misuse of AI systems and, in the event of any such misuse, to impose penalties on the individuals, groups, and/or organizations responsible for the misconduct. "CIVILIZING AI" embodies a nuanced equilibrium between the machine's eloquence and its inclination towards adversarial behavior, with the goal of enforcing constitutional principles. Have a look if you wish to pre-read: https://analyticsindiamag.com/this-usc-professor-from-kolkata-is-on-a-j…

Brief Bio
----------------
Currently, Dr. Das holds the position of Research Associate Professor at the Artificial Intelligence Institute (AIISC), University of South Carolina, USA. Previously, he served as a founding researcher and played a pivotal role in establishing Wipro Labs in Bangalore, India, from its inception; he maintains an association with Wipro Labs in the capacity of an Advisory Scientist. He also holds an adjunct faculty position at IIT Patna. In the past, he worked for Samsung Research, India. Across these organizations, he has managed, guided, or collaborated with more than 100 people.

He has held two academic postdoctoral positions, one in Europe and the other in the USA. During his tenure in the USA, he worked at the University of Texas at Austin, while his European postdoc was awarded by the European Research Consortium for Informatics and Mathematics (ERCIM) and hosted at NTNU, Norway. He earned his Ph.D. in Engineering from Jadavpur University, India, with active collaboration with the Tokyo Institute of Technology, Japan. Dr. Das has nearly two decades of research experience in NLP, with a substantial publication record of more than 120 research papers spanning a wide array of topics, and holds an h-index of 36.

 

Details at https://www.linkedin.com/events/aiiscseminar-civilizingai-exami71606855…

Predictive Filtering-based Image Inpainting

Wednesday, February 7, 2024 - 11:00 am

DISSERTATION DEFENSE

Department of Computer Science and Engineering

University of South Carolina

Author : Xiaoguang Li

Advisor : Dr. Song Wang

Date : Feb 7, 2024

Time: 11 am – 12:30 pm

Place : Teams

Link: https://teams.microsoft.com/l/meetup-join/19%3ameeting_MWJkNGI5OTYtNzk5…

Abstract

Image inpainting is an important challenge in the field of computer vision. The primary goal of image inpainting is to fill in the missing parts of an image. This technique has many real-life uses, including fixing old photographs and restoring ancient artworks, e.g., the degraded Dunhuang frescoes. Image inpainting is also helpful in image editing: it can eliminate unwanted objects from images while maintaining a natural and realistic appearance, e.g., removing watermarks and subtitles. Although image inpainting expects the restored result to be identical to the original clean image, existing deep generative inpainting methods often treat image inpainting as a pure generative task and emphasize the naturalness or realism of the generated result. Despite significant progress, these methods remain far from real-world applications due to their low generalization across different scenes. As a result, the generated images usually contain artifacts, or the filled pixels differ greatly from the ground truth. To address this challenge, we propose two approaches that utilize predictive filtering to improve image inpainting performance, and we further harness predictive filtering and inpainting pretraining to tackle the challenge of shadow removal.

In the first approach, we formulate image inpainting as a mix of two problems, i.e., predictive filtering and deep generation. Predictive filtering is good at preserving local structures and removing artifacts but falls short of completing large missing regions. A deep generative network can fill the many missing pixels based on its understanding of the whole scene but hardly restores details identical to the original ones. To make use of their respective advantages, we propose the joint predictive filtering and generative network (JPGNet). We validate this approach on three public datasets, i.e., Dunhuang, Places2, and CelebA, and demonstrate that our method can enhance three state-of-the-art generative methods (i.e., StructFlow, EdgeConnect, and RFRNet) significantly, with only a slight extra time cost.

In the second approach, inspired by this inherent advantage of image-level predictive filtering, we explore the possibility of addressing image inpainting as a filtering task. We first study the advantages and challenges of image-level predictive filtering for inpainting: the method can preserve local structures and avoid artifacts but fails to fill large missing areas. Then, we propose semantic filtering, which conducts filtering at the deep feature level and fills in missing semantic information but fails to recover details. To address these issues while retaining the respective advantages, we propose a novel filtering technique, i.e., Multi-level Interactive Siamese Filtering (MISF). Extensive experiments demonstrate that our method surpasses state-of-the-art baselines across four metrics, i.e., L1, PSNR, SSIM, and LPIPS.

Finally, we employ predictive filtering and inpainting pretraining to address the shadow removal problem. Specifically, we find that pretraining shadow removal networks on an image inpainting dataset can reduce shadow remnants significantly: a naive encoder-decoder network achieves restoration quality competitive with state-of-the-art methods using only 10% of the shadow & shadow-free image pairs. After analyzing networks with and without inpainting pretraining via the information stored in the weights (IIW), we find that inpainting pretraining improves restoration quality in non-shadow regions and significantly enhances the generalization ability of the networks, while shadow-removal fine-tuning enables the networks to fill in the details of shadow regions. Inspired by these observations, we formulate shadow removal as an adaptive fusion task and propose Inpaint4Shadow: Leveraging Inpainting for Single-Image Shadow Removal. Extensive experiments show that our method, empowered with predictive filtering and inpainting, outperforms all state-of-the-art shadow removal methods.
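
To make the idea of image-level predictive filtering concrete, the sketch below shows one way a small network can predict a per-pixel kernel and apply it to that pixel's neighborhood so that missing pixels are filled from nearby known ones. This is only an illustrative PyTorch-style sketch under assumed tensor shapes; it is not the actual JPGNet or MISF architecture, and the kernel-prediction head shown is hypothetical.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class PixelPredictiveFilter(nn.Module):
        """Toy image-level predictive filtering: predict a kernel per pixel and
        apply it to that pixel's local neighborhood to fill masked pixels."""
        def __init__(self, ksize=3):
            super().__init__()
            self.ksize = ksize
            # hypothetical kernel-prediction head; the real JPGNet/MISF networks differ
            self.kernel_net = nn.Sequential(
                nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, ksize * ksize, 3, padding=1))

        def forward(self, image, mask):
            # image: (B, 3, H, W); mask: (B, 1, H, W), 1 = known pixel, 0 = missing
            b, _, h, w = image.shape
            kernels = self.kernel_net(torch.cat([image * mask, mask], dim=1))
            kernels = F.softmax(kernels, dim=1)                       # normalize each per-pixel kernel
            patches = F.unfold(image * mask, self.ksize, padding=self.ksize // 2)
            patches = patches.view(b, 3, self.ksize * self.ksize, h * w)
            kernels = kernels.view(b, 1, self.ksize * self.ksize, h * w)
            filled = (patches * kernels).sum(dim=2).view(b, 3, h, w)  # weighted neighborhood average
            return image * mask + filled * (1 - mask)                 # keep known pixels, fill the rest

Because each output pixel is a weighted combination of known neighbors, local structure is preserved, which is exactly the strength of predictive filtering noted above; large holes with no known neighbors remain the job of the generative or semantic branch.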

 

Enhancing healthcare with AI-in-the-loop

Friday, February 2, 2024 - 02:20 pm
SWGN 2A27

Date: 2 February 2024 (Friday)

Time: 2:20pm to 3:20pm

Location: SWGN 2A27 (preferred) or online on Zoom

Historically, Artificial Intelligence has taken either a symbolic route, representing and reasoning about objects at a higher level, or a statistical route, learning complex models from large data. To achieve true AI in complex domains such as healthcare, it is necessary to make these different paths meet and enable seamless human interaction. First, I will introduce learning from rich, structured, complex and noisy data. One of the key attractive properties of the learned models is that they use a rich representation for modeling the domain that potentially allows for seamless human interaction. I will present recent progress that allows for more reasonable human interaction, where the human input is taken as "advice" and the learning algorithm combines this advice with data. I will present these algorithms in the context of several healthcare problems -- learning from electronic health records, clinical studies, and surveys -- and demonstrate the value of involving experts during learning.
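
As a rough illustration of the "advice combined with data" idea (a generic sketch, not the speaker's actual algorithms), one simple way to inject expert input is to penalize predictions that violate an expert rule alongside the usual data-fit loss. The example below assumes hypothetical advice of the form "patients with a high value of feature 0 should lean toward the positive class."

    import torch

    X = torch.randn(200, 5)                       # toy patient features
    y = (X[:, 0] + 0.5 * X[:, 1] > 0).float()     # toy labels
    w = torch.zeros(5, requires_grad=True)

    # hypothetical expert advice: feature 0 high => prediction should lean positive
    advice_mask = X[:, 0] > 1.0

    opt = torch.optim.SGD([w], lr=0.1)
    for _ in range(500):
        opt.zero_grad()
        logits = X @ w
        data_loss = torch.nn.functional.binary_cross_entropy_with_logits(logits, y)
        # advice penalty: hinge on predictions that disagree with the expert rule
        advice_loss = (torch.relu(-logits[advice_mask]).mean()
                       if advice_mask.any() else torch.tensor(0.0))
        (data_loss + 0.5 * advice_loss).backward()
        opt.step()

The advice weight (0.5 here) controls how strongly the expert's rule is allowed to shape the model relative to the data.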

Speaker Bio:

Sriraam Natarajan is a Professor and the Director of the Center for ML in the Department of Computer Science at the University of Texas at Dallas, a hessian.AI fellow at TU Darmstadt, and an RBDSCAII Distinguished Faculty Fellow at IIT Madras. His research interests lie in the field of Artificial Intelligence, with emphasis on Machine Learning, Statistical Relational Learning and AI, Reinforcement Learning, Graphical Models, and Biomedical Applications. He is an AAAI senior member and has received the Young Investigator Award from the US Army Research Office, an Amazon Faculty Research Award, an Intel Faculty Award, a XEROX Faculty Award, a Verisk Faculty Award, the ECSS Graduate Teaching Award from UTD, and the IU Trustees Teaching Award from Indiana University. He is the program chair of AAAI 2024, the general chair of CoDS-COMAD 2024, AI and Society track chair of AAAI 2023 and 2022, senior member track chair of AAAI 2023, demo chair of IJCAI 2022, and program co-chair of the SDM 2020 and ACM CoDS-COMAD 2020 conferences. He was the specialty chief editor of the Frontiers in ML and AI journal, and is an associate editor of the JAIR, DAMI, and Big Data journals.

Tracking Human Activities in an Interactive Space: Unveiling the Potential of mmWave Sensing

Friday, February 2, 2024 - 11:00 am
Storey Innovation Center, Room 2277

SUMMARY: Imagine living in an intuitively interactive space without the need to understand its grammar. One does not need to interact in a specific way, use voice commands (like Amazon Echo), or always wear something (like a smart band) to control the space and the devices therein. Sharing this intelligent space with others also does not degrade the individual user experience. Interestingly, this vision of seamless smart spaces is not novel and is in fact quite dated; however, we have yet to occupy this kind of space regularly. For this vision to become an everyday reality, we argue that there is a need for multi-user continuous activity tracking through passive sensing mechanisms. In this talk, I will discuss how we have developed a pervasive sensing system for continuous multi-user room-scale tracking of human activities using a single COTS mmWave radar. We will further highlight how such a sensing system can be utilized for continuous monitoring of driving activities on the road, to prevent road accidents from driver distraction and promote safe driving. We will conclude the talk with some potential research challenges where mmWave sensing can help.
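
For a sense of the kind of processing such a system involves (a simplified sketch, not the speaker's pipeline), one frame of mmWave radar returns can be clustered so that each cluster roughly corresponds to one person in the room, whose centroid can then be tracked across frames; the parameters below are purely illustrative.

    import numpy as np
    from sklearn.cluster import DBSCAN

    def separate_users(point_cloud):
        """Cluster one frame of mmWave radar points (x, y, z in meters) so that
        each cluster roughly corresponds to one person in the room."""
        labels = DBSCAN(eps=0.4, min_samples=5).fit_predict(point_cloud)
        clusters = [point_cloud[labels == k] for k in set(labels) if k != -1]
        return [c.mean(axis=0) for c in clusters]   # one centroid per detected user

    # toy frame: two people plus a little background clutter
    frame = np.vstack([np.random.normal([1, 2, 1], 0.1, (40, 3)),
                       np.random.normal([3, 1, 1], 0.1, (40, 3)),
                       np.random.uniform(0, 4, (10, 3))])
    print(separate_users(frame))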

BIO: Dr. Sandip Chakraborty is an Associate Professor in the Department of Computer Science and Engineering at
the Indian Institute of Technology (IIT) Kharagpur, and is leading the Ubiquitous Networked Systems Lab (UbiNet). He
obtained his Bachelor’s degree from the Jadavpur University, Kolkata in 2009 and M. Tech and Ph.D. from IIT
Guwahati, in 2011 and 2014, respectively. The primary research interests of Dr. Chakraborty are Pervasive and
Ubiquitous Computing, Computer Systems, and Human-Computer Interactions. He received various awards including
the Google Award for Inclusion Research in Societal Computing (2023), the Indian National Academy of Engineering
(INAE) Young Engineers’ Award 2019, and the Honorable Mention Award in ACM SIGCHI EICS 2020. Dr. Chakraborty
is one of the founding chairs of ACM IMOBILE, the ACM SIGMOBILE Chapter in India. He is working as an Area
Editor of Elsevier Ad Hoc Networks journal and Elsevier Pervasive and Mobile Computing journal. Further details
about his works and publications can be found at https://cse.iitkgp.ac.in/~sandipc/.

 

 

Bridging Knowledge Gaps in Agriculture with Generative AI

Friday, January 26, 2024 - 02:20 pm
SWGN 2A27

Abstract 
Agriculture, vital for global sustenance, is increasingly challenged by a knowledge gap arising from language and literacy barriers, especially in regions like India. This seminar explores the transformative role of Generative AI in overcoming these obstacles, providing a comprehensive overview of its technical and practical applications. The use of Large Language Models (LLMs), enhanced with vernacular voice-based conversational interfaces, has revolutionized the delivery of farming advice in native languages, thereby bridging critical information gaps. 
 

The seminar also emphasizes the significance of robust data curation in developing such AI models. In countries like India, a substantial amount of agricultural data is produced by various institutes in regional languages to assist farmers. However, this data often remains inaccessible for various reasons. Effective data collection, validation, and integration are crucial to ensure the accuracy and contextual relevance of the information provided. 
 

Dhenu 1.0, a pioneering domain-specific model, demonstrates how Generative AI can be customized for bilingual communication in English and Hindi, providing real-time, personalized farming advice. This model's cost-effectiveness and accuracy are particularly beneficial where diverse languages and varying literacy levels among farmers can hinder access to essential agricultural information. 
 

This seminar aims to highlight how innovative solutions like KissanAI and Dhenu 1.0 can transform agricultural practices, making expert knowledge accessible and understandable to all farmers. This approach not only promotes more informed and efficient farming decisions but also showcases the potential of AI in driving sustainable and inclusive agricultural development.

About the speaker:

Pratik Desai is a seasoned entrepreneur and computer scientist passionate about leveraging knowledge extraction through AI and Semantic Web technologies. Before venturing into AI applications for Agriculture, Dr. Desai honed his skills by co-founding three innovative startups in Silicon Valley. Grounded in his family's agricultural heritage, Dr. Desai established KissanAI. His vision for KissanAI has been to contribute meaningfully to his roots by democratizing the use of AI in agriculture. KissanAI stands out as a vernacular AI CoPilot platform meticulously designed to offer voice-based assistance tailored to the agricultural sector, particularly addressing the challenges faced by farmers in developing nations by overcoming literacy and language barriers. Dr. Desai’s pioneering work in incorporating generative AI in the Agriculture domain has led to the development of Dhenu, the world's first agricultural vertical-specific Large Language Model (LLM). More: https://www.linkedin.com/in/pratikkumardesai/

 

For those who cannot join in person: Zoom
https://us06web.zoom.us/j/8440139296?pwd=b09lRCtJR0FCTWcyeGtCVVlUMDNKQT…;

A Neurosymbolic AI Approach to Scene Understanding.

Friday, January 19, 2024 - 02:20 pm
SWGN 2A27

Abstract: Scene understanding is a major challenge for autonomous systems. It requires combining diverse information sources, background knowledge, and different sensor data to understand the physical and semantic aspects of dynamic environments. Current technology for scene understanding relies heavily on computer vision and deep learning techniques to perform tasks such as object detection and localization. However, due to the complex and dynamic nature of driving scenes, the technology's complete reliance on raw data poses challenges, especially with respect to edge cases. In this talk, I will discuss some of these challenges along with how they are currently being handled. Next, I will discuss a novel perspective that we have introduced as part of my dissertation, which leverages the use of external knowledge, representation learning, and neurosymbolic AI to address some of these challenges. Finally, I will share my thoughts on directions for future research and new applications and domains where we can apply this technology to improve machine perception in autonomous systems. 

Bio: Ruwan Wickramarachchi is a Ph.D. candidate at the AI Institute, University of South Carolina. His dissertation research focuses on introducing expressive knowledge representation and neurosymbolic AI techniques to improve machine perception and context understanding in autonomous systems. He has published several research papers, co-invented patents, and co-organized multiple tutorials on neurosymbolic AI and its emerging uses in addressing scene understanding challenges in autonomous systems. Prior to joining the doctoral program, he worked as a senior engineer in the machine learning research group at the London Stock Exchange Group (LSEG).

Location: SWGN 2A27

 

We would love in-person attendance (required for registered students),
but remote attendance is possible on Zoom:

https://us06web.zoom.us/j/8440139296?pwd=b09lRCtJR0FCTWcyeGtCVVlUMDNKQT…

Meeting ID: 844 013 9296
Passcode: 12345

On Parallelization of Graph Algorithms, Performance Modeling, and Autonomous 3D Printable Object Synthesis

Monday, December 11, 2023 - 11:00 am
Meeting room 2267 (Innovation Building)

DISSERTATION DEFENSE

Author : Shams-Ul-Haq Syed

Advisor : Dr. Manton Matthews

Date : December 11, 2023

Time:  11 am – 1 pm

Place : Meeting room 2267 (Innovation Building)

 

Abstract

 

The degree of hardware-level parallelism offered by today's GPU architecture makes it ideal for problem domains with massive inherent parallelism potential, in fields such as computer vision, image processing, graph theory, and graph computations. We have identified three problem areas for the purpose of this dissertation, under the umbrella of performance improvement by harnessing the power of GPUs for novel applications. The first area is concerned with k-vertex connectivity in graph theory, the second deals with performance evaluation using extended roofline models for GPU parallel applications, and the third is related to the synthesis of 3D printable objects from 2D images.

In this thesis, we examine k-vertex connectivity in undirected graphs and its applications, and measure the performance of GPU computations using the CUDA Toolkit. Matthews and Sumner in 1984 presented the conjecture that every 4-connected claw-free graph is Hamiltonian. In the initial paper [1] it was shown that every 3-connected claw-free graph on fewer than 20 vertices is Hamiltonian. Over the years there have been several papers establishing the result for connectivity higher than 4, so all that remains is the case of 4-connected claw-free graphs conjectured by C. Thomassen [2]. We present a new CUDA-based parallel k-vertex connectivity test algorithm to determine the connectivity of any given claw-free graph. The parallel algorithm is several orders of magnitude faster than its serial counterpart. It is a major step towards efficiently determining whether the conjecture holds for graphs with connectivity exactly equal to 4. Our parallel algorithm can also be applied to find the value of k (connectedness) for a given graph. It is validated using a number of different types of graphs, such as complete graphs, complete bipartite graphs, and chorded cycle graphs G_{n,k} with 20 ≤ n ≤ 300.
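
For readers unfamiliar with k-vertex connectivity, a serial reference check (not the CUDA algorithm from the thesis) can be written in a few lines with networkx, which computes vertex connectivity via max-flow in the spirit of Menger's theorem; the example graph is chosen only for illustration.

    import networkx as nx

    def is_k_connected(graph: nx.Graph, k: int) -> bool:
        # vertex connectivity = minimum number of vertices whose removal
        # disconnects the graph (computed via max-flow / Menger's theorem)
        return nx.node_connectivity(graph) >= k

    # example: the complete bipartite graph K_{4,4} is exactly 4-connected
    g = nx.complete_bipartite_graph(4, 4)
    print(nx.node_connectivity(g), is_k_connected(g, 4))   # 4 True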

For the GPU architecture, we propose the unified cache-aware roofline model, which provides better insights by capturing details such as memory transfers between host and device, unlike traditional roofline models that focus strictly on the memory bandwidth and computational performance of either the CPU or the GPU alone. Our model provides a more holistic picture of application performance in a single view by capturing computations on CPUs and GPUs along with the data transfers from host to device, including the theoretical bandwidths of host and device memories.
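
For reference, the classical roofline model bounds attainable throughput by the minimum of peak compute and the product of arithmetic intensity and memory bandwidth; one natural way to express the unified idea above, as a generic sketch rather than the thesis's exact formulation, is to add a further ceiling for host-device transfers:

    \[
      P_{\text{attainable}} \;=\; \min\bigl(\, P_{\text{peak}},\; I_{\text{mem}} \cdot B_{\text{mem}},\; I_{\text{xfer}} \cdot B_{\text{host-device}} \,\bigr)
    \]

where each I is an arithmetic intensity (FLOPs per byte moved over the corresponding channel) and each B is that channel's peak bandwidth.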

Finally, a novel approach to synthesizing 3D printable objects from a single input image is presented. The algorithm employs a probabilistic machine-learning-based framework along with multiple shape and depth cues, such as manifolds, contours, gradients, and prior knowledge about the shapes, to generate a plausible 3D model from a single 2D image. Our algorithm intelligently combines isolated shapes into a single object while retaining their relative positions. It also considers the minimal 3D printable area and strength while generating a watertight mesh object. Consequently, the resultant 3D model is 3D-printer compatible, and the actual 3D printed object is sturdy and less prone to breakage. In addition, our scalable algorithm runs in time quadratic in the size of the image. Preliminary results have demonstrated several different 2D images turned into actual 3D printed objects that are sturdy and aesthetically pleasing.

Object Classification, Detection and Tracking in Challenging Underwater Environment

Friday, December 8, 2023 - 04:00 pm
online

DISSERTATION DEFENSE

 

Author : Md Modasshir

Advisor : Dr. Ioannis Rekleitis

Date : December 8, 2023

Time:  4 pm – 5 pm

Place : Virtual

Meeting Link : https://teams.microsoft.com/l/meetup-join/19%3ameeting_ZWVjMjFkNzEtNDQx…

Abstract

The main contributions of this thesis are the applicability and architectural designs of deep learning algorithms for underwater imagery. In recent times, deep learning techniques for object classification and detection have achieved exceptional levels of accuracy that surpass human capabilities. However, the effectiveness of these techniques in underwater environments has not been thoroughly researched. This thesis delves into various research areas related to underwater environments, such as object classification, detection, semantic segmentation, pose regression, and semi-supervised retraining of detection models.

The first part of the thesis studies image classification and detection. Image classification is a fundamental process that involves assigning a label to an image from a predetermined set of categories. Detection, on the other hand, refers to the process of locating an object within an image, along with its label. We have developed a coral classification model, MDNet, that is trained using point-annotated visual data and is capable of classifying eight species of corals. MDNet incorporates state-of-the-art convolutional neural network architectural designs that allow for acceleration on embedded devices. To further enhance its capabilities, we utilize the detection capability of MDNet along with Kernelized Correlation Filter (KCF)-based trackers to identify unique coral objects. For a given trajectory over the seafloor, we can track unique coral objects and estimate coral population density. This population estimation is a valuable tool for marine biologists to analyze the effects of climate change and water pollution on coral reefs over time. To deploy the system on embedded devices such as Aqua2, we have conducted a comprehensive study of available neural network accelerators based on field-programmable gate arrays (FPGAs) and optimized MDNet to achieve real-time performance. For object detection, we combine the output of the classifier model with a crowd-annotated dataset to develop a robust model for detecting relevant species of corals. We also test the generalization capability of models designed for underwater images in the medical domain: similar models were trained to classify and quantify nuclei from human blood neutrophils, achieving over 94% accuracy in differentiating different cell types.
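
To illustrate how detection and KCF tracking can be combined to count unique corals along a trajectory (a simplified sketch, not the thesis's actual pipeline), new detections can be matched to existing tracks by overlap, and a new track, hence a new unique coral, is started only for unmatched detections. The detect() argument below is a placeholder for any detector such as MDNet, and TrackerKCF_create assumes an OpenCV build with the tracking (contrib) module.

    import cv2

    def iou(a, b):
        ax, ay, aw, ah = a; bx, by, bw, bh = b
        x1, y1 = max(ax, bx), max(ay, by)
        x2, y2 = min(ax + aw, bx + bw), min(ay + ah, by + bh)
        inter = max(0, x2 - x1) * max(0, y2 - y1)
        return inter / float(aw * ah + bw * bh - inter + 1e-9)

    def count_unique_corals(frames, detect, iou_thresh=0.3):
        trackers, unique = [], 0
        for frame in frames:
            boxes = [t.update(frame)[1] for t in trackers]        # propagate existing tracks
            for det in detect(frame):                              # placeholder detector, boxes as (x, y, w, h)
                if all(iou(det, b) < iou_thresh for b in boxes):   # unmatched detection
                    t = cv2.TrackerKCF_create()                    # start a new KCF track
                    t.init(frame, tuple(int(v) for v in det))
                    trackers.append(t)
                    unique += 1                                    # one new unique coral
        return unique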

The next part of the thesis explores how to integrate deep-learning-based object detection with a SLAM system to create a semantic 3D map. A semantic 3D map is required for seafloor exploration and coral reef monitoring systems. In our research, we integrate a coral reef detection algorithm with Direct Sparse Odometry (DSO), a Simultaneous Localization and Mapping (SLAM) method. By combining the output of the detection system with DSO feature mapping, we have developed a semantic 3D map that allows for efficient navigation and better coverage of the coral reef.

In the subsequent part of the thesis, we extend object detection neural networks to predict the 6D pose of underwater vehicles. Pose regression, the process of predicting 6D poses, in deep learning involves using monocular images to predict the 3D location and orientation of an object. In order to facilitate cooperative localization, we have created a vision-based localization system called DeepURL for Aqua2 robots operating underwater. The DeepURL system first detects objects in the images and then predicts their 3D positions and orientations.

Finally, in the fourth part of the thesis, we have developed a semi-supervised approach for training the detection algorithm using a dataset with labels for only a subset of samples. This allows the algorithm to use unlabeled visual data from future experiments and scuba dives. We have found that this semi-supervised approach improves the performance and robustness of the detection algorithm.
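
The semi-supervised retraining described above is commonly realized through self-training with pseudo-labels: run the current detector on unlabeled footage, keep only confident detections as labels, and retrain on the union of real and pseudo labels. The outline below is a generic sketch under assumed interfaces (detector.predict, train_fn), not the thesis's exact procedure.

    def pseudo_label_retrain(detector, labeled, unlabeled, train_fn,
                             conf_thresh=0.9, rounds=3):
        """Generic self-training loop: augment the labeled set with the detector's
        own high-confidence predictions on unlabeled images, then retrain.
        Assumes detector.predict(image) returns [(box, label, score), ...]."""
        for _ in range(rounds):
            pseudo = []
            for image in unlabeled:
                confident = [(box, label)
                             for box, label, score in detector.predict(image)
                             if score >= conf_thresh]        # keep only confident boxes
                if confident:
                    pseudo.append((image, confident))
            detector = train_fn(detector, labeled + pseudo)  # retrain on real + pseudo labels
        return detector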

The thesis aims at developing deep-learning-based object understanding in underwater environments while maintaining the generalization capability of the models. We demonstrate how object classification and detection can be redesigned and repurposed for underwater environments. We also provide the intuitions behind the model design and evaluate against state-of-the-art models.