Process Knowledge-Guided Neurosymbolic Learning and Reasoning

Monday, June 30, 2025 - 09:00 am
Rm 529 AI Institute

DISSERTATION DEFENSE

Author : Kaushik
Advisor: Dr. Amit Sheth
Date: June 30, 2025
Time: 09:00 am
Place: Rm 529 AI Institute
Zoom Link : Join Zoom Meeting
Meeting ID: 868 4960 6766

Abstract

Neural‐network–driven artificial intelligence has achieved impressive predictive accuracy, yet its opaque, data-centric modus operandi still limits trustworthy deployment in safety-critical settings. A central barrier is the difficulty of aligning continuous representations learned by networks with the process knowledge -- formal, expert-crafted diagnostic or operational procedures that govern real-world decision making. This dissertation introduces Process Knowledge-Infused Learning and Reasoning (PK-iL), a neurosymbolic framework that injects task-specific process structures directly into end-to-end differentiable models. PK-iL marries symbolic constraints with gradient-based optimization, yielding predictors whose internal reasoning steps remain faithful to domain processes while retaining the adaptability and scale of modern deep learning.


The contributions are fourfold: (1) a formal representation for encoding process knowledge as differentiable constraints; (2) algorithms that integrate these constraints into training objectives and inference routines; (3) theoretical analysis showing how process alignment improves controllability and transparency without sacrificing expressivity; and (4) empirical validation in mental-health decision support, where psychiatric diagnostic criteria provide rigorous process ground truth. Across multiple datasets and baselines, PK-iL delivers higher diagnostic accuracy, markedly clearer explanation traces, and graceful handling of out-of-distribution cases, qualities essential for adoption as a human-AI “partner” in high-stakes workflows. These results demonstrate a viable path toward reliable, process-guided neurosymbolic AI.
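As an illustration of how a process constraint can enter a training objective, the following sketch (hypothetical names and rule, not the dissertation's code) relaxes a diagnostic process rule into a differentiable penalty added to the task loss:

```python
import math

# Hypothetical sketch of contribution (1): a process rule, relaxed into a
# soft differentiable penalty added to the task loss. The toy rule says a
# positive diagnosis should never be more confident than the screening
# criterion that the clinical process requires to fire first.

def task_loss(p_pred, y_true):
    """Binary cross-entropy on the main prediction."""
    eps = 1e-9
    return -(y_true * math.log(p_pred + eps)
             + (1 - y_true) * math.log(1 - p_pred + eps))

def process_penalty(p_diag, p_criterion):
    """Squared hinge: nonzero only when the process rule is violated."""
    return max(0.0, p_diag - p_criterion) ** 2

def pk_il_loss(p_diag, p_criterion, y_true, lam=1.0):
    return task_loss(p_diag, y_true) + lam * process_penalty(p_diag, p_criterion)
```

Gradient descent on such a combined loss trades predictive fit against fidelity to the process, which is the general mechanism the framework formalizes.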

A Neuro-Symbolic AI Approach to Scene Understanding in Autonomous Systems

Monday, June 23, 2025 - 10:00 am
Online

DISSERTATION DEFENSE
 

Author : Ruwan Tharanga Wickramarachchige Don
Advisor: Dr. Amit Sheth
Date: June 23rd, 2025
Time: 10:00 am
Place: AI Institute Seminar room
Zoom Link / Online Access: Join Zoom Meeting
https://sc-edu.zoom.us/j/89344836465?pwd=5sm3lb06ESCU8kcFmNhBWKLL8MnwhF…

 

Meeting ID: 893 4483 6465

Passcode: 180289


Abstract

 

Scene understanding remains a central challenge in the machine perception of autonomous systems. It requires the integration of multiple sources of information, background knowledge, and heterogeneous sensor data to perceive, interpret, and reason about both physical and semantic aspects of dynamic environments. Current approaches to scene understanding primarily rely on computer vision and deep learning models that operate directly on raw sensor data to perform tasks such as object detection, recognition, and localization. However, in real-world domains – such as autonomous driving and smart manufacturing / Industry 4.0 – this sole reliance on raw perceptual data exposes limitations in safety, robustness, generalization, and explainability. To address these challenges, this dissertation proposes a novel perspective on scene understanding using a Neurosymbolic AI approach that combines knowledge representation, representation learning, and reasoning to advance cognitive and visual reasoning in autonomous systems.

Our approach involves several key contributions. First, we introduce methods for constructing unified knowledge representations that integrate scene data with background knowledge. This includes the development of a dataset-agnostic scene ontology and the construction of knowledge graphs (KGs) to represent multimodal data from autonomous systems. Specifically, we introduce DSceneKG, a suite of large-scale KGs representing real-world driving scenes across multiple autonomous driving datasets. DSceneKG has already been utilized in several emerging neurosymbolic AI tasks, including explainable scene clustering and causal reasoning, and has been adopted for an industrial cross-modal retrieval task. Second, we propose methods to enhance the expressiveness of scene knowledge in sub-symbolic representations to support downstream learning tasks that rely on high-quality translation of KGs into embedding space. Our investigation identifies effective KG patterns and structures that enhance the semantic richness of KG embeddings, thereby improving model reasoning capabilities. Third, we introduce knowledge-based entity prediction (KEP), a novel cognitive visual reasoning task that leverages relational knowledge in KGs to predict entities that are not directly observed but are likely to exist given the scene context. We evaluate the effectiveness of this approach using two high-quality autonomous driving datasets. Fourth, we present CLUE, a context-based method for labeling unobserved entities, designed to improve annotation quality in existing multimodal datasets by incorporating contextual knowledge of entities that may be missing due to perceptual failures.
Finally, by integrating these contributions, we introduce CUEBench, a benchmark for contextual entity prediction that systematically evaluates both neurosymbolic and foundation model-based approaches (i.e., large language models and multimodal language models). CUEBench fills a critical gap in current benchmarking by targeting high-level cognitive reasoning under perceptual incompleteness, reflecting real-world challenges faced by autonomous systems.
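The flavor of knowledge-based entity prediction over a scene KG can be sketched as follows (a toy schema of our own devising, not DSceneKG's actual vocabulary):

```python
# Toy illustration of KEP: a driving scene as subject-predicate-object
# triples, with background co-occurrence knowledge used to infer entities
# that no sensor detected but that are likely present given the context.
# The predicate names here are invented for illustration.

scene_kg = [
    ("scene1", "includes", "crosswalk"),
    ("scene1", "includes", "traffic_light"),
    ("crosswalk", "coOccursWith", "pedestrian"),
    ("traffic_light", "coOccursWith", "vehicle"),
]

def kep(kg, scene):
    """Return (observed entities, predicted unobserved-but-likely entities)."""
    observed = {o for s, p, o in kg if s == scene and p == "includes"}
    predicted = {o for s, p, o in kg
                 if p == "coOccursWith" and s in observed and o not in observed}
    return observed, predicted

observed, predicted = kep(scene_kg, "scene1")
# a pedestrian and a vehicle are plausible even though neither was detected
```

A learned system replaces the hard co-occurrence lookup with KG embeddings, but the task structure, reasoning from observed context to unobserved entities, is the same.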

Causal Neuro-symbolic Artificial Intelligence: Synergy between Neuro-symbolic and Causal Artificial Intelligence

Thursday, June 19, 2025 - 09:00 am
AI Institute Seminar room
Author : Utkarshani Jaimini
Advisor: Dr. Amit Sheth
Date: June 19th, 2025
Time: 09:00 am
Place: AI Institute Seminar room
 

Abstract

Understanding and reasoning about cause and effect is innate to human cognition. In everyday life, humans continuously engage in causal reasoning and hypothetical retrospection to make decisions, plan actions, and interpret events. This cognitive ability allows us to ask questions such as: “What caused this situation?”, “What will happen if I take this action?”, or “What would have happened had I chosen differently?” This intuitive capacity to form mental models of the world, infer causal relationships, and reason about alternative scenarios, particularly counterfactuals, is central to our intelligence and adaptability. In contrast, current machine learning (ML) and artificial intelligence (AI) systems, despite significant advances in learning from large-scale data and representing knowledge across time and space, lack a fundamental understanding of causality and counterfactual reasoning. This limitation poses challenges in high-stakes domains such as healthcare, autonomous systems, and manufacturing, where causal reasoning is indispensable for explanation, decision-making, and generalization. As argued by researchers such as Judea Pearl and Gary Marcus, endowing AI systems with causal reasoning capabilities is critical for building robust, generalizable, and human-aligned intelligence.
This dissertation proposes a novel framework: Causal Neuro-Symbolic (Causal NeSy) Artificial Intelligence, an integration of causal modeling with neuro-symbolic (NeSy) AI. The goal of Causal NeSy AI is to bridge the gap between statistical learning and causal reasoning, enabling machines to model, understand, and reason about the underlying causal structure of the world while leveraging the strengths of both neural and symbolic representations. At its core, the framework leverages causal Bayesian networks, encoded through a series of ontologies, to represent and propagate structured causal knowledge. By unifying structured causal symbolic knowledge with neural inference, the framework introduces a scalable and explainable causal reasoning pipeline grounded in knowledge graphs. The proposed Causal NeSy framework has been validated on the CLEVRER-Humans benchmark dataset, which involves video-based event causality annotated by human experts, and in several real-world domains, including smart manufacturing and autonomous driving, areas that require high levels of robustness, interpretability, and causal understanding. Empirical results demonstrate that integrating causal modeling into NeSy architectures significantly enhances both performance and explainability, particularly in settings with limited data or complex counterfactual scenarios.

This dissertation advances the field of AI by proposing a unified framework that imbues NeSy systems with causal reasoning capabilities. By enabling machines to model, infer, and reason about causal structures, this work takes a crucial step toward building more human-aligned, trustworthy, and generalizable AI systems. It introduces scalable, explainable, and bias-aware methodologies for causal reasoning, moving AI closer to human-like understanding. The contributions pave the way for future intelligent systems capable of meaningful intervention, retrospective explanation, and counterfactual reasoning.
The Causal NeSy AI paradigm opens promising avenues for future research at the intersection of causality, learning, and reasoning, a necessary convergence on the path to truly intelligent systems.
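The role of interventions in a causal Bayesian network, the core formalism the framework builds on, can be illustrated with a stdlib-only toy model (illustrative only; the dissertation's models are ontology-backed and learned):

```python
import random

# Toy causal Bayesian network: rain influences the sprinkler, and both
# influence wet grass. An intervention do(sprinkler=True) performs "graph
# surgery": it overrides the sprinkler's natural mechanism, which is what
# distinguishes intervening from merely conditioning on an observation.

def sample(do_sprinkler=None):
    rain = random.random() < 0.3
    # observationally, the sprinkler runs mostly on dry days
    sprinkler = random.random() < (0.05 if rain else 0.5)
    if do_sprinkler is not None:      # intervention: sever incoming edges
        sprinkler = do_sprinkler
    wet = random.random() < (0.9 if (rain or sprinkler) else 0.1)
    return wet

random.seed(0)
p_wet = sum(sample() for _ in range(20000)) / 20000
p_wet_do_on = sum(sample(do_sprinkler=True) for _ in range(20000)) / 20000
# forcing the sprinkler on raises P(wet) above the observational rate
```

Counterfactual queries ("what would have happened had I chosen differently?") build on the same surgery, applied after conditioning on what actually occurred.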

An Efficient Detection and Deep Clustering Based Pipeline for Reliable Rodent Ultrasonic Vocalization Analysis

Friday, May 16, 2025 - 10:00 am
Online

THESIS DEFENSE
 

Author : Sabah Shahnoor Anis
Advisor: Dr. Christian O'Reilly
Date: May 16, 2025
Time: 10:00 am
Place: Teams
Meeting Link: https://teams.microsoft.com/l/meetup-join/19%3ameeting_ODY3YTA2NWMtMjk5…
 

Abstract

Ultrasonic vocalizations (USVs) are critical for understanding rodents' emotional states and social behaviors. However, manual analysis of USVs is time-consuming, subjective, and prone to errors. This thesis presents an automated pipeline that addresses these challenges by performing efficient USV detection and clustering. The proposed approach significantly reduces the time and effort needed to analyze USV data while improving accuracy and reproducibility.


To address this gap, we introduce ContourUSV, a five-step pipeline for USV detection. It begins with generating spectrograms from audio recordings, which are then pre-processed to enhance the contrast between USVs and background noise. Key steps include median filtering, global thresholding, and morphological operations to clean the spectrograms and highlight the contours of the USVs. Contours are detected using the OpenCV findContours function, and bounding boxes are drawn around each detected USV. The bounding box coordinates are then used to compute time and frequency annotations, allowing precise temporal and spectral localization of the USVs. The detection system is validated against manually annotated datasets, demonstrating high precision, recall, and overall reliability.
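The thresholding-and-contour idea behind these steps can be sketched in pure Python; this is a toy stand-in (a hand-made binary "spectrogram" and a BFS connected-component pass), whereas the real pipeline operates on full spectrograms and uses OpenCV's findContours:

```python
from collections import deque

spec = [  # rows = frequency bins, cols = time frames (already thresholded)
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
]

def bounding_boxes(grid):
    """Group 'on' pixels into components; return (t0, t1, f0, f1) boxes."""
    seen, boxes = set(), []
    for r, row in enumerate(grid):
        for c, v in enumerate(row):
            if v and (r, c) not in seen:
                q, comp = deque([(r, c)]), []
                seen.add((r, c))
                while q:                      # BFS over 4-connected pixels
                    y, x = q.popleft()
                    comp.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < len(grid) and 0 <= nx < len(grid[0])
                                and grid[ny][nx] and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            q.append((ny, nx))
                ys = [y for y, _ in comp]
                xs = [x for _, x in comp]
                boxes.append((min(xs), max(xs), min(ys), max(ys)))
    return sorted(boxes)

boxes = bounding_boxes(spec)  # each box maps to a time/frequency annotation
```

Scaling each box by the frame hop and frequency-bin width converts pixel coordinates into the temporal and spectral annotations the pipeline emits.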

In the clustering phase, deep unsupervised clustering of USVs (DUCUSV) is introduced, where preprocessed USV contours are further analyzed to reveal distinct patterns in vocal behavior. Dimensionality reduction is achieved using deep autoencoders, which compress the high-dimensional spectrogram data into a latent space suitable for clustering. After testing multiple unsupervised clustering algorithms, a hierarchical clustering approach called HDBSCAN is applied to group the USVs based on their spectro-temporal features. Various scores (Validity Index, Silhouette Coefficient, Calinski-Harabasz Index, and Davies-Bouldin Index) are used to evaluate the algorithms and determine the optimal number of clusters. Lastly, two data dimensionality reduction algorithms (Principal Component Analysis (PCA) and t-distributed Stochastic Neighbor Embedding (t-SNE)) are employed to visualize the results.
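The silhouette coefficient used to compare clusterings can be computed directly from its definition; below is a stdlib-only sketch on toy one-dimensional "latent codes" (the pipeline applies it to autoencoder embeddings):

```python
# Silhouette coefficient: for each point, a = mean distance to its own
# cluster, b = mean distance to the nearest other cluster, and the score
# is (b - a) / max(a, b), averaged over all points. Values near 1 mean
# tight, well-separated clusters.

def silhouette(points, labels):
    def dist(a, b):
        return abs(a - b)
    scores = []
    for i, (p, l) in enumerate(zip(points, labels)):
        same = [dist(p, q) for j, (q, m) in enumerate(zip(points, labels))
                if m == l and j != i]
        a = sum(same) / len(same) if same else 0.0
        b = min(
            sum(dist(p, q) for q, m in zip(points, labels) if m == other)
            / labels.count(other)
            for other in set(labels) if other != l
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

tight = silhouette([0.0, 0.1, 5.0, 5.1], [0, 0, 1, 1])
loose = silhouette([0.0, 2.4, 2.6, 5.0], [0, 0, 1, 1])
# well-separated clusters score close to 1; overlapping ones near 0 or below
```

Comparing such scores across candidate algorithms and cluster counts is how the pipeline selects its final clustering.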

For robustness and reliability, the ContourUSV detector was tested against three other state-of-the-art systems on two datasets containing various rodent USV recordings. On average across the two datasets, ContourUSV outperformed the other three systems with a 1.51× improvement in precision, 1.17× in recall, 1.80× in F1 score, and 1.49× in specificity, while achieving an average speedup of 117.07×. The DUCUSV pipeline was developed by comparing different autoencoder architectures and clustering algorithms. Our benchmark results show better performance with dense autoencoders and hierarchical clustering, based on evaluation metrics such as the silhouette score and validity index. To address the scarcity of open-access datasets in this research area, we made a subset of our internal dataset open-access. This contribution will allow the research community to benchmark the reliability of USV detection and clustering tools on a common dataset.

This fully automated USV detection and clustering pipeline offers a scalable, objective, and accurate solution for rodent ultrasonic vocalization analysis. The integration of advanced clustering techniques enables researchers to uncover novel patterns in vocal behavior, providing deeper insights into rodent communication. The work presented in this thesis contributes to more efficient and reliable methods for USV analysis, supporting future research in behavioral neuroscience.

Physics Oriented Deep Learning for Material Prediction and Generation

Thursday, May 1, 2025 - 10:00 am
Online
DISSERTATION DEFENSE

Department of Computer Science and Engineering

University of South Carolina

Author : Nihang Fu
Advisor: Dr. Jianjun Hu
Date: May 1st, 2025
Time: 10:00 am
Place: Zoom

Abstract

The discovery of new materials is critical to advancing various industries, but traditional experimental methods for materials discovery remain slow and resource-intensive. Recent advances in machine learning (ML), particularly deep learning (DL), have greatly improved and accelerated two main aspects of modern computational material discovery: material design (e.g., material generation) and material screening (e.g., property prediction). However, a key challenge remains: standard ML models often struggle to perform domain-specific tasks effectively. Incorporating domain-specific knowledge, specifically the underlying physics of materials, into ML/DL models is key to improving the accuracy and reliability of material generation and prediction models.

This dissertation discusses and addresses this challenge through physics-oriented deep learning for computational materials discovery. In the first topic, we explore the use of transformer-based deep learning language models for the generative design of inorganic material compositions. Experiments showed that our transformer models can capture key physicochemical knowledge, such as charge neutrality and balanced electronegativity, and generate novel and chemically plausible inorganic material compositions. As an additional demonstration of the ability of transformer neural networks to capture physics and chemistry from raw compound data, in the second topic we propose a bidirectional encoder transformer-based model, BERTOS, for atomic oxidation state prediction from composition alone, which has significant applications in crystal structure prediction and virtual screening of candidate materials. Compared to the heuristic OS assignment algorithm in Pymatgen, BERTOS achieves 97.61% accuracy versus 37.82%.
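The charge-neutrality constraint that any valid oxidation-state assignment must satisfy is easy to state in code (BERTOS itself is a learned transformer; this sketch only checks the physical constraint, with illustrative data):

```python
# Charge neutrality: the element counts of a composition, weighted by
# their assigned oxidation states (OS), must sum to zero. For Fe2O3,
# Fe = +3 and O = -2 gives 2*(+3) + 3*(-2) = 0, so the assignment is valid.

def charge_neutral(composition, oxidation_states):
    """composition: {element: count}; oxidation_states: {element: OS}."""
    return sum(composition[el] * oxidation_states[el]
               for el in composition) == 0

valid = charge_neutral({"Fe": 2, "O": 3}, {"Fe": +3, "O": -2})    # Fe2O3
invalid = charge_neutral({"Fe": 2, "O": 3}, {"Fe": +2, "O": -2})  # net -2
```

A model that predicts oxidation states from composition alone must, in effect, internalize this constraint, which is why it serves as a probe of learned chemistry.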
We further explore physics-guided deep learning for materials property prediction, emphasizing the importance of incorporating physical information in the input features to guide model training, which helps the model produce more physically accurate and reliable results, especially when data is limited or noisy. In the third topic, we propose a novel framework called DSSL (Dual Self-Supervised Learning) to overcome the data scarcity issue in materials property prediction. This is a two-stage physics-guided approach built on graph neural networks that leverages both large-scale unlabeled and limited labeled data. It includes three complementary self-supervised learning (SSL) strategies: mask-based generative SSL, contrastive learning SSL, and physics-guided predictive SSL.

In the fourth topic, we investigate the impact of physical encoding on ML performance for property prediction and find that physical encoding of atoms can significantly improve generalization, especially for out-of-distribution samples. Finally, in the fifth topic, we investigate the issue of data redundancy in materials science datasets, arguing that standard random data splitting leads to overestimation of machine learning model performance, particularly concerning generalization to new materials. To address this, we developed MD-HIT algorithms that control both composition- and structure-based redundancy using various similarity metrics, providing a more objective evaluation of ML models' true extrapolation capabilities and encouraging models to learn the underlying physics rather than overfit redundant data with low generalization performance.

Career in Security

Wednesday, April 16, 2025 - 03:55 pm
300 Main St. Room B201

Talk by David Weston, VP of Security at Microsoft.

Explainable Process Recommendation through Multi-Contextual Grounding of Dynamic Multimodal Process Knowledge Graphs

Friday, March 21, 2025 - 10:30 am
Zoom and AI Institute, Seminar Room 529

DISSERTATION DEFENSE

Author : Revathy Venkataramanan Chandrasekaran
Advisor: Dr. Amit Sheth
Date: March 21, 2025
Time: 10:30 am
Place: Zoom and AI Institute, Seminar Room 529
Meeting Link: https://sc-edu.zoom.us/j/8440139296
Meeting ID: 844 013 9296

Abstract

Can I eat this food or not, and why? Which AI pipeline is best for a given task and dataset? These questions differ from factual question-answering tasks because they involve processes with interacting entities. Recipes consist of ingredients, methods, and interactions, while AI pipelines include datasets, models, and tasks. Each entity must be analyzed independently, and a collective inference, known as compositional reasoning, is required to draw a conclusion.

Existing process recommendation methods rely on the availability of structured data but struggle with unstructured data such as recipes and AI pipeline descriptions. These datasets are often lengthy and noisy, making it hard to capture interactions and derive relevant insights. Additionally, natural language descriptions do not provide the necessary domain knowledge. For example, recipes do not state that potatoes are healthy carbs with a high glycemic index. Domain-specific knowledge is needed for effective analysis and recommendations.
While neural networks excel in pattern recognition, they struggle with compositional reasoning. This work introduces a neurosymbolic framework for explainable process recommendation using Dynamic Multimodal Process Knowledge Graphs (DMPKGs). DMPKGs provide structured process representations grounded in multi-contextual knowledge for reasoning, explainability, and traceability while utilizing neural networks for pattern recognition. They enable modular entity inference and capture interactions for dynamic decision-making. DMPKGs allow continuous updates and store multimodal data, improving recommendation accuracy and explainability. Two use cases, recipe suitability analysis and AI pipeline recommendation, are explored to demonstrate the effectiveness of this approach in process recommendation.

Hallucinations in Large Foundation Models: Characterization, Quantification, Detection, Avoidance, and Mitigation

Tuesday, March 18, 2025 - 09:00 am
Online

DISSERTATION DEFENSE
Department of Computer Science and Engineering

University of South Carolina

Author : Vipula Rawte
Advisor: Dr. Amit Sheth
Date: March 18, 2025
Time:  9:00 am – 11:00 am
Place: Zoom and AI Institute, Seminar Room 529
Meeting Link: https://sc-edu.zoom.us/j/83442966750
Meeting ID: 834 4296 6750

Abstract

Deception is inherent in human interactions, and AI systems increasingly exhibit similar tendencies, mainly through hallucinations - plausible yet incorrect outputs stemming from their design, memory limitations, and statistical nature. As AI progresses into Wave 2 - Generative AI, as outlined by Mustafa Suleyman in The Coming Wave, models like GPT and DALL-E are revolutionizing fields like healthcare and education. However, their rapid adoption brings misinformation, safety, and ethics challenges. Notable cases, such as Air Canada’s chatbot providing false information, highlight the real-world impact of AI hallucinations, a phenomenon so prevalent that “hallucinate” was named Cambridge Dictionary’s Word of the Year for 2023.

This dissertation tackles AI hallucinations through six key areas: (i) Characterization - developing a taxonomy and benchmark (HILT); (ii) Quantification - introducing evaluation metrics (HVI and HVI_auto); (iii) Detection - proposing a span-based Factual Entailment method to improve accuracy; (iv) Avoidance - creating techniques like “Sorry, Come Again?” (SCA) and [PAUSE] injection for better responses; (v) Mitigation - developing RADIANT, a retrieval-augmented framework for entity-context alignment; and (vi) Multi-modal - constructing VHILT and ViBe datasets for hallucination analysis in image-to-text and text-to-video models. This research makes generative AI more reliable and trustworthy by systematically addressing AI hallucinations.

Exploiting Structures in Reinforcement Learning: Multi-Agent Homogeneity, Euclidean Symmetry, and Natural Languages

Tuesday, March 11, 2025 - 03:00 pm
Online
DISSERTATION DEFENSE

Author : Dingyang Chen
Advisor: Dr. Qi Zhang
Date: March 11, 2025
Time: 3:00 pm – 5:00 pm
Place: Zoom

Abstract

Reinforcement learning (RL) has emerged as a powerful paradigm for decision-making in complex environments. However, many RL tasks exhibit inherent structural properties—such as homogeneity, symmetry, and linguistic patterns—that are often underutilized, leading to inefficiencies in learning and generalization. This dissertation systematically exploits these structures to improve the efficiency, scalability, and robustness of RL algorithms across multi-agent and sequential decision-making settings.

First, we investigate homogeneity in multi-agent systems, where agents share similar roles and objectives. By leveraging this structure, we develop communication-efficient actor-critic methods for homogeneous Markov games, enabling scalable learning with reduced coordination overhead.
Second, we introduce Euclidean symmetry in RL, demonstrating how equivariant function approximators can significantly enhance sample efficiency and generalization in spatially structured tasks, such as robotic control.
Third, we integrate large language models (LLMs) into RL to improve sequential decision-making while avoiding expensive retraining. Our framework efficiently combines LLM inference with RL-based optimization, leading to better adaptability and reduced computational costs in contextual decision-making tasks.
Finally, we explore Markov Potential Games (MPGs), a subclass of multi-agent RL with inherent homogeneity. We develop best-response learning dynamics that mitigate non-stationarity and improve equilibrium quality, providing theoretical guarantees on convergence and the first known Price of Anarchy (POA) bounds for policy gradient methods in MPGs.
Through extensive theoretical analysis and empirical validation on diverse benchmarks, this work demonstrates the power of structural exploitation in RL. By leveraging homogeneity, symmetry, and natural languages, this research lays the foundation for more efficient, generalizable, and scalable RL algorithms, with applications in multi-robot coordination, traffic management, recommendation systems, and strategic game playing.
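The rotation-equivariance property exploited in the second contribution can be illustrated with a toy two-dimensional policy (our own construction, not the dissertation's architecture): a policy pi is equivariant when acting on a rotated state yields the rotated action, pi(R s) = R pi(s).

```python
import math

# A scaled-identity policy commutes with every planar rotation, so it is
# rotation-equivariant by construction. Equivariant function approximators
# bake this commutation property into the network, so the symmetry never
# has to be relearned from data.

def rotate(v, theta):
    """Rotate a 2-D vector by angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def pi(state, gain=0.5):
    """Toy equivariant policy: scale the state toward/away from origin."""
    return (gain * state[0], gain * state[1])

state = (1.0, 2.0)
theta = math.pi / 3
lhs = pi(rotate(state, theta))   # act on the rotated state
rhs = rotate(pi(state), theta)   # rotate the original action
# lhs and rhs agree up to floating-point error
```

For spatially structured tasks such as robotic control, enforcing this property shrinks the effective hypothesis space, which is the source of the sample-efficiency gains described above.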