Computing Research Week - Open House 2023

February 23 and 24

NUS School of Computing (SoC) is organizing a two-day research program where students and faculty will showcase some of the department’s top-quality research work. While the program is open to everyone in SoC, it is mainly aimed at potential incoming PhD students and enrolled first-year PhD students.


Thursday, 23/2/2023

Venue: Multipurpose Hall 1 (COM3-01-26)

08:45 – 09:15 Registration (Foyer)

09:15 – 10:15 Welcome and IS Research Area Overview


10:45 – 12:15 Overview of SoC Research Area


13:30 – 15:00 Faculty Talks

13:30 – 14:00 Towards Trustworthy Robots: From Touch to Talk - Harold Soh

14:00 – 14:30 Dynamic Data Race Prediction - Umang Mathur

14:30 – 15:00 A Unified Hardware-Software Interface For Security - Prateek Saxena


Venue: Atrium, outside Multipurpose Hall 1 (COM3-01-26)

15:30 – 18:00 Poster + Research Lab Open House


Friday, 24/2/2023

Venue: Multipurpose Hall 1 (COM3-01-26)

09:30 – 11:00 Faculty Talks

09:30 – 10:00 Blockchain Security and the Coalition Formation of its Writers - Li Xiaofan

10:00 – 10:30 The Good, the Bad, and the Ugly Influence of Individual Data Points in Machine Learning - Reza Shokri

10:30 – 11:00 To Catch a (Distributed) Thief - Seth Gilbert


11:30 – 13:00 Student Talks

11:30 – 12:00 What's new in Congestion Control? - Raj Joshi

12:00 – 12:30 Short-Video Marketing in E-commerce: Analyzing and Predicting Consumer Response - Guo Yutong

12:30 – 13:00 Building Explainable AI with Human Centered Principles - Wencan Zhang


14:00 – 15:30 Student Talks + Meeting with research group/PI.

14:00 – 14:30 Towards a Theory-Based Evaluation of Explainable Predictions in Healthcare - Suparna Ghanvatkar

14:30 – 15:00 Causal Recommender Systems - Wang Wenjie

15:00 – 15:30 On the Effective Horizon of Inverse Reinforcement Learning - Yiqing Xu


16:00 – 17:30 Student Talks + Meeting with research group/PI.

16:00 – 16:30 Learning Causal DAGs using Adaptive Interventions - Davin Choo

16:30 – 17:00 Testing Database Engines via Query Plan Guidance - Jinsheng Ba

17:00 – 17:30 Hardness of Testing Machine Learning - Teodora Baluta

17:30 – 18:00 Closing Session (Feedback)

Program Details

Thursday, 23/2/2023, 13:30 – 14:00 Towards Trustworthy Robots: From Touch to Talk - Harold Soh

In this talk, I will give an overview of our recent work in enabling robots to feel the world around them (the sense of touch) and to communicate with humans. We will discuss work on tactile learning using state-of-the-art touch sensors on robots and show that robots can handle delicate objects and feel using tools. Then we discuss work on modeling humans for the purposes of communicating relevant task information. I will end with some unpublished work on large-language models for translation to planning goals for robots.

Thursday, 23/2/2023, 14:00 – 14:30 Dynamic Data Race Prediction - Umang Mathur

Concurrent programs are notoriously hard to write correctly, as scheduling nondeterminism introduces subtle errors that are hard both to detect and to reproduce. Data races are arguably the most insidious amongst concurrency bugs and extensive research efforts have been dedicated to detecting them effectively. A data race occurs when memory-conflicting actions are executed concurrently. Consequently, considerable effort has been made towards developing efficient techniques for race detection. The preferred approach to detect data races is through dynamic analysis, where one observes an execution of a concurrent program and checks for the presence of data races in the execution observed. Traditional dynamic race detectors rely on Lamport's happens-before (HB) partial order, which can be conservative, and are often unable to discover simple data races, even after executing the program several times. Dynamic data race prediction aims to expose data races that can otherwise be missed by traditional dynamic race detectors (such as those based on HB), by inferring data races in alternate executions of the underlying program, without re-executing it. In this talk, I will discuss the fundamentals of and recent algorithmic advances in data race prediction.

Thursday, 23/2/2023, 14:30 – 15:00 A Unified Hardware-Software Interface For Security - Prateek Saxena

There are a staggering number of data breaches and software exploits disclosed every year. At the heart of this phenomenon is the lack of hardware abstractions to isolate data and limit its use to protection domains in a fine-grained and flexible manner. For example, modern architectures provide many specialized low-level abstractions for partial memory safety, virtualization, process isolation, secure enclaves, and safe synchronization over shared memory. But with so many options comes splintering---no architecture supports all of these specialized abstractions due to their complexity and performance cost. As a result, software cannot always rely on their availability for security. In this talk, I will describe a unified hardware interface that aims to cater to all the security goals mentioned above simultaneously. We challenge several incumbent ideas found in modern hardware and OS design: the idea of fixed privileged rings, the ubiquitous use of access control, and the separation of synchronization primitives from the actual resources accessed. Early simulations suggest that our interface is feasible to implement in RISC-V processors.

Friday, 24/2/2023, 09:30 – 10:00 Blockchain Security and the Coalition Formation of its Writers - Li Xiaofan

The aggregated computational power working on a proof-of-work system is usually considered the only determinant of the system's security. This argument is based on the assumption that the writers are acting independently. However, decentralization does not necessarily imply independence, especially when the number of significant writers of a blockchain platform, such as the Bitcoin Network, is relatively small. Writers can make side payments and form coalitions, which can threaten the platform's security. We discuss the process of a double spending attack, also known as a 51% attack, and analyze the interactions among the attacker and the other writers with a game-theoretic model. We show that a necessary condition for a proof-of-work system to function is for its writers to have sufficiently strong preferences for its integrity. With such preferences, a proof-of-work system can be substituted by a proof-of-stake system without becoming less secure. The justification for the resources used and pollution generated by proof-of-work systems is therefore unclear.

Friday, 24/2/2023, 10:00 – 10:30 The Good, the Bad, and the Ugly Influence of Individual Data Points in Machine Learning - Reza Shokri

Machine learning models are typically assessed based on their average performance, overlooking the impact of individual data points on the model. In this talk, we will explore the influence of data points on model behavior and how it explains significant machine learning phenomena, including memorization and information leakage, fairness, robustness, and data valuation.

Friday, 24/2/2023, 10:30 – 11:00 To Catch a (Distributed) Thief - Seth Gilbert

Over the last several years we have seen a boom in the development of new Byzantine agreement protocols, in large part driven by the excitement over blockchains and cryptocurrencies. A key goal of Byzantine agreement protocols is to ensure correct behavior, even if some of the participants are malicious. What if, instead of preventing bad behavior by a malicious attacker, we guarantee "accountability," i.e., we can provide irrefutable evidence of the bad behavior and the identity of the perpetrator of those illegal actions? Much in the way we prevent crime in the real world, we can prevent bad behavior in a distributed system: either the protocol succeeds, or alternatively we record sufficient information to catch the criminal and take remedial actions. (Accountability has been increasingly discussed as a desirable property in blockchains like Ethereum, which "slashes" the stake of cheating users.) In this talk, we give an overview of accountability in distributed systems, with a focus on problems like consensus.

Friday, 24/2/2023, 11:30 – 12:00 What's new in Congestion Control? - Raj Joshi

Research in Internet congestion control has seen a renaissance in the past few years driven by two key developments. In 2016, Google proposed and deployed BBR, a congestion control algorithm that represents a departure from traditional loss-based algorithms like CUBIC and New Reno. Internet transport is also moving to the userspace, with the adoption of QUIC, a new transport stack that is already widely deployed and is set to be the default with HTTP/3. While both these developments pose their own unique challenges, they both introduce a large amount of heterogeneity in the Internet's congestion control landscape. In my talk, I will present two of our recent works that study these two developments and their contribution to making the Internet the 'zoo' that it is today - and how taming this 'zoo' poses the biggest challenge we've faced in congestion control so far.

Friday, 24/2/2023, 12:00 – 12:30 Short-Video Marketing in E-commerce: Analyzing and Predicting Consumer Response - Guo Yutong

This study analyzes and predicts consumer viewing response to e-commerce short-videos (ESVs). We first construct a large-scale ESV dataset that contains 23,001 ESVs across 40 product categories. The dataset consists of the consumer response label in terms of average viewing durations and human-annotated ESV content attributes. Using the constructed dataset and mixed-effects model, we find that product description, product demonstration, pleasure, and aesthetics are four key determinants of ESV viewing duration. Furthermore, we design a content-based multimodal-multitask framework to predict consumer viewing response to ESVs. We propose the information distillation module to extract the shared, special, and conflicted information from ESV multimodal features. Additionally, we employ a hierarchical multitask classification module to capture feature-level and label-level dependencies. We conduct extensive experiments to evaluate the prediction performance of our proposed framework. Taken together, our paper provides theoretical and methodological contributions to the IS and relevant literature.

Friday, 24/2/2023, 12:30 – 13:00 Building Explainable AI with Human Centered Principles - Wencan Zhang

Explainable AI (XAI) can help people better understand models and enable humans to gain more trust in model decision-making. Because a large number of AI products are deployed in daily life and their target users are laypeople, it is important to provide explanations that end users can easily interpret. In this talk, we will cover human-centered explainable AI in two topics. Firstly, inspired by theories from cognitive psychology, we propose the XAI Perceptual Processing Framework and the RexNet model for relatable explainable AI with Contrastive Saliency, Counterfactual Synthetic, and Contrastive Cues explanations. We investigate the application of vocal emotion recognition and share insights about providing relatable explainable AI in perception applications. Secondly, we observe that saliency map explanations tend to become distorted and misleading when explaining predictions of images under systematic error. We present Debiased-CAM to achieve explanation faithfulness and robust performance on a wide range of image applications with data bias.

Friday, 24/2/2023, 14:00 – 14:30 Towards a Theory-Based Evaluation of Explainable Predictions in Healthcare - Suparna Ghanvatkar

Modern Artificial Intelligence (AI) models offer high predictive accuracy but often lack interpretability with respect to reasons for predictions. Explanations for predictions are usually necessary in making high-stakes clinical decisions. Hence, many Explainable AI (XAI) techniques have been designed to generate explanations for predictions from blackbox models. However, there are no rigorous metrics to evaluate these explanations, especially with respect to their usefulness to clinicians. We develop a principled method to evaluate explanations by drawing on theories from social science and accounting for specific requirements of the clinical context. As a case study, we use our metric to evaluate explanations generated by two popular XAI algorithms in the task of predicting the onset of Alzheimer’s disease using genetic data. Our preliminary findings are promising and illustrate the versatility and utility of our metric. Our work contributes to the practical and theoretical development of XAI techniques and Clinical Decision Support Systems.

Friday, 24/2/2023, 14:30 – 15:00 Causal Recommender Systems - Wang Wenjie

Recommender systems have been widely deployed to alleviate information overload on a wide range of platforms such as e-commerce and social networks. Technically speaking, recommender models learn personalized user preference from users' historical interactions (e.g., clicks). However, many interference factors (e.g., items' deceptive titles) will affect the users' interaction process, injecting data bias into the interactions. Such bias causes the historical interactions not to be an ideal representation of user preference, hindering the accurate preference learning of recommender models. In this talk, we focus on four essential bias issues in recommendation: clickbait bias, bias amplification, filter bubbles, and out-of-distribution (OOD) bias. To alleviate these issues, we propose a causal recommender framework, which first studies how the bias issues are generated, and then mitigates them by causal modeling. Specifically, 1) we mitigate the clickbait bias by estimating the causal effect of exposure features on recommendations and reducing the harmful effect via counterfactual inference; 2) we alleviate the bias amplification of existing recommender models by causal intervention; 3) we contribute a user-controllable inference strategy to let users freely control the effect of filter bubbles; and 4) we reduce the OOD bias by using causal representation learning to model causal relationships and leveraging such robust relationships to predict the shifted user preference. Finally, extensive experiments on real-world datasets demonstrate the effectiveness of our causal framework.

Friday, 24/2/2023, 15:00 – 15:30 On the Effective Horizon of Inverse Reinforcement Learning - Yiqing Xu

Inverse reinforcement learning (IRL) algorithms typically rely on (forward) reinforcement learning or planning over a given time horizon to compute an approximately optimal policy for a hypothesized reward function and then match this policy with expert demonstrations. The time horizon plays a critical role in determining the accuracy of the reward estimate as well as the computational efficiency of IRL algorithms. Interestingly, an effective horizon shorter than the ground-truth value often produces better results faster. This work formally analyzes this phenomenon and provides an explanation: the effective horizon controls the complexity of an induced policy class and mitigates overfitting when limited data are available. This analysis leads to a principled choice of the effective horizon for IRL. It also suggests a change in the classic IRL formulation: it is more natural to learn the reward and the effective horizon jointly rather than the reward alone with a fixed horizon. Our experimental results confirm the theoretical analysis.

Friday, 24/2/2023, 16:00 – 16:30 Learning Causal DAGs using Adaptive Interventions - Davin Choo

Suppose that the underlying data is generated according to some hidden causal directed acyclic graph (DAG). It is well-known that one can recover causal graphs only up to a Markov equivalence class using observational data, and additional assumptions or interventional data are needed to recover the underlying ground truth causal graph. In this talk, we explore the problem of recovering causal DAGs using as few adaptive interventions as possible, through the lens of two problems: verification and adaptive search. The problem of verification asks whether a given DAG is the true causal DAG and the number of interventions needed serves as a natural lower bound on the number of interventions needed for any search algorithm, adaptive or not. Prior to our work, only an approximation to the verification problem was known and only theoretical guarantees for adaptive search on special classes of graphs were known. In the talk, I will present a simple complete characterization of the verification problem and a simple adaptive search algorithm that provably uses at most a logarithmic factor more interventions than necessary on any input graph. The techniques used are mostly graph-theoretic, adapted to the context of causal graph discovery. If time permits, I will also share some newer results in the setting of subset verification and adaptive search, and searching with imperfect advice. This talk is based on joint work with Arnab Bhattacharyya, Themis Gouleakis, and Kirankumar Shiragur.

Friday, 24/2/2023, 16:30 – 17:00 Testing Database Engines via Query Plan Guidance - Jinsheng Ba

Database systems are widely used to store and query data. Test oracles have been proposed to find logic bugs in such systems, that is, bugs that cause the database system to compute an incorrect result. To realize a fully automated testing approach, such test oracles are paired with a test case generation technique; a test case refers to a database state and a query on which the test oracle can be applied. In this work, we propose the concept of Query Plan Guidance (QPG) for guiding automated testing towards "interesting" test cases. SQL and other query languages are declarative. Thus, to execute a query, the database system translates every operator in the source language to one of the potentially many so-called physical operators that can be executed; the tree of physical operators is referred to as the query plan. Our intuition is that by steering testing towards exploring a variety of unique query plans, we also explore more interesting behaviors---some of which are potentially incorrect. To this end, we propose a mutation technique that gradually applies promising mutations to the database state, causing the DBMS to create potentially unseen query plans for subsequent queries. We applied our method to three mature, widely-used, and extensively-tested database systems---SQLite, TiDB, and CockroachDB---and found 53 unique, previously unknown bugs. Our method exercises 4.85x-408.48x more unique query plans than a naive random generation method and 7.46x more than a code coverage guidance method. Since most database systems---including commercial ones---expose query plans to the user, we consider QPG a generally applicable, black-box approach and believe that the core idea could also be applied in other contexts (e.g., to measure the quality of a test suite).

Friday, 24/2/2023, 17:00 – 17:30 Hardness of Testing Machine Learning - Teodora Baluta

Testing for desirable properties such as adversarial robustness is hard. Despite this worst-case hardness, efficient attacks can still provide adversarial examples as counterexamples of robustness. Is this due to some structure they are exploiting? In my talk, I will show that a measure of the hard instances is correlated with the density of adversarial examples under uniform random sampling. In order to estimate the density, we provide techniques that soundly and approximately compute the density. Besides robustness, membership inference (MI) attacks have been proposed to test privacy weaknesses in machine learning models. It is not well understood, however, why they arise. Are they a natural consequence of imperfect generalization only? Which underlying causes should we address during training to mitigate these attacks? Towards answering such questions, we propose the first approach to explain MI attacks and their connection to generalization based on principled causal reasoning.

Research Group Meeting Signup

Applicants can sign up for meetings with research groups.

Additional Information for Visiting International Students

Travel to Singapore

For up-to-date information on whether you need a visa to enter Singapore, as well as public health requirements, please check the Singapore Immigration and Checkpoints Authority website.

Please note that the provided offer letter can serve as a visa support letter.

Accommodation in Singapore

Hotel: Park Avenue Rochester, Lyf One-North Singapore, More Hotels around NUS

Commute in Singapore

Bus travel is extensive, affordable and convenient in Singapore (Google Maps has real-time bus and metro data, and you can pay for your ride with your credit card). In addition, travel by taxi or ride-share apps (Grab app or GoJek app) is fast and very reasonably priced.

Data roaming / WiFi

Singtel has a hi!Tourist visitor SIM card package with pickup locations in Changi airport, as well as other locations. The card also comes with an EZ-Link refillable card which allows you to use public transit (buses, MRT (subway), etc.) systems if you do not have a credit card. Here are some additional SIM purchase locations at the airport.


Questions related to the event (e.g., venue, how to reach): Soundarya Ramesh, Nitya Lakshmanan

Questions related to the visit (e.g., visa, offer letter, remuneration): Esther Low Xinyi


Staff Committee
Chan Mun Choon
Xiao Xiaokui
Chuan Hoo Tan
Qiao Dandan
Vaibhav Rajan
Nitya Lakshmanan
Wee Sun Lee

Student Committee
Soundarya Ramesh

Admin Committee
Agnes Ang
Esther Low Xinyi

Aishwarya Jayagopal
Mia Huong Nguyen
Jason Yu


All talks will be held in Multipurpose Hall 1 (COM3-01-26), in the NUS School of Computing, COM3 building. Poster presentations will be held in the atrium outside Multipurpose Hall 1.