CAIT announces two new PhD student fellowships and five new faculty research awards


In February 2021, Columbia Engineering and Amazon announced the initial faculty research award recipients and PhD fellowships for the Columbia Center of AI Technology (CAIT). Now CAIT is announcing two new PhD fellowships and five faculty research awards.

Madhumitha Shridharan, a PhD candidate in operations research, and Tuhin Chakrabarty, a PhD candidate in computer science, are the new fellows.

Madhumitha Shridharan and Garud Iyengar

Shridharan, whose faculty advisor is Garud Iyengar, the Tang Family Professor of Industrial Engineering and Operations Research and vice dean of research at Columbia, is studying optimization methods for computing causal bounds.

“We are interested in causal analysis of large-scale systems in data-rich business environments,” Shridharan explained. “While learning correlations in data can potentially enable models to predict data with similar distributions to the training data, causal learning and inference attempts to understand how systems respond to counterfactual interventions. We are interested in developing causal models and algorithms for explainability and sequential decision making in real-world business systems.”

Chakrabarty’s faculty advisor is Smaranda Muresan, an adjunct associate professor in the computer science department and a research scientist at the Data Science Institute. Chakrabarty is studying knowledge-aware models for natural language understanding and generation.

Tuhin Chakrabarty and Smaranda Muresan

“Large-scale language models based on transformer architectures, such as GPT-3 or RoBERTa, have advanced the state of the art in natural language understanding and generation,” Chakrabarty said. “However, even though these models have shown impressive performance in a zero-shot, few-shot or supervised setting for a variety of tasks, they often struggle with implicit or non-compositional meaning. My research interest is to combine commonsense knowledge with the power of transfer learning from large-scale pre-trained language models to improve the capabilities of current language models.”

Faculty research projects

The five new faculty research projects being supported are:

Conveying empathy in spoken language, Julia Hirschberg, the Percy K. and Vida L. W. Hudson Professor of Computer Science

“Much research has been done in the past 15 years on creating empathetic responses in text, facial expression and gesture in conversational systems,” Hirschberg notes. “However, almost none has been done to identify the speech features that can create an empathetic sounding voice. This type of category has been found to be especially useful in dialogue systems, avatars, and robots, since empathetic behavior can encourage users to like a speaker more, to believe the speaker is more intelligent, to actually take the speaker’s advice, to trust and like it more, and to want to speak with the speaker longer and more often. We propose to identify acoustic/prosodic as well as lexical features which produce empathetic speech by collecting the first corpus of empathetic podcasts and videos, crowdsourcing their labels for empathy, building machine learning models to identify empathetic speech and the speech and language features as well as the visual features which can be used to generate it.”

A tale of two models, Junfeng Yang, professor and co-director of Software Systems Lab in the Department of Computer Science; and Asaf Cidon, assistant professor of electrical engineering and computer science

“Full-precision deep learning models are often too large or costly to deploy on edge devices,” observe Yang and Cidon. “To accommodate the limited hardware resources, models are often quantized, compressed, or pruned. While such techniques often have a negligible impact on top-line accuracy, the adapted models exhibit subtle differences in output compared to the full-precision model from which they are derived.

“We propose a new attack termed Adversarial Deviation Attack, or ADA, that exploits the differences in model quantization, compression and pruning, by adding adversarial noise to input data that maximizes the output difference between the original and the edge model. It will construct malicious inputs that will trick the edge model but will be virtually undetectable by the original model. Such an attack is particularly dangerous: even after extensive robust training on the original model, quantization, compression or pruning will always introduce subtle differences, providing ample vulnerabilities for the attackers. Moreover, data scientists may not even be able to notice such attacks because the original model typically serves as the authoritative model version, used for validation, debugging and retraining. We will also investigate how new or existing defenses can fend off ADA attacks, greatly improving the security of edge devices.”
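The core idea Yang and Cidon describe, noise chosen to maximize the output gap between a full-precision model and its compressed counterpart, can be illustrated with a toy example. The sketch below is not the authors' ADA method: it assumes a linear scoring model, a simple round-to-grid quantizer, and an L-infinity noise budget `eps`, and every function name in it is hypothetical. Under those assumptions the gap-maximizing noise has a closed form, since each input coordinate is pushed in the direction of the corresponding weight difference.

```python
from typing import List


def quantize(weights: List[float], step: float) -> List[float]:
    """Round each weight to the nearest multiple of `step` (toy quantization)."""
    return [round(w / step) * step for w in weights]


def score(weights: List[float], x: List[float]) -> float:
    """Output of a linear model: the dot product of weights and input."""
    return sum(w * xi for w, xi in zip(weights, x))


def deviation_noise(full: List[float], edge: List[float], eps: float) -> List[float]:
    """Noise with |d_i| <= eps that maximizes score(full, x + d) - score(edge, x + d).

    For linear models the gap is (full - edge) . (x + d), so the best move
    on each coordinate is eps times the sign of the weight difference.
    """
    def sign(v: float) -> int:
        return (v > 0) - (v < 0)

    return [eps * sign(wf - we) for wf, we in zip(full, edge)]


# Illustrative numbers only; none of these come from the article.
full_weights = [0.62, -0.31, 0.18]
edge_weights = quantize(full_weights, step=0.25)
x = [1.0, 2.0, -1.0]
eps = 0.5

noise = deviation_noise(full_weights, edge_weights, eps)
x_adv = [xi + di for xi, di in zip(x, noise)]

gap_before = score(full_weights, x) - score(edge_weights, x)
gap_after = score(full_weights, x_adv) - score(edge_weights, x_adv)
print(f"gap before: {gap_before:.3f}, gap after: {gap_after:.3f}")
```

For deep networks no such closed form exists, so an attack of this kind would presumably rely on iterative, gradient-based search; the toy case only shows why small weight differences leave room for targeted inputs.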

Joint selection and inventory optimization under limited capacity, Will Ma, assistant professor of decision, risk, and operations

“E-tailers have begun deploying ‘forward’ distribution centers close to city centers, which have very limited space,” write Ma and Topaloglu. “Our proposal is to develop scalable optimization algorithms that allow e-tailers to systematically determine the SKU variety and inventory that should be placed in these precious spaces. Our model accounts for demand that depends endogenously on our SKU selection, inventory pooling effects, and the interplay between different categories of SKUs. Our model is designed to yield insights about: the relationship between demand variability and SKU fragmentation; sorting rules for selecting a few SKUs within a given category; and the marginal value of capacity to different categories.”

Exponentially faster parallel algorithms for machine learning, Eric Balkanski, assistant professor in the Department of Industrial Engineering and Operations Research

“This proposal aims to develop fast optimization techniques for fundamental problems in machine learning,” Balkanski explains. “In a wide variety of domains, such as computer vision, recommender systems, and immunology, objectives we care to optimize exhibit a natural diminishing returns property called submodularity. Off-the-shelf tools have been developed to exploit the common structure of these problems and have been used to optimize complex objectives. However, the main obstacle to the widespread use of these optimization techniques is that they are inherently sequential and too slow for problems on large data sets. Consequently, the existing toolbox for submodular optimization is not adequate to solve large-scale optimization problems in ML.

“This proposal considers developing novel parallel optimization techniques for problems whose current state-of-the-art algorithms are inherently sequential and hence cannot be parallelized. These algorithms use new techniques that have shown promising results for problems such as movie recommendation and maximizing influence in social networks.”
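To make the diminishing returns property concrete, here is a minimal sketch of the classic sequential greedy algorithm on a small set-coverage problem, a standard textbook example of a submodular objective. It is not Balkanski's parallel technique; it shows the sequential baseline that the proposal aims to speed up. The candidate sets and function names are made up for illustration.

```python
from typing import Dict, List, Optional, Set


def coverage(chosen: List[str], sets: Dict[str, Set[int]]) -> int:
    """Number of distinct items covered by the chosen sets (a submodular objective)."""
    covered: Set[int] = set()
    for name in chosen:
        covered |= sets[name]
    return len(covered)


def greedy_max_coverage(sets: Dict[str, Set[int]], k: int) -> List[str]:
    """Pick up to k sets, each round adding the one with the largest marginal gain."""
    chosen: List[str] = []
    for _ in range(k):
        base = coverage(chosen, sets)
        best_name: Optional[str] = None
        best_gain = 0
        for name in sorted(sets):  # sorted for deterministic tie-breaking
            if name in chosen:
                continue
            gain = coverage(chosen + [name], sets) - base
            if gain > best_gain:
                best_name, best_gain = name, gain
        if best_name is None:  # no remaining set adds anything new
            break
        chosen.append(best_name)
    return chosen


# Hypothetical candidate sets, e.g. the users each movie would appeal to.
candidates = {
    "a": {1, 2, 3, 4},
    "b": {3, 4, 5},
    "c": {5, 6},
    "d": {1, 6},
}
picked = greedy_max_coverage(candidates, k=2)

# Diminishing returns: "b" is worth less once "a" has already been chosen.
gain_b_alone = coverage(["b"], candidates)
gain_b_after_a = coverage(["a", "b"], candidates) - coverage(["a"], candidates)
print(picked, gain_b_alone, gain_b_after_a)
```

Each round depends on the choices made in every previous round, which is why this loop cannot simply be split across machines and why it becomes slow on large data sets.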

Confidence-aware reinforcement learning for human-in-the-loop decision making, Shuran Song, assistant professor of computer science and director of the Columbia Artificial Intelligence and Robotics (CAIR) Lab; and Matei Ciocarlie, associate professor in the Mechanical Engineering Department

“We propose novel methods for leveraging human assistance in reinforcement learning (RL). The sparse reward problem has been one of the biggest challenges in RL, often leading to inefficient exploration and learning. While real-time immediate feedback from a human could resolve this issue, it is often impractical for complex tasks that require a large number of training steps. To address this problem, we aim to develop new confidence measures, which the agent computes during both training and deployment. In this paradigm, a Deep RL policy will train autonomously, but stop and request assistance when the confidence in the ultimate success of the task is too low to continue. We aim to show that expert assistance can speed up learning and/or increase performance, while minimizing the number of calls for assistance made to the expert.”
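The stop-and-ask behavior described above can be sketched as a simple control loop. This is only an illustration under stated assumptions: the confidence function, the threshold, and the policy and expert callables are all hypothetical placeholders, and the proposal's actual confidence measures are not specified in the article.

```python
from typing import Callable, List, Tuple

State = float
Action = str


def run_with_assistance(
    states: List[State],
    policy: Callable[[State], Action],
    confidence: Callable[[State], float],
    expert: Callable[[State], Action],
    threshold: float,
) -> Tuple[List[Action], int]:
    """Act autonomously, but defer to the expert whenever confidence is below threshold.

    Returns the actions taken and the number of calls made to the expert.
    """
    actions: List[Action] = []
    expert_calls = 0
    for state in states:
        if confidence(state) < threshold:
            actions.append(expert(state))  # agent stops and requests help
            expert_calls += 1
        else:
            actions.append(policy(state))  # agent continues on its own
    return actions, expert_calls


# Toy stand-ins: here the state value itself plays the role of the confidence score.
actions, calls = run_with_assistance(
    states=[0.9, 0.2, 0.7, 0.1],
    policy=lambda s: "agent",
    confidence=lambda s: s,
    expert=lambda s: "expert",
    threshold=0.5,
)
print(actions, calls)
```

Raising the threshold trades more expert calls for fewer low-confidence autonomous actions, which mirrors the balance the researchers say they want to strike between performance and calls for assistance.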
