Paper by Youngeun Kwon and Minsoo Rhu was presented at the 51st IEEE/ACM International Symposium on Microarchitecture (MICRO-51).

Title: Beyond the Memory Wall: A Case for Memory-centric HPC System for Deep Learning

Authors: Youngeun Kwon and Minsoo Rhu

As deep learning (DL) models and the datasets used to train them scale, computer system architects face new challenges, one of which is the memory capacity bottleneck: the limited physical memory inside the accelerator device constrains the algorithms that can be studied. We propose a memory-centric deep learning system (MC-DLA) that can transparently expand the memory capacity available to the accelerators while also providing fast inter-device communication for parallel training. Our proposal aggregates a pool of memory modules locally within the device-side interconnect; decoupled from the host interface, they function as a vehicle for transparent memory capacity expansion. Compared to device-centric DL systems (DC-DLA) such as NVIDIA’s DGX, our proposal achieves an average 2.8x speedup on eight DL applications and increases the system-wide memory capacity to tens of TBs.
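To make the claimed capacity expansion concrete, the following back-of-the-envelope Python sketch compares the device-local memory of a DGX-style DC-DLA node with a pooled MC-DLA configuration. Every number below is an illustrative assumption, not the paper's evaluated setup.

```python
# Capacity comparison between a device-centric node (DC-DLA) and a
# memory-centric node (MC-DLA). All figures are hypothetical.
DEVICES = 8                # accelerators per node (DGX-like assumption)
HBM_PER_DEVICE_GB = 32     # device-local stacked memory per accelerator
POOLED_MODULES = 64        # memory modules pooled on the device-side interconnect
MODULE_CAPACITY_GB = 256   # capacity of each pooled module

dc_dla_gb = DEVICES * HBM_PER_DEVICE_GB
mc_dla_gb = dc_dla_gb + POOLED_MODULES * MODULE_CAPACITY_GB
print(f"DC-DLA: {dc_dla_gb} GB per node")             # 256 GB
print(f"MC-DLA: {mc_dla_gb / 1024:.1f} TB per node")  # ~16.2 TB
```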


Professor Jin-Woo Shin and Professor Dong-Su Han’s research group develops used-car price estimation technology using deep learning

Quang Nguyen Ngoc, Jee-hoon Tak, and Byuck-Chan Lee of Prof. Jin-Woo Shin and Prof. Dong-Su Han’s group participated in developing a used-car price estimation model with KB Capital using deep learning, which was reported by domestic media including Money Today.

The system uses deep learning to derive used-car quotes. Trained on hundreds of thousands of vehicles, the model produces an accurate quote for a used car by weighing roughly 50 factors that influence its price, such as time of sale, mileage, fuel economy, and vehicle type, the same considerations that careful participants in the used-car market weigh themselves. By running the model on a deep learning server equipped with the latest GPUs and estimating the residual value that reflects each vehicle’s specific price, KB Capital can provide accurate used-car quotes to customers in real time through its KB Chachacha site.
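The article does not disclose the model's architecture or exact features, so the following is only a hypothetical sketch of the kind of tabular price regression described, using synthetic data and a small neural network in scikit-learn; all names and sizes are assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: rows are vehicles, columns are ~50 pricing factors
# (mileage, age, fuel economy, encoded vehicle type, ...).
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 50))
price = 20000 - 3000 * X[:, 0] + 1500 * X[:, 1] + rng.normal(0, 500, 5000)

X_tr, X_te, y_tr, y_te = train_test_split(X, price, random_state=0)
scaler = StandardScaler().fit(X_tr)
model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500,
                     random_state=0).fit(scaler.transform(X_tr), y_tr)
print(f"held-out R^2: {model.score(scaler.transform(X_te), y_te):.2f}")
```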

This research is an outcome of the KB-KAIST Financial AI Research Center (center director: Professor Dae-Sik Kim) project to upgrade the used-car market data provided on the KB Chachacha used-car site. The AI-based used-car pricing model developed by KAIST will be provided through the KB Chachacha site.


Kyung-Hwan Son, Dae-Woo Kim, Wan-Ju Kang, David Earl Hostallero, Yung Yi paper accepted at the 36th International Conference on Machine Learning (ICML 2019)

Title: QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning

Authors: Kyung-Hwan Son, Dae-Woo Kim, Wan-Ju Kang, David Earl Hostallero, Yung Yi

We explore value-based solutions for multi-agent reinforcement learning (MARL) tasks in the centralized training with decentralized execution (CTDE) regime popularized recently. VDN and QMIX are representative examples that use the idea of factorizing the joint action-value function into individual ones for decentralized execution. However, VDN and QMIX address only a fraction of factorizable MARL tasks due to their structural constraints in factorization, such as additivity and monotonicity. In this paper, we propose a new factorization method for MARL, QTRAN, which is free from such structural constraints and takes a new approach: transforming the original joint action-value function into an easily factorizable one with the same optimal actions. QTRAN guarantees more general factorization than VDN or QMIX, thus covering a much wider class of MARL tasks than previous methods do. Our experiments on multi-domain Gaussian-squeeze and modified predator-prey tasks demonstrate QTRAN’s superior performance, with especially large margins in games whose payoffs penalize non-cooperative behavior more aggressively.
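The factorization constraints at the heart of QTRAN can be hedged into a short PyTorch sketch of the two conditions the paper trains toward: the sum of individual utilities plus a state-value term should match the joint action-value at the greedy joint action, and may only over-estimate it elsewhere. Dimensions, network sizes, and the sampling below are toy assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

n_agents, obs_dim, n_actions = 2, 8, 4   # toy sizes (assumptions)

# Individual utilities Q_i(tau_i, a_i), one small MLP per agent.
q_i = nn.ModuleList([
    nn.Sequential(nn.Linear(obs_dim, 32), nn.ReLU(), nn.Linear(32, n_actions))
    for _ in range(n_agents)])
# Joint action-value Q_jt(tau, a) and state-value V(tau).
q_jt = nn.Sequential(nn.Linear(n_agents * (obs_dim + n_actions), 32),
                     nn.ReLU(), nn.Linear(32, 1))
v_jt = nn.Sequential(nn.Linear(n_agents * obs_dim, 32), nn.ReLU(), nn.Linear(32, 1))

obs = torch.randn(1, n_agents, obs_dim)   # one joint observation
q_vals = torch.stack([q_i[i](obs[:, i]) for i in range(n_agents)], dim=1)
greedy = q_vals.argmax(-1)                # decentralized greedy actions

def joint_q(actions):
    onehot = nn.functional.one_hot(actions, n_actions).float()
    return q_jt(torch.cat([obs, onehot], -1).flatten(1))

v = v_jt(obs.flatten(1))

# Equality condition at the greedy joint action: sum_i Q_i + V = Q_jt.
sum_q = q_vals.gather(-1, greedy.unsqueeze(-1)).sum(1)
l_opt = (sum_q + v - joint_q(greedy).detach()).pow(2).mean()

# Inequality condition at non-greedy actions: the factorized value may only
# over-estimate Q_jt, enforced with a one-sided penalty on a sampled action.
rand_a = torch.randint(0, n_actions, greedy.shape)
gap = q_vals.gather(-1, rand_a.unsqueeze(-1)).sum(1) + v - joint_q(rand_a).detach()
l_nopt = gap.clamp(max=0).pow(2).mean()
print(l_opt.item(), l_nopt.item())
```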
 


Yun-Hun Jang, Han-Kook Lee, Sung-Ju Hwang (advised by Jin-Woo Shin) paper accepted at the 36th International Conference on Machine Learning (ICML 2019)

Title: Learning What and Where to Transfer 

Authors: Yun-Hun Jang, Han-Kook Lee, Sung-Ju Hwang, Jin-Woo Shin

As the application of deep learning has expanded to real-world problems with insufficient volumes of training data, transfer learning has recently gained much attention as a means of improving performance in such small-data regimes. However, when existing methods are applied between heterogeneous architectures and tasks, it becomes more important to manage their detailed configurations, which often requires exhaustive tuning for the desired performance. To address this issue, we propose a novel transfer learning approach based on meta-learning that can automatically learn what knowledge to transfer from the source network and where to transfer it in the target network. Given source and target networks, we propose an efficient training scheme to learn meta-networks that decide (a) which pairs of layers between the source and target networks should be matched for knowledge transfer and (b) which features, and how much knowledge from each feature, should be transferred. We validate our meta-transfer approach against recent transfer learning methods on various datasets and network architectures, on which our automated scheme significantly outperforms the prior baselines that decide “what and where to transfer” in a hand-crafted manner.
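The core idea, learning weights for which layer pairs to match ("where") and which channels to emphasize ("what"), can be sketched briefly. In this hypothetical PyTorch illustration the meta-networks are collapsed into free parameters for brevity; the shapes, the 1x1 matching convolutions, and the weighting scheme are assumptions rather than the paper's exact training scheme.

```python
import torch
import torch.nn as nn

# Toy activations standing in for source/target feature maps (shapes assumed).
src_feats = [torch.randn(4, 64, 16, 16), torch.randn(4, 128, 8, 8)]
tgt_feats = [torch.randn(4, 32, 16, 16), torch.randn(4, 64, 8, 8)]

# 1x1 convs map target channels onto source channels so they can be compared.
match = nn.ModuleList([nn.Conv2d(32, 64, 1), nn.Conv2d(64, 128, 1)])

# Meta-parameters: one weight per layer pair ("where") and one per source
# channel ("what"); in the paper these come from learned meta-networks.
pair_logits = nn.Parameter(torch.zeros(len(src_feats)))
chan_logits = [nn.Parameter(torch.zeros(64)), nn.Parameter(torch.zeros(128))]

pair_w = torch.softmax(pair_logits, 0)
transfer_loss = 0.0
for k, (s, t) in enumerate(zip(src_feats, tgt_feats)):
    cw = torch.sigmoid(chan_logits[k]).view(1, -1, 1, 1)   # channel weights
    transfer_loss = transfer_loss + pair_w[k] * ((cw * (match[k](t) - s)) ** 2).mean()
print(transfer_loss.item())
```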


Jong-Heon Jeong and Jin-Woo Shin paper accepted at the 36th International Conference on Machine Learning (ICML 2019)

Title: Training CNNs with Selective Allocation of Channels

Authors: Jong-Heon Jeong and Jin-Woo Shin

Recent progress in deep convolutional neural networks (CNNs) has enabled a simple paradigm of architecture design: larger models typically achieve better accuracy. Because of this, in modern CNN architectures, it becomes more important to design models that generalize well under certain resource constraints, e.g., the number of parameters. In this paper, we propose a simple way to improve the capacity of any CNN model with large-scale features, without adding more parameters. In particular, we modify a standard convolutional layer to have a new functionality of channel-selectivity, so that the layer is trained to select important channels and re-distribute its parameters to them. Our experimental results on various CNN architectures and datasets demonstrate that the proposed convolutional layer allows new optima that generalize better via efficient resource utilization, compared to the baseline.
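A convolutional layer with channel-selectivity can be hedged into a minimal sketch: each output channel gets a learned gate, and channels whose gates collapse toward zero mark parameters that could be re-allocated. This is a hypothetical simplification for illustration, not the paper's exact layer.

```python
import torch
import torch.nn as nn

class SelectiveConv2d(nn.Module):
    """Conv layer with learned per-channel gates (illustrative sketch)."""
    def __init__(self, in_ch, out_ch, k):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)
        self.gate_logits = nn.Parameter(torch.zeros(out_ch))

    def forward(self, x):
        # Gates near zero flag unimportant channels; their parameters are
        # candidates for re-distribution to the important ones.
        g = torch.sigmoid(self.gate_logits).view(1, -1, 1, 1)
        return g * self.conv(x)

layer = SelectiveConv2d(3, 16, 3)
out = layer(torch.randn(2, 3, 32, 32))
print(out.shape)   # torch.Size([2, 16, 32, 32])
```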


Ki-Min Lee (KAIST EE), Suk-Min Yun (KAIST EE), Ki-Bok Lee (CS UMich), Hong-Lak Lee (CS UMich / Google Brain), Bo Li (CS UIUC), and Jin-Woo Shin (KAIST EE) paper accepted at the 36th International Conference on Machine Learning (ICML 2019)

Title: Robust Inference via Generative Classifiers for Handling Noisy Labels

Authors: Ki-Min Lee (KAIST EE), Suk-Min Yun (KAIST EE), Ki-Bok Lee (CS UMich), Hong-Lak Lee (CS Umich / Google Brain), Bo Li (CS UIUC) and Jin-Woo Shin (KAIST EE)

Large-scale datasets may contain significant proportions of noisy (incorrect) class labels, and it is well known that modern deep neural networks (DNNs) generalize poorly from such noisy training datasets. To mitigate the issue, we propose a novel inference method, termed Robust Generative classifier (RoG), applicable to any discriminative (e.g., softmax) neural classifier pre-trained on noisy datasets. In particular, we induce a generative classifier on top of the hidden feature spaces of the pre-trained DNN to obtain a more robust decision boundary. By estimating the parameters of the generative classifier with the minimum covariance determinant estimator, we significantly improve classification accuracy without re-training the deep model or changing its architecture. Under the assumption of Gaussian-distributed features, we prove that RoG generalizes better than the baselines under noisy labels. Finally, we propose an ensemble version of RoG that improves its performance by investigating the layer-wise characteristics of DNNs. Our extensive experimental results demonstrate the superiority of RoG across different learning models optimized by several training techniques to handle diverse scenarios of noisy labels.

Figure 1. t-SNE visualization of penultimate-layer features of training samples when the noise fraction is 20%
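The inference recipe, fitting robust class-conditional Gaussians on the pre-trained network's penultimate features via the minimum covariance determinant (MCD) estimator and classifying by Mahalanobis distance, can be sketched with scikit-learn. Features here are random stand-ins and the tied covariance is a simple average; both are assumptions for illustration.

```python
import numpy as np
from sklearn.covariance import MinCovDet

# Stand-ins for penultimate-layer features of a pre-trained DNN with
# (possibly noisy) labels; shapes and class count are assumed.
rng = np.random.default_rng(0)
feats = rng.normal(size=(600, 32))
labels = rng.integers(0, 3, size=600)

# One robust Gaussian per class via the MCD estimator; no re-training of
# the deep model is involved.
means, covs = [], []
for c in range(3):
    mcd = MinCovDet(random_state=0).fit(feats[labels == c])
    means.append(mcd.location_)
    covs.append(mcd.covariance_)
prec = np.linalg.inv(np.mean(covs, axis=0))   # tied precision (simplified)

def rog_predict(x):
    # LDA-style rule: the class with the smallest Mahalanobis distance wins.
    return int(np.argmin([(x - m) @ prec @ (x - m) for m in means]))

print(rog_predict(feats[0]))
```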

Dan Hendrycks (UC Berkeley), Ki-Min Lee (KAIST EE), Mantas Mazeika (University of Chicago) paper accepted at the 36th International Conference on Machine Learning (ICML 2019)

Title: Using Pre-Training Can Improve Model Robustness and Uncertainty

Authors: Dan Hendrycks (UC Berkeley), Ki-Min Lee (KAIST EE), Mantas Mazeika (University of Chicago)

He et al. (2018) have called into question the utility of pre-training by showing that training from scratch can often yield similar performance to pre-training. We show that although pre-training may not improve performance on traditional classification metrics, it improves model robustness and uncertainty estimates. Through extensive experiments on label corruption, class imbalance, adversarial examples, out-of-distribution detection, and confidence calibration, we demonstrate large gains from pre-training and complementary effects with task-specific methods. We show approximately a 10% absolute improvement over the previous state-of-the-art in adversarial robustness. In some cases, using pre-training without task-specific methods also surpasses the state-of-the-art, highlighting the need for pre-training when evaluating future methods on robustness and uncertainty tasks.

Figure 1. Training for longer is not a suitable strategy for label corruption. By training for longer, the network eventually begins to model and memorize label noise, which harms its overall performance. Labels are corrupted uniformly to incorrect classes with 60% probability, and the Wide Residual Network classifier has learning rate drops at epochs 80, 120, and 160.
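The recipe the paper evaluates, initializing from pre-trained weights rather than from scratch before fine-tuning on the downstream task, looks roughly like the following PyTorch sketch. The backbone, weight tag, and 10-class head are assumptions for illustration (the "IMAGENET1K_V1" string requires a recent torchvision).

```python
import torch
import torchvision

# Initialize from ImageNet pre-training instead of random weights, then
# fine-tune on the downstream task (hypothetical 10-class problem).
model = torchvision.models.resnet18(weights="IMAGENET1K_V1")
model.fc = torch.nn.Linear(model.fc.in_features, 10)

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
criterion = torch.nn.CrossEntropyLoss()

x = torch.randn(4, 3, 224, 224)        # stand-in batch
y = torch.randint(0, 10, (4,))
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
```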

Se-Jun Park (KAIST EE), Eun-Ho Yang (KAIST CS), Se-Young Yun (KAIST IE), Jin-Woo Shin (KAIST EE) paper accepted at the 36th International Conference on Machine Learning (ICML 2019)

Title: Spectral Approximate Inference

Authors: Se-Jun Park, Eun-Ho Yang, Se-Young Yun, Jin-Woo Shin

Graphical models (GMs) have been successfully applied to various machine learning applications. Given a GM, computing its partition function is the most essential inference task, but it is computationally intractable in general. To address the issue, iterative approximation algorithms exploiting certain local structure/consistency of GMs have been investigated as popular choices in practice. However, due to their local/iterative nature, they often output poor approximations or even fail to converge, e.g., in low-temperature regimes (hard instances with large parameters). To overcome this limitation, we propose a novel approach utilizing the global spectral feature of GMs. Our contribution is two-fold: (a) we first propose a fully polynomial-time approximation scheme (FPTAS) for approximating the partition function of a GM associated with a low-rank coupling matrix; (b) for general high-rank GMs, we design a spectral mean-field scheme utilizing (a) as a subroutine, which approximates a high-rank GM by a product of rank-1 GMs for an efficient approximation of the partition function. The proposed algorithm is more robust in its running time and accuracy than prior methods, i.e., it neither suffers from convergence issues nor depends on hard local structures. Our experiments demonstrate that it indeed outperforms baselines, particularly in the low-temperature regimes.

Figure 1. An illustration of the spectral approximate inference for the partition function approximation
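For intuition, the quantity being approximated and the role of the coupling matrix's spectrum can be shown on a toy example. The sketch below brute-forces Z for a small binary pairwise GM and inspects the eigenvalues of J; the sizes and the Ising-style parameterization are assumptions for illustration, not the paper's algorithm.

```python
import itertools
import numpy as np

# Partition function Z = sum_x exp(0.5 * x^T J x + h^T x), x in {-1, +1}^n.
n = 4
rng = np.random.default_rng(1)
J = rng.normal(size=(n, n))
J = (J + J.T) / 2            # symmetric coupling matrix
np.fill_diagonal(J, 0.0)
h = rng.normal(size=n)

xs = np.array(list(itertools.product([-1.0, 1.0], repeat=n)))  # all 2^n states
Z = np.exp(0.5 * np.einsum('bi,ij,bj->b', xs, J, xs) + xs @ h).sum()

# Spectral view: if J were rank-1, J = lam * u u^T, then x^T J x depends on x
# only through the scalar u @ x, which is what makes low-rank GMs tractable.
eigvals = np.linalg.eigvalsh(J)
print(f"Z = {Z:.3f}; spectrum of J: {np.round(eigvals, 3)}")
```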

Professor Yong-Dae Kim and Professor Seungwon Shin featured in etnews for information security research using AI

Professor Yong-Dae Kim and Professor Seungwon Shin have been featured in etnews for their information security research using AI.

Professor Yong-Dae Kim’s lab was introduced for its research on attacking information security loopholes in sensors and communication devices, and on thoroughly examining the security of AI-based technologies.

Professor Seungwon Shin’s laboratory was introduced for its work on detecting various types of crime and terrorism information on the dark web and on developing technology to analyze the identities of information publishers.

Link: http://www.etnews.com/20170507000055

Professor Yung Yi, Professor Kyoung-Soo Park, and Professor Dong-Su Han have been featured in etnews regarding the development of an 'AI technology-based drone'.

Professor Yung Yi, Professor Kyoung-Soo Park, and Professor Dong-Su Han were introduced for developing an integrated platform technology that enables intelligent AI drones to communicate effectively with other drones and with external environments.

Link: http://www.etnews.com/20170602000167

Link: http://www.etnews.com/20170602000165