Next Event

Date: January 29th, 2020
“Visual Data Analytics for Dietary Assessment and Video Compression” by Dr. Fengqing Maggie Zhu



Want to volunteer?

The IEEE SCV CAS chapter is seeking volunteers to help with the organization of technical meetings. Please contact us.


SCV-CAS Mailing List

To subscribe or unsubscribe, please visit the IEEE SCV-CAS list.

CASS-SCV Artificial Intelligence for Industry (AI4I) Forum – Fall 2019

Date: September 6th, 2019


Event sponsored and organized by:

IEEE Circuits and Systems Society (CASS)



Friday, September 6th, 2019. 1 PM – 5 PM


Link here


1:00 – 1:30 PM Check-in / Networking & Refreshments

1:30 – 2:15 PM Prof. S.Y. Kung (Princeton)

2:15 – 3:00 PM Pete Warden (Google)

3:30 – 4:15 PM Ephrem Wu (Xilinx)

4:15 – 5:00 PM Jianhui Li (Intel)

5:00 PM Adjourn


Intel SC-9 Auditorium

2191 Laurelwood Rd, Santa Clara, CA 95054 (Northwest corner of 101 & Montague Expwy)


1:30 – 2:15 PM Prof. S.Y. Kung (Princeton)

TITLE: From Deep Learning to X-Learning: An Internal and Explainable Learning for XAI (slides)

ABSTRACT: The success of Deep Learning (or AI2.0) depends solely on back-propagation (BP), an external learning paradigm whose supervision is accessed exclusively via the external interfacing nodes (i.e., input/output neurons). As such, Deep Learning has been limited to training the parameters of neural networks (NNs); the important task of designing optimal net structures has had to resort to trial and error. Therefore, we shall design an Xnet which may be used to simultaneously train the structure and the parameters of the net. In addition, it can facilitate internal neurons' explainability so as to fully support DARPA's Explainable AI (i.e., XAI or AI3.0). Our internal learning paradigm leads to an Explainable Neural Network (Xnet) comprising (1) internal teacher labels (ITL) and (2) internal optimization metrics (IOM). X-learning allows us to effectively rank the internal neurons (hidden nodes) and thus sets the footing for the notions of structural gradient and structural learning.

Pursuant to our simulation studies, Xnet can simultaneously compress the structure and raise the accuracy. There is evidence that it may outperform many popular pruning/compression methods. Most importantly, X-learning opens up promising research fronts on (1) explainable learning models for XAI and (2) machine-to-machine mutual learning, which will become appealing in the 5G era.

BIO: S.Y. Kung, Life Fellow of IEEE, is a Professor in the Department of Electrical Engineering at Princeton University. His research areas include multimedia information processing, machine learning, systematic design of deep learning networks, VLSI array processors, and compressive privacy. He was a founding member of several Technical Committees (TC) of the IEEE Signal Processing Society. He was elected IEEE Fellow in 1988 and served as a Member of the Board of Governors of the IEEE Signal Processing Society (1989–1991). He received the IEEE Signal Processing Society's Technical Achievement Award for contributions to "parallel processing and neural network algorithms for signal processing" (1992); was a Distinguished Lecturer of the IEEE Signal Processing Society (1994); received the IEEE Signal Processing Society's Best Paper Award; and received the IEEE Third Millennium Medal (2000). Since 1990, he has been the Editor-in-Chief of the Journal of VLSI Signal Processing Systems. He has authored and co-authored more than 500 technical publications and numerous textbooks, including "VLSI Array Processors", Prentice-Hall (1988); "Digital Neural Networks", Prentice-Hall (1993); "Principal Component Neural Networks", John Wiley (1996); "Biometric Authentication: A Machine Learning Approach", Prentice-Hall (2004); and "Kernel Methods and Machine Learning", Cambridge University Press (2014).

2:15 – 3:00 PM Pete Warden (Google)

TITLE: What Machine Learning Needs from Embedded Hardware
ABSTRACT: This talk will cover the emerging requirements of deep learning workloads, including the barriers that are preventing more widespread deployment on low-energy systems. Machine learning has many peculiar characteristics, including arithmetic intensity, robustness to noise and precision loss, and scalability, and these affect what the software layer expects from the underlying hardware. We'll discuss what has been learned over the last few years of deploying practical applications, and what is likely to be required in the future.
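The robustness to precision loss mentioned in the abstract is what makes 8-bit inference practical on low-energy hardware. As a hedged illustration (not code from the talk), the sketch below applies a per-tensor affine int8 quantization, the general scheme TensorFlow Lite uses, and checks how small the reconstruction error is relative to the weights:

```python
import numpy as np

def quantize_int8(w):
    """Map a float32 tensor to int8 with a per-tensor scale and zero point."""
    w_min, w_max = float(w.min()), float(w.max())
    scale = (w_max - w_min) / 255.0
    zero_point = np.round(-w_min / scale) - 128
    q = np.clip(np.round(w / scale + zero_point), -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float32 values from the int8 representation."""
    return (q.astype(np.float32) - zero_point) * scale

# Hypothetical weight tensor standing in for one layer of a trained network.
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.1, size=(256, 256)).astype(np.float32)

q, s, z = quantize_int8(w)
w_hat = dequantize(q, s, z)

# The worst-case error is about half a quantization step, tiny relative to
# the weights themselves -- one reason many networks tolerate 8-bit inference.
rel_err = np.abs(w - w_hat).max() / np.abs(w).max()
print(f"max relative error: {rel_err:.4f}")
```

The 256×256 tensor and the per-tensor (rather than per-channel) scheme are illustrative choices; production toolchains pick the granularity per operator.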

BIO: Pete Warden is a technical lead on the TensorFlow Lite team, responsible for embedded platforms. He was previously CTO of Jetpac, joining Google when it was acquired in 2014, and has worked at Apple on GPGPU processing.

3:30 – 4:15 PM Ephrem Wu (Xilinx)

TITLE: Compute-Efficient Neural-Network Acceleration


ABSTRACT: To enhance the performance of FPGA-based neural-network accelerators, maximizing both operating clock rates and compute efficiency is paramount. Streamlining data movement between memory and compute holds the key to boosting these metrics. To unleash latent performance in FPGA-based inference processors, we outline a convolutional neural network accelerator that operates at 92.9% of the peak FPGA clock rate. First, we map neural-network operators to a minimalist hardware architecture to simplify data movement between memory and compute. Doing so enables the design to close timing at high clock rates. Second, we describe a schedule that keeps compute utilization high. We apply this architecture to classify the MNIST, CIFAR-10, and ImageNet datasets. This design achieves 95.5% compute efficiency with GoogLeNet, whose nested topology makes creating an efficient design especially challenging.
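The two figures in the abstract are ratios of achieved to peak performance. As a back-of-the-envelope sketch (all input numbers below are hypothetical, not from the talk), compute efficiency is the fraction of the array's peak multiply-accumulates per cycle that the schedule actually keeps busy, and the clock ratio is how close the design's timing closure comes to the device's peak clock:

```python
# Hypothetical figures for an FPGA inference accelerator (illustrative only).
PEAK_CLOCK_MHZ = 775.0          # device's peak clock for this fabric
ACHIEVED_CLOCK_MHZ = 720.0      # clock at which the design closes timing
PEAK_MACS_PER_CYCLE = 4096      # width of the compute array
MACS_PER_INFERENCE = 1.6e9      # roughly GoogLeNet's multiply-accumulate count
CYCLES_PER_INFERENCE = 409_000  # measured cycles to run one inference

# Fraction of the peak clock the design actually runs at.
clock_ratio = ACHIEVED_CLOCK_MHZ / PEAK_CLOCK_MHZ

# Fraction of peak MAC slots doing useful work: useful MACs / (cycles * peak MACs/cycle).
compute_eff = MACS_PER_INFERENCE / (CYCLES_PER_INFERENCE * PEAK_MACS_PER_CYCLE)

# End-to-end throughput follows from the achieved clock and cycle count.
throughput = ACHIEVED_CLOCK_MHZ * 1e6 / CYCLES_PER_INFERENCE  # inferences/s

print(f"clock ratio:        {clock_ratio:.1%}")
print(f"compute efficiency: {compute_eff:.1%}")
print(f"throughput:         {throughput:.0f} inferences/s")
```

The point of the metric is that either a stalled schedule (low compute efficiency) or poor timing closure (low clock ratio) caps real throughput, which is why the abstract treats both as first-class goals.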


BIO: Ephrem Wu is a Senior Director in Silicon Architecture at Xilinx. He joined Xilinx in 2010, where he spearheaded the design of the first 2.5D FPGA with 28 Gb/s transceivers. Since then, he has led the definition of the UltraRAM and the Versal DSP. His current focus is machine-learning accelerators for machine vision and machine translation. From 2000 to 2010, he led backplane-switch and security-processor development at Velio Communications and LSI. Prior to Velio, he developed ASICs and DSP software at SGI, HP, Panasonic, and AT&T. Ephrem holds 32 U.S. patents. He earned a bachelor's degree from Princeton University and a master's degree from the University of California, Berkeley, both in EE.

4:15 – 5:00 PM Jianhui Li (Intel)

TITLE: Performance Software for Deep Learning (slides)


ABSTRACT: Driven by the availability of large data, algorithmic advances, and hardware innovation, deep learning applications have evolved quickly from early successes in computer vision to adoption in natural language processing, recommendation systems, and reinforcement learning. Performance software plays a critical role in efficiently running machine learning algorithms on general-purpose CPUs and various deep learning accelerators. The talk will analyze deep learning algorithms from a compute-performance perspective and discuss how key software optimizations unleash the power of AI hardware from the cloud to the edge.
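One common way to analyze an algorithm "from a compute-performance perspective," as the abstract puts it, is a roofline-style estimate: compare a layer's arithmetic intensity (useful FLOPs per byte moved) against the machine balance of the hardware. The sketch below uses hypothetical hardware numbers and a GoogLeNet-style 3×3 convolution; it is an illustration of the analysis style, not material from the talk:

```python
# Hypothetical machine parameters (illustrative only).
PEAK_GFLOPS = 3000.0   # peak compute of the target CPU
MEM_BW_GBPS = 100.0    # peak memory bandwidth

# FLOPs per byte the hardware can sustain before memory becomes the bottleneck.
machine_balance = PEAK_GFLOPS / MEM_BW_GBPS

def conv_intensity(h, w, cin, cout, k, bytes_per_elem=4):
    """Arithmetic intensity of a k x k convolution, ignoring caching effects."""
    flops = 2 * h * w * cin * cout * k * k          # multiply + add per MAC
    traffic = bytes_per_elem * (h * w * cin         # read input activations
                                + h * w * cout      # write output activations
                                + k * k * cin * cout)  # read weights once
    return flops / traffic

# A mid-network 3x3 conv layer (hypothetical but GoogLeNet-scale dimensions).
ai = conv_intensity(h=56, w=56, cin=64, cout=64, k=3)
bound = "compute" if ai > machine_balance else "memory"
print(f"arithmetic intensity: {ai:.1f} FLOPs/byte -> {bound}-bound")
```

Layers that land below the machine balance are where software optimizations such as operator fusion and better data layout pay off most, since they cut memory traffic rather than arithmetic.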


BIO: Jianhui Li is a principal engineer in Intel's Architecture, Graphics and Software group, where he leads deep learning framework integration and workload optimization. He was previously a software developer for binary translation and JIT compilation, and led the development of Houdini, which runs Android ARM applications transparently, with comparable user experience, on IA-based platforms. Jianhui received his PhD in computer science from Fudan University. He holds 21 US patents in binary translation and real-life application optimization.
