Archive for the ‘Invited Talks’ Category

Talk: Towards Understanding Vision and Language Systems

Wednesday, February 26th, 2020

Dear All,

IEEE Signal Processing Society, Bangalore Chapter and the Department of Computational and Data Sciences, Indian Institute of Science invite you to the following talk:

SPEAKER : Badri Narayana Patro, Postdoctoral Fellow, Google Research India
TITLE : “Towards Understanding Vision and Language Systems: Controllability, Uncertainty and
Interpretability for VQA and VQG”
Venue : #102, CDS Seminar Hall
Date & Time : Feb 26, 2020, 04:00 PM

Intelligent interaction between humans and automated systems is an important area of research in computer vision, natural language processing, and machine learning. One such interaction involves visual question answering (VQA), where questions asked by humans are answered by a machine. This requires combining information from different modalities, such as image and language. Such automated systems have many applications, from assistive technology for visually impaired people to efficient surveillance and robotics systems. For the interaction to feel natural, it is also important that the machine can generate engaging questions. Generating natural questions based on an image is a challenging semantic task, termed visual question generation (VQG), that requires multimodal representations. Images can have multiple visual and language contexts that are relevant for generating questions, namely places, captions, and tags. Several deep-learning approaches to these problems exist in the literature. In this talk, we aim to understand these techniques better: to incorporate controllability, to obtain an estimate of the uncertainty in solving the problem, and to explain these techniques robustly through visual and textual explanations.

Badri Narayana Patro is currently a Postdoctoral Fellow at Google Research Lab for AI, India. He has submitted his PhD thesis, titled “Towards Understanding Vision and Language Systems”, in the Department of Electrical Engineering at the Indian Institute of Technology, Kanpur. He received his M.Tech degree from the Department of Electrical Engineering at the Indian Institute of Technology, Bombay, and his B.Tech degree in Electronics and Telecommunication Engineering from the National Institute of Science and Technology, Odisha. He previously worked as a Lead Engineer at Samsung R&D Institute Delhi, India, as an Associate Software Engineer at Harman International Limited, Pune, India, and as an Assistant Software Engineer at Larsen and Toubro, Mysore. His expertise is in the fields of Computer Vision, Natural Language Processing, Pattern Recognition, and applied Machine Learning. He has authored papers in venues such as CVPR, ICCV, AAAI, EMNLP, COLING, and WACV. He works in the Vision and Language team at Google Research Lab and collaborates closely with allied teams such as AI for Social Good and the Google Image teams within Google. He has served as a reviewer for CVPR, ICCV, AAAI, ACL, TIP, TMM, WACV, ICVGIP, and NCVPRIPG conferences and journals. He is a member of the ACL, AAAI, and CVF. He actively collaborates with institutes to further his research interests.

Host Faculty: Prof. Venkatesh Babu

Talk: Model-based Signal Processing in Neurocritical Care

Sunday, January 19th, 2020


Venue: Golden Jubilee Seminar Hall, ECE Dept, IISc, Bangalore, India

Time: 4:00 PM on 20th January 2020

Organized by

IEEE EMBS Bangalore Chapter, EMB IISc and RIT Student Chapters,

IEEE SPS Bangalore Chapter, Department of ECE, Indian Institute of Science.


Title of the Talk: Model-based Signal Processing in Neurocritical Care

Abstract: Large volumes of heterogeneous data are now routinely collected and archived from patients in a variety of clinical environments, to support real-time decision-making, monitoring of disease progression, and titration of therapy. This rapid expansion of available physiological data has resulted in a data-rich – but often knowledge-poor – environment. Yet the abundance of clinical data also presents an opportunity to systematically fuse and analyze the available data streams, through appropriately chosen mathematical models, and to provide clinicians with insights that may not be readily extracted from visual review of the available data streams. In this talk, I will highlight our work in model-based signal processing for improved neurocritical care to derive additional and clinically useful information from routinely available data streams. I will present our model-based approach to noninvasive, patient-specific and calibration-free estimation of intracranial pressure and will elaborate on the challenges of (and some solutions to) collecting high-quality clinical data for validation.

Speaker: Prof Thomas Heldt
Massachusetts Institute of Technology, United States   

Thomas Heldt studied physics at Johannes Gutenberg University, Germany, at Yale University, and at MIT. He received the PhD degree in Medical Physics from MIT’s Division of Health Sciences and Technology and undertook postdoctoral training at MIT’s Laboratory for Electromagnetic and Electronic Systems. Prior to joining the MIT faculty in 2013, Thomas was a Principal Research Scientist with MIT’s Research Laboratory of Electronics. He currently holds the W.M. Keck Career Development Chair in Biomedical Engineering. He is a member of MIT’s Institute for Medical Engineering and Science and on the faculty of the Department of Electrical Engineering and Computer Science.

Thomas’s research interests focus on signal processing, mathematical modeling and model identification in support of real-time clinical decision making, monitoring of disease progression, and titration of therapy, primarily in neurocritical and neonatal critical care. In particular Thomas is interested in developing a mechanistic understanding of physiologic systems, and in formulating appropriately chosen computational physiologic models for improved patient care. His research is conducted in close collaboration with clinicians from Boston-area hospitals, where he is integrally involved in designing and deploying high-quality data-acquisition systems and collecting clinical data. 

Seminar: CV and Deep Learning – A Marriage of Neuroscience and ML

Wednesday, December 18th, 2019

IEEE Signal Processing Society, Bangalore Chapter and the Department of EE, Indian Institute of Science invite you to the following talk:

SPEAKER   :  Prof. Ardhendu Behera, Associate Professor (Reader), Edge Hill University, UK

TITLE          : “Computer Vision and Deep Learning – A Marriage of Neuroscience and Machine Learning”

Venue            :  MMCR Room No C241, First Floor, EE Dept.

Date & Time :  Dec 20, 2019, 04:00 PM


For almost a century, human vision researchers have been studying how the human visual system has evolved. While computer vision is a much younger discipline, it has achieved impressive results in many detection and classification tasks (e.g. object recognition, scene classification, face recognition) within a short span of time. Computer vision is one of the fastest growing fields, partly because the amount of video/image data from urban environments is growing exponentially (e.g. 24/7 cameras, social media sources, smart cities). The scale and diversity of these videos/images make it very difficult to extract reliable information in a timely, automated manner. Recently, Deep Convolutional Neural Networks (DCNNs) have shown impressive performance on visual recognition tasks when trained on large-scale datasets. However, such progress faces challenges when rolled into automation and production. These include obtaining enough data of good quality, managing executives’ expectations about model performance, responsibility and trustworthiness in decision making, data ingest, storage, security and overall infrastructure, as well as understanding how machine learning differs from software engineering.

In this talk, I will focus on recent progress in advancing human action/activity and behaviour recognition from images/videos, addressing the research challenges of relational learning, deep learning, human pose, human-objects interactions and transfer learning. I will then briefly describe some of our recent efforts to adopt these challenges in automation and robotics, in particular human-robot social interaction, in-vehicle activity monitoring and smart factories.

Speaker Bio:

Ardhendu Behera is a Senior Lecturer (Associate Professor) in the Department of Computer Science at Edge Hill University (EHU). Prior to this, he held post-doc positions at the universities of Fribourg (2006-07) and Leeds (2007-14). He holds a PhD from the University of Fribourg, an MEng from the Indian Institute of Science, Bangalore, and a BEng from NIT Allahabad. He leads the visualisation theme of the Data and Complex Systems Research Centre at EHU and is a member of the Visual Computing Lab. His main research interests are computer vision, deep learning, pattern recognition, robotics and artificial intelligence. He applies these interests to interdisciplinary research areas such as monitoring and recognising suspicious behaviour, human-robot social interactions, autonomous vehicles, monitoring driving behaviour, healthcare and patient monitoring, and smart environments. Dr Behera has been involved in various outreach activities, and some of his research has been covered by the media, press, newspapers and television.




Seminar: A Small Rearguard Action in the Age of Big Data and ML

Tuesday, December 3rd, 2019

Indian Institute of Science
Centre for BioSystems Science and Engineering

BSSE Seminar

(Organized by IEEE Signal Processing Society Bangalore Chapter)

9th December 2019 (Monday), 11:00 AM, MRDG Seminar Hall, 1st floor, Biological Sciences Building

Title: A Small Rearguard Action in the Age of Big Data and Machine Learning: Mechanistic Models in Computational Physiology

Speaker: Dr. George Verghese, MIT, Cambridge, Massachusetts

Abstract: The talk will draw some contrasts between phenomenological or empirical models (e.g., regression, neural networks) and mechanistic models (e.g., circuit analogs). Mechanistic models focus on meaningful component parts/subprocesses of the phenomenon of interest, and on their interconnections/interactions, which then generate the range of possible system behaviors. Examples will be given of mechanistic models for aspects of cardiovascular, cerebrovascular and respiratory physiology, and application of these models to extracting interpretable information from relevant data obtained in clinical or ambulatory settings.
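As a concrete toy illustration of the mechanistic, circuit-analog models contrasted above, the sketch below integrates a two-element Windkessel model of arterial pressure, in which compliance C and peripheral resistance R shape the pressure response to inflow. All parameter values here are illustrative, not taken from the talk.

```python
import numpy as np

def windkessel2(q_in, R=1.0, C=1.5, p0=80.0, dt=0.001):
    """Two-element Windkessel (circuit analog of the arterial tree):
    C * dP/dt = Q_in(t) - P/R, integrated with forward Euler."""
    p = np.empty(len(q_in))
    p[0] = p0
    for k in range(1, len(q_in)):
        dp = (q_in[k - 1] - p[k - 1] / R) / C  # net flow charges/discharges C
        p[k] = p[k - 1] + dt * dp
    return p

# During diastole there is no inflow, so the model predicts an
# exponential pressure decay with time constant tau = R * C.
t = np.arange(0.0, 1.0, 0.001)
p = windkessel2(np.zeros_like(t))
```

Unlike a regression fit, the fitted R and C are physiologically interpretable, patient-specific quantities, which is the kind of information a purely phenomenological model does not expose.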

Bio: Dr. George Verghese received his BTech from the Indian Institute of Technology, Madras in 1974, his MS from the State University of New York, Stony Brook in 1975, and his PhD from Stanford University in 1979, all in Electrical Engineering. Since 1979, he has been with MIT, where he is the Henry Ellis Warren (1894) Professor, and Professor of Electrical and Biomedical Engineering, in the Department of Electrical Engineering and Computer Science. He was named a MacVicar Faculty Fellow at MIT for the period 2011-2012, for outstanding contributions to undergraduate education. Verghese is also a principal investigator with MIT’s Research Laboratory of Electronics (RLE). His research interests and publications are in the areas of dynamic systems, modeling, estimation, signal processing, and control. Over the past decade, his research focus has shifted from applications in power systems and power electronics entirely to applications in biomedicine. He directs the Computational Physiology and Clinical Inference Group in RLE. He is an IEEE Fellow, and has co-authored two texts: Principles of Power Electronics (with J.G. Kassakian and M.F. Schlecht, 1991), and Signals, Systems and Inference (with A.V. Oppenheim, 2015).


Talk: Semi-supervised Learning for Amazon Alexa

Thursday, November 21st, 2019
The IEEE Signal Processing Society, Bangalore Chapter and the Department of Electrical Engineering, Indian Institute of Science, welcome you to the following talk.
Location and Date: MMCR, EE, Thursday, Nov. 21, 4:00 PM (coffee at 3:45 PM).
Speakers : Dr. Sivaram Garimella and Kishore Nandury


Title: Semi-supervised Learning for Amazon Alexa.
State-of-the-art Acoustic Models (AMs) are large, complex deep neural networks that typically comprise millions of model parameters. Deep neural networks can express highly complex input-output relationships and transformations, but the key to getting the best performance out of them is the availability of large amounts of matched acoustic data – matched to the desired dialect, language, environmental/channel condition, microphone characteristic, speaking style, and so on. Since it is both time-consuming and expensive to transcribe large amounts of matched acoustic data for every desired condition, we leverage Teacher/Student-based Semi-Supervised Learning to improve the AM. Our training leverages vast amounts of un-transcribed data in addition to multi-dialect transcribed data, yielding up to 7% relative word error rate reduction over the baseline model, which has not seen any unlabelled data.
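The teacher/student idea above can be sketched in a few lines: a trained "teacher" model produces soft posteriors on un-transcribed data, and a "student" model is trained to match them by minimising cross-entropy. This is a minimal NumPy illustration on synthetic data; the linear models, learning rate, and data here are hypothetical stand-ins, not Amazon's actual acoustic models.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# "Teacher": a fixed classifier standing in for a large, well-trained AM.
W_teacher = rng.normal(size=(10, 3))

# Un-transcribed data: the teacher supplies soft targets, not human labels.
X_unlab = rng.normal(size=(500, 10))
soft_targets = softmax(X_unlab @ W_teacher)

# "Student": trained to match the teacher's soft outputs by gradient
# descent on the cross-entropy H(teacher, student).
W_student = np.zeros((10, 3))
lr = 0.5
losses = []
for _ in range(200):
    probs = softmax(X_unlab @ W_student)
    losses.append(-np.mean(np.sum(soft_targets * np.log(probs + 1e-12), axis=1)))
    grad = X_unlab.T @ (probs - soft_targets) / len(X_unlab)
    W_student -= lr * grad
```

The loss falls steadily as the student absorbs the teacher's behaviour, which is the mechanism that lets un-transcribed audio improve the deployed model.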
Sri Garimella is a Senior Manager heading the Alexa Machine Learning/Speech Recognition group at Amazon, India. He has been with Amazon for more than 7 years. He obtained his PhD from the Department of Electrical and Computer Engineering, Center for Language and Speech Processing, at Johns Hopkins University, Baltimore, USA, in 2012, and his Master of Engineering in Signal Processing from the Indian Institute of Science, Bangalore, India, in 2006.
Kishore Nandury is an Applied Scientist in the Alexa ASR team at Amazon Bangalore. Prior to Amazon, he worked at Intel, Sling Media and NVIDIA. He obtained his Master's degree in Signal Processing from the Indian Institute of Science in 2005.
Host Faculty:  Sriram Ganapathy

Talk: Bayesian approaches for target tracking in radar applications

Friday, November 8th, 2019

CNRS, France, Indian Institute of Science, Bangalore, and IEEE Signal Processing Society Bangalore Chapter invite you to the following talk

Title: Some contributions to Bayesian approaches for target tracking in radar applications

Speaker: Prof. Eric Grivel, IMS Lab, Bordeaux, France

Venue: Golden Jubilee Seminar Hall, ECE Department, Indian Institute of Science Bangalore

Date and time: November 13, 2019; 4 PM – 5 PM.

Abstract: Detecting and tracking maritime or ground targets is one of the application fields for surveillance by airborne radar systems. More particularly, the purpose is to estimate the trajectories of one or more moving objects over time based on noisy radar measurements. When dealing with one target, one approach consists in using a Kalman filter by making an assumption on the type of the target motion and the parameters of the motion model. However, these assumptions are not necessarily well-suited to the situation. In addition, when dealing with maneuvering targets, the motion model may often change over time. Finally, false detections may appear and have a bad influence on the estimation of the target position. To address these issues, multiple-model algorithms, joint tracking and classification approaches and Bernoulli filtering can be considered. The purpose of this talk is to present some variants based on these concepts.
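The single-target Kalman-filter baseline mentioned above can be sketched as follows: a constant-velocity motion model with noisy position-only measurements. This is a toy 1-D example with illustrative noise parameters, not the speaker's radar configuration.

```python
import numpy as np

rng = np.random.default_rng(1)
dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])  # constant-velocity state transition
H = np.array([[1.0, 0.0]])             # radar measures position only
Q = 0.01 * np.eye(2)                   # process noise (motion-model mismatch)
R = np.array([[4.0]])                  # measurement noise variance (std = 2)

truth = np.array([0.0, 1.0])           # true [position, velocity]
x, P = np.zeros(2), 10.0 * np.eye(2)   # filter state and covariance
errs_meas, errs_filt = [], []
for _ in range(100):
    truth = F @ truth
    z = H @ truth + rng.normal(0.0, 2.0, size=1)   # noisy radar return
    # Predict step: propagate state and uncertainty through the motion model.
    x = F @ x
    P = F @ P @ F.T + Q
    # Update step: blend prediction and measurement via the Kalman gain.
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(2) - K @ H) @ P
    errs_meas.append(abs(z[0] - truth[0]))
    errs_filt.append(abs(x[0] - truth[0]))
```

The filtered position error ends up well below the raw measurement error, provided the assumed motion model matches the target; the multiple-model and Bernoulli-filter variants in the talk address exactly the cases where it does not.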

It should be noted that the methods that will be presented were developed with PhD students and a French colleague.

Biography of the speaker: Eric Grivel received his PhD in signal processing in 2000 in Bordeaux (France). He joined Bordeaux Institute of Technology (Bordeaux INP) in 2001 as an assistant professor, and became a professor in 2011. For more than 20 years, he has been with the Signal & Image research group at the IMS lab (a joint research unit of the French National Center for Scientific Research CNRS, the University of Bordeaux and Bordeaux INP). His research activities deal with statistical signal processing, with applications in speech and audio processing, mobile communication systems, radar processing, GPS navigation and biomedical signal processing.

Talk on Latent Dirichlet Allocation

Tuesday, October 29th, 2019

The IEEE Signal Processing Society, Bangalore Chapter and the Department of Electrical Engineering, Indian Institute of Science, welcome you to the following talk.

Location and Date : MMCR, EE, Thursday Oct. 31, 4-5 pm

Speaker : Dr. Hemant Misra (Vice President, Head of Applied Research, Swiggy)  

Abstract : Topic models such as Latent Dirichlet Allocation (LDA) have been used extensively in the last decade for tasks such as information retrieval, topic discovery, and dimensionality reduction. In this presentation, the application of LDA to the task of text segmentation is explained. Results on multiple datasets demonstrate the performance of the proposed LDA-based system vis-a-vis other standard methods. The talk also uncovers a challenge faced by the dynamic programming (DP) algorithm used in the proposed LDA-based segmentation, and how it was overcome.
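For readers unfamiliar with LDA itself, the sketch below is a minimal collapsed Gibbs sampler on toy data: it illustrates how LDA infers per-document topic mixtures, not the talk's segmentation pipeline. The vocabulary, hyperparameters, and documents are all synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

def lda_gibbs(docs, n_topics, n_vocab, n_iter=50, alpha=0.1, beta=0.01):
    """Collapsed Gibbs sampling for LDA. docs: list of lists of word ids."""
    ndk = np.zeros((len(docs), n_topics))  # doc-topic counts
    nkw = np.zeros((n_topics, n_vocab))    # topic-word counts
    nk = np.zeros(n_topics)                # words per topic
    z = [rng.integers(n_topics, size=len(d)) for d in docs]
    for d, doc in enumerate(docs):         # random initial assignments
        for i, w in enumerate(doc):
            ndk[d, z[d][i]] += 1; nkw[z[d][i], w] += 1; nk[z[d][i]] += 1
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]                # remove current assignment
                ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
                # P(topic k | rest) ∝ (ndk + alpha) * (nkw + beta) / (nk + V*beta)
                p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + n_vocab * beta)
                k = rng.choice(n_topics, p=p / p.sum())
                z[d][i] = k                # resample and restore counts
                ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
    return ndk, nkw

# Two latent "themes": word ids 0-4 vs 5-9, across ten toy documents.
docs = [list(rng.integers(0, 5, 20)) for _ in range(5)] + \
       [list(rng.integers(5, 10, 20)) for _ in range(5)]
ndk, nkw = lda_gibbs(docs, n_topics=2, n_vocab=10)
```

The rows of `ndk` give each document's topic mixture; a segmentation system can compare such mixtures across adjacent text blocks and place boundaries where they diverge.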

The talk will also cover some of the exciting work the Applied Research team at Swiggy is doing in the areas of speech, computer vision (CV) and natural language processing (NLP).

Speaker Bio : Dr. Hemant Misra is an active researcher in the areas of text and signal processing, speech/speaker recognition, machine learning, healthcare applications and education. He received his MS (1999) from IIT Madras and his PhD (2006) from EPFL, and then held post-doc positions at Telecom ParisTech, the University of Glasgow, and Xerox Research Centre Europe. After successful stints at Philips (Healthcare) Research, IBM’s India Research Lab and Citicorp Services India, Hemant is currently ‘VP – Head of Applied Research’ at Swiggy.

Host Faculty : Sriram Ganapathy (EE)

Talk: Large Scale Data Analytics for Airborne Imagery

Sunday, September 8th, 2019

Dept. of Electrical Communication Engineering and


IEEE Signal Processing Society Bangalore Chapter


invite you to the following seminar:


Title: Large Scale Data Analytics for Airborne Imagery

Speaker: Prof. Gaurav Sharma, Univ. of Rochester

Time and Date: 11 AM, September 10, 2019

Venue: Golden Jubilee Hall, ECE




The widespread availability of high resolution aerial imagery covering wide geographical areas is spurring a revolution in large scale visual data analytics. Specifically, modern aerial wide area motion imagery (WAMI) platforms capture large, high-resolution images at rates of 1-3 frames per second. The sequences of images, which individually span several square miles of ground area, represent rich spatio-temporal datasets that are key enablers for new applications. The effectiveness of such analytics can be enhanced by combining WAMI with alternative sources of rich geo-spatial information such as road maps or prior georegistered images. We present results from our recent research in this area covering three topics. First, we describe a novel method for pixel-accurate, real-time registration of vector roadmaps to WAMI imagery based on moving vehicles in the scene. Next, we present a framework for tracking vehicles in WAMI across multiple frames, using the registered roadmap and a new probabilistic framework that allows us to better estimate associations across multiple frames with a computationally tractable algorithm. Finally, in the third part, we highlight how we can combine structure from motion and our proposed registration approach to obtain 3D georegistration for use in applications such as change detection. We present results on multiple WAMI datasets, including nighttime infrared WAMI imagery, highlighting the effectiveness of the proposed methods through both visual and numerical comparisons.


Speaker Biography


Gaurav Sharma is a professor in the Electrical and Computer Engineering Department and a Distinguished Researcher in the Center of Excellence in Data Science (CoE) at the Goergen Institute for Data Science at the University of Rochester. He received the PhD degree in Electrical and Computer Engineering from North Carolina State University, Raleigh in 1996. From 1993 through 2003, he was with the Xerox Innovation group in Webster, NY, most recently in the position of Principal Scientist and Project Leader. His research interests include data analytics, cyber physical systems, signal and image processing, computer vision, and media security; areas in which he has 52 patents and has authored over 200 journal and conference publications. He currently serves as the Editor-in-Chief for the IEEE Transactions on Image Processing. From 2011 through 2015, he served as the Editor-in-Chief for the Journal of Electronic Imaging and, in the past, has served as an associate editor for the Journal of Electronic Imaging, the IEEE Transactions on Image Processing, and the IEEE Transactions on Information Forensics and Security. He is a member of the IEEE Publication Services and Products Board (PSPB) and chaired the IEEE Conference Publications Committee in 2017-18. He is the editor of the Digital Color Imaging Handbook published by CRC Press in 2003. Dr. Sharma is a fellow of the IEEE, a fellow of SPIE, a fellow of the Society for Imaging Science and Technology (IS&T) and has been elected to Sigma Xi, Phi Kappa Phi, and Pi Mu Epsilon. In recognition of his research contributions, he received an IEEE Region I technical innovation award in 2008.

Special Lecture: Brain Inspired Automated Concept and Object Learning

Tuesday, September 3rd, 2019


Department of ECE, Indian Institute of Science

IEEE Bangalore Section

IEEE Signal Processing Society Bangalore Chapter

welcome you to a

Special Lecture


Title:  Brain Inspired Automated Concept and Object Learning: Vision, Text, and Beyond

Speakers: Vwani Roychowdhury (UCLA) and 

Thomas Kailath (Stanford) 

Venue: ECE Golden Jubilee Seminar Hall

             Department of  ECE, IISc

Day/ Date: Friday, 6 September 2019

Time: 3-5 pm

High Tea at 5pm


Brains are endowed with innate models that can learn effective informational and reasoning prototypes of the various objects and concepts in the real world around us. A distinctive hallmark of the brain, for example, is its ability to automatically discover and model objects, at multi-scale resolutions, from repeated exposures to unlabeled contextual data and then to be able to robustly detect the learned objects under various non-ideal circumstances, such as partial occlusion and different view angles. Replication of such capabilities in a machine would require three key ingredients: (i) access to large-scale perceptual data of the kind that humans experience, (ii) flexible representations of objects, and (iii) an efficient unsupervised learning algorithm. The Internet fortunately provides unprecedented access to vast amounts of visual data. The first part of this work will focus on our recent work that leverages the availability of such data to develop a scalable framework for unsupervised learning of object prototypes—brain-inspired flexible, scale, and shift invariant representations of deformable objects (e.g., humans, motorcycles, cars, airplanes) composed of parts, their different configurations and views, and their spatial relationships. We apply our framework to various datasets and show that our approach is computationally scalable and can construct accurate and operational part-aware object models much more efficiently than in much of the recent computer vision literature. We also present efficient algorithms for detection and localization in new scenes of objects and their partial views. The second part of this work will focus on processing large scale textual data, wherein our algorithms can create semantic concept-level maps from unstructured data sets. 
Finally we will conclude with the outlines of a general framework of contextual unsupervised learning that can remove many of the scalability and robustness limitations of existing supervised frameworks that require large amounts of labeled training sets and mostly act as impressive memorization engines. 

Short Bio:

Vwani Roychowdhury is a Professor of Electrical and Computer Engineering at UCLA and received his BTech and PhD degrees in Electrical Engineering from IIT Kanpur and Stanford University, respectively. Prof. Roychowdhury’s expertise lies in combining tools from a number of disciplines, including computer science, engineering, information theory, mathematics, and physics, and solving fundamental problems in multiple disciplines. His research interests have spanned a diverse set of topics related to combinatorics and theoretical computer science, Artificial Neural Networks, nanoelectronics and device modeling, quantum computing, quantum information and cryptography, the physics of information processing and computation, Bioinformatics, and more recently, brain-inspired machine learning and brain modeling. He has published more than 250 journal and conference papers, and coauthored several books. He has mentored more than 25 Ph.D. students and 20 post-doctoral fellows and is always seeking collaborations with problem solvers and seekers. He also cofounded four Silicon Valley startups; one of these, founded in Jan. 2017, pioneered the unsupervised distillation of Concept Graphs from billions of documents, raised upwards of 45M US dollars in investment, and was acquired in February 2017.

Short Bio of Prof. Kailath:

Thomas Kailath received a B.E. (Telecom) degree in 1956 from the College of Engineering, Pune, India, and S.M. (1959) and Sc.D. (1961) degrees in electrical engineering from the Massachusetts Institute of Technology. He then worked at the Jet Propulsion Labs in Pasadena, CA, before being appointed to Stanford University as Associate Professor of Electrical Engineering in 1963. He was promoted to Professor in 1968, and appointed as the first holder of the Hitachi America Professorship in Engineering in 1988. He assumed emeritus status in 2001, but remains active with his research and writing activities. He also held shorter-term appointments at several institutions around the world.

His research and teaching have ranged over several fields of engineering and mathematics: information theory, communications, linear systems, estimation and control, signal processing, semiconductor manufacturing, probability and statistics, and matrix and operator theory. He has also co-founded and served as a director of several high-technology companies. He has mentored an outstanding array of over a hundred doctoral and postdoctoral scholars. Their joint efforts have led to over 300 journal papers, a dozen patents and several books and monographs, including the major textbooks: Linear Systems (1980) and Linear Estimation (2000). 

He received the IEEE Medal of Honor in 2007 for “exceptional contributions to the development of powerful algorithms for communications, control, computing and signal processing.” Among other major honors are the Shannon Award of the IEEE Information Theory Society; the IEEE Education Medal and the IEEE Signal Processing Medal; the 2009 BBVA Foundation Prize for Information and Communication Technologies; the Padma Bhushan, India’s third highest civilian award; election to the U.S. National Academy of Engineering, the U.S. National Academy of Sciences, and the American Academy of Arts and Sciences; foreign membership of the Royal Society of London, the Royal Spanish Academy of Engineering, the Indian National Academy of Engineering, the Indian National Science Academy, the National Academy of Sciences, India, the Indian Academy of Sciences, and TWAS (The World Academy of Sciences).

In November 2014, he received the 2012 US National Medal of Science from President Obama “for transformative contributions to the fields of information and system science, for distinctive and sustained mentoring of young scholars, and for translation of scientific ideas into entrepreneurial ventures that have had a significant impact on industry.”

IEEE SP chapter talk: 14 June 2019 – Physics-Based Vision and Learning

Thursday, June 13th, 2019


Department of Electrical Engineering (EE), IISc and IEEE SPS Bangalore Chapter invite you to the following seminar:

Title: “Physics-Based Vision and Learning”
(Joint Work with Yunhao Ba and Guangyuan Zhao)

Speaker: Dr. Achuta Kadambi


Host faculty : Dr. Chandra Sekhar Seelamantula

Time & Venue : 4:00 PM, Friday, 14th June 2019, at the MMCR, first floor, Electrical Engineering Department, IISc.

Abstract: Today, deep learning is the de facto approach to solving many computer vision problems. However, in adopting deep learning, one may overlook a subtlety: the physics of how light interacts with matter. By exploiting these previously overlooked subtleties, we will describe how we can rethink the longstanding problem of 3D reconstruction. Using the lessons learned from this prior work, we will then discuss the future symbiosis between physics and machine learning, and how this fusion can transform many application areas in imaging.

Biography: Achuta Kadambi is an Assistant Professor of Electrical and Computer Engineering at UCLA, where he directs the Visual Machines Group. The group blends the physics of light with artificial intelligence to give the gift of sight to robots. Achuta received his BS from UC Berkeley and his PhD from MIT, completing an interdepartmental doctorate between the MIT Media Lab and MIT EECS. Please see his group web page for research specifics: