I'm an engineering leader based in Toronto, currently managing a team of performance engineers at Amazon Web Services focused on ML workloads for AWS Trainium, Amazon's custom ML accelerator chip. My team benchmarks, profiles, and optimizes large-scale models to get the most out of custom silicon.
Before AWS, I spent five years at Intel managing a compiler engineering team that built the oneAPI FPGA toolkit. I joined Intel as an individual contributor in 2015, worked on their HLS and OpenCL compilers, and eventually led the team through the design and launch of SYCL HLS. I enjoy working on the interface between hardware and software and helping expose the capabilities of novel architectures to developers in a way that's both powerful and usable.
My academic background is in computer architecture. I did my MASc at the University of Toronto under Prof. Natalie Enright Jerger, where I worked on silicon interposer-based multi-chip systems and networks-on-chip. That work led to publications at MICRO and was selected for IEEE Micro Top Picks. Before Toronto, I completed my BTech in Electronics & Communications at IIT Guwahati.