Computer Scientist/Software Engineer

Columbia University

About Me:

I’m a researcher at heart and a software engineer in practice, with several years of experience leading software engineering teams on successful, impactful projects. I pride myself on delivering results and driving innovation in organizations, as well as improving engineering and business processes to enable developer productivity and benefit the org as a whole.

I have a PhD in Computer Science from Columbia University, where I worked at the Programming Systems Laboratory with Prof. Gail Kaiser. My research interests span data analytics, big data, stream processing, distributed systems, large-scale system debugging, and program analysis. I have also briefly worked on cloud computing and software-defined networking.

Work History:

Currently, I am the Director of Engineering at Priceline for Flights Platform. I lead a team of 30+ engineers and managers on the search and pricing verticals, with teams in the US and Canada and consultant teams in Ukraine and Buenos Aires. I help drive our strategy, architectural decisions, and innovation for our flagship project Firefly, which is also the global supply aggregation platform for Booking Holdings (Priceline, Agoda, Kayak, etc.). Our stack serves both B2B and B2C customers, and we are leveraging a plug-and-play architecture to move towards integrating approximately 50 supply connections by the end of 2022 (direct connections to airlines, plus GDSs and aggregators) to make Firefly the largest flights supply aggregator.

Before joining Priceline, I worked as a Senior IC at Dropbox, New York, with the Previews Infrastructure Services team. The team provides middle-layer services that convert uploaded files into previewable content for all user-facing Dropbox frontends (this is the second-largest infra fleet at Dropbox after storage).

Even earlier, I was a researcher at NEC Labs America, Princeton, NJ, where I worked with the Systems Research Group (formerly a part of the Autonomic Computing Group). I also briefly interned as a Business Analyst at McKinsey & Co., New York, in 2008. In my undergrad years, I interned as a Research Consultant at Instituto de Soldadura e Qualidade (Lisbon, Portugal), a research organization under the aegis of the European Union, where I was involved in a project called “NaturalHy”. I was also a Research Assistant at the Indian Institute of Technology (Delhi, India) in the Computer Integrated Manufacturing Lab, where I worked on supply chain management.

Recent News:

  • Invited talk at Google Journal Club: Replay without Recording of Production Bugs for SOA (ASE 2018)
  • Awarded Excellent Invention Awards for patent applications (2017): Next Generation Log Analytics Application: An Automated Anomaly Detection Service on Heterogeneous Logs
  • Awarded NEC Business Contribution Award - 2016 (awarded for Research Commercialization)
  • Spot Recognition Award for Supporting Log Analysis Technology Development, Oct 2016
  • NEC Recognition Award for Creating Patent Portfolio for Log Analysis Technologies, Jun 2016
  • Awarded NEC Business Contribution Award - 2015 (awarded for Research Commercialization)

Publications/Patents:

  • 10 Issued Patents, 26 Filed Patents (pending) - as of Feb 24, 2017
  • 17 Peer-Reviewed Publications

Community Activity:

  • Program Committee Member Middleware 2015
  • Reviewer IEEE Transactions on Parallel and Distributed Systems
  • Peer Reviewer SPIN 2014
  • Peer Reviewer GLOBECOM 2014
  • Reviewer IEEE Transactions on Services Computing
  • Peer Reviewer SCSC 2014
  • Peer Reviewer SDN-AA Workshop 2014
  • Peer Reviewer ICAC 2014
  • Peer Reviewer SIGMETRICS 2014

Previous Interns/Students:

  • Mohammad Ali Gulzar, PhD Student, UCLA, Summer 2016
  • Muhammad Solaimani, University of Texas Dallas, Summer 2016
  • Pradeep Fernando, PhD Student, Georgia Tech, Summer 2015
  • Yuanzhen Gu, PhD Student, Rutgers University, Summer 2014
  • Advait Dixit, PhD Student, Purdue University, Spring 2014
  • Hui Lu, PhD Student, Purdue University, Summer 2013
  • Nitin Natrajan, MS, Columbia - Fall 2010
  • Jyotsna Sebe, MS, Columbia - Fall 2009
  • Bing Wu, MS, Columbia - Fall 2008, Spring 2009
  • Suhas Anand, MS, Columbia - Fall 2008
  • Junxiong Jia, MS, Columbia - Fall 2008

Resources:

Interests
  • Data Engineering/Streaming Pipelines
  • Search & Retrieval
  • Scalability and Performance
  • ML and ML Infra
  • PL & Compilers (once upon a time)
Education
  • PhD in Computer Science

    Columbia University

Experience

Director of Engineering - Global Flights Architecture
May 2023 – Present Greater New York City Area

Part of the core management team, responsible for owning and defining tech strategy. The stack serves both B2B and B2C customers, using layered microservices to integrate approximately 50 suppliers, enabling 40MM+ daily search requests and over 500k monthly bookings.

  • Architect: Lead a team of 10 architects/principals across global locations, managing architectural design, technical strategy and roadmap of all verticals for the global flight org (~75-85 FTEs).
  • Innovation/Modernization: Leveraged process and architectural innovation to deliver tech migrations, including postbook modernization, crucial for getting a firm-wide CTO-level initiative back on track.
  • GenAI: Advocated, implemented, and demonstrated the use of Agentic AI utilizing LLMs for sentiment analysis, and triaging tickets, collaborating on a platform to enable this across the org.
  • Vendor Management: Negotiated and collaborated with external consultant vendor teams to ensure timely delivery of projects, managed remediation of off-track projects, and adhered to budgets.
  • Lead architect in the acquisition and integration of Amadeus LTC (5.5MM+), an ML-based supply cache for multi-source flight aggregation, resulting in a cache hit increase of approx. 15% and significant projected cost savings in terms of look-to-book.
  • Enabled business operations by introducing self-serve debugging insights into business/pricing rules in production, reducing triage time from days to minutes.
  • Led multiple initiatives that resulted in cost savings of $1MM+ annually, including migrating from Redis to Bigtable, horizontal pod auto-scaling, and reducing memory thrashing.
  • Initiated the transition towards monorepos and a unified domain model to better manage the fragmented state of multiple micro-services and repositories, reducing KTLO (keep-the-lights-on) work and increasing productivity.
 
Director of Engineering - Flights Search & Price
Oct 2019 – May 2023 Greater New York City Area
  • Led multiple teams totaling 30+ engineers/managers, with teams in the US and Canada and consultants in Ukraine and Buenos Aires, delivering on product initiatives and managing core platform services.
  • Helped drive strategy and innovation for our flagship project Firefly, a global flights supply aggregation platform for Booking Holdings (Priceline, Agoda, Kayak).
  • My team delivered a complete rewrite of the search, price and control verticals, and key product functionalities: Fused Itins (joining one-ways for a round trip), Express Deals (deals with hidden details), Pricing Engine, Rule Engine, etc.
  • Partnered with TPMs and product in driving hyper-growth within flights, where we opened 2 new offices (Toronto/Mumbai) and scaled the overall team to several times its initial size.
  • Partnered with cross-functional architects, product/program managers, engineering leaders, driving a successful migration from on-prem to Google Kubernetes micro-services stack.
  • Introduced the concept of a blueprint for a multi-step booking process, which orchestrated the transaction, and reduced development time for new suppliers from weeks to hours.
  • Collaborated with the Enterprise Architecture team to implement developer testing “tracks” using Istio network subsets and header-based routing. This enabled dedicated QA test environments for GitHub branches, significantly enhancing developer productivity.
  • Led a tiger team to onboard our first platform user, Agoda.com, onboarding approx. 20k requests/min in a new region, splitting the traffic path, and managing success criteria and expectations in a timely manner.
 
Senior Software Engineer
Jan 2018 – Oct 2019 New York City

My team managed the previews serving infrastructure, which converts uploaded files into previewable formats. Our services handled 20k QPS, ensuring that conversions happen securely and with minimal latency. Previews is the 2nd largest infra at Dropbox, and a key driver for most product initiatives.

  • DRI (Directly Responsible Individual) for multiple initiatives, including migrating legacy HTTP /v1 RESTful endpoints and all their related calls in the Dropbox codebase to a scalable gRPC-based micro-service with thin wrappers around existing libraries.
  • DRI for moving the file metadata storage and extraction pipeline to on-the-fly extraction. Gained alignment across 5 different product teams in order to deprecate defunct use-cases and reduce costs by 400k/yr.
  • Developed previews for e-books (i.e., reading e-books on dropbox.com), working with teams across web and Android/iOS platforms to ensure a seamless rollout.
  • Worked on a new pipeline converting MS Office documents into previewable formats (PDF/images) securely in jailed containers at scale.
  • Mentored and onboarded junior engineers/interns, co-authored team’s North Star Doc.
  • Stack/Languages: Python, Golang, S3/EC2, Hive, Bazel, protobufs, in-house jailed containers.
 
Sr. Assoc Research Staff
Nov 2011 – Dec 2017 Princeton, NJ

NGLA: An end-to-end log analytics service (Jan 2015- Nov 2017)

  • Architected and led the design and development of streaming anomaly detection with a NoSQL database (Elasticsearch), Kafka and Spark Streaming. Owned most components of the streaming analytics pipeline. Collaborated on the design of complex time-series, stateful and stateless log analytics in a multi-tier setup. Designed a control interface for streaming analytics job management (tasks included model management, in-memory states, periodic anomaly checks, start/stop, and cleanup)
  • Modified core Apache Spark code to introduce support for on-the-fly broadcast model updates, and leveraged this in deploying a model control management interface in Spark Streaming
  • Designed a prototype web interface for real-time visualization using Flask, JavaScript and Bootstrap
  • Founding member with an initial team of 3; experience in data cleaning, preprocessing, log pattern representation and parsing, including multiple POC trials on customer data

Behavior Analysis Engine (Jan 2017-Nov 2017)

  • A semantic language framework for knowledge representation: expressing complex machine learning log models and allowing administrators to express domain knowledge as “rules” and “behaviors”
  • Worked in a two-person team writing the language grammar and execution operations using Spark SQL. Developed (in progress) a RESTful API to convert BAE to an SOA with job and rule management

CLUE: Distributed System Trace Analytics (Jan 2013- May 2015)

  • Stitched kernel event logs to generate end-to-end “transactions”, which can help give a “CLUE” to the root-cause of bugs. CLUE uses data-mining and transaction clustering to find potential anomalies
  • Developed a novel hybrid (static + dynamic) binary instrumentation tool called iProbe, with an order of magnitude better performance than state-of-the-art tools
  • Collaborated on core-engine development, and designed the interface along with data visualizations, and project management

NetLogic (Jan 2015 - Dec 2015)

  • Built software-defined data-center and cloud environments by deploying OpenStack and Open vSwitch based network management. Deployed and managed the OpenStack infrastructure, and wrote several wrappers to set up a small internal cloud
  • Developed a novel prototype network manager called HybNET for hybrid network infrastructure with both SDN and legacy switches. The controller allowed centralized network management despite partial transition to SDN switches

Graduate Research Assistant
Jan 2008 – Jan 2011 New York, NY
  • Thesis: Developed an on-the-fly sandboxed debugging framework called Parikshan, which allows developers to debug SOA applications hosted on user-space containers in a cloned parallel container, without any downtime or impact on the production-facing service
  • Parikshan leverages live cloning, a modification of live migration, and a new network duplication proxy to enable on-the-fly cloning of OpenVZ containers
  • Also worked on other projects associated with the lab: COMPASS, research in multi-core software engineering, binary/run-time instrumentation, static and dynamic program analysis, and recommender systems, as well as system administration and mentoring research students.
 
Business Analyst
Jun 2008 – Aug 2008 New York, NY
Product Owner Proxy for the Scrum roll-out team (Agile software development) in McKinsey App-Dev. Also designed the architecture and a proof of concept for a trend analysis tool.
 
Research Consultant
Feb 2007 – May 2007 Cascais, Portugal
Designed a prototype for a Decision Support Tool with an interactive interface for a combined natural gas and hydrogen fuel being tested for use in pipelines all over Europe. The tool was designed in Visual Basic .NET.
 
Research Assistant
Jan 2006 – Jan 2005 New Delhi, India
Worked in the Computer Integrated Manufacturing (CIM) Lab on comparing genetic algorithms, simulated annealing and tabu search algorithms to evaluate algorithm efficiencies.

Publications

PerfScope: Practical Online Server Performance Bug Inference in Production Cloud Computing Infrastructures

Patents

Issued:
  • USPTO - 14030 Path Selection in Hybrid Networks (8/9/2016)
  • USPTO - 13148 Dynamic Border Line Tracing for Tracking Message Flows Across Distributed Systems (1/3/2017)
  • USPTO - 13062 Transparent Performance Inference of Whole Software Layers and Context Sensitive Performance Debugging (6/14/2016)
  • USPTO - 13035 Method and Apparatus for Managing Hybrid Network Systems (9/20/2016)
  • USPTO - 12155 Guarding a Monitoring Scope and Interpreting Partial Control Flow (10/18/2016)
  • USPTO - 12082 Method and System for Computer Assisted Hot-Tracing Mechanism (11/8/2016)
  • USPTO - 12049 Blackbox Memory Monitoring with a Calling Context Memory Map and Semantic Extraction (4/7/2015)
  • USPTO - 12016 Efficient Unified Tracing of Kernel and User Events with Multi-Mode Stacking (11/25/2014)
  • USPTO - 12010 Method and Apparatus for Correlated Tracing with Automated Multi-Layer Function Instrumentation Localization (7/28/2015)
  • Japan Patent Office - 13035J Hybrid Network Management (11/10/2015)
Pending:

Pending patents available on request.

Contact

nipun<at>cs<dot>columbia<dot>edu