Hey! I am

Parin Shah

About

About Me

Data Engineer with a strong foundation in product analytics and scalable data pipeline development. Skilled at translating business problems into data-driven solutions by transforming complex data into actionable insights. Adept at communicating technical concepts clearly to cross-functional teams and business stakeholders to drive informed decision-making.

  • Name: Parin Shah
  • Date of birth: January 09, 1999
  • Address: San Francisco, California
  • Email: parin090199@gmail.com
  • Phone: +1 (213) 285-8984

Resume

Education

Aug 2021 - May 2023

Master of Science in Computer Science (Data Science)

University of Southern California
Grade: 4.0/4.0

Relevant Coursework: (CSCI 570) Analysis of Algorithms, (CSCI 561) Foundations of Artificial Intelligence, (DSCI 553) Foundations and Applications of Data Mining, (CSCI 535) Multimodal Probabilistic Learning of Human Communication

Aug 2016 - May 2020

Bachelor of Technology in Computer Engineering

NMIMS University
Grade: 83.5% (3.56/4)

Relevant Coursework: Artificial Intelligence, Soft Computing, Data Mining, Predictive Modelling, Database Management Systems, Data Warehousing and Mining, Operations Research, Data Structures, Design and Analysis of Algorithms, Discrete Structures, Fundamentals of Web Technology, Computer Networks

Aug 2014 - May 2016

Senior Secondary Education

Pace Junior Science College
Grade 12 (HSC): 86%

Aug 2002 - May 2014

Secondary School

Ryan International School
Grade 10 (ICSE): 90%

Experience

Jan 2023 - Present

Senior Data Engineer

Unity Technologies

  • Leading product analytics, internal reporting and invoicing for the Offerwall product in the Ads org; partnering with PMs and business teams to deliver insights that drive product growth and support strategic decision-making.
  • Building datasets by integrating third-party APIs (SensorTower, DeviceAtlas) to enrich feature platform inputs; improved data coverage by 30% using fuzzy matching logic, and collaborated with DS teams to drive adoption in ML models.
  • Designed and scaled a near real-time OLAP platform (Imply on Apache Druid) to enable full-funnel analytics, from ad requests to conversions to post-install events, supporting 400+ stakeholders with low-latency, high-concurrency querying.
  • The OLAP platform enabled a 70% reduction in ad-hoc data pulls and cut insight turnaround time from days to minutes by democratizing access to granular funnel metrics, helping PMs and business teams make faster data-driven decisions.
  • Migrated 50+ reports and dashboards from MicroStrategy to Looker by building aggregate tables and LookML models, ensuring data completeness and accuracy; reduced report load times by 40% and seamlessly transitioned 100+ email subscriptions.
  • Optimized data pipelines and the LookML codebase, reducing validation times by 4x and saving $150k annually through code refactoring.
  • Identified key areas for cost savings and optimizations by developing a GCP Cost Reporting dashboard in Looker.
  • Technologies Used: SQL, Python, GCP, BigQuery, Looker, LookML, Apache Airflow, Apache Druid, GitHub, Terraform, dbt.

May 2022 - Aug 2022

Data Engineer Intern

Unity Technologies
Revenue Attribution

  • Discovered 10k+ new publishers that use Unity to build games and advertise through Unity Ads by consolidating multiple datasets, improving coverage over the previous ETL from 2% to over 65%.
  • Attributed a 63% revenue increase to Unity-built publishers via in-depth analytics and enrichment.
  • Technologies Used: GCP, BigQuery, SQL, Looker, LookML, Airflow, GitHub.

June 2020 - June 2021

Data Scientist

Limechat

Limechat is India's first level-3 AI D2C chatbot company. (YC W21)

DevOps/MLOps

  • Led all DevOps initiatives, collaborating cross-functionally with various teams to design and incorporate infrastructure solutions used by over 35 clients.
  • Designed automated CI/CD pipelines on GitLab for the deployment of 100+ cloud servers using Docker, Kubernetes and Azure.
  • Set up an Azure cluster to deploy horizontally scalable, clustered software using Kubernetes. Launched 50+ cronjobs to automate repetitive tasks.
  • Deployed alerting, logging and monitoring (Fluentd, Prometheus, Grafana) tools to deliver timely alerts for all downtimes and issues.

Engineering

  • Designed and implemented a real-time data pipeline in Python to process and upload 50,000+ rows of semi-structured data per day from a PostgreSQL server to a product analytics tool.
  • Spearheaded the development of standard client deliverable dashboards on Mixpanel, aggregating data from 5 sources into a single tool. The dashboard served as the primary repository from which PMs and clients consumed reports.
  • Ingested data from disparate sources using Python with a combination of SQL, the Google Analytics API, and the Segment API to create dashboards in Mixpanel.
  • Worked with clients to understand business needs and translated those needs into actionable reports, either in-house or on Mixpanel, saving 8 hours of manual work each week.
  • Developed a custom data analytics dashboard on Flask and VueJS displaying critical KPIs for each client, with the ability to download standard Excel and PDF reports.
  • Integrated a caching algorithm that fetches previous days' data from a Redis cache, which led to 90% faster loading times.
  • Incorporated session flow visualization using a tree data structure to understand the paths users take during their lifetime. Analysis of this helped understand key areas of improvement for the chatbot.

Machine Learning

  • Designed a custom metric based on a weighted F1 score to measure the actual performance of the NLP model used for chatting with 10,000+ users per day.
  • Devised a tool to label new intents using clustering and active learning, reducing manual labelling time by over 60%.
  • Spearheaded a research initiative to find the optimal number of data points to be labelled by analyzing the effort vs improvement graph, saving 40+ work hours of manual labelling per month.

Product Manager

  • Managed 6 cross-functional projects, providing project leadership and daily management throughout, including Customer Satisfaction Score, Data Labelling, and Re-Engagement campaigns.
  • Examined the drop-off of users at various stages to effectively target certain groups in Re-Engagement campaigns, leading to a 28% average increase in sales across companies.
  • Engineered a heuristic-based algorithm to calculate a Customer Satisfaction Score for each chat. Handed off unsatisfactory chats to customer support, improving the overall customer experience.
  • Communicated weekly insights to 70+ stakeholders to help them make informed decisions on changes and future features.
  • Synchronized with the Bot development team in a fast-paced Scrum development environment to iterate on existing features and deliver new features and functionality.

March 2019 - May 2020

Project Intern

Birthvenue

  • Developed a universal rating platform for all types of cryptocurrencies and tokens available in the market based on financial and non-financial variables to help investors make decisions.
  • Designed a regularized regression model that takes in the variables, determines the ranking, and displays the rankings of 1,100+ cryptocurrencies on a website.
  • Published a whitepaper that includes details about the phases, parameters and the developed model.
  • Technologies and Frameworks used: Python, HTML, CSS, ReactJS, MySQL, AWS, Flask.

May 2019 - July 2019

Research Intern

Oracle Financial Services Software

  • Interlinked LDAP server in Kubernetes for authorization purposes. Automated manual formation of access roles in banking systems helping save 10+ hours of work per client.
  • Incorporated OpenID Connect to fetch relevant roles from the server. Authorized multiple users by assigning permissions to roles with the help of Role-Based Access Control (RBAC).
  • Technologies Used: Docker, Kubernetes, Java, HTML.

Association for Computing Machinery

June 2019 - May 2020
Student Mentor

  • Mentored the Technical Department on steps to be taken for furthering the growth of the committee.

June 2018 - May 2019
Technical Head

  • Responsible for recruiting and managing a team of 15 students focused on conducting workshops and seminars to instill a technical culture in college.
  • Conducted and taught workshops, including an Augmented Reality workshop using Unity 3D (40 participants), a C workshop (150 participants), and an Amazon Alexa skill building workshop (25 participants).
  • Organized and designed questions for intercollege and intracollege coding events. Developed the committee website.
  • Mentored teams for building projects in the domain of Augmented Reality.

June 2017 - May 2018
Technical Executive

  • Assisted the committee in conducting events and served as a debugger during workshops.

Skills

Languages: Python, SQL, SAS, C++.
Databases: Apache Druid, MySQL, Postgres, Redis.
Python Libraries: Matplotlib, Seaborn, Plotly, Pandas, NumPy, scikit-learn, TensorFlow, Keras.
Software Tools: Apache Airflow, Looker, Tableau, Imply, Alteryx, SAS Visual Analytics, Unity 3D, Android Studio.
Big Data: BigQuery, Apache Spark (PySpark), Hadoop MapReduce, Kafka.
DevOps: Docker, Kubernetes, Prometheus, Grafana, Fluentd, GitHub, GitLab.
Productivity: Notion, Jira.
Web Dev: Bootstrap, JavaScript, PHP, ReactJS, VueJS, Django, Flask.

Certification and Courses

June 2020

Data Visualization with Tableau Specialization

University of California Davis

Generated powerful reports and dashboards that help people make decisions and take action based on the business data. Used Tableau to create high-impact visualizations of common data analyses to help see and understand the data. (Capstone Project)

June 2020

AI for Trading

Udacity Nanodegree

Learned the basics of quantitative analysis, including data processing, trading signal generation, and portfolio management. Used Python to work with historical stock data, develop trading strategies, and construct a multi-factor model with optimization.

Dec 2019

Applied Data Science with Python Specialization

University of Michigan

Applied statistical, machine learning, information visualization, text analysis, and social network analysis techniques through Python toolkits such as pandas, matplotlib, scikit-learn, nltk, and networkx to gain insight into data.

Nov 2019

Programming Fundamentals using SAS 9.4

SAS Certified Associate

Cleared the Base SAS certification exam and demonstrated experience in programming and data management using SAS 9.4.

Oct 2018

Machine Learning

Stanford University

Learned about the most effective machine learning techniques and gained practice implementing them. Gained the practical know-how needed to quickly and powerfully apply these techniques to new problems.

Projects

"You can do anything you set your mind to"  - Benjamin Franklin

Reports 3D    

Deep Learning | Augmented Reality

Immersive visualization in medical imaging. Generates 3D reports of the brain.

C:Drive  

Web Application | Databases

Online collaborative tool to aid resource sharing for students and teachers.

Grocery Store Case Study  

Machine Learning | Forecasting

Provided recommendations on how to expand a grocery store chain, using various analytical techniques.

Indian Startups Analysis  

Data Analysis | Data Visualization

Exploratory data analysis of the startup ecosystem in India.

FurnitAR

Augmented Reality | Android Application

Augmented Reality application to enhance the online furniture shopping experience.

RCBU App

Android Application | Databases

Android app for the Rotaract Club of Bombay Uptown.

NewspapAR

Application to bring static images and other content in newspapers to life.

Contact

Contact Me

"The art of communication is the language of leadership" - James Humes

Address

San Francisco, California

Contact Number

+1 (213) 285-8984

Email Address

parin090199@gmail.com