WebMD and its affiliates are an Equal Opportunity/Affirmative Action employer and do not discriminate on the basis of race, ancestry, color, religion, sex, gender, age, marital status, sexual orientation, gender identity, national origin, medical condition, disability, veteran status, or any other basis protected by law.
About PulsePoint:
PulsePoint is a fast-growing healthcare technology company (with adtech roots) using real-time data to transform healthcare. We help brands and agencies interpret the hard-to-read signals across the health journey and unify these digital determinants of health with real-world data to produce the most dimensional view of the customer. Our award-winning advertising platforms use machine learning and programmatic automation to seamlessly activate this data, making marketing, predictive analytics, and decision support easy and instantaneous.
Description
Our Data & Analytics team is at the very heart of what makes PulsePoint an innovative, fast-paced, and market-changing company.
Our path forward is through data, and this team is in the driver's seat for the journey.
The Big Picture:
You will build, deliver & continually innovate on PulsePoint's insightful reporting and data-driven solutions. Your efforts help alleviate friction points and streamline processes that enable internal teams to provide exceptional service, powering the decisions of our customers.
As a Data Engineer, ML/Data Science, you will use your data science and stats expertise to contribute to R&D projects for DTC, new Data Products, and Bespoke Segments expansion. You can work fully remotely in India, and we will provide you with a company-issued laptop. This is a full-time (FTE) role.
In short, you will be the conduit through which we revolutionize health decisions using real-time data.
Key Responsibilities:
- Write robust, modular, production-ready code in Python and SQL, following best practices in OOP, version control (Git), and software design principles.
- Collaborate with other data scientists to design and productionize ML models and integrate them into end-to-end data systems.
- Build tools, frameworks and ETL/ELT pipelines to enable efficient data access, processing, and model deployment.
- Apply a working knowledge of common ML algorithms (classification, regression, clustering, etc.) to support experimentation and solution design.
Here are some projects you can help with:
- Data science & stats-related projects
- Work on R&D projects for DTC
- Help build new Data Products
- Contribute to Bespoke Segments expansion
- Help us design and define the methodology for our measurement products and user identification
- Continuously improve the quality of HCP onboarding/Targeting/measurement
- Audience IQ/DTC product development, Identity graph/Data IQ
- Collaborate with internal teams to delight our customers with timely and accurate data reporting that meets all requirements
- Research & implement new data products or capabilities
- Automate data visualization and reporting capabilities that empower users (both internal and external) to access data on their own, thereby improving quality, accuracy, and speed
- Synthesize raw data into actionable insights to drive business results, identify key trends and opportunities for business teams, and report the findings in a simple, compelling way
- Evaluate and approve additional data partners or data assets to be utilized for identity resolution, targeting, or measurement
- Enhance PulsePoint's data reporting and insights generation capability by publishing internal reports about Health data
- Act as the "Subject Matter Expert" to help internal teams understand the capabilities of our platforms and how to implement and troubleshoot them
Requirements
Required qualifications:
- 2-6 years of hands-on experience as a Data Science Engineer, ML Engineer, or similar role
- 4-5+ years of relevant experience in:
  - Strong SQL skills for querying and managing structured datasets on cloud platforms and query engines such as GCP, AWS, and Trino
  - High proficiency in Excel (pivot tables, VLOOKUP, formulas, functions)
  - Data analysis & manipulation
  - Solid programming experience in Python, especially in production environments (modular design, data validation, error handling, testing)
- At least a Bachelor's degree in Business Intelligence and Analytics or closely related field
- Practical experience with:
  - Knowledge of Distributed Systems and Cluster computing frameworks like Apache Spark, for large-scale data processing and machine learning with PySpark ML
  - Google Cloud Architecture covering BigQuery, Cloud Storage (GCS), Compute Engine VMs, Dataproc clusters
  - ML Pipeline Orchestration
  - Deploying and managing ML models, with working knowledge of Bagging & Boosting Techniques, Model performance metrics, hyperparameter tuning etc.
  - MLOps practices, exposure to MLflow, Vertex AI, or other MLOps tools
- Experience with Containerization (Docker) and Kubernetes
- Knowledge of Airflow, Dagster, or similar orchestration tools
- Proven experience in experimentation methods and Stats modeling in support of product development and optimization
- Willing and able to work 3:30pm-12:30am IST (the role is fully remote)
Preferred qualifications:
- Experience with LookML & DBT
- Understanding of Frontend Dev Tools
- And one of:
  - ELT experience
  - Tableau/Looker/PowerBI
  - Experience with automation
- Able to organize large data sets to answer critical questions, extrapolate trends, and tell a story
- Experience in Programmatic/Adtech
- Familiarity with health-related data sets
What are 'red flags' for us:
Candidates won't succeed here if they haven't worked closely with data sets or have simply translated requirements created by others into SQL without a deeper understanding of how the data impacts our business and, in turn, our clients' success metrics.
Selection Process (order of these sessions may be subject to change):
1) Online SQL Test (40 mins)
2) Initial Screen (30 mins)
3) Hiring Manager Interview (45 mins)
4) Video call w/ Sr. Data Scientist (45 mins)
5) 1:1s w/ SVP of Data (30 mins)
