Job Description, Duties & Responsibilities
Progress Rail's data science team is looking for a motivated and talented Data Scientist who will primarily focus on developing Machine Learning/Artificial Intelligence-based data models for condition-based monitoring of its assets. In this role, the candidate will contribute to the design, development, and deployment of world-class rail products and services vital to our customers' needs. Reporting to the Director of Data Science, this role will enable innovative, strategic, and high-tech solutions for the rail industry through the application of specialized knowledge, skills, and abilities. Work involves independent judgement, problem-solving skills, resourcefulness, teamwork, and creativity in ambiguous situations. A high degree of personal initiative is a prerequisite. Typical data science team efforts are a combination of some or all of the key job elements listed below.
The ideal candidate is an experienced self-starter with strong attention to detail and excellent written and verbal communication skills who enjoys working in a collaborative, fast-paced environment and is willing to take on roles outside their comfort zone. Technical aptitude and fluency in Machine Learning and Data Science tools and processes are a must. The role works closely with the different engineering teams.
Key Job Elements
Contribute to the design, development, testing, and deployment of software systems and applications.
Process, cleanse, and verify the integrity of data used for analysis.
Apply Machine Learning and other advanced analytical techniques to develop models for condition-based monitoring of locomotive systems.
Apply Natural Language Processing (NLP) and Large Language Models (LLMs) to support text mining, document summarization, and related tasks.
Understand the business needs and develop data-based solutions.
Perform ad hoc analysis and present results in a clear manner.
Support the resolution of field-reported issues through data analysis.
Perform system integration of Machine Learning models.
Mentor and assist data scientists, providing technical assistance and direction as needed.
Technical Skills
7+ years of experience as a Data Scientist in Data Extraction, Data Modeling, Data Wrangling, Statistical Modeling, Data Mining, Machine Learning, and Data Visualization.
Expertise in transforming business resources and requirements into manageable data formats and analytical models, designing algorithms, building models, and developing data mining and reporting solutions that scale across a massive volume of structured and unstructured data.
Proficiency in managing the entire data science project life cycle, with active involvement in all of its phases, including data acquisition, data cleaning, data engineering, feature scaling, feature engineering, testing and validation, and data visualization.
Expertise in applying Machine Learning algorithms (such as Regression Models, XGBoost, Neural Networks, and others) for predictive analytics.
Experience in Generative AI, including developing LLM-based solutions for document search and summarization using a Retrieval-Augmented Generation (RAG) architecture.
Expertise in applied statistics, such as distributions, statistical testing, and regression.
Strong experience with Python, SQL, and R.
Experience with and knowledge of the AWS cloud, including Machine Learning-related services, S3, Elasticsearch, Lambda, and others.
Experience in integrating Machine Learning models into larger deployed systems.
Proficiency in data visualization tools such as Power BI, Matplotlib (Python), and R Shiny to create visually powerful and actionable interactive reports and dashboards.
Experience in Natural Language Processing and Text Mining.
Strong business sense and the ability to communicate data insights to both technical and non-technical clients.
Ability to perform all job duties without close supervision.
Desired:
Rail industry experience
Experience in developing models using telematics (sensor) data from equipment such as engines and other machinery.
Qualifications and Education Requirements
B.S., M.S., or Ph.D. degree in a quantitative discipline such as data science, data analytics, computer science, engineering, statistics, mathematics, or another related field.
7+ years of data science experience with a B.S., or 5+ years of experience with an advanced degree.
7+ years of experience with Python, R, SQL, and relational databases.
