we at stgya are seeking a highly motivated and skilled ai/ml engineer with a strong focus on training and fine-tuning large language models (llms), vision language models (vlms), and small language models (slms).
the ideal candidate will have a thorough understanding of deep learning principles, experience with state-of-the-art model architectures, and a proven track record of developing and deploying high-performance ai models. you will play a crucial role in advancing our ai capabilities and contributing to the development of innovative ai-powered products and services.
responsibilities:
model training and fine-tuning:
- design, implement, and optimize training pipelines for llms, vlms, and slms using large-scale datasets.
- experiment with various training techniques, including transfer learning, reinforcement learning, and parameter-efficient fine-tuning (peft).
- evaluate and improve model performance using relevant metrics and benchmarks.
- address model biases and ensure fairness and robustness.
model architecture and development:
- research and implement cutting-edge model architectures and techniques for llms, vlms, and slms.
- adapt and customize existing models to meet specific application requirements.
- develop and maintain efficient code for model training and inference.
data management and processing:
- work with large-scale datasets, including text, images, and multimodal data.
- develop data preprocessing and augmentation pipelines to improve model performance.
- collaborate with data engineers to ensure data quality and availability.
infrastructure and deployment:
- optimize model training and inference for performance and scalability.
- deploy trained models to production environments.
- monitor and maintain deployed models.
- work with cloud-based infrastructure such as aws, gcp, or azure.
research and development:
- stay up-to-date with the latest advancements in ai/ml, particularly in llms, vlms, and slms.
- contribute to research projects and publications.
- collaborate with other researchers and engineers to develop innovative ai solutions.
collaboration and communication:
- work closely with cross-functional teams, including product managers, data scientists, and software engineers.
- communicate technical concepts and findings effectively to both technical and non-technical audiences.
- document all work clearly and effectively.
qualifications:
technical skills:
- strong understanding of deep learning principles and techniques.
- extensive experience with training and fine-tuning llms, vlms, and slms.
- proficiency in deep learning frameworks such as tensorflow, pytorch, or jax.
- experience with transformer-based architectures (e.g., bert, gpt, t5, vit).
- experience with cloud computing platforms (e.g., aws, gcp, azure).
- proficiency in python and other relevant programming languages.
- experience with version control systems (e.g., git).
preferred skills:
- experience with distributed training and large-scale model deployment.
- knowledge of natural language processing (nlp), computer vision, and multimodal learning.
- experience with reinforcement learning.
- experience with prompt engineering.
- experience with model quantization and pruning.
soft skills:
- strong problem-solving and analytical skills.
- excellent communication and collaboration skills.
- ability to work independently and as part of a team.
- strong passion for ai and machine learning.
- ability to adapt to fast-changing technologies.