Specialize in fine-tuning and optimizing large language models at APPIT Software in London, applying RLHF, DPO, LoRA, and advanced alignment techniques to build domain-specific AI systems.
London, UK
Full-time
AI & Machine Learning
Responsibilities
Lead LLM fine-tuning projects using LoRA, QLoRA, full fine-tuning, and continued pre-training
Design and curate high-quality training datasets for domain-specific model adaptation
Implement RLHF and DPO alignment pipelines for safety and instruction following
Optimize model inference using quantization (GPTQ, AWQ, GGUF) and pruning techniques
Build automated evaluation pipelines for measuring model quality across benchmarks
Research and implement state-of-the-art parameter-efficient fine-tuning methods
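The parameter-efficient methods named above (LoRA, QLoRA, adapters) share one idea: freeze the pretrained weight matrix and learn only a low-rank update. As a toy illustration of that idea in plain Python (dimensions, values, and function names here are illustrative, not from the posting or any library):

```python
def matmul(A, B):
    # naive matrix multiply, for illustration only
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def lora_effective_weight(W, A, B, alpha, r):
    """LoRA: effective weight W' = W + (alpha / r) * B @ A,
    where W (d x d) is frozen and only A (r x d) and B (d x r) are trained."""
    scale = alpha / r
    delta = matmul(B, A)
    return [[w + scale * dv for w, dv in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

d, r = 4, 1                      # hidden size 4, rank-1 update
W = [[0.0] * d for _ in range(d)]  # frozen base weight (zeros for clarity)
A = [[1.0] * d]                    # r x d
B = [[1.0] for _ in range(d)]      # d x r
W_eff = lora_effective_weight(W, A, B, alpha=2.0, r=r)
# trainable parameters: 2 * d * r = 8, versus d * d = 16 for full fine-tuning
```

The parameter saving grows with d: for a 4096-wide layer at rank 16, the adapter trains roughly 0.8% of the weights the full matrix would.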
Requirements
5+ years of ML engineering with 2+ years focused on LLM training and fine-tuning
Deep expertise in parameter-efficient fine-tuning (LoRA, QLoRA, adapters)
Experience with distributed training using DeepSpeed, FSDP, or Megatron-LM
Strong knowledge of model optimization, quantization, and serving at scale
Hands-on experience with Hugging Face Transformers, PEFT, and TRL libraries
Understanding of data curation, quality filtering, and synthetic data generation
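Of the alignment techniques the role covers, DPO is the simplest to state: it reduces preference alignment to a per-pair classification-style loss on policy-versus-reference log-probabilities, with no reward model or RL loop. A minimal sketch of that loss (the beta value and log-probabilities below are illustrative; in practice TRL's trainer computes this over batches):

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Per-example DPO loss:
    -log sigmoid(beta * ((logp_chosen - ref_chosen) - (logp_rejected - ref_rejected)))
    where logp_* come from the policy and ref_* from the frozen reference model."""
    margin = beta * ((logp_chosen - ref_chosen) - (logp_rejected - ref_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# policy identical to reference: margin is 0, loss is log 2 (~0.6931)
baseline = dpo_loss(-10.0, -10.0, -10.0, -10.0)
# policy has shifted probability toward the chosen response: loss drops
improved = dpo_loss(-5.0, -10.0, -10.0, -10.0)
```

Beta controls how strongly the policy is penalized for drifting from the reference while fitting the preferences.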
Nice to Have
Published research on LLM training or alignment
Experience with custom CUDA kernels or FlashAttention
Knowledge of EU AI Act compliance requirements
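Quantization comes up twice in this posting (GPTQ, AWQ, GGUF for inference, and model optimization at scale). All of these build on the same round-trip at their core; a toy symmetric int8 sketch of that underlying idea (this is an illustration of plain round-to-nearest quantization, not any of those specific formats):

```python
def quantize_int8(values):
    """Symmetric int8 quantization: q = round(x / s), with s = max|x| / 127."""
    scale = max(abs(v) for v in values) / 127.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats: x ~= q * s."""
    return [qi * scale for qi in q]

weights = [1.0, -0.5, 0.25]        # illustrative weight values
q, scale = quantize_int8(weights)
deq = dequantize(q, scale)
# round-trip error per value is bounded by scale / 2 for in-range inputs
```

Production schemes differ in how they pick the scale (per-channel, activation-aware, calibration data), which is where GPTQ and AWQ earn their accuracy at 4 bits.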
Skills
Python, PyTorch, LoRA, DeepSpeed, Hugging Face, RLHF, Model Quantization, Distributed Training