APPIT Software - Solutions Delivered

Transform your business from legacy systems to AI-powered solutions. Enterprise capabilities at SMB-friendly pricing.

Contact

  • info@appitsoftware.com

Global Offices

  • 🇮🇳 India (HQ): PSR Prime Towers, 704 C, 7th Floor, Gachibowli, Hyderabad, Telangana 500032
  • 🇺🇸 USA: 16192 Coastal Highway, Lewes, DE 19958
  • 🇦🇪 UAE: IFZA Business Park, Dubai Silicon Oasis, DDP Building A1, Dubai
  • 🇸🇦 Saudi Arabia: Futuro Tower, King Saud Road, Riyadh

© 2026 APPIT Software Solutions. All rights reserved.

AI & Machine Learning | Full-time | Hybrid

Reinforcement Learning Engineer

Design reinforcement learning systems at APPIT Software in Montreal, building adaptive AI agents for optimization, autonomous decision-making, and RLHF alignment of large language models.

Montreal, Canada

Responsibilities

  • Design and implement reinforcement learning algorithms for enterprise optimization problems
  • Build RLHF and reward modeling pipelines for LLM alignment and fine-tuning
  • Develop simulation environments for training and evaluating RL agents
  • Implement multi-agent reinforcement learning systems for complex coordination tasks
  • Optimize RL training stability and sample efficiency using state-of-the-art techniques
  • Collaborate with research teams to translate RL advances into production applications

Requirements

  • 5+ years of ML experience with 2+ years focused on reinforcement learning
  • Deep knowledge of RL algorithms (PPO, SAC, DQN, MCTS, and their variants)
  • Experience with RL frameworks (Stable-Baselines3, RLlib, CleanRL)
  • Strong mathematical background in dynamic programming, control theory, and optimization
  • Experience with RLHF for language model alignment
  • Proficiency in PyTorch and experience with parallel environment simulation

Nice to Have

  • Publications in RL research (NeurIPS, ICML, ICLR)
  • Experience with robotics or autonomous systems
  • Knowledge of offline RL and decision transformers

Skills

Python, PyTorch, Reinforcement Learning, RLHF, PPO, Simulation, Multi-Agent RL, Optimization
