APPIT Software - Solutions Delivered

Transform your business from legacy systems to AI-powered solutions. Enterprise capabilities at SMB-friendly pricing.

Contact

  • info@appitsoftware.com

Global Offices

  • 🇮🇳 India (HQ): PSR Prime Towers, 704 C, 7th Floor, Gachibowli, Hyderabad, Telangana 500032
  • 🇺🇸 USA: 16192 Coastal Highway, Lewes, DE 19958
  • 🇦🇪 UAE: IFZA Business Park, Dubai Silicon Oasis, DDP Building A1, Dubai
  • 🇸🇦 Saudi Arabia: Futuro Tower, King Saud Road, Riyadh

© 2026 APPIT Software Solutions. All rights reserved.

Data Engineering | Full-time | Hybrid

Senior Data Engineer (Apache Spark)

Lead the design of large-scale distributed data processing systems using Apache Spark and cloud platforms at APPIT Software in San Francisco.

San Francisco, USA
Full-time
Data Engineering

Responsibilities

  • Architect and optimize large-scale Spark jobs processing terabytes of data daily
  • Design data lake and lakehouse architectures on AWS S3 or Azure Data Lake Storage
  • Mentor junior data engineers on distributed computing best practices and Spark internals
  • Build and maintain real-time and batch data pipelines with robust fault tolerance
  • Partner with ML engineers to deliver feature stores and training data sets at scale
  • Drive performance tuning, including partitioning, caching, and shuffle optimization

Requirements

  • 6+ years of data engineering experience, including at least 3 years focused on Apache Spark
  • Deep understanding of distributed computing, MapReduce paradigms, and cluster resource management
  • Expert-level Python and/or Scala programming for Spark application development
  • Experience with cloud data services (AWS EMR, Glue, Redshift, or Azure Synapse)
  • Strong knowledge of data lake architectures, Delta Lake, or Apache Iceberg table formats
  • Proven ability to optimize Spark jobs for cost efficiency and processing speed

Nice to Have

  • Experience with Databricks Unified Analytics Platform
  • Knowledge of stream processing with Spark Structured Streaming or Apache Flink
  • Contributions to open-source data projects

Skills

Apache Spark, Python, Scala, AWS, Delta Lake, SQL, Data Lake, Airflow

Related Positions

  • Data Engineer (Python & SQL): Data Engineering, On-site, Hyderabad, India, 3-5 yrs
  • Snowflake Data Engineer: Data Engineering, Hybrid, Seattle, USA, 4-6 yrs
  • Senior DevOps Engineer: Cloud & Infrastructure, Hybrid, San Francisco, USA, 6-9 yrs
  • Senior AI/ML Engineer: AI & Machine Learning, Hybrid, San Francisco, USA, 6+ yrs