AryaXAI
Sr Software Engineer Developer
Job Location
Job Description
AryaXAI stands at the forefront of AI innovation, revolutionizing AI for mission-critical businesses by building explainable, safe, and aligned systems that scale responsibly. Our mission is to create AI tools that empower researchers, engineers, and organizations to unlock AI's full potential while maintaining transparency and safety.
Our team thrives on a shared passion for cutting-edge innovation, collaboration, and a relentless drive for excellence. At AryaXAI, everyone contributes hands-on to our mission in a flat organizational structure that values curiosity, initiative, and exceptional performance.
As a Senior Software Engineer, you will be a technical leader within the Platform Engineering team, responsible for architecting and implementing the most critical components of the AryaXAI platform. You will move beyond feature development to take ownership of entire systems, from initial design to production deployment and long-term reliability. Your role will involve tackling our most complex technical challenges, such as designing a cost-aware distributed job scheduler, building a high-performance AI inferencing setup, or architecting our multi-cloud serverless framework. You will mentor junior engineers, drive technical decision-making, and set the standard for engineering excellence across the team.
Key Responsibilities
- Architect and Own Core Systems: Lead the design, development, and operational ownership of large-scale distributed systems that are the foundation of our AI inferencing platform.
- Solve High-Complexity Problems: Tackle ambiguous and challenging engineering problems in areas like high-performance computing, distributed consensus, resource scheduling, and multi-cloud orchestration.
- Drive Technical Strategy: Influence the team's technical roadmap, evaluate new technologies, and make key architectural decisions that impact the entire platform's scalability, reliability, and cost-effectiveness.
- Mentor and Elevate the Team: Act as a technical mentor to other engineers, providing guidance through code reviews, design discussions, and pair programming, fostering a culture of high-quality engineering.
- Champion Best Practices: Set and enforce best practices for software development, including advanced testing strategies, CI/CD automation, system observability, and operational readiness.
- Performance Optimization: Profile and optimize critical system components, diving deep into performance bottlenecks across the stack, from Python code to network I/O and infrastructure configuration.
Required Qualifications
- Extensive Experience: 5+ years of professional software engineering experience, with a demonstrated track record of designing and delivering complex backend systems.
- Python at Scale: Expert in Python, with experience building scalable systems.
- Systems Management Tooling: Experience with Puppet/Chef/Ansible, Amazon Web Services (AWS), Git, Graphite, and related tools for large-scale systems management is a must.
- Expert-Level Python: Deep, idiomatic knowledge of Python. This includes advanced concepts like asynchronous programming (asyncio), concurrency models (multithreading/multiprocessing), the Global Interpreter Lock (GIL), and experience with performance profiling and memory optimization tools.
- Proven System Design Skills: Demonstrable experience architecting scalable, fault-tolerant, and highly available distributed systems. You should be able to fluently discuss the trade-offs of different architectural patterns (e.g., microservices vs. monoliths), data consistency models (e.g., eventual vs. strong), and communication protocols (e.g., REST, gRPC, message queues).
Deep Cloud and Distributed Systems Knowledge:
- Cloud Infrastructure: Hands-on experience building and deploying production workloads on at least one major cloud provider (AWS, GCP, Azure).
- Messaging & Queuing: Practical experience with message brokers like RabbitMQ or Kafka for building asynchronous, event-driven systems.
- Caching Strategies: In-depth knowledge of caching patterns and hands-on experience with systems like Redis or Memcached.
- Database Expertise: A strong understanding of the trade-offs between SQL and NoSQL databases and experience designing data models for performance and scale.
Mastery of Containerization and Orchestration:
- Docker: Expert-level understanding of Docker, including multi-stage builds, image optimization, and container networking.
- Kubernetes: Deep, hands-on experience managing production applications on Kubernetes. This goes beyond deploying simple services and includes understanding the Kubernetes control plane, writing custom controllers/operators, and configuring networking (CNI), storage (CSI), and security policies.
Preferred Qualifications
- High-Performance Computing (HPC): Experience writing or integrating with performance-critical code using lower-level languages like C++, Go, or Rust.
- Building Developer Platforms: Experience building internal developer platforms (IDPs), custom schedulers, or orchestration frameworks from the ground up.
- AIOps and AI Infrastructure: Familiarity with the AIOps lifecycle and experience with tools like Kubeflow, MLflow, or building model inference servers (e.g., Triton, TorchServe, or custom vLLM/SGLang implementations).
- Advanced Kubernetes: Experience with service meshes (e.g., Istio, Linkerd) and building custom CRDs and Operators to extend Kubernetes functionality.
- Multi-Cloud and Serverless Architecture: Direct experience designing and implementing systems that span multiple cloud providers or building custom serverless platforms.
What you'll get:
- Highly competitive and meaningful compensation package
- One of the best health care plans that covers not only you but also your family
- A great team
- Micro-entrepreneurial tasks and responsibilities
- Career development and leadership opportunities
Location: Mumbai, IN
Posted Date: 7/17/2025
Contact Information
Contact: Human Resources, AryaXAI