AryaXAI
AI Infrastructure Engineer
Job Location
Job Description
AryaXAI stands at the forefront of AI innovation, revolutionizing AI for mission-critical businesses by building explainable, safe, and aligned systems that scale responsibly. Our mission is to create AI tools that empower researchers, engineers, and organizations to unlock AI's full potential while maintaining transparency and safety.
Our team thrives on a shared passion for cutting-edge innovation, collaboration, and a relentless drive for excellence. At AryaXAI, everyone contributes hands-on to our mission in a flat organizational structure that values curiosity, initiative, and exceptional performance.
Requirements:
- Core Languages: CUDA, Python
- Frameworks: CUTLASS, pybind11 (or similar, model serving frameworks)
- Tools: Nsight, JAX/XLA bindings
- Focus Areas: GPU Kernel Optimization, Deep Learning Inference & Training
Role Overview
We are looking for a highly skilled AI Researcher - GPU Kernel Developer to join our team and push the boundaries of high-performance AI computation. In this role, you will design, develop, and optimize GPU kernels that power state-of-the-art AI models. Your work will directly influence the performance and scalability of our AI systems.
Key Responsibilities:
- Develop and refine low-level CUDA kernel optimizations for deep learning inference and training.
- Profile, debug, and optimize single and multi-GPU operations using tools like Nsight.
- Deeply understand and exploit GPU memory hierarchies and computational capabilities.
- Implement cutting-edge methods from research papers into CUDA kernels.
- Collaborate on designing innovative solutions to achieve peak GPU performance.
Ideal Candidate Profile
We are looking for candidates with a proven track record of excellence in GPU programming and AI system optimization. You should bring:
Core Experiences:
- Expertise in designing high-performance GeMM CUDA kernels using Tensor cores or CUDA cores, leveraging tools like CuTe or CUTLASS.
- Proficiency in extending or writing custom attention and deep learning kernels from scratch.
- Confidence in writing both forward and backward kernels while managing floating-point precision errors.
- Strong optimization skills for both memory-bound and compute-bound operations.
- Advanced knowledge of GPU architecture, including register pressure, shared-memory usage, and GPU utilization.
Preferred Skills:
- Familiarity with profiling tools (e.g., Nsight) to identify bottlenecks and improve performance.
- Experience integrating custom kernels with frameworks like JAX/XLA through tools like pybind11.
- Awareness of the latest advancements in GPU optimization techniques for AI workloads.
Why Join AryaXAI?
- Mission-Driven Impact: Work on challenges that shape the future of responsible AI.
- Technical Excellence: Collaborate with a team of passionate and experienced professionals.
- Growth Opportunities: Contribute across domains and expand your expertise in GPU kernel development and AI research.
- Flexible Work Environment: Choose between remote work or relocation support to one of our key offices.
Interview Process
- Application Review: We review your CV and a statement of exceptional work.
- Initial Interview (15 Minutes): A technical team member will evaluate your basic skills and fit for the role.
Main Process:
- Coding Assessment: Solve programming challenges in your preferred language.
- Systems Problem-Solving: A live, hands-on session to showcase practical expertise.
- Project Deep-Dive: Present your most notable project to our team.
- Team Meet & Greet: Engage with the broader AryaXAI team.
Note: Our interviews are designed to conclude within one week to streamline your onboarding process.
Location: Mumbai, IN
Posted Date: 6/2/2025
Contact Information
Contact | Human Resources AryaXAI |
---|