Mastech InfoTrellis

Data Scientist - Optical Character Recognition

Job Location

Bangalore, India

Job Description

Job Title: Data Scientist
Company: Mastech Digital
Location: Bangalore Urban, Karnataka, India
Position Type: Full Time
Duration: Permanent
Notice Period: Immediate Joiner / Serving Notice / Less than 30 Days
Experience: 5 Years

About the Role:

Mastech Digital is seeking a highly skilled and experienced Data Scientist to join our dynamic team. In this role, you will be responsible for developing and deploying advanced AI models, with a focus on OCR, LLMs, and computer vision. You will work within the AWS ecosystem, adhering to best practices for code quality, data security, and model deployment. This position requires a strong understanding of machine learning techniques and cloud technologies, and the ability to collaborate effectively with cross-functional teams.

Responsibilities/Duties:

AI Model Development and Deployment:
- Train and fine-tune AI models using OCR and Large Language Models (LLMs).
- Develop and implement computer vision models for object detection and segmentation.
- Deploy and maintain models in production, collaborating with software engineers.

Cloud Infrastructure and Architecture:
- Utilize AWS services, including SageMaker, Bedrock, Lambda, S3, and API Gateway, for model development and deployment.
- Adhere to the AWS Well-Architected Framework for robust and scalable solutions.

Data Management and Security:
- Perform data cleaning and preprocessing to ensure high-quality training data.
- Ensure data confidentiality and implement HIPAA compliance measures.

Software Development Practices:
- Follow internal best practices for code monitoring, testing, and version control.
- Implement CI/CD pipelines using Jenkins and other relevant tools.
- Conduct thorough QA and application testing.

Model Evaluation and Optimization:
- Perform robust testing of models to ensure accuracy and reliability.
- Compare the feasibility of different models and select the most appropriate solution.
- Fine-tune LLMs (Mistral, Llama, and other open-source models) and perform prompt tuning.

Collaboration and Communication:
- Collaborate with other data scientists to divide work and ensure timely project completion.
- Meet deadlines, attend weekly/bi-weekly meetings, and provide regular updates.
- Create data visualizations to communicate results to non-technical stakeholders.
- Test and implement NER models.

Hugging Face and Related Technologies:
- Familiarity with Hugging Face packages.

Skills:

Programming and Data Science:
- Proficient in Python.
- Strong SQL skills.
- Experience with data cleaning and big data processing.
- Experience with OCR and NER models.

Cloud Technologies (AWS):
- Extensive experience with AWS SageMaker, Bedrock, Lambda, S3, and API Gateway.
- Proficiency in using the Textract API.

Machine Learning and AI:
- Experience with training and fine-tuning LLMs (Mistral, Llama, etc.).
- Proficiency in prompt tuning.
- Experience with computer vision models for object detection and segmentation.

DevOps and CI/CD:
- Experience with CI/CD pipelines and version control systems.
- Proficiency in using Jenkins.

Hugging Face:
- Familiarity with Hugging Face packages.

Qualifications:
- 5 years of experience as a Data Scientist.
- Bachelor's or Master's degree in Computer Science, Data Science, or a related field.
- Strong understanding of machine learning algorithms and techniques.
- Excellent problem-solving and analytical skills.
- Strong communication and collaboration skills.
- Ability to work independently and as part of a team.

Preferred Qualifications:
- Experience with healthcare data and HIPAA compliance.
- AWS certifications.
- Experience with advanced computer vision techniques.

(ref:hirist.tech)

Location: Bangalore, IN

Posted Date: May 7, 2025

Contact Information

Contact Human Resources
Mastech InfoTrellis

UID: 5107222497
