ML Researcher – Coding
Location: Hyderabad / Bangalore
Experience: 2+ years
About Deccan AI
Deccan AI is a high-growth, venture-backed AI model training and evaluation company headquartered in the Bay Area. Founded by alumni of IIT Bombay and IIM Ahmedabad and by ex-Googlers, we partner with the world’s top frontier AI labs, including Google DeepMind, Snowflake, and several cutting-edge research groups. We are backed by Prosus Ventures, and our India office is based in Hyderabad.
We’re not just participating in the AI race; we’re building the infrastructure that powers it.
With 1M+ global experts, advanced automation, and vertically integrated platforms, we deliver the gold-standard data that world-class AI models rely on. The AI data annotation market is exploding and is set to quadruple by 2032. The opportunity? Massive, and you can help define the future.
About the role:
The ML Researcher – Coding will focus on building and curating datasets for code understanding, generation, and related developer‑assistant capabilities. The role is highly experimental and involves partnering with the data delivery team to define, collect, and refine high‑quality coding datasets.
What you will do:
Design tasks and evaluation schemes for code generation, debugging, refactoring, and multi‑step coding workflows.
Specify and co‑create large custom datasets for programming languages and frameworks of interest, working closely with the data delivery team.
Prototype ML models or prompts on these datasets, analyze failure modes, and feed insights back into new data collection rounds.
Define edge cases and nuanced coding scenarios to ensure robust coverage in the datasets.
Track latest research in code LLMs and coding assistants, and adapt relevant ideas into data‑centric experiments.
What you should have:
2+ years of experience in ML or AI for code, or strong software engineering experience plus ML exposure; senior candidates are encouraged to apply.
Strong programming skills (Python plus at least one of: Java, C++, TypeScript/JavaScript, Go, or similar).
Good understanding of ML for code or strong interest in this space, with the ability to design meaningful coding tasks and benchmarks.
Demonstrated research/experimentation mindset (papers, open‑source repos, contributions on Hugging Face or similar).
Master’s or PhD preferred, but not mandatory for candidates with strong evidence of research inclination.
What Sets Us Apart
We operate at the intersection of human expertise and next-gen automation across three flagship platforms:
Databench
A secure, enterprise-grade annotation platform with drag-and-drop customisation, client-side data protection, dedicated deployments, and automated large-scale workflow management.
Expert Network
A global, vetted community powering RLHF, SFT, data/code annotation, SQL generation, multimodal evaluation, and rigorous “agentic” assessments.
RL Gym
A first-of-its-kind reinforcement learning gym where top models and agents are stress-tested and tuned against real-world challenges at scale.
We eliminate annotation backlogs, solve security and compliance complexity, ensure national-scale quality, and enable truly reliable multimodal AI.
Why Deccan AI
Work directly with frontier AI research teams shaping the future of autonomous intelligence.
See your work go live, powering real-world products, not sitting in a repo.
Join a team obsessed with building AI that thinks, reasons, and acts.
An environment that values speed, creativity, and first-principles innovation.
This Is the Moment.
If you want to work on AI that speaks, reasons, and evolves, this is your once-in-a-decade opportunity.
Let’s build the future — together. 🌍