Research Engineer / Research Scientist, Vision
New York City, NY; San Francisco, CA; Seattle, WA | Full-Time | Mid-level | Software Engineering
About Anthropic
- Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.
About the role
- We’re looking for research engineers with a strong computer vision background who believe that visual and spatial reasoning are core to fully unlocking the capabilities of LLMs. In this role, you'll work on research, development, and evaluation for state-of-the-art Claude models, with a focus on visual and spatial capabilities. This role is highly collaborative and will touch many aspects of our broader research efforts, taking a full-stack approach across pretraining, RL, and runtime techniques like agentic harnesses. Additionally, you’ll partner with the product org to ensure that the vision improvements you deliver impact Claude’s performance on real-world tasks.
What you'll do
- Run experiments to evaluate architectural variants, data strategies, and SL and RL techniques to improve Claude’s vision
- Develop and test tools, skills, and agentic infrastructure that enable Claude to reason over visual inputs
- Create evaluations and benchmarks that measure progress on multimodal capabilities across training and deployment
- Work with our product org to find solutions to our most vexing API customer challenges related to vision and spatial reasoning
You may be a good fit if you
- Have 7+ years of ML, computer vision, and software engineering experience through industry, academia, or other projects
- Are familiar with the architecture, training, and operation of large vision language models
- Have experience creating and evaluating large synthetic and real-world visual training datasets
- Have experience engaging in systematic prompting, finetuning, or evaluation
- Are results-oriented, with a bias towards flexibility and impact
- Enjoy pair programming and cross-team collaboration
- Care about the societal impacts of your work
Strong candidates may also have experience with
- Large-scale pretraining, SL, and RL on language models
- Deep learning research on images, video, or other modalities
- Developing complex agentic systems using LLMs
- High-performance ML systems (GPUs, TPUs, JAX, PyTorch)
- Large-scale ETL and data pipeline development
Representative projects
- Running experiments to determine ideal training datamixes and parameters for a synthetically generated vision dataset
- Finetuning Claude to maximize its performance using a particular set of agent tools/skills
- Building a pipeline to ingest and process a novel source of visual training data
- Designing and running experiments to evaluate the scalability of two architectural variants
How we're different
- We believe that the highest-impact AI research will be big science. At Anthropic we work as a single cohesive team on just a few large-scale research efforts. We value impact (advancing our long-term goals of steerable, trustworthy AI) over work on smaller and more specific puzzles. We view AI research as an empirical science, which has as much in common with physics and biology as with traditional efforts in computer science. We're an extremely collaborative group, and we host frequent research discussions to ensure that we are pursuing the highest-impact work at any given time. As such, we greatly value communication skills.
- The easiest way to understand our research directions is to read our recent research. This research continues many of the directions our team worked on prior to Anthropic, including: GPT-3, Circuit-Based Interpretability, Multimodal Neurons, Scaling Laws, AI & Compute, Concrete Problems in AI Safety, and Learning from Human Preferences.
