About this role
Zyphra is an artificial intelligence company based in San Francisco, California.
THE ROLE:
As a Research Engineer - Audio & Speech Models, you will be a core contributor on Zyphra’s Audio Team, building the next generation of open-source autoencoders, ASR, TTS, SSL, and speech-to-speech models. You will be deeply involved in the entire model training process, from data gathering and processing to designing novel architectures and training methodologies.
YOU’LL WORK ACROSS:
- Large-scale audio training runs
- Performance optimization of our training stack
- Audio dataset collection, processing, and evaluation
- Architecture and training methodology ablations and improvements
WHAT WE'RE LOOKING FOR / REQUIREMENTS:
- Strong research taste and intuition. The ability to work through a research project from conception to execution to write-up.
- Strong implementation and prototyping ability (can take an idea from conception to experimentation quickly)
- The ability to work well with others in a fast-paced research setting
- Excellent communication and collaboration skills, and the ability to work effectively on both research and engineering implementation at scale
QUALIFICATIONS / ADDITIONAL SKILLS:
- Expertise and intuition for training models in the audio domain, including text-to-speech, ASR, speech-to-speech, speech emotion recognition, or related models
- Experience in training audio autoencoders
- Understanding of signal processing, especially of audio signals
- Experience with diffusion models, consistency models, or GANs
- Experience with training on large-scale (multi-node) GPU clusters
- Strong grasp of proper experimental methodology for running rigorous ablations and other hypothesis testing
- Understanding of and interest in large-scale, highly parallel data processing pipelines
- Proficiency with PyTorch and Python
- Experience contributing to large pre-existing codebases and rapidly getting up to speed
- Previously published machine learning research in well-respected venues
- Postgraduate degree in a scientific subject (Computer Science, EE/EECS, Mathematics, Physics, Machine Learning)
WHY WORK AT ZYPHRA:
- Our research methodology is grounded in methodical, step-by-step approaches to ambitious goals. Deep research and engineering excellence are equally valued
- We strongly value new and crazy ideas and are very willing to bet big on them
- We move as quickly as we can; we aim to keep the bar to impact as low as possible
- We all enjoy what we do and love discussing AI
BENEFITS AND PERKS:
- Comprehensive medical, dental, vision, and FSA plans
- Competitive compensation and 401(k) plan
- Relocation and immigration support on a case-by-case basis
- In-office snacks and meals provided
- Unlimited PTO and company holidays
- In-person team in San Francisco with a collaborative, high-energy environment
Tech stack
PyTorch, Python