Lead Data Scientist
Atlanta | Full-Time | Lead | AI / Data Science
How will you contribute?
- Collect, analyze, and interpret small and large datasets to uncover insights that support the development of statistical methods and machine learning algorithms.
- Lead the design, training, and deployment of NLP and transformer-based models for financial surveillance and supervisory use cases (e.g., misconduct detection, market abuse, trade manipulation, insider communication).
- Develop machine learning models and other analytics following established workflows, while looking for optimization and improvement opportunities
- Data annotation and quality review
- Exploratory data analysis and model failure-state analysis
- Contribute to model governance, documentation, and explainability frameworks aligned with internal and regulatory AI standards.
- Guide clients and prospects through machine learning model and analytic fine-tuning and development processes
- Provide guidance to junior team members on model development and EDA
- Work with Product Manager(s) to gather project and product requirements and translate them into technical tasks within the team’s tooling, techniques, and procedures
- Continued self-led personal development
What will you bring?
- Strong understanding of financial markets, compliance, surveillance, supervision, or regulatory technology
- Experience with one or more data science and machine/deep learning frameworks and tooling, including scikit-learn, H2O, Keras, PyTorch, TensorFlow, pandas, NumPy, caret, tidyverse
- Command of data science and statistics principles (regression, Bayes, time series, clustering, precision/recall, AUROC, exploratory data analysis, etc.)
- Strong knowledge of key programming concepts (e.g. split-apply-combine, data structures, object-oriented programming)
- Solid statistics knowledge (hypothesis testing, ANOVA, chi-square tests, etc.)
- Knowledge of NLP transfer learning, including word embedding models (GloVe, fastText, word2vec) and transformer models and tooling (BERT, SBERT, Hugging Face, GPT-x, etc.)
- Experience with natural language processing toolkits like NLTK, spaCy, NVIDIA NeMo
- Knowledge of microservices architecture and continuous delivery concepts in machine learning and related technologies such as Helm, Docker, and Kubernetes
- Familiarity with deep learning techniques for NLP
- Familiarity with LLMs, using Ollama and LangChain
- Excellent verbal and written skills
- Proven collaborator, thriving on teamwork
- Master’s or Doctor of Philosophy degree in Computer Science, Applied Math, Statistics, or a scientific field
- Familiarity with cloud computing platforms (AWS, GCP, Azure)
- Experience with automated supervision/surveillance/compliance tools
