Senior Data Scientist
We are seeking a Senior Data Scientist to design, train, evaluate, and deliver machine learning models that solve operational problems across USCENTCOM's Data Office initiatives. This is a hands-on ML practitioner role, not a platform or infrastructure position. The Senior Data Scientist will work within an established on-premises Data Analytical Environment (DAE) built on a Data Lakehouse architecture with H100 GPU infrastructure, applying their expertise in statistical modeling, deep learning, and applied ML to turn enterprise data into actionable intelligence. The ideal candidate brings deep experience in model development across multiple problem domains (forecasting, NLP, anomaly detection, and classification) and can independently lead the ML practice for the team.

WHAT YOU WILL BE DOING:

Model Development & Training
- Design, train, and validate supervised, unsupervised, and deep learning models using open-source libraries (PyTorch, TensorFlow, Scikit-learn, XGBoost, LightGBM) to support forecasting, classification, anomaly detection, and NLP use cases
- Conduct rigorous experiment design: feature engineering, hyperparameter tuning, cross-validation, and evaluation using appropriate metrics (precision/recall/F1, RMSE, AUC-ROC) to ensure production-quality model performance
- Fine-tune and adapt open-source LLMs (Llama, Mistral, and similar) for domain-specific tasks including document summarization, entity extraction, and question-answering over classified and unclassified networks
- Develop and maintain RAG pipelines: chunking strategies, embedding model selection, retrieval evaluation, and prompt engineering to deliver high-quality LLM-augmented analytics

Applied Problem-Solving
- Translate mission requirements into ML solutions: work directly with analysts, operators, and leadership to scope problems, define success criteria, and deliver models that produce actionable operational insights
- Build models across multiple domains including predictive analytics (logistics, readiness),
  NLP/text analytics (reports, intelligence documents), anomaly detection (cybersecurity, network, behavioral), and computer vision where applicable
- Design lightweight, optimized models for edge and disconnected environments when required, supporting model optimization and conversion (ONNX, TensorRT, OpenVINO) for tactical deployment

MLOps & Lifecycle (Collaborative)
- Version, track, and reproduce experiments using MLflow, DVC, and Git; maintain clear documentation of model lineage, training data, and performance baselines
- Package trained models for deployment in containerized environments (Docker, Kubernetes) in coordination with the platform engineering team. Ownership of deployment infrastructure is flexible and project-dependent
- Integrate models into existing CI/CD pipelines, analytics platforms, and decision support tools in collaboration with the DevSecOps and data engineering teams

Data Security & Compliance
- Ensure all model development adheres to DoD security, encryption, and data handling standards, including tagging, metadata management, and retention policies
- Operate within classified and unclassified network environments (SIPR/NIPR), following cybersecurity and data stewardship protocols across air-gapped and hybrid infrastructure

WHAT YOU WILL NEED:

Education & Experience
- Bachelor's or Master's degree in Computer Science, Machine Learning, Statistics, Applied Mathematics, Data Science, or a related quantitative field
- 8+ years of hands-on AI/ML model development experience with a strong record of delivering production models, not just prototypes
- Compliance with DoD Directive 8140 (i.e., CompTIA Security+ CE certification)
- Active Secret clearance is required. Must be TS/SCI eligible
- Must be able to work on site at MacDill AFB.
  Not a remote role.

Technical Skills
- Strong Python proficiency and deep experience with open-source ML frameworks (PyTorch, TensorFlow, Scikit-learn, XGBoost, LightGBM, Hugging Face Transformers)
- Demonstrated ability to train, fine-tune, and evaluate models end to end, from raw data through feature engineering, model selection, training, validation, and production handoff
- Experience with LLM fine-tuning techniques (LoRA, QLoRA, PEFT) and RAG architecture design (vector databases, embedding strategies, retrieval evaluation)
- Working knowledge of MLOps toolchains (MLflow, DVC, Weights & Biases) and version control (Git)
- Familiarity with containerized deployment (Docker, Kubernetes) in air-gapped or on-premises environments
- Experience working with large-scale data systems and medallion/lakehouse architectures

DESIRED QUALIFICATIONS
- Experience with model optimization and conversion (ONNX, TensorRT, OpenVINO) for edge or tactical deployment
- Knowledge of NLP techniques applied to defense or intelligence domains (entity extraction, document classification, summarization of operational reports)
- Familiarity with distributed data frameworks (Apache Spark, Dask)
- Experience with edge AI hardware (NVIDIA Jetson, Coral TPU)

WHAT GDIT CAN OFFER:

At GDIT, the mission is our purpose, and our people are at the center of everything we do.
- Growth: AI-powered career tool that identifies career steps and learning opportunities
- Support: An internal mobility team focused on helping you achieve your career goals
- Rewards: Comprehensive benefits and wellness packages, 401(k) with company match, competitive pay, and paid time off
- Community: Award-winning culture of innovation and a military-friendly workplace