Data Scientist (NLP / LLM) Job at Beacon Talent, Menlo Park, CA

  • Beacon Talent
  • Menlo Park, CA

Job Description

About the Client
Our client is an innovative company at the intersection of healthcare and artificial intelligence. Founded by experts in health technology, clinical AI, and academia, they are building a leading AI training and validation platform. Their mission is to simplify data access for AI developers while ensuring the highest standards of data quality and patient safety. With an ambitious vision, they aim to become the go-to platform for responsible AI development in healthcare.

About the Role
Our client is seeking a skilled Data Scientist with expertise in Natural Language Processing (NLP) and Large Language Models (LLMs) to work with structured and unstructured healthcare data. This role will focus on extracting meaningful insights from multimodal clinical datasets to support AI model development. The ideal candidate will collaborate with data scientists and technical teams to create high-quality, AI-ready datasets, driving innovation in patient care through data-driven solutions.

Responsibilities

  • Develop NLP, LLM, and other ML-based pipelines to extract relevant information from text-based healthcare data and store it in scalable data models.
  • Query complex source systems, including electronic medical records (EMRs) and clinical provider notes, to curate high-quality datasets for AI training and real-world evidence analysis.
  • Lead data extraction, transformation, labeling, and quality control tasks to ensure accurate dataset development.
  • Stay informed on advancements in applied NLP and generative AI, integrating cutting-edge methods into workflows where applicable.
  • Design and validate multimodal data structures by integrating information from diverse healthcare sources.
  • Write clean, well-documented code that complies with HIPAA and other data privacy regulations.
  • Troubleshoot data-related challenges and ensure the accuracy and integrity of all datasets.
  • Collaborate with the technical product team to enhance data-related product offerings.
  • Communicate complex findings and methodologies in a clear, accessible manner to both technical and non-technical audiences.

Requirements

  • Advanced degree in Data Science, Biomedical Informatics, Computer Science, Biostatistics, or a related quantitative field. A bachelor's degree with significant experience may also be considered.
  • Minimum of 2 years of experience in data science and machine learning, particularly in building pipelines for unstructured and semi-structured data extraction. Prior experience with healthcare data is highly desirable.
  • Proficiency in Python and SQL, with experience using ML/NLP libraries such as PyTorch, TensorFlow, and Hugging Face.
  • Familiarity with modern LLM applications and techniques.
  • Strong technical writing, analytical, and communication skills.
  • Excellent organizational abilities with the capacity to manage multiple projects in a dynamic environment.
  • Experience in startup environments is a plus.

Preferred Technical Skills

  • Experience with Git and version control.
  • Understanding of encryption methods and data security.
  • Experience querying enterprise data warehouses (EDWs) and working with electronic medical record systems (e.g., Epic, Cerner, Allscripts).
  • Knowledge of medical ontologies and healthcare data standards.
  • Experience handling imaging data formats like DICOM.
  • Familiarity with AWS cloud infrastructure.
  • Prior experience publishing research in peer-reviewed journals is a bonus.

Why Join?

  • Fully remote role with flexible working hours.
  • Comprehensive benefits, including healthcare, dental, vision, PTO, and more.
  • Dedicated professional development days for skill enhancement.
  • Collaborative and intellectually stimulating work environment.
  • Balance between focused project time and team collaboration.
  • Mission-driven company focused on improving patient care through AI innovation.
  • Opportunity for occasional travel for in-person team collaboration.

If you’re passionate about working with healthcare data and driving advancements in AI-driven patient care, we encourage you to apply!

Job Tags

Remote job, Flexible hours
