Businesses with large proprietary data stores increasingly need to develop their own “private” large language models (LLMs). At the same time, threats to data privacy grow along with LLM and dataset sizes. Blattner Technologies therefore plans to address privacy preservation in LLMs by applying federated machine learning (FML), differential privacy, and/or privacy detection algorithms.
1.2 Desired Outcomes
Prototype capabilities demonstrating benefits and/or highlighting challenges in creating privacy-aware LLMs. The prototype will focus on backend code but may also include UX capabilities.
Presentation to broader company highlighting approach, challenges, solutions, and significant insights stemming from the effort.
1.3 Core Skills Required
Required skills:
Fundamental LLM knowledge (e.g., prompt engineering, fine-tuning, training)
NLP techniques (e.g., tokenization, vectorization)
ML techniques (e.g., neural network training, graph/parameter extraction and setting, textification)
Python development
Privacy preservation techniques (e.g., differential privacy, homomorphic encryption)
Optional/preferable skills:
Knowledge of federated machine learning
Kubeflow
1.4 Estimated Effort
Full-time summer internship (40 hours/week)
Depending on progress, work may extend to part-time during the Fall semester (e.g., 10 hours/week)
1.5 Additional Information
This is a remote internship opportunity; the intern will work with summer mentors and report to the Chief Product Officer of BOSS AI. The group has a deep focus on implementing LLMs “as a service” (LLMaaS), and team members have a range of skills spanning enterprise software engineering, NLP, ML, and UX. You can expect to gain valuable experience in operationalizing LLMs and addressing critical security needs for all language models.