Language Research
Research Scientists & Research Interns @ Language Research, NAVER AI Lab
Join Our Team!
About Us
The Language Research Team at NAVER AI Lab is dedicated to understanding humanity and society, and to advancing language models and Artificial Intelligence that are human-like yet trustworthy and safe. As a team operating in both academic and industrial environments, we strive to tackle problems that are both fundamental and relevant to the real world.
Our current Research Mission and Interests are centered around building trustworthy and safe Large Language Models (LLMs), with a focus on:
Datasets, Benchmarks, and Evaluation Metrics for LLMs
LLM Security: Attacks, Defenses & Detections
Safety Alignment, Learning, and Inference Algorithms
LLM Agents, (Multi-)Agent Interactions, Decision-making, and Autonomous Agents
Check out our latest papers (selected papers, all).
About the Research Scientists
About the Role
We are looking for Research Scientists to join our team and work on the research and development of safe and trustworthy Language Models and AI. Research Scientists are encouraged to lead and/or support research projects collaboratively within the team and across the research field, other teams, and external organizations.
Specifically, research topics include, but are not limited to, the following:
Red-teaming, Adversarial Attack, Security Attack
Watermarking
Training Data/Privacy Probing & Leakage
Model/Data/Task Contamination
Robustness
Safety Alignment
Model Unlearning
AI Explainability & Interpretability
Causality
Societal Impact of LLM Applications
Key Responsibilities
Undertake pioneering research by formulating challenging research questions and devising problem-solving methods.
Lead a wide range of research activities including but not limited to the ideation and development of safe and trustworthy AI systems, and authoring research papers.
Communicate research progress and findings clearly and effectively.
Actively collaborate with other researchers.
Report and present the research findings and developments at top-tier academic venues.
Requirements
Holds a PhD degree or equivalent (or expects to receive one within 6 months) in Computer Science (CS), Electrical Engineering (EE), Mathematics, or other relevant fields.
An academic publication record at top-tier conferences in Natural Language Processing (e.g., *ACL), Machine Learning (e.g., NeurIPS, ICLR), and others (e.g., FAccT).
Experience in research collaborations and academic writing in related fields.
(Preferred) Global research/industrial collaboration experiences.
Excellent analytical and problem-solving skills.
Strong communication skills, openness to constructive discussion, and receptiveness to feedback.
How to apply
Hiring process
Application screening → Coding test → Job talk → Interview → (optional) Second Interview → Notification
About the Internship
Our team is offering research intern positions for winter 2023 and spring/summer 2024. As an intern, you'll be actively involved in developing and conducting research on trustworthy and safe large language models.
Before starting your internship, we will collaborate closely to refine and develop your research plan. This process ensures that your proposal aligns with our mutual research interests. We strongly support your initiative to lead your main project while also engaging in other research projects. This approach offers a balanced experience in both research leadership and collaboration.
A key goal of this internship is to produce an academic paper suitable for submission to top-tier conferences or journals. Additionally, we anticipate that the outcomes of your project will make meaningful contributions to real-world applications.
This is a full-time, in-person role at NAVER 1784 (Seongnam-si, Gyeonggi-do, South Korea).
The office location may change to NAVER Green Factory or another nearby building.
This internship offers a flexible starting date.
Key Responsibilities
Undertake pioneering research by formulating challenging research questions and devising problem-solving methods. This includes implementing and evaluating models, as well as authoring research papers.
Communicate research progress and findings clearly and effectively.
Demonstrate proactivity and the ability to successfully complete projects.
Requirements
Pursuing a PhD or equivalent in Computer Science (CS), Electrical Engineering (EE), Mathematics, or other relevant fields.
At least one first-author paper at an AI/ML-related conference.
(Preferred) A strong academic publication record at top-tier conferences in Natural Language Processing (e.g., *ACL), Machine Learning (e.g., NeurIPS, ICLR), and others.
Experience in research collaborations and academic writing in related fields.
Excellent analytical and problem-solving skills.
Strong communication skills, openness to constructive discussion, and receptiveness to feedback.
How to apply
Your application should include the following:
CV
Brief research interests and research plans that include:
research questions and goals, with a few related works.
a brief idea and direction for solving the problem (it does not need to be perfect!).
Hiring process
Application screening → Coding test → Job talk → Interview → (optional) Second Interview → Notification
Note
If you are planning to start in summer, please submit your application by Feb. 28.
This position may close early once it is filled.
We look forward to your application and the possibility of you joining our team. If you have any questions, please contact us! 🤗
Selected Papers
LifeTox: Unveiling Implicit Toxicity in Life Advice, M Kim, J Koo, H Lee, J Park, H Lee, K Jung, arXiv preprint arXiv:2311.09585
dataset & benchmark
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models, S Kim, J Shin, Y Cho, J Jang, S Longpre, H Lee, S Yun, S Shin, S Kim, J Thorne, M Seo, arXiv preprint arXiv:2310.08491
dataset
evaluation
EvalLM: Interactive Evaluation of Large Language Model Prompts on User-Defined Criteria, TS Kim, Y Lee, J Shin, YH Kim, J Kim, arXiv preprint arXiv:2309.13633
evaluation
KoBBQ: Korean Bias Benchmark for Question Answering, J Jin, J Kim, N Lee, H Yoo, A Oh, H Lee, arXiv preprint arXiv:2307.16778
dataset & benchmark
Revealing User Familiarity Bias in Task-Oriented Dialogue via Interactive Evaluation, T Kim, J Shin, YH Kim, S Bae, S Kim, arXiv preprint arXiv:2305.13857
evaluation
The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning, S Kim, SJ Joo, D Kim, J Jang, S Ye, J Shin, M Seo, EMNLP 2023
dataset
Aligning Large Language Models through Synthetic Feedback, S Kim, S Bae, J Shin, S Kang, D Kwak, KM Yoo, M Seo, EMNLP 2023
alignment
ProPILE: Probing Privacy Leakage in Large Language Models, S Kim, S Yun, H Lee, M Gubri, S Yoon, SJ Oh, NeurIPS 2023 (spotlight)
llm-security
Who Wrote this Code? Watermarking for Code Generation, T Lee, S Hong, J Ahn, I Hong, H Lee, S Yun, J Shin, G Kim, arXiv preprint arXiv:2305.15060
llm-security
KoSBi: A Dataset for Mitigating Social Bias Risks Towards Safer Large Language Model Application, H Lee, S Hong, J Park, T Kim, G Kim, JW Ha, ACL 2023 (industry track)
dataset & benchmark
SQuARe: A Large-Scale Dataset of Sensitive Questions and Acceptable Responses Created Through Human-Machine Collaboration, H Lee, S Hong, J Park, T Kim, M Cha, Y Choi, BP Kim, G Kim, EJ Lee, Y Lim, A Oh, S Park, JW Ha, ACL 2023 (best paper nominated)
dataset & benchmark
Query-Efficient Black-Box Red Teaming via Bayesian Optimization, D Lee, JY Lee, JW Ha, JH Kim, SW Lee, H Lee, HO Song, ACL 2023
llm-security
Critic-Guided Decoding for Controlled Text Generation, M Kim, H Lee, KM Yoo, J Park, H Lee, K Jung, ACL 2023 (Findings)
learning & inference
ClaimDiff: Comparing and Contrasting Claims on Contentious Issues, M Ko, I Seong, H Lee, J Park, M Chang, M Seo, ACL 2023 (Findings)
dataset & benchmark