ML Research

Job description of the ML Research team at NAVER AI Lab (Location: Seongnam, South Korea, or fully remote)

Mission

Our research interests lie in developing better machine learning models with diverse modalities and in developing general machine learning algorithms, such as trustworthy AI or optimization. Our current research topics mainly focus on, but are not limited to:

  • Multi-modal learning algorithms: Vision-language representation learning, audio-visual representation learning, language-audio learning, or other multi-modal learning.

  • ML Reliability: Robustness to distribution shifts, input noises, label noises, or other realistic scenarios. Target scenarios include domain generalization, de-biasing, algorithmic fairness, or robust learning against natural or adversarial corruptions.

  • Discoveries into datasets or models: Explainable AI, discovering shortcuts in datasets, or understanding the inner mechanisms of models.

  • Uncertainty estimation.

  • Constructing a new metric, dataset, or benchmark.

  • General machine learning algorithms.

We hire full-time regular research scientists and research interns. As a research scientist in ML Research at NAVER AI Lab, your mission will be to publish at top AI venues and to contribute to NAVER and the broader AI community through impactful research.

Requirements

  • Research scientist

    • Strong track record of publications at top-tier conferences in machine learning and computer vision, e.g., NeurIPS, ICLR, CVPR, ICCV, ECCV, and ICML.

    • Relevant work experience, e.g., as a laboratory researcher or a full-time industrial researcher.

    • Preferred

      • Ph.D. in CS, EE, mathematics, or other related technical fields, or equivalent work experience.

      • Strong programming skills in Python (PyTorch).

      • Experience serving as an active member of the research community (e.g., reviewing activities, tutorial and workshop organization, and research code contributions).

    • Responsibilities

      • Organize and execute one’s own research agenda.

      • Lead and collaborate on ambitious research projects.

  • Research intern

    • Experience in research collaborations and paper writing in related fields.

    • Proficient programming skills in Python (PyTorch).

    • Preferred

      • Currently enrolled in an MS or Ph.D. program in CS, EE, mathematics, or other related technical fields.

      • Strong track record of publications at top-tier conferences in machine learning, computer vision, natural language processing, audio, HCI, or speech.

Application Process and Contact

Please submit your application via the recruitment platform to register for our Talent Pool (sign-in required).

  • Internship: Korean/English

  • Full-time: Korean/English

  • Please be advised that the hiring process may take up to three months. If you have a deadline for another offer, we encourage you to reach out to HR for assistance.

  • If you specify that you are applying to "ML Research" in your application, we may be able to process it more quickly (if you do not specify a team, all sub-groups will review your application, which takes more time).

Selected papers

  • Multi-modal learning

    • [Gu and Chun 2024] Language-only Efficient Training of Zero-shot Composed Image Retrieval. Geonmo Gu*, Sanghyuk Chun*, Wonjae Kim, Yoohoon Kang, Sangdoo Yun, CVPR 2024.

    • [Lee 2024] Toward Interactive Regional Understanding in Vision-Large Language Models. Jungbeom Lee, Sanghyuk Chun*, Sangdoo Yun*, NAACL 2024.

    • [Chun 2024] Improved Probabilistic Image-Text Representations. Sanghyuk Chun, ICLR 2024.

    • [Park 2024] Bridging Vision and Language Spaces with Assignment Prediction. Jungin Park, Jiyoung Lee, Kwanghoon Sohn, ICLR 2024.

    • [Kim 2023a] Dense text-to-image generation with attention modulation. Yunji Kim, Jiyoung Lee, Jin-Hwa Kim, Jung-Woo Ha, Jun-Yan Zhu, ICCV 2023.

    • [Kim 2023b] Hierarchical visual primitive experts for compositional zero-shot learning. Hanjae Kim, Jiyoung Lee, Seongheon Park, Kwanghoon Sohn, CVPR 2023.

    • [Lee 2023a] Imaginary Voice: Face-styled Diffusion Model for Text-to-Speech. Jiyoung Lee, Joon Son Chung, Soo-Whan Chung, ICASSP 2023.

    • [Gu and Chun 2023] CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion. Geonmo Gu*, Sanghyuk Chun*, Wonjae Kim, HeeJae Jun, Yoohoon Kang, Sangdoo Yun, preprint 2023.

    • [Chun 2022] ECCV Caption: Correcting False Negatives by Collecting Machine-and-Human-verified Image-Caption Associations for MS-COCO. Sanghyuk Chun, Wonjae Kim, Song Park, Minsuk Chang, Seong Joon Oh, ECCV 2022.

    • [Chun 2021a] Probabilistic Embeddings for Cross-Modal Retrieval. Sanghyuk Chun, Seong Joon Oh, Rafael Sampaio de Rezende, Yannis Kalantidis, Diane Larlus, CVPR 2021.

    • [Park 2023a] RoCOCO: Robust Benchmark MS-COCO to Stress-test Robustness of Image-Text Matching Models. Seulki Park, Daeho Um, Hajung Yoon, Sanghyuk Chun, Sangdoo Yun, Jin Young Choi, preprint 2023.

    • [Lee 2023b] Lifelong Audio-video Masked Autoencoder with Forget-robust Localized Alignments. Jaewoo Lee, Jaehong Yoon, Wonjae Kim, Yunji Kim, Sung Ju Hwang, preprint 2023.

    • [Park and Chun 2021] Few-shot Font Generation with Localized Style Representations and Factorization. Song Park*, Sanghyuk Chun*, Junbum Cha, Bado Lee, Hyunjung Shim, AAAI 2021.

    • [Park 2021] Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Experts. Song Park, Sanghyuk Chun, Junbum Cha, Bado Lee, Hyunjung Shim, ICCV 2021.

  • ML Reliability

    • [Bahng 2020] Learning De-biased Representations with Biased Representations. Hyojin Bahng, Sanghyuk Chun, Sangdoo Yun, Jaegul Choo, Seong Joon Oh, ICML 2020.

    • [Cha 2021] SWAD: Domain Generalization by Seeking Flat Minima. Junbum Cha, Sanghyuk Chun, Kyungjae Lee, Han-Cheol Cho, Seunghyun Park, Yunsung Lee, Sungrae Park, NeurIPS 2021.

    • [Cha 2022] Domain Generalization by Mutual-Information Regularization with Pre-trained Models. Junbum Cha, Kyungjae Lee, Sungrae Park, Sanghyuk Chun, ECCV 2022.

    • [Chun 2021b] StyleAugment: Learning Texture De-biased Representations by Style Augmentation without Pre-defined Textures. Sanghyuk Chun, Song Park, arXiv preprint.

    • [Jung 2022] Learning Fair Classifiers with Partially Annotated Group Labels. Sangwon Jung, Sanghyuk Chun, Taesup Moon, CVPR 2022.

    • [Jung and Park 2023] Re-weighting based Group Fairness Regularization via Classwise Robust Optimization. Sangwon Jung*, Taeeon Park*, Sanghyuk Chun, Taesup Moon, ICLR 2023.

    • [Lee 2022] Weakly Supervised Semantic Segmentation using Out-of-Distribution Data. Jungbeom Lee, Seong Joon Oh, Sangdoo Yun, Junsuk Choe, Eunji Kim, Sungroh Yoon, CVPR 2022.

    • [Park 2022a] The Majority Can Help The Minority: Context-rich Minority Oversampling for Long-tailed Classification. Seulki Park, Youngkyu Hong, Byeongho Heo, Sangdoo Yun, Jin Young Choi, CVPR 2022.

    • [Song 2021b] Robust Learning by Self-Transition for Handling Noisy Labels. Hwanjun Song, Minseok Kim, Dongmin Park, Yooju Shin, Jae-Gil Lee, KDD 2021.

    • [Song 2022b] Learning from Noisy Labels with Deep Neural Networks: A Survey. Hwanjun Song, Minseok Kim, Dongmin Park, Yooju Shin, Jae-Gil Lee, TNNLS 2022.

    • [Yun 2019] CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features. Sangdoo Yun, Dongyoon Han, Seong Joon Oh, Sanghyuk Chun, Junsuk Choe, Youngjoon Yoo, ICCV 2019.

  • Constructing a new metric, dataset, or benchmark

    • [Chun 2022] ECCV Caption: Correcting False Negatives by Collecting Machine-and-Human-verified Image-Caption Associations for MS-COCO. Sanghyuk Chun, Wonjae Kim, Song Park, Minsuk Chang, Seong Joon Oh, ECCV 2022.

    • [Park 2023a] RoCOCO: Robust Benchmark MS-COCO to Stress-test Robustness of Image-Text Matching Models. Seulki Park, Daeho Um, Hajung Yoon, Sanghyuk Chun, Sangdoo Yun, Jin Young Choi, preprint 2023.

    • [Choe and Oh 2020] Evaluating Weakly Supervised Object Localization Methods Right. Junsuk Choe*, Seong Joon Oh*, Seungho Lee, Sanghyuk Chun, Zeynep Akata, Hyunjung Shim, CVPR 2020.

    • [Choe and Oh 2022] Evaluation for Weakly Supervised Object Localization: Protocol, Metrics, and Datasets. Junsuk Choe*, Seong Joon Oh*, Sanghyuk Chun, Seungho Lee, Zeynep Akata, Hyunjung Shim, PAMI 2022.

    • [Chun 2019] An Empirical Evaluation on Robustness and Uncertainty of Regularization Methods. Sanghyuk Chun, Seong Joon Oh, Sangdoo Yun, Dongyoon Han, Junsuk Choe, Youngjoon Yoo, ICML Workshop 2019.

    • [Naeem and Oh 2020] Reliable Fidelity and Diversity Metrics for Generative Models. Muhammad Ferjad Naeem*, Seong Joon Oh*, Youngjung Uh, Yunjey Choi, Jaejun Yoo, ICML 2020.

  • Discoveries into datasets or models

    • [Kim and Choe 2021] Keep CALM and Improve Visual Feature Attribution. Jae Myung Kim, Junsuk Choe, Zeynep Akata, Seong Joon Oh, ICCV 2021.

    • [Scimeca 2022] Which shortcut cues will DNNs choose? a study from the parameter-space perspective. Luca Scimeca, Seong Joon Oh, Sanghyuk Chun, Michael Poli, Sangdoo Yun, ICLR 2022.

    • [Hwang 2022] Similarity of Neural Architectures Based on Input Gradient Transferability. Jaehui Hwang, Dongyoon Han, Byeongho Heo, Song Park, Sanghyuk Chun*, Jong-Seok Lee*, preprint 2022.

    • [Park 2023b] What Do Self-Supervised Vision Transformers Learn? Namuk Park, Wonjae Kim, Byeongho Heo, Taekyung Kim, Sangdoo Yun, ICLR 2023.

  • Uncertainty estimation

    • [Chun 2024] Improved Probabilistic Image-Text Representations. Sanghyuk Chun, ICLR 2024.

    • [Park 2022b] Probabilistic representations for video contrastive learning. Jungin Park, Jiyoung Lee, Ig-Jae Kim, Kwanghoon Sohn, CVPR 2022.

    • [Chun 2021a] Probabilistic Embeddings for Cross-Modal Retrieval. Sanghyuk Chun, Seong Joon Oh, Rafael Sampaio de Rezende, Yannis Kalantidis, Diane Larlus, CVPR 2021.

  • General machine learning algorithms

    • [Heo 2021] AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights. Byeongho Heo, Sanghyuk Chun, Seong Joon Oh, Dongyoon Han, Sangdoo Yun, Gyuwan Kim, Youngjung Uh, Jung-Woo Ha. ICLR 2021.

    • [Park and Yun 2022] A Unified Analysis of Mixed Sample Data Augmentation: A Loss Function Perspective. Chanwoo Park*, Sangdoo Yun*, Sanghyuk Chun, NeurIPS 2022.

    • [Park and Chun 2024] What Does Automatic Differentiation Compute for Neural Networks? Sejun Park*, Sanghyuk Chun*, Wonyeol Lee, ICLR 2024.
