SUMMER SEMINAR
| DATE | SUBJECT | PRESENTER | MATERIALS |
|------|---------|-----------|-----------|
| 07.04 | Separate the Wheat from the Chaff: Model Deficiency Unlearning via Parameter-Efficient Module Operation | Jung, Dahyun | link |
| | Machine Unlearning of Pre-trained Large Language Models | | |
| | Fine-tuning Language Models for Factuality | Kang, Myunghoon | link |
| | Assessing Factual Reliability of Large Language Model Knowledge | | |
| | Language models can explain neurons in language models | Chun, Yong Chan | link |
| | Sparse Autoencoders Find Highly Interpretable Features in Language Models | | |
| 07.11 | QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models | Lim, Jungwoo | link |
| | OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models | | |
| | INSIDE: LLMs’ Internal States Retain the Power of Hallucination Detection | Seo, Jaehyung | link |
| | On Large Language Models’ Hallucination with Regard to Known Facts | | |
| | ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation Systems | Park, Chanhee | link |
| | LLM Comparative Assessment: Zero-shot NLG Evaluation through Pairwise Comparisons using Large Language Models | | |
| 07.18 | Can Large Language Models be Good Emotional Supporter? Mitigating Preference Bias on Emotional Support Conversation | Son, Suhyune | link |
| | FEEL: A Framework for Evaluating Emotional Support Capability with Large Language Models | | |
| | LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models | Kim, Minhyuk | link |
| | Divergent Token Metrics: Measuring degradation to prune away LLM components – and optimize quantization | | |
| | Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity | Jang, Youngjoon | link |
| | ARAGOG: Advanced RAG Output Grading | | |
| 07.25 | Longformer: The Long-Document Transformer | Kim, Jeongwook | link |
| | Generating Long Sequences with Sparse Transformers | | |
| | When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards | Eo, Sugyeong | link |
| | RouteLLM: Learning to Route LLMs with Preference Data | | |
| | Toward Informal Language Processing: Knowledge of Slang in Large Language Models | Shim, Gyuho | link |
| | Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study | | |
| 08.01 | Knowledge Graph Enhanced Large Language Model Editing | Lee, Jaewook | link |
| | MEMoE: Enhancing Model Editing with Mixture of Experts Adaptors | | |
| | Neuron-Level Knowledge Attribution in Large Language Models | Kim, Dongjun | link |
| | Towards Uncovering How Large Language Model Works: An Explainability Perspective | | |
| | Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning | Moon, Hyeonseok | link |
| | QuRating: Selecting High-Quality Data for Training Language Models | | |
| 08.08 | RARR: Researching and Revising What Language Models Say, Using Language Models | Kim, Jinsung | link |
| | A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models | | |
| | Instruction Pre-Training: Language Models are Supervised Multitask Learners | Lee, Seungyoon | link |
| | FUN with Fisher: Improving Generalization of Adapter-Based Cross-lingual Transfer with Scheduled Unfreezing | | |
| | Retrieval meets Long Context Large Language Models | Son, Junyoung | link |
| | Understanding Finetuning for Factual Knowledge Extraction | | |
| 08.22 | RAFT: Adapting Language Model to Domain Specific RAG | Jang, Yoonna | link |
| | Injecting New Knowledge Into Large Language Models via Supervised Fine-tuning | | |
| | Deceptive Semantic Shortcuts on Reasoning Chains: How Far Can Models Go without Hallucination? | Koo, Seonmin | link |
| | DFA-RAG: Conversational Semantic Router for Large Language Model with Definite Finite Automaton | | |
| | Unveiling Linguistic Regions in Large Language Models | Kim, Dongjun | link |
| | Anthropocentric Bias and the Possibility of Artificial Cognition | | |
| 08.29 | Not all Layers of LLMs are Necessary during Inference | Hong, Seongtae | link |
| | Tokenization Falling Short: The Curse of Tokenization | | |
| | Challenging the Validity of Personality Tests for Large Language Models | Moon, Hyeonseok | link |
| | Who is ChatGPT? Benchmarking LLMs’ Psychological Portrayal Using PsychoBench | | |
| | Self-Alignment with Instruction Backtranslation | Lee, Jungseob | link |
| | Self-Rewarding Language Models | | |
WINTER SEMINAR
| DATE | SUBJECT | PRESENTER | MATERIALS |
|------|---------|-----------|-----------|
| 01.04 | ALCUNA: Large Language Models Meet New Knowledge | Lee, Jungseob | link |
| | Large Language Models Can Self-Improve | | |
| | Evaluating Large Language Models at Evaluating Instruction Following | Moon, Hyeonseok | link |
| | Human Feedback is not Gold Standard | | |
| | Language Representation Projection: Can We Transfer Factual Knowledge across Languages in Multilingual Language Models? | Hong, Seongtae | link |
| | SoulChat: Improving LLMs’ Empathy, Listening, and Comfort Abilities through Fine-tuning with Multi-turn Empathy Conversations | | |
| 01.11 | Inference-Time Intervention: Eliciting Truthful Answers from a Language Model | Jung, Dahyun | link |
| | Critic-Driven Decoding for Mitigating Hallucinations in Data-to-text Generation | | |
| | Hallucination Mitigation in Natural Language Generation from Large-Scale Open-Domain Knowledge Graphs | Seo, Jaehyung | link |
| | The Troubling Emergence of Hallucination in Large Language Models – An Extensive Definition, Quantification, and Prescriptive Remediations | | |
| | Unveiling the Pitfalls of Knowledge Editing for Large Language Models | Son, Junyoung | link |
| | RA-DIT: Retrieval-Augmented Dual Instruction Tuning | | |
| 01.19 | Emergent and Predictable Memorization in Large Language Models | Lim, Jungwoo | link |
| | ProPILE: Probing Privacy Leakage in Large Language Models | | |
| | CESAR: Automatic Induction of Compositional Instructions for Multi-turn Dialogs | Koo, Seonmin | link |
| | SELF-ICL: Zero-Shot In-Context Learning with Self-Generated Demonstrations | | |
| 02.01 | The case for 4-bit precision: k-bit Inference Scaling Laws | Lee, Jaewook | link |
| | LLM-FP4: 4-Bit Floating-Point Quantized Transformers | | |
| | Beyond Factuality: A Comprehensive Evaluation of Large Language Models as Knowledge Generators | Kang, Myunghoon | link |
| | FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation | | |
| | Direct Preference Optimization: Your Language Model is Secretly a Reward Model | Kim, Jeongwook | link |
| | Mixtral of Experts | | |
| 02.22 | Prompting is not a substitute for probability measurements in large language models | Kim, Jinsung | link |
| | Evaluating Large Language Models on Controlled Generation Tasks | | |
| | Knowledge-Enhanced Mixed-Initiative Dialogue System for Emotional Support Conversations | Son, Suhyune | link |
| | Enhancing Empathetic and Emotion Support Dialogue Generation with Prophetic Commonsense Inference | | |
| 02.29 | Bridging the Digital Divide: Performance Variation across Socio-Economic Factors in Vision-Language Models | Lee, Seungyoon | link |
| | Merging Generated and Retrieved Knowledge for Open-Domain QA | | |
| | MoLE: Mixture of LoRA Experts | Eo, Sugyeong | link |
| | Mixture-of-Experts Meets Instruction Tuning: A Winning Combination for Large Language Models | | |
| | DYNOSAUR: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation | Jang, Yoonna | link |
| | Explore-Instruct: Enhancing Domain-Specific Instruction Coverage through Active Exploration | | |
SUMMER SEMINAR
| DATE | SUBJECT | PRESENTER | MATERIALS | COMMENTS |
|------|---------|-----------|-----------|----------|
| 08.03 | Think-on-Graph: Deep and Responsible Reasoning of Large Language Model with Knowledge Graph | Son, Suhyune | link | |
| | Chain of Knowledge: A Framework for Grounding Large Language Models with Structured Knowledge Bases | | | |
| | Rethinking with Retrieval: Faithful Large Language Model Inference | | | |
| | How Language Model Hallucinations Can Snowball | Eo, Sugyeong | link | |
| | From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models | | | |
| | Detoxifying Text with MARCO: Controllable Revision with Experts and Anti-Experts | | | |
| | Generate rather than Retrieve: Large Language Models are Strong Context Generators | Lee, Seungyoon | link | |
| | Guess the Instruction! Flipped Learning Makes Language Models Strong Zero-Shot Learners | | | |
| | Leveraging Large Language Models for Multiple Choice Question Answering | | | |
| 08.10 | Self-Instruct: Aligning Language Models with Self-Generated Instructions | Lee, Jeongwoo | link | |
| | WizardLM: Empowering Large Language Models to Follow Complex Instructions | | | |
| | Large Language Models Can Self-Improve | | | |
| | ZeRO: Memory Optimizations Toward Training Trillion Parameter Models | Kim, Jeongwook | link | |
| | ZeRO-Infinity: Breaking the GPU Memory Wall for Extreme Scale Deep Learning | | | |
| | Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism | | | |
| | Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning | Moon, Hyeonseok | link | |
| | Parameter-Efficient Fine-Tuning Design Spaces | | | |
| | Distill or Annotate? Cost-Efficient Fine-Tuning of Compact Models | | | |
| 08.18 | Linearly Mapping from Image to Text Space | Lee, Jungseob | link | |
| | MAGMA – Multimodal Augmentation of Generative Models through Adapter-based Finetuning | | | |
| | MAPL: Parameter-Efficient Adaptation of Unimodal Pre-Trained Models for Vision-Language Few-Shot Prompting | | | |
| | Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models | | | |
| | Visual Instruction Tuning | | | |
| | LLaMA: Open and Efficient Foundation Language Models | Lee, Seungjun | link | |
| | FLAN: Finetuned Language Models are Zero-Shot Learners | | | |
| | G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment | | | |
| | Knowledge-Augmented Language Model Prompting for Zero-Shot Knowledge Graph Question Answering | Lee, Jaewook | link | |
| | Enhanced Story Comprehension for Large Language Models through Dynamic Document-Based Knowledge Graphs | | | |
| | ChatDB: Augmenting LLMs with Databases as Their Symbolic Memory | | | |
| 08.24 | LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention | Hong, Seongtae | link | |
| | LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model | | | |
| | LIMA: Less Is More for Alignment | | | |
| | Plug-and-Play Knowledge Injection for Pre-trained Language Models | Jung, Dahyun | link | |
| | Towards Continual Knowledge Learning of Language Models | | | |
| | Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback | | | |
| | HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models | Lim, Jungwoo | link | |
| | Mitigating Language Model Hallucination with Interactive Question-Knowledge Alignment | | | |
| | PURR: Efficiently Editing Language Model Hallucinations by Denoising Language Model Corruptions | | | |
| 08.31 | FIREBALL: A Dataset of Dungeons & Dragons Actual-Play with Structured Game State Information | Kim, Jinsung | link | comments |
| | Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models | | | |
| | What, When, and How to Ground: Designing User Persona-Aware Conversational Agents for Engaging Dialogue | | | |
| | Automatic Chain of Thought Prompting in Large Language Models | Son, Junyoung | link | |
| | Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models | | | |
| | Verify-and-Edit: A Knowledge-Enhanced Chain-of-Thought Framework | | | |
| | Zero-shot Faithful Factual Error Correction | Kang, Myunghoon | link | |
| | SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models | | | |
| | Language Models (Mostly) Know What They Know | | | |
| 09.07 | HellaSwag: Can a Machine Really Finish Your Sentence? | Seo, Jaehyung | link | comments |
| | Measuring Massive Multitask Language Understanding | | | |
| | Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge | | | |
| | TruthfulQA: Measuring How Models Mimic Human Falsehoods | | | |
| | Clues Before Answers: Generation-Enhanced Multiple-Choice QA | Koo, Seonmin | link | |
| | Aligning Instruction Tasks Unlocks Large Language Models as Zero-Shot Relation Extractors | | | |
| | Say What You Mean! Large Language Models Speak Too Positively about Negative Commonsense Knowledge | | | |
| | LoRA: Low-Rank Adaptation of Large Language Models | Jang, Yoonna | link | |
| | Stack More Layers Differently: High-Rank Training Through Low-Rank Updates | | | |
| | LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition | | | |
WINTER SEMINAR
| DATE | SUBJECT | PRESENTER | MATERIALS |
|------|---------|-----------|-----------|
| 01.26 | RankGen: Improving Text Generation with Large Ranking Models | Lim, Jungwoo | link |
| | Z-LaVI: Zero-Shot Language Solver Fueled by Visual Imagination | | |
| | Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space | | |
| | Generative Language Models for Paragraph-Level Question Generation | Kang, Myunghoon | link |
| | Varifocal Question Generation for Fact-checking | | |
| | Generating Literal and Implied Subquestions to Fact-check Complex Claims | | |
| 02.02 | Detecting Label Errors by Using Pre-Trained Language Models | Lee, Seungjun | link |
| | Style Transfer as Data Augmentation: A Case Study on Named Entity Recognition | | |
| | Break it Down into BTS: Basic, Tiniest Subword Units for Korean | | |
| | SALTED: A Framework for SAlient Long-tail Translation Error Detection | Eo, Sugyeong | link |
| | CTRLsum: Towards Generic Controllable Text Summarization | | |
| | SentBS: Sentence-level Beam Search for Controllable Summarization | | |
| 02.09 | AMAL: Meta Knowledge-Driven Few-Shot Adapter Learning | Kim, Jinsung | link |
| | Dictionary-Assisted Supervised Contrastive Learning | | |
| | Fast Vocabulary Transfer for Language Model Compression | | |
| | Revisiting Parameter-Efficient Tuning: Are We Really There Yet? | Moon, Hyeonseok | link |
| | Evaluating Parameter Efficient Learning for Generation | | |
| | An Empirical Study on the Transferability of Transformer Modules in Parameter-Efficient Fine-Tuning | | |
| 02.16 | Entity-centered Cross-document Relation Extraction | Son, Junyoung | link |
| | DocInfer: Document-level Natural Language Inference using Optimal Evidence Selection | | |
| | Entity Extraction in Low Resource Domains with Selective Pre-training of Large Language Models | | |