
Seminar Talk - 9

15 March 2024


Topic:

AI Safety Research

Presenter:

Dr. Adel Bibi, Senior Researcher at the Department of Engineering Science, University of Oxford

Date and Time:

Friday, March 15, 2024 at 01:00 PM

Location of event:

Beirut Arab University, Debbieh Campus, A3 building, meeting room

Zoom:

https://zoom.us/j/96417417775?pwd=bEd3WmFsNlZyWGh2WXpBTVd6MUhDZz09

Talk Summary:

This talk covers my research on AI safety, focusing on advances aimed at ensuring the robustness, alignment, and fairness of large language models (LLMs). It starts with the challenges posed by sensitivity in AI systems and strategies for providing provable guarantees against worst-case adversaries. Building on this, we turn to the alignment challenges and safety considerations of LLMs, addressing both their limitations and capabilities, particularly for techniques related to instruction prefix tuning and their theoretical limitations with respect to alignment. Finally, I will discuss fairness across languages in the tokenizers commonly used in LLMs.

Speaker Bio:

Adel Bibi is a senior researcher in machine learning and computer vision at the Department of Engineering Science of the University of Oxford, a Research Fellow of Kellogg College, and a member of the European Laboratory for Learning and Intelligent Systems (ELLIS) Society. Prior to that, Bibi was a senior research associate and a postdoctoral researcher at Oxford with Philip H.S. Torr since October 2020. He earned his MSc and PhD degrees from King Abdullah University of Science & Technology (KAUST) in 2016 and 2020, respectively. Bibi was awarded an Amazon Research Award in 2022 in the Machine Learning Algorithms and Theory track. Bibi has received four best paper awards: at a NeurIPS23 workshop, an ICML23 workshop, a 2022 CVPR workshop, and the Optimization and Big Data Conference in 2018. His contributions include over 30 papers published in top machine learning and computer vision conferences. He has also received four outstanding reviewer awards (CVPR18, CVPR19, ICCV19, ICLR22) and a notable Area Chair Award at NeurIPS23.