Improving activation steering in language models with mean-centring O Jorgensen, D Cope, N Schoots, M Shanahan AAAI-24 Workshop on Responsible Language Models (ReLM 2024), 2024 | 14 | 2024 |
Learning to Communicate with Strangers via Channel Randomisation Methods D Cope, N Schoots 4th Workshop on Emergent Communication at NeurIPS 2020, 2020 | 3 | 2020 |
Learning to plan with tree search via deep RL D Cope, J Svegliato, S Russell Bridging the Gap Between AI Planning and Reinforcement Learning Workshop at …, 2023 | 2 | 2023 |
Learning Translations: Emergent Communication Pretraining for Cooperative Language Acquisition D Cope, P McBurney Proceedings of the 33rd International Joint Conference on Artificial …, 2024 | 1 | 2024 |
Real-time Evolution of Multicellularity with Artificial Gene Regulation D Cope Proceedings of the 2023 Conference on Artificial Life, 2023 | 1 | 2023 |
Low-Entropy Latent Variables Hurt Out-of-Distribution Performance D Cope, N Schoots Domain Generalization at ICLR 2023, 2023 | 1* | 2023 |
Joining the Conversation: Towards Language Acquisition for Ad Hoc Team Play D Cope, P McBurney The 5th Emergent Communication Workshop at the International Conference on …, 2022 | 1 | 2022 |
Hidden in Plain Text: Emergence & Mitigation of Steganographic Collusion in LLMs Y Mathew, O Matthews, R McCarthy, J Velja, CS de Witt, D Cope, ... arXiv preprint arXiv:2410.03768, 2024 | | 2024 |
Training Neural Networks for Modularity aids Interpretability S Golechha, D Cope, N Schoots arXiv preprint arXiv:2409.15747, 2024 | | 2024 |
Channel Randomisation Methods for Zero-Shot Communication D Cope, N Schoots ECAI 2024, 3620-3627, 2024 | | 2024 |
Mimicry and the Emergence of Cooperative Communication D Cope, P McBurney The 2024 International Conference on Artificial Life (ALIFE), 2024 | | 2024 |
A Measure of Explanatory Effectiveness D Cope, P McBurney 1st International Workshop on Trusted Automated Decision-Making, 2021 | | 2021 |
Steganography in Large Language Models: Investigating Emergence and Mitigations Y Mathew, R McCarthy, O Matthews, J Velja, N Schoots, D Cope Red Teaming GenAI: What Can We Learn from Adversaries? | | |
Emergence of Steganography Between Large Language Models Y Mathew, R McCarthy, J Velja, O Matthews, N Schoots, D Cope Workshop on Socially Responsible Language Modelling Research | | |