Nandi Schoots
PhD student, King's College London, Imperial College London
Verified email at imperial.ac.uk
Title
Cited by
Year
A theory of representation learning gives a deep generalisation of kernel methods
AX Yang, M Robeyns, E Milsom, B Anson, N Schoots, L Aitchison
International Conference on Machine Learning, 39380-39415, 2023
Cited by: 19; Year: 2023
Improving activation steering in language models with mean-centring
O Jorgensen, D Cope, N Schoots, M Shanahan
Responsible Language Models @AAAI, 2023
Cited by: 14; Year: 2023
Any Deep ReLU Network is Shallow
MJ Villani, N Schoots
arXiv preprint arXiv:2306.11827, 2023
Cited by: 9; Year: 2023
Dissecting Language Models: Machine Unlearning via Selective Pruning
N Pochinkov, N Schoots
arXiv preprint arXiv:2403.01267, 2024
Cited by: 8; Year: 2024
Dissecting Large Language Models
N Pochinkov, N Schoots
Socially Responsible Language Modelling Research @NeurIPS, 2023
Cited by: 4; Year: 2023
Learning to Communicate with Strangers via Channel Randomisation Methods
D Cope, N Schoots
4th Workshop on Emergent Communication at NeurIPS 2020, 2021
Cited by: 3; Year: 2021
Extending Activation Steering to Broad Skills and Multiple Behaviours
T van der Weij, M Poesio, N Schoots
arXiv preprint arXiv:2403.05767, 2024
Cited by: 2; Year: 2024
Finding Sparse Initialisations using Neuroevolutionary Ticket Search (NeTS)
A Jackson, N Schoots, A Ahantab, M Luck, E Black
Artificial Life Conference Proceedings 35 2023 (1), 110, 2023
Cited by: 2; Year: 2023
Safety Properties of Inductive Logic Programming
G Leech, N Schoots, J Skalse
SafeAI @AAAI, 2021
Cited by: 2; Year: 2021
Comparing Optimization Targets for Contrast-Consistent Search
H Fry, S Fallows, I Fan, J Wright, N Schoots
Socially Responsible Language Modelling Research @NeurIPS, 2023
Cited by: 1; Year: 2023
Low-Entropy Latent Variables Hurt Out-of-Distribution Performance
N Schoots, D Cope
Domain Generalization @ICLR, 2023
Cited by: 1; Year: 2023
Hidden in Plain Text: Emergence & Mitigation of Steganographic Collusion in LLMs
Y Mathew, O Matthews, R McCarthy, J Velja, CS de Witt, D Cope, ...
arXiv preprint arXiv:2410.03768, 2024
2024
Training Neural Networks for Modularity aids Interpretability
S Golechha, D Cope, N Schoots
arXiv preprint arXiv:2409.15747, 2024
2024
Channel Randomisation Methods for Zero-Shot Communication
D Cope, N Schoots
ECAI 2024, 3620-3627, 2024
2024
The Propensity for Density in Feed-forward Models
N Schoots, A Jackson, A Kholmovia, P McBurney, M Shanahan
ECAI 2024, 2830-2837, 2024
2024
Steganography in Large Language Models: Investigating Emergence and Mitigations
Y Mathew, R McCarthy, O Matthews, J Velja, N Schoots, D Cope
Red Teaming GenAI: What Can We Learn from Adversaries?
Emergence of Steganography Between Large Language Models
Y Mathew, R McCarthy, J Velja, O Matthews, N Schoots, D Cope
Workshop on Socially Responsible Language Modelling Research