Fries, J., Seelam, N., Altay, G., Weber, L., Kang, M., Datta, D., Su, R., Garda, S., Wang, B., Ott, S., Samwald, M., & Kusa, W. (2022). Dataset Debt in Biomedical Language Modeling. In Proceedings of BigScience Episode #5 -- Workshop on Challenges & Perspectives in Creating Large Language Models (pp. 137–145). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.bigscience-1.10