The Spanish/English Bilingual Youth Texts (BYTs) Corpus is a collection of 44,597 text messages (SMSs) sent and received by urban bilingual youth. The messages were downloaded directly from participants’ phones and include fully intact conversations as well as metadata about the phones and the messages. This corpus is free to researchers and students interested in text messaging from any perspective or discipline.

Participants in this study are emergent bilinguals living in New York City. They are enrolled in a high school equivalency program and are developing literacy skills in both Spanish and English. All were born outside of the United States and speak Spanish at home and English at school and work.

Special thanks to all of the students who donated their messages. Though you must remain anonymous, I am deeply grateful to you for donating your voices to this project. Special thanks as well to Professor Gita Martohardjono and the Second Language Acquisition Lab at the Graduate Center at CUNY for their support in the development of this corpus.

For access, please email mjohnson2@gradcenter.cuny.edu with your name, affiliation, and a 1-2 sentence description of your research.

