About

These norms lists are based on data originally collected as part of a project to create core vocabulary lists for adults learning Welsh. In the absence of a comprehensive corpus of contemporary Welsh at the time, the A1 and A2 core vocabulary lists had been put together using an adaptation of the methodology originally employed in the creation of français fondamental (Gougenheim, 1964[1]).[2]

To extend the research to B1 (Canolradd[3]) level, a set of 900 cue words based on the A1 and A2 lists was then used to elicit word association responses. Around 80% of word association responses are collocations, (partial) synonyms, or hyponyms of the cue (Fitzpatrick, 2007[4]), so a B1 wordlist generated from these responses would extend A1 and A2 vocabulary knowledge appropriately.

Around 85 participants were recruited from all over Wales to provide responses to the 900 A1 and A2 cue words. The only condition for taking part was that participants defined themselves as fluent users of Welsh. Information was also provided to indicate whether each participant was an L1 speaker of Welsh or an L2 new speaker of the language.

The 900 cue words (nouns, verbs, adjectives and adverbs, but not function words) were randomly distributed into 30 sets of 30 cues, and one set was sent to the participants every few days. The participants were asked to give three responses to each cue. This resulted in a database of around 190,000 responses which, in line with its original purpose, informed the Canolradd core vocabulary list.
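The distribution of cues into sets can be sketched in a few lines of Python. This is only an illustration of one way to split 900 cues randomly into 30 sets of 30; the function name `distribute_cues`, the seed and the placeholder cue labels are assumptions made for the example, not details of the original study. The last line works out the ceiling on the number of responses if exactly 85 participants had each given three responses to every cue; the reported figure of around 190,000 sits below that ceiling.

```python
import random


def distribute_cues(cues, n_sets=30, seed=None):
    """Randomly split the cues into n_sets sets of equal size."""
    if len(cues) % n_sets != 0:
        raise ValueError("the cues must divide evenly into n_sets")
    rng = random.Random(seed)
    shuffled = list(cues)          # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    size = len(shuffled) // n_sets
    return [shuffled[i * size:(i + 1) * size] for i in range(n_sets)]


# Placeholder labels standing in for the 900 Welsh cue words (hypothetical).
cues = [f"cue_{i:03d}" for i in range(900)]
sets = distribute_cues(cues, n_sets=30, seed=0)

# Ceiling on responses: participants x cues x responses per cue.
max_responses = 85 * 900 * 3   # 229,500
```

Each cue appears in exactly one set, so sending out all 30 sets covers the full cue list once per participant.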

The full database of responses constitutes the first (to our knowledge) set of Welsh language word association response norms and is published here as a research resource. 

  1. Gougenheim, Georges, René Michéa, Paul Rivenc, & Aurélien Sauvageot. (1964). L’Élaboration du français fondamental (1er degré): étude sur l’établissement d’un vocabulaire et d’une grammaire de base. Didier: Paris.
  2. For a full outline of the methodology, see Morris, S. (2010/2011). Geirfa Graidd i’r Gymraeg: Creating an A1 and A2 core vocabulary for adult learners of Welsh – a Celtic template? Journal of Celtic Language Learning, 15, 111-127.
  3. Canolradd equates to Intermediate in English and is used to denote the CEFR B1 learning level by the National Centre for Learning Welsh and the assessment body, WJEC/CBAC. CEFR A1 is referred to as the Mynediad (Entry) level and A2 as the Sylfaen (Foundation) level.
  4. Fitzpatrick, T. (2007). Word association patterns: Unpacking the assumptions. International Journal of Applied Linguistics, 17(3).

Outputs related to this data include:

  • Morris, S., Fitzpatrick, T. and Mills, T. (2024, September 5-7). An analysis of word association behaviour in Welsh [poster presentation]. BAAL conference 2024, University of Essex. (see Resources > talks and posters)
  • Morris, S., Fitzpatrick, T. and Mills, T. (in preparation). Word association behaviour in a minoritised language.

Please cite the information and data on this page as: Fitzpatrick, T., Mills, T., and Morris, S. (2025). Finding, Sharing and Losing Words: Understanding the Mental Lexicon [Morris, S., Meara, P. and Fitzpatrick, T. Swansea University Cymraeg Word Associations]. Swansea University. https://mental-lexicon.swansea.ac.uk/.

Norms Lists

Below are the norms lists for the Welsh Canolradd data, visualised in Tableau. These dynamic, fully interactive visualisations show all the responses given to all the cues.

Using the controls along the side, you can filter by (1) cue and (2) response number (three responses were gathered for each cue in this dataset). You can select more than one cue to compare the responses given.

Hub Words

This table presents the hub words in the Cymraeg Word Association dataset, that is, words given as a response to many different cues. The hub words are organised by Count of Response (the number of times the word was given as a response in the dataset, i.e., tokens) and Count of Cue (the number of different cues to which the word was given as a response, i.e., types).
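For readers who want to derive the two counts themselves from the downloadable data, the following is a minimal sketch, assuming the data can be reduced to (cue, response) pairs. The function name `hub_words`, the sort order and the toy pairs are assumptions made for illustration; the toy pairs are invented and are not taken from the dataset.

```python
from collections import Counter, defaultdict


def hub_words(pairs):
    """Return (response, count_of_response, count_of_cue) rows.

    count_of_response: how many times the word was given as a response (tokens).
    count_of_cue: how many different cues elicited the word (types).
    Rows are sorted by count_of_cue, then count_of_response, both descending.
    """
    tokens = Counter()
    cues_by_response = defaultdict(set)
    for cue, response in pairs:
        tokens[response] += 1
        cues_by_response[response].add(cue)
    rows = [(r, tokens[r], len(cues_by_response[r])) for r in tokens]
    rows.sort(key=lambda row: (-row[2], -row[1], row[0]))
    return rows


# Invented toy data for illustration only (not from the dataset).
pairs = [
    ("ci", "cath"), ("ci", "anifail"), ("cath", "anifail"),
    ("ceffyl", "anifail"), ("cath", "ci"), ("ci", "cath"),
]
rows = hub_words(pairs)
# rows == [("anifail", 3, 3), ("cath", 2, 1), ("ci", 1, 1)]
```

In the toy data "anifail" is the clearest hub: it is given three times, by three different cues, whereas "cath" is given twice but only ever in response to one cue.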

Data

By downloading the files below, you agree to use these datasets under the Creative Commons Attribution-NonCommercial 4.0 International licence (CC BY-NC 4.0).

File | Link
Cymraeg Word Association Data and Norms | Dataset.xlsx
Cymraeg Word Association Data and Norms User Guide | User Guide.pdf
Cymraeg Word Association Hub Words | Hub Words (Tableau)
Cymraeg Word Association Norms Visualisations | Norms Lists (Tableau)
Cymraeg Word Association Norms Lists Tables | Norms Lists (Tableau)

Acknowledgments

The data used here was originally collected by Steve Morris, Paul Meara and Tess Fitzpatrick as part of a pedagogical wordlists project part-funded by WJEC.