Fine-tuned BERT Model for Large Scale and Cognitive Classification of MOOCs
DOI: https://doi.org/10.19173/irrodl.v23i2.6023
Keywords: cognitive MOOC classification, BERT, LSTM, transfer learning
Abstract
The quality assurance of MOOCs focuses on improving their pedagogical quality. However, few tools support reflection on, or assistance with, the pedagogical aspects of MOOCs. The pedagogical classification of MOOCs is a difficult task, given the variability of their content, structure, and design. Pedagogical researchers have adopted several approaches to examine these variations and identify the pedagogical models of MOOCs, but these approaches are manual and operate on a small scale. Furthermore, MOOCs do not contain any metadata on their pedagogical aspects. Our objective in this research was the automatic, large-scale classification of MOOCs based on their learning objectives and Bloom’s taxonomy. The main challenge was the lack of annotated data. We created a dataset of 2,394 learning objectives. Because of the limited size of this dataset, we adopted transfer learning via bidirectional encoder representations from Transformers (BERT). The contributions of our approach are twofold. First, we automated the pedagogical annotation of MOOCs on a large scale, based on the cognitive levels of Bloom’s taxonomy. Second, we fine-tuned BERT with different classifier architectures: in addition to a simple softmax classifier, we tested two prevalent neural networks, long short-term memory (LSTM) and bidirectional long short-term memory (Bi-LSTM). Our experiments showed that a more complex classifier does not boost classification performance. Instead, a model based on dense layers on top of BERT, combined with dropout and the rectified linear unit (ReLU) activation function, reached the highest accuracy.
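To make the best-performing architecture concrete, the following is a minimal sketch of a classification head of the kind the abstract describes: a dense layer with ReLU, dropout, and a softmax output over the six cognitive levels of the revised Bloom's taxonomy. It is not the authors' implementation. The abstract does not state the number of dense layers, the layer sizes, or the dropout rate, so the hidden size of 64 and the rate of 0.1 are assumptions, and every function name is hypothetical. A random vector stands in for the sentence embedding that BERT would produce for a learning objective, and the weights are untrained, so the predicted label is meaningless; only the data flow is illustrated.

```python
import math
import random

# Six cognitive levels of the revised Bloom's taxonomy (Krathwohl, 2002).
BLOOM_LEVELS = ["Remember", "Understand", "Apply", "Analyze", "Evaluate", "Create"]


def init_dense(n_in, n_out, rng):
    """Return (weights, biases) for a dense layer with small random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer):
    """Affine transform: one output per weight row."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(x):
    return [max(0.0, v) for v in x]


def dropout(x, rate, rng, training):
    """Inverted dropout: active only while training, identity at inference."""
    if not training or rate == 0.0:
        return list(x)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in x]


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(embedding, hidden, output, rng, rate=0.1, training=False):
    """Dense -> ReLU -> dropout -> dense -> softmax over Bloom levels."""
    h = relu(dense(embedding, hidden))
    h = dropout(h, rate, rng, training)
    return softmax(dense(h, output))


rng = random.Random(0)
EMBED_DIM = 768   # hidden size of BERT-base
HIDDEN_DIM = 64   # assumed; not given in the abstract
hidden_layer = init_dense(EMBED_DIM, HIDDEN_DIM, rng)
output_layer = init_dense(HIDDEN_DIM, len(BLOOM_LEVELS), rng)

# Stand-in for the BERT embedding of one learning objective.
fake_embedding = [rng.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]
probs = classify(fake_embedding, hidden_layer, output_layer, rng)
predicted = BLOOM_LEVELS[probs.index(max(probs))]
print(predicted, [round(p, 3) for p in probs])
```

In a real fine-tuning setup, the embedding would come from a pre-trained BERT encoder and the head would be trained jointly with it on the annotated learning objectives.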
License
This work is licensed under a Creative Commons Attribution 4.0 International Licence. The copyright of all content published in IRRODL is retained by the authors.
This copyright agreement and use license ensures, among other things, that an article will be as widely distributed as possible and that the article can be included in any scientific and/or scholarly archive.
You are free to
- Share — copy and redistribute the material in any medium or format
- Adapt — remix, transform, and build upon the material for any purpose, even commercially.
The licensor cannot revoke these freedoms as long as you follow the license terms below:
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.