
(Master's thesis) Evaluating the validity of the final achievement test for second-year non-major students at the Electronic-Electrical Engineering Department, Nam Dinh University of Technology Education


DOCUMENT INFORMATION

Basic information

Title: Evaluating the Validity of the Final Achievement Test for Second-Year Non-Major Students at Electronic-Electrical Engineering Department, Nam Dinh University of Technology Education
Author: Trần Thị Thu Hương
Supervisor: Phạm Lan Anh, M.A
University: Nam Dinh University of Technology Education
Major: Electronic Electrical Engineering
Type: thesis
Year of publication: 2011
City: Hanoi
Number of pages: 70

Structure

  • CHAPTER 1: INTRODUCTION
    • 1.1. Rationale
    • 1.2. Scope of the study
    • 1.3. Aims of the study
    • 1.4. Methods of the study
    • 1.5. Research questions
    • 1.6. Design of the study
  • CHAPTER 2: LITERATURE REVIEW
    • 2.1. Relationship between teaching, learning and assessment
    • 2.2. Purposes of formative and summative assessments
    • 2.3. Achievement tests and their characteristics
      • 2.3.1. Achievement tests
      • 2.3.2. Characteristics of a good EGP test
      • 2.3.3. Characteristics of a good ESP test
    • 2.4. Face validity
      • 2.4.1. Definition
      • 2.4.2. Relationship between reliability and validity
      • 2.4.3. Reasons for choosing face validity
    • 2.5. Some measures to increase face validity
  • CHAPTER 3: THE STUDY
    • 3.1. English learning and teaching at Nam Dinh University of Technology Education
      • 3.1.1. Students’ backgrounds
      • 3.1.2. The English teaching staff
      • 3.1.3. Objectives of the English course
      • 3.1.4. Checklist of the course book
      • 3.1.5. Objectives of the final test
      • 3.1.6. Difficulty level and discrimination of the final test
    • 3.2. English testing at Nam Dinh University of Technology Education
      • 3.2.1. Testing situation
      • 3.2.2. The current final achievement test
    • 3.3. Research methods
      • 3.3.1. Survey questionnaire
      • 3.3.2. Interview and informal discussion
    • 3.4. Data analysis of survey questionnaires and interviews
      • 3.4.1. Data analysis of the administration of the test
        • 3.4.1.1. Data analysis of the format of the test
        • 3.4.1.2. Data analysis of the logistics of the test
      • 3.4.2. Data analysis of face validity of the test
        • 3.4.2.1. Data analysis of general opinion about the test
        • 3.4.2.2. Data analysis of reading comprehension task
        • 3.4.2.3. Data analysis of grammar knowledge task
        • 3.4.2.4. Data analysis of translation task
    • 3.5. Discussion and findings
      • 3.5.1. Similarities in teachers’ and students’ perception
        • 3.5.1.1. Test administration
        • 3.5.1.2. Face validity
      • 3.5.2. Differences in teachers’ and students’ perception
        • 3.5.2.1. Test administration
        • 3.5.2.2. Face validity
    • 3.6. Suggestions to improve the final achievement test
  • CHAPTER 4: CONCLUSION

Content

INTRODUCTION

Rationale

Learning English in Vietnam is increasingly popular, driven by the country's open-door policy that fosters international cooperation across various sectors, including diplomacy, economy, culture, and technology. As a developing nation, Vietnam recognizes English as a crucial tool for accessing the latest scientific and technological advancements, which are essential for its growth and modernization.

In Vietnam, English is recognized as a crucial global language, making it a mandatory subject in schools, colleges, and universities. Evaluating language proficiency through tests is essential in the teaching and learning process, as it provides valuable feedback on the achievement of educational objectives. At Nam Dinh University of Technology Education (NUTE), particularly within the Faculty of Foreign Languages, assessing students' progress is vital for informed educational decision-making. Without achievement tests, it becomes challenging to gauge language proficiency and effectively guide students' learning paths.

At NUTE, a technological university, students' English proficiency is notably low, highlighting the urgent need for effective evaluation methods to enhance their learning outcomes. Despite previous research addressing this issue, the evaluation of tests after each semester remains largely overlooked, resulting in a decline in student performance. As a teacher, I observe that our focus is limited to the procedural aspects of test creation, administration, and marking, while we neglect the importance of feedback from both colleagues and students. This lack of comprehensive test analysis and evaluation contributes to the stagnation of student achievement.

Test validity is essential for enhancing the quality of assessments, because only a valid test actually measures students' achievement of the learning objectives. Final examinations must exhibit validity, which is a key characteristic of a high-quality test.

This study evaluates the validity of the final achievement test for second-year non-major students at the Electronic-Electrical Engineering Department of Nam Dinh University of Technology Education. The research aims to benefit the author, educators, test-takers, and those interested in language testing, particularly in the context of achievement test validity. Unlike previous studies, this research focuses solely on the face validity of the test due to time constraints in collecting student scores. The author aspires for the findings to enhance the current testing process and to develop a reliable item bank, while also motivating both teachers and learners in their educational endeavors.

Scope of the study

This thesis focuses on evaluating the face validity of the current achievement test for second-year non-English major students at the Electronic-Electrical Engineering Department of NUTE. Due to constraints in time, resources, and data availability, the study does not encompass all final achievement tests or the creation of a sample test for these students. Instead, it presents a test specification specifically for Test 12 in Semester 3.

Aims of the study

Following the scope of the research above, the aims of this research are:

1. To identify the English teachers' and students’ evaluation of the existing final achievement test (Test 12) at NUTE in terms of face validity.

2. To provide suggestions for test designers.

Methods of the study

In order to achieve the above aims, the study has been carried out as follows:

The author conducts library research to explore theories related to assessment and testing, focusing on the characteristics of effective achievement tests and the concept of test validity, particularly face validity. Through critical reading, she compiles, analyzes, and synthesizes various reference materials to establish a theoretical framework for evaluating the face validity of the current test administered to third-semester students.

Then, data are collected from both teachers and students at NUTE through survey questionnaires and interviews and are analyzed qualitatively.

Research questions

This study is implemented to find answers to the following research questions:

1. What are the teachers’ and test takers’ (students’) perceptions of the final third-semester English achievement test at NUTE in terms of its face validity?

2. What are the suggestions to improve the face validity of the final third-semester English achievement test at NUTE?

Design of the study

The thesis is divided into four major chapters:

Chapter 1: Introduction presents basic information such as the rationale, the scope, the aims, the methods, the research questions and the design of the study.

Chapter 2: Literature review covers the theoretical background to evaluating a test, which includes the relationship between teaching, learning and assessment, the purposes of formative and summative assessments, achievement tests, the characteristics of good EGP and ESP tests, face validity and some measures to increase face validity.

Chapter 3: The study is the main part of the thesis, showing the context of the study and the detailed results obtained from the collected tests, together with the findings in response to the research questions. The author then gives some solutions to improve the final achievement test.

Chapter 4: Conclusion offers conclusions and proposes some suggestions for further research on the topic.

LITERATURE REVIEW

Relationship between teaching, learning and assessment

The relationship between teaching, learning, and assessment is significantly influenced by curriculum and content standards. Curriculum outlines what should occur in the classroom, detailing the topics, themes, and questions aligned with content standards, which serve as the framework for curriculum development. While curriculum can differ among programs and instructors, it focuses on delivering essential concepts identified by content standards for learners to grasp and apply. It guides instructors on effective teaching techniques, recommended activities, and necessary materials to support student achievement of content standards. Additionally, assessment plays a crucial role in evaluating whether these standards have been met, requiring that selected assessments align with state standards to ensure validity and reliability. Ultimately, the interplay between assessment, curriculum, and content standards underscores their interconnectedness in the educational process.

The Longman Dictionary of Language Teaching and Applied Linguistics (3rd edition, Richard et al., 2005) defines assessment as a systematic method for gathering information and making evaluations regarding a student's abilities or the effectiveness of a teaching course, utilizing diverse sources of evidence.

Assessment is essential for effective teaching and learning, significantly influencing curriculum design and implementation. According to Richards & Nunan (1990), assessment encompasses the processes used to evaluate a learner's skills and knowledge. It serves as a critical tool for making informed judgments about student progress and understanding.

An assessment should:

- Provide for pre-, while- and post-testing;

- Be criterion- or standards-referenced;

- Serve as an accountability measure;

- Be adaptable to a variety of instructional environments;

- Accommodate learners with special needs.

Assessment measures such as evaluation, examination, questionnaires, interviews, discussions, and observations are widely recognized in the educational field. Testing serves as a crucial tool for implementing these assessments within the teaching process. According to Brown (2001), a comprehensive curriculum system requires analysis, objectives, testing, materials, teaching, and evaluation. Similarly, Richards (1990) highlights that language curriculum development involves analysis, resources, objectives, syllabus design, methodologies, testing, and evaluation. Both experts underscore the vital role of testing in education.

A test is defined as a measurement instrument designed to elicit specific behaviors (Bachman, 1990), while a language test measures the extent of students' learning in a foreign language course (Oller, 1979). This research posits that a language test comprises various instruments, including questions and problems, aimed at assessing an individual student's language abilities and knowledge in the foreign language they have studied.

Language tests serve as essential tools for educators to gain reliable insights into their students' language abilities, enabling them to monitor progress and identify individual strengths and weaknesses. These assessments provide critical feedback on the effectiveness of English courses and offer valuable feedforward for students at the start of their learning journey. The interplay between feedback and feedforward is illustrated through the analogy of catching a ball, where interpreting its movement allows for necessary adjustments. Feedforward equips teachers to anticipate potential challenges students may face, fostering confidence and promoting effective study habits, while feedback enables educators to refine their teaching methods for optimal student outcomes. Additionally, feedback aids in evaluating the effectiveness of the curriculum, methods, and materials used in the classroom, ensuring continuous improvement in language education.

In addition, testing may have many impacts on teaching and learning. Hughes (1989: 1) calls the effect of testing on teaching and learning “backwash” and stresses its role in the teaching-learning process. Backwash can be harmful if the test content does not match the objectives of the course, which leads to the problem of teaching in one way and testing in another. However, backwash need not always be harmful; it can be positive, too. A test based directly on the needs of a specific group of learners will help them to perform in real life.

In view of the important role of language tests in the education system, Shohamy (2001: 2) emphasizes that “language tests need to be of high quality and follow careful rules of science of psychometrics.” In other words, a good language test must give accurate information about the test takers with respect to the aspect of knowledge that it measures. Furthermore, a high-quality language test must be reliable and valid so as to give precise information on the test takers’ language ability. Language tests may differ according to the purposes for which they are designed and how they are designed (see Figure 1).

Figure 1: Three Considerations for Test Choice

The intended impacts of a test encompass the effects envisioned by its designer, as illustrated in Figure 2. According to Bachman and Palmer (1996), various entities that may be influenced by a test include individuals such as students and teachers, language classes and programs, as well as the broader society.

Figure 2: The Scope of Impact of Language Tests

The significance of testing, particularly in English for Specific Purposes (ESP), is undeniable, as it has become a crucial aspect of university education since the 1960s. The rise of ESP has led to an increase in publications, conferences, and journals focusing on this specialized area of English language teaching, transitioning from traditional general English courses to those tailored for specific fields, such as English for Business Purposes. Consequently, the demand for testing specific learner groups has prompted a gradual yet noticeable growth in the ESP testing movement. Both ESP testing and English as a Foreign Language (EFL) testing are essential components of effective teaching and learning.

[Figure 2 labels, in order of widening scope: impact on students, teachers, classes, and programs; on students, teachers, programs, and institutions; on students, teachers, classes, programs, institutions, and society]

The interplay between teaching, learning, and assessment is deeply interconnected, as testing is not a standalone entity, but rather an integral component of the learning process. A well-designed test can serve as a valuable teaching and learning tool, facilitating the discovery of new ideas and ways of organizing knowledge. Effective teaching involves guiding individuals in their learning journey, whether through systematic instruction or a discovery-based approach, with testing playing a crucial role in reinforcing learning outcomes.

Purposes of formative and summative assessments

Assessment is a crucial process for documenting and measuring knowledge, skills, attitudes, and beliefs. In educational settings, various types of assessments are utilized, including continuous, formative, summative, peer, and self-assessment. This research will specifically examine the relationship between formative and summative assessments. According to Atkin, Black, and Coffey (2001), teachers act as coaches and facilitators when using formative assessment to enhance student learning, while they take on the role of judges when making summative evaluations of student achievement.

Formative assessment serves as a crucial feedback mechanism for both students and instructors, aimed at enhancing the teaching and learning experience. For students, it offers insights into their performance, progress in acquiring necessary skills, and challenges they may face in a course, without impacting their final grades. This self-awareness allows students to identify their strengths and weaknesses and work towards improving their overall performance. From an instructor's viewpoint, formative assessment acts as a diagnostic tool, evaluating the effectiveness of course design and teaching methods. It reveals areas needing improvement in curriculum and highlights successful teaching strategies. Common types of formative assessments include diagnostic tests, which identify specific language issues students encounter, and placement tests, which determine the appropriate class level for students at the beginning of a course. These assessments enable instructors to tailor their teaching strategies to address anticipated challenges effectively.

Summative assessment aims to evaluate student achievements by providing a meaningful overview of what they know, understand, and can do (Brown & Knight, 1999). Typically conducted at the end of a topic or course, it assesses the knowledge and skills acquired by students throughout the learning process. An achievement test serves as a common example of this type of assessment.

The relationship between formative and summative assessments is integral, as both serve distinct purposes in evaluating student abilities and improving teaching quality. Teachers must identify students' challenges to accurately assess their levels, adjust instructional methods accordingly, and ultimately test their understanding of the material.

Achievement tests and their characteristics

This research focuses solely on summative assessment due to its relevance in evaluating the final English for Specific Purposes (ESP) test. By utilizing a summative approach, the study effectively assesses student achievement through a detailed achievement test.

Achievement tests are essential in school programs as they effectively assess students' language knowledge and skills acquired throughout their courses, and they are commonly utilized across various educational levels.

According to Sparatt (1985:145), an achievement test serves as a valuable tool for both teachers and students to evaluate progress, with its unique aim and content setting it apart from other test types.

David (1999: 2) shares a similar idea: “achievement refers to the mastery of what has been learnt, what has been taught or what is in the syllabus, textbook, materials, etc. An achievement test therefore is an instrument designed to measure what a person has learnt within or up to a given time”.

Brown (1994b:259) defines an achievement test as one that is directly linked to specific classroom lessons, units, or the overall curriculum, focusing on the materials covered within a defined timeframe. In contrast to progress tests, achievement tests aim to encompass a broader range of the syllabus to assess students' understanding comprehensively.

If we confine our test to only part of the syllabus, the contents of the test will not reflect all that has been learned.

There are two kinds of achievement tests: final achievement tests and progress achievement tests.

Progress achievement tests, often created by teachers and administered after chapters or terms, are designed to assess student learning and measure the effectiveness of teaching methods. According to Hughes (1990:12), these tests aim to evaluate the progress students are making, enabling teachers to identify areas where learners may struggle and adjust their instruction accordingly. For students, these tests serve as valuable tools that boost confidence and provide opportunities to practice the target language effectively. Furthermore, they prepare students for final achievement tests by familiarizing them with the testing format and strategies, ultimately enhancing their overall performance.

Final achievement tests are conducted at the conclusion of a course to evaluate learners' understanding of the material. These assessments can be created by educational ministries, official examining boards, or teaching institutions. They serve to measure students' performance in relation to the course objectives and content. According to Hughes (1990:11), there are two primary approaches to designing final achievement tests: the syllabus-content approach and the syllabus-objective approach.

The syllabus-content approach relies on a comprehensive course syllabus and relevant materials, ensuring that tests reflect what students have actually learned, thus promoting fairness. However, if the syllabus is poorly structured or if the chosen resources are inadequate, test results may be misleading, potentially failing to accurately represent students' achievement of course objectives.

The syllabus-objective approach aligns test content directly with course objectives, offering several advantages. It compels course designers to clearly define objectives, allowing students to demonstrate their progress towards achieving them. This alignment pressures syllabus creators to select appropriate books and materials that support these objectives, thereby reducing the likelihood of poor teaching practices. The author argues that tests based on course objectives yield more accurate insights into individual and group performance, ultimately fostering a positive impact on teaching methodologies.

2.3.2 Characteristics of a good EGP test

To create an effective test, educators must consider multiple factors, including the test's purpose, syllabus content, students' backgrounds, and administrative goals. Additionally, the characteristics of the test are crucial in developing quality English for General Purposes (EGP) assessments. The primary attribute of a successful test is its usefulness, which encompasses four key components: reliability, validity, practicality, and washback.

Reliability has been defined in different ways by different authors. Berkowitz, Wolkowitz, Fitch, and Kopriva (2000) define reliability in testing as the consistency of test scores over repeated applications, indicating that results should be dependable for individual test takers. Bachman (1990) emphasizes that reliability is a crucial quality of test scores, highlighting the importance of consistent results. For a test to be deemed reliable, students taking the same assessment on different occasions should obtain similar scores, assuming no significant changes have occurred in the interim. If there are substantial discrepancies in their results, the test's reliability is called into question.
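The idea of consistent scores across occasions is often summarized as a test-retest correlation. The following is a minimal sketch under stated assumptions: the two score lists are invented for illustration (they are not data from this study), the function names `pearson_r` and `retest_reliability` are hypothetical, and Pearson's correlation is only one of several possible reliability estimates.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 scores)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def retest_reliability(first_sitting, second_sitting):
    """Estimate reliability as the correlation between two sittings."""
    return pearson_r(first_sitting, second_sitting)

# Hypothetical scores (0-10 scale) for the same six students on two occasions
first_sitting = [5.0, 6.5, 7.0, 4.0, 8.5, 6.0]
second_sitting = [5.5, 6.0, 7.5, 4.5, 8.0, 6.5]

r = retest_reliability(first_sitting, second_sitting)
print(f"test-retest correlation: {r:.2f}")
```

A value close to 1 means students keep roughly the same rank order on both occasions; a low value corresponds to the "substantial discrepancies" that call reliability into question.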

Validity refers to the degree to which a test actually measures what it was designed to measure. Validity is often discussed under the headings: face, content, construct and criterion-related.

Content validity is a non-statistical type of validity that assesses whether a test adequately represents the behavior domain it aims to measure (Anastasi & Urbina, 1997). It is established through the careful selection of test items, ensuring they align with a well-defined test specification derived from a comprehensive analysis of the subject area. According to Foxcraft et al. (2004), involving a panel of experts in reviewing the test specifications and item selection can enhance content validity. These experts evaluate the items to ensure they effectively cover a representative sample of the intended behavior domain.

Construct validity refers to a test's ability to accurately measure a theoretical, non-observable trait or construct. Establishing construct validity requires time and a collection of evidence. Two primary methods for assessing a test's construct validity are convergent and divergent validation, and factor analysis.

A test exhibits convergent validity when it shows a strong correlation with another assessment measuring the same construct, while divergent validity is indicated by a weak correlation with a test that evaluates a different construct.

Factor analysis is a complex statistical procedure which is conducted for a variety of purposes, one of which is to assess the construct validity of a test or a number of tests.

Face validity, as defined by Hughes (1989), refers to the extent to which a test appears to measure what it is intended to measure. However, as Anastasi (1982) highlights, face validity does not equate to technical validity; instead, it focuses on the superficial appearance of the test rather than its actual measurement capabilities.

Face validity and content validity are closely linked concepts in test assessment. While content validity relies on a theoretical framework to evaluate whether a test adequately covers all relevant domains of a specific criterion, face validity focuses on the superficial appearance of the test, determining if it seems to effectively measure what it claims to assess.

Face validity

Face validity refers to the extent to which a test appears to measure what it claims to measure, as defined by Hughes (1989). It is primarily concerned with the perceptions of non-experts, including candidates and their families, regarding the test's relevance and familiarity. For instance, the Grade 9 exam traditionally included familiar content from English 9, allowing students to prepare effectively. However, when the exam unexpectedly incorporated unfamiliar materials, it lost face validity, which is crucial for acceptance among candidates, educators, and employers. McNamara (2000) emphasizes that a language test is considered face valid only if it meets the expectations of those involved in its design and use. Ingram (1977) also highlights that face validity encompasses surface credibility and public acceptability, underscoring its importance in the educational assessment landscape.

Ensuring the face validity of a language test is important because this validation procedure is one of the major aspects of validity. The procedure of face validation “involves an intuitive judgment about the test’s content by people whose judgment is not necessarily expert”, as mentioned by Anderson et al. (1995: 289). Following Anderson et al. (1995), face validation refers to the assessment of how individuals perceive the appearance of a language test, often with limited focus on the actual content of the test items. Analyzing the face validity of an English test involves collecting opinions on whether the test appears to be a valid measure of English proficiency.

2.4.2 Relationship between reliability and validity

Reliability and validity are interconnected concepts that are essential for assessing the quality of a test. While often viewed as distinct, both characteristics play a crucial role in determining a test's effectiveness. Understanding the complex relationship between reliability and validity is key to developing a robust assessment tool.

For a test to be considered valid, it must provide consistently accurate measurements, which means it must be reliable; however, a reliable test does not guarantee validity. For instance, a writing test that requires candidates to translate a 500-word text may yield reliable results, but it does not validly assess writing skills. Thus, while reliability is essential for validity, it alone is not sufficient. To illustrate this relationship, envision a target where the center represents the concept being measured. Each attempt to measure an individual’s performance is akin to taking a shot at the target; hitting the center indicates a perfect measurement, while missing it signifies varying degrees of inaccuracy.

Figure 3: Relationship between reliability and validity

The diagram illustrates three scenarios regarding measurement accuracy. In the first scenario, consistent hits are recorded, but they miss the target's center, indicating a reliable yet invalid measure. The second scenario features random hits across the target, demonstrating a lack of both reliability and validity. In the final scenario, consistent hits at the target's center signify that the measure is both reliable and valid. In summary, while reliability is essential, it alone does not guarantee validity.
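The three scenarios of the target analogy can be expressed numerically. The sketch below is only an illustration under simplifying assumptions: the hit coordinates are invented, the function name `describe_shots` is hypothetical, "validity" is reduced to how far the average hit lies from the center, and "reliability" to how tightly the hits cluster.

```python
from math import hypot
from statistics import mean

def describe_shots(shots, centre=(0.0, 0.0)):
    """Return (offset, spread) for a list of (x, y) hits.

    offset: distance of the average hit from the centre (small = valid)
    spread: average distance of hits from their own average (small = reliable)
    """
    avg_x = mean(x for x, _ in shots)
    avg_y = mean(y for _, y in shots)
    offset = hypot(avg_x - centre[0], avg_y - centre[1])
    spread = mean(hypot(x - avg_x, y - avg_y) for x, y in shots)
    return offset, spread

# Hypothetical hits for the three scenarios described in the text
scenarios = {
    "reliable but not valid": [(3.0, 3.1), (3.1, 2.9), (2.9, 3.0), (3.0, 3.0)],
    "neither reliable nor valid": [(-4.0, 3.0), (5.0, 4.0), (3.0, -5.0), (4.0, 2.0)],
    "reliable and valid": [(0.1, 0.0), (-0.1, 0.1), (0.0, -0.1), (0.0, 0.0)],
}

for name, shots in scenarios.items():
    offset, spread = describe_shots(shots)
    print(f"{name}: offset={offset:.2f}, spread={spread:.2f}")
```

In this simplification, a small spread with a large offset corresponds to the first scenario, a large spread to the second, and a small spread with a small offset to the third.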

2.4.3 Reasons for choosing face validity

Validity is a crucial quality for effective testing, as it ensures that a test accurately measures what it intends to measure. According to Hughes (1982: 22), a test with high content validity is more likely to provide accurate results. Consequently, prioritizing test validity from the beginning of the test construction process is essential for achieving reliable outcomes.

The validity of a language test has four facets, namely face validity, content validity, construct validity and criterion-referenced validity. However, the author focuses on face validity for the following reasons.

First, this research excludes content validity, construct validity, and criterion-referenced validity due to time and resource limitations. As noted by Anastasi (1982) and cited by Weir (1990), face validity, while not technically a form of validity, is important as it assesses whether a test appears valid to its users. Heaton (1988) emphasizes that good face validity can enhance student motivation and provide a practical balance to an overemphasis on statistical analysis. Content validity, defined by Anastasi & Urbina (1997), involves a systematic examination of test content to ensure it represents the behavior domain being measured, requiring a representative sample for effective analysis. Construct validation, according to Bachman and Cohen (1998), involves justifying the inferences drawn from test scores, while Bachman and Palmer (1996) highlight its relevance to the meaningfulness of interpretations related to test results. Lastly, criterion-referenced validity, as described by Bachman (1990), examines the relationship between test scores and a criterion indicative of the tested ability. However, due to the absence of actual test scores or reliable samples, this research does not investigate these three validation processes.

Secondly, face validity is chosen because of its importance in society. As Hughes (1989) notes, face validity, while not a scientific concept, plays a crucial role in the acceptance of tests by candidates, educators, and employers. Huong (2000: 69) emphasizes that the appearance of a test significantly influences its usage. She suggests that understanding test takers' perceptions can provide valuable insights for test development, particularly regarding the relevance of tests to real-life tasks. Ultimately, the face validity of a test is essential for societal evaluation, as it reflects the expertise of test designers in ensuring the test's relevance and acceptance.

This study aims to develop a language test that aligns with the abilities of students at NUTE. The reasons outlined serve as a significant motivation for the author to explore the face validity of the achievement language test 12 at NUTE.

Some measures to increase face validity

Face validity is crucial for the effectiveness of a test, as it reflects the perceptions of non-professional testers, such as parents and students, regarding the test's appropriateness. If these individuals feel that the test does not adequately assess candidates' knowledge, they may express their dissatisfaction, leading to a lack of motivation among candidates. A test that lacks face validity may ultimately fail to serve its intended purpose and may require redesign. To enhance face validity, it is essential to implement strategies focused on both the administration and content of the test:

- Test format is familiar and clear to the students;

- The quantity of questions in the test is suitable for the time allowance;

- Test conditions (the space and atmosphere of the testing room in particular) are "biased for best", bringing out students' best performance;

- Fairness is ensured by collecting materials before the test and by stricter supervision during it;

- Test content reflects the syllabus objectives and fully covers what the students have been taught in the course;

- Task types are familiar to students;

- The difficulty level of each task is appropriate to students' language ability.

The author assesses the final achievement test 12 at NUTE by examining face validity through the perspectives of both teachers and test takers. This evaluation aims to identify key factors that can lead to constructive suggestions for enhancing the test's effectiveness.

This chapter explores the interconnectedness of teaching, learning, and assessment, highlighting various assessment types, including achievement tests. It emphasizes the essential characteristics of effective English for General Purposes (EGP) and English for Specific Purposes (ESP) achievement tests. Additionally, the chapter discusses the concept of face validity and outlines strategies to enhance it, underscoring its significance in validating assessments within the teaching and learning process.

CHAPTER 3: THE STUDY

3.1 English learning and teaching at Nam Dinh University of Technology Education

3.1.1 Students' backgrounds

Students at Nam Dinh University of Technology Education (NUTE) exhibit varying levels of English proficiency due to their diverse backgrounds. Those from small towns and rural areas often have fewer opportunities to learn English compared to their peers from larger cities, where foreign language education is prioritized. Additionally, some students enter university without prior English instruction, having studied other languages like Russian, French, or Chinese in high school, while others have only recently begun learning English. The university's admission process, which does not require entrance exams but rather evaluates submitted applications, contributes to a general lack of motivation and appreciation for learning English and other subjects among these students.

3.1.2 The English teaching staff

Nam Dinh University of Technology Education, established five years ago, specializes in training engineers and teachers. Its English program moved from the General Department to the Faculty of Foreign Languages in 2009. The department focuses solely on English and is staffed by 11 teachers (three experienced and eight younger instructors) who teach both English for General Purposes (EGP) and English for Specific Purposes (ESP) in fields such as economics, computing, electronics, welding, and automobiles. All teachers are locally trained, with three holding Master's degrees in Managing Education and six currently pursuing an M.A. in English. They often use Vietnamese in the classroom to communicate concepts effectively, given the students' limited English proficiency, which enhances student engagement and participation in lessons.

3.1.3 Objectives of the English course

The university excels in various technological fields, with electronics being a prominent area of focus. Out of eleven faculty members, seven specialize in teaching English within the electronic domain. The English for Specific Purposes (ESP) syllabus for students in the Electronic-Electrical Engineering Department, developed by the foreign language department, has been effectively implemented for the past four years. Prior to engaging in the ESP curriculum, students complete two English for General Purposes (EGP) terms, each comprising 60 periods and utilizing the Headway series at the Elementary and Pre-Intermediate levels. This initial training emphasizes reading skills, vocabulary, and grammar, while neglecting other essential skills such as listening, writing, and speaking. Upon completing the EGP terms, students transition to an Electronic-Electrical English textbook, crafted by the Foreign Language Department at NUTE, which includes 30 periods dedicated to reading skills such as skimming, scanning, and detailed comprehension, as well as translation abilities and specialized ESP vocabulary.

Table 1: ESP syllabus content allocation

| No | Units | 45-minute periods | Skills |
|---|---|---|---|
| 1 | | 03 | Reading skill, vocabulary and translation |
| 2 | Circuit Elements | 03 | Reading skill, vocabulary and translation |
| 3 | The DC Motor | 03 | Reading skill, vocabulary and translation |
| 4 | The Cathode Ray Tube | 03 | Reading skill, vocabulary and translation |
| 5 | Review | 03 | Reading skill, vocabulary and translation |
| 6 | Electronics in the home | 03 | Reading skill, vocabulary and translation |
| 7 | Semiconductor Diodes | 03 | Reading skill, vocabulary and translation |
| 8 | Radio | 03 | Reading skill, vocabulary and translation |
| 9 | Audio Recording System | 03 | Reading skill, vocabulary and translation |
| 10 | Review | 03 | Reading skill, vocabulary and translation |

This course book emphasizes reading skills, vocabulary, and translation abilities, with minimal focus on language structure. The final assessment is designed to evaluate reading comprehension, grammar, and vocabulary, excluding listening and speaking skills. The selected reading texts are both meaningful and beneficial, as they reinforce language concepts while providing relevant background knowledge and vocabulary related to Electronic and Electrical specifications. Additionally, these texts are primarily sourced and adapted from Vietnamese-authored electronic and electrical textbooks. Over the span of three English terms, students will achieve significant learning objectives.

During the first and second terms, teachers focus on reinforcing students' grammar knowledge and enhancing their reading skills to prepare them for the upcoming semester. In the third term, the emphasis shifts to consolidating reading abilities, introducing English for Specific Purposes (ESP) vocabulary, and teaching translation techniques that will be beneficial for students in their future careers.

The primary objective is to provide students with a solid foundation in general English grammar and vocabulary, enhance their reading comprehension and translation skills, and impart essential knowledge of Electronic and Electrical English, all of which are crucial for their future careers.

3.1.4 Checklist of the course book

The checklist of language skills and sub-skills taught in the course will help the author easily compare the content areas with the current final test 12.

+ linking words: and, or, but, however, therefore, because, although, etc;

+ pronoun references and possessive adjectives;

- Translation: mainly sentences related to reading text in each unit;

- Reading: the topics of the reading texts are listed in Table 1, with a focus on the main reading skills;

- Vocabulary: each unit has a list of new words related to the topic of its reading text

3.1.5 Objectives of the final test

The teachers give final achievement tests at the end of each semester to achieve the following purposes:

- To assess students' ability in reading comprehension, which is limited to understanding main ideas, extracting information, guessing words in context, making simple inferences and understanding opinions;

- To assess students' ability to use general grammar knowledge by correctly using the following grammatical items:

+ recognition and use of tenses;

+ recognition and use of voices;

+ recognition and use of modal verbs;

+ recognition and use of relative clauses

- To assess students' ability to translate technical materials into English as follows:

+ express grammatically correct ideas within the basic structures presented in the course book;

+ use technical vocabulary from the students' field;

+ link ideas using linking words “and, but, or, because, so, although, etc”;

+ use reference pronouns and possessive adjectives

- To assess students' ability to translate technical materials into Vietnamese as follows:

+ use technical vocabulary from the students' field;

+ draw on students' background and specialized knowledge;

- To check how far the students have acquired the target language skills and knowledge, and how far the objectives of the course have been achieved in the set timeframe;

- To help students see what they have achieved during their learning process;

- To help teachers evaluate their teaching methods, syllabus and materials so that they can adjust and adapt them to the students' needs and capacities.

3.1.6 Difficulty level and discrimination of the final test

Bloom's taxonomy (1956) identifies the cognitive domain as encompassing knowledge and the enhancement of intellectual skills, which involve recalling specific facts, procedural patterns, and concepts essential for intellectual development. It comprises six major categories arranged from simplest to most complex, representing varying degrees of difficulty. Mastery of the initial categories is typically required before progressing to more advanced levels.

- Knowledge: Recall data or information.

- Comprehension: Understand the meaning, translation, interpolation, and interpretation of instructions and problems. State a problem in one's own words.

- Application: Use a concept in a new situation or unprompted use of an abstraction. Apply what was learned in the classroom to novel situations in the workplace.

- Analysis: Separate material or concepts into component parts so that their organizational structure may be understood. Distinguish between facts and inferences.

- Synthesis: Build a structure or pattern from diverse elements. Put parts together to form a whole, with emphasis on creating a new meaning or structure.

- Evaluation: Make judgments about the value of ideas or materials.

The test is designed to assess students' knowledge and comprehension skills based on the course book's objectives. It focuses solely on evaluating their understanding of the material, rather than their ability to apply English in practical situations. Consequently, the test's difficulty and discrimination are restricted to the students' knowledge level, primarily determining whether they have thoroughly studied the textbook.

3.2 English testing at Nam Dinh University of Technology Education

At NUTE, the ESP tests for students are written by the Foreign Language Department teachers, with each instructor responsible for creating two tests per class at the end of the semester, resulting in a total of fourteen tests for the 3rd semester. These tests are designed in accordance with the syllabus and follow a standardized format. After completion, the tests are reviewed by the head of the ESP subject before being forwarded to the educational testing and quality assurance department, which randomly selects one test for printing. This study specifically examines test 12, administered to K3 students in the Electronic - Electrical Engineering Department during their final exam of semester 3.

3.2.2 The current final achievement test

The ESP Test 12 is a syllabus-based final achievement assessment conducted in the third semester, comprising four distinct parts. The first section involves reading an ESP text and determining the accuracy of information by identifying true or false statements. The second part focuses on grammar, assessing students' grammatical knowledge while providing an opportunity to enhance their scores. The final two sections evaluate students' translation skills, testing their overall vocabulary comprehension and language usage.

Table 2: Specification of test 12

| Part | Language items/skills | Input | Task types | Marking |
|---|---|---|---|---|
| I | Reading comprehension | Narrative text relating to electricity (related to topic in unit 1) | x 5, deciding true or false | |
| II | Grammar | Sentences (passive voice) | x 5, putting the right verbs in passive voice | |
| III | | Sentences in English (related to topics in unit …) | | |
| IV | | Sentences in Vietnamese (related to topics in units 2, 3, 4, 9) | x 4 | 2.0 |

The current test 12 demonstrates a strong alignment with the curriculum, featuring a reading comprehension text from unit 1, a grammar task focused on passive voice, and a translation task covering sentences from units 2 to 9. The familiar task types, including multiple choice, short answer, and full answer, cater to the students' prior learning experiences. While the test effectively assesses students' knowledge and comprehension levels, it is important to note that the grammar task only addresses a specific aspect of grammar, and the reading task does not fully encompass essential reading skills.

3.3 Research methods

The study employed a qualitative method to analyze data collected from a survey questionnaire administered to 239 second-year students and 7 teachers in the foreign language department at NUTE. This questionnaire aimed to assess the face validity of the test and gather suggestions for improvement from both students and teachers.

The author conducts interviews with teachers from the foreign language department and students from classes DDT-3A and DDT-3B to gather additional insights. The interview questions are derived from the questionnaires, with a specific emphasis on understanding the reasons behind respondents' choices. The findings from these interviews are documented for comparison with the questionnaire results, allowing for the identification and adjustment of any discrepancies through alternative methods.

The author engages in discussions with foreign language department teachers to explore the successes and failures of language testing, specifically focusing on Test 12. The insights gained from these conversations serve as valuable supporting data for the previously discussed methods.

3.4 Data analysis of survey questionnaires and interviews

The study involved administering over 200 questionnaires to second-year K3 students, resulting in 142 completed responses, along with 7 questionnaires distributed to teachers in the foreign language department at NUTE. The objective was to analyze the similarities and differences in perceptions of the test between students and teachers. Data were gathered through surveys comprising a total of 31 questions for students and 34 questions for teachers, organized into two main sections. Part A included 10 questions focused on feedback regarding the test administration, while Part B featured four subsections with 21-24 questions addressing the face validity of the test, covering general opinions, reading comprehension tasks, grammar tasks, and translation tasks.

3.4.1 Data analysis of the administration of the test

3.4.1.1 Data analysis of the format of the test

The analysis of survey results regarding the test format reveals a consensus between teachers and students. According to Table 3 (see Appendix 4), 100% of teachers and 69.7% of students agree that a Times New Roman font in size 12 is appropriate. However, 22.5% of students consider it somewhat suitable, while 7.8% find it unsuitable. The primary concern among students who deem the font unsuitable is the desire for a larger size, with many suggesting font size 14, although they acknowledge that they can still read the test in size 12.

In the survey, all teachers and 69% of students state that the test copies are clear, while 26.7% of students consider them somewhat clear and 4.3% find them unclear. When asked about the quality of the copies received during the test, only a small number of students reported receiving poor copies, indicating that the majority of the test materials are of good quality.

A significant disparity exists between students and teachers regarding the perceived quantity of questions in the test: 60.6% of students believe there are too many questions, compared with only 28.6% of teachers. Students cite the length and difficulty of specific test sentences as contributing factors. A further 30.3% of students feel the number of questions is adequate, and 9.1% deem it inadequate. Conversely, 71.4% of teachers consider the question quantity average. Teachers assume students are accustomed to the question volume due to prior mid-term tests, but students argue that the testing environment differs significantly: in class they can collaborate with peers, whereas in a formal testing room they must rely solely on their own efforts, which makes completing the test harder. This discrepancy highlights issues of perceived fairness in mid-term assessments and student accountability.

Regarding suggestions for improving the test format, 86 students (60.6%) and 2 teachers (28.6%) offer suggestions. They propose that sentences 3 and 4 in part IV of the test be simplified, as translating into English can be challenging even though the exercises appear in the course book. Notably, 71.4% of teachers and 39.4% of students believe the current test format is satisfactory and do not suggest any changes.
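The percentages reported above follow directly from the respondent counts. As a minimal sketch, assuming the 142 completed student questionnaires and the 7 teacher questionnaires are the denominators (the helper name `percent` is illustrative and not part of the study), the figures can be reproduced as follows:

```python
# Sample sizes reported in the study: 142 completed student questionnaires, 7 teachers.
N_STUDENTS = 142
N_TEACHERS = 7


def percent(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


# Respondents who offered suggestions on the test format
students_suggesting = percent(86, N_STUDENTS)
teachers_suggesting = percent(2, N_TEACHERS)

# Respondents who offered none (the complement of each group)
students_satisfied = percent(N_STUDENTS - 86, N_STUDENTS)
teachers_satisfied = percent(N_TEACHERS - 2, N_TEACHERS)

print(students_suggesting, teachers_suggesting)
print(students_satisfied, teachers_satisfied)
```

With only seven teachers, each teacher accounts for roughly 14 percentage points, so teacher percentages should be read with more caution than student percentages.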

Both students and teachers agree that font size and the availability of test copies positively impact students' test results, indicating that the current testing format ensures fairness. However, there is a discrepancy in their views regarding the number of questions on the test, attributed to unequal administration of the mid-term exam. Consequently, both groups provide suggestions for improvement in the fourth question.

3.4.1.2 Data analysis of the logistics of the test

The analysis of survey responses regarding the logistics of the test, as detailed in Table 4 (see Appendix 4), reveals that 57.7% of students feel the time allowed is insufficient, while 35.2% consider it adequate and 7.1% believe it is excessive. Students attribute their inability to complete the test within the allotted time to their limited language skills. In contrast, 28.6% of teachers believe the time is inadequate, whereas 71.4% feel it is sufficient. Most teachers are confident that students can complete the test on time, as they are aware of the time constraints and have practiced under similar conditions during mid-term tests. This disparity in opinions may stem from concerns about the fairness of mid-term assessments and perceived student complacency.

The survey shows that 83.1% of students and 100% of teachers believe the testing rooms are adequately sized, while 16.9% of students disagree, citing overcrowding as an issue. This overcrowding raises concerns about the fairness of the testing environment, as students in cramped spaces may be more susceptible to cheating.

In response to the seventh and eighth questions, both teachers and students unanimously agree that the number of test papers provided in the testing rooms is sufficient. Additionally, teachers implement strategies to prevent cheating by collecting all materials related to the test prior to its administration.

In the survey, 28.6% of teachers and 28.1% of students report that the atmosphere in testing rooms is very tense, while a majority, 71.4% of teachers and 63.4% of students, describe it as usual. Only 8.5% of students indicate they feel very comfortable during tests. Even so, students believe that most teachers strive to create a comfortable environment for testing, and only a small number of teachers are perceived as being overly strict in their supervision during these assessments.
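Because each respondent chooses exactly one option per question, the shares reported for a group should add up to 100%. The sketch below checks this for the atmosphere question. The function name and the tolerance are assumptions made for illustration, and the 0.0 share for teachers who feel "very comfortable" is inferred from the other two teacher figures rather than stated in the text.

```python
def sums_to_100(shares, tolerance=0.2):
    """Check that percentage shares add up to 100 within rounding error."""
    return abs(sum(shares) - 100.0) <= tolerance


# Testing-room atmosphere, in the order: very tense, usual, very comfortable
students_atmosphere = [28.1, 63.4, 8.5]
# Assumption: no teacher chose "very comfortable" (not stated explicitly)
teachers_atmosphere = [28.6, 71.4, 0.0]

print(sums_to_100(students_atmosphere))
print(sums_to_100(teachers_atmosphere))
```

A check of this kind is a quick way to spot a figure attributed to the wrong group when survey results are summarised in prose.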

The feedback from both teachers and students regarding question 10 reveals a notable similarity, with only 2 teachers and 5 students offering suggestions for test administration. Their recommendations emphasize the need for stricter oversight by teachers during the test. In contrast, the majority of teachers and students appear to lack insights on the logistical aspects of the testing process.

In conclusion, the analyzed data indicates that the test is equitable for all students, as both teachers and students concur on key aspects of the test's logistics:

- The rooms for testing are big enough;

- The quantity of test papers is adequate;

- The teachers have measures to prevent cheating in the exam;

- The atmosphere of the testing rooms is usual.

However, there is a minor difference between the perceptions of teachers and students regarding the time allowance, owing to students' varied language abilities and the lack of a pre-test before the test is administered.

3.4.2 Data analysis of face validity of the test

To assess the face validity of the test from the perspectives of both teachers and students, the author will analyze data gathered from surveys and interviews focusing on general opinions, reading comprehension tasks, grammar knowledge tasks, and translation tasks. This analysis seeks to identify similarities and differences in perceptions between teachers and students at NUTE, enabling the author to provide recommendations for enhancing the final ESP test.

3.4.2.1 Data analysis of general opinion about the test

The investigation into the language abilities measured by the test reveals that 74.4% of students and 71.4% of teachers believe it assesses a comprehensive range of skills, including vocabulary, translation, reading comprehension, and grammar knowledge. Conversely, 9% of students and 28.6% of teachers feel the test primarily evaluates translation skills, while 9.7% of students consider grammar knowledge as the main focus, and only 6.9% emphasize vocabulary knowledge. Notably, both students and teachers agree that the test does not solely assess reading comprehension.

According to the descriptive data from Chart 2 (see Appendix 4), 44.3% of the students think that the "translation into English" task reflects their true ability. Translation into Vietnamese is the second most frequently chosen task, selected by 20.7% of respondents, while 19.9% favor reading comprehension. Only 15.1% believe that grammar tasks accurately assess their true abilities. Among teachers, 57.1% agree that the test reflects students' true abilities, particularly in translation, whereas 42.9% disagree, arguing that the reading texts do not adequately represent student skills. Interviews with students indicate that mastering both vocabulary and structure is crucial for effective translation from Vietnamese to English, highlighting the importance of translation skills in demonstrating their true capabilities. While reading skills are also vital, students feel that the test does not accurately reflect their abilities.

3.5 Discussion and findings

Based on the above findings, the evaluation of the final achievement test reveals similarities and differences in opinion between teachers and students at NUTE.

3.5.1 Similarities in teachers' and students' perceptions

Both teachers and students believe that the standardized test administration promotes fairness, as evidenced by the use of Times New Roman font in size 12 and well-prepared test copies. Additionally, many agree that the testing rooms are spacious and that the quantity of test papers is sufficient. Teachers implement anti-cheating measures by collecting materials prior to the exam, contributing to a familiar atmosphere in the testing environment.

3.5.1.2.1 General opinion about the test

Most teachers and students share the view that the "translation into English or Vietnamese" tasks are good measures of students' language ability (Chart 2).

3.5.1.2.2 Reading comprehension task

Teachers and students generally share a consensus regarding the reading comprehension task, agreeing that its themes are clearly outlined in the syllabus and that the reading texts are of appropriate length and normal difficulty, reflecting previously learned information. The reading tasks also align with the skills practiced in class. However, the final test is primarily designed based on the syllabus-content approach, relying heavily on course materials and specific test objectives. This approach may not accurately assess students' actual reading comprehension abilities if the selected materials diverge from the course objectives, potentially compromising the test's validity. Consequently, the reading comprehension task may not effectively measure students' true reading skills, as it largely draws from textbook content.

Some students possess the ability to predict answers in reading comprehension tasks, while others resort to cheating during exams. This disparity highlights a lack of fairness in the assessment of reading comprehension skills.

Most teachers and students concur that the grammar task effectively represents the material learned and possesses face validity. However, they express a desire for a more diverse range of content that encompasses the entirety of the grammar knowledge acquired throughout the course. Additionally, they recommend that multiple-choice questions be utilized for the grammar test format.

Both test makers and test takers agree that the instructions for the translation tasks are clear and the content is suitable for students. However, there is a consensus that the structures of the translation tasks are not simple, particularly noting that sentences 3 and 4 in part IV require adjustments for simplification.

3.5.2 Differences in teachers' and students' perceptions

Many students believe that the number of questions on the test is excessive and cannot be completed within the allotted time, as indicated in Tables 3 and 4. In contrast, most teachers disagree, asserting that the question quantity is adequate for the time provided. Despite this, numerous students express frustration over their inability to finish the test due to the limited time frame. Consequently, students are requesting that teachers consider omitting or simplifying questions 3 and 4 in part IV, or adjusting the time allowance for the final test.

Many students struggle to comprehend the instructions in grammar tasks, while teachers often believe these instructions are clear. This disconnect largely stems from students' limited vocabulary, which hinders their understanding of key terms.

Teachers often overlook the term "passive form" in their lessons, leading to confusion among students. To enhance grammar instruction, it is essential for educators to provide clear examples during tests. This approach will help clarify the concept and improve students' understanding of passive constructions.

Teachers believe that translation tasks are standard for students and that high marks are achievable, while students feel that obtaining high marks, particularly in the "translation into English" section, is challenging. The primary reasons for students' low scores in this area are insufficient vocabulary knowledge and a lack of motivation. Students express a desire for more frequent practice and testing of translation skills in class to improve their performance.

The limited number of new words in translation tasks often leads teachers to believe that the content is derived from the course book, which primarily assesses students' familiarity with the textbook rather than their proficiency in using English. To excel in these assessments, students must dedicate themselves to expanding their vocabulary and enhancing their language skills.

3.6 Suggestions to improve the final achievement test

In section 2.5, the author discusses strategies to enhance the face validity of the test, supported by data analysis from surveys conducted with teachers and students at NUTE. The findings indicate that issues with the test stem from an oversimplified approach to both test administration and content. To address these challenges, the author proposes several recommendations aimed at improving the design and administration of the final achievement test.

To create effective test items, test makers must ensure clarity and precision in their questions. This can be achieved by having the items critically reviewed by colleagues who will seek alternative interpretations beyond the intended meaning (Hughes, 1989: 38). In the context of the English section at NUTE, teachers should first conduct trial tests, evaluate their results, and identify any weak items. By eliminating these weaker questions and replacing them with stronger alternatives, teachers can enhance the quality of the test items, ultimately leading to a more effective assessment.

To ensure objective scoring in assessments, the test should include items such as multiple-choice questions, open-ended items with a single correct answer, cloze tests, and matching exercises, as recommended by Hughes (1989: 40).

According to Weir (2000), test tasks must be clear and unambiguous, ensuring that candidates cannot misinterpret the instructions. Test instructions should be user-friendly, easy to understand, concise, and accessible, without adding to the task's difficulty. To aid comprehension, teachers should provide examples of correctly answered test items, helping students understand what is expected in each task. Ultimately, the complexity of a test should stem from its content, not the instructions provided.

First, the easiest way to solve the problem is for teachers to simplify sentences 3 and 4 in part IV. Teachers must adhere to the university's regulations regarding time allowances. However, they can facilitate understanding by selecting simpler sentences, structures, and vocabulary for their students.

In constructing tests, it is crucial to incorporate texts and activities that closely resemble what students have encountered and will likely face in their future professional environments, particularly for English for Specific Purposes (ESP) assessments. Current tests often merely evaluate students' rote memorization rather than their actual ability to apply knowledge in real-world scenarios. To address this, test creators should redesign assessments to gauge not only students' understanding but also their practical application skills relevant to future jobs. This can be achieved by avoiding the use of reading materials taken directly from course books and instead providing new texts that align with the course themes. Additionally, reading comprehension tasks should encompass essential skills such as skimming, scanning, and detailed reading to ensure a fair evaluation of all students' true abilities and knowledge.

To enhance students' grammar skills, the grammar tasks must be diverse and encompass the entirety of the grammatical concepts learned throughout the course. Rather than concentrating on a single aspect of grammar, these tasks should integrate various elements of grammar knowledge as outlined in section 3.1.5.

The test emphasizes translation skills, highlighting a common issue where students neglect vocabulary acquisition. To address this, teachers should prioritize regular vocabulary checks and practice in the classroom.

When designing a test, it is crucial to carefully consider its intended purpose, including the specific objectives to be measured and the target examinees (Henning, 1987). Test specifications should align with teaching objectives and relevant content, and should be established early in the test development process. The primary goal of the course is to enhance students' English for Specific Purposes (ESP) vocabulary, improve reading skills, and develop translation abilities, while also reinforcing grammar knowledge. These competencies are essential for students' future careers. Priority is given to vocabulary acquisition, followed by reading and translation skills, with grammar knowledge being less emphasized, as it has been covered in previous terms.

Solution 4: Pilot study and Pre-testing

Piloting and pre-testing are crucial phases in the development of language tests, as they provide essential feedback on task types and overall test design. A pilot study gathers insights on these aspects, while pre-testing involves a thorough analysis of the quality and level of test items. According to Weir (2000), the main goal of pre-testing is to assess the appropriateness of time allocations, ensuring that test takers can perform optimally. Additionally, pre-testing focuses on evaluating the test's effectiveness, allowing for necessary revisions to both the test and its administration procedures, rather than drawing conclusions about individual test-takers. Therefore, it is vital for test developers to implement pre-testing diligently before finalizing the language test.

Date posted: 28/06/2022, 10:06
