Instrumentation.

The method

In order to measure the parameters which have a significant influence on achievement (high school final exam grades or academic success, AS), this large scale quantitative study relies on the natural variation of experiences in physics education, background and decisions of the students. The measured parameters relate to facts in the reality of the student and are therefore always to be considered at least partly subjective.



The recollection of the student corresponds with reality (Bradburn, 2000)
The ideal in this research would be to measure all possible predictors of achievement (maximum completeness) for all individuals taking part in pre-academic physics education (maximum representativeness), with the collected data representing reality completely (maximum validity and reliability).

The reality of the social construct of physics education is far too complex to achieve this ideal. Therefore, the goal of the research process will be to search this complexity for nomothetic conclusions, i.e. conclusions that concern all, discarding information that is too idiographic, i.e. only concerning individuals. The quality of the process depends on how successful one is in retaining the most valid, reliable and interesting information in this process. This can be seen as an optimization process in the tension between total completeness yielding extreme idiographic conclusions of little use and total generalization yielding extreme nomothetic conclusions of little interest (Trochim, 2006).

Taking a more practical approach, the research process can be seen as an optimization towards the ideal while accounting for the many practical restrictions that are posed in the real world of everyday physics education. Many choices made within the research process are determined by weighing the consequences of the different options in different fields of tension between ideal and reality. As the consequences of potential choices are never exactly predictable, the choices are determined by experiences of others (literature, experts, physics teachers and physics students), by common sense, and sometimes intuition. Part of the optimization process takes place after the choice has been made and consists of strengthening the perceived positive sides of the choice and weakening the perceived negative sides.

For instance, the choice to use quantitative instead of qualitative methods has a positive influence on the representativeness of the sample, because one can involve larger numbers of respondents, but a negative influence on reliability, because it introduces measurement errors, misinterpretations and reading errors. Moreover, it leaves the researcher with less control and less flexibility. The choice to use answering scales instead of open-ended questions makes it possible to process a large amount of data, which may increase the completeness of the dataset, but it has a negative influence on the reliability of the answers, since answering scales provide the respondents with contextual cues that influence the answers they choose (Menon & Yorkton, 2000). Furthermore, this choice introduces possible technical problems and a lack of control and flexibility on the part of both the scientist and the respondent. As for the choice of questions, including more questions has a positive influence on the completeness of the questionnaire, but when the body of questions is too large, it is bound to deter potential participants and thereby decrease the representativeness of the sample.

Menon & Yorkton (2000) quote Schwarz: 'the traditional distinction between opinion questions (presumably answered on the basis of somewhat unreliable judgmental processes) and factual questions (presumably answered on the basis of more reliable recall from memory) is misleading. For a respondent, at the time that she is answering the questions, there is only the information in her memory (internal information) and the information she gets from the questions itself (external, contextual cues). Rate of occurrence reported by her are based on her internal information and therefore are judgments and so subjective' (p. 75).

I used the meta-analysis of Robbins, Lauver, Le, Davis, Langley, & Carlstrom (2004) to cover a large number of studies on psychosocial and study-skill predictors of college outcomes. Next to traditional predictors such as SES (socioeconomic status), high school GPA and ACT/SAT scores, the predictors 'achievement motivation' and 'academic self-efficacy' had a relatively high impact. Almer Gungor, who studied the influence of affective characteristics on freshmen's physics achievement in Turkey (Gungor, Eryılmaz, & Fakıoglu, 2007), also found achievement motivation to be the strongest predictor in this field. In that study, self-efficacy was not a significant predictor.

Self-efficacy is one of those predictors that are hard to capture by questioning a student through a survey. Even standard questions could be interpreted very differently in different cultures, and self-efficacy could have a different status, and thereby a different influence, for different characters, especially, but not only, if they are from different countries or cultural backgrounds. It could also be hard to differentiate between self-efficacy and general self-concept in a self-report, even though the correlations of these two parameters with AS differ by an order of magnitude (0.4 and 0.05 respectively; Robbins et al., 2004). Could it be that Gungor measured general self-concept instead of self-efficacy when he found an insignificant negative correlation between self-efficacy and achievement (Gungor, Eryılmaz, & Fakıoglu, 2007)?
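
To make the size of that gap concrete, the two correlations quoted from Robbins et al. (2004) can be converted into shares of explained variance (r squared). This is only an illustrative sketch: the dictionary labels and the function name are assumptions, and the coefficients are the rounded values cited in the text.

```python
# Rounded correlations with academic success (AS) as quoted in the text
# (Robbins et al., 2004). The labels are illustrative, not from the source.
correlations = {
    "academic self-efficacy": 0.4,
    "general self-concept": 0.05,
}


def explained_variance(r: float) -> float:
    """Share of outcome variance associated with a predictor of correlation r."""
    return r ** 2


shares = {name: explained_variance(r) for name, r in correlations.items()}

for name, share in shares.items():
    print(f"{name}: r = {correlations[name]:.2f}, r^2 = {share:.4f}")

# How many times more variance the first predictor accounts for
ratio = shares["academic self-efficacy"] / shares["general self-concept"]
print(f"ratio of explained variance: {ratio:.0f}")
```

Read this way, a factor of eight between the correlations becomes a much larger factor between the shares of explained variance, which is why confusing the two constructs in a self-report would matter.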


  • First questionnaire:
    by email - went to more than 9000 students - September/October 2008. 3230 students responded.

    The Dutch questions can be found on the internet.

    Presurvey

    No open-ended questions are included, except the last question, which asks for comments. 326 students responded seriously to this question; some of their comments have been quoted on this site.

    The PRiSE (2007) and FICSS (2003) questionnaires of the Harvard-Smithsonian Center for Astrophysics were used as examples for the basic questions. Follow-up comments on these questionnaires and information from 49 Dutch physics teachers about their own classroom activities helped me to optimize this basic questionnaire. Furthermore, predictors for high grades suggested in other literature were included (Carlone, 2004; Gungor, Eryılmaz, & Fakıoglu, 2007; Robbins et al., 2004; Zohar & Sela, 2003).

    In asking questions about physics lessons and teacher(s), the focus was on the pre-graduation school year (Dutch: 5th grade). In this junior year, physics has a large number of lessons per week (as compared to the previous year) and the lessons are not yet focused too much on exams (as compared to the senior year). However, choosing the junior year over the senior year increases the chance of recollection errors. Where possible, questions asking about occurrences used answer scales with an even number of options. This prevents students from choosing the easy middle option and forces them to choose sides (Menon & Yorkton, 2000).

    'Why didn't you include questions about 6th grade, those are at least as important as 5th grade'
    'This questionnaire was quite long. Next time shorter, please. Thanks in advance.'
    'It is hard to remember everything exactly; this makes "it" less reliable.'
    (Three different respondents).
    To validate this questionnaire, the following steps were taken:
    1. the questions were discussed with several small groups of students;
    2. some physics teachers commented on them;
    3. 14 students filled in the same questionnaire twice, 2 weeks apart.
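
    The third step is a test-retest check. The text does not say which statistic was used, so the following is only a minimal sketch under stated assumptions: it compares two administrations of a single 6-point item using a Pearson correlation and the share of identical answers. The function names and the answer lists are hypothetical and are not data from this study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


def exact_agreement(x, y):
    """Fraction of respondents who gave the identical answer both times."""
    return sum(a == b for a, b in zip(x, y)) / len(x)


# HYPOTHETICAL answers of 14 students to one 6-point item (coded 1..6),
# first and second administration. These are not data from the study.
first = [1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6]
second = [1, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6]

print(f"test-retest r = {pearson_r(first, second):.2f}")
print(f"exact agreement = {exact_agreement(first, second):.2f}")
```

    With only 14 respondents such figures are rough indications at best, and for ordinal scales a rank-based coefficient may be the more defensible choice.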

    Eventually, numerical and qualitative scales were adjusted and confusing or unanswerable questions were eliminated or reworded. The most important adjustment was including keywords from the questions in the scales, for students who read the questions only superficially. Many school-related questions, e.g. asking for the number of physics lessons per week, have been eliminated from the questionnaire. These school-related data have been collected at the schools themselves, eventually covering 30% of the 2 058 students who reported the name of their high school.
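
    The sample figures reported in this section can be put side by side with a few lines of arithmetic. This is a sketch for orientation only: the variable names are assumptions, and because the text says "more than 9000" students were contacted, the response rate computed here is an upper bound.

```python
# Figures reported in the text; variable names are illustrative.
invited_min = 9000        # "more than 9000" students were emailed (lower bound)
responded = 3230          # students who responded
commented = 326           # serious answers to the open comment question
named_school = 2058       # respondents who reported their high school
school_data_share = 0.30  # share of those covered by school-level data

# Upper bound, since the true number of invitations exceeds 9000.
max_response_rate = responded / invited_min
comment_share = commented / responded
named_school_share = named_school / responded
# Approximate count only: the 30% in the text is a rounded percentage.
approx_with_school_data = round(school_data_share * named_school)

print(f"response rate (at most): {max_response_rate:.1%}")
print(f"respondents leaving a serious comment: {comment_share:.1%}")
print(f"respondents naming their school: {named_school_share:.1%}")
print(f"respondents with school-level data (approx.): {approx_with_school_data}")
```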

    Questions in the survey:
    1. Questions about student characteristics.

      gender
      male - female
      part of youth in the Netherlands (student and/or parents)
      6 point scale; 0.0 - 1.0
      highest education of parent/caregiver
      5 most common levels
      level of intelligence (own opinion)
      6 point scale; not at all - very
      preferences (e.g. liking to think logically)
      yes/no
      major hobbies (e.g. making music, construction toys)
      less than - more than 4 years
      kind of pre-university education (atheneum/gymnasium)
      atheneum - gymnasium
      working for physics outside of lessons (in minutes/week)
      ...10...20...30...60...
      part of mandatory exercises made (percentage)
      ...20...40...60...80...
      learning strategy preparing for tests (by heart, through insight)
      6 point scale; not at all - always
      occurrence of student asking question in physics lessons
      6 point scale; never - once/month -- always
      challenged when exercises difficult
      6 point scale; not at all - very much so
      own opinion of working hard or effectively
      6 point scale; not at all - very much so
      exam grades of major subjects
      full grades 4-10
      average grade in middle school for science and math
      lower than 6 - 6 - 7 - 8 or higher
      occurrence of interesting lessons for student
      6 point scale; never - once/month -- always
      occurrence of lessons understood by student
      6 point scale; never - once/month -- always

    2. Questions about the physics teacher.

      competence (knowledge physics)
      6 point scale with keywords: not competent - very competent
      enthusiasm
      6 point scale with keywords: not enthusiastic - very enthusiastic
      pleasant for pupils
      6 point scale with keywords: unpleasant - very pleasant
      attitude
      6 point scale with keywords: not consistent (unpredictable) - very consistent (predictable)
      organization
      6 point scale with keywords: not structured (not clear what to do) - very structured (always clear what to do)
      feedback to pupils
      6 point scale with keywords: mostly negative comments - mostly positive comments
      control
      6 point scale with keywords: gives freedom to pupils - keeps class under control
      way of explaining
      6 point scale with keywords: no variation (the same all the time) - a lot of variation (different ways)
      relationship with class
      6 point scale with keywords: never understood each other - always understood each other
      gender
      male/female
      age
      younger than 30; 30-50; older than 50

    3. Questions about physics lessons and classroom characteristics:

      total number of pupils in the classroom
      scale: < 5; 5-10; 10-15; 15-20; 20-25; 25-30; > 30
      percentage female in the classroom
      scale: 0-20%; 20-40%; 40-60%; 60-80%; 80-100%
      teacher activities (e.g. up front; strategy; demo)
      6 point scale; never - once/month -- always
      pupil activities (e.g. working in groups)
      6 point scale; never - once/month -- always
      labs (in or outside of regular lessons)
      6 point scale; never - once/month -- always
      occurrence of lessons with homework
      6 point scale; never - once/month -- always
      pupil gave lesson or presentation
      6 point scale; never - once/month -- always
      use of textbook in lesson
      6 point scale; never - once/month -- always
      number of mandatory exercises per lesson (homework included)
      0; 2 or less; 2-6; 6 or more
      part of exercises from book
      6 point scale; 0.0 - 1.0
      availability of answers to exercises
      not at all; in classroom only; on internet/paper
      occurrence of different kinds of questions in test (e.g. involving calculations, graphs, fact)
      6 point scale; never - one per test -- all questions involved ... (keyword).
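
    The text does not state how the scale answers were turned into numbers. A minimal sketch, assuming equally spaced codes between 0.0 and 1.0 (as the "0.0 - 1.0" scales above suggest), could look as follows. Both function names and the coding itself are assumptions. The second function illustrates why a scale with an even number of options offers no neutral middle choice.

```python
def scale_codes(n_options: int = 6) -> list[float]:
    """Equally spaced codes from 0.0 to 1.0 for an n-option scale (assumed coding)."""
    if n_options < 2:
        raise ValueError("a scale needs at least two options")
    return [i / (n_options - 1) for i in range(n_options)]


def has_neutral_midpoint(n_options: int) -> bool:
    """True if one option sits exactly in the middle, i.e. the count is odd."""
    return n_options % 2 == 1


codes = scale_codes(6)
print(codes)

# A 6-point scale has no option coded exactly halfway; a 5-point scale does.
print(has_neutral_midpoint(6), has_neutral_midpoint(5))
```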