
I am looking for a publicly available data set of (ideally) the following kind.

  1. Multiple (must be 3 or more) examiners each grade the academic work (e.g., essay, dissertation, term-assignment) of multiple (ideally 20 or more) students.
  2. Every examiner grades every student.
  3. In addition to the grade given by each examiner to each student, there is a single "comprehensive" grade for each student given by faculty, based (in some manner that is transparent) on the other grades.

An example might be where all Honors-level dissertations are marked by the four members of a faculty examining committee. The description is of an ideal example, but I'd be very pleased to learn of other similar open-data examples where the grade involves some degree of "judgement" on the part of the examiner. I'm not looking, for example, for a dataset where multiple people ("examiners") each record a reading from a measuring instrument (i.e., give a "grade") under multiple conditions (equivalent to "multiple students").
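To make criteria 1 and 2 concrete, here is a minimal sketch of how a candidate dataset could be checked. It assumes the grades are available as (examiner, student, grade) records; the function name `is_fully_crossed` and the record layout are illustrative assumptions, not taken from any particular dataset.

```python
from collections import defaultdict


def is_fully_crossed(records, min_examiners=3):
    """Return True if every examiner graded every student exactly once.

    `records` is an iterable of (examiner, student, grade) tuples
    (a hypothetical layout chosen for this sketch).
    """
    counts = defaultdict(int)
    examiners, students = set(), set()
    for examiner, student, _grade in records:
        counts[(examiner, student)] += 1
        examiners.add(examiner)
        students.add(student)
    if len(examiners) < min_examiners:
        return False
    # Missing pairs count as 0, duplicated pairs as 2 or more.
    return all(counts[(e, s)] == 1 for e in examiners for s in students)
```

Criterion 3 (a transparent "comprehensive" grade) would still need to be verified from the dataset's documentation rather than from the records themselves.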

CrimsonDark
  • What about figure skating championships? – Deer Hunter Oct 10 '14 at 08:41
  • It's a nice idea. I previously had a look at the Olympic diving scores which are somewhat similar and I might have to settle for some sort of Olympic data ... but I'd really like to find an example related to academic grades. There is a dataset I saw written up (I think in the British Medical Journal) relating to hundreds of different examiners grading thousands of prospective medical practitioners but I couldn't extract a large enough (sub)set meeting the kinds of conditions I described. – CrimsonDark Oct 10 '14 at 11:20
  • Did you find any good dataset? – abc Mar 15 '19 at 19:15
  • @abc Unfortunately I did not but I'd be interested to know what use you were thinking of for such a data set. – CrimsonDark Mar 19 '19 at 00:17

1 Answer


It may be hard to find a dataset that matches your exact criteria, but there are some promising open datasets with essay scoring.


The data will contain ASCII formatted text for each essay followed by one or more human scores, and (where necessary) a final resolved human score.

Where it is relevant, you are provided with more than one human score, so that you may evaluate the reliability of the human scorers.

You can find code for benchmarks here.
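For a dataset like this, where an essay carries two or more human scores, rater reliability could be estimated with an agreement statistic. The sketch below uses quadratic weighted kappa, one common choice for ordinal scores; it is my own illustration, not the dataset's benchmark code, and the function name and the assumption of integer scores are mine.

```python
def quadratic_weighted_kappa(scores_a, scores_b):
    """Agreement between two raters' integer scores (1.0 = perfect)."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two non-empty score lists of equal length")
    lo = min(min(scores_a), min(scores_b))
    hi = max(max(scores_a), max(scores_b))
    k = hi - lo + 1  # number of score categories
    n = len(scores_a)

    # Observed joint counts and each rater's marginal histogram.
    observed = [[0] * k for _ in range(k)]
    hist_a, hist_b = [0] * k, [0] * k
    for a, b in zip(scores_a, scores_b):
        observed[a - lo][b - lo] += 1
        hist_a[a - lo] += 1
        hist_b[b - lo] += 1

    num = den = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2  # quadratic disagreement penalty
            num += weight * observed[i][j]
            den += weight * hist_a[i] * hist_b[j] / n  # expected by chance
    if den == 0:
        return float("nan")  # undefined: both raters used a single score
    return 1.0 - num / den
```

With three or more raters, one option is to compute this for every pair of raters and report the mean.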


Data available on this page include annotated organization scores for 1,003 essays from the International Corpus of Learner English (ICLE).

philshem
  • The two data sets are interesting and I appreciate knowing about them. – CrimsonDark Jan 16 '15 at 11:47
  • archive of the second dataset: https://web.archive.org/web/20190228093110/http://www.hlt.utdallas.edu/~persingq/ICLE/OrganizationScores.txt – philshem Jan 24 '20 at 07:41