Dolomites v1.0 readme

Packaged May, 2024. Download the dataset below:

Downloading the dataset requires agreeing to the Dolomites dataset license.


Tasks: 519 task descriptions provided by experts.
Task Validation Labels: Labels for task validation provided by 3 independent experts.
Examples: Examples of the tasks post-edited by experts. We provide the development set (830 examples) **with** reference outputs and the test set (1037 examples) **without** reference outputs.

Data overview

Tasks: Data Structure

The tasks are provided as a jsonlines file where each task contains the following fields:

task_id: Unique ID for each task
field: Field of the expert who provided the task.
specific_field: Specific subfield or area which the expert works on.
task_objective: 1-2 sentence objective of the task.
task_procedure: A few sentences describing the procedure to conduct the task.
task_input: Sections provided as input for the task, formatted as * section title: section description.
task_output: Sections expected as output for the task, formatted as * section title: section description.
task_notes: A few sentences describing missing context or additional details about the task.
task_urls: A list of URLs that may be useful for the task provided by the annotator.
annotator_id: Unique anonymized ID for the annotator who provided the task.
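
As a rough illustration of the structure above, the sketch below reads task records from jsonlines text and splits a task_input or task_output string into (title, description) pairs. The function names, the file name in the comment, and the sample string are hypothetical. The parser also assumes each section sits on its own line starting with "*", which the format description suggests but does not guarantee.

```python
import json


def parse_jsonl(lines):
    """Parse an iterable of jsonlines rows, skipping blank lines."""
    records = []
    for line in lines:
        line = line.strip()
        if line:
            records.append(json.loads(line))
    return records


def parse_sections(text):
    """Split '* section title: section description' lines into (title, description) pairs.

    Assumption: one section per line, each starting with '*'.
    Only the first ':' separates the title from the description.
    """
    sections = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("*"):
            continue
        title, _, description = line.lstrip("* ").partition(":")
        sections.append((title.strip(), description.strip()))
    return sections


# Made-up sample in the documented format (not taken from the dataset).
sample = "* Background: Short summary of the case\n* Findings: Key observations"
for title, description in parse_sections(sample):
    print(f"{title} -> {description}")

# With the real file (name assumed), usage might look like:
#     with open("tasks.jsonl", encoding="utf-8") as handle:
#         tasks = parse_jsonl(handle)
#     inputs = parse_sections(tasks[0]["task_input"])
```

Passing an open file handle to parse_jsonl works because a file object iterates line by line.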

Task Validation Labels: Data Structure

The task validation labels are provided as a json file. In each key-value pair, the key is a task objective and the value is a list of independent judgements about the task in the following format:

practical_qa: Task validation labels that answer the given questions for the following properties:

representativeness: How likely is this task to be conducted by an expert in your field?
complexity: How would you rate the complexity of this task?
time: How much time is typically required to complete this task?
usefulness: Would you or other experts find it useful if an AI system could be used to propose initial outputs for the task (which may be lacking), that can be validated and improved by experts?

societal_qa: Labels for the societal implications of using LMs as assistants for the task. Each label answers the given questions for the following properties:

anonymity: Is it important to ensure anonymity of any individuals or organizations mentioned if an AI system is used for conducting this task? This may be the case if there is sensitive information in the input that should not be stored or accidentally leaked by an AI system.
bias: Could relying on automatically generated outputs for this task result in biased or potentially harmful decisions for certain groups of people?
ethical: Are there ethical considerations associated with the use of AI systems for this task? This can include privacy issues, moral issues, copyright issues or any other issues.
workforce: Could partial automation of this task potentially have an impact on the workforce in the short term?
accessibility: Does the use of AI systems for this task require making exceptional considerations for ensuring accessibility to all users? For instance, a task that requires producing visual data might pose challenges for people with visual or motor disabilities.

additional_comments: Free-text justifications of each label provided for societal implications. This is a dictionary with the following keys and the relevant comment as value.

anonymity
bias
ethical
workforce
accessibility

comments: Any other comments provided about the task.
duration: Time taken to annotate this task.
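
To make the nesting concrete, here is a minimal sketch that tallies the labels for one property across all judgements. It assumes practical_qa and societal_qa are dictionaries keyed by the property names listed above, and that each label can be converted to a string. The function name, the file name in the comment, and the label values in the sample are hypothetical.

```python
import json
from collections import Counter


def tally_labels(validation, group, prop):
    """Count label values for one property over every judgement of every task.

    validation: dict mapping a task objective to a list of judgement dicts.
    group: "practical_qa" or "societal_qa" (assumed to be dicts).
    prop: a property name such as "complexity" or "bias".
    """
    counts = Counter()
    for judgements in validation.values():
        for judgement in judgements:
            label = judgement.get(group, {}).get(prop)
            if label is not None:
                counts[str(label)] += 1
    return counts


# Made-up miniature of the documented structure (label values are illustrative).
sample = {
    "Draft a case summary": [
        {"societal_qa": {"bias": "yes"}, "duration": 120},
        {"societal_qa": {"bias": "no"}, "duration": 95},
    ],
}
print(tally_labels(sample, "societal_qa", "bias"))

# With the real file (name assumed), usage might look like:
#     with open("task_validation_labels.json", encoding="utf-8") as handle:
#         validation = json.load(handle)
#     print(tally_labels(validation, "practical_qa", "complexity"))
```

Since each task is judged by 3 independent experts, the counts for a property should sum to roughly three times the number of tasks covered by the file.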

Examples: Data Structure

We provide both the development set (830 examples) and test set (1037 examples) in a zip file above. Each of these files is a jsonlines file where each row contains the fields listed below. The example outputs for the test set are not released publicly to ensure a clean evaluation.

example_id: Unique ID for each example.
task: A dictionary denoting the task in the same format as above.
post_edited_example: Complete post-edited example (input & output).
example_input: Input text for the example.
example_output: Output text for the example.
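
Because every example embeds its task as a dictionary, examples can be grouped by task without joining against the tasks file. The sketch below shows one way to do that. The function name and the file name in the comment are hypothetical, and the code assumes the embedded task dictionary carries the task_id field described for the tasks file.

```python
import json
from collections import defaultdict


def group_examples_by_task(lines):
    """Map each task_id to the list of example_ids that belong to it.

    lines: an iterable of jsonlines rows (for instance an open file handle).
    Assumes each row has "example_id" and a "task" dict containing "task_id".
    """
    grouped = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between rows
        example = json.loads(line)
        grouped[example["task"]["task_id"]].append(example["example_id"])
    return dict(grouped)


# With the real development set (file name assumed), usage might look like:
#     with open("dev.jsonl", encoding="utf-8") as handle:
#         by_task = group_examples_by_task(handle)
#     print(len(by_task), "tasks have at least one development example")
```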