Datasets are collected and published on many kinds of websites. Here are some key starting points.
Researchers from many scientific institutes store their data in one of these databases:
Figshare: http://figshare.com
Dataverse: http://dataverse.org
OSF: https://osf.io/
University repositories limit their content to the output of one or a few institutions. At the UvA and HvA this is
UvA/HvA-Figshare: https://uvaauas.figshare.com/
National repositories make research results, including datasets, from several universities in a country accessible, often by "harvesting" (retrieving information) from university repositories. In the Netherlands, this is
in which mainly output from humanities and social sciences can be found.
For datasets related to exact science, engineering and medicine, the best places to look are
DANS Easy: https://easy.dans.knaw.nl/ and
4TU data centre: https://data.4tu.nl/portal
Governments also publish open data:
The City of Amsterdam: https://data.amsterdam.nl/
The European Union: https://data.europa.eu/euodp/nl/home
Dutch government: https://data.overheid.nl/
There are also many subject-specific data search engines. On the websites of many university libraries, information specialists offer a selection of these for their specific field.
See for example the data management pages per discipline of the
UvA Library: https://uba.uva.nl/en/search-the-collection/search-by-discipline
(choose a discipline and then click on Data Management; this is not yet available for all disciplines)
There are also metacatalogues, or "repositories of repositories". These list not the datasets themselves but the repositories that collect them. When searching a metacatalogue, it is wise to start with broad subject categories.
Example: if you are looking for datasets on precipitation in Europe in a particular year, first search on the broader topic 'weather'. The metacatalogue links to various repositories. Only once you are in one of those repositories, use more specific search terms such as 'rainfall'.
Registry of Research Data Repositories: https://www.re3data.org/
OpenDOAR: https://v2.sherpa.ac.uk/opendoar
Dataportals: https://dataportals.org (geographically ordered)
Datahub: https://datahub.io
You can also use the general search engine Google.com to search for datasets. To avoid drowning in irrelevant results, we offer the following tips:
- In addition to the subject, type
data OR dataset OR "data set"
in the search query.
- You can search specifically for a certain file format with, for example
filetype:csv
and for data from a particular site or internet domain with, for example
site:.gov
- Place a - (minus sign) directly before words that should NOT appear in the search results.
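The tips above can be combined into a single query. The Python sketch below is one possible way to do that, not an official tool: the function names `build_dataset_query` and `build_search_url`, the example subject, and the excluded word are hypothetical, and the `https://www.google.com/search?q=` URL pattern is an assumption about how Google accepts queries.

```python
from urllib.parse import urlencode

# The data-related terms suggested in the first tip.
DATA_TERMS = 'data OR dataset OR "data set"'


def build_dataset_query(subject, filetype=None, site=None, exclude=()):
    """Combine the search tips into one query string (hypothetical helper)."""
    parts = [subject, DATA_TERMS]
    if filetype:
        # Restrict results to a file format, e.g. filetype:csv
        parts.append(f"filetype:{filetype}")
    if site:
        # Restrict results to a site or internet domain, e.g. site:.gov
        parts.append(f"site:{site}")
    # A minus sign directly before a word excludes it from the results.
    parts.extend(f"-{word}" for word in exclude)
    return " ".join(parts)


def build_search_url(query):
    """Turn a query into a search URL (the URL pattern is an assumption)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    query = build_dataset_query(
        "rainfall Europe", filetype="csv", site=".gov", exclude=["forecast"]
    )
    print(query)
    print(build_search_url(query))
```

The first printed line is the text you would type into the search box; the second is the same query encoded as a link.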
Google also offers a dedicated dataset search engine, which came out of beta in 2020:
Google Dataset Search: https://datasetsearch.research.google.com/