This website hosts the open datasets generated in the course of the Crossminer research project. The datasets include various pieces of data retrieved from the Eclipse forge: mailing lists, project development data, and AERI stack traces, in handy CSV and JSON formats. Each dataset has an R Markdown document describing its content and providing hints about how to use it. The examples provided mainly use the R statistical analysis software.

All data is retrieved from the Eclipse Alambic instance. Alambic is an open-source framework for development data extraction and processing.

All datasets are published under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

All data is anonymised. Please see the dedicated document to learn more about privacy and the anonymisation mechanism.

We’re open: if you’d like to contribute, or for any request or question, please see the Eclipse GitLab project page.