We are happy to contribute to open source data and software.
- GitTables - a corpus of 1.7 million relational tables extracted from CSV files on GitHub. The table columns have been automatically annotated with types from Schema.org.
- MeasEval - a manually annotated dataset for entity and semantic relation extraction, focused on finding counts and measurements, attributes of these quantities, and additional contextual information. It was produced for SemEval-2021 Task 8, which we organized.
- mlinspect - analyze and inspect Python machine learning pipelines to check for common issues.
- Torch-RGCN - a PyTorch implementation of relational graph convolutional networks.
- BLP - a model for inductive link prediction and entity classification on knowledge graphs whose entities have textual descriptions.
- conversationkg - a package for turning email lists into knowledge graphs.