Unsupervised relation extraction using sentence encoding



Adopt an unsupervised approach based on sentence encoding, without explicit feature selection, to extract relations from plain text ...




Create clusters of semantically similar sentences using a clustering algorithm ...




Outperforms existing unsupervised approaches in F-score ...



Future Work

Add hand-crafted features to improve the overall accuracy…


DICE, Paderborn University | AKSW, University of Leipzig

Manzoor Ali, Muhammad Saleem, and Axel-Cyrille Ngonga Ngomo

What We Want

Aims and Objectives

  • Extract relations by adopting an unsupervised approach dubbed US-BERT
  • Use sentence encoding instead of word embeddings to find semantically similar sentences
  • A feature-less approach that requires no explicit feature selection for relation extraction
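The pipeline implied by the objectives above, encoding whole sentences and then clustering them by similarity, can be sketched as follows. This is a minimal, self-contained illustration only: the `encode` function here is a hypothetical stand-in for a pre-trained BERT sentence encoder, and the hand-rolled k-means loop is one possible choice of clustering algorithm, not necessarily the one used in US-BERT.

```python
import math
import random

def encode(sentence, dim=8):
    # Hypothetical stand-in encoder: hashes words into a small fixed-size
    # vector and L2-normalises it. The real system would instead obtain a
    # dense vector from a pre-trained BERT-based sentence encoder.
    vec = [0.0] * dim
    for word in sentence.lower().split():
        vec[hash(word) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def kmeans(vectors, k, iters=20, seed=0):
    # Basic Lloyd's k-means over the sentence vectors.
    random.seed(seed)
    centroids = random.sample(vectors, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            # Assign each sentence vector to its nearest centroid.
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(v, centroids[c])))
            clusters[j].append(v)
        for j, members in enumerate(clusters):
            if members:
                centroids[j] = [sum(col) / len(members)
                                for col in zip(*members)]
    return centroids

sentences = [
    "Paris is the capital of France",
    "Berlin is the capital of Germany",
    "Einstein was born in Ulm",
    "Mozart was born in Salzburg",
]
vectors = [encode(s) for s in sentences]
centroids = kmeans(vectors, k=2)
```

Sentences landing in the same cluster are then treated as expressing similar relations, with no hand-crafted features involved at any step.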

Architecture of US-BERT

What We Have Achieved

  • Evaluation is performed on the NYT-FB dataset
  • NYT-FB contains 455,771 training sentences and 172,448 test sentences
  • Annotated by linking New York Times articles to the Freebase knowledge graph
  • The total number of relations is 253
  • Evaluation is based on the initial setting and on annotation with the AllenNLP NER
Models    Initial setting      AllenNLP NER
          P     R     F1       P     R     F1
RelLDA1   0.30  0.47  0.36     -     -     -
Simon     0.32  0.50  0.39     0.33  0.50  0.40
EType+    0.30  0.62  0.40     0.31  0.64  0.42
US-BERT   0.35  0.45  0.39     0.38  0.61  0.47



Precision (P), Recall (R), and F1 score of different systems using two NER annotation techniques on NYT-FB.
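As a quick sanity check on the table, the F1 column is the harmonic mean of precision and recall, F1 = 2PR / (P + R); for example, US-BERT's AllenNLP-annotated scores P = 0.38 and R = 0.61 reproduce the reported 0.47:

```python
def f1(p, r):
    # Harmonic mean of precision and recall.
    return 2 * p * r / (p + r)

# US-BERT under AllenNLP NER annotation (values from the table above).
score = round(f1(0.38, 0.61), 2)  # → 0.47
```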

Our Future Plan

Conclusion and Future Work

We used pre-trained sentence encoding to extract high-quality relations without any explicit feature selection. We achieved the best F1 and precision scores compared to the state-of-the-art (SOTA) unsupervised methods. In future work, we will investigate relation extraction further by adding feature selection and measuring its impact against our current feature-less results. We will also compare our approach with other SOTA approaches, particularly relation extraction systems based on language models.


This work has been supported by the EU H2020 Marie Skłodowska-Curie project KnowGraphs (860801), and the BMBF-funded Eurostars projects E!113314 FROCKG (01QE19418) and E!114154 PORQUE (01QE2056C).