Towards the identification of semantically similar texts with structure-aware methods