Linguistic Distance

Given a family tree of languages, the (tree-based) linguistic distance between two countries is defined as the expected normalized tree distance between the languages spoken by two individuals, one drawn at random from the population of each country. The cognate-based linguistic proximity is defined analogously, as the expected lexical similarity between the languages spoken by two such randomly drawn individuals.
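The expectation in this definition can be illustrated with a minimal Python sketch. It assumes that each country is described by the population shares of its languages and that a normalized distance is available for every pair of distinct languages, with a distance of 0 between a language and itself. The function name, the language labels, the shares, and the distance values below are all hypothetical and are not taken from the dataset.

```python
def expected_distance(shares_a, shares_b, pair_distance):
    """Expected distance between the languages of two random individuals,
    one drawn from country A and one from country B.

    shares_a, shares_b: dict mapping language -> population share (sums to 1).
    pair_distance: dict mapping frozenset({lang1, lang2}) -> normalized distance.
    Assumption: the distance between a language and itself is 0.
    """
    total = 0.0
    for lang_a, share_a in shares_a.items():
        for lang_b, share_b in shares_b.items():
            if lang_a == lang_b:
                distance = 0.0
            else:
                distance = pair_distance[frozenset((lang_a, lang_b))]
            # Probability of drawing this language pair times its distance.
            total += share_a * share_b * distance
    return total


# Hypothetical inputs, for illustration only.
shares_x = {"lang1": 0.75, "lang2": 0.25}
shares_y = {"lang2": 0.5, "lang3": 0.5}
pair_distance = {
    frozenset(("lang1", "lang2")): 0.4,
    frozenset(("lang1", "lang3")): 1.0,
    frozenset(("lang2", "lang3")): 0.8,
}

result = expected_distance(shares_x, shares_y, pair_distance)
print(result)
```

With these made-up inputs the result is approximately 0.625: the four language pairs contribute 0.15, 0.375, 0, and 0.1. Because each pair is weighted by the product of the two shares, the measure is symmetric in the two countries.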

The linguistic distance data can be downloaded here:

Dataset Format: 

Variables included: 

Methodology