Artificial Intelligence to map family trees

AI & Genealogy

The National Archives is starting to develop a new multi-generation register of Danes’ family relations going back to 1920.

The register will strengthen research in Denmark and provide greater insight into how hereditary and family conditions affect, for instance, health and social life trajectories.

Danish research is an international leader, recognized for its ability to analyze very long time series of historical data from registers, biobanks, and historical archives. The information can be linked via the CPR-number, but with a clear limitation: studies of health and social mobility can today cover only a few generations, because kinship in the CPR-register is only recorded for children born from 1960 onwards.

With the creation of a groundbreaking new multi-generation register, the National Archives will now go a step further and make research across generations possible over a much longer period, so that data can be linked from the time before the CPR-register. The multi-generation register will be able to identify kinship between all Danes born as far back as 1920. This will make it possible to research the significance of hereditary and family relationships for health, disease, etc.

Establishing family relationships

The new multi-generation register will link information about family relations in the CPR-register with the much older information from parish registers, so that family relations can be established between all Danes since 1920.

Many social and health problems are thought to be explained by family relationships, so there is important knowledge to be gained if researchers are able to study phenomena over 3-5 generations.

Analog sources are digitized with AI

To link the CPR-register to information from the National Archives’ parish registers, researchers from the University of Copenhagen’s Center for AI will develop algorithms that can decipher the handwritten parish registers. It is a difficult task because the records contain many different styles of handwriting, from more than 2000 parishes over a period of almost 60 years. The algorithms must be trained partly on manually transcribed parish registers and partly on a data set that combines information about names and dates from the parish registers with the corresponding digital records in the CPR-register.
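The idea of combining names and dates from the parish registers with the CPR-register can be illustrated with a minimal Python sketch. It is not the project's actual method: the record types, field names, and the exact-match rule are all assumptions made for illustration. The sketch pairs a transcribed parish entry with a register record only when the normalized name and the birth date agree and the match is unambiguous.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ParishEntry:
    image_id: str     # hypothetical reference to a scanned page or line
    name: str
    birth_date: date


@dataclass(frozen=True)
class CprRecord:
    person_id: str    # hypothetical stand-in for a register identifier
    name: str
    birth_date: date


def normalize_name(name: str) -> str:
    """Lowercase the name and collapse repeated whitespace."""
    return " ".join(name.casefold().split())


def link_records(parish_entries, cpr_records):
    """Return (image_id, person_id) pairs for unambiguous matches only."""
    # Index register records by (normalized name, birth date).
    index = defaultdict(list)
    for record in cpr_records:
        index[(normalize_name(record.name), record.birth_date)].append(record)

    links = []
    for entry in parish_entries:
        key = (normalize_name(entry.name), entry.birth_date)
        candidates = index.get(key, [])
        # Skip entries with no match or with several possible matches.
        if len(candidates) == 1:
            links.append((entry.image_id, candidates[0].person_id))
    return links
```

Discarding ambiguous matches keeps the linked pairs clean at the cost of coverage. A real system would likely also need to tolerate spelling variants and transcription errors, which this sketch does not attempt.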

The National Archives says:

“We are very much looking forward to getting started, not least with the work on artificial intelligence and historical handwriting, which opens up exciting new opportunities for the National Archives and our users. The multi-generation register is a good example of how important it is that the National Archives preserves data and documents from the public administration so that they can be used in new contexts for the benefit of Danish society. Together with our unique Danish health data, the multi-generation register will be an unparalleled research resource, and I do not think we can yet imagine the difference the register will make for Danish research.”

My Danish Roots:

We’re very much looking forward to the new opportunities. We’ll follow the progress closely and keep you updated here!