Representing Anthropological Knowledge: Calculating Kinship
Michael D. Fischer


Kinship Introduction
Learning Kinship with
the Kinship Editor
Use the Kinship Editor
Kinship Editor Results

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

Kinship Contents

The following is extracted from Chapter 7 of Applications in Computing for Social Anthropologists, Michael Fischer, 1994, Routledge, London.

Kinship programs: Introduction

In § 6.2 we outlined the conceptual specification for programs which would identify the people corresponding to specific (etic) genealogical positions relative to some ego and which would list the genealogical terms describing the relationship(s) between an ego and a specific person.

In this section we will examine some the ways in which this specification can be implemented in the programming language Prolog (the specification can be implemented in virtually any computer programming language). The approach to implementation was chosen based on the likelihood that the methods would be accessible to anthropologists.

The methods are relatively straight-forward and effective, at the cost of some efficiency and risk of derision by programmers. This method corresponds to what has been called 'straight-line coding', because little or no use is made of the algorithmic capabilities of the programming language. This form of coding is, however, more defensible in Prolog, which makes impossible some of the worst excesses possible in other programming languages. This method is the suited for those who will be using such techniques only occasionally, because there is relatively little to understand about the computing and program language if the problem itself is well understood by the anthropologist-programmer. It does, however, limit the degree to which the computer can be used in a given problem area, since much of the value of computers lay in the ability to represent generative descriptions of complex structures.

A second method (§7.3.3) uses the capacity of the programming language to write more general (and powerful) generative definitions of structures and operations on structures, but requires much greater ability on the part of the programmer. Again, these do not necessarily correspond to 'best programming practice'. A balance is struck between style and efficiency and intelligibility for the non-programmer. Even though these are not written in the most compact and generative manner possible, they will initially prove difficult to understand. (Indeed, it is unlikely that the beginner will understand these methods enough to write new programs based on them, without considerable effort and resort to introductory books on Prolog (see Bratko 1986 for a full introduction. Kononenko and Lavrac 1989 provides a quick introduction with examples )).

The programs are included because they a) illustrate the implementation of specification such as those we produced in the Specification section, and b) serve as a working model and starting point of discussion with a programming partner. They provide examples of attacking familiar problems which can be useful if you do develop programming further. Perhaps most importantly, as written they perform tasks which should be useful and are immediately accessible by typing the program text into the computer.

7.2.2 Representing relations in Prolog

The basic building block in Prolog is called a clause, which can take forms which include:

P(X,Y), Q(Y,Z).
P(X,Y) :- Q(Y,X).

where P is any predicate and X and Y are variables which range over some domain. The predicate P asserts a relationship between the arguments, if any, and can evaluated by Prolog as either true or false. The relationship asserted by a predicate can be of any sort: 'male(X)' /X is male/;' mother(X,Y)' /X is the mother of Y/; 'younger(X,Y)' /X is younger than Y/. The programmer, by their authoring of the predicates, states the relationships and is responsible for ensuring that conclusions drawn from evaluating the predicates has a reasonable interpretation outside the program. For obvious reasons, names are used which are meaningful to the programmer. These names are not meaningful to Prolog. For example, we might have two named predicates, spouse and gender:

spouse(X,Y) /X is a spouse of Y/
gender(X,Y) /the gender of X is Y/

Prolog must have some basis for assigning values of true or false to statements consisting of or including these predicates. The most basic method of supplying this information is to insert concrete instances of the predicates, by replacing the variables with data:

spouse(mary, john).
spouse(carl, janet).
gender(mary, female).
gender(john, male).
gender(carl, male).
gender(janet, female).

This form of predicate instance is, obviously, data. The anthropologist selects the predicate names and types in the information. At this early stage the anthropologist can access the information in four ways, by instructing Prolog to evaluate:

spouse(mary, john). /is mary the spouse of john?/
spouse(john,mary). /is john the spouse of mary?/
spouse(F, john). /who is the spouse of john?/
spouse(mary, K). /mary is the spouse of who?/
spouse(V,L). /who is the spouse of whom?/

The first query simply evaluates to true, because this is an instance previously supplied to Prolog. The second query evaluates to false. There is no instance which matches 'spouse(john,mary)'. Prolog, in evaluating this query, can make no use of the knowledge that spouse is a reciprocal term, because this knowledge has not been made available. A simple (but wasteful) means of establishing this relationship is to add the instance 'spouse(john,mary)' to the program.

The terms F, K, V and L in the latter three examples are variables. Any term which begins with an uppercase letter (A-Z) is treated as a variable by Prolog. When variables are present, information from each matching instance of the predicate is assigned to the variable from the corresponding position. The latter three will evaluate to true, but the presence of variables in the query will give more information. The third query supplies the additional information that the value of the variable F is 'mary'. The fourth that K is 'john'. The fifth evaluates to true twice, the first time with 'V=mary' and 'L=john', and the second time to 'V=carl' and 'L=janet'.

This is mildly useful, particularly if a large number of spouses are under investigation. Much more useful are compound clauses. If, rather than spouses, we want to list wives, make the query:

spouse(Wife,Ego), gender(Wife,female).

Prolog will evaluate 'spouse(Wife,Ego)' to be twice true. The first time Wife will be bound to 'mary', and Ego to 'john'. The comma instructs Prolog to evaluate the second clause with these same variable bindings. Treating 'gender(Wife, female)' as 'gender(mary, female)', this clause will evaluate to true, and the variable bindings can be examined to determine who this wife is and to whom. The second time 'spouse(Wife, Ego)' will evaluate to true, with Wife bound to 'carl' and Ego to 'janet'. Evaluating the second clause will these bindings results in evaluating 'gender(carl,female)' to be false, therefore the compound clause is false, and no variable bindings are available for examination.

Quite complex compound clauses can be formed, but if you intend to use the same clause more than once, or use it as the basis of an even more complex clause, the clause can be added to the program as a rule. We can define a predicate wife with the following instance in the program:

wife(Wife, Ego) :- spouse(Wife,Ego), gender(Wife,female).

This can be read: 'wife(Wife, Ego)' is true when the compound clause is true. If the compound clause is true, then the variables in 'wife(Wife,Ego)' will be bound to the values which led to this evaluation. If you make the query, 'wife(X,Y)', this will be evaluated precisely as the previous example, reporting that X is 'mary' and Y is 'john'. The variables in the query need not be those in the definition. Prolog attaches no meaning to variable names other than the establishment of uniqueness in a clause or compound clause, within a clause all variable instances with the same name will be bound to the same value.

The present definition of wife only finds wives when these are in the first position of the spouse clause. This can be corrected by adding a second definition of wife:

wife(Wife, Ego) :- spouse(Ego,Wife), gender(Wife,female).

Any predicate can have any number of instances, either of a concrete form or defined in terms of compound clauses. When evaluating the query, wife(Wife,Ego), the first instance will be tried, and then the second.

7.2 The Input Data Model

7.2.1 Representing the Input Data Model

Table 7.1 describes some partial data for a genealogy, which corresponds to the Input Data Model outlined in § 6.3.6. Although data could be prepared in this form, the purpose of the model is not to determine the form in which the data is collected, or even the form in which data will be presented to the program, but represents the information requirements of the program (in the terms of § 6.2 the Abstract Data Model) in a relatively non-redundant form. Whatever form the input data takes, it must be transformable into terms of the Input Data Model.

The Relation in each case designates a collection of information. The other headings label a category of information within this collection. The different headings are associated by the relations. The relation gender associates each Person ID with a Gender value. The relation spouse_set associates each Person ID with a Union ID (a name for a specific union or marriage) and Union Order. In this case Union Order is a simple number designating the ordinal of this particular union (first, second, third marriage), but it could be more complex if required in a specific social context. The relation sib_set associates a person with their parent's Union ID and the ordinal of their position in the sibling set (first born etc.).

Relation Person ID Gender

gender abdul male
gender mustaffa male
gender hamiz male
gender sufia female
gender humza male
gender nadya female

Relation Union ID Person ID Union Order

spouse_set 1 mustaffa 1
spouse_set 2 hamiz 1
spouse_set 2 sufia 2
spouse_set 5 humza 1
spouse_set 10 abdul 1
spouse_set 10 nadya 1

Relation Union ID Person ID Sibling Order

sib_set 1 abdul 3
sib_set 2 mustaffa 2
sib_set 3 hamiz 1
sib_set 4 sufia 2
sib_set 2 humza 1
sib_set 5 nadya 2

Table 7.1. Data fragment for Input Data Model from § 6.3.6

7.2.3 Defining the Input Data Model in Prolog

Table 7.2 shows one way the data in Table 7.1 can be represented in Prolog. A relation name becomes a predicate name in Prolog. Prolog has no information about the predicates gender, spouse_set and sib_set in Table 7.2, other than the following: there is a predicate named gender which is applied to two arguments. Any other meaning must be established by conventions which are applied in the program. It may be obvious (to people) from the predicate form of gender that the first position is a person's name and the second is an English gender term, but Prolog makes no such presumptions. The predicate 'X(Y,Z)' would have served as well, so long as we remember that X is a predicate which relates a gender Z to a given person Y, which gender value Z stands for and who Y is.

gender(abdul, male).
gender(mustaffa, male).
gender(hamiz, male).
gender(sufia, female).
gender(humza, male).
gender(nadya, female).

spouse_set(1, mustaffa, 1).
spouse_set(2, hamiz, 1).
spouse_set(2, sufia, 2).
spouse_set(5, humza, 1).
spouse_set(10, abdul, 1).
spouse_set(10, nadya, 1).

sib_set(1, abdul, 3).
sib_set(2, mustaffa, 2).
sib_set(3, hamiz, 1).
sib_set(4, sufia, 2).
sib_set(2, humza, 1).
sib_set(5, nadya, 2).

Table 7.2. Prolog representation of Table 7.1.

7.3 Representing the Abstract Data Model

7.3.1 The Abstract Data Model

The need for a separate Input Data Model and Abstract Data Model arises from the the basic concepts of genealogical analysis (among others). Genealogical relations are conceptualised as being relative to a specific individual, but it is useful to have the freedom to designate and vary which individual that might be. To input all the data directly in this form would be very wasteful. For a single genealogical interview of 30 people, this might consist of 871 entries. Using the proposed Input Data Model in § 7.2.3 requires 90 entries at most, and contains all the information required to construct these 870 relationships.

The Abstract Data Model is the set of schema or rules required to transform the normalised input data format into ego-relative terms. Following the specification in § 6.3.7, Program 7.1 is a Prolog implementation of instances which define the basic genealogical categories from the normalised input data.

These rules define the basic relationships, designated by a single letter corresponding to the notation suggested by Barnard and Good (1984:4 Table 1.1), with the exception of using lowercase letters for convenience due to the special meaning of uppercase letters in Prolog. For example compare the Prolog rule (predicate) for p /parent/ with the specification for parent in §6.3.7:

parent iff Spouse_ID of A is Parent's Spouse_ID of Ego then A is Parent of Ego

p(Parent,Ego) :-

a) The goal of the rule is to determine if Parent is the parent of Ego. In Prolog this goal is written p(Parent, Ego)
b) The construction ':-' is written after the goal. This can be read as 'when' or 'if', and introduces the set of clauses which specify how to satisfy the goal statement.
c) The clauses necessary to establish the goal statement are written, separated by commas, and ended with a full stop ('.').

If the information in Table 7.2 and Program 7.1 are consulted as a program in Prolog, the following query can be evaluated:

p(X,Y). /Who is a parent of Whom, or list all the parent-child pairs/

To evaluate this query, the first goal in the definition, 'sib_set(Spset, Ego, Order)', will be evaluated. Since sib_set is defined by instances of data this clause will be evaluated as:

sib_set(1, abdul, 3)

Ego is now bound to 'abdul', Spset to 1 and Order to 3. The second clause, 'spouse_set(Spset,Parent,Union)', begins with Spset set to 1, or 'spouse_set(1,Parent,Union)'. This is evaluated as:

spouse_set(1, mustaffa, 1).

so Parent is now bound to 'mustaffa'. Since both clauses have evaluated true, the goal is satisfied, and Ego and Parent are bound to the specific values 'abdul' and 'mustaffa'.

Prolog is called a non-deterministic language because there can be more than one solution to a problem. For this query there are six:

p(mustaffa, abdul).
p(hamiz, mustaffa).
p(sufia, mustaffa).
p(hamiz, humza).
p(sufia, humza).
p(humza, nadya).

with the order of evaluation:

sib_set(1, abdul, 3)
spouse_set(1, mustaffa, 1) --> p(mustaffa, abdul).
sib_set(2, mustaffa, 2)
spouse_set(2, hamiz, 1) --> p(hamiz, mustaffa).
spouse_set(2, sufia, 2) --> p(sufia, mustaffa).
sib_set(3, hamiz, 1)
spouse_set(3,X,Union) --> FAIL (no solution, no report).
sib_set(4, sufia, 2)
spouse_set(3,X,Union) --> FAIL (no solution, no report).
sib_set(2, humza, 1)
spouse_set(2, hamiz, 1) --> p(hamiz, humza).
spouse_set(2, sufia, 2) --> p(sufia, humza).
sib_set(5, nadya, 2)
spouse_set(5, humza, 1) --> p(humza, nadya).

The evaluation of the query, 'p(hamiz, Ego)', the variable Parent will be bound to 'hamiz' wherever it appears in the definition, which will follow the order of evaluation:

sib_set(1, abdul, 3)
spouse_set(1, hamiz, Union)--> FAIL.
sib_set(2, mustaffa, 2)
spouse_set(2, hamiz, 1) --> p(hamiz, mustaffa).
sib_set(3, hamiz, 1)
spouse_set(3,hamiz,Union) --> FAIL.
sib_set(4, sufia, 2)
spouse_set(3,hamiz,Union) --> FAIL.
sib_set(2, humza, 1)
spouse_set(2, hamiz, 1) --> p(hamiz, humza).
sib_set(5, nadya, 2)
spouse_set(5, hamiz, 1) --> FAIL.

The query, 'p(Q, nadya)' will be evaluated as:

p(Q, nadya)
sib_set(5, nadya, 2)
spouse_set(5, humza, 1) --> p(humza, nadya).

These evaluation sequences illustrate three important points. Firstly, although the variable names carry no special meaning in Prolog, a given name has the same value throughout a rule: once a value is assigned to a variable it 'sticks' through all subsequent clauses. Secondly, the variable names used in the query need not match the variable names used in defining the rule. In Prolog the variables are related to the position in which they appear in the goal statement and query, not the specific form of the variable. Thirdly, the computational efficiency of a definition can vary depending on the order of the defining clauses (although the resulting answer will not). This is not important if there are six clauses to evaluate, but can be an issue if there are one or two thousand.

Once a rule is defined in Prolog, it can be used within the definition of another rule just as if it were defined in terms of data. Making rules establishes knowledge about interrelationships in the data within the program. Having established how to establish a parent relationship from the Input Data Model, the rule can be used to define other rules for father, mother and child relationships.

The remainder of the rules in Program 7.3 should be clear, except for the construction, 'Ego \= Sib', in the g rule (sibling) and 'Ego \= Spouse' in the e rule (spouse). 'Ego \= Sib' can be read, 'Ego is not equal to Sib'. This situation arises because we are using the predicate sib_set twice in the definition, and although Ego and Sib are different variables, these can have the same value. Unless a person can be considered their own sibling, then this situation must be blocked by using the 'not equal to' predicate at the end of the definition. It must be at the end because it is the sib_set predicates which bind the variables to values.

p(Parent,Ego) :-

f(Father,Ego) :- p(Father,Ego), gender(Father,male).
m(Mother,Ego) :- p(Mother,Ego), gender(Mother,female).

g(Sib,Ego) :- sib_set(Spset,Ego,O1), sib_set(Spset,Sib,O2), Ego \= Sib.

b(Bro,Ego) :- g(Bro,Ego), gender(Bro, male).
z(Sis,Ego) :- g(Sis,Ego), gender(Sis, female).

c(Child,Ego) :- p(Ego,Child).

s(Son,Ego) :- c(Son,Ego), gender(Son, male).
d(Daug,Ego) :- c(Daug,Ego), gender(Daug, female).

e(Spouse,Ego) :-
spouse_set(Spset, Spouse,_), Ego \= Spouse.

w(Wife,Ego) :-
e(Wife,Ego), gender(Wife, female).
h(Husband,Ego) :-
e(Husband,Ego), gender(Husband, male).

Program 7.3. Prolog implementation of the Abstract Data Model.

gy(Sib,Ego) :-
Ordsib < Ordego.
ge(Sib,Ego) :-
Ordsib > Ordego.

Program 7.4. Prolog rules for younger and elder sibling.

In many cases other kinds of conceptual distinctions are necessary to accommodate significant relationships in some societies. For example, many societies distinguish between elder and younger siblings. The Input Data Model encodes the necessary information in the sib_set predicate in the Order parameter. Rules for making this distinction in the Abstract Data Model are in Program 7.4. In these rules the Order parameter for each are compared. The '<' is read as less than and '>' as greater than. Because Order cannot be equal, it is not necessary to check that Ego and Sib are not the same value, although it would not disturb anything to do so.

7.3.2 Finding complex relationships based on the Abstract Data Model A simple version

Program 7.5a is a portion of a simple, if inelegant, program for finding arbitrarily complex ego-centric relationships between people. The program is composed of as many definitions of a new predicate, relation, as is required to identify all the relationships that are of interest.

The basic form is:

relation(Rel,Ego,Relname) :- link1(L1,Ego), link2(L2,L1), …, linkn(Rel,L2).

Because the kind of relationship is 'hard-wired' in the definition the Relname is supplied as a fixed term. The main disadvantage to this approach is that a large number of relation predicates (over 200) must be written to cover the gender-specific expansion of the 39 types of relationship recommended as a minimum by Barnard and Good (1984:30) reproduced in Table 6.3. It can also be a tedious exercise to modify if only subsets of relationships are needed for a specific analysis, although separate versions of the program can be developed for different needs with the aid of a text editor and multiple copies of an original. The advantages are:

a) the approach is easy to understand.
b) the order in which relationships are reported is directly under the researcher's control. Prolog evaluates all the applicable (matching) relation predicates in the order these appear in the original program text.
c) only the relationships of interest need be included (and reported).

The tedium of entering the predicates can be reduced somewhat by the use of a good text editor, duplicating previous similar clauses and amending these. Be careful, because this is also a potential source of systematic error. It is not necessary to use variable names such as MM or MMZ, as appear in the program text, but this is helpful when referring back to the program and looks rather better than X and Y. Also, as in the last definition in Program 7.5a, other definitions of relation can be used to find the last link to the potential relative.

relation(Rel,Ego,'F') :- f(Rel,Ego).
relation(Rel,Ego,'M') :- m(Rel,Ego).
relation(Rel,Ego,'B') :- b(Rel,Ego).
relation(Rel,Ego,'Z') :- z(Rel,Ego).
relation(Rel,Ego,'S') :- s(Rel,Ego).
relation(Rel,Ego,'D') :- d(Rel,Ego).
relation(Rel,Ego,'W') :- w(Rel,Ego).
relation(Rel,Ego,'H') :- h(Rel,Ego).
relation(Rel,Ego,'P') :- p(Rel,Ego).
relation(Rel,Ego,'C') :- c(Rel,Ego).
relation(Rel,Ego,'G') :- g(Rel,Ego).
relation(Rel,Ego,'E') :- e(Rel,Ego).

relation(Rel,Ego,'MZ') :- m(M,Ego), z(Rel,M).
relation(Rel,Ego,'MB') :- m(M,Ego), b(Rel,M).
relation(Rel,Ego,'FZ') :- f(F,Ego), z(Rel,F).
relation(Rel,Ego,'FB') :- f(F,Ego), b(Rel,F).

relation(Rel,Ego,'MM') :- m(M,Ego), m(Rel,M).
relation(Rel,Ego,'MF') :- m(M,Ego), f(Rel,M).
relation(Rel,Ego,'FM') :- f(F,Ego), m(Rel,F).
relation(Rel,Ego,'FF') :- f(F,Ego), f(Rel,F).

relation(Rel,Ego,'MMZ') :- m(M,Ego), m(MM,M), z(Rel,MM).
relation(Rel,Ego,'MFZ') :- m(M,Ego), f(FF,M), z(Rel,FF).
relation(Rel,Ego,'MMB') :- m(M,Ego), m(MM,M), b(Rel,MM).
relation(Rel,Ego,'MFB') :- m(M,Ego), f(FF,M), b(Rel,FF).

relation(Rel,Ego,'MMZD') :- m(M,Ego), m(MM,M), z(MMZ,MM), d(Rel,MMZ).


relation(Rel,Ego,'MMZD') :- relation(MMZ,Ego,'MMZ'), d(Rel,MMZ).


Program 7.5a. Simple program for finding complex etic relationships.

In many cases it is necessary (or highly desirable) to know not only the relationship, but the linking relatives as well. Program 7.5b demonstrates a simple modification for this purpose, adding a list to store the linking individuals. A list in Prolog acts as a single object, but can contain a complex structure which includes other lists. In this case the list is a simple one, containing the linking individuals, separated by commas. Lists are surrounded by left and right square brackets, '[' and ']'. In the parameter list of relation, the list is a single parameter, regardless of how many values are within. For example, consider the query:

relation(nadya, abdul, Relation, Links).

and the responses:

relation(Rel,Ego,'F',[Ego,Rel]) :- f(Rel,Ego).
relation(Rel,Ego,'M',[Ego,Rel]) :- m(Rel,Ego).
relation(Rel,Ego,'B',[Ego,Rel]) :- b(Rel,Ego).
relation(Rel,Ego,'Z',[Ego,Rel]) :- z(Rel,Ego).
relation(Rel,Ego,'S',[Ego,Rel]) :- s(Rel,Ego).
relation(Rel,Ego,'D',[Ego,Rel]) :- d(Rel,Ego).
relation(Rel,Ego,'W',[Ego,Rel]) :- w(Rel,Ego).
relation(Rel,Ego,'H',[Ego,Rel]) :- h(Rel,Ego).
relation(Rel,Ego,'P',[Ego,Rel]) :- p(Rel,Ego).
relation(Rel,Ego,'C',[Ego,Rel]) :- c(Rel,Ego).
relation(Rel,Ego,'G',[Ego,Rel]) :- g(Rel,Ego).
relation(Rel,Ego,'E',[Ego,Rel]) :- e(Rel,Ego).

relation(Rel,Ego,'MZ',[Ego,M,Rel]) :- m(M,Ego), z(Rel,M).
relation(Rel,Ego,'MB',[Ego,M,Rel]) :- m(M,Ego), b(Rel,M).
relation(Rel,Ego,'FZ',[Ego,F,Rel]) :- f(F,Ego), z(Rel,F).
relation(Rel,Ego,'FB',[Ego,F,Rel]) :- f(F,Ego), b(Rel,F).

relation(Rel,Ego,'MM',[Ego,M,Rel]) :- m(M,Ego), m(Rel,M).
relation(Rel,Ego,'MF',[Ego,M,Rel]) :- m(M,Ego), f(Rel,M).
relation(Rel,Ego,'FM',[Ego,F,Rel]) :- f(F,Ego), m(Rel,F).
relation(Rel,Ego,'FF',[Ego,F,Rel]) :- f(F,Ego), f(Rel,F).

relation(Rel,Ego,'MMZ',[Ego,M,MM,Rel]) :- m(M,Ego), m(MM,M), z(Rel,MM).
relation(Rel,Ego,'MFZ',[Ego,M,MF,Rel]) :- m(M,Ego), f(MF,M), z(Rel,FF).
relation(Rel,Ego,'MMB',[Ego,M,MM,Rel]) :- m(M,Ego), m(MM,M), b(Rel,MM).
relation(Rel,Ego,'MFB',[Ego,M,MF,Rel]) :- m(M,Ego), f(MF,M), b(Rel,FF).

relation(Rel,Ego,'MMZD',[Ego,M,MM,MMZ,Rel]) :-
m(M,Ego), m(MM,M), z(MMZ,MM), d(Rel,MMZ).


relation(Rel,Ego,'MMZD',[Ego,M,MM,MMZ,Rel]) :-
relation(MMZ,Ego,'MMZD',[Ego,M,MM,MMZ]), d(Rel,MMZ).


Program 7.5b. Program 7.5a modified to record people in relationship.

Relation='W', Links=[abdul,nadya].
Relation='FBD', Links=[abdul,mustaffa,humza,nadya]. Another stage

Although Program 7.5b will suffice for some research purposes, it is extremely cumbersome if relationships of reasonable depth are required. There are at least 460 possible gender-specific relationships (including affines) with up to four links ('MMZD'), and at least 1580 with up to five links.

Prolog has powerful mechanisms for specifying generative predicates, where specific cases can be generated rather than listed as in Program 7.5b. This increases the complexity of defining rules. Program 7.6a is a first draft of a simple generative program for finding relationships. It operates on the non-deterministic properties of Prolog interpretation, in which all possible positive solutions are produced.

For example, the first relations rule finds all one-term relationships by trying each instance of the simple_rel predicate in turn. Those which succeed are reported, binding the relevant variable as in Program 7.5b This single relations rule 'replaces' eight relation rules from Program 7.5b. The second relations rule finds all two-term relationships, replacing 64 relation rules (Program 7.6a finds relationships such as 'WH' and 'FS', and even 'WW', as it stands).

The four relations rules will attempt to locate 4096 relationships, although some of these are impossible or unlikely, such as 'HH' or 'HWHW'. The query:

?- relations(nadya, abdul, T, L).

reports seven relationships:

Nº1 T = 'W', L = [abdul, nadya]
Nº2 T = 'FBD', L = [abdul, mustaffa, humza, nadya]
Nº3 T = 'FSW', L = [abdul, mustaffa, abdul, nadya]
Nº4 T = 'WFD', L = [abdul, nadya, humza, nadya]
Nº5 T = 'WHW', L = [abdul, nadya, abdul, nadya]
Nº6 T = 'FFSD', L = [abdul, mustaffa, hamiz, humza, nadya]
Nº7 T = 'FMSD', L = [abdul, mustaffa, sufia, humza, nadya]

To control the range of possible relationships the transitions between terms must be controlled.Program 7.6b contains an additional predicate, transition, which generates valid transitions for genealogical terms, and revisions to the relations predicate which use these transitions. Again, because of the generative nature of Prolog, all the transitions will be used, if necessary. The order of the transitions is only relevant to the order in which relationships will be reported.

Valid transitions are, of course, a matter for the researcher. The given transition predicates should be altered as necessary. For example, if affinal relations are not required, remove the transitions for 'H' and 'W', and relationships such as 'WF' will not be considered.

simple_rel(Rel,Ego,'F') :- f(Rel,Ego).
simple_rel(Rel,Ego,'M') :- m(Rel,Ego).
simple_rel(Rel,Ego,'B') :- b(Rel,Ego).
simple_rel(Rel,Ego,'Z') :- z(Rel,Ego).
simple_rel(Rel,Ego,'S') :- s(Rel,Ego).
simple_rel(Rel,Ego,'D') :- d(Rel,Ego).
simple_rel(Rel,Ego,'W') :- w(Rel,Ego).
simple_rel(Rel,Ego,'H') :- h(Rel,Ego).

relations(Rel,Ego,Gen,[Ego,Rel]) :- simple_rel(Rel,Ego,Gen).

relations(Rel,Ego,Gen,[Ego,L1,Rel]) :-
simple_rel(L1,Ego,T1), simple_rel(Rel,L1,T2),

relations(Rel,Ego,Gen,[Ego,L1,L2,Rel]) :-
simple_rel(L1,Ego,T1), simple_rel(L2,L1,T2),

relations(Rel,Ego,Gen,[Ego,L1,L2,L3,Rel]) :-
simple_rel(L1,Ego,T1), simple_rel(L2,L1,T2),
simple_rel(L3,L2,T3), simple_rel(Rel,L3,T4),

[ If your Prolog lacks stringof you can use the following definition. Method using name predicate, which should be standard (refer to your manual)

stringof(Charlist, Atom) :-
name(Atom, Bytelist),


strof1([T|Rest], [C|More]) :-

stringof(Charlist,Atom) :-

Program 7.6a. A generative relationship program.

Program 7.6b may be slow if it is used on a database with a large number of people. This is less of a problem if the relationship to be found is not known in advance, for instance if the relationships between two people are at issue (eg '?- relations(abdul,nadya,T,L)'), and cannot easily be remedied in any case. But if the relation is known in advance, the process of finding people who correspond

transition('F','F'). transition('F','M'). transition('F','B'). transition('F','Z').
transition('M','F'). transition('M','M'). transition('M','B'). transition('M','Z').
transition('B','S'). transition('B','D'). transition('B','W').
transition('Z','S'). transition('Z','D'). transition('Z','H').
transition('S','S'). transition('S','D'). transition('S','W').
transition('D','S'). transition('D','D'). transition('D','H').
transition('H','B'). transition('H','Z'). transition('H','F'). transition('H','M').
transition('W','B'). transition('W','Z'). transition('W','F'). transition('W','M').

relations(Rel,Ego,Gen,[Ego,Rel]) :- simple_rel(Rel,Ego,Gen).

relations(Rel,Ego,Gen,[Ego,L1,Rel]) :-
simple_rel(L1,Ego,T1), transition(T1,T2),simple_rel(Rel,L1,T2),

relations(Rel,Ego,Gen,[Ego,L1,L2,Rel]) :-
simple_rel(L1,Ego,T1), transition(T1,T2),simple_rel(L2,L1,T2),
transition(T2,T3), simple_rel(Rel,L2,T3),

relations(Rel,Ego,Gen,[Ego,L1,L2,L3,Rel]) :-
simple_rel(L1,Ego,T1), transition(T1,T2),
simple_rel(L2,L1,T2), transition(T2,T3),
simple_rel(L3,L2,T3), transition(T3,T4), simple_rel(Rel,L3,T4),

Program 7.6b. Additions and revisions to generative relationship program.

to this relationship can be greatly accelerated. The problem with the program as it is written so far is that if the relationship is given in the query, the relationship is not checked until the end of the relations rule. After all cases of the relevant relationship are found, the program will continue until it has exhausted all possibilities. With up to five-term relationships there are 1580 possible simple and complex connections. On a large database of people as many as 7264 simple relationships (eg calls to simple_rel) might be evaluated. At 100 evaluations per second, the total search might amount to over a minute. In practice, these maximums are unlikely to occur for demographic reasons, but unnecessary delays will occur.

Program 7.6c contains a new predicate, relation, and revisions to relations to remedy this problem. The predicate relation checks to see if the relationship is a variable or not, that is, if the query has a term or a variable in that position. If it is not a variable, the stringof predicate converts the genealogical term (or atom in Prologese) into a list of the letters in the term. (stringof is built in to most Prologs. See Program 7.6a if it is not) This binds the variables T1, T2 etc in the rules to these specific simple relationships, so that alternatives will not be searched. It

relation(Rel,Ego,Gen,Link) :-
nonvar(Gen), stringof(Gens,Gen), relations(Rel,Ego,Gen,Link,Gens).
relation(Rel,Ego,Gen,Link) :-
var(Gen), relations(Rel,Ego,Gen,Link,Gens).

relations(Rel,Ego,Gen,[Ego,Rel],[Gen]) :- simple_rel(Rel,Ego,Gen).

relations(Rel,Ego,Gen,[Ego,L1,Rel],[T1,T2]) :-
simple_rel(L1,Ego,T1), transition(T1,T2),simple_rel(Rel,L1,T2),

relations(Rel,Ego,Gen,[Ego,L1,L2,Rel],[T1,T2,T3]) :-
simple_rel(L1,Ego,T1), transition(T1,T2),simple_rel(L2,L1,T2),
transition(T2,T3), simple_rel(Rel,L2,T3),

relations(Rel,Ego,Gen,[Ego,L1,L2,L3,Rel],[T1,T2,T3,T4]) :-
simple_rel(L1,Ego,T1), transition(T1,T2),
simple_rel(L2,L1,T2), transition(T2,T3),
simple_rel(L3,L2,T3), transition(T3,T4), simple_rel(Rel,L3,T4),

[Define more relations definitions for longer links if necessary.]

Program 7.6c. Additions and revisions to improve efficiency of generative relationship program.

also sets the length of the list to a specific value, so that if 'FBD' is the query relationship, only the three-term rule will apply.

The additions to the relations rules also opens up the possibility of greater control over the relations to examine. For example the query:

?- relations(Rel,abdul,Term,Link,['F','B',T]).

will find only relationships via the FB; FBD, FBS and FBW with the current transition control. However, by specifying the list as three long, only three-term relationships can be reported. Prolog has a special list notation that removes this restriction:

?- relations(Rel,abdul,Term,Link,['F','B'|Rest]).

The vertical bar '|' indicates that the remainder of the list, from zero to n elements, will be bound to the variable Rest (or whatever name chosen). This form of the query will report all two, three and higher term relationships.

It is also possible to specify individuals in the Link list in the same manner:

?- relations(Rel,abdul,Term,[X,hamiz|More],['F','B'|Rest]).

so that relationships through that person only will be considered, if any. If the additional control is not needed, then the relation predicate can be used.

7.3.3 Controlling the search

Program 7.6c illustrates the requirements of making the program both efficient and permitting more control over the relationships which are searched for, since most of the time we probably do not want to see all possible relationships. Program 7.7a introduces the idea of 'generate and test' for finding specific relationships. The primary addition is a predicate valid_rel which is a set of predicates in which you list those relations of interest. There are changes to relation which use valid_rel in the second relation definition to generate values to pass to the first definition of relation. It is assumed that any term you select for the first definition is valid. Note that the second relation predicate makes a call to relation. This is an elementary example of recursion, defining a predicate in terms of itself, although in this case it is trivial, since the call can only be successful for the first relation predicate, since the term corresponding to Gen will not be a variable since it is bound before the call.

Program 7.7a has much the same problem as Program 7.5b; you have to specify each kind of relationship you want to investigate, and there are a large number of these. Using recursion we can introduce another kind of control over the kinds of relationships to search for. In addition to asserting relationships using valid_rel, we can control term generation in terms of affinity, collaterality, and ascending and descending generation. We can, for example, generate all terms with no more than one affinal link, or no more than two ascending generations, or relations which are strictly lineal. In Program 7.7b, the predicate relation is modified to use a new predicate gen_rel, which generates relationship terms based on a specification by the user of how many links through affines, collaterals, and ascending and descending relations. As an example, the query:

valid_rel('F'). valid_rel('FF').
valid_rel('FFF'). valid_rel('FM').
valid_rel('FMF'). valid_rel('FMM').
valid_rel('FB'). valid_rel('FZ').
valid_rel('FBS'). valid_rel('FBD').
valid_rel('FZS'). valid_rel('FZD').
valid_rel('S'). valid_rel('SS').

relation(Rel,Ego,Gen,Link) :-
nonvar(Gen), stringof(Gens,Gen), relations(Rel,Ego,Gen,Link,Gens).
relation(Rel,Ego,Gen,Link) :-
var(Gen), valid_rel(Gen), relation(Rel,Ego,Gen,Link).

[The relations predicate is defined in Program 7.6c]

Program 7.7a. Program modifications to Program 7.6c to control search

relation(Rel, Ego, Term, Link, 1,1,1,1).

will report all people in the database related with at most one link of each type; F, FB, FBS, FBD, FBW, B, BW, BWF etc. It does this using the recursive routine grel1, which uses the transition predicate defined in Program 7.6b, and a predicate valid_trans which subtracts one from the appropriate variable as the links are followed. When all the parameters Affine, Collateral, Ascending, Descending are zero, then all calls to valid_trans fail, and the recursion in grel1 ceases. This is a more typical use of recursion, since the same definition of grel1 is called by itself to follow the next link; literally an operation which is defined in terms of itself. Prolog makes considerable use of recursion, and to fully use the language must be understood. Because of its importance, it is well covered in all textbooks on Prolog.

7.3.4 Kinship Terminologies

We can introduce the use of emic kinship categories into this program by the simple inclusion of a list of predicates with terms as one argument and the etic kin string as the other, such as:

terms(father, 'F'). terms(mother, 'M')
terms(grandfather, 'FF'). terms(grandmother, 'GM')
terms(uncle, 'FB'). terms(uncle, 'MB')

and so on. These can be used where desired to either classify output terms into emic categories, or to generate search terms from emic categories.

Prolog is also an excellent language for working with terminological properties themselves, although a treatment of this here is beyond the scope and word limits of this book. See Read and Behrens (1988 and 1992) for a description of their program KAES, which assists the user in determining algebraic descriptions of kinship terminologies.

7.4 Addendum: Drawing Kinship Diagrams

Drawing kinship diagrams automatically is a non-trivial task, not because it is difficult to draw a diagram, but because it is difficult to draw the diagram you want. As Read and Behrens (1992) suggest, drawing diagrams using a drawing program (§5.2.2) is sometimes the easiest way to get a diagram of a particular sort.

However, for all their flaws, computer generated diagrams have considerable attraction, and considerable value for some research purposes, since although they might not be suitable for publication, they can relate a lot of information, and they can be drawn quickly.

There are no generally available programs at present for plotting kinship diagrams, although quite a number have been written to assist in research. The basics are not difficult, however, and it is often possible to some assistance in developing such a program.

relation(Rel,Ego,Gen,Link, Affine,Collateral,Ascending,Descending) :-
nonvar(Gen), stringof(Gens,Gen), relations(Rel,Ego,Gen,Link,Gens).
relation(Rel,Ego,Gen,Link, Affine,Collateral,Ascending,Descending) :-
gen_rel(Gen, Affine,Collateral,Ascending,Descending), relation(Rel,Ego,Gen,Link).

[The relations predicate is defined as in Program 7.6c. valid_rel as in Program 7.7a]

gen_rel(Rel1, Affine,Collateral,Ascending,Descending) :-

gen_rel(Rel1, Affine,Collateral,Ascending,Descending) :-
valid_trans(R,Affine,Collateral,Ascending,Descending, A1,C1,As1,De1),


grel1(R,[R|Rel],Affine,Collateral,Ascending,Descending) :-
valid_trans(N,Affine,Collateral,Ascending,Descending, A1,C1,As1,De1),
/* fails if not valid*/

valid_trans('W',Affine,Coll,Ascend,Descend, A1,Coll,Ascend,Descend) :-
Affine \= 0,
A1 is Affine - 1.
valid_trans('H',Affine,Coll,Ascend,Descend, A1,Coll,Ascend,Descend) :-
Affine \= 0,
A1 is Affine - 1.

valid_trans('B',Affine,Coll,Ascend,Descend, Affine,C1,Ascend,Descend) :-
Coll \= 0,
C1 is Coll - 1.
valid_trans('Z',Affine,Coll,Ascend,Descend, Affine,C1,Ascend,Descend) :-
Coll \= 0,
C1 is Coll - 1.

valid_trans('F',Affine,Coll,Ascend,Descend, Affine,Coll,As1,Descend) :-
Ascend \= 0,
As1 is Ascend - 1.
valid_trans('M',Affine,Coll,Ascend,Descend, Affine,Coll,As1,Descend) :-
Ascend \= 0,
As1 is Ascend - 1.

valid_trans('S',Affine,Coll,Ascend,Descend, Affine,Coll,Ascend,De1) :-
Descend \= 0,
De1 is Descend - 1.
valid_trans('D',Affine,Coll,Ascend,Descend, Affine,Coll,Ascend,De1) :-
Descend \= 0,
De1 is Descend - 1.

Program 7.7b. Program modifications to Program 7.7a to control search using parameters

Program 7.8a and Program 7.8b together form a small Prolog program for drawing simple kinship diagrams. You will need to be fairly familiar with Prolog to follow it, though we have covered most of the features in it in the previous examples. It assumes that the same data structures for relationships as developed in this chapter. It is intended simply for example, and as a basis for more elaborate programs, since it illustrates the two basic operations: positioning people (Program 7.8a) and then drawing them at these locations and connecting them with appropriate lines (Program 7.8b).

Program 7.8a is fairly portable, which means it should work in almost any version of Prolog. It is also fairly general, and is likely to resemble any program routine which positions. One problem it does not deal with adequately is the placement of spouses whose nuclear families lay in the scope of the diagram, since in some cases they should be placed with the nuclear family and in others nearer the spouse. This can be added on a basis which reflects the analytic biases your particular analysis requires. The program makes no assumptions about screen or paper size, instead locating people on a abstract unit grid. It is up to the drawing program to translate these grid coordinates into actual screen or paper locations.

There are only two predicates that you need access, clr_loc and calc_all. clr_loc simply clears out the results of any previous calculations, so that a fresh set of locations can be set. calc_all is called with the form:

calc_all([List of Sibset IDs], Generation, Start width, End width).

where the first argument is a list of sibsets to calculate, Generation the relative generation of these sibsets, which should all be at the same level (e.g. '0', '1', '2'). A second call can be issued for other unrelated sibsets of a different relative generation. At the first call, Start width should be '0', and for subsequent calls should be the value of End width for a previous set of sibs (in the same diagram). This program makes considerable use of recursion (§7.3.3), to trace the links of descent and affinity. These recursive predicates (such as calc_sibsets) follow a common pattern in Prolog, consisting of at least two definitions of the predicate, and where the first matches the terminal condition for the recursion, breaking the cycle. This is usually finding an empty list ('[]'), since recursion is most often used with lists.

Program 7.8b draws a diagram based on the calculations in Program 7.8a. Although it is illustrative, it is specific to a particular 'dialect' of Prolog, MacProlog. This lack of generality results from the means of drawing, which tends to be idiosyncratic by program, although most will have equivalent drawing operations. In this example the first three predicates define the symbols for females, males and neuter genders, which are drawn in the predicate plot_sym . The sibline, spouseline, and descentline predicates define the line shapes for these lines, which are drawn in plot_ sibline, plot_ spouseline, and plot_ descentline respectively. MacProlog does its actual drawing with a built-in predicate add_pic which uses these definitions. You will have to implement something compatible with your version of Prolog, or another language if you translate the algorithm.

% Calculate Positions of a rooted tree in Grid Coords

clr_loc :-
retractall(location(X,(Y,Z))), % removes all location clauses

calc_all(Sibsets,Gen,In,Out) :-

calc_sibsets([Sibs|Rest],Gen, In,Out) :-
calc_sibsets(Rest,Gen, O1,Out).

calc_sibset(Spouse_Id, Gen, In_width, Out_width) :-
calc_people(Sibs,Gen,In_width,Out_width, Locs),

calc_people([Person|Rest],Gen, In,Out, [Loc|Slocs]) :-
calc_people(Rest,Gen, O1,Out, Slocs).

calc_person(Person, Gen, In, Out,Loc) :-
get_spouses(Person,Spouses), % get all spouses past and present
G1 is Gen + 1,
calc_sibsets(Spouses, G1, In, O1), % calculate for each set of children
calc_loc(In, O1, R), % terminal nodes are treated differently.
Out is O1 + 1,
Loc = (Person, G1, R),

calc_loc(In, In, In). % terminal node case.
calc_loc(In, Out, R) :-
In \= Out, % not a terminal case e.g. descendents
% adjusted the margin
R is (Out - In) / 2 + In - 1. % centre over descendents with room
% for spouse(s)

/* the following definitions use findall, which should be fairly
portable. If it is not present in your version of Prolog, look
for a function which collects all solutions to a clause.
bagof will usually do, setof will not, as it sorts the
results, losing the birth order if you entered the siblings
by birth order

get_sibs(Spouse_Id, Sibs) :-

get_spouses(Person, Spouses) :-

set_loc((Person,Gen,Col)) :-
not(location(Person,_)), % don't reset if location is already set
set_loc((Person,Gen,Col)). % if is already in database then succeed

set_sib(Sib_id, Sibs) :-
not(sib_loc(Sib_id,_)), % don't reset if location is already set
set_sib(Sib_id, Sibs). % if is already in database then succeed

Program 7.8a. Program to calculate positions on a genealogical grid for specified sibling groups.

The rest of the program is fairly portable, except for to_grid , which translates the abstract unit grid of Program 7.8a to a unit compatible with the drawing program, usually in terms of absolute pixels or dots on the screen or printer. In this case I used 35 pixels for the vertical translation, which is about one-half inch on my screen, and 18 pixels for the width.

plot_all will make the entire diagram. Note the use of the fail predicate in plot_people to step through all the locations. This is a common mechanism in Prolog because it lacks a general 'loop' operation. This works by getting each value, plotting it, and then failing, which causes Prolog to try the next value of location. When all the locations are dealt with, the next definition of plot_people is tried, which is designed to simply succeed, so that the remainder of the plot_all predicate can be interpreted. plot_sibsets works in a similar manner.



plot_all :- plot_people, plot_sibsets.

plot_people :-

plot_sym(P,(Y,X)) :-
plot_sym(P,(Y,X)) :-
plot_sym(P,(Y,X)) :-
not(male(P)), not(female(P)),

plot_sibsets :-

sib_lines([(P1,Y1,X1)|[(P2,Y2,X2)|Rest]]) :-
sib_lines([(P,Y,X)|Rest]) :-

plot_sibline((Y1,X1),(Y2,X2)) :-
D1 is X2 - X1,

spouse_line(S_id) :-
spouse_loc(P1, P2, L1, L2),

spouse_loc(P1,P2,L1,L2) :-

spouse_loc(P1,P2,(Y,X1),(Y,X)) :-
X1 is X + 1,
set_loc((P1, Y,X1)).

spouse_loc(P1,P2,(Y,X),(Y,X1)) :-
X1 is X + 1,
set_loc((P2, Y,X1)).

plot_spouseline((Y1,X1),(Y2,X2)) :-
D1 is X2 - X1,

descent_span([(_,Vert,First)|Sibs], (Vert,First), Last) :-
descent_line(Sibs) :-

plot_descentline((Y1,X1),(Y2,X2)) :-
D1 is (X2 - X1)/2,

to_grid(Y1,X1,Y2,X2) :-
Y2 is Y1 * 35,
X2 is X1 * 18.

Program 7.8b. MacProlog program to plot genealogical data calculated in Program 7.8a.