Explaining Criminal Careers: Implications for Justice Policy

John F. MacLeod, Peter Grove, and David Farrington

Print publication date: 2012

Print ISBN-13: 9780199697243

Published to Oxford Scholarship Online: January 2014

DOI: 10.1093/acprof:oso/9780199697243.001.0001



Characteristics of Individuals


Chapter:
6 Characteristics of Individuals
Source:
Explaining Criminal Careers
Author(s):

John F. MacLeod

Peter G. Grove

David P. Farrington

Publisher:
Oxford University Press
DOI:10.1093/acprof:oso/9780199697243.003.0006

Abstract and Keywords

Analysis of data from the Offender Assessment System (OASys) database and the Police National Computer is used to demonstrate a link between psychological-assessment scores and the risk/rate categories of the theory proposed in this book. Two analyses are presented: the first of 1600 male offenders from the OASys pilot study; and the second using data from 154,000 offenders from the operational database. Using the psychological data from OASys the majority (65%) of high risk category offenders were identified using a simple dichotomy on total score. A principal component analysis improved this figure to 90%. Using the OASys data to identify the risk categories, together with conviction data from the PNC, the number of offenders reconvicted within 15 months of conviction in April 2004 was predicted to within 1%.

Keywords: Offender Assessment System, OASys, PNC, psychological-assessment, principal component analysis, reconviction prediction

Orientation

Our theory, proposed to explain aggregate age–crime curves, assumes that there are three categories of offenders: high-risk/high-rate, high-risk/low-rate, and low-risk/low-rate. Each category has a constant rate of offending and a constant probability of reoffending. In Chapter 5 we showed that a theory assuming that offending was strongly determined by age, and that individual rates of offending increased to a peak in the teenage years and then decreased, could be ruled out. In this chapter, we investigate the psychological characteristics of offenders and whether these characteristics can be used to allocate individuals to the risk/rate categories.

Introduction

We have constructed very successful predictive models on the basis of our theory. However, the evidence for the underlying theory, the existence of a small number of categories with simple offending behaviours, is circumstantial. The theory is the simplest possible that explains the main features of the aggregate statistical data. Even making successful predictions of the number of convictions at any given age (the age–crime curve) for each offence number (first, second, third, etc) does not necessarily confirm the existence of distinct categories. As we have seen it is difficult to find more intuitive theories that can fit the known criminal career information. But, until we can demonstrate that the categories really do differ in their psychology, due to genetic, social, or other environmental factors, the suspicion might remain that all we are seeing is some statistical coincidence or artefact of the data.

Unfortunately, neither of the databases that we have relied upon so far, the Offenders Index (OI) and the Court Appearances database, contains psychological information. The direct analysis of such factors is thus impossible. On the other hand, large databases of psychological information seldom contain criminal career information of sufficient detail to enable comparisons with the offender categorization we propose. We were therefore very grateful that the Home Office Offender Assessment System (OASys) database was made available to us for analysis. Initially, approximately 2,000 offender records collected in the pilot phase of OASys were provided for analysis. Since then the use of OASys has become routine for the probation and prison services, and a second sample of linked OASys and PNC records was made available to us for further analysis. The two analyses used different methodologies and are reported on separately below.

In Chapter 2 we identified our high- and low-risk categories, not from the characteristics of individuals, but from the statistical properties of the offender population with respect to the distribution of conviction numbers. In this chapter we show that OASys assessments of individuals can be used to dichotomize the offender population into two groups, one of which displays the recidivism characteristics of the high-risk category and the other dual-risk characteristics, in the proportions predicted by the OASys score distributions. This result suggests that psychological assessments can be used to allocate individual offenders to the risk categories more effectively than criminal history information on its own.

The Rationale and Development of OASys

In common with most jurisdictions the Home Office has a statistically based tool for estimating the aggregate probability of reconviction for a group of offenders with a particular set of characteristics (eg previous convictions, gender, and age). The Home Office tool is called OGRS (Offender Group Reconviction Score). The first version of the system is described by its developers in Copas et al (1998), and since then it has been further refined and developed by the Home Office. It was found that the best predictors of future offending were based purely on criminal history information. This is of course entirely consistent with expectations based on our theory where the categories have a very regular offending behaviour as seen in official statistics. However it is a rather static predictor.

One might ask, however, what about the underlying psychological factors which are believed to cause offending? These have been extensively reviewed (see eg Farrington 2007, 2010; Jolliffe and Farrington 2010). However, although offending behaviour is driven by psychological characteristics, it is much easier to measure past offending than to directly measure the underlying offender characteristics. Previous offending may be the best surrogate measure of the psychological characteristics which influence future offending.

Since the mid-1990s, large-scale programmes seeking to reduce the criminality of offenders have been introduced, both within prisons and in the community. These programmes have led to the development of a tool to predict future offending on the basis of so-called ‘dynamic’ factors. The difference between a ‘dynamic’ factor, such as taking drugs, unemployment, peer group influences, etc, compared with the ‘static’ factors of age, gender, and previous conviction history, is that the ‘dynamic’ factors might be changed by offender programmes and other social policy. In addition, by measuring the change in the ‘dynamic’ factors it should be possible to measure the effectiveness of a programme while an offender is being treated, rather than having to wait to see if they are reconvicted. (For reviews of risk assessment, see eg Andrade, 2009; Otto and Douglas, 2010.)

The National Probation Service and the Prison Service of England and Wales (now the National Offender Management Service) have co-operatively developed the Offender Assessment System (OASys), to use as their tool to calculate a ‘dynamic’ factor reconviction score. OASys is actually rather more than this, being a system for risk assessment and management as well as a way of producing and reviewing a sentence/supervision plan. The OASys pilot study was a first step towards a national computerized database of ongoing assessments for most offenders entrusted to the care of the probation and prison services. Following the successful pilot, phased implementation of the system was set in train. This should be a very useful resource for future criminal career research.

OASys consists of an extensive questionnaire to be completed by prison/probation staff, in the course of offender interviews and from the offender’s documented criminal record. The questions which are of particular interest to our research are contained in sections 7, 11, and 12 of the questionnaire (see Table 6.1). Like most of the questions throughout the questionnaire, the answers are scored 0, 1, or 2, where 0 is benign or positive, and 2 is problematic. Section 7, entitled Lifestyle and Associates, consists of five questions which address issues of interaction with others. Section 11, entitled Thinking and Behaviour, consists of ten questions covering self-control, empathy and awareness. Section 12, entitled Attitudes, covers just that. In offender assessments the total scores from each of these sections are combined with the scores of the other sections (technically, by using a non-linear transformation followed by a linear weighting) to arrive at a summary measure related to the overall probability of reconviction.

Table 6.1 OASys Questionnaire sections 7, 11, and 12

7. Lifestyle and Associates

7.1 Community integration (Attachments to individual(s) or community groups. Participation in organized activities not linked to offending, including in prison, eg sports clubs, faith communities, etc) (Absence of any links = 2). Score 0, 1, 2

7.2 Regular activities encourage offending (Do the leisure activities most commonly engaged in create opportunities to offend, or contribute to the need to offend eg gambling in prison?). Score 0, 1, 2

7.3 Easily influenced by criminal associates (Are most offences committed with others? When in the community does s/he spend a large amount of their time with other offenders?). Score 0, 1, 2

7.4 Manipulative/predatory lifestyle (Does s/he exploit others or abuse friendships, relationships, positions of trust? Does s/he use others, live off others without reciprocation, bully others?). Score 0, 1, 2

7.5 Recklessness and risk-taking behaviour (Lifestyle includes excessive thrill-seeking and risk-taking activities. Demonstrates intolerance for boring, unchallenging or unchanging situations. Needs excessive excitement or stimulation). Score 0, 1, 2

11. Thinking and Behaviour

11.1 Level of interpersonal skills (Are the offender’s social/interpersonal skills adequate ie to their background and normal circumstances?). Score 0, 1, 2

11.2 Impulsivity (Does offender prefer to act rather than plan, take decisions which are later regretted, become bored easily, require stimulation?). Score 0, 1, 2

11.3 Aggressive/controlling behaviour (Does offender show aggression to others, or use violence or threats in order to resolve conflicts with others, eg domestic violence?) 0 = no aggressive behaviour. Score 0, 1, 2

11.4 Temper control (Does offender lose his/her temper easily and often? Does s/he have a low tolerance, is s/he poor at conflict resolution, unable to control emotions?) 0 = no problems controlling their temper. Score 0, 1, 2

11.5 Ability to recognize problems (Does the offender have insight into areas of their life which are problematic?). Score 0, 1, 2

11.6 Problem solving skills (Is the offender’s approach to solving problems illogical? Does s/he employ inappropriate strategies? Does s/he recognize contribution of others? Is s/he unable to think flexibly?). Score 0, 1, 2

11.7 Awareness of consequences (Does the offender recognize that most courses of action have a mixture of positive and negative outcomes? Is s/he able to balance these?) 0 = is aware of consequences. Score 0, 1, 2

11.8 Achieves goals (Does the offender fail to set goals in all areas of their life? Are they unrealistic and unsupported by planning? Does s/he lack motivation to achieve goals? No examples of reaching goals) 0 = Achieves goals. Score 0, 1, 2

11.9 Understands other people’s views (Is the offender unable to interpret social situations correctly or form acceptable relationships with peers and those in authority? Does s/he fail to demonstrate feelings for others or remorse for victims?) 0 = is able to understand others. Score 0, 1, 2

11.10 Concrete/abstract thinking (Does offender hold rigid dogmatic views or have difficulty in thinking in general terms rather than about specific incidents? Is s/he unable to consider problems in the abstract, infer general principles and adapt to circumstances?). Score 0, 1, 2

12. Attitudes

12.1 Pro-criminal attitudes (Does s/he express attitudes supportive of criminal behaviour in general? Does s/he believe everyone offends given the opportunity?). Score 0, 1, 2

12.2 Discriminatory attitudes/behaviour (Evidence from offending or lifestyle of attitudes or behaviour which may be considered as racist/sexist or degrading to any group in society.) 0 = no discriminatory attitudes or behaviours. Score 0, 1, 2

12.3 Attitude towards staff (Has s/he accepted and co-operated with authority?). Score 0, 1, 2

12.4 Attitude towards supervision/licence (Past experience of supervision if applicable. Does s/he view supervision favourably or unfavourably? Is s/he likely to co-operate with supervision?) 0 = no problems with being supervised. Score 0, 1, 2

12.5 Attitude to community/society (Does the offender acknowledge the rights of others, accept the necessary limits on personal freedom? Does s/he express a wish/willingness to be part of the community?) 0 = acknowledges the rights of others. Score 0, 1, 2

12.6 Does the offender understand their motivation for offending (How well does the offender recognize which of their own attitudes, beliefs, emotions and needs are linked to their offending? How much insight do they have into their own behaviour?). Score 0, 1, 2

Analysis of the Pilot OASys Data

To test the Offender Assessment System ‘in action’, a pilot study was undertaken of around 2,000 offenders convicted in late 1999 and early 2000 who were either placed in the care of the probation service or received a custodial sentence. The OASys assessments were carried out by carefully trained assessors, assuring a relatively high quality of assessment. Limited criminal career information was obtained for each of the 2,000 offenders from the Police National Computer.

The OASys pilot database contained paper records for around 1,600 male offenders. These records provided limited psychological information. Data from section 11 of the assessment questionnaire was made available to us together with basic information, from the Police National Computer (PNC), on the offenders’ criminal careers. The section 11 questions loosely fall into two categories. Questions 11.1 to 11.4 address factors such as impulsivity, which might, if unchecked, lead to offending behaviour, while 11.5 to 11.10 look at factors influencing offenders’ ability to control their behaviour, such as their ‘awareness of consequences’. Improving offenders’ ability to control their behaviour by the use of so-called ‘cognitive behavioural’ programmes is the approach that has the greatest backing, from empirical evidence, for reducing recidivism (see eg Bernfeld, Farrington and Leschied 2001; Lipsey and Landenberger, 2006; McGuire 1995). Initially, however, we will consider only the total section 11 score.

In this analysis we confined ourselves to male offenders, as there are too few females for meaningful analysis. The offenders assessed in the pilot study had received relatively severe sentences, prison or probation, presumably for relatively serious offences. We therefore expect them to be members of the offender categories identified in Chapters 2 and 3.

The fact that criminal career information in the database comes from the PNC, rather than the Offenders Index, raises three issues. The first is simply that the PNC records offences to a lower level of seriousness than the Offenders Index. We therefore expect the recidivism of the high- and low-risk categories to be higher in the PNC data compared to recidivism calculated from the OI. The second issue is that the PNC is an operational rather than statistical database. The focus of users of the PNC will be such as to minimize operational rather than statistical problems. Comparisons that have been carried out between the OI and the PNC (Francis and Crosland 2002) suggest that they both have missing data but with rather different patterns. Finally, we must remember that the PNC data is not complete before 1995 (see Chapter 2).

The first step in this analysis was to establish that the criminal career data for the male offenders included in the OASys pilot sample conforms to the recidivism patterns identified in Chapter 2. The small sample size and cross-sectional nature of the pilot data did not permit the joint estimation of the parameters (a, ph, and pl) as was done with the much larger cohort samples. However, following the procedures outlined in Chapter 2 and illustrated in Figure 2.3, we can estimate the high- and low-risk parameter values. We first estimated the high-risk probability from the slope of the logarithmic plot of the conviction number frequency data for conviction numbers from 6 to 23. The parameter value obtained was then substituted in Equation 2.2 (see the inset graph of Figure 6.1). The residuals from the high-risk recidivism line were then calculated for the counts of conviction numbers one to eight, enabling the estimation of the low-risk probability. The high- and low-risk parameters were then substituted into the dual-risk recidivism model (Equation 2.4).

Figure 6.1 The dual risk recidivism model fitted to the conviction count data from the OASys pilot, male offenders
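The first two steps of the estimation procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the counts are synthetic and noise-free (they are not the OASys pilot data), the functional form used to generate them is an assumed shape for the dual-risk model (Equations 2.2 and 2.4 are not reproduced in this chapter), and the helper name `fit_log_slope` is hypothetical.

```python
import math

def fit_log_slope(ns, counts):
    """Least-squares slope and intercept of log(count) against conviction number."""
    xs = list(ns)
    ys = [math.log(c) for c in counts]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

# Synthetic counts (an assumption, not real data), generated from the assumed form
# count(n) = N * (a * ph**(n - 1) + (1 - a) * pl**(n - 1)).
N, a, ph, pl = 1500, 0.52, 0.91, 0.65
counts = {n: N * (a * ph ** (n - 1) + (1 - a) * pl ** (n - 1)) for n in range(1, 24)}

# Step 1: high-risk probability from the slope over conviction numbers 6 to 23.
tail = range(6, 24)
slope, intercept = fit_log_slope(tail, [counts[n] for n in tail])
ph_est = math.exp(slope)

# Step 2: residuals from the fitted high-risk line for conviction numbers 1 to 8.
residuals = {n: counts[n] - math.exp(intercept + slope * n) for n in range(1, 9)}
```

Even with noise-free input, the slope estimate falls slightly below the generating value because low-risk offenders still contribute to the counts at conviction numbers 6 to 8. The low-risk probability would then be estimated from the residuals by the Chapter 2 procedure, which is not shown here.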

The main graph in Figure 6.1 shows the OASys pilot data (points on the graph), the dual-risk recidivism model (central solid line), and the ±2σ bounds (95 per cent confidence interval, dotted lines on the graph) assuming a Poisson distribution of counts at each conviction number. It must be pointed out at this stage that we could not have reliably identified the dual risk recidivism model directly from the OASys/PNC data had we not known what to look for. But, despite that, the model does adequately describe the data as virtually all the data points fall within the expected bounds.

The recidivism parameters calculated for the dual risk recidivism model from the OASys/PNC pilot data are: a = 0.52, ph = 0.91, pl = 0.65. The probabilities are higher than the cohort and sentencing sample values of Chapter 2, derived from the Offenders Index, but closer to the values derived in Chapter 4 when considering serious offenders. The proportion of offenders in the high-risk category is also very much higher than in the 1953 and 1958 cohorts. As explained previously, the parameters are characteristics of the sample of offenders rather than of the individuals within the sample and are therefore conditioned by the selection criteria for the sample. In this instance all the offenders in the sample will have been convicted for relatively serious offences and few if any of the low-seriousness offenders (who make up the majority of the low-recidivism risk category of the cohorts) are likely to be included. Also, we expect PNC data to produce higher recidivism probabilities for all categories and, as we shall see later in this chapter, the estimates are broadly consistent with those from a much larger sample of OASys/PNC data.
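To make the roles of a, ph, and pl concrete, the sketch below evaluates an assumed form of the dual-risk curve at the quoted parameter values, together with the ±2σ Poisson bounds used in Figure 6.1. The form a·ph^(n−1) + (1−a)·pl^(n−1) is an assumption, since Equation 2.4 is not reproduced in this chapter. It is at least consistent with the chapter's later remark that over 50 per cent of offenders with one conviction are high-risk. The function names are hypothetical.

```python
import math

def dual_risk_curve(n, a, ph, pl):
    """Assumed model shape: relative frequency at conviction number n."""
    return a * ph ** (n - 1) + (1 - a) * pl ** (n - 1)

def high_risk_share(n, a, ph, pl):
    """Share of offenders at conviction number n who are high-risk (assumed form)."""
    return a * ph ** (n - 1) / dual_risk_curve(n, a, ph, pl)

def poisson_bounds(expected):
    """Plus or minus two standard deviations around a Poisson expected count."""
    half_width = 2 * math.sqrt(expected)
    return max(expected - half_width, 0.0), expected + half_width

# Parameter values quoted in the text for the OASys/PNC pilot data.
a, ph, pl = 0.52, 0.91, 0.65

share_first = high_risk_share(1, a, ph, pl)    # equals a at the first conviction
share_seventh = high_risk_share(7, a, ph, pl)  # high-risk offenders dominate by n = 7
lower, upper = poisson_bounds(100.0)           # bounds for an expected count of 100
```

Under this assumed form the high-risk share rises steadily with conviction number, which is why the higher conviction counts can be treated as almost purely high-risk.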

The Distribution of Section 11 Scores

As mentioned above, each of the questions in section 11 is coded 0, 1 or 2, where 0 represents ‘normal’ for the general population and 2 indicates serious problems. The expectation is that high scores are indicative of high criminal propensity and should be correlated with other measures of criminality. Table 6.2 shows the frequency distribution of total section 11 scores. The total score with the highest frequency occurs at zero with 124 offenders assessed as having no problems with any of the constructs covered in section 11 of the questionnaire. The next most frequent score occurs at a total score of 7, with 114 offenders. Only nine offenders are assessed as having serious problems with all of the constructs.

The distribution of scores can be described as bimodal, ie with two humps, with a general trend of reducing numbers of offenders as the scores increase. Criminality, as measured by number of previous convictions, on the other hand shows a reducing trend with only one maximum at zero (one conviction) (see Figure 6.1). The relationship between the two measures is clearly more complicated than a simple correlation.
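The summary figures quoted for the score distribution can be checked directly from the counts in Table 6.2. The short sketch below does this; the variable names are illustrative only, and the threshold of six anticipates the dichotomy used later in the chapter.

```python
# Counts from Table 6.2, indexed by total section 11 score (0 to 20).
counts = [124, 79, 92, 95, 85, 100, 100,
          114, 96, 108, 94, 95, 76, 63,
          51, 38, 39, 19, 23, 13, 9]

total = sum(counts)

# Scores ranked by frequency, most frequent first.
ranked = sorted(range(len(counts)), key=lambda s: counts[s], reverse=True)
top_two = ranked[:2]

# Mean total score and the number of offenders scoring six or more.
mean_score = sum(s * c for s, c in enumerate(counts)) / total
six_or_more = sum(counts[6:])
```

The total reproduces the 1,513 offenders given in the table note, and the two most frequent scores are 0 and 7, as stated in the text.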

The bimodal nature of the section 11 score is suggestive of a mixture of two separate homogeneous distributions which hopefully would be correlated with the risk categories identified above (see Figure 6.1). If this is the case, what would we expect to see? Offenders in the low- (recidivism) risk category should have lower scores on all constructs, with few of these offenders having total scores above, say, 5. We must also bear in mind that the measuring system for each construct is very coarse. A borderline, 0 or 1, score on all 10 constructs could result in a total score of anything between 0 and 10. However, even with such coding uncertainties, we would expect the frequency of higher scores to diminish very rapidly, perhaps distributed as a negative exponential with a mean around 2 or 3.

Table 6.2 Frequency distribution of total section 11 scores

Total section 11 score     0     1     2     3     4     5     6
Count of offenders       124    79    92    95    85   100   100

Total section 11 score     7     8     9    10    11    12    13
Count of offenders       114    96   108    94    95    76    63

Total section 11 score    14    15    16    17    18    19    20
Count of offenders        51    38    39    19    23    13     9

Note: Based on 1,513 male offenders from the OASys pilot data.

The high-recidivism risk category, on the other hand, should dominate the higher section 11 scores. We would not expect many, if any, of the high-risk category to have section 11 total scores of 0 and, if the constructs are not too highly correlated, neither would we expect many to have the maximum score. If our high-risk offenders are indeed a homogeneous group with respect to the section 11 total score, we might expect the scores to be distributed normally with the mean at some central value between say 7 and 11. It could of course be the case that our high-risk category is not homogeneous with respect to the section 11 total score and that the score is strongly correlated with the number of previous convictions. In this latter case we would expect the mean score to increase as the number of previous convictions increased. We will test these propositions against the section 11 data from the OASys pilot.

Because we cannot unequivocally allocate offenders to the high- or low-risk categories simply on the basis of their criminal history we need to use the implications of our theory to try and identify the categories. By looking at the distribution of scores for various subsets of offenders we can confirm or refute the expectations outlined above. The first step is to systematically examine subsets of offenders selected on the basis of conviction number ranges. Figure 6.2 shows the frequency distribution of the section 11 total score for offenders with only one conviction.

The fitted curve is a negative exponential with a mean of 5.7, which is higher than we anticipated for our low-risk category. But, from our dual risk model fit of Figure 6.1, we know that over 50 per cent of offenders with just one conviction are in fact in the high-risk category, perhaps accounting for the small secondary peak at 11. Our dual risk model, for the pilot sample males, also suggests that there are unlikely to be any low-risk offenders with seven or more convictions. Figure 6.3 shows the frequency distribution of section 11 scores for the subset of offenders with more than ten convictions.

Figure 6.2 Histogram of section 11 total score for offenders with only one conviction, overlaid with an exponential distribution curve

The data in Figure 6.3 has a mean score of 9.9 and a standard deviation of 4.6. The data is not significantly different from a normal distribution with the same mean and standard deviation, in line with our expectation. Repeating this analysis for offenders with 15 or more convictions gave a mean of 10.1 and the same standard deviation. Again the data was normally distributed. The distribution of total score for the subset of offenders with more than 14 convictions is clearly not different from the total score distribution for offenders with more than ten convictions. Increasing the subset to include all offenders with seven or more convictions reduces the mean to 9.2 and increases the standard deviation to 4.8. The means of the three nested distributions are consistent with a single normal distribution for high recidivism risk offenders, but the increasing trend in mean total score with increasing conviction count is of concern.

Figure 6.3 Histogram of section 11 total score for offenders with eleven or more convictions, overlaid with the expected normal distribution curve

A subset consisting of offenders with between seven and ten convictions, inclusive, gave a normal distribution of the total score with a standard deviation of 4.8 but with a mean of only 8.0. This lower mean score for the ‘7 to 10’ subset is significantly different (at p = 0.01, two-tailed) from the ‘7 plus’ mean score. However, by overlaying the ‘7 to 10’ histogram, of total score data, with the appropriately scaled normal distribution derived from the ‘11 plus’ data (Figure 6.4), we see that the majority of data points lie within the ±2σ expected variation (assuming a Poisson distribution about the expected counts). The subsets in this comparison have no data in common.

The difference in means could be interpreted as evidence of heterogeneity amongst the high-recidivism risk offenders, revealing perhaps an additional category: the high-risk/low-rate offenders of Chapter 2 or the less serious offenders of Chapter 4. Alternatively, the reduced mean total score could be an artefact of the coarseness of the OASys scoring system. Also, although the constructs measured in section 11 are independent of criminal history, the assessments may not be. For example, it could be that, where there is uncertainty, assessors tend to give lower scores for offenders with fewer convictions, and higher scores for those with more convictions.

Figure 6.4 Histogram of section 11 total score for offenders with 7 to 10 convictions

Note: The overlaid curve is the scaled normal distribution from Figure 6.3 with ±2σ bounds.

From the above it is apparent that the total score is not directly correlated with criminal history, certainly for offenders with more than 10 convictions, but we may have evidence supporting a different psychological profile for a third group of offenders. The evidence is not definitive as there is considerable overlap in the distributions and it is not clear how much of the variance in the scores is due to assessment errors caused by the coarse measurement scales and how much is due to real differences between categories of offenders.

The subset of offenders with fewer than seven convictions will contain individuals from both high- and low-reconviction risk categories. From the dual risk model parameter estimates (Figure 6.1) we expect that 484 of the 1,161 offenders in the ‘0 to 6’ subset would be high-risk. We would also expect that the total section 11 scores for these 484 high-risk offenders would be distributed normally with the mean and standard deviation as estimated from the ‘7 plus’ subset. By subtracting the expected (normally distributed) ‘high-risk’ section 11 score distribution from the distribution of total section 11 scores in this ‘0 to 6’ subset we can get an estimate of the distribution of scores for low-reconviction risk offenders (the low-risk residuals). Figure 6.5 shows a histogram of these residuals with a fitted exponential distribution overlaid on the graph. The mean of the exponential is 4.23, which is lower than the estimate of the mean of 5.7 derived from offenders with only one conviction (see Figure 6.2 above), but consistent with our expectation that low-risk offenders should have lower scores.

Figure 6.5 Histogram of low-risk residuals of section 11 total score for offenders with fewer than 7 convictions

Note: The overlaid curve is the fitted negative exponential distribution with mean 4.23.
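The subtraction step described above can be sketched as follows. The 484 offenders and the ‘7 plus’ mean (9.2) and standard deviation (4.8) are taken from the text. The way the continuous normal distribution is binned onto integer scores (half-unit bins, with the end bins absorbing the tails) is an assumption, as the source does not say how this was done. The function names are hypothetical, and the observed ‘0 to 6’ score counts are not given in the chapter, so none are supplied here.

```python
from statistics import NormalDist

def expected_high_risk_counts(n_high, mean, sd, max_score=20):
    """Expected count at each integer score for n_high normally distributed scores.

    Each score k takes the probability mass between k - 0.5 and k + 0.5; the end
    bins absorb the tails so that the expected counts sum to n_high.
    """
    dist = NormalDist(mu=mean, sigma=sd)
    expected = []
    for k in range(max_score + 1):
        lower = dist.cdf(k - 0.5) if k > 0 else 0.0
        upper = dist.cdf(k + 0.5) if k < max_score else 1.0
        expected.append(n_high * (upper - lower))
    return expected

def low_risk_residuals(observed, expected_high):
    """Observed counts minus expected high-risk counts, score by score."""
    return [obs - exp for obs, exp in zip(observed, expected_high)]

# Values quoted in the text: 484 high-risk offenders, mean 9.2, standard deviation 4.8.
expected = expected_high_risk_counts(484, 9.2, 4.8)
```

Given the observed score counts for the ‘0 to 6’ subset, `low_risk_residuals(observed, expected)` would return the residual histogram to which the negative exponential is fitted.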

In the above analysis we have explored the relationship between the OASys section 11 total score and criminal history as manifested by the number of convictions sustained by the offenders in the OASys pilot study. It has been shown that the distribution of total section 11 scores is consistent with the expectations from our theory and in particular with the dual risk recidivism model derived in Chapter 2. With the aid of the model we have been able to partition section 11 total score frequency data into two subsets corresponding to our high- and low-recidivism risk categories. For the high-risk category total scores were normally distributed with a mean around 10 and for the low-risk category scores were distributed as a negative exponential with a mode of 0 and a mean of around 4.

At this point critics might argue that all we have done is to manipulate the section 11 total score data to fit our theory. If that were the case we would not necessarily expect to be able to identify our recidivism categories simply from the section 11 total scores. Figure 6.6 shows the distribution of numbers of convictions for offenders with a section 11 total score of six or more. The overlaid curve is the high-risk element of the dual risk recidivism model (Figure 6.2), scaled to 80 per cent, with the ±2σ bounds (95 per cent confidence interval), and the high-risk recidivism probability ph = 0.91. The scaling down is necessary because we expect some 20 per cent of high-risk offenders to have total scores of less than six (left hand tail of the normal distribution; see Figure 6.3). The overlaid curve is a very good fit to the data, although not quite the best fit, which would lie just below and parallel to the central line.

Figure 6.6 Recidivism plot for offenders with section 11 total scores of 6 or more

Note: The solid line is the theoretical recidivism plot with p = 0.91, the dotted lines are the ±2σ bounds.
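The figure of some 20 per cent can be checked against the normal distribution reported for Figure 6.3 (mean 9.9, standard deviation 4.6). The sketch below treats the score as continuous and evaluates the left-hand tail at six. This is an assumption: a half-unit continuity correction would give a somewhat smaller share.

```python
from statistics import NormalDist

# Normal distribution fitted to scores of offenders with eleven or more convictions.
high_risk_scores = NormalDist(mu=9.9, sigma=4.6)

# Share of high-risk offenders expected to score below six (left-hand tail).
share_below_six = high_risk_scores.cdf(6)

# Corresponding scale factor applied to the high-risk recidivism curve.
scale_factor = 1 - share_below_six
```

The tail comes out close to one fifth, which matches the scaling of the high-risk curve to 80 per cent.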

Figure 6.7 shows the distribution of numbers of convictions for offenders with a section 11 total score of less than six. The overlaid curve in Figure 6.7 is the best fit dual risk recidivism model with the high-risk parameter set to 0.91. The low-risk parameter estimated in this fitting process was 0.654, almost identical to the 0.65 calculated above and the estimated proportion of high-risk offenders is tolerably close to the 20 per cent not included in Figure 6.6.

The above analysis of the section 11 total score has demonstrated that the theory developed in Chapters 2 and 3 has some basis in a relatively independent measure of individual psychological characteristics. (This finding is similar to that of Blumstein et al (1985) which was discussed more fully in Chapter 1 pp 9–11.) In particular the offender categorization suggested by the dual risk recidivism model can in large measure be identified from the OASys section 11 total score. There is inevitably some overlap between the high and low section 11 total score groups but some 80 per cent of our theoretical high-risk category are identified simply from the section 11 score. Low conviction count offenders with higher scores appear in precisely the numbers predicted by our model. Offenders with low scores account for all of the low-risk offenders predicted by our model and also the correct number of offenders with higher conviction counts predicted by the score distribution for high-risk offenders.

Figure 6.7 Recidivism plot for offenders with section 11 total scores less than 6

Note: The solid line is the theoretical dual risk recidivism plot with a = 0.29, ph = 0.91 and pl = 0.65; the dotted lines are the ±2σ bounds.

Is there Structure in the Section 11 Information in OASys?

So far we have looked only at the total of the section 11 scores for each offender. We noted earlier that the questions naively fall into two distinct categories. Questions 1 to 4 (see Table 6.1) measure characteristics which might lead to criminal behaviour and questions 5 to 10 measure the lack of ability to check this behaviour. This classification can be investigated using a technique called non-metric multi-dimensional scaling (NMDS) (Davies and Coxon 1982). Here we will briefly review the technique and what it tells us.

Non-metric multi-dimensional scaling takes pair-wise information about the dissimilarities of a collection of objects and creates a picture in a multi-dimensional (most usefully, two- or three-dimensional) space which in some sense ‘best’ represents the dissimilarities between the objects. Thus ideally, if the point representing object A is further away in the picture from the point representing object C than it is from the point representing object B, then A is more dissimilar to C than B. That is, the closer the points representing the objects in the picture are, the more similar they are. NMDS makes the fewest possible number of assumptions about the data, and does as little as possible to make its pictures more than a very intuitive representation of the dissimilarities. (A good NMDS package will allow the user to look at the effects of changing even these minimal assumptions.) It can be shown both theoretically and from practical studies that the pair-wise dissimilarity information can recover most of the structure in a dataset, whereas the stronger assumptions of more usual statistical techniques merely force the data into the structure of the technique’s assumptions.

We can carry out an NMDS analysis of the OASys data by taking each section 11 question as an NMDS ‘object’. The dissimilarities between pairs of questions are defined on the basis of the correlations within the entire OASys pilot dataset (male and female) of the scores on those questions; the lower the correlation, the more dissimilar are the questions. One of the features of NMDS is that it does not matter precisely how the correlations are converted to dissimilarities, as long as this is done in a consistent manner, as this in itself conveys no information about the data. It turns out that two dimensions are adequate to represent the main structure in the OASys section 11 pilot data. Figure 6.8 shows the results. The top plot is for the section 11 question scores only; the central plot includes the section 11 total score as an additional NMDS ‘object’; and the bottom plot is for the section 11 scores with the measured, post-assessment, 18-month reconviction results (R18FEB02) as an additional NMDS ‘object’.
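The remark that the precise conversion from correlation to dissimilarity does not matter can be illustrated with a small sketch. Non-metric scaling uses only the rank order of the dissimilarities, so any two strictly decreasing conversions of the correlations lead to the same ordering of question pairs. The correlation values and the two conversion formulas below are hypothetical, chosen only for illustration; they are not taken from the OASys data or from the authors' analysis.

```python
# Hypothetical correlations between four questions (illustrative values only,
# not taken from the OASys pilot data).
corr = {
    ("Q1", "Q2"): 0.30, ("Q1", "Q3"): 0.25, ("Q1", "Q4"): 0.20,
    ("Q2", "Q3"): 0.35, ("Q2", "Q4"): 0.15, ("Q3", "Q4"): 0.60,
}


def rank_order(dissimilarity):
    """Return the question pairs sorted from most similar to most dissimilar."""
    return sorted(corr, key=lambda pair: dissimilarity(corr[pair]))


# Two different conversions; both decrease strictly as the correlation rises.
linear = rank_order(lambda r: 1.0 - r)
root = rank_order(lambda r: (2.0 * (1.0 - r)) ** 0.5)

print(linear == root)         # True: the ordering of pairs is unchanged
print(linear[0], linear[-1])  # most similar pair, most dissimilar pair
```

Because the two orderings are identical, a non-metric scaling procedure would be given the same information by either conversion.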

In interpreting the plots in Figure 6.8, the following features need some explanation. The scales on the plots are unimportant and have been omitted, as it is the relative positions of the plotted points which convey the information. The ellipses on the plots indicate the grouping of the points representing questions 5 to 10 of section 11, suggesting that these questions measure related constructs. Similarly questions 3 and 4 seem closely related to each other but separate from the other measures. The addition of extra NMDS objects changes the relative positions of all the objects but, as can be seen from the plots, the grouping of questions 5 to 10 persists and questions 1 to 4 remain separate from the main group. Adding the total score tends to reduce the dissimilarity of the individual scores and the total score is itself positioned within the 5 to 10 grouping. However, perhaps most interestingly, adding the 18 month reconviction indicator seems to emphasize the ‘5 to 10’ grouping but suggests that reconviction, within the 18 months after the OASys assessment, is not strongly related to any of the section 11 scores.

Figure 6.8 Two dimensional non-metric multi-dimensional scaling (NMDS) plot for the questions in section 11 of OASys

In summary, questions 5–10 do seem to measure essentially the same thing whereas questions 1–4 measure something different from questions 5–10. Also questions 3 and 4 are distinct from questions 1 and 2, which are also distinct from each other. From now on we will describe questions 1–4 as the heterogeneous section 11 questions, and 5–10 as the homogeneous ones.

Homogeneous and Heterogeneous Section 11 Questions

Repeating the above analysis for the total scores of the heterogeneous group of section 11 questions gives the results displayed in Figure 6.9. The left hand graph is made up of subsets of offenders. The bottommost line is the ‘Q1–4 score’ distribution for offenders with more than 14 convictions. The next line is the distribution of scores for offenders with more than 10 convictions, and the area between the lines represents the score distribution for offenders with 11 to 14 convictions. The third line from the bottom is the score distribution for offenders with seven or more convictions and the topmost line is the score distribution for all offenders. The right hand graph shows the plot for offenders with fewer than three convictions; the overlaid line is a negative exponential curve like the one we encountered above in Figures 6.2 and 6.5.

Figure 6.9 Plots of offender count against total section 11 score for questions 1–4 for various subsets of offenders with different conviction counts

In this analysis of the heterogeneous questions, Q1–4, we see evidence of the low-risk group in the right hand graph, which shows the score distribution for offenders with less than three convictions. In the left hand graph we see a similar distribution shape for all the offender subsets. This is consistent with the view that they measure personality attributes that are less amenable to change (see eg Roberts and DelVecchio 2000). It is also interesting to note that the distribution is skewed towards the lower scores which is consistent with the view that the constructs are relatively independent. The peak frequency at 2 and a mean score of only 2.6 suggest that the majority of offenders in the pilot scored badly on only one or two of the questions, with only 2.25 per cent attaining the maximum score.

In a similar analysis of the homogeneous questions, Q5–10, we see a different picture. The results of that analysis are shown in Figure 6.10. The ‘all convictions’ score distribution is bimodal, suggesting a mixture of two offender types. The right hand graph, showing the distribution of scores for offenders with a Q5–10 score less than 7, again provides evidence of the low-risk category offenders. However, for offenders with higher numbers of convictions (7+), there appears to be a level area in the distribution for scores less than 4, followed by a steep rise to the peak at around 6 and a shallower decline to the maximum score of 12.

Figure 6.10 Plots of offender count against total section 11 score for questions 5–10 for various subsets of offenders with different conviction counts

Questions 5–10 of section 11 measure cognitive-behavioural skills and it is believed that these can be improved by appropriate programmes, leading in turn to reduced recidivism. The scope for overall crime reduction is however quite small, in part because only the high-risk offenders score highly on these questions and there are relatively few of them, but also because even the most successful of these programmes are unlikely to reduce this element of the score to zero.

Conclusions from the OASys Pilot Data Analysis

Using conviction number information from the Police National Computer, we have shown in Figure 6.1 that the OASys pilot data displays the dual risk category structure that we previously identified in the Offenders Index data in Chapter 2. Our analysis of the section 11 total score from the OASys questionnaire also indicates that there are two groups of offenders: one group with Normally distributed (mean ≈ 10, s = 4.8) total section 11 score and one group with lower scores (exponentially distributed, mean = 4.23). By dichotomizing the data, on the basis of section 11 total score, we have also shown that offenders with high (≥ 6) OASys section 11 total score account for 80 per cent of the high-risk category offenders identified in Figure 6.1. As illustrated in Figure 6.6 these high-scoring offenders also exhibit the same recidivism probability (0.91). The offenders with low OASys section 11 total scores (< 6) are predominantly (70 per cent) made up of offenders with the recidivism properties (see Figure 6.7) of the low-risk category identified in Figure 6.1. The number of offenders in the high-risk component of Figure 6.7 corresponds, almost exactly, to the number in the left hand tail (total section 11 score < 6) of the fitted Normal distribution of high-risk offender scores (see Figure 6.3). We have also shown that there may be evidence for the existence of a group of moderate scoring offenders not identified by the dichotomy but perhaps accounting for some of the high-risk element in the offender group with score < 6. We thus have strong evidence that our high- and low-risk categories actually exist as distinct sub-populations, with different psychological characteristics. The low-risk category offenders display relatively normal psychology, whereas the high-risk category offenders display problems in areas of self-control, empathy and awareness.

It is important to stress here that our risk categories are inferences from the statistical structure of the offender population. As a result, it is difficult to allocate many individual offenders unequivocally to one or other of the risk categories, especially offenders with low conviction counts. The total section 11 score dichotomy, on the other hand, allocates individuals to the high- and low-risk categories on the basis of characteristics independent of criminal history, and does so with much greater certainty if conviction count is used as an additional discriminant.

By means of an NMDS analysis, we split the section 11 questions into two sets, one set measuring underlying predispositions to criminal activity and the other measuring cognitive behavioural skills which might control criminal activity. The two score distribution groups are evident in both sets of questions. The higher scoring (normally distributed) group has a high-recidivism probability and the lower scoring (exponentially distributed) group has a lower recidivism probability. The distribution of scores measuring cognitive behavioural skills suggests a link between the lack of these skills and a high-recidivism probability. There is some evidence that these skills can be improved by treatment programmes leading to small reductions in recidivism.

We can also create a picture of a typical high-recidivism risk offender as someone who has several of the characteristics of being impulsive, aggressive, or having difficulty controlling his or her temper. In general they will also have difficulty in controlling the behaviour generated by these features of their personality because of their poor problem-solving skills and failure to consider the consequences of their actions.

A puzzle remains, however. The low-risk offenders generally score less than 6 and typically 0 to 2 on the OASys section 11 total score, indicating that their underlying impulsivity, aggression, and cognitive behavioural skills differ little from the general population, at least as compared with the high-risk category offenders. In Chapter 4 we saw some suggestion that very serious offences are disproportionately committed by the low-risk category offenders, possibly explaining the lower reconviction probabilities for those serving very long prison sentences. This suggests that, while high-risk offenders are impulsive and have difficulty controlling themselves, low-risk offenders might be more calculating.

Analysis of Operational OASys Data

Following the successful pilot study, OASys was rolled out to the prison and probation services in England and Wales. From March 2000 a computerized database was established and an anonymized copy of the assessments up to March 2005, containing over 400,000 OASys assessments on 154,000 offenders, was made available to the authors. In addition a subset of PNC records for offenders convicted during April 2004 was also made available for analysis. Again the records were anonymized but could be linked to the OASys data.

The data of interest in the OASys dataset were the scores for individual questions in sections 7, 11, and 12 (see Table 6.1), and also the criminal history data in the form of the number of convictions sustained up to the latest assessment. Many of the offenders (53 per cent) in the OASys dataset have multiple assessments, potentially by several different assessors. However, despite the possibility of inconsistency in the scoring, 62 per cent of the multiple assessments were wholly consistent across the 21 questions in sections 7, 11, and 12. Also, for individual questions, from 85 per cent to 96 per cent of the scores did not change between assessments. In view of the coarse nature of the scoring, this level of consistency is reassuring. In cases where the score did change the average was used in the following analysis.

As with the pilot sample above, the first step in the analysis was to establish that the criminal career data, this time for both male and female offenders, included in the operational OASys data conforms to the recidivism patterns identified in Chapter 2. The results of the maximum likelihood fit of the model are given in Table 6.3 and Figure 6.11. The fit accounted for over 99.7 per cent of the variance in the data for both male and female subsets.

As with the pilot data, offenders in the OASys operational database will in general have either committed more serious offences or simply more offences than offenders in general and the explanation of the higher parameter values for these subsets is the same as outlined above for the pilot. Males and females have similar recidivism probabilities but, like the cohorts and OI sentencing samples, the proportion of high-risk offenders is very much smaller for females than for males. Most importantly however, the structure of the data is consistent with the analysis of Chapter 2.

Table 6.3 Dual-risk recidivism model parameter estimates for offenders in the operational OASys database

Subset        a      ph     pl     Cohort equivalent number of offenders N
Male          0.65   0.90   0.48   17429
Female        0.33   0.89   0.51   4440
Pilot male    0.52   0.91   0.65   192

Note: a = proportion high-risk, ph = high reconviction probability, pl = low reconviction probability. The cohort equivalent number of offenders is the number of first convictions in the data.

The next step in the analysis is to explore the sections 7, 11, and 12 data for internal statistical structure. In analysing the pilot data, we saw that section 11 question scores exhibited structures which were related to the recidivism characteristics of the offenders in the pilot. We now extend that analysis by including sections 7 and 12, ‘lifestyle and associates’ and ‘attitudes’ respectively. The analysis tool used in what follows is Principal Component Analysis (PCA). We start as before, in the NMDS analysis, with a correlation matrix, this time for the 21 individual questions in the three sections. Principal components are the combinations of questions which best describe a feature of the data (ie an underlying construct) independently of the other components. By way of simple explanation: if we draw a line on a map we can describe it using a start position on the map grid followed by distances north and east to get to the end point, and moving the line will change all of the coordinates. The map grid initially has the y axis north–south and the x axis east–west. However if we rotate the grid so that the x axis lies along the line we only need one component (x) to describe the length of the line, and moving the line in the y direction does not change our description of the length. The new x direction is then the principal component of the line. This idea can be applied to the 21 measures contained in sections 7, 11, and 12 of the OASys questionnaire.

Figure 6.11 Dual risk recidivism model fit to OASys operational data

To find the principal components of a set of data we first need to compute the eigenvectors and eigenvalues of the correlation matrix of the data. Eigenvectors all have length one and are independent of (orthogonal to) all others. Each eigenvector has an associated eigenvalue which is the relative contribution that the eigenvector makes in the overall description of the data set. Large eigenvalues indicate which eigenvectors are important. These eigenvectors are the principal components of the data set and, if used to create single measures, show the greatest variation between individuals. Conventionally, eigenvalues greater than one are taken to indicate principal components. Figure 6.12 is a scree plot of the eigenvalues which shows that the first four (largest) are all greater than one but that the first factor is significantly larger and its corresponding eigenvector accounts for almost 35 per cent of the variation in the data.
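The eigenvalue step described above can be sketched in a few lines. The example below is only an illustration: the 4 × 4 correlation matrix is hypothetical (the real analysis used 21 questions), and the cyclic Jacobi routine `jacobi_eigen` is a generic textbook method introduced here for self-containment, not the software the authors used. It returns unit-length, mutually orthogonal eigenvectors, applies the conventional "eigenvalue greater than one" rule, and reports the share of variation carried by the first component.

```python
import math


def jacobi_eigen(matrix, sweeps=50, tol=1e-18):
    """Eigenvalues and unit eigenvectors of a symmetric matrix (cyclic Jacobi)."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes the (p, q) element.
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q of A and V
                    a[k][p], a[k][q] = (c * a[k][p] - s * a[k][q],
                                        s * a[k][p] + c * a[k][q])
                    v[k][p], v[k][q] = (c * v[k][p] - s * v[k][q],
                                        s * v[k][p] + c * v[k][q])
                for k in range(n):  # update rows p and q of A
                    a[p][k], a[q][k] = (c * a[p][k] - s * a[q][k],
                                        s * a[p][k] + c * a[q][k])
    pairs = [(a[i][i], [v[k][i] for k in range(n)]) for i in range(n)]
    pairs.sort(key=lambda pair: pair[0], reverse=True)
    return pairs


# Hypothetical correlation matrix for four questions (illustrative only).
corr = [
    [1.0, 0.6, 0.5, 0.1],
    [0.6, 1.0, 0.4, 0.1],
    [0.5, 0.4, 1.0, 0.2],
    [0.1, 0.1, 0.2, 1.0],
]

components = jacobi_eigen(corr)
eigenvalues = [value for value, _ in components]
retained = [value for value in eigenvalues if value > 1.0]  # eigenvalue > 1 rule
share = eigenvalues[0] / sum(eigenvalues)  # variation explained by component 1

print([round(value, 3) for value in eigenvalues])
print(len(retained), round(share, 3))
```

The eigenvalues of a correlation matrix sum to the number of variables, which is why an eigenvalue above one marks a component that carries more than a single question's worth of variation.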

Figure 6.12 Scree plot of eigenvalues of the sections 7, 11, and 12 correlation matrix

Our task now is to explore whether we can identify our criminal categories from the principal components. As we did above, for the total section 11 score in the analysis of the pilot data, we need to choose a dichotomy point for the Factor 1 score. Our aim is to identify two groups of offenders, one with a high Factor 1 score and the other with a low Factor 1 score, and to test whether these groups correspond to our risk categories. Using what is essentially a trial and error procedure a dichotomy point of 0.9 was chosen with the results shown in Figure 6.13. The circles on the graph represent the number of offenders with a Factor 1 score less than 0.9 and conviction counts as indicated on the x axis. The plusses on the graph represent offenders with a Factor 1 score greater than or equal to 0.9. The solid curves are the best fit recidivism models to the two subsets of the dichotomized data. The upper curve (Factor 1 score < 0.9) is characteristic of a single group of 108,821 offenders with recidivism probability p = 0.905 (high-risk category) and the lower curve is characteristic of two groups, one of 13,675 offenders with p = 0.870 (high-risk category) and one of 16,185 offenders with p = 0.496 (low-risk category).

Figure 6.13 Recidivism plot for OASys operational dataset (all offenders) with dichotomy point of Factor 1 score 0.9

The recidivism probability estimates from the operational OASys data and the dichotomized subsets are given in Table 6.4. At this point we must remind the reader that the OASys data is cross-sectional but the model is based on longitudinal cohort data. In the OASys data each offender appears only once whereas in the cohort data an individual may appear several times, once for each conviction during the career. The two types of data-set are equivalent if the process is stable over time and both demographics and criminality proportions are relatively constant. In Chapter 2 we showed that these were reasonable assumptions for the Offenders Index data and we now assume that they hold for the current analysis.

Table 6.4 Dual risk recidivism model parameters for the OASys operational data, all offenders and the dichotomized subsets

OASys subsets           a      ph      pl      Cohort equivalent      Number of offenders
                                               number of offenders    High      Low
All Offenders           0.58   0.902   0.477   20215                  119639    16234
section 11 total ≥ 6    1.00   0.910   -       7011                   77900     -
section 11 total < 6    0.26   0.89    0.522   13204                  31209     20441
Factor 1 < 0.9          1.00   0.905   -       10338                  108821    -
Factor 1 ≥ 0.9          0.18   0.870   0.496   9877                   13675     16185

Note: a = proportion high-risk, ph = high reconviction probability, pl = low reconviction probability.

For cross-sections the cohort equivalent number of offenders is given by the number of first convictions in the data. In Table 6.4 the first row, ‘All offenders’, gives the dual risk recidivism model parameters, the actual number of first convictions in the OASys operational data set, and the modelled number of high- and low-risk offenders. The modelled total is 135,873 which is within 2 per cent of the 138,615 offenders in the OASys data set. The rows below repeat these estimates for the subsets created by both the ‘Section 11 Total Score’ dichotomy and the ‘PCA Factor 1’ dichotomy.
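The agreement between the modelled and observed totals quoted above can be checked directly from the 'All offenders' row of Table 6.4. The short sketch below only repeats that arithmetic; the variable names are ours.

```python
# Figures from the 'All offenders' row of Table 6.4 and the paragraph above.
modelled_high = 119_639
modelled_low = 16_234
offenders_in_dataset = 138_615

modelled_total = modelled_high + modelled_low
shortfall = (offenders_in_dataset - modelled_total) / offenders_in_dataset

print(modelled_total)      # 135873
print(f"{shortfall:.1%}")  # 2.0%
```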

Including more of the principal component factors in the discriminant function provided no improvement in the separation of offenders into high- and low-risk categories. However, from our theory, we know that the low-risk offenders are very unlikely to have conviction counts greater than six. With this additional criminal history information an extra 5,000 offenders with a Factor 1 score ≥0.9 could be identified as high-risk.
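A rough sketch shows why a conviction count above six is such a strong indicator. It assumes, as a simplification of the theory's constant reconviction probability, that each conviction is followed by a further one with probability p, so that the chance of exceeding n convictions is p to the power n. It also treats the 'All offenders' proportion a from Table 6.4 as a prior, which is our assumption for illustration and ignores the cross-sectional sampling issues discussed above.

```python
# Parameters from the 'All offenders' row of Table 6.4.
a, p_high, p_low = 0.58, 0.902, 0.477


def prob_more_than(n, p):
    """P(conviction count > n) if each conviction is followed by another
    with constant probability p (assumed geometric form)."""
    return p ** n


low_tail = prob_more_than(6, p_low)    # low-risk offender exceeding six
high_tail = prob_more_than(6, p_high)  # high-risk offender exceeding six

# Bayes' rule: probability that an offender with more than six convictions
# belongs to the high-risk category, under the assumptions stated above.
posterior_high = a * high_tail / (a * high_tail + (1 - a) * low_tail)

print(round(low_tail, 3), round(high_tail, 3), round(posterior_high, 3))
```

Under these assumptions only about one low-risk offender in a hundred would accumulate more than six convictions, so nearly all offenders beyond that count would be high-risk.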

Reducing the number of section 7, 11, and 12 raw scores included in the principal component analysis progressively reduces the discriminating power of the principal component. However, several subsets of raw scores produced similar results: in particular section 11 scores only, section 7 and section 12 scores together, and the combinations of section 11 scores identified in Figure 6.8 all identified the two risk categories in the dual risk recidivism model. It is clear that all of the constructs measured are related to criminality and that they all make an independent contribution. None of the scores was found to be redundant, but at the same time much of the information relating to criminality is contained in most of the scores to the extent that disjoint subsets of scores provided similar discriminating power. The best discrimination was found when all the section 7, 11, and 12 scores were included in the principal component analysis.

Analysis of April 2004 PNC Conviction Data

The operational OASys data analysed above only contained conviction number information and no inter-conviction times. Only the risk element of our theory could be explored in relation to psychological and behavioural constructs. Also no information is available in that dataset on convicted offenders who have not been assessed using OASys. To remedy this deficiency an anonymized extract of PNC records for all offenders convicted of standard list offences during April 2004 was made available to the authors for analysis. The extract was drawn late in 2005 providing reconviction times up to sixteen months and also criminal history information that enabled inter-conviction times to be calculated. Sufficient information was provided to enable linkage with the OASys data analysed above.

In total 16,164 individuals were convicted of at least one offence during April 2004. Of these 14,340 were adult offenders, over 18 years of age at their conviction, who were thus eligible for assessment using OASys. However only 4,833 offenders, those sent to prison or put under the supervision of the probation service, were actually assessed and their records linked to the OASys data. The information available from the PNC extract included the offender’s date of birth, gender and the dates of all the individual’s convictions. An offender’s target conviction (court appearance) was taken as their earliest conviction in April 2004. From this the appearance number, time from the previous conviction and time to the next conviction were calculated. The previous conviction times were used to estimate the distribution of inter-conviction times for the various subsets of the April 2004 data.

Repeating the analysis of Chapter 2 on the various subsets of the April 2004 data produces the risk and rate model parameters given in Table 6.5. The risk parameters derived from the OASys operational data are included in Table 6.5 for comparison. From Table 6.5 we can see that the high-risk probability is very consistent across subsets. Its value at about 0.9 is higher than that estimated from the cohorts but consistent with our expectation from cross-sectional data from the PNC. The 1997 Sentencing sample from the OI gave a value of 0.88, and we also expect PNC data to yield higher estimates.

The low-risk reconviction probability estimate for the whole April 2004 sample is again a little higher than the Chapter 2 estimates but not inconsistent with them given the different data source. The estimates of the λs are broadly consistent between subsets but are generally higher than the cohort estimates from Chapter 2. The difference could either be the result of speeding up the reconviction process in recent years, which has certainly been a policy priority, or simply due to the change in the data source. The increase in proportion of high-risk offenders amongst adults compared with all offenders can be explained by the omission of juveniles which resulted in a significant reduction in the numbers of first and second convictions and smaller reductions in higher conviction counts. The proportion of high-rate adult offenders is consistent with the whole sample value. Our theory suggests that average inter-conviction times, within categories, are consistent throughout the criminal career and not dependent on age.

Assessed adult offenders present quite different characteristics. They all appear to be high-risk, and also appear to reoffend at higher rates with an increased proportion in the high-rate group. We suggest that, in large part, these apparent inconsistencies are due to selection effects for the assessed offender subset. All will have been convicted of relatively serious offences resulting in either custodial or supervisory sentences. Many will also be prolific offenders and may have received this particular sentence because their previous convictions were considered as aggravating factors in the sentencing decision. In addition there are no juveniles in the assessed offender subset, with a consequential reduction of early convictions in the data. Figure 6.14 shows the frequency distribution of conviction count (recidivism plot) for assessed offenders from the April 2004 sample, together with the plus and minus two sigma bounds, assuming a Poisson distribution about the expected count at each conviction number. It can be seen that for conviction numbers less than 8 the data falls on or below the lower bound, supporting our contention of selection effects.

Table 6.5 Model parameter estimates from April 2004 PNC extract and OASys data

Subset           a      ph      pl      B      λh     λl     Subset size
2004 all         .40    0.904   0.357   0.65   1.23   0.65   16,164
2004 adults      .49    0.906   0.299   0.63   1.17   0.63   14,340
2004 assessed    1      0.906   -       0.68   1.57   0.24   4,833
OASys            .58    0.902   0.477   -      -      -      141,219

Notes: a = proportion high-risk; ph = high-risk reconviction probability, pl = low-risk reconviction probability; B = proportion high-rate, λh = high reconviction rate; λl = low reconviction rate.

Although our analysis of OASys operational data suggests that there should be a significant proportion of low-risk offenders among the April 2004 assessed offenders, there is, apparently, no evidence of them in Figure 6.14. However, by applying the Factor 1 dichotomy to the April 2004 assessed offender subset we obtain a recidivism plot very similar to Figure 6.13. In Table 6.4 we estimated the numbers of offenders in each of the risk categories in the Factor 1 dichotomized OASys data. Repeating those calculations for the assessed offenders in the April 2004 PNC data gives the results tabulated in Table 6.6, with the OASys results repeated for comparison.

Figure 6.14 Recidivism plot for assessed offenders in the April 2004 PNC data extract

Table 6.6 Estimated numbers of offenders in the risk categories for the Factor 1 dichotomized data for the assessed April 2004 and the OASys datasets

Data set                      Factor 1 < 0.9    Factor 1 ≥ 0.9
                              High-risk         High-risk    Low-risk
OASys operational data        109234            13638        16076
                              78.6%             9.8%         11.6%
April 2004 assessed subset    3122              525          170
                              81.8%             13.7%        4.5%

At first inspection, the proportions in each of the categories seem inconsistent between the two sets of data. However the OASys data includes individual offenders convicted during a five-year period, whereas in the April 2004 data the period is only one month. High-risk individuals will be over-represented in short sampling periods because the probability of their conviction in the period is high compared with low-risk offenders. As the sampling period is increased the high-risk offenders may have several convictions but will only be counted once and low-risk offenders have an increasing probability of being included. It is therefore not surprising that the proportions in the risk categories are different.

Using the time from the previous conviction, the λs were estimated for the April 2004 assessed offenders and these values were used in the calculations which are described below. The theory and models developed in Chapter 2 allow us to estimate the expected number of offenders who will be reconvicted in a given period, in the set of April 2004 assessed offenders. To do the calculation we need estimates of the reconviction probabilities (ph and pl) for the high- and low-risk categories. For these we use the estimates derived from the OASys operational data analysis as this is the largest data set available. For estimates of the risk proportions (a and 1−a) we use the Factor 1 (0.9) dichotomy data. For the rate proportions (b and 1−b) and the λs (λh and λl) of the high- and low-rate categories, we use the values derived from the April 2004 assessed offender subset. For each of the risk/rate categories the proportion (r) reconvicted within time t is given by:

$$r = p\left(1 - e^{-\lambda t}\right) \tag{6.1}$$
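As a worked illustration of Equation 6.1, which the column heading p*(1−exp(−λ*1.25)) in Table 6.7 spells out in full, the sketch below evaluates the proportion reconvicted within 15 months for the two high-rate categories, using the rounded parameter values printed in that table. The function name is ours. The low-rate rows are not reproduced here because the rounded λ = 0.24 does not recover the tabulated proportions exactly.

```python
import math


def proportion_reconvicted(p, lam, t):
    """Equation 6.1: r = p * (1 - exp(-lam * t)), with t in years."""
    return p * (1.0 - math.exp(-lam * t))


t = 1.25  # fifteen months

# High-rate categories from Table 6.7 (rounded parameters as printed there).
r_low_factor = proportion_reconvicted(0.90, 1.57, t)   # Factor 1 score < 0.9
r_high_factor = proportion_reconvicted(0.87, 1.57, t)  # Factor 1 score >= 0.9

print(round(r_low_factor, 3), round(r_high_factor, 3))       # 0.774 0.748
print(round(2862 * r_low_factor), round(195 * r_high_factor))  # 2214 146
```

Multiplying each proportion by the number of offenders in the category gives the estimated number reconvicted, matching the corresponding entries in Table 6.7.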

In the April 2004 data we have reconviction information for up to 16 months after the target conviction but have censored that data at 15 months (1.25 years). If we had assumed that our initial analysis of the April 2004 assessed offender data (Figure 6.14) was correct, then our estimate of 15-month reconvictions would have been 3,023, some 12.75 per cent above the actual number reconvicted of 2,681.

However, if we apply the Factor 1 (0.90) dichotomy to the assessed offender data and, using Equation 6.1, calculate the number of reconvictions for each of the theoretical categories so defined, then we obtain the results given in Table 6.7. These predictions are a good prospective test of both the basic model and the link between our risk/rate categories and the psychological characteristics of offenders. Our overall prediction is now within 1 per cent of the actual value and for the subset of offenders with Factor 1 scores less than 0.9 the number reconvicted is within 1.25 per cent of the predicted value. Assuming that reconvictions occur as a Poisson process, these figures are well within one standard deviation (1.9 per cent) of the estimate.
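The comparison with a Poisson standard deviation can be reproduced from the totals in Table 6.7. For a Poisson count the standard deviation is the square root of the mean, so the relative standard deviation of an estimate N is 1/√N. The sketch below is our own restatement of that check, not the authors' calculation.

```python
import math

# Totals from Table 6.7.
estimate_all, actual_all = 2708, 2681
# Factor 1 score < 0.9 subset: 2214 + 277 estimated, 2461 actual.
estimate_sub, actual_sub = 2214 + 277, 2461

relative_sd = 1.0 / math.sqrt(estimate_all)  # Poisson: sd = sqrt(mean)
overall_error = abs(estimate_all - actual_all) / estimate_all
subset_error = abs(estimate_sub - actual_sub) / estimate_sub

print(f"{relative_sd:.1%}")    # 1.9%
print(f"{overall_error:.1%}")  # 1.0%
print(f"{subset_error:.1%}")   # 1.2%
```

Both discrepancies are smaller than the 1.9 per cent relative standard deviation, which is the sense in which the predictions lie within one standard deviation of the estimate.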

The success of these predictions adds considerable support to our theory, particularly as we have relied on the OASys assessments to identify our low-risk offenders, who were otherwise hidden amongst a group of what appeared to be exclusively high-risk offenders.

Table 6.7 Fifteen-month reconviction prediction results

| Dichotomy | a or (1−a) | p | Total offenders in risk/rate category | λ | b or (1−b) | p*(1−exp(−λ*1.25)) | Number reconvicted: Estimate | Number reconvicted: Actual |
|---|---|---|---|---|---|---|---|---|
| Factor 1 score < 0.9 | 1 | 0.90 | 2862 | 1.57 | 0.71 | 0.774 | 2214 | 2461 |
| | | 0.90 | 1169 | 0.24 | 0.29 | 0.237 | 277 | |
| Factor 1 score ≥ 0.9 | 0.38 | 0.87 | 195 | 1.57 | 0.36 | 0.748 | 146 | 220 |
| | | 0.87 | 346 | 0.24 | 0.64 | 0.126 | 44 | |
| | 0.62 | 0.48 | 221 | 0.24 | 1.00 | 0.126 | 28 | |
| Estimate | | | 4792 | | | | 2708 | |
| Actual | | | 4833 | | | | | 2681 |
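The estimates in Table 6.7 can be re-derived row by row as the number of offenders in a category multiplied by the tabulated proportion reconvicted. The sketch below is illustrative only: the names `ROWS`, `ACTUAL`, and `estimates_by_group` are hypothetical, and because it works from the rounded figures printed in the table, its sums can differ from the published estimates by a unit or two.

```python
# Each row: (dichotomy group, total offenders in category,
#            tabulated p*(1 - exp(-lambda*1.25)) from Table 6.7)
ROWS = [
    ("<0.9", 2862, 0.774),
    ("<0.9", 1169, 0.237),
    (">=0.9", 195, 0.748),
    (">=0.9", 346, 0.126),
    (">=0.9", 221, 0.126),
]

# Actual 15-month reconvictions per dichotomy group, from Table 6.7.
ACTUAL = {"<0.9": 2461, ">=0.9": 220}


def estimates_by_group(rows):
    """Sum the expected reconvictions (total * proportion) within each group."""
    totals = {}
    for group, n_offenders, proportion in rows:
        totals[group] = totals.get(group, 0.0) + n_offenders * proportion
    return totals


estimates = estimates_by_group(ROWS)
overall_estimate = sum(estimates.values())
overall_actual = sum(ACTUAL.values())

for group, estimate in estimates.items():
    print(f"Factor 1 {group}: estimate {estimate:.0f}, actual {ACTUAL[group]}")
print(f"Overall: estimate {overall_estimate:.0f}, actual {overall_actual}")
```

Keeping the comparison at the level of the dichotomy groups mirrors the table, which reports actual reconvictions per group rather than per risk/rate category.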

Conclusions

The principal component analysis (PCA) of the operational OASys data has confirmed the pilot study findings with regard to the link between psychological characteristics (associated with self-control, empathy, and awareness) and our high- and low-risk offender categories. Considering the small size of the pilot sample, the parameter estimates were remarkably consistent with those derived from the operational data. In the pilot, the OASys section 11 total score provided a satisfactory discriminator, enabling the majority of high-recidivism-risk offenders to be identified. However, the more sophisticated principal component analysis on an expanded set of OASys questions (sections 7, 11, and 12, including measures of Lifestyle and Associates, and Attitudes) provided a significant improvement in identifying high-risk offenders.

The Factor 1 (0.9) dichotomy allocated 22 per cent more of the offenders in the operational OASys database to the high-risk group than would have been allocated using the ‘Total section 11 score ≥ 6’ dichotomy. The improved discrimination, however, owes more to the PCA technique than to the increased number of questions included in the analysis. Selecting disjoint subsets of the OASys questions for use in the PCA only marginally reduced the discrimination, and sections 7 and 12 combined performed as well as section 11 on its own. None of the OASys questions was identified as redundant in the analysis, but it would seem that the information contained in the principal component, Factor 1, is spread across most of the questions and can, in large part, be extracted from subsets of them.

Using our theory and the Factor 1 dichotomy, applied to a subset of a sample of offenders convicted in April 2004, we produced prospective predictions of reconvictions during a 15-month period from the target conviction. The overall number of actual reconvictions was within 1 per cent of the predicted number. This result provides convincing support for both the basic theory and the link between psychological characteristics and our high- and low-risk offender categories.