Road Safety Manual
A manual for practitioners and decision makers
on implementing safe system infrastructure!

5.3 Establishing and Maintaining Crash Data Systems

This section provides guidance on the establishment and maintenance of crash data systems. Information on the collection and use of other sources of data can be found in the following sections. Full details on the establishment and maintenance of crash data systems can be found in WHO (2010). The following is a summary of key issues.

Sources of Crash Data

Identified effective practice acknowledges that no single crash injury database provides enough information to give a complete picture of road traffic injuries or to fully understand the underlying injury mechanisms (IRTAD, 2011). A number of countries that have improved their road safety performance use both crash injury data collected by the police and health sector data.

National crash data are typically collected by police (WHO (2013) reports that over 70% of countries use police data as the primary source), and entered into crash database systems for easy analysis and annual reporting. In some circumstances, data are collected from hospitals, or from both of these sources. The use of health sector data for meaningful injury classification at country level is necessary to complement police data and to provide an optimal means of defining serious injury. IRTAD (2011) recommends that police data should remain the primary source of road crash statistics, but that this should be complemented by hospital data due to data quality issues and to identify levels of under-reporting (see Section 5.4). Furthermore, in-depth data is needed from crash injury research to lead to meaningful conclusions concerning crash and injury causation.

National police-reported crash data

The police are well placed to collect information on crashes as they are often called to the scene. Alternatively, they may receive information about the crash following the event. Attendance at the crash scene allows for collection of detailed information that is useful for identifying crash causes and possible solutions.

A crash report form is typically completed (traditionally a paper-based form, although recently computer-based systems have been used), allowing collection of quite detailed information on the crash. Key variables typically collected include:

  • crash location (including geographic coordinates);
  • time of day, day of week, month of year, year;
  • information on those who were involved (including road user type, age, gender, injury sustained);
  • details regarding the road (whether at an intersection, speed limit, curvature, traffic control, markings);
  • details on the environment (light conditions, weather, road surface wet or dry);
  • information regarding what happened in the crash (vehicle movement types, objects struck (including off-road), and contributory factors such as speed, alcohol use or driver distraction);
  • vehicle factors (type of vehicles involved).
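
As an illustration only, the variables listed above could be held in a simple record structure such as the one sketched below. The class names, field names and example values are hypothetical and are not taken from WHO (2010) or any national crash report form.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class PersonInvolved:
    road_user_type: str        # e.g. "driver", "pedestrian", "cyclist"
    age: Optional[int]         # None when not recorded
    gender: Optional[str]
    injury: str                # e.g. "fatal", "serious", "slight", "none"


@dataclass
class CrashRecord:
    latitude: float            # geographic coordinates of the crash
    longitude: float
    occurred_at: datetime      # yields time of day, day of week, month, year
    at_intersection: bool
    speed_limit_kmh: Optional[int]
    light_conditions: str
    weather: str
    road_surface_wet: bool
    movement_type: str         # what happened, e.g. "ran off road"
    contributory_factors: List[str] = field(default_factory=list)
    vehicle_types: List[str] = field(default_factory=list)
    people: List[PersonInvolved] = field(default_factory=list)


# Made-up example record, for illustration only.
record = CrashRecord(
    latitude=12.5, longitude=101.0,
    occurred_at=datetime(2013, 5, 17, 22, 40),
    at_intersection=False, speed_limit_kmh=80,
    light_conditions="dark", weather="rain", road_surface_wet=True,
    movement_type="ran off road",
    contributory_factors=["speed"],
    vehicle_types=["car"],
    people=[PersonInvolved("driver", 34, "male", "serious")],
)
print(record.occurred_at.strftime("%A %H:%M"), len(record.people))
```

Storing a single timestamp rather than separate time, day and month fields is one possible design choice; it keeps these values consistent with each other.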

Examples of crash report forms, including the types of detail that should be collected as a minimum, can be found in WHO (2010). The advice provided in the WHO document is based on the European Common Accident Dataset (CADaS). In addition, a number of countries have developed their own minimum criteria. For example, the US has a Model Minimum Uniform Crash Criteria (further information is available on a dedicated website).

A balance needs to be reached between collecting the required information, and the time it takes to perform this task. If too much burden is placed on the police, it is less likely that the crash report form will be completed. Police are key stakeholders in the establishment and continued collection and use of crash data, and should be included at each stage of the process.

Hospital data

Hospital data is used to identify levels of under-reporting or to obtain better injury information, particularly when police report data is not available or is inadequate. IRTAD (2011) suggests that because of under-reporting of crash data, hospital data should also be collected, and is the next most useful source of information for crash statistics.

Encouraged by the WHO and other institutions, medical authorities have established international recording systems that include road traffic injury. In particular, the International Classification of Diseases and related Health Problems (ICD) and the Abbreviated Injury Scale (AIS) coding systems are used widely. IRTAD (2011) recommends that an internationally agreed definition of ‘serious’ injury be developed, and that the Maximum Abbreviated Injury Scale (MAIS) be used as the basis for defining crash injury severity. This scale is based on the maximum injury severity for any of nine parts of the body. A score of 3 or greater for one or more regions of the body (MAIS3+) is recommended as the point at which an injury is considered to be serious.
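
The MAIS3+ rule can be expressed in a few lines. The sketch below assumes that AIS scores are already available per body region as integers, with 0 meaning no injury; the function names and body region labels are hypothetical.

```python
from typing import Dict

SERIOUS_THRESHOLD = 3  # MAIS3+ threshold recommended by IRTAD (2011)


def mais(ais_by_region: Dict[str, int]) -> int:
    """Return the maximum AIS score over all recorded body regions."""
    return max(ais_by_region.values(), default=0)


def is_serious(ais_by_region: Dict[str, int]) -> bool:
    """An injury is treated as serious when MAIS is 3 or greater."""
    return mais(ais_by_region) >= SERIOUS_THRESHOLD


# Made-up casualty: the thorax injury determines the MAIS.
casualty = {"head": 2, "thorax": 3, "lower_extremity": 1}
print(mais(casualty), is_serious(casualty))
```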

There are examples where enhanced data has been collected in hospitals thereby allowing more detailed analysis, as in the case study from Thailand below (Box 5.3).

Box 5.3 : Case study – GIS spatial analysis of hospital data, Khon Kaen province, Thailand

The problem: Although information on crashes existed from other sources, it was not sufficient to adequately manage safety in Khon Kaen province, Thailand. An alternative source of information existed from hospital data, but this did not provide information on crash locations.

The solution: Khon Kaen Regional Hospital in the northeast of Thailand has had its own Trauma Registry since 1989. Additional information on road crashes has been added to this database through the use of an ‘Accident Report’ form. This form includes information on the circumstances of the crash, factors that may have influenced the occurrence of the crash, and descriptive information regarding the location of the crash. The development of the GIS spatial analysis of hospital data involved revising the accident report form and the injury surveillance forms. The accident report form was modified to exclude information on property damage, as this required site inspections, and the injury surveillance form was modified to reflect all injured person information (e.g. the use of seatbelts). In addition to basic crash data, all crash locations were spatially coded using a Geographical Information System (GIS). A link recognition technique on GIS was developed to obtain road geometry properties. GIS mapping allowed for the visualisation of crash location features, e.g. intersections, level crossings and roundabouts.

The outcome: The addition of this information has allowed detailed analysis of crash data, including the identification of high risk locations. A number of detailed studies on specific road safety issues have been undertaken over recent years based on this rich source of information.

Further details can be found in Ruengsorn et al., (2001).


A further example is provided in Box 5.4 from Egypt, demonstrating the integration of data from different sources, as well as the use of this data by various key stakeholders.

Box 5.4 : Case study – EU twinning project on enhancing road safety in Egypt

The problem: The need for a centralised and accurate accident data system.

The solution: The National Accident Management system is intended as a new system for both investigating and analysing road accident data. It is also intended as a shared ‘space’ for all parties responding to different kinds of road accidents, such as medical rescue services, police forces, fire brigades, hospital personnel and the road administration. Procurement started in the last quarter of 2010 and included the hardware and software for GPS-based mobile terminals, to be used by police officers and ambulance personnel at accident sites for determining the location, documentation, reporting and information.

The new system will introduce a unified process for accident statistics and investigation, match data from different sources and provide road safety researchers with the necessary accident data. The handheld devices for accident registration will contain the accident report sheets in the form of multiple-choice tables that are easy to fill in at the accident site, organised in a way that prevents personal data from being given to parties other than the courts, insurers and hospitals.

Figure 5.2 - Source: Hans-Joachim Vollpracht, World Road Association (PIARC)

The outcome: Under-reporting was reduced, close cooperation between the key stakeholders was achieved, and a detailed information system for accident investigations was established.

A key lesson from this exercise was that GPS-based digital accident data management is a crucial tool for both road safety policy and infrastructure safety management. The data management system needs to overcome police restrictions on accessing accident data.

Source: Hans-Joachim Vollpracht, World Road Association (PIARC)


Registration systems

‘Vital registration’ data can be used as a source of information on road deaths. This information comes from death certificates completed by doctors which state the cause of death. WHO (2010) reports that around 40% of WHO member countries collect vital registration data of the detail required for monitoring road traffic deaths. WHO and other organizations have instituted an international registration system that includes those injured in traffic crashes.

Other crash data sources

Other sources of data on crashes can come from emergency services, tow truck drivers, members of the public, insurance companies, etc. However, it is important to recognise that the quality and extent of this information may be limited when compared to police and hospital reported data.

Establishing or Improving Crash Databases

Before establishing a new crash database system (or improving a current system), it is recommended that a situational assessment be undertaken (WHO, 2010). This involves:

  • stakeholder analysis;
  • assessment of data sources and existing systems/quality;
  • end user needs assessment;
  • environmental analysis.

These same steps are also required when establishing or improving on the collection of non-crash data (see Non Crash Data and Recording Systems).

A stakeholder analysis involves identifying organisations and individuals who have (or should have) a role in the collection and use of road safety data. Critical stakeholders will include police, transport agencies and health departments, but there are likely to be many others.

An assessment of data sources is required to determine what information is already collected, and the quality of the data. Poor data quality is a significant problem in many countries.

An end user assessment involves understanding who the key users are, and how these key stakeholders use the information. This knowledge will help improve the usability of the data.

An environmental analysis involves understanding the political environment and critical partnerships required for the successful collection, analysis and use of the data. Without this understanding and appropriate collaboration it is likely that collection and use of crash data will be severely hindered. There are many examples where expensive crash data systems have been established, but data has not been entered into the system due to inadequate communications and poor cooperation.

Following this situational assessment, the recommended process for establishing a crash database system is to:

  • establish a working group (typically drawn from the stakeholder analysis);
  • choose a course of action;
  • recommend minimum data elements and definitions;
  • design and implement the system (or improve the current system) (WHO, 2010).

Crash location is a key element in collecting and analysing data, particularly for road engineers. Without this information, it is not possible to determine what locations to treat in the future. In addition, if the crash location is known (whether from police reports or other sources of data), there is potential to link this crash data to asset or other data sources (see Analysis of Data and Using Data to Improve Safety). This information may be of use in identifying other road-based elements that may have contributed to the crash risk.

Several methods are available for the accurate location of crashes, including the use of global positioning systems, reference to a local landmark (e.g. a link-node system), or reference to a route kilometre marker post (a linear referencing system).

Historically, crash data records were kept in paper-based filing systems, but now computerised database systems are used to store information on crashes. This allows relatively easy analysis of data, and is particularly useful in the identification of trends, high risk locations or areas, key crash types, etc. There are a number of computer software packages available for this task. At a minimum, such a system should have the capacity to:

  • produce quick summaries on key crash variables (e.g. total crashes; crashes by different severity; crashes by different user groups, such as pedestrians; trends over time);
  • allow cross-tabulations, i.e. analysis of two or more variables at a time (e.g. change in fatal pedestrian crashes over time);
  • identify hazardous locations (e.g. locations, routes or areas with concentrations of crashes, or crashes of a particular type).
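
The three minimum capabilities above can be illustrated with a small sketch using only the Python standard library. The crash records, field names and the threshold of three crashes per site are invented for illustration; a real system would query its own database and apply locally agreed criteria for hazardous locations.

```python
from collections import Counter

# Made-up crash records; field names are hypothetical.
crashes = [
    {"year": 2012, "severity": "fatal", "user": "pedestrian", "site": "A"},
    {"year": 2012, "severity": "serious", "user": "car", "site": "A"},
    {"year": 2013, "severity": "fatal", "user": "pedestrian", "site": "A"},
    {"year": 2013, "severity": "fatal", "user": "pedestrian", "site": "B"},
    {"year": 2013, "severity": "slight", "user": "cyclist", "site": "B"},
    {"year": 2013, "severity": "serious", "user": "car", "site": "C"},
]

# 1. Quick summary on a key variable: crashes by severity.
by_severity = Counter(c["severity"] for c in crashes)

# 2. Cross-tabulation: fatal pedestrian crashes by year.
fatal_ped_by_year = Counter(
    c["year"] for c in crashes
    if c["severity"] == "fatal" and c["user"] == "pedestrian"
)

# 3. Hazardous locations: sites with a concentration of crashes.
MIN_CRASHES = 3  # illustrative threshold only
by_site = Counter(c["site"] for c in crashes)
hazardous_sites = [site for site, n in by_site.items() if n >= MIN_CRASHES]

print(by_severity)
print(dict(fatal_ped_by_year))
print(hazardous_sites)
```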

Crash data systems have become very advanced in recent years, with features added that allow quicker and more useful analysis. WHO (2010) and Turner and Hore-Lacy (2010) provide a list of other desirable features of crash data systems. These include:

  • built-in quality checks (algorithms and logic checks);
  • GIS linkage to allow accurate identification of crash location;
  • ability to add new data fields without re-developing the database;
  • database navigation features such as drop-down menus, clickable maps;
  • pre-defined queries and reports;
  • options for customised, user-defined queries and reports;
  • mapping ability for data entry, crash selection and presentation of aggregated crash information;
  • ability to export data to third-party applications (e.g. Microsoft Excel, Statistical Analysis Software (SAS)) for further statistical analysis;
  • inclusion of crash narrative, sketches of crash scene, photographs and videos linked to crash;
  • automatically generated collision diagrams;
  • site ranking based on crash rates, numbers, costs;
  • ability to monitor sites of interest, i.e. before and after treatments.
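
As an example of the first feature in this list, built-in logic checks can be as simple as a function that returns a list of problems found in a record at the time of data entry. The checks and field names below are assumptions chosen for illustration, not a list taken from WHO (2010) or Turner and Hore-Lacy (2010).

```python
from datetime import datetime
from typing import Dict, List


def quality_check(record: Dict) -> List[str]:
    """Return a list of problems found in one crash record (empty if none)."""
    problems = []

    lat, lon = record.get("latitude"), record.get("longitude")
    if lat is None or lon is None:
        problems.append("missing coordinates")
    elif not (-90 <= lat <= 90 and -180 <= lon <= 180):
        problems.append("coordinates out of range")

    occurred = record.get("occurred_at")
    if occurred is None:
        problems.append("missing date/time")
    elif occurred > datetime.now():
        problems.append("crash date is in the future")

    # Logic check: severity must agree with the casualty counts.
    if record.get("severity") == "fatal" and record.get("fatalities", 0) < 1:
        problems.append("fatal crash with no fatalities recorded")

    return problems


# Made-up records for illustration.
good = {"latitude": 12.5, "longitude": 101.0,
        "occurred_at": datetime(2013, 5, 17, 22, 40),
        "severity": "fatal", "fatalities": 1}
bad = {"latitude": 120.0, "longitude": 101.0,
       "occurred_at": None, "severity": "fatal", "fatalities": 0}

print(quality_check(good))
print(quality_check(bad))
```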

An example of the successful implementation of a crash data system is provided in the case study below from Cambodia (Box 5.5).

Box 5.5 : Case study – Cambodia’s road crash victim and information system (RCVIS)

The problem: The database system for Cambodia was inadequate for managing road safety effectively. Since 2002, three different ministries had been involved in road crash data collection in Cambodia (Ministry of Public Works and Transport, Ministry of Interior (MoI), and Ministry of Health (MoH)). Although the databases developed by the three ministries provided some relevant information on the road safety situation in the country, a need for improvement was observed: the databases were not compatible, there were important discrepancies between them, they under-reported the real situation, and they were limited in their scope.

The solution: In 2004, Handicap International Belgium and the Cambodian Red Cross proposed a new system, based on a standardised and more detailed data collection form. The objective of the new system is to provide accurate, continuous and comprehensive information on road crashes and victims for the purposes of increasing understanding of the road safety situation, planning appropriate responses, and evaluating impact of the road safety initiatives. The system was initially piloted in the capital city of Phnom Penh in 2004. The pilot’s success resulted in expansion to all provinces in 2006.

Handicap International (HI), an international aid organisation, played a key role in the development and implementation of the RCVIS and was initially responsible for overseeing the entire system. HI has slowly transitioned the work to the Cambodian government, moving data collection responsibilities to the MoH and the MoI in 2008–2009 and transferring system oversight to the National Road Safety Committee (NRSC) in 2010.

The system combines data from the police and health sectors. The structure of RCVIS (from 2009 onwards) is shown below:

Figure 5.3 - Source: Chariya Ear, Handicap International, Cambodia.

The outcome: Combining both data sources has enabled the system to cover more elements on detailed crash information (from police) and casualty information (from hospitals). This has led to higher levels of reporting. There were several factors that helped with the success of the RCVIS. Firstly, the two RCVIS forms were developed at the same time, one for traffic police and another for the health sector. This provided a common understanding of the technical aspects of the system, which required common questions (variables) in both forms. The main parties were all involved in this development stage, which also meant that they participated in and approved the common sections of the two forms. Secondly, the Ministry of Public Works and Transport was empowered to play the coordination role, with the understanding from the beginning that they would combine the two data sources. They were supported to organise meetings, lead discussion, chair RCVIS annual workshops, and become a co-author of the RCVIS publication, etc. Thirdly, the findings and data from the pilot phase helped all parties understand key issues and refine the process. It was clear that the reports from the pilot phase were highly appreciated by key government stakeholders, media, and the working group members.

Although there have been significant advancements through the RCVIS, there are still issues with the system. Data reporting is inconsistent, particularly from the health facilities. To address the issue of non-reporting, it is recommended that the NRSC and the Ministry of Health take steps such as increasing buy-in from senior level administrators in health facilities, formalising and standardising RCVIS reporting procedures, and showing health facilities how RCVIS data can be useful to them.

Source: Chariya Ear, Handicap International, Cambodia.


The Swedish Strada system is a unique database that integrates police and hospital data. It is important to recognise that although this linkage provides valuable additional information, it comes at additional cost. Further details are provided in Box 5.6.

Box 5.6: Case study – Swedish traffic accident data acquisition (Strada)

The problem: Lack of reliable information on crashes, including information on injury outcomes.

The solution: In October 1996 the Swedish Road Administration was commissioned by the Swedish government to initiate a new information system covering injuries and accidents in the entire road traffic system. This was accomplished in co-operation with the Swedish Police, the Swedish National Board of Health and Welfare, the Swedish Institute for Transport and Communications Analysis, Statistics Sweden and the Swedish Association of Local Authorities and Regions.

Strada is based on information reported from two sources: the Swedish police and hospitals. Since 2003, all police districts have reported to Strada on a national scale. Strada also receives information from an increasing number of hospitals. The incorporation of hospital data makes this method very different from earlier methods of registration of injuries and accidents in the road transport system.

The outcomes: By bringing together data from two sources – the police and the hospitals – Strada provides more detailed information, thus increasing the knowledge of road traffic injuries and accidents. Incorporating hospital data decreases underreporting since the police have limited knowledge about some road traffic accidents (mainly involving unprotected road users: pedestrians, cyclists and moped drivers). In addition, the hospitals’ reporting of diagnoses broadens the knowledge of the injuries and their degree of seriousness.

The data are entered by the local police and hospitals into Strada, where a match immediately takes place.

To date, the level of accuracy in data matching is very high. Matching means the data was recorded by both the police and the hospitals. A more complete description of accident circumstances and injuries involved is obtained from matched data. This results in more accurate information on the real traffic safety problems which in turn facilitates the planning and prioritisation of traffic safety measures.

In 2013, the police recorded 15 000 accidents with 20 600 injured persons. These records were compared with those of 32 700 persons who sought treatment in hospital. Of these, 9 800 persons were matched, that is, known to both sources. This means that 48% of those known to the police were also known to the hospitals, and 30% of those registered by the hospitals were also registered by the police. One conclusion from this match is that police reporting is far from comprehensive.
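
The percentages quoted for 2013 follow directly from the three counts given. The short calculation below reproduces them; the variable names are illustrative, and the count of persons known to at least one source is derived here by simple inclusion-exclusion rather than quoted from Strada.

```python
police_injured = 20_600    # injured persons recorded by the police
hospital_injured = 32_700  # persons who sought treatment in hospital
matched = 9_800            # persons known to both sources

# Share of each source's records that also appear in the other source.
share_of_police = matched / police_injured
share_of_hospital = matched / hospital_injured

# Persons known to at least one of the two sources.
known_to_either = police_injured + hospital_injured - matched

print(f"{share_of_police:.0%} of police records matched")
print(f"{share_of_hospital:.0%} of hospital records matched")
print(known_to_either)
```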

Since a number of hospitals do not yet report to Strada, the existing official statistics are based exclusively on accidents reported by the police. The information derived from the hospitals is shown in a supplement containing medical statistics.


Some countries have undertaken in-depth studies of serious crashes to provide a more thorough understanding of crash causation factors and possible solutions. Such studies typically investigate a sample of high severity crashes. As an example, in the UK, the ‘On the Spot’ project collected detailed and high-quality crash information over two regions. More than 2000 variables were collected for each incident based on scene investigation soon after the crash occurred, as well as follow-up communication with medical services and local government. The information was analysed to provide insight about human involvement, vehicle design, and highway design in crash and injury causation. Mansfield et al. (2008) provide an initial analysis of around 2000 incidents from this program. Such investigation can provide far more detail than what is typically available through a crash report, and to a higher degree of reliability.

Similar examples can be found in many other countries, including the USA, Germany, France, Malaysia, India and Australia. Some of these programmes have been in place for many years, and have produced large amounts of valuable information. One of the key outputs from the EU DaCoTA project (which collected and analysed data from European countries on various road safety topics) was guidance on the collection of such data, as well as standardised procedures (Thomas et al., 2013). A Pan-European In-depth Accident Investigation Network has been established, and tools such as an online manual for in-depth road accident investigations have been developed.

The US has established the second Strategic Highway Research Program (SHRP2). SHRP2 is perhaps the most comprehensive database of information on factors occurring before and during crashes and near-crash events. The information collected includes data from the Naturalistic Driving Study (NDS) database. This dataset includes information from over 2300 drivers, collected during normal driving through equipment installed in their own vehicles. The massive amount of data collected through the NDS is supplemented through the Roadway Information Database (RID), which includes comprehensive information on road elements in the study areas as well as other relevant data (including crash data). This globally significant database is expected to provide the research basis for studies on driver performance and behaviour.

Data sharing

Sharing of data from different sources is required for the comprehensive collection, analysis and integration of data. Efficient data sharing, particularly between the police and the highway authority, is essential for good practice road safety management.

However, it is important to note that some organisations may be reluctant to share certain data, particularly personal identifiers, due to the issues it poses surrounding the privacy and anonymity of those involved. One response is to collect the personal details on a separate page of the crash report form (e.g. name and address information). This page can then be removed before sending the remaining pages on to partner agencies. In some cases, it may be necessary to develop appropriate privacy policies to ensure this issue is addressed, or for certain variables to be removed to prevent the identification of individuals.
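
In a computerised system, the equivalent of removing the personal details page is to strip the identifying fields from each record before it is shared. A minimal sketch follows, assuming records are stored as dictionaries; the field names are hypothetical, and which variables count as identifying would need to be agreed under the relevant privacy policy.

```python
from typing import Dict

# Fields treated as personal identifiers (illustrative list only).
PERSONAL_FIELDS = {"name", "address", "phone", "licence_number"}


def anonymise(record: Dict) -> Dict:
    """Return a copy of the record without personal identifiers."""
    return {k: v for k, v in record.items() if k not in PERSONAL_FIELDS}


# Made-up record for illustration.
full_record = {
    "crash_id": 1017,
    "name": "Example Person",
    "address": "1 Example Street",
    "road_user_type": "cyclist",
    "injury": "serious",
}

shared_record = anonymise(full_record)
print(shared_record)
```

Note that removing direct identifiers may not be sufficient on its own: combinations of remaining variables (such as exact time, location and age) can still identify individuals, which is why some variables may also need to be removed or coarsened.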

Crash data on its own is a valuable source of information on crash risk, and when combined with other sources of data, this value can be greatly increased. The following section discusses some of the other data sources, while Analysis of Data and Using Data to Improve Safety discusses combining these sources.

