We’ve been thinking a bit about common data standards lately - what they might encompass, how they might be developed, and the pain points they could help us all collectively address.
I’d love to know people’s takes on two questions:
1. What does a common data standard mean to you?
2. What would (or wouldn’t) be useful to have in them?
Some explanation and background -
What is a common data standard?
A set of definitions for data attributes that provide a consistent approach to collecting and representing data that is commonly used across organisations. For example - agreeing how to define and collect sex and gender information or how addresses should be collected and validated.
For a human services-focussed example, check out the Common dataset for human services from the ACT government.
For an infrastructure and assets-focussed example, check out the Open council data standards.
What challenges might it help solve?
- Differences in how key pieces of information are recorded by organisations make it difficult to take a whole-of-government view of particular groups in the community (e.g. Organisation A collects people’s Culturally and Linguistically Diverse status as [Yes / No] based on the individual’s own identification, while Organisation B collects it based on a combination of [Country of birth and Main language spoken at home]).
- Differences in interpretation of key concepts, both in data and by the people on the frontlines of service delivery (e.g. the difference between sex and gender).
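To make the first challenge concrete, here is a minimal Python sketch of how the two collection methods could be mapped onto one shared representation. Everything in it is hypothetical: the `CommonCaldRecord` fields, the function names, and especially the derivation rule used for Organisation B, which is made up purely for illustration. Agreeing on that rule is exactly what a common standard would need to do.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CommonCaldRecord:
    """Hypothetical shared representation; field names are illustrative only."""
    cald: Optional[bool]  # None when the status cannot be determined
    basis: str            # how the value was arrived at


def from_org_a(self_identified: Optional[str]) -> CommonCaldRecord:
    """Organisation A: a Yes/No answer based on the person's own identification."""
    if self_identified is None:
        return CommonCaldRecord(None, "self-identified")
    answer = self_identified.strip().lower()
    if answer not in ("yes", "no"):
        return CommonCaldRecord(None, "self-identified")
    return CommonCaldRecord(answer == "yes", "self-identified")


def from_org_b(country_of_birth: Optional[str],
               main_language: Optional[str]) -> CommonCaldRecord:
    """Organisation B: derived from country of birth and main language at home.

    ASSUMPTION: the rule below (born outside Australia OR main language other
    than English) is invented for this sketch, not an agreed definition.
    """
    if not country_of_birth or not main_language:
        return CommonCaldRecord(None, "derived")
    born_overseas = country_of_birth.strip().lower() != "australia"
    non_english = main_language.strip().lower() != "english"
    return CommonCaldRecord(born_overseas or non_english, "derived")
```

Keeping the `basis` alongside the value means a whole-of-government view can still tell a self-identified status from a derived one, rather than treating the two as equivalent.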
Hi Keith, interesting concept, and one that has been around for a millennium, as I am sure you know. The challenge as I see it is not in implementing a ‘standard’ (there are already standards for many forms of data) but in providing guidance on common vocabularies and data structures. From the examples provided, the ‘data standard’ as you defined it does both.
What you point out as ‘commonly used across government’ is probably the real focus area. Common data (demographics, services, financial data, property, topo data) may be the place to begin, but not all agencies capture this data. There is also the challenge of defining standards for data capture by contractors (where a lot of data is captured) and getting these into the contract space so the data can be re-used with minimal rework.
Possibly, pointing towards a common data standard guidance area would assist with contracts.
Sounds like it would be of benefit for demographic-type data, as you pointed out. For assets and the like, certain parts of the information may be shared, while others are more focused on internal risk management activities (i.e. knowledge about the asset, criticality, life etc.), so they may never have a common approach other than the defined standards (PAS 55 / ISO 55000 series) and linkages to other standards (assets to BIM, ISO 19650). It probably steps up a notch in complexity for environmental/biodiversity data.
To answer your questions:
1. What does a common data standard mean to you? Data standards relate to the metadata and structure of the data as aligned to a compliance or regulatory need. To be certified in asset management or record keeping, how the information is structured and stored is more important than providing a common way for others to access and share.
2. What would (or wouldn’t) be useful to have in them? If provided at a guidance level: formatting guidance (e.g. DDMMYYYY for dates), plus common descriptors and definitions for capture method and intended use. So basically the 6 quality measures of Accuracy, Consistency, Integrity, Completeness, Currency and Intended Purpose, defined for common shared data, each with some defined structure to represent these elements for the disciplines of use (accuracy means many things depending on your use).
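As a rough illustration of what guidance at this level might look like in practice, the Python sketch below checks a date against the DDMMYYYY format and flags which of the six quality measures have been left blank for a dataset. The names `parse_ddmmyyyy`, `QualityStatement` and `missing_measures` are hypothetical, and storing each measure as free text is an assumption; a real standard would define the structure per discipline.

```python
from dataclasses import dataclass, fields
from datetime import date, datetime


def parse_ddmmyyyy(text: str) -> date:
    """Parse a date supplied as exactly eight digits in DDMMYYYY order."""
    # strptime alone would accept shorter strings such as "1022023",
    # so the length and digit check is done explicitly first.
    if len(text) != 8 or not (text.isascii() and text.isdigit()):
        raise ValueError(f"expected 8 digits in DDMMYYYY form, got {text!r}")
    return datetime.strptime(text, "%d%m%Y").date()


@dataclass
class QualityStatement:
    """Hypothetical container for the six quality measures (free text here)."""
    accuracy: str = ""
    consistency: str = ""
    integrity: str = ""
    completeness: str = ""
    currency: str = ""
    intended_purpose: str = ""


def missing_measures(statement: QualityStatement) -> list[str]:
    """Return the names of any quality measures left blank."""
    return [f.name for f in fields(statement)
            if not getattr(statement, f.name).strip()]
```

For example, `missing_measures(QualityStatement(accuracy="+/- 0.5 m"))` lists the five measures still to be described before the dataset is shared.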
Governance and process are hugely important in defining and using any data standards, as Daz has outlined above.
On a more technical note, it’s so much easier when we (the consultants doing the work) receive a file geodatabase with coded domains (aka dropdowns). This really helps to ensure Accuracy, Consistency, Integrity, Completeness and Currency in whatever it is we’re documenting or collecting information on.
While I personally much prefer GDBs (see why here - https://www.esri.com/news/arcuser/0309/files/9reasons.pdf), it is important to give the contractor options as to which data format is best for storing and supplying spatial information, matching their available or preferred software. So have identical templates in formats such as GDB/feature class, Shapefile, TAB file, Excel, etc.
However, your organisation needs to be comfortable with handling and QA’ing each format and ingesting it into a master database. This process needs to be automated to reduce risk and liability. This would all be covered in a spatial strategy for your organisation.
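The coded-domain and automated QA ideas can be sketched in plain Python, independent of any GIS package. The domain names and values below (`material`, `condition`) and the function names are made up for illustration; in a real template they would come from the geodatabase's own domain definitions, and the records would be read from whichever supplied format the contractor used.

```python
from typing import Dict, Iterable, List, Mapping, Set, Tuple

# Hypothetical coded domains: field name -> set of permitted codes.
CODED_DOMAINS: Dict[str, Set[str]] = {
    "material": {"PVC", "STEEL", "CONCRETE"},
    "condition": {"1", "2", "3", "4", "5"},
}


def check_record(record: Mapping[str, str]) -> List[str]:
    """Return a list of problems for one record (empty if it passes)."""
    problems = []
    for field, allowed in CODED_DOMAINS.items():
        value = record.get(field)
        if value is None or value == "":
            problems.append(f"{field}: missing")
        elif value not in allowed:
            problems.append(f"{field}: {value!r} not in domain")
    return problems


def split_submission(
    records: Iterable[Mapping[str, str]],
) -> Tuple[List[Mapping[str, str]], List[Tuple[int, List[str]]]]:
    """Separate records that pass QA from those that need to go back."""
    accepted, rejected = [], []
    for index, record in enumerate(records):
        problems = check_record(record)
        if problems:
            rejected.append((index, problems))
        else:
            accepted.append(record)
    return accepted, rejected
```

Only the accepted records would be loaded into the master database; the rejected list, with row positions and reasons, can be sent straight back to the contractor.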
Sorry to come in late on this Keith, but here are my thoughts. I have been involved in the exploration of common data standards at state and national level in agriculture, and it can be a testing process to get people to agree on common data standards for complex areas like soil type or geology. Having said that, there are some great standards out there, as Darren points out. (At PPA we are proud to have implemented the national underground services data standards as we survey and pick up all our underground services across the Pilbara, with great success.) Despite all this, I think we are still in the early stages of the ‘making data available’ journey as a state (government), and at quite a low level of maturity when you compare us to our counterparts in the US and UK. Our efforts in setting and creating standards at this stage of maturity should be weighed against the much bigger task in front of us of encouraging agencies to collect and share data, which I think should be our key focus at the moment. Several times I have seen good data projects get bogged down in discussions and indecision on data standards, freezing initially motivated organisations into inaction!
Don’t get me wrong, common data standards have merit, but in this day and age I don’t really think common standards are what is holding WA back from becoming a really successful state fuelled by great data assets. It is primarily the fact that our systems for recording and managing data across all the activities our government undertakes are often quite dated and not easy to access. There is also an inherent lack of interest in making data available and sharing it for other agencies to use, supposedly because of risk and accountability issues, but I suspect not many places actually manage their data well enough to have confidence in third parties using it. A bit harsh, I know, but I think it speaks to my point about the data governance maturity level of our organisations. We need to focus on growing those maturity levels by working with the people responsible for the data to better understand the benefits from data sharing and collaborating, and also how the risk issues can be overcome.
(PS: don’t forget the Googles and the Amazons of this world threw out data standards years ago and found it much easier to sort out any data incompatibility issues with AI and ML, enabling them to smash together a broad range of disparate datasets to get the data and info they want.)